Incidence and Risk Factors of Lower Limb Deep Vein Thrombosis in Psychiatric Inpatients by Applying Machine Learning to Electronic Health Records: A Retrospective Cohort Study

Liang Xu; Miao Da

doi:10.2147/CLEP.S501062

Clinical Epidemiology, Volume 17

Original Research

Received 15 October 2024

Accepted for publication 11 January 2025

Published 25 February 2025 Volume 2025:17 Pages 197–209

DOI https://doi.org/10.2147/CLEP.S501062

Checked for plagiarism Yes

Review by Single anonymous peer review

Editor who approved publication: Dr Thomas Ahern

Liang Xu, Miao Da

Department of Psychiatry, Huzhou Third Municipal Hospital, the Affiliated Hospital of Huzhou University, Huzhou, Zhejiang, People’s Republic of China

Correspondence: Miao Da, Department of Psychiatry, Huzhou Third Municipal Hospital, the Affiliated Hospital of Huzhou University, 2088 East Tiaoxi Road, Huzhou, Zhejiang, People’s Republic of China, Tel +86 0572 2290561, Email [email protected]

Background: Psychiatric inpatients face an increased risk of deep vein thrombosis (DVT) due to their psychiatric conditions and pharmacological treatments. However, research focusing on this population remains limited.
Methods: This study analyzed 17,434 psychiatric inpatients at Huzhou Third Municipal Hospital, incorporating data on demographics, psychiatric diagnoses, physical illnesses, laboratory results, and medication use. Predictive models for DVT were developed using logistic regression, random forest, support vector machine (SVM), and XGBoost (Extreme Gradient Boosting). Feature importance was assessed using the random forest model.
Results: The DVT incidence among psychiatric inpatients was 1.6%. Predictive model performance, measured by the area under the curve (AUC), showed logistic regression (0.900), random forest (0.885), SVM (0.890), and XGBoost (0.889) performed well. Logistic regression and random forest models exhibited optimal overall performance, while XGBoost excelled in recall. Significant predictors of DVT included elevated D-dimer levels, age, Alzheimer’s disease, and Madopar use.
Conclusion: Psychiatric inpatients require vigilance for DVT risk, with factors like D-dimer levels and age serving as critical indicators. Machine learning models effectively predict DVT risk, enabling early detection and personalized prevention strategies in clinical practice.

Keywords: psychiatric inpatients, deep vein thrombosis, machine learning, risk factors, predictive modelling

Introduction

Recent years have seen a steady rise in the number of psychiatric inpatients, a trend linked to the aging population and the increasing incidence of mental illness.¹ While psychiatric inpatient care primarily addresses psychiatric and behavioral symptoms, the physical health concerns associated with these patients, particularly the risk of deep vein thrombosis (DVT), have often been overlooked. DVT is a common and potentially serious complication in hospitalized patients, especially those who are bedridden or have limited mobility. In addition to raising healthcare costs during hospitalization, DVT can lead to severe complications, such as pulmonary embolism.²

The thrombotic risk for psychiatric inpatients is often exacerbated by the nature of their condition. For instance, long-term use of antipsychotic medications can lead to metabolic syndrome, while suicidal and self-harming behaviors, along with overall reduced physical activity, further heighten the risk of thrombosis.^3–5 Additionally, psychiatric patients frequently have comorbid somatic conditions such as hypertension and diabetes, which are also associated with an increased risk of thrombosis.⁶

Existing literature highlights that factors such as D-dimer levels, age, and thrombotic history are strongly associated with the occurrence of DVT. However, most studies have primarily examined general medical and surgical populations, with a notable lack of research focused on psychiatric inpatients.^7–9 Given the complex array of factors contributing to DVT in this population and the intricate interrelationships among these factors, conventional statistical analysis methods may face limitations in adequately addressing these multifaceted interactions.

To address the limitations of traditional statistical methods and offer a more thorough analysis of DVT risk factors in psychiatric inpatients, this study adopts a machine learning approach. As a data-driven analytical technique, machine learning enables the automated detection of patterns and trends within complex datasets.¹⁰ Its use in medical research has been expanding, particularly for multidimensional and high-dimensional data.¹¹ Compared to traditional statistical methods, machine learning is superior in handling non-linear relationships and uncovering intricate interactions between multiple variables.¹²

The objective of this study is to address a notable gap in research on thrombosis in psychiatric inpatients through the application of machine learning methods, thereby contributing to the advancement of clinical practice. This study specifically focuses on the incidence and risk factors of lower extremity DVT among psychiatric inpatients, rather than the broader category of venous thromboembolic disease, which also includes pulmonary embolism (PE). The decision to limit our study to lower extremity DVT was based on its high incidence in this population and the relative ease of diagnosis compared to PE, which often requires additional imaging techniques such as CT pulmonary angiography. The findings are intended to support earlier prevention and intervention for DVT, thereby improving the overall prognosis and quality of life in this population.

Methods

Study Population

A total of 17,434 patients admitted to the psychiatric department of Huzhou Third Municipal Hospital between July 2021 and May 2024 were included in the study. The inclusion criteria were as follows: patients who were admitted to the psychiatric department during the specified period, with medical records that provided comprehensive details on their medical history, diagnostic information, laboratory test results, and medication during hospitalization. The principal diagnosis for each patient was a mental or behavioral disorder, as classified by the International Classification of Diseases, 10th edition (ICD-10), within the range of codes F00-F99. For patients with multiple hospitalizations, data from the first hospitalization were used. Exclusion criteria were: (1) patients who died during hospitalization; (2) medical records that were incomplete or lacked essential data for comprehensive analysis; (3) patients whose length of stay was too short to collect the necessary data; (4) patients with pre-existing deep vein thrombosis at the time of admission; and (5) data from any hospitalization after the first for patients with multiple admissions.

Data Collection

The researcher accessed patient data from the hospital’s electronic medical record system, which included the following information:

Demographic Information

Age and gender.

Medical History

Length of hospitalization and comorbid physical conditions, such as hypertension and diabetes.

Diagnostic Information

Patient diagnoses of mental disorders were classified according to the International Classification of Diseases, 10th Revision (ICD-10). Major diagnostic categories included Alzheimer’s disease, dementia, schizophrenia, depressive disorder, and bipolar disorder.

Laboratory Test Results

Data from 75 blood tests were collected, including hemoglobin, white blood cell count, platelet count, alanine aminotransferase (ALT), aspartate aminotransferase (AST), creatinine, urea nitrogen, total cholesterol, low-density lipoprotein (LDL), blood glucose, D-dimer, and troponin T, among others.

Drug Use

The study reviewed the use of 76 commonly prescribed drugs in psychiatry, including antipsychotics, antidepressants, anti-anxiety medications, anti-dementia drugs, and medications used for comorbid somatic conditions.

DVT Diagnosis

The primary outcome of interest was the diagnosis of DVT occurring during hospitalization. DVT was diagnosed based on color Doppler ultrasound of both lower limb veins, which was the established diagnostic criterion. The diagnostic criteria included an inability to compress and occlude the vein lumen, as well as the presence of hypoechoic or echogenic areas with minimal or no blood flow signal within the vein.¹³ Patients were categorized into two groups: those who developed DVT during their hospitalization and those who did not.

Data Preprocessing

Duplicate features were removed based on expert recommendations. Variables or study subjects with a missing data rate exceeding 10% were excluded from the analysis. For the remaining missing data, the K-nearest neighbor (KNN) method was employed to impute values. After consolidating and imputing the data, preprocessing steps were undertaken to optimize machine learning computations. Continuous variables with near-zero variance were eliminated, and the remaining continuous variables were standardized to have a mean of zero and a standard deviation of one. This preprocessing ensured that all indicators were on a comparable scale, facilitating a comprehensive comparative analysis. Categorical variables were converted into factor variables.
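The imputation and scaling steps can be illustrated with a minimal, standard-library sketch. It is not the authors' code: the function names, the choice of k, the distance definition (root mean squared difference over columns observed in both rows), and the toy age and D-dimer values are all assumptions made for illustration.

```python
import math
import statistics


def knn_impute(rows, k=3):
    """Fill None entries with the mean of that column over the k nearest rows.

    Distance is the root mean squared difference over columns that are
    observed in both rows (an illustrative choice, not the study's setting).
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            neighbours = [v for _, v in candidates[:k]]
            if neighbours:  # leave the gap if no usable neighbour exists
                filled[i][j] = statistics.fmean(neighbours)
    return filled


def standardize(rows):
    """Scale every column to mean 0 and (population) standard deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    sds = [statistics.pstdev(c) or 1.0 for c in columns]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


# Hypothetical toy records: [age in years, D-dimer in mg/L]
toy = [[30.0, 0.2], [32.0, None], [70.0, 1.0], [68.0, 0.9], [31.0, 0.3]]
imputed = knn_impute(toy, k=2)
scaled = standardize(imputed)
print(imputed[1], scaled[0])
```

In this toy example the missing D-dimer value is filled from the two patients closest in age, after which both columns are put on a common scale.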

Model Development and Validation

Models were developed using four prevalent machine learning algorithms: logistic regression, random forest, support vector machine (SVM), and XGBoost. To mitigate class imbalance, an under-sampling technique was applied, which reduced the number of instances in the majority class to enhance the representation of the minority class.¹⁴ The dataset was randomly split into training and validation sets with a 7:3 ratio. The training set was used to build the models, while the validation set was used for evaluation.
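As a rough illustration of under-sampling followed by a 7:3 split, the sketch below uses only the Python standard library. The 1:1 ratio of negatives to positives, the random seeds, and the helper names are assumptions; the text does not state the ratio or the exact procedure used.

```python
import random


def undersample(labels, ratio=1.0, seed=0):
    """Keep every positive and a random subset of negatives.

    ratio is the number of negatives kept per positive (assumed 1:1 here).
    Returns the sorted indices of the retained records.
    """
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(negatives), int(round(ratio * len(positives))))
    return sorted(positives + rng.sample(negatives, n_keep))


def train_validation_split(indices, train_fraction=0.7, seed=0):
    """Shuffle the indices and cut them into training and validation parts."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical labels: 10 DVT cases among 100 patients
labels = [1] * 10 + [0] * 90
kept = undersample(labels)
train_idx, valid_idx = train_validation_split(kept)
print(len(kept), len(train_idx), len(valid_idx))
```

The sketch follows the order in which the steps are described here. A common alternative is to resample only the training portion, so that the validation set keeps the natural prevalence of the outcome.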

Feature selection was performed using LASSO (Least Absolute Shrinkage and Selection Operator) regression. LASSO regression, through L1 regularization, automatically selected features and reduced the number of independent variables in the model. The feature variables selected by LASSO regression were incorporated into the predictive model. To construct the model, ten-fold cross-validation (CV) was employed, wherein the training set was divided into ten equal parts. Each part served sequentially as the validation set, while the remaining nine parts were used for training. The model’s final performance metrics were derived from averaging the results of these ten iterations, providing a robust assessment of its generalization capabilities.
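To show how L1 regularization drives some coefficients exactly to zero, here is a small coordinate-descent sketch in plain Python. It minimizes a squared-error LASSO objective on toy data; the study's outcome is binary, so a logistic loss would be the more typical choice, and the λ = 1se rule is not implemented. All names and numbers below are illustrative assumptions.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    Assumes the columns of X and the response y are already centred.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            norm = sum(v * v for v in col) / n
            if norm == 0.0:
                continue
            # Correlation of column j with the residual that excludes j's own term
            rho = sum(v * r for v, r in zip(col, residual)) / n + norm * beta[j]
            new_beta = soft_threshold(rho, lam) / norm
            delta = new_beta - beta[j]
            if delta != 0.0:
                residual = [r - delta * v for r, v in zip(residual, col)]
                beta[j] = new_beta
    return beta


# Hypothetical centred data: the response depends only on the first column
X = [[-1.5, 0.5], [-0.5, -1.0], [0.5, 1.0], [1.5, -0.5]]
y = [2.0 * row[0] for row in X]
coefficients = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, b in enumerate(coefficients) if b != 0.0]
print(coefficients, selected)
```

The uninformative second column receives a coefficient of exactly zero and is dropped, while the informative column is retained with a shrunken coefficient. This is the mechanism by which LASSO reduces a large feature set to a smaller one.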

Model parameters were optimized through grid search. Performance was evaluated using metrics such as accuracy, precision, recall, F1 score, and AUC (area under the curve). ROC curves for each model were plotted, with AUC values interpreted as follows: 0.5 ≤ AUC < 0.7 indicated poor prediction, 0.7 ≤ AUC < 0.9 indicated moderate prediction, and AUC ≥ 0.9 indicated very good prediction.
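The evaluation metrics listed above can be computed from first principles, as in the following sketch. The toy labels, scores, and 0.5 decision threshold are hypothetical; the AUC is computed through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case), and the label returned for AUC values below 0.5 is an addition not defined in the text.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 score for binary labels (1 = DVT)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def interpret_auc(auc):
    """Map an AUC value to the bands used in this study."""
    if auc >= 0.9:
        return "very good"
    if auc >= 0.7:
        return "moderate"
    if auc >= 0.5:
        return "poor"
    return "below chance"  # not defined in the text; added for completeness


# Hypothetical validation labels and predicted probabilities
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
print(classification_metrics(y_true, y_pred))
print(roc_auc(y_true, scores), interpret_auc(roc_auc(y_true, scores)))
```

Applied to the AUC values reported in the abstract, these bands place logistic regression (0.900) in the very good range and the other three models (0.885 to 0.890) in the moderate range.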

Feature importance was assessed using the Random Forest algorithm, which improved model performance and stability by aggregating predictions from multiple decision trees. The importance of each feature was determined by calculating its average score across these trees, indicating its contribution to model performance. The study’s methodology was illustrated in Figure 1.

Figure 1 Study flowchart. This figure outlined the process of patient selection, data preprocessing, and model evaluation for a study involving psychiatric inpatients who met the ICD-10 diagnostic criteria. The flowchart detailed the steps taken to refine the dataset and the predictive models used for analysis.

Abbreviations: ICD-10, International Classification of Diseases, Tenth Revision; LASSO, Least Absolute Shrinkage and Selection Operator; LR, Logistic Regression; RF, Random Forest; SVM, Support Vector Machine; XGBoost, Extreme Gradient Boosting.

Statistical methods

The statistical analysis and modeling were performed using IBM SPSS version 26.0, Python 3.8, Anaconda distribution 23.1.0, and the Spyder development environment 5.4.1. For normally distributed data, results were presented as means ± standard deviations, and comparisons between groups were conducted using t-tests. For non-normally distributed data, results were reported as medians with interquartile ranges [M (Q1, Q3)], and the Mann–Whitney U-test was used for group comparisons. Categorical data were expressed as frequencies and percentages, with group comparisons analyzed using the χ²-test or Fisher’s exact test. A p-value of less than 0.05 was considered indicative of statistical significance.
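For categorical comparisons, the χ² statistic for a 2 × 2 table can be reproduced by hand, as sketched below. The counts are hypothetical and the function name is an assumption; the sketch computes Pearson's χ² without continuity correction and uses the fact that, with one degree of freedom, the upper-tail probability equals erfc(√(χ²/2)). The analyses in this study were run in SPSS and Python, whose settings may differ.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom and no
    continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: rows = exposed / unexposed, columns = DVT / no DVT
statistic, p_value = chi_square_2x2(20, 80, 10, 90)
print(round(statistic, 3), round(p_value, 4), p_value < 0.05)
```

When expected cell counts are small, Fisher's exact test is the usual substitute, as noted in the statistical methods above.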

Results

Clinical Features of Patients

A total of 17,434 inpatients with mental disorders were included in the study. These participants were randomly divided into two groups: a training set consisting of 70% of the total sample (n = 13,947) and a validation set comprising the remaining 30% (n = 3,487). The median age of the patients was 54.0 years (interquartile range [IQR]: 35.0 to 65.0). Among them, 5,743 (32.94%) were male. The median length of hospital stay was 14.0 days (IQR: 10.0 to 21.0), and the median D-dimer level was 0.26 mg/L (IQR: 0.15 to 0.51). Key characteristics of the data are detailed in Table 1.

Table 1 Comparison of General Information Between Thrombus and Non-Thrombus Groups

The Incidence of Lower Limb Deep Vein Thrombosis

The study findings revealed that 283 patients developed lower extremity DVT, yielding an incidence rate of 1.6%. The incidence of lower extremity DVT varied across different subpopulations. Specifically, Figure 2 showed an incidence of 9.05% in patients administered Madopar and 6.02% in those with Alzheimer’s disease. Furthermore, the incidence of lower extremity DVT increased with age, as illustrated in Figure 3.
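The overall incidence follows directly from the reported counts (283 events among 17,434 inpatients). The sketch below reproduces that arithmetic and, purely as an illustration, adds a Wilson 95% confidence interval; the interval is not reported in the study and the function name is an assumption.

```python
import math


def incidence_with_wilson_ci(events, total, z=1.96):
    """Return (proportion, lower, upper) using the Wilson score interval."""
    p = events / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return p, centre - half_width, centre + half_width


# Counts reported in the text: 283 DVT cases among 17,434 inpatients
rate, lower, upper = incidence_with_wilson_ci(283, 17434)
print(f"incidence {rate:.1%}, Wilson 95% CI {lower:.2%} to {upper:.2%}")
```

The point estimate rounds to the 1.6% given above.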

Figure 2 The incidence of Lower Limb Deep Vein Thrombosis (DVT) by Various Factors. This figure presented the incidence of lower limb deep vein thrombosis (DVT) as a percentage, categorized by different factors that could influence the occurrence of DVT. The graph displayed the rates for various conditions and characteristics, highlighting the relative risk associated with each factor. The x-axis listed the factors that are associated with DVT incidence, including the use of specific medications (eg, Madopar, Olanzapine), medical conditions (eg, Alzheimer’s Disease, Diabetes, Hypertension), and gender (positive for female, negative for male). The y-axis represented the incidence rates of DVT, ranging from 0.00% to 10.00%.

Figure 3 The incidence of lower extremity deep vein thrombosis by age groups. This figure depicted the incidence rate of deep vein thrombosis (DVT) in the lower limbs, expressed as a percentage, across various age groups. The data was categorized into age ranges, as indicated on the x-axis, and the corresponding incidence rates were plotted on the y-axis. Age Groups: The age groups were represented by intervals, such as (18,30], (30,40], (40,50], (50,60], (60,70], (70,80], (80,90], and (90,100], where the numbers in parentheses denoted the lower and upper bounds of each age range. The y-axis displayed the incidence of DVT, ranging from 0.00% to 14.00%.

Feature Selection

This study included a total of 163 features, encompassing demographic information, psychiatric diagnoses, concomitant physical illnesses, laboratory findings, and medication use. These features were analyzed using LASSO regression, resulting in a final model that retained 16 key factors: D-dimer, age, troponin T, urea nitrogen, Alzheimer’s disease, red blood cell count, myoglobin, follicle-stimulating hormone, phosphorus, hemoglobin, Madopar, cholinesterase, glucose, albumin, C-reactive protein (CRP), and creatinine. The indicators evaluated at λ = 1 standard error (1se) were used as modeling indicators, with the results detailed in Table 2. Indicators with coefficients of 0 at λ = 1se were excluded from the table. The selected features demonstrated high importance within the model and provided optimal performance with the minimal number of independent variables.

Table 2 LASSO Feature Screening

Predictive Modelling and Model Performance Evaluation

Based on the 16 identified features, logistic regression, random forest, support vector machine, and XGBoost models were utilized for training and evaluation. A comparison of the overall performance of these models was presented in Table 3, with the ROC curves illustrated in Figure 4. Logistic regression and random forest exhibited superior performance, demonstrating high accuracy and recall rates, thus indicating their effectiveness in identifying the majority of positive class samples. The F1 scores and AUC further demonstrated the models’ effectiveness in managing unbalanced data. The SVM model showed relatively weaker performance, particularly in terms of accuracy and F1 scores, indicating reduced effectiveness with unbalanced data. In contrast, the XGBoost model excelled in recall, proving effective at identifying most positive class samples. Although its precision was somewhat lower, the F1 score and AUC highlighted the model’s strong performance in handling unbalanced datasets.

Table 3 Comprehensive Performance of Each Machine Learning Prediction Model

Figure 4 The ROC Curves Comparison for Various Machine Learning Models. This figure presented the Receiver Operating Characteristic (ROC) curves for four different machine learning models, along with a random classifier, to evaluate their performance in distinguishing between two classes. The ROC curves plotted the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings. The Area Under the Curve (AUC) provided a single measure of the model’s ability to discriminate between the classes, with higher AUC values indicating better performance. Logistic Regression: AUC = 0.90. Random Forest: AUC = 0.89. Support Vector Machine: AUC = 0.89. XGBoost: AUC = 0.89. Random Classifier: AUC = 0.50. The x-axis represented the false positive rate (FPR), ranging from 0.0 to 1.0, and the y-axis represented the true positive rate (TPR), also ranging from 0.0 to 1.0. The diagonal line represented the performance of a random classifier, which serves as a baseline for comparison. Models with curves closer to the top-left corner of the plot demonstrated superior classification capabilities.

Feature Importance Analysis

In this study, a random forest model was employed to rank the relative importance of risk factors associated with the development of lower extremity deep vein thrombosis in hospitalized patients with mental disorders. The results indicated that D-dimer and age were the most influential variables, having the greatest impact on the predicted outcomes. The features were ranked in descending order of importance as follows: D-dimer (25.35%), age (20.18%), follicle-stimulating hormone (9.49%), troponin T (9.12%), and urea nitrogen (4.27%). The relative importance of each feature was illustrated in Figure 5.

Figure 5 The importance ranking of feature variables of random forest model. The illustration demonstrated the relative significance of the distinctive variables within the random forest model, as determined by their impact on the precision of the model predictions. The horizontal axis depicted the importance of the characteristic variables, with larger values indicating a greater impact on the model predictions. The vertical axis represented the characteristic variables, which include biomarkers and clinical parameters such as D-dimer, age, follicle-stimulating hormone, troponin T, urea nitrogen, myoglobin, creatinine, CRP, cholinesterase, erythrocyte count, glucose, phosphorus, albumin, hemoglobin, Madopar and Alzheimer’s disease status.

Discussion

The objective of this study was to investigate the incidence of DVT and its influencing factors among psychiatric inpatients. To achieve this, we conducted a retrospective analysis of data from 17,434 psychiatric inpatients. The study found that the incidence of DVT in this population was 1.6%. Previous studies on various inpatient populations have reported DVT incidence rates ranging from 0.86% to 50%, and the incidence in our study population is higher than that observed in the general population and non-surgical inpatient populations.^7,8,15–17 These findings suggest that heightened vigilance and monitoring for thrombotic events are necessary for psychiatric inpatients, particularly those with multiple risk factors.

The machine learning models exhibited strong predictive performance overall, with each model displaying unique characteristics tailored to different clinical scenarios and needs. This offers a variety of options for personalized risk assessment. The study identified several key factors significantly associated with DVT, including D-dimer levels, age, Alzheimer’s disease, and the use of Madopar. Furthermore, the significance of these factors was validated within the machine learning models.

Many conventional DVT scoring tools, such as the Oudega, Gagne, Wells, and Caprini scores, are limited by their lack of specificity in predictive power. These tools often rely on data, including specific blood indicators, that may not be readily accessible, which can undermine the accuracy of the predictive model. Consequently, their effectiveness for risk stratification in the inpatient setting is constrained.^18–22 Several retrospective studies have utilized machine learning techniques to predict DVT in various patient populations. Most of these studies have further segmented hospitalized patients into subgroups, such as those undergoing surgery, routine hospitalizations, oncology patients, and pregnant individuals. The predictive performance of machine learning models has consistently been shown to be excellent across these diverse populations.^7,8,19,23,24

This study highlights the effectiveness of machine learning in predicting the occurrence of DVT among psychiatric inpatients. Of the various techniques evaluated, the logistic regression model is favored for clinical risk assessment due to its clear interpretability and high accuracy. Additionally, the random forest model enhances predictive precision by leveraging its robust feature selection capabilities, which both confirms the importance of key influencing factors and improves prediction accuracy. The XGBoost model proved effective in managing data imbalance, particularly with respect to recall, underscoring its potential for identifying high-risk patients. In conclusion, machine learning models present a powerful tool for predicting DVT risk and supporting clinical decision-making in psychiatric inpatient populations. Their broad applicability is particularly valuable in clinical settings where integrating multiple influencing factors is essential for enhancing prediction accuracy.

This study identified several key clinical features, including D-dimer, age, follicle-stimulating hormone (FSH), and troponin T, which could be instrumental in developing predictive models for DVT. D-dimer, a fibrin degradation product commonly used as a biomarker for thrombosis, has been shown to be significantly associated with DVT development. Elevated D-dimer levels were a crucial predictor of DVT onset, as demonstrated by research across both general and specific patient populations.^15,17 The prominence of D-dimer in predictive models underscores its importance and suggests that it should be prioritized for inclusion in future clinical decision support systems.

Age is a non-modifiable risk factor that plays a significant role in the development of venous thromboembolism (VTE). Evidence suggests that the risk of developing DVT rises significantly with age, likely due to physiological changes such as decreased vascular elasticity, reduced blood flow, and lower levels of physical activity in older adults.²⁵ Elderly psychiatric inpatients are particularly vulnerable due to the presence of multiple chronic conditions that further impair mobility and elevate DVT risk. Given these considerations, it is advisable for clinical practice to emphasize the monitoring of geriatric psychiatric inpatients and to implement more proactive preventive measures to mitigate their risk of DVT.

Studies have identified a significant association between psychiatric disorders, including Alzheimer’s disease, and the risk of developing venous thrombotic events, particularly among patients using Madopar. As Alzheimer’s disease progresses, patients commonly experience cognitive decline and diminished physical mobility. The study’s findings reinforced the substantial link between Alzheimer’s disease and the development of DVT.²⁶ This highlights the importance of addressing not only the cognitive aspects of Alzheimer’s disease but also the overall physical health of patients, with particular attention to the risk of thrombosis, in their management and treatment.

The findings revealed an increased risk of DVT in patients using Madopar, a medication commonly prescribed for Parkinson’s disease (PD). This result aligns with previous research identifying knee/trunk flexion, a characteristic manifestation of PD, and increased blood pressure variability as specific risk factors for DVT, along with lower limb movement limitations or hypokinesis.^27,28 Additionally, some studies have suggested that antipsychotic medications might elevate the risk of DVT; however, such occurrences are relatively rare. The current study did not find substantial evidence supporting a significant impact of antipsychotic medications on the incidence of DVT events among psychiatric inpatients.^29–31

These findings establish a basis for the early identification of DVT and guide the development of personalized prevention strategies. By integrating patient-specific information, clinicians can implement targeted preventive measures, thereby reducing the incidence of DVT and enhancing the overall prognosis for patients.

This study, a retrospective analysis based on a real-world database, has several limitations. Firstly, data were sourced from a single hospital, which, despite a large sample size, may limit the external validity due to the specific geographical location and patient population of that institution. Secondly, while the study included a diverse patient cohort, it did not account for potential confounding factors such as patients’ lifestyles, nutritional status, and family history, which could influence the etiology of DVT. Future research should incorporate these variables to provide a more comprehensive understanding of DVT mechanisms and identify additional influencing factors. Thirdly, the low incidence of thrombotic events led to data balancing through undersampling, potentially resulting in the loss of valuable information and affecting the predictive accuracy of the model. Fourthly, since medical professionals determine the need for bilateral lower extremity venous ultrasound based on clinical judgment, this study might have underestimated the incidence of DVT. Fifthly, the study excluded patients who died during their hospital stay, which may have led to an underestimation of the true incidence of DVT, as severe cases of DVT or PE could have contributed to in-hospital mortality. The exclusion of these patients might also have resulted in a selection bias. Finally, this study did not specifically examine the use of VTE prophylactic drugs, such as low-molecular-weight heparin (LMWH) or direct oral anticoagulants (DOACs), which are commonly used in clinical practice to prevent venous thromboembolism. The lack of data on these drugs may have impacted the results, as they could act as protective factors against DVT and PE.

Furthermore, the study was limited to lower-limb DVT, which likely underrepresents the true impact of venous thromboembolism (VTE) in psychiatric inpatients, as VTE can occur in other locations, such as the upper extremities or visceral veins, and pulmonary embolism (PE) remains a significant complication. Therefore, the actual burden of VTE in this population may be greater than what is reflected by this study’s findings.

Despite these limitations, the study provides valuable insights into DVT incidence among psychiatric inpatients and influencing factors. Future studies should address these limitations to validate and expand upon these findings, including incorporating data on VTE prophylactic treatments, the effects of in-hospital mortality on DVT incidence, and the broader scope of VTE, including PE and thrombosis in other locations.

Conclusions

This study aimed to investigate the incidence of DVT and its key determinants among psychiatric inpatients using various machine learning models. The results showed that the incidence of lower extremity DVT in this population was 1.6%. D-dimer levels and age were identified as particularly significant predictors of DVT occurrence. Furthermore, the machine learning models demonstrated a high degree of accuracy in predicting DVT risk. The study also found that Alzheimer’s disease and the use of medications such as Madopar were associated with an increased risk of DVT. These findings highlight the importance of enhanced clinical vigilance in managing these patients. Overall, the study provides valuable insights for assessing the risk of venous thromboembolism (VTE) in psychiatric inpatients and offers a scientific basis for developing personalized prevention and treatment strategies.

Data Sharing Statement

Not applicable (this manuscript does not report data generation or analysis).

Ethics Approval and Consent to Participate

This retrospective analysis was conducted using data sourced from a real-world database, with ethical approval granted by the Ethics Committee of Huzhou Third Municipal Hospital (Ethics No. 2022SY2022084). Our study complies with the Declaration of Helsinki.

The relevant medical statistics data concerning hospitalized patients were anonymized, and, in accordance with the approval of the Ethics Committee of Huzhou Third Municipal Hospital, the requirement for individual informed consent was waived.

Author Contributions

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.

Funding

This work was supported by Zhejiang Medical and Health Technology Project (No. 2024KY421) and the Public Welfare Technology Application Research Program of Huzhou (No. 2023GY24).

Disclosure

The authors declare that they have no competing interests in this work.

References

1. GBD 2019 Mental Disorders Collaborators. Global, regional, and national burden of 12 mental disorders in 204 countries and territories, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet Psychiatry. 2022;9(2):137–150.

2. Navarrete S, Solar C, Tapia R, et al. Pathophysiology of deep vein thrombosis. Clin Exp Med. 2023;23(3):645–654. doi:10.1007/s10238-022-00829-w

3. Wang Z, Yang Y, He X, et al. Incidence and clinical features of venous thromboembolism in inpatients with Mental Illness. Clin Appl Thromb/Hemost. 2023;29:1299625039. doi:10.1177/10760296231160753

4. Kowal C, Peyre H, Amad A, et al. Psychotic, mood, and anxiety disorders and venous thromboembolism: a systematic review and meta-analysis. Psychosomatic Med. 2020;82(9):838–849. doi:10.1097/PSY.0000000000000863

5. Jørgensen H, Horváth-Puhó E, Laugesen K, et al. Venous thromboembolism and risk of depression: a population-based cohort study. J Thromb Haemost. 2023;21(4):953–962. doi:10.1016/j.jtha.2022.12.006

6. Duffett L. Deep venous thrombosis. Ann Int Med. 2022;175(9):ITC129–ITC144. doi:10.7326/AITC202209200

7. Wang X, Xi H, Geng X, et al. Artificial intelligence-based prediction of lower extremity deep vein thrombosis risk after knee/Hip arthroplasty. Clin Appl Thromb/Hemost. 2023;29:1309646529. doi:10.1177/10760296221139263

8. Shohat N, Ludwick L, Sherman MB, et al. Using machine learning to predict venous thromboembolism and major bleeding events following total joint arthroplasty. Sci Rep. 2023;13(1):2197. doi:10.1038/s41598-022-26032-1

9. Liu H, Yuan H, Wang Y, et al. Prediction of venous thromboembolism with machine learning techniques in young-middle-aged inpatients. Sci Rep. 2021;11(1):12868. doi:10.1038/s41598-021-92287-9

10. Sarker IH. Machine learning: algorithms, real-world applications and research directions. SN Comput Sci. 2021;2(3):160. doi:10.1007/s42979-021-00592-x

11. Rajula HSR, Verlato G, Manchia M, et al. Comparison of conventional statistical methods with machine learning in medicine: diagnosis, drug development, and treatment. Medicina. 2020;56(9):455. doi:10.3390/medicina56090455

12. Cabitza F, Rasoini R, Gensini GF. Unintended consequences of machine learning in medicine. JAMA. 2017;318(6):517–518. doi:10.1001/jama.2017.7797

13. Zhang S, Chu W, Wang H, et al. Evaluation of stability of deep venous thrombosis of the lower extremities using Doppler ultrasound. J Int Med Res. 2020;48(8):1220741650.

14. Devi D, Biswas SK, Purkayastha B. A review on solution to class imbalance problem. In: Undersampling approaches: 2020 international conference on computational performance evaluation (ComPE), 2020. IEEE.

15. Jin S, Qin D, Liang B, et al. Machine learning predicts cancer-associated deep vein thrombosis using clinically available variables. Int J Med Inf. 2022;161:104733. doi:10.1016/j.ijmedinf.2022.104733

16. Lei H, Zhang M, Wu Z, et al. Development and validation of a risk prediction model for venous thromboembolism in lung cancer patients using machine learning. Front Cardiovasc Med. 2022;9:845210. doi:10.3389/fcvm.2022.845210

17. Meng L, Wei T, Fan R, et al. Development and validation of a machine learning model to predict venous thromboembolism among hospitalized cancer patients. Asia-Pacific J Oncol Nurs. 2022;9(12):100128. doi:10.1016/j.apjon.2022.100128

18. Wang P, Wang Y, Yuan Z, et al. Venous thromboembolism risk assessment of surgical patients in Southwest China using real-world data: establishment and evaluation of an improved venous thromboembolism risk model. BMC Med Inf Decision Making. 2022;22(1):59. doi:10.1186/s12911-022-01795-9

19. He L, Luo L, Hou X, et al. Predicting venous thromboembolism in hospitalized trauma patients: a combination of the Caprini score and data-driven machine learning model. BMC Emerg Med. 2021;21(1):60. doi:10.1186/s12873-021-00447-x

20. Ryan L, Mataraso S, Siefkas A, et al. A machine learning approach to predict deep venous thrombosis among hospitalized patients. Clin Appl Thromb/Hemost. 2021;27:1419577583. doi:10.1177/1076029621991185

21. Kafeza M, Shalhoub J, Salooja N, et al. A systematic review of clinical prediction scores for deep vein thrombosis. Phlebology. 2017;32(8):516–531. doi:10.1177/0268355516678729

22. Silveira PC, Ip IK, Goldhaber SZ, et al. Performance of Wells score for deep vein thrombosis in the inpatient setting. JAMA Int Med. 2015;175(7):1112–1117. doi:10.1001/jamainternmed.2015.1687

23. Wang KY, Ikwuezunma I, Puvanesarajah V, et al. Using predictive modeling and supervised machine learning to identify patients at risk for venous thromboembolism following posterior lumbar fusion. Global Spine J. 2023;13(4):1097–1103. doi:10.1177/21925682211019361

24. Nudel J, Bishara AM, de Geus SW, et al. Development and validation of machine learning models to predict gastrointestinal leak and venous thromboembolism after weight loss surgery: an analysis of the MBSAQIP database. Surg Endoscopy. 2021;35:182–191. doi:10.1007/s00464-020-07378-x

25. Chen F, Xiong JX, Zhou WM. Differences in limb, age and sex of Chinese deep vein thrombosis patients. Phlebology. 2015;30(4):242–248. doi:10.1177/0268355514524192

26. Masters CL, Bateman R, Blennow K, et al. Alzheimer’s disease. Nat Rev Dis Primers. 2015;1(1):1–18. doi:10.1038/nrdp.2015.56

27. Yamane K, Kimura F, Unoda K, et al. Postural abnormality as a risk marker for leg deep venous thrombosis in Parkinson’s disease. PLoS One. 2013;8(7):e66984. doi:10.1371/journal.pone.0066984

28. Takeda T, Koreki A, Kokubun S, et al. Deep vein thrombosis and its risk factors in neurodegenerative diseases: a markedly higher incidence in Parkinson’s disease. J Neurol Sci. 2024;457:122896. doi:10.1016/j.jns.2024.122896

29. Knudson JF, Kortepeter C, Dubitsky GM, et al. Antipsychotic drugs and venous thromboembolism. Lancet. 2000;356(9225):252–253. doi:10.1016/S0140-6736(05)74504-9

30. Waage IM, Gedde-Dahl A. Drug points: pulmonary embolism possibly associated with olanzapine treatment. BMJ. 2003;327(7428):1384. doi:10.1136/bmj.327.7428.1384

31. Liperoti R, Pedone C, Lapane KL, et al. Venous thromboembolism among elderly patients treated with atypical and conventional antipsychotic agents. Arch Int Med. 2005;165(22):2677–2682. doi:10.1001/archinte.165.22.2677

Creative Commons License © 2025 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, 3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.
