Identifying and Validating Prognostic Hyper-Inflammatory and Hypo-Inflammatory COVID-19 Clinical Phenotypes Using Machine Learning Methods

Xiaojing Ji; Yiran Guo; Lujia Tang; Chengjin Gao

doi:10.2147/JIR.S504028

Journal of Inflammation Research, Volume 18

Original Research

Authors Ji X, Guo Y, Tang L, Gao C

Received 22 November 2024

Accepted for publication 18 February 2025

Published 27 February 2025 Volume 2025:18 Pages 3009—3024

DOI https://doi.org/10.2147/JIR.S504028

Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 2

Editor who approved publication: Professor Ning Quan

Xiaojing Ji, Yiran Guo, Lujia Tang, Chengjin Gao

Department of Emergency, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200092, People’s Republic of China

Correspondence: Lujia Tang; Chengjin Gao

Background: COVID-19 exhibits complex pathophysiological manifestations, characterized by significant clinical and biological heterogeneity. Identifying phenotypes may enhance our understanding of the disease’s diverse trajectories, benefiting clinical practice and trials.
Methods: This study included adult patients with COVID-19 from Xinhua Hospital, affiliated with Shanghai Jiao Tong University School of Medicine, between December 15, 2022, and February 15, 2023. The k-prototypes clustering method was employed using 50 clinical variables to identify phenotypes. Machine learning algorithms were then applied to select key classifier variables for phenotype recognition.
Results: A total of 1376 patients met the inclusion criteria. K-prototypes clustering revealed two distinct subphenotypes: Hypo-inflammatory subphenotype (824 [59.9%]) and Hyper-inflammatory subphenotype (552 [40.1%]). Patients in Hypo-inflammatory subphenotype were younger, predominantly female, with low mortality and shorter hospital stays. In contrast, Hyper-inflammatory subphenotype patients were older, predominantly male, exhibiting a hyperinflammatory state with higher mortality and rates of organ dysfunction. The AdaBoost model performed best for subphenotype prediction (Accuracy: 0.975, Precision: 0.968, Recall: 0.976, F1: 0.972, AUROC: 0.975). “CRP”, “IL-2R”, “D-dimer”, “ST2”, “BUN”, “NT-proBNP”, “neutrophil percentage”, and “lymphocyte count” were identified as the top-ranked variables in the AdaBoost model.
Conclusion: This analysis identified two phenotypes of COVID-19 based on clinical and laboratory variables and comorbidities. These phenotypes can be accurately recognized using machine learning models, with the AdaBoost model performing best for subphenotype prediction. The variables “CRP”, “IL-2R”, “D-dimer”, “ST2”, “BUN”, “NT-proBNP”, “neutrophil percentage”, and “lymphocyte count” play a significant role in the prediction of subphenotypes. The identified subphenotypes can be used for risk stratification in clinical practice: patients with the Hyper-inflammatory subphenotype can be closely monitored, and preventive measures such as early admission to the intensive care unit or prophylactic anticoagulation can be considered.

Keywords: COVID-19, subphenotypes, K-prototypes clustering, machine learning, mortality prediction

Introduction

Coronavirus disease 2019 (COVID-19), a global public health crisis, continues to pose challenges despite the implementation of preventive measures. This highly contagious viral infection has rapidly spread worldwide, raising significant health concerns.¹ As of March 12, 2024, there have been 704 million confirmed COVID-19 cases globally, resulting in 7 million reported deaths. On December 7, 2022, the Chinese government introduced the New 10 epidemic prevention policy, effectively ending the dynamic zero-COVID strategy.² This change triggered a new wave of the omicron variant, leading to a sudden increase in disease burden that presents substantial challenges to clinical practice.

Since the early stages of the global COVID-19 pandemic caused by the SARS-CoV-2 virus, a wide range of clinical outcomes has been observed, from asymptomatic cases and mild manifestations to critical instances and fatal respiratory failure.^3,4 The mortality rate of COVID-19 varies, and treatment responses differ significantly.⁵ These disparities underscore the clinical and biological heterogeneity of COVID-19. Certain patient characteristics are associated with more severe disease and worse outcomes, including older age, male gender, diabetes mellitus (DM), cardiovascular disease, chronic obstructive pulmonary disease (COPD), and chronic kidney disease. Additionally, patients exhibiting a high inflammatory response may face an increased risk of death.

Phenotypes have been identified within heterogeneous diseases and are linked to significant prognostic and therapeutic implications.^6–9 Various studies have proposed different phenotypes of COVID-19.^10–15 However, most research on COVID-19 has been conducted during the early stages of the epidemic. Throughout the pandemic, the virus has frequently mutated during widespread transmission, leading to gene recombination when different subtypes infect individuals.^16,17 These genetic mutations or recombination can alter the biological characteristics of the virus and affect the entire pathophysiological process.

The primary objective of this study is to employ unsupervised clustering methods to identify distinct clinical phenotypes and to establish a supervised prediction model applicable to available admission datasets. The secondary objective is to evaluate factors independently associated with COVID-19 mortality. This research aims to enhance understanding of the clinical phenotypes of COVID-19 during a critical period, identify factors influencing prognosis, distinguish characteristics of the current viral infection process, and provide evidence-based guidance for subsequent treatment and early identification of key risk factors.

Clustering in medicine brings multiple benefits. In precision medicine, it refines disease sub-typing via multi-omics for personalized treatments. In public health, it exposes infectious disease trends for resource optimization. In clinics, it improves imaging diagnosis, mental health disorder classification and intervention evaluation, showing great potential in medical care.

K-prototypes clustering is an unsupervised machine learning method that effectively identifies clusters in heterogeneous data.¹⁸ This approach excels at integrating both continuous and categorical variables deemed clinically relevant.¹⁹ In this study, we employ consensus k-prototypes clustering to derive phenotypes and assess phenotype reproducibility using agglomerative hierarchical clustering.

Materials and Methods

Patients Screening

The study included clinical data from COVID-19 patients hospitalized at Xinhua Hospital, Shanghai Jiao Tong University, between December 15, 2022, and January 15, 2023. Inclusion criteria comprised a positive nucleic acid or antigen test, a requirement for hospitalization, and availability of the 51 clinical variables described below. Exclusion criteria included hospitalization for non-respiratory issues, incomplete treatment or voluntary discharge, and age under 18.

Clinical Variables for Clustering

The features correlated with the clinical course and outcomes of COVID-19 were carefully considered. Data collection encompassed 51 characteristics, including age, sex, comorbidities (such as congestive heart failure, chronic pulmonary disease, rheumatic disease, renal disease, liver disease, and diabetes), the Charlson Comorbidity Index (CCI), length of stay (LOS), and various inflammatory markers (eg C-reactive protein [CRP], neutrophil count [NE-num], neutrophil percentage [Ne-per], lymphocyte count [Ly-num], monocyte count [Mo-num], interleukin-8 [IL-8], interleukin-1β [IL-1β], interleukin-6 [IL-6], interleukin-10 [IL-10], tumor necrosis factor-alpha [TNF-α], interleukin-2 receptor [IL-2R], and growth stimulation expressed gene 2 [ST2]). Hepatic markers included total protein (TP), albumin (ALB), total bilirubin (tBil), alanine transaminase (ALT), and gamma-glutamyl transferase (GGT). Renal markers included blood urea nitrogen (BUN), creatinine (Crea), chloride (Cl), sodium (Na), and potassium (K). Cardiovascular markers comprised troponin I (TnI), N-terminal pro-brain natriuretic peptide (NT-proBNP), creatine kinase-MB (CK-MB), and myoglobin (Myo). Hematologic markers included red blood cell (RBC) count, red blood cell distribution width (RDW), platelet count (PLT), and platelet distribution width (PDW). Coagulation markers encompassed prothrombin time (PT), activated partial thromboplastin time (APTT), fibrinogen (Fib), thrombin time (TT), and D-dimer (DD). Lipid metabolism markers included apolipoprotein E (APOE), low-density lipoprotein (LDL-C), high-density lipoprotein (HDL-C), triglycerides (TG), and apolipoprotein A (Apo-a). Laboratory values, age, gender, and comorbidities were used to generate subphenotypes. Both the median and interquartile range (IQR) of all laboratory values for each patient during admission were utilized as features. Features with more than 30% missing data were excluded from the analysis.

Data Pre-Processing and Subphenotype Generation

The k-prototypes algorithm extends k-means clustering to accommodate both numerical and categorical variables. It calculates a distance for each variable independently and then sums these distances to obtain the overall distance between samples, which makes k-means-style clustering applicable to mixed data. We first assessed missingness and variable correlations, employing the k-nearest neighbor method (the R package DMwR2, https://cran.r-project.org/web/packages/DMwR2/) to impute the missing data. Continuous variables were standardized, and the optimal number of phenotypes was determined using the k-prototypes method (the KPrototypes class of the Python package kmodes, https://github.com/nicodv/kmodes), guided by the elbow method and Silhouette value. To validate the k-prototypes results, we conducted agglomerative hierarchical clustering (the R package NbClust, https://cran.r-project.org/web/packages/NbClust/) based on continuous clinical variables, using the Ward linkage criterion for hierarchy construction.
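The per-variable distance described above can be illustrated with a minimal sketch. It assumes the commonly used k-prototypes dissimilarity (squared Euclidean distance on standardized numerical variables plus a weighted count of categorical mismatches). The weight `gamma`, the function names, and the example values are illustrative assumptions and are not reported in this study.

```python
from typing import Sequence, Tuple, List

def mixed_dissimilarity(num_a: Sequence[float], num_b: Sequence[float],
                        cat_a: Sequence, cat_b: Sequence,
                        gamma: float = 1.0) -> float:
    """Dissimilarity between a sample and a prototype on mixed data.

    Numerical part: sum of squared differences (variables assumed standardized).
    Categorical part: number of mismatching categories, weighted by gamma.
    gamma is an assumed tuning weight, not a value taken from the paper.
    """
    numeric_part = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    categorical_part = sum(1 for x, y in zip(cat_a, cat_b) if x != y)
    return numeric_part + gamma * categorical_part

def nearest_prototype(num: Sequence[float], cat: Sequence,
                      prototypes: List[Tuple[Sequence[float], Sequence]],
                      gamma: float = 1.0) -> int:
    """Index of the prototype with the smallest mixed dissimilarity."""
    distances = [mixed_dissimilarity(num, p_num, cat, p_cat, gamma)
                 for p_num, p_cat in prototypes]
    return min(range(len(prototypes)), key=distances.__getitem__)

# Hypothetical patient: two standardized lab values and two boolean comorbidities.
patient_num, patient_cat = [1.2, 0.8], (True, True)
# Hypothetical prototypes: (numerical centroid, categorical modes).
prototypes = [([-0.5, -0.4], (False, False)),
              ([1.0, 0.9], (True, False))]
assigned = nearest_prototype(patient_num, patient_cat, prototypes, gamma=0.5)
```

In a full k-prototypes run, this assignment step alternates with updating each prototype (means for numerical variables, modes for categorical ones) until assignments stop changing.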

Heatmaps and Chord Diagrams Represent the Distribution of Variables in Subphenotypes

We visualized the distribution patterns of variables across different subphenotypes using chord diagrams and heatmaps, organizing variables into comorbidities and system- or organ-related data. Chord diagrams were generated to illustrate chronic comorbidities by subphenotype (the R package circlize, https://cran.r-project.org/web/packages/circlize/). Heatmaps displayed the distribution differences of the variables relative to the complete derivation of subphenotypes, with red indicating higher values and blue indicating lower values.

Machine Learning Algorithms Predict Subphenotype Classification

We constructed predictive models for identifying phenotypes using various machine learning algorithms, including the extreme gradient boosting model (XGBoost), gradient boosting model (GBM), light gradient boosting model (LGBM), AdaBoost, decision tree model, logistic regression model, naive Bayes model, and support vector machine (SVM). The models were implemented in Python (xgboost, sklearn.ensemble.GradientBoostingClassifier, lightgbm.LGBMClassifier, sklearn.ensemble.AdaBoostClassifier, sklearn.tree, sklearn.linear_model.LogisticRegression, sklearn.naive_bayes.GaussianNB, sklearn.svm). SHAP values were employed to identify the most important classifier variables.

Statistical Analysis

The train_test_split function from the sklearn.model_selection module was used to proportionally divide the data into training and test sets. The accuracy_score, precision_score, recall_score, roc_auc_score, and f1_score functions from the sklearn.metrics module were employed to calculate accuracy, precision, recall, AUC, and F1 score, respectively. Continuous variables are reported as medians with interquartile ranges (IQR), while categorical variables are expressed as counts and percentages. Categorical variables were compared using the Pearson χ²-test or Fisher’s exact test, whereas continuous variables were analyzed using the Mann–Whitney U-test (R version 4.2.1 “Funny-Looking Kid”; sklearn module with Python version 3.9.7).
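For readers less familiar with these metrics, the following sketch shows how accuracy, precision, recall, and F1 follow from the four cells of a binary confusion matrix. The counts are hypothetical and the function names are illustrative. The last line checks that the F1 score is the harmonic mean of precision and recall, using the precision (0.968) and recall (0.976) reported in the abstract.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

def f1_from(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical confusion-matrix counts, for illustration only.
example = classification_metrics(tp=40, fp=10, fn=20, tn=30)

# Consistency check on the abstract's reported precision and recall.
reported_f1 = round(f1_from(0.968, 0.976), 3)
```

Under these definitions, a precision of 0.968 and a recall of 0.976 give an F1 of 0.972 after rounding, which agrees with the reported F1.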

Results

Patient Screening, Data Preprocessing, and Baseline Information

A total of 1376 patients aged 18 years or older with confirmed COVID-19 infection, verified by nucleic acid testing during hospitalization at Xinhua Hospital affiliated with Shanghai Jiao Tong University School of Medicine, were included in the study from December 15, 2022, to January 15, 2023 (Figure 1). We excluded variables with a missing rate greater than 30% and those that were highly correlated (r > 0.8; see Supplementary Material for correlation matrix). Ultimately, we included 50 variables: 2 demographic markers, 39 relevant laboratory markers (comprising 12 inflammatory markers, 4 hepatic markers, 5 renal markers, 4 cardiovascular markers, 4 hematologic markers, 5 coagulation markers, and 5 lipid markers), 8 indicators of complications, length of hospitalization, and hospital mortality. The K-nearest neighbors (KNN) method was used to impute missing values in the variables.

Figure 1 Flowchart showing the exclusion and enrolment of COVID-19 patients, subphenotype derivation, and model development and performance by machine learning method. Xinhua Hospital: Xinhua Hospital Affiliated to Shanghai Jiao Tong University.

Abbreviation: AUROC, Area Under the Receiver Operating Characteristic curve.

The baseline profiles of the patients are presented in Table 1. Significant differences were observed in demographic indicators and comorbidities, including chronic heart failure, chronic lung disease, renal disease, diabetes mellitus, tumor presence, and CCI, between the survival and non-survival groups. All inflammatory markers, except for the monocyte count, differed between the groups. Significant differences were also found in hepatic, cardiovascular, and renal markers, particularly in urea nitrogen, creatinine, and blood potassium levels. Hematologic markers, such as red blood cells, platelets, and platelet distribution width, showed significant differences. Most coagulation markers, except for APTT, and lipid markers, except for triglycerides, also varied significantly between the groups. Moreover, a significant difference in the length of hospitalization was observed between the survival and non-survival groups. The non-survival group exhibited significantly higher levels of inflammatory factors (IL-8, IL-1β, IL-6, IL-10, TNF-α, IL-2R, ST2) and lower lymphocyte counts. This group also showed worse nutritional status (Alb) and more severe nutrient depletion, as indicated by elevated urea nitrogen and creatinine levels. Additionally, there was a more pronounced elevation of D-dimers and a higher prevalence of cardiovascular, renal, and diabetic comorbidities.

Table 1 Baseline Characteristics of All Individuals

The Study Population Can Be Divided Into Two Subphenotypes

The results of the previous section indicate that patients who succumbed to COVID-19 infection exhibited a more pronounced inflammatory response and higher consumption compared to surviving patients. Based on this observation and informed by data distribution and clinical experience, we hypothesized the presence of distinct subphenotypes among COVID-19 patients with markedly different characteristics. To further explore the presence of clinically heterogeneous subphenotypes within the COVID-19-infected population included in this study and to determine the optimal number of subphenotypes, we employed the K-prototype method, an adaptation of K-means, for subgrouping.

All variables, except for the length of hospitalization and survival outcomes, were included in the classification reference category. Continuous variables were normalized using the Python scale function, while factor variables were converted to boolean types and then incorporated into the model for classification. The number of clusters k was varied from 2 to 9, during which cost values and silhouette scores were computed. A line graph was plotted (Figure 2) to identify the optimal number of subphenotypes (k).

Figure 2 Elbow diagram and Silhouette plot of the k-prototypes model. (a) The inflection point of the elbow diagram is at 2 clusters, indicating that the 2-cluster solution is the best classification. (b) The Silhouette value is maximal at 2 clusters, indicating that the 2-cluster solution is the best categorization.

The inflection point in the elbow diagram (Figure 2a) appears at k=3, indicating that clustering tends to form in either 2 or 3 clusters. Additionally, the Silhouette diagram (Figure 2b) shows the highest Silhouette score occurring with 2 clusters. Considering these factors, the optimal number of subphenotypes for this study population is 2.
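The Silhouette criterion used here can be sketched as follows. This is a minimal pure-Python illustration on numerical toy data with Euclidean distance. How the Silhouette value was computed for the mixed-type variables in this study is not stated, so the distance choice, the function names, and the toy points are assumptions.

```python
from math import dist

def silhouette_score(points, labels):
    """Mean silhouette over all samples (Euclidean distance).

    For sample i: a = mean distance to its own cluster, b = smallest mean
    distance to any other cluster, s = (b - a) / max(a, b).
    Samples in singleton clusters contribute 0 by convention.
    """
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, (p, own) in enumerate(zip(points, labels)):
        same = [dist(p, q) for j, (q, lab) in enumerate(zip(points, labels))
                if lab == own and j != i]
        if not same:
            continue  # singleton cluster: silhouette defined as 0
        a = sum(same) / len(same)
        b = min(
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == other)
            / labels.count(other)
            for other in clusters if other != own
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)

def best_k(labelings):
    """Pick the number of clusters whose labeling has the highest silhouette."""
    return max(labelings, key=lambda k: labelings[k][1])

# Toy data: two well-separated groups, labeled with k = 2 and k = 3.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
scores = {k: (labels, silhouette_score(points, labels))
          for k, labels in {2: [0, 0, 1, 1], 3: [0, 0, 1, 2]}.items()}
chosen_k = best_k(scores)
```

The elbow method complements this by plotting the clustering cost against k and looking for the point where further increases in k yield little additional reduction.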

Hyperinflammatory Phenotype Vs Hypoinflammatory Phenotype

Based on the previous validation using the elbow method and Silhouette diagram, clustering into two phenotypes was identified as the optimal solution. The overall characteristics of the patients after clustering were displayed in Table 2, while differences in continuous variables were illustrated in the heat map (Figure 3). Variations in comorbidity factor variables were depicted in the chord diagram (Figure 4). Differences in 60-day survival between subphenotypes were presented in the survival curves (Figure 5).

Table 2 Identified Subphenotype Characteristics

Figure 3 Heatmap illustrating the distribution of continuous variables across subphenotypes.

Figure 4 Chord diagrams in the comorbidity burden among subphenotypes.

Abbreviations: CHF, congestive heart failure; CPD, chronic pulmonary disease; RHD, rheumatic disease; RD, renal disease; LD, liver disease; DM, diabetes.

Figure 5 Kaplan–Meier (KM) plots depicting 60-day mortality across different subphenotypes. The survival probabilities were presented with a 95% confidence interval. The X-axis represents the number of days since COVID-19 confirmation, while the Y-axis indicates the survival probability.

Subphenotype I (Hypoinflammatory)

This subphenotype comprised 824 patients (59.9%). Compared to the other subphenotype, it included a younger population (median age 68 years, IQR [59, 75]) and a higher proportion of female patients. These patients had a lower Charlson Comorbidity Index (median 5.0, IQR [3.0, 6.0]) and exhibited better clinical outcomes, including a lower mortality rate (N = 13, 1.6%) and a shorter hospital stay (median 8.0 days, IQR [5.0, 14.0]).

Subphenotype II (Hyperinflammatory)

This subphenotype comprised 552 patients (40.1%). Compared to Subphenotype I, it included a higher proportion of older patients (median age 78 years, IQR [71, 86]) and male patients (N = 386, 70%). Patients in Subphenotype II showed more abnormal clinical values across most variables, including elevated inflammatory markers (eg, C-reactive protein, neutrophil percentage, interleukin-6, interleukin-1β, tumor necrosis factor-α, interleukin-2 receptor, growth stimulation expressed gene 2), lower lymphocyte counts, and abnormal cardiovascular markers (TnI, NT-proBNP, creatine kinase-MB, myoglobin). Markers for renal dysfunction (blood urea nitrogen, creatinine), hepatic function (total bilirubin, alanine transaminase, gamma-glutamyl transferase, albumin), hematologic parameters (red blood cell count, red blood cell distribution width, platelet count, platelet distribution width), and coagulation (D-dimer, fibrinogen) were significantly abnormal. The chronic comorbidity burden, as measured by the Charlson Comorbidity Index (median 15, IQR [9, 20]), was higher than in Subphenotype I, with a greater incidence of congestive heart failure (N = 214, 39%), renal disease (N = 188, 34%), and diabetes (N = 234, 42%), but a lower incidence of chronic pulmonary disease (N = 72, 13%). In line with these characteristics, patients in Subphenotype II had worse clinical outcomes, including higher mortality (N = 139, 25%) and a longer hospital stay (median 15 days, IQR [9, 20]).

The hyperinflammatory phenotype exhibits a significantly more intense inflammatory response, more pronounced cardiovascular damage, greater nutrient depletion, a higher risk of thrombosis, a greater burden of chronic comorbidities, higher mortality, and worse 60-day survival compared to the hypoinflammatory phenotype.

Hierarchical Clustering to Re-Validate the Clustering

To verify the validity and robustness of the clustering results, the COVID-19 population in this study was re-clustered using agglomerative hierarchical clustering, which confirmed the statistical adequacy of the two-subphenotype model. This model categorized 785 patients (57.05%) into subphenotype I and 591 patients (42.95%) into subphenotype II. The dendrogram of this new classification is presented in Figure 6. The clinical characteristics of the phenotypes derived from this method were comparable to those obtained from the k-prototypes method. This consistency supports the internal reliability of the subphenotypes.
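One simple way to quantify how closely two clustering solutions agree is to cross-tabulate their labels and compute the fraction of patients assigned to matching clusters. The sketch below assumes two-cluster solutions coded 0 and 1 and allows for the fact that cluster numbering is arbitrary. The study does not report such an agreement statistic, and the labels shown are hypothetical.

```python
from collections import Counter

def cross_tab(labels_a, labels_b):
    """Counts of samples for each (label in A, label in B) pair."""
    return Counter(zip(labels_a, labels_b))

def two_cluster_agreement(labels_a, labels_b):
    """Fraction of samples on which two 2-cluster solutions agree.

    Labels must be coded 0/1. Because cluster numbering is arbitrary,
    the better of the two possible label matchings is returned.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label vectors must have the same length")
    if not set(labels_a) | set(labels_b) <= {0, 1}:
        raise ValueError("labels must be coded 0 or 1")
    same = sum(a == b for a, b in zip(labels_a, labels_b))
    fraction = same / len(labels_a)
    return max(fraction, 1.0 - fraction)

# Hypothetical labels from two clustering methods for five patients.
kproto_labels = [0, 0, 1, 1, 1]
hier_labels = [1, 1, 0, 0, 1]
table = cross_tab(kproto_labels, hier_labels)
agreement = two_cluster_agreement(kproto_labels, hier_labels)
```

Chance-corrected measures such as the adjusted Rand index serve the same purpose when more than two clusters are compared.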

Figure 6 The dendrogram of agglomerative hierarchical clustering. Euclidean distance was calculated using continuous variables for hierarchical clustering. The obtained clustering results and the clinical characteristics of the clusters are similar to those obtained by k-prototypes.

Machine Learning Approach to Predict Two Subphenotypes of COVID-19 Patients

A heterogeneous two-phenotype classification of COVID-19 patients was identified. To improve the clinical applicability of this finding and identify the most influential metrics for the two-phenotyping, we employed eight machine learning methods: Adaboost, LightGBM, XGBoost, Gradient Boosting, Support Vector Machine (SVM), Logistic Regression, Decision Tree, and Naive Bayes. These models were employed not only to predict the two subphenotypes but also to rigorously identify and prioritize the variables that are most critical for distinguishing between the phenotypes.

A total of 48 non-outcome variables were included in the machine learning models, while in-hospital mortality was treated as a separate outcome variable. Table 3 presents the model performance evaluations, highlighting the Adaboost model as the top performer. The model achieved an accuracy of 0.975, a precision of 0.968, a recall of 0.976, and an F1 score of 0.972. The variable importance and SHAP value rankings are illustrated in Figure 7.

Table 3 Predictive Performance of Machine Learning Models

Figure 7 Machine learning importance ranking and SHAP plot: (a) Prognostic importance ranking of COVID-19 patients in the Adaboost model; (b) SHAP values for each feature across all samples in the Adaboost model.

Abbreviations: SHAP, SHapley Additive exPlanations; CRP, C-reactive protein; ne_num, neutrophil number; ly_num, lymphocyte number; IL-8, interleukin-8; IL-1β, interleukin-1β; IL-6, interleukin-6; TNF-α, tumor necrosis factor-α; IL-2R, interleukin-2 receptor; ST2, growth stimulation expressed gene 2; DD, D-dimer; TnI, troponin I; NT-proBNP, N-terminal pro-brain natriuretic peptide; CK-MB, creatine kinase-MB; Myo, myoglobin; RDW, red blood cell distribution width; Apo-a, apolipoprotein A; LDL-C, low-density lipoprotein; HDL-C, high-density lipoprotein.

In the Adaboost model, the most significant contributors to subphenotype prediction were CRP (6.4%), DD (5.6%), IL-2R (5.2%), BUN (4.9%), ST2 (4.7%), NT-proBNP (4.6%), ly_num (4.5%), ne_per (4.2%), HDL-C (4.1%), Myo (3.9%), Crea (3.8%), IL-6 (3.5%), Apo-a (3.3%), and Alb (3.3%).

In summary, based on the machine learning results, CRP, D-dimer, and IL-2R appear to be key predictors of the high inflammation phenotype, warranting closer attention in clinical practice.

Discussion

This study utilized comprehensive clinical data, including blood biomarkers and comorbidities at hospital admission, to derive the phenotypes of COVID-19 patients at Xinhua Hospital, affiliated with Shanghai Jiao Tong University School of Medicine, using k-prototypes clustering, with validation by hierarchical clustering. Machine learning methods were employed to predict the contribution of variables to the formation of subphenotypes. Ultimately, COVID-19 patients were categorized into a hyperinflammatory phenotype, characterized by more severe conditions, higher mortality, and a poorer survival prognosis, and a hypoinflammatory phenotype, which exhibits relatively milder conditions, lower mortality, and a better prognosis. CRP, D-dimer, and IL-2R are crucial predictors in distinguishing between the two subgroups. The techniques employed in this study effectively captured the complex clinical and biological heterogeneity among COVID-19 patients. Notably, this study was the first to encompass a large cohort during the major COVID-19 outbreak in Shanghai. The clinical symptoms exhibited by COVID-19 patients vary widely, ranging from asymptomatic or mild cases to critically severe conditions, including deadly respiratory failure.^20–22 This vast disparity in symptoms suggests the presence of distinct population groups that respond in markedly different ways. For patients with severe COVID-19 pneumonia, progression to acute respiratory distress syndrome (ARDS) poses a significant challenge, as reversing the condition at this stage is extremely difficult. Therefore, early identification of cases at risk of progressing to severe illness is crucial.

In response to this clinical need, our study analyzes subgroups within the COVID-19 population and seeks to predict their classification using machine learning methods. Subphenotype II can aptly be termed the “Hyperinflammatory phenotype” due to its association with the most unfavorable clinical outcomes. Subphenotype II is defined by factors such as C-reactive protein (CRP), IL-2R, D-dimer, ST2, BUN, NT-proBNP, neutrophil percentage, lymphocyte count, myoglobin, TNF-α, Apo-A, albumin (ALB), and IL-6. It has been well documented that higher neutrophil counts and the formation of neutrophil extracellular traps (NETs) are significantly associated with critical illness in COVID-19 patients.²³ Lymphocyte counts are significantly negatively correlated with prognosis, implying a destructive effect of COVID-19 on the immune system.²⁴ Consistent with these reports, our study observes that in Subphenotype II the neutrophil count is significantly elevated, while the lymphocyte count is markedly reduced. Gender influences COVID-19 clinical severity, with females typically exhibiting milder symptoms.²⁵ Patients in Subphenotype II exhibit elevated D-dimer (DD) and fibrinogen (Fib), along with a prothrombotic state.²⁶ Thrombotic events and neurological symptoms, both life-threatening, are common in COVID-19 cases.²⁷ Interestingly, we discovered that these patients also presented with hypoalbuminemia at an early stage, accompanied by low apolipoprotein A (Apo-A). Due to their low nutritional status, their ability to resist infection is weakened. Unfortunately, COVID-19 infection may further deplete albumin levels, leading to a higher early mortality rate.

For patients with the “Hyperinflammatory phenotype” (Subphenotype II), aggressive treatment strategies may be necessary to control the excessive inflammatory response. This could include administering anti-inflammatory drugs, cytokine inhibitors, or other therapies targeting the specific biomarkers associated with this phenotype. In addition, due to the heavier and more complex comorbidities in the hyperinflammatory phenotype, close monitoring of organ function and early intervention to prevent organ damage are essential for improving patient prognosis. Conversely, Subphenotype I can be accurately described as the “Hypoinflammatory phenotype.” This subphenotype is distinguished by significantly younger patients and a lower proportion of males. In comparison to Subphenotype II, Subphenotype I is associated with lower disease severity and higher survival rates.

This research partially aligns with previous retrospective studies in Spain, the USA, and France, which identified three to five phenotypes, compared to the two identified in our study. Gutiérrez et al¹⁰ (Spain) analyzed 4035 patients across 127 hospitals, identifying three phenotypes: Phenotype A (19%) with younger individuals, mild symptoms, normal inflammatory patterns, and higher lymphocyte counts; Phenotype B (73%) with more symptoms, no pulmonary infiltrations but interstitial changes, obesity, and moderately elevated inflammatory markers; and Phenotype C (7%) with patients having more obesity, comorbidities, and higher inflammatory biomarkers.

In the USA, Su et al¹¹ analyzed 14,418 patients from five hospitals and identified four subphenotypes: Subphenotype I (33%) characterized by younger patients, more females, and fewer comorbidities; Subphenotype II (37%) with a higher proportion of males and abnormal inflammation markers; Subphenotype III (18%) involving older patients, Black ethnicity, renal dysfunction, and hematologic changes; and Subphenotype IV (12%) comprising older patients, more males, higher comorbidity burden, and abnormal biomarkers.

In France, Azoulay et al¹² studied 85 ICU patients and identified three phenotypes: Cluster 1 (43.5%, low mortality) characterized by females, low ferritin, D-dimers, and CRP but intermediate IL-6; Cluster 2 (20%, intermediate mortality) consisting of younger patients, 88% males, no fever, acute kidney injury, and intermediate ferritin, D-dimers, and CRP; and Cluster 3 (36.5%, high mortality) with predominantly older men, severe hypoxemia, fever, and intense inflammatory syndrome.

Our Subphenotype II closely aligns with Phenotype C from the Spain study, Subphenotype IV from the USA study, and Cluster 3 from the France study, all characterized by older patients with more comorbidities and the highest mortality rates. Identifying this phenotype is critical. In this study, we established a supervised prediction model for the identified subphenotypes.

After comprehensive evaluation and comparison of predictive performance, the Adaboost model emerged as the optimal selection. The Boosting method is a highly renowned and robust algorithm that combines multiple weak learners linearly to achieve a strong learner. It can also be regularized through early stopping techniques. Boosting is regarded as one of the most effective machine learning methods for both classification and regression tasks.²⁸

AdaBoost, a classic representative of the Boosting method, is among the most successful Boosting algorithms. It is derived from the integration of results from weighted voting problems and online allocation problems, removing the requirement for prior knowledge of weak learners. By recalculating weights in each iteration, AdaBoost enables the subsequent classifiers to focus on samples that are difficult to distinguish while assigning greater weight to the classifiers that performed well in the previous iteration.²⁹
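The reweighting described above can be made concrete with one round of textbook discrete AdaBoost for labels coded as -1 and +1. This is an illustrative sketch with hypothetical predictions. It is not necessarily the exact variant run by sklearn.ensemble.AdaBoostClassifier, and the function name is an assumption.

```python
from math import exp, log

def adaboost_round(weights, y_true, y_pred):
    """One round of discrete AdaBoost (labels in {-1, +1}).

    Returns the weak learner's vote weight (alpha) and the renormalized
    sample weights. Misclassified samples gain weight; correctly
    classified samples lose weight.
    """
    total = sum(weights)
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p) / total
    err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0) and division by 0
    alpha = 0.5 * log((1 - err) / err)     # more accurate learner -> larger vote
    updated = [w * exp(-alpha * t * p)
               for w, t, p in zip(weights, y_true, y_pred)]
    norm = sum(updated)
    return alpha, [w / norm for w in updated]

# Four samples with equal starting weights; the weak learner misclassifies the last one.
start = [0.25, 0.25, 0.25, 0.25]
y_true = [1, 1, -1, -1]
y_pred = [1, 1, -1, 1]
alpha, new_weights = adaboost_round(start, y_true, y_pred)
```

After this round the single misclassified sample carries half of the total weight, so the next weak learner is pushed to classify it correctly. The final prediction is the sign of the alpha-weighted sum of the weak learners' votes.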

Our prediction model achieved the desired predictive results in the study population, offering a feasible and highly accurate method for applying the identified subgroups in clinical practice. The SHAP plot reveals that IL-2R,³⁰ TNF-α, and IL-6^31,32 constitute the majority of inflammatory factors, differing from earlier studies that emphasized IL-6, IL-10, and IL-8. This finding may provide new insights and guidance for the future treatment of COVID-19.

Soluble suppression of tumorigenicity-2 (sST2), the soluble isoform of ST2 and a member of the interleukin-1 (IL-1) superfamily, has emerged as a promising prognostic biomarker for sepsis. sST2 is released not only due to vascular congestion but also in response to inflammatory and pro-fibrotic stimuli. The IL-33/sST2 complex inhibits anti-inflammatory cytokine release and promotes the activation and release of inflammatory cytokines such as IL-6 and TNF-α, triggering inflammation.³³,³⁴ Previous research indicates that sST2 levels correlate with COVID-19 severity and prognosis, including ICU admission, ventilator use, thrombosis, and mortality. ST2 ranks highly in our prediction model’s importance, supporting its role as a COVID-19 biomarker for classification and prognosis.³⁵⁻³⁷

Cardiac injury associated with COVID-19 is highly prevalent in clinical settings, primarily evidenced by increases in troponin I (TnI), myoglobin (Myo), and NT-proBNP.³⁸,³⁹ Within the classification model, Myo and NT-proBNP rank prominently, indicating that cardiac dysfunction significantly influences patient classification and prognosis. Patients with a history of cardiac dysfunction tend to have poorer outcomes. B-type natriuretic peptides (BNPs) are predominantly produced in the heart and released into circulation in response to heightened wall tension in both the atria and ventricles. However, elevated NT-proBNP levels can arise from non-cardiac causes as well, including impaired renal function, advanced age, malnutrition (indicated by low albumin levels), and increased C-reactive protein levels. Additionally, hypoxia and various hormones, such as catecholamines, angiotensin II, and endothelin, can stimulate NT-proBNP secretion.

This research not only deepens our understanding of the disease but also provides practical tools for healthcare professionals to manage and treat COVID-19 patients more effectively. By tailoring treatment strategies according to these phenotypes, we aim to enhance patient outcomes and contribute to the global response against COVID-19.

This study has notable limitations. Firstly, as a retrospective analysis, it lacks external validation of the identified phenotypes, which significantly restricts the generalizability and reliability of our findings. Secondly, the analysis focuses solely on hospitalized patients, limiting the applicability of the results to those with mild COVID-19 requiring outpatient care, whose characteristics and potential phenotypes are not included. This limitation underscores the need for future research to encompass a broader patient spectrum for a more comprehensive understanding of COVID-19 manifestations. Thirdly, no significant difference in the proportion of chronic pulmonary disease among subgroups was observed, likely due to insufficient information on disease severity or type within this cohort. This limitation may introduce bias and warrants further investigation in subsequent studies.

Conclusions

Two phenotypes of COVID-19 were identified based on symptoms, complications, inflammatory markers, infection markers, end-organ dysfunction, and comorbidities. The machine learning model accurately predicted COVID-19 patient survival and severity based on identified phenotypes. After comprehensive evaluation and comparison, the AdaBoost model emerged as the optimal selection. This model effectively pinpointed ‘CRP’, ‘IL-2R’, ‘D-dimer’, ‘ST2’, ‘BUN’, ‘NT-proBNP’, ‘Neutrophil percent’, and ‘Lymphocyte number’ as crucial predictors for in-hospital mortality in COVID-19 patients, highlighting their significant role in assessing the prognosis of these patients.

Data Sharing Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of Xinhua Hospital Affiliated to Shanghai Jiao Tong University (protocol code XHEC-D-2024-156). Only participants who voluntarily gave written informed consent were enrolled.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Funding

This research was funded by the Shanghai Municipal Health Commission Key Supporting Subject Researching Project (No. 2023ZDFC0106); the National Natural Science Foundation of China (No. 82172138); the Innovation Research Project of Shanghai Science and Technology Commission (No. 21Y11902400); and the Medical Innovation Research Project of Shanghai Science and Technology Commission (No. 23Y31900102).

Disclosure

The authors declare no conflicts of interest.

References

1. Lindeboom RGH, Worlock KB, Dratva LM, et al. Human SARS-CoV-2 challenge uncovers local and systemic response dynamics. Nature. 2024;631(8019):189–198. doi:10.1038/s41586-024-07575-x

2. Huang S, Gao Z, Wang S. China’s COVID-19 reopening measures-warriors and weapons. Lancet. 2023;401(10377):643–644. doi:10.1016/S0140-6736(23)00213-1

3. Oran DP, Topol EJ. Prevalence of Asymptomatic SARS-CoV-2 infection: a narrative review. Ann Intern Med. 2020;173(5):362–367. doi:10.7326/M20-3012

4. Gandhi RT, Lynch JB, Del Rio C. Mild or Moderate Covid-19. N Engl J Med. 2020;383(18):1757–1766. doi:10.1056/NEJMcp2009249

5. Mateu L, Tebe C, Loste C, et al. Determinants of the onset and prognosis of the post-COVID-19 condition: a 2-year prospective observational cohort study. Lancet Reg Health Eur. 2023;33:100724. doi:10.1016/j.lanepe.2023.100724

6. Yang X, Yu Y, Xu J, et al. Clinical course and outcomes of critically Ill Patients with SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational study. Lancet Respir Med. 2020;8(5):475–481. doi:10.1016/S2213-2600(20)30079-5

7. Cummings MJ, Baldwin MR, Abrams D, et al. Epidemiology, Clinical Course, and Outcomes of Critically Ill Adults with COVID-19 in New York City: a Prospective Cohort Study. Lancet. 2020;395(10239):1763–1770. doi:10.1016/S0140-6736(20)31189-2

8. Richardson S, Hirsch JS, Narasimhan M, et al. Presenting characteristics, comorbidities, and outcomes among 5700 Patients hospitalized with COVID-19 in the New York City Area. JAMA. 2020;323(20):2052–2059. doi:10.1001/jama.2020.6775

9. Grasselli G, Zangrillo A, Zanella A, et al. Baseline characteristics and outcomes of 1591 patients infected with SARS-CoV-2 admitted to ICUs of the Lombardy Region, Italy. JAMA. 2020;323(16):1574–1581. doi:10.1001/jama.2020.5394

10. Gutiérrez-Gutiérrez B, Del Toro MD, Borobia AM, et al. Identification and validation of clinical phenotypes with prognostic implications in patients admitted to hospital with COVID-19: a multicentre cohort study. Lancet Infect Dis. 2021;21(6):783–792. doi:10.1016/S1473-3099(21)00019-0

11. Su C, Zhang Y, Flory JH, et al. Clinical Subphenotypes in COVID-19: derivation, validation, prediction, temporal patterns, and interaction with social determinants of health. NPJ Digit Med. 2021;4(1):110. doi:10.1038/s41746-021-00481-w

12. Azoulay E, Zafrani L, Mirouse A, Lengliné E, Darmon M, Chevret S. Clinical phenotypes of critically Ill COVID-19 Patients. Intensive Care Med. 2020;46(8):1651–1652. doi:10.1007/s00134-020-06120-4

13. Mueller YM, Schrama TJ, Ruijten R, et al. Stratification of Hospitalized COVID-19 Patients into Clinical Severity Progression Groups by Immuno-Phenotyping and Machine Learning. Nat Commun. 2022;13(1):915. doi:10.1038/s41467-022-28621-0

14. Wang X, Jehi L, Ji X, Mazzone PJ. Phenotypes and subphenotypes of patients with COVID-19. Chest. 2021;159(6):2191–2204. doi:10.1016/j.chest.2021.01.057

15. Rodríguez A, Ruiz-Botella M, Martín-Loeches I, et al. Deploying unsupervised clustering analysis to derive clinical phenotypes and risk factors associated with mortality risk in 2022 critically Ill patients with COVID-19 in Spain. Crit Care. 2021;25(1):63. doi:10.1186/s13054-021-03487-8

16. Qu P, Xu K, Faraone JN, et al. Immune evasion, infectivity, and fusogenicity of SARS-CoV-2 BA.2.86 and FLip Variants. Cell. 2024;187(3):585–595.e6. doi:10.1016/j.cell.2023.12.026

17. Bhattacharya M, Chatterjee S, Lee S-S, Dhama K, Chakraborty C. Antibody evasion associated with the RBD significant mutations in several emerging SARS-CoV-2 variants and its subvariants. Drug Resist Updat. 2023;71:101008. doi:10.1016/j.drup.2023.101008

18. Preud’homme G, Duarte K, Dalleau K, et al. Head-to-head comparison of clustering methods for heterogeneous data: a simulation-driven benchmark. Sci Rep. 2021;11(1):4202. doi:10.1038/s41598-021-83340-8

19. Ceccato A, Forne C, Bos LD, et al. Clustering COVID-19 ARDS patients through the first days of ICU admission. An analysis of the CIBERESUCICOVID cohort. Crit Care. 2024;28:91. doi:10.1186/s13054-024-04876-5

20. Petrilli CM, Jones SA, Yang J, et al. Factors associated with hospital admission and critical illness among 5279 people with coronavirus disease 2019 in New York City: prospective Cohort Study. BMJ. 2020;369:m1966. doi:10.1136/bmj.m1966

21. Wu C, Chen X, Cai Y, et al. Risk factors associated with acute respiratory distress syndrome and death in patients with coronavirus disease 2019 pneumonia in Wuhan, China. JAMA Intern Med. 2020;180(7):934–943. doi:10.1001/jamainternmed.2020.0994

22. Chen T, Wu D, Chen H, et al. Clinical characteristics of 113 deceased patients with coronavirus disease 2019: retrospective study. BMJ. 2020;368:m1091. doi:10.1136/bmj.m1091

23. Ackermann M, Anders H-J, Bilyy R, et al. Patients with COVID-19: in the Dark-NETs of Neutrophils. Cell Death Differ. 2021;28(11):3125–3139. doi:10.1038/s41418-021-00805-z

24. Tan L, Wang Q, Zhang D, et al. Lymphopenia predicts disease severity of COVID-19: a descriptive and predictive study. Signal Transduct Target Ther. 2020;5:33. doi:10.1038/s41392-020-0148-4

25. Gebhard C, Regitz-Zagrosek V, Neuhauser HK, Morgan R, Klein SL. Impact of sex and gender on COVID-19 outcomes in Europe. Biology of Sex Differences. 2020;11(1):29. doi:10.1186/s13293-020-00304-9

26. Fanaroff AC, Lopes RD. COVID-19 Thrombotic Complications and Therapeutic Strategies. Annu Rev Med. 2023;74(1):15–30. doi:10.1146/annurev-med-042921-110257

27. Ryu JK, Yan Z, Montano M, et al. Fibrin drives thromboinflammation and neuropathology in COVID-19. Nature. 2024. doi:10.1038/s41586-024-07873-4

28. Bühlmann P, Yu B. Boosting. WIREs Computational Statistics. 2010;2:69–74. doi:10.1002/wics.55

29. Cao Y, Miao QG, Liu JC, Gao L. Advance and prospects of adaboost algorithm. Zidonghua Xuebao/Acta Automatica Sinica. 2013;39(6):745–758. doi:10.1016/S1874-1029(13)60052-X

30. Jang HJ, Leem AY, Chung KS, et al. Soluble IL-2R levels predict in-hospital mortality in COVID-19 patients with respiratory failure. J Clin Med. 2021;10(18):4242. doi:10.3390/jcm10184242

31. Schultheiß C, Willscher E, Paschold L, et al. The IL-1β, IL-6, and TNF Cytokine Triad is Associated with post-acute sequelae of COVID-19. Cell Rep Med. 2022;3:100663. doi:10.1016/j.xcrm.2022.100663

32. Coomes EA, Haghbayan H. Interleukin-6 in COVID-19: a systematic review and meta-analysis. Rev Med Virol. 2020;30:1–9. doi:10.1002/rmv.2141

33. Babic ZM, Zunic FZ, Pantic JM, et al. IL-33 Receptor (ST2) deficiency downregulates myeloid precursors, inflammatory NK and dendritic cells in early phase of sepsis. J Biomed Sci. 2018;25(1):56. doi:10.1186/s12929-018-0455-z

34. Xu H, Turnquist HR, Hoffman R, Billiar TR. Role of the IL-33-ST2 Axis in Sepsis. Mil Med Res. 2017;4:3. doi:10.1186/s40779-017-0115-8

35. Sabbatinelli J, Di Rosa M, Giuliani A, et al. Serum Levels of Soluble Suppression of Tumorigenicity 2 (sST2) and Heart-Type Fatty Acid Binding Protein (H-FABP) Independently Predict in-Hospital Mortality in Geriatric Patients with COVID-19. Mech Ageing Dev. 2023;216:111876. doi:10.1016/j.mad.2023.111876

36. Park M, Hur M, Kim H, et al. Soluble ST2 as a useful biomarker for predicting clinical outcomes in hospitalized COVID-19 patients. Diagnostics (Basel). 2023;13(2):259. doi:10.3390/diagnostics13020259

37. Li H, Liu L, Zhang D, et al. SARS-CoV-2 and viral sepsis: observations and hypotheses. Lancet. 2020;395(10235):1517–1520. doi:10.1016/S0140-6736(20)30920-X

38. Greene SJ, Chambers R, Lerman JB, et al. Sacubitril/valsartan and cardiovascular biomarkers among patients with recent COVID-19 infection: the PARACOR-19 randomized clinical trial. Eur J Heart Fail. 2024;26(6):1393–1398. doi:10.1002/ejhf.3199

39. Zhang Z, Tang L, Guo Y, et al. Development of biomarkers and prognosis model of mortality risk in patients with COVID-19. J Inflamm Res. 2024;17:2445–2457. doi:10.2147/JIR.S449497

Creative Commons License © 2025 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, 3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.
