Risk Management and Healthcare Policy, Volume 18

Inequalities in Mild Cognitive Impairment Risk Among Chinese Middle-Aged and Older Adults: Insights from an Integrated Learning Model

Authors Bi S, Guo D, Tan H, Chen Y, Li G

Received 23 January 2025

Accepted for publication 18 May 2025

Published 3 June 2025 Volume 2025:18 Pages 1793–1808

DOI https://doi.org/10.2147/RMHP.S519049

Checked for plagiarism Yes

Review by Single anonymous peer review


Editor who approved publication: Dr Gulsum Kubra Kaya



Shengxian Bi,1 Dandan Guo,2 Huawei Tan,1 Yingchun Chen,1 Gang Li3

1School of Medicine and Health Management, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, 430030, People’s Republic of China; 2School of Public Health and Health Sciences, Hubei University of Medicine, Shiyan, Hubei, 442000, People’s Republic of China; 3School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, People’s Republic of China

Correspondence: Yingchun Chen, School of Medicine and Health Management, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, People’s Republic of China, Email [email protected]; Gang Li, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, People’s Republic of China, Email [email protected]

Objective: This study aims to address inequalities in mild cognitive impairment (MCI) risk among Chinese middle-aged and older adults by developing an integrated learning framework to predict MCI risk and identify key contributing factors.
Methods: Using CHARLS data of 4626 participants, we developed a convolutional neural network-bidirectional long short-term memory-attention (CNN-BiLSTM-Attention) model to capture the temporal and spatial features of MCI progression. SHAP (Shapley Additive Explanations) analysis quantified feature importance and enhanced interpretability, while mediation analysis explored causal pathways, particularly focusing on the role of education. Model performance was compared with eight other frameworks, including LSTM-based models, using Receiver Operating Characteristic (ROC) curves and classification metrics.
Results: The CNN-BiLSTM-Attention model demonstrated relatively promising predictive performance (AUC: 0.7317), with moderately high sensitivity (0.6902) and a high negative predictive value (NPV) of 0.9414. Education emerged as the most critical predictor, followed by Instrumental Activities of Daily Living (IADL) and gender. Mediation analysis revealed that education influenced MCI risk indirectly through health insurance, social interaction, physical activity, and depression.
Conclusion: We present an interpretable, data-driven framework for predicting MCI risk while uncovering key inequality factors, particularly the pivotal role of education. The model’s robust performance and interpretability highlight its potential to inform public health strategies and interventions aimed at addressing inequalities in dementia risk.

Keywords: mild cognitive impairment, inequality, integrated learning, CNN-BiLSTM-Attention, SHAP analysis, Mediation analysis

Introduction

Mild cognitive impairment (MCI) is a clinical syndrome that represents an intermediate stage between normal aging and early dementia, characterized by noticeable cognitive decline while daily functional abilities remain relatively intact.1,2 Although memory impairment is the hallmark of MCI, it may also involve mild deficits in attention, language, executive functions, or visuospatial abilities. These cognitive deficits are underpinned by early pathological changes in brain regions such as the hippocampus, entorhinal cortex, and prefrontal cortex, which are known to be particularly vulnerable in the progression from normal aging to dementia.3 MCI is widely regarded as a high-risk state for dementia, particularly in the early stages of neurodegenerative diseases such as Alzheimer’s disease. Studies indicate that 15% of individuals with MCI progress to dementia within one year, and this rate increases to 44–64% over a three-year period.4,5 The prevalence of MCI increases with age, affecting 15–20% of individuals aged 60 years and older.6 In China, the situation is particularly concerning due to the rapidly aging population and the resulting strain on healthcare systems.7 Meta-analyses estimate that 19% of Chinese individuals aged 60 years or older have MCI, with prevalence exceeding 30% among those aged 80 years or older.8–10 Disparities in healthcare access exacerbate the issue, thereby leading to higher prevalence rates in rural areas, among individuals with lower levels of education, and in older populations.11,12 Recent findings suggest that such social and environmental disparities may not only influence diagnostic and treatment opportunities but also interact with underlying neural vulnerability.13 The aging population in China, coupled with socioeconomic disparities and unequal access to healthcare, highlights the urgent need for early identification and intervention strategies to prevent dementia progression, optimize healthcare resources, and reduce societal and familial burdens.

Research on the progression of MCI has increasingly incorporated machine learning techniques to enhance prediction accuracy and identify key risk factors. Numerous studies have combined neuroimaging data,14 clinical indicators,15 and biomarkers with methods such as Support Vector Machines (SVM), Random Forests (RF), and Convolutional Neural Networks (CNN) to predict MCI progression.16 Despite these advancements, significant challenges persist. The high dimensionality and noise inherent in such datasets often exacerbate the limitations of small sample sizes, increasing the risk of overfitting and reducing model generalizability.17 Additionally, the heterogeneity in patient data, including demographic and clinical variability, undermines model robustness and reliability.18 Some studies have utilized cross-sectional data from public databases to identify MCI risk factors.19–21 These approaches are inherently limited in establishing causal relationships and often fail to adequately incorporate spatial characteristics of individual patients, thereby compromising predictive accuracy. To address these challenges, recent research has emphasized the use of longitudinal data to capture dynamic changes in MCI progression. Long Short-Term Memory (LSTM) networks, for instance, have shown promise in modeling temporal sequences effectively by capturing time-dependent patterns in cognitive changes.22,23 While these models excel at describing temporal trajectories, they often fail to account for the complex interactions among multiple features across different time points, such as the interplay between individual characteristics and environmental factors. This limitation reduces their ability to fully elucidate the mechanisms underlying MCI progression and achieve reliable long-term predictions, underscoring the need for more comprehensive and interpretable modeling approaches.

In this study, we proposed an integrated learning framework tailored specifically for MCI to enhance the predictive performance of temporal data models and improve the interpretability of prediction systems. Using a large-scale prospective cohort dataset from China, this study focused on: 1) Designing and implementing a CNN-BiLSTM-Attention model that integrated spatial information into the model structure and accounted for demographic characteristics, health status features, and social participation features to analyze the temporal progression of MCI in middle-aged and older adults over a two-year period, and comparing its performance with other LSTM-based models to identify the optimal mechanism for improving predictive accuracy and classification capabilities; 2) Applying the SHAP method to evaluate the contribution of individual features across nine different LSTM frameworks, and visualizing the distribution of feature impacts on prediction outcomes through summary and box plots to provide detailed interpretive insights, with a specific focus on key inequality factors such as education, gender, and rural-urban differences; 3) Conducting mediation analysis to uncover interactions between key features and MCI progression, while identifying risk factor combinations and prioritizing them for individual-level analysis. Through these comprehensive analyses, we developed an efficient and interpretable predictive model, offering a scientific foundation for the early detection and intervention of MCI, and addressing inequalities in dementia risk.

Data and Research Methods

Data Preprocessing

The China Health and Retirement Longitudinal Study (CHARLS) is a nationally representative longitudinal survey that examines the family and individual circumstances of adults aged 45 and older in China.24 The baseline survey was conducted nationwide between 2011 and 2012, with four subsequent follow-ups in 2013, 2015, 2018, and 2020. These surveys collected comprehensive data on family structure, economic support, health status, and healthcare service utilization. For this study, we selected the 10,920 individuals among the 17,705 baseline respondents who completed all five survey waves (the baseline and the four follow-ups). After applying the research criteria, a final sample of 4626 valid participants was retained, spanning 123 geographic locations (Figure 1). Geographic location data were retrieved from the Gaode Map API and converted into the WGS 1984 Albers projection coordinate system for further analysis.

Figure 1 Participant Selection Process.

Predictors

A nationwide survey on MCI among older Chinese adults, published in The Lancet,25 categorized the risk factors for MCI into modifiable and non-modifiable factors. Considering the existing research and the structure of the CHARLS data, we refined this classification to focus on predictors that reflect social and health inequalities. We further grouped these predictors into three categories.

Demographic Characteristics

This study included gender,26 age,27 residential area,28 education,29 marital status,30 poverty status,31 pension insurance, and health insurance32 as demographic predictors. These factors capture demographic and socioeconomic differences that contribute to disparities in MCI risk. Gender was categorized as male or female, while age was grouped into predefined ranges. Residential areas were classified as urban or rural, reflecting potential inequalities in healthcare access. Education was measured by years of schooling or educational attainment, reflecting disparities in access to resources. Marital status was categorized as married or single, and poverty status was assessed based on income level or economic hardship. Pension insurance and health insurance were evaluated in terms of both the presence of coverage and the type of coverage, serving as proxies for socioeconomic inequities.

Health Status and Function

This category included various health-related factors and lifestyle predictors that influenced MCI. Chronic diseases and comorbidities were included as they often reflect disparities in healthcare access and management, which may contribute to differences in MCI risk. These conditions included hypertension,33 dyslipidemia,34 diabetes,35 cancer,36 chronic lung disease,37 liver disease,38 heart disease,39 stroke,40 kidney disease,41 stomach disease,42 psychiatric disorders,43 arthritis,44 and asthma.45 To ensure clarity and statistical robustness, memory disorders were excluded as predictors to avoid conceptual overlap with MCI. Additionally, Activities of Daily Living (ADL)46 and Instrumental Activities of Daily Living (IADL)47 were included as functional predictors. Functional impairments, as measured by ADL and IADL, often disproportionately affect individuals with limited access to healthcare or rehabilitation services. ADL consisted of six basic activities: bathing, dressing, transferring between bed and chair, toileting, eating, and maintaining continence. IADL included six daily skills: housekeeping, cooking, shopping, managing finances, taking medication, and making phone calls. Difficulty in performing these tasks reflects inequalities in access to resources that support independent living. Lifestyle predictors included smoking,48 alcohol consumption,49 sleep,50 and physical activity.51 Additionally, health-related medical utilization indicators, such as outpatient visits,52 hospitalizations,53 physical examinations, and reported pain,54 were incorporated. These variables highlight disparities in health-seeking behaviors and access to healthcare services. Depression symptoms55 were assessed using the 10-item Center for Epidemiologic Studies Depression Scale (CESD-10), with a total CESD-10 score greater than 10 indicating a risk of depression. Mental health inequalities, often linked to socioeconomic and environmental factors, were considered critical for understanding MCI risk.

Social Participation

Social participation factors encompassed the frequency of social interactions and involvement in various activities.55–57 These activities included parent-child communication, interactions with friends, participation in card and board games, helping friends, engaging in fitness activities, community activities, volunteer work, attending training courses, and using the internet. These behaviors reflect an individual’s social support network and level of community integration. Differences in opportunities to engage in these activities indicate inequalities in access to community resources, influencing social integration and ultimately affecting MCI risk.

Outcome Variable

We used cognitive functions to define MCI. CHARLS evaluated participants’ cognitive functions using a methodology aligned with the US Health and Retirement Study (HRS).58 The cognitive assessment encompassed four domains: orientation, calculation, drawing, and episodic memory. Orientation was assessed by asking participants the current year, month, date, season, and day of the week, with each correct response earning 1 point, for a maximum of 5 points. Calculation ability was measured by asking participants to perform five consecutive subtractions of 7, with each correct subtraction earning 1 point, totaling 5 points. Drawing ability was evaluated by instructing participants to replicate specified shapes, with each accurate drawing earning 1 point. Episodic memory was assessed through immediate and delayed recall of ten words, with 1 point awarded for each correctly recalled word, yielding a maximum memory score of 20 points. The overall cognitive function score was a sum of these components, with a total of 31 points.
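The scoring rule described above can be sketched in a few lines. This is an illustration only, not the CHARLS scoring code: the class and function names are hypothetical, and the one-point maximum for drawing is inferred from the stated 31-point total (5 + 5 + 1 + 20).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CognitiveItems:
    orientation: int       # correct answers out of 5 (year, month, date, season, weekday)
    calculation: int       # correct serial subtractions of 7, out of 5
    drawing: int           # 1 if the figure was copied accurately, else 0 (inferred maximum)
    immediate_recall: int  # words recalled immediately, out of 10
    delayed_recall: int    # words recalled after a delay, out of 10


# Upper bound for each item; the bounds sum to the stated 31-point maximum.
ITEM_MAXIMA = {
    "orientation": 5,
    "calculation": 5,
    "drawing": 1,
    "immediate_recall": 10,
    "delayed_recall": 10,
}


def total_cognitive_score(items: CognitiveItems) -> int:
    """Range-check every item, then return the summed global cognition score."""
    total = 0
    for name, maximum in ITEM_MAXIMA.items():
        value = getattr(items, name)
        if not 0 <= value <= maximum:
            raise ValueError(f"{name}={value} is outside 0..{maximum}")
        total += value
    return total
```

Range-checking each item before summing keeps a data-entry error in one domain from silently inflating the total score.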

According to international consensus standards for cognitive aging59 and relevant studies,60 we defined MCI based on participants’ total cognitive scores compared to age-adjusted normative standards. Those scoring at least 1 standard deviation below the normative mean were classified as having MCI.
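A minimal sketch of this classification rule follows. The function name and the example norms are hypothetical, since the paper does not report the normative means and standard deviations it used, and treating a score exactly at the cutoff as MCI is also an assumption.

```python
def has_mci(score: float, norm_mean: float, norm_sd: float, sd_cutoff: float = 1.0) -> bool:
    """Return True when the score lies at least `sd_cutoff` SDs below the age-group mean."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return score <= norm_mean - sd_cutoff * norm_sd


# Hypothetical norm for one age band (mean 17, SD 5), used only to illustrate the rule.
example_mean, example_sd = 17.0, 5.0
flag_default = has_mci(11, example_mean, example_sd)                 # cutoff is 12.0
flag_strict = has_mci(11, example_mean, example_sd, sd_cutoff=1.5)   # cutoff is 9.5
```

With these illustrative numbers, a score of 11 is flagged under the 1 SD rule but not under a 1.5 SD rule, which shows how the choice of cutoff changes who is classified as having MCI.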

Study Design

We developed and implemented a hybrid model combining CNN, BiLSTM, and attention mechanisms to predict the risk of MCI among Chinese middle-aged and older adults.61 Initially, correlation analysis, Gradient Boosting Decision Trees (GBDT), and 10-fold cross-validation were used to select ten key features from the original dataset. The ten selected variables were education, IADL, gender, residential area, sleep duration, card and board game activities, interactions with friends, physical activity, depression, and health insurance. Categorical variables were encoded into numerical representations using integer encoding, and numerical variables were standardized using scikit-learn’s StandardScaler. All features were then converted into PyTorch tensors to serve as model inputs.
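The two preprocessing steps can be illustrated without any third-party library. The sketch below is not the authors' pipeline: the function names are hypothetical, and assigning integer codes in sorted label order is an assumption. The standardization divides by the population standard deviation, which is the convention StandardScaler follows.

```python
from math import sqrt


def integer_encode(values):
    """Map each distinct category to an integer code, assigned in sorted label order."""
    codes = {label: i for i, label in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes


def standardize(values):
    """Z-score a numeric column using the population standard deviation (ddof = 0)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


area_codes, area_mapping = integer_encode(["rural", "urban", "rural"])
sleep_z = standardize([5.0, 7.0, 9.0])
```

In a longitudinal setting the mean and standard deviation would normally be estimated on the training data only and then reused for validation and test data; the paper does not state how this was handled.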

We used five waves of longitudinal surveys conducted in 2011, 2013, 2015, 2018, and 2020. Specifically, 70% of the data from the 2011, 2013, 2015, and 2018 waves were used as the training set, while the remaining 30% of the 2018 data served as the validation set. The 2020 data were designated as an independent test set to evaluate the model’s generalization capability.

For the model architecture, the input layer included a convolutional layer that performed local feature extraction on the time-series data, capturing patterns within each time step.62 The CNN layer consisted of 256 channels with a kernel size of 4 and appropriate padding, followed by a ReLU activation function and a dropout layer with a dropout probability of 0.3 to mitigate overfitting. The extracted feature sequences were subsequently fed into a BiLSTM layer. The bidirectional structure of the BiLSTM enabled the simultaneous capture of both forward and backward dependencies in the sequence data, improving the modeling of temporal features.63 Our BiLSTM consisted of two layers with a hidden size of 128 in each direction.

To further enhance the model’s ability to focus on critical time steps, an attention mechanism was integrated. The attention module consisted of two sequential linear layers: the first mapped the concatenated BiLSTM outputs to an intermediate space of size 128 using a Tanh activation; the second produced scalar attention weights for each time step, which were then normalized using a softmax function to compute the context vector.64 Finally, the processed features were passed through a fully connected layer and a sigmoid activation function to produce the predicted probability of cognitive function decline. During the model training process, binary cross-entropy was used as the loss function, aiming to minimize prediction error. We used the Adam optimizer with an initial learning rate of 0.0008 and used a scheduler that reduced the learning rate by half every 20 epochs. Training was conducted for 200 epochs, with performance evaluated on the validation set at each epoch. Model parameters were saved based on the best validation loss to prevent overfitting (Figure 2).
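To make the attention step and the learning-rate schedule concrete, here is a small pure-Python sketch. It is not the authors' PyTorch code: the function names, the toy dimensions in the usage lines, and the zero-based epoch counting are assumptions. The structure follows the description above: a linear layer with Tanh, a second linear layer yielding one score per time step, a softmax over time steps, and a weighted sum as the context vector.

```python
import math


def attention_pool(hidden_states, w1, b1, w2, b2):
    """Additive attention pooling over time steps.

    hidden_states: T vectors of size D (BiLSTM outputs, both directions concatenated)
    w1, b1: first linear layer (A x D weight matrix, A biases), followed by tanh
    w2, b2: second linear layer (A weights, scalar bias) giving one score per step
    """
    scores = []
    for h in hidden_states:
        inter = [math.tanh(sum(w * x for w, x in zip(row, h)) + b)
                 for row, b in zip(w1, b1)]
        scores.append(sum(w * u for w, u in zip(w2, inter)) + b2)
    # Softmax over time steps, shifted by the maximum score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    # Context vector: attention-weighted sum of the hidden states.
    dim = len(hidden_states[0])
    context = [sum(a * h[d] for a, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, context


def step_lr(epoch, base_lr=0.0008, step_size=20, gamma=0.5):
    """Learning rate after halving every `step_size` epochs (epochs counted from 0)."""
    return base_lr * gamma ** (epoch // step_size)


# Toy usage: two time steps, D = 2, A = 1, zero weights so both steps score equally.
weights, context = attention_pool([[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0]], [0.0], [1.0], 0.0)
```

Under this schedule the learning rate would fall from 0.0008 to about 1.6e-6 over the 200 epochs, so late epochs make only small parameter updates.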

Figure 2 CNN-BiLSTM-Attention Design Process.

To enhance the interpretability of the model, we combined SHAP analysis with mediation analysis. SHAP is a machine learning model explanation method based on the Shapley value concept from game theory, which assesses the impact of each feature by calculating its marginal contribution across different feature combinations.65 For SHAP analysis, the sequential input data were flattened into a two-dimensional array, and 100 random training samples were selected as the background data set. The SHAP Explainer was applied to the forward function of the model to compute SHAP values, which were reshaped to recover time-step and feature dimensions. Visualization was performed using summary plots and box plots to illustrate the importance of each feature. Subsequently, based on the identified independent variables, mediating variables, and control variables, we conducted a mediation analysis to explore the indirect influence pathways of key independent variables on the dependent variable through potential mediators (Figure 3).66 The mediation analysis was implemented using the statsmodels package. A bootstrap method with 5000 iterations was used to estimate indirect effects and obtain 95% confidence intervals, and the proportion of the indirect effect relative to the total effect was calculated. The integration of SHAP and mediation analysis provided complementary insights, combining quantitative feature importance with causal inference to better understand the underlying mechanisms affecting MCI risk.
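The idea of a marginal contribution averaged over feature combinations can be shown with an exact Shapley computation on a toy model. This is an illustration of the concept only, not the algorithm used by the SHAP library or by the authors: it enumerates every coalition (feasible only for a handful of features) and replaces absent features with a single baseline vector rather than a 100-sample background set. The function name and the toy model are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample; absent features take their baseline value."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_model(v):
    """Hypothetical risk score with one interaction between features 0 and 2."""
    return 0.5 * v[0] - 0.2 * v[1] + 0.1 * v[0] * v[2]


phi = shapley_values(toy_model, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, and the interaction term is split equally between the two features involved.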

Figure 3 Analysis of Potential Mechanisms Underlying MCI Risk.

Results

Characteristics of MCI in Middle-Aged and Older Individuals

The K-nearest neighbors (KNN) method was used to impute missing data (Supplementary Table 1). Descriptive statistics were obtained from the 2020 survey, which included 4626 middle-aged and older participants. The results revealed an overall MCI prevalence of 13.7% (Supplementary Table 2). The prevalence of MCI among female participants (15.3%) was significantly higher than that among male participants (7.7%, p < 0.001). Individuals with lower education levels had a significantly higher prevalence of MCI (16.4%) compared to those with higher education levels (2.0%, p < 0.001). Similarly, rural participants exhibited a significantly higher prevalence of MCI (13.3%) compared to urban participants (5.0%, p < 0.001), highlighting potential geographic and socioeconomic disparities. Regarding health status, individuals with severe hypertension, dyslipidemia, and arthritis had a significantly higher prevalence of MCI (p < 0.05), whereas other chronic diseases did not show a significant association with MCI. Functional status analysis revealed that impairments in Activities of Daily Living (ADL) and Instrumental Activities of Daily Living (IADL), as well as the presence of depressive symptoms, were associated with a higher prevalence of MCI (p < 0.001), indicating a strong association between MCI and both physical and mental health factors. In terms of social participation, five types of social activities (interaction with friends, card and board game activities, helping friends, volunteer work, and using the internet) were significantly associated with the prevalence of MCI (p < 0.05), suggesting that diverse opportunities for social engagement may play a role in mitigating cognitive decline in aging populations.

Predictive Performance of CNN-BiLSTM-Attention

We developed nine time-series models—RNN, LSTM, CNN-LSTM, BiLSTM, LSTM-Attention, BiLSTM-Attention, CNN-BiLSTM, CNN-LSTM-Attention, and CNN-BiLSTM-Attention—to predict the risk of MCI among Chinese middle-aged and older participants. The performance of these models was evaluated using ROC curves, Decision Curve Analysis (DCA), and multiple classification metrics, including AUC, accuracy, sensitivity, specificity, Youden Index, positive predictive value (PPV), NPV, and F1-score. The DeLong test was applied to compare the performance differences among the models in the classification task.67
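For reference, the classification metrics listed above can all be derived from the four cells of a confusion matrix, and AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below uses hypothetical function names and toy inputs, and assumes every denominator is non-zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive the threshold-based metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden": sensitivity + specificity - 1,
        "ppv": ppv,
        "npv": npv,
        "f1": 2 * ppv * sensitivity / (ppv + sensitivity),
    }


def auc(scores_pos, scores_neg):
    """Pairwise AUC: share of positive-negative pairs ranked correctly (ties count half)."""
    wins = sum((p > q) + 0.5 * (p == q) for p in scores_pos for q in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


toy_metrics = classification_metrics(tp=40, fp=10, tn=40, fn=10)
toy_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3])
```

Because NPV depends on prevalence as well as on sensitivity and specificity, a high NPV is expected whenever the condition is relatively uncommon in the evaluated sample, even at moderate sensitivity.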

The results showed that as the complexity of LSTM-based nested structures increased, the overall predictive performance of the models exhibited an upward trend. However, these improvements were not statistically significant (p > 0.05). Notably, the models that combined a CNN with an LSTM-based network showed relatively better predictive performance, with comparatively higher AUC values (CNN-LSTM: AUC = 0.7405; CNN-BiLSTM: AUC = 0.7442; CNN-LSTM-Attention: AUC = 0.7402; CNN-BiLSTM-Attention: AUC = 0.7317) (Figure 4).

Figure 4 Predictive Performance of Nine LSTM Models. (A) ROC curves. (B) DCA curves.

Furthermore, the overall stability improved slightly as the nested structures became more complex. The CNN-BiLSTM-Attention model had a relatively higher sensitivity (0.6902) and NPV (0.9414) compared to the other models, suggesting a potential advantage in identifying positive cases and reducing false negatives. However, its specificity was comparatively lower (0.6738), indicating a possible trade-off between capturing true positives and avoiding false positives (Table 1).

Table 1 Classification Metrics of Nine LSTM Models

Feature Importance Visualization

SHAP analysis was used to evaluate the impact of key features on cognitive function prediction across nine LSTM models. Feature importance analysis (Figure 5) revealed that education, IADL, and gender were the most critical features, consistently ranking as the three most important features by SHAP value. Social activities, such as card and board game activities and interactions with friends, along with health insurance and sleep duration, made moderate contributions to the model’s predictive performance. In contrast, physical activity and residential area had relatively lower contributions. Notably, the importance of depression varied considerably across models, being markedly higher in the CNN-BiLSTM-Attention model than in the RNN and LSTM models. These differences suggest that model architecture should be considered when interpreting feature importance, as different architectures may capture unique aspects of feature interactions.

Figure 5 Feature Importance of Nine LSTM Models.

The SHAP Summary Plot (Figure 6) and SHAP Box Plots (Figure 7) corroborated these findings. The SHAP value distributions for education, IADL, and gender across all models exhibited spindle-shaped patterns, indicating substantial variability in their predictive influence across different samples and a strong clustering effect. Conversely, the SHAP value distributions for physical activity and residential area were concentrated near zero, lacking spindle-shaped patterns, which indicated minimal contributions to model predictions and relatively uniform impacts across samples. These results underscored the robustness of key features and the limited yet consistent influence of less important features.

Figure 6 SHAP Summary Plot for Nine LSTM Models (The horizontal axis represented the SHAP Value, reflecting the extent to which each feature influenced the model’s predictions. Larger SHAP values signified a greater impact on prediction outcomes. Positive SHAP values indicated that a feature drove the prediction towards a positive outcome, while negative SHAP values indicated a shift towards a negative outcome. The vertical axis listed the features used in the model, ranked in descending order of importance, with the most influential features positioned at the top. Red points represented positive feature values, while blue points represented negative feature values. The width of the point distribution reflected the variability of each feature’s influence across different samples. Wider distributions indicated greater heterogeneity in the feature’s impact on predictions).

Figure 7 SHAP Box Plots for Nine LSTM Models.

Mediation Analysis of Feature Effects

We controlled for gender and residential areas to assess the significant impact of education on cognitive function through various potential mediators, as shown in Table 2. The analysis revealed that IADL had the largest indirect effect, contributing 9.84% to the total effect, which was statistically significant (bootstrap 95% CI: −0.0147, −0.0108). Health insurance and social interaction followed, accounting for 3.39% and 2.76% of the total effect, respectively. These results indicate that individuals with lower education levels may face greater challenges in maintaining functional independence and accessing essential healthcare resources, thereby amplifying their vulnerability to cognitive decline. In contrast, the indirect effects of physical activity and depression were relatively small, contributing 0.81% (bootstrap 95% CI: −0.0016, −0.0006) and 1.18% (bootstrap 95% CI: −0.0022, −0.0009) of the total effect, respectively. Additionally, the indirect effect of sleep duration was not statistically significant in the bootstrap analysis (bootstrap 95% CI: −0.0003, 0.0002). The smaller contributions of these factors may reflect underlying systemic inequalities that place individuals with lower socioeconomic status at a greater disadvantage.
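The quantities reported here (an indirect effect, its share of the total effect, and a percentile bootstrap interval) can be sketched for a single mediator with ordinary least squares. This is a simplified illustration, not the authors' statsmodels analysis: it assumes linear models, one mediator, and no control covariates, and all function names are hypothetical.

```python
import random


def _mean(v):
    return sum(v) / len(v)


def _slope(x, y):
    """Slope of the simple regression of y on x."""
    mx, my = _mean(x), _mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def _two_predictor_slopes(x, m, y):
    """Slopes of y on x and m (with intercept), from the centered normal equations."""
    mx, mm, my = _mean(x), _mean(m), _mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    smm = sum((b - mm) ** 2 for b in m)
    sxm = sum((a - mx) * (b - mm) for a, b in zip(x, m))
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    smy = sum((b - mm) * (c - my) for b, c in zip(m, y))
    det = sxx * smm - sxm ** 2
    return (sxy * smm - smy * sxm) / det, (smy * sxx - sxy * sxm) / det


def mediation_effects(x, m, y):
    """Product-of-coefficients mediation: indirect = a * b, total = direct + indirect."""
    a = _slope(x, m)                          # exposure -> mediator
    direct, b = _two_predictor_slopes(x, m, y)  # exposure and mediator -> outcome
    total = _slope(x, y)
    indirect = a * b
    return {"indirect": indirect, "direct": direct, "total": total,
            "proportion_mediated": indirect / total}


def bootstrap_indirect_ci(x, m, y, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the indirect effect (resampling participants)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        est = mediation_effects([x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx])
        estimates.append(est["indirect"])
    estimates.sort()
    return estimates[int(n_boot * alpha / 2)], estimates[int(n_boot * (1 - alpha / 2)) - 1]
```

An indirect effect is conventionally called significant when the bootstrap interval excludes zero, which matches how the sleep duration pathway (interval spanning zero) is distinguished from the others above.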

Table 2 Mediation Analysis of Education on Health Insurance, Social Interaction, Sleep Duration, IADL, and Depression

Discussion

This study utilized five waves of longitudinal data from CHARLS to develop and implement a hybrid CNN-BiLSTM-Attention model for predicting the risk of MCI among Chinese middle-aged and older populations. The model integrated both the temporal features of time-series data and the spatial information of the samples, recognizing the impact of regional disparities on cognitive function. By integrating CNN, BiLSTM, and attention mechanisms, this approach was designed to capture complex temporal and spatial patterns and thereby support more accurate predictions of MCI risk. Compared to traditional LSTM models, the hybrid model demonstrated improved adaptability and predictive performance in handling high-dimensional and heterogeneous data.68

When evaluating model performance, the CNN-BiLSTM-Attention model showed comparatively higher sensitivity and NPV than the other models, although the differences in overall predictive performance between models were not statistically significant. This suggests that the model may have an advantage in leveraging multi-level feature information to capture the complex factors influencing MCI. Compared to traditional machine learning methods, deep learning models generally exhibit higher predictive accuracy and generalization ability, particularly when handling large-scale longitudinal data.61,69,70 Furthermore, the model’s relatively high stability underscores its application potential in predicting MCI risk within large-scale population surveys, where identifying vulnerable groups in underserved communities is critical for reducing health disparities. However, the model’s AUC of 0.7317 indicates that there is still room for improvement. Additionally, its sensitivity (0.6902) and specificity (0.6738) may limit its effectiveness in early detection, where higher sensitivity and specificity are often desired. This is especially relevant in practical scenarios that demand high accuracy and reliability, such as public health monitoring and the development of early intervention strategies.71

The feature importance analysis identified education as the most critical factor influencing MCI, followed by IADL and gender. Additionally, social activities, such as interaction with friends and card and board game activities, contributed to the prediction of MCI. These findings were consistently validated across the nine time-series models we constructed, further reinforcing the pivotal role of these features in predicting cognitive function.72 The impact of education on cognitive function has been corroborated by numerous studies, highlighting that individuals with lower educational attainment often face structural disadvantages, such as limited access to cognitive stimulation and healthcare resources, while those with higher educational attainment typically possess greater cognitive reserve, which can delay the onset of cognitive decline.73 Decline in daily living activities (ADL/IADL) was closely associated with cognitive impairment, reflecting individuals’ overall health status in terms of cognitive and functional capabilities.74 This decline often disproportionately affects individuals from disadvantaged socioeconomic backgrounds, who may have fewer resources to maintain functional independence. Social activities promote mental health and cognitive stimulation, thereby protecting cognitive function.75 Mediation analysis further showed that education had a significant indirect effect on cognitive function, mainly mediated by IADL, health insurance, and social interaction. Specifically, IADL had the strongest indirect effect (95% CI: −0.0147, −0.0108), followed by health insurance (95% CI: −0.0058, −0.0029), and social interaction (95% CI: −0.0045, −0.0027), with all effects reaching statistical significance. This indicates that education not only directly affects cognitive health but also indirectly promotes the maintenance of cognitive function by improving functional status and social support networks. It may also shape neural mechanisms, such as hippocampal integrity, via long-term disparities in cognitive stimulation and healthcare access.76 This finding is consistent with the social ecological model, which emphasizes the interaction of multi-level factors in shaping health outcomes.77

To the best of our knowledge, this is the first study to integrate the LSTM framework with SHAP and mediation analyses to predict the risk of MCI among Chinese middle-aged and older adults. By utilizing SHAP analysis, we quantified the specific contributions of each feature to the prediction results, enhancing the model’s interpretability. This approach improved model transparency and provided insights into how inequalities in education and access to resources affect cognitive health, which may inform clinical decision-making. Mediation analysis enabled us to examine how education influenced the occurrence of MCI through multiple pathways, highlighting its role in addressing disparities in cognitive health outcomes. The integration of these methodologies complemented the model’s predictive performance with interpretable evidence and offered methodological insights for future research, showcasing both innovation and practical value. Furthermore, the use of large-scale longitudinal data improved the representativeness and generalizability of the findings.78

However, this study has several limitations. First, although we employed advanced integrated learning models, the black-box nature of these models limited their transparency and interpretability in practical applications. Future research could explore more transparent model architectures or incorporate additional explanatory methods to further improve interpretability. Second, the cognitive assessment used in this study was a global cognition tool adapted from the HRS protocol, which, while practical for large-scale studies, may lack sensitivity in detecting subtle domain-specific impairments such as executive function or visuospatial ability. In addition, the use of a −1 SD threshold to define MCI may increase the risk of false positives compared to the more conservative −1.5 SD standard commonly used in clinical settings. Third, while this study included multiple key features, other potential influencing factors, such as genetic and environmental variables, as well as the broader structural and social inequalities that affect cognitive health, may have been overlooked.
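The effect of the cutoff choice can be made concrete with a short calculation. The sketch below assumes that cognitive scores in the reference population are normally distributed, which is an idealization that may not hold for the CHARLS data, and the function name `share_flagged` is introduced here only for illustration.

```python
from statistics import NormalDist

def share_flagged(cutoff_sd):
    """Share of a standard-normal reference population scoring below the cutoff."""
    return NormalDist(mu=0.0, sigma=1.0).cdf(cutoff_sd)

lenient = share_flagged(-1.0)   # the -1 SD threshold used in this study
strict = share_flagged(-1.5)    # the more conservative -1.5 SD standard

print(f"-1.0 SD cutoff flags {lenient:.1%} of the reference population")
print(f"-1.5 SD cutoff flags {strict:.1%} of the reference population")
print(f"ratio: {lenient / strict:.2f}")
```

Under this assumption the −1 SD cutoff flags roughly 15.9% of the reference population, compared with roughly 6.7% at −1.5 SD, so more than twice as many individuals are classified as impaired by score alone.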

Conclusion

This study developed and deployed a CNN-BiLSTM-Attention model to predict the risk of MCI among Chinese middle-aged and older adults, highlighting the influence of education and unequal access to resources on cognitive health. The model’s stability and promising predictive performance suggest that it is applicable to real-world population surveys. By integrating SHAP and mediation analyses, the study enhanced the model’s interpretability and provided new insights into the pathways through which education affects cognitive function, underscoring the need to address the social and structural barriers that worsen cognitive health disparities. Beyond predictive performance, this study has implications for clinical and public health applications. The proposed model may aid in the early identification of high-risk individuals in community or primary care settings, allowing for timely preventive interventions and personalized management plans. Furthermore, its integration into electronic health record systems could facilitate clinical decision-making and resource allocation, particularly in settings with limited cognitive screening capacity. Future research could focus on refining the model architecture and exploring additional contributing factors to further improve the accuracy and practicality of MCI prediction. Moreover, the findings support comprehensive intervention strategies, such as enhancing educational opportunities and fostering social interactions, with a particular focus on reducing inequalities among underserved populations to alleviate the burden of MCI.

Abbreviations

MCI, Mild Cognitive Impairment; CNN-BiLSTM-Attention, Convolutional neural network-bidirectional long short-term memory-attention; SVM, Support Vector Machines; RF, Random Forests; CHARLS, China Health and Retirement Longitudinal Study; HRS, Health and Retirement Study; GBDT, Gradient Boosting Decision Trees; CESD-10, 10-item Center for Epidemiologic Studies Depression Scale; ADL, Activities of Daily Living; IADL, Instrumental Activities of Daily Living; ROC, Receiver Operating Characteristic; FP, False Positive; TP, True Positive; TN, True Negative; FN, False Negative; PPV, Positive Predictive Value; NPV, Negative Predictive Value.

Data Sharing Statement

The data used in this study are publicly available through the CHARLS website (https://charls.pku.edu.cn/). Additional materials can be obtained upon reasonable request from the corresponding author.

Ethical Approval and Consent to Participate

As per the “Measures for Ethical Review of Life Science and Medical Research Involving Human Subjects” (February 18, 2023), the research qualifies for exemption under item [1 or 2] of Article 32. Specifically, the study analyzed anonymized data and involved no direct interaction with participants. Therefore, no additional ethical review or approval was required from our institution.

Acknowledgments

We are grateful to the research team of the China Health and Retirement Longitudinal Study for making the data available. We also extend our sincere thanks to all participants for their valuable time and contributions.

Author Contributions

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.

Funding

This study was supported by the National Natural Science Foundation of China (Grant No: 72374076) and the Fundamental Research Funds for the Central Universities (Grant No: YCJJ20242412). The funding bodies played no role in the study’s design, data collection, analysis, interpretation, or manuscript preparation.

Disclosure

The authors declare no competing interests in this work.

References

1. Dunne RA, Aarsland D, O’Brien JT, et al. Mild cognitive impairment: the Manchester consensus. Age Ageing. 2021;50(1):72–80. doi:10.1093/ageing/afaa228

2. Jongsiriyanyong S, Limpawattana P. Mild cognitive impairment in clinical practice: a review article. Am J Alzheimers Dis Other Dement. 2018;33(8):500–507. doi:10.1177/1533317518791401

3. Hernández‑Frausto M, Vivar C. Entorhinal cortex–hippocampal circuit connectivity in health and disease. Front Hum Neurosci. 2024;18:1448791. doi:10.3389/fnhum.2024.1448791

4. Farias ST, Mungas D, Reed BR, et al. Progression of mild cognitive impairment to dementia in clinic- vs community-based cohorts. Arch Neurol. 2009;66(9):1151–1157. doi:10.1001/archneurol.2009.106

5. Mitchell AJ, Shiri-Feshki M. Rate of progression of mild cognitive impairment to dementia: meta-analysis of 41 robust inception cohort studies. Acta Psychiatr Scand. 2009;119(4):252–265. doi:10.1111/j.1600-0447.2008.01326.x

6. Anderson ND. State of the science on mild cognitive impairment (MCI). CNS Spectr. 2019;24(1):78–87. doi:10.1017/S1092852918001347

7. Fang EF, Scheibye-Knudsen M, Jahn HJ, et al. A research agenda for aging in China in the 21st century. Ageing Res Rev. 2015;24:197–205. doi:10.1016/j.arr.2015.08.003

8. Shi LP, Yao SH, Wang W. Prevalence and distribution trends of mild cognitive impairment among Chinese older adults: a meta-analysis. Chin Gen Pract. 2022;25(1):109–114.

9. Deng Y, Zhao S, Cheng G, et al. The prevalence of mild cognitive impairment among Chinese people: a meta-analysis. Neuroepidemiology. 2021;55(2):79–91. doi:10.1159/000512597

10. Jia J, Quan M, Fu Y, et al. Dementia in China: epidemiology, clinical management, and research advances. Lancet Neurol. 2020;19(1):81–92. doi:10.1016/S1474-4422(19)30290-X

11. Jia J, Zhou A, Wei C, et al. The prevalence of mild cognitive impairment and its etiological subtypes in elderly Chinese. Alzheimers Dement. 2014;10(4):439–447. doi:10.1016/j.jalz.2013.09.008

12. Xue J, Li J, Liang J, Chen S. The prevalence of mild cognitive impairment in China: a systematic review. Aging Dis. 2018;9(4):706–715. doi:10.14336/AD.2017.0928

13. Granov R, Vedad S, Wang SH, et al. The role of the neural exposome as a novel strategy to identify and mitigate health inequities in Alzheimer’s disease and related dementias. Mol Neurobiol. 2025;62:1205–1224. doi:10.1007/s12035-024-04339-6

14. Varatharajah Y, Ramanan VK, Iyer R, Vemuri P. Predicting short-term MCI-to-AD progression using imaging, CSF, genetic factors, and cognitive resilience. Sci Rep. 2019;9(1):2235. doi:10.1038/s41598-019-38793-3

15. Venugopalan J, Tong L, Hassanzadeh HR, Wang MD. Multimodal deep learning models for early detection of Alzheimer’s disease stage. Sci Rep. 2021;11:3254. doi:10.1038/s41598-020-74399-w

16. Moradi E, Pepe A, Gaser C, Huttunen H, Tohka J. Machine learning framework for early MRI-based Alzheimer’s conversion prediction in MCI subjects. Neuroimage. 2015;104:398–412. doi:10.1016/j.neuroimage.2014.10.002

17. Ieracitano C, Mammone N, Hussain A, Morabito FC. A novel multi-modal machine learning approach for EEG classification in dementia. Neural Netw. 2020;123:176–190. doi:10.1016/j.neunet.2019.12.006

18. DeCarli C. Mild cognitive impairment: prevalence, prognosis, aetiology, and treatment. Lancet Neurol. 2003;2(1):15–21. doi:10.1016/S1474-4422(03)00262-X

19. Luo H, Hu H, Zheng Z, Sun C, Yu K. The impact of living environmental factors on cognitive function and mild cognitive impairment: evidence from the Chinese elderly population. BMC Public Health. 2024;24:2814. doi:10.1186/s12889-024-20197-2

20. Huang Y, Huang Z, Yang Q, et al. Predicting mild cognitive impairment among Chinese older adults using LSTM and machine learning. Front Aging Neurosci. 2023;15:1283243. doi:10.3389/fnagi.2023.1283243

21. Du Y, Hu N, Yu Z, et al. Characteristics of cognitive function transition and influencing factors among Chinese older adults: an 8-year longitudinal study. J Affect Disord. 2023;324:433–439. doi:10.1016/j.jad.2022.12.116

22. Zhang X, Fan H, Guo C, et al. Establishment of a mild cognitive impairment risk model in middle-aged and older adults: a longitudinal study. Neurol Sci. 2024;45(4):4269–4278. doi:10.1007/s10072-024-07536-2

23. Aqeel A, Zaman S, Bukhari SA. A long short-term memory biomarker-based prediction framework for Alzheimer’s disease. Sensors. 2022;22(4):1475. doi:10.3390/s22041475

24. Zhao Y, Hu Y, Smith JP, Strauss J, Yang G. Cohort profile: the China Health and Retirement Longitudinal Study (CHARLS). Int J Epidemiol. 2014;43:61–68. doi:10.1093/ije/dys203

25. Jia L, Du Y, Chu L, et al. Prevalence, risk factors, and management of dementia and mild cognitive impairment in adults aged 60 years or older in China: a cross-sectional study. Lancet Public Health. 2020;5(12):e661–e671. doi:10.1016/S2468-2667(20)30185-7

26. Mielke MM, Vemuri P, Rocca WA. Clinical epidemiology of Alzheimer’s disease: assessing sex and gender differences. Clin Epidemiol. 2014;6:37–48. doi:10.2147/CLEP.S37929

27. Blazer DG, Yaffe K, Liverman CT, editors; Committee on the Public Health Dimensions of Cognitive Aging, Board on Health Sciences Policy, Institute of Medicine. Cognitive Aging: Progress in Understanding and Opportunities for Action. Washington (DC): National Academies Press (US); 2015. doi:10.17226/21693

28. Cerin E, Soloveva MV, Molina MA, et al. Neighbourhood environments and cognitive health in the longitudinal Personality and Total Health (PATH) through life study: a 12-year follow-up of older Australians. Environ Int. 2024;191:108984. doi:10.1016/j.envint.2024.108984

29. Klee M, Aho VTE, May P, et al. Education as risk factor of mild cognitive impairment: the link to the gut microbiome. J Prev Alzheimers Dis. 2024;11(3):759–768. doi:10.14283/jpad.2024.19

30. Sommerlad A, Ruegger J, Singh-Manoux A, et al. Marriage and risk of dementia: systematic review and meta-analysis of observational studies. J Neurol Neurosurg Psychiatry. 2018;89(3):231–238. doi:10.1136/jnnp-2017-316274

31. Han SD, Boyle PA, James BD, et al. Poorer financial and health literacy among community-dwelling older adults with mild cognitive impairment. J Aging Health. 2015;27(6):1105–1117. doi:10.1177/0898264315577780

32. White L, Ingraham B, Larson E, et al. Observational study of patient characteristics associated with a timely diagnosis of dementia and mild cognitive impairment without dementia. J Gen Intern Med. 2022;37(12):2957–2965. doi:10.1007/s11606-021-07169-7

33. Ding J, Davis-Plourde KL, Sedaghat S, et al. Antihypertensive medications and risk for incident dementia and Alzheimer’s disease: a meta-analysis of individual participant data from prospective cohort studies. Lancet Neurol. 2020;19(1):61–70. doi:10.1016/S1474-4422(19)30393-X

34. Sun J, Cai R, Huang R, et al. Cholesteryl ester transfer protein intimately involved in dyslipidemia-related susceptibility to cognitive deficits in type 2 diabetic patients. J Alzheimers Dis. 2016;54(1):175–184. doi:10.3233/JAD-160053

35. Xu W, Caracciolo B, Wang H, et al. Accelerated progression from mild cognitive impairment to dementia in people with diabetes. Diabetes. 2010;59(11):2928–2935. doi:10.2337/db10-0539

36. van der Willik KD, Ruiter R, Wolters FJ, et al. Mild cognitive impairment and dementia show contrasting associations with risk of cancer. Neuroepidemiology. 2018;50(3–4):207–215. doi:10.1159/000488892

37. Koyanagi A, Lara E, Stubbs B, et al. Chronic physical conditions, multimorbidity, and mild cognitive impairment in low- and middle-income countries. J Am Geriatr Soc. 2018;66:721–727. doi:10.1111/jgs.15288

38. Fiorillo A, Gallego JJ, Casanova-Ferrer F, et al. Mild cognitive impairment is associated with enhanced activation of Th17 lymphocytes in non-alcoholic fatty liver disease. Int J Mol Sci. 2023;24(12):10407. doi:10.3390/ijms241210407

39. Hamilton CA, Frith J, Donaghy PC, et al. Blood pressure and heart rate responses to orthostatic challenge and Valsalva manoeuvre in mild cognitive impairment with Lewy bodies. Int J Geriatr Psychiatry. 2022;37(5). doi:10.1002/gps.5709

40. Stephan BCM, Minett T, Terrera GM, et al. Dementia prediction for people with stroke in populations: is mild cognitive impairment a useful concept? Age Ageing. 2015;44(1):78–83. doi:10.1093/ageing/afu085

41. Kurella Tamura M, Gaussoin SA, Pajewski NM, et al. Kidney disease, intensive hypertension treatment, and risk for dementia and mild cognitive impairment: the systolic blood pressure intervention trial. J Am Soc Nephrol. 2020;31(9):2122–2132. doi:10.1681/ASN.2020010038

42. Cao X, Zhu M, He Y, et al. Increased serum acylated ghrelin levels in patients with mild cognitive impairment. J Alzheimers Dis. 2018;61(2):545–552. doi:10.3233/JAD-170721

43. Gallagher D, Fischer CE, Iaboni A. Neuropsychiatric symptoms in mild cognitive impairment. Can J Psychiatry. 2017;62(3):161–169. doi:10.1177/0706743716648296

44. Di Carlo M, Becciolini A, Incorvaia A, et al. Mild cognitive impairment in psoriatic arthritis: prevalence and associated factors. Medicine. 2021;100(11):e24833. doi:10.1097/MD.0000000000024833

45. Abuaish S, Eltayeb H, Bepari A, et al. The association of asthma with anxiety, depression, and mild cognitive impairment among middle-aged and elderly individuals in Saudi Arabia. Behav Sci. 2023;13(10):842. doi:10.3390/bs13100842

46. Jefferson AL, Byerly LK, Vanderhill S, et al. Characterization of activities of daily living in individuals with mild cognitive impairment. Am J Geriatr Psychiatry. 2008;16(5):375–383. doi:10.1097/JGP.0b013e318162f197

47. Ahn IS, Kim JH, Kim S, et al. Impairment of instrumental activities of daily living in patients with mild cognitive impairment. Psychiatry Invest. 2009;6(3):180–184. doi:10.4306/pi.2009.6.3.180

48. Choi D, Choi S, Park SM. Effect of smoking cessation on the risk of dementia: a longitudinal study. Ann Clin Transl Neurol. 2018;5(10):1192–1199. doi:10.1002/acn3.633

49. Xu G, Liu X, Yin Q, et al. Alcohol consumption and transition of mild cognitive impairment to dementia. Psychiatry Clin Neurosci. 2009;63:43–49. doi:10.1111/j.1440-1819.2008.01904.x

50. Sindi S, Kåreholt I, Johansson L, et al. Sleep disturbances and dementia risk: a multicenter study. Alzheimers Dement. 2018;14(10):1235–1242. doi:10.1016/j.jalz.2018.05.012

51. Kivimäki M, Singh-Manoux A, Pentti J, et al. Physical inactivity, cardiometabolic disease, and risk of dementia: an individual-participant meta-analysis. BMJ. 2019;365:l1495. doi:10.1136/bmj.l1495

52. Janssen N, Handels RLH, Koehler S, et al. Combinations of service use types of people with early cognitive disorders. J Am Med Dir Assoc. 2016;17(7):620–625. doi:10.1016/j.jamda.2016.02.034

53. Bartley MM, St. Sauver JL, Schroeder DR, Khera N, Griffin JM. Social isolation and healthcare utilization in older adults living with dementia and mild cognitive impairment in the United States. Innov Aging. 2024;8(10):igae081. doi:10.1093/geroni/igae081

54. Smith L, López Sánchez GF, Shin JI, et al. Pain and mild cognitive impairment among adults aged 50 years and above residing in low- and middle-income countries. Aging Clin Exp Res. 2023;35(7):1513–1520. doi:10.1007/s40520-023-02434-7

55. Wu X, Hou G, Han P, et al. Association between physical performance and cognitive function in Chinese community-dwelling older adults: serial mediation of malnutrition and depression. Clin Interv Aging. 2021;16:1327–1335. doi:10.2147/CIA.S315892

56. Beentjes KM, Neal DP, Kerkhof YJF, et al. Impact of the FindMyApps program on people with mild cognitive impairment or dementia and their caregivers: an exploratory pilot randomised controlled trial. Disabil Rehabil Assist Technol. 2020;18(3):253–265. doi:10.1080/17483107.2020.1842918

57. Hughes TF, Flatt JD, Fu B, Chang CH, Ganguli M. Engagement in social activities and progression from mild to severe cognitive impairment: the MYHAT study. Int Psychogeriatr. 2013;25(4):587–595. doi:10.1017/S1041610212002086

58. Manly JJ, Jones RN, Langa KM, et al. Estimating the prevalence of dementia and mild cognitive impairment in the US: the 2016 health and retirement study harmonized cognitive assessment protocol project. JAMA Neurol. 2022;79(12):1242–1249. doi:10.1001/jamaneurol.2022.3543

59. Levy R, Howard RJ, Richards M, et al. Aging-associated cognitive decline. Int Psychogeriatr. 1994;6(1):63–68. doi:10.1017/S1041610294001626

60. Song Y, Yuan Q, Liu H, et al. Machine learning algorithms to predict mild cognitive impairment in older adults in China: a cross-sectional study. J Affect Disord. 2025;368:117–126. doi:10.1016/j.jad.2024.09.059

61. Lu G, Liu Y, Wang J, Wu H. CNN-BiLSTM-Attention: a multi-label neural classifier for short texts with a small set of labels. Inf Process Manag. 2023;60(3):103320. doi:10.1016/j.ipm.2023.103320

62. Zaheer S, Anjum N, Hussain S, et al. A multi-parameter forecasting for stock time series data using LSTM and deep learning model. Mathematics. 2023;11(3):590. doi:10.3390/math11030590

63. Zrira N, Kamal-Idrissi A, Farssi R, Khan HA. Time series prediction of sea surface temperature based on BiLSTM model with attention mechanism. J Sea Res. 2024;198:102472. doi:10.1016/j.seares.2024.102472

64. Liu J, Wang G, Ling-Yu D, Abdiyeva K, Kot AC. Skeleton-based human action recognition with global context-aware attention LSTM networks. IEEE Trans Image Process. 2018;27:1586–1599. doi:10.1109/TIP.2017.2785279

65. Lundberg SM, Lee SI. A unified approach to interpreting model predictions. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17). Curran Associates Inc.; 2017:4768–4777. doi:10.5555/3295222.3295230

66. Rijnhart JJM, Lamp SJ, Valente MJ, et al. Mediation analysis methods used in observational research: a scoping review and recommendations. BMC Med Res Methodol. 2021;21(1):226. doi:10.1186/s12874-021-01426-3

67. Yin J, Liu H, Liu Z, et al. Genetic variants in Fanconi anemia pathway genes BRCA2 and FANCA predict melanoma survival. J Invest Dermatol. 2015;135(2):542–550. doi:10.1038/jid.2014.416

68. Uddin MA, Joolee JB, Lee YK. Depression level prediction using deep spatiotemporal features and multilayer Bi-LTSM. IEEE Trans Affect Comput. 2022;13(2):864–870. doi:10.1109/TAFFC.2020.2970418

69. Candemir S, Nguyen XV, Prevedello LM, et al. Predicting rate of cognitive decline at baseline using a deep neural network with multidata analysis. J Med Imaging. 2020;7(4):044501. doi:10.1117/1.JMI.7.4.044501

70. Pang Y, Kukull W, Sano M, et al. Predicting progression from normal to MCI and from MCI to AD using clinical variables in the National Alzheimer’s Coordinating Center uniform data set version 3: application of machine learning models and a probability calculator. J Prev Alzheimers Dis. 2023;10(2):301–313. doi:10.14283/jpad.2023.10

71. Muhammed Niyas KP, Thiyagarajan P. A systematic review on early prediction of mild cognitive impairment to Alzheimer’s using machine learning algorithms. Int J Intell Netw. 2023;4:74–88. doi:10.1016/j.ijin.2023.03.004

72. Lövdén M, Fratiglioni L, Glymour MM, et al. Education and cognitive functioning across the life span. Psychol Sci Public Interest. 2020;21(1):6–41. doi:10.1177/1529100620920576

73. Stern Y. Cognitive reserve in ageing and Alzheimer’s disease. Lancet Neurol. 2012;11(11):1006–1012. doi:10.1016/S1474-4422(12)70191-6

74. De Vriendt P, Gorus E, Cornelis E, et al. The process of decline in advanced activities of daily living: a qualitative explorative study in mild cognitive impairment. Int Psychogeriatr. 2012;24(6):974–986. doi:10.1017/S1041610211002766

75. Berkman LF, Glass T. Social integration, social networks, social support, and health. In: Berkman LF, Kawachi I, editors. Social Epidemiology. Oxford Academic; 2000. doi:10.1093/oso/9780195083316.003.0007

76. Noble K, Houston S, Brito N, et al. Family income, parental education and brain structure in children and adolescents. Nat Neurosci. 2015;18:773–778. doi:10.1038/nn.3983

77. Schlüter M, Baeza A, Dressler G, et al. A framework for mapping and comparing behavioural theories in models of social-ecological systems. Ecol Econ. 2017;131:21–35. doi:10.1016/j.ecolecon.2016.08.008

78. Du Y, Hu N, Yu Z, et al. Characteristics of the cognitive function transition and influencing factors among Chinese older people: an 8-year longitudinal study. J Affect Disord. 2023;324:433–439. doi:10.1016/j.jad.2022.12.116

Creative Commons License © 2025 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, 4.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.