Back to Journals » Drug Design, Development and Therapy » Volume 19

A Real-Time Plasma Concentration Prediction Model for Voriconazole in Elderly Patients via Machine Learning Combined with Population Pharmacokinetics

Authors Liu R, Ma P, Chen D, Yu M, Xie L, Zhao L, Huang Y, Shang S, Chen Y 

Received 18 October 2024

Accepted for publication 28 April 2025

Published 17 May 2025 Volume 2025:19 Pages 4021—4037

DOI https://doi.org/10.2147/DDDT.S495050

Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 3

Editor who approved publication: Prof. Dr. Georgios Panos



Ruixiang Liu,1,* Pan Ma,1,* Dongxin Chen,2,* Mengchen Yu,2 Linli Xie,1 Linlin Zhao,2 Yifan Huang,3 Shenglan Shang,2 Yongchuan Chen1

1Department of Pharmacy, the First Affiliated Hospital of Army Medical University, Chongqing, People’s Republic of China; 2Department of Clinical Pharmacy, General Hospital of Central Theater Command, Wuhan, Hubei Province, People’s Republic of China; 3Medical Big Data and Artificial Intelligence Center, the First Affiliated Hospital of Army Medical University, Chongqing, People’s Republic of China

*These authors contributed equally to this work

Correspondence: Shenglan Shang, Department of Clinical Pharmacy, General Hospital of Central Theater Command, Wuhan, Hubei Province, People’s Republic of China, Email [email protected] Yongchuan Chen, Department of Pharmacy, the First Affiliated Hospital of Army Medical University, Chongqing, People’s Republic of China, Email [email protected]

Purpose: Voriconazole (VCZ) is a first-line treatment for invasive fungal disease, characterized by a narrow therapeutic window and significant inter-individual variability. It is primarily metabolized by the liver, the function of which declines with age. Pathological and physiological changes in elderly patients contribute to increased fluctuations in VCZ plasma concentrations. Thus, it is crucial to develop a model that accurately predicts the VCZ plasma concentrations in elderly patients.
Patients and Methods: This retrospective study incorporated 31 features, including pharmacokinetic parameters derived from a population pharmacokinetic (PPK) model. Feature selection for machine learning (ML) models was performed using Recursive Feature Elimination with Cross-Validation (RFECV). Multiple algorithms were selected and combined into an ML ensemble model, which was interpreted using Shapley Additive exPlanations (SHAP).
Results: The predictive performance of ML models was significantly improved by incorporating pharmacokinetic parameters. The ensemble model consisting of XGBoost, random forest (RF), and CatBoost (1:1:8) achieved the highest R2 (0.828) and was selected as the final ML model. Feature selection reduced the number of features from 31 to 9 without compromising predictive performance. The R2, mean absolute error (MAE), and mean squared error (MSE) of the external validation dataset were 0.633, 1.094, and 2.286, respectively.
Conclusion: Our study is the first to incorporate pharmacokinetic parameters into ML models to predict VCZ plasma concentrations in elderly patients. The model was optimized using feature selection and may serve as a reference for individualized VCZ dosing in clinical practice, thereby enhancing the efficacy and safety of VCZ treatment in elderly patients.

Keywords: voriconazole, elderly patients, machine learning, population pharmacokinetics, precision medicine

Introduction

Recently, the incidence of invasive fungal disease (IFI) has been increasing due to the widespread use of immunosuppressive therapy and the growing utilization of invasive devices such as central venous catheters.1 For elderly patients, IFI potentially leads to severe complications such as organ failure and sepsis due to age-related declines in immune function, and is associated with a significantly increased risk of death. Consequently, IFI has become a major threat to the life safety of elderly patients.2

Voriconazole (VCZ) is a first-line treatment for IFI caused by Aspergillus. However, due to its significant toxicity, including hepatotoxicity and neurological side effects, its clinical use poses considerable risks, especially in elderly patients.3 VCZ has a narrow therapeutic window (0.5–5 mg/L) and exhibits nonlinear pharmacokinetics due to saturable metabolism.4 VCZ metabolism displays significant inter-individual variability, primarily due to the involvement of the hepatic enzyme CYP2C19. Factors such as age, weight, hepatic dysfunction, and drug interactions also affect VCZ plasma concentrations.5 Our previous study demonstrated that VCZ-related concentrations were less strongly associated with CYP2C19 polymorphisms in elderly patients than in younger adults. The study emphasized the importance of individualizing VCZ dosing in elderly patients based on age-related physiological changes. In elderly patients, age-related hepatic dysfunction may result in VCZ accumulation and increased plasma concentrations. Additionally, elderly patients frequently take multiple medications, and the co-administration of CYP450 enzyme inhibitors or inducers further impacts VCZ plasma concentrations. Studies have shown a significant correlation between VCZ plasma concentration and both clinical efficacy and potential toxicity.6 Subtherapeutic concentrations lead to treatment failure, while supratherapeutic concentrations are associated with potential liver damage, bone marrow suppression, and neurotoxicity. Therapeutic drug monitoring (TDM) of VCZ is recommended to ensure safe and effective therapy. However, waiting for TDM results may delay treatment in critically ill patients, and such monitoring is not widely available in most hospitals. A real-time VCZ plasma concentration prediction model not only reduces medical costs, but also provides timely guidance for medication adjustment. Thus, developing an accurate model to predict VCZ plasma concentrations in elderly patients is essential for optimizing individualized therapy.

Population pharmacokinetics (PPK) accounts for physiological, pathological, and genetic differences among individuals to predict drug exposure in special populations, thereby guiding dose adjustments and clinical decision-making.7 However, classical PPK methods are often limited by the use of fixed pharmacokinetic compartmental models and may fail to fully capture all factors influencing drug pharmacokinetics, especially under sparse sampling strategies in clinical practice. Machine learning (ML), which adopts a data-driven approach, can handle nonlinear and high-dimensional data, offering significant advantages in real-world studies.8 Compared to traditional PPK models, ML models incorporate a broader range of relevant factors and provide greater predictive flexibility.9–11 However, complex ML models often exhibit a “black box” nature, making it difficult for researchers to understand their internal mechanisms and decision-making processes.12 The integration of both approaches enhances the predictive performance and clinical interpretability of the model.13,14 Previous studies only compared PPK modeling and ML approaches for predicting voriconazole trough concentrations in critically ill patients.15 To date, no study has established a predictive model for voriconazole plasma concentrations by combining both methods. To fill this gap, our study aimed to establish a predictive model for VCZ plasma concentrations in elderly patients by integrating pharmacokinetic parameters into the ML models, thereby combining the strengths of both PPK and ML approaches.

Materials and Methods

Patients

A retrospective, single-center study was conducted in elderly patients at the First Affiliated Hospital of Army Medical University from March 2022 to December 2023. Patients were enrolled in this study according to the following inclusion criteria: (a) hospitalization and receipt of VCZ treatment; (b) age ≥ 60 years; (c) duration of VCZ treatment > 3 days.4 The following exclusion criteria were applied: (a) VCZ plasma concentrations below the lower limit of quantification; (b) undergoing dialysis during medication; (c) incomplete information on VCZ administration (eg, dosing interval, daily dose, route of administration).

Data Collection and Processing

The dataset including VCZ administration, demographic information, laboratory parameters, concomitant therapies, and comorbidities was obtained from the hospital’s Electronic Medical Record System (EMRS). VCZ plasma concentrations were measured using liquid chromatography (LC-30 AD, Shimadzu Corporation, Kyoto, Japan) coupled with tandem mass spectrometry (QTRAP 5500, AB Sciex, Boston, MA, USA), as described in our previous studies.16,17 After cleaning the dataset, median imputation was performed using Python (version 3.8, Python Software Foundation) to handle missing values, resulting in a final dataset of 393×32. VCZ plasma concentrations were designated as the target variable, while the dataset was randomly divided into training and testing groups at a ratio of 8:2. Additionally, another dataset of 48 patients (enrolled from December 2023 to March 2024) with 76 TDM measurements of VCZ was collected as the clinical validation group to evaluate the performance of the prediction models.

Population Pharmacokinetic Modeling

PPK analysis of VCZ was performed using the nonlinear mixed-effects modeling software NONMEM (version 7.5.1, Icon Development Solutions, Ellicott City, MD, United States). R (version 4.1.2, https://r-project.org/) and relevant R packages were used for graphical visualization analysis. The first-order conditional estimation method with interaction (FOCE-I) was used to estimate model parameters.

One- and two-compartment models were tested to fit the PPK data in subroutines ADVAN2 TRANS2 and ADVAN4 TRANS4, respectively. An exponential model was applied to explain the inter-individual variability (IIV) of the parameters. Residual variability was evaluated through a combined additive and proportional error model. The base model was selected depended on the objective function value (OFV), diagnostic plots, and relative standard error (RSE) of parameter estimates. When the absorption rate constant (Ka) was estimated, its RSE reached 145%, indicating high uncertainty. Given the sparse sampling method and insufficient absorption phase information, Ka was fixed at a value of 1.1 h−1 following reference.18 After fixing Ka, no significant changes were observed in model parameters estimates or OFVs.

Before covariate modeling, an exploratory analysis was performed to identify potential correlations between covariates and individual empirical Bayesian estimates (EBE) of pharmacokinetic parameters as well as the distribution of the covariates. A standard stepwise method was used to select the covariates. During the forward inclusion process, the covariates were considered statistically significant if the OFV decreased by 3.84 or more (p < 0.05, χ2, df = 1) and added simultaneously to the full model. In the backward elimination step, a covariate was retained in the final model if its exclusion from the full model resulted in an increase in the OFV greater than 10.83 (p < 0.001, χ2, df = 1).

The final model was validated and evaluated using bootstrap, goodness-of-fit diagnostic plots and visual predictive check (VPC) methods. Based on the final model, individual pharmacokinetic parameters were calculated using the EBE method and incorporated as new features into the ML modeling.

Machine Learning Modeling and Validation

Nine commonly used supervised ML algorithms were employed, including support vector regression (SVR), random forest (RF), bootstrap aggregating (Bagging), adaptive boosting (AdaBoost), light gradient boosting machine (LightGBM), gradient boosting regression trees (GBRT), extreme gradient boosting (XGBoost), categorical boosting (CatBoost), and backpropagation neural network (BPNN).19 Grid search combined with nested cross-validation was used to optimize the hyperparameters.20,21

To evaluate the predictive performance of the model, coefficient of determination (R2), mean squared error (MSE), and mean absolute error (MAE) were used. R2 reflects how well the model fits the observed data.22 Based on R2 values, the top three algorithms were selected to construct an ensemble model. The calculation formulas are shown below:23

Where yi is the actual value, is the predicted value, is the mean of the actual values.

Recursive Feature Elimination with Cross-Validation (RFECV) is a feature selection algorithm that automatically selects the optimal subset of features to improve model performance and generalizability.24 The final set of selected features was determined by comparing model performance using either the intersection or the union of features selected by the top three algorithms.

Voting Regressor is an ensemble ML algorithm that aggregates predictions from each individual model to produce a final prediction. The weight ratios of individual models were optimized using grid search and cross-validation, and the R2 of the ensemble model was calculated under different weight combinations. The optimal configuration that yielded the highest R2 was selected as the final ensemble model. This approach reduced the bias of individual models, enhancing the predictive accuracy and generalizability. The workflow of data processing, algorithm selection, and modeling was displayed in Figure 1.

Figure 1 The workflow of data processing and algorithm selection. (A) Patients and data. (B) PPK modeling process. (C) Machine learning modeling process.

Abbreviations: VCZ, voriconazole; PPK, population pharmacokinetics; CL/F, clearance divided by bioavailability; SVR, support vector regression; RF, random forest; Bagging, bootstrap aggregating; AdaBoost, adaptive boosting; LightGBM, light gradient boosting machine; GBRT, gradient boosting regression trees; XGBoost, extreme gradient boosting; CatBoost, categorical boosting; BPNN, backpropagation neural network.

Model Interpretation

Shapley Additive exPlanations (SHAP) values were used to interpret the ensemble prediction model.23 The SHAP Python package (version 0.39.0) and its permutation-based KernelExplainer were used to generate the SHAP summary plot, dependence plot, and decision plot for the relevant features.25

Statistical Analysis

The Kolmogorov‒Smirnov test was used to evaluate whether the measurement data were normally distributed. Measurement data were presented as median and interquartile range (IQR) for non-normally distributed variables and as mean ± standard deviation (SD) for normally distributed variables. Categorical data were expressed as n (%). Since the dataset included repeated measurements, linear mixed model (LMM) was used to assess differences between the training and testing sets for continuous variables, while generalized linear mixed model (GLMM) was applied to examine differences for categorical variables. To account for the correlation of repeated measurements at different time points within the same patient, an autoregressive model of order 1 (AR(1)) was used. A random intercept was included in the model to account for within-subject correlations. Sample size calculation was performed using the pmsampsize package in R. The justification was based on the predictive performance of the final model on the testing group and the number of selected features.

Results

Baseline Patient Characteristics

This study included 393 TDM measurements obtained from 270 elderly patients who received VCZ treatment. Baseline information for the 31 features and the target variable (voriconazole concentration) in both the training and testing groups was shown in Table 1.

Table 1 The Description of the Study Samples

Population Pharmacokinetic Analysis

A one-compartment model with first-order absorption and elimination best described the PPK data of VCZ. IIV was successfully estimated on clearance divided by bioavailability (CL/F). Covariate analysis identified that the incorporation of procalcitonin (PCT), total bile acids (TBA), and age significantly influenced the CL/F. The PPK final model was represented by the equations below:

Parameter estimates of the PPK final model and bootstrap results were presented in Table 2. The IIV of CL/F decreased from 53.8% to 43.9% after incorporating covariates. In the bootstrap analysis of 1000 datasets, 998 fitted successfully, resulting in a success rate of 99.8%. The parameters of the final model fell within the 95% confidence intervals (CI) of bootstrap estimates, indicating good robustness and stability of the model. Goodness-of-fit diagnostic plots for the base model and final model were shown in Figure S1 and Figure S2, respectively. Scatterplots of observed concentration (DV) versus population prediction (PRED) showed an improvement of the final model compared with the base model. Figure S3 depicted the VPC results for the final model. The 5th, 50th, and 95th percentiles of the observed data mostly fell within the 95% CI of the predicted percentiles, reflecting robust predictive performance of the final model.

Table 2 Population Pharmacokinetic Model Estimates and Bootstrap Results

Machine Learning Modeling and Validation

The performance comparison of different algorithms was shown in Table 3. The R2 and accuracy of models incorporating the PPK parameter CL/F were significantly higher than those of models without CL/F, while the MAE and MSE were correspondingly lower. This finding indicated that the inclusion of CL/F substantially enhanced the model performance. Moreover, feature selection significantly improved the performance of most models across individual algorithms. Among these algorithms, XGBoost (R2=0.793), RF (R2=0.741), and CatBoost (R2=0.737) demonstrated high goodness of fit. Thus, these three algorithms were selected to construct the ensemble model.

Table 3 The Comparison of Model Performance of Different Algorithms

The feature selection sets of the three algorithms and their feature importance interpreted by SHAP were presented in Figure S4. Table 4 showed the predictive performance among different ensemble models and PPK final model in the testing group. The ensemble model using features from the intersection of the three sets outperformed both the model using features from the union and the model without feature selection. Additionally, using the intersection of the three feature sets markedly reduced the number of features, thereby improving model applicability and minimizing the impact of confounding features. As a result, the intersection (CL/F, daily dose, weight, PCT, total duration of treatment, total protein [TP], time after dose [TAD], serum creatinine [Scr], white blood cells [WBC]) was taken as the important feature set for the ensemble model.

Table 4 Comparison of Predictive Performance Among Different Ensemble Models and PPK Final Model in Testing Group

Using the nine features obtained through feature selection, the R2 values of 231 permutations and combinations (from 0:0:1 to 1:0:0, interval=0.05) of XGBoost, RF, and CatBoost were calculated. Ultimately, the composition of XGBoost, RF, and CatBoost (1:1:8) with the highest R2 was selected as the final ensemble model. The distributions of predicted and observed VCZ concentrations in the testing group were shown in Figure 2A and B. The R2, MAE and MSE of the external validation were 0.633, 1.094 and 2.286, respectively (Table 5). These results indicated that the ensemble model exhibited good generalizability.

Table 5 The Model Performance of the Final Ensemble Model in External Validation

Figure 2 Comparison of predictions and observations in the testing group. (A) The red dots indicate the observed values, and blue dots indicated the predicted values. The X-axis represents the number of test samples. (B) Scatter plots of observed versus predicted concentrations.

Sample Size Justification

The sample size calculation results based on the final ensemble model were summarized in Table 6. The minimum sample size was estimated using four criteria: (1) shrinkage-based estimation; (2) residual variance estimation; (3) 95% CI for the intercept and residual standard deviation; (4) the maximum value derived from Criteria 1–3. When R2 was 0.833 with 9 features, the overall recommended minimum sample size was 243. The shrinkage value of 0.984 indicated the stable predictive performance with minimal risk of overfitting. In this study, the total sample size of 393 (314 in the training group and 79 in the testing group) exceeded the minimum requirement, supporting the robustness of the final ensemble model.

Table 6 The Sample Size Calculation Results Based on the Final Ensemble Model

Interpretation of the Ensemble Model

As shown in Figure 3A, each individual feature value of a certain feature corresponds to a SHAP value (x-axis), and the aggregation of the SHAP values of each feature represents the predicted VCZ concentration. The mean absolute SHAP values of each relevant feature were shown in Figure 3B. The features were arranged in descending order according to their importance on the model’s predictions. The SHAP dependence plot indicated that higher weight, daily dose, WBC, and Scr, as well as lower CL/F, were associated with higher VCZ concentrations (Figure 4). Figure 5 showed the SHAP decision plot of the model, in which each line represented an individual sample from the testing group. The SHAP decision plot illustrated how individual features cumulatively influenced the model output, providing insights into feature interactions and representative prediction trajectories.

Figure 3 Important features of the final ensemble model interpreted by SHAP. (A) SHAP summary bee swarm plot. The SHAP value (x-axis) represents a unified index that quantifies the contribution of each feature to the ensemble model’s predictions. Each row corresponds to a specific feature, with all patients’ attributes plotted as colored dots along that row. The position of a dot on the x-axis indicates the SHAP value, where positive values suggest an increasing effect on voriconazole plasma concentration, while negative values indicate a decreasing effect. Red dots represent high values of the feature while blue dots represent low values of the feature. If a feature with high values (red) is associated with high SHAP values, it indicates that higher values of this feature contribute to an increase in voriconazole plasma concentration. Conversely, if red dots cluster around negative SHAP values, it implies that higher values of this feature decrease voriconazole plasma concentration.

Abbreviations: CL/F, clearance divided by bioavailability; PCT, procalcitonin; Duration, total duration of treatment; TP, total protein; TAD, time after dose; Scr, serum creatinine; WBC, white blood cells.

Figure 4 SHAP dependence plot of the final ensemble model. The SHAP dependence plot showed how the relevant features affected the output of the final ensemble model. SHAP values for specific relevant features exceed 0, representing an increased voriconazole plasma concentration.

Abbreviations: CL/F, clearance divided by bioavailability; PCT, procalcitonin; TP, total protein; Duration, total duration of treatment; Scr, serum creatinine; WBC, white blood cells; TAD, time after dose.

Figure 5 SHAP decision plot of the final ensemble model. The SHAP decision plot provides a visual representation of the predictive results for individual samples. On the x-axis is the model output, which represents the predicted voriconazole plasma concentration. The range of values, spanning negative to positive, reflects the magnitude and direction of the features’ impact on the model output. The color of the lines indicates the overall impact direction of a sample’s features on the model output. Blue lines represent samples where the cumulative feature contributions have a predominantly negative influence on the predicted plasma concentration while red lines represent samples where the cumulative feature contributions have a predominantly positive influence on the predicted plasma concentration. Lines with intermediate colors (shades of purple or pink) represent samples with mixed feature contributions, where positive and negative impacts counterbalance to varying degrees. Each line corresponds to a specific sample and shows how the model’s prediction is formed. As the line moves through the levels of each feature, it shifts left or right based on the direction and magnitude of that feature’s impact on the prediction. The vertical progression of the line illustrates the cumulative contribution from the baseline to the final prediction.

Abbreviations: CL/F, clearance divided by bioavailability; PCT, procalcitonin; Duration, total duration of treatment; TP, total protein; TAD, time after dose; Scr, serum creatinine; WBC, white blood cells.

Clinical Application

Figure 6 illustrated the workflow of the model for predicting VCZ plasma concentration in a specific patient. Before prediction, covariates identified by the PPK model and features used in the ensemble model were retrieved from the EMRS. After calculating the individual CL/F using NONMEM, all relevant information was entered into the prediction interface. The plasma concentration was predicted using the established ensemble model and subsequently compared with the actual TDM measurement for adjustment and validation.

Figure 6 The process and principle of using the prediction model for clinical application. The basic information of patients such as age, daily dose, TAD, weight, total duration of treatment, TBA, TP, Scr, WBC, and PCT were collected from the EMRS system. The PPK parameter CL/F was calculated by NONMEM. Missing values were filled with the median of the modeling data. After entering the information, click “Run” to output the model’s prediction results. As shown in the figure, the predicted VCZ plasma concentration for the patient is 1.38 mg/L (actual TDM measurement for this patient is 1.55 mg/L).

Abbreviations: PCT, procalcitonin; TP, total protein; TAD, time after dose; Scr, serum creatinine; WBC, white blood cells; TBA, total bile acids; CL/F, clearance divided by bioavailability; EMRS, Electronic Medical Record System.

Discussion

In recent years, model-based dose adjustment has emerged as a key focus in the field of individualized antimicrobial therapy.9,11,14,15,23,26 In our previous study, we constructed a prediction model of teicoplanin trough concentrations using an ensemble ML algorithm, which demonstrated excellent predictive performance.23 The previous model incorporated all available features potentially affecting plasma concentrations as input features, without consideration of clinical practicality. In clinical practice, achieving accurate predictions with minimal input is preferred, as it enhances the model’s applicability and potential for broader clinical adoption. Feature selection is a technique used to reduce the number of input features, improve model applicability, and eliminate the influence of confounding features, thereby enhancing both predictive performance and generalizability. By applying the RFECV method, the model was substantially simplified from 31 to 9 features without compromising its predictive performance.

Notably, the individual-based predictive performance of the PPK final model also had a high R2 of 0.756. However, evaluating the performance of the PPK model solely based on metrics such as R2, MAE, and MSE may be inadequate. Sparse and limited individual-level data may lead to predictions that converge toward actual observations, potentially creating an illusion of agreement between observations and predictions. Additionally, the randomization was performed at the sample level rather than the patient level, allowing the PPK analysis to incorporate more dosing administration information from individual patients. Indeed, it is not appropriate to directly compare the predictive performance of PPK models which are constructed based on sparse sampling strategy and limited information with complex ML models. Both PPK and ML methods have their advantages in the context of personalized medication. PPK integrates pharmacokinetic theory with mixed effects (fixed and random effects) to characterize inter- and intra-individual variability in drug pharmacokinetics. These models analyze data from specific patient populations and provide results with strong physiological interpretability, offering reliable evidence for dose adjustment and clinical decision-making. ML excels at leveraging large datasets, extracting complex patterns from patient information, and demonstrating strong predictive capabilities. It can automatically identify hidden patterns in the data and generate accurate predictions, thereby reducing the need for manual intervention.19 Our study integrates the strengths of both PPK and ML approaches. Incorporating CL/F into the ML model significantly improved its predictive performance. This integrated model not only improves predictive accuracy but also provides deeper insights into the pharmacokinetic characteristics of VCZ in elderly patients, thereby providing more precise guidance for individualized antifungal therapy.

An increasing body of evidence suggests that age significantly influences the metabolism of VCZ.27–30 Elderly patients have been reported to exhibit higher VCZ plasma concentrations compared to younger adults.27 Chronic conditions such as diabetes, hypertension, and heart failure are common and increase susceptibility to infections in elderly patients. Inflammation, immunosenescence, and cellular senescence are factors contributing to impaired drug metabolism in elderly patients.31 VCZ is primarily metabolized in the liver, which tends to decline in function of elderly patients, leading to drug accumulation and subsequently elevated plasma concentrations.32 In our study, the typical value of CL/F (4.35 L/h) was lower in elderly patients compared to adults and children reported in other studies,33,34 whereas an opposite trend was observed for V/F (140 L). Elderly patients often experience organ dysfunction, resulting in reduced clearance and prolonged half-life. Factors such as reduced blood flow and concurrent use of hepatic enzyme inhibitors also contribute to decreased clearance. Additionally, as a lipophilic small molecule, VCZ is more likely to experience an increase in V/F in elderly patients, who typically have reduced total body water content and increased body fat.35

Despite the nonlinear pharmacokinetics of VCZ, most PPK studies still describe the data using linear elimination, even after comparing linear and nonlinear elimination in various compartment models.36–39 Our study established a PPK base model using a one-compartment model with linear absorption and elimination. Study found that the nonlinear elimination of VCZ predominantly occurred at daily doses exceeding 10 mg/kg/day, based on a comparison of the fitting performance of linear, mixed linear, and nonlinear elimination in the base model.26 In our study, only 5.7% of the daily doses exceeded this threshold, making the impact of nonlinear elimination on the base model negligible.

The interpretability of complex models is important, as it provides a more transparent understanding of the decision-making mechanisms. In this study, SHAP was used to explain the importance of each feature in the model. Notably, the PPK parameter CL/F ranked first not only in the final model, but also in three individual algorithms, making the largest contribution to the model’s predictive performance. Covariates selection in the PPK modeling indicated that PCT, TBA, and age significantly affected the CL/F, all three being negatively correlated with CL/F. Moreover, PCT’s fourth-ranking position in the SHAP importance of the final model further emphasizes the impact of inflammation on VCZ plasma concentration. Studies have shown that inflammatory states (eg, PCT, C Reactive Protein [CRP]) significantly affect VCZ plasma concentration by altering liver enzyme activity, which in turn impacts drug clearance.27,40 Additionally, changes in inflammatory states are also associated with genetic polymorphisms in the CYP2C19 metabolizing enzyme and drug interactions.41 Among multiple liver function indicators, TBA was found to be significantly associated with CL/F during covariates selection. Consistently, studies suggest that TBA is an independent factor influencing VCZ plasma concentration,42 as elevated TBA levels indicate possible liver dysfunction, which affects VCZ metabolism.43 Excessively high VCZ plasma concentrations can also impair liver function, leading to metabolic disorders and an increased risk of hepatotoxicity. Therefore, adjustments to the dosing regimen of VCZ should take into account the patient’s inflammation level and liver function. In clinical practice, the maintenance dose of VCZ is often determined based on the patient’s weight. Weight ranked third in the SHAP values, highlighting its significance in the model. Research has demonstrated that obese patients exhibit significantly higher VCZ trough concentrations compared to normal-weight patients receiving a daily dose of 4 mg/kg/day.44 Our findings also indicated a positive correlation between higher weight and increased VCZ plasma concentrations.

Figure 6 illustrated the clinical workflow of the ensemble model, demonstrating its potential for real-time prediction of voriconazole plasma concentrations in individual patients before TDM results become available, thereby improving therapeutic efficacy and minimizing the risk of adverse drug reactions. Since CL/F holds irreplaceable importance as a key feature in the ensemble model, ensuring its accurate estimation is critical for achieving reliable predictions. CL/F can only be estimated using typical values of pharmacokinetic parameters and covariates in the absence of TDM measurements, which may result in suboptimal estimation accuracy. Therefore, predicting plasma concentrations for patients who have had at least one prior TDM measurement is recommended. When TDM measurements are accessible, the CL/F estimated via the EBE method becomes more accurate following adjustment, thereby enhancing the predictive performance of the ensemble model.

The limitations of our study should be acknowledged. First, the genotypes of VCZ metabolism-related genes (such as CYP2C19) were not included as features in our model. However, studies have indicated that pharmacogenetic variation is less clinically significant in elderly patients.45 Additionally, our previous study found that CYP2C19 gene polymorphisms had a lesser impact on VCZ plasma concentrations in elderly patients compared to adults.16 Moreover, the incorporation of CL/F in our model partially compensates for the lack of genotype information by reflecting the metabolic status of the patients. Second, the sample size of our study was relatively small, and further studies with a larger cohort are needed to validate our findings. Third, while our study established a predictive model, the development of user-friendly software is necessary for its practical application in the future. Fourth, another important inflammation-related factor, CRP, was not included due to the high missing rate of over 50%. Finally, the ML ensemble model is currently insufficient for independently predicting plasma concentrations, as the estimation of CL/F is performed in NONMEM. The EBE estimation process will be integrated into Python to facilitate real-time prediction in the future and improve the clinical applicability of the ensemble model.

Conclusion

To the best of our knowledge, our study is the first to combine classical PPK with ML in a predictive model for VCZ concentrations in elderly patients.A one-compartment model with linear absorption and elimination was established for VCZ in elderly patients, with PCT, TBA, and age showing significant effects on CL/F. The pharmacokinetic parameter CL/F, estimated using EBE, was included for ML modeling. Furthermore, a prediction model for VCZ plasma concentration was constructed by integrating multiple algorithms, with feature selection performed using the REFCV method to reduce confounding factors. The inclusion of CL/F greatly improved the prediction performance of all ML models. An ensemble model combining XGBoost, RF, and CatBoost in a 1:1:8 ratio was selected as the final prediction model. The ensemble model achieved an R2 of 0.633 on the validation group. Our model provides a reference for individualized dosing of VCZ in clinical practice, enhancing the efficacy and safety of VCZ treatment in elderly patients.

Ethical Approval and Consent to Participate

This study was approved by the Hospital Ethics Committee of the First Affiliated Hospital of Army Medical University (KY202265), and performed in accordance with the Declaration of Helsinki. All patient related data in the medical records were anonymized and de-identified at the source. It is stored in a password protected database accessible only to authorized research team members, and strict protocols are followed to ensure the confidentiality of patient data. Thus, the informed consent was exempted in the ethical approval documents.

Acknowledgments

The authors would like to thank Mr. Binbin Lv for his kind technical support in the application of Python software.

Funding

Science and Health joint Medical Research Project of Chongqing (2023QNXM031); China Postdoctoral Science Foundation (2022M713859); Postdoctoral Scientific Research Foundation of General Hospital of Central Theater Command (20211227KY22).

Disclosure

The authors report no conflicts of interest in this work. This paper has been uploaded as a preprint: [https://advance.sagepub.com/users/792678/articles/1084300-a-real-time-plasma-concentration-prediction-model-for-voriconazole-in-elderly-patients-via-machine-learning-combined-with-population-pharmacokinetics]

References

1. Lass-Flörl C, Steixner S. The changing epidemiology of fungal infections. Mol Aspects Med. 2023;94:101215. doi:10.1016/j.mam.2023.101215

2. Pfaller MA, Carvalhaes CG, DeVries S, Huband MD, Castanheira M. Elderly versus nonelderly patients with invasive fungal infections: species distribution and antifungal resistance, SENTRY antifungal surveillance program 2017-2019. Diagnostic Microbiol Infectious Dis. 2022;102(4):115627. doi:10.1016/j.diagmicrobio.2021.115627

3. Peyton LR, Gallagher S, Hashemzadeh M. Triazole antifungals: a review. Drugs Today. 2015;51(12):705–718. doi:10.1358/dot.2015.51.12.2421058

4. Chen K, Zhang X, Ke X, Du G, Yang K, Zhai S. Individualized medication of voriconazole: a practice guideline of the division of therapeutic drug monitoring, Chinese pharmacological Society. Therap Drug Monitoring. 2018;40(6):663–674. doi:10.1097/ftd.0000000000000561

5. Schulz J, Kluwe F, Mikus G, Michelet R, Kloft C. Novel insights into the complex pharmacokinetics of voriconazole: a review of its metabolism. Drug Metab Rev. 2019;51(3):247–265. doi:10.1080/03602532.2019.1632888

6. Dolton MJ, Ray JE, Chen SC, Ng K, Pont LG, McLachlan AJ. Multicenter study of voriconazole pharmacokinetics and therapeutic drug monitoring. Antimicrob Agents Chemother. 2012;56(9):4793–4799. doi:10.1128/aac.00626-12

7. Sun H, Fadiran EO, Jones CD, et al. Population pharmacokinetics. A regulatory perspective. Clin Pharmacokinet. 1999;37(1):41–58. doi:10.2165/00003088-199937010-00003

8. Bica I, Alaa AM, Lambert C, van der Schaar M. From real-world patient data to individualized treatment effects using machine learning: current and future methods to address underlying challenges. Clin Pharmacol Ther. 2021;109(1):87–100. doi:10.1002/cpt.1907

9. Bououda M, Uster DW, Sidorov E, et al. A machine learning approach to predict interdose vancomycin exposure. Pharm Res. 2022;39(4):721–731. doi:10.1007/s11095-022-03252-8

10. Woillard JB, Labriffe M, Prémaud A, Marquet P. Estimation of drug exposure by machine learning based on simulations from published pharmacokinetic models: the example of tacrolimus. Pharmacol Res. 2021;167:105578. doi:10.1016/j.phrs.2021.105578

11. Huang X, Yu Z, Bu S, et al. An ensemble model for prediction of vancomycin trough concentrations in pediatric patients. Drug Des Devel Ther. 2021;15:1549–1559. doi:10.2147/dddt.S299037

12. Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Mach Intell. 2019;1(5):206–215. doi:10.1038/s42256-019-0048-x

13. Damnjanović I, Tsyplakova N, Stefanović N, Tošić T, Catić-đorđević A, Karalis V. Joint use of population pharmacokinetics and machine learning for optimizing antiepileptic treatment in pediatric population. Ther Adv Drug Saf. 2023;14:20420986231181337. doi:10.1177/20420986231181337

14. Destere A, Marquet P, Labriffe M, Drici MD, Woillard JB. A hybrid algorithm combining population pharmacokinetic and machine learning for isavuconazole exposure prediction. Pharm Res. 2023;40(4):951–959. doi:10.1007/s11095-023-03507-y

15. Huang Y, Zhou Y, Liu D, et al. Comparison of population pharmacokinetic modeling and machine learning approaches for predicting voriconazole trough concentrations in critically ill patients. Int J Antimicrob Agents. 2024;26:107424. doi:10.1016/j.ijantimicag.2024.107424

16. Shang S, Cheng L, Li X, et al. Effect of CYP2C19 polymorphism on the plasma voriconazole concentration and voriconazole-to-voriconazole-N-oxide concentration ratio in elderly patients. Mycoses. 2020;63:1181–1190. doi:10.1111/myc.13105

17. Yu M, Yang J, Xiong L, et al. Comparison of ultra-performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS) and enzyme-multiplied immunoassay technique (EMIT) for quantification of voriconazole plasma concentration from Chinese patients. Heliyon. 2023;9(11):e22015. doi:10.1016/j.heliyon.2023.e22015

18. Pascual A, Csajka C, Buclin T, et al. Challenging recommended oral and intravenous voriconazole doses for improved efficacy and safety: population pharmacokinetics-based analysis of adult patients with invasive fungal infections. Clin Infect Dis. 2012;55(3):381–390. doi:10.1093/cid/cis437

19. Badillo S, Banfai B, Birzele F, et al. An introduction to machine learning. Clin Pharmacol Ther. 2020;107(4):871–885. doi:10.1002/cpt.1796

20. Tang BH, Guan Z, Allegaert K, et al. Drug clearance in neonates: a combination of population pharmacokinetic modelling and machine learning approaches to improve individual prediction. Clin Pharmacokinet. 2021;60(11):1435–1448. doi:10.1007/s40262-021-01033-x

21. Jiang X, Xu C. Deep learning and machine learning with grid search to predict later occurrence of breast cancer metastasis using clinical data. J Clin Med. 2022;11(19):5772. doi:10.3390/jcm11195772

22. Brockhoff D, Wagner T, Trautmann H. R2 indicator-based multiobjective search. Evol Comput. 2015;23(3):369–395. doi:10.1162/EVCO_a_00135

23. Ma P, Liu R, Gu W, et al. Construction and interpretation of prediction model of teicoplanin trough concentration via machine learning. Front Med. 2022;9:808969. doi:10.3389/fmed.2022.808969

24. Zaenul Mustaqim SAA, Pristyanto Y, Astuti Y. The effect of recursive feature elimination with cross-validation (RFECV) feature selection algorithm toward classifier performance on credit card fraud detection. In: International Conference on Artificial Intelligence and Computer Science Technology. Yogyakarta, Indonesia; 2021. DOI 10.1109/ICAICST53116.2021.9497842.

25. Lundberg SM, Erion G, Chen H, et al. From local explanations to global understanding with explainable ai for trees. Nature Mach Intell. 2020;2(1):56–67. doi:10.1038/s42256-019-0138-9

26. Farkas A, Daroczi G, Villasurda P, Dolton M, Nakagaki M, Roberts JA. Comparative evaluation of the predictive performances of three different structural population pharmacokinetic models to predict future voriconazole concentrations. Antimicrob Agents Chemother. 2016;60(11):6806–6812. doi:10.1128/aac.00970-16

27. Cheng L, Xiang R, Liu F, et al. Therapeutic drug monitoring and safety of voriconazole in elderly patients. Int Immunopharmacol. 2020;78:106078. doi:10.1016/j.intimp.2019.106078

28. Allegra S, De Francia S, Nicolò A D, et al. Effect of gender and age on voriconazole trough concentrations in Italian adult patients. Eur J Drug Metab Pharmacokinetics. 2020;45(3):405–412. doi:10.1007/s13318-019-00603-6

29. Chen C, Xu T, Zhou K, Zhu S. Factors affecting voriconazole concentration to dose ratio changes according to route of administration. Eur J Hosp Pharm. 2022. doi:10.1136/ejhpharm-2021-003173

30. Kato K, Nagao M, Yamamoto M, et al. Oral administration and younger age decrease plasma concentrations of voriconazole in pediatric patients. J Infection Chemother. 2016;22(1):27–31. doi:10.1016/j.jiac.2015.09.008

31. Elias R, Hartshorn K, Rahma O, Lin N, Snyder-Cappione JE. Aging, immune senescence and immunotherapy: A comprehensive review. Semin Oncol. 2018;45(4):187–200. doi:10.1053/j.seminoncol.2018.08.006

32. Sheedfar F, Di Biase S, Koonen D, Vinciguerra M. Liver diseases and aging: friends or foes? Aging Cell. 2013;12(6):950–954. doi:10.1111/acel.12128

33. Purkins L, Wood N, Ghahramani P, Greenhalgh K, Allen MJ, Kleinermans D. Pharmacokinetics and safety of voriconazole following intravenous- to oral-dose escalation regimens. Antimicrob Agents Chemother. 2002;46(8):2546–2553. doi:10.1128/aac.46.8.2546-2553.2002

34. Neely M, Rushing T, Kovacs A, Jelliffe R, Hoffman J. Voriconazole pharmacokinetics and pharmacodynamics in children. Clin Infect Dis. 2010;50(1):27–36. doi:10.1086/648679

35. Klotz U. Pharmacokinetics and drug metabolism in the elderly. Drug Metab Rev. 2009;41(2):67–76. doi:10.1080/03602530902722679

36. Tang D, Yan M, Song BL, et al. Population pharmacokinetics, safety and dosing optimization of voriconazole in patients with liver dysfunction: a prospective observational study. Br J Clin Pharmacol. 2021;87(4):1890–1902. doi:10.1111/bcp.14578

37. Han K, Capitano B, Bies R, et al. Bioavailability and population pharmacokinetics of voriconazole in lung transplant recipients. Antimicrob Agents Chemother. 2010;54(10):4424–4431. doi:10.1128/aac.00504-10

38. Dvorackova E, Sima M, Vyskocilova K, et al. Population pharmacokinetics and covariate-based dosing individualization of voriconazole in lung transplant recipients. J Chemother. 2023;1–10. doi:10.1080/1120009x.2023.2219590

39. Lin XB, Li ZW, Yan M, et al. Population pharmacokinetics of voriconazole and CYP2C19 polymorphisms for optimizing dosing regimens in renal transplant recipients. Br J Clin Pharmacol. 2018;84(7):1587–1597. doi:10.1111/bcp.13595

40. Zeng G, Wang L, Shi L, et al. Variability of voriconazole concentrations in patients with hematopoietic stem cell transplantation and hematological malignancies: influence of loading dose, procalcitonin, and pregnane X receptor polymorphisms. Eur J Clin Pharmacol. 2020;76(4):515–523. doi:10.1007/s00228-020-02831-1

41. Gatti M, Fornaro G, Pasquini Z, et al. Impact of inflammation on voriconazole exposure in critically ill patients affected by probable COVID-19-associated pulmonary aspergillosis. Antibiotics. 2023;12(4). doi:10.3390/antibiotics12040764

42. Zeng G, Shi L, Li H, et al. Effect of cyclosporine a and polymorphisms in CYP2C19 and ABCC2 on the concentration of voriconazole in patients undergoing allogeneic hematopoietic stem cell transplantation. Xenobiotica Fate Foreign Compounds Bio Syst. 2020;50(5):614–619. doi:10.1080/00498254.2019.1672907

43. Li ZW, Peng FH, Yan M, et al. Impact of CYP2C19 genotype and liver function on voriconazole pharmacokinetics in renal transplant recipients. Therap Drug Monitoring. 2017;39(4):422–428. doi:10.1097/ftd.0000000000000425

44. Koselke E, Kraft S, Smith J, Nagel J. Evaluation of the effect of obesity on voriconazole serum concentrations. J Antimicrob Chemother. 2012;67(12):2957–2962. doi:10.1093/jac/dks312

45. Dekkers BGJ, Veringa A, Marriott DJE, et al. Invasive candidiasis in the elderly: considerations for drug therapy. Drugs Aging. 2018;35(9):781–789. doi:10.1007/s40266-018-0576-9

Creative Commons License © 2025 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, 4.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.