Background
Survival analysis is a critical tool in transplantation studies. The integration of machine learning techniques, particularly the Random Survival Forest (RSF) model, offers potential enhancements to predictive modeling and decision-making. This study aims to provide an introduction to the application of the RSF model in survival analysis in kidney transplantation, alongside a practical guide to developing and evaluating predictive algorithms.
Methods
We employed an RSF model to analyze a simulated dataset of kidney transplant recipients. The data were split into training, validation, and test sets using split-sample (70%/30%) and 5-fold cross-validation techniques to evaluate model performance. Hyperparameter tuning strategies were employed to select the best model. The concordance index (C-index) and Integrated Brier Score (IBS) were used for internal validation. Additionally, time-dependent AUC, F1 score, accuracy, and precision were evaluated to provide a comprehensive assessment of the model’s predictive performance. Finally, a Cox Proportional Hazards model was fitted to compare the main metrics between the two models. All analyses were supported by step-by-step code to ensure reproducibility.
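As an illustration of the C-index used for internal validation, the following is a minimal pure-Python sketch of Harrell's concordance index. It is not the code accompanying this study. The function name `concordance_index`, the argument names, and the toy data are hypothetical and made up for illustration. The sketch assumes that a higher risk score means shorter expected survival, and it skips pairs with tied observed times as a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs ranked correctly by risk.

    times       -- observed follow-up times
    events      -- 1/True if the event was observed, 0/False if censored
    risk_scores -- predicted risk (assumption: higher = shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        # Let `a` be the subject with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk failed first: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Toy, made-up data (not from the study): five recipients.
toy_times = [5, 8, 12, 3, 9]
toy_events = [1, 0, 1, 1, 0]
toy_risk = [0.7, 0.4, 0.8, 0.9, 0.5]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. Production analyses would normally rely on an established survival analysis library rather than a hand-written loop.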
Findings
The RSF model obtained a C-index of 0.774 and an IBS of 0.090. The F1 score was 0.945, accuracy was 89.67%, and precision was 90.99%. In addition, the time-dependent ROC analysis produced an AUC of 0.709, indicating moderate predictive performance. Lastly, the analysis showed that the three most important variables were donor age, BMI, and recipient age.
Conclusions
This study demonstrates the robustness and potential of the RSF model in kidney transplant analysis, achieving strong validation metrics and highlighting its advantages in managing complex, censored data, while emphasizing the need for further exploration of hybrid models and clinical integration.