Routine blood cell indices, particularly lymphocyte count and platelet parameters, are associated with rheumatoid arthritis severity, and a Random Forest model built on them achieved test-set area under the curve (AUC) values of approximately 0.87.
Rheumatoid arthritis (RA) is a chronic autoimmune disease marked by persistent systemic inflammation and progressive joint impairment, primarily driven by immune cell dysregulation. Although immune mechanisms are well established in RA pathogenesis, the association between routine blood cell indices and disease activity remains inadequately clarified. Identifying accessible and reliable biomarkers is critical for improving risk stratification and enabling personalized therapeutic strategies.
Therefore, this study sought to develop an interpretable machine learning model based on routine hematologic parameters to assess RA severity and to investigate potential associations and causal relationships between blood cell indices and disease activity.
A retrospective study was conducted using routine blood and biochemical data from 4,401 patients. Stratification by disease severity served as the primary outcome. A total of 55 clinical variables were investigated. Recursive feature elimination was applied to identify the most relevant predictors. Ten machine learning algorithms were benchmarked, with internal validation performed to assess model performance.
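The feature selection and benchmarking steps can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data in place of the clinical table; the sample size, the number of retained features, and the two candidate models are illustrative assumptions, not the study's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in (assumption): 55 candidate variables, binary severity label.
X, y = make_classification(
    n_samples=600, n_features=55, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Recursive feature elimination: fit, drop the 5 least important features, refit,
# and repeat until 10 features remain (10 is an arbitrary illustrative target).
selector = RFE(
    RandomForestClassifier(n_estimators=100, random_state=0),
    n_features_to_select=10,
    step=5,
)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)

# Benchmark candidate algorithms on the selected features with 5-fold
# cross-validated AUC (only two of many possible candidates are shown).
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
cv_auc = {
    name: cross_val_score(
        model, X_train_sel, y_train, cv=5, scoring="roc_auc"
    ).mean()
    for name, model in candidates.items()
}
best_model_name = max(cv_auc, key=cv_auc.get)
```

In this sketch the selector is fitted once on the training split for brevity. A stricter design would nest the selection step inside each cross-validation fold so that the held-out fold never influences which features are kept.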
Model interpretability was analyzed using SHapley Additive exPlanations (SHAP). Restricted cubic spline models and logistic regression were employed to assess the associations between RA severity and blood cell indices. Mendelian randomization analysis was further conducted to investigate potential causal relationships between the identified indices and RA risk.
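To make the Mendelian randomization step concrete, the following sketch computes a fixed-effect inverse-variance weighted (IVW) estimate, one commonly used estimator. The abstract does not state which estimator the study used, so IVW is an assumption here, and the per-variant effect sizes are invented purely for illustration.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted causal estimate.

    Each genetic variant contributes a Wald ratio (beta_outcome / beta_exposure);
    the ratios are combined with weights beta_exposure**2 / se_outcome**2.
    Returns (estimate, standard_error) on the log-odds scale for a binary outcome.
    """
    numerator = sum(
        bx * by / se**2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx**2 / se**2 for bx, se in zip(beta_exposure, se_outcome)
    )
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical summary statistics for three variants (not from the study):
# variant effects on a blood cell index and on RA risk (log odds).
bx = [0.10, 0.20, 0.15]
by = [0.05, 0.10, 0.075]
se = [0.02, 0.02, 0.02]

estimate, std_err = ivw_estimate(bx, by, se)
causal_odds_ratio = math.exp(estimate)  # odds ratio per unit increase in the index
```

Because every hypothetical variant here has the same Wald ratio, the pooled estimate equals that ratio. With real data the ratios differ, and sensitivity analyses are usually run alongside IVW to probe pleiotropy.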
Blood cell indices emerged as the primary variables linked with RA severity. Among the evaluated algorithms, the Random Forest model demonstrated the best predictive performance, achieving test-set area under the curve (AUC) values of 0.870 and 0.874. Lower lymphocyte counts and increased platelet distribution variation were significantly associated with higher odds of severe RA (odds ratio range 0.54–2.17). Nonlinear analyses showed that both extremely low and extremely high blood cell levels were linked to worse outcomes. Furthermore, Mendelian randomization findings indicated a potential causal relationship between blood cell indices and the risk of developing RA.
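The two summary statistics reported above can be unpacked with a short sketch. AUC is the probability that a randomly chosen severe case receives a higher predicted score than a randomly chosen non-severe case, and an odds ratio is the exponential of a logistic regression coefficient. The labels, scores, and standard error below are hypothetical; only the odds ratio value 0.54 is taken from the reported range.

```python
import math


def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney formulation).
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic coefficient and its SE to an odds ratio and 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical predictions: 1 = severe RA, 0 = non-severe.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
auc = auc_from_scores(labels, scores)

# A coefficient of ln(0.54) corresponds to an odds ratio of 0.54, i.e. each
# unit increase in the predictor lowers the odds of severe RA by about 46%.
# The standard error of 0.10 is an invented value for illustration.
or_point, or_low, or_high = odds_ratio_with_ci(math.log(0.54), 0.10)
```

An odds ratio below 1 with a confidence interval that excludes 1 indicates a protective association, while a value above 1 (such as the upper end of the reported range, 2.17) indicates higher odds of severe disease.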
Routine blood cell indices were strongly associated with RA severity and may play a contributory role in disease progression. The Random Forest–based machine learning model showed robust predictive performance and provided interpretable outputs that could enhance clinical decision-making. These findings support the integration of blood cell–based predictive tools into personalized RA management strategies, potentially improving risk assessment and therapeutic planning.
Therapeutic Advances in Musculoskeletal Disease
Associations of blood cell indices with the severity of rheumatoid arthritis: a retrospective case–control and machine learning modeling study
Rongqing He et al.