A machine learning clinic scoring system for hepatocellular carcinoma based on the Surveillance, Epidemiology, and End Results database
Facciorusso, Antonio;
2024-01-01
Abstract
Background: Hepatocellular carcinoma (HCC) poses a global threat to life; however, numerical tools to predict the clinical prognosis of these patients remain scarce. The primary objective of this study is to establish a clinical scoring system for evaluating the overall survival (OS) rate and cancer-specific survival (CSS) rate in HCC patients.
Methods: From the Surveillance, Epidemiology, and End Results (SEER) Program, we identified 45,827 primary HCC patients. These cases were randomly allocated to a training cohort (22,914 patients) and a validation cohort (22,913 patients). Univariate and multivariate Cox regression analyses, coupled with Kaplan-Meier methods, were employed to evaluate prognosis-related clinical and demographic features. Factors demonstrating prognostic significance were used to construct the model. The model's stability and accuracy were assessed through the C-index, receiver operating characteristic (ROC) curves, calibration curves, and clinical decision curve analysis (DCA), and the model was compared with the American Joint Committee on Cancer (AJCC) staging system. Finally, machine learning (ML) was used to quantify the variables in the model and establish a clinical scoring system.
Results: Univariate and multivariate Cox regression analyses identified 11 demographic and clinicopathological features as independent prognostic indicators for both CSS and OS. Two models, each incorporating the 11 features, were developed, both of which demonstrated significant prognostic relevance. The C-index for predicting CSS and OS surpassed that of the AJCC staging system. The area under the curve (AUC) in time-dependent ROC analysis consistently exceeded 0.74 in both the training and validation sets. Furthermore, internal and external calibration plots indicated that the model predictions aligned closely with observed outcomes. Additionally, DCA demonstrated the superiority of the model over the AJCC staging system, yielding greater clinical net benefit. Finally, the quantified clinical scoring system could efficiently discriminate between high- and low-risk patients.
Conclusions: An ML clinical scoring system trained on a large-scale dataset exhibits good predictive and risk stratification performance in the training and validation cohorts. Such a clinical scoring system is readily integrable into clinical practice and will be valuable in enhancing the accuracy and efficiency of HCC management.
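The abstract reports the C-index as the main measure of discrimination but does not define it. As an illustration only, and not the authors' implementation, the following minimal Python sketch computes Harrell's concordance index for right-censored survival data. The function name `harrell_c_index` and the toy follow-up data are hypothetical and are not drawn from SEER; pairs with tied follow-up times are simply skipped, which is a simplifying assumption.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- observed follow-up time per patient
    events      -- 1 if the event (death) was observed, 0 if censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b]:
            continue  # tied times are skipped in this simplified sketch
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four patients, one censored at time 10.
toy_times = [5, 10, 15, 20]
toy_events = [1, 0, 1, 1]
perfect_risk = [0.9, 0.4, 0.6, 0.2]
imperfect_risk = [0.9, 0.4, 0.1, 0.2]

print(harrell_c_index(toy_times, toy_events, perfect_risk))
print(harrell_c_index(toy_times, toy_events, imperfect_risk))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the scale on which the comparison with AJCC staging above should be read.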