Publication: Predicting Non-Surgical Root Canal Therapy Outcomes Using Machine Learning
Abstract
Despite the high success rate of nonsurgical root canal treatment (NSRCT), some cases fail, and predicting which ones remains challenging. Artificial intelligence (AI), particularly machine learning (ML), has shown promise in dentistry for diagnosing various conditions. However, precise AI tools for predicting endodontic outcomes remain limited. This study evaluates the ability of ML models to predict NSRCT prognosis, compares their effectiveness, and identifies key factors influencing treatment outcomes. Specifically, it assesses and compares the performance of three ML algorithms: Decision Tree (DT), Gradient Boosting Machine (GBM), and Random Forest (RF). From 2009 to 2022, 11,232 primary NSRCT and recall cases were performed at the Harvard Dental Clinic Faculty Group Practice by residents and faculty. A total of 700 charts were randomly selected and manually screened, with 120 cases meeting the inclusion and exclusion criteria for ML analysis. The Chi-Squared p-value matrix identified “Satisfactory Coronal Restoration” as significantly correlated with NSRCT “Success vs. Failure” (p = 0.0107). To reduce noise and improve model performance, features with p-values greater than 0.9 were removed prior to ML analysis. Feature importance rankings across all three ML models identified “Follow-up Time (Months)”, “Age During Treatment”, “Satisfactory Coronal Restoration”, and “Sealer Puff” as the most influential factors in predicting NSRCT outcomes. Among the models, RF demonstrated the highest performance, as reflected by its slightly superior F1 score and AUC value. F1 scores ranked as follows: RF (F1 = 0.789), GBM (F1 = 0.707), and DT (F1 = 0.693). For AUC values, DT did slightly better than GBM, giving the following ranking: RF (AUC = 0.617), DT (AUC = 0.582), and GBM (AUC = 0.501). With the current naïve hyperparameters and selected features, cross-validation scores and accuracy metrics indicate a slight preference for the RF classifier over the other two models.
Further hyperparameter tuning and improved feature selection may provide marginal improvements. The primary limitation of this study is the small sample size, which affects the reliability of the ML model predictions. However, with a larger dataset, ML could serve as a valuable tool for assessing NSRCT outcomes and guiding clinical decision-making in endodontics.
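The two statistics the abstract relies on, a chi-squared test of association between a binary feature and treatment outcome, and the F1 score of a classifier, can be illustrated with a minimal Python sketch. The abstract does not say which software or test variant the study used, so this is an assumption-laden illustration only: it assumes a 2x2 table (binary feature against success or failure), uses Pearson's test without continuity correction, and all counts are hypothetical and are not the study's data. The function names `chi2_2x2` and `f1_from_counts` are invented for this example.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the 2x2 table
    [[a, b], [c, d]]. Returns (statistic, p_value) for 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def f1_from_counts(tp, fp, fn):
    """F1 score (harmonic mean of precision and recall) from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for 120 cases (NOT the study's data).
# Rows: coronal restoration satisfactory / unsatisfactory.
# Columns: treatment success / failure.
stat, p = chi2_2x2(70, 20, 15, 15)
print(f"chi2 = {stat:.3f}, p = {p:.4f}")

# Hypothetical confusion counts for one classifier (NOT the study's data).
f1 = f1_from_counts(tp=60, fp=20, fn=12)
print(f"F1 = {f1:.3f}")
```

A feature-screening rule like the one described in the abstract would keep a feature when its p-value is at or below the chosen cutoff (0.9 in the study) and drop it otherwise. Features with more than two levels would need a general contingency-table test rather than this 2x2 shortcut.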