Machine learning-based decision support system for orthognathic diagnosis and treatment planning

Background Dento-maxillofacial deformities are common problems. Orthodontic–orthognathic surgery is the primary treatment but accurate diagnosis and careful surgical planning are essential for optimum outcomes. This study aimed to establish and verify a machine learning–based decision support system for treatment of dento-maxillofacial malformations. Methods Patients (n = 574) with dento-maxillofacial deformities undergoing spiral CT during January 2015 to August 2020 were enrolled to train diagnostic models based on five different machine learning algorithms; the diagnostic performances were compared with expert diagnoses. Accuracy, sensitivity, specificity, and area under the curve (AUC) were calculated. The adaptive artificial bee colony algorithm was employed to formulate the orthognathic surgical plan, and subsequently evaluated by maxillofacial surgeons in a cohort of 50 patients. The objective evaluation included the difference in bone position between the artificial intelligence (AI) generated and actual surgical plans for the patient, along with discrepancies in postoperative cephalometric analysis outcomes. Results The binary relevance extreme gradient boosting model performed best, with diagnostic success rates > 90% for six different kinds of dento-maxillofacial deformities; the exception was maxillary overdevelopment (89.27%). AUC was > 0.88 for all diagnostic types. Median score for the surgical plans was 9, and was improved after human–computer interaction. There was no statistically significant difference between the actual and AI- groups. Conclusions Machine learning algorithms are effective for diagnosis and surgical planning of dento-maxillofacial deformities and help improve diagnostic efficiency, especially in lower medical centers. Supplementary Information The online version contains supplementary material available at 10.1186/s12903-024-04063-6.


Introduction
Dento-maxillofacial deformities are a common problem in clinical practice, with approximately 5% of the population exhibiting abnormal jaw development [1,2].Orthodontic-orthognathic surgery is the primary treatment for these deformities, but accurate diagnosis and careful surgical planning are essential for optimum outcomes [3][4][5].The diagnosis of dento-maxillofacial deformities and the design of orthognathic surgery rely primarily on clinical examination, assessment of occlusal relationship, and cephalometry [1].
Artificial intelligence (AI) has demonstrated the capability to identify skeletal malocclusions and predict the need for orthognathic surgery [6,7].However, previous studies have mainly relied on lateral cephalometric data, which inevitably results in information loss.These studies diagnosed dento-maxillofacial deformities through imaging, and specific surgical plans were not designed.Moreover, certain scholars have employed 3D point cloud deep learning to generate bone templates for guiding bone block movement schemes in orthognathic surgery [8].While this represents a novel approach toward AI-assisted orthognathic surgery design, there is a lack of clinical validation for this method.
The present study created an interactive decision support system that could output an accurate diagnosis of dento-maxillofacial deformities and recommend individual surgical plans based on surgeon preferences.

Methods
This study was conducted in the West China Hospital of Stomatology of Sichuan University and was approved by the Investigational Review Board of West China Hospital of Stomatology (WCHSIRB-OT-2019-125).All procedures were in accordance with relevant guidelines and regulations.

Patients
A total of 574 patients who visited the Department of Orthognathic and Temporomandibular Joint Surgery, West China Hospital of Stomatology, Sichuan University, from January 2015 to August 2020 formed the training cohort.We excluded patients with missing teeth (other than third molars), previous history of orthodontic and orthognathic treatment, dento-maxillofacial deformities caused by fractures or tumors, and those who did not undergo digital surgical planning.A total of 89 patients were excluded.
Demographic information, extraoral and intraoral photographs, and pretreatment cephalometric measurements were extracted from the clinical records (Table 1) [2,9], and 27 features were used as input features (Table S1).The input features were preprocessed to ensure that all were quantified before being used for model training.The diagnosis of dento-maxillofacial deformity was made by an orthognathic surgeon (Dr.Luo) with 19 years of clinical experience.The deformities included those of maxillary development, mandibular development, maxillary deviation, and mandibular deviation.

Diagnosis model
Different types of dento-maxillofacial deformities were diagnosed based on their characteristics, resulting in a comprehensive collection of 28 clinical features.These features were used to evaluate maxillary and mandibular development and deviations.The measurements were obtained with spiral computerized tomography (CT) utilizing the Mimics 16.0 software (Materialise Inc., Leuven, Belgium).All measurements were repeated twice and averaged by two orthognathic surgeons after an interval of at least 1 week.Cephalometric measurements indicating the degree of jaw deflection were considered positive from top left to bottom right.
Common classifier algorithms included extreme gradient boosting (XGBoost), discriminant analysis, naive Bayesian classification, neural networks, and support vector machines (SVM).These models were applied to the test set to evaluate the appropriateness and effectiveness of each model and determined the most appropriate and superior model.To ensure optimal classifier performance and generalizability, we randomly divided the data into training, validation, and test sets in a ratio of 7:3:1.The binary relevance extreme gradient boosting (BR-XGBoost) algorithm demonstrated superior performance compared to the remaining four machine learning algorithms in the validation set.
BR-XGBoost algorithm was used to process the data on dental and maxillofacial malformations and realize Considering the differences in patients' clinical symptoms, it was necessary to study the generalization performance of the proposed algorithm.The feature selection of each XGBoost j was based on the forward sequence selection method.For label j, first, the importance ranking of all features was obtained based on BR-XGBoost.The features with the top rankings were then added to the feature subset (initially empty set), and the cross-validation classification accuracy of the feature subset was calculated after each addition.If the classification accuracy was improved, the feature was retained, otherwise it was eliminated, and the optimal feature subset of label j was obtained by traversing all features.The performance of the model was not only affected by the training set, but also by the selection of its built-in parameters, that is, super parameters.To avoid the complexity and uncertainty of manual parameter adjustment, the distributed asynchronous hyper parameter optimization module on the Pycharm platform was used.Based on the Bayesian optimization theory, the cross-validation method was adopted to optimize each BR-XGBoost, and determine the range of hyper-parameters, thereby improving model accuracy.
To evaluate the performance of the artificial intelligence model, the following normal metrics were used.where TP = true positive, TN = true negative, FP = false positive, and FN = false negative.

Surgical design model
Following preoperative orthodontics, cephalometric measurements were re-performed to obtain patientspecific characteristics.The terminal occlusal relationship was recorded by the 3D spatial relationship of three points marked on the maxilla and the mandible.The reference intervals of these characteristics were established based on their normal intervals in the Chinese population and our clinical experience.Six sets of data were used to accurately describe the 3D movement of the maxilla, mandible, and chin according to the commonly used clinical methods.
The AI-based surgical model employed the adaptive artificial bee colony (aABC) algorithm to calculate the maxillary movement, mandibular movement, and mentum movement.The 3D translation and rotation of each part in the surgical scheme constitute the solution space of the aABC algorithm.All possible solutions in the solution space are expressed by honey source, and the degree of honey source is measured by the value of fitness function.Bees can be divided into three types according to division of labor: collecting bees, following bees, and reconnaissance bees (collecting bees and following bees each account for half of the total number of bees; the collecting bees corresponding to inferior honey sources are transformed into reconnaissance bees to search for new honey sources).The specific search for honey source is as follows: 1) The algorithm was initialized.
2) Algorithm optimization objectives were determined.
In this step, the reasonable range of values for each feature was determined.These ranges were superimposed to form the optimized feasible region of the algorithm proposed in this paper.The optimal eigenvalue within the feasible range was determined as the final surgical plan.This study assumed that feature values closer to the middle of the range were better.3) Intelligent exploration of the optimal solution for surgical plans based on aABC algorithm: With increase of the number of cycles of the algorithm, the weight between algorithm exploration and development constantly changes.When the cycle starts, the algorithm has a high weight in exploration, and the global search ability is strengthened, so it is not easy to fall into local optimization.The continuous increase in the number of cycles gradually decreases the adaptive coefficient value.At this time, the algorithm tends to develop the region guided by the global optimal solution, which improves the convergence speed and accuracy of the algorithm.
4) For the following bee, the calculated probability value selects the honey source and carries out neighborhood search to generate a new solution, and selects the honey source with better fitness value.5) If no better honey source is found within a limited number of times, the honey source is discarded and a new honey source is randomly generated.6) The optimal honey source (global optimal solution) found by all bees is saved, and the termination condition of the algorithm (maximum number of iterations) is judged.If the conditions are met, the optimal surgical scheme is returned and the algorithm is terminated.Otherwise, we return to the first step to continue the algorithm.
The surgical plan generated by the surgical design model could be adjusted according to expert preferences, a process called human-computer interaction.The surgical plans were generated using features of the normal population.In case of dissatisfaction with the current design plan regarding the feasibility and outcome of the operation, the surgeon could propose potential modifications to the jaw's movement.
The AI-generated surgical plan was subjectively and objectively evaluated using the test set.The subjective evaluation of the surgical plan encompassed both feasibility and effectiveness assessments, conducted by three experienced orthognathic surgeons from our center.None of these surgeons were involved in the study's design, and only evaluated the effect of the surgical protocol based on their clinical experience.Prior to commencing the study, intraclass correlation coefficients (ICCs) for the realistic surgical plan from the test set were analyzed among the three participating surgeons by the realistic surgical plan from the test set.The objective evaluation included the difference in bone position between the AI-generated and actual surgical plans, as well as discrepancies in postoperative cephalometric analysis outcomes.Measurements were conducted using 3-matic 9.0 software (Materialise, Belgium).Paired t test was used to analyze statistical differences in cephalometric results.
Figure 1 presents an overview of the decision support system and treatment planning.

Performance of diagnostic system
Of the five different types of machine learning algorithms, the BR-XGBoost algorithm showed the highest accuracy and sensitivity for classification of different types of dento-maxillofacial deformities (Fig. 2).The receiver operating characteristic (ROC) curve in Fig. 3 illustrates the performance of the BR-XGBoost algorithm in the diagnosis dento-maxillofacial deformities.The AUC for the model ranged from 0.881 to 0.982.The best performance was for diagnosis of deviation, followed by mandibular overdevelopment and maxillary underdevelopment.The diagnostic model classified the dentomaxillofacial deformities, and the combination of the two provided the final diagnosis.

Performance of the surgical design model
The system can output a personalized surgical plan (Fig. 4).The output results contain six 3D parameters representing rotation and movement of maxilla, mandible, and chin.For movement of maxilla and mandible, the corresponding incisor point was taken as the movement center.For the chin, the pogonion point was taken as the movement center and is reduced to have translational movement only.
The ICCs for the three participating orthognathic surgeons in this study, are presented in Table S2.The effectiveness and feasibility of intelligently designed surgical procedures were rated on a scale of 1 to 10.The surgical plans were individually evaluated by the three surgeons; the distribution of their scores is presented in Table 2.
The fitting analysis process is shown in Fig. 5, while the discrepancies in the maxilla, mandible, and chin positions under two distinct schemes are presented in Table 3.
The differences in three-dimensional cephalometric measurements result between the actual surgical and the AI-model are presented in Table 4.
Figure 6, and 7 present the cephalometric data and automatically generated surgical plans for three patients with dento-maxillofacial deformities.

Discussion
In this study, we propose a machine learning-based decision support system for orthognathic diagnosis and treatment planning.The system can diagnose dentomaxillofacial deformities and output an interactive and personalized surgical option by analyzing patients' clinical features and the preferences of both the surgeon and Fig. 2 The performance of five machine learning algorithms (BR-XGBoost, SVM, neural network, discriminant analysis, Bayesian classification) was evaluated in six classification problems related to dento-maxillofacial deformities: maxillary overgrowth, maxillary undergrowth, maxillary deviation, mandibular overgrowth, mandibular undergrowth, and mandibular deviation.The radar chart shows the algorithm performs on Accuracy, Precision, Recall, Specificity, and F1 score.The BR-XGBoost algorithm consistently demonstrated superior results across all six classification problems Fig. 4 Personalized surgical plan designed by the system for a skeletal class III malocclusion patient Fig. 3 ROC curve of the BR-XGBoost algorithm in the diagnosis of dento-maxillofacial deformities.The AUC values ranged from 0.881 to 0.982 in different diagnostic classification tasks the patient.The system achieved an overall accuracy rate of > 90% and was highly rated by expert reviewers.
The diagnosis and treatment planning of dento-maxillofacial deformities is a multifaceted process influenced by various factors, including ethnicity, esthetics, and diagnostic practices [2,11].The surgeon's experience also plays an important role.Dento-maxillofacial deformities manifest as an incongruity in the spatial relationship between the teeth and the jaw, and generally requires orthognathic and orthodontic treatment.Advancements in orthognathic surgery have led to the development of several objective indicators for aiding the surgeons in diagnosis and surgical planning.The classical two-dimensional cephalometry, model surgery, and VTO techniques can provide surgeons with valuable quantitative diagnostic information and aid meticulous surgical planning [12].With the advancements in computer technology, digital surgical technology has enabled the surgeons to perform computerized preoperative osteotomy and postoperative bone morphology simulations [13,14].This holds great significance for the precise design and implementation of the surgical plan.
Digital surgery has become standard practice in orthognathic surgery across numerous clinical centers.Its execution requires a collaboration among professional institutions, their affiliated hospitals, and digital surgery design companies.However, certain challenges persist, including a complex manual operation process, a significant need for repetitive labor, low efficiency, and prolonged time consumption [15].The diagnosis and surgical design of dento-maxillofacial deformities adhere to highly standardized protocols, ensuring stability in the selection of surgical methods and enabling quantifiable surgical planning.These favorable conditions facilitate intelligent diagnosis and surgical design for dento-maxillofacial deformities.Considering the characteristics of orthognathic surgery, broader applications of AI in this domain are inevitable in the future.This may enhance the work efficiency, diagnostic accuracy, and treatment outcomes, thereby benefiting both doctors and patients.
AI can be used to accurately classify patients' skeletal patterns, enabling the identification of cranial structures with varying degrees of developmental deviations.This evaluation can be conducted using cephalometric images to classify anteroposterior and vertical skeletal discrepancies [16,17].The combination of posteroanterior radiographs and lateral cephalograms can enhance accuracy and provide additional information on asymmetry [6,18].Previous studies have utilized AI to elucidate the necessity of orthognathic surgery.These studies utilized cephalometric parameters either independently or in conjunction with other assessments such as anteroposterior radiographs, occlusal models, and demographic variables for evaluating the need for orthognathic surgery [6,17,19].

Effect of surgery
The majority of existing studies have primarily focused on the identification of patients requiring orthognathic surgery.While one study has utilized deep learning to generate a virtual jaw shape and employed this model as a reference for designing specific jaw movements, the constructed model requires validation before clinical application [8].To the best of our knowledge, no study has yet achieved the successful development of personalized and precise orthognathic surgical design tailored specifically to individual patients.
The XGBoost algorithm has unique advantages in dealing with diagnostic problems: it is excellent for parallel operation and can run large-scale data quickly; it can automatically optimize split nodes and is good at dealing with irregular data with many outliers and missing values; and, finally, it is interpretable and flexible [20,21].Further, the XGBoost algorithm can handle difficult over fitting, has higher accuracy of loss function solution, supports sparse data processing, and so on.Compared with Fig. 5 A color distance map was applied to align with the two surgical plans, where the colored bone segment depicted represents the actual surgical plan.The difference distribution is shown in the lower right corner the sample data in the industrial field where artificial intelligence is widely used, the sample size for the maxillofacial diagnosis problem studied is small and so is ideally suited for XGBoost, which is less demanding in terms of sample size.In this study, BR-XGBoost achieved the best overall performance, with overall accuracy rate of over 90%.Considering that some patient data were collected manually, there was a certain amount of deviation noise.We found that the proposed algorithm had good detection performance when the deviation conformed to the normal distribution.The clinical features exhibited variations between genders, with different reference intervals selected for different genders in this study [22].The purpose of the orthognathic surgery was to correct incongruous three-dimensional positional relationships between the teeth and the jaws.Cephalometric indicators were the main clinical measures utilized for characterizing disharmony.The designed plan used several cephalometric indicators that are commonly used in the clinic as a basis for correcting the abnormalities.When designing the optimal feature target value for AI algorithms in this study, we believe that the best feature values were those closest to the middle of each feature range.The aABC algorithm is a bionics adaptive artificial intelligence technique for solving extremum problems [23,24].It has natural advantages in solving function optimization problems and is also the most applied algorithm in the field though there are no reports of its application in the medical field.This algorithm can be used to solve multivariable function problems.To maximize the efficiency of the algorithm, we chose to use the aABC [25,26], which can adaptively change the weight of each influencing factor in the search equation to simultaneously provide good exploration and development ability.The aABC algorithm increases the operation speed by about 30%.In this study, single operation time for one set of data was ∼ 30 s.
The calculation of this process is complicated, and there may be multiple schemes that could meet the requirements.For cases requiring personalized surgical plans, this could be achieved by adjusting the optimal target feature values, a process called interactivity.The personalized parameter adjustment occurs in the 4) step of the aABC algorithm.Table 2 illustrates that the distribution of feasibility and effectiveness scores for the adjusted surgical protocol was superior to that for the primary output, indicating an improved effect of the surgical plan design.
It should be noted that cephalometric measurement is only one of the methods to evaluate dento-maxillofacial deformities, and using only cephalometric measurement to evaluate maxillofacial deformity will inevitably reduce diagnostic accuracy.However, the diagnostic accuracy rate in this study was acceptable.Except for mandibular underdevelopment, AUC was > 0.9 for all diagnostic problems.The AUC of mandibular underdevelopment was 0.881.The overall diagnostic accuracy for dentomaxillofacial deformities was over 90%.In the present study, the BR-XGBoost algorithm achieved the best overall performance.Our classification was more extensive and accurate compared to similar previous studies.
The surgical effect refers to the impact of the surgical plan on improving the patient's dento-maxillofacial deformity, including profile and symmetry improvements.The subjective effectiveness mean score was 8.8, which further increased to 9.08 after humancomputer interaction.Overall scores ranged from 8 to 10, with a median of 9, indicating that the AI-assisted surgical plan could successfully enhance facial appearance and achieve a high score deemed acceptable by experts.
Surgical feasibility refers to whether the extent of bone segment movement surpasses the surgical threshold, post-implementation jaw stability, and the risk of recurrence (e.g., the presence of bone contact).Subjective feasibility evaluation results indicated an initial average score of 8.34, which increased to 8.92 following humancomputer interaction.Overall scores ranged from 7 to 10, with a median of 9, highlighting the clinical feasibility of AI-assisted surgical plans.Subjective evaluations revealed generally acceptable surgical outcomes.Prior to scoring, a consensus recommendation was reached by the three participating surgeons involved in the study.A higher ICC value, closer to 1, indicated a stronger level of agreement.The consistency test conducted on the three study demonstrated a high level of agreement in their evaluations.
A comparison of the positional deviation of the bone segments in the two distinct plans demonstrated that the average discrepancy in both maxilla and mandible was approximately 2 mm, with most regions exhibiting a deviation of < 4 mm.This indicates that there existed a There were no statistically significant differences between the two groups certain level of disparity in jaw bone positioning between the two plans.However, considering the intricate threedimensional morphology of the bone segment itself, no significant divergence was observed in subjective assessment scores for efficacy within either group.Hence, it can be inferred that the maxillary and mandibular movement plans fell within an acceptable range.A comparison of the measurement items between the two schemes revealed no statistically significant differences between the two groups of data for each item.Additionally, the changes in the measurement indices closely aligned with the normal range, indicating an improvement in jaw morphology and Fig. 6 Representative case A. The decision support system diagnosed the patient with maxillary and mandibular deviation and designed a surgical plan.Three-dimensional visualization of the surgical effect was processed using Freeform plus software (3DSystems Inc., Rock Hill, USA).Lefort I osteotomy, BSSRO, and genioplasty were performed through the virtual surgical method.After completion of the osteotomy, the bone block was moved according to the AI-generated surgical plan.The patient's dento-maxillofacial deformity improved significantly and the cephalometric measurements returned to normal facial shape for the patients.As this experiment included individuals with both skeletal Class II and skeletal Class III malocclusions, there were no significant differences between the mean preoperative and postoperative values.However, compared to the preoperative cephalometric results, a significant reduction in the standard deviation was observed, suggesting a greater number of data points trending toward the normal range.
The proposed system also had some limitations.The prediction of post-orthognathic surgery soft tissue Fig. 7 Representative case B. The patient was diagnosed with maxillary underdevelopment and mandibular overdevelopment by the decision support system.Three rounds of interaction between the surgeons and the system resulted in the final operation plan.Three-dimensional visualization of the surgical effect was processed using Freeform plus software (3DSystems Inc., Rock Hill, USA).Lefort I osteotomy, BSSRO, and genioplasty were performed through the virtual surgical method.After completion of the osteotomy, the bone block was moved according to the AI-generated surgical plan.The patient's dento-maxillofacial deformity improved significantly and the cephalometric measurements returned to normal changes has long been a clinical challenge awaiting resolution.An intricate interplay of numerous factors significantly influences these changes.This study focused exclusively on hard tissue alterations and did not encompass the intricacies associated with soft tissue considerations [27].In the future, it may become possible to take these factors into consideration and plan orthognathicorthodontic therapy at the time of initial diagnosis [28].
In this study, jaw development was categorized into three conditions: overdevelopment, underdevelopment, and normal.However, this classification method lacks clinical information regarding the specific direction of jaw growth, making it difficult to determine whether a patient has a sagittal or vertical developmental issue, or their combination.The limited number of machine learning samples restricted this classification approach.A total of 574 patients were included in the study, with certain clinical patient types underrepresented due to data bias.To address this limitation in future studies, it is recommended to supplement the dataset with a wider range of patient types.

Conclusions
This study developed an interactive decision support system capable of providing precise diagnoses for dentomaxillofacial deformities and offering personalized surgical plans.For this purpose, we employed BR-XGBoost and the aABC algorithm.The decision support system utilized a combination of objective evaluation indices and subjective assessment scores to accurately diagnose dento-maxillofacial deformities and generate orthognathic surgery plans aligning with clinical requirements.Moreover, the surgical plan could be customized based on expert preferences to cater to individual needs.Future validation of this decision support system requires the involvement of diverse clinical centers and a wider scope, subsequently extending its application to subordinate hospitals for providing decision support to young doctors.Furthermore, there is a need for further research on AI-assisted soft tissue prediction in orthognathic surgery, which could be combined with real-time soft tissue change predictions to improve the design of jaw surgery plans.

1 )
Accuracy: Accuracy = T P +T N T P +F P +T N+F N 2) Precision: P recision = T P T P +F P 3) Recall: Recall = T P T P +F N 4) Specificity: Specif icity = T N F P +T N 5) F1score: F 1score = 2 • P recision•Recall P recision+Recall 6) AUC: Area under the curve.

Fig. 1
Fig.1Pipeline of the proposed decision support system for orthognathic diagnosis and treatment planning

Table 1
Demographic and clinical characteristics of patients in this study

Table 3
Deviations between the surgical design model plans and the actual surgeries

Table 4
Differences in cephalometric measurements between different plans after simulation