In orthodontics, artificial intelligence, including deep learning, can be applied to diagnosis and treatment planning for orthodontic extractions or orthognathic surgery21,22,23, automated cephalometric landmarking24,25, diagnosis of impaction26, determination of skeletal maturity for growth stage evaluation27, and automatic segmentation and setup of digital models14,19,28. Reducing the time and effort spent on simple tasks in diagnosis and appliance fabrication allows users to focus more on making decisions.
In this study, intra-rater reliability was assessed using the ICC for the MD width and CCH, which are continuous variables, and using Cohen’s kappa for the success and failure of segmentation, which are nominal variables. Both indicated very high evaluation reproducibility: ICC and Cohen’s kappa were 0.987–0.997 and 0.885–1.000, respectively. In addition, this study showed statistically significant differences in segmentation success rate, time, and size of segmented teeth among three different orthodontic CAD/CAM programs, thereby rejecting the null hypothesis.
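As an illustration of the agreement statistic used for the nominal success/failure ratings, the following is a minimal Python sketch of Cohen’s kappa. The function name `cohen_kappa` and the ratings are hypothetical and are not data from this study; the ICC calculation is not shown.

```python
from collections import Counter

def cohen_kappa(first, second):
    """Cohen's kappa for two equally long sequences of nominal ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(first)
    # Proportion of items on which the two rating sessions agree.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal label frequencies.
    count_a, count_b = Counter(first), Counter(second)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sessions used one identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical repeated assessments of ten teeth (illustrative only).
session_1 = ["success"] * 8 + ["failure"] * 2
session_2 = ["success"] * 7 + ["failure"] * 3
kappa = cohen_kappa(session_1, session_2)
```

In this made-up example the two sessions disagree on one tooth out of ten, so kappa falls below 1 even though the raw agreement is 90%, because kappa discounts the agreement expected by chance.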
We designed the DGCNN-based segmentation model in two stages to prevent degradation of the segmentation performance caused by the difference in the number of vertices between the teeth and the gingiva. In the first stage, the digital dental model was segmented into gingiva and dentition using the two-class DGCNN model. In the second stage, the number of gingiva vertices segmented in the first stage was adjusted so as not to exceed twice the number of individual tooth vertices, and the digital dental model was then segmented into individual teeth and gingiva using the seventeen-class DGCNN model.
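The vertex adjustment between the two stages can be sketched as follows. This is only a minimal illustration: how the gingiva vertices were reduced is not described here, so uniform random subsampling, the function name `cap_gingiva_vertices`, and all counts are assumptions.

```python
import random

def cap_gingiva_vertices(gingiva_ids, tooth_vertex_count, ratio=2.0, seed=0):
    """Keep at most ratio * tooth_vertex_count gingiva vertex ids.

    Assumption: excess vertices are dropped by uniform random sampling;
    the actual reduction strategy of the model is not stated.
    """
    limit = int(ratio * tooth_vertex_count)
    ids = list(gingiva_ids)
    if len(ids) <= limit:
        return ids  # already within the cap, nothing to remove
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return sorted(rng.sample(ids, limit))

# Hypothetical counts: 10,000 gingiva vertices from stage one,
# 1,500 tooth vertices as the reference count.
gingiva_ids = list(range(10_000))
kept = cap_gingiva_vertices(gingiva_ids, tooth_vertex_count=1_500)
```

With these illustrative numbers the gingiva class is cut to 3,000 vertices, so it can no longer outnumber the reference tooth count by more than two to one when the seventeen-class model is applied.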
DGCNN using point clouds is advantageous for semantic segmentation and classification of a digital dental model, but it suffers from poor resolution at the tooth margins. We attempted to obtain a clear tooth margin by supplementary use of curvature-based mesh segmentation with a skeleton/pruning algorithm. However, curvature-based segmentation alone is not always reliable: in some cases, a closed loop may not be formed because of unclear curvature in the scanned data, or a closed loop may be formed in the wrong area, such as the tooth groove. Therefore, the two approaches were combined, and the segmentation and classification by DGCNN were optimized by supplementing them with the curvature-based mesh segmentation.
To compare and evaluate the accuracy and efficiency of the DGCNN-based segmentation model, we used two commercially available software packages, OrthoAnalyzer and Autolign, which were selected for three reasons. First, both are popular with orthodontists. Second, we considered their functionality for tooth segmentation: OrthoAnalyzer requires precise MD points to be set for tooth segmentation, whereas Autolign requires only approximate MD points. Third, we considered their versatility. Some orthodontic software packages create closed working environments that prevent the export of segmented teeth. OrthoAnalyzer and Autolign can export segmented teeth as stereolithography files, which can be imported into Meshmixer and Geomagic Control X software for success/failure determination and tooth-size measurement.
This study reported the success and failure of digital dental model segmentation with a view to its clinical application. Because the purpose of tooth segmentation is diagnosis and the fabrication of orthodontic appliances, such as custom brackets, clear aligners, and indirect bonding trays, accurate tooth surface models are essential and defects cannot be tolerated. Considering the width and height of the bracket base and the undercut needed to obtain retention of a removable appliance, the cervical ± 25% line was set as the success baseline. Therefore, segmentation was judged successful when the cervical margin of the segmented tooth did not deviate beyond ± 25% of the cervical margin of the actual tooth and no defects were found in the occlusal or incisal edge of the segmented tooth.
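The two success criteria can be expressed as a simple check. The sketch below is illustrative only: it assumes that the ± 25% band is taken relative to the reference clinical crown height, which is an interpretation and is not stated explicitly here, and the function and parameter names are hypothetical.

```python
def segmentation_succeeded(margin_deviation_mm, reference_crown_height_mm,
                           has_edge_defect, tolerance=0.25):
    """Apply both success criteria to one segmented tooth.

    Assumption: the cervical margin deviation is compared with
    tolerance * reference crown height (the +/- 25% band).
    """
    if reference_crown_height_mm <= 0:
        raise ValueError("reference crown height must be positive")
    # Criterion 1: cervical margin stays inside the +/- 25% band.
    within_band = abs(margin_deviation_mm) <= tolerance * reference_crown_height_mm
    # Criterion 2: no defect in the occlusal or incisal edge.
    return within_band and not has_edge_defect

# Hypothetical teeth with an 8 mm reference crown height (band = 2 mm).
ok = segmentation_succeeded(1.0, 8.0, has_edge_defect=False)
margin_fail = segmentation_succeeded(-2.5, 8.0, has_edge_defect=False)
defect_fail = segmentation_succeeded(0.5, 8.0, has_edge_defect=True)
```

Both conditions must hold: a tooth with an accurate margin still fails if its occlusal or incisal edge is defective, and vice versa.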
A high segmentation success rate increases user convenience by reducing the time and effort required to modify segmentation splines. This study showed high segmentation success rates (Table 3) in all three groups. However, the success rates of LS and AS (97.14% and 97.26%) were significantly higher than the success rate of DS (87.86%). These findings indicate that success rates differ among segmentation methods. In contrast to segmentation of general objects, tooth segmentation has to work on the complex intersection of concave regions (e.g. tooth-gingival margin, tooth groove, and interproximal area), for which traditional geometry-based segmentation is typically used. Various tooth segmentation algorithms have been introduced to overcome the limitations of this approach. However, region-based and feature curve methods still have limitations, such as difficulty obtaining high-quality segmentation results and reduced efficiency due to complex implementation procedures. Recent deep learning methods require little manual intervention and offer low algorithm complexity and high accuracy. In this study, the AS method based on DGCNNs exhibited satisfactory success rates.
For the experimental groups, the MD width was taken from the results provided by each software program. Consequently, the results varied with the measurement and calculation methods. OrthoAnalyzer, used in the LS, set the mesial and distal points of individual teeth before tooth segmentation and calculated the MD width using the virtual plane formed by the screen view at the time the MD points were set. In contrast, Autolign and LaonSetup, used in the DS and the AS, respectively, calculated the MD width on teeth that were normalised after tooth segmentation. Even if the software user sets the MD points precisely on an unsegmented dental model, the points cannot be placed in the occluded area of the interproximal region. Owing to the location of the measurement points, the MD width measured before tooth segmentation is likely to be more conservative than that measured on the segmented teeth. In this study, the MD width of LS also showed an error of −0.35 mm and −0.23 mm when compared with DS and AS, respectively.
The MD widths of upper molars were recorded as larger in the DS and AS groups than in the LS group. This may be because of the characteristic shape of the upper molars and the method of MD width measurement. An upper molar often forms a parallelogram in occlusal view, in contrast to other teeth, whose height of contour on the mesial and distal surfaces is clear. When an upper molar has the shape of a parallelogram, the heights of contour on the mesial and distal surfaces differ considerably in bucco-lingual position. If these morphological features are not considered, measuring the MD width of a normalised upper molar can introduce errors that lead to inaccurately large sizes. Therefore, the MD width measurements of upper molars require corrections.
A limitation of digital models is that measuring accurate MD width is difficult because of the presence of occlusions in the interproximal area. With a plaster model, the adjacent surface is reproduced naturally when the teeth are sectioned, but in a digital model with only surface information, the occluded regions in contact with adjacent teeth remain empty after segmentation. To solve this problem, Kim et al.14 proposed an image reconstruction method for an adjacent occlusion using a generative adversarial network and obtained an average improvement of 0.004 mm compared with the conventional method.
CCH can affect vertical bracket positioning, and its accuracy therefore requires evaluation. Compared with the DS and AS, the LS had fewer errors in the whole tooth group, and the DS and AS produced shorter CCHs than the reference group (REF). However, according to the CCH of each tooth group (Table 5), the maximum average error was −0.21 mm, which is not likely to cause problems in diagnosis and appliance fabrication and is considered clinically acceptable.
In this study, the same digital dental models were measured repeatedly with three different segmentation methods, so the measurements were correlated within subjects rather than independent. In addition, segmentation failure resulted in missing values because the MD width and CCH could not be measured. Therefore, we used GLMM as a statistical method to analyse the main effects and the first-order interactions of the MD width and CCH errors. The segmentation success rate of the DS was lower than those of the other groups, but the mean error according to the segmentation method was the lowest in the DS. This is because only successfully segmented teeth were evaluated and the missing values caused by segmentation failure were excluded. In addition, the mean errors between groups in the post hoc tests were within 0.12 mm and thus clinically acceptable.
The mean segmentation time of the AS was 57.73 s, which was shorter than those of the DS and the LS, with means of 150.73 and 424.17 s, respectively, presenting significant differences in efficiency. This was attributed to the differences in the segmentation processes; in all three experimental groups, the digital model was orientated to the coordinate system, but the subsequent process differed for each group. In the LS, which required precise marking of the mesial and distal points of all teeth, the segmentation time was the longest because the axes of each tooth also had to be specified. The DS likewise required marking of the mesial and distal points of all teeth but did not require precise marking, so the time required for point designation was short. In the AS, segmentation and classification were performed without manual intervention after orientation; thus, it took the shortest time for segmentation. Because convenience increases as the need for manual intervention decreases, it is important to automate segmentation and to reduce the time and effort required for correction by decreasing the segmentation failure rate.
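The relative efficiency implied by these mean times can be checked with simple arithmetic. The sketch below uses only the three reported means; the variable and function names are illustrative.

```python
# Mean segmentation times in seconds, as reported in this study.
mean_time_s = {"AS": 57.73, "DS": 150.73, "LS": 424.17}

def time_ratio(slower, faster, times=mean_time_s):
    """Return how many times longer `slower` took than `faster`."""
    return times[slower] / times[faster]

ds_vs_as = time_ratio("DS", "AS")  # about 2.6
ls_vs_as = time_ratio("LS", "AS")  # about 7.3
ls_vs_ds = time_ratio("LS", "DS")  # about 2.8
```

On these means, the DS took roughly 2.6 times and the LS roughly 7.3 times as long as the AS.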
A limitation of this study is that we used digital models of permanent dentition in good condition, without tooth or gingival defects. Therefore, it is not possible to determine the tooth segmentation ability in cases such as missing teeth, severe wear, dental caries, partial eruption, and third molars. In subsequent studies, it will be necessary to use digital dental models of various conditions.
Because the OrthoAnalyzer software used in LS was also used to obtain the reference data, bias may have occurred. To reduce the potential bias, the splines of all segmented teeth in the REF method were corrected. Moreover, the MD width of REF was measured using Geomagic Control X software. In the experimental groups (i.e., LS, DS, and AS), no correction was made after segmentation, and the raw MD width data provided by the software were used. Therefore, depending on the segmentation accuracy and the method of calculating MD width, values may differ from REF.
This study used different automatic tooth segmentation software in different groups. Because the three software programs were not developed and distributed simultaneously, their performances may vary depending on the software version. Moreover, the software continues to be updated to new versions; thus, improvements in accuracy, convenience, and speed of segmentation and classification can be expected. In addition, because the CAD/CAM software for orthodontics used in this study was distributed for commercial use, the detailed algorithms were not disclosed, making a direct comparison of the segmentation methods impossible. Therefore, this study compared the software from the point of view of a user, focusing on the use of the software and the results of tooth segmentation.