The present study is the first to report an AI-based automated system for the assessment of Fishman’s SMI, which is widely used in dentistry, especially in orthodontics. In recent years, a growing number of studies have reported on the automation of skeletal maturation assessment using AI to improve clinical efficiency and reproducibility. The vast majority of the proposed systems are based on the TW3 method12,17,18,19, as it has several advantages over other methods. First, it takes into account the variability of skeletal maturation patterns across populations and ethnic groups. Unlike its predecessor, TW2, which was derived from a sample of British children20, TW3 was developed using additional data from multiple ethnic groups and populations7. Later studies have also shown its validity in various populations, including Korean children21,22,23. Furthermore, the TW3 method evaluates skeletal maturity comprehensively by assessing multiple bones, which leads to a more thorough analysis of skeletal development and improved reliability. On the other hand, a major drawback of TW3 is the complexity of the procedure and the time required to obtain results, a limitation that AI-based systems have addressed. Another limitation of TW3 is that it does not account for changes in the trend of maturational development over time, which could eventually render the method obsolete, much like its predecessor TW2. Therefore, periodic updates are necessary to maintain the validity and reliability of the TW method7.
Similarly, the GP method is susceptible to changes in the trends of maturational development, as well as to variations resulting from differences in ethnicity, regional factors, and environmental influences24,25,26. Unlike the Tanner-Whitehouse method, which has undergone revisions since its inception, the GP method has not been updated since 1959, when it was introduced5. In other words, it is based solely on the original reference hand-wrist radiographs of Caucasian children obtained over 80 years ago. According to the study by Mansouvar et al., the GP method is reliable for Caucasian and Hispanic children, but not for African American and Asian groups27. A study of a Korean sample also concluded that the rates of skeletal development provided by GP are not applicable to Korean children28.
In contrast to TW3 and GP, Fishman’s SMI provides a staging system for assessing maturation levels that does not rely on skeletal age4. Consequently, SMI is not subject to fluctuations in the trend of maturational development or to differences arising from factors such as ethnicity. In simpler terms, SMI allows an intuitive determination of an individual’s skeletal maturity level without requiring consideration of additional factors. When skeletal maturity is assessed for orthodontic purposes, the level of maturation in relation to chronological age is of little importance. As a result, obtaining skeletal age is typically unnecessary, in contrast to its importance in medical or forensic fields. Nevertheless, SMI, like other methods, has its limitations. Since a single skeletal indicator is assigned to each stage, variations in the sequence in which the skeletal maturity indicators appear, or indicators that are not clearly expressed, can lead to misstaging.
Clinicians often encounter hand-wrist radiographs whose osseous maturational characteristics are too ambiguous to be assigned to a specific SMI stage29. When the sequence of skeletal maturation in an individual does not follow Fishman’s descriptions, the SMI stage may be over- or underestimated depending on the observer, and the reproducibility and reliability of SMI may be affected. A previous study reported relatively low prediction accuracy for SMI stages 5 and 6 and considered an insufficient amount of data and large inter-observer variability as possible reasons15. Similarly, in the present study, the prediction accuracy for SMI stage 5 was lower than for the other SMI stages. This may be related to the fact that no radiograph in the dataset used for the accuracy test was predicted as SMI 5. However, the prediction accuracy for SMI 5 was also only 0.19 in the primary validation, which was carried out with a larger dataset. Taken together, these findings suggest that SMI stages that are more likely to deviate from the proposed sequence are prone to higher inter-observer variability and lower prediction accuracy. The system introduced in the present study is a hybrid approach that evaluates maturational level by integrating the GP, TW3, and SMI methods. This approach compensates for the known limitations of each system and enhances the accuracy and reproducibility of SMI predictions.
Recent advances in technology have led to a notable surge in the integration of AI into dental practice for tasks such as diagnosis, radiographic analysis, and treatment planning30. Consequently, there have been attempts to streamline and accelerate skeletal maturity assessment through the application of AI31, as manual assessment has been criticized for its tediousness and its intra- and interobserver variability32,33. Previous studies have demonstrated clinically reliable performance of deep learning-based automated systems for assessing skeletal maturity11,17,18,34,35,36. However, the majority of the models introduced are based on the TW3 or GP method, and few automated systems have been proposed for the assessment of skeletal maturity using SMI. Moreover, previous studies that investigated SMI in relation to AI focused instead on predicting SMI from radiographic images of the cervical vertebrae15,37.
The performance of various automated skeletal maturation assessment systems has been evaluated in previous studies. According to these studies, the AI-predicted skeletal age was not significantly different from the skeletal age assessed by experts11,19. The MAE reported in the literature ranges from 0.39 to 2.41 years depending on the study17,34,35,36,38,39,40,41,42. Notably, the models proposed during the last decade17,34,35,38,42 perform better, with smaller MAE, than the models introduced earlier36,39,40,41. Since the MAE computed in the present study refers to SMI stage rather than skeletal age, it cannot be directly compared with the results of previous studies. According to the data provided in Fishman’s study, the mean interval between SMI stages is 0.61 years for females and 0.64 years for males4. Based on this information, an MAE of 0.27 SMI stages can be converted to approximately 0.169 years. Although this conversion may not be accurate, as it does not consider the differences in the size of the interval between the stages, the results suggest that the SMI-modified automated skeletal maturation system shows satisfactory performance compared with previous systems.
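The stage-to-years conversion described above can be illustrated with a short calculation. This is a minimal sketch, not the procedure used in the study: it assumes the MAE in stage units is simply multiplied by a mean inter-stage interval, and that the pooled interval is the unweighted average of the female and male values (0.625 years), an assumption that reproduces the reported figure of approximately 0.169 years. The function and variable names are hypothetical.

```python
# Mean interval between adjacent SMI stages in years, as quoted in the text
# from Fishman's data.
MEAN_INTERVAL_YEARS = {"female": 0.61, "male": 0.64}


def stage_mae_to_years(mae_stages: float, interval_years: float) -> float:
    """Convert an MAE given in SMI stages to years.

    Assumes every stage is separated by the same mean interval, which is
    the simplification the text itself flags as a source of inaccuracy.
    """
    return mae_stages * interval_years


mae_stages = 0.27  # MAE in SMI stages, as reported in the text

# Sex-specific conversions.
per_sex = {
    sex: stage_mae_to_years(mae_stages, interval)
    for sex, interval in MEAN_INTERVAL_YEARS.items()
}

# Pooled conversion; averaging the two intervals without weighting is an
# assumption made for this illustration.
pooled_interval = sum(MEAN_INTERVAL_YEARS.values()) / len(MEAN_INTERVAL_YEARS)
pooled = stage_mae_to_years(mae_stages, pooled_interval)

print(per_sex)
print(pooled)
```

Under these assumptions, the sex-specific values are about 0.165 years (female) and 0.173 years (male), and the pooled value is about 0.169 years. All three fall below the lower end of the 0.39 to 2.41 year range cited for skeletal-age models, with the caveat that unequal stage intervals are ignored.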
The present study has several limitations. Although the size of the study population as a whole was sufficient, the number of observations was relatively small for some of the SMI stages, which may have affected the reliability of the accuracy measured for these subgroups. Furthermore, the data were collected retrospectively. While the data used for AI training were collected from different institutions and included different ethnicities, the datasets used for the primary validation and final evaluation consisted of hand-wrist radiographs from a single ethnic group only. Therefore, the results of the present study may not reflect possible differences between ethnicities or populations. To validate the results of the present study and to further improve the accuracy of the proposed system, future studies with larger datasets from multiple institutions and populations are required.