DOI was introduced into the T classification of oral cancers, including tongue cancer, in the UICC/AJCC 8th edition, published in 2017. DOI is defined histopathologically as the vertical distance from the virtual plane connecting the basement membrane of the normal mucosa adjacent to the tumor to the deepest part of the tumor, and many papers have reported it as a factor associated with survival prognosis, including the risk of metastasis to the cervical lymph nodes [2,3,4,5,6]. If DOI could be measured during preoperative diagnostic imaging, it would be useful for determining treatment policy and predicting prognosis, but no specific method of DOI measurement in diagnostic imaging has been shown. Therefore, in this study, the accuracy of diagnostic imaging was estimated by comparing the DOI measured by CT, MRI, and US with the DOI measured histopathologically in the same target group.
Few studies have evaluated primary lesions of tongue cancer by CT [14, 15, 31]. Although CT has better spatial resolution than MRI, metal artifacts often make lesions difficult to evaluate, and because the oral cavity frequently contains metal restorations, the primary lesion cannot be assessed in many cases. Of the 38 patients examined by CT in this study, the lesion could not be clearly identified in 11. Although 27 patients were evaluable, these may have included patients in whom metal artifacts prevented appropriate imaging of the cross section, so that the actual depth of the tumor was not captured. This was likely one of the factors that made the 95% limits of agreement in the Bland–Altman analysis wider than those of MRI. Previous reports have suggested that CT is accurate enough to assess the primary lesion of tongue cancer [14, 15, 31], but the effect of metal artifacts should also be taken into consideration.
Since MRI is less susceptible to metal artifacts, it is expected to be superior to CT for DOI measurement. The reported accuracy varies, with some reports suggesting that the DOI on MRI was almost consistent with the histopathological DOI or overestimated it by about 2 mm [21, 24, 28]. In this study, an overestimation of about 2–3 mm was observed overall. Such overestimation has been attributed to peritumoral edema and reactive inflammation [16, 25, 32]. These histopathological changes in the surrounding area may also have affected the present results, indicating that not only the tumor per se is detected. Such effects seem to occur similarly in CT. However, there may not be enough data to discuss the influence of partial volume effects arising from differences from MRI in slice thickness and voxel size.
For US, Bland–Altman analysis showed that both the bias and the width of the 95% limits of agreement were smaller than those of CT and MRI, suggesting that the extent of tumor invasion was captured almost accurately. For US, a small hockey stick-type intraoperative transducer was used, and intraoral scanning was performed with a polymer acoustic coupling material placed between the tumor and the transducer. Although the examination is restricted by the narrow oral cavity and in some cases involves considerable contact pain, the tumor can to some extent be measured by applying the transducer from any direction. In addition, as with CT and MRI, US seems to have been affected by edema and reactive inflammation of the peritumoral tissue. However, post-contrast CT and post-contrast MRI depict the tissue enhanced by a contrast agent, whereas US seems to have detected a range closer to the tumor body as a hypoechoic lesion, based on the difference in acoustic impedance from the surrounding tissue. Here, CT and MRI showed an overestimation of 2–3 mm, while US showed an overestimation of only 0.2 mm and was almost consistent with the histopathological DOI. It is therefore possible that US is not affected by edema and inflammation of the peritumoral tissue as much as CT and MRI are. However, in some cases the examination is not possible: when the patient has limited mouth opening, when the cancer has developed near the root of the tongue where the transducer does not reach, or when contact pain is marked. In this investigation, most tumors (43 of 48) were located in the lateral portion of the tongue, so intraoral scanning was performed without marked difficulty; nevertheless, other modalities should be prioritized in such situations.
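The bias and 95% limits of agreement discussed above can be sketched as follows. This is an illustrative sketch and not the analysis code used in this study: the function name and the example measurements are hypothetical, and the limits are assumed to follow the conventional Bland–Altman definition (mean difference ± 1.96 × the sample standard deviation of the paired differences).

```python
from statistics import mean, stdev


def bland_altman(imaging_mm, histo_mm):
    """Return (bias, lower_loa, upper_loa) for paired DOI measurements in mm.

    Differences are taken as imaging minus histopathology, so a positive
    bias means the imaging modality overestimates the histopathological DOI.
    """
    if len(imaging_mm) != len(histo_mm):
        raise ValueError("measurements must be paired")
    if len(imaging_mm) < 2:
        raise ValueError("at least two pairs are needed for a standard deviation")

    diffs = [img - his for img, his in zip(imaging_mm, histo_mm)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical values for illustration only (not data from this study).
bias, lower, upper = bland_altman([5.0, 7.0, 9.0], [4.0, 5.0, 6.0])
print(f"bias = {bias:.2f} mm, 95% limits of agreement = {lower:.2f} to {upper:.2f} mm")
```

A modality with a bias near zero and a narrow interval between the two limits agrees more closely with the histopathological DOI, which is the sense in which US is compared with CT and MRI here.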
There have also been reports of decreased accuracy in cases with a histopathological DOI greater than 5 mm [11, 12], and caution is needed when using US in advanced cancer cases. In this study, T3 and T4 cases were excluded, and as there were only eight cases in which the histopathological DOI exceeded 5 mm, such a tendency was rarely observed. US involves no exposure to ionizing radiation, can be used for patients who cannot undergo contrast imaging because of renal disorders, allergies, etc., and is non-invasive and inexpensive compared with CT and MRI. In addition, there are reports stating that lesions with a DOI of 5 mm or less could hardly be identified by CT and MRI [23, 33]. US should be used proactively for lesions with a shallow depth of invasion, and it is reasonable to make US the first choice for preoperative diagnostic imaging of early tongue cancer. However, US has some disadvantages, such as the lack of objectivity in images and the dependence of accuracy on the examiner; these can be mitigated to some extent by rationalizing the evaluation criteria and establishing an organizational training system.
As mentioned above, overestimation of lesions was observed in all modalities in this study. Previous reports have described edema and reactive inflammation of peritumoral tissue as causes of overestimation in MRI, but in this study the extent of inflammatory cell infiltration in peritumoral tissue observed histopathologically was very limited and did not reach 2–3 mm. In other words, edematous changes associated with circulatory disturbance in surrounding tissues, which are not clearly reflected in histopathology, were thought to account for most of the overestimation in MRI. For CT and MRI, an intravenously administered contrast agent reaches its maximum concentration in the circulating blood in 1–2 min; first, the dilated vascular lumens and the increased vascular bed are enhanced [34,35,36], and then the contrast agent is transferred to the tumor tissue and other tissues according to osmotic pressure. Capillary-rich tumor tissue is enhanced relatively early, whereas surrounding tissue with edematous changes has fewer capillaries than tumor tissue, so the contrast agent migrates into it gradually [34, 35]. Thus, it should be noted that the areas imaged by CT and MRI include not only the tumor body but also the edematous portions of the tumor body itself and of the surrounding tissues. The range captured by each modality is illustrated in Fig. 10. There have been reports that dynamic MRI captures DOI and thickness more accurately than post-contrast T1-WI [22, 37], which seems to be because it minimizes the delineation of edema and reactive inflammation of peritumoral tissue. Dynamic imaging is also performed at our institution, but it varies depending on the imaging equipment and conditions. For the same reason, although DWI (diffusion-weighted imaging) is utilized in the head and neck region, it was not applied in this study.
In this study, post-contrast T1-WI and T2-WI (both fat-suppressed), which are more commonly used and can be evaluated stably, were applied. When these were compared, the tendency toward overestimation was slightly greater for T2-WI than for T1-WI. Previous reports have also pointed out the tendency of T2-WI toward overestimation, and the same was true in this study. In principle, T2-WI is more sensitive to edematous changes in peritumoral tissue than post-contrast T1-WI, which was likely linked to this tendency. Since the slice thickness of MRI in this study was 4–5 mm, the measurements may have been affected to a considerable degree by the partial volume effect. Studies using a slice thickness of 1–2 mm have been reported [19, 24, 31]; however, as in other reports, an overestimation of about 2 mm was observed, so the contribution of the partial volume effect to the overestimation seems to be less significant. Nevertheless, it has been reported that with MRI at a slice thickness of 4–5 mm, lesions with a DOI of 5 mm or less could hardly be identified [25, 33]. It may therefore be advantageous to reduce the slice thickness for smaller lesions. With regard to CT overestimation, there is little difference in pharmacokinetics between nonionic iodine contrast agents and the gadolinium preparations used in MRI, although the principles of contrast enhancement are different. Post-contrast T1-WI is acquired a few minutes after administration of the contrast agent, whereas CT is acquired 70–90 s after intravenous administration. Because this timing precedes migration of the contrast agent into the surrounding tissue with edematous change, the extent of overestimation would be expected to be smaller. However, it was of the same extent as that of post-contrast T1-WI in MRI. This may be due to the insufficient tissue resolution of CT and needs to be examined in detail in the future.
When axial and coronal images were compared in CT and MRI, the Bland–Altman analysis revealed that the 95% limits of agreement were narrower for the coronal sections for both post-contrast T1-WI and T2-WI in MRI. This may be because the excised specimen is sectioned in a plane close to the coronal section. CT, on the other hand, showed no significant difference between the axial and coronal sections (MPR images). Previous CT studies have shown better results for lingual lesions when axial sections were compared with coronal and sagittal sections. This may also be influenced by the limitations of CT tissue resolution.
Two devices were used for each of CT and MRI. For CT, 25 cases were examined with the Ingenuity Elite (64-row multi-detector CT) and two cases with the Aquilion ONE (320-row area detector CT). For MRI, 12 cases were examined at 1.5 T and four cases at 3.0 T. The data are insufficient to assess the influence of the difference between the two CT scanners or between the two MRI machines.
For histopathological tissues, it is known that the excised tissue shrinks during specimen preparation. The reported extent of shrinkage varies, but it is said to be between 20.2 and 34.7% [9, 38,39,40]. Because this study uses histopathological specimens as the reference, the reference values are presumably about 20–30% smaller than the DOI in vivo. On the other hand, it takes about 2–3 weeks on average from the examination by each modality to surgery, and it is very likely that the tumor grows to a considerable degree during this time. In this study, the US results were very accurate, while the CT and MRI results were overestimated by about 2–3 mm. This seems to have resulted from the interplay of such complex factors; however, this remains speculative. The accuracy of US shown in this study may actually be underestimated to some extent. To eliminate these effects and achieve greater accuracy, it would be necessary to repeat the imaging examination immediately before surgery. However, this is not practical, because such a repeat examination would not serve stage classification, which informs the treatment plan.
In the US procedure, the patient's tongue is slightly pulled by the examiner, and the transducer applied to the tumor is lightly pressed for scanning. Although polymer acoustic coupling materials act as buffers, tumors may be compressed and deformed and thus measured as thinner than they actually are. Because cancer tends to be more resistant to deformation than the surrounding muscle tissue, this effect seems limited; nevertheless, the possibility of underestimation cannot be ruled out.
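The arithmetic behind the shrinkage argument can be sketched as follows. This is a rough illustration under stated assumptions, not a correction applied in this study: the function name is hypothetical, shrinkage is assumed to be a uniform linear fraction s of the in vivo dimension (so that the in vivo DOI equals the histopathological DOI divided by 1 − s), and the default bounds are simply the reported range of 20.2–34.7%.

```python
def in_vivo_doi_range(histo_doi_mm, shrink_low=0.202, shrink_high=0.347):
    """Estimate the in vivo DOI range (mm) implied by specimen shrinkage.

    Assumes histo_doi = in_vivo_doi * (1 - s) for a shrinkage fraction s,
    which is a simplifying assumption made for illustration.
    Returns (estimate at shrink_low, estimate at shrink_high).
    """
    if histo_doi_mm < 0:
        raise ValueError("DOI must be non-negative")
    if not 0 <= shrink_low <= shrink_high < 1:
        raise ValueError("shrinkage fractions must satisfy 0 <= low <= high < 1")

    return histo_doi_mm / (1 - shrink_low), histo_doi_mm / (1 - shrink_high)


# Hypothetical histopathological DOI of 4.0 mm, for illustration only.
low, high = in_vivo_doi_range(4.0)
print(f"estimated in vivo DOI: {low:.2f} to {high:.2f} mm")
```

Under these assumptions a histopathological DOI of a few millimeters corresponds to an in vivo DOI that is larger by an amount comparable to the imaging overestimation discussed above, which is why shrinkage and interval tumor growth are difficult to separate from measurement error.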
In this study, the deepest part of the tumor in each imaging modality was visually evaluated and determined by consensus. In US, the deepest part can be evaluated in detail as an invasion front; reports comparing the US morphology of the invasion front with the histopathological mode of infiltration have stated that the risk of lymph node metastasis is high if the invasion front appears irregular [41, 42]. However, it is often difficult to determine the location of the front when the contour of the deepest part is irregular. In US, the surrounding muscle tissue is hyperechoic because it is rich in fat and connective tissue, whereas cancer is in principle depicted as hypoechoic because it produces relatively little reflection; its outer edge, however, is accompanied by slightly hyperechoic peripheral muscle tissue and a poorly defined region. In this study, we determined the range of the tumor, including the region of the outer margin, based on our previous report. Several reports have used blood flow imaging and elastography [44,45,46], but none has discussed the location of the deepest part of the tumor.
In the UICC/AJCC 8th edition, DOI is histopathologically defined as the vertical distance from the horizontal virtual plane (line) connecting the basement membrane of the normal mucosa adjacent to the tumor to the deepest part of the tumor. In this study, in accordance with the UICC/AJCC 8th edition in principle, DOI was measured in CT and MRI as the vertical distance from the virtual line connecting the boundary between the tumor and the normal mucosa to the deepest part of the tumor, and in US as the vertical distance from the virtual line connecting the basal part of the normal mucosa to the deepest part of the tumor. However, in CT, MRI, and the histopathological specimens, for patients in whom a straight virtual line was considered inappropriate given the original curvature of the tongue, a virtual curve was set with reference to the curvature of the adjacent normal mucosa and the original form of the tongue on the opposite side, and measurements were performed from that curve. The virtual line is essentially defined as a horizontal straight line, but since the tongue mucosa is arched, a measurement made strictly as defined may underestimate the DOI. Some reports have also used a virtual curve corresponding to the external shape of the tongue as the normal mucosal surface in DOI measurements [2, 14, 17, 25, 47]. In addition, since the approximate shape of the tongue changes depending on the presence or absence of residual teeth, defining the virtual line may be difficult. The type of virtual line or curve to be set must be left to the discretion of the radiologist and pathologist. However, it is essential to specify in the diagnostic report which kind of line or curve was set. Moreover, the validity of such line setting needs to be examined further in the future.
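The geometric difference between a straight virtual line and a virtual curve can be sketched as follows. This is a simplified two-dimensional illustration and not the measurement procedure used in this study: the function names and coordinates are hypothetical, the curve is approximated by a polyline through points on the assumed mucosal surface, and DOI is taken as the shortest distance from the deepest tumor point to the reference.

```python
import math


def doi_from_straight_line(p1, p2, deepest):
    """Perpendicular distance (mm) from `deepest` to the line through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    x0, y0 = deepest
    length = math.hypot(x2 - x1, y2 - y1)
    if length == 0:
        raise ValueError("p1 and p2 must be distinct points")
    return abs((x2 - x1) * (y1 - y0) - (x1 - x0) * (y2 - y1)) / length


def _point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_sq = dx * dx + dy * dy
    if seg_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection parameter so the foot stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def doi_from_curve(curve_points, deepest):
    """Shortest distance (mm) from `deepest` to a polyline approximating the curve."""
    if len(curve_points) < 2:
        raise ValueError("at least two curve points are required")
    return min(
        _point_segment_distance(deepest, a, b)
        for a, b in zip(curve_points, curve_points[1:])
    )


# Hypothetical coordinates in mm: an arched mucosal surface over a tumor.
deepest_point = (5.0, -3.0)
straight = doi_from_straight_line((0.0, 0.0), (10.0, 0.0), deepest_point)
curved = doi_from_curve([(0.0, 0.0), (5.0, 1.0), (10.0, 0.0)], deepest_point)
print(f"straight line: {straight:.2f} mm, virtual curve: {curved:.2f} mm")
```

In this toy configuration the arched reference yields a larger DOI than the straight line joining the same two mucosal end points, which matches the concern that a strictly straight line may underestimate DOI on an arched tongue surface.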
In addition, the UICC/AJCC 8th edition removed invasion of the extrinsic tongue muscles from the T classification, but previous reports have indicated that the extrinsic tongue muscles lie at a relatively shallow depth from the normal mucosa. It has been reported that DOI can be judged to exceed 4 mm in cases of invasion of the styloglossus and hyoglossus muscles and that cervical lymph node metastasis is more frequent in cases of invasion of the paralingual space. In preoperative diagnostic imaging, not only DOI measurement but also detailed evaluation of the presence or absence of invasion of the extrinsic tongue muscles is necessary.
There are several limitations to this study. First, it is a retrospective study conducted with a limited number of cases at a single institution, and the period from examination to surgery was not constant. Moreover, changes in the tumors during that period were not evaluated or taken into account in the analysis. The study also does not take into account histopathological findings such as the degree of malignancy and the infiltration mode of the cancer, which may affect the identification of the deepest part of the tumor. The choice of modality for each patient was made by the oral surgeon who was the attending physician, which may have introduced bias.