This study used a novel approach to assessing the diagnostic accuracy of digital photographs by using six independent examiners, including a lay person. The aim was to demonstrate the versatility and validity of the photographic method for use by both dental health professionals and non-dental professionals; therefore, no formal training on assessing caries on digital images was given. If lay examiners could be used in mass screening programmes, this would greatly reduce the resource implications of large-scale and national studies. A further advantage of digital photographs and remote assessment is that true blinding in research trials of dental interventions could be achieved. In addition, digital photographs could be accessed remotely, and a digital archive could be used to track population changes over time, thus facilitating longitudinal studies, which are currently a rarity due to the expense of repeated examination by professionals. This study also offers insights and solutions to some of the issues found in previous studies of digital examinations relating to time and image quality.7,9
Probably the most important finding in this study was the set of lay photographic assessment scores. No statistically significant difference was found between lay and GS scores for dt (P = 0.42; 95% CI [–0.5 to 1.2]) or dmft (P = 0.21; 95% CI [–0.4 to 1.6]). The lay examiner showed similar, if not better, inter-examiner reliability in comparison with the other digital image examiners. This clearly indicates that it is feasible to recruit non-dental professionals to carry out digital epidemiology for the oral health surveillance of children.
Oral health surveys are resource intensive. Sensitivity to workload is needed when scheduling, as inaccuracies and inconsistencies are more likely when the examiner is fatigued.1 A single observer is used for each region and, although every care is taken to ensure calibration of each examiner, digital epidemiology would reduce opportunity costs by allowing multiple examiners to assess the images. The opportunity costs of re-examining multiple children for intra-rater reliability would also be reduced.
Alongside the usual advantages of digital archiving for training purposes15 and contemporaneous record keeping, having a digital record from epidemiological surveys could also provide an opportunity to retrospectively extract further data from a digital database.6 Creating a digital archive for open source data would allow single databases to be used more widely, with data being leveraged, shared and combined with other data.23 Sharing of information in this way assists scientific collaboration, enriches research and advances analytical capacity to inform decisions.23
All photographic assessors were above the BASCD specificity benchmark (90%) for correctly recognising teeth free from dental disease (99.1%). However, all photographic assessors fell below the BASCD sensitivity benchmark (75%) for correctly recognising dental disease (57%). The PI came closest to reaching the BASCD kappa benchmark (0.75), with a mean of 0.74; the remaining photographic assessors also fell below this value. Similar underestimation of disease has been a consistent finding in almost all previous studies.7,9
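The three benchmark measures discussed above can be illustrated with a minimal Python sketch that computes sensitivity, specificity and Cohen's kappa from a 2 x 2 table of photographic scores against the reference examination. The class name `ConfusionCounts`, the function names and the example counts are all hypothetical: the counts were chosen only so that the rates match the reported 57% and 99.1%, and they are not the study data. The kappa value they produce is therefore illustrative only.

```python
from dataclasses import dataclass

# BASCD benchmarks as quoted in the text above.
BASCD_SENSITIVITY = 0.75
BASCD_SPECIFICITY = 0.90
BASCD_KAPPA = 0.75


@dataclass
class ConfusionCounts:
    """Tooth-level agreement between a photographic assessor and the reference (GS)."""
    tp: int  # diseased on GS, diseased on photograph
    fp: int  # sound on GS, diseased on photograph
    fn: int  # diseased on GS, sound on photograph
    tn: int  # sound on GS, sound on photograph


def sensitivity(c: ConfusionCounts) -> float:
    """Proportion of diseased teeth correctly recognised as diseased."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Proportion of disease-free teeth correctly recognised as disease-free."""
    return c.tn / (c.tn + c.fp)


def cohen_kappa(c: ConfusionCounts) -> float:
    """Chance-corrected agreement for two raters and two categories."""
    n = c.tp + c.fp + c.fn + c.tn
    observed = (c.tp + c.tn) / n
    # Expected agreement by chance, from the marginal totals of each rater.
    expected_yes = ((c.tp + c.fp) / n) * ((c.tp + c.fn) / n)
    expected_no = ((c.fn + c.tn) / n) * ((c.fp + c.tn) / n)
    expected = expected_yes + expected_no
    return (observed - expected) / (1 - expected)


# Hypothetical counts, NOT the study data.
example = ConfusionCounts(tp=57, fp=9, fn=43, tn=991)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.3f} (benchmark {BASCD_SENSITIVITY})")
    print(f"specificity = {specificity(example):.3f} (benchmark {BASCD_SPECIFICITY})")
    print(f"kappa       = {cohen_kappa(example):.3f} (benchmark {BASCD_KAPPA})")
```

With counts of this shape, specificity passes its benchmark while sensitivity does not, which mirrors the pattern of underestimation described in the text.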
In this and previous studies, underestimation of dental disease was attributed to tooth-coloured fillings being more difficult to identify and to transcription errors, where ‘extracted caries’ and ‘missing’ were interchanged.14 In this study, the angulation of the images was also problematic. The rationale for using full arch images with a single anterior view was to address the issues of time reported in the previous studies7,9,14 and to reduce additional resources needed for cross infection control. Boye et al. (2013) reported that taking digital images of all tooth surfaces took an average of 8 min per child. Using a full arch image method reduced this time to a maximum of 2 min per child. However, this did affect the quality and diagnostic accuracy of the images.
More advanced technologies are becoming available, including handheld, intraoral high definition video devices. To date, the use of full mouth video technology for epidemiology is unexplored. Research in this area should be considered to test the validity of these modalities to address the underestimation of disease found in this and previous studies.6 Underestimation of disease can also be addressed by applying a correction factor to minimise bias.24 In addition, future work could include a comprehensive cost analysis of using digital images to screen for caries compared with a validated conventional method.
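The text does not specify which correction factor is meant. One widely used option, shown here purely as an assumed illustration and not as the method of the cited reference, is the Rogan-Gladen estimator, which adjusts an apparent proportion of diseased teeth for the known sensitivity and specificity of the screening method. The function name and the apparent prevalence of 20% are hypothetical; the sensitivity and specificity are the values reported in this study.

```python
def rogan_gladen(apparent_prevalence: float,
                 sensitivity: float,
                 specificity: float) -> float:
    """Estimate true prevalence from apparent prevalence (Rogan-Gladen).

    true = (apparent + specificity - 1) / (sensitivity + specificity - 1)
    The result is clamped to [0, 1], since sampling error can push the
    raw estimate outside that range.
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        raise ValueError("The test must perform better than chance (Se + Sp > 1).")
    estimate = (apparent_prevalence + specificity - 1.0) / denominator
    return min(1.0, max(0.0, estimate))


# Hypothetical apparent prevalence; Se and Sp as reported in the text.
corrected = rogan_gladen(apparent_prevalence=0.20,
                         sensitivity=0.57,
                         specificity=0.991)

if __name__ == "__main__":
    print(f"apparent = 0.200, corrected = {corrected:.3f}")
```

Because sensitivity is well below specificity here, the corrected estimate is higher than the apparent one, consistent with the photographic method underestimating disease.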
Currently, it is impossible to blind examiners for data collection in dental epidemiological studies, by the very nature of disease detection.6 Attempts have been made to address issues such as participant identifiers or the examiner’s conscious or unconscious evaluation of a subject’s accent, vocabulary, dress or mannerisms.6,25 Inadequate blinding can cause significant differences in treatment effect size estimates.26,27,28 In well-designed research trials, the possibility of bias is recognised and minimised as much as possible to ensure the integrity of the results. However, in disease measurement and reporting, blinding cannot be guaranteed. Digital dental disease screening as a data collection method could strengthen the blinding process.5
In this study, the photographic assessors were independent of the visual examination process, except for the PI. This meant that observation bias was minimised. The PI left a time gap of up to 6 months between assessing the digital photographs from both schools. Despite this, the PI reported difficulty in remaining objective when scoring the photographs. This bias may be reflected in the results, where the PI showed consistently higher agreement with the GS in comparison with the independent photographic assessors (see Tables 1 and 2).
Despite meeting the sample size reliability criteria,22 the sample size in this study was considerably smaller than that tested by Boye et al. (2013). Intra-rater reliability was not calculated due to the small sample size. A further limitation of this study was the use of one calibrated examiner for the visual examinations and one assessor from each category scoring the images. A combination of multiple visual examiners and multiple independent assessors would eliminate observational bias and test intra-examiner reliability.