Comparison of novel and established caries diagnostic methods: a clinical study on occlusal surfaces

Background: The purpose of this prospective clinical diagnostic study with validation was to compare the diagnostic accuracy of near-infrared transillumination (NIRT), laser fluorescence measurement (LF), alternating current impedance spectroscopy (ACIS) and their combinations as adjunct methods to visual examination (VE) for occlusal caries detection using a hybrid reference standard.
Methods: Ninety-six non-cavitated first and second permanent molars from 76 individuals (mean age 24.2 years) were investigated using VE (ICDAS) and bitewing radiography (BWR), as well as NIRT, LF and ACIS. The findings of BWR and NIRT were evaluated by two examiners, while the other examinations were conducted by one calibrated dentist. The hybrid reference standard consisted of non-operative validation based on the results of VE and BWR and operative validation. Statistical analysis included cross-tabulations and calculation of sensitivity, specificity and area under the receiver operating characteristic curve (AUROC) at three diagnostic thresholds: caries in general, enamel caries and dentin caries.
Results: NIRT, LF and ACIS exhibited high sensitivity for caries in general [1.00 (1.00–1.00), 0.77 (0.65–0.88) and 0.75 (0.63–0.87)] and for dentin caries [0.97 (0.91–1.03), 0.76 (0.76–0.90) and 0.64 (0.47–0.80)]. Sensitivity values for enamel caries were weak (0.21, 0.11, 0.37). Specificity did not fall below 0.65 (NIRT) for any category or method, except for NIRT at the overall caries detection threshold (0.27). A combination of LF and ACIS with VE improved the diagnostic performance at the overall and the enamel caries threshold. The other methods showed fair to excellent discrimination at the overall caries threshold (NIRT 0.64, LF 0.89 and ACIS 0.86) and acceptable discrimination at the dentin caries threshold (NIRT 0.82, LF 0.81 and ACIS 0.79). AUROC for enamel caries exhibited the weakest discrimination. Accuracy was 65.6% for VE, 69.8% for BWR, 50.0% for NIRT, 53.1% for LF and 74.0% for ACIS. Reliability assessment for BWR and NIRT showed at least substantial agreement for all analyses.
Conclusions: The methods NIRT, LF and ACIS revealed different potential but no impeccable performance for occlusal caries detection. All are suitable instruments to detect hidden carious lesions in dentin. As auxiliaries to VE, LF and ACIS showed an increase in diagnostic performance.

larger portion of non-cavitated caries [2][3][4][5][6]. Optimal caries management requires structured caries detection, assessment, and diagnostic procedures. Visual examination (VE) is recognized as the first method of choice due to its simplicity, acceptable validity and reliability, especially for early occlusal caries detection and assessment [3,7,8]. In addition, bitewing radiography (BWR) is frequently considered the adjunct diagnostic method of choice because of its widespread availability in dental practices, complete imaging of the posterior region on one side of the jaw, visualization of caries extension in relation to the pulp and acceptable validity and reliability [9]. With the aim of limiting exposure to ionizing radiation for dental diagnostic purposes, many X-ray-free diagnostic methods have been introduced on the dental market. Among these, laser fluorescence measurement (LF), e.g. the DIAGNOdent (KaVo, Biberach, Germany), was the first to be proven a valuable method for the evaluation of occlusal sites [10]. Since then, several other, mostly light-optical, devices have been developed and have received increasing clinical and scientific attention. Near-infrared light transillumination (NIRT) and alternating current impedance spectroscopy (ACIS) have recently been introduced into dental practice [11][12][13], and only a few diagnostic studies have analysed their diagnostic performance for occlusal caries detection thus far [14][15][16][17][18][19][20]. A few clinical studies directly compared the diagnostic performance of these adjunct methods, highlighting their individual potential for occlusal caries detection at the enamel and dentin threshold [15,21,22].
The objective of this clinical study using a hybrid reference standard was to compare the diagnostic accuracy of NIRT, LF and ACIS alone and as adjunct methods to VE for occlusal caries detection in relation to the thresholds for overall caries, enamel caries and dentin caries. The null hypothesis was that all diagnostic procedures would exhibit similar diagnostic performance.

Methods
This prospectively designed clinical diagnostic study was approved by the Ethics Committee of the Medical Faculty of the Ludwig-Maximilians University of Munich (Project Number 013-12).

Sample size calculation
The assumed caries prevalence of the study population is approximately 50%. Aiming for a power of 80% and setting alpha to 0.05 with the null hypothesis for sensitivity (SE) and specificity (SP) at 0.5, which was supposed to increase to 0.7, 98 samples were calculated to be required [23].
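As a rough cross-check of this calculation, the following sketch searches for the smallest number of surfaces per disease state at which an exact binomial test of H0: SE (or SP) = 0.5 reaches 80% power against a true value of 0.7 at a two-sided alpha of 0.05, and then scales that number by the assumed prevalence of 50%. The choice of an exact two-sided binomial test and all function names are assumptions made for illustration; the procedure in [23] may differ in detail.

```python
from math import ceil, comb


def upper_tail(c, n, p):
    """Return P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))


def exact_power(n, p0, p1, alpha=0.05):
    """Power of a two-sided exact binomial test of H0: p = p0 when the truth is p1."""
    # Smallest upper critical value whose tail probability under H0 is <= alpha/2.
    c_hi = next(c for c in range(n + 2) if upper_tail(c, n, p0) <= alpha / 2)
    # Largest lower critical value with P(X <= c_lo) <= alpha/2 under H0 (-1 = none).
    c_lo = -1
    while 1 - upper_tail(c_lo + 2, n, p0) <= alpha / 2:
        c_lo += 1
    # Probability of falling into either rejection region when p = p1.
    return upper_tail(c_hi, n, p1) + (1 - upper_tail(c_lo + 1, n, p1))


def min_group_size(p0, p1, alpha=0.05, power=0.80, n_max=500):
    """Smallest n at which the exact test first reaches the requested power."""
    for n in range(1, n_max + 1):
        if exact_power(n, p0, p1, alpha) >= power:
            return n
    raise ValueError("requested power not reached within n_max")


n_group = min_group_size(p0=0.5, p1=0.7)  # surfaces needed per disease state
prevalence = 0.5                          # assumed caries prevalence from the text
n_total = ceil(n_group / prevalence)      # total surfaces to be examined
print(n_group, n_total)
```

Because the power of an exact test does not increase monotonically with n, the first n that reaches the target is only one of several defensible choices, so small deviations from a published table are to be expected.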

Eligibility of patients and teeth
The participants of this study were patients who came to the Department of Conservative Dentistry in Munich requesting a dental examination and/or treatment from December 2012 to July 2014. Only healthy patients in generally good condition (ASA 1), with a minimum age of 12 years and a fully erupted permanent dentition were included. Further inclusion criteria were the presence of at least one molar without restoration, fissure sealing, orthodontic treatment, developmental defects or macroscopic cavitations. If these inclusion criteria were met, the study design was explained to the patient, who was then asked to participate (JK and FL). In case of a positive response, written informed consent was obtained. A further prerequisite for participation was the availability of bitewing radiographs that were no older than four months. New radiographs were only prescribed when there was a justifiable indication. These radiographs were analysed by the recruiting dentists (JK and FL) as part of the initial examination. The patients were informed about their findings, and all adequate therapeutic options were enumerated and offered.

Clinical examination
Visual examination was performed by two calibrated dentists (JK and FL) using magnifying lights (magnification: 2.5×, focal distance: 300-550 mm, field of view: 67-115 mm) after professional tooth cleaning and ~5 s of air-drying. All results of the VE were discussed and evaluated among the examiners (JK and FL) during the same appointment. The findings were categorized according to the International Caries Detection and Assessment System II (ICDAS) for occlusal surfaces with the following relevant scores: sound, first visible change in enamel, distinct visible change in enamel and localized enamel breakdown without visible dentin or underlying shadow [7]. Photographs were captured to allow later reassessment of the surfaces.

Digital bitewing radiography
Bitewing radiographs were acquired using an intraoral dental X-ray machine with a 203 mm tube (Heliodent DS, Sirona, Bensheim, Germany) including an X-ray field limitation (30 × 40 mm) with a CCD sensor (Intraoral II, sensor size 30.7 × 40.7 mm, Sirona, Bensheim, Germany) and an exposure time of 0.06 s at a cathode voltage of 60 kV and a tube current of 7 mA. For the parallel technique, a sensor-holding system (XPP-DS Digital Sensor Holders for Sirona, Dentsply Rinn, Elgin, IL, USA) was used. All radiographs were evaluated by two examiners (FL and KH) independently of each other and blinded to the other diagnostic findings in a darkened room [24]. Divergent results were discussed until a consensus finding was reached.

Laser fluorescence measurements
LF was performed using a DIAGNOdent Pen device (KaVo, Biberach, Germany). The device was regularly calibrated according to the manufacturer's instructions. Furthermore, the rounded glass tip of the device was individually adjusted to the autofluorescence of the tooth at a healthy dental area after brief air drying. Measurements of the occlusal surface were then made. The maximum LF reading (0-99) was recorded, and the measurements were defined as follows: 0-12 sound, 13-24 enamel and 24-99 dentin involvement [25].

Alternating current impedance spectroscopy
ACIS was conducted with the CarieScan Pro device (orange dental, Biberach, Germany) on air-dried molar teeth and fissures isolated with cotton rolls. The ACIS readings were defined according to the following thresholds: 0-20 sound, 21-90 demineralized enamel and 91-100 dentin involvement [25].
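The cut-offs given for the LF and ACIS readings amount to a simple banding of each device scale. The sketch below illustrates this mapping; the function names are hypothetical, and because the stated LF ranges share the value 24, the sketch assumes that a reading of 24 still counts as enamel.

```python
def classify_reading(value, enamel_min, dentin_min, scale_max):
    """Map a device reading to 'sound', 'enamel' or 'dentin' using lower cut-offs."""
    if not 0 <= value <= scale_max:
        raise ValueError(f"reading {value} outside device scale 0-{scale_max}")
    if value >= dentin_min:
        return "dentin"
    if value >= enamel_min:
        return "enamel"
    return "sound"


def classify_lf(value):
    """LF bands from the text: 0-12 sound, 13-24 enamel, higher readings dentin.

    Assumption: a reading of exactly 24 is treated as enamel.
    """
    return classify_reading(value, enamel_min=13, dentin_min=25, scale_max=99)


def classify_acis(value):
    """ACIS bands from the text: 0-20 sound, 21-90 enamel, 91-100 dentin."""
    return classify_reading(value, enamel_min=21, dentin_min=91, scale_max=100)


# Hypothetical readings, for illustration only.
for lf, acis in [(8, 15), (18, 60), (47, 100)]:
    print(lf, classify_lf(lf), acis, classify_acis(acis))
```

Keeping the cut-offs as parameters of one shared helper makes it easy to rerun an analysis with alternative thresholds.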

Treatment decision, validation and definition of the reference standard
After the clinical assessment using the different diagnostic methods as described above, a management strategy was determined for all 96 molar teeth. This strategy considered surface-related factors, e.g., the extent of caries in relation to the pulp, presence of (micro-)cavitation and caries activity, as well as the overall caries risk of each subject [5]. All steps were pre-discussed in the study group and finally agreed to by each patient. To make the independent reference standard more powerful, VE and BWR were evaluated by three different examiners (JK, FL and KH) as described above. The hybrid reference standard consisted of two different procedures in relation to the diagnostic findings to meet the ethical requirements of an in vivo analysis. The samples undergoing non-operative validation were classified as healthy or as occlusal surfaces with a non-cavitated carious lesion according to the findings of VE and BWR, which did not justify any restorative intervention (N = 56). The lesions without the need for operative care were integrated into an individual prophylaxis and monitoring concept. The other group, undergoing operative intervention, consisted of samples with an indication for restorative care, which was conducted at a separate appointment a maximum of two weeks after diagnosis (N = 40). For operative validation, carious dentin was removed using restrictive and selective caries removal techniques [26]. Soft dentin close to the pulp was excavated with a self-limiting polymer bur (P1, Komet, Lemgo, Germany). The reference values were assigned immediately after caries excavation according to the scores listed in Table 1 by one examiner (FL). After excavation and clinical judgement, the cavity was photographed for later independent reassessment.
Finally, the cavity was restored with an adhesively bonded restoration (Syntac classic, Vivadent, Schaan, Liechtenstein; SONICfill, KaVo, Biberach, Germany; SonicFILL, West Collins, Orange, CA, USA). All patients were consistently informed about appropriate home-based preventive measures and were offered risk-related professional preventive dental care aimed at lowering caries activity and risk. All treatment decisions were made in cooperation with all examiners (JK, FL and KH) throughout the study period.

Training and calibration of all diagnostic methods and the validation process
The examiners (FL and KH) underwent two-day theoretical and practical training for all diagnostic procedures used in this study (VE, BWR, NIRT, LF and ACIS) under the guidance of an experienced dentist (JK). The training included information pertaining to the study design, indices, diagnostic principles of all methods and the validation procedure. The examiners then evaluated a new set of bitewing radiographs and NIRT images in the trainer's presence; discordant findings were immediately discussed, and a consensus diagnosis was reached. Subsequently, the reliability between and within the examiners (FL and KH) was determined based on 50 case examples of BWR and NIRT, and an inter- and intra-examiner agreement of more than 90% was achieved (linear weighted kappa analysis). The training was completed by a clinical training course, during which the examiner (FL) performed clinical examinations using all diagnostic methods (VE, BWR, NIRT, LF, and ACIS) and validated ten carious dentin lesions according to the study protocol under supervision (JK).
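For readers unfamiliar with the linear weighted kappa used in this reliability assessment, a minimal sketch follows. It assumes ordinal scores coded 0 to k-1 and agreement weights that fall linearly from 1 (identical scores) to 0 (maximally distant scores); the function name and the example ratings are hypothetical and do not reproduce study data.

```python
def linear_weighted_kappa(rater_a, rater_b, n_categories):
    """Cohen's kappa with linear agreement weights for ordinal scores 0..k-1."""
    n, k = len(rater_a), n_categories
    if n == 0 or n != len(rater_b) or k < 2:
        raise ValueError("need two equally long, non-empty ratings and k >= 2")

    # Observed joint proportions of the two ratings.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1 / n

    # Marginal proportions of each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # 1 for exact agreement, decreasing linearly with the category distance.
        return 1 - abs(i - j) / (k - 1)

    p_obs = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    p_exp = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    # Undefined (division by zero) if both raters use a single identical category.
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical scores (0 = sound, 1 = enamel, 2 = dentin), for illustration only.
first_reading = [0, 1, 2, 0]
one_step_off = [0, 1, 2, 1]
two_steps_off = [0, 1, 2, 2]
print(linear_weighted_kappa(first_reading, one_step_off, 3))
print(linear_weighted_kappa(first_reading, two_steps_off, 3))
```

With these weights, a disagreement by one category lowers kappa less than a disagreement by two categories, which is the behaviour wanted for ordinal caries scores.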

Statistical analysis
After data entry using a spreadsheet (Microsoft Excel, Version 16.36), statistical analysis was conducted using the statistical software SPSS (IBM SPSS Statistics for Windows, Version 25.0, Armonk, NY, USA) and R [27]. Diagnostic results from the test methods or their combinations with VE were cross-classified with the findings from the hybrid reference standard using the predefined definitions in Table 1. Overall accuracy was calculated as the percentage of correctly classified decisions, (TP + TN)/(TP + TN + FP + FN), where TP, FN, FP and TN represent the counts of true positives, false negatives, false positives and true negatives, respectively. In addition to descriptive data analysis, contingency tables for cross-classification were prepared and SE and SP were calculated [28]. These procedures were consistently performed for all test methods and their combinations using three diagnostic thresholds. As shown in Table 1, these thresholds were enamel caries, dentin caries and overall caries, the latter comprising enamel caries and dentin caries together. Furthermore, the area under the receiver operating characteristic curve (AUROC) was calculated, and multiple comparisons between the AUROCs from different methods and thresholds were conducted [29]. To interpret the AUROC, the classification by Hosmer and Lemeshow [30] was applied: AUROC value of 0.5-0.7 = poor to fair discrimination; AUROC value of 0.7-0.8 = acceptable discrimination; AUROC value of 0.8-0.9 = excellent discrimination and AUROC value ≥ 0.9 = outstanding discrimination. If the AUROC was 0.50, the model did not discriminate. The inter- and intra-examiner reliability values were calculated using linear weighted Cohen's kappa, in which a 1-category difference is considered less severe than a 2-category difference. Weights ranged from 0 to 1, and the weight for cells where the raters agreed exactly equalled 1.
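The measures described in this section follow directly from the four cells of a 2 × 2 cross-classification. The sketch below computes SE, SP and overall accuracy and applies the AUROC categories named above; the function names and the example counts are hypothetical, and assigning the boundary values 0.7, 0.8 and 0.9 to the higher category is an assumption, since the stated ranges share their endpoints.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity and overall accuracy from a 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),            # share of diseased sites detected
        "specificity": tn / (tn + fp),            # share of sound sites recognised
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


def interpret_auroc(auroc):
    """Verbal AUROC category following the ranges given in the text."""
    if not 0 <= auroc <= 1:
        raise ValueError("AUROC must lie between 0 and 1")
    if auroc >= 0.9:
        return "outstanding"
    if auroc >= 0.8:       # assumption: boundary values go to the higher category
        return "excellent"
    if auroc >= 0.7:
        return "acceptable"
    if auroc > 0.5:
        return "poor to fair"
    return "no discrimination"


# Hypothetical counts, for illustration only (not study data).
metrics = diagnostic_metrics(tp=30, fn=10, fp=5, tn=55)
print({name: round(value, 3) for name, value in metrics.items()})
print(interpret_auroc(0.86))
```

Applied at each of the three thresholds, the same function only needs the reference and test findings to be dichotomized accordingly before the four cells are counted.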

Results
Out of 155 patients evaluated for eligibility, 76 participants with 96 occlusal surfaces met the inclusion criteria (Fig. 1). A maximum of two teeth per patient were randomly selected for statistical analysis. The median age of the study participants was 24.2 years (range 14-49, 6 adolescents and 70 adults, 45 women and 31 men), and their caries prevalence was moderate according to WHO criteria (5.9 DMFT and 11.2 DMFS). A total of 45.8% (N = 44) of the relevant occlusal surfaces were found to be caries-free, in 19.8% (N = 19) the caries was restricted to the enamel, and in 34.4% (N = 33) it reached the dentin according to the hybrid reference standard (Table 2). Cross-classifications for findings of each diagnostic test method and its combination with VE in relation to the hybrid reference standard can be taken from
All methods (NIRT, LF and ACIS) consistently showed an AUROC of at least 0.79 for dentin caries detection (Table 3). At the dentin threshold, all methods discriminated comparably, in the acceptable to excellent range (0.79-0.82). In contrast, NIRT did not discriminate at the enamel threshold (0.43), and LF and ACIS showed poor to fair discrimination (0.53/0.63).

Discussion
The main objective of this in vivo diagnostic study with validation was to compare different diagnostic methods for occlusal caries detection and diagnostics at different diagnostic thresholds. It was initially hypothesized that all methods would reveal similar diagnostic performance. According to the results (Tables 2, 3, 4), the initially formulated null hypothesis must be rejected because the test methods showed heterogeneous diagnostic performance. To our knowledge, no other clinical trial has combined the comparison of the diagnostic performance of these three test methods, NIRT, LF and ACIS, at different diagnostic thresholds in one clinical analysis.
The strength of this study is that permanent first and second molars were analysed in a predominantly homogeneous group of young adult or adolescent patients and that all test methods were applied under standardized clinical conditions. The participants were evaluated and screened for study eligibility by the authors (JK and FL) before study entry. The study population included adolescents and young adults with complete permanent dentition and at least one molar without restoration, but only a sub-selection of these patients and their molars was finally included for statistical analysis. It must be acknowledged that the included participants and their teeth may not be a representative sample of the targeted population, and the generalisability of the study results must be regarded in this context. It can also be argued that the assumed caries prevalence on which this study is based is not representative of the targeted population. The following arguments justify our assumption of a 50% caries prevalence. Epidemiological data on caries prevalence in Germany are scarcely available, but based on oral health studies, caries prevalence in 12-year-olds is 25.2% and in 35-45-year-olds 97.5% [34]. Epidemiological surveys are usually based on clinical examinations of the teeth, which record lesions only from an advanced, clearly visible stage, while non-cavitated and/or initial lesions, as relevant in our study, mostly remain underestimated [11]. It is therefore a realistic scenario for the targeted population of young adults to assume a caries prevalence of 50% (ICDAS score > 1) for unrestored or sealed occlusal surfaces of molars [35][36][37].
Further, the study data have a clustered nature because in 26% of all cases two samples per participant were included in the evaluation. Since the structure of the enamel and dentin tissue is individual to each subject, it can influence the optical properties of the teeth and may thus have an impact on the results. If we had opted for only one tooth per subject, the sample size would have been reduced to 76, which in turn would have lowered the statistical power of this analysis. Future studies should include a complete cluster analysis.
Considering the cross-tabulation at the dentin caries threshold, the high rate of false-negative findings for VE becomes apparent (Table 2). Previous studies confirmed these findings with weak values for SE and strong values for SP for the detection of non-cavitated dentin lesions [38,39]. In this study, dentin lesions were not identified visually as such because the lesions were non-cavitated and predominantly hidden. This explains the relatively weak accuracy of 65% for visual inspection in this study. This distribution of diagnostic potential is complemented by BWR, which has its strength especially in the detection of hidden dentin lesions. By excluding cavitated lesions (ICDAS score > 3) from the study population, we were able to better demonstrate the strength of the auxiliary test methods in detecting non-cavitated dentin caries. On the other hand, the analysis incorporates an incomplete caries spectrum in the sample. This limits the generalisability of the present study, as additional thresholds, e.g., other dentin caries levels, were not examined [40][41][42].
The type of reference standard used in this investigation provides a solution strategy for a common and well-known problem in clinical diagnostic studies. The use of an independent and rigorous reference standard [43], e.g., histology, microradiography or µCT, is not feasible in clinical investigations as it excludes sound surfaces, non-cavitated lesions or those caries stages that can be managed by non-operative measures; in short, all lesions without an indication for operative care. With the aim of overcoming this methodological disadvantage, a hybrid reference standard was used in the present study. Although this model of a reference test meets the ethical and clinical requirements, it bears the risk of sample bias. It includes information from the test methods, which contradicts the principle of independence of index and reference tests, and therefore the present study is not free of incorporation bias. The information about the diagnostic performance of VE as an index test is limited. The focus must be on the true index tests without intersection with the reference standard: NIRT, LF and ACIS. Nevertheless, other groups have also constructed a reference standard by including results from the index tests [44][45][46]. The reference standard for those samples that did not undergo operative validation is formed by VE and BWR, while the reference standard for all samples that required operative intervention is drawn from the results of the validation process. This model of a hybrid reference standard increases the risk of differential verification bias, as not all samples are subjected to the same reference standard.
Most of the results of this clinical study are in line with previous diagnostic studies concerning the methods VE, BWR and LF [16,47,48,49]. We chose 2.5× magnification to achieve optimal results of visual inspection in this study. It must be considered that a visual assessment performed with unaided eyes would probably have resulted in lower values of SE and SP for VE [50]. This study additionally provides new diagnostic findings on NIRT and ACIS, which have not previously been reported in the literature [14,15,18,19,20,21].
Alternating current impedance spectroscopy showed a strong overall accuracy of 74% but did not show particularly good diagnostic performance for either enamel or dentin caries detection. This makes the evaluation and assessment of the method from a clinical point of view more difficult. At the overall caries and the enamel threshold, LF and ACIS show a similar range of competence, while NIRT shows significantly weaker performance than the other methods and their combinations with VE (Tables 3, 4), as well as the lowest accuracy of only 50.0%. Additionally, NIRT was more sensitive than specific at the overall caries detection threshold, in contrast to the other methods (Table 3). The use of NIRT led to an increased number of false-positive diagnoses (Table 2). These misinterpretations may have been caused by occlusal staining of healthy molars and have been described in a previous in vitro study [20]. Regarding the detection of enamel caries, all diagnostic methods revealed insufficient diagnostic performance, which was mainly caused by low values of SE (Table 3). However, as auxiliaries to VE, ACIS and especially LF increase their diagnostic potential to detect enamel lesions. Both methods seem to complement the high sensitivity of ICDAS for enamel lesions. One main result of this study is the high diagnostic performance of all three auxiliary methods for the detection of dentin caries. This is very important for everyday clinical practice, as the use of these diagnostic methods can support the clinician in detecting lesions in dentin. However, due to the numerous limitations listed here, the test methods cannot be recommended for occlusal caries detection in general.
Finally, it is important to emphasize that the present study investigated the performance of different diagnostic methods and their combination with VE in relation to anatomically based hard tissue structures, and it did not investigate recently suggested thresholds for operative intervention. Relevant thresholds from the clinician's point of view are, first, caries in the middle third of dentin [40]; second, caries in the outer fifth of dentin [41]; or third, caries reaching the inner quarter of dentin [42]. Here, as shown in Table 3, it can be hypothesized that different thresholds are associated with different diagnostic performance data. It must therefore be clearly stated that the data shown should not be transferred unconditionally to other clinical situations and that further research is needed to test the diagnostic accuracy in relation to thresholds for operative interventions.

Conclusions
All three test methods, NIRT, LF and ACIS, revealed their individual strengths and limitations, but none of them exhibited impeccable diagnostic performance or can be generally recommended for occlusal caries detection. Laser fluorescence measurement and ACIS show an increase in diagnostic performance as adjunct methods to visual examination. All methods are helpful diagnostic tests to detect non-cavitated caries in dentin.