The Mahalanobis Distance method for matching of LIBS spectra of tooth samples
In order to test this discriminant analysis in the identification of the carious / healthy tissue samples, ten database entries were constructed from the collected spectra to form ten separate Discriminant Analysis models, five each from carious and healthy tissues. Six distinct spectral ranges covering a range of matrix and non-matrix elements were used; the relevant spectra are shown in Figure 3. In this way six pairs of "healthy / diseased" identifiers were generated. As pointed out earlier, in principle a single model would probably suffice, but having more than one decider naturally improves the identification accuracy.
For creating the Discriminant Analysis models a list of the training set spectra was simply entered into the PLS plus/IQ program attachment to GRAMS/32 spectral evaluation software package (Galactic Software Ltd.) and linked together with in-house written macro codes for visual (and audio) presentation of the analysis results. The program generated a Discriminant Analysis model for each sample, using the methods outlined in the previous section, against which test spectra were matched. When checking the identity of "unknown" spectra collected from a range of tooth samples, all were either identified as definite or possible matches to the healthy or diseased tissue discriminant analysis models, even if only one out of the six identifier spectral regions was used.
The major constituent of the tooth's crystalline enamel and dentine matrix structure is hydroxyapatite, Ca₁₀(PO₄)₆(OH)₂, whose absolute abundance is distinctly different for healthy dental tissue and tissue affected by caries. For affected teeth the relative concentrations of the matrix elements Ca and P decrease severely. On the other hand, non-mineralising (non-matrix) elements, e.g. zinc, and organic materials (the occurrence of the carbon 193 nm line is indicative for these) increase strongly; see Figure 3d. A similar indicator for the effect of caries attack is the substantial increase of strontium, Sr, and barium, Ba, in relation to the matrix element Ca; see Figure 3c.
In the Discriminant Analysis models utilised here the important result is the M.Dist value. Depending on this, a pass (P) – healthy tissue, possible (?) – healthy/carious tissue or fail (F) – carious tissue – result was returned in the Limits tests, related to particular reference sample groups. Tests carried out on hundreds of spectra recorded from a multitude of different teeth showed conclusive evidence that by using the M.Dist values the spectra could be correctly categorised into the two distinct sample groups, namely sound, healthy tooth area / caries-affected tooth area.
The M.Dist value is effectively a measure of the similarity of an "unknown" spectrum to a group of training spectra. Thus the M.Dist value in Discriminant Analysis models reporting a "FAIL" result is normally high, indicating that the spectral contributions from individual elements are very different for e.g. healthy and caries-affected samples. The smaller the M.Dist value for a model giving a "FAIL" result, the smaller the elemental variations encountered. On this basis statistical fluctuations in the spectra, caused by inevitable pulse-to-pulse intensity variations, can also be accounted for in the Prediction Module by adjusting the M.Dist PASS/FAIL limits appropriately [25].
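To illustrate how such a similarity measure behaves, the following is a minimal sketch of a Mahalanobis distance evaluated in a principal-component score space. It is not the PLS plus/IQ implementation; the number of components, the function names (`fit_group_model`, `m_dist`) and the synthetic two-line "spectra" are assumptions made purely for illustration, and any normalisation applied by the commercial package is not reproduced.

```python
import numpy as np


def fit_group_model(training_spectra, n_components=3):
    """Fit mean, principal axes and inverse score covariance for one group."""
    X = np.asarray(training_spectra, dtype=float)
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes of the mean-centred training spectra.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    axes = vt[:n_components]
    scores = (X - mean) @ axes.T
    inv_cov = np.linalg.inv(np.atleast_2d(np.cov(scores, rowvar=False)))
    return mean, axes, inv_cov


def m_dist(model, spectrum):
    """Mahalanobis distance of one spectrum to the group, in score space."""
    mean, axes, inv_cov = model
    s = (np.asarray(spectrum, dtype=float) - mean) @ axes.T
    return float(np.sqrt(s @ inv_cov @ s))


# Synthetic data (hypothetical): two emission lines with 5% amplitude jitter.
rng = np.random.default_rng(0)
pixels = np.arange(50)


def synthetic_spectrum(amp_a, amp_b):
    line_a = amp_a * rng.normal(1.0, 0.05) * np.exp(-((pixels - 15) / 2.0) ** 2)
    line_b = amp_b * rng.normal(1.0, 0.05) * np.exp(-((pixels - 35) / 2.0) ** 2)
    return line_a + line_b + rng.normal(0.0, 0.002, pixels.size)


healthy_training = [synthetic_spectrum(1.0, 0.2) for _ in range(20)]
healthy_model = fit_group_model(healthy_training)

d_member = m_dist(healthy_model, synthetic_spectrum(1.0, 0.2))   # same group
d_other = m_dist(healthy_model, synthetic_spectrum(0.5, 0.55))   # other group
print(f"same group: {d_member:.1f}, other group: {d_other:.1f}")
```

In this toy case a spectrum drawn from the training group yields a small distance, while a spectrum with different line intensities yields a much larger one, which is the behaviour exploited by the PASS/FAIL limits.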
As is the case in all Multivariate Quantitative Analysis approaches, careful application is required if the technique is to be applied both correctly and successfully. For example, the limits within which the M.Dist values indicate a match status of PASS, POSSIBLE or FAIL are frequently defined by default as <2, 2–3, and >3, respectively. Raman spectroscopists, however, often use larger values, e.g. <5, 5–15, and >15, respectively. Therefore, these limits always have to be determined prior to a practical application, such as distinguishing between healthy and caries-affected tooth material. The factors which dictate these limits in LIBS analysis are (i) spectrum reproducibility and (ii) sample-to-sample homogeneity. By testing the models with randomly collected spectra from samples of the material that they represent (carious or sound dental tissue), the range of M.Dist values which gives a positive identification can be found. If this is not done, the model might misidentify materials.
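The limit test described above can be sketched as a simple three-way decision. The function name `match_status` and the treatment of values lying exactly on a limit are assumptions for illustration; the default limits are those quoted in the text.

```python
def match_status(m_dist, pass_limit=2.0, fail_limit=3.0):
    """Map an M.Dist value onto PASS / POSSIBLE / FAIL for given limits."""
    if m_dist < pass_limit:
        return "PASS"
    if m_dist <= fail_limit:
        return "POSSIBLE"
    return "FAIL"


# Default limits (<2, 2-3, >3).
print(match_status(1.5), match_status(2.5), match_status(10.0))

# The wider limits quoted for Raman work (<5, 5-15, >15).
print(match_status(10.0, pass_limit=5.0, fail_limit=15.0))
```

The same M.Dist value of 10 is a FAIL under the default limits but only a POSSIBLE under the wider ones, which is why the limits have to be established for each application.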
In addition, by carefully adjusting the M.Dist limits, poor reproducibility can in principle be accounted for, provided there are sufficient elemental differences in the samples being sorted, such that clear changes in the spectral responses can be observed. As Figure 3 shows, large differences in the spectral signature of healthy and caries-affected tissue are indeed encountered, and we have shown that, in contrast to these obvious cases, subtle differences can also be distinguished (meaning that even the detection of early caries lesions should be feasible). This will be discussed further below, also with reference to the choice of M.Dist limit values.
Finally, we would like to note that multivariate analysis is rather unintuitive for the non-expert, since a simple graphical representation of the statistical model cannot normally be given, as is the case in univariate analysis. In univariate analysis a statistical distribution f(x) is plotted against its variable x, exhibiting a width parameter (confidence limit) ± Δx, or ±σ. In multivariate analysis there are many variables x_i, and the function would require a multidimensional plot. For a spectrum with up to a few hundred data points (variables) this cannot be visualised. The M.Dist value may crudely be interpreted as a kind of confidence limit, similar to σ in univariate analysis. In order to clarify this point, an example for univariate analysis is given further below. Note that in most analytical cases of well-behaved data multivariate algorithms will provide reduced errors when compared to univariate algorithms.
Application of the Mahalanobis Distance method to mapping of carious teeth
From the repeat analysis of the spectra collected from various tooth samples, it could safely be concluded that the Mahalanobis Distance algorithm had the potential of providing a superior tool for matching LIBS spectra and identifying "unknown" sound / carious materials. In this study we achieved close to 100% identification; only one single sample was misinterpreted during our test measurements.
This result is quite remarkable, since the spectra collected in this study were recorded for non-optimised settings. For in vitro applications the distal end of the optical fibre was simply mounted at a distance of about 2 mm from the sample, and for the in vivo measurements the fibre was held by hand while ablating the tooth.
Each spectrum was accumulated for only ten laser-induced plasma events. Fewer laser pulses per spectrum were used at times to speed up the analysis process, but this was at the expense of the spectrum reproducibility, and hence slightly reduced identification probability.
The training spectra utilised in this study were obtained from a range of tooth samples in vitro, namely from extracted teeth supplied by dentists.
An example of the strength of the analysis method can be seen in Figure 4. Here an in vitro measurement was carried out on a caries-affected tooth to map the areas of "healthy" and "diseased" tissue. The spectral region used in this specific case was that displayed in Figure 3a (basically including the elements Mg and Ca). For clarity, only ten measurement positions are indicated, although full raster scans were performed as well. The M.Dist values returned from the analysis according to the two model groups provided an unequivocal "PASS" or "FAIL" in the "pure" areas (for a material match the M.Dist value was always smaller than ~1.5; for a mismatch it was normally >10–100). At the transition boundaries healthy/diseased or diseased/healthy one or the other model test occasionally returned a "POSSIBLE" (the ablation area provides material from both "model" species). It should be noted that the example presented here represents only crude lateral spatial resolution (about 750 μm); in real applications this may become much better with appropriate focussing of the laser radiation. Furthermore, depth resolution is of the order of 1–10 μm, depending on the applied laser pulse energy. Hence, in principle, excellent localised ablation control in three dimensions seems feasible.
As was pointed out further above, univariate analysis of the data may provide a more intuitive insight into the strength of LIBS for the identification of caries. In univariate LIBS analysis, which is the traditional analysis technique, one compares the (amplitude) change of a "trace" element with that of a "matrix" element (where significant changes in composition are encountered one may not have the distinction "trace" versus "matrix"). With reference to Figure 3a, i.e. the spectral region highlighting differences in the relative composition for Mg and Ca for healthy and carious tissue, one observes a substantial change in the line intensity ratios. In fact, for the two lines Mg(518.36 nm) and Ca(518.89 nm) the ratio changes from I_Ca/I_Mg = 4.95 to I_Ca/I_Mg = 0.90 (healthy-to-caries change). In repeat measurements for a range of healthy tooth samples, when normalising to the strongest peak (Ca), the intensity of the other line (Mg) varied by about 7%; this is typical in LIBS analysis of matrix materials which are not necessarily completely homogeneous. This statistical fluctuation filters through into the intensity ratio, yielding for the healthy tissue a value of I_Ca/I_Mg = 4.95 (-0.32/+0.37).
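The asymmetric limits quoted for the healthy-tissue ratio can be reproduced with a short calculation. This is a sketch under the assumption (not stated explicitly in the text) that the 7% variation applies to the Mg intensity in the denominator while the Ca intensity is held fixed by the normalisation; the function name `ratio_limits` is hypothetical.

```python
def ratio_limits(ratio, rel_variation):
    """Asymmetric limits of I_Ca/I_Mg when only I_Mg varies by +/- rel_variation."""
    lower = ratio / (1.0 + rel_variation)  # Mg line stronger, ratio smaller
    upper = ratio / (1.0 - rel_variation)  # Mg line weaker, ratio larger
    return ratio - lower, upper - ratio


minus, plus = ratio_limits(4.95, 0.07)
print(f"4.95 (-{minus:.2f}/+{plus:.2f})")  # 4.95 (-0.32/+0.37)
```

The asymmetry arises because the fluctuating quantity sits in the denominator of the ratio.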
If one were to determine caries in its early stages, the change would evidently be far more subtle than the one shown in Figure 3 for a far-advanced stage of caries. Assuming for the sake of argument that only a change in intensity of 5% of that in Figure 3a were encountered, a ratio I_Ca/I_Mg = 4.14 (-0.28/+0.33) would be obtained.
The two values are well outside their respective confidence limits, and thus can easily be distinguished. In principle, one now can set a decision threshold "healthy" / "caries-affected" tissue, according to the I_Ca/I_Mg ratio. The effectiveness of this approach was demonstrated by moving the ablation area gradually across the caries boundary (laser beam focussed more tightly than during the rest of this study). Only when nearly no overlap with the visually evident caries-affected area ensued did the I_Ca/I_Mg line ratio come close to the threshold value for "healthy" tissue (actually set to 4.3).
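The univariate decision can be sketched in a few lines, using only the ratio values, limits and threshold quoted above. The helper names and the convention that a ratio at or above the threshold counts as "healthy" are assumptions for illustration.

```python
def interval(value, minus, plus):
    """Return the (lower, upper) confidence interval around a ratio."""
    return value - minus, value + plus


def intervals_overlap(a, b):
    return a[0] <= b[1] and b[0] <= a[1]


def classify_ratio(ratio, threshold=4.3):
    """Two-point decision on the I_Ca/I_Mg line ratio."""
    return "healthy" if ratio >= threshold else "caries-affected"


healthy = interval(4.95, 0.32, 0.37)       # healthy tissue
early_caries = interval(4.14, 0.28, 0.33)  # assumed early-caries value

print(intervals_overlap(healthy, early_caries))
print(classify_ratio(4.95), classify_ratio(4.14), classify_ratio(0.90))
```

With the quoted limits the two intervals do not overlap, so a single threshold between them separates the two cases.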
Thus, even with simple univariate (two-point) evaluation a reasonably precise monitoring procedure is at hand. The threshold accuracy is improved even further when applying a multivariate algorithm, such as the one used in this study.
We would like to stress that the assumption of a 5% change in the "healthy" I_Ca/I_Mg line ratio value, to reflect early caries, is somewhat arbitrary. A proper histopathologic analysis of various stages of caries would be required to ascertain the appropriate values, and hence the stage at which LIBS analysis could pick up the actual disease state.
Finally, we would like to note that similar results to those shown in Figure 4 were obtained in an in vivo test measurement, albeit with fewer locations on the tooth being probed. This test was carried out using the "real-time" spectra from a caries-affected molar tooth of an adult volunteer, as mentioned further above. Again we would like to emphasise that the laser intensity was kept near threshold levels for plasma generation, in order to prevent damage to the tooth.