In this report, we sought to identify and evaluate the quality of systematic reviews comparing surgical and non-surgical care of TMJD. Our search strategy identified two reviews (one with and one without a meta-analysis). Kropmans et al found no difference in clinical outcomes (e.g., maximal mouth opening, pain, or functional improvement) when comparing arthroscopic surgery, arthrocentesis, and physical therapy. Similarly, Reston and Turkelson, examining the efficacy of the surgical techniques arthrocentesis, arthroscopy, and disc repair/repositioning, found no difference in clinical outcomes. Conceptually, both studies provide estimates of surgical vs. surgical and of surgical vs. non-surgical care, and both indicate that there are no significant differences between the interventions examined (Table 2). However, the methodological quality of these two systematic reviews (scored from 0 to 100, with 100 being the best; Kropmans et al = 23.5; Reston and Turkelson = 77.5), and the quality of the underlying evidence upon which they are based (also scored from 0 to 100, with 100 being the best; Kropmans et al = 15; Reston and Turkelson = 8.7), suggest that these authors' interpretations may be over- (or under-) statements. This leads one to an interesting dilemma: how can one best care for patients if there are few high-quality studies upon which to base this care?
The authors of the systematic reviews recognized some of these shortcomings. Reston and Turkelson pointed out that their study was an explicit attempt to provide a conclusion when reliable primary data are not available, and recognized that clinical and policy decisions must often be made in the absence of well-designed trials. They also discussed two main flaws. The first is that few studies evaluated a specific treatment in specific TMJD patient sub-groups; therefore, the relative efficacy of each treatment could not be defined. The second is that the studies used a wide range of definitions of success or improvement after treatment. Kropmans et al reported that none of the reviewed papers reported the measurement error of the procedures used (e.g., the standard error of measurement or a 95% confidence interval).
Another set of issues concerns the classification of trial designs. Eight articles are common to both reviews, of which three were assigned the same study design classification and five were not. This could be explained by differences in the way the included systematic reviews classified the study design of each article. A possible solution to this problem is the use of standard approaches to classifying study designs, thus providing more consistency. Such approaches have been used successfully in other areas of dental care (e.g., preventive dentistry [30, 31]).
While the overall quality of the reviews is of some interest, the results of each measurement tool (Tables 3, 4, 5, 6, 7, 8) may offer additional insight into areas that could be improved. For example, the common flaws in systematic reviews, in general, are: the lack of an explicit search strategy; the lack of explicitly articulated exclusion criteria; the absence of quality assessment of the underlying studies; and inappropriate aggregation of study results. These are crucial elements in the conduct of a systematic review, without which the results may be questionable. From Tables 3, 4, 5, 6, 7, 8 one can see that each included systematic review scored differently on each instrument. This can be explained by the fact that each instrument uses a different set of questions to appraise methodological quality and addresses different key areas. For example, AMSTAR addresses potential sources of bias in systematic reviews (such as funding source and conflict of interest) that OQAQ and CASP do not.
Ideally, systematic reviews should be the starting point for any search for information. The results of the current study raise questions about the quality of the systematic reviews and clinical trials upon which TMJD care is based. Unfortunately, this is a long-standing, systemic problem. A systematic review of TMJD diagnosis carried out more than 10 years ago reported major to extensive methodologic flaws in 40% of the analyzed studies. Similarly, in examining the clinical literature, Bader and Ismail identified 7 systematic reviews on the diagnosis and non-surgical treatment of TMJD. For 6 of these reviews (85.7%), they identified a need for well-designed controlled studies and for the use of standardized diagnostic criteria and outcome measures. Finally, a 2007 report from the United States Government Accountability Office examined Food and Drug Administration approval of temporomandibular joint implants. It identified some of the same methodological problems: inadequate or inaccurate measurements, sample sizes, and patient follow-up.
Thus, many of the previous results and this systematic review are congruent. That is, clinical scientists need to begin designing, implementing, and reporting clinical trials and systematic reviews that meet international standards. For international standards in chronic pain, there are the IMMPACT recommendations, which suggest that chronic pain clinical trials should assess outcomes representing six core domains: (1) pain, (2) physical functioning, (3) emotional functioning, (4) participant ratings of improvement and satisfaction with treatment, (5) symptoms and adverse events, and (6) participant disposition (e.g., adherence to the treatment regimen and reasons for premature withdrawal from the trial).
More specifically, for TMJD, a coalescence of symptom and outcome measurements could be profound. For example, one might consider the measurement and reporting of pain as pioneered by the Oxford Pain Research Center. This center, in generating or evaluating systematic reviews of acute pain, prefers validated visual analog scales (VAS) for reporting pain, specifically includes randomized, double-blind, single-dose studies in patients with moderate to severe pain, and looks for outcomes of at least 50% pain relief at 4–6 hours. The analog for TMJD could be a VAS of symptoms pre-therapy and 3 months post-therapy. Post-therapy reporting would include both the number and percentage of patients with at least 50% pain relief in both the experimental and control groups.
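The responder summary described above is a simple calculation; a minimal sketch follows, using entirely hypothetical VAS scores (the function name, threshold handling, and all values are illustrative, not data from any trial discussed here):

```python
# Hypothetical sketch: summarizing pre- vs. post-therapy VAS scores
# as the number and percentage of patients achieving >= 50% pain relief.
# All scores below are invented for illustration only.

def pain_relief_summary(pre_vas, post_vas, threshold=0.5):
    """Count patients whose VAS score fell by at least `threshold`
    (as a fraction of their pre-therapy score)."""
    responders = sum(
        1 for pre, post in zip(pre_vas, post_vas)
        if pre > 0 and (pre - post) / pre >= threshold
    )
    return responders, 100.0 * responders / len(pre_vas)

# Illustrative pre-therapy and 3-month post-therapy VAS scores (0-100 scale)
pre = [80, 60, 70, 90, 50]
post = [30, 40, 20, 85, 10]

n, pct = pain_relief_summary(pre, post)
print(n, pct)  # 3 responders -> 60.0%
```

Reporting this pair (count and percentage) for both the experimental and control groups would make trial outcomes directly comparable across studies.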
This would, however, necessitate that clinical scientists register their trials; select and implement interventions and comparisons in a standard, randomized, blinded fashion; and completely report their results using the CONSORT guidelines. Were this to occur, it would substantially improve the evidence base, reduce variation in care, and improve knowledge of patient outcomes. Moreover, there are also guidelines (e.g., TREND) to help researchers improve the transparency and clarity of reports of non-randomized designs.
There are at least four limitations to our study. First, several instruments exist to assess the methodological quality of systematic reviews, but not all of them have been developed systematically, empirically validated, or generally accepted. Furthermore, since their development, considerable empirical research has accumulated on potential sources of bias in systematic reviews. For example, recent methodological research has highlighted the potential importance of publication language and publication bias in systematic reviews. To address the slightly varying perspectives of these instruments, we averaged the scores of the instruments. Moreover, the average across the group of instruments does not differ radically from the assessment of any individual instrument, suggesting that the average is a reasonable estimate of quality.
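The averaging step is straightforward arithmetic; the sketch below illustrates it with invented scores on the 0–100 scale used throughout (the instrument names are real, but these particular values are hypothetical, not those reported in Tables 3, 4, 5, 6, 7, 8):

```python
# Illustrative sketch: averaging one review's quality scores across
# assessment instruments (0-100 scale). Scores are hypothetical.

def average_quality(scores):
    """Average a review's per-instrument scores into one estimate."""
    return sum(scores.values()) / len(scores)

review_scores = {"AMSTAR": 70.0, "OQAQ": 80.0, "CASP": 75.0}
print(average_quality(review_scores))  # 75.0
```

Because the instruments probe overlapping aspects of methodological quality, a simple unweighted mean is a defensible summary when no single instrument is clearly authoritative.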
Second, our searches identified 211 studies, of which 22 (10.4%) were in excluded languages. Thus, while we attempted to capture the majority of the published literature, we clearly missed a substantial portion of it. Third, we did not examine the reference lists of the identified articles, further increasing the probability of missed information. Fourth, extrapolating our results to the care of individual patients would be difficult. Patients vary in their pathology and clinical characteristics and need a personalized approach to care. Although the overall outcomes of TMJ surgery in clinical trials may not differ from those of non-surgical care, TMJ surgery may still be the best option for specific patients and remains a treatment that should not be forgotten.
The most troubling aspect of our findings involves the ethics of trials that do not meet international standards of conduct, and of patient care that is not based on high levels of evidence. The potential implications of this failing are clearest in terms of the U.S. Supreme Court ruling in Daubert v. Merrell Dow. In that case, the Supreme Court applied the Federal Rules of Evidence for causality of harm, based on the highest level of evidence. This ruling supplanted the common-law test of Frye v. United States, which based rulings on local practice customs. Thus, one might imagine that lawsuits could arise from the application of trial methodology or clinical practice that does not meet international standards.