1. Is the research design appropriate?
The strongest evidence of efficacy is provided by a well-executed randomized controlled trial. In any disease that is not uniformly fatal, improvement unrelated to the intervention can be taken into account by identifying and following a control group of patients who are similar in as many ways as possible to those receiving the intervention.
The best way of achieving comparability between the treatment and control groups is to ensure that every patient entering the study has the same probability of receiving one or the other of the treatments being compared. When this is done, the key terms randomized trial or random allocation should appear in the abstract, the methods section, or even the title of the study. Random allocation eliminates many of the biases that lead to false results and conclusions in non-randomized trials. For example, the claims of efficacy of polio immunization (Francis et al 1955) and of anti-microbial prophylaxis in the secondary prevention of acute rheumatic fever (Evans 1950) arise from randomized, double-blind, placebo-controlled trials.
Non-experimental evidence from cohort, before-after, case-control, and case series designs can provide important information about etiology and the adverse effects of therapy. However, these designs are not suitable for demonstrating efficacy. This leads to the requirement for experimental validation of new drugs, new surgical procedures, and even health services.
There are three exceptions to this requirement for experimental evidence. First, when a disorder is associated with a uniformly fatal outcome, any intervention that saves lives is efficacious, and no randomized trial is necessary. An example is the efficacy of oral rehydration salts (ORS) in severely ill cholera patients, where a dramatic reduction in the case-fatality rate was shown (Levine 1985, Mahalanabis et al 1973). Secondly, when a disorder producing substantial mortality is uniformly cured, a controlled trial is also unnecessary. Thirdly, experimental evidence is not required when a randomized trial would be judged unethical or politically unacceptable; this exception depends on geography and local practice.
Although random allocation is the method most likely to produce comparable study groups, it does not guarantee that the groups will be similar with respect to all important variables. Prognostic stratification prior to randomization is one way to ensure that the study groups are comparable with respect to known prognostic factors.
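As an illustration, stratified randomization can be sketched in a few lines of Python. This is a hypothetical sketch, not a procedure from any trial cited here; the function and variable names are invented for the example, and it uses simple alternation within each shuffled stratum to keep the arms balanced.

```python
import random

def stratified_randomize(patients, stratum_of, seed=0):
    """Allocate patients to 'treatment' or 'control' within each
    prognostic stratum. Illustrative sketch: shuffling then alternating
    assignments keeps the two arms balanced (to within one patient)
    inside every stratum."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    strata = {}
    for p in patients:
        strata.setdefault(stratum_of(p), []).append(p)
    allocation = {}
    for members in strata.values():
        rng.shuffle(members)  # random order within the stratum
        for i, p in enumerate(members):
            allocation[p] = "treatment" if i % 2 == 0 else "control"
    return allocation
```

For example, with ten patients split into a high-risk and a low-risk stratum, each stratum ends up with equal (or near-equal) numbers in the two arms, so a known prognostic factor cannot be concentrated in one group by chance.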
2. Were all relevant outcomes reported?
The outcomes assessed should include all potential components of health status relevant to the intervention being assessed, including quality of life and patient preference. The measurable dimensions of health include death, disease, distress, discomfort, disability, dysfunction, disharmony (family impact), dissatisfaction, disposition (risk factors) and debt. Increasing attention is being directed toward including an assessment of the intervention's impact on the patient's quality of life, i.e. health status, functional abilities, and patient preferences.
Another important issue regarding relevant outcomes is the use of explicit, objective outcome criteria. Outcome criteria should be defined so that they can be applied reproducibly; only then can readers judge whether the reported results for each outcome are likely to be meaningful.
3. Were the study patients or population recognizably similar to your own?
This criterion has two elements. First, the study patients must be recognizable: how they were selected, what diagnostic criteria were used, and the patients' clinical and socio-demographic characteristics must be described in sufficient detail for you to recognize the similarity between the study patients and your own. Second, the study patients must be similar to patients in your practice or community. If the patients are recognizable and similar, you will be able to predict the outcomes to be expected when the specific therapy or program is applied to specific patients or populations.
4. Were both clinical-community and statistical significances considered?
Clinical-community significance refers to the importance of a difference in health outcomes between treated and control patients. This difference is considered clinically significant to the community when it leads to a change in health care practice or community behavior. Statistical significance simply indicates whether a difference is likely to be real, not whether it is important or large; it tells us only how likely it is that the difference is due to chance alone.
Clinically significant changes are reported in terms of relative risk reduction and absolute risk reduction. Relative risk reduction is the reduction in risk in the treatment group expressed as a proportion of the risk in the control group, while absolute risk reduction is the difference in risk between the control and treatment groups. Clinically significant changes are also reported as the number needed to treat (NNT), which is the reciprocal of the absolute risk reduction. Important differences in quality of life are significant clinically and to the community from the patient's perspective. Utility measurement techniques, which quantify the strength of an individual's preference for alternative health outcomes or interventions, are one approach that has been used to address this issue in a number of diseases.
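For concreteness, the three effect measures just defined can be computed directly from a trial's event counts. The following Python sketch uses hypothetical numbers, not data from any study cited in this section:

```python
def risk_measures(events_control, n_control, events_treated, n_treated):
    """Compute the effect measures described above from trial counts.
    Returns (relative risk reduction, absolute risk reduction, NNT)."""
    risk_c = events_control / n_control   # event risk in the control group
    risk_t = events_treated / n_treated   # event risk in the treatment group
    arr = risk_c - risk_t                 # absolute risk reduction
    rrr = arr / risk_c                    # relative risk reduction
    nnt = 1 / arr                         # number needed to treat
    return rrr, arr, nnt
```

With, say, 20 events among 100 control patients and 10 events among 100 treated patients, the absolute risk reduction is 0.10, the relative risk reduction is 0.50, and ten patients must be treated (NNT = 10) to prevent one event.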
The determinants of clinical and community significance are therefore the determinants of change in clinical and community action. The statistical significance of any given result, by contrast, rises (i.e., the p value falls) when the number of subjects in the study is increased, when the health outcomes show less fluctuation from day to day or from patient to patient, and when the measurement of the health outcome is accurate and reproducible.
Thus, certain issues should be considered when reading clinical studies. First, is the reported difference of clinical or community significance? Readers must examine the difference in clinical outcomes to see whether it is of potential significance. If so, is the difference statistically significant? If it is, the results are both real and worthy of implementation.
Second, if the difference is not statistically significant, was the number of patients large enough to demonstrate a clinically or community-significant difference if one occurred? If a study is huge, a difference in health outcomes can be statistically significant even when its magnitude is not clinically significant. On the other hand, if a study is too small, even differences of enormous potential clinical or community significance may not reach statistical significance.
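The dependence of statistical significance on sample size can be illustrated with a standard two-sided z-test for a difference between two proportions. This is a generic textbook test sketched in Python with hypothetical counts, not an analysis from any trial cited here:

```python
import math

def two_prop_pvalue(x1, n1, x2, n2):
    """Two-sided z-test p-value for a difference in two proportions,
    using the pooled standard error. Illustrative sketch only."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = abs(p1 - p2) / se
    # normal-distribution tail probability via the error function
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))
```

The same clinically meaningful difference of ten percentage points (30% vs 20% event rates) is not statistically significant with 50 patients per arm (p of roughly 0.25) but is highly significant with 1000 patients per arm, which is exactly the small-study problem described above.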
5. Is the maneuver or health intervention feasible in your setting?
The health care maneuver or intervention has to be described in sufficient detail for readers to replicate it with precision. This poses special problems in trials evaluating the impact of a 'package' of interventions. For example, in the community-based trial comparing the efficacy of directly observed treatment with antituberculosis drugs to that of self-supervised treatment (Kamolratanakul et al 1999), the intervention consisted not only of anti-TB drugs but also of the availability of treatment supervisors and the appropriate follow-up schedules required for maximum compliance.
The description of the maneuver or intervention in a published report should also indicate whether or not the authors avoided two specific biases in its application: contamination and co-intervention. Contamination occurs when control patients accidentally receive the experimental treatment; this results in a spurious reduction in the difference in outcomes between the treatment and control groups. Co-intervention is the differential application of additional diagnostic or therapeutic acts to either treatment or control patients that could influence clinical outcomes and thereby bias the magnitude of the difference observed between the two groups. Double-blinding (of study patients and clinicians) can be used to prevent co-intervention.
6. Were all patients who entered the study accounted for at its conclusion?
The number of patients entering and finishing the study should be provided in the report. If outcomes are not reported for missing subjects, one approach is to arbitrarily assign a bad outcome to all missing members of the group with the most favorable outcomes. If this maneuver fails to shift the statistical or clinical significance of the results across the decision point, the study's conclusion can be accepted.
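This worst-case maneuver amounts to a simple arithmetic check. The sketch below, in Python with hypothetical counts, assumes the treatment arm showed the more favorable outcomes; it compares crude success rates only and is not a substitute for repeating the full statistical analysis:

```python
def worst_case_still_favors_treatment(success_t, followed_t, missing_t,
                                      success_c, followed_c, missing_c):
    """Worst-case sensitivity check: count every missing patient in the
    (better-performing) treatment arm as a failure and every missing
    control patient as a success, then see whether the treatment arm's
    success rate is still higher. Illustrative sketch only."""
    worst_rate_t = success_t / (followed_t + missing_t)
    worst_rate_c = (success_c + missing_c) / (followed_c + missing_c)
    return worst_rate_t > worst_rate_c
```

For example, if the treatment arm had 80 successes in 100 followed-up patients with 10 lost, and the control arm 60 successes in 100 with 10 lost, the treatment arm remains superior even in the worst case; with more losses and a narrower observed difference, the conclusion would not survive the check.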
7. Are the study results consistent with those of others?
This step concerns whether or not the study results agree with those of others. Differing results can often be explained by differences in the strength of the research methods used.
With these seven guidelines, both health care policymakers and providers should be able to critically assess and judge the validity, applicability, and gaps in the knowledge about the efficacy of a specific health intervention.