
THE JOURNAL OF ALTERNATIVE AND COMPLEMENTARY MEDICINE, Volume 7, Number 3, 2001, pp. 213-218. Mary Ann Liebert, Inc.

EDITORIAL

The Efficacy Paradox in Randomized Controlled Trials of CAM and Elsewhere: Beware of the Placebo Trap


Ever since its formal inauguration in 1945, the double-blind, randomized controlled trial (RCT) has become the Holy Grail of medical methodology (Kaptchuk, 1998a). This is understandable, because the RCT has an advantage that other methods lack: it ideally precludes bias. By randomizing patients to groups and by concealing the allocation, the RCT precludes selection bias. By blinding doctors and patients, this type of trial prevents observation bias. By blinding outcome assessors and statisticians, the RCT prevents reporting bias. Easy, do-it-yourself kits of methodology assessment, such as the Jadad score (Jadad et al., 1996), have incorporated these virtues of RCTs and take it for granted that only studies that include all of these elements can be methodologically sound. This, recursively, leads to the situation that authors of studies, who naturally want to do good studies, seek to fulfill all the criteria posed by quality codings such as the Jadad score. Thus, a standard is created that is then called the "gold standard" without anybody reflecting on how it came to be a gold standard in the first place.

This standard was instilled by a part of the medical community that equated effective therapy with the use of pharmacologically active and powerful drugs (Kaptchuk, 1998b). And, meanwhile, this global and universal hypnotherapeutic intervention has succeeded: at least the public, but also a large part of the scientific medical community, is convinced that treating is always better than not treating, and that treating, if it is to be effective, is treating with specific pharmacologic agents. Thus, an array of implicit presuppositions (Collingwood, 1940) has taken hold of researchers in the medical community and, by virtue of its unreflected nature, makes the gold standard a golden calf (Kaptchuk, 2001). The most important of these presuppositions is also the most dangerous one. It is the presupposition that effectiveness can only be granted if there is efficacy, and that efficacy is identical to specific efficacy against placebo. In what follows, I am going to show that this equation is ill-founded, because it makes presuppositions that are maybe true for pharmacologic interventions but very likely false for complex interventions that are not meant to act directly but indirectly, via stimulation of the autoregulative functions of the body.

To prevent misunderstandings: I am not arguing that the RCT is a bad or even wrong methodology. On the contrary, the RCT is a superb instrument, like the lancet of the surgeon. But you would not want to use the lancet for slicing bread or chiseling marble.

Modern pharmacology was founded on the prime paradigm of modern medicine: Virchow's cellular pathology. This stated that disease is identical to pathologic changes on the cellular level. Thus, understanding the physiologic and cellular nature of the organism in states of health and disease also enabled modern medicine to tailor powerful interventions to prevent or counteract these pathologic changes. This research paradigm has provided us with magnificent insights into the workings of the body's physiology and is far from having reached an end. Pharmacologic interventions try to act on the supposed or known pathology at the cellular level. Aspirin inhibits the synthesis of prostaglandins, thus intervening in the inflammation cycle and preventing the inflammation and pain associated with this cycle. While we have come to know the precise mechanisms of aspirin only recently, it had been in use as a powerful therapeutic agent for decades before the precise knowledge of its mechanism was discovered, solely on the basis of careful observation and experience (Schonauer, 1994). Ideally, this should happen the other way around: we know about the pathologic physiologic processes that we want to inhibit, and we look for a substance that can do this. Hence, specific efficacy is a prerequisite of effectiveness for pharmacologic interventions. It is the very aim of pharmacologic interventions to find a specific inhibitor, enhancer, or modifier for a specific, known pathway to be modified. What we tend to forget is that this is only one of many options for therapeutic interventions.

The modern medical paradigm has adopted this mechanistic cellular-pathologic model and has fared well with this decision, at least concerning acute diseases and interventions. Conventional medicine has fared less well when it comes to chronic diseases. Most of these chronic diseases have to do with a lack of regulation within highly complex interconnected systems, such as the immune system, the interactions between the immune system and the neuroendocrine systems, or even between the psyche and these systems. And with complex diseases such as rheumatoid arthritis, asthma, fibromyalgia, or chronic fatigue syndrome, to name but a few, it is too simplistic to look at one effector and try to find one specific intervention for it.

Enter complex interventions, such as the ones used in complementary and alternative medicine (CAM). These interventions rarely address just one effector mechanistically. They mostly try to stimulate or regulate the organism. Thus, their way of achieving efficacy is nonspecific, although they use very specific theories and interventions. The important point to note here is that the specificity stipulated within a model, such as the specificity of homeopathic remedies, need not necessarily be a specificity on the level of physiologic processes. It may be just a clever trick to bring about quite nonspecific changes. These CAM methods spur the organism to do something by itself, to find its original balance. It is possible that these methods also have specific parts but, if so, in a different sense than modern pharmacologic interventions. And it is very likely that a large part of their effectiveness is accounted for by their nonspecific effects. This leads to what I call the efficacy paradox, which is illustrated in Figure 1. This is a thought experiment, i.e., an idealized situation, which can easily be converted into practical reality.

FIG. 1. Thought experiment illustrating the efficacy paradox. See explanation in text.

Imagine the following situation, as depicted in the figure: Let there be two treatments, x and y, for the same condition, say chronic pain. Let there be two placebo-controlled RCTs with comparable patient populations. In every one of these trials we will have measurement artifacts caused by the unreliability of measures; let them be equal in all groups. In every one of these trials we will also have regression to the mean, as a statistical artifact and as a result of the natural course of the disease studied; some patients will improve regardless of the treatment applied. Then there will be nonspecific treatment effects: patients expect to get better when treated, especially in a trial. Hope will work against the general demoralization caused by disease. The attention of doctors and nurses within the context of a trial, and perhaps the special attention paid to patients within the context of a particular CAM intervention such as homeopathy, healing, or acupuncture, will also contribute to the nonspecific part of improvement. Let us not forget that a treatment that can help patients to understand their suffering by providing an explanation, a common explanatory myth, is a therapeutic factor, too (Frank, 1989). And then there will be specific factors of treatment. Let us assume that treatment y is specifically effective. Its specific efficacy will be 20%, which, in a trial that is adequately powered, will be significant. Thus, everybody will conclude: treatment y is an effective treatment for chronic pain. Treatment x has only 10% specific efficacy, and let us assume that studies of treatment x are generally underpowered to find this effect. Everybody will conclude: treatment x is an ineffective treatment for chronic pain. What is usually overlooked is the fact that the nonspecific treatment effects of treatment x are much larger. In the thought experiment, I have chosen them to be 30% for treatment x; for treatment y, they would be only 5%. In such a case treatment x, although overall much more powerful, with 70% of patients potentially benefitting from it by virtue of its strong nonspecific effects, would be neglected in favor of treatment y, with 55% of patients benefitting from it, because y has a stronger specific treatment effect.
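To make the arithmetic of the thought experiment explicit, here is a minimal sketch in Python. The 30% baseline improvement shared by all groups (natural course, regression to the mean, measurement artifacts) is an assumption implied by the stated totals of 70% and 55%, not a figure given explicitly above; the other numbers are the illustrative ones from Figure 1.

```python
# Sketch of the efficacy-paradox arithmetic from the thought experiment.
# BASELINE is the improvement occurring in every arm regardless of treatment
# (natural course + regression to the mean + measurement artifacts); its value
# is inferred from the stated totals and is purely illustrative.

BASELINE = 0.30

treatments = {
    # name: (nonspecific effect, specific effect)
    "x": (0.30, 0.10),
    "y": (0.05, 0.20),
}

for name, (nonspecific, specific) in treatments.items():
    total = BASELINE + nonspecific + specific
    # A placebo-controlled RCT credits the treatment only with `specific`.
    print(f"Treatment {name}: ~{total:.0%} of patients improve, "
          f"but only {specific:.0%} counts as 'efficacy' against placebo")
```

Running this reproduces the paradox: treatment x helps about 70% of patients yet shows only 10% specific efficacy, whereas treatment y helps about 55% yet is the one declared effective, because its 20% specific effect is the larger and the more easily detected.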

I maintain that this situation is frequently true for CAM therapies. Studies are often underpowered, e.g., for acupuncture, and thus potential specific effects are overlooked. The conclusion of reviewers and the educated public is then the verdict "inconclusive evidence" (Ezzo et al., 2001), and the political consequence, as just happened in Germany, is the decision not to include acupuncture in the scheme for public reimbursement, because the evidence for specific efficacy is inconclusive (Bundesausschuss Ärzte und Krankenkassen, 2001). However, nobody pays attention to the possibility that it is the magnitude of the nonspecific effects, and not the specific effects, that makes a treatment effective. An even more complicated situation can arise when the circumstances of a trial, such as blinding and changing the natural flow of patient-doctor interaction and treatment sequences, change the context of a treatment dramatically and thus alter the potential nonspecific effects in a detrimental way. This can happen in blinded trials of homeopathy, in which insecurity arises from the blinding of doctors, and also in trials of acupuncture, when blinding procedures make it necessary that the doctor who takes the case and makes the assessment is different from the person who administers the treatment. In all such cases, trials may alter the context of a treatment, diminish potent nonspecific factors, and thereby underestimate effectiveness.

That the thought experiment is not just cooked up but simply an idealization of reality can be seen in a recent example of research in CAM. Abbot and colleagues conducted an intelligent trial of healing (Abbot et al., 2001). Sixty patients who were suffering from chronic pain were randomized to receive either real healing or sham healing. Sham healing was performed by stage actors who were otherwise not acquainted with healing. They studied healers and mimicked their conduct during a typical healing session such that, for the patients, it was not discernible which was true and which was sham healing. The second part of the study was a study of distant healing, in which healers were either present behind a one-way mirror or not present there. This second study was clearly negative. We therefore will not deal with this part any more but focus on the first one. Patients were seen twice for baseline measurement, a laudable attempt to minimize error variance in measurement. The main outcome criteria were the Pain Rating Intensity Total Score of the McGill Pain Questionnaire, a validated and frequently used measure, and the general score of the Measure Your Medical Outcome Profile (MYMOP), an individual scaling procedure for individual problems. Other measures, which we leave aside, were the Medical Outcomes Study Short Form 36, depression and anxiety measures, a Visual Analogue Scale of pain, and specific MYMOP measures. We focus only on two major points.

The overall result was negative. There was no specific difference, the authors claim, between healing and sham healing, and thus healing is not effective for chronic pain. The authors provided a power analysis, which stipulated a specific effect of d = 0.8. The effect size d is a standardized mean difference, standardized by the standard deviation. It is frequently used, particularly if outcome measures are continuous, as in pain measurement. We know of no treatment for patients with chronic pain, pharmacologic or otherwise, that has a specific effect of that size. After all, one of the definitions of chronic pain is that it is resistant to known effective treatment. Recent meta-analyses of treatments for chronic pain have found "large" effects of the size of d = 0.48 to d = 0.6. Nonsteroidal anti-inflammatory drugs for chronic pain reach effect sizes of d = 0.68, but often have smaller effects (Ward and Lorig, 1996). Patient education trainings for patients with rheumatoid arthritis reduce pain by d = 0.2 (Ward and Lorig, 1996) and are considered worthwhile today. Behavioral interventions have effect sizes of d = 0.5, which is considered by the authors of the meta-analysis to be "large" (Morley et al., 1999), and antidepressants given to patients in pain have effects of the same size, d = 0.48 (Fishbain et al., 1998). Thus, an effect size of d = 0.8 is quite unrealistic from the outset. Remember that it is still specific effectiveness we are talking about.

Now if we look at the actual data and leave aside some statistical technicalities, we see the following: the within-group effect sizes for the primary outcome measures are quite large. These are measures based on pre-post differences per group; they reflect the improvement patients have experienced in each particular group, with measurement error, regression to the mean, nonspecific effects, and specific effects all included. This effect amounts to d = 1.12 in the healing group and d = 0.83 in the sham-healed group for the main outcome measure, and is somewhat smaller, with d = 0.62 in the healed group and d = 0.34 in the sham group, for the secondary outcome measure. The difference between the groups that reflects the specific effect is d = 0.29/0.28, which is small but consistent across the primary and secondary outcome measures. (Note that the specific effect is not just the arithmetic difference of the within-group effect sizes, because the standard deviations are different.) We should also mention that other measures, such as the scales used, produce a paradoxical picture, because sometimes the control group does better than the treated group. I take this to be caused by the instability of measurement with only 25 subjects per group, because those scales usually provide stable measurements only with larger numbers. Thus, the bottom line of this RCT of healing really is: there is a huge effect for patients treated with healing, but it is mainly a nonspecific effect. The specific effect, although present, is small and was not detectable with a study powered to detect an effect of d = 0.8. Thus, the verdict of the trial is "healing is ineffective." This is a precise illustration of the efficacy paradox described above.
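A rough sample-size sketch makes concrete why a specific effect of d = 0.29 could not register in this trial. It uses the standard normal-approximation formula for a two-group comparison at two-sided alpha = 0.05 and 80% power; the per-group size of roughly 25 is the figure mentioned above, and the calculation is an illustration, not a reanalysis of the trial data.

```python
# Rough sketch: approximate per-group sample size needed to detect a
# standardized mean difference d in a two-group comparison
# (normal approximation, two-sided alpha = 0.05, power = 0.80).
from math import ceil

Z_ALPHA = 1.96  # critical value for two-sided 5% significance
Z_POWER = 0.84  # corresponds to roughly 80% power

def n_per_group(d: float) -> int:
    """Approximate per-group n for detecting effect size d."""
    return ceil(2 * (Z_ALPHA + Z_POWER) ** 2 / d ** 2)

print(n_per_group(0.80))  # ~25 per group: roughly what the trial enrolled
print(n_per_group(0.29))  # ~187 per group: what the observed specific
                          # effect would have required
```

In other words, a trial powered for d = 0.8 is, by construction, blind to a specific effect of the size actually observed.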

If we convert an effect size of d = 1.12 into what is known as Rosenthal's r, we get something like r = 0.8. If we use Rosenthal's binomial effect size display (Rosenthal, 1991), we can say that, with such an effect size, approximately 90% of the patients treated would improve (the conversion is sketched below). Let us be conservative and turn the estimate down to 60%, which was approximately the size we found in our own trial (Wiesendanger et al., 2001). My question is: would a patient with chronic pain, who is told that a treatment, specific or not, sham or not, gives him or her a 60% chance of improving, decline treatment upon being told that "however, we must warn you that the treatment is not proven to be effective in the strict scientific sense"? What we have here is a real-world example of the efficacy paradox: a treatment is not proven efficacious in the strict sense of the word, because a trial was unsuccessful in proving specific efficacy. However, the treatment is immensely effective in relieving patients overall, probably even more effective than proven efficacious treatments are, precisely because the nonspecific parts of the treatment are large, probably larger than in other types of treatment.
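For reference, here is a minimal sketch of the binomial effect size display as described in Rosenthal (1991): a correlation r is displayed as the contrast between notional success rates of 0.5 + r/2 and 0.5 - r/2 in two equal-sized groups. The value r = 0.8 is simply the figure cited in the text above.

```python
# Minimal sketch of Rosenthal's binomial effect size display (BESD).
# A correlation r is displayed as the difference between "success" rates
# of 0.5 + r/2 (treated) and 0.5 - r/2 (comparison) in two equal groups.

def besd(r: float) -> tuple[float, float]:
    """Return (treated_success_rate, comparison_success_rate) for correlation r."""
    return 0.5 + r / 2, 0.5 - r / 2

treated, comparison = besd(0.8)  # r = 0.8, the value cited in the text
print(f"treated ~{treated:.0%} vs. comparison ~{comparison:.0%}")
# -> treated ~90% vs. comparison ~10%
```

This appears to be where the "approximately 90%" figure above comes from; the conservative 60% used in the argument is an empirical figure from the authors' own trial, not a BESD conversion.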


This is so because we remain hypnotized by "placebo." Placebo is the bad guy of pharmacologic research. Legal requirements demand that a newly introduced drug be better than the nonspecific elements of treatment, and justly so, because pharmacologic treatments also introduce risks, cost money, and produce waste products. Because of this situation, we are used to jumping to the conclusion that placebo effects are bad for everybody. But this is not so. Placebo effects exemplify all those processes that are not at the direct command of doctors and researchers. These effects exemplify all those elements of a treatment that lead to improvements without doctors or patients knowing how and why. These effects are the reflection of the self-healing tendencies of an organism, possibly because of expectancy effects (Kirsch, 1997), possibly because of complex activations of psychoneuroimmunologic processes via the tiny specific stimuli of a treatment, such as the needling of acupuncture, the rituals of homeopathy, or the good intentions of spiritual healing. To just dump all these effects into the huge waste bin of medical research and dub them "nothing but placebo" is, at best, scientific stupidity and testifies to an unwillingness to use one's brain and to differ in the face of mainstream opinion. To call all these effects "nothing but artifacts" (Kirsch, 1997) is to fail to see one's own actions. Ludwik Fleck, the Jewish-German developer of a typhus vaccine, who advanced the thesis of paradigmatic changes in science one generation before Thomas Kuhn, once said: "A scientific fact is a collective decision to stop thinking" (Fleck, 1980). This is true for the notion of placebo: people just dump everything into the container they label "placebo." And every therapeutic intervention that is to be accepted as effective needs to be better than "placebo." But what if a therapeutic intervention is better than anything else, yet not better than placebo, because the intervention is an exemplification of placebo processes, namely of the self-healing capacity of the organism? What if "placebo" has completely different meanings depending on the context in which it is used? What if placebo effects are different in trials and in normal everyday practice? What if the magnitude of placebo effects outperforms the magnitude of the so-called specific effects? There is some evidence that this is the case (Kirsch and Sapirstein, 1999; Walach and Maidhof, 1999; Maidhof et al., 2000).

The placebo trap is the idea that only specific effects over and above placebo effects are worth looking for, worth demonstrating, and worth achieving. This is only true for the small sector of research that has the aim of proving the efficacy of newly developed drugs. It is not true for all those complex interventions that do not rely on one mechanistic specific agent but intervene in a more complex fashion, stimulating organisms toward self-healing actions.

In order to beware of the placebo trap, we need to do three things:

(1) Argue against all those would-be methodologic popes who want to make everybody believe that efficacy is identical with specific efficacy against placebo.

(2) Diversify research strategies to use multiple methods, such as randomized comparison trials of CAM therapies against standard care or against waiting lists. This will enable us to quantify general therapeutic effects. Other research options would be large outcome studies or comparative cohort studies in natural settings to address selection processes.

(3) Start emptying the placebo waste bin and disentangling what it contains. Perhaps, at the bottom, we will find what is at the base of every true healing process: the capacity of the organism to heal itself. Starting to ask the question "What is self-healing, after all?" will be the beginning of meaningful research and will point the way out of the placebo trap.

    REFERENCES

Abbot NC, Harkness EF, Stevinson C, Marshall FP, Conn DA, Ernst E. Spiritual healing as a therapy for chronic pain: A randomized, clinical trial. Pain 2001;91:79-89.

Bundesausschuss Ärzte und Krankenkassen. Akupunktur: Zusammenfassender Bericht des Arbeitsausschusses "Ärztliche Behandlung" des Bundesausschusses der Ärzte und Krankenkassen über die Beratungen der Jahre 1999 und 2000 zur Bewertung der Akupunktur gemäß § 135 Abs. 1 SGB V. 2001.

Collingwood RG. An Essay on Metaphysics. Oxford: Clarendon Press, 1940; reprinted 1998.


Ezzo J, Lao L, Berman BM. Assessing clinical efficacy of acupuncture: What has been learned from systematic reviews of acupuncture? In: Stux G, Hammerschlag R, eds. Clinical Acupuncture: Scientific Basis. Heidelberg: Springer Verlag, 2001:113-130.

Fishbain DA, Cutler RB, Rosomoff HL, Rosomoff RS. Do antidepressants have an analgesic effect in psychogenic pain and somatoform pain disorder: A meta-analysis. Psychosom Med 1998;60:503-509.

Fleck L. Entstehung und Entwicklung einer wissenschaftlichen Tatsache: Einführung in die Lehre vom Denkstil und Denkkollektiv. Mit einer Einleitung herausgegeben von L. Schäfer und T. Schnelle. Frankfurt: Suhrkamp, 1935; reprinted 1980.

Frank JD. Non-specific aspects of treatment: The view of a psychotherapist. In: Shepherd M, Sartorius N, eds. Non-Specific Aspects of Treatment. Bern: Huber, 1989:95-114.

Jadad AR, Moore RA, Carroll D, Jenkinson C, Reynolds DJM, Gavaghan DJ, McQuay H. Assessing the quality of reports of randomized clinical trials: Is blinding necessary? Control Clin Trials 1996;17:1-12.

Kaptchuk TJ. Intentional ignorance: A history of blind assessment and placebo controls in medicine. Bull Hist Med 1998a;72:389-433.

Kaptchuk TJ. Powerful placebo: The dark side of the randomised controlled trial. Lancet 1998b;351:1722-1725.

Kaptchuk TJ. The double-blind randomized controlled trial: Gold standard or golden calf? J Clin Epidemiol 2001; in press.

Kirsch I. Specifying nonspecifics: Psychological mechanisms of placebo effects. In: Harrington A, ed. The Placebo Effect: Interdisciplinary Explorations. Cambridge, MA: Harvard University Press, 1997:166-186.

Kirsch I, Sapirstein G. Listening to Prozac but hearing placebo: A meta-analysis of antidepressant medications. In: Kirsch I, ed. Expectancy, Experience, and Behavior. Washington, DC: American Psychological Association, 1999:303-320.

Maidhof C, Dehm C, Walach H. Placebo response rates in clinical trials: A meta-analysis (abstract). Int J Psychol 2000;35:224.

Morley S, Eccleston C, Williams A. Systematic review and meta-analysis of randomized controlled trials of cognitive behaviour therapy and behaviour therapy for chronic pain in adults, excluding headache. Pain 1999;80:1-13.

Rosenthal R. Meta-Analytic Procedures for Social Research. Newbury Park, CA: Sage, 1991.

Schonauer K. Semiotic Foundation of Drug Therapy: The Placebo Problem in a New Perspective. Berlin, New York: Mouton de Gruyter, 1994.

Walach H, Maidhof C. Is the placebo effect dependent on time? In: Kirsch I, ed. Expectancy, Experience, and Behavior. Washington, DC: American Psychological Association, 1999:321-332.

Ward MM, Lorig KR. Patient education interventions in osteoarthritis and rheumatoid arthritis: A meta-analytic comparison with nonsteroidal antiinflammatory drug treatment. Arthritis Care Res 1996;9:292-301.

Wiesendanger H, Werthmüller L, Reuter K, Walach H. Chronically ill patients treated by spiritual healing improve in quality of life: Results of a randomized waiting-list controlled study. J Altern Complement Med 2001;7:45-51.

Address reprint requests to:
Harald Walach, Ph.D.
Department of Environmental Medicine and Hospital Epidemiology
University Hospital Freiburg
Hugstetterstraße 55
79106 Freiburg
Germany

    E-mail: [email protected]
