Effects of Belladonna 12 CH and 30 CH in Healthy Volunteers

 

A Multiple, Single-Case Experiment in Randomization Design

 

 

H. WALACH, S. HIEBER & E. ERNST-HIEBER

University of Freiburg, Department of Psychology, D - 79085 Freiburg

 

1 - Introduction
2 - Method
3 - Results
4 - Discussion
5 - References

 

1. Introduction

The practice of homoeopathy rests on two principles: the law of similars and the potentization of homoeopathic remedies. The law of similars states that like should be cured by like: "Similia similibus curentur" (Walach, 1986). This means, for example, that certain types of fever which exhibit symptoms resembling an intoxication with deadly nightshade can be cured by the homoeopathic preparation Belladonna, which is prepared from this very plant. While this principle has been known for a long time, it was the German doctor and pharmacist Samuel Hahnemann who made practical use of it, by giving drug substances to volunteers, noting down the symptoms they experienced, and using these same symptoms as indications for the remedy in cases of disease. His volunteers sometimes experienced quite toxic reactions. Therefore, Hahnemann diluted the remedial substances stepwise, by adding 99 drops of alcohol to one drop of substance and shaking vigorously. This procedure he called "potentization" or "dynamisation". For every dilution step he used a new glass vial. He experimented with very high dilutions, not knowing that he had gone far beyond the limit set by Avogadro's number, which states that $6.023 \times 10^{23}$ molecules are contained in one mole of any substance. As a consequence of the homoeopathic potentization procedure, no molecules of the original substance are to be expected in preparations beyond 12 CH or 24 XH, that is, 12 stepwise agitated dilutions in the ratio 1:100, or 24 stepwise agitated dilutions in the ratio 1:10. Hahnemann, however, not knowing about Avogadro's number, very frequently used potencies as high as 30 CH or higher, and in the 6th edition of his Organon (Hahnemann 1979) he even prescribed 30 CH as a standard potency for curative and experimental procedures alike.
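The arithmetic behind this threshold can be sketched briefly. The following is a minimal illustration only, under two assumptions that are ours and not part of the study: the starting material is taken (generously) to contain one full mole of active substance, and the modern value of Avogadro's constant is used. The function name `expected_molecules` is hypothetical.

```python
from fractions import Fraction

# Modern SI value of Avogadro's constant (an assumption of this sketch).
AVOGADRO = 6.02214076e23  # molecules per mole


def expected_molecules(steps: int, ratio: int, start_moles: float = 1.0) -> float:
    """Expected number of molecules left after `steps` serial dilutions of 1:`ratio`.

    Assumes ideal dilution starting from `start_moles` of pure substance.
    """
    dilution = Fraction(1, ratio) ** steps  # exact dilution factor
    return start_moles * AVOGADRO * float(dilution)


# 12 CH and 24 XH give the same dilution factor of 10^-24; 30 CH gives 10^-60.
for label, steps, ratio in [("12 CH", 12, 100), ("24 XH", 24, 10), ("30 CH", 30, 100)]:
    print(label, expected_molecules(steps, ratio))
```

Under these assumptions, 12 CH and 24 XH leave on average less than one molecule of the starting material, and 30 CH leaves many orders of magnitude less.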

                The law of similars was, in Hahnemann's time, a tradition with no direct practical implications. But by testing the available drug substances, Hahnemann found a working basis. Since then homoeopaths have "proven" new remedial substances by giving them in both crude and diluted form to healthy volunteers and noting down their symptoms. These symptoms form the homoeopathic materia medica. Rarely, however, has anyone tried to find out whether the symptoms supposedly produced by these substances in healthy volunteers are in fact distinctive symptoms, or whether the symptoms experienced by the provers are due to the incidental circumstances of observation and other unspecific factors, and are therefore to be considered placebo symptoms.

                In a previous study (Walach, 1993) it was shown that, although differences between placebo and verum could be seen on a single-case basis in a double-blind crossover trial, there was no difference between placebo and verum symptoms on a group level. This was due to the fact that some persons clearly showed more symptoms under the test substance, Belladonna 30 CH, while other provers clearly showed more symptoms under placebo. Since this trial was not designed as a single-case trial, no statistics could be provided to test for significant effects in single subjects. We therefore decided to design a study which would allow conclusions to be drawn on a single-case basis.

                We chose a randomization design proposed by Edgington (1987), which is a true single-case experiment, and can be evaluated statistically on a single-case basis. In this design treatment periods are randomized such that the single subject undergoes a randomized series of treatments not knowing which treatment is to be applied at what time period. A full report of the study is available (Ernst-Hieber and Hieber, 1995).

2. Method

We randomized 4 weeks of treatment and 4 weeks of placebo per person, preceded by an observational run-in week.

This allows for $\frac{8!}{4! \cdot 4!} = 70$ permutations.

For each test subject one of these 70 permutations was chosen at random. Test substances were plant extracts of Belladonna prepared in 12 CH (group I) and 30 CH (group II). The test substances were freshly prepared by the German manufacturer Deutsche Homöopathie Union (DHU), Karlsruhe, according to the German homoeopathic pharmacopoeia (Homöopathisches Arzneibuch, HAB). The homoeopathic dilution process yields a substance containing $10^{-24}$ and $10^{-60}$ parts of the original material, respectively, dilutions commonly used in homoeopathy for experimental and curative purposes. The test substances were dispensed as size 3 globules. Placebo consisted of the same batch of sugar globules medicated with the same amount of the unsuccussed alcohol used for the potentization procedure of Belladonna. Placebo and verum substances were indistinguishable. One container per week was dispensed, labeled with the serial number of the week.

                A permutation table containing all 70 possible permutations was sent to the department of biostatistics of the University of Ulm (Prof. Wilhelm Gaus), who, using a random number generator, independently selected 40 sequences from the possible permutations. These sequences were sent to the manufacturer, who packed the test substances according to the instructions, which were fixed in advance in a protocol. The code was held by the manufacturer. Sealed envelopes containing a copy of the code were sent to doctors who could be consulted in case of an emergency or of an illness which might be traceable to the test substances. The code was not handed out until all data had been collected and a printout of the data with an electronic copy had been deposited at the department of biomedical statistics, University of Ulm. Thus, the study was strictly double-blind.
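The construction of the permutation table and the random selection of sequences can be sketched as follows. This is an illustrative reconstruction, not the procedure actually used in Ulm: the function names, the seed parameter, and the choice to draw with replacement (our reading of "independently") are assumptions.

```python
import itertools
import random

WEEKS = 8        # experimental weeks after the run-in week
VERUM_WEEKS = 4  # weeks in which verum (Belladonna) is given


def permutation_table():
    """Return all sequences of 4 verum ('V') and 4 placebo ('P') weeks."""
    table = []
    for verum_slots in itertools.combinations(range(WEEKS), VERUM_WEEKS):
        table.append("".join("V" if w in verum_slots else "P" for w in range(WEEKS)))
    return table


def draw_sequences(n_subjects, seed=None):
    """Draw one sequence per subject, independently and with replacement (assumed)."""
    rng = random.Random(seed)
    table = permutation_table()
    return [rng.choice(table) for _ in range(n_subjects)]


table = permutation_table()
print(len(table))                   # number of possible sequences
print(draw_sequences(40, seed=1)[:3])  # first three of 40 drawn sequences
```

Drawing with replacement means that two subjects may receive the same sequence, which does not affect the single-case logic because each subject is evaluated against the full set of 70 permutations.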

                Data collection was done using a diary which presented the subjects with predefined symptoms derived from the homoeopathic materia medica, in order to query subjects on Belladonna as well as non-Belladonna symptoms. Apart from general well-being, subjects were asked to report on any changes and, if so, what type of changes these were (e.g. did their well-being change for better or worse, did it change suddenly or slowly, during the day or during the night, was it aggravated or improved by heat or cold, what types of changes did they notice, and where did the changes occur). A sample copy of the diary can be obtained from the authors.

Using the key symptoms and modalities of Belladonna, it was decided in advance which kinds of responses would be counted as Belladonna and non-Belladonna symptoms, respectively. There was an equal number of possible Belladonna and non-Belladonna responses. Belladonna and non-Belladonna symptoms are given in table 1.

                We aimed at unselected subjects unfamiliar with the theory and nature of homoeopathy. Advertisements were distributed in several computer newsnets, placed in public places, and handed out as leaflets. Subjects were in good health, in no need of medication, under no medical treatment or supervision, and were not heavy users of alcohol, caffeine or recreational drugs. Other criteria which might be handled as exclusion criteria in traditional homoeopathic remedy provings were noted down in a questionnaire, but were not applied, as their validity has not been experimentally confirmed. These included an unstable lifestyle and the anticipation of major life events, of stress, and of changes in daily habits. Subjects were also screened using the Freiburg Complaint List (FBL) (Fahrenberg, 1994) and the Freiburg Personality Inventory (FPI) (Fahrenberg, Hampel & Selg 1989), two widely used and validated instruments. They were informed about the possibility of experiencing transitory symptoms according to homoeopathic theory, and of the essentially harmless nature of the experiment, and also that they would receive a preparation which was pharmaceutically inactive, as it would not contain any molecule of an active substance, but which nonetheless was claimed to be able to produce transitory symptoms. They all gave written informed consent. They were instructed to take 3 globules of the respective test substance twice on Mondays, to throw away the rest of the container, and to use a new container each week, according to the serial number.

          

TABLE 1. Belladonna and non-Belladonna symptoms asked for in the diary
_________________________________________________________

Belladonna symptoms          non-Belladonna symptoms
                             (chosen at random)
_________________________________________________________

Modalities
better warm                  better cold
worse cold                   worse warm
sudden change                slow change
during night                 during day

Type of changes
redness                      paleness
swelling                     weakness
heat                         cold
pain                         itching
pulsating                    strong thirst
dryness                      lameness
pressure/fullness            disturbed sleep
restlessness                 trembling

Localisations
head                         breast
neck and throat              stomach
skin                         back
limbs
_________________________________________________________

 

 

                The subjects were provided by mail with a full set of test substances, diaries and written instructions on how to conduct the data collection and how to proceed with the test substances. These included some precautions, e.g. not to place the test substances under any electromagnetically active device, not to expose them to exceptional heat or cold, and to store them in a safe and neutral place.

                Statistical evaluation was done in the following way: for each week a Belladonna score was calculated by summing up all the Belladonna symptoms which had been experienced in that week. The non-Belladonna scores were not used for calculation. The difference between the mean Belladonna scores of the four Belladonna weeks and the four placebo weeks was calculated. This calculation was repeated for all 70 possible permutations, using an author-written Basic program which performed the permutations and calculations. Then the number of differences greater than or equal to the empirically obtained one was determined. This number divided by 70 yielded the exact probability that this distribution of Belladonna scores could have been obtained by chance. We conducted one-sided tests (i.e. more Belladonna symptoms with Belladonna). This procedure is clearly described by Edgington (1987). Visual analysis was done using the graph of the data as shown in the figures below. The general logic of single-case randomization experiments opens the possibility of conducting independent experiments with individual subjects. We hypothesized that, if there were any effects present in the potentized homoeopathic dilutions, then at least one subject out of a group of 20 people should show a reaction. This design does not produce multiple tests out of one experiment but is a series of independent experiments.
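The evaluation described above can be sketched in a few lines. This is not the authors' Basic program but a minimal Python reconstruction under stated assumptions: the function name `randomization_test` is ours, and the weekly scores reported with figure 4 are assumed to be listed in chronological order within each condition.

```python
from itertools import combinations


def randomization_test(scores, verum_weeks):
    """One-sided randomization test (more symptoms expected under verum).

    scores      -- Belladonna score of each experimental week, in chronological order
    verum_weeks -- indices of the weeks in which verum was taken
    Returns the observed difference of means and the exact p-value.
    """
    n, k = len(scores), len(verum_weeks)
    total = sum(scores)

    def mean_difference(indices):
        verum_sum = sum(scores[i] for i in indices)
        return verum_sum / k - (total - verum_sum) / (n - k)

    observed = mean_difference(verum_weeks)
    # Recompute the statistic for every possible assignment of verum weeks.
    diffs = [mean_difference(c) for c in combinations(range(n), k)]
    at_least = sum(1 for d in diffs if d >= observed)
    return observed, at_least / len(diffs)


# Scores reported with figure 4 (sequence P V P V V P V P after the run-in week).
sequence = "PVPVVPVP"
verum_scores = iter([6, 6, 0, 2])
placebo_scores = iter([0, 0, 0, 0])
scores = [next(verum_scores) if s == "V" else next(placebo_scores) for s in sequence]
verum_weeks = [i for i, s in enumerate(sequence) if s == "V"]

observed, p = randomization_test(scores, verum_weeks)
print(observed, round(p, 3))  # prints: 3.5 0.071
```

For these data only 5 of the 70 permutations place all three non-zero weeks among the verum weeks, so the p-value is 5/70, which agrees with the value of .071 reported for figure 4.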

3. Results

Seventeen subjects volunteered to be enrolled in group I (Belladonna 12 CH). Three out of these 17 subjects dropped out and did not continue the experiment. The final group consisted of 14 subjects. In group II (Belladonna 30 CH) 16 subjects volunteered to participate, 5 dropped out leaving a sample of 11 subjects. Dropouts were not related to the test substances, but were motivated by lack of interest or bad compliance. Demographic data are presented in table 2.

                In group I (Belladonna 12 CH) one subject showed the expected changes on visual analysis, i.e. more Belladonna symptoms with Belladonna. This result is not significant, however. One subject showed unexpected changes, namely more Belladonna symptoms with placebo (p=0.057). A similar result was obtained in group II (Belladonna 30 CH): one experiment showed a statistical tendency, with more Belladonna symptoms with Belladonna (p=0.071); two more experiments showed striking effects on visual analysis, yet yielded no significant statistics. Data and graphs are presented in figures 1-6.

                The figures present the data in the following way: the first week was always an observation week with placebo for run-in purposes, which was not counted. Then follow the 8 experimental weeks in random order of placebo and Belladonna weeks, indicated by the dotted line (hills: placebo, troughs: Belladonna). The number of Belladonna symptoms is given in black bars, and the number of non-Belladonna symptoms in white bars. The left ordinate gives the absolute frequency of symptoms, the right ordinate gives the rating of general well-being on a 10 cm visual analogue scale. The numerical data present the number of Belladonna symptoms per week.

As can be seen, a variety of patterns arises:

 

 

 

TABLE 2. Demographic data
________________________________________________________________

Group                     I (Belladonna 12 CH)      II (Belladonna 30 CH)
________________________________________________________________

Marital status
unmarried                 10                        12
married                   6*                        3**

Sex
female                    8                         9
male                      8*                        7

Age
mean (range)              35 (27-52)                33 (24-39)

Education
A-level                   13                        13
secondary education       3*                        2
primary education         0                         1
________________________________________________________________
*  only 16 subjects returned the questionnaire.
** only 15 subjects returned the questionnaire.

 

                We observed every type of reaction, from low-level responders presenting nearly no symptoms to highly responsive subjects presenting a large number of symptoms. Not only Belladonna symptoms were reported but also a large number of non-Belladonna symptoms. The experiment of subject number 16 (figure 1) showed 4 Belladonna symptoms in the last Belladonna week and no symptoms at all in any of the other weeks, with only 2 non-Belladonna symptoms in that Belladonna week. However, the randomization test was not significant.
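A short counting argument indicates why a pattern of this kind cannot reach significance in this design. Assuming, as described for subject number 16, that only one of the eight weeks has a non-zero Belladonna score, the observed difference is matched by every permutation that happens to place this one week among the four Belladonna weeks. The remaining three Belladonna weeks can then be chosen freely from the other seven weeks:

```latex
p = \frac{\binom{7}{3}}{\binom{8}{4}} = \frac{35}{70} = 0.5
```

This agrees with the probability of .5 reported with figure 1.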

            Subject 8 experienced a great number of Belladonna symptoms and a smaller number of non-Belladonna symptoms during weeks 5 and 6, that is, 2 weeks after having taken Belladonna 12 CH. The Belladonna symptoms were pain in the throat and related symptoms. A smaller cluster of symptoms can be seen in week 7 (Belladonna) and in week 8 (placebo). There were clearly more Belladonna symptoms under placebo (p=0.057). It should be noted that not only were more symptoms reported with placebo, but more specific Belladonna symptoms, as the non-Belladonna symptoms were not included in our evaluation.

                Subject number 24 (Belladonna 30 CH, figure 3) showed an interesting pattern, albeit not significant: symptoms were exhibited in three of the four Belladonna weeks, always on the day following administration.

The experiment with subject no. 31 (Belladonna 30 CH, figure 4) shows a statistical tendency (p=0.071) and a clear picture: altogether 14 Belladonna symptoms occurred, all of them in the Belladonna weeks and none in the placebo weeks. A number of non-Belladonna symptoms also occurred in the Belladonna weeks, which were not counted.

            There were also a couple of other interesting observations which, however, did not yield significant results. Subject number 30 showed visually impressive results which for some reason were not reflected in a significant statistical effect (figure 5).

 

Figure 1. Striking in visual analysis but not statistically significant; more Belladonna symptoms with Belladonna 12 CH.

Sequence of weeks (P = placebo, V = verum): Run-in, P, P, V, P, V, V, P, V

Placebo (score per week): 0  0  0  0

Verum (score per week): 0  0  0

Probability: .5

 

 

Figure 2. Significant randomization test, more Belladonna symptoms with placebo (Belladonna 12 CH).

Sequence of weeks (P = placebo, V = verum): Run-in, V, V, P, P, P, V, P, V

Placebo (score per week): 2  23  31  22

Verum (score per week): 4  4  12

Probability: .057

 

Figure 3. Striking in visual analysis; more Belladonna symptoms with Belladonna 30 CH. The Belladonna symptoms appeared on the day following administration.

Sequence of weeks (P = placebo, V = verum): Run-in, P, V, P, V, V, P, P, V

Placebo (score per week): 3  0  0  2

Verum (score per week): 7  1  2

Probability: .357

 

Figure 4. Significant randomization test, striking in visual analysis; more Belladonna symptoms with Belladonna 30 CH.

Sequence of weeks (P = placebo, V = verum): Run-in, P, V, P, V, V, P, V, P

Placebo (score per week): 0  0  0  0

Verum (score per week): 6  6  0  2

Probability: .071

 

Figure 5. More Belladonna symptoms in visual analysis, no significant randomization test with Belladonna 30 CH.

Sequence of weeks (P = placebo, V = verum): Run-in, V, P, V, P, V, P, V, P

Placebo (score per week): 4  0  0  0

Verum (score per week): 29  0  3  2

Probability: .242

 

 

 

                One can clearly see that Belladonna symptoms show up more frequently with Belladonna 30 CH (35 altogether) than with placebo (3 altogether). What can also be seen is a clear drop in the well-being data on Mondays, exactly when the test substance had been taken, accompanied by the experience of Belladonna symptoms as well as some non-Belladonna symptoms. Intuitively one would expect the result to be significant, which, however, it was not. We do not know why this is so, but we suspect that it is a property of the permutation procedure which underlies the rationale of this kind of statistical testing.

                A vital presupposition of this kind of design is the absence of carryover effects. If carryover effects are present, the statistical procedure loses power and a differentiation of the separate experimental units, i.e. experimental weeks, is more difficult. We did indeed see some apparent carryover effects, which of course cannot be proven as such.

                The data of subject number 19 (figure 6, Belladonna 12 CH) show a large cluster of symptoms beginning in the seventh week and going through to the end of the experiment. The symptom is pain in the throat, which is a Belladonna symptom but which cannot in fact be attributed to Belladonna, as it carries through the two following placebo weeks. The same was seen in subject number 21 (data and graph not shown, Belladonna 30 CH): pain in the throat carrying through to the next placebo week.

            It is generally assumed that homoeopathic provers should be healthy in a homoeopathic sense, i.e. they should show no symptoms in their normal state of health, so as to be able to discern any change in their general well-being. We tried to operationalize this demand by using the Freiburg Complaint List, which is a very sensitive list of subjective complaints in any area of the body. A low score on the FBL means that a subject generally reports no or very few complaints on a phenomenological basis. We correlated FBL scores and the number of symptoms but found no clear correlation. We also saw some contradictory patterns, i.e. subjects with very low FBL scores producing many symptoms as well as subjects producing very few symptoms, and vice versa. We also saw subjects with very low FBL scores who still showed symptoms only under placebo or in a very unsystematic way. If there is any effect present in the data, it is unrelated to prior scores on either the Freiburg Complaint List or the Freiburg Personality Inventory.

 

Figure 6. Hints of a carryover effect with Belladonna 12 CH.

Sequence of weeks (P = placebo, V = verum): Run-in, V, V, P, V, P, V, P, P

Placebo (score per week): 1  5  14  14

Verum (score per week): 10  3  0  12

Probability: .3

 

 

 

                We therefore conclude that Belladonna 12 CH and 30 CH do have effects different from placebo. The effects, however, are very small: one out of 25 experiments is significant, and three more out of 25 experiments are striking in visual analysis but not statistically significant. There are unexplained paradoxical effects: one out of 25 experiments shows more Belladonna symptoms with placebo. We cannot exclude the presence of carryover effects in this kind of experimental pathogenesis study and therefore recommend not using this kind of randomization design as a general approach to experiments following the homoeopathic rationale of a remedy proving.

4. Discussion

We have shown, using a single-case randomization experimental design, that homoeopathic preparations (Belladonna 12 CH, 30 CH) produce effects different from placebo. One should observe that this cannot be argued away by saying that many experiments were carried out in order to detect the effect. As all the experiments were independent, the significant results are not artefacts of multiple testing, but true experimental results. But the fact that only 1 out of 25 test persons showed a significant effect and three more showed visually detectable effects means that the effects are very small. This, however, was expected: therefore the hypothesis was formulated that one out of a group of 20 would show a significant effect. The phenomenon, therefore, is not reproducible in everybody and under all circumstances, but can be seen only infrequently in certain persons. Our data do not give a clear hint as to what kind of preconditions should be met. We examined the demographic data and screening information, but did not find any clear pattern or significant correlations. The type of result we found closely resembles the results obtained by Weiss et al. (1980), who asked whether food additives in soft drinks would trigger allergic reactions in children. As they, like us, expected a very small effect, they conducted a multiple single-case experiment with 22 children. Most of the children did not show any signs of allergic reactions, but one of the children showed a very clear and significant allergic reaction to the food additives. The authors therefore concluded that the effect was present but was very small. Still, they issued a recommendation to the authorities to acknowledge food additives as a potentially allergenic component of soft drinks, which is recognized as clinically relevant. The same is true for our results: the effect is very small and at the same time it seems to be there.
As a randomization experimental procedure provides for intraindividual control, the effects cannot be attributed to a failure in randomizing groups of people, as could have been the case in small parallel-group designs. Of course, it could have been the case that Belladonna symptoms appeared under Belladonna just by chance. This is true, and the chance that this might have happened is given by the true p value. It is roughly 2/100. Nor can it be argued that our data collection procedure generated the data: we also saw a large, approximately equal, number of non-Belladonna symptoms, which we did not count. It does not seem to us that the experimental results can be explained away.

                At the same time we obtained paradoxical results: clearly more Belladonna symptoms with placebo. The same was also observed in the predecessor study (Walach 1993), where 11 out of 45 subjects showed inverse reactions, that is, fewer Belladonna symptoms with Belladonna and more Belladonna symptoms with placebo. Since that experiment had been conducted as a traditional crossover design, and as a group experiment, no single-case statistics were possible. We hypothesized that the paradoxical effects might have been curative effects, that is, the test substance Belladonna cured symptoms which had been there anyway, and which had shown under placebo. The present experiment was also designed to test this hypothesis. We did not find data corroborating it. Only two other subjects who experienced more Belladonna symptoms with placebo indicated that the effect was possibly due to a curative effect, and these two subjects did not show significantly more symptoms. That is, the result of the person who experienced significantly more Belladonna symptoms with placebo cannot be attributed to a possibly curative effect of Belladonna. In order to appreciate what this means, one should bear in mind that the target of our study was the number of Belladonna symptoms and not the number of symptoms altogether. Thus these results mean that one subject reacted to placebo as if he or she had taken Belladonna. This is as yet unexplained and indicates that "placebo" is a highly complex and poorly understood intervention.

                One might argue that this is just what one would expect by chance: sometimes more Belladonna symptoms with Belladonna, sometimes more with placebo, sometimes none. One should bear in mind, however, that we designed our experiment in such a way that we would be able to answer, for a single case, the question: does Belladonna produce more Belladonna symptoms than placebo? This was statistically true in one out of 25 cases and visually in three more out of 25 cases. One also has to consider that our methodology was a rather crude one: we did not look for all possible symptoms, nor did we probe for specific or individual symptoms. Rather, we screened for very crude general symptoms of Belladonna. Therefore, our results are not definite proof, but hints that this avenue of research should be pursued with more careful and possibly more sensitive methodology.

                Our study shows two shortcomings from the point of view of methodology and external validity. First, the randomization test procedure requires random permutation of sequences. This in turn assumes that carryover effects can be excluded, a presupposition which may not always be met. Although Belladonna is normally supposed to have a short-lived action in homoeopathy, this does not exclude the possibility of carryover effects. Because of this, the design is not ideal for testing this phenomenon and should not be taken as a prototype of experimental studies in this field. A design which is also a single-case randomization experiment but which randomizes the intervention point would be more appropriate. The second shortcoming is the fact that we operated with fixed dosages. Many homoeopaths recommend flexible doses, adjusted to the individual reaction. We felt that this could not be accommodated within the framework of a randomization design, so we dropped this requirement and instead opted for a small single dose distributed over one day. In this way we hoped to provide a stimulus which would be true both to homoeopathic theory and to the requirements of experimental design. Future studies, however, should take this into account and find a design which allows for flexible dosing.

                In summary: the high-dilution effect of Belladonna 12 CH and 30 CH seems to exist, albeit very weakly. This warrants further investigation, preferably in a design where carryover effects do not play any important role. Traditional exclusion criteria, such as perfect health or a stimulant-free lifestyle, should be considered with caution, as in our study they did not contribute to explaining the effects. The effect is there, as well as the paradox: significantly more Belladonna symptoms with Belladonna, as well as with placebo. The small effect size calls for confirmation in a study with larger numbers and more sensitive methodology.

 

Acknowledgement:

This study was supported by the Robert-Bosch-Foundation, Stuttgart. The authors gratefully acknowledge the help of Prof. W. Gaus, Ulm, in preparing the randomization table, of Frank Wieland, formerly DHU, and Dr. Marianne Heger, DHU Karlsruhe, in preparing the test substances, and of Dr. Patrick Onghena, Leuven, for a critical reading of a draft version of this manuscript and helpful criticism of a former version of the statistical evaluation.

 

5. References

Edgington, E. (1987) Randomization Tests. Dekker Publisher, New York.

Ernst-Hieber, E. and Hieber, S. (1995) Wirkt eine homöopathische Hochpotenz anders als ein Placebo? Randomisierte, doppelblinde multiple Einzelfallstudie. Hippokrates Publisher, Stuttgart.

Fahrenberg, J. (1994) Die Freiburger Beschwerdeliste FBL. Form FBL-G und revidierte Form FBL-R. Hogrefe Publisher, Göttingen.

Fahrenberg, J., Hampel, R. and Selg, H. (1989) Das Freiburger Persönlichkeitsinventar FPI. Revidierte Fassung FPI-R und teilweise geänderte Fassung FPI-A. Fifth Ed., Hogrefe Publisher, Göttingen.

Hahnemann, S. (1979) Organon der Heilkunst. Sixth Ed., R. Haehl (ed.), Reprint, Hippokrates Publisher, Stuttgart.

Walach, H. (1986) Homöopathie als Basistherapie. Plädoyer für die wissenschaftliche Ernsthaftigkeit der Homöopathie. Haug Publisher, Heidelberg.

Walach, H. (1993) Does a highly diluted homeopathic drug act as a placebo in healthy volunteers? Experimental study of Belladonna 30 CH in double-blind crossover design - a pilot study, J. Psychosom. Res. 37, 851-869.

Weiss, B., Williams, J.H., Margen, S., Abrams, B., Caan, B., Citron, L.J., Cox, C., McKibben, J., Ogar, D. and Schultz, S. (1980) Behavioral responses to artificial food colors, Science 207, 1487-1489.
