Effects of Belladonna 12 CH and 30 CH in Healthy Volunteers
A Multiple, Single-Case Experiment in Randomization Design
H. WALACH, S. HIEBER & E. ERNST-HIEBER
University of Freiburg, Department of Psychology, D-79085 Freiburg
2 - Method
3 - Results
4 - Discussion
5 - References
The
practice of homoeopathy rests on two principles: the law of similars and the
potentization of homoeopathic remedies. The law of similars states that likes
should be cured by likes: "Similia
similibus curentur" (Walach, 1986). This means, for example, that
certain types of fevers which exhibit symptoms like an intoxication with
deadly nightshade can be cured by the homoeopathic preparation Belladonna, which is prepared from this
very same plant. While this principle has been known for a long time, it was
the German doctor and pharmacist Samuel Hahnemann who made practical use of it,
by giving drug substances to volunteers, noting down the symptoms they
experienced, and using the very same symptoms as indication of this remedy in
the case of a disease. His volunteers experienced sometimes quite toxic
reactions. Therefore, Hahnemann diluted the remedial substances stepwise, by
adding 99 drops of alcohol to one drop of substance and vigorously shaking it.
This procedure he called "potentization" or "dynamisation".
For every dilution process he used a new glass vial. He experimented with very
high dilutions, not knowing that he had gone far beyond the limit set by Avogadro's
number, which states that 6.023 × 10^23 molecules are contained in one mole of any
substance. By virtue of the homoeopathic potentization procedure, no molecules
of the original substance are to be expected in preparations beyond 12 CH or 24 XH, that is, 12 stepwise
agitated dilutions in the ratio 1:100, or 24 stepwise agitated dilutions in
the ratio 1:10. Hahnemann, however, not knowing about Avogadro's number, very
frequently used potentizations as high as 30 CH or higher, and in the 6th
edition of his Organon (Hahnemann 1979) he even prescribed 30 CH as a standard
potency for curative and experimental procedures alike.
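The dilution arithmetic behind this limit can be illustrated with a short calculation. The following Python sketch is not part of the original study. It assumes, purely for illustration, that the starting material contains one full mole of active substance, which is a generous assumption; the function names are hypothetical.

```python
AVOGADRO = 6.023e23  # molecules per mole, value as quoted in the text


def dilution_factor(steps, ratio):
    """Fraction of the original substance left after `steps` dilutions of 1:ratio."""
    return float(ratio) ** -steps


def expected_molecules(steps, ratio, start_moles=1.0):
    """Expected number of original molecules, assuming `start_moles` at the outset.

    start_moles=1.0 is an illustrative assumption, not a value from the study.
    """
    return start_moles * AVOGADRO * dilution_factor(steps, ratio)


for label, steps, ratio in [("12 CH", 12, 100), ("24 XH", 24, 10), ("30 CH", 30, 100)]:
    print(label, dilution_factor(steps, ratio), expected_molecules(steps, ratio))
```

Under this assumption, 12 CH and 24 XH both correspond to a dilution of 10^-24, at which fewer than one molecule (about 0.6) is expected, and at 30 CH (10^-60) the expected number is vanishingly small.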
The
law of similars was in Hahnemann's time a tradition with no direct practical
implications. But by testing the available drug substances, Hahnemann found a
working basis. Since then homoeopaths have "proven" new remedial
substances by giving substances in both crude and diluted form to healthy
volunteers and noting down their symptoms. These symptoms form the homoeopathic
materia medica. Rarely, however, has it been tested whether symptoms
supposedly produced by these substances in healthy volunteers are in fact
distinctive symptoms, or whether the symptoms experienced by the provers are
due to the incidental circumstances of observation and other unspecific factors,
and are therefore to be considered placebo symptoms.
In
a previous study (Walach, 1993) it was shown that, although on a single case
basis differences could be seen between placebo and verum in a double blind
crossover trial, on a group level there was no difference between placebo and
verum symptoms. This was due to the fact that some persons clearly showed more
symptoms under the test substance, Belladonna
30 CH, and some provers clearly showed more symptoms under placebo. Since this
trial was not designed as a single-case trial, no statistics could be provided
to test for significant effects in individual subjects. We therefore
decided to design a study which would allow conclusions to be drawn on a single-case
basis.
We
chose a randomization design proposed by Edgington (1987), which is a true
single-case experiment, and can be evaluated statistically on a single-case
basis. In this design treatment periods are randomized such that the single
subject undergoes a randomized series of treatments not knowing which treatment
is to be applied at what time period. A full report of the study is available
(Ernst-Hieber and Hieber, 1995).
We
randomized 4 weeks of treatment and 4 weeks of placebo per person, preceded by
an observational run-in week.
This allows for 8! / (4! × 4!) = 70 permutations.
For each test subject one of these 70 permutations was
chosen at random. Test substances were plant extracts of Belladonna prepared in 12 CH (group I) and 30 CH (group II). The
test substances were freshly prepared by the German manufacturer Deutsche
Homöopathie Union (DHU), Karlsruhe, according to the German homoeopathic pharmacopoeia
(Homöopathisches Arzneibuch, HAB). The homoeopathic dilution process yields a
substance containing 10^-24 and 10^-60 parts of the
original material, dilutions commonly used in homoeopathy for experimental and
curative purposes. The test substances were dispensed as size 3 globules.
Placebo consisted of the same batch of sugar globules medicated with the same
amount of unsuccussed alcohol used for the potentization procedure of Belladonna. Placebo and verum substances
were indistinguishable. One container per week was dispensed, labeled with the
serial number of the week.
A
permutation table containing all possible 70 permutations was sent to the
department of biostatistics of the
University of Ulm (Prof. Wilhelm Gaus) who, using a random number generator,
selected independently 40 sequences of possible permutations. These sequences
were sent to the manufacturer who packed the test substances according to the
instructions, which were fixed in advance in a protocol. The code was held by the manufacturer. Sealed
envelopes containing a copy of the code were sent to doctors who could be
consulted in the case of an emergency or in case of an illness which might be
traceable to the test substances. The code was not handed out until all data
had been collected and a printout of the data with an electronic copy had been
deposited at the department of biomedical statistics, University of Ulm. Thus,
the study was strictly double-blind.
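The permutation table and the random selection of sequences described above can be sketched as follows. This Python fragment is only an illustration: the random number generator actually used in Ulm is not described in the text, so the seed, the function names, and the choice to draw sequences independently with replacement are assumptions.

```python
import itertools
import random

N_WEEKS = 8   # experimental weeks after the run-in week
N_VERUM = 4   # weeks with Belladonna; the remaining weeks are placebo


def all_sequences(n_weeks=N_WEEKS, n_verum=N_VERUM):
    """Return every order of verum (V) and placebo (P) weeks as a string."""
    sequences = []
    for verum_weeks in itertools.combinations(range(n_weeks), n_verum):
        sequences.append(
            "".join("V" if week in verum_weeks else "P" for week in range(n_weeks))
        )
    return sequences


def draw_sequences(n_subjects, rng):
    """Draw one sequence per subject, independently and with replacement (assumption)."""
    table = all_sequences()
    return [rng.choice(table) for _ in range(n_subjects)]


table = all_sequences()
print(len(table))  # 70

rng = random.Random(1995)  # hypothetical seed, for reproducibility only
for subject, sequence in enumerate(draw_sequences(40, rng), start=1):
    print(subject, sequence)
```

Each printed line corresponds to one packing instruction for the manufacturer: the container labeled with week number k holds verum or placebo according to the k-th letter of the sequence.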
Data
collection was done using a diary which presented the subjects with predefined
symptoms derived from the homoeopathic materia medica in order to query
subjects on Belladonna as well as
non-Belladonna symptoms. Apart from
general well-being, subjects were asked to report on any changes and, if so, what
type of changes these were (e.g. did their well-being change for better or
worse, did their well-being change suddenly or slowly, during the day or during
the night, was it aggravated or improved by heat or cold, what types of changes
did they notice and where did the changes occur). A sample copy of the diary can
be obtained from the authors.
Using the key symptoms and modalities of Belladonna, it was decided in advance
which kind of responses would be counted as Belladonna
and non-Belladonna symptoms,
respectively. There was an equal number of possible Belladonna and non-Belladonna
responses. Belladonna and non-Belladonna symptoms are given in table 1.
We
aimed at unselected subjects unfamiliar with the theory and nature of
homoeopathy. Advertisements were distributed in several computer newsnets,
placed in public places, and handed out as leaflets. Subjects were in good
health, in no need of medication, under no medical treatment or supervision,
and not heavy users of alcohol, caffeine or recreational drugs. Other
criteria which might be handled as exclusion criteria in traditional
homoeopathic remedy provings were noted down in a questionnaire, but were not
applied, as their validity is not experimentally confirmed. These included
unstable lifestyle, anticipation of major life-events, of stress and change in
daily habits. Subjects were also screened using the Freiburg Complaint List
(FBL) (Fahrenberg, 1995) and the Freiburg Personality Inventory (FPI)
(Fahrenberg, Hampel & Selg 1989), two widely used and validated
instruments. They were informed about the possibility of experiencing
transitory symptoms according to homoeopathic theory, and of the essentially
harmless nature of the experiment, and also that they would receive a
preparation which was pharmaceutically inactive as it would not contain any
molecule of an active substance, but which nonetheless was claimed to be able
to produce transitory symptoms. They all gave written informed consent. They
were instructed to take 2 times 3 globules of the respective test substances on
Mondays, throw away the rest of the container, and to use a new container each
week, according to the serial number.
TABLE 1. Belladonna and non-Belladonna symptoms asked for in the diary
_________________________________________________________
Belladonna symptoms      non-Belladonna symptoms (chosen at random)
_________________________________________________________
modalities
  better warm            better cold
  worse cold             worse warm
  sudden change          slow change
  during night           during day
type of changes
  redness                paleness
  swelling               weakness
  heat                   cold
  pain                   itching
  pulsating              strong thirst
  dryness                lameness
  pressure/fullness      disturbed sleep
  restlessness           trembling
localisations
  head                   breast
  neck and throat        stomach
  skin                   back
                         limbs
The
subjects were provided by mail with a full set of test substances, diaries and
written instructions on how to conduct the data collection and how to proceed
with the test substances. These included some precautions, e.g. not to place the
test substances under any electromagnetically active device, not to expose them
to exceptional heat or cold and to store them in a safe and neutral place.
Statistical
evaluation was done in the following way: for each week a Belladonna score was
calculated, summing up all the Belladonna
symptoms which had been experienced in this week. The non-Belladonna scores
were not used for calculation. The difference of mean Belladonna scores between the four Belladonna and the four placebo weeks was calculated.
This calculation was repeated for all 70 possible permutations, using an
author-written Basic program which performed the permutations and calculations. Then
the number of differences was determined which were greater than or equal to
the empirically obtained one. This value divided by 70 yielded the exact
probability that this distribution of Belladonna
scores could have been obtained by chance. We conducted one-sided tests (i.e.
more Belladonna symptoms with Belladonna). This procedure is clearly described
by Edgington (1987). Visual analysis was done using the graph of the data as
shown in the figures below. The general logic of single-case randomization
experiments opens the possibility of conducting independent experiments with
individual subjects. We hypothesized that, if there were any effects present in
the potentized homoeopathic dilutions, then at least one subject out of a group
of 20 people should show a reaction. This design does not produce multiple
tests out of one experiment but is a series of independent experiments.
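The evaluation described above can be sketched in a few lines. The authors' Basic program is not reproduced in the text, so the following Python version is an independent illustration of the same randomization test, with hypothetical function names. The weekly scores are taken from figures 4, 1 and 2; the assumption that scores are listed in chronological order within each condition does not affect the resulting probabilities.

```python
from fractions import Fraction
from itertools import combinations


def randomization_test(scores, target_weeks):
    """One-sided randomization test on weekly Belladonna scores.

    Returns the share of all possible week assignments whose difference of
    means (target weeks minus remaining weeks) is at least as large as the
    difference observed for `target_weeks`.
    """
    n = len(scores)
    k = len(target_weeks)

    def mean_difference(weeks):
        inside = sum(scores[i] for i in weeks)
        outside = sum(scores) - inside
        return Fraction(inside, k) - Fraction(outside, n - k)

    observed = mean_difference(target_weeks)
    assignments = list(combinations(range(n), k))
    at_least_as_large = sum(
        1 for weeks in assignments if mean_difference(weeks) >= observed
    )
    return Fraction(at_least_as_large, len(assignments))


def weeks_of(sequence, label):
    """Indices of the weeks carrying `label` ('V' or 'P') in a sequence string."""
    return tuple(i for i, c in enumerate(sequence) if c == label)


# Figure 4 (Belladonna 30 CH): verum scores 6 6 0 2, placebo scores 0 0 0 0.
seq4, scores4 = "PVPVVPVP", [0, 6, 0, 6, 0, 0, 2, 0]
p4 = randomization_test(scores4, weeks_of(seq4, "V"))

# Figure 1 (Belladonna 12 CH): verum scores 0 0 0 4, placebo scores 0 0 0 0.
seq1, scores1 = "PPVPVVPV", [0, 0, 0, 0, 0, 0, 0, 4]
p1 = randomization_test(scores1, weeks_of(seq1, "V"))

# Figure 2 (Belladonna 12 CH), tested in the opposite direction (more with placebo):
# verum scores 4 4 12 0, placebo scores 2 23 31 22.
seq2, scores2 = "VVPPPVPV", [4, 4, 2, 23, 31, 12, 22, 0]
p2_placebo = randomization_test(scores2, weeks_of(seq2, "P"))

for name, p in [("figure 4", p4), ("figure 1", p1), ("figure 2 (placebo)", p2_placebo)]:
    print(name, p, round(float(p), 3))
```

With these inputs the sketch gives 5/70 (about .071) for figure 4, 35/70 (.5) for figure 1, and 4/70 (about .057) for figure 2 when the placebo weeks are passed as the target, which agrees with the reported probabilities. Note that with 70 possible assignments the smallest attainable probability is 1/70, about .014.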
Seventeen
subjects volunteered to be enrolled in group I (Belladonna 12 CH). Three out of these 17 subjects dropped out and
did not continue the experiment. The final group consisted of 14 subjects. In
group II (Belladonna 30 CH) 16
subjects volunteered to participate, 5 dropped out leaving a sample of 11
subjects. Dropouts were not related to the test substances, but were due
to lack of interest or poor compliance. Demographic data are presented in table
2.
In
group I (Belladonna 12 CH) one
subject showed the expected changes on visual analysis, i.e. more
Belladonna symptoms with Belladonna.
This result is not significant, however. One subject showed changes which were
unexpected, namely more Belladonna
symptoms with placebo (p=0.057). A similar result was obtained in group II (Belladonna 30 CH): one experiment showed
a significant tendency, with more Belladonna
symptoms with Belladonna (p=0.071),
two more experiments showed striking effects on visual analysis, yet yielded no
significant statistics. Data and graphs are presented in figures 1-6.
The
figures present the data in the following way: the first week was always an
observation week with placebo for run-in purposes, which was not counted. Then follow
the 8 experimental weeks in random order of placebo and Belladonna weeks, indicated by the dotted line (hills: placebo,
troughs: Belladonna). The number of Belladonna symptoms is given in black
bars, and the number of non-Belladonna symptoms
in white bars. The left ordinate gives the absolute frequency of symptoms, the
right ordinate gives the rating of general well-being on a 10 cm visual
analogue scale. The data present the number of Belladonna symptoms per week.
As can be
seen, a variety of patterns arises:
TABLE 2: Demographic data
________________________________________________________________
Group                  I (Belladonna 12 CH)   II (Belladonna 30 CH)
________________________________________________________________
Marital status
  unmarried            10                     12
  married              6*                     3**
Sex
  female               8                      9
  male                 8*                     7
Age
  mean (range)         35 (27-52)             33 (24-39)
Education
  A-level              13                     13
  secondary education  3*                     2
  primary education    0                      1
________________________________________________________________
* only 16 subjects returned the questionnaire.
** only 15 subjects returned the questionnaire.
We observed every type of reaction,
from low-level responders presenting nearly no symptoms to highly responsive
subjects presenting a large number of symptoms. Not only Belladonna symptoms
were reported but also a large number of non-Belladonna symptoms. The experiment of subject number 16 (figure 1)
showed 4 Belladonna symptoms in the last Belladonna
week and no symptoms at all in any of the other weeks, and only 2 non-Belladonna
symptoms in the Belladonna week.
However, the randomization test was not significant.
Subject
8 experienced a great number of Belladonna and a smaller number of non-Belladonna symptoms during the weeks 5
and 6, that is 2 weeks after having taken Belladonna
12CH. The Belladonna symptoms
were pain in the throat and related symptoms. A smaller cluster of symptoms can
be seen in week 7 (Belladonna) and in
week 8 (placebo). There were clearly more Belladonna
symptoms under placebo (p=0.057). It should be noted that not only were more
symptoms reported with placebo but more specific Belladonna symptoms, as the non-Belladonna
symptoms were not included in our evaluation.
Subject number 24 (Belladonna 30 CH, figure 3) showed an
interesting pattern, albeit not significant: symptoms were exhibited in three
of the four Belladonna weeks, always
on the day following administration.
The experiment with Belladonna 30 CH (figure 4, subject no 31) shows a statistical
tendency (p=0.071) and a clear picture: we see altogether 14 Belladonna symptoms only in the Belladonna weeks with no Belladonna symptoms in the placebo weeks
as well as a number of non-Belladonna
symptoms in the Belladonna weeks
which were not counted.
There
were also a couple of other interesting observations which, however, did not
yield significant results. Subject
number 30 showed visually impressive results which for some reason were
not reflected in a significant statistical effect (figure 5).

Run-in P P V P V V P V
Figure 1. Striking in visual analysis but not statistically significant, more Belladonna symptoms with Belladonna 12 CH.
Placebo (Score per week) 0 0 0 0
Verum (Score per week) 0 0 0 4
Probability .5

Run-in V V P P P V P V
Figure 2. Significant randomization test, more Belladonna symptoms with placebo (Belladonna 12 CH).
Placebo (Score per week) 2 23 31 22
Verum (Score per week) 4 4 12 0
Probability .057

Run-in P V P V V P P V
Figure 3. Striking in visual analysis; more Belladonna symptoms with Belladonna 30 CH. Belladonna symptoms with Belladonna 30 CH appeared on the day following administration.
Placebo (Score per week) 3 0 0 2
Verum (Score per week) 7 1 2 0
Probability .357

Run-in P V P V V P V P
Figure 4. Significant randomization test, striking in visual analysis; more Belladonna symptoms with Belladonna 30 CH.
Placebo (Score per week) 0 0 0 0
Verum (Score per week) 6 6 0 2
Probability .071

Run-in V P V P V P V P
Figure 5. More Belladonna symptoms in visual analysis, no significant randomization test with Belladonna 30 CH.
Placebo (Score per week) 4 0 0 0
Verum (Score per week) 29 0 3 2
Probability .242
One can clearly see that Belladonna symptoms show up more
frequently with Belladonna 30 CH (35
altogether) than with placebo (3 altogether). What can also be seen is a clear
drop of the well-being data on Mondays exactly when the test substance had been
taken, accompanied by the experience of Belladonna
symptoms as well as some non-Belladonna
symptoms. Intuitively one would expect the result to be significant, which,
however, it was not. We do not know why this is so, but we suspect that this
is a property of the permutation procedure which underlies the rationale of
this kind of statistical testing.
A
vital presupposition of this kind of design is the absence of carryover
effects. If carryover effects are present, the statistical procedure
loses power and a differentiation of the separate experimental units, i.e.
experimental weeks, is more difficult. We indeed saw some apparent carryover effects,
which of course cannot be proven as such.
The
data of subject number 19 (figure 6, Belladonna
12 CH) show a large cluster of symptoms beginning in the seventh week and going
through to the end of the experiment. It is pain in the throat, which is a Belladonna symptom but which cannot in
fact be attributed to Belladonna, as it
carries through the two following placebo weeks. The same was seen in
subject number 21 (data and graph not shown, Belladonna 30 CH): pain in the throat carrying through to the next
placebo week.
It is
generally assumed that homoeopathic provers should be healthy in a homoeopathic
sense i.e. they should show no symptoms in their normal state of health so as
to be able to discern any change in their general well-being. We tried to
operationalize this demand by using the Freiburg Complaint List, which is a
very sensitive list of subjective complaints in any area of the body. A low
scoring in the FBL means that a subject generally reports no or very few
complaints on a phenomenological basis. We correlated FBL scores and the number
of symptoms but found no clear correlation. Also, we saw some contradictory
patterns, i.e. subjects with very low FBL scores producing many symptoms as
well as very few symptoms, and vice versa. We also saw subjects
with very low FBL scores who still showed symptoms only under placebo or in a
very unsystematic way. If there is any effect present in the data, it is
unrelated to prior scores of either the Freiburg Complaint List or the Freiburg
Personality Inventory scores.

Run-in V V P V P V P P
Figure 6. Hints of a carry-over effect with Belladonna 12 CH
Placebo (Score per week) 1 5 14 14
Verum (Score per week) 10 3 0 12
Probability .3
We
therefore conclude that Belladonna 12
CH and 30 CH do have effects different from placebo. The effects, however, are
very small: one out of 25 experiments is significant, three more out of 25
experiments are striking in visual analysis but not statistically significant.
There are unexplained paradoxical effects: one out of 25 experiments shows more Belladonna symptoms with placebo. We cannot
exclude the presence of carryover effects in this kind of experimental
pathogenesis study and therefore recommend not to use this kind of
randomization design as a general approach to experiments following the
homoeopathic rationale of a remedy proving.
We have shown, using single-case randomization
experimental design, that homoeopathic preparations (Belladonna 12 CH, 30 CH) produce effects different from placebo.
One should observe that this cannot be argued away by saying that so many
experiments were carried out to detect the effect. As all the
experiments were independent, the significant results are not artefacts of
multiple testing, but true experimental results. But the fact that only 1 out
of 25 test persons showed a significant effect and three more showed visually
detectable effects means that the effects are very small. This, however, was
expected: therefore the hypothesis was formulated that one out of the group of
20 would show a significant effect. The phenomenon, therefore, is not
reproducible in everybody and under all circumstances, but can be seen only
infrequently in certain persons. Our data do not give a clear hint as to what
kind of preconditions should be met. We examined the demographic data and
screening information, but did not find any clear pattern or significant
correlations. The type of result we found closely resembles the results
obtained by Weiss et al. (1980) who
asked whether food additives in soft drinks would trigger allergic reactions in
children. As they, like we, expected a very small effect, they conducted a
multiple single-case experiment with 22 children. Most of the children did not
show any signs of allergic reactions, but one of the children showed a very
clear and significant allergic reaction to the food additives. The authors
therefore concluded that the effect was present but was very small. Still they
issued a recommendation to the authorities to acknowledge food additives as a
potentially allergenic component of soft drinks, which is recognized as
clinically relevant. The same is true for our results: the effect is very small
and at the same time it seems to be there. As a randomization experimental
procedure provides for intraindividual control, the effects cannot be
attributed to failure in randomizing groups of people, as could be the case in
small parallel group designs. Of course, it could have been the case that Belladonna symptoms appeared under Belladonna just by chance. This is true,
and the chance that this might have happened is given by the true p value. It
is roughly 2/100. It also cannot be argued that our data collecting procedure
generated the data. We also saw a large, approximately equal, number of
non-Belladonna-symptoms, which we did not count. It does not seem to us that
the experimental results can be explained away.
At
the same time we obtained paradoxical results: clearly more Belladonna symptoms with placebo. The
same was also observed in the predecessor study (Walach 1993), where 11
out of 45 subjects showed inverse reactions, that is, fewer Belladonna symptoms with Belladonna
and more Belladonna-symptoms with
placebo. Since this experiment had been conducted as a traditional cross-over
design, and as a group experiment, no single-case statistics were possible. We
hypothesized that the paradoxical effects might have been curative effects,
that is, the test substance Belladonna
cured symptoms which had been there anyway, and which had shown under placebo.
The present experiment also was designed to test this hypothesis. We did not
find data corroborating this hypothesis. Only two other subjects who
experienced more Belladonna symptoms
with placebo indicated that the effect was possibly due to a curative effect,
and these two subjects did not show significantly more symptoms. That is, the
result of the person who experienced significantly more Belladonna
symptoms with placebo cannot be attributed to a possibly curative effect of Belladonna. In order to appreciate what
this means, one should bear in mind that the target of our study was the number
of Belladonna symptoms and not the
number of symptoms altogether. Thus these results mean that one subject reacted
to placebo as if he or she had taken Belladonna.
This is as yet unexplained and indicates that "placebo" is a highly
complex and poorly understood intervention.
One
might argue that this is just what one would expect by chance: sometimes more Belladonna symptoms with Belladonna, sometimes more with placebo,
sometimes none. One should bear in mind, however, that we designed our
experiment in such a way that we would be able to answer, for a single case, the
question: Does Belladonna produce
more Belladonna symptoms than
placebo? This was statistically true in one out of 25 cases and visually in
three more out of 25 cases. One also has to consider that our methodology was a
rather crude one: we did not look for all possible symptoms, nor did we
probe for specific or individual symptoms. Rather, we screened for very crude
general symptoms of Belladonna.
Therefore, our results are not definite proof, but hints that this avenue of
research should be pursued with more careful and possibly more sensitive
methodology.
Our
study shows two shortcomings from a methodological and external validity point
of view: first, the randomization test procedure requires random permutation of
sequences. This in turn presupposes the absence of carryover effects. This
presupposition may not always be met. Although Belladonna is normally supposed to have a short-lived action in
homoeopathy, this does not exclude the possibility of carryover effects.
Because of this, the design is not ideal for testing this phenomenon and it
should not be taken as a prototype of experimental studies in this field. A
design which is also a single-case randomization experiment but which
randomizes the intervention point would be more appropriate. Another
shortcoming is the fact that we operated with fixed dosages. Many homoeopaths
recommend flexible doses, adjusted to individual reaction. We felt that this
could not be accommodated within the framework of a randomization design, so we
dropped this requirement and instead opted for a small single dose distributed
over one day. In this way we hoped to provide a stimulus which would be true to
homoeopathic theory and to the requirements of experimental design. Future
studies, however, should take this into account and find a design which can
allow for flexible dosing.
In
summary: the high dilution effect of Belladonna
12 CH and 30 CH seems to exist, albeit very weakly. This warrants further
investigation, preferably in a design where carryover effects do not play any
important role. Traditional exclusion criteria, such as perfect health or
a stimulant-free lifestyle, should be considered with caution, as in our study
they did not contribute to explaining the effects. The effect is there, as well
as the paradox: significantly more Belladonna symptoms with Belladonna, as well as with placebo. The
small effect size calls for confirmation in a study with larger numbers and
more sensitive methodology.
This
study was supported by the Robert-Bosch-Foundation, Stuttgart. The authors
gratefully acknowledge the help of Prof. W. Gaus, Ulm, in preparing the
randomization table, of Frank Wieland, formerly DHU, and Dr. Marianne Heger,
DHU Karlsruhe, for preparing the test substances, and of Dr. Patrick Onghena,
Leuven, for a critical reading of a draft version of this manuscript and
helpful criticism of a former version of the statistical evaluation.
Edgington, E. (1987) Randomization Tests. Dekker Publisher, New York.
Ernst-Hieber, E. and Hieber, S. (1995) Wirkt eine homöopathische Hochpotenz anders als ein Placebo? Randomisierte, doppelblinde multiple Einzelfallstudie. Hippokrates Publisher, Stuttgart.
Fahrenberg, J. (1994) Die Freiburger Beschwerdeliste FBL. Form FBL-G und revidierte Form FBL-R. Hogrefe Publisher, Göttingen.
Fahrenberg, J., Hampel, R. and Selg, H. (1989) Das Freiburger Persönlichkeitsinventar FPI. Revidierte Fassung FPI-R und teilweise geänderte Fassung FPI-A. Fifth Ed. Hogrefe Publisher, Göttingen.
Hahnemann, S. (1979) Organon der Heilkunst. Sixth Ed., R. Haehl (ed.), Reprint. Hippokrates Publisher, Stuttgart.
Walach, H. (1986) Homöopathie als Basistherapie. Plädoyer für die wissenschaftliche Ernsthaftigkeit der Homöopathie. Haug Publisher, Heidelberg.
Walach, H. (1993) Does a highly diluted homeopathic drug act as a placebo in healthy volunteers? Experimental study of Belladonna 30 CH in double-blind crossover design - a pilot study, J. Psychosom. Res. 37, 851-869.
Weiss, B., Williams, J.H., Margen, S., Abrams, B., Caan, B., Citron, L.J., Cox, C., McKibben, J., Ogar, D. and Schultz, S. (1980) Behavioral responses to artificial food colors, Science 207, 1487-1489.