Not a month goes by without headlines in the media proclaiming
either that vitamins do amazing things or that they do nothing
at all. Such concerns no longer are limited to those whose
jobs are to raise such issues. Individuals purchasing health
foods and related products increasingly are asking questions
about the cost and effectiveness of supplements. Likewise,
governmental watchdog agencies, such as the Food and Drug
Administration (FDA), expect the manufacturers and
marketers of nutrients and herbs to be able to back up claims
with sound research. Total Health Magazine Online took an
in-depth look at some of the issues back in 2011, for which see
“Are Vitamin Supplements Safe?”
Unfortunately, responses to these demands for better
backing for claims often are less than satisfactory. Marketing-driven
science is as common as is science-driven marketing.
Distinguishing between the two requires familiarity with the
standards that universities and research institutions have
adopted to evaluate medical evidence. This means knowing
about the types of studies available and about the elements
found in every properly designed study.
There are three basic types of clinical investigations: case-control
studies, cohort studies and randomized controlled
trials. For most nutritional supplements, the last of these is
the primary form of investigation. However, for completeness,
a few words should be spared to describe the other two.
Case-control studies start with individuals who have already
developed a disease or special condition and the controls are
matched individuals who do not have the disease in question.
An example is a comparison of smoking histories in men who
have heart disease versus otherwise similar men who do
not. This is an observational study because there
is no intervention by the researchers. The strength of this
study type is that it allows researchers to explore how variables
influence the development of the condition being examined.
The major drawback is that the study can easily be biased, for
example by how cases and controls are selected or by how
accurately subjects recall past exposures.
Cohort studies differ from case-control studies in that
researchers start with individuals who have not yet developed
the disease or condition being investigated. Hence, a cohort
study on athletic supplements might start with two groups
of similar athletes before one group begins supplement
use. The analysis would consist of determining whether the
group taking the supplement improved as measured by some
marker for performance or perhaps had fewer injuries. This
is an observational study because there is no intervention by
the researchers. Cohort studies have the virtue of allowing
investigators to more reliably establish whether a particular
action (taking a supplement) leads to a particular outcome
(fewer injuries). However, cohort studies may require years
of following the subjects and also depend upon the subject
populations being properly identified as identical with regard
to the studied condition(s) at the start of the study rather
than being weighted with some underlying predisposition. In
other words, it is easy to introduce bias into cohort studies.
In many ways, the “gold standard” of investigational
studies is the randomized placebo-controlled double-blind
clinical trial. Ideally, the trial population is relatively
uniform to start. Subjects are then randomly assigned to
active and placebo arms, further helping to reduce any bias
or predisposition in the groups being tested. The test is
double-blind, meaning that neither the participants nor the
investigators know who is taking the compound being tested.
Finally, inasmuch as there often is a large psychological effect
(the placebo or “sugar pill” effect) during the first weeks of
a study, there is an arm of the trial that receives an item that
appears to be identical to the compound being tested, but
which has no effect. Note that this is an intervention study:
the researcher actively intervenes by giving the compound to be
studied to one or more of the arms in the trial. The idea here
is to clearly demonstrate whether there is a cause and effect
relationship between the item being studied and the outcome
with the subjects. When possible, there is also a “cross-over”
phase in which, after a sufficient washout period, the group
that was used as the placebo arm becomes the active group
and the group that had been the active arm becomes the
placebo group. Not all studies lend themselves to this, but
cross-over studies help ensure that there are no unrecognized
predispositions in the subjects that might bias the test results.
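The mechanics of random assignment and cross-over can be sketched in a few lines of Python. This is only an illustration: the subject codes, the seed and the function names assign_arms and cross_over are invented for the example, and real trials typically use more elaborate allocation schemes.

```python
import random


def assign_arms(subject_ids, seed=None):
    """Randomly split subjects into active and placebo arms of equal size."""
    rng = random.Random(seed)      # a seed makes the allocation reproducible
    shuffled = list(subject_ids)   # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"active": shuffled[:half], "placebo": shuffled[half:]}


def cross_over(arms):
    """Swap the arms for the second period, after the washout."""
    return {"active": arms["placebo"], "placebo": arms["active"]}


# Sixteen hypothetical subject codes, S01 through S16.
subjects = [f"S{i:02d}" for i in range(1, 17)]
period_1 = assign_arms(subjects, seed=2024)
period_2 = cross_over(period_1)
```

Because chance alone decides who lands in which arm, neither the subjects nor the investigators can steer likely responders toward the active compound.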
All of this sounds good in theory. Unfortunately, as shortly
will be shown, this “gold standard” of clinical trials still can
be biased in a variety of ways.
The design of trials involves at least one more component that
is important for evaluating whether the results of a given study
are weak or strong.
The first step in any clinical trial is the production of
a study protocol. This protocol presents three very
important elements. First is the hypothesis of the study:
what question is the study intended to answer?
Second is the study population: how and why were
subjects picked to be in the study; what are the criteria
for inclusion and exclusion; are special conditions
Third is the size of the study sample: how many
subjects are needed to ensure that the results represent
true findings rather than mere chance? All studies
contain these three elements, and the validity of these
components (was the study question correctly framed,
was the proper study population chosen, was the study
carried on for an appropriate period of time, were enough
subjects included to yield statistical significance, etc.)
is essential for evaluating the worth of the trial.
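The question of how many subjects are needed can be made concrete with a standard textbook approximation for comparing the means of two arms. The sketch below is an illustration, not a method taken from any study discussed here; the function name subjects_per_arm, the 80 percent power target and the example numbers are all assumptions.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_arm(effect, sd, alpha=0.05, power=0.80):
    """Approximate subjects needed in each of two equal arms.

    effect: smallest difference in means worth detecting (assumed)
    sd:     standard deviation of the outcome in each arm (assumed)
    Uses the normal approximation for a two-sided comparison of means.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the cut-off
    z_power = z.inv_cdf(power)           # value for the desired power
    return ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)


# Hypothetical outcome with a standard deviation of 10 units.
n_half_sd = subjects_per_arm(effect=5, sd=10)       # effect = 0.5 SD
n_quarter_sd = subjects_per_arm(effect=2.5, sd=10)  # effect = 0.25 SD
```

Under these assumptions, detecting a difference equal to half of one standard deviation takes roughly 63 subjects per arm, and halving the expected effect roughly quadruples the number required.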
Before moving to examples of weak and strong clinical trials,
a few words need to be said regarding statistical significance.
The usual cut-off level is given as “p < 0.05,” which means that,
if there were no real effect, results at least as strong as those
observed would arise by chance less than five percent of the time.
Some statistical models are more strict than
others for performing this calculation, but readers actually need
to be worried about something else, which is the study sample
size. If a study uses, say, only seven subjects per arm, the small
size of the study means that the reported effect will need to be
very large to achieve statistical significance. Conversely, and
one sees this all the time in pharmaceutical studies, a trial monitoring 100,000 subjects may find significance for what,
in practice, are effects that are so weak that they are clinically
only marginally useful!
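The interplay between sample size and significance can be seen in a rough calculation. The following Python sketch uses a simple normal approximation with an assumed, known standard deviation; the function name two_sided_p and all of the numbers are hypothetical, and a very small trial would normally call for a t-test instead.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(mean_diff, sd, n_per_arm):
    """Two-sided p-value for a difference in means between two equal arms.

    Normal (z-test) approximation with a known, common standard deviation.
    """
    standard_error = sd * sqrt(2 / n_per_arm)
    z = abs(mean_diff) / standard_error
    return 2 * (1 - NormalDist().cdf(z))


# Hypothetical weight loss outcome with a standard deviation of 10 pounds.
small_modest = two_sided_p(mean_diff=5, sd=10, n_per_arm=7)      # 7 per arm
small_large = two_sided_p(mean_diff=11, sd=10, n_per_arm=7)      # 7 per arm
huge_tiny = two_sided_p(mean_diff=0.2, sd=10, n_per_arm=50_000)  # 100,000 total
```

With seven subjects per arm, a five pound difference does not reach the 0.05 cut-off in this sketch, whereas an eleven pound difference does. With 100,000 subjects in total, even a difference of two-tenths of a pound comes out as statistically significant.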
As noted above, randomized placebo-controlled double-blind
clinical trials are considered to be the “gold standard”
for research. Nevertheless, many such trials are quite weak
and misleading. For one thing, it all too often turns out
that the placebo is not actually inactive; one example is
the practice of using maltodextrin or other sugars as the so-called
placebo in weight loss studies. Relatedly, especially in
studies involving weight loss, the placebo effect can be very
strong for many weeks. The placebo effect in diet studies
commonly leads to the loss of two pounds in eight weeks, and
much more if diet and exercise changes are included. A BBC
News report on the Internet (March 10, 2004) on trials of the
drug rimonabant noted that participants taking the placebo
were five pounds lighter at the end of one year. In some large
pharmaceutical diet trials in which subjects changed behavior,
diet and exercise, the weight loss in two months using the
placebo exceeded 11 pounds!
Similarly, if exercise is included in a weight loss trial
with healthy subjects, then LDL cholesterol, total
cholesterol, triglycerides and leptin levels normally
will go down, whereas HDL cholesterol will go up.
Moderately increasing the amount of protein in the diet,
likewise, will produce such trends. Hence, if a weight loss trial
includes exercise and a controlled diet with increased protein,
yet reports results opposite of these or fails to find weight loss
in participants using the placebo (as happened recently in a
highly promoted trial), then the reader should seriously wonder
whether there was a lapse somewhere in either design or
implementation because of the divergence from independently
established outcomes. Moreover, it is often the case that even
the most rock-solid of results cannot be extrapolated from one
group to another. To stay with diet trials, studies performed in
Asia or Latin America usually cannot be applied to American
experience because the study populations and eating habits are
so different. One has the right to question the reproducibility
and applicability of studies.
Of course, many studies are very strong, although this, too,
can be misleading. A recent one measured the effects of short-term,
oral L-arginine supplements (12 g/d for 3 weeks) in 16
hypercholesterolemic men with normal blood pressure (BP).
In this randomized, double-blind, two-period crossover design
study, L-arginine tablets (1 g each) and matched placebos
(microcrystalline cellulose) were used. The researchers
demonstrated that the L-arginine supplement increased blood
plasma levels of L-arginine and significantly reduced systolic
BP (p<.05) and diastolic BP (p<.001), both at rest and during
acute laboratory stressors. BP reductions were associated with
a significant decrease in heart output (p<.01); these changes
were mediated by small reductions in the volume of blood
pumped with each heart beat (p = 0.07). These results were
reproduced when the placebo group crossed over, plus they
make sense in terms of what is known of the role of L-arginine
in the body. Note that this study examines only one intervention
which is tested in several ways rather than examining several
interventions (e.g., diet + exercise + compound). With only
one intervention, it is relatively easy to establish a clear cause
and effect relationship.
This arginine study is an excellent example of a good
study with strong results that can be completely misleading.
The study lasted only three weeks. Based on a large number
of similarly successful studies lasting only one or two months
at a time, the temptation is to conclude that supplementing
with L-arginine is a great recourse for those who are
hypercholesterolemic or hypertensive, who need a boost in exercise,
and so forth. Unfortunately, such conclusions would be wrong.
As uncovered by a researcher who had been a proponent of
L-arginine supplementation, long-term supplementation with
L-arginine (in this case, six months) may lead either to null
results or to actual harm.1 The body consists of a vast number
of interconnected metabolic processes that are taking place
simultaneously. A beneficial effect in one area sometimes
is followed by a not so good effect someplace else. Hence,
even with well-designed trials, there can remain hidden or
submarine issues of which we become aware only much later.
Judging a clinical trial first requires establishing what
type of test is involved—case-control, cohort or randomized
controlled trial—because the type of test is the first clue as
to how impartial the observations might be. Next, one must
look closely at the components of the trial—the hypothesis
of the study, the study population and the size of the study
sample. A lack of clarity or inappropriateness in any one of
these will reduce the quality of the data and undermine the
analyses, interpretations and extrapolations based on the trial.
Finally, clinical trials seldom exist in a vacuum. A given trial
needs to be evaluated in light of related trials, especially trials
conducted by researchers whose concerns and orientations
are different from those involved with the test being evaluated.
Readers interested in pursuing this topic are urged to examine
Richard K. Riegelman, Studying a Study and Testing a Test (6th
1 Wilson AM, Harada R, Nair N, Balasubramanian N, Cooke
JP. L-arginine supplementation in peripheral arterial disease:
no benefit and possible harm. Circulation. 2007 Jul 10;116(2):188-95.
Epub 2007 Jun 25.