  • Judging the Clinical Trials

    Not a month goes by without headlines in the media proclaiming either that vitamins do amazing things or that they do nothing at all. Questions about supplements are no longer limited to those whose job it is to raise them. Individuals purchasing health foods and related products increasingly are asking about the cost and effectiveness of supplements. Likewise, governmental watchdog agencies, such as the Food and Drug Administration (FDA), expect the manufacturers and marketers of nutrients and herbs to be able to back up claims with sound research. Total Health Magazine Online took an in-depth look at some of these issues back in 2011; see “Are Vitamin Supplements Safe?”

    Unfortunately, responses to these demands for better backing for claims often are less than satisfactory. Marketing-driven science is as common as is science-driven marketing. Distinguishing between the two requires familiarity with the standards that universities and research institutions have adopted to evaluate medical evidence. This means knowing about the types of studies available and about the elements found in every properly designed study.

    There are three basic types of clinical investigations: case-control studies, cohort studies and randomized controlled trials. For most nutritional supplements, the last of these is the primary form of investigation. However, for completeness, a few words should be spared to describe the other two. Case-control studies start with individuals who have already developed a disease or special condition (the cases) and compare them with controls, matched individuals who do not have the disease in question. An example is an analysis of heart disease rates in male smokers versus rates in otherwise similar males who have never smoked. This is an observational study because there is no intervention by the researchers. The strength of this study type is that it allows researchers to explore how variables influence the development of the condition being examined. The major drawback is that the study can easily be biased, for example in how cases and controls are selected and in how past exposures are recalled or recorded.

    Cohort studies differ from case-control studies in that researchers start with individuals who have not yet developed the disease or condition being investigated. Hence, a cohort study on athletic supplements might start with two groups of similar athletes before one group begins supplement use. The analysis would consist of determining whether the group taking the supplement improved as measured by some marker for performance or perhaps had fewer injuries. Like a case-control study, this is an observational study because there is no intervention by the researchers. Cohort studies have the virtue of allowing investigators to establish more reliably whether a particular action (taking a supplement) leads to a particular outcome (fewer injuries). However, cohort studies may require years of following the subjects. They also depend upon the subject populations being properly identified as identical with regard to the studied condition(s) at the start of the study, rather than being weighted with some underlying predisposition. In other words, it is easy to introduce bias into cohort studies.

    In many ways, the “gold standard” of investigational studies is the randomized placebo-controlled double-blind clinical trial. Ideally, the trial population is relatively uniform to start. Subjects are then randomly assigned to active and placebo arms, further helping to reduce any bias or predisposition in the groups being tested. The test is double-blind, meaning that neither the participants nor the investigators know who is taking the compound being tested. Finally, inasmuch as there often is a large psychological effect (the placebo or “sugar pill” effect) during the first weeks of a study, there is an arm of the trial that receives an item that appears to be identical to the compound being tested, but which has no effect. Note that this is an intervention study: the researchers actively intervene by giving the compound to be studied to one or more of the arms in the trial. The idea here is to demonstrate clearly whether there is a cause and effect relationship between the item being studied and the outcome in the subjects. When possible, there is also a “cross-over” phase in which, after a sufficient washout period, the group that was the placebo arm becomes the active group and the group that had been the active arm becomes the placebo group. Not all studies lend themselves to this, but cross-over studies help ensure that there are no unrecognized predispositions in the subjects that might bias the test results. All of this sounds good in theory. Unfortunately, as shortly will be shown, this “gold standard” of clinical trials still can be biased in a variety of ways.
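
    The random assignment and cross-over steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the subject labels, the seed and the function names assign_arms and cross_over are hypothetical, and a real trial would follow its own protocol for allocation.

```python
import random

def assign_arms(subject_ids, seed=None):
    """Randomly split subjects into active and placebo arms of (nearly) equal size."""
    rng = random.Random(seed)      # seeded only so the illustration is repeatable
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)          # chance, not the investigator, decides the order
    half = len(shuffled) // 2
    return {"active": sorted(shuffled[:half]),
            "placebo": sorted(shuffled[half:])}

def cross_over(arms):
    """Swap the arms for the second period, to be run after the washout."""
    return {"active": arms["placebo"], "placebo": arms["active"]}

# Hypothetical subject labels S01..S16, invented for this example.
subjects = [f"S{i:02d}" for i in range(1, 17)]
period_1 = assign_arms(subjects, seed=2024)
period_2 = cross_over(period_1)

print("Period 1 active arm:", period_1["active"])
print("Period 2 active arm:", period_2["active"])
```

    In a double-blind trial, a list like this would be held by someone other than the participants and the investigators who assess them, so that neither group knows who is in which arm until the study is unblinded.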

    The design of trials involves at least one more component that is important for evaluating whether the results of a given study are weak or strong.

    The first step in any clinical trial is the production of a study protocol. This protocol presents three very important elements. First is the hypothesis of the study: what question is the study intended to answer?

    Second is the study population: how and why were subjects picked to be in the study; what are the criteria for inclusion and exclusion; are special conditions involved?

    Third is the size of the study sample: how many subjects are needed to ensure that the results represent true findings rather than mere chance? All studies contain these three elements, and the validity of these components (was the study question correctly framed, was the proper study population chosen, was the study carried on for an appropriate period of time, were enough subjects included to yield statistical significance, etc.) is essential for evaluating the worth of the trial.
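
    To give a feel for the third element, the sketch below uses a common textbook approximation for a two-arm comparison of means: subjects per arm = 2 x ((z_alpha + z_beta) x SD / effect)^2. The function name subjects_per_arm, the 80 percent power target and the example numbers are assumptions made for illustration; they are not taken from any study discussed here, and a real protocol would normally rely on a statistician's calculation.

```python
from math import ceil
from statistics import NormalDist

def subjects_per_arm(effect, sd, alpha=0.05, power=0.80):
    """Approximate subjects needed in each arm to detect a difference in means.

    effect: smallest difference between arms worth detecting
    sd:     assumed standard deviation of the outcome in each arm
    Uses the normal approximation with a two-sided test at level alpha.
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # cut-off for two-sided significance
    z_beta = z.inv_cdf(power)             # cut-off for the desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)

# Hypothetical outcome with a standard deviation of 10 units.
for effect in (5, 2, 1):
    print(f"effect of {effect} units: about "
          f"{subjects_per_arm(effect, sd=10)} subjects per arm")
```

    The pattern matters more than the exact figures: halving the effect one hopes to detect roughly quadruples the number of subjects required.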

    Before moving to examples of weak and strong clinical trials, a few words need to be said regarding statistical significance. The usual cut-off level is given as “p < 0.05,” which means that, if the treatment truly had no effect, a result at least as large as the one observed would be expected to arise by chance less than five percent of the time. Some statistical models are stricter than others in performing this calculation, but readers actually need to be worried about something else, which is the study sample size. If a study uses, say, only seven subjects per arm, the small size of the study means that the reported effect will need to be very large to achieve statistical significance. Conversely, and one sees this all the time in pharmaceutical studies, a trial monitoring 100,000 subjects may find significance for effects that, in practice, are so weak that they are clinically only marginally useful!
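
    The interplay between sample size and significance can be illustrated with a rough calculation. The sketch below assumes a simple two-arm comparison of means with a known, equal standard deviation and uses a normal approximation; with very small groups a t-test would be the more appropriate tool, so the small-sample figures are indicative only. The function name two_sided_p and all of the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(mean_diff, sd, n_per_arm):
    """Approximate two-sided p-value for a difference in means between two arms."""
    se = sd * sqrt(2 / n_per_arm)     # standard error of the difference
    z = mean_diff / se                # observed difference in standard-error units
    return 2 * (1 - NormalDist().cdf(abs(z)))

sd = 10  # hypothetical standard deviation of the outcome

# Seven subjects per arm: a moderate effect is lost in the noise,
# and only a very large one reaches the 0.05 cut-off.
print("n=7,       diff=5  :", round(two_sided_p(5, sd, 7), 3))
print("n=7,       diff=12 :", round(two_sided_p(12, sd, 7), 3))

# One hundred thousand per arm: a tiny effect becomes "significant".
print("n=100,000, diff=0.1:", round(two_sided_p(0.1, sd, 100_000), 3))
```

    A difference of one hundredth of a standard deviation may clear the statistical bar in a very large trial while remaining of little practical use to any individual subject, which is why the size of the effect should be read alongside the p-value.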

    As noted above, randomized placebo-controlled double-blind clinical trials are considered to be the “gold standard” for research. Nevertheless, many such trials are quite weak and misleading. For one thing, it all too often turns out that the placebo is not actually inactive; an example is the practice of using maltodextrin or other sugars as the so-called placebo in weight loss studies. Relatedly, especially in studies involving weight loss, the placebo effect can be very strong for many weeks. The placebo effect in diet studies commonly leads to the loss of two pounds in eight weeks, and much more if diet and exercise changes are included. A BBC News report on the Internet (March 10, 2004) on trials of the drug rimonabant noted that participants taking the placebo were five pounds lighter at the end of one year. In some large pharmaceutical diet trials in which subjects changed behavior, diet and exercise, the weight loss in two months using the placebo exceeded 11 pounds!

    Similarly, if exercise is included in a weight loss trial with healthy subjects, then LDL cholesterol, total cholesterol, triglycerides and leptin levels normally will go down, whereas HDL cholesterol will go up. Moderately increasing the amount of protein in the diet, likewise, will produce such trends. Hence, if a weight loss trial includes exercise and a controlled diet with increased protein, yet reports results opposite of these or fails to find weight loss in participants using the placebo (as happened recently in a highly promoted trial), then the reader should seriously wonder whether there was a lapse somewhere in either design or implementation because of the divergence from independently established outcomes. Moreover, it is often the case that even the most rock-solid of results cannot be extrapolated from one group to another. To stay with diet trials, studies performed in Asia or Latin America usually cannot be applied to American experience because the study populations and eating habits are so different. One has the right to question the reproducibility and applicability of studies.

    Of course, many studies are very strong, although this, too, can be misleading. A recent one measured the effects of short-term, oral L-arginine supplements (12 g/d for 3 weeks) in 16 hypercholesterolemic men with normal blood pressure (BP). In this randomized, double-blind, two-period crossover design study, L-arginine tablets (1 g each) and matched placebos (microcrystalline cellulose) were used. The researchers demonstrated that the L-arginine supplement increased blood plasma levels of L-arginine and significantly reduced systolic BP (p<.05) and diastolic BP (p<.001), both at rest and during acute laboratory stressors. BP reductions were associated with a significant decrease in heart output (p<.01); these changes were mediated by small reductions in the volume of blood pumped with each heart beat (p = 0.07). These results were reproduced when the placebo group crossed over, plus they make sense in terms of what is known of the role of L-arginine in the body. Note that this study examines only one intervention which is tested in several ways rather than examining several interventions (e.g., diet + exercise + compound). With only one intervention, it is relatively easy to establish a clear cause and effect relationship.

    This arginine study is an excellent example of a good study with strong results that can be completely misleading. The study lasted only three weeks. Based on a large number of similarly successful studies lasting only one or two months at a time, the temptation is to conclude that supplementing with L-arginine is a great recourse for those who are hypercholesterolemic, hypertensive, need a boost in exercise, and so forth. Unfortunately, such conclusions would be wrong. As uncovered by a researcher who had been a proponent of L-arginine supplementation, long-term supplementation with L-arginine (in this case, six months) may lead either to null results or to actual harm.¹ The body consists of a vast number of interconnected metabolic processes that are taking place simultaneously. A beneficial effect in one area sometimes is followed by a not so good effect someplace else. Hence, even with well-designed trials, there can remain hidden or submerged issues of which we become aware only much later.

    Judging a clinical trial first requires establishing what type of test is involved—case-control, cohort or randomized controlled trial—because the type of test is the first clue as to how impartial the observations might be. Next, one must look closely at the components of the trial—the hypothesis of the study, the study population and the size of the study sample. A lack of clarity or inappropriateness in any one of these will reduce the quality of the data and undermine the analyses, interpretations and extrapolations based on the trial. Finally, clinical trials seldom exist in a vacuum. A given trial needs to be evaluated in light of related trials, especially trials conducted by researchers whose concerns and orientations are different from those involved with the test being evaluated. Readers interested in pursuing this topic are urged to examine Richard K. Riegelman, Studying a Study and Testing a Test (6th edition, 2012).

    1 Wilson AM, Harada R, Nair N, Balasubramanian N, Cooke JP. L-arginine supplementation in peripheral arterial disease: no benefit and possible harm. Circulation. 2007 Jul 10;116(2):188-95. Epub 2007 Jun 25.