A Brief Tutorial – Clinical Trials and Statistics
In yesterday’s blog post, I explained that I would begin a blog series in which we would examine studies, data and reports that are appearing in the medical literature about long-term potential harms that can result from SARS-CoV-2 infection. Let me qualify that when I refer to “long-term,” the longest interval from the first known infections from this novel virus to now is not quite 2 ½ years (not a long time when it comes to viruses). It certainly may be the case that some of the health effects we are seeing now will resolve over time. It is also possible that we won’t begin to identify other health effects until years from now.
But, before I launch into this series, we need to cover some concepts so that those without a scientific or medical background can understand what we are talking about. We will start with a brief overview of clinical study design, how to interpret clinical studies, and a smattering of cell biology, virology and immunology. I promise to try to make it all interesting and not overly complicated! In doing so, my apologies in advance to statisticians, cell biologists, virologists, immunologists and all other experts in the fields to which I cannot even begin to do justice.
Let’s start by understanding clinical trials/study design. If you want to know why it is important, merely look to the drama surrounding the ivermectin clinical studies. If you don’t understand the concepts I am going to cover, you could very well look at these trials and believe that there is a lot of evidence to support the use of ivermectin in the treatment of COVID-19. However, once you understand how to look at clinical trials, you realize that the weight of the evidence is pretty convincing that ivermectin is not effective.
Here we go. There are many ways that a clinical study can be designed. They often fit nicely into a number of categories and the weight of the evidence from the study should take into account the study design.
Let’s start with a few principles:
- When you want to apply the findings of a clinical trial to practice, you need to know the population that was tested in the study and understand how the patient you are treating or the person to whom you are providing advice might differ. As an example, a study done to look at the health effects of SARS-CoV-2 infection in nursing home residents may not allow one to conclude that the same health effects would be seen in children or college students, or even middle-aged adults. Why? Nursing home residents would be older and the very fact that they are in a nursing home suggests that they have significant limitations to care for themselves either due to comorbid health conditions or physical limitations or both. We know that older people as well as those with certain health conditions are at increased risk for severe disease with SARS-CoV-2 infection.
- When we are dealing with a subject like SARS-CoV-2, it is also helpful to know what the study period was and the countries the subjects were from. If we were doing a study to measure the effectiveness of the vaccines or a monoclonal antibody treatment, this would be critical information because that effectiveness can vary depending on the variant that was circulating at the time. The effectiveness of the Regeneron monoclonal antibody treatment was excellent early in the pandemic, but is almost zero today due to the shift in variants.
- You will also want to look at what question the authors of the study are trying to answer. For example, when we look at COVID-19 vaccine effectiveness studies, it is very important to know “effectiveness of what?” If a study is conducted to look at the effectiveness of vaccines in preventing severe disease and the authors define severe disease as hospitalization or death, then they likely are not conducting the kind of tests that would be necessary to answer the question of vaccine effectiveness against infection or vaccine effectiveness against symptomatic infection. Let’s now say that the authors are conducting a study to determine the effectiveness of vaccines in preventing symptomatic infection. Well, then likely the study design will inform you that they only tested subjects when they exhibited symptoms. Therefore, that study will not answer the question as to how well vaccines prevent all infections, since we know that with COVID-19, many people have no symptoms or symptoms that they don’t realize are cause for testing. If we wanted a study to evaluate how effective the vaccines were in preventing all infections, then we would vaccinate subjects and then routinely test them to look for evidence of infection regardless of how they felt.
- Finally, we need to be careful about not confusing an association with causation. When I was early in my medical practice, an observational study had found that treating menopausal women with hormone replacement led to better cardiovascular outcomes. So, we all prescribed women hormone replacement treatment referring to its “cardio-protective effect.” Years later, randomized controlled trials demonstrated that at best the effect was insignificant and at worst, the hormone replacement treatment was actually placing women at risk for worse cardiovascular outcomes. We’ll discuss these different types of studies below, but why would they come to such different conclusions? Observational studies are prone to identifying associations rather than causation. In this case, women who were most likely to seek and receive hormone replacement treatment at that time were in higher socioeconomic strata, with better access to health care, better nutrition, less likely to be smokers and more likely to belong to a gym or participate in regular exercise, and all of those factors would likely have contributed to those study participants’ lower risk for cardiovascular disease.
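For readers who like to see ideas in code, here is a minimal Python sketch of how an association can appear without any causation. Everything in it is made up for illustration: the function name `simulate`, the 50/50 split between "healthy lifestyle" and not, and all of the probabilities are assumptions of mine, not numbers from the hormone replacement studies. The treatment in this toy model has no effect at all; only lifestyle changes the risk of a cardiovascular event.

```python
import random

def simulate(n, randomized, seed=0):
    """Return (event rate in treated, event rate in untreated).

    The treatment has NO true effect here; only lifestyle changes the risk.
    All probabilities are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    events = {True: 0, False: 0}
    counts = {True: 0, False: 0}
    for _ in range(n):
        healthy = rng.random() < 0.5  # hidden factor: healthy lifestyle
        if randomized:
            # Randomized trial: a coin flip decides, ignoring lifestyle.
            treated = rng.random() < 0.5
        else:
            # Observational: healthier people are more likely to seek treatment.
            treated = rng.random() < (0.8 if healthy else 0.2)
        # Risk of an event depends only on lifestyle, never on treatment.
        event = rng.random() < (0.05 if healthy else 0.15)
        counts[treated] += 1
        events[treated] += event
    return events[True] / counts[True], events[False] / counts[False]

obs_treated, obs_untreated = simulate(100_000, randomized=False)
rct_treated, rct_untreated = simulate(100_000, randomized=True)

print(f"Observational: treated {obs_treated:.3f} vs untreated {obs_untreated:.3f}")
print(f"Randomized:    treated {rct_treated:.3f} vs untreated {rct_untreated:.3f}")
```

In the observational version, the treated group looks noticeably healthier even though the treatment does nothing, because healthier people chose it. In the randomized version, the two groups end up with roughly the same event rate.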
Types of clinical studies:
- Randomized trials – These are generally the highest quality trials and ones that are better designed to answer the question of whether something is an association or a cause. These trials are of higher quality because they divide study participants into groups of test subjects and so-called “controls.” The test subjects will receive an intervention, let’s say a vaccine, whereas the control group receives perhaps an injection of normal saline. We call the trial randomized because we take all the people who have signed up for the trial and “randomize” them into one of these two groups. Often that is done these days with the benefit of a computer. The best studies will make sure that the two groups look as similar as possible, e.g., same age ranges and average age between the test group and the control group. We can even make these studies better when the study participants are “blinded,” that is to say they don’t know whether they are receiving the intervention or a placebo. Why would that be better? Because people can filter the symptoms they report or their perception of how they feel based upon whether they believe they are being exposed to something or treated with something. We can make it even stronger by making the study “double-blinded,” meaning that neither the study participants nor the investigators know whether someone has received the intervention or the placebo, because now the investigators do not filter the evaluation of the study participants based upon knowing whether they received the intervention or not.
Let’s take a real-life example to see why a randomized trial can be so helpful. No doubt you heard a few doctors who stated something to the effect that “I treated all my COVID-19 patients with ivermectin and they all did great!” So, is that pretty good proof that ivermectin works? No, because maybe they treat a young, healthy population that was going to do great anyway, whether they received ivermectin or not. We would need to know more – how many patients did they treat, what were their conditions when they began treatment, how did they follow them up and for how long, how do they know whether any of these patients got sick and went to a hospital or died? This is where having a control group helps you answer this question.
Suppose I told you that I told all my friends to eat an orange every day during the pandemic and none of us got COVID. Does that mean that an orange a day will prevent COVID? No, because you don’t know the characteristics of my friends. They are likely older individuals and may not have school-aged children in the home. They are likely to be in health-care related fields and probably more likely to mask, avoid large crowds and get vaccinated.
- Observational studies. These are studies where we don’t make an intervention, but rather just observe differences between groups to see if there are different outcomes. These studies are prone to errors and particularly to identifying an association or correlation rather than causation. That doesn’t mean that they aren’t helpful and can’t provide us with insights. An observational study could be comparing COVID-19 disease transmission rates in a school with a mask requirement against a school without one.
- A common type of study we will look at will be a case-control study. In these studies, we are usually comparing a group with the disease or condition to a group without. For example, we could follow a group of college athletes who got COVID-19 and compare their exercise tolerance with a group of college athletes who did not report getting COVID-19 to see if there are differences.
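The randomization step described above for randomized trials can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the helper names `randomize` and `mean_age` are hypothetical, the participants are invented, and real trials often use more elaborate schemes than a simple shuffle.

```python
import random

def randomize(participants, seed=None):
    """Shuffle participants and split them into two equally sized arms."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def mean_age(group):
    """Average age of a group of (participant_id, age) pairs."""
    return sum(age for _, age in group) / len(group)

# Hypothetical volunteers: (participant_id, age), ages invented at random.
age_rng = random.Random(0)
participants = [(i, age_rng.randint(18, 90)) for i in range(1000)]

treatment, control = randomize(participants, seed=42)

print(len(treatment), len(control))
print(round(mean_age(treatment), 1), round(mean_age(control), 1))
```

With enough participants, chance alone tends to make the two groups look alike on age, and, more importantly, on characteristics nobody thought to measure.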
There are a number of other refinements of how studies can be designed, but I think this gives you a sufficient background for now.
Let’s finish with a few concepts from statistics as to how to interpret the findings of a clinical trial.
With apologies in advance to any statisticians who may read this for the extreme oversimplification, when we look at the results of a clinical trial, we want to know how likely it is that these results could have happened by chance. For example, if I want to know what percentage of the population has blue eyes, it is not practical for me to check every person’s eye color. But, if I have a representative sample, it is possible to estimate this percentage. The key is “representative.” If I just sample people in my neighborhood, that is unlikely to be representative of the entire population. So, when we look at the results from studies, we look for statistical measures that tell us two things: whether the results are “statistically significant,” and, if we repeated the experiment 100 times, how wide a range of results we would be likely to get.
So, in determining statistical significance, we often look at the “p” values. A p-value represents the probability that a result at least as extreme as the one we observed would be produced by random sampling alone, given a set of assumptions about the population (usually, that the intervention has no effect). When we make an intervention in a clinical trial, we are usually hoping that the result of the prevention or treatment would be unlikely to occur randomly in a population. For example, if we vaccinate 1,000 people and 2 people develop Bell’s palsy, is that similar to the rate of Bell’s palsy in the general population (i.e., might have occurred by chance) or is it significantly lower (maybe the vaccine helps prevent the occurrence of Bell’s palsy) or is it significantly higher (maybe the vaccine causes Bell’s palsy as an adverse event)? Statisticians (I am told) consider a p-value of 0.05 or less to be statistically significant. In clinical trials, we like to see much lower p-values (in the thousandths) to give us confidence that a treatment really works or really doesn’t work. For example, if we had a randomized controlled trial with the study group taking an antibiotic and the control group just using symptomatic treatment (rest, anti-inflammatory medications, etc.), and the outcome in the antibiotic-treated group was superior to the control group with a p<0.001, then we would feel very confident that the antibiotic worked, with the p-value telling us that if the antibiotic truly did nothing and we repeated this trial 1,000 times, we would see a difference this large at most one time.
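To make the Bell’s palsy example concrete, here is a rough Python sketch of one way a p-value could be calculated. Please treat the background rate as a placeholder: I have assumed 1 case per 1,000 people over the study period purely for illustration, and it is not a real epidemiological figure. The function name `binomial_tail` is also mine. The sketch asks a one-sided question: if the vaccine had no effect, how often would we see 2 or more cases among 1,000 people by chance?

```python
from math import comb

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of k or more events."""
    # Subtract the probability of seeing fewer than k events from 1.
    return 1.0 - sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k))

vaccinated = 1000
observed_cases = 2
assumed_background_rate = 0.001  # hypothetical: 1 case per 1,000 people

p_value = binomial_tail(vaccinated, observed_cases, assumed_background_rate)
print(f"one-sided p-value = {p_value:.3f}")
```

Under that assumed background rate, the p-value comes out at roughly 0.26, nowhere near 0.05, so 2 cases in 1,000 would be quite compatible with chance. A different background rate would give a different answer, which is why knowing the rate in the general population matters so much.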
The other very helpful statistical tool in interpreting how much to rely on the results of a study is the confidence interval (CI). People will recall that when the initial results for COVID-19 vaccine efficacy came out, one of the trials showed that the efficacy was 94.1%. People threw that number around like it was handed down from above. However, if you repeated that study many times, you wouldn’t end up with exactly the same result because there would be different people in different studies. For example, another trial might give us 92% and another one 95%. Statisticians can calculate a 95% CI, a range of numbers that we can be 95% sure contains the true value for the population. So, for example, the result might be 94.1%, but then we can look to the 95% CI, and let’s say that is 92 – 97%. That would mean that the vaccine efficacy could realistically be as low as 92% or as high as 97%. Now, 95% confidence intervals also help us judge how much “confidence” (pardon the pun) we should place in the results. If a clinical trial shows that the effectiveness of a treatment is 64%, but the 95% CI is 33 – 76%, then we know that we cannot place a lot of trust in the 64% number, because it could be as little as half that. When the confidence interval is that wide, usually, in my experience, it means that the study population was small.
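Here is a small Python sketch of why small studies give wide confidence intervals. It uses the textbook normal approximation for a simple proportion, which is my simplification and not the method used to calculate vaccine efficacy intervals in the actual trials. The function name `proportion_ci_95` and the patient counts are hypothetical.

```python
from math import sqrt

def proportion_ci_95(successes, n):
    """Approximate 95% CI for a proportion (normal approximation)."""
    p = successes / n
    margin = 1.96 * sqrt(p * (1 - p) / n)  # 1.96 standard errors either side
    return max(0.0, p - margin), min(1.0, p + margin)

# The same 64% response rate, seen in a small and a large hypothetical study.
small = proportion_ci_95(32, 50)
large = proportion_ci_95(3200, 5000)

print(f"n=50:   {small[0]:.1%} to {small[1]:.1%}")
print(f"n=5000: {large[0]:.1%} to {large[1]:.1%}")
```

With 50 patients the interval runs from roughly 51% to 77%, while with 5,000 patients it tightens to roughly 63% to 65%, even though the headline number is 64% in both cases.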
The most common statistics we are going to look at when we look at the subject of adverse health effects following COVID-19 will be risk ratios. We will see several versions. One will be relative risk (RR). This is the risk in one population relative to the risk in a different one, or in terms of our study group and the control group, the risk of developing an outcome in the study group divided by the risk of developing the outcome in the control group. Back to my vaccine example, if the rate of development of Bell’s palsy in the vaccine group was the same as the rate of developing it in the control group, then the relative risk is 1 and the vaccine is unlikely to be the cause of the Bell’s palsy. Another ratio we will deal with is the odds ratio (OR). The odds of an outcome are the chance of it occurring vs. the chance of it not occurring, and the odds ratio compares the odds in one group with the odds in another. For example, we will look at a study that shows 20 adverse cardiovascular events that can occur as a result of SARS-CoV-2 infection and for each event, there will be an odds ratio indicating how much having COVID-19 elevates your odds of developing a cardiovascular condition compared with not having had COVID.
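The arithmetic behind these two ratios is simple enough to show in a short Python sketch. The counts below are invented for illustration (30 events among 1,000 people who had COVID-19 and 10 events among 1,000 who did not), and the function names are my own.

```python
def relative_risk(events_a, total_a, events_b, total_b):
    """Risk in group A divided by risk in group B."""
    return (events_a / total_a) / (events_b / total_b)

def odds_ratio(events_a, total_a, events_b, total_b):
    """Odds in group A divided by odds in group B."""
    odds_a = events_a / (total_a - events_a)  # events vs. non-events
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b

# Hypothetical counts: cardiovascular events in each group.
covid_events, covid_total = 30, 1000
no_covid_events, no_covid_total = 10, 1000

rr = relative_risk(covid_events, covid_total, no_covid_events, no_covid_total)
odds = odds_ratio(covid_events, covid_total, no_covid_events, no_covid_total)

print(f"relative risk = {rr:.2f}")
print(f"odds ratio    = {odds:.2f}")
```

With these made-up numbers, the relative risk is 3 and the odds ratio is just over 3. When an outcome is rare, as it is here, the two ratios land close together; they drift apart as the outcome becomes more common.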
Well, this very basic level of understanding is pretty much all you will need to understand the clinical studies we are going to review in this blog series. Now that I have probably irritated all the statisticians out there with my oversimplification, and probably not technically correct explanations, with the next blog piece, I will see if I can similarly irritate the biologists, virologists and immunologists. But, that will be our last tutorial, and then we are ready to dive into the studies!