This notebook is the data analysis I did for this post.
Boulware et al.
The incidence of new illness compatible with Covid-19 did not differ significantly between participants receiving hydroxychloroquine (49 of 414 [11.8%]) and those receiving placebo (58 of 407 [14.3%])
The study used Fisher's exact test to calculate the p-value. In this case, p is the answer to the question, "Assume that there are $821$ patients, of whom $107$ are sick. We pick $414$ of them at random. We would expect $414 \cdot 107 / 821 \approx 54$ to be sick. What's the chance that 49 or fewer are sick, or 59 or more are sick?"
($49$ is the number we actually observed, and $59$ goes the same distance in the opposite direction, since it's a two-tailed test.)
For instance, the chance that the first 49 patients we choose are sick is $\frac{107}{821} \cdot \frac{106}{820} \cdots \frac{59}{773}$. Of the $772$ patients not yet chosen, $58$ of them are sick. The chance that the next $365$ patients we choose are healthy is $\frac{714}{772} \cdot \frac{713}{771} \cdots \frac{350}{408}$. But it was arbitrary to make the first $49$ patients sick, so we multiply by $\binom{414}{49}$.
Let's write a function that does just that.
Let's check that the chances sum to 1:
And we get p=0.35, just as the paper says.
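As a minimal sketch of the calculation described above, here is a plain-Python version. The name `hypergeom_pmf`, the constants, and the use of exact fractions are illustrative choices, not necessarily what the original cells did.

```python
from fractions import Fraction
from math import comb

TOTAL = 414 + 407   # all participants in the trial
SICK = 49 + 58      # all participants who became ill
DRAWN = 414         # size of the hydroxychloroquine group

def hypergeom_pmf(k):
    """Chance that exactly k of the DRAWN randomly chosen patients are sick."""
    ways = comb(SICK, k) * comb(TOTAL - SICK, DRAWN - k)
    return Fraction(ways, comb(TOTAL, DRAWN))

# The chances over every possible k should sum to exactly 1.
total_probability = sum(hypergeom_pmf(k) for k in range(SICK + 1))

# Two-tailed p-value: 49 or fewer sick, or 59 or more sick.
p_value = (sum(hypergeom_pmf(k) for k in range(0, 50))
           + sum(hypergeom_pmf(k) for k in range(59, SICK + 1)))

print(total_probability)
print(round(float(p_value), 2))
```

Exact fractions keep the sum-to-1 check free of rounding error; the p-value is converted to a float only for display.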
Now let's do the Bayesian analysis I describe in the blog.
Digression: what if the data were different? That is, suppose the hydroxychloroquine results were the same as the control group's?
Specifically, suppose 59 out of 414 of the hydroxychloroquine group got infected. That's 14.3%, almost exactly the same as the placebo group's 14.3%. Now follow the same procedure as before:
Another digression: What if our priors were different?
End of digressions.
We can also integrate the likelihood function to get probabilities. (Well, we should multiply by the prior, but we assume the prior is uniform, i.e. constant.)
So the probability of effectiveness is roughly 70%.
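To make the one-dimensional integral concrete, here is a hedged sketch in plain Python rather than SageMath. It assumes the placebo rate is held fixed at 58/407 and that the likelihood is binomial in the hydroxychloroquine rate. The function name `posterior_mass_below` and the midpoint rule are illustrative choices. Since the text does not say which threshold "effectiveness" refers to, the sketch computes two candidates.

```python
from math import exp, log

def posterior_mass_below(threshold, events, n, steps=20000):
    """P(rate < threshold | data) under a uniform prior, by the midpoint rule."""
    mids = [(i + 0.5) / steps for i in range(steps)]
    # Work with log-likelihoods so that tiny values do not underflow.
    logs = [events * log(p) + (n - events) * log(1 - p) for p in mids]
    peak = max(logs)
    weights = [exp(v - peak) for v in logs]
    below = sum(w for p, w in zip(mids, weights) if p < threshold)
    return below / sum(weights)

placebo_rate = 58 / 407  # treated as a fixed, known number in this 1-D version

# Chance that the hydroxychloroquine rate is below the placebo rate at all,
# and chance that it is at least 10% lower in relative terms.
p_any = posterior_mass_below(placebo_rate, 49, 414)
p_ten = posterior_mass_below(0.9 * placebo_rate, 49, 414)
print(round(p_any, 3), round(p_ten, 3))
```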
But really we should do this in 2 dimensions, since the control group's infection rate is also determined by the data, so it may vary.
Now we need to do some integrals. Symbolic integration is theoretically possible, but the result is a really long expression. You don't believe me? Here it is:
Worse, the expression has lots of large (more than 100 digits) numbers, so SageMath encounters errors and thinks that the integral is sometimes negative, which is ridiculous.
Where are we integrating? The placebo proportion is represented by , the hydroxychloroquine proportion by . We want to know the chance that the hydroxychloroquine proportion is at least better than the placebo proportion. That is, we need .
(Of course, we are assuming that our prior knowledge is uniform.)
So there's an 85% chance that hydroxychloroquine is better than the placebo, and a 67% chance that it reduces cases by at least 10%.
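The two-dimensional integral can also be approximated without SageMath. The sketch below is an assumption-laden stand-in, not the notebook's own cell: it puts independent uniform priors on both proportions, discretizes each posterior on a grid, and sums the mass in the region where the hydroxychloroquine rate is below `factor` times the placebo rate. The names `grid_posterior` and `prob_rate_below` are hypothetical.

```python
from math import exp, log

def grid_posterior(events, n, steps):
    """Posterior weight of each grid cell for a binomial rate, uniform prior."""
    mids = [(i + 0.5) / steps for i in range(steps)]
    logs = [events * log(p) + (n - events) * log(1 - p) for p in mids]
    peak = max(logs)
    raw = [exp(v - peak) for v in logs]
    total = sum(raw)
    return [r / total for r in raw]

def prob_rate_below(treated, control, factor=1.0, steps=4000):
    """P(treated rate < factor * control rate), independent uniform priors."""
    w_t = grid_posterior(treated[0], treated[1], steps)
    w_c = grid_posterior(control[0], control[1], steps)
    # cumulative[i] is the treated-rate mass in the first i grid cells.
    cumulative = [0.0]
    for w in w_t:
        cumulative.append(cumulative[-1] + w)
    prob = 0.0
    for j, w in enumerate(w_c):
        # Upper limit for the treated rate, measured in grid cells.
        position = min(factor * (j + 0.5), float(steps))
        cell = int(position)
        # Count the fraction of the cell that lies below the limit.
        partial = (position - cell) * w_t[cell] if cell < steps else 0.0
        prob += w * (cumulative[cell] + partial)
    return prob

better = prob_rate_below((49, 414), (58, 407))
better_by_10 = prob_rate_below((49, 414), (58, 407), factor=0.9)
print(round(better, 2), round(better_by_10, 2))
```

Summing one posterior against the cumulative mass of the other keeps the work linear in the number of grid cells, instead of visiting every cell of a full two-dimensional grid.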
Before I figured out how to use SageMath's built-in numerical integration, I did it using Monte Carlo integration, which was much slower and less accurate.
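For comparison, one common way to do a Monte Carlo estimate (not necessarily the approach used in the original cell) relies on the fact that a binomial likelihood with a uniform prior gives a Beta posterior. The sketch below samples both rates from their posteriors and counts how often the treated rate wins. The function name `monte_carlo_prob`, the number of draws, and the fixed seed are illustrative.

```python
import random

def monte_carlo_prob(treated, control, factor=1.0, draws=50000, seed=0):
    """Estimate P(treated rate < factor * control rate) by posterior sampling."""
    rng = random.Random(seed)
    t_events, t_n = treated
    c_events, c_n = control
    hits = 0
    for _ in range(draws):
        # Uniform prior + k events in n trials gives Beta(k + 1, n - k + 1).
        p_t = rng.betavariate(t_events + 1, t_n - t_events + 1)
        p_c = rng.betavariate(c_events + 1, c_n - c_events + 1)
        if p_t < factor * p_c:
            hits += 1
    return hits / draws

mc_better = monte_carlo_prob((49, 414), (58, 407))
mc_better_by_10 = monte_carlo_prob((49, 414), (58, 407), factor=0.9)
print(mc_better, mc_better_by_10)
```

The estimate carries sampling noise that shrinks only with the square root of the number of draws, which is why this route is slower for a given accuracy.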
When I analyze Horby et al. later on, I have to use my own integration method because SageMath won't cooperate. Let's check that my method gives the right answers for this problem.
Let's look at the participants who took the drug completely: 43/312 in the experimental group got infected, versus 50/336 in the control group.
(See the paper's supplemental appendix.)
What about the participants who took some of the drug? 4/37 in the experimental group got infected, versus 3/15 in the control group.
What about the participants who took none of the drug? 2/65 in the experimental group got infected, versus 5/56 in the control group.
Let's compare the percentage sick in the various subgroups of this study.
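A small sketch of that comparison, using only the counts quoted above; the dictionary layout and the `percent` helper are illustrative.

```python
# (infected, group size) for the hydroxychloroquine and placebo arms.
subgroups = {
    "all participants": ((49, 414), (58, 407)),
    "took all of the drug": ((43, 312), (50, 336)),
    "took some of the drug": ((4, 37), (3, 15)),
    "took none of the drug": ((2, 65), (5, 56)),
}

def percent(events, n):
    """Percentage of a group that got infected."""
    return 100 * events / n

for name, (hcq, placebo) in subgroups.items():
    print(f"{name}: hydroxychloroquine {percent(*hcq):.1f}%, "
          f"placebo {percent(*placebo):.1f}%")
```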
Skipper et al.
At 14 days, 24% (49 of 201) of participants receiving hydroxychloroquine had ongoing symptoms compared with 30% (59 of 194) receiving placebo (P = 0.21).
Same procedure as last time.
So hydroxychloroquine is probably effective.
So there's a 91% chance that hydroxychloroquine is better than the placebo, and a 76% chance that it reduces cases by at least 10%.
The following is speculative. I might be comparing apples to oranges, and I'm not sure that a uniform prior is a reasonable choice.
Since Skipper et al. and Boulware et al. are kind of measuring the same thing, we can try to combine their statistics. That is, assume that the chance of developing symptoms in Boulware et al. is some fraction of the chance in Skipper. Then we can "import" Boulware's data into Skipper.
So it looks like a 95% chance that hydroxychloroquine does not make patients more sick, and a 79% chance that it reduces cases by at least 10%. I don't trust these numbers. Even if it were valid to compare the studies, I'm not sure that this statistical method is the right one. It gives probabilities that are lower than I expected.
Let's do the integrals a slower way.
What is the constant ?
The constant is somewhere around 0.5, which makes sense because all the proportions from Boulware are half of their counterparts in Skipper, as you can see:
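As a quick check of the "half" claim, this sketch divides each Boulware proportion by its Skipper counterpart, using the counts quoted from the two papers. The variable names are illustrative.

```python
# (events, group size) per arm, as quoted from each paper.
boulware = {"hydroxychloroquine": (49, 414), "placebo": (58, 407)}
skipper = {"hydroxychloroquine": (49, 201), "placebo": (59, 194)}

ratios = {}
for arm in boulware:
    b_events, b_n = boulware[arm]
    s_events, s_n = skipper[arm]
    # Ratio of the Boulware proportion to the Skipper proportion for this arm.
    ratios[arm] = (b_events / b_n) / (s_events / s_n)

for arm, ratio in ratios.items():
    print(arm, round(ratio, 3))
```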
Skipper et al. also looks at hospitalization/death rates:
The incidence of hospitalization or death was 3.2% (15 of 465) among participants with known vital status. With hydroxychloroquine, 4 hospitalizations and 1 nonhospitalized death occurred (n = 5 events). With placebo, 10 hospitalizations and 1 hospitalized death occurred (n = 10 events); of these hospitalizations, 2 were not COVID-19–related (nonstudy medicine overdose and syncope). The incidence of hospitalization or death did not differ between groups (P = 0.29).
Based on the flow chart (Figure 1), the placebo group had 234 patients, and the hydroxychloroquine group had 231 patients. This does add up to 465. I'm going to decrease the placebo group's hospitalizations by 2 because those 2 patients were not hospitalized for COVID-19.
So there's a 73% chance that hydroxychloroquine decreases hospitalizations/deaths by at least 10%, and a 79% chance that it decreases hospitalizations. I'm surprised that we can get such high probabilities from just a few hospitalizations, but the placebo group did have 60% more of them.
Mitjà et al.
The clinical outcome of risk of hospitalization was similar in the control arm (7.1%, 11/157) and the intervention arm (5.9%, 8/136;RR 0.75 [95% CI 0.32; 1.77]) (Table 2).
Here we go again.
This is what inconclusive data looks like. The results lean toward hydroxychloroquine reducing hospitalizations, but there's still a 56% chance that it's ineffective or harmful.
So there's a 64% chance that hydroxychloroquine is better than the placebo, and only a 54% chance that it reduces cases by at least 10%. The prior probabilities were 50% and 45%, so we're not updating by very much.
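The prior figures follow from geometry. Writing $x$ for the control proportion and $y$ for the hydroxychloroquine proportion (symbols chosen here for illustration), a uniform prior spreads probability evenly over the unit square, so each prior probability is the area of a triangle:

$$
P(y < x) = \int_0^1 x \, dx = \frac{1}{2},
\qquad
P(y < 0.9x) = \int_0^1 0.9x \, dx = 0.45 .
$$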
Let's bring in the hospitalization data from Skipper et al. and use my highly speculative method.
So we have a 79% chance that hydroxychloroquine is safer than the placebo, and a 69% chance that it reduces cases by at least 10%. If possible, I trust these numbers even less than when I combined two studies up above. The two following expressions show that it is not the case that the results of Mitjà et al. are just a multiple of Skipper et al. That breaks the critical assumptions of my methods. It is much better to do the statistics separately for each study and then make an intuitive judgement about what they mean together.
(I'm sorry I'm using so much bold, but I want to be very clear about which sections of this notebook are unreliable.)
And thus the constant has a wide distribution, indicating that there really isn't a value of that works:
Horby et al.
Let's do the big British randomized controlled trial. This one studied whether hydroxychloroquine prevented deaths, and the answer was no.
Results: 1561 patients randomly allocated to receive hydroxychloroquine were compared with 3155 patients concurrently allocated to usual care. Overall, 418 (26.8%) patients allocated hydroxychloroquine and 788 (25.0%) patients allocated usual care died within 28 days (rate ratio 1.09; 95% confidence interval [CI] 0.96 to 1.23; P=0.18).
So hydroxychloroquine is probably dangerous.
Ah, now the result is more subtle. Assume we start with a uniform prior. The data tells us that there's an chance that hydroxychloroquine increases the death rate, but there's only a chance that it raises the death rate by at least .
Of course, the chance that hydroxychloroquine lowers the death rate by at least is minuscule: .
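With counts this large, one practical obstacle for any hand-rolled integration is floating-point underflow: the raw likelihood is far smaller than the smallest representable double. This is offered as a general observation, not as a diagnosis of what went wrong in SageMath. The sketch below shows the underflow and a standard workaround (summing in log space and subtracting the peak), then estimates the chance that hydroxychloroquine raises the death rate under independent uniform priors. All function names are hypothetical.

```python
from math import exp, log

def naive_likelihood(p, deaths, n):
    """Binomial likelihood without the constant factor; underflows for big n."""
    return p ** deaths * (1 - p) ** (n - deaths)

def log_likelihood(p, deaths, n):
    return deaths * log(p) + (n - deaths) * log(1 - p)

def normalized_weights(deaths, n, steps=4000):
    """Posterior weight per grid cell, computed stably in log space."""
    mids = [(i + 0.5) / steps for i in range(steps)]
    logs = [log_likelihood(p, deaths, n) for p in mids]
    peak = max(logs)
    raw = [exp(v - peak) for v in logs]  # the largest term is exactly 1
    total = sum(raw)
    return [r / total for r in raw]

# Near the observed rate, the naive product underflows to zero.
print(naive_likelihood(0.268, 418, 1561))

hcq = normalized_weights(418, 1561)
usual = normalized_weights(788, 3155)

# P(hydroxychloroquine death rate > usual-care death rate): for each
# usual-care cell, add the hydroxychloroquine mass in higher cells plus
# half of the mass in the same cell.
prob_hcq_worse = 0.0
tail = 1.0
for w_h, w_u in zip(hcq, usual):
    tail -= w_h
    prob_hcq_worse += w_u * (tail + 0.5 * w_h)

print(round(prob_hcq_worse, 3))
```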