Why we may trust registered clinical trials
The COVID-19 pandemic is disrupting all our lives. A safe and effective treatment, or even better, a vaccine, would be a gamechanger (Aghion et al. 2020). There are many promising candidates in development, but before regulators such as the US Food and Drug Administration (FDA) can grant approval for wide use of a vaccine, its safety and efficacy need to be demonstrated in clinical trials – just as for any other newly developed drug. By 30 August 2020, 3,176 studies related to COVID-19 were registered on ClinicalTrials.gov, the largest registry for clinical trials. All the big pharmaceutical companies are involved in the race for the first big breakthrough, but many smaller and less famous biotech enterprises have thrown their hats in the ring as well.
Clinical research with human volunteers should be held to the highest ethical standards at all times, but in light of the current global emergency, the integrity of these clinical studies is arguably more crucial than ever. Just imagine a vaccine that does more harm than good being approved prematurely and distributed to billions of people.
At the same time, we often hear allegations that investigators are subject to major financial conflicts of interest. Massive research and development costs, and the lure of even larger profits in case of success, might push investigators to withhold unfavourable results or beautify data. The ‘winner of the vaccine race’ will strike gold given the worldwide demand. This outlook led many companies to take large risks with high up-front investments that may be lost if their attempts are not successful – or even if they succeed but not fast enough to be among the first. On top of this financial pressure, there is the pressure of the public clamouring for the panacea that allows everyone to get back to their normal lives.
In a paper recently published in PNAS, we systematically analysed the results of pre-approval (phase II and phase III) drug trials reported to ClinicalTrials.gov before the pandemic, up to August 2019 (Adda et al. 2020). Our results are, overall, reassuring about the integrity of registered clinical trials and give reason to hope that the results of COVID-19 trials can also be trusted.
Our analysis focuses on the distribution of z-scores, a statistical measure that is isomorphic to p-values, comparable for all trials, and plays an important role in the evaluation by drug regulators. As shown in Figure 1, we do not find an artificial spike of z-scores right above 1.96, the salient threshold for statistical significance at the 5% level. This threshold is often used as the first point of reference for the evaluation of the strength of statistical evidence.
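To make the link between p-values and z-scores concrete, here is a minimal Python sketch of the conversion. It assumes two-sided p-values and a standard normal reference distribution; the function names `z_from_p` and `p_from_z` are illustrative and are not taken from the paper.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
_STD_NORMAL = NormalDist()


def z_from_p(p_value: float) -> float:
    """Convert a two-sided p-value into the corresponding absolute z-score."""
    if not 0.0 < p_value < 1.0:
        raise ValueError("p_value must lie strictly between 0 and 1")
    # A two-sided p-value puts p/2 of the probability mass in each tail.
    return _STD_NORMAL.inv_cdf(1.0 - p_value / 2.0)


def p_from_z(z: float) -> float:
    """Convert a z-score back into a two-sided p-value."""
    return 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z)))
```

Under these assumptions, a p-value of 0.05 maps to a z-score of roughly 1.96, which is why that value serves as the salient threshold.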
Figure 1 Comparison of phase II and phase III densities of the z-score and tests for discontinuity at z=1.96, depending on the affiliation of the lead sponsor
Notes: Density estimates of the constructed z statistics for primary outcomes of phase II (dashed blue lines) and phase III (solid gray lines) trials are shown. The shaded areas are 95% confidence bands, and the vertical lines at 1.96 correspond to the threshold for statistical significance at the 0.05 level.
Source: Adda et al. (2020).
Previous studies of publications in academic journals across a number of disciplines (including economics) identified a bunching of results right above this threshold, which is commonly interpreted as evidence of the manipulation of results to clear this hurdle (‘p-hacking’) (Brodeur et al. 2016). This does not seem to happen in phase II and phase III trials that report to ClinicalTrials.gov, whether in trials by non-industry sponsors (panel B), in trials conducted by the ten largest pharmaceutical companies ranked by revenue (panel C), or in trials by the remaining smaller industry sponsors (panel D).
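One simple way to look for bunching of this kind is a so-called caliper test, which compares the number of z-scores in a narrow window just below the threshold with the number just above it. The Python sketch below illustrates that idea only; it is not the density discontinuity test reported in Figure 1. The window width of 0.1 and the function name `caliper_test` are assumptions chosen for illustration, and the test relies on the approximation that a smooth density puts about equal mass on both sides of a narrow window.

```python
from math import comb


def caliper_test(z_scores, threshold=1.96, width=0.1):
    """Count z-scores just below and just above the threshold.

    Returns (below, above, p_value), where p_value is the one-sided exact
    binomial probability of observing at least `above` results above the
    threshold if each result in the window were equally likely to fall on
    either side.
    """
    below = sum(1 for z in z_scores if threshold - width <= z < threshold)
    above = sum(1 for z in z_scores if threshold <= z < threshold + width)
    n = below + above
    if n == 0:
        return below, above, 1.0
    # P(X >= above) for X ~ Binomial(n, 0.5)
    p_value = sum(comb(n, k) for k in range(above, n + 1)) / 2 ** n
    return below, above, p_value
```

A small p-value would point to an excess of results just above the threshold. A large one, as the column reports for registered trials, gives no indication of bunching.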
However, a deeper look uncovers some suspicious regularities. For phase III trials by small industry sponsors (panel D), we found a persistent upward shift of the density of z-scores at the significance threshold. Though not as alarming as a spike right above this threshold, this break suggests that some less favourable results are not reported to the registry, something that is not supposed to happen. Negative results also contain valuable information. They prevent other researchers from replicating the efforts, wasting time and money, and putting more human volunteers at risk.
Another subtle pattern we find is the large increase in the share of significant results from phase II to phase III for all industry-sponsored trials (panels C and D). To some extent, this progression is to be expected, because only the promising phase II trials are continued into phase III. Selective continuation makes sense for profit-maximising firms that are willing to invest in further research only if the chances of eventual marketing approval are high enough. This pattern is also reasonable from an ethical point of view, because human volunteers should not be put at risk if no benefits are expected.
But can selective continuation of only the more promising phase II trials fully explain the increased number of statistically significant results in phase III? To answer this question, we linked phase II and phase III trials in the registry and estimated the continuation probability of a phase II trial conditional on the z-score and other observable trial characteristics. This method allows us to predict a phase III distribution, which we can compare to the distribution of actually reported phase III results.
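The logic of this comparison can be illustrated with a deliberately simplified Python sketch. It assumes a hypothetical logistic continuation model with made-up coefficients, uses the z-score as the only trial characteristic, and ignores the fact that phase III trials are typically larger than phase II trials. It is not the estimation procedure of the paper, only a picture of how selection alone raises the share of significant results.

```python
from math import exp


def continuation_probability(z, intercept=-1.0, slope=0.5):
    """Hypothetical logistic model: P(continue to phase III | phase II z-score).

    The intercept and slope are illustrative values, not estimates.
    """
    return 1.0 / (1.0 + exp(-(intercept + slope * z)))


def predicted_significant_share(phase2_z, threshold=1.96,
                                intercept=-1.0, slope=0.5):
    """Share of significant results among trials expected to continue.

    Each phase II z-score is weighted by its continuation probability, so
    trials with stronger results count for more in the predicted share.
    """
    weights = [continuation_probability(z, intercept, slope) for z in phase2_z]
    significant = sum(w for z, w in zip(phase2_z, weights) if z >= threshold)
    return significant / sum(weights)
```

With a positive slope, the predicted share exceeds the raw phase II share of significant results. The question in the text is whether the share actually reported in phase III is higher still.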
For trials sponsored by large pharmaceutical companies, the predicted share of significant phase III results and the actual share align closely. Smaller industry sponsors, by contrast, report a higher share of significant results in phase III than predicted, even though they are less likely to terminate a drug investigation when phase II results fall below the significance threshold. While we cannot provide a definite reason for this discrepancy, one possibility is that small companies selectively report only the more favourable phase III results to the registry.
More research is needed to determine what drives the difference in reporting discipline between big pharma and smaller firms. A possible interpretation is that reputational concerns and the resulting economic incentives play an important role in the pharmaceutical industry, as they do in a number of other sectors (Mayzlin et al. 2014).
Though FDA legislation mandates disclosure for a large number of trials and includes fines for non-compliance, to our knowledge these rules have never really been enforced. In recent years, public awareness of transparency concerns in clinical research has risen, motivating many large companies to establish internal disclosure policies and take reporting of results more seriously. However, reputational concerns might have less disciplinary power for smaller companies. Stricter enforcement of fines may be necessary to incentivise smaller companies and public research institutions to comply with disclosure rules.
Given the benefits of having all trial results made publicly available, our findings suggest that regulators should pay particular attention to enforcing the transparency of trials sponsored by smaller companies. This insight may also be valuable for the evaluation of COVID-19 trials – research by smaller sponsors might benefit from closer regulatory oversight.
The ex-post benefits for society of transparency in clinical research and of publicly accessible trial registries containing all results are undeniable. However, economic theory also suggests that mandatory full disclosure of all results is not always the best policy once we take into account the ex-ante incentives of investigators to invest in costly research (Dahm et al. 2009, Henry 2009, Henry and Ottaviani 2019). Strict enforcement might chill incentives to engage in costly but socially desirable R&D activities in the first place. An important question for future research is whether this chilling effect shows up in the data.
References
Aghion, P, S Amaral-Garcia, M Dewatripont and M Goldman (2020), “How to strengthen European industries’ leadership in vaccine research and innovation”, VoxEU.org, 1 September.
Adda, J, C Decker and M Ottaviani (2020), “P-hacking in clinical trials and how incentives shape the distribution of results across phases”, Proceedings of the National Academy of Sciences 117(24): 13386-13392.
Brodeur, A, M Lé, M Sangnier and Y Zylberberg (2016), “Star Wars: The empirics strike back”, American Economic Journal: Applied Economics 8: 1–32.
Dahm, M, P González and N Porteiro (2009), “Trials, tricks and transparency: How disclosure rules affect clinical knowledge”, Journal of Health Economics 28: 1141–1153.
Henry, E (2009), “Strategic disclosure of research results: The cost of proving your honesty”, Economic Journal 119: 1036–1064.
Henry, E and M Ottaviani (2019), “Research and the approval process: The organization of persuasion”, American Economic Review 109: 911–955.
Mayzlin, D, Y Dover and J Chevalier (2014), “Promotional reviews: An empirical investigation of online review manipulation”, American Economic Review 104: 2421–2455.