David Healy: Do randomized clinical trials add or subtract from clinical knowledge?

 

Jean-François Dreyfus’ comments

 

        First of all, let me apologize in advance for any lack of nuance in the wording of my remarks. As I now live in the countryside of western France, my command of the English language has clearly declined.

        That said, I would like to compliment Dr. Healy on his provocative article, although I do not fully endorse his contentions.

        Epistemologically, my teaching of clinical trial methodology to pharmacy, medicine or engineering students always started with Karl Popper and John Stuart Mill. Popper, because I believe that knowledge progresses by showing that a consequence of the current leading theory is falsified and that the theory therefore has to be modified or improved - if you have seen only white swans and are convinced there is no other color for swans, the first time you see a black swan you have to revise your theory. Mill, because in science causality is a crucial issue. This is rather easy with inanimate objects: you take two similar ingots and heat one of them. Everything else being equal, if the heated ingot melts you may conclude that heat, being the only difference in the experimental environment, is a cause of the metal melting. My first quiz question was how to apply this model to humans, i.e., how to decide on the causal role of a factor in differences between individuals; even if you are dealing with monozygotic twins, they cannot be strictly identical, as their previous life experiences, current health status, aging process, life events encountered during the experiment and the personal relevance of outcome criteria cannot be considered identical. Except in emergency situations, as we have seen recently with the COVID-19 pandemic, one individual who provides data in good faith and whose data are analyzed by well-meaning specialists cannot be the sole basis for a conclusion that applies to the whole population. I fully agree that we should not dismiss individual data as second-rate, but it is my contention that they should be used as the basis for new hypotheses to be tested.

        Daniel Schwartz, who taught me statistics, used to state that a true scientist obtaining confirmation or rejection of a hypothesis had to accept the results with equanimity as, whatever the case, they added to one’s knowledge. According to my doxa, using relevant groups and randomizing the group to which a participant belongs is the only way to equalize all the known and unknown factors that could confound the results. Of course, statistics are necessary to determine whether group differences could have occurred by chance alone. There is no need to resort to “classic” statistics; bootstrapping and data permutations can be used to perform such tests and, if considered necessary, to obtain confidence intervals. I shall therefore forgo further calculation considerations here.
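
        To make this concrete, here is a minimal sketch of the kind of permutation test alluded to above, written in Python; the outcome scores, group sizes and number of resamples are purely illustrative assumptions, not data from any actual trial.

```python
import numpy as np

rng = np.random.default_rng(0)

# Purely illustrative outcome scores for two randomized groups.
treated = np.array([5.1, 6.3, 4.8, 7.0, 5.9, 6.4])
control = np.array([4.2, 5.0, 4.7, 5.3, 4.1, 4.9])

observed = treated.mean() - control.mean()
pooled = np.concatenate([treated, control])
n_treated = len(treated)

n_perm = 10_000
extreme = 0
for _ in range(n_perm):
    rng.shuffle(pooled)                       # re-randomize the group labels
    diff = pooled[:n_treated].mean() - pooled[n_treated:].mean()
    if abs(diff) >= abs(observed):            # two-sided comparison
        extreme += 1

p_value = (extreme + 1) / (n_perm + 1)        # add-one correction keeps p above zero
print(f"observed difference = {observed:.2f}, permutation p = {p_value:.4f}")
```

        The same resampling idea, applied to bootstrap samples of each group rather than to permuted labels, yields the confidence intervals mentioned above.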

        As to the issue of the primary endpoint, most certainly, to reach a clear-cut conclusion on whether two groups differ, one has to indicate in advance on which criterion such a judgement will be made. Of course, there are methods (to cite one family: Carlo Emilio Bonferroni's correction and its refinements) that allow multiple endpoints to be tested. Actually, the requirement of a primary endpoint is a consequence of the eagerness of industry (and authors) to report positive results in order to beat the infamous publication bias. With multiple endpoints and no predominance given to any of them, communicators were always able to claim that the study results were significant. One has to require that a primary criterion be specified in advance in order to distinguish truly positive trials from those in which positivity was a post hoc reconstruction. That RCTs are needed to avoid personal bias in obtaining, analyzing or interpreting group comparisons is a postulate I never called into question. Thus, from an epistemological point of view, I must admit I am not completely in agreement with Dr. Healy.
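
        As an illustration of the multiplicity problem, here is a small sketch of the Bonferroni approach mentioned above: with m endpoints each tested against an overall level alpha, the simplest safeguard is to test each endpoint at alpha/m. The endpoint names and p-values below are invented for the example.

```python
# Illustrative Bonferroni adjustment for multiple endpoints.
# Endpoint names and p-values are invented for this example.
alpha = 0.05
p_values = {
    "HAM-D total score": 0.012,
    "CGI severity": 0.034,
    "sleep subscore": 0.049,
    "anxiety subscore": 0.200,
}

m = len(p_values)
adjusted_level = alpha / m  # each endpoint is tested at alpha / m

for endpoint, p in p_values.items():
    verdict = "significant" if p < adjusted_level else "not significant"
    print(f"{endpoint}: p = {p:.3f} -> {verdict} at the {adjusted_level:.4f} level")
```

        With the correction, only the first endpoint remains significant, whereas without it three of the four would have appeared positive at the 0.05 level - which is precisely how, with multiple endpoints and no predominance given to any, almost any trial could be presented as a success.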

        But what about the rest of his demonstration? Not only do I agree with it, but I contend that Healy could have gone a bit farther in his criticisms. When I was Head of the CNS development unit of Rhône-Poulenc, at that time the largest French pharmaceutical company, I had problems with Phase IIa. For those not familiar with this terminology, to be granted marketing approval a pharmaceutical product has to go through Phase I (in general, studies in healthy volunteers), Phase II (the initial studies in patients) and Phase III (notably, large-scale studies in patients who resemble those in whom the medication is to be used). To be more precise, Phase II is generally subdivided into Phase IIa, in which therapeutic effects in humans are ascertained, and Phase IIb, in which the therapeutic hypotheses developed during Phase IIa, especially on dosing, are confirmed using placebo-controlled RCTs in clean, homogeneous patient populations with clear-cut conditions and with appropriate validated measurements. In Phase IIa, one has to jump from results obtained in fully formalized studies in healthy volunteers to establishing whether the product has actual therapeutic properties. How is the trick done? Time and again, I tried to obtain some hints from colleagues in other companies, but it seemed to be a trade secret.

        Finally, I developed my own solution: a network of trusted physicians who received, in confidence, a preliminary formulation of the new compounds and an in-depth investigator's brochure. They were told what the animal studies had led us to expect and, within certain limits, were free to experiment and accumulate data and experience; they would briefly exchange information about their findings, say at biweekly intervals, but were left free to continue in a direction they felt promising, even if others considered it a dead end. A few months later, investigators, data analysts and pharmaceutical executives gathered in a secluded place, and the results were dissected until some sort of consensus emerged on indications, dosing, precautions, adverse events to be expected, etc. It was my responsibility thereafter to draft the most suitable development plan, which was submitted to the Phase IIa investigators for comments; however, I had the final say. Thus, RCTs were indeed considered a must, but they were based on clinical judgement and on the creativity of methodologists and analysts. In the 1980s a company could still decide to forgo financial profits and develop a drug that would enhance its scientific/humanitarian image, for instance an original treatment for a rare condition (e.g., amyotrophic lateral sclerosis) considered beyond current medical reach.

        David Healy shouts what many people who have moved from academia to industry (and vice versa) have been murmuring: that the pharmaceutical industry is the major culprit in what we are currently seeing. Because of my own biography, I am quite picky about where one should start and/or stop the analysis of an issue, and in this case the author stops the causal regress before reaching what I consider to be its actual starting point. Of course there is no single responsible factor, but if one has to be singled out, I believe it is the role of financial capitalism. If this is not done, the public will not be in a position to understand how well-meaning individuals led psychiatry to its present unfortunate state.

        Of course, some will insist that this is not a psychiatric issue but a societal one, and that such a quandary has no place in a scientific discussion. However, I hope to substantiate that the need for pharmaceutical companies to be highly profitable, so as not to see their shareholders leave to invest in higher-yielding fields elsewhere, led them to reduce the hazards of drug development as much as possible. It probably started with diagnosis and diagnostic tools.

        Industry convinced academia that diagnosis had to be more objective and based on harder criteria, and that fuzzy categories prevented psychiatry from being considered a fully acceptable medical domain. And academics (I was one of them) jumped on the bandwagon, forgetting that this approach somehow dehumanized the physician/patient relationship. Next, scales were developed to assess every disorder. Recently, I went to the site of the Canadian Paediatric Society and found six preliminary screening tools and 35 (I may have lost count) scales to measure specific conditions. Of course, these scales provided, among other benefits, more precise estimates of drug effects, but, once more, precision was gained at the expense of comprehensiveness: an atypical symptom, or at least a symptom not resembling those considered core symptoms, had a greater chance of being missed. Guidelines could have mitigated these effects, but in most cases they hampered creativity by stating what was to be done and how it was to be done. Even junior collaborators could then design RCTs. Such guidelines were also regarded by industry as a protection, because only major companies had enough resources to abide by them.

        By multiplying guidelines in order to be more precise and supposedly helpful, well-meaning persons ensured, more or less, that smaller and more creative companies with limited means could not compete with larger ones. For instance, if you separate generalized anxiety from panic attacks, and I will not dispute here that there might have been good reasons to do so, two sets of studies are required and development costs surge beyond smaller companies' budgets. This epidemic also encompassed pharmacovigilance. For instance, the MedDRA system was established in the 1990s and, as a user, I certainly will not “throw out the baby with the bathwater,” as we say in France, since MedDRA has its merits; but its costs may not make it the most efficient way to ensure complete post-marketing reporting to health authorities of rare but very significant adverse events. Medical journals bear a heavy responsibility for publication bias, and their semi-incestuous relationship with the pharmaceutical industry makes their demand that authors demonstrate an absence of conflicts of interest somewhat intriguing.

        Finally, Healy rightly asks how one could get out of this quicksand. Given the financial power of pharmaceutical corporations, I do not believe that a group of independent, trustworthy individuals would stay independent for very long. On the other hand, one can concur with the recommendation to make public the data on which registration and marketing approval were based. Of course, there will be objections: why should a company be compelled to reveal trade secrets to other companies? Most certainly, steps could be taken to avoid harming those who pioneer a new approach: for instance, giving them a period of exclusivity, forbidding their investigators to work or consult for competitors, or making sure that the data were disseminated separately to teams that would each replicate part of the analysis, with only a global synthesis being available to the public. In addition, among the solutions not mentioned by the author, at least two may be worth discussing: 1) increasing the role of ethics committees and 2) increasing the resources of health agencies.

        Ethics committees should be given proper means, rights, authority and enough time to actually examine a protocol in its context and make sure it is not biased. They should also include many more patient representatives, a random sample of lay persons who would be progressively trained, and individuals concerned with economics and with doctor/patient relationships, even if this sounds like an oxymoron. As to health agencies, they should have the resources to recruit, as employees or advisors, methodologists of sufficient stature to be in a position to refuse a protocol if they felt it was doomed to miss its purported objective(s). One would also benefit from generalizing the US system of publicly held advisory committee sessions, which, seen from the outside, appear fairly efficient.

        To conclude, it is my contention that RCTs remain one of the pillars of knowledge, if only because they prevent us from jumping to conclusions that are not warranted by the data and from building a coherent system out of false premises. Not only can a coherent system be built on false premises, but it has been shown that a more coherent system is not, in essence, closer to the truth than a less coherent one (Bovens and Hartmann 2003). It is, therefore, all-important to base our actions on solid foundations. However, the inappropriate construction of RCTs and their irrelevant exploitation to promote new drugs have led to an inordinate belief in pharmaceutical companies' ability to answer correctly questions that should have been left to other means of decision-making. Should we say that the rise of lay skepticism (for instance, about the efficacy of vaccines) is an unanticipated consequence of this situation?

 

Reference:

Bovens L, Hartmann S. Solving the riddle of coherence. Mind, 2003;112(448):601–33.

 

January 28, 2021