Barry Blackwell: Corporate Corruption in the Psychopharmaceutical Industry
Charles M. Beasley’s response 2 to Barry Blackwell’s response
First, I want to reiterate my deepest respect for Barry’s intellect and wide-ranging knowledge of many subjects and their history. While I would not use the same language to describe the pharmaceutical industry, in general there are many things on which I believe we would agree. For example, returning to the example of the Maryland law addressing pricing of generic drugs with limited competition, I would be quite happy to see this become law at the federal level.
I would like to address in some detail several of Barry’s comments in his most recent response (Blackwell 2018) to my previous response (Beasley 2018) to his postings concerning Corporate Corruption in the Pharmaceutical Industry. First, he is quite correct about the substantial price differential for branded products between United States prices and prices in countries (a majority of the world’s countries) with a single-payer (governmental) system or governmental price controls imposed through other systems. I agree with his statement that such price differentials exist and will provide one specific personal example:
There is a new, ultra-rapid acting insulin (I am intentionally not providing the brand name of the product or the name of the corporate sponsor) that would be useful to me as I manage my type II diabetes mellitus very aggressively, aiming for normoglycemia. With Humalog™ (Lilly’s rapid acting insulin) as my current pre-prandial insulin, I still experience postprandial spikes beyond what I would like to achieve, and increasing the Humalog™ dose or increasing the time between administration and beginning a meal would place me at risk of hypoglycemia. This new ultra-rapid acting insulin would assist in reducing these postprandial spikes. This new insulin is not “natural” insulin (recombinant human insulin) that can be purchased in the United States without a prescription. This insulin product, as well as all other “modified” insulins (virtually all insulin products other than “natural” insulins - recombinant human insulin [and previously beef- and pork-derived insulin]), requires a prescription in the United States. In Canada, all insulins, “natural” or “modified,” can be purchased without a prescription. This new insulin product costs approximately 10-fold more in the United States than it does in Canada. Furthermore, the formulation-packaging of this new insulin that allows it to be administered in small doses (0.5U increments), while available in Canada, is not available in the US. I have been considering a trip to Toronto, not because of the price differential but because of the availability of the formulation-packaging of specific interest to me.
On the surface, such price differentials might appear more than exorbitant. I would like to provide additional context regarding the effect of these price differentials. I will not offer an opinion regarding the equity of these price differentials, but readers will come to their own opinions:
In the spring of 1987, when I interviewed for my initial position with Eli Lilly, one of my interviewers told me that there were only six countries in the world where it was profitable to market pharmaceuticals (as of 1987). Profitability would be the difference between revenue and the cost of manufacturing the finished product, maintaining the human resources, maintaining the physical infrastructures, governmental payments and other costs required to market pharmaceuticals. I have no way of verifying the truth of this statement, but the individual who provided this information had the background training and was in a position where accurate knowledge of such matters would be likely. In addition, the individual had no reason to be disingenuous with me. The statement could have been a simplification of very complex matters and perhaps “profitable” actually meant “substantially profitable.” Thirty-one years later, there could well be fewer than six countries in which marketing is “substantially profitable”; newly profitable countries could have replaced older countries that have become only minimally profitable or unprofitable; and profit can be driven by volume even though profit on a single unit of product sold is marginal. One might ask, if the statement were correct, why companies that are so focused on profits would continue to have a presence in unprofitable markets. One reason is that once human resources and physical infrastructures are in place, they cannot be easily dissolved. Multiple other reasons for maintaining a presence in unprofitable markets could be at play if some markets result in overall profitability. I am aware of one company with an innovative product approved and marketed in the United States, and I believe the company intends to remain an independent company and develop additional products. However, this product is approved and marketed only in the United States; to the best of my knowledge, the product is not approved in any other regulatory venue (European Union or other individual countries). While I believe that revenue and income, in absolute terms, are quite modest for this company, the ratio of income to expenses is likely to be high.
If one believes, as I do and have stated in my earlier responses, that the potential for large profits drives efforts at (investment in) potentially innovative research in the pharmaceutical industry with extreme risk of failure, then one way to view the soaring prices for branded drugs in the United States is that the United States is providing the stimulus for research that might benefit the entire world. Again, large cap pharmaceutical companies are highly unlikely to exit unprofitable or minimally profitable markets for a host of reasons. What I have said in this paragraph and the preceding paragraph should be understood to be only personal conjecture based on experience within the industry and non-validated personal communications combined with an opinion about what is necessary to stimulate commercial research.
Barry points out that some serious adverse drug reactions can be predicted based on preclinical findings and an understanding of pharmacological mechanisms of action. I would agree and add that class effects are also important. As an example of class effects pointing to serious adverse drug reactions, I would like to point out the matter of the “serotonin syndrome” (I prefer the term “hypermetabolic syndrome” because it is more descriptive of pathophysiology and avoids mechanistic attribution where multiple drug classes with different pharmacological activity can result in a cluster of the same signs and symptoms, but I will use “serotonin syndrome” throughout this text), often fatal, occurring when an MAOI is administered following initiation and some period of treatment with a drug with substantial SSRI potency (including clomipramine):
Zimelidine was first marketed in 1982. Fluvoxamine was first marketed in 1983. Fluoxetine (first marketed in Belgium in 1986; approved in the United States in late December 1987 and first marketed there in 1988) was not the first SSRI commercially available, as many people outside our field incorrectly believe. In addition, clomipramine, while generally not classified as an SSRI, has potent serotonin uptake inhibiting activity and had been in clinical use in Europe for many years. During my residency (1984-1987), I became aware of reports of serious adverse events occurring in temporal association with coadministration of drugs with potent SSRI activity and MAOIs through non-archival source literature. Therefore, I was surprised that initial product labeling for SSRIs did not include more prominent Warnings-Precautions about this potential phenomenon, including a recommendation to consider elimination half-life of an SSRI in the timing of initiation of an MAOI following an SSRI.
Probable “serotonin syndrome” has been reported in humans since at least 1960 due to the combination of L-tryptophan and an MAOI (Oates and Sjoerdsma 1960). However, multiple literature searches using alternative text strings and involving ~2.5 hours of work on 24-April-2018, conducted within the period of 1972 (year of zimelidine synthesis) through 1987 (year of fluoxetine approval in the United States) for cases of “serotonin syndrome” observed in human subjects due to a combination of clomipramine or an SSRI and an MAOI, yielded only two cases in a single report from one United States institution. In these two cases, the patients were given clorgiline and several weeks later given single doses of clomipramine (Insel et al. 1982). Therefore, the cases that I was aware of through non-archival source literature were unlikely to have been well known to a broad audience, and might have been obtained from European pharmacovigilance databases and not published in academic literature, or published with keyword descriptors other than “serotonin syndrome,” as was our publication of a series of fatal cases (Beasley et al. 1993). Alternatively, I was simply not clever enough in my literature search strategies. However, based on my literature search efforts, these cases would probably not have been easily evident to either United States regulators or corporate sponsors of SSRIs approved early among those approved in the United States. Barry has extensive knowledge of the history of “serotonin syndrome” reactions and might be aware of such cases reported prior to 1987.
There is a tension in deciding whether to include in labeling hypothetical adverse drug reactions based on pharmacological mechanisms or class effects when such events have not been observed to an excessive extent with drug compared to controls and/or when studies specifically evaluating a potential adverse drug reaction do not confirm or support the occurrence of the event as an adverse drug reaction. Our knowledge of the downstream effects of acute pharmacological actions is often imperfect and wanting. As an example of this concept, while we know a great deal about many acute pharmacological actions of several classes of effective psychotropic medications, we still do not understand precisely why they produce therapeutic efficacy. The threshold for labeling an adverse reaction based on class effect, although the event was not observed in clinical trials or not observed with excess incidence relative to control, is generally lower than the threshold for labeling based on pharmacological mechanism of action. Two examples help to illustrate this matter:
1. Any drug that significantly inhibits serotonin uptake will drastically reduce whole-blood serotonin because whole-blood serotonin is primarily the serotonin actively taken up by platelets. Serotonin acts as one of multiple messengers released from platelets as they aggregate to facilitate further platelet aggregation as part of a positive feedback system. One could hypothesize that persons treated with a drug with serotonin uptake inhibiting properties would be at substantially increased risk of serious bleeding events that would be adverse drug reactions based on pharmacological action. However, the positive feedback system influencing platelet aggregation is highly redundant, with multiple substances contained in platelets that are released during the process facilitating further platelet aggregation. While bleeding events were observed during fluoxetine clinical development, they were not in excess relative to controls, and cases tended to be highly confounded by the potential for multiple etiological and contributory possibilities. Subsequent to United States approval of fluoxetine, Eli Lilly conducted a well-controlled clinical pharmacology study (Phase 1) evaluating the effect of repeated fluoxetine treatment on bleeding time as the dependent measure. The study was intended to provide high quality data on the question of whether an SSRI has an adverse effect on coagulation/hemostasis in potentially large segments of the population and to be available to inform regulators on the matter. This study did not affirm an adverse effect of fluoxetine on bleeding time. With accumulating post-marketing experience, additional bleeding events continued to be reported, and techniques applied to the always fuzzy (and, on an individual case basis, usually inadequately reported and confounded) data suggested the need to add bleeding to the label of drugs that inhibit serotonin uptake, especially when taken in combination with other drugs that interfere with platelet aggregation by other pharmacological mechanisms. Should this hypothetical adverse reaction have been included in the product label for fluoxetine from the time of its approval or soon after and included in labeling for all other marketed products that significantly inhibit serotonin uptake (e.g., several of the TCAs such as amitriptyline)? I do not believe so because, in my view, the evidence was weak for bleeding as an adverse reaction and a high-quality study of platelet function did not support the hypothesis of such a drug reaction in the general population. Empiricism rather than hypothesis should inform product labeling, and it is too easy in retrospect to say that this reaction should have been considered established based on mechanism while ignoring the not insubstantial empirical data available at the time of and shortly after approval.
Sponsors and regulators are in a complex position when a study of high quality, with a dependent variable highly sensitive to a potential adverse reaction but of modest size, fails to support the finding of an adverse reaction, while individual cases, potentially highly confounded and with a high background incidence due to non-drug causes, are being reported in temporal association with a given drug.
2. Should a randomized clinical trial have been required to explicitly “prove” that bleeding was not an adverse reaction to fluoxetine prior to approval? In a previous response (Beasley 2018), I briefly addressed the size (and therefore the cost) of conducting a study to explicitly “prove” the presence of an adverse drug reaction that is relatively rare (i.e., 1 in 1,000 incidence), especially when the background incidence (event occurs for reasons other than drug exposure) is high. Studies can be designed to “prove” (or fail to “prove”) the absence of events (within certain non-zero excess bounds relative to a comparator). However, such a study for a rare event will generally be larger and costlier than one designed to “prove” the existence of such an event. If a study explicitly designed to “prove” (or fail to “prove”) the absence of every hypothetical adverse reaction based on pharmacological mechanism were required of every new chemical entity seeking approval as a drug for clinical use, drug development would cease.
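To give a rough sense of how background incidence affects the size of a study meant to detect a 1 in 1,000 excess, the following is a minimal sketch using the standard normal-approximation formula for comparing two proportions. The background incidences (0.1% and 2%), the 80% power, the two-sided 5% significance level and the function name are assumptions chosen only for illustration; they are not taken from any actual study.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm_two_proportions(p_control, p_drug, power=0.80, alpha=0.05):
    """Approximate subjects per arm to detect p_drug vs p_control.

    Normal-approximation formula for two independent proportions,
    equal allocation, two-sided test. A planning sketch only.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the two-sided test
    z_power = z.inv_cdf(power)           # quantile matching the desired power
    variance_sum = p_control * (1 - p_control) + p_drug * (1 - p_drug)
    return ceil((z_alpha + z_power) ** 2 * variance_sum / (p_drug - p_control) ** 2)


EXCESS = 0.001  # drug-attributable excess incidence of 1 in 1,000 (from the text)

# Hypothetical background incidences, chosen only for illustration.
for background in (0.001, 0.02):
    n = n_per_arm_two_proportions(background, background + EXCESS)
    print(f"background {background:.3f}: about {n:,} subjects per arm")
```

Under these assumptions the required size is in the tens of thousands of subjects per arm at the lower background incidence and more than ten times larger at the higher one.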
Given the totality of data now available regarding bleeding and drugs that inhibit serotonin uptake, one interpretation would be that some small but undefined proportion of the general population is more dependent than the rest of the population on serotonin release from platelets as a further promoter of platelet aggregation. This proportion of the population might include individuals under the influence of other pharmacological activity that negatively affects platelet aggregation through mechanisms other than inhibition of serotonin uptake.
As stated above, the threshold for labeling an adverse drug reaction based on class effect is lower than the threshold for labeling an adverse reaction based on pharmacological mechanism of action. Although class effect is ultimately equivalent to pharmacological mechanism of action, the important distinction is that with class effect there is presumptive adequate evidence of an adverse drug reaction occurring in human subjects to a drug with similar pharmacological action to the one for which a label is being developed. All new dopamine antagonist antipsychotic drugs carry a warning about neuroleptic malignant syndrome as a potential adverse reaction even if no cases were observed during development clinical trials in patients treated with the new drug.
Barry mentions that, as I pointed out, FDA generally requires two positive studies (based on an a priori outcome dependent variable that is explicitly defined and an a priori explicitly stated statistical test of between-treatment differences based on that outcome dependent variable). He also states that sponsors should report all clinical trial results, whether positive or negative. I could not agree with him more about complete reporting of all clinical trial (trials in patients, not necessarily Phase 1 trials in normal volunteers) results, irrespective of trial outcome. However, it is very important to recall and understand that even multiple trials that fail to result in a positive outcome cannot be considered as robust evidence of lack of efficacy if those trial results are being interpreted correctly from a statistical perspective. One or more such failed trials can generate a hypothesis of lack of efficacy but do not scientifically prove lack of efficacy.
Phase 2-3 trials that are designed to demonstrate efficacy to gain regulatory approval are designed with a null hypothesis of equivalence. Sample sizes are determined generally (when the outcome variable is some integer scale [e.g., HAMD, PANSS] in most psychiatric disorder studies) based on the sponsor’s guesses about differences between drug and control that can be observed in mean change from baseline under the assumption that the drug is effective and the standard deviations that will be observed. If the null hypothesis is rejected (conventionally at the ≤5% probability level), then the alternative hypothesis is accepted. If the null hypothesis is not rejected, no further interpretation should be made, i.e., the study simply failed to reject the null; it did not “prove” the null.
There are two caveats regarding the interpretability of studies failing to show superiority of test drug over control. First, if the null hypothesis of equivalence is rejected but the difference between test drug and control statistically significantly favors control then this would be robust evidence of lack of efficacy for the test drug (the statistical test for the efficacy assessment is virtually always a two-sided test and, as such, the unanticipated superiority of control can be interpreted). Second, if the sample size of the study was such that the power of the study was ≥95% (from a conservative interpretive perspective, this should perhaps be ≥97.5%), then failure to reject the null hypothesis of equivalence would offer stronger evidence of lack of efficacy than the customary Phase 2-3 trials generally with sample sizes resulting in 80% power. Studies with ≥95% power are very large compared to those with 80% power.
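How the sample size depends on the guessed drug-control difference, the guessed standard deviation and the chosen power can be sketched with the usual normal-approximation formula for a two-sided comparison of two means. The 3-point difference and 8-point standard deviation below are hypothetical planning values on an unspecified rating scale, not figures from any trial, and the function name is illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(delta, sd, power=0.80, alpha=0.05):
    """Approximate patients per arm for a two-sided, two-arm comparison of means.

    delta: assumed true drug-control difference in mean change from baseline
    sd:    assumed common standard deviation of the change scores
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_power = z.inv_cdf(power)           # desired power
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


# Hypothetical planning assumptions: 3-point difference, SD of 8 points.
for power in (0.80, 0.95, 0.975):
    print(f"power {power:.1%}: {n_per_arm(3.0, 8.0, power)} patients per arm")
```

With these assumed inputs the approximation gives about 112 patients per arm at 80% power and about 185 per arm at 95% power; the exact figures change with the assumed difference and standard deviation, and a t-based calculation would add a few patients.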
Studies can be designed to test lack of superiority to control (a one-sided test) or equivalence to control (a two-sided test and of less interest than lack of superiority), but the statistical designs differ from those of efficacy studies, and these would generally be much larger studies.
Studies could be designed to test both hypotheses, sequentially or in parallel, although they would still be quite large. Interestingly, given the statistical nature of formally testing for lack of difference (not superior to control) and its difference from testing for a difference (superior to control), parallel analyses can result in simultaneous rejection of both null hypotheses. You can find test drug to be not superior to the comparator and simultaneously superior to the comparator. Briefly, this is because when testing for lack of superiority, a small but non-zero magnitude of superiority must be accepted as equivalent to lack of superiority. For an example of such an outcome, see Beasley et al. (2005); there, a safety outcome (change in QTc) rather than an efficacy outcome was being evaluated, and the primary intent was to show lack of difference rather than difference. However, the statistical matter is still well illustrated.
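The simultaneous rejection of both null hypotheses can be illustrated with a confidence-interval sketch. It assumes a two-sided 95% confidence interval for the drug-minus-comparator difference and a pre-specified margin below which a difference is treated as equivalent to no superiority. The estimate, standard error and margin are hypothetical numbers, not results from Beasley et al. (2005), and the function names are illustrative.

```python
from statistics import NormalDist


def two_sided_ci(estimate, std_error, level=0.95):
    """Normal-approximation confidence interval for a difference."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return estimate - z * std_error, estimate + z * std_error


def assess(estimate, std_error, margin):
    """Apply both tests to the same drug-minus-comparator difference.

    superiority_shown:         the whole interval lies above zero
    lack_of_superiority_shown: the whole interval lies below the margin
    """
    lower, upper = two_sided_ci(estimate, std_error)
    return {
        "ci": (round(lower, 2), round(upper, 2)),
        "superiority_shown": lower > 0,
        "lack_of_superiority_shown": upper < margin,
    }


# Hypothetical result: difference of 2.0 units, standard error 0.5, margin 5.0.
print(assess(estimate=2.0, std_error=0.5, margin=5.0))
```

Here the interval runs from roughly 1.0 to 3.0, so it excludes zero and also lies entirely below the margin of 5.0, and both conclusions hold at once.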
As stated above, failed Phase 2-3 efficacy studies can suggest that an investigational drug does not have efficacy and the more such failed studies, perhaps the stronger the suggestion. However, studies can fail for many reasons other than lack of efficacy. For example, sponsors can be grandiose in expectations regarding the difference in the change in the outcome variable or underestimate likely variance, or both, resulting in sample sizes at any given prospective power that are too small. A sponsor that, trying to save money, uses sample sizes based on relatively low power (<80%) might have a failed study due to the patient samples showing aberrant response to test drug and/or control compared to the expectations used in the sample size calculation. Unexpectedly high placebo response, which can result from a multitude of upstream factors, is a manifestation of this problem resulting in a failed study.
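The effect of optimistic planning assumptions on the chance of a positive study can be sketched by computing power for a fixed sample size under the planned and under less favorable true values. All numbers below (112 patients per arm, a planned 3-point difference with a standard deviation of 8, a true 2-point difference with a standard deviation of 9) are hypothetical, and the function is a normal-approximation sketch rather than the method of any particular sponsor.

```python
from math import sqrt
from statistics import NormalDist


def power_two_means(delta, sd, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided, two-arm comparison of means.

    Normal approximation; the negligible chance of a significant result
    in the wrong direction is ignored.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    std_error = sd * sqrt(2 / n_per_arm)   # standard error of the difference
    return z.cdf(abs(delta) / std_error - z_alpha)


N = 112  # hypothetical per-arm sample size fixed at the planning stage

planned = power_two_means(delta=3.0, sd=8.0, n_per_arm=N)  # sponsor's guesses
actual = power_two_means(delta=2.0, sd=9.0, n_per_arm=N)   # less favorable truth
print(f"planned power: {planned:.0%}, power under the less favorable truth: {actual:.0%}")
```

Under these assumptions a study planned for roughly 80% power has well under a 40% chance of rejecting the null hypothesis, even though the drug in this hypothetical scenario is effective.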
In his concluding remarks, Barry indicates that he could not help but introduce politics into those remarks. While I want to avoid politics and any related domain, I was drawn to wondering if a better phrase to characterize the matters might be “economic theory/philosophy.”
Beasley C, Masica D, Heiligenstein J, Wheadon D, Zerbe R. Possible monoamine oxidase inhibitor – serotonin uptake inhibitor interaction: fluoxetine clinical data and preclinical findings. J Clin Psychopharmacol 13:312-320; 1993.
Beasley CM, Mitchell MI, Dmitrienko AA, Emmick JT, Shen W, Costigan TM, Bedding AW, Turick MA, Bakhtyari A, Warner MR, Ruskin JN, Cantilena LR, Kloner RA. The combined use of ibutilide as an active control with intensive ECG sampling and signal averaging as a sensitive method to assess the effects of tadalafil on the human QT interval. J Am Coll Cardiol 46:678-687; 2005.
Beasley CM. Charles M. Beasley, Jr’s response to Barry Blackwell’s reply. INHN. January 11, 2018.
Blackwell B. Barry Blackwell’s response to Charles Beasley’s response. INHN. May 3, 2018.
Insel TR, Roy BF, Cohen RM, Murphy DL. Possible development of the serotonin syndrome in man. Am J Psychiatry 139:954-955; 1982.
Oates JA, Sjoerdsma A. Neurologic effects of tryptophan in patients receiving monoamine oxidase inhibitor. Neurology 10:1076-1078; 1960.
August 23, 2018