Andrew Winokur Photo. This photo was taken during a planning session in which Dr. Amsterdam and Dr. Winokur were discussing the methodology for studies that would eventually lead to several neuroendocrine test batteries for distinguishing patients with melancholic versus non-melancholic subtypes of major depressive disorder, in order to identify putative biomarkers of depression. The first neuroendocrine battery comprised four separate neuroendocrine tests administered to an individual over a four-day period: (1) thyroid-stimulating hormone (TSH), prolactin and growth hormone (GH) response to thyrotropin-releasing hormone (TRH) stimulation; (2) follicle-stimulating hormone (FSH), luteinizing hormone (LH), prolactin, GH and TSH response to gonadotropin-releasing hormone (GnRH) stimulation; (3) GH, prolactin, TSH, cortisol and glucose response to insulin-induced hypoglycemia; and (4) cortisol response to dexamethasone suppression. The primary results of this study were published in the Archives of General Psychiatry in 1983.

May 31, 2018

Charles M. Beasley, Jr. and Roy Tamura: What We Know and Do Not Know by Conventional Statistical Standards About Whether a Drug Does or Does Not Cause a Specific Side Effect (Adverse Drug Reaction)

Charles M. Beasley’s reply to Carlos Morra’s question


        We thank Carlos Morra for his question.  His question provides us an opportunity to expand on several points discussed in our e-book (studies intended to prove the absence of an Adverse Drug Reaction [ADR] that are analyzed using non-inferiority inferential methods) and points presented in three separate, previous submissions by Beasley to INHN that dealt with QTc prolongation and the so-called Thorough QT Study (TQT Study) (Beasley 2015, 2016, 2018).  Carlos asked whether there are any ADRs that are highly infrequent or rare in the large, clinically-treated population that can be “addressed” in a small, Phase 1 human clinical trial.  The answer is a qualified yes.  By “addressed,” I mean determined or predicted (with a reasonable probability of accuracy) to be an ADR (not merely an Adverse Event [AE]) in the population that will ultimately be treated with the drug being studied and therefore requiring discussion in the drug’s Product Information.

        Our answer to this question is perhaps surprising given that we are discussing highly infrequent or rare ADRs that have little chance of being observed even once in sizeable clinical trial populations (e.g., populations of 1,000 to 5,000).  The qualified yes applies only to ADRs for which there is a biomarker or predictive surrogate that is observable in small populations if a drug does have a risk liability for the ADR of interest.  This biomarker must first be extremely sensitive (or have a high negative predictive value).  It is desirable that when the biomarker is absent in the experimental population, the probability of observing the ADR of interest in the population that will be treated in clinical practice approaches 0.  For clinically severe, potentially fatal ADRs, it would not be in the best interest of public health to rely on a biomarker that is negative (does not differentiate the drug from control) in a small experimental population when even a few cases of the ADR would occur in real-world clinical use of the drug in large populations.
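
        The point that a rare ADR has little chance of being observed even once in a sizeable trial can be illustrated with a minimal sketch.  It assumes independent patients and a fixed per-patient risk; the function name and the example risk of 1 in 10,000 are hypothetical and are not figures taken from the e-book.

```python
def prob_at_least_one_case(risk_per_patient: float, n_patients: int) -> float:
    """Probability of seeing one or more cases among n independent patients.

    Assumes every patient carries the same risk and that cases occur
    independently (a simplification made for illustration only).
    """
    if not 0.0 <= risk_per_patient <= 1.0:
        raise ValueError("risk_per_patient must be between 0 and 1")
    if n_patients < 0:
        raise ValueError("n_patients must be non-negative")
    # P(at least one case) = 1 - P(no case in any of the n patients)
    return 1.0 - (1.0 - risk_per_patient) ** n_patients


if __name__ == "__main__":
    assumed_risk = 1 / 10_000  # hypothetical risk, chosen only for illustration
    for n in (1_000, 5_000, 100_000):
        p = prob_at_least_one_case(assumed_risk, n)
        print(f"n = {n:>7,}: P(at least one case) = {p:.3f}")
```

        Under these assumptions, the probability of seeing a single case remains well below one half at trial sizes of 1,000 to 5,000, which is why a sensitive biomarker is needed if the question is to be addressed in a small study.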

        The biomarker should also be highly specific (or have a relatively high positive predictive value).  It is desirable that when the biomarker is present in the small experimental group, the probability of the ADR of interest being observed in a clinical population of even modest size is substantially higher than 0.  From a public health perspective, it is not desirable for Phase 1 studies that result in the false-positive prediction of serious ADRs to keep potentially useful medications from reaching patients.
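
        The relationship between sensitivity, specificity and the two predictive values discussed above can be sketched with Bayes’ rule.  This is an illustration only: the function name is hypothetical, and “prevalence” here is an assumed proportion of studied drugs that truly carry the risk, a quantity the text does not specify.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a biomarker used as a risk predictor.

    prevalence is the assumed proportion of drugs that truly carry the
    risk (a hypothetical input for illustration).
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")

    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)

    # Guard against empty denominators (no positive or no negative results).
    ppv = true_pos / (true_pos + false_pos) if (true_pos + false_pos) > 0 else float("nan")
    npv = true_neg / (true_neg + false_neg) if (true_neg + false_neg) > 0 else float("nan")
    return ppv, npv


if __name__ == "__main__":
    # Hypothetical inputs: a very sensitive but only moderately specific marker.
    ppv, npv = predictive_values(sensitivity=0.99, specificity=0.90, prevalence=0.10)
    print(f"PPV = {ppv:.3f}, NPV = {npv:.4f}")
```

        With inputs of this kind, a negative result is highly reassuring while a positive result is wrong a sizeable fraction of the time, which mirrors the trade-off between protecting patients and keeping useful drugs available.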

        In the August 20, 2015, INHN Announcement, a Comment written by me (Beasley 2015) addressed Edward (Ned) Shorter’s essay (Shorter 2013) “The QT Interval and the Mellaril Story: A Cautionary Tale.”  This initial comment did not directly address Ned’s information regarding Mellaril but rather provided background information on the epidemiology of drug-induced Torsade de Pointes (TdP) and my views on the appropriateness of FDA’s restriction on the maximum approved dose of citalopram based on a TQT Study that failed to reject the null hypothesis of a difference between citalopram and placebo in a non-inferiority analysis (the standard inferential analysis for a TQT Study).

        On February 25, 2016, the INHN Announcement published a brief Further Comment on Ned Shorter’s essay (Shorter 2013) “The QT Interval and the Mellaril Story: A Cautionary Tale” authored by me (Beasley 2016).  This piece described a recently published article on QTc changes with SSRIs that reported prolongation with citalopram relative to other SSRIs, which I thought relevant to judging the appropriateness of the FDA’s product information change for citalopram.

        On January 25, 2018, a Final Comment on Ned Shorter’s essay (Shorter 2013) “The QT Interval and the Mellaril Story: A Cautionary Tale,” authored by me (Beasley 2018) appeared in the INHN Announcement.  This piece did not elaborate further on the appropriateness of the FDA’s action regarding the approved dose of citalopram.  Instead, it described my thoughts on the evolving understanding and use of the heart-rate-corrected QT interval (QTc) and the pathophysiology of TdP and other malignant ventricular tachydysrhythmias.  This 2018 work pointed out several matters of importance to Carlos’ question, with the TQT Study as an example of where a small Phase 1 study might provide a better answer to a critical safety question than randomized clinical trials that would be so large and so lengthy that they would be impossible to conduct in practice.

        The relevant material from that Final Comment is reproduced below.

        While QTc prolongation is a biomarker for risk of TdP and other ventricular tachydysrhythmias, even substantial QTc prolongation does not invariably lead to TdP.  Multiple drugs that prolong QTc are not associated with TdP, including amiodarone, carvedilol, ebastine, loratadine, phenobarbital, ranolazine, salbutamol, tamoxifen and tolterodine (Hondeghem 2008a).  Additionally, drugs that prolong QTc can be antiarrhythmic, e.g., amiodarone.

        Hondeghem and colleagues (Hondeghem 2001, 2008a,b; Shah 2005) have proposed a set of four drug-induced changes (or characteristics of the changes) in cardiac electrophysiology that appear to be necessary to produce either TdP, which can spontaneously revert to normal sinus rhythm (~80% of occurrences) or degenerate into ventricular fibrillation (Vfib), or Vfib directly.  These changes in cardiac electrophysiology are best assessed through cardiac action potential studies in tissue preparations, but some have biomarkers that can be evaluated on the surface ECG.

        This set of changes is referred to by the acronym TRIaD.  The first of these changes is triangulation (T), the lengthening of ventricular action potential (AP) duration specifically by prolonging Phase 3 of the AP.  Triangulation will lengthen QTc, which reflects AP duration, if Phase 2 of the AP (plateau phase) does not shorten.  However, triangulation will not extend QTc (or the total duration of the AP) if Phase 2 is shortened.  Prolongation of Phase 3 repolarization is specifically defined as an increase in AP30-90 duration in action potential studies (Shah 2005).  The ECG manifestation of triangulation is a widening and flattening of the T-wave (Shah 2005).  Such widening and flattening could be quantified by measuring the interval from the onset to the end of the T-wave, the amplitude of the T-wave, ratios of these two parameters and absolute values of these two parameters.  Phase 3 repolarization depends strongly on potassium efflux through the IKr channel, and blockade of that current can result in triangulation.

        The second factor, a characteristic of change, is reverse use dependence (R) of the triangulation/prolongation of Phase 3 repolarization: the effect is greater at slower heart rates (Shah 2005).  A negative correlation between QTc length and heart rate would reflect reverse use dependence.  This cannot be assessed on a standard 10-second ECG, although it might be evaluated on an extended recording (Holter) if the recording interval captured a sufficient range of different, sustained heart rates.

        The third alteration is temporal variability in the action potential duration on a cycle-to-cycle basis, referred to as instability (Ia) (Shah 2005).  The ECG manifestation of instability is T-wave alternans (Shah 2005), a beat-to-beat change in the morphology of the T-wave, including its amplitude, that is sometimes so large as to result in alternating polarity of the T-wave.  Variations in width (including width from onset to peak vs. peak to end, reflecting symmetry) and amplitude of the T-wave could quantify such morphological change.

        The fourth change is transmural dispersion (D) of ventricular repolarization (Shah 2005).  There is an ordered progression of repolarization across the ventricular wall initially with epicardial repolarization, followed by endocardial repolarization and, finally, M-myocyte (mid-myocyte, deep subendocardial) repolarization.  Disruption and desynchronization of this sequence, particularly with M-myocytes, is dispersion.  The ECG manifestation of dispersion is lengthening of the time interval between the peak and end of the T-wave, referred to as Tpe.  This length is sometimes corrected for QT (Tpe/QT).  Across the relevant literature, the terminology is confusing because some authors refer to the absolute length as Tpe, and some authors refer to that length corrected for QT as Tpe rather than Tpe/QT. 

        TRIaD predisposes to the development of TdP that might or might not progress to Vfib, and to the development of Vfib without preceding TdP.  Other aspects of cardiac electrophysiology that can be influenced by drugs through blockade of other cardiac ion channels (besides IKr) and alterations in autonomic tone, among other influences, predispose to the occurrence of Vfib in the presence of TRIaD.  One such aspect is the wavelength of cardiac excitation (λ), the product of the Effective Refractory Period (ERP) and Conduction Velocity (CV) (λ = ERP × CV).  The ERP is the time from the initiation of myocyte depolarization through partial repolarization (Phase 3) during which stimulation will not result in a propagated AP (a second AP).  The CV is the speed of transmission of depolarization.  As λ decreases, there is an increased risk of Vfib (abrupt onset or evolution from TdP), and as λ increases, there is a higher likelihood of spontaneously terminating TdP (Shah 2005).
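
        As a small worked illustration of the product of ERP and CV described above, the sketch below computes λ for two sets of values.  The function name, the choice of units and the numerical inputs are assumptions made purely for illustration; they are not measurements from the cited work.

```python
def wavelength_mm(erp_ms: float, cv_m_per_s: float) -> float:
    """Return lambda = ERP * CV in millimetres.

    With ERP in milliseconds and CV in metres per second, the product is
    in millimetres (1 ms * 1 m/s = 1 mm).
    """
    if erp_ms < 0 or cv_m_per_s < 0:
        raise ValueError("ERP and CV must be non-negative")
    return erp_ms * cv_m_per_s


if __name__ == "__main__":
    # Hypothetical values chosen only to show the direction of change.
    baseline = wavelength_mm(erp_ms=200.0, cv_m_per_s=0.5)
    slowed_conduction = wavelength_mm(erp_ms=200.0, cv_m_per_s=0.3)
    print(f"baseline lambda:          {baseline:.0f} mm")
    print(f"slowed-conduction lambda: {slowed_conduction:.0f} mm")
```

        In this sketch, slowing conduction at a fixed ERP shortens λ, the direction the text associates with an increased risk of Vfib.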

        In general, most non-cardiac drugs that lengthen QTc do so by blocking IKr, and drugs that block IKr will often, but not always, be associated with all components of TRIaD.  Therefore, while not perfect, QTc prolongation can be used with some caution as a biomarker for the risk of TdP.  One notable exception to this general association between IKr blockade, TRIaD and risk of TdP is when the drug that blocks IKr also blocks Na and/or Ca channels, as these pharmacological actions can offset the effect of IKr blockade (fluoxetine is one example of such a drug).

        Based on the information briefly reviewed above, the TQT Study is a reasonable method of risk prediction, except for drugs intended for cardiac conditions that alter the activity of multiple cardiac ion channels or channels other than IKr, and non-cardiac drugs that, while blocking IKr, also possess compensatory pharmacological activity.  It might well produce more false-positive signals than false negatives (missed drugs with the potential for causing TdP).  The information above suggests that a set of pre-clinical studies might be superior to a Phase 1 human study for risk prediction in this area.

        The TQT Study has wide, international regulatory acceptance as a way of predicting a potential risk of TdP (or, more technically correct given the non-inferiority analysis relative to placebo, predicting the lack of potential risk of TdP), an arrhythmia that will occur in a small proportion of persons with an inappropriately prolonged QTc.  As noted above, about 80% of cases of TdP will revert to normal sinus rhythm.  However, about 20% will progress to Vfib, which is fatal without proper medical management.  Additional fatalities might occur when TdP that would otherwise have caused only a brief loss of consciousness occurs in the wrong circumstances (e.g., while swimming).

        We are aware of one other ADR for which there is regulatory acceptance for the use of a specialized Phase 1 study for risk prediction (again, absence of risk as analyzed).  This ADR is Substance Abuse Disorder (of the drug under evaluation).  If the drug has pharmacological activity similar to that of other drugs of abuse, or produces subjective experiences in patients similar to those of persons who abuse other drugs/substances, then a Phase 1 study (Human Abuse Potential [HAP] Study), using a population enriched for being prone to non-medical use of drugs similar to the one under evaluation, can be employed to address this potential ADR.

        As a final note regarding the two Phase 1 studies discussed above, both the TQT and HAP Studies are conducted with positive controls to confirm the assay sensitivity of the studies, and the inferential analysis for the drug-placebo comparison is a non-inferiority analysis, as noted above.  The null hypothesis that must be rejected for the study to be a success (from the investigator/sponsor perspective) is that the drug and placebo are different.  If the experiment is a success, it ‘proves’ (within an a priori magnitude of acceptable observed difference) that the drug is not inferior to placebo, that is, that it does not cause more cases (or a greater mean change) of the biomarker/predictor of the ADR than placebo.  Failure to reject the null hypothesis cannot be correctly interpreted as proving that the drug does have a risk of causing the ADR.
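
        The logic of the non-inferiority comparison described above can be sketched as follows.  This is a simplified illustration and not the analysis actually used in TQT or HAP Studies: it assumes a normal approximation, a single drug-minus-placebo mean difference with a known standard error, and a pre-specified margin supplied by the user.  The function names and the example numbers are hypothetical.

```python
from statistics import NormalDist


def upper_confidence_bound(mean_diff: float, se_diff: float,
                           alpha: float = 0.05) -> float:
    """One-sided upper (1 - alpha) confidence bound for drug minus placebo.

    Uses a normal approximation, which is an assumption of this sketch.
    """
    if se_diff < 0:
        raise ValueError("se_diff must be non-negative")
    z = NormalDist().inv_cdf(1.0 - alpha)
    return mean_diff + z * se_diff


def rejects_null_of_difference(mean_diff: float, se_diff: float,
                               margin: float, alpha: float = 0.05) -> bool:
    """True when the upper bound lies below the a priori margin.

    True  -> the null hypothesis (drug worse than placebo by at least the
             margin) is rejected: non-inferiority is concluded.
    False -> the null is not rejected; this does NOT prove that the drug
             carries the risk.
    """
    return upper_confidence_bound(mean_diff, se_diff, alpha) < margin


if __name__ == "__main__":
    margin_ms = 10.0  # hypothetical a priori margin for a QTc-type endpoint
    for diff, se in ((4.0, 2.0), (8.0, 2.0)):
        bound = upper_confidence_bound(diff, se)
        ok = rejects_null_of_difference(diff, se, margin_ms)
        print(f"difference {diff} ms, upper bound {bound:.2f} ms, "
              f"non-inferior: {ok}")
```

        The asymmetry in the return value mirrors the point made in the text: a result of True supports absence of risk within the margin, while a result of False is merely inconclusive.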

        There is an additional potential ADR, Type II diabetes mellitus, for which we believe the risk can potentially be adequately assessed with a set of two Phase 1 studies.  These two studies are a hyperglycemic glucose clamp study (evaluates pancreatic β-cells’ capacity to produce and release an appropriate amount of insulin in response to an increase in systemic glucose) and a hyperinsulinemic-euglycemic glucose clamp study (evaluates the capacity of hepatic cells and cells of other tissues [primarily muscle and adipose tissues] to respond to insulin and dispose of glucose [transport glucose into the cells]).  I have previously discussed the details of these studies (Beasley 2019).  Unfortunately, based on a review of virtually all placebo-controlled clamp studies conducted with olanzapine, there appears to be a lack of consensus on how these studies should be carried out and analyzed.  Without robust consensus among experts on both the adverse medical event of interest that might be an ADR for a drug of interest and the optimal conduct and analyses of such studies, the use of the studies for ruling in or ruling out risk is limited.

        The pair of glucose clamp studies differs from the TQT Study and HAP Study in that regulators do not require them for approval to market a drug.  As such, this pair of Phase 1 studies does not have implicit regulatory acceptance as a means of excluding the risk of the medical event of diabetes mellitus as associated with a drug.  However, this pair of studies has been demonstrated to be of adequate sensitivity in demonstrating the risk of diabetes mellitus (or hyperglycemia) with a range of drug classes such as corticosteroids and β-blockers.

        The following summarizes our response to Carlos.  If the cascading elements of pathophysiology that lead to a clinically significant adverse medical event are well understood, and a biomarker or risk predictor for these elements of pathophysiology can be found that would manifest itself in a substantial proportion of a small population treated with a drug of interest, then small Phase 1 studies might be able to determine risk (or lack thereof) for the adverse medical event as an ADR.  There must be a robust consensus on how to conduct and analyze a study that uses the biomarker / risk predictor as a dependent variable in an experiment for such a Phase 1 study to be truly useful.



Beasley CM. Comment on The Q-T interval and the Mellaril story: a cautionary tale.  August 20, 2015.

Beasley CM. Further comment on The Q-T interval and the Mellaril story: a cautionary tale.  February 25, 2016.

Beasley CM. Final comment on The Q-T interval and the Mellaril story: a cautionary tale.  January 25, 2018.

Beasley CM.  Reply to Edward Shorter’s comment – what we know and do not know by conventional statistical standards about whether a drug does or does not cause a specific side effect (adverse drug reaction) – olanzapine and diabetes mellitus, evolution of data – illustrating the difficulties in identification of adverse drug reactions.  July 4, 2019.

Hondeghem LM, Carlsson L, Duker G.  Instability and triangulation of the action potential predict serious proarrhythmia, but action potential duration prolongation is antiarrhythmic.  Circulation 2001; 103:2004-13.

Hondeghem LM.  QT prolongation is an unreliable predictor of ventricular arrhythmia.  Heart Rhythm 2008a; 5:1210-12.

Hondeghem LM.  Use and abuse of QT and TRIaD in cardiac safety research: importance of study design and conduct.  Eur J Pharmacol 2008b; 584:1-9.

Shah RR, Hondeghem LM.  Refining detection of drug-induced proarrhythmia: QT interval and TRIaD.  Heart Rhythm 2005; 2:758-772.

Shorter E.  The Q-T interval and the Mellaril story: a cautionary tale.   July 18, 2013.


May 7, 2020