Jose de Leon: Training psychiatrists to think like pharmacologists
27. Evidence-based versus personalized medicine

Donald F. Klein’s response to Jose de Leon’s response

Reading Dr. de Leon leads to the appalling observation that there is a vast amount of relevant (a slippery word) biological literature of which one is completely ignorant. Nonetheless, there are disagreements worth pursuing. I am all in favor of extended discussions between knowledgeable, friendly scientists. Dr. de Leon is fond of historical reminiscence, and this may remind him of the intense letter writing between scientists of the 18th century. Formal public discussion has disappeared from our scientific meetings, with other appalling consequences.

My Utopian wishes are not far from Dr. de Leon’s. An ancient, pre-web paper of mine addressed the package insert. It was respectably published in the Archives and sank like a stone (Klein, 1974). We agree that important vested interests would stifle such innovations. So, I think Dr. de Leon is a Utopian neighbor.

Like Dr. de Leon, I have found depending on highly intelligent, responsive statisticians both necessary and illuminating. I had the good fortune of a 30-year close association with Don Ross PhD, who amiably corrected many false ideas while brilliantly pursuing some good ones. I used Ancova extensively and even got involved with the interesting work that developed slopes that absolutely ignored outliers. However, when applied to our limited samples the effects were trivial, so I lost interest. I did learn that outlier influence was not only a matter of their percentage, but also of where they were placed. Some exerted substantial slope leverage and some did nothing but increase variance. I would guess that if the outliers formed a non-random isolated clump they could importantly affect slope, and would be just the sort of finding that Dr. de Leon pursues.
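The point about outlier placement can be illustrated with a minimal sketch. The data below are invented for illustration only (eleven points lying exactly on a line of slope 2), and the helper names `ols_slope` and `residual_variance` are hypothetical. The same 20-unit outlier is added once at the center of the x range and once at its edge.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def residual_variance(xs, ys):
    """Residual variance around the fitted least-squares line."""
    n = len(xs)
    slope = ols_slope(xs, ys)
    intercept = sum(ys) / n - slope * sum(xs) / n
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return rss / (n - 2)


# Invented data: x = 0..10, y = 2x exactly (true slope 2, no noise).
base_x = list(range(11))
base_y = [2.0 * x for x in base_x]

# One outlier 20 units above the line, placed at the mean of x.
center_x, center_y = base_x + [5], base_y + [30.0]

# The same 20-unit outlier, placed at the edge of the x range.
edge_x, edge_y = base_x + [10], base_y + [40.0]

slope_center = ols_slope(center_x, center_y)
slope_edge = ols_slope(edge_x, edge_y)
var_center = residual_variance(center_x, center_y)

print("slope, outlier at center:", round(slope_center, 3))
print("slope, outlier at edge:  ", round(slope_edge, 3))
print("residual variance, outlier at center:", round(var_center, 2))
```

In this toy case the central outlier leaves the slope unchanged and only inflates the residual variance, while the identical outlier at the edge pulls the slope upward. Real trial data would of course be noisier.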

My statistical concern was quite practical. Ancova uses only initial and final measures. Yet RCTs make regular intermediate measurements between initial findings and termination. That Ancova only used endpoints seemed a waste of time, effort and data. So, I asked Ross what to do.

We had the good fortune to be friends with the late Eberhard Uhlenhuth MD. Uhli and I shared many interests. This led to an unsung paper re-analyzing an earlier paper of Uhlenhuth’s that was loaded with detailed weekly evaluations but had used Ancova. Ross used a multivariate longitudinal procedure incorporating all the intermediate data (Ross, 2009). The key conclusion was “To increase power, it is often recommended to increase sample size. However, this is often impractical since a major proportion of the cost per subject is due to the initial evaluation. Increasing the number of repeated observations increases power economically and also allows detailed longitudinal trajectory analyses.” For those interested in more detail, I append our abstract below, before the references.
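The economic argument in that conclusion can be sketched with a back-of-envelope calculation. This is not the maximum likelihood longitudinal method Ross used. It is a deliberately simplified stand-in that assumes each subject’s follow-up visits are averaged, with a between-subject variance and an independent within-subject (visit-to-visit) variance. All numbers, names and costs below are hypothetical.

```python
import math


def se_of_difference(n_per_arm, k_visits, var_between, var_within):
    """Standard error of the difference between two arm means when each
    subject's outcome is the average of k follow-up visits (simplified model)."""
    var_subject_mean = var_between + var_within / k_visits
    return math.sqrt(2.0 * var_subject_mean / n_per_arm)


def trial_cost(n_per_arm, k_visits, cost_initial, cost_visit):
    """Total cost of a two-arm trial: one initial evaluation plus k visits per subject."""
    return 2 * n_per_arm * (cost_initial + k_visits * cost_visit)


# Hypothetical inputs, chosen only for illustration.
VAR_BETWEEN = 1.0    # variance between subjects
VAR_WITHIN = 4.0     # visit-to-visit variance within a subject
COST_INITIAL = 1000  # cost of the initial evaluation per subject
COST_VISIT = 100     # cost of each follow-up visit

# Design A: 50 subjects per arm, a single follow-up measure.
se_endpoint = se_of_difference(50, 1, VAR_BETWEEN, VAR_WITHIN)

# Design B: the same 50 subjects per arm, eight follow-up measures.
se_repeated = se_of_difference(50, 8, VAR_BETWEEN, VAR_WITHIN)
cost_repeated = trial_cost(50, 8, COST_INITIAL, COST_VISIT)

# Design C: keep a single follow-up but add subjects until the
# standard error matches Design B.
n_matching = math.ceil(2.0 * (VAR_BETWEEN + VAR_WITHIN) / se_repeated ** 2)
cost_more_subjects = trial_cost(n_matching, 1, COST_INITIAL, COST_VISIT)

print("SE, single follow-up:", round(se_endpoint, 3))
print("SE, eight follow-ups:", round(se_repeated, 3))
print("cost with eight follow-ups:", cost_repeated)
print("subjects per arm needed with one follow-up:", n_matching)
print("cost of that larger single-follow-up trial:", cost_more_subjects)
```

Under these assumed numbers, adding visits reaches a given precision far more cheaply than adding subjects. The size of the advantage depends entirely on the assumed variance ratio and cost ratio, and the gain shrinks when within-subject variance is small relative to between-subject variance.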

What raises my amateur statistical hackles is Dr. de Leon’s advocacy of initial stratification using pharmacological variables. It was unclear to me if that was in the context of Anova or Ancova. RCT Ancova starts with randomization of subjects to groups that get different treatments. Initial randomization is central to Ancova inferences. Subjects can have sharply different characteristics like different sexes or pharmacological characteristics, but there is no stratification. These characteristics may or may not be used as covariates according to the scientist’s judgment (before data collection) that they have effects on treatment outcome.

Using groups in Ancova whose pre-existing differences are nonrandom is a mistake that impairs inferences. It has been the subject of intense discussion. I don’t follow Dr. de Leon’s discussion of using regression weights to alter drug dosage. It’s probably explained in his cited papers, which I hope to read.

Dr. de Leon’s discussion of mathematical vs mechanistic approaches is interesting, but he gives me too much credit as a mathematical type. I’m a clinician whose unsupported psychoanalytic views were blown out of the water by the advent of psychopharmacology. In 1952, before we had any useful psychiatric drugs, I had the good fortune of being in charge of the acute male admissions ward and the male geriatric ward at Creedmoor State Hospital, a 6,000-bed jail. So, I have grown up with the field.

My persisting major interests are how to judge that a treatment is useful and whether one can predict different groups of individual trajectories from initial data. Over time I picked up some statistics in sheer self-defense.

As a clinician, my treatment choices are not mathematically based; they rest on the old cautions not to use a drug until it has been around for, say, two years, and to listen to other clinicians and scientists who have had direct experience with the drug, while avoiding the hype of advertising. However, I did not apply these cautions to imipramine and chlorpromazine, where the effects I observed directly were compatible with many independent reports.

Well-done RCTs with negative outcomes were pretty conclusive, although one third of the many imipramine trials were negative. Positive trials were impressive given few or insubstantial negative trials. These views received a sharp blow with the revelation that Pharma was concealing negative trials. This increased my focus on independent trials. These were all too few, especially as NIMH devoted its finances to dimensional synaptics.

With regard to pharmacokinetics, the early reports of hyper- and hypo-metabolism were persuasive, but what was their relevance? They were very relevant to interesting biology, but what about clinical practice?

My practice is to start with a very low dose and wait to see whether any marked shift in feelings, behavior or illness is induced. This was reinforced by experience with imipramine in treating panic disorder. When started on the usual antidepressant dose of 75 mg, a number of patients promptly accused me of poisoning them. On detailed review they reported a marked increase in the somatic symptoms of chronic anxiety. The patients often insisted that their panics had increased in severity and frequency, but that seemed due to their definition of feeling overwhelmed. They did not report acute unexpected paroxysms. I couldn’t attribute this to a marked hypo-metabolism since their blood levels were not elevated. Also, with slow increments they could reach the usual responses at 150 to 300 mg/day. So, my clinical belief was that starting with very low doses was a good move that lowered any concern about hypo-metabolism. As for hyper-metabolism, patients with no beneficial response to what seemed the correct treatment, and with minimal side effects, would have doses pushed without too much concern about FDA ceilings. These were established not on the basis of toxicity, but rather because there weren’t any data. Also, this might have been influenced by our controlled, quite safe study of up to 1200 mg/day of fluphenazine (Rifkin, 1971).

Given this clinical approach of start low, increase slowly and, with an eye on side effects, go high, it did not seem that pharmacokinetic studies were necessary. Dr. de Leon makes interesting points about their utility in drug interactions. There I plead innocence, but would like some data on how frequently such interactions occur and on their clinical impact.

I saw one patient with his wife. He was screaming with anxiety, to the point of incoherence, that he was dying of cancer. It turned out that he had a slow-growing, inoperable, peculiar malignancy that would kill him eventually. His wife declared that the screaming anxiety had started three weeks earlier. For two years he had been admirably stoical on 150 mg/day of bupropion. His wife denied any change in circumstances. With difficulty, since he only wanted to talk about his cancer, he asserted incoherently that his medication was the same. On repeated questioning, he muttered that his doctor had started a new, harmless anti-cancer drug a month ago. He carried some with his bupropion, to make sure that he would take them on schedule. Looking at the tablets, I saw they were ketoconazole, a known interaction producer, used in the treatment of pituitary adenomas. I couldn’t contact his doctor, so I took the responsibility, told him to stop the ketoconazole and sent him to a lab for a bupropion level. That turned out to be 10 times the normal level. In a week he was back to ordinary stoicism. When finally contacted, his knowledgeable doctor said that was odd; he didn’t think there were any reports of such an interaction. I checked and couldn’t find any, and I have never seen anyone else approximating that reaction. To sum up, data on the frequency of clinically meaningful interactions would be helpful. How to gather such data is another question.


Clinical trials with several measurement occasions are frequently analyzed using only the last available observation as the dependent variable (last observation carried forward [LOCF]). This ignores intermediate observations. We reanalyze, with complete data methods, a clinical trial previously reported using LOCF, comparing placebo and five dosage levels of moclobemide in the treatment of outpatients with panic disorder to illustrate the superiority of methods using repeated observations. We initially analyzed unprovoked and situational, major and minor attacks as the four dependent variables, by repeated measures maximum likelihood methods. The model included parameters for linear and curvilinear time trends and regression of measures during treatment on baseline measures. Significance tests using this method take into account the structure of the error covariance matrix. This makes the sphericity assumption irrelevant. Missingness is assumed to be unrelated to eventual outcome and the residuals are assumed to have a multivariate normal distribution. No differential treatment effects for limited attacks were found. Since similar results were obtained for both types of major attack, data for the two types of major attack were combined. Overall downward linear and negatively accelerated downward curvilinear time trends were found. There were highly significant treatment differences in the regression slopes of scores during treatment on baseline observations. For major attacks, all treatment groups improved over time. The flatter regression slopes, obtained with higher doses, indicated that higher doses result in uniformly lower attack rates regardless of initial severity. Lower doses do not lower the attack rate of severely ill patients to those achieved in the less severely ill. The clinical implication is that more severe patients require higher doses to attain best benefit. 
Further, the significance levels obtained by LOCF analyses were only in the 0.05–0.01 range, while significance levels of <0.00001 were obtained by these repeated measures analyses indicating increased power. The greater sensitivity to treatment effect of this complete data method is illustrated. To increase power, it is often recommended to increase sample size. However, this is often impractical since a major proportion of the cost per subject is due to the initial evaluation. Increasing the number of repeated observations increases power economically and also allows detailed longitudinal trajectory analyses.


Klein DF. What should the package insert be? Arch Gen Psychiatry 1974; 31: 735-741.

Rifkin A, Quitkin F, Carrillo C, Klein DF. Very high dosage fluphenazine for non-chronic treatment-refractory patients. Arch Gen Psychiatry 1971; 25: 398-40.

Ross DC, Klein DF, Uhlenhuth EH. Improved statistical analysis of moclobemide dose effects on panic disorder treatment. Eur Arch Psychiatry Clin Neurosci. Epub 2009 Nov 21.

May 18, 2017