Disgrace: On Marc Hauser
The serious involvement of the government in policing scientific misconduct began only in 1981, when hearings were convened by Al Gore, then a Congressman and chair of the investigations and oversight subcommittee of the House Science and Technology Committee, after an outbreak of egregious scandals. One was the case of John Long, a promising associate professor at Massachusetts General Hospital who was found to have faked cell lines in his research on Hodgkin’s disease. Another case involved Vijay Soman, an assistant professor at Yale Medical School. Soman plagiarized the research findings of Helena Wachslicht-Rodbard, who worked at the NIH. A paper Wachslicht-Rodbard had written about anorexia nervosa and insulin receptors had been sent for publication review to Soman’s mentor, Philip Felig, the vice chair of medicine at Yale. Felig gave it to Soman, who ghostwrote a rejection for Felig. Soman then stole the idea of Wachslicht-Rodbard’s paper and some of its words, fabricated his own supporting “data” and published his results with Felig as co-author.
At Gore’s hearings there was a parade of senior scientists and scientific administrators claiming that scientific fraud was not a problem. It involved only a few “bad apples,” they insisted, and in any case the scientific community could be trusted to tackle the problem and the government should steer clear of restricting scientific freedom. As Philip Handler, then president of the National Academy of Sciences, the most prestigious organization of US scientists, put it, “The matter of falsification of data…need not be a matter of general societal concern. It is rather a relatively small matter” in view of the “highly effective, democratic self-correcting mode” of science. After more well-publicized scandals, the federal Office of Scientific Integrity (later the ORI) was established to investigate allegations of scientific fraud in research supported by the NIH. The NSF established a similar office for its grantees.
The NIH and NSF now require all institutions that apply for research support to have a set of procedures for addressing allegations of scientific misconduct. In brief, the usual drill is that after an allegation is made to a department chair or dean, an inquiry is undertaken to determine if a formal investigation is warranted. If so, it is carried out by a small committee of faculty members from other departments. During both phases the accused scientist is given opportunities to respond, and the entire investigation is supposed to be confidential. The committee has full access to the accused scientist’s computer files, unpublished data and notes from research supported by the government.
If the investigation finds misconduct, the university can pursue a variety of actions, ranging from the removal of the scientist from the tarnished project to the withdrawal of the scientist’s published papers to his firing. The ORI or an equivalent federal agency then conducts its own investigation. It has the power to deny future research funds to the disgraced scientist. Federal prosecution for misuse of research funds is also a possibility. Partial or total secrecy is often maintained until after the federal investigation is completed. Sometimes the process of resolving scientific misconduct cases can be prolonged, as appeals of ORI decisions are possible. More recently, the NIH and the NSF have required training in “responsible conduct of research” for all students receiving research support. As a result there has been a spate of books, symposiums, workshops and research grants on the subject. In my teaching of the subject at Princeton and Berkeley I have used F.L. Macrina’s excellent Scientific Integrity, now in its third edition (2005). It contains historical background, current regulations and cases for class discussion in a range of subjects, including authorship, peer review, mentoring, use of animals and humans as subjects, record keeping and conflict of interest and of conscience.
* * *
Marc Hauser has worked at the exciting interface of cognition, evolution and development. As he explained on his website, his research has focused on “understanding which mental capacities are shared with other nonhuman primates and which are uniquely human,” and on determining “the evolutionarily ancient building blocks of our capacity for language, mathematics, music and morality.” Hauser has worked primarily with rhesus monkeys, cotton-top tamarins and human infants. Cotton-top tamarins are small South American monkeys similar to marmosets and, like them, are very cute indeed. (I too have worked with marmosets and rhesus monkeys.) Hauser’s laboratory was virtually the only one in the world working on cognition in tamarins, which made replication of his work almost impossible. In his studies comparing human infants with monkeys, Hauser and his research team would usually collect the monkey data, and his collaborators—such as the distinguished developmental psychologists Susan Carey, chair of the Harvard psychology department, and Elizabeth Spelke, another Harvard colleague—would collect the human data. Hauser also wrote papers with major figures in related fields, such as Chomsky in linguistics and Antonio Damasio in neuroscience. Hauser had joint federal grants with most of these senior figures.
A key motivation in Hauser’s work has been to demonstrate that monkeys have cognitive abilities previously thought to be present only in the great apes and humans. In an important 1970 study, Gordon Gallup Jr., now of the State University of New York, Albany, showed that chimpanzees can recognize themselves in a mirror. Gallup put a red spot on the forehead of chimpanzees, and when given a mirror most of the animals touched the red spot. Subsequent studies showed that the great apes (chimpanzees, bonobos, orangutans, gorillas) and humans more than 18 months old could pass the mirror test of self-recognition but not lesser apes like gibbons or the wide range of monkeys tested. In 1995 Hauser published a claim that his cotton-top tamarins could pass the test. Two years later Gallup co-wrote an attack on Hauser’s methodology. He later told the Boston Globe that after examining some of Hauser’s videotapes of the experimental results (other tapes were said to be lost), he had reported that Hauser had no evidence for his claims. Hauser tried to rebut Gallup in print but admitted in a 2001 article that he could not repeat his results; however, he never retracted his original article.
Meanwhile, experiments with elephants, dolphins, orcas and magpies have shown that these animals too can recognize themselves in a mirror, unlike any monkey. The magpie achievement is not surprising, as recent research has shown that magpies and other corvids, such as jays and crows, have a variety of cognitive abilities previously seen only in the great apes, such as tool use, foresight and role taking. These are cases of convergent evolution: apes and corvids do not have any common ancestor with these high-level cognitive skills; they arose in separate lineages. (Aesop was there first.) Darwin had tried to remove the human from the center of the biological universe, stressing its psychological and physical continuity with other living beings. Hauser seems to want to put humans and other primates, even the cotton-top tamarin, on a cognitive plane above other animals, like dolphins and crows, that have sophisticated cognitive skills but are not in the primate lineage.
* * *
The inquiry leading to Harvard’s 2007 investigation of Hauser was triggered by a delegation of three researchers in his lab. We know almost nothing from Hauser’s or Harvard’s statements about the nature of the students’ charges. However, an article by Tom Bartlett published in The Chronicle of Higher Education in August 2010 offers a glimpse into Hauser’s lab. It is based on a document provided to Bartlett, on condition of anonymity, by a former research assistant of Hauser’s. The document, Bartlett writes, “is the statement the research assistant gave to Harvard investigators in 2007.” As he explains, “one experiment in particular [had] led members of Mr. Hauser’s lab to become suspicious of his research and, in the end, to report their concerns about the professor to Harvard administrators.”
This experiment used a standard method in child and animal studies: a sound pattern is played repeatedly over a sound system and then changed, and if the animal then looks at the sound speaker the implication is that the animal noticed the change. In Hauser’s experiment, three tones (in a pattern like A-B-A) were played by the lab assistants. After the monkeys repeatedly heard this pattern, the scientists would modify it and observe if the monkeys had noticed the change in the sound pattern. Pattern recognition of this sort is considered to be a component of language acquisition.
The monkey’s behavior was videotaped and later “coded blind”—that is, the experimenters, without knowing which sound was being played, judged whether the monkey was looking at the speaker. When coding is done blind and independently by two observers, and the two sets of observations match closely, the results are assumed to be reliable.
Bartlett went on to explain that, according to the document that had been provided by the research assistant,
the experiment in question was coded by Mr. Hauser and a research assistant in his laboratory. A second research assistant was asked by Mr. Hauser to analyze the results. When the second research assistant analyzed the first research assistant’s codes, he found that the monkeys didn’t seem to notice the change in pattern. In fact, they looked at the speaker more often when the pattern was the same. In other words, the experiment was a bust.
But Mr. Hauser’s coding showed something else entirely: He found that the monkeys did notice the change in pattern—and, according to his numbers, the results were statistically significant. If his coding was right, the experiment was a big success.
The second research assistant was bothered by the discrepancy. How could two researchers watching the same videotapes arrive at such different conclusions? He suggested to Mr. Hauser that a third researcher should code the results. In an e-mail message to Mr. Hauser, a copy of which was provided to The Chronicle, the research assistant who analyzed the numbers explained his concern. “I don’t feel comfortable analyzing results/publishing data with that kind of skew until we can verify that with a third coder,” he wrote.
A graduate student agreed with the research assistant and joined him in pressing Mr. Hauser to allow the results to be checked, the document given to The Chronicle indicates. But Mr. Hauser resisted, repeatedly arguing against having a third researcher code the videotapes and writing that they should simply go with the data as he had already coded it. After several back-and-forths, it became plain that the professor was annoyed.
“i am getting a bit pissed here,” Mr. Hauser wrote in an e-mail to one research assistant. “there were no inconsistencies! let me repeat what happened. i coded everything. Then [a research assistant] coded all the trials highlighted in yellow. we only had one trial that didn’t agree. i then mistakenly told [another research assistant] to look at column B when he should have looked at column D…. we need to resolve this because I am not sure why we are going in circles.”