Forensic Science Put Jimmy Genrich in Prison for 24 Years. What if It Wasn’t Science?

A special investigation reveals a disastrous flaw affecting thousands of criminal convictions.

1: THE BOMBINGS

The first bomb didn’t kill anyone. It was planted in the ground-floor parking lot beneath the Two Rivers Convention Center in downtown Grand Junction, Colorado—a dry, boom-and-bust mining town west of the Rockies where the snowcapped mountains give way to mesa and valley. In the morning the sun strikes gold on the sheer, striated walls of the sandstone cliffs that surround the Grand Valley, and in the evening the cliffs are ribboned with shifting periwinkle shadows. At 9 pm on Valentine’s Day, 1991, Dennis Lamb was leaving a vocational banquet for School District 51, crossing the Two Rivers parking lot, when an explosion blasted shrapnel into the back of his right calf. “I thought I’d been shot,” Lamb would recall.

Three weeks later, on the morning of March 5, four members of the Gonzales family piled into the family van, parked outside their home in a neighborhood two miles northeast of downtown, to go to the mall. Twelve-year-old Maria Dolores Gonzales had stayed home from school with a headache, and her mother let her come along. When the van rolled forward, a bomb that had been hidden in the left rear wheel well exploded. Shrapnel rocketed through the vehicle’s carpeted floor and the plush back of Dolores’s seat, entering her back and instantly severing her aorta. When she failed to respond to her family’s panicked screams to get out of the van, they pulled her from the seat and blood pooled on the pavement. She died soon after.

Three months later, Suzann and Henry Ruble had just finished dinner and were leaving the Feed Lot, a downtown restaurant just blocks from the Two Rivers Convention Center. “What is that thing over there?” Suzann asked, pointing to an object she thought looked like the pneumatic tubes drive-up banks used to have. “It looks kind of strange, don’t you think?” She drove over and Henry leaned out to pick it up. As he brought it up to his chest, Suzann recalled, “There was a big loud explosion. It lifted the truck and Henry dropped.” Ruble’s arms were blown off and his body mangled. He died instantly. Debris from this third and most powerful bomb was collected across the block.

The next day, federal agents from the Bureau of Alcohol, Tobacco, and Firearms descended on the town of roughly 30,000 to assist local investigators with what appeared to be a serial bomber. No one had taken credit, and the victims seemed chosen at random. The police were stymied, people were scared, and the pressure to find a culprit mounted. “We certainly don’t want to create hysteria or paranoia in the community,” a police lieutenant told the Grand Junction Daily Sentinel after warning people to check around their cars for bombs. “But we do want people to be aware that there is a person or persons out there with no regard for human life.”

Investigators drew up a list of about 30 suspects, many of whom were known to local police to dabble in explosives—not an unusual hobby in a town of miners and ranchers. “People use dynamite. People work in the oil patch. People set bombs off for fun,” said Ellen Miller, a former correspondent for The Denver Post who covered the story. “I mean, seriously. People knew what pipe bombs were.” Then, in early July, police received a nervous call from a woman who worked at the Readmor bookstore on Main Street. A 28-year-old local man named Jimmy Genrich had asked them to order The Anarchist Cookbook, a manual that contains, among other things, instructions on how to make a pipe bomb. They refused to order the book and instead called the police, skyrocketing Genrich to the top of the list of suspects.

A covert detail was put on Genrich in mid-July, with two or more ATF agents following him around the clock. They sat in an unmarked car outside the boardinghouse where he lived alone on the top floor, sharing a bathroom with other residents, mostly single men; the house sat just over two blocks from the convention center. They followed him when he walked across the street, wending his way through a parking lot, down an alley, and into the back door of Suehiro’s, the Japanese restaurant on Main Street where he worked as a part-time dishwasher. They followed him when he left Suehiro’s at the end of his shift and walked to the Corral, a bar a few blocks away, where he drank hard and tried (and failed) to talk to girls. They followed him when he walked a few more blocks to the Cheers Lounge, where the girls danced for other men and wouldn’t talk to him. They followed him when he staggered back to his boardinghouse, drunk and angry, and they sat outside in the dark, all night, wakeful in case he should decide to go on one of his long nighttime walks. During the daylight hours, they saw his mother, a church organist named Sheila Greenlee, deliver his meals in a cooler.

One morning in late July, local detective Bob Russell, a tall man with watchful blue eyes and military shoulders, showed up at the boardinghouse with two ATF agents. They wanted inside. Genrich cracked the door, got angry, and didn’t want to let them in. (Later he said Russell illegally put a foot in the door and wouldn’t allow him to close it, which Russell denied.) But he soon found himself sitting on his sagging brown couch talking to Russell, who stood over him while federal investigators looked through his belongings. Tucked under a radio on his dresser, they found a handwritten note scrawled on the back of an IRS envelope: “If I end up killing some stuck-up bitch don’t blame me. I’ve asked everybody I know for help, but no one listens…. I’ve tried making friends with these girls around here, but they just keep treating me like I’m not good enough to talk to. Valentine’s Day is coming and I still don’t have a sweetheart…. It’s been over a year now, and nobody has tried to help me yet. These girls still won’t even talk to me. Fuck you all. I’ll get even…. If I can’t be happy, I might as well kill one.”

They came back to the boardinghouse with a warrant and ransacked his room, as well as the home of his mother and his gentle, owl-eyed stepfather, Wallace Greenlee. They found no Anarchist Cookbook, no bomb-making instructions, and, despite thoroughly vacuuming carpets and furniture, no traces of gunpowder anywhere. All they came up with was a toolbox of electronics equipment, including some Buss-type stereo fuses like those in the bombs; a blue envelope with more handwritten notes in a similarly disturbing vein; and common household tools such as pliers and wire-strippers. They sent the tools to a forensic analyst in Maryland named John O’Neil. The hope was that if O’Neil could match marks made by Genrich’s tools to marks found on recovered bomb fragments, they would have the physical evidence they needed to arrest him. Meanwhile, federal agents started openly tailing Genrich everywhere he went, blue jackets with yellow ATF letters flapping like neon signage.

By February 1992, nearly a year after the Valentine’s bomb, O’Neil said he had matched Genrich’s tools to all three bombs—plus an earlier, unexploded bomb from 1989, which had been found in the parking lot of the LaCourt Motor Lodge, right next door to the Two Rivers Convention Center. Investigators had used this bomb to figure out the other bombs’ unusual “signature” construction: a galvanized-steel pipe four to six inches long, covered at each end with distinctive “Coin”-brand metal caps, ignited by an internal motion-sensitive mercury switch that triggered a Buss-type fuse soldered to an Energizer Everlast battery. The district attorney, Steve ErkenBrack, convened the first grand jury that Grand Junction had seen in years, and Genrich was indicted on 10 counts, including multiple counts of first-degree murder. Grand Junction police chief Darold Sloan called it “the most comprehensive investigation in my 23 years in law enforcement.” All told, they spent more than $1 million. ErkenBrack is convinced they got the right guy. “The multiple bombings stopped,” he says, “as soon as we focused on Mr. Genrich and seized his tools.”

During the trial, ErkenBrack and the trial judge, Nicholas Massaro, agreed that Genrich’s fate hung on the toolmarks, the only physical evidence that connected him to the bombs. In the early 1990s, few people challenged the foundations of forensic methods such as toolmark analysis. Since then, despite CSI-style portrayals of forensic analysts as crime-solving oracles, prominent scientists and criminal-justice experts have questioned many of the “pattern-matching” disciplines that rely on comparisons of bite marks, hairs, shoe prints, tire tracks, or fingerprints. These are different from, say, forensic DNA analysis, which relies on scientific principles like the known variations in the human genome. In contrast, pattern-matching examiners exercise an enormous amount of subjective judgment in determining what constitutes a match. In 2009 and 2016, major reports from the National Academy of Sciences (NAS) and the President’s Council of Advisors on Science and Technology (PCAST) blasted pattern-matching disciplines as barely science at all. Nonetheless, most of these forensic techniques are as widespread today as they were when Genrich was convicted.

Calculating how many people might be incarcerated based on erroneous “matches” is notoriously difficult, but according to the Innocence Project, faulty forensic science was a factor in about half of all wrongful convictions in which the defendants were later exonerated by DNA testing. According to the National Registry of Exonerations, which looked at a larger set of cases that also included non-DNA exonerations and judged the factors differently, faulty forensics was a factor in 24 percent of wrongful convictions. One recent academic study, using its own methodology, found that faulty forensics was a factor in 34 percent of the wrongful convictions it examined. Many prosecutors and judges, however, insist there is no problem and that wrongful convictions are vanishingly rare. In 2007, Supreme Court Justice Antonin Scalia cited a prosecutor claiming courts convict with an “error rate of 0.027 percent—or a success rate of 99.973 percent.” The prosecutor had divided the number of known exonerations over a 15-year period (a few hundred) by the total number of felony convictions in that period (15 million). It is wildly unlikely, however, that all wrongful convictions have been discovered. One academic study estimates that in capital cases, which receive far more post-conviction scrutiny than do other cases, one in 25 people set to be executed will have been wrongfully convicted. However you crunch the numbers, they are appallingly high, and could mean that thousands of people are behind bars partly because juries were swayed by unproven “science.”
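To get a feel for the distance between the prosecutor's figure and the capital-case estimate, the two rates quoted above can be put on the same "one in N" footing. What follows is a rough sketch, not a statistical analysis: it uses only the two percentages from the text, the variable and function names are invented for illustration, and it sets aside the fact that the two figures describe different groups of cases (all felony convictions versus death sentences).

```python
# Figures quoted in the article; no new data is introduced here.
claimed_error_rate = 0.027 / 100   # the prosecutor's "error rate of 0.027 percent"
capital_case_estimate = 1 / 25     # the academic estimate for capital cases


def as_one_in(rate: float) -> int:
    """Express a rate as 'one in N', rounded to the nearest whole N."""
    return round(1 / rate)


# How many times larger the capital-case estimate is than the claimed rate.
gap = capital_case_estimate / claimed_error_rate

print(f"Claimed rate: about one in {as_one_in(claimed_error_rate):,}")
print(f"Capital-case estimate: one in {as_one_in(capital_case_estimate)}")
print(f"The estimate is roughly {gap:.0f} times the claimed rate")
```

On these numbers, the claimed rate works out to about one wrongful conviction in 3,704, and the capital-case estimate is roughly 148 times higher. The comparison is only as good as its inputs, but it shows how much depends on whether the numerator counts discovered errors or all errors.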

Today, Genrich is 55 years old and has been in prison for nearly 25 years for crimes he says he didn’t commit. His latest appeal has been taken up by the Innocence Project, in the hopes of not only freeing Genrich, but getting the courts to recognize recent scientific challenges to forensic pattern-matching techniques that affect hundreds of thousands of people at all levels of the criminal-justice system. In our investigation, we comprehensively reviewed the literature on handheld toolmarks published in forensic trade journals, dug through past legal rulings, pored over nearly 7,000 pages of trial transcripts, and conducted dozens of interviews with prosecutors, defense attorneys, forensic practitioners, judges, academics, and scientists, from Grand Junction to the Department of Justice. What we found was a startling lack of scientific support for forensic pattern-matching techniques such as toolmark analysis; a legal system that has failed to separate nonsense from science even in capital cases; and a consensus among prosecutors all the way up to the attorney general’s office that scientifically dubious forensic techniques should be not only protected, but expanded. With Donald Trump in the White House and Jeff Sessions at the helm of the DOJ, the nominal momentum for forensic-science reform spurred by the two major reports is slowing. Genrich’s case reveals a system that makes it nearly impossible to throw unproven forensic science out of courts and may be keeping thousands of innocent people behind bars.

2: THE ORIGIN OF FORENSICS

On a bitterly cold Valentine’s Day in 1929, four men hired by Al Capone entered an unheated garage on Chicago’s North Side and ordered the seven men inside to line up against a brick wall. Two men in suits and two men dressed as police officers carrying Tommy guns unleashed a barrage of bullets into henchmen of the infamous Chicago mobster George “Bugs” Moran. The police were stymied until they raided Fred “Killer” Burke’s house and found guns they suspected might have been used in the massacre. Burke wouldn’t confess, and the guns were the best evidence linking him to the crime, so they sent the weapons to Calvin Goddard, a former physician and pioneer in the new field of “forensic ballistics” at one of the nation’s first crime laboratories. (His new method of matching bullet casings to guns had played a role in the controversial 1927 execution of Italian-American anarchists Sacco and Vanzetti.) Goddard fired “test” bullets from Burke’s guns and, using a “split-image” comparison microscope he had helped invent for the purpose, matched grooved marks left on the test bullets and casings to those on bullets and casings found at the crime scene.

Goddard’s forensic ballistics is now known as “firearm and toolmark analysis,” and the field has since grown to include hundreds of examiners in crime labs nationwide. While making matches with household tools is less common, convictions have been secured in part based on marks left by knives, bolt cutters, bayonets, scissors, screwdrivers, pipe wrenches, or—as in Genrich’s case—pliers and wire-strippers. These marks are often harder to parse than those on bullets, because while all bullets fired from a gun follow the same path down the same metal barrel, toolmarks depend on multiple variable factors such as the angle and pressure with which a tool is applied to a surface, which may be hard or soft, spongy or brittle.

Firearm and toolmark analysis emerged out of a national push in the early 20th century to professionalize police investigative techniques at a moment when Americans were particularly enamored with science. Law enforcement borrowed terms from science, establishing crime “laboratories” staffed by forensic “scientists” who announced “theories” cloaked in their own specialized jargon. But forensic “science” focused on inventing clever ways to solve cases and win convictions; it was never about forming theories and testing them according to basic scientific standards. By adopting the trappings of science, the forensic disciplines co-opted its authority while abandoning its methods.

Amid the swirl of new forensic techniques, the courts realized there had to be a gatekeeping mechanism to filter out quackery. In 1923, the DC Court of Appeals provided that mechanism in Frye v. United States. The judges rejected a doctor’s dubious claim that he could use a polygraph to detect when a person was lying from a rise in their blood pressure. In the ruling, the court said that in order for scientific evidence or expert testimony to be admitted, it must be offered by an experienced practitioner making inferences from a “well-recognized scientific principle” that has “general acceptance in the particular field in which it belongs.” In Frye, the judges deemed the scientists in the “particular field” relevant to polygraph use to include psychologists and physiologists—not just polygraph practitioners who would, presumably, be biased toward preserving the technique’s reputation. The effectiveness of Frye in keeping dubious science out of the courts depends on whom judges include in their definition of the “relevant scientific community.” But as the decades wore on, and the forensic disciplines gained influence, judges tended to restrict their definition of the “relevant scientific community” to the forensic examiners themselves. Judges began taking advice on what counted as good forensics from the very people who invented the techniques and made a living off of them.

In the American criminal-justice system, where prosecutors regularly battle defense attorneys over what constitutes valid evidence, judges’ rulings on admissibility are the final word. Once a technique has made it into court and survived appeals, subsequent judges, most of whom have no scientific training and little ability to assess the scientific validity of a technique, will continue to allow it by citing precedent. Forensic examiners, in turn, cite precedent in order to claim that their techniques are reliable science. Prosecutors point to guilty verdicts as evidence that the science brought to court was sound. In this circular way, legal rulings—which never really vetted the science to begin with—substitute for scientific proof. This is Frye’s fatal flaw: Nowhere in this process is anyone required to provide empirical evidence that the techniques work as advertised. Frye aimed to keep pseudoscience out of the courts, but instead has helped create the perfect conditions to keep it in.

3: THE TOOLMARK ANALYST

By the time Genrich went on trial, the high-profile case had so saturated the local news that it was moved to the town of Greeley, on the other side of the Rockies. Inside an imposing courtroom with walls built of rare Colorado white marble—the same white marble used for the Tomb of the Unknown Soldier and the Lincoln Memorial—Genrich sat at the defense table waiting for a break in the day’s proceedings, when he would be allowed to play on the Game Boy that his public defender Roberta “Bert” Nieslanik brought him every day in her purse. He sat quietly next to co-counsel Greg Greer, who would occasionally reach out and place a hand gently on his arm. It was not clear to observers if Greer was comforting his client or subtly managing potential outbursts.

Ten days into the grueling five-week trial, ATF toolmark examiner John O’Neil took the stand. When tools are manufactured, he explained to the jurors ensconced in the marble jury box, processes such as grinding and milling create unique microscopic traits that can be used to distinguish even mass-produced tools. “Microscopically,” O’Neil said, “we move from scratches to ridges and valleys. It becomes topography.”

In the ATF lab outside Washington, DC, O’Neil had taken Genrich’s pliers and carefully scraped their open jaws across sheets of lead, copper, and aluminum to simulate the striated marks left behind when tightening a metal end cap. He had snipped pieces of copper wire similar to wires found in the debris swept up after the bombings. He then darkened the room and placed the samples side by side on his comparison microscope, overlaying the marks from his test cuts with the marks found on the debris. He would tilt the light source to the side, he said, to illuminate the “ridges and valleys,” to “follow the flow of that shadow line in and out of the striae.”

In a dramatic video presentation—one of the first of its kind in the nation—he showed the jury exactly how he made his matches. The video started with a still frame: a split screen of the ends of two pieces of wire. On the right was the wire O’Neil had cut using Genrich’s red-handled needle-nose pliers. On the left was a scrap of insulated wire from the unexploded 1989 bomb found outside the LaCourt motel. Then the camera began to zoom in. “At 20 times its normal size,” O’Neil said, “you’re beginning to see some features of the topography…it has contours.” As it zoomed to 40, then to 80, he explained that he had to remove the light’s filter and bring its angle down “to see the shadows.” He pointed to two lines about a quarter of the way from the top of the screen that ran parallel to the cut. “And you simply find two lines similar to that in size and shape and follow them through and see if the rest of what you have falls in place.” He testified that this alignment was so improbable that Genrich’s tool must have cut the wire in the bomb, “to the exclusion of any other tool” in the world. He then proceeded to match Genrich’s wire-strippers to a wire found at the Valentine’s Day bomb site, and matched his yellow-handled slip-joint pliers to scratches on the distinctive “Coin”-brand end-cap fragments that had been found at the scene of the Feed Lot bombing and that had sliced through the aorta of Dolores Gonzales.

When Nieslanik first saw this evidence, she was concerned. “I thought it was going to be a science,” she says. Barely five feet tall, with a pixie cut and turquoise cowboy boots, she talks fast and rarely sits still. She is a tiny woman who takes up space. In preparation for trial, Nieslanik scoured the professional journals published by forensic practitioners to try and understand the technique. Nieslanik had majored in chemistry in college before switching to women’s studies, and she had worked as the chemistry and physics laboratory coordinator at Mesa College before going to law school, so she was familiar with basic scientific concepts such as experimental design and statistics. She understood that just because Genrich’s tool had made a particular mark, that did not prove it was the only tool that could make that mark. The wire-strippers and pliers in Genrich’s toolbox were incredibly common—Nieslanik said she owned a pair. So how could the examiner know that the marks were unique to this set of wire-strippers or that set of pliers?

To her surprise, Nieslanik could find no scientific studies to back up the claims O’Neil was making. There was no standardized protocol to be followed. There were no criteria for how many points of similarity constituted a unique match. It seemed to be just O’Neil’s subjective judgment. Then Nieslanik discovered that O’Neil had not submitted most of his test cuts into evidence. (The judge, infuriated upon learning this, held O’Neil in contempt of court.) O’Neil had decided over 50 cuts “were of little or no value” because they didn’t match. “I thought, ‘They just cut and cut and cut until they get one that matches,’” Nieslanik says. She recalls turning to her co-counsel and saying, “Holy shit, this is not science. It’s just not science. It’s like voodoo.”

Even though she was convinced the field was “voodoo,” Nieslanik followed legal protocol and requested an independent review of O’Neil’s work. “It’s a science that’s questionable,” she explains, “but you have to hire somebody that believes in it to advise you about it.” A team of two other toolmark examiners came to look over O’Neil’s work. By the time they arrived, he had already set up the evidence for them, marking his matches with blue dots, a practice that could bias any examiner’s analysis. But the other examiners only agreed with O’Neil’s match to the Valentine’s Day bomb. All the other marks, they said, were inconclusive.

These conflicting results reinforced Nieslanik’s intuition that the technique was unreliable, so she decided on a risky tactic: not just to challenge O’Neil, but to question the premise of his entire field, one that had been accepted into court since the turn of the 20th century. She called up Don Searls, a math professor at the local university who was unfamiliar with toolmark analysis. Nieslanik sent him the few papers she could find describing the field. Searls, who consulted for aerospace and pharmaceutical companies on experimental design, including for a NASA Apollo mission, was shocked. He testified that the field of toolmark analysis “does not have a scientific basis” and could not provide “credible evidence.” He laid out how he would design a proper test, pointing out that several truly independent examiners—not ones who had blue dots arranged for them in advance—should test several tools of similar wear and tear, ideally without knowing which ones were the suspect’s.

The prosecutor, ErkenBrack, countered with an analogy. “If I have a babysitter,” he said, “and I come home and somebody has been in the fudge and there’s a little tiny handprint on the plate, are you telling me that in order to be statistically valid, I need to get in four other three-year-olds to compare that handprint?” But ErkenBrack’s commonsense analogy is flawed in a way that gets at the heart of the issue. He would only have to distinguish between the 3-year-old’s hands and the babysitter’s hands. The problem for forensic pattern-matchers is much more difficult. Imagine trying to figure out which 3-year-old stole a cookie from a picnic table in a New York City park, where any toddler could have wandered by. O’Neil had to conclude that Genrich’s tools were the only ones in the world that could have made the marks.

Claims like “to the exclusion of any other tool” need to be supported by scientific studies that answer two crucial questions. First, do tools leave unique marks? If this is true in principle, you’d need to test if the technique works in practice: How reliably can human examiners distinguish toolmarks under conditions that resemble casework? In other words, how often do they make errors?

No human endeavor is perfect, yet many forensic examiners claim “zero” or near-zero error rates. In a widely cited 1984 paper in the Journal of Forensic Sciences, bite-mark examiners claimed a coincidental match would occur less than one in 10 quadrillion times. But when actually tested, even the most experienced examiners were wrong about one in six times, and in one study they struggled to distinguish a child’s bite mark from an adult’s. In 2009, the chief of the FBI Firearms-Toolmarks Unit wrote that a qualified examiner will “rarely if ever commit a false positive error (misidentification).” In practice, error rates for matching bullets to firearms can be dramatically higher: In 2008, the Detroit Police Department’s crime lab was shuttered when auditors found that its examiners made one error in every 10 cases. The head of the FBI’s fingerprint laboratory testified that its error rate was one in 11 million—because he knew of only one error in the FBI’s 11 million comparisons—but subsequent tests of fingerprint examiners show error rates ranging from one in 680 to one in 24.
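The fingerprint figures make the gap concrete. As a back-of-the-envelope sketch, one can ask how many errors each rate would predict across the same 11 million comparisons. The rates come from the paragraph above; the names in the code are invented for illustration, and the calculation assumes, purely for the sake of argument, that error rates measured in tests would carry over unchanged to real casework.

```python
from fractions import Fraction

# Rates quoted in the article for fingerprint comparison.
claimed_rate = Fraction(1, 11_000_000)  # one known error in 11 million comparisons
tested_rate_low = Fraction(1, 680)      # lowest rate found in later examiner tests
tested_rate_high = Fraction(1, 24)      # highest rate found in later examiner tests

comparisons = 11_000_000                # the FBI's own count of comparisons


def expected_errors(rate: Fraction, n: int) -> Fraction:
    """Expected number of errors if every comparison erred at the given rate."""
    return rate * n


for label, rate in [
    ("claimed", claimed_rate),
    ("tested, low end", tested_rate_low),
    ("tested, high end", tested_rate_high),
]:
    errors = expected_errors(rate, comparisons)
    print(f"{label}: about {round(errors):,} errors in {comparisons:,} comparisons")
```

Under that assumption, the claimed rate predicts a single error, while the tested rates predict somewhere between about 16,000 and about 458,000. Counting only the errors that happen to come to light says little about how many occurred.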

These overblown and largely imaginary numbers—and forensic testimony offered with the certainty O’Neil claimed—are dangerous, because they give a false sense of scientific precision to juries and contribute to wrongful convictions. When examiners testify that they can make a match “to a reasonable degree of scientific certainty,” they are making what sounds like a statistical statement. Dr. Searls’s point was that they hadn’t done the studies required to back up such a statement, so there was no way for O’Neil to support his claims—the “match” was simply what he subjectively judged to be true. O’Neil, who is now retired and spends his time making rosaries, stands by his testimony as well as his decision to throw out the test cuts that didn’t match. “If I didn’t believe that I had found something, then I wouldn’t have testified to it.”

But abstract arguments, while convincing to statisticians like Dr. Searls, do not have the same effect on juries as a video where they can see lines matching up. As deliberations started in the Genrich case, the jury was split 50/50 on the question of his guilt. They asked to see O’Neil’s video several times while deliberating. After an agonizing four days, they delivered the guilty verdict. “They all matched. It was a perfect match,” juror David Trujillo later told the local paper.

4: “A FORENSIC COMMUNITY IN DISARRAY”

Over the past century, thanks to the 1923 Frye ruling putting judges in charge of evaluating science, dubious forensic techniques have proved devilishly difficult to expel from the courts. Consider bite-mark matching, which arose out of a single case in 1974, when Walter Edgar Marx was convicted of involuntary manslaughter. Investigators had exhumed the body of the victim six weeks after she’d been buried, and three dentists matched a series of lacerations around her nose to a plaster cast of Marx’s bite. An appeals court acknowledged that there weren’t any scientific studies validating the technique, or showing that bite-mark examiners could reliably match a bite mark left on skin to a person’s teeth, but found the three dentists to have credible expertise matching dental remains from deceased people to patients’ dental records. The court inferred that looking at bite marks left in a cadaver’s skin was sufficiently similar. The judges seemed impressed by the methods, which included X-rays and 3-D models. In a final rebuke to science, the court ruled that not to allow the evidence would be to “sacrifice…common sense.” This case established the precedent for almost all subsequent rulings on bite-mark evidence, including the 1983 case of a Wisconsin man named Robert Lee Stinson, who was convicted in the rape and murder of a 63-year-old woman almost entirely on bite-mark evidence. Using a photograph of the victim’s body and a plaster cast of Stinson’s teeth, dentists testified that the marks matched “to a reasonable degree of scientific certainty.” The appeals court was convinced by the analysis: “the lateral incisor in the upper jaw was set back from the other teeth; all of the upper front teeth were flared; the lower right lateral incisor was worn to a pointed edge; the right incisor was set out from the other teeth on the lower jaw.” The dentists even compared Stinson’s bite to his twin brother’s. The judges concluded that the “bite-mark evidence in this case was sufficient to exclude to a moral certainty every reasonable hypothesis of innocence,” upholding Stinson’s conviction.

Twenty-three years later, DNA evidence exonerated Stinson. The 3-D models, the detailed descriptions of his incisors, the “moral certainty” of his guilt—everyone had been convinced. And everyone had been wrong. The scientific tide has since turned. Dentists have recanted their testimony and disavowed the method. The Texas Commission on Forensic Science called for a “moratorium” on bite-mark evidence. A study conducted by the president-elect of the American Board of Forensic Odontology revealed that 96 percent of the time, bite-mark examiners couldn’t unanimously agree on whether a bite mark came from a human, and the organization advised its members not to make identifications based on bite marks alone. And yet Stinson’s case still stands as precedent in Wisconsin courts, meaning other judges can cite it to admit bite-mark evidence. Its use is in decline, but there has never been a single ruling to exclude it.

Shockingly, the Supreme Court didn’t weigh in on the admissibility of forensic evidence until 70 years after Frye, in 1993—about two months after Genrich was found guilty. The high court’s ruling mandated that judges allow only scientific evidence supported by testable claims, and that proponents of the evidence must be able to provide measures of how often examiners make mistakes. What’s now known as the Daubert standard is federal law and has been adopted by most states, but it has had little effect in criminal law because most judges still rely on precedent, assuming evidence was vetted in past cases. “Judges,” said Harry T. Edwards, chief judge for the DC Circuit Court of Appeals, at a Harvard event last October, “believe that because we said it before, it must be right, and because these practitioners have been around for a long time, it must be right. In other words, history is the proof.” When it comes to booting flawed science out of criminal courts, “Daubert,” said Judge Edwards, “has largely been a failure.”

In 2009, the National Academy of Sciences performed the most sweeping independent survey of the state of forensic science to date. It was a bombshell. “Much forensic evidence—including, for example, bitemarks and firearm and toolmark identifications—is introduced in criminal trials without any meaningful scientific validation, determination of error rates, or reliability testing to explain the limits of the discipline,” the report noted. Forensic examiners in pattern-matching disciplines, it concluded, had no scientific basis for making claims of certainty in court. Professional societies gave no guidelines for testimony. Labs had no standard accreditation or certification procedures. There had been little research on variability, reliability, or human bias. Judge Edwards, who co-chaired one of the report’s subcommittees, said at Harvard, “We found a forensic community in disarray.”

While some in the forensic community dismissed the report, others embraced its recommendations, and for several years, a spirit of reform was in the air. The overall feeling was that more research was needed. “In forensic science,” says John Murdock, a preeminent former ATF firearm and toolmark examiner, “the research has been done by practitioners.” When we reached him by phone, Murdock was clearing his weekend schedule to participate in a large-scale study sent out by the FBI that required matching sets of fired bullets. He had also paid, out of pocket, to receive samples for a high-quality examiner proficiency test being conducted in the Netherlands using polymer replicas of bullets. He says this test is much more difficult than the relatively easy matching exams usually given by private companies. While these efforts are laudable, placing responsibility in the hands of working practitioners such as Murdock is not a feasible research strategy. Forensics lacks the infrastructure and the funding to support research. “It’s hard to create a research culture when you can’t afford research,” said Victor Weedn, a professor of forensic sciences at George Washington University and past president of the American Academy of Forensic Sciences. Many argue that the research shouldn’t be done by forensic examiners at all, but by academic scientists, who have university infrastructure and funding and can independently evaluate examiners’ claims.

In 2013, the Department of Justice under President Obama established the National Commission on Forensic Science, an interdisciplinary advisory committee including forensic practitioners as well as prominent scientists and attorneys, as a way of “passing the torch in forensic reform from the National Academy of Sciences.” The commission met quarterly and made nonbinding recommendations to the DOJ, some of which were adopted, including a new code of professional conduct for laboratories and the recommendation to drop the phrase “to a reasonable degree of scientific certainty” from examiner testimony—a dangerously misleading statement in the absence of empirical data. Grant funding for research was increased, and the National Institute of Standards and Technology established working groups for each forensic discipline to set standards for the field.

Then, in 2016, Obama’s scientific advisory council, PCAST, comprising prominent scientists and experts in academia and industry, issued a follow-up report. The years of nominal reform had apparently had little substantive effect on getting unproven “science” out of the courts. “It has become increasingly clear in recent years that lack of rigor in the assessment of the scientific validity of forensic evidence is not just a hypothetical problem but a real and significant weakness in the judicial system,” the report concluded. One of PCAST’s most contentious conclusions was that firearm and toolmark analysis lacked empirical support. The council found a number of studies that “sought to estimate the accuracy of examiners’ conclusions” but said only one study, which included 218 examiners, was “appropriately-designed” according to basic scientific standards. Despite examiners’ claims of near-infallibility, that study found a false-match rate of one in 100. A vast body of research of widely varying quality suggests that matching bullets to guns may have a scientific basis, though examiners are nowhere near infallible. For a widespread technique like firearm analysis, the lack of empirical data for error rates was shocking.

Because PCAST did not directly assess the skill of examiners who match marks from handheld tools like wire-strippers and pliers, we conducted our own review of the field and consulted with several leading toolmark examiners. We found a wealth of literature on the uniqueness of marks and the manufacturing process; a growing literature on quantifying the variability of striations; and results of examiner-proficiency tests. However, we found only a single study, from 2009, that tested toolmark examiners’ abilities in a controlled setting. The FBI tested eight of its own examiners analyzing marks left by screwdrivers. Promisingly, these eight examiners made no errors. But one small study, in which the researchers have a vested interest in the outcome, on one type of tool, is hardly a validation of the field, leaving the crucial question of error rates unanswered. Because firearms are a specialized type of tool, examiners point to firearms studies to support their work with handheld tools, but what seem like logical inferences don’t always hold up: Forensic dentists wrongly thought they could match bite marks because they could identify people from dental remains. Without empirical research, it’s difficult to say if handheld toolmark matching is more like firearms (i.e., needs more research) or more like bite marks (i.e., needs to be abandoned). Regardless, it’s wildly unlikely that examiners can make matches “to the exclusion of all other tools.”

Rather than spurring a constructive debate about reform, PCAST dramatically widened a growing divide between mainstream academia and the forensic community. As defense attorneys reached for the reports to challenge forensic evidence in court, prosecutors rushed to defend one of their most powerful tools. Battle lines were drawn: Forensic examiners, law enforcement, and prosecutors gathered on one side, defense attorneys and mainstream scientists on the other. It was a perfect political storm.

5: JIMMY GENRICH’S PAST

Sheila Greenlee wears a bright floral housedress at the dining-room table, smoothing her flyaway gray bob when she needs to gather her thoughts. She is prone to brief fits of giggling when her memory alights on places too painful to stay for long. A grand piano takes up most of the living room in the split-level home she shares with her husband, Wallace, the choir director at the Methodist church where she played the organ for so many years. Their modest house sits at the end of a quiet cul-de-sac on the southwest edge of town, just shy of the sheer-walled canyons of the Colorado National Monument. Sheila describes her son Jimmy as a shy young man who liked books and Nintendo video games. She also says he grew up poor and was abused as a child by his father, a violent drunk. Genrich had two brothers and one sister until his little brother Teddy had a swimming accident one Fourth of July. Genrich was 18 years old, and he was devastated by Teddy’s death.

James Martinez, a childhood friend who grew up across the street and now owns a motorcycle- and car-repair shop, remembers Genrich as a nerdy kid who rarely spoke unless spoken to. “We were all scared of his father,” says Martinez. “I remember we’d be in his room, listening to AC/DC, and we’d all go out the window when his dad came home. Jimmy would protect his little brother, and get his ass kicked for it.”

It seemed to Sheila, for a time, that her awkward son might be all right. After high school, Genrich moved to Phoenix to attend DeVry University, where he studied electronics (a fact made much of during the trial). When Martinez moved to Phoenix to work in a custom-hot-rod shop, Genrich was already there. “He had good grades. He was doing well. Where he fell into trouble might have been my fault.” Martinez shakes his head regretfully. “My best friends were all longhairs and bikers in bands.” He introduced the bikers to Genrich, who moved into a spare room in their rowdy house. “People thought he was weird,” says Martinez. “As he got older, he was still the nerd, but he turned into a man. He would say, ‘No woman will ever want me—I’m the ugliest guy in the room.’” His grades at DeVry plummeted and he struggled with a required English class, then eventually dropped out. He was drinking heavily and, Martinez says, picked up a meth habit from his biker housemates.

When investigators later questioned Martinez, he insisted he never saw Genrich making bombs. “I would have known,” he says. “He was living in my house. Jimmy was my friend, but he wasn’t that good of a friend where I would lie for him.”

Genrich moved back to Grand Junction in July of 1989 and struggled to pull a life together. He worked various jobs, including a stint as a dishwasher at the Two Rivers Convention Center—a boss later described him as “the sole problem employee.” His boss at Suehiro’s, where he was also a dishwasher, told the Daily Sentinel he had a temper. “He would yell at me and leave. Then come back a few days later and say he’s sorry. I always gave him his job back because he’s a good person,” his boss said. “He’s very honest.”

Genrich desperately wanted a girlfriend but was consumed by frustrations that girls wouldn’t “give him the time of day.” Sheila says he struggled with loneliness. “He craved it, and yet he didn’t want to be alone,” she says. “He used to say to me, ‘Mom, you have to help me find a girlfriend.’” Sheila and Wallace agree that he was increasingly “moody,” and he began to get into trouble for angry outbursts. They took him along on a car trip to Canada, to visit Sheila’s family, and he “threw rocks at cars because the people weren’t friendly.” Back in Grand Junction, Genrich was arrested for breaking the glass of an Albertsons exit door after he claimed employees snubbed him.

On February 10, 1990, Sheila took her son to St. Mary’s hospital for an involuntary, 72-hour psychiatric evaluation. He had attempted suicide once before, as a teenager, when he swallowed a bottle of aspirin after a girl had rejected him. In an affidavit, later quoted by investigators in a search warrant, Sheila stated that her son “appears to be mentally ill and, as a result of such illness, appears to be a danger to himself or others.”

There were more outbursts. In 1991, he furiously knocked books from shelves at the library after the librarians, he said, wouldn’t help him. At the Readmor bookstore, where his request for The Anarchist Cookbook made him the lead suspect, Genrich knocked books from shelves and spit on the windows. When asked why her son would want to order that book, Sheila sighs. “Knowing Jimmy, they were probably giving him a hard time.” Genrich later told an ATF agent he had tried to order it to “piss the lady off at the bookstore.”

After the ATF started trailing him, Genrich met his future lawyer Bert Nieslanik by climbing the steps to the public defender’s office to ask if the agents were allowed to follow him. He complained about them to Nieslanik, to his mother, and to the agents themselves. He took to talking with the agents, sometimes getting into their car for an hour or more to vent his frustrations about his life and ask when they were going to start looking for the real bomber. One night, according to an ATF agent, Genrich came out of his boardinghouse and got into the car with a handful of photographs. “He had brought me pictures of some bookshelves…we were talking about the tools, and he was telling me about this woodworking project that he had…he asked if I wanted to come in and take a look at these pictures, and I declined that night, but he brought the pictures out. He showed me the pictures of bookshelves that he had made for his mom and, also, a postcard of Crazy Horse Monument, up in South Dakota, I believe.”

Nieslanik says the attention from the ATF agents was a double-edged sword. Genrich had no one else to talk to, but having the feds follow him around made him a target for the town’s fears about the bombings—television-news cameras sometimes followed in the agents’ wake. One night, a man in the Corral attacked Genrich, bringing him to his knees. He was hurt and confused that the agents, who were there, did nothing to protect him. In July, ATF agent Larry Kresl took Genrich to lunch, then said he’d take him to the Job Services Center and help him find a job. Instead, Kresl took him to a room hung with grisly pictures of the victims’ bodies, and tried to elicit a confession. Genrich did not confess.

Four days later, agents followed him to Teddy’s grave and suggested Genrich commit suicide “so we can all go home.” After hearing at trial of the agents’ behavior, Judge Massaro said that “more repulsive governmental conduct, absent the infliction of actual harm, is difficult to imagine.”

In August, a federal agent named Bill Frangis came to Sheila and Wallace’s home and told them the case against Genrich was “overwhelming.” Sheila agitatedly clasps and unclasps her hands on the dining-room table as she remembers. “If the locals got him,” she recalls Frangis telling her, “he would be put to death. But if the feds looked after him, he’d try to save him.” Sheila glances at Wallace, who regards her calmly. “He was charming,” she continues. “I made him tea. He acted like he was going to help us.” Frangis asked Sheila to wear a wire and try to get her son to confess. “I was stunned,” she says, “because I knew that he didn’t do it.” Once she heard about the death penalty, Sheila says, “I was terrified,” but she couldn’t bring herself to wear the wire. Wallace, sensing her agony, stepped in and offered to wear it. “It was just like, his life was in danger,” Wallace recalls in a soft voice. A few weeks later, the feds came back and taped a wire to Wallace’s chest. In a tearful conversation in the living room, while ATF agents sat outside in their car listening in, Sheila and Wallace told Genrich that they loved him, that they would love him no matter what, and that he just needed to tell them if he did it. Sheila says she was trying to save his life. “He looked at me and said, ‘Mom, do you believe I did that?’” Genrich did not confess.

Before all this, Sheila says, “I believed in [law enforcement] wholeheartedly. I believed in them, A-one. Boy, has that changed. And I don’t say they’re all bad, by any means. But I can understand how they can twist things.”

Martinez understands that Genrich comes off as “weird.” But, he insists, “He didn’t have a violent bone in his body. He never hurt anybody. He was the one getting hurt.” Martinez shakes his head. “He was like that puppy who’d been beat, and you look at him and their tail goes between their legs, and they’re hunched over. That was Jimmy. Intimidated, you know?” He is particularly skeptical they got the right guy because on April 14, 1989, when the unexploded bomb was found outside the LaCourt motel, Genrich had an ironclad alibi.

6: REASONABLE DOUBT

One of the remarkable things about the Genrich case is that other than the toolmarks, it was built almost entirely on circumstantial evidence. There was no confession, no reliable eyewitness, no gunpowder, no DNA. Just an isolated, angry young man with a history of mental illness who made women uncomfortable and once asked about The Anarchist Cookbook, who lived in a messy 12-by-12 room with a toolbox and a pile of disturbing handwritten notes. If you believed this was the type of person to carry out a bombing, then the toolmark evidence confirmed what you already suspected to be true.

But if you trusted the toolmarks, then you’d struggle to explain how O’Neil matched Genrich’s needle-nose pliers to a bomb he couldn’t have planted. On April 14, 1989, the day the unexploded bomb was discovered in the LaCourt motel parking lot, Genrich was 580 miles away, in Phoenix. Registers from Houle’s bookstore showed his handwritten entries recording stock being returned to publishers on the day the bomb was found. The store owner testified that Genrich had worked every day that week, including a six-and-a-half-hour shift on the day the bomb was found. Investigators scoured plane records and car-rental registries but could find no evidence he had left Arizona. The prosecution suggested the circular theory that Genrich could have had an accomplice who planted that bomb, because he couldn’t have set the bomb himself. There was, however, no evidence that pointed to an accomplice. It stretches the imagination to believe Genrich was the 1989 bomber and also to believe O’Neil’s toolmark analysis to be reliable.

The unsettling handwritten notes were among the key pieces of circumstantial evidence that convinced both the investigators and the jury. In his opening statement, ErkenBrack said, “Mr. Genrich was a very, very angry man, especially with women, but angry at everyone. You will hear Mr. Genrich said time and time again that he had a problem with women.” In the search-warrant affidavit, investigator Bob Russell wrote that the Feed Lot bomb was placed near a car’s front right tire, and “this would be the side most ordinarily used by the female.” This logic falls apart a bit, however, when you consider that the Gonzales bomb was placed by a rear tire and the LaCourt motel bomb by the driver’s side. Judge Massaro noted that the bombs did not seem to have been placed in areas obviously frequented by women.

And while the handwritten notes are undoubtedly disturbing, if looked at in a slightly different light, they don’t obviously point to a serial bomber. After Sheila had her son briefly committed, Genrich started seeing a therapist who encouraged him to write down his feelings instead of, as he told The Denver Post, “losing my temper.” According to Genrich, these were the notes the ATF agents found when they searched his home. “I don’t hate women, I really don’t…. I’d get drunk and pissed off, so I’d write it down instead of going out and getting into trouble,” he told the Post. The closest Genrich ever came to a confession was telling the ATF agents, “I’m not a bomber, but I should be a rapist.”

When Nieslanik first met Genrich, she says, he “looked homeless, unkempt, agitated.” He had trouble making eye contact and seemed paranoid, but she says he wasn’t a bomber. “Cra-zy,” she singsongs, “but innocent.”

Taken together, the circumstantial evidence against Genrich was ambiguous at best. He could have constructed the electrical circuit, but so could many other people in Grand Junction. He had Buss-type fuses in his toolbox, but no trace of gunpowder was ever found. The notes clearly threatened violence toward women, but the bombs seemed to target random strangers. He lived close to two of the bomb sites, and he was known to take long, meandering walks around town, but it would require steely nerve and a steady gait to carry a bomb with a hair-trigger switch two miles to the Gonzales home. Genrich asked for The Anarchist Cookbook after the bombings had already started. And if you read the book, you’d notice it does not include directions or a diagram that “describes precisely how you make the bombs that were used in this case,” as ErkenBrack claimed in court. Then there’s that ironclad alibi for the LaCourt motel bomb.

So you’re left with toolmarks. The only match that all examiners agreed on was between a pair of Genrich’s red-handled wire-strippers and scratches left on a piece of baling wire found near the Valentine’s Day bomb. Baling wire, a farm fix-all as common as duct tape in Grand Junction, was not the type of wire included in the “signature bomb,” nor is it particularly well suited for a bomb circuit. In most cases that go to trial, there are multiple converging lines of evidence, but not in the case of Jimmy Genrich. If you believe O’Neil can trace shadow lines across a toolmark’s microscopic ridges and valleys until it all “falls in place,” then Genrich is guilty. But if you believe O’Neil had only to tilt the light just right to bring into view what he was seeking to find, then an innocent man may be in prison.

7: PROSECUTORS AND NATIONAL REFORM

The 2009 NAS report that upended the nation’s confidence in long-trusted forensic techniques had an unlikely instigator: Jeff Sessions. In 2000, Sessions proposed a bill to increase funding for crime laboratories struggling with backlogs; the bill passed, but most of the money never materialized. Leaders in the forensics community lobbied until the Senate finally included $1.5 million in a 2005 appropriations bill to fund an NAS report “identifying the needs of the forensics community.” It was meant to provide the “basis for legislation” that a senator like Sessions could introduce again to secure more crime-lab funding.

“He’s a former prosecutor,” says Erin Murphy, a professor of law at NYU and author of Inside the Cell: The Dark Side of Forensic DNA. “He was really interested in helping law enforcement basically use more forensics. He thought he was going to commission a study where they’d come back and say, ‘Well, we need 6,000 more bite-mark examiners and 400 more toolmark examiners.’ Instead, of course, they got the study they got, which basically said half of these sciences are garbage, the other half may have some basis but need more empirical work, there’s rampant overclaiming, there’s rampant bias. It was really a full-throated list of all of the problems in forensics.”

Sessions rejected the study’s conclusions. “I don’t think we should suggest that those proven scientific principles that we’ve been using for decades are somehow uncertain,” he said in a Senate hearing after the report’s release.

Prosecutors have a vested interest in resisting reform, because it could weaken one of their most powerful tools, threaten cases currently under way, and call past convictions into question. For years, the DOJ was aware that its hair-comparison examiners made mistakes, but the department did little to address the problem until a whistle-blower came forward in the early 1990s. During an eight-year review of 2,900 cases, the DOJ found several instances of potentially exculpatory evidence but only haphazardly notified defendants, if at all. This review remained secret until 2012, when The Washington Post broke the story that the department had never followed up on its mistakes. In 2015, the DOJ finally conceded that hair-comparison examiners gave flawed testimony in 96 percent of cases, including 33 of 35 death-penalty cases reviewed. Nine of those defendants had already been executed. One defendant served 28 years in prison before being exonerated by DNA testing. At his trial, the prosecution said there was a “one in 10 million” chance the hairs belonged to someone other than the defendant. One of the hairs turned out to be a dog’s.

Trusting prosecutors to reform forensics presents a clear conflict of interest. Prosecutors at all levels work closely with forensic practitioners—some crime labs report directly to DA offices—and they see themselves as being on the same team. When that relationship is too cozy, unconscious cognitive bias can curdle casework (such as examiners unconsciously seeking evidence that confirms opinions held by law enforcement). While some communication between prosecutors and forensic practitioners may be necessary, practices designed to curb cognitive bias vary widely. Police departments frequently share crucial information about cases, so examiners know what their colleagues want them to find. Crime labs in at least a dozen states receive funding through court fees—but only when their analyses result in convictions. In the Genrich case, O’Neil was flown to Grand Junction to meet with prosecutors during the course of his analysis. ErkenBrack was even present in the lab for the retest. Prosecutors rely on examiners to solve crimes and win convictions.

Still, prosecutors have a legal and ethical obligation not to present evidence they know isn’t true in court. Unfortunately, like judges, most prosecutors are not equipped to evaluate the scientific merit of forensic techniques, and even if they are acting in good faith, the incentives are all wrong. Prosecutors feel immense pressure to present forensic evidence to juries—many prosecutors believe juries can be reluctant to convict without it—a phenomenon dubbed “the CSI effect.” Bill Fitzpatrick, district attorney of New York’s Onondaga County and a past president of the National District Attorneys Association, says “people forget we are seekers of the truth,” but he has been a vocal opponent of efforts to limit forensic-expert testimony. “What are we going to say? ‘That fingerprint maybe came from that hand’? ‘That bullet maybe came from that gun’? No, we’d get no convictions.” Acknowledging the limitations of forensic techniques is perceived as opening the door to reasonable doubt.

Scandals like the DOJ’s hair-comparison cover-up have shaken public confidence in forensic evidence and prosecutors’ willingness to own up to mistakes, yet many prosecutors maintain faith in a system shown repeatedly to fail. “There are many layers to protect the judicial process when we’re talking about scientific evidence,” says Mike Ramos, district attorney of San Bernardino County and another former president of the NDAA. “Every time I’ve used the scientific-based evidence, it has been tested. They do an analysis, and that’s brought before the jury—‘here’s how it was done, here’s how it was tested’—to make sure that it was based on a protocol that has been used, and has gone through the tests, through all of the layers.” In any case, this reasoning goes, the defense can challenge any claim in an evidentiary hearing, as well as cross-examine expert witnesses, and, in the end, there’s a jury to decide. “We still have a system in place that I believe works,” says Ramos. “The last thing we want to do is convict someone who’s innocent.”

When asked if there were any conflicts of interest in prosecutorial agencies such as the DOJ overseeing forensic-science reform, Ramos said, “I don’t see a bias.”

One of the NAS report’s first recommendations was that reform be independent of prosecutorial agencies. It suggested a new federal agency, the National Institute of Forensic Science, an independent body (a bit like the FDA) that would encourage empirical testing of forensic techniques before they enter the courts, establish mandatory lab accreditation and examiner certification, and standardize expert testimony. Judge Edwards, who co-chaired one of the NAS committees, said at the Harvard event last October, “One of our most important recommendations was not having forensic science reform in DOJ. Prosecution is not consistent with a culture of science. We unanimously viewed that DOJ had to be kept out of it.”

The independent body suggested by the NAS was never created, and the DOJ has pushed back against independent reform. When Obama suggested PCAST look into forensic science, co-chair Eric Lander, a biologist at MIT and Harvard and leader of the Human Genome Project who also serves on the board of the Innocence Project, took the idea to the DOJ: “They had a fit,” he said at Harvard last October. “‘You realize this could jeopardize existing cases and past convictions?’” Reform that acknowledges long-standing problems with forensic evidence would, indeed, be deeply disruptive. It could mean that some people who have committed crimes would go free, but it would also mean that innocent people would be freed, and that serious flaws in the justice system could be corrected.

When the report was issued, Attorney General Loretta Lynch disavowed its conclusions: “While we appreciate their contribution to the field of scientific inquiry,” she said, “the department will not be adopting the recommendations related to the admissibility of forensic science evidence.”

Two months after being sworn in as attorney general, Sessions allowed the National Commission on Forensic Science to expire and suspended the ongoing review of examiner-testimony standards. A new Justice Department Task Force on Crime Reduction and Public Safety, established by executive order to “support law enforcement” and “restore public safety,” now oversees forensic science at the national level. Sessions has appointed a new senior adviser on forensics to lead an internal Forensic Science Working Group in conducting a “needs assessment of forensic science laboratories that examines workload, backlog, personnel and equipment needs of public crime laboratories,” as well as to “strengthen the foundations of forensic science.” In public statements, Sessions has returned to his rhetoric of the early 2000s: The purpose of forensic-science “reform” is to reduce crime-lab backlogs, and should be run by law enforcement and forensic practitioners. Prosecutors applauded Sessions’s move to put the DOJ back in charge.

It’s still unclear how the DOJ will handle its newfound mandate, but Sessions’s pick to lead the internal Forensic Science Working Group offers a clue. Ted Hunt, a Missouri prosecutor, was a member of the now-defunct National Commission on Forensic Science, where he voted against many reform measures, including crime-lab oversight measures and scientific evaluations of forensic methods. Although the DOJ revived its review of expert-testimony standards last August, Hunt was also one of just two members of the NCFS to oppose dropping the misleading phrase “to a reasonable degree of scientific certainty” from examiners’ testimony.

Hunt has stayed out of the limelight since his appointment, but records we obtained from recent meetings are revealing. In a meeting of the federal judiciary’s Advisory Committee on Rules of Evidence, Hunt blasted PCAST for its “narrow” definition of science and called PCAST’s insistence on empirical research methods “wrong and ill-advised.” At an October meeting at the National Academy of Sciences, he stated that “the jury is still out on bite marks,” inspiring at least two other participants to state firmly that “the jury is not out” on bite marks. Further, Hunt declared that how best to establish a method’s scientific validity amounts to a “difference of opinion.” At that point in the meeting, an MIT sociologist, Susan Silbey, felt compelled to speak up. She began reading out loud the requirements from PCAST’s report, such as having sufficient sample sizes and blinding subjects from the test. “PCAST is requesting basic scientific research methods. I teach scientific methodology every Friday at 9 am. This is what I teach in my class.”

Both Hunt and the DOJ declined to comment on whether they support dropping the scientifically dubious phrase “to a reasonable degree of scientific certainty” from examiner testimony. When asked if he thought prosecutors might have reasons to resist reforms that soften the language of examiners’ claims in court, Hunt replied, “I can unequivocally say that I don’t know of any prosecutor who would consciously choose to offer unreliable evidence or rely on faulty statements of probative value—whether forensic or not. The prosecutor’s duty is to seek justice, not win convictions.” While internal discussions about reform are clearly ongoing, the prosecutorial community’s stance seems to be, basically: “trust us.”

Judge Edwards, who helped write the 2009 NAS report and watched Silbey deliver her rebuke, took stock at the Harvard meeting. “Part of our worst fears have been realized…[the] DOJ is now the self-anointed leader of the forensic science reform. Which is a disaster.”

8: THE FUTURE OF FORENSICS

When Genrich learned that his case was going to be taken up by the Innocence Project, he was thrilled. Unlike most of the cases the nonprofit takes on, there is no DNA evidence that could exonerate him (there rarely is in bombing cases). Instead, his new team of lawyers is arguing that the scientific consensus around toolmark evidence has changed. Leading scientists at the NAS and PCAST say toolmark matching has not yet proved to be a scientifically reliable method, and they agree that the kind of testimony O’Neil gave is scientifically indefensible. The Innocence Project argues that this constitutes “newly discovered evidence” and that Genrich deserves a new trial. Thirty-six prominent scientists, legal scholars, and forensics experts have signed an amicus brief in support.

“The legal concept of newly discovered evidence including a change in science,” says Chris Fabricant of the Innocence Project, who is litigating Genrich’s case, “is in my view a no-brainer. It was presented to a jury as infallible, and today we know it’s not. There is an obligation—an ethical, a legal, and a moral obligation—to go back and correct the record where exaggerated claims may have led to a miscarriage of justice.” When Bert Nieslanik called up the statistician on a hunch that the science was flawed, she was a voice in the wilderness. There was no scientific consensus to support her. Fabricant says that “in a way, she was ahead of her time.”

Genrich’s appeal is among the first of a new flurry of post-conviction challenges based on the lack of research to support the pattern-matching forensic disciplines. The Innocence Project has filed appeals challenging bite marks and shaken-baby syndrome and, just last December, filed an amicus brief in a Maryland firearm case. Yet only two states—California and Texas—have passed statutes stating that a change in scientific consensus alone can constitute “newly discovered evidence.” These special statutes were advanced because higher courts have proved skittish about granting new trials when the scientific evidence used to convict has since become controversial or even discredited. In one case, the California Supreme Court failed to grant the defendant a new trial even when the bite-mark examiner recanted his testimony over 10 years later. That maddening ruling spurred the new California law, and only after its passage did the court finally exonerate the wrongfully convicted man. In states without special statutes, such as Colorado, appeals face a much tougher uphill battle to create new precedent that could be used to keep scientifically flawed expert testimony out of courtrooms in the future.

The rulings on cases such as Genrich’s will arrive at a critical moment for American criminal justice, which is on the cusp of an explosion of next-generation forensic techniques. Like so many other aspects of modern life, criminal justice is becoming automated. Artificial intelligence predicts criminal “hotspots” so police can be there at the ready. Software sold at $60,000 a pop to police departments helps them analyze minute amounts of DNA in complex mixtures from multiple people. The new field of digital forensics promises new techniques to parse large amounts of data. These technologies, by and large, are sold by for-profit companies, which means their algorithms are proprietary trade secrets. Their inner workings are invisible to lawyers mounting a defense. Juries will hardly be equipped to decide for themselves what weight to give a piece of sophisticated, impenetrable science presented as forensic evidence. Says Murphy, the NYU law professor, “Some of the bad habits of the first generation—the secrecy, the government bias, the overclaiming, the stretching without offering empirical support, the failure of judges to curb that behavior—that is going to be on steroids.”

Recently, a few intrepid judges have acknowledged that the problems with forensic science can’t be ignored any longer. In 2016, Judge Catherine Easterly, on the DC Court of Appeals, wrote in a robbery case involving a firearm, “As matters currently stand, a certainty statement regarding toolmark pattern matching has the same probative value as the vision of a psychic: it reflects nothing more than the individual’s foundationless faith in what he believes to be true.” Murphy, who signed on to the amicus brief supporting Genrich’s appeal, says favorable rulings like Judge Easterly’s could help change the legal landscape. “The more a court of high regard makes a ruling that says this is not a legitimate science, or this practice that stands in courtrooms across America…won’t stand in our state, people notice,” she says. “Lawyers making those arguments can say, ‘Look it’s not just me with my tinfoil hat, the Supreme Court of Colorado agrees.’”

When Nieslanik, who is now retired, heard about the NAS and PCAST reports, her response was, “What took them so long?” Genrich has now been in prison for nearly a quarter century, and there is still a deeply troubling disconnect between what constitutes scientific proof and what is accepted as legal evidence. “We weren’t proving Jimmy’s innocence,” Nieslanik says. “We were really focused on ‘this isn’t a science.’ I can tell you from doing triple-digit jury trials, the jurors really want concrete evidence. Because it’s a very hard decision to make. And that’s why scientific evidence is so dangerous when it’s not a real science, because it is persuasive.”

Editor’s note: An earlier version of this article did not clearly differentiate between the methods of the three exoneration studies described. The Innocence Project, the National Registry of Exonerations, and an academic study by Gould et al. each looked at different sets of cases and used different methods to determine whether faulty forensics was a factor in exonerations. The article also inaccurately described John Murdock’s employment and his participation in an FBI study. He formerly worked for ATF, not the FBI; he was working on a study sent out by the FBI, not a proficiency test. The text has been corrected.
