Testing Times in Higher Ed

The SAT has been on the ropes lately. The University of California system has threatened to quit using the test for its freshman admissions, arguing that the exam has done more harm than good. The State of Texas, responding to a federal court order prohibiting its affirmative action efforts, has already significantly curtailed the importance of the SAT as a gatekeeper to its campuses. Even usually stodgy corporate types have started to beat up on the SAT. Last year, for example, a prominent group of corporate leaders joined the National Urban League in calling upon college and university presidents to quit placing so much stock in standardized admissions tests like the SAT, which they said were “inadequate and unreliable” gatekeepers to college.

Then again, if the SAT is anything, it’s a survivor. The SAT enterprise–consisting of its owner and sponsor, the College Board, and the test’s maker and distributor, the Educational Testing Service–has gamely reinvented itself over the years in myriad superficial ways, hedging against the occasional dust-up of bad public relations. The SAT, for example, has undergone name changes over the years in an effort to reflect the democratization of higher education in America and consequent changes in our collective notions about equal opportunity. But through it all, the SAT’s underlying social function–as a sorting device for entry into or, more likely, maintenance of American elitehood–has remained ingeniously intact, a firmly rooted icon of American notions about meritocracy.

Indeed, the one intangible characteristic of the SAT and other admissions tests that the College Board would never want to change is the virtual equation, in the public’s mind, of test scores and academic talent. Like the tobacco companies, ETS and the College Board (both are legally nonprofit organizations that in many respects resemble profit-making enterprises) put a cautionary label on the product. Regarding their SAT, the organizations are obliged by professional codes of proper test practices to inform users of standardized admissions tests that the exams can be “useful” predictors of later success in college, medical school or graduate school, when used in conjunction with other factors, such as grades.

But the true place of admissions testing in America isn’t always so appropriate. Most clear-eyed Americans know that results on the SAT, Graduate Record Exam or the Medical College Admission Test are widely viewed as synonymous with academic talent in higher education. Whether it’s true or not–and there’s lots of evidence that it’s not–is quite beside the point.

Given the inordinate weight that test scores carry in the American version of meritocracy, it’s no surprise that federal courts have been hearing lawsuits from white, middle-class law school applicants complaining that they were denied admission even though their LSAT scores were fifty points higher than those of minority applicants who were admitted; that neoconservative doomsayers warn the academic quality of America’s great universities will plummet if the hordes of unwashed (read: low test scores) are allowed entry; and that articles are written under titles like “Backdoor Affirmative Action,” arguing that de-emphasizing test scores in Texas and California is merely a covert tactic of public universities to beef up minority enrollments in response to court bans on affirmative action.

Indeed, Rebecca Zwick, a professor of education at the University of California, Santa Barbara, and a former researcher at the Educational Testing Service, wrote that “Backdoor Affirmative Action” article for Education Week in 1999, implying that do-gooders who place less emphasis on test scores in order to raise minority enrollments are simply blaming the messenger. And so it should not be surprising that the same author would provide an energetic defense of the SAT and similar exams in her new book, Fair Game? The Use of Standardized Admissions Tests in Higher Education.

Those, like Zwick, who are wedded to the belief that test scores are synonymous with academic merit will like this concise book. They will praise its 189 pages of text as, finally, a fair and balanced demystification of the esoteric world of standardized testing. Zwick and her publisher are positioning the book as the steady, guiding hand occupying the sensible middle ground in an emotional debate that they claim is dominated by journalists and other uninformed critics who don’t understand the complex subject of standardized testing. “All too often…discussions of testing rely more on politics or emotion than on fact,” Zwick says in her preface. “This book was written with the aim of equipping contestants in the inevitable public debates with some solid information about testing.”

If only it were true. Far from reflecting the balanced approach the author claims, the book is thinly disguised advocacy for the status quo and a defense of the hegemony of gatekeeping exams for college and university admissions. It could be more accurately titled (without the bothersome question mark) “Fair Game: Why America Needs the SAT.”

As it stands, the research staff of the College Board and the Educational Testing Service, Zwick’s former employer, might as well have written this book, as she trots out all the standard arguments those organizations have used for years to show why healthy doses of standardized testing are really good for American education. At almost every opportunity, Zwick quotes an ETS or College Board study in the most favorable light, couching it as the final word on a particular issue, while casting aspersions on other studies and researchers (whose livelihoods don’t depend on selling tests) that might well draw different conclusions. Too often Zwick gives readers who may be unfamiliar with the research on testing an overly simplistic and superficial treatment. At worst, she leaves readers with grossly misleading impressions.

After providing a quick and dirty account of IQ testing at the turn of the last century, a history that included the rabidly eugenic beliefs of many of the early testmakers and advocates in Britain and the United States (“as test critics like to point out,” Zwick sneers), the author introduces readers to one of the central ideologies behind using mental tests to sort a society’s young for higher education. Sure, mental testing has produced some embarrassing moments that we moderns now frown upon, but the testing movement has had its good guys too. In this view, standardized testing is not a tool to promote and protect the interests of a society’s most privileged citizens; rather, its cold objectivity remains an instrument for the exercise of democratic values.

According to this belief, standardized testing for admission to college serves the interest of meritocracy, in which people are allowed to shine by their wits, not their social connections. That same ideology, says Zwick, drove former Harvard president James Bryant Conant, whom Zwick describes as a “staunch supporter of equal opportunity,” in his quest to establish a single entrance exam, the SAT, for all colleges. Conant, of course, would become the first chairman of the board of the newly formed Educational Testing Service. But, as Nicholas Lemann writes in his 1999 book The Big Test: The Secret History of the American Meritocracy, Conant wasn’t nearly so interested in widening opportunity to higher education as Zwick might think. Conant was keen on expanding opportunity, but, as Lemann says, only for “members of a tiny cohort of intellectually gifted men.” Disillusioned only with the form of elitism that had taken shape at Harvard and other Ivy League colleges, which allotted opportunities based on wealth and parentage, Conant was nevertheless a staunch elitist, an admirer of the Jeffersonian ideal of a “natural aristocracy.” In Conant’s perfect world, access to this new kind of elitehood would be apportioned not by birthright but by performance on aptitude tests. Hence the SAT, Lemann writes, “would finally make possible the creation of a natural aristocracy.”

The longstanding belief that high-stakes mental tests are the great equalizer of society is dubious at best, and at worst a clever piece of propaganda that has well served the interests of American elites. In fact, Alfred Binet himself–among the fathers of IQ testing, whose intelligence scale would later be adapted into the Stanford-Binet test, the precursor to the modern SAT–observed the powerful relationship between a child’s performance on his so-called intelligence test and that child’s social class, a phenomenon described in The Development of Intelligence in Children, published in English in 1916.

And it’s the same old story with the SAT. Look at the college-bound high school seniors of 2001 who took the SAT, and the odds remain firmly stacked against young people from modest economic backgrounds. A test-taker whose parents did not complete high school can expect to score fully 171 points below the SAT average, College Board figures show. On the other hand, high schoolers whose moms and dads have graduate degrees can expect to outperform the SAT average by 106 points.

What’s more, the gaps in SAT performance between whites and blacks and between whites and Mexican-Americans have only ballooned in the past ten years. The gap between white and black test-takers widened five points and eleven points on the SAT verbal and math sections, respectively, between 1991 and 2001. SAT score gaps between whites and Mexican-Americans surged a total of thirty-three points during that same period.

For critics of the national testing culture, such facts are troubling indeed, suggestive of a large web of inequity permeating society, in which educational opportunities are distributed neatly along class and race lines, from preschool through medical school. But for Zwick, the notion of fairness as applied to standardized admissions tests boils down to a relatively obscure but standard procedure in her field of “psychometrics,” which is in part the study of the statistical properties of standardized tests.

Mere differences in average test scores between most minority groups and whites, or among social classes, aren’t all that interesting to Zwick. More interesting, she maintains, is the comparative accuracy of test scores in predicting university grades for whites and for other racial groups. In this light, she says, the SAT and most standardized admissions tests are not biased against blacks, Latinos or Native Americans. In fact, she says, drawing on 1985 data from a College Board study that looked at forty-five colleges, those minority groups earned lower grades in college than their SAT scores predicted–a classic case of “overprediction” that substantiates the College Board claim that the SAT is more than fair to American minorities. By contrast, if the SAT is unfair to any group, it’s unfair to whites and Asian-Americans, because they get slightly better college grades than the SAT would predict, Zwick suggests.
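
To make the psychometric term concrete: “overprediction” means that a single regression line fitted to the whole student pool forecasts higher college grades for a group than its members actually earn, while “underprediction” is the reverse. Below is a minimal sketch of how such a differential-prediction check is typically set up; the data and the group effect are entirely synthetic and invented for illustration, so nothing here reproduces the 1985 College Board study.

```python
# Sketch of a differential-prediction (over/underprediction) check.
# Data are synthetic and for illustration only; the 1985 College Board
# study cited above used real records from forty-five colleges.
import numpy as np

rng = np.random.default_rng(0)
n = 2000

sat = rng.normal(1000, 150, n)            # synthetic SAT scores
group = rng.integers(0, 2, n)             # arbitrary two-group split
# Freshman GPA depends on SAT plus a small invented group effect and noise.
gpa = 1.0 + 0.002 * sat + 0.15 * group + rng.normal(0, 0.4, n)

# Fit one least-squares line on the pooled sample, as validity studies do.
slope, intercept = np.polyfit(sat, gpa, 1)
predicted = intercept + slope * sat
residual = gpa - predicted                # actual minus predicted GPA

for g in (0, 1):
    mean_resid = residual[group == g].mean()
    label = "underpredicted" if mean_resid > 0 else "overpredicted"
    print(f"group {g}: mean residual {mean_resid:+.3f} ({label})")
```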

Then there’s the odd circumstance when it comes to standardized admissions tests and women. A number of large studies of women and testing at the University of California, Berkeley, the University of Michigan and other institutions have consistently shown that while women (on average) don’t perform as well on standardized tests as male test-takers do, women do better than men in actual classroom work. Indeed, Zwick acknowledges that standardized tests, in contrast to the pattern for most minority groups, tend to “underpredict” the actual academic performance of women.

But on this question, as with so many others in her book, Zwick’s presentation is thin, more textbookish than the thorough examination and analysis her more demanding readers would expect. Zwick glosses over a whole literature on how the choice of test format, such as multiple-choice versus essay examinations, rewards some types of cognitive approaches and punishes others. For example, there’s evidence to suggest that SAT-type tests dominated by multiple-choice formats reward speed, risk-taking and other surface-level “gaming” strategies that may be more characteristic of males than of females. Women and girls may tend to approach problems somewhat more carefully, slowly and thoroughly–cognitive traits that serve them well in the real world of classrooms and work–but hinder their standardized test performance compared with that of males.

Beyond Zwick’s question of whether the SAT and other admissions tests are biased against women or people of color is the perhaps more basic question of whether these tests are worthwhile predictors of academic performance for all students. Indeed, the ETS and the College Board sell the SAT on the rather narrow promise that it helps colleges predict freshman grades, period. On this issue, Zwick’s presentation is not a little pedantic, seeming to paint anyone who doesn’t claim to be a psychometrician as a statistical babe in the woods. Zwick quotes the results of a College Board study published in 1994 finding that one’s SAT score by itself accounts for about 13 percent of the differences in freshman grades; that one’s high school grade average is a slightly better predictor of college grades, accounting for about 15 percent of the grade differences among freshmen; and that the SAT combined with high school grades is a better predictor than the use of grades alone. In other words, it’s the standard College Board line that the SAT is “useful” when used with other factors in predicting freshman grades. (It should be noted that Zwick, consistent with virtually all College Board and ETS presentations, reports her correlation statistics without converting them into what’s known as “R-squared” figures. In my view, the latter statistics provide readers with a common-sense understanding of the relative powers of high school grades and test scores in predicting college grades. I have made those conversions for readers in the statistics quoted above.)
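
For readers who want the arithmetic behind that conversion: squaring a correlation coefficient r gives the proportion of variance in the outcome that the predictor accounts for. A minimal sketch follows, using correlation values back-computed from the roughly 13 and 15 percent figures quoted above rather than taken from the 1994 study itself.

```python
# Converting correlation coefficients (r) to variance explained (R-squared).
# The r values below are illustrative, back-computed from the ~13% and ~15%
# figures quoted in the text; they are not drawn from the College Board study.

def variance_explained(r: float) -> float:
    """Square a correlation coefficient to get the share of variance explained."""
    return r ** 2

sat_alone = 0.36       # hypothetical validity coefficient for the SAT alone
hs_gpa_alone = 0.39    # hypothetical validity coefficient for high school grades alone

print(f"SAT alone:       about {variance_explained(sat_alone):.0%} of grade differences")
print(f"HS grades alone: about {variance_explained(hs_gpa_alone):.0%} of grade differences")
```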

Unfortunately, Zwick misrepresents the real point that test critics make on the question of predictive validity of tests like the SAT. The salient issue is whether the small extra gains in predicting freshman grades that the SAT might afford individual colleges outweigh the social and economic costs of the entire admissions testing enterprise, costs borne by individual test-takers and society at large.

Even on the narrow question of the usefulness of the SAT to individual colleges, Zwick does not adequately answer what’s perhaps the single most devastating critique of the SAT. In the 1988 book The Case Against the SAT, James Crouse and Dale Trusheim argued compellingly that the SAT is, for all practical purposes, useless to colleges. They showed, for example, that if a college wanted to maximize the number of freshmen who would earn a grade-point average of at least 2.5, then the admissions office’s use of high school rank alone as the primary screening tool would result in 62.2 percent “correct” admissions. Adding the SAT score would improve the rate of correct decisions by only about 2 in 100. The researchers also showed, remarkably, that if the admissions objective is broader, such as optimizing the rate of bachelor’s degree completion among those earning grade averages of at least 2.5, the use of high school rank by itself would yield a slightly better rate of prediction than if SAT scores were added to the mix, rendering the SAT counterproductive. “From a practical viewpoint, most colleges could ignore their applicants’ SAT score reports when they make decisions without appreciably altering the academic performance and the graduation rates of students they admit,” Crouse and Trusheim concluded.
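
A rough sketch of the bookkeeping behind that comparison may help: admit the top slice of the applicant pool by one predictor or another, then count how many admitted students reach the 2.5 GPA threshold. The data and the composite weighting below are invented for illustration; the 62.2 percent and two-in-100 figures above come from Crouse and Trusheim’s own analysis, not from this code.

```python
# Sketch of a "correct admissions" comparison in the spirit of Crouse and
# Trusheim: admit the top applicants by a predictor, then count how many
# reach a 2.5 freshman GPA. Synthetic data; illustration only.
import numpy as np

rng = np.random.default_rng(1)
n, admit_n = 5000, 1000

hs_rank = rng.uniform(0, 100, n)                  # percentile rank in high school class
sat = 800 + 4 * hs_rank + rng.normal(0, 120, n)   # SAT loosely tied to rank
gpa = 1.5 + 0.012 * hs_rank + 0.0004 * sat + rng.normal(0, 0.45, n)

def correct_rate(predictor: np.ndarray) -> float:
    """Admit the top `admit_n` applicants by `predictor`; return share earning GPA >= 2.5."""
    admitted = np.argsort(predictor)[-admit_n:]
    return float((gpa[admitted] >= 2.5).mean())

print(f"high school rank alone:  {correct_rate(hs_rank):.1%} correct admissions")
print(f"rank plus SAT composite: {correct_rate(hs_rank + sat / 40):.1%}")  # crude composite
```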

At least two relatively well-known cases of colleges at opposite ends of the public-private spectrum, which have done exactly as Crouse and Trusheim suggest, powerfully illustrate the point. Consider the University of Texas system, which was compelled by a 1996 federal appeals court order, the Hopwood decision, to dismantle its affirmative-action admissions programs. The Texas legislature responded to the threat of diminished diversity at its campuses with the “top 10 percent plan,” requiring public universities to admit any student graduating in the top 10 percent of her high school class, regardless of SAT scores.

Zwick, of course, is obliged in a book of this type to mention the Texas experience. But she does so disparagingly and without providing her readers with the most salient details on the policy’s effects on racial diversity and on the academic performance of students. Consider the diversity question. While some progressives at first recoiled at the new policy as itself an attack on affirmative action, it has not played out that way. In fact, at the University of Texas at Austin, the racial diversity of freshman classes has been restored to pre-Hopwood levels, after taking an initial hit. Indeed, the percentage of white students at Austin reached a historic low of 61 percent in 2001. What’s more, the number of high schools sending students to the state’s flagship campus at Austin has broadened significantly. The “new senders” to the university include more inner-city schools in Dallas, Houston and San Antonio, as well as more rural schools than in the past, according to research by UT history professor David Montejano, one of the plan’s designers.

But the policy’s impact on academic performance at the university might be even more compelling, since that is the point upon which neoconservative critics have been most vociferous in their condemnations of such “backdoor” affirmative action plans that put less weight on test scores. A December 1999 editorial in The New Republic typified this road-to-ruin fiction: Alleging that the Texas plan and others like it come “at the cost of dramatically lowering the academic qualifications of entering freshmen,” the TNR editorial warned, these policies are “a recipe for the destruction of America’s great public universities.”

Zwick, too, neglects to mention the facts about the academic performance of the “top 10 percenters” at the University of Texas, who have proven the dire warnings groundless. In 2000, at every SAT score interval, from less than 900 to 1,500 and above, students admitted without regard to their SAT scores earned better grades than their non-top 10 percent counterparts, according to the university’s latest research report on the policy.

Or consider that the top 10 percenters averaged a GPA of 3.12 as freshmen. Their SAT average was about 1,145, fully 200 points lower than that of non-top 10 percent students, who earned slightly lower GPAs of 3.07. In fact, the 3.12 grade average of the automatically admitted students with moderate SAT scores equaled the grade average of non-top 10 percenters entering with SATs of 1,500 and higher. The same pattern has held across the board, and for all ethnic groups.

Bates College in Lewiston, Maine, is one college that seemed to anticipate the message of the Crouse and Trusheim research. Bates ran its own numbers, found that the SAT was simply not an adequate predictor of academic success for many students, and abandoned the test as an entry requirement several years ago. Other highly selective institutions have similar stories to tell, but Bates serves to illustrate. In dropping the SAT mandate, the college now gives students the choice of submitting SAT scores or not. But it permits no choice in requiring that students submit a detailed portfolio of their actual work and accomplishments in high school, an admissions review conducted not just by admissions staff but by the entire Bates faculty.

As with the Texas automatic admission plan, Zwick would have been negligent not to mention the case of Bates, and she does so in her second chapter; but it’s an incomplete and skewed account. Zwick quotes William Hiss, the former dean of admissions at Bates, in a 1993 interview in which he suggests that the Bates experience, while perhaps appropriate for a smaller liberal arts college, probably couldn’t be duplicated at large public universities. That quote well serves Zwick’s thesis that the SAT is a bureaucratically convenient way to maintain academic quality at public institutions like UT-Austin and the University of California. “With the capability to conduct an intensive review of applications and the freedom to consider students’ ethnic and racial backgrounds, these liberal arts colleges are more likely than large university systems to succeed in fostering diversity while toeing the line on academic quality,” Zwick writes.

But Zwick neglects to mention that Hiss has since disavowed his caveats about Bates’s lessons for larger public universities. In fact, Hiss, now a senior administrator at the college, becomes palpably irritated at inequalities built into admissions systems that put too much stock in mental testing. He told me in a late 1998 interview, “There are twenty different ways you can dramatically open up the system, and if you really want to, you’ll figure out a way. And don’t complain to me about the cost, that we can’t afford it.”

Zwick punctuates her brief discussion of Bates and other institutions that have dropped the SAT requirement by quoting from an October 30, 2000, article, also in The New Republic, that purportedly revealed the “dirty little secret” of why Bates and other colleges have abandoned the SAT. The piece cleverly observed that because SAT submitters tend to have higher test scores than nonsubmitters, dropping the SAT has the added statistical quirk of boosting SAT averages in U.S. News & World Report’s coveted college rankings. That statistical anomaly was the smoking gun the TNR reporter needed to “prove” the conspiracy.

But to anyone who has seriously researched the rationales colleges have used in dropping the SAT, the TNR piece was a silly bit of reporting. At Bates, as at the University of Texas, the SAT “nonsubmitters” have performed as well as or better than students who submitted SATs, often with scores hundreds of points lower than those of the submitters. But readers of Fair Game? wouldn’t know this.

One could go on citing many more cases in which Zwick misleads her readers through lopsided reporting and superficial analysis, such as her statements that the Graduate Record Exam is about as good a predictor of graduate school success as the SAT is for college freshmen (it’s not, far from it), or her overly optimistic spin on the results of many studies showing poor correlations between standardized test scores and later career successes.

Finally, Zwick’s presentation might have benefited from a less textbookish style, with more enriching details and concrete examples. Instead, she tries to position herself as a “just the facts” professor who won’t burden readers with extraneous contextual details or accounts of the human side of the testing culture. But like the enormously successful–at least in commercial terms–standardized tests themselves, which promote the entrenched belief in American society that genuine learning and expert knowledge are tantamount to success on Who Wants to Be a Millionaire-type multiple-choice questions, books like Fair Game? might be the standardized account that some readers really want.
