Search Gets Lost

Why does Amazon now have customers do the search chores it used to do for them, and in innovative ways?


Remember, if you can, the dark world we lived in around fifteen years ago. Web companies flared like a fireworks display on the Nasdaq, only to vanish just as quickly. Search engines crawled the digital superhighway in squads, each with its own extra-special method, and you never knew whether AltaVista or Lycos, Excite or Yahoo! would provide the guidance you needed. Websites like Slate and Salon, Feed and Suck offered innovative commentary on politics and culture. It was a new world, and if you weren’t logged on, you might as well have been dead. So, at least, magazines like Wired kept insisting. Looking backward as we cradle our Androids and balance our iPads on our laps, the first thing we see is the quaintness of a world of computer users squinting at their PC screens. It’s practically steampunk. Yet it’s also recognizably the world that led to the one we inhabit now—the one in which some of Wired’s predictions of the day, less about the death of Luddites than of the dead-tree media, seem to be coming true.

A harder look back reveals something else. The media world of the 1990s—even its most novel and exciting sectors—was a mosaic, not a fresco. Magazines still provoked excited gossip: Remember the “hot books,” or the Stephen Glass affair? Books themselves also seemed to have gathered new momentum. Barnes & Noble had been spreading across the country since the 1980s, erecting temples to dark wood, fragrant coffee and the Muses of publishing on smart city blocks as well as in what seemed to be every shopping mall in America. Borders, once acquired by Kmart in 1992, hustled in its wake, offering lighter-colored shelves and more magazines. Bestselling authors like Donna Tartt became media personalities, prettily packaged and profiled and sent across the country to sell their work. Even university presses felt the breeze. As the serious book market expanded, or seemed to, rumors flew: previously hardheaded editors were ordering print runs in four figures for monographs.

Nothing did more to make all of this happen, nothing transformed our ways of searching for information more rapidly, than the appearance of Amazon, which was founded in 1994 and began to sell books in 1995. In the same years that Netscape first led us onto the World Wide Web and our virtual lives became one long march from one link to another, we could suddenly buy books—any book, it seemed, however small its print run—wherever we were and whatever time of the day or night. Amazon did more than just make books available: it presented them appealingly, images of their dust jackets glowing on the screen, flanked by detailed and intelligent accounts of their contents, customer reviews and indispensable sales rank. Like the superstores, Amazon seemed to show that we had entered a new age of the book. And not only bestsellers profited—there was also the new model of the “long tail.” In the past, if a subject suddenly gained interest, a book about it that had been published a few years before would no longer have been on the shelves, and special orders took weeks or even months to fill. Thanks to Amazon’s huge warehouses and ubiquitous website, old books had the chance for a new lease on commercial life. Bliss it was in that dawn to be a live author—and even, in my case, to have one book, a heavily annotated monograph about astrology in the Renaissance, featured for a day on Amazon’s front page. Friends e-mailed to tell me they had seen it; then to ask if bodily fluids had been swapped to gain this position; and then, in much bigger numbers, to tell me how much they enjoyed watching my book fall, precipitously, from its brief perch at number one to a more reasonable position down in the many, many thousands.

* * *

The glory soon faded, because the high print orders shrank when the returns came in. Only a few of the remaining superstores—alas for the Borders that once graced Philadelphia’s Rittenhouse Square, a monument of civilization—still offer recondite books, as opposed to their 100,000 usual suspects. But Amazon survived, and evolved, and so did our ways of working with it. From the start, it transformed one ancient and tedious task: finding and ordering books for college and university courses. For decades, university teachers had compiled their reading lists from the massive, closely printed volumes of Books in Print. These tools of the teacher’s trade were as infuriating as they were indispensable. Publishers exist—or so every university teacher secretly thinks—mostly to take books out of print immediately after you have cast them to play a central role in your next term’s courses. Printed catalogs necessarily came out with far too long a lead time to keep abreast of these decisions, publisher by publisher. You could check your syllabus as often as you liked against the most recent Books in Print and still find yourself hung out to dry when, two weeks before the term started, the university bookstore sent notice that your most important texts were out of print or out of stock. After the 1979 Thor Power decision, which prevented publishers from writing down the value of their inventories for tax purposes, cancellation notices carpeted the floors of faculty mailrooms every August and January like autumn leaves in Vallombrosa. Amazon, by contrast, provided information about a book’s availability as current as the publishers themselves could keep it. A new distribution system couldn’t solve the underlying problems: publishers still took good books out of print when they stopped selling and ratcheted up the price of serious paperbacks until students couldn’t afford them. 
Still, by the late 1990s, even if you never bought a single book from Amazon, you found yourself relying on the information that this public-spirited firm made freely available.

From the first, then, Amazon served more than one purpose. It was a bookseller—or at least a book aggregator—and as its enormous warehouses spread across the land, it looked more and more like the world’s biggest department store. More important, though, Amazon consistently made clear that it saw itself as a tech company, and one with a special commitment to improving the art of the search. Again and again it rolled out exciting technologies, which made it possible to navigate the oceans of material on its Web page more rapidly and efficiently. As early as 2001 Amazon enabled users to “look inside” certain books. In fall 2003 the company took a much more dramatic step and made it possible to “search inside” the full text of 120,000 titles from more than 190 publishers. Remember, at the time, e-books were still rare, and most of them were hand-keyboarded reproductions of texts in the public domain. Though publishers used digital technology for editing and printing, they normally did not keep complete and searchable files of their final products, much less make them available to readers. Now Amazon users could search not just the full texts of individual titles but the whole mass of them, a collection comparable to that of a small liberal arts college or a superstore. Even professionals found that the experience gave them vertigo.

In October 2003 Gary Wolf described Search Inside! in Wired. He typed the name of Boss Tweed of Tammany Hall into the box:

Out pop a few books with Boss Tweed in the title. But the more intriguing results come from deep within books I never would have thought to check: A Confederacy of Dunces, by John Kennedy Toole; American Psycho, by Bret Easton Ellis; Forever: A Novel, by Pete Hamill. I immediately recognize the power of the archive to make connections hitherto unseen…. From the Hamill reference, I link to a page in the afterword on which he cites books that influenced his portrait of Tweed. There, on the screen, is the cream of the research performed by a great metropolitan writer and editor. Some of the books Hamill recommends are out of print, but all are available either new or used on Amazon.
   With persistence, serendipity and plenty of time in a library, I may have found these titles myself. The Amazon archive is dizzying not because it unearths books that would necessarily have languished in obscurity, but because it renders their contents instantly visible in response to a search. It allows quick query revisions, backtracking, and exploration. It provides a new form of map.

A year before Google introduced what it then called Google Print, the system we all use every day and call Google Books, Amazon was already teaching us how we would find texts and information in the new millennium. Devices and technologies that have become second nature to us—scanners and searchable PDFs, for example—first became familiar to many through Amazon. So did disintermediation: the sudden realization that we could work our way into a subject without taking a box of file cards to a reference room, riffling through catalogs and consulting librarians.

* * *

True, Amazon took care to limit how far you could go. All you could see, as you searched inside, were a few scanned pages of any individual book; you could not link directly to a single page or download any of them. At first, Amazon even intended to limit the number of searches a user could carry out, as well as the percentage of content from any given book that he or she could access. Unlike Google, in other words, Amazon always intended to build an instrument of commerce rather than a virtual catalog or library. (Google, of course, eventually revealed similar commercial ambitions of its own.) Jeff Bezos wanted to woo customers to his site, not to give them so much free material that they would stop buying books from him. Like George Bernard Shaw, he was interested in money, not art.

Like Shaw too, though, Bezos offered his customers a terrific product. Earning money, in Seattle in the years around 2000, was never enough. Innovation followed innovation. In the spring of 2005, for example, Amazon introduced a list of SIPs, or statistically improbable phrases, the most unlikely combinations of words to be found in a given book equipped with Search Inside!. Appearing toward the end of the listing, these quirky but fascinating phrase lists offered readers a new royal road to the heart of a book. As the sociologist and blogger Kieran Healy found, SIPs “effectively convey the essence of an author’s ideas, provided that the author is a phrase-maker.” Look up Marx’s Capital, Volume I, and you’ll find a list of SIPs that include “average social labour, appetite for surplus labour, direct exchangeability, social labour process, abstract human labour, labour fund, labour objectified, specifically capitalist mode, surplus value, necessary labour.” Look up Tocqueville’s Democracy in America, by contrast, and you’ll encounter “dogmatical belief, amongst democratic nations, all democratic nations, aristocratic ages, aristocratic people, democratic times, democratic armies, aristocratic communities, democratic ages, aristocratic countries, democratic people.” Not bad.

Even for fiction, the method proved surprisingly effective: the two phrases “rich cunt, fifteen francs” might well call Henry Miller’s Tropic of Cancer to mind even if you didn’t know they were Amazon’s SIPs for the book. And SIPs were not the only feature of their kind; there were also Capitalized Phrases (“people, places, events, or important topics mentioned frequently in a book”). Amazon’s concordance program listed the 100 most common words in some titles, using font size to indicate their frequency, and made it possible to grasp features of an author’s style with new ease and precision. As Deborah Friedell noted in 2005, the reader could establish immediately that “time” was the word that occurs most frequently in the Collected Poems of T.S. Eliot, while “contain” and “multitudes” did not make the top 100 of Whitman’s favorites. Across the Web and beyond, aficionados began putting together strings of SIPs to see what resulted, and using them in ways that looked forward not only to ordinary Google search as we now know it, but to some of the more sophisticated ways of mining texts that have been developed and made accessible, in recent years, by innovative specialists like Erez Lieberman Aiden and Jean-Baptiste Michel. The website of their Harvard Cultural Observatory describes the new ways they have devised “to enable the quantitative study of human culture across societies and across centuries”—and reveals some of the distant, unintended consequences of Amazon’s clever efforts to gain users and increase sales.

Amazon cared so much about these kinds of innovation that in 2003 it created a subsidiary in Palo Alto, A9, to develop them. Udi Manber, an Israeli computer scientist and expert on search who had joined Amazon in 2002, became the first chief executive of A9. The company not only equipped Amazon with Search Inside! and other features, but also devised its own innovative search portal. A9 continues to provide services for Amazon and many other clients. It all looks like one great confirmation of the boldest prophecies of the Cultural Studies crowd back in the 1990s: “market-powered synergies transforming everyday lives” and all that.

* * *

That would be a pretty story. The real one, however, is less so—and in some ways has more in common with the grim story of Amazon’s warehouse-labor policies than with the idealism of the early years of the Web. Slowly, Amazon stopped offering concordances and SIPs, among other features, even for titles that offered Search Inside!. Michael Pollan’s The Omnivore’s Dilemma, published in 2006, has phrase lists and a concordance on Amazon; In Defense of Food, published two years later, has phrase lists; but Food Rules, from 2009, has none of these features. Perhaps the Kindle, which appeared in 2007, became Amazon’s central focus for development. Kindle now comes equipped with everything from features for looking up words in a particular text to multiple forms of search and annotation. By contrast, new books—even those that sell very well indeed, such as Toni Morrison’s Home or Stephen Greenblatt’s The Swerve—are not provided with SIPs. You can still search inside books like these, and by entering distinctive terms in the main search box, you can still find other books that touch on similar themes. But Amazon no longer encourages or supports this kind of exploration as it once did. You have to know, before you begin exploring, the unusual phrase you’re looking for. Also, if you go back to older books that still have their lists of SIPs, you’ll find that they no longer function as they once did. When you click on a particular phrase, you don’t get a list of the other books in which it appears—though Amazon’s help feature still claims you do. Instead, you’re taken to a page that invites you to offer “ideas for improving Key Phrases.” (How about “Put them back”?) The new search capabilities that inspired and aided us a decade ago may not quite have become roadkill by the digital superhighway, but they’re shadows of what they once were.

New features abound, of course, but they’re the sort that university teachers and other white-collar workers know all too well: ways of doing more with less, by making workers (or customers) handle the routine chores that used to be done for them. Nowadays you can tag a given “product” for Amazon so that it knows what you think of a book; if you want, you can even study a tag cloud that lists and ranks the most popular customer tags, so that you’ll do a better job of tagging for the company. You can enter a customer discussion or post a review. And, of course, whenever you buy a book, you help Amazon not only gauge the book’s popularity, but also identify the other books that you have bought as well. It’s an efficient, thoroughly commercial counterpart to the old information system. The simple, elegant Web page that once showered discriminating customers with information now invites the consumer to provide information of every sort for Amazon to digest and profit from.

It’s not a tragedy: these days, Google and other search engines do everything that Amazon used to, and more (it’s probably no accident that Udi Manber currently works for Google). But it’s a sad and revealing tale nonetheless. Back in the day, the World Wide Web was mostly open territory, and innovation focused on bringing more people into it and giving them more things to do once there. Now, the Web has been fenced, plowed and monetized. More information and more images can be found inside private silos—like the phone and iPad apps we all love—than in the virtual territory open to all. If Jeff Bezos has his way, not only the remaining superstores and independents but also the remaining publishers will disappear. We’ll be disintermediated with a vengeance, as Amazon dominates our collective imagination. Once, Amazon opened worlds to us; now it just sells them.


Also in This Issue
“The Amazon Effect,” by Steve Wasserman
“How Germany Keeps Amazon at Bay and Literary Culture Alive,” by Michael Naumann
