Ever since we learned about PRISM, the NSA’s secret project to collect metadata on Americans by tapping into commercial online services, we’ve been confounded by a tangle of intangible clashing values. We are asked to balance “preventing terrorism” against “protecting privacy.” It is hard to demonstrate what terrorism would have occurred without preventive measures, and privacy is as much a feeling as a circumstance.
A hypothetical versus an emotion: the invisibles clash at the coliseum. There is a danger that this crucial controversy is being framed in so blurry a manner that it will blend into the wind and blow away. Maybe reflecting on the terms will bring the situation into focus.
Metadata systems are said to gather only tags and skeletal information, not “content.” The distinction is instrumental rather than substantial, and the line between the two can shift over time. When someone retweets to a group of people, that generates only metadata and no new content. In that case, which is a common one, the distinction becomes meaningless.
Metadata is the aspect of data that programs can most reliably “understand.” It’s the topical stuff that is regimented into a standard structure, like the blanks filled in on a form. In order to treat real-world events as metadata, certain actions of people, rather than their expressions, are used to fill in those blanks. For instance, programs cannot understand the meaning of ordinary conversation, but a program can log when a call is made, and to whom.
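To make the form analogy concrete, here is a minimal sketch of a call log. It assumes nothing about how PRISM or any real system works: the names `CallRecord` and `log_call`, the field names, and the phone numbers are all invented for illustration. The point is only that the blanks are filled in by an action (a call was placed), while the conversation itself never enters the record.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class CallRecord:
    """Hypothetical 'form': every blank is filled by an action, not by what was said."""
    caller: str
    callee: str
    started_at: datetime
    duration_seconds: int


def log_call(caller: str, callee: str, started_at: datetime,
             duration_seconds: int, conversation: str) -> CallRecord:
    # The conversation is accepted here only to show that it is discarded:
    # the program cannot "understand" it, so it keeps just who, whom, and when.
    return CallRecord(caller, callee, started_at, duration_seconds)


# Illustrative values only (555 numbers are fictional).
record = log_call(
    caller="555-0100",
    callee="555-0199",
    started_at=datetime(2013, 6, 6, 14, 30, tzinfo=timezone.utc),
    duration_seconds=312,
    conversation="See you at the usual place.",
)
print(asdict(record))
```

Everything a program would later sort, count, or match on lives in those four fields; what was actually said is gone by design.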
What we mean these days when we talk about security is preventing terrorist attacks. I was close by on 9/11, so I understand, though I wonder if we’ve become too narrow in our sensibility. Nonetheless, keeping to our nation’s narrowed sensibility, what can metadata do to prevent attacks?
I have no direct knowledge of PRISM, so I can only assume that what has been leaked about it is accurate. If this is so, then PRISM is probably like the many other metadata systems I have known in other spheres. In commerce, where there is at least as much talent and money as in the intelligence game, metadata’s primary strength is not investigative.
Companies have to be a little like the NSA in miniature on occasion. Major spammers, phishers and hucksters have to be shut down, for instance, in order for the Internet to be of use. Criminals learned long ago to spew deceptive metadata, so it can’t be reliably used to identify the bad guys. Spammy comments to blogs are generated by a web of fake or commandeered accounts, for example. Metadata has been a useful adjunct to investigations, to be sure, but effective detective work still relies on lucky breaks or, even more often, on old-fashioned human gumption.
Once you’ve identified a bad guy, metadata might (if it hasn’t been faked) help you find accomplices or top up your evidence of misdeeds. It has been widely pointed out that metadata didn’t detect the Boston Marathon bombers in advance. Closer cooperation with Russian authorities might have, however. After the fact, metadata might just help identify their accomplices and peers. Let’s see if it does.
It’s natural for techies to think in terms of advancing technology. Certainly, when technology becomes advanced enough, we imagine, there will be a button we can press that simply catches the bad guys. We want it so much that we sometimes hallucinate that we already have it. But we don’t. So while a metadata system comes on like a predator, it turns out to be sessile. It’s a filter feeder: it can catch only what it’s programmed to look for, and only if enough raw material flows through it and it gets lucky.
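The filter-feeder image can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any real system: the function `filter_feed`, the `WATCHLIST` set, the rule, and the records are all hypothetical. The filter sits still, examines whatever flows past, and keeps only the records that match a rule someone wrote in advance.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, str]
Rule = Callable[[Record], bool]


def filter_feed(stream: Iterable[Record], rules: List[Rule]) -> Iterator[Record]:
    """Yield only the records that match at least one pre-programmed rule."""
    for record in stream:
        if any(rule(record) for rule in rules):
            yield record


# Hypothetical rule: flag calls to a number someone already suspects.
WATCHLIST = {"555-0199"}


def calls_watchlisted_number(record: Record) -> bool:
    return record.get("callee") in WATCHLIST


# Invented records; the filter has no way to notice the last two,
# whatever they might mean, because no rule anticipates them.
stream = [
    {"caller": "555-0100", "callee": "555-0199"},
    {"caller": "555-0101", "callee": "555-0142"},
    {"caller": "555-0102", "callee": "555-0177"},
]

caught = list(filter_feed(stream, [calls_watchlisted_number]))
print(caught)
```

With an empty rule list the filter catches nothing at all, however much flows through it. And if the watchlisted number were faked, as criminals have long known how to do, the one match it does produce would be worthless.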