Monday, February 06, 2006
Lunchtime Notes: "Civil Liberty Infringement Engines"
An account in yesterday's Washington Post corroborates an earlier NYT story about the breadth of the domestic spying dragnet and the rarity with which real suspects are discovered. It fills in some information on the use of data mining methods, making the obvious if still disturbing point that large volumes of minimally suspicious communications are machine-filtered in order to generate the "leads" that are followed up by human agents. A key point:
Published government reports say the NSA and other data miners use mathematical techniques to form hypotheses about which of the countless theoretical ties are likeliest to represent a real-world relationship.
A more fundamental problem, according to a high-ranking former official with firsthand knowledge, is that "the number of identifiable terrorist entities is decreasing." There are fewer starting points, he said, for link analysis.
"At that point, your only recourse is to look for patterns," the official said...
Analysts build a model of hypothetical terrorist behavior, and computers look for people who fit the model. Among the drawbacks of this method is that nearly all its selection criteria are innocent on their own. There is little precedent, lawyers said, for using such a model as probable cause to get a court-issued warrant for electronic surveillance.
Jeff Jonas, now chief scientist at IBM Entity Analytics, invented a data-mining technology used widely in the private sector and by the government. He sympathizes, he said, with an analyst facing an unknown threat who gathers enormous volumes of data "and says, 'There must be a secret in there.' "
But pattern matching, he argued, will not find it. Techniques that "look at people's behavior to predict terrorist intent," he said, "are so far from reaching the level of accuracy that's necessary that I see them as nothing but civil liberty infringement engines." [Emphasis added.]
Link via Gary Farber, who figured out that there was a data mining effort early on (h/t Double Plus Ungood), was right, and justifiably says I-told-you-so (expletive deleted). He also has plenty of links to his strenuous efforts to draw attention to the matter; they're worth following if you have the time. I plead distraction in failing to have linked before now.
I am undecided as to whether it's good or bad, on net, that the article suggests that the NSA's computers are still, for the most part, fast but dumb.