Sunday, March 17, 2013

Biometrics and profiling: The door to the phone booth is now open

The next-to-last panel at the Yale Law School's March 3rd Location Tracking and Biometrics Conference dealt with biometric identification and its implications for privacy in the hyper-connected world of the 21st century. Moderated by Wired magazine contributing editor Noah Shachtman, the panel arguably was the creepiest of the day, with truly surreal implications for personal privacy. The panel featured Georgetown law professor Laura Donohue, Jennifer Lynch from the Electronic Frontier Foundation, NYU Ph.D. candidate Travis Hall, a postdoctoral fellow from Carnegie Mellon named Ralph Gross, and Alvaro Bedoya, who is an aide to Minnesota Sen. Al Franken. Go here to watch it online, beginning at the 7:31:48 mark. Here's a summary from my notes:

Biometrics then and now
Shachtman opened the discussion by pointing out that the use of biometrics for identification dates at least to 2,400 years ago, when the Chinese used hand prints and thumbprints on official documents. In the mid-19th century, the British East India Company used them to authenticate documents and track prisoners (in the aftermath of the Indigo Revolt, 1859-1861). The first use of fingerprints in a modern criminal case, he said, occurred in Brazil.

The US government has funded biometrics research from ear lobes to body odors as potentially unique, personal identifiers, many of which can be used from a distance. Some 31 states (including Texas, see Grits' discussion from 2004 here, here and here) use facial recognition with DMV photos. The Department of Justice has a database with fingerprints of 130 million people.

Biometrics have three characteristics which make them useful for identification: They are immutable, readily accessible, and individuating. Those characteristics, though, are a source of both benefits and problems. Notably, while biometrics are individualized, your computer turns them into ones and zeroes, meaning they can be electronically captured. Biometrics data can be gathered from a distance in public settings on a mass scale and monitored continuously, telling more about a person than just their identity. It's one thing, said Shachtman, to get a fingerprint or DNA swab upon arrest. But today telescopes can capture an iris scan from 1,000 meters away. Thus setting the stage, we turn to the panelists:

Game changer: Remote identification, 'multimodal' biometrics
Georgetown law professor Laura Donohue described how the recent "technological leap" into the 21st century has created a "statutory gap" and a "constitutional abyss." (See her related law review article.) Kraft Foods is in talks with Facebook, she said, so that a commercial kiosk identifies you through facial recognition to tailor individualized marketing. In Las Vegas, there's a billboard that analyzes your age and gender to market different products to different people (these are also proliferating in Japan). According to Donohue, there were 633 facial recognition patents issued between 2001 and 2011 compared to just a handful the decade before. She identified four emerging trends:
  • Move to multimodal biometrics. Pairing fingerprints with iris scans, DNA.
  • Pairing of biographic information and biometrics.
  • Interoperable databases
  • Collapsing distinction between law enforcement, homeland security and national security.
The FBI sees multimodal biometrics as a key law enforcement tool of the future, hoping to fuse contextual, biographical and biometric information in connected databases. E.g., facial recognition at political rallies can identify people who were at multiple rallies and check them against a "Repository of Individuals of Special Concern" (RISC). These functions are also being privatized. The company Rapback lets employers submit their employees' biometrics, which it then gives to the FBI; in return, the employer is notified of the employee's criminal and in some cases civil activities. The service could even notify an employer, she said, when an employee is spotted at a political rally if it's caught on film.

Historically biometrics were used for immediate, one-to-one identification: Fingerprints identified someone booked into the jail, or an iris scan let them enter a secure corporate facility. But now many biometrics can be matched remotely and, instead of one-to-one matching, can be matched one-to-many, potentially wiping out any remaining vestiges of privacy in public spaces. The dynamic of biometrics use is changing, said Donohue, along the following axes:
  • One-to-one vs. one-to-many.
  • Close up or at a distance
  • Custodial detention vs. public spaces
  • Notice or consent vs. none
  • A one time, limited occurrence vs. continuous and ongoing manner.
On the statutory side, the laws "have not grappled with new technology." And on the constitutional front, the focus in US v. Jones (finding the placement of a location tracking device on a car was a "search") on the physical intrusion of placing a tracker on a car ignores the growing array of tracking technologies like remote biometrics that require no physical intrusion. One could read Jones as including a "shadow majority" of justices endorsing the "mosaic theory" that holds continuous tracking over time violates one's reasonable expectation of privacy, but there are other cases, she said, that blur that distinction.

Immigration enforcement driving interoperable government databases
NYU's Travis Hall discussed biometrics, interoperability and immigration reform, with a particular focus on the FBI and the Department of Homeland Security's "Secure Communities" program, where people arrested on state and local criminal charges are matched with federal immigration databases to check for immigration violators and people for whom a criminal offense might itself be an immigration violation under the terms of their visa. Defense Department and Department of Justice databases don't talk to each other, he said, but they communicate indirectly through the Department of Homeland Security. The United States has a "federated system," said Hall, with four main biometric databases that after 9/11 all began to share data directly or indirectly. Fingerprints from federal, state and local arrestees are uploaded to the FBI which sends them to DHS to check for immigration violators. That way, DOD and intelligence agencies end up with access to data from state and local law enforcement activities.

At first, Secure Communities was pitched to the states as an opt-in program and only 13 states signed up to be notified of immigration violators in their jails. Then, when Illinois and Boston tried to opt out, the feds said "no, you can't."

What's the problem? The lines between criminal and civil enforcement mechanisms are becoming blurred, said Hall. Immigration status is often not static but "fuzzy," making bright-line enforcement under Secure Communities problematic. This blurring of criminal and civil enforcement mechanisms could also have unforeseen consequences down the line in areas of law completely unrelated to immigration. (I found myself wishing he'd given more hypotheticals about what that might look like.) With the advent of mobile biometrics, immigration agents can perform fingerprinting and iris scans in the field that instantly connect up to all the above-mentioned federal databases. (See an EFF white paper by Jennifer Lynch on the conjunction of biometrics and immigration enforcement.)

The expansion of immigration-related biometrics may impact youth eligible under the DREAM Act (or the administrative equivalent announced last year by President Obama), which states that applicants must demonstrate "good moral character." Applicants go through background checks and must give up their biometrics in order to qualify for provisional status, a process that's resulted in an "entrenchment of surveillance tools." In order to be lenient on “the good guys,” he said, government needs surveillance on everyone to identify bad actors.

Facebook as Big Brother
In an earlier panel, 9th Circuit Presiding Judge Alex Kozinski pointed out that in the Katz case, in which SCOTUS first articulated the concept of a "reasonable expectation of privacy," the court based its interpretation of Mr. Katz's expectations in large part on the anachronistic fact that he closed the door to the wiretapped phone booth - a factor that appears quaint in the modern age of cell phones. Sen. Al Franken's aide, Alvaro Bedoya, said that today, "the phone booth door is very much open." He added that "the future is now," and "this is a big deal."

We shouldn't just be concerned about the Minority Report scenario where advertising is funneled to us based on remote identification, he said. Now your driver's license, passport and Facebook account are all connected to facial recognition applications.

Facebook is honing its facial recognition software through its tag suggestions program, which presently is active everywhere but Europe, where privacy laws prevent its implementation. On the back end, Facebook makes a "faceprint" they can match like a fingerprint. When your friends upload pictures, they are prompted, "would you like to tag" the people in them. The company has rolled this out on an "opt out" basis, meaning they're gathering faceprint data unless you've specifically declined to participate. The average person has 53 photos on their Facebook page, he said. Assuming a 60% non-participation rate (which is probably way too high), the company would have a faceprint for one out of 20 people on the planet. Assuming a 20% opt-out rate, which is perhaps more realistic, Facebook has pictures of one out of 10 humans in their facial recognition system. Every time Facebook suggests, "is this so-and-so?" and asks if you want to tag them, and you say "no, it's not that person," the company improves their algorithm. Essentially, Facebook has crowd-sourced refinement of its system. Facebook does not promise they won't sell information to third parties. There are scenarios with real person to person (P2P) harms. In early 2010 an Israeli company rolled out Click App, a facial recognition system which Facebook purchased last year. Someone hacked it and figured out you could download pictures from Facebook and use it as a private facial recognition system.
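For readers who want to check the one-in-20 and one-in-10 figures, here is a back-of-the-envelope sketch in Python. The panel did not give the underlying totals, so the user count (about 1 billion) and world population (about 7 billion) below are my own round-number assumptions, and the function name `one_in_n` is made up for illustration.

```python
# Assumed round numbers, NOT from the panel: adjust to taste.
FACEBOOK_USERS = 1_000_000_000      # assumption: roughly 1 billion accounts
WORLD_POPULATION = 7_000_000_000    # assumption: roughly 7 billion people


def one_in_n(opt_out_rate, users=FACEBOOK_USERS, population=WORLD_POPULATION):
    """Return N such that about one out of N people on Earth has a faceprint."""
    enrolled = users * (1 - opt_out_rate)   # users who did not opt out
    return population / enrolled


for rate in (0.60, 0.20):
    print(f"opt-out rate {rate:.0%}: about 1 in {one_in_n(rate):.1f} people")
```

Under those assumptions the 60% case works out to roughly one in 17 or 18 people and the 20% case to roughly one in nine, which is in the same ballpark as the rounded figures quoted on the panel.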

Prof. Donohue had earlier described how the FBI had developed facial recognition technology to scan individuals at political rallies, identifying everyone who had attended two or more events. Bedoya said the events in the FBI's example were from Obama and Clinton political rallies. In all states where such facial recognition technology has been rolled out, he said, it's a crime to block a sidewalk, for example, so it's easy to find a law enforcement justification for its use in such settings. Your faceprint remains roughly the same between ages 20 and 50, he said.

In Katz, the Supreme Court considered it important that the phone booth door was closed. But every time you walk outside you knowingly expose your face to the public, Bedoya observed. Unless the law catches up to that sort of functionality, those sorts of outdated distinctions will obliterate personal privacy.

Privacy in the age of augmented reality
Carnegie Mellon's Ralph Gross discussed "Privacy in the age of augmented reality" (see an FAQ), having conducted experiments analyzing the convergence of public self-disclosure in social networks, improvements in facial recognition accuracy, cloud computing, "ubiquitous computing," and "statistical re-identification" of de-identified data. The results, he said, raise the question of whether, in an era of "augmented reality," we have finally reached “the end of anonymity.”

Combining publicly available social network data and off-the-shelf facial recognition technology, Gross and his fellow Carnegie Mellon researchers downloaded images from Facebook and then from dating service websites, trying to match them. One out of 10 dating-site members could be identified, he said. A second experiment set up a cheap webcam and asked students to let them take three photos from different angles. They could identify one out of three subjects, not just from their profile pictures but also from tagged images.

Even more disturbing was Gross' success at predicting Social Security numbers (!). Think for a moment: How many times have you given out the last four digits of your Social Security number as an identifier for online services? Have you ever thought about what happens if the other five digits could be inferred from public records? For 27% of subjects from Facebook, Carnegie Mellon researchers could guess the first five digits of their Social Security number within four attempts. In other words, their algorithm could come up with four guesses and one of them was right 27% of the time. So starting with a photo and using information from Facebook, it's possible to guess those first five digits around a quarter of the time. Over time and with more data, that algorithm could become even more robust.
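To make the "within four attempts" statistic concrete, here is a minimal Python sketch of how a hit rate like that is typically scored. It is not the Carnegie Mellon researchers' code; the function name `hit_rate_within_k` and the toy data (arbitrary letter strings, not real identifiers) are invented for illustration.

```python
def hit_rate_within_k(guesses, truth, k=4):
    """Fraction of subjects whose true value appears among their first k guesses."""
    hits = sum(
        1
        for subject, actual in truth.items()
        if actual in guesses.get(subject, [])[:k]   # only the first k guesses count
    )
    return hits / len(truth)


# Made-up toy data: four subjects, each with a ranked list of guesses.
truth = {"s1": "AAA", "s2": "BBB", "s3": "CCC", "s4": "DDD"}
guesses = {
    "s1": ["AAB", "AAA", "AAC", "AAD"],          # correct on the 2nd guess
    "s2": ["BBA", "BBC", "BBD", "BBE"],          # never correct
    "s3": ["CCA", "CCB", "CCD", "CCE", "CCC"],   # correct only on the 5th guess
    "s4": ["DDA", "DDB", "DDC", "DDE"],          # never correct
}

print(hit_rate_within_k(guesses, truth, k=4))
```

In this toy set only one of four subjects is guessed within four tries, so the score is one quarter. Allowing a fifth guess would pick up a second subject, which is why the number of attempts matters when reading a figure like 27%.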

Gross said modern facial recognition technology can go from an anonymous face to matching it to a presumptive name, then get online information, demographics, their friends, and potentially predict their Social Security number and credit score, not to mention their political and sexual orientation. This could all be done, he said, "in real time with a smart phone app." The implications are staggering and include:
  • Faces as conduits between online and offline data.
  • The emergence of personally predictable information
  • The rise of visual, facial searches
  • Democratization of surveillance, and
  • Social network profiles as Real IDs
When your face can be connected to so much information about you, it essentially becomes your ID.  Today's technology has reached the stage where such capability is no longer purely the domain of science fiction but a real-world scenario which courts and legislatures have yet to address.

Location data as biometrics: You are where you go
EFF's Jennifer Lynch spoke about "location data as biometrics." To my mind, the takeaway from her presentation was "you are where you go." The same thing can't be in two places at the same time and two different things can't occupy the same place, said Lynch, so by its nature location data is individualizing.

Cell phones generate a staggering amount of location data totaling 600 billion transactions per day worldwide, data which frequently is shared with third parties in volume and in real-time and constitutes a significant potential new market for cell-phone carriers. Your movements quickly reveal where you spend your time, when, and with whom, as well as what's typical and what's not. Though cell-tower data is "de-identified," she said, once you know all that information, "re-identification" - i.e., figuring out who is who - is a somewhat trivial technical feat (as Ralph Gross had earlier demonstrated).
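Here is a toy Python sketch of why re-identification can be so easy. It is not anything Lynch or Gross presented. It assumes, purely for illustration, that a person's two most frequently visited cell sites (think home and work) act as a signature, and that an outside party already knows which people go with which pair of places. All names, pseudonyms and cell IDs are made up.

```python
from collections import Counter


def top_locations(trace, n=2):
    """Return the n most frequently visited locations in a trace as a set."""
    return frozenset(loc for loc, _count in Counter(trace).most_common(n))


def reidentify(anonymous_traces, known_people):
    """Map each pseudonym to a name when exactly one known person fits its signature."""
    matches = {}
    for pseudonym, trace in anonymous_traces.items():
        signature = top_locations(trace)
        candidates = [
            name for name, places in known_people.items()
            if frozenset(places) == signature
        ]
        if len(candidates) == 1:        # only claim a match when it is unambiguous
            matches[pseudonym] = candidates[0]
    return matches


# "De-identified" traces: names stripped, pseudonyms kept (hypothetical data).
anonymous_traces = {
    "user_17": ["cell_A"] * 5 + ["cell_B"] * 4 + ["cell_C"],
    "user_42": ["cell_D"] * 6 + ["cell_B"] * 3 + ["cell_A"],
}
# Outside knowledge: each person's home and work cells (hypothetical data).
known_people = {
    "Alice": ("cell_A", "cell_B"),
    "Bob": ("cell_D", "cell_B"),
}

print(reidentify(anonymous_traces, known_people))
```

Even though both people share a work cell, the home-plus-work pair is unique to each, so stripping the names off the traces protects neither of them in this example.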

The more cell-phone towers and antennas that exist, the more precise location tracking by cell phones becomes. Using a site called AntennaSearch, Lynch found that there were 74 cell towers and 529 antennas within four miles of the Yale Law School. (Running the same search for Grits' own home in Central East Austin, I found 145 towers and 675 antennas within a four mile radius.)

A young German politician named Malte Spitz sued his cell phone company for all his location data and partnered with a newspaper to produce an amazing graphic tracking his movements for six months. The graphic includes not just his location but how many phone calls and text messages he received and sent, also linking the data to his Facebook and Twitter timelines to add context, creating a stunning diary of his life. Given the foibles of human memory, it shows your cell-phone carrier (and by extension any government agency or third party that accesses that data) in some ways may know more about your life than you do.

Following in his footsteps, so to speak, Lynch tracked herself for a month with a Google program called “Latitude” that records everywhere you've been. Nothing earth shattering - she mostly went from home to office to her kid's school, with an occasional trip to a store or other destination - but really it's the mundane data that identifies you and provides the most information about who you are and how you live your life. Location data combines and amplifies all the problems with biometrics, said Lynch.

Aren't biometrics 'awesome'?
Wired editor Noah Shachtman interjected to ask the panel, "Isn't the idea of your face as universal recognizer awesome?" It would make passwords useless, he said, since someone can't hack your face in the way you can hack a password. Not true, said Lynch, noting that Japanese kids hack cigarette machines with facial recognition tech by holding up magazine ads of older people. (Grits wrote in 2005 that, for that very reason, biometrics make terrible passwords. To a computer your fingerprint, iris scan or facial structure are just ones and zeros which are easily replicable.)

Travis Hall pointed out that, while we live just one life, there are "siloed" aspects to everyone's existence. New technology breaks down those silos in ways that people don't want broken down. Identity in one context may be open, but can now be linked to other contexts in ways that people would prefer remain closed. Prof. Donohue added that, for that reason, there's a public or social harm from long-term storage of this information. New guidelines allow the National Counterterrorism Center to retain personal information about non-suspects for five years instead of 180 days, generating an ever-more detailed and robust data set about individuals over time.

One-to-one biometrics are not as big a problem as "one-to-many" apps. It's one thing to verify the identity of an individual and another to identify strangers from a crowd, especially in an era when cameras are so ubiquitous.
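The difference between the two modes is easy to see in code. The Python sketch below is an illustration only, not any vendor's or agency's actual system. It assumes a biometric template is just a short list of numbers and that two templates "match" when they are within a made-up distance threshold. The function names, threshold and sample values are all hypothetical.

```python
import math

THRESHOLD = 0.5  # assumed cutoff; real systems tune this against error rates


def verify(probe, enrolled, threshold=THRESHOLD):
    """One-to-one: does this probe match the single template the person claims?"""
    return math.dist(probe, enrolled) <= threshold


def identify(probe, gallery, threshold=THRESHOLD):
    """One-to-many: search every template in the gallery for the closest match."""
    best_name, best_distance = None, float("inf")
    for name, template in gallery.items():
        d = math.dist(probe, template)
        if d < best_distance:
            best_name, best_distance = name, d
    return best_name if best_distance <= threshold else None


# Hypothetical enrolled templates (e.g. derived from DMV or profile photos).
gallery = {
    "alice": [0.1, 0.9, 0.3],
    "bob": [0.8, 0.2, 0.5],
    "carol": [0.4, 0.4, 0.9],
}
probe = [0.12, 0.88, 0.31]   # a face captured from a distance

print(verify(probe, gallery["alice"]))   # the person claims to be alice
print(identify(probe, gallery))          # nobody claims anything
```

`verify` needs the subject to present a claimed identity, which usually implies notice and some kind of cooperation. `identify` needs only a captured face and a big enough gallery, which is what makes the one-to-many case the one that worried the panel.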

During the Q and A, Chris Soghoian pointed out that there are dating websites for people from specific religions, people with particular STDs, gay people, etc., asking if the data could all be scraped and dumped into some sort of uber-database. Gross replied that it may or may not be legal to do that - most likely it would violate the sites' terms of service - but technologically, we're at a point where it can be done. Hall pointed out that the "real problem" with data analytics is that "you don't know you're being tracked."

Judge Kozinski stepped to the questioners' mic to ask about the implications of the NSA's “Solar Wind” project in Utah - a data processing facility where information accessed by intelligence services may be churned through at an astonishing 4 terabytes per second. "Can they apply these technologies to all that data?" he wondered, essentially answering his own question. I'm not sure many people in the room had considered that. There was a moment of stunned silence as everyone took in the implications, before Prof. Donohue pointed out that such massive data processing capacity was especially a problem when combined with indefinite data retention.

Another questioner asked, "Can you opt out of information being shared?" The answer is sometimes. Your cell phone, for example, must ping the nearest tower periodically so it can receive phone calls. Android phones, it was pointed out, automatically link phone numbers you dial to your Google contacts list. Could there be an automatic opt in instead of opt out? Sure. But it's not required, and in practice people opt in via terms of service agreements they never read.

Travis Hall observed that the "persistence of data is astounding." Data has a long shelf life. In Europe there is an ongoing debate about the “right to be forgotten.” As it turns out, it's very hard to be forgotten. Engineers have trouble comprehensively deleting even a single photo.

For the most part, unlike the politician from Germany, you have no right to review records the government keeps about you, Hall observed, especially for data gathered under national security authority. Shachtman pointed out that, ironically, Al Qaeda members may be the only people not being tracked in the government's biometric databases.
