What would you do with data from more than 2 billion trips taken with the ride-booking service Uber?
Soon, you'll be able to explore the possibilities. The company recently debuted a new online tool called Movement, which provides data like ride durations between two points, based on GPS information. The tool is a dream for city planners and local governments, who can use it to learn more about commute patterns and target infrastructure projects. And in the coming months, Uber wants to make Movement accessible to everyone.
It’s a gift, for certain. But some privacy experts worry the new tool could be a Pandora’s box. “Key, of course, to all of this, is, ensuring that the privacy of individual user data will be protected,” says Marc Rotenberg, president of the Electronic Privacy Information Center.
While Uber says the data in the Movement tool has been “anonymized and aggregated” to remove personally identifiable information, the company’s track record with user privacy is less than pristine. This fall, it weathered a public relations storm when an app update asked users to share their locations with Uber even when they weren’t using the ride-hailing service. Uber has also drawn ire over its employees’ access to individual users’ data: Some have reportedly retrieved journalists’ travel records, and a job interviewee at the company has said he had access to user databases.
On the heels of these privacy snags, it could seem brash to release more information about riders. But Uber’s not just facing pressure from users and the media. As the company has moved into markets traditionally dominated by taxis and mass transit, it’s found itself increasingly at odds with local governments. Rotenberg sees the Movement tool as an olive branch, of sorts.
“[Uber has] come into a lot of cities where there are established taxi services,” he says. “The incumbents are resisting Uber’s presence. And I think the cities may feel a little bit that if they get some of the user data from Uber for their planning purposes, now there's a benefit that they didn't previously have.”
But for Rotenberg, Uber hasn’t yet passed a key checkpoint: proving that it has successfully removed personally identifying information from the user data. That’s because researchers are still trying to find the best way to do so, he says.
“There are some very smart people working in the field right now of analytics and de-identification and anonymization, trying to see if it is truly possible to take information that begins as personally identifiable — which most certainly the ride information associated with the Uber service is — and transform it in a way that even with lots of technology, and lots of processing power, you can’t reconstruct the original identity information you might have had,” he explains.
He cites the example of Latanya Sweeney, whose work has shown that 87 percent of Americans are uniquely identifiable by just their five-digit ZIP code, gender and date of birth. “[And] you have other researchers, such as Cynthia Dwork, who have developed techniques like differential privacy to try to help people assess what the risk of re-identification is,” he adds.
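To make the re-identification idea concrete, here is a minimal Python sketch of the kind of check Sweeney's finding rests on: counting how many records are the only ones with a given combination of attributes. The function name `unique_fraction` and the four records are hypothetical and invented for illustration. They are not Uber data or census data, and this is not Sweeney's actual method.

```python
from collections import Counter

def unique_fraction(records, keys):
    """Share of records whose combination of `keys` appears exactly once."""
    if not records:
        return 0.0
    combos = [tuple(r[k] for k in keys) for r in records]
    counts = Counter(combos)
    unique = sum(1 for c in combos if counts[c] == 1)
    return unique / len(records)

# Made-up toy records for illustration only (assumption, not real data).
people = [
    {"zip": "02139", "gender": "F", "dob": "1980-01-15"},
    {"zip": "02139", "gender": "F", "dob": "1980-01-15"},
    {"zip": "02139", "gender": "M", "dob": "1975-06-02"},
    {"zip": "10001", "gender": "F", "dob": "1990-11-30"},
]

# ZIP code alone singles out few people in this toy set...
print(unique_fraction(people, ["zip"]))
# ...but combining attributes singles out more of them.
print(unique_fraction(people, ["zip", "gender", "dob"]))
```

In this toy set, one record in four is unique by ZIP code alone, while two in four are unique once gender and date of birth are added. The same effect, at national scale, is what makes seemingly harmless attributes identifying.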
If it turns out that Uber’s ride information can’t be de-identified with certainty, Rotenberg says the data dump could open the door to a host of other serious concerns.
“You have to be considering everything from surveillance, stalking, cyberhacking, credit card theft, identity theft, financial fraud,” he says. “There's a long list of potential risk to the users of the Uber service, and that's why you need to deal with a threshold problem, which is the de-identification issue.”
Ultimately, Rotenberg doesn’t think that we can leave it to Uber, or to cities, to ensure that rider data will be completely anonymized. He suggests bringing in an independent, third-party ombudsman to represent the privacy interests of Uber users and make sure that anonymization techniques are working.
Better yet, he says, we should put user privacy on the books: “An even better approach … might be simply to have a state law or federal law which says to Uber that, if, in fact, you do disclose personally identifiable information, there'll be some liability.”
“And I think that would keep both Uber and the cities operating in a way that's more aligned with the interests of the Uber customers.”
©2016 Science Friday