Google knows roughly who you are, roughly what you care about, roughly who your friends are. What else will it know?

The Infrastructure

We tell our search engines things that we would never tell our closest friends, believing that we are cloaked in anonymity. With Google’s expanding ability to collect, keep, and analyze data, this belief may be increasingly misguided. Google’s new privacy policy, given effect in early 2012, applies uniformly to products like Gmail, Calendar, and Google search. On its face, the unification promotes transparency by limiting the number of privacy policies that Google users must review to inform themselves of the rules to which they are subjected. However, the new policy also allows the services to share user data with each other. This creates a communal pool of user data that can be used to personalize the user’s experience, but also to identify key personal characteristics such as interests, location, and true identity (if the last of these has not already been explicitly provided to Gmail during the sign-up process).

Google collects this data through channels embedded in the services it provides, free of charge, to its users. Google acts as a central hub to which millions of users connect like so many spokes, seeking to reach out to each other via Gmail or Google Docs (which require an account to access), or to sort through information using Google search (which does not require an account to access, though the allure of improved personalized search results may encourage users to log in). In exchange for these services, users forfeit control over the flow of their data, and implicitly permit Google to act as a centralized data repository. Because the services can now share data with each other under the new privacy policy, a rich tapestry of each user’s activity can be created and stored on Google’s servers.

The Problem

The rapid rate of technological development makes it difficult to predict potential future uses of data at the time of its voluntary provision. Users may psychologically divorce their online personas from their flesh and blood, but that is no longer the reality. Being careless with information on Facebook can now get teachers and bartenders fired. In a telling statement about how Google may treat the data it collects, then-CEO Eric Schmidt quipped, “if you have something that you don't want anyone to know, maybe you shouldn't be doing it in the first place.” The troubling consequence of such a stance is that Google users are encouraged to restrict their online behavior to comport with an unarticulated and uncertain standard of propriety, at the risk of having their behavior used against them. In 2009, searching “airport security” or “homemade plastic” on Google could land you on a terrorist watch list.

Landing on government watch lists was the result of using a single Google service incautiously. With the synergistic effect of data sharing between services, user-produced data could become more dangerous in the future, particularly if more parties were able to access it. This danger is illustrated by the example of a health insurance provider that is able to access Google data. If a user has been searching “cancer symptoms” and purchasing painkillers or energy supplements through Google’s one-click Checkout service, the insurance company may promptly raise the user’s insurance rates. In many cases, acquiring the identity of the user who produced the data would not be an issue; Google Checkout requires an account, which requests real-name identification. It is unlikely that Google is currently so invidious (or so indifferent to backlash) that it would allow an insurance company to acquire any user data. However, the data provided by users today is not going to disappear, and future uses remain potentially infinite.

Aside from internet absence, there is no unifying solution to this problem. There are actions that Google could take to mitigate the potential harmfulness of user-provided data, but the impetus to take this action must come from the external pressure of the users themselves. For example, a scrubbing process between the user terminal and the Google server could strip the data of all information unique to the user (such as IP address and username), and prevent such data from being stored on the server. Google would then retain aggregated data for marketing and research purposes. Another alternative is to treat a user’s usage data the same way medical data is treated—require the user’s written consent prior to any release. Unfortunately, potential solutions of greater complexity often face questions that do not lend themselves to bright-line resolutions.
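The scrubbing process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field names `ip_address`, `username`, and `query`, and the helpers `scrub` and `aggregate_queries` are all hypothetical, and nothing here reflects how Google actually handles requests.

```python
from collections import Counter

# Hypothetical field names. The essay names only IP address and username
# as examples of information unique to the user.
IDENTIFYING_FIELDS = {"ip_address", "username"}


def scrub(event):
    """Return a copy of the event with the identifying fields removed."""
    return {k: v for k, v in event.items() if k not in IDENTIFYING_FIELDS}


def aggregate_queries(events):
    """Count queries across scrubbed events, keeping no per-user record."""
    return Counter(scrub(e)["query"] for e in events)


# Made-up sample traffic using documentation-reserved IP addresses.
events = [
    {"ip_address": "203.0.113.7", "username": "alice", "query": "cancer symptoms"},
    {"ip_address": "198.51.100.2", "username": "bob", "query": "cancer symptoms"},
    {"ip_address": "203.0.113.7", "username": "alice", "query": "airport security"},
]

counts = aggregate_queries(events)
```

In this sketch only the aggregate counts survive, which is the kind of data the essay suggests Google could keep for marketing and research. Removing these two fields would not by itself guarantee anonymity, since the text of a query can be identifying on its own.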

The Questions

People’s voluntary provision of data is protected by the First Amendment. If a user wants to use a Google service, the cost of doing so is the information that Google then knows about that user. As long as the exchange is voluntary, government regulation is likely to face stern backlash on constitutional grounds. However, if the legislature were to punish only the impermissible use of this voluntarily provided user data, the user’s First Amendment right would not be harmed. Yet such a solution would require an answer to the question: when does voluntarily provided data become impermissibly used? Furthermore, ownership of the data is entirely unclear. What actionable property rights are there in data that is voluntarily dropped into a depository, even if the user did not know he was providing data?

The European Union has proposed a legislative solution: the creation of a statutory right to be forgotten. This would allow people to have their data deleted if there is no “legitimate ground” for retaining it. This puts some power back in the hands of those who have already provided data to the repository, but with a principle as fluid as “legitimate ground,” Google is likely to find a justification for retaining most of its data, sapping the right of its bite. However, if the legitimacy of the ground were subject to judicial review, the right could retain protective strength.

A successful solution cannot prevent Google from functioning—the free exchange of information is critical in our society, and the concept of restricting it in order to protect the user is unsettlingly Orwellian. If an insurance company cannot gain access to a user’s search history, searching symptoms on Google can even be empowering to patients who can then walk into a doctor’s office feeling better informed about their condition. A solution that begins to make progress will have to prevent the flow of information from being used against the users who create it. Is this possible to do, without shutting down the search and sharing services we treasure? And in a disparate, disconnected group of users, where will we get the unified political will?

r7 - 22 Jan 2013 - 20:10:38 - IanSullivan

All material on this collaboration platform is the property of the contributing authors.
All material marked as authored by Eben Moglen is available under the license terms CC-BY-SA version 4.
