Law in the Internet Society
-- MeganBuckley - 31 Jan 2015

Calling for the Creation of a Data Agent

The mass aggregation and monetization of data generally occurs without individual creators knowing how, by whom, or for what purposes they are being recorded. This uncontrolled and unregulated data collection harms individuals by eliminating their control over what information remains private. Though technology is available to preclude such monitoring, for an average person, reasserting control over that which they choose to disclose is frustrating and inconvenient.

This essay calls for the development of a third-party entity, a Data Agent, to act as a layer of protection between individuals and those seeking to exploit their data. A multi-platform software application that could be downloaded and run in the background of computers or smartphones, the Data Agent would give users control over what information they disclose when they make purchases, browse the Net, or connect with a smart device.

Data Aggregators

Data aggregators collect and analyze data from public records, telephone companies, credit cards, loyalty cards, and anywhere else they can. Acxiom, the world’s largest collector of data, boasts that its data capacity “incorporate[s] virtually any legally usable data elements, first-party, third-party, or otherwise.” After four decades in the business of data aggregation, the company holds records that include “financial information, Social Security numbers and other information typically considered sensitive.”

The process is well documented. At the behest of third parties – corporations, politicians, and other interest groups – data aggregators cross-reference data points such as address and credit card data, car registration, and so on, creating a data profile from which to infer your preferences and predict your behavior.

The number of connected devices, known as the Internet of Things (IoT), is expected to grow to 4.9 billion in 2015 and to reach 25 billion in 2020. The IoT, from wearable technology to consumer product packaging, will continue to grow, merging the physical world with technology and exponentially increasing the data immediately extractable for aggregation.

Data aggregation enables analysts to profile individuals, learn their tastes and preferences, track their behavior over time, and devise ways to influence them to act in accordance with clients’ wishes. Mostly unaware of the amount of information they have disclosed or the purposes for which it is employed, individuals have provided corporations, governments, and various interest groups with the power to subliminally target and coerce their future acts. Aggregators are businesses, willing to provide algorithms to any entity willing to pay. The vast data sets they possess, if exposed or disclosed to third parties such as foreign intelligence networks, could result in widespread exploitation. Though anonymized in theory, the data is tied to addresses, Social Security numbers, and other information that would easily enable personal identification.

Benefits will undoubtedly accrue from the increased interconnectivity: resources may be allocated more efficiently, whether in the form of police units throughout a city or fertilizer on a farm; weaknesses in supply chains can be immediately anticipated; consumer needs can be predicted. However, a network of physical objects embedded with technology that merges them into an ecosystem of things, communications, applications, and data analysis poses an enormous risk to individual privacy. With the IoT, aggregators will be able to include data such as an individual’s sleep patterns and thermostat settings alongside information like age, income, and marital status, as well as internet searches, prescriptions picked up using a loyalty card, and time spent reading certain pages online.

Studies show that few people realize the extent to which they are providing others with tools to aid in their own exploitation, but even so, they want stronger control over data privacy. A consumer survey found that most adults (72%) are not willing to share personal data, even in exchange for loyalty points.

The Data Agent

The Data Agent would dramatically reduce the records available to aggregators by limiting the creation of useful data from a user’s (1) credit cards, (2) interactions with the Net, and (3) smart devices. Though this would not address the data traditionally available in public records or the data voluntarily disclosed through use of social networks or loyalty cards, it would go a long way toward reasserting individual ownership over data. Ideally, it would also spur a change from the present framework, in which commercial entities may monitor, record, sell, analyze, and use data gleaned from individuals for any purpose they choose, toward a framework in which individuals control their data and commercial entities must seek consent and disclose their purpose.

Credit card records are sold to aggregators, supposedly stripped of personally identifiable information. These records are useful because they show each vendor at which a purchase was made and for what amount. The Data Agent would act as a sort of stopping point in the purchase, appearing as the vendor on credit card statements for all transactions, in the same way that all transactions with different vendors at Amazon appear only as Amazon on statements. By stripping the records of the vendor’s identity, the Data Agent would render the data much less valuable to the aggregator and much less intrusive to the individual.
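The vendor masking described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the Transaction record, the AGENT_NAME label, and the mask_merchant function are hypothetical names, not part of any real payment system, and a real intermediary would involve far more than rewriting one field.

```python
from dataclasses import dataclass, replace

# Hypothetical label under which every purchase would appear on a statement.
AGENT_NAME = "DATA AGENT"


@dataclass(frozen=True)
class Transaction:
    """Hypothetical statement line: card, merchant, and amount in cents."""
    card_last4: str
    merchant: str
    amount_cents: int


def mask_merchant(tx: Transaction) -> Transaction:
    """Return a copy of the transaction with the vendor replaced by the agent."""
    return replace(tx, merchant=AGENT_NAME)


# Illustrative purchases (made-up vendors and amounts).
purchases = [
    Transaction("1234", "Corner Pharmacy", 1899),
    Transaction("1234", "Airline Tickets Inc", 42000),
]

# What the statement, and any aggregator buying it, would see.
statement = [mask_merchant(tx) for tx in purchases]
for line in statement:
    print(line.merchant, line.amount_cents)
```

In this sketch the amounts survive but the vendor names do not, which is the property the essay relies on: the record still balances for the cardholder, yet it no longer shows where the money was spent.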

The Data Agent would also preclude extraction of an individual’s interactions with the Net. At present, IP addresses provide locational information, giving greater power to the retained search history and browser data held by Google, Facebook, and others, enabling access to personally identifiable information and profiling of individual users. Software such as I2P or Tor can be used for accessing web services without disclosing one’s IP address. And personal servers, such as FreedomBox, may provide a way to keep data at home and out of the cloud.

But for the average user (see below), these protections are not realistic. Though protecting one’s privacy online is possible, it is not mainstream, in part because it requires individuals to change their behavior in ways that seem inconvenient. An average user looking to buy a plane ticket is not going to take steps to protect their privacy when searching on Google or Kayak. The Data Agent, already running, would use existing technology to conceal the IP address from which the search originated, allowing individuals to use Google or Kayak to compare plane tickets anonymously. However, some change in user behavior would still be necessary. If users logged in to their Google accounts, their searches would be retained as part of their profiles, albeit not tied to their location. But at the very least, the Data Agent application would allow users to access web services from any browser without allowing external parties to see what sites they visit and without the sites they visit learning their physical location.

The Data Agent should also block the sending of personally identifiable information along with records from smart devices. Though it is likely impossible to prevent all data monitoring when the physical object is, by its very nature, interconnected, the Data Agent should block third parties from extracting additional information. For example, though my Fitbit may provide data on my heart rate and daily steps, it should not provide data on my location, age, income, or credit history; nor should it provide data that would enable such linkages to be made.
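One way to picture this filtering is as an allowlist applied to each record before it leaves the device. The sketch below is only an illustration under stated assumptions: the field names, the ALLOWED_FIELDS set, and the filter_record function are hypothetical and do not describe how any actual wearable formats or transmits its data.

```python
# Hypothetical allowlist: the only fields the user has agreed to share.
ALLOWED_FIELDS = {"heart_rate_bpm", "daily_steps"}


def filter_record(record: dict, allowed: set = ALLOWED_FIELDS) -> dict:
    """Keep only allowlisted fields; drop everything else, including
    identifiers that would let the record be linked to a person."""
    return {key: value for key, value in record.items() if key in allowed}


# Illustrative record a device might try to send (made-up values).
raw_record = {
    "heart_rate_bpm": 62,
    "daily_steps": 8450,
    "latitude": 40.8075,
    "longitude": -73.9626,
    "age": 27,
    "device_serial": "ABC-000",
}

outgoing = filter_record(raw_record)
print(outgoing)
```

An allowlist is used here rather than a blocklist because it fails closed: a field the Data Agent has never seen before is dropped by default instead of passed along.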

The Data Agent should not have the ability to track a user’s actions or retain data on past behavior. As a background application, it would function strictly as an intermediary between the user and technology, protecting the user from unwanted manipulation by corporations wanting to coerce consumers or interest groups seeking to influence public opinion. Though not a comprehensive solution to invasive data extraction, the Data Agent would provide an easy-to-adopt, easy-to-integrate, mainstream alternative to the present system. More importantly, it would start to reframe data as controlled by the individual who creates it.

Note on the average user and data collection

After spending several hours downloading Tor and trying out different tools to protect my actions online (I kept Tor, but got stuck trying to set up an email account linked to my newly purchased domain name that would not be tied to Google's cloud), I looked for ways to find what others already knew. Despite embracing practices that make me an easy target for data extraction – Apple products, Google account, wearable technology, etc. – I was pleased that much of the information tied to me, at least by one aggregator, Acxiom, was inaccurate and seemed not to be integrated at all with my interactions on the Net.

From its privacy disclosures, Acxiom links to a site where, if you create an account to identify yourself (including the last four digits of your SSN), you can access the data retained about you and opt out (I highly doubt the efficacy of this opt-out option, considering that Acxiom is neither the initial collector of data nor the consumer of its analytics). The site did not recognize me at any address from the past 7 years. But once I entered an address it did recognize, the site granted me access to review and request modification of six categories of data about me. Besides incorrectly identifying me as a homeowner in rural Louisiana living with four adults, the first category, characteristic data, also got my occupation and marital status wrong. But it did list my correct birthdate, ethnicity, education, and gender. I'm not sure what to take away from the Acxiom data. Perhaps I don't have an important enough profile to demand data accuracy, or perhaps Acxiom is simply lagging behind in using data from the Net.

 


r1 - 31 Jan 2015 - 23:47:07 - MeganBuckley
All material on this collaboration platform is the property of the contributing authors.
All material marked as authored by Eben Moglen is available under the license terms CC-BY-SA version 4.