The Adequate Balance in Governmental Big Data Disclosure Policies

-- By LiorSokol - 11 Jan 2022

Introduction

In recent years, governments have been gathering and holding big data. This "big data" includes data about criminal records, health, real estate ownership, etc. The data is considered "big" because it is collected on a large scale, which makes it possible to draw statistical conclusions from it. For instance, health records can reveal susceptibility to a certain disease, or distinctive side effects according to gender, race, place of residence, etc., and so enable the development of more accurate treatments or a better understanding of the disease's sources. Once this data is collected, the question arises whether it should be "disclosed", and at which level: complete public disclosure, so that anyone can access the data, or on-demand disclosure for research purposes, on which I will focus. In the status quo, a complex set of rules regulates disclosure in different fields. The HIPAA rules, for instance, deal with health records, and 45 CFR § 164.502 significantly limits the disclosure of protected health data, in many cases conditioning it on the patient's consent. Different sources may require different rules; however, general principles should apply to all fields. In this essay, I will present the main arguments in the literature for each side and try to draw out the general principles that should apply and that would enable more extensive disclosure.

Normative analysis - Big Data Disclosure

Advantages of Disclosing Data

Disclosing big data held by the government can be advantageous in various ways. The first advantage is the benefit to research. In the post-modern era, information is a significant component of the research and development of new products, and the ability to analyze big data accelerates the progress of research. These benefits are not limited to health records, as discussed above, but extend to various fields. For instance, analysis of criminal records could identify factors that increase criminal behavior and help uproot them. The second advantage is the fulfillment of democratic purposes. In a democracy, sovereignty rests with the people, who in turn give the government its mandate. Disclosing information can teach the public about the functioning of state authorities, and thereby help the public hold elected officials accountable for their actions. Moreover, it allows individuals to make informed decisions. For example, the disclosure of crime or health records is a crucial factor in evaluating a residential area.

Disadvantages of Disclosing Data

Nevertheless, there are obvious disadvantages to such disclosure. The first is the violation of the individual's right to privacy. In particular, the information is often collected by the state on a mandatory basis, without the individuals' consent. Even where individuals opt in to provide information to the state, they usually provide it for a particular purpose, so that disclosure by the state serves a purpose other than the one for which the information was provided. According to theories of 'privacy as control', this change of purpose takes individuals' information outside their control, and thus violates the right to privacy. The violation may be mitigated if the information is published anonymously, but as long as there is a way to connect the information to the individual, the privacy violation cannot be overcome. Second, using citizens' private data treats the individuals themselves as a product. In the digital world, information is a product that sells at a high price. Companies pay a lot of money to direct their advertisements at people who are expected to purchase their products, and therefore a company that can provide information about a potential buyer will be rewarded financially. When the government shares its databases, information analysis companies may use them to build profiles of individuals. A combination of information from several databases, such as age, place of residence, and economic and family status, will allow advertisers to optimize their advertising. Using individuals' private data as a financial product can affect the way individuals behave, consume and read, thus violating their right to privacy. In sum, although sharing governmental big data can be economically and democratically beneficial, the individuals' right to privacy may be severely violated.

Leading principles for data disclosure

The discussion above presented two main privacy obstacles to the full disclosure of data: the difficulty of maintaining anonymity, and the use of the data for private purposes rather than public ones. Therefore, the following two principles should apply to any field of governmental data disclosure. The first is utility to the public interest. In order to prevent data from being disclosed to private entities and used as a targeting tool for their private interests, disclosure should be limited to the public interest. This determination should be made by analyzing both the requesting party (for instance, a research institution rather than a commercial corporation) and its purpose. If the requesting party has not demonstrated a publicly beneficial purpose, the data should not be disclosed. The second is the re-identification principle. In order to mitigate the potential privacy violation and maintain anonymity, the governmental authority should examine whether the source of the anonymized data can be identified, and disclose the data only if re-identification is impossible. Such a principle was presented in the Canadian Ontario case, in which the Supreme Court had to decide whether the ministry could be compelled to disclose the first three digits of the postal codes of sex offenders in Ontario. The Supreme Court approved the regional court's decision, according to which the information should be provided because the offenders could not be re-identified. The question that remained open was whether re-identification should be assessed against existing or future technologies. In my opinion, the examination should include reasonable future technologies, meaning technologies that can reasonably be expected to exist in the near future. These two principles create an adequate balance: big data will be disclosed only for great public utility, and only if a violation of privacy rights can be mitigated by anonymity.
These two principles should be the basis of regulation in any specific field, whether that regulation extends disclosure (as is required, in my opinion, for health records) or limits it.