Law in the Internet Society

Toward Algorithmic Disclosure

-- By BenWeissler - 09 Oct 2020

Should the government require internet companies to disclose the inner workings of their algorithms to the public? I argue it should, and I examine the financial disclosures mandated by securities law as a useful analogue.

The term "algorithm" is slippery, often shrouded in layers of unnecessary mystification. This essay adopts a broad definition of algorithm: any decision-making process embodied in code, typically one that uses input (data) to generate output (often predictions). An algorithm can be simple — e.g., an arithmetical formula — or complex — e.g., a machine learning model.
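To make this broad definition concrete, consider a minimal sketch in Python (the names and numbers are invented for illustration, not drawn from any real system): a fixed arithmetical formula and a miniature "machine learning" model, both of which take input data and produce output predictions.

```python
def credit_score_formula(income, debt):
    """A simple arithmetical algorithm: a fixed, hand-written weighted formula."""
    return 0.7 * income - 0.3 * debt

def fit_linear_model(data):
    """'Machine learning' in miniature: the decision rule is not written by
    hand but estimated from training data (ordinary least squares for
    y = a*x + b over (x, y) pairs)."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in data)
    var = sum((x - mean_x) ** 2 for x, _ in data)
    a = cov / var
    b = mean_y - a * mean_x
    return lambda x: a * x + b  # the learned decision rule

# The "algorithm" is the same kind of thing in both cases: input in,
# prediction out. Only the source of the rule differs.
predict = fit_linear_model([(1, 2), (2, 4), (3, 6)])
print(predict(4))  # prints 8.0: the rule learned from the data generalizes
```

Either of these, on the essay's definition, is an algorithm; the essay's argument does not turn on the distinction between them.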

Algorithms have been implicated in illegal housing and employment discrimination, and elsewhere have spurred extremism and hatred. Whether it is accurate to pin such maladies on algorithms is a point I revisit at the end of this essay.

The Financial Disclosure Analogue

In 1934, Congress passed the Securities Exchange Act (SEA), a sweeping piece of legislation that created the SEC and empowered that agency to set disclosure rules for publicly traded companies. The SEC requires such companies to file quarterly and annual reports, forcing them to reveal their financial performance, risk factors, and executive compensation. Violation of these disclosure rules (e.g., a false statement or omission of material fact) can result in a public enforcement action by the SEC and/or a private lawsuit brought by investors.

It's valuable to step back and ask in the abstract: why have financial disclosure rules at all? After all, in a world designed by Chicago-school economists (the "libertarian dream world"), you might expect investors, individually or in groups, to contract freely with companies and thereby obtain a bargained-for level of disclosure. In the libertarian dream world, a company's stock price might trade not only as a function of its earnings and growth prospects, but also in relation to the quality of disclosure it provides.

But we have rejected the libertarian dream world, and for good reason! Sunlight is the best disinfectant. Forcing companies to disclose financial information unlocks enormous benefits in both efficiency and equity: nobody wastes money or time bargaining for disclosure, and everybody gets to invest on a level playing field, starting from the same set of (true) information.

What Problems Does Algorithmic Disclosure Solve?

Why, then, do we accept the libertarian dream world when it comes to algorithms? Algorithms increasingly supplant human choice in dictating what consumes our time and attention, what housing and jobs we have access to, what interest rates we pay, and more. And yet we rely on the benevolence of companies voluntarily making selective algorithmic disclosure, or on whatever scraps of information lawsuits and outside researchers can pry loose.

Congress should wake from its slumber and pass algorithmic disclosure legislation. The details of who must disclose what, at what level of detail, and subject to what verification are solvable problems and the proper subject of technocratic expertise. Algorithmic disclosure would carry a number of benefits:

  • First, direct legal benefits. In the libertarian dream world we currently inhabit, a plaintiff who wishes to bring a suit under the Fair Housing Act against a discriminatory housing algorithm faces an uphill battle. To survive a motion to dismiss, the plaintiff must plead with enough particularity to meet the heightened “plausibility” standard of Twombly and Iqbal. In all likelihood, however, the plaintiff lacks sufficient knowledge of how the black-box algorithm works to clear the Twiqbal bar and ever gain access to discovery. Algorithmic disclosure would arm would-be plaintiffs with enough predicate information to get into court.
    • If financial disclosure and the SEA are any guide, the legal consequences of algorithmic disclosure will be far-reaching. One commentator has stated (somewhat tongue-in-cheek) that “everything is securities fraud” on the grounds that when a company does something bad and then (as companies are wont to do) fails to disclose the bad thing to investors, it violates securities law. Contributing to global warming (and not disclosing it) is securities fraud, mistreating orcas (and not disclosing it) is securities fraud, and so on. A regime of algorithmic disclosure, like its financial disclosure counterpart, would create a legal dragnet, forcing companies ultimately to account for the bad things they do but omit from disclosure.

  • Second, behavioral benefits which flow from algorithmic disclosure. An open admission by corporate executives, under penalty of law, that “our company’s algorithm is designed to maximize engagement by strategically funneling users down extremist rabbit holes” might cause companies to rethink their algorithms and to fully internalize the reputational costs of such admissions. On the other end of the market, frank disclosures might cause users to rethink their engagement with platforms that use manipulative algorithms.

Defending the Proposal

Algorithmic disclosure is likely to draw criticism from two camps: those who (gasp) view it as a radical, impractical proposal and those who (yawn) think it does not go far enough.

In the camp of shocked gaspers, we might find IPdroids claiming that a company’s algorithm is proprietary IP (a trade secret), and that the government cannot constitutionally take it via forced disclosure. Whatever the merits of this position, it can likely be overcome by careful design of the disclosures. More generally, disclosure-skeptics will have to answer for the broadly successful track record of the SEA and the SEC over nearly a century.

The second camp, the yawners, has a deeper and more persuasive critique of algorithmic disclosure. To ventriloquize these yawners: “We should avoid ascribing to algorithms magical powers of destruction and divisiveness that they simply do not have, especially when the source of our maladies lies elsewhere — in unchecked data collection, the centralized structure of internet services, and deeper socioeconomic malaise.” Algorithmic disclosure, however, involves neither misdirection nor denial of these realities. It seeks only to narrowly improve our situation and unlock incremental benefits. Moreover, because of the ‘legal dragnet’ effect discussed above, there is reason to believe that algorithmic disclosure will cast a somewhat wider legal shadow, helping us indirectly attack the issues the yawners raise — even before our law and politics address those deeper issues directly.

This is a fine draft, clever and engaging, somewhat irritating, as it should be. The lesson in the power of metaphors is also powerful. There's no preexisting relationship between the two forms of disclosure you chose to analogize, but having made the metaphorical connection, it then began to control the direction of your argument. If you had started from an environmental rather than a financial comparison, treating the disclosure of "algorithms" like the disclosures of hazardous chemicals in the workplace, or effluent disclosures from industrial discharge, you would have found a closer functional basis for comparison between two forms of regulation, and have reached different, though similar, rhetorical postures in your argument. You might try that, for the exercise.

But the primary route to improvement, I think, is to remove the factual misunderstanding that is the root of the argument. With respect to many "machine learning" applications, including the forms of recommendation and behavioral-cueing technologies you are discussing, the "algorithm" to be disclosed is basically trivial. Most ML (or, even less descriptively, "AI") structures depend for their effectiveness on their "training data," not on the executable computer programs, which are rather primitive routines whose interconnections in "neural networks" depend not on the simple "algorithms" but on the sequence of data fed as raw material into those programs. What gets "disclosed" in the arrangements you have in mind is of no use in explaining the emergent properties of the system that contains this code.
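The point can be made concrete with a minimal, hypothetical sketch (invented data, not any real recommender): the disclosed "algorithm" below is identical in both runs, yet the system's behavior is determined entirely by the training data fed into it.

```python
from collections import Counter

def train_recommender(click_log):
    """A trivial 'algorithm': recommend whatever item most often co-occurs
    with the given item in user sessions. Disclosing this code reveals
    nothing about what the system will actually recommend; that emerges
    from the click data."""
    pairs = Counter()
    for session in click_log:
        for a in session:
            for b in session:
                if a != b:
                    pairs[(a, b)] += 1

    def recommend(item):
        candidates = {b: c for (a, b), c in pairs.items() if a == item}
        return max(candidates, key=candidates.get) if candidates else None

    return recommend

# Same code, different data, different emergent behavior:
benign_data = [["news", "weather"], ["news", "weather"], ["news", "sports"]]
extreme_data = [["news", "conspiracy"], ["news", "conspiracy"], ["news", "weather"]]

print(train_recommender(benign_data)("news"))   # prints weather
print(train_recommender(extreme_data)("news"))  # prints conspiracy
```

Mandated disclosure of `train_recommender` alone would tell a regulator or plaintiff nothing about which of these two systems they were facing.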

About ten years ago I began receiving a few communications a year, first in the single digits, then in the dozens, from people (almost always trained at MIT) raising the same question:

Have you and Richard [Stallman] considered how to make free software principles work for machine learning? The source code of the programs doesn't do you any good in understanding or modifying the system: you need to have some copyleft that applies to the training data.

I always responded by saying that this was indeed a problem, and that it created an intractable subset of the general data licensing problem, which we hoped we could eventually solve on its own terms. In the latter part of the decade I would end each annual SFLC conference at CLS promising that "next year" we would discuss the issue. But by the time people knew it was there, a whole bunch of non-technical law and policy types, like Marc Rotenberg of EPIC, had invented "algorithmic transparency" as a policy prescription, and the bullshit-to-signal ratio climbed towards infinity.

Drop the assumption that when you know "the algorithm" you know anything. Assume instead that such disclosure is non-informative. Now what is your prescription?

r5 - 14 Nov 2020 - 19:08:56 - EbenMoglen