
  From: <spm2101@columbia.edu>
  To  : <cpc@emoglen.law.columbia.edu>
  Date: Fri, 13 May 2005 22:29:21 -0400

Third Paper

I sent this Thursday but I didn't see it on the mailing list.

Paper III: How Do Parties Use Personal Data to Influence the Vote?

By Steve McBride

In the late 1990s and the early part of this decade, political parties
began experimenting with using voters' personal data to predict and change
the outcome of the vote. What information do the parties already have, and
what are they doing with it? Neither party is exactly forthcoming about
its data mining activities in the 2004 election, but journalists and
bloggers have pieced together some of them. In this paper I outline the
basics.

The initial data gathering process is straightforward. First, the parties
go to local election records to get basic information on individuals,
such as party affiliation, gender, and age [1]. This limited information
is often inconsistent because each state keeps different data in its
registration records. Next, they use census and polling data to get
information such as race, income, and family status [2]. Much of this is
aggregated by census tract rather than individualized. Then the parties go
to data brokers like ChoicePoint and Acxiom to get more individualized data
[3]. Beyond this, the parties' data gathering activities are not clear.

Both parties maintain databases of approximately 168 million names [4],
although the GOP database predates the Democratic database by a few years,
and the GOP used it to help gain a few seats in the 2002 election [5].
The Republicans have named their database Voter Vault, appropriately so
given the level of secrecy around it: the GOP refuses to discuss any
details of Voter Vault [6]. The Democrats counter with DataMart. They are
a bit more forthcoming, and have boasted that DataMart contains 300 pieces
of information on each person in it [7]. However, the disclosure stops
there; the Democrats refuse to discuss what that information is.

So how do the parties use voter data to change the outcome of the vote?
One method is redistricting, a process in which the incumbent party
pushes through measures to redraw voting districts in an attempt to
increase the likelihood that it will gain seats. This technique benefits
only the incumbent. The other method is targeting specific voters with
messages tailored to each individual. Both parties jockey for advantage in
this arena.

Redistricting is not a new phenomenon. We have seen it in the United
States as early as 1812, when Massachusetts governor Elbridge Gerry
involuntarily lent his name to the word gerrymandering [8]. Data mining
does not create a new problem with redistricting, but it exacerbates the
existing one by allowing more intricately drawn districts to be created
from information that is not immediately obvious. For example, Ford and
Mercury owners are highly likely to vote Republican [9]. Using information
such as consumer purchasing habits, a solid GOP voting district can be
drawn up without the advantage being obvious at the time of the
redistricting. Since this technique is still in its infancy, we can't
track how well it is working. The quality of the data inputs is poor, which
limits the usability of the system, and existing voter predictive models
have limitations of their own, so there are constraints on what the
technology can do. But by the time the technology matures, it may be too
late to change existing laws to maintain the integrity of the vote.

Another use of personal data is targeted communication with individuals.
There are two audiences for this communication: potential campaign donors
and potential voters. In identifying donors, the goal is to raise more
money for the political party. There is a measure of transparency
associated with this process, and most donor information is actually made
public [10].

In identifying potential voters, the goal is first to find the relatively
small number of undecided voters. In most circumstances, the large
majority of voters in an election have decided by the time the campaign
starts; a small percentage, however, are genuinely undecided and will
determine the outcome of the election. Parties therefore create targeted
communications for the undecided voters, emphasizing the issues on which
the candidate and the target agree and downplaying the issues on which the
two disagree [11]. "It is about being specific things to specific
people." [12]

It's tough to tell exactly how the parties are manipulating this
information, since they aren't talking. However, it is not hard to imagine
what they might be doing. To illustrate, you can use data to generate a
score for how likely someone is to donate to your campaign, and how much
money they are likely to give, based on information like income, home
ownership, and magazine subscriptions. Then you use the data to find out
which issues the most promising potential donors find important and send
them a letter emphasizing those issues.
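
The scoring idea above can be sketched in a few lines of Python. This is
only an illustration of the kind of thing a party might do, not a
description of Voter Vault or DataMart, whose workings are not public.
The Person fields, the weights, the income cap, and the 0.6 threshold are
all hypothetical values chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    # Hypothetical record layout; real party databases are not public.
    name: str
    income: int
    owns_home: bool
    magazines: list = field(default_factory=list)
    top_issue: str = "unknown"


def donor_score(p: Person) -> float:
    """Return a made-up 0..1 score for how promising a donor looks."""
    score = 0.0
    # Income counts for up to half the score, capped at $200,000.
    score += min(p.income / 200_000, 1.0) * 0.5
    # Home ownership adds a flat bonus.
    score += 0.3 if p.owns_home else 0.0
    # Up to four magazine subscriptions count toward the rest.
    score += min(len(p.magazines), 4) / 4 * 0.2
    return score


def pick_letters(people, threshold=0.6):
    """Map each promising donor to the issue their letter should stress."""
    return {p.name: p.top_issue for p in people if donor_score(p) >= threshold}
```

With these assumed weights, a homeowner earning $150,000 with two
subscriptions clears the threshold and would get a letter about whatever
issue the data says matters most to that person, while a renter earning
$30,000 with no subscriptions would not.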

Now think of this happening with every swing voter, and elections become
more about who has the better data mining than who has the better
positions on the issues. Suddenly, the GOP knows that my mother would
probably vote Republican if she believed that George Bush is concerned
with saving Social Security, while my uncle is more concerned with
lowering taxes. So the GOP sends letters to my mother emphasizing Social
Security and to my uncle emphasizing tax cuts.

So what should we do about the problem? Gerrymandering is an old problem;
data mining merely aggravates it. Laws can be made, lawsuits can be
filed, and, in the end, neighborhoods change. It's a concern, but one I
think can be dealt with.

Voter and donor profiling is a tougher issue. To an extent, one party
will always be around to balance the other, but real problems arise if one
party gains a sizeable advantage in predicting behavior.

More importantly, there are serious issues with the idea of a vote being
modeled. The most important expression of free will in our society becomes
a predictable exercise. At best, the vote becomes an exercise in
consumption. At worst, the vote becomes an accurately predicted
afterthought. So what's the solution? The answer is worth at least
another 1,000-word paper.

[1] http://www.cio.com/archive/060104/election.html.
[2] Id.
[3] The Very, Very Personal Is Political, Jon Gertner, Feb 15, 2004, New
York Times, available at:
http://www.why-war.com/news/2004/02/15/theveryv.html
[4] Data Churners Try to Pinpoint Voters' Politics, Joyce Purnick, April
7, 2004, New York Times, available at:
http://www.nytimes.com/2004/04/07/politics/campaign/07VOTE.html?ex=1396670400&en=820e67c597bbda7f&ei=5007&partner=USERLAND.
[5] http://www.cio.com/archive/060104/election.html.
[6] Supra note 3.
[7] Id.
[8] http://en.wikipedia.org/wiki/Gerrymandering. After Gerry redistricted
Massachusetts to favor Jeffersonian candidates, a reporter commented that
one of the new districts looked like a salamander on the map. Another
reporter countered that it looked more like a Gerry-mander.
[9] http://www.baroudi.com/Blogs/Weblog23Feb2004.htm. This serves to
illustrate the point of how politicians can use mundane everyday
information to influence who gets elected.
[10] http://www.pcworld.com/news/article/0,aid,117309,00.asp. The Federal
Election Campaign Act and Buckley v. Valeo are responsible for this
disclosure. www.fundrace.org, by putting a couple of public government
databases together, gives the names, addresses, and professions of anyone
who has donated more than $200 to a presidential candidate or national
committee. For example, try typing in Jerry Seinfeld.
[11] Supra note 8.
[12] Id.

-----------------------------------------------------------------
Computers, Privacy, and the Constitution mailing list
