Index: [thread] [date] [subject] [author]
  From: <spm2101@columbia.edu>
  To  : <cpc@emoglen.law.columbia.edu>
  Date: Fri, 13 May 2005 22:29:21 -0400

Third Paper

I sent this Thursday but I didn't see it on the mailing list.

Paper III:  How Do Parties Use Personal Data to Influence the Vote?

By Steve McBride

	In the late 1990s and the early part of this decade, political parties 
began experimenting with using voters' personal data to predict and change 
the outcome of our vote.  What information do the parties already have, and 
what are they doing with it?  Neither party is exactly forthcoming about 
its data mining activities in the 2004 election, but journalists and 
bloggers have pieced together some of the activities.  In this paper I 
outline the basics.
	The initial data gathering process is straightforward.  First, the parties 
go to the local election records to get basic information on individuals, 
such as party affiliation, gender, and age [1].  This limited information 
is often inconsistent because each state keeps different data in its 
registration records.  Next, they use census and polling data to get 
information such as race, income, and family status [2].  Much of this is 
aggregated into census tracts and not individualized.  Then the parties go 
to data brokers like ChoicePoint and Acxiom to get more individualized data 
[3].  Beyond this, the data gathering activities of the parties are not 
clear.
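
	The three steps above can be pictured as a simple merge of records.  The 
sketch below is only my guess at the general shape of the process, not a 
description of what either party actually runs; every field name, key, and 
value in it is invented for illustration.

```python
# Hypothetical records; all field names and values are invented.
voter_file = [
    {"voter_id": 1, "party": "R", "gender": "F", "age": 61, "tract": "A"},
    {"voter_id": 2, "party": None, "gender": "M", "age": 34, "tract": "B"},
]

# Census data is aggregated by tract: it describes a neighborhood, not a person.
census_by_tract = {
    "A": {"median_income": 48000},
    "B": {"median_income": 72000},
}

# Broker data is individualized, so it is keyed to the person.
broker_by_voter = {
    1: {"homeowner": True, "magazines": ["Field & Stream"]},
}


def build_profiles(voter_file, census_by_tract, broker_by_voter):
    """Merge the three sources into one profile per registered voter."""
    profiles = []
    for record in voter_file:
        profile = dict(record)  # start from the election record
        profile.update(census_by_tract.get(record["tract"], {}))
        profile.update(broker_by_voter.get(record["voter_id"], {}))
        profiles.append(profile)
    return profiles


profiles = build_profiles(voter_file, census_by_tract, broker_by_voter)
```

	Note that the second voter ends up with only tract-level income and no 
broker fields, which mirrors the uneven quality of the underlying data.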

	Both parties maintain databases of approximately 168 million names [4], 
although the GOP database predates the Democratic database by a few years, 
and the Republicans actually used it to help gain a few seats in the 2002 
election [5].  The Republicans have named their database Voter Vault, and 
appropriately so, given the level of secrecy around it.  The GOP refuses to 
talk about any details of Voter Vault [6].  The Democrats counter with 
DataMart.  The Democrats are a bit more forthcoming about DataMart, and 
have boasted that it contains 300 pieces of information on each person in 
the database [7].  However, the disclosure stops there; the Democrats 
refuse to discuss the content of the information in DataMart.
	So how do the parties use voter data to change the outcome of the vote? 
One method is redistricting, a process in which the incumbent party pushes 
through measures to split up voting districts in an attempt to increase the 
likelihood that it will gain seats.  This technique is an advantage only to 
the incumbent.  The other method is targeting specific voters with messages 
tailored to each individual.  Both parties jockey for advantage in this 
arena.
	Redistricting is not a new phenomenon.  We've seen it in the United States 
as early as 1812, when Massachusetts governor Elbridge Gerry involuntarily 
lent his name to the word gerrymandering [8].  Although data mining isn't 
creating a completely new problem with redistricting, it exacerbates an 
existing problem by allowing more intricately drawn districts to be created 
based on information that is not immediately obvious.  For example, Ford 
and Mercury owners are highly likely to vote Republican [9].  By using 
information such as consumer purchasing habits, a party can draw up a solid 
GOP voting district without the advantage being obvious at the time of the 
redistricting.  Since this technique is still in its infancy, we can't 
track how well it is working.  The quality of data inputs is poor and 
limits the usability of the system, and the existing predictive models of 
voters are limited as well, so there are constraints on what the technology 
can do.  But by the time this technology matures, it may be too late to 
change existing laws to maintain the integrity of the vote.
	Another use of personal data is targeted personal communication with 
individuals.  There are two audiences for this communication: potential 
campaign donors and potential voters.  In identifying donors, the goal is 
to raise more money for the political party.  There is a measure of 
transparency associated with this process, and most of the donor 
information is actually made public [10].
	In identifying potential voters, the goal is first to identify the 
relatively small number of undecided voters.  In most circumstances, the 
large majority of voters in an election have decided by the time the 
campaign starts; a small percentage, however, are genuinely undecided and 
will determine the outcome of the election.  Therefore, parties will create 
targeted communications for the undecided voters, emphasizing the issues on 
which the candidate and the target agree and downplaying those on which the 
two disagree [11].  "It is about being specific things to specific 
people." [12]
	It's tough to tell exactly how parties are manipulating this information, 
since the parties aren't talking.  However, it is not hard to imagine what 
they might be doing.  To illustrate, you can use data to generate a score 
for how likely someone is to be a donor to your campaign, and how much 
money they are likely to give, based on information like income, home 
ownership, and magazine subscriptions.  Then you use the data to find out 
which issues the most promising potential donors find important and send 
each of them a letter emphasizing those issues.
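
	To make the idea concrete, here is a minimal sketch of such a donor score. 
It is purely my own illustration: the point weights, the income threshold, 
the field names, and the sample people are all assumptions, and a real 
model would presumably be fit to past donation data rather than written by 
hand.

```python
# Hypothetical donor-likelihood score. The weights and the income threshold
# are invented for illustration; they do not come from any party's model.
def donor_score(person):
    score = 0
    if person.get("income", 0) >= 100_000:
        score += 4  # assumed: high income suggests capacity to give
    if person.get("homeowner", False):
        score += 3  # assumed: home ownership suggests stability
    # assumed: each magazine subscription adds a point, capped at three
    score += min(len(person.get("magazines", [])), 3)
    return score


def top_prospects(people, n):
    """Return the n people with the highest donor scores."""
    return sorted(people, key=donor_score, reverse=True)[:n]


# Invented sample data.
people = [
    {"name": "A", "income": 120_000, "homeowner": True,
     "magazines": ["X", "Y"], "top_issue": "taxes"},
    {"name": "B", "income": 40_000, "homeowner": False,
     "magazines": [], "top_issue": "education"},
    {"name": "C", "income": 65_000, "homeowner": True,
     "magazines": ["Z"], "top_issue": "Social Security"},
]

for p in top_prospects(people, 2):
    print(f"Send {p['name']} a letter emphasizing {p['top_issue']}")
```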
	Now think of this happening with every swing voter, and elections become 
more about who has better data mining than about who has better issues. 
Suddenly, the GOP knows that my mother would probably vote Republican if 
she believed that George Bush is concerned with saving Social Security; 
my uncle, however, is more concerned with lowering taxes.  So the GOP sends 
out letters to my mother emphasizing Social Security and to my uncle 
emphasizing tax cuts.
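
	The selection step in that example could be as simple as the sketch below. 
Again, this is only an illustration under my own assumptions: the issue 
names, the stances, and the two voter records are made up, and nothing here 
is known to reflect how either party actually chooses its messages.

```python
# Hypothetical candidate platform; issue names and stances are invented.
CANDIDATE_POSITIONS = {
    "social_security": "protect",
    "taxes": "cut",
    "environment": "deregulate",
}


def pick_issue(voter):
    """Return the voter's highest-priority issue on which the candidate
    agrees, skipping (that is, downplaying) issues where they disagree."""
    for issue, stance in voter["priorities"]:
        if CANDIDATE_POSITIONS.get(issue) == stance:
            return issue
    return None  # no common ground found


# Invented voter records, ordered from most to least important issue.
mother = {"priorities": [("social_security", "protect"), ("taxes", "raise")]}
uncle = {"priorities": [("environment", "regulate"), ("taxes", "cut")]}

for name, voter in [("mother", mother), ("uncle", uncle)]:
    print(name, "gets a letter about", pick_issue(voter))
```

	The uncle's top concern is skipped because the candidate disagrees with 
him on it, so his letter leads with the next issue on which the two agree.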
	So what should we do about the problem?  With gerrymandering, data mining 
just aggravates an old problem.  Laws can be made, lawsuits can be filed, 
and, in the end, neighborhoods change.  It's a concern, but one I think can 
be dealt with.
	Voter and donor profiling is a tougher issue.  To an extent, one party 
will always be around to balance the other, but real problems arise if one 
party gains a sizeable advantage in predicting behavior.
	More importantly, there are serious issues with the idea of a vote being 
modeled.  The most important expression of free will in our society becomes 
a predictable exercise.  At best, the vote becomes an exercise in 
consumption.  At worst, the vote becomes an accurately predicted 
afterthought.  So what's the solution?  The answer is worth at least 
another 1,000-word paper.


[1] http://www.cio.com/archive/060104/election.html.
[2] Id.
[3] The Very, Very Personal Is Political, Jon Gertner, Feb 15, 2004, New 
York Times, available at: 
http://www.why-war.com/news/2004/02/15/theveryv.html
[4] Data Churners Try to Pinpoint Voters' Politics, Joyce Purnick, April 
7, 2004, New York Times, available at:
http://www.nytimes.com/2004/04/07/politics/campaign/07VOTE.html?ex=1396670400&en=820e67c597bbda7f&ei=5007&partner=USERLAND.
[5] http://www.cio.com/archive/060104/election.html.
[6] Supra note 3.
[7] Id.
[8] http://en.wikipedia.org/wiki/Gerrymandering.  After Gerry redistricted 
Massachusetts to favor Jeffersonian candidates, a reporter commented that 
one of the new districts looked like a salamander on the map.  Another 
reporter countered that it looked more like a Gerry-mander.
[9] http://www.baroudi.com/Blogs/Weblog23Feb2004.htm.  This serves to 
illustrate the point of how politicians can use mundane everyday 
information to influence who gets elected.
[10] http://www.pcworld.com/news/article/0,aid,117309,00.asp.  The Federal 
Election Campaign Act and Buckley v. Valeo are responsible.  
www.fundrace.org, by putting a couple of public government databases 
together, gives the names, addresses, and professions of anyone who has 
donated more than $200 to a presidential candidate or national committee.  
For example, try typing in Jerry Seinfeld.
[11]  Supra note 8.
[12] Id.

-----------------------------------------------------------------
Computers, Privacy, and the Constitution mailing list


