Collaborative Research: Using Web Data to Study Campaigns and Representation

Project: Research project

Project Details


Electoral campaigns are a defining feature of democratic polities: they influence voters’ choices and set expectations for how elected representatives will serve their constituents. Yet studying campaigns and their effects has been difficult. This is particularly true of U.S. congressional elections, in which over 900 candidates compete in a given election year. In recent work, the investigators have addressed this gap by collecting data from candidates’ campaign websites and members’ official congressional websites. With the support of four collaborative National Science Foundation grants (SES-0822819 and SES-0822782 in 2008; SES-1024079, SES-1023291, and SES-1022902 in 2010; SES-1155043, SES-1154201, and SES-1154317 in 2012; and SES-1627431 and SES-1627413 in 2016), the investigators have amassed a data set consisting of more than 3,000 website codings from 2002 through 2016. These data have offered unprecedented opportunities to study campaigns and their effects on voters and legislators. The investigators have also used survey and experimental data to explore how campaign “insiders” view their websites and how distinct web-campaign strategies affect voters. The investigators propose to extend their data collection to include the 2018 midterm election and the 2019 legislative session. As before, they will code sites over the course of the campaign, archive sites, implement surveys of website designers, and code official member websites approximately one year after the campaigns. As they did in 2010, 2012, and 2016, they will also solicit input on the coding scheme so that it includes features of interest to other scholars, who can then use the data for their own research.
Extending the data to include the 2018 election (and the 2019 legislative session) is critical for several reasons. The current unified Republican government differs from the conditions that prevailed in recent years of the project. This election cycle will also allow a comparison to the first election cycle coded for the project (2002), when George W. Bush was facing his first midterm election under very different political conditions. Independent analysts are discussing the possibility of major gains for House Democrats, but there is uncertainty in the Senate, as a large number of Democrats are running in states that Donald Trump won in 2016. The project uniquely brings together campaign and legislative data. The investigators will construct a publicly available data set that includes coding of approximately 3,750 House and Senate campaign websites and roughly 700 official congressional websites, over eighteen points in time. These data will include extensive information on candidates’ backgrounds, districts, and campaign media. The intellectual merit of the project is multifaceted. In addition to enabling scholars to track the evolution of the Internet and technology over time, the data offer researchers a unique opportunity to test theories of campaigns and their effects on voters and representatives. Unlike other unmediated sources of campaign communication, such as television advertisements and debates, virtually all candidates launch campaign websites and all representatives have official websites. This allows for analyses of a representative sample of candidates and members, rather than a sample biased towards competitive, well-funded campaigns or certain members. Websites also enable politicians to present a holistic picture of their behaviors rather than short sound bites, posts, or selected roll call votes. The project offers many contributions to societal knowledge and will have a
Effective start/end date: 9/15/18 to 8/31/21


  • National Science Foundation (SES-1823696)

