Predicting the 2018 United States House of Representatives Elections and Voting Patterns with Polling Data

This repository contains the relevant model and visualization code for Team MSR2's Columbia University Spring 2018 Data Science Capstone Project with industry partner Microsoft Research.

Post-Election Update

In hindsight, our model's predictions were far off from the actual 2018 midterm election results: the Democratic Party gained 41 seats rather than losing 19 as our model predicted. We therefore want to highlight some possible explanations for the heavily Republican-favored predictions. In discussions with our mentors from Microsoft Research, we learned that they obtained similarly Republican-favored predictions from the same dataset despite using more advanced modeling techniques than those used here. This led us to believe that the main culprit was a Republican-leaning bias in the surveying methodology used to collect the polling data. We go into further detail about how this bias may have been introduced in section 5.2 of our capstone report.

Abstract

Nationwide surveys are widely used for gathering household information, evaluating public policies, and predicting elections. However, no unofficial survey can achieve census-level coverage. With survey data gathered by PredictWise using Pollfish, a mobile survey platform, we use a multilevel regression and poststratification (MRP) model to predict the two-party vote shares for each of the 435 congressional districts and the District of Columbia in the forthcoming 2018 United States House of Representatives Elections. We use these predictions to investigate the changes in voter turnout in specific demographics that would be needed to change the balance of power in the House of Representatives. In addition to widely used demographic and geographic information, we incorporate responses to psychometric survey questions using three weighting schemes to evaluate their effects on the model. We identify several question topics that improve our model’s predictive accuracy and find evidence that adding multiple topics simultaneously produces approximately linear improvements in accuracy.
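To make the poststratification step of MRP concrete, here is a minimal sketch, not the code in this repository. It assumes the multilevel regression has already produced a predicted Democratic two-party vote share for every demographic cell, and it only shows how those cell-level predictions are weighted by cell population to get a district-level estimate. The function name `poststratify`, the tuple layout, and the toy numbers are all hypothetical.

```python
from collections import defaultdict


def poststratify(cells):
    """Aggregate cell-level predictions into district-level vote shares.

    `cells` is an iterable of (district, population, predicted_share) tuples.
    `predicted_share` stands in for the output of a fitted multilevel model
    (hypothetical here); `population` is the size of that demographic cell.
    """
    weighted = defaultdict(float)
    totals = defaultdict(float)
    for district, population, share in cells:
        weighted[district] += population * share
        totals[district] += population
    # Population-weighted average of the cell predictions in each district.
    return {d: weighted[d] / totals[d] for d in totals if totals[d] > 0}


# Toy poststratification table (made-up numbers, for illustration only).
cells = [
    ("District A", 1000, 0.70),
    ("District A", 3000, 0.50),
    ("District B", 2000, 0.25),
    ("District B", 2000, 0.35),
]
estimates = poststratify(cells)
for district, share in sorted(estimates.items()):
    print(f"{district}: {share:.2f}")
```

The weighting is what lets a non-representative sample produce district-level estimates: each cell contributes in proportion to its share of the target population rather than its share of survey respondents.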

Interactive Map (2018 Midterm Election predictions and alternate election outcomes)

Contents

baseline: Python code for our baseline prediction model and our predicted outcomes for the 2018 Midterm Elections
demographics: supplementary data used to augment our model or impute missing data
dynamic: an experimental dynamic model of Trump's approval rating
plots: exploratory data analysis code
psychometric: a collection of models that incorporate various psychometric variables, along with the models without psychometric variables that served as a baseline for comparison
    authoritarianism: Python code for our prediction model that includes identification with authoritarianism as an additional variable, along with the updated predictions for the 2018 Midterm Elections produced by this model
report_map: the HTML, CSS, and JavaScript (D3.js) code used for the interactive map of our predictions
turnout_adjustment: Python code that modifies the poststratification space by adjusting voter turnout for specific demographic groups, along with the updated predictions produced by those adjustments
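As a rough illustration of what adjusting voter turnout in the poststratification space can mean, the sketch below scales the voter count of one demographic group and recomputes the weighted vote share. It is an assumption-laden toy and not the code in the turnout_adjustment folder: the helper names `adjust_turnout` and `vote_share`, the dictionary keys, and the numbers are all hypothetical.

```python
def adjust_turnout(cells, group, multiplier):
    """Return a copy of `cells` with the voter count scaled for one group.

    Each cell is a dict with hypothetical keys: "group" (demographic label),
    "voters" (expected number of voters), and "share" (predicted Democratic
    two-party vote share for that cell).
    """
    adjusted = []
    for cell in cells:
        voters = cell["voters"]
        if cell["group"] == group:
            voters = voters * multiplier
        adjusted.append({**cell, "voters": voters})
    return adjusted


def vote_share(cells):
    """Voter-weighted average of the cell-level predicted shares."""
    total = sum(c["voters"] for c in cells)
    return sum(c["voters"] * c["share"] for c in cells) / total


# Toy cells for a single district (made-up numbers).
cells = [
    {"group": "18-29", "voters": 1000, "share": 0.60},
    {"group": "65+", "voters": 3000, "share": 0.40},
]
before = vote_share(cells)
after = vote_share(adjust_turnout(cells, "18-29", 2.0))
print(f"before: {before:.2f}, after doubling 18-29 turnout: {after:.2f}")
```

Only the weights change in this sketch; the cell-level predictions stay fixed. That makes it possible to ask how much turnout in a given group would have to shift before a district's predicted winner changes.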

