Analysis of US Presidential Elections, 2016
Uploaded by tapan-saxena
Table of Contents
Overview of Dataset
Objectives
Tools Used
Methodology
Analysis & Findings
Assumptions
Prediction
Conclusions
Bibliography
Overview of Dataset
The dataset was obtained from the Kaggle website
The dataset contains relevant data for the 2016 US Presidential Elections, including results of the primary elections
The dataset consisted of 4 files in csv and zip format, namely:
County_facts - demographic data on counties from the US census
County_facts_dictionary - description of the columns of County_facts
Primary_results - vote data, including the number of votes received by each candidate in different counties
Objectives
Understanding the primary elections and key terms
Number of candidates who took part in the primary elections from each party
Most popular candidate for each party by state and with respect to different types of people
Differences in the number of votes with respect to party and candidate by state
Analysing the non-swing states (looking at the trends of the previous 5 election years)
Understanding the general elections and key terms
Calculating the number of electoral votes for the final presidential nominees
Prediction of the next President of the United States of America
Predictions and models
Popularity of each candidate on the basis of Twitter sentiment analysis
Performance comparison of the various tools utilized
Tools Used
RStudio
MS-Excel
SAS
SQL with RStudio
Tableau
Anaconda
Methodology
Obtain dataset from Kaggle.com
Explore the data to find out what it is all about
Understand the US primary elections
Define objectives
Modifying, cleaning and transformation of Data in RStudio
Writing the modified dataset into a csv file
Carrying out different types of analysis on the modified data to draw insights using different tools and visualizations
Understand the US general elections
Make certain assumptions in order to predict the next president
Do qualitative and quantitative analysis, keeping the assumptions in mind, to find out the next president
Support our answer with the help of certain mathematical models
Twitter Sentiment Analysis to find the popularity of final presidential nominees
Comparison of performance of tools used for analysis
Drawing conclusions
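The cleaning and export steps in the methodology above were carried out in RStudio, and the slides do not state which cleaning rules were applied. As a rough Python equivalent, the sketch below assumes three hypothetical rules (trim whitespace, drop rows with no vote count, convert votes to integers) and an assumed column layout. The sample rows are invented, and an in-memory buffer stands in for the output csv file.

```python
import csv
import io

# Invented sample rows in an assumed column layout (not the real dataset).
RAW = """state,party,candidate,votes
 State A ,Democrat,Hillary Clinton,10
State A,Republican,Donald Trump,
State B,Republican,Ted Cruz, 25
"""


def clean_rows(reader):
    """Yield cleaned rows: trimmed text, integer votes, no empty vote counts."""
    for row in reader:
        row = {key: value.strip() for key, value in row.items()}
        if not row["votes"]:
            continue  # assumed rule: skip rows without a vote count
        row["votes"] = int(row["votes"])
        yield row


cleaned = list(clean_rows(csv.DictReader(io.StringIO(RAW))))

# Write the modified dataset back out as csv (a buffer stands in for a file).
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["state", "party", "candidate", "votes"])
writer.writeheader()
writer.writerows(cleaned)
print(out.getvalue())
```

To write to disk instead, the buffer could be replaced with a file opened via `open(path, "w", newline="")`.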
Analysis & Findings
Understanding the US Primary Election and key terms
Key Terms
National Conventions
Primary: Closed primary, Open primary, New Hampshire Primary
Caucus: Iowa Caucuses
Delegates: Pledged Delegates, Super Delegates
Number of candidates who took part in the primary elections from each party
Based on the dataset, a total of 14 candidates from the two parties took part in the primary elections:
Democratic Party
Hillary Clinton
Bernie Sanders
Martin O’Malley
Republican Party
Ben Carson
Carly Fiorina
Chris Christie
Donald Trump
Jeb Bush
John Kasich
Marco Rubio
Mike Huckabee
Rand Paul
Ted Cruz
Rick Santorum
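A count like the one above could be derived by collecting the distinct candidate names per party from the primary results. The following is a minimal sketch, assuming the file has `party` and `candidate` columns. The column names are an assumption, and the sample rows (placeholder states and counties, invented vote counts) are for illustration only.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows; the column names are assumed, not taken from the slides.
SAMPLE = """state,county,party,candidate,votes
State A,County 1,Democrat,Hillary Clinton,10
State A,County 1,Democrat,Bernie Sanders,12
State A,County 1,Republican,Donald Trump,20
State A,County 1,Republican,Ted Cruz,25
State A,County 2,Democrat,Hillary Clinton,7
State A,County 2,Republican,Ted Cruz,9
"""


def candidates_by_party(rows):
    """Return {party: sorted list of distinct candidate names}."""
    found = defaultdict(set)
    for row in rows:
        found[row["party"]].add(row["candidate"])
    return {party: sorted(names) for party, names in found.items()}


result = candidates_by_party(csv.DictReader(io.StringIO(SAMPLE)))
for party, names in sorted(result.items()):
    print(party, len(names), names)
```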
Most popular candidate by each party
Republican Party
Democratic Party
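One way such a ranking could be produced is to sum each candidate's county-level votes and take the leader within each party. The sketch below is illustrative only. The vote counts are invented and do not reflect the actual primary results, and the function name is hypothetical.

```python
from collections import defaultdict

# Invented county-level rows: (party, candidate, votes).
ROWS = [
    ("Democrat", "Hillary Clinton", 120),
    ("Democrat", "Bernie Sanders", 90),
    ("Democrat", "Hillary Clinton", 60),
    ("Republican", "Donald Trump", 150),
    ("Republican", "Ted Cruz", 110),
    ("Republican", "Ted Cruz", 30),
]


def most_popular_by_party(rows):
    """Return {party: candidate with the highest summed vote count}."""
    totals = defaultdict(lambda: defaultdict(int))
    for party, candidate, votes in rows:
        totals[party][candidate] += votes
    return {
        party: max(by_candidate, key=by_candidate.get)
        for party, by_candidate in totals.items()
    }


leaders = most_popular_by_party(ROWS)
print(leaders)
```

Adding the state to the grouping key would give the per-state breakdown in the same way.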
Most popular candidates for each party by different types of people
Most popular candidate across both parties, by different types of people
Non-swing states, looking at the previous election
Results of the 2012 elections, used to identify the non-swing states and compare them with this year's elections
Understanding the general election and key terms
Key terms
Electoral College
Electors
Swing states
Calculating the number of electors
The number of electors differs for each state
The number of electors is calculated from the number of districts in each state plus its senate members, of which every state has two
The more districts a state has, the more electors it has
Electors are the persons who choose the president of the United States
The electors of a state vote in favour of the nominee who was most popular in that state
California: 53 districts + 2 senate members = 55 electors
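The elector calculation described above is simple arithmetic and can be written as a small function. This is a minimal sketch; the function and constant names are illustrative.

```python
SENATORS_PER_STATE = 2  # every state has two senate members


def number_of_electors(districts: int) -> int:
    """Electors for a state = number of districts + two senate members."""
    if districts < 1:
        raise ValueError("a state has at least one district")
    return districts + SENATORS_PER_STATE


california = number_of_electors(53)
print(california)  # 53 districts + 2 senate members = 55
```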
Prediction of the next president of the United States
As data pertaining to the general elections was not available, certain assumptions were made, which are as follows:
The conditions and the number of votes to be cast during the upcoming general elections would be similar to those during the primary elections
Therefore, the same primary election data was analysed to draw prediction insights
Qualitative analysis and current affairs were used to make predictions
Two different predictions were made, one on the basis of party and the other on the basis of the final presidential nominee
The predictions are supported by different mathematical models defined by distinguished professors in their fields
An assumption was made on the division of votes of the candidates who quit or suspended their campaign
Predictions and models
On the basis of party, the largest number of electoral votes went to the Republican Party, leading to a win for Donald Trump
If we consider only the candidates and disregard the parties, there can be two phases, as follows:
1st Phase - Winner: Hillary Clinton
2nd Phase - Winner: Donald Trump
Mathematical models supporting our answer include different econometric models such as the DeSart Model (Jay DeSart), the Fair Model (Ray Fair), the Primary Model (Helmut Norpoth), and the Electoral Cycle Model (Helmut Norpoth), among others
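The party-based prediction can be pictured as a tally in which each state's electors go to the party with the most primary votes in that state. The sketch below assumes a winner-take-all rule in every state and reuses the districts-plus-two-senators formula. The state names, vote totals and district counts are invented placeholders, not figures from the analysis.

```python
from collections import defaultdict

# Invented inputs for illustration only: (state, party, primary votes).
PRIMARY_VOTES = [
    ("State A", "Democrat", 500),
    ("State A", "Republican", 300),
    ("State B", "Democrat", 200),
    ("State B", "Republican", 450),
    ("State C", "Democrat", 100),
    ("State C", "Republican", 150),
]
DISTRICTS = {"State A": 5, "State B": 4, "State C": 3}


def electoral_votes_by_party(votes, districts):
    """Award each state's electors to the party leading in that state."""
    # Sum the votes of each party within each state.
    totals = defaultdict(lambda: defaultdict(int))
    for state, party, count in votes:
        totals[state][party] += count
    # Assumed winner-take-all: electors = districts + 2 senate members.
    tally = defaultdict(int)
    for state, by_party in totals.items():
        winner = max(by_party, key=by_party.get)
        tally[winner] += districts[state] + 2
    return dict(tally)


tally = electoral_votes_by_party(PRIMARY_VOTES, DISTRICTS)
print(tally)
```

In the event of a tie within a state, `max` simply returns the first party it encounters, so a real analysis would need an explicit tie-breaking rule.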
Twitter sentiment analysis
All candidates
Hillary Clinton
Donald Trump
Performance of various tools utilized
We carried out similar analyses in both R and Python and, based on our data and skills, came to the following conclusions:
Parameter                            R        Python
Number of lines of code (average)    145      85
RAM usage                            88%      66%
Average processing time (minutes)    8-10     4-7
Ease of coding                       Easy     Moderate
Number of packages used              22-25    4-6
Conclusion
As per our analysis, the prediction depends mainly on how votes are cast in the swing states, along with the division of the votes of Ted Cruz of the Republican Party, as he has declined to endorse his Republican counterpart Donald Trump.