McMenamin Demo

LEGISLATOR PROGNOSTICATOR Brenton McMenamin


Can we predict what a congressional representative will do if elected?

Voting behavior ✓

Party interests

Regional interests

Financial interests

Proposed legislation

APP DEMO

http://www.legislator-prognosticator.xyz/

CLASSIFIER VALIDATION

[Figure: Area under the Curve (AUC), axis from 0.5 to 1, for each legislative topic under three models: Full Model, Money Only, Party Only. Topics: Firearms; Immigration; Energy & Oil; Housing; Diplomatic decrees; Spending Deficits; Program Budgets; Voting rights; Insurance; Higher education; Opposition to Mandates.]

Cross-validated performance of random forest classifiers indicates that we can predict involvement for 11 legislative topics
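
For readers unfamiliar with the validation metric, here is a minimal sketch of how AUC can be computed from classifier scores. It is not the code behind the demo; the function name `auc` and the toy labels and scores are made up for illustration. AUC is the probability that a randomly chosen positive example gets a higher score than a randomly chosen negative one, which is why 0.5 corresponds to chance.

```python
def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): 1 = legislator was involved in the topic, 0 = not involved.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = auc(labels, scores)
```

In a cross-validated setting, the scores would come from a classifier that never saw the held-out legislators during training.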

I used to do neuroimaging to study how brains learn and represent information.

Now I’m looking for a job where I force machines to learn and represent information.

This is my brain:

Take note of the particularly large “data-scientist lobe”

EXTRA CREDIT!

FINDING DONOR CLUSTERS

FEC records of >20 million political donations from 2010-2015

Donations from individuals were aggregated based on occupation, resulting in 55,000 different donation sources
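
The aggregation step could look roughly like the sketch below. The record layout (recipient, occupation, amount), the `normalize` helper, and the sample rows are all assumptions for illustration, not the actual FEC schema or the demo's code. Each politician ends up with a profile of donation totals per occupation.

```python
from collections import defaultdict

# Hypothetical records: (recipient, donor occupation, amount in dollars).
donations = [
    ("Rep. A", "Rancher", 500.0),
    ("Rep. A", "rancher ", 250.0),
    ("Rep. A", "Auto dealer", 1000.0),
    ("Rep. B", "Rancher", 100.0),
    ("Rep. B", "Tax Consultant", 300.0),
]


def normalize(occupation):
    """Lowercase and collapse whitespace so spelling variants share one key."""
    return " ".join(occupation.lower().split())


def aggregate_by_occupation(records):
    """Sum donation amounts per recipient and per normalized occupation."""
    totals = defaultdict(lambda: defaultdict(float))
    for recipient, occupation, amount in records:
        totals[recipient][normalize(occupation)] += amount
    return {recipient: dict(sources) for recipient, sources in totals.items()}


profiles = aggregate_by_occupation(donations)
```

The resulting per-politician profiles are the kind of input a clustering step would compare for donor similarity.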

Politicians were clustered based on donor similarity using GMM to minimize BIC

Optimal Bayesian Information Criterion (BIC) with 10 clusters
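
As a rough illustration of choosing the number of clusters by BIC, the sketch below scores candidate models with BIC = k ln(n) - 2 ln(L) and keeps the smallest value. The diagonal-covariance parameter count, the function names, and the log-likelihood numbers are assumptions made up for the example; they are not results from the demo.

```python
import math


def bic(log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion; lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def gmm_param_count(n_components, n_features):
    """Free parameters of a GMM with diagonal covariances (an assumed covariance type)."""
    means = n_components * n_features
    variances = n_components * n_features
    weights = n_components - 1  # mixture weights sum to 1
    return means + variances + weights


def best_component_count(fits, n_features, n_samples):
    """fits maps a component count to that fitted model's total log-likelihood."""
    scores = {
        k: bic(ll, gmm_param_count(k, n_features), n_samples)
        for k, ll in fits.items()
    }
    return min(scores, key=scores.get), scores


# Made-up log-likelihoods for illustration only.
toy_fits = {1: -1200.0, 2: -1000.0, 3: -990.0}
best_k, bic_scores = best_component_count(toy_fits, n_features=2, n_samples=500)
```

In the toy numbers, the third component improves the likelihood only slightly, so the parameter penalty makes two components the better choice.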

DONOR CLUSTER COMPOSITION

Agriculture: Cattleman, Sod grower, Rancher, President FOX TV station

Accounting: Sentinel Trust Co., Tax Consultant

Transportation & Energy: Auto dealer, Oil Consultant, Professor/Editor

Insurance: Insurance & Risk Consultant, Insurance & Financial Services, Zurlo Investment Trust

Executives: Financial communication, Retired ophthalmologist, Auditor, Professional board member

Small Business: Contract administration, Research attorney, Small business owner, Provider

Manufacturing & Big Business: Circuit designer, Oil consultant

Services & Self-employed: Building contractor, Developmental consultant, Dance Artist, Beverage & Food distribution

Natural resources & Land use: National Park Manager, Conservation Advocate/Author, Land Conservation

Environmental: Environmental educator, Timberland Investments, Library ass't, Renewable energy developer

FINDING TEXT CLUSTERS

Full text of >14,000 bills downloaded from the Library of Congress and subjected to Latent Semantic Indexing (LSI)

•  Tokenized with multi-word phrases

•  tf-idf transformed word counts

•  LSI placed each document in 300-dimensional “topic space”
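
To make the tf-idf step concrete, here is a small sketch over already-tokenized documents. It uses the plain idf form log(N / df), which is an assumption (libraries often use smoothed variants), and the toy documents and phrase tokens such as `background_check` are invented for illustration. The LSI step itself, a truncated SVD of the weighted matrix, is not shown.

```python
import math
from collections import Counter


def tf_idf(tokenized_docs):
    """Weight each term count by log(N / document frequency)."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weighted = []
    for tokens in tokenized_docs:
        counts = Counter(tokens)
        weighted.append(
            {term: count * math.log(n_docs / doc_freq[term])
             for term, count in counts.items()}
        )
    return weighted


# Toy documents (hypothetical); multi-word phrases are joined into single tokens.
docs = [
    ["background_check", "firearm", "firearm", "act"],
    ["student_loan", "higher_education", "act"],
    ["firearm", "dealer", "license", "act"],
]
weights = tf_idf(docs)
```

A term that appears in every document, like "act" here, gets weight zero, so it cannot drive the topic structure.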

Documents were clustered in the high-dimensional topic space using a Dirichlet Process GMM to identify the primary legislative topics

TOPICS


DONORS