McMenamin Demo
Uploaded by: brenton-mcmenamin
Category: Data & Analytics
Transcript of McMenamin Demo
CLASSIFIER VALIDATION
Cross-validated performance of random forest classifiers indicates that we can predict involvement for 11 legislative topics.
[Figure: Area under the Curve (AUC), axis from 0.5 to 1, for three models (Full Model, Money Only, Party Only) on each topic: Firearms, Immigration, Energy & Oil, Housing, Diplomatic decrees, Spending Deficits, Program Budgets, Voting rights, Insurance, Higher education, Opposition to Mandates]
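For readers unfamiliar with the metric, AUC is the probability that a randomly chosen positive example gets a higher score than a randomly chosen negative one, so 0.5 is chance level and 1 is perfect ranking. The following is a minimal sketch of that definition in plain Python, not the code behind the slide; the labels and scores are made-up toy values.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count (positive, negative) pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy data (illustrative only): 1 = politician involved in a topic, 0 = not.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = auc(labels, scores)  # 8 of the 9 pairs are ranked correctly
```

In a cross-validated setting, this quantity would be computed on held-out folds and averaged.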
I used to do neuroimaging to study how brains learn and represent information.
Now I’m looking for a job where I force machines to learn and represent information.
This is my brain:
Take note of the particularly large “data-scientist lobe”
FINDING DONOR CLUSTERS
FEC records from >20 million political donations from 2010-2015
Donations from individuals were aggregated based on occupation, resulting in 55,000 different donation sources
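The aggregation step can be sketched as follows. This is an illustration only: the record layout, the names `donations` and `aggregate_by_occupation`, and the whitespace and case normalization of occupation strings are all assumptions, since the slide does not say how occupations were matched.

```python
from collections import defaultdict

# Hypothetical records: (recipient, donor occupation, amount in dollars).
donations = [
    ("Politician A", "RANCHER", 500.0),
    ("Politician A", "rancher ", 250.0),
    ("Politician A", "Auto Dealer", 1000.0),
    ("Politician B", "Auto dealer", 300.0),
]

def aggregate_by_occupation(records):
    """Sum donations per recipient, treating each occupation as one source."""
    totals = defaultdict(lambda: defaultdict(float))
    for recipient, occupation, amount in records:
        # Assumed normalization: trim whitespace and ignore case.
        source = occupation.strip().lower()
        totals[recipient][source] += amount
    return {recipient: dict(sources) for recipient, sources in totals.items()}

profiles = aggregate_by_occupation(donations)
```

Each politician then has a vector of totals over donation sources, which is the kind of profile that can be compared across politicians.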
Politicians were clustered based on donor similarity using a Gaussian mixture model (GMM), with the number of clusters chosen to minimize the Bayesian Information Criterion (BIC)
BIC was optimal (lowest) with 10 clusters
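As a reminder of how BIC-based selection works, BIC = k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of observations, and L the maximized likelihood; lower is better. The sketch below picks the cluster count with the lowest BIC. The log-likelihoods and parameter counts are made-up toy numbers, not values from this analysis, and the helper names are hypothetical.

```python
import math

def bic(log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood

def best_by_bic(candidates, n_samples):
    """candidates maps cluster count -> (log-likelihood, number of parameters)."""
    scores = {k: bic(ll, p, n_samples) for k, (ll, p) in candidates.items()}
    return min(scores, key=scores.get), scores

# Toy fits (illustrative only): more clusters fit better but cost more parameters.
candidates = {1: (-520.0, 5), 2: (-450.0, 11), 3: (-445.0, 17)}
best_k, scores = best_by_bic(candidates, n_samples=100)
```

Here the third cluster improves the likelihood only slightly, so its parameter penalty outweighs the gain and two clusters are preferred.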
DONOR CLUSTER COMPOSITION
Agriculture: Cattleman, Sod grower, Rancher, President FOX TV station
Accounting: Sentinel Trust Co., Tax Consultant
Transportation & Energy: Auto dealer, Oil Consultant, Professor/Editor
Insurance: Insurance & Risk Consultant, Insurance & Financial Services, Zurlo Investment Trust
Executives: Financial communication, Retired ophthalmologist, Auditor, Professional board member
Small Business: Contract administration, Research attorney, Small business owner, Provider
Manufacturing & Big Business: Circuit designer, Oil consultant
Services & Self-employed: Building contractor, Developmental consultant, Dance Artist, Beverage & Food distribution
Natural resources & Land use: National Park Manager, Conservation Advocate/Author, Land Conservation
Environmental: Environmental educator, Timberland Investments, Library ass't, Renewable energy developer
FINDING TEXT CLUSTERS
Full text of >14,000 bills downloaded from the Library of Congress and subjected to Latent Semantic Indexing (LSI):
• Tokenized with multi-word phrases
• tf-idf transformed word counts
• LSI placed each document in 300-dimensional “topic space”
Documents were clustered in the high-dimensional topic space using a Dirichlet Process GMM to identify the primary legislative topics
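The tf-idf step in the pipeline above can be sketched in plain Python. tf-idf has several variants and the slide does not say which was used; this sketch assumes raw term counts multiplied by ln(N / df). The toy documents, and the convention of joining multi-word phrases with underscores, are illustrative assumptions.

```python
import math
from collections import Counter

def tfidf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        counts = Counter(doc)
        weighted.append(
            {term: count * math.log(n_docs / df[term]) for term, count in counts.items()}
        )
    return weighted

# Toy "bills", already tokenized with multi-word phrases as single tokens.
docs = [
    ["voting_rights", "act", "amendment"],
    ["higher_education", "act", "funding"],
    ["voting_rights", "funding", "act"],
]
weights = tfidf(docs)
```

A term that appears in every document, such as "act" here, gets weight zero, while terms specific to a few documents are weighted up. LSI would then apply a truncated singular value decomposition to the resulting document-term matrix to obtain the low-dimensional topic space.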