Data for Deans
An HLC 2015 Annual Meeting Conversation
James Kulich
Vice President and Chief Information Officer
Elmhurst College
Studies compiled by the CDC show that when states privatize the sale of an alcoholic beverage, consumption of that beverage grows by about 48 percent. Most of those studies looked at wine, not hard liquor.
"Problem drinking could increase as much as 48 percent," said Mark Nelson, the Cowlitz County Sheriff in another TV commercial.
Source: Linda Thomas, MyNorthWest.com, July 16, 2013
State of Washington Initiative 1183
Source: Melissa Allison, Taking a Closer Look at Liquor Initiative I-1183, Seattle Times, October 22, 2011
What’s the Story?
Some Considerations
• Are the data accurate?
• Did data reporting change?
• Is the change significant in a statistical sense?
• Was the decline in arrests widespread or is it greatest among those who drink hard liquor?
• Did higher taxes result in decreased alcohol consumption?
• Did DUI penalties change?
• Are people driving less in general (maybe higher gas prices)?
• Were there fewer arrests because there were fewer police?
• Do people now drive much shorter distances to buy hard liquor?
Source: Vincent Granville, Developing Analytic Talent
Food for Thought
Topic One: Data Basics
Question
How many faculty do you have?
Good data hygiene
• How were the data collected?
  • By what methods, by whom, and with what standards?
  • Over what time frames?
  • For what purposes?
  • Experimental design or observational?
• What are the possible sources of bias?
  • Human
  • Methodological (Alumni employment survey)
Data hygiene issues
• Missing values
• Duplicate records
• Clerical errors
• Data formatting
• Data scaling (normalizing)
• Junk data
• Outliers
Source: Philipp K. Janert, Data Analysis with Open Source Tools
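A few of the hygiene issues listed above can be checked mechanically. The following is a minimal sketch, not a complete cleaning routine. The student records, the field names, and the assumed valid GPA range of 0.0 to 4.0 are all invented for illustration.

```python
# Hypothetical student records, invented for illustration only.
records = [
    {"id": 1, "name": "Ada", "gpa": 3.4},
    {"id": 2, "name": "Ben", "gpa": None},   # missing value
    {"id": 3, "name": "Cy", "gpa": 2.9},
    {"id": 3, "name": "Cy", "gpa": 2.9},     # duplicate record
    {"id": 4, "name": "Di", "gpa": 31.0},    # possible clerical error
    {"id": 5, "name": "Eli", "gpa": 3.7},
]

VALID_GPA = (0.0, 4.0)  # assumed valid range for this example


def missing_values(rows, field):
    """Return the ids of rows where the field has no value."""
    return [row["id"] for row in rows if row[field] is None]


def duplicate_ids(rows):
    """Return each id that appears more than once, in first-seen order."""
    seen, duplicates = set(), []
    for row in rows:
        if row["id"] in seen and row["id"] not in duplicates:
            duplicates.append(row["id"])
        seen.add(row["id"])
    return duplicates


def out_of_range(rows, field, low, high):
    """Return the ids of rows whose value falls outside [low, high]."""
    return [
        row["id"]
        for row in rows
        if row[field] is not None and not (low <= row[field] <= high)
    ]


print("Missing GPA:", missing_values(records, "gpa"))
print("Duplicate ids:", duplicate_ids(records))
print("Out-of-range GPA:", out_of_range(records, "gpa", *VALID_GPA))
```

A range check only flags a value for review. Deciding whether 31.0 was meant to be 3.1 still takes a person who knows how the data were collected.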
A key question to ask
What precisely do the data represent?
Get a good data dictionary.
Absolute and relative data
Percentages: use with care.
The College’s first-year to second year retention rate for 2011-2012 is 78%. While this is a 2% decrease from 2010-2011, we are pleased that the retention rate did not dip even more due to the rather significant increase of numbers of students in three at-risk categories. Specifically worth noting is that compared to the fall 2010 entering class, the fall 2011 entering class (the subject of these retention data) saw an increase of 296% in Latino students, a 94% increase in African American students and a 15% increase in first generation college students. As you know, all three groups of students are considered at-risk in terms of retention and timely graduation.
[Figure: two bar charts comparing Group 1 and Group 2 on a vertical axis from 0 to 2000, one titled "296% Growth" and the other titled "94% Growth"]
Key question to ask
Percentage of what?
No naked percentages.
Understand the baseline in absolute terms.
Visualize!
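A short sketch of why the baseline matters. The head counts below are hypothetical, chosen only so that they reproduce growth figures of 296% and 94%. The 80% prior-year retention rate is likewise an assumption, made by reading "a 2% decrease" to 78% as two percentage points.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    if old == 0:
        raise ValueError("percent change is undefined for a zero baseline")
    return (new - old) * 100 / old


# Hypothetical entering-class head counts (fall 2010, fall 2011).
small_group = (25, 99)
large_group = (500, 970)

for label, (old, new) in [("small group", small_group), ("large group", large_group)]:
    print(f"{label}: {old} -> {new} students, "
          f"+{new - old} in absolute terms, {percent_change(old, new):.0f}% growth")

# A drop from an assumed 80% to 78% is 2 percentage points,
# which is a 2.5% relative decrease.
print(f"retention: {78 - 80} points, {percent_change(80, 78)}% relative change")
```

In this made-up case the group with 94% growth adds 470 students, while the group with 296% growth adds 74.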
Food for Thought
Topic Two: Averages and Comparisons
Nobody is average (or at least thinks they are).
The College’s average first to second year retention rate over the past ten years is 81%.
Identify the student who was 81% retained.
Averages are never about individuals or individual data.
Averages are always about groups or groups of data.
Many different averages
Mean: Add all values and divide by the count (sensitive to outliers)
Median: The middle value (good to use when you have significant outliers)
Mode: The most common value (if there is one)
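The three averages can disagree sharply when one value is extreme. A minimal sketch using Python's standard `statistics` module follows. The salary figures (in thousands of dollars) are hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical salaries in thousands of dollars; one value is an outlier.
salaries = [48, 50, 50, 52, 55, 58, 300]

print(f"mean:   {mean(salaries):.1f}")   # pulled upward by the 300
print(f"median: {median(salaries)}")     # the middle value of the sorted list
print(f"mode:   {mode(salaries)}")       # the most common value
```

Here six of the seven values sit below the mean, which is why the median is often the safer summary when outliers are present.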
Source: http://thenormalgenius.blogspot.com
Averages and comparisons
Two plus two always equals four.
Two plus two can equal 4.395 in the world of data comparisons. At least you sometimes can’t tell them apart.
The College’s average first to second year retention rate over the past ten years is 81%. The average retention rate for the first five of these years is 82% and the average retention rate for the second five of these years is 80%. So, retention has dropped by 2%.
Or has it?
Answer One: Of course, as specific data.
Answer Two: Maybe, if these averages are intended to represent larger populations that have natural variation.
[Figure: two panels, each comparing values centered at 80 and 82. Panel captions: "Evidence there is a difference" and "Insufficient evidence to claim a difference"]
Key question to ask
Behind every average lie actual data with real variation.
What is the distribution? How are the data arranged?
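One way to see how variation changes the answer is an exact permutation test, sketched below. This technique is offered here as an illustration and is not taken from the presentation. Both sets of yearly retention rates are hypothetical, built so that each has five-year averages of 82 and 80.

```python
from itertools import combinations
from statistics import fmean


def permutation_p_value(group_a, group_b):
    """Share of all regroupings whose mean difference is at least as large
    as the observed one (two-sided exact permutation test)."""
    pooled = list(group_a) + list(group_b)
    indices = range(len(pooled))
    observed = abs(fmean(group_a) - fmean(group_b))
    extreme = total = 0
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(fmean(a) - fmean(b)) >= observed - 1e-9:
            extreme += 1
    return extreme / total


# Hypothetical yearly rates: same averages, very different variation.
tight_first, tight_second = [82.2, 81.8, 82.1, 81.9, 82.0], [80.1, 79.9, 80.0, 80.2, 79.8]
noisy_first, noisy_second = [88, 76, 85, 79, 82], [74, 86, 80, 83, 77]

p_tight = permutation_p_value(tight_first, tight_second)
p_noisy = permutation_p_value(noisy_first, noisy_second)
print(f"little variation: p = {p_tight:.3f}")
print(f"lots of variation: p = {p_noisy:.3f}")
```

With little year-to-year variation, a two-point gap is hard to produce by chance. With a lot of variation, the same gap appears in many random regroupings, so the averages alone cannot settle the question.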
Comparison Groups
• First decide, with good precision, what you wish to compare.
• If you have a natural base for comparison (student recruiting among schools with whom you really compete), use that base to make a comparison for that purpose.
• Explore using large groups with similar features.
• Use small groups to go into depth regarding a specific issue.
• One size never fits all.
Food for Thought
Topic Three: Relationships, Trends and Predictions
Related but different concepts
Correlation: to what extent do two quantities relate or travel together?
Absolutely key point: Just because A and B are correlated, you can never immediately conclude that A causes B. In fact, B may cause A. Or A and B may both be driven by some unknown factor C.
Classification and regression: how can trends be identified and used to predict the future?
Key questions: With what probability and with what impact or value?
Anscombe’s Quartet
Source: Wikipedia
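The point of the quartet can be checked numerically. The sketch below uses the standard published values of Anscombe's four datasets and a hand-written Pearson correlation helper (the function name `pearson` is my own).

```python
from math import sqrt
from statistics import fmean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# Mean of y and correlation of x with y, per dataset.
summary = {
    name: (fmean(ys), pearson(xs, ys)) for name, (xs, ys) in quartet.items()
}
for name, (mean_y, r) in summary.items():
    print(f"dataset {name}: mean of y = {mean_y:.2f}, correlation = {r:.3f}")
```

The summary numbers are nearly identical across the four datasets, yet their scatter plots look nothing alike, which is the case for plotting before trusting a single statistic.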
How are predictive models built?
Training data:
Known features or attributes
Known values of a target variable
Build the model
Test the model
Apply the model
Evaluate and iterate
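The steps above can be sketched with a deliberately tiny model. Everything here is hypothetical: the records pair a high-school GPA (the feature) with whether the student was retained (the target), and the "model" is just a single GPA cutoff. Real projects would use far more data and a proper modeling library.

```python
# Hypothetical training data: (high-school GPA, retained to second year: 1 or 0).
training = [(2.1, 0), (2.4, 0), (2.6, 0), (2.9, 1), (3.0, 0), (3.2, 1), (3.5, 1), (3.8, 1)]
# Hypothetical held-out records the model never sees while being built.
holdout = [(2.3, 0), (2.8, 0), (3.1, 1), (3.6, 1), (2.7, 1)]


def predict(gpa, threshold):
    """Apply the model: predict retention when GPA meets the threshold."""
    return 1 if gpa >= threshold else 0


def accuracy(rows, threshold):
    """Share of rows where the prediction matches the known target."""
    return sum(predict(gpa, threshold) == label for gpa, label in rows) / len(rows)


def fit_threshold(rows):
    """Build the model: pick the observed GPA that best separates the training data."""
    candidates = sorted(gpa for gpa, _ in rows)
    return max(candidates, key=lambda t: accuracy(rows, t))


threshold = fit_threshold(training)           # build
train_accuracy = accuracy(training, threshold)
test_accuracy = accuracy(holdout, threshold)  # test on unseen data
print(f"threshold = {threshold}, training accuracy = {train_accuracy}, "
      f"holdout accuracy = {test_accuracy}")
```

Accuracy on the held-out records is usually lower than on the training data, which is why the test step comes before the model is applied.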
Ingredients in good model building
• Precise problem and target variable definition
• Careful data evaluation and preparation
• Investment in data
• Variables: given and constructed
• Modeling approach
• Test large numbers of scenarios quickly
This is a team game!
Source: Provost and Fawcett, Data Science for Business
Model evaluation: What is a good model?
• Accuracy – maybe
• Key framework: expected value
  • Impact: costs and benefits
Technical considerations
• ROC curve: Compare true positives and false positives
• Lift curve: Compare use of model with random results
Source: Provost and Fawcett, Data Science for Business
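A minimal sketch of the expected value framework, under invented numbers. Assume a model that flags students for advising outreach. The outcome counts per 1,000 students and the dollar values attached to each outcome are hypothetical, as are the function names.

```python
def accuracy(counts):
    """Share of students the approach classifies correctly."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


def expected_value_per_student(counts, values):
    """Average benefit or cost per student, weighting each outcome by its value."""
    total = sum(counts.values())
    return sum(counts[outcome] * values[outcome] for outcome in counts) / total


# Hypothetical dollar value of each outcome.
values = {
    "tp": 2000,   # at-risk student flagged and helped
    "fp": -200,   # outreach spent on a student who did not need it
    "fn": -3000,  # at-risk student missed
    "tn": 0,      # student correctly left alone
}

# Hypothetical outcome counts per 1,000 students.
model = {"tp": 150, "fp": 100, "fn": 50, "tn": 700}
flag_nobody = {"tp": 0, "fp": 0, "fn": 200, "tn": 800}

for name, counts in [("model", model), ("flag nobody", flag_nobody)]:
    print(f"{name}: accuracy = {accuracy(counts):.2f}, "
          f"expected value = {expected_value_per_student(counts, values):.0f} per student")
```

In this made-up case the two approaches differ by only five points of accuracy but by hundreds of dollars per student, which is why accuracy alone earns a "maybe".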
Lift: Compare model with random results
Source: Provost and Fawcett, Data Science for Business
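Lift can be computed directly once cases are ranked by model score. In this sketch the scores and outcomes are hypothetical, and `lift_at` is an illustrative helper, not a library function.

```python
def lift_at(scored, fraction):
    """Positive rate among the top-scored fraction, divided by the overall
    positive rate (what random selection would be expected to find)."""
    ranked = sorted(scored, key=lambda row: row[0], reverse=True)
    k = max(1, round(fraction * len(ranked)))
    positives_in_top = sum(label for _, label in ranked[:k])
    total_positives = sum(label for _, label in ranked)
    return (positives_in_top * len(ranked)) / (k * total_positives)


# Hypothetical (model score, actual outcome) pairs; 1 marks a positive case.
scored = [
    (0.95, 1), (0.90, 1), (0.80, 0), (0.75, 1), (0.60, 0),
    (0.55, 0), (0.40, 1), (0.30, 0), (0.20, 0), (0.10, 0),
]

for fraction in (0.2, 0.5, 1.0):
    print(f"top {fraction:.0%}: lift = {lift_at(scored, fraction):.2f}")
```

A lift of 1.0 means the model does no better than picking at random. Lift always returns to 1.0 once the whole population is included.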
Closing Thoughts
Data do a great job of suggesting questions.
Data can provide evidence to support conclusions, but people make decisions.
Visualize, visualize, visualize.
Data, in old and new forms, will continue to grow exponentially in volume, velocity, variety, and importance in our worlds.
All models are wrong.
Some are useful.
George E.P. Box