for Deans · The College’s first-year to second year retention rate for 2011-2012 is 78%. While...

Data for Deans An HLC 2015 Annual Meeting Conversation James Kulich Vice President and Chief Information Officer Elmhurst College



Data for Deans

An HLC 2015 Annual Meeting Conversation

James Kulich
Vice President and Chief Information Officer

Elmhurst College


State of Washington Initiative 1183

Studies compiled by the CDC show that when states privatize the sale of an alcoholic beverage, consumption of that beverage grows by about 48 percent. Most of those studies looked at wine, not hard liquor.

"Problem drinking could increase as much as 48 percent," said Mark Nelson, the Cowlitz County Sheriff in another TV commercial.

Source: Linda Thomas, MyNorthWest.com, July 16, 2013

Source: Melissa Allison, Taking a Closer Look at Liquor Initiative I-1183, Seattle Times, October 22, 2011


What’s the Story?


Some Considerations

• Are the data accurate?
• Did data reporting change?
• Is the change significant in a statistical sense?
• Was the decline in arrests widespread or is it greatest among those who drink hard liquor?
• Did higher taxes result in decreased alcohol consumption?
• Did DUI penalties change?
• Are people driving less in general (maybe higher gas prices)?
• Were there fewer arrests because there were fewer police?
• Do people now drive much shorter distances to buy hard liquor?

Source: Vincent Granville, Developing Analytic Talent



Food for Thought

Topic One: Data Basics


Question

How many faculty do you have?


Good data hygiene

• How were the data collected?
  • By what methods, by whom, and with what standards?
  • Over what time frames?
  • For what purposes?
  • Experimental design or observational?

• What are the possible sources of bias?
  • Human
  • Methodological (Alumni employment survey)


Data hygiene issues

• Missing values
• Duplicate records
• Clerical errors

• Data formatting
• Data scaling (normalizing)

• Junk data
• Outliers

Source: Philipp K. Janert, Data Analysis with Open Source Tools
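A few of these checks can be sketched in plain Python. The records, field names, and the plausible credit range below are hypothetical, made up for illustration only.

```python
# Hypothetical student records; ids, fields, and values are invented.
records = [
    {"id": 1, "gpa": 3.4, "credits": 15},
    {"id": 2, "gpa": None, "credits": 12},   # missing value
    {"id": 3, "gpa": 2.9, "credits": 16},
    {"id": 3, "gpa": 2.9, "credits": 16},    # duplicate record
    {"id": 4, "gpa": 3.8, "credits": 160},   # likely clerical error
]


def count_missing(rows, field):
    """Count rows where the field is absent or None."""
    return sum(1 for r in rows if r.get(field) is None)


def find_duplicate_ids(rows):
    """Return the ids that appear more than once."""
    seen, dupes = set(), set()
    for r in rows:
        if r["id"] in seen:
            dupes.add(r["id"])
        seen.add(r["id"])
    return sorted(dupes)


def out_of_range(rows, field, low, high):
    """Return ids whose value falls outside an assumed plausible range."""
    return [r["id"] for r in rows
            if r.get(field) is not None and not (low <= r[field] <= high)]


print("missing GPA values:", count_missing(records, "gpa"))
print("duplicate ids:", find_duplicate_ids(records))
print("suspicious credit loads:", out_of_range(records, "credits", 0, 24))
```

A flagged value is a prompt to check the source, not proof of an error.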


A key question to ask

What precisely do the data represent?

Get a good data dictionary.


Absolute and relative data

Percentages: use with care.


The College’s first-year to second year retention rate for 2011-2012 is 78%. While this is a 2% decrease from 2010-2011, we are pleased that the retention rate did not dip even more due to the rather significant increase in the numbers of students in three at-risk categories. Specifically worth noting is that compared to the fall 2010 entering class, the fall 2011 entering class (the subject of these retention data) saw an increase of 296% in Latino students, a 94% increase in African American students and a 15% increase in first generation college students. As you know, all three groups of students are considered at-risk in terms of retention and timely graduation.


[Figure: two bar charts, each comparing Group 1 and Group 2 on a vertical axis from 0 to 2000. Panel titles: "296% Growth" and "94% Growth".]


Key question to ask

Percentage of what?

No naked percentages.

Understand the baseline in absolute terms.

Visualize!
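The same percentage can describe very different absolute changes. A minimal sketch, using hypothetical head counts (the actual counts behind the 296% figure are not given here):

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("percent change is undefined for a zero baseline")
    return (after - before) / before * 100


# Hypothetical head counts: both pairs grow by 296%.
small_baseline = (25, 99)
large_baseline = (500, 1980)

for before, after in (small_baseline, large_baseline):
    growth = percent_change(before, after)
    print(f"{before} -> {after}: {growth:.0f}% growth, "
          f"{after - before} additional students")
```

Reporting the baseline and the absolute change next to the percentage lets the reader judge how much weight the number deserves.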


Food for Thought

Topic Two: Averages and Comparisons


Nobody is average (or at least thinks they are).


The College’s average first to second year retention rate over the past ten years is 81%.

Identify the student who was 81% retained.


Averages are never about individuals or individual data.

Averages are always about groups or groups of data.


Many different averages

Mean: Add all values and divide by the count (sensitive to outliers)

Median: The middle value (good to use when you have significant outliers)

Mode: The most common value (if there is one)
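A short illustration of how the three averages can disagree, using Python's standard `statistics` module. The salary figures are hypothetical; the last value is a deliberate outlier.

```python
import statistics

# Hypothetical salaries in thousands of dollars; 250 is an outlier.
salaries = [52, 55, 55, 58, 61, 64, 250]

mean = statistics.mean(salaries)      # pulled upward by the outlier
median = statistics.median(salaries)  # middle value of the sorted list
mode = statistics.mode(salaries)      # most common value

print(f"mean={mean}, median={median}, mode={mode}")
```

Here the mean is 85 while the median is 58: one extreme value is enough to move the mean well away from what most of the group looks like.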


Source: http://thenormalgenius.blogspot.com


Averages and comparisons

Two plus two always equals four.

Two plus two can equal 4.395 in the world of data comparisons. At least you sometimes can’t tell them apart.


The College’s average first to second year retention rate over the past ten years is 81%. The average retention rate for the first five of these years is 82%, and the average retention rate for the second five is 80%. So, retention has dropped by 2%.

Or has it?


Answer One: Of course, as specific data.

Answer Two: Maybe, if these averages are intended to represent larger populations that have natural variation.


[Figure: two panels, each marking the values 80 and 82. One panel is labeled "Evidence there is a difference"; the other is labeled "Insufficient evidence to claim a difference".]


Key question to ask

Behind every average lie actual data with real variation.

What is the distribution? How are the data arranged?
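Whether 82 versus 80 is a real difference depends on the variation behind each average. The sketch below computes Welch's t statistic for two hypothetical sets of yearly retention rates. Both sets have five-year means of 82 and 80, and all of the yearly values are invented. Treating yearly rates as independent samples is itself an assumption.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    standard_error = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


# Hypothetical yearly retention rates (percent), little variation.
tight_first = [82, 81.5, 82.5, 82, 82]
tight_second = [80, 80.5, 79.5, 80, 80]

# Hypothetical yearly retention rates (percent), a lot of variation.
wide_first = [86, 78, 84, 79, 83]
wide_second = [76, 84, 79, 83, 78]

t_tight = welch_t(tight_first, tight_second)
t_wide = welch_t(wide_first, wide_second)
print(f"little variation: t = {t_tight:.2f}")
print(f"much variation:   t = {t_wide:.2f}")
```

With little variation the t statistic is large, which is evidence of a difference. With a lot of variation the same two-point gap gives a t statistic below 1, which is not enough to claim one.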


Comparison Groups

• First decide, with good precision, what you wish to compare.
• If you have a natural base for comparison (student recruiting among schools with whom you really compete), use that base to make a comparison for that purpose.
• Explore using large groups with similar features.
• Use small groups to go into depth regarding a specific issue.
• One size never fits all.


Food for Thought

Topic Three: Relationships, Trends and Predictions


Related but different concepts

Correlation: the extent to which two quantities relate or travel together.

Absolutely key point: Just because A and B are correlated, you can never immediately conclude that A causes B. In fact, B may cause A. Or, A and B may both be driven by some unknown factor C.
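The third case can be simulated. In this sketch, A and B never influence each other; both are built from a hidden factor C plus independent noise. The setup and the noise levels are assumptions chosen for illustration.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hidden factor C drives both A and B; A and B do not affect each other.
c = [rng.gauss(0, 1) for _ in range(2000)]
a = [ci + rng.gauss(0, 0.5) for ci in c]
b = [ci + rng.gauss(0, 0.5) for ci in c]

r_ab = pearson(a, b)

# Remove C's contribution: what is left of A and B is unrelated noise.
a_without_c = [ai - ci for ai, ci in zip(a, c)]
b_without_c = [bi - ci for bi, ci in zip(b, c)]
r_after = pearson(a_without_c, b_without_c)

print(f"correlation of A and B:           {r_ab:.2f}")
print(f"correlation after accounting for C: {r_after:.2f}")
```

A and B are strongly correlated, yet the correlation nearly vanishes once C is accounted for.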

Classification and regression: how can trends be identified and used to predict the future.

Key questions: With what probability and with what impact or value?


Anscombe’s Quartet

Source: Wikipedia
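The quartet's four data sets share nearly identical summary statistics but look completely different when plotted. The sketch below computes the summaries from the published Anscombe values; the helper name `pearson` is just for this example.

```python
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)


x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared by sets I to III
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}

summary = {
    name: (statistics.mean(x), statistics.mean(y), pearson(x, y))
    for name, (x, y) in quartet.items()
}

for name, (mean_x, mean_y, r) in summary.items():
    print(f"{name:>3}: mean x = {mean_x:.2f}, "
          f"mean y = {mean_y:.2f}, r = {r:.3f}")
```

The numbers alone cannot tell the four sets apart, which is the case for plotting the data before trusting a summary.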


How are predictive models built?

Training data:

Known features or attributes

Known values of a target variable

Build the model

Test the model

Apply the model

Evaluate and iterate
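The steps above can be sketched with a deliberately tiny model: one feature (GPA), one target (retained or not), and a single cutoff chosen from training data. The students are simulated, and the assumed link between GPA and retention is invented for this example.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable


def make_student():
    """Simulated student: a known feature (GPA) and a known target."""
    gpa = rng.uniform(1.5, 4.0)
    # Assumed for the simulation: higher GPA means higher retention odds.
    p_retained = min(0.95, max(0.05, (gpa - 1.5) / 2.5))
    return gpa, rng.random() < p_retained


# Training data and held-out test data.
students = [make_student() for _ in range(1000)]
train, test = students[:700], students[700:]


def accuracy(threshold, rows):
    """Share of rows where 'GPA >= threshold' matches the outcome."""
    hits = sum((gpa >= threshold) == retained for gpa, retained in rows)
    return hits / len(rows)


# Build the model: pick the cutoff that does best on the training data.
candidates = [t / 10 for t in range(15, 41)]
best = max(candidates, key=lambda t: accuracy(t, train))

# Test the model: score the chosen cutoff on data it has not seen.
train_accuracy = accuracy(best, train)
test_accuracy = accuracy(best, test)
print(f"cutoff {best}: train {train_accuracy:.2f}, test {test_accuracy:.2f}")


# Apply the model: predict for a new case.
def predict_retained(gpa):
    return gpa >= best


print("GPA 3.6 predicted retained:", predict_retained(3.6))
```

Evaluating on the held-out rows, not the training rows, is what tells you how the model is likely to behave on new students.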


Ingredients in good model building

• Precise problem and target variable definition
• Careful data evaluation and preparation
• Investment in data
• Variables: given and constructed
• Modeling approach
• Test large numbers of scenarios quickly

This is a team game!

Source: Provost and Fawcett, Data Science for Business


Model evaluation: What is a good model?

• Accuracy – maybe
• Key framework: expected value

Impact: costs and benefits

Technical considerations
• ROC curve: Compare true positives and false positives
• Lift curve: Compare use of model with random results

Source: Provost and Fawcett, Data Science for Business
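Expected value weights each possible outcome by how often it happens and what it is worth. A minimal sketch for a model that flags students as at risk of leaving; every rate and dollar value below is hypothetical.

```python
# Hypothetical outcome rates, keyed by (prediction, what actually happens).
# The four rates sum to 1.
rates = {
    ("flag", "leaves"): 0.12,     # true positive
    ("flag", "stays"): 0.08,      # false positive
    ("no flag", "leaves"): 0.06,  # false negative
    ("no flag", "stays"): 0.74,   # true negative
}

# Hypothetical value per student, in dollars, of acting on each outcome.
values = {
    ("flag", "leaves"): 4000,   # assumed net benefit of a timely outreach
    ("flag", "stays"): -500,    # outreach cost spent where it was not needed
    ("no flag", "leaves"): 0,   # nothing spent, nothing gained
    ("no flag", "stays"): 0,
}

accuracy = rates[("flag", "leaves")] + rates[("no flag", "stays")]
expected_value = sum(rates[cell] * values[cell] for cell in rates)

print(f"accuracy: {accuracy:.0%}")
print(f"expected value per student: ${expected_value:.0f}")
```

Two models with the same accuracy can have very different expected values, because the costs of the two kinds of error are rarely equal.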


Lift: Compare model with random results

Source: Provost and Fawcett, Data Science for Business
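Lift asks how much better the model's top-ranked cases are than a random selection of the same size. A small sketch with hypothetical scores and outcomes; the function name `lift_at` is made up for this example.

```python
def lift_at(scored_cases, fraction):
    """Lift for the top `fraction` of cases, ranked by model score.

    Each case is a (score, outcome) pair with outcome 1 or 0.
    Assumes at least one case has outcome 1.
    """
    ranked = sorted(scored_cases, key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    top_rate = sum(outcome for _, outcome in ranked[:top_n]) / top_n
    base_rate = sum(outcome for _, outcome in ranked) / len(ranked)
    return top_rate / base_rate


# Hypothetical (model score, student left) pairs; 1 means the student left.
cases = [
    (0.90, 1), (0.80, 1), (0.70, 0), (0.60, 1), (0.50, 0),
    (0.40, 0), (0.30, 1), (0.20, 0), (0.10, 0), (0.05, 0),
]

for fraction in (0.2, 0.5, 1.0):
    print(f"top {fraction:.0%}: lift = {lift_at(cases, fraction):.2f}")
```

A lift of 1 means the model does no better than choosing at random; the lift always returns to 1 when every case is selected.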


Closing Thoughts

Data do a great job of suggesting questions.

Data can provide evidence to support conclusions, but people make decisions.

Visualize, visualize, visualize.

Data, in old and new forms, will continue to grow exponentially in volume, velocity, variety, and importance in our worlds.


All models are wrong.

Some are useful.

George E.P. Box


Time for conversation

Thanks!
