Aggregate Data
![Page 1: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/1.jpg)
© Relay Graduate School of Education. All rights reserved. 1
AGGREGATE DATA
![Page 2: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/2.jpg)
Agenda and Objectives
AGENDA
• Descriptive statistics
• Dispersion
• Aggregate data
• The right questions and graphics
• Closing
OBJECTIVES
• Compare basic descriptive statistics and identify their limitations
• Describe common mistakes associated with analyzing "on average" data
• Explain the purpose of the Data Narrative analyses
• Evaluate research questions against criteria for quality
![Page 3: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/3.jpg)
In the last activity, we finished reviewing our tree of statistical terminology.
Now let’s apply that knowledge!
![Page 4: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/4.jpg)
Aggregated Data
![Page 5: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/5.jpg)
Entertainers vs. Athletes for Class #1
• Who did better? How much better?
• Should we be worried about the lower-performing group? Why or why not?
• If you were the principal, would you intervene on behalf of the lower-performing group? Is this teacher disfavoring entertainers?
![Page 6: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/6.jpg)
Click ahead when you’ve completed the appropriate section
of your Handout
![Page 7: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/7.jpg)
![Page 8: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/8.jpg)
Check Your Work
• Athletes, on average, performed about 25 percentage points higher than entertainers
• That overall average difference is misleading. Except for Charlie Sheen vs. Serena Williams, everybody else in the two groups performed similarly
![Page 9: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/9.jpg)
![Page 10: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/10.jpg)
Statistical finding vs. Interesting finding
Athletes performed better than entertainers. But the difference was really just because Charlie Sheen scored a 1 and Serena a 100.
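To see how one pair of extreme scores can create a large "on average" gap, here is a minimal sketch. Only the 1 and the 100 come from the slides; the other names, the other scores, and the group size of four are hypothetical, chosen so that everyone else performs identically across the two groups.

```python
from statistics import mean, median

# Hypothetical scores (percent correct). Only Charlie Sheen's 1 and
# Serena Williams's 100 come from the slides; everything else is made up.
entertainers = {"Charlie Sheen": 1, "Entertainer B": 85, "Entertainer C": 88, "Entertainer D": 90}
athletes = {"Serena Williams": 100, "Athlete B": 85, "Athlete C": 88, "Athlete D": 90}


def summarize(scores):
    """Return the mean and median of a name -> score mapping."""
    values = list(scores.values())
    return {"mean": mean(values), "median": median(values)}


ent = summarize(entertainers)
ath = summarize(athletes)

# Gap between the groups, measured two ways
mean_gap = ath["mean"] - ent["mean"]
median_gap = ath["median"] - ent["median"]

# Recompute the mean gap after setting aside the two extreme scores
ent_rest = [s for name, s in entertainers.items() if name != "Charlie Sheen"]
ath_rest = [s for name, s in athletes.items() if name != "Serena Williams"]
gap_without_extremes = mean(ath_rest) - mean(ent_rest)

print(f"Gap in means:            {mean_gap:.2f}")
print(f"Gap in medians:          {median_gap:.2f}")
print(f"Gap without the extremes: {gap_without_extremes:.2f}")
```

With these made-up numbers the mean gap is large, the median gap is small, and the gap disappears once the two extreme scores are set aside. Comparing the mean to the median, or looking at the individual scores, is one quick way to spot this situation.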
![Page 11: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/11.jpg)
Statistically significant vs. Practically significant
Not every statistical finding has a practical purpose. A number is just a number without context.
![Page 12: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/12.jpg)
Speaking of…“Statistical significance”
![Page 13: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/13.jpg)
Misuse of the Term “Statistically Significant”
The word "significant", in this sense, does not mean "large" or "important" as it does in the everyday use of the word.
http://xkcd.com/539/
![Page 14: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/14.jpg)
Statistical significance
The Data Narrative is NOT a test of statistical significance! It’s an exploration of your data. It’s a report and analysis. It is NOT statistical modeling.
![Page 15: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/15.jpg)
Correct Use of the Term “Statistically Significant”
• “Statistically significant,” in the statistical sense, describes a result that is unlikely to have occurred by chance alone, such as the outcome of a scientific experiment performed in a laboratory setting.
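One way to make "unlikely to have occurred by chance" concrete is a permutation test: shuffle the group labels every possible way and count how often the shuffled gap is at least as large as the one observed. The sketch below is an illustration only, not part of the Data Narrative; the scores and the function name `permutation_p_value` are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in means.

    Returns the share of all possible relabelings whose absolute gap
    in means is at least as large as the observed gap.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))

    extreme = 0
    total = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance guards against floating-point rounding
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical scores for two small groups
group_a = [90, 92, 95, 97]
group_b = [70, 72, 75, 78]

p_value = permutation_p_value(group_a, group_b)
print(f"p-value: {p_value:.4f}")  # 2 of the 70 possible splits are this extreme
```

A small p-value says only that chance alone rarely produces a gap this large. It does not say the gap is large or important, and this exhaustive approach is practical only for small groups.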
![Page 16: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/16.jpg)
CORRELATION ≠ CAUSATION
Unless you're using randomized trials and experimentation, in the statistical world, you cannot say that something caused something else. You can say that two things are 'related', or may be 'contributing factors', but not that one caused the other.
![Page 17: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/17.jpg)
Simpson’s Paradox: Why even an average of 1.5 years of growth is not necessarily good enough
![Page 18: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/18.jpg)
Example #1: Longitudinal SAT Verbal scores
Newspaper headlines: Average scores don’t improve!
[Chart: overall average SAT Verbal score, 1981 vs. 2002]
Behind the scenes: Scores increase within every racial subgroup
[Chart: average SAT Verbal score by group (White, Black/AfrAm, Asian, Hispanic/Latino, American Indian), 1981 vs. 2002]
![Page 19: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/19.jpg)
Describe the Paradoxical Nature of the Data
| Group | Average SAT Verbal 1981 | Average SAT Verbal 2002 |
| --- | --- | --- |
| White | 519 | 527 |
| Black/AfrAm | 412 | 431 |
| Asian | 474 | 501 |
| Hispanic/Latino | 438 | 446 |
| American Indian | 471 | 479 |
| Overall average | 504 | 504 |
![Page 20: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/20.jpg)
Click ahead when you’ve completed the appropriate section
of your Handout
![Page 21: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/21.jpg)
Check Your Work – What’s The Paradox?
| Group | Average SAT Verbal 1981 | Average SAT Verbal 2002 |
| --- | --- | --- |
| White | 519 | 527 |
| Black/AfrAm | 412 | 431 |
| Asian | 474 | 501 |
| Hispanic/Latino | 438 | 446 |
| American Indian | 471 | 479 |
| Overall average | 504 | 504 |
![Page 22: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/22.jpg)
Scores Increase By Subgroup But Hold Constant Overall
| Group | Average SAT Verbal 1981 | Average SAT Verbal 2002 |
| --- | --- | --- |
| White | 519 | 527 |
| Black/AfrAm | 412 | 431 |
| Asian | 474 | 501 |
| Hispanic/Latino | 438 | 446 |
| American Indian | 471 | 479 |
| Overall average | 504 | 504 |
![Page 23: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/23.jpg)
Why the Paradox?
Every subgroup increased their score. The percentage of test-takers in each group changed. Groups with lower average scores made up a larger share of test-takers in 2002, which offset the gains within each group.

| Group | Test Takers 1981 | Test Takers 2002 |
| --- | --- | --- |
| White | 85% | 65% |
| Black/AfrAm | 9% | 11% |
| Asian | 3% | 10% |
| Hispanic/Latino | 2% | 9% |
| American Indian | 1% | 1% |
| Other | – | 4% |
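The overall average is a weighted average of the subgroup averages, with each group's share of test-takers as its weight. The sketch below recomputes it from the rounded figures on the slides. Two assumptions are worth flagging: the "Other" group (4% in 2002) has no reported score, so it is left out and the remaining shares are renormalized, and because the shares are rounded the results land near, not exactly on, the reported 504.

```python
# Average SAT Verbal by group, as shown on the slides: (1981, 2002)
scores = {
    "White": (519, 527),
    "Black/AfrAm": (412, 431),
    "Asian": (474, 501),
    "Hispanic/Latino": (438, 446),
    "American Indian": (471, 479),
}

# Share of test-takers in percent, as shown on the slides.
# Assumption: "Other" (4% in 2002) is omitted because no score is reported.
shares_1981 = {"White": 85, "Black/AfrAm": 9, "Asian": 3, "Hispanic/Latino": 2, "American Indian": 1}
shares_2002 = {"White": 65, "Black/AfrAm": 11, "Asian": 10, "Hispanic/Latino": 9, "American Indian": 1}


def weighted_average(year_index, shares):
    """Weight each group's average score by its share of test-takers."""
    total_weight = sum(shares.values())
    weighted_sum = sum(scores[group][year_index] * weight for group, weight in shares.items())
    return weighted_sum / total_weight


avg_1981 = weighted_average(0, shares_1981)
avg_2002 = weighted_average(1, shares_2002)
gains = {group: after - before for group, (before, after) in scores.items()}

for group, gain in gains.items():
    print(f"{group:16s} +{gain}")
print(f"Overall 1981: {avg_1981:.1f}")
print(f"Overall 2002: {avg_2002:.1f}")
```

Every group gains between 8 and 27 points, yet the recomputed overall average barely moves, because the 2002 weights shift toward groups with lower averages.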
![Page 24: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/24.jpg)
![Page 25: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/25.jpg)
![Page 26: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/26.jpg)
![Page 27: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/27.jpg)
“A statistician can have his head in an oven and his feet in ice…
and he will say that, on average, he feels fine.”
Be wary of “on average”!
![Page 28: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/28.jpg)
UC Berkeley: Was Admissions Biased?
• In 1973, a lawsuit was filed against UC Berkeley for discrimination – overall, 44% of men were admitted and only 35% of women were admitted to all graduate programs
http://hoopedia.nba.com/index.php?title=Oski_the_Bear_California
![Page 29: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/29.jpg)
Nope, no bias. Women just applied to more competitive programs than men.
• In 1973, a lawsuit was filed against UC Berkeley for discrimination – overall, 44% of men were admitted and only 35% of women were admitted to all graduate programs
• It was discovered that women were applying at higher rates to more competitive programs (like law school, med school, etc.) and were therefore being rejected more often overall across programs
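The mechanism can be sketched with two programs. All counts below are hypothetical and are not the actual Berkeley figures; they are chosen only to show that women can be admitted at a higher rate within each program and still at a lower rate overall when most of them apply to the more competitive one.

```python
# Hypothetical counts, not the real Berkeley data: (applicants, admitted)
applications = {
    "Less competitive program": {"men": (800, 480), "women": (100, 65)},
    "More competitive program": {"men": (200, 40), "women": (900, 225)},
}


def admit_rate(applicants, admitted):
    """Share of applicants who were admitted."""
    return admitted / applicants


def overall_rate(group):
    """Admission rate for one group, pooled across all programs."""
    applicants = sum(program[group][0] for program in applications.values())
    admitted = sum(program[group][1] for program in applications.values())
    return admit_rate(applicants, admitted)


# Per-program rates: women are ahead in each program
for name, program in applications.items():
    men = admit_rate(*program["men"])
    women = admit_rate(*program["women"])
    print(f"{name}: men {men:.0%}, women {women:.0%}")

# Pooled rates: men are ahead overall
print(f"Overall: men {overall_rate('men'):.0%}, women {overall_rate('women'):.0%}")
```

The pooled rate hides where each group applied, which is why the program-by-program breakdown tells a different story than the headline percentages.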
GO BEARS!
![Page 30: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/30.jpg)
Other Examples of Simpson’s Paradox
COMPARING BATTING AVERAGES: WHO LOOKS LIKE THE BETTER PLAYER?
Each year Justice is better, but overall Jeter is better.
Why the paradox?

| Year | Derek Jeter | David Justice |
| --- | --- | --- |
| 1995 | .250 | .253 |
| 1996 | .314 | .321 |
| 95/96 | .310 | .270 |
![Page 31: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/31.jpg)
Other Examples of Simpson’s Paradox
COMPARING BATTING AVERAGES: WHO LOOKS LIKE THE BETTER PLAYER?

| Player | Statistic | 1995 | 1996 |
| --- | --- | --- | --- |
| Derek J | Average | .250 | .314 |
| | Hits/AB | 12/48 | 183/582 |
| David J | Average | .253 | .321 |
| | Hits/AB | 104/411 | 45/140 |

Why the paradox?

| Year | Derek J | David J |
| --- | --- | --- |
| 1995 | .250 | .253 |
| 1996 | .314 | .321 |
| 95/96 | .310 | .270 |
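The combined average is total hits divided by total at-bats, not the average of the two yearly averages. A short sketch using the hits and at-bats from the slide shows where the reversal comes from; the variable and function names are only illustrative.

```python
# (hits, at-bats) per season, as shown on the slide
records = {
    "Derek Jeter": {1995: (12, 48), 1996: (183, 582)},
    "David Justice": {1995: (104, 411), 1996: (45, 140)},
}


def batting_average(hits, at_bats):
    """Hits divided by at-bats."""
    return hits / at_bats


def combined_average(player):
    """Pool hits and at-bats across seasons before dividing."""
    hits = sum(h for h, _ in records[player].values())
    at_bats = sum(ab for _, ab in records[player].values())
    return batting_average(hits, at_bats)


for player, seasons in records.items():
    yearly = ", ".join(f"{year}: {batting_average(*line):.3f}" for year, line in seasons.items())
    print(f"{player:14s} {yearly}, combined: {combined_average(player):.3f}")
```

Most of Jeter's at-bats fall in 1996, his stronger year, while most of Justice's fall in 1995, his weaker year. The pooled averages are weighted by at-bats, so the seasons count very differently for each player.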
![Page 32: Aggregate Data](https://reader036.fdocuments.in/reader036/viewer/2022081605/58f07cf11a28abfd558b46cb/html5/thumbnails/32.jpg)
What’s the Lesson from Simpson’s Paradox?
DISAGGREGATE YOUR DATA! (AND TELL THE RIGHT STORY)