Lecture 1. Making Sense of Data: Data Variation David R. Merrell 90-786 Intermediate Empirical Methods for Public Policy and Management
  • Date posted: 20-Dec-2015

Lecture 1. Making Sense of Data: Data Variation

David R. Merrell
90-786 Intermediate Empirical Methods for Public Policy and Management

Making Sense of Data: Data Variation

Introductions
  • Instructor: David R. Merrell
  • TAs: Max Hernandez-Toso and Hao Xu

Course Content: USEFUL STATISTICS

Statistics is the use of data to reduce uncertainty about potential observations

Course Information
  • Web site:
    http://Duncan.heinz.cmu.edu/GeorgeWeb/
    Heinz 90-786 Front Page.htm
  • Data files:
    r:/academic/90786

Making Sense of Data
  • Motivation in management and policy
  • What is data?
  • What’s the use of data?
  • Data variation

Motivation for Statistical Input

  • Managerial Decision Making
    • Changes in societal or organizational conditions
    • Differences between observations and expectations
  • Policy Making
    • Impact of changing the system

What is Data?

  • Unit of analysis
  • Number of variables
    • one, two, more than two
  • Level of measurement / kind of data
    • Nominal, Ordinal, Interval

Unit of analysis

Focus of attention: a case that can be separately and uniquely identified
  • person (student, woman, tenant, ...)
  • place (city, street intersection, river, ...)
  • object (car, power plant, ...)
  • organization (school, corporation, ...)
  • incident (birth, election)
  • time period (day, season, year, ...)

Variables
  • Characteristics, attributes, and occurrences observed about each unit of analysis
  • Require a specific step-by-step procedure to obtain values for the variable

Examples

  • Driver's license application study
    • Unit of analysis: people who apply for a driver's license
    • Outcome variable: license issued or not
    • Other variables: applicant's age, sex, and race
  • Snowfall in Pittsburgh
    • Unit of analysis: snowstorms
    • Outcome variable: depth of the snowfall from each storm
    • Other variables: date of snowstorm, temperature
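
One way to picture the snowfall example: each record is one unit of analysis (a snowstorm) and each field is a variable. The following is a minimal Python sketch, not course material. The class name, field names, measurement units, and both records are hypothetical and made up for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Snowstorm:
    """One unit of analysis: a single snowstorm (hypothetical record layout)."""
    storm_date: date       # other variable
    temperature_f: float   # other variable (Fahrenheit is an assumption)
    depth_inches: float    # outcome variable (inches is an assumption)


# Made-up records, for illustration only; not real Pittsburgh data.
storms = [
    Snowstorm(date(1997, 1, 9), 28.0, 3.5),
    Snowstorm(date(1997, 2, 14), 22.0, 6.0),
]

# A variable is the same measurement taken on every unit of analysis.
depths = [s.depth_inches for s in storms]
```

Reading down one field across all records gives the data for one variable; reading across one record gives everything observed about one unit of analysis.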

Nominal data
  • Classifies outcomes by categories
  • Categories must be mutually exclusive and exhaustive
  • Examples: marital status, region of the country, religion, occupation, school district, place of birth, blood type
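
The "mutually exclusive and exhaustive" requirement can be checked mechanically: every observation must carry exactly one label, and that label must belong to the category set. Here is a minimal Python sketch using blood type. The four-category set, the function name `tally_nominal`, and the sample values are assumptions made for illustration.

```python
from collections import Counter

# Assumed category set for blood type (ignores Rh factor); hypothetical.
BLOOD_TYPES = {"A", "B", "AB", "O"}


def tally_nominal(observations, categories):
    """Count observations per category; reject labels outside the category set."""
    unknown = [x for x in observations if x not in categories]
    if unknown:
        # The categories fail to be exhaustive for this data.
        raise ValueError(f"categories are not exhaustive for: {unknown}")
    counts = Counter(observations)
    # Each observation has one label, so it is counted in exactly one category.
    return {c: counts.get(c, 0) for c in sorted(categories)}


sample = ["O", "A", "O", "B", "AB", "O"]  # made-up observations
counts = tally_nominal(sample, BLOOD_TYPES)
```

For nominal data, counts per category (and the most frequent category) are meaningful; ordering or averaging the labels is not.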

Ordinal data
  • Classifies outcomes by ranked categories
  • Examples:
    • Officers in the U.S. Army can be classified as:
        1 = general              5 = captain
        2 = colonel              6 = first lieutenant
        3 = lieutenant colonel   7 = second lieutenant
        4 = major
    • Education (highest diploma or degree attained)
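
With ordinal data the codes carry order but not distance, so comparing and sorting are meaningful while arithmetic on the codes is not. This is a minimal Python sketch using the rank codes listed above. The function name `outranks` and the list of officers are hypothetical.

```python
from statistics import median_low

# Rank codes as listed in the example: a lower code means a more senior rank.
ARMY_RANK = {
    "general": 1, "colonel": 2, "lieutenant colonel": 3, "major": 4,
    "captain": 5, "first lieutenant": 6, "second lieutenant": 7,
}
CODE_TO_RANK = {code: name for name, code in ARMY_RANK.items()}


def outranks(a, b):
    """True if rank a is senior to rank b."""
    return ARMY_RANK[a] < ARMY_RANK[b]


# Made-up group of officers, for illustration only.
officers = ["captain", "major", "second lieutenant", "colonel", "captain"]
codes = sorted(ARMY_RANK[r] for r in officers)

# The middle rank is meaningful for ordinal data; median_low always returns
# an actual observed code, never an average of two codes.
middle_rank = CODE_TO_RANK[median_low(codes)]
```

The sketch uses `median_low` rather than a mean because the gap between codes 1 and 2 need not equal the gap between codes 6 and 7.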

Interval data
  • Classifies outcomes on a continuous scale
  • Examples:
    • Scholastic Aptitude Test (SAT) score
    • Consumer Price Index (CPI)
    • Time of day

What’s the Use of Data?
  • Description
  • Evaluation
  • Estimation

Description
  • Summary of observations
  • In February, 1997 the M1A money supply in Taiwan rose 6.46% over February, 1996
  • Housing starts in June, 1996, rose to a seasonally adjusted rate of 1,480,000 units from a revised 1,461,000 in May

Evaluation
  • Comparison of observed state of affairs against expectations
  • Expectations are based on: ethical norms, managerial plans and budgets

Estimation
  • Uses observations to assess an attribute of a population or to predict future values.
  • A new charter school in Boston raised test scores an average of 7 percentile points. How would other charter schools do? How will this charter school do in the future?

Data Variation: Data Compression and Display
  • Boxplots
  • Five number summary
    • minimum
    • lower quartile point
    • median
    • upper quartile point
    • maximum
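
The five number summary can be computed directly from a sorted list of values. This is a minimal Python sketch using only the standard library. The function name and the sample values are hypothetical. Several quartile conventions exist, and the "inclusive" method used here is an assumption, so other software may report slightly different quartile points.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return minimum, lower quartile, median, upper quartile, and maximum."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two observations")
    # n=4 splits the data into quarters and returns the three cut points.
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    return {
        "minimum": data[0],
        "lower quartile": q1,
        "median": med,
        "upper quartile": q3,
        "maximum": data[-1],
    }


# Made-up values, for illustration only.
summary = five_number_summary([0.25, 0.21, 0.30, 0.27, 0.23])
```

A boxplot draws these same five values: the box spans the lower and upper quartile points, a line marks the median, and the whiskers reach toward the minimum and maximum.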

Batting Average of 263 major league baseball players

[Figure: plot of "Aver Career"; axis ticks from 0 to 0.4 in steps of 0.05]

Compressed Data Values
  Median              0.263
  Minimum             0.196
  Maximum             0.353
  Range               0.155
  Mode                0.250
  Mean                0.263
  Standard Deviation  0.023

Batting Average of 263 major league baseball players

[Figure: plot of "Aver Career"; axis ticks from 0 to 0.4 in steps of 0.05; annotated with the values below]
  Median   0.263
  Maximum  0.352
  Minimum  0.196

Next Time ...
  • Data Compression for One Variable