Dr. Michael R. Hyman, NMSU Data Preparation. 2 File, Record, and Field.

Posted on 13-Dec-2015

Dr. Michael R. Hyman, NMSU

Data Preparation

2

File, Record, and Field

3

Data Matrix
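As a minimal sketch of the file/record/field idea, a data matrix can be pictured as a list of rows: the whole list is the file, each row is one respondent's record, and each column is one field. The field names and values below are hypothetical, not taken from the slides.

```python
# Hypothetical survey file: each record (row) is one respondent,
# each field (column) is one coded variable.
fields = ["resp_id", "gender", "age_group", "satisfaction"]
records = [
    [1, 1, 3, 5],
    [2, 2, 2, 4],
    [3, 1, 4, 2],
]

def field_values(records, fields, name):
    """Return one column (field) of the data matrix."""
    col = fields.index(name)
    return [row[col] for row in records]

satisfaction = field_values(records, fields, "satisfaction")
```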

4

Data Entry

Process of transforming data from research projects to computers

5

Five Steps for Data Preparation

(1) Validation

(2) Editing

(3) Coding

(4) Data entry/transcription

(5) Machine cleaning of data

6

Validation

Check that interviews conducted as specified

• Ensure respondent qualified

• Interviewer looked/acted professionally

• Interview conducted in proper environment

• All appropriate questions asked

7

Editing: Personal Interviews

Check for:

• Omissions

• Ambiguities

• Inconsistencies

• Proper skip patterns

• Properly recorded answers, especially to open-ended questions
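Two of these editing checks, omissions and skip patterns, can be expressed as simple rules. The sketch below assumes a hypothetical questionnaire in which q2 is asked only when q1 is coded 1 (yes) and skipped when q1 is coded 2 (no); the question names and codes are illustrative only.

```python
# Hypothetical rule: q2 is asked only if q1 == 1 (yes) and skipped if q1 == 2 (no).
def edit_check(resp):
    """Return a list of editing problems found in one questionnaire."""
    problems = []
    if resp.get("q1") is None:
        problems.append("omission: q1")
    elif resp["q1"] == 1 and resp.get("q2") is None:
        problems.append("omission: q2 required when q1 == 1")
    elif resp["q1"] == 2 and resp.get("q2") is not None:
        problems.append("skip pattern violated: q2 answered when q1 == 2")
    return problems
```

A questionnaire that returns an empty list passes these two checks; ambiguities and badly recorded open-ended answers still need a human editor.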

8

Editing: Self-Administered Questionnaires

Check for:

• All questionnaire sections and key questions answered

• Respondents understood instructions and took task seriously

• No missing pages

• Questionnaire returned before cutoff date

9

Solutions for Editing Problems

• Re-contact respondent

• Discard questionnaire

• Use only good items

– Data analysis implications (beyond scope of class)

10

Coding

• Process of grouping and assigning numeric codes to different question responses

• Closed-ended questions easier because pre-coded

11

Pre-coding Example

12

Coding an Open-Ended Question

• Generate list of responses

• Consolidate responses (subjective judgment)

• Set response category codes

• Assign each individual response to a category and record the associated numeric code
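The last two steps can be sketched in code. The code book, the question, and the keyword lists below are hypothetical; keyword matching is only a stand-in for the coder's subjective judgment, not a method described in the slides.

```python
# Hypothetical code book for an open-ended question such as
# "Why did you choose this airline?"
CODE_BOOK = {1: "price", 2: "schedule", 3: "service", 9: "other"}

# Hypothetical keywords that stand in for the coder's judgment.
KEYWORDS = {
    1: ("cheap", "price", "fare"),
    2: ("time", "schedule"),
    3: ("staff", "service", "friendly"),
}

def code_response(text):
    """Assign one response to a category and return its numeric code."""
    lowered = text.lower()
    for code, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return code
    return 9  # falls into "other" when no category fits

codes = [code_response(r) for r in ["The fare was cheap", "Friendly staff"]]
```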

13

Portion of Travel Study Code Book

14

Data Entry Process

• Validated, edited, and coded questionnaires given to data entry operator

• More accurate and efficient to go directly from questionnaire to data entry device and storage medium

• Skip coding sheets

15

Data Transcription

16

Intelligent Data Entry

• Checking entered data for internal logic by either the data entry device or another connected device

• Excel/Quattro and SPSS rely on dumb data entry

• Require data cleaning
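A minimal sketch of the difference: an intelligent entry routine rejects an invalid code at the moment it is keyed, whereas dumb entry stores whatever is typed and leaves the error for later cleaning. The field names and valid codes are assumptions made for illustration.

```python
# Hypothetical valid codes per field (9 = missing).
VALID_CODES = {"gender": {1, 2, 9}, "satisfaction": {1, 2, 3, 4, 5, 9}}

def enter_value(record, field, raw):
    """Store a keyed value only if it is a valid code for the field."""
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{field}: {raw!r} is not a number") from None
    if value not in VALID_CODES[field]:
        raise ValueError(f"{field}: {value} is not a valid code")
    record[field] = value
    return record
```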

17

Machine Cleaning of Data

• Computerized error check

– Identifies and suggests fixes for logical errors

• Marginal report

– Computer-generated table of response frequencies for questions

– Monitor entry of valid codes and skip patterns
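A marginal report for one question can be sketched as a frequency count plus a list of codes that should not occur. The valid codes and the missing-data code below are hypothetical.

```python
from collections import Counter

VALID = {1, 2, 3, 4, 5}  # hypothetical valid codes for a five-point item
MISSING = 9              # hypothetical missing-data code

def marginal_report(values):
    """Return response frequencies and any codes outside the valid set."""
    counts = Counter(values)
    invalid = sorted(v for v in counts if v not in VALID and v != MISSING)
    return dict(sorted(counts.items())), invalid

freq, invalid = marginal_report([1, 2, 2, 5, 9, 7, 3, 2])
```

Here the code 7 appears once in the frequencies and is flagged as invalid, which points the editor back to the questionnaire that produced it.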

18

Machine Cleaning Instructions

19

Recoding Data

20

Recoding Data

• Using computers to convert original codes used for raw data into codes that are more suitable for analysis

• Var1 = 8 - Var1
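The recode Var1 = 8 - Var1 reverses an item's direction, assuming the item is scored on a 1 to 7 scale (1 becomes 7, 4 stays 4, 7 becomes 1). A minimal sketch, with a hypothetical helper name and sample values:

```python
def reverse_code(value, scale_max=7):
    """Reverse-score an item coded 1..scale_max (8 - value when scale_max is 7)."""
    if not 1 <= value <= scale_max:
        raise ValueError(f"{value} is outside 1..{scale_max}")
    return (scale_max + 1) - value

var1 = [1, 4, 7, 2]                      # hypothetical raw codes
var1 = [reverse_code(v) for v in var1]   # same as Var1 = 8 - Var1
```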

21

Collapsing a Five-Point Likert Scale
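One common way to collapse a five-point Likert item is to merge the two disagree codes and the two agree codes into three categories. The mapping below is an assumption for illustration and may differ from the one shown on the original slide.

```python
# Hypothetical mapping: 1-2 (disagree) -> 1, 3 (neutral) -> 2, 4-5 (agree) -> 3.
COLLAPSE = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}

responses = [5, 4, 3, 1, 2]              # hypothetical raw codes
collapsed = [COLLAPSE[v] for v in responses]
```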

22

Coping with Missing Data

23

24

Item Non-response to Questions of Fact

25

Ways to Handle Missing Responses

• Leave blank

• Case-wise deletion

• Pair-wise deletion

• Mean response

• Imputed response
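Three of these options can be compared on a small hypothetical data set in which a blank answer is stored as None. Case-wise deletion drops any respondent with a missing item, pair-wise deletion drops a respondent only from analyses that use the missing item, and mean response substitutes the item's observed mean. This is a sketch under those assumptions; model-based imputation is not shown.

```python
# Hypothetical responses; None marks a blank (missing) answer.
rows = [
    {"q1": 4, "q2": 5, "q3": 3},
    {"q1": None, "q2": 2, "q3": 4},
    {"q1": 2, "q2": None, "q3": 5},
    {"q1": 5, "q2": 4, "q3": 1},
]

def casewise(rows):
    """Keep only respondents who answered every item."""
    return [r for r in rows if all(v is not None for v in r.values())]

def pairwise(rows, a, b):
    """Keep respondents who answered both items used in one analysis."""
    return [(r[a], r[b]) for r in rows if r[a] is not None and r[b] is not None]

def mean_substitute(rows, field):
    """Replace missing answers to one item with that item's observed mean."""
    observed = [r[field] for r in rows if r[field] is not None]
    mean = sum(observed) / len(observed)
    return [r[field] if r[field] is not None else mean for r in rows]
```

With these data, case-wise deletion keeps two of four respondents, while a pair-wise analysis of q2 and q3 keeps three, which is why pair-wise deletion preserves more data at the cost of differing sample sizes across analyses.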