1 Research Methods Lecture 2 The dummies’ guide to STATA Wiji Arulampalam 18/10/2006.


Research Methods

Lecture 2

The dummies’ guide to STATA

Wiji Arulampalam, 18/10/2006


Econometrics Software

• You can use any software that does what you need

• See Timberlake for details of what does what well [www.timberlake.co.uk]

• PC Give is hard to beat for time series analysis
– Microfit, EViews are good alternatives
• STATA does (just about) everything.
• STATA (and everything else) is available as a delivered application on the network.


WHY STATA

• Need to know how to use STATA for
(i) Econometrics A [next term]
(ii) Econometrics B [this term]
(iii) Panel Data Econometrics [next term]

• E-Views demo will be given by the Econometrics tutors!

• The above two should be sufficient


STATA

• Hopefully you will have access by next week

• So full demo next week

• Stata command file wages.do and data file wages.dta on the module web page for you to practice


STATA

• Use STATA for:
– large survey datasets (merging them)
– complex nonlinear models (e.g. LDVs)
  • But see also LimDep
– nonparametric and evaluation methods
– you want to
  • continue studying economics
  • be a professional economist
  • learn something new
– you hate PC Give.


Some useful websites

• Stata’s own resources for learning STATA
– Stata website, Stata journal, Stata library, Statalist archive
– http://www.stata.com/links/resources1.html
• Michigan’s web-based guide to STATA (for SA)
• UCLA resources to help you learn and use STATA:
– http://www.ats.ucla.edu/stat/stata
– including movies and “web-books”


Accessing STATA

Available from your ‘Delivered Applications’

Double click on icon!

Wstata.exe


Buttons/Menu


Enter commands here


OR use the do editor to create a .do file


Results window

Better to save the output – more later


Click for Extensive Help OR

Type help in command line

help


For help on a specific command, type help followed by the command name (xxx) in the command line

help xxx

Exit, clear


Click and point in v9

Exit, clear Menu/tabs


Important features (1)

• NOTE
– Always use lowercase in STATA (commands and variable names are case-sensitive)
– Otherwise you can get very confused
• More
– --more-- in your output window means there is more output to come [press spacebar and the next page appears]
– The command set more off turns this off
• Not enough memory [so reset!]
– . set mem XXXm (allocates XXX MB of memory for data)
– . set matsize XXX (max matrix size XXX square)


Important features (2)

• To Break
– To stop anything, hit “break” (the menu button with the red cross, or press Ctrl and C simultaneously)


Using data on disk (1)

• Opening a dataset

– datasets need to be rectangular [variables in columns; observations in rows]
– Stata datasets have a .dta extension
– Will read Excel or text files
– Otherwise use Stat/Transfer to convert files in other formats to Stata files


Using data on disk (2)

• There are several ways of getting data into STATA, e.g. for wages.dta:
. use wages (or click: file/open on the menu bar)
. use lwage ed exp if fem==1 in 1/1000 using wages
(loads only the listed variables and the selected observations; the file name follows using)
. insheet using wages.csv (or .txt)
(imports an Excel csv file or a “text” file)
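The loading commands on this slide could be collected in a short do-file along the following lines. This is only a sketch: it assumes that wages.dta and wages.csv sit in the current working directory and contain the variables lwage, ed, exp and fem used in the lecture.

```stata
* Assumption: wages.dta and wages.csv are in the current working directory
clear

* Load the whole Stata dataset
use wages, clear

* Load three variables only, for fem==1 among the first 1000 observations
use lwage ed exp if fem==1 in 1/1000 using wages, clear

* Read a comma-separated text file instead (names taken from the first row)
insheet using wages.csv, clear
```

The clear option discards whatever is already in memory, so save any changes first. More recent Stata releases also provide import delimited for reading csv files.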


Opens the file

List of variables


Basic data reporting (1)

• . describe (or press F3 key)
– Lists the variable names and labels
• . describe using wages
– Lists the variable names etc. WITHOUT loading the data into memory (useful if the data is too big to fit)
• . codebook
– Tells you about the means, labels, missing values etc.


Basic data reporting (2)

• sort and count

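– . sort personid
• sorts data by personid
– . count if personid~=personid[_n-1]
• counts how many unique personids there are (after sorting, each new personid differs from the one in the previous observation)
• personid[_n-1] is the value of personid in the previous observation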
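One way to count distinct identifiers after sorting is to count the observations whose identifier differs from the one in the row before. The sketch below uses a small made-up dataset (the values and the wave variable are purely illustrative) so that it can be run without any external file.

```stata
* Toy data: five rows, three distinct (hypothetical) person identifiers
clear
input personid wave
3 1
1 1
3 2
2 1
1 2
end

* Put repeated identifiers next to each other
sort personid

* A row starts a new person when its id differs from the previous row.
* For the first row, personid[_n-1] is missing, so that row is counted too.
count if personid ~= personid[_n-1]
```

With these five rows the sorted identifiers are 1, 1, 2, 3, 3, so the count reported is 3.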


First look at the data (1)

• . list lwage ed exp in 1/10 if fem>=0
– Lists lwage, ed and exp for those of the first 10 observations for which fem≥0
• . tab fem union (or tabulate)
[variables should be integers]
– gives a crosstab of fem vs union


First look at the data (2)

• . summ fem union (or summarize or sum)
– means, std devs etc. for fem and union
• . corr ed exp in 1/100 if fem<1 (,cov)
– correlation coeffs (or covariances, with the cov option) for selected data
– . pwcorr ed exp lwage [does all pairwise corr coeffs]
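A first pass over a new dataset might string these reporting commands together as follows. This is a sketch that assumes wages.dta, with the variables lwage, ed, exp, fem and union, is in the working directory.

```stata
* Assumption: wages.dta is in the current working directory
use wages, clear

* Variable names, storage types and labels
describe

* Eyeball the first ten observations
list lwage ed exp in 1/10

* Cross-tabulation of two integer-coded variables
tabulate fem union

* Means, standard deviations, minima and maxima
summarize lwage ed exp

* Correlations, then covariances, using only complete observations
correlate lwage ed exp
correlate lwage ed exp, covariance

* Pairwise correlations, each using all observations available for that pair
pwcorr lwage ed exp
```

The difference between correlate and pwcorr only matters when some variables have missing values.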


Tabulating (1)

• tab x1 x2 if x4==0, sum(x3)

– gives the means of x3 for each cell of the x1 vs x2 crosstabulation for observations where x4=0

• tab x1 x2, missing

– Includes the missing values

• tab x1 x2, nolabel

– Uses numeric codes instead of labels

– Eg “1” instead of “NorthWest” etc


Tabulating (2)

• tab x1 x2, col
– Gives column percentages in addition to the counts (add nofreq to suppress the counts)
– Can get row percentages by using row instead
– Or both by using row col
• table educ ethnic, c(mean wage) row col
– Customises the table so it includes the mean (or median or max or count or sd ….) of wage by cells
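The tabulate options from these two slides can be tried out as in the sketch below, which again assumes that wages.dta (with lwage, fem and union) is in the working directory.

```stata
* Assumption: wages.dta is in the current working directory
use wages, clear

* Counts plus row and column percentages in each cell
tabulate fem union, row col

* Column percentages only: nofreq suppresses the counts
tabulate fem union, col nofreq

* Mean, standard deviation and frequency of lwage in each cell
tabulate fem union, summarize(lwage)

* Show missing values as a category of their own
tabulate fem union, missing
```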


Labelling

• Always have your data comprehensively labelled
. label data "This is pooled GHS 90-99"
. label variable region "region"
. lab define reglab 0 "North" 1 "South" 2 "Middle"
. lab values region reglab
• Tedious to do for lots of variables
– but then your output will be intelligibly labelled
– other people will be able to understand it in future
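The labelling steps can be seen end to end on a tiny made-up dataset. In this sketch the data values, the wage variable and the label texts are illustrative assumptions; only the label commands themselves come from the slide.

```stata
* Toy data: region codes and a hypothetical wage variable
clear
input region wage
0 10.5
1 12.0
2 9.8
1 11.2
end

* Label the dataset and the variables
label data "Toy regional wage data"
label variable region "Region of residence"
label variable wage "Hourly wage"

* Define a set of value labels, then attach it to region
label define reglab 0 "North" 1 "South" 2 "Middle"
label values region reglab

* Output now shows North/South/Middle; nolabel shows the codes again
tabulate region
tabulate region, nolabel
```

Note that label define only creates the mapping; nothing changes in the output until label values attaches it to a variable.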


Data manipulation (1)

• Data can be renamed, recoded, and transformed
• Command: generate (or gen for short)
. gen logrw=log((earn/hours)/rpi)
. gen agesq=age^2 (squares)
. gen region1=(region==1) (1 if true, 0 if not)
. gen ylagged=y[_n-1]
(_n is the obs # in STATA)


Data manipulation (2)

• Commands recode, replace and egen:
. recode x1 (.=0) (1/5=1) (. is missing value (mv))
. replace rate=rate/100
. replace age=25 if age==250
. egen meaninc=mean(income), by(region)
(see help egen for details)
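The data manipulation commands can be tried together on a small made-up dataset. Everything about the data in this sketch (values, the south and agelag variable names) is an illustrative assumption.

```stata
* Toy data with one obvious typo (age 250) and one missing income
clear
input region age income
1 25  200
1 250 300
2 40  .
2 31  500
end

* Correct the typo before deriving anything from age
replace age = 25 if age == 250

* New variables: a square, a 0/1 dummy, a log and a one-observation lag
gen agesq = age^2
gen region1 = (region == 1)
gen loginc = log(income)
gen agelag = age[_n-1]

* Recode region into a new 0/1 variable, leaving the original untouched
recode region (1 = 0) (2 = 1), gen(south)

* Group mean of income, attached to every observation in the group
egen meaninc = mean(income), by(region)

list
```

Here loginc is missing wherever income is missing, and agelag is missing for the first observation because there is no previous row.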


Data selection (1)

• You can also organise your data set with various commands:

. keep if _n<=1000 ( _n is the observation number)

. drop region

. drop if ethnic~=1

keeps only the first 1000 observations, drops region, and drops all the observations where the variable ethnic≠1 (~= is “not equal to”)


Data selection (2)

• Then save the smaller file for subsequent analysis

. save newfile

. save, replace (take care – it overwrites existing file)
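Selecting a subsample and saving it under a new name might look like the sketch below. It assumes wages.dta (with lwage, ed, exp and fem) is in the working directory; wages_small is a hypothetical file name chosen so that the original data file is never overwritten.

```stata
* Assumption: wages.dta is in the current working directory
use wages, clear

* Keep the first 1000 observations, then only those with fem equal to 1
keep if _n <= 1000
drop if fem ~= 1

* Keep just the variables needed later
keep lwage ed exp

* wages_small is a hypothetical name; replace overwrites any earlier copy of it
save wages_small, replace
```

Saving under a new name is the safer habit: a bare save, replace would overwrite wages.dta with the reduced data.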


Functions

• Lots of functions are possible.
• See . help functions
– Obvious ones like
  • log(), abs(), int(), round(), sqrt(), min(), max(), sum()
– And many very specialised ones.
– Statistical functions
  • distributions
– String functions
  • Converting strings to numbers and vice versa
– Date functions
  • Converting dates to numbers and vice versa
– And lots more
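A quick way to explore a function is to evaluate it with display, which prints the value of an expression without needing any data in memory. The particular arguments below are arbitrary illustrations.

```stata
* display evaluates an expression and prints the result
display sqrt(16)
display abs(-2.5)
display int(7.9)
display round(3.14159, 0.01)
display max(1, 5, 3)
display log(exp(1))

* Converting a string to a number and a number to a string
display real("12") + 1
display string(12) + " apples"
```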


Command files

• Stata command files have a .do extension
• It is ALWAYS good practice to use a .do file
– you will know exactly what you have done.
– It makes it easy to develop ideas.
– And correct mistakes.
• . do wages.do, nostop
– (echoes to screen, and keeps going after an error is encountered)
• Or . run wages.do (executes “silently”)
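A minimal do-file could look like the sketch below. The file name wages_demo.do is hypothetical (it is not the wages.do from the module web page), and the sketch assumes wages.dta is in the working directory.

```stata
* wages_demo.do (hypothetical file name)
* Assumption: wages.dta is in the current working directory

* Interpret the commands as the stated Stata version would
version 9

* Do not pause at --more-- while the file runs
set more off

use wages, clear
describe
summarize lwage ed exp
tabulate fem union
```

Saved as wages_demo.do, it would be run with do wages_demo.do, or with run wages_demo.do to suppress the output.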


Keeping track of output (1)

• Can scroll back your screen (up to a point)

• But better to open a log file at the beginning of your session, and close it at the end.

• Click on file, log, begin. Or type

. log using myoutput

. Commands……………………

. log close

[log command allows the replace and append options.]


Keeping track of output (2)

• Default is .smcl file extension (that STATA can read)

• .log extension gives an ASCII file that anything can edit

• ALWAYS LOG your output

– Logging is a good way of developing a .do file, since it saves the commands as well as the output
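A logged session might be wrapped as in the sketch below. It assumes wages.dta is in the working directory; myoutput is the log name used on the earlier slide, and the text option requests a plain ASCII log rather than the default .smcl format.

```stata
* Close any log left open by an earlier run; capture ignores the error if none is open
capture log close

* Open a plain-text log, overwriting an existing myoutput.log
log using myoutput.log, replace text

* Assumption: wages.dta is in the current working directory
use wages, clear
summarize lwage ed exp

* Stop logging
log close
```

Using append instead of replace would add the new session to the end of the existing log file.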


Next Lecture

Monday 23rd October F107 11:00-12:00

STATA demo