Getting Started with Stata
2/11/2010
Tom Tomberlin
Nealia Khan
Learning Technologies Center
Harvard Graduate School of Education
Agenda
I. Overview of Stata
II. Getting Started
III. ‘Do’ files
IV. Basic data cleaning
V. Basic data management
VI. Beginning analysis
VII. Special topics (time permitting)
Overview
Why use Stata?
• Availability
• Can self-program, or use menus
• Cutting-edge statistical methods (including user-defined functions)
• Publication-quality graphics
Stats and Graphics
Getting Started
• A word about programming in and using Stata
• Stata is case sensitive, so Myvar is different from myvar
• All commands in Stata are lower-case
• “and” = &, “or” = |, “not” = !
• Assignment is “=”; testing for equality is “==”
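The rules above can be tried out on a tiny dataset. The following is a minimal sketch, not part of the original slides; the variable names myvar and Myvar are only illustrations.

```stata
* Build a tiny 5-row dataset to experiment with
clear
set obs 5
* "=" assigns a value: myvar holds 1, 2, 3, 4, 5
generate myvar = _n
* Names are case sensitive, so Myvar is a second, separate variable
generate Myvar = myvar * 10
* "==" tests equality
list if myvar == 3
* "&" is "and"
count if myvar > 1 & myvar < 4
* "|" is "or"
count if myvar == 1 | myvar == 5
* "!" is "not"
count if !(myvar == 3)
```

Typing GENERATE or List instead of the lower-case command names should produce an "unrecognized command" error, which is a quick way to see the case-sensitivity rule in action.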
Windows in Stata
Getting Started
• Opening Stata
• Opening Data:
– Stata formatted data: “use” command
– Comma-separated values: “insheet using”
– Tab-delimited values: “insheet using”
– Flat files: create a dictionary
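As a rough sketch of the first two options, the commands below round-trip a dataset through both formats. The auto dataset (shipped with Stata) and the file names demo_auto.dta and demo_auto.csv are stand-ins chosen for illustration; they are not the workshop files.

```stata
* Start from a dataset that ships with Stata (a stand-in for your own data)
sysuse auto, clear
* Save a Stata-format copy and a comma-separated copy in the working directory
save "demo_auto.dta", replace
outsheet make price mpg using "demo_auto.csv", comma replace
* Stata-format data: use (clear drops the data already in memory)
use "demo_auto.dta", clear
* Comma-separated data: insheet using
insheet using "demo_auto.csv", comma clear
describe
```

For a tab-delimited file the same command takes the tab option instead of comma. In Stata 13 and later, import delimited is the documented replacement for insheet, although insheet should still run.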
Apply Your Knowledge
• Exercise 1:
• Open Stata
• Using the insheet command, open the comma-separated values data file located in
– F:\workshops\SATdata.csv
(HINT: all Stata commands must be written in lower case. Don’t forget to put pathnames in quotes!)
Examining Data
• Look at your data
– Did our data import correctly?
– How are our data measured?
– What kinds of variables do we have?
• How would we describe the distribution of our data?
– Graphs: histograms, scatterplots
– Charts/tables: frequency tables, cross-tabs
Looking at Data
• There are several ways to look at our data in Stata
– Editor
– Browser
– Stata commands: codebook, des, tables of frequency and distribution, graphs of distribution
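A minimal sketch of the command-based options, using the auto dataset that ships with Stata as a stand-in (its variables mpg and foreign are not part of the workshop data):

```stata
sysuse auto, clear
* des (short for describe): variable names, storage types, and labels
describe
* codebook: a detailed summary of one or more variables
codebook mpg foreign
* Frequency table of a categorical variable
tabulate foreign
* Graph of the distribution of a continuous variable
histogram mpg
```

The Editor and Browser can also be opened from the command line with edit and browse; browse is the safer choice for looking around because it does not allow accidental changes.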
Examining Data
• Let’s look at how the variable ‘csat’ is distributed
– hist csat
– tab csat
Do files
What are do-files?
‘Do’ files are essentially a list of all of the commands that you wish to run and the settings that you would like to apply
– Why use them? Replication, collaboration, audit trail, help
– How to create and run one
Do-files
• Creating and running a do-file
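The original slide showed this step as a screenshot. As a rough sketch, a do-file is just a plain-text file of commands like the one below; the file names example.do and example_log.txt and the use of the auto dataset are assumptions made for illustration.

```stata
* example.do: a minimal do-file (the file name is hypothetical)
* Close any log left open by an earlier run; capture ignores the error if none is open
capture log close
* Keep a text log of commands and output as an audit trail
log using "example_log.txt", text replace
* Load a dataset that ships with Stata, clearing whatever is in memory
sysuse auto, clear
describe
summarize price mpg
log close
```

Once saved, the file can be run by typing do example.do in the Command window, or from the Do-file Editor.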
Do files
– EXERCISE 2: Create a simple do-file from the commands that you have already entered.
(HINT: you must clear the data in memory before opening a new dataset.)
Basic Data Cleaning
– Labeling
  To label a variable: label var varname "label"
  To label values:
    label define labelname 1 "high" 0 "low"
    label val varname labelname
– Renaming: ren oldvarname newvarname
– Recoding: recode varname (oldvalue=newvalue)
– Generating a new variable: gen newvarname = somevalue
– Replacing values of an existing variable: replace varname = somevalue
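The commands above might be combined as in the following sketch. It uses the auto dataset that ships with Stata; the new names high_mpg, hilo, repair, and expensive, and the cut-offs 25 and 10000, are arbitrary choices for illustration.

```stata
sysuse auto, clear
* Label a variable
label variable mpg "Miles per gallon"
* Generate a new 0/1 variable (left missing wherever mpg is missing)
generate high_mpg = mpg > 25 if !missing(mpg)
* Define a set of value labels, then attach it to the variable
label define hilo 1 "high" 0 "low"
label values high_mpg hilo
* Rename a variable: old name first, new name second
rename rep78 repair
* Recode: turn the value 1 into 2
recode repair (1 = 2)
* Generate a variable, then replace its value for some observations
generate expensive = 0
replace expensive = 1 if price > 10000 & !missing(price)
tabulate high_mpg
```

Note that the label text goes in straight double quotes; the curly quotes that word processors and slide software insert will not work in Stata.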
Basic Data Management
• Subsetting
– keep
– drop
– if
• Merging
– merge
– must sort both files by the linkage variable!
– ex: merge linkage_var using "F:\workshops\newfile"
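A minimal sketch of subsetting and merging, using the auto dataset that ships with Stata. The file name demo_weights.dta is hypothetical, and the example uses the merge 1:1 syntax introduced in Stata 11 rather than the older form shown above; with merge 1:1, Stata handles the sorting itself.

```stata
* Build a second file that holds one extra variable, keyed on make
sysuse auto, clear
keep make weight
save "demo_weights.dta", replace

* Reload the data and subset it
sysuse auto, clear
* keep with a variable list retains only those variables
keep make price mpg foreign
* drop with a variable list removes those variables
drop mpg
* keep if retains only the observations that meet the condition
keep if foreign == 0

* Merge on the linkage variable; _merge records where each row came from
merge 1:1 make using "demo_weights.dta"
tabulate _merge
```

In the resulting table, _merge equal to 3 marks rows found in both files, and _merge equal to 2 marks rows found only in the using file (here, the foreign cars dropped from the master data).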
Basic Data Cleaning
• EXERCISE 3:
• Generate a dichotomous variable called hi_score from the csat variable, where 1 indicates a score greater than 922 and 0 indicates a score less than or equal to 922.
• Label the values as 0=low and 1=high.
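One possible approach is sketched below on made-up scores (the five csat values are invented for illustration and are not the workshop data; the label name hilo is also an assumption). The point to notice is the missing-value check: Stata treats a missing value as larger than any number, so csat > 922 alone would code missing scores as 1.

```stata
* Made-up scores for illustration only; they are not the workshop data
clear
input csat
900
922
923
1050
.
end
* 1 if csat is above 922, 0 if it is 922 or below, missing if csat is missing
generate hi_score = csat > 922 if !missing(csat)
* Attach value labels to the new variable
label define hilo 0 "low" 1 "high"
label values hi_score hilo
tabulate hi_score, missing
```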
Beginning Analysis
• Univariate analysis
– summarize
– histogram
– table
• Bivariate analysis
– tabulate
– pwcorr
– ttest
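A brief sketch of how these commands might be called, using the auto dataset that ships with Stata as a stand-in for the workshop data. The variable choices are arbitrary, and tabulate is used here for the one-way frequency table.

```stata
sysuse auto, clear
* Univariate: summary statistics, distribution graph, frequency table
summarize price, detail
histogram price
tabulate foreign
* Bivariate: two-way table with a chi-squared test of independence
tabulate rep78 foreign, chi2
* Bivariate: pairwise correlation with significance level
pwcorr price mpg, sig
* Bivariate: two-sample t test comparing mean price across groups
ttest price, by(foreign)
```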
Apply Your Knowledge
EXERCISE 4:
Generate a histogram of the expense variable
Generate a two-way table to see whether the distribution of expense differs across the values of your newly created hi_score variable
If you have time, see if there is a significant correlation between scores on SATs and the average amount of money that each state spends on education.
Beginning Analysis
• Multivariate models
– Linear regression
regress depvar indepvar1 indepvar2 … indepvarN
– Logistic regression
logit depvar indepvar1 indepvar2 … indepvarN
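As a minimal sketch of both models, again on the auto dataset that ships with Stata (the choice of outcome and predictor variables is only illustrative):

```stata
sysuse auto, clear
* Look at the relationship before choosing a model
scatter price mpg
* Linear regression: continuous dependent variable listed first
regress price mpg weight
* Logistic regression: the dependent variable must be coded 0/1
logit foreign mpg weight
```

In both commands the dependent variable comes first, followed by the independent variables, matching the depvar indepvar1 … pattern above.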
Apply Your Knowledge
• Exercise 5:
Generate two scatterplots: one to look at the relationship between expense and csat, and one to look at expense and hi_score.
Depending on your assessment of the relationship (linear or not), run the appropriate regression to test for the effect of expense on either csat or hi_score
Thanks
Questions?
Gutman Library, room 323a&b
[email protected]
http://www.isites.harvard.edu/icb/icb.do?keyword=ltc