Software for data management: The contribution of Stata
Dr Karen Robson, Senior Research Fellow, The Geary Institute, University College Dublin, Ireland
Getting acquainted with Stata
StataCorp develops and distributes Stata, software for statistical analysis.
Stata is available for Windows, Macintosh, and Unix computers.
Stata is used by medical researchers, biostatisticians, epidemiologists, economists, sociologists, political scientists, geographers, psychologists, social scientists, and other research professionals who need to analyze data.
Gaining popularity in the social and medical sciences
Particularly useful for handling large-scale longitudinal data
Stata SE (for large data sets)
can analyze datasets with as many as 32,766 variables, and the only limit on observations is the amount of RAM on your computer
can handle string variables with a maximum length of 244 characters
can handle matrices up to 11,000 x 11,000
requires at least 512 megabytes of RAM and 80 megabytes of disk space
Stata/Intercooled (the standard one)
can analyze datasets with as many as 2,047 variables, and the only limit on observations is the amount of RAM on your computer
can handle string variables with a maximum length of 244 characters
can handle matrices up to 800 x 800.
Small Stata
A smaller, student version of Stata (for educational purchases only)
Stata/MP
The fastest and largest version of Stata (for dual-core and multicore/multiprocessor computers)
Resources
StataCorp website (www.stata.com)
Timberlake website (www.timberlake.co.uk)
UCLA Stata “portal” (statcomp.ats.ucla.edu/stata)
Statalist (www.hsph.harvard.edu/statalist)
Stata Journal (www.stata-journal.com)
As well, available Dec 2008
Launching Stata
OS contingent
Default window preferences
Window preferences fully adjustable
Auto memory set
Comparing with SPSS
Start up differences
With data file open
Viewing data: data viewer, data editor
Viewing variables
Viewing output/commands: output window buffer, log files
Syntax and “do files”
INPUT: Stata command window, do file, pull-down menu, variable window, review window
Computation
RESULTS: output window, log file
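The path from input to results can be sketched as a short do file. This is only an illustration: the file names example.do and example.log are hypothetical, and the auto dataset is the example data shipped with Stata, not data from this talk.

```stata
* example.do (hypothetical file name)
* Close any log left open by an earlier run; capture ignores the error if none is open
capture log close
* Send everything shown in the output window to a plain-text log file as well
log using example.log, text replace
* Load an example dataset that ships with Stata
sysuse auto, clear
* Commands typed here behave the same as in the command window
describe
summarize price mpg
* Stop logging
log close
```

Running `do example.do` from the command window executes the commands in order; the output window shows the results and example.log keeps a permanent copy.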
Advantages and disadvantages of Stata
User driven
Free STBs
Dedicated journal
Web active
Memory requirements
Backward compatible

Change!
SPSS dominance
Orientated to writing syntax/code
Pull-down windows debate! Now available from version 8 onward
Advantages and disadvantages of Stata
Easier code
Easier data handling
Clarity of operations/feedback
Results table function

Before version 8, limited graphics; now, complex graphics
Variable labelling
Editing of output
Advantages and disadvantages of Stata
Nested/master do files
Flexible terminology
Setting types of data
Interactive help
Switch output (log file) on/off
Copy and paste
Overview of analytic techniques
Too numerous to mention!
Comprehensive manuals
A selection:
All types of regression
Survey package
Epidemiological package
Multilevel modelling
Time series functions
Cluster analysis
Data
Data files: .dta
Stat/Transfer software
Stata – using wide and long file formats
Wide file formats (everything you add goes to the right of the existing data)
Long file formats (everything you add goes underneath the existing data)
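Stata can also convert one dataset between the two layouts with the reshape command. The slides do not mention reshape, so the following is an aside rather than part of the talk, and the variable names and values are made up for illustration.

```stata
* Made-up wide data: one row per person, one income column per year
clear
input id inc2006 inc2007
1 100 110
2 200 220
end

* Wide to long: one row per person-year, with variables id, year and inc
reshape long inc, i(id) j(year)
list

* Long back to wide: restores the inc2006 and inc2007 columns
reshape wide inc, i(id) j(year)
list
```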
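MERGE: Data 2 is added to the right of Data 1
APPEND: Data 2 is added underneath Data 1
Data 1 (indi): ‘master’
Data 2 (indj): ‘using’
_merge values:
1 (observation found in the master data only)
3 (observation found in both master and using data)
2 (observation found in the using data only)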
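A minimal sketch of both operations, assuming two small made-up datasets keyed on a hypothetical identifier id. The merge 1:1 syntax shown requires Stata 11 or later; in the versions current when these slides were written, the form was merge id using filename, with both files sorted on id.

```stata
* Data 1 ('master'): made-up values
clear
input id age
1 30
2 41
3 25
end
tempfile data1
save `data1'

* Data 2 ('using'): made-up values
clear
input id income
2 500
3 650
4 720
end
tempfile data2
save `data2'

* MERGE: income is added to the right of the master data, matched on id
use `data1', clear
merge 1:1 id using `data2'
* id 1 is in the master only (_merge 1), ids 2 and 3 are in both (_merge 3),
* id 4 is in the using data only (_merge 2)
tabulate _merge
list

* APPEND: the rows of Data 2 are added underneath Data 1
use `data1', clear
append using `data2'
list
```

Run this as a do file so that the tempfile names stay defined for the whole run. After append, the dataset has six rows, with age missing for the rows that came from Data 2 and income missing for the rows from Data 1.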