SPSS, STATA, and SAS: Flavours of Statistical Software A. Michelle Edwards, Ph.D. Ontario DLI Training April 14, 2005


SPSS, STATA, and SAS:

Flavours of Statistical Software

A. Michelle Edwards, Ph.D.

Ontario DLI Training
April 14, 2005
Kingston, ON


Data background

CANADIAN COMMUNITY HEALTH SURVEY (CCHS) CYCLE 2.1 (2003)

Summary:

“The Canadian Community Health Survey (CCHS) is a cross-sectional survey that collects information related to health status, health care utilization and health determinants for the Canadian population. The CCHS operates on a two-year collection cycle. The first year of the survey cycle “.1” is a large sample, general population health survey, designed to provide reliable estimates at the health region level. The second year of the survey cycle “.2” is a smaller survey designed to provide provincial level results on specific focused health topics.

This Microdata File contains data collected in the third year of collection for the CCHS (Cycle 2.1). Information was collected between January 2003 and December 2003, for 126 health regions, covering all provinces and territories. The CCHS (Cycle 2.1) collects responses from persons aged 12 or older, living in private occupied dwellings. Excluded from the sampling frame are individuals living on Indian Reserves and on Crown Lands, institutional residents, full-time members of the Canadian Armed Forces, and residents of certain remote regions. The CCHS covers approximately 98% of the Canadian population aged 12 and over.”

Source: CCHS Cycle 2.1 2003 User Guide.

Datafile information:

The complete PUMF datafile contains 1068 variables and 134072 cases. For the purposes of this workshop the dataset has been trimmed to 8 variables and only contains data pertaining to Nova Scotia, Ontario and Alberta.

The data was originally provided in an SPSS and a SAS format along with SPSS and SAS data definition statements. For the purposes of this exercise you will be provided with a CSV (comma-separated values) file.

I’ve also included the portions of the data dictionary that pertain to the variables used in this exercise. Please note that to view the entire dataset, codebook and associated files – you will need to access the DLI site or a DLI member institution.


Codebook information:


Source: Data Dictionary CCHS Cycle 2.1 (2003).


Downloading and Saving the CCHS Dataset to the Desktop

Please go to the following webpage to download the CSV dataset called cchs2003.csv

http://tdr.uoguelph.ca/DATA/WKSHPS/DLI2005

There are 2 ways to save the dataset to your desktop:

1. Right-click on the link – select Save Target as… or Save Link as… – select Desktop and save.

2. Follow the link (to view the file) – use File – Save as… in the web browser. Please note that if you use this method you must change Save as Type… to Text. If you forget this step you will be saving an HTML page, which will not open in the statistical computing packages. Save the file to the Desktop.

Viewing the CSV file

Now that we have our file – let’s take a look at the contents of the file to get a feeling for it and to see what it looks like. Let’s open the file in Notepad.

Open Notepad (Start – Run – type Notepad – OK).

File – Open – navigate to the Desktop.

Note that your file is not here – you need to change file type to All Files.

Select CSV file and Open.


The file contains a number of variable names and a bunch of numbers. To decipher the variable names we need to match them with the codebook sections included above. The numbers have no meaning unless we have the codebook handy. To perform any sort of analysis or summary – Notepad is not going to do it….
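
To make the layout of a CSV file concrete, here is a minimal Python sketch (Python is not one of the three packages covered in this workshop; it is used only as a neutral illustration). The three lines of sample text and their values are invented stand-ins, not real CCHS records, and the choice of columns is an assumption.

```python
import csv
import io

# Hypothetical stand-in for the first lines of cchs2003.csv; the values are
# invented for illustration and are not real CCHS records.
sample = io.StringIO(
    "geocgprv,dhhc_sex,wtsc_m\n"
    "12,1,150.25\n"
    "35,2,310.5\n"
)

reader = csv.reader(sample)
header = next(reader)            # the first line holds the variable names
rows = [row for row in reader]   # each remaining line holds one case

print(header)
print(rows[0])
```

Every value arrives as plain text, and nothing in the file says what a code such as 12 or 35 means. That is why the codebook is needed.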

1. SPSS

Open the SPSS Program.

Similar look and feel to Excel. Spreadsheet type look, icons at the top and menu bar at the very top.

Think of the SPSS dataview as a container for the data. You cannot manipulate the data in this window as you would an Excel spreadsheet.


Open cchs2003.csv in SPSS

Navigate to your Desktop – Change Files of type to All Files(*.*) – Select cchs2003.csv - Open


SPSS immediately recognizes that the cchs2003.csv file is not an SPSS datafile. As a result it opens up the Import Wizard and walks you through 6 steps to open the datafile.

Step 1 - Allows you to select a format for data that you’ve already read in. Chicken and the egg concept. This window also shows you what your data looks like.


Step 2 – Select how your variables are arranged. In this example our file is delimited – there’s a comma separating our variables. We also have the variable names listed at the top of our file.


Step 3 – If there are variable names at the top of your file – your data must start on line 2. If you want to skip the first 10 lines of data – you can tell SPSS to start reading the data on line 11.

In this example each line represents a case. However, in some situations this may not be the case. For example if someone has entered the data in the following format:

M 12 S F 13 S
F 35 M F 45 D
M 78 W M 50 M

Where each line represents the Gender of an individual, their age, their marital status, the gender of another individual, their age, and their marital status.

In this situation – a specific number of variables represents a case – in the little example above it would be three variables for each case.
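
The idea that a fixed number of variables makes up a case can be sketched in a few lines of Python. This is only an illustration of the concept, not what SPSS does internally. It assumes the example data sits on three physical lines with two people per line, and the names VARS_PER_CASE and cases are made up for this sketch.

```python
# Assumed layout: three physical lines, each holding two people
# (gender, age, marital status, then the same again).
lines = [
    "M 12 S F 13 S",
    "F 35 M F 45 D",
    "M 78 W M 50 M",
]

VARS_PER_CASE = 3  # gender, age, marital status

cases = []
for line in lines:
    values = line.split()
    # Walk along the line three values at a time; each chunk is one case.
    for start in range(0, len(values), VARS_PER_CASE):
        gender, age, marital = values[start:start + VARS_PER_CASE]
        cases.append((gender, int(age), marital))

print(len(cases))
print(cases[0])
```

Three lines of text become six cases, because the reader counts variables and not line breaks.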


Step 4 – deciding what the appropriate delimiters are. In most cases SPSS will do its best to determine the best delimiter or combination of delimiters. In the cchs2003.csv file SPSS has decided that a Comma and a Space are the delimiters. This would result in a wrong file configuration. Since we know that the file was comma-separated – let’s use the Comma only. In situations where the delimiter is unknown, you can play around with the selection of delimiters until the data look correct.
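
Why does the choice of delimiter matter? The small Python sketch below splits one invented line two ways. The line is hypothetical (one value is padded with a leading space), and the way SPSS itself treats consecutive delimiters may differ from this simple split.

```python
import csv
import io
import re

# Hypothetical data line: the third value is padded with a leading space.
line = "35,2, 4,120.5"

# Comma as the only delimiter: four fields, as intended.
comma_only = next(csv.reader(io.StringIO(line)))

# Comma and space both treated as delimiters: an extra empty field appears.
comma_or_space = re.split(r"[, ]", line)

print(len(comma_only), comma_only)
print(len(comma_or_space), comma_or_space)
```

With the comma alone the line has four fields. With comma and space it has five, so every value after the extra field would land in the wrong column.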


Step 5 – In our example we have variable names. If you import a file without variable names – this is the window where these can be added or changed. Select a column – the variable name and data format boxes become available. Please note that above the Numeric there is an option of Do Not Import.


Step 6 – The final step. Click Finish and the data is now in SPSS.

BUT – All we have now is the same data we saw in Notepad in columns – a little neater but that’s about it…


Goal – is to add variable labels and value labels.

In SPSS – at the bottom of the screen there are 2 tabs – one is the DataView (which we have been in) and the other is the VariableView.

If you select the VariableView tab you will be presented with a table where we can enter information regarding the variables.

All variables in our dataset are of the Numeric format, with varying widths and some came in with decimal places while others came without. All of these features can be changed.


Variable Labels

Let’s add a label for geocgprv as Province – Under the column called Label – type the label you would like to see associated with the variable. It becomes quite handy when working with surveys to add the question under the label.

Here are the labels for the 8 variables in our dataset:

Geocgprv – Province
DHHC_SEX – Gender
DHHCGMS – Marital Status
GENCDHDI – Self-Rated Health
GENCDMHI – Self-Rated Mental Health
CIHC_8A – Health Improvement – More exercise
CIHC_8B – Health Improvement – Lose weight
WTSC_M – Master weights

Please note that instead of labels you can enter the question text.

Please do not forget to save your dataset along the way. To save the file – File -> Save as… -> navigate to the Desktop -> save as cchs2003.sav -> Save.

SPSS datafiles are saved as .sav files.


Value Labels

So now we know what the variable names represent BUT we still do not know what the values within the variables represent. We can use our codebook to view these – but it would be handy to add these into the SPSS program.

Select the cell under the Values column for the geocgprv variable

Click on the grey square with the 3 dots to obtain the value label dialogue box.


For the geocgprv variable – we are given the following codes:

10 – Newfoundland and Labrador
11 – Prince Edward Island
12 – Nova Scotia
13 – New Brunswick
24 – Quebec
35 – Ontario
46 – Manitoba
47 – Saskatchewan
48 – Alberta
59 – British Columbia
60 – Yukon/NWT/Nunavut

We already know that the data we’re working with only pertains to Nova Scotia, Ontario and Alberta. So we will only be adding these value labels.

Let’s add the value label for Nova Scotia first.

To add this to SPSS – type 12 in the Value box, and Nova Scotia in the Value label box. Then click on Add to add this to the dataset. Work through the same steps for Ontario and Alberta labels.

Click OK when finished.
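
Conceptually, a value label is just a lookup from a code to a piece of text. The short Python sketch below illustrates the idea using the three province codes from the codebook. The list of codes is invented, and this is not how SPSS stores labels internally.

```python
# Codes and labels taken from the codebook excerpt above.
province_labels = {12: "Nova Scotia", 35: "Ontario", 48: "Alberta"}

# Hypothetical values of geocgprv for four respondents.
codes = [35, 12, 48, 35]

# Show the label where one exists; otherwise fall back to the raw code.
labelled = [province_labels.get(code, str(code)) for code in codes]

print(labelled)
```

The data itself still holds the numbers. Only the display changes, which is why you can label as many or as few values as you wish.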


The variable view now looks like this…

You can enter as many or as few of the values as you wish… When you switch back to the DataView – some people would rather see the labels and not the values. To accomplish this – in DataView click on the icon at the top right-hand side of your screen.

Now we know what the variable names represent as well as the variable value labels. We’re ready to do some analysis now…

Survey Weights

“The principle behind estimation in a probability sample such as the CCHS Cycle 2.1 is that each person in the sample "represents", besides himself or herself, several other persons not in the sample. For example, in a simple random 2% sample of the population, each person in the sample represents 50 persons in the population. In the terminology used here, it can be said that each person has a weight of 50.


The weighting phase is a step that calculates, for each person, his or her associated sampling weight. This weight appears on the microdata file, and must be used to derive meaningful estimates from the survey. For example, if the number of individuals who smoke daily is to be estimated, it is done by selecting the records referring to those individuals in the sample having that characteristic and summing the weights entered on those records.”

For more detailed information on how the weights were calculated for the CCHS Cycle 2.1 survey please refer to the User’s Guide page 22.
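
The arithmetic in the quotation can be sketched in Python as follows. The four records are invented: they mimic a simple random 2% sample in which every person carries a weight of 50, which is far simpler than the real CCHS weights.

```python
# Hypothetical 2% simple random sample: every person has a weight of 50.
sample = [
    {"smokes_daily": True, "weight": 50.0},
    {"smokes_daily": False, "weight": 50.0},
    {"smokes_daily": True, "weight": 50.0},
    {"smokes_daily": False, "weight": 50.0},
]

# Estimated number of daily smokers in the population:
# select the records with the characteristic and sum their weights.
estimated_smokers = sum(p["weight"] for p in sample if p["smokes_daily"])

# Summing all the weights estimates the size of the population.
estimated_population = sum(p["weight"] for p in sample)

print(estimated_smokers, estimated_population)
```

Two smokers in the sample stand for 100 smokers in a population of 200. In a real survey the weights differ from person to person, but the summing works the same way.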

How do we apply weights in SPSS?

Go to Data -> Weight Cases. You will now have the option to select a variable to weight cases by. In our dataset the variable is called WTSC_M (master weights).


Once you click OK and apply the weights – a box appears at the bottom right-hand corner of the DataView screen that states the weight is on.

Once the weight has been applied all results will reflect population results and NOT sample results.

Frequencies

All statistical analyses can be found under the Analyze menu. For frequencies – Analyze -> Descriptive Statistics -> Frequencies

To perform any analysis in SPSS – you will be presented with a dialogue box which you fill in. There will also be several options boxes to explore.


One neat trick to remember – in the dialogue boxes if you are uncertain of any tests or definitions – right-click on the test and you will be presented with a definition.


To run the analysis – Click OK. The results are presented in a second window called the SPSS Viewer.

Another neat trick to remember – if you can’t remember what the tables are showing you, right-click and select Results Coach. This will take you through a mini-tutorial of the results table you selected. Very handy for students!

Result tables from SPSS can be copied directly into Word, PowerPoint and Excel. To accomplish this – Select the table in question – Copy object – move into Word – Select Paste.


Please note that the above results represent sample results. Once the weight is applied the results are:


Crosstabulations

To perform a crosstabulation in SPSS – we’re interested in seeing the relationship between gender and self-rated health.

Analyze -> Descriptive Statistics -> Crosstabs

Select Gender of Respondent as the Column Variable and gencdhdi (self-rated health) as the row variable.


If you want to calculate a Chi-square statistic to see whether there is a significant relationship between the 2 variables – Go to Statistics and select Chi-square

The results are as follows:

Remember to take advantage of the Right-Click feature if you are uncertain about what the results are telling you…
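
For those curious about what the Chi-square statistic actually measures, here is a minimal Python sketch of the Pearson Chi-square calculation. The 2 x 2 table of counts is invented (the real gender by self-rated health table has more rows), and the sketch works on plain unweighted counts. It leaves out the p-value that SPSS reports.

```python
# Hypothetical 2 x 2 table of observed counts (rows x columns).
observed = [
    [10, 20],
    [30, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Count expected in this cell if rows and columns were unrelated.
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (count - expected) ** 2 / expected

degrees_of_freedom = (len(row_totals) - 1) * (len(col_totals) - 1)

print(round(chi_square, 4), degrees_of_freedom)
```

The larger the gap between the observed and the expected counts, the larger the statistic, and the stronger the evidence of a relationship between the two variables.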

Please note that SPSS also has a syntax (command) language. If you are interested in the syntax – select Paste in any of the dialogue boxes to open the syntax window. Most students, however, prefer the user-friendliness of the dialogue boxes.


2. STATA

Open the Stata program

Looks a lot different than SPSS – almost scary-like – no user-friendly or friendly-feeling windows.

“Stata is, at its heart, a command-driven application” (Getting Started with Stata manual).

There are a number of windows available to Stata Users.

Results window – will show commands that we run and associated notes (running log file), and the resulting tables of any analysis that were requested.

Command window – this is where you enter the commands that you would like performed.

Review window – lists the commands that have already been run.

Variables window – lists the variables that are available in the current dataset.


Open cchs2003.csv in Stata

Use the Browse button to navigate to the Desktop and select the cchs2003.csv file. Please change the files of type to CSV – to find the cchs2003.csv file

Select Specify value delimiter and comma. We know this information so let’s use it.


Note how there’s a message in the Stata Results window but there’s no data. From the message we can see that there are 8 variables and 61604 observations.

What happens if you receive a message in the results window that looks like this:


Stata reads a dataset and stores it in the RAM of your machine. By default when you open the Stata program it allocates 1.00 Mb of RAM for your data. We need to increase this to read a larger datafile. The complete CCHS Cycle 2.1 dataset contains 134072 observations – yet the dataset I’m reading in here still only contains 8 variables.

First – determine how much RAM is available on your machine… Go to my Computer – Right click and select properties.

On this particular laptop there is a total of 512 Mb of RAM. To increase the amount of RAM Stata can use from 1.00 Mb to 50 Mb…

First clear the dataset – type clear (Stata commands are case-sensitive and are typed in lowercase)

Then type: set memory 50000

You should see:


Then try to import the dataset again…


How do we see the data?

Type browse in the Command Window OR select the icon.

You can move around in this window – you can sort the file by any variable by selecting Sort.

If you do not like the order the data is presented you can change the order by selecting the >> button or the << button.

Since you are in the browser you are unable to make changes to the data.

Also note that when the data browser is open the Command window is not available.

Close the Browser Window by clicking on the X at the top corner of the window.

The Command window now reappears.


How do we edit the data?

Open the table editor by typing edit or by selecting the icon.

In this window we can move around as we were able to in the Browser – but now we can edit the data as well.


Background Information and Editor functions

When you first open a dataset in Stata, the program creates a backup copy of the dataset – a snapshot of your data before you make any changes if you will.

In the Table Editor you have 2 different saving features available to you:

Preserve – if you make changes to your data in the Editor and you are satisfied you can update the backup copy of the file.

Restore – if you want to cancel the changes you’ve made to your data and restore the backup copy – think of it as an “undo” option for saving your data.

Other variable options available in the Editor are:

Hide – hide variables – a feature very similar to Excel – variables are hidden from view.

Delete – when you select this option you are presented 3 options:

Delete the current variable
Delete the current observation
Delete all observations throughout the dataset that have the same value for the current variable as the current observation

As an example:

Close the Editor – notice that the command window returns.

To close open datafiles – type – clear.

To reopen datafile – select the insheet statement from the Review Window. It will automatically be pasted into the Command Window – hit enter and the cchs2003.csv file will be reread into Stata.


List

Another way to view data in Stata is using the list command. This will list each observation in the Results window. To continue viewing hit the Enter key to move line by line OR the space bar to view page by page. To exit – type q.

With the list command you can select which variables you wish to view. For example, if I’m only interested in looking at the geocgprv and cihc_8b variables:

list geocgprv cihc_8b

You can also subset – for example if we were only interested in looking at those observations pertaining to male respondents.

list geocgprv cihc_8a if dhhc_sex == 1

This will list the geocgprv and cihc_8a variables for only the observations where dhhc_sex = 1, that is, for the males in the dataset.
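
What the if qualifier does can be illustrated outside Stata with a few lines of Python. The three records are invented, and the sketch only mirrors the idea of the list command above: keep the rows that meet the condition, then show the requested variables.

```python
# Hypothetical records; the values are invented for illustration.
rows = [
    {"geocgprv": 12, "cihc_8a": 1, "dhhc_sex": 1},
    {"geocgprv": 35, "cihc_8a": 2, "dhhc_sex": 2},
    {"geocgprv": 48, "cihc_8a": 2, "dhhc_sex": 1},
]

# Keep only the rows where dhhc_sex equals 1, and show two variables.
selected = [
    (row["geocgprv"], row["cihc_8a"])
    for row in rows
    if row["dhhc_sex"] == 1
]

print(selected)
```

Note the double equals sign: as in Stata, it tests whether two values are equal; it does not assign a value.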

Describe

In SPSS we could view information about the variables in the VariableView window. In Stata we can view similar information by submitting the describe command.


We can see the variable name, storage type, display format, value labels, and variable labels.

Again – we’re in the situation where the variable labels are their names. Without the codebook we do not know what the variables represent.

Variable Labels

To add variable labels we need to use the following command:

label var geocgprv "Province"

var – tells Stata that we are working with variable labels
geocgprv – the name of a variable
“…” – the label you would like associated with the variable name you stated.

After you’ve tried to enter a few your command window will look like this…

To see whether the labels have taken effect – try the describe command again.


Any labels you have entered are now listed in the variable label section.

Value labels

We now know what the variables represent but we need to add the value labels – to help us distinguish the different values within the variable.

To add value labels in Stata – there are 2 steps. We first need to create the value labels and then we need to apply them to the variables.

To create the value labels:

label define prov 12 "Nova Scotia" 35 "Ontario" 48 "Alberta"

label – tells Stata we are working with labels
define – tells Stata we are defining a value label
prov – a name you want to give the set of value labels
12 35 48 – the variable values
“…” – the label we want associated with each value.


To apply the value labels to a variable:

label values geocgprv prov

label – tells Stata we are working with labels
values – tells Stata we are applying a value label
geocgprv – the variable name we are applying the value label to
prov – the name of the value label we defined above.
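
The two steps, define and then apply, can be pictured with a small Python sketch. It is only an analogy: the dictionaries label_sets and attached and the function display_value are made up for this illustration and say nothing about how Stata stores labels internally.

```python
# Step 1 (like "label define"): create a named set of value labels.
label_sets = {}
label_sets["prov"] = {12: "Nova Scotia", 35: "Ontario", 48: "Alberta"}

# Step 2 (like "label values"): attach the named set to a variable.
attached = {}
attached["geocgprv"] = "prov"


def display_value(variable, code):
    """Return the label for a code, or the code itself if none is attached."""
    set_name = attached.get(variable)
    labels = label_sets.get(set_name, {})
    return labels.get(code, str(code))


print(display_value("geocgprv", 35))
print(display_value("dhhc_sex", 1))
```

Keeping the two steps separate means one set of labels (a yes/no set, for example) can be defined once and attached to many variables.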

Once you apply a few of these – try running describe again:

You now see that some variables have a value label attached to them.

Survey Weights

There is no setting the weights on or off as there is in SPSS. Most Stata commands / analyses can deal with weighted data. There are 4 kinds of weights as defined by Stata:

fweight – frequency weight
pweight – sampling weight
aweight – analytic weight
iweight – importance weight


General command syntax in Stata would be:

Command … [weighttype = varname]

Frequencies

In Stata there are 2 ways to run a frequency analysis. Through the menubar or by running code in the command window.

To use the Command window:

tabulate geocgprv

tabulate – tells Stata that you have requested a frequency calculation
geocgprv – the variable for which you would like frequencies calculated

Please notice how the variable label and value labels are shown in the results window.


If you prefer to use the Menu bar – go to Statistics -> Summaries, Tables & Tests -> Tables -> One-way Tables

Enter the variable in question in the Categorical Variable box.

Please note there are a number of options you can investigate at this point.


When you click OK on the dialogue box – you see the same results and command used in the Results window.

Let’s try it with the weight variable

Command:

tab geocgprv [aw=wtsc_m]
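
To see why the weighted and unweighted tables differ, here is a minimal Python sketch of a one-way frequency table computed both ways. The four records and their weights are invented, and the sketch shows only the general idea of replacing counts with sums of weights. It does not reproduce how Stata scales analytic weights.

```python
from collections import defaultdict

# Hypothetical (province code, weight) pairs; not real CCHS records.
records = [(12, 100.0), (35, 300.0), (35, 200.0), (48, 400.0)]

unweighted = defaultdict(int)
weighted = defaultdict(float)
for province, weight in records:
    unweighted[province] += 1        # each record counts once
    weighted[province] += weight     # each record counts "weight" times

total_weight = sum(weighted.values())
weighted_percent = {
    province: 100 * w / total_weight for province, w in weighted.items()
}

print(dict(unweighted))
print(weighted_percent)
```

In the sample, province 35 holds half of the records (2 of 4). After weighting it also represents 50% of the population, while province 48 rises from 25% of the records to 40% of the population.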


Crosstabulations

To run a crosstab in Stata you use the same command, only you list the second variable. So if we are interested in examining the relationship between province and health improvement – more exercise, we would type the following command in the Command box:

tab cihc_8a geocgprv

With weights:

tab cihc_8a geocgprv [aw=wtsc_m]


But what if you want percentages and a Chi-square test to see if there’s a significant relationship between the two questions?

tab geocgprv cihc_8a, chi2 row

Now we can see the row (independent variable in this example) percentages, as well as the Chi-square test results. Please note that both SPSS and Stata give you the same results.
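
Row percentages simply divide each cell by its row total. The Python sketch below shows the calculation on an invented table: the rows stand for two province codes, the columns for the two answers to the exercise question, and none of the counts are real.

```python
# Hypothetical crosstab: {row value: {column value: count}}.
table = {
    12: {1: 30, 2: 70},
    35: {1: 25, 2: 75},
}

row_percent = {
    row: {col: 100 * n / sum(cells.values()) for col, n in cells.items()}
    for row, cells in table.items()
}

print(row_percent)
```

Each row adds up to 100%, which makes it easy to compare provinces even when they have very different numbers of respondents.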

To save output

There’s not a nice feature in Stata to save your results window. Many people simply highlight the table of interest, copy and paste it into Word. You may need to do some reformatting once the table is in Word.

You can log all the information from the Results window into a Log document. To do this

File -> Log -> Begin

And save it to a file. The log file is formatted for Stata – if you want to place information into Word or elsewhere, you will need to copy and paste as above.


3. SAS

Open the SAS program

SAS has a number of windows to work with – very similar to Stata.

Editor – where you will write your code

Log – SAS maintains a log of all processes it has run – this is where you will find your errors.

Output – this is where your results will be displayed

Explorer tab – allows you to move inside and outside the SAS program

Results tab – table of contents of all result tables in the Output window


Open cchs2003.csv in SAS

Rather than writing a program to import the cchs2003.csv file – we’ll use the Import feature in SAS.

File -> Import Data

SAS, very much like SPSS will walk you through a wizard.

Our file is a CSV file – select this option from the Standard Data Source box.

Browse to the desktop to select the cchs2003.csv file. Select the options tab to ensure that the data will be read from the second row and the variable names will be created from the first row.


SAS arranges its data in a library format. To compare this to “My Computer” on our computers – think of a folder as a filing cabinet drawer – this is referred to as a library in SAS. Each file in the folder or each folder in the cabinet drawer is a dataset.

SAS has 2 kinds of libraries – temporary and permanent libraries. Today we will be working with the temporary library called Work. This means that anything we create in SAS will be cleared out when the SAS program is shut down. Permanent libraries are used to create permanent SAS datasets.

On the Import Wizard dialogue box leave library as Work – under Member – type in cchs2003. This will name our SAS dataset cchs2003.


The next step allows you to save the program generated by SAS to import the datafile.

Click Finish and the dataset has been imported.


Please note that nothing appears to happen – other than some print in the log Window.

If you look at the bottom of the Log window – it shows this:

The WORK.CCHS2003 dataset was successfully created. We can also see that there are 8 variables and 61604 observations. BUT – as with Stata – there is no data to be seen.

To view the data we can go to the Explorer tab on the left side.

Click on the Libraries icon and then select the Work library icon. You should now see the contents of the Work library: our cchs2003 dataset.

If you double-click on the Cchs2003 icon – SAS will open the ViewTable. You can now browse your data as we could in Stata.

Options on the top menubar include sorting, form view, column attributes, table attributes, edit and browse. As in Stata, when you are browsing the data you cannot make any changes to the data. To edit the data you must enter the Edit mode. To do this you can select the Edit icon.

To exit the ViewTable mode – Click on the x at the top righthand corner of the window.


Sometimes looking at the data in this format is a bit cumbersome – another way to look at it is by using the Proc Print procedure.

How does SAS work?

There’s the Data step – used to get your data into SAS and used to manipulate the data. This is one of the strong points for SAS.

To analyze the data you use a series of Procedures or PROCs. If you want to do anything with your data, whether that is looking at it or processing it, you will probably use a PROC.

PROC PRINT

To view the data in the Results window we can use a Proc Print.

In the editor type:

Proc print;
Run;

To run the program click on the “Running man” OR go to Run -> Submit.

When the process is complete SAS will automatically present you with the Output window. You should ALWAYS look at the Log window first. If there are no problems or errors then proceed to the Output window.

In this case we get an Output Window Full message. We’ve filled the Output window buffer with the 61604 observations. Options are listed below. For this case we will select the third option, C – Clear window without saving, and click OK.
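One way to avoid filling the Output window is to print only the first few observations. The sketch below uses the obs= dataset option; the choice of 20 rows is arbitrary.

```sas
/* Print only the first 20 observations of the imported dataset */
proc print data=work.cchs2003 (obs=20);
run;
```

Naming the dataset with data= also makes it explicit which dataset is printed, rather than relying on the most recently created one.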


A snippet from the log window showing that everything is fine…

A snippet of the Output window…


Proc Contents

In SPSS we could gather information about the variables from the VariableView, and in Stata we used the describe command. In SAS we will use the Proc Contents procedure.

In the editor type:

Proc contents;
Run;

To run a selection of the code in the SAS editor – highlight the code as above and hit the “running dude”.

Check the log window first before browsing the Output window.


The output window shows the following:

This is the same basic information we saw in both SPSS and Stata. Again we are presented with variable names, types, lengths, positions, and formats. No variable labels or value labels are visible.

Variable Labels

Adding labels to the variables means making changes to the dataset, so in SAS we need to work in the Data step.

First we need to start with a Data statement. By doing this we are telling SAS that we are creating a new SAS dataset and we’re giving it a new name. Very similar to using a Save As… function in any other program.

Second, we need to tell SAS what dataset we are working with. We do this by using a Set statement.

Data cchs_lab;
Set cchs2003;


Now we can start adding our labels.

Label geocgprv = "Province";
Label dhhc_sex = "Gender";
Label dhhcgms = "Marital Status";

OR

Label
geocgprv = "Province"
dhhc_sex = "Gender"
dhhcgms = "Marital Status";

Either way works.

Once you have finished entering your labels – you need to complete the Data step with a Run; at the end.

So the program looks like this:

Please note that the editor displays different parts of the program in different colours. This is really handy for debugging.
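Putting the Data, Set, Label and Run statements described above together, the complete Data step would read roughly as follows. This sketch uses the single Label statement form; the indentation and comments are only for readability.

```sas
data cchs_lab;           /* the new dataset we are creating */
    set cchs2003;        /* the dataset we are working with */
    label geocgprv = "Province"
          dhhc_sex = "Gender"
          dhhcgms  = "Marital Status";
run;                     /* completes the Data step */
```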


Once you submit this program, check your log window for any errors. If everything works out you should see:

Try running the Proc Contents again to see whether the labels took effect or not.
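When no dataset is named, Proc contents describes the most recently created dataset. Now that there are two datasets in the Work library, it can be safer to name the one you want with the data= option, as in this small sketch:

```sas
/* Describe the labelled dataset explicitly */
proc contents data=work.cchs_lab;
run;
```

The same data= option can be added to Proc print and the other procedures used in this session.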


Value Labels

This is the same two-step process as was required for Stata. First we need to create the value labels (in SAS this is referred to as a format), then we need to apply the labels or the format.

To create the format we need to use a Proc format.

Try adding this to the editor window. Run the code and check your log window for any problems.

This is what my log window looks like. So now we have a format for prov, sex, scale, and yesno.
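As a minimal sketch of what a Proc format step looks like, here are two of the formats written out. The codes and label text shown (1 and 2 for each) are assumptions for illustration; check the CCHS codebook for the actual codes before relying on them.

```sas
proc format;
    /* Assumption: 1 and 2 are the codes used for sex in this file */
    value sex
        1 = "Male"
        2 = "Female";
    /* Assumption: 1 and 2 are the codes used for yes/no items */
    value yesno
        1 = "Yes"
        2 = "No";
run;
```

Each Value statement defines one format. The format name is written without a period here; the period is added only when the format is applied to a variable.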


Now we need to apply these to the variables. Again we’re looking at changing the data so this will be conducted in a Data step. Let’s use the Data step we have already created.


At the bottom – before the Run statement let’s add the following code:

Format geocgprv prov.;
Format dhhc_sex sex.;
Format dhhcgms marstat.;
Run;

OR

Format
geocgprv prov.
dhhc_sex sex.
dhhcgms marstat.;
Run;

Try running the Data step again – check the Log window. If all looks well – run the Proc Contents to see what has changed…

Here’s my log window after I’ve added the format to my Data step…


After I run Proc contents:

So now we have the variable and value labels added to the SAS dataset.

Frequencies

SAS does have Graphical User Interfaces (GUIs), namely Analyst and Enterprise Guide, but one of the strengths of SAS is the ability to write the code and add any options available. So we will be concentrating on the code to generate frequency counts.

To generate frequencies in SAS you need to use a procedure called Proc Freq. Let’s generate frequencies for geocgprv as we have for SPSS and Stata. To do this you need to use the following code:

Proc freq;
Tables geocgprv;
Run;

In the output window – after you’ve checked your log window you should see..


Please note that the variable and value labels are shown.

Applying survey weights

SAS acts very much like Stata when it comes to the weights. There is no weight on or weight off function. You must apply the weights for each procedure separately.

Proc freq;
Tables geocgprv;
Weight wtsc_m;
Run;

The output window or results:

Crosstabulations

To examine the relationship between province and how individuals self-rated their mental health, we would generate a crosstab table. In SAS we use Proc freq to do this as well. We list both variables on the Tables statement and place an ‘*’ between the two.

Your code should look like this:

Proc freq;
Tables gencdmhi*geocgprv;
Run;

When you run this and after you check your log window – the results should look like this:


By default we already see the row and column percentages – but what if we want to see the Chi-square test results?

We need to add that option to our Proc freq code..

Proc freq;
Tables gencdmhi*geocgprv/chisq;
Run;

By placing a / after the tables and adding the chisq option – we see the following output:


To apply the weights – add the weight statement:

Proc freq;
Tables gencdmhi*geocgprv/chisq;
Weight wtsc_m;
Run;

Output results:


If you check these results against SPSS and Stata, you will see that they match. However, with SAS you get a bit more information than in either SPSS or Stata.
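If that is more than you need, options after the slash can also switch parts of the table off. The sketch below assumes the cchs_lab dataset created earlier and keeps only the counts and column percentages alongside the Chi-square test.

```sas
proc freq data=work.cchs_lab;
    /* norow: no row percentages; nopercent: no overall percentages */
    tables gencdmhi*geocgprv / chisq norow nopercent;
    weight wtsc_m;
run;
```

Adding nocol in the same way would drop the column percentages as well.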

To save output

The output window results can be copied and pasted into a Word document, very much as with Stata. Formatting is maintained between the two programs. However, SAS also has the ODS (Output Delivery System), which allows the user to create the output in another format; HTML, RTF, and PDF are examples. To create an HTML document from the Proc freq code we have, we need to add the following lines:

ods html file="C:\test.html";

Proc freq;
Tables gencdmhi*geocgprv/chisq;
Run;

ods html close;

These lines of code will produce an HTML document called test.html, located in my root C:\ directory.

Here are the log window results…


If the SAS viewer has been installed on your system you will be able to preview your document…

This document can now be posted on the web or sent easily from one person to another.
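The other ODS destinations follow the same open-and-close pattern. As a sketch, an RTF file that opens directly in Word could be produced as follows; the file name C:\test.rtf is a hypothetical example, and the cchs_lab dataset is the one created earlier.

```sas
/* Assumption: C:\test.rtf is a hypothetical output file name */
ods rtf file="C:\test.rtf";

proc freq data=work.cchs_lab;
    tables gencdmhi*geocgprv / chisq;
run;

ods rtf close;   /* the file is only complete once the destination is closed */
```

Replacing rtf with pdf in both ODS statements, and changing the file extension to match, would write a PDF instead.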


Overview

Each program has its merits and demerits. The choice of package will depend on a number of factors:

1 – choice of learning curve

2 – Time limitations – thesis due in a week vs. thesis due in a year

3 – type of analysis:
- social science researchers tend towards SPSS
- epidemiologists tend towards Stata
- animal scientists, mathematicians tend towards SAS

4 – access to support and help


More information:

SPSS. 2003. SPSS Base 12.0 User’s Guide. SPSS Inc., Chicago, Illinois

SAS. 2003. SAS Online Documentation. SAS Institute Inc., Cary, NC

Stata Corporation. 2003. Getting Started with Stata for Windows. A Stata Press Publication, College Station, Texas.

Your local Statistical Computing Consultant

Local SPSS, Stata, and SAS course and/or workshops


Quick comparison guide:

Importing files
  SPSS: Wizard
  Stata: Menubar
  SAS: Wizard

Variable labels
  SPSS: VariableView -> type in Label column
  Stata: label variable var "label"
  SAS: in Data step add Label var="label";

Value labels
  SPSS: VariableView -> add in Values column (dialogue box)
  Stata: 1 – create value labels: label define gender 0 "male" 1 "female"
         2 – apply value labels: label values bagndr1 gender
  SAS: 1 – create format: Proc format; Value gender 0="male" 1="female"; Run;
       2 – apply formats in Data step: Format bagndr1 gender.;

Frequencies
  SPSS: Analyze -> Descriptive Statistics -> Frequencies -> fill in dialogue box
  Stata: tabulate variable
  SAS: Proc freq; Tables variable; Run;

Crosstabulations
  SPSS: Analyze -> Descriptive Statistics -> Crosstabs -> fill in dialogue box
  Stata: tabulate variable1 variable2
  SAS: Proc freq; Tables variable1*variable2; Run;

Chi-square
  SPSS: check option in Statistics box in Crosstabs
  Stata: tabulate variable1 variable2, row col chi2
  SAS: Proc freq; Tables variable1*variable2/chisq; Run;

Saving output
  SPSS: select table -> Copy -> Paste into Word
  Stata: highlight table -> Copy -> Paste
  SAS: highlight table -> Copy -> Paste, or use ODS

User-friendliness
  SPSS: High
  Stata: Medium
  SAS: Low

Learning curve
  SPSS: Low
  Stata: Medium
  SAS: High
