SPSS Intro Workshop Fall 2008 - Yale StatLab
statlab.stat.yale.edu/workshops/StatLab-IntroSPSS2008.pdf

StatLab Workshop Series 2008 Introduction to SPSS

StatLab Workshop Fall 2008

Introduction to SPSS

for Windows

with

Sherlock Campbell

And

John Ferguson

October 3, 2008

I. Introduction

SPSS is a statistical analysis and data management package widely used in the social sciences. In SPSS, most features are accessible through menus, located at the top of display windows.

To Start SPSS: Click on SPSS in the Start Menu.

To Exit SPSS: Click File > Exit or click on the red X in the upper right of the SPSS window.

• When you start SPSS for the first time, you are greeted with the dialog box below:

• To get started, choose ‘Open an existing data source’ from the options, make sure ‘More Files…’ is highlighted and click ‘OK’. This will open the Open File dialog box. Choose the file ‘1991 U.S. General Social Survey.sav’ and click ‘Open’.

• There are three main window displays when using SPSS: the Data Editor, the Viewer, and the Syntax Editor. The Data Editor and the Viewer will open whenever you open a data set.

• Data Editor

o The Data Editor is a spreadsheet style window which displays data and information about the variables in that data. You can enter data directly in this window, edit data, and edit variable names, variable type, variable labels and more.

o This window has two tabs at the bottom left of the window. The Data View tab allows you to see the data, with variable names across the top (in columns). The Variable View tab (below) lets you see your variables in rows, with the variable information in the columns.

o You can open more than one data set at a time. Each data set will open in a separate Data Editor window. The currently active data set will have a green plus sign displayed in its icon in the upper left corner of the window. The same plus sign will be displayed in the icon on the taskbar in Windows. When you select an action from the menus, SPSS will perform that action on the currently active data set.

• Viewer

o The Viewer displays output of commands including charts, graphs, tables, and any error or warning messages. The left-hand part of the window displays the outline view of output, allowing you to easily select what you'd like to view, hide or delete from the file. The right side is where the content is displayed. You can edit most output by simply double-clicking it in the right side of the window. You can also open more than one Viewer if you wish. This feature comes in handy if you want to refer to the results of an analysis run previously (or by someone else) while working on a new analysis. Make sure to keep track of which Viewer is currently active when running analyses, as the output will be directed to the active Viewer (usually the last one you used or clicked).

• Syntax Editor

o The Syntax Editor displays statements from the SPSS command language, or syntax. This window is basically a plain text editor where you can type (or paste) commands you wish SPSS to run.

o The Syntax Editor does not open by default.

• Although the menus displayed on each window include options specific to that window, several key menus are displayed on all SPSS windows, including File, Edit, Data, Transform and Analyze, to allow quick access to core functionality.

First time users: A tutorial is available in SPSS. Click on Help > Tutorial. As you can see from the table of contents screen shown below, it is a comprehensive introduction and well worth exploring.

II. Getting Data into SPSS

Direct Entry

• You can use the SPSS Data Editor to enter raw data manually. Make sure you are in Data View. Check the tab at the very bottom of the Data Editor as shown below:

• Once you are in the Data View, just click in a cell and begin typing.

• The Variable View tab allows you to edit the variable names, type (numeric, string, etc.), variable labels, value labels, missing values and other variable attributes.

SPSS or .sav Files

• Click on File > Open > Data... to open an SPSS (.sav) file.

Excel, Stata, SAS, database Files

• SPSS can also directly open files from Excel, Stata, SAS, and some database files. Select the appropriate type from 'Files of type:' just below the 'File name:' text box.

Create a new variable:

• Using the sample data file '1991 U.S. General Social Survey.sav' (in a default installation, it is located in C:\Program Files\SPSS) we will create a new variable. The COMPUTE command creates new numeric variables or modifies the values of existing string or numeric variables.

• In any SPSS window, click Transform in the menu, then click Compute…

o To indicate clicking on menus, I will use bold font to indicate a menu choice and the greater-than symbol ‘>’ to indicate a sub-menu. The selection above will be typed like this: Transform > Compute… You should see the following on your screen:

• Clicking on Compute… produces the following dialog box.

• We are going to calculate 'Hlth_tot' by adding the values of the 9 'hlthx' variables in the data set.

o NOTE: If Hlth_tot already exists in the data, this COMPUTE statement will replace it with new values. If it does not exist, COMPUTE creates a new variable at the end of your data.

• In the ‘Target Variable:’ box at the top left, type in the name of the new variable you want to create, in this case Hlth_tot .

• The ‘Numeric Expression:’ box to the right is where we enter the formula to create the new variable. You can select variables on the left and click the arrow box to move them to the expression box. You can also just type directly into the box, or click the buttons below.

• If you select an entry in the ‘Function group:’ box, the functions in that group will be displayed in the ‘Functions and Special Variables:’ box in the bottom right. For this exercise, we’ll select ‘Statistical’.

• If you select one of those functions, a brief description will be displayed in the box at the center. Click on ‘Sum’.

• Now click on the arrow to move the function to the expression window above. Your dialog box should look like this:

• For this example, we just need to create a comma-separated list of the variables we want to sum between the parentheses. With the ‘?’ highlighted, double click on hlth1 to replace the ‘?’.

o Which is ‘hlth1’? The variable list, by default, lists the variables using the Variable Labels followed by the Variable Name in parentheses. If the name and label fill the box, you can hover the mouse pointer over the Label to see the full name. hlth1 is labeled ‘Ill Enough to Go to the Doctor’.

• We will be using hlth1 through hlth9. Go ahead and fill in the expression and click ‘OK’ to run.

o Since the variables are listed consecutively in the data, we can just type ‘(hlth1 TO hlth9)’ into the expression window. SPSS will use all the variables between hlth1 and hlth9 in the expression, based on the order of variables in the data, not on the variable names. Be careful: if you change the order of the variables, or add or delete variables, you will get a different result.

• If you look at the end of your data set, you will see the new variable.
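The way a sum over a 'TO' range picks its variables can be mimicked outside SPSS. The Python sketch below is only an illustration, not SPSS code: the case values are invented, `sum_range` is a hypothetical helper, and None stands in for a missing value. It assumes that missing values are skipped rather than propagated, which is how the SPSS SUM function treats them by default.

```python
def sum_range(case, first, last):
    """Sum the values stored between two variables, by position in the file."""
    names = list(case)                      # dicts keep insertion (file) order
    start, stop = names.index(first), names.index(last)
    if start > stop:
        raise ValueError("first variable must come before last in file order")
    selected = names[start:stop + 1]
    valid = [case[n] for n in selected if case[n] is not None]
    return sum(valid) if valid else None    # every value missing -> result is missing


# One invented case: 1 = yes, 2 = no, None = missing.
case = {"hlth1": 1, "hlth2": 2, "hlth3": None, "hlth4": 2, "age": 35, "hlth5": 1}

total_a = sum_range(case, "hlth1", "hlth4")   # hlth1..hlth4, skipping the missing hlth3
total_b = sum_range(case, "hlth1", "hlth5")   # also picks up 'age', which sits in between
```

In this made-up case, total_b includes age because that variable lies between hlth1 and hlth5 in file order. This is the pitfall noted above when variables are reordered, added or deleted.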

III. Analyzing Data

• There are two basic ways to use SPSS to analyze, manage and present data:

1. Select procedures from the menus (Data, Transform, Analyze, Graphs).

2. Open a Syntax Window and type in commands directly using the SPSS command language.

• For this session we’ll stick to the menu interface and use the file '1991 U.S. General Social Survey.sav'.

o Descriptive Statistics

1. Select the menu Analyze > Descriptive Statistics > Frequencies… to open the following dialog box:

2. Select the first four variables, then click the arrow to move them to the ‘Variable(s):’ box.

3. Click ‘Statistics…’ at the bottom to bring up the following dialog:

4. We won’t use these today, but it’s good to know that these options are there. Click ‘Continue’.

5. Now click ‘Charts…’

6. Click in the radio button for Bar Charts and click ‘Continue’.

7. Click ‘OK’ in the Frequencies dialog to run it. That’s all it takes to create Frequency Tables and Charts to begin to explore your data.
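To make the contents of a frequency table concrete, here is a minimal Python sketch of the counts and percentages such a table reports. It is not SPSS output: the responses are invented, `frequency_table` is a hypothetical helper, and None marks a missing response. The "valid percent" column is computed over non-missing cases only.

```python
from collections import Counter


def frequency_table(values):
    """Return rows of (value, count, percent, valid percent); None marks missing."""
    counts = Counter(values)
    n_total = len(values)
    n_valid = n_total - counts.get(None, 0)
    rows = []
    for value in sorted(v for v in counts if v is not None):
        count = counts[value]
        rows.append((value, count,
                     100.0 * count / n_total,    # share of all cases
                     100.0 * count / n_valid))   # share of non-missing cases
    return rows


# Invented responses, not taken from the survey file.
responses = ["Female", "Male", "Female", None, "Female", "Male", "Female", "Male"]
table = frequency_table(responses)
for value, count, pct, valid_pct in table:
    print(f"{value:<8}{count:>4}{pct:>8.1f}{valid_pct:>8.1f}")
```

The two percentage columns differ only when some responses are missing, which is why both are worth reading.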

o One-Way ANOVA

1. The One-Way ANOVA is found under Analyze > Compare Means > One-Way ANOVA... and produces the dialog box below:

2. For this analysis the groups will be formed by levels of 'Death of a Close Friend' and we'll see if 'General Happiness' differs by group. Click on the variable (on the left) 'General Happiness' then move it to the 'Dependent List:' box by clicking on the upper arrow button. Now click on 'Death of a Close Friend' (scroll down) and then click the lower arrow button to move that variable to the 'Factor:' box.

3. At this point the 'OK' button and the 'Paste' button activate, as you have entered the minimum information needed to run a One-Way ANOVA. However, you may want to click the 'Options...' button to bring up the following dialog:

4. All of these options will be useful at different times, depending on your circumstances. A quick reference is available by clicking the 'Help' button. Let's check all the boxes under Statistics and the Means plot as well. We'll leave Missing Values as the default.

5. Since our factor has only two levels, contrasts are not needed. If you do need them, the 'Contrasts...' button will open a dialog to allow you to define contrasts. The 'Post Hoc...' button will be covered in the GLM example below.

6. Click 'OK' to run the analysis.

7. The output in the Viewer reveals that there is no significant difference between the two groups. Let's look for a second at the means plot, though:

8. At first glance, that certainly LOOKS like a big difference, doesn't it? The default settings for charts in SPSS can sometimes be inappropriate for the data you are trying to display. This is one of those cases. In this situation, the difference between 1.78 and 1.80 is not important (or significant) and should not be exaggerated by the scale of the chart. Let's fix that.

First: double-click on the chart in the output window. This will open the Chart Editor in a new window, with your chart ready to edit.

Click on one of the scale points on the Y axis (the vertical one on the left). The scale points will now have a blue box around them.

Right Click in one of the blue boxes and select the first option 'Properties Window' to get the Properties dialog for the Y axis. (While editing a chart in SPSS, keep this in mind: When in doubt, right click on the item you want to change.) Click on the 'Scale' tab at the upper right and you should be seeing the following:

For those of you familiar with Excel, this may look familiar. We need to change the minimum and maximum from 'Auto' to 'Custom'. Uncheck the boxes next to 'Minimum' and 'Maximum' and replace the values to the right with a minimum of 0 and a maximum of 2. Let's also change the 'Major Increment' to 0.5. Click 'Apply' to update your chart. Hint: if you drag the properties box next to the chart, you can update the chart and see the impact of each change before clicking 'Close'. Those changes give us the following chart:

That's much better; the chart now displays the information in a more appropriate context.
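The effect of the axis range can be checked with a little arithmetic. The Python sketch below computes how much of the Y axis the gap between the two group means occupies under two ranges. The means 1.78 and 1.80 and the custom range 0 to 2 come from the text. The "automatic" range of 1.775 to 1.805 is an assumption made for illustration, since the limits SPSS actually picks depend on the chart, and `share_of_axis` is a hypothetical helper.

```python
def share_of_axis(low_mean, high_mean, axis_min, axis_max):
    """Fraction of the y-axis height taken up by the gap between two means."""
    return (high_mean - low_mean) / (axis_max - axis_min)


# Group means quoted in the text.
low, high = 1.78, 1.80

# Assumed automatic range that hugs the data; the real defaults may differ.
auto_share = share_of_axis(low, high, 1.775, 1.805)

# Custom range entered in the Scale tab: minimum 0, maximum 2.
custom_share = share_of_axis(low, high, 0.0, 2.0)

print(f"auto: {auto_share:.0%} of the axis, custom: {custom_share:.0%}")
```

Under the assumed tight range the same 0.02 gap fills most of the axis, while on the 0 to 2 scale it occupies about one hundredth of it.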

• One-Way ANOVA (using GLM)

o The One-Way ANOVA is found under Analyze > General Linear Model > Univariate... and produces the dialog box below:

o We will be doing the same analysis as before, so the groups will be formed by levels of 'Death of a Close Friend' and 'General Happiness' will be the dependent variable. Click on 'General Happiness' then move it to the 'Dependent Variable:' box by clicking on the upper arrow button. Now click on 'Death of a Close Friend' (scroll down the variable list to find it) and then click the arrow button to move it to the 'Fixed Factor(s):' box.

Why 'Fixed' and not the 'Random Factors' box? Another good question. If you right click on 'Fixed Factor(s):' a small pop up appears that provides a brief description. (Most of the titles in SPSS dialog boxes have this brief description available.) The text of this pop up is 'The levels of a fixed factor include all levels about which conclusions are desired.' Since our variable is yes/no, all the levels we're interested in are included and we should include it as a Fixed Factor.

Random Factors have levels that are a random sample of the levels about which we want conclusions. The key is to keep in mind what you want to do with the results. Are you interested in generalizing your findings to levels that are not sampled in the variable (the levels are sampled from a larger population of possible levels) or did you include the entire population in your levels? If a variable's levels are sampled from a larger population, you should treat that variable as a Random Factor.

o Once you have assigned variables in the Univariate dialog box, click on 'Model...' to get the following dialog:

We will not be changing anything in this dialog for this analysis. Note the default is to run a full factorial model, that is, with multiple factors and/or covariates in your model, all the interactions will be included. The dialog gives you a way to 'prune' the list if you need to. You can also change the type of sum of squares calculated and whether or not to include an intercept in your model. Click 'Continue'.
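For a model with a single factor, as here, the sums of squares reduce to a between-group part and a within-group part, and the choice of sum-of-squares type does not change them. The Python sketch below works through that decomposition and the resulting F ratio by hand. It is not the SPSS procedure: the scores are invented rather than taken from the survey file, `one_way_anova` is a hypothetical helper, and no p-value is computed.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (SS between, SS within, F) for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between groups: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: how far each score sits from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_stat


# Invented happiness scores (1 to 3) for two groups; not the survey data.
no_death = [1, 2, 2, 3]
death = [2, 2, 3, 3]
ss_b, ss_w, f_stat = one_way_anova([no_death, death])
```

With these invented scores the between-group sum of squares is 0.5, the within-group sum of squares is 3, and F works out to 1, which would not indicate a group difference.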

o Clicking the 'Contrasts...' button opens the following dialog:

A number of kinds of contrast are available through this dialog. To change or add contrasts first select the factor of interest then select the type of contrast from the list box below. After you select the contrast type, you still need to click the 'Change' button. You will then see the contrast type listed next to the factor in parentheses. For this analysis, our factor has only 2 levels so contrasts are not necessary. Click 'Cancel' to return to the main Univariate dialog.

o Now click on the 'Plots...' button to bring up the Profile Plots dialog:

The Profile Plots dialog allows you to plot (graph) any or all of the factors in your analysis. To plot our factor, make sure it is highlighted in the 'Factors:' box and then click the arrow to copy it to the 'Horizontal Axis:' box. Now click the 'Add' button below to list it in the 'Plots' box at the bottom. The 'Separate Lines:' and 'Separate Plots:' boxes are self-explanatory. You can add multiple plots and try switching factors from the horizontal axis to separate lines to visually explore your results. If you want to make adjustments to a plot, select it in the 'Plots' box, make the changes above and then click the 'Change' button. To delete a plot, select it then click 'Remove'.

o The 'Post Hoc...' button displays the following dialog:

Since no factors have been selected, the various tests are grayed out. When you select the factor(s) the tests will become active and you can select any or all the tests you want. The current analysis does not require any post hoc tests so we'll click 'Continue' or 'Cancel' to return to the Univariate dialog.

o The 'Save...' dialog provides options to save various diagnostic information as variables. The dialog is below:

IV. Creating Graphs

• Click on Graphs to access the many graphing options of SPSS.

• The tutorial (Help > Tutorial) will introduce Chart Builder..., a wizard-style approach to creating charts. Basically, you drag & drop the kind of chart you want, then drag & drop the variables to create a chart. There are numerous options to tweak.

• The Interactive option is a little more structured, requiring you to select the type of chart you want before you can drag & drop variables to create the chart.

• The options directly under Graphs don’t provide drag & drop options. But if you already know what kind of graph you want to create, they can be quicker than going through the Chart Builder. The Chart Builder and Interactive menus do not offer all the options available for each type of graph.

o For example, other than using syntax or a newer version of SPSS, the only way to create a matrix of scatterplots like the one below is to use Graphs > Scatter/Dot..., click 'Matrix Scatter', then 'Define'. I selected ‘Age of Respondent’, ‘Highest Year of School Completed’, ‘Highest Year of School Completed, Father’, and ‘Highest Year of School Completed, Mother’.

• While running the ANOVAs we created simple plots. This time we’ll create a boxplot of the two groups we analyzed.

o Go to Graphs > Boxplot… You should see the following dialog:

We’re creating a simple boxplot, and since we want to display 1 variable split into 2 groups, the default settings are correct. Click ‘Define’.

o I’ve added the variables to the ‘Define Simple Boxplot:’ dialog below:

o Click ‘OK’ to produce the following chart:
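A boxplot summarises each group with a handful of numbers: the extremes, the quartiles and the median. The Python sketch below computes such a summary for one invented group of scores. It is an illustration only: `box_summary` is a hypothetical helper, and quartile conventions differ between programs, so the box edges SPSS draws may not match these values exactly.

```python
from statistics import median, quantiles


def box_summary(values):
    """Minimum, lower quartile, median, upper quartile, maximum of a group."""
    # method="inclusive" treats the data as the whole population of interest.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median(values), q3, max(values)


# Invented scores for one group; not the survey data.
scores = [1, 2, 2, 3, 3]
summary = box_summary(scores)
print(summary)
```

Running the same helper on each group's scores gives the numbers to compare side by side, which is what the two boxes in the chart show visually.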

V. Saving or Printing Data, Output or Graphs

• Click on File > Save to save work in the active window.

• Just as SPSS can open files from a number of other programs, it can save files in a number of formats. Just click File > Save As… and choose from the options under ‘Save as type:’.

• Click on File > Print to print the contents of the active window, or click on the Print button on the Icon Bar.

Saving paper, Editing print output:

• Results of analyses in SPSS are displayed in an SPSS Viewer window. The left pane displays an outline of the output, including titles, notes, and statistics. Clicking on a small box with a minus (-) sign in it will collapse that entry in the output, hiding it from view in the main window. Individual items in a listing can be toggled from displayed to hidden (and back) by double clicking on the icon in the outline pane.

• There are several options for printing from SPSS that consolidate output and save paper. The following are available from the SPSS Viewer window. These options may be combined to consolidate output even further.

o Select the section(s) you want to print from the outline box:

A single click suffices for one section. If several sections need to be printed out, click on the appropriate sections while holding down the CTRL key.

Once the selection has been made, click on File > Print > Selection

o Delete or hide sections you do not need to print:

Select section(s) from the outline box and use the DELETE key to discard them, or click on the box with the minus (-) sign to hide them.

Click on File > Print > All visible output

• If presentation style is not an issue, you can change the text output page size, which will print output without any page breaks.

o Click on Edit > Options, and then the Viewer tab.

o Click on Infinite in the Text Output Page Size box.

o Click on OK to save options.

o Click on File > Print > All visible output

• Clear page breaks and add them wherever you prefer:

o Click on Insert > Clear Page Break to clear current page breaks.

o To add page breaks at certain sections, click on the section where you want the new page to start, either from the outline box or from the text output on the right.

o Click Insert > Page Break

o Click on File > Print > All visible output

TIP: Use File > Print > Preview to check the output to decide on modifications and options for printing.