Author: Jenny Freeman

Introduction to SPSS: 1

Date created: 23/10/2020

Contents
Opening SPSS for the first time
Variable View
Data View
Opening an EXCEL File
Recoding variables using Automatic Recode
Creating a grouping variable from a continuous variable
Computing new variables
Saving Files

Introduction to SPSS: Workshop 1 notes

These notes assume that SPSS is already installed on the computer (see notes on Installing SPSS for students and staff at University of Sheffield). They cover what was discussed in the workshop and are not intended to be a complete record of everything that you could/should know when starting to use SPSS for the first time. They will give you enough of an introduction to get you started.

Opening SPSS for the first time

To open SPSS for the first time, click on the Start button at the bottom left of your screen and scroll down the list to IBM SPSS Statistics > IBM SPSS Statistics 26. Click on this and SPSS will start to open up. Don’t worry if nothing appears to happen; it can take a minute or two to open up.

You should see the screen opposite. If you’ve never used SPSS before the Recent Files box will be empty.

If you have used SPSS before and want to use a file that you have used recently, click on the filename in the Recent Files box to open it. Otherwise click on the Close button on the bottom right and you will see the main SPSS Data Editor Interface. Depending on how it opens, you will either be in the Data View or Variable View sheet. You can toggle between Data View and Variable View by clicking on the tabs at the bottom left of the screen. The image below is of the Variable View:

Variable View

The Variable View contains the information on each of the variables in an SPSS dataset. Each line contains the information for a separate variable, and each column holds one of the pieces of information needed to define a variable. Let’s take a closer look at the top bar:

The ones that are useful to understand are listed below:

Name

This is the name given to the variable. Names must be unique, so you cannot have two variables with the same name. They must begin with a letter, cannot contain spaces, full-stops or symbols (e.g. !, * etc) and cannot exceed 64 characters in length – but don’t worry too much, SPSS will let you know if you use a name that is not allowed.

Type

This allows you to choose the type of variable. The default is Numeric and this is usually OK. To change the Type, click on the box and then on the three little dots to the right and this will open up the Types dialogue box. Select the type that you want.

Other useful types include Dates and Strings. The Dates type is used for defining variables that are dates. If you have a Date variable, once you select Date, there is another box on the right that allows you to select how the date is formatted.

The String type is used when you have a character variable. If you have a character variable you may want to change the Width, as described below, to allow for more characters in the variable.

Width

You can ignore this if you have a numeric variable. It is most useful when you have a character variable as it controls the number of characters your variable can contain. The default is 8, which means that each value can hold at most 8 characters. For storing longer strings such as free-text comments you will need to increase it so that it is at least as long as your longest comment.

Decimals

This controls the number of decimal places that are displayed for numeric variables in Data View. The default is 2.

Label

This is really useful as it allows you to label your variable and give it a more descriptive label than you can with the Name field. You can use spaces and other special characters. Any output that you produce will be displayed using the Label rather than the Name.

Values

This is another useful field as you can use it to define the meaning of specific values for your variable. For example, let’s suppose you have a Likert scale numbered from 1 (Strongly Disagree) through to 5 (Strongly Agree). Using the Values field you can label the numbers to reflect their meaning.

To do this, click on the Values box and then on the three little dots to the right and this will open up the Values dialogue box. Type in each value and its label.

Missing

SPSS will read any cell without data in it as a missing value. In addition, you may want to code specific types of missing data, such as missing because the respondent refused to answer the question, or because there was an equipment failure. This field allows you to set up specific values as missing values, for example 999 for missing age. Note that you can use the Values field to label the different missing codes.
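To see why it matters that a code such as 999 is declared as missing, here is a small sketch of the idea written in Python rather than SPSS. It is an illustration only: the ages are made up and the function name valid_values is hypothetical. If the code is not declared as missing, it is treated as a real age and distorts the mean.

```python
from statistics import mean

# Made-up ages: 999 is the user-defined code for "age missing" and
# None stands for an empty cell (what SPSS would call system-missing).
ages = [34, 999, 27, None, 45]
MISSING_CODES = {999}


def valid_values(values, missing_codes):
    """Keep only values that are neither empty nor a declared missing code."""
    return [v for v in values if v is not None and v not in missing_codes]


valid = valid_values(ages, MISSING_CODES)

# What happens if we forget to declare 999 as a missing code:
naive = [v for v in ages if v is not None]

print(mean(valid))  # mean of 34, 27 and 45
print(mean(naive))  # inflated, because 999 is counted as an age
```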

Columns

Default is set to 8. This is usually sufficient. You can change this if you want to accommodate larger values, or longer variable names.

Align

Default alignment of the text in Data View. Default is right and there is usually no need to change this.

Measure

This is where you can tell SPSS what sort of data you have. There are three options:

Nominal: use for categorical data with no natural ordering, such as tree species.

Ordinal: use for categorical data where there is a natural ordering, such as with a Likert scale.

Scale: use for discrete and continuous data.

Role

Ignore this field

Data View

Now let’s look at Data View. This is where you can input your data, and view the data in files that have previously been created. In Variable View each line represented a particular variable and the columns described the different characteristics of that variable. In Data View, each line represents one sampling unit e.g. the data for a particular individual or observation. Each variable is allocated a column in Data View.

If you have set up value labels for variables, in Data View you can get SPSS to display either the value or the label by clicking on View on the top ribbon. Scroll down to Value Labels. If this box is checked as in the image on the right, the value labels will show. If it is unchecked the values will be displayed in Data View, not the labels.
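The relationship between stored values and value labels can be sketched in a few lines of Python (an illustration of the idea, not something you need to run in SPSS). Only the labels for 1 and 5 come from the Likert example in these notes; the labels for 2, 3 and 4, the responses and the function name display are assumptions made for the example.

```python
# Value labels for a 5-point Likert item. The wording for 2, 3 and 4
# is assumed for illustration.
value_labels = {
    1: "Strongly Disagree",
    2: "Disagree",
    3: "Neither",
    4: "Agree",
    5: "Strongly Agree",
}

responses = [5, 2, 1, 4]  # made-up stored values


def display(values, labels, show_labels):
    """Return the labels if show_labels is switched on, otherwise the values."""
    if show_labels:
        # A value with no label defined is shown as the value itself.
        return [labels.get(v, v) for v in values]
    return list(values)


print(display(responses, value_labels, show_labels=True))
print(display(responses, value_labels, show_labels=False))
```

The data themselves never change: the numbers are what is stored, and the labels are only a way of displaying them.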

Opening an EXCEL File

SPSS is able to read in a variety of file types including EXCEL, CSV, Text and files from other statistical packages such as SAS and STATA. You can do this in one of two ways. The first is to select FILE > OPEN > DATA and, in the Files of type box, click on the arrow on the right and select the type you want.

Alternatively you can click on FILE > IMPORT DATA and select the data file type from the list. It gives you the same options as FILE > OPEN > DATA without the SPSS and Portable options. We are going to open the Titanic dataset. It is saved as a CSV file (Comma Separated Values). In this file format the different variables on each line are separated by a comma. Select CSV from the list and navigate to the folder where the file is saved. Click Open and the Text Import Wizard dialogue box will open. You will need to go through the 6 steps.

Step 1: You will need to say whether there is a predefined format. Select No and click the Next button at the bottom of the box.

Step 2:

Step 3:

Step 4: At this stage, check whether any text is enclosed in quote marks. For the Titanic data, the names are in double quotes, so I have selected this:

Step 5: At this stage you can tell SPSS what type/measure each variable is by selecting the variable from the Data Preview window and then selecting the Data format from the list. Alternatively, you can click on Next and SPSS will select the format based on the data that it is reading in:

Step 6: Click on Finish, unless you want to save the file format or create a syntax file (more on this in a later tutorial).
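The reason Step 4 matters is easier to see with a small example. The sketch below uses Python's csv module, not SPSS, and the two data lines and column names are made up in the style of a passenger list. It reads the same text twice, once with the double quote declared as the text qualifier and once without.

```python
import csv
import io

# Two made-up lines in the style of a passenger list; the column names
# and values are illustrative, not copied from the workshop file.
raw = 'Name,Gender,Age\n"Smith, Mr. John",M,34\n"Jones, Miss. Anna",F,28\n'

# With the double quote declared as the text qualifier, the comma inside
# each name is kept as part of the name.
with_qualifier = list(csv.reader(io.StringIO(raw), quotechar='"'))

# Without a text qualifier, every comma splits the line, so each data
# line breaks into four pieces instead of three.
without_qualifier = list(csv.reader(io.StringIO(raw), quoting=csv.QUOTE_NONE))

print(with_qualifier[1])
print(without_qualifier[1])
```

In the second case the name is split in two and every later value is pushed into the wrong column, which is the kind of problem the quote mark setting is there to prevent.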

Here are the first few lines of data in the Data View window:

Here is the Variable View window for the Titanic data that has been read in from the CSV file:

Recoding variables using Automatic Recode

Looking at the Titanic data above, we can see that Gender and Embarked (where the passenger embarked on the Titanic) have both been read in as string variables. This is because in the CSV file they were represented by letters:

Gender: M = male, F = female

Embarked: C = Cherbourg; S = Southampton; Q = Queenstown.

You can use Automatic Recode to recode a string variable as a numeric variable. Select Transform > Automatic Recode. In the Automatic Recode dialogue box, select the variable you want to recode from the list on the left, and click on the arrow to the right to move it into the Variable -> New Name box. Then enter the name of the new variable into the New Name box and click on Add New Name. Tick the box to get SPSS to Treat blank string values as user-missing and then click OK.

Looking at the new variable and its values in Variable View, we can see that it has created numeric codes with labels that reflect the original string values:
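The logic behind this kind of recode can be sketched in Python. This is a simplified illustration and not how SPSS is implemented: the function name automatic_recode and the example values are made up, the codes are assumed to be assigned in alphabetical order, and blank strings are simply returned as None, whereas SPSS records them as a user-missing code.

```python
def automatic_recode(values):
    """Give each distinct non-blank string a numeric code, in alphabetical order.

    Returns the list of codes and a dictionary of value labels.
    """
    distinct = sorted({v for v in values if v.strip() != ""})
    code_for = {text: code for code, text in enumerate(distinct, start=1)}
    # Blank strings are not in the lookup, so they come back as None.
    codes = [code_for.get(v) for v in values]
    labels = {code: text for text, code in code_for.items()}
    return codes, labels


embarked = ["S", "C", "Q", "", "S"]  # made-up example values
codes, labels = automatic_recode(embarked)
print(codes)
print(labels)
```

The labels dictionary plays the same role as the Values field in Variable View: it records which original string each numeric code stands for.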

Creating a grouping variable from a continuous variable

In the Titanic dataset, one of the variables is age. Let’s suppose we want to create three age groups: those aged under 18, those aged 18 to 64 and those aged 65+. To create age groups, select Transform > Recode into Different Variables. You can select Recode into Same Variables, but this overwrites the original variable and is not recommended as you cannot compare your transformed variable with your original variable to confirm that the transformation worked as planned. If you use Recode into Different Variables, you can always remove the original variable once you have confirmed that the transformation worked as planned.

Once you have the Recode into Different Variables dialogue box open, move the variable that you want to recode from the left into the Numeric Variable -> Output Variable box by selecting it and clicking the arrow to the right. Then create a name for the new variable by typing it into the Output Variable Name field and clicking Change. Remember that variable names must not contain spaces or special characters. Next you need to create the age groups by clicking on Old and New Values.

Now we can create our age groups. When recoding, it can be useful to know what the minimum and maximum values are, and how many decimal places there are for the variable being recoded. The reason for this is explained below.

The left-hand side is for the old values, and the right-hand side is where you input the new values. First let’s copy over all the system-missing values so that they are also missing in the new coding. Select System-missing on the left and System-missing on the right (or, if you also have user-coded missing values, select the item below it on the left, System- or user-missing). Then click Add. You will see SYSMIS -> SYSMIS appear in the Old -> New box.

Now let’s start coding our age groups. Click on Range, LOWEST through value and add 17.99 to this box. In New Value select Value and add 1 to the box. Click Add. Only select Range, LOWEST through value if you are sure there are no negative ages, or other age values below what there should be in the dataset. These can arise from data entry errors, and if you don’t spot them an incorrect value will be recoded into a valid age group. I have suggested using 17.99 rather than 17 because there may be decimal ages in the dataset, such as 17.5 years, and this ensures that any age up to and including 17.99 is coded as being less than 18 years. This is why it is good to know, before recoding, whether age is recorded in full years or in decimal years.

To create the middle age group, 18 to 64 years, select Range and put 18 in the first (top) box and 64.99 in the second (bottom) box. In New Value select Value and add 2 to the box. Click Add.

Finally, to create the group for those aged 65+, click on Range, value through HIGHEST and input 65. In the New Value field, click on Value and add 3. Click Add. Then click Continue to add these codes. This will take you back to the main Recode dialogue box. Click on OK and you will have created a new age group variable.
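The recoding rules set up above can be written out as a short Python function. This is a sketch of the logic only, not SPSS syntax, and the function name age_group is made up for the example. It makes two points easy to check: where each boundary falls, and what happens to a value that none of the ranges covers.

```python
def age_group(age):
    """Recode an age in years into group 1 (under 18), 2 (18 to 64) or 3 (65+)."""
    if age is None:          # a missing age stays missing
        return None
    if age <= 17.99:         # LOWEST through 17.99
        return 1
    if 18 <= age <= 64.99:   # 18 through 64.99
        return 2
    if age >= 65:            # 65 through HIGHEST
        return 3
    return None              # falls in a gap between the ranges, e.g. 17.995


for example in [0.1667, 17.5, 18, 64, 65, 80, None, 17.995]:
    print(example, age_group(example))
```

Note that a value such as 17.995 is not covered by any of the three ranges and so ends up missing, and that a negative age would be placed in group 1. Both are reasons to check the minimum, maximum and number of decimal places before choosing the cut-offs.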

At this point I always like to check that my coding has worked, so I do some simple descriptive statistics using Analyze > Descriptive Statistics > Explore.

You will get the following dialogue box. Move age to the Dependent List box, and age_group to the Factor list. This will give you some descriptive statistics for age, within each of the age groups. Select Statistics from the Display box. Then click OK. You will get a lot of output. Below is an edited selection of the output showing you the number in each group and the minimum and maximum values in each age group. You can see that in each age group the minimum and maximum values are what you would expect for that age group.

Cases (age, by age_group)

| age_group | Valid N | Valid Percent | Missing N | Missing Percent | Total N | Total Percent |
|---|---|---|---|---|---|---|
| 1.00 | 154 | 100.0% | 0 | 0.0% | 154 | 100.0% |
| 2.00 | 879 | 100.0% | 0 | 0.0% | 879 | 100.0% |
| 3.00 | 13 | 100.0% | 0 | 0.0% | 13 | 100.0% |

Descriptives (age, by age_group)

| age_group | Minimum | Maximum |
|---|---|---|
| 1.00 | .1667 | 17.0000 |
| 2.00 | 18.0000 | 64.0000 |
| 3.00 | 65.0000 | 80.0000 |

Computing new variables

You can create new variables from a combination of existing variables. In the session we looked at creating a new variable ‘Days’ that counted the amount of time between two dates in days. Select Transform > Compute Variable and you will get the following dialogue box. Add the name of the new variable to Target Variable. In Function group click on Date Arithmetic and double-click on Datediff. Then, in the brackets, add the end date variable, the start date variable and “days”, separated by commas. Click OK and you will have created the difference between the two dates in days.
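The calculation itself is simply the end date minus the start date, counted in days. Here is the same idea as a small Python sketch (not SPSS syntax; the function name days_between and the two dates are made up for the example). The end date goes first and the start date second, matching the order used in the brackets above.

```python
from datetime import date


def days_between(end_date, start_date):
    """Number of whole days from start_date to end_date (end minus start)."""
    return (end_date - start_date).days


# Made-up example: 1 October 2020 to 23 October 2020.
print(days_between(date(2020, 10, 23), date(2020, 10, 1)))  # 22
```

If the two dates are entered the wrong way round the result is negative, which is a quick way to spot that the order has been mixed up.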

Saving Files

Once you have finished your session, you can save your data file as an SPSS file. You can also save any output you have created as an SPSS output file. This will enable you to come back to it at a later date. To save the data file, make sure that you are in the Data window. It doesn’t matter whether you are in Variable View or Data View. Click on File > Save if you have an existing file open and are happy to overwrite it. Alternatively, if you want to create a new file, click File > Save As and you will get the Save Data As dialogue box. Select the directory that you want to save the file to, type in the name of the new file and click Save. In the image below it is saving to the Downloads directory. Note that SPSS data files have the extension .sav.

You can save the Output in a similar way. SPSS output files have the extension .spv.

To close SPSS down click on File > Exit. If you haven’t already saved your data and output files, you will be asked if you want to. If you select no, it won’t be saved, but if you select yes the Save As dialogue box will open to allow you to save your data and output.