Statistics for Cross-Cultural Research - University of...
Chapter 4
How To Do Cross-Tabs in SPSS 10.0/11.0
In Chapter 3, we did an exercise to examine the scatterplot of Population Density and Jurisdictional Hierarchy (Figure 3.3). The red line in that plot shows the average values on the Y coordinate (Jurisdictional Hierarchy, scaled from 1 to 5) for each value of the X coordinate (roughly, the log to base 5 of the population density). A cross-tabulation (or cross-tab for short) differs from a scatterplot in that the rows represent the values of one variable (e.g., those of the Y variable in a scatterplot) and the columns represent the values of a second variable (e.g., those of the X variable in our scatterplot). The information may be the same as in a scatterplot, but here, instead of a graph, each cell of the table holds the number of cases that have a given pair of values on the two variables.
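As a minimal sketch of what a cross-tab counts, the Python snippet below tallies pairs of values into cells. The observations and the names `pairs` and `crosstab` are hypothetical and serve only to illustrate the idea; they are not taken from the data set used in this chapter.

```python
from collections import Counter

# Hypothetical (x, y) observations: x = column variable, y = row variable.
pairs = [(1, 1), (1, 1), (1, 2), (2, 2), (2, 3), (3, 3), (3, 3)]

def crosstab(observations):
    """Count how many cases fall into each (row value, column value) cell."""
    cells = Counter((y, x) for x, y in observations)
    rows = sorted({y for _, y in observations})
    cols = sorted({x for x, _ in observations})
    # Nested dict: table[row][col] = number of cases (0 if the cell is empty).
    return {r: {c: cells.get((r, c), 0) for c in cols} for r in rows}

table = crosstab(pairs)
for r in sorted(table):
    print(r, [table[r][c] for c in sorted(table[r])])
```

Each printed line is one row of the cross-tab; the sum of all cells equals the number of cases.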
In this chapter we cover getting variables from an SPSS file, asking for percentages, asking for statistics, getting your table, exporting your table to Word or HTML for use in a research paper, website, or publication, and, last, a better way to export your table to Word or HTML.
Getting variables
Now, let us do a cross-tabulation for population density and this measure of political complexity to test the hypothesis that increases in density are correlated with growth in complexity.
To start:
1. IN MENU LINE CHOOSE:
ANALYZE → DESCRIPTIVE STATISTICS → CROSSTABS
You will see the following window:
2. MOVE RESPECTIVE VARIABLES TO "ROW" AND "COLUMN" BOXES.
Questions that might arise at this point are:
"If I see a list of variables by number, how do I get this list of variables by name?" Or, "How do I get a list of variables by number?"
Cancel the window above and click Edit and then Options on the main menu. The General tab for options will then open. Under "Variable Lists," click the "Display labels" button (or "Display names," which is what we have above) and then click the "OK" button.
Now go back to crosstabs: from menu, ANALYZE → DESCRIPTIVE STATISTICS → CROSSTABS. Then you might have
Another question that will arise at this point is:
"Which variable should be put in which box?"
Recall that in formulating our hypothesis, we thought that population density might affect the development of political complexity. There are good reasons for this expectation. Population can grow within a given area due to the balance between birth rate and immigration on the one hand and mortality and emigration on the other. When population becomes denser, however, new forms of political integration are needed (Johnson 1982). When we have this kind of idea about which variable is likely to be the predictor of the other, which is often a matter of temporal ordering and at other times a matter of logical priority, we call the predictor the independent variable and the one that is predicted the dependent variable. Here we will use this terminology to distinguish independent and dependent variables. Sometimes, however, we are simply interested in the relationship without a notion of causal, temporal, or logical order. Even then, some statistical measures of correlation still distinguish between the independent and the dependent variable on the basis of which is used in a formal sense to make the prediction. This will be treated under the idea of a mathematical function in Chapter 5, where the function takes the predictor and returns the prediction, and that is the only difference between the independent and dependent variables.
Our question about which variable to assign to the rows of a cross-tabulation and which to the columns will have different answers depending on the following:
2a. If both variables have the same number of values,
PUT INDEPENDENT VARIABLE IN THE "ROW" BOX; PUT DEPENDENT VARIABLE IN THE "COLUMN" BOX.
2b. If one variable has more values than the other,
PUT THIS VARIABLE (THE ONE WITH MORE VALUES) IN THE "ROW" BOX; PUT THE OTHER VARIABLE (THE ONE WITH FEWER VALUES) IN THE "COLUMN" BOX.
The reason for this rule is as follows: if we put the variable with many values in columns, the table will become too wide and it will be difficult (or impossible) to fit it on a standard page (in fact, we shall encounter this problem once below). In general, as the standard orientation of paper is "portrait" rather than "landscape," it is always more convenient to deal with tables that are long but narrow rather than short but wide.
We are now going to cross-tabulate the following variables:
V64. POPULATION DENSITY
and
V237. JURISDICTIONAL HIERARCHY BEYOND LOCAL COMMUNITY
V64 ("POPULATION DENSITY") has the following values:
1 = < 1 person per 5 sq. mile
2 = 1 person per 1–5 sq. mile
3 = 1–5 persons per sq. mile
4 = 6–25 persons per sq. mile
5 = 26–100 persons per sq. mile
6 = 101–500 persons per sq. mile
7 = over 500 persons per sq. mile
V237 ("JURISDICTIONAL HIERARCHY BEYOND LOCAL COMMUNITY" ≈ "POLITICAL COMPLEXITY INDEX") has the following values:
1 = No levels (no political authority beyond community)
2 = One level (e.g., petty chiefdoms)
3 = Two levels (e.g., larger chiefdoms)
4 = Three levels (e.g., states)
5 = Four levels (e.g., large states)
Thus, the "Population density" variable has 7 values, whereas the "Political complexity" variable has only 5. In addition, "Population density" is more likely to be regarded as the independent variable. Hence, in the present situation we have every reason to
PUT V64 ("POPULATION DENSITY") IN ROWS, and PUT V237 ("JURISDICTIONAL HIERARCHY BEYOND LOCAL COMMUNITY") IN COLUMNS:
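Rules 2a and 2b can be written down as a small decision function. The following is only a sketch in Python; the function name `choose_layout` and the tuple format are hypothetical and are not part of SPSS.

```python
def choose_layout(independent, dependent):
    """Apply rules 2a/2b and return (row_variable, column_variable).

    Each argument is a (name, number_of_values) tuple.
    """
    (ind_name, ind_n), (dep_name, dep_n) = independent, dependent
    if ind_n == dep_n:
        # Rule 2a: same number of values, so the independent variable goes in rows.
        return ind_name, dep_name
    # Rule 2b: the variable with more values goes in rows.
    return (ind_name, dep_name) if ind_n > dep_n else (dep_name, ind_name)

# V64 (independent) has 7 values; V237 (dependent) has 5.
rows, cols = choose_layout(("V64", 7), ("V237", 5))
print(rows, cols)  # V64 V237
```

Here both considerations point the same way: V64 has more values and is also the independent variable, so it goes in rows.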
But after you have done this, it is still too early to click the "OK" button. So, as your next step:
3. CLICK THE "CELLS..." BUTTON.
Asking for percentages
You will see the following window:
We advise you always to make crosstabs not only with observed counts but also with percentages. As we shall see below, crosstabs with percentages are immensely more useful than those without them. To make a crosstab with percentages, tick the boxes in the "Percentages" part of the submenu. You can tick both "Row" and "Column," but experience shows that in this case the resulting tables are not user-friendly. So we advise the following:
If the independent variable is in rows, tick the "Row" box; if the independent variable is in columns, tick the "Column" box!
In our case the independent variable ("Population Density") is in rows. So, tick the "Row" box. You will see the following window:
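To make the difference between the "Row" and "Column" options concrete, here is a minimal Python sketch that computes both kinds of percentages from a table of counts. The 2 x 2 counts are hypothetical, and the function names are illustrative only.

```python
def row_percentages(counts):
    """Percent of each row's cases that fall in each column."""
    result = []
    for row in counts:
        total = sum(row)
        result.append([round(100 * n / total, 1) if total else 0.0 for n in row])
    return result

def column_percentages(counts):
    """Percent of each column's cases that fall in each row."""
    col_totals = [sum(col) for col in zip(*counts)]
    return [
        [round(100 * n / t, 1) if t else 0.0 for n, t in zip(row, col_totals)]
        for row in counts
    ]

# Hypothetical table: rows = independent variable, columns = dependent variable.
counts = [[30, 10],
          [20, 40]]
print(row_percentages(counts))     # [[75.0, 25.0], [33.3, 66.7]]
print(column_percentages(counts))  # [[60.0, 20.0], [40.0, 80.0]]
```

With the independent variable in rows, the row percentages show how the dependent variable is distributed within each category of the independent variable, which is the comparison the rule above is designed to produce.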
Asking for statistics
After this, click the "Continue" button. Then do not forget to request the statistical analysis of the crosstab. To do this, click the "Statistics..." button. You will see the following window:
If you are a beginner in statistical analysis, we would advise you to tick the following boxes:
Chi-square
Phi and Cramer's V
Correlations
Gamma
Kendall's tau-b
You will not necessarily need all the resulting additional tables to analyze each particular crosstab statistically, but what you get will be quite sufficient to answer any questions that could arise in the near future when you analyze crosstabs statistically. Statistics are covered in the next chapter.
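For readers who want a first look at what two of these statistics measure, the sketch below computes the Pearson chi-square statistic and Cramer's V from a table of counts using the standard textbook formulas. It is an illustration only, not a description of how SPSS computes its output (SPSS also reports significance levels and further statistics). The example counts are hypothetical, and the code assumes that no row or column total is zero.

```python
import math

def chi_square(counts):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(counts):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2

def cramers_v(counts):
    """Cramer's V: chi-square rescaled to lie between 0 and 1."""
    n = sum(sum(row) for row in counts)
    k = min(len(counts), len(counts[0]))  # smaller of the two table dimensions
    return math.sqrt(chi_square(counts) / (n * (k - 1)))

example = [[10, 20],
           [20, 10]]  # hypothetical counts
print(round(chi_square(example), 3), round(cramers_v(example), 3))
```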
Getting your table
Now, press "Continue," then "OK," and you will get the following table:
Population Density * Jurisdictional Hierarchy Beyond Local Community Crosstabulation

                                                        Jurisdictional Hierarchy Beyond Local Community
Population Density                                            No     One     Two   Three    Four   Total
                                                          levels   level  levels  levels  levels
< 1 person / 5 sq. mile     Count                             29       6       1                      36
                            % within Population Density    80,6%   16,7%    2,8%                  100,0%
1 person / 1-5 sq. mile     Count                             17       5                              22
                            % within Population Density    77,3%   22,7%                          100,0%
1-5 persons / sq. mile      Count                             11       8       4       2              25
                            % within Population Density    44,0%   32,0%   16,0%    8,0%          100,0%
6-25 persons / sq. mile     Count                              7       9       4       5       2      27
                            % within Population Density    25,9%   33,3%   14,8%   18,5%    7,4%  100,0%
26-100 persons / sq. mile   Count                              9      13       5       5       2      34
                            % within Population Density    26,5%   38,2%   14,7%   14,7%    5,9%  100,0%
101-500 persons / sq. mile  Count                              4       6       4       3       2      19
                            % within Population Density    21,1%   31,6%   21,1%   15,8%   10,5%  100,0%
over 500 persons / sq. mile Count                              3       1       5       4       6      19
                            % within Population Density    15,8%    5,3%   26,3%   21,1%   31,6%  100,0%
Total                       Count                             80      48      23      19      12     182
                            % within Population Density    44,0%   26,4%   12,6%   10,4%    6,6%  100,0%
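The marginal totals and the percentages in the "Total" row of the table above can be re-computed from the cell counts. The Python sketch below does this; entering the blank cells of the SPSS output as 0 is an assumption about how those blanks should be read.

```python
# Counts copied from the table above, one list per population-density row.
# Cells that are blank in the SPSS output are entered as 0 (an assumption).
counts = [
    [29, 6, 1, 0, 0],
    [17, 5, 0, 0, 0],
    [11, 8, 4, 2, 0],
    [7, 9, 4, 5, 2],
    [9, 13, 5, 5, 2],
    [4, 6, 4, 3, 2],
    [3, 1, 5, 4, 6],
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
grand_total = sum(row_totals)
# Percentages in the "Total" row: each column total as a share of all cases.
total_pct = [round(100 * t / grand_total, 1) for t in col_totals]

print(row_totals)               # [36, 22, 25, 27, 34, 19, 19]
print(col_totals, grand_total)  # [80, 48, 23, 19, 12] 182
print(total_pct)                # [44.0, 26.4, 12.6, 10.4, 6.6]
```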
As you see, even though we put the variable with the smaller number of values in columns, the resulting table does not fit a standard page. To a considerable extent this is because even the most recent versions of SPSS produce crosstabs with an entirely useless column ("Count" vs. "% within"). To make this crosstab easier to read and better prepared for publication, we would advise you to delete it.
To do this, double-click on the table and select this column, e.g., by clicking on any of its cells with your mouse and using the combination of "Shift" and "" buttons. After that, using the left mouse button, make the column as narrow as possible:
If you click on any point outside the table now, you will see that the column has disappeared. In this case we would also advise reducing the width of the second and the last columns. After this the table will look as follows:
Table 2.1: Population Density * Jurisdictional Hierarchy Beyond Local Community Crosstabulation

                            Jurisdictional Hierarchy Beyond Local Community
Population Density                No     One     Two   Three    Four   Total
                              levels   level  levels  levels  levels
< 1 person / 5 sq. mile           29       6       1                      36
                               80,6%   16,7%    2,8%                    100%
1 person / 1-5 sq. mile           17       5                              22
                               77,3%   22,7%                            100%
1-5 persons / sq. mile            11       8       4       2              25
                               44,0%   32,0%   16,0%    8,0%            100%
6-25 persons / sq. mile            7       9       4       5       2      27
                               25,9%   33,3%   14,8%   18,5%    7,4%    100%
26-100 persons / sq. mile          9      13       5       5       2      34
                               26,5%   38,2%   14,7%   14,7%    5,9%    100%
101-500 persons / sq. mile         4       6       4       3       2      19
                               21,1%   31,6%   21,1%   15,8%   10,5%    100%
over 500 persons / sq. mile        3       1       5       4       6      19
                               15,8%    5,3%   26,3%   21,1%   31,6%    100%
Total                             80      48      23      19      12     182
                               44,0%   26,4%   12,6%   10,4%    6,6%    100%
Exporting your table to Word
Now the table can be read more or less easily. However, if you are going to publish it (e.g., to use it in your essay, thesis, or article), we would still advise you to edit it. To edit an SPSS table, you should first double-click on it to get into editing mode, and then double-click on the cell of the table that you would like to edit. For example, if you double-click on the label of the dependent variable ("Jurisdictional Hierarchy Beyond Local Community"), the table will look as follows:
We would suggest that the table which we have made should be edited in the following way:
1. The dependent variable could be more appropriately titled "Political Centralization Index = # of Political Integration Levels over Community."
2. This variable's labels should be renamed accordingly.
3. Numerical values of the variable should be added. [1]
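The re-coding of V237 suggested in note 1 (new codes running from 0 for "No levels" to 4 for "Four levels") can be sketched in Python as a simple lookup. The names `RECODE_V237` and `recode_v237` are hypothetical, and treating missing values as `None` is an assumption made for this illustration.

```python
# Old codes (1-5) mapped to the suggested new codes (0-4),
# so that each code equals the number of levels above the community.
RECODE_V237 = {1: 0, 2: 1, 3: 2, 4: 3, 5: 4}

def recode_v237(values):
    """Return the recoded values; missing values (None) are passed through."""
    return [RECODE_V237[v] if v is not None else None for v in values]

print(recode_v237([1, 3, 5, None]))  # [0, 2, 4, None]
```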
[1] If you are going to use the respective database and variable in the future, we would advise you to make the corresponding changes in the database itself. We would also advise you to re-code V237 in the following way: 0 = No levels (no political authority beyond community); 1 = One level (e.g., petty chiefdoms); 2 = Two levels (e.g., larger chiefdoms); 3 = Three levels (e.g., states); 4 = Four levels (e.g., large states).

As a result, the final version of the table will look as follows (Table 2.2):

Table 2.2: Population Density * Political Centralization

Political Centralization Index = # of Political Integration Levels over Community:
0 = No levels (Independent communities)
1 = One level (Simple chiefdoms)
2 = Two levels (Complex chiefdoms)
3 = Three levels (Small states)
4 = Four levels (Large states / empires)

Population Density                     0       1       2       3       4   Total
1 = < 1 person / 5 sq. mile           29       6       1                      36
                                   80,6%   16,7%    2,8%                    100%
2 = 1 person / 1-5 sq. mile           17       5                              22
                                   77,3%   22,7%                            100%
3 = 1-5 persons / sq. mile            11       8       4       2              25
                                   44,0%   32,0%   16,0%    8,0%            100%
4 = 6-25 persons / sq. mile            7       9       4       5       2      27
                                   25,9%   33,3%   14,8%   18,5%    7,4%    100%
5 = 26-100 persons / sq. mile          9      13       5       5       2      34
                                   26,5%   38,2%   14,7%   14,7%    5,9%    100%
6 = 101-500 persons / sq. mile         4       6       4       3       2      19
                                   21,1%   31,6%   21,1%   15,8%   10,5%    100%
7 = over 500 persons / sq. mile        3       1       5       4       6      19
                                   15,8%    5,3%   26,3%   21,1%   31,6%    100%
Total                                 80      48      23      19      12     182
                                   44,0%   26,4%   12,6%   10,4%    6,6%    100%

And a possible final step. Normally, your essay, thesis, or article will be in Word or another similar program. So, you may need to move the table from SPSS to Word. However, if you just copy and paste it, you will get the following:
Population Density * Political CentralizationPolitical
Centralization Index
= # of Political
Integration Levels
over Communit
y
Total
0 = No levels
(Independent
communities)
1 = One level
(Simple chiefdoms
)
2 = Two levels
(Complex chiefdoms
)
3 = Three levels
(Small states)
4 = Four levels
(Large states /
empires)
Population Density
1 = < 1 person / 5
sq. mile
Count 29 6 1 36
% within Population
Density
80,6% 16,7% 2,8% 100,0%
2 = 1 person / 1-5 sq. mile
Count 17 5 22
% within Population
Density
77,3% 22,7% 100,0%
3 = 1-5 persons /
sq. mile
Count 11 8 4 2 25
% within Population
Density
44,0% 32,0% 16,0% 8,0% 100,0%
4 = 6-25 persons /
sq. mile
Count 7 9 4 5 2 27
% within Population
Density
25,9% 33,3% 14,8% 18,5% 7,4% 100,0%
5 = 26-100
persons / sq. mile
Count 9 13 5 5 2 34
% within Population
Density
26,5% 38,2% 14,7% 14,7% 5,9% 100,0%
6 = 101-500
persons / sq. mile
Count 4 6 4 3 2 19
% within Population
Density
21,1% 31,6% 21,1% 15,8% 10,5% 100,0%
7 = over 500
persons / sq. mile
Count 3 1 5 4 6 19
% within Population
Density
15,8% 5,3% 26,3% 21,1% 31,6% 100,0%
Total Count 80 48 23 19 12 182 % within
Population Density
44,0% 26,4% 12,6% 10,4% 6,6% 100,0%
As you see, you will not get a real table, but rather a half-finished product. [2] To move the whole table to a Word document, right-click on the table and choose "Copy objects" (not just "Copy"!):

[2] Note, however, that SPSS is not yet 100% compatible with Word, so SPSS objects sometimes behave rather capriciously in Word. Hence, in certain circumstances we advise you to consider preparing a normal Word table on the basis of such a half-finished product rather than inserting an SPSS object into a Word document.
Now you can paste it safely into a Word document.
Finally, as an exercise, make a cross-tab for reliance on agriculture and fixity of settlement. If you follow the steps specified above correctly, the result should look as follows:
A better way to export your table to Word or HTML
Finally, there is a way to export a cross-tab from SPSS to Word that preserves all the main features of the table and makes it easy to finish editing the table in Word. To do this, right-click on an SPSS table and choose "Export":
After this you will see the following:
Just press "OK". By default the HTML file will be saved in the directory in which you are working. After this, find the file Output.htm in your working directory and open it. You will see the following:
Now press "Control-A" to select the table, copy it, and paste it into the Word document with which you are working. If everything has been done correctly, the font size and the table can now be adjusted to fit the page, and the adjusted table should look as follows:
Agriculture-Contribution to Local Food Supply * Fixity of Settlement Crosstabulation

Columns (Fixity of Settlement), left to right: Migratory; Seminomadic-fixed then migratory; Rotating among 2+ fixed; Semisedentary-fixed core, some migratory; Impermanent-periodically moved; Permanent; Total

Rows (Agriculture-Contribution to Local Food Supply). Empty cells are not shown; the last value in each row is the row total.

None
    16 10 2 4 3 35
    45,7% 28,6% 5,7% 11,4% 8,6% 100,0%
Non-Food Crops
    2 1 3
    66,7% 33,3% 100,0%
< 10%
    9 6 1 1 17
    52,9% 35,3% 5,9% 5,9% 100,0%
< 50% < single source
    3 2 2 1 4 12
    25,0% 16,7% 16,7% 8,3% 33,3% 100,0%
< 50% > single source
    2 5 6 29 42
    4,8% 11,9% 14,3% 69,0% 100,0%
Primarily agricultural
    1 1 1 8 66 77
    1,3% 1,3% 1,3% 10,4% 85,7% 100,0%
Total
    28 21 6 14 15 102 186
    15,1% 11,3% 3,2% 7,5% 8,1% 54,8% 100,0%
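If you work with your counts outside SPSS, an HTML table of the same general kind can be produced with a few lines of code and then opened in a browser and copied into Word in the same way. The Python sketch below uses only the standard library; the function name `crosstab_to_html`, the labels, and the counts are hypothetical and do not reproduce the exact HTML that SPSS writes to Output.htm.

```python
from html import escape

def crosstab_to_html(title, col_labels, rows):
    """Return an HTML table; `rows` is a list of (row_label, cells) pairs."""
    lines = ['<table border="1">', f"<caption>{escape(title)}</caption>"]
    # Header row: an empty corner cell followed by the column labels.
    header = "".join(f"<th>{escape(c)}</th>" for c in [""] + col_labels)
    lines.append(f"<tr>{header}</tr>")
    for label, cells in rows:
        body = "".join(f"<td>{escape(str(c))}</td>" for c in cells)
        lines.append(f"<tr><th>{escape(label)}</th>{body}</tr>")
    lines.append("</table>")
    return "\n".join(lines)

html_text = crosstab_to_html(
    "Hypothetical cross-tab",
    ["Low", "High", "Total"],
    [("Group A", [30, 10, 40]), ("Group B", [20, 40, 60])],
)
print(html_text)
```

Writing `html_text` to a file with the extension .htm gives a page that can be selected with "Control-A" and pasted into Word as described above.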
Now you can easily finish editing the table any way you like using just the standard Word menus. Note that by bringing a cross-tab into Word this way you will spend much less time and effort editing the table than when you just copy it directly from SPSS, which is what you see in the following table.
Agriculture-Contribution to Local Food Supply * Fixity of Settlement Crosstabulation

Columns (Fixity of Settlement), left to right: Migratory; Seminomadic-fixed then migratory; Rotating among 2+ fixed; Semisedentary-fixed core, some migratory; Impermanent-periodically moved; Permanent; Total

Rows (Agriculture-Contribution to Local Food Supply). Empty cells are not shown; the last value in each row is the row total.

None
    16 10 2 4 3 35
    45.7% 28.6% 5.7% 11.4% 8.6% 100%
Non-Food Crops
    2 1 3
    66.7% 33.3% 100%
< 10%
    9 6 1 1 17
    52.9% 35.3% 5.9% 5.9% 100%
< 50% < single source
    3 2 2 1 4 12
    25.0% 16.7% 16.7% 8.3% 33% 100%
< 50% > single source
    2 5 6 29 42
    4.8% 11.9% 14.3% 69% 100%
Primarily agricultural
    1 1 1 8 66 77
    1.3% 1.3% 1.3% 10.4% 86% 100%
Total
    28 21 6 14 15 102 186
    15.1% 11.3% 3.2% 7.5% 8.1% 55% 100%
However, though by now we know quite a lot about the relationship between the variables under consideration, we have not tested the respective hypothesis statistically. In the next chapter we shall try to explain to you how to do this.