Post on 31-Mar-2015
Data, Tables and Graphs
Presentation
Types of data
Data: information to be analyzed
Qualitative and quantitative
Qualitative is descriptive (nominal, categories): labels or words
Quantitative involves numbers
Types of data
Discrete and continuous
Discrete: takes on only whole number values
Continuous: can take on decimal (fractional) values
Coding schemes
Coding schemes are numbers assigned to characteristics of the data to be analyzed
Best to use numeric coding schemes
Example: age, race and gender, coding scheme
Age: recorded as a two-digit number
Race: coded as a single-digit number using a coding scheme:
1. African American
2. Hispanic
3. White
4. Asian
5. Other
Example: continued
Gender:
1. Male
2. Female
Andy is a 22-year-old white male
Age: 22, Race: 3, Gender: 1
Coded as: 2231
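The coding example can be sketched in Python. This is only an illustration: the function name `encode_subject` and the dictionary names are hypothetical, while the code values themselves come from the slides.

```python
# Code tables taken from the slides; the names are illustrative only.
RACE_CODES = {"African American": 1, "Hispanic": 2, "White": 3, "Asian": 4, "Other": 5}
GENDER_CODES = {"male": 1, "female": 2}


def encode_subject(age, race, gender):
    """Return a 4-character code: two-digit age, one-digit race, one-digit gender."""
    if not 0 <= age <= 99:
        raise ValueError("age must fit in two digits")
    return f"{age:02d}{RACE_CODES[race]}{GENDER_CODES[gender]}"


# Andy is a 22-year-old white male.
print(encode_subject(22, "White", "male"))  # 2231
```

Padding the age to two digits keeps every coded record the same length, so each position always means the same thing.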
Data file
Usually rectangular
Variable values recorded for the unit of analysis
We will use SPSS as an example: Statistical Package for the Social Sciences
Data file: example
ID Age Sex Race IQ Hand MS
1 22 1 3 102 1 1
2 34 2 1 110 1 2
3 60 2 1 112 1 3
4 54 1 3 92 1 2
5 39 1 1 120 2 1
Data file
Each row is the unit of analysis (usually a subject)
Each column is a variable
Every variable should be given a label (name)
If it is a nominal variable, each value should have a value label
Example of value label
Unit of analysis: subject
Variable: marital status
Values might include: single, married, divorced, widowed
Each value should be coded as a number, and the label provided
Missing value
Data is often incomplete—there will be missing information
There should be a code to indicate that the value of a variable is missing for a particular subject (often 0 is used)
Example: no IQ score available, coded as a 0, indicated in the data file
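A missing-value code only helps if the analysis skips it. The Python sketch below shows what happens to a mean when the 0 code is treated as a real score. The IQ list is hypothetical (the third subject is assumed to have no score), and `mean_excluding_missing` is an illustrative name. A code such as 0 works only when 0 cannot be a legitimate value of the variable.

```python
MISSING = 0  # missing-value code, as in the slide example


def mean_excluding_missing(values, missing=MISSING):
    """Average only the values that are not the missing-value code."""
    valid = [v for v in values if v != missing]
    if not valid:
        return None  # nothing to average
    return sum(valid) / len(valid)


# Hypothetical IQ column: the third subject has no IQ score on file.
iq_scores = [102, 110, 0, 92, 120]

naive_mean = sum(iq_scores) / len(iq_scores)  # wrongly counts the 0 as a score
print(naive_mean)                             # 84.8
print(mean_excluding_missing(iq_scores))      # 106.0
```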
Simple descriptive statistics
Frequency: number of times a value occurs
If there are 48 females and 52 males in a sample, f = 48 for females and 52 for males
Proportion = f/N, P = 48/100 for females, or .48
Percent: % = f/N * 100
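The three statistics can be computed directly from a list of observations. This is a minimal Python sketch using the 48/52 sample from the slide; the function name `frequency_table` is hypothetical.

```python
from collections import Counter


def frequency_table(values):
    """Map each distinct value to (frequency, proportion, percent)."""
    n = len(values)
    freq = Counter(values)
    return {value: (f, f / n, 100 * f / n) for value, f in freq.items()}


# Sample from the slide: 48 females and 52 males, N = 100.
sample = ["female"] * 48 + ["male"] * 52
table = frequency_table(sample)

for value, (f, proportion, percent) in sorted(table.items()):
    print(f"{value}: f = {f}, P = {proportion:.2f}, % = {percent:.0f}")
```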
Qualitative (nominal)
Frequency distributions
Tables and graphs
Always label tables and graphs
Table 1. Gender of Sample
Frequency Proportion Percent
Male 52 .52 52%
Female 48 .48 48%
Pictorial representations
Pie charts
Bar charts
Displaying two variables in a table
Crosstabs
Race and gender, as an example
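A crosstab counts how often each combination of two variables occurs. The Python sketch below uses the Sex and Race columns from the example data file. The function name `crosstab` is hypothetical, and the nested-dictionary layout is just one convenient way to hold the table.

```python
from collections import Counter

# Sex and Race columns from the example data file (five subjects).
sex = [1, 2, 2, 1, 1]
race = [3, 1, 1, 3, 1]


def crosstab(rows, cols):
    """Count each (row value, column value) pair; absent pairs get a count of 0."""
    counts = Counter(zip(rows, cols))
    row_values = sorted(set(rows))
    col_values = sorted(set(cols))
    return {r: {c: counts[(r, c)] for c in col_values} for r in row_values}


table = crosstab(sex, race)
print(table)  # {1: {1: 1, 3: 2}, 2: {1: 2, 3: 0}}
```

Reading the result with the slide codes: of the three males (sex 1), one is coded race 1 and two are coded race 3; both females (sex 2) are coded race 1.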
Quantitative data
Tables and graphs
Ungrouped data
Each value is displayed
Count: each value
Frequency: number of times each value occurs
Quantitative
Frequency: number of times each value occurs
Cumulative frequency: arrange the values in ascending (or descending) order, and sum the frequencies going down the table
With ascending order, indicates how many scores are at or below a given score (cf)
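Cumulative frequency is a running total of the frequency column. This is a small Python sketch with hypothetical scores, using ascending order; `cumulative_frequency_table` is an illustrative name.

```python
from collections import Counter
from itertools import accumulate


def cumulative_frequency_table(scores):
    """Return (value, frequency, cumulative frequency) rows in ascending order."""
    freq = Counter(scores)
    values = sorted(freq)
    running_totals = accumulate(freq[v] for v in values)
    return [(v, freq[v], cf) for v, cf in zip(values, running_totals)]


scores = [3, 1, 2, 3, 3, 2, 5]  # hypothetical scores
for value, f, cf in cumulative_frequency_table(scores):
    print(value, f, cf)
# 1 1 1
# 2 2 3
# 3 3 6
# 5 1 7
```

The last cumulative frequency always equals N, which is a quick check that no score was dropped.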
Quantitative: tables
Proportion, cumulative proportion
Percent, cumulative percent
Graphs, quantitative, ungrouped
Histogram
Bar graphs
Line graphs: frequency
Cumulative
Quantitative, grouped data
Sometimes cumbersome to list each value: too many values
Example: age could be 0 to 90+
Set up group intervals, e.g., 0-5, 6-10, etc.
Rules:
1. First and last interval should not have a 0 frequency
Grouped data
Mutually exclusive and exhaustive
All intervals should be the same width
Important rule, not in the book: when collecting data, do not group (collapse), because information is lost. You can always group later
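The grouping rules can be sketched in Python for whole-number data. The starting point, width, and number of intervals below are assumptions chosen for the Age column of the example data file, and `group_frequencies` is a hypothetical name. Equal-width intervals computed by integer division are mutually exclusive by construction, and the range check enforces exhaustiveness.

```python
def group_frequencies(values, start, width, n_intervals):
    """Count whole-number values in equal-width, non-overlapping intervals."""
    counts = [0] * n_intervals
    for v in values:
        index = (v - start) // width
        if not 0 <= index < n_intervals:
            # Exhaustive: every value must fall in some interval.
            raise ValueError(f"{v} falls outside the intervals")
        counts[index] += 1
    labels = [
        f"{start + i * width}-{start + (i + 1) * width - 1}"
        for i in range(n_intervals)
    ]
    return list(zip(labels, counts))


ages = [22, 34, 60, 54, 39]  # Age column from the example data file
grouped = group_frequencies(ages, start=20, width=10, n_intervals=5)
print(grouped)
# [('20-29', 1), ('30-39', 2), ('40-49', 0), ('50-59', 1), ('60-69', 1)]
```

The raw ages are kept and the grouping is applied afterwards, so a different set of intervals can be tried later without losing information.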
Interval width
No hard and fast rules: use what seems to be most meaningful
Appearance also a consideration
As a start, use the formula: width = range of scores (highest - lowest), divided by the number of intervals
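The starting formula is a one-line calculation. The Python sketch below applies it to the Age column of the example data file with five intervals. Rounding the result up to a whole number is an assumption added here for convenience, not a rule from the slides.

```python
import math


def suggested_width(scores, n_intervals):
    """Starting interval width: (highest - lowest) / number of intervals."""
    return (max(scores) - min(scores)) / n_intervals


ages = [22, 34, 60, 54, 39]  # Age column from the example data file
width = suggested_width(ages, n_intervals=5)
print(width)             # 7.6, from (60 - 22) / 5
print(math.ceil(width))  # 8, one convenient whole-number width (assumption)
```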
Continuous data
If data is continuous, decimal values are possible
Must develop a rule for handling this
For example, use a rounding rule
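One possible rounding rule is "round half up to the nearest whole number". The slides do not say which rule to use, so this choice is an assumption. In Python, the built-in `round` sends exact halves to the nearest even number, so the sketch below uses the `decimal` module to apply the rule explicitly. The function name `round_half_up` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value):
    """Round to the nearest whole number, sending exact halves upward."""
    return int(Decimal(str(value)).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


print(round_half_up(22.4))  # 22
print(round_half_up(22.5))  # 23
print(round(22.5))          # 22: the built-in rounds halves to the even neighbor
```

Whatever rule is chosen, applying the same one to every score keeps the grouped intervals mutually exclusive.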