Descriptive Statistics

Post on 07-Aug-2015


Statistics: Research Methods

What’s in this PowerPoint?
• Why learn statistics?
• Two Perspectives of Statistics
• Descriptive Statistics
• Inferential Statistics

Why is my evil lecturer forcing me to learn statistics?

Why oh why?
• What do you learn in this class? Research
• What is research? Answering some interesting questions
• How do you answer the research questions?
  • Collect data
  • Explain and analyze the data
• Numbers = data

Quantitative Research Process (Field, 2009)

[Figure: flowchart of the quantitative research process; the only step label that survives in the transcript is “Review of Literature”]

So you’ve written your hypothesis…
• Let’s identify the variables
• For example:

Research Question
• Is there a relationship between gender and English competence?

Hypothesis
• There is a correlation between gender and English competence

Variables?

What is a variable?
• Gender
• English competence

How are they related?
• Gender (Independent Variable) → English competence (Dependent Variable)

Measuring Variables
• Categorical variables
  • Binary: only 2 categories
  • Nominal: more than 2 categories
  • Ordinal: categories with a logical ORDER; the size of the difference between categories doesn’t matter
• Continuous variables
  • Interval: equal intervals represent equal differences
  • Ratio: the differences make sense and there is a clear, natural zero

So what level of measurement are our variables?

Gender
• Categorical?
  • Binary? Male vs. Female
  • Nominal? Male vs. Female vs. Gay vs. Lesbian
  • Ordinal? No!
• Continuous?
  • Interval? No!
  • Ratio? No!

English Competence
• Categorical?
  • Binary? No.
  • Nominal? No.
  • Ordinal? Beginner vs. Intermediate vs. Advanced (but…)
• Continuous?
  • Interval? GPA 1.5-4
  • Ratio? 0-100

But why do we need to know these?
• Statistics is about explaining the data in meaningful ways and in as much detail as possible
• Meaningful → descriptive statistics
  • Clear: female is not male; a GPA of 3.00 is higher than 1.50, but a student with a GPA of 3.00 is not twice as smart
• Detailed → inferential statistics
  • More accurate analyses give a more accurate explanation of the population

Golden Rule
• Aim for a higher level of measurement
• Binary → Nominal → Ordinal → Interval → Ratio (preferred)

Data Preparation

Data: what is it?
• In quantitative research, data mostly consist of numbers, or of words that are converted to numbers (such as in discourse analysis)

How to prepare your data?
• Use tools!
  • Calculator (um, really?)
  • MS Excel
  • SPSS
• Why Excel?
  • Ubiquitous and usually already installed
  • Easy to use
  • Can be converted to SPSS for more detailed analyses

Preparing the Data in MS Excel
• Open the file “Statistics-Complete.xls”
• Columns → variables
• Rows → cases
• Cell address
  • Columns are lettered: A, B, … Z, AA, AB, …
  • Rows are numbered: 1, 2, 3, …
  • Example: A2 → column A, row 2
• First row → names of the variables (for analysis)
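The layout above (variables in columns, cases in rows, names in the first row) can be sketched outside Excel as well. The following is a minimal Python illustration, not part of the original slides: the helper names `parse_cell_address` and `cell_value` and the sample values in `sheet` are hypothetical.

```python
import re

def parse_cell_address(address: str) -> tuple[int, int]:
    """Split an address such as 'A2' into 1-based (column, row) numbers."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", address)
    if match is None:
        raise ValueError(f"not a cell address: {address!r}")
    letters, digits = match.groups()
    column = 0
    for letter in letters.upper():
        # Column letters behave like base-26 digits: A=1 ... Z=26, AA=27.
        column = column * 26 + (ord(letter) - ord("A") + 1)
    return column, int(digits)

# Hypothetical sheet: row 1 holds the variable names, later rows hold cases.
sheet = [
    ["gender", "competence"],  # row 1: variable names
    ["F", 81],                 # row 2: first case
    ["M", 46],                 # row 3: second case
]

def cell_value(address: str):
    """Look up a value in the hypothetical sheet by its cell address."""
    column, row = parse_cell_address(address)
    return sheet[row - 1][column - 1]

print(parse_cell_address("A2"))  # column A is 1, row is 2
print(cell_value("B1"))          # the name of the second variable
```

Reading an address column-first and row-second mirrors how Excel writes it: the letters pick the variable, the number picks the case.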

Perspectives of Statistics

Two perspectives
• Descriptive Statistics
  • To describe or summarize the data
  • The results describe the collected data only
• Inferential Statistics
  • To make inferences about the population from the data (sample)

Descriptive Statistics

How do you describe data?

Data Description
• By itself (size)
  • Frequency (how many/how often)
  • Percentage (how big)
• Against each other
  • Central tendency (how they are placed)
    • Mean
    • Median
    • Mode
  • Dispersion (how they are spread)
    • Low vs. High
    • Range
    • Standard Deviation
• Against the population
  • Normal distribution
  • Kurtosis
  • Skewness

Let’s learn and practice
• See the file “Statistics-complete.xls”
• You will find the data for the variables “gender” and “competence”
• Variables in columns, cases in rows
• Variable naming rules (for exporting to SPSS)
  • Short and explanatory
  • Must be unique
  • No spaces, blanks, or the characters !, ?, ‘ and *
  • Must begin with a letter, followed by letters, digits, full stops, or the symbols @, #, _ and $
  • Cannot end with a full stop or underscore
  • Are not case sensitive
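The naming rules above can be checked automatically before exporting. The sketch below is a minimal Python illustration that encodes only the rules listed on this slide; it does not cover any other restriction SPSS may apply, and the function names `is_valid_name` and `find_problems` are hypothetical.

```python
import re

# First character: a letter.
# Middle characters: letters, digits, '.', '@', '#', '_' or '$'.
# Last character: any allowed character except '.' and '_'.
_NAME_PATTERN = re.compile(r"[A-Za-z]([A-Za-z0-9.@#_$]*[A-Za-z0-9@#$])?")

def is_valid_name(name: str) -> bool:
    """Check one variable name against the character rules on the slide."""
    return _NAME_PATTERN.fullmatch(name) is not None

def find_problems(names: list[str]) -> list[str]:
    """Report invalid names and duplicates (compared without regard to case)."""
    problems = []
    seen = set()
    for name in names:
        if not is_valid_name(name):
            problems.append(f"{name!r}: breaks a character rule")
        key = name.lower()  # names are not case sensitive
        if key in seen:
            problems.append(f"{name!r}: duplicate name")
        seen.add(key)
    return problems

for problem in find_problems(["gender", "competence", "my score", "Gender"]):
    print(problem)
```

Comparing lowercased names is what makes “Gender” clash with “gender”, which is the practical meaning of “not case sensitive” combined with “must be unique”.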

Using Formulas in MS Excel
• Go to the “Formulas” tab
  • Click the fx icon, “Insert Function”
• Go to the formula (fx) bar
  • Click the fx icon and choose from the dropdown menu
• Type “=” in the formula bar, followed by the formula (a pop-up text will guide you on how the formula should be written)

How do you describe data? By Itself
• Frequency: how many? How often?
  • A.k.a. tallies: counting up the number of things or people in different categories
• Raw frequencies
  • COUNT: the number of cells that contain numbers (e.g. how many cases have a score)
  • COUNTIF: the number of cases that meet a certain condition (e.g. how many males/females)
  • SUM: the total of certain numbers (e.g. combining 2 variables)
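To make the difference between the three functions concrete, here is a minimal Python sketch of what COUNT, COUNTIF and SUM each compute. The `genders` and `scores` lists are hypothetical sample data, with `None` standing in for an empty cell.

```python
# Hypothetical data: one entry per case; None marks a missing score.
genders = ["F", "M", "F", "F", "M"]
scores = [81, 46, None, 79, 52]

# COUNT: how many cells hold a number (empty cells are skipped).
count = sum(1 for s in scores if isinstance(s, (int, float)))

# COUNTIF: how many cases meet a condition.
females = sum(1 for g in genders if g == "F")
males = sum(1 for g in genders if g == "M")

# SUM: the total of the numeric values.
total = sum(s for s in scores if s is not None)

print(count, females, males, total)
```

Note that COUNT and the number of cases can differ: five cases are listed here, but only four of them have a score.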

How do you describe data? By Itself
• Group Sum/Percentage: how big?
  • Raw frequencies can be converted into percentages
  • Graphical display of the data (e.g. pie charts)
  • Other ways to display the data (histogram, line chart)
• How?
  • Group the data using COUNTIF
  • Insert a chart using the “Insert” tab | “Column” or “Pie”
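Converting raw frequencies into percentages is a single division per category. A minimal Python sketch, using a hypothetical `genders` list in place of the spreadsheet column:

```python
from collections import Counter

genders = ["F", "M", "F", "F", "M"]  # hypothetical sample data

# Group the data: one tally per category (the COUNTIF step).
frequencies = Counter(genders)

# Percentage = category frequency / number of cases * 100.
n_cases = len(genders)
percentages = {
    category: 100 * count / n_cases
    for category, count in frequencies.items()
}

for category, percent in percentages.items():
    print(f"{category}: {frequencies[category]} cases, {percent:.1f}%")
```

These percentages are the values a pie chart displays as slices; they should always add up to 100.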

How do you describe your data? Against each other
• Central Tendency: how are they placed among each other?
  • The tendency of a set of numbers to cluster around a particular value (Brown)
• What are they?
  • Mean
  • Mode
  • Median

Mean
• A.k.a. the average
• The sum of all values in a distribution divided by the number of values
• Excel function: AVERAGE

Mode
• The most frequently occurring value in a set of numbers
• Excel function: MODE

Median
• The middle value
• To find it by hand, the data need to be sorted from smallest to largest
• Excel function: MEDIAN
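The three measures of central tendency can be compared side by side on one small data set. This is a minimal Python sketch using the standard `statistics` module as a stand-in for AVERAGE, MODE and MEDIAN; the `scores` list is hypothetical.

```python
import statistics

scores = [46, 52, 52, 60, 79, 81]  # hypothetical scores

# Mean: sum of all values divided by the number of values (AVERAGE).
mean = statistics.mean(scores)

# Mode: the most frequently occurring value (MODE).
mode = statistics.mode(scores)

# Median: the middle value of the sorted data (MEDIAN).
# With an even number of cases it is the average of the two middle values.
median = statistics.median(scores)

print(f"mean={mean:.2f} mode={mode} median={median}")
```

Here the mean is pulled upward by the two high scores, so it lands above the median; that gap is a first hint that the data are not symmetric.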

How do you describe your data? Against each other
• Dispersion
  • To what extent the individual values vary away from the central tendency
• What are they?
  • Low-High
  • Range
  • Standard Deviation

Low-High
• The lowest and the highest values
• Excel functions: MIN, MAX

Range
• The highest value minus the lowest value, plus 1 (many textbooks define the range simply as highest minus lowest)
• Enter the MIN and MAX results and calculate

Standard Deviation
• To what extent a set of scores varies in relation to the mean
• Excel function: STDEV
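The dispersion measures follow directly from the definitions above. A minimal Python sketch with hypothetical `scores`; `statistics.stdev` divides by n - 1, which is the sample standard deviation that STDEV also reports.

```python
import statistics

scores = [46, 52, 52, 60, 79, 81]  # hypothetical scores

# Low-High: the lowest and highest values (MIN, MAX).
lowest = min(scores)
highest = max(scores)

# Range as defined on the slide: highest - lowest + 1.
inclusive_range = highest - lowest + 1

# Sample standard deviation (divides by n - 1).
sd = statistics.stdev(scores)

print(lowest, highest, inclusive_range, round(sd, 2))
```

The "+ 1" counts both end points, so scores running from 46 to 81 span 36 possible values rather than 35.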

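How do you describe your data? Against the population
• Normal Distribution: how representative are they?
  • A.k.a. the Bell Curve
  • How the values usually disperse in a real population
• Share of values in each band, in standard deviations (SD) from the mean (M):
  • -3 to -2 SD: 2.14%
  • -2 to -1 SD: 13.59%
  • -1 SD to M: 34.13%
  • M to +1 SD: 34.13%
  • +1 to +2 SD: 13.59%
  • +2 to +3 SD: 2.14%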
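The band percentages above are not arbitrary: each is the area under the standard normal curve between two whole-SD marks. A minimal Python sketch that recomputes them with `statistics.NormalDist` from the standard library:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Band edges in standard deviations from the mean.
edges = [-3, -2, -1, 0, 1, 2, 3]

# Area of each band = cumulative probability at the upper edge
# minus cumulative probability at the lower edge.
band_percentages = [
    round(100 * (standard_normal.cdf(upper) - standard_normal.cdf(lower)), 2)
    for lower, upper in zip(edges, edges[1:])
]

print(band_percentages)
print(round(sum(band_percentages), 2))
```

The six bands add up to about 99.72%, not 100%; the small remainder lies further than 3 SD from the mean.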

How do you describe your data? Against the population

Kurtosis
• How peaked or flat the curve is
• The more positive the value, the more peaked the curve

Skewness
• A few values are much larger or smaller than the typical values found in the data set
• Negative vs. positive
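Both measures can be computed from standardized scores. The sketch below is a minimal Python illustration using the bias-corrected sample formulas; the assumption (not stated in the slides) is that these are the formulas behind Excel's SKEW and KURT functions, and the function names are hypothetical.

```python
import statistics

def sample_skewness(values):
    """Bias-corrected sample skewness: 0 for symmetric data."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1)
    cubed = sum(((x - mean) / sd) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

def sample_excess_kurtosis(values):
    """Bias-corrected excess kurtosis: negative means flatter than normal."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    fourth = sum(((x - mean) / sd) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction

symmetric = [1, 2, 3, 4, 5]       # hypothetical, evenly spread
right_tailed = [1, 1, 1, 1, 10]   # hypothetical, one very large value

print(sample_skewness(symmetric), sample_excess_kurtosis(symmetric))
print(sample_skewness(right_tailed))
```

A single unusually large value gives positive skewness; a single unusually small value would give negative skewness. The kurtosis formula needs at least four cases, and both need some variation in the data.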


Checking Normality in MS Excel
• Create a BIN (percentile of your data)
• Sort your data from the lowest to the highest
• Create the case number (nth data point), e.g. 81 is the 20th data point

Using Normality Percentage
1. Remember the percentages of normality and turn them into cumulative percentages
  • 2.14% (lowest) → 2.14%
  • 13.59% (low) → 15.73% (2.14 + 13.59)
  • 68.26% (mid) → 83.99% (2.14 + 13.59 + 68.26)
  • 13.59% (high) → 97.58%
  • 2.14% (highest) → 100% (99.72% to be exact; the remaining 0.28% lies beyond 3 SD from the mean)
2. Convert the data to match the percentages of normality (e.g. the file has 20 cases, so case 20 is 100%, case 19.516 is 97.58%, and so on).
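Steps 1 and 2 amount to a running total followed by a multiplication. A minimal Python sketch, assuming 20 cases as in the example above; the variable names are illustrative only.

```python
from itertools import accumulate

# Band percentages from the normal curve: lowest, low, mid, high, highest.
band_percentages = [2.14, 13.59, 68.26, 13.59, 2.14]

# Step 1: running (cumulative) total of the band percentages.
cumulative = [round(p, 2) for p in accumulate(band_percentages)]

# The slide treats the last band as reaching 100%, so do the same here.
cumulative[-1] = 100.0

# Step 2: convert each cumulative percentage to a case position.
n_cases = 20
positions = [round(n_cases * p / 100, 3) for p in cumulative]

for percent, position in zip(cumulative, positions):
    print(f"{percent:6.2f}% -> case {position}")
```

The positions are fractional (for example 19.516), so in practice each one is rounded to the nearest case in the sorted data to pick a cut point.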

Using Normality Percentage
3. Identify the bin numbers (cut points)
  • E.g. 100% is the 20th data case in the file → 81
  • 97.58% is approximately the 19th data case → 79
4. Count how many times the data occur within the bin numbers [FREQUENCY]: 46-47 pts = 1 time, 46-52 pts = 2 times, and so on; the final one, 81, should be 20 times
5. Work out the number of data points in each bin: under 47 is 1 score, 47-52 is 2 scores, and so on.
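Steps 4 and 5 can be sketched as a cumulative count followed by a subtraction. This is a minimal Python illustration with hypothetical `scores` and `cut_points`, not the data in the file; it counts values at or below each cut point, which is an assumption about how the bins are meant to be read.

```python
from bisect import bisect_right

scores = sorted([46, 52, 52, 60, 79, 81])  # hypothetical scores, sorted
cut_points = [47, 52, 79, 81]              # hypothetical bin numbers

# Step 4: cumulative count, i.e. how many scores are at or below each cut point.
cumulative_counts = [bisect_right(scores, cut) for cut in cut_points]

# Step 5: per-bin count, i.e. scores above the previous cut point
# and at or below the current one.
per_bin_counts = [
    current - previous
    for previous, current in zip([0] + cumulative_counts, cumulative_counts)
]

print(cumulative_counts)
print(per_bin_counts)
```

The last cumulative count should equal the number of cases, which is a quick check that no score fell outside the bins. The per-bin counts are the bar heights of the histogram.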

Using Mean (Average) & Standard Deviation
1. Remember the calculation for normality using the average +/- standard deviations (-3 to 3)
2. Calculate the cut points for the bin numbers using the formulas:
  • M +/- (3*SD)
  • M +/- (2*SD)
  • M +/- (1*SD)
• Follow Steps 4 & 5 of using normality percentage
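The cut points in step 2 are six simple sums. A minimal Python sketch with hypothetical `scores`, using the sample standard deviation as STDEV does:

```python
import statistics

scores = [46, 52, 52, 60, 79, 81]  # hypothetical scores

mean = statistics.mean(scores)
sd = statistics.stdev(scores)  # sample SD (n - 1)

# Cut points at M - 3*SD, M - 2*SD, M - 1*SD, M + 1*SD, M + 2*SD, M + 3*SD.
cut_points = [mean + k * sd for k in (-3, -2, -1, 1, 2, 3)]

for k, cut in zip((-3, -2, -1, 1, 2, 3), cut_points):
    print(f"M {k:+d} SD = {cut:.2f}")
```

With a small or skewed sample some cut points can fall outside the possible score range (below 0, for instance); those bins will simply be empty.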

Generating the histogram
1. Select the data under ‘number of data’
2. Click in the Menu Bar: Insert | Column | 2D-Column
3. To make the histogram clearer, click the whole histogram, right-click, and choose ‘Select Data’
  • In ‘Horizontal (Category) Axis Labels’, click ‘Edit’
  • In the ‘Axis Label Range’ bar, select the bin numbers, then click ‘OK’ and ‘OK’
4. To add the trendline, select a bar (yellow or green) and click ‘Add Trendline’
  • In ‘Trendline Options’, select ‘Polynomial’ and adjust the order (1/2/3/4) until it shows the normality line

Too complicated? Let’s try the smart way

• Activate Add-ins for Statistical Procedures

[Screenshots: steps 1 to 6 for activating the add-in]

Smart Way…
• Once activated, you should have something like this in your menu: [screenshot]

How to do Descriptive Statistics?
• Menu | Data Analysis | Descriptive Statistics
• Select the data range that you want as the Input Range
• Select the output range
• Tick Summary Statistics
• Voila!
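For comparison, a one-call summary like the one the add-in produces can be sketched in a few lines of Python. The row labels below follow the add-in's output as best recalled and should be treated as an assumption; the `describe` function and the `scores` data are hypothetical. Note that the range here is the plain highest minus lowest, without the "+ 1" used earlier in these slides.

```python
import math
import statistics

def describe(values):
    """Return a dictionary of summary statistics for a list of numbers."""
    n = len(values)
    sd = statistics.stdev(values)  # sample SD (n - 1)
    return {
        "Mean": statistics.mean(values),
        "Standard Error": sd / math.sqrt(n),  # SD of the mean
        "Median": statistics.median(values),
        "Mode": statistics.mode(values),
        "Standard Deviation": sd,
        "Sample Variance": statistics.variance(values),
        "Range": max(values) - min(values),
        "Minimum": min(values),
        "Maximum": max(values),
        "Sum": sum(values),
        "Count": n,
    }

scores = [46, 52, 52, 60, 79, 81]  # hypothetical scores
summary = describe(scores)
for label, value in summary.items():
    print(f"{label:20} {value}")
```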

Inferential Statistics