Copy of Data Analysis 08

8/14/2019 Copy of Data Analysis 08


Data Analysis

Florenda F. Cabatit, RN, MA
Facilitator



DATA ANALYSIS

Data analysis is the process by which information is rendered meaningful and intelligible (Polit and Hungler, 1995).

It is the systematic organization and synthesis of research data and the testing of research hypotheses using those data (2004).



Statistical Analysis

Quantitative analysis deals with numerical analysis of information.

It is the manipulation of numeric data through statistical procedures for the purpose of describing phenomena or assessing the magnitude and reliability of relationships among them.

Statistics is the scientific method used in quantitative analysis.



Statistics

Statistics helps to:
- Organize data
- Summarize data
- Evaluate data
- Present data in an easily understood form



Statistics

Two branches of statistics:
- Descriptive statistics: statistics used to describe and summarize data.
- Inferential statistics: statistics that permit inferences on whether relationships observed in a sample are likely to occur in the larger population.



Considerations in the choice of appropriate statistical methods

- The purpose of the research
- The level of measurement of the variables
- The number of groups/variables involved
- The type of groups being studied



Levels of Measurement

Nominal
- the lowest level
- involves assigning numbers to classify characteristics into categories
- numeric codes assigned in nominal measurement do not convey quantitative information
- the numbers are merely symbols that represent different values
- categories must be mutually exclusive and collectively exhaustive



Ordinal Measurement

This involves sorting objects on the basis of their relative standing or ranking on an attribute.

The numbers are not arbitrary: they signify incremental values. They do not, however, tell anything about how much greater one level is than another.



Interval Measurement

A measurement in which an attribute of a variable is rank ordered on a scale that has equal distances between points on that scale.



Ratio Scale

A quantitative measurement in which intervals are equal and there is a true zero point.

The highest level of measurement.

All arithmetic operations are permissible with this measurement (add, subtract, multiply, and divide numbers on this scale).



Descriptive Statistics

Three characteristics to fully describe a set of data:
- shape of the distribution of values
- central tendency
- variability



Review of Descriptive Stats.

Descriptive statistics are used to present quantitative descriptions in a manageable form.

This method works by reducing lots of data into a simpler summary.

Examples:
- 37 °C as the average adult body temperature
- SU's quality-point system



Univariate Analysis

This is the examination across cases of one variable at a time.

Frequency distributions are used to group data.

One may set up margins that allow us to group cases into categories.

Examples include:
- Age categories
- Price categories
- Temperature categories



Distributions

Two ways to describe a univariate distribution:
- A table
- A graph (histogram, bar chart)



Distributions (cont.)

Distributions may also be displayed using percentages.

For example, one could use percentages to describe the following:
- Percentage of people under the poverty level
- Over a certain age
- Over a certain score on a standardized test



Distributions (cont.)

A Frequency Distribution Table

Category    Percent
Under 35       9%
36-45         21%
46-55         45%
56-65         19%
66+            6%
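A frequency distribution like this one can be built from raw cases by sorting each case into exactly one category and converting counts to percentages. The following is a minimal Python sketch under stated assumptions: the ages are made up, the function names are hypothetical, and placing age 35 in the first bin is an assumption because the category labels leave that boundary unspecified.

```python
from collections import Counter

# Category labels follow the table; putting age 35 in the first bin is an
# assumption, because the table's labels leave that boundary unspecified.
BINS = [
    (0, 35, "35 and under"),
    (36, 45, "36-45"),
    (46, 55, "46-55"),
    (56, 65, "56-65"),
    (66, 150, "66+"),
]

def categorize(age):
    """Return the label of the single bin that contains this age."""
    for low, high, label in BINS:
        if low <= age <= high:
            return label
    raise ValueError(f"age {age} falls outside every category")

def percent_distribution(ages):
    """Percentage of cases in each category, in table order."""
    counts = Counter(categorize(age) for age in ages)
    return {label: 100 * counts[label] / len(ages) for _, _, label in BINS}

# Hypothetical whole-number ages, made up for illustration only.
ages = [28, 34, 41, 47, 50, 52, 53, 58, 61, 70]
distribution = percent_distribution(ages)
for label, percent in distribution.items():
    print(f"{label:<14}{percent:5.1f}%")
```

Because every age lands in exactly one bin, the percentages always total 100.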



Distributions (cont.)

[Figure: A Histogram. Vertical axis: Percent (0 to 45). Horizontal axis: Under 35, 36-45, 46-55, 56-65, 66+.]



Central Tendency

An estimate of the center of a distribution.

Three different types of estimates:
- Mean
- Median
- Mode



Mean

The most commonly used method of describing central tendency.

One totals all the results and then divides by the number of units, or n, of the sample.

Example: The NCM 104 Quiz mean was determined by the sum of all the scores divided by the number of students taking the exam.



Median

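The median is the score found at the exact middle of the set.

One must list all scores in numerical order and then locate the score in the center of the sample.

Example: If there are 500 scores in the list, the median lies midway between scores #250 and #251 (with an odd count such as 499, score #250 would be the median).

Because it depends only on the middle of the ordered list, the median is useful when outliers would distort the mean.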
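A minimal Python sketch of locating the median, assuming made-up scores (the function name median_by_hand is hypothetical). For an even number of scores it averages the two middle ones, and it shows that replacing the top score with an extreme value moves the mean but not the median.

```python
import statistics

def median_by_hand(scores):
    """Sort the scores, then take the middle one (or average the middle two)."""
    ordered = sorted(scores)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

# Hypothetical quiz scores, made up for illustration only.
scores = [70, 72, 75, 78, 80]
with_outlier = scores[:-1] + [800]  # top score replaced by an extreme value

print("median:", median_by_hand(scores), "->", median_by_hand(with_outlier))
print("mean:  ", statistics.mean(scores), "->", statistics.mean(with_outlier))
```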



Mode

The mode is the most repeated score in the set of results.

Let's take the set of scores: 15, 20, 21, 20, 36, 15, 25, 15

Again we first line up the scores: 15, 15, 15, 20, 20, 21, 25, 36

15 is the most repeated score and is therefore labeled the mode.
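The three estimates of central tendency can be checked on this same set of scores. This is a small sketch using Python's standard statistics module; the variable names are only illustrative.

```python
import statistics

# The set of scores used in the example.
scores = [15, 20, 21, 20, 36, 15, 25, 15]

mean_score = statistics.mean(scores)      # sum of scores divided by n
median_score = statistics.median(scores)  # middle of the ordered scores
mode_score = statistics.mode(scores)      # most repeated score

print("ordered:", sorted(scores))
print("mean:", mean_score, "median:", median_score, "mode:", mode_score)
```

For this set the three estimates differ, which is a sign that the scores are not symmetrically distributed.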



Central Tendency

If the distribution is normal (i.e., bell-shaped), the mean, median and mode are all equal.

In our analyses, we'll use the mean.



Dispersion

Two types of estimates:
- Range
- Standard deviation

Standard deviation is more accurate/detailed because an outlier can greatly extend the range.



Range

The range is used to identify the highest and lowest scores.

Let's take the set of scores: 15, 20, 21, 20, 36, 15, 25, 15.

The range would be 15-36. This identifies the fact that 21 points separate the highest from the lowest score.



Standard Deviation

The standard deviation is a value that shows the relation that individual scores have to the mean of the sample.

If scores are said to be standardized to a normal curve, there are several statistical manipulations that can be performed to analyze the data set.



Standard Dev. (cont)

Assumptions may be made about the percentage of scores as they deviate from the mean.

If scores are normally distributed, one can assume that approximately 68% of the scores in the sample fall within one standard deviation of the mean.

Approximately 95% of the scores would then fall within two standard deviations of the mean.
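These percentages can be recomputed from the normal curve itself. A brief sketch, assuming Python 3.8 or later for statistics.NormalDist; the helper name share_within is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_one = share_within(1)
within_two = share_within(2)
print(f"within 1 SD: {within_one:.1%}")
print(f"within 2 SD: {within_two:.1%}")
```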



Standard Dev. (cont)

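The standard deviation is the square root of the sum of the squared deviations from the mean of all the scores, divided by the number of scores; that is, the division by the number of scores is done before the square root is taken.

Squaring accounts for both positive and negative deviations from the mean, so that they do not cancel each other out.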
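The calculation can be followed step by step on the set of scores used earlier for the range. This is a sketch, not a prescribed procedure: it divides by the number of scores as described here, which matches Python's statistics.pstdev; many texts divide by n - 1 when the scores are a sample (statistics.stdev).

```python
import math
import statistics

# The set of scores used in the range example.
scores = [15, 20, 21, 20, 36, 15, 25, 15]

score_range = max(scores) - min(scores)

mean = sum(scores) / len(scores)
squared_deviations = [(x - mean) ** 2 for x in scores]  # squaring removes the sign
variance = sum(squared_deviations) / len(scores)        # divide by number of scores
std_dev = math.sqrt(variance)                           # then take the square root

print("range:", score_range)
print("variance:", variance)
print("standard deviation:", round(std_dev, 3))
```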



RESEARCH QUESTION: DESCRIBE

LEVEL            TYPE OF DESCRIPTION   STATISTICAL TOOL
NOMINAL          Distribution          Frequency distribution, contingency table
                 Central tendency      Mode
ORDINAL          Distribution          Frequency distribution, contingency table, scatterplot
                 Central tendency      Mode, median
RATIO/INTERVAL   Distribution          Frequency distribution, contingency table, scatterplot
                 Central tendency      Mode, median, mean
                 Variability           Range, variance, standard deviation
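The central-tendency rows of this table can be expressed as a simple lookup, so that only the tools suited to a variable's level of measurement are applied. A hedged sketch: the function name and both data sets are made up for illustration.

```python
import statistics

# Central-tendency tools for each level, transcribed from the table.
CENTRAL_TENDENCY = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median"],
    "interval": ["mode", "median", "mean"],
    "ratio": ["mode", "median", "mean"],
}

TOOLS = {
    "mode": statistics.mode,
    "median": statistics.median,
    "mean": statistics.mean,
}

def describe_center(values, level):
    """Apply only the central-tendency tools listed for this level."""
    return {name: TOOLS[name](values) for name in CENTRAL_TENDENCY[level]}

# Hypothetical data, made up for illustration only.
blood_types = ["O", "A", "O", "B", "O", "AB", "A"]  # nominal
pain_ranks = [1, 2, 2, 3, 3, 3, 4]                  # ordinal

print(describe_center(blood_types, "nominal"))
print(describe_center(pain_ranks, "ordinal"))
```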



Inferential Statistics

- Based on the laws of probability
- Provides a means for drawing conclusions about a population, given data from a sample
- Estimates population parameters from sample statistics



Inferential Statistics

Statistical inference consists of two techniques:
1. Estimation of parameters
2. Hypothesis testing



Hypothesis Testing

Statistical hypothesis testing provides objective criteria for deciding whether hypotheses are supported by empirical evidence.

It is a process of disproof or rejection.

Researchers seek to reject the null hypothesis through various statistical tests.

Hypothesis testing uses samples to draw conclusions about relationships within the population.



Type I and Type II Errors

Type I Error - researchers make a type I error when a true null hypothesis is rejected.

Type II Error - researchers make a type II error when a false null hypothesis is accepted.



Level of Significance

This refers to the risk of making a type I error in a statistical analysis.

The value selected beforehand signifies the risk or the probability of rejecting a true null hypothesis.

The two most frequently used significance levels (referred to as alpha or α) are:
- .05
- .01



Level of Significance

With a .05 significance level, we are accepting the risk that out of 100 samples drawn from a population, a true null hypothesis would be rejected only 5 times.

With a .01 level of significance, the risk of a type I error is lower: in only 1 sample out of 100 would we erroneously reject the null hypothesis.
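The "out of 100 samples" reading can be illustrated by simulation. The sketch below is built on assumptions that are not in the slides: it draws many samples from a population in which the null hypothesis is true (mean 0, standard deviation 1), applies a two-tailed z test to each, and counts how often the true null is rejected. The function name, sample size, number of trials and seed are all arbitrary choices.

```python
import math
import random
from statistics import NormalDist, fmean

def type_i_error_rate(alpha, trials=2000, sample_size=30, seed=0):
    """Share of samples that reject a null hypothesis that is actually true."""
    rng = random.Random(seed)
    critical_z = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed cutoff
    rejections = 0
    for _ in range(trials):
        # Population mean really is 0, so the null hypothesis is true.
        sample = [rng.gauss(0, 1) for _ in range(sample_size)]
        z = fmean(sample) * math.sqrt(sample_size)  # known sigma of 1
        if abs(z) > critical_z:
            rejections += 1
    return rejections / trials

rate_05 = type_i_error_rate(0.05)
rate_01 = type_i_error_rate(0.01)
print(f"alpha = .05: true null rejected in {rate_05:.1%} of samples")
print(f"alpha = .01: true null rejected in {rate_01:.1%} of samples")
```

The observed rates should land near 5% and 1%, not exactly on them, because they are themselves estimates from a finite number of simulated samples.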



Critical Region

This refers to the area in the sampling distribution representing values that are improbable if the null hypothesis is true.

It is defined by the level of significance.



Statistical Tests

Two-tailed test - this means that both ends or tails of the sampling distribution are used to determine improbable values.

In one-tailed tests, the critical region of improbable values is entirely in one tail of the distribution: the tail corresponding to the direction of the hypothesis.
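For a test statistic that follows the standard normal curve, the boundaries of the critical region can be computed directly. A minimal sketch, assuming a z statistic and an upper-tail one-tailed test; the function name critical_values is hypothetical.

```python
from statistics import NormalDist

def critical_values(alpha, tails):
    """Cutoffs that bound the critical region of a standard normal statistic."""
    z = NormalDist()
    if tails == 2:
        # Alpha is split equally between the two tails.
        cutoff = z.inv_cdf(1 - alpha / 2)
        return (-cutoff, cutoff)
    if tails == 1:
        # All of alpha sits in the upper tail (assumed direction).
        return (z.inv_cdf(1 - alpha),)
    raise ValueError("tails must be 1 or 2")

two_tailed = critical_values(0.05, tails=2)
one_tailed = critical_values(0.05, tails=1)
print("two-tailed cutoffs:", [round(v, 3) for v in two_tailed])
print("one-tailed cutoff: ", round(one_tailed[0], 3))
```

At the same level of significance the one-tailed cutoff is less extreme, because the whole critical region is placed in a single tail.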



An example of Critical Regions of a two-tailed test



Types of Statistical Tests

Parametric tests - a class of inferential statistical tests that involve:
a. Assumptions about the distribution of the variables
b. The estimation of a parameter
c. The use of interval or ratio measures



Statistical Tests

Non-parametric tests - statistical tests that do not estimate parameters
- also called distribution-free statistics.



Steps in Hypothesis Testing

1. State the alternative hypothesis
2. State the null hypothesis
3. Establish the level of significance
4. Select a one-tailed or two-tailed test
5. Compute a test statistic
6. Calculate the degrees of freedom
7. Obtain a tabled value for the statistical test
8. Compare the test statistic with the tabled value.
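The eight steps can be traced with a one-sample t test. This is only a sketch under stated assumptions: it reuses the set of scores from the descriptive examples, the null value of 25 is hypothetical, the scores are assumed to come from a roughly normal population, and the tabled value 2.365 is the two-tailed .05 entry for 7 degrees of freedom in a standard t table.

```python
import math
import statistics

# Scores reused from the descriptive examples; treated here as a sample.
scores = [15, 20, 21, 20, 36, 15, 25, 15]

# Step 1: alternative hypothesis: the population mean is not 25.
# Step 2: null hypothesis: the population mean is 25 (hypothetical value).
NULL_MEAN = 25

# Step 3: level of significance.
ALPHA = 0.05

# Step 4: two-tailed test, since the alternative names no direction.

# Step 5: test statistic (one-sample t).
n = len(scores)
sample_mean = statistics.mean(scores)
standard_error = statistics.stdev(scores) / math.sqrt(n)  # uses n - 1
t_statistic = (sample_mean - NULL_MEAN) / standard_error

# Step 6: degrees of freedom.
degrees_of_freedom = n - 1

# Step 7: tabled value for df = 7, two-tailed, alpha = .05 (from a t table).
TABLED_T = 2.365

# Step 8: compare the test statistic with the tabled value.
reject_null = abs(t_statistic) > TABLED_T

print(f"t = {t_statistic:.3f}, df = {degrees_of_freedom}")
print("reject null" if reject_null else "fail to reject null")
```

Here the absolute value of t is smaller than the tabled value, so the null hypothesis is not rejected at the .05 level.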




The Decision Matrix

In reality (columns):
- Null true / Alternative false. In reality... there is no real program effect; there is no difference, gain; our theory is wrong.
- Null false / Alternative true. In reality... there is a real program effect; there is a difference, gain; our theory is correct.

What we conclude (rows):
- Top row: Accept null / Reject alternative. We say... there is no real program effect; there is no difference, gain; our theory is wrong.
- Bottom row: Reject null / Accept alternative. We say... there is a real program effect; there is a difference, gain; our theory is correct.

Accept null, null true: 1-α, THE CONFIDENCE LEVEL
- The odds of saying there is no effect or gain when in fact there is none
- # of times out of 100 when there is no effect, we'll say there is none

Accept null, null false: β, TYPE II ERROR
- The odds of saying there is no effect or gain when in fact there is one
- # of times out of 100 when there is an effect, we'll say there is none

Reject null, null true: α, TYPE I ERROR
- The odds of saying there is an effect or gain when in fact there is none
- # of times out of 100 when there is no effect, we'll say there is one

Reject null, null false: 1-β, POWER
- The odds of saying there is an effect or gain when in fact there is one
- # of times out of 100 when there is an effect, we'll say there is one



Decision Matrix

If you try to increase power, you increase the chance of winding up in the bottom row and of Type I error.

If you try to decrease Type I errors, you increase the chance of winding up in the top row and of Type II error.
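This trade-off can be made concrete with a small calculation. The sketch assumes a one-tailed z test with a known population standard deviation, an effect size of 0.5 standard deviations and a sample of 20; these numbers and the function name are hypothetical, and sample size and effect size are held fixed so that only the level of significance changes.

```python
import math
from statistics import NormalDist

def power_one_tailed_z(alpha, effect_size, sample_size):
    """Power of an upper-tail z test; effect_size is in standard deviation units."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)               # cutoff set by alpha
    shift = effect_size * math.sqrt(sample_size)  # how far the true mean moves z
    return 1 - z.cdf(critical - shift)

# Hypothetical design: effect of 0.5 SD, 20 cases.
power_05 = power_one_tailed_z(0.05, effect_size=0.5, sample_size=20)
power_01 = power_one_tailed_z(0.01, effect_size=0.5, sample_size=20)

for alpha, power in ((0.05, power_05), (0.01, power_01)):
    print(f"alpha = {alpha}: power = {power:.3f}, Type II risk = {1 - power:.3f}")
```

Lowering the Type I risk from .05 to .01 lowers power and raises the Type II risk, which is the movement between rows that the matrix describes.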
