Introduction to Statistical Analysis Using Graphpad Prism 6
azmi-mohd-tamil -
Download & Install
• You can download and install a 30-day
evaluation version from
http://www.graphpad.com/demos/
• Upon installation, do not start the evaluation
period unless you are ready to start using it
immediately.
Uniqueness of Prism
• Prism caters for analysis and graphs for scientific
publication, especially for laboratory and
biomedical research.
• Data are usually entered and manipulated using a
spreadsheet such as Microsoft Excel. Data
needed for analysis are copied into specific tables
within Prism.
• Specific analysis requires specific tables. So you
must know exactly what analysis is required.
The 6 tables within Prism.
XY Data Tables
Column tables
Grouped tables
Contingency tables
Survival tables
Parts of whole tables
Choosing the appropriate
statistical tests
Use these tables to choose the
appropriate statistical tests.
Parametric Statistical Tests
Variable 1 | Variable 2 | Criteria | Type of Test
Qualitative Dichotomous | Quantitative | Normally distributed data | Student's t Test
Qualitative Polytomous | Quantitative | Normally distributed data | ANOVA
Quantitative | Quantitative | Repeated measurement of the same individual & item (e.g. Hb level before & after treatment). Normally distributed data | Paired t Test
Quantitative - continuous | Quantitative - continuous | Normally distributed data | Pearson Correlation & Linear Regression
Non-parametric Statistical Tests
Variable 1 | Variable 2 | Criteria | Type of Test
Qualitative Dichotomous | Quantitative | Data not normally distributed | Wilcoxon Rank Sum Test or Mann-Whitney U Test
Qualitative Polytomous | Quantitative | Data not normally distributed | Kruskal-Wallis One Way ANOVA Test
Quantitative | Quantitative | Repeated measurement of the same individual & item | Wilcoxon Signed Rank Test
Quantitative - continuous/ordinal | Quantitative - continuous | Data not normally distributed | Spearman/Kendall Rank Correlation
Statistical Tests for Qualitative Data
Variable 1 | Variable 2 | Criteria | Type of Test
Qualitative | Qualitative | Sample size > 20 and no expected value < 5 | Chi Square Test (X²)
Qualitative Dichotomous | Qualitative Dichotomous | Sample size > 30 | Proportionate Test
Qualitative Dichotomous | Qualitative Dichotomous | Sample size > 40 but with at least one expected value < 5 | X² Test with Yates Correction
Qualitative Dichotomous | Qualitative Dichotomous | Sample size < 20 or (< 40 but with at least one expected value < 5) | Fisher Test
URL for data & submit answers
• Data - https://drive.google.com/file/d/0B_0qI7iLxVpmVWNXMnV3WWZMSWM/view?usp=sharing
• The analysis required http://drtamil.me/2015/02/04/uninottichallenge/ password tcr1 (exercise done at teaching computer room 1 – tcr1)
• Submit answers at this link https://docs.google.com/forms/d/1o_L7ZjXF9Q1PON2zDs_VwkKsLCHT4v-8WruXhCiVq2Q/viewform
Data – Factors Related to SGA
A study was conducted to identify factors that can cause small for
gestational age (SGA). Among the factors studied was the
mothers’ body mass index (BMI). It is believed that mothers with
lower BMI are at higher risk of having SGA babies.
• 1. Create a new variable mBMI (Mothers’ Body Mass Index) from the mothers’ HEIGHT (in metre) & WEIGHT (first trimester weight in kg). mBMI = weight in kg/(height in metre)². Calculate the following for mBMI;
– Mean
– Standard deviation
• 2. Create a new variable OBESCLAS (Classification of Obesity) from mBMI. Use the following cutoff point;
– <20 = Underweight
– 20 – 24.99 = Normal
– 25 or larger = Overweight
– Create a frequency table for OBESCLAS.
• 3. Conduct the appropriate statistical test to test whether there is any association between OBESCLAS (Underweight/Normal/Overweight) and OUTCOME.
• 4. Conduct the appropriate statistical test to test whether there is any association between BMI and OUTCOME.
• 5. Conduct the appropriate statistical test to find any association between OBESCLAS (Underweight/Normal/Overweight) and BIRTHWGT.
• 6. Assuming that both variables mBMI & BIRTHWGT are normally distributed, conduct an appropriate statistical test to prove the association between the two variables.
– Demonstrate the association using the appropriate chart. Determine the coefficient of determination.
• 7. Conduct Simple Linear Regression using BIRTHWGT as the dependent variable. Try to come up with a formula that will predict the baby’s birth weight based on the mother’s BMI.
– y = a + bx
Online form for answers
Exercise 1 & 2
• 1. Create a new variable mBMI (Mothers’ Body Mass Index) from the mothers’ HEIGHT (in metre) & WEIGHT (first trimester weight in kg). mBMI = weight in kg/(height in metre)². Calculate the following for mBMI;
– Mean
– Standard deviation
• 2. Create a new variable OBESCLAS (Classification of Obesity) from mBMI. Use the following cutoff point;
– <20 = Underweight
– 20 – 24.99 = Normal
– 25 or larger = Overweight
– Create a frequency table for OBESCLAS.
Recode BMI into OBESCLAS
• Type
=IF(F2<20,"Underweight",IF(F2>=25,"Overweight","Normal"))
in cell G2 and press Enter.
• Then drag down cell G2
until G101 to fill up the
rest of the cells.
Recode BMI into 1,2 or 3
• We should also recode BMI into numeric OBESCLAS2 for import into Prism. Prism doesn’t accept string data.
• Type =IF(F2<20,1,IF(F2>=25,3,2)) in cell H2 and press Enter.
• Then drag down cell H2 until H101 to fill up the rest of the cells.
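For readers who prefer a script to spreadsheet formulas, the same recoding can be sketched in Python. This is only an illustration: the function names and the sample mothers are hypothetical, not part of the exercise data. It follows the cutoffs listed in Exercise 2, where a BMI of 25 or larger counts as Overweight.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kg divided by height in metres squared."""
    return weight_kg / height_m ** 2


def obesclas(bmi_value):
    """Return (label, numeric code) using the Exercise 2 cutoffs."""
    if bmi_value < 20:
        return "Underweight", 1
    if bmi_value < 25:
        return "Normal", 2
    return "Overweight", 3  # 25 or larger


# Hypothetical mothers (weight in kg, height in metres), not the study data.
mothers = [(45.0, 1.60), (60.0, 1.60), (70.0, 1.60)]
for weight, height in mothers:
    value = bmi(weight, height)
    label, code = obesclas(value)
    print(f"BMI {value:.2f} -> {label} ({code})")
```

The numeric code plays the role of OBESCLAS2, the column that is pasted into Prism.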
Recode BMI into OBESCLAS
• If typing logical command is not your forte, you can
just select all data, then sort the data according to
the BMI. Then drag and fill values 1, 2 or 3 beside it.
Add Column Freq with Value of 1
• Just add another
column with the
variable name “FREQ”
and fill it with value of 1
from I2 to I101.
• This will help with the
pivot table exercise
later.
Import Excel Data Into Prism
• Select all the data from
Excel. Copy.
• Open Prism, select
“Columns”, “Enter
replicate values..” &
click “Create”
Checking Normality
• Click on the “Analyze”
button.
• Select “Column
Statistics”.
• Select the variables
with continuous data.
• Then click “OK”.
Only Height is normally distributed
But for the purpose of today’s exercise, we are going to ASS-U-ME that all
these continuous variables are normally distributed.
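As a rough companion to the normality check, the sketch below computes skewness and excess kurtosis by hand. This is an assumption-laden screen only, not the formal normality test that Prism runs in Column Statistics; the function name and the two sample lists are hypothetical. Values near zero for both are what a normal distribution would give.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (population formulas)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis


# Hypothetical samples, not the study data.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 1.0, 1.0, 2.0, 10.0]

print(skewness_and_excess_kurtosis(symmetric))
print(skewness_and_excess_kurtosis(right_skewed))
```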
Question 1 – BMI
• Column Statistics also
generates the Mean &
S.D.;
– Mean 24.49
– S.D. 4.769
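The mean and standard deviation can be cross-checked outside Prism. A minimal sketch using Python's standard library, assuming the sample (n − 1) standard deviation is wanted; the BMI values are hypothetical, so the output will not match the figures above.

```python
import statistics

# Hypothetical BMI values, not the study data.
bmi_values = [18.0, 22.0, 24.0, 26.0, 30.0]

mean_bmi = statistics.mean(bmi_values)
sd_bmi = statistics.stdev(bmi_values)  # sample SD, divides by n - 1

print(f"Mean {mean_bmi:.2f}, S.D. {sd_bmi:.3f}")
```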
Frequency Distribution
• Go back to the data by clicking on the data table on the left side of the screen. Then click on the “Analyze” button again.
• Select “Frequency Distribution”
• Tick on OBESCLAS2. Then click on “OK”.
Frequency Distribution
• Then click on OK again.
You will get the
following frequency
distribution table.
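The frequency table is simply a count of each OBESCLAS2 code. A small sketch of the same tally in Python, with hypothetical codes standing in for the real column:

```python
from collections import Counter

# Hypothetical OBESCLAS2 codes (1 = Underweight, 2 = Normal, 3 = Overweight).
obesclas2 = [1, 2, 2, 3, 2, 1, 3, 3, 2, 2]
labels = {1: "Underweight", 2: "Normal", 3: "Overweight"}

counts = Counter(obesclas2)
total = len(obesclas2)

for code in sorted(labels):
    n = counts[code]
    print(f"{labels[code]:<12} {n:>3} {100 * n / total:5.1f}%")
```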
Exercise 3
• 3. Conduct the appropriate statistical test to test whether there is any association between OBESCLAS (Underweight/Normal/Overweight) and OUTCOME.
• Both variables are qualitative, giving a 3 × 2 contingency table: OBESCLAS (UnderW, Normal, OverW) by OUTCOME (SGA, Normal), with column totals of 50 SGA and 50 Normal (TOTAL 100).
• Therefore the most suitable analysis is Pearson Chi-square.
Pivot Table in Excel
• Click on “Insert”, “Pivot
Table” in Excel.
• Select all your earlier
Excel data.
Pivot Table
• On the right side of the screen, pull FREQ into values, OBESCLAS into row labels and OUTCOME into column labels.
• Now select the created contingency table (excluding the “Grand Total”), and copy it using Ctrl-C.
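What the pivot table does here is count the mothers in each OBESCLAS by OUTCOME cell. The sketch below reproduces that cross-tabulation in Python; the records are hypothetical and only show the mechanics.

```python
from collections import Counter

# Hypothetical (OBESCLAS, OUTCOME) pairs, one per mother.
records = [
    ("Underweight", "SGA"), ("Underweight", "SGA"), ("Normal", "SGA"),
    ("Normal", "Normal"), ("Overweight", "Normal"), ("Overweight", "SGA"),
    ("Overweight", "Normal"),
]

cell_counts = Counter(records)
rows = ["Underweight", "Normal", "Overweight"]
cols = ["SGA", "Normal"]

# Rows are OBESCLAS, columns are OUTCOME, as in the pivot table layout.
crosstab = [[cell_counts[(r, c)] for c in cols] for r in rows]

for label, row in zip(rows, crosstab):
    print(f"{label:<12}", row)
```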
Paste Pivot Table Into Prism
• Click “New”, “New Data Table”.
• Select “Contingency”, “Start with an empty table”.
• Then paste the pivot table into Prism.
The Pasted Pivot Table
Chi-Square Analysis
• Click on “Analyze”, “Contingency
table analysis”, then “Chi-
square”, then OK again twice.
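To see what the chi-square analysis computes, here is a hand-rolled sketch of the Pearson chi-square statistic for a contingency table. The counts are hypothetical, not the study's, and the p-value helper is valid only for 2 degrees of freedom (a 3 × 2 table), where the upper-tail probability reduces to exp(−χ²/2).

```python
import math


def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for a table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


def p_value_df2(statistic):
    """Upper-tail p-value; correct only when df = 2."""
    return math.exp(-statistic / 2)


# Hypothetical counts. Rows: Underweight, Normal, Overweight. Columns: SGA, Normal.
observed = [[30, 10], [20, 20], [10, 30]]
statistic, df = chi_square(observed)
p = p_value_df2(statistic)
print(f"chi-square = {statistic:.2f}, df = {df}, p = {p:.6f}")
```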
Chi-Square Results from Prism
[Figure: Contingency bar chart. x-axis: Mothers' Weight Classification (Normal, Overweight, Underweight); y-axis: Frequency (0 to 60); legend: Normal, SGA]
• Prism only states that there is a significant association (p < 0.0001) between mother’s weight classification and small for gestational age.
• But it doesn’t show which group has the higher rate of SGA.
Combine Results From Excel & Prism
• There is a significant difference (p<0.0001) in SGA rates
between underweight, normal and overweight mothers.
• Underweight mothers have a higher rate (94%) of SGA,
compared to normal mothers (58%) and overweight
mothers (26%).
Underweight vs Normal?
• There is a significant difference (p<0.01) in SGA rates between underweight and normal mothers.
• Underweight mothers have a significantly higher rate (94%) of SGA, compared to normal mothers (58%).
Question 3
Exercise 4
• 4. Conduct the appropriate statistical test to test whether there is any association between BMI and OUTCOME.
• Basically we are comparing the mean BMI of SGA mothers against the mean BMI of Normal mothers.
• Therefore the appropriate test is Student’s t-test.
Copy BMI Column Into Prism
• Click “New”, “New Data Table”.
• Select “Column”, “Enter replicate values into stacked columns”.
• Then paste the BMI of SGA mothers into column A & BMI of Normal mothers into column B.
The Pasted BMI Data
Student’s T-Test
• Click on “Analyze”, “Column
analysis”, then “t-tests”, then
OK again.
• Tick “Unpaired”, “Yes, parametric”, then “equal SDs”, then OK again.
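The options ticked above (unpaired, parametric, equal SDs) correspond to the pooled-variance Student's t test. A minimal sketch of that statistic, using hypothetical BMI values; it returns t and the degrees of freedom only, and leaves the p-value to Prism or a t table.

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Unpaired Student's t statistic assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance, n - 1
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return t, df


# Hypothetical BMI values for the two outcome groups.
sga_bmi = [20.0, 22.0, 24.0]
normal_bmi = [25.0, 27.0, 29.0]

t, df = pooled_t(sga_bmi, normal_bmi)
print(f"t = {t:.3f}, df = {df}")
```

A negative t here means the first group (SGA mothers) has the lower mean.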
T-Test Results from Prism
• Prism states that there is a significant difference in mean BMI (p < 0.0001) between SGA mothers (22.52) and normal mothers (26.46). Therefore the mean BMI of SGA mothers is significantly lower than that of the normal mothers.
• The results also show that the two groups have equal variances.
Question 4
Exercise 5
• 5. Conduct the appropriate statistical test to find
any association between OBESCLAS
(Underweight/Normal/Overweight) and
BIRTHWGT.
• Basically we are comparing the mean
BIRTHWEIGHT of underweight mothers, normal
weight mothers and overweight mothers.
• Therefore the appropriate test is Analysis of
Variance (ANOVA).
Copy Birth Weight Column Into Prism
• Click “New”, “New Data Table”.
• Select “Column”, “Enter replicate values into stacked columns”.
• Then paste the babies’ birth weight of underweight mothers into column A, babies’ birth weight of normal weight mothers into column B & babies’ birth weight of overweight mothers into column C.
The Pasted Birth Weight Data
ANOVA
• Click on “Analyze”, “Column
analysis”, then “One-way
ANOVA”, then OK again.
• Tick “No matching”, “Yes, ANOVA”, then click the “Multiple Comparisons” tab. Click OK.
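One-way ANOVA compares the variation between group means with the variation inside the groups. The sketch below computes the F ratio by hand for three groups; the birth weights are hypothetical round numbers chosen so the arithmetic is easy to follow, and no p-value or post-hoc test is attempted.

```python
def one_way_anova(groups):
    """Return F, df between groups, and df within groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical birth weights (kg) for underweight, normal and overweight mothers.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f, df_between, df_within = one_way_anova(groups)
print(f"F({df_between}, {df_within}) = {f:.2f}")
```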
ANOVA Results from Prism
• Prism states that there is a significant difference in mean birth weight (p < 0.0001) between babies of underweight mothers (2.187), normal mothers (2.768) & overweight mothers (3.245).
• Unfortunately it also shows that the three groups have unequal variances. So the data fail the homogeneity of variances assumption.
ANOVA Results – post hoc
• Post-hoc tests indicate there is a significant difference in birth weight between ALL the three groups. Babies of underweight mothers have the lowest mean birth weight of 2.187 kg.
[Figure: ANOVA bar chart, "Compare Babies Birth Weight by Mother's Weight". x-axis: Underweight, Normal, Overweight; y-axis: Birth weight (0 to 5)]
Question 5
Exercise 6
• 6. Assuming that both variables mBMI & BIRTHWGT are normally distributed, conduct an appropriate statistical test to prove the association between the two variables.
– Demonstrate the association using the appropriate chart. Determine the coefficient of determination.
Pearson Correlation
• mBMI and birth weight are both normally distributed
continuous data. Since the aim is to measure the
strength and direction of the association between
these two continuous variables, Pearson
Correlation is the most appropriate test.
Copy BMI & Birth Weight Into Prism
• Click “New”, “New Data Table”.
• Select “XY”, “Enter and plot a single Y value for each point”.
• Then paste the BMI into column X & BIRTHWGT into column A.
The Pasted BMI & Birth weight Data
• BMI is coded as X since
it is the risk factor.
• Birth weight is coded as
Y since it is the outcome
of interest.
• Risk factor first, then
Outcome.
• X comes first before Y.
• Capisce? (Understand?)
Pearson’s Correlation
• Click on “Analyze”, “XY
analysis”, then “Correlation”,
then OK again.
• Tick “Compute r between two selected data sets”, “Yes, Pearson correlation coefficients”, then “Two-tailed”, then OK again.
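For reference, Pearson's r and the coefficient of determination can be computed directly from the paired values. This is a sketch with hypothetical BMI and birth weight pairs, so the output will differ from Prism's results for the exercise data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for paired values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical pairs: mothers' BMI (X) and babies' birth weight in kg (Y).
bmi_values = [18.0, 22.0, 26.0, 30.0]
birth_weight = [2.4, 2.6, 3.1, 3.0]

r = pearson_r(bmi_values, birth_weight)
r_squared = r ** 2  # coefficient of determination
print(f"r = {r:.4f}, r squared = {r_squared:.4f}")
```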
Correlation Results from Prism
• Prism states that there is a significant, positive & fair (r=0.4812) correlation between mothers’ BMI and babies’ birth weight. Therefore as BMI increases, the birth weight also increases.
• 23.15% (r² = 0.2315) of the variability of the birth weight is determined by the variability of the mothers’ BMI.
[Figure: Scatter Diagram - BMI vs Birth weight. x-axis: BMI (0 to 50); y-axis: Birth weight (0 to 5)]
Question 6
Exercise 7
• 7. Conduct Simple Linear Regression using BIRTHWGT as the dependent variable. Try to come up with a formula that will predict the baby’s birth weight based on the mother’s BMI.
– y = a + bx
Simple Linear Regression
• mBMI and birth weight are both normally distributed
continuous data. Since the aim is to come up with a
regression formula between these two continuous
variables, Simple Linear Regression is the
most appropriate test.
Reuse BMI & Birth weight Data
• BMI is coded as X since
it is the risk factor.
• Birth weight is coded as
Y since it is the outcome
of interest.
• Since the SLR uses the
same variables, we will
reuse the XY table from
Exercise 6.
Simple Linear Regression
• Click on “SLR” icon, it is just above the “Analyze” icon.
• Just change the range so that the line will start at the y axis (X=0).
• We can set the line to end at the maximum value (it is X=41 in this exercise).
Click OK
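The line that Prism fits is the ordinary least-squares line y = a + bx. A minimal sketch of how a and b are obtained, with a hypothetical function name and toy data chosen so the answer is obvious (y = 1 + 2x):

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Toy data lying exactly on y = 1 + 2x, for illustration only.
a, b = fit_line([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
print(f"y = {a:.3f} + {b:.3f}x")
```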
SLR Results from Prism
• Prism states that there is a
significant regression
coefficient (b=0.07323).
• The constant (a) is 1.081
• 23.15% (r² = 0.2315) of the
variability of the birth
weight is determined by the
variability of the mothers’
BMI.
• BW = 1.081 + 0.073 × BMI
• For every increase of BMI of
1 unit, BW increases by 0.07 kg.
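To show how the fitted formula is used, the sketch below plugs a few BMI values into it. The constant and coefficient are the ones reported above; the function name and the chosen BMI values are illustrative only.

```python
A = 1.081    # constant (a) reported by Prism
B = 0.07323  # regression coefficient (b) reported by Prism


def predict_birth_weight(bmi_value):
    """Predicted birth weight in kg from the mother's BMI: BW = a + b * BMI."""
    return A + B * bmi_value


for bmi_value in (20, 25, 30):
    print(f"BMI {bmi_value}: predicted BW {predict_birth_weight(bmi_value):.3f} kg")
```

Each additional BMI unit adds b = 0.07323 kg to the prediction, which is the 0.07 kg increase quoted above.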
[Figure: Scatter Diagram - BMI vs Birth weight, with fitted regression line. x-axis: BMI (0 to 50); y-axis: Birth weight (0 to 5)]
Question 7
Slight difference of the constant value. Prism calculated
1.081 instead of 1.079. Maybe it was due to decimal
difference of the BMI upon import.
drtamil@gmail.com