Introduction to Statistics
Steven A. Jones
Biomedical Engineering
Louisiana Tech University
(Created for our NSF-funded Research Experiences in Micro/Nano Engineering Program)
Experimental Design
1. Develop a Hypothesis
2. Design Statistical Analysis
3. Set up Experiment
4. Test Experiment (Positive Control)
5. Collect Data
6. Perform Statistical Analysis
Step 2 must not be done after the data have been collected!
[Figure: three plots of probability density vs. value of x: Gaussian Distribution, Uniform Distribution, Rayleigh Distribution]
Some Common Probability Distributions
Gaussian: Sum of many independent random numbers.
Uniform: e.g. Dice throw.
Rayleigh: Square root of the sum of the squares of two independent zero-mean Gaussians of equal variance.
Introductory Question
There is a 60% chance of rain on Friday and a 45% chance of rain on Saturday.
What is the probability that it will rain on Friday and Saturday?
Introductory Question
There is a 60% chance of rain on Friday and a 45% chance of rain on Saturday.
What is the probability that it will rain on either Friday or Saturday?
Axioms of Probability
For independent outcomes A and B:
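$$P(A \,\&\, B) = P(A)\,P(B)$$
$$P(\text{not } A) = 1 - P(A)$$
$$P(A \text{ or } B) = 1 - \left[1 - P(A)\right]\left[1 - P(B)\right] = P(A) + P(B) - P(A)\,P(B) \approx P(A) + P(B) \quad \text{(small probabilities)}$$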
If probabilities are small, then they can be added when “or” is used.
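As a worked check, the rules for independent outcomes can be applied to the rain question from the earlier slides. This is a minimal sketch that assumes rain on Friday and rain on Saturday are independent (the slides do not say so explicitly); the function names are illustrative.

```python
def p_and(p_a, p_b):
    """Probability that both independent outcomes occur."""
    return p_a * p_b

def p_or(p_a, p_b):
    """Probability that at least one of two independent outcomes occurs."""
    return 1.0 - (1.0 - p_a) * (1.0 - p_b)

p_friday, p_saturday = 0.60, 0.45

print(f"rain on both days:  {p_and(p_friday, p_saturday):.2f}")   # 0.27
print(f"rain on either day: {p_or(p_friday, p_saturday):.2f}")    # 0.78

# For small probabilities the product term is negligible,
# so "or" is close to a simple sum.
small_a, small_b = 0.01, 0.02
print(f"exact: {p_or(small_a, small_b):.4f}, sum: {small_a + small_b:.4f}")
```

Note that simply adding 0.60 and 0.45 would give 1.05, which cannot be a probability; the product term matters when the probabilities are not small.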
Relationships Among Probability Distributions
Assume that the $x_i$ are uniformly distributed. Then:
$$y = \sum_{i=1}^{N} A_i x_i$$
is Gaussian (Normal) distributed for N sufficiently large.
$$z = y_1^2 + y_2^2$$
is chi-squared ($\chi^2$) distributed.
$$w = \sqrt{z}$$
is Rayleigh distributed.
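These relationships can be explored numerically. The sketch below makes several assumptions that are not in the slides: all weights are set to 1, each uniform variable is drawn from -0.5 to 0.5, and 12 terms are summed so that the sum has a standard deviation of 1. It then compares sample statistics with the values expected for a standard Gaussian, a chi-squared variable with two degrees of freedom (mean 2), and a Rayleigh variable (mean sqrt(pi/2)).

```python
import math
import random
import statistics

random.seed(0)        # fixed seed so the run is repeatable
N_TERMS = 12          # assumed number of uniform terms in each sum
N_SAMPLES = 20_000    # assumed number of simulated values

def sum_of_uniforms():
    """One value of y: a sum of N_TERMS uniform numbers (all weights equal to 1)."""
    return sum(random.uniform(-0.5, 0.5) for _ in range(N_TERMS))

y1 = [sum_of_uniforms() for _ in range(N_SAMPLES)]
y2 = [sum_of_uniforms() for _ in range(N_SAMPLES)]
z = [a * a + b * b for a, b in zip(y1, y2)]   # sum of squares of two Gaussians
w = [math.sqrt(v) for v in z]                 # square root of that sum

print("y: mean", round(statistics.mean(y1), 3), "sd", round(statistics.stdev(y1), 3))
print("z: mean", round(statistics.mean(z), 3), "(chi-squared, 2 dof: 2)")
print("w: mean", round(statistics.mean(w), 3),
      "(Rayleigh:", round(math.sqrt(math.pi / 2), 3), ")")
```

With only 12 terms the sum is close to, but not exactly, Gaussian, so small differences from the theoretical values are expected.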
Simple Distribution
You expect that the correct value for a height measurement is 12 meters, and the standard deviation is 3 meters. One way to determine whether or not you are correct is to take some measurements.
What could you conclude if you made one measurement and the value fell as follows on the expected distribution?
[Figure: Gaussian Distribution, probability density vs. value of x]
What if three values fell as follows on the expected distribution?
[Figure: Gaussian Distribution, probability density vs. value of x]
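One way to put numbers on these two questions is to ask how often a mean at least this far from 12 meters would occur if the expected distribution (mean 12 m, standard deviation 3 m) were correct. The sketch below assumes the standard deviation is known exactly, so a Gaussian tail probability applies; the measurement values are hypothetical because the plotted points are not available.

```python
import math

def two_sided_tail(values, mu, sigma):
    """Return (z, p): the z-score of the sample mean and the probability of a
    mean at least that far from mu, assuming Gaussian data with known sigma."""
    n = len(values)
    mean = sum(values) / n
    z = (mean - mu) / (sigma / math.sqrt(n))   # standard error shrinks as sqrt(n)
    p = math.erfc(abs(z) / math.sqrt(2.0))     # two-sided Gaussian tail area
    return z, p

MU, SIGMA = 12.0, 3.0                  # expected mean and standard deviation (meters)
one_value = [17.0]                     # hypothetical single measurement
three_values = [17.0, 16.0, 18.0]      # hypothetical set of three measurements

z_one, p_one = two_sided_tail(one_value, MU, SIGMA)
z_three, p_three = two_sided_tail(three_values, MU, SIGMA)
print(f"one value:    z = {z_one:.2f}, p = {p_one:.4f}")
print(f"three values: z = {z_three:.2f}, p = {p_three:.4f}")
```

A single value 5 m from the expected mean is not very surprising, but three values that average 5 m away are much harder to reconcile with the expected distribution.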
Did the two data sets to the left come from different distributions?
[Figure: Set 1 and Set 2 plotted against value of X, from 9 to 12]
What about these two?
[Figure: Set 1 and Set 2 plotted against value of X, from 9 to 12]
How confident are you that the data sets in each plot below come from different distributions?
[Figure: four plots of two data sets against value of X, from 9 to 12: a reference plot and plots labeled "Lower standard deviation", "Smaller difference in means", and "Fewer data points"]
Student’s T Test
A Student’s T test measures the confidence you can have that two values are inherently different, based on three parameters:
1. Difference of the means
2. Standard deviations
3. Number of data points obtained
Particularly useful when there are multiple confounding variables.
E.g. Blood pressure drugs – are we, on average, lowering blood pressure?
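The effect of the three parameters can be illustrated with the T statistic itself. This is a minimal sketch that assumes two groups of equal size with a common standard deviation (the pooled two-sample form); the function name and all numbers are hypothetical. A larger magnitude of T corresponds to more confidence that the two means differ.

```python
import math

def t_statistic(mean1, mean2, sd, n):
    """Two-sample T for two groups of n points each with a common standard deviation."""
    standard_error = sd * math.sqrt(2.0 / n)
    return (mean1 - mean2) / standard_error

reference = abs(t_statistic(10.0, 11.0, sd=0.5, n=10))
lower_sd = abs(t_statistic(10.0, 11.0, sd=0.25, n=10))       # lower standard deviation
smaller_diff = abs(t_statistic(10.0, 10.5, sd=0.5, n=10))    # smaller difference in means
fewer_points = abs(t_statistic(10.0, 11.0, sd=0.5, n=4))     # fewer data points

print(f"reference:                   |T| = {reference:.2f}")
print(f"lower standard deviation:    |T| = {lower_sd:.2f}")
print(f"smaller difference in means: |T| = {smaller_diff:.2f}")
print(f"fewer data points:           |T| = {fewer_points:.2f}")
```

Lowering the standard deviation raises T, while shrinking the difference in means or the number of points lowers it.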
A Student’s T test is used to answer the following question:
Given:
• Difference of the means
• Standard deviations
• Number of data points obtained
• That these data come from normal distributions
What is the probability (p) that they came from the same underlying distribution?
Example
Given the mean and standard deviation for pressure, along with the number of points measured from a clinical drug trial, what is the probability that the drug had an effect on the distribution (i.e. that it changed the blood pressure of these individuals on average)?
Sample Mean: Mean from the sample that was taken (the 2000 people in the drug trial).
Distribution Mean: Mean that would occur if you could give the drug to everyone in the world and do the measurement.
Underlying and Sample Distributions
[Figure: histograms of frequency vs. bin (bins from -3 to 3). Legend: Uniform Distribution, Uniform (Theory), Gaussian Distribution, Gaussian (Theory)]
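The difference between a sample distribution and its underlying distribution can be reproduced with a small simulation. The sketch below is an assumption-laden illustration, not the data behind the figure: it draws 100 values from a standard Gaussian, counts them in bins of width 0.5 from -3 to 3, and prints the counts next to the counts predicted by theory.

```python
import math
import random

random.seed(2)                                  # fixed seed so the run is repeatable
N_SAMPLES = 100                                 # assumed sample size
EDGES = [-3.0 + 0.5 * i for i in range(13)]     # 12 bins covering -3 to 3

def normal_cdf(x):
    """Cumulative probability of a standard Gaussian."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

samples = [random.gauss(0.0, 1.0) for _ in range(N_SAMPLES)]
bins = list(zip(EDGES[:-1], EDGES[1:]))

# Sample distribution: how many of the drawn values fall in each bin.
observed = [sum(1 for s in samples if lo <= s < hi) for lo, hi in bins]
# Underlying distribution: the count expected in each bin from theory.
expected = [N_SAMPLES * (normal_cdf(hi) - normal_cdf(lo)) for lo, hi in bins]

for (lo, hi), obs, exp in zip(bins, observed, expected):
    print(f"{lo:5.1f} to {hi:4.1f}: observed {obs:3d}, theory {exp:5.1f}")
```

The observed counts scatter around the theoretical ones, and the scatter becomes relatively smaller as the sample size grows.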
Statistical Tests You Should Know
T-test: Are the means of two data sets the same?
F-test: Are the standard deviations of two data sets the same?
Chi-Squared Test: Does the distribution of a data set match a proposed distribution?
ANOVA: Like a T-test for more than two data sets (it compares their means using an F statistic).
Pearson’s Correlation Coefficient: Does one variable depend on another?
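As one concrete example from this list, Pearson's correlation coefficient can be computed directly from its definition. This is a minimal sketch; the function name and the dose and response values are hypothetical. A value of r near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates little linear relationship.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)

dose = [1.0, 2.0, 3.0, 4.0, 5.0]          # hypothetical independent variable
response = [2.1, 3.9, 6.2, 7.8, 10.1]     # hypothetical measured response

r = pearson_r(dose, response)
print(f"r = {r:.4f}")
```

A strong correlation shows that the two variables move together; it does not by itself show that one causes the other.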
To Run a T-Test
Calculate the mean of the data.
Calculate the standard deviation of the data.
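Determine the T statistic (e.g. $T = \bar{x}\sqrt{N}/\sigma$, where $\bar{x}$ is the mean, $\sigma$ is the standard deviation, and $N$ is the number of data points).
From T determine p.
p is “the probability that you would get a difference in means this large or larger, given that the two measurement sets come from the same distribution.”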
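The steps can be sketched for a single set of differences, for example the change in blood pressure of each subject. Everything specific here is an assumption for illustration: the data values are hypothetical, the one-sample form T = mean * sqrt(N) / standard deviation is assumed, and p is obtained by numerically integrating the Student's t density rather than from a table or a statistics library.

```python
import math
import statistics

def t_pdf(t, dof):
    """Probability density of Student's t distribution with dof degrees of freedom."""
    log_norm = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
                - 0.5 * math.log(dof * math.pi))
    return math.exp(log_norm - (dof + 1) / 2 * math.log1p(t * t / dof))

def two_tailed_p(t_stat, dof, steps=2000):
    """Probability of |T| at least as large as |t_stat| (Simpson's rule, even steps)."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, dof) + t_pdf(upper, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    central_area = total * h / 3.0            # area between 0 and |t_stat|
    return max(0.0, 1.0 - 2.0 * central_area)

# Hypothetical changes in blood pressure (mmHg) for ten subjects.
changes = [-8, -5, -12, 3, -7, -10, -2, -6, -9, -4]

mean = statistics.mean(changes)               # step 1: mean of the data
sd = statistics.stdev(changes)                # step 2: sample standard deviation
n = len(changes)
t_stat = mean * math.sqrt(n) / sd             # step 3: T statistic
p = two_tailed_p(t_stat, dof=n - 1)           # step 4: p from T

print(f"mean = {mean:.2f}, sd = {sd:.2f}, T = {t_stat:.2f}, p = {p:.4f}")
```

Here the null hypothesis is that the true mean change is zero, and the degrees of freedom are taken as N - 1.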
Interpretation of T test
You set the value that you consider significant.
Medical applications: p < 0.05 is “significant.”
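Since p < 0.05 is a 1/20 probability, you will typically report a difference that is not really there (a false positive) in about one out of every 20 T tests in which the two distributions are actually the same.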
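The one-in-twenty figure can be checked by simulation. In the sketch below, every experiment draws data for which the null hypothesis is true (Gaussian values with a true mean of zero), so every "significant" result is a false positive. The sample size of 10, the one-sample form of T, and the number of experiments are assumptions; 2.262 is the standard two-tailed 5% critical value of T for 9 degrees of freedom.

```python
import math
import random

random.seed(1)              # fixed seed so the run is repeatable
T_CRITICAL = 2.262          # two-tailed 5% critical value for 9 degrees of freedom
N_POINTS = 10               # assumed number of data points per experiment
N_EXPERIMENTS = 20_000      # assumed number of simulated experiments

def one_sample_t(values):
    """T = mean * sqrt(N) / sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean * math.sqrt(n) / math.sqrt(variance)

false_positives = 0
for _ in range(N_EXPERIMENTS):
    data = [random.gauss(0.0, 1.0) for _ in range(N_POINTS)]   # null hypothesis is true
    if abs(one_sample_t(data)) > T_CRITICAL:
        false_positives += 1

rate = false_positives / N_EXPERIMENTS
print(f"false positive rate: {rate:.3f}")
```

The rate should come out close to 0.05, which is why a single result at p < 0.05 deserves some caution when many tests are run.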
Hypothesis
Null hypothesis: statement that the two distributions are the same, e.g. “Altase causes no change in blood pressure.”
Alternative hypothesis: Can vary, e.g. “Altase reduces the mean blood pressure” or “Altase changes the mean blood pressure.”
One-Tailed vs Two-Tailed
Depends on “alternative hypothesis.”
One tail: If alternative hypothesis is that one mean is greater than the other.
Two tail: If alternative hypothesis is that the means are different.
Saying that one of the means is greater is more restrictive.
The confidence you have in your result depends on the prediction (1st law of the frisbee).
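The practical difference between the two choices shows up in the threshold that T must cross. This sketch assumes 10 data points (9 degrees of freedom) and a 5% significance level, and uses the standard table values 1.833 (one-tailed) and 2.262 (two-tailed); the function name and the example T values are hypothetical.

```python
ONE_TAILED_CRITICAL = 1.833   # 9 degrees of freedom, 5% in one tail
TWO_TAILED_CRITICAL = 2.262   # 9 degrees of freedom, 2.5% in each tail

def is_significant(t_stat, tails):
    """Decide significance at the 5% level for a one- or two-tailed test."""
    if tails == 1:
        # Alternative hypothesis: the first mean is greater, so T must be positive.
        return t_stat > ONE_TAILED_CRITICAL
    # Alternative hypothesis: the means differ in either direction.
    return abs(t_stat) > TWO_TAILED_CRITICAL

for t_stat in (2.0, -2.0, 2.5):
    print(f"T = {t_stat:5.2f}: one-tailed {is_significant(t_stat, 1)}, "
          f"two-tailed {is_significant(t_stat, 2)}")
```

A T of 2.0 is significant only for the one-tailed test, and only because the direction was predicted in advance; the same size of effect in the opposite direction (T = -2.0) counts for nothing.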
Example
A friend throws a frisbee. It bounces off a pole, goes to the roof of the house, rolls along an arc, flips off the gutter, and then lands in the fountain. Are you impressed?
A friend predicts that the frisbee will do the above, and then it happens. Are you impressed?
As with the frisbee, statistical analysis depends on how far you are willing to stick your neck out.
Confidence Interval
States the range of values that contains the true value within a given percent confidence.
Depends on number of samples, and desired confidence.
Not a statistical test of significance, but related to the T-test
[Figure: Gaussian Distribution, probability density vs. value of x]
• The more samples we have, the narrower the confidence interval.
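The narrowing can be seen by computing a 95% confidence interval for the mean at several sample sizes. This is a sketch under stated assumptions: the sample mean (12) and sample standard deviation (3) are hypothetical and held fixed, the interval is taken as mean ± t × sd / sqrt(N), and the t values are standard two-tailed 95% table entries for N - 1 degrees of freedom.

```python
import math

# Two-tailed 95% critical values of Student's t, keyed by sample size N
# (degrees of freedom are N - 1): standard table entries.
T_CRITICAL_95 = {5: 2.776, 10: 2.262, 30: 2.045}

def confidence_interval(mean, sd, n):
    """95% confidence interval for the mean of n points with sample sd."""
    half_width = T_CRITICAL_95[n] * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

SAMPLE_MEAN, SAMPLE_SD = 12.0, 3.0    # hypothetical sample statistics

intervals = {n: confidence_interval(SAMPLE_MEAN, SAMPLE_SD, n) for n in T_CRITICAL_95}
for n, (low, high) in intervals.items():
    print(f"N = {n:2d}: {low:.2f} to {high:.2f} (width {high - low:.2f})")
```

Both the 1/sqrt(N) factor and the smaller t value at higher N contribute to the narrower interval.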