T-Tests and ANOVAs



Transcript of T-Tests and ANOVAs

  • Jennifer Siegel

  • Objectives: statistical background, the Z-test, the T-test, ANOVAs

  • Science tries to predict the future. Genuine effect? We attempt to strengthen predictions with stats, and use the p-value to indicate our level of certainty that the result = a genuine effect on the whole population (more on this later).

  • Develop an experimental hypothesis. H0 = null hypothesis; H1 = alternative hypothesis. Statistically significant result: p-value = .05.

  • The p-value is the probability that the observed result was obtained by chance. With the α level set at .05 (5%), we can be 95% certain that our experimental effect is genuine.

  • Type 1 error = false positive; Type 2 error = false negative. Power = 1 − probability of a Type 2 error.

  • Let's pretend you came up with the following theory: having a baby increases brain volume (associated with possible structural changes).

  • Z-test and T-test

  • Population

  • Cost. Not able to include everyone. Too time consuming. Ethical right to privacy.

    Realistically, researchers can only do sample-based studies

  • t = difference between sample means / standard error of the sample means. Degrees of freedom = sample size − 1.

  • H0 = there is no difference in brain size before and after giving birth. H1 = the brain is significantly smaller or significantly larger after giving birth (a difference is detected).

  • t = (1271.0 − 1236.2) / 5.19, i.e. the difference between the means divided by the standard error of the difference, ≈ 6.72 (see the sheet below).

    Sheet1

    Before Delivery | 6 Weeks After Delivery | Difference
    1437.4          | 1494.5                 | 57.1
    1089.2          | 1109.7                 | 20.5
    1201.7          | 1245.4                 | 43.7
    1371.8          | 1383.6                 | 11.8
    1207.9          | 1237.7                 | 29.8
    1150.7          | 1180.1                 | 29.4
    1221.9          | 1268.8                 | 46.9
    1208.7          | 1248.3                 | 39.6
    Sum:  9889.3    | 10168.1                | 278.8
    Mean: 1236.1625 | 1271.0125              | 34.85
    SD:   113.8544928218 | 119.0413426084    | 5.1868497865
    T:    6.7189144537
    DF:   7

    (The sheet also lists five further before/after pairs that are not included in the sums above: 946.2/975.2, 1036.2/1070.3, 999.9/1027.3, 1274.3/1303.9 and 1183.4/1238.7.)

  • Using a p-value calculator (http://www.danielsoper.com/statcalc/calc08.aspx): women have a significantly larger brain after giving birth.
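    As a quick sanity check, the same repeated-measures (paired) t-test can be run in Python on the eight before/after pairs from the sheet above. This is a minimal sketch using scipy; the t it reports (roughly 6.7) differs slightly from the sheet's 6.7189, which appears to use a slightly different standard-error figure.

        # Paired (repeated-measures) t-test on the before/after brain-volume data above
        import numpy as np
        from scipy import stats

        before = np.array([1437.4, 1089.2, 1201.7, 1371.8, 1207.9, 1150.7, 1221.9, 1208.7])
        after  = np.array([1494.5, 1109.7, 1245.4, 1383.6, 1237.7, 1180.1, 1268.8, 1248.3])

        diff = after - before
        t, p = stats.ttest_rel(after, before)      # equivalent to testing mean(diff) against 0
        print("mean difference:", diff.mean())     # 34.85, as in the sheet
        print("t =", t, " df =", len(diff) - 1, " two-tailed p =", p)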


  • One-sample (sample vs. hypothesized mean); independent groups (2 separate groups); repeated measures (same group, different measurements).
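    For reference, each of these three t-test variants has a direct scipy counterpart. The sketch below uses made-up illustrative numbers (the arrays and the hypothesised mean of 10 hours of sleep are illustrative only).

        import numpy as np
        from scipy import stats

        sleep  = np.array([8.1, 9.2, 7.5, 8.8, 9.0, 7.9])    # toy sample of nightly sleep hours
        grp_a  = np.array([5.1, 6.3, 5.8, 6.0, 5.5])          # toy independent group A
        grp_b  = np.array([6.9, 7.4, 6.8, 7.1, 7.6])          # toy independent group B
        time_1 = np.array([1.2, 1.4, 1.1, 1.5, 1.3])          # same people, first measurement
        time_2 = np.array([1.3, 1.6, 1.2, 1.6, 1.5])          # same people, second measurement

        print(stats.ttest_1samp(sleep, popmean=10))   # one-sample: sample vs. hypothesized mean
        print(stats.ttest_ind(grp_a, grp_b))          # independent groups: 2 separate groups
        print(stats.ttest_rel(time_1, time_2))        # repeated measures: same group, paired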

  • ANalysis Of VAriance. Factor = what is being compared (type of pregnancy); levels = the different elements of a factor (age of mother). F-statistic; post hoc testing.

  • One-way ANOVA: 1 factor with more than 2 levels. Factorial ANOVA: more than 1 factor. Mixed-design ANOVA: some factors are independent, others are related.

  • There is a significant difference somewhere between groups

    NOT where the difference lies

    Finding exactly where the difference lies requires further statistical analysis = post hoc analysis
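    A minimal sketch of this workflow in Python, with made-up numbers for three illustrative pregnancy groups: scipy's one-way ANOVA gives the omnibus F-test, and simple pairwise t-tests with a Bonferroni-corrected alpha stand in for the post hoc step (other post hoc procedures exist; this is just one common choice).

        from itertools import combinations
        import numpy as np
        from scipy import stats

        groups = {                                   # illustrative data only
            "normal":       np.array([3.1, 3.4, 2.9, 3.6, 3.2]),
            "preeclamptic": np.array([2.5, 2.8, 2.6, 2.9, 2.4]),
            "twin":         np.array([3.0, 3.3, 3.1, 3.5, 3.4]),
        }

        # Omnibus test: is there a difference somewhere between the group means?
        F, p = stats.f_oneway(*groups.values())
        print("F =", F, " p =", p)

        # Post hoc: pairwise t-tests with a Bonferroni-corrected alpha to find where it lies
        alpha = 0.05 / 3                             # three pairwise comparisons
        for (name_a, a), (name_b, b) in combinations(groups.items(), 2):
            t, p = stats.ttest_ind(a, b)
            print(name_a, "vs", name_b, " p =", p, " significant:", p < alpha)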

  • Z-tests are for populations; T-tests are for samples; ANOVAs compare more than 2 groups in more complicated scenarios.

  • Varun V. Sethi

  • Objective

    Correlation

    Linear Regression

    Take Home Points.

  • Correlation: how linear is the relationship between two variables? (descriptive). Regression: how well does a linear model explain my data? (inferential).

  • Correlation reflects the noisiness and direction of a linear relationship (top row of the figure), but not the slope of that relationship (middle row), nor many aspects of nonlinear relationships (bottom row).

  • Strength and direction of the relationship between variables; scattergrams.

  • Measures of Correlation

    1) Covariance

    2) Pearson correlation coefficient (r)

  • 1) Covariance

    The covariance is a statistic representing the degree to which 2 variables vary together.

    (Note that sx² = cov(x, x).)

  • A statistic representing the degree to which 2 variables vary together.

    Covariance formula: cov(x, y) = Σ (xi − x̄)(yi − ȳ) / (n − 1)

    cf. the variance formula: sx² = Σ (xi − x̄)² / (n − 1)

  • 2) Pearson correlation coefficient (r)

    r is a kind of normalised (dimensionless) covariance: r = cov(x, y) / (sx · sy), where s is the standard deviation of the sample.

    r takes values from −1 (perfect negative correlation) to 1 (perfect positive correlation); r = 0 means no correlation.
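    A minimal sketch of both measures in Python on illustrative data (the x and y values are made up): the covariance from numpy, and r computed both by hand from the formula above and with scipy for comparison.

        import numpy as np
        from scipy import stats

        x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
        y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

        # Covariance: degree to which x and y vary together (n - 1 in the denominator)
        cov_xy = np.cov(x, y, ddof=1)[0, 1]

        # Pearson r: covariance normalised by the two sample standard deviations
        r_manual   = cov_xy / (x.std(ddof=1) * y.std(ddof=1))
        r_scipy, p = stats.pearsonr(x, y)

        print(cov_xy, r_manual, r_scipy)             # the two r values agree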

  • Limitations:

    Sensitive to extreme values

    Describes a relationship, not a prediction

    Not causality

  • Regression: Prediction of one variable from knowledge of one or more other variables

  • How well does a linear model (y = ax + b) explain the relationship between two variables? If there is such a relationship, we can predict the value of y for a given x.

  • Linear dependence between 2 variables: two variables are linearly dependent when the increase of one variable is proportional to the increase of the other.

    Examples: the energy needed to boil water; the money needed to buy coffeepots

  • Fitting data to a straight line (or vice versa): here ŷ = ax + b, where ŷ is the predicted value of y, a is the slope of the regression line and b is the intercept. The residual error (εi) is the difference between the obtained and predicted values of y, i.e. yi − ŷi.

    The best-fit line (the values of a and b) is the one that minimises the sum of squared errors, SSerror = Σ (yi − ŷi)².

  • Adjusting the straight line to the data: minimise Σ (yi − ŷi)², which is Σ (yi − axi − b)².

    The minimum of SSerror is at the bottom of the curve, where the gradient is zero, and this can be found with calculus.

    Take the partial derivatives of Σ (yi − axi − b)² with respect to the parameters a and b and solve for zero as simultaneous equations, giving a = cov(x, y) / sx² and b = ȳ − a·x̄.

    This can always be done
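    A minimal sketch of that result on made-up data: the slope comes out as cov(x, y) / sx² and the intercept as ȳ − a·x̄, and numpy's built-in least-squares fit gives the same line.

        import numpy as np

        x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # illustrative data
        y = np.array([2.3, 4.1, 6.2, 7.8, 10.1, 11.9])

        # Closed-form solution from setting the partial derivatives of SSerror to zero
        a = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)   # slope
        b = y.mean() - a * x.mean()                           # intercept

        # The same fit from numpy's least-squares polynomial routine
        a_np, b_np = np.polyfit(x, y, deg=1)
        print(a, b)
        print(a_np, b_np)                                     # should match the line above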

  • We can calculate the regression line for any data, but how well does it fit the data?

    Total variance = predicted variance + error variance: sy² = sŷ² + ser². Also, it can be shown that r² is the proportion of the variance in y that is explained by our regression model: r² = sŷ² / sy².

    Insert r²·sy² into sy² = sŷ² + ser² and rearrange to get:

    ser² = sy² (1 − r²)

    From this we can see that the greater the correlation, the smaller the error variance, so the better our prediction
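    The variance partition can be checked numerically; a minimal sketch on the same kind of made-up data, showing sy² = sŷ² + ser² and r² = sŷ² / sy² (up to floating-point error).

        import numpy as np

        x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # illustrative data
        y = np.array([2.3, 4.1, 6.2, 7.8, 10.1, 11.9])

        a, b  = np.polyfit(x, y, deg=1)
        y_hat = a * x + b                    # predicted values
        resid = y - y_hat                    # residual errors

        s_y2   = np.var(y,     ddof=1)       # total variance
        s_hat2 = np.var(y_hat, ddof=1)       # predicted (explained) variance
        s_err2 = np.var(resid, ddof=1)       # error variance

        r = np.corrcoef(x, y)[0, 1]
        print(s_y2, s_hat2 + s_err2)         # total = predicted + error
        print(r**2, s_hat2 / s_y2)           # r squared = explained / total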

  • Do we get a significantly better prediction of y from our regression equation than by just predicting the mean?

    F-statistic

  • Prediction / forecasting; quantifying the strength of the relationship between y and the Xj (X1, X2, X3, ...).

  • A General Linear Model is just any model that describes the data in terms of a straight line.

    Linear regression is actually a form of the General Linear Model where the parameters are b, the slope of the line, and a, the intercept: y = bx + a + ε

  • Multiple regression is used to determine the effect of a number of independent variables, x1, x2, x3, etc., on a single dependent variable, y. The different x variables are combined in a linear way and each has its own regression coefficient:

    y = b0 + b1x1 + b2x2 + ... + bnxn + ε

    The b parameters reflect the independent contribution of each independent variable, x, to the value of the dependent variable, y, i.e. the amount of variance in y that is accounted for by each x variable after all the other x variables have been accounted for.
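    A minimal sketch of a multiple regression fit in Python (the data are simulated, and the coefficient values 2.0, 1.5 and -0.7 are arbitrary choices used only to generate them): the b parameters are estimated by ordinary least squares on a design matrix whose first column of ones carries the intercept b0.

        import numpy as np

        rng = np.random.default_rng(0)
        x1 = rng.normal(size=50)
        x2 = rng.normal(size=50)
        y  = 2.0 + 1.5 * x1 - 0.7 * x2 + rng.normal(scale=0.5, size=50)   # simulated data

        # Design matrix: a column of ones (for b0) plus one column per independent variable
        X = np.column_stack([np.ones_like(x1), x1, x2])

        # Least-squares estimates of b0, b1, b2
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        print("b0, b1, b2 =", b)              # should land close to 2.0, 1.5, -0.7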

  • Take Home Points

    Correlated doesn't mean related: any two variables increasing or decreasing over time will show a nice correlation, e.g. CO2 concentration in the air over Antarctica and lodging rental cost in London. Beware in longitudinal studies!!!

    A relationship between two variables doesn't mean causality (e.g. leaves on the forest floor and hours of sun).

  • Linear regression is a GLM that models the effect of one independent variable, x, on one dependent variable, y

    Multiple Regression models the effect of several independent variables, x1, x2 etc, on one dependent variable, y

    Both are types of General Linear Model

  • Thank You

    *We can use information about distributions to decide how probable it is that the results of an experiment looking at variable x support a particular hypothesis about the distribution of variable y in the population. This is the central aim of experimental science, and it is how statistical tests work: we test a sample distribution (our experimental results) against a hypothesised distribution, resulting in a p-value for how likely it is that we would obtain our results under the null hypothesis (null hypothesis = there is no effect or difference between conditions), i.e. how likely it is that our results were a fluke!

    *Normal distribution

    The x-axis represents the values of a particular variable. The y-axis represents the proportion of members of the population that have each value of the variable. The area under the curve represents probability.

    Mean and standard deviation tell you the basic features of a distribution: the mean is the average value of all members of the group; the standard deviation is a measure of how much the values of individual members vary in relation to the mean. The normal distribution is symmetrical about the mean, and 68% of the normal distribution lies within 1 s.d. of the mean.

    Now, it's important to remember that not all data have this distribution, and if your data don't fit this normal distribution you will have to use another type of test. Normal distribution of the data is an assumption of t-tests. If your data don't look like this, you have to try another statistical test (like chi-squared).
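    The 68% figure (and the roughly 95% within 2 s.d. used later for z-scores) can be recomputed directly from the standard normal distribution; a minimal sketch with scipy:

        from scipy import stats

        # Area under the standard normal curve within 1 and 2 standard deviations of the mean
        within_1sd = stats.norm.cdf(1) - stats.norm.cdf(-1)   # about 0.68
        within_2sd = stats.norm.cdf(2) - stats.norm.cdf(-2)   # about 0.95
        print(within_1sd, within_2sd)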

    *A hypothesis is a prediction that you have about a specific group.

    H1 = the experimental hypothesis: there is a statistically significant difference between sample and population (or between samples). H0 = the null hypothesis: your experimental group is no different from the rest of the population, i.e. no statistically significant difference between sample and population (or between samples).

    To get a statistically significant result, you need to show that your experimental group falls in the improbable tail of the distribution.

    One-sided test. A one-sided test is like it sounds: the values for which you reject the null hypothesis form an entire section (one tail) of the distribution. You would use this if you wanted to know only whether your experimental group is significantly greater than another group.

    A one-sided test is a statistical hypothesis test in which the values for which we can reject the null hypothesis, H0, are located entirely in one tail of the probability distribution. In other words, the critical region for a one-sided test is the set of values less than the critical value of the test, or the set of values greater than the critical value of the test. A one-sided test is also referred to as a one-tailed test of significance.

    Two-sided test

    A two-sided test is a statistical hypothesis test in which the values for which we can reject the null hypothesis, H0, are located in both tails of the probability distribution. You would basically use this if you wanted to know whether your experimental group is greater than or less than another group.

    In other words, the critical region for a two-sided test is the set of values less than a first critical value of the test and the set of values greater than a second critical value of the test. A two-sided test is also referred to as a two-tailed test of significance.

    The choice between a one-sided and a two-sided test is determined by the purpose of the investigation or prior reasons for using a one-sided test.

    *P-value = the probability that the observed result was obtained by chance. Power is a separate concept:

    The power of a statistical hypothesis test measures the test's ability to reject the null hypothesis when it is actually false - that is, to make a correct decision.

    The α level is set a priori (usually 0.05). With this α level, we can be 95% certain that our experimental effect is genuine.

    If p is less than the set α level, then you reject the null hypothesis and accept the experimental hypothesis. If, however, p is greater than the α level, then we reject the experimental hypothesis and accept the null hypothesis.

    Type I error = false positive. Where we incorrectly accept the alternative/ experimental hypothesis.

    An α level of 0.05 means that there is a 5% risk that a Type I error will be encountered.

    Type II error = false negative. Reject the alternative/ experimental hypothesis, when it should be accepted.

    Scientists care more about accepting a false result than about rejecting a true one. Power = 1 − the probability of a Type II error. The maximum power a test can have is 1, the minimum is 0. Ideally we want a test to have high power, close to 1.

    Beware of errors: an α level of 0.05 means that there is a 5% risk that a Type I error will be encountered. The other type of error is the Type II error = false negative, where we incorrectly reject the experimental hypothesis.

    Here is an example of how these errors work. Let's say you have just started dating someone.

    On your first date, he or she mentioned that their birthday was coming up this week, but you can't remember the exact day. It might be today, or maybe not. Embarrassed to admit it, you decide to make a guess. You have two choices: when you see them today you can say "Happy Birthday!", or you can say nothing, hoping that today isn't their birthday. The reality behind the situation is pretty simple: either today is their birthday or it isn't.

    Saying "Happy Birthday!" when it is not their birthday is like a Type 1 error, a false positive: you are saying it is their birthday when, in fact, it isn't. Conversely, staying quiet when today is their birthday is like a Type 2 error, a false negative: it is their birthday, you said nothing, and you missed it.

    *Example of a repeated-measures t-test: take the same group and test brain volume before birth and after birth. It's important to understand the difference between these two (population and sample).

    Population is the entire group with everyone included; an example of this would be the US census. If you already know the variance of the general population, you can use the z-test.

    Realistically, researchers can't sample everyone, so they have to use the t-test when they do not know the variance of the general population and only have a sub-group. However, I should point out here that the strength of this test is dependent on the number of participants.

    *This is the formula for a one-sample z-test, z = (sample mean − population mean) / (population s.d. / √n). Basically, it is a formula that is set up to compare groups in a standard way through this linear formula.

    Again, you use this formula when you have the variance of the entire population. Variance describes how far values lie from the mean.

    So, you'll plug in the mean for the group you are looking at, the mean for the population and the standard deviation of the population to get a z-score.
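    A minimal sketch of that calculation, with illustrative numbers (the sample mean, population mean, population s.d. and sample size below are all made up):

        import math
        from scipy import stats

        sample_mean = 104.0     # illustrative sample mean
        pop_mean    = 100.0     # hypothesised population mean
        pop_sd      = 15.0      # known population standard deviation
        n           = 36        # sample size

        # One-sample z-test: z = (sample mean - population mean) / (sigma / sqrt(n))
        z = (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))
        p_two_tailed = 2 * stats.norm.sf(abs(z))   # area in both tails
        print(z, p_two_tailed)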

    One way to make distributions directly comparable is to standardise them by computing a linear transformation. The standardised normal distribution does exactly that; this can be thought of as expressing your data in the same units. Therefore, if you remember from the previous slide, the range of 2 standard deviations around the mean covers approximately 95%; because the standard deviation of a standardised normal distribution is 1, a z-score of +2 or −2, i.e. ±2 s.d., gives the boundary for our confidence interval. (Only for 2-tailed tests! Compare the distribution around the mean versus the area from infinity to z = 2.0.)

    Often you don't know the s.d. of the hypothesised or comparison population, and so you use a t-test. This uses the sample s.d. or variance instead. The test introduces a source of error, which decreases as your sample size increases.

    Therefore, the t-statistic is distributed differently depending on the size of the sample, like a family of normal curves. The degrees of freedom (df = sample size − 1) indicate which of these curves you are relating your t-value to. There are different tables of p-values for different degrees of freedom. Larger sample = more squashed t-statistic distribution = easier to get significance.
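    This can be seen by asking scipy for the two-tailed critical t value at α = 0.05 for a few different degrees of freedom (the df values chosen here are arbitrary examples):

        from scipy import stats

        for df in (3, 7, 30, 1000):
            print(df, stats.t.ppf(0.975, df))   # two-tailed cut-off at alpha = 0.05
        # As df grows the cut-off shrinks towards the z value of about 1.96,
        # i.e. a larger sample makes a given t easier to declare significant.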

    *So, you would use a two-sample t-test if you wanted to determine whether two samples are different. So, we'll look at our previous hypothesis: does having a baby increase your brain size?

    *Because one wants to know whether brain size after giving birth is increased or lessened compared with the pre-birth group, this will be a two-tailed t-test. If you only wanted to find out whether brain size is greater after giving birth, this would be a one-tailed t-test. This is our sample, taken directly from a paper. This is particularly important because when using SPSS you will have to specify which type of statistical test you are using.

    I'm putting this up to show that one can take a group and try to find out whether it is different from a hypothesised population. In this formula you have (sample mean − hypothesised population mean) / standard error. This is what you use when you compare the mean of one sample to a given value.

    You might use this if you were testing sleep behaviour and you thought the sample that you had was not the norm. You could hypothesise that the population sleeps 10 hours a night to determine whether your sample was significantly different from this. A one-sample t-test is a hypothesis test for answering questions about the mean where the data are a random sample of independent observations from an underlying normal distribution N(μ, σ²), where σ² is unknown.

    So, let's say you want to compare more than 2 groups. Let's say you want to look at normal pregnancies vs. preeclamptic groups, or you want to compare several different times a brain scan was taken. In an experiment with more than 2 samples or more than 2 tasks (or 2 samples and 2 tasks), one could do lots of t-tests and compare all the different groups with each other this way, but doing so actually increases the possibility of accepting the experimental hypothesis when it's wrong. You'll remember this as the false-positive situation (this is referred to as the familywise/experimentwise error rate). It is much better to use ANOVA.
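    The inflation of the false-positive risk is easy to quantify for independent tests; a small sketch of the familywise error rate 1 - (1 - α)^k (assuming independence, which real comparisons rarely satisfy exactly):

        # Probability of at least one Type I error (false positive) across k tests at alpha = 0.05
        alpha = 0.05
        for k in (1, 3, 6, 10):
            familywise = 1 - (1 - alpha) ** k
            print(k, familywise)
        # With 10 comparisons the chance of at least one spurious "significant" result is ~40%,
        # which is why a single omnibus ANOVA is preferred over many separate t-tests.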

    In its simplest form, ANOVA provides a statistical test of whether or not the means of several groups are all equal, and therefore generalizes the t-test to more than two groups.

    It looks to see whether the variation between groups is bigger than the variation within groups.

    ANOVA is concerned with differences between means of groups, not differences between variances. The name analysis of variance comes from the way the analysis uses variances to decide whether the means are different.

    The way it works is simple: the statistical procedure looks to see what the variation (variance) is within the groups, then works out how that variation would translate into variation (i.e. differences) between the groups, taking into account how many subjects there are in the groups. If the observed differences are a lot bigger than what you'd expect by chance, you have statistical significance. (So if the patterns of data spread are similar in your different samples, the means won't be much different, i.e. the samples are probably from the same population; conversely, if the pattern of variance differs between groups, so will the means, and thus the samples are likely to be drawn from different populations.)

    Terminology. Factors: the overall things being compared (type of pregnancy). Levels: the different elements of a factor (young vs. old, time to pregnancy).

    ANOVA tests for one overall effect only, so it can tell us whether the experimental manipulation was generally successful, but it doesn't provide specific information about which specific groups were affected. Hence the need for post-hoc testing!

    ANOVA produces an F-statistic or F-ratio, which is similar to the t-score in that it compares the amount of systematic variance in the data to the amount of unsystematic variance. As such, it is the ratio of the experimental effect to the individual differences in performance.

    If the F-ratio's value is less than 1, it must represent a non-significant effect (so you always want an F-ratio greater than 1, indicating that the experimental manipulation had some effect above and beyond the effect of individual differences in performance).

    To test for significance, compare the obtained F-ratio against the maximum value one would expect to get by chance alone (the critical value) in an F-distribution with the same degrees of freedom. The p-value associated with F is the probability that the differences between groups could occur by chance if the null hypothesis is correct.
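    A minimal sketch of that comparison with scipy (the degrees of freedom and the obtained F below are illustrative, e.g. 3 groups of 5 subjects giving df_between = 2 and df_within = 12):

        from scipy import stats

        f_crit = stats.f.ppf(0.95, dfn=2, dfd=12)    # critical F at alpha = 0.05
        f_obtained = 5.10                            # illustrative obtained F-ratio
        p = stats.f.sf(f_obtained, dfn=2, dfd=12)    # p-value for the obtained F
        print(f_crit, f_obtained > f_crit, p)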

    The type that I have described is referred to as a one-way ANOVA, because it has one factor (with more than 2 levels). You can also have two-way or three-way ANOVAs; these = factorial ANOVAs, which allow for possible interactions between factors as well as main effects. For example, you could have 2 factors with 2 levels each; this would = a 2 x 2 factorial design. You can also have related or independent designs, or a mixture.

    There is a significant difference between the groups

    NOT where this difference lies

    Finding exactly where the differences lie requires further statistical analyses. So when you are running a particular statistical test, you'll specify that you want to have post hoc values listed, but you'll need to make sure the overall value is significant. T-tests assess whether two group means differ significantly; they can compare two samples or one sample to a given value. ANOVAs compare more than two groups or more complicated scenarios.