Comparison of Means


  • 7/29/2019 Comparison of Means


    T distribution

    Student t-test


    Steps in hypothesis testing concerning $\mu$

    Steps, with examples:

    1. Set up the hypotheses and select the level of significance.
       Example: $H_0: \mu = \mu_0$, $H_1: \mu > \mu_0$; $\alpha = 0.05$

    2. Select the appropriate test statistic.
       Example: $Z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$

    3. Generate the decision rule.
       Example: reject $H_0$ if $Z \ge Z_{1-\alpha}$; do not reject $H_0$ if $Z < Z_{1-\alpha}$

    4. Compute the test value.

    5. Draw a conclusion about $H_0$ by comparing the test value (4) to the decision rule (3).
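The five steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the alternative is taken to be upper-tailed (as the decision rule $Z \ge Z_{1-\alpha}$ suggests), the function name `one_sample_z_test` is made up for this sketch, and the input numbers are hypothetical rather than taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(xbar, mu0, sigma, n, alpha=0.05):
    """Upper-tailed Z test of H0: mu = mu0 against H1: mu > mu0 (sigma known)."""
    # Step 3: critical value Z_{1-alpha} from the standard normal distribution
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Step 4: test value
    z = (xbar - mu0) / (sigma / sqrt(n))
    # Step 5: compare the test value with the decision rule
    return z, z_crit, z >= z_crit


# Hypothetical inputs, chosen only to exercise the function
z, z_crit, reject = one_sample_z_test(xbar=52.0, mu0=50.0, sigma=6.0, n=36)
print(f"Z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

For a two-sided alternative the critical value would instead be $Z_{1-\alpha/2}$ and the comparison would use $|Z|$.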


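    Test of hypothesis concerning $\mu$

    Assumptions: normal distribution or large sample ($n \ge 30$); simple random samples.

    Case 1: $\sigma$ known: $\bar{X} \pm z\,(\sigma/\sqrt{n})$

    Case 2: $\sigma$ unknown and $n \ge 30$: $\bar{X} \pm z\,(s/\sqrt{n})$

    Case 3: $\sigma$ unknown and $n < 30$: $\bar{X} \pm t\,(s/\sqrt{n})$, df = $n - 1$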


    Estimation with two samples is concerned with estimating $\mu_1 - \mu_2$, the difference in means between groups/populations.

    Tests of hypotheses in the two-sample case are also concerned with the difference in the means.

    E.g. $H_0: \mu_1 - \mu_2 = 0$ (no difference in means) vs $H_1: \mu_1 - \mu_2 \ne 0$ (means are different)

    Or $H_1: \mu_1 > \mu_2$ (the mean of population 1 is larger than the mean of population 2)

    Or $H_1: \mu_1 < \mu_2$ (the mean of population 1 is smaller than the mean of population 2)


    General format

    Test value = $\dfrac{(\text{observed}) - (\text{expected})}{\text{standard error}}$

    $\bar{X}_1 - \bar{X}_2$ is the observed difference, and $\mu_1 - \mu_2$ is the expected difference, which is 0 when the null hypothesis is $\mu_1 = \mu_2$, since that is equivalent to $\mu_1 - \mu_2 = 0$.

    The standard error of the difference is $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$.


    If $\sigma_1^2$ and $\sigma_2^2$ are not known, the researcher can use the variances $s_1^2$ and $s_2^2$ obtained from the respective samples, provided both sample sizes are 30 or more. The formula then is:

    $Z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$

    where the numerator is the difference in means and the denominator is the standard error of the difference in means, provided $n_1 \ge 30$ and $n_2 \ge 30$.


    In a comparison between two means, the same basic steps of hypothesis testing for Z are followed.

    When comparing two means by using the t-test, the researcher must decide whether the two samples are independent or dependent.

    Two assumptions for the test of a difference between two means:

    • The samples must be independent of each other.
    • The populations from which the samples were drawn must be normally distributed.


    Student t-test

    1. Testing the difference between two means: independent large samples

    2. Testing the difference between two means: independent small samples

    3. Testing the difference between two means: small dependent samples


    Testing the difference between two means of independent large samples

    General formula

    Test statistic: $Z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$

    Confidence interval: $(\bar{X}_1 - \bar{X}_2) \pm Z_{1-\alpha/2}\,(\text{se})$


    Example 1.

    A survey found that the average hotel room rate in Zaria is N88.42 and the average room rate in Funtua is N80.61. Assume that the data were obtained from two samples of 50 hotels each and that the standard deviations were N5.62 and N4.83, respectively. At $\alpha = 0.05$, can it be concluded that there is a significant difference in the rates?


    Solution

    Step 1. State the hypotheses.
    $H_0: \mu_1 = \mu_2$ and $H_1: \mu_1 \ne \mu_2$ (claim)

    Step 2. Find the critical value.
    $Z = \pm 1.96$

    Step 3. Compute the test value.
    $Z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$; thus, by substitution,

    $Z = \dfrac{88.42 - 80.61}{\sqrt{5.62^2/50 + 4.83^2/50}}$
    $= \dfrac{7.81}{\sqrt{31.5844/50 + 23.3289/50}}$
    $= \dfrac{7.81}{\sqrt{0.6317 + 0.4666}}$
    $= \dfrac{7.81}{\sqrt{1.0983}}$
    $= 7.81/1.048$ (note that 1.048 is the se)
    $= 7.4523$

    Step 4. Make the decision. If the computed value exceeds the tabulated value, reject the null hypothesis ($H_0$). Here $7.4523 > 1.96$, so $H_0$ is rejected.

    Step 5. Summarize the result.
    There is enough evidence to support the claim that the means are not equal. Hence there is a significant difference in the rates.
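As a quick check of the arithmetic in Example 1, the test value can be recomputed in Python. This is a minimal sketch: the helper name `two_sample_z` is an assumption of this note, and the critical value 1.96 is the two-sided table value quoted in the solution.

```python
from math import sqrt


def two_sample_z(mean1, mean2, s1, s2, n1, n2):
    """Return (Z, se) for the difference between two large-sample means."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)  # standard error of the difference
    return (mean1 - mean2) / se, se


# Hotel room rates from Example 1 (Zaria vs Funtua)
z_value, se = two_sample_z(88.42, 80.61, 5.62, 4.83, 50, 50)
reject_h0 = abs(z_value) > 1.96  # two-sided test at alpha = 0.05

print(f"se = {se:.4f}, Z = {z_value:.3f}, reject H0: {reject_h0}")
```

Without intermediate rounding the test value comes out at about 7.45, in line with the hand calculation.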


    Fixing of confidence limits

    $(\bar{X}_1 - \bar{X}_2) \pm Z\sqrt{S_1^2/n_1 + S_2^2/n_2}$

    $(88.42 - 80.61) \pm 1.96\sqrt{5.62^2/50 + 4.83^2/50}$
    $7.81 \pm 1.96\sqrt{31.5844/50 + 23.3289/50}$
    $7.81 \pm 1.96\sqrt{0.6317 + 0.4666}$
    $7.81 \pm 1.96\sqrt{1.0983}$
    $7.81 \pm 1.96\,(1.048)$
    $7.81 \pm 2.0541$, therefore
    $7.81 - 2.0541 = 5.7559$ and $7.81 + 2.0541 = 9.8641$

    CI = (5.7559, 9.8641)
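The confidence limits can be recomputed with a short Python sketch. The function name `diff_means_ci` is hypothetical, and the critical value is taken from `statistics.NormalDist` instead of the rounded table value 1.96, so the last digits may differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def diff_means_ci(mean1, mean2, s1, s2, n1, n2, confidence=0.95):
    """Large-sample confidence interval for mu1 - mu2."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{1-alpha/2}
    margin = z * se
    diff = mean1 - mean2
    return diff - margin, diff + margin


# Hotel room rates from Example 1
low, high = diff_means_ci(88.42, 80.61, 5.62, 4.83, 50, 50)
print(f"95% CI for the difference in mean rates: ({low:.2f}, {high:.2f})")
```

The interval is roughly 7.81 plus or minus 2.05, and it lies entirely above 0.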


    Using the confidence interval to test hypotheses

    State the hypotheses.
    $H_0: \mu_1 - \mu_2 = 0$
    $H_1: \mu_1 - \mu_2 \ne 0$

    Make a decision. If the CI does not contain 0, reject the null hypothesis.

    The CI (5.7559, 9.8641) does not contain 0, therefore $H_0$ is rejected.

    Summary. There is enough evidence that the means are not the same: there is a significant difference in mean rates.


    Suppose the mean cholesterol level of males aged 50 is 241. An investigator wishes to examine whether cholesterol levels are significantly reduced by modifying diets only slightly. A random sample of 12 patients agree to participate in the study and follow the modified diet for 3 months. After 3 months, their cholesterol levels are measured and summary statistics are produced on the n = 12 subjects. The mean cholesterol level is 235 with a standard deviation of 12.5. Based on the data, is there statistical evidence that the modified diet reduces cholesterol?


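    1. Set up the hypotheses.
       $H_0: \mu = 241$
       $H_1: \mu < 241$

    2. Select the appropriate test statistic.
       $t = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$, df = $n - 1$ (used because $\sigma$ is unknown and $n < 30$)

    3. Decision rule.
       Reject $H_0$ if $t \le -1.796$ (df = 11, $\alpha = 0.05$).
       Do not reject $H_0$ if $t > -1.796$.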


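    4. Test value.
       $t = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$

       Substituting the values in the formula above:

       $t = \dfrac{235 - 241}{12.5/\sqrt{12}} = \dfrac{-6}{12.5/3.464} = \dfrac{-6}{3.6} = -1.66$

    5. Conclusion. Since $-1.66 > -1.796$, do not reject $H_0$: the data do not provide significant evidence ($\alpha = 0.05$) that the modified diet reduces cholesterol.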
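The test value for the cholesterol example can be verified with a minimal Python sketch. The function name `one_sample_t` is made up for this note, and the critical value -1.796 is the one-sided table value for df = 11 quoted in the decision rule; it is typed in as a constant because the Python standard library has no t distribution.

```python
from math import sqrt


def one_sample_t(xbar, mu0, s, n):
    """t statistic for H0: mu = mu0 when sigma is unknown."""
    return (xbar - mu0) / (s / sqrt(n))


t_value = one_sample_t(xbar=235, mu0=241, s=12.5, n=12)
T_CRIT = -1.796                   # table value, df = 11, one-sided alpha = 0.05
reject_null = t_value <= T_CRIT   # lower-tailed test: H1 is mu < 241

print(f"t = {t_value:.3f}, reject H0: {reject_null}")
```

The test value of about -1.66 does not fall at or below -1.796, so under this decision rule the null hypothesis is not rejected.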


    Example 4

    An investigation is undertaken to examine the mean times to relief from headache pain under two entirely different treatments: medication vs relaxation treatment. Patients suffering from chronic headaches are enrolled in a study and randomly assigned to one of the two treatments under investigation. Patients are instructed to either take the assigned medication or perform the relaxation exercises at the onset of their next headache. They are also instructed to record the time in minutes until the headache pain is resolved.


    Fifteen subjects are assigned to the medication treatment and report a mean time to relief of 33.8 minutes with a variance of 2.85 (minutes squared). A second random sample of 15 subjects is assigned to the relaxation treatment and reports a mean time to relief of 22.4 minutes with a variance of 3.07 (minutes squared).

    The data layout is shown on the next slide.


    Patients with chronic headaches are randomized to one of two groups:

    Medication                      Relaxation treatment
    $n_1 = 15$                      $n_2 = 15$
    $\bar{X}_1 = 33.8$ minutes      $\bar{X}_2 = 22.4$ minutes
    $S_1^2 = 2.85$                  $S_2^2 = 3.07$

    Are these sample means statistically significantly different? Run an appropriate test to assess whether there is a significant difference in the mean time to relief under the two different treatments, using a 5% level of significance.


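    Formula

    $t = \dfrac{\bar{X}_1 - \bar{X}_2}{S_p\sqrt{1/n_1 + 1/n_2}}$

    where $S_p$ is the pooled standard deviation:

    $S_p = \sqrt{\dfrac{\left(\sum X_1^2 - (\sum X_1)^2/n_1\right) + \left(\sum X_2^2 - (\sum X_2)^2/n_2\right)}{(n_1 - 1) + (n_2 - 1)}}$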


    Substituting in the formula, with $S_p = \sqrt{(14 \times 2.85 + 14 \times 3.07)/28} = \sqrt{2.96} = 1.72$:

    $t = \dfrac{33.8 - 22.4}{1.72\sqrt{1/15 + 1/15}} = \dfrac{11.4}{0.63} = 18.10$

    $t = 18.10 > 2.048$ ($t_{0.05}$, df = $n_1 + n_2 - 2 = 28$)

    Reject $H_0$: there is significant evidence of a difference in the mean relief time between medication and relaxation therapy.
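The pooled test can be reproduced from the summary statistics with a short Python sketch. The function name `pooled_t` is an assumption of this note, and the pooled variance is written in the equivalent form $((n_1-1)S_1^2 + (n_2-1)S_2^2)/(n_1+n_2-2)$ because only the sample variances, not the raw data, are given.

```python
from math import sqrt


def pooled_t(mean1, mean2, var1, var2, n1, n2):
    """Two-sample t statistic with a pooled standard deviation."""
    df = n1 + n2 - 2
    sp = sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / df)  # pooled SD
    se = sp * sqrt(1 / n1 + 1 / n2)                      # standard error
    return (mean1 - mean2) / se, sp, df


# Medication vs relaxation treatment (Example 4)
t_stat, sp, df = pooled_t(33.8, 22.4, 2.85, 3.07, 15, 15)
print(f"Sp = {sp:.2f}, df = {df}, t = {t_stat:.2f}")
```

Without rounding the standard error to 0.63, the statistic comes out near 18.15 instead of 18.10; either way it is far beyond the critical value for df = 28.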


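    Two dependent populations

    Attributes: samples are matched or paired, n (# pairs) $\ge 30$
      Test statistic: $Z = \dfrac{\bar{d} - \mu_d}{S_d/\sqrt{n}}$
      Confidence interval: $\bar{d} \pm Z_{1-\alpha/2}\,S_d/\sqrt{n}$

    Attributes: samples are matched or paired, n (# pairs) $< 30$
      Test statistic: $t = \dfrac{\bar{d} - \mu_d}{S_d/\sqrt{n}}$, df = $n - 1$
      Confidence interval: $\bar{d} \pm t_{1-\alpha/2}\,S_d/\sqrt{n}$, df = $n - 1$

    where $\bar{d}$ and $S_d$ are the mean and standard deviation of the difference scores.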


    Example

    A nutrition expert is examining a weight-loss programme to evaluate its effectiveness. Ten subjects were randomly selected for the investigation. Each subject's initial weight is recorded, they follow the programme for six weeks, and they are weighed again. The data are given below:


    Subject   Initial weight   Final weight
    1         180              165
    2         142              138
    3         126              128
    4         138              136
    5         175              170
    6         205              197
    7         116              115
    8         142              128
    9         157              144
    10        136              130


    Subject   Initial weight   Final weight   Difference (d)   Difference squared ($d^2$)
    1         180              165            15               225
    2         142              138            4                16
    3         126              128            -2               4
    4         138              136            2                4
    5         175              170            5                25
    6         205              197            8                64
    7         116              115            1                1
    8         142              128            14               196
    9         157              144            13               169
    10        136              130            6                36

    $\sum d = 66$, $\sum d^2 = 740$


    $\bar{d} = \sum d / n = 66/10 = 6.6$

    $S_d^2 = \dfrac{\sum d^2 - (\sum d)^2/n}{n - 1} = \dfrac{740 - (66)^2/10}{9} = 33.82$

    $S_d = \sqrt{33.82} = 5.82$


    Test the hypothesis

    1. Set up the hypotheses.
       $H_0: \mu_d = 0$
       $H_1: \mu_d \ne 0$

    2. Select the appropriate test statistic.
       $t = \dfrac{\bar{d} - \mu_d}{S_d/\sqrt{n}}$


    $t = \dfrac{6.6 - 0}{5.82/\sqrt{10}} = 3.59$

    df = $n - 1 = 10 - 1 = 9$

    $t_{calc} = 3.59$, $t_{tab}$ (0.05, df = 9) = 2.262

    $t_{calc} > t_{tab}$: reject $H_0$.

    We have significant evidence ($\alpha = 0.05$) to show that there is a mean weight loss following the six-week programme.
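The paired test can be recomputed directly from the ten pairs of weights. This is a minimal sketch using only the standard library; the variable names are made up for this note, and `statistics.stdev` is used because it computes the sample standard deviation with an $n - 1$ divisor, matching the formula for $S_d$.

```python
from math import sqrt
from statistics import mean, stdev

# Weights of the ten subjects before and after the six-week programme
initial = [180, 142, 126, 138, 175, 205, 116, 142, 157, 136]
final = [165, 138, 128, 136, 170, 197, 115, 128, 144, 130]

# Difference scores d = initial - final (positive values mean weight loss)
d = [before - after for before, after in zip(initial, final)]

d_bar = mean(d)    # mean difference
s_d = stdev(d)     # sample standard deviation of the differences
t_paired = (d_bar - 0) / (s_d / sqrt(len(d)))

print(f"sum d = {sum(d)}, d_bar = {d_bar:.1f}, S_d = {s_d:.2f}, t = {t_paired:.2f}")
```

The statistic of about 3.59 is then compared with the table value 2.262 for df = 9.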


    Fixing of confidence interval

    Recall: $\bar{d} \pm t_{1-\alpha/2}\,S_d/\sqrt{n}$

    $\bar{d} = 6.6$, $t_{1-\alpha/2} = 2.262$, $S_d = 5.82$, $\sqrt{10} = 3.162278$

    $6.6 \pm 2.262 \times 5.82/3.162278$
    $6.6 \pm 2.262 \times 1.8404$
    $6.6 \pm 4.1631$

    $6.6 + 4.1631 = 10.7631$
    $6.6 - 4.1631 = 2.4369$

    CI = (2.4369, 10.7631). The interval does not contain 0, so reject $H_0$: we have significant evidence ($\alpha = 0.05$) that the programme produces a mean weight loss after six weeks, in agreement with the t-test.
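The interval arithmetic is easy to get wrong by a factor of ten, so a short Python check is useful. This sketch assumes $S_d = 5.82$ and the table value $t_{1-\alpha/2} = 2.262$ for df = 9; the function name `paired_ci` is hypothetical.

```python
from math import sqrt


def paired_ci(d_bar, s_d, n, t_crit):
    """Confidence interval for the mean of paired differences."""
    margin = t_crit * s_d / sqrt(n)  # here S_d / sqrt(n) is about 1.84
    return d_bar - margin, d_bar + margin


ci_low, ci_high = paired_ci(d_bar=6.6, s_d=5.82, n=10, t_crit=2.262)
contains_zero = ci_low <= 0 <= ci_high

print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f}), contains 0: {contains_zero}")
```

With these inputs the margin is about 4.16, so the whole interval lies above 0.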


    Chi-square table


    Chi-square analysis

    • Goodness-of-fit test
    • Test of independence
    • Test of heterogeneity

    Used to test hypotheses about multi-variable data in one-sample and in two-or-more-sample applications.

    In each of these tests the test statistic follows the chi-square distribution ($\chi^2$).


    Goodness-of-fit test

    Test statistic: $\chi^2 = \sum \dfrac{(O - E)^2}{E}$

    where O = observed frequency and E = expected frequency.

    E.g. Volunteers at a teen hotline have been assigned based on the assumption that 40% of all calls are drug-related, 25% are sex-related, 25% are stress-related and 10% concern educational issues.


    For this investigation, each call is classified

    into one category based on the primary issue

    raised by the caller. To test the hypothesis, the

    following data are collected from 120 randomly selected calls placed to the teen

    hotline. Based on the data, is the assumption

    regarding the distribution appropriate?


    Topical issue      Drugs   Sex   Stress   Education
    Number of calls    52      38    21       9


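    1. Set up the hypotheses.
       $H_0: p_1 = 0.40,\ p_2 = 0.25,\ p_3 = 0.25,\ p_4 = 0.10$
       $H_1$: $H_0$ is false

    2. Select the appropriate test statistic.
       $\chi^2 = \sum \dfrac{(O - E)^2}{E}$

    3. Select the level of significance.
       $\alpha = 0.05$. Here we determine df = k - 1 = 4 - 1 = 3, where k is the number of categories.
       $\chi^2 = 7.815$ from the table at df = 3 and significance level 0.05.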


    4. Decision rule.
       Reject $H_0$ if $\chi^2 \ge 7.815$.
       Do not reject $H_0$ if $\chi^2 < 7.815$.

    5. Compute the test statistic.


    Organized computation of the test statistic

    Topical issue              Drugs             Sex               Stress            Education         Total
    O (observed frequency)     52                38                21                9                 120
    E (expected frequency)     120(0.40) = 48    120(0.25) = 30    120(0.25) = 30    120(0.10) = 12    120
    (O - E)                    4                 8                 -9                -3                0
    (O - E)^2 / E              (4)^2/48 = 0.33   (8)^2/30 = 2.13   (-9)^2/30 = 2.70  (-3)^2/12 = 0.75  5.917

    The test statistic is $\chi^2 = 5.917$ (summed from the unrounded terms).


    Conclusions

    Do not reject $H_0$, since $5.917 < 7.815$.

    We do not have significant evidence ($\alpha = 0.05$) to show that the distribution of topical issues in the calls placed to the teen hotline is not as assumed (40% drug-related, 25% sex-related, 25% stress-related and 10% education-related).
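The goodness-of-fit computation can be sketched in Python as follows. The hypothesized proportions are taken as 0.40, 0.25, 0.25 and 0.10 (the values in $H_0$), the critical value 7.815 is the table value for df = 3, and the variable names are made up for this note.

```python
# Observed calls by topical issue: drugs, sex, stress, education
observed = [52, 38, 21, 9]
# Proportions assumed under H0
proportions = [0.40, 0.25, 0.25, 0.10]

n_calls = sum(observed)
expected = [n_calls * p for p in proportions]  # 48, 30, 30, 12

# Chi-square statistic: sum of (O - E)^2 / E over the categories
chi2_gof = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

CHI2_CRIT = 7.815  # table value, df = 4 - 1 = 3, alpha = 0.05
reject_gof = chi2_gof >= CHI2_CRIT

print(f"chi-square = {chi2_gof:.3f}, reject H0: {reject_gof}")
```

The statistic is a little under 6, below the critical value, so the assumed distribution is not rejected.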


    Test of Independence

    This test considers applications involving two or more samples or two categorical variables. Our interest is to evaluate whether these two categorical variables are related (dependent/associated) or unrelated (independent/not associated). The following example illustrates the use of the $\chi^2$ test of independence.


    Example.

    The following data were collected in a multi-site study of medical effectiveness in type II diabetes. Three sites were involved in the study: a health maintenance organization (HMO), a university teaching hospital (UTH), and an independent practice association (IPA). Type II patients were enrolled in the study from each site and monitored over a three-year period. The data below show the treatment regimens of patients by site.


    Treatment regimens

    Site    Diet & exercise   Oral hypoglycemics   Insulin   Total
    HMO     294               827                  579       1700
    UTH     132               288                  352       772
    IPA     189               516                  404       1109
    Total   615               1631                 1335      3581


    • The table above is a 3 × 3 cross-tabulation table, or contingency table.
    • Both site and treatment regimen are categorical variables.
    • Site is called the row variable and treatment regimen is called the column variable.
    • The number of rows in the table is denoted R and the number of columns in the table is denoted C.
    • In this table, R = 3 and C = 3.
    • The row and column totals are shown on the right and bottom of the table, respectively.


    • The 9 combinations of site and treatment regimen are called the cells of the table.
    • E.g. patients in the HMO treated by diet and exercise make up one cell of the table, patients in the HMO treated by oral hypoglycemics make up another cell, etc.
    • We wish to use the data to test the hypothesis that the two variables (site and treatment regimen) are independent (i.e. no difference in treatment regimen across sites).
    • The hypotheses are written as follows.


    1. Set up the hypotheses.
       $H_0$: Site and treatment regimen are independent (no relationship between site and treatment regimen)
       $H_1$: $H_0$ is false (site and treatment regimen are related)

    2. Select the significance level ($\alpha = 0.05$).

    3. Select the appropriate test statistic.
       $\chi^2 = \sum \dfrac{(O - E)^2}{E}$


    4. Decision rule.
       To select the appropriate critical value, we first determine df = (R - 1)(C - 1) = (3 - 1)(3 - 1) = (2)(2) = 4.
       From the table, $\chi^2 = 9.49$.
       Reject $H_0$ if $\chi^2_{calc} \ge 9.49$; do not reject $H_0$ if $\chi^2_{calc} < 9.49$.

    5. Compute the test statistic.


    To compute the test statistic

    • Note that the observed values are displayed in the cells.
    • Let us compute the expected values and put them in parentheses in each cell.
    • The expected value for each cell is the product of the row total and the column total for that cell, divided by the total number of patients in the investigation.
    • E.g. the expected frequency for HMO and diet/exercise = 1700 × 615/3581.


    Treatment regimens (observed frequencies, with expected frequencies in parentheses)

    Site    Diet & exercise                  Oral hypoglycemics                Insulin                           Total
    HMO     294 (1700 × 615/3581 = 292.0)    827 (1700 × 1631/3581 = 774.3)    579 (1700 × 1335/3581 = 633.8)    1700
    UTH     132 (772 × 615/3581 = 132.6)     288 (772 × 1631/3581 = 351.6)     352 (772 × 1335/3581 = 287.8)     772
    IPA     189 (1109 × 615/3581 = 190.5)    516 (1109 × 1631/3581 = 505.1)    404 (1109 × 1335/3581 = 413.4)    1109
    Total   615                              1631                              1335                              3581

    Note: the marginal totals of the observed frequencies equal the marginal totals of the expected frequencies.


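    Using the observed and expected frequencies, we compute the test statistic:

    $\chi^2 = \sum \dfrac{(O - E)^2}{E}$

    $= \dfrac{(294 - 292.0)^2}{292.0} + \dfrac{(827 - 774.3)^2}{774.3} + \dfrac{(579 - 633.8)^2}{633.8}$

    $+ \dfrac{(132 - 132.6)^2}{132.6} + \dfrac{(288 - 351.6)^2}{351.6} + \dfrac{(352 - 287.8)^2}{287.8}$

    $+ \dfrac{(189 - 190.5)^2}{190.5} + \dfrac{(516 - 505.1)^2}{505.1} + \dfrac{(404 - 413.4)^2}{413.4}$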


    $\chi^2 = 0.014 + 3.590 + 4.732 + 0.003 + 11.509 + 14.320 + 0.011 + 0.235 + 0.215 = 34.629$

    Conclusion. Reject $H_0$, since $34.629 > 9.49$. We have significant evidence ($\alpha = 0.05$) to show that site and treatment regimen are not independent (i.e. they are related).
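The whole test of independence can be reproduced from the observed counts with a short Python sketch. Two points are assumptions of this note: the UTH/oral hypoglycemics cell is taken as 288 (the value that agrees with the row total of 772 and with the term used in the computation), and the critical value 9.49 is the table value for df = 4. Variable names are made up for illustration.

```python
# Observed counts: rows are sites (HMO, UTH, IPA),
# columns are regimens (diet & exercise, oral hypoglycemics, insulin)
observed_table = [
    [294, 827, 579],
    [132, 288, 352],  # 288 assumed for UTH / oral hypoglycemics (see note above)
    [189, 516, 404],
]

row_totals = [sum(row) for row in observed_table]
col_totals = [sum(col) for col in zip(*observed_table)]
grand_total = sum(row_totals)

# Expected count for each cell = row total * column total / grand total
expected_table = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic summed over all R x C cells
chi2_ind = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed_table, expected_table)
    for o, e in zip(obs_row, exp_row)
)

df_ind = (len(row_totals) - 1) * (len(col_totals) - 1)  # (R - 1)(C - 1)
reject_ind = chi2_ind >= 9.49                           # table value for df = 4

print(f"chi-square = {chi2_ind:.3f}, df = {df_ind}, reject H0: {reject_ind}")
```

Computing the expected counts from the marginal totals in code also makes it easy to confirm that the row and column totals add up to the same grand total.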