Hypothsis testing

71
HYPOTHESIS AND ITS TESTING [email protected] MUHAMMAD SHAFIQ, COMMERCE DEPARTMENT UNIVERSITY OF BALOCHISTAN, QUETTA

Transcript of Hypothsis testing

HYPOTHESIS AND ITS TESTING

[email protected] SHAFIQ,

COMMERCE DEPARTMENTUNIVERSITY OF BALOCHISTAN, QUETTA

Hypothesis

• Unproven proposition

• Supposition that tentatively explains certain facts or phenomena

• Assumption about nature of the world

Hypothesis

• An unproven proposition or supposition that tentatively explains certain facts or phenomena

• Null hypothesis• Alternative hypothesis

Null Hypothesis

• Statement about the status quo

• No difference

Alternative Hypothesis

• Statement that indicates the opposite of the null hypothesis

Introduction: inferential statistics for hypothesis• Empirical testing involves Inferential statistics

• Inference is drawn about the population on basis of observations of sample

Inferential statistical inference types

• Univarate statistical analysis• Test hypotheses having one variable

• Bivariate statistical analysis• Test hypotheses having two variables

• Multivariate statistical analysis• Test hypotheses and models involving multiple (three or more) variables or set

of variables

Common Types of hypotheses tested

• Relational hypotheses• e-g how changes in a variable vary with changes in another (regression analysis)

• Hypotheses about differences between the groups• How a variable varies from one group or other (causal design)

• Hypotheses about differences from some standard• How a variable differs from some preconceived standard (Univarate testing)

Hypothesis testing procedure

• Get the research hypothesis from the research objective

• Sample is obtained and relevant variable is measured.

• Measured value obtained in the sample is to compare to the value stated explicitly or implied in the hypothesis. If value is consistent with hypothesis, the hypothesis is supported or vice versa.

• Example:• H1 The average satisfaction of a plant is more than 34 pieces in an hour. If the

average is less than 34 pieces in an hour than hypothesis not support or otherwise.

Significance Level

• Critical probability in choosing between the null hypothesis and the alternative hypothesis

• Indicates how likely it is that an inference support the difference between an observed value and some statistical expectation

Significance Level

• Critical Probability

• Confidence Level

• Alpha (α)

• Probability Level selected is typically.10, 0 .05 or 0.01

• Too low to warrant support for the null hypothesis

P-value (observed or computed significance level)

• In statistics, the p-value is a function of the observed sample results (a statistic) that is used for testing a statistical hypothesis. Before the test is performed, a threshold value is chosen, called the significance level of the test, traditionally 5% or 1%  and denoted as α.

• Probability value or the observed or computed significance level.

• P-values are compared to significance levels to test hypotheses

• The probability in a p-value is that the statistical expectation (null) for a given test is true.

• Low P-value mean there is little likelihood that the statistical expectation is true

Critical values

• The values that lie exactly on the boundary of the region of rejection

.025.025

Example of hypothesis testing• A restaurant owner is concerned about his business image. He is

investigating whether customers perceive the service friendly. Sample size is 225 customers, using five point scale where 1 indicates the most unfriendly and 5 is the most friendly with the interval scale. Suppose that service has to be different from 3.0.

• H1 Customer perception of friendly services are significantly greater than three. or

We can write it as:

H0 µ= 3.0

H1 µ ≠ 3.0

Formula for solving z score

n

SZZSX or

Example of hypothesis testing…

• Sample size is 225 on the basis of sample, mean was calculated which is 3.78

• If SD of population is known then it will be used in solution but here the sample standard deviation was S= 1.5

• Now enough information to test the hypothesis

• Formula for solving z value:

n

SZZSX or

0.3 : oH

The null hypothesis that the mean is equal to 3.0:

0.3 :1 H

The alternative hypothesis that the mean does not equal to 3.0:

A Sampling Distribution

=3.m0

x

=3m.0 x

=.025a =.02a5

A Sampling Distribution

LOWER LIMIT

UPPERLIMIT

=3.m0

A Sampling Distribution

Critical values of m

Critical value - upper limit

n

SZZSX or

225

5.196.1 0.3

1.096.1 0.3

196. 0.3

196.3

Critical values of m

Critical value - lower limit

n

SZZS

X- or -

225

5.196.1- 0.3

Critical values of m

1.096.1 0.3

196. 0.3

804.2

Critical values of m

Region of Rejection

LOWER LIMIT

UPPERLIMIT

=3.0m

Hypothesis Test =3.0m

2.804 3.196=3.0m 3.78

Type I error

• An error caused by rejecting the null hypothesis when it is true

• Practically, a type I error occurs when the researcher concludes that a relationship or difference exists in the population when in reality it does not

Type II error

• An error caused by falling to reject the null hypothesis when the alternative hypothesis is true

• Has a probability of beta (β)

• Practically, a type II error occurs when a researcher concludes that no relationship or difference exist when in fact one does exist

Accept null Reject null

Null is true

Null is false

Correct-no error

Type Ierror

Type IIerror

Correct-no error

Type I and Type II Errors

Type I and Type II Errorsin Hypothesis Testing

State of Null Hypothesis Decisionin the Population Accept Ho Reject Ho

Ho is true Correct--no error Type I errorHo is false Type II error Correct--no error

Calculating Zobs : another way of testing hypothesis through z test

xs

xz

obs

X

obs S

XZ

Alternate Way of Testing the Hypothesis

X

obs SZ

78.3

1.

0.378.3

1.

78.0 8.7

Alternate Way of Testing the Hypothesis

Choosing the Appropriate Statistical Technique

• Type of question to be answered

• Number of variables• Univarate• Bivariate• Multivariate

• Scale of measurement

Statistical technique:Type of question to be answered

• If a researcher is simply with central tendency of a variable or with the distribution of variable or comparison of different business division sales result with some target level

• One sample t –test• Comparison of two salespersons average monthly sales will require:

• t-test of two means• Comparison of quarterly sales distribution:

•Chi-square test

Number of variables

•Univarate•Bivariate•Multivariate

Scale of measurement

• Nominal• Ordinal frequencies, mode or cross tabulation• Interval• Ratio

PARAMETRICSTATISTICS

NONPARAMETRICSTATISTICS

t-Distribution• Symmetrical, bell-shaped distribution

• Mean of zero and a unit standard deviation

• Shape influenced by degrees of freedom

Degrees of Freedom(df)

• df are equal to sample size (n) minus one• The number of observations minus the number of constraint or

assumptions needed to calculate a statistical term• Abbreviated d.f. • Number of observations• Number of constraints

• For example 4+2+1+X =12 the value is 5: means value of first three can be change but not the 4th one. There are three degree of freedom

• Use of computer software in this regard to calculate

Calculating the confidence interval estimate by using the t-distribution

• Suppose that a researcher is interested to investigate that how long do the fresh post graduate stay in the organization at 95% confidence level. The data from the sample are presented below:

No. of years stay at job: 3 5 1 12 1 2 2 2 5

4 2 3 1 3 4 2 6 7

formula µ = X ± t cl S//n

Steps to calculate

1. Calculate the sample mean (sum all and divide by No. of observation)

2. Since population SD is unknown, estimate it through sample SD S=2.81

3. Estimate the standard error of the mean using the formula Sx =S//n

4. 2.81//18 =.66

5. Determine the t-value form t-table i-e A.3

or

Xlc StX ..

n

StX lc ..limitUpper

n

StX lc ..limitLower

Confidence Interval Estimate Using the t-distribution

= population mean

= sample mean

= critical value of t at a specified confidence

level

= standard error of the mean

= sample standard deviation

= sample size

..lct

X

XSSn

Confidence Interval Estimate Using the t-distribution

xcl stX

17

81.2

89.3

n

S

X

Confidence Interval Estimate Using the t-distribution

Use of t table i-e table A.3

• Go to t table in the appendix i-e A.3. t-table provides information similar to that in the Z table, however it is somewhat different.

The t-table format stresses the chance of error or significance level (α), rather than the 95% chance of including the population mean in the estimate. Our example I a two-tailed test. Since a 95% confidence level is selected so the significance level is 5%. Once this has been determined all we have to do to find the t-value is look under the 0.05 (1.00 -0.95= 0.05). Column for two-tailed test at the row in which degree of freedom equal the appropriate value (n-1). Below 17 degrees of freedom t-value at 95% confidence level is t=2.12

Lower limit = 3.89 -2.12 (2.81 //18) =2.28

Upper limit= 3.89 +2.12 (2.81 //18)= 5.28

Hypothesis Test Using the t-Distribution

Suppose that a production manager believes the average number of defective assemblies each day to be 20. The factory records the number of defective assemblies for each of the 25 days it was opened in a given month. The mean was calculated to be 22, and the standard deviation, ,to be 5.

XS

Univariate Hypothesis Test Utilizing the t-Distribution

20 :

20 :

1

0

H

H

nSS X /25/5

1

Testing a Hypothesis about a Distribution

• Chi-Square test

• Test for significance in the analysis of frequency distributions

• Compare observed frequencies with expected frequencies

• “Goodness of Fit”

The researcher desired a 95 percent confidence, and the significance level becomes .05.The researcher must then find the upper and lower limits of the confidence interval to determine the region of rejection. Thus, the value of t is needed. For 24 degrees of freedom (n-1, 25-1), the t-value is 2.064.

Univariate Hypothesis Test Utilizing the t-Distribution

:limitLower 25/5064.220 .. Xlc St 1064.220

936.17

:limitUpper 25/5064.220 ..

Xlc St 1064.220

064.20

X

obs S

Xt

1

2022

1

2

2

Univariate Hypothesis Test t-Test

Chi square test

• a statistical method assessing the goodness of fit between a set of observed values and those expected theoretically

• The chi-squared test is used to determine whether there is a significant difference between the expected frequencies and the observed frequencies in one or more categories.

i

ii )²( ²

E

EOx

Chi-Square Test

x² = chi-square statisticsOi = observed frequency in the ith cellEi = expected frequency on the ith cell

Chi-Square Test

n

CRE ji

ij

Chi-Square Test Estimation for Expected Number for Each Cell

Chi-Square Test Estimation for Expected Number for Each Cell

Ri = total observed frequency in the ith rowCj = total observed frequency in the jth columnn = sample size

2

222

1

2112

E

EO

E

EOX

Univariate Hypothesis Test Chi-square Example

50

5040

50

5060 222

X

4

Univariate Hypothesis Test Chi-square Example

Hypothesis Test of a Proportion p is the population proportion

p is the sample proportion

p is estimated with p

5. :H

5. :H

1

0

¹p

=p

Hypothesis Test of a Proportion

100

4.06.0pS

100

24.

0024. 04899.

pS

pZobs

04899.

5.6.

04899.

1. 04.2

0115.Sp =

000133.Sp =1200

16.Sp =

1200

)8)(.2(.Sp =

n

pqSp =

20.p =200,1n=

Hypothesis Test of a Proportion: Another Example

0115.Sp =

000133.Sp =1200

16.Sp =

1200

)8)(.2(.Sp =

n

pqSp =

20.p =200,1n=

Hypothesis Test of a Proportion: Another Example

Indeed .001 the beyond t significant is it

level. .05 the at rejected be should hypothesis null the so 1.96, exceeds value Z The

348.4Z0115.05.

Z

0115.15.20.

Z

Sp

Zp

=

=

-=

p-=

Hypothesis Test of a Proportion: Another Example

Allah may shower his blessing upon youBye bye…