1 Outline input analysis goodness of fit randomness independence of factors homogeneity of...


Outline

input analysis
goodness of fit
randomness
independence of factors
homogeneity of data
Model 05-01


Chi-Square Test

arbitrary data grouping
possibly a good fit under one grouping but a bad fit under another


Kolmogorov-Smirnov Test

advantages
  no arbitrary data grouping as in the Chi-square test
  goodness of fit test for continuous distributions
  universal: same criterion for all continuous distributions

disadvantages
  not designed for discrete distributions; the test is distribution dependent in that case
  not designed for unknown parameters; the goodness of fit decision is biased when parameters are estimated


KS Test

$F(x)$: underlying (continuous) distribution
$F_n(x)$: empirical distribution of n data points
$F(x)$ and $F_n(x)$ should be close in some sense
define $D_n = \sup_x |F(x) - F_n(x)|$
if $D_n$ is too large: data not from $F(x)$
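The statistic defined above can be computed directly from the sorted sample. The following is a minimal sketch, not taken from the slides: the names `ks_statistic` and `uniform_cdf` are hypothetical, the data are made up, and Uniform(0, 1) is used only as an example of a hypothesized F.

```python
def ks_statistic(sample, cdf):
    """Return D_n = sup_x |F(x) - F_n(x)| for a continuous CDF F."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # F_n jumps from (i-1)/n to i/n at the i-th order statistic,
        # so the supremum is reached at or just before one of the jumps.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def uniform_cdf(x):
    """CDF of Uniform(0, 1); a hypothetical choice of F for illustration."""
    return min(max(x, 0.0), 1.0)


data = [0.1, 0.4, 0.7]  # made-up illustration data
d_n = ks_statistic(data, uniform_cdf)
print(d_n)
```

The value of `d_n` would then be compared with a critical value for the chosen significance level and sample size; that table lookup is not shown here.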


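Idea of KS Test

$F$: continuous distribution
$F_n$: empirical distribution of $F$ for n data points
let $F(x) = p$; then $n F_n(x) \sim \mathrm{Bin}(n, p)$
$|F_n(x) - F(x)| \sim |Y_p/n - p|$ for $Y_p \sim \mathrm{Bin}(n, p)$
$\sup_x |F_n(x) - F(x)| \sim \sup_p |Y_p/n - p|$
so the distribution of $D_n$ does not depend on $F$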


Test for Randomness

Do the data points behave like random variates from i.i.d. random variables?


Test for Randomness

graphical techniques
run test
run up and run down test


Background

random variables $X_1, X_2, \ldots$ (assumption $X_i$ constant)

if $X_1, X_2, \ldots$ are i.i.d., then for $j \ge 1$
  j-lag covariance: $c_j \equiv \mathrm{Cov}(X_i, X_{i+j}) = 0$
  variance: $c_0 \equiv V(X_i)$
  j-lag correlation: $\rho_j \equiv c_j / c_0 = 0$


Graphical Techniques

estimate the j-lag correlation from the sample
check the appearance of the j-lag correlation

$\hat{c}_j = \frac{\sum_{i=1}^{n-j} (x_i - \bar{x})(x_{i+j} - \bar{x})}{n - j}$

$\hat{s}_n^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

estimate of j-lag correlation: $\hat{\rho}_j = \hat{c}_j / \hat{s}_n^2$
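The sample estimate of the j-lag correlation can be sketched as follows, assuming the usual estimator with divisor n - j for the lag covariance and n - 1 for the sample variance. The function name `lag_correlation` and the data are made up for illustration.

```python
def lag_correlation(xs, j):
    """Estimate the j-lag correlation as c_j / s_n^2 from a sample."""
    n = len(xs)
    mean = sum(xs) / n
    # j-lag sample covariance with divisor n - j
    c_j = sum((xs[i] - mean) * (xs[i + j] - mean) for i in range(n - j)) / (n - j)
    # sample variance with divisor n - 1
    s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return c_j / s2


data = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up trending data, clearly not i.i.d.
rho_1 = lag_correlation(data, 1)
print(rho_1)
```

For the graphical check, one would evaluate `lag_correlation(data, j)` for several lags and plot the values against j; estimates that stay close to zero are consistent with i.i.d. data.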


Run Test

Does the following pattern of A and B appear to be random?

AAAAAAAAAAAAAAABBBBBAAAAA

Any statistical test for the randomness of the pattern?
# of permutations with 20 A’s & 5 B’s = 53130
# of permutations with 5 B’s together = 21
an event of probability 21/53130 = 0.000395
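The counts quoted above can be checked with a few lines of arithmetic. This is only a sketch; the variable names are illustrative.

```python
from math import comb

n_a, n_b = 20, 5

# Distinct arrangements of 20 A's and 5 B's: choose the 5 positions of the B's.
total = comb(n_a + n_b, n_b)

# Arrangements with all 5 B's adjacent: the block of B's can sit in any of
# the n_a + 1 gaps before, between, or after the 20 A's.
together = n_a + 1

prob = together / total
print(total, together, round(prob, 6))
```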


Run Test for Two Types of Items

R: number of runs
  AABBBABB: 4 runs in these 8 items

for $n_a$ items of type A and $n_b$ items of type B
  $E(R) = \frac{2 n_a n_b}{n_a + n_b} + 1$
  $V(R) = \frac{2 n_a n_b (2 n_a n_b - n_a - n_b)}{(n_a + n_b)^2 (n_a + n_b - 1)}$

if $\min(n_a, n_b) > 10$, R is approximately normal
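A minimal sketch of the run test for two types of items, using the mean and variance of R given above. The function name `runs_test` is hypothetical, and the z-score is only meaningful when the normal approximation applies (both counts above 10).

```python
from math import sqrt


def runs_test(seq):
    """Return (R, E(R), V(R), z) for a sequence made of two item types."""
    kinds = sorted(set(seq))
    if len(kinds) != 2:
        raise ValueError("sequence must contain exactly two item types")
    n_a = seq.count(kinds[0])
    n_b = seq.count(kinds[1])
    n = n_a + n_b
    # A new run starts wherever two neighbouring items differ.
    runs = 1 + sum(1 for prev, cur in zip(seq, seq[1:]) if prev != cur)
    mean = 2 * n_a * n_b / n + 1
    var = 2 * n_a * n_b * (2 * n_a * n_b - n) / (n ** 2 * (n - 1))
    z = (runs - mean) / sqrt(var)
    return runs, mean, var, z


runs, mean, var, z = runs_test("AABBBABB")
print(runs, mean, var, z)
```

The example sequence is far too short for the normal approximation; it is used here only because its run count (4) is stated in the text.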


Run Test for Continuous Data

(43.2, 7.4, 5.4, 25.3, 27.3, 13.9, 67.5, 35.4)
signs of successive changes: − − + + − + −
3 runs down & 2 runs up, a total of 5 runs

R: number of runs, for n sample values
  E(R) = (2n-1)/3
  V(R) = (16n-29)/90
  Dist(R) ≈ normal
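The runs up and runs down count for the sample above can be reproduced with the following sketch. The function name `runs_up_down` is hypothetical, and the code assumes no two consecutive values are equal.

```python
from math import sqrt


def runs_up_down(xs):
    """Return (R, E(R), V(R), z) for the runs up and down test."""
    # +1 for an increase, -1 for a decrease (assumes no ties)
    signs = [1 if b > a else -1 for a, b in zip(xs, xs[1:])]
    # A new run starts wherever the sign of the change flips.
    runs = 1 + sum(1 for prev, cur in zip(signs, signs[1:]) if prev != cur)
    n = len(xs)
    mean = (2 * n - 1) / 3
    var = (16 * n - 29) / 90
    z = (runs - mean) / sqrt(var)
    return runs, mean, var, z


data = [43.2, 7.4, 5.4, 25.3, 27.3, 13.9, 67.5, 35.4]
runs, mean, var, z = runs_up_down(data)
print(runs, mean, var, z)
```

For these eight values the observed number of runs equals its expected value, so this sample gives no evidence against randomness; with n this small the normal approximation is rough at best.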


Test for Independence

It is easier to simulate a system if the classifications are independent.

Are the classifications of the random quantities independent?


Test for Independence

for two classifications
e.g., Is voting behavior independent of income level?
simulation is easier if voting opinion and income level are independent

2 × 3 Contingency Table

Tax reform   Income level
             Low    Medium   High   Total
For          182    213      203     598
Against      154    138      110     402
Total        336    351      313    1000


Test for Independence

independent income level and opinion
  generate income level: 3 types (i.e., m types)
  generate opinion: 2 types (i.e., n types)
  generate an entity: 5 types (m + n types)

dependent income level and opinion
  generate income level: 3 types (i.e., m types)
  generate opinion: 2 types (i.e., n types)
  generate an entity: 6 types (mn types)

for k factors (classifications)
  independent: m1 + m2 + … + mk
  dependent: m1 m2 … mk


Test for Independence

H0: voting opinion and income levels are independent

H1: voting opinion and income levels are dependent


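Test for Independence

marginal distribution:

$P(L) = \frac{336}{1000}, \quad P(M) = \frac{351}{1000}, \quad P(H) = \frac{313}{1000},$
$P(F) = \frac{598}{1000}, \quad P(A) = \frac{402}{1000}.$

If H0 is true,

$P(L \cap F) = P(L)\,P(F) = \frac{336}{1000} \cdot \frac{598}{1000},$
$P(L \cap A) = P(L)\,P(A) = \frac{336}{1000} \cdot \frac{402}{1000}.$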


Test for Independence

expected frequency: cell probability multiplied by the total number of observations

in general, the expected frequency of any cell is:

$\text{expected frequency} = \frac{\text{column total} \times \text{row total}}{\text{grand total}}$

e.g., $\frac{336}{1000} \cdot \frac{598}{1000} \cdot 1000 = 200.9$


Test for Independence

Observed and Expected Frequencies

Tax reform   Income level
             Low           Medium        High          Total
For          182 (200.9)   213 (209.9)   203 (187.2)    598
Against      154 (135.1)   138 (141.1)   110 (125.8)    402
Total        336           351           313           1000

d.o.f. associated with the chi-squared test is

$v = (r - 1)(c - 1)$, where r = number of rows and c = number of columns


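Test for Independence

Calculate, for the r × c contingency table,

$\chi^2 = \sum_i \frac{(o_i - e_i)^2}{e_i}$

reject H0 if $\chi^2 > \chi^2_\alpha$ with $v = (r - 1)(c - 1)$; otherwise accept H0

$\chi^2 = \frac{(182 - 200.9)^2}{200.9} + \frac{(213 - 209.9)^2}{209.9} + \cdots + \frac{(110 - 125.8)^2}{125.8} = 7.85$

$\chi^2_{0.05} = 5.991$ for $v = 2$

since 7.85 > 5.991, reject H0: voting opinion and income levels are dependent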
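The whole calculation for the tax reform table can be sketched in a few lines. The function name `chi_square_table` is hypothetical. Because this code keeps the expected frequencies unrounded, it gives a statistic of about 7.88 rather than the 7.85 obtained from expected values rounded to one decimal; the decision is the same.

```python
def chi_square_table(observed):
    """Return (chi2, dof, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand = sum(row_totals)
    chi2 = 0.0
    expected = []
    for r_total, row in zip(row_totals, observed):
        # expected frequency = row total * column total / grand total
        exp_row = [r_total * c_total / grand for c_total in col_totals]
        expected.append(exp_row)
        chi2 += sum((o - e) ** 2 / e for o, e in zip(row, exp_row))
    dof = (len(observed) - 1) * (len(col_totals) - 1)
    return chi2, dof, expected


observed = [
    [182, 213, 203],  # For
    [154, 138, 110],  # Against
]
chi2, dof, expected = chi_square_table(observed)
reject = chi2 > 5.991  # critical value quoted in the text for alpha = 0.05, v = 2
print(round(chi2, 2), dof, reject)
```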


Test for Homogeneity

Are the entities of the same type?


Test for Homogeneity

3 × 3 Contingency Table

Abortion law   Political affiliation
               Democrat   Republican   Independent   Total
For             82         70           62            214
Against         93         62           67            222
Undecided       25         18           21             64
Total          200        150          150            500

column totals (200, 150, 150) are predetermined


Test for Homogeneity

row (or column) totals are predetermined

H0: same proportion in each row (or column)

H1: different proportions across rows (or columns)

analysis: same as the test of independence


Test for Homogeneity

H0: Democrats, Republicans, and Independents give the same opinions (same proportion of each option)

H1: Democrats, Republicans, and Independents give different opinions (different proportions of options)

α = 0.05
critical region: χ2 > 9.488 with v = 4 d.o.f.
computations: find the expected cell frequencies


Test for Homogeneity

Observed and Expected Frequencies

Abortion law   Political affiliation
               Democrat     Republican   Independent   Total
For             82 (85.6)    70 (64.2)    62 (64.2)     214
Against         93 (88.8)    62 (66.6)    67 (66.6)     222
Undecided       25 (25.6)    18 (19.2)    21 (19.2)      64
Total          200          150          150            500

$\chi^2 = \frac{(82 - 85.6)^2}{85.6} + \frac{(70 - 64.2)^2}{64.2} + \cdots + \frac{(21 - 19.2)^2}{19.2} = 1.53$

Decision: Do not reject H0.
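The homogeneity calculation can also be read as "apply the pooled proportion of each opinion to every predetermined group size". The sketch below follows that reading; the variable names are illustrative, and the arithmetic is the same as in the test of independence.

```python
observed = {
    "For": [82, 70, 62],
    "Against": [93, 62, 67],
    "Undecided": [25, 18, 21],
}
group_sizes = [200, 150, 150]  # Democrat, Republican, Independent (fixed in advance)
total = sum(group_sizes)

chi2 = 0.0
for opinion, counts in observed.items():
    # Under H0 every group shares this pooled proportion for the opinion.
    pooled = sum(counts) / total
    for o, size in zip(counts, group_sizes):
        e = pooled * size  # expected count in this cell
        chi2 += (o - e) ** 2 / e

dof = (len(observed) - 1) * (len(group_sizes) - 1)
reject = chi2 > 9.488  # critical value quoted in the text for alpha = 0.05, v = 4
print(round(chi2, 2), dof, reject)
```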


Model 5-1: An Automotive Maintenance and Repair Shop

additional maintenance and repair facility in the suburban area

customer orders (calls) by appointment, from one to three days in advance
call arrivals ~ Poisson process, mean 25 calls/day
distribution of calls: 55% for the next day; 30% for the day after tomorrow; 15% for two days after tomorrow
response when the desired day is unavailable: 90% choose the following day; 10% leave


An Automotive Repair and Maintenance Shop

service
  Book Time (i.e., estimated service time) ~ 44 + 90*BETA(2, 3) min
  Book Time also used for costing
  promised wait time to customers: wait time = Book Time + one hour allowance
  actual service time ~ GAMM(book time/1.05, 1.05) min
  first priority to wait customers

customer behavior
  20% wait, 80% pick up cars later
  about 60% to 70% of customers arrive on time
  30% to 40% arrive within 3 hours of appointment time
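The input distributions of Model 5-1 can be sampled outside the simulation package to get a feel for them. The following is a rough sketch using Python's standard `random` module, not the model itself. It assumes that GAMM(b, a) takes the scale first and the shape second (so the mean actual service time equals the book time), that BETA(2, 3) has mean 0.4, and that one simulated day is one time unit; all function names are hypothetical.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable


def calls_in_one_day(rate_per_day=25.0):
    """Count Poisson-process arrivals in one day via exponential gaps."""
    count = 0
    t = rng.expovariate(rate_per_day)
    while t < 1.0:
        count += 1
        t += rng.expovariate(rate_per_day)
    return count


def appointment_offset():
    """Days ahead requested: 1 (55%), 2 (30%), or 3 (15%)."""
    return rng.choices([1, 2, 3], weights=[55, 30, 15])[0]


def book_time_minutes():
    """Book Time ~ 44 + 90 * BETA(2, 3), in minutes."""
    return 44 + 90 * rng.betavariate(2, 3)


def actual_service_minutes(book_time):
    """Actual time ~ gamma with shape 1.05 and scale book_time / 1.05 (assumed order)."""
    return rng.gammavariate(1.05, book_time / 1.05)


book = book_time_minutes()
actual = actual_service_minutes(book)
print(calls_in_one_day(), appointment_offset(), round(book, 1), round(actual, 1))
```

Under these assumptions the mean Book Time is 44 + 90 × 0.4 = 80 minutes, and the actual service time scatters around the book time of each job.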


Costs and Revenues

schedule rules
  at most five wait customers per day
  no more than 24 book hours scheduled per day (three bays, eight hours each)

normal cost: $45/hour/bay, 40 hours per week
overtime cost: $120/hour/bay, at most 3 hours
revenue from customers: $78/book hour

penalty cost
  each incomplete on-going car at the end of a day: $35
  no penalty for a car whose service has not yet started
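One way to combine the cost and revenue figures above into a daily profit is sketched below. This is an assumption about how the pieces fit together, not the model's actual logic: it treats the normal cost as paid for all three bays for eight hours regardless of workload, charges overtime per bay-hour, and credits revenue on the book hours of the day. The function name `daily_profit` and the example inputs are made up.

```python
def daily_profit(book_hours, overtime_bay_hours, incomplete_cars,
                 bays=3, regular_hours=8):
    """Daily profit in dollars under the assumptions stated in the lead-in."""
    revenue = 78 * book_hours                # $78 per book hour
    normal_cost = 45 * bays * regular_hours  # $45 per hour per bay
    overtime_cost = 120 * overtime_bay_hours # $120 per hour per bay
    penalty = 35 * incomplete_cars           # $35 per car left in progress
    return revenue - normal_cost - overtime_cost - penalty


# Hypothetical day: fully booked, 2 bay-hours of overtime, 1 car unfinished.
profit = daily_profit(book_hours=24, overtime_bay_hours=2, incomplete_cars=1)
print(profit)
```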


System Performance

simulate the system for 20 days to get
  average daily profit
  average daily book time
  average daily actual service time
  average daily overtime
  average daily number of wait appointments not completed on time


Relationship Between Models

Model 5-1: An Automotive Maintenance and Repair Shop
  a fairly complicated model
  non-queueing type

Model 5-2: Enhancing the Automotive Shop Model
  two types of repair bays for different types of cars
  customers not on time