09AugStatMS


  • 8/12/2019 09AugStatMS


    The written Masters Examination

    Option Statistics and Probability FALL 2009

    Full points may be obtained for correct answers to 8 questions. Each numbered question (which may have several parts) is worth the same number of points. All answers will be graded, but the score for the examination will be the sum of the scores of your best 8 solutions.

    Use separate answer sheets for each question. DO NOT PUT YOUR NAME ON YOUR ANSWER SHEETS. When you have finished, insert all your answer sheets into the envelope provided, then seal and print your name on it.

    Any student whose answers need clarification may be required to submit to an oral examination.


    MS Exam, Option Statistics and Probability, FALL 2009

    1. (Stat 401)

    Let U and V be two independent, standard normal random variables.

    (a) Find $P(U/V > 1)$.

    (b) Find the distribution of $U/V$. (Hint: you may start with the joint distribution of $U/V$ and $V$.)

    2. (Stat 411)

    Let $X_1, \ldots, X_n$ be independent, identically distributed Poisson random variables with parameter $\theta$, where $\theta > 0$.

    (i) Show that $Y = \sum_{i=1}^{n} X_i$ is complete sufficient.

    (ii) Use the Rao-Blackwell theorem to derive the MVUE (Minimum Variance Unbiased Estimator) of $P(X_1 = 0)$.

    3. (Stat 411)

    Let and be two independent random samples from normal distributions and respectively.

    [1] Find the likelihood ratio for testing against .

    [2] Rewrite as a function of which has a well-known distribution.


    4. (Stat 416)

    Consider the following data, which give the miles per gallon of a make of car before and after application of a newly developed gasoline additive.

    Car      1     2     3     4     5     6     7
    Before  17.2  21.6  19.5  19.1  22.0  18.7  20.3
    After   18.3  20.8  20.9  21.2  22.7  18.6  21.9

    State the basic model assumptions under which the Wilcoxon Signed Rank Test can be used. Set up the null hypothesis and the alternative, determine the value of the test statistic, and explain how you would determine the p-value. When would you use the Sign Test instead?
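As a rough illustration of how the test statistic and an exact p-value could be obtained for the data in this question, the sketch below ranks the absolute paired differences (After minus Before) and enumerates all sign assignments. The one-sided alternative (the additive increases mileage) and the helper names `signed_rank_sums` and `exact_upper_p_value` are assumptions made for this example; they are not part of the exam.

```python
from itertools import combinations

# Miles per gallon for the 7 cars, copied from the table in the question.
before = [17.2, 21.6, 19.5, 19.1, 22.0, 18.7, 20.3]
after = [18.3, 20.8, 20.9, 21.2, 22.7, 18.6, 21.9]


def signed_rank_sums(before, after):
    """Return (W_plus, W_minus); assumes no zero differences and no tied |differences|."""
    diffs = [a - b for a, b in zip(after, before)]
    # Indices of the pairs ordered by increasing absolute difference.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    w_plus = w_minus = 0
    for rank, i in enumerate(order, start=1):
        if diffs[i] > 0:
            w_plus += rank
        else:
            w_minus += rank
    return w_plus, w_minus


def exact_upper_p_value(w_plus, n):
    """P(W+ >= w_plus) under H0, where each rank 1..n is positive with probability 1/2."""
    ranks = range(1, n + 1)
    favourable = sum(
        1
        for size in range(n + 1)
        for subset in combinations(ranks, size)
        if sum(subset) >= w_plus
    )
    return favourable / 2 ** n


w_plus, w_minus = signed_rank_sums(before, after)
p_value = exact_upper_p_value(w_plus, len(before))
print(w_plus, w_minus, p_value)
```

With only 7 pairs, enumerating the 128 sign patterns is cheap; for larger samples one would normally fall back on tables or the normal approximation.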

    5. (Stat 431)

    1- How would you draw a simple random sample of size 100 from a finite population of size 20,000?

    2- A simple random sample of size 10 was drawn from a finite population of size 80. These 10 units were surveyed and the related data on a single survey variable Y were collected. However, later it was observed that in recording the data in a computer file one of the 10 units was missing in the file. What will be the HT estimator of the population total based on the remaining 9 data points? Indicate if you need to make any assumptions.

    3- Based on the following survey sampling plan, how would you estimate the population mean and its standard error if, upon implementation of the sampling design, the sample {3} is selected?

    Samples in the support of the design:  {1, 2, 3}  {3, 4, 5}  {2, 4, 5}  {3}  {1, 4, 5}
    Prob. distribution over the support:      3/9        2/9        2/9     1/9     1/9

    6. (Stat 461)

    Suppose that coin 1 has probability 0.7 of coming up heads, and coin 2 has probability 0.6 of coming up heads. If the coin flipped today comes up heads, then we select coin 1 to flip tomorrow, and if it comes up tails, then we select coin 2 to flip tomorrow. If the coin initially flipped is equally likely to be coin 1 or coin 2, then what is the probability that the coin flipped on the third day after the initial flip is coin 2?
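One way to set this up is as a two-state Markov chain on the coin being flipped. The sketch below assumes the initial flip is day 0, so "the third day after the initial flip" corresponds to three transitions; the names `P`, `step` and `dist` are illustrative only.

```python
from fractions import Fraction as F

# States: index 0 = coin 1, index 1 = coin 2.
# Heads today -> coin 1 tomorrow, tails today -> coin 2 tomorrow.
P = [
    [F(7, 10), F(3, 10)],  # flipping coin 1: heads 0.7, tails 0.3
    [F(6, 10), F(4, 10)],  # flipping coin 2: heads 0.6, tails 0.4
]


def step(dist, matrix):
    """One transition: return the row vector dist * matrix."""
    return [
        sum(dist[i] * matrix[i][j] for i in range(len(dist)))
        for j in range(len(matrix[0]))
    ]


# Assumption: day 0 is the initial flip, coin 1 or coin 2 with probability 1/2 each.
dist = [F(1, 2), F(1, 2)]
for _ in range(3):
    dist = step(dist, P)

prob_coin2_day3 = dist[1]
print(prob_coin2_day3, float(prob_coin2_day3))
```

Exact fractions are used so the result can be compared with a hand calculation without rounding noise.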


    7. (Stat 471)

    Consider a matrix $A$ of order $m \times n$ and let $b$ be a column $m$-vector. Prove that exactly one of the following holds. Either: $Ax = b$ has a solution $x \ge 0$, or $A^T y \ge 0$, $b^T y < 0$ has a solution $y$, but not both.

    8. (Stat 471)

    What are complementary slackness conditions for two linear programming problems that are dual to each other? Use the complementary slackness conditions to show that for a transportation problem with 3 warehouses and 4 cities and with cost matrix

    $$C = \begin{pmatrix} 5 & 7 & 9 & 6 \\ 6 & 7 & 10 & 5 \\ 7 & 6 & 8 & 1 \end{pmatrix},$$

    and with supplies 120, 140, 100 from warehouses 1, 2, and 3 respectively and with demands 100, 60, 80, and 120 at cities 1, 2, 3 and 4 respectively, the shipping schedule:

    Ship from warehouse 1 to city 1 100 units

    Ship from warehouse 1 to city 3 20 units

    Ship from warehouse 2 to city 2 60 units

    Ship from warehouse 2 to city 3 60 units

    Ship from warehouse 2 to city 4 20 units

    Ship from warehouse 3 to city 4 100 units

    is optimal.

    9. (Stat 473)

    Given the bimatrix game $(A, B)$ of order $m \times n$ with Nash equilibrium strategies $p = (p_1, p_2, \ldots, p_m)$, $q = (q_1, q_2, \ldots, q_n)$, show that they are optimal strategies for the two players in case $A = -B$.


    10. (Stat 481)

    Four corn varieties were tested for their production in an experiment with 4 blocks and the following yield data $y_{ij}$ were obtained:

                   Block
    Variety    1      2      3      4
    A         9.3    9.4    9.6   10.0
    B         9.4    9.3    9.8    9.9
    C         9.2    9.4    9.5    9.7
    D         9.7    9.6   10.0   10.2

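    (1). Assume that there is no interaction effect between the corn variety and the blocks. Write down the appropriate model and necessary restrictions to analyze the data.

    (2). Denote by $\bar{y}_{i\cdot}$ and $\bar{y}_{\cdot j}$ the row mean and column mean respectively. We have

    $$4\sum_{i=1}^{4} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 = 0.3, \qquad 4\sum_{j=1}^{4} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2 = 0.7, \qquad \sum_{i=1}^{4}\sum_{j=1}^{4} (y_{ij} - \bar{y}_{\cdot\cdot})^2 = 1.11.$$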

    Construct the ANOVA Table:

    Source S.S. DF MS F

    (3). Formulate the hypotheses and draw your conclusions for both effects based on the above ANOVA table, given significance level 0.05.

    [Given: F(0.05,3,9)=3.86, F(0.05,3,12)=3.49]

    11. (Stat 481)

    To analyze a dataset with 10 observations of $(x_i, y_i)$ and $\sum x_i = 18$, $\sum y_i = 20$, $\sum x_i^2 = 40$, $\sum y_i^2 = 50$, $\sum x_i y_i = 30$, one may consider the simple linear regression model

    $$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i; \qquad \varepsilon \sim N(0, \sigma^2 I_n).$$

    (a) Find the least-squares estimate of the simple linear regression model.

    (b) Perform a test of $H_0: \beta_1 = 0$ against $H_a: \beta_1 < 0$. Use $\alpha = 0.05$.

    (c) Estimate $E(y \mid x = 2)$ and construct a 95% confidence interval for $E(y \mid x = 2)$.

    (d) Predict the value of $y$ at $x = 2$, and construct a 95% confidence interval for $y$. Hint: you need to consider the effect of $\varepsilon$ on prediction of $y$.

    [Given: $z(0.05) = 1.645$, $z(0.025) = 1.96$, $t(0.05; 8) = 1.86$, $t(0.025; 8) = 2.306$]


    Statistics 401&481 MS Exam

    Fall Semester 2009

    1. Let $U$ and $V$ be two independent, standard normal random variables.

    (a) Find $P(U/V > 1)$.

    (b) Find the distribution of $U/V$. (Hint: you may start with the joint distribution of $U/V$ and $V$.)

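    Solution:

    (a)

    $$P(U/V > 1) = P(U > V > 0) + P(U < V < 0) = \int_0^{\infty}\!\int_0^{u} \phi(u)\phi(v)\,dv\,du + \int_{-\infty}^{0}\!\int_u^{0} \phi(u)\phi(v)\,dv\,du$$

    $$= \int_0^{\infty} \phi(u)\,(\Phi(u) - 1/2)\,du + \int_{-\infty}^{0} \phi(u)\,(1/2 - \Phi(u))\,du$$

    $$= \int_{1/2}^{1} (t - 1/2)\,dt + \int_0^{1/2} (1/2 - t)\,dt = 1/4,$$

    where $t = \Phi(u)$, and $\phi(\cdot)$ and $\Phi(\cdot)$ are the pdf and cdf of the standard normal distribution, respectively.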

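    (b) Let $X = U/V$ and $Y = V$; then $U = XY$ and $V = Y$. Therefore, the joint pdf of $(X, Y)$ is

    $$f_{X,Y}(x, y) = f_{U,V}(u, v)\left|\det\begin{pmatrix} \partial u/\partial x & \partial u/\partial y \\ \partial v/\partial x & \partial v/\partial y \end{pmatrix}\right| = \phi(xy)\,\phi(y)\left|\det\begin{pmatrix} y & x \\ 0 & 1 \end{pmatrix}\right|$$

    $$= \frac{1}{\sqrt{2\pi}}\exp\left\{-\frac{x^2 y^2}{2}\right\}\frac{1}{\sqrt{2\pi}}\exp\left\{-\frac{y^2}{2}\right\}|y| = \frac{1}{2\pi}\exp\left\{-\frac{(1 + x^2)y^2}{2}\right\}|y|, \qquad x, y \in \mathbb{R},$$

    and the marginal pdf of $X$ is

    $$f_X(x) = \int_{\mathbb{R}} f_{X,Y}(x, y)\,dy = \int_{\mathbb{R}} \frac{1}{2\pi}\exp\left\{-\frac{(1 + x^2)y^2}{2}\right\}|y|\,dy$$

    $$= \frac{1}{\pi(1 + x^2)}\int_0^{\infty} \exp\left\{-\frac{(1 + x^2)y^2}{2}\right\} d\,\frac{(1 + x^2)y^2}{2} = \frac{1}{\pi(1 + x^2)}, \qquad x \in \mathbb{R}.$$

    [Note: This proof shows that the ratio of two independent standard normal random variables follows a standard Cauchy distribution.]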
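The two parts can be cross-checked numerically. The sketch below evaluates the upper tail of the standard Cauchy cdf at 1 and compares it with a seeded Monte Carlo estimate of the probability that the ratio of two independent standard normals exceeds 1. The function names, the sample size and the seed are arbitrary choices for illustration.

```python
import math
import random


def cauchy_upper_tail(x):
    """P(X > x) for a standard Cauchy X, from the cdf 1/2 + arctan(x)/pi."""
    return 0.5 - math.atan(x) / math.pi


def simulate_ratio_tail(threshold, n_draws, seed):
    """Monte Carlo estimate of P(U/V > threshold) for independent N(0,1) U and V."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        # For v != 0, u/v > c is equivalent to (u - c*v)*v > 0; this avoids a division.
        if (u - threshold * v) * v > 0:
            hits += 1
    return hits / n_draws


exact = cauchy_upper_tail(1.0)
estimate = simulate_ratio_tail(1.0, 200_000, seed=0)
print(exact, estimate)
```

The simulated value should agree with 1/4 only up to Monte Carlo error, which is on the order of 0.001 for this sample size.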


    Stat 411, Estimation problem, Fall 2009

    Let $X_1, \ldots, X_n$ be independent, identically distributed Poisson random variables with parameter $\theta$, where $\theta > 0$.

    (i) Show that $Y = \sum_{i=1}^{n} X_i$ is complete sufficient.

    (ii) Use the Rao-Blackwell theorem to derive the MVUE (Minimum Variance Unbiased Estimator) of $P(X_1 = 0)$.

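    Solution.

    (i) The p.d.f. of a Poisson is

    $$f(x; \theta) = \begin{cases} \exp\{x \ln\theta - \ln x! - \theta\}, & x = 0, 1, \ldots \\ 0, & \text{otherwise.} \end{cases}$$

    Clearly this belongs to the regular exponential class of families. Hence $Y$ is complete sufficient.

    (ii) Let

    $$U = \begin{cases} 1, & \text{if } X_1 = 0 \\ 0, & \text{otherwise.} \end{cases}$$

    Clearly $E(U) = P(X_1 = 0) = e^{-\theta}$ for all $\theta$; hence $U$ is unbiased for $P(X_1 = 0) = e^{-\theta}$. For $y = 0, 1, \ldots$

    $$E(U \mid y) = P(X_1 = 0 \mid Y = y) = \frac{P(X_1 = 0)\, P\left(\sum_{i=2}^{n} X_i = y\right)}{P\left(\sum_{i=1}^{n} X_i = y\right)} = \frac{e^{-\theta}\, e^{-(n-1)\theta}\,((n-1)\theta)^y / y!}{e^{-n\theta}\,(n\theta)^y / y!} = \left(\frac{n-1}{n}\right)^{y}.$$

    Hence by the Rao-Blackwell theorem

    $$\left(\frac{n-1}{n}\right)^{\sum_{i=1}^{n} X_i}$$

    is MVUE for $P(X_1 = 0)$.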
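A quick numerical sanity check of unbiasedness: since the sum of the observations is Poisson with mean n times the parameter, the expectation of the estimator can be computed by summing the series directly and compared with the target probability. The function name `expected_estimator`, the truncation at 200 terms and the parameter values tried are assumptions of this sketch.

```python
import math


def expected_estimator(theta, n, terms=200):
    """E[((n-1)/n)^Y] for Y ~ Poisson(n*theta), by summing the series term by term."""
    mean = n * theta
    total = 0.0
    pmf = math.exp(-mean)  # P(Y = 0)
    for y in range(terms):
        total += ((n - 1) / n) ** y * pmf
        pmf *= mean / (y + 1)  # recurrence P(Y = y+1) = P(Y = y) * mean / (y+1)
    return total


# Compare with the target P(X_1 = 0) = exp(-theta) for a few arbitrary settings.
checks = [(0.7, 5), (2.0, 10), (0.1, 2)]
errors = [abs(expected_estimator(t, n) - math.exp(-t)) for t, n in checks]
print(errors)
```

The truncation is harmless here because the Poisson tail beyond 200 is negligible for the means used; larger means would need more terms.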


    Stat 411 (chapter 8, 9), Fall 2009:

    Let and be two independent random samples from normal distributions and respectively.

    [1] Find the likelihood ratio for testing against .

    [2] Rewrite as a function of which has a well-known distribution. What is it?

    Solution:

    [1] Under , write . The likelihood function attains its maximum at , where , and .

    Under , the likelihood function attains its maximum at . Therefore, the likelihood ratio statistic

    [2] Write . Then and follows a normal distribution with mean and variance .


    Statistics 431 - MS Exam
    Fall Semester 2009

    1- How would you draw a simple random sample of size 100 from a finite population of size 20,000?

    Answer: Write an algorithm which allows you to draw one unit at a time without replacement until you have drawn 100 units. See Chapter 1 in the text book for the course (see below).
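A minimal sketch of such a draw-by-draw algorithm, assuming the population units are labelled 1 to 20,000 and that Python's pseudo-random generator is an acceptable source of randomness. The function name `draw_srs` and the seed are hypothetical.

```python
import random


def draw_srs(population_size, sample_size, seed=None):
    """Draw units one at a time, without replacement, from labels 1..population_size."""
    rng = random.Random(seed)
    remaining = list(range(1, population_size + 1))
    sample = []
    for _ in range(sample_size):
        # Each unit still in the frame is equally likely to be drawn next.
        k = rng.randrange(len(remaining))
        # Move the chosen unit to the end and pop it so it cannot be drawn again.
        remaining[k], remaining[-1] = remaining[-1], remaining[k]
        sample.append(remaining.pop())
    return sample


sample = draw_srs(20_000, 100, seed=2009)
print(sorted(sample)[:10])
```

In practice the standard-library call `random.sample(range(1, 20001), 100)` would give a draw of the same kind; the explicit loop is shown only to mirror the one-unit-at-a-time description.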

    2- A simple random sample of size 10 was drawn from a finite population of size 80. These 10 units were surveyed and the related data on a single survey variable Y were collected. However, later it was observed that in recording the data in a computer file one of the 10 units was missing in the file. What will be the HT estimator of the population total based on the remaining 9 data points? Indicate if you need to make any assumptions.

    Answer: If it is reasonable to assume that the missing data could have been from any of the 10 units in the sample with equal probability, then the remaining 9 units in the sample form a simple random sample of size 9 from the frame of 80. Now consult Chapter 4 of the text book under SRS (80, 9) for the remaining parts of the question.

    3- Based on the following survey sampling plan, how would you estimate the population mean and its standard error if, upon implementation of the sampling design, the sample {3} is selected?

    Samples in the support of the design:  {1, 2, 3}  {3, 4, 5}  {2, 4, 5}  {3}  {1, 4, 5}
    Prob. distribution over the support:      3/9        2/9        2/9     1/9     1/9

    Answer: We note that the frame has 5 units in it, and the selected probability sample is {3}; thus we need to compute the first-order inclusion probability for unit 3 under the given sampling plan. Since unit 3 is in three samples, {1, 2, 3}, {3, 4, 5} and {3}, the probability of unit 3 being selected under this sampling plan is 3/9 + 2/9 + 1/9 = 6/9. Therefore, the HT estimator of the population mean is [(observation made on unit 3) × 9/6]/5. As for the standard deviation of this HT estimator, we observe that the second-order inclusion probabilities for all 10 pairs of units are positive for this sampling plan, and therefore we can have an unbiased estimator of the variance of the given HT estimator even though we have only a single observation at hand. Now you can use expression (3.15) in the text and utilize the first part of the expression to carry out the estimation, and then take the positive square root of it for the standard deviation.
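The inclusion probabilities used in this answer can be tabulated mechanically from the design. The sketch below does so with exact fractions and evaluates the HT estimator of the mean for a made-up observation on unit 3. The value 10 and the helper names are purely illustrative, and the variance estimator of expression (3.15) is not reproduced here.

```python
from fractions import Fraction as F
from itertools import combinations

# Sampling design from the question: (sample, selection probability).
design = [
    ({1, 2, 3}, F(3, 9)),
    ({3, 4, 5}, F(2, 9)),
    ({2, 4, 5}, F(2, 9)),
    ({3}, F(1, 9)),
    ({1, 4, 5}, F(1, 9)),
]
units = [1, 2, 3, 4, 5]


def inclusion_probability(subset):
    """Total probability of the samples that contain every unit in `subset`."""
    return sum(p for sample, p in design if set(subset) <= sample)


first_order = {i: inclusion_probability({i}) for i in units}
second_order = {pair: inclusion_probability(pair) for pair in combinations(units, 2)}


def ht_mean(observed):
    """HT estimator of the population mean from {unit: observed value}."""
    return sum(y / first_order[i] for i, y in observed.items()) / len(units)


# Hypothetical observation y_3 = 10 on the selected sample {3}.
estimate = ht_mean({3: 10})
print(first_order[3], estimate, min(second_order.values()))
```

Listing the second-order probabilities this way also makes it easy to confirm that none of the 10 pairs has probability zero.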

    Text Book:

    Hedayat, A.S., Sinha, B.K. (1991). Design and Inference in Finite Population Sampling. Wiley, New York.


    Masters exam questions:

    VII Stat 471: Linear and Non Linear Programming:

    Consider a matrix $A$ of order $m \times n$ and let $b$ be a column $m$-vector. Prove that exactly one of the following holds. Either: $Ax = b$ has a solution $x \ge 0$, or $A^T y \ge 0$, $b^T y < 0$ has a solution $y$, but not both.

    Solution: If the LP problem $\max\, 0 \cdot x$ subject to $Ax = b$, $x \ge 0$ has an optimal solution, then that solution must first be feasible, and in case $A^T y \ge 0$ for some $y$, then for the feasible $x \ge 0$ we will have $(x, A^T y) \ge 0 \Rightarrow (x, A^T y) = (y, Ax) = (y, b) \ge 0$, and thus $b^T y < 0$ is not possible.

    In case the set $\{x : Ax = b,\ x \ge 0\}$ is empty, then since $\min b^T y$ subject to $A^T y \ge 0$ has a feasible solution $y = 0$, it cannot have an optimal solution, for otherwise it will contradict the duality theorem by asserting an optimal solution to its dual $\max\, 0 \cdot x$ subject to $Ax = b$, $x \ge 0$, which by assumption has not even a feasible solution. The statement that the minimum problem has a feasible solution but no optimal solution implies that its objective is unbounded below, so there is some $y$ with $A^T y \ge 0$ for which $b^T y < 0$, where 0 corresponds to the objective function's value at the feasible solution point $y = 0$.

    VIII. Stat 471: Linear and Non Linear Programming:

    What are complementary slackness conditions for two linear programming problems that are dual to each other? Use the complementary slackness conditions to show that for a transportation problem with 3 warehouses and 4 cities and with cost matrix

    $$C = \begin{pmatrix} 5 & 7 & 9 & 6 \\ 6 & 7 & 10 & 5 \\ 7 & 6 & 8 & 1 \end{pmatrix}$$

    and with supplies 120, 140, 100 from warehouses 1, 2, and 3 respectively and with demands 100, 60, 80, and 120 at cities 1, 2, 3 and 4 respectively, the shipping schedule:

    Ship from warehouse 1 to city 1 100 units

    Ship from warehouse 1 to city 3 20 units

    Ship from warehouse 2 to city 2 60 units


    Ship from warehouse 2 to city 3 60 units

    Ship from warehouse 2 to city 4 20 units

    Ship from warehouse 3 to city 4 100 units

    is optimal.

    Solution: Given the linear programming problem, say $\min (c, x)$ subject to $Ax = b$, $x \ge 0$, the dual is $\max (b, y)$ subject to $A^T y \le c$, where $y$ is unrestricted. The complementary slackness theorem says that at optimal solutions $x^*$, $y^*$ to the two problems, when they exist, we will have $[(Ax^*)_i - b_i]\, y^*_i = 0$ for each $i$, and similarly $(A^T y^* - c)_j\, x^*_j = 0$ for each $j$. Further, for feasible solutions $x$, $y$, these complementary conditions, when satisfied, imply that they are also optimal.

    Thus if we can check the feasibility of our solution and of a dual solution constructed assuming slackness, we can check optimality.

    Our transportation problem has dual $\max \sum_j d_j v_j + \sum_i s_i u_i$ subject to $u_i + v_j \le c_{ij}$ for all $i, j$. First let us pretend our shipping schedule is optimal and write down the slackness conditions this forces on the routes in use: $u_1 + v_1 = c_{11} = 5$, $u_1 + v_3 = 9$, $u_2 + v_2 = 7$, $u_2 + v_3 = 10$, $u_2 + v_4 = 5$, $u_3 + v_4 = 1$. Starting arbitrarily with $u_1 = 0$ gives $v_1 = 5$, $v_3 = 9$, $u_2 = 1$, $v_2 = 6$, $v_4 = 4$, $u_3 = -3$. These give a feasible solution to the dual, $u_i + v_j \le c_{ij}$. Thus, since the given shipping schedule is feasible, it is optimal by the sufficiency part of complementary slackness for the two feasible solutions.
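The checks in this argument are easy to mechanize. The sketch below verifies primal feasibility of the schedule, dual feasibility of the multipliers obtained from the slackness equations with the first warehouse multiplier set to 0 (warehouse values 0, 1, -3 and city values 5, 6, 9, 4), tightness on the routes in use, and equality of the two objective values. The variable names and the zero-based indexing are choices made for this example.

```python
# Cost matrix, supplies and demands from the problem statement.
C = [
    [5, 7, 9, 6],
    [6, 7, 10, 5],
    [7, 6, 8, 1],
]
supply = [120, 140, 100]
demand = [100, 60, 80, 120]

# Proposed schedule, keyed by zero-based (warehouse, city).
ship = {(0, 0): 100, (0, 2): 20, (1, 1): 60, (1, 2): 60, (1, 3): 20, (2, 3): 100}

# Dual values from the slackness equations, starting from u[0] = 0.
u = [0, 1, -3]
v = [5, 6, 9, 4]

rows, cols = range(len(supply)), range(len(demand))

primal_feasible = (
    all(sum(q for (i, j), q in ship.items() if i == r) == supply[r] for r in rows)
    and all(sum(q for (i, j), q in ship.items() if j == c) == demand[c] for c in cols)
    and all(q >= 0 for q in ship.values())
)
dual_feasible = all(u[i] + v[j] <= C[i][j] for i in rows for j in cols)
# Complementary slackness: the dual constraint is tight wherever something is shipped.
slackness_holds = all(u[i] + v[j] == C[i][j] for (i, j) in ship)

primal_cost = sum(C[i][j] * q for (i, j), q in ship.items())
dual_value = sum(s * ui for s, ui in zip(supply, u)) + sum(d * vj for d, vj in zip(demand, v))
print(primal_feasible, dual_feasible, slackness_holds, primal_cost, dual_value)
```

Equal primal and dual objective values for a feasible pair give a second confirmation of optimality, by weak duality.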

    IX Stat 473 Game Theory

    Given the bimatrix game $(A, B)$ of order $m \times n$ with Nash equilibrium strategies $p = (p_1, p_2, \ldots, p_m)$, $q = (q_1, q_2, \ldots, q_n)$, show that they are optimal strategies for the two players in case $A = -B$.

    Solution: Since the mixed strategies $p$, $q$ constitute a Nash equilibrium, we have $(p, Aq) \ge (x, Aq)$ for every mixed strategy $x$. Thus $v = (p, Aq) \ge (Aq)_i$ for all coordinates $i$. Similarly, the equilibrium condition $(p, Bq) \ge (p, By)$ for every mixed strategy $y$ says $(p, Bq) = -(p, Aq) = -v \ge (p, Be_j) = -(p, Ae_j) \Rightarrow (A^T p)_j \ge v$ for all $j$. Thus $p$, $q$ are optimal for the zero-sum game $A$ with value $v$.


    STAT 481 - Fall 2009 (Jing Wang)

    Four corn varieties were tested for their production in an experiment with 4 blocks and the following yield data $y_{ij}$ were obtained:

                   Block
    Variety    1      2      3      4
    A         9.3    9.4    9.6   10.0
    B         9.4    9.3    9.8    9.9
    C         9.2    9.4    9.5    9.7
    D         9.7    9.6   10.0   10.2

    (1). Assume that there is no interaction effect between the corn variety and the blocks. Write down the appropriate model and necessary restrictions to analyze the data.

    Model for the data is

    $$Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}, \qquad i = 1, \ldots, 4, \quad j = 1, \ldots, 4,$$

    where $\mu$ is the overall mean, $\tau_i$ is the treatment (variety) effect such that $\sum_{i=1}^{4} \tau_i = 0$, and $\beta_j$ is the block effect such that $\sum_{j=1}^{4} \beta_j = 0$. The errors $\varepsilon_{ij}$ are assumed to be i.i.d. and follow a normal distribution with a constant variance, i.e. $\varepsilon_{ij} \sim N(0, \sigma^2)$.

    (2). Denote the row mean by $\bar{y}_{i\cdot}$ and the column mean by $\bar{y}_{\cdot j}$ respectively. We have

    $$4\sum_{i=1}^{4} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 = 0.3, \qquad 4\sum_{j=1}^{4} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2 = 0.7, \qquad \sum_{i=1}^{4}\sum_{j=1}^{4} (y_{ij} - \bar{y}_{\cdot\cdot})^2 = 1.11.$$

    Construct the ANOVA Table:

    Source      Sum Square   D.F.   MS      F
    Treatment   0.3          3      0.1     8.182
    Block       0.7          3      0.233   19.091
    Error       0.11         9      0.012
    Total       1.11         15
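The entries of the table follow from the three sums of squares quoted in part (2). As a sketch, the function below fills in the error line by subtraction and forms the mean squares and F ratios. It takes the quoted sums of squares as given rather than recomputing them from the yield data, and the name `rcbd_anova` is an assumption of this example.

```python
def rcbd_anova(ss_treatment, ss_block, ss_total, n_treatments, n_blocks):
    """ANOVA quantities for a randomized complete block design without interaction."""
    df_treatment = n_treatments - 1
    df_block = n_blocks - 1
    df_error = df_treatment * df_block
    ss_error = ss_total - ss_treatment - ss_block
    ms_treatment = ss_treatment / df_treatment
    ms_block = ss_block / df_block
    ms_error = ss_error / df_error
    return {
        "df": (df_treatment, df_block, df_error),
        "ss_error": ss_error,
        "ms": (ms_treatment, ms_block, ms_error),
        "f_treatment": ms_treatment / ms_error,
        "f_block": ms_block / ms_error,
    }


table = rcbd_anova(0.3, 0.7, 1.11, n_treatments=4, n_blocks=4)
critical = 3.86  # F(0.05, 3, 9), as given in the exam
print(round(table["f_treatment"], 3), round(table["f_block"], 3))
print(table["f_treatment"] > critical, table["f_block"] > critical)
```

Both F ratios are compared with the same critical value because the treatment and block effects each have 3 numerator degrees of freedom and share the 9 error degrees of freedom.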

    (3). Formulate the hypotheses and draw your conclusions for both effects based on the above ANOVA table, given significance level 0.05.

    Hypothesis for treatment effects:

    $$H_0: \tau_1 = \tau_2 = \tau_3 = \tau_4 = 0 \quad \text{vs.} \quad H_1: \text{not all } \tau_i = 0.$$

    As in the ANOVA table, $F_{trt} = 8.182 > F(0.05, 3, 9) = 3.86$, which leads to the conclusion that there is a significant treatment (variety) effect.

    Hypothesis for block effects:

    $$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0 \quad \text{vs.} \quad H_1: \text{not all } \beta_j = 0.$$

    As in the ANOVA table, $F_{block} = 19.091 > F(0.05, 3, 9) = 3.86$, which suggests that the block effect is also significant.


    11. To analyze a dataset with 10 observations of $(x_i, y_i)$ and $\sum x_i = 18$, $\sum y_i = 20$, $\sum x_i^2 = 40$, $\sum y_i^2 = 50$, $\sum x_i y_i = 30$, one may consider the simple linear regression model

    $$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i; \qquad \varepsilon \sim N(0, \sigma^2 I_n).$$

    (a) Find the least-squares estimate of the simple linear regression model.

    (b) Perform a test of $H_0: \beta_1 = 0$ against $H_a: \beta_1 < 0$. Use $\alpha = 0.05$.

    (c) Estimate $E(y \mid x = 2)$ and construct a 95% confidence interval for $E(y \mid x = 2)$.

    (d) Predict the value of $y$ at $x = 2$, and construct a 95% confidence interval for $y$. Hint: you need to consider the effect of $\varepsilon$ on $y$.

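    Solution:

    (a) With $\bar{x} = 1.8$ and $\bar{y} = 2$,

    $$\hat\beta_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2} = \frac{30 - 18 \times 20/10}{40 - 18^2/10} = -0.789,$$

    and $\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x} = 2 - 1.8 \times (-0.789) = 3.420$.

    (b) Under $H_0$, $t_1 = \hat\beta_1 / s(\hat\beta_1) \sim t(n - 2)$, where $s(\hat\beta_1) = s / \sqrt{\sum (x_i - \bar{x})^2}$ and $s^2$ is the residual mean square,

    $$s^2 = \frac{\sum (y_i - \hat{y}_i)^2}{n - 2} = \frac{S_{yy} - S_{xy}^2 / S_{xx}}{n - 2} = \frac{10 - 36/7.6}{8} = 0.658,$$

    with $S_{xx} = 40 - 18^2/10 = 7.6$, $S_{yy} = 50 - 20^2/10 = 10$ and $S_{xy} = 30 - 18 \times 20/10 = -6$. Hence $s(\hat\beta_1) = \sqrt{0.658/7.6} = 0.294$.

    The rejection region is $\{t_1 < -t(0.05; 8)\} = \{t_1 < -1.86\}$ (this is a one-sided test!), and the observed $t_1 = -0.789/0.294 = -2.68$. So we have strong evidence to reject $H_0$.

    (c) The point estimate is $\hat{E}(y \mid x = 2) = 3.420 - 0.789 \times 2 = 1.842$, and

    $$s^2(\hat{E}(y \mid x = 2)) = s^2\left[\frac{1}{n} + \frac{(2 - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right] = 0.658 \times \left(0.1 + \frac{0.04}{7.6}\right) = 0.0693.$$

    The 95% C.I. for $E(y \mid x = 2)$ is $\hat{E}(y \mid x = 2) \pm t(0.025; 8)\, s(\hat{E}(y \mid x = 2)) = 1.842 \pm 2.306 \times 0.263 = (1.235, 2.449)$.

    (d) The point estimate is $\hat{y} = 3.420 - 0.789 \times 2 = 1.842$, and

    $$s^2(\hat{y}) = s^2\left[1 + \frac{1}{n} + \frac{(2 - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right] = 0.658 \times 1.1053 = 0.727.$$

    The 95% C.I. (or prediction interval) for $y$ is $\hat{y} \pm t(0.025; 8)\, s(\hat{y}) = 1.842 \pm 2.306 \times 0.853 = (-0.124, 3.808)$.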
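All four parts depend only on the summary sums given in the question, so the arithmetic can be reproduced in a few lines. The sketch below estimates the error variance by the residual mean square SSE/(n - 2), which is the estimator that goes with the t distribution on 8 degrees of freedom, and uses the critical value 2.306 supplied in the exam. The function name `slr_from_sums` and the dictionary keys are assumptions of this example.

```python
import math


def slr_from_sums(n, sum_x, sum_y, sum_xx, sum_yy, sum_xy, x0, t_crit):
    """Simple linear regression summaries computed from raw sums."""
    xbar, ybar = sum_x / n, sum_y / n
    sxx = sum_xx - n * xbar ** 2          # corrected sum of squares of x
    syy = sum_yy - n * ybar ** 2          # corrected sum of squares of y
    sxy = sum_xy - n * xbar * ybar        # corrected cross product
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    s2 = (syy - sxy ** 2 / sxx) / (n - 2)  # residual mean square
    t_slope = b1 / math.sqrt(s2 / sxx)
    fit = b0 + b1 * x0
    leverage = 1 / n + (x0 - xbar) ** 2 / sxx
    half_mean = t_crit * math.sqrt(s2 * leverage)
    half_pred = t_crit * math.sqrt(s2 * (1 + leverage))
    return {
        "b0": b0,
        "b1": b1,
        "s2": s2,
        "t_slope": t_slope,
        "fit": fit,
        "ci_mean": (fit - half_mean, fit + half_mean),
        "pi_new": (fit - half_pred, fit + half_pred),
    }


result = slr_from_sums(10, 18, 20, 40, 50, 30, x0=2, t_crit=2.306)
for key, value in result.items():
    print(key, value)
```

The prediction interval is wider than the confidence interval for the mean response because it adds the variance of a new error term to the variance of the fitted value.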