ES10rex


  • 620-370 Statistics for Mechanical Engineers Revision Exercises page 1

    Revision exercises 1

    Questions 1-1 and 1-2 refer to the following information: In a factory, machines U, V and W produce 60%, 30% and 10% of the total output respectively. Of these outputs, 2%, 3% and 4% are defective.

    R1.1-1 The percentage of defective items produced at the factory is

    A. 9.0%

    B. 3.0%

    C. 2.8%

    D. 2.5%

    E. 2.2%

    R1.1-2 An item chosen at random from the combined output of a day's production is found to be defective. The probability that it was produced by machine W is equal to:

    A. 0.10

    B. 0.16

    C. 0.20

    D. 0.33

    E. 0.40
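    The calculation behind questions 1-1 and 1-2 can be sketched as follows. This is only an illustrative check using the law of total probability and Bayes' theorem; the dictionary names `share` and `defect_rate` are chosen here and do not come from the exercise sheet.

```python
# Share of total output and defect rate per machine (figures from the question).
share = {"U": 0.60, "V": 0.30, "W": 0.10}
defect_rate = {"U": 0.02, "V": 0.03, "W": 0.04}

# Law of total probability: Pr(D) = sum of Pr(machine) * Pr(D | machine).
p_defective = sum(share[m] * defect_rate[m] for m in share)

# Bayes' theorem: Pr(machine | D) = Pr(machine) * Pr(D | machine) / Pr(D).
posterior = {m: share[m] * defect_rate[m] / p_defective for m in share}

print(f"Pr(defective) = {p_defective:.3f}")
for machine, prob in posterior.items():
    print(f"Pr({machine} | defective) = {prob:.2f}")
```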

    R1.1-3 The Markov chain with state space {0, 1, 2} has transition probability matrix P , where

    $P$ =
        0.70 0.20 0.10
        0.20 0.60 0.20
        0.20 0.30 0.50

    $P^2$ =
        0.55 0.29 0.16
        0.30 0.46 0.24
        0.30 0.37 0.33

    $P^4$ =
        0.44 0.35 0.21
        0.37 0.39 0.24
        0.37 0.38 0.25

    $\Pr(X_4 = 2 \mid X_2 = 2)$ is equal to:

    A. 0.16

    B. 0.21

    C. 0.25

    D. 0.30

    E. 0.33

    R1.1-4 X and Y are independent random variables with E(X) = 20, sd(X) = 4; and E(Y) = 30, sd(Y) = 3. If Z = X + Y + 2, then the standard deviation of Z is

    A. 9

    B. $\sqrt{29}$

    C. 7

    D. $\sqrt{27}$

    E. 5

    R1.1-5 Suppose that T is normally distributed, i.e. $T \overset{d}{=} N(200, 20^2)$. Because the upper tail of the distribution of T is (······) than exponential, the hazard function of T is (······). The missing words are

    A. (longer), (increasing);

    B. (longer), (decreasing);

    C. (shorter), (increasing);

    D. (shorter), (decreasing);

    E. (about the same), (approximately constant).


    R1.1-6 If $Z \overset{d}{=} N(0, 1)$ then $\Pr(Z > 1 \mid Z > 2)$ is equal to

    A. 1;

    B. $\frac{0.8413}{0.9772}$;

    C. $\frac{0.0228}{0.1587}$;

    D. $\frac{0.0228}{0.8413}$;

    E. $\frac{1}{2}$.

    R1.1-7 A probability interval, a confidence interval and a prediction interval are statements about:

    A. a parameter, a statistic, and an observation, respectively;

    B. a statistic, an observation, and a parameter, respectively;

    C. a statistic, a parameter, and an observation, respectively;

    D. a parameter, an observation, and a statistic, respectively;

    E. an observation, a statistic, and a parameter, respectively.

    R1.1-8 A 95% confidence interval for a mean of a random variable with known variance, based on the normal distribution, is found to be (15.1, 15.9). Without any further calculations, the P-value for the test of $H_0\colon \mu = 15$ versus $H_1\colon \mu \ne 15$ is such that

    A. $P > 0.05$;

    B. $0.01 \le P < 0.05$;

    C. $0.001 \le P < 0.01$;

    D. $P < 0.001$;

    E. cannot be determined without knowing the sample size.

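    R1.1-9 The graph of the relative likelihood function, $\mathrm{RLL}(\theta)$, is shown in the diagram below.

    [Diagram: graph of the relative likelihood function]

    An approximate 95% confidence interval for $\theta$ is given by:

    A. $4 < \theta < 8$;

    B. $4 < \theta < 12$;

    C. $3 < \theta < 9$;

    D. $3 < \theta < 15$;

    E. $1 < \theta < 19$.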


    R1.1-10 In a $4 \times 4$ Latin square experiment with one replicate, the following sums of squares for the analysis of variance were obtained:

                      SS
        rows         300
        columns      600
        treatments   120
        error         60
        total       1110

    From the analysis of variance the F-ratio for treatments is such that

    A. F = 4 and so the treatments are not significant;

    B. F = 4 and so the treatments are significant;

    C. F = 2 and so the treatments are not significant;

    D. F = 2 and so the treatments are significant;

    E. F = 1 and so the treatments are not significant.

    R1.2 (a) If $\Pr(F) = 0.2$, $\Pr(G) = g$ and $\Pr(F \cup G) = 0.8$, find $g$:

    i. if F and G are mutually exclusive;

    ii. if F and G are independent.

    iii. Show that $\Pr(F \mid G) \le 0.25$.

    (b) Items from a production line are checked for major flaws before being used. 90% pass the test. The other 10% are recycled. Of twenty items tested, find the probability that at most two are recycled.

    (c) In a production process, each item has to pass through three stages. At each stage, the probability of successful completion of the stage is 0.8. Of those that fail, half repeat the stage and half are returned to stage 1. All items failing stage 1 must repeat stage 1. Consider the four-state Markov chain describing an item's progress through the production process, with states 1 = stage 1, 2 = stage 2, 3 = stage 3 and 4 = complete.

    (a) Write down the transition probability matrix for this Markov chain.

    (b) Suppose that each procedure at each stage takes one hour. Explain how you could obtain, using matrix multiplication, the probability that an item is completed within five hours of processing time.
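    The matrix-multiplication idea in part (c) can be sketched as below. The matrix `P` is one reading of the verbal description (an assumption, not a model answer), and the helper names `mat_mul` and `mat_pow` are introduced here for illustration. Because state 4 is absorbing, the (stage 1, complete) entry of the five-step matrix gives the probability of completion within five hours.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(p, steps):
    """Return p raised to a non-negative integer power by repeated multiplication."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        result = mat_mul(result, p)
    return result


# Assumed one-step matrix; rows and columns are ordered stage 1, 2, 3, complete.
P = [
    [0.2, 0.8, 0.0, 0.0],  # stage 1: every failure repeats stage 1
    [0.1, 0.1, 0.8, 0.0],  # stage 2: failures split between stage 1 and a repeat
    [0.1, 0.0, 0.1, 0.8],  # stage 3: failures split between stage 1 and a repeat
    [0.0, 0.0, 0.0, 1.0],  # complete is absorbing
]

P5 = mat_pow(P, 5)
p_done_within_5 = P5[0][3]
print(f"Pr(complete within 5 hours) = {p_done_within_5:.4f}")
```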

    R1.3 (a) Let $Y = \sum_{i=1}^{160} X_i$, where the $X_i$ are independent and identically distributed random variables, each having probability mass function: p(0) = 0.2, p(1) = 0.6, p(2) = 0.2. Find $\Pr(Y \le 150)$.

    (b) Let T denote a positive continuous random variable with hazard density function $h(t) = 0.03t$, $t > 0$. Find $\Pr(T > 4)$.

    (c) If E(Z) = and var(Z) = c2, find approximate expressions for E(Z) and var(

    Z). For

    what values of c will the approximation be best?

    (d) Suppose that $X \overset{d}{=} N(46, 4^2)$.

    i. Find the probability that X exceeds the threshold, t = 50.

    ii. Suppose that the threshold is random: $T \overset{d}{=} N(50, 3^2)$, and that T and X are independent. Find the probability that X exceeds the threshold T.
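    Part (d) rests on the fact that a difference of independent normal variables is itself normal, with the variances adding. A minimal numerical sketch, assuming Python's standard `statistics.NormalDist` in place of the Statistical Tables:

```python
from math import sqrt
from statistics import NormalDist

# (d) i: fixed threshold t = 50 with X ~ N(46, 4^2).
X = NormalDist(mu=46, sigma=4)
p_fixed = 1 - X.cdf(50)

# (d) ii: random threshold T ~ N(50, 3^2), independent of X.
# D = X - T is normal with mean 46 - 50 and variance 4^2 + 3^2.
D = NormalDist(mu=46 - 50, sigma=sqrt(4**2 + 3**2))
p_random = 1 - D.cdf(0)

print(f"Pr(X > 50) = {p_fixed:.4f}")
print(f"Pr(X > T)  = {p_random:.4f}")
```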


    R1.4 (a) Evaluate the sample mean and sample standard deviation for the following data set:

        23.2, 37.4, 29.9, 12.4, 17.0, 43.5, 31.5, 19.6, 22.2, 34.7.

    (b) A random sample of 49 observations is obtained from a normal population, $N(320, 40^2)$. Specify values you might expect to observe for min, Q1, med, Q3 and max. Hence sketch a typical boxplot for such a sample.

    (c) The following is a sample of n = 19 observations on $X \overset{d}{=} N(\mu, \sigma^2)$.

        84 37 33 24 58 75 55 46 65 59
        18 30 48 38 70 68 41 52 50

    The graph below is the normal QQ plot for this sample.

    [Diagram: normal QQ plot with one point indicated]

    Specify the coordinates of the indicated point, explaining how they are obtained. Use the diagram to obtain estimates of $\mu$ and $\sigma$.

    R1.5 Twelve independent observations are obtained on $X \overset{d}{=} N(\mu, 5^2)$. To test $H_0\colon \mu = 40$ vs $H_1\colon \mu \ne 40$, the decision rule is to reject $H_0$ if $|\bar{X} - 40| > 3$.

    i. Find a 95% probability interval for $\bar{X}$ if $\mu = 40$.

    ii. Find the size of the specified test.

    iii. Find the P-value if $\bar{x} = 44.0$.

    iv. Find the power of the specified test if $\mu = 45$.

    v. Find a 95% confidence interval for $\mu$ if $\bar{x} = 44.0$.

    vi. Find a 95% prediction interval for $X$ if $\bar{x} = 44.0$.
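    The size and power asked for in parts ii and iv are both rejection probabilities, evaluated at different values of the true mean. The sketch below assumes the decision rule rejects when the sample mean is more than 3 away from 40; the function name `reject_prob` is introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist

n, sigma, mu0, cutoff = 12, 5.0, 40.0, 3.0
se = sigma / sqrt(n)  # standard deviation of the sample mean


def reject_prob(mu):
    """Pr(|sample mean - mu0| > cutoff) when the true mean is mu."""
    xbar = NormalDist(mu=mu, sigma=se)
    return xbar.cdf(mu0 - cutoff) + (1 - xbar.cdf(mu0 + cutoff))


size = reject_prob(40.0)   # rejection probability under H0
power = reject_prob(45.0)  # rejection probability when the mean is 45

print(f"size  = {size:.4f}")
print(f"power = {power:.4f}")
```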


    R1.6 (a) A random sample of ten observations is obtained on an exponentially distributed random variable with unknown mean. The sample has mean $\bar{x} = 5.7$ and standard deviation $s = 7.2$. Give an estimate of the unknown mean and its standard error.

    (b) The log-likelihood for a set of data is given by $\ln L(\theta) = -20\theta + 100 \ln \theta$. Find the maximum likelihood estimate of $\theta$ and its standard error.

    (c) Each day for thirty days, a random sample of 12 items from the day's production is selected and carefully measured: the average of the 12 measurements ($\bar{x}$) and the range of the 12 measurements ($R$) are calculated and recorded each day. At the end of the thirty days, the average of the daily averages is $\bar{\bar{x}} = 36.29$, and the average of the daily ranges is $\bar{R} = 7.32$. Assume that the measurements are approximately normal with mean $\mu$ and variance $\sigma^2$.

    i. Explain why $\bar{R} \approx 3.258\sigma$; and hence derive an estimate of $\sigma$.

    ii. Determine control limits for an $\bar{x}$-chart.

    iii. Instead of $R$, it is suggested that the difference between the second largest and the second smallest observation be used. If this difference is denoted by $Q$, find $c$ such that $\bar{Q} \approx c\sigma$. What advantage/disadvantage might an estimator based on $Q$ have over the estimator based on $R$?
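    Parts i and ii of (c) can be checked numerically as below. The constant 3.258 is taken from the question; the use of three-standard-error limits is an assumption made here (the usual Shewhart convention), and the course notes may specify a different multiplier.

```python
from math import sqrt

n = 12              # items sampled per day
xbar_bar = 36.29    # average of the daily averages
r_bar = 7.32        # average of the daily ranges
d2 = 3.258          # expected range / sigma for n = 12 (value given in the question)

# Range-based estimate of the process standard deviation.
sigma_hat = r_bar / d2

# Assumed convention: limits at 3 standard errors either side of the centre line.
half_width = 3 * sigma_hat / sqrt(n)
lcl, ucl = xbar_bar - half_width, xbar_bar + half_width

print(f"sigma_hat = {sigma_hat:.3f}")
print(f"control limits: ({lcl:.2f}, {ucl:.2f})")
```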

    R1.7 (a) Sketch a graph, roughly indicating location and spread, of each of the following:

    i. the pdf of $t_{10}$;

    ii. the pdf of $\chi^2_{10}$;

    iii. the pdf of $F_{10,10}$.

    (b) A random sample of nine observations is obtained on $X_1 \overset{d}{=} N(\mu_1, \sigma_1^2)$, and it is found that $\bar{x}_1 = 38.9$, $s_1 = 7.41$.

    An independent random sample of nine observations is obtained on $X_2 \overset{d}{=} N(\mu_2, \sigma_2^2)$, for which $\bar{x}_2 = 31.5$ and $s_2 = 5.67$.

    i. Find a 95% confidence interval for $\sigma_1/\sigma_2$; and verify that there is no significant evidence against the null hypothesis $H_0\colon \sigma_1 = \sigma_2$.

    ii. Specify the pooled standard deviation estimate based on both samples.

    iii. Assuming $\sigma = \sigma_1 = \sigma_2$, obtain a 95% confidence interval for $\sigma$.

    iv. Using the pooled variance estimate, obtain a 95% confidence interval for $\mu_1 - \mu_2$.

    R1.8 (a) Determinations of the strength of a fibre after using three treatments were as follows:

                      number   mean   variance
        control          6      76       20
        treatment A      6      82       24
        treatment B      6      84       31

    i. Show that the error mean square is 25; and hence, or otherwise, complete the following analysis of variance table:

                      df   SS       MS     F
        treatments    **   ******   ****   ***
        error         **   ******   ****
        total         **   575.00

    ii. Test the hypothesis $H_0\colon \mu_1 = \mu_2 = \mu_3$ giving an approximate P-value. What do you conclude?


    (b) A study was made to determine if humidity conditions have an effect on the force required to pull apart pieces of glued plastic. Two types of plastic were tested using three levels of humidity. The results are given in the table below. There are two observations on each factor combination; the number in brackets is the average of these two observations.

                humidity
        type    30%                  60%                  90%
        A       41.2, 40.6 (40.9)    38.6, 37.8 (38.2)    35.5, 33.3 (34.4)
        B       39.0, 40.8 (39.9)    34.6, 37.4 (36.0)    23.2, 26.4 (24.8)

    The following analysis of variance was obtained for these data:

                       df   SS      MS      F
        type            1    54.6    54.6    24.1
        humidity        2   245.0   122.5   354.5
        interaction     2    43.4    21.7     9.6
        error           6    13.6    2.26
        total          11   356.6

    Give a brief analysis and interpretation of these results.

    R1.9 The following represent the results of a one-replicate $2^3$ experiment.

        y      P  Q  R
        17.2   0  0  0
        12.3   0  0  1
        16.6   0  1  0
        12.8   0  1  1
        20.4   1  0  0
        15.8   1  0  1
        17.3   1  1  0
        14.8   1  1  1

              av. y
        P0    14.7
        P1    17.1

        Q0    16.4
        Q1    15.4

        R0    17.9
        R1    13.9

    The following split-up of the sum of squares for these data was obtained using MATLAB, with X1 = P, X2 = Q and X3 = R:

        Source      Sum Sq.   d.f.   Mean Sq.   F     Prob>F
        -------------------------------------------------------
        X1          11.1      1      11.1       Inf   NaN
        X2           2.2      1       2.2       Inf   NaN
        X3          31.2      1      31.2       Inf   NaN
        X1*X2        2        1       2         Inf   NaN
        X1*X3        0.3      1       0.3       Inf   NaN
        X2*X3        1.3      1       1.3       Inf   NaN
        X1*X2*X3     0.1      1       0.1       Inf   NaN
        Error       -0        0      -0
        Total       48.2      7

    (a) Why are the p-values undefined?

    (b) Generate a half-normal plot for the root mean-squares and use this to identify the possibly important effects.
    Note: $\Phi^{-1}\left(\frac{1+q}{2}\right) = 0.16, 0.32, 0.49, 0.67, 0.89, 1.15, 1.53$, for $q = \frac{1}{8}, \frac{2}{8}, \ldots, \frac{7}{8}$.

    (c) Produce a revised analysis of variance and determine the significant effects.

    (d) Obtain an estimate and a 95% confidence interval for the effect of R.

    (e) Give an interpretation of your results.
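    The quantities behind parts (b) to (d) can be reproduced from the raw data as sketched below. The coding (level 1 as +1, level 0 as -1) and the helper name `contrast` are choices made here; the printed pairs are one possible set of coordinates for a half-normal plot, pairing the ordered root mean-squares with the quantiles quoted in the note.

```python
from itertools import product
from math import sqrt
from statistics import NormalDist

# Responses in the order of the table: (P, Q, R) = 000, 001, 010, ..., 111.
y = [17.2, 12.3, 16.6, 12.8, 20.4, 15.8, 17.3, 14.8]
runs = list(product([0, 1], repeat=3))


def contrast(term):
    """Sum of +/- y, with sign the product over the factors in `term` (+1 at level 1)."""
    total = 0.0
    for levels, obs in zip(runs, y):
        sign = 1
        for factor in term:
            sign *= 1 if levels["PQR".index(factor)] == 1 else -1
        total += sign * obs
    return total


terms = ["P", "Q", "R", "PQ", "PR", "QR", "PQR"]
effect = {t: contrast(t) / 4 for t in terms}   # difference of two means of 4 runs
ss = {t: contrast(t) ** 2 / 8 for t in terms}  # single-df sum of squares

# Half-normal plot coordinates: ordered root mean-squares against half-normal quantiles.
ranked = sorted(terms, key=lambda t: ss[t])
quantiles = [NormalDist().inv_cdf((1 + k / 8) / 2) for k in range(1, 8)]
for t, q in zip(ranked, quantiles):
    print(f"{t:4s} effect={effect[t]:6.2f}  SS={ss[t]:6.2f}  "
          f"root MS={sqrt(ss[t]):5.2f}  quantile={q:4.2f}")
```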


    R1.10 (a) The following table gives two observations on y for each of the specified values of x. The average value of y for each x-value and the overall average are also given.

        x            0      1      2      3      4
        y           13     25     27     32     37
                    16     23     29     33     35
        $\bar{y}$   14.5   24.0   28.0   32.5   36.0     overall $\bar{y} = 27.0$

    The following results were obtained using these data:

    Regression Analysis

    The regression equation is y = 16.7 + 5.15 x

                     est     se      t       P
        intercept   16.70   1.155   14.46   0.000
        slope        5.15   0.471   10.93   0.000

                          DF   SS       MS       F        P
        Regression         1   530.45   530.45   119.37   0.000
        Residual Error     8    35.55     4.44
        Total              9   566.00

    S = 2.108 R-Sq = 93.7%

    One-way ANOVA

        Source            DF   SS       MS       F       P
        Between groups     4   555.00   138.75   63.07   0.000
        Within groups      5    11.00     2.20
        Total              9   566.00

    S = 1.483 R-Sq = 98.06%

    Find a 95% confidence interval for $E(Y \mid x = 2)$:

    (a) assuming a straight line regression model;

    (b) making no assumptions about the regression.

    Give an assessment of the goodness of fit of the straight-line regression.

    (b) A random sample of n = 50 observations is obtained on (X, Y), from which the sample correlation is found to be r = 0.5.

    (a) Indicate with a sketch the general nature of the scatter plot for these data.

    (b) Give an approximate 95% confidence interval for the population correlation.
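    For part (a) of R1.10, the straight-line fit and the interval for the mean response at x = 2 can be reproduced from the ten data pairs, as sketched here. The critical value 2.306 is the tabulated 0.975 quantile of t with 8 degrees of freedom; the variable names are illustrative.

```python
from math import sqrt

# Two observations of y at each x, paired as in the table.
x = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
y = [13, 16, 25, 23, 27, 29, 32, 33, 37, 35]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual standard deviation on n - 2 degrees of freedom.
rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
s = sqrt(rss / (n - 2))

# Confidence interval for E(Y | x = 2) under the straight-line model.
x0 = 2
fit = intercept + slope * x0
se_fit = s * sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
t_crit = 2.306  # t quantile (0.975, 8 df), from standard tables
ci = (fit - t_crit * se_fit, fit + t_crit * se_fit)

print(f"y = {intercept:.2f} + {slope:.2f} x, s = {s:.3f}")
print(f"95% CI for E(Y | x = 2): ({ci[0]:.2f}, {ci[1]:.2f})")
```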


    Revision exercises 2

    R2.1 (a) Consider a system of components that works if at least k out of n independent components operate. Denote such a system by [k/n]. Consider systems made up of independent components each of which has reliability 0.9.
    Find the reliability of a [3/4] system.

    (b) Minor flaws occur randomly in a production process. Suppose that each item coming off the production line has an average of 0.4 flaws.
    What is the expected proportion of items free of flaws?

    (c) The hazard density function (hdf) of a continuous random variable T is given by

        $h(t) = \frac{f(t)}{1 - F(t)}$

    where f and F denote the pdf and cdf of T.
    If T has pdf $f(t) = 1/(1 + t)^2$ $(t > 0)$, find the hdf of T.

    (d) X is a random variable with mean 20 and standard deviation 2. Let $Y = \ln X$.
    Find an approximate value for the standard deviation of Y.

    (e) A random sample of 11 observations is obtained on $Y \overset{d}{=} N(67, 10^2)$.
    Find a 95% probability interval for the sample standard deviation.

    (f) Sixty independent trials, each with probability p of success, yielded 48 successes.
    Find a 95% confidence interval for p.

    (g) Of 100 independent 95% confidence intervals, let Z denote the number of these confidence intervals that contain the true parameter value.
    Specify the distribution of Z.

    (h) If $W = |Z|$, where $Z \overset{d}{=} N(0, 1)$, show that $c_q(W) = \Phi^{-1}\left(\frac{1+q}{2}\right)$, where $\Phi$ denotes the standard normal cdf. Hence find the median of W.

    (i) The log-likelihood function for a particular data set is given by $\ln L = -50(\theta - 1)^2$.

    Find $\hat{\theta}$ and $\mathrm{se}(\hat{\theta})$.

    (j) Each day for fifty days, a random sample of 16 items from the day's production is selected and measured: the average of the 16 measurements ($\bar{x}$) and the range of the 16 measurements ($R$) are calculated and recorded each day. At the end of the fifty days, the average of the daily averages is av($\bar{x}$) = 14.50, and the average of the daily ranges is av($R$) = 7.06. Determine control limits for an $\bar{x}$-chart.
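    The [k/n] reliability in part (a) is a binomial tail probability: the system works when at least k of the n independent components work. A short sketch, with the function name `k_out_of_n_reliability` introduced here for illustration:

```python
from math import comb


def k_out_of_n_reliability(k, n, r):
    """Pr(at least k of n independent components work), each with reliability r."""
    return sum(comb(n, j) * r**j * (1 - r) ** (n - j) for j in range(k, n + 1))


rel_3_of_4 = k_out_of_n_reliability(3, 4, 0.9)
print(f"reliability of a [3/4] system = {rel_3_of_4:.4f}")
```

    The same function covers series systems ([n/n]) and parallel systems ([1/n]) as special cases.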

  • 620-370 Statistics for Mechanical Engineers, November 2007 examination Revision Exercises page 9

    R2.2 (a) The events A and B are such that $\Pr(B) = 0.2$, $\Pr(A \mid B) = 0.4$ and $\Pr(A \mid B') = 0.1$.

    i. Find $\Pr(A)$.

    ii. Find $\Pr(B \mid A)$.

    iii. Are A and B positively related? Explain.

    (b) A production process consists of two stages. Items begin in stage 1. As a result of stage 1, three things can happen: the item is scrapped, with probability 0.1; or the item is reworked, i.e. sent through stage 1 again, with probability 0.4; or the item moves along to stage 2. As a result of stage 2, three things can happen: the item is returned to stage 1 with probability 0.2, or it is sent through stage 2 again with probability 0.3, or it is deemed satisfactory and complete. Consider this process as a Markov chain with four states: 0 = scrapped, 1 = stage 1, 2 = stage 2, and 3 = complete.

    i. Write down the transition probability matrix, P .

    Powers of P were computed with the following results:

    $P^4$ =
        1.0000 0.0000 0.0000 0.0000
        0.1834 0.1166 0.1575 0.5425
        0.0434 0.0630 0.0851 0.8085
        0.0000 0.0000 0.0000 1.0000

    $P^{40}$ =
        1.0000 0.0000 0.0000 0.0000
        0.2188 0.0000 0.0000 0.7812
        0.0625 0.0000 0.0000 0.9375
        0.0000 0.0000 0.0000 1.0000

    ii. Specify the probability that an item is still in the system after four cycles.

    iii. Specify the proportion of items scrapped in the production process.

    iv. If an item has reached stage 2, specify the value for its probability of successful completion.
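    The limiting values in the high power of P are absorption probabilities, and they can also be found exactly by first-step analysis. The sketch below assumes the one-step matrix implied by the description in (b) and solves the resulting two linear equations by Cramer's rule; it is an alternative check, not the method the question asks for.

```python
# States: 0 = scrapped, 1 = stage 1, 2 = stage 2, 3 = complete.
# Assumed one-step matrix read off the description in (b).
P = [
    [1.0, 0.0, 0.0, 0.0],
    [0.1, 0.4, 0.5, 0.0],
    [0.0, 0.2, 0.3, 0.5],
    [0.0, 0.0, 0.0, 1.0],
]

# First-step analysis for a_i = Pr(eventually scrapped | start in state i):
#   a1 = P[1][0] + P[1][1]*a1 + P[1][2]*a2
#   a2 = P[2][0] + P[2][1]*a1 + P[2][2]*a2
# Rearranged into a 2x2 linear system M a = b.
m11, m12 = 1 - P[1][1], -P[1][2]
m21, m22 = -P[2][1], 1 - P[2][2]
b1, b2 = P[1][0], P[2][0]

det = m11 * m22 - m12 * m21
a1 = (b1 * m22 - m12 * b2) / det
a2 = (m11 * b2 - m21 * b1) / det

print(f"Pr(scrapped | start in stage 1)  = {a1:.4f}")
print(f"Pr(completed | reached stage 2) = {1 - a2:.4f}")
```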

    R2.3 (a) Consider the discrete random variable X with pmf given by

        x      0     1     2     3
        p(x)   0.4   0.3   0.2   0.1

    Show that E(X) = var(X) = 1.

    (b) Suppose that $X_1$ and $X_2$ are independent random variables each with the pmf given in (a). Find $\Pr(X_1 = X_2)$.

    (c) Suppose that $X_1, X_2, \ldots, X_{100}$ are independent random variables each with the pmf given in (a). Let $S = X_1 + \cdots + X_{50}$ and $T = X_{51} + \cdots + X_{100}$.

    i. Find the mean and standard deviation of $S - T$.

    ii. Use the central limit theorem to obtain an approximate value for $\Pr(S = T)$.
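    A numerical sketch of R2.3 follows. The moments and the probability of a tie in (b) are exact; the value for Pr(S = T) uses a normal approximation with a continuity correction (treating S = T as -0.5 < S - T < 0.5), which is one common way to apply the central limit theorem to an integer-valued difference.

```python
from math import sqrt
from statistics import NormalDist

pmf = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}

# (a) mean and variance straight from the pmf.
mean = sum(x * p for x, p in pmf.items())
var = sum((x - mean) ** 2 * p for x, p in pmf.items())

# (b) two independent copies agree when both take the same value.
p_equal = sum(p * p for p in pmf.values())

# (c) S and T are independent sums of 50 terms each.
mean_diff = 50 * mean - 50 * mean
sd_diff = sqrt(50 * var + 50 * var)

# Normal approximation with continuity correction for the integer-valued S - T.
approx = NormalDist(mu=mean_diff, sigma=sd_diff)
p_tie = approx.cdf(0.5) - approx.cdf(-0.5)

print(f"E(X) = {mean:.2f}, var(X) = {var:.2f}, Pr(X1 = X2) = {p_equal:.2f}")
print(f"sd(S - T) = {sd_diff:.1f}, Pr(S = T) is roughly {p_tie:.4f}")
```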

    R2.4 A random sample of 100 observations is obtained from a Normally distributed random variable $X \overset{d}{=} N(60, 10^2)$.

    (a) Specify approximate values you would expect to obtain for each of the following statistics for this sample:

    i. the sample mean, $\bar{x}$;

    ii. the sample standard deviation, $s$;

    iii. the number of observations less than 50, freq(X < 50);

    iv. the sample upper-quartile, Q3;

    v. the sample maximum, $x_{(100)}$.

    (b) i. Sketch a boxplot that would be not unreasonable for this sample.

    ii. Indicate in a sketch the likely form of a Normal QQ-plot for this sample, showing its important features.

    iii. What would the Normal QQ-plot look like if observations less than 50 were censored?


    R2.5 (a) A random sample of 400 observations is obtained on $T \overset{d}{=} \exp(0.01)$, for which $F(t) = 1 - e^{-0.01t}$ $(t > 0)$; and $f(t) = 0.01e^{-0.01t}$ $(t > 0)$.

    i. Write down the mean and standard deviation of the sample mean, $\bar{T}$. The summary notes may be used.

    ii. Specify an approximate 95% probability interval for $\bar{T}$. Give your answers to one decimal place.

    (b) The following is a random sample from a Normal population:

        3.0, 5.0, 6.0, 7.0, 9.0.

    i. Verify that $\bar{x} = 6.0$ and $s^2 = 5.0$.

    ii. Find a 95% confidence interval for $\mu$.

    iii. Find a 95% confidence interval for $\sigma$.

    iv. Find a 95% prediction interval for $X$.

    Give your answers to two significant figures.

    R2.6 (a) The following sample is obtained on $X \overset{d}{=} N(\mu, \sigma^2)$ to test the hypothesis $\mu = 50$ against $\mu \ne 50$ using a significance level of 0.05.

        54.0, 45.0, 39.9, 41.5, 55.6, 48.8, 36.6, 49.0, 47.4, 45.6, 39.8, 51.3, 34.2, 32.8, 59.3, 36.0.

    For this sample n = 16, $\bar{x} = 44.8$ and $s = 8.0$.

    i. Show that $t = -2.60$, specify the appropriate critical value and hence show that $H_0$ is rejected.

    ii. Specify the p-value for this test.

    (b) Suppose the above sample represents results for the strength of the adhesion for glued tiles. You are required to report to management on the mean strength of the adhesion. In particular, management is concerned that the mean should be no less than 50. Test the hypothesis $\mu = 50$ vs $\mu < 50$. Write a brief statement summarising your conclusions for management.
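    The test statistic in (a) i can be recomputed from the raw data, as sketched below. The critical value 2.131 is the tabulated 0.975 quantile of t with 15 degrees of freedom; the code only checks the two-sided decision and does not compute a p-value.

```python
from math import sqrt
from statistics import mean, stdev

data = [54.0, 45.0, 39.9, 41.5, 55.6, 48.8, 36.6, 49.0,
        47.4, 45.6, 39.8, 51.3, 34.2, 32.8, 59.3, 36.0]
mu0 = 50.0

n = len(data)
x_bar = mean(data)
s = stdev(data)  # sample standard deviation (divisor n - 1)

# One-sample t statistic for H0: mu = mu0.
t_stat = (x_bar - mu0) / (s / sqrt(n))

t_crit = 2.131  # t quantile (0.975, 15 df), from standard tables
reject_two_sided = abs(t_stat) > t_crit

print(f"n = {n}, mean = {x_bar:.1f}, s = {s:.2f}, t = {t_stat:.2f}")
print("reject H0 at the 5% level" if reject_two_sided else "do not reject H0")
```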

    R2.7 (a) Independent random samples are obtained from Normally distributed populations, $X_1 \overset{d}{=} N(\mu_1, \sigma^2)$ and $X_2 \overset{d}{=} N(\mu_2, \sigma^2)$, with the following results:

        $n_1 = 15$; $\bar{x}_1 = 25.0$, $s_1^2 = 60.0$;
        $n_2 = 10$; $\bar{x}_2 = 31.5$, $s_2^2 = 44.7$.

    It is assumed that the population variances are equal, and so the sample variances are pooled to give $s^2 = 54.0$.

    i. Explain how this pooled variance is obtained.

    ii. Find a 95% confidence interval for $\mu_1 - \mu_2$.

    (b) Determinations of a strength measure after using three treatments (P, Q and R) were as follows:

                             observations      mean   var
        [1] treatment P      5, 4, 10, 7        6.5    7.0
        [2] treatment Q      12, 8, 9, 15      11.0   10.0
        [3] treatment R      15, 9, 12, 14     12.5    7.0

    For these data $\sum (y - \bar{y})^2 = 150$.

    i. Show that $s^2 = 8$ and hence, or otherwise, derive the analysis of variance table for the above data.

    ii. Test the hypothesis $H_0\colon \mu_P = \mu_Q = \mu_R$ giving an approximate P-value.

    iii. Find a 95% confidence interval for the effect of treatment R relative to treatment Q, i.e., $\mu_R - \mu_Q$.

    iv. State the conclusions you reach from your analysis.
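    The sums of squares needed for the analysis of variance table in (b) can be computed directly from the twelve observations, as sketched below. The layout and variable names are illustrative; the F ratio is left to be compared with tabulated values.

```python
from statistics import mean

groups = {
    "P": [5, 4, 10, 7],
    "Q": [12, 8, 9, 15],
    "R": [15, 9, 12, 14],
}

all_obs = [obs for values in groups.values() for obs in values]
grand_mean = mean(all_obs)

# Decompose the total sum of squares into between-treatment and within-treatment parts.
ss_total = sum((obs - grand_mean) ** 2 for obs in all_obs)
ss_treat = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
ss_error = sum((obs - mean(v)) ** 2 for v in groups.values() for obs in v)

df_treat = len(groups) - 1
df_error = len(all_obs) - len(groups)

ms_treat = ss_treat / df_treat
ms_error = ss_error / df_error  # pooled variance estimate s^2
f_ratio = ms_treat / ms_error

print(f"treatments: df={df_treat}, SS={ss_treat:.1f}, MS={ms_treat:.2f}, F={f_ratio:.3f}")
print(f"error:      df={df_error}, SS={ss_error:.1f}, MS={ms_error:.2f}")
print(f"total:      df={df_treat + df_error}, SS={ss_total:.1f}")
```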


    R2.8 (a) Twenty-four experimental units are available in blocks of six. It is required to run an experiment to compare three treatments: a control C, treatment A and treatment B. Give an appropriate assignment of treatments to the experimental units. Explain your method.

    (b) The experiment described in (a) is carried out, with results as indicated in the table below:

                C         A         B
        B1      9, 12     23, 19    12, 15
        B2      5, 8      16, 18    11, 14
        B3      10, 14    23, 20    15, 20
        B4      10, 12    23, 18    19, 14

        (av)    10.0      20.0      15.0

    This yielded the following incomplete analysis of variance table:

        source        df   SS    MS
        blocks        **   400   ***
        treatments    **   ***   ***
        error         **   ***   5
        total         **   574

    i. Complete this analysis of variance table.

    ii. Assuming an additive model with independent normally distributed errors having equal variances, test the significance of the treatment effects.

    iii. Give an estimate of AC , and the standard error for your estimate.

    R2.9 The table below gives the results of one replicate of a $2^4$ experiment with factors A, B, C and D. Some relevant computer output is also given.

        run   y    A  B  C  D
         1    4    0  0  0  0
         2    8    0  0  0  1
         3    4    0  0  1  0
         4   16    0  0  1  1
         5    2    0  1  0  0
         6    8    0  1  0  1
         7    6    0  1  1  0
         8   14    0  1  1  1
         9    8    1  0  0  0
        10   14    1  0  0  1
        11   12    1  0  1  0
        12   20    1  0  1  1
        13   10    1  1  0  0
        14   16    1  1  0  1
        15   12    1  1  1  0
        16   22    1  1  1  1

        anova
        source   df   SS
        A         1   169
        B         1     1
        C         1    81
        D         1   225
        AB        1     4
        AC        1     0
        AD        1     0
        BC        1     0
        BD        1     0
        CD        1    16
        ABC       1     1
        ABD       1     1
        ACD       1     1
        BCD       1     1
        ABCD      1     4
        Error     0
        Total    15   504

                 av
        A0       7.75
        A1      14.25
        C0       8.75
        C1      13.25
        D0       7.25
        D1      14.75
        C0D0     6.00
        C0D1    11.50
        C1D0     8.50
        C1D1    18.00

    i. How might a half-normal plot be used to indicate which interactions might be non-zero? Indicate in a rough sketch its appearance in this case.

    ii. Show that A, C, D and CD are significant.

    iii. Show that the estimate of the error standard deviation is $s = \sqrt{13/11}$.

    iv. Give an estimate of, and a 95% confidence interval for, the effect of A. Use $s \approx 1.1$.

    v. Sketch a cell-mean plot to indicate the interaction between C and D. Write a sentence describing the interaction.


    R2.10 The table below gives the corresponding values of variables x and y.

        x    5   6   7   8  10  11  12  13  14  14
        y   20  22  26  24  34  30  38  36  32  38

    For these data, the following statistics were calculated:

        $n = 10$, $\bar{x} = 10$, $\bar{y} = 30$;

        $\sum (x - \bar{x})^2 = 100$, $\sum (x - \bar{x})(y - \bar{y}) = 180$, $\sum (y - \bar{y})^2 = 400$.

    i. Assuming that $E(Y \mid x) = \alpha + \beta x$ and $\mathrm{var}(Y \mid x) = \sigma^2$, obtain estimates of $\alpha$ and $\beta$ using the method of least squares.

    ii. Show that $s^2 = 9.5$ and hence obtain $\mathrm{se}(\hat{\beta})$.

    iii. Plot the observations and your fitted line.

    iv. Find the sample correlation, and give an approximate 95% confidence interval for the population correlation.
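    For part iv, one standard route to an approximate interval is Fisher's z-transformation, sketched below from the summary statistics given in the question. Whether the course intends this particular approximation is an assumption; the variable names are illustrative.

```python
from math import atanh, sqrt, tanh

n = 10
sxx, sxy, syy = 100.0, 180.0, 400.0  # summary statistics given in the question

# Sample correlation from the corrected sums of squares and products.
r = sxy / sqrt(sxx * syy)

# Fisher's z: atanh(r) is approximately normal with standard error 1 / sqrt(n - 3).
z = atanh(r)
se_z = 1 / sqrt(n - 3)
lo = tanh(z - 1.96 * se_z)
hi = tanh(z + 1.96 * se_z)

print(f"r = {r:.2f}, approximate 95% CI for the correlation: ({lo:.2f}, {hi:.2f})")
```

    The interval is asymmetric about r because the transformation is applied on the z scale and then inverted.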

    Revision exercises 3

    R3.1 (a) If Pr(G) = 0.4 and Pr(H) = 0.2, find Pr(G H) if

    i. G and H are mutually exclusive;

    ii. G and H are independent;

    iii. $\Pr(G \mid H) = 0.8$.

    (b) The events A, B and C are independent with probabilities 0.5, 0.6 and 0.8 respectively.

    i. Find the probability that at least one of the events A, B and C occurs.

    ii. Find the probability that at least two of the events A, B and C occur.

    R3.2 Consider the following sampling plan. Test a random sample of 100 items chosen from a lot containing a very large number of items, and accept the lot if at most five defective items are found.

    Use an appropriate Poisson approximation to sketch the graph of the OC curve for this sampling plan, for $0 \le p \le 0.1$, where p denotes the lot defective proportion.
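    Points on the OC curve can be tabulated as sketched below, approximating the number of defectives in the sample by a Poisson variable with mean 100p. The function name `oc` and the grid of p values are choices made here for illustration.

```python
from math import exp, factorial


def oc(p, n=100, c=5):
    """Pr(accept lot) = Pr(at most c defectives), using a Poisson(n * p) approximation."""
    lam = n * p
    return sum(exp(-lam) * lam**k / factorial(k) for k in range(c + 1))


# Tabulate the curve on a grid of lot defective proportions from 0 to 0.1.
curve = [(step / 100, oc(step / 100)) for step in range(0, 11)]
for p, prob in curve:
    print(f"p = {p:.2f}  Pr(accept) = {prob:.4f}")
```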

    R3.3 A production process consists of three stages. Items begin in stage 1, and as a result of each cycle, three things can happen: the item is scrapped, with probability 0.05; or the item is reworked, i.e. sent through the same stage again, with probability 0.25; or the item moves along to the next stage. Consider this process as a Markov chain with five states: 0 = item scrapped, 1 = stage 1, 2 = stage 2, 3 = stage 3 and 4 = complete.

    (a) Write down the transition probability matrix, P.

    (b) Given that

    $P^5$ =
        1.0000 0      0      0      0
        0.1799 0.0010 0.0137 0.0766 0.7289
        0.1279 0      0.0010 0.0137 0.8575
        0.0666 0      0      0.0010 0.9324
        0      0      0      0      1.0000

    i. Specify the probability that an item is still in the system after five cycles.

    ii. Give an approximate value for the proportion of items scrapped in the production process.

    iii. If an item has reached stage 2, give an approximate value for its probability of successful completion.

    R3.4 (a) Minor flaws occur randomly in the production process. Suppose that each sprocket coming off the production line has an average of 1.2 minor flaws. What proportion of items are free of minor flaws?

    (b) Sprockets are checked for major flaws before being used. 90% pass the test. The other 10% are scrapped. Of 18 sprockets tested, what is the probability that at most two are scrapped?


    (c) The sprockets must fit in their place in the assembly, so their measurement needs to be within 0.20 mm of specifications. If the deviation of this measurement from specifications is roughly normal with mean 0.05 mm and standard deviation 0.10 mm, find approximately the proportion of sprockets rejected on this measurement check.

    R3.5 A random sample of sixteen observations is obtained from a population having a standard normal distribution, i.e. N(0, 1). Find approximate values for:

    i. the probability that the sample mean is greater than 0.5;

    ii. the probability that the sample variance is greater than 2;

    Consider the distribution of $X_{(16)}$, the maximum of a random sample of sixteen observations from a standard normal distribution.

    iii. Use the Statistical Tables to specify the mean of $X_{(16)}$.

    iv. Use the Statistical Tables to show that $\Pr(X_{(16)} \le 1.2816) = 0.1853$.

    R3.6 The diagram below gives the sample cdf for a random sample of 100 observations on X .

    i. Find the sample median.

    ii. Sketch a boxplot for this sample.

    iii. Draw a rough graph of what you think the population pdf might look like.

    iv. Give a rough estimate of the population standard deviation.

    v. Explain briefly how a sample cdf relates to a probability plot.

    R3.7 (a) The following is a random sample of ten observations on $Y \overset{d}{=} \mathrm{Pn}(\lambda)$:

        {7, 12, 9, 2, 4, 7, 8, 11, 4, 6}.

    For this sample $\sum y = 70$ and $\sum y^2 = 580$.

    Give an estimate of $\lambda$ and its standard error.

    (b) Sixty independent trials, each with probability p of success, yielded 18 successes. Find a 95% confidence interval for p.

    (c) A random sample of n = 15 observations on $X \overset{d}{=} N(\mu, \sigma^2)$ has sample mean $\bar{x} = 50.0$ and sample variance $s^2 = 60.0$.

    i. Find a 90% confidence interval for $\mu$.

    ii. Find a 90% prediction interval for $X$.

    Give your answers to one decimal place.

    (d) Each day for twenty days, a random sample of ten items from the day's production is selected and carefully measured: the average of the ten measurements ($\bar{x}$) and the variance of the ten measurements ($s^2$) are calculated and recorded each day. At the end of the twenty days, the average of the daily averages is av($\bar{x}$) = 8.2, and the average of the daily variances is av($s^2$) = 1.6. Determine control limits for an $\bar{x}$-chart.


    R3.8 A random sample of n observations is obtained on X which has pdf and cdf given by

        $f(x) = \frac{\theta}{(1 + x)^{\theta+1}}$ $(x > 0)$; $F(x) = 1 - \frac{1}{(1 + x)^{\theta}}$ $(x > 0)$.

    (a) Find the maximum likelihood estimate of $\theta$ and an expression for its standard error.

    (b) i. Show that the q-quantile of X is such that $\theta \ln(1 + c_q) = -\ln(1 - q)$.

    ii. Let $x_{(k)}$ denote the kth order statistic for this sample on X; and define $z_k = -\ln\left(1 - \frac{k}{n+1}\right)$. Explain how a plot of $\ln(1 + x_{(k)})$ against $z_k$ yields: (A) a check of the distributional assumption; and (B) an estimate of $\theta$.

    iii. Indicate in a rough sketch what such a plot might look like if the model is correct.

    R3.9 For a random sample of n observations taken from a normal population, the variance of the sample variance, $S^2$, is given by $\mathrm{var}(S^2) = \frac{2\sigma^4}{n-1}$.

    Use this result to show that an approximate expression for the standard error of the sample standard deviation s is given by $\mathrm{se}(s) \approx \frac{s}{\sqrt{2(n-1)}}$.

    Hence find an approximate 95% confidence interval for $\sigma$ based on a random sample of n = 33 observations from a normal population, for which s = 16.

    The exact 95% confidence interval in this case is $(12.9 < \sigma < 21.2)$. Explain how this exact confidence interval is obtained.
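    The approximate interval can be evaluated as sketched below, using the standard-error expression from the question together with the usual normal multiplier 1.96. The result can then be set beside the exact interval quoted above.

```python
from math import sqrt

n, s = 33, 16.0

# Approximate standard error of the sample standard deviation: s / sqrt(2(n - 1)).
se_s = s / sqrt(2 * (n - 1))

# Approximate 95% interval: estimate plus or minus 1.96 standard errors.
lo, hi = s - 1.96 * se_s, s + 1.96 * se_s

print(f"se(s) is roughly {se_s:.2f}")
print(f"approximate 95% CI for sigma: ({lo:.2f}, {hi:.2f})")
```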

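    R3.10 A random sample yields the following graphs of the log likelihood, $\ln L$; the score function, $U = \frac{\partial \ln L}{\partial \theta}$; and the observed information function, $V = -\frac{\partial^2 \ln L}{\partial \theta^2}$.

    [Diagram: graphs of the log likelihood, the score function and the observed information function]

    Use the above graphs to determine

    i. the maximum likelihood estimate of $\theta$;

    ii. an approximate 95% confidence interval for $\theta$;

    iii. an approximate standard error for $\hat{\theta}$.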


    R3.11 Consider a sequence of independent trials with probability of success $\theta$. We wish to test $H_0\colon \theta = 0.05$ vs $H_1\colon \theta = 0.1$.

    (a) Define the size and the power of a statistical test.

    (b) Given the following MATLAB output, which gives the cdf of a binomial distribution, specify a test, based on n = 300 observations, which has size $\le 0.05$ and power $> 0.95$ to test $H_0\colon \theta = 0.05$ vs $H_1\colon \theta = 0.1$.

        >> x = [19 20 21 22 23]
        x = 19 20 21 22 23
        >> binocdf(x,300,0.05)
        ans = 0.8810 0.9224 0.9514 0.9708 0.9832
        >> binocdf(x,300,0.1)
        ans = 0.0171 0.0287 0.0458 0.0699 0.1024

    (c) In the sequential likelihood ratio test to test $H_0\colon \theta = \theta_0$ vs $H_1\colon \theta = \theta_1$, we define $U_k = \ln\left(\frac{L_k(\theta_1)}{L_k(\theta_0)}\right)$, where $L_k(\theta)$ denotes the likelihood of the data set consisting of the first k observations. Then, we accept $H_1$ if $U_k > 3$, accept $H_0$ if $U_k \le -3$, and continue sampling otherwise.

    i. Give a brief explanation of why the cut-off values $\pm 3$ correspond roughly to size = 0.05 and power = 0.95.

    ii. In the case of testing $H_0\colon \theta = 0.05$ vs $H_1\colon \theta = 0.1$ for independent trials, indicate why $U_k \approx 0.75x_k - 0.05k$, where $x_k$ denotes the number of successes in k trials.

    iii. Verify that, if we obtained 20 successes in 200 trials, then this sequential test would mean that $H_1$ would be accepted.

    iv. Comment on the advantages and disadvantages of a sequential test.
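    Parts ii and iii of (c) can be checked numerically as sketched below. For independent trials the log likelihood ratio is linear in the number of successes and the number of trials; the names `log_lr`, `coef_x` and `coef_k` are introduced here for illustration, and the success probability is written `theta` as an assumed symbol.

```python
from math import log

theta0, theta1 = 0.05, 0.10


def log_lr(x, k):
    """Log likelihood ratio ln(L_k(theta1) / L_k(theta0)) for x successes in k trials."""
    return x * log(theta1 / theta0) + (k - x) * log((1 - theta1) / (1 - theta0))


# Writing U_k = coef_x * x_k + coef_k * k gives the linear form quoted in part ii.
coef_k = log((1 - theta1) / (1 - theta0))
coef_x = log(theta1 / theta0) - coef_k

u = log_lr(20, 200)
print(f"U_k = {coef_x:.3f} * x_k + ({coef_k:.3f}) * k")
print(f"U for 20 successes in 200 trials = {u:.3f}; exceeds 3: {u > 3}")
```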

    R3.12 PQR-Co is concerned with the quality of its major product, the deconvolving sprocket. Inparticular, the springiness of the sprocket must be at least 0.2. To determine the springinessconclusively, a destructive test needs to be used. But this is clearly pointless for products tobe shipped. In the past it has tested a random sample of 1% of the products shipped. A newnon-destructive testing procedure is proposed which is inexpensive and could be applied toall products shipped. However, it is felt that this new testing procedure over-estimates thespringiness. To test this, a series of sprockets is tested using each of the methods: it is firsttested by the new method (N) and then using the destructive test (D), after which no furthertesting is possible. These results are given below.

            N       D
     1    0.280   0.228
     2    0.228   0.227
     3    0.271   0.242
     4    0.217   0.197
     5    0.225   0.209
     6    0.247   0.227
     7    0.209   0.190
     8    0.226   0.210
     9    0.235   0.215
    10    0.241   0.239

    Describe how you would test the null hypothesis that $\mu_N = \mu_D$ against the alternative $\mu_N > \mu_D$. Give a rough analysis, without actually performing the test.

    R3.13 Independent samples are obtained from normally distributed populations, $X_1 \overset{d}{=} N(\mu_1, \sigma_1^2)$ and $X_2 \overset{d}{=} N(\mu_2, \sigma_2^2)$, with the following results:

    $n_1 = 8$; $\bar{x}_1 = 80$, $s_1^2 = 40$;
    $n_2 = 8$; $\bar{x}_2 = 50$, $s_2^2 = 32$.

    i. Find a 95% confidence interval for $\sigma_1/\sigma_2$; and verify that the hypothesis $\sigma_1^2 = \sigma_2^2$ would be accepted.

    ii. Specify the pooled variance estimate based on both samples.

    iii. Using the pooled variance estimate, obtain a 95% confidence interval for $\mu_1 - \mu_2$.


    R3.14 (a) Twenty experimental units are available in blocks of four. It is required to run an experiment to compare four treatments. Give an appropriate assignment of treatments to plots. Explain your method.

    (b) The experiment described in (a) is carried out, with results as indicated in the table below:

              T1    T2    T3    T4   (sum)
    B1        16    25     9    30     80
    B2         9    17    14    20     60
    B3         4    21    16    19     60
    B4        17    28    24    31    100
    B5        14    29    27    30    100
    (sum)     60   120    90   130    400

    This yielded the following incomplete approximate analysis of variance table:

    source        df     SS     MS
    blocks        **    400    ***
    treatments    **    ***    ***
    error         **    ***     15
    total         **   1180

    i. Complete this analysis of variance table.

    Assuming an additive model with independent normally distributed errors having equal variances,

    ii. test the significance of the treatment effects.

    The experiment above is actually a five replicate $2^2$ factorial experiment, with T1 = P0Q0, T2 = P1Q0, T3 = P0Q1 and T4 = P1Q1. This allows the split up of the treatment sum of squares into components due to P, Q and PQ: SSP = 500, SSQ = 80 and SSPQ = 20.

    iii. Show that the effect of P is highly significant.

    iv. Give an estimate of the effect of P, and the standard error for your estimate.
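
    The sums of squares in R3.14 follow from the block and treatment totals alone, so the table can be cross-checked with a few lines of arithmetic. The sketch below takes the total sum of squares (1180) from the approximate table rather than recomputing it from the raw data, and assumes the usual randomised-block degrees of freedom; all variable names are illustrative.

```python
# Totals copied from the data table: 5 blocks of 4 plots, 4 treatments.
block_sums = [80, 60, 60, 100, 100]
treatment_sums = [60, 120, 90, 130]
grand_total = 400
n_obs = 20
total_ss = 1180          # quoted in the (approximate) anova table

correction = grand_total**2 / n_obs
ss_blocks = sum(b**2 for b in block_sums) / 4 - correction
ss_treat = sum(t**2 for t in treatment_sums) / 5 - correction
ss_error = total_ss - ss_blocks - ss_treat

df_blocks, df_treat = 4, 3
df_error = (n_obs - 1) - df_blocks - df_treat
ms_treat = ss_treat / df_treat
ms_error = ss_error / df_error
f_treat = ms_treat / ms_error        # compare with an F(3, 12) critical value

# 2x2 factorial contrasts: T1 = P0Q0, T2 = P1Q0, T3 = P0Q1, T4 = P1Q1.
t1, t2, t3, t4 = treatment_sums
ss_p = (t2 + t4 - t1 - t3) ** 2 / n_obs
ss_q = (t3 + t4 - t1 - t2) ** 2 / n_obs
ss_pq = (t1 + t4 - t2 - t3) ** 2 / n_obs

# Effect of P: mean response at P1 minus mean response at P0 (10 plots each).
effect_p = (t2 + t4) / 10 - (t1 + t3) / 10
se_effect_p = (ms_error * (1 / 10 + 1 / 10)) ** 0.5

print(ss_blocks, ss_treat, ss_error, ms_error, round(f_treat, 2))
print(ss_p, ss_q, ss_pq, effect_p, round(se_effect_p, 3))
```

    The three factorial components add up to the treatment sum of squares, which is a useful consistency check on the quoted SSP, SSQ and SSPQ.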

    R3.15 The table below gives the results of one replicate of a $2^4$ experiment with factors P, Q, R and S. Some relevant computer output is also given.

            y     P  Q  R  S
     1    49.0    0  0  0  0
     2    56.2    0  0  0  1
     3    49.8    0  0  1  0
     4    49.0    0  0  1  1
     5    52.8    0  1  0  0
     6    62.2    0  1  0  1
     7    51.8    0  1  1  0
     8    57.2    0  1  1  1
     9    52.6    1  0  0  0
    10    55.4    1  0  0  1
    11    50.2    1  0  1  0
    12    56.6    1  0  1  1
    13    57.2    1  1  0  0
    14    63.2    1  1  0  1
    15    57.0    1  1  1  0
    16    61.6    1  1  1  1

          av.y
    P0    53.50
    P1    56.73
    Q0    52.35
    Q1    57.88
    R0    56.08
    R1    54.15
    S0    52.55
    S1    57.68

    anova-1
    source   df      SS
    P         1    41.603
    Q         1   122.103
    R         1    14.823
    S         1   105.063
    PQ        1     1.102
    PR        1     5.522
    PS        1     0.122
    QR        1     0.003
    QS        1     6.003
    RS        1     6.003
    PQR       1     0.062
    PQS       1     3.063
    PRS       1    12.602
    QRS       1     0.062
    PQRS      1     5.062
    Error     0
    Total    15   323.198

    anova-2
    source   df      SS       MS       F       P
    P         1    41.60    41.60    11.55   0.006
    Q         1   122.10   122.10    33.91   0.000
    R         1    14.82    14.82     4.12   0.067
    S         1   105.06   105.06    29.18   0.000
    Error    11    39.61     3.60
    Total    15   323.20


    i. Indicate the steps in the analysis of this experiment. Include answers to the following questions: How might a half-normal plot be used to indicate which interactions might be non-zero? How is the second analysis of variance obtained from the first? Which effects are significant? What is the estimate of the error variance?

    ii. Give an estimate of, and a 95% confidence interval for, the effect of P.
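
    The main-effect lines of the two anova tables in R3.15 can be reproduced from the sixteen responses. The sketch below assumes the runs are listed in the order shown in the data table (S changing fastest, P slowest) and that the second table pools every interaction into an 11 df error term; the value 2.201 is the tabulated 97.5% point of t with 11 df, and the variable names are illustrative.

```python
from itertools import product

y = [49.0, 56.2, 49.8, 49.0, 52.8, 62.2, 51.8, 57.2,
     52.6, 55.4, 50.2, 56.6, 57.2, 63.2, 57.0, 61.6]
# Factor levels (P, Q, R, S) for runs 1..16; the last factor varies fastest.
levels = list(product([0, 1], repeat=4))

n = len(y)
grand_mean = sum(y) / n
total_ss = sum((v - grand_mean) ** 2 for v in y)

main = {}
for j, name in enumerate("PQRS"):
    high = sum(v for v, lv in zip(y, levels) if lv[j] == 1) / (n / 2)
    low = sum(v for v, lv in zip(y, levels) if lv[j] == 0) / (n / 2)
    effect = high - low
    main[name] = (effect, n * effect**2 / 4)   # (effect estimate, 1 df SS)

# Pool everything that is not a main effect into the error term (11 df).
ss_error = total_ss - sum(ss for _, ss in main.values())
ms_error = ss_error / 11
se_effect = (4 * ms_error / n) ** 0.5          # each effect is a difference of two means of 8

T_CRIT = 2.201                                  # t table value, 11 df, two-sided 95%
effect_p = main["P"][0]
ci_p = (effect_p - T_CRIT * se_effect, effect_p + T_CRIT * se_effect)

for name, (effect, ss) in main.items():
    print(name, round(effect, 3), round(ss, 3), round(ss / ms_error, 2))
print(round(ms_error, 2), ci_p)
```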

    R3.16 (a) The table below gives the values of an independent variable x and a dependent variable y.

    x     0    1    2    3    4
    y    90   87   82   74   67

    For these data, the following statistics were calculated:

    $\bar{x} = 2$, $\bar{y} = 80$;

    $\sum (x - \bar{x})^2 = 10$, $\sum (x - \bar{x})(y - \bar{y}) = -59$, $\sum (y - \bar{y})^2 = 358$.

    i. Assuming that $E(Y \mid x) = \alpha + \beta x$ and $\mathrm{var}(Y \mid x) = \sigma^2$, obtain estimates of $\alpha$ and $\beta$ using the method of least squares.

    ii. Find $s^2$ and hence obtain $\mathrm{se}(\hat{\beta})$.

    iii. Plot the observations and your fitted curve.

    (b) A random sample of n = 50 observations is obtained on (X, Y). For this sample, it is found that $\bar{x} = \bar{y} = 50$, $s_x = s_y = 10$ and the sample correlation $r_{xy} = 0.5$.

    i. Indicate, with a rough sketch, the general nature of the scatter plot for this sample.

    ii. On your diagram, indicate the regression line for the regression of y on x.

    iii. Give an approximate 95% confidence interval for the population correlation.
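
    For part (a) of R3.16, the summary statistics can be recomputed from the five data points, which also settles the sign of the cross-product sum. The block is a sketch of the standard least-squares formulas for a straight line with intercept (called alpha) and slope (called beta); those symbol names are assumptions, since the originals were lost in extraction.

```python
x = [0, 1, 2, 3, 4]
y = [90, 87, 82, 74, 67]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
syy = sum((yi - y_bar) ** 2 for yi in y)

beta_hat = sxy / sxx                      # slope estimate
alpha_hat = y_bar - beta_hat * x_bar      # intercept estimate
residual_ss = syy - sxy**2 / sxx
s2 = residual_ss / (n - 2)                # error variance estimate
se_beta = (s2 / sxx) ** 0.5

print(sxx, sxy, syy)
print(alpha_hat, beta_hat, round(s2, 3), round(se_beta, 3))
```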

    Revision exercises 4

    R4.1 (a) The events A and B are such that Pr(A) = 0.5, Pr(B) = 0.4 and Pr(B | A) = 0.6. Find Pr(B | A).

    (b) Items from a production line are classified as having no fault, having a minor fault or having a major fault. On average, 90% have no fault, 8% have a minor fault and 2% have a major fault.
    A fault detector is such that the probability that it signals a fault is 0.01 for items with no fault; 0.6 for items with a minor fault; and 0.95 for items with a major fault.

    i. For what proportion of items is a fault signalled?

    ii. Of those items for which a fault is signalled, what proportion have a major fault?
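
    Part (b) of R4.1 is an application of the law of total probability followed by Bayes' rule. A minimal sketch, with dictionary keys chosen here purely for illustration:

```python
# Proportions of each fault class and detector signal probabilities, from the question.
prior = {"none": 0.90, "minor": 0.08, "major": 0.02}
p_signal_given = {"none": 0.01, "minor": 0.60, "major": 0.95}

# Law of total probability: overall proportion of items that trigger a signal.
p_signal = sum(prior[k] * p_signal_given[k] for k in prior)

# Bayes' rule: share of signalled items that have a major fault.
p_major_given_signal = prior["major"] * p_signal_given["major"] / p_signal

print(round(p_signal, 4), round(p_major_given_signal, 4))
```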

    R4.2 (a) A production process when in control produces 10% defective items. If a week's production is 400 items, find the mean and variance of X, the number of defective items produced in a week, and hence give an approximate 95% interval within which X will lie.

    (b) Consider the following sampling plan. Test a random sample of 100 items chosen from a lot containing a large number of items, and accept the lot if the number of defective items found is at most two.
    Construct the OC-curve for this sampling plan.
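
    The OC-curve in R4.2(b) is the probability of accepting the lot as a function of the true defective proportion p. The sketch below assumes the lot is large enough for the number of defectives in the sample to be treated as binomial; accept_prob is an illustrative name.

```python
from math import comb


def accept_prob(p: float, n: int = 100, c: int = 2) -> float:
    """P(at most c defectives in a sample of n) when the defective rate is p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


# Points on the OC-curve for p = 0.00, 0.01, ..., 0.10.
oc_curve = [(p / 100, accept_prob(p / 100)) for p in range(0, 11)]

for p, prob in oc_curve:
    print(f"p = {p:.2f}   Pr(accept) = {prob:.3f}")
```

    Plotting the second column against the first gives the curve; it starts at 1 when p = 0 and falls steadily as p increases.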

    R4.3 A production process consists of three stages. As a result of each stage, three things can happen: the item is scrapped; the item is reworked, i.e. sent through the same stage again; or the item moves along to the next stage. The probabilities of these events at each stage are set out in the following table:

                scrapped   reworked   next stage
    stage 1       0.1        0.2         0.7
    stage 2       0.1        0.2         0.7
    stage 3       0          0.6         0.4

    Consider this process as a Markov chain with five states: 0 = scrapped, 1 = stage 1, 2 = stage 2, 3 = stage 3 and 4 = complete.

    i. Write down the transition probability matrix, P.


    ii. If N denotes the number of cycles (i.e. stages and repeated stages) required to complete an item, specify approximately E(N).

    iii. Given that

    $P^5$ =
        1.000   0.000   0.000   0.000   0.000
        0.234   0.000   0.006   0.227   0.533
        0.125   0.000   0.000   0.136   0.739
        0.000   0.000   0.000   0.078   0.922
        0.000   0.000   0.000   0.000   1.000

    a. Specify the probability that an item is still in the production process after five cycles.

    b. Specify the probability that an item is complete after five cycles.

    c. Give an approximate value for the proportion of items scrapped in this production process.
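
    The five-step matrix quoted in R4.3 can be reproduced by writing the stage table as a transition matrix and multiplying it out. The matrix below is one reading of the table, with "scrapped" and "complete" treated as absorbing states and every item starting in stage 1; the helper names matmul and matpow are illustrative.

```python
# States: 0 = scrapped, 1 = stage 1, 2 = stage 2, 3 = stage 3, 4 = complete.
P = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.1, 0.2, 0.7, 0.0, 0.0],
    [0.1, 0.0, 0.2, 0.7, 0.0],
    [0.0, 0.0, 0.0, 0.6, 0.4],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]


def matmul(a, b):
    """Plain-Python product of two square matrices."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def matpow(m, power):
    """m raised to a non-negative integer power by repeated multiplication."""
    result = [[float(i == j) for j in range(len(m))] for i in range(len(m))]
    for _ in range(power):
        result = matmul(result, m)
    return result


P5 = matpow(P, 5)
still_in_process = sum(P5[1][1:4])   # in stage 1, 2 or 3 after five cycles
complete_by_5 = P5[1][4]
scrapped_by_5 = P5[1][0]

# Long-run scrap proportion: take a power large enough for absorption.
scrap_long_run = matpow(P, 200)[1][0]

print([round(v, 3) for v in P5[1]])
print(round(still_in_process, 3), round(complete_by_5, 3), round(scrap_long_run, 4))
```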

    R4.4 (a) The graph below represents the cdf of a random variable W.

    Use the graph to obtain approximate values for

    i. Pr(0.2 < W < 0.5) and Pr(W > 0.5);

    ii. the median and quartiles of W.

    Draw a rough sketch of the graph of the pdf of W.

    (b) Suppose that T is a continuous random variable with pdf given by

    $f(t) = 12t^2(1 - t)$   (0 < t < 1).

    Find the mean and standard deviation of T .
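
    For R4.4(b), the moments of T are integrals of polynomials and can be evaluated exactly. The sketch assumes the density reads 12 t^2 (1 - t) on (0, 1); a quick check that it integrates to 1 supports that reading. The function name moment is illustrative.

```python
from fractions import Fraction


def moment(k: int) -> Fraction:
    """E[T^k]: the integral of 12 t^(k+2) (1 - t) over (0, 1), term by term."""
    return 12 * (Fraction(1, k + 3) - Fraction(1, k + 4))


total_mass = moment(0)              # should be exactly 1 for a valid pdf
mean_t = moment(1)
var_t = moment(2) - mean_t**2
sd_t = float(var_t) ** 0.5

print(total_mass, mean_t, var_t, sd_t)
```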

    R4.5 (a) Suppose that X and Y are independent random variables, which are such that $X \overset{d}{=} N(10, 4^2)$ and $Y \overset{d}{=} N(12, 3^2)$. Find

    i. Pr(X > 12);

    ii. Pr(X > Y).

    (b) If $Z \overset{d}{=} \mathrm{Pn}(25)$, find Pr(Z > 33).

    (c) If E(U) = 10 and sd(U) = 5, specify approximate values for the mean and standard deviation of ln U.
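
    Parts (a) and (b) of R4.5 can be checked with the standard library. The sketch assumes the second argument of N(., .) is a variance written as a square (so X has standard deviation 4 and Y has standard deviation 3) and that Pn(25) denotes a Poisson distribution with mean 25; for (b) it shows both a normal approximation with continuity correction and the exact Poisson sum.

```python
from math import exp, factorial
from statistics import NormalDist

# (a)(i): X ~ N(10, 4^2)
X = NormalDist(mu=10, sigma=4)
p_x_gt_12 = 1 - X.cdf(12)

# (a)(ii): X - Y is normal with mean 10 - 12 and variance 4^2 + 3^2.
D = NormalDist(mu=10 - 12, sigma=(4**2 + 3**2) ** 0.5)
p_x_gt_y = 1 - D.cdf(0)

# (b): Z ~ Poisson(25); Pr(Z > 33) = Pr(Z >= 34).
p_z_approx = 1 - NormalDist(mu=25, sigma=25**0.5).cdf(33.5)
p_z_exact = 1 - sum(exp(-25) * 25**k / factorial(k) for k in range(34))

print(round(p_x_gt_12, 4), round(p_x_gt_y, 4))
print(round(p_z_approx, 4), round(p_z_exact, 4))
```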


    R4.6 The following is a random sample of 10 observations on the integer-valued random variable Y:

    6, 4, 9, 3, 2, 4, 11, 8, 6, 7.

    For this sample $\sum y = 60$ and $\sum y^2 = 432$.

    (a) Find:

    i. the second order statistic, $y_{(2)}$;
    ii. the sample mean;
    iii. the sample standard deviation;
    iv. the sample median;
    v. the sample quartiles.

    (b) If $Y \overset{d}{=} \mathrm{Pn}(\lambda)$, give an estimate of $\lambda$ and its standard error.

    R4.7 (a) A sequence of n = 30 independent trials yields x = 9 successes.
    Use the tables to specify a 95% confidence interval for p, the probability of success.

    (b) A random sample of n = 15 on $X \overset{d}{=} N(\mu, \sigma^2)$ yields sample mean $\bar{x} = 160$ and sample variance $s^2 = 60$.

    i. Find a 95% confidence interval for $\mu$.

    ii. Find a 95% prediction interval for X.

    (c) The log-likelihood for a data set is given by

    $\ln L = 200 \ln\theta - 25\theta^2$.

    Find the maximum likelihood estimate of $\theta$ and its standard error.

    R4.8 (a) A random sample of twenty-five observations is obtained on $X \overset{d}{=} N(\mu, 10^2)$. The sample mean for this sample is $\bar{x} = 46.4$. Test the null hypothesis $H_0\colon \mu = 50$ against the alternative $H_1\colon \mu < 50$. Specify the P-value.

    (b) Each day for twenty days, a random sample of 25 items from the day's production is selected and carefully measured: the average of the 25 measurements ($\bar{x}$) and the range of the 25 measurements (R) are calculated and recorded each day.
    At the end of the twenty days, the average of the daily averages, $\bar{\bar{x}} = 22.0$; and the average of the daily ranges, $\bar{R} = 7.86$.
    It is assumed that the measurements are approximately normal with mean $\mu$ and variance $\sigma^2$.

    i. Explain why $R \approx 3.93\sigma$.

    ii. Determine control limits for an $\bar{x}$-chart.

    R4.9 Independent samples are obtained from normal populations $X_1 \overset{d}{=} N(\mu_1, \sigma_1^2)$ and $X_2 \overset{d}{=} N(\mu_2, \sigma_2^2)$, with the following results:

    $n_1 = 8$; $\bar{x}_1 = 80$, $s_1^2 = 40$;
    $n_2 = 8$; $\bar{x}_2 = 50$, $s_2^2 = 32$.

    i. Find a 95% confidence interval for $\sigma_1/\sigma_2$; and verify that the hypothesis $\sigma_1^2 = \sigma_2^2$ would be accepted.

    ii. Specify the pooled variance estimate based on both samples; and state its distribution under the assumption that $\sigma_1^2 = \sigma_2^2$.

    iii. Using the pooled variance estimate, obtain a 95% confidence interval for $\mu_1 - \mu_2$, and specify the value of the t-statistic used to test $\mu_1 = \mu_2$.
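
    The calculations in R4.9 (and the identical data in R3.13) reduce to a few lines once the table values are looked up. In the sketch below, the two critical values are hard-coded from standard tables (the 97.5% point of t with 14 df and the upper 2.5% point of F with 7 and 7 df); the garbled symbols are read as two normal populations with means mu1, mu2 and variances sigma1^2, sigma2^2, which is an assumption.

```python
n1, xbar1, s1_sq = 8, 80, 40
n2, xbar2, s2_sq = 8, 50, 32

# Pooled variance estimate on n1 + n2 - 2 = 14 degrees of freedom.
df = n1 + n2 - 2
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df

# Difference of means: standard error, t-statistic, 95% interval.
diff = xbar1 - xbar2
se_diff = (sp_sq * (1 / n1 + 1 / n2)) ** 0.5
t_stat = diff / se_diff
T_CRIT = 2.145                       # table value: t, 14 df, upper 2.5% point
ci_diff = (diff - T_CRIT * se_diff, diff + T_CRIT * se_diff)

# Ratio of variances: equal-tailed 95% interval from the F(7, 7) distribution.
F_CRIT = 4.99                        # table value: F(7, 7), upper 2.5% point
ratio = s1_sq / s2_sq
ci_var_ratio = (ratio / F_CRIT, ratio * F_CRIT)
ci_sd_ratio = tuple(v**0.5 for v in ci_var_ratio)

print(sp_sq, se_diff, t_stat, ci_diff)
print(ci_sd_ratio)
```

    An interval for the ratio of standard deviations that contains 1 is consistent with accepting the hypothesis of equal variances.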


    R4.10 Random samples were obtained on each of four normal populations having equal variances, $\sigma^2$. The samples contained $n_1 = 4$, $n_2 = 3$, $n_3 = 5$ and $n_4 = 5$ observations respectively.

    i. Complete the following analysis of variance table derived from the above samples:

                        df     SS     MS     F
    between samples
    within samples                    25
    total                      625

    ii. Show that the hypothesis that the populations have equal means is rejected using a test of size 0.05.

    iii. Give an estimate of $\sigma^2$, and a 95% confidence interval for $\sigma^2$.

    iv. If $\bar{x}_1 = 31$ specify a 95% confidence interval for $\mu_1$.

    v. Specify the standard error of $\hat{\mu}_3 - \hat{\mu}_4$.

    R4.11 (a) An experiment to compare the effect of three treatments is to be conducted using eighteen plots. Six of the plots are classed as good quality, six as moderate quality and six as poor quality. Describe in detail how you would decide which treatments would be allocated to which plots.
    This experiment is carried out giving the following analysis of variance table:

                  df    SS    MS
    blocks        ..    48    ..
    treatments    ..    24    ..
    error         ..    39    ..
    total         ..   111

    i. Test the significance of the treatment effects.

    ii. Specify the standard error of the estimate of the difference in effects of treatments 1 and 2.

    (b) The yield of a chemical process is observed at three temperature levels (1 = low, 2 = medium and 3 = high) for each of three catalysts. Four replicate observations are obtained for each catalyst-temperature combination and the averages of these sets of four observations are plotted in the following diagram.

    For these data, the error mean square, $s^2 = 1$. Indicate the number of degrees of freedom for each of the components in the standard analysis of variance for this situation; and indicate whether the corresponding F-tests are likely to be significant for these data, giving reasons for your answers.


    R4.12 A study was conducted to investigate the relationship between strength (y) and density (x) of a particular material. Seventeen specimens were tested and for these data the following statistics were calculated:

    $\bar{x} = 30$, $\bar{y} = 50$;
    $\sum (x - \bar{x})(y - \bar{y}) = 200$, $\sum (x - \bar{x})^2 = 400$, $\sum (y - \bar{y})^2 = 400$.

    i. Evaluate the sample correlation coefficient r and give a rough 95% confidence interval for $\rho$.

    ii. Fit the straight line regression model $y = \alpha + \beta x + e$; i.e. give estimates of $\alpha$ and $\beta$.

    iii. Roughly sketch a possible scatter plot for these data.

    iv. Complete the following analysis of variance table:

                  df    SS    MS    F
    regression    ..    ..    ..    ..
    residual      ..    ..    20
    total         ..    ..

    Hence, or otherwise, test the hypothesis $\beta = 0$.
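
    Every quantity asked for in R4.12 follows from the three sums of squares and products. The sketch below reads the lost symbols as intercept alpha, slope beta and population correlation rho, takes the cross-product sum as +200 as printed, and uses the Fisher z-transformation with the normal value 1.96 for the rough interval; these are assumptions rather than the method prescribed by the course.

```python
from math import atanh, sqrt, tanh

n = 17
x_bar, y_bar = 30, 50
sxy, sxx, syy = 200, 400, 400

# i. Sample correlation and a rough 95% interval via Fisher's z.
r = sxy / sqrt(sxx * syy)
half_width = 1.96 / sqrt(n - 3)
rho_ci = (tanh(atanh(r) - half_width), tanh(atanh(r) + half_width))

# ii. Least-squares estimates of the line.
beta_hat = sxy / sxx
alpha_hat = y_bar - beta_hat * x_bar

# iv. Analysis of variance for the regression.
ss_reg = sxy**2 / sxx
ss_res = syy - ss_reg
df_reg, df_res = 1, n - 2
ms_res = ss_res / df_res
f_stat = (ss_reg / df_reg) / ms_res   # compare with an F(1, 15) critical value

print(r, rho_ci)
print(alpha_hat, beta_hat)
print(ss_reg, ss_res, ms_res, f_stat)
```

    The residual mean square computed this way agrees with the value 20 printed in the table, which supports the reading of the garbled sums.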