Random Variable
yetta-oneill -
Random Variable
Qualitative (categorical): Nominal, Ordinal
Quantitative (numeric): Interval, Ratio; Continuous or Discrete
SUMMARIZING NUMERIC DATA
• Simple Frequency Table
• Grouped Frequency Table
• Histogram
• Frequency Polygon
• Cumulative Frequency Distribution
• Arithmetic Mean
• Median
• Mode.
Measures of Central Location
Mean for grouped data:
Population mean: $\mu = \dfrac{\sum fx}{N}$
Sample mean: $\bar{x} = \dfrac{\sum fx}{n}$
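Median for grouped data:
$\text{median} = L + c\,\dfrac{\frac{n}{2} - F}{f_m}$
where L = lower limit of the median class, c = class width, n = number of observations, F = less-than cumulative frequency of the class immediately preceding the median class, and $f_m$ = frequency of the median class.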
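Mode for grouped data:
$\text{mode} = L + c\,\dfrac{f_m - f_1}{2f_m - f_1 - f_2}$
where L = lower limit of the modal class, c = class width, $f_m$ = frequency of the modal class, $f_1$ = frequency of the class immediately preceding it, and $f_2$ = frequency of the class immediately following it.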
Measures of Dispersion (Variability)
• Range
• Variance and Standard Deviation
• Coefficient of Variation
• Non-central Locations: Inter-fractile Ranges
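Standard Deviation
Grouped data: $s = \sqrt{\dfrac{n\sum fx^2 - \left(\sum fx\right)^2}{n(n-1)}}$
Ungrouped data: $s = \sqrt{\dfrac{n\sum x^2 - \left(\sum x\right)^2}{n(n-1)}}$
Coefficient of variation:
$CV = \dfrac{s}{\bar{x}}\,(100\%)$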
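Empirical Rule:
For a symmetrical, bell-shaped distribution, about 68% of the observations lie within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three standard deviations.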
The Relative Positions of the Mean, Median, and Mode:
• Symmetric distribution (zero skewness): Mean = Median = Mode
• Positively skewed: Mean > Median > Mode
• Negatively skewed: Mean < Median < Mode
Non-Central Location Measures (Fractiles or Quantiles)
• Quartiles
• Sextiles
• Octiles
• Deciles
• Percentiles
Calculating Quartiles for Grouped Data
The jth quartile for grouped data is given by:
$Q_j = L + c\,\dfrac{\frac{jn}{4} - F}{f_{Q_j}}$
where
n = sample size
L = lower limit of the jth quartile class
c = class width
F = less-than cumulative frequency of the class immediately preceding the jth quartile class
$f_{Q_j}$ = frequency of the jth quartile class
Example
A sample of 20 randomly-selected hospitals in the US revealed the following daily charges (in $) for a semiprivate room.
153 159 142 146
141 140 130 148
142 163 134 151
122 167 137 152
143 168 159 141
1.1 Using class intervals of width 10 units, construct a less-than cumulative frequency distribution of the above data. Let 120 units be the lower limit of the smallest class.
1.2 Draw a less-than ogive and use it to estimate the 80th percentile.
1.3 For the grouped data of question 1.1 above, calculate:
1.3.1 The mean, median and mode.
1.3.2 The interquartile range.
1.3.3 The coefficient of variation. Interpret the result obtained.
Solution
1.1
Class Freq, f <cum freq, F
120 - 130 1 1
130 - 140 3 4
140 - 150 8 12
150 - 160 5 17
160 - 170 3 20
∑ = 20
1.2
80th percentile ≈ 158 (read from the less-than ogive at a cumulative frequency of 0.8 × 20 = 16)
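1.3.1
Class Freq, f <cum freq, F midpt, x fx
120 - 130 1 1 125 125
130 - 140 3 4 135 405
140 - 150 8 12 145 1160
150 - 160 5 17 155 775
160 - 170 3 20 165 495
∑ = 20 ∑ = 2960
Mean: $\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{2960}{20} = 148$
Median: $x_{med} = L_{med} + c\,\dfrac{\frac{n}{2} - F}{f_{med}} = 140 + 10 \times \dfrac{10 - 4}{8} = 147.5$
Mode: $x_{mod} = L_{mod} + c\,\dfrac{f_{mod} - f_1}{2f_{mod} - f_1 - f_2} = 140 + 10 \times \dfrac{8 - 3}{16 - 3 - 5} \approx 146.3$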
1.3.2
Using the less-than cumulative frequencies from 1.1 (n/4 = 5 and 3n/4 = 15):
$Q_1 = 140 + 10 \times \dfrac{5 - 4}{8} \approx 141.3$
$Q_3 = 150 + 10 \times \dfrac{15 - 12}{5} = 156$
$IQR = Q_3 - Q_1 = 156 - 141.3 = 14.7$
1.3.3
Class Midpt, x fx fx²
120 - 130 125 125 15625
130 - 140 135 405 54675
140 - 150 145 1160 168200
150 - 160 155 775 120125
160 - 170 165 495 81675
∑ = 2960 ∑ = 440300
$s^2 = \dfrac{\sum fx^2 - \left(\sum fx\right)^2/n}{n-1} = \dfrac{440300 - 2960^2/20}{19} \approx 116.8 \;\Rightarrow\; s \approx 10.8$
CV = standard deviation/mean
→ CV = 10.8/148 ≈ 0.073 ≡ 7.3% → the data are clustered closely around the mean.
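The grouped-data calculations in this example can be checked with a short Python sketch. The variable and function names are illustrative, and the class limits and frequencies are taken from the frequency table in the solution. Because the script carries unrounded values, the interquartile range comes out as 14.75 rather than the 14.7 obtained from the rounded Q1.

```python
from math import sqrt

# Class lower limits, frequencies and class width from the worked example.
lower = [120, 130, 140, 150, 160]
freq = [1, 3, 8, 5, 3]
width = 10
n = sum(freq)
mid = [lo + width / 2 for lo in lower]

# Grouped mean: sum(f * x) / n, using class midpoints.
sum_fx = sum(f * x for f, x in zip(freq, mid))
mean = sum_fx / n

def quantile(fraction):
    """Interpolated grouped-data quantile: L + c * (fraction * n - F) / f."""
    target = fraction * n
    cum = 0  # less-than cumulative frequency of the preceding classes
    for lo, f in zip(lower, freq):
        if cum + f >= target:
            return lo + width * (target - cum) / f
        cum += f
    raise ValueError("fraction must lie between 0 and 1")

median = quantile(0.5)
q1, q3 = quantile(0.25), quantile(0.75)
p80 = quantile(0.8)

# Mode: L + c * (fm - f1) / (2 * fm - f1 - f2) for the modal class.
m = freq.index(max(freq))
f1 = freq[m - 1] if m > 0 else 0
f2 = freq[m + 1] if m < len(freq) - 1 else 0
mode = lower[m] + width * (freq[m] - f1) / (2 * freq[m] - f1 - f2)

# Sample standard deviation and coefficient of variation (in %).
sum_fx2 = sum(f * x * x for f, x in zip(freq, mid))
s = sqrt((sum_fx2 - sum_fx ** 2 / n) / (n - 1))
cv = 100 * s / mean

print(mean, median, mode, q3 - q1, round(s, 1), round(cv, 1))
```

The same `quantile` helper also gives the 80th percentile, which offers a numerical check on the value read from the ogive.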
BASIC PROBABILITY CONCEPTS
• Random Experiment
• Sample Space
• Event
• Collectively Exhaustive Events
• Dependent Events
• Independent Events
• Marginal Probability
• Joint Probability: P(A∩B) = P(B∩A)
• Conditional Probability: P(A|B) = P(A∩B)/P(B) and P(B|A) = P(A∩B)/P(A)
Complement Rule:
P(A’) = 1 – P(A) or P(A) = 1 – P(A’)
General Multiplication Rule:
P(A and B) = P(A∩B) = P(A)P(B|A) or
P(A and B) = P(A∩B) = P(B)P(A|B)
Special Multiplication Rule (independent events):
P(A and B) = P(A)P(B) = P(B)P(A)
Special Addition Rule (mutually exclusive events):
P(A or B) = P(A)+P(B)
General Addition Rule:
P(A or B) = P(A)+P(B) – P(A and B)
Example
A company manufactures a total of 8000 motorcycles a month in three plants A, B and C. Of these, plant A manufactures 4000, and plant B manufactures 3000. At plant A, 85 out of 100 motorcycles are of standard quality or better. At plant B, 65 out of 100 motorcycles are of standard quality or better and at plant C, 60 out of 100 motorcycles are of standard quality or better. The quality controller randomly selects a motorcycle and finds it to be of substandard quality. Calculate the probability that it has come from plant B.
Solution
P(B|substd) = No. of substd items from B / Total no. of substd items
Plant C manufactures 8000 - 4000 - 3000 = 1000 motorcycles.
No. of substd items from A = 4000 × (100 - 85)/100 = 40 × 15 = 600
No. of substd items from B = 3000 × (100 - 65)/100 = 30 × 35 = 1050
No. of substd items from C = 1000 × (100 - 60)/100 = 10 × 40 = 400
Total number of substd items = 600 + 1050 + 400 = 2050
P(B|substd) = 1050/2050 = 0.512
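The counting argument in this solution can be reproduced in a few lines of Python. This is only a sketch with illustrative names: it counts the expected substandard units per plant and divides plant B's count by the total.

```python
# Monthly output per plant; plant C's output is the remainder of the 8000 total.
production = {"A": 4000, "B": 3000}
production["C"] = 8000 - sum(production.values())

# Share of each plant's output that is of standard quality or better.
standard_rate = {"A": 0.85, "B": 0.65, "C": 0.60}

# Expected number of substandard motorcycles per plant.
substandard = {
    plant: units * (1 - standard_rate[plant])
    for plant, units in production.items()
}

# Probability of each plant, given that the selected motorcycle is substandard.
total_substandard = sum(substandard.values())
posterior = {plant: count / total_substandard for plant, count in substandard.items()}

print(round(posterior["B"], 3))
```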
PROBABILITY DISTRIBUTIONS
• Properties
• Discrete distributions
• Normal distributions
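Binomial Probability Distribution
$P(x) = \dfrac{n!}{x!\,(n-x)!}\,\pi^{x}(1-\pi)^{n-x}$
where n = number of trials, x = number of successes, and π = probability of success on each trial.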
Example
According to a leading newspaper, the largest cellular phone service in the US has about 36 million subscribers out of a total of 180 million cell phone users. If six cell phone users are randomly selected, what is the probability that at least two of them subscribe to this service?
n = 6, π = 36/180 = 0.2
$P(x \ge 2) = 1 - P(0) - P(1)$
$P(0) = \dfrac{6!}{0!\,(6-0)!}(0.2)^0(1-0.2)^6 = 0.262$
$P(1) = \dfrac{6!}{1!\,(6-1)!}(0.2)^1(1-0.2)^5 = 0.393$
$P(x \ge 2) = 1 - 0.262 - 0.393 = 0.345$
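As a quick check of the binomial example, the following Python sketch evaluates the formula directly using the standard-library `math.comb`. The helper name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 6, 36 / 180  # six users; 36 million subscribers out of 180 million

# "At least two" is the complement of "zero or one".
p_at_least_two = 1 - binomial_pmf(0, n, p) - binomial_pmf(1, n, p)

print(round(binomial_pmf(0, n, p), 3), round(binomial_pmf(1, n, p), 3))
print(round(p_at_least_two, 3))
```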
Poisson Probability Distribution
$P(x) = \dfrac{\lambda^{x} e^{-\lambda}}{x!}$
where λ = mean number of occurrences in the interval.
Example
Customers arrive randomly and independently at a service point at an average rate of 30 per hour.
1. Calculate the probability that exactly 20 customers arrive at the service point during any given hour.
2. Calculate the probability that during any 5 minute period at least 3 customers arrive at the service point.
Solution
1. λ = 30/hr
$P(20) = \dfrac{30^{20} e^{-30}}{20!} = 0.0134$
2. λ = 30/60 min = 2.5/5 min
$P(x \ge 3) = 1 - P(0) - P(1) - P(2)$
$= 1 - \dfrac{2.5^{0} e^{-2.5}}{0!} - \dfrac{2.5^{1} e^{-2.5}}{1!} - \dfrac{2.5^{2} e^{-2.5}}{2!}$
$= 1 - 0.0821 - 0.2052 - 0.2565 = 0.456$
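Both parts of the Poisson example can be evaluated directly from the formula. The sketch below uses only the standard library. The helper name `poisson_pmf` is illustrative, and the rate for a 5-minute window is obtained by scaling the hourly rate.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson variable with mean lam."""
    return lam ** x * exp(-lam) / factorial(x)

# Part 1: exactly 20 arrivals in one hour at 30 per hour.
p_exactly_20 = poisson_pmf(20, 30.0)

# Part 2: at least 3 arrivals in 5 minutes; the rate scales to 30 * 5/60 = 2.5.
lam_5min = 30 * 5 / 60
p_at_least_3 = 1 - sum(poisson_pmf(k, lam_5min) for k in range(3))

print(round(p_exactly_20, 4), round(p_at_least_3, 3))
```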
Normal probability distribution
• Normal curve is symmetrical
• Mean, median, and mode are equal
• Theoretically, curve extends to infinity
Standard normal or z-distribution
$z = \dfrac{x - \mu}{\sigma}$
The standard normal distribution has μ = 0 and σ² = 1.
Area between 0 and z
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
Example
Six hundred candidates wrote an entrance test for admission to a management course. The marks obtained by the candidates were found to be normally distributed with a mean of 132 marks and a standard deviation of 18 marks.
1. How many candidates scored between 140 and 160 marks?
2. If the top 60 performers were given confirmed admission, calculate the minimum mark (to the nearest integer) above which a candidate would be guaranteed admission?
Solution
$z = \dfrac{x - \mu}{\sigma}$
1.
z1 = (140 - 132)/18 = 0.4444 → P1 ≈ 0.172
z2 = (160 - 132)/18 = 1.5556 → P2 ≈ 0.440
→ P(140 < X < 160) ≈ 0.440 - 0.172 = 0.268 → 0.268 × 600 candidates ≈ 161 candidates
2.
Let $x_c$ denote the minimum mark.
60/600 = 0.1 = 10%. P(0 < z < $z_c$) = 0.50 - 0.10 = 0.4 → $z_c$ = 1.28
$z_c = \dfrac{x_c - \mu}{\sigma} \;\Rightarrow\; 1.28 = \dfrac{x_c - 132}{18} \;\Rightarrow\; x_c = 132 + 1.28 \times 18 \approx 155$
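The table look-ups in this example can be cross-checked with Python's standard-library `statistics.NormalDist`, which evaluates the normal CDF and its inverse directly. This is a sketch with illustrative variable names. It works with full cumulative areas rather than the "area between 0 and z" convention of the table, so intermediate values differ slightly from the rounded table entries.

```python
from statistics import NormalDist

marks = NormalDist(mu=132, sigma=18)
candidates = 600

# Part 1: expected number of candidates scoring between 140 and 160.
share_140_160 = marks.cdf(160) - marks.cdf(140)
expected_count = share_140_160 * candidates

# Part 2: the top 60 of 600 are the upper 10%, so the cutoff is the 90th percentile.
cutoff = marks.inv_cdf(1 - 60 / candidates)

print(round(expected_count), round(cutoff))
```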
HYPOTHESIS TESTING
• What is a Hypothesis?
• What is Hypothesis Testing?
Basic Terms
• Null hypothesis
• Alternative hypothesis
• Level of significance
• Type I error
• Type II error
• Critical value
• Test statistic
• Rejection area
• Acceptance area
• One-tailed test
• Two-tailed test
Five-Step Procedure for Hypothesis Testing
Step 1: State the null and alternative hypotheses
Step 2: Determine the critical value associated with the level of significance
Step 3: Identify and calculate the test statistic
Step 4: Formulate and apply the decision rule
Step 5: Draw a conclusion
Testing a Single Population Mean
Large sample (n > 30)
Test statistic: $z_{test} = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
Small sample (n < 30)
Test statistic: $t_{test} = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$, with n - 1 degrees of freedom
t table with right tail probabilities
df\p 0.4 0.25 0.1 0.05 0.025 0.01 0.005 0.0005
1 0.32492 1 3.077684 6.313752 12.7062 31.82052 63.65674 636.6192
2 0.288675 0.816497 1.885618 2.919986 4.30265 6.96456 9.92484 31.5991
3 0.276671 0.764892 1.637744 2.353363 3.18245 4.5407 5.84091 12.924
4 0.270722 0.740697 1.533206 2.131847 2.77645 3.74695 4.60409 8.6103
5 0.267181 0.726687 1.475884 2.015048 2.57058 3.36493 4.03214 6.8688
6 0.264835 0.717558 1.439756 1.94318 2.44691 3.14267 3.70743 5.9588
7 0.263167 0.711142 1.414924 1.894579 2.36462 2.99795 3.49948 5.4079
8 0.261921 0.706387 1.396815 1.859548 2.306 2.89646 3.35539 5.0413
9 0.260955 0.702722 1.383029 1.833113 2.26216 2.82144 3.24984 4.7809
10 0.260185 0.699812 1.372184 1.812461 2.22814 2.76377 3.16927 4.5869
11 0.259556 0.697445 1.36343 1.795885 2.20099 2.71808 3.10581 4.437
12 0.259033 0.695483 1.356217 1.782288 2.17881 2.681 3.05454 4.3178
13 0.258591 0.693829 1.350171 1.770933 2.16037 2.65031 3.01228 4.2208
14 0.258213 0.692417 1.34503 1.76131 2.14479 2.62449 2.97684 4.1405
15 0.257885 0.691197 1.340606 1.75305 2.13145 2.60248 2.94671 4.0728
16 0.257599 0.690132 1.336757 1.745884 2.11991 2.58349 2.92078 4.015
17 0.257347 0.689195 1.333379 1.739607 2.10982 2.56693 2.89823 3.9651
18 0.257123 0.688364 1.330391 1.734064 2.10092 2.55238 2.87844 3.9216
19 0.256923 0.687621 1.327728 1.729133 2.09302 2.53948 2.86093 3.8834
20 0.256743 0.686954 1.325341 1.724718 2.08596 2.52798 2.84534 3.8495
21 0.25658 0.686352 1.323188 1.720743 2.07961 2.51765 2.83136 3.8193
22 0.256432 0.685805 1.321237 1.717144 2.07387 2.50832 2.81876 3.7921
23 0.256297 0.685306 1.31946 1.713872 2.06866 2.49987 2.80734 3.7676
24 0.256173 0.68485 1.317836 1.710882 2.0639 2.49216 2.79694 3.7454
25 0.25606 0.68443 1.316345 1.708141 2.05954 2.48511 2.78744 3.7251
26 0.255955 0.684043 1.314972 1.705618 2.05553 2.47863 2.77871 3.7066
27 0.255858 0.683685 1.313703 1.703288 2.05183 2.47266 2.77068 3.6896
28 0.255768 0.683353 1.312527 1.701131 2.04841 2.46714 2.76326 3.6739
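Testing a Single Population Proportion:
Large sample (n > 30)
Test statistic: $z_{test} = \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}}$
Small sample (n < 30)
Test statistic: $t_{test} = \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}}$
where p = sample proportion and π = hypothesised population proportion.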
Tests Involving Two Sample Means
$z_{test} = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
Example
A union representing workers at a large industrial concern accused management of paying discriminatory wages to the workers in two production facilities, A and B. It claimed that workers in facility A were being paid less than those in facility B. The company investigated the claim by examining the pay of 70 workers from each production facility. The results were as follows.
Facility A Facility B
Mean salary $455.00 $463.00
Std deviation $10.00 $13.00
What conclusion did the company reach? Investigate at the 5% level of significance.
Solution
H0: $\mu_A = \mu_B$
H1: $\mu_A \ne \mu_B$
→ two-tailed test; $n_A, n_B > 30$ → z test. α = 5% → $z_{crit}$ = ±1.96
$z_{test} = \dfrac{\bar{x}_A - \bar{x}_B}{\sqrt{s_A^2/n_A + s_B^2/n_B}} = \dfrac{455 - 463}{\sqrt{100/70 + 169/70}} = -4.081$
Since $|z_{test}| > |z_{crit}|$, reject H0.
→ Sufficient statistical evidence to suggest a significant difference in the salaries.
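The test statistic for this example can be recomputed with a short Python sketch. The function name `two_sample_z` is illustrative, and the critical value is taken from the standard normal inverse CDF instead of a printed table. It follows the two-tailed set-up used in the solution.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z statistic for H0: mu1 = mu2, using the sample standard deviations."""
    return (mean1 - mean2) / sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

# Facility A versus facility B, 70 workers sampled from each.
z_test = two_sample_z(455.0, 10.0, 70, 463.0, 13.0, 70)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
reject_h0 = abs(z_test) > z_crit

print(round(z_test, 3), round(z_crit, 2), reject_h0)
```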
Tests Involving Two Sample Proportions
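$z_{test} = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}\,q\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
where the pooled proportion is $\bar{p} = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$ and $q = 1 - \bar{p}$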
Example
Surveys were conducted in two major cities “A” and “B” to ascertain viewer habits regarding a popular television channel. In city “A”, 1000 people were interviewed and 680 said they viewed the channel. In city “B”, 600 people were interviewed and 444 said they viewed the channel. Investigate, at the 5% level of significance, whether there is a significant difference between the viewing habits in the two cities.
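H0: $\pi_A = \pi_B$
H1: $\pi_A \ne \pi_B$
→ two-tailed test; α = 5% → $z_{crit}$ = ±1.96
$\bar{p} = \dfrac{n_A p_A + n_B p_B}{n_A + n_B} = \dfrac{680 + 444}{1000 + 600} = 0.7025$
$q = 1 - \bar{p} = 0.2975$
$z_{test} = \dfrac{p_A - p_B}{\sqrt{\bar{p}\,q\,(1/n_A + 1/n_B)}} = \dfrac{680/1000 - 444/600}{\sqrt{0.7025 \times 0.2975 \times (1/1000 + 1/600)}} = -2.54$
Since $|z_{test}| > |z_{crit}|$, reject H0 at the 5% level of significance.
→ Sufficient statistical evidence to suggest a significant difference in the viewing habits.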
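A minimal Python sketch of the pooled two-proportion z test used in this example follows. The function name is illustrative, and it takes the raw counts instead of the proportions so that the pooled estimate is computed without intermediate rounding.

```python
from math import sqrt

def two_proportion_z(x1: int, n1: int, x2: int, n2: int):
    """Return (z, pooled proportion) for H0: the two population proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    q = 1 - pooled
    z = (p1 - p2) / sqrt(pooled * q * (1 / n1 + 1 / n2))
    return z, pooled

# City A: 680 of 1000 viewers; city B: 444 of 600 viewers.
z_test, pooled = two_proportion_z(680, 1000, 444, 600)
reject_h0 = abs(z_test) > 1.96  # two-tailed critical value at the 5% level

print(round(pooled, 4), round(z_test, 2), reject_h0)
```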
Chi-square Applications
Major Characteristics:
• positively skewed
• non-negative
• family of chi-square distributions
H0: There is no difference between the observed and expected frequencies.
H1: There is a difference between the observed and the expected frequencies.
Test statistic: $\chi^2_{stat} = \sum \dfrac{(f_o - f_e)^2}{f_e}$
where $f_o$ = observed frequency and $f_e$ = expected frequency.
The critical value is a chi-square value with (k-1) degrees of freedom, where k is the number of categories
Right tail areas for the Chi-square Distribution
df\area 0.995 0.99 0.975 0.95 0.90 0.75 0.5 0.25 0.10 0.05 0.025 0.01 0.005
1 0.00004 0.00016 0.00098 0.00393 0.01579 0.10153 0.45494 1.3233 2.70554 3.84146 5.02389 6.6349 7.87944
2 0.01003 0.0201 0.05064 0.10259 0.21072 0.57536 1.38629 2.77259 4.60517 5.99146 7.37776 9.21034 10.5966
3 0.07172 0.11483 0.2158 0.35185 0.58437 1.21253 2.36597 4.10834 6.25139 7.81473 9.3484 11.3449 12.8382
4 0.20699 0.29711 0.48442 0.71072 1.06362 1.92256 3.35669 5.38527 7.77944 9.48773 11.1433 13.2767 14.8603
5 0.41174 0.5543 0.83121 1.14548 1.61031 2.6746 4.35146 6.62568 9.23636 11.0705 12.8325 15.0863 16.7496
6 0.67573 0.87209 1.23734 1.63538 2.20413 3.4546 5.34812 7.8408 10.6446 12.5916 14.4494 16.8119 18.5476
7 0.98926 1.23904 1.68987 2.16735 2.83311 4.25485 6.34581 9.03715 12.017 14.0671 16.0128 18.4753 20.2777
8 1.34441 1.6465 2.17973 2.73264 3.48954 5.07064 7.34412 10.2189 13.3616 15.5073 17.5346 20.0902 21.955
9 1.73493 2.0879 2.70039 3.32511 4.16816 5.89883 8.34283 11.3888 14.6837 16.919 19.0228 21.666 23.5894
10 2.15586 2.55821 3.24697 3.9403 4.86518 6.7372 9.34182 12.5489 15.9872 18.307 20.4832 23.2093 25.1882
11 2.60322 3.05348 3.81575 4.57481 5.57778 7.58414 10.341 13.7007 17.275 19.6751 21.9201 24.725 26.7569
12 3.07382 3.57057 4.40379 5.22603 6.3038 8.43842 11.3403 14.8454 18.5494 21.0261 23.3367 26.217 28.2995
13 3.56503 4.10692 5.00875 5.89186 7.0415 9.29907 12.3398 15.9839 19.8119 22.362 24.7356 27.6883 29.8195
14 4.07467 4.66043 5.62873 6.57063 7.78953 10.1653 13.3393 17.1169 21.0641 23.6848 26.119 29.1412 31.3194
15 4.60092 5.22935 6.26214 7.26094 8.54676 11.0365 14.3389 18.2451 22.3071 24.9958 27.4884 30.5779 32.8013
16 5.14221 5.81221 6.90766 7.96165 9.31224 11.9122 15.3385 19.3689 23.5418 26.2962 28.8454 31.9999 34.2672
17 5.69722 6.40776 7.56419 8.67176 10.0852 12.7919 16.3382 20.4887 24.769 27.5871 30.191 33.4087 35.7185
18 6.2648 7.01491 8.23075 9.39046 10.8649 13.6753 17.3379 21.6049 25.9894 28.8693 31.5264 34.8053 37.1565
19 6.84397 7.63273 8.90652 10.117 11.6509 14.562 18.3377 22.7178 27.2036 30.1435 32.8523 36.1909 38.5823
20 7.43384 8.2604 9.59078 10.8508 12.4426 15.4518 19.3374 23.8277 28.412 31.4104 34.1696 37.5662 39.9969
Example
A certain drug is claimed to be effective in curing the common cold. In a clinical trial involving 500 patients having the common cold, 250 were given the drug and the rest were given sugar pills. The patients’ reactions to the treatment are recorded in the table below.
Helped Harmed No Effect Total
Drug 150 30 70 250
Sugar Pills 130 40 80 250
Total 280 70 150 500
On the basis of the above data, can it be concluded, at the 5% significance level, that there is a significant difference in the effect of the drug and sugar pills?
H0: No significant difference in effect of drug and sugar pills.
H1: There is a significant difference in effect of drug and sugar pills.
α = 0.05, df = (2-1)(3-1) = 2 → $\chi^2_{crit} = 5.991$
Each expected frequency is (row total × column total)/grand total, e.g. 250 × 280/500 = 140.
fo fe fo - fe (fo - fe)²/fe
150 140 10 0.7143
30 35 -5 0.7143
70 75 -5 0.3333
130 140 -10 0.7143
40 35 5 0.7143
80 75 5 0.3333
∑ = 3.524
$\chi^2_{calc} = 3.524 < \chi^2_{crit} = 5.991$
Hence do not reject H0 at α = 0.05.
→ insufficient statistical evidence to suggest that there is a significant difference between drug and sugar pills.
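The chi-square statistic for this contingency table can be recomputed with the plain-Python sketch below. Variable names are illustrative. Expected frequencies are derived from the row and column totals, and the critical value 5.991 is simply copied from the chi-square table for df = 2 and a right-tail area of 0.05.

```python
# Observed frequencies: rows = Drug, Sugar Pills; columns = Helped, Harmed, No Effect.
observed = [
    [150, 30, 70],
    [130, 40, 80],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency for each cell: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (fo - fe)^2 / fe over all cells.
chi2 = sum(
    (fo - fe) ** 2 / fe
    for obs_row, exp_row in zip(observed, expected)
    for fo, fe in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)
chi2_crit = 5.991  # table value for df = 2, right-tail area 0.05
reject_h0 = chi2 > chi2_crit

print(expected, round(chi2, 3), df, reject_h0)
```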
LINEAR REGRESSION AND CORRELATION
• Correlation analysis
• Scatterplot
• Correlation coefficient
• Dependent and independent variables
• The coefficient of determination
• Linear regression equation
Correlation Coefficient Formula:
$r = \dfrac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$
The coefficient of determination = r²
The regression equation: Y' = a + bX
Y' = average predicted value of Y for any X.
a = Y-intercept = estimated Y value when X = 0
b = slope of the line.
$b = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$
$a = \dfrac{\sum y}{n} - b\,\dfrac{\sum x}{n}$
Example
The following data relates to the training periods and average weekly sales of seven randomly selected salesmen in a large company.
Salesman Training (hours) Ave weekly sales ($’000)
A 20 44
B 5 22
C 10 35
D 13 32
E 12 27
F 8 26
G 15 35
1. Calculate the correlation coefficient. Comment on the value obtained.
2. Determine the coefficient of determination and interpret the value obtained.
3. Assuming a linear relation between the variables in the given data, obtain the regression equation connecting the variables.
4. Estimate the weekly sales of a salesman who had 22h of training. Is the result reliable? Explain.
Solution
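1. Let x denote training period (in hours) and let y denote sales (in $’000)
x y x² y² xy
20 44 400 1936 880
5 22 25 484 110
10 35 100 1225 350
13 32 169 1024 416
12 27 144 729 324
8 26 64 676 208
15 35 225 1225 525
∑: 83 221 1127 7299 2813
$r = \dfrac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} = \dfrac{7 \times 2813 - 83 \times 221}{\sqrt{(7 \times 1127 - 83^2)(7 \times 7299 - 221^2)}} = 0.9$
→ strong positive linear relationship between x and y
2. r² = 0.81 → 81% of the variation in y is explained by variation in x. The remaining 19% is due to other factors.
3.
$b = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \dfrac{7 \times 2813 - 83 \times 221}{7 \times 1127 - 83^2} = 1.35$
$a = \dfrac{\sum y}{n} - b\,\dfrac{\sum x}{n} = 221/7 - 1.35 \times 83/7 = 15.56$ → y = 15.56 + 1.35x
4. When x = 22 hours, y = 15.56 + 1.35 × 22 = 45.3 × $1000 = $45300
No. The regression equation is valid only in the domain 5 ≤ x ≤ 20, and x = 22 hours lies outside the range of the observed data.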
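The sums, correlation coefficient and regression coefficients for this example can be reproduced with a short Python sketch. Names are illustrative. Because the script carries the unrounded slope (1.348) forward, the intercept and the prediction at x = 22 differ from the hand calculation in the last digit.

```python
from math import sqrt

# Training hours (x) and average weekly sales in $'000 (y) for salesmen A to G.
x = [20, 5, 10, 13, 12, 8, 15]
y = [44, 22, 35, 32, 27, 26, 35]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xx = sum(v * v for v in x)
sum_yy = sum(v * v for v in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# Pearson correlation coefficient and coefficient of determination.
numerator = n * sum_xy - sum_x * sum_y
r = numerator / sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
r_squared = r ** 2

# Least-squares slope and intercept of y = a + b * x.
b = numerator / (n * sum_xx - sum_x ** 2)
a = sum_y / n - b * sum_x / n

# Extrapolated estimate at 22 hours (outside the observed 5 to 20 hour range).
estimate_22 = a + b * 22

print(round(r, 2), round(r_squared, 2), round(b, 2), round(a, 2), round(estimate_22, 1))
```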
TIME SERIES AND FORECASTING
Components
• The Secular Trend (T)
• The Cyclical Variation (C)
• The Seasonal Variation (S)
• The Irregular Variation (I)
Multiplicative Model: Y = T × C × S × I
The linear trend equation:
T = a + bt
Seasonal Indices
• Moving average
• Centred moving average
• Ratio to centred moving average
• Adjusted seasonal average
• Deseasonalizing a series
Example
The following table gives the quarterly healthcare claims (in R millions) against all healthcare claims for the period 2008 to 2010.
Year Q1 Q2 Q3 Q4
2008 14.0 15.6 21.5 18.3
2009 13.1 14.7 24.8 19.4
2010 14.4 17.3 25.6 15.8
1. Represent the above data in a time series plot.
2. Calculate the quarterly seasonal indices for healthcare claims using the ratio-to-moving-average method. Interpret the results.
3. Derive a trend line using the method of least squares.
4. Estimate the seasonally-adjusted trend value of healthcare claims for the third quarter of 2011.
1.
[Time series plot: Quarterly Healthcare Claims (in Rm) for the period 2008 - 2010, plotted by quarter from Q1 2008 to Q4 2010]
2.
Season Data (Rm) 4MA (Rm) Centred 4MA (Rm) Unadj. SI (%)
2008 Q1 14.0 - - -
Q2 15.6 - - -
Q3 21.5 17.350 17.238 124.7
Q4 18.3 17.125 17.013 107.6
2009 Q1 13.1 16.900 17.313 75.7
Q2 14.7 17.725 17.863 82.3
Q3 24.8 18.000 18.163 136.5
Q4 19.4 18.325 18.650 104.0
2010 Q1 14.4 18.975 19.075 75.5
Q2 17.3 19.175 18.725 92.4
Q3 25.6 18.275 - -
Q4 15.8 - - -
Year Q1 Q2 Q3 Q4
2008 - - 124.7 107.6
2009 75.7 82.3 136.6 104.0
2010 75.6 92.4 - -
Mean SI 75.7 87.4 130.7 105.8
Adj. SI 75.7 87.5 130.9 106.0
The annual seasonal influences are as follows:
Q1: substantial decrease of 24.3%
Q2: decrease of 12.5%
Q3: substantial increase of 30.9%
Q4: increase of 6.0%
3.
t T t² tT
1 14.0 1 14.0
2 15.6 4 31.2
3 21.5 9 64.5
4 18.3 16 73.2
5 13.1 25 65.5
6 14.7 36 88.2
7 24.8 49 173.6
8 19.4 64 155.2
9 14.4 81 129.6
10 17.3 100 173.0
11 25.6 121 281.6
12 15.8 144 189.6
∑ = 78 ∑ = 214.5 ∑ = 650 ∑ = 1439.2
$b = \dfrac{12 \times 1439.2 - 78 \times 214.5}{12 \times 650 - 78^2} = 0.31$, $a = 214.5/12 - 0.31 \times 78/12 = 15.9$
T(t) = 15.9 + 0.31t
4.
Adj. Estimate for Q3 of 2011 (t = 15):
Y(2011, Q3) = T(15) × 1.309 = (15.9 + 0.31 × 15) × 1.309 = 26.9 ≡ R26.9m
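The ratio-to-moving-average steps and the least-squares trend for this example can be sketched in Python as follows. Names are illustrative. The script works with unrounded intermediate values, so individual indices may differ from the tabulated ones in the last digit, while the Q3 2011 estimate still rounds to R26.9m.

```python
# Quarterly claims (Rm), Q1 2008 to Q4 2010.
claims = [14.0, 15.6, 21.5, 18.3, 13.1, 14.7, 24.8, 19.4, 14.4, 17.3, 25.6, 15.8]

# 4-quarter moving averages, then centre them by averaging adjacent pairs.
ma4 = [sum(claims[i:i + 4]) / 4 for i in range(len(claims) - 3)]
centred = [(ma4[i] + ma4[i + 1]) / 2 for i in range(len(ma4) - 1)]

# centred[i] lines up with claims[i + 2]; collect the ratios (in %) by quarter.
ratios = {quarter: [] for quarter in range(4)}
for i, cma in enumerate(centred):
    idx = i + 2
    ratios[idx % 4].append(100 * claims[idx] / cma)

# Mean seasonal index per quarter, rescaled so the four indices sum to 400.
mean_si = [sum(ratios[q]) / len(ratios[q]) for q in range(4)]
scale = 400 / sum(mean_si)
adj_si = [m * scale for m in mean_si]

# Least-squares linear trend T = a + b * t, with t = 1..12.
n = len(claims)
t = list(range(1, n + 1))
sum_t, sum_y = sum(t), sum(claims)
sum_tt = sum(k * k for k in t)
sum_ty = sum(k * v for k, v in zip(t, claims))
b = (n * sum_ty - sum_t * sum_y) / (n * sum_tt - sum_t ** 2)
a = sum_y / n - b * sum_t / n

# Q3 2011 is period t = 15; apply the Q3 seasonal index to the trend value.
forecast_q3_2011 = (a + b * 15) * adj_si[2] / 100

print([round(v, 1) for v in adj_si], round(a, 2), round(b, 2), round(forecast_q3_2011, 1))
```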
STATISTICAL DECISION THEORY
Components to Decision-Making Situation
• Decision alternatives or acts
• States of nature
• Payoffs
Decision Making Without Probabilities
• Maximax Strategy
• Maximin Strategy
• Minimax Regret Strategy
Decision Making with Probabilities
• Expected Payoff or Expected Monetary Value (EMV)
• Payoff table
Decision Trees
• Decision nodes
• Event nodes
• Tree Structure
• EMV calculations
Example
A large corporation arranged to use an ocean liner as a floating hotel for its annual convention. The shipping company had to make a decision whether or not to lease the ship. If leased, the company would get a flat fee and an additional percentage of profits from the convention, which could attract as many as 50000 people. The company’s analysts estimated that if the ship were leased there would be a 50% chance of realizing a profit of $700000, a 30% chance of making a profit of $800000, a 15% chance of making a profit of $900000 and a 5% chance of making a profit of $1m. If the ship were not leased, it could be used for its usual voyage over the convention duration. In this case there would be a 90% probability of making a profit of $750000 and a 10% probability that profits would be $780000.
The company has one additional option. If the ship were leased, and it became clear within the first few days of the convention that the profits were going to be in the $700000 range, the company could choose to promote the convention on its own by offering participants discounts on the ocean liner’s cruises. The company’s analysts believe that if this action were chosen there would be a 60% chance that profits would increase to $740000 and a 40% chance that the promotion would fail, lowering profits to $680000.
4.1 Draw a decision tree to depict the above problem.
4.2 What decision should the shipping company take? Show all working.
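4.1 Decision tree (text outline of the figure; A, B and D are event nodes, C is a decision node):
Decision
  Do not lease → A
    0.9 → $750000
    0.1 → $780000
  Lease → B
    0.05 → $1000000
    0.15 → $900000
    0.3 → $800000
    0.5 → profits in the $700000 range → C
      Do not promote → $700000
      Promote → D
        0.6 → $740000
        0.4 → $680000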
4.2
EMV = max[EMV(A), EMV(B)]
EMV(A) = $780000 x 0.1 + $750000 x 0.9 = $753000
EMV(B) = $1000000x0.05 + $900000x0.15 + $800000x0.3 + 0.5xEMV(C) = $425000+0.5xEMV(C)
EMV(C) = max[$700000, EMV(D)]
= max[$700000, $680000x0.4 + $740000x0.6] = $716000 → promote
Hence EMV (B) = $425000 + $716000x0.5 = $783000 → EMV = $783000
Decision: Lease and then promote the convention if profits from lease are in the $700000 range.
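The roll-back of the decision tree can be sketched in Python as below. The helper name `emv` and the node variable names are illustrative. Event nodes take a probability-weighted average, and decision nodes take the maximum over the available branches.

```python
def emv(outcomes):
    """Expected monetary value of an event node given (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in outcomes)

# Event node A: do not lease, run the usual voyage.
emv_a = emv([(0.9, 750_000), (0.1, 780_000)])

# Event node D: promote the convention once profits look like the $700000 range.
emv_d = emv([(0.6, 740_000), (0.4, 680_000)])

# Decision node C: promote (node D) or accept the $700000 outcome.
emv_c = max(700_000, emv_d)
promote = emv_d > 700_000

# Event node B: lease; the 0.5 branch leads into decision node C.
emv_b = emv([(0.05, 1_000_000), (0.15, 900_000), (0.30, 800_000), (0.50, emv_c)])

# Root decision: choose the branch with the larger EMV.
best_act, best_emv = max(
    [("Do not lease", emv_a), ("Lease", emv_b)], key=lambda item: item[1]
)

print(round(emv_a), round(emv_c), round(emv_b), best_act, promote)
```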