MNS 601 COURSE NOTES 1
MNS 601: STATISTICS FOR BUSINESS
COURSE NOTES – WEEK 1 - ONLINE
This is a summary of all the important concepts in the course. The information is
arranged in a specific sequence and divided into class sessions. There are two types of
content:
Presentation of information/concepts – you should study this information
thoroughly in addition to studying the relevant sections in the textbook (which
are identified in the Course Outline)
Examples – these sample problems illustrate the key concepts
* * * * * * * * * * * * * * * * * * Session #1 * * * * * * * * * * * * * * * * * *
INTRODUCTION
Definitions
Population - all the members of a defined group.
Sample - some of the members of a population.
Parameter - any descriptive measure of a characteristic of a population. Expressed using
Greek letters.
Statistic - any descriptive measure of a characteristic of a sample. Expressed using
Roman (Latin) letters.
Variable - a characteristic of a group.
Levels of Measurement
Nominal - grouping objects into classes. No "scale" is implied, e.g., nationality, sex, etc.
Ordinal - measures that indicate an order (i.e., the relationships can be expressed using
>). The scale does not have equal intervals between the points, e.g., the final standings in
a beauty contest, socio-economic status, grades in the military, etc.
Interval - measures that indicate an order and have equal intervals between successive
points on the scale, e.g., temperature in Fahrenheit degrees. The majority of statistical
procedures you will learn in this course apply to interval data.
Ratio - comparable to interval measures but have an absolute “0” point, e.g., Kelvin
temperature scale.
NOTE: Nominal and ordinal measures are often called categorical while interval and
ratio measures are called quantitative or scale.
DESCRIPTIVE STATISTICS
The primary purpose of descriptive statistics is to describe or summarize data so they are
easier to interpret.
Measures of Central Tendency
Mode - the most popular score, found by determining the frequency of each score. Note:
The mode is the only "average" which can be used appropriately to describe nominal
data. It is possible that a set of scores may have no mode, or several.
Example #1: To calculate the mode of the following set of scores:
13, 21, 19, 16, 14, 13, 36, 13, 19, 6, 15, 13, 4, 15, 8
You could construct this table:
SCORE FREQUENCY
13 4
21 1
19 2
16 1
14 1
36 1
6 1
15 2
4 1
8 1
Thus, the mode of these scores is 13.
Median - the middle score, found by listing scores in order and counting to the middle.
For an odd number of observations, it is the middle value. For an even number of
observations, it is the average of the two middle values.
Example #2: To find the median of the same scores as in Example #1:
Arrange them in order: 4, 6, 8, 13, 13, 13, 13, 14, 15, 15, 16, 19, 19, 21, 36
Then, find the position of the middle score using the formula:
$$\frac{N + 1}{2} = \frac{15 + 1}{2} = 8$$
Thus, the 8th score, which is 14, is the median.
Arithmetic Mean - the "average". The mean is affected by the value of every score in
the group.
Sample Mean = $\bar{x}$ (x-bar):
$$\bar{x} = \frac{\text{sum of scores}}{\text{number of scores}} = \frac{\sum x}{n}$$
Where:
∑ = the sum of
x = individual scores
n = number of scores in the sample
Population Mean = µ (mu):
$$\mu = \frac{\text{sum of scores}}{\text{number of scores}} = \frac{\sum x}{N}$$
Where:
∑ = the sum of
x = individual scores
N = number of scores in the population
Example #3: To calculate the arithmetic mean of the same scores as in Example #1, add
the scores (i.e., calculate ∑x = 225), then count the number of scores (n = 15), and divide:
$$\bar{x} = \frac{\sum x}{n} = \frac{225}{15} = 15$$
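The three measures of central tendency can be checked with a few lines of Python. This is only a sketch using the standard library and the scores from Example #1; the variable names are illustrative and not part of the course materials.

```python
from collections import Counter
from statistics import mean, median

scores = [13, 21, 19, 16, 14, 13, 36, 13, 19, 6, 15, 13, 4, 15, 8]

# Mode: tally the frequency of each score, then keep every score tied for the top count
# (a data set may have several modes).
frequency = Counter(scores)
top_count = max(frequency.values())
modes = sorted(score for score, count in frequency.items() if count == top_count)

# Median: the middle value of the ordered scores (position (N + 1) / 2 when N is odd).
middle_score = median(scores)

# Arithmetic mean: the sum of the scores divided by the number of scores.
average = mean(scores)

print(modes, middle_score, average)
```

The printed values should agree with Examples #1 to #3 (mode 13, median 14, mean 15).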
Weighted Arithmetic Mean - the mean of a set of scores in which the scores have
different weights.
$$\bar{x}_w = \frac{\sum wx}{\sum w}$$
Where "w" = the weight assigned to each value of x.
NOTE: The variable you are averaging is denoted by “x.”
Example #4: Suppose your test grades for a course were 85%, 75%, 69% and 83%, the first
3 tests were each worth 20% of your course grade, and the final test was worth 40%. Your course
grade would be a weighted mean:
Test #    Grade (x)    Weight (w)    wx
1         85%          .20           17.0
2         75%          .20           15.0
3         69%          .20           13.8
4         83%          .40           33.2
Total                  1.00          79.0
$$\bar{x}_w = \frac{\sum wx}{\sum w} = \frac{79.0}{1.00} = 79\%$$
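As a quick check of Example #4, the weighted mean can be computed directly from the grades and weights. This is a minimal sketch; the function name `weighted_mean` is an illustrative choice, not something defined in the course.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    total_weighted = sum(w * x for x, w in zip(values, weights))
    return total_weighted / sum(weights)

grades = [85, 75, 69, 83]            # test grades in percent
weights = [0.20, 0.20, 0.20, 0.40]   # share of the course grade for each test

course_grade = weighted_mean(grades, weights)
print(round(course_grade, 1))
```

Dividing by the sum of the weights means the weights do not have to add up to 1.00; they only have to be in the right proportion to one another.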
Measures of Dispersion
Range - the difference between the largest and smallest scores, or, the largest and
smallest scores themselves. Note: the range tells you nothing about the scores between
the largest and smallest scores.
Example #5: The range of the scores (same scores as in Example #1)
13, 21, 19, 16, 14, 13, 36, 13, 19, 6, 15, 13, 4, 15, 8
can be expressed as 32 (i.e., 36 – 4), or as "4 to 36."
Standard Deviation - a statistic that shows how much the scores are spread out around
the mean. The larger the standard deviation, the more spread out are the scores.
s = standard deviation of a sample
sigma (σ) = standard deviation of a population
(Deviation - the distance of a score from the mean for the group. For example, if
the mean of a set of scores is 10, the deviation of a value of 15 is +5, and the
deviation of a score of 6 is -4.)
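Every score in a group of scores has a deviation score. Conceptually, the standard deviation
is a kind of average size of the deviation scores for a group. (The deviations are squared
before they are averaged, because the raw deviations always sum to zero.)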
For a population:
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$
For a sample:
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
Example #6: A consumer group studying the cost of various food items in San Diego
grocery stores recorded the following prices for a food item in seven different stores: 79
cents, 80 cents, 67 cents, 99 cents, 76 cents, 83 cents, and 84 cents. The standard
deviation of these prices can be calculated using the formula:
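$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
x      x-bar    x - x-bar    (x - x-bar)²
79     81       -2           4
80     81       -1           1
67     81       -14          196
99     81       18           324
76     81       -5           25
83     81       2            4
84     81       3            9
Sum (∑) of the squared deviations = 563
(The mean, 568 ÷ 7 ≈ 81.14, is rounded to 81 in this table.)
$$s = \sqrt{\frac{563}{7 - 1}} = \sqrt{93.83} \approx 9.7 \text{ cents}$$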
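The calculation in Example #6 can be reproduced in Python. The sketch below applies the sample formula step by step and then compares the result with the standard library's `statistics.stdev`; it uses the unrounded mean, so intermediate values differ slightly from the hand calculation while the rounded answer should be the same.

```python
from math import sqrt
from statistics import stdev

prices = [79, 80, 67, 99, 76, 83, 84]  # cents, from Example #6

n = len(prices)
mean_price = sum(prices) / n  # about 81.14; the hand calculation rounds this to 81

# Sample standard deviation: sum the squared deviations, divide by n - 1, take the square root.
squared_deviations = [(price - mean_price) ** 2 for price in prices]
s = sqrt(sum(squared_deviations) / (n - 1))

# The standard library computes the same sample statistic directly.
s_check = stdev(prices)

print(round(s, 1))
```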
Variance - the arithmetic mean of the squared deviations of the individual scores about
their mean. The variance = the standard deviation squared.
$s^2$ = sample variance
$\sigma^2$ (sigma squared) = population variance
* * * * * * * * * * * * * * * * * * Session #2 * * * * * * * * * * * * * * * * * *
Frequency Distributions and Graphs
Frequency Distribution – tabular summary of data showing the number (frequency) of
items in each of several non-overlapping classes.
Frequency distributions can be ungrouped or grouped.
For categorical data you may simply determine the frequency for each category
(ungrouped) (pp. 34–35 35-36, Table 2.2, Figure 2.1)
For quantitative/scale data, you should group/bin the scores then find the
frequency of scores falling within each group/class.
The three steps required to define the classes for a frequency distribution with
quantitative data are, to determine the:
1. Number of classes
2. Width of each class
3. Class limits
In setting the class limits you should be sure that:
1. The class intervals are of equal width. To determine the width of the class
interval use the formula below, then round up:
$$\text{Class Width} = \frac{(\text{Largest data value} + \text{one unit}) - \text{Smallest data value}}{\text{Total number of classes}}$$
2. The classes do not overlap (i.e., they are mutually exclusive).
3. You identify a reasonable number of classes.
For example, given the following scores, if we construct a frequency distribution with 3
class intervals, the class width would be: (16 – 1) ÷ 3 = 5, where 16 is the largest score
(15) plus one unit. NOTE: Start the first class using the lowest number in the data set.
12 4 1 8 15 6 2 2 3 5 9 2
Classes (of X) Frequency (f)
1 - 5 7
6 - 10 3
11 - 15 2
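The three steps for building a grouped frequency distribution can be sketched in Python as follows. The data and the choice of 3 classes come from the example above; the variable names and the assumption that one "unit" equals 1 (whole-number scores) are illustrative.

```python
import math

scores = [12, 4, 1, 8, 15, 6, 2, 2, 3, 5, 9, 2]
number_of_classes = 3
unit = 1  # smallest measurement step; assumed to be 1 because the scores are whole numbers

# Class width = ((largest value + one unit) - smallest value) / number of classes, rounded up.
lowest, highest = min(scores), max(scores)
class_width = math.ceil(((highest + unit) - lowest) / number_of_classes)

# Build equal-width, non-overlapping classes, starting at the lowest score.
classes = []
for i in range(number_of_classes):
    lower = lowest + i * class_width
    upper = lower + class_width - unit
    frequency = sum(lower <= score <= upper for score in scores)
    classes.append((lower, upper, frequency))

for lower, upper, frequency in classes:
    print(f"{lower} - {upper}: {frequency}")
```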
Histogram - The base of each bar/column equals one class interval and the height of
each one equals the frequency. The mid-point of each bar is the mid-point of a class
interval. A histogram is one of the easiest graphs to interpret when one variable is being
presented.
NOTE: Bar charts vs. histograms: A bar chart is used when the independent (x) variable
is categorical, which means the bars are separated. A histogram is used when the
independent (x) variable is scale, which means the bars are not separated.
[Figure: histogram of the frequency table above.]
Example #7: Draw a histogram of the following frequency distribution.
Range of Scores on a        Midpoint            Number of
Final Exam (as a %)         (shown on graph)    Students
52.5 – 57.4                 55                  1
57.5 – 62.4                 60                  3
62.5 – 67.4                 65                  2
67.5 – 72.4                 70                  2
72.5 – 77.4                 75                  5
77.5 – 82.4                 80                  6
82.5 – 87.4                 85                  5
87.5 – 92.4                 90                  3
92.5 – 97.4                 95                  2
Answer: [Figure: histogram of the final exam scores, one bar per class interval.]
Frequency polygon - a "line" graph that connects the midpoints of each class interval (at
the top of each bar). Frequency polygons are particularly useful when more than one
distribution is being represented.
Pie Chart - a circular graph. The steps involved in constructing one are:
1. Add the values for all categories to find the total
2. Calculate the percentages
3. Divide the circle (360 degrees) into segments
For example, the U.S. Federal Budget Expenditures ($2,931B) in 2008 are shown in the
following pie chart (where “Count” is amounts in $B):
PROBABILITY & PROBABILITY DISTRIBUTIONS
Probability - A number between 0 and 1 that represents the likelihood that a particular event
will occur. It is the ratio of successful outcomes to the total number of outcomes
possible, i.e., the relative frequency of successful outcomes. [NOTE: The term “success” in
this context refers to the outcome of interest, not necessarily one that is positive or good.]
$$P(A) = \frac{\text{number of outcomes in which A occurs}}{\text{total number of possible outcomes}}$$
Frequency Distribution - shows the number of values that fall within arbitrary classes or
limits.
Probability Distribution - a systematic arrangement of the probabilities corresponding
to the values of a random variable.
Frequency Distribution: the frequency of X (vertical axis) plotted against the values/classes of X (horizontal axis).
Probability Distribution: the probability of X (vertical axis) plotted against the values of X (horizontal axis).
A frequency distribution is a distribution of actual values. A probability distribution is a
theoretical distribution: It represents what should happen according to the laws of
probability.
Example #8: Construct a frequency distribution and a histogram for the following
experiment. Toss two coins a total of 12 times. For each toss record the number of
heads that appear (i.e., 0, 1 or 2).
The probability distribution for this experiment is:
Outcome (Number of Heads on 2 Coins) Probability
0 0.25
1 0.50
2 0.25
The actual frequency distribution for your experiment will, of course, vary but as
predicted by the probabilities above, it is most likely that the result, 1 head, will occur
more frequently than either 0 or 2 heads. The more trials in the experiment, the more the
frequency distribution will approximate the theoretical probability distribution.
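The two-coin experiment in Example #8 can also be simulated. The sketch below first derives the theoretical probabilities by listing the four equally likely outcomes, then tosses two simulated coins and reports relative frequencies. The function name `simulate`, the seed value, and the trial counts are illustrative assumptions.

```python
import random
from collections import Counter
from itertools import product

# Theoretical distribution: enumerate the 4 equally likely outcomes (HH, HT, TH, TT).
outcomes = list(product("HT", repeat=2))
head_counts = Counter(outcome.count("H") for outcome in outcomes)
probability = {heads: head_counts[heads] / len(outcomes) for heads in (0, 1, 2)}

def simulate(trials, seed=601):
    """Toss two fair coins `trials` times; return the relative frequency of 0, 1, 2 heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    tally = Counter(sum(rng.random() < 0.5 for _ in range(2)) for _ in range(trials))
    return {heads: tally[heads] / trials for heads in (0, 1, 2)}

print(probability)
print(simulate(12))        # a small experiment can differ noticeably from theory
print(simulate(100_000))   # a large experiment should be close to 0.25, 0.50, 0.25
```

With only 12 tosses the relative frequencies can stray well away from the theoretical values; with many tosses they settle close to them.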
Comparison of Frequency Distribution (Actual)
With Its Probability Distribution (Theoretical)
Recap of 72 Hours at the Crap Table. Shooters: 1829. Total Rolls: 14,967
Point           2     3     4      5      6      7      8      9      10     11    12
Actual Rolls    401   853   1236   1668   2008   2574   2081   1601   1253   875   417
Theory Rolls    416   832   1248   1664   2080   2496   2080   1664   1248   832   416
From: Mickelson, B. 72 Hours at the Crap Table. Las Vegas: GBC Press, 1978.
In the above table, “Actual Rolls” indicates the actual frequency of occurrence for each
point value in 14,967 rolls of the dice. “Theory Rolls” indicates the frequency of
occurrence predicted by probability formulas. Comparing “Actual” with “Theory” you
can see that the values are very close, i.e., that the Actual results very closely
approximate the Theoretical prediction based on the statistical probability formulas.
Discrete Variable - the several possible values differ by clearly defined steps (e.g., the
number of automobiles sold per day). The probability distribution for such a variable has a
“stair-step” appearance since only certain values of X are possible.
Continuous Variable - all values within a range are possible. The probability distribution
for such a variable has the appearance of a smooth curve.
The Binomial Probability Function
It is a discrete probability distribution. It is used to determine the probability of obtaining
exactly x successes in n trials of an experiment. In order to use this formula, four
assumptions (conditions) must be met. These are found at the top of p. 208 242 in the
textbook under “Properties of a Binomial Experiment.”
The formula is found on p. 212 244 in the textbook.
$$P(x \text{ out of } n) = f(x) = \frac{n!}{x!\,(n - x)!}\; p^x (1 - p)^{(n - x)}$$
where: n = the number of trials in the experiment
x = the number of successes
p = the probability of a success on one trial
1 – p = the probability of failure on one trial
f(x) = the probability of x successes in n trials
NOTE: ! = factorial, which indicates you are to multiply the number by all the
preceding integers counting backwards to 1. Thus, 5! = 5 x 4 x 3 x 2 x 1 = 120
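Example #12: The probability of tossing a coin 10 times and getting exactly 7 heads
(n = 10, x = 7, p = 0.5) is:
$$P(7 \text{ out of } 10) = \frac{n!}{x!\,(n - x)!}\; p^x (1 - p)^{(n - x)} = \frac{10!}{7!\,3!}\,(0.5)^7 (0.5)^3 = 120 \times 0.0009765625 \approx 0.117$$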
The binomial probability function defines a series of binomial probability distributions,
one for every possible combination of values of n and probability.
NOTE: By definition: $x^0 = 1$ and $0! = 1$
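The binomial probability function translates directly into Python. This is a minimal sketch: `math.comb(n, x)` supplies the n! / (x! (n – x)!) term, and the function name `binomial_probability` is an illustrative choice.

```python
from math import comb

def binomial_probability(x, n, p):
    """Probability of exactly x successes in n independent trials, each with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Exactly 7 heads in 10 tosses of a fair coin.
p_seven_heads = binomial_probability(7, 10, 0.5)

# One full binomial distribution: the probability of every possible number of heads.
distribution = [binomial_probability(x, 10, 0.5) for x in range(11)]

print(round(p_seven_heads, 4))
print(round(sum(distribution), 4))  # the probabilities of all outcomes add up to 1
```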
Binomial Probability Table - A table of binomial probabilities for various values of n,
x, and probability. See Appendix B, Table 5, pp. 989-997 985-993.
Areas Under Any Probability Distribution
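[Figure: a probability distribution curve, with Y on the vertical axis and X on the
horizontal axis. The area under the curve over a given range of X is equal to the
probability of finding a value that falls within that range.]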
The total area under a probability distribution = 1
MNS 601: STATISTICS FOR BUSINESS
COURSE NOTES – WEEK 2 - ONLINE
* * * * * * * * * * * * * * * * * * Session #3 * * * * * * * * * * * * * * * * * *
The Normal Curve – Continuous Probability Distributions
The Normal Curve – [pp. 239-245 276-278] A continuous, symmetrical probability
curve that is defined by the "population mean" (µ) and the "population standard deviation"
(σ, sigma). The normal curve is important because it is a fairly accurate representation of
the frequency distribution of scores on a large number of variables, i.e., lots of variables
are normally distributed.
The probability that a normally distributed random variable X will fall in a given range is
equal to the area under the normal curve for that range.
Standard Normal Distribution - has population mean (µ) = 0 and population standard deviation (σ) = 1.
Use the table on the inside front cover of the textbook to find areas (i.e., probabilities)
under this distribution.
Nonstandard Normal Distribution - convert the value x to a standard score (Z) using the
formula [p. 245 282]:
$$Z = \frac{x - \mu}{\sigma}$$
where σ (sigma) is the population standard deviation.
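A short Python sketch shows how a value from a nonstandard normal distribution is converted to Z and then to an area. The numbers (µ = 100, σ = 15, x = 115) are assumed purely for illustration, and `statistics.NormalDist` stands in for the printed table of areas.

```python
from statistics import NormalDist

mu = 100     # assumed population mean (illustrative)
sigma = 15   # assumed population standard deviation (illustrative)
x = 115      # value of interest (illustrative)

# Standardize: how many standard deviations x lies from the mean.
z = (x - mu) / sigma

# Areas under the standard normal curve (mean 0, standard deviation 1).
standard_normal = NormalDist(mu=0, sigma=1)
area_below = standard_normal.cdf(z)   # P(X < 115)
area_above = 1 - area_below           # P(X > 115)

print(z, round(area_below, 4), round(area_above, 4))
```

Note that printed tables differ in which area they report (for example, from the mean to Z rather than everything below Z), so compare with the table's own description before matching numbers.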
The Central Limit Theorem
If you want to estimate the mean (µ) of a population you can make a good estimate by
examining the means of samples taken from that population. [p. 281 319-320]
The Distribution of Sample Means
If a population has a mean (µ) and you take many samples from the population, calculate
the mean for each sample, and calculate the mean of those sample means, then, that mean
will approach µ as the sample size increases.
or,
The mean of a distribution of sample means is equal to the population mean, µ .
The standard deviation of a distribution of sample means approaches
$$\frac{\sigma}{\sqrt{n}}$$
as the sample size increases (where σ = population standard deviation and n = sample size).
The standard deviation of sample means = the standard error of the mean (SEM):
$$\text{SEM} = \frac{\sigma}{\sqrt{n}}$$
Sampling Error - the difference between the sample value and the parameter.
Sample Mean + Sampling Error = µ
Since the standard error of the mean is a measure of how much, on the average, the
sample mean varies from the population mean, it is a measure of sampling error.
The smaller the sampling error, the more accurate the estimate of the parameter.
The size of the standard error of the mean can be reduced by increasing the sample size.
If you don't know the standard deviation of the population, you can use an alternate
formula, based on the sample standard deviation (s), to calculate an estimate of the
standard error of the mean:
$$\text{estimated SEM} = \frac{s}{\sqrt{n}}$$
Thus:
$$Z = \frac{\bar{x} - \mu}{\text{SEM}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
where $\bar{x}$ is the sample mean and µ is the mean of the distribution of sample means.
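To see how the standard error of the mean shrinks as the sample size grows, consider the sketch below. The population standard deviation of 15 and the sample sizes are assumed values chosen only for illustration.

```python
from math import sqrt

def standard_error(std_dev, n):
    """Standard error of the mean: the standard deviation divided by the square root of n."""
    return std_dev / sqrt(n)

sigma = 15  # assumed population standard deviation (illustrative)

# Quadrupling the sample size cuts the standard error in half.
sem_by_sample_size = {n: standard_error(sigma, n) for n in (25, 100, 400)}

for n, sem in sem_by_sample_size.items():
    print(n, sem)
```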
STATISTICS FOR BUSINESS
COURSE NOTES – WEEK 3
ESTIMATES
Point Estimate - a single value from a sample used to estimate the value in the
population (i.e., a parameter).
Interval Estimate - An estimate of a parameter in terms of the probability that it will
occur within an interval.
Interval Estimates for Means - The distribution of sample means can be considered
normal, therefore, we can use the table on the back inside cover of the textbook to
determine the probability that the population mean (µ) will fall in a given interval. And,
by stating the probability that µ will fall in a specific interval, we are identifying a
confidence interval.
Alpha (α) is the area under the probability distribution that is outside (above and below)
the confidence limits, i.e., it is the total “shaded” area. It is equal to 1.00 minus the size
of the confidence interval, thus a 95% confidence interval has an alpha value of 5% (or,
0.05).
Confidence Interval    α      Z
90%                    .10    ± 1.645
95%                    .05    ± 1.96
99%                    .01    ± 2.575
If the population standard deviation (σ) is known (i.e., it has been calculated directly,
or a reliable estimate is available), use this formula:
$$\bar{x} - Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
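The known-σ interval can be sketched in Python as shown below. The inputs (sample mean 101.8, σ = 15, n = 100) are illustrative assumptions, the function name is hypothetical, and `NormalDist().inv_cdf` is used in place of looking up Z in the table.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for the population mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # e.g. about 1.96 for 95% confidence
    margin = z * sigma / sqrt(n)              # Z times the standard error of the mean
    return sample_mean - margin, sample_mean + margin

# Illustrative inputs (assumed values, not a worked example from the notes).
lower, upper = z_confidence_interval(sample_mean=101.8, sigma=15, n=100, confidence=0.95)

print(round(lower, 2), round(upper, 2))
```

A higher confidence level gives a larger Z and therefore a wider interval; a larger sample gives a narrower one.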
If the population standard deviation (σ) is unknown (and, therefore, must be estimated
using s), use this formula:
$$\bar{x} - t_{\alpha/2}\,\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$
The “t” distribution is actually a collection of distributions [p. 316 354], the
shape of each one depending on the “degrees of freedom (df),” which is
dependent on “n.” In confidence interval problems, df = n – 1.
Critical values for the “t” distribution are found in the t table in your
textbook [pp. 980-982 976-978].
SAMPLING
Simple Random Sampling - The selection of items from a universe in such a way that
each item in the universe has an equal probability of being selected as each sample item
is drawn. A table of random numbers can be used to conduct random sampling.
Sample statistics can only be generalized to the population that the sample represents.
HYPOTHESIS TESTING
Purpose - To use sample data in testing assumptions about the population from which the
sample came.
Key Question: How large must the difference be between the sample mean and the
population mean to be statistically significant? When that difference is significantly
large, it can no longer be attributed to sampling error, rather, there is another reason for
the difference.
You can use the Z formula to find the probability of obtaining a sample mean that is far
(or further) from the hypothesized population mean.
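$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
where $\bar{x}$ is the sample mean, µ is the hypothesized population mean (the mean of the
sample means), σ is the population standard deviation, and n is the sample size.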
Step 1: Formulate the Null and Alternative Hypotheses
Ho = Null Hypothesis = There is no "significant difference" between the value of the
sample statistic and the value of the population parameter,
or,
Any difference that does exist is due to sampling error.
The null hypothesis is the hypothesis that includes the equal sign (=).
[Remember: Sampling error is not the same as sampling bias.]
H1 = Alternative Hypothesis = There is a "significant difference" between the value of the
sample statistic and the value of the population parameter,
or,
The difference cannot be accounted for (explained by) sampling error alone.
Decisions are made on the basis of whether or not the null hypothesis is accepted.
The null hypothesis and the alternative hypothesis are mutually exclusive and collectively
exhaustive.
Example #13: The null hypothesis should include the condition of equality, while the
alternative hypothesis should involve one of the three signs >, <, or ≠. For each of the
following claims, identify Ho and H1 as in the following example:
"The mean I.Q. of physicians is greater than 110."
Ho: µ ≤ 110
H1: µ > 110
a. The mean age of professors is more than 30 years.
Ho: µ ≤ 30 H1: µ > 30
b. The mean I.Q. of criminals is below 100.
Ho: µ ≥ 100 H1: µ < 100
c. The mean I.Q. of college students is at least 100.
Ho: µ ≥ 100 H1: µ < 100
d. The mean monthly maintenance cost of an aircraft is $3,200.
Ho: µ = $3,200 H1: µ ≠ $3,200
e. The mean annual salary of police officers is less than $45,000.
Ho: µ ≥ $45,000 H1: µ < $45,000
Step 2: Determine the Criterion (Significance Level) for Deciding Whether to Accept or
Reject the Null Hypothesis.
alpha = α = the significance level
Since sample statistics are not always reliable measures of population parameters,
they may lead us to an incorrect decision, either a Type I Error (rejecting a true Ho)
or a Type II Error (accepting a false Ho).
NOTE: For each problem you are given to solve, the significance level will be
provided.
Step 3: Select the Appropriate Probability Distribution (Normal or t) and Determine the
Critical Values.
One-Tailed Tests - The case where you are interested in whether the sample
statistic is significantly different from the population parameter in one direction
only.
Two-Tailed Tests - The case where you want to know if the sample statistic is
significantly greater than or less than the population parameter.
Step 4: Using the Sample Data, Compute the Test Statistic.
For example:
$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
Step 5: Compare the Test Statistic With the Critical Value and Either Reject or Accept
the Null Hypothesis.
In addition to “accepting” or “rejecting” the null hypothesis, you must also write
a brief interpretation of the results (a few words, phrase or sentence) that answers
the original question stated in the problem.
HYPOTHESIS TESTS COMPARING THE SAMPLE MEAN
WITH THE POPULATION MEAN,
WHEN THE POPULATION STANDARD DEVIATION IS KNOWN
Example: It has been hypothesized that blondes have above average intelligence. You
know that according to one I.Q. test the population mean (µ) is 100 and the
population standard deviation (σ) is 15. You take a random sample of 100
blondes and administer the test. The sample mean is 101.8. Do the sample
data support the hypothesis that blondes are above average in intelligence?
(Use α = .05)
Start
1. Identify the specific claim or hypothesis to be tested and put it in symbolic form.
     Blondes have above average (100 I.Q.) intelligence: µ > 100
2. Give the symbolic form that must be true when the original claim is false.
     µ ≤ 100
3. Of the two symbolic expressions obtained so far, let the null hypothesis Ho be the one
   that contains the condition of equality; H1 is the other statement.
     Ho: µ ≤ 100
     H1: µ > 100
4. Select the significance level α based on the seriousness of a Type I error. Make α small
   if the consequences of rejecting a true Ho are severe. The values of 0.05 and 0.01 are
   very common.
     α = 0.05
5. What statistic is relevant to this test and what is its sampling distribution?
     The sample mean is the relevant statistic. Sample means can be approximated by a
     normal distribution.
6. Determine the test statistic, the critical region, and the critical value(s). (It helps
   to draw a picture.)
     The sample mean of 101.8 is equivalent to Zcalc = (101.8 – 100) ÷ (15 ÷ √100) = 1.20.
     The critical region consists of all values > Zcrit = 1.645.
     [Figure: normal curve centered at µ = 100, with the critical region to the right of
     Zcrit = 1.645. For the sample mean of 101.8, Zcalc = 1.20.]
7. Reject Ho if the test statistic is in the critical region. Accept Ho if the test
   statistic is not in the critical region.
     Accept Ho
8. Restate this previous decision in simple, non-technical terms.
     There is insufficient evidence to support the claim that blondes have above average
     intelligence.
Stop
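The arithmetic behind this one-tailed Z test can be reproduced in Python. The sketch below uses the figures from the example (µ = 100, σ = 15, n = 100, sample mean 101.8, α = .05); the variable names are illustrative, and `NormalDist` replaces the printed table.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 100           # hypothesized population mean
sigma = 15           # known population standard deviation
n = 100              # sample size
sample_mean = 101.8
alpha = 0.05         # significance level (one-tailed, right tail)

# Test statistic: distance of the sample mean from mu_0 in standard errors.
z_calc = (sample_mean - mu_0) / (sigma / sqrt(n))

# Critical value: the Z that leaves an area of alpha in the right tail.
z_crit = NormalDist().inv_cdf(1 - alpha)

# p-value: probability of a sample mean at least this far above mu_0 if Ho is true.
p_value = 1 - NormalDist().cdf(z_calc)

reject_null = z_calc > z_crit
print(round(z_calc, 2), round(z_crit, 3), round(p_value, 3), reject_null)
```

Both decision rules agree here: Zcalc is below Zcrit, and the p-value is larger than α, so Ho is not rejected.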
* * * * * * * * * * * * * * * * * * Session #5 * * * * * * * * * * * * * * * * * *
HYPOTHESIS TESTS COMPARING THE SAMPLE MEAN
WITH THE POPULATION MEAN, WHEN THE POPULATION STANDARD
DEVIATION IS UNKNOWN
Use the t distribution and the formula:
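$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
where $\bar{x}$ is the sample mean, µ is the hypothesized population mean (the mean of the
sample means), s is the sample standard deviation, and n is the sample size.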
Example: A pilot training program usually takes an average of 57.2 hours (µ), but
new teaching methods were used on the last class of 25 students.
Computations reveal that for this experimental class, the completion times
had a mean ($\bar{x}$) of 54.8 hours and a standard deviation (s) of 4.3 hours. At
the α = 0.05 significance level, test the claim that the new teaching
techniques reduce the instruction time.
Solution: The claim that the new teaching method reduces instruction time is
equivalent to the claim that µ < 57.2 hours. We compute the test statistic
as follows:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{54.8 - 57.2}{4.3 / \sqrt{25}} = \frac{-2.4}{0.86} = -2.791$$
We find the critical t value from the table where we locate 25 - 1 or
24 degrees of freedom at the left column and α = 0.05 (one-tail) across the
top. The critical t value of 1.711 is obtained, but since small values of $\bar{x}$
will cause the rejection of Ho, we recognize that t = -1.711 is the actual t
value that is the boundary for the critical region.
It is easy to lose sight of the underlying rationale as we go through
this hypothesis testing procedure, so let's review the essence of the test.
We set out to determine whether the sample mean of 54.8 hours is
significantly below the value of 57.2 hours. Knowing the distribution of
sample means (of which 54.8 is one) and choosing a level of significance
(5% or α = 0.05), we are able to determine the cutoff for what is a
significant difference and what is not. Any sample mean equivalent to a t
score below -1.711 represents a significant difference. The mean of 54.8
hours is significantly below 57.2 hours, so it appears as though the
new teaching method does reduce instruction time.
Start
1. There is a claim that the new teaching method requires a mean time shorter than
   57.2 hours. That is, µ < 57.2 hours.
2. The alternative to the original claim is µ ≥ 57.2 hours.
3. Ho must contain the condition of equality, so we get:
     Ho: µ ≥ 57.2 hours
     H1: µ < 57.2 hours
4. The level of significance has been specified in the statement of the problem.
     α = 0.05
5. The sample mean should be used in testing a claim about a population mean. Since σ is
   unknown, we assume that such sample means follow a t distribution. It is reasonable to
   assume here that the population of all completion times is essentially normal.
6. The test statistic (tcalc = -2.791), critical value (tcrit = -1.711), and critical region
   are shown in the figure.
     [Figure: t distribution centered at µ = 57.2 (or, t = 0), with the critical region of
     area α = 0.05 to the left of t = -1.71. The sample mean of 54.8 hours (mean of 25
     students), where tcalc = -2.791, falls inside the critical region.]
7. Since the test statistic is in the critical region, we reject Ho.
8. The new teaching method does appear to reduce the training completion time.
Stop
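The pilot training example can be checked with a few lines of Python. In this sketch the critical value of -1.711 is copied from the t table as described in the notes (24 degrees of freedom, α = 0.05, one tail) rather than computed, because the Python standard library has no t distribution; the variable names are illustrative.

```python
from math import sqrt

mu_0 = 57.2         # usual mean training time in hours (hypothesized mean)
n = 25              # students in the experimental class
sample_mean = 54.8  # hours
s = 4.3             # sample standard deviation in hours

df = n - 1          # degrees of freedom for the t distribution
t_crit = -1.711     # table value for df = 24, alpha = 0.05, left tail (taken from the notes)

# Test statistic: distance of the sample mean from mu_0 in estimated standard errors.
t_calc = (sample_mean - mu_0) / (s / sqrt(n))

# Left-tailed test: reject Ho when the test statistic falls below the critical value.
reject_null = t_calc < t_crit
print(df, round(t_calc, 3), reject_null)
```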
COMPARING TWO POPULATION VARIANCES
Properties of the F Distribution:
The F distribution is defined by the ratio of two variances.
1. [p. 462 496] All values of F are non-negative (F ≥ 0)
2. Instead of being symmetric, the F distribution is skewed to the right
3. There is a different F distribution for each different pair of degrees of freedom
for numerator and denominator
One use of the F distribution is to test whether two samples are from populations
having equal variances.
$$F = \frac{s_1^2}{s_2^2} = \frac{\text{variance for sample \#1 (larger variance)}}{\text{variance for sample \#2 (smaller variance)}}$$
degrees of freedom = n1 – 1 (numerator) and n2 – 1 (denominator)
(Where n1 is the size of sample #1 and n2 is the size of sample #2.)
[Example pp. 461-464 497-499]
Values for the F distribution are in the Appendix Table 4 [pp. 985-988 981-984]. Need
to know:
1. Significance level (α)
2. Degrees of freedom for both the numerator and denominator
This test can be either one-tailed or two-tailed. However, by always placing the larger
variance in the numerator of the F ratio, this will produce an Fcalculated > 1, thus making
the right tail the relevant critical area.
NOTE: Review the information in the textbook [pp. 460-464 495–499].
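As a rough illustration of the F ratio, the sketch below computes it for two small samples. The data are invented for illustration only; the critical value would still come from the F table for the chosen significance level and the two degrees of freedom.

```python
from statistics import variance

# Hypothetical samples (made-up numbers, for illustration only).
sample_a = [10, 12, 9, 11, 13, 15]
sample_b = [10, 11, 10, 12, 11]

# Put the larger sample variance in the numerator so that F is at least 1.
(var_num, n_num), (var_den, n_den) = sorted(
    [(variance(sample_a), len(sample_a)), (variance(sample_b), len(sample_b))],
    reverse=True,
)

f_ratio = var_num / var_den
df_num, df_den = n_num - 1, n_den - 1  # degrees of freedom for numerator and denominator

print(round(f_ratio, 3), df_num, df_den)
```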
ANALYSIS OF VARIANCE (ANOVA)
Purpose of ANOVA - To test for significant differences among more than two sample
means. This permits you to make inferences about whether the samples are drawn from
populations having the same mean (i.e., from the same population, or from different
populations).
Assumptions -
1. Independent variable is nominal or ordinal (with small number of categories)
2. Dependent variable is interval or ratio
3. Random sampling
4. Dependent variable is normally distributed in the populations (robust to violation
of normality assumption if n per group > 20)
5. The populations have equal variances - homogeneity of variances (robust to
homogeneity assumption if ns are similar)
[p. 511 550 Fig 13.2, p. 512 550 Fig. 13.3]
$$F = \frac{\text{between groups variance}}{\text{within groups variance}} = \frac{\text{variance of the sample means}}{\text{weighted mean of the sample variances}}$$
$$F = \frac{\mathrm{MSTR}}{\mathrm{MSE}} = \frac{\sum n_j (\bar{x}_j - \bar{\bar{x}})^2 \,/\, (k - 1)}{\sum (n_j - 1)\, s_j^2 \,/\, (n_T - k)}$$
Where:
$n_j$ = size of each sample
k = number of samples/groups
$n_T = \sum n_j$ = total size of all samples
$s_j^2$ = variance of each sample
$\bar{x}_j$ = mean for each sample
$\bar{\bar{x}}$ (x double-bar) = grand mean = mean of all the sample values
[Example pp. 508-510, 517-519 546-548, 555-557]
NOTE: Review the information in the textbook [pp. 508-519 546–557] (From Section
13.1, up to but not including the Section, “Computer Results for Analysis of Variance”)
IMPORTANT NOTE: Two ways to interpret results.
REJECT Ho IF:
Fcalculated > Fcritical
OR,
p-value < α (alpha, significance level)
The p-value is the probability of obtaining the given result [i.e., sample statistic(s)], or one more extreme, if the null hypothesis is true.
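The ANOVA F statistic (MSTR ÷ MSE) can be computed step by step in Python. The three groups below are invented numbers for illustration only; the comparison with Fcritical or a p-value would still be done with the F table or statistical software.

```python
from statistics import mean, variance

# Hypothetical data: three samples of four observations each (illustration only).
groups = [
    [4, 5, 6, 5],
    [7, 8, 9, 8],
    [5, 6, 7, 6],
]

k = len(groups)                                   # number of samples/groups
n_total = sum(len(g) for g in groups)             # total size of all samples
grand_mean = sum(sum(g) for g in groups) / n_total

# Between-groups variance (MSTR): spread of the sample means around the grand mean.
mstr = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)

# Within-groups variance (MSE): weighted mean of the sample variances.
mse = sum((len(g) - 1) * variance(g) for g in groups) / (n_total - k)

f_calculated = mstr / mse
print(round(mstr, 3), round(mse, 3), round(f_calculated, 2))
```

The degrees of freedom needed for the table lookup are k – 1 for the numerator and the total sample size minus k for the denominator.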
MNS 601: STATISTICS FOR BUSINESS
COURSE NOTES – WEEK 4 - ONLINE
* * * * * * * * * * * * * * * * * * Session #6 * * * * * * * * * * * * * * * * * *
SIMPLE LINEAR REGRESSION
Simple regression analysis allows us to predict one variable from another, where the two
variables are quantitative (interval or ratio level)
Independent Variable(s) - The one that is known and assumed to be predictive or
causal. Denoted by X.
Dependent Variable - The one you are trying to predict. That is, it varies with the
independent variable. Denoted by Y.
Scatter Diagrams - This example shows a “perfect” relationship because all the points
lie on the line.
NOTE: Review the information in the textbook [pp. 57-58 64–68]
Purpose of Regression Analysis - To develop a mathematical equation that can be used
to predict values of some dependent variable (Y) from values of an independent
variable (X). That is, to explain or account for the variation in a variable.
The line fitted to the scatter diagram may be called any one of the following:
regression line
line of average relationship
least squares regression line
best-fit line
$$\hat{y} = b_0 + b_1 x$$
where: b0 = y-intercept, b1 = slope
NOTE: Review the information in the textbook [pp. 562-569 600–607].
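[Figure: the regression line plotted with X (independent) on the horizontal axis and
Y (dependent) on the vertical axis. The slope is b1 = ΔY ÷ ΔX, and b0 is the value of Y
when X = 0.]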
The following formulas are used to find the constants, b1 and b0.
$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
$$b_0 = \bar{y} - b_1 \bar{x}$$
Note: Since it is usually true that the dependent variable (y) cannot be determined
exactly from a set of specified values of the independent variables (x), the best
relationship that can be derived is an average value of y associated with a specified value
of the independent variable. And, this average value of y will have some error (e)
associated with it since it is an estimate. Thus, the more precise equation is:
$$y = b_0 + b_1 x + e$$
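The least squares formulas for the slope and intercept can be applied directly in Python. The five (x, y) pairs below are invented for illustration only, and the variable names are not from the course materials.

```python
# Hypothetical paired observations (illustration only).
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

n = len(x_values)
mean_x = sum(x_values) / n
mean_y = sum(y_values) / n

# Slope: sum of cross-products of deviations divided by sum of squared x deviations.
numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
denominator = sum((x - mean_x) ** 2 for x in x_values)
b1 = numerator / denominator

# Intercept: the fitted line passes through the point (mean of x, mean of y).
b0 = mean_y - b1 * mean_x

# Predicted values and the errors (e) left over for each observation.
predicted = [b0 + b1 * x for x in x_values]
errors = [y - y_hat for y, y_hat in zip(y_values, predicted)]

print(round(b0, 2), round(b1, 2))
```

For a least squares line with an intercept, the errors sum to zero (apart from rounding), which is one way to check the arithmetic.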
CHI SQUARE (χ²)
Test of Independence - to test for the independence of two categorical variables.
$$\chi^2 = \sum \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$$
Where: fij = observed frequency
eij = expected or theoretical frequency
Assumptions:
1. Both the independent and dependent variables are categorical data (i.e., measured
at a nominal/ordinal level)
2. n > 50
3. The expected frequency for each cell must be > 5
This is a way to test whether membership in one category has any bearing on
membership in another. For example:
Is a person's level of responsibility in a company related to their sex?
OR,
Is the quality of an executive's work related to whether they have an MBA?
H0 can be written as: (a) they are not related, (b) they are independent, or (c) there
is no significant difference between the groups in the independent variable.
Example: To illustrate the second case, suppose a researcher collected data on the quality
of work performed by 240 business executives using a rating scale from excellent to
poor. Then, he divided them into two groups, those with MBAs and those without. (Use a
0.10 level of significance.)
Using the χ² template:
Step 1: Enter the observed frequencies.
Observed Frequencies
             No MBA     MBA    Total
Excellent        40      60      100
Very Good        10      10       20
Good              5      15       20
Fair             10      30       40
Poor             15      45       60
Total            80     160      240
Step 2: Using the marginals, calculate the expected frequencies, and enter them in the
template.
$e_{ij} = \frac{(\text{row total}) \times (\text{column total})}{T}$
Where T = grand total of marginals
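As a rough sketch of Step 2 outside the template, the expected frequencies can be computed from the marginals in Python. The observed counts are those of the MBA example; the variable names are assumptions made for this illustration.

```python
ratings = ["Excellent", "Very Good", "Good", "Fair", "Poor"]
# Observed frequencies from the example; columns are No MBA, MBA
observed = [[40, 60], [10, 10], [5, 15], [10, 30], [15, 45]]

# Marginals: row totals, column totals, and the grand total T
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# e_ij = (row total * column total) / T
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

for name, row in zip(ratings, expected):
    print(f"{name:<10} {row[0]:7.2f} {row[1]:7.2f}")
```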
Expected Frequencies
             No MBA     MBA    Total
Excellent     33.33   66.67      100
Very Good      6.67   13.33       20
Good           6.67   13.33       20
Fair          13.33   26.67       40
Poor          20.00   40.00       60
Total            80     160      240
Step 3: Interpret the results of the χ² analysis by either (1) comparing the p-value to the
significance level (α), or (2) comparing the calculated χ² value to the table (critical)
value of χ².
Test Results
Correction        0
χ²            8.260
Rows              5
Columns           2
df                4
p(χ²)         0.083
V (or φ)      0.186
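The χ² value and V in the test results can be approximated by hand. The sketch below applies the χ² formula to the observed table and then computes Cramér's V as the square root of χ² / (n × min(r - 1, c - 1)). That formula for V is standard but is not stated in these notes, so treat it as an assumption about what the template reports.

```python
import math

# Observed frequencies from the example; columns are No MBA, MBA
observed = [[40, 60], [10, 10], [5, 15], [10, 30], [15, 45]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square: sum over all cells of (observed - expected)^2 / expected
chi_sq = sum(
    (f - e) ** 2 / e
    for f_row, e_row in zip(observed, expected)
    for f, e in zip(f_row, e_row)
)

# Cramér's V (assumed formula, see the note above)
k = min(len(observed), len(observed[0])) - 1
cramers_v = math.sqrt(chi_sq / (n * k))

print(f"chi-square = {chi_sq:.3f}, V = {cramers_v:.3f}")
```

Computed this way, χ² comes out at about 8.25 and V at about 0.185, marginally below the template's 8.260 and 0.186. The small gap is possibly due to rounding inside the template, and it does not change the conclusion.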
See the χ² table in Appendix B [pp. 983-984 979-980]. To use this table you need to
know α and df.
NOTE: Review the Chi-square distribution [p. 451 486]. Like the Student's t
distribution, the χ² distribution is different for different degrees of freedom.
For contingency tables: df = (r - 1) (c - 1)
where: r = # rows in the contingency table
c = # columns in the contingency table
Thus, df = (5 - 1) (2 - 1) = 4
Critical value of χ² at df = 4 and α = 0.10 is 7.779
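CONCLUSION: Because the calculated χ² (8.260) exceeds the critical value (7.779), and
equivalently the p-value (0.083) is less than 0.10, reject H0. There is a significant
difference between the ratings of executives who have MBAs and those who do not.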
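Both decision rules from Step 3 can be checked with a short sketch. It uses a closed form for the upper-tail χ² probability that holds only when df is even (as it is here, df = 4), so it is not a general replacement for the table or for a statistics library. The helper name chi2_upper_tail_even_df is made up for this example.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(chi-square > x) for a positive EVEN df, using the closed form
    exp(-x/2) * sum_{i < df/2} (x/2)^i / i!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))


chi_sq = 8.260      # calculated value reported by the template
alpha = 0.10        # significance level given in the example
critical = 7.779    # table value quoted in the notes
rows, cols = 5, 2
df = (rows - 1) * (cols - 1)

p_value = chi2_upper_tail_even_df(chi_sq, df)

# Rule 1: compare the p-value to alpha
reject_by_p = p_value < alpha
# Rule 2: compare the calculated chi-square to the critical value
reject_by_critical = chi_sq > critical

print(f"df = {df}, p = {p_value:.3f}")
print("Reject H0 (p-value rule):", reject_by_p)
print("Reject H0 (critical-value rule):", reject_by_critical)
```

The two rules always agree, because the critical value is by definition the point whose upper-tail probability equals α.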
* * * * * * * * * * * * * * * * * * Session #7 * * * * * * * * * * * * * * * * * *
TIME SERIES & FORECASTING
Purpose - Time series decomposition is used to detect patterns of change in statistical
information over regular intervals of time and to project these patterns in making
predictions (i.e., forecasting).
The three kinds of change involved in time series decomposition are listed below. A time
series may contain more than one of these components.
1. Seasonal variation
2. Trend
3. Irregular variation
[pp. 786-792 807-813]
Seasonal Variation
Seasonal variation is repetitive and predictable movement around the trend line in one
year or less. To detect seasonal variation, the time intervals should be days, weeks,
months, or quarters.
We study seasonal variation to:
1. Establish a pattern of past change
2. Make projections (for short-run decisions)
3. Eliminate its effects from the time series
Example: Over a six-year period, the gold market presented an uncanny buying
opportunity in June of each year. The chart illustrates that June is buying time: it
shows a seasonal pattern in the dollar price of gold for the years 1977 to 1982. To
interpret these patterns, note that at a level of 104 the price of gold is 4% above its
long-term trendline, i.e., 4% above its seasonally adjusted average. Similarly, at 97
gold is 3% below its long-term trendline.
[Figure: seasonal index of the gold price ($ U.S.) by month, January through December,
with one line per year for 1977 to 1982. The vertical axis runs from 90 to 112, where
100 represents the long-term trendline.]
[pp. 829-834 848-856]
Ratio-to-Moving Average Method - This method uses an index (based on 1.00) to
describe the degree of seasonal variation.
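One common way to carry out the ratio-to-moving-average method for quarterly data is sketched below: compute a centered moving average, divide each actual value by it, average the ratios for each quarter, and rescale so the indices average 1.00. The details (a centered four-quarter average, simple averaging of ratios) are assumptions and may differ from the textbook's procedure. The sales figures are hypothetical.

```python
def seasonal_indices(series, period=4):
    """Ratio-to-moving-average seasonal indices for an even period.

    The series is assumed to start in season 0 (e.g., Q1).
    """
    half = period // 2
    ratios = {s: [] for s in range(period)}
    for t in range(half, len(series) - half):
        window = series[t - half: t + half + 1]
        # Centered moving average: the two end points get half weight
        cma = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
        ratios[t % period].append(series[t] / cma)
    # Average the ratios for each season
    raw = [sum(ratios[s]) / len(ratios[s]) for s in range(period)]
    # Rescale so the indices average exactly 1.00
    scale = period / sum(raw)
    return [r * scale for r in raw]


# Hypothetical quarterly sales for three years (Q1..Q4 repeated)
sales = [20, 30, 40, 26, 22, 33, 44, 28, 24, 36, 48, 31]
indices = seasonal_indices(sales)

for quarter, index in enumerate(indices, start=1):
    print(f"Q{quarter}: {index:.2f}")
```

An index above 1.00 marks a quarter that typically runs above the moving average, and an index below 1.00 marks one that runs below it.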
Trend Analysis
Trends are described using the least squares method. They can be linear or curvilinear.
We study trends to:
1. Describe an historical pattern
2. Project past trends into the future
Linear Trends - Use the equation for a straight line (regression equation).
[p. 834 830] $T_t = b_0 + b_1 t$
Where: Tt = linear trend forecast in period t; b0 = intercept of the linear trend
line, b1 = slope of the trend line, t = time period
$b_1 = \frac{\sum (t - \bar{t})(Y_t - \bar{Y})}{\sum (t - \bar{t})^2}$
$b_0 = \bar{Y} - b_1 \bar{t}$
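The trend formulas are the regression formulas with the time period t in place of x. A minimal sketch, assuming periods are numbered 1, 2, ..., n and using hypothetical data:

```python
def linear_trend(values):
    """Return (b0, b1) for the trend line T_t = b0 + b1*t, with t = 1..n."""
    n = len(values)
    t = range(1, n + 1)
    mean_t = sum(t) / n
    mean_y = sum(values) / n
    b1 = (sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, values))
          / sum((ti - mean_t) ** 2 for ti in t))
    b0 = mean_y - b1 * mean_t
    return b0, b1


# Hypothetical annual sales for five periods (not from the course notes)
sales = [12, 14, 17, 19, 23]
b0, b1 = linear_trend(sales)

# Project the trend one period past the data (t = 6)
forecast = b0 + b1 * (len(sales) + 1)
print(f"T_t = {b0:.1f} + {b1:.1f}t, forecast for t = 6: {forecast:.1f}")
```

Here the mean of t is 3 and the mean of Y is 17, which gives a slope of 27 / 10 = 2.7, an intercept of 17 - 2.7(3) = 8.9, and a forecast of 8.9 + 2.7(6) = 25.1 for period 6.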
The Process of Decomposing a Time Series
Following are the steps in the process:
1. Calculate the seasonal indices
2. Deseasonalize all of the original data
3. Conduct a trend analysis
4. Use the trend equation to make a forecast
5. Adjust the forecast for the seasonal effect (i.e., re-seasonalize the forecasted
value)
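The steps above can be sketched end to end for quarterly data. The sketch assumes a multiplicative model (actual = trend × seasonal index), which is what an index based on 1.00 implies, and it takes the seasonal indices from step 1 as already calculated. The indices and the sales figures are hypothetical.

```python
def fit_trend(values):
    """Least squares trend T_t = b0 + b1*t with t = 1, 2, ..., n."""
    n = len(values)
    t = range(1, n + 1)
    mean_t = sum(t) / n
    mean_y = sum(values) / n
    b1 = (sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, values))
          / sum((ti - mean_t) ** 2 for ti in t))
    return mean_y - b1 * mean_t, b1


# Step 1 (assumed already done): hypothetical seasonal indices for Q1..Q4
indices = [0.8, 1.0, 1.3, 0.9]

# Hypothetical quarterly sales for two years (Q1..Q4, Q1..Q4)
sales = [84, 110, 149.5, 108, 100, 130, 175.5, 126]

# Step 2: deseasonalize by dividing each value by its quarter's index
deseasonalized = [y / indices[i % 4] for i, y in enumerate(sales)]

# Step 3: fit a linear trend to the deseasonalized data
b0, b1 = fit_trend(deseasonalized)

# Step 4: use the trend equation to forecast the next period (t = 9, a Q1)
next_t = len(sales) + 1
trend_forecast = b0 + b1 * next_t

# Step 5: re-seasonalize by multiplying by that quarter's index
forecast = trend_forecast * indices[(next_t - 1) % 4]

print(f"Trend: T_t = {b0:.1f} + {b1:.1f}t")
print(f"Trend forecast for t = {next_t}: {trend_forecast:.1f}")
print(f"Seasonally adjusted forecast: {forecast:.1f}")
```

The data were constructed so that the deseasonalized series lies exactly on the line 100 + 5t. The trend forecast for period 9 is therefore 145, and multiplying by the Q1 index of 0.8 gives a final forecast of 116.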