Statistics 578 Assignment 1


BA578Assignment1-Solutions: due by midnight (11:59 pm) Sunday, September 9th, 2012 (Chapters 1, 2 and 3)

True/False questions carry 1 point each, multiple-choice questions carry 2 points each, and essay-type questions carry 6 points each. The total score is 70 points.

True/False (one point each)

Chapter 1

1. An individual’s credit score is an example of a ratio scale variable. FALSE. There is no intrinsic zero; an arbitrary minimum is established.

2.  An example of a qualitative variable is the miles per gallon rate of a car.  FALSE

3. If we examine some of the population measurements, we are conducting a sample of the population.  TRUE

4. A census always gives more accurate values than a sample. FALSE. A census sometimes involves larger non-sampling (measurement) errors, especially when the population is large and measurement requires greater skill or expertise.

5. The telephone number of an individual is an example of an Ordinal scale variable. FALSE It is a nominal scale variable used only for identification.

Chapter 2

6. A bar chart is a graphic that can be used to depict qualitative data. TRUE

7. The sample cumulative distribution function is initially increasing, and then decreasing. FALSE. It is never decreasing.

8. When we wish to summarize the proportion (or fraction) of items in a class we use the relative frequency for the class. TRUE

Chapter 3

9. The income distribution is skewed to the right; therefore the Median Income must be less than the Mean Income. TRUE

10. The range of the measurement is the largest measurement plus the smallest measurement. FALSE. The range is the largest measurement minus the smallest (see My Instructions on Ch 3).

11. The median is said to be more resistant to extreme values. TRUE (see My Instructions on Ch 3).

12. If there are 7 classes in a frequency distribution then the fourth class must contain the Median. FALSE (depends on how the class frequencies are distributed among the classes)
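To illustrate statements 9 and 11, here is a minimal Python sketch. The income figures are hypothetical values made up for illustration (they are not from the textbook); the point is only that one extreme value on the right pulls the mean far more than the median.

```python
import statistics

# Hypothetical incomes in thousands of dollars (illustrative only).
incomes = [30, 35, 40, 45, 50]
with_outlier = incomes + [500]   # add one extreme value on the right

# Without the extreme value the mean and median agree.
print(statistics.mean(incomes), statistics.median(incomes))

# With it, the mean jumps while the median barely moves.
print(statistics.mean(with_outlier), statistics.median(with_outlier))
```

With the extreme value included, the median moves only slightly while the mean ends up well above it, which is the right-skew pattern described in statement 9.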


Multiple Choices (Two points each)

Chapter 1

1. Ratio variables have the following unique characteristic (select only one):
A. Meaningful order
B. Predictable
C. Categorical in nature
D. An inherently defined zero value

2. Which of the following is a categorical (qualitative) variable?
A. Air Temperature
B. Bank Account Balance
C. Daily Sales in a Store
D. Whether a Person Has a Traffic Violation
E. Both A and D

3. Which of the following is a quantitative variable?
A. The make of a TV
B. A person's gender
C. The age of a house
D. Whether a person is a college graduate
E. Whether a person has a charge account

4. The two types of qualitative variables are:
A. Ordinal and ratio
B. Interval and ordinal
C. Nominative and ordinal
D. Interval and ratio
E. Nominative and interval

5. Letter Grade is an example of a(n) ________ variable.
A. Nominative
B. Ordinal
C. Interval
D. Ratio

Chapter 2

6. A company collected the ages from a random sample of its middle managers with the resulting frequency distribution shown below:

Class Interval   Frequency
20 - 25              8
25 - 30             16
30 - 35             11
35 - 40              5
40 - 45              7
45 - 50              6

What would be the approximate shape of the relative frequency histogram? (You can draw the histogram to give you a visual impression.)
A. Symmetrical
B. Uniform
C. Multiple peak
D. Skewed to the left
E. Skewed to the right

It is clearly skewed to the right (long right tail), with one highest peak.

7. A(n) ______ is a graph of a cumulative relative frequency distribution.
A. Frequency Polygon
B. Scatter plot
C. Histogram
D. Ogive

[Figure for question 6: histogram of Freq by class interval, 20-25 through 45-50.]

8. If there are 50 values in a data set, how many classes should be created for a frequency histogram?
A. 4
B. 5
C. 6
D. 7
E. 8

Chapter 3

9. According to a survey of the top 10 employers in a major city in the Midwest, a worker spends an average of 413 minutes a day on the job. Suppose the standard deviation is 25 minutes and the time spent is approximately normally distributed. Within what interval will the times of approximately 99.73% of all workers fall?
A. [388, 438]
B. [338, 488]
C. [363, 463]
D. [338, 438]
E. [388, 488]
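As a quick check on question 9, the empirical rule says that approximately 99.73% of a normal population lies within three standard deviations of the mean. A minimal Python sketch using the mean and standard deviation given in the question (the variable names are only illustrative):

```python
# Empirical rule: about 99.73% of a normal population lies within
# mean +/- 3 standard deviations.
mean_minutes = 413
sd_minutes = 25

lower = mean_minutes - 3 * sd_minutes
upper = mean_minutes + 3 * sd_minutes
print([lower, upper])  # [338, 488]
```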

10. In a statistics class, 10 scores were randomly selected, with the following results: 74, 73, 77, 77, 71, 68, 65, 77, 67, 66. What is the median?
A. 71.5
B. 71.0
C. 72.0
D. 71.0
E. 73.0

11. In a statistics class, 10 scores were randomly selected, with the following results (mean = 71.5): 74, 73, 77, 77, 71, 68, 65, 77, 67, and 66. What is the variance?
A. 12.00
B. 22.72
C. 4.77
D. 516.20
E. 144.00
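If you want to verify questions 10 and 11 without MegaStat, the following is a small sketch using Python's standard `statistics` module. It assumes the sample variance (divisor n - 1) is intended, as elsewhere in this assignment.

```python
import statistics

scores = [74, 73, 77, 77, 71, 68, 65, 77, 67, 66]

# With an even number of scores the median is the average
# of the two middle values after sorting.
median = statistics.median(scores)

mean = statistics.mean(scores)
variance = statistics.variance(scores)  # sample variance, divisor n - 1

print(median, mean, round(variance, 2))
```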

12. When using Chebyshev’s theorem to obtain the bounds for 99.73 percent of the values in a population, the interval will generally be ___________ the interval obtained for the same percentage if a normal distribution is assumed (empirical rule).
A. Wider than
B. Narrower than
C. The same as
D. Greater than or less than (depending on the size of the standard deviation)

(See My Instructions on Ch 3, page 13.)
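For question 12, Chebyshev's theorem guarantees that at least 1 - 1/k^2 of any population lies within k standard deviations of the mean. The sketch below solves that expression for k at 99.73% coverage and compares it with the k = 3 of the empirical rule; it is only a numerical illustration of the comparison.

```python
import math

coverage = 0.9973

# Chebyshev: solve 1 - 1/k**2 = coverage for k.
k_chebyshev = 1 / math.sqrt(1 - coverage)

# Empirical rule (normal distribution): 99.73% within 3 standard deviations.
k_empirical = 3

print(round(k_chebyshev, 1), k_empirical)
```

Chebyshev needs roughly 19 standard deviations on each side where the normal assumption needs only 3, because Chebyshev makes no assumption about the shape of the distribution.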


13. Find the z-score for an IQ test score of 90 when the mean is 100 and the standard deviation is 12.
A. 7.5
B. -0.7.5
C. -0.83
D. 0.83
E. -10
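The z-score in question 13 is the number of standard deviations a value lies from the mean, z = (x - mean) / standard deviation. A minimal sketch with the numbers from the question:

```python
x = 90            # IQ test score
mean_iq = 100
sd_iq = 12

z = (x - mean_iq) / sd_iq
print(round(z, 2))  # -0.83
```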

14. Time to degree has become a “hot” topic with federal legislators. At one state university it was necessary to do a quick calculation when one of the local congressmen called the president. Twenty students were randomly selected from the most recent graduating class and the number of semesters they were enrolled was recorded (mean = 9.6): 7, 8, 10, 11, 8, 6, 10, 9, 9, 8, 13, 12, 8, 11, 11, 14, 8, 7, 10, 12. What is the variance?
A. 8
B. 2.16
C. 9.5
D. 4.67
E. 21.846

You don’t need to show the following table (which I obtained using MegaStat), but I am giving it to show how quickly you can get the answer with MegaStat.

Descriptive statistics
count                         20
mean                        9.60
sample variance             4.67
sample standard deviation   2.16
minimum                        6
maximum                       14
range                          8

Alternatively, you can use a calculator and the formula for sample variance given in the book and in my instructions if you find it difficult to use MegaStat.
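The calculator route for question 14 can be sketched in Python as follows. It applies the sample variance formula (sum of squared deviations divided by n - 1) directly to the 20 values; the variable names are only illustrative.

```python
import math

semesters = [7, 8, 10, 11, 8, 6, 10, 9, 9, 8,
             13, 12, 8, 11, 11, 14, 8, 7, 10, 12]

n = len(semesters)
mean = sum(semesters) / n

# Sum of squared deviations from the mean (SSX).
ssx = sum((x - mean) ** 2 for x in semesters)

variance = ssx / (n - 1)      # sample variance
sd = math.sqrt(variance)      # sample standard deviation

print(n, round(mean, 2), round(variance, 2), round(sd, 2))
```

The rounded results agree with the MegaStat table (mean 9.60, sample variance 4.67, sample standard deviation 2.16).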

Essay Type Questions (6 points each)

Chapter 2

1. Consider the following data on distances traveled by 40 people to visit the local amusement park.

distance   freq
1-8         15
9-16        12
17-24        7
25-32        5
33-40        1
total       40


Expand the table by adding columns for relative frequency and cumulative relative frequency, then construct the histogram of frequencies and plot the frequency polygon and the ogive curve using Excel.

distance   freq   rel.fr   cum.rel.fr
1-8         15    0.375    0.375
9-16        12    0.3      0.675
17-24        7    0.175    0.85
25-32        5    0.125    0.975
33-40        1    0.025    1
total       40    1        na
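The two added columns can also be reproduced outside Excel. Here is a minimal Python sketch: each relative frequency is the class frequency divided by the total, and each cumulative relative frequency is the running sum of the relative frequencies.

```python
from itertools import accumulate

classes = ["1-8", "9-16", "17-24", "25-32", "33-40"]
freq = [15, 12, 7, 5, 1]

total = sum(freq)                       # 40 visitors in all
rel = [f / total for f in freq]         # relative frequency per class
cum_rel = list(accumulate(rel))         # running total of rel

for c, f, r, cr in zip(classes, freq, rel, cum_rel):
    print(f"{c:<6} {f:>3} {r:.3f} {cr:.3f}")
```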

[Figures: Excel charts of frequency (freq) by distance class, 1-8 through 33-40 miles.]

2. Math test anxiety can be found throughout the general population. A study of 120 seniors at a local high school was conducted. The following table was produced from the data. Complete the missing parts. (Work step by step to solve this puzzle. Round the frequencies to the nearest whole number.)

                Score Range   Frequency   Rel frequency   Cumulative Rel. freq.
Very anxious    37-50                     0.20
Anxious/Tense   33-36         12
Mild Anxiety    27-32
Relaxed         20-26         24
Very Relaxed    10-19                     0.333
Total

We have to work step by step, using our knowledge of frequency tables, to solve this puzzle. For the first class, the relative frequency and the cumulative relative frequency are the same, so we write 0.20 in the last column of the first row. We find the frequency for this class by multiplying the relative frequency 0.20 by the total frequency 120 to get 24. The first row is now completely filled. In the second row we convert the given frequency 12 into a relative frequency by dividing by 120, which gives 0.10. The cumulative relative frequency in the second row is therefore 0.20 + 0.10 = 0.30, and the second row is filled too.

Next we convert the given relative frequency in the fifth row into a frequency by multiplying 0.333 by 120 and rounding to get 40. Since the total frequency is given as 120, we can find the remaining frequency for the third row once we have the frequencies for the other four rows: 120 - (24 + 12 + 24 + 40) = 20. The rest of the story should be clear to you. Just remember that the total of all frequencies must be the given number 120 and the total of all relative frequencies must always be 1.


                Score Range   Frequency   Rel frequency   Cumulative Rel. freq.
Very anxious    37-50         24          0.20            0.20
Anxious/Tense   33-36         12          0.10            0.30
Mild Anxiety    27-32         20          0.167           0.467
Relaxed         20-26         24          0.20            0.667
Very Relaxed    10-19         40          0.333           1.000
Total                         120         1.000           NA
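The step-by-step reasoning for this puzzle can be sketched in Python as follows. This is only an illustration of the same steps, under the assumption that exactly one frequency remains blank after the given relative frequencies are converted; `None` marks a blank cell, and the list names are made up for this sketch.

```python
total = 120
labels = ["Very anxious", "Anxious/Tense", "Mild Anxiety",
          "Relaxed", "Very Relaxed"]
given_freq = [None, 12, None, 24, None]     # None marks a blank cell
given_rel = [0.20, None, None, None, 0.333]

# Step 1: where only the relative frequency is given,
# frequency = relative frequency * total, rounded to a whole number.
freq = [f if f is not None else (round(r * total) if r is not None else None)
        for f, r in zip(given_freq, given_rel)]

# Step 2: the single remaining blank is whatever is left of the total.
known = sum(f for f in freq if f is not None)
freq = [f if f is not None else total - known for f in freq]

# Step 3: relative and cumulative relative frequencies.
rel = [f / total for f in freq]
cum, running = [], 0
for f in freq:
    running += f
    cum.append(running / total)

for row in zip(labels, freq, rel, cum):
    print("{:<14} {:>3} {:.3f} {:.3f}".format(*row))
```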

3. The number of weekly sales calls by a sample of 25 pharmaceutical salespersons is below: 24, 56, 43, 35, 37, 27, 29, 44, 34, 28, 33, 28, 46, 31, 38, 41, 48, 38, 27, 29, 37, 33, 31, 40, and 50. Construct a histogram and plot the frequency polygon. (Use Excel: Instructions given in Instructions for Chapter 2 and also in Textbook Appendix of Chapter 2)

Using Excel (Data, then Data Analysis, then Histogram, leaving the bin range blank and checking the appropriate boxes), you get:

Bin     Frequency   Cumulative %
24          1         4.00%
30.4        6        28.00%
36.8        6        52.00%
43.2        7        80.00%
49.6        3        92.00%
More        2       100.00%
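The bin counts can be cross-checked with a short Python sketch. It assumes each bin counts the values greater than the previous bin limit and less than or equal to its own limit, with everything above the last limit going to "More"; that assumption is consistent with the counts shown here.

```python
import bisect

calls = [24, 56, 43, 35, 37, 27, 29, 44, 34, 28, 33, 28, 46,
         31, 38, 41, 48, 38, 27, 29, 37, 33, 31, 40, 50]
bins = [24, 30.4, 36.8, 43.2, 49.6]      # upper limits from the Excel output

# One counter per bin plus one for "More".
counts = [0] * (len(bins) + 1)
for x in calls:
    # bisect_left returns the first bin whose upper limit is >= x.
    counts[bisect.bisect_left(bins, x)] += 1

running = 0
for label, c in zip([str(b) for b in bins] + ["More"], counts):
    running += c
    print(f"{label:<5} {c:>2} {100 * running / len(calls):6.2f}%")
```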

[Figure: Excel histogram showing Frequency (bars) and Cumulative % (line) by Bin: 24, 30.4, 36.8, 43.2, 49.6, More.]

[Figure: frequency polygon of freq on an axis from 20 to 60, with plotted frequencies 1, 6, 6, 7, 3, 2.]

Some of you determined five classes on your own and specified the bin ranges, which is also good. The answers will look different, but both are acceptable, because the rule is only a recommendation or guide. There can be a little variation from the rule depending on the nature of the data or the problem under study. However, in this course, if you have to decide the number of classes on your own, follow the rule. If you let the computer determine the number of classes by leaving the bin range blank, that is good too. Finally, MegaStat and Excel will give slightly different answers depending on how they interpret the class boundaries (whether a class includes its upper limit or its lower limit). Both are acceptable. The researcher makes the final judgment.

Suppose a researcher decided after looking at the data that the lower boundary should be 24 and the number of classes only 5, and wanted to use MegaStat. Then: go to Excel, type the data as given (no need to order it), go to Add-Ins, then MegaStat, then Frequency Distribution-Quantitative, select the input range, enter 24 as the lower boundary and 7 as the class interval (a class interval of 6 gives 6 classes in this case, which the researcher finds too many for 25 observations), check the boxes for histogram, ogive, etc., and click OK. The result, shown below, is different from the one above but equally acceptable.

Frequency Distribution - Quantitative

Data                                                        cumulative
lower     upper   midpoint   width   frequency   percent   frequency   percent
  24   <    31       28        7         7         28.0        7         28.0
  31   <    38       35        7         8         32.0       15         60.0
  38   <    45       42        7         6         24.0       21         84.0
  45   <    52       49        7         3         12.0       24         96.0
  52   <    59       55        7         1          4.0       25        100.0
                                        25        100.0
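For comparison with the Excel bins, here is a sketch of the same count using the class convention shown in this table, where a class includes its lower limit but not its upper limit (24 up to but not including 31, and so on). The lower boundary 24 and width 7 are read from the table.

```python
calls = [24, 56, 43, 35, 37, 27, 29, 44, 34, 28, 33, 28, 46,
         31, 38, 41, 48, 38, 27, 29, 37, 33, 31, 40, 50]

lower, width, n_classes = 24, 7, 5

counts = [0] * n_classes
for x in calls:
    # Integer division puts x in the class whose range is
    # lower + i*width <= x < lower + (i + 1)*width.
    counts[(x - lower) // width] += 1

running = 0
for i, c in enumerate(counts):
    running += c
    lo = lower + i * width
    print(f"{lo} < {lo + width}: {c:>2} {100 * c / len(calls):5.1f} "
          f"{running:>2} {100 * running / len(calls):5.1f}")
```

Because the two tools treat the class limits differently and use different class widths, the frequencies differ from the Excel output even though the data are the same.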

[Figures from MegaStat: Ogive (Cumulative Percent against Data), Frequency Polygon (Percent against Data), and Histogram (Percent against Data).]

Chapter 3

4. The following frequency table summarizes the ages of 60 shoppers at the local grocery store.

First calculate the sample mean for this frequency table using the method for grouped data. Then calculate the sample variance and sample standard deviation for this data set.

Age of the shopper   Frequency (f)   Midpoint (Mi)   f * Mi
15-23                    10              19            190
24-32                    21              28            588
33-41                    10              37            370
42-50                     8              46            368
51-59                     5              55            275
60-68                     6              64            384
TOTAL                    60 = N                       2175

X = ∑ (f * Mi) / N = 2175/60 = 36.25

 f    Mi    Mi - X    (Mi - X)^2    f * (Mi - X)^2
10    19    -17.25    297.5625      2975.625
21    28     -8.25     68.0625      1429.3125
10    37      0.75      0.5625         5.625
 8    46      9.75     95.0625       760.5
 5    55     18.75    351.5625      1757.8125
 6    64     27.75    770.0625      4620.375
Total                              11549.25

s^2 = ∑ (f * (Mi - X)^2) / (N - 1) = 11549.25 / 59 = 195.75

s = square root of sample variance = 13.99
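The grouped-data calculation above can be checked with a few lines of Python. This sketch follows the same method: each class is represented by its midpoint, weighted by its frequency, and the sample variance uses N - 1 in the denominator.

```python
import math

freq = [10, 21, 10, 8, 5, 6]            # f for each age class
midpoints = [19, 28, 37, 46, 55, 64]    # Mi for each age class

n = sum(freq)                                           # N = 60
mean = sum(f * m for f, m in zip(freq, midpoints)) / n  # 2175 / 60

# Weighted sum of squared deviations of the midpoints from the mean.
ss = sum(f * (m - mean) ** 2 for f, m in zip(freq, midpoints))

variance = ss / (n - 1)
sd = math.sqrt(variance)

print(n, mean, ss, variance, round(sd, 2))
```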

5. For the data in Essay question number 3 above, calculate the sample Mean, Variance and Standard deviation without grouping the data.

Using Excel (Data - Descriptive Statistics - Summary Statistics):

Column1
Mean                  36.28
Standard Error        1.636582
Median                35
Mode                  37
Standard Deviation    8.182909
Sample Variance       66.96
Kurtosis              -0.13027
Skewness              0.659149
Range                 32
Minimum               24
Maximum               56
Sum                   907
Count                 25

Using MegaStat:

Descriptive statistics
                                          X
count                                    25
mean                                  36.28
sample variance                       66.96
sample standard deviation              8.18
minimum                                  24
maximum                                  56
range                                    32
sum                                  907.00
sum of squares                    34,513.00
deviation sum of squares (SSX)     1,607.04

You get the same results using a calculator, as follows:

No. of calls (X)   Deviation from mean (X - 36.28)   (X - 36.28)^2
24                 -12.28                            150.80
56                  19.72                            388.88
43                   6.72                             45.16
35                  -1.28                              1.64
37                   0.72                              0.52
27                  -9.28                             86.12
29                  -7.28                             53.00
44                   7.72                             59.60
34                  -2.28                              5.20
28                  -8.28                             68.56
33                  -3.28                             10.76
28                  -8.28                             68.56
46                   9.72                             94.48
31                  -5.28                             27.88
38                   1.72                              2.96
41                   4.72                             22.28
48                  11.72                            137.36
38                   1.72                              2.96
27                  -9.28                             86.12
29                  -7.28                             53.00
37                   0.72                              0.52
33                  -3.28                             10.76
31                  -5.28                             27.88
40                   3.72                             13.84
50                  13.72                            188.24

Total = 907 and X = 907/25 = 36.28; total of deviations = 0; total of squared deviations = 1607.04

s^2 = 1607.04/24 = 66.96 and s = √66.96 = 8.18
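The calculator method can be sketched in Python as shown below. It computes the squared deviations directly and, as a cross-check, also uses the shortcut SSX = (sum of squares) - (sum)^2 / n, which ties in with the sum of squares and SSX reported by MegaStat.

```python
import math

calls = [24, 56, 43, 35, 37, 27, 29, 44, 34, 28, 33, 28, 46,
         31, 38, 41, 48, 38, 27, 29, 37, 33, 31, 40, 50]

n = len(calls)
total = sum(calls)
mean = total / n

# Direct method: sum of squared deviations from the mean.
ssx = sum((x - mean) ** 2 for x in calls)

# Shortcut method: sum of squares minus (sum squared) / n.
sum_sq = sum(x * x for x in calls)
ssx_shortcut = sum_sq - total ** 2 / n

variance = ssx / (n - 1)    # sample variance
sd = math.sqrt(variance)    # sample standard deviation

print(total, round(mean, 2), sum_sq, round(ssx, 2),
      round(variance, 2), round(sd, 2))
```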
