Statistics Practice Problems

CHAPTER 2

2.22

The data displayed below represent the cost of electricity during July 2004 for a random sample of 50 one-bedroom apartments in a large city.

Raw data on utility charges ($):

96 171 202 178 147 102 153 197 127 82

157 185 90 116 172 111 148 213 130 165

141 149 206 175 123 128 144 168 109 167

95 163 150 154 130 143 187 166 139 149

108 119 183 151 114 135 191 137 129 158

a. Form a frequency distribution and a percentage distribution that have class intervals with the upper class limits $99, $119, and so on.

b. Plot a histogram and a percentage polygon.

c. Form the cumulative percentage distribution and plot the ogive (cumulative percentage polygon).

d. Around what amount does the monthly electricity cost seem to be concentrated?
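
For part (a), the tallying can be sketched in Python. This is a minimal illustration rather than a model answer: it assumes the first class runs from $80 to $99 (only the upper limits $99, $119, ... are given in the problem), so the class width is $20. The names `LOWEST` and `WIDTH` are introduced here for illustration.

```python
from collections import Counter

# Utility charges ($) copied from the raw data in problem 2.22.
charges = [
    96, 171, 202, 178, 147, 102, 153, 197, 127, 82,
    157, 185, 90, 116, 172, 111, 148, 213, 130, 165,
    141, 149, 206, 175, 123, 128, 144, 168, 109, 167,
    95, 163, 150, 154, 130, 143, 187, 166, 139, 149,
    108, 119, 183, 151, 114, 135, 191, 137, 129, 158,
]

LOWEST = 80  # assumption: lower limit of the first class ($80 to $99)
WIDTH = 20   # class width implied by the upper limits $99, $119, ...

# Map each charge to a class index, then count the members of each class.
counts = Counter((x - LOWEST) // WIDTH for x in charges)
n_classes = max(counts) + 1

freq = [counts.get(i, 0) for i in range(n_classes)]
pct = [100 * f / len(charges) for f in freq]

# Running total of the percentages gives the cumulative distribution.
cum_pct = []
running = 0.0
for p in pct:
    running += p
    cum_pct.append(running)

for i in range(n_classes):
    lower = LOWEST + i * WIDTH
    upper = lower + WIDTH - 1
    print(f"${lower}-${upper}: freq={freq[i]:2d}  pct={pct[i]:5.1f}%  cum={cum_pct[i]:5.1f}%")
```

The `freq`, `pct`, and `cum_pct` lists can be passed to any plotting tool to draw the histogram, percentage polygon, and ogive asked for in (b) and (c).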

2.28

The following data represent the responses to two questions asked in a survey of 40 college students majoring in business – what is your gender? (Male=M; Female=F) and what is your major? (Accountancy=A; Computer Information Systems=C; Marketing=M):

Gender : M M M F M F F M F M F M M M M F F M F F

Major : A C C M A C A A C C A A A M C M A A A C

Gender : M M M M F M F F M M F M M M M F M F M M

Major : C C A A M M C A A A C C A A A A C A C

a. Tally the data into a contingency table where the two rows represent the gender categories and the three columns represent the academic-major categories.

b. Form cross-classification tables based on percentages of all 40 student responses, based on row percentages, and based on column percentages.

c. Using the results from (a), construct a side-by-side bar chart of gender based on student major.

CHAPTER 3

3.9

In the 2002-2003 academic year, many public universities in the United States raised tuition and fees due to a decrease in state subsidies (Mary Beth Marklein, “Public Universities Raise Tuition, Fees-and Ire,” USA Today, August 8, 2002, 1A-2A). The following data represent the change in the cost of tuition, a shared dormitory room, and the most popular meal plan between the 2001-2002 academic year and the 2002-2003 academic year for a sample of 10 public universities.

University Change in Cost ($)

University of California, Berkeley 1,589

University of Georgia, Athens 593

University of Illinois, Urbana-Champaign 1,223

3.10

The following data COFFEEDRINK represent the calories and fat (in grams) of 16-ounce iced coffee drinks at Dunkin’ Donuts and Starbucks.

Product Calories Fat (g)

Dunkin’ Donuts Iced Mocha Swirl Latte (whole milk) 240 8.0

Starbucks Coffee Frappuccino blended coffee 260 3.5

Dunkin’ Donuts Coffee Coolatta (cream) 350 22.0

Starbucks Iced Coffee Mocha Espresso (whole milk and whipped cream) 350 20.0

Starbucks Mocha Frappuccino blended coffee (whipped cream) 420 16.0

Starbucks Chocolate Brownie Frappuccino blended coffee (whipped cream) 510 22.0

Starbucks Chocolate Frappuccino blended crème (whipped cream) 530 19.0

For each variable (calories and fat):

a) Compute the mean, median, first quartile, and third quartile.

b) Compute the variance, standard deviation, range, interquartile range, coefficient of variation, and Z scores. Are there any outliers? Explain.

c) Are the data skewed? If so, how?

d) Based on the results of (a) through (c), what conclusions can you reach concerning the calories and fat in iced coffee drinks at Dunkin’ Donuts and Starbucks?

3.23

The following data represent the quarterly sales tax receipts (in thousands of dollars) submitted to the comptroller of the Village of Fair Lake for the period ending March 2004 by all 50 business establishments in that locale: TAX

10.3 11.1 9.6 9.0 14.5

13.0 6.7 11.0 8.4 10.3

13.0 11.2 7.3 5.3 12.5

8.0 11.8 8.7 10.6 9.5

11.1 10.2 11.1 9.9 9.8

11.6 15.1 12.5 6.5 7.5

10.0 12.9 9.2 10.0 12.8

12.5 9.3 10.4 12.7 10.5

9.3 11.5 10.7 11.6 7.8

10.5 7.6 10.1 8.9 8.6

a) Compute the mean, variance, and standard deviation for this population.

b) What proportions of these businesses have quarterly sales tax receipts within ±1, ±2, or ±3 standard deviations of the mean?

c) Compare and contrast your findings with what would be expected on the basis of the empirical rule. Are you surprised at the results in (b)?
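
Parts (a) and (b) can be sketched with Python's standard `statistics` module. Because the 50 establishments are the whole population, the sketch uses the population formulas (`pvariance`, `pstdev`), which divide by N rather than N - 1. The helper `proportion_within` is a name introduced here for illustration.

```python
import statistics

# Quarterly sales tax receipts (thousands of dollars) from problem 3.23.
tax = [
    10.3, 11.1, 9.6, 9.0, 14.5,
    13.0, 6.7, 11.0, 8.4, 10.3,
    13.0, 11.2, 7.3, 5.3, 12.5,
    8.0, 11.8, 8.7, 10.6, 9.5,
    11.1, 10.2, 11.1, 9.9, 9.8,
    11.6, 15.1, 12.5, 6.5, 7.5,
    10.0, 12.9, 9.2, 10.0, 12.8,
    12.5, 9.3, 10.4, 12.7, 10.5,
    9.3, 11.5, 10.7, 11.6, 7.8,
    10.5, 7.6, 10.1, 8.9, 8.6,
]

mu = statistics.mean(tax)
var = statistics.pvariance(tax)  # population variance (divides by N)
sigma = statistics.pstdev(tax)   # population standard deviation


def proportion_within(k):
    """Share of values lying within k standard deviations of the mean."""
    lo, hi = mu - k * sigma, mu + k * sigma
    return sum(lo <= x <= hi for x in tax) / len(tax)


proportions = {k: proportion_within(k) for k in (1, 2, 3)}

print(f"mean={mu:.2f}  variance={var:.4f}  sd={sigma:.4f}")
for k, share in proportions.items():
    print(f"within ±{k} sd: {share:.0%}")
```

For part (c), the printed shares can be set beside the empirical-rule benchmarks of roughly 68%, 95%, and 99.7%.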

CHAPTER 4

4.63

A survey asked workers which aspects of their jobs are extremely important. The results in percentages are as follows:

Is Aspect Extremely Important?

Aspect of Job Men (%) Women (%)

Good relationship with boss 63 77

Up-to-date equipment 59 69

Resources to do the job 55 74

Easy commute 48 60

Flexible hours at work 40 53

Able to work at home 21 34

Suppose the survey was based on the responses of 500 men and 500 women. Construct a contingency table for the different responses concerning each aspect of the job. If a respondent is chosen at random, what is the probability that

a) He or she feels that a good relationship with the boss is an important aspect of the job?

b) He or she feels that an easy commute is an important aspect of the job?

c) The person is a male and feels that a good relationship with the boss is an important aspect of the job?

d) The person is a female and feels that having flexible hours is an important aspect of the job?

e) Given that the person feels that having a good relationship with the boss is an important aspect of the job, what is the probability that the person is a male?

f) Are any of the things that workers say are extremely important aspects of a job statistically independent of the gender of the respondent? Explain.

4.65

The owner of a restaurant serving Continental-style entrées was interested in studying ordering patterns of patrons for the Friday to Sunday weekend time period. Records were maintained that indicated the demand for dessert during the same time period. The owner decided to study two other variables along with whether a dessert was ordered: the gender of the individual and whether a beef entrée was ordered. The results are as follows:

GENDER

Dessert Ordered Male Female Total

Yes 96 40 136

No 224 240 464

Total 320 280 600

BEEF ENTRÉE

Dessert Ordered Yes No Total

Yes 71 65 136

No 116 348 464

Total 187 413 600

A waiter approaches a table to take an order. What is the probability that the first customer to order at the table

a) Orders a dessert?

b) Orders a dessert or a beef entrée?

c) Is a female and does not order a dessert?

d) Is a female or does not order a dessert?

e) Suppose the first person that the waiter takes the dessert order from is a female. What is the probability that she does not order dessert?

f) Are gender and ordering dessert statistically independent?

g) Is ordering a beef entrée statistically independent of whether the person orders dessert?
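
The independence question in (f) comes down to whether P(A and B) equals P(A) × P(B). A minimal sketch of that check follows, using exact fractions and the counts from the GENDER table only; the function name `is_independent` is introduced here for illustration.

```python
from fractions import Fraction


def is_independent(joint, row_total, col_total, grand_total):
    """Return True when P(A and B) equals P(A) * P(B) exactly."""
    p_joint = Fraction(joint, grand_total)
    p_a = Fraction(row_total, grand_total)
    p_b = Fraction(col_total, grand_total)
    return p_joint == p_a * p_b


# GENDER table: 96 males ordered dessert; 136 dessert orders in total;
# 320 males in total; 600 customers overall.
dessert_and_male = is_independent(96, 136, 320, 600)

# Equivalent view: compare the marginal and conditional probabilities.
p_dessert = Fraction(136, 600)
p_dessert_given_female = Fraction(40, 280)

print("independent:", dessert_and_male)
print("P(dessert) =", p_dessert, " P(dessert | female) =", p_dessert_given_female)
```

Exact fractions avoid the floating-point rounding that could make two equal probabilities compare as different. The same function can be applied to the beef entrée counts for part (g).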

CHAPTER 5

5.27

J.D. Power & Associates calculates and publishes various statistics concerning car quality. The Initial Quality score measures the number of problems per new car sold. For 2003 model cars, the Lexus was the top brand with 1.63 problems per car. Korea’s Kia came in last with 5.09 problems per car (L. Hawkins, “Finding a Car That’s Built to Last?” The Wall Street Journal, July 9, 2003, D1, D5). Let the random variable X be equal to the number of problems with a newly purchased Lexus.

a. What assumptions must be made in order for X to be distributed as a Poisson random variable? Are these assumptions reasonable?

Making the assumptions as in (a), if you purchased a 2003 Lexus, what is the probability that the new car will have:

b. Zero problems?

c. Two or fewer problems?

d. Give an operational definition for “problem”. Why is the operational definition important in interpreting the Initial Quality score?

5.41

Cinema advertising is increasing. Normally 60 to 90 seconds long, these advertisements are longer and more extravagant, and tend to have more captive audiences than television advertisements. Thus, it is not surprising that the recall rates for viewers of cinema advertisements are higher than for television advertisements. According to survey research conducted by the ComQUEST division of BBM Bureau of Measurement in Toronto, the probability a viewer will remember a cinema advertisement is 0.74, whereas the probability a viewer will remember a 30-second television advertisement is 0.37 (Nate Hendley, “Cinema Advertising Comes of Age”, Marketing Magazine, May 6, 2002, 16).

a. Is the 0.74 probability reported by the BBM Bureau of Measurement best classified as a priori classical probability, empirical classical probability, or subjective probability?

b. Suppose that 10 viewers of a cinema advertisement are randomly sampled. Consider the random variable defined by the number of viewers that recall the advertisement. What assumptions must be made in order to assume that this random variable is distributed as a binomial random variable?

c. Assuming that the number of viewers that recall the cinema advertisement is a binomial variable, what are the mean and standard deviation of this distribution?

d. Based on your answer to (c), if none of the viewers can recall the ad, what can be inferred about the 0.74 probability given in the article?
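
For parts (c) and (d), the binomial formulas (mean = np, standard deviation = √(np(1 − p)), and the probability of k successes) can be sketched as follows. The sketch takes the binomial assumptions from (b) as given; the variable names are illustrative.

```python
import math

n = 10    # viewers sampled
p = 0.74  # probability that a viewer recalls the cinema advertisement

mean = n * p
sd = math.sqrt(n * p * (1 - p))

# Probability that none of the 10 viewers recalls the ad: C(10, 0) * p^0 * (1 - p)^10.
p_none = math.comb(n, 0) * p**0 * (1 - p) ** n

print(f"mean={mean:.2f}  sd={sd:.4f}  P(X=0)={p_none:.3e}")
```

A very small `p_none` is the quantitative basis for the inference asked about in (d).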

5.51

A study of the homepages for Fortune 500 companies reports that the mean number of bad links per homepage is 0.4 and the mean number of spelling errors per homepage is 0.16 (Nabil Tamimi, Murli Rajan, and Rose Sebastianelli, “Benchmarking the Home Pages of ‘Fortune’ 500 Companies”, Quality Progress, July 2000). Use the Poisson distribution to find the probability that a randomly selected homepage will contain

a. Exactly 0 bad links

b. 5 or more bad links

c. Exactly 0 spelling errors

d. 10 or more spelling errors
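
The Poisson probabilities in (a) through (d) follow from P(X = k) = e^(−λ) λ^k / k!. A minimal sketch using only the standard library is given here; the helper names are introduced for illustration. The upper tail is summed directly over a fixed number of terms (an assumption that is adequate for small λ such as 0.4 and 0.16) rather than computed as one minus the lower tail, because subtraction loses precision when the tail is tiny.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def poisson_at_least(k, lam, terms=60):
    """P(X >= k), summing the upper tail directly over `terms` terms.

    Truncating after 60 terms is an assumption suited to small lam.
    """
    return sum(poisson_pmf(i, lam) for i in range(k, k + terms))


bad_links = 0.4   # mean bad links per homepage
spelling = 0.16   # mean spelling errors per homepage

p_a = poisson_pmf(0, bad_links)
p_b = poisson_at_least(5, bad_links)
p_c = poisson_pmf(0, spelling)
p_d = poisson_at_least(10, spelling)

print(f"a) {p_a:.4f}  b) {p_b:.2e}  c) {p_c:.4f}  d) {p_d:.2e}")
```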

CHAPTER 6

6.7

During 2001, 61.3% of U.S. households purchased ground coffee and spent an average of $36.16 on ground coffee during the year (“Annual Product Preference Study”, Progressive Grocer, May 1, 2002, 31). Consider the annual ground coffee expenditures for households purchasing ground coffee, assuming that these expenditures are approximately distributed as a normal random variable with a mean of $36.16 and a standard deviation of $10.00.

a. Find the probability that a household spent less than $25.00.

b. Find the probability that a household spent more than $50.00.

c. What proportion of the households spent between $30.00 and $40.00?

d. 99% of the households spent less than what amount?
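
These normal-distribution questions can be sketched with `statistics.NormalDist` from the Python standard library (available from Python 3.8), which provides both the cumulative probability and its inverse. Results may differ in the last decimal place from answers read off a printed Z table.

```python
from statistics import NormalDist

# Annual ground coffee expenditure, as specified in problem 6.7.
spend = NormalDist(mu=36.16, sigma=10.00)

p_a = spend.cdf(25.00)                     # P(X < 25)
p_b = 1 - spend.cdf(50.00)                 # P(X > 50)
p_c = spend.cdf(40.00) - spend.cdf(30.00)  # P(30 < X < 40)
x_d = spend.inv_cdf(0.99)                  # value below which 99% of households fall

print(f"a) {p_a:.4f}  b) {p_b:.4f}  c) {p_c:.4f}  d) ${x_d:.2f}")
```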

6.28

The fill amount of soft drink bottles is normally distributed with a mean of 2.0 liters and a standard deviation of 0.005 liter. Bottles that contain less than 95% of the listed net content (1.90 liters in this case) can make the manufacturer subject to penalty by the state office of consumer affairs. Bottles that have a net content above 2.10 liters may cause excess spillage upon opening. What proportion of the bottles will contain

a. Between 1.90 and 2.0 liters?

b. Between 1.90 and 2.10 liters?

c. Below 1.90 liters or above 2.10 liters?

d. 99% of the bottles contain at least how much soft drink?

e. 99% of the bottles contain an amount that is between which two values (symmetrically distributed) around the mean?

CHAPTER 8

8.1

If X̄=85, σ=8, and n=64, construct a 95% confidence interval estimate of the population mean µ.
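
With σ known, the interval is the sample mean ± Z · σ/√n. A minimal sketch follows, treating the 85 in the problem as the sample mean; the function name `z_interval` is introduced here for illustration.

```python
import math
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


lower, upper = z_interval(85, 8, 64, 0.95)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

The same function applies to problem 8.9 by passing that problem's sample mean, σ, and n.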

8.9

The inspection division of the Lee County Weights and Measures Department wants to estimate the actual amount of soft drink in 2-liter bottles at the local bottling plant of a large nationally known soft-drink company. The bottling plant has informed the inspection division that the population standard deviation for 2-liter bottles is 0.05 liter. A random sample of 100 2-liter bottles at this bottling plant indicates a sample mean of 1.99 liters.

a. Construct a 95% confidence interval estimate of the population mean amount of soft drink in each bottle.

b. Must you assume that the population of soft-drink fill is normally distributed? Explain.

c. Explain why a value of 2.02 liters for a single bottle is not unusual, even though it is outside the confidence interval you calculated.

d. Suppose that the sample mean had been 1.97 liters. What is your answer to (a)?

8.13

Construct a 95% confidence interval estimate for the population mean, based on each of the following sets of data, assuming that the population is normally distributed:

Set 1 : 1,1,1,1,8,8,8,8

Set 2 : 1,2,3,4,5,6,7,8

Explain why these data sets have different confidence intervals even though they have the same mean and range.

8.25

If n=400 and X=25, construct a 99% confidence interval estimate of the population proportion.

8.35

If you want to be 99% confident of estimating the population mean to within a sampling error of ± 0.02 and the standard deviation is assumed to be 100, what sample size is required?

8.63

The market research director for Dotty’s department store wants to study women’s spending on cosmetics. A survey is designed in order to estimate the proportion of women who purchase their cosmetics primarily from Dotty’s department store, and the mean yearly amount that women spend on cosmetics. A previous survey found that the standard deviation of the amount women spend on cosmetics in a year is approximately $18.

a. What sample size is needed to have 99% confidence of estimating the population mean to within ±$5?

b. What sample size is needed to have 90% confidence of estimating the population proportion to within ±0.045?

c. Based on the results in (a) and (b), how many of the store’s credit cardholders should be sampled? Explain.
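
The sample-size formulas behind (a) and (b) are n = (Zσ/e)² for a mean and n = Z²π(1 − π)/e² for a proportion, each rounded up. The sketch below is illustrative only: the function names are hypothetical, and since the problem gives no prior estimate of the proportion, it assumes π = 0.5, the most conservative choice.

```python
import math
from statistics import NormalDist


def z_value(confidence):
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def n_for_mean(sigma, error, confidence):
    """Sample size needed to estimate a mean to within ±error."""
    z = z_value(confidence)
    return math.ceil((z * sigma / error) ** 2)


def n_for_proportion(error, confidence, pi=0.5):
    """Sample size needed to estimate a proportion to within ±error.

    pi=0.5 is an assumption: it maximizes pi * (1 - pi).
    """
    z = z_value(confidence)
    return math.ceil(z**2 * pi * (1 - pi) / error**2)


n_a = n_for_mean(sigma=18, error=5, confidence=0.99)
n_b = n_for_proportion(error=0.045, confidence=0.90)
print(f"a) n = {n_a}   b) n = {n_b}")
```

Because the result is rounded up, a Z value rounded from a printed table can shift the answer by one; for example, Z = 2.58 in (a) gives 87 rather than 86.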

QUIZ 1

4.62

A soft-drink bottling company maintains records concerning the number of unacceptable bottles of soft drink from the filling and capping machines. Based on past data, the probability that a bottle came from machine I and was nonconforming is 0.01 and the probability that a bottle came from machine II and was nonconforming is 0.025. Half the bottles are filled on machine I and the other half are filled on machine II. If a filled bottle of soft drink is selected at random, what is the probability that

a. It is a nonconforming bottle?

b. It was filled on machine I and is a conforming bottle?

c. It was filled on machine I or is a conforming bottle?

d. Suppose you know that the bottle was produced on machine I. What is the probability that it is nonconforming?

e. Suppose you know that the bottle is nonconforming. What is the probability that it was produced on machine I?

f. Explain the difference in the answers to (d) and (e).

(Hint: Construct a 2×2 contingency table or a Venn diagram to evaluate the probabilities.)

4.67

In February 2002, the Argentine peso lost 70% of its value compared to the United States dollar. This devaluation drastically raised the price of imported products. According to a survey conducted by AC Nielsen in April 2002, 68% of the consumers in Argentina were buying fewer products than before the devaluation, 24% were buying the same number of products, and 8% were buying more products. Furthermore, in a trend toward purchasing less-expensive brands, 88% indicated that they had changed the brands they purchased (Michelle Wallin, “Argentines Hone Art of Shopping in a Crisis”, The Wall Street Journal, May 28, 2002, A15). Suppose the following complete set of results were reported.

NUMBER OF PRODUCTS PURCHASED

BRANDS PURCHASED Fewer Same More Total

Same 10 14 24 48

Changed 262 82 8 352

Total 272 96 32 400

What is the probability that a consumer selected at random:

a. Purchased the same number or more products than before?

b. Purchased fewer products and changed brands?

c. Given that a consumer changed the brands they purchased, what then is the probability that the consumer purchased fewer products than before?

4.68

Sport utility vehicles (SUVs), vans, and pickups are generally considered to be more prone to roll over than cars. In 1997, 24.0% of all highway fatalities involved a rollover; 15.8% of all fatalities in 1997 involved SUVs, vans, and pickups, given that the fatality involved a rollover. Given that a rollover was not involved, 5.6% of all fatalities involved SUVs, vans, and pickups (Anna Wilde Mathews, “Ford Ranger, Chevy Tracker Tilt in Test,” The Wall Street Journal, July 14, 1999, A2). Consider the following definitions:

A = fatality involved an SUV, van, or pickup

B = fatality involved a rollover

a. Use Bayes’ theorem to find the probability that the fatality involved a rollover, given that the fatality involved an SUV, van, or pickup.

b. Compare the result in (a) to the probability that the fatality involved a rollover, and comment on whether SUVs, vans, and pickups are generally more prone to rollover accidents.
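
Bayes’ theorem for part (a) reads P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|B′)P(B′)]. A minimal sketch with the percentages given in the problem follows; the variable names are illustrative.

```python
# Probabilities stated in problem 4.68.
p_b = 0.240              # P(B): fatality involved a rollover
p_a_given_b = 0.158      # P(A | B): SUV, van, or pickup, given a rollover
p_a_given_not_b = 0.056  # P(A | B'): SUV, van, or pickup, given no rollover

# Total probability of A, summed over rollover and no rollover.
p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

# Bayes' theorem: probability of a rollover given an SUV, van, or pickup.
p_b_given_a = p_a_given_b * p_b / p_a

print(f"P(A) = {p_a:.5f}   P(B | A) = {p_b_given_a:.4f}")
```

Comparing `p_b_given_a` with `p_b` is the comparison requested in part (b).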
