Groups of cars to compare: - David Sickmiller's Blogdavid.sickmiller.com/cribs/MATH408/Minitab...

MINITAB PROJECT MATH-408: Probability and Statistics David Sickmiller


MINITAB PROJECT
MATH-408: Probability and Statistics

David Sickmiller


Dr. B. Dimitrov

Tuesday, December 18, 2001

11:15-12:15 class


Introduction

When purchasing a car, in addition to price, there are several factors to consider. With each make, model, year, and engine option one gets a different car. This car will have its own combination of characteristics such as fuel economy, horsepower, acceleration time, and weight. Buyers may have different priorities in mind when shopping for a car. High fuel economy will save gas money and be healthier for the environment, but a fast acceleration time may provide more excitement while driving.

The objective of this project is to identify cars that have both high performance and good fuel economy. We will analyze the data in various ways, looking for vehicles that accelerate faster than is normal for their fuel economy. In order to manage the large amount of data, we will frequently group the cars together to make generalizations.

Groupings of cars for comparison: number of cylinders, origin

Characteristics to compare: MPG, acceleration (0-60 MPH)

Statistics and Distribution of Characteristics

First, we will examine the distribution of the MPG and 0-60 MPH characteristics of the cars. To help narrow down the search, we will separate the data by origin (American, European, or Japanese) and by the number of cylinders in the engines. This will help us discover what general type of car we are looking for.

MPG

Miles per gallon (MPG) is analyzed below. The graphs are histograms that have been fitted with a normal curve. Descriptive numeric statistics are below the graphs.

Origin    N    N*  Mean    Median  Tr. Mean  St. Dev.  SE Mean  Min.    Max.    Q1      Q3
American  249  5   20.084  18.500  19.721    6.403     0.406    9.000   39.000  15.000  24.150
European  70   3   27.891  26.500  27.579    6.724     0.804    16.200  44.300  23.750  30.775
Japanese  79   0   30.451  31.600  30.397    6.090     0.685    18.000  46.600  25.400  34.100

It is fairly easy to see that Japanese automobiles are the most promising group. Not only are their mean and median higher than those of any other group, their maximum of 46.6 is larger than the maximums of the other groups. The American cars do not closely follow the normal curve. The other two origins generally fit the curve. Now let’s compare MPG by the number of engine cylinders.
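As a quick sanity check on the table above, the SE Mean column should equal the standard deviation divided by the square root of N. The sketch below recomputes it from the reported summary statistics; the dictionary and function names are illustrative and not part of the original project.

```python
import math

# Summary statistics copied from the MPG-by-origin table:
# (N, St. Dev., reported SE Mean). N counts the non-missing observations.
mpg_by_origin = {
    "American": (249, 6.403, 0.406),
    "European": (70, 6.724, 0.804),
    "Japanese": (79, 6.090, 0.685),
}


def standard_error(n, st_dev):
    """Standard error of the mean: s / sqrt(n)."""
    return st_dev / math.sqrt(n)


for origin, (n, st_dev, reported) in mpg_by_origin.items():
    computed = standard_error(n, st_dev)
    print(f"{origin}: computed SE = {computed:.3f}, reported SE = {reported:.3f}")
```

The computed values agree with the reported column to three decimal places, which suggests the table was transcribed correctly.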


Engine Cylinders  N    N*  Mean    Median  Tr. Mean  St. Dev.  SE Mean  Min.    Max.    Q1      Q3
3                 4    0   20.55   20.25   20.55     2.56      1.28     18.00   23.70   18.25   23.15
4                 204  3   29.287  28.250  29.070    5.710     0.400    18.000  46.600  25.000  33.000
5                 3    0   27.37   25.40   27.37     8.23      4.75     20.30   36.40   20.30   36.40
6                 84   0   19.986  19.000  19.587    3.807     0.415    15.000  38.000  18.000  21.000
8                 103  5   14.963  14.000  14.801    2.836     0.279    9.000   26.600  13.00   16.000

The group with the highest MPG is that of 4-cylinder engines. Thus, Japanese four-cylinders generally have good fuel economy. This characteristic seems to follow the normal distribution.

0-60 MPH Acceleration

The time to accelerate from 0 to 60 MPH is analyzed below. The graphs are histograms that have been fitted with a normal curve. Descriptive numeric statistics are below the graphs.

Origin    N    Mean    Median  Tr. Mean  St. Dev.  SE Mean  Min.    Max.    Q1      Q3
American  254  14.943  15.000  14.928    2.805     0.176    8.000   22.200  13.000  16.750
European  73   16.822  15.700  16.622    3.011     0.352    12.200  24.800  14.500  19.250
Japanese  79   16.172  16.400  16.177    1.955     0.220    11.400  21.000  14.500  17.600

With acceleration time, lower numbers are better. American cars are the best, with the lowest median time (15.000 seconds), the lowest mean time (14.943 seconds), and the lowest minimum time (8.000 seconds). This characteristic seems to follow the normal distribution. Now let’s separate the cars by the number of cylinders.


Engine Cylinders  N    Mean    Median  Tr. Mean  St. Dev.  SE Mean  Min.    Max.    Q1      Q3
3                 4    13.250  13.500  13.250    0.500     0.250    12.500  13.500  12.750  13.500
4                 207  16.616  16.200  16.489    2.379     0.165    11.600  24.800  14.800  18.000
5                 3    18.63   19.90   18.63     2.37      1.37     15.90   20.10   15.90   20.10
6                 84   16.263  16.100  16.258    2.021     0.221    11.300  21.000  15.025  17.600
8                 108  12.837  13.000  12.738    2.254     0.217    8.000   22.200  11.500  14.000

As many would expect, the lowest mean time is from the largest engines: 12.837 seconds for 8-cylinders. It is surprising, however, that the second-lowest mean is from 3-cylinder engines. Their mean is 13.250 seconds. Incidentally, all the 3-cylinder engines are sports cars from Mazda. This characteristic seems to follow the normal distribution.

Comparison by Boxplot

Side-by-side comparison may be easier by viewing boxplots of the characteristics. While these graphs do not include the breadth of information of the previous section, they allow for a very quick evaluation of the differences. For example, origin 1 (America) produces cars with noticeably lower MPG than origin 3 (Japan). Eight-cylinder cars have lower MPG than any other engines. The lower-left boxplot reveals that the difference in acceleration between the geographical origins is not as large as the other differences.


Hypotheses

Already, we have identified differences between the cars’ characteristics. We can make several hypotheses that can be tested later.
1. Japanese cars have higher MPG than American cars.
2. Four-cylinder cars have higher MPG than 8-cylinder cars.
3. American cars have lower acceleration times than Japanese cars.
4. Eight-cylinder cars have lower acceleration times than 4-cylinder cars.

Relationship between MPG and Acceleration

At this point, we do not have enough information to choose a fast, efficient vehicle. If we wanted the most efficient, we’d get a Japanese 4-cylinder. However, that would not be fast like an American 8-cylinder. We’re looking for a combination of speed and efficiency. A scatterplot can show the relationship between these two characteristics.

The graph to the left displays acceleration on the y-axis versus MPG on the x-axis. Dots near the bottom of the graph represent cars that have lower acceleration times (i.e. fast), and points on the right indicate cars that have high MPG (i.e. clean). The bottom-right corner, which indicates fast, clean automobiles, is quite sparse. There seems to be a positive correlation between acceleration time and MPG; that is, cars with higher MPG tend to take longer to reach 60 MPH. Minitab gives a correlation of 0.420 between the two variables.

Confidence Intervals – Another look at the data

Let’s construct 95% confidence intervals for the means of the characteristics we have already compared. This will give us a range for each mean to work with, rather than a single calculated point. For MPG, American is definitely lowest, European is in the middle and slightly overlaps with the highest, Japanese. For acceleration, American is definitely lowest, Japanese is in the middle and slightly overlaps with the largest, European.

          MPG                         Acceleration
Origin    St. Dev.  95% C.I.          St. Dev.  95% C.I.
American  6.403     (19.288, 20.879)  2.805     (14.598, 15.288)
European  6.724     (26.316, 29.467)  3.011     (16.131, 17.513)
Japanese  6.090     (29.108, 31.794)  1.955     (15.741, 16.603)

Two of our hypotheses from earlier were: Japanese cars have higher MPG than American cars, and American cars have lower acceleration times than Japanese cars. The data seem to support both, because in each case the American and Japanese confidence intervals lie in the expected order and do not overlap.
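The intervals in the table can be reproduced from the summary statistics reported earlier. The sketch below assumes they were computed as mean ± 1.96 × St. Dev. / √N (a z-interval); that formula is an inference from the numbers, not something stated in the Minitab output, and the function and variable names are illustrative.

```python
import math

Z_95 = 1.96  # assumed critical value for a 95% z-interval


def mean_ci(mean, st_dev, n, z=Z_95):
    """Return (lower, upper) for mean +/- z * st_dev / sqrt(n)."""
    half_width = z * st_dev / math.sqrt(n)
    return (mean - half_width, mean + half_width)


def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# (Mean, St. Dev., N) copied from the descriptive statistics tables.
mpg_stats = {
    "American": (20.084, 6.403, 249),
    "European": (27.891, 6.724, 70),
    "Japanese": (30.451, 6.090, 79),
}
accel_stats = {
    "American": (14.943, 2.805, 254),
    "European": (16.822, 3.011, 73),
    "Japanese": (16.172, 1.955, 79),
}

mpg_ci = {origin: mean_ci(*stats) for origin, stats in mpg_stats.items()}
accel_ci = {origin: mean_ci(*stats) for origin, stats in accel_stats.items()}

for origin in mpg_ci:
    lo_m, hi_m = mpg_ci[origin]
    lo_a, hi_a = accel_ci[origin]
    print(f"{origin}: MPG ({lo_m:.3f}, {hi_m:.3f})  Acceleration ({lo_a:.3f}, {hi_a:.3f})")

print("American vs Japanese MPG overlap:",
      intervals_overlap(mpg_ci["American"], mpg_ci["Japanese"]))
print("American vs Japanese acceleration overlap:",
      intervals_overlap(accel_ci["American"], accel_ci["Japanese"]))
```

Non-overlapping intervals are a conservative, informal indication of a difference between means; a two-sample test would be the more direct way to test the hypotheses.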

Regression

Regression gives us the most help in determining which cars stand out as having a good combination of acceleration and MPG. Below is Minitab’s regression analysis:

The regression equation is acceleration = 12.1 + 0.148 mpg

398 cases used 8 cases contain missing values

Predictor  Coef     StDev    T      P
Constant   12.0811  0.3986   30.31  0.000
mpg        0.14829  0.01609  9.22   0.000

S = 2.505    R-Sq = 17.7%    R-Sq(adj) = 17.5%

Analysis of Variance

Source          DF   SS       MS      F      P
Regression      1    533.31   533.31  84.96  0.000
Residual Error  396  2485.82  6.28
Total           397  3019.12

Unusual Observations
Obs  mpg   accelera  Fit     StDev Fit  Residual  St Resid
7    32.7  11.400    16.930  0.194      -5.530    -2.21R
33   44.6  13.800    18.695  0.362      -4.895    -1.97 X
41   46.6  17.900    18.991  0.392      -1.091    -0.44 X
99   30.0  21.800    16.530  0.163      5.270     2.11R
108  19.0  21.900    14.899  0.145      7.001     2.80R
109  27.2  24.800    16.115  0.139      8.685     3.47R
135  43.1  21.500    18.472  0.339      3.028     1.22 X
138  26.0  21.000    15.937  0.132      5.063     2.02R
140  23.0  23.500    15.492  0.126      8.008     3.20R
147  43.4  23.700    18.517  0.344      5.183     2.09RX
148  44.0  24.600    18.606  0.353      5.994     2.42RX
150  41.5  14.700    18.235  0.315      -3.535    -1.42 X
151  44.3  21.700    18.650  0.357      3.050     1.23 X
154  15.0  8.500     14.305  0.186      -5.805    -2.32R
159  23.0  20.500    15.492  0.126      5.008     2.00R
182  17.0  21.000    14.602  0.164      6.398     2.56R
217  29.0  22.200    16.382  0.153      5.818     2.33R
221  28.8  11.300    16.352  0.152      -5.052    -2.02R
225  14.0  9.000     14.157  0.198      -5.157    -2.06R
246  24.5  22.100    15.714  0.127      6.386     2.55R
282  32.0  11.600    16.826  0.185      -5.226    -2.09R
307  18.0  21.000    14.750  0.154      6.250     2.50R
316  15.0  19.500    14.305  0.186      5.195     2.08R
337  9.0   18.500    13.416  0.265      5.084     2.04R
345  15.0  21.000    14.305  0.186      6.695     2.68R
352  23.9  22.200    15.625  0.126      6.575     2.63R
362  14.0  8.000     14.157  0.198      -6.157    -2.47R
369  14.0  8.500     14.157  0.198      -5.657    -2.27R

R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large influence.
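Several of the summary figures in the regression output follow directly from the Analysis of Variance table, and the square root of R-Sq should match the correlation of 0.420 reported earlier. The sketch below is only an arithmetic cross-check using the printed values; the variable names are illustrative.

```python
import math

# Values copied from the Minitab Analysis of Variance table.
ss_regression = 533.31
ss_residual = 2485.82
ss_total = 3019.12
df_regression = 1
df_residual = 396
slope = 0.14829  # coefficient of mpg, used only for the sign of r

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_stat = ms_regression / ms_residual   # F statistic
r_squared = ss_regression / ss_total   # R-Sq as a proportion
s = math.sqrt(ms_residual)             # residual standard error, S
# In simple linear regression, r has the sign of the slope and |r| = sqrt(R-Sq).
r = math.copysign(math.sqrt(r_squared), slope)

print(f"F = {f_stat:.2f}")
print(f"R-Sq = {100 * r_squared:.1f}%")
print(f"S = {s:.3f}")
print(f"r = {r:.3f}")
```

The low R-Sq is worth keeping in mind: MPG explains under a fifth of the variation in acceleration time, so individual cars can sit far from the fitted line.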

Conclusion

Now we know which cars we want! Observations #7, 33, 41, 150, 154, 221, 225, 282, 362, and 369 stand out from the rest of the data. Their residuals and standardized residuals are negative, which means they lie below the regression line on the scatterplot: they accelerate faster than predicted for their MPG. Those observation numbers translate to these cars:

1980 Datsun 280ZX             32.7 MPG  11.4 sec.
1980 Honda Civic 1500 GL      44.6 MPG  13.8 sec.
1980 Mazda GLC                46.6 MPG  17.9 sec.
1980 VW Rabbit                41.5 MPG  14.7 sec.
1970 AMC Ambassador DPL       15.0 MPG  8.5 sec.
1979 Chevrolet Citation (V6)  28.8 MPG  11.3 sec.
1970 Chevrolet Impala         14.0 MPG  9.0 sec.
1982 Dodge Rampage            32.0 MPG  11.6 sec.
1970 Plymouth 'Cuda 340       14.0 MPG  8.0 sec.
1970 Plymouth Fury III        14.0 MPG  8.5 sec.

The regression equation, as stated above, says that the predicted acceleration time is about 12.1 seconds plus an additional 0.148 seconds for each mile per gallon. The Ambassador, Impala, ‘Cuda, and Fury are so fast that it does not matter very much how fuel-efficient they are. At the other extreme, the high fuel efficiency of the Mazda GLC at 46.6 MPG means its expected acceleration time is about 19.0 seconds, so its actual 17.9 seconds is only slightly better than predicted. However, the cars in the middle are still quite respectable by 2001 standards. The Datsun 280ZX, Chevy Citation, and Dodge Rampage all have about 30 MPG and mid-11-second times. I would recommend these cars for the consumer interested in both performance and fuel conservation.
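To make the comparison concrete, the fitted regression equation can be applied to each of the ten cars to recover the Fit and Residual columns from the Minitab output. The sketch below uses the reported coefficients; the function and variable names are illustrative.

```python
# Coefficients copied from the Minitab regression output.
INTERCEPT = 12.0811
SLOPE = 0.14829

# (car, MPG, observed 0-60 time in seconds) from the list of selected cars.
cars = [
    ("1980 Datsun 280ZX", 32.7, 11.4),
    ("1980 Honda Civic 1500 GL", 44.6, 13.8),
    ("1980 Mazda GLC", 46.6, 17.9),
    ("1980 VW Rabbit", 41.5, 14.7),
    ("1970 AMC Ambassador DPL", 15.0, 8.5),
    ("1979 Chevrolet Citation (V6)", 28.8, 11.3),
    ("1970 Chevrolet Impala", 14.0, 9.0),
    ("1982 Dodge Rampage", 32.0, 11.6),
    ("1970 Plymouth 'Cuda 340", 14.0, 8.0),
    ("1970 Plymouth Fury III", 14.0, 8.5),
]


def predicted_acceleration(mpg):
    """Predicted 0-60 time (seconds) from the fitted line."""
    return INTERCEPT + SLOPE * mpg


# Residual = observed - predicted; negative means faster than predicted.
residuals = {
    name: accel - predicted_acceleration(mpg) for name, mpg, accel in cars
}

# Print the cars from most to least "faster than expected".
for name, residual in sorted(residuals.items(), key=lambda item: item[1]):
    print(f"{name:30s} residual = {residual:6.3f} sec.")
```

Sorting by residual shows how much faster each car is than the line predicts for its MPG, which is the quantity the recommendation rests on.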