Presented by Del Ferster. What’s in store for tonight? I have lots of “practice problems” that...

Post on 04-Jan-2016


Presented by Del Ferster

What’s in store for tonight?

I have lots of “practice problems” that cover the entire spectrum of statistics that are being considered this year.

The solutions to these problems will also be presented.

We’ll spend some time on the different types of sampling.

What’s in store for tonight?

We’ll consider the difference between association and causation.

I have a good M&M activity for us to do (maybe you’re right, maybe it’s just an excuse to eat chocolate).

I have a nice Starburst activity that we’ll consider, too.

Another look at some “test-like” problems that deal with a variety of statistics topics.

You got a problem??

This is a rather large set of problems, so I’m going to give you a while to work on them.

Hopefully, some of the ideas come back quickly.

When you’re set, we’ll look at the solutions.

Solutions to Review Problems

1. Minimum x value is between 30 and 40, so A
2. Minimum y value is between 60 and 70, so D
3. y-intercept (x = 0) on final exam, overall average is 59.351, so A
4. C
5. D
6. A
7. B
8. C

Solutions to Review Problems

9. D
10. D
11. C
12. B
13.

                     8:00 Class   9:00 Class   TOTAL
Earned an A              18           12         30
Did not earn an A         4            6         10
TOTAL                    22           18         40

Solutions to Review Problems

14. 30/40 = 75%
15. a = 18/22 = 81.82%, b = 12/18 = 66.7%, c = 30/40 = 75%; b < c < a, so C

Solutions to Review Problems

16. x = 18/30 = 60%, y = 4/10 = 40%, z = 22/40 = 55%; y < z < x, so C

Solutions to Review Problems

17A.

           Type A   Type B   Type AB   Type O   TOTAL
Junior        7        5        5         9       26
Senior        1        6        8         6       21
TOTAL         8       11       13        15       47

17B. 26/47 = 55.319%
17C. 13/47 = 27.660%
17D. Type O: Juniors 9/26 = 34.615%, Seniors 6/21 = 28.571%, so the greater percentage is among Juniors.

Solutions to Review Problems

18A. Approximately 86% (86.387%)
18B. Approximately 95% (95.408%)
18C. A student whose quiz average is 0 has a final course grade of 41.667.
18D. For each increase of 1 percent in quiz average, the final course average increases by 0.559 percent.

Solutions to Review Problems

19.

                         Student Smokes   Student Does Not Smoke   TOTAL
Both Parents Smoke            400                 1380             1780
One Parent Smokes             416                 1823             2239
Neither Parent Smokes         188                 1168             1356
TOTAL                        1004                 4371             5375

19A. 5375
19B. 1004/5375 = 18.68%
19C. 400/1780 = 22.47%
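Every percentage in problem 19 is one cell or total divided by a row, column, or overall total. A minimal Python sketch of that arithmetic, assuming the table for problem 19 is stored as a dictionary (the names `counts`, `pct`, and `row_pct` are illustrative, not from the slides):

```python
# Counts from the problem 19 table: (student smokes, student does not smoke).
counts = {
    "Both parents smoke": (400, 1380),
    "One parent smokes": (416, 1823),
    "Neither parent smokes": (188, 1168),
}

def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)

smokers_total = sum(s for s, _ in counts.values())       # column total
nonsmokers_total = sum(n for _, n in counts.values())    # column total
grand_total = smokers_total + nonsmokers_total           # overall total

# Percent of students who smoke within each parent group (row percentages).
row_pct = {group: pct(s, s + n) for group, (s, n) in counts.items()}

print(grand_total)                       # overall total, as in 19A
print(pct(smokers_total, grand_total))   # share of all students who smoke, as in 19B
for group, value in row_pct.items():     # row percentages, as in 19C to 19E
    print(group, value)
```

Dividing by a row total answers "what percent of this parent group smokes?", while dividing by a column total answers "what percent of smokers come from this parent group?". Picking the right denominator is the whole problem.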

Solutions to Review Problems

19D. 416/2239 = 18.58%
19E. 188/1356 = 13.86%
19F.

Solutions to Review Problems

19G. 1380/4371 = 31.57%
19H. 188/1004 = 18.73%
19I. 816/1004 = 81.27%
20A. For each increase of 1 inch in wheel diameter, coasting distance increases by 5.332 inches.
20B. A wheel that has a diameter of 0 inches will have a coasting distance of 10.585 inches.
20C. Approximately 53 inches (53.241)

Solutions to Review Problems

20D. Approximately 17 inches (16.770)
20E. No. The correlation is clearly positive, most likely near POSITIVE 1.

21A. Write each number on a slip of paper, put the paper slips in a hat (a Packers hat works BEST), then select 10 slips from the hat.

21B. Group the numbers into 5 strata (1-20, 21-40, 41-60, 61-80, and 81-100), then randomly select 2 numbers from each stratum. (Impressive Latin knowledge, eh?)

Solutions to Review Problems

21C. Randomly select one of the groups (1-20, 21-40, 41-60, 61-80, or 81-100), then randomly select 10 numbers from that group.

Solutions to Review Problems

21D. Pull a random number, then include every 5th number after that number. (Note: it doesn’t have to be the 5th number. In reality, the “span number” should be random, too.)
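The four answers to problem 21 can be tried out in a few lines. This is only a sketch using Python's standard `random` module on the numbers 1 through 100; the fixed seed and the span `k = 10` are assumptions made here (the seed for repeatability, the span so that exactly 10 numbers come out), not something the slides specify.

```python
import random

population = list(range(1, 101))                             # the numbers 1 through 100
groups = [population[i:i + 20] for i in range(0, 100, 20)]   # 1-20, 21-40, ..., 81-100
rng = random.Random(2016)                                    # fixed seed, only for repeatability

# 21A. Simple random sample: 10 slips drawn from the hat.
simple = rng.sample(population, 10)

# 21B. Stratified sample: 2 numbers from each of the 5 strata.
stratified = [n for stratum in groups for n in rng.sample(stratum, 2)]

# 21C. Pick one group at random, then 10 numbers from that group.
chosen_group = rng.choice(groups)
from_one_group = rng.sample(chosen_group, 10)

# 21D. Systematic sample: random starting point, then every k-th number.
# k = 10 is an assumption chosen so that the sample has 10 numbers.
k = 10
start = rng.randint(1, k)
systematic = list(range(start, 101, k))

print(simple, stratified, from_one_group, systematic, sep="\n")
```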

Solutions to Review Problems

22. Association implies some kind of relationship exists between the two variables, but stops short of saying a change in x (the explanatory variable) causes a change in y (the response variable).

Solutions to Review Problems22 (continued).

To conclude causation, an experiment (not an observational study) must be done, where subjects are randomly assigned to 2 groups—experimental and control. Other variables must be controlled or eliminated.

Association doesn’t require control, or random assignment of subjects to 2 groups. Observational studies can imply association.
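Random assignment is the step that separates an experiment from an observational study, and it is simple to carry out. A minimal sketch, assuming 20 hypothetical subjects (the subject labels, group sizes, and seed are made up for illustration):

```python
import random

# 20 hypothetical subjects; the labels are placeholders, not real data.
subjects = [f"subject_{i}" for i in range(1, 21)]

rng = random.Random(42)      # fixed seed, only so the split can be reproduced
shuffled = subjects[:]       # copy so the original list is left untouched
rng.shuffle(shuffled)        # chance alone decides the order

half = len(shuffled) // 2
experimental = shuffled[:half]   # receives the treatment
control = shuffled[half:]        # does not receive the treatment

print("Experimental:", experimental)
print("Control:", control)
```

Because chance decides who lands in each group, other variables tend to balance out between the two groups, which is what lets a difference in the response be attributed to the treatment.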

A Quick look at basic terms, ways to represent results, and 2-way tables.

Qualitative variables classify the data into categories.

The categories may or may not have a natural ordering to them.

Qualitative variables are also called categorical variables.

EXAMPLES
Eye color
Favorite NFL team
Gender
Do you smoke?

Qualitative Variables/Categorical Variables

Distribution of a categorical variable

The distribution of a categorical variable provides the possible values that a variable can take on and how often these possible values occur.

The distribution of a categorical variable shows the pattern of variation of the variable.

According to the Bureau of Justice, the following data represent the number of inmates by ethnicity in 2007.

Example #1

White     338,400
Black     301,900
Hispanic  125,600

Graphing Qualitative Data

Often, rather than simply presenting numerical values, we choose to graph our data.

When generating a graph of 1 categorical variable, we might consider the following types of graph.
Pie Chart
Bar Graph

Pie Chart

A pie chart displays the distribution of the qualitative variable by dividing the circle into wedges corresponding to the categories of the variable, such that the angle of each wedge is proportional to the percentage of items in that category.

Pie Charts are easy to do in EXCEL.

A Pie Chart for the Prison Data

[Pie chart: White 338,400; Black 301,900; Hispanic 125,600]

A Pie Chart for the Prison Data (Using Percents)

[Pie chart of the same data, labeled with percents]

Bar Graph

A bar graph displays the distribution of a qualitative variable by listing the categories of the variable along one axis and drawing a bar over each category with a height equal to the count or percentage of items in that category.

The bars should all be of equal width. We could also do one using percents.

Bar Graph for the Prison Data

[Bar graph: White 338,400; Black 301,900; Hispanic 125,600]
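The numbers behind both charts come from the same calculation: each category's share of the total. A short sketch, assuming the percentages are taken out of the three listed groups only (the slides do not say whether other groups were part of the total):

```python
# Inmate counts from Example #1.
inmates = {"White": 338_400, "Black": 301_900, "Hispanic": 125_600}
total = sum(inmates.values())

# Percent of the total: the bar height when graphing percents.
percents = {group: round(100 * count / total, 1) for group, count in inmates.items()}

# Wedge angle in degrees: the same share applied to the 360 degrees of a pie chart.
angles = {group: round(360 * count / total, 1) for group, count in inmates.items()}

for group in inmates:
    print(f"{group:<9}{inmates[group]:>9,}{percents[group]:>7}%{angles[group]:>8} deg")
```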

Categorical Variables place individuals into one of several groups or categories.

The values of a categorical variable are labels for the different categories.

The distribution of a categorical variable lists the count or percent of individuals who fall into each category.

Comparing 2 Categorical Variables

When a dataset involves two categorical variables, we begin by examining the counts or percents in various categories for one of the variables.

Comparing 2 Categorical Variables

Two-way Table – describes two categorical variables, organizing counts according to a row variable and a column variable.


Two-Way Tables

Two-way tables come about when we are interested in the relationship between two categorical variables.

One of the variables is the row variable.

The other is the column variable.

The combination of a row variable and a column variable is a cell.

Dr. F is hosting 38 of his friends to a cookout. Now, Dr. F. has limited cooking skills, so everyone is having a burger. However, he has bought sufficient tomatoes so that anyone who wants tomato on his or her burger will be happy.

The following slide details the results of his burger and tomato survey.

For the record... a good burger needs only 2 things... CHEESE... and KETCHUP!

Example #2

Burger/Tomato Two-Way Table

Let’s look at the components of a 2-way table.

GENDER * TOMATOES Crosstabulation (Count)

                TOMATOES
GENDER        N       Y     Total
F            11       8       19
M             6      13       19
Total        17      21       38

Row variable: GENDER
Column variable: TOMATOES
Row totals: 19 and 19
Column totals: 17 and 21
Overall total: 38
Cells: 11, 8, 6, 13
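The parts of the burger/tomato table can be rebuilt from the four cells alone. A minimal sketch (the nested-dictionary layout and variable names are choices made here for illustration):

```python
# Cell counts from the GENDER * TOMATOES crosstabulation.
cells = {
    "F": {"N": 11, "Y": 8},
    "M": {"N": 6, "Y": 13},
}

# Row totals: add across each row.
row_totals = {gender: sum(row.values()) for gender, row in cells.items()}

# Column totals: add down each column.
column_totals = {
    answer: sum(row[answer] for row in cells.values()) for answer in ("N", "Y")
}

# Overall total: the sum of the row totals (or of the column totals).
overall_total = sum(row_totals.values())

# Percent of each gender that wants tomato (cell divided by its row total).
wants_tomato = {
    gender: round(100 * row["Y"] / row_totals[gender], 1)
    for gender, row in cells.items()
}

print(row_totals, column_totals, overall_total, wants_tomato)
```

The row totals and the column totals must both add up to the overall total, which is a quick way to check a table for arithmetic slips.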

A quick look at basic terms, and an introduction to linear regression and correlation.

Quantitative variables have numerical values that are measurements (length, weight, and so on) or counts (of how many).

Examples:
How many are in your family?
How many cars do you own?

Quantitative Variables

We further distinguish quantitative variables based on whether or not the values fall on a continuum.

A discrete variable is one for which you can count the number of possible values.
  Example: how many siblings a person has

A continuous variable can take on any value within a given interval.
  Example: a person’s weight

More on Quantitative Variables

SCATTER PLOTS

Scatterplot

A graphical display of two quantitative variables

We plot the explanatory (independent) variable on the x-axis and the response (dependent) variable on the y-axis.

Each dot represents a single observation and its ordered pair (x,y)

Describing Scatterplots

When we consider scatterplots, we focus on 4 things:
Direction
Form
Scatter
Unusual elements

Direction

Positive: as values of the explanatory variable increase, values in the response variable tend to increase.

As x gets larger, y gets larger.

Direction

Negative: as values of the explanatory variable increase, values in the response variable tend to decrease.

As x gets larger, y gets smaller.

Direction

Null: no discernible pattern of change in the response variable

Form (Shape)

Linear: The shape has the appearance of a linear relationship.

There doesn’t have to be a perfect fit.

Form

Curved: We can use logarithms to transform into linear forms.

Form

None: No discernible form

Strength (Scatter)

Strong association: very little scatter

Strength

Moderate strength: moderate scatter

Strength

Weak strength: lots of scatter

Unusual Features

Outliers: they just don’t fit the trend

Determining the LINE that best fits our data.

Regression Line

A regression line is a straight line that describes how a response variable y changes as an explanatory variable x changes.

A regression line summarizes the relationship between two variables, but only in a specific setting: when one of the variables helps explain or predict the other.

Regression Line

We often use a regression line to predict the value of y for a given value of x.

Regression, unlike correlation, requires that we have an explanatory variable and a response variable.

Regression Line

Fitting a line to data means drawing a line that comes as close as possible to the points.

Extrapolation: the use of a regression line for prediction far outside the range of values of the explanatory variable x that you used to obtain the line.

Such predictions are often not accurate.

Linear Regression

Regression analysis finds the equation of the line that best describes the relationship between the two variables.

In other words, it finds the line that best fits the data represented on our scatterplot.

While there are formulas to calculate this line, most of the time we’d use a graphing calculator or an app for our iPad.
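For the curious, the formulas a calculator applies are short. The sketch below uses the standard least-squares formulas, slope b = Sxy / Sxx and intercept a = (mean of y) - b(mean of x). The five data points are made up purely for illustration and do not come from any of tonight's problems.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sxy: how x and y vary together; Sxx: how x varies on its own.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b

# Hypothetical data, chosen only to keep the arithmetic easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares(xs, ys)
print(f"y = {a:.1f} + {b:.1f}x")
```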

Interpreting our line

The slope, b, is the amount by which y changes when x increases by one unit.

The intercept, a, is the value of y when x = 0.

y = a + bx
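The line from review problem 20 gives a concrete reading of a and b. In this sketch the intercept 10.585 and slope 5.332 come from the solutions to 20A and 20B. The 8-inch diameter is an assumption, picked because it reproduces the 53.241 inches reported in 20C.

```python
A = 10.585   # intercept a: predicted coasting distance when diameter is 0 (20B)
B = 5.332    # slope b: extra inches of coasting per extra inch of diameter (20A)

def predict(diameter):
    """Predicted coasting distance (inches) for a wheel diameter (inches)."""
    return A + B * diameter

at_zero = predict(0)                   # equals the intercept
at_eight = predict(8)                  # assumed diameter of 8 inches
one_step = predict(9) - predict(8)     # equals the slope

print(round(at_zero, 3), round(at_eight, 3), round(one_step, 3))
```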

A way to measure the strength of a LINEAR trend.

CORRELATION, denoted by r, measures the direction and strength of the linear relationship between two quantitative variables.

General Properties

It must be between -1 and 1 (-1 ≤ r ≤ 1).
If r is negative, the relationship is negative.
If r = -1, there is a perfect negative linear relationship (extreme case).
If r is positive, the relationship is positive.

Some facts about CORRELATION

General Properties

If r = 1, there is a perfect positive linear relationship (extreme case).
If r is 0, there is no linear relationship.
r measures the strength of the linear relationship.
If explanatory and response are switched, r remains the same.
r has no units of measurement associated with it.
Scale changes do not affect r.

Some facts about CORRELATION

Association does not imply causation.
Correlation does not imply causation.
Slope is not correlation.
A scale change does not change the correlation.

Correlation doesn’t measure the strength of a non-linear relationship.

Summary of Correlation
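Several of these facts can be checked numerically. The sketch below computes r from its usual definition, r = Sxy / sqrt(Sxx * Syy), on made-up data (none of these numbers come from tonight's problems), then swaps the variables, changes the scale, and tries a curved pattern.

```python
import math

def correlation(xs, ys):
    """Pearson correlation r between two equal-length lists of numbers."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
r_swapped = correlation(ys, xs)     # explanatory and response switched

# Scale change: x converted from inches to centimeters, y shifted by 10.
r_rescaled = correlation([2.54 * x for x in xs], [y + 10 for y in ys])

# A perfect curved (parabolic) pattern that a straight line cannot describe.
r_parabola = correlation([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(round(r, 4), round(r_swapped, 4), round(r_rescaled, 4), round(r_parabola, 4))
```

The parabola has r = 0 even though y is completely determined by x, which is why r should only be read as a measure of linear strength.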

A look at the different ways in which we can acquire a sample.

Data Collection

In research, statisticians use data in many different ways. Data can be used to describe situations. Data can be collected in a variety of ways, BUT if the sample data are not collected in an appropriate way, the data may be so completely useless that no amount of statistical torturing can salvage them.

Basic Methods of Sampling

Random Sampling

Selected by using chance or random numbers

Each individual subject (human or otherwise) has an equal chance of being selected

Examples:
Drawing names from a hat
Random numbers

Basic Methods of Sampling

Systematic Sampling

Select a random starting point and then select every kth subject in the population

Simple to use, so it is used often

Basic Methods of Sampling

Convenience Sampling

Use subjects that are easily accessible

Examples:
Using family members or students in a classroom
Mall shoppers

Basic Methods of Sampling

Stratified Sampling

Divide the population into at least two different groups with common characteristic(s), then draw SOME subjects from each group (each group is called a stratum; the plural is strata)

Results in a more representative sample

Basic Methods of Sampling

Cluster Sampling

Divide the population into groups (called clusters), randomly select some of the groups, and then collect data from ALL members of the selected groups

Used extensively by government and private research organizations

Examples: Exit polls

A look at the differences.

Types of Experiments

Observational Studies

The researcher merely observes what is happening or what has happened in the past and tries to draw conclusions based on these observations

No interaction with subjects, usually
No modifications on subjects
Occur in natural settings, usually
Can be expensive and time consuming
Example: Surveys (telephone, mailed questionnaire, personal interview)

Types of Experiments

Experimental Studies

The researcher manipulates one of the variables and tries to determine how the manipulation influences other variables

Interaction with subjects occurs, usually
Modifications on subjects occur
May occur in unnatural settings (labs or classrooms)
Example: Clinical trials of new medications, treatments, etc.

Wrapping it all up

Again, I thank you for your attention, participation, and effort. I truly do know how long the day is for you!

I hope that you and your family enjoy a wonderful Thanksgiving and Christmas time.

Take some time to relax, and be with the ones that matter to you!