MM150 Unit 8 Seminar. Probability (Unit 7) Statistics (Unit 8) : Gathering data; organizing data...

Post on 13-Jan-2016


MM150 Unit 8 Seminar

Probability (Unit 7)

Statistics (Unit 8) : Gathering data; organizing data

Statistics (Unit 9) : Analyzing data; making conclusions

Definitions

Statistics - The art and science of gathering, analyzing, and making predictions from numerical information obtained from an experiment.

Data - The numerical information obtained.

Descriptive Stats - Concerned with the collection, organization, and analysis of data.

Ex. Out of 100 people surveyed, 54 prefer Coca-Cola over Pepsi.

Inferential Stats - Concerned with making generalizations or predictions from the data.

Ex. Based on the experiment above, Coca-Cola is better than Pepsi.

Misuses of statistics?

Population - All items or people of interest.

Sample - A subset of the population.

Why do we use statistics?

Cautions on interpreting statistics

•Statistics are used to draw conclusions about a population. Those conclusions can be wrong, so always make sure you understand how the information was collected and presented.

•While the numbers may not lie, the inferences and conclusions regarding those numbers may not be true.

Experiment (as described in textbook page 331)

•As an analogy to obtaining a sample from a population

•A jar with 90 blue marbles and 10 red marbles

•We don’t know the exact contents of this jar, but we could draw conclusions by sampling a few marbles (selecting marbles blindly).

EVERYONE:

How can someone be sure of the contents of the jar?

How large does the sample have to be?
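One way to explore these questions is a small simulation. The Python sketch below builds the jar from the slide (90 blue, 10 red) and draws blind samples of several sizes. The function name `sample_red_fraction`, the seed, and the sample sizes are illustrative assumptions, not part of the textbook experiment.

```python
import random

def sample_red_fraction(sample_size, seed=0):
    """Draw marbles blindly from the jar and return the fraction that are red."""
    rng = random.Random(seed)              # seeded so the run is repeatable
    jar = ["blue"] * 90 + ["red"] * 10     # jar contents from the slide
    drawn = rng.sample(jar, sample_size)   # draw without replacement
    return drawn.count("red") / sample_size

# Compare the estimate of "fraction red" for a few sample sizes.
for n in (5, 20, 50, 100):
    print(n, sample_red_fraction(n))
```

A small sample can land far from the true 10% red. Only drawing all 100 marbles pins down the contents exactly.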

5 ways of sampling

•We collect samples to represent our whole population of interest.

•Random

•Systematic

•Cluster

•Stratified

•Convenience

Random Sampling - Each item in the population has an equal chance of being selected, so there is no bias in how items are chosen.

The best time to use random sampling is when all items in the population are similar with regard to the characteristic we are concerned with (e.g., different colored marbles).

Best practices: Use a random number generator or a table of random numbers.
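As a minimal sketch of using a random number generator, Python's standard `random` module can pick the sample. The population of 100 numbered students and the sample size of 10 are hypothetical values chosen for illustration.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Pick k distinct items; every item has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, k)

students = list(range(1, 101))   # hypothetical population: students numbered 1-100
picked = simple_random_sample(students, 10, seed=1)
print(picked)
```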

Systematic Sampling - Obtaining a sample by drawing every nth item, with the first item determined by a random number.

What to watch for:

(1) The list from which the systematic sample is taken must contain the entire population.

(2) The sample can be biased if every nth item in the population is made or inspected by the same entity.
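The every-nth-item rule can be sketched in Python as follows. The list of 100 numbered parts and the step n = 10 are assumptions made for the example.

```python
import random

def systematic_sample(items, n, seed=None):
    """Take every nth item, starting at a random position among the first n."""
    rng = random.Random(seed)
    start = rng.randrange(n)     # random first item: index 0 .. n-1
    return items[start::n]       # then every nth item after it

parts = list(range(1, 101))      # hypothetical list covering the whole population
sample = systematic_sample(parts, 10, seed=2)
print(sample)
```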

Cluster Sampling

- A random selection of groups of units. Also called an area sample because it can be based on geography.

- Involves dividing an area into sections, then randomly selecting one or more sections and studying some or every unit in each selected section.
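A rough Python sketch of cluster sampling, assuming the area has already been divided into named sections. The section names and household labels are hypothetical, and this version studies every unit in each selected section.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole sections and keep every unit in each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)   # pick section names
    return {name: clusters[name] for name in chosen}

# Hypothetical area divided into four sections of households.
sections = {
    "north": ["N1", "N2", "N3"],
    "south": ["S1", "S2"],
    "east": ["E1", "E2", "E3", "E4"],
    "west": ["W1", "W2"],
}
selected = cluster_sample(sections, 2, seed=3)
print(selected)
```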

Stratified Sampling

- A population is divided into parts, called strata, to make sure each stratum is sampled. Some knowledge of the population is needed to complete this type of sampling.

- The strata are based on factors such as gender, race, religion, or income.

- Once the population is divided into strata, a random sample is taken from each stratum.
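The two steps (divide into strata, then sample randomly within each) might look like this in Python. The income strata, the ID ranges, and the choice of 5 people per stratum are hypothetical.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Take a random sample of the same size from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical population of person IDs already divided by income level.
strata = {
    "low income": list(range(1, 41)),
    "middle income": list(range(41, 91)),
    "high income": list(range(91, 101)),
}
sample = stratified_sample(strata, 5, seed=4)
print(sample)
```

Every stratum is represented in the result, which a simple random sample of the whole population does not guarantee.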

Convenience Sampling

- Using data that is easily or readily obtained.

Things to consider:

(1) May be the only information available

(2) Limited information is better than none

(3) Can be extremely biased

Frequency Distribution

•Why?

•To organize and summarize data

•A listing of the observed values and the corresponding frequency (how often) of occurrence of each value.

Frequency Distribution Example

Twelve students recorded the number of siblings they have. The collected data:

0, 1, 2, 3, 1, 1, 0, 3, 2, 2, 1, 0

Number of Siblings    Frequency
0                     3
1                     4
2                     3
3                     2

Note: 3 + 4 + 3 + 2 = 12
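The same tally can be checked in Python with `collections.Counter`, using the twelve sibling counts from the example. The variable names are illustrative.

```python
from collections import Counter

siblings = [0, 1, 2, 3, 1, 1, 0, 3, 2, 2, 1, 0]   # collected data from the example
frequency = Counter(siblings)                      # value -> how often it occurs

for value in sorted(frequency):
    print(value, frequency[value])

total = sum(frequency.values())                    # should equal the number of students
print("total:", total)
```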

Rules for Data Grouped by Classes

(1) The classes should be of the same “width.”

(2) The classes should not overlap.

(3) Each piece of data should belong to only one class.

*A frequency distribution should be constructed with 5 - 12 classes.

Frequency Distribution with Classes

A professor wants to construct a frequency distribution for end-of-term grades.

We must determine upper and lower class limits. One way that makes sense is

90-99 80-89 70-79 60-69 50-59 40-49 30-39 20-29 10-19 0-9

In each class, the first number is the lower class limit and the second number is the upper class limit.

Class width is 10. In 90-99, there are 10 data scores: 90, 91, 92, 93, 94, 95, 96, 97, 98, 99

Subtract consecutive upper class limits: 59 - 49 = 10

Subtract consecutive lower class limits: 20 - 10 = 10
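As a sketch of how scores fall into these classes, the Python function below maps a grade to its class using a width of 10. The list of grades is made up for illustration, and the function assumes whole-number grades from 0 to 99.

```python
from collections import Counter

def class_label(score, width=10):
    """Return the class (lower limit-upper limit) that a score belongs to."""
    lower = (score // width) * width          # lower class limit
    return f"{lower}-{lower + width - 1}"     # upper limit is lower + width - 1

grades = [93, 87, 72, 65, 88, 91, 59, 74, 78, 80]   # hypothetical grades
counts = Counter(class_label(g) for g in grades)

for label in sorted(counts, key=lambda s: int(s.split("-")[0]), reverse=True):
    print(label, counts[label])
```

Because each score maps to exactly one label, the classes cannot overlap and every piece of data belongs to only one class.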

Statistical Graphs

• Circle graphs (pie charts) – used to compare parts of a whole to the whole. Since the sections must add up to the whole pie, it’s a good visual for seeing what percentage of the whole each component is.

• Histograms and Frequency Polygons – illustrate frequency distributions.

• Stem-and-Leaf plots – similar in idea to histograms, but show the actual data values.

Circle Graph Example

32 people are interviewed to determine their favorite movie genre. Below are the results.

9 Action 16 Comedy 7 Drama

To construct the circle graph, we must find the measure of the central angle for each section of the circle graph. The circle will be divided into 3 sections.

Circle Graph Con’t

Genre     No. of People   Percent of Total         Measure of Central Angle (degrees)
Action    9               9/32 * 100 = 28.125%     0.28125 * 360 = 101.25
Comedy    16              16/32 * 100 = 50%        0.5 * 360 = 180
Drama     7               7/32 * 100 = 21.875%     0.21875 * 360 = 78.75
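The percent and central angle for each genre can be computed directly in Python from the survey counts. Working from the unrounded fractions avoids rounding error in the angles. The variable names are illustrative.

```python
votes = {"Action": 9, "Comedy": 16, "Drama": 7}   # survey results from the example
total = sum(votes.values())                        # 32 people interviewed

percents = {genre: count / total * 100 for genre, count in votes.items()}
angles = {genre: count / total * 360 for genre, count in votes.items()}

for genre in votes:
    print(genre, percents[genre], angles[genre])

# The central angles must add up to the full circle.
print("sum of angles:", sum(angles.values()))
```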

Circle Graph Con’t

[Figure: circle graph with sections labeled Comedy, Action, and Drama]

Histogram and Frequency Polygon

Stem and Leaf Plot Example

Construct a stem and leaf plot for the following data scores:

98, 85, 99, 78, 81, 77, 90, 74, 72, 81, 88, 92, 99

First list the stems:

7 |
8 |
9 |

Then add the leaves in order:

7 | 2 4 7 8
8 | 1 1 5 8
9 | 0 2 8 9 9

7 | 2 represents 72
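The construction can be sketched in Python by splitting each score into a tens digit (the stem) and a ones digit (the leaf). The function name `stem_and_leaf` is illustrative, and the sketch assumes whole-number scores where the last digit is the leaf.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Group each score's last digit (leaf) under its leading digits (stem)."""
    plot = defaultdict(list)
    for score in sorted(scores):           # sorting puts the leaves in order
        stem, leaf = divmod(score, 10)
        plot[stem].append(leaf)
    return dict(plot)

scores = [98, 85, 99, 78, 81, 77, 90, 74, 72, 81, 88, 92, 99]
plot = stem_and_leaf(scores)

for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```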

Some questions to consider

•EVERYONE: Which data score occurs most often?

•EVERYONE: What is the smallest data score?

•EVERYONE: What is the greatest data score?

•EVERYONE: What would the stem and leaf plot be for the following data? 125, 113, 130, 121, 117, 122, 122, 129, 131, 119