What is Statistics - StartLogicwellsmat.startlogic.com/sitebuildercontent/sitebuilder... · 2010....

Post on 13-Sep-2020

Introduction to Statistics

Chapter 1

Overview

Lesson 1-1

What is Statistics?

Statistics is the science of:

Collecting information

Surveys are used to collect data from a small part of

a larger group so we can learn something about the

larger group

Organizing and summarizing the information

collected

Analyzing the information collected in order to

draw conclusions.

Two Types of Statistics

Descriptive Statistics

Organizing and summarizing the information

collected.

Inferential Statistics

Drawing conclusions from the information

collected.

Individuals and Variables

Data are observations (such as measurements, gender, or survey responses) that have been collected.

Individuals are the objects described by a set of

data.

Individuals may be people, animals or things.

A variable is any characteristic of an individual.

A variable can take different values for different

individuals.

Populations and Samples

A population is the entire group of individuals about which we want information.

A sample is part of the population from which we

actually collect information, which is then used to

draw conclusions about the whole.

A census is a sample survey that attempts to

include the entire population in the sample

Example – Page 11, #18

Identify the (a) sample and (b) population. Also determine

whether the sample is likely to be representative of the

population.

Nielsen Media Research surveys 5000 randomly selected

households and finds that among TV sets in use 19% are

tuned to 60 Minutes.

Sample: 5000 selected households

Population: All households

Sample is representative of the population

Example – Page 11, #20

Identify the (a) sample and (b) population. Also determine

whether the sample is likely to be representative of the

population.

A graduate student at the University of Newport conducts a

research project about how adult Americans communicate.

She begins with a survey mailed to 500 adults that she knows.

She asks them to mail back a response to this question: “Do you

prefer to use e-mail or the U.S. Postal service?” She gets back

65 responses, with 42 of them indicating a preference for postal

mail.

Sample: 65 respondents

Population: All adult Americans

Sample is not representative of the population
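
The arithmetic behind this answer can be checked with a short Python sketch. The numbers come from the example; the variable names are only illustrative. Only a small share of the mailed surveys came back, and the people who answered chose to do so, which is why the result describes the 65 respondents and not all adult Americans.

```python
# Figures taken from the example above.
mailed = 500         # surveys mailed to adults the student knows
returned = 65        # responses mailed back
prefer_postal = 42   # respondents who prefer postal mail

# Share of mailed surveys that were returned.
response_rate = returned / mailed

# Share of respondents (not of all adults) who prefer postal mail.
postal_share = prefer_postal / returned

print(f"response rate: {response_rate:.1%}")
print(f"postal preference among respondents: {postal_share:.1%}")
```

The second figure is a statistic about a self-selected group, so it should not be read as an estimate for the population.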

Types of Data

Lesson 1-2

Parameters and Statistics

A parameter is a number that describes the

population

A parameter is a fixed number, but in practice

we don’t know its value

A statistic is a number that describes a sample.

The value of a statistic is known when we have

taken a sample, but it can change from sample

to sample

We often use a statistic to estimate an

unknown parameter.
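
A minimal Python sketch of this distinction follows. The population is made up for illustration (it is not data from the textbook), and the names `population`, `parameter`, and `sample_mean` are hypothetical. The population mean is computed once and stays fixed, while the sample mean can change each time a new sample is drawn.

```python
import random
from statistics import mean

# Hypothetical population: textbooks purchased by each of 1,000 students.
rng = random.Random(1)
population = [rng.randint(0, 8) for _ in range(1000)]

# Parameter: describes the whole population and does not change.
parameter = mean(population)

def sample_mean(n, rng):
    """Statistic: the mean of a simple random sample of size n."""
    return mean(rng.sample(population, n))

# Three separate samples give three statistics, which usually differ.
statistics_seen = [sample_mean(50, rng) for _ in range(3)]

print(f"parameter (population mean): {parameter:.2f}")
for i, s in enumerate(statistics_seen, start=1):
    print(f"statistic from sample {i}: {s:.2f}")
```

In practice the parameter line could not be computed, because the whole population is rarely available; each sample mean would serve as an estimate of it.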

Example – Page 9, #2

Determine whether the given value is a statistic or

a parameter.

A sample of students is selected and the average (mean)

number of textbooks purchased this semester is 4.2

Statistic

Example – Page 9, #4

Determine whether the given value is a statistic or

a parameter.

In a study of all 2223 passengers aboard the Titanic, it was found that 706 survived when it sank.

Parameter

Types of data

Categorical Data

The individuals being studied are grouped into

categories based on some qualitative trait.

The resulting data are merely labels or

categories

Measurement Data

The individuals being studied are “measured”

based on some quantitative trait.

The resulting data are sets of numbers.

Measurement data is classified as

Discrete

Results when the number of possible values is either a finite number or a “countable” number. (That is, the number of possible values is 0 or 1 or 2 and so on.)

Continuous

Results from infinitely many possible values that

correspond to some continuous scale that covers a

range of values without gaps, interruptions, or

jumps.

Example – Page 10, #6

Determine whether the given values are from a discrete or continuous data set.

A statistics student obtains sample data and finds that the mean weight of cars in the sample is 3126 lb.

Continuous

Example – Page 10, #8

Determine whether the given values are from a discrete or continuous data set.

When 19,218 gas masks from branches of the U.S. military

Discrete

Review – Types of Data

Categorical or Qualitative Data

Measurement or Quantitative Data

Discrete Data – Counting

Continuous Data - Measuring

Four Levels of Measurement

Nominal

The data cannot be arranged in an ordering scheme (such as low to high).

Ordinal

The categories are ordered, but differences

can’t be found or are meaningless.

Interval

The categories are ordered, the differences are meaningful, there is no natural zero starting point, and ratios are meaningless.

Ratio

The categories are ordered, the differences are

meaningful, there is a natural zero starting point

and the ratios are meaningful.
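
The difference between the interval and ratio levels can be sketched with temperatures. This example is an added illustration, not one of the textbook exercises: Celsius readings have no natural zero (interval level), while kelvin readings do (ratio level). The function name `celsius_to_kelvin` is just a helper for the sketch.

```python
import math

def celsius_to_kelvin(c):
    """Convert a Celsius reading to kelvin (kelvin has a natural zero)."""
    return c + 273.15

a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences agree on both scales, so differences are meaningful.
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios disagree, so "twice as hot" is not meaningful in Celsius.
ratio_c = b_c / a_c
ratio_k = b_k / a_k

print(f"difference: {diff_c:.2f} C vs {diff_k:.2f} K")
print(f"ratio:      {ratio_c:.3f} in Celsius vs {ratio_k:.3f} in kelvin")
print("differences match:", math.isclose(diff_c, diff_k))
```

The ratio changes when the arbitrary zero point moves, which is why ratios are only trusted for data with a natural zero.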

Example – Page 10, #10

Determine which of the four levels of measurement

(nominal, ordinal, interval, ratio) is most appropriate.

Ratings of fantastic, good, average, poor, or unacceptable

for blind dates.

Ordinal

Example – Page 10, #12

Determine which of the four levels of measurement

(nominal, ordinal, interval, ratio) is most appropriate.

Numbers on the jerseys of women basketball players

in the WNBA.

Nominal

Example – Page 10, #14

Determine which of the four levels of measurement

(nominal, ordinal, interval, ratio) is most appropriate.

Social security numbers.

Nominal

Critical Thinking

Lesson 1-3

Misuses of Statistics

Bad Samples

Small Samples

Misleading Graphs

Pictographs

Distorted Percentages

Loaded Questions

Order of Questions

Refusals

Correlation and Causality

Self-Interest Study

Precise Numbers

Partial Pictures

Deliberate Distortions

Bad Samples

A voluntary response sample is one in which the respondents themselves decide whether to be included.

Examples

Polls conducted through the Internet

Mail-in polls

Telephone call-in polls

Bad Samples

Loaded Questions

Misleading Graphs

Suppose that you’re looking into a summer job and you see an advertisement for a company that says that the current salaries are significantly higher than they were two years ago.

What impression do you get about the improvement in salaries at this company from this graph?

[Graph: average starting salary ($ per hour) by year, 2002 and 2004; axis labels 10 and 12]
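
A rough sketch of why such a graph can mislead. It assumes the graph compares about $10 per hour in 2002 with about $12 per hour in 2004, as the axis labels suggest, and it assumes a hypothetical vertical axis that starts at $9 instead of $0; the original picture is not reproduced here, so both are assumptions. The helper `bar_height_ratio` is illustrative only.

```python
def bar_height_ratio(old, new, axis_start):
    """How many times taller the new bar looks when the axis starts at axis_start."""
    return (new - axis_start) / (old - axis_start)

old_salary, new_salary = 10, 12   # assumed values, $ per hour

# Actual relative increase in salary.
actual_increase = (new_salary - old_salary) / old_salary

# Visual impression with an honest axis (starts at 0) and a truncated one (starts at 9).
honest_ratio = bar_height_ratio(old_salary, new_salary, axis_start=0)
truncated_ratio = bar_height_ratio(old_salary, new_salary, axis_start=9)

print(f"actual increase: {actual_increase:.0%}")
print(f"bar height ratio, axis from 0: {honest_ratio:.1f}")
print(f"bar height ratio, axis from 9: {truncated_ratio:.1f}")
```

Under these assumptions the salary rose by a fifth, yet the truncated axis makes the second bar look several times taller than the first.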

Partial Pictures

Example – Page 17, #2

Use critical thinking to develop an alternative conclusion.

A study showed that homeowners tend to live longer than

those who do not live in their own homes. Conclusion:

Owning a home creates inner peace and harmony that

causes people to be in better health and live longer.

Homeowners tend to be wealthier and can better afford health care, which leads to better health.

A better conclusion is that being a homeowner is associated

with living longer.

Example – Page 17, #6

Use critical thinking to address the key issue.

After a national census was conducted, the Poughkeepsie

Journal ran this front page headline: “281,421,906 in

America.” What is wrong with this headline?

The headline suggests that the census count was determined with great precision, but the figure is likely to be in error by millions of people.

Design of Experiments

Lesson 1-4

Observational Study

An observational study

observes individuals and

measures variables of

interest but does not

attempt to influence the

responses. The purpose of

an observational study is

to describe some group

or situation.

Experiment

An experiment deliberately imposes some treatment on

individuals in order to observe their responses. The purpose

of the experiment is to study whether the treatment causes a

change in the response.

Example – Page 27, #2

Much controversy arose over a study of patients with

syphilis who were not given a treatment that could

have cured them. Their health was followed for years

after they were found to have syphilis.

Determine whether the given description corresponds to an observational study or an experiment.

Observational Study

Example – Page 27, #4

Cruise ship passengers are given magnetic bracelets,

which they agree to wear in an attempt to eliminate

or diminish the effects of motion sickness.

Determine whether the given description corresponds to an observational study or an experiment.

Experiment

Different Types of Observational Studies

Cross Sectional Study

Data are observed, measured and collected at

one point in time.

Retrospective (or Case Control) Study

Data are collected from the past by going back

in time.

Prospective (or Longitudinal or Cohort) Study

Data are collected in the future from groups

(called cohorts) sharing common factors.

Example – Page 27, #6

Identify the type of observational study (cross-sectional, retrospective, or prospective).

A researcher from Mt. Sinai Hospital in New York City plans to obtain data by following (to the year 2010) siblings of victims who perished in the World Trade Center terrorist attacks of September 11, 2001.

Prospective

Example – Page 27, #8

Identify the type of observational study (cross-sectional, retrospective, or prospective).

An economist collects data by interviewing people who

won the lottery between the years of 1995 and 2000.

Retrospective

Confounding in Experiments

Confounding occurs in an experiment when the experiment is not able to distinguish between the effects of different factors.

Try to plan the experiment so the

confounding does not occur.

Controlling Effects of Variables

Blinding

Subjects don’t know whether they’re receiving a treatment or a placebo.

Blocks

Groups of subjects with similar characteristics

Completely Randomized Experimental Design

Subjects are put into different groups through a process of random selection.

Rigorously Controlled Design

Subjects are very carefully chosen.
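
A minimal sketch of a completely randomized design, assuming 20 hypothetical subjects split evenly between a treatment and a placebo. The function name `completely_randomized` and the subject labels are invented for illustration.

```python
import random

def completely_randomized(subjects, seed=None):
    """Randomly split subjects into a treatment group and a placebo group."""
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)       # chance alone decides the order
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "placebo": shuffled[half:]}

# Hypothetical subject labels.
subjects = [f"subject_{i:02d}" for i in range(1, 21)]
groups = completely_randomized(subjects, seed=7)

for name, members in groups.items():
    print(name, len(members), members[:3], "...")
```

Because chance decides who lands in which group, other factors tend to balance out between the groups. In a blinded experiment the subjects would not be told which list they are on.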

Data Collection

If sample data are not collected in an appropriate way, the data may be completely useless.

Randomness typically plays a crucial role in determining which data are collected.

Random Samples

In a random sample, members from a population are selected in such a way that each individual member has an equal chance of being selected.

A simple random sample (SRS) of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being chosen.

The Draft Lottery

Example – Page 29, #22

Does this sampling plan result in a random sample?

Simple random sample? Explain

A classroom consists of 30 students seated in five different

rows, with six students in each row. The instructor rolls

a die and the outcome is used to select a sample of the

students in a particular row.

Random Sample: Yes

Simple Random Sample: No
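
The reasoning behind this answer can be sketched by counting. The sketch assumes each of the five rows is equally likely to be chosen (for instance by re-rolling a 6, which is an assumption not stated in the example). Every student then has the same chance of selection, but only whole rows can ever be drawn, so most groups of six students can never be the sample.

```python
from fractions import Fraction
from math import comb

rows, seats_per_row = 5, 6
students = [(r, s) for r in range(rows) for s in range(seats_per_row)]

# The only samples this plan can produce: one whole row at a time.
plan_samples = [
    frozenset((r, s) for s in range(seats_per_row)) for r in range(rows)
]

# Chance that each student is selected, with every row equally likely.
inclusion = {
    st: Fraction(sum(st in smp for smp in plan_samples), len(plan_samples))
    for st in students
}

# Number of different groups of six students that a simple random sample allows.
all_possible = comb(len(students), seats_per_row)

print("selection chances seen:", set(inclusion.values()))
print("samples the plan can produce:", len(plan_samples))
print("samples of size 6 that exist:", all_possible)
```

Equal chances for individuals make it a random sample; unequal chances for possible samples mean it is not a simple random sample.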

Example – Page 29, #24

Does this sampling plan result in a random sample?

Simple random sample? Explain

A quality control engineer selects every 100th computer power supply unit that passes on a conveyor belt.

Random Sample: No

Simple Random Sample: No

Example – Page 29, #26

Does this sampling plan result in a random sample?

Simple random sample? Explain

A market researcher randomly selects 10 blocks in the

Village of Newport, then asks all adult residents of the

selected blocks whether they own a DVD player.

Random Sample: Yes

Simple Random Sample: No

Sampling Techniques

Random Sampling

Simple Random Sampling

Systematic Sampling

Convenience Sampling

Stratified Sampling

Cluster Sampling

Random Sampling

Each member of the population has an equal chance of being selected.

Systematic Sampling

Select some starting point and then select every kth (such as every 3rd) element in the population.

Convenience Sampling

Use results that are easy to get

Stratified Sampling

Subdivide the population into at least two different subgroups

(or strata) that share the same characteristics (such as gender or

age bracket), then draw a sample from each subgroup.

Cluster Sampling

Divide the population into sections (or clusters); then

randomly select some of those clusters; choose all

members from the selected clusters
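
The sampling techniques above can be sketched in Python. The population of 60 people, the age brackets, and the block numbers are all made up for illustration, and the function names are hypothetical; this is one possible way to express each technique, not a prescribed method.

```python
import random

rng = random.Random(0)

# Hypothetical population: 60 people, each with an age bracket and a city block.
population = [
    {"id": i, "bracket": ("18-34", "35-54", "55+")[i % 3], "block": i // 10}
    for i in range(60)
]

def simple_random(pop, n):
    """Every possible sample of size n is equally likely."""
    return rng.sample(pop, n)

def systematic(pop, k):
    """Pick a random starting point, then take every kth element."""
    start = rng.randrange(k)
    return pop[start::k]

def convenience(pop, n):
    """Take whatever is easiest to get (here, the first n on the list)."""
    return pop[:n]

def stratified(pop, key, n_per_stratum):
    """Split into subgroups sharing a characteristic, then sample each one."""
    strata = {}
    for person in pop:
        strata.setdefault(person[key], []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster(pop, key, n_clusters):
    """Randomly choose whole clusters and keep every member of each."""
    clusters = sorted({person[key] for person in pop})
    chosen = set(rng.sample(clusters, n_clusters))
    return [person for person in pop if person[key] in chosen]

srs = simple_random(population, 6)
sys_sample = systematic(population, 10)
conv = convenience(population, 6)
strat = stratified(population, "bracket", 2)
clus = cluster(population, "block", 2)

for name, s in [("simple random", srs), ("systematic", sys_sample),
                ("convenience", conv), ("stratified", strat), ("cluster", clus)]:
    print(f"{name:14s} n={len(s):2d} ids={[p['id'] for p in s]}")
```

Note the contrast between the last two: stratified sampling takes a few members from every subgroup, while cluster sampling takes every member from a few subgroups.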

Example – Page 28, #10

Identify which of these types of sampling is used: random,

systematic, convenience, stratified, or cluster.

The Dutchess County Commissioner of Jurors obtains

a list of 42,763 car owners and constructs a pool of jurors

by selecting every 100th name on the list.

Systematic

Example – Page 28, #12

Identify which of these types of sampling is used: random,

systematic, convenience, stratified, or cluster.

A General Motors researcher has partitioned all registered

cars into categories of sub-compact, compact, mid-size,

intermediate and full-size. She is surveying 200 car owners

from each category.

Stratified

Example – Page 28, #14

Identify which of these types of sampling is used: random,

systematic, convenience, stratified, or cluster.

A marketing executive for General Motors finds that its

public relations department has just printed envelopes with

the names and addresses of all Corvette owners. She wants

to do a pilot test of a new marketing strategy, so she

thoroughly mixes all of the envelopes in a bin, then obtains

a small sample group by pulling 50 of those envelopes.

Random

Sampling Error

An error caused by the act of taking a sample. It is the difference between a sample result and the true population result; such an error results from chance sample fluctuations.

Nonsampling Error

Occurs when sample data are incorrectly collected, recorded, or analyzed (such as by selecting a biased sample, using a defective instrument, or copying the data incorrectly).
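
The two kinds of error can be separated in a small simulation. Everything here is hypothetical: the population of car weights is generated at random, and the defective instrument is assumed to be a scale that reads 50 lb too heavy.

```python
import random
from statistics import mean

rng = random.Random(42)

# Hypothetical population: weights (in lb) of 10,000 cars.
population = [rng.gauss(3100, 400) for _ in range(10_000)]
true_mean = mean(population)

# Sampling error: the sample mean differs from the population mean by chance.
sample = rng.sample(population, 100)
sampling_error = mean(sample) - true_mean

# Nonsampling error: an assumed defective scale adds 50 lb to every reading.
biased_readings = [w + 50 for w in sample]
total_error = mean(biased_readings) - true_mean
nonsampling_part = total_error - sampling_error

print(f"true population mean: {true_mean:.1f} lb")
print(f"sampling error:       {sampling_error:+.1f} lb")
print(f"nonsampling error:    {nonsampling_part:+.1f} lb")
```

A larger sample would tend to shrink the sampling error, but the 50 lb offset from the defective scale would stay the same no matter how many cars are weighed.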

Errors in Sampling