Answering Questions Statistically

ENVS 407 – Prevention of Tobacco Addiction

Key statistical ideas

• Clarify your questions

• Construct contrasts

• Know the procedure

• Control as much as you can, then leave the rest to chance!

Clarify your questions

• Initial Questions:

▫ Why are people buying cigarettes?

▫ Where are people getting their cigarettes?

• Problems:

▫ “Why” is super hard to answer.

• Solution:

▫ Break question into smaller, easier questions

▫ Try to think of questions that can be written as “how much/many” or “is this more than that”

Clarify your questions (cont’d)

• Break the question into parts:

▫ Availability: which stores sell the most types of tobacco?

▫ Availability: which stores carry the most brands of cigarettes?

▫ Advertising: which modes of advertising are most popular?

▫ Advertising: do different locations have different size advertisements?

Construct contrasts

• Statistics is about comparing one set of things to another set of things… and figuring out if they’re different

Construct contrasts – hypothesis

• What do you want to disprove?

▫ Null hypothesis

• What do you want to prove beyond a reasonable doubt?

▫ Alternative hypothesis

Construct contrasts – EXAMPLE 1

• Null hypothesis: The average number of tobacco advertisements at grocery stores is the same as the average number at liquor stores.

• Alternative hypothesis: The average number of tobacco advertisements at grocery stores is less than the average number at liquor stores.
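
Written in symbols, this one-sided contrast might look like the following. The subscripted names are notation assumed here for illustration, with μ standing for the true average number of tobacco advertisements per store of each type.

```latex
H_0:\ \mu_{\text{grocery}} = \mu_{\text{liquor}}
\qquad\text{vs.}\qquad
H_A:\ \mu_{\text{grocery}} < \mu_{\text{liquor}}
```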

Construct contrasts – EXAMPLE 2

• Null hypothesis: The average percent of “sexy” ads at grocery stores is the same as at liquor stores.

• Alternative hypothesis: They are not the same.
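
In symbols, this two-sided contrast might be written as below. The subscripted names are notation assumed here for illustration, with p standing for the true percent of “sexy” ads at each type of store.

```latex
H_0:\ p_{\text{grocery}} = p_{\text{liquor}}
\qquad\text{vs.}\qquad
H_A:\ p_{\text{grocery}} \neq p_{\text{liquor}}
```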

Know the procedure – collecting data

• Once you translate the questions into statistics…

▫ Design the study

Means vs percentage?

▫ Randomly select places to sample the data

Careful of bias!

▫ Collect the data

Harder than you think!!

▫ Analyze the data

Easier than you think!!

▫ Interpret the data

Know the procedure – testing

• Things aren’t ever perfectly like the null…

• …but how different is too different?

Know the procedure – counts/means

• Greek vs. Roman

▫ μ = true mean (a.k.a. “average”)

▫ σ = true standard deviation

▫ x̄ = sample mean (i.e., comes from the data)

▫ s = sample standard deviation (i.e., from the data)

▫ n = sample size (sometimes have n1 and n2)
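
As a small illustration of the “Roman” quantities, the sketch below computes the sample size, sample mean, and sample standard deviation from a made-up list of ad counts. The data and variable names are hypothetical; only Python's standard `statistics` module is used.

```python
import statistics

# Hypothetical data: number of tobacco ads counted at six grocery stores.
ads_per_store = [2, 4, 3, 5, 1, 3]

n = len(ads_per_store)                 # sample size
x_bar = statistics.mean(ads_per_store) # sample mean
s = statistics.stdev(ads_per_store)    # sample standard deviation (divides by n - 1)

print(f"n = {n}, sample mean = {x_bar}, sample sd = {s:.3f}")
```

The true mean μ and true standard deviation σ are never observed directly; the sample mean and s are the estimates of them that come from the data.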

t-statistic

But is this “big”?
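
A minimal sketch of a two-sample t-statistic, assuming the unequal-variance (Welch) form. The function name `welch_t` and the ad counts are made up for illustration, and the degrees of freedom use the Welch-Satterthwaite approximation.

```python
import math
import statistics


def welch_t(sample1, sample2):
    """Return (t, df) for a two-sample t-test with unequal variances."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    # statistics.variance is the sample variance s^2 (divides by n - 1).
    v1 = statistics.variance(sample1) / n1
    v2 = statistics.variance(sample2) / n2
    # Difference in sample means, scaled by its estimated standard error.
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical ad counts at six grocery stores and six liquor stores.
grocery_ads = [2, 4, 3, 5, 1, 3]
liquor_ads = [6, 8, 5, 9, 7, 7]

t, df = welch_t(grocery_ads, liquor_ads)
print(f"t = {t:.3f}, df = {df:.1f}")
```

Whether the resulting t is “big” depends on where it falls in a t distribution with the computed degrees of freedom.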

Tables for the t distribution

• If we want a 100·C% confidence level for the test, we need to find the value t* so that we have a probability of C between -t* and t* in a t distribution with n-1 degrees of freedom

• Example: 95% confidence level when n = 15 (so df = 14) means that we need a tail probability of 0.025 in each tail, so t* = 2.145 ≈ 2.15

[Figure: t distribution with df = 14; the area between -t* and t* is 0.95, and the area in each tail is 0.025]
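
The table lookup can also be approximated numerically. The sketch below is one way to do it with only the standard library: it integrates the t density with Simpson's rule and finds t* by bisection. The function names and the numerical settings (step count, search range) are assumptions chosen for illustration, not part of the lecture.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_critical(confidence, df):
    """Find t* so that the area between -t* and t* equals `confidence`."""
    target = 1 - (1 - confidence) / 2  # e.g. 0.975 for 95% confidence
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection on the CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


t_star = t_critical(0.95, 14)
print(f"t* for 95% confidence, df = 14: {t_star:.3f}")  # roughly 2.145
```

An observed t-statistic larger in absolute value than t* would count as “too different” from the null at that confidence level.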

Know the procedure – percentages

• The symbols

▫ p = true percentage

▫ Y = observed outcome (e.g. count of successes)

▫ n = sample size

▫ p̂ = sample percentage (i.e., Y/n)
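
One common way to compare two sample percentages Y/n is a pooled two-proportion z-test. The sketch below assumes that approach; the function name `two_proportion_z` and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_z(y1, n1, y2, n2):
    """Return (z, two_sided_p_value) for H0: the two true percentages are equal."""
    p1_hat = y1 / n1  # sample percentage in group 1
    p2_hat = y2 / n2  # sample percentage in group 2
    # Under the null both groups share one percentage, so pool the counts.
    pooled = (y1 + y2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 30 of 100 grocery-store ads and 45 of 100
# liquor-store ads were classified as "sexy".
z, p_value = two_proportion_z(30, 100, 45, 100)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.3f}")
```

The normal approximation behind this test is usually considered reasonable only when each group has a fair number of both successes and failures.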

Control as much as you can…

• Make sure you don’t “stack the deck”

▫ Don’t pick all your grocery stores from Center City and all of your liquor stores from University City

• Standardize definitions of “size of advertisement” and “theme of ad”

▫ It’s surprising how much opinions differ

• Think carefully about all variables which are important… but aren’t the one you’re most interested in. CONTROL THEM!!!

…leave the rest to chance!

• Randomize once you’ve controlled for the important variables

▫ Get a list of well “controlled” stores, and then randomly pick which you’ll visit

▫ Picking the easiest to go to will introduce selection bias!!
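
The random pick can be done in a few lines. This sketch assumes the controlled list is already grouped by store type; the store names, the helper `pick_stores`, and the seed are all hypothetical.

```python
import random


def pick_stores(stores_by_type, per_type, seed=None):
    """Randomly choose `per_type` stores from each store type."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return {
        store_type: sorted(rng.sample(stores, per_type))
        for store_type, stores in stores_by_type.items()
    }


# Hypothetical list of well "controlled" stores, e.g. all in one neighborhood.
candidate_stores = {
    "grocery": ["Grocery A", "Grocery B", "Grocery C", "Grocery D", "Grocery E"],
    "liquor": ["Liquor A", "Liquor B", "Liquor C", "Liquor D", "Liquor E"],
}

visits = pick_stores(candidate_stores, per_type=3, seed=407)
for store_type, chosen in visits.items():
    print(store_type, chosen)
```

Sampling within each store type keeps the two groups the same size, and letting the random number generator choose removes the temptation to visit only the most convenient stores.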

Key statistical ideas

• Clarify your questions

▫ Bigger -> smaller

▫ Intangible -> quantifiable

• Construct contrasts

▫ Compare two things: greater than? less than? merely different?

• Know the procedure

▫ Or know someone who knows the procedure…

• Control as much as you can, leave the rest to chance!

Websites and resources

• Quick reference

▫ http://en.wikipedia.org/wiki/Student's_t-test

▫ Use “unequal sample sizes, unequal variance”

• Simple t-test calculator

▫ http://www.graphpad.com/quickcalcs/ttest1.cfm

• Wharton StatLab

▫ http://www-stat.wharton.upenn.edu/~sivana/statlab.html

• My page

▫ http://stat.wharton.upenn.edu/~mbaiocch/

▫ The slides I used today

▫ Spreadsheet

▫ My contact info