Plan and Data. Are you aware of concepts such as sample, population, sample distribution, population...

Are you aware of concepts such as

• sample,
• population,
• sample distribution,
• population distribution,
• sampling variability?

PPDAC

• Problem, question, purpose for investigating
• Plan
• Data
• Analyse data
• Draw a conclusion, justify with evidence

The plan

The goal in a sampling process is to obtain a sample to represent the population of interest.

For us, this means choosing an appropriate sample size.

What makes a good sample

In common language usage, a sample is representative of the population if characteristics in the sample are a reflection of those in the parent population.

Under this meaning, a truly representative sample almost never exists.

In statistical jargon, a representative sample means that the sampling process produces samples in which there is no tendency for certain characteristics to differ from those in the population in some systematic manner. For example, all random samples could be viewed as representative samples.
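
A small simulation can illustrate this statistical meaning of "representative". The sketch below uses an invented population of foot lengths (the mean and spread are assumptions, not CensusAtSchool data). Individual random samples miss the population mean, but on average the misses cancel out, so there is no systematic tendency.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 foot lengths in cm (invented values).
population = [random.gauss(24.0, 1.5) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw many simple random samples of 25 and record each sample mean.
sample_means = [
    statistics.mean(random.sample(population, 25))
    for _ in range(2_000)
]

# Individual samples miss the population mean (sampling variability) ...
largest_miss = max(abs(m - population_mean) for m in sample_means)

# ... but on average the misses cancel out: no systematic tendency.
average_of_sample_means = statistics.mean(sample_means)

print(f"population mean:         {population_mean:.2f} cm")
print(f"average of sample means: {average_of_sample_means:.2f} cm")
print(f"largest single miss:     {largest_miss:.2f} cm")
```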

Sample size

The aim of statistical testing is to uncover a significant difference when it actually exists. In its simplest form this involves comparing samples.

Sample size is important because

Larger samples increase the chance of finding a significant difference (if it exists), but

• larger samples cost more money;
• sometimes a larger sample is not a possibility.

Sample size

In general, the larger the sample size, the better the sample reflects the characteristics of the population.
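
One way to see this is to take many random samples of different sizes from the same population and compare how much the sample medians vary. The sketch below does this with an invented population (the numbers are assumptions chosen for illustration only).

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical population of foot lengths in cm (invented values).
population = [random.gauss(24.0, 1.5) for _ in range(10_000)]

def spread_of_sample_medians(sample_size, repeats=1_000):
    """Standard deviation of the medians of many random samples."""
    medians = [
        statistics.median(random.sample(population, sample_size))
        for _ in range(repeats)
    ]
    return statistics.stdev(medians)

spreads = {n: spread_of_sample_medians(n) for n in (10, 25, 100)}
for n, spread in spreads.items():
    print(f"n = {n:3d}: sample medians vary by about {spread:.2f} cm")
```

The spread of the sample medians shrinks as the sample size grows, which is what "better reflects the population" looks like in practice.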

Sample size

Larger sample sizes also help to give a better idea of the shape of the distribution.

Sample size

So the sample size is chosen to maximise the chance of uncovering a specific mean/median difference, which is also statistically significant.

Note:

A specific difference and statistical significance are two quite different ideas.

Remember box plots: this is what we want to produce so we can comment on what we notice and then what we infer (more about this later).
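
A box plot is drawn from a five-number summary. As a minimal sketch, the values below are invented foot lengths, and the quartiles use Python's "inclusive" method; other tools and textbooks use slightly different quartile conventions, so results may differ a little.

```python
import statistics

# Invented right-foot lengths in cm for one sample.
sample = [22.0, 23.0, 23.5, 24.0, 24.0, 24.5, 25.0, 25.5, 26.0, 27.5]

# quantiles(..., n=4) returns the three cut points: LQ, median, UQ.
lower_quartile, median, upper_quartile = statistics.quantiles(
    sample, n=4, method="inclusive"
)
five_number_summary = (
    min(sample), lower_quartile, median, upper_quartile, max(sample)
)
print(five_number_summary)
```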

A specific difference and statistical significance are two quite different ideas.

Here the medians have a specific difference, but the difference is not statistically significant (more about this later).

The specific difference (difference in means and/or medians) is found by the researcher in terms of the outcome measure of an experiment or investigation.

In this Achievement Standard, we are going to perform an investigation that compares two sets of data.

Examples of specific difference

• the difference in mean right foot length between Year 11 boys and Year 11 girls from the 2012 Census at Schools database;

• a 3 kg mean weight change in a diet experiment;

• a 10% mean improvement in a teaching method experiment.

Statistical significance is a probability statement telling us how likely it is that a difference at least as large as the one observed would arise from chance alone.
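
One way to make "chance alone" concrete is a randomisation (shuffling) test. This is a sketch with invented data, and it is not necessarily the method used for this Achievement Standard: if the group labels made no difference, shuffling them should often produce a difference as large as the observed one.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is repeatable

# Invented right-foot lengths in cm for two small samples.
boys = [25.0, 26.5, 24.0, 27.0, 25.5, 26.0, 24.5, 27.5, 26.0, 25.0]
girls = [23.5, 24.0, 22.5, 25.0, 23.0, 24.5, 23.5, 22.0, 24.0, 23.0]

observed = statistics.median(boys) - statistics.median(girls)

# If chance alone were at work, the group labels would be arbitrary,
# so shuffle the labels many times and count how often a difference at
# least as large as the observed one appears.
pooled = boys + girls
trials = 5_000
at_least_as_large = 0
for _ in range(trials):
    random.shuffle(pooled)
    fake_boys = pooled[:len(boys)]
    fake_girls = pooled[len(boys):]
    diff = statistics.median(fake_boys) - statistics.median(fake_girls)
    if abs(diff) >= abs(observed):
        at_least_as_large += 1

p_value = at_least_as_large / trials
print(f"observed difference in medians: {observed:.2f} cm")
print(f"proportion of shuffles at least as extreme: {p_value:.3f}")
```

A small proportion suggests the observed difference would be unusual if only chance were at work.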

Larger samples increase your chance of finding a significant difference because sample means/medians from larger samples more reliably reflect the population mean/median.
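
The sketch below simulates this idea. Two invented populations have means 1 cm apart (an assumed specific difference), and a rough "more than about two standard errors" check stands in for a formal test. The check and all the numbers are illustrative assumptions.

```python
import random
import statistics

random.seed(4)  # fixed seed so the sketch is repeatable

def looks_significant(a, b):
    """Rough check: is the difference in means more than about two
    standard errors from zero? (An approximation, for illustration.)"""
    se = (statistics.variance(a) / len(a)
          + statistics.variance(b) / len(b)) ** 0.5
    return abs(statistics.mean(a) - statistics.mean(b)) > 2 * se

def chance_of_detection(sample_size, true_difference=1.0, repeats=1_000):
    """Proportion of simulated investigations that detect the difference."""
    hits = 0
    for _ in range(repeats):
        # Invented populations: same spread, means true_difference apart.
        group_a = [random.gauss(24.0 + true_difference, 1.5)
                   for _ in range(sample_size)]
        group_b = [random.gauss(24.0, 1.5) for _ in range(sample_size)]
        if looks_significant(group_a, group_b):
            hits += 1
    return hits / repeats

detection = {n: chance_of_detection(n) for n in (10, 25, 100)}
for n, rate in detection.items():
    print(f"n = {n:3d} per group: difference detected in {rate:.0%} of runs")
```

The same real difference is detected far more often with 100 per group than with 10 per group.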

PLAN

Your plan must:

• define the variables you will investigate;
• decide how you will measure these variables;
• note what things might affect the measures you take (managing sources of variation);
• decide how many measures you need to collect (sample size);
• explain how your data will be obtained and recorded.

define the variables you will investigate;

The variable we are investigating is the length of the right foot of Year 13 boys and Year 13 girls.

The lengths are measured in cm.

decide how you will measure these variables; note what things might affect the measures you take (managing sources of variation);

• If you are taking these measurements yourself, you will standardise the method:

• E.g. To minimise measurement errors, the measurements will be taken with the shoe removed and from the longest toe to the back of the heel. To get consistent measurements, I will get each person to place their right foot against the wall and mark the position of the longest toe using a ruler. I will check that the foot is at right angles to the wall.

decide how you will measure these variables; note what things might affect the measures you take (managing sources of variation);

• If you download measurements, talk about your reservations about how the measurements were taken.

• E.g. As each person measured their own foot for the database, it is unlikely that the measurement method used was the same for everyone and hence the measurements will contain measurement errors.

decide how many measures you need to collect (sample size);

Note:

• You need to choose a Discrete and a Continuous variable.

• Discrete: many of the data values are the same, e.g. gender, site

• Continuous: the data values are generally all different, e.g. weight, height, length

decide how many measures you need to collect (sample size);

“I will get our two random samples using the 2011 CensusAtSchool random sampler. I will take a random sample of 25 boys from the population of 13-year-old NZ boys in the 2011 CensusAtSchool database. I will take a random sample of 25 girls from the population of 13-year-old NZ girls in the 2011 CensusAtSchool database.”
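
The CensusAtSchool random sampler does the sampling for you. To show what "a random sample of 25 from each group" means, here is a minimal sketch using an invented stand-in database; the record layout, the values and the helper name `random_sample` are hypothetical and say nothing about how the real sampler is implemented.

```python
import random

random.seed(5)  # fixed seed so the sketch is repeatable

# Stand-in for a downloaded database: each record is (gender, foot length
# in cm). The values are invented; a real file would be read in instead.
database = (
    [("boy", round(random.gauss(24.5, 1.5), 1)) for _ in range(400)]
    + [("girl", round(random.gauss(23.0, 1.3), 1)) for _ in range(400)]
)

def random_sample(records, gender, size):
    """Simple random sample, without replacement, from one group."""
    group = [length for g, length in records if g == gender]
    return random.sample(group, size)

boys_sample = random_sample(database, "boy", 25)
girls_sample = random_sample(database, "girl", 25)
print(len(boys_sample), len(girls_sample))
```

Every boy in the stand-in database has the same chance of being chosen, and likewise every girl, which is what makes each sample random.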

Talk about your sample

Ask yourself:

• “Is it reasonable that these samples are representative of the population?”

• “Is it reasonable to assume that samples taken from the Census at Schools database would represent all Year 13 students in New Zealand?”

Here is a ‘worry’ list

I worry about the quality of the foot length data since students measured and recorded their own foot lengths.

• Were measurements made with shoes on or shoes off?
• Would all students have seen ‘cm’ to the right of the entry box?
• To what level of precision did the students make their measurement?
• Why were there missing values?
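
Some of these worries can be checked in the data itself. The sketch below counts missing values and flags entries outside a plausible range; the entries and the range of 15 cm to 35 cm are invented assumptions, and a real investigation would need to justify its own limits.

```python
# Invented raw entries as a student might type them; None marks a
# missing value. The plausible range below is an assumption.
raw_foot_lengths = [24.5, 23.0, None, 240.0, 25.5, 9.5, None, 26.0, 22.5]

PLAUSIBLE_CM = (15.0, 35.0)  # assumed sensible range for foot length in cm

missing = [x for x in raw_foot_lengths if x is None]
suspicious = [
    x for x in raw_foot_lengths
    if x is not None and not (PLAUSIBLE_CM[0] <= x <= PLAUSIBLE_CM[1])
]
usable = [
    x for x in raw_foot_lengths
    if x is not None and PLAUSIBLE_CM[0] <= x <= PLAUSIBLE_CM[1]
]

print(f"missing values:    {len(missing)}")
print(f"suspicious values: {suspicious}")  # e.g. 240.0 may be mm, 9.5 inches
print(f"usable values:     {len(usable)}")
```

Flagged values are not automatically wrong; they are the ones to discuss when commenting on data quality.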

We need to mention that we are concerned about the accuracy of the data, and that this could be improved by having the data collected in exactly the same way under a more detailed requirement. For example: the students place their foot against the wall, and the measurement is taken from the wall to the end of the longest toe, rounded to the nearest cm.

Data

Census at Schools Data Viewer

What should I be concerned about?

These are the questions that were asked.

What should I be concerned about?

How do we know if the student was actually a Year 13 student?

What should I be concerned about?

Can we be sure that the students took off their shoe to measure?

How accurately did they measure?

Getting the data

We didn’t get exactly 25 in each category. Does this matter?

Sample size

It is not necessary to have equal sample sizes as long as the samples are representative of the population.

24 in each sample

Data summary

Were 24 data points enough to talk about the distribution, shape etc.?

The shape is not obvious, but we should think about what we expect from foot length.

We could argue that the distribution is likely to be symmetrical, as we would expect a cluster of data in the middle and some extreme values on either side.

Because equal sample sizes don’t matter, I have just asked for a total of 100.

The shape is now becoming clearer.

This no longer says ‘Year 13’, so I wonder if there are enough data for this sample size.

• It is better to collect data from Year 9 or 10 as there is likely to be more data available for these year levels.