Lecture 2 Review Probabilities Probability Distributions Normal probability distributions Sampling...

Post on 04-Jan-2016

Lecture 2 Review

Probabilities
Probability Distributions
Normal probability distributions
Sampling distributions and estimation

Probabilities

The probability of an event occurring is the proportion of times it occurs in the long run.

Some properties of probabilities: They are usually expressed as a number between 0 and 1. They can also be expressed as percentages, and are sometimes referred to as the chance or likelihood of an outcome (e.g., there is a 30% probability of precipitation).

The probabilities of all possible (mutually exclusive) outcomes always sum to 1.00, or 100%: something has to happen.

Therefore, the probability that an event will not occur is the complement of the probability that it will occur.
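
In symbols, writing A for the event (a label introduced here only for illustration), the complement rule reads:

$$P(\text{not } A) = 1 - P(A)$$

For instance, a 30% probability of precipitation implies a 70% probability of no precipitation.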

Probabilities are sometimes converted to odds. The odds of an event happening is the ratio of the probability of its occurrence to the probability of its non-occurrence.

The odds of variable Y having outcome y are equal to the probability of outcome y, divided by the probability of any outcome other than y.

$$\text{odds}(Y = y) = \frac{P(y)}{1 - P(y)}$$
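
The conversion between probabilities and odds can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function names `odds` and `probability` are chosen here for the example.

```python
def odds(p: float) -> float:
    """Convert a probability p (0 <= p < 1) to odds, P(y) / (1 - P(y))."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def probability(odds_value: float) -> float:
    """Convert odds back to a probability, odds / (1 + odds)."""
    if odds_value < 0.0:
        raise ValueError("odds must be non-negative")
    return odds_value / (1.0 + odds_value)


# The 30% chance of precipitation used as an example earlier, expressed as odds.
rain_odds = odds(0.30)
print(round(rain_odds, 4))               # about 0.4286, i.e. 3 to 7
print(round(probability(rain_odds), 4))  # back to 0.3
```

A probability of 0.5 corresponds to odds of 1 (an even chance); probabilities above 0.5 give odds greater than 1.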

Probability Distributions

Probability Distribution: A list of the possible outcomes for a variable, along with their probabilities.

Probabilities for continuous variables can be plotted as curves. The area under the curve over an interval is the probability that the variable will take a value in that interval.

The mean of the probability distribution is also known as the expected value of the variable.

For discrete distributions, the expected value (μ) is the sum of the possible values, each multiplied by its probability of occurrence.

$$\mu = \sum_{y} y \, P(y)$$

where y is a value of the variable Y, and P(y) is the probability of that value.
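
As a rough sketch of this calculation, the Python below computes the expected value of a small discrete distribution. The distribution itself (number of cars per household) is hypothetical and invented purely for illustration, as is the function name `expected_value`.

```python
import math
from typing import Mapping


def expected_value(distribution: Mapping[float, float]) -> float:
    """Sum of each possible value y multiplied by its probability P(y)."""
    total = sum(distribution.values())
    if not math.isclose(total, 1.0):
        raise ValueError("probabilities must sum to 1")
    return sum(y * p for y, p in distribution.items())


# Hypothetical distribution: P(0 cars) = 0.2, P(1 car) = 0.5, P(2 cars) = 0.3.
cars = {0: 0.2, 1: 0.5, 2: 0.3}
mu = expected_value(cars)
print(round(mu, 2))  # 0(0.2) + 1(0.5) + 2(0.3) = 1.1
```

Note that the expected value (1.1 here) need not be a value the variable can actually take.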

The Normal Curve

Characteristics of the normal curve:
Bell-shaped and unimodal
Asymptotic
Have means, medians, and modes which are equal
Have areas under them that have the constant proportions:

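[Figure: Normal Distribution. About 68% of the area lies within 1 standard deviation of the mean (.34 on each side), about 95% within 2 standard deviations (a further .134 on each side), and about 99.7% within 3 standard deviations.]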
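
The 68%, 95%, and 99.7% proportions can be checked numerically. The sketch below is an illustration added here, not taken from the slides; it relies on the standard identity that the area within k standard deviations of the mean equals erf(k/√2), and the function name `area_within` is an assumption made for the example.

```python
import math


def area_within(k: float) -> float:
    """Area under a normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    # Prints 0.6827, 0.9545, and 0.9973 for k = 1, 2, 3.
    print(k, round(area_within(k), 4))
```

The familiar 68%, 95%, and 99.7% figures are these values rounded.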

Z-Scores

A z-score expresses a particular value of a variable as the number of standard deviations it lies above or below the mean of that variable.

$$z = \frac{Y - \mu}{\sigma}$$

where μ is the mean of the variable Y, and σ is its standard deviation.

The Standard Normal Curve

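[Figure: Standard Normal Distribution (μ = 0, σ = 1), with the horizontal axis marked in z-scores from −3 to 3. About 68% of the area lies between z = −1 and z = 1, about 95% between z = −2 and z = 2, and about 99.7% between z = −3 and z = 3.]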

Z-Scores

An intelligence test has scores which are normally distributed, with a mean of 100 and a standard deviation of 15. Convert a test score of 127 to a z-score.
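
One way to work this example, substituting the values given into the z-score formula (the score minus the mean, divided by the standard deviation):

$$z = \frac{Y - \mu}{\sigma} = \frac{127 - 100}{15} = \frac{27}{15} = 1.8$$

A score of 127 therefore lies 1.8 standard deviations above the mean.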

If income is approximately normally distributed (it isn’t), with a mean of $15,000 and a standard deviation of $2100, what is the approximate probability of an individual having an income of $25,000 or more?
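
A hedged sketch of one way to approach this example follows. Because a continuous variable has zero probability of taking any single exact value, the code reads the question as the probability of an income of at least $25,000. The function names are chosen here for illustration, and the upper-tail area is computed with the complementary error function from Python's standard library rather than a printed z-table.

```python
import math


def z_score(y: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that y lies from the mean mu."""
    return (y - mu) / sigma


def upper_tail(z: float) -> float:
    """P(Z >= z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


z = z_score(25_000, 15_000, 2_100)
p = upper_tail(z)
print(round(z, 2))  # 4.76 standard deviations above the mean
print(f"{p:.1e}")   # a tail area on the order of one in a million
```

With a z-score this far beyond 3, the probability is effectively zero under the normality assumption, which is one sign that real incomes are not normally distributed.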

Sampling distributions

A sampling distribution is a probability distribution that represents the long-run distribution of the sample statistic, if repeated samples of size n are taken.

For particular statistics (such as means, proportions, or differences of means), we can assume that sampling distributions will have certain properties.

Sampling Distribution of Means

The sampling distribution of means is the theoretical distribution of sample means if many random samples of the same size (n) are taken.

For random samples, the individual sample means will fluctuate around the actual population parameter (μ). In the long run, the mean of the sample means (the mean of the sampling distribution) will equal the population mean (μ).
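
The idea of repeated sampling can be illustrated with a small simulation. This is only a sketch under assumed conditions: the population (rolls of a fair six-sided die, so μ = 3.5), the sample size, the number of samples, and the seed are all choices made here for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: rolls of a fair six-sided die, so mu = 3.5.
MU = 3.5
N_SAMPLES = 2000  # number of repeated samples
n = 30            # size of each sample

# Draw many samples of size n and record the mean of each one.
sample_means = [
    statistics.fmean(rng.randint(1, 6) for _ in range(n))
    for _ in range(N_SAMPLES)
]

# Individual sample means fluctuate around mu ...
print(min(sample_means), max(sample_means))

# ... but their long-run average settles very close to mu.
mean_of_means = statistics.fmean(sample_means)
print(mean_of_means)
```

Increasing `N_SAMPLES` brings the mean of the sample means closer to 3.5, while any single sample mean may still miss it by a noticeable margin.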

Standard Error

The standard deviation of the sampling distribution is the standard error. For the sampling distribution of means, the standard error, or the standard error of the mean, is:

$$\sigma_{\bar{Y}} = \frac{\sigma}{\sqrt{n}}$$

where σ is the population standard deviation and n is the sample size.

Central Limit Theorem

Central Limit Theorem: If repeated random samples are drawn from a population, then as the sample size n grows, the sampling distribution of sample means approaches a normal distribution.

Law of Large Numbers (Bernoulli): As more samples are taken from a population with mean μ, the mean of the sample means approaches the population mean μ.

The importance of these two findings is that the sampling distribution of means, if n is reasonably large, will be approximately normally distributed and centred on the population mean.

Importantly, this holds no matter what the shape of the sample or the population distribution. It is the sampling distribution that is normally distributed.
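
A simulation can make this claim concrete. The sketch below assumes a strongly skewed population (an exponential distribution with mean and standard deviation both equal to 10, chosen here purely for illustration) and checks whether the sample means nonetheless behave as a normal curve would predict: a spread close to σ/√n, with roughly 68% of means within one standard error of μ and roughly 95% within two.

```python
import math
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Assumed skewed population: exponential, for which mean = sd = 10.
MU = 10.0
SIGMA = 10.0
n = 50            # size of each sample
N_SAMPLES = 2000  # number of repeated samples

se = SIGMA / math.sqrt(n)  # theoretical standard error of the mean

sample_means = [
    statistics.fmean(rng.expovariate(1.0 / MU) for _ in range(n))
    for _ in range(N_SAMPLES)
]


def share_within(k: float) -> float:
    """Share of sample means lying within k standard errors of MU."""
    return sum(abs(m - MU) <= k * se for m in sample_means) / N_SAMPLES


# Observed spread of the sample means next to the theoretical standard error.
print(round(statistics.stdev(sample_means), 3), round(se, 3))

# Shares within 1 and 2 standard errors; a normal curve predicts about .68 and .95.
print(share_within(1), share_within(2))
```

With a much smaller n (say 2 or 3), the sample means would still be visibly skewed, which is why the result is stated for reasonably large n.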

[Figure: Sampling Distribution of Means, showing the means of individual samples scattered around the long-run mean of the sample means.]

Central Limit Theorem

We therefore have 3 distributions:

1) The sample distribution, which describes the actual sample collected, with a sample mean, a standard deviation of s, and a sample size of n.

2) The population distribution, from which the sample is drawn, which has a mean of μ, a population size of N, and a standard deviation of σ.

3) The sampling distribution of a statistic, which is a theoretical probability distribution of the statistic, describing the variability in the statistic among samples of a certain size (n).

Examples

If a sample of size (n=100) is drawn from a population of size (N=100,000) with a standard deviation of (σ =12.4), what is the standard error of the mean?

What is the standard error of the mean for a sample of (n=500) drawn from the same population?
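
Both examples can be worked by dividing the population standard deviation by the square root of the sample size. The sketch below assumes the population (N = 100,000) is large enough relative to either sample that no finite-population adjustment is needed; the function name `standard_error` is chosen here for illustration.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)


print(round(standard_error(12.4, 100), 4))  # 12.4 / 10 = 1.24
print(round(standard_error(12.4, 500), 4))  # 12.4 / 22.36... = 0.5545
```

Quintupling the sample size cuts the standard error by a factor of √5, a little more than half, not by a factor of 5.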

Next:

Review of Point and Interval Estimators
Statistical Significance
Hypothesis Testing