Contact Information

Contact Information Dr. Daniel Simons Vancouver Island University Faculty of Management Building 250 - Room 416 Office Hours: MTW 11:30 – 12:30 [email protected]


Page 1: Contact Information

Contact Information

Dr. Daniel Simons

Vancouver Island University

Faculty of Management

Building 250 - Room 416

Office Hours: MTW 11:30 – 12:30

[email protected]

Page 2: Contact Information

Suggestions for Best Individual Performance

• Attend all classes
• Take notes. The course covers a lot of material and your notes are essential
• Complete all assignments (not for grade)
• Read the book
• Participate, enrich class discussion, provide feedback and ask questions
• Revise materials between classes, integrate concepts, make sure you understand the tools and their application
• Don’t hesitate to contact me if necessary

Page 3: Contact Information

Evaluation Method

Tests have a mix of problems that evaluate
• Concepts
• Problem sets (assignments)
• Class applications
• Readings
• New applications
• Closed book, time-constrained tests to reward knowledge and speed
• Each test covers slides, assignments, and required readings.
• Evaluation system may not be perfect but it works

Page 4: Contact Information

Brief Overview of the Course

Economics suggests important relationships, often with policy implications, but virtually never suggests quantitative magnitudes of causal effects.

• What is the quantitative effect of reducing class size on student achievement?
• How does another year of education change earnings?
• What is the price elasticity of cigarettes?
• What is the effect on output growth of a 1 percentage point increase in interest rates by the Bank of Canada?
• What is the effect on housing prices of environmental improvements?

Page 5: Contact Information

This course is about using data to measure causal effects.

• Ideally, we would like an experiment
  – What would be an experiment to estimate the effect of class size on standardized test scores?
• But almost always we only have observational (nonexperimental) data
  – returns to education
  – cigarette prices
  – monetary policy
• Most of the course deals with difficulties arising from using observational data to estimate causal effects
  – confounding effects (omitted factors)
  – simultaneous causality
  – “correlation does not imply causation”

Page 6: Contact Information

STATISTICAL PRINCIPLES

A review of the basic principles of statistics used in business settings.

REVIEW OF QUME 232

Page 7: Contact Information

Basic Statistical Concepts

• It is important that students are comfortable with the following:
  – Concept of a random variable (whether discrete or continuous) and its associated probability functions
  – Cumulative, marginal, conditional and joint probability functions
  – Mathematical expectations, concept of independence
  – Bernoulli, Binomial, Poisson, Uniform, Normal, t, F and χ² distributions

Page 8: Contact Information

Summations

• The Σ symbol is a shorthand notation for discussing sums of numbers.

• It works just like the + sign you learned about in elementary school.

Let $Y_1 = 3,\ Y_2 = 5,\ Y_3 = -1,\ Y_4 = 5$, so that $n = 4$:

$$\sum_{i=1}^{n} Y_i = Y_1 + Y_2 + Y_3 + Y_4 = 3 + 5 - 1 + 5 = 12$$

Page 9: Contact Information

Algebra of Summations

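$$\sum_{i=1}^{n} (X_i + Y_i) = \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i \qquad (1)$$

$$\sum_{i=1}^{n} k X_i = k \sum_{i=1}^{n} X_i \qquad (2)$$

$$\sum_{i=1}^{n} k = k + k + \dots + k = nk \qquad (3)$$

where k is a constant.

Note: Σ follows the same rules as +, so

$$\sum_{i=1}^{n} X_i Y_i = X_1 Y_1 + X_2 Y_2 + \dots + X_n Y_n \neq \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i \qquad (4)$$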
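The summation rules can be checked numerically. The sketch below uses the four Y values from the summation example (reading Y3 as -1, which is what makes the total 12); the series X and the constant k are hypothetical values chosen only for illustration.

```python
# Y values from the summation example; Y3 is read as -1 so the total is 12.
Y = [3, 5, -1, 5]
n = len(Y)

# Hypothetical second series and constant, for illustration only.
X = [1, 0, 4, 2]
k = 2

# Sigma notation: add up Y_i for i = 1, ..., n.
total = sum(Y)

# Rule (1): the sum of (X_i + Y_i) equals the sum of X_i plus the sum of Y_i.
rule1 = sum(x + y for x, y in zip(X, Y)) == sum(X) + sum(Y)

# Rule (2): a constant factor can be pulled outside the sum.
rule2 = sum(k * y for y in Y) == k * sum(Y)

# Rule (3): summing a constant n times gives n * k.
rule3 = sum(k for _ in range(n)) == n * k

# Rule (4): the sum of products is generally NOT the product of the sums.
sum_of_products = sum(x * y for x, y in zip(X, Y))
product_of_sums = sum(X) * sum(Y)

print(total, rule1, rule2, rule3, sum_of_products, product_of_sums)
```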

Page 10: Contact Information

Summations: A Useful Trick

$$\sum_{i=1}^{n} X_i (Y_i - \bar{Y}) = \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$$
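The trick works because deviations from the mean sum to zero, so subtracting the constant X-bar from each X value changes nothing: X-bar times the sum of (Y_i − Y-bar) is zero. A minimal Python check, borrowing the X and Y values from the quick example that appears later in these slides:

```python
# Data borrowed from the "Quick Example" slides.
X = [3, 4, 2, -2, 8]
Y = [8, 4, 4, 6, 13]

x_bar = sum(X) / len(X)
y_bar = sum(Y) / len(Y)

# Deviations from the mean always sum to zero.
sum_dev_y = sum(y - y_bar for y in Y)

# Left-hand side: X is NOT demeaned.
lhs = sum(x * (y - y_bar) for x, y in zip(X, Y))

# Right-hand side: both X and Y are demeaned.
rhs = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

print(sum_dev_y, lhs, rhs)
```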

Page 11: Contact Information

Double Summations

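$$
\begin{aligned}
\sum_{i=1}^{n} \sum_{j=1}^{n} X_i Y_j &= X_1 Y_1 + X_1 Y_2 + X_1 Y_3 + \dots + X_n Y_n \\
&= X_1 (Y_1 + Y_2 + \dots + Y_n) + X_2 (Y_1 + Y_2 + \dots + Y_n) + \dots + X_n (Y_1 + Y_2 + \dots + Y_n) \\
&= (X_1 + X_2 + \dots + X_n)(Y_1 + Y_2 + \dots + Y_n) \\
&= Y_1 (X_1 + X_2 + \dots + X_n) + Y_2 (X_1 + X_2 + \dots + X_n) + \dots + Y_n (X_1 + X_2 + \dots + X_n) \\
&= \sum_{j=1}^{n} \sum_{i=1}^{n} X_i Y_j
\end{aligned}
$$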

• The “Secret” to Double Summations: keep a close eye on the subscripts.
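A double summation maps directly onto a nested loop, which makes the subscripts easy to follow. The sketch below uses hypothetical values for X and Y and confirms that the order of summation does not matter and that the double sum equals the product of the two single sums.

```python
# Hypothetical values, for illustration only.
X = [1, 2, 3]
Y = [4, 5, 6]

# Outer index i runs over X, inner index j runs over Y.
sum_i_then_j = sum(x * y for x in X for y in Y)

# Swapping the order of summation: outer index j, inner index i.
sum_j_then_i = sum(x * y for y in Y for x in X)

# Factoring: (X_1 + ... + X_n)(Y_1 + ... + Y_n).
product_of_sums = sum(X) * sum(Y)

print(sum_i_then_j, sum_j_then_i, product_of_sums)
```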

Page 12: Contact Information

Descriptive Statistics

• How can we summarize a collection of numbers?
  – Mean: the arithmetic average.

$$\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i$$

    The mean is highly sensitive to a few large values (outliers).

  – Median: the midpoint of the data. The median is the number above which lie half the observed numbers and below which lie the other half. The median is not sensitive to outliers.

Page 13: Contact Information

Descriptive Statistics (cont.)

  – Mode: the most frequently occurring value.
  – Variance: the mean squared deviation of a number from its own mean. The variance is a measure of the “spread” of the data.

$$Var(Y_1, Y_2, \dots, Y_n) = \frac{1}{n} \sum_{i=1}^{n} (Y_i - \bar{Y})^2$$

  – Standard deviation: the square root of the variance. The standard deviation provides a measure of a typical deviation from the mean.

Page 14: Contact Information

Descriptive Statistics (cont.)

  – Covariance: the covariance of two sets of numbers, X and Y, measures how much the two sets tend to “move together.”

$$Cov(X, Y) = Cov(Y, X) = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$$

    If Cov(X,Y) > 0, then if X is above its mean, we would expect that Y would also be above its mean.

Page 15: Contact Information

Descriptive Statistics (cont.)

  – Correlation Coefficient: the correlation coefficient between X and Y “norms” the covariance by the standard deviations of X and Y. You can think of this adjustment as a unit correction. The correlation coefficient will always fall between -1 and 1.

$$Corr(X, Y) = \rho_{XY} = \frac{Cov(X, Y)}{(s.d.\,X)(s.d.\,Y)}$$

Page 16: Contact Information

A Quick Example

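$$Y = \{8, 4, 4, 6, 13\} \qquad X = \{3, 4, 2, -2, 8\}$$

$$\bar{Y} = \frac{1}{5}(8 + 4 + 4 + 6 + 13) = \frac{35}{5} = 7$$

$$\bar{X} = \frac{1}{5}(3 + 4 + 2 - 2 + 8) = \frac{15}{5} = 3$$

$$median_Y = 6 \qquad median_X = 3$$

The mode of Y is 4. Every value of X occurs exactly once, so the modes of X are -2, 2, 3, 4, and 8.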

Page 17: Contact Information

A Quick Example (cont.)

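$$Var(Y) = \frac{1}{5}\left[(8-7)^2 + (4-7)^2 + (4-7)^2 + (6-7)^2 + (13-7)^2\right] = \frac{1}{5}[1 + 9 + 9 + 1 + 36] = \frac{56}{5} = 11.2$$

$$Var(X) = \frac{1}{5}\left[(3-3)^2 + (4-3)^2 + (2-3)^2 + (-2-3)^2 + (8-3)^2\right] = \frac{1}{5}[0 + 1 + 1 + 25 + 25] = \frac{52}{5} = 10.4$$

$$StdDev(Y) = \sqrt{\frac{56}{5}} \approx 3.3$$

$$StdDev(X) = \sqrt{\frac{52}{5}} \approx 3.2$$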

Page 18: Contact Information

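A Quick Example (cont.)

$$
\begin{aligned}
Cov(X, Y) &= \frac{1}{5}\left[(8-7)(3-3) + (4-7)(4-3) + (4-7)(2-3) + (6-7)(-2-3) + (13-7)(8-3)\right] \\
&= \frac{1}{5}\left[(1)(0) + (-3)(1) + (-3)(-1) + (-1)(-5) + (6)(5)\right] \\
&= \frac{1}{5}[0 - 3 + 3 + 5 + 30] = \frac{35}{5} = 7
\end{aligned}
$$

$$Corr(X, Y) = \frac{7}{\sqrt{\frac{56}{5}}\sqrt{\frac{52}{5}}} = \frac{35}{\sqrt{2912}} \approx 0.65$$

In the example, X and Y are strongly correlated.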
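The whole quick example can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library; the helper names `pop_var` and `pop_cov` are made up for illustration, and both divide by n (not n − 1), as the slide formulas do.

```python
import math
import statistics

Y = [8, 4, 4, 6, 13]
X = [3, 4, 2, -2, 8]


def pop_var(data):
    """Mean squared deviation from the mean (divides by n)."""
    m = sum(data) / len(data)
    return sum((d - m) ** 2 for d in data) / len(data)


def pop_cov(a, b):
    """Mean product of deviations from the two means (divides by n)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


mean_y, mean_x = sum(Y) / len(Y), sum(X) / len(X)
median_y, median_x = statistics.median(Y), statistics.median(X)

# multimode returns every value tied for most frequent (Python 3.8+).
modes_y, modes_x = statistics.multimode(Y), statistics.multimode(X)

var_y, var_x = pop_var(Y), pop_var(X)
sd_y, sd_x = math.sqrt(var_y), math.sqrt(var_x)

cov_xy = pop_cov(X, Y)
corr_xy = cov_xy / (sd_x * sd_y)

print(mean_y, mean_x, median_y, median_x, modes_y, modes_x)
print(var_y, var_x, round(sd_y, 2), round(sd_x, 2), cov_xy, round(corr_xy, 2))
```

Note that `statistics.variance` and `statistics.stdev` divide by n − 1 instead, so they would give slightly larger numbers than the slides.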

Page 19: Contact Information

Populations and Samples

• Two uses for statistics:
  – Describe a set of numbers
  – Draw inferences from a set of numbers we observe to a larger population

• The population is the underlying structure which we wish to study. Surveyors might want to relate 6000 randomly selected voters to all the voters in the United States. Macroeconomists might want to relate data about unemployment and inflation from 1958–2004 to the underlying process linking unemployment and inflation, to predict future realizations.

Page 20: Contact Information

Populations and Samples (cont.)

• We cannot observe the entire population.

• Instead, we observe a sample drawn from the population of interest.

• In the Monte Carlo demonstration from last time, an individual dataset was the sample and the Data Generating Process described the population.

Page 21: Contact Information

Populations and Samples (cont.)

• The descriptive statistics we use to describe data can also describe populations.

• What is the mean income in the United States?

• What is the variance of mortality rates across countries?

• What is the covariance between gender and income?

Page 22: Contact Information

Populations and Samples (cont.)

• In a sample, we know exactly the mean, variance, covariance, etc. We can calculate the sample statistics directly.

• We must infer the statistics for the underlying population.

• Means in populations are also called expectations.

Page 23: Contact Information

Populations and Samples (cont.)

• If the true mean income in the United States is b, then we expect a simple random sample to have sample mean b.

• In practice, any given sample will also include some “sampling noise.” We will observe not b, but b + e.

• If we have drawn our sample correctly, then on average the sampling error over many samples will be 0.

• We write this as E(e) = 0

Page 24: Contact Information

Probability

• A random variable X is a variable whose numerical value is determined by chance, the outcome of a random phenomenon
  – A discrete random variable has a countable number of possible values, such as 0, 1, and 2
  – A continuous random variable, such as time and distance, can take on any value in an interval

• A probability distribution P[Xi] for a discrete random variable X assigns probabilities to the possible values X1, X2, and so on

• For example, when a fair six-sided die is rolled, there are six equally likely outcomes, each with a 1/6 probability of occurring

Page 25: Contact Information

Mean, Variance, and Standard Deviation

• The expected value (or mean) of a discrete random variable X is a weighted average of all possible values of X, using the probability of each X value as weights:

$$\mu = E[X] = \sum_i X_i P[X_i] \qquad (17.1)$$

• The variance of a discrete random variable X is a weighted average, for all possible values of X, of the squared difference between X and its expected value, using the probability of each X value as weights:

$$\sigma^2 = E[(X - \mu)^2] = \sum_i (X_i - \mu)^2 P[X_i] \qquad (17.2)$$

• The standard deviation σ is the square root of the variance
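Equations (17.1) and (17.2) can be applied to the fair six-sided die mentioned earlier. The sketch below is illustrative: the function names are made up, and `Fraction` is used so the results are exact.

```python
import math
from fractions import Fraction

# Probability distribution of a fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}


def expected_value(dist):
    """Probability-weighted average of the possible values, as in (17.1)."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Probability-weighted average squared deviation from the mean, as in (17.2)."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


mu = expected_value(die)      # 7/2
var = variance(die)           # 35/12
sd = math.sqrt(var)           # roughly 1.71

print(mu, var, round(sd, 2))
```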

Page 26: Contact Information

Continuous Random Variables

• Our examples to this point have involved discrete random variables, for which we can count the number of possible outcomes:
  – The coin can be heads or tails; the die can be 1, 2, 3, 4, 5, or 6

• For continuous random variables, however, the outcome can be any value in a given interval

• A continuous probability density curve shows the probability that the outcome is in a specified interval as the corresponding area under the curve

Page 27: Contact Information

Expectations

• Expectations are means over all possible samples (think “super” Monte Carlo).

• Means are sums.

• Therefore, expectations follow the same algebraic rules as sums.

• See the Statistics Appendix for a formal definition of Expectations.

Page 28: Contact Information

Algebra of Expectations

• k is a constant.
• E(k) = k
• E(kY) = kE(Y)
• E(k+Y) = k + E(Y)
• E(Y+X) = E(Y) + E(X)
• E(ΣYi) = ΣE(Yi), where each Yi is a random variable.

Page 29: Contact Information

Variances

• Population variances are also expectations.

$$Var(X) = E[(X - E(X))^2]$$

$$
\begin{aligned}
Var(kX) &= E[(kX - E(kX))^2] \\
&= E[(kX - kE(X))^2] \\
&= E[(k(X - E(X)))^2] \\
&= k^2 \cdot E[(X - E(X))^2] \\
&= k^2 \cdot Var(X)
\end{aligned}
$$

Page 30: Contact Information

Algebra of Variances

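$$Var(k) = 0 \qquad (1)$$

$$Var(kY) = k^2 \cdot Var(Y) \qquad (2)$$

$$Var(k + Y) = Var(Y) \qquad (3)$$

$$Var(X + Y) = Var(X) + Var(Y) + 2\,Cov(X, Y) \qquad (4)$$

$$Var\left(\sum_{i=1}^{n} Y_i\right) = \sum_{i=1}^{n} Var(Y_i) + \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \neq i}}^{n} Cov(Y_i, Y_j) \qquad (5)$$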

• One value of independent observations is that Cov(Yi ,Yj ) = 0, killing all the cross-terms in the variance of the sum.
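The variance rules can be verified exactly on a small discrete example. The joint distribution below is hypothetical (it is not from the slides), and the helpers `expect`, `var`, and `cov` are illustrative names; `Fraction` keeps the arithmetic exact so the rules hold with equality.

```python
from fractions import Fraction

# Hypothetical joint distribution of (x, y); the probabilities sum to 1.
joint = {
    (0, 0): Fraction(3, 10),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(2, 10),
    (1, 1): Fraction(4, 10),
}


def expect(f):
    """Expected value of f(x, y) under the joint distribution."""
    return sum(f(x, y) * p for (x, y), p in joint.items())


def var(f):
    mu = expect(f)
    return expect(lambda x, y: (f(x, y) - mu) ** 2)


def cov(f, g):
    mu_f, mu_g = expect(f), expect(g)
    return expect(lambda x, y: (f(x, y) - mu_f) * (g(x, y) - mu_g))


def X(x, y):
    return x


def Y(x, y):
    return y


k = 3  # an arbitrary constant

rule1 = var(lambda x, y: k) == 0
rule2 = var(lambda x, y: k * y) == k ** 2 * var(Y)
rule3 = var(lambda x, y: k + y) == var(Y)
rule4 = var(lambda x, y: x + y) == var(X) + var(Y) + 2 * cov(X, Y)

print(rule1, rule2, rule3, rule4, cov(X, Y))
```

Because Cov(X, Y) is nonzero here, dropping the cross-term in rule (4) would give the wrong variance for X + Y.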

Page 31: Contact Information


2 random variables: joint and marginal distributions

• The joint probability distribution of X and Y, two discrete random variables, is the probability that the two variables simultaneously take on certain values, say x and y.

Example: Weather conditions and commuting times
• Let Y = 1 if the commute is short (less than 20 minutes) and Y = 0 otherwise.
• Let X = 0 if it is raining and X = 1 if it is not.

The joint probability is the frequency with which each of the four possible outcomes (X=0,Y=0) (X=1,Y=0) (X=0,Y=1) (X=1,Y=1) occurs over many repeated commutes

Page 32: Contact Information

Joint probability Distribution

|                     | Rain (X=0) | No Rain (X=1) | Total |
|---------------------|------------|---------------|-------|
| Long commute (Y=0)  | 0.15       | 0.07          | 0.22  |
| Short commute (Y=1) | 0.15       | 0.63          | 0.78  |
| Total               | 0.30       | 0.70          | 1.00  |

Over many commutes, 15% of the days have rain and a long commute, that is, P(X=0, Y=0) = 0.15. This is a joint probability distribution.

Page 33: Contact Information

Marginal Probability Distribution

• The marginal probability distribution of a random variable X is just another name for its probability distribution. The marginal distribution of X from above is

| X (weather condition) | P(x) |
|-----------------------|------|
| 0                     | 0.3  |
| 1                     | 0.7  |

• Find E(X) and Var(X)
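One way to work the exercise is to derive the marginal distribution from the joint table by summing over Y, then apply the expected value and variance formulas. A minimal sketch; the dictionary layout, keyed by (x, y), is just one convenient representation.

```python
# Joint distribution from the commuting example, keyed by (x, y):
# x = 0 rain, x = 1 no rain; y = 0 long commute, y = 1 short commute.
joint = {(0, 0): 0.15, (1, 0): 0.07, (0, 1): 0.15, (1, 1): 0.63}

# Marginal distribution of X: add the joint probabilities across all values of Y.
marginal_x = {}
for (x, _y), p in joint.items():
    marginal_x[x] = marginal_x.get(x, 0.0) + p

# E(X) is the probability-weighted average of the values of X.
e_x = sum(x * p for x, p in marginal_x.items())

# Var(X) is the probability-weighted average squared deviation from E(X).
var_x = sum((x - e_x) ** 2 * p for x, p in marginal_x.items())

print(marginal_x, round(e_x, 4), round(var_x, 4))
```

Since X only takes the values 0 and 1, it is a Bernoulli variable with p = 0.7, so E(X) = p = 0.7 and Var(X) = p(1 − p) = 0.21.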

Page 34: Contact Information

Conditional Probability Distribution

• The probability distribution of a random variable X conditional on another random variable Y taking on a specific value.

• The probability of X given Y: P(X=x|Y=y)
• P(X=x|Y=y) = P(X=x, Y=y) / P(Y=y)
• Conditional probability of X given Y = joint probability of x and y divided by marginal probability of Y (the condition)

Page 35: Contact Information

Conditional Distribution

• P(Y=0|X=0) = P(X=0,Y=0) / P(X=0) = 0.15/0.30 = 0.5

Conditional Expectation:

The conditional expectation of Y given X, that is, the conditional mean of Y given X, is the mean of the conditional distribution of Y given X:

$$E(Y \mid X = x) = \sum_{i=1}^{k} y_i \, P(Y = y_i \mid X = x)$$

Page 36: Contact Information

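Given that it is raining, the conditional expectation of Y is E(Y|X=0) = (0)*(0.15/0.30) + (1)*(0.15/0.30) = 0.5. Because Y = 1 denotes a short commute, this is the probability of a short commute on a rainy day (which here equals the probability of a long commute, also 0.5).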

Page 37: Contact Information

Law of Iterated Expectations

• The expected value of the expected value of Y conditional on X is the expected value of Y.

• If we take expectations separately for each subpopulation (each value of X), and then take the expectation of this expectation, we get back the expectation for the whole population.

$$E(E(Y \mid X)) = E(Y)$$
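The law can be checked against the commuting example: compute E(Y | X = x) for rain and no rain, weight each by P(X = x), and compare with E(Y) computed directly. A minimal sketch, with an illustrative helper name `cond_exp_y`:

```python
# Joint distribution from the commuting example, keyed by (x, y):
# x = 0 rain, x = 1 no rain; y = 0 long commute, y = 1 short commute.
joint = {(0, 0): 0.15, (1, 0): 0.07, (0, 1): 0.15, (1, 1): 0.63}

# Marginal probabilities P(X = x).
p_x = {x: sum(p for (xi, _), p in joint.items() if xi == x) for x in (0, 1)}


def cond_exp_y(x):
    """E(Y | X = x): sum of y * P(Y = y | X = x) over the values of y."""
    return sum(y * p / p_x[x] for (xi, y), p in joint.items() if xi == x)


# E(Y) computed directly from the joint distribution.
e_y = sum(y * p for (_, y), p in joint.items())

# E(E(Y | X)): average the conditional means using the weights P(X = x).
iterated = sum(cond_exp_y(x) * p_x[x] for x in p_x)

print(round(cond_exp_y(0), 4), round(cond_exp_y(1), 4), round(e_y, 4), round(iterated, 4))
```

Here E(Y | X = 0) = 0.5 and E(Y | X = 1) = 0.9, and 0.3 × 0.5 + 0.7 × 0.9 = 0.78 = E(Y).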

Page 38: Contact Information

Independence

• Two random variables X and Y are independently distributed, or independent, if knowing the value of one of the variables provides no information about the other. Specifically, when E(Y|X) = E(Y)

Or alternatively,
• P(Y=y|X=x) = P(Y=y) for all values of x and y

Or
• P(Y=y,X=x) = P(Y=y)*P(X=x)

That is, the joint distribution of two independent variables is the product of their marginal distributions

Page 39: Contact Information

Independence

Are commuting times and weather conditions independent?

P(Y=0,X=0) = 0.15

P(Y=0) * P(X=0) = 0.22 * 0.3 = 0.066

Since

$$P(Y=0, X=0) \neq P(Y=0) \cdot P(X=0)$$

X and Y are NOT independent.
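The same check can be run over every cell of the joint table rather than just one. A minimal sketch; the tolerance passed to `math.isclose` is an arbitrary choice to absorb floating-point rounding.

```python
import math

# Joint distribution from the commuting example, keyed by (x, y).
joint = {(0, 0): 0.15, (1, 0): 0.07, (0, 1): 0.15, (1, 1): 0.63}

# Marginal distributions of X and Y.
p_x = {x: sum(p for (xi, _), p in joint.items() if xi == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yi), p in joint.items() if yi == y) for y in (0, 1)}

# Under independence every joint probability equals the product of its marginals.
products = {(x, y): p_x[x] * p_y[y] for (x, y) in joint}
independent = all(
    math.isclose(joint[cell], products[cell], abs_tol=1e-9) for cell in joint
)

for cell in sorted(joint):
    print(cell, joint[cell], round(products[cell], 4))
print("independent:", independent)
```

A single cell where the joint probability differs from the product of the marginals is enough to rule out independence.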

Page 40: Contact Information

Covariance and Correlation

• Covariance is a measure of the extent to which two random variables move together

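• If X and Y are independent then the covariance is zero, but a covariance of zero does not imply independence. A zero covariance implies only that there is no linear relationship between X and Y

$$Cov(X, Y) = \sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - (E(X) \cdot E(Y)) = \mu_{XY} - \mu_X \mu_Y$$

$$\mu_{XY} = E(XY) = \sum_{i=1}^{k} \sum_{j=1}^{n} [X_i \cdot Y_j] \, P(X_i, Y_j)$$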

Page 41: Contact Information

Covariance and Correlation

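$$\mu_{XY} = E(XY) = \sum [X \cdot Y \, P(X, Y)] = (0 \cdot 0)(0.15) + (0 \cdot 1)(0.15) + (1 \cdot 0)(0.07) + (1 \cdot 1)(0.63) = 0.63$$

$$\mu_X = E(X) = \sum X \cdot P(X) = (0 \cdot 0.3) + (1 \cdot 0.7) = 0.7$$

$$\mu_Y = E(Y) = (0 \cdot 0.22) + (1 \cdot 0.78) = 0.78$$

$$Cov(X, Y) = \sigma_{XY} = \mu_{XY} - \mu_X \mu_Y = 0.63 - [(0.7) \cdot (0.78)] = 0.084$$

There is a positive relationship between commuting times and weather conditions: days without rain (X = 1) tend to go with short commutes (Y = 1).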

Page 42: Contact Information

Correlation

$$Corr(X, Y) = \rho_{XY} = \frac{Cov(X, Y)}{\sqrt{Var(X) \cdot Var(Y)}} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$$

Correlation solves the units problem of covariance. It is also a measure of dependence. It is unitless and has values between -1 and 1. A value of zero implies that X and Y are uncorrelated.
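Applying the covariance and correlation formulas to the commuting example gives a concrete number for the strength of the relationship. A minimal sketch; the helper `expect` is an illustrative name.

```python
import math

# Joint distribution from the commuting example, keyed by (x, y):
# x = 0 rain, x = 1 no rain; y = 0 long commute, y = 1 short commute.
joint = {(0, 0): 0.15, (1, 0): 0.07, (0, 1): 0.15, (1, 1): 0.63}


def expect(f):
    """Expected value of f(x, y) under the joint distribution."""
    return sum(f(x, y) * p for (x, y), p in joint.items())


mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
mu_xy = expect(lambda x, y: x * y)

# Covariance as E(XY) - E(X)E(Y).
cov_xy = mu_xy - mu_x * mu_y

var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)

# Correlation: covariance divided by the product of the standard deviations.
corr_xy = cov_xy / math.sqrt(var_x * var_y)

print(round(cov_xy, 4), round(var_x, 4), round(var_y, 4), round(corr_xy, 4))
```

The covariance of 0.084 looks small, but the correlation of roughly 0.44 shows a moderate positive association once the units are removed.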

Page 43: Contact Information

Standardized Variables

• To standardize a random variable X, we subtract its mean μ and then divide by its standard deviation σ:

$$Z = \frac{X - \mu}{\sigma} \qquad (17.3)$$

• No matter what the initial units of X, the standardized random variable Z has a mean of 0 and a standard deviation of 1
• The standardized variable Z measures how many standard deviations X is above or below its mean:
  – If X is equal to its mean, Z is equal to 0
  – If X is one standard deviation above its mean, Z is equal to 1
  – If X is two standard deviations below its mean, Z is equal to –2
• Figures 17.4 and 17.5 illustrate this for the case of dice and fair coin flips, respectively
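As a small illustration of equation (17.3), the sketch below standardizes the outcome of a single fair six-sided die and confirms that the standardized variable has mean 0 and standard deviation 1. The variable names are illustrative.

```python
import math

# Probability distribution of a fair six-sided die.
die = {face: 1 / 6 for face in range(1, 7)}

mu = sum(x * p for x, p in die.items())
sigma = math.sqrt(sum((x - mu) ** 2 * p for x, p in die.items()))

# Standardize each possible value: Z = (X - mu) / sigma.
z_dist = {(x - mu) / sigma: p for x, p in die.items()}

# The standardized variable keeps the same probabilities but has mean 0 and sd 1.
z_mean = sum(z * p for z, p in z_dist.items())
z_sd = math.sqrt(sum((z - z_mean) ** 2 * p for z, p in z_dist.items()))

for z, p in z_dist.items():
    print(round(z, 3), round(p, 3))
print(round(z_mean, 6), round(z_sd, 6))
```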

Page 44: Contact Information

Figure 17.4a Probability Distribution for Six-Sided Dice, Using Standardized Z

Page 45: Contact Information

Figure 17.4b Probability Distribution for Six-Sided Dice, Using Standardized Z

Page 46: Contact Information

Figure 17.4c Probability Distribution for Six-Sided Dice, Using Standardized Z

Page 47: Contact Information

Figure 17.5a Probability Distribution for Fair Coin Flips, Using Standardized Z

Page 48: Contact Information

Figure 17.5b Probability Distribution for Fair Coin Flips, Using Standardized Z

Page 49: Contact Information

Figure 17.5c Probability Distribution for Fair Coin Flips, Using Standardized Z

Page 50: Contact Information

The Normal Distribution

• The density curve for the normal distribution is graphed in Figure 17.6

• The probability that the value of Z will be in a specified interval is given by the corresponding area under this curve

• These areas can be determined by consulting statistical software or a table, such as Table B-7 in Appendix B

• Many things follow the normal distribution (at least approximately):
  – The weights of humans, dogs, and tomatoes
  – The lengths of thumbs, widths of shoulders, and breadths of skulls
  – Scores on IQ, SAT, and GRE tests
  – The number of kernels on ears of corn, ridges on scallop shells, hairs on cats, and leaves on trees

Page 51: Contact Information

Figure 17.6 The Normal Distribution

Page 52: Contact Information

The Normal Distribution (cont.)

• The central limit theorem is a very strong result for empirical analysis that builds on the normal distribution

• The central limit theorem states that:
  – if Z is a standardized sum of N independent, identically distributed (discrete or continuous) random variables with a finite, nonzero standard deviation, then the probability distribution of Z approaches the normal distribution as N increases
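A short simulation can make the theorem concrete. The sketch below standardizes the sum of N dice rolls many times and checks how often Z lands within ±1.96, which would be about 95% of the time under a normal distribution. The choices N = 30, 20,000 trials, and the seed are arbitrary assumptions for illustration.

```python
import math
import random

rng = random.Random(0)      # fixed seed so the run is repeatable

N = 30                      # number of dice summed in each trial (arbitrary)
TRIALS = 20_000             # number of simulated sums (arbitrary)
MU = 3.5                    # mean of one fair die
SIGMA = math.sqrt(35 / 12)  # standard deviation of one fair die


def standardized_sum(n):
    """Roll n dice, then standardize the total by its mean and standard deviation."""
    total = sum(rng.randint(1, 6) for _ in range(n))
    return (total - n * MU) / (SIGMA * math.sqrt(n))


z_values = [standardized_sum(N) for _ in range(TRIALS)]

# Share of standardized sums within 1.96 standard deviations of zero.
share_within = sum(abs(z) <= 1.96 for z in z_values) / TRIALS
z_mean = sum(z_values) / TRIALS

print(round(z_mean, 3), round(share_within, 3))
```

A single die has a flat distribution, yet the standardized sum of 30 dice already behaves much like a normal variable.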

Page 53: Contact Information

Sampling

Recall that:
• Population: the entire group of items that interests us
• Sample: the part of this population that we actually observe
• Statistical inference involves using the sample to draw conclusions about the characteristics of the population from which the sample came

Page 54: Contact Information

Selection Bias

• Any sample that differs systematically from the population that it is intended to represent is called a biased sample

• One of the most common causes of biased samples is selection bias, which occurs when the selection of the sample systematically excludes or underrepresents certain groups
  – Selection bias often happens when we use a convenience sample consisting of data that are readily available
• Self-selection bias can occur when we examine data for a group of people who have chosen to be in that group

Page 55: Contact Information

Survivor and Nonresponse Bias

• A retrospective study looks at past data for a contemporaneously selected sample
  – for example, an examination of the lifetime medical records of 65-year-olds

• A prospective study, in contrast, selects a sample and then tracks the members over time

• By their very design, retrospective studies suffer from survivor bias: we necessarily exclude members of the past population who are no longer around!

• Nonresponse bias: The systematic refusal of some groups to participate in an experiment or to respond to a poll

Page 56: Contact Information

The Power of Random Selection

• In a simple random sample of size N from a given population:
  – each member of the population is equally likely to be included in the sample
  – every possible sample of size N from this population has an equal chance of being selected

• How do we actually make random selections?
• We would like a procedure that is equivalent to the following:
  – put the name of each member of the population on its own slip of paper
  – drop these slips into a box
  – mix thoroughly
  – pick members out randomly

• In practice, random sampling is usually done through some sort of numerical identification combined with a computerized random selection of numbers

Page 57: Contact Information

Estimation

• First, some terminology:
• Parameter: a characteristic of the population whose value is unknown, but can be estimated
• Estimator: a sample statistic that will be used to estimate the value of the population parameter
• Estimate: the specific value of the estimator that is obtained in one particular sample
• Sampling variation: the notion that because samples are chosen randomly, the sample average will vary from sample to sample, sometimes being larger than the population mean and sometimes lower

Page 58: Contact Information

Sampling Distributions

• The sampling distribution of a statistic is the probability distribution or density curve that describes the population of all possible values of this statistic
  – For example, it can be shown mathematically that if the individual observations are drawn from a normal distribution, then the sampling distribution for the sample mean is also normal
  – Even if the population does not have a normal distribution, the sampling distribution of the sample mean will approach a normal distribution as the sample size increases

• It can be shown mathematically that the sampling distribution for the sample mean has the following mean and standard deviation:

$$\text{Mean of } \bar{X} = \mu \qquad \text{Standard deviation of } \bar{X} = \sigma / \sqrt{N} \qquad (17.5)$$
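The result in (17.5) can be illustrated by simulation: draw many samples of size N, compute each sample mean, and compare the spread of those means with σ/√N. The sketch below uses fair dice as the population; the sample size, the number of samples, and the seed are arbitrary assumptions.

```python
import math
import random
import statistics

rng = random.Random(1)      # fixed seed so the run is repeatable

N = 25                      # observations per sample (arbitrary)
SAMPLES = 5_000             # number of samples drawn (arbitrary)
MU = 3.5                    # population mean of one fair die
SIGMA = math.sqrt(35 / 12)  # population standard deviation of one fair die

# Each entry is the mean of one random sample of N dice rolls.
sample_means = [
    statistics.fmean([rng.randint(1, 6) for _ in range(N)])
    for _ in range(SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
simulated_sd = statistics.pstdev(sample_means)
theoretical_sd = SIGMA / math.sqrt(N)

print(round(mean_of_means, 3), round(simulated_sd, 3), round(theoretical_sd, 3))
```

Quadrupling N should roughly halve the standard deviation of the sample mean, since it shrinks with the square root of the sample size.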

Page 59: Contact Information

The Mean of the Sampling Distribution

• A sample statistic is an unbiased estimator of a population parameter if the mean of the sampling distribution of this statistic is equal to the value of the population parameter

• Because the mean of the sampling distribution of $\bar{X}$ is μ, $\bar{X}$ is an unbiased estimator of μ

Page 60: Contact Information

The Standard Deviation of the Sampling Distribution

• One way of gauging the accuracy of an estimator is with its standard deviation:
  – If an estimator has a large standard deviation, there is a substantial probability that an estimate will be far from its mean
  – If an estimator has a small standard deviation, there is a high probability that an estimate will be close to its mean

Page 61: Contact Information

The t-Distribution

• When the mean of a sample from a normal distribution is standardized by subtracting the mean of its sampling distribution and dividing by the standard deviation of its sampling distribution, the resulting Z variable

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{N}}$$

has a normal distribution
• W.S. Gosset determined (in 1908) the sampling distribution of the variable that is created when the mean of a sample from a normal distribution is standardized by subtracting μ and dividing by its standard error (≡ the estimated standard deviation of an estimator):

$$t = \frac{\bar{X} - \mu}{s / \sqrt{N}}$$

Page 62: Contact Information

The t-Distribution (cont.)

• The exact distribution of t depends on the sample size
  – as the sample size increases, we are increasingly confident of the accuracy of the estimated standard deviation

• Table B-1 at the end of the textbook shows some probabilities for various t-distributions that are identified by the number of degrees of freedom:

degrees of freedom = # observations - # estimated parameters

Page 63: Contact Information

Confidence Intervals

• A confidence interval measures the reliability of a given statistic such as $\bar{X}$

• The general procedure for determining a confidence interval for a population mean can be summarized as:

1. Calculate the sample average $\bar{X}$
2. Calculate the standard error of $\bar{X}$ by dividing the sample standard deviation s by the square root of the sample size N
3. Select a confidence level (such as 95 percent) and look in Table B-1 with N-1 degrees of freedom to determine the t-value that corresponds to this probability
4. A confidence interval for the population mean is then given by:

$$\bar{X} \pm t^{*} \cdot s / \sqrt{N}$$
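The four steps can be traced in code. The sketch below treats the five Y values from the earlier quick example as a hypothetical random sample; that reuse is an assumption made purely for illustration. The t-value 2.776 is the standard two-sided 95 percent critical value for 4 degrees of freedom, hardcoded here in place of a table lookup.

```python
import math
import statistics

# Hypothetical sample (the Y values from the quick example, reused for illustration).
sample = [8, 4, 4, 6, 13]
n = len(sample)

# Step 1: the sample average.
x_bar = statistics.mean(sample)

# Step 2: the standard error, using the sample standard deviation (divides by n - 1).
s = statistics.stdev(sample)
standard_error = s / math.sqrt(n)

# Step 3: t-value for 95 percent confidence with n - 1 = 4 degrees of freedom,
# taken from a standard t table rather than computed.
T_VALUE_95_DF4 = 2.776

# Step 4: the confidence interval is the sample mean plus or minus t * standard error.
margin = T_VALUE_95_DF4 * standard_error
lower, upper = x_bar - margin, x_bar + margin

print(x_bar, round(s, 3), round(standard_error, 3), round(lower, 2), round(upper, 2))
```

With only five observations the interval is wide, roughly 2.4 to 11.6, which reflects how little a small sample pins down the population mean.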

Page 64: Contact Information

Sampling from Finite Populations

• Notably, a confidence interval does not depend on the size of the population

• This may at first seem surprising: if we are trying to estimate a characteristic of a large population, then wouldn’t we also need a large sample?

• The reason why the size of the population doesn’t matter is that the chances that the luck of the draw will yield a sample whose mean differs substantially from the population mean depend on the size of the sample and the chances of selecting items that are far from the population mean
  – That is, not on how many items there are in the population

Page 65: Contact Information

Key Terms from Chapter 17

• Random variable
• Probability distribution
• Expected Value
• Mean
• Variance
• Standard deviation
• Standardized random variable
• Population
• Sample
• Selection, survivor, and nonresponse bias
• Sampling distribution
• Population mean
• Sample mean
• Population standard deviation
• Sample standard deviation
• Degrees of freedom
• Confidence interval