Meeting 5 Indicator Variables Chapter7 of PoEmrubas/aeQEM/pdf/M5 indicator... · 2014-02-23 · 1...

Econometrics - QEM Page 1 Meeting 5 Meeting 5 Indicator Variables Chapter 7 of PoE Michal Rubaszek Based on presentation by Walter R. Paczkowski


Meeting 5: Indicator Variables

Chapter 7 of PoE

Michał Rubaszek
Based on presentation by Walter R. Paczkowski


Indicator (= dummy / binary / dichotomous) variables take two values (1 or 0) to indicate the presence or absence of a characteristic.

Indicator variables allow us to:
– account for qualitative factors in econometric models (e.g. gender, location, other examples?)
– construct models in which some or all regression model parameters, including the intercept, change for some observations in the sample

Generally, we define an indicator variable D as:

$$D = \begin{cases} 1 & \text{if characteristic is present} \\ 0 & \text{if characteristic is not present} \end{cases}$$


Consider a model to predict the value of a house (PRICE) as a function of its characteristics: size (SQFT) and location (D):

$$PRICE = \beta_1 + \delta D + \beta_2 SQFT + e$$

$$D = \begin{cases} 1 & \text{if property is in the desirable neighborhood} \\ 0 & \text{if property is not in the desirable neighborhood} \end{cases}$$

The predicted price is:

$$E(PRICE) = \begin{cases} (\beta_1 + \delta) + \beta_2 SQFT & \text{when } D = 1 \\ \beta_1 + \beta_2 SQFT & \text{when } D = 0 \end{cases}$$

Adding an indicator variable causes a parallel shift in the relationship by the amount δ.
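The parallel shift can be illustrated on simulated data. The sketch below is only an illustration: the sample, the "true" parameter values and the variable names are all assumptions, not taken from the lecture data. It fits the model with an intercept dummy by least squares and shows that the two fitted lines share one slope and differ only in their intercepts.

```python
import numpy as np

# Simulated data; every number below is an illustrative assumption.
rng = np.random.default_rng(0)
n = 1000
sqft = rng.uniform(1000, 3000, size=n)      # house size in square feet
d = rng.integers(0, 2, size=n)              # 1 = desirable neighborhood
beta1, delta, beta2 = 50.0, 30.0, 0.08      # hypothetical parameters (PRICE in $1000)
price = beta1 + delta * d + beta2 * sqft + rng.normal(0, 5, size=n)

# Design matrix with columns [1, D, SQFT]; least squares via lstsq.
X = np.column_stack([np.ones(n), d, sqft])
b1_hat, delta_hat, b2_hat = np.linalg.lstsq(X, price, rcond=None)[0]

# Both fitted lines have slope b2_hat; only the intercept shifts by delta_hat.
intercept_d0 = b1_hat
intercept_d1 = b1_hat + delta_hat
print(f"intercept when D = 0: {intercept_d0:.2f}")
print(f"intercept when D = 1: {intercept_d1:.2f}")
print(f"common slope:         {b2_hat:.4f}")
```

With real data the same design matrix would be built from the observed D and SQFT columns.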


7.1 Indicator Variables

7.1.1 Intercept Indicator Variables

FIGURE 7.1 An intercept indicator variable


Interpretation of δ? D = 0 defines the reference group. Let us reverse the definition of the dummy D:

$$LD = \begin{cases} 1 & \text{if property is not in the desirable neighborhood} \\ 0 & \text{if property is in the desirable neighborhood} \end{cases}$$

What is the estimate of λ in the new model:

$$PRICE = \beta_1 + \lambda LD + \beta_2 SQFT + e$$

Can we estimate

$$PRICE = \beta_1 + \delta D + \lambda LD + \beta_2 SQFT + e$$

knowing that D + LD = 1? The problem of exact collinearity.
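The exact collinearity problem can be checked numerically. The sketch below uses a small hypothetical sample of eight houses (all values are assumptions). Because D + LD reproduces the constant column, a design matrix with the intercept, D and LD has four columns but only rank three, so the four parameters cannot be estimated separately.

```python
import numpy as np

# Hypothetical sample of eight houses.
d = np.array([1, 0, 1, 0, 1, 0, 1, 0])        # 1 = desirable neighborhood
ld = 1 - d                                    # reversed dummy
sqft = np.array([1.0, 1.5, 2.0, 2.5, 1.8, 2.2, 3.0, 1.2])  # thousands of sq ft
const = np.ones_like(sqft)

X_both = np.column_stack([const, d, ld, sqft])  # intercept, D, LD, SQFT
X_one = np.column_stack([const, d, sqft])       # intercept, D, SQFT

rank_both = np.linalg.matrix_rank(X_both)
rank_one = np.linalg.matrix_rank(X_one)

print("D + LD equals the constant column:", np.array_equal(d + ld, const))
print(f"columns = {X_both.shape[1]}, rank = {rank_both}  (rank deficient)")
print(f"columns = {X_one.shape[1]}, rank = {rank_one}  (full column rank)")
```

Dropping either D or LD (or the intercept) restores full column rank, which is why one category serves as the reference group.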


Consider a model:

$$PRICE = \beta_1 + \beta_2 SQFT + \gamma (SQFT \times D) + e \qquad \text{(Eq. 7.5)}$$

The variable (SQFT × D) is called an interaction (slope-indicator / slope dummy) variable, as it allows for a change in the slope of the relationship.

The predicted value is:

$$E(PRICE) = \beta_1 + \beta_2 SQFT + \gamma (SQFT \times D) = \begin{cases} \beta_1 + (\beta_2 + \gamma) SQFT & \text{when } D = 1 \\ \beta_1 + \beta_2 SQFT & \text{when } D = 0 \end{cases}$$


7.1 Indicator Variables

FIGURE 7.2 (a) A slope-indicator variable (b) Slope- and intercept-indicator variables

$$\frac{\partial E(PRICE)}{\partial SQFT} = \begin{cases} \beta_2 + \gamma & \text{when } D = 1 \\ \beta_2 & \text{when } D = 0 \end{cases}$$


Consider a regression for house prices as:

$$PRICE = \beta_1 + \delta_1 UTOWN + \beta_2 SQFT + \gamma (SQFT \times UTOWN) + \beta_3 AGE + \delta_2 POOL + \delta_3 FPLACE + e \qquad \text{(Eq. 7.7)}$$

UTOWN – location near the University
FPLACE – fireplace


7.1 Indicator Variables

Table 7.2 House Price Equation Estimates

• The location premium for lots near the university is $27,453
• The price per additional square foot is $89.12 for houses near the university and $76.12 for houses in other areas
• Houses depreciate $190.10 per year
• A pool increases the value of a home by $4,377.20
• A fireplace increases the value of a home by $1,649.20
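Because the model contains both an intercept dummy and a slope dummy for UTOWN, the expected price gap between two otherwise identical houses depends on their size. The sketch below uses only the dollar figures reported in the bullets above; the function name and the 2,500 sq ft example size are assumptions made for illustration.

```python
# Reported estimates from the bullets above (in dollars).
premium_intercept = 27_453.0   # location premium, the intercept shift for UTOWN
slope_utown = 89.12            # $ per additional square foot near the university
slope_other = 76.12            # $ per additional square foot elsewhere


def utown_premium(sqft: float) -> float:
    """Expected price gap between otherwise identical houses near and away from the university."""
    return premium_intercept + (slope_utown - slope_other) * sqft


gap = utown_premium(2500)      # 2500 sq ft is a hypothetical house size
print(f"Expected premium for a 2,500 sq ft house: ${gap:,.0f}")
```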


Tests with indicator variables


7.2.1 Interactions Between Qualitative Factors

Consider the wage equation:

$$WAGE = \beta_1 + \beta_2 EDUC + \delta_1 BLACK + \delta_2 FEMALE + \gamma (BLACK \times FEMALE) + e$$

Does race / gender affect wages? The expected value is:

$$E(WAGE) = \begin{cases} \beta_1 + \beta_2 EDUC & \text{WHITE-MALE} \\ (\beta_1 + \delta_1) + \beta_2 EDUC & \text{BLACK-MALE} \\ (\beta_1 + \delta_2) + \beta_2 EDUC & \text{WHITE-FEMALE} \\ (\beta_1 + \delta_1 + \delta_2 + \gamma) + \beta_2 EDUC & \text{BLACK-FEMALE} \end{cases}$$

The null is $\delta_1 = \delta_2 = \gamma = 0$, so that the restricted model simplifies to:

$$WAGE = \beta_1 + \beta_2 EDUC + e$$

We can use the F test.
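The F test for this joint null compares the sum of squared errors of the restricted model (SSE_R) with that of the unrestricted model (SSE_U). A minimal sketch of the usual statistic follows; the SSE values and the sample size are hypothetical placeholders, not estimates from the lecture. Here J = 3 restrictions are tested and the unrestricted model has K = 5 parameters.

```python
def f_statistic(sse_r: float, sse_u: float, j: int, n: int, k: int) -> float:
    """F = [(SSE_R - SSE_U) / J] / [SSE_U / (N - K)]."""
    return ((sse_r - sse_u) / j) / (sse_u / (n - k))


# Hypothetical inputs: J = 3 restrictions (delta1 = delta2 = gamma = 0),
# K = 5 parameters in the unrestricted wage equation, N = 1000 observations.
f_value = f_statistic(sse_r=130.0, sse_u=100.0, j=3, n=1000, k=5)
print(f"F = {f_value:.2f}")
```

The computed value is then compared with the critical value of the F(J, N − K) distribution.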


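Consider the model:

$$PRICE = \beta_1 + \delta D + \beta_2 SQFT + \gamma (SQFT \times D) + e$$

so that for two locations we have:

$$E(PRICE) = \begin{cases} \alpha_1 + \alpha_2 SQFT & D = 1 \\ \beta_1 + \beta_2 SQFT & D = 0 \end{cases}$$

where $\alpha_1 = \beta_1 + \delta$ and $\alpha_2 = \beta_2 + \gamma$.

The Chow test for model stability is:

$$H_0: \alpha_1 = \beta_1 \text{ and } \alpha_2 = \beta_2$$

that is equivalent to

$$H_0: \delta = 0 \text{ and } \gamma = 0$$

We can verify the null with the F test.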
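A sketch of the Chow test on simulated data is given below; the data-generating values are assumptions chosen only for illustration. It also checks a property worth knowing: the SSE of the fully interacted (unrestricted) model equals the sum of the SSEs from separate regressions for each location.

```python
import numpy as np


def sse(X, y):
    """Sum of squared least-squares residuals from regressing y on X."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    return float(resid @ resid)


# Simulated sample; all parameter values are illustrative assumptions.
rng = np.random.default_rng(1)
n = 200
sqft = rng.uniform(10, 30, size=n)              # hundreds of square feet
d = (np.arange(n) % 2).astype(float)            # alternate the two locations
price = 20 + 25 * d + 7 * sqft + 1.5 * sqft * d + rng.normal(0, 10, size=n)

ones = np.ones(n)
X_r = np.column_stack([ones, sqft])                 # restricted: delta = gamma = 0
X_u = np.column_stack([ones, d, sqft, sqft * d])    # unrestricted

sse_r, sse_u = sse(X_r, price), sse(X_u, price)
J, K = 2, 4                                         # restrictions, unrestricted parameters
chow_f = ((sse_r - sse_u) / J) / (sse_u / (n - K))

# Equivalent route to SSE_U: separate regressions for each location.
mask = d == 1
sse_split = sse(X_r[mask], price[mask]) + sse(X_r[~mask], price[~mask])

print(f"Chow F = {chow_f:.1f}")
print(f"SSE_U = {sse_u:.2f}, sum of split SSEs = {sse_split:.2f}")
```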


Log-linear Models


Consider the wage equation in log-linear form:

$$\ln(WAGE) = \beta_1 + \beta_2 EDUC + \delta FEMALE \qquad \text{(Eq. 7.11)}$$

What is the interpretation of δ? Since

$$\ln(WAGE) = \begin{cases} \beta_1 + \beta_2 EDUC & \text{males } (FEMALE = 0) \\ (\beta_1 + \delta) + \beta_2 EDUC & \text{females } (FEMALE = 1) \end{cases}$$

we know that:

$$\ln(WAGE)_{FEMALES} - \ln(WAGE)_{MALES} = \delta$$

So 100δ% is approximately the percentage difference between female and male wages.


The estimated model is:

$$\widehat{\ln(WAGE)} = 1.6539 + 0.0962\,EDUC - 0.2432\,FEMALE$$
$$(se) \quad (0.0844) \quad (0.0060) \quad (0.0327)$$

– We estimate that there is approximately a 24.32% differential between male and female wages

Precisely, since:

$$\frac{WAGE_{FEMALES} - WAGE_{MALES}}{WAGE_{MALES}} = \frac{WAGE_{FEMALES}}{WAGE_{MALES}} - \frac{WAGE_{MALES}}{WAGE_{MALES}} = e^{\delta} - 1$$

the difference is:

$$100(e^{\delta} - 1)\% = 100(e^{-0.2432} - 1)\% = -21.59\%$$
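The gap between the approximate and the exact percentage difference can be reproduced directly from the reported coefficient on FEMALE. This is a small arithmetic check; the variable names are assumptions.

```python
import math

delta_hat = -0.2432                            # reported coefficient on FEMALE

approx_pct = 100 * delta_hat                   # log approximation
exact_pct = 100 * (math.exp(delta_hat) - 1)    # exact percentage difference

print(f"approximate: {approx_pct:.2f}%")
print(f"exact:       {exact_pct:.2f}%")
```

The approximation worsens as the absolute value of the coefficient grows, which is why the exact formula is preferred for coefficients of this size.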


Expected value from the log-linear model

It is important to remember that for the log-linear model:

$$\ln(y) = \beta_1 + \beta_2 x + e, \qquad e \sim N(0, \sigma^2)$$

the expected value of y is:

$$E(y) = \exp(\beta_1 + \beta_2 x + \sigma^2/2) = \exp(\beta_1 + \beta_2 x) \times \exp(\sigma^2/2)$$
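The practical consequence is that simply exponentiating the fitted log value understates E(y) by the factor exp(σ²/2). A minimal sketch follows, with hypothetical parameter values and function names chosen purely for illustration.

```python
import math


def expected_y(b1: float, b2: float, x: float, sigma2: float) -> float:
    """E(y) for the log-linear model with normally distributed errors."""
    return math.exp(b1 + b2 * x + sigma2 / 2)


def naive_prediction(b1: float, b2: float, x: float) -> float:
    """exp of the fitted log value, ignoring the error variance."""
    return math.exp(b1 + b2 * x)


# Hypothetical values, for illustration only.
b1, b2, sigma2, x = 1.5, 0.1, 0.25, 12

corrected = expected_y(b1, b2, x, sigma2)
naive = naive_prediction(b1, b2, x)
ratio = corrected / naive                      # equals exp(sigma2 / 2)

print(f"naive = {naive:.3f}, corrected = {corrected:.3f}, ratio = {ratio:.4f}")
```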


Treatment Effects


Problem of reasoning known as post hoc:

– One event's preceding another does not necessarily make the first the cause of the second, i.e. "correlation is not the same as causation"

– e.g. share prices in Hungary and Poland

In many cases data exhibit a selection bias:

– When membership in the treated group is determined by choice, the sample is not a random sample

– e.g. some people choose (self-select) to go to the hospital and others do not


Examples of questions in which the selection bias is a problem:
– "How much does an additional year of education increase wages?"
– "How much does participation in a job-training program increase wages?"
– "How much does a dietary supplement contribute to weight loss?"

Selection bias makes it difficult to measure a causal (= treatment) effect of x on y in a model:

$$y = \beta_1 + \beta_2 x + e$$


Our aim: a randomized controlled experiment
– randomly assign items to a treatment group and a control group
– compare the characteristics of the two groups

Consider a model:

$$y_i = \beta_1 + \beta_2 d_i + e_i, \qquad i = 1, \ldots, N$$

in which:

$$d_i = \begin{cases} 1 & \text{individual in treatment group} \\ 0 & \text{individual in control group} \end{cases}$$

so that:

$$E(y_i) = \begin{cases} \beta_1 + \beta_2 & \text{if in treatment group, } d_i = 1 \\ \beta_1 & \text{if in control group, } d_i = 0 \end{cases}$$


The LS estimator for β2, the treatment effect, is:

$$b_2 = \frac{\sum_{i=1}^{N} (d_i - \bar{d})(y_i - \bar{y})}{\sum_{i=1}^{N} (d_i - \bar{d})^2} = \bar{y}_1 - \bar{y}_0 \qquad \text{(Eq. 7.14)}$$

with:

$$\bar{y}_1 = \sum_{i=1}^{N_1} y_i / N_1, \qquad \bar{y}_0 = \sum_{i=1}^{N_0} y_i / N_0$$

The estimator b2 is called the difference estimator, because it is the difference between the sample means of the treatment and control groups.
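The equality between the least squares slope and the difference in group means can be verified on a tiny example. The data below are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical outcomes y and treatment indicators d (1 = treated, 0 = control).
d = [1, 1, 1, 0, 0, 0, 0, 1]
y = [12.0, 15.0, 11.0, 9.0, 8.0, 10.0, 7.0, 14.0]

n = len(y)
d_bar = sum(d) / n
y_bar = sum(y) / n

# Least squares slope from the formula in Eq. 7.14.
b2 = (sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
      / sum((di - d_bar) ** 2 for di in d))

# Difference between the treatment and control sample means.
y1_bar = sum(yi for di, yi in zip(d, y) if di == 1) / sum(d)
y0_bar = sum(yi for di, yi in zip(d, y) if di == 0) / (n - sum(d))
diff_means = y1_bar - y0_bar

print(f"b2 = {b2}, difference of means = {diff_means}")
```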


Since $y_i - \bar{y} = \beta_1 + \beta_2 d_i + e_i - \beta_1 - \beta_2 \bar{d} - \bar{e} = \beta_2 (d_i - \bar{d}) + (e_i - \bar{e})$ we have:

$$b_2 = \beta_2 + \frac{\sum_{i=1}^{N} (d_i - \bar{d})(e_i - \bar{e})}{\sum_{i=1}^{N} (d_i - \bar{d})^2} = \beta_2 + (\bar{e}_1 - \bar{e}_0)$$

To be unbiased we need:

$$E(\bar{e}_1 - \bar{e}_0) = E(\bar{e}_1) - E(\bar{e}_0) = 0$$

If we allow individuals to "self-select" into treatment and control groups, then:

$$E(\bar{e}_1) - E(\bar{e}_0)$$

is the selection bias in the estimation of the treatment effect.
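A small simulation can make the selection bias visible. The sketch below is an illustration under strong assumptions: a hypothetical treatment effect of 2, standard normal errors, and an extreme self-selection rule in which everyone with a positive unobserved error opts into treatment. It compares the difference estimator under random assignment and under self-selection.

```python
import random


def difference_estimator(y, d):
    """Difference between treatment and control sample means."""
    treated = [yi for yi, di in zip(y, d) if di == 1]
    control = [yi for yi, di in zip(y, d) if di == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


rng = random.Random(42)
n = 20_000
beta1, beta2 = 10.0, 2.0                       # hypothetical true parameters
e = [rng.gauss(0, 1) for _ in range(n)]        # unobserved errors

# (a) Random assignment: d is independent of e.
d_random = [rng.randint(0, 1) for _ in range(n)]
# (b) Self-selection: individuals with a high unobserved e choose treatment.
d_self = [1 if ei > 0 else 0 for ei in e]


def outcome(d):
    return [beta1 + beta2 * di + ei for di, ei in zip(d, e)]


b2_random = difference_estimator(outcome(d_random), d_random)
b2_self = difference_estimator(outcome(d_self), d_self)

print(f"true effect:        {beta2:.2f}")
print(f"random assignment:  {b2_random:.2f}")
print(f"self-selection:     {b2_self:.2f}")
```

Under self-selection the estimate overshoots the assumed effect by the difference in mean errors between the two groups, which is exactly the bias term above.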


How can we eliminate the self-selection bias?

– We need to randomly assign individuals to treatment and control groups

Consider a model in which the test score (TOTALSCORE) depends on the class size (SMALL, an indicator variable):

$$TOTALSCORE = \beta_1 + \beta_2 SMALL + e$$

The data show that the score of students in small classes is on average 13.9 points higher (LS estimate).


7.5 Treatment Effects

Table 7.7 Project STAR: Kindergarten

Extended specifications ($SCHOOL\_j$: dummy variables):

$$TOTALSCORE = \beta_1 + \beta_2 SMALL + \beta_3 TCHEXPER + e$$

$$TOTALSCORE_i = \beta_1 + \beta_2 SMALL_i + \beta_3 TCHEXPER_i + \sum_{j=2}^{79} \delta_j SCHOOL\_j_i + e_i$$


Question: how to check whether assignment to small classes is random?

– Regress SMALL on student characteristics and check for an overall significant relationship (if we reject the null, the assignment is not random)

Randomized controlled experiments are rare in economics (very expensive). Natural experiments, also called quasi-experiments, rely on observing real-world conditions that approximate what would happen in a randomized controlled experiment (treatment appears as if it were randomly assigned).


7.5 Treatment Effects

FIGURE 7.3 Difference-in-Differences Estimation

Example: find two students with the same characteristics (before), place them in small and normal-size classes, and compare the results during the next exam (after).


Difference-in-differences (abbreviated as D-in-D, DD, or DID) estimator of the treatment effect:

$$\hat{\delta} = (\hat{C} - \hat{E}) - (\hat{B} - \hat{A}) = (\bar{y}_{Treatment,After} - \bar{y}_{Control,After}) - (\bar{y}_{Treatment,Before} - \bar{y}_{Control,Before}) \qquad \text{(Eq. 7.18)}$$

where:

$\hat{A} = \bar{y}_{Control,Before}$ = mean for control group before policy
$\hat{B} = \bar{y}_{Treatment,Before}$ = mean for treatment group before policy
$\hat{E} = \bar{y}_{Control,After}$ = mean for control group after policy
$\hat{C} = \bar{y}_{Treatment,After}$ = mean for treatment group after policy


Consider the regression model:

$$y_{it} = \beta_1 + \beta_2 TREAT_i + \beta_3 AFTER_t + \delta (TREAT_i \times AFTER_t) + e_{it}$$

$$E(y_{it}) = \begin{cases} \beta_1 & TREAT = 0,\ AFTER = 0 \quad \text{[Control before = A]} \\ \beta_1 + \beta_2 & TREAT = 1,\ AFTER = 0 \quad \text{[Treatment before = B]} \\ \beta_1 + \beta_3 & TREAT = 0,\ AFTER = 1 \quad \text{[Control after = E]} \\ \beta_1 + \beta_2 + \beta_3 + \delta & TREAT = 1,\ AFTER = 1 \quad \text{[Treatment after = C]} \end{cases}$$

Using the least squares estimates, we have:

$$\hat{\delta} = \left[(b_1 + b_2 + b_3 + \hat{\delta}) - (b_1 + b_3)\right] - \left[(b_1 + b_2) - b_1\right] = (\bar{y}_{Treatment,After} - \bar{y}_{Control,After}) - (\bar{y}_{Treatment,Before} - \bar{y}_{Control,Before})$$
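The equivalence between the interaction coefficient and the four-means formula can be checked numerically. The sketch below uses simulated data with hypothetical parameter values. Because the regression has one parameter per group-period cell, the least squares estimate of δ reproduces the difference-in-differences of cell means exactly.

```python
import numpy as np

# Simulated data; all parameter values are illustrative assumptions.
rng = np.random.default_rng(2)
n = 400
treat = np.tile([0.0, 0.0, 1.0, 1.0], n // 4)   # TREAT indicator
after = np.tile([0.0, 1.0, 0.0, 1.0], n // 4)   # AFTER indicator
y = 20 + 3 * treat - 2 * after + 1.5 * treat * after + rng.normal(0, 1, size=n)

# Least squares on [1, TREAT, AFTER, TREAT x AFTER].
X = np.column_stack([np.ones(n), treat, after, treat * after])
b1, b2, b3, delta_hat = np.linalg.lstsq(X, y, rcond=None)[0]


def cell_mean(t, a):
    """Sample mean of y for the given TREAT / AFTER cell."""
    return y[(treat == t) & (after == a)].mean()


did_means = ((cell_mean(1, 1) - cell_mean(0, 1))
             - (cell_mean(1, 0) - cell_mean(0, 0)))

print(f"regression delta_hat  = {delta_hat:.4f}")
print(f"four-means D-in-D     = {did_means:.4f}")
```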


Example: let us test the effect of the change in the minimum wage on employment in fast food restaurants.

Given the sample we can calculate:

$$\hat{\delta} = (\overline{FTE}_{NJ,After} - \overline{FTE}_{PA,After}) - (\overline{FTE}_{NJ,Before} - \overline{FTE}_{PA,Before}) = (21.0274 - 21.1656) - (20.4394 - 23.3312) = 2.7536 \qquad \text{(Eq. 7.21)}$$
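The calculation in Eq. 7.21 can be reproduced from the four reported sample means. This is a short arithmetic check; the dictionary layout and variable names are assumptions.

```python
# Reported mean full-time-equivalent employment (FTE) by state and period.
fte = {
    ("NJ", "after"): 21.0274,
    ("PA", "after"): 21.1656,
    ("NJ", "before"): 20.4394,
    ("PA", "before"): 23.3312,
}

gap_after = fte[("NJ", "after")] - fte[("PA", "after")]      # treatment minus control, after
gap_before = fte[("NJ", "before")] - fte[("PA", "before")]   # treatment minus control, before
delta_hat = gap_after - gap_before

print(f"gap after  = {gap_after:.4f}")
print(f"gap before = {gap_before:.4f}")
print(f"D-in-D     = {delta_hat:.4f}")
```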


The differences-in-differences regression is:

$$FTE_{it} = \beta_1 + \beta_2 NJ_i + \beta_3 D_t + \delta (NJ_i \times D_t) + e_{it} \qquad \text{(Eq. 7.22)}$$


Since $NJ_i$ is time-invariant, using the differenced data, the differences-in-differences regression (Eq. 7.22) can be transformed into:

$$\Delta FTE_i = \beta_3 + \delta NJ_i + \Delta e_i$$

The estimate of the treatment effect using the differenced data:

$$\widehat{\Delta FTE} = -2.2833 + 2.7500\,NJ \qquad R^2 = 0.0146$$
$$(se) \quad (1.036) \quad (1.154)$$

is almost the same as the D-in-D estimate.

We fail to conclude that the minimum wage increase has reduced employment in these New Jersey fast food restaurants.