

Econometric Methods for Valuation Analysis

Margarita Genius

Dept of Economics

M. Genius (Univ. of Crete) Econometric Methods for Valuation Analysis Cagliari, 2017 1 / 26


Correlation Analysis Simple Regression Outline

We will examine the concept of correlation between two variables

Calculate the simple correlation between two variables

Understand how to use the simple regression model to explain the relationship between two variables

Understand how to use the simple linear regression model to make forecasts of one variable

Perform tests of hypothesis in the simple linear regression model

Use Stata to perform linear regression using housing data


Correlation Analysis and Simple Regression Correlation Analysis

Question Type 1: Is the number of patents a firm applies for correlated with the level of R&D expenditures? Is the price of a house correlated with the level of air pollution?

Question Type 2: What will be the effect of an increase in R&D expenditures on the number of patents? What will be the effect of an increase in the level of pollution on the price of a house?

The first type of question can be answered using correlation analysis.

For the second type we need to postulate a model that explains the relationship between the number of patents and R&D expenditures in the first case, and between house prices and measurements of air pollution in the second case. This model is the simple regression model.


Correlation Analysis and Simple Regression Correlation Analysis: Application and data

Linear regression will be applied to a hedonic model (i.e., a model that assumes that the price of a good is a function of its characteristics).

The data come from Harrison and Rubinfeld (1978), who used a hedonic model to study how house values are affected by air pollution in Boston. The ultimate goal was to estimate willingness to pay for clean air. The data have been downloaded from Wooldridge, https://ideas.repec.org/p/boc/bocins/hprice2.html

Harrison, D. and D.L. Rubinfeld (1978) "Hedonic housing prices and the demand for clean air", Journal of Environmental Economics and Management, 5, 81-102.

The basic idea: if two houses are identical and differ only in pollution levels, then the difference in their values should reflect willingness to pay for clean air.

The data consist of 506 census tracts from the 1970 US census, so they are aggregate data rather than data on individual houses. Only information on owner-occupied one-family houses was included.


Correlation Analysis and Simple Regression Correlation Analysis: Scatter Plot

Below you can see the scatter plot of price in $ versus nox (parts per 100 million), where price is the median housing price and nox is the concentration of nitrogen oxides (the axis label in the data reads "nit ox concen").

[Figure: scatter plot of median housing price, $ (vertical axis, 0 to 50000) against nit ox concen; parts per 100m (horizontal axis, 4 to 9)]

Stata command: scatter price nox

Is there a strong relationship between price and nox? Are they positively or negatively related?


Simple Regression Correlation Analysis: The correlation coefficient

The correlation coefficient measures the strength of the linear relationship between two variables. If we have a sample of n observations on two variables X and Y, the sample correlation coefficient r is computed as follows:

Correlation Coefficient: r

\[ r = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2 \, \sum_{i=1}^{n} (x_i - \bar{x})^2}} \]

Note

the sample correlation coefficient is an estimate of the population correlation (ρ)

$-1 \le r \le 1$

the value of r is independent of the units of measurement of both variables, and therefore we can compare the values of r for different pairs of variables.
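As a minimal sketch of the formula above, the following Python function computes r directly from the sums of deviations. The function name `sample_correlation` and the toy numbers are illustrative assumptions, not part of the housing data set; the slides themselves use Stata.

```python
import math


def sample_correlation(x, y):
    """Sample correlation coefficient r between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of cross-products of deviations from the sample means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the sums of squared deviations.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical toy data (not taken from the slides).
x = [4.0, 5.0, 6.0, 7.0, 8.0]
y = [30.0, 26.0, 27.0, 21.0, 20.0]

r = sample_correlation(x, y)
# Changing the units of y (for example, dollars to thousands of dollars)
# should leave r unchanged.
r_rescaled = sample_correlation(x, [yi / 1000 for yi in y])

print(f"r = {r:.4f}, r after rescaling y = {r_rescaled:.4f}")
```

The second call illustrates the note on units of measurement: dividing every y by 1000 rescales the numerator and the denominator by the same factor, so r does not change.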


Simple Regression Correlation Analysis

Figure: Scatter plots of y against x for different values of r (five panels: r=1, r=0, r=-1, r=0.6, r=-0.6)


Simple Regression Correlation Analysis

when r=1.0 we have perfect positive correlation: all points in the scatter plot lie on an upward sloping line

when r=0 there is no linear correlation between the two variables

when r=-1.0 we have perfect negative correlation: all points in the scatter plot lie on a downward sloping line

when r=0.6 we have a weaker positive correlation: the points are scattered around an upward sloping line

when r=-0.6 we have a weaker negative correlation: the points are scattered around a downward sloping line

Note: r=0 means that there is no linear relationship between the two variables. IT DOES NOT MEAN THEY ARE INDEPENDENT.
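The note can be illustrated with a small Python sketch in which y is completely determined by x and yet r is zero. The data and the helper `sample_correlation` are illustrative assumptions, not material from the slides.

```python
import math


def sample_correlation(x, y):
    """Sample correlation coefficient r between two equal-length sequences."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data: x is symmetric around zero and y = x squared,
# so y depends on x exactly, but the dependence is not linear.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]

r = sample_correlation(x, y)
print(f"r = {r:.4f}")
```

Here the positive and negative cross-products cancel, so the correlation coefficient does not detect the (quadratic) relationship.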


Simple Regression Correlation Analysis

The existence of a significant correlation between two variables X and Y does not mean that there is a cause-effect relationship between them.

For instance, beer sales are highest when ice cream sales are highest because of the heat (summer months). We cannot say that beer sales cause people to buy ice cream, or vice versa. In this case we say that there exists a spurious correlation between beer sales and ice cream sales, i.e., there is a third factor (summer heat) which drives both of them.


Simple Regression The Simple Regression Model

We saw from the scatter plot of price versus nox that there is a negative relation between the two variables, although it is not very strong: as we will see later, the correlation coefficient is r=-0.426.

We would now like to find a model that explains the relationship between the two variables, so that we can use it to make forecasts of prices or predict the effect of changes in the level of pollution. We are therefore assuming that there is a cause-effect relationship between the two variables.

So we are assuming that there is a variable y whose changes can be explained by changes in another variable x. If we further assume that the relationship is linear, then we have

THE SIMPLE REGRESSION MODEL

\[ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \]


Simple Regression The Simple Regression Model

Main Components of Simple Regression Model

$y_i$ is the value of the dependent variable for observation i

$x_i$ is the value of the independent or explanatory variable for observation i

$\beta_0$ is the intercept of the regression line

$\beta_1$ is the slope of the regression line

$\beta_0$ and $\beta_1$ are the parameters of the model

$\varepsilon_i$ is the error or disturbance term, i.e., the difference between the value of $y_i$ and the value of $E(y_i)$

Discussion in class: what does the error term capture?


Simple Regression Assumptions

Assumption 1: The model is linear.

Assumption 2: $E(\varepsilon_i) = 0$ for all i.

Assumption 3: $Var(\varepsilon_i) = \sigma^2$, i.e., the variance of the error term is the same for all observations. This means that the variance of Y is also $\sigma^2$ for all observations.

Assumption 4: The observations of the error term are independent of one another, and therefore the observations of Y are also independent.

Assumption 5: The distribution of all $\varepsilon_i$ is Normal. This assumption is needed when we test hypotheses with small samples. When the sample size is large, the Central Limit Theorem means that we do not need to make this assumption.


Simple Regression Assumptions

Using the equation $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ and Assumption 2 we have,

The Population Regression Line

\[ E(y_i) = \beta_0 + \beta_1 x_i \]

It follows that $y_i = E(y_i) + \varepsilon_i$. So the observed value of $y_i$ can be decomposed into two parts: one part that can be predicted from $x_i$, i.e. $E(y_i)$, and one part that is random, is due to other factors that affect $y_i$, and is represented by $\varepsilon_i$.


Simple Regression The Regression Coefficients

The simple linear regression model has two regression coefficients:

Regression Coefficients

$\beta_0$ is the value of E(y) when x = 0, so it is the point where the line intersects the vertical axis (intercept)

$\beta_1$ measures the average change in the dependent variable when x increases by one unit (slope of the regression line). It can therefore be positive or negative depending on the relationship between x and y.

The units of measurement of y and x will affect the interpretation of the values of $\beta_0$ and $\beta_1$.

If the values of the regression coefficients were known to us we could use them to make forecasts for y, but since they are unknown we have to estimate them using data for y and x.

The estimation method we will use is called the Least Squares Method


Simple Regression The Method of Least Squares: OLS

The objective is to find estimates of $\beta_0$ and $\beta_1$ (denoted by $b_0$ and $b_1$) so that the estimated line represents the linear relationship between x and y in the best way. How? Let the estimated line be given by

\[ \hat{y}_i = b_0 + b_1 x_i \]

where $\hat{y}_i$ is the value predicted by the regression model.

The difference between the actual value and the predicted value is called the residual and is denoted by $e_i$:

\[ e_i = y_i - \hat{y}_i \]

The least squares method determines a regression line (in other words, values of $b_0$ and $b_1$) that minimizes the sum of squared residuals $\sum_{i=1}^{n} e_i^2$.
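The slides do not write out the minimizer. As a hedged sketch, the Python code below uses the standard closed-form solution of this minimization problem for one regressor, b1 = S_xy / S_xx and b0 = ȳ - b1 x̄, where S_xy and S_xx are sums of deviations from the sample means. The function name `ols_fit` and the toy data are illustrative assumptions.

```python
def ols_fit(x, y):
    """Least squares intercept b0 and slope b1 for the line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx           # slope
    b0 = y_bar - b1 * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return b0, b1


def sum_squared_residuals(x, y, b0, b1):
    """Sum of squared residuals e_i = y_i - (b0 + b1 * x_i)."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical toy data (not taken from the slides).
x = [4.0, 5.0, 6.0, 7.0, 8.0]
y = [30.0, 26.0, 27.0, 21.0, 20.0]

b0, b1 = ols_fit(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse_ols = sum_squared_residuals(x, y, b0, b1)
# Any other line, for example one with a slightly different intercept,
# has a larger sum of squared residuals.
sse_shifted = sum_squared_residuals(x, y, b0 + 0.1, b1)

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"sum of squared residuals: OLS {sse_ols:.3f}, shifted line {sse_shifted:.3f}")
```

With an intercept in the model the residuals sum to zero, which is a quick sanity check on any hand computation.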


Simple Regression The Method of Least Squares: OLS

Sample Regression Line

\[ \hat{y}_i = b_0 + b_1 x_i \]

The sample regression line can be used to forecast the value of y for different values of x


Simple Regression The Coefficient of Determination: R-squared

The coefficient of determination measures what proportion of the total variation in the dependent variable is explained by its relationship with the independent variable, that is, by the regression model. It therefore measures how well the model fits the data and is computed after a model has been estimated.

Total Variation Decomposition

SST = SSR + SSE

Total variation in Y = variation of Y explained by the regression + variation that is not explained by the regression

The coefficient of determination is given by

Coefficient of Determination

\[ R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST} \]

Note,

$0 \le R^2 \le 1$

The closer it is to 1 the better the fit

In the simple regression model $R^2 = r^2$: the coefficient of determination is equal to the square of the correlation coefficient between x and y.
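The decomposition and the identity R² = r² can be checked numerically. The sketch below is in Python with hypothetical toy data; it assumes the usual definitions SST = Σ(y_i - ȳ)², SSR = Σ(ŷ_i - ȳ)² and SSE = Σ(y_i - ŷ_i)², which the slides name but do not write out.

```python
import math

# Hypothetical toy data (not taken from the slides).
x = [4.0, 5.0, 6.0, 7.0, 8.0]
y = [30.0, 26.0, 27.0, 21.0, 20.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

# Least squares estimates and fitted values.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained by the regression
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # unexplained (residual)

r_squared = ssr / sst
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"SST = {sst:.3f}, SSR + SSE = {ssr + sse:.3f}")
print(f"R^2 = {r_squared:.4f}, r^2 = {r ** 2:.4f}")
```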


Simple Regression Confidence Intervals and Testing Hypotheses about β1

Confidence Intervals for β1

A $100(1-\alpha)\%$ confidence interval for $\beta_1$ is given by

\[ b_1 - t_{n-2,\,\alpha/2}\, S_{b_1} < \beta_1 < b_1 + t_{n-2,\,\alpha/2}\, S_{b_1} \]

where $t_{n-2,\,\alpha/2}$ is the value from the table of the t distribution with n − 2 degrees of freedom and $S_{b_1}$ is the standard error of $b_1$.

Testing Hypotheses about β1

\[ H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0 \]

We use the t-statistic, which under $H_0$ follows a t distribution with n − 2 degrees of freedom and is given by

Test Statistic

\[ t = \frac{b_1}{S_{b_1}} \]


Simple Regression Confidence Intervals and Testing Hypotheses about β1

For a test at significance level α, we reject the null hypothesis if

\[ t > t_{n-2,\,\alpha/2} \quad \text{or} \quad t < -t_{n-2,\,\alpha/2} \]

where t is the t-statistic. Alternatively we can use the p-value and reject the null hypothesis if p-value < α.

If we do not reject the null hypothesis then we say that the coefficient is not statistically significant (or that x has no significant effect on y). If we reject the null hypothesis then we say that the coefficient is statistically significant (or that x has a significant effect on y).


Simple Regression Application: House Prices and Pollution-Data

We are going to analyze regional (census tracts in Boston) median housing prices. The following variables appear in the data set in the given order:

Variables: price crime nox rooms dist radial proptax stratio lowstat lprice lnox lproptax

Variable labels:

1. price: median housing price, $
2. crime: crimes committed per capita
3. nox: nitrous oxide, parts per 100 mill.
4. rooms: avg number of rooms per house
5. dist: weighted dist. to 5 employ centers
6. radial: accessibility index to radial hghwys
7. proptax: property tax per 1000
8. stratio: average student-teacher ratio
9. lowstat: % of people 'lower status'
10. lprice: log(price)
11. lnox: log(nox)
12. lproptax: log(proptax)


Simple Regression Application: House Prices and Pollution-Summary Statistics

Stata command "summarize" gives us the descriptive statistics of the data

Table: Summary statistics

Variable    Mean        Std. Dev.   Min.     Max.
price       22511.51    9208.856    5000     50001
crime       3.612       8.59        0.006    88.976
nox         5.55        1.158       3.85     8.710
rooms       6.284       0.703       3.56     8.779
dist        3.796       2.106       1.13     12.13
radial      9.548       8.707       1        24
proptax     40.824      16.854      18.7     71.100
stratio     18.459      2.166       12.6     22
lowstat     12.701      7.238       1.73     39.07
lprice      9.941       0.409       8.516    10.82
lnox        1.693       0.201       1.348    2.164
lproptax    5.931       0.396       5.231    6.567

N           506


Simple Regression Application: House Prices and Pollution-Correlations

Stata has been used to produce the scatter plot we saw before, to compute the correlation coefficient, and to estimate the simple regression model.

The following table shows the bivariate correlations between price, nox and rooms. Comment on the signs, the strength and the significance.

Table: Cross-correlation table (significance levels in parentheses)

Variables   price, $   nox        rooms
price, $    1.000
nox         -0.426     1.000
            (0.000)
rooms       0.696      -0.303     1.000
            (0.000)    (0.000)

In Stata: select "Statistics", "Summaries, tables, and tests", "Summary and descriptive statistics", "Pairwise correlations", then enter the names of the variables and tick "Print significance level for each entry".
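One way to read the significance level reported under the correlation between price and nox: assuming the usual t test for a zero population correlation (a standard result that is not stated on the slide), the reported r = -0.426 with n = 506 gives approximately

```latex
\[
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}
  = \frac{-0.426\,\sqrt{504}}{\sqrt{1-0.426^{2}}}
  \approx \frac{-9.564}{0.9047}
  \approx -10.57
\]
```

A value this far from zero with 504 degrees of freedom is consistent with the reported significance level of 0.000. The figure is approximate because r is rounded to three decimals in the table.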


Simple Regression Application: House Prices and Pollution-OLS estimates

Results from least squares estimates for the model price = β0 + β1 nox + ε

Stata command "regress price nox"

Table: Dep = price

Variable     Coefficient     (Std. Err.)
nox          -3386.853**     (320.362)
Intercept    41307.806**     (1816.182)

N            506
R2           0.182
F(1,504)     111.766

Significance levels: †: 10%   *: 5%   **: 1%
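The numbers in this table can be cross-checked against the t test and confidence interval described earlier. The Python sketch below copies the slope, its standard error, n and F from the table; the critical value 1.965 is an assumption (an approximate two-sided 5% value of the t distribution with 504 degrees of freedom, taken from standard tables rather than from the slides).

```python
# Estimates copied from the regression table for price on nox.
b1 = -3386.853        # slope estimate
se_b1 = 320.362       # standard error of the slope
n = 506               # number of observations
f_reported = 111.766  # F(1, 504) statistic from the table

# Assumption: approximate two-sided 5% critical value of the t distribution
# with n - 2 = 504 degrees of freedom (not given in the slides).
T_CRIT_5PCT = 1.965

t_stat = b1 / se_b1
ci_lower = b1 - T_CRIT_5PCT * se_b1
ci_upper = b1 + T_CRIT_5PCT * se_b1
reject_h0 = abs(t_stat) > T_CRIT_5PCT

print(f"t statistic: {t_stat:.3f}")
print(f"t squared:   {t_stat ** 2:.3f} (reported F: {f_reported})")
print(f"95% CI for the slope: [{ci_lower:.1f}, {ci_upper:.1f}]")
print(f"Reject H0 at the 5% level: {reject_h0}")
```

In a regression with a single explanatory variable the F statistic equals the square of the slope's t statistic, so agreement between the two is a useful check that the table was read correctly.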


Simple Regression Application: House Prices and Pollution-Interpretation

The estimated model is

\[ \widehat{price} = 41307.806 - 3386.853\, nox \]

Interpretation of the estimated model: an increase of one unit in nox (1 part per 100 million) decreases the average value of houses by 3386.853 $.

The variable nox is significant at the 1% level (this is denoted by ** in the table). The p-value reported by Stata is 0.000.

R-squared is only 0.182.

What would be the effect on the value of a house of implementing a policy that reduces pollution by 10%? Denoting with the subindex 0 the initial situation and with the subindex 1 the situation after implementing the policy, we have

\[ \widehat{price}_0 = 41307.806 - 3386.853\, nox_0 \quad \text{and} \quad nox_1 = 0.90\, nox_0 \]

\[ \widehat{price}_1 = 41307.806 - 3386.853\, nox_1 = 41307.806 - 3386.853 \times 0.90\, nox_0 \]

The change in price is given by

\[ \Delta P = \widehat{price}_1 - \widehat{price}_0 = 3386.853 \times 0.10\, nox_0 \]


Simple Regression Application: House Prices and Pollution

We found that the effect of a 10% decrease in nox is an increase in price of 338.6853 × nox (in $). This means that for an area with an average pollution level (nox = 5.55) the average price would increase by 338.6853 × 5.55 ≈ 1879.70 $.
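The same calculation can be written as a short Python sketch using the estimated coefficients. The function names `predicted_price` and `price_change_from_reduction` are hypothetical helpers introduced only for illustration.

```python
# Coefficients of the estimated model: price_hat = B0 + B1 * nox.
B0 = 41307.806
B1 = -3386.853


def predicted_price(nox):
    """Predicted median housing price ($) for a given nox level."""
    return B0 + B1 * nox


def price_change_from_reduction(nox0, reduction=0.10):
    """Change in predicted price when nox falls by the given fraction."""
    nox1 = (1.0 - reduction) * nox0
    return predicted_price(nox1) - predicted_price(nox0)


# Area with the average pollution level from the summary statistics.
delta_p = price_change_from_reduction(5.55)
print(f"Predicted price change for a 10% reduction at nox = 5.55: {delta_p:.2f} $")
```

Because the model is linear in nox, the change depends on the starting level: a 10% reduction is worth more in a tract with higher initial pollution.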

Discussion: Should we maybe include other important factors that affect house prices? We will discuss this in the next set of slides.


Simple Regression References

Wooldridge, J. M. (2015). Introductory Econometrics: A Modern Approach. South-Western College Pub.

Stock, J. H. and Watson, M. W. (2010). Introduction to Econometrics. Addison-Wesley.
