Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the...

33
RS - Returns and Data 1 1 Lecture 2-A Introduction: Review, Returns and Data Brooks (4 th edition): Chapter 2 All the information and material is on my webpage: https://www.bauer.uh.edu/rsusmel/4397/4397.htm Textbook: Introductory Econometrics for Finance, 4th edition or older, by Chris Brooks. Install R in your machine. We’ll run some simple programs. Two midterms and a final (paper for the MBA class). There is a project in between midterms. Homework due one week before each exam. This class 2

Transcript of Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the...

Page 1: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

1

1

Lecture 2-AIntroduction: Review, Returns

and Data

Brooks (4th edition): Chapter 2

• All the information and material is on my webpage:

https://www.bauer.uh.edu/rsusmel/4397/4397.htm

• Textbook: Introductory Econometrics for Finance, 4th edition or older, by Chris Brooks.

• Install R in your machine. We’ll run some simple programs.

• Two midterms and a final (paper for the MBA class). There is a project in between midterms.

• Homework due one week before each exam.

This class

2

Page 2: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

2

• This is an applied technical class, with some econometric theory and concepts, followed by related financial applications.

• We will review many math and statistical topics.

• Some technical material may be new to you, for example Linear Algebra. The new material is introduced to simplify exposition of main concepts. You will not be required to have a deep understanding of the new material, but you should be able to follow the intuition behind it.

• This is not a programming class, but we will use R to estimate models. I will cover some of the basics in class. But, the more you know, the more comfortable you will be running the programs.

• For some students, the class will be dry (“He fries my brain,” a student said last semester.)

This class

3

• How do we measure returns and risks of financial assets?

• Can we estimate expected returns with precision? What about the variance of returns?

• Can we explain asset returns?

• How can one explain variations in stock returns across various stocks?

• Are asset returns predictable? In the short run? In the long run?

• Does the risk of an asset vary with time? What are the implications? How can one model time-varying risk?

• Is the equity premium (excess returns of stocks over bonds) really that high?

Main Topics

4

Page 3: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

3

This course provides an introduction to the basics of financial econometrics, focusing on estimation of linear models and analysis of time series. There are many more topics in financial econometrics that we will not cover, among them:

• Credit risk management and probability of default

• Interest rate models and term structure models

• Analyzing high-frequency data and modeling market microstructure

• Estimating models for options

• Multivariate time series models

• Technical methods such as state-space models and the Kalman filter, Markov processes, copulae, nonparametric methods, ....

Not Covered Topics

5

What is Econometrics?

• Ragnar Frisch, Econometrica Vol.1 No. 1 (1933) revisited

“Experience has shown that each of these three view-points, that of statistics, economic theory, and mathematics, is a necessary, but not by itself a sufficient, condition for a real understanding of the quantitative relations in modern economic life.

It is the unification of all three aspects that is powerful. And it is this unification that constitutes econometrics.”

EconometricsMathematical Statistics

DataEconomic Theory

6

Page 4: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

4

• Economic Theory:

- The CAPM Ri – Rf = i (RMP – Rf )

• Mathematical Statistics:

- Method to estimate CAPM. For example,

Linear regression: Ri – Rf = αi + i (RMP – Rf ) + i

- Properties of bi (the LS estimator of i).

- Properties of different tests of CAPM. For example, a t-test for H0: αi =0.

• Data: Ri, Rf, and RMP

- Typical problems: Missing data, Measurement errors, Survivorship bias, Auto- and Cross-correlated returns, Time-varying moments.

What is Econometrics?

7

• Financial Econometrics is applied econometrics to financial data. That is, we study the statistical tools that are needed to analyze and address the specific types of questions and modeling challenges that appear in analyzing financial data.

• Finance is about the trade-off between risk and return.

– How do we measure risk and return?

– Can we predict them?

– How do we measure the trade-off?

– How much should I be compensated for taking a given risk?

• Thus, we will be concerned with quantifying rewards and risks associated with uncertain outcomes.

What is Financial Econometrics?

8

Page 5: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

5

Definition: Population

A population is the totality of the elements under study. We are interested in learning something about this population.

Examples: Number of alligators in Texas, percentage of unemployed workers in cities in the U.S., the total return of stocks in the U.S., the 10-year Japanese government bond yield.

A Random Variable (RV) X defined over a population is called the population RV.

Usually, the population is large, making a complete enumeration of all the values in the population impractical or impossible. Thus, the descriptive statistics describing the population –i.e., the population parameters– will be considered unknown.

Review – Population and Sample

9

• Typical situation in statistics: we want to make inferences about an unknown population parameter θ using a sample –i.e., a small collection of observations from the general population- {X1, …, Xn}.

• We summarize the information in the sample with a statistic, which is a function of the sample.

That is, any statistic summarizes the data, or reduces the information in the sample to a single number. To make inferences, we use the information in the statistic instead of the entire sample.

Get a statistic and Make inferences (“learn”)

Get a sample (size N or T)

Population Sample

Review – Population and Sample

Page 6: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

6

Definition: Sample

The sample is a (manageable) subset of elements of the population.

Example: The total returns of the stocks on the S&P 500 index.

Samples are collected to learn about the population. The process of

collecting information from a sample is referred to as sampling.

Definition: Random Sample

A random sample is a sample where the probability that any individual member from the population being selected as part of the sample is exactly the same as any other individual member of the population.

11

Review – Population and Sample

Example: The total returns of the stocks on the S&P 500 index is not a random sample.

In mathematical terms, given a random variable X with distribution F, a random sample of length n is a set of n independent, identically distributed (i.i.d.) random variables with distribution F.

12

Review – Population and Sample

Page 7: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

7

• The samples we collect to learn about the population by computing sample statistics are classified in three groups:

• Time Series Data: Collected over time on one or more variables, with a particular frequency of observation. For example, monthly S&P 500 returns, or 10’ IBM returns.

• Cross-sectional Data: Collected on one or more variables collected at a single point in time. For example, today we record all closing returns for the member of the S&P 500 index.

• Panel Data: Cross-sectional Data collected over time. For example, the CRSP database collects all stocks traded in U.S. markets at the daily frequency since 1962.

Review – Samples and Types of Data

13

• A statistic (singular) is a single measure of some attribute of a sample (for example, its arithmetic mean value). It is calculated by applying a function (statistical algorithm) to the values of the items comprising the sample, which are known together as a set of data.

Definition: Statistic

A statistic is a function of the observable random variable(s), which does not contain any unknown parameters.

Examples: sample mean ( ), sample variance (s2), minimum, median, (x1+xn)/2, etc.

Note: A statistic is distinct from a population parameter. A statistic will be used to estimate a population parameter. In this case, the statistic is called an estimator.

Review – Sample Statistic

14

Page 8: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

8

• Sample Statistics are used to estimate population parameters

Example: is an estimate of the population mean, μ.

Notation: Population parameters: Greek letters (μσ, θ, etc.)

Estimators: A hat over the Greek letter (θ).

Suppose we want to learn about the mean of IBM annual returns, μIBM. From the population, we get a sample: {X1962, X1963, …, Xn=2020}. Then, we compute a statistic, . As we will see later, on average is a good estimator of μ.

Calculate and infer μIBM

Get a sample (size N)

IBM returns {X1962, X1963, …, X2020}

Review – Population and Sample

• The definition of a sample statistic is very general. For example, (x1+xn)/2 is by definition a statistic; we could claim that it estimates the population mean of the variable X. However, this is probably not a good estimate.

• We would like our estimators to have certain desirable properties.

Review – Sample Statistic

16

Page 9: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

9

• Some simple properties for estimators:

-An estimator θ is unbiased estimator of θ if E[θ] = θ.

-An estimator is most efficient if the variance of the estimator is minimized.

-An estimator is BUE, or Best Unbiased Estimate, if it is the estimator with the smallest variance among all unbiased estimates.

- An estimator is consistent if as the sample size, n, increases to ∞, θn

converges to θ. We write θn → θ.

- An estimator is asymptotically normal if as the sample size, n, increases to ∞, θn, often standardized or transformed, converges in distribution

to a Normal distribution. We write θn → N(θ, Var(θn)).

Review – Sample Statistic

Definition: Suppose that X is a random variable. Let f(x) denote a function defined for -∞ < x < ∞ with the following properties:

1. f(x) ≥ 0

2. 1.

3.

• Then, f(x) is called the probability density function (pdf) of X. The RV X is called continuous. We use the pdf to describe the behavior of X.

• Analogous definition applies for a discrete RV, where the notation uses p(x) instead.

Review – PDF for a Continuous RV

18

Page 10: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

10

1.

.

Review – PDF for a Continuous RV

• The pdf is non-negative and integrates to

• Remark: We use the pdf to describe the behavior of the RV (discrete or continuous). 19

• A RV X is said to have a normal distribution with parameters (mean) and (variance) if X is a continuous RV with pdf f(x):

2

221

2

x

f x e

Review – Popular PDFs: Normal Distribution

Note: Described by two parameters: and . We write X ~ N(μ, σ2)

Page 11: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

11

• The normal distribution is often used to describe or approximate any variable that tends to cluster around the mean. It is the most assumed distribution in economics and finance: rates of return, growth rates, IQ scores, observational errors, etc.

• The central limit theorem (CLT) provides a justification for the normality assumption when the sample size, n, is large.

Notation: PDF: X ~ N(μ, σ2)

CDF: Φ(x)

Review – Popular PDFs: Normal Distribution

• Let the continuous RV X have density function):

Review – Popular PDFs: Gamma Distribution

1 0

0 0

xx e xf x

x

where > 0 and Γ(is the gamma function evaluated at .

Then, X is said to have a Gamma distribution with parameters and ., denoted as X ~ Gamma(, or Γ(,

It is a family of distributions, with special cases:

Exponential Distribution, or Exp(: = 1.

Chi-square Distribution, or χ : = /2 and = ½.

Page 12: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

12

• Below we plot the Chi-square distribution with parameter , which we refer as degrees of freedom:

Review – Popular PDFs: Gamma Distribution

0

0.1

0 .2

0 4 8 12 16

0

0.1

0 .2

0 4 8 12 16

( = 4)

( = 5)

( = 6)

Note: When is large, the χ converges to a N(, 2).

.

Review – CDF for a Continuous RV

• If X is a continuous random variable with probability density function, f(x), the cumulative distribution function (CDF) of X is given by:

• Note: The FTC (fundamental theorem of calculus) implies:

24

Page 13: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

13

Review – Histogram of a RV

Example: We use a histogram to estimate the distribution of a RV. Let X = Percentage changes in the CHF/USD exchange rate = ef

Data: Monthly from January 1971 to June 2020 (N=594 observations).

• Note: We overlay a Normal density (blue line) over the histogram.25

• The moments of a random variable X are used to describe the behavior of the RV (discrete or continuous).

Definition: Kth Moment

Let X be a RV (discrete or continuous), then the kth moment of X is:

if isdiscrete

if iscontinuous

• The first moment of X, = 1 = E(X) is the center of gravity of the distribution of X.

• The higher moments give different information regarding the shape of the distribution of X.

Review – Moments of Random Variables

26

Page 14: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

14

Definition: Central Moments

Let X be a RV (discrete or continuous). Then, the kth central moment of X is defined to be:

if isdiscrete

if iscontinuous

where = 1 = E(X) = the first moment of X.

• The central moments describe how the probability distribution is distributed about the center of gravity, .

Review – Moments of a RV

27

• The first central moments is given by:

The second central moment depends on the spread of the probability distribution of X about It is called the variance of X and is denoted by the symbol σ2 = var(X):

= var(X) =

The square root of var(X) is called the standard deviation of X and is denoted by the symbol = SD(X). We also refer to it as volatility:

=

Review – Moments of a RV

28

Page 15: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

15

• The third central moment:

contains information about the skewness of a distribution.

• A popular measure of skewness:

• Distribution according to skewness:1) Symmetric distribution

0

0 . 0 1

0 . 0 2

0 . 0 3

0 . 0 4

0 . 0 5

0 . 0 6

0 . 0 7

0 . 0 8

0 . 0 9

0 5 1 0 1 5 2 0 2 5 3 0 3 5

0, 0

Review – Moments of a RV: Skewness

29

0

0 . 0 1

0 . 0 2

0 . 0 3

0 . 0 4

0 . 0 5

0 . 0 6

0 . 0 7

0 . 0 8

0 5 1 0 1 5 2 0 2 5 3 0 3 5

0, 0

2) Positively (right-) skewed distribution (with mode < median < mean)

3) Negatively (left-) skewed distribution (with mode > median > mean)

0

0 . 0 1

0 . 0 2

0 . 0 3

0 . 0 4

0 . 0 5

0 . 0 6

0 . 0 7

0 . 0 8

0 5 1 0 1 5 2 0 2 5 3 0 3 5

0, 0

Review – Moments of a RV: Skewness

30

Page 16: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

16

• Skewness and Economics- Zero skew means symmetrical gains and losses. - Positive skew suggests many small losses and few rich returns. - Negative skew indicates lots of minor wins offset by rare major losses.

• In financial markets, stock returns at the firm level show positive skewness, but at the aggregate (index) level show negative skewness.

• From horse race betting and from U.S. state lotteries there is evidence supporting the contention that gamblers are not necessarily risk-lovers but skewness-lovers: Long shots are overbet (positive skewness loved!).

Review – Moments of a RV: Skewness

31

• The fourth central moment:

It contains information about the shape of a distribution. The property of shape that is measured by this moment is called kurtosis,

usually estimated by .

• The measure of (excess) kurtosis: γ2 = 3 = 3

• Distributions:

1) Mesokurtic distribution

0

0 . 0 1

0 . 0 2

0 . 0 3

0 . 0 4

0 . 0 5

0 . 0 6

0 . 0 7

0 . 0 8

0 . 0 9

0 5 1 0 1 5 2 0 2 5 3 0 3 5

0, moderateinsize

Review – Moments of a RV: Kurtosis

32

Page 17: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

17

0

0 2 0 4 0 6 0 8 0

0, smallinsize

2) Platykurtic distribution

0

0 2 0 4 0 6 0 8 0

0, largeinsize

3) Leptokurtic distribution (usual shape for asset returns)

Review – Moments of a RV: Kurtosis

33

• Note that moments are defined by expected values. We define the expected value of a function of a continuous RV X, g(X), as

• If X is discrete with probability function p(x)

Examples: g(x) = (x – μ)2 E[g(x)] = E[(x – μ)2] g(x) = (x – μ)k E[g(x)] = E[(x – μ)k]

• We estimate expected values with sample averages. The Law of Large Numbers (LLN) tells us they are consistent estimators of expected values.

Review – Moments and Expected Values

34

Page 18: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

18

• We estimate expected values with sample averages. For example, the first moment, the mean, and the second central moment, the variance, are estimated by:

= ∑

(N-1 adjustment needed for E σ )

• Besides consistent, they are both are unbiased estimators of their respective population moments (unbiased = “on average, I get the population parameter”). That is,

E μ “population parameter”E σ

Review – Estimating Moments

35

Theorem (Weak LLN)

Let X1, … , Xn be n mutually independent random variables each having mean and a finite -i.e, the sequence {Xn} is i.i.d.

Let ∑

.

Then for any > 0 (no matter how small)

• There are many variations of the LLN. It is a general result: A sampleaverage as the sample size goes to infinite tends to its expected value.

Also written as n → μ. (convergence in probability)

1 as P X P X n

• Long history: Gerolamo Cardano (1501-1576) stated it without proof.

Jacob Bernoulli published a rigorous proof in 1713.

Review – Law of Large Numbers (LLN)

30

Page 19: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

19

Review – Central Limit Theorem (CLT)

• Let X1, X2, …, Xn be a sequence of i.i.d. RVs with finite mean , and finite variance Then as n increases, n, the sample mean, approaches the normal distribution with mean μ and variance n.

This theorem is sometimes stated as

_

→ 0,1

where → means “the limiting distribution (asymptotic distribution) is” (or convergence in distribution).

• Many version of the CLT. This one is the Lindeberg-Lévy CLT.

• The CLT gives only an asymptotic distribution. We usually take it as an approximation for a finite number of observations. In these cases, the

notation goes from → to →. 31

• All statistics, T(X), are functions of RVs and, thus, they have a distribution. Depending on the sample, we can observe different values for T(X), thus, the finite sample distribution of T(X) is called the sampling distribution.

For the sample mean , if the Xi’s are normally distributed, then the sampling distribution is normal with mean μ and variance σ2/N. Or

~ N(μ, σ2/N).

Note: If the data is not normal, the CLT is used to approximate the sampling distribution by the asymptotic one, usually after some

manipulations. Again, in those cases, the notation goes from → to →.

• The SD of the sampling distribution is called the standard error (SE). Then, SE( ) = σ/sqrt(N).

Review – Sampling Distributions

38

Page 20: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

20

• For the sample variance σ2, if the Xi’s are normally distributed, then the sampling distribution is derived from this result:

(N – 1) s2/2 ~ 2N-1.

It can be shown that the 2N-1 has a variance equal to 2 times the

degrees of freedom (=2*(N – 1)), that is, Var[(N-1) s2/2 ] = 2 * (N – 1) Var[s2] = 2 * 4 /(N – 1)

Then, SE(s2) = SD(s2) = 2 * sqrt[2/(N – 1)].

Note: If the data is not normal (& N is large), the CLT can be used to approximate the sampling distribution by the asymptotic one:

s2 → N(2, 2*4/(N – 1))

Review – Sampling Distributions

39

• Sampling Distribution for the Sample Mean of a normal population:

X

Xf ( )

μ

~ N(, σ2/n)

Note: As n→∞, → μ –i.e., the distribution becomes a spike at μ!

Review – Sampling Distributions

33

Page 21: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

21

• 10,000 samples, for a N(2 4) population. Different sample sizes, n:

Note: As n→∞, → μ=2 –i.e., the distribution becomes a spike at μ!

n = 10 n = 60 n = 200

Review – Sampling Distributions

• We estimate sample averages for ef = % changes in the CHF/USD.

PPP_da <-read.csv("http://www.bauer.uh.edu/rsusmel/4397/ppp_2020_m.csv",head=TRUE,sep=",")x_chf <- PPP_da$CHF_USDT <- length(x_chf) ## Size of series read (T or N notation is OK)lr_chf <- log(x_chf[-1]/x_chf[-T]) ## Log returns

x <- lr_chf ## Series to be analyzedn <- length(x) ## Number of observationsm1 <- sum(x)/n ## Mean ( )m2 <- sum((x-m1)^2)/n ## Used in denominator of bothm3 <- sum((x-m1)^3)/n ## For numerator of Sm4 <- sum((x-m1)^4)/n ## For numerator of Kb1 <- m3/m2^(3/2) ## Sample Skewness (γ1)b2 <- (m4/m2^2) ## Sample Kurtosis (γ2)s2 <- sum((x-m1)^2)/(n-1) ## Sample Variance (s2)sd_s <- sqrt(s2) ## Sample SD (s)

Review – Estimating Moments in R

42

Page 22: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

22

R output:> m1[1] -0.002550636> s2[1] 0.001115257> sd_s[1] 0.03339546

> b1 ## Sample Skewness (γ1)[1] -0.06733514

> b2 ## Sample Kurtosis (γ2)[1] 4.621602

Review – Estimating Moments in R

43

Example: We estimate the moments of ef = % changes in the CHF/USD exchange rate (1971:Jan – 2020:Jun):

Statistic ef

Mean -0.002551Median -0.001431Maximum 0.145542 Minimum -0.145639 Std. Dev. 0.033395Skewness -0.067335Kurtosis 4.621602

• Small mean (0.25%), slight negative skewness, kurtosis greater than 3, pointing out to non-normality of data (“fatter tails”).

γ2 = 3 = 1.62

Review – Estimating Moments

44

Page 23: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

23

• Returns have better statistical properties than prices. Models tend to focus on returns.

• Gross Return, Rt:

where Pt = Stock price or Value of investment at time t

Dt = Dividend or payout of investment at time t

• Net or simple return, rt :

1 capital gain + dividend yield

Note: This is the return from time t-1 to time t. To be very explicit we can write this as rt-1,t.

Returns

45

• Log returns, rt (or continuously compounded returns):

rt = log(Rt ) = log(Pt + Dt) – log (Pt-1) ≈ Rt 1

Derivation:

Recall: ln(1) = 0,

. Now do a 1st-order Taylor expansion

around x0 to get

ln(x) ≈ ln(x0) + | ln(x0) +

Thus, expanding around x0 = 1, we have for x ≈ 1:

ln(x) ≈ 0 + 1 1

Set x = Rt to get result.

• When returns are small, say for daily or weekly data, the numerical differences between simple and compounded returns are very small.

Returns

46

Page 24: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

24

• For portfolios, the simple rate of return has an advantage: The rate of return on a portfolio is the portfolio of the rates of return.

• Let VP,t be the value of a portfolio at time t.

, = ∑ ,

where Ni is the investment in asset i, which has a value , at time t.

rP,t = , ,

,

∑ , ∑ ,

∑ ,= ∑ ,

where is the portfolio weight in asset i.

• This relationship does not hold for log returns because the log of a sum is not the sum of the logs.

Portfolio Returns

47

• Multi-period holding return

To simplify notation, include dividends into prices. That is,

Pd ,t +1 = Pt +1 + Dt

The two-period holding return is:

,,

,1 ,

,

,

,1= 1 , 1 , 1

Or

1 , = 1 , 1 ,

For small returns, we can use the log approximation:

, ≈ , ,

The k-period gross holding return 1 , = ∏ 1 ,

Or with the log approximation: , = ∑ ,

Multi-period Returns

48

Page 25: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

25

• If the (expected) returns are equal, that is, , . Then, the log approximation produces:

, = ∑ , ∗

• If the returns, , , are independent (covariance is 0) and with a constant variance equal to σ (a constant), then under the log approximation

Var( , ∑ , ∗ σ

Then, the SD is equal to

SD( , ∗ σ ∗ σ

Multi-period Returns

49

• We will deflate values by a Price Index, for example the CPI. Then,

Real Pricet

Then, the real return becomes:

1 1 1 1

where π is the inflation rate at time t.

The log approximation (for small returns) produces

π

Real Returns

50

Page 26: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

26

Example: Long-run S&P 500 monthly data, from Robert Shiller’s website (1871:Jan -2020: Mar).

Nominal & Real Prices

51

Example (continuation): Long-run S&P 500 monthly log returns, from Robert Shiller’s website (1871:Jan -2020: Mar; N = 1790).

Nominal & Real Returns

52

Page 27: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

27

Example (continuation): Prices have a clear trend, returns do not. In statistics, we prefer to work with data with no trends, like returns; they have better properties, for example, a well defined long-run mean (expected value).

Nominal & Real Returns

53

Example (continuation):

Nominal & Real Returns

mean changes with time(& variance too) non-stationary data

mean seems constant over time(& variance too) stationary data

54

Page 28: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

28

Example: Distribution of S&P 500 monthly log returns

Returns: Distribution

55

fat tail

Example (continuation): S&P 500 monthly log returns

Returns: Distribution

56

Page 29: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

29

Example (continuation): Inflation (in log changes). Smaller range than returns.

Inflation: Distribution

57

fat tail

Example (continuation): Univariate stats for returns and inflation

Return Inflation Real ReturnMean 0.003571 0.001693 0.001878 Median 0.006489 0.001399 0.005546Maximum 0.407459 0.067362 0.414844Minimum -0.307528 -0.067216 -0.307524Std. Dev. 0.040699 0.010364 0.040936Skewness -0.507281 -0.142742 -0.399958Kurtosis 14.38916 9.689002 14.13552Jarque-Bera 9751.2 3343.1 9296P-value (JB) 2.2e-16 2.2e-16 2.2e-16

• Monthly returns are slightly (left-) skewed (& median > mean), and with “fat tails”–i.e., kurtosis is higher than 3.

Returns: Sample Moments

58

Page 30: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

30

Example (continuation): Univariate stats for returns and inflation.

• Check some results from log approximation:

(1) π (for small % changes)0.001878 0.003571 0.001693 = 0.001878

(2) Change frequencies from monthly to annual,

- annual return = 0.003571 * 12 = 0.043852 (4.39%)

- annual SD = 0.040699 * sqrt(12) = 0.1409855 (14.10%)

Note: compounding the return: (1 + 0.003571)^12 = 0.043704. ¶

Returns: Sample Moments

59

• Assuming independence of returns and constant moments, we can use the log returns to easily change frequencies for the mean and variance of returns.

Suppose we have n-monthly compounded data and we are interested in the q-monthly compounded data. The approximation formulas for mean and standard deviation (SD) are:• q-mo mean = n-mo mean * q/n• q-mo SD = n-mo SD * √(q/n) q-mo Variance = sqrt(q-mo SD)

Example: Using the data from the previous table we want to calculate the weekly mean and standard deviation for returns (n=30, q=7).

- weekly return = 0.003571 * (7/30) = 0.0008332 (0.083%)

- weekly SD = 0.040699 * sqrt(7/30) = 0.019659 (1.97%)

Note: de-compounding the return: (1 + 0.003571)^(7/30) = 0.000832.

Returns: Sample Moments - Changing frequency

60

Page 31: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

31

• Recall that the sampling distribution of the sample mean is:~ N(, σ2/n)

Example: We estimate the monthly mean return as 0.003571, while the estimate of the variance of the monthly mean is

Estimated var( ) = 0.040699/1790 = 2.273687e-05.

Taking the square root we get the SD of the monthly mean (also called the SE):

S.E.( ) = sqrt(2.273687e-05) = .00476 (or 0.476%).

Compared to returns, the sample mean is more precisely estimated (0.476% vs 4.07%). Not surprised, the sampling distribution of the mean shrinks towards the sample mean as N increases.

Returns: Sampling Distribution

61

• Consider an n−period discount bond. Time is measured in years. Today is t . Bond (asset) pays Ft +n dollars n years from now, at t + n.

Ft +n = Face value.

Pt = Market price of the bond.

rn,t = Yield to maturity (YTM)

n = Maturity of bond

Pt,

Interpretation: Pt dollars invested at interest rate rn,t for n years compounded annually pays off Ft +n.

YTM, rn,t , is a raw number. 4% at an annual rate is 0.04

Yields

62

Page 32: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

32

• n is years, m is number of times return (yield) is compounded, for example, m = 4 for quarterly, m = 12 for monthly, m = ∞ for continuous compounding. Then,

Pt

As m → ∞, Pt 1 → Pt

where we used lim→

1 =

• The effective annual interest rate, ra, is simple the annual rate of return:

1 = 1 1 1

Continuous compounding

63

Example: We plot long-run S&P 500 (blue) and Bond (red) return data, from Robert Shiller’s website.

Bond Yields & Stock Returns

64

Page 33: Lecture 2-A Introduction: Review, Returns and Data · function defined for -∞< x < ∞with the following properties: 1. f(x) ≥ 0 2. ì B T @ T ¶ ? ¶ 1. 3. 2 = Q : Q >

RS - Returns and Data

33

Bond Yields: Distribution

Example (continuation): Bond return data, from Robert Shiller’s website:

65