Quantitative Methods for Finance - Lecture 1


Serafeim Tsoukas

Chapter 1: Descriptive statistics

• Descriptive statistics summarises a mass of information.

• We may use graphical and/or numerical methods.

• Examples of the former are the bar chart and XY chart; examples of the latter are averages and standard deviations.

Graphical techniques

• Education and employment data

Table 1.1 Economic status and educational qualifications, 2009 (numbers in 000s)

             Higher      A levels   Other           No              Total
             education              qualification   qualification
In work       9,713       5,479     10,173          1,965           27,330
Unemployed      394         432      1,166            382            2,374
Inactive      1,256       1,440      3,277          2,112            8,084
Total        11,362       7,352     14,615          4,458           37,788

Source: Adapted from Department for Children, Schools and Families, Education and Training Statistics for the UK 2009, http://www.education.gov.uk/rsgateway/DB/VOL/v000891/, contains public sector information licensed under the Open Government Licence (OGL) v1.0. http://www.nationalarchives.gov.uk/doc/open-government-licence/open-government

The bar chart

Figure 1.1 Educational qualifications of people in work in the UK, 2009

[Bar chart. Vertical axis: Number of people (000s), 0 to 12,000. Horizontal axis: Higher education, Advanced level, Other qualifications, No qualifications. The first bar is labelled 9,713.]

Note: The height of each bar is determined by the associated frequency. The first bar is 9,713 units high, the second is 5,479 and so on. The ordering of the bars could be reversed (‘no qualifications’ becoming the first category) without altering the message.

A multiple bar chart

Figure 1.2 Educational qualifications by employment category

[Multiple bar chart. Vertical axis: Number of people (000s), 0 to 12,000. Horizontal axis: Higher education, Advanced level, Other qualifications, No qualifications. Series: In work, Unemployed, Inactive.]

The stacked bar chart

Figure 1.3 Stacked bar chart of educational qualifications and employment status

[Stacked bar chart. Vertical axis: Number of people (000s), 0 to 16,000. Horizontal axis: Higher education, Advanced level, Other qualifications, No qualifications. Stacked series: In work, Unemployed, Inactive.]

A stacked bar chart (percentages)

Figure 1.4 Percentages in each employment category, by educational qualification

[Stacked bar chart in percentages. Vertical axis: 0% to 100%. Horizontal axis: Higher education, Advanced level, Other qualifications, No qualifications. Stacked series: In work, Unemployed, Inactive.]
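The percentages behind a chart like Figure 1.4 can be recomputed from the counts in Table 1.1. The following is a minimal sketch, not the method used to draw the original figure; the variable names are illustrative, and the column totals are recomputed from the cells, so they can differ by one from the rounded totals printed in the table.

```python
# Counts (in thousands) from Table 1.1: rows are employment status,
# columns are qualification levels.
qualifications = ["Higher education", "A levels",
                  "Other qualification", "No qualification"]
counts = {
    "In work":    [9713, 5479, 10173, 1965],
    "Unemployed": [394, 432, 1166, 382],
    "Inactive":   [1256, 1440, 3277, 2112],
}

# Column totals, recomputed from the cells rather than copied from the table.
column_totals = [sum(row[j] for row in counts.values())
                 for j in range(len(qualifications))]

# Share of each employment status within each qualification group (percent).
percentages = {
    status: [100 * value / total for value, total in zip(row, column_totals)]
    for status, row in counts.items()
}

for j, name in enumerate(qualifications):
    shares = ", ".join(f"{status}: {percentages[status][j]:.1f}%"
                       for status in counts)
    print(f"{name}: {shares}")
```

Within each qualification group the three shares add up to 100%, which is what makes the stacked bars the same height.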

The pie chart

Figure 1.5 Educational qualifications of those in work

Data on wealth in the UK

Table 1.3 The distribution of wealth, UK, 2005

Class interval (£)        Numbers (thousands)
0–9,999                    1,668
10,000–24,999              1,318
25,000–39,999              1,174
40,000–49,999                662
50,000–59,999                627
60,000–79,999              1,095
80,000–99,999              1,195
100,000–149,999            3,267
150,000–199,999            2,392
200,000–299,999            2,885
300,000–499,999            1,480
500,000–999,999              628
1,000,000–1,999,999          198
2,000,000 or more             88
Total                     18,677

Source: Adapted from HM Revenue and Customs Statistics, 2005, contains public sector information licensed under the Open Government Licence (OGL) v1.0. http://www.nationalarchives.gov.uk/doc/open-government-licence/open-government

A (misleading!) bar chart

Figure 1.7 Bar chart of the distribution of wealth in the UK, 2005

[Bar chart. Vertical axis: Number of individuals, 0 to 3,500. Horizontal axis: Income class (lower boundary), from 0 to 2,000,000.]

The histogram – the correct picture

Figure 1.9 Histogram of the distribution of wealth in the UK, 2005

Histogram versus bar chart

• The bar chart gives the wrong picture because of varying class widths.

• The two classes below have similar frequencies (and so similar heights in the bar chart), but the second is over 13 times wider (200,000 against 15,000). Adjusting for width, its frequency should be 111 (1,480 × 15/200).

Class interval (£)     Numbers (thousands)
10,000–24,999          1,318
:                      :
300,000–499,999        1,480

The frequency density

• Applying this principle leads to calculation of the frequency densities:

Range (£)    Number, or frequency    Class width    Frequency density
0–           1,668                   10,000         0.1668
10,000–      1,318                   15,000         0.0879
25,000–      1,174                   15,000         0.0783
40,000–        662                   10,000         0.0662
50,000–        627                   10,000         0.0627
60,000–      1,095                   20,000         0.0548
80,000–      1,195                   20,000         0.0598
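The frequency density is the frequency divided by the class width. As a minimal sketch (the function and variable names are illustrative), the seven rows of the table can be reproduced as follows:

```python
# (lower bound, upper bound, frequency in thousands) for the first seven
# wealth classes shown in the frequency density table.
classes = [
    (0, 10_000, 1668),
    (10_000, 25_000, 1318),
    (25_000, 40_000, 1174),
    (40_000, 50_000, 662),
    (50_000, 60_000, 627),
    (60_000, 80_000, 1095),
    (80_000, 100_000, 1195),
]


def frequency_density(lower, upper, frequency):
    """Frequency per unit of class width."""
    return frequency / (upper - lower)


densities = [frequency_density(*c) for c in classes]

for (lower, upper, f), d in zip(classes, densities):
    print(f"{lower:>7,}-  f={f:>5,}  width={upper - lower:>6,}  density={d:.4f}")
```

Plotting these densities as bar heights, with bar widths equal to the class widths, gives the histogram: the area of each bar is then proportional to its frequency.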

Numerical techniques

• We examine the measures of

– Location

– Dispersion

– Skewness.

Measures of location

• Mean – strictly the arithmetic mean, the well-known ‘average’

• Median – the wealth of the person in the middle of the distribution

• Mode – the level of wealth that occurs most often

• These different measures can give different answers.

The mean of the wealth distribution

$$ \mu = \frac{\sum f x}{\sum f} = \frac{3{,}490{,}260}{18{,}677} = 186.875 $$

Mean wealth is £186,875

Range (£)      Mid-point x (£000)    f         fx
0–                5.0                1,668         8,340
10,000–          17.5                1,318        23,065
25,000–          32.5                1,174        38,155
40,000–          45.0                  662        29,790
50,000–          55.0                  627        34,485
60,000–          70.0                1,095        76,650
80,000–          90.0                1,195       107,550
100,000–        125.0                3,267       408,375
150,000–        175.0                2,392       418,600
200,000–        250.0                2,885       721,250
300,000–        400.0                1,480       592,000
500,000–        750.0                  628       471,000
1,000,000–    1,500.0                  198       297,000
2,000,000–    3,000.0                   88       264,000
Total                               18,677     3,490,260
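The grouped-data mean weights each class mid-point by its frequency. A minimal sketch of the calculation, using the mid-points (£000) and frequencies (thousands) from the table; the variable names are illustrative:

```python
# (class mid-point in £000, frequency in thousands), taken from the table.
# The 3,000 mid-point for the open-ended top class is the value used in
# the slides, not something derived here.
wealth = [
    (5.0, 1668), (17.5, 1318), (32.5, 1174), (45.0, 662), (55.0, 627),
    (70.0, 1095), (90.0, 1195), (125.0, 3267), (175.0, 2392),
    (250.0, 2885), (400.0, 1480), (750.0, 628), (1500.0, 198),
    (3000.0, 88),
]

total_f = sum(f for _, f in wealth)        # sum of f
total_fx = sum(f * x for x, f in wealth)   # sum of f * x
mean = total_fx / total_f                  # in £000

print(f"sum f = {total_f:,}, sum fx = {total_fx:,.0f}, mean = {mean:.3f}")
```

The result is in £000, so a mean of about 186.875 corresponds to £186,875.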

Locating the mean

The median

• The wealth of the ‘middle person’, i.e. the one located halfway through the distribution.

• The median is little affected by outliers, unlike the mean.

[Diagram: individuals ranked from poorest to richest, with the middle person marked: this person’s wealth is the median.]

Simple example

• Values: 45, 12, 33, 80, 77

• What is the median value?

• Put the values in order: 12, 33, 45, 77, 80

• The median of the five values is the middle (third) one: 45.

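Calculating the median – wealth

Range (£)    Frequency    Cumulative frequency
0–           1,668         1,668
10,000–      1,318         2,986
25,000–      1,174         4,160
40,000–        662         4,822
50,000–        627         5,449
60,000–      1,095         6,544
80,000–      1,195         7,739
100,000–     3,267        11,006

• 18,677 (thousand) observations, hence the person ranked 9,338.5 has the median wealth.

• This person is somewhere in the £100–150k interval: 7,739 (thousand) have wealth less than £100k and 11,006 (thousand) have wealth less than £150k.

Calculating the median (Continued)

• To find the precise median value, use

$$ \text{median} = x_L + (x_U - x_L) \times \frac{N/2 - F}{f} $$

where x_L and x_U are the lower and upper limits of the interval containing the median, N is the number of observations, F is the cumulative frequency up to x_L and f is the frequency of that interval. Here:

$$ 100{,}000 + (150{,}000 - 100{,}000) \times \frac{18{,}677{,}000/2 - 7{,}739{,}000}{3{,}267{,}000} = 124{,}480 $$

• Median wealth is £124,480.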
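The interpolation step for the median can be written as a small function. This is a sketch under the usual assumption that observations are spread evenly within the median class; the function name and parameter names are illustrative.

```python
def interpolated_median(x_lower, x_upper, n, cum_below, freq):
    """Median of grouped data by linear interpolation.

    x_lower, x_upper: limits of the class containing the median
    n:                total number of observations
    cum_below:        cumulative frequency up to x_lower (F)
    freq:             frequency of the median class (f)
    """
    return x_lower + (x_upper - x_lower) * (n / 2 - cum_below) / freq


# Figures from the wealth data: the median person (rank 9,338.5 thousand)
# lies in the £100,000-150,000 class.
median_wealth = interpolated_median(
    x_lower=100_000,
    x_upper=150_000,
    n=18_677_000,
    cum_below=7_739_000,
    freq=3_267_000,
)
print(f"Median wealth: £{median_wealth:,.0f}")
```

The median person is about 49% of the way through the class, so the median lies about 49% of the way between £100,000 and £150,000.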

The mode

• The mode is the observation with the highest frequency

Size     Sales
 8        7
10       25
12       36
14       11
16        3
18        1

Modal dress size = 12

The mode – wealth data

Range (£)    Number, or frequency    Class width    Frequency density
0–           1,668                   10,000         0.1668
10,000–      1,318                   15,000         0.0879
25,000–      1,174                   15,000         0.0783
40,000–        662                   10,000         0.0662
50,000–        627                   10,000         0.0627
:            :                       :              :

• For grouped data, the mode corresponds to the interval with greatest frequency density.

Modal class: the first interval. Mode = £0–10,000
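A short sketch of both cases follows: the mode of ungrouped data (the dress sizes) and the modal class of grouped data (the wealth classes). The upper limit of £4,000,000 for the open-ended top class is an assumption, chosen to match the 3,000 (£000) mid-point used elsewhere in the slides; it does not affect the result.

```python
# Ungrouped data: dress size -> number sold.
dress_sales = {8: 7, 10: 25, 12: 36, 14: 11, 16: 3, 18: 1}
modal_size = max(dress_sales, key=dress_sales.get)

# Grouped data: (lower, upper, frequency in thousands).
# The 4,000,000 upper limit of the last class is an assumption.
classes = [
    (0, 10_000, 1668), (10_000, 25_000, 1318), (25_000, 40_000, 1174),
    (40_000, 50_000, 662), (50_000, 60_000, 627), (60_000, 80_000, 1095),
    (80_000, 100_000, 1195), (100_000, 150_000, 3267),
    (150_000, 200_000, 2392), (200_000, 300_000, 2885),
    (300_000, 500_000, 1480), (500_000, 1_000_000, 628),
    (1_000_000, 2_000_000, 198), (2_000_000, 4_000_000, 88),
]


def density(cls):
    lower, upper, f = cls
    return f / (upper - lower)


modal_class = max(classes, key=density)              # greatest density
largest_class = max(classes, key=lambda c: c[2])     # greatest raw frequency

print("Modal dress size:", modal_size)
print("Modal wealth class:", modal_class[:2])
print("Class with most people:", largest_class[:2])
```

The class with the most people (£100,000–150,000) is not the modal class, because it is five times wider than the first class; this is why the comparison uses frequency density rather than raw frequency.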

Differences between mean, median and mode

Figure 1.12 The histogram with the mean, median and mode marked

[Histogram. Horizontal axis: Wealth (£000), 0 to 200, with class widths squeezed. The mode, median and mean are marked, in that order from left to right.]

Measures of dispersion

• The range – the difference between smallest and largest observation. Not very informative for wealth.

• Inter-quartile range – contains the middle half of the observations.

• Variance – based on all observations in the sample.

Inter-quartile range

• First quartile – one-quarter of the way through the distribution, person ranked 4,669.25, hence

$$ Q_1 = 40{,}000 + (50{,}000 - 40{,}000) \times \frac{4{,}669.25 - 4{,}160}{662} = 47{,}692.6 $$

• Third quartile – three-quarters of the way through the distribution, person ranked 14,007.75, hence Q3 = 221,135.1

• IQR = Q3 – Q1 = 221,135 – 47,693 = 173,442.
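The quartiles use the same interpolation as the median, with N/4 and 3N/4 in place of N/2. The sketch below generalises this to any fraction p; the function name is illustrative, and the £4,000,000 upper limit of the open-ended class is an assumption that does not affect the quartiles. Results may differ from the slide values in the last digit because of rounding.

```python
# (lower, upper, frequency in thousands); the last upper limit is assumed.
classes = [
    (0, 10_000, 1668), (10_000, 25_000, 1318), (25_000, 40_000, 1174),
    (40_000, 50_000, 662), (50_000, 60_000, 627), (60_000, 80_000, 1095),
    (80_000, 100_000, 1195), (100_000, 150_000, 3267),
    (150_000, 200_000, 2392), (200_000, 300_000, 2885),
    (300_000, 500_000, 1480), (500_000, 1_000_000, 628),
    (1_000_000, 2_000_000, 198), (2_000_000, 4_000_000, 88),
]


def grouped_quantile(classes, p):
    """Value a fraction p of the way through grouped data (0 < p < 1)."""
    n = sum(f for _, _, f in classes)
    target = p * n          # rank of the observation we are looking for
    cumulative = 0          # observations below the current class
    for lower, upper, f in classes:
        if cumulative + f >= target:
            return lower + (upper - lower) * (target - cumulative) / f
        cumulative += f
    raise ValueError("p must lie between 0 and 1")


q1 = grouped_quantile(classes, 0.25)
q3 = grouped_quantile(classes, 0.75)
iqr = q3 - q1
print(f"Q1 = {q1:,.1f}, Q3 = {q3:,.1f}, IQR = {iqr:,.1f}")
```

Calling `grouped_quantile(classes, 0.5)` gives the median by the same route.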

Box and whiskers plot

[Box and whiskers plot. Axis: Wealth (£000), 100 to 400. Labelled features: first quartile, median, third quartile, IQR and an outlier.]

The variance

• The variance is the average of all squared deviations from the mean:

$$ \sigma^2 = \frac{\sum f (x - \mu)^2}{\sum f} $$

• The larger this value, the greater the dispersion of the observations.

[Diagram: two distributions, one with small variance and one with large variance.]

The variance (Continued)

Range (£)      Mid-point x (£000)    Frequency, f    Deviation (x – μ)    (x – μ)²       f(x – μ)²
0–                5.0                1,668            –181.9              33,078.4        55,174,821.9
10,000–          17.5                1,318            –169.4              28,687.8        37,810,535.3
25,000–          32.5                1,174            –154.4              23,831.6        27,978,261.2
40,000–          45.0                  662            –141.9              20,128.4        13,325,033.3
50,000–          55.0                  627            –131.9              17,391.0        10,904,128.1
:                 :                    :                :                  :               :
1,000,000–    1,500.0                  198           1,313.1           1,724,297.9       341,410,980.4
2,000,000–    3,000.0                   88           2,813.1           7,913,673.6       696,403,275.4
Totals                              18,677                                             1,499,890,455.1

Calculation of the variance

$$ \sigma^2 = \frac{\sum f (x - \mu)^2}{\sum f} = \frac{1{,}499{,}890{,}455.1}{18{,}677} = 80{,}306.8 $$

The standard deviation

• The variance is measured in ‘squared £s’ (because we used squared deviations).

• Hence take the square root to get back to £s. This gives the standard deviation:

$$ \sigma = \sqrt{80{,}306.8} = 283.385 $$

or £283,385.
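The variance and standard deviation of the grouped wealth data can be recomputed in a few lines. This is a minimal sketch using the class mid-points (£000) and frequencies (thousands) from the slides; the variable names are illustrative.

```python
import math

# (class mid-point in £000, frequency in thousands), as used in the slides.
wealth = [
    (5.0, 1668), (17.5, 1318), (32.5, 1174), (45.0, 662), (55.0, 627),
    (70.0, 1095), (90.0, 1195), (125.0, 3267), (175.0, 2392),
    (250.0, 2885), (400.0, 1480), (750.0, 628), (1500.0, 198),
    (3000.0, 88),
]

n = sum(f for _, f in wealth)
mean = sum(f * x for x, f in wealth) / n

# Sum of frequency-weighted squared deviations from the mean.
sum_sq_dev = sum(f * (x - mean) ** 2 for x, f in wealth)

variance = sum_sq_dev / n        # in (£000) squared
std_dev = math.sqrt(variance)    # back in £000

print(f"variance = {variance:,.1f}, standard deviation = {std_dev:.3f}")
```

Because x is measured in £000, a standard deviation of about 283.385 corresponds to £283,385.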

Sample measures

• For sample data, use

$$ s^2 = \frac{\sum f (x - \bar{x})^2}{n - 1} $$

to calculate the sample variance.

• This gives an unbiased estimate of the population variance.

• Take the square root of this for the sample standard deviation.
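To illustrate the difference between dividing by n and by n − 1, the sketch below treats the five values from the earlier median example (45, 12, 33, 80, 77) as a sample, each with frequency 1. Treating them as a sample is an assumption made purely for illustration. The result is cross-checked against Python's standard `statistics` module.

```python
import statistics

values = [45, 12, 33, 80, 77]   # each value has frequency 1
n = len(values)
mean = sum(values) / n

squared_deviations = sum((x - mean) ** 2 for x in values)

population_variance = squared_deviations / n        # divide by n
sample_variance = squared_deviations / (n - 1)      # divide by n - 1
sample_sd = sample_variance ** 0.5

print(f"population variance = {population_variance:.2f}")
print(f"sample variance     = {sample_variance:.2f}")
print(f"sample std dev      = {sample_sd:.2f}")
print(f"statistics.variance = {statistics.variance(values):.2f}")
```

The sample variance is always the larger of the two, and the gap shrinks as n grows.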

Measuring skewness

$$ \text{Coefficient of skewness} = \frac{\sum f (x - \mu)^3}{N \sigma^3} $$

Right skewed: CS > 0

Left skewed: CS < 0

Skew of the wealth distribution

$$ \frac{\sum f (x - \mu)^3}{N \sigma^3} = \frac{2{,}506{,}882{,}551{,}023}{18{,}677 \times 22{,}757{,}714} = 5.898 $$

Range (£)      x (£000)    f         x – μ      (x – μ)³          f(x – μ)³
0–                5.0      1,668     –181.9         –6,016,132        –10,034,907,815
10,000–          17.5      1,318     –169.4         –4,858,991         –6,404,150,553
:                 :         :          :                 :                  :
500,000–        750.0        628      563.1        178,572,660        112,143,630,236
1,000,000–    1,500.0        198    1,313.1      2,264,219,059        448,315,373,613
2,000,000–    3,000.0         88    2,813.1     22,262,154,853      1,959,069,627,104
Total                     18,677    3,898.8     24,692,431,323      2,506,882,551,023
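The coefficient of skewness can be recomputed from the same mid-points and frequencies. This is a sketch of the formula as reconstructed from the slide calculation (sum of f times cubed deviations, divided by N times the cube of the standard deviation); the variable names are illustrative.

```python
import math

# (class mid-point in £000, frequency in thousands), as used in the slides.
wealth = [
    (5.0, 1668), (17.5, 1318), (32.5, 1174), (45.0, 662), (55.0, 627),
    (70.0, 1095), (90.0, 1195), (125.0, 3267), (175.0, 2392),
    (250.0, 2885), (400.0, 1480), (750.0, 628), (1500.0, 198),
    (3000.0, 88),
]

n = sum(f for _, f in wealth)
mean = sum(f * x for x, f in wealth) / n
sigma = math.sqrt(sum(f * (x - mean) ** 2 for x, f in wealth) / n)

# Cubing keeps the sign, so large positive deviations dominate when the
# distribution has a long right-hand tail.
sum_cubed_dev = sum(f * (x - mean) ** 3 for x, f in wealth)
skewness = sum_cubed_dev / (n * sigma ** 3)

print(f"coefficient of skewness = {skewness:.3f}")
```

A positive value of this size indicates a strongly right-skewed distribution, consistent with the mean lying well above the median.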

Summary

• We can use graphical and numerical measures to summarise data.

• The aim is to simplify without distorting the message.

• Measures of location, dispersion and skewness provide a good description of the data.

Descriptive statistics: Time series data

• Slightly different techniques are used for time series data, i.e. data on one or more variables over time.

• We look at investment data in the UK by way of example.

Investment data: 1977–2009

Not very informative – we need a graph

Year   Investment     Year   Investment     Year   Investment
1977       28,351     1988       97,956     1999      161,722
1978       32,387     1989      113,478     2000      167,172
1979       38,548     1990      117,027     2001      171,782
1980       43,612     1991      107,838     2002      180,551
1981       43,746     1992      103,913     2003      186,700
1982       47,935     1993      103,997     2004      200,415
1983       52,099     1994      111,623     2005      209,758
1984       59,278     1995      121,364     2006      227,234
1985       65,181     1996      130,346     2007      249,517
1986       69,581     1997      138,307     2008      240,361
1987       80,344     1998      155,997     2009      204,270

Source: Data adapted from the Office for National Statistics licensed under the Open Government Licence v.1.0. http://www.nationalarchives.gov.uk/doc/open-government-licence/open-government








Time series chart of investment

Figure 1.16 Time–series graph of investment in the UK, 1977–2009

[Line chart. Vertical axis: Investment, 0 to 300,000. Horizontal axis: year, 1977 to 2009.]

Chart of the change in investment

Figure 1.17 Time−series graph of the change in investment

[Line chart. Vertical axis: Change in investment, −40,000 to 30,000. Horizontal axis: year, 1977 to 2009.]

The logarithm of investment

Figure 1.18 Time–series graph of the logarithm of investment expenditures

[Line chart. Vertical axis: Log investment, 8.0 to 13.0. Horizontal axis: year, 1977 to 2009.]

Graphing several series

Figure 1.20 A multiple time–series graph of investment

[Line chart with several series. Vertical axis: Investment expenditures, 0 to 100,000. Horizontal axis: year, 1977 to 2009. Series: Dwellings, Transport, Machinery, Intangible fixed assets, Other buildings.]

An area graph of the same data

Figure 1.22 Area graph of investment categories, 1977–2009

These are Excel’s default colours!

Using separate axes – investment and the interest rate

Figure 1.21 Time–series graph using two vertical scales: investment (LH scale) and the interest rate (RH scale), 1985–2005

[Line chart with two vertical axes. Left-hand axis: Investment (£m), 0 to 250,000. Right-hand axis: Interest rate (%), 0 to 16. Horizontal axis: year, 1985 to 2005. Series: Interest rates, Investment.]

Numerical summary measures

• It makes more sense to calculate the average growth rate of investment, rather than the level.

• The growth rate is similar each year, whereas the level continuously increases.

The growth rate of investment

• Calculate the growth factor over the whole time period:

$$ \frac{x_T}{x_1} = \frac{204{,}270}{28{,}351} = 7.2050 $$

• Take the T−1 root (here the 32nd root, as there are 33 annual observations):

$$ \sqrt[32]{7.2050} = 1.0637 $$

• Subtract 1: 1.0637 − 1 = 0.0637

• The average growth rate is 6.4% p.a.
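The three steps above (growth factor, T−1 root, subtract 1) amount to a geometric mean of the annual growth factors. A minimal sketch using the 1977 and 2009 investment figures; the variable names are illustrative:

```python
first_value = 28_351     # investment in 1977
last_value = 204_270     # investment in 2009
periods = 2009 - 1977    # T - 1 = 32 annual changes

growth_factor = last_value / first_value          # over the whole period
average_factor = growth_factor ** (1 / periods)   # the (T - 1)th root
average_rate = average_factor - 1

print(f"growth factor  = {growth_factor:.4f}")
print(f"average factor = {average_factor:.4f}")
print(f"average growth = {average_rate:.1%} p.a.")
```

Only the first and last observations enter this calculation, so the result says nothing about how steady the growth was in between.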

An approximate alternative

• The average growth rate can also be calculated as the arithmetic mean of the annual growth rates:

$$ \frac{1.142 + 1.190 + \dots + 0.963 + 0.850}{32} = 1.0664 $$

i.e. 6.6%.

• This gives approximately the right answer, as long as the growth rate is not too big.

Variance of the growth rate

• The stability of growth can be measured by calculating the variance of the growth rate.

$$ s^2 = \frac{\sum x^2 - n\bar{x}^2}{n - 1} = \frac{0.323 - 32 \times 0.066^2}{31} = 0.0059 $$

$$ s = \sqrt{0.0059} = 0.0766 $$

Year      Investment    Growth rate, x    x²
1978       32,387        0.142            0.020
1979       38,548        0.190            0.036
1980       43,612        0.131            0.017
:           :             :                :
2006      227,234        0.083            0.007
2007      249,517        0.098            0.010
2008      240,361       –0.037            0.001
2009      204,270       –0.150            0.023
Totals                   2.1253           0.3230
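The annual growth rates, their arithmetic mean and their sample variance can be recomputed from the investment series given earlier. This is a sketch that uses Python's standard `statistics` module, whose `variance` and `stdev` functions divide by n − 1; the variable names are illustrative, and small differences from the slide figures are due to rounding.

```python
import statistics

# Annual investment, 1977 to 2009, from the investment data table.
investment = [
    28351, 32387, 38548, 43612, 43746, 47935, 52099, 59278, 65181,
    69581, 80344, 97956, 113478, 117027, 107838, 103913, 103997,
    111623, 121364, 130346, 138307, 155997, 161722, 167172, 171782,
    180551, 186700, 200415, 209758, 227234, 249517, 240361, 204270,
]

# Growth rate x for each year from 1978: this year's value over last
# year's, minus 1.
growth = [current / previous - 1
          for previous, current in zip(investment, investment[1:])]

mean_growth = statistics.mean(growth)
variance_growth = statistics.variance(growth)   # sample variance, n - 1
sd_growth = statistics.stdev(growth)

print(f"number of growth rates = {len(growth)}")
print(f"sum of x               = {sum(growth):.4f}")
print(f"mean growth rate       = {mean_growth:.4f}")
print(f"variance               = {variance_growth:.4f}")
print(f"standard deviation     = {sd_growth:.4f}")
```

The arithmetic mean (about 6.6%) is slightly above the geometric figure of 6.4%, which is the approximation error the slides refer to.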

Bivariate data

• We examine the relationship between investment and GDP.

Figure 1.24 Scatter diagram of investment (vertical axis) against GDP (horizontal axis) (nominal values)

[Scatter diagram. Vertical axis: Investment (£m), 0 to 300,000. Horizontal axis: GDP (£m), 0 to 1,200,000.]

• High values of investment seem to be associated with high values of GDP: there is a close relationship.

• As both variables are growing over time, later observations are at the top right of the graph, but this does not have to be so.

• Both variables are influenced by inflation, so it might be better to graph the real series, after adjusting for inflation.

Bivariate data (Continued)

Figure 1.25 The relationship between real investment and real output

[Scatter diagram titled "Real investment versus real GDP". Vertical axis: Real investment, 50,000 to 300,000. Horizontal axis: Real GDP, 500,000 to 1,500,000.]

Summary

• Slightly different graphical and numerical techniques are used for time–series data.

• A variety of time–series charts are available, both for single and multiple series.

• The mean and variance are both useful descriptive devices, but it makes more sense to apply them to the growth rate, rather than the level, of a trended variable.

• Data transformations can be useful, e.g. taking logs or differences and deflating to real terms.
