Mortality forecasting methods:
The Lee-Carter model vs. the Brass model. Estimation on Norwegian data.
Xuan Ngoc Thi Tran
Master of Philosophy in Economics Department of Economics
University of Oslo
May 2019
Acknowledgements
First and foremost, I would like to express my deepest gratitude to my supervisor, Nico Keilman.
Your generous guidance, insightful feedback and endless enthusiasm really paved the way for
this thesis.
A very special dedication to my parents, Suong Van Tran and Gai Thi Truong, whose long
journey, hard work and resilience made this journey possible.
I would also like to thank all my friends at SV who have been there since the very beginning, and
especially Sigri, who helped me to the finish line.
And finally, to CBB. Without you I would not have thrived in the rush.
Any remaining errors or shortcomings are entirely my own.
Abstract
The aim of this thesis is to compare two mortality forecasting models, the Lee-Carter model, and
the Brass model. The analysis in this thesis makes use of Statistics Norway's (SSB) population projection
report, which uses the Lee-Carter model in its forecasts, and the life table data for the Norwegian
population collected from SSB. In an attempt to find a simpler substitute for the Lee-Carter model,
I try to answer the following three questions: Will the Brass model result in a good fit to the
Norwegian data? Will the Brass model be a simpler yet adequate substitute for mortality
forecasting using Norwegian data? How do the results for future mortality obtained by the Brass
model differ from those based on the Lee-Carter model?
Comparing the two models, the Brass model did indeed return different results from those
reported by SSB. The forecasted life expectancy values found in this thesis were consistently higher
than the Lee-Carter values. For 2060 I forecast a life expectancy of 91.02 years for women and 90.67
for men. These differ from the Lee-Carter results by 2.27 and 0.72 years, respectively. Both models
found that the future life expectancies of men and women intersect too early. This is due to a steep
increase in male life expectancy during the past few decades. SSB made an arbitrary adjustment in
female life expectancy projections to correct for this. My analysis shows a stronger argument for
adjusting male life expectancy, which suits the forecasted values of the Brass model better.
The analysis found both advantages and limitations in using the Brass model. The regression returned
a good fit to Norwegian data. A logit transformation allows the relational Brass model to be
expressed linearly. This gives two parameters that permit a simple and intuitive interpretation of
the data. However, the logit transformation also made the estimation of prediction intervals more
complex. A simulation had to be performed, which returned a wider prediction interval than
the one reported by SSB. It is premature to conclude whether the Brass model is better than
the Lee-Carter model, but a more comprehensive analysis is recommended.
Table of contents
1 Introduction and background
2 Data description
  2.1 The survivorship function and life expectancy
  2.2 Period Selection
3 The Model
  3.1 The Brass model (1971)
    3.1.1 The logit transformation
    3.1.2 The standard life table
    3.1.3 The parameters: $\alpha_t$ and $\beta_t$
4 Estimation
  4.1 Fitting the model
  4.2 The fitted model
  4.3 Results: alpha and beta
  4.4 Expanding the base period
5 Forecasting models for the parameters
  5.1 The Random walk
  5.2 The unit root test
  5.3 Forecasting results
    5.3.1 Alpha
    5.3.2 Beta
  5.4 Life expectancy
  5.5 The prediction interval (PI)
    5.5.1 Finding the variance
    5.5.2 Simulation
6 Summary and conclusion
7 References
List of figures and tables
Figures
Figure 2.1: Rectangularization
Figure 2.2: History of life expectancy in Norway. Source: FHI, 2018
Figure 3.1: The logits of the standard life table, 1990 and 2017
Figure 4.1: Line graphs of the logits of $l_x$ for 1990 and 2017 against the standard
Figure 4.2: Estimated parameters for each year between 1990 and 2017. (a) $\alpha_t$ and (b) $\beta_t$
Figure 4.3: Fitted parameters $Base_{1950}$ vs. $Base_{1990}$. (a) $\alpha_t$, women; (b) $\alpha_t$, men; (c) $\beta_t$,
women and (d) $\beta_t$, men
Figure 5.1: Forecasts with 95% prediction intervals
Figure 5.2: Forecasted life expectancy at birth for men and women, 10-year intervals between
2020-2060
Figure 5.3: SSB's forecasted $e_0$ with 80% PI. Source: Statistics Norway, 2018
Figure 5.4: Forecasted $e_0$ with 80% PI
Tables
Table 1.1: Comparability with SSB
Table 2.1: An excerpt of the survivorship function life table
Table 3.1: Interpretation of $\alpha_t$ and $\beta_t$. Source: Rowland, 2003
Table 4.1: Fitted values of $\alpha_t$ and $\beta_t$
Table 5.1: Life expectancy at birth ($e_0$) for men and women. Brass compared to SSB
Table 5.2: Variance of $\alpha_t$ in 2040 and 2060 for men and women
Table 5.3: Variance of forecasted $\beta_t$ for men and women
Table 5.4: Variance of forecasted $\varepsilon_t$ for men and women
Table 5.5: The values of $\alpha_t$, $\beta_t$ and $\varepsilon_t$ used in the simulation
1 Introduction and background
Statistics Norway ("Statistisk sentralbyrå", abbreviated as SSB) publishes official population
projections for Norway at regular intervals. These projections include assumptions on the
future course of mortality. For more than a decade, SSB has used the so-called Lee-Carter model
for analysing the historical development of mortality and extrapolating it into the future. This
model, originally constructed by Ronald D. Lee and Lawrence Carter in 1992, starts from a table
of empirical age-specific death rates, one for men and one for women (SSB, 2018).
When the death rate in year $t$ for a population aged $x$ is written as $m_{x,t}$, the model is
$$\ln(m_{x,t}) = a_x + b_x k_t + \varepsilon_{x,t} \qquad (1)$$
Parameters $a_x$, $b_x$ and $k_t$ are to be estimated from the data, while $\varepsilon_{x,t}$ is an error term. The
estimates of $a_x$, for all ages $x$, can be interpreted as the general age schedule of mortality. The
parameter $k_t$ reflects year-on-year changes in mortality. These changes are not the same for every
age. The parameter $b_x$ reflects how they differ across the ages. Once the parameters have been
estimated, predictions of mortality consist of extrapolating the time series of $k_t$-values. Parameters
$a_x$ and $b_x$ are kept constant.
The Lee-Carter model, as expressed in (1), was first introduced in 1992 for US data. Given an
unexpectedly good fit to data of other western countries, it has since become one of the most widely
used forecasting models for mortality. However, it has some limitations:
1) Estimating the parameters of the model is complicated because the right-hand side of
expression (1) does not have any independent variables; there are only parameters. One
solution is to use singular value decomposition of $\ln(m_{x,t}) - \hat{a}_x$. Parameters $a_x$ are
estimated as averages of $\ln(m_{x,t})$ across time (Lee and Carter, 1992).
2) Assuming a constant value of the parameters $b_x$ may lead to strong distortions in the age
pattern of predicted mortality.
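The estimation route described in point 1 can be sketched as follows. This is only a minimal illustration in Python with NumPy, not SSB's implementation: the function name `fit_lee_carter` and the synthetic input are hypothetical, and the normalisation used (the $b_x$ sum to one, the $k_t$ sum to zero) is the convention of Lee and Carter (1992).

```python
import numpy as np

def fit_lee_carter(log_m):
    """Fit ln(m[x, t]) = a[x] + b[x] * k[t] by singular value decomposition.

    log_m: 2-D array of log death rates, rows = ages x, columns = years t.
    """
    log_m = np.asarray(log_m, dtype=float)
    # a_x: average of ln(m_{x,t}) across time, as described in the text.
    a = log_m.mean(axis=1)
    # Centre each age row, then keep the leading singular component.
    centred = log_m - a[:, None]
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    b = u[:, 0]
    k = s[0] * vt[0]
    # Normalise so that sum(b) = 1. sum(k) = 0 holds automatically because
    # every centred row sums to zero. (Assumes sum(b) is not zero.)
    scale = b.sum()
    return a, b / scale, k * scale

# Hypothetical synthetic data: 3 ages, 4 years, exactly rank one after centring.
a_true = np.array([-6.0, -4.0, -2.0])
b_true = np.array([0.5, 0.3, 0.2])
k_true = np.array([1.5, 0.5, -0.5, -1.5])
a_hat, b_hat, k_hat = fit_lee_carter(a_true[:, None] + np.outer(b_true, k_true))
```

With real data the centred matrix is not exactly of rank one, so the leading component is only the best rank-one approximation in the least squares sense.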
This paper attempts to overcome the complexity of the Lee-Carter model by applying the Brass
model. The Brass model was introduced in 1971, and with it Brass created a new method to
generate life tables. It is one example of a so-called relational model life table, where a life table is
computed based on its relation to a "standard" life table (Rowland, 2003). The structure of the
model and the estimation of its parameters are considerably less complicated than for the Lee-Carter
model. Thus, a natural question is whether the Brass model is suitable for forecasting
Norwegian mortality. Brass himself was optimistic about using his own method in forecasting
(Brass, 1971). In their paper, Lee and Carter also refer to the Brass model as a feasible option for
mortality forecasting (Lee and Carter, 1992). The application of the Brass method in mortality
forecasts has been explored in a number of studies by researchers including Keyfitz (1991) and
Himes et al. (1994), but none with the use of Norwegian mortality data.
Therefore, the questions that I attempt to answer in this thesis are as follows:
1. Will the Brass model result in a good fit to the Norwegian data?
2. Will the Brass model be a simpler yet adequate substitute for mortality forecasting
using Norwegian data?
3. How do the results for future mortality obtained by the Brass model differ from those based
on the Lee-Carter method?
To maintain comparability, I will throughout this thesis use SSB's approach as a guideline for many
of the adjustments that I make in this analysis. The appropriate time period to use will also be
considered; therefore, all available data were collected. Calculations were done using the statistical
software packages Stata and R. A summary of the comparability with SSB's analysis can be found in the
table below, with further justifications explained throughout the text.
                      SSB                    This thesis
Age group             0-119                  0-110
Base period           1990-2017              1990-2017
Forecasting horizon   2018-2100 (82 years)   2018-2060 (43 years)
Table 1.1: Comparability with SSB
2 Data description
The Norwegian life table data used in this paper was collected from two sources: Statistics Norway
(SSB) and the Human Mortality Database (HMD). Public access to SSB data only dates back to
1966 (SSB: Statistikkbanken, 2018). To get the entire timeline, the dataset from SSB (1966-2017)
was combined with HMD-data from 1846 to 1965.
In the SSB report for Norway's 2018 population projections, the period 1990-2017 was used as the
base to calculate future mortality. It is assumed that the dataset I have collected reflects the one
used in the SSB report. To maintain comparability I will therefore mainly use the data provided by
SSB. It must be noted that the ages in the available life tables range from 0 to 106. I have extended
the range to age 110 by assuming a death probability of 0.5 for each year from 107 to 110. In SSB's
report the maximum age is set to 119. Despite this difference, the main purpose of this paper is
ultimately to find the life expectancy at birth. Given the small and variable number of survivors at
these ages, along with the calculation methods used in this thesis, the exclusion of the age
group 110-119 is not expected to have a significant effect on the results.
In this paper we only use the survivorship function ("$l_x$-column") of period life tables. This is a
cumulative probability function, where the $l_x$-column describes the number of survivors at exact
age $x$ per 100,000 births based on a given set of age-specific mortality rates. Here, I use one-year
age groups. The survivorship function shows the distribution for each age, starting at birth and
ending at age 110. The annual data is available by sex, and for reasons that will be further explained
later on, I will analyse the data for men and women separately. An excerpt of the dataset is
illustrated in table 2.1.
Table 2.1: An excerpt of the survivorship function life table
2.1 The survivorship function and life expectancy
Figure 2.1 plots the survivorship function for the Norwegian population based on age-specific death
rates from the years 1846, 1950, 1990 and 2017. The illustration shows a clear pattern in the
Norwegian survival distribution: over the years, deaths have become increasingly concentrated at
older ages. Survival was more evenly distributed between ages in 1846, when roughly 59% of
women and 54% of men survived until age 50. In 2017, the corresponding numbers had increased
to 98% and 97%, respectively. Improved living conditions, a better health system and the prevention of
diseases over the 150-year period have led to a drastic decline in infant and child mortality, which has
currently reached an all-time low (FHI, 2018). An increasing proportion survive to maturity or older
ages, and it is expected that this development will continue in the future (SSB, 2018). This tendency,
where the graph shifts toward the upper right corner, making it look like a rectangle, is called
rectangularization (Rowland, 2003).
Figure 2.1 Rectangularization
Life expectancy ($e_x$) is a measure of the average time a population group is expected to live based
on a given set of age-specific death rates. In the past, war, poverty and poor health were the main
factors affecting the age at which people died. Technological improvements and the introduction
of penicillin in 1925 led to a great reduction in infectious diseases and deaths (Meslé and Vallin,
2011).
Figure 2.2 shows the life expectancy at birth for males and females in Norway during the period
1846-2016. As illustrated, the curves follow different paths. The life expectancy for women has,
for any given point in time, exceeded the life expectancy for men. Since 1925, life expectancy for
the female population has steadily increased, while the trajectory for the male population has been
less consistent. The gender gap in life expectancy was at its greatest in the period 1950-1990,
peaking around the 1980s. From 1990 onwards, men's life expectancy has risen more steeply
than women's, which has led to a dramatic decrease in the life expectancy gender
gap.
Interestingly, a gender gap in life expectancy may be rooted in numerous factors related to biology
and psychology, ultimately leading to behaviours and lifestyle choices that affect life expectancy
(Oksuzyan et al., 2006). For the peak in the life expectancy gender gap during 1950-1990, the
most prominent explanatory factor has been tobacco use. The Norwegian Institute of Public Health
(FHI) reports that the proportion of men smoking peaked in the 1960s, followed by a steep decline.
For women, on the other hand, the smoking proportions were constant during the same time frame
and did not start to decline until the 1990s. A greater change in men's habits has now led to a drastic
reduction in the gap. SSB reports that smoking is expected to be less important for the gender gap
from 2017 onwards (SSB, 2018). This is further supported by a recent
Swedish study which pointed out that smoking plays a minor role in the current gender gap in
the Swedish population (Sundberg et al., 2018). Given the differing factors between the sexes, the
forecasting of life expectancy for men and women will be performed separately.
Figure 2.2: History of life expectancy in Norway
Source: FHI, 2018
2.2 Period Selection
In mortality forecasting it is important to consider what past period should be used to extrapolate
future mortality (Keyfitz, 1991). As I have pointed out previously, the Norwegian life expectancy
patterns have varied greatly with time, especially for the male population. Choosing the appropriate
dataset as the base must therefore be done with careful consideration. SSB has noted this in its
reports, as the projections published before 2016 consistently turned out to be underestimates. In
newer reports, the base period has been changed to 1990-2017, which has been found to reduce the
issue of underestimation (SSB, 2018). Choosing the same base period not only maintains
comparability between the studies; I also find it to be a reasonable time frame given
the consistency in life expectancy trajectories for both sexes in the period, as can be seen in figure
2.2.
The SSB report provides predicted values for each year up to 2100. Given the relatively short base
period (n = 28 years), forecasts over such a long horizon are subject to large inaccuracies. As
SSB uses 2060 as one of its main reference years, I will restrict my forecasting horizon to this
year. This gives a shorter forecasting horizon of 43 years (m = 43).
3 The Model
When William Brass introduced his method in 1971, he revolutionized how life tables for
populations with low quality or incomplete datasets were constructed. His mathematical relational
method allowed life tables to be constructed independently of historical data (Preston et al., 2004).
The World Health Organization (WHO) presents the Brass logit system as one of the main model
life tables, along with commonly used life tables like the UN model life table and the Coale-Demeny
model (WHO, 2000). In this section I will lay down the foundation of the Brass model,
followed by a description of the application to mortality forecasting on Norwegian data.
3.1 The Brass model (1971)
When Brass developed his relational method, he noticed that the relationship between two life
tables becomes approximately a straight line after logit transformation. This discovery allowed the
relationship between two survivorship functions to be expressed linearly:
$$y_{x,t} = \alpha_t + \beta_t y_{x,s} + \varepsilon_t \qquad (2)$$
The $y$ on the left-hand side represents the logit-transformed survivorship function for a given year,
$t$, and a given age, $x$. Similarly, the $y$ on the right-hand side has also been logit transformed, but
this one represents a single reference point, which in the literature is referred to as
the standard life table. This is denoted by the subscript $s$ (Newell, 1988). Parameters $\alpha_t$ and $\beta_t$ are to be
estimated from the data, while $\varepsilon_t$ is an error term. Conventionally, one uses available data to
make reasonable assumptions about $\alpha_t$ and $\beta_t$ and, together with the standard life table,
computes the life table for a given year.
3.1.1 The logit transformation
Following the model, the first key step consists of transforming the $l_x$-values by logit
transformation:
1. Divide $l_x$ by 100 000 to express it as a proportion, $p_x$.
2. Logit transform $p_x$ by using the following expression:
$$\mathrm{logit}(p_x) = 0.5 \ln\left(\frac{1 - p_x}{p_x}\right) \qquad (3)$$
Here we follow Brass' original approach. Alternatively we could have defined $\mathrm{logit}(p_x)$ as
$\ln(p_x/(1 - p_x))$.
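The two steps above can be sketched in a few lines of Python. This is a minimal sketch and not the code used in this thesis (the calculations were done in Stata and R); the function names `brass_logit` and `inverse_brass_logit` are hypothetical.

```python
import math

def brass_logit(l_x, radix=100_000):
    """Brass logit of a survivorship value l_x, following expression (3)."""
    p_x = l_x / radix  # step 1: express l_x as a proportion
    # step 2: only defined for 0 < p_x < 1, so age 0 (p_x = 1) must be left out
    return 0.5 * math.log((1.0 - p_x) / p_x)

def inverse_brass_logit(y):
    """Recover the proportion p_x from its Brass logit y."""
    return 1.0 / (1.0 + math.exp(2.0 * y))

# Hypothetical value: 98,000 survivors per 100,000 births at some age x.
y = brass_logit(98_000)
p = inverse_brass_logit(y)
```

With this definition, high survival gives a negative logit, and a proportion of exactly one half gives a logit of zero.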
3.1.2 The standard life table
In the relevant literature it has been stated that "…Any standard can be chosen, but it is sensible to
choose one that shows some sort of average pattern." (Newell, 1988, p. 155). Brass defined a
general standard life table, but he recommended using a life table that better reflects the population,
provided that the data are available (Brass, 1971).
Given the complete datasets available for the Norwegian population, I found the most sensible
approach to be using an average. The standard life table can therefore be found by the
following expression:
$$l_{x,s} = \frac{1}{n} \sum_{i=t}^{T} l_{x,i} \qquad (4)$$
where $t$ is the start of the base period, $T$ is the end and $n = T - t + 1$ is the number of years in
the given period.
As stated in the period selection section, the base period is 1990-2017, which gives the following
calculation of the standard life table:
$$l_{x,s} = \frac{1}{28} \sum_{t=1990}^{2017} l_{x,t} \qquad (5)$$
The values of the logit-transformed life table are illustrated in figure 3.1. As expected, the standard
falls between the curves for 1990 and 2017. Using the standard life table to find the life expectancy at
birth would return values close to those observed around 2003.
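The averaging that defines the standard can be illustrated with a short Python sketch. The function name `standard_life_table` and the toy numbers are hypothetical; with the actual data, the dictionary would hold one $l_x$-column for each year from 1990 to 2017.

```python
def standard_life_table(tables):
    """Average the l_x columns of several period life tables, age by age.

    tables: dict mapping year -> list of l_x values (same ages in every year).
    """
    years = sorted(tables)
    n = len(years)                      # number of years in the base period
    n_ages = len(tables[years[0]])
    return [sum(tables[year][x] for year in years) / n for x in range(n_ages)]

# Toy input with two years and three ages (values are made up).
toy = {
    1990: [100_000, 90_000, 50_000],
    1991: [100_000, 92_000, 54_000],
}
standard = standard_life_table(toy)
```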
Figure 3.1: The logits of the standard life table, 1990 and 2017.
3.1.3 The parameters: $\alpha_t$ and $\beta_t$
Given the standard, the Brass model summarizes age-specific mortality for a fixed year $t$ in two
parameters, $\alpha_t$ and $\beta_t$. The estimate of $\alpha_t$ can be interpreted as the level of mortality in year $t$. The
second parameter, $\beta_t$, denotes the relationship between childhood and adult mortality. It must be
noted that both interpretations must be seen relative to the standard. A life table computed using
$\alpha_t = 0$ and $\beta_t = 1$ is identical to the standard life table.
In the relevant literature, Brass' general standard life table has been used to find appropriate ranges
for $\alpha_t$ and $\beta_t$. Reasonable values are illustrated in table 3.1, where the range for $\alpha_t$ is set between
-1.5 and 0.8 and for $\beta_t$ between 0.6 and 1.4. Note that the general standard life table returns a life
expectancy close to that of the mid-1940s. As my standard life table corresponds to roughly 2003,
the ranges in my analysis could differ.
Given the interpretations of $\alpha_t$ and $\beta_t$, it is possible to form expectations about the regression results.
For both men and women in the Norwegian population, life expectancy has steadily increased
over the past few decades. SSB has reported that this number will continue to increase in the future
(SSB, 2018). As time passes, $\alpha_t$ is therefore expected to become more and more negative.
During the past decade, life expectancy has reached an all-time high and, correspondingly, child
mortality has reached an all-time low (FHI, 2018). If either of these values improves in the
future, the marginal change is expected to be quite small. It is therefore reasonable to assume that
the child-adult mortality relationship will remain quite steady, with a $\beta_t$ close to one.

$\alpha_t$   Interpretation
-1.5         Higher life expectancy
 0.0         Life expectancy the same as the standard
 0.8         Low life expectancy

$\beta_t$    Interpretation
0.6          High infant and child mortality, low adult mortality
1.0          The relationship between adult and child mortality the same as the standard
1.4          Low infant and child mortality, high adult mortality
Table 3.1: Interpretation of $\alpha_t$ and $\beta_t$. Source: Rowland, 2003
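The role of the two parameters can be made concrete with a small Python sketch that builds a life table from a standard and a pair ($\alpha_t$, $\beta_t$). It is an illustration under stated assumptions, not the procedure used in this thesis: the function names and the toy standard are hypothetical, and life expectancy at birth is approximated by assuming that deaths occur on average halfway through each age interval.

```python
import math

def brass_life_table(alpha, beta, standard_p):
    """Survival proportions implied by model (2) without the error term.

    standard_p: standard survival proportions for ages 1, 2, ..., each in (0, 1).
    """
    result = []
    for p_s in standard_p:
        y_s = 0.5 * math.log((1.0 - p_s) / p_s)   # logit of the standard
        y = alpha + beta * y_s                    # linear relation, model (2)
        result.append(1.0 / (1.0 + math.exp(2.0 * y)))  # back-transform
    return result

def life_expectancy_at_birth(p_from_age_1):
    """Approximate e_0 as 0.5 + sum of l_x / l_0 over ages x >= 1 (assumption)."""
    return 0.5 + sum(p_from_age_1)

# Toy standard with five ages only (values are made up).
toy_standard = [0.99, 0.95, 0.80, 0.40, 0.05]
same = brass_life_table(0.0, 1.0, toy_standard)       # reproduces the standard
improved = brass_life_table(-0.5, 1.0, toy_standard)  # negative alpha: higher survival
```

In this sketch a negative $\alpha_t$ raises survival at every age and therefore raises the approximate life expectancy, in line with table 3.1.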
4 Estimation
4.1 Fitting the model
The linearity of model (2) allows the parameters $\alpha_t$ and $\beta_t$ to be estimated by ordinary least
squares (OLS) regression. In his paper, Brass expressed a concern about applying OLS to fit the
estimates. Using the United Nations mortality schedule he found irregularities in the youngest and
oldest age groups. In particular, when comparing the fitted values against the observed values, he
noted great discrepancies at age one and in the oldest age groups: they did not fit the data well.
With the possibility of errors in reporting, Brass suggested weighted least squares (WLS) as an
alternative method. He reasoned that putting less weight on these groups in the regression could
solve the issue. However, he concluded that the simplicity of OLS outweighs the
"arbitrary and laborious" application of WLS, ultimately preferring OLS (Brass, 1971).
Stewart (2004) performed an evaluation of the statistical methods used in the Brass relational
model, using the West model level 22 stable life table as the underlying mortality distribution.
Testing five different methods, which were based on a variation of OLS, WLS and maximum
likelihood estimation (MLE), he concluded that the MLE produced the most efficient estimates.
There is an additional concern. The survivorship function $l_x$ is not observed, but is constructed
from estimated age-specific death rates. An appropriate approach would be to use data on death
counts and numbers of persons alive, both broken down by age, use a Poisson model to estimate
the death rates and their standard errors, and include these standard errors in the estimation
procedure for $\alpha_t$ and $\beta_t$. However, I consider this method too complicated for the present purpose.
In the introduction I listed the level of "complexity" as one of the limitations of the Lee-Carter
model. Despite the suggested methods of WLS or MLE, the logit transformation of the data
makes the application of these methods quite difficult. There is also a possibility that the Norwegian
population data fit the model better than the data Brass used. Thus, staying true to Brass and the
underlying idea of this thesis, I will proceed using the simple method of OLS.
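For a single year, the OLS fit of model (2) reduces to a simple regression of that year's logits on the standard logits. The sketch below shows the closed-form estimates in plain Python; the function name `fit_brass_ols` and the toy logits are hypothetical, and the actual estimation in this thesis was carried out in Stata and R.

```python
def fit_brass_ols(y_year, y_standard):
    """OLS estimates of alpha_t and beta_t in y_year = alpha + beta * y_standard."""
    n = len(y_year)
    mean_s = sum(y_standard) / n
    mean_y = sum(y_year) / n
    # Slope: covariance of (standard, year) over variance of the standard.
    s_xy = sum((s - mean_s) * (y - mean_y) for s, y in zip(y_standard, y_year))
    s_xx = sum((s - mean_s) ** 2 for s in y_standard)
    beta = s_xy / s_xx
    alpha = mean_y - beta * mean_s
    return alpha, beta

# Toy logits (made up): the year is an exact linear function of the standard.
toy_standard_logits = [-2.0, -1.0, 0.0, 1.5]
toy_year_logits = [-0.3 + 1.1 * s for s in toy_standard_logits]
alpha_hat, beta_hat = fit_brass_ols(toy_year_logits, toy_standard_logits)
```

Repeating this fit for every year in the base period gives one time series of estimates for each parameter.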
4.2 The fitted model
An important assumption regarding the error term is that it has zero expectation and a constant
variance. A violation of this assumption could lead to inefficient estimates and, at
worst, bias (Carter Hill et al., 2008). When I checked for heteroscedasticity in the residuals,
the patterns were found to be slightly irregular. Therefore, to account for the presence of
heteroscedasticity, the OLS regression was performed with robust standard errors.
Regardless of the use of robust standard errors, I found the model to return a good fit to the
data. For both men and women, all the reported $R^2$-values were greater than 0.99. The estimated
values of $\alpha_t$ and $\beta_t$ with their corresponding standard errors (SE) and $R^2$-values are listed in table
4.1. This table gives the time series dataset for $\alpha_t$ and $\beta_t$, which will be used in the mortality
forecast analysis.
Table 4.1: Fitted values of $\alpha_t$ and $\beta_t$
In figure 4.1, the empirical survivorship function (in logit form) $y_t$ is plotted against the fitted
values. Fitting the model by OLS on Norwegian data, I did not find the issues Brass referred
to, except for minor tendencies in the oldest age group in 2017 for men, as illustrated by the green
curve in figure 4.1 (b).
4.3 Results: alpha and beta
As the parameters $\alpha_t$ and $\beta_t$ lay the foundation for the mortality forecast, it is important to have a
closer look at their behaviour. Figure 4.2 shows the estimated $\alpha_t$ and $\beta_t$ parameters for each year
between 1990 and 2017. The green curves represent women, while the blue represent men. A fitted
dashed line has been added to each of the parameters to give a better illustration of the slopes.
Both parameters seem to behave as I expected. $\alpha_t$ for both men and women follows a
downward slope, with men following a steeper trajectory. This reflects that men in the last
few decades have had a greater improvement in life expectancy compared to women, as noted in
section 2.1.
The parameter $\beta_t$ for both sexes fluctuates around a value close to one, indicating little change
in the relationship between child and adult mortality in the past two decades. While $\beta_t$ for women
follows a close to constant path, the men's path tilts slightly downwards. Following a similar reasoning
as for $\alpha_t$, a great reduction in the mortality rate for men has naturally had a greater impact on the
child-adult mortality relationship.
Figure 4.1: Line graphs of the logits of $l_x$ for 1990 and 2017 against the standard
4.4 Expanding the base period
In section 2.2 I set the base period to be 1990-2017 ("$Base_{1990}$"). To further emphasize the
importance of period selection and to see the implications of choosing a different one, I performed
an analysis using a longer time span. I did this by expanding the base period, starting in 1950
("$Base_{1950}$") instead of 1990. The year 1950 was arbitrarily chosen, but looking back at figure 2.2 this
period marks the start of modern life in Norway. Living conditions improved remarkably, and war,
hunger and pandemics were no longer determining factors in life expectancy. This makes it a more
realistic option than using the whole timeline since 1846.
I recomputed the standard using expression (6) and re-estimated $\alpha_t$ and $\beta_t$ for every year in the
68-year period. The results are illustrated in figure 4.3.
$$l_{x,s} = \frac{1}{68} \sum_{t=1950}^{2017} l_{x,t} \qquad (6)$$
Figure 4.3 shows the results using $Base_{1950}$ (blue) compared to using $Base_{1990}$ (green). $\alpha_t$ for
women is illustrated in (a) and for men in (b). Similarly for $\beta_t$, (c) represents women and (d) men.
The dashed lines have been added to illustrate the slopes of the curves. Given the different base
period there is a natural shift in all the curves.
With the exception of $\alpha_t$ for women, the change in the base period returned remarkably different
results. In figure 4.3 (b) we can see that using $Base_{1990}$ returned a steeper slope than $Base_{1950}$.
This reflects how men during the past few decades have had a remarkable increase in life
expectancy compared to previous years. The female child-adult mortality relationship follows a
more intuitive path, as seen in figure 4.3 (c): with remarkable decreases in infant mortality in
1950-1990, it is natural that the curve is steeper for $Base_{1950}$ compared to $Base_{1990}$.
Figure 4.2: Estimated parameters for each year between 1990 and 2017. (a) $\alpha_t$ and (b) $\beta_t$.
The flat trend for men using $Base_{1950}$ in figure 4.3 (d) is a little misleading, as it suggests that
the relationship has been constant since the 1950s. This is due to the high volatility in male
mortality combined with a decreasing infant mortality.
This analysis further confirms the importance of period selection. Men's trajectory is probably
more extreme than it will be in the future, but overall it seems that the development over the past few
decades reflects the future better. In what follows, I will restrict myself to parameter estimates for
1990-2017 (see Table 4.1).
Figure 4.3: Fitted parameters, $Base_{1950}$ vs. $Base_{1990}$.
(a) $\alpha_t$, women; (b) $\alpha_t$, men; (c) $\beta_t$, women and (d) $\beta_t$, men.
5 Forecasting models for the parameters
Now that the parameters have been estimated and evaluated, forecasting mortality consists
of extrapolating the time series of $\alpha_t$ and $\beta_t$ values. To produce appropriate forecasting
models for the parameters, I applied basic time series theory to the data.
5.1 The Random walk
As illustrated in figure 4.2, both $\alpha_t$ and $\beta_t$ for men and women fluctuate around a given trend.
These fluctuations are seemingly random, which points to the random walk model as a suitable
model. The random walk is one of the fundamental models in time series modelling and is defined
by the following expression:
$$y_t = y_{t-1} + \mu + \varepsilon_t \tag{7}$$
where $y_t$ represents the current value of the variable of interest, $y_{t-1}$ the value in the previous
period, $\mu$ is a constant which denotes a drift and $\varepsilon_t$ is a random error $\sim N(0, \sigma_\varepsilon^2)$. $\sigma_\varepsilon^2$ and $\mu$ are the
parameters to be estimated. When $\mu$ is equal to zero, the random walk model is without drift. As
this model only incorporates one lagged value of $y_t$, it is also called an autoregressive model of
order 1 (AR(1)).
Iain Currie produced a technical note on estimating the random walk with drift, which I follow
here (Currie, 2010). The first step in estimating the random walk with drift is to compute the first
difference:
$$z_t = y_t - y_{t-1} \tag{8}$$
which gives:
$$z_t = \mu + \varepsilon_t \tag{9}$$
The parameter $\mu$ can then be estimated as the average increment by the following equation:
$$\hat{\mu} = \bar{z} = \frac{1}{n-1} \sum_{t=2}^{n} z_t \tag{10}$$
where $n = 28$.
The variance $\sigma_\varepsilon^2$ is estimated as the sample variance of $z_t$:
$$\hat{\sigma}^2 = \frac{1}{n-2} \sum_{t=2}^{n} (z_t - \bar{z})^2 \tag{11}$$
The variance and standard error of $\hat{\mu}$ are as follows:
$$Var(\hat{\mu}) = Var(\bar{z}) = \frac{\hat{\sigma}^2}{n-1} \tag{12}$$
which gives:
$$SE(\hat{\mu}) = \frac{\hat{\sigma}}{\sqrt{n-1}} \tag{13}$$
The 95% prediction interval for $y$ predicted in the year $(n + h)$ can be found using the following
expressions:
$$\hat{y}_{n+h} \pm t \cdot SE(\hat{y}_{n+h}) \tag{14}$$
$$SE(\hat{y}_{n+h}) = h \cdot SE(\hat{\mu}) \tag{15}$$
where $h$ denotes the number of years forecasted ahead. For example, with a forecast until 2060, $h = 43$. As
$n = 28$, $t = 2.04841$.
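The estimation steps in equations (8) to (13) can be sketched in Python as follows. This is a minimal illustration of the formulas as stated above, not the code used for the thesis. The function name and the toy series are hypothetical.

```python
import math

def fit_random_walk_with_drift(y):
    """Estimate drift, innovation variance and SE of the drift, eqs. (8)-(13)."""
    n = len(y)
    z = [y[t] - y[t - 1] for t in range(1, n)]             # first differences, eq. (8)
    drift = sum(z) / (n - 1)                               # average increment, eq. (10)
    sigma2 = sum((v - drift) ** 2 for v in z) / (n - 2)    # sample variance, eq. (11)
    se_drift = math.sqrt(sigma2 / (n - 1))                 # eqs. (12)-(13)
    return drift, sigma2, se_drift

# Hypothetical toy series with a downward drift.
toy = [0.0, -0.1, -0.3, -0.4, -0.6]
drift, sigma2, se_drift = fit_random_walk_with_drift(toy)
```

For the toy series the increments alternate between -0.1 and -0.2, so the estimated drift is about -0.15.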
5.2 The unit root test
Now, let's go back to equation (7): $y_t = y_{t-1} + \mu + \varepsilon_t$
This equation implies a second parameter, $\phi_0$:
$$y_t = \phi_0 y_{t-1} + \mu + \varepsilon_t \tag{16}$$
where $\phi_0$ is equal to one. This property is called a unit root and has an important feature: by
verifying the value of $\phi_0$ one can identify the proper autoregressive model.
$|\phi_0| < 1$ denotes stationarity and $\phi_0 = 1$ denotes nonstationarity. The random walk model is a
nonstationary AR(1). To confirm the random walk as the proper model for my data I need to check
that the condition $\phi_0 = 1$ is fulfilled (Carter Hill et al., 2008).
To verify that my estimated parameters follow a random walk, I therefore test whether $\phi_0 = 1$.
In this thesis I use the Augmented Dickey-Fuller test (ADF), which tests the null hypothesis of
$\phi_0 = 1$ against the alternative of stationarity. The test can be performed on models with or without a drift.
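The logic of the test can be illustrated with a simplified sketch. The code below computes the plain Dickey-Fuller statistic for the model with a constant and without the additional lagged differences that the augmented version includes, so it is not the ADF procedure used in this thesis. It regresses the first difference on a constant and the lagged level, $\Delta y_t = c + \gamma y_{t-1} + e_t$ with $\gamma = \phi_0 - 1$, and returns the t-ratio of $\gamma$. The function name and the simulated series are hypothetical.

```python
import math
import random

def dickey_fuller_stat(y):
    """t-ratio of gamma in dy_t = c + gamma * y_{t-1} + e_t (no lagged differences)."""
    x = y[:-1]
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    m = len(dy)
    mx, md = sum(x) / m, sum(dy) / m
    sxx = sum((v - mx) ** 2 for v in x)
    gamma = sum((v - mx) * (d - md) for v, d in zip(x, dy)) / sxx
    c = md - gamma * mx
    rss = sum((d - c - gamma * v) ** 2 for v, d in zip(x, dy))
    se_gamma = math.sqrt(rss / (m - 2) / sxx)
    return gamma / se_gamma

# Simulated illustration: white noise is stationary by construction,
# while the cumulative sum of the same shocks is a random walk.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
walk = [sum(noise[: i + 1]) for i in range(200)]
stat_noise = dickey_fuller_stat(noise)
stat_walk = dickey_fuller_stat(walk)
```

The statistic is compared with Dickey-Fuller critical values rather than the usual t-table. In large samples the 5% critical value for the version with a constant is about -2.86, and a statistic below it rejects the unit root. The white-noise series produces a strongly negative statistic, whereas a random walk typically does not.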
5.3 Forecasting results
5.3.1 Alpha
For both men and women, $\alpha_t$ follows a downward trend, which indicates a drift. I therefore run the
following time series model:
$$\alpha_t = \phi_0 \alpha_{t-1} + \mu + \varepsilon_t \tag{17}$$
To verify that this time series model suits the data, I run an ADF test. I fail to reject the null
hypothesis that $\phi_0 = 1$ against the alternative at all significance levels, both for women and for
men. I therefore conclude that the time series is nonstationary and that the random walk model with drift
fits the data well for the $\alpha_t$ parameters. The preferred models, with standard errors in
parentheses, are as follows:
Women: $$\hat{\alpha}_t = \hat{\alpha}_{t-1} - \underset{(0.0058)}{0.0142} + \varepsilon_t, \qquad \hat{\sigma}_\varepsilon^2 = 0.0009 \tag{18}$$
Men: $$\hat{\alpha}_t = \hat{\alpha}_{t-1} - \underset{(0.0070)}{0.0213} + \varepsilon_t, \qquad \hat{\sigma}_\varepsilon^2 = 0.0013 \tag{19}$$
The constant terms, -0.0142 and -0.0213, indicate the average annual change in $\alpha_t$ for women and
men respectively. Over the 43-year forecasting horizon, $\alpha_t$ is therefore forecasted to change by 43
times -0.0142 and -0.0213, a total change of -0.6106 and -0.9159 for women and men
respectively. This corresponds to an $\alpha_t$ value of -0.8089 for women and -1.2084 for men in 2060.
Figure 5.1 (a) and (b) plot the past values of alpha for women and men along with the forecasts
based on the time series model and the associated 95% prediction intervals. It can be seen that the
forecasts for men follow a steeper trend than the forecasts for women, which naturally follows the
trend of the original data for 1990-2017.
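As a worked check of the arithmetic, the point forecast and the 95% prediction interval for women in 2060 can be reproduced with equations (14) and (15). The sketch below is illustrative only. The 2017 starting value of -0.1983 is not reported directly but is implied by the reported 2060 value (-0.8089 + 0.6106), and the function name is hypothetical.

```python
def forecast_with_interval(last_value, drift, se_drift, horizon, t_crit):
    """Point forecast and prediction interval h years ahead, eqs. (14)-(15)."""
    point = last_value + horizon * drift
    half_width = t_crit * horizon * se_drift    # t * SE(y_hat), with SE = h * SE(mu_hat)
    return point - half_width, point, point + half_width

# Women: drift and standard error from model (18), t-value as stated for n = 28.
alpha_2017_women = -0.1983   # implied starting value (assumption, see lead-in)
lower, point, upper = forecast_with_interval(
    alpha_2017_women, drift=-0.0142, se_drift=0.0058, horizon=43, t_crit=2.04841
)
```

The point forecast equals the -0.8089 quoted in the text, and the interval is symmetric around it with a half-width of roughly 0.51.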
Figure 5.1: Forecasts with 95% prediction intervals
5.3.2 Beta
Figure 4.2 (b) illustrates that $\beta_t$ for both men and women fluctuates around a value close to 1. As
$\beta_t$ for men appears to follow a (slightly) downward trend, while it follows a flatter trend
for women, I test the following two time series models on both sexes:
With drift:
$$\beta_t = \phi_0 \beta_{t-1} + \mu + \varepsilon_t \tag{20}$$
Without a drift:
$$\beta_t = \phi_0 \beta_{t-1} + \varepsilon_t \tag{21}$$
Performing the ADF test with a drift for both sexes, I found that the null of $\phi_0 = 1$ was rejected at
all significance levels for women. For men it could not be rejected at the 1% and 5% significance
levels. Repeating the process using the model without a drift, the null could not be rejected at
the 1% and 5% levels for women, and it could not be rejected at any significance level for men. I
believe that the test for men returned mixed results due to the slight downward trend, which
is very small in absolute terms. My conclusion is that the indication of the presence of a unit root
was stronger for both sexes using the model without a drift. The random walk model without a drift
is given by the following expression:
$$\beta_t = \beta_{t-1} + \varepsilon_t \tag{22}$$
A random walk without a drift can be interpreted as a constant value with random fluctuations
given by the error term, $\varepsilon_t$. As $\beta_t$ for both men and women fluctuates randomly around a constant
value close to one, I perform a hypothesis test to see whether the parameter can be expressed as
such a constant. The test of $H_0: \beta_t = 1$ against an alternative returned mixed results.
For both sexes, only 20% of the tested values did not reject the null.
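Due to the mixed results I instead set $\beta_t$ to be the average of the historical estimates, given by:
$$\bar{b} = \frac{1}{n} \sum_{t=1990}^{2017} \hat{\beta}_t \tag{23}$$
where $n$ is the number of years from 1990 to 2017. Assuming independence between the $\hat{\beta}_t$ for the
subsequent years, the variance can be found by the following:
$$var(\bar{b}) = var\left( \frac{1}{n} \sum_{t=1990}^{2017} \hat{\beta}_t \right) = \frac{1}{n^2} \sum_{t=1990}^{2017} \left( SE(\hat{\beta}_t) \right)^2 \tag{24}$$
My preferred model for forecasting $\beta_t$ is as follows:
$$\hat{\beta}_t = \bar{b} + \varepsilon_t$$
which gives the following preferred models, with standard errors in parentheses:
Women:
$$\hat{\beta}_t = \underset{(0.0007)}{1.0097} + \varepsilon_t, \qquad \hat{\sigma}_\varepsilon^2 = 0.0002 \tag{25}$$
Men:
$$\hat{\beta}_t = \underset{(0.0012)}{1.0174} + \varepsilon_t, \qquad \hat{\sigma}_\varepsilon^2 = 0.0005 \tag{26}$$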
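Equations (23) and (24) amount to averaging the yearly estimates and combining their standard errors under the independence assumption. A minimal Python sketch is given below. The function name and the three toy estimates are hypothetical and only serve to show the calculation.

```python
import math

def mean_and_se(estimates, standard_errors):
    """Average of yearly estimates (eq. 23) and its standard error (eq. 24)."""
    n = len(estimates)
    b_bar = sum(estimates) / n
    # Independence assumed: the variance of the mean is the sum of squared SEs over n^2.
    var_b_bar = sum(se ** 2 for se in standard_errors) / n ** 2
    return b_bar, math.sqrt(var_b_bar)

# Hypothetical yearly estimates and their standard errors.
b_bar, se_b_bar = mean_and_se([1.00, 1.02, 1.01], [0.003, 0.004, 0.012])
```

If the yearly estimates were positively correlated, this formula would understate the uncertainty of the average.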
The $\beta_t$ model for women assumes a constant value equal to 1.0097. As the value is very close to
1, I assume that the future development of the child-adult mortality relationship will stay close to the
standard. For men, I also assume a close to constant development of the child-adult mortality
relationship in the future, but the estimate of 1.0174 lies slightly further from the standard value
of 1 than the estimate for women.
5.4 Life expectancy
With the appropriate parameter forecasts in place, they can be used to compute life expectancy values at
birth. Life expectancy can be found using the following expression:
$$e_x = 0.5 \cdot l_x + \sum_{t=x+1}^{n} l_t \tag{27}$$
Here $n = 106$, while $l_x$ is the life-table value divided by 100000, as stated in section 3.1.1.
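Expression (27) evaluated at birth can be sketched as follows. The sketch assumes that the survivorship values are scaled so that the value at age 0 equals 1 and are listed for ages 0 to 106. The function name and the stylised survivorship schedule are hypothetical.

```python
def life_expectancy_at_birth(l):
    """e_0 = 0.5 * l_0 + sum of l_t for t = 1..n, with l scaled so that l_0 = 1."""
    return 0.5 * l[0] + sum(l[1:])

# Stylised schedule: everyone alive up to exact age 80, nobody alive from age 81.
toy_l = [1.0] * 81 + [0.0] * 26      # ages 0..106
e0 = life_expectancy_at_birth(toy_l)
```

For this stylised schedule the result is 80.5 years.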
With the fitted values at 10-year intervals from 1990 to 2010 and the forecasted values from 2020
onwards, I found the life expectancy values at birth presented in table 5.1. 2017 has been
included as a reference point.
Table 5.1: Life expectancy at birth ($e_0$) for men and women. Brass compared to SSB
        Brass            SSB
Year    Women   Men      Women   Men
1990    79.55   72.82    79.81   73.44
2000    81.07   75.67    81.38   75.96
2010    83.40   78.74    83.15   78.85
2017    84.67   81.28    84.28   80.91
2020    85.18   82.08    84.70   81.60
2030    86.81   84.58    86.40   83.60
2040    88.31   86.83    87.80   85.40
2050    89.72   88.85    89.10   87.00
2060    91.02   90.67    90.30   88.40
Source: SSB.no, 2018
As expected, the forecasts using the Brass model differ from those presented in the SSB
report. In table 5.1 we can see that the model returns life expectancies below
SSB in 1990 and 2000 for women, while from 2010 onwards they are higher. For men, the values lie below SSB
until 2010 and above thereafter. Counting only the Brass values that lie above SSB, the
difference for women varies between a minimum of 0.25 years in 2010 and a maximum of
0.72 in 2060, and it increases gradually over time. For men this difference starts
at 0.37 years in 2017 and ends at 2.27 years in 2060.
By 2060 the Brass model forecasts a life expectancy at birth of 91.02 for women and 90.67 for
men. The corresponding values from the SSB report are 90.30 and 88.40, which differ by 0.72 and
2.27 years respectively.
In the SSB report, the life expectancy of women was arbitrarily adjusted in 2060 in order to avoid a
situation in which the life expectancy of men would exceed that of women. They did this by
increasing the life expectancy of women in 2060 by 0.8 years. Without the adjustment, the life
expectancies of men and women would intersect by 2100, which according to SSB is too early on the
basis of demographic trend analysis of Norwegian data. I experience the same issue in my analysis.
In the forecast, life expectancy for men has a large marginal increase in each 10-year interval
between 2020 and 2060. By 2060 the difference between Brass and SSB for men is about 3 times as
large as the difference for women in the same year. The determining factor behind this issue is the
same for both SSB and my analysis: due to the large increase in life expectancy for men over the past
few decades, a mechanical forecast using this trend could lead to an overestimation, as the same
trend is not expected to continue at the same rate in the future. The Brass values presented here are
unadjusted and indicate an intersection between the sexes in the early 2060s. This clearly indicates
that the Brass model results in a larger overestimation than the method used by SSB.
Figure 5.2: Forecasted life expectancy at birth for men and women,
10-year intervals between 2020-2060
To avoid too early an intersection of the paths, SSB listed two possible solutions: they could either
make an upward adjustment for women or a downward adjustment for men. They
did the former and justified this choice by the consistent underestimation in their past
forecasts. This was despite the correction they had already made by choosing a shorter base period,
as I have mentioned in section 2.2. This fact, along with the apparent "over-forecast" of male life
expectancy, makes a stronger argument for a downward adjustment in the path for men, which
corresponds well with the results that I have presented.
5.5 The prediction interval (PI)
Including a prediction interval for the life expectancy values would provide a more complete
depiction of the results. As illustrated in figure 5.3, SSB reports results using an 80% prediction
interval, which corresponds to an 80% probability that the true life expectancy value lies within
this interval. Here, I will do the same.
Figure 5.3: SSB's forecasted $e_0$ with 80% PI
Source: Statistics Norway, 2018
Figure 5.4: Forecasted $e_0$ with 80% PI
5.5.1 Finding the variance
The variance is a necessary measure in computing the prediction interval. But due to the logit
transformation of the original data, the variance of the $y_{x,t}$ in model (2) cannot be linearly
translated to a variance of the $e_x$ in the same year. An attempt to derive the variance analytically proved to be
a tedious and complex task. Thus, an alternative method was used to produce an estimated
prediction interval: simulation.
5.5.2 Simulation
The idea behind the simulation is that a prediction interval for life expectancy can be found by simulating the
distribution of $e_x$. This can be done by using the variances of $\alpha_t$, $\beta_t$ and $\varepsilon_t$ in model (2). Random draws are
generated from a normal distribution for each of these parameters, using the
standard deviation derived from the respective variance. The combination of these draws then
forms a data set consisting of $n$ sets of $y_{x,t}$ values. By transforming each of these
into $e_x$, the data set forms a histogram (simulated distribution) of $e_x$ in year $t$, from which
the prediction interval can be found. A complete depiction would require a prediction interval
for every $t$ from 2018 to 2060. But given that the manual method only returns values for a given
year, I will restrict the simulation to the years 2040 and 2060 for both sexes.
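The procedure described above can be sketched in Python. The sketch rests on several assumptions that are not taken from the thesis: model (2) is taken to be $y_{x,t} = \alpha_t + \beta_t Y_{x,s} + \varepsilon_t$ with the Brass logit $y = 0.5 \ln((1 - l)/l)$, one error draw is used per simulated table, and the standard schedule is a purely hypothetical toy curve. All function names are illustrative.

```python
import math
import random

def inverse_logit(y):
    # Assumed Brass convention: y = 0.5 * ln((1 - l) / l), so l = 1 / (1 + exp(2y)).
    return 1.0 / (1.0 + math.exp(2.0 * y))

def simulate_e0(standard_logits, alpha, beta, eps_sd, n_draws=1000, seed=1):
    """alpha and beta are (mean, sd) pairs; returns the sorted simulated e_0 values."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_draws):
        a = rng.gauss(*alpha)
        b = rng.gauss(*beta)
        e = rng.gauss(0.0, eps_sd)
        # l_0 = 1 by definition; ages 1..n follow from the relational model.
        l = [1.0] + [inverse_logit(a + b * y_s + e) for y_s in standard_logits]
        results.append(0.5 * l[0] + sum(l[1:]))
    return results if False else sorted(results)

def interval(sorted_values, coverage=0.80):
    """Empirical central interval, e.g. the 10th and 90th percentiles for 80%."""
    k = len(sorted_values)
    tail = (1.0 - coverage) / 2.0
    return sorted_values[round(tail * k)], sorted_values[round((1.0 - tail) * k) - 1]

# Hypothetical toy standard for ages 1..106 (NOT the Norwegian standard).
toy_standard = [(x - 80) / 16 for x in range(1, 107)]
draws = simulate_e0(toy_standard, alpha=(-0.8, 0.3), beta=(1.0, 0.015), eps_sd=0.04)
lower, upper = interval(draws)
```

Because the standard and the parameter values are toy inputs, the resulting interval only illustrates the mechanics and does not reproduce the results reported here.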
The first step in the simulation is to find the variance of each of the parameters in 2040 and 2060.
a) Variance of $\alpha_{forecast}$
Let's have a second look at equation (17):
$$\alpha_t = \alpha_{t-1} + \mu + \varepsilon_t$$
For 2060 this gives:
$$\alpha_{2060} = \alpha_{2017} + 43 \cdot \hat{\mu}^{\alpha} + \sum_{t=2018}^{2060} \varepsilon_t^{\alpha} \tag{28}$$
The variance of $\alpha_{2060}$ is therefore:
$$var(\alpha_{2060}) = var(\alpha_{2017}) + 43^2 \cdot var(\hat{\mu}^{\alpha}) + 43 \cdot var(\varepsilon_t^{\alpha}) \tag{29}$$
Similarly for 2040, the variance can be found by the following expression:
$$var(\alpha_{2040}) = var(\alpha_{2017}) + 23^2 \cdot var(\hat{\mu}^{\alpha}) + 23 \cdot var(\varepsilon_t^{\alpha}) \tag{30}$$
The results (rounded to 4 decimals):
Table 5.2: Variance of $\alpha_t$ in 2040 and 2060 for men and women
         $var(\alpha_{2060})$   $var(\alpha_{2040})$
Men      0.5636                 0.03848
Women    0.1010                 0.0559
b) Variance of $\beta_{forecast}$
$$\beta_t = \bar{b} + \varepsilon_t \tag{31}$$
$$\beta_{2060} = \bar{b} + \varepsilon_t \tag{32}$$
$$var(\beta_t) = var(\bar{b}) + var(\varepsilon_t) \tag{33}$$
The $var(\bar{b})$ is defined in equation (24).
The results:
Table 5.3: Variance of forecasted $\beta_t$ for men and women
         $var(\beta_t)$
Men      0.0005
Women    0.0002
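The variance expression for the forecasted $\alpha_t$ in equation (29) can be evaluated with a short Python sketch. It uses the rounded standard error and error variance reported for women in model (18). The variance of the 2017 starting value is not reported in the text, so it is set to zero here as a placeholder assumption, and the function name is hypothetical.

```python
def forecast_variance(var_last, se_drift, var_eps, horizon):
    """var(alpha_{T+h}) = var(alpha_T) + h^2 * var(mu_hat) + h * var(eps), eq. (29)."""
    return var_last + horizon ** 2 * se_drift ** 2 + horizon * var_eps

# Women, 2060: SE of the drift and error variance from model (18).
var_alpha_2017 = 0.0     # placeholder assumption, not a value from the thesis
var_women_2060 = forecast_variance(var_alpha_2017, se_drift=0.0058, var_eps=0.0009, horizon=43)
```

With these rounded inputs the result is about 0.1009. The drift term grows with the square of the horizon, so it dominates the variance for long forecasts.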
c) Variance of $\varepsilon_{forecast}$
As mentioned in section 4.2, the variance of the model should be constant, as heteroscedasticity
could lead to inefficiency in the estimates. Therefore, ideally $var(\varepsilon_t)$ should be the same for every
year $t$. When $\alpha_t$ and $\beta_t$ were estimated for every year from 1990 to 2017, the variance for each
estimation was small, but with slightly different values. I therefore set the general variance of the
error term to be the average over the estimations:
$$\frac{\sum_{t=1990}^{2017} var(\varepsilon_t)}{\text{no. of years}} \tag{34}$$
The results:
Table 5.4: Variance of forecasted $\varepsilon_t$ for men and women
         $var(\varepsilon_t)$
Men      0.0029
Women    0.0015
The statistical program R can generate random draws from a normal distribution given a set of
inputs: mean, standard deviation (SD) and number of observations ($n$). Here I set $n = 1000$ for
each generation. Given that I want to find the prediction interval for the forecasted values, these
are set as the mean. The inputs I used are summarized in table 5.5.
The random draws generated 1000 data points for each of the parameters. These were
combined using equation (2), which resulted in 1000 sets of $y_{x,t}$ values for men and 1000 for women, a total of
2000 tables. In the end, computing the distribution of life expectancy at birth ($e_0$) in 2040 and
2060 resulted in the curves illustrated in figure 5.4. For women, the estimated 80% prediction
interval has a lower bound of 85.47 and an upper bound of 90.79 in 2040. For men, the
corresponding bounds are 83.58 and 90.02. In 2060, the bounds are 87.04 and 94.12 for women
and 86.36 and 94.01 for men. There is a large overlap
between the PIs of the two sexes. Men's PI is wider due to a greater dispersion in the data. In 2060
the computation returned medians close to the forecasted values of 91.02 for women and 90.67 for
men. The mean values were 90.77 and 90.36, respectively.
The simulation performed here returned a much wider prediction interval than the one
reported by SSB. In 2060 SSB's prediction lies within a prediction interval of roughly ±2 years.
The prediction interval here is almost double the size, at roughly ±4 years.
The reason for this difference is unclear, because the way SSB constructed its 80%
prediction interval is not well documented (SSB, 2018).
Table 5.5: The values of $\alpha_t$, $\beta_t$ and $\varepsilon_t$ used in the simulation
Parameter            Mean       SD
Men
$\alpha_{2060}$      -1.2084    0.3822
$\alpha_{2040}$      -0.78258   0.2363
$\beta_t$            1.0174     0.0226
$\varepsilon_t$      0          0.0539
Women
$\alpha_{2060}$      -0.8089    0.3177
$\alpha_{2040}$      -0.52424   0.1964
$\beta_t$            1.0097     0.0138
$\varepsilon_t$      0          0.0386
6 Summary and conclusion
Applying the Brass model to Norwegian population data, I found a satisfactory model to use in
mortality forecasting. It was shown that the model fits the data well, overcoming some of the issues
previously raised by Brass. The parameters are easy and intuitive to interpret given the linearity of
the model. As opposed to the Lee-Carter model, where $k_t$ is the only parameter to be forecasted,
the Brass model requires two parameters, $\alpha_t$ and $\beta_t$. This yields a more flexible model.
Not surprisingly, the model returned results that differed from what was reported by SSB. In the
end, the forecasted Brass values resulted in a higher life expectancy at birth in 2060 for both men
and women. Most notably, the great improvement in male life expectancy over the past few decades
led to an aggressive projection of the future trajectory. The same issue was found in SSB's
analysis, and an arbitrary adjustment had been performed in order to depict a more realistic forecast.
Given their justification, a similar line of argument would make a stronger case for using the Brass
model.
One of the critiques of the Lee-Carter model was its complexity. The Brass model also had a few
limitations. In constructing the prediction interval, a simulation had to be conducted as it was
difficult to find the variance of the life expectancy. Overall, the forecasting results that I presented
were greater than those reported by SSB. This could indicate a general overestimation in using the
Brass model, or, alternatively, an underestimation of $e_0$ by SSB.
One possible improvement of my approach is to re-estimate the model by including time-series
expressions for $\alpha_t$ and $\beta_t$ directly in expression (2). In that case, the only parameters to be
estimated are the drift $\mu$ and the constant $b$, as well as the variance of $\varepsilon_t$. Also, one has to select an appropriate starting
value $\alpha_{1970}$. Time constraints did not allow me to implement this approach.
In this thesis, the simple approaches were preferred over the more complex. In the process I may
therefore have left out other suitable methods that could produce more precise results. Given the
many factors that need to be considered in forecasting, among them the appropriate base period,
forecasting models and so on, it is premature to conclude whether the Brass model is better
than the Lee-Carter model. Hopefully, a more comprehensive analysis could prove in favour of Brass.
7 References
Brass, W. (1971). On the scale of mortality. Biological Aspects of Demography, William Brass (ed.). New York: Barnes & Noble Inc.
Carter Hill, R., Griffiths, W. E. & Lim, G. C. (2008). Principles of Econometrics. 3rd edition. USA: John Wiley & Sons, Inc.
Currie, I. (2010). Volatility v Trend Risk: A technical note on estimating and forecasting with random walk with drift. In: Longevitas (online document). [updated (21.04.2019); read (01.05.2019)]. Available from <https://www.longevitas.co.uk/site/informationmatrix/volatilityv.trendrisk.html>
Folkehelseinstituttet (FHI) (2018). Life expectancy in Norway. In: Public Health Report - Health status in Norway (online document). Oslo: Norwegian Institute of Public Health [updated (04.10.2018); read (01.04.2019)]. Available from <https://www.fhi.no/en/op/hin/population/life-expectancy/>
Himes, C. L., Preston, S. H. & Condran, G. (1994). A Relational Model of Mortality at Older Ages in Low Mortality Countries. Population Studies, Vol. 48(2), pp. 269-291.
Keyfitz, N. (1991). Experiments in the Projection of Mortality. Canadian Studies in Population, Vol. 18(2), pp. 1-17.
Lee, R. D. & Carter, L. R. (1992). Modeling and forecasting U. S. mortality. Journal of the American Statistical Association, 87(419), pp. 659-671.
Newell, C. (1988). Methods and models in demography. Great Britain: Belhaven Press.
Meslé, F. & Vallin, J. (2011). Historical trends in mortality. In: R.G. Rogers & E.G. Crimmins (eds). International handbook of adult mortality. New York: Springer, pp. 9-47.
World Health Organisation (WHO) (2000). WHO System of Model Life Tables. GPE Discussion Paper Series. No. 8, World Health Organisation.
Preston, S. H., Heuveline, P. & Guillot, M. (2001). Demography: measuring and modeling population processes.
Oksuzyan, A., Juel, K., Vaupel, J. W. & Christensen, K. (2008). Men: good health and high mortality. Sex differences in health and aging. Aging Clinical and Experimental Research, Vol. 20(2), pp. 91-102.
Rowland, D. T. (2003). Demographic methods and concepts. New York: Oxford University Press Inc.
Statistics Norway (2018). Norway's 2018 population projections: Main results, methods and assumptions. Report no. 2018/22, Statistics Norway.
Statistics Norway (2018). Statistikkbanken: Døde, dødelighetstabeller. [updated (28.06.2018); read (01.04.2019)]. Available from: <https://www.ssb.no/statbank/table/07902/>
Stewart, Q. T. (2004). Brass' Relational Model: A Statistical Analysis. Mathematical Population Studies, Vol. 11(1), pp. 51-72.
Sundberg, L., Agahi, N., Fritzell, J. & Fors, S. (2018). Why is the gender gap in life expectancy decreasing? The impact of age- and cause-specific mortality in Sweden 1997-2014. International Journal of Public Health, Vol. 63(6), pp. 673-681.