The Forecasting and Impact of the Loss of Critical Human ... Manuscript.pdf · The Forecasting and...

Accepted for Publication: IEEE Trans. on Engineering Management

The Forecasting and Impact of the Loss of Critical Human Skills Necessary for Supporting Legacy Systems

Peter Sandborn and Varun J. Prabhakar

CALCE Electronic Products and Systems Center Department of Mechanical Engineering

University of Maryland College Park, MD 20742 USA

Index Terms

Workforce planning, workforce management, obsolescence, DMSMS, life-cycle cost,

organizational forgetting, legacy systems

Abstract

The loss of critical human skills that are either non-replenishable or take very long periods of

time to reconstitute, impacts the support of legacy systems ranging from infrastructure, military

and aerospace to IT. Many legacy systems must be supported for long periods of time because

they are prohibitively expensive to replace. Loss of critical human skills is a problem for legacy

system support organizations as they try to understand and mitigate the effects of an aging

workforce with highly specialized, low-demand skill sets. The existing research focuses on

workers that have skills that are obsolete and therefore need to be retrained to remain

employable; alternatively this paper addresses the system support impacts due to the lack of

workers with the required skill set. This paper develops a model for forecasting the loss of

critical human skills and the impact of that loss on the future cost of system support. The model

can be used to support business cases for system replacement. A detailed case study of a legacy

control system from the chemical manufacturing industry is provided and managerial insights

associated with the support of the system drawn.


Managerial Relevance Statement

The model developed in this paper allows managers to provide quantitative answers to the

following questions: 1) when will the impact of the loss of critical human skills be felt, i.e., is

there a point in the future where the system starts to become more expensive to support; 2) “how

bad is bad?”, i.e., when will the cost of supporting the system become prohibitive; and 3) is there

an alternative to replacing the system?, i.e., could a change in hiring or training practices (if

possible) resolve the problem, and if so, what hiring rates are needed? The case study provided

in the paper demonstrates the loss of younger workers (seeking what they perceive to be more

lucrative opportunities) and clearly supports a business case for the timing of future system

replacement.


I. INTRODUCTION

There are many types of mission, safety and infrastructure critical systems that have very

long field support lives of 20-30 years or more. These systems are often referred to as legacy

systems because they are based on, or are composed of, out-of-date (old) processes,

methodologies, technologies, parts, and/or application software.1 Due to mismatches between

the fast rate of technology advancement and the massive expense associated with the

replacement of these mission, safety and infrastructure critical systems, legacy system

dependence is likely to be a reality forever and many of today’s new systems are guaranteed to

become tomorrow’s legacy systems. Examples of legacy systems include military systems,

industrial and transportation control systems, and communications infrastructure. Technology

and part obsolescence are common problems faced by these systems, e.g., the inability to procure

spare parts to maintain the systems through the system’s support life. This problem is commonly

referred to as Diminishing Manufacturing Sources and Material Shortages (DMSMS) and is

particularly challenging for electronic part management, [1]. Obsolescence, however, is not

solely a hardware problem; it also impacts software [2] and the human skills necessary to

maintain legacy systems. Human skills obsolescence is a growing problem for organizations that

must support legacy systems. These organizations need to understand, forecast and mitigate the

effects of an aging work-force with highly specialized (and in many cases effectively

irreplaceable) skill sets.

Product manufacturing and support organizations require a dependable pool of skills so that

they can perform their businesses efficiently and without interruption. However, the pool of

available people and their associated skills is always in flux, requiring careful management and

1 In software engineering, legacy systems definitions are usually confined to software and architecture, this paper uses a broader definition that includes hardware and processes as well.


forecasting. Workforce planning is commonly defined as getting the right number of people

with the right competencies in the right jobs at the right time. There are numerous issues that

arise associated with mismatches between the skills possessed by the workforce and the skills

needed by employers, which we broadly classify into the following three categories: skills

obsolescence, skill shortage, and critical skills loss.

Skills obsolescence (also referred to as human capital obsolescence) is the situation in which

workers (for whatever reason) do not have the skills that they need, which includes workers that

have skills that are obsolete and need to be retrained. De Grip and Van Loo [3] define five types

of skills obsolescence: the wear of skills due to aging or illness that may be related to working

conditions; the atrophy of skills due to insufficient use (skills decay); job-specific obsolescence

due to technological and organizational change, including skills that are simply no longer needed

(skill depreciation); sector-specific obsolescence due to shifts in employment; and firm-specific

skills obsolescence due to displacement. A significant body of literature is devoted to the

analysis of skills obsolescence, e.g., [4] and the references contained therein. Skill shortage

refers to insufficient current skill competences, e.g., [5]. Skills shortage expresses the need to

identify, train and retain the workforce to fill current and expected future skill gaps. Skills

shortage also includes an organization’s failure to protect core skill competencies in economic

downturns [6].

The third category is critical skills loss, which is the topic of this paper. Critical skills loss

describes the loss of skills that are either non-replenishable or take very long periods of time

(many years) to reconstitute. Critical skills loss is a special case of “organizational forgetting”.

Organizational forgetting describes the situation where organizations lose knowledge gained

through learning-by-doing due to labor turnover, periods of inactivity, and/or failure to


institutionalize tacit knowledge [7]. Most existing analyses of organizational forgetting

implicitly assume that the situation is relatively short term and seek to forecast the recovery

period and the associated disruption in productivity and schedule, e.g., [8-10]. Critical skills loss

(the subject of this paper) represents a permanent (non-recoverable) and involuntary form of

organizational forgetting. Critical skills loss is usually the result of long-term (20+ years)

attrition where skilled workers retire and there are an insufficient number of younger workers to

take their place.2 Critical skills loss is not the result of inactivity, poor planning, or a lack of

foresight by an organization. It is simply the inevitable outcome of the organization’s

dependence on specialized skills for which there is relatively little, but non-zero, demand (which

is also the nature of DMSMS-type obsolescence problems for hardware and software, see [1]).

System support and management challenges created by the loss of necessary human skills have

been reported in a number of industries including healthcare [11], nuclear power [12], aerospace

[13], and other enterprises [14]. For example, Goodridge and McGee [15] and Hilson [16]

discuss the IT industry and the shortage of mainframe application programmers experienced in

legacy applications – the appropriate skills are no longer taught and younger workers are not

interested in learning them. The loss of critical skills problem is particularly troublesome for

organizations that must support legacy systems for long periods of time, e.g., 20-30 years or

longer. For many defense systems, the problems are potentially devastating: “Even a 1-year

delay in funding for CVN-76 [aircraft carrier] will result in the loss of critical skills which will

take up to 5 years to reconstitute through new hires and training. A longer delay could cause a

permanent loss in the skills necessary to maintain our carrier force.” [17].

2 In the case of legacy system support, 5 (or in some cases much longer) years of on-the-job experience by an engineer may be necessary to become competent.


As the human capital capable of supporting a system shrinks, eventually the time required to

support the system will increase – the objective of this paper is to understand when the increase

will occur and its magnitude. Increases in support time increase business interrupt time for

manufacturing causing a loss of revenue. Increases in support time in the service, transportation

and defense industries can decrease system availability, which can lead to loss of revenue,

compromised safety, and possibly result in the loss of life (e.g., air support for Marines

compromised by lack of helicopters due to repair delays). Due to prohibitive costs, legacy

systems are often not replaced until the consequences of their age (and possible lack of support)

result in catastrophic failures, or a viable replacement business case can be constructed and sold

to their customers.3 In order to construct business cases, management must be able to provide

quantitative answers to the following questions: 1) when will the impact of human obsolescence

be felt, i.e., is there a point in time where the system starts to become more expensive to support;

2) “how bad is bad?”, i.e., when will the cost of supporting the system exceed my profit margin

(or budget)? 3) “is there an alternative to replacing the system?” could a change in hiring or

training practices (if possible) resolve the problem? The model described in this paper provides

a forecast of the loss of human skills as a function of time and the associated impact of that loss

on the system support cost. These outcomes can be used as direct support for system

replacement business case development.

A. Existing Work

The existing literature suggests that some of the possible causes for critical skills loss

include: declines in education and training (e.g., university educated engineers are no longer

3 For example, “customers” may be the company owners (e.g., manufacturing system replacement), the tax-paying public (e.g., civil infrastructure replacement), or governments (e.g., military asset replacement).


trained in the programming languages used in many legacy systems, e.g., [18]); younger workers

that are discouraged from entering particular workforces that they perceive to be in decline (e.g.,

nuclear power [12]) or not cutting-edge [19,20]; younger workers leaving legacy system

sustainment jobs to pursue what they perceive to be more lucrative and exciting opportunities

(this observation is partially supported by Fig. 2 in Section III, which shows the distribution of

exit age for a legacy control system); shrinking “feeder” occupations (e.g., the U.S. Navy feeding

skilled workers to the nuclear power industry [12]); older workers being protective of what they

know (not passing along their knowledge in order to protect their jobs, e.g., [21]); and

differences in social and cultural influences between younger and older workers regarding the

perceptions of their jobs [15].

While there are many qualitative statements of the problem of critical skills loss and non-

replenishable knowledge loss [11-17], to the authors’ knowledge there is no quantitative

forecasting of its rate or impact. The majority of existing research addresses skills obsolescence,

which is the opposite of the problem addressed in this paper wherein workers’ skills are no

longer useful or become outdated as a result of automation and the rapid growth in technology,

e.g., [3,4,22-25]. These works address ways to mitigate skill “decay” over time for a given

workforce. The existing work on critical skill loss focuses on knowledge preservation – the

realization that non-replenishable knowledge is being lost and the need to capture it, [26,27].

There is also work on retirement wave planning [28], but this work is focused on head count not

skill content.

The analytical techniques presented by Bohlander and Snell [29] modeled a situation similar

to critical skills loss, however, they do not include the attrition or the costs associated with the

varying availability of the workers. Bordoloi [30] developed a model for workers that are at


different skill levels entering and exiting a company; additionally, the rate at which the employer

gains and loses employees is also taken into account in their planning model. However,

Bordoloi [30] stops short of estimating the experience in the skills pool as a function of time and

thereby determining the impact that critical skills loss has on the support of systems. Huang et

al. [31] developed a planning model using a computer simulation where the goal was to

determine an ideal hiring rate based on the different levels of skill that workers had. Although

this model uses workforce simulation and calculates an ideal hiring rate, this model does not

account for the costs incurred by the unavailability of workers in their determination of an ideal

hiring rate. Holt [32] points out that the traditional understanding of a workforce has been in

terms of the “physical sum of people employed,” which is the basis for most common workforce

planning models. A more useful approach is to observe workers in an organization as assets with

varying skill levels. Holt defines human capital investment as the “total amount and quality of

talent, knowledge, expertise, and training that these workers possess.” Holt minimizes cost while

maintaining balanced levels of human capital. The model is based on categorizing workers by

“classes” wherein all workers in a class share similar attributes. However, the Holt model lacks

aging and age-related effects of individual workers over time.

The maintenance workforce planning literature, e.g., [33-36], focus on resource allocation

and scheduling optimization. The goals of a maintenance policy are maximizing plant

availability and minimizing cost via the timely presence of maintenance workers. Koochaki et

al. [33] points out that “maintenance workers are usually highly skilled and therefore difficult to

recruit” and that “the efficient and effective use of a scarce maintenance workforce is very

important”. Koochaki et al. [33] considers the impact of maintenance resource constraints (i.e.,

limited maintenance workers) on the grouping of maintenance activities and comparing


condition-based maintenance (CBM) and age-based replacement. Ahire et al. [36], minimizes

the makespan (the total length of the schedule) for a sets of preventive maintenance tasks when

there are workforce availability constraints. Other papers in this area have treated the

influence of CBM on workforce planning and maintenance scheduling, e.g., [33] and references

contained therein, however, these papers focus on the determining the optimum size of the

maintenance workforce. None of the existing work treating the maintenance workforce

addresses the problem of loss of irreplaceable maintenance resources.

The existing literature makes no attempt to study or verify the postulated causes of critical

skills obsolescence, or quantify its impacts on an organization. The objective of this paper is to

develop a model that uses existing historical workforce data (that organizations have) to forecast

the future impacts of that loss and to relate the impacts to management decision. This paper is

concerned with the dynamics of the cumulative skills pool (“head content”) and its relation to the

cost of sustaining a system as a function of time. The questions that need to be addressed are

“what will today’s skills pool look like after ‘x’ years?” and “what impact will the future skills

pool have on the ability to continue to support the system?” The model described in this paper

offers a projection into the future and a way to quantify cumulative experience versus the influx

of new inexperienced workers and worker departures. This model enables management to not

only project future costs, but also to make quantitative business cases for system replacement.

Section II describes the dynamic human skills model developed in this paper. This is

followed in Section III by a numerical case study of a legacy control system from the chemical

manufacturing industry. The case study includes the analysis of the life-cycle cost impacts of

critical skills loss and the managerial insights derived from the analysis.


II. A MODEL FOR ESTIMATING THE HUMAN SKILLS POOL OVER TIME

The motivation for the model developed in this section is to forecast the future change in

“experience” relative to today’s skills pool. The model assumes that sufficient experience to

support the system exists today, and determines how long it takes for the experience to decrease

to a point where there is an impact on the cost and how large that impact will be.

The model presented in this section estimates the number of skilled employees (“head count”

or pool size) and the cumulative experience in the skills pool (“head content”) in order to

determine the resources that will be available to maintain a legacy system or family of systems in

the future. The cumulative experience in the skills pool impacts the time to perform the

maintenance activities to sustain a system. The time to perform maintenance to support a system

is a significant cost driver for availability-centric systems. Availability is the ability of a service

or a system to be functional when it is requested for use or operation and is determined by the

combination of reliability and maintainability. For many systems, the cost of unavailability is

very large, e.g., high-volume manufacturing, airlines, medical devices, etc.

The assumptions and parameters used to build the model developed in this section were

driven by the actual data available for the numerical case study (see Section III). In several cases

the simplest possible assumption was made because there was no data to support more complex

assumptions – these assumptions are addressed as they appear in the model. The model

developed in this section has four primary inputs: the distribution of current ages (fC), the

distribution of hiring ages (fH), the distribution of exit ages (fL) and the hiring rate (H). The data

from which the distributions can be constructed can be readily obtained from human resource

records kept by organizations. Using these inputs, we wish to determine the quantity of people


of a specific age in the pool as a function of time.4 Once the quantity of people of each age at a

point in time are computed, the model converts the head count to cumulative experience at that

point in time using a parametric relationship derived from actual data. The experience can then

be used to calculate the time to resolve a maintenance event and that time (along with the

quantity of maintenance events to be resolved) can be used to calculate the cost.

A. The Pool Size and Experience Calculation

In order to assess pool size and experience, we have to project (i.e., forward calculate into the

future) the experience of the people in the pool based on when they were hired while accounting

for age related loss.

The level of experience within the skills pool varies over time and can be determined from

the following: 1) the new hires added to the skills pool; 2) the attrition rate or loss of skilled

employees due to job changes, promotions, retirement, etc.; and 3) accounting for the varying

skill levels of the people in the pool and how those skill levels improve as people remain in the

pool.

Initially, we assume that the organization’s policy dictates the hiring rate, H, resulting in a

pool with varying skills that can be modeled over time. Since generally the data (see Section III

for example) is only available on an annual basis, we utilize discrete one year time steps, i,

throughout the analyses. Let H be the new hires per year as a fraction of the pool size at the start

of the analysis period (i = 0). Let fC be the probability mass function (PMF) of age for the current

skills pool, fH be the PMF of age for new hires, and fL be the PMF of age for people exiting the

4 Both stationary and non-stationary models are considered herein, i.e., stationary models assume that the distributions do not change over time and non-stationary models allow them to change over time governed by drift and volatility parameters.


pool (attrition). Then, Ni(a), the net frequency of people in the pool of age a during year i, is

given by,

)(1)()1()( 1 afaHfaNaN LHii (1)

where, i is the number of years from the start of the analysis period. The first term in the

brackets is the current pool, the second term in the brackets is the new hires, and the multiplier

models the retention rate. Note, (1) assumes that the hiring rate, H, is the same for all ages. As a

boundary condition, we require that at i = 0, N0(a) = fc(a) corresponding to the current

distribution of worker age in the skills pool. Ni(a) is a fraction that is measured relative to the

starting (i = 0) pool; for example, the number of people of age a in year 0 would be the product

of N0(a) and the total initial (i = 0) head count. Therefore, the cumulative net frequency of people

in the skills pool, NNET, in year i is the sum of Ni(a) over all the ages given by,

r

yaiNET aNiN )()( (2)

where, r is retirement age and y is the age of the youngest employee in the skills pool. We use

the term “net frequency” because NNET(i) can be greater than 1, which can occur if the number of

new hires exceeds the number of people exiting the pool. The age of the youngest worker, y,

corresponds to the age of the youngest worker at the start of the analysis period (from fC) or the

youngest worker that is hired during the analysis period (from fH). This model assumes that a

mandatory retirement age, r, exists. Early retirement along with other reasons for the loss of

skilled employees are captured in the PMF for attrition, fL. Conversely, the PMF of retention, fR

for the skills pool can be estimated annually by, fR(a) = 1 – fL(a).

Estimating the size of the pool (head count) over time is necessary but not sufficient to

capture the ability to support a system into the future because not all workers in the pool have an

equivalent level of experience and therefore an equivalent level of “value” to the support of the


system. The following analysis estimates cumulative experience in order to track the true value

of the pool of skilled workers where “experience” is defined as the length of time that a worker

has spent in a particular position. The cumulative experience of the pool in year i, Ei, can be

calculated using,

r

yaiEEi aNIaRE )( (3)

where, RE and IE are parameters used to map age to effective experience measured in years (RE

and IE are determined using a curve fit to actual data, e.g., see Section III.B). Note, “experience”

has units of time, however, Ei is only used in the model to determine the change in cumulative

experience from the initial condition.

The cumulative experience of the pool in year i as a fraction of the current pool experience,

E0, can be calculated as,

0E

EE i

ci (4)

Using the cumulative experience, the time to perform maintenance in year i is given by,

ic

mm

i E

TT 0 (5)

where, mT0 is the time to perform a maintenance activity with a skills pool having E0 experience

at time i = 0. The time to perform maintenance modeled in (5) increases as experience decreases

due to several factors: 1) less experienced workers may require more time to perform the task

(learning curve effects), and/or 2) the number of workers who can perform the required

maintenance activity is smaller, so they may not (for example) be available at every site, i.e.,

they may have to travel from a different location.


All of the development so far assumes a fixed hiring rate that may or may not be sufficient to

maintain the current level of experience. The model can be used to estimate the hiring rate

required to maintain the current level of experience over the system’s support life. The required

hiring rate (assuming hiring is possible) in year i, Hi, can be solved for from (1) as,

)(

1)1(

)(1

)(1 af

aNaf

aNH

Hi

L

ii (6)

subject to the following constraint which ensures that the cumulative experience is constant at

E0,

10

E

EE i

ci (7)

The simplest assumption for solving (1) or (6) is that we have a “stationary process”. A

stationary process is a stochastic process whose joint probability distribution does not change

when shifted in time or space (time is the relevant parameter for this model). For a stationary

process, the mean and variance do not change over time. In reality, hiring age (fH) and exit age

(fL) are not stationary processes. To model the problem as a non-stationary process, we model

the change in the mean of the distribution using geometric Brownian motion,5

tttt dWXdtXdX (8)

where Xt is the value of the quantity being simulated at time t (the mean of the hiring age or exit

age distributions in our case), μ is a drift component, σ is a variance component (also called

shock or volatility), and Wt is a Brownian motion (also known as a Weiner or continuous-time

stochastic process). Wt is modeled as dt in our case. No shape change in the distributions is

modeled (because no case study information could be obtained to support it).

5 Continuous-time models (such as Brownian motion) are commonly used to model continuous degradation processes for asset management of complex engineering systems [37].


B. The Cost Impact of Human Skills Obsolescence

The most immediate impact of the loss of critical human skills for the support of legacy

systems is in the ability to perform system support activities in a timely manner. In general,

corrective maintenance costs consist of: spare parts, labor, downtime, overhead,

consumables/handling, and equipment/facilities [38]. When a maintenance event occurs for a

system, the cost of performing the necessary maintenance action is modeled using,

bi

c

m

bli

c

mp

imi C

E

TfC

E

TCC

i

j

i

00 (9)

where the term in brackets is the maintenance time from (5), jbf is the fraction of the

maintenance events of severity level j that result in a business interrupt, piC is the cost of

replacement parts (if replacements are needed) in the ith year, liC is the cost of labor (per unit

time) in the ith year (assumed to be burdened, i.e., includes overhead), and biC is the cost of

business interrupt (per unit time) in the ith year. All equipment and facilities necessary for

performing maintenance are assumed to be subsumed in the overhead, and consumables and

handling are assumed to be negligible. liC , b

iC and piC are discounted to a base year using the

weighted average cost of capital. piC may include holding (inventory) costs if the part is obsolete

in year i and was procured in a lifetime or bridge buy at an earlier time.6 The failure modes of

6 A lifetime buy refers to purchasing a sufficient number of parts to sustain a system through its entire support life (until its end of support date) at the point in time when the part is discontinued (i.e., the point in time when the part is no longer procurable from its original manufacturer), while a bridge buy only procures enough parts to sustain the system to a redesign that will replace the part [39]. Lifetime and bridge buys are a common obsolescence mitigation approach used for managing DMSMS of electronic parts where the ability to procure parts after discontinuance may be limited or introduce unacceptable risks of procuring counterfeit parts. If parts are purchased at a lifetime or bridge buy, the cost of the replacement part (this is a per part cost) in year i is given by,


the system are assigned “severity levels” (j) that are correlated to business interrupt fractions jbf

that can range from 0 to 1 (0 = no business interrupt, 1 = all the maintenance time is business

interrupt). Time-to-failure distributions for each severity level are constructed from the time-to-

failure distributions of the relevant failure mechanisms and these are sampled to create

maintenance events using a discrete event simulation (see Section III).

The default model for maintenance time given in (5), which is embedded in (9), assumes that

only the pool experience impacts the time to perform maintenance. The problem is that an

organization may have sufficient experience in the pool, but there is a minimum head count that

must be staffed per system (a plant, for example, would be a single installation of a system under

analysis). If head count drops too low, maintenance time may be impacted even if sufficient

experience exists. For such a case, the average number of people per plant is given by,

N

MfP origs (10)

where fs is the pool size relative to today, Morig is the original number of people (corresponding

to a fully staffed organization), and N is the number of plants. Rearranging (10) to solve for the

threshold (minimum) pool size gives,

orig

s M

NPf

threshold

min (11)

When thresholdss ff the maintenance time is penalized using,

i

pp

i

ni

nkkn

pp

i w

CCni

w

I

w

CCni

)1( :For ;

)1()1( : For

1

where pC is the purchase price of the part, w is the weighted average cost of capital, n is the year of the

lifetime buy or bridge buy, and I is the holding cost per part per year. Note, both i and n are measured relative to the base year for money.


s

s

c

mm

i f

f

E

TT threshold

i

0 (12)

and (9) becomes,

li

bib

mi

pi

mi CCfTCC (13)

Equation (13) is used to compute the cost of maintenance events in the ith year of system

support.

III. NUMERICAL CASE STUDY

This section presents a case study that implements the model developed in Section II with

data from a chemical product manufacturing company. The system considered is a legacy

control system (developed in the 1970s) with over 2000 instances (plants) installed and currently

supported worldwide. The control system has two identical controllers designed so that one can

take over for the other upon failure (or stoppage for maintenance). The system consists of

software and obsolete electronics hardware, which is supported via existing lifetime buys and

aftermarket procurements. The software is machine code and represents the most serious critical

skills loss problem. The model developed in Section II requires three primary distribution

inputs, which are determined from actual data: the current age distribution (fC), the distribution of

hiring age (fH) and the distribution of exit age (fL). However, the two distribution inputs that are

available from the actual data are the hiring age (fH) and a current age distribution (fC) shown in

Fig. 1. Notice that the current age distribution (Fig. 1b) has a mode of approximately 55 years of

age, which, unfortunately is very close to the early retirement age from the company and

demonstrates the issue that this paper focuses on.


A. Synthesizing an Exit Age (fL) Distribution

The exit age distribution (fL) for this case study is not known and needs to be synthesized in a

way that is consistent with the known hiring age and current age distributions shown in Fig.1.

A simulator was created that starts with the distribution in Fig. 1a and uses a manually

defined exit age distribution to generate the current age distribution in Fig. 1b. The simulator

assumes that the hiring age and exit age distributions remain constant over time (a “stationary

process”) and that the company is propagated long enough to reach a steady state.7 The exit age

distribution was adjusted until a result that approximately matches the known current age

distribution (Fig. 1b) is obtained. We used 5 year increments in the simulator (assuming that this

allows perturbations in year-to-year hiring and exiting to be averaged out). The resulting

synthesized exit age distribution is shown in Fig. 2.

7 Note, the assumption that the hiring age and exit age distribution models remain constant over time is NOT an assumption of the general model developed in Section II. This assumption is specific to this stationary process case study. Note, in Section III.D the non-stationary process version of the case study lifts this assumption.

0

10

20

30

40

50

60

15 20 25 30 35 40 45 50 55 60 65

Count

Hiring Age into MOD

0

10

20

30

40

50

60

70

15 20 25 30 35 40 45 50 55 60 65 70

Count

Current Age in MOD

(a) (b)

Fig. 1. Actual data: (a) hiring age distribution (fH), (b) current age distribution (fC).


The distribution in Fig. 2 is a “bathtub curve”.8 It shows that a large portion of the workers

exit early – most of these are workers who are changing jobs within the company. For the

company modeled in this case study, it has been difficult to retain new people to support the

legacy system as younger employees relocate to other job opportunities that they perceive as

having better long-term career prospects. Between the ages of 45 and 60 almost no workers

leave, and above 60 the workers are retiring.

B. Mapping Pool Size to Experience

The distributions shown in Figs. 1 and 2 only capture the number of workers (pool size), not

their relative experience. To get from pool size to experience one more input is needed: the

mapping from age to experience within the company. To generate the parameters for the

mapping function in (3), both the years of experience and the years of service to the company

were considered (Fig. 3). The following weighting was adjusted to maximize the correlation

coefficient between the two:

8 The bathtub curve is commonly used in reliability engineering to describe a particular form of the hazard function comprised on infant mortality, random failures, and wear-out. In our case the hazard rate (instantaneous failure rate in time) is analogous to the exit rate (in age).

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

15 20 25 30 35 40 45 50 55 60 65 70

Fraction of all people exiting the pool

during this bin

Synthesized Exit Age from MOD

Fig. 2. Synthesized exit age distribution (fL).


Weighted Experience = 0.32(Years of Experience) + 0.68(Years in the Company) (14)

This weighting does not imply that years in the company is twice as important as experience -

experience is embedded within the years in the company.

At some point, workers cannot gain any more experience. Depending on the role of the

particular workers, experience was capped at between 5 and 15 years, i.e., the EE IaR term

in (3) is limited to a maximum value. Ideally, if enough data was available, we could have

synthesized separate hiring, loss and current age distributions for each role, however, the number

of usable records available did not allow this.

The best-fit line shown in Fig. 3 gives the values of the parameters in (3) as RE = 0.8446 and

IE = -21.068 for this case study.

C. Stationary Process Case Study Results

This section presents a case study that implements the model developed in Section II with

data from a chemical product manufacturing company. The fC, fH and fL distributions are shown

in Figs. 1 and 2. The analysis in this section assumes that all the input distributions are

y = 0.8546x ‐ 21.068R² = 0.708

0

5

10

15

20

25

30

35

40

0 10 20 30 40 50 60 70

Experience

Age

Average of Years of Service and Years of MOD Exp

67% confidence intervals

Fig. 3. Weighted experience to worker age correlation. R2 = 0.708.


stationary (unchanging) for the entire analysis period. The mandatory retirement age, r, is

assumed to be 65 years at which workers are required to leave the skills pool, and the youngest

worker is y = 22.

Figure 4a shows NNET, the net pool size (number of workers) over time as a fraction of the

pool size in 2010. Fig. 4b shows the relative experience (relative to 2010), and Fig. 4c shows the

average age of the workers in the pool. These results assume no hiring, H = 0. The results

indicate that although a 10% drop in head count occurs in the first 6 years, the experience

remains approximately constant (existing workers are gaining enough experience to offset the

drop in head count). After 2016, the experience drops quickly as the most experienced workers

leave and are not replenished.


If the lost skills are replenishable, we now wish to estimate the future hiring rate, Hi, required

to preserve the current level of experience, E0, in the skills pool. The analysis identifies a

solution for the system of equations (the objective function in (6) is subject to the constraint in

(7)) to determine the annual hiring rate, Hi, needed to replenish the loss in cumulative experience

0

0.2

0.4

0.6

0.8

1

1.2

2010 2015 2020 2025 2030 2035 2040 2045 2050 2055

Pool Size Relative to Today

Year

0

10

20

30

40

50

60

70

2010 2015 2020 2025 2030 2035 2040 2045 2050 2055

Average Age

Year

0

0.2

0.4

0.6

0.8

1

1.2

2010 2015 2020 2025 2030 2035 2040 2045 2050 2055

Relative Experience

Year

(a)

(b)

(c)

Year

Year

Year

Fig. 4. (a) Pool size, (b) relative experience, and (c) average age. No hiring (H = 0) assumed.


as a result of attrition and retirement. The estimation of the required hiring rate can be performed

at any point during the analysis or as frequently as needed to replenish the skills pool and

maintain the desired experience level. For this example, the operation is repeated periodically at

1 year time steps until the end of the analysis period (end of the system’s support life). Figure 5

shows results for hiring rate, Hi, as a function of the number of years from the start of the

analysis.

Figure 5 shows that no hiring is initially required (note, hiring is not allowed to drop below

0, which would reflect a layoff situation). Starting in 2017 a hiring rate of over 6% is required

for 9 years and settles to 2-5% thereafter. Note, when H is greater than zero in (1), the hiring

rate is assumed to be applied to the entire hiring age distribution, fH. It is important to note that

the required hiring rate in Fig. 5 takes into account both the time required to learn the skills

necessary to support the system and the exit age distribution found in Fig. 2.

The impact of the results developed in this example appears in Fig. 6, which shows the

annual cost of supporting the legacy control system from the chemical product manufacturing

company through year 2040 (over 2000 instances of the system require support in this case).

The cost model used is a stochastic discrete event simulator that samples time-to-failure

0.00%

1.00%

2.00%

3.00%

4.00%

5.00%

6.00%

7.00%

8.00%

9.00%

10.00%

2010 2015 2020 2025 2030 2035 2040 2045 2050 2055Required Hiring Rate to Maintain Pool Size

or Experience (%)

Year

Fig. 5. Required hiring rate, Hi, (if hiring is possible) to maintain pool experience.


distributions for the components of the control system to obtain maintenance events (event dates

and components that need replacement). Subsystem-specific (and severity category specific9)

failure distributions are sampled to obtain failure dates for the system. At each maintenance

event, maintenance resources are drawn and a cost is estimated using (13). The majority of the

maintenance events do not result in business interrupt time because they only impact one of the

two parallel control systems and jbf = 0, however, a small fraction (the most severe events)

result in dual control system failures. The risk of dual failures and the resulting business

interrupt is captured by the differing severity categories. The specific data associated with the

system count, the subsystem/severity category reliabilities, and the cost of business interrupt

time is proprietary to the customer and not included here.

9 Severity categories indicate both the level of maintenance required (which translates into the maintenance resources needed) and the amount of business interrupt (if any) associated with the maintenance event.

$0

$2,000,000

$4,000,000

$6,000,000

$8,000,000

$10,000,000

$12,000,000

$14,000,000

$16,000,000

2010 2015 2020 2025 2030 2035 2040

Support Cost

Year

$2M

Human obsolescence not included

Human obsolescence included in cost model

Fig. 6. Annual cost to support a legacy system with and without the inclusion of skills obsolescence (H = 0). For this figure w = 0 (no cost of money), therefore, the results are actual dollars.


For the case study, thresholdsf was determined to be 0.54, or once the number of people in the

pool drops below 54% of the number that are in the pool today (in 2010), extra maintenance time

penalty (modeled by (12)) will occur.

Two results are shown in Fig. 6. The results show that there is little effect of skill

obsolescence prior to 2025, after 2030 the impact of the loss of skills becomes significant. Note,

in year 2028 the availability of spares (hardware) also becomes a problem resulting in the cost

step shown between years 2028 and 2030. The lowest curve in Fig. 6 assumes that there is no

human obsolescence (icE = 1 for all i in (9)). Even in this case, the annual cost increases due to

part obsolescence that results in lifetime buys of parts that tie up significant capital in pre-

purchased parts and long-term holding costs is evident. The highest cost curve shown in Fig. 6

assumes that no replenishment of lost skills is possible (H = 0, which is close to reality for this

case study).

D. Non-Stationary Process Case Study Results

All the results presented so far are for the stationary process, i.e., the hiring age and exit age

distributions assumed to be the same at all times in the future. Figure 7 is for the same case as

Fig. 4b except that a non-stationary model with the following parameters: μ = 0.005, σ = 0.01

(hiring age), and μ = -0.005, σ = 0.01 (exit age) has been used. 100 samples are shown in Fig. 7

and the stationary result is indicated. It is evident that the stationary result represents a better-

than-average case. A histogram of the experience in 2021 relative to 2010 is shown as an inset

to demonstrate the utility of the non-stationary solution.


E. Managerial Insights

Figure 6 is only an example result but it serves to demonstrate how the model developed in

this paper can be used to draw important managerial insights. The two questions posed in the

introduction to this paper were: (a) when will the company start to see the effects of the loss of

critical skills, and (b) how bad will it be? Based on Fig. 6, one would conclude that for the

example considered in the case study the impact of human obsolescence will not be noticed until

2030 and the cost of supporting the system as a result of impending human skills obsolescence

will increase to over $13M/year by 2040. What could management do with this information?

This paper offers a means to estimate the theoretical hiring strategy for replenishing a skills pool

based on experience. The result shown in Fig. 5 emphasizes a potential deficiency in the

planning and allocation of human skills to address future sustainment needs - unfortunately, in

the case study described in this paper, changing the hiring profile to that shown in Fig. 5 is

impractical (practically speaking, the hiring into the sustainment groups is close to zero). An

alternate strategy would be to manage sustainment needs through planned “phase-out” of

Fig. 7. Non-stationary model. Relative experience. No hiring (H = 0) assumed.


existing systems. The result in Fig. 6 tells management that a replacement system needs to be

developed and deployed to the plants before 2030; this information can be used as part of a

business case made to higher levels of management to support long-term strategic planning. It

should be noted (and as demonstrated in Fig. 7) that the stationary solution in Fig. 6 is a better-

than-average case. Also, as discussed in the next section, while in the case study “break-fix”

may be adequately supported until 2030, the personnel performing break-fix tasks provide

additional value that will be lost long before 2030.

In addition to the application-specific insights, Fig. 2 in the case study supports the

supposition that younger workers leave legacy system sustainment jobs for other jobs in the

company - exit age peaks at 35 years and nobody exits the company between 45 and 55 years.

IV. DISCUSSION AND CONCLUSIONS

The general workforce planning problem is ensuring that you have the right number of

people, with the right skills sets, in the right jobs, at the right time. To address this problem in

cases where the workforce is non-replenishable, a model for the loss of critical skills necessary to

support legacy systems has been developed. The model estimates the number of skilled

employees (pool size) and the combined (cumulative) experience of the skills pool in order to

determine the resources that will be available to maintain a legacy system or family of systems.

The cumulative experience of the skills pool impacts the time (and thereby the cost) to perform

the necessary maintenance activities to sustain a system. Because of the prohibitive cost of

replacement, legacy systems are often not replaced until their age results in catastrophic failures

or support costs that are untenable. The model presented in this paper can be used to support


business cases for system replacement by forecasting when issues will arise and how expensive

they will be.

Several simplifying assumptions have been made in the model described in this paper. In our

solution, we have assumed that everyone entering the pool has no experience – depending on the

application this may not be the case; some workers could enter the pool with relevant experience

gained from other positions. Our solution also assumes that the only way workers gain

experience is through years on the job. There may be other methods that can accelerate the rate

at which workers become more experienced, e.g., knowledge bases created to capture the

experience of older workers [26,27]. A discrete time analysis is presented in this paper because

the available input data only exists on an annual basis, however, a continuous time solution could

also be developed.

There are also indirect consequences of a shrinking skills pool that are difficult to quantify in

terms of money. The people who are maintaining systems (especially if they are engineers) are

most likely performing other tasks as well. Besides “break-fix,” their tasks may include

preventative maintenance, upgrade projects and knowledge transfer. As resources decrease, we

assume that all tasks except break-fix tasks would decrease; however, even if sufficient resources

are available for break-fix tasks, an acceleration in critical skills loss can occur as a result of

dwindling resources. Reducing non-break-fix tasks can help by increasing or improving: future

maintenance efficiency, system reliability that could decrease future maintenance requirements,

system performance, and retention of skilled people. There are many other factors that could

modify the case study results presented. These include geography (location in the world),

gender, the type of product produced, etc. These effects can certainly be analyzed with the

model if sufficient data existed to distinguish groups of workers and places of employment.


The model presented in this paper has only been used to assess support (maintenance and

business interrupt) costs, but it could also be used to assess system availability penalties (beyond

business interrupts) that may be assessed by customers for systems subject to availability

contracts such as performance-based logistics contracts [40].

Finally, the initial use case presented in this paper should be followed-up with a larger

population that potentially enlists the cooperation of more than one company possibly via the

solicitation of support from relevant trade organizations whose members share a common

interest in this topic.

REFERENCES

[1] P. Sandborn, “Trapped on technology’s trailing edge,” IEEE Spectrum, vol. 45, no. 4, pp. 42-58, 2008.

[2] P. Sandborn, “Software obsolescence - Complicating the part and technology obsolescence management problem,” IEEE Transactions on Components and Packaging Technologies, vol. 30, no. 4, pp. 886-888, 2007.

[3] A. De Grip and J. Van Loo, “The economics of skills obsolescence: A review,” Research in Labor Economics, vol. 21, ed. A. De Grip, J. Van Loo, and K. Mayhew, Elsevier, pp. 1-26, 2002.

[4] A. De Grip, J. Van Loo, and K. Mayhew, The Econonics of Skills Obsolescence: Theoretical Innovations and Emperical Applications, Research in Labor Economics, vol. 21, Elsevier, 2002.

[5] F. Green, S. Machin, and D. Wilkinson, “The meaning and determinants of skills shortages,” Oxford Bulletin on Economics and Statistics, vol. 60, no. 1, pp. 165-187, May 1998.

[6] K. Melymuka, “Don’t just stand there – Get ready,” Computerworld, vol. 36, no. 26, pp. 48-49, 2002.

[7] D. Besanko, U. Doraszelski, Y. Kryukov, and M. Satterthwaite, “Learning-by-doing, organizational forgetting and industry dynamics,” Econometrica, vol. 78, no. 2, pp. 453-508, March 2010.

[8] C. D. Bailey, “Forgetting and the learning curve: A laboratory study,” Management Science, vol. 35, no. 3, pp. 340-352, March 1989.

[9] L. Argote, S. Beckman, D. and Epple, “The persistence and transfer of learning in an industrial setting,” Management Science, vol. 36, no. 2, pp. 140-154, 1990.


[10] L. Argote, Organizational Learning: Creating, Retaining and Transferring Knowledge, Springer, New York, 2013.

[11] J. Waldman, “Measuring retention rather than turnover: A different and complementary HR calculus,” Human Resource Planning, vol. 27, no. 3, pp. 6-9, January 2004.

[12] “Nuclear workforce planning: Preparing for the long haul,” ScottMadden Perspectives, Spring 2008. Available at: http://www.scottmadden.com/insight/233/Nuclear-Workforce-Planning.html

[13] “Testimony of Elliot Pulham President & CEO, The Space Foundation, Before the Commission on the Future of the U.S. Aerospace Industry” May 14, 2002, Available at: http://www.spaceref.com/news/viewpr.html?pid=8335

[14] M. Leibold and S. Voelpel, Managing the Aging Workforce: Challenges and Solutions, John Wiley & Sons, New York, NY, 2006.

[15] E. Goodridge and M. K. McGee, “IT’s generation gap,” Informationweek.com, 2002.

[16] G. Hilson, “Generation gap is big iron’s only threat,” Computing Canada, vol., 27, no. 7, pp. 1 and 8, March 23, 2001.

[17] Congressional Record Volume 140, Number 65 (Monday, May 23, 1994). Available at: http://www.gpo.gov/fdsys/pkg/CREC-1994-05-23/html/CREC-1994-05-23-pt1-PgH68.htm

[18] S. Shead, “Universities won’t teach ‘uncool’ Cobol anymore – but should they?” ZDNet, March 7, 2013. Available at: http://www.zdnet.com/universities-wont-teach-uncool-cobol-anymore-but-should-they-7000012250/

[19] J. D. Ahrens, N. Prywes and E. Lock, “Software process reengineering: Towards a new generation of CASE technology,” J. Systems Software, vol. 20, pp. 71-84, 1995.

[20] W. S. Adolph, “Cash cow in the tar pit: Reengineering a legacy system,” IEEE Software, vol. 13, no. 3, pp. 41-47, 1996.

[21] D. M. Andolšek, “Knowledge hoarding or sharing,” Proceedings of the Management, Knowledge and Learning International Conference, Celje Slovenia, 2011.

[22] S. S. Dubin, “Obsolescence of lifelong education,” American Psychologist, vol. 27, pp. 486-498, 1972.

[23] J. Allen and A. De Grip, Skill Obsolescence Lifelong Learning and Labour Market Participation, Research Centre for Education and the Labour Market, Maastrict: Maastricht University, 2004.

[24] W. Arthur, W. Bennett Jr., P. L. Stanush, and T. L. McNelly, “Factors that influence skill decay and retention: A quantitiatve review and analysis,” Human Performance, vol. 11, no. 1, pp. 57-101, 1998.

[25] J. D. Hagman, Effects of Training Schedule and Equipment Variety on Retention and Transfer of Maintenance Skill, Alexandria, VA: U.S. Army Research Institute for Behavioral and Social Sciences, Research Report 1309, 1980.

[26] C. Joe and P. Yoong, “Harnessing the knowledge assets of older workers: A work in progress report,” Decision Support in an Uncertain and Complex World: The IFIP TC8/WG88.3 International Conference, pp. 392-401, 2004.


[27] D. E. Hailey and C. E. Hailey, “The nature of process preservation: Tools for capturing and preserving professional skills.” Available at: http://imrl.usu.edu/publications/ProcessP.pdf

[28] B. Friel, “Reality check,” Government Executive, May 15, 2002.

[29] G. W. Bohlander and S. A. Snell, Managing Human Resources, 15th edition, Mason, Ohio: South-Western CENEGAGE Learning, 2010.

[30] S. K. Bordoloi, “Flexibility, adaptability and efficiency in dynamic manufacturing systems,” Journal of Production and Operations Management, vol. 8, no. 2, pp. 133-150, 1999.

[31] H.-C. Huang, L.-H. Lee, H. Song, and B. T. Eck, “SimMan - A simulation model for workforce capacity planning,” Computers and Operations Research, vol. 36, no. 8, pp. 2490-2497, 2009.

[32] B. A. Holt, Development of an Optimal Replenishment Policy for Human Capital Inventory, Ph.D. Dissertation from the University of Tennessee Knoxville, 2011.

[33] J. Koochaki, J. A. C. Bokhorst, H. Wortmann, and W. Klingenberg, “The influence of condition-based maintenance on workforce planning and maintenance scheduling,” International Journal of Production Research, vol. 51, no. 8, pp. 2339-2351, 2013.

[34] S. Martorell, M. Villamizar, S. Carlos, and A. Sanchez, “Maintenance modeling and optimization integrating human and material resources,” Reliability Engineering and System Safety, vol. 95, no. 12, pp. 1293-1239, December 2010.

[35] D. Ait-Kaki, J.-B. Menye, and H. Kane, “Resources assignment model in maintenance activities scheduling,” International Journal of Production Research, vol. 49, no. 22, pp. 6677-6689, November 2011.

[36] S. Ahire, G. Greenwood, A. Gupta, and M. Terwillinger, “Workforce-constrained preventative maintenance scheduling using evolution strategies,” Decision Sciences, vol. 31, no. 4, pp. 833-859, 2000.

[37] N. Gorjian, L. Ma, M. Mittinty, P. Yarlagadda, and Y. Sun. “A review on degradation models in reliability analysis,” Proceedings 4th World Congress on Engineering Asset Management, Athens, Greece, 2009.

[38] W. J. Fabrycky and B. S. Blanchard, Life-Cycle Cost and Economic Analysis, Prentice Hall, Upper Saddle River, NJ, 1991.

[39] D. Feng, P. Singh, and P. Sandborn, “Optimizing lifetime buys to minimize lifecycle cost," Proceedings of the 2007 Aging Aircraft Conference, Palm Springs, CA, April 2007.

[40] R. Beanum, “Performance-Based logistics and contractor support methods,” Proceedings of the IEEE Systems Readiness Technology Conference (AUTOTESTCON), 2006.

The Forecasting and Impact of the Loss of Critical Human ... Manuscript.pdf · The Forecasting and...

Documents

Transcript of The Forecasting and Impact of the Loss of Critical Human ... Manuscript.pdf · The Forecasting and...