Cost Estimation Methods For Software Engineering

By

Andre Ladeira

Dissertation submitted in partial fulfillment of the requirements for the degree

Magister Ingeneriae in

Engineering Management

In the faculty of Engineering

at the

Rand Afrikaans University

Supervisor: Prof L Pretorius

January 2002

Executive Summary

This dissertation summarizes several classes of software cost estimation models

and techniques. Experience to date indicates that expertise-based techniques

are less mature than the other classes of techniques (algorithmic models), but

that all classes of techniques are challenged by the rapid pace of change in

software technology. The primary conclusion is that no single technique is best

for all situations, and that a careful comparison of the results of several

approaches is most likely to produce realistic estimates.

As pressure for accurate cost estimation increases, research attention is

now directed at gaining a better understanding of the software-engineering

process as well as constructing and evaluating software cost estimation tools.

This dissertation evaluated four of the most popular algorithmic models used to

estimate software cost (SLIM, COCOMO II, Function points and SLOC).

This dissertation also provides an overview of the baseline cost estimation model

tailored to these new forms of software engineering. The major new modeling

capabilities are an adaptable family of software sizing models, involving Function

Points and Source Lines of Code. These models are serving as a framework for

an extensive current data collection and analysis effort to further refine and

calibrate the model's estimation capabilities.

Index

Chapter 1

Introduction

1.1. Background to the problem

1.2. Background literature

1.3. Problem Statement

1.4. Research objective

1.5. Conclusion

Chapter 2

Estimation Processes

2.1. Software Cost

2.2. Software Cost Estimation Process

2.3. Estimation and the software process

2.4. Inputs and Outputs to the Estimation Process

2.5. The Estimation Process

2.6. Timing of the estimates

2.7. Estimation Constraints

2.8. Data gathering

2.9. Problems with the Cost Estimation Process

2.10. Problems with Requirements

2.11. Conclusion

Chapter 3

Size Estimation

3.1. Lines of code

3.2. Function Point Analysis

3.3. Conclusion

Chapter 4

Estimation Methods

4.1. Software Life-cycle Management (SLIM) method

4.2. Constructive Cost Model (COCOMO II)

4.3. Expertise-Based Technique

4.4. Cost Estimation method

4.5. Conclusion

Chapter 5

Case Study

5.1. Project description

5.2. Project Size estimation

5.3. Effort estimation

5.4. Conclusion

Chapter 6

Conclusions and Recommendations

6.1. Conclusions

6.2. Recommendations

6.3. Further Investigation

References

Glossary

Appendix A: Scaling Drivers

Appendix B: Architecture / Risk Resolution

Appendix C: Team Cohesion

Appendix D: Process Maturity

Appendix E: Product Complexity

Appendix F: Effort multipliers

Appendix G: Nu Metro Server Technical Specification

List of figures

Figure 1.1: Influencing factors to be evaluated to produce an accurate estimate

Figure 1.2: Information to be used to predict scenarios on future projects

Figure 1.3: Estimation principle

Figure 2.1: Classical view of software estimation process

Figure 2.2: Actual cost estimation process

Figure 3.1: Definition checklist for source statements counts

Figure 4.1: Rayleigh curve

List of tables

Table 1.1: Project levels of complexity

Table 3.1: Function point complexity matrix

Table 3.2: Function point complexity-weight matrix

Table 4.1: Rating scheme for the COCOMO II scale factors

Table 4.2: Effort multipliers cost driving rating for the post-architecture model

Table 4.3: Early design and post-architecture cost driver

Chapter 1

Introduction

1.1. Background to the problem

"If there is one management danger zone to mark

above all others, it is software cost estimation."

Robert Glass - Building Software Quality

The reason for the strong emphasis on software engineering cost estimation is

that it provides the vital link between the general concepts and techniques of

economic analysis and the particular world of software engineering. There is no

good way to perform a software cost-benefit analysis, breakeven analysis, or

make-or-buy analysis without some reasonably accurate method of estimating

software engineering costs, and their sensitivity to various product, project, and

environmental factors. Software engineering cost estimation techniques also

provide an essential part of the foundation for good engineering management.

Cost in a project is also due to the requirements for software, hardware and

human resources. The bulk of the cost of software development is due to the

human resources needed, and most cost estimation procedures focus on this

aspect. Most cost estimates are determined in terms of person-months (PM).

1.2. Background literature

As the cost of the project depends on the nature and characteristics of the

project, at any point, the accuracy of the estimate will depend on the amount of

reliable information that is available about the final product [4][27]. When the

project is being initiated or during the feasibility study, the analysts have only

some idea of the data the system will get and produce and the major functionality

of the system. There is a great deal of uncertainty about the actual specifications

of the system. As the user specifies the system more fully and accurately, the

uncertainties are reduced and more accurate cost estimates can be made.

Despite the limitations, cost estimation models have matured considerably and

generally give fairly accurate estimates.

By far, the project sizing technique that delivers the greatest accuracy and

flexibility is function point analysis [24]. Based upon logical, user-defined

requirements, function points permit the early sizing of the software problem

domain. In addition, the function point methodology presents the opportunity to

size a user requirement regardless of the level of detail available. An accurate

function point size can be determined from the detailed information included in a

thorough user requirements document, or an adequate function point size can be

derived from the limited information provided in an early proposal.

An alternative sizing method is counting lines of code [20]. It is dependent upon

information that is not available until later in the development life cycle. Function

points accurately size the stated requirement. If the problem domain is not clearly

or fully defined, the project will not be properly sized. When there are missing,

brief, or vague requirements, a simple process using basic diagramming

techniques with the requesting user can be executed to more fully define the

requirements.

In addition to the project size, project complexity must be properly evaluated

[Matson]. To some extent, complexity levels are evaluated by 14 general system

characteristics:

• Data communication

• Distributed data processing

• Performance

• Heavily used configuration

• Transaction rate

• Online data entry

• End-user efficiency

• Online update

• Complex processing

• Reusability

• Installation ease

• Operational ease

• Multiple sites

• Facilitate change

The assessment of a project's complexity should also take into consideration

complex interfaces, database structures, and contained algorithms. The

assessment of complexity can be based upon five varying levels of complexity as

shown in Table 1.1:

Level 1:

• Simple addition/subtraction

• Simple logical algorithms

• Simple data relationships

Level 2:

• Many calculations, including multiplication/division in series

• More complex nested algorithms

• Multidimensional data relationships

Level 3:

• Significant number of calculations typically contained in payroll/actuarial/rating/scheduling applications

• Complex nested algorithms

• Multidimensional and relational data relationships with a significant number of attributive and associative relationships

Level 4:

• Differential equations typical

• Fuzzy logic

• Extremely complex logical and mathematical algorithms typically seen in military/telecommunications/real-time/automated process control/navigation systems

• Extremely complex data

Level 5:

• Online, continuously available, critically timed

• Event-driven outputs that occur simultaneously with inputs

• Buffer area or queue to determine processing priorities

• Memory, timing, and communication constraints

Table 1.1: Project levels of complexity [19]
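The general system characteristics listed earlier in this section are conventionally each rated from 0 (no influence) to 5 (strong influence) in function point analysis, and the ratings are then combined into a value adjustment factor. The sketch below illustrates that standard IFPUG-style calculation. It is not taken from this chapter; the function names and the example ratings are assumptions made purely for illustration.

```python
from typing import Sequence


def value_adjustment_factor(ratings: Sequence[int]) -> float:
    """Combine 14 general system characteristic ratings (0..5 each)
    into a value adjustment factor between 0.65 and 1.35."""
    if len(ratings) != 14:
        raise ValueError("expected exactly 14 ratings, one per characteristic")
    if any(r < 0 or r > 5 for r in ratings):
        raise ValueError("each rating must lie between 0 and 5")
    total_degree_of_influence = sum(ratings)
    return 0.65 + 0.01 * total_degree_of_influence


def adjusted_function_points(unadjusted_fp: float, ratings: Sequence[int]) -> float:
    """Scale an unadjusted function point count by the adjustment factor."""
    return unadjusted_fp * value_adjustment_factor(ratings)


if __name__ == "__main__":
    # Hypothetical project: every characteristic rated as average (3).
    example_ratings = [3] * 14
    print(adjusted_function_points(200, example_ratings))
```

Because the factor ranges only from 0.65 to 1.35, the characteristics can move a function point count by at most 35 percent in either direction, which is why the text treats them as capturing complexity only "to some extent".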

The capability to deliver software depends on a variety of risk factors that

influence a development organization's ability to deliver in a timely

and economical fashion. Risk factors include such things as the software

processes that will be used, the skill levels of the staff (including user personnel)

who will be involved, the automation that will be utilized, and the influences of the

physical environment (development conditions) and business environment (competition and

regulatory requirements). In fact, numerous factors influence the ability to

deliver high-quality software on time. Figure 1.1 categorizes some examples

of influencing factors that must be evaluated to produce an accurate estimate.

MANAGEMENT

• Team Dynamics

• High Morale

• Project Tracking

• Project Planning

• Automation

• Management Skills

DEFINITION

• Clearly Stated Requirements

• Formal Process

• Customer Involvement

• Experience Level

• Business Impact

DESIGN

• Formal Process

• Rigorous Reviews

• Design Reuse

• Customer Involvement

• Experienced Development Staff

• Automation

BUILD

• Code Review

• Source Code Tracking

• Project Tracking

• Project Planning

• Automation

• Management Skills

TEST

• Formal Testing Methods

• Test Plans

• Development Staff Experience

• Effective Test Tools

• Customer Involvement

ENVIRONMENT

• New Technology

• Automated Process

• Adequate Training

• Organizational Dynamics

• Certification

Figure 1.1: Influencing factors to be evaluated to produce an accurate estimate [19].

This information can be used to predict and explore "what-if" scenarios on future

projects (see Figure 1.2).

[Diagram elements: Estimate Project Completion; Assess: Size, Complexity, Influence Factors; Rate of delivery; Select a baseline profile; Baseline of Performance; Create Profile: Rate of Delivery, Time to Market, Defects]

Figure 1.2: Information to be used to predict scenarios on future projects

An organization should develop profiles that reflect the rate of delivery for a

project of a given size, complexity, and risk factors [12].

1.3. Problem Statement

At the core of the estimating challenge are two issues [14]: the need to

understand and express (as early as possible) the engineering problem domain,

and the need to understand the capability to deliver the required software

solution within a specified environment. Only then is it possible to predict

accurately the effort required to deliver a project.

The current engineering problem domain can be defined simply as the scope of

the required software. The problem domain must be accurately assessed for its

size and complexity. To complicate the situation, experience shows that when an

initial estimate is required (early in the system's life cycle), it cannot

be presumed that all the necessary information is available. Therefore, there

must be a rigorous process that permits a further clarification of the problem

domain.

An effective estimating model considers three elements: size, complexity, and

risk factors. When factored together, they result in a more accurate cost estimate

(see Figure 1.3).
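As a rough illustration of how the three elements might be factored together, the following sketch multiplies a size measure by a baseline delivery rate and by complexity and risk multipliers. The multiplicative form follows Figure 1.3, but the function name, the hours-per-function-point baseline, and all numeric values are hypothetical; in practice they would have to come from an organization's own historical profiles.

```python
def estimate_effort_hours(
    size_fp: float,
    baseline_hours_per_fp: float,
    complexity_factor: float = 1.0,
    risk_factor: float = 1.0,
) -> float:
    """Factor size, complexity, and risk together into an effort estimate.

    All four inputs are illustrative assumptions: a factor of 1.0 means
    "same as the baseline profile", values above 1.0 increase the estimate.
    """
    inputs = (size_fp, baseline_hours_per_fp, complexity_factor, risk_factor)
    if any(value <= 0 for value in inputs):
        raise ValueError("all inputs must be positive")
    return size_fp * baseline_hours_per_fp * complexity_factor * risk_factor


if __name__ == "__main__":
    # Hypothetical project: 100 function points, 8 hours per function point
    # at baseline, somewhat more complex and riskier than the baseline.
    hours = estimate_effort_hours(100, 8.0, complexity_factor=1.2, risk_factor=1.1)
    print(round(hours, 1))
```

Keeping complexity and risk as separate multipliers makes it easy to explore "what-if" scenarios by changing one factor at a time.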

[Diagram: Definition (Project Size × Project Complexity) × Capability (Risk Factors) = Estimates (Schedule, Effort, Costs)]

Figure 1.3: Estimation principle

1.4. Research objective

The objectives of this dissertation are as follows:

• Determining the software engineering cost estimation principle and

process.

• Investigating different size estimation methods and determining the

difference between the different methods.

• Investigating different effort estimation methods or techniques and

determining the difference between the methods or techniques.

1.5. Conclusion

The structure of the research dissertation will be as follows:

• Chapter two will cover a comprehensive literature review of the subject

matter and related fields.

• Chapter three will cover an investigation into two different size estimation

methods (function points calculation and source line of code count).

• Chapter four will cover an investigation into three different effort and cost

estimation methods used in the software engineering industry.

• Chapter five will cover a case study where the estimation methods were

used.

• Chapter six contains the conclusion and recommendations on the findings

made.

Chapter 2

Estimation Processes

2.1. Software Cost

Despite the terminology, software engineering cost does not refer directly to a

monetary value associated with software development. Such a value is almost

impossible to arrive at and not always useful. The questions are "What's the

effort involved?" and "How long will it take?" The answers to these two questions

can then be translated to the monetary value. This leads to the following

definition of software cost.

Software cost consists of three elements [9]:

• Manpower loading is the number of engineering and management

personnel allocated to the project as a function of time.

• Effort is defined as the engineering and management effort required to

complete a project, usually measured in units such as person-months. The

types and the levels of skills for the resources influence the cost of the

project.

• Duration is the amount of time (usually measured in months) required to

complete the project.
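The relationship between these three elements, and their translation into a monetary value, can be sketched in a few lines. The example below assumes a constant staffing level (so that average manpower loading is simply effort divided by duration) and a single blended rate per person-month. Both are simplifying assumptions, and the class name and figures are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoftwareCost:
    """Effort and duration of a project; loading is derived from them."""

    effort_pm: float        # total effort in person-months
    duration_months: float  # calendar time in months

    @property
    def average_loading(self) -> float:
        """Average number of people on the project, assuming flat staffing."""
        return self.effort_pm / self.duration_months

    def monetary_cost(self, rate_per_pm: float) -> float:
        """Translate effort into money using a hypothetical blended rate."""
        return self.effort_pm * rate_per_pm


if __name__ == "__main__":
    cost = SoftwareCost(effort_pm=24, duration_months=8)
    print(cost.average_loading)        # average people on the project
    print(cost.monetary_cost(10_000))  # rate is an assumed figure
```

Real manpower loading varies over the life of a project, so the average computed here is only a first approximation of the loading profile.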

Arriving at a cost estimate involves using a number of different factors to try to

determine the overall cost of a system. Deciding which factors to include and

combining them to arrive at the estimate make up the engineering cost

estimation process that is defined as follows [14]:

Direct costs include items such as analysis, design, coding, testing and

integration. Depending on who is doing the engineering and why, software cost

may also include a number of other items such as training, customer support,

installation, level of documentation, configuration management, and quality

assurance.

2.2. Software Cost Estimation Process

A software cost estimation process is the set of techniques and procedures that

an organization uses to arrive at a software cost estimate. Generally there is a

set of inputs to the process (e.g., system requirements) and an output of effort,

manpower loading, and/or duration.

It is very difficult to examine the software cost estimation process without the

overall context of the software development process in use within a given

organization.

The set of procedures, techniques, and standards that an organization uses for

organizing, managing, and controlling software development projects is called

the software process.

Organizations have different software processes, depending on the type of

software they are developing. For many organizations, the development process

is very informal; in other cases it is well documented and stringently monitored.

2.3. Estimation and the Software Process

The cost of a project can be estimated for a number of reasons. Why it is done is an

important factor in determining when and how it is done. The reasons why a cost

estimation process is undertaken include the following [14]:

• Project approval. For every project there must be a decision by the

organization to undertake the project. Such a decision requires an

estimate of the money and resources required to complete it.

• Project management. Project managers are responsible for planning and

control of projects. Both activities require an estimate of the activities

required to complete a project and the resources required for each activity.

• Project team understanding. For members of a project team to work

together more efficiently on a project, it is necessary that each one

understand his/her role in the project and the overall activities of the

project. A project task definition, which can be used for this purpose, is

generated by a cost estimate.

The "why" of the cost estimation process can be any of the above reasons and is

one of the factors determining when the estimate is done. Project approval

requires estimates to be performed very early in the project life cycle, often

before requirements have been clearly specified. The project approval process

typically has a number of points where a "go/no go" decision must be made. At

each of these points, an estimate may be required to permit management to

make the decision. Early in the project life cycle, these may be approximate

order-of-magnitude estimates sufficient to allow the organization to determine whether

they should continue to look at a project. Late in the project, management can

get much more detailed estimates of cost to completion in order to decide

whether to cancel an ongoing project.

For managing and understanding a project, an estimate can be done early in the

development of the project to arrive at an initial estimate, and then repeated on a

regular basis during development to keep the estimate current [1][2][3]. For

these estimates the prime concern is not necessarily the absolute "cost," but the

estimated set of tasks required to complete the project, the results of each of

these tasks, how these tasks fit together, and the resources required to complete

each task.

Re-estimates are required throughout the development cycle regardless of why

the estimate is done. As a project progresses, more information is available on

the product and the process being used to develop it. This information can be

used to increase the accuracy and detail of the estimate.

2.4. Inputs and Outputs to the Estimation Process

The software cost estimation process computes a set of outputs as a function of

a set of inputs. The inputs to the estimation process depend on when the

estimate is being performed. Very early estimates are necessarily based on

sparse and incomplete data regarding the project and the development process.

Preliminary estimates are needed before requirements are known or architecture

has been defined [22]. Such estimates will necessarily be based on sketchy data

and will not have a high degree of accuracy. Estimates performed late in the

development cycle are based on a much wider set of information. Computing

cost to completion late in the development cycle allows a great deal of project

and process information to be used. Given that more information is available,

more detailed estimates can be made, which have a much greater degree of

accuracy than the initial estimates.

Most models of cost estimation view the estimation process as being a function

computed from a set of cost drivers. These drivers are assumed to be the

characteristics of a system that determine the final cost of production. In most of

the advocated cost estimation techniques, the primary cost driver is assumed to

be the software requirements [2][3][10]. In this model of software cost estimation

(illustrated in Figure 2.1), the requirements are the primary input to the process

and form the basis for the estimate. The estimate is then adjusted according to a

number of other cost drivers (such as experience of personnel and complexity of

system) to arrive at the final estimate.

In this classical view, the effort, duration, and loading are computed as fixed

numbers (perhaps with tolerances), or a set of relationships between the values

is given, allowing managers to trade off costs in order to minimize any of the

three values.

[Diagram: the cost drivers (requirements and other cost drivers) feed the software cost estimation process, whose outputs are effort, duration, and loading.]

Figure 2.1: Classical view of software estimation process [22]

In fact, the cost estimation process can be much more complex than that

portrayed in Figure 2.1. There is interdependency between many items of

information, all of which are relevant to the cost estimation process (Figure 2.2).

Many of the data items that are inputs to the cost estimation process are

modified and output by the process. Thus, rather than viewing the cost estimation

process as a function of the requirements, it is often more accurate to view this

process as trying to satisfy a set of constraints. The inputs to the system are a

set of constraints on the requirements, software architecture, financial resources,

etc., while the outputs are a cost estimate and a set of assumptions that satisfy

all the constraints.

This view allows the constraints to be imposed on any of the factors that affect

the cost. These factors range far beyond requirements to include issues such as

delivery date, finances and software process.

Requirements are viewed as constraints that must be satisfied. In a few cases,

these requirements are fixed, complete, and correct. In most cases, however,

during estimation the estimator detects inconsistencies and ambiguities in the

requirements. As part of the estimation process, the estimator will resolve some

of these ambiguities by imposing new constraints on the requirements. In other

cases, the problems with the requirements remain, with a corresponding effect

on the accuracy of the estimate.

Financial, calendar, manpower, architectural, and software process constraints

are also significant to the cost estimation process. Financial, calendar, and

manpower constraints limit the amount of resources that can be allocated to a

project. Financial constraints limit the amount of money that can be budgeted for

the project; calendar constraints specify a delivery date that must be met; and

manpower constraints limit the number of people that can be allocated to the

project. For example, if a fixed amount of money is available for a project, then

the estimated cost should satisfy this financial constraint, perhaps by varying the

functionality.
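The financial example above can be sketched as a simple selection problem: given per-feature cost estimates and a fixed budget, keep the most important features that still fit. The greedy strategy, the feature names, the priorities, and the costs below are all assumptions chosen for illustration; an estimator might equally renegotiate scope in other ways.

```python
from typing import List, Tuple

# (name, priority, estimated cost); a lower priority number is more important.
Feature = Tuple[str, int, float]


def fit_to_budget(features: List[Feature], budget: float) -> Tuple[List[str], float]:
    """Select features in priority order while the estimated cost fits the budget.

    Returns the selected feature names and their total estimated cost.
    """
    selected: List[str] = []
    remaining = budget
    for name, _priority, cost in sorted(features, key=lambda f: f[1]):
        if cost <= remaining:
            selected.append(name)
            remaining -= cost
    return selected, budget - remaining


if __name__ == "__main__":
    candidate_features = [
        ("login", 1, 30.0),
        ("reports", 2, 50.0),
        ("export", 3, 40.0),
        ("themes", 4, 10.0),
    ]
    print(fit_to_budget(candidate_features, budget=100.0))
```

The same pattern applies to calendar and manpower constraints: the constraint is held fixed and another factor, here functionality, is varied until the estimate satisfies it.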

The software architecture defines the different components used to construct the

system and the interrelationships between these components. The stage in the

development life cycle determines whether the software architecture is a factor

for the estimation process. For example, maintenance organizations that are

working with an existing system are constrained to use the existing architecture

and can base their estimates on this architecture.

The cost estimation process for new development may make no

assumptions about the software architecture and base the estimate entirely on

system functionality. For many larger contracts, the software process

becomes one of the constraints that must be satisfied by the estimating process.

Many organizations have within their software process a standard Work

Breakdown Structure (WBS), which defines the tasks to be performed to

complete a project. Frequently, the estimating process will be working under the

constraint that the standard WBS must be used for a project. The estimating

process will then tailor the WBS to the specific project, adding sufficient detail.

For example, one situation where constraints on the software process affect the

estimation process is the requirement to develop according to the ISO 9000

standard. Significant cost is incurred by adhering to this standard; for small

changes, ISO 9000 can actually be the dominant cost factor. When estimating a

system developed to this standard, estimators must be aware of the cost incurred

by use of the standard.

[Diagram: inputs are the cost drivers (vague requirements and other cost drivers), constraints, and other inputs; these feed the software cost estimation process, whose outputs include less vague (and modified) requirements, loading, contingency, a tentative WBS, and a less fuzzy architecture.]

Figure 2.2: Actual cost estimation process [22]

Aside from the various constraints, other factors that must be included as part of

the estimation process are the risks associated with the project. These risks

could include, for example, dependency on outside contractors, lack of

experience in the application domain, etc. These risk factors should be identified

as early as possible in order to include them in the decision making and project

management processes.

2.5. The Estimation Process

An estimate is arrived at by taking the identified constraints, applying the

estimation process, and generating results that satisfy all the constraints. A

variety of techniques are used by different organizations to arrive at these

estimates. The processes used can be classified as either model based or

analogy based.

Model-based estimation builds a costing model of system development based on

the characteristics of the system being built, the process being used to build it,

and its development environment.

A model can be a formal mathematical model or a set of informal guidelines used

by an estimator. Informal models are used by experienced developers who have

gained sufficient knowledge about system development by working on previous

projects. The informal model used by such an estimator is expressed as a set of

"rules of thumb" or, at an even more primitive level, as a "gut feel" [30]. When

questioned as to how they developed their model and how they apply it,

estimators are usually unable to say exactly what it is they do. It appears to be an

issue of gaining the required experience in order to arrive at accurate estimates.

Formal models attempt to quantify all inputs to the cost estimation process, and

then apply a set of equations that describes the relationships between the inputs

and the outputs of the cost estimation process. The equations are developed

through analysis of historical data and must be calibrated to each individual

development environment. The best known formal models are Boehm's

COCOMO II [2][4], function points, and Putnam's application of Rayleigh curves

to the development process [27].

The usual method of applying the formal model is to transform the requirements

into a measure of the "size" of the system. This size measure, which can be

either SLOC (Source Lines of Code) or FPs (Function Points), is used as the

basis for creating the cost estimates. The estimator can also quantify a set of

other cost drivers, examples of which include:

• Product attributes, e.g., required reliability, product complexity, etc.

• Computer attributes, e.g., memory constraints.

• Personnel attributes, e.g., applications experience, programming language

experience.

These cost drivers become multipliers that can be used to increase or decrease

the initial estimate. The bulk of the current literature and research on cost

estimation is devoted to formal models, particularly as it relates to new system

development [2][4][27].
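In outline, a formal model of this kind can be sketched as a nominal effort computed from size, scaled by the product of the cost-driver multipliers. The power-law form and the multiplicative adjustment are common to models such as COCOMO; the coefficients `a` and `b`, the driver names, and the multiplier values below are placeholders, not calibrated constants from any published model.

```python
from math import prod
from typing import Mapping


def nominal_effort(size_ksloc: float, a: float, b: float) -> float:
    """Nominal effort in person-months as a power law of size (in KSLOC)."""
    return a * size_ksloc ** b


def adjusted_effort(
    size_ksloc: float,
    multipliers: Mapping[str, float],
    a: float = 3.0,   # placeholder coefficient, must be calibrated locally
    b: float = 1.1,   # placeholder exponent, must be calibrated locally
) -> float:
    """Scale the nominal effort by the product of all cost-driver multipliers.

    A multiplier above 1.0 increases the estimate, one below 1.0 decreases it.
    """
    return nominal_effort(size_ksloc, a, b) * prod(multipliers.values())


if __name__ == "__main__":
    # Hypothetical ratings: high required reliability, experienced staff.
    drivers = {"required_reliability": 1.15, "applications_experience": 0.90}
    print(round(adjusted_effort(20.0, drivers), 1))
```

Because the multipliers combine by multiplication, a single badly rated driver shifts the whole estimate proportionally, which is one reason such models must be calibrated to each development environment.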

Analogy-based estimating processes estimate costs by comparing the current

development project with previous development projects undertaken by the

organization. An analogy-based technique requires maintenance of a history of

past projects; this information can be used as a reference point. Past projects

with properties similar to the current project are identified and their costs used as

a basis for estimating the current project.

At the most informal level of analogy-based techniques, the history of past

projects is maintained in the estimator's memory. Finding past projects with

properties similar to the current project involves the estimator recalling similar

projects and the costs they incurred. Such an approach is highly

dependent on the memory of the individual estimators and on very low employee

turnover.

The analogy-based approach can be made more rigorous in a number of ways.

The history of past projects can be maintained as a computerized database, with

detailed metrics and descriptions of characteristics recorded for each project.

Using a historical database, an estimator can query the database searching for

projects with similar characteristics and then base the estimate on actual costs

and process of the previous projects. Such an approach avoids the fallibility of

human memory and provides a much more detailed historic record of what

occurred in the course of a project [9].
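
A query against such a historical database can be sketched as a nearest-neighbour search. The record fields, the similarity measure and the choice of averaging the k closest projects are all assumptions made for illustration; a real database would hold far more detailed metrics and project descriptions.

```python
from dataclasses import dataclass


@dataclass
class PastProject:
    """Hypothetical record of a completed project."""
    name: str
    function_points: float
    team_size: float
    actual_cost: float


def estimate_by_analogy(history, function_points, team_size, k=2):
    """Average the actual cost of the k most similar past projects."""
    def distance(project):
        # Relative differences, so that neither attribute dominates.
        return (abs(project.function_points - function_points) / function_points
                + abs(project.team_size - team_size) / team_size)

    nearest = sorted(history, key=distance)[:k]
    return sum(p.actual_cost for p in nearest) / len(nearest)


# Hypothetical history; the figures are invented for the example.
history = [
    PastProject("billing", 100, 5, 1000.0),
    PastProject("payroll", 110, 5, 1200.0),
    PastProject("logistics", 500, 20, 9000.0),
]
estimate = estimate_by_analogy(history, function_points=105, team_size=5)
```

The estimate is only as good as the similarity measure, so the attributes compared should be the ones that actually drove cost on past projects.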

2.6. Timing of the estimates

Estimation is not a task done once, at the beginning of a project. Rather,

estimates and re-estimates are undertaken throughout the life of a project

[7][8][10]. The success of an estimator is not necessarily the accuracy of the

initial estimates, but rather the rate at which the estimates converge to the actual

costs. The timing of estimates depends on the type of organization involved and

why the estimate is being performed.

Contractors usually perform two estimates early in the development life cycle.

The first is done to prepare a bid for the contract, usually in a relatively quick

fashion, with the objective of arriving at a winning bid. The timing of this bid is

very much dependent on the procuring agency that issues the Request for

Proposal (RFP). The contractor is required to generate an estimate at this point,

basing it on information within the RFP and obtained informally from the

contracting agency.

Upon winning a bid, most contracting organizations immediately undertake a

second, more detailed, estimation process. The objective of this estimate is to

develop a more accurate and detailed cost estimate and project plan which are

based on the previous estimate and WBS. Frequently, much discussion between

the contractor and the agency is necessary to deal with previously undetected

issues and problems in the requirements.

2.7. Estimation Constraints

An estimation process involves arriving at an estimate that satisfies the

constraints. These constraints vary depending on the timing of the estimate and

the organization performing the estimate, but can include:

• System requirements.

• Delivery date.

• Financial.

• Manpower resources.

• System architecture.

• Software process.

When preparing a bid to develop new software, a contracting organization is

usually faced with constraints on system requirements, delivery date, manpower

resources, and software process. Depending on the system under construction,

constraints may be placed upon the architecture. The constraints on the

requirements of the system vary considerably among projects. Some projects

have requirements which are well understood and well documented within the

RFP. In these cases, the constraints on the requirements are well understood by

all parties involved. However, in many cases, requirements are not clearly

understood up front, or are flexible in terms of the actual functionality to be

delivered as part of the end product.

Delivery date and financial resources are constraints that are very firm and have

a large impact upon a contractor's preparation of a bid for estimation purposes.

There are two reasons that these constraints are imposed upon contractors.

First, the procuring agency has a budget and timetable, which they are under

pressure to meet and which they are not willing to exceed. Second, there will be

competing bids submitted.

Once the bid has been won, the contractor performs another more detailed

estimate [8][10]. This estimate is in many ways more realistic because there is

less pressure to satisfy financial constraints; it is usually done by the project

manager to determine how much the system is really going to cost. Although

financial constraints affect the process, the manager usually defines in much

more detail the functionality of the system and the process used to develop the

system. This results in a more accurate estimate and can determine whether the

system may be built for the contracted price.

Re-estimates done by contractors during development involve modifying the

duration, effort, and functionality. As understanding of the tasks increases, more

accurate estimates can be made regarding effort and duration. As the

requirements of the system are better understood, they can be re-estimated and

appropriate modifications made to the effort and duration estimates.

From a procuring agency's perspective, estimates are performed under a

different set of constraints. Project Directors try to balance the following

constraints while getting approval for the project:

• Financial. How much money is the organization willing to put into this

project?

• Calendar. When do I have to show results to keep management satisfied?

• Requirements. What is the functionality required of the system?

Each of these constraints has a different level of priority, depending on the

particular project. Once project development begins, control of the project passes

from the Project Director to the Project Manager (PM). At this point budgetary

approval has been received and all previous estimates are considered to be cast

in stone. Thus, there is great pressure on the PM not to change any of the

previous estimates.

The PM must decide in what order to sacrifice the financial, calendar, and

requirements constraints. Different PMs have different approaches; generally

they try to maintain the functionality of the system, but let either the calendar or

financial constraints slip. In reality, however, it appeared that if the original

estimates were incorrect, all of the constraints were affected.

2.8. Data gathering

It seems obvious that without knowledge of the past, it is impossible to predict

what may happen on future projects. (Even with knowledge of the past, there is

still no guarantee that the future can be predicted.) A corollary is that if an

organization wants to improve its cost estimation process, it should gather relevant

data on previous projects.

The simplest way to gather data is to have a stable work force so that project and

process data are maintained in the memory of the individuals of the organization.

The individuals can then use this information to estimate costs of other projects.

However, relying on individuals' imperfect memories is barely sufficient for small

projects; for large projects it is completely inadequate.

Even if this information is gathered, it is often done for financial purposes and is

not used by software managers to estimate the cost of future projects. There are

a number of reasons why this data may not be useful [27][30]:

• The data is not accurate. If the primary perceived purpose of time sheets

is to monitor the staff, the accuracy of the figures in the time sheets must

be questioned.

• The data is not accessible. Often time sheets are gathered for the benefit

of the financial department rather than to assist estimators. Thus, they are

kept on systems not easily accessible to estimators, or worse, are simply

stored as masses of paper files.

• The data is not broken down in a useful way. The overall cost of a project

has a limited usefulness. What is usually of more interest to an estimator

is how the project was broken down into activities and the cost of each of

these individual activities.

2.9. Problems with the Cost Estimation Process

What factors make software cost estimation difficult? There are situations where

a high level of accuracy in cost estimation can be found; many of these situations

were identified by the following characteristics [3]:

• The users are experienced in the system, know what they want, and can

express what they want.

• The requirements are clear, precise, correct, and complete.

• The project duration is short.

• The manpower loading is small.

• The people doing the estimation are experienced in the application

domain and have developed similar systems.

• The development environment and development process are familiar to all

people involved.

• Staff turnover is low both among the developers and the users.

• No unfamiliar software or hardware from outside suppliers is to be

integrated with the final product.

A project satisfying the above characteristics frequently resulted in accurate

cost estimates. However, most of the projects did not satisfy the above

conditions and therefore the estimates produced were not accurate. The

characteristics needed for accurate estimates can be reversed in order to

enumerate problems leading to inaccurate estimates:

• Problems with the requirements.

• Issues in maintenance.

• Procurement process.

• System size.

• Software process and process maturity.

• Monitoring progress of the project.

• Lack of historical data.

• Lack of application domain expertise.

• Embedded software.

2.10. Problems with Requirements

Almost universally and without exception, organizations blame problems with the

requirements as a major reason why cost estimates were inaccurate. The

problems are numerous: requirements may be incomplete, ambiguous, inconsistent,

incorrect, or incomprehensible.

The problem of users not understanding the requirements existed for all types of

systems and all types of developments. For new development projects, users

would request systems (and quotes) before there was a complete understanding

of the problem or the solution.

Cost estimates can be made without a clear understanding of the requirements

of the system being built; it must be accepted that these estimates have a very

high likelihood of error.

Requirements creep. As projects progress and the knowledge of the problem

increases, it seems inevitable that users (and developers) request more and

more features and changes to be included in the product. Thus, over the

development of the project, new features work their way into the requirements,

leading to "requirements creep" (or, as Boehm described it, "requirements

gallop"). New feature requests come from many sources and for many reasons,

but the problem seems to be universal. Correct and complete requirements for

complex systems are impossible to achieve. A fact that must be accepted is that

a complete statement of the requirements cannot be defined before development

begins [14]. This has nothing to do with the competence of the users or the

developers but rather is inherent in the nature of complex computer system

applications. Unless the system being developed is almost identical to a

previously developed system, the requirements will invariably be wrong and/or

incomplete. As a project evolves, users and developers gain a better

understanding of the problem and of the solutions. As people gain a better

understanding of the problem being solved the requirements evolve.

One frequent assumption is that the requirements will be firm before

development begins. Anyone working under this assumption will meet serious

problems when trying to estimate software costs accurately.

Since the requirements are probably wrong or incomplete, it is unlikely that the

estimates based on those requirements will be accurate. If the requirements are

included as part of the RFP put out by a procurement agency and a contractor is

expected to submit a firm bid based on those requirements, a frequent result

later in the development stages is confrontation between the contractor and the

agency as they argue over the meaning of each requirement and the cost

associated with the changing requirements.

Long development time, leading to requirements that are obsolete before the

system is delivered [8][10]. The rate of change in technology is so fast that any

attempts to predict what the technology will be in a few years are doomed to

failure. As the technology changes, so do the range of solutions to problems, and

the users' expectations of the solutions. Projects with a long time between

initiation and expected delivery suffer in that the solution is usually obsolete by

the time it is delivered. The customer is dissatisfied because the product does

not satisfy the new requirements.

Large staff turnover for end users, resulting in changing requirements as new

staff arrive. Developing software systems requires a consistent user base

throughout the development cycle. If the user base changes too frequently,

requirements continually change, and it is difficult

for developers to obtain consistent answers and comments from the end users.

2.11. Conclusion

All private businesses have two goals in common:

• Ensure that a profit is made

• Ensure their survival

To achieve these goals, every project taken on must leave the business

no worse off than it was before the project started. This can be accomplished

when the initial cost estimate is complete and accurate.

To determine the cost of a software project, whether low-level software integration

or high-level web page development, the process is no different. The estimation

process has many unknown factors that must be determined before the

estimation process can be started. The following factors must be considered:

• The software process. Most software engineering firms or companies

have a different management methodology on developing software. These

differences can influence the cost estimation processes. There can be

more documentation or formal processes that must be completed before

the development process can move into the next step.

• There are more inputs to be considered than in previous years of software

cost estimation. Previously the only considerations taken into account

were the system requirements and cost drivers. Today there are more

factors to consider. Some of them are the company software process,

financial constraints, risk factors and the specific software architecture.

• System requirement. In some cases the required software to be

engineered is a new system based on new technology released. There is

no data or experienced manpower available. A steep learning curve must

be taken into consideration.

Another obstacle in the cost estimation process is the specific requirements set

by the client. In many cases these requirements are vague, incomplete and

ambiguous. The system analyst or project manager must set up a task team to

determine the complete and correct requirements. This process can be time

consuming and sometimes expensive.

The time allocated for responding to a request for proposal (RFP) is often

inadequate. The project manager or the team member assigned to the RFP must

create a cost estimate from the vague information supplied. This in turn may

make the estimate inaccurate.

These obstacles can be addressed by first estimating the size of the project from

the available requirements. Different size estimation methods are available; the most

popular methods are the counting of source lines of code and function point

counting.

These methods will be discussed in more detail in the next chapter. The goal of

the chapter is to determine which method is best suited to one of the

biggest obstacles: producing accurate estimates from limited requirements.

Chapter 3

Size Estimation

3.1 Lines of code

The traditional size metric for estimating software development effort and for

measuring productivity has been lines of code (LOC). A large number of cost

estimation models have been produced, most of which are functions of lines of

code, or thousands of lines of code (KLOC). The definition of KLOC is important

when comparing these models. Some models include comment lines, and others

do not. Similarly, the definition of what effort (E) is being estimated is equally

important. Effort may represent only coding at one extreme, or the total analysis,

design, coding and testing effort at the other extreme. As a result, it is difficult to

compare these models.

The abbreviation NCLOC is used to represent a non-commented source line of

code. NCLOC is also sometimes referred to as effective lines of code (ELOC).

NCLOC is therefore a measure of the uncommented length.

The commented length is also a valid measure, depending on whether or not line

documentation is considered to be a part of programming effort. The abbreviation

CLOC is used to represent a commented source line of code [11].

By measuring NCLOC and CLOC separately the total length can be defined:

Total length (LOC) = NCLOC + CLOC Equation 3.1

KLOC is used to denote thousands of lines of code.
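
Counting NCLOC and CLOC separately can be sketched with a simple line classifier. The sketch rests on several assumptions that a real counting standard would have to settle explicitly: it counts physical rather than logical lines, it ignores blank lines, it recognizes only whole-line comments that start with a single prefix, and it counts a code line with a trailing comment as NCLOC.

```python
def count_lines(source, comment_prefix="#"):
    """Return (ncloc, cloc) for the given source text.

    Assumes physical lines, whole-line comments only, and that blank
    lines belong to neither count.
    """
    ncloc = 0
    cloc = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines are ignored in this sketch
        if stripped.startswith(comment_prefix):
            cloc += 1
        else:
            ncloc += 1
    return ncloc, cloc


sample = "# header comment\nx = 1\n\ny = 2  # trailing comment\n"
ncloc, cloc = count_lines(sample)
total_loc = ncloc + cloc   # Equation 3.1
kloc = total_loc / 1000    # thousands of lines of code
```

Changing any of these assumptions changes the count, which illustrates why size figures from different models cannot be compared without knowing the counting rules behind them.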

A logical source statement has been chosen as the standard line of code.

Defining a line of code is difficult due to conceptual differences involved in

accounting for executable statements and data declarations in different software

languages. The goal is to measure the amount of intellectual work put into

program development, but difficulties arise when trying to define consistent

measures across different languages.

To minimize these problems, the Software Engineering Institute (SEI) definition

checklist for a logical source statement is used in defining the line of code

measure. The SEI developed this checklist

as part of a system of definition checklists, report forms and supplemental forms

to support measurement definitions [12][20].

Figure 3.1 shows a portion of the definition checklist as it is being applied to

support the development of the COCOMO II model. Each checkmark in the

"Includes" column identifies a particular statement type or attribute included in the

definition, and vice versa for the "Excludes" column. Other sections in the definition clarify

statement attributes for usage, delivery, functionality, replications and

development status.

There are also clarifications for language-specific statements for Ada, C, C++,

CMS-2, COBOL, FORTRAN, JOVIAL and Pascal.

Definition Checklist for Source Statements Counts

[Figure content not recoverable in full. Sections legible in the checklist:

Statement type: executable; nonexecutable (declarations, compiler directives,

comments, blank lines).

How produced: programmed; generated with source code generators; converted with

automated translators; copied or reused without change; modified.

Origin: new work with no prior existence; prior work taken or adapted from a

previous version, build or release, commercial off-the-shelf software (COTS),

government furnished software (GFS), another product, a vendor-supplied

language support library or operating system, a local or modified language

support library or operating system, another commercial library, a reuse

library, or another software component or library.]

Figure 3.1: Definition checklist for source statements counts [4]

Some changes were made to the line-of-code definitions that depart from the

default definition provided in [20]. These changes eliminate categories of

software which are generally small sources of project effort. Not included in the

definition are commercial-off-the-shelf software (COTS), government furnished

software (GFS), other products, language support libraries and operating

systems, or other commercial libraries. Code generated with source code

generators is not included though measurements will be taken with and without

generated code to support analysis.

There are a number of problems with using LOC as the unit of measure for

software size. The primary problem is the lack of a universally accepted definition

of exactly what a line of code is.

Another difficulty with lines of code as a measure of system size is its language

dependence. It is not possible to directly compare project development by using

different languages.

Still another problem with the lines of code measure is the fact that it is difficult to

estimate the number of lines of code that will be needed to develop a system

from the information available at requirements or design phase of development

[7][8].

If cost models based on size are to be useful, it is necessary to be able to predict

the size of the final product as early and accurately as possible. Unfortunately,

estimating software size using the lines of code metric depends so much on

previous experience with similar projects that experts can make radically different

estimates.

Finally, the lines of code measure places undue emphasis on coding, which is

only one part of the implementation phase of a software development project. It

is stated that coding accounts only for 10% to 15% of the total effort on a large

engineering system. It is also questioned whether the total effort is really linearly

dependent on the amount of code [28].

3.2 Function Point Analysis

The function point cost estimation approach is based on the amount of

functionality in a software project and a set of individual project factors

[3][17][15]. Function points are useful estimators since they are based on

information that is available early in the project life cycle.

Software engineers have been searching for a metric that is applicable for a

broad range of software engineering environments. The metric should be

technology independent and support the need for estimating, project

management, measuring quality and gathering requirements. Function Point

Analysis is the measure that accomplishes all these requirements.

There have been many misconceptions regarding the appropriateness of

Function Point Analysis in evaluating emerging environments such as real time

embedded code and Object Oriented programming. Since function points

express the resulting work-product in terms of functionality as seen from the

user's perspective, the measure is independent of the tools and technologies used

to deliver it.

Introduction to Function Point Analysis

One of the initial design criteria for function points was to provide a mechanism

that both software engineers and users could utilize to define functional

requirements. It was determined that the best way to gain an understanding of

the users' needs was to approach their problem from the perspective of how they

view the results an automated system produces. Therefore, one of the primary

goals of Function Point Analysis is to evaluate a system's capabilities from a

user's point of view. To achieve this goal, the analysis is based upon the various

ways users interact with computerized systems. From a user's perspective a

system assists them in doing their job by providing five (5) basic functions. Two

of these address the data requirements of an end user and are referred to as

Data Functions. The remaining three address the user's need to access data

and are referred to as Transactional Functions.

Function point calculations

Function points (FP) measure size in terms of the amount of functionality in a

system. Function points are computed by first calculating an unadjusted function

point count (UFP). Counts are made for the following categories [Fenton]:

Internal Logical Files - The first data function allows users to utilize data they

are responsible for maintaining. For example, a pilot may enter navigational data

through a display in the cockpit prior to departure. The data is stored in a file for

use and can be modified during the mission. Therefore the pilot is responsible for

maintaining the file that contains the navigational information. Logical groupings

of data in a system, maintained by an end user, are referred to as Internal

Logical Files (ILF).

External Interface Files - The second Data Function a system provides an end

user is also related to logical groupings of data. In this case the user is not

responsible for maintaining the data. The data resides in another system and is

maintained by another user or system. The user of the system being counted

requires this data for reference purposes only. For example, it may be necessary

for a pilot to reference position data from a satellite or ground-based facility

during flight. The pilot does not have the responsibility for updating data at these

sites but must reference it during the flight. Groupings of data from another

system that are used only for reference purposes are defined as External

Interface Files (EIF).

The remaining functions address the user's capability to access the data

contained in ILFs and EIFs. This capability includes maintaining, inquiring and

outputting of data. These are referred to as Transactional Functions.

External Input - The first Transactional Function allows a user to maintain

Internal Logical Files (ILFs) through the ability to add, change and delete the

data. For example, a pilot can add, change and delete navigational information

prior to and during the mission. In this case the pilot is utilizing a transaction

referred to as an External Input (EI). An External Input gives the user the

capability to maintain the data in ILF's through adding, changing and deleting its

contents.

External Output - The next Transactional Function gives the user the ability to

produce outputs. For example a pilot has the ability to separately display ground

speed, true air speed and calibrated air speed. The results displayed are derived

using data that is maintained and data that is referenced. In function point

terminology the resulting display is called an External Output (EO).

External Inquiries - The final capability provided to users through a

computerized system addresses the requirement to select and display specific

data from files. To accomplish this a user inputs selection information that is used

to retrieve data that meets the specific criteria. In this situation there is no

manipulation of the data. It is a direct retrieval of information contained on the

files. For example if a pilot displays terrain clearance data that was previously

set, the resulting output is the direct retrieval of stored information. These

transactions are referred to as External Inquiries (EQ).

In addition to the five functional components described above there are two

adjustment factors that need to be considered in Function Point Analysis.

Functional Complexity - The first adjustment factor considers the Functional

Complexity for each unique function. Functional Complexity is determined based

on the combination of data groupings and data elements of a particular function.

The number of data elements and unique groupings are counted and compared

to a complexity matrix that will rate the function as low, average or high

complexity. Each of the five functional components (ILF, EIF, EI, EO and EQ)

has its own unique complexity matrix.

Table 3.1 shows the complexity rating matrix for the different categories

calculated.

For ILF and EIF

Record Elements     Data Elements
                    1 - 19     20 - 50    51+
1                   Low        Low        Avg
2 - 5               Low        Avg        High
6+                  Avg        High       High

For EO and EQ

File Types          Data Elements
                    1 - 5      6 - 19     20+
0 or 1              Low        Low        Avg
2 - 3               Low        Avg        High
4+                  Avg        High       High

For EI

File Types          Data Elements
                    1 - 4      5 - 15     16+
0 or 1              Low        Low        Avg
2                   Low        Avg        High
3+                  Avg        High       High

Table 3.1: Function point complexity matrix [11]

Table 3.2 shows the complexity weight matrix that must be applied after the

functions have been categorized and their complexities determined.

Function Type               Complexity Weight
                            Low    Average    High
Internal Logical Files      7      10         15
External Interface Files    5      7          10
External Inputs             3      4          6
External Outputs            4      5          7
External Inquiries          3      4          6

Table 3.2: Function point complexity-weight matrix [11]

All of the functional components are analyzed in this way and added together to

derive an Unadjusted Function Point count (UFP).


UFP = \sum_{i} X_i W_i    Equation 3.2

Where X_i is the number of functions of a specific type and complexity, and W_i is

the corresponding complexity weight listed in Table 3.2.
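
Equation 3.2 can be sketched as a weighted sum over the counted functions. The weights are those of Table 3.2; the function counts in the example are hypothetical and serve only to show the arithmetic.

```python
# Complexity weights from Table 3.2, keyed by function type and complexity.
WEIGHTS = {
    "ILF": {"low": 7, "average": 10, "high": 15},
    "EIF": {"low": 5, "average": 7, "high": 10},
    "EI": {"low": 3, "average": 4, "high": 6},
    "EO": {"low": 4, "average": 5, "high": 7},
    "EQ": {"low": 3, "average": 4, "high": 6},
}


def unadjusted_function_points(counts):
    """Sum count * weight over every (function type, complexity) pair."""
    return sum(n * WEIGHTS[ftype][level] for (ftype, level), n in counts.items())


# Hypothetical counts for a small system.
counts = {
    ("ILF", "low"): 2,       # 2 * 7  = 14
    ("EI", "average"): 5,    # 5 * 4  = 20
    ("EO", "high"): 3,       # 3 * 7  = 21
    ("EQ", "low"): 4,        # 4 * 3  = 12
}
ufp = unadjusted_function_points(counts)  # 14 + 20 + 21 + 12 = 67
```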

Value Adjustment Factor

The Unadjusted Function Point count is then multiplied by the second adjustment

factor, called the Value Adjustment Factor or technical complexity factor (TCF).

This factor considers the system's technical and operational

characteristics and is calculated by answering 14 questions [1][29]. The factors

are:

• Data Communications. The data and control information used in the

application are sent or received over communication facilities.

• Distributed Data Processing. Distributed data or processing functions are

a characteristic of the application within the application boundary.

• Performance. Application performance objectives, stated or approved by

the user, in either response or throughput, influence (or will influence) the

design, development, installation and support of the application.

• Heavily Used Configuration. A heavily used operational configuration,

requiring special design considerations, is a characteristic of the

application.

• Transaction Rate. The transaction rate is high and influences the design,

development, installation and support.

• On-line Data Entry. On-line data entry and control information functions

are provided in the application.

• End-User Efficiency. The on-line functions provided emphasize a design

for end-user efficiency.

• On-line Update. The application provides on-line update for the internal

logical files.

• Complex Processing. Complex processing is a characteristic of the

application.

• Reusability. The application and the code in the application have been

specifically designed, developed and supported to be usable in other

applications.

• Installation Ease. Conversion and installation ease are characteristics of

the application. A conversion and installation plan and/or conversion tools

were provided and tested during the system test phase.

• Operational Ease. Operational ease is a characteristic of the application.

Effective start-up, backup and recovery procedures were provided and

tested during the system test phase.

• Multiple Sites. The application has been specifically designed, developed

and supported to be installed at multiple sites for multiple organizations.

• Facilitate Change. The application has been specifically designed,

developed and supported to facilitate change.

Each component is rated from 0 to 5, where 0 means the component has no

influence on the system and 5 means the component is essential [26]. The

technical complexity factor (TCF) can then be calculated as [19]:

TCF = 0.65 + 0.01 \sum_{i=1}^{14} F_i    Equation 3.3

Where F_i is the rating (0 to 5) given to the i-th of the 14 factors listed above. The

TCF can range from 0.65 to 1.35 because a figure of 0.65 would result if all the

complexity factors had no influence, and a figure of 1.35 would indicate all the

complexity factors were rated essential.

Each of these factors is scored based on its influence on the system being

counted. The resulting score will increase or decrease the Unadjusted Function

Point count by up to 35%. This calculation provides us with the Adjusted Function

Point count. The final function point figure can then be calculated [Matson]:

$FP = UFP \times TCF$ Equation 3.4
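
Equations 3.3 and 3.4 can be illustrated with a minimal Python sketch. It assumes 14 complexity factors, which is what the stated TCF range of 0.65 to 1.35 implies when each factor is rated from 0 to 5; the function names and the sample project values are hypothetical.

```python
def technical_complexity_factor(ratings):
    """Equation 3.3: TCF = 0.65 + 0.01 * sum(F_i), each F_i rated 0..5."""
    if any(not 0 <= r <= 5 for r in ratings):
        raise ValueError("each rating must lie between 0 and 5")
    return 0.65 + 0.01 * sum(ratings)


def adjusted_function_points(ufp, ratings):
    """Equation 3.4: FP = UFP * TCF."""
    return ufp * technical_complexity_factor(ratings)


# Hypothetical project: 14 factors, all rated 3, and 200 unadjusted function points.
ratings = [3] * 14
tcf = technical_complexity_factor(ratings)    # 0.65 + 0.01 * 42
fp = adjusted_function_points(200, ratings)   # 200 * 1.07
print(f"TCF = {tcf:.2f}, FP = {fp:.1f}")      # prints: TCF = 1.07, FP = 214.0
```

With all ratings at 0 the sketch returns the lower bound of 0.65, and with all ratings at 5 the upper bound of 1.35.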

Function Points as a Sizing Metric

Function points are a synthetic measure, much like square feet or square meters,

that permits the calculation of a relative size for individual software projects,

applications, or subsystems even in their early requirements stages. Function

point counting is typically performed when a developer wants to size and

estimate development time and effort for an application or a project. In addition to

functional size, other risk and complexity factors must be considered when

estimating effort. These factors include, but are not limited to [19]:

• Development and/or maintenance tasks to be performed

• Application complexities; e.g., logical complexity, mathematical

complexity, security requirements, etc.

• Performance considerations

• Source code languages used

• Extent of reusable components from previously developed documents and

code

• Skill sets of both development and user personnel in all phases

• The process and technology to be applied in development and

maintenance

• The environment in which development and/or maintenance will take

place

When the impact of selected risk and complexity factors is considered, the

effort required for development or maintenance of a certain range of

function points can be estimated accurately.

An Approach to Counting Function Points

Function point counting can be accomplished with minimal documentation.

However, the accuracy and efficiency of the counting improves with appropriate

documentation. Examples of appropriate documentation are:

• Design specifications

• Display designs

• Data requirements (Internal and External)

• Description of user interfaces

Function point counts are calculated during a counting workshop and documented with

both a diagram that depicts the application and worksheets that contain the

details of each function discussed.

Benefits of Function Point Analysis

Organizations that adopt Function Point Analysis as a software metric realize

many benefits including: improved project estimating; understanding project and

maintenance productivity; managing changing project requirements; and

gathering user requirements. Each of these is discussed below.

Estimating software projects is as much an art as a science. While there are

several environmental factors that need to be considered in estimating projects,

two key data points are essential. The first is the size of the deliverable. The

second addresses how much of the deliverable can be produced within a defined

period of time. Size can be derived from Function Points, as described above.

The second requirement for estimating is determining how long it takes to

produce a function point. This delivery rate can be calculated based on past

project performance or by using industry benchmarks. The delivery rate is

expressed in function points per hour (FP/Hr) and can be applied to similar

proposed projects to estimate effort (i.e. Project Hours = estimated project

function points ÷ FP/Hr).
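
The delivery-rate calculation can be sketched in a few lines of Python. The function names and the historical figures are hypothetical; only the relationship between size, delivery rate and hours comes from the text.

```python
def delivery_rate(past_function_points, past_hours):
    """Delivery rate in function points per hour, taken from a completed project."""
    return past_function_points / past_hours


def estimated_hours(project_function_points, fp_per_hour):
    """Project Hours = estimated project function points / (FP/Hr)."""
    if fp_per_hour <= 0:
        raise ValueError("delivery rate must be positive")
    return project_function_points / fp_per_hour


# Hypothetical history: 400 FP delivered in 3,200 hours.
rate = delivery_rate(400, 3200)        # 0.125 FP/Hr
hours = estimated_hours(250, rate)     # 250 / 0.125 = 2000.0 hours
print(rate, hours)
```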

Productivity measurement is a natural output of Function Point Analysis [19].

Since function points are technology independent, they can be used as a vehicle

to compare productivity across dissimilar tools and platforms. More importantly,

they can be used to establish a productivity rate (i.e. FP/Hr) for a specific tool set

and platform. Once productivity rates are established they can be used for

project estimating as described above and tracked over time to determine the

impact continuous process improvement initiatives have on productivity.

3.3 Conclusion

The basis of the SLOC measure is that program length can be used as a predictor

of program characteristics such as effort and ease of maintenance. The

advantage of SLOC is that it is simple to measure. The disadvantages of SLOC

include:

• It cannot measure the size of a specification.

• It characterises only one specific view of size, namely length; it

takes no account of functionality or complexity

• Inadequate software design may cause excessive lines of code

• It is language dependent

• Users cannot easily understand it

On the other hand, function points can be used as an estimation variable to

determine the size of each element of the software, or as baseline metrics

collected from past projects and used in conjunction with estimation variables to

develop cost and effort projections.

The advantages of function points include:

• It is not restricted to code

• Language independent

• The necessary data is available early in a project.

• Layout independent

The disadvantages of function points include:

• Subjective counting 1

• Hard to automate and difficult to compute

• Ignores quality of output

• Oriented to traditional data processing applications

Selecting a size estimation method will depend on the preference and experience

of the firm. For the best result, both methods should be used and their results

compared, although this will increase the cost of the estimation process.

Once the size is estimated the effort and cost must be determined. These values

may be presented to possible customers or management for new projects or

form part of a motivation for new developments. These methods will be

discussed in Chapter 4.

1 A paper by [JEFFERY] concluded that there was a 30% variation between analysts counting function points.

Chapter 4

Estimation Methods

4.1. Software Life-cycle Management (SLIM) Method

Putnam developed a constraint model called SLIM to be applied to projects

exceeding 70,000 lines of code. Putnam's model [27] assumes that effort for

software projects is distributed similarly to a collection of Rayleigh curves.

Putnam suggests that staffing rises smoothly during the project and then drops

sharply during acceptance testing. The SLIM model is expressed as two

equations describing the relation between the development effort and the

schedule. The first equation, called the software equation, states that

development effort is proportional to the cube of the size and inversely

proportional to the fourth power of the development time [Fenton]. The second

equation, the manpower-buildup equation, states that the effort is proportional to

the cube of the development time.

The Rayleigh curve represents manpower as a function of time [25]. SLIM uses

separate Rayleigh curves for design and code, test and validation, maintenance,

and management. A typical Rayleigh curve is shown in Figure 4.1.

[Figure 4.1: A typical Rayleigh curve. Vertical axis: percent of total effort; horizontal axis: time.]
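
The shape of such a curve can be sketched numerically. The text does not give the curve's formula, so the sketch below assumes the commonly cited Norden-Rayleigh form, m(t) = 2·K·a·t·exp(−a·t²), where K is the total effort and a = 1/(2·t_peak²); the function names and the sample project are hypothetical.

```python
import math


def staffing_rate(t, total_effort, peak_time):
    """Staff level at time t: m(t) = 2*K*a*t*exp(-a*t**2), a = 1/(2*peak_time**2)."""
    a = 1.0 / (2.0 * peak_time ** 2)
    return 2.0 * total_effort * a * t * math.exp(-a * t ** 2)


def cumulative_effort(t, total_effort, peak_time):
    """Effort spent up to time t: K * (1 - exp(-a*t**2))."""
    a = 1.0 / (2.0 * peak_time ** 2)
    return total_effort * (1.0 - math.exp(-a * t ** 2))


# Hypothetical project: 100 person-years in total, staffing peaks after 2 years.
for year in (0.5, 1.0, 2.0, 3.0, 4.0):
    rate = staffing_rate(year, 100, 2.0)
    spent = cumulative_effort(year, 100, 2.0)
    print(f"t = {year:.1f}  staff = {rate:5.1f}  cumulative = {spent:5.1f}")
```

Under this assumed form, roughly 39% of the total effort has been spent by the time staffing peaks.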

Development effort is assumed to represent only 40 percent of the total life cycle

cost. Requirements specification is not included in the model. Estimation using

SLIM is not expected to take place until design and coding.

The Software Equation

Putnam used some empirical observations about productivity levels to derive the

software equation from the basic Rayleigh curve equation [11]. The software

equation is expressed as:

$Size = C \cdot E^{1/3} \cdot Time^{4/3}$ Equation 4.1

Where

• C is a technology factor. The technology constant, C, combines the effect

of using tools, languages, methodology, quality assurance procedures,

standards etc. It is determined on the basis of historical data (past

projects). C is determined from project size, area under effort curve, and

project duration.

• Size is the quantity of function created in source lines of code written,

function points, objects, or other measures of function.

• E is the total project effort in person years. It includes all categories of

labor used on the project.

• Time is the elapsed calendar development time from the start of detailed

design until the product is ready to enter into operational service

(frequently this is a 95% reliability level).

SLIM is applicable to all types and sizes of software projects. It computes

schedule, effort, cost, staffing for all software development phases and reliability

for the main development phase. It works with software languages, and function

points as well as other sizing metrics. It is specifically designed to address the

concerns of senior management, such as:

• What options are available if the schedule is accelerated by four months to

meet a tight market window?

• How many people must be added to get two months of schedule compression

and how much will it cost?

• When will the defect rate be low enough so that a reliable product can be

marketed and have satisfied customers?

• If the requirements grow or substantially change, what will be the impact on

schedule, cost, and reliability?

• How can the value of the process improvement program be quantified?

SLIM can record and analyze data from previously completed projects which are

then used to calibrate the model; or if data are not available then a set of

questions can be answered to get values of FP from the existing database.

The Rayleigh-Putnam Curve uses a negative exponential curve as an indicator of

cumulative staff-power distribution over time during a project. The technology

factor is a composite cost driver involving the following primary components:

• Overall process maturity and management practices

• The extent to which good software engineering practices are used

• The level of programming languages used

• The state of the software environment

• The skills and experience of the software team

• The complexity of the application

The software equation includes a fourth power and therefore has strong

implications for resource allocation on large projects. Relatively small extensions

in delivery date can result in substantial reductions in effort [26].
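
The effect of the fourth power can be made concrete with a short Python sketch. It solves the software equation for effort, E = (Size / C)³ / Time⁴; the technology factor and project size used here are hypothetical values chosen only to keep the arithmetic easy to follow.

```python
def slim_effort(size, technology_factor, time_years):
    """Software equation solved for effort: E = (Size / C)**3 / Time**4 (person-years)."""
    return (size / technology_factor) ** 3 / time_years ** 4


# Hypothetical inputs: 100,000 SLOC and a technology factor C of 10,000.
baseline = slim_effort(100_000, 10_000, 2.0)   # 10**3 / 2**4 = 62.5 person-years
extended = slim_effort(100_000, 10_000, 2.2)   # same product, 10% more time
saving = 1 - extended / baseline               # 1 - 1/1.1**4, roughly 0.32
print(f"{baseline:.1f} -> {extended:.1f} person-years ({saving:.0%} less effort)")
```

In this sketch a 10% schedule extension reduces the estimated effort by roughly a third, regardless of the size and technology factor chosen, because both cancel out of the ratio.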

The Manpower-Buildup Equation

To allow effort estimation, Putnam introduced the manpower-buildup equation

[11]:

$D = E / t^3$ Equation 4.2

where D is a constant called manpower acceleration, E is the total project effort

in person-years, and t is the elapsed time to delivery in years.

The manpower acceleration is 12.3 for new software with many interfaces and

interactions with other systems, 15 for standalone systems, and 27 for

reimplementations of existing systems [Putnam].

Using the software and manpower-buildup equations, the effort [11] can be

solved:

$E = D^{4/7} \cdot (Size / C)^{9/7}$ Equation 4.3

This equation is interesting because it shows that effort is proportional to size to

the power 9/7, or approximately 1.286, which is similar to Boehm's factor [4], which ranges from

1.05 to 1.20.
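
The combined solution can be checked with a small Python sketch. It computes effort as D^(4/7) · (Size / C)^(9/7), which follows from eliminating time between the software equation and the manpower-buildup equation, and then recovers the delivery time from D = E / t³. The input values are hypothetical.

```python
def slim_effort_and_time(size, technology_factor, manpower_acceleration):
    """Return (effort in person-years, delivery time in years)."""
    ratio = size / technology_factor
    effort = manpower_acceleration ** (4 / 7) * ratio ** (9 / 7)
    time_years = (effort / manpower_acceleration) ** (1 / 3)   # from D = E / t**3
    return effort, time_years


# Hypothetical inputs: 100,000 SLOC, C = 10,000, D = 15 (standalone system).
effort, time_years = slim_effort_and_time(100_000, 10_000, 15)
print(f"effort = {effort:.1f} person-years, time = {time_years:.2f} years")

# Consistency check: the result should satisfy the software equation again.
size_back = 10_000 * effort ** (1 / 3) * time_years ** (4 / 3)
print(f"size recovered from the software equation: {size_back:.0f}")
```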

Inputs

The primary input for SLIM is SLOC, function points or any valid measure of

function to be created [27]. The model uses size ranges for input: minimum,

most likely, and maximum. Other important inputs include:

• Language: Multiple choices and mixes.

• System Type: One of nine (business, scientific, command & control, real

time, etc.).

• Environmental Information: Tools, methods, practices, database usage;

standards in place and adherence and usage of those standards.

• Experience: Personnel skill and qualifications.

• Process Productivity Parameter: a macroscopic factor determined by

calibration from historical data. It is a reliable tuning factor that accurately

reflects application complexity and the efficiency of the organization in

building software. This is a sensitive parameter that is capable of

measuring real productivity and process improvement. SLIM contains an

expert system to determine the Process Productivity Parameter when the

user has no historical data. This (non-linear) parameter is dealt with in

terms of a linear scale ranging from 0 to 40.

• Management Constraints: Maximum allowable schedule, minimum cost,

maximum and minimum staff size, required reliability at the time the

software goes into service as well as the desired probabilities for each of

these constraints.

• Accounting: Labor rates, inflation rates, and other economic factors.

• Flexibility: Extensive tailoring for milestones, phase definitions, and fraction

of time and effort applied to each phase based on the organization's own

history.

Processing

There are three primary modes of operation: building and using an historical

database, performing estimating and analysis, and creating presentations and

reports [27].

For estimation, SLIM uses the software equation in conjunction with

management constraints for schedule, cost, staffing and required reliability to

determine an optimal solution with the highest probability of successful

completion. Through Monte Carlo simulation techniques, the size range

estimates are mapped through the software equation to provide estimates of the

uncertainty in schedule, cost, staffing and reliability. The solution obtained can be

compared with the user's historical data to test its reasonableness. This

discloses impossible or highly improbable solutions so that expensive mistakes

are avoided.
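
How a size range can be mapped through the software equation by simulation is sketched below. This is not SLIM's actual procedure, which the text does not detail: the triangular distribution over (minimum, most likely, maximum), the fixed schedule, and all names and input values are assumptions made for illustration.

```python
import random
import statistics


def simulate_effort(size_min, size_likely, size_max,
                    technology_factor, time_years, runs=10_000, seed=1):
    """Sample sizes and map each through E = (Size / C)**3 / Time**4."""
    rng = random.Random(seed)   # fixed seed so the sketch is repeatable
    efforts = []
    for _ in range(runs):
        size = rng.triangular(size_min, size_max, size_likely)
        efforts.append((size / technology_factor) ** 3 / time_years ** 4)
    efforts.sort()
    return {
        "mean": statistics.fmean(efforts),
        "p10": efforts[int(0.10 * runs)],
        "p90": efforts[int(0.90 * runs)],
    }


# Hypothetical size range of 80,000 to 130,000 SLOC, most likely 100,000.
result = simulate_effort(80_000, 100_000, 130_000, 10_000, 2.0)
print({name: round(value, 1) for name, value in result.items()})
```

The spread between the 10th and 90th percentiles gives a rough indication of how uncertainty in size translates into uncertainty in effort.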

Outputs

The primary output of SLIM is the optimal solution, which provides development

time, cost, effort and reliability expected at delivery [27]. It also provides

comprehensive sensitivity and risk profiles for all key input and output variables,

and a consistency check with similar projects. SLIM's graphical interactive user

interface makes it easy to explore quickly extensive tradeoff and "what if"

scenarios including design to cost, schedule, effort and risk. It has 181 different

output tables and graphs from which the user can choose. These outputs

constitute a comprehensive set of development plans to measure and control the

project while it is underway.

Calibration

The process productivity parameter for SLIM can (and should) be obtained by

calibration using historical data. All that is required are project size, development

time and effort. These numbers are input into the software equation to solve for

the process productivity. The historical data can also be compared with

any current solution to check it for reasonableness.
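
Calibration can be sketched by solving the software equation for the productivity term, C = Size / (E^(1/3) · Time^(4/3)), for each completed project. Averaging the per-project values is an assumption made here for simplicity, and the historical records are hypothetical.

```python
def process_productivity(size, effort_person_years, time_years):
    """Software equation solved for C: Size / (E**(1/3) * Time**(4/3))."""
    return size / (effort_person_years ** (1 / 3) * time_years ** (4 / 3))


def calibrate(history):
    """Average the productivity values of completed projects (a simplifying assumption)."""
    values = [process_productivity(*project) for project in history]
    return sum(values) / len(values)


# Hypothetical history: (size in SLOC, effort in person-years, time in years).
history = [(100_000, 62.5, 2.0), (40_000, 8.0, 1.5)]
for project in history:
    print(project, round(process_productivity(*project)))
print("calibrated value:", round(calibrate(history)))
```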

4.2. Constructive Cost Model (COCOMO II)

The COCOMO (Constructive Cost Model) cost and schedule estimation model

was originally published by Boehm [3]. It became one of the most popular parametric

cost estimation models of the 1980s. But COCOMO '81 experienced difficulties in

estimating the costs of software developed to new life-cycle processes and

capabilities. The COCOMO II research effort was started in 1994 at the University of

Southern California to address the issues of non-sequential and rapid development

process models, reengineering, reuse-driven approaches and object-oriented

approaches.

COCOMO II was initially published in the Annals of Software Engineering in 1995

[5]. The model has three sub models, Applications Composition, Early Design

and Post-Architecture, which can be combined in various ways to deal with the

current and likely future software practices marketplace.

The Application Composition model is used to estimate effort and schedule on

projects that use Integrated Computer Aided Software Engineering tools for rapid

application development. These projects are too diversified to be handled by

prepackaged solutions, but sufficiently simple to be rapidly composed from

interoperable components. Typical components are GUI builders, database or

object managers, middleware for distributed

processing or transaction processing and domain specific components such as

financial, medical or industrial process control packages.

The Early Design model involves the exploration of alternative system

architectures and concepts of operation. Typically, not enough is known to make

a detailed fine-grain estimate. This model is based on function points (or lines of

code when available) and a set of five scale factors and seven effort multipliers.

The Post-Architecture model is used when top level design is complete and

detailed information about the project is available and as the name suggests, the

software architecture is well defined and established. It estimates for the entire

development life-cycle and is a detailed extension of the Early-Design model. It

uses Source Lines of Code and/or Function Points for the sizing parameter,

adjusted for reuse and breakage; a set of 17 effort multipliers and a set of 5 scale

factors that determine the economies/diseconomies of scale of the software

under development.

Cost factors are also evaluated and weighted within COCOMO II for application

complexity and software reliability; execution, memory, and environmental

constraints; development personnel skill levels; tools and technologies; and a

variety of other considerations.

COCOMO avoids estimating labor costs in monetary value because of the large

variations between organizations in what is included in labor costs, and because

person-months are a more stable quantity than monetary value, given current

inflation rates and international money fluctuations. In order to convert COCOMO

person-month estimates into rand estimates, the best compromise between

simplicity and accuracy is to apply a different average rand per person-month

figure for each major phase, to account for inflation and the differences in salary

level of the people required for each phase.
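
The per-phase conversion described above can be sketched as follows. The phase names, person-month split and rand rates are all hypothetical; only the idea of applying a different average rate to each phase comes from the text.

```python
def labour_cost(person_months_by_phase, rand_per_person_month):
    """Sum over phases of person-months times that phase's average rand rate."""
    return sum(pm * rand_per_person_month[phase]
               for phase, pm in person_months_by_phase.items())


# Hypothetical split of a 120 person-month estimate and hypothetical rates.
effort = {"design": 20, "coding": 60, "testing": 40}
rates = {"design": 45_000, "coding": 35_000, "testing": 30_000}

# 20*45,000 + 60*35,000 + 40*30,000 = 4,200,000
print(f"R{labour_cost(effort, rates):,}")
```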

COCOMO II Model Rationale and Elaboration

The rationale for providing this mix of models (application composition, early

design and post-architecture models) rests on three primary premises.

First, current and future software projects will be tailoring their processes to their

particular process drivers. These process drivers include reusable software

availability; degree of understanding of architectures and requirements; market

window or other schedule constraints; size; and required reliability (see [5] for an

example of such tailoring guidelines).

Second, the granularity of the software cost estimation model used needs to be

consistent with the granularity of the information available to support software

cost estimation. In the early stages of a software project, very little may be known

about the size of the product to be developed, the nature of the target platform,

the nature of the personnel to be involved in the project, or the detailed specifics

of the process to be used.

Third, given the situation in premises 1 and 2, COCOMO II enables projects to

furnish coarse-grained cost driver information in the early project stages, and

increasingly fine-grained information in later stages. Consequently, COCOMO II

does not produce point estimates of software cost and effort, but rather range

estimates tied to the degree of definition of the estimation inputs.

Modeling Software Economies and Diseconomies of Scale

Software cost estimation models often have an exponential factor to account for

the relative economies or diseconomies of scale encountered as a software

project increases its size. This factor is generally represented as the exponent B

in the COCOMO effort equation [5]:

$PM_{nominal} = A \times (Size)^B$ Equation 4.4

Where

$PM_{nominal}$ (or E) is person-months of estimated effort.

A is a coefficient that is provisionally set to a default value of 2.5, but should be

set to reflect a specific organization's cost and culture.

B is an exponential factor to account for the relative economies or diseconomies

of scale encountered in different size software projects.

If the value of B is smaller than 1.0, the project exhibits economies of scale. This

means that if the product's size is doubled, the project effort is less than doubled.

The project's productivity increases as the product size is increased.

Some project economies of scale can be achieved via project-specific tools (e.g.,

simulations), but in general these are difficult to achieve. For small projects, fixed

startup costs such as tool tailoring and setup of standards and administrative

reports are often a source of economies of scale.

If B = 1.0, the economies and diseconomies of scale are in balance. This linear

model is often used for cost estimation of small projects. It is used for the

COCOMO II Applications Composition model.

If B > 1.0, the project exhibits diseconomies of scale. This is generally due to two

main factors: growth of interpersonal communications overhead and growth of

large-system integration overhead. Larger projects will have more personnel, and

thus more interpersonal communications paths consuming overhead. Integrating

a small product as part of a larger product requires not only the effort to develop

the small product, but also the additional overhead effort to design, maintain,

integrate, and test its interfaces with the remainder of the product.
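
The three cases for the exponent B can be compared with a short Python sketch. It evaluates the nominal effort equation, PM = A · Size^B with Size in KSLOC, and the factor by which effort grows when size doubles; the exponent values are hypothetical and chosen only to show each case.

```python
def nominal_effort(size_ksloc, a, b):
    """Nominal effort equation: PM = A * Size**B, with Size in KSLOC."""
    return a * size_ksloc ** b


def effort_growth_when_size_doubles(b):
    """Ratio PM(2 * Size) / PM(Size); the coefficient A cancels out."""
    return 2 ** b


for b in (0.95, 1.00, 1.10):   # hypothetical exponents: economy, balance, diseconomy
    print(f"B = {b:.2f}: doubling size multiplies effort by "
          f"{effort_growth_when_size_doubles(b):.3f}")
```

A growth factor below 2 corresponds to economies of scale, exactly 2 to the linear case, and above 2 to diseconomies of scale.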

A multiplicative constant, A, is used to calibrate the model locally for a better fit

and it captures the linear effects of effort in projects of increasing size. The

coefficient A in the equation is provisionally set at 3.0. Initial calibration of

COCOMO II to the original COCOMO project database [5] indicates that this is a

reasonable starting point. This value must be adjusted as the size of the project

varies.

Scaling Approach

The COCOMO II scaling value is integrated into a single rating-driven model.

Table 4.1 lists a summary of the scale drivers and the rating criteria. A project's

numerical ratings Wi are summed across all of the factors, and used to determine

a scale exponent B via the following equation [6]:

$B = 1.01 + 0.01 \times \sum W_i$ Equation 4.5

Thus, a 100 KSLOC project with Extra High (0) ratings for all factors will have

$\sum W_i = 0$, B = 1.01, and a relative effort $E = 100^{1.01} = 105$ PM.

A project with Very Low (5) ratings for all factors will have $\sum W_i = 25$, B = 1.26, and

a relative effort $E = 100^{1.26} = 331$ PM. This represents a large variation, but the increase

involved in a one-unit change in one of the factors is only about 4.7%. Thus, this

approach avoids the 40% swings involved in choosing a development mode for a

100 KSLOC product in the original COCOMO model.
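
The worked figures above can be reproduced with a few lines of Python. The function names are hypothetical; the formula and the 100 KSLOC example are those given in the text, with five scale factors each rated from 0 (Extra High) to 5 (Very Low).

```python
def scale_exponent(ratings):
    """Scale exponent: B = 1.01 + 0.01 * sum(W_i)."""
    return 1.01 + 0.01 * sum(ratings)


def relative_effort(size_ksloc, ratings):
    """Relative effort Size**B (the coefficient A is left out, as in the text)."""
    return size_ksloc ** scale_exponent(ratings)


best = relative_effort(100, [0] * 5)    # all factors Extra High
worst = relative_effort(100, [5] * 5)   # all factors Very Low
one_step = 100 ** 0.01 - 1              # effect of a one-unit change at 100 KSLOC
print(round(best), round(worst), f"{one_step:.1%}")   # prints: 105 331 4.7%
```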

| Scale Factors (W_i) | Very Low | Low | Nominal | High | Very High | Extra High |
|---|---|---|---|---|---|---|
| PREC | thoroughly unprecedented | largely unprecedented | somewhat unprecedented | generally familiar | largely familiar | thoroughly familiar |
| FLEX | rigorous | occasional relaxation | some relaxation | general conformity | some conformity | general goals |
| RESL* | little (20%) | some (40%) | often (60%) | generally (75%) | mostly (90%) | full (100%) |
| TEAM | very difficult interactions | some difficult interactions | basically cooperative interactions | largely cooperative | highly cooperative | seamless interactions |
| PMAT† | Weighted average of "Yes" answers to CMM Maturity Questionnaire (all rating levels) |

Table 4.1: Rating scheme for the COCOMO II scale factors (see footnote 2) [6]

Appendix A lists a full description of the meaning of each scaling driver.

Cost Factors: Effort-Multiplier Cost Drivers

COCOMO II uses a set of effort multipliers to adjust the nominal person-month

estimate obtained from the project's size and exponent drivers [6]:

$PM_{adjusted} = PM_{nominal} \times \prod_{i=1}^{17} EM_i$ Equation 4.6
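
The adjustment in Equation 4.6 amounts to multiplying the nominal estimate by the product of the 17 effort multipliers, as the following Python sketch shows. It assumes that a driver rated Nominal contributes a multiplier of 1.0; the two non-nominal values are hypothetical and are not taken from the COCOMO II calibration.

```python
import math


def adjusted_effort(nominal_pm, effort_multipliers):
    """Equation 4.6: PM_adjusted = PM_nominal * product of all EM_i."""
    return nominal_pm * math.prod(effort_multipliers)


# 17 drivers, all assumed Nominal (1.0) except two hypothetical ratings.
multipliers = [1.0] * 17
multipliers[0] = 1.15   # hypothetical: one driver rated above nominal
multipliers[1] = 0.85   # hypothetical: one driver rated below nominal

pm = adjusted_effort(100, multipliers)   # 100 * 1.15 * 0.85
print(f"{pm:.2f} person-months")          # prints: 97.75 person-months
```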

Table 4.2 summarizes the COCOMO II effort-multiplier cost drivers by the four

categories of Product, Platform, Personnel, and Project Factors. The superscripts

following the cost driver names indicate the differences between the COCOMO

II cost drivers and their counterparts in the original COCOMO model.

2 * % significant module interfaces specified, % significant risks eliminated.

† The form of the Process Maturity scale is being resolved in coordination with the SEI. The intent

is to produce a process maturity rating as a weighted average of the project's percentage

compliance levels to the 18 Key Process Areas in Version 1.1 of the Capability Maturity Model

[Paulk 1993] rather than to use the previous 1-to-5 maturity levels. The weights to be

applied to the Key Process Areas are still being determined.

| Cost Driver | Very Low | Low | Nominal | High | Very High | Extra High |
|---|---|---|---|---|---|---|
| RELY | slight inconvenience | low, easily recoverable losses | moderate, easily recoverable losses | high financial loss | risk to human life | |
| DATA | | DB bytes/Pgm SLOC < 10 | 10 ≤ D/P < 100 | 100 ≤ D/P < 1000 | D/P ≥ 1000 | |
| CPLX | see Appendix C (all rating levels) |
| RUSE | | none | across project | across program | across product line | across multiple product lines |
| DOCU | many life-cycle needs uncovered | some life-cycle needs uncovered | right-sized to life-cycle needs | excessive for life-cycle needs | very excessive for life-cycle needs | |
| TIME | | | 50% use of available execution time | 70% | 85% | 95% |
| STOR | | | 50% use of available storage | 70% | 85% | 95% |
| PVOL | | major change every 12 mo.; minor change every 1 mo. | major: 6 mo.; minor: 2 wk. | major: 2 mo.; minor: 1 wk. | major: 2 wk.; minor: 2 days | |
| ACAP | 15th percentile | 35th percentile | 55th percentile | 75th percentile | 90th percentile | |
| PCAP | 15th percentile | 35th percentile | 55th percentile | 75th percentile | 90th percentile | |
| PCON | 48% / year | 24% / year | 12% / year | 6% / year | 3% / year | |
| AEXP | ≤ 2 months | 6 months | 1 year | 3 years | 6 years | |
| PEXP | ≤ 2 months | 6 months | 1 year | 3 years | 6 years | |
| LTEX | ≤ 2 months | 6 months | 1 year | 3 years | 6 years | |
| TOOL | edit, code, debug | simple, front-end, back-end CASE, little integration | basic lifecycle tools, moderately integrated | strong, mature lifecycle tools, moderately integrated | strong, mature, proactive lifecycle tools, well integrated with processes, methods, reuse | |
| SITE: Collocation | international | multi-city and multi-company | multi-city or multi-company | same city or metro area | same building or complex | fully collocated |
| SITE: Communications | some phone, mail | individual phone, FAX | narrowband email | wideband electronic communication | wideband electronic communication, occasional video conferencing | interactive multimedia |
| SCED | 75% of nominal | 85% | 100% | 130% | 160% | |

Table 4.2: Effort multiplier cost driver ratings for the Post-Architecture model [6].

Table 4.2 provides the COCOMO II effort multiplier rating scales. Appendix F lists a full description of the meaning of each effort multiplier cost driver.

Development Schedule Estimates

The initial baseline schedule equation for all three COCOMO II models is3 [6]:

$TDEV = [3.67 \times PM^{(0.28 + 0.2 \times (B - 1.01))}] \times \frac{SCED\%}{100}$ Equation 4.7

where TDEV is the calendar time in months from the determination of its

requirements baseline to the completion of an acceptance activity certifying that

the product satisfies its requirements. PM is the estimated person-months

excluding the SCED effort multiplier, and SCED% is the schedule

compression/expansion percentage in the SCED cost driver rating.
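
The schedule equation can be evaluated with a minimal Python sketch. It assumes the form TDEV = [3.67 · PM^(0.28 + 0.2 · (B − 1.01))] · SCED%/100, which is how Equation 4.7 is read here; the function name and the sample inputs are hypothetical.

```python
def development_schedule(person_months, b, sced_percent=100.0):
    """TDEV in calendar months for a given effort, scale exponent B and SCED%."""
    exponent = 0.28 + 0.2 * (b - 1.01)
    return 3.67 * person_months ** exponent * sced_percent / 100.0


# Hypothetical project: 100 person-months, B = 1.01, nominal schedule (100%).
nominal = development_schedule(100, 1.01)
compressed = development_schedule(100, 1.01, sced_percent=75)   # 75% of nominal
print(f"nominal: {nominal:.1f} months, compressed: {compressed:.1f} months")
```

For these inputs the nominal schedule comes to about 13.3 months.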

Early Design effort multiplier cost drivers

In Early Design, however, a reduced set of effort multiplier cost drivers is used.

These are obtained by combining the Post-Architecture cost drivers as shown in

Table 4.3.

The resulting seven cost drivers are easier to estimate in early stages of software

development than the 17 Post-Architecture cost drivers. However, their larger

productivity ranges (up to 5.45 for PERS and 5.21 for RCPX) stimulate more

variability in their resulting estimates. This situation is addressed by assigning a

higher standard deviation to Early Design (versus Post-Architecture) Estimates.

3 PM is the estimated person-months excluding the SCED effort multiplier. SCED% is the compression/expansion percentage in the SCED effort multiplier in table

| Early Design Cost Driver | Counterpart Combined Post-Arch. Cost Drivers |
|---|---|
| RCPX | RELY, DATA, CPLX, DOCU |
| RUSE | RUSE |
| PDIF | TIME, STOR, PVOL |
| PERS | ACAP, PCAP, PCON |
| PREX | AEXP, PEXP, LTEX |
| FCIL | TOOL, SITE |
| SCED | SCED |

Table 4.3: Early Design and Post-Architecture cost drivers [5].

4.3. Expertise-Based Technique

Expertise-based techniques are useful in the absence of quantified, empirical data.

They capture the knowledge and experience of practitioners seasoned within a

domain of interest, providing estimates based upon a synthesis of the known

outcomes of all the past projects to which the expert is privy or in which he or she

participated. The obvious drawback to this method is that an estimate is only as

good as the expert's opinion, and there is usually no way to test that opinion until

it is too late to correct the damage if that opinion proves wrong. Years of

experience do not necessarily translate into high levels of competency.

Delphi Technique

The Delphi technique [13] was developed at The Rand Corporation in the late

1940s originally as a way of making predictions about future events. More

recently, the technique has been used as a means of guiding a group of informed

individuals to a consensus of opinion on some issue.

Participants are asked to make some assessment regarding an issue,

individually in a preliminary round, without consulting the other participants in the

exercise. The first round results are then collected, tabulated, and then returned

to each participant for a second round, during which the participants are again

asked to make an assessment regarding the same issue, but this time with

knowledge of what the other participants did in the first round. The second round

usually results in a narrowing of the range in assessments by the group, pointing

to some reasonable middle ground regarding the issue of concern. The original

Delphi technique avoided group discussion; the Wideband Delphi technique [5]

accommodated group discussion between assessment rounds.

This is a useful technique for coming to some conclusion regarding an issue

when the only information available is based more on "expert opinion" than hard

empirical data.

It becomes more obvious that a number of parameters need to be determined

based on an expert's (or designer's) estimates. The accuracy of these is crucial

to the performance of the model that has to be calibrated to the needs of the

specific software organization. One may also expect that a group of experts

(designers) can do a better job than a single individual. The Delphi method helps

coordinate a process of gaining information and generating reliable estimates.

The group estimating procedure governed by the Delphi method comprises a

series of the following steps:

• Coordinator presents each expert with a specification of the proposed

project and other relevant information.

• Coordinator calls a group meeting where experts discuss the estimates.

• Experts fill out estimation forms indicating their personal estimates of total

project effort and total development effort. The estimates are given in an

interval format: the expert provides the most likely value along with an

upper and lower bound.

• Coordinator prepares and circulates the summary report indicating the

group estimates and the individual estimates.

• Coordinator calls a meeting during which experts discuss current

estimates.

This process is repeated until a consensus is reached. The group estimate is

taken as an average of the weighted individual estimates, computed as [24]

$\text{Estimate} = \frac{\text{Lower bound of estimate} + 4 \times \text{Most likely estimate} + \text{Upper bound of estimate}}{6}$ Equation 4.8

The variance of the individual estimate is defined as [24]

$\text{Variance} = \frac{\text{Upper bound} - \text{Lower bound}}{6}$ Equation 4.9

The group variance is the average of the variances of the individual estimates.
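
The weighted estimate and its spread can be sketched in Python. The sketch follows the definitions as given in the text, including the quantity (upper − lower) / 6 that the text calls the variance (in the usual three-point convention this quantity is the standard deviation). The expert intervals are hypothetical.

```python
def expert_estimate(lower, most_likely, upper):
    """Equation 4.8: (lower + 4 * most likely + upper) / 6."""
    return (lower + 4 * most_likely + upper) / 6


def expert_variance(lower, upper):
    """Equation 4.9 as defined in the text: (upper - lower) / 6."""
    return (upper - lower) / 6


def group_result(intervals):
    """Average the individual estimates and the individual variances."""
    estimates = [expert_estimate(lo, ml, hi) for lo, ml, hi in intervals]
    variances = [expert_variance(lo, hi) for lo, _, hi in intervals]
    return sum(estimates) / len(estimates), sum(variances) / len(variances)


# Hypothetical intervals from three experts, in person-months: (lower, most likely, upper).
intervals = [(8, 10, 14), (9, 12, 18), (10, 11, 15)]
group_estimate, group_variance = group_result(intervals)
print(f"group estimate = {group_estimate:.2f}, group variance = {group_variance:.2f}")
```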

4.4. Cost Estimation Method

Once the effort required has been determined the resources must be allocated to

the project. The number of resources required will depend on the person-months

and the time to complete the project. Equation 4.10 indicates how the number of

resources can be determined [24]:

$\text{Number of resources required} = \frac{\text{Effort estimated}}{\text{Calendar months estimated}}$ Equation 4.10

For an estimated 12-month project with an estimated effort of 120 person-months, the

required number of resources will be 120/12, which is ten resources (full-time

development engineers). An assumption is made that all development engineers will

be allocated full time to the project and will be part of the project in all phases of the

project (calendar months estimated).

In the case that only a limited number of resources are available, equation 4.7 is

not valid. The person months must then be divided by the number of available

resources to get the estimated calendar months.

The effort cost consists of the sum of the salaries of the development

engineers for the specific period. The effort cost can be calculated as follows

(expressed in rand) [24]

\[ \text{Effort cost} = TDEV \times \sum_{i} (\text{Cost to company})_i \]

Equation 4.11

Where

• TDEV is the calendar time in months from the determination of its

requirements baseline to the completion of an acceptance activity

certifying that the product satisfies its requirements.

• Cost to company is the direct cost of each engineer allocated to the

project.

Other resources like analysts and project managers are not incorporated in the

effort formula. These values must be determined separately and then added to

the effort cost.

Direct costs such as operating systems, development tools and licensing must also be

added, as must indirect costs such as travel expenses, training and stationery.

The sum of all these values gives the cost of the project

(equation 4.12):

\[ \text{Total Cost} = \text{Direct Cost} + \text{Indirect Cost} + \text{Effort Cost} \]

Equation 4.12
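
The staffing and cost relationships of this section (equations 4.10 to 4.12) can be combined in a short Python sketch. The function names are hypothetical, the rand amounts are invented for illustration, and the cost to company is assumed to be a monthly figure per engineer so that multiplying by TDEV in months yields a cost.

```python
def resources_required(effort_pm: float, calendar_months: float) -> float:
    """Equation 4.10: person-months divided by calendar months."""
    return effort_pm / calendar_months


def calendar_months_needed(effort_pm: float, available_resources: int) -> float:
    """Limited staff: person-months divided by the available resources."""
    return effort_pm / available_resources


def effort_cost(tdev_months: float, monthly_costs_to_company: list) -> float:
    """Equation 4.11: TDEV times the summed cost to company of the engineers."""
    return tdev_months * sum(monthly_costs_to_company)


def total_cost(direct: float, indirect: float, effort: float) -> float:
    """Equation 4.12: direct cost + indirect cost + effort cost."""
    return direct + indirect + effort


# Example from the text: 120 person-months over 12 calendar months.
staff = resources_required(120, 12)

# Hypothetical figures in rand, for illustration only.
salaries = [30_000.0] * 10  # assumed monthly cost to company per engineer
effort = effort_cost(12, salaries)
project_total = total_cost(direct=150_000.0, indirect=50_000.0, effort=effort)
print(staff, effort, project_total)
```

The cost of analysts and project managers is not part of the effort cost and would have to be added separately, as the text notes.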

4.5. Conclusion

The cost estimation process is an interesting mix of formal models and

experience. In this sense, the overall modeling process is not straightforward

and requires a significant level of skill. To produce a meaningful and reliable

estimate, the cost estimation process needs to be thoroughly arranged and

carefully followed.

From this section it seems that each estimation tool has specific strong points for

specific types of project. When estimating cost it is best to use more than

one estimation method to get a global view of the possible cost.

The advantages and disadvantages of the methods and techniques discussed are

summarized in table 4.4.

Method: SLIM

Advantages:

• Uses linear programming to consider development constraints on both cost and effort

Disadvantages:

• A study carried out by [PENGELLY] indicated that SLIM did not perform accurately on small projects. However, [LONDIEX] reported that SLIM is suitable for software developments that meet the following:
  1) Software size is greater than 5000 lines
  2) Effort greater than 1.5 man years
  3) Over 6 months development time.

• SLIM estimates are extremely sensitive to the technology factor

• Process is not transparent

Method: COCOMO II

Advantages:

• COCOMO is transparent and it can be seen how it works

• Drivers are particularly helpful to the estimator to understand the impact of different factors that affect project cost

Disadvantages:

• Extremely vulnerable to mis-classification of the development mode

• Success depends largely on tuning the model to the needs of the organization, using historical data which is not always available

• COCOMO estimates assume that the project will enjoy good management by both the developer and the customer.

• COCOMO assumes that the requirements specification is not substantially changed after the plans and requirements phase, although some refinements and reinterpretations are inevitable. Any significant modifications or added capabilities should be covered by a revised cost estimate.

Method: Expert-Based technique

Advantages:

• Group of experts are involved in the estimation process and not one individual

Disadvantages:

• The process is extremely sensitive to the technology factor

• Large amount of human resource required

• Large amount of time required

Table 4.4: Advantages and disadvantages of cost estimation methods.

Although the COCOMO II method is the most popular method used for estimation,

it does not mean that the other methods are less accurate. The best scenario

would be to use all three methods and compare the results, but this

will add a cost component to the whole estimation process.

Firms should keep a database of project history, which can serve as a useful

reference for measuring the end result of the current estimation process against

historical projects.

Chapter 5

Case Study

5.1 Project description

To demonstrate how cost estimation is used in practice, a case study will be

presented. The project under discussion is a server application to be developed

to interface with a terminal. The full technical specification of the project is listed

in Appendix G, but a summary of the project is as follows:

• Clients will be given a member card to be used at any Nu Metro cinemas.

• The card will be swiped and the system must determine the following

o Is the user a valid user

o Has the user used the card the same day

o How many tickets are allowed

• The terminal will receive a valid list of users with all the required detail

from the server

• The terminal will initialize a connection to the server on the predefined

time each day and receive the required information.

• The terminal will send client usage information to the server once all the

relevant detail has been received

The first step of estimating the cost of a project would be to analyze the

requirements of the system (Appendix G) and determine the size of the project.

5.2 Project Size estimation

In the case of this project both the function point count and source lines of code

were used. The SLOC size estimation method is discussed first.

Once all requirements were available a workgroup was set up with the following

staff members:

• Project manager

• Head software developer

• Developer assigned for the project

• Specialist developer working with the terminal

The project was assessed and broken into modules. Each member had to give

her/his view on the difficulty of the project and the number of non-commented

lines of code to be generated.

Estimates ranged between 800 and 1500 lines of code, with the specialist giving

the lowest value and the assigned engineer giving the highest value. A value of

1000 SLOC was agreed on by all parties.

Once this was completed the same team started with the function point count.

The final results were as follows:

• External inputs. The inputs were the parameter files to be received from

the terminal and the request for connection file.

• External output. The external file was the parameter file sent to the

terminal.

• External Inquiries. None

• Internal Logical Files. The processing file

• External Interface Files. The audit files and database queries.

All the function types were marked as highly complex and the value adjustment

factor would have no influence meaning that the value of TCF = 0.65.

From equation 3.2 the UFP is

UFP = 2*6 + 1*7 + 1*15 + 2*10 =54

Using equation 3.4 the FP is

FP = 0.65 * 54 = 35
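
The function point arithmetic above can be reproduced with a short Python sketch. The counts and the weights for external inputs, external outputs, internal logical files and external interface files are taken from the UFP sum in the text; the weight for external inquiries is an assumption (it has no effect here because the count is zero), and the function names are hypothetical.

```python
# Weights for function types rated "high" complexity. EI, EO, ILF and EIF
# follow the UFP sum in the text; the EQ weight is an assumption and is not
# exercised because no external inquiries were counted.
HIGH_COMPLEXITY_WEIGHTS = {"EI": 6, "EO": 7, "EQ": 6, "ILF": 15, "EIF": 10}

# Counts per function type for the case study project.
counts = {"EI": 2, "EO": 1, "EQ": 0, "ILF": 1, "EIF": 2}


def unadjusted_function_points(counts, weights):
    """Sum of count * weight over all function types (equation 3.2)."""
    return sum(counts[kind] * weights[kind] for kind in counts)


def function_points(ufp, tcf):
    """Adjusted function points: TCF * UFP (equation 3.4)."""
    return tcf * ufp


ufp = unadjusted_function_points(counts, HIGH_COMPLEXITY_WEIGHTS)
fp = function_points(ufp, tcf=0.65)
print(ufp, round(fp))
```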

To bring the two values into perspective, [5] provides a conversion table between

SLOC and FP for each development language. The project in question was

developed in Visual Basic 6, for which the conversion value from FP to SLOC is 36.

The number of SLOC for the function point count was 1200, differing by 200 from

the SLOC estimate.

On completion the number of source lines of code developed was 1253. Table 5.1

shows the difference.

| | SLOC | FP | Actual |
| Estimated | 1000 | 1200 | 1253 |
| Variance | 253 | 53 | |

Table 5.1: Estimation method variations
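
The variances in table 5.1 can also be expressed relative to the actual size, which makes the two sizing methods easier to compare. The following Python sketch is illustrative only; the function name is hypothetical and the inputs are the values from table 5.1.

```python
def estimation_variance(estimates, actual):
    """Return {method: (absolute difference, difference as fraction of actual)}."""
    return {
        method: (actual - estimate, (actual - estimate) / actual)
        for method, estimate in estimates.items()
    }


# Values from table 5.1: estimated SLOC per sizing method and the actual count.
result = estimation_variance({"SLOC": 1000, "FP": 1200}, actual=1253)
for method, (diff, rel) in result.items():
    print(f"{method}: off by {diff} lines ({rel:.1%} of actual)")
```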

5.3 Effort estimation

For this project only the COCOMO II method and the expert-based technique were

used. The individual results are listed below.

COCOMO II

To determine the effort, equation 4.4 will be used. The inputs needed are A, B

and the Size. The default value for A is used. The value for A can differ

from organization to organization. The Post-Architecture model is used because

the requirements are already defined.

The Size has already been determined as 1.2 KLOC or 1200 LOC. The only

value that still needs to be determined is the B scale driver. To determine B,

equation 4.5 will be used. The summary of the values for the effort estimation is

shown in table 5.2.

| Scale Factor | Rating | Value |
| PREC | Considerable | 3.5 |
| Development flexibility | Considerable | 2.3 |
| Architecture / Risk Resolution | Some | 3.4 |
| Team Cohesion | Medium | 2.5 |
| Process maturity | Medium | 1.63 |

Table 5.2: Summary of effort estimation values

After all the scale driver values have been determined, B can be determined with

\[ B = 1.01 + 0.01 \times \sum_{i} W_i = 1.01 + 0.01(3.5 + 2.3 + 3.4 + 2.5 + 1.63) = 1.143 \]
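
As a check on the arithmetic, the scale exponent can be computed with a few lines of Python. The dictionary keys are shorthand labels chosen here for the five scale factors of table 5.2, and the function name is hypothetical.

```python
# Scale factor values from table 5.2 (keys are shorthand labels).
scale_factors = {
    "PREC": 3.5,   # precedentedness
    "FLEX": 2.3,   # development flexibility
    "RESL": 3.4,   # architecture / risk resolution
    "TEAM": 2.5,   # team cohesion
    "PMAT": 1.63,  # process maturity
}


def scale_exponent(factors, base=1.01, step=0.01):
    """B = 1.01 + 0.01 * sum of the scale factor values."""
    return base + step * sum(factors.values())


b = scale_exponent(scale_factors)
print(round(b, 3))
```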

Now the nominal person-month value can be determined with

\[ PM_{nominal} = A \times (Size)^{B} = 1.25 \times (1.2)^{1.143} \approx 1.2 \]

The A coefficient value is taken as 1.25. From previous project data it was found

that the default of 2.5 gave an overestimate and that

calibrating the coefficient to 1.25 gave a more accurate value.

Now the nominal person-month value has to be adjusted. To adjust it,

the cost drivers have to be determined. Table 5.3 provides a summary of the 17

cost driver values. The description of each value is listed in Appendix A.

| Driver | Value |
| Required Software Reliability | 4 |
| Data Base Size | 5 |
| Product Complexity | 4 |
| Required Reusability | 2 |
| Documentation to life cycle needs | 3 |
| Execution time constraints | 3 |
| Main storage constraint | 3 |
| Platform Volatility | 3 |
| Analyst capability | 3 |
| Engineering capability | 3 |
| Application experience | 4 |
| Platform experience | 3 |
| Language and tool experience | 3 |
| Personnel continuity | 2 |
| Use of software tools | 3 |
| Multi-site development | 3 |
| Required development schedule | 2 (75%) |

Table 5.3: Summary of cost driver values

The adjusted person-months value is determined with equation 4.6:

\[ PM_{adjusted} = PM_{nominal} \times \prod_{i} EM_i = 1.2 \times 1.1705 \approx 1.4 \]

PM_adjusted without the SCED driver is 1.2 × 1.7558 ≈ 2.1.

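To determine the schedule, equation 4.7 is used, where \overline{PM} is the adjusted person-months value without the SCED driver:

\[ TDEV = \left[ 3.67 \times \overline{PM}^{\,(0.28 + 0.2 \times (B - 1.01))} \right] \times \frac{SCED\%}{100} = 3.67 \times (2.1)^{0.3066} \times 0.75 \approx 3 \text{ months} \]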
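
The adjustment and schedule steps can be sketched in Python as follows. The function names are hypothetical; the inputs (nominal person-months of 1.2, the two effort multiplier products, B = 1.143 and a SCED percentage of 75) are the values quoted in the text, and the schedule formula is assumed to have the form TDEV = 3.67 × PM^(0.28 + 0.2 × (B − 1.01)) × SCED%/100.

```python
def adjusted_pm(pm_nominal: float, em_product: float) -> float:
    """Equation 4.6: nominal person-months times the product of effort multipliers."""
    return pm_nominal * em_product


def tdev_months(pm_without_sced: float, b: float, sced_percent: float) -> float:
    """Equation 4.7 (assumed form): schedule in calendar months."""
    exponent = 0.28 + 0.2 * (b - 1.01)
    return 3.67 * pm_without_sced ** exponent * sced_percent / 100


# Values quoted in the text.
pm_adj = adjusted_pm(1.2, 1.1705)          # all 17 cost drivers
pm_adj_no_sced = adjusted_pm(1.2, 1.7558)  # without the SCED driver
schedule = tdev_months(round(pm_adj_no_sced, 1), b=1.143, sced_percent=75)
print(round(pm_adj, 1), round(pm_adj_no_sced, 1), schedule)
```

The text reports the schedule as 3 months after rounding.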

The cost of developing the code alone would be the cost to company of

one developer. Other expenses would include the hardware cost, development

software and overheads.

Expert-based technique

The expert-based technique starts with a work session. The following

members of the project were involved in the estimation process:

• Project manager

• Head software developer

• Developer assigned for the project

• Specialist developer working with the terminal

Each member had to give their estimate of how long they expected the

development to last. The results are listed in table 5.4.

| Member | Months |
| Project manager | 2 |
| Head software developer | 2.5 |
| Developer assigned for the project | 3 |
| Specialist developer working with the terminal | 2.5 |

Table 5.4: Summary of team estimation values

From equation 4.8 the estimated time = (2 + 4*2.5 + 3)/6 = 2.5 months.

The project took 2.5 months to complete, which is exactly in line with the

expert-based technique prediction.

| COCOMO II | Expert | Actual |
| 3 months | 2.5 months | 2.5 months |

Table 5.5: Comparison between COCOMO II and Expert-Based Technique

5.4 Conclusion

From the results of the size estimation it can be seen that the function point count

was more accurate than the source lines of code estimation. This can be

attributed to the fact that the project at hand is small in size and that the

requirements were available.

Function points are very accurate for small projects and start to become less

accurate as the size and complexity increase. The same goes for SLOC

estimation.

Function point counting is currently more accurate than SLOC because it

depends less on accurate early requirements and takes complex

development functions into account.

On the effort side it can be seen that the expert-based technique is more accurate

than the COCOMO II method. This can also be attributed to the fact that the

project is small in size. Most of the cost drivers are intended for medium to large

projects with medium to large teams.

The conclusion is that the expert method would be well suited for small projects.

Where small projects can be seen as projects where one or two developers are

involved and the project life span is less than months.

Chapter 6

Conclusions and Recommendations

6.1 Conclusions

This dissertation has presented an overview of a variety of software estimation

techniques, providing an overview of several popular estimation models currently

available. Literature to date [4] indicates that expertise-based techniques are less

mature than the other classes of techniques, but that all classes of techniques

are challenged by the rapid pace of change in software technology.

The baseline COCOMO II family of software cost estimation models presented

here provides an adaptable cost estimation capability well matched to the major

current and likely future software process trends. It is currently serving as the

framework for an extensive data collection and analysis effort to further refine

and calibrate its estimation capabilities.

Thus, it can be seen that the COCOMO II rating scales and effort multipliers

provide a rich quantitative framework for exploring software project and

organizational tradeoff and sensitivity analysis. The framework would enable the

project manager to explore alternative staffing options involving various mixes of

application, platform, and language and tool experience. An organization-level

manager could also explore various options for transitioning a portfolio of

applications from their current application/platform/language configuration to a

desired new configuration (e.g., by using pilot projects to build up experience

levels).

Software cost estimation is an important part of the software development

process. Models can be used to represent the relationship between effort and a

primary cost factor such as size. Cost drivers are used to adjust the preliminary

estimate provided by the primary cost factor. Although models are widely used to

predict software cost, many suffer from some common problems. The structure

of most models is based on empirical results rather than theory. Models are often

complex and rely heavily on size estimation. Despite these problems, models are

still important to the software development process. Models can be used most

effectively to supplement and corroborate other methods of estimation.

The following points were observed by the author:

• Experience and informal analogy are the primary cost estimation methods.

The majority of organizations relied on individuals' expertise and

experience to arrive at cost estimates. Managers received little or no

training in estimation. Estimators were expected to arrive at accurate

estimates by relying on their knowledge of the software process used

within the organization and recollections of their previous projects.

• Few organizations have sufficient historical data to be used for cost

estimation. With a few notable exceptions, organizations did not have

information regarding past projects recorded in a manner that was useful

and accessible to estimators. However, a number of organizations had

recently implemented programs to gather and store this data, but it will be

a few years before the impact of the data gathering on estimation

accuracy can be determined.

• Estimation cannot be improved without a well-defined and well controlled

software process. Organizations without a defined and controlled software

process cannot achieve consistency in their software development.

Without consistency in software development, consistently accurate

estimates are not possible.

• Requirements creep is a major reason for cost overruns. It can be

minimized, but cannot be eliminated. Two conclusions are drawn. First, if

cost estimates are to be accurate, the initial software requirements must

be as complete and correct as possible. Second, for complex systems, it

is impossible to generate requirements that are 100% complete and

correct. Thus, one must accept the fact that complete accuracy for

estimates of complex systems is not possible.

6.2 Recommendations

The solution to improving estimation accuracy is not a high technology issue. No

existing tools, models, or methodologies can be brought to bear on the problem

that by themselves will have a significant impact. Rather, the problem is one of

applying simple technologies, an effective software development process, and

proper management and control to achieve a consistency in development, which

allows more accurate cost estimation. Solutions to the cost estimation problem

must address the issues in all of these areas or they will not be effective.

The cost estimation problem varies considerably among organizations that do

their estimation under very different constraints. The recommendations are

general in nature and must be tailored to the individual organization's needs,

depending on whether they are maintenance groups, procurement organizations,

commercial developers, etc.

The following recommendations are based on two assumptions:

• There is significant room for improvement in the accuracy of cost

estimates for software intensive systems.

• Although there is room to improve the level of accuracy of software cost

estimates, there will continue to be a large margin of error; organizations

must adapt to accept this fact.

Software Process Improvements

Improving software cost estimation accuracy must begin with a solid and

effective software development process. An effective software process can be

used to increase accuracy in cost estimation in a number of ways.

• Formalizing when and how cost estimates and re-estimates are

performed. A critical aspect of estimation accuracy is to have a

well-defined process that defines when and how cost estimates are performed.

• The process used to perform the estimates, including who performs the

estimate and who has sign off authority on the estimate.

• Permit effective monitoring and control of software costs. No cost estimate

will be accurate without effective monitoring and control of software costs.

If there is no effective technique for monitoring and controlling the project,

there is an increased risk of the costs of the project escalating without

management being able to recognize or identify the problem at a time

when action can be taken to minimize the effect.

• Objective measure of completeness. Each WBS work item should have

clearly identified output items and an objective means of determining the

completeness of these items.

• Analyzing problems reported during the development process. Every

reported problem with either a product or a process should be traced back

to its cause. This requires determining which Work Breakdown Structure

activity, and which work item of that activity, was the cause of the problem.

This is a prerequisite to determining whether a particular activity within the

organization is a cause of the problem.

• Management must recognize that cost estimates based on the initial

requirements are wrong because the requirements are wrong. This means

there must be a provision within the software process to re-estimate costs

as requirements are changed. The re-estimation depends on the

constraints under which the system is being developed.

Maintaining a Historical Database

Organizations should maintain a database, which can be used as a basis for

estimating costs of future projects. The database should include both project

metrics (which describe the features of the system built) and process metrics,

which describe features of the process used to build the system. It is impossible

to identify specific metrics that should be recorded and used by every

organization; each organization and each situation are unique. However, metrics

recorded for the purpose of improving cost estimation should be able to satisfy

the following:

• Actual cost of the system development. Unless the actual cost of the

system development is known, it is impossible to determine the accuracy

of the estimates.

• All estimates and re-estimates are recorded. To determine the accuracy of

estimates, and the rate of convergence of the estimates to the actual cost,

a complete record of all estimates must be maintained.

• The characteristics of the completed product. This includes the size

measured in some suitable units (e.g., Source Lines of Code, Function

Points), a description of the functionality of the system, classification of

type of software, and any other information that characterizes the system.

This information is required if any rigorous estimation by analogy is to be

performed or if any costing models are to be developed.
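
One possible shape for such a historical record is sketched below in Python. This is only an illustration of the three requirements listed above; the class names, field names and sample values are hypothetical and are not prescribed by this dissertation.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class EstimateRecord:
    made_on: date          # when the estimate or re-estimate was made
    method: str            # e.g. "COCOMO II" or "Expert"
    estimated_cost: float  # estimated cost in rand


@dataclass
class ProjectRecord:
    name: str
    software_type: str     # classification of the type of software
    description: str       # functionality of the system
    size: float            # size of the completed product
    size_unit: str         # "SLOC" or "FP"
    actual_cost: float     # actual cost of the system development
    estimates: List[EstimateRecord] = field(default_factory=list)

    def relative_errors(self):
        """Relative error of each estimate, in date order, to show convergence."""
        ordered = sorted(self.estimates, key=lambda e: e.made_on)
        return [
            (e.made_on, (e.estimated_cost - self.actual_cost) / self.actual_cost)
            for e in ordered
        ]


# Hypothetical project history entry.
record = ProjectRecord(
    name="Sample project",
    software_type="Server application",
    description="Illustrative entry only",
    size=1253,
    size_unit="SLOC",
    actual_cost=100_000.0,
)
record.estimates.append(EstimateRecord(date(2001, 3, 1), "Expert", 120_000.0))
record.estimates.append(EstimateRecord(date(2001, 2, 1), "COCOMO II", 150_000.0))
print(record.relative_errors())
```

Keeping every estimate with its date allows the rate of convergence toward the actual cost to be examined after the project is complete.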

These processes will simplify the cost estimation process and will in turn

increase management capabilities to ensure that cost effective and on budget

software is engineered.

6.3 Further Investigation

One of the statements made in this dissertation is that cost estimation methods

should stay up to date with software trends. The current cost estimation methods

express effort as a value in man-months. This gives a good indication of the

amount of human resources that will be required.

It can happen, however, that a project requires a senior engineer for the architectural design,

that junior engineers can complete the project, and that a database administrator

is required halfway through the project.

Future cost estimation research should be inclined towards delivering indications

on what skills will be needed for what period to complete a specific project.

Glossary

Algorithmic Models (also known as parametric models): produce a cost

estimate using one or more mathematical algorithms using a number of variables

considered to be the major cost drivers. These models estimate effort or cost

based primarily on the hardware/software size, and other productivity factors

known as cost driver attributes.

Analogy (or Comparative) Models: Models that use a method of estimating that

compares a proposed project with one or more similar and completed projects

where costs and schedules are known. Then, extrapolating from the actual costs

of completed projects, the model(s) estimates the cost of a proposed project.

Constructive Cost Model (COCOMO): A software cost estimation model

developed by Barry Boehm and described in his book, Software Engineering

Economics.

Cost Analysis: The review and evaluation of the separate cost elements and

proposed profit of (a) an offeror's or contractor's cost or pricing data and (b) the

judgmental factors applied in projecting from the data to the estimated costs in

order to form an opinion on the degree to which the proposed costs represent

what the cost of the contract should be, assuming reasonable economy and

efficiency.

Cost Driver Attributes: Productivity factors in the software product development

process that include software product attributes, computer attributes, personnel

attributes, and project attributes.

Cost Drivers: The controllable system design or planning characteristics that

have a predominant effect on the system's costs. Those few items, using

Pareto's law, that have the most significant cost impact.

Cost Model: An estimating tool consisting of one or more cost estimating

relationships, estimating methodologies, or estimating techniques used to predict

the cost of a system or one of its lower level elements.

Delphi Technique: A group forecasting technique, generally used for future

events such as technological developments, that uses estimates from experts

and feedback summaries of these estimates for additional estimates by these

experts until a reasonable consensus occurs. It has been used in various

software cost-estimating activities, including estimation of factors influencing

software costs.

Domain: A specific phase or area of the software life cycle in which a developer

works. Domains define developers' and users' areas of responsibility and the

scope of possible relationships between products. The work can be organized by

domains such as Software Engineering Environments, Documentation, Project

Management etc.

Expert Judgment Models: use a method of software estimation that is based on

consultation with one or more experts that have experience with similar projects.

An expert-consensus mechanism such as the Delphi technique may be used to

produce the estimate.

Function Points: Function Points are those pieces of code that perform some

specific activity related to inputs, inquiries, outputs, master files, and external

system interfaces.

Life Cycle: The stages and process through which hardware or software passes

during its development and operational use. The useful life of a system. Its length

depends on the nature and volatility of the business, as well as the software

development tools used to generate the databases and applications.

Metric: Quantitative analysis values calculated according to a precise definition

and used to establish comparative aspects of development progress, quality

assessment or choice of options.

New Line of Code: A source line of code that will be developed completely, i.e.,

designed, coded and tested.

PM: Person Months. A person month is the amount of time one person spends

working on the software development project for one month.

Price Analysis: The process of examining and evaluating a proposed price

without evaluating its separate cost elements and proposed profit.

Process: The sequence of activities (in software development) described in

terms of the user roles, user tasks, rules, events, work products, resource use,

and the relationships between them. It may include the specific design

methodology, language, documentation standards etc.

Rayleigh Distribution: A curve that yields a good approximation to the actual

labor curves on software projects.

Real-Time: 1) Immediate response. The term may refer to fast transaction

processing systems in business; however, it is normally used to refer to process

control applications. For example, in avionics and space flight, real-time

computers must respond instantly to signals sent to them. 2) Any electronic

operation that is performed in the same time frame as its real-world counterpart.

For example, it takes a fast computer to simulate complex, solid models moving

on screen at the same rate they move in the real world. Real-time video

transmission produces a live broadcast.

Security: The protection from accidental or malicious access, use, modification,

destruction, or disclosure. There are two aspects to security, confidentiality and

integrity.

Software Development Life Cycle: The stages and process through which

software passes during its development. This includes requirements definition,

analysis, design, coding, testing, and maintenance.

Software Engineering Institute (SEI): SEI is a federally funded research and

development center established in 1984 by the DoD with a broad charter to

address the transition of software engineering technology. The SEI is an integral

component of Carnegie Mellon University and is sponsored by the Office of the

Under Secretary of Defense for Acquisition and Technology. SEI developed the

Software Acquisition Capability Maturity Model (CMM) and the Checklist and

Criteria for Evaluating the Cost and Schedule Estimating Capabilities of Software

Organizations.

Software Method (or Software Methodology): Focuses on how to navigate

through each phase of the software process model (determining data, control, or

uses hierarchies; partitioning functions; and allocating requirements) and how to

represent phase products (structure charts; stimulus-response threads; and state

transition diagrams).

Source Lines of Code (SLOC): All executable source code statements including

deliverable Job Control Language (JCL) Statements, Data declarations, Data

Typing statements, Equivalence statements, and Input/Output format statements.

SLOC does not include any statement that upon its removal, the program will still

compile, e.g., comments, blank lines, and non-delivered programmer debug

statements.

Validation: In terms of a cost model, a process used to determine whether the

model selected for a particular estimate is a reliable predictor of costs for the type

of system being estimated.

Work Breakdown Structure: A work breakdown structure is a product-oriented

family tree, composed of hardware, software, services, data and facilities which

results from system engineering efforts during the development and production of

a defense material item, and which completely defines the program. A work

breakdown structure displays and defines the product(s) to be developed or

produced and relates the elements of work to be accomplished to each other.

Appendix A

Scaling Drivers

Development Flexibility (FLEX)

To determine the flexibility of the development process the following features have to be taken into account.

• Need for software conformance with pre-established requirements

• Need for software conformance with external interface specifications

• Premium on early completion

| Feature | Very low | Nominal | High | Extra High |
| Need for software conformance with pre-established requirements | Full | Considerable | Considerable | Basic |
| Need for software conformance with external interface specifications | Full | Considerable | Considerable | Basic |
| Premium on early completion | High | Medium | Medium | Low |

Table A1: Development Flexibility scaling drivers

Scaling Drivers

Precedentedness includes the following features

| Feature | Very low | Nominal | High | Extra High |
| Organizational understanding of product objective | General | Considerable | Considerable | Thorough |
| Experience in working with related software systems | Moderate | Considerable | Considerable | Extensive |
| Concurrent development of associated new hardware and operational procedures | Extensive | Moderate | Moderate | Some |
| Need for innovative data processing architecture and algorithms | Considerable | Some | Some | Minimal |

Table A2: Precedentedness scaling drivers

The PREC rating is the subjective weighted average of the listed characteristics.

Architecture / Risk Resolution (RESL)

The RESL rating is the subjective weighted average of the listed characteristics (see Appendix B).

Team Cohesion (TEAM)

The Team Cohesion scale factor accounts for the sources of project turbulence and entropy due to difficulties in

synchronizing the project's stakeholders: users, customers, developers, maintainers, interfaces and others. These

difficulties may arise from differences in stakeholders' objectives and cultures, difficulties in reconciling objectives, or

stakeholders' lack of experience and familiarity in operating as a team. Appendix C provides a detailed definition for the overall

TEAM rating levels. The final rating is the subjective weighted average of the listed characteristics (see Appendix C).

Process Maturity (PMAT)

To determine the process maturity the following Key Process Areas (KPA) questionnaire must be completed and the

weighted average must be determined (see Appendix D).

• Check Almost Always when the goals are consistently achieved and are well established in standard operating

procedures.

• Check Frequently when the goals are achieved relatively often, but sometimes are omitted under difficult

circumstances.

• Check About Half when the goals are achieved about half of the time

• Check Occasionally when the goals are sometimes achieved, but less than often.

• Check Rarely If Ever when the goals are rarely if ever achieved.

• Check Does Not Apply when the engineers have the required knowledge about the project or organization and the

KPA, but the KPA does not apply to their circumstances.

• Check Don't Know when uncertain about how to respond to the KPA

After the KPA questionnaire is completed, each compliance level is weighted and a PMAT factor is calculated, as in equation A1

\[ PMAT = 5 - \left[ \sum_{i=1}^{18} \left( \frac{KPA\%_i}{100} \times \frac{5}{18} \right) \right] \]

Equation A1
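
The PMAT calculation can be sketched in Python as follows, assuming equation A1 has the form PMAT = 5 − Σ (KPA%_i / 100 × 5/18) over the 18 Key Process Areas. The function name is hypothetical, and the percentage assigned to each questionnaire answer is left to the caller.

```python
def pmat(kpa_percentages):
    """PMAT = 5 - sum(KPA%_i / 100 * 5 / 18) over the 18 Key Process Areas."""
    if len(kpa_percentages) != 18:
        raise ValueError("expected one percentage per Key Process Area (18)")
    return 5 - sum(p / 100 * 5 / 18 for p in kpa_percentages)


# Hypothetical questionnaire results: every KPA achieved about half the time.
print(pmat([50] * 18))
```

Under this form a fully compliant organization (100% on every KPA) obtains a PMAT of 0 and an organization with no compliance obtains 5, so a lower value indicates a more mature process.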

Appendix B

Architecture / Risk Resolution (RESL)

| Characteristics | Very Low | Low | Nominal | High | Very High | Extra High |
| Risk Management Plan identifies all critical risk items, establishes milestones for resolving them by Product Design Review | None | Little | Some | Generally | Mostly | Fully |
| Schedule, budget and internal milestones through Product Design Review compatible with Risk Management Plan | None | Little | Some | Generally | Mostly | Fully |
| Percentage of development schedule devoted to establishing architecture, given general product objectives | 5 | 10 | 17 | 25 | 33 | 40 |
| Percent of required top software architects available to the project | 20 | 40 | 60 | 80 | 100 | 120 |
| Tool support available for resolving risk items, developing and verifying architectural specs | None | Little | Some | Good | Strong | Full |
| Level of uncertainty in key architecture drivers: mission, user interface, hardware, technology and performance | Extreme | Significant | Considerable | Some | Little | Very little |
| Number and criticality of risk items | > 10 Critical | 5-10 Critical | 2-4 Critical | 1 Critical | > 5 Non-Critical | < 5 Non-Critical |

Table B1: Architecture/Risk resolution scaling table

85


Appendix C

Team Cohesion (TEAM)

| Characteristics | Very Low | Low | Nominal | High | Very High | Extra High |
|---|---|---|---|---|---|---|
| Consistency of stakeholder objectives and cultures | Little | Some | Basic | Considerable | Strong | Full |
| Ability, willingness of stakeholders to accommodate other stakeholders' objectives | Little | Some | Basic | Considerable | Strong | Full |
| Experience of stakeholders in operating as a team | None | Little | Little | Basic | Considerable | Extensive |
| Stakeholder teambuilding to achieve shared vision and commitments | None | Little | Little | Basic | Considerable | Extensive |

Table C1: Team cohesion scaling table

Appendix D

Process Maturity (PMAT)

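| Key Process Area | Almost Always (>90%) | Often (60-90%) | About Half (40-60%) | Occasionally (10-40%) | Rarely if Ever (<10%) | Does Not Apply | Don't Know |
|---|---|---|---|---|---|---|---|
| Requirements Management | | | | | | | |
| Software Project Planning | | | | | | | |
| Software Project Tracking | | | | | | | |
| Software Subcontract Management | | | | | | | |
| Software Quality Assurance | | | | | | | |
| Software Configuration Management | | | | | | | |
| Organization Process Focus | | | | | | | |
| Organization Process Definition | | | | | | | |
| Training Program | | | | | | | |
| Integrated Software Management | | | | | | | |
| Software Product Engineering | | | | | | | |
| Intergroup Coordination | | | | | | | |
| Peer Review | | | | | | | |
| Quantitative Process Management | | | | | | | |
| Software Quality Management | | | | | | | |
| Defect Prevention | | | | | | | |
| Technology Change Management | | | | | | | |
| Process Change Management | | | | | | | |

Table D1: Process maturity scaling table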

Appendix E

Product Complexity (CPLX)

| Area | Very Low | Low | Nominal | High | Very High | Extra High |
|---|---|---|---|---|---|---|
| Control Operations | Straight-line code with a few non-nested structured programming operators | Straightforward nesting of programming operators | Mostly simple nesting. Decision tables and simple callbacks or message passing. | Highly nested programming operators with many compound predicates. Queue and stack control. | Reentrant and recursive coding. Fixed-priority interrupt handling and complex callbacks. | Multiple resource scheduling with dynamically changing priorities and microcode-level control |
| Computational Operations | Evaluation of simple expressions | Evaluation of moderate-level expressions | Use of standard math and statistical routines. Basic matrix/vector operations. | Basic numerical analysis: multivariate interpolation, ordinary differential equations. | Difficult but structured numerical analysis: near-singular matrix equations, partial differential equations. | Difficult and unstructured numerical analysis: highly accurate analysis of noisy, stochastic data. Complex parallelization. |
| Device-dependent Operations | Simple read, write statements with simple formats. | No cognizance needed of particular processor or I/O device characteristics. I/O done at GET/PUT level. | I/O processing includes device selection, status checking and error processing. | Operation at physical I/O level. Optimized I/O overlaps. | Routines for interrupt diagnosis, servicing, masking. Communication line handling. | Device timing-dependent coding, micro-programmed operations |
| Data Management Operations | Simple arrays in main memory. Simple COTS-DB queries, updates. | Single file subsetting with no data structure changes, no edits, no intermediate files. | Multi-file input and single file output. Simple structural changes, simple edits. | Simple triggers activated by data stream contents. Complex data restructuring. | Distributed database coordination. Complex triggers. Search optimization. | Highly coupled, dynamic relational and object structures. Natural language data management. |
| User Interface Management Operations | Simple input forms, report generators | Use of simple graphic user interface (GUI) builder | Simple use of widget set | Widget set development and extension. Simple voice I/O, multimedia. | Moderately complex 2D/3D, dynamic graphics, multimedia | Complex multimedia, virtual reality |

Table E1: Product complexity scaling table

Appendix F

Effort multipliers

Required Software Reliability (RELY)

This is the measure of the extent to which the software must perform its intended function over a period of time. If the effect of a software failure is only a slight inconvenience then RELY is low. If a failure would risk human life then RELY is very high.


Data Base Size (DATA)

This measure attempts to capture the effect that large data requirements have on product development. The rating is determined by calculating D/P. The size of the database is important to consider because of the effort required to generate the test data that will be used to exercise the program.

$$\frac{D}{P} = \frac{\mathrm{DataBaseSize\ (Bytes)}}{\mathrm{ProgramSize\ (SLOC)}}$$

Equation F1

DATA is rated as low if D/P is less than 10 and very high if it is greater than 100.
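As an illustration of Equation F1, the following Python sketch computes D/P and applies only the two thresholds stated above. The function names are hypothetical, and the label returned for ratios between the two thresholds is a placeholder, since the intermediate rating levels are not defined in this appendix.

```python
def data_ratio(database_bytes, program_sloc):
    """Equation F1: database size in bytes divided by program size in SLOC."""
    if program_sloc <= 0:
        raise ValueError("program size must be positive")
    return database_bytes / program_sloc


def data_rating(ratio, low_below=10, very_high_above=100):
    """Classify D/P using the two thresholds quoted in the text.

    Ratios between the thresholds get the placeholder label
    'intermediate' (an assumption; the text does not name those levels).
    """
    if ratio < low_below:
        return "low"
    if ratio > very_high_above:
        return "very high"
    return "intermediate"


if __name__ == "__main__":
    ratio = data_ratio(database_bytes=2_000_000, program_sloc=10_000)
    print(ratio, data_rating(ratio))
```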

Product Complexity (CPLX)

Complexity is divided into five areas: control operations, computational operations, device-dependent operations, data management operations, and user interface management operations. Select the combination of areas that characterizes the product or a sub-system of the product. The complexity rating is the subjective weighted average of these areas. If, for example, Control Operations is rated Very Low (1) and Data Management Operations is rated High (4), then the complexity is the average of 1 and 4, which is 2.5. Always round off to the value closest to the Nominal value, which is 3 (see Appendix E).
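The averaging rule can be sketched in Python as follows. The numeric scale (1 for the lowest level up to 6 for Extra High, with Nominal = 3) is inferred from the example in the text, and the reading that only exact ties are resolved towards Nominal is an assumption; the function name `cplx_rating` is hypothetical.

```python
import math

NOMINAL = 3  # numeric value of the Nominal rating, as stated in the text


def cplx_rating(area_ratings, weights=None):
    """Weighted average of area ratings, rounded to a whole rating level.

    Ordinary rounding is used; an exact .5 tie is resolved towards the
    level closest to Nominal (one possible reading of the rule above).
    """
    if weights is None:
        weights = [1.0] * len(area_ratings)
    avg = sum(r * w for r, w in zip(area_ratings, weights)) / sum(weights)
    lower, upper = math.floor(avg), math.ceil(avg)
    if lower == upper:
        return lower
    frac = avg - lower
    if frac < 0.5:
        return lower
    if frac > 0.5:
        return upper
    # Exact tie: pick whichever neighbour is nearer to Nominal.
    return lower if abs(lower - NOMINAL) < abs(upper - NOMINAL) else upper


if __name__ == "__main__":
    # The example from the text: ratings 1 and 4 average to 2.5.
    print(cplx_rating([1, 4]))
```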

Required Reusability (RUSE)

This cost driver accounts for the additional effort needed to construct components intended for reuse on the current or future projects. This effort is consumed in creating a more generic design of the software, more elaborate documentation and more extensive testing to ensure that components are ready for use in other applications.

Documentation Match to Life-Cycle Needs (DOCU)

Several software cost models have a cost driver for the level of required documentation. In COCOMO II, the rating scale for the DOCU cost driver is evaluated in terms of the suitability of the project's documentation to its life-cycle needs. The rating scale goes from Very Low (many life-cycle needs uncovered) to Very High (very excessive for life-cycle needs).

Execution Time Constraint (TIME)

This is a measure of the execution time constraint imposed upon a software system. The rating is expressed in terms of

the percentage of available execution time expected to be used by the system or subsystem consuming the execution

time resource. The rating ranges from nominal, less than 50% of the execution time resource used, to extra high, 95% of

the execution time resource is consumed.

Main Storage Constraint (STOR)

This rating represents the degree of main storage constraint imposed on a software system or subsystem. Given the remarkable increase in available processor execution time and main storage, one can question whether these constraint variables are still relevant. However, many applications continue to expand to consume whatever resources are available, making these cost drivers still relevant.


Platform Volatility (PVOL)

"Platform" is used here to mean the complex of hardware and software (OS, DBMS) that the software product calls on to perform its tasks. If the software to be developed is an operating system then the platform is the computer hardware. If a database management system is to be developed then the platform is the hardware and the operating system. The platform includes any compilers or assemblers supporting the development of the software system. This rating ranges from low, where there is a major change every 12 months, to very high, where there is a major change every two weeks.

Analyst Capability (ACAP)

Analysts are personnel who work on requirements, high-level design and detailed design. The major attributes that should be considered in the rating are analysis and design ability, efficiency and thoroughness, and the ability to communicate and cooperate. The rating should not consider the level of experience of the analyst. Analysts who fall in the 15th percentile are rated very low and those who fall in the 95th percentile are rated very high.

Programmer Capability (PCAP)

Current trends continue to emphasize the importance of highly capable analysts. However, the increasing role of complex software packages, and the significant productivity leverage associated with programmers' ability to deal with these software packages, indicates a trend towards higher importance of programmer capability as well.

Evaluation should be based on the capability of the programmers as a team rather than as individuals. Major factors that should be considered in the rating are ability, efficiency and thoroughness, and the ability to communicate and cooperate. The experience of the programmers should not be considered. Programmers who fall in the 15th percentile are rated very low and those who fall in the 95th percentile are rated very high.

Applications experience (AEXP)

This rating is dependent on the level of applications experience of the project team developing the software system or subsystem. The ratings are defined in terms of the project team's equivalent level of experience with this type of application. A very low rating is for application experience of less than two months. A very high rating is for experience of six years or more.

Platform Experience (PEXP)

The Post-Architecture model broadens the productivity influence of PEXP, recognizing the importance of understanding

the use of more powerful platforms, including more graphic user interface, database, networking, and distributed

middleware capabilities.

Language and Tool Experience (LTEX)

This is a measure of the level of programming language and software tool experience of the project team developing the software system or subsystem. Software development includes the use of tools that perform requirements and design representation and analysis, configuration management, document extraction, library management, program style and formatting, consistency checking, etc. In addition to experience in programming with a specific language, the supporting tool set also affects development time. A low rating is given for experience of less than two months. A very high rating is given for experience of six or more years.

Personnel Continuity (PCON)

Staff turnover has an important impact on a project. The rating scale for PCON is in terms of the project's annual personnel turnover: from 3% (very high) to 48% (very low).

Use of Software Tools (TOOL)

Software tools have improved significantly since the 1970's projects used to calibrate COCOMO. The tool rating ranges

from simple edit and code, very low, to integrated lifecycle management tools, very high.

Multisite Development (SITE)

Given the increasing frequency of multisite developments, and indications that multisite development effects are significant, the SITE cost driver has been added in COCOMO II. Determining its cost driver rating involves the assessment and averaging of two factors: site collocation (from fully collocated to international distribution) and communication support (from surface mail and some phone access to full interactive multimedia).


Required Development Schedule (SCED)

This rating measures the schedule constraint imposed on the project team developing the software. The ratings are defined in terms of the percentage of schedule stretch-out or acceleration with respect to a nominal schedule for a project requiring a given amount of effort. Accelerated schedules tend to produce more effort in the later phases of development because more issues are left to be determined due to lack of time to resolve them earlier. A schedule compression of 74% is rated very low. A stretch-out of a schedule produces more effort in the earlier phases of development, where there is more time for thorough planning, specification and validation. A stretch-out of 160% is rated very high.


Appendix G

Nu Metro Server Technical Specification

Electric Liberty

NU Metro Server (FreeStyle)

Technical Specification

Product: Nu Metro Server

Project: Brasilia

Author: Andre Ladeira

Revision: 0.01

Document Status: Draft

Document Source: BSL-Template-TDD Nu-metro.doc

Print Date: 2003/03/04


Contents


DOCUMENT CONTROL

INTRODUCTION

    Overview

    Overall Architecture

TECHNICAL FLOW

DATABASE DESIGN

    Data related issues

TECHNICAL COMPONENTS

    Communication Protocol


Document Control

Configuration Control

Project: Brasilia

Title: NU Metro Server

Category: Technical Specification

VSS Reference: VSS\Brasilia\Specifications\Technical

Template Used: VSS\Brasilia\Templates\BSL-Template-TDD

VSS Version:

Created By:

Creation Date:

Document History

| Date | Version | Status | Who |
|---|---|---|---|
| | 0.01 | Draft | |

Revision History

| Date | Version | Changes |
|---|---|---|
| | 0.01 | New document created |

Review History

| Date | Version | Status | Management Minute Reference |
|---|---|---|---|
| | 1.00 | | |

References

| Category | Description | Source |
|---|---|---|
| Related Documents | | |
| Related Specifications | | |


Introduction

Overview


This document describes the functionality of the NuMetro Server and the interface components involved. The document must be read in conjunction with the NuMetro Functional Specification document.

Overall Architecture

The architecture diagram below is a logical interpretation of the production environment implementation of the NuMetro server.

[Figure: overall architecture. Client side: Ingenico terminal. Server side: protocol converter and Nu Metro Server. The terminal connects to the protocol converter via an X.25 radio pad. The protocol converter converts the data to a readable ASCII format. The Nu Metro Server is connected to the protocol converter via TCP/IP. The Nu Metro Server determines which message is being sent, retrieves the valid data, and sends the relevant data back to the protocol converter to be sent to the terminal.]


Technical Flow


The Nu Metro server uses only one external COM+ component for data access, namely the modFreestyle class.

An external application (protocol converter) is used to translate messages received from the terminals to ASCII strings and vice versa.

To determine when a message arrives, the Microsoft Winsock TCP/IP component is used. The following methods are used:

tcpClient_DataArrival: Once data has arrived from the protocol converter, this event is triggered and processing starts. A string check is also in place to ensure that the full message has arrived. If the full message has not arrived, the partial message is held in memory until the full message has been received.

All the data is checked to ensure that no corrupted data has arrived (checksum and string length count, see message protocol).

SendData: Once the message has been compiled, the data is sent to the protocol converter.

Note: most of the application is string manipulation to ensure that the agreed data protocol is met.

Data flow process

• Login. The terminal ID is checked to determine whether the terminal may be given access. If the terminal ID is not valid, the terminal is locked out.

• Once the terminal has been successfully logged in, all the parameters are sent. See database detail.

• After successful completion of the parameter file upload, the good card list (GCL) is sent.

• After successful completion of the GCL, all vouchers issued are sent from the current terminal to the server, to be stored in the database.

• After successful completion of the voucher upload, the terminal logs out.

Note: The server cannot initiate communication with the terminal. The terminal initiates communication and sends commands to the server. The server only responds to valid commands. The terminal will connect at the time specified in the parameter file sent.
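The ordering of the data flow process can be illustrated with a small Python sketch that checks whether a sequence of terminal commands follows the steps above. The command names (ILIN, IPAF, IGLU, IVBU, ILOT) are read from the communication protocol section of this specification. The function name, the rule that every step occurs at least once, and the rule that the good list and voucher steps may repeat are assumptions made for illustration.

```python
# Terminal-initiated commands in the order given by the data flow process.
SESSION_ORDER = ["ILIN", "IPAF", "IGLU", "IVBU", "ILOT"]

# ASSUMPTION: list and voucher transfers may span several messages,
# so these two commands are allowed to repeat.
REPEATABLE = {"IGLU", "IVBU"}


def is_valid_session(commands):
    """Return True if the commands follow login, parameters, good list,
    vouchers, logout, with every step present at least once."""
    stage = -1  # index into SESSION_ORDER of the last accepted step
    for cmd in commands:
        if stage >= 0 and cmd == SESSION_ORDER[stage] and cmd in REPEATABLE:
            continue  # another message of the same repeatable step
        if stage + 1 < len(SESSION_ORDER) and cmd == SESSION_ORDER[stage + 1]:
            stage += 1  # advance to the next step
            continue
        return False  # out-of-order or unknown command
    return stage == len(SESSION_ORDER) - 1


if __name__ == "__main__":
    print(is_valid_session(["ILIN", "IPAF", "IGLU", "IGLU", "IVBU", "ILOT"]))
```

A server following the note above would simply ignore, or reject, any command for which such a check fails, since it never initiates communication itself.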



Database Design

Database tables used

NuMetroStatusHist
PK: pkiNMStatusHistId
Columns: fkiMemberCustId, qwfldinfo_NMstat, dtEffectiveDate, dtTerminationDate, dtTimeStamp

NUMetroTerminal
PK: pkiNUMetroTerminalId
Columns: fkiBUOrgId, sTerminalId, dtEffectiveDate, dtTerminationDate, dtTimeStamp

NUMetroMemberDetail
PK: pkiNUMetroMemberDetailId
Columns: sTransactionType, sCardNo, INoOfDependants, dtTimeStamp

NUMetroVoucherDetail
PK: pkiNUMetroVoucherDetailId
Columns: fkiNUMetroLoginHistoryId, sControlData, sVoucherNo, sCardNo, dtVoucherIssued, INoOfTickets, sManagerPinNo, dtTimeStamp

NUMetroLoginHistory
PK: pkiNUMetroLoginHistoryId
Columns: fkiBUOrgId, dtLoginRequested, bLoginSuccessfull, bDataTransferCompleted, dtLogout

NUMetroMemberDetailUpload
PK: pkiNUMetroMemberDetailUploadId
FK1: fkiNUMetroLoginHistoryId
Columns: sSeqNo, INoOfCards, dtTimeStamp, bSent, dtSent

NUMetroMemberDetailSeqNo
PK: pkiNUMetroMemberDetailSeqNoId
FK1: fkiNUMetroMemberDetailId
FK2: fkiNUMetroMemberDetailUploadId
Columns: dtTimeStamp

NUMetroMemberDetail_Processed
Columns: pkiNUMetroMemberDetailId, sTransactionType, sCardNo, INoOfDependants, dtTimeStamp

NUMetroParameter
PK: pkiNUMetroParameterId
Columns: fkiBUOrgId, sMerchantNo, sSiteVenue, sAccountSiteNo, sNextCallTime, sNextCallMethod, sNextCallNo, sNextSeqNo, IVelocity, IMaxTicketsPerVoucher, sManager_1_PinNo, sManager_1_CardNo, sManager_2_PinNo, sManager_2_CardNo, dtEffectiveDate, dtTerminationDate, dtTimeStamp, NUA_No_1, NUA_No_2, NUA_No_3



Data related issues

Table: FS BAckoffice.NuMetroParameter

Description: This contains all the relevant data to be loaded onto the terminal (see communication protocol for detail)

| Column | Type | Description |
|---|---|---|
| pkiNUMetroParameterId | Int | Primary key |
| fkiBUOrgId | Int | Unique organization id |
| sMerchantNo | Char | Cinema merchant number |
| sSiteVenue | Varchar | Cinema name |
| sAccountSiteNo | Varchar | Freestyle account number |
| sNextCallTime | Char | Login time every day |
| sNextCallMethod | Char | Call method (X25) |
| sNextCallNo | Char | Other call number |
| sNextSeqNo | Varchar | Next voucher sequence number |
| IVelocity | Tinyint | Velocity |
| IMaxTicketsPerVoucher | Tinyint | Max number of tickets to be issued |
| sManager_1_PinNo | Char | Manager one's pin number for supervisor's card |
| sManager_1_CardNo | Char | Manager one's card number for supervisor's card |
| sManager_2_PinNo | Char | Manager two's pin number for supervisor's card |
| sManager_2_CardNo | Char | Manager two's card number for supervisor's card |
| dtEffectiveDate | Datetime | From when active |
| dtTerminationDate | Datetime | To when active |
| dtTimeStamp | Datetime | Date created |
| NUA_No_1 | Varchar | NUA number to access |

NuMetroLoginHistory

Table: FS BAckoffice.NuMetroLoginHistory

Description: Inserts all the detail of the terminal that is requesting connection

| Column | Type | Description |
|---|---|---|
| pkiNUMetroLoginHistoryId | Int | Primary key |
| fkiBUOrgId | Int | Organization Id |
| dtLoginRequested | Datetime | Date of login |
| bLoginSuccessfull | Bit | Was login successful |
| bDataTransferCompleted | Bit | Was data transfer completed |
| dtLogout | Datetime | Date of logout |



NuMetroStatusHist

Table: FS BAckoffice.NuMetroStatusHist

Description: Stores status detail of Nu Metro members

| Column | Type | Description |
|---|---|---|
| pkiNMStatusHistId | Int | Primary key |
| fkiMemberCustId | Int | Customer ID |
| qwfldinfo_NMstat | Int | |
| dtEffectiveDate | Datetime | Effective date |
| dtTerminationDate | Datetime | Termination date |
| dtTimeStamp | Datetime | |

NuMetroTerminal

Table: FS BAckoffice.NuMetroTerminal

Description: Stores all the terminal detail

| Column | Type | Description |
|---|---|---|
| pkiNUMetroTerminalId | Int | Primary key |
| fkiBUOrgId | Int | Organization ID |
| sTerminalId | Varchar | Terminal ID |
| dtTerminationDate | Datetime | Termination date |
| dtTimeStamp | Datetime | |

NuMetroMemberDetailUpload

Table: FS BAckoffice.NuMetroMemberDetailUpload

Description: Stores the detail of all the member detail uploaded to each terminal

| Column | Type | Description |
|---|---|---|
| pkiNUMetroTerminalId | Int | Primary key |
| fkiBUOrgId | Int | Organization ID |
| sTerminalId | Varchar | Terminal ID |
| dtTerminationDate | Datetime | Termination date |



NuMetroMemberDetail

Table: FS BAckoffice.NuMetroMemberDetail

Description: All members that must be uploaded, edited or are currently in the terminal memory

| Column | Type | Description |
|---|---|---|
| pkiNUMetroMemberDetailId | Int | Primary key |
| sTransactionType | Char | Transaction type (add or delete) |
| sCardNo | Char | Member card number |
| INoOfDependants | Tinyint | Number of dependants |


NuMetroMemberDetailSeqNo

Table: FS BAckoffice.NuMetroMemberDetailSeqNo

Description: Member detail upload sequence number link table

| Column | Type | Description |
|---|---|---|
| pkiNUMetroMemberDetailSeqNoId | Int | Primary key |
| fkiNUMetroMemberDetailUploadId | Int | Link to NuMetroMemberDetailUpload |
| fkiNUMetroMemberDetailId | Int | Link to NuMetroMemberDetail |


NuMetroVoucherDetail

Table: FS BAckoffice.NuMetroVoucherDetail

| Column | Type | Description |
|---|---|---|
| pkiNUMetroVoucherDetailId | Int | Primary key |
| fkiNUMetroLoginHistoryId | Int | Link to NuMetroLoginHistory table |
| sControlData | Varchar | |
| sVoucherNo | Varchar | Voucher number |
| sCardNo | Char | Card number |
| dtVoucherIssued | Datetime | Date voucher issued |
| INoOfTickets | Tinyint | Number of tickets for member |
| sManagerPinNo | Char | Was the ticket overridden |
| dtTimeStamp | Datetime | |

NuMetroMemberDetail_Processed

Table: FS BAckoffice.NuMetroMemberDetail_Processed

| Column | Type | Description |
|---|---|---|
| pkiNUMetroMemberDetailId | Int | Primary key |
| sTransactionType | Char | Transaction type |
| sCardNo | Char | Member card number |
| INoOfDependants | Tinyint | Number of dependants |


Stored Procedures Used

NuMetroMemberFilePopulate

Selects the member details to be uploaded to the terminal. Only 13 members can be uploaded per message, so this stored procedure retrieves 13 members at a time.

NuMetroRetrieveMemberDetail

Retrieves all the member detail.

NuMetroUpdateMemberDetailSend

Updates table NuMetroMemberDetailUpload to indicate that data has been sent.

NumetroTerminalPasswordValidate

Validates the terminal password.

NuMetroLoginHistoryLog

Inserts a record into the NUMetroLoginHistory table.

NuMetroVoucherDetailInsert

Inserts all voucher detail uploaded from the terminal.

NuMetroParameterSelect

Retrieves parameter detail for a specific terminal.

MemberValidationdetail

Selects all valid Nu Metro clients. Used to determine the number of packages (messages) to be sent.

NuMetroCinemasNotEntered

Selects all cinemas that have not entered for a specific day.



Technical Components

Communication Protocol

Operational and Protocol specification

HOST (Server) <-> Ingenico communication (Client)

Communication Protocol

| Direction | STX | CMD | LEN | DATA | ETX | CHK |
|---|---|---|---|---|---|---|
| HOST => TERM | [STX] | F[XXX] | 4 bytes | As per specification | [ETX] | [CHK], 2 bytes |
| TERM => HOST | [STX] | I[XXX] | 4 bytes | As per specification | [ETX] | [CHK], 2 bytes |

[XXX] = Name of process

Description

| Field | Description |
|---|---|
| STX | 0x02 |
| CMD | Example: 'FLIN' |
| LEN | Length of data to follow, in ASCII. '0012' indicates a length of 12 |
| DATA | Depends on CMD |
| ETX | 0x03 |
| CHK | XOR of all data, excluding STX. Result in ASCII. 'FD' indicates a CHK of 0xFD |

All numeric data in the DATA tag from Host and terminal will be compressed numeric.
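The frame layout described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the server's actual code: the checksum is assumed to cover CMD, LEN, DATA and ETX (everything after STX), DATA is treated as opaque bytes, and the compressed numeric encoding is not modelled. The function names are hypothetical.

```python
STX = 0x02
ETX = 0x03


def checksum(body: bytes) -> bytes:
    """XOR all bytes and return the result as two ASCII hex characters."""
    value = 0
    for byte in body:
        value ^= byte
    return f"{value:02X}".encode("ascii")


def build_frame(cmd: str, data: bytes = b"") -> bytes:
    """Build STX + CMD + LEN + DATA + ETX + CHK for a 4-character command."""
    if len(cmd) != 4:
        raise ValueError("CMD must be 4 characters, e.g. 'FLIN'")
    # ASSUMPTION: CHK covers everything after STX, including ETX.
    body = cmd.encode("ascii") + f"{len(data):04d}".encode("ascii") + data + bytes([ETX])
    return bytes([STX]) + body + checksum(body)


def parse_frame(frame: bytes):
    """Validate a complete frame and return (cmd, data)."""
    if len(frame) < 12 or frame[0] != STX:
        raise ValueError("frame too short or missing STX")
    body, chk = frame[1:-2], frame[-2:]
    if checksum(body) != chk:
        raise ValueError("checksum mismatch")
    if body[-1] != ETX:
        raise ValueError("missing ETX")
    cmd = body[:4].decode("ascii")
    length = int(body[4:8].decode("ascii"))
    data = body[8:-1]
    if len(data) != length:
        raise ValueError("LEN does not match DATA length")
    return cmd, data


if __name__ == "__main__":
    frame = build_frame("ILIN", b"12345678")
    print(frame, parse_frame(frame))
```

Checking both the checksum and the declared length mirrors the "check sum and string length count" validation mentioned in the Technical Flow section.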

Login and Logout

Login - Terminal to Host Request

| Direction | STX | CMD | LEN | DATA | ETX | CHK |
|---|---|---|---|---|---|---|
| TERM => HOST | [STX] | ILIN | 0008 | [Y] | [ETX] | [CHK] |

[Y] = Terminal ID (8 bytes, numeric)

Login - Host to Terminal Reply

| Direction | STX | CMD | LEN | DATA | ETX | CHK |
|---|---|---|---|---|---|---|
| HOST => TERM | [STX] | FLIN | 0013 | [X][Z] | [ETX] | [CHK] |

[X] = Valid login (1 = accepted, 0 = rejected; 1 byte, numeric)

[Z] = Host date and time (CCYYMMDDHHMM, 12 bytes, numeric)

Logout - Terminal to Host Request

| Direction | STX | CMD | LEN | DATA | ETX | CHK |
|---|---|---|---|---|---|---|
| TERM => HOST | [STX] | ILOT | 0008 | [Y] | [ETX] | [CHK] |

[Y] = Terminal ID (8 bytes, numeric)

Logout - Host to Terminal Reply

| Direction | STX | CMD | LEN | DATA | ETX | CHK |
|---|---|---|---|---|---|---|
| HOST => TERM | [STX] | FLOT | 0000 | | [ETX] | [CHK] |

Voucher batch upload

Batch detail

Request

| Direction | STX | CMD | LEN | DATA | ETX | CHK |
|---|---|---|---|---|---|---|
| TERM => HOST | [STX] | IVBU | 4 bytes | [A][B][X][C1][C2][C3][C4][C5] | [ETX] | [CHK] |

Notes

[A] = Packet number to be retrieved (2 bytes, padded with nulls if needed, numeric)

[B] = Total number of packets to be sent (2 bytes, padded with nulls if needed, numeric)

[X] = Number of records sent (2 bytes, numeric)

[C1] = Voucher number (8 bytes, numeric)

[C2] = Card number (14 bytes, numeric)

[C3] = Current date and time (YYYYMMDDHHMM, 12 bytes, numeric)

[C4] = Number of tickets (2 bytes, numeric)

[C5] = Manager number (1 byte, numeric)

Fields [C1] to [C5] are repeated [X] times, where [X] may not be > 5.

LEN = Variable

Description

• Number of records in the voucher file for the specific package sent.
• Voucher number. The number that was printed on the voucher.
• Card Number. The number of the card that was used.
• Current Date Time. The date and time the ticket was issued.
• Number of tickets. Number of tickets requested.
• Manager Number. The manager number as per the parameter file. If manager number = 1 then it is 'Manager One Card Number' as per the parameter file. The same applies if the manager number = 2. If the manager number = 0 then there was no override.
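The voucher batch layout can be illustrated with a short Python parser. It is a sketch under simplifying assumptions: the DATA field is taken to be plain ASCII digits with zero-padded counters, so neither the compressed numeric encoding nor null padding is modelled. The `Voucher` class and `parse_voucher_batch` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Voucher:
    voucher_no: str   # [C1], 8 characters
    card_no: str      # [C2], 14 characters
    issued: str       # [C3], YYYYMMDDHHMM
    tickets: int      # [C4]
    manager: int      # [C5], 0 means no override


RECORD_FIELDS = (8, 14, 12, 2, 1)   # widths of [C1]..[C5]
RECORD_LEN = sum(RECORD_FIELDS)     # 37 characters per record
MAX_RECORDS = 5                     # [X] may not exceed 5


def parse_voucher_batch(data: str):
    """Split DATA into (packet_no, total_packets, [Voucher, ...])."""
    packet_no, total, count = int(data[0:2]), int(data[2:4]), int(data[4:6])
    if count > MAX_RECORDS:
        raise ValueError("more than 5 records in one packet")
    if len(data) != 6 + count * RECORD_LEN:
        raise ValueError("DATA length does not match record count")
    vouchers, pos = [], 6
    for _ in range(count):
        parts = []
        for width in RECORD_FIELDS:
            parts.append(data[pos:pos + width])
            pos += width
        vouchers.append(
            Voucher(parts[0], parts[1], parts[2], int(parts[3]), int(parts[4]))
        )
    return packet_no, total, vouchers


if __name__ == "__main__":
    sample = "010201" + "03040001" + "55551234567890" + "200303041530" + "02" + "1"
    print(parse_voucher_batch(sample))
```

The sample values in the usage lines are invented for illustration only.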

Reply

Batch detail Response

| Direction | STX | CMD | LEN | DATA | ETX | CHK |
|---|---|---|---|---|---|---|
| HOST => TERM | [STX] | FVBU | 0002 | [X] | [ETX] | [CHK] |

Notes

[X] = Number of records received (2 bytes, numeric)

LEN = 2

Parameter File - Host to terminal

Request

| Direction | STX | CMD | LEN | DATA | ETX | CHK |
|---|---|---|---|---|---|---|
| TERM => HOST | [STX] | IPAF | 0000 | | [ETX] | [CHK] |

Notes

This will be a request for the parameter file.

Reply

| Direction | STX | CMD | LEN | DATA | ETX | CHK |
|---|---|---|---|---|---|---|
| HOST => TERM | [STX] | FPAF | 0153 | [A][B][C][D][E][F][H][I][J][K][L][M][N][O][P][Q] | [ETX] | [CHK] |

Notes

[A] = Merchant Number (14 bytes, numeric)

[B] = Site Venue (24 bytes, alphanumeric)

[C] = Account Site Number (10 bytes, numeric)

[D] = Next Time (HHMM, 4 bytes, numeric)

[E] = Next Call Method (3 bytes, alphanumeric)

[F] = Next Call Number (10 bytes, numeric)

[H] = Next Sequence Number (8 bytes, numeric)

[I] = Velocity Values (2 bytes, numeric)

[J] = Tickets Per Voucher (2 bytes, numeric)

[K] = Manager One Pin Number (6 bytes, numeric)

[L] = Manager One Card Number (14 bytes, numeric)

[M] = Manager Two Pin Number (6 bytes, numeric)

[N] = Manager Two Card Number (14 bytes, numeric)

[O] = NUA1 (12 bytes, numeric)

[P] = NUA2 (12 bytes, numeric)

[Q] = NUA3 (12 bytes, numeric)

Description

• Merchant Number. Unique number allocated by Freestyle to each individual card reader. This number will be used to uniquely identify the site.
• Site Venue. Indicates the site and the venue of the card reader.
• Account Site Number. Number for accounting purposes.
• Next Call Date and Time. Next time for upload.
• Next Call Method. Will not be used for launch, but may be used in the future if a different medium is available to transfer data. The default value will be "X25".
• Next Call Number. Number needed to use a different medium. The value will be populated with 10 nulls.
• Next Sequence Number. Specifies the next number to be printed on the next voucher issued. If this value is not provided, the next sequential number in the card reader must be used. The voucher number will be as follows: MMDDXXXX, where XXXX is the sequential number.
• Velocity Values. Specifies the time required to elapse before a card can be approved as a valid re-swipe. Default will be 24 hours.
• Tickets Per Voucher. Specifies the maximum number of tickets that can be issued on a voucher. Default value of 000 will mean according to the maximum number of participating members on the card.
• Manager Pin Number. Stores the manager's pin number. If this pin number is not provided, the old number must be retained.
• Manager Card Number. Card number of the manager, to be used to override the system.
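The fixed-width layout of the parameter file reply can be checked with a short Python sketch. The field widths are those listed in the notes for [A] to [Q] and add up to the stated LEN of 153. Padding unused positions with nulls on the right is an assumption (the text only states this for Next Call Number), the compressed numeric encoding is not modelled, and the dictionary keys and function name are hypothetical.

```python
# (name, width) pairs for fields [A]..[Q] of the FPAF reply, in order.
PARAM_FIELDS = [
    ("merchant_no", 14), ("site_venue", 24), ("account_site_no", 10),
    ("next_time", 4), ("next_call_method", 3), ("next_call_no", 10),
    ("next_seq_no", 8), ("velocity", 2), ("tickets_per_voucher", 2),
    ("manager1_pin", 6), ("manager1_card", 14),
    ("manager2_pin", 6), ("manager2_card", 14),
    ("nua1", 12), ("nua2", 12), ("nua3", 12),
]

PARAM_LEN = sum(width for _, width in PARAM_FIELDS)  # 153, matching LEN 0153


def build_parameter_data(values: dict) -> str:
    """Concatenate the fields into one fixed-width DATA string."""
    out = []
    for name, width in PARAM_FIELDS:
        text = str(values.get(name, ""))
        if len(text) > width:
            raise ValueError(f"{name} is longer than {width} characters")
        # ASSUMPTION: missing or short values are right-padded with nulls.
        out.append(text.ljust(width, "\x00"))
    return "".join(out)


if __name__ == "__main__":
    data = build_parameter_data({"site_venue": "Test Cinema", "next_call_method": "X25"})
    print(len(data))
```

Keeping the widths in one table makes it easy to confirm that a change to any field still yields the declared message length.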

Good List Upload - Member Data

Data upload

Request

| Direction | STX | CMD | LEN | DATA | ETX | CHK |
|---|---|---|---|---|---|---|
| TERM => HOST | [STX] | IGLU | 0012 | [A] | [ETX] | [CHK] |

Notes

[A] = Sequence number [CCYYMMDDXXXX] (12 bytes, numeric), where XXXX indicates the package to be sent

Reply

| Direction | STX | CMD | LEN | DATA | ETX | CHK |
|---|---|---|---|---|---|---|
| HOST => TERM | [STX] | FGLU | 4 bytes | [A][B][C][D][E] | [ETX] | [CHK] |

Notes

[A] = Next sequence number [CCYYMMDDXXXX] (12 bytes, numeric)

[B] = Number of cards [X] to follow (up to a maximum of 13) (2 bytes, numeric)

[C] = Add "A" or Delete "D" (1 byte, alphanumeric)

[D] = Card number (14 bytes, numeric)

[E] = Number of dependants (2 bytes, numeric)

Fields [C][D][E] are repeated [X] times, where [X] may not be > 13.

LEN = Variable

If the host returns the same sequence value, this indicates that there are no more cards to download. In this case [X] will also be zero.
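The good list reply can be sketched in Python as follows. The sketch splits a member list into packages of at most 13 cards and builds the DATA field for one package. It assumes plain ASCII digits rather than the compressed numeric encoding, and the function names and sample card numbers are hypothetical.

```python
MAX_CARDS = 13  # maximum number of cards per FGLU message


def chunk_cards(cards, size=MAX_CARDS):
    """Split the full card list into packages of at most 13 cards."""
    return [cards[i:i + size] for i in range(0, len(cards), size)]


def build_good_list_data(next_seq: str, cards) -> str:
    """Build DATA = [A][B] followed by [C][D][E] for each card.

    cards is a list of (action, card_no, dependants) tuples, where
    action is "A" (add) or "D" (delete).
    """
    if len(next_seq) != 12:
        raise ValueError("sequence number must be 12 characters (CCYYMMDDXXXX)")
    if len(cards) > MAX_CARDS:
        raise ValueError("at most 13 cards per message")
    parts = [next_seq, f"{len(cards):02d}"]
    for action, card_no, dependants in cards:
        if action not in ("A", "D") or len(card_no) != 14:
            raise ValueError("invalid card record")
        parts.append(f"{action}{card_no}{dependants:02d}")
    return "".join(parts)


if __name__ == "__main__":
    members = [("A", "55550000000001", 2)] * 30
    packages = chunk_cards(members)
    print([len(p) for p in packages])
    print(build_good_list_data("200303040003", packages[2]))
```

An empty card list with an unchanged sequence number corresponds to the "no more cards to download" case described above.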

Supervisors Card

There will be two cards issued per site. The cards will have the following detail:

• Track One - "NU" + card number (5555XXXXXXXXC), where XXXXXXXX is the unique number and C the check sum.
• Track Two - Merchant Number

Each manager will be allocated a 6-digit password. Once the supervisor's card is swiped, the manager will be prompted to enter the password.

The password will be validated against the data that has been downloaded in the parameter file.

