Advanced SCM Branching Strategies

[email protected]

Copyright 1998, Stephen Vance. Permission is granted for Perforce Software, Inc. to copy and distribute.

Abstract

In Software Configuration Management (SCM) systems, branching allows development to proceed simultaneously along more than one path while maintaining the relationships between the different paths. It is a fundamental technique behind any well-organized large-scale development, maintenance, and release effort. Branching strategies sufficient for small-scale efforts are inefficient and

counterproductive when applied to large-scale efforts.

In this paper, I first define branching in a general sense. I then discuss various strategies for branching, starting with the obvious

and moving up to several that are more appropriate for larger development efforts. Along the way, I discuss the pros and cons of each strategy, using them to motivate the changes that compose the more complex strategies.

These strategies are based on experience with several SCM systems on development projects ranging from tens of thousands to several million lines of code. These projects were developed by up to several dozen people concurrently, some in an internationally distributed environment.

Introduction

Branching is a relatively simple mechanism. Its sophisticated interactions with

technical and managerial issues confound and stymie many. The most obvious reason

for branching is to start an alternate line of development. This explanation is so

generic as to encompass all reasons for branching. Practically speaking it is only

marginally useful. It is more appropriate to ask under what circumstances one would

want to start an alternate line of development in the development process. To answer

this question, we must understand where a branch begins and what a branch represents in the development environment.

At the heart of any organization's development infrastructure is the SCM tool. The

SCM system as a whole consists of the customizations and policies necessary to adapt

the SCM tool to the way the organization wants to use it. In some cases, the

organization must adapt to the way the SCM tool needs to be used. Regardless, the

state of the practice in SCM is to buy a tool and then figure out how it should be used.

This paper advocates a planning and analysis based approach to the significant issue

of branching. It begins with a quick overview of the strategies most organizations try, concluding with a statement of the assumptions used throughout the rest of the

discussion. From there, a discussion of codeline policy and codeline ownership

follows, with deeper treatment of branchpoints, merging and branch life span. Next

the concept of branch roles is introduced and applied to the common task of release

management. Within this conceptual framework, specific examples of release

branching strategies are built up and examined. Finally, the discussion concludes with


coverage of specialized branching topics: projects spanning releases, derivative

development projects, distributed development, and some unusual variations.

The Obvious Strategies

This section describes how organizations typically discover and grow into SCM. It

concludes with a summary of the level of SCM maturity that is expected as a

foundation for the rest of the paper.

When the issue of how code should be managed in an organization first arises, many

companies define mechanisms to maintain source in some directory structures. They

define methods to address various self-evident issues, such as overwriting each

other's changes, checkpointing, and code replication and integration. For example,

there may be a script that invokes the editor only if some well-known file does not

exist. Copies of the source base may be made periodically to capture some release state. Ports may be accomplished through copying the source and manually

reintegrating the changes.

Later, an organization may find one of the various freely available SCM tools, such as

RCS [TICH85], SCCS [ROCH75] or CVS [BERL90]. At first it will likely develop

entirely on the trunk, the sequence of versions that develops when no branching is

used. Eventually, features like locking, labels, numbering, and branching may be

discovered, but are seldom used effectively. Up to this level, an organization lacks the

maturity to plan many of its processes, eliminating the possibility of many useful

but more complex strategies for the management of their development environment.

Similarly, the tools that are being applied are limited in their capabilities to flexibly

address the issues.

The organization applies branching strategies to its environment without realizing it.

They are defining policies, rudimentary though they may be, that govern how

concurrent development and releases are managed.

Several problems are typically encountered and overcome through this stage of an

organization's growth. We will assume for this discussion that these problems have been encountered and overcome. We will make the following assumptions about how

an SCM environment is managed:

The sharing of workspaces is absolutely forbidden.

The organization is working with an industrial-strength SCM tool. Such a tool is characterized by a client-server architecture, mnemonic branch names, explicit workspace support, and enhanced merging capabilities based on version ancestry analysis.

The organization recognizes that codeline policy and codeline ownership are essential to a well-run SCM environment. A codeline is the abstraction of workflow and time embodied concretely with a branch and its sequence of versions.

The organization has defined packagings of their results, e.g. releases or project builds.

Developing Branching Strategy and Codeline Policy

This section defines the critical concepts of a codeline policy, a codeline owner and a

branching strategy. Three main attributes of a branch are identified and discussed to

assist in the formulation of codeline policy: branchpoints, merging policy and branch life span.

A codeline policy describes the rules governing check-ins, merges and other uses of a

codeline. Each branch has an associated codeline policy dictating how it should be

used. [WING98] advocates codeline policy and recommends that one should branch

on incompatible policy. Therefore, a new branch should be created when changing

development needs require a change in the current codeline's policy. A branch

provides a mechanism by which one can support a newly required set of policies

without changing the policies that are already in effect.
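The "branch on incompatible policy" rule can be illustrated with a minimal sketch. The `CodelinePolicy` fields and the `needs_new_branch` helper below are hypothetical names chosen for illustration; a real codeline policy would carry many more rules, and what counts as incompatible is a judgment for the codeline owner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodelinePolicy:
    """Hypothetical, heavily simplified codeline policy."""
    allows_new_features: bool
    allows_unreviewed_checkins: bool
    must_always_build: bool


def needs_new_branch(current: CodelinePolicy, required: CodelinePolicy) -> bool:
    """Assumption: any differing rule makes the two policies incompatible."""
    return current != required


# Illustrative policies only; the field values are invented for the example.
mainline_policy = CodelinePolicy(
    allows_new_features=True,
    allows_unreviewed_checkins=False,
    must_always_build=True,
)
experimental_policy = CodelinePolicy(
    allows_new_features=True,
    allows_unreviewed_checkins=True,
    must_always_build=False,
)

print(needs_new_branch(mainline_policy, mainline_policy))
print(needs_new_branch(mainline_policy, experimental_policy))
```

Work that fits the existing policy stays on the existing codeline; work that needs different rules gets its own branch, leaving the policy already in effect untouched.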

Codeline policies should not be arbitrarily invented. They are derived from the

organization's software development requirements. They are shaped by the answers

to a number of detailed questions about how a company releases its software, how

they plan to develop their software, and what range of software packagings they need

to produce. Within an organization, certain sets of these answers will define common

approaches to development, and many codeline policies will resemble each other.

Bear in mind that not all codeline policies apply in every company's environment.

[APPL98] and [WING98] both recommend that each codeline have a codeline owner. It is the codeline owner's job to rule on any questions regarding the codeline policy

and to ensure that any maintenance issues defined or inferred in the codeline policy

are successfully carried out. Sometimes the codeline owner will do the integration, but

he is at least responsible for delegating it.


A branching strategy consists of the guidelines within an environment for the creation

and application of codeline policies. Creating a branching strategy consists of:

identifying the categories of development that can be easily characterized,

defining the differences and similarities between them,

defining how they relate to each other, and

expressing all of this information as codeline policies and branches.

In addition, there needs to be an owner of the branching strategy who will have final

judgment on changing policy guidelines.

A codeline policy identifies how a branch should be used, but this assumes that the

branch exists. The branching strategy sets parameters for the issues relating to

branch creation, interaction, and retirement. These aspects of the branch's lifetime are represented by the branchpoint, the merging policy, and the branch life span.

Generally, the branchpoint is fully defined before the branch is created for use. The

full specification of a branchpoint usually occurs through a label, but can also be

described by date and time or specific version numbers. Although other branchpoint

creation strategies exist, they are not within the scope of this discussion.

Identification of the need for a branchpoint occurs when the need for a different type

of project causes a change in branch policy. Any new project or type of project on any

branch carries with it the possibility to require a new branch, and therefore to define a branchpoint.

Merging is the process by which one codeline is integrated into another. Merging

occurs when there is utility in applying any set of changes on one branch to another

branch. Generally, merging is relevant when the source and target branches have

common ancestors. Ancestral relationships beyond those having a common immediate

branchpoint have varying levels of support in SCM tools.

The merge policy of a branch describes how frequently the branch is merged to other

branches. This policy can be divided into the import policy and the export policy. The

import policy for a branch defines when the codeline owner should have work on

other branches merged to it. The export policy is usually defined with respect to

recognizable characteristics of the development assigned to the branch, such as

stability or completeness. It also may be responsive to other events, such as imports

and other incoming merges, time intervals, or the branch's life span.
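The split between import and export policy can be sketched as two independent predicates over a branch's status. Everything named here (`BranchStatus`, `should_export`, `should_import`, and the 14-day default) is an assumption made for illustration; in particular, triggering imports on a time interval is only one possible import policy.

```python
from dataclasses import dataclass


@dataclass
class BranchStatus:
    """Hypothetical snapshot of a branch, as judged by its codeline owner."""
    stable: bool
    complete: bool
    days_since_last_import: int


def should_export(status: BranchStatus) -> bool:
    """Export policy keyed to stability and completeness of the work."""
    return status.stable and status.complete


def should_import(status: BranchStatus, max_days_between_imports: int = 14) -> bool:
    """Import policy keyed to a time interval (an assumed, illustrative rule)."""
    return status.days_since_last_import >= max_days_between_imports


in_progress = BranchStatus(stable=True, complete=False, days_since_last_import=20)
finished = BranchStatus(stable=True, complete=True, days_since_last_import=3)

print(should_export(in_progress), should_import(in_progress))
print(should_export(finished), should_import(finished))
```

Keeping the two predicates separate mirrors the text: a branch may be due to take in changes from elsewhere long before its own work is ready to be merged out.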


Life span refers to the amount of activity between branch creation and

decommissioning. Life span is a qualitative attribute, not a quantitative measure. A

branch's life span is discussed as it compares to that of other branches.

In summary, a codeline policy defines the rules governing the use of a codeline or

branch. A branching strategy consists of the guidelines for creating and applying

codeline policies within an organization. Its primary purpose is to define a collection

of template codeline policies that can be applied to form a coherent development

environment. As a codeline has an owner to resolve ambiguities in a codeline's

policy, a branching strategy should also have an owner to resolve conflicts between

codeline policies and to mentor the creation of new ones.

Branch Roles and Release Management

This section discusses five roles that branches can fulfill in the process of release management: mainline, development, maintenance, accumulation, and packaging.

These roles are individually addressed and applied to aspects of a prototypical three-

level release structure. Each role is defined and discussed with respect to the three

attributes of branchpoint, merge policy, and life span. A discussion of the role of risk

in the application of roles to a release branching strategy is included. Factors of risk

guide many of the decisions in performing release management in an SCM system.

Most software development efforts have some form of incremental deliverables,

usually manifested as a series of releases. For this reason, we will discuss the

relationships between branching and release management. There are five main roles

that need to be considered for branches in planning toward the goal of a single release:

1. mainline,

2. development,

3. maintenance,

4. accumulation, and

5. packaging.

Note that the same branch can fill two roles. Roles do not require their own branch, so

long as the role policies do not compete or their influences can be reconciled.

Typically, there are two to three levels of release, named by numbers connected with

periods (e.g. 1.2.3). This paper works with a three level release structure for greatest

applicability. The assignment of sequential numbers and a hierarchy of change


semantics are not intended to suggest that this scheme corresponds to the scheme

determined by an organization's marketing department for public consumption. Many

have argued that the two should have no correspondence between them, even

suggesting the use of code names internally as the only designator for a release. This

paper takes the more moderate position that a correspondence can exist so long as it

works for both purposes. If a hierarchical structure best communicates the semantics

of the environment, use it and let the marketers invent a structure that suits their

needs.

In this structure the first number is associated with a major version, indicating that it

has significant feature and functional enhancements from the previous; there may also

be significant incompatibilities that require migration. The second number represents

a minor version, which contains lesser feature and function enhancements, a

significant number of bug fixes, and no incompatibilities. The third number refers to a

patch level, signifying almost exclusively a collection of bug fixes; no feature or function enhancements and no incompatibilities are allowed between patch levels.
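The three-level semantics above can be summarized as a rule for the smallest release level that may carry a given set of changes. This is a sketch under stated assumptions: the `Change` categories and the `minimum_release_level` function are hypothetical names, and deciding whether an enhancement is "significant" or "lesser" remains a human judgment.

```python
from enum import Enum
from typing import Iterable


class Change(Enum):
    """Hypothetical change categories derived from the release semantics."""
    BUG_FIX = "bug fix"
    LESSER_ENHANCEMENT = "lesser feature or function enhancement"
    SIGNIFICANT_ENHANCEMENT = "significant feature or function enhancement"
    INCOMPATIBILITY = "incompatibility requiring migration"


def minimum_release_level(changes: Iterable[Change]) -> str:
    """Return the lowest release level allowed to contain these changes."""
    kinds = set(changes)
    # Incompatibilities and significant enhancements belong to major versions.
    if kinds & {Change.SIGNIFICANT_ENHANCEMENT, Change.INCOMPATIBILITY}:
        return "major"
    # Lesser enhancements are allowed in minor versions but not patch levels.
    if Change.LESSER_ENHANCEMENT in kinds:
        return "minor"
    # Only bug fixes (or nothing) remain, which a patch level can carry.
    return "patch"


print(minimum_release_level([Change.BUG_FIX]))
print(minimum_release_level([Change.BUG_FIX, Change.LESSER_ENHANCEMENT]))
print(minimum_release_level([Change.BUG_FIX, Change.INCOMPATIBILITY]))
```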

It can be easily seen that even within a particular type of release, there can be several

different kinds of development projects at work, suggesting different policies

governing their management. As stated earlier, different policies suggest different

branches. Therefore, any given release is unlikely to be properly represented by a

single branch in the SCM tool.

The Mainline Role

The mainline is an important role in the proper management of a development effort.

The purpose of a mainline is that of a central codeline to act as the basis for

subbranches and their resultant merges.

The assumption present is that all of mainline's subbranches are related through an

ancestor, not only through a strict version relationship, but also in purpose. Also, the

fact that the mainline codeline is central implies that it is a singleton, a one-of-a-kind.

Often mainline is incorrectly bound in concept with the main branch in a version tree, frequently through the SCM tool vendor's naming. For example, ClearCase gives it

the name /main [ATRI94]. Perforce is better with //depot, but some Perforce

documentation suggests //depot/main for the trunk.

It is natural to want to consider the trunk as the mainline as shown in Figure 1(a). It

has several properties that would suggest its use for this purpose. First, it is the branch


on which most code starts its life; in some SCM systems, all code starts on the trunk.

As the primary lifeline for file creation, it is central to all successive development and

has an ancestral relationship to all development. Since the trunk usually can not be

deleted, its lifetime is guaranteed to exceed that of any and all subbranches, making it

a ripe candidate for any branch parenting. There is no branchpoint in a mainline that

resides on the trunk.

Figure 1. Mainline variants

However, in an environment with multiple independently developed products or

independent component groups with differing release cycles, merging these lines of

development into a common parent branch is questionable. Additionally, if there are

multiple geographic locations for development, it may be reasonable to provide each site with its own mainline and synchronize their mainlines. If one site owns the trunk

as its mainline, one creates an asymmetrical development environment that is

unnecessarily complicated to maintain. In this case, the branchpoint of a mainline will

be on the trunk, usually off of the trunk's head revision at the time the mainline is

created.

If you have only one product or family of products, do not segregate the departments

internally onto differing release schedules and do not have geographically distributed

development, using the trunk as the mainline is probably adequate for your purposes.

If you are engaging in any of the above activities, you should seriously consider using

subbranches of the trunk as the mainlines on your various projects as shown in Figure

1(b). Some of these activities will be discussed in more detail later.
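The guideline for placing the mainline reduces to three questions, sketched below. The function name, parameter names, and return strings are hypothetical; the sketch encodes only the rule of thumb stated above and is no substitute for the fuller analysis of each case.

```python
def recommended_mainline(product_families: int,
                         differing_release_schedules: bool,
                         distributed_development: bool) -> str:
    """Suggest where the mainline should live, following the rule of thumb.

    The trunk is probably adequate only when all three complicating
    factors are absent; otherwise consider subbranches of the trunk.
    """
    if (product_families <= 1
            and not differing_release_schedules
            and not distributed_development):
        return "trunk"
    return "subbranches of the trunk"


# One product, one schedule, one site: Figure 1(a).
print(recommended_mainline(1, False, False))
# Several sites developing the same product: Figure 1(b).
print(recommended_mainline(1, False, True))
```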

Usually, the mainline will not have to deal with merge policies. Particularly if the

mainline is on the trunk or the various mainlines are mutually independent, there is

nothing with which the mainline should synchronize. However, in the event of

multiple mainlines, such as one might use to support distributed development, the

mainline may have a merge policy in order to synchronize mainlines. Some specifics

are given below in the discussion on distributed development.

The life span of the mainline is the life span of the code base. In a strict release model,

the mainline will have the longest life span of any branch. This is not necessarily the

case in a more true-to-life production environment or with projects that do not have to

obey a release cycle. In distributed development it may be shortened in some cases to

the life span of development at that location. In an environment with multiple


mainlines, the code base in question is the code associated with the product or

component that motivated the creation of the mainline.

The Development Role

Development is the activity that produces the feature and function enhancements that

characterize major and minor releases. Several branches in each release are likely to

assume this role. The key concept behind development is the creation of new

functionality, generally a higher risk activity than the simple fix.

Discussion of development introduces risk into the equation of branch creation, a

topic that deserves elaboration. Risk mitigation is the single largest force driving the

evolution of software life cycle models. In the context of a life cycle, insulating the

overall system's exposure to a change reduces risk. Tackling the change in ways that

reduce the investment or limit the impact does this. The waterfall model tried to reduce risk by planning a detailed road map in advance. Recognizing the impossibility

of this approach in most practical situations, the spiral model calls for contained

cycles of incremental development followed by review and preceded by corrective

planning. Other approaches such as RAD and Rapid Prototyping continue this trend.

Branching works cooperatively with these life cycles by providing a mechanism for

physically isolating riskier development ventures from the code base. This allows the

project leadership to have all of the benefits of SCM without imposing unstable code

on the rest of the developers.

In general, consider using separate branches for each high-risk project. High risk

projects are characterized by large size, large numbers of people, unfamiliar subject

matter, highly technical subject matter, very tight time lines, uncertain delivery dates,

incomplete or volatile requirements, and geographically distributed project teams.

Similarly, consider designating a single branch for low risk development in each

release. Several sources including [WING98] recommend using the mainline for this

purpose. Consider the factors discussed above for the mainline before committing to

this course of action. Low risk development may have different policy from the

mainline even if you have multiple members of a product family coordinating through the mainline.

At the time the development is started both low- and high-risk development branches

almost always will have their branchpoints on the mainline as the head revision.

Subprojects of high-risk development will have their branchpoint as the head revision

on the parent development branch when the sub-project is started. An independent


low-risk development branch will usually share a branchpoint with the mainline, use

the first revision of the mainline, or branch from the head revision when the first low-

risk project is started.

In a release environment, development branches will always have a merge policy.

This merge policy may not be invoked in the event that a particular development is

cancelled, but it will have been defined regardless. The policy is usually one of

merging to the parent branch when the development is finished. Sometimes in

incremental or distributed development this structure will be more complex. We will

deal with those cases below.

Development branch life span is usually the duration of the development project

effort. Sometimes, high-risk development (HRD) will require fixes in the release cycle in which it was

developed. Some organizations have these fixes also occur on the development

branch, therefore extending the life of the branch. This approach to development will also have an impact on the merge policy for the branch.

The Maintenance Role

Maintenance usually designates bug fixing activities. Analysis of maintenance

branching is very similar to that of development branching in that a risk based

approach clarifies many of the issues. Most bug fixes can be characterized as lower

risk than almost all development projects. It is usually acceptable for bug fixes for a

release to be performed on the mainline. In this case there is no branchpoint that is

distinct from the mainline's.

Environments in which the mainline must remain stable with high reliability will want

to move bug fixing to its own branch. This branch clearly has different policy from

both mainline and from low-risk development. In this situation the branchpoint will be

determined in the same way as the branchpoint for low-risk development.

There is a category of bug fix that should be considered for its own branch. In any

code base, particularly as it ages, situations arise in which a bug fix can have a highly

destabilizing effect. This is more likely tied to the nature of the bug. It occurs more frequently when the code base is being pushed well beyond the limits of its original

design. The branchpoint here will be determined like the branchpoint for a high-risk

development project. This is an example of what [APPL98] refers to as an Activity

Branch.


For maintenance, the merge policy is usually as simple as or simpler than the

development policy, primarily because the scope of the maintenance projects is

smaller. Distributed development can complicate maintenance merge policies, as well,

but often this is handled the same way as mainline, accumulation line or development.

Obviously, when the maintenance is performed on the mainline, life span is not an

issue. When it is performed on its own set of branches, the policies tend to look like

development policies.

The Accumulation Role

Toward the end of each release cycle, the need arises to consolidate the efforts of

various activities that required their own branch. Depending on the quantity of

branches and the significance of their changes, the integration of a release effort can

be a project in itself. This factor alone is a risk in the planning of the release as a whole. This risk can be mitigated through the "Propagate Early and Often" tenet in

[WING98].

The branch satisfying the accumulation role acts as the focus for merging the final

results of various subbranches. Often accumulation takes place by merging to the

mainline. Here, as we saw in the case of low-risk fixes, the accumulation branch is

indistinct from the mainline and therefore has no branchpoint of its own. Similarly,

unless multiple related mainlines are in effect, it has no distinct merging policy. The

branch's life span in this model is identical to that of the mainline.

However, sometimes it is necessary to merge to a branch independently from the

mainline as an intermediate step. This would be followed by a merge from the

accumulation branch to the mainline. This strategy is recommended in two situations.

First, when the code base is large and the changes that have not been merged back to

mainline are substantial. Second, when the integration team has several people that

need to share intermediate integrated state. In the latter case, the branchpoint is

usually identified by the head revision of the mainline when the integration needs to

take place. The merging policy for such a branch will minimally indicate that the

accumulation will be merged to the mainline when the accumulation is finished. Additional intermediate merges may be called for depending on the accumulation

branch's stability and content. This branch will tend to have a short life span,

spanning only the time necessary to integrate the projects and fix any conflicts.
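The two situations that call for an intermediate accumulation branch can be written as a simple check. This is a sketch under assumptions: the function and parameter names are hypothetical, "large" and "substantial" are left as judgments supplied by the caller, and "several people" is read here as more than one integrator.

```python
def use_intermediate_accumulation(large_code_base: bool,
                                  substantial_unmerged_changes: bool,
                                  integrators: int) -> bool:
    """Decide whether to merge via a separate accumulation branch first."""
    # Situation 1: a large code base with substantial unmerged changes.
    big_integration = large_code_base and substantial_unmerged_changes
    # Situation 2: the integration team must share intermediate state.
    # Assumption: "several people" means more than one integrator.
    shared_intermediate_state = integrators > 1
    return big_integration or shared_intermediate_state


print(use_intermediate_accumulation(False, False, 1))
print(use_intermediate_accumulation(True, True, 1))
print(use_intermediate_accumulation(False, True, 3))
```

When the check is false, accumulating directly on the mainline, as described earlier, is the simpler choice.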

Another way to accumulate that is useful in an environment requiring a high-

reliability mainline, is an accumulation branch that parallels the entire mainline. In


this case the branchpoint would be that of the mainline, the first version of the

mainline, or the head of the mainline when the first accumulation is required. The

merging policy in this situation can require considerable thought to achieve regularity

and consistency, particularly in a multiple mainline environment. This model's

accumulation branch has life span almost as large as mainline's, but is shorter due to

the eventual merge to mainline for packaging.

The Packaging Role

The packaging role is often confused with the accumulation or, more commonly,

mainline roles. Once the intended development and maintenance have been performed

and any accumulation has been done, it is time to prepare the code for release. Such

an effort may not be trivial, requiring a team of release engineers and additional fixes

beyond those already performed. The policy on a packaging branch is significantly

different from that on a maintenance branch: as the packaging role suggests, only the changes necessary to make the product releasable should be addressed.

If work is to proceed on the other product branches, as is likely to happen if patch

levels of the product are to be produced, one does not want the release effort to stall

progress toward the next patch level. Other packaging branching strategies could even

keep minor versions running off of the same mainline, compounding the potential for

a stall while the packaging activity takes place.

Using a separate branch to insulate the release effort from the ongoing development

and maintenance, and vice versa, is recommended. In a multi-platform environment, it may be advisable to create one packaging branch per platform for the final porting

effort. If the porting efforts are staggered, this allows the staggered releases to be

reflected in the version hierarchy. If the porting efforts are simultaneous,

accumulating the per-platform packagings to a master packaging branch, from which

the final build would be performed, should also be considered. This should be

determined in advance, as creating the separate packaging branches from the master

packaging branch works best.

In any case, the branchpoint for the primary packaging branch should be either the head revision or the latest stable revision of the mainline. Some strategies will want to

use the accumulation branch instead of the mainline where they are distinct. The

branchpoint for each packaging branch should be the head revision of the master

packaging branch.


Packaging branches tend toward the same rules that apply to development in a single-

site environment. Even in a distributed environment, the release responsibility usually

resides at one location. Therefore, the packaging branches tend to exist only off of one

mainline and do not need to be reflected at other sites. They also tend to be merged

when they are complete, and their results are propagated to other locations through the

merge policy of their parent branch, either mainline or the main accumulation branch.

Even a packaging accumulation branch is usually owned by one site.

A packaging branch will have life span similar to that of a development branch for a

small- to medium-sized development project. A packaging accumulation branch may

have a slightly longer lifetime, lasting until all of its subordinate branches have

expired.

Branching Roles Summary

There are five branching roles that can be applied to release management: mainline,

development, maintenance, accumulation, and packaging. Within a release branching

strategy, there may be a many-to-many mapping between roles and branches.

The mainline role is the central codeline around which all other branches are

coordinated. The mainline does not necessarily reside on the trunk of the version tree.

The development role is applied to branches supporting the creation of feature and

functional content of a product. The primary motivation for the development role is

the mitigation of risk in the development process. Risk mitigation is accomplished by using branches to isolate development from other streams of activity and vice versa.

Once the development effort has been stabilized, it can safely be merged into the main

flow of progress without putting the whole organization's efforts at risk.

The maintenance role is associated with bug fixing and is characterized by low-risk

activities. There are ways to manage these low-risk activities with less overhead than

would be warranted for development. Once again, risk is the motivating influence and

certain maintenance tasks may call for more effective risk mitigation strategies.

The accumulation role provides a means for multiple activities to be safely integrated

without corrupting the main flow of activity. This recognizes that some integrations

can have a high associated risk, and that sometimes parallelizing integration and

ongoing efforts can cause instability in the environment.


Finally, the packaging role highlights the need to restrict efforts to only those fixes

necessary to release the product on all target platforms. Strategies were addressed to

allow the release effort to proceed in parallel with ongoing maintenance and

development activities.

Example Release Branching Strategies

The above discussion is somewhat abstract. Where the discussion was concrete, it

focused on small building blocks in the overall release scheme. This section zooms

out a level, and applies the abstract to three realistic release scenarios. These strategies

do not encompass all possible strategies. Such a task is impossible. Instead they

provide a foundation strategy which shows how the components fit together and

provides two expansions on it to address specific needs. The rationales provided as

each strategy is elaborated serve as examples that the reader can use to develop their

own release branching strategy.

Basic Release Strategy

Let's start with a simple but complete scenario that one might adopt, shown in Figure

2. For this example, the organization has only one product and performs work on each

release sequentially. Therefore, the mainline is on the trunk. Additionally, since the

products in question are relatively small, perhaps on the order of 100-200KLOC, the

accumulation role is on the mainline.

Figure 2. Basic release branching strategy

Since the development team is small,

everyone is dedicated to the release effort or temporarily redirected to efforts that do

not modify source code during the release effort. This allows the release effort to also

take place on the mainline, avoiding conflict with any ongoing development. This

works particularly well if there is only one platform for release or the platforms have

little differentiation.

During development, all low-risk development (LRD) and low-risk bug fixes (LRF)

are performed on the mainline. The assumption here is that none of these projects

have the likelihood to destabilize the code base or to require intermediate check-ins

for checkpointing that would have significant effect on other developers. It is likely



that these changes would be accomplished entirely in the client view and checked in

when finished.

The high-risk development (HRD) projects A and B in the example are performed on

their own branches. This activity keeps them isolated from the mainline development

until, as shown in the example, they are completed and merged to the mainline. A

single high-risk bug fix (HRF) is similarly handled.
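
The placement rule used in this basic strategy can be restated as a short sketch. The names below (`WorkItem`, `branch_for`, and the branch naming scheme) are hypothetical and exist only for illustration; they do not correspond to any particular SCM tool.

```python
from dataclasses import dataclass

MAINLINE = "main"  # hypothetical codeline name for the mainline on the trunk


@dataclass(frozen=True)
class WorkItem:
    name: str        # e.g. "A" for HRD project A
    kind: str        # "development" or "fix"
    high_risk: bool  # True if the work could destabilize the code base


def branch_for(item: WorkItem) -> str:
    """Return the codeline on which a work item should be performed.

    Low-risk development (LRD) and low-risk fixes (LRF) go directly on the
    mainline. High-risk work (HRD, HRF) gets its own branch and is merged
    to the mainline only when it is complete.
    """
    if not item.high_risk:
        return MAINLINE
    prefix = "hrd" if item.kind == "development" else "hrf"
    return f"{prefix}-{item.name}"
```

For example, `branch_for(WorkItem("A", "development", True))` yields a dedicated branch name, while any low-risk item maps to the mainline.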

Basic Release Strategy with Packaging Branches and Intermediate Accumulation

Next, let us consider the slightly more complex strategy shown in Figure 3. In this

example, we will still assume that the organization only has one product and that

products are released sequentially. Thus, we keep the mainline as the accumulation

branch and put it on the trunk.

Figure 3. Basic release development with intermediate accumulation and packaging branches

We still try to isolate our HRD projects. However, projects A and B both affect

some of the same parts of the system, so we anticipate merge conflicts in their

changes. We decide to mitigate the risk of this overlap by setting up an intermediate

accumulation branch. As soon as either project is ready to merge, we create the

accumulation branch. The diagram shows the accumulation branchpoint as distinct

from either project's branchpoint. However, the task could be simplified even further

by making the branchpoint identical to that for project B. Doing this would have the

effect of removing the intermediate mainline development from the accumulation

merge and further reducing the risk of difficulties. Notice that not all HRD has to be

merged to the intermediate accumulation branch. Nor is it required that the

intermediate accumulation branch service only two projects or be unique within a

release cycle. When the intermediate accumulation has been successfully completed,

it is merged back into the main accumulation branch, in this case the mainline.

Multiple Mainline Strategy Due to Multiple Products



Now let us assume that the company has multiple products (Figure 4 shows two),

based on a common core but otherwise independent. The trunk holds the

branchpoints for the core mainline and each of the product mainlines. The three

mainlines each look almost exactly like the mainline in the previous example.

However, their policies differ significantly, primarily in their starting compositions

and in their merge policies.

Figure 4. Two mainlines for products based on a core development mainline

The core mainline has its own release schedule, but its releases are the basis on which

the product mainlines are built. There are significant decisions an organization must

make for this model to work. The primary decision is how the core will be

incorporated into the products. The three variants discussed below are simply key

points along a continuum. It is up to the organization to define where on this

continuum they wish to fall.

The core can be a pure client to the products, in which case it would be treated much like a

third-party library. Note that by defining this method of dealing with the core, we

suggest a method for dealing with third-party libraries. Such a strategy would develop

the core to release readiness, then build and check in the resultant libraries and

headers. By merging the core release to a mainline dedicated to its release

representation, its opacity could be protected, or it could be repackaged. If neither

protection nor repackaging is required, a packaging branch for the core serves the

purpose, as shown in Figure 4. Products would then either refer to the release package

as the basis for their builds or they would merge the release package into their own

build structure, depending on organization policy or product structure.

The core can also be seen as being given to the products with a source code license.

Once again, this may also reflect a type of third party relationship. The management

of this model will be similar, except that a release package mainline may be

considered writable to include bug fixes applied by the product teams. However,

changes can be controlled and reviewed more easily if such fixes take place in the

version of the core release merged into the individual product mainlines. Fixes applied



at these levels may be merged back into the true core mainline at the core team's

discretion.

Another variation occurs when the core is more tightly coupled with the products. In

this model, the core team defines the majority of the core's functionality, but the

product teams make modifications in the same source base. In this model, the concept

of a core release is somewhat vague, as is the ownership of the core itself.

None of these models are fully reflected in any branching diagram. Their visible

manifestations are the supporting mainlines, branches and merge lines, but these do

not tell the complete story. The full model can only be conveyed through a larger

policy definition whose further details are outside of the scope of this paper.

The product mainlines are composed of some synthesis of the core release package

and their own course of development. At their beginning, they merge from a well-defined version of the core release. Possibly they incorporate more merges as core

patch levels are released. If they are operating with a source license or tightly coupled

relationship to the core, there may be merges from the product mainlines back into the

core. Otherwise, the product mainlines operate like the previous model.

Representing Release Levels through Branches

Many organizations try to tackle the issue of assigning release levels to branches

without identifying the salient characteristics of the various types of development and

how they motivate branching structures. Now that a foundation for organizing release

development with branches has been established, we can meaningfully identify what

branching strategies best serve the typical release configurations. As discussed above,

most organizations need to deal with major, minor and patch level releases. Based on

our discussion of branching roles and example application of these theories, we can

extend the framework to encompass larger pieces of the development cycle. This

section organizes the release branching strategies outlined above into units that are

meaningful to the three major release levels. We then provide specific rationale

against what [WING98] calls the "promotion" model of codeline management.

Major Versions

Major versions are characterized by significant feature and function content changes,

often accompanied by compatibility issues. The heavy development requirements of a

major release suggest that each major release should be assigned its own branch from

the mainline. Figure 5 shows an arrangement in which each major release is given a



submainline off of the product mainline. Regardless of whether the product mainline

is the trunk, a branch off of the trunk or even deeper, it is advisable to provide a major

release effort with a branch off of the mainline which should minimally act as the

accumulation branch for the release content development.

Figure 5. Concurrent major versions from the same mainline

Notice in Figure 5 that release X does not necessarily stop due to the advent of release

X+1. The fact that major releases can be concurrent introduces additions to the merge

policy of release X. In order to ensure that changes made to release X are also present

in release X+1, as will almost always be desired, the merge policy of X should state

that changes made to X should immediately merge from X to X+1.

This may seem unnecessary from the perspective of small- to medium-scale

development, but for large-scale development, it is essential. In fact, this merge policy

arrangement is a transitive effect of two statements from [WING98]: "propagate early

and often," and "get the right person to do the merge." If the correct person is

performing the merge at the correct time, there is rarely a good reason for that person

not to continue the propagation process to the next release. It is unlikely that another

person is better qualified, and any delays will distance the person performing the

merge from a clear understanding of the merge issues.

Having said that, a good reason not to merge may be when an intermediate checkpoint

of a project is being propagated to the mainline. It may be a better use of the team's

time to wait until the development is more complete so as not to further destabilize

X+1, which is typically under significant strain already. The person performing the

merge will probably be the same, and his recollection will probably still be clear when

the project is complete. A competing influence to this exception is the record keeping

necessary to ensure that all changes are propagated when the time comes. Most SCM

tools provide healthy support for determining what merges are necessary.
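
As a rough illustration of that record keeping, the sketch below lists the merges still outstanding between neighboring releases. It assumes, as a simplification, that a change keeps a single identifier as it propagates; the function name and data layout are hypothetical, and real SCM tools track integration history in their own ways.

```python
def pending_merges(releases, changes):
    """List the (change, source, target) merges still needed.

    `releases` is ordered from oldest to newest. `changes` maps each
    release to the set of change identifiers already present on its branch.
    Only neighboring releases are compared, so a change cascades one step
    at a time: X to X+1 first, then X+1 to X+2.
    """
    pending = []
    for source, target in zip(releases, releases[1:]):
        # Anything on the older release that the newer one lacks must merge.
        for change in sorted(changes[source] - changes[target]):
            pending.append((change, source, target))
    return pending
```

Running the function again after each round of merges shows the next step of the cascade, which is one way to confirm that a deferred propagation was eventually completed.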

Patch Levels

Patch levels are subreleases that contain only minor fixes against a product release

that is otherwise frozen in feature and function content. The ability to provide patch

levels against the release version of a product can be of great importance to

customers.



Generally, patch levels are easy to address. The creation of a packaging branch off of

the release mainline isolates the release effort from any ongoing work on the mainline.

The ongoing work is significantly composed of bug fixes, providing a continuing

patch effort, regardless of release efforts. This ongoing effort provides a solid basis

for additional packaging branches (or subtrees) against the further fixed mainline.

Figure 6 shows a portion of a product mainline with two different patch level releases

from it.

Figure 6. Multiple packaging branches from the same mainline for patch levels

There is one issue that needs to be acknowledged when creating patch levels through

successive releases from a mainline. We discussed above that LRD might also take

place on the mainline, depending on the environment. However, this same

development is usually not allowed to take place between patch level releases. Thus,

after the first release from the mainline, LRD is no longer allowed, violating the

principle that one should branch on incompatible policy. The recommended

alternative in this event is to put low risk development on its own branch parallel to

the mainline and declare it obsolete when the first packaging branch is created. This is

shown in Figure 7. Another approach is to create patch levels from the packaging

branch itself, which leads to complicated merge policies.

Figure 7. Using an LRD branch to resolve policy incompatibilities for patch level releases

Minor Versions

Minor versions have attributes of both major versions and patch levels. Typically

minor versions will have new feature and function development similar to that of a

major version, but smaller in scale and usually without compatibility issues. Minor

versions will also usually contain a significant number of fixes, often with some that

are larger in scope than those that would go into a patch level.

The two primary approaches to handling minor versions are consistent with the

approaches for handling major versions and patch levels. An organization with longer

release schedules, a larger code base, or a larger development team should treat minor

versions more like major versions. An organization with shorter release schedules,



less code, or a smaller development team may consider treating minor versions like

patch levels or eliminating patch levels completely.

It is difficult and usually not worth the effort to try to craft a hybrid approach to

handle minor versions. Treating a minor version identically to a major version keeps

the patch level releases pure to their intent without any awkward content

manipulation. The tradeoff is in the number of cascading merges like those discussed

with concurrent major versions.

Treating minor versions like patch levels effectively eliminates the concept of the

patch level and clears up any policy inconsistency that may occur when the packaging

branch is formed. It also tends to work cleanly with an environment without

packaging branches, as the purpose of the packaging branch is to segregate the release

effort from ongoing minor development. It does create a small window during the

release effort in which mainline development should stop to ensure the purity of

release builds based on release fixes. Additionally, it reduces the organization's ability

to respond to changes needed at the minor version level once a particular minor

version has been finalized; this can be circumvented through ad hoc branching.

The Promotion Model

There are several interpretations of the term promotion model in SCM systems. Most

promotion models are not desirable models for an SCM strategy. In particular, this

paper addresses the notion put forth in [WING98] by that name in order to point out

the deficiencies of such a model. In that model, projects and releases branch off of

each other. Figure 8 shows this model.

Figure 8. The Promotion Model, an example not to follow, reproduced by permission of Perforce Software, Inc. From "High-level Best Practices in Software Configuration Management."

Organizations are commonly tempted to try the promotion model for very logical

reasons. It is valid to think that future versions are built upon past versions. However,

it is illogical to presume that one should reflect the dependency between successive

versions with piggybacked branches. The fallacy of this logic is subtle and easy to

overlook; logical and physical representations of a system rarely reflect each other

precisely.

In [WING98] a mainline is put forth as a basic and incontrovertible principle of SCM

implementation. Although substantially true, the reasons are not always clear to the

less experienced SCM practitioner. Hopefully, the discussion up to this point has

sufficiently justified a mainline centric model similar to that promoted in [WING98],

although tuned to a larger scale effort.

Two of the primary deficiencies in the promotion model are the continually escalating

complexity of merge policy and the lack of a rendezvous codeline for multiple paths

of derivative development. In addition to making the management of a larger release

environment more difficult, this further complicates the kinds of development

discussed below, particularly distributed development.

Projects Spanning Releases

As an organization and its code base grow, the projects the organization undertakes

will tend to grow as well. This growth occurs for several reasons, not the least of

which is an increase in the number of products to fill new marketing targets.

Successful companies also keep their customers happy, particularly outside of the

shrink wrap markets, by providing frequent updates. This provides a return on

investment for the customer's license maintenance fees. These two forces compete and

together ensure that there will be projects whose time requirements span the normal

product release cycles regardless of the release level.

Branches usually originate from and merge to other branches within the same release

cycle. Projects that span releases have some conceptual complexity because they

conflict with this typical usage. The project spans releases because it was estimated to

take longer than a release cycle. Therefore, waiting until work on its target release has

begun is not an option.



Another possible approach would be to start the release development structure for

which the project is targeted based on the start of the project. This approach is likely

to lead to considerable merge overhead throughout the development organization to

accommodate a single project. It also may cause the paradoxical arrangement in

which the development structure for a release branches sooner than the structure for

one or more of its predecessor releases.

Neither of these approaches is palatable in most environments because they disrupt

the normal flow of development activities for the needs of a single project. The

recommended approach is to create a branch off of the latest release that meets the

stability needs of the project. Development should proceed on this branch until

successive releases have evolved to a similar point in their ability to satisfy the

project's needs. At this time a new branch should be created and the development

merged to the new branch. Finally, when the release for which the project is targeted

is created, the development can merge to a HRD branch in that development structure

and follow the normal course of development in that release.

Depending on the nature of the project and the other development that occurs in the

interceding releases, intermediate development branches may not be necessary.

However, avoiding the intermediate branch increases the possibility of difficulties during

the merge into the target HRD release branch. The decision to avoid the intermediate

branches should be taken with great consideration.

Derivative Development Projects

This section discusses the management of derivative development projects. Derivative

development is work that builds on the product code base but is not directly intended for

product release content. There are many ways to base derivative development on a

product-oriented code base. Three strategies are represented here, indicating the

possible variations that can be applied.

Many companies will, at one time or another, need to perform derivative

development. Some examples of causes of derivative development are research grants,

government contracts, proof-of-concept prototypes and customized versions of a

product. Derivative projects may end up integrated with the main product

development at some point in the future. They may be the basis for a rewrite for

commercialization. They may acquire a life of their own as an independent product.

They may reach a dead end and be retired.



There are two high-level issues that must be addressed when embarking on derivative

development. The anticipation for the integration of the derivative development back

into production is easy to answer, as it has much in common with projects that span

releases. Additionally, since much of derivative development is not intended for

reintegration, the issue of starting the project is more important. The bulk of this

section will address the issue of the requirements for the foundation upon which to

build.

In addressing the foundation issue, one must try to balance the stability of the

foundation code base, the availability of new fixes and features in the foundation code

base, and the complexity of management inherent in supporting the previous two. The

next three sections discuss three approaches to a solution, and in the process, outline

the solution space from which hybrid solutions can be formed. The section following

them addresses the reintegration issue.

A comment on life cycle issues is warranted here. Often derivative development is

immune to an organization's life cycle requirements, since it is not immediately

intended for product release content. In defining the organization's strategy for

derivative development, one should carefully define the level of compliance required

for the integration of derivative development back into the production code base. This

consideration will impede the possible flow of poorly written, marginally stable,

unreviewed, and otherwise unsavory development into your perfect, pristine and

robust product code.

Developing from a Release Package

Developing from a release package provides the highest level of stability in the

foundation code base for the project. However, unless the timing of the project

coincides with the end of a release cycle, either through coincidence or fortune, one

must choose between delaying the start of the project and forgoing any fixes and

enhancements that have accrued since the last release. For a short-term project, one

that is shorter than the typical release cycle, this decision can be significant.

The positive side of this approach is that the management complexity over the life of

the project is negligible. One only needs to consider intermediate imports and exports

on the derivative branch when a change in direction requires a reconciliation of the

derivative functionality with the foundation code base. This situation might occur

when the derivative functionality needs to be incorporated into the product

immediately or changes to the foundation code base must be brought into the

derivative development. The former case is somewhat ill advised; other means should



be used to accomplish this goal, if careful consideration or executive ultimatum

require it.

The branch for this model of derivative development should be created from the final

packaging accumulation branch. Typically, the branch already will be identified with

a label to make release rebuilds easier, providing an ideal foundation for a new

branch.

Deriving from Development in Progress

Deriving from development in progress is probably the most realistic case of

derivative development. In this model, the derivative branch is created from a

(hopefully) stable point on the accumulation branch. This branch creation will require

the placement of a label, or for systems that support it, the definition of a specific

moment in time as the base of the branch.

Although the foundation code base is not necessarily as stable as one might prefer, the

latest fixes and features are present. This approach requires greater management

effort, as the chance is higher that new fixes and features should be integrated into the

derivative development branch. Typically, the rest of the organization is unfamiliar

with the derivative development activity, so merges of their development, especially

that of HRD, should not go to the derivative branch automatically. Owners of derivative

branches are responsible for the evaluation of each possible merge, rather than

receiving them automatically.

Need-driven Branching

Need-driven branching is by far the most complicated strategy to manage, but is

sometimes the correct strategy for the task. Both of the above strategies fix their

branchpoint before the derivative development begins. Need-driven branching puts

development for a particular source file onto the branch only at the moment when the

file needs to be changed. Files that have not been changed on the derivative branch

will be mapped from the foundation branch and evolve as such. Need-driven

branching takes the tenet "branch only when necessary" [WING98] to its logical

extreme.

Need-driven branching has a complication that requires management overhead to

correct. Files changed on the need-driven branch hide later changes to the same file on

the foundation branch. However, files on the foundation branch that have not spawned

a need-driven branch remain fully visible as work on them continues. The difficulty in



managing need-driven branching arises when foundation development changes one or

more of each category of file. The complete set of foundation changes is not seen in

the need-driven branch, leaving the code base in an inconsistent state.
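
The hiding effect can be made concrete with a small sketch. It assumes a simplified model in which a foundation change is just a set of file paths and the need-driven branch is described by the set of paths already branched; the function names are hypothetical.

```python
def split_foundation_change(changed_paths, branched_paths):
    """Split one foundation change into visible and hidden files.

    A path already branched onto the need-driven branch hides later
    foundation changes to that path; every other path is still mapped
    from the foundation and shows the change immediately.
    """
    changed = set(changed_paths)
    branched = set(branched_paths)
    hidden = changed & branched
    visible = changed - branched
    return visible, hidden


def is_inconsistent(changed_paths, branched_paths):
    """True when the derivative view sees only part of a foundation change."""
    visible, hidden = split_foundation_change(changed_paths, branched_paths)
    return bool(visible) and bool(hidden)
```

A change that touches both kinds of file is the case described above: part of it appears in the derivative view and part does not, so the hidden paths are the ones that need a merge from the foundation.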

The easiest merge policy to avoid hiding effects mandates regularly merging changes

from the foundation to files on the need-driven branch. However, as in the previous

examples, this activity should be the responsibility of the need-driven branch owner,

not the mainstream development community.

Need-driven branching incorporates the latest fixes and features, while continuing to

track their evolution and enhanced stability. However, it achieves this at the expense

of management effort. Need-driven branching should be considered when the

derivative development cannot be delayed to derive from the release, when the release

will coincide fairly closely with the expected completion of the derivative

development and when the derivative development needs high reliability as quickly as

possible. Obviously, there is a considerable degree of subjectivity in assessing

whether need-driven branching is appropriate. Generally speaking, need-driven

branching should be avoided unless there is a compelling reason to use it.

Reintegration Issues

Although the intent of much derivative development is specifically not to be

integrated into the product code base, it frequently occurs that the derivative

development is useful enough to charge someone other than its original sponsor

money for it or some commercialization of it. When this happens, or when there was

some original intent to commercialize it into product features, there is a need to

reintegrate.

The approaches for reintegrating derivative development are the same as those for

integrating development that spans releases. If the release already exists when the

derivative development is ready for reintegration, create a HRD branch in the

development structure and merge the derivative development to it. If not, one should

create a branch in the latest release that satisfies the development's requirements and

subsequently treat it as release spanning development.

Regardless of how the development is reintegrated into a release, this is where the life

cycle compliance issues will crop up. It is recommended that all quality checks

required in the regular development life cycle be applied to the final HRD branch

before it is merged to the accumulation branch. The task will be simpler if the life



cycle was applied to the derivative development, but this is often in conflict with the

needs of derivative development.

Distributed Development

Distributed development is defined for purposes of this discussion as development

carried out at multiple locations with the same common goal. Distributed development

may require considerably more effort to synchronize than that of any of the previously

discussed arrangements. Many factors may conspire to impede simple

synchronization:

The same project may be worked on at multiple locations.

The organization may want to track development at each location

independently.

The network connection between locations may be slow, on-demand,

unreliable, costly or nonexistent.

The network overhead of the SCM tool may further prohibit use on the above

network connection.

Security concerns may require additional layers of networking for encryption

and authentication, further slowing an already bandwidth bound arrangement.

Security concerns may require that some locations have more limited access

than others.

International trade or labor laws may constrain the work environments

differently between locations.

Through all of this, reducing the already high management complexity of multiple

locations requires maintaining as much similarity between locations as possible. A

comprehensive discussion of distributed development issues could easily be the

subject of another paper. This section will hint at some of the solutions that can be

adopted.

As of this writing, I am aware of only two vendors that have products that directly

address the distributed development issue: Rational's ClearCase MultiSite and

Continuus' DCM option to Continuus/CM. Both provide their own unique constraints

on the solution to the problem.

The simplest solution is for all locations to be considered equal and to access a

common source repository. This sounds like the answer is at our fingertips, and we



should look no further. However, reality works against us. Reliable connections aside,

the base products of many SCM systems do not support such arrangements with

usable performance. Perforce alone seems to have reasonable performance over

slow connections, making this kind of arrangement somewhat feasible. However, it

remains to be proven whether the performance will scale as the number of locations,

number of developers, and volume of transactions grow.

An arrangement that may work well if there is a master location and several satellite

locations is for each satellite location to send in batches of changes which would be

checked in to a branch dedicated to the site's integration and propagated to the correct

development branches. If this could be done at a time when the remote location was

not working, the resultant databases could be copied back to the remote location to

reflect the total development picture. However, the return update is a high-bandwidth

operation and is not supported by all SCM tools. A variant could simply send back a

snapshot of the latest development, requiring less bandwidth, which could be

unconditionally checked in if changes had taken place. This variant ignores issues of

creation and deletion of files and directories.

Another arrangement which works well in some situations is to mirror each branch on

which distributed development is to take place. Each location maintains a set of

branches for itself and a set for each other location. If one location is designated a

master location, each location except the master need maintain only two sets of

branches. In this scheme, each location has ownership of the branches for their

location. The branches that correspond to other locations are read-only for

development purposes and are used as intermediate accumulation branches for

incoming deltas from the other locations.
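
The bookkeeping for the mirrored arrangement might be sketched as follows. The function name is hypothetical, and the sketch assumes that, when a master is designated, each satellite mirrors only the master while the master mirrors every satellite; the description above does not spell out those details.

```python
def mirrored_branch_sets(locations, master=None):
    """Map each location to the branch sets it maintains, own set first.

    Without a master, every location keeps its own writable set plus a
    read-only set for each other location. With a master (an assumption
    about how the two-set case works), each satellite keeps only its own
    set and the master's, while the master keeps a set per location.
    """
    plan = {}
    for site in locations:
        others = [other for other in locations if other != site]
        if master is None or site == master:
            plan[site] = [site] + others
        else:
            plan[site] = [site, master]
    return plan
```

With three locations and no master, each location maintains three sets; designating one master reduces each of the other two to two sets, at the cost of routing all exchange through the master.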

Each of the last two arrangements requires considerable work on the part of the SCM

maintainers to replicate branching structures and perform merges. They also put the

responsibility for merges in the hands of someone who is almost guaranteed not to be

the best person for the job.

Unusual Variations

There are two additional variations on branching strategies which, although not

usually called for, are of some interest to the discussion of branches. These are

presented more for their novelty value and to expand the reader's thinking on how

branches can be used, than as a recommendation.



Call the first variation Release On Demand. In this strategy, the mainline must be kept

as stable as possible at all times so that one can create a release at any time. This

would be accomplished by policies mandating that no code be merged to the mainline

unless it has passed considerable review and testing. This is as much of a life cycle

issue as a branching issue. At the time a release is desired, simply create a packaging

branch from the head of the mainline, build, test and ship. This could be used to

implement a daily build [MCCO93] which would be fully repeatable should the need

arise. Another application for its use would be in an environment in which the latest

development is always desired, but the reliability infrastructure is large, such as one

might find in a military or space program environment.

The other variant might be called the Smorgasbord Release. In this model all

development that is a candidate for a particular release content is branched off of the

same branchpoint. One then creates a release by picking and choosing from available

contributing branches and merging them to the packaging branch. This model might

be appropriate for a very indecisive environment, or when an organization is

sufficiently resource rich to be willing to do development that may never make it to

release content. Another situation in which this model would be useful might occur in

an environment in which high levels of reuse have been attained; in this situation, the

reusable components would reside on branches and be pulled into a project as needed.
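
The pick-and-choose step of the Smorgasbord Release can be sketched as follows. This is only an illustrative model under stated assumptions: each contributing branch is represented as a mapping from file name to new content relative to a shared branchpoint, and the function assemble_release, the branch names, and the file names are hypothetical. A real merge would be performed by the SCM tool and would involve content-level merging rather than whole-file replacement.

```python
# State of the files at the shared branchpoint (hypothetical contents).
base = {"search.c": "v1", "export.c": "v1", "ui.c": "v1", "util.c": "v1"}

# Each contributing branch maps the files it changed to their new content.
contributing = {
    "feature-search": {"search.c": "v2"},
    "feature-export": {"export.c": "v2", "util.c": "v2-export"},
    "feature-themes": {"ui.c": "v2", "util.c": "v2-themes"},
}


def assemble_release(base, selected, branches):
    """Merge the selected branches, in order, into a new packaging branch.

    Returns the packaging branch contents and a list of conflicts, where a
    conflict is a file already changed differently by an earlier selection.
    """
    packaging = dict(base)   # the packaging branch starts at the branchpoint
    touched = {}             # file -> branch that last changed it
    conflicts = []
    for name in selected:
        for path, content in branches[name].items():
            if path in touched and packaging[path] != content:
                # Leave the earlier change in place and report the clash.
                conflicts.append((path, touched[path], name))
                continue
            packaging[path] = content
            touched[path] = name
    return packaging, conflicts


# Pick two of the three available branches for this release.
release, conflicts = assemble_release(
    base, ["feature-search", "feature-export"], contributing
)
print(release)
print(conflicts)

# Selecting all three exposes the overlapping edit to util.c.
_, all_conflicts = assemble_release(base, list(contributing), contributing)
print(all_conflicts)
```

The sketch also hints at the cost of the model: the more branches that are candidates for a release, the more likely it is that two of them touch the same file and require a manual merge decision.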

Conclusion

Planning and analysis are critical to the success of any SCM system. A branching

strategy motivated by a risk-based analysis of the organization's development needs

will provide a strong foundation. Incorporating the concepts of branch roles, codeline

policy and codeline ownership will assist in performing the required analysis.

Application of the principles of branchpoint, merge policy and branch life span will

ensure that the parameters governing codeline policy are properly and completely

addressed.

Once the branching strategy has been formulated, the organization can implement the

customizations required to make the SCM tool suit its environment. Until SCM

systems have reached sufficient maturity to address the larger issues of policy,

adopting the practices put forth in this and other papers will help an organization

achieve success in its software development endeavors.

References



[APPL98] Appleton, Brad, Stephen P. Berczuk, Ralph Cabrera, and Robert Orenstein, "Streamed

Lines: Branching Patterns for Parallel Software Development," Submitted to the 1998

Conference on Pattern Languages of Program Design (PLoP'98), Allerton Park, IL,

August 1998.

[ATRI94] ClearCase Concepts Manual, Atria Software, Natick, MA, 1994.

[BERL90] Berliner, Brian, "CVS II: Parallelizing Software Development," USENIX 1990.

[BOLI95] Bolinger, Don, and Tan Bronson, Applying RCS and SCCS, O'Reilly & Associates,

Inc., Sebastopol, CA, 1995.

[JAME94] Jameson, Kevin, Multi-Platform Code Management, O'Reilly & Associates, Inc.,

Sebastopol, CA, 1994.

[MCCO93] McConnell, Steve, Code Complete, Microsoft Press, Redmond, WA, 1993.

[PERF98] "Networked Software Development: SCM over the Internet and Intranets," Perforce

Software, Inc., Alameda, CA, 1998. Available at http://www.perforce.com/perforce/wan.html.

[ROCH75] Rochkind, Marc J., "The Source Code Control System," IEEE Transactions on

Software Engineering, Vol. SE-1 No. 4, December 1975.

[TICH85] Tichy, Walter F., "RCS - A System for Version Control," Software Practice and

Experience, Vol. 15 No. 7, July 1985.

[WING98] Wingerd, Laura and Christopher Seiwald, "High-level Best Practices in Software

Configuration Management," draft of a paper to be presented at the Eighth International

Workshop on Software Configuration Management, Brussels, 1998. Available at

http://www.perforce.com/perforce/bestpractices.html.