Faculty of Civil, Geo and Environmental Engineering
Chair of Computational Modeling and Simulation
Prof. Dr.-Ing. André Borrmann
Faculty of Architecture
Chair of Architectural Informatics
Prof. Dr.-Ing. Frank Petzold
VCS 4 CDE – Version Control Systems as
Common Data Environments
July 27, 2018
Report
Advanced Topics in Building Information Modeling
Michael Jäger
Contents
Introduction
1 Requirements of the Construction Industry
1.1 Data storage systems
1.1.1 Flat file systems
1.1.2 Document Management Systems
1.1.3 Building Information Models
1.1.4 Common Data Environments
1.2 Requirements
1.2.1 Distribution and performance
1.2.2 Data safety and security
1.2.3 Versioning and collaboration
1.2.4 Concurrency
1.2.5 Aggregation
2 Basics of Version Control Systems
2.1 Functionality
2.1.1 Working directories and commits
2.1.2 Branching
2.1.3 Distributed VCS
2.2 History and Popular Systems
2.3 Supplemental Software
3 Potential use of VCS in construction
3.1 Example Project
3.1.1 Overview
3.1.2 Planning Stage
3.1.3 Execution Stage
3.2 Tracking Changes with Diff
3.2.1 Experimental Version Comparison
3.2.2 Insights
4 Summary
4.1 Limitations
4.2 Outlook
List of Figures
List of Tables
References
Introduction
Common data environments (CDE) and document management systems (DMS)
serve as essential digital infrastructure for contemporary collaborative construction projects,
especially those utilizing BIM. There are countless tailor-made solutions, some independent,
some integrated with enterprise content management or groupware environments such as
Microsoft Exchange or IBM Domino.
One might compare this state to software development. Like the ACE industry, that
sector is shaped by numerous contributors working on the same projects and by the need for
accountability and traceability. Unlike the ACE industry, the software sector has long relied
on dedicated version control systems (VCS) to provide just that. Those systems can
record document edits, highlight changes and ensure, within limitations, identical data sets
for each collaborator. Many of them, most famously git, are open source and work on top
of regular file systems, allowing compatibility with and integration into regular development
software.
This report investigates a) the construction industry’s requirements concerning DMS and
CDE for BIM projects and b) the capabilities of common VCSs. It examines the possible use of
the latter in the construction industry to distribute not only documents and plans but also
building information models.
Chapter 1
Requirements of the Construction Industry
With the professionalization of the construction industry, its projects have taken on enormous
complexity. Contracts and specifications are measured in shelf-meters while plan binders fill
entire warehouses for single projects. Even more difficult than efficiently accessing those large
amounts of information is distributing them among planners, owners, authorities,
contractors and numerous other stakeholders. Missing or outdated documents and plans are
a prevalent, if not the most prevalent, cause of delays, errors and cost increases.
A distinction should be made between several types of data, all of which are henceforth
referred to as documents:
contracts, agreements, regulations are not (or rarely) changed once finalized, usually
before work starts. Any amendments would be separate documents. Frequent changes
to non-finalized documents in early project stages are to be expected.
billing and accounting data is generated throughout the project. It is frequently
updated, commented on or superseded, all the while being critical in legal and financial
regards.
plans are the core product of architects and engineers. While ideally any plan is complete
once published and approved, in (Central European) reality they are frequently updated
and corrected. While insufficient supply of billing data inhibits cash flow and creates
legal issues, inadequate distribution of plans leads to costly mistakes and delays in actual
construction.
building information models differ greatly from plans, although they may seem superficially
similar. Data harmonization and exchange are baked into the BIM approach,
eliminating many problems and error causes of traditional planning. Technical difficulties
in implementation remain but are actively being worked on by well-funded
software companies in a highly competitive and rapidly evolving market.
1.1 Data storage systems
Computer-aided data storage systems have proven indispensable for managing, storing and
distributing project data of any kind. Over the years many such systems have been developed,
with varying degrees of complexity and specificity.
1.1.1 Flat file systems
The simplest form are flat file systems. Emulating their physical namesake, they provide
a simple hierarchical structure. Any document’s context is derived from its place within
the hierarchy and its name, which in itself is often codified in project- or organization-wide
regulations. Adherence to hierarchy and naming conventions is usually enforced manually.
Access control – that is, limiting access to certain documents to certain individuals or
groups – is commonly available but regularly lacking in usability.
Usually every stakeholder maintains their own document storage; yet documents need to
be made available to all relevant stakeholders. Traditionally, this has been achieved through
mail or fax. Fax, however, is widely considered outdated, while sending documents via mail
or courier takes time that could otherwise be used more productively. E-mail, on the other
hand, is fast but generally not legally binding. It is commonly used to notify recipients of
incoming mail ahead of time.
Data transfer via e-mail is subject to strict limits on file size. The simplest solution here
is an internet-accessible file server using e.g. the (Secure) File Transfer Protocol, hosted by a
project party. That server may in turn be embedded into one or more stakeholders’ storage
systems, unless they are reluctant to rely on such a server as a replacement for their own
storage on grounds of limited availability and performance, which they often and rightfully
are.
In practice, consumer-oriented cloud services such as WeTransfer or Dropbox are
used for one-off data transfers if no stakeholder can or will provide such a server.
1.1.2 Document Management Systems
The flaws of flat file systems inspired the development of dedicated document management
systems (DMS) as we know them today. A DMS in its prevailing definition is built around
a database that stores metadata along with the documents themselves. This metadata includes
dates, authors, states (work-in-progress, approved, archived, ...) and the context of documents.
The DMS creates indices of a document’s content, either manually or through content
recognition, to facilitate efficient data retrieval. It also allows users to lock and unlock files,
marking them as being worked on and preventing users from overwriting each other’s changes.
1.1.3 Building Information Models
Building information modeling aims to unify all relevant information about a building —
starting with geometry but extending to cost and construction schedules, materials and qual-
ity requirements, static models, physical simulations and environmental calculations.
Figure 1.1: Building Information Model
The concept of BIM introduces a whole new set of technical challenges to IT environments
in the ACE sector while sharing all requirements associated with conventional planning
methods. Every project’s planning data consists of informational resources¹. Each resource is
stored, accessed, locked and unlocked, edited, approved, etc. separately and has its own
metadata, e.g. last-edited-at or approved-by. The end result of a project utilizing BIM is a
single set of planning data.
Early BIM projects strove for a single 3D model enriched with metadata for elements,
building parts and the entire project. The difficulties associated with allowing several project
participants access to that model led to a less centralized approach of multiple
discipline-specific models that are federated in a coordination model according to predefined
model views.
Different extents and forms of BIM utilization are appropriate for different projects, since
the necessary technology is not yet fully mature in all sectors. At the same time, not all
projects utilize BIM to its full potential yet. Many planners, owners and contractors are
still sceptical of its advantages (rightfully citing inappropriate payment regulations), are
unwilling to make the necessary adaptations to their business practices, or lack the expertise
and the personnel and financial capacity to do so.
¹ The terms resource and document describe very similar concepts and are used interchangeably in this context.
1.1.4 Common Data Environments
A more formalized concept that adapts the idea of a DMS into a so-called Common Data
Environment (CDE) is provided in the British PAS 1192 [18] and the recently released ISO
19650-1 [11], with very little difference between the two. These standards describe processes
and structures for information exchange in a BIM environment. They do not demand a
specific implementation but rather explain how and for what such an environment shall be used.
Many existing software solutions are capable of serving as a CDE; software for that express
purpose is in development.
A CDE is a single place where all project participants store and exchange project-related
information. Inside this area, each team can work on their own resources (so-called containers)
before submitting them for approval and cross-checking them with other stakeholders’
resources. Notably, ISO 19650 extends the idea of a building information model to a project
information model, including non-graphical data and documentation alongside graphical data,
that is, federated 3D (or higher-dimensional) models.
There are four formalized states of a document: work-in-progress, shared, published and
archived; with well-defined processes and authorizations to transfer documents between those
states. The work-in-progress stage explicitly includes resources that are not ready to be shared
with other project participants. In the next stage they are shared with the project team but
not yet final, as they require harmonizing with other participants’ data. Once that is achieved
and a resource is verified, it reaches the published stage, where it can be used for tender and
construction. Especially the early stages are to be understood as iterative with work states
being updated frequently [16].
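The four states and the transfers between them can be pictured as a small state machine. The following Python sketch is only an illustration: the transition table (including the step back from shared to work-in-progress that models the iterative early stages) and the names State, ALLOWED and transfer are assumptions made here, not definitions taken from PAS 1192 or ISO 19650-1.

```python
from enum import Enum


class State(Enum):
    WORK_IN_PROGRESS = "work-in-progress"
    SHARED = "shared"
    PUBLISHED = "published"
    ARCHIVED = "archived"


# Assumed transition table (hypothetical, for illustration only).
ALLOWED = {
    State.WORK_IN_PROGRESS: {State.SHARED},
    State.SHARED: {State.WORK_IN_PROGRESS, State.PUBLISHED},
    State.PUBLISHED: {State.ARCHIVED},
    State.ARCHIVED: set(),
}


def transfer(current: State, target: State) -> State:
    """Return the new state, or raise if the transfer is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

A real CDE would additionally check who is authorized to trigger each transfer; that part is left out of the sketch.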
1.2 Requirements
The following subsections investigate feature requirements for data infrastructure in an ACE-
environment. Some of these concepts are also explored in Annex A of ISO 19650-1, specifically
simultaneous working, information security and information transmission.
Once all those challenges are met, the largest one remains: getting stakeholders to accept,
trust and actually use the provided system. This can be achieved through contractual
obligation, whose implementation lies within the project owner’s responsibility. The harder, but
more sustainable way requires time, experience and positive examples.
1.2.1 Distribution and performance
A fundamental problem of simple file-based document management: Every stakeholder main-
tains their own document database in whichever form they see fit. This leads to inconsis-
tencies whenever a change does not reach an affected party in time or at all. Instead, it
is important that all participants access the same versions of the same files (the exception
being, of course, current work in progress).
Just as important as synchronisation is uninterrupted availability to all stakeholders at any
time. Any down time limits or prohibits (depending on the level of integration) productivity.
The conventional flat file system approach places responsibility for that with each party’s own
IT – which is convenient from a legal perspective but may not be a good thing for smaller
parties without a dedicated IT department.
There are two main approaches to guaranteeing availability: one or more central servers
or automatic distribution among mirrored repositories.
Any centralized system is constrained by a required performance level. The server needs to
be able to serve the expected number of clients simultaneously without causing unacceptable
delays. Working on a remote server may not be an option at all for parties without fast
internet access.
A decentralized system maintains several independent copies of the data set. Changes
are regularly copied between the instances. Such systems require more hardware and more
complex software for managing changes of the different instances.
1.2.2 Data safety and security
For legal, contractual and practical reasons, construction plans and other documents need
to be archived for a long time, sometimes as long as the building exists. Throughout that
time, no irreversible damage must come to the data and it must remain legible. Nowadays
more and more clients demand a building information model that can be transferred into an
asset information model representing the built product to simplify facility management.
A paper archive is often viewed as safer than a server, yet both are equally susceptible to
force majeure or physical sabotage. Depending on the physical storage medium, power failure
might actually compromise data safety, though. A competent IT department will prevent
data loss through regular off-site backups and long term archiving on tape drives.
File formats, software standards and storage media evolve over time. Care needs to be
taken that digital data, if it is to be used in place of printouts, is stored in usable file formats
on storage media that remain accessible even after decades of technological development.
Storing PDF/A files along with native file formats is a promising approach, while the cloud
has the potential to eliminate the need for maintenance of outdated storage hardware, as
data storage becomes an abstract service.
Security is at least as important as safety. Limited by contractual obligations and practical
necessity, most data in a construction project constitutes a company secret. Contract
data is usually only available to the contracting parties. This extends to billing information,
as knowledge of a competitor’s prices offers a great advantage to any company in subsequent
tenders. Security-related information may also be classified, e.g. for prisons or military
installations. Even most file systems support at least basic access control. In addition, encryption
may be used for especially sensitive documents.
1.2.3 Versioning and collaboration
Computer systems must aid communication, coordination and collaboration, either directly
from partner to partner or indirectly through editing shared resources.
Any building project is subject to evolution; each piece of information may be created,
approved, adapted or invalidated by several contributors. All of these changes need to be
documented, attributed to their author and possibly reverted.
Conventionally, this is achieved in two distinct ways: For plans, a version number is printed
on each plan along with that version’s author, its creation date and notable changes. Drafts of
textual documents usually have each version saved as a separate document, commonly with the
creation date and various other prefixes and suffixes as part of the file name. Both approaches
lead to many versions of the same document often appearing side by side. It is easy to confuse
versions, leading to duplicate work or overlooked changes.
Many modern document management systems have integrated version control. They
allow users to freely browse previous versions and the previously mentioned accompanying
data and even to reset a document to an earlier state. Some users are still reluctant to rely on
these versioning systems, be it for lack of experience, uncertainty regarding their reliability
or the fact that their records are mostly not legally binding.
1.2.4 Concurrency
Whenever a resource is shared, multiple users might want to edit the same resource at the
same time. The resulting inconsistencies can be avoided with either pessimistic or optimistic
concurrency.
A DMS with pessimistic concurrency requires a user to check out (or lock) a document
before gaining write-access and check in (or unlock) it once they are done. In some systems
this happens automatically. No other user may edit a locked document.
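The check-out and check-in cycle of pessimistic concurrency can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the interface of any particular DMS; the names LockingStore, LockedError, check_out and check_in are hypothetical.

```python
class LockedError(Exception):
    """Raised when a document is locked by another user or not locked at all."""


class LockingStore:
    def __init__(self):
        self.documents = {}  # document name -> content
        self.locks = {}      # document name -> user holding the lock

    def check_out(self, name, user):
        """Lock a document for `user` and return its current content."""
        holder = self.locks.get(name)
        if holder is not None and holder != user:
            raise LockedError(f"{name} is locked by {holder}")
        self.locks[name] = user
        return self.documents.get(name, "")

    def check_in(self, name, user, content):
        """Store new content and release the lock held by `user`."""
        if self.locks.get(name) != user:
            raise LockedError(f"{user} does not hold the lock on {name}")
        self.documents[name] = content
        del self.locks[name]
```

While one user holds the lock, every other call to check_out for the same document fails, which is exactly the property that rules out conflicting edits but also blocks parallel work.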
In an optimistic collaboration system users are permitted to edit a resource simultaneously.
Such a system will assume that most changes won’t conflict with each other and can therefore
be merged into a single updated version automatically. Conflicts that do occur need to be
manually resolved. This approach works well for textual data, since changes are limited to
specific lines of text. Whether this extends to the STEP-based IFC format as well will be
examined in a later chapter.
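The idea of merging line-based changes automatically can be illustrated with a strongly simplified three-way merge in Python. The sketch assumes that both users only edit existing lines, so that all three versions have the same number of lines; real merge tools also handle inserted and deleted lines. The function name merge_lines is hypothetical.

```python
def merge_lines(base, ours, theirs):
    """Merge two edited copies of `base` line by line.

    Returns (merged_lines, conflict_line_indices). A line is a conflict
    when both copies changed it in different ways; the base line is then
    kept as a placeholder for manual resolution.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # identical in both copies (changed or not)
            merged.append(o)
        elif o == b:      # only the other copy changed this line
            merged.append(t)
        elif t == b:      # only our copy changed this line
            merged.append(o)
        else:             # both changed it differently
            merged.append(b)
            conflicts.append(i)
    return merged, conflicts
```

Two edits to different lines merge without intervention, whereas two different edits to the same line are reported by index and left for a human to resolve.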
1.2.5 Aggregation
An important influence on a project’s specific challenges is its level of aggregation, or what is
to be considered an individual resource.
The introductory description in section 1.1.3 fits a concept more precisely called BIG
BIM. Its counterpart, little bim, utilizes specific software for specific isolated tasks within a
project (fig. 1.2). The resulting models are self-contained resources comparable to plans and
documents in traditional planning. Depending on project complexity, the level of aggregation
in BIG BIM is either much higher or much lower.
With a high level of aggregation an entire building may be represented in a single or few
models, each representing a partial model. The model may be subdivided by zone (floor,
building section), by domain, i.e. functional aspects such as an architectural, structural or
HVAC model, or even not at all. Partial models are coordinated and cross-checked (i.e.
federated) with each other using model checkers that scan for collisions and conflicts, and are
assembled into coordination models according to predefined model views.
The lowest level of aggregation on the other hand treats single building elements, pieces of
information or even paragraphs in textual documents as objects. Managing so many objects
requires a specialised BIM-server or product model server. A BIM-server is usually integrated
seamlessly into its respective developer’s BIM software as part of a closed BIM solution.
A product model server on the other hand uses open standards to provide resources to
all kinds of clients, be it modelling software or a browser-based web interface, allowing
for an Open BIM approach. Both store their resources in a database and handle resource
management internally.
Figure 1.2: Open vs. Closed and little vs. BIG describe different approaches to BIM
The higher the level of aggregation, the fewer resources need managing. On the other
hand, this makes concurrent work more difficult and possibly less efficient, especially with
pessimistic forms of collaboration. The lower the level of aggregation, the more fine-grained
locking or approval operations will be.
Chapter 2
Basics of Version Control Systems
A Version Control System is a software system that
- allows multiple users to work on the same files simultaneously,
- prevents or harmonizes conflicts in file changes arising from that,
- stores every version of every file along with its editor, timestamp, etc.
The files a VCS works with are stored in a special location called a repository. Unlike
a two-dimensional file system (name and directory), a repository identifies files by name,
subdirectory and timestamp.
2.1 Functionality
There is a set of commands that are shared by virtually all VCSs [19]. This section briefly
explains the basic features. Note that not all systems use the same syntax; so the most
common terms are used.
2.1.1 Working directories and commits
Files are not edited directly in the repository. Instead, a working copy is created with the
checkout command. The changes a user makes are transferred to the repository with the
commit command. Since the repository may have been changed by other users, the working
copy can be updated to retrieve those changes. A single commit is the atomic unit of change
in a data set that is archived in the repository. The record of all commits is called the
commit history.
Before committing changes, a user might wish to review a summary of those changes. This
is done with the diff command that compares versions of a file with each other line by line.
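A line-by-line comparison of this kind can be tried out with the difflib module from the Python standard library. The sketch below is not the diff implementation of any specific VCS; the helper name line_diff and the sample lines are made up for illustration.

```python
import difflib


def line_diff(old_lines, new_lines):
    """Return a unified diff of two versions of a file as a list of lines."""
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="v1", tofile="v2", lineterm=""
        )
    )


old = ["wall height 3.00", "door width 0.90", "window width 1.20"]
new = ["wall height 3.00", "door width 1.00", "window width 1.20"]

for line in line_diff(old, new):
    print(line)
```

In the printed result, removed lines are prefixed with a minus sign and added lines with a plus sign, while unchanged lines serve as context.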
2.1.2 Branching
Early VCSs had a linear commit history: each version builds upon the previous commit and
there is always a current commit. However, this doesn’t represent the actual workflows of
software developers, nor those of engineers and architects. Instead, users work on
different versions of a project and its repository in parallel (e.g. after version 3.0 of a piece
of software is released, some developers work on version 3.1 and others on 4.0). This is
modeled by branching.
The branch command creates a copy (i.e. a branch) of the repository. Each branch can
be committed to separately. The merge command then attempts to combine the changes in
both branches into one version. This may or may not work automatically – as changes often
conflict with each other – and remains one of the largest challenges VCS users and developers
face. Before merging, the differences can be reviewed with diff.
Branch and commit history are often visualized in Directed Acyclic Graphs or DAGs
(fig. 2.1). Each node represents a commit, each linear sequence of nodes a branch.
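Such a graph can be represented by storing, for every commit, the identifiers of its parent commits. The following Python sketch uses a made-up history (the commit names c1 to m1 are hypothetical) and shows how the complete history of a commit is obtained by walking the graph.

```python
# Hypothetical history: each commit id maps to the ids of its parent commits.
PARENTS = {
    "c1": [],
    "c2": ["c1"],
    "c3": ["c2"],        # continues the main line
    "f1": ["c2"],        # first commit on a branch started at c2
    "f2": ["f1"],
    "m1": ["c3", "f2"],  # merge commit with two parents
}


def ancestors(commit, parents=PARENTS):
    """Return the set containing `commit` and every commit reachable from it."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen
```

Because edges only point from a commit to its parents, the graph has no cycles and the walk always terminates; a merge commit is simply a node with more than one parent.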
While branching more accurately represents a developer’s workflow, it also complicates
project management significantly because there no longer is a definitive latest version. This is
why conventionally, most projects have a master branch and one or more development or
feature branches (simplified). The master branch is only changed when a development branch is
merged into it, and only once that branch is sufficiently stable. A release version may be yet
another separate branch from the master.
Figure 2.1: Example of a Directed Acyclic Graph
2.1.3 Distributed VCS
The last fifteen years have seen the emergence of Distributed Version Control Systems
(DVCS). Those systems have multiple repositories. There may still be a main repository,
but in essence each repository works independently of the others.
The clone command creates a local instance of the same repository, identical to the
original.
The push command attempts to copy the changes in a local repository into a remote one.
It only succeeds if the remote repository contains no commits that the local one doesn’t.
The pull command synchronizes a local with a remote repository by merging a copy of
the remote one with the local one. It is used whenever the remote repository has received
commits after the local one has been cloned (for example by receiving a merge from another
branch). The pull command therefore often needs to be called before a push, lest the push
fail. For both push and pull a user must specify which branches on both instances they wish
to push or pull from and to.
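The rule that a push only succeeds when the remote repository holds nothing the local one lacks can be made concrete with a toy model in Python. Here a repository is reduced to a set of commit identifiers, which ignores branches and the merge commit a real pull would create; the names push, pull and PushRejected are hypothetical and do not correspond to the API of any VCS.

```python
class PushRejected(Exception):
    """Raised when the remote holds commits that the local repository lacks."""


def push(local, remote):
    """Copy local commits to the remote if the remote has nothing new."""
    missing_locally = remote - local
    if missing_locally:
        raise PushRejected(f"pull first, missing: {sorted(missing_locally)}")
    remote.update(local)


def pull(local, remote):
    """Bring all remote commits into the local repository (merge omitted)."""
    local.update(remote)
```

If two users clone the same repository and both commit, the second push is rejected until that user has pulled, which mirrors the pull-before-push habit described above.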
It is common to have branches specifically to pull to from remote repositories to avoid
merging across repositories. Cloning, pushing and pulling each usually use SSH and require
user authentication. It is possible to limit access for specific users to certain parts of the
repository.
Figure 2.2: A Distributed VCS
DVCS have several notable advantages over centralized systems:
private They compartmentalize teams. Each group within a team (or even every user) can
have their own repository instance and adjust its structure to their workflow.
offline They allow geographically distributed teams to work together, even when the internet
connection is interrupted or too slow for constant data exchange.
safe If one repository is compromised, it can easily be restored from another instance.
2.2 History and Popular Systems
One of the earliest modern VCSs capable of handling multiple files was the Concurrent Versions
System (CVS). It was a centralized system with automated merging that didn’t support
branching. Its successor is Apache Subversion (SVN), which gives each commit an absolute
revision number; each file in the repository has the revision number of the last commit it
was changed by. It also creates a local copy of the repository on checkout or update, allowing
users to review their changes even without connection to the repository.
Around 2005 two distributed VCSs entered the market: Mercurial (hg, from mercury) and
git. Both are primarily used via command line interfaces. While git is more flexible, it is
also more complex and more difficult to learn than Mercurial [1].
Reliable data on market share today is hard to come by and mostly based on possibly
biased sources. One source indicates that git has become the dominant solution based
on the number of StackExchange questions and Google Trends, with ca. 75% of questions
about VCS regarding git [20]. Most sources conclude that git is at least among the market
leaders and keeps on gaining popularity (especially in open source projects), with Apache
Subversion and Mercurial remaining the strongest contenders [1][3].
2.3 Supplemental Software
Alongside the different VCS programs, a number of supplementary web services have been
developed, such as GitHub, GitLab, BitBucket and SourceForge. Those are at their core file
hosting services that provide a central repository for a DVCS – either public (open source)
or private – along with additional features. GitHub for example offers tools to review and
manage changed code and enables web-based documentation and distribution [10]. Meanwhile
GitLab aids in project management with task and issue tracking and is optimized for the
DevOps cycle, i.e. simultaneous development and operation [9].
Many development environments have VCS interfaces that allow users, even those unfa-
miliar with command line interfaces, to use VCS commands from within their software. This
extends to branch selection, commit comments and code review with diff, pushing the actual
systems into the background.
Chapter 3
Potential use of VCS in construction
3.1 Example Project
3.1.1 Overview
Whether and how a VCS can serve as a CDE is best explained in an example. Let us consider
a fictitious construction project with the following participants:
Figure 3.1: fictitious project diagram
In each office several individual planners work on different parts of the project while the
architect’s office is responsible for planning coordination.
3.1.2 Planning Stage
C tasks A with model coordination and agrees to use git as the common data environment. A
creates a central repository with the master branch that holds the general model. They set
their designers to work on a separate architectural branch. Each designer works in their own
local repository as each has a distinct style of working – some like to commit multiple times
a day, others only once every few days – while some prefer working from home. In any case
they regularly push their work to the architectural branch on A’s central repository after
pulling from the same to make sure they don’t create conflicts on it.
Meanwhile, The client C supervises the design process by pulling each commit from
the architectural branch of A’s central repository to his own local repo. After each review
meeting, the reviewed design state in the architectural branch on the central repo is merged
into the master branch. As variants are being explored, a separate branch is created for each.
Only the accepted variant is merged into the master branch.
As the design stage progresses, S and M join the project. A creates two new branches on
the central repo for structural and MEP planning respectively. Both S and M create local
repos and clone the central repo from A including the master branch. They work on their
specific repos independently but regularly pull revisions from the master branch an push
their own changes to their respective branches in A’s repo. As model coordinator, A remains
responsible for merging major commits to the architectural, structural and MEP branches
into the master branch.
Branching can be executed partially, so the master branch as the coordination model
would contain all parts of the model while the discipline-specific branches contain only files
relevant for that discipline.
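The branch-and-merge workflow of the planning stage can be sketched as a small commit graph. The following Python model is only an illustration and does not call Git; the class `Repo`, its methods and the commit names (`m0`, `a1`, ...) are hypothetical and chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Toy commit graph: every commit id maps to the ids of its parents."""
    parents: dict = field(default_factory=dict)
    branches: dict = field(default_factory=dict)  # branch name -> tip commit id

    def commit(self, branch, commit_id):
        """Append a commit to a branch; its parent is the previous branch tip."""
        tip = self.branches.get(branch)
        self.parents[commit_id] = [tip] if tip is not None else []
        self.branches[branch] = commit_id

    def branch(self, new, source):
        """Create a branch that starts at the tip of an existing branch."""
        self.branches[new] = self.branches[source]

    def merge(self, target, source, commit_id):
        """Record a merge commit on `target` that has two parents."""
        self.parents[commit_id] = [self.branches[target], self.branches[source]]
        self.branches[target] = commit_id

    def history(self, branch):
        """Return every commit reachable from the tip of a branch."""
        seen, stack = set(), [self.branches[branch]]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents[current])
        return seen


repo = Repo()
repo.commit("master", "m0")                  # A sets up the general model
repo.branch("architectural", "master")
repo.commit("architectural", "a1")           # designers push their work
repo.commit("architectural", "a2")
repo.branch("variant-1", "architectural")    # a design variant is explored
repo.commit("variant-1", "v1")
repo.merge("master", "architectural", "m1")  # reviewed state goes to master
print(sorted(repo.history("master")))
```

In this sketch the rejected variant commit `v1` never becomes part of the history of the master branch, which mirrors the rule that only accepted variants are merged.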
3.1.3 Execution Stage
Eventually tendering begins. A creates another branch and modifies it to be published as
the tender document.
The contract is subsequently awarded to G, who creates their own repository by cloning
the master repository, which has become the basis of their contract. This clone can in turn be
cloned by the subcontractors (partially if necessary).
As planning continues after the award, the general contractor pulls commits from
the master branch, and the subcontractors pull (in part) from the general contractor. The diff
report for each pull can serve as a basis for claim management, as it summarizes everything
that has changed from one pull to another. For example: the electrician S2 wishes to propose
a change. They create a new branch from the MEP branch for their offer. G convinces C to commission the
offer. The new branch is then merged into the MEP branch and subsequently into the master
branch.
Figure 3.2: Example DAG
This brief example is obviously vastly simplified, but it demonstrates the basic idea of
employing a VCS as a Common Data Environment. One significant challenge will be examined
in the following section.
3.2 Tracking Changes with Diff
Since IFC is a text-based format, it appears reasonable to compare versions of a model with
the diff command.
However, diff tools are designed for program code, where the semantic purpose of a line
of code is reasonably apparent when viewed within its immediate context. A single line in
an IFC file, on the other hand, represents a single data entity such as a vertex or line and
may contain references to many other entities at unrelated positions in the file. A designer
examining the raw data therefore cannot practically infer the meaning or context of such an
entity, let alone changes to it.
One might consider developing a plugin for an IFC viewer that highlights the identified
changes within the 3D view. This could be hindered by the generation of the IFC files
themselves: that process is part of the (usually proprietary) native software used to design the
model. Because the export process is not standardised, it is not necessarily consistent.
Furthermore, it involves randomly created identifiers. Exporting identical models therefore
does not create identical IFC files.
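To illustrate why a single line carries little meaning on its own, the following Python sketch splits one record into its number, its entity type and the numbers of the records it refers to. The sample record is made up for this example, and the regular expressions assume the common one-record-per-line layout `#id= TYPE(arguments);`, which real exports need not follow exactly.

```python
import re

# Assumed layout of one record: "#<id>= <TYPE>(<arguments>);"
RECORD = re.compile(r"^#(\d+)\s*=\s*([A-Z0-9_]+)\((.*)\);\s*$")
REFERENCE = re.compile(r"#(\d+)")


def parse_record(line):
    """Split one record into (id, entity type, referenced ids)."""
    match = RECORD.match(line.strip())
    if match is None:
        return None  # header or section keyword, not an entity record
    entity_id, entity_type, arguments = match.groups()
    # Drop quoted strings first so a '#' inside a text value is not counted.
    arguments = re.sub(r"'[^']*'", "''", arguments)
    references = [int(ref) for ref in REFERENCE.findall(arguments)]
    return int(entity_id), entity_type, references


# Made-up sample record, not taken from the exported files.
sample = "#130= IFCWALLSTANDARDCASE('2O2Fr$t4X7Zf8NOew3FLOH',#41,'Wall',$,$,#121,#129,'1234');"
print(parse_record(sample))
```

The wall record itself contains no geometry. Its placement and shape are only reachable by following the references `#121` and `#129` to other lines, which a line-based diff does not do.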
3.2.1 Experimental Version Comparison
The extent of these challenges was examined using Autodesk Revit 2019 and its preinstalled
architectural sample project with 1463 elements:
- It was exported to IFC three times without any modification (ver1, ver2, ver3).
- Another view was opened and the model was exported again (ver4).
- A wall opening was then added and the model exported once more (ver5).
- The new object was deleted and another version was exported (ver6).
- A wall was removed and then reinserted at the same position (ver7).
Each version was then compared with every other one and the results saved to text files
using the Windows command FC /1 ver1.ifc ver2.ifc > 12.txt. A text editor was then
used to count the changes based on the occurrence of the headlines. The results can be seen
in Table 3.1.
IFC file     1     2     3     4     5     6     7    lines     bytes
1            -  2031  1629  1574  2074  1988   157   531529  26877680
2                  -  1876  2237  2005  2061   105   531529  26877680
3                        -  1833  1625  1865   105   531529  26877680
4                              -  2090  1654   157   531529  26877680
5                                    -  2146   159   531545  26878748
6                                          -   157   531529  26877680
7                                                -   531529  26877666
Table 3.1: differences between and sizes of IFC files
It is striking not only that there are many differences even between IFC files that should
be identical, but also that the number of differences varies widely. This stems from the way the
diff tool interprets several changes in close proximity to each other as a single change. The
differences between versions 1 through 4 and version 6 stem only from randomly generated
22-character alphanumerical string identifiers (and metadata such as the creation date). The
number of changes appears to have no statistically relevant correlation with whether the
model was changed at all. Changing the viewport does not appear to have any impact beyond
these identifiers, as version 4 has the same file size. The exception is version 7, where the
differences from the other files were so numerous that the tool counted differences in dozens
of consecutive lines as a single change, hence the low difference count.
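The effect of nearby changes being reported as one change can be reproduced with a small Python sketch. It uses the standard `difflib` module and is not a reimplementation of the FC command; the function `count_change_blocks` and its `merge_gap` parameter are assumptions made for this illustration.

```python
import difflib


def count_change_blocks(old, new, merge_gap=0):
    """Count changed blocks, merging blocks that are separated by at most
    `merge_gap` unchanged lines."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    blocks = 0
    previous_end = None  # index in `old` where the previous changed block ended
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if previous_end is None or i1 - previous_end > merge_gap:
            blocks += 1
        previous_end = i2
    return blocks


old = [f"line {i}" for i in range(10)]
new = list(old)
new[2] = "changed A"
new[4] = "changed B"  # one unchanged line away from the first change
new[8] = "changed C"  # three unchanged lines away from the second change

for gap in (0, 1, 3):
    print(gap, count_change_blocks(old, new, merge_gap=gap))
```

The same three edits are counted as three, two or one change depending on the merge distance, which is why the counts in Table 3.1 say little about how much of a file actually differs. Note that `difflib` is not designed for files with hundreds of thousands of lines, so this sketch is meant for small samples only.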
Another tool, P4Merge, was then used to create a three-way comparison between
versions 1, 2 and 7 in an attempt to filter out the changes that concern only identifiers (it
should be noted that this process took several minutes on an average computer). This tool
identified 2822 differences unique to version 7, all stemming from one edit that is not even
recognisable or semantically relevant within the model itself.
3.2.2 Insights
These observations reveal significant challenges for developing the proposed
software solution. Such a plugin would need to perform the following steps:
1. List the differences between two or more versions of an IFC file. This can be accomplished
using existing algorithms.
2. Filter out changes that are limited to randomly generated identifiers. Abstracting
identifiers is a basic task of every software compiler and should not be too complicated
to transfer to building information models.
3. Filter out objects that are syntactically different but semantically identical. This
task in particular requires either complex algorithms or an approach based on machine
learning or other artificial intelligence paradigms. The latter would require an AI with
a semantic understanding of building information models: a bold requirement, yet one
that would be useful far beyond version comparison.
4. List the remaining differences that represent actual changes to the model.
5. Highlight those changes in the 3D view and the model structure tree.
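Step 2 of the list above could be approached by masking the identifiers before comparing. The following Python sketch is a minimal illustration under two assumptions that are not confirmed by the experiment: identifiers appear as quoted 22-character strings made of letters, digits, `_` and `$`, and both files contain the same records in the same order. The sample records and the helper names are made up.

```python
import re

# Assumption: an identifier is a quoted string of exactly 22 characters
# drawn from letters, digits, '_' and '$'.
GUID = re.compile(r"'[0-9A-Za-z_$]{22}'")


def mask_identifiers(line):
    """Replace every quoted 22-character identifier with a fixed placeholder."""
    return GUID.sub("'<GUID>'", line)


def changed_lines(old_lines, new_lines):
    """Return the line pairs that still differ once identifiers are masked.

    Assumes both exports list the same records in the same order.
    """
    return [
        (old, new)
        for old, new in zip(old_lines, new_lines)
        if mask_identifiers(old) != mask_identifiers(new)
    ]


# Made-up, shortened records for illustration only.
export_a = [
    "#130= IFCWALL('2O2Fr$t4X7Zf8NOew3FLOH',#41,'Wall');",
    "#131= IFCDOOR('0aBcDeFgHiJkLmNoPqRsTu',#41,'Door');",
]
export_b = [
    "#130= IFCWALL('3xYz012345678901234567',#41,'Wall');",
    "#131= IFCDOOR('1ZyXwVuTsRqPoNmLkJiHgF',#41,'Door 2');",
]
print(changed_lines(export_a, export_b))
```

Only the door record is reported, because its name changed in addition to its identifier. A pattern this simple would also mask any ordinary text value that happens to be 22 characters long, so a real filter would need to know which attribute of each entity holds the identifier.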
One way of circumventing the problem of semantically irrelevant changes could be a proposal
introduced by Koch and Firmenich [13, 14]: they define a language that describes building
information models based on changes, not on states. This language records a sequence of
modeling operations as opposed to a set of objects. It may more accurately reflect a designer’s
intent in fewer data points and, by being a structured sequence instead of an unstructured
list, make purely syntactical changes unnecessary or at least obvious. However, its
implementation requires an extension of the IFC standard, which may hinder the adoption of
any tools that utilize it.
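The difference between recording states and recording changes can be shown with a toy operation log in Python. This is not the language defined by Koch and Firmenich; the operation names, the tuple layout and the wall properties are hypothetical and serve only to illustrate the idea.

```python
def replay(operations):
    """Rebuild the model state by applying the recorded operations in order."""
    model = {}
    for action, name, properties in operations:
        if action == "add":
            model[name] = properties
        elif action == "delete":
            del model[name]
        else:
            raise ValueError(f"unknown operation: {action}")
    return model


def new_operations(old_log, new_log):
    """Return the operations appended since old_log, assuming new_log extends it."""
    if new_log[:len(old_log)] != old_log:
        raise ValueError("new_log does not extend old_log")
    return new_log[len(old_log):]


wall = {"position": (0.0, 0.0), "length": 5.0}
log_v1 = [("add", "wall-1", wall)]
# Analogous to ver7: the wall is removed and reinserted at the same position.
log_v7 = log_v1 + [("delete", "wall-1", None), ("add", "wall-1", wall)]

print(replay(log_v1) == replay(log_v7))
print(new_operations(log_v1, log_v7))
```

Both logs replay to the same model state, and the difference between them is exactly the two recorded operations, so the remove-and-reinsert edit is visible as such instead of appearing as thousands of changed lines.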
Chapter 4
Summary
4.1 Limitations
The sections above assume that the IFC standard is employed, at least for shared resources.
That is an idealistic assumption, as not all project owners are prepared to forgo the
advantages of closed BIM. Committing native formats to the repository is impractical, as
those binary formats cannot be processed effectively by VCSs. For each commit, the entire
model would have to be archived instead of just the changes since the previous one. This
would lead to unacceptable storage requirements and defeat the purpose of a VCS in the
first place. The VCS approach is therefore not applicable to closed BIM environments without
major adaptation of the VCS software used.
The considerations in this report focus primarily on the model (however many dimensions
it may contain). While that could include cost and scheduling information, structural
and environmental analysis data and more, it excludes documents separate from the model.
Such documents could include invoices, notices of concern or delay and other legally binding
documents. These documents are issued independently of the planning process and of each
other and are never edited. In large projects, dozens of such documents change hands each
day. It would be possible to create a commit for each such document, but this would lead to a
very long commit history. Alternatively, they might be stored in a separate repository, in
which case the concept of a common data environment would no longer really be intact.
Either way, this is clearly not what VCSs are designed for.
4.2 Outlook
Because each project partner has a separate repository, it is quite difficult to alter records
of previous transactions. If a client desires additional protection against tampering with data
records, it is possible to store each commit in a blockchain that is distributed to each project
participant. In a blockchain, an entry can only be verified as valid if all preceding
entries in all versions of the chain are valid as well. If someone wanted to modify an entry,
they would have to perform computationally intensive operations on more than half of the
copies of the record.
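The linking idea behind such a record can be sketched as a simple hash chain in Python. This is only the chaining step: distribution to the participants and the agreement between their copies are left out, and the function names and commit labels are hypothetical.

```python
import hashlib


def entry_hash(previous_hash, payload):
    """Hash a record together with the hash of its predecessor."""
    return hashlib.sha256((previous_hash + payload).encode("utf-8")).hexdigest()


def build_chain(payloads):
    """Return a list of (payload, hash) pairs, each linked to the one before."""
    chain, previous = [], ""
    for payload in payloads:
        previous = entry_hash(previous, payload)
        chain.append((payload, previous))
    return chain


def first_invalid(chain):
    """Return the index of the first entry whose hash does not match, or None."""
    previous = ""
    for index, (payload, stored) in enumerate(chain):
        if entry_hash(previous, payload) != stored:
            return index
        previous = stored
    return None


chain = build_chain(["commit a1", "commit a2", "commit m1"])

# Someone edits the second record but keeps its old hash.
tampered = list(chain)
tampered[1] = ("commit a2 (edited)", chain[1][1])

# Someone edits the second record and recomputes its hash as well.
forged = list(chain)
forged[1] = ("commit a2 (edited)", entry_hash(chain[0][1], "commit a2 (edited)"))

print(first_invalid(chain), first_invalid(tampered), first_invalid(forged))
```

In the second case the edited entry is consistent with itself, but the entry after it no longer matches, so every later entry would have to be recomputed as well. A blockchain adds to this the requirement that the majority of the distributed copies agree.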
To summarize, version control systems fulfil many of the requirements of Common Data
Environments. Costs will arise from the development effort necessary to capitalize on the
advantages of VCS, along with the expected limitations in the ease of use of software that is
not construction-specific and thus a steeper learning curve. It remains doubtful whether the
saved license fees compared to specialized ISO 19650-compatible software would offset these costs.
List of Figures
1.1 Building Information Model
1.2 Open vs. Closed and little vs. BIG describe different approaches to BIM
2.1 Example of a Directed Acyclic Graph
2.2 A Distributed VCS
3.1 fictitious project diagram
3.2 Example DAG
List of Tables
3.1 differences between and sizes of IFC files
References
[1] 2018 version control software comparison: svn, git, mercurial. 2018. url:
https://biz30.timedoctor.com/git-mecurial-and-cvs-comparison-of-svn-software/ (visited on
05/24/2018).
[2] AIIM. What is document management. 2018. url:
https://www.aiim.org/What-Is-Document-Imaging.
[3] Best version control systems. 2018. url:
https://www.g2crowd.com/categories/version-control-systems (visited on 05/24/2018).
[4] André Borrmann, Markus König, Christian Koch, and Jakob Beetz. Building Information
Modeling. 2015. Chapter 12 - Kooperative Datenverwaltung, pages 207–236.
[5] Ed Boxall. Common data environment (cde): what you need to know for starters.
August 26, 2015. url:
https://www.aconex.com/blogs/common-data-environment-cde-tutorial.
[6] Wibke Cartensen. A brief history of version control. 2016. url:
https://www.red-gate.com/blog/database-devops/history-of-version-control (visited on
05/27/2018).
[7] Martin Fiedler. Lean Construction - Das Managementhandbuch. 2018.
[8] Berthold Firmenich, Christian Koch, Torsten Richter, and Daniel G. Beer. Versioning
structured object sets using text based version control systems. In Proceedings of the
22nd CIB-W78 Conference on Information Technology in Construction, 2005.
[9] GitLab. The only single product for the complete devops lifecycle. 2018. url:
https://about.gitlab.com/ (visited on 06/26/2018).
[10] GitHub. The world's leading software development platform. 2018. url:
https://github.com/ (visited on 06/26/2018).
[11] Organization of information about construction works - Information management using
building information modelling - Part 1: Concepts and Principles. Specification,
International Organization for Standardization, 2017.
[12] Organization of information about construction - Information management using
building information modelling - Part 2: Delivery phase of the assets. Specification,
International Organization for Standardization, 2017.
[13] Christian Koch. Bauwerksmodellierung im kooperativen Planungsprozess: Mit der
Objektorientierung zur Verarbeitungsorientierung. Dissertation, Bauhaus-Universität
Weimar, July 2, 2008.
[14] Christian Koch and Berthold Firmenich. An approach to distributed building modeling
on the basis of versions and changes. In Walid Tizani and Michael J. Mawdesley, editors,
Advanced Engineering Informatics. 2011.
[15] Richard McPartland. What is the common data environment. October 18, 2016. url:
https://www.thenbs.com/knowledge/what-is-the-common-data-environment-cde.
[16] Fred Mills. What is a common data environment? July 15, 2015. url:
https://www.theb1m.com/video/what-is-a-common-data-environment.
[17] Mohamed M. Nour, Berthold Firmenich, Torsten Richter, and Christian Koch. A versioned
IFC database for multi-disciplinary synchronous cooperation. In Joint International
Conference on Computing and Decision Making in Civil and Building Engineering,
June 14, 2006.
[18] Specification for information management for the capital/delivery phase of construction
projects using building information modelling. Specification, Construction Industry
Council, 2013.
[19] Eric Sink. Version control by example. 2011. url:
https://ericsink.com/vcbe/html/index.html (visited on 05/27/2018).
[20] Version control systems popularity in 2016. 2016. url:
https://rhodecode.com/insights/version-control-systems-2016 (visited on 05/24/2018).