
University of Groningen

Preserving and reusing architectural design decisions
van der Ven, Jan

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.

Document Version: Publisher's PDF, also known as Version of record

Publication date: 2019

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA): van der Ven, J. (2019). Preserving and reusing architectural design decisions. University of Groningen.

Copyright
Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).

Take-down policy
If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum.

Download date: 27-11-2020

https://www.rug.nl/research/portal/en/publications/preserving-and-reusing-architectural-design-decisions(2694a905-30e0-4eef-b3a6-402777bb936b).html

https://www.rug.nl/research/portal/en/persons/jan-ven-van-der(c6d2977e-883f-4ea3-b677-b79041105f9c).html


Preserving and Reusing Architectural Design Decisions

Jan Salvador van der Ven

2019


The research presented in this thesis was carried out at the Software Engineering and Architecture group, at the Bernoulli Institute for Math, CS, AI at the University of Groningen.

This research has partially been sponsored by the Dutch Joint Academic and Commercial Quality Research & Development (Jacquard) program on Software Engineering Research via contract 638.001.406 GRIFFIN: a GRId For inFormatIoN about architectural knowledge.

Cover illustration by Bloei media - bloeimedia.nl
No mice were harmed in the design process.

Printed by Gildeprint - The Netherlands

ISBN (digital): 978-94-034-1534-5
ISBN (print): 978-94-034-1535-2

Copyright © 2019, Jan Salvador van der Ven
All rights reserved unless otherwise stated.

Preserving and Reusing Architectural Design Decisions

PhD thesis

to obtain the degree of PhD at the University of Groningen on the authority of the Rector Magnificus dr. E. Sterken and in accordance with the decision by the College of Deans.

This thesis will be defended in public on

Monday 8 April 2019 at 12:45 hours

by

Jan Salvador van der Ven

born on 14 December 1977 in Groningen, the Netherlands

Supervisor
Prof. J. Bosch

Co-supervisor
Prof. P. Avgeriou

Assessment committee
Prof. U. Zdun
Prof. I. Crnkovic
Prof. D. Karastoyanova



“Perfect is the Enemy of Done”

- Unknown Author


Acknowledgements

This project took quite some time. I have worked with many people, and I hope I remember everyone who helped me in any way. Roughly, the work on my thesis can be divided into two periods with a pause in between. In the first period, I was enrolled at the University of Groningen, and I had the opportunity to work on my research full-time. In the pause, I worked in industry at several companies. With new insights from industry, I restarted my research part-time, with guidance at a distance from Jan Bosch, who resided in Gothenburg by that time. Fourteen years after starting, my work on this thesis is finished.

First of all, I want to thank Jan Bosch for the guidance of this project, and also for his endlessly positive and constructive attitude towards finishing this work. The business collaboration we did was challenging and fun, but you always came back with a subtle question: "how is research?"

In addition to Jan, I would like to thank three other supervisors who have been involved for shorter periods: Jos Nijhuis, Dieter Hammer and Paris Avgeriou. You all had a different perspective on research, which shaped my thoughts. Paris, thank you for the collaboration, supervision, and also for making it possible for me to teach at the university after I left. This teaching was fun and it also helped me to organize my thoughts on the research I was doing.

For this research, I have often asked others to provide me with input, either by filling in a survey or by being interviewed. Thanks to all the participants; without you this research could never have been completed.

In the first period of my research, I enjoyed working with Johanneke, Marco, Sybren, Jens and Natasha. Special thanks to Anton, with whom I published my earliest works and who co-authored several of the publications that form the basis for this thesis. In the GRIFFIN project, we collaborated with researchers from the VU. Victor, Remco, Hans and Patricia, thank you for the fruitful discussions on architectural knowledge and the role of design decisions in this. Rik, I would like to thank you specifically as we were the first ’GRIFFIN-ers’. I liked the conversations we had in that period, personal as well as professional.

In the second period of my research I was not at the university, but did have contact with some researchers in Groningen: Dan, thanks for the interesting discussions on research and other topics. Mircea, it was great to have a short collaboration with you on the GitHub paper.

Between the first and the second period, I worked at several companies. The people in one company shaped my thinking radically, which is also reflected in this thesis: Factlink. Tom, Remon, Gordon, Mark, Martijn and Jens, it was great to be part of the team. Special thanks to Merijn: I have learned a lot from you as you introduced me to the world of startups and showed me the joy of making products.

There is one person who was involved in the first period as well as the pause and the second period of my research. Ivor, somehow our lives keep crossing each other. Thank you for your numerous reflections on the software industry and research. Either at work, in the gym or with a small beer, we always had challenging and fun discussions. Also, thanks for giving me the opportunity to work with you at Crop-R. I could not have finished this thesis if I didn’t have the flexible employment you made possible for me. I know we will work together again some day, and look forward to it.

While mentioning this, thanks to the Crop-R guys and girls; I loved working with you: Jeroen, Kristina, Maarten, Cees, Geoffrie, Ronald, Erik-Jan, Gert, Marina, Lud and Mark. The discussions during lunch sharpened my thoughts on research on many occasions. Special thanks to Nico, for the deadly fights we fought at the table-tennis table that kept me in shape.

When I started writing the introduction of my thesis, Jos van Essen came in and interrupted me several times. On most occasions his interruptions ended up in fruitful discussions on the state of the world and software development. Thank you for this input, Jos; you helped me shape my introduction into its current form.

One group of friends followed my whole trajectory: the Freestylers. Jur, Axel, Martijn, Bastiaan, Hylke, Maarten, Ferdinand, Edwin, Bobby, thanks for the numerous discussions on everything in all kinds of places around the world (most often at camping Neus). You guys inspired me to finish my thesis, not by saying I should do it but by showing how it’s done while not making it bigger than it is.

Another group of friends kept me on the ground, while they also challenged me with philosophical discussions, handball and intriguing board-games. Rik, Marijn, Sjabbe and Chris, thanks for the relaxing moments we had together in the years I worked at my research.

Finally, I would like to thank my family for the support and reflection: Irene, Corry and Marijke. Without putting pressure on me, you kept believing I could finish this. Mare and Tara, thank you for your endless enthusiasm; you energized me all the time. Last but not least, Iris, thank you very much for your support and patience, especially during the restless period when I was finishing my thesis. I could not have completed this without you.


Contents

Acknowledgements

1 Introduction
1.1 Introduction
1.2 A Brief History of Software Development
1.3 Software Architecture
1.4 Architectural Design Decisions
1.5 Research Questions
1.6 Research Methodology
1.7 Structure of this Thesis
1.8 Publications

2 Exploring Use Cases for Architectural Decisions
2.1 Introduction
2.2 Architectural Decisions
2.3 From industrial needs to Use Cases
2.4 The Use Case Model
2.5 Use Case Validation
2.6 Related Work
2.7 Conclusions and Future Work

3 Design Decisions: the Bridge between Rationale and Architecture
3.1 Introduction
3.2 Software architecture
3.3 Rationale in software architecture
3.4 Design decisions: the bridge between rationale and architecture
3.5 Archium
3.6 Related work and further developments
3.7 Summary

4 Enriching Software Architecture Documentation
4.1 Introduction
4.2 Challenges for Software Architecture Documentation
4.3 Enriching Documentation with Formal AK
4.4 The Knowledge Architect
4.5 Resolved Challenges
4.6 The LOFAR Example
4.7 Quasi-Controlled Experiment
4.8 Related Work
4.9 Limitations
4.10 Future Work

5 Exploring the Context of Architectural Decisions
5.1 Introduction
5.2 Research Methodology
5.3 The Agile Architecture Axis Framework
5.4 Industrial Cases
5.5 Analysis
5.6 Reflection
5.7 Related and Future Work
5.8 Conclusions

6 Busting Software Architecture Beliefs
6.1 Introduction
6.2 Background
6.3 Experimental Setup
6.4 Results
6.5 Reflection
6.6 Related Work
6.7 Conclusions

7 Pivots and Architectural Decisions: Two Sides of the Same Medal?
7.1 Introduction
7.2 Conceptual Model
7.3 Software Architecture
7.4 New Product Development
7.5 Analysis
7.6 Guidelines
7.7 Related Work
7.8 Future Work
7.9 Conclusions

8 Towards Reusing Decisions by Mining Open Source Repositories
8.1 Introduction
8.2 Context
8.3 Research Approach: Mining Decisions
8.4 Programming Language Comparison
8.5 Decision Mining
8.6 Accessing Decision Rationale
8.7 Discussion
8.8 Conclusions

9 Conclusions
9.1 Research Questions
9.2 Contributions
9.3 Future Work

List of Figures

List of Tables

Abbreviations

Abstract

Samenvatting

Bibliography


Chapter 1

Introduction

“The secret to getting ahead is getting started.”

- Mark Twain

1.1 Introduction

When I was young, I always took a city plan with me when visiting an unknown town. If I wanted to know my location, I managed with the street signs, the index of the city plan and some orienteering skills. When I wanted to meet up with someone, I used my wired phone to contact the person a few days ahead to agree on where and when to meet. My sister used to send me letters when she lived in Australia. It took some weeks to communicate, but it was great to be able to hear something from her from the other side of the world. I never do any of the above-mentioned activities anymore; software radically changed them all.

Now, when I visit a town, the primary concern is that my smartphone is charged, so the software can assist me when I need it. GPS satellites help to show me where I am, and Internet applications provide detailed maps of the world. I call my appointment when I am near to see where we meet, and send a message to my friend in the US as I walk there. It is great to see his response almost instantly from the other side of the world.

Software has an enormous impact on everyday life. It enables us to be more productive and have rich communication with each other, but it also facilitates the distribution of fake news and addicts us to our mobile devices. Software is included in traditionally physical objects like toothbrushes or thermostats. Data is produced in enormous quantities and this data is increasingly used to gain insights, make better decisions or sell products. Even domains that have for a long time increased productivity by innovating the physical devices are being disrupted by software-driven products.

Software systems are increasingly interconnected, which creates great opportunities, but also great challenges when building these systems. For example, the BIOSCOPE project¹ involved 5 companies and dozens of systems that needed to interact. It combined the data from weather prediction, satellite images, drone images and predictive algorithms on crop growth and diseases to provide automated advice to farmers. One of the products delivered potato haulm killing taskmaps that can be used directly in (GPS-guided) tractors, based on satellite or drone images that are guaranteed to be less than ten days old.

¹ https://business.esa.int/projects/bioscope

It is challenging to build software systems like these. Many decisions have to be made to make sure all parts integrate well to meet business needs. For example, where and how should the data be stored? What data is communicated at which frequency? What is done when one of the systems fails? The answers to these questions are typical examples of decisions that form the architecture of software systems. This architecture can be seen as the backbone of the system. Documenting and discussing the software architecture is extremely important for the success of a system [67]. Visualizations in the form of diagrams or models are used to communicate the architecture design. Figure 1.1 shows a diagram that was used in the BIOSCOPE project to discuss the interaction between the system parts.

FIGURE 1.1: An architectural diagram used in the BIOSCOPE project.

When systems evolve, it is hard to remember why architectural decisions were made, even if the design is visualized and documented. This effect is known as knowledge vaporization [75]. Even though the results of a decision might be documented, the decision process and the reasoning behind the decision are lost as the system evolves. This thesis investigates how architectural design decisions can be preserved, recovered or reused in order to address architectural knowledge vaporization.


| Period | Application | Process | Architecture |
|---|---|---|---|
| Up until 1980 | Small monolithic programs for a single purpose. | Inspired by other engineering practices. Trial-and-error. | Close relationship between the design and the code. |
| 1980 - 1995 | Software is available for a larger audience, and software development becomes easier. | Waterfall methodology, focus on predictability of projects. Increasingly more expensive to develop software. | Emphasis on good design by documenting architectures. Start of reusing components. |
| 1995 - 2007 | Adoption of the Internet. Software is becoming essential to companies, maintenance is increasingly important. Specialized IT departments emerge, while companies arise that solely do software products. | Increasing complexity as more systems interact. Agile approaches emerge. Projects are technology-focused ("Can we make this work?"). | Increasing complexity of software architecture. Extensive architectural documentation based on styles and patterns. |
| 2007 until now | Software gets into every part of human life. For more companies, software becomes the distinctive element of the business. Much easier to start a software business. | Agile matured, challenges remain in scaling and continuous deployment. User-focused projects ("Can we make a business out of this?"). | Consolidation of styles in specific domains. Lightweight architectural documentation, close to the source code. |

TABLE 1.1: Historical Change in Software Engineering and Architecture

1.2 A Brief History of Software Development

Developing software is a time-consuming and tedious process. In the past decades, the ideas on how to build reliable software have evolved. Software engineering used to be treated as an engineering practice, similar to traditional engineering practices like construction or manufacturing. As time passed, the software industry matured through trial and (sometimes very expensive) error. In order to manage the constantly decreasing time-to-market, the software development process changed. To show the changes, roughly four periods can be distinguished in software engineering, as summarized in Table 1.1.

In the first period (up until ~1980), the computer was new, and specialized companies and research institutes were experimenting with software. This software was mostly monolithic (single purpose and single system) [199]. In the beginning of this period, there was no distinction between software and hardware, as both were sold together and software only worked for the hardware it was designed for. The IBM System/360² (announced in 1964) was the first to change this. In this family of computers, the same software could be used in different hardware versions. There was also no clear software engineering process at that time, as hardware engineering practices were adopted for software development. The designs were closely related to the implementation as there were not many layers of abstraction necessary; the design was the code. Software architecture did not yet exist as a discipline; as Kruchten et al. [113] stated, ’until the late 1980s, the word "architecture" was used mostly in the sense of system architecture’. This changed in the next period.

In the second period (1980-1995), computers and software entered the homes of people with the introduction of the Personal Computer and operating systems like MS-DOS³. With the greater adoption of third-generation programming languages like COBOL or C, computer programs were getting larger and more abstract. It was increasingly difficult to create software as the cost increased (exponentially) with the size of the systems. Large, long-running projects were hard to predict and often ran over time and budget, or completely failed [33]. The interest from industry and academia in software architecture increased. Because of the high costs of failing projects, there was an increasing interest in carefully documenting the architecture in order to prevent expensive errors. Research emphasized good architectural design to manage cross-cutting concerns like non-functional requirements. Frameworks for structured reuse of code emerged (e.g. COM components and DLLs).

Around the change of the millennium (period 1995-2007), the Internet emerged and the increasing communication between software systems increased the complexity of software development. Software was used in more domains (B2C and B2B), while companies arose that developed software as their main product. Maintenance of software systems got increasing attention as systems were longer in use and always on. It turned out that problems from bad design decisions could be very expensive (e.g. the Millennium Bug⁴). In order to address the predictability of projects, experiments were conducted with agile software development in industry settings. The Agile Manifesto⁵ (2001) expressed the common agile values and principles. With these, the emphasis on control and up-front design was slowly being replaced by a more iterative and empirical approach. Projects were technology focused ("Can we make this work?"), and software architecture was responsible for addressing the technological challenges, while meeting non-functional qualities like performance or scalability. Methods for quality attribute based architecting, like SAAM [106] and ATAM [107], were developed and used. Research on Architecture Description Languages (ADLs) [138], documenting architectures [42], views [67, 116], architectural styles [164] and patterns [81] was dominant in this period.

² https://en.wikipedia.org/wiki/IBM_System/360
³ https://en.wikipedia.org/wiki/MS-DOS
⁴ https://en.wikipedia.org/wiki/Year_2000_problem
⁵ http://agilemanifesto.org/


The last period (2007 until now) started with the introduction of the iPhone⁶, and with it the start of mobile Internet. In this period, the Internet matured and stretched to mobile devices and other physical objects (the Internet of Things). As Woods describes, software systems went ’from being "always on" to being "used from anywhere for anything"’ [199]. Standardization of many aspects of the Internet occurred, resulting in Software as a Service (SaaS) and Platform as a Service (PaaS) solutions. This democratized software development as it was possible for a much larger group to develop software. The focus of projects was to validate the business ideas, instead of exploring technological possibilities [156]. Agile software development proved its value, starting with small (startup) companies but gaining adoption across the whole software industry. The battles on the architectural styles and patterns were mostly settled as domains grew to a common understanding for their key architectural choices, reducing the need for architectural decisions [92]. In some cases, the ecosystem made the architectural decisions for the developers (e.g. the App stores, J2EE or .NET), while in other cases successful design decisions were copied, which resulted in similarity in design. For example, all large web frameworks use the MVC architectural style, and REST or SOAP implementations share the same design decisions. In software architecture, there is an increasing emphasis on documenting light-weight decisions, to make it possible to change direction just-in-time [152]. The domain of the software architect is stretched to include automated testing, Continuous Integration and Continuous Delivery, as well as monitoring and updating running systems [92]. As architecture decision-making shifts to being more of a team activity [177], the role of architect changes from decision maker to knowledge manager [197], responsible for sharing knowledge.

1.3 Software Architecture

As the previous section described, software architecture changed along with software engineering. The change in perception of software architecture becomes clear when looking at the different definitions that have been used for software architecture. One of the first definitions used is from Perry and Wolf [149]:

A software architecture is a set of architectural elements that have a particular form. We distinguish three different classes of architectural elements:

• processing elements;

• data elements; and

• connecting elements.

This definition prescribes mandatory elements for the architecture, and is focused on the design itself. It is inspired by the architecture used in hardware and construction, focusing on views that help to communicate the design. This is similar to the more abstract definition used by Garlan and Shaw two years later [68]:

⁶ https://www.youtube.com/watch?v=9hUIxyE2Ns8


The framework we will adopt is to treat an architecture of a specific system as a collection of computational components — or simply components — together with a description of the interactions between these components — the connectors.

The perception of an architecture as a set of components and connectors has been dominant for a long time. Many architectural styles and patterns have been defined in terms of components and connectors. However, even though architects were using views with components extensively, some aspects needed to understand the design and the system were missing. The rationale or reasoning behind the decisions was often forgotten, and lacked a clear representation in the architecture. Forgetting the rationale of decisions caused design erosion [75], making it harder to maintain software systems. Also, as agile development practices were adopted more often, software architecture needed to be more lightweight, to be able to change with the changing business needs. Decisions needed to be deferred to later in the product life-cycle, without losing the rationale behind the decisions. Bosch [27] explicitly mentions the design decision as the main element of software architecture:

The key difference from traditional approaches is that we do not view a software architecture as a set of components and connectors, but rather as the composition of a set of architectural design decisions.

In this thesis, the definition of software architecture as a set of architectural design decisions is adopted. In the next section, this concept is explored in more depth.

1.4 Architectural Design Decisions

With the increasing interest in decisions as first-class entities for architecture design, software architecture entered a new phase [27]. Whereas the component and connector view was closely connected to the implementation of the design (the result), the design decision view was connected to the decision process (how we reach the result) [38]. In the domain of architectural design decisions, several terms need a proper introduction, as they are used throughout this thesis. A Design Decision is a decision in the solution space that directly influences the design of a system [96]. Architectural Design Decisions can be seen as a subset of Design Decisions that specifically concern the software architecture of a system. An Architectural Design Decision can for example concern the adoption of an architectural style [68] or the inclusion of a specific component in the system [188]. There is a thin line between Design Decisions and Architectural Design Decisions, which follows the discussion on what is architecture and what is not [65]. This thesis does not intend to provide answers in this discussion. The context for this thesis is software architecture, so unless mentioned otherwise, Design Decisions are considered Architectural Design Decisions.

One of the main benefits of using Design Decisions as first-class entities is that the architecture design can be split into smaller, easier to comprehend pieces that together form a whole architecture. Unlike dividing the system into functional components, Design Decisions can address cross-cutting concerns like performance or maintainability, affecting multiple elements of the system. Compared to traditional architectural documentation, they are easier to comprehend as they are smaller and focus on one decision topic. A Design Decision can consist of text and fragments of existing (architectural) views, as part of artifacts. Van Heesch et al. [85] state that research on Architectural Design Decisions focuses on decision templates, decision models, and annotations. Kruchten introduced a classification of Architectural Design Decisions [114], while the GRIFFIN project placed the decision process in a broader context by introducing the concept of architectural knowledge [23], which includes aspects like concerns and architectural design. To visualize the concepts used in this thesis, Figure 1.2 shows a domain model for Architectural Design Decisions. This domain model describes the main elements around Architectural Design Decisions and their relationships.

[Figure: diagram with the elements Alternative, Decision Topic, Decision, Concern, Artifact, Rationale and Ranking, connected by the relationships "becomes", "is proposed for", "is made on", "addresses", "is reflected in", "contains", "is ordered by" and "is selected based on".]

FIGURE 1.2: Domain model for Architectural Design Decisions.

• Concern. A Concern is an interest in the system's development, its operation, or any other aspect that is critical or otherwise important to one or more stakeholders. A Concern can arise from different sources, like business needs, specific requirements, the development organization, or previously made Decisions.

• Decision Topic. A Decision Topic is a specific question that needs to be answered to progress with the design. It addresses a specific Concern.

• Alternative. To address a Decision Topic, one or more potential Alternatives can be proposed. The Alternatives can be reflected in one or more Artifacts, can contain Rationale for the decision, and can be ordered by several Rankings. When an Alternative is chosen as the solution for a Decision Topic, it becomes a Decision.

• Decision. One Alternative can be chosen to implement for a Decision Topic; this is the Decision. The selection of the right Alternative for a Decision can be based on one or more Rankings.

• Artifact. An Artifact is a general term for anything used in the design or implementation of a system. This can be a file (e.g. an architecture or requirement description), a model (e.g. UML models), or written code.

• Rationale. The Rationale of an Alternative describes, with text or supporting views, the reasoning behind the Alternative: why would this be a good or bad choice?

• Ranking. A Ranking is a way to order Alternatives. This can be either explicit (e.g. an ordered list of Alternatives based on performance) or implicit as part of the decision process.
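The elements and relationships listed above can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the class and field names, the `propose` and `decide` helpers, and the example values are hypothetical and are not taken from the GRIFFIN model or from the tooling described in this thesis. It shows a Decision Topic that addresses a Concern, collects Alternatives, and turns one Alternative into the Decision based on an explicit Ranking.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Concern:
    description: str


@dataclass
class Artifact:
    name: str  # e.g. a document, a UML model, or source code


@dataclass
class Ranking:
    criterion: str
    # Names of Alternatives, best first (an explicit Ranking).
    ordered_alternatives: List[str] = field(default_factory=list)


@dataclass
class Alternative:
    name: str
    rationale: str = ""  # reasoning behind the Alternative
    artifacts: List[Artifact] = field(default_factory=list)  # "is reflected in"


@dataclass
class DecisionTopic:
    question: str
    concern: Concern  # "addresses"
    alternatives: List[Alternative] = field(default_factory=list)
    decision: Optional[Alternative] = None  # the chosen Alternative

    def propose(self, alternative: Alternative) -> None:
        """An Alternative is proposed for this Decision Topic."""
        self.alternatives.append(alternative)

    def decide(self, ranking: Ranking) -> Alternative:
        """Select the best-ranked proposed Alternative; it becomes the Decision."""
        proposed = {a.name: a for a in self.alternatives}
        for name in ranking.ordered_alternatives:
            if name in proposed:
                self.decision = proposed[name]
                return self.decision
        raise ValueError("ranking does not mention any proposed alternative")


# Hypothetical usage; all values are invented for illustration.
topic = DecisionTopic("Which persistence mechanism?", Concern("maintainability"))
topic.propose(Alternative("relational database", rationale="mature tooling"))
topic.propose(Alternative("document store", rationale="flexible schema"))
ranking = Ranking("maintainability", ["relational database", "document store"])
chosen = topic.decide(ranking)
```

The sketch covers only explicit Rankings; an implicit Ranking, as mentioned in the list, would not be represented as data at all.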

Several domain models for architectural knowledge exist [22, 172], and attempts show that these models can be mapped onto each other [131]. The model used in this thesis is derived from the work we did in the GRIFFIN project [190]. It contains the main elements from the ‘Decision’ package we developed [61] in the context of this project. The model described in Figure 1.2 is used in the chapters of this thesis, each with a different focus. In Chapter 3, the decision process is analyzed, connecting the Rationale and Alternatives to the Decisions (see Figure 3.3). In Chapter 4, the domain model is used as the basis for the experiment with the developed tooling. This model (see Figure 4.7) adds details to the Concerns, and connects the Decision element to documentation where the knowledge is further detailed. In Chapter 8, the domain model is used when searching for design decisions in the history of source code. Here, the focus is on identifying the three main elements: the Alternatives, the Rationale, and the Decision.

Architectural Knowledge as presented in the domain model comes in three different forms: tacit, documented, and formal. Tacit knowledge [144] exists as thoughts in human brains, and is accessible by that person only. Documented knowledge helps to communicate, as it can be read by multiple people. However, the interpretation of the documentation can differ between people. Even the person who wrote the work can have problems understanding the text later, as tacit assumptions underlying the text can be forgotten. Formal knowledge (e.g. metadata or models) can help to increase the understandability and consistency, while it can also be processed by software. The research in this thesis is conducted on all types of knowledge for the software architecture domain. Architectural Knowledge [22] consists of the design, as well as the context, the underlying assumptions, and the Architectural Decisions [119]. All this knowledge is essential for understanding the architecture, and the architecture decisions are the elements that tie all the knowledge together. In this thesis, the main focus is on architecture decisions, either tacit, documented, or formal.


1.5 Research Questions

The field of software architecture clearly acknowledges the importance of architectural knowledge [43, 60], and specifically architectural decisions [96, 178], in order to address knowledge vaporisation [177]. Knowledge vaporisation is the problem that knowledge about the (architecture) design is not available when needed, typically during the evolution of the system. There are several reasons why knowledge vaporisation occurs. First, it is hard and time-consuming to capture knowledge in artifacts. Second, it is difficult to maintain this knowledge when the system evolves. Knowledge vaporisation affects the cost of maintenance and evolution, increases the chance of errors, and slows down development. Some solutions to address knowledge vaporisation have been proposed by academia [43, 60, 96, 178]. However, these ideas do not seem to find large-scale adoption in industry [197]. This thesis explores industry needs on managing architectural knowledge and shows how these needs can be supported in order to address knowledge vaporisation.

First, we investigate the industrial needs for managing architectural knowledge, where we focus on architectural decisions. We explore user scenarios where architecture decisions are created and used, based on interviews conducted with industry partners. The first research question addresses the exploration of industrial needs:

RQ1: What are the industry needs for managing architecture decisions?

[Figure 1.3 (diagram): each research question with its theme. Industry Needs: RQ1. Artifacts: RQ2 (How can tacit knowledge about architectural decisions be preserved for later use?). Process: RQ3 (How does the architecture decision process influence the decisions?). Reuse: RQ4 (How can architecture decision makers reuse decision data?).]

FIGURE 1.3: Research Questions


From the research done on this first research question, the scenarios have been divided into 27 use cases for working with architectural knowledge. These use cases involve one or more themes of knowledge vaporisation. The use of artifacts to preserve architectural knowledge was the subject of 15 use cases. Elements from the decision process came back in 9 of the use cases, while reuse of decisions was pointed out in 10 of the use cases. The remaining three research questions address these themes, as shown in Figure 1.3.


The scenarios from the initial research showed that it is hard to preserve and manage architectural decisions in typically used artifacts like architectural documents. Many tools are available that support the documentation of architecture decisions [4]. However, the adoption of these tools in practice is very low [197], and practical, adequate tools are uncommon [38]. It is often unclear what to capture and how to capture it [38]. We have investigated the research field of rationale management [35, 54] to see how it copes with preserving tacit knowledge in artifacts. We explored how this can be applied in software architecture research. In addition, we investigated what kind of tooling decision-makers need to preserve their tacit knowledge. Therefore, the second research question of this thesis is:

RQ2: How can tacit knowledge about architectural decisions be preserved for later use?

The research results from the second research question help us to understand how tacit knowledge can be preserved in artifacts. Architecture artifacts are the result of a decision process. This is rarely a structured or standardized process. Many stakeholders of the system influence the decisions made. In this process, there is often no awareness that decisions are made [197], and there is a lack of motivation or incentive to capture them [38]. The quality of this decision process is hard to assess. However, there are many factors that influence the decision process. We investigated what the main elements are that influence the architecture decision process. Therefore, the third research question of this thesis is:

RQ3: How does the architecture decision process influence the decision results?

In our work on the decision process, we saw that making decisions fast to cope with the speed of the changing market is essential for businesses. Reuse of decisions that were made in the past will help to increase this decision speed. The need for data to base decisions on is increasing rapidly [28]. For architectural decisions, typically reused artifacts are architectural styles and patterns. Specific design decisions, for example which component to use, are very project-specific. Reuse of these architectural decisions between projects and systems [4] is very difficult. However, sharing data is becoming more common, as can be seen in the open source community. As part of the research in this thesis, we searched for data on architectural decisions in open source projects. We investigated whether it is possible to gather data on the occurrence of specific decisions in the past, to assist decision makers in the present. Therefore, the fourth and last research question for this thesis is:

RQ4: How can architecture decision makers reuse decision data?

The four research questions are addressed in this thesis. The mapping of the research questions to the chapters of this thesis is described in Section 1.7. A summary of the results is given in the conclusions in Chapter 9.


1.6 Research Methodology

Research in software engineering is a combination of data-driven and social sciences research. On the one hand, it is possible to prove aspects of software development (e.g. deduce the correctness or efficiency of algorithms), while on the other hand, creating software is a social activity where determinism is not possible as validation. Opinions, discussions, and emotions highly influence the effectiveness of a software project.

Hevner et al. [86] describe two paradigms for research in information systems. These paradigms are similarly relevant for software engineering research. First, they describe the behavioral-science perspective. Hevner et al. state that this ‘seeks to develop and justify theories (i.e., principles and laws) that explain or predict organizational and human phenomena surrounding the analysis, design, implementation, management, and use of information systems’ [86]. Second, they describe the design-science paradigm [52], where research is seen as a problem-solving activity. This is in line with Shaw [163], who describes software engineering research by defining a specific goal:

Generally speaking, software engineering researchers seek better ways to develop and evaluate software. They are motivated by practical problems, and key objectives of the research are often quality, cost, and timeliness of software products.

With this definition, the focus is on the practical usefulness of the research (utility), not on the development of theories (discovering truth). Hevner et al. [86] argue that these two paradigms can be seen as two sides of the same coin, both aiming to discover useful IT artifacts. Latour [124] showed that research is a social construct, where social interactions and research artifacts together construct meaningful theories. Hevner et al. [86] identify four types of IT artifacts that are used in software engineering research:

• constructs: vocabulary and symbols

• models: abstractions and representations

• methods: algorithms and practices

• instantiations: implemented and prototype systems

This thesis reports on research that developed several of these types of IT artifacts. We often iterated the build-and-evaluate loop several times. When we learned from our initial results, we changed the theory and the artifacts and once again evaluated the results. This way of working is similar to the validated learning approach advocated by Eric Ries [156] for new product development. While the IT artifact can be used as a goal for design-science research, one other aspect is important too: the evaluation of the artifact [86]. The contribution of this thesis will be addressed by describing the artifact and the evaluation of the conducted research in the conclusions in Chapter 9.


1.6.1 Research Techniques

In software engineering research, many research techniques are used. The following list summarizes the research techniques used in this thesis.

• Interview. As software engineering is a social activity, it is important to get a solid idea of the opinions of participants. Structured interviews [161] are a way to gather data on existing problems and chosen solutions. Two chapters in this thesis are based on interviews as research methodology: Chapters 2 and 7.

• On-line Survey. A more data-driven way to acquire knowledge from people that are involved in software projects is by conducting an on-line survey [154]. One of the chapters in this thesis is based on the results of a conducted survey: Chapter 6. The survey was codified, and statistical analysis has been used to validate the research assumptions.

• Participant Observation. In order to identify problems in software projects, we conducted research on several case studies as participant researchers [161]. In these cases, the researchers were part of a project team, and comparative multi-case analysis methodology [55] was used to generalize over the cases to an abstract model. This methodology was used in Chapter 5.

• Proof of Concept. In order to validate that a proposed methodology or tool is feasible, a prototype, or proof of concept, is used [74]. With this technique, the developed prototype shows the feasibility of the approach, and demonstrates a solution that is generalizable to usage in practice. In Chapter 3, an example case is modeled in the proof of concept tool Archium [101] to show the feasibility of the approach. In Chapter 4, an example implementation is used in combination with a controlled experiment to show the feasibility of the approach. Last, in Chapter 8, the possibility of mining decisions and getting access to rationale is shown with an example implementation.

• Controlled Experiment. In the quasi-controlled experiment [102] used in Chapter 4, subjects were instructed to use a developed tool in order to validate hypotheses about expected efficiency.

• A/B Experiment. As an experimental technique, the validation of work presented in Chapter 8 was done by an A/B experiment [47]. In this experiment, the subjects were unaware they were part of the research, and two groups of subjects were used to validate additional assumptions.

1.7 Structure of this Thesis

The chapters of this thesis consist of published material. Apart from minor changes (e.g., references or small corrections), these chapters are the same as the publications. In two cases, the chapter is a combination of two publications (Chapter 2 and Chapter 8). The chapters are not ordered chronologically; the work is ordered by the topics of the publications to increase the readability of this thesis. The chapters follow the order of the research questions.

Research Question   Chapter     Research Technique
RQ1                 Chapter 2   Interview
RQ2                 Chapter 3   Proof of Concept
                    Chapter 4   Proof of Concept & Controlled Experiment
RQ3                 Chapter 5   Participant Observation
                    Chapter 6   On-line Survey
                    Chapter 7   Interview
RQ4                 Chapter 8   Proof of Concept & A/B Experiment

TABLE 1.2: Structure of this Thesis

Chapter 1 introduces the research questions. Chapter 9 summarizes the contribution of this thesis and presents future work. The contribution of the other chapters to the research questions is shown in Table 1.2.

1.8 Publications

The research presented in this thesis has been published at international conferences, as book chapters, and in international journals. The chapters consist of these publications, sometimes with minor changes to improve the readability. The research in this thesis was performed by Jan Salvador van der Ven, who in most cases is the first author of the publication. For all the publications, at least one senior researcher co-authored the paper, helping primarily with feedback on the structure and contents.

In Chapter 2, we identified the industrial needs for architectural knowledge by conducting interviews at four companies that were involved in the Griffin project. The interviews were conducted by the involved researchers (the author of this thesis conducted the interviews at one of the partners). Together with Anton Jansen, I constructed the use case model presented in this work, based on the interview results. The use case model creates the context for this thesis as it gives an overview of the needs for architectural knowledge management. I also did the validation of a set of use cases in cooperation with ASTRON, as described in the chapter. The work in Chapter 2 is based on the following publications:

• Jan S. van der Ven, Anton G. J. Jansen, Paris Avgeriou, and Dieter K. Hammer. “Using Architectural Decisions”. In: Second International Conference on the Quality of Software Architecture (QoSA 2006). Karlsruhe University Press, 2006, pp. 1–10

• Anton G. J. Jansen, Jan van der Ven, Paris Avgeriou, and Dieter K. Hammer. “Tool support for Architectural Decisions”. In: Proceedings of the 6th IEEE/IFIP Working Conference on Software Architecture (WICSA 2007). Mumbai, India, Jan. 2007


In Chapter 3, the relationship between software architecture and the rationale management field is explored. The concept of the explicit architectural design decision is introduced to bridge the gap between these worlds and enable better understanding of architectural decisions. This work was written in cooperation with Anton Jansen. My contribution to this work is the comparison of the processes and the validation of the ideas with the example. The work on Archium was done by Anton. Chapter 3 is based on the following publication:

• Jan Salvador van der Ven, Anton Jansen, Jos Nijhuis, and Jan Bosch. “Design Decisions: The Bridge between Rationale and Architecture”. In: Rationale Management in Software Engineering. Springer, 2006, pp. 329–348

Chapter 4 has been previously published with Anton Jansen as first author, as he did most of the writing of this manuscript. I started this research with the initial idea on annotating architectural decisions in documentation. I created the first version of the Word plug-in. As third and last author of this research, I contributed the initial idea, and helped with structuring and writing the manuscript. Chapter 4 is based on:

• Anton Jansen, Paris Avgeriou, and Jan Salvador van der Ven. “Enriching Software Architecture Documentation”. In: Journal of Systems and Software 82.8 (Aug. 2009), pp. 1232–1248

The work of Chapter 5 describes three dimensions that can be used to classify the nature of architectural decisions: the architect, the artifacts, and the periodicity. I distilled the dimensions from the industrial cases. I participated in four of the five companies as participant researcher; the data of the last company came from the coauthor Jan Bosch. The identification of the problems in the cases was done by me, while Jan Bosch assisted me with proof-reading of the work and feedback on the content. Chapter 5 is based on:

• Jan Salvador van der Ven and Jan Bosch. “Architecture Decisions: Who, How, and When?” In: Agile Software Architecture. Ed. by Muhammad Ali Babar, Alan W. Brown, and Ivan Mistrik. Boston: Morgan Kaufmann, 2014, pp. 113–136

In Chapter 6, we assess beliefs around software architecture, based on the results of an on-line survey. I created and distributed the survey. I summarized the beliefs and conducted the coding and analyses on the results of the questionnaire. The work from Chapter 6 is based on:

• Jan Salvador van der Ven and Jan Bosch. “Busting Software Architecture Beliefs: A Survey on Success Factors in Architecture Decision Making”. In: 42nd Euromicro Conference on Software Engineering and Advanced Applications (SEAA). Aug. 2016, pp. 42–49

Similar to Chapter 3, Chapter 7 explores the relationship between software architecture and another field, in this case the field of lean startup. It describes how the concepts of design decisions and pivots show similarities. The work is based on interviews with founders and experienced architects at five different companies. I conducted these interviews, created the conceptual model, and conducted the analysis of the pivots. Also, I created the comparison of the concepts and developed the guidelines. The work in Chapter 7 is based on:

• Jan Salvador van der Ven and Jan Bosch. “Pivots and Architectural Decisions: Two Sides of the Same Medal? What Architecture Research and Lean Startup can learn from Each Other”. In: Proceedings of International Conference on Software Engineering Advances (ICSEA 2013). 2013, pp. 310–317

Chapter 8 describes how architectural design decisions can be mined from the version management of open source projects. The idea came from me, and I created the software that was used to do the mining, visualization, and validation with the email experiment. I developed the criteria for the programming languages and the assessment of them. I conducted the analysis of the early results with subject matter experts as well as the email experiment. The chapter is based on the extended version of this research, which is currently under review, supplemented with a part from the previously published version. The work in Chapter 8 is based on the following manuscripts:

• Jan Salvador van der Ven and Jan Bosch. “Making the Right Decision: Supporting Architects with Design Decision Data”. In: Proceedings of the 7th European Conference on Software Architecture (ECSA 2013). Ed. by Khalil Drira. Vol. 7957. Lecture Notes in Computer Science. Springer, 2013, pp. 176–183

• Jan Salvador van der Ven and Jan Bosch. “Towards Reusing Decisions by Mining Open Source Repositories”. Submitted to an international Software Engineering Journal, 2018

Additional publications by the author of this thesis, which are not included as chapters in this thesis, are:

• Marco Sinnema, Jan Salvador van der Ven, and Sybren Deelstra. “Using Variability Modeling Principles to Capture Architectural Knowledge”. In: SIGSOFT Softw. Eng. Notes 31.5 (Sept. 2006)

• Remco de Boer, Rik Farenhorst, Viktor Clerc, Jan Salvador van der Ven, Patricia Lago, and Hans van Vliet. “Structuring Architecture Project Memories”. In: Proceedings 8th International Workshop on Learning Software Organizations (LSO 2006). 2006, pp. 39–47

• Hylke Faber, Menno Wierdsma, Richard Doornbos, Jan Salvador van der Ven, and Kevin de Vette. “Teaching Computational Thinking to Primary School Students via Unplugged Programming Lessons”. In: Journal of the European Teacher Education Network 12.0 (2017), pp. 13–24

• Hylke H. Faber, Jan Salvador van der Ven, and Menno D.M. Wierdsma. “Teaching Computational Thinking to 8-Year-Olds Through ScratchJr”. In: Proceedings of the 2017 ACM Conference on Innovation and Technology in Computer Science Education. ITiCSE ’17. Bologna, Italy: ACM, 2017, pp. 359–359


Chapter 2

Exploring Use Cases for Architectural Decisions

“In theory there is no difference between theory and practice. In practice there is.”

- Yogi Berra

This section is based on:

• Jan S. van der Ven, Anton G. J. Jansen, Paris Avgeriou, and Dieter K. Hammer. “Using Architectural Decisions”. In: Second International Conference on the Quality of Software Architecture (QoSA 2006). Karlsruhe University Press, 2006, pp. 1–10

• Anton G. J. Jansen, Jan van der Ven, Paris Avgeriou, and Dieter K. Hammer. “Tool support for Architectural Decisions”. In: Proceedings of the 6th IEEE/IFIP Working Conference on Software Architecture (WICSA 2007). Mumbai, India, Jan. 2007

The problem introduction sections were used from the latter, while the description of the use case model is based on the work presented in QoSA 2006.

Abstract

There are increasing demands for the explicit representation and subsequent sharing and usage of architectural decisions in the software architecting process. However, there is little known on how to use these architectural decisions, or what type of stakeholders need to use them. This chapter presents a use case model that arose from industrial needs, and is meant to explore how these needs can be satisfied through the effective usage of architectural decisions by the relevant stakeholders. The use cases are currently being validated in practice through industrial case studies. As a result of this validation, we argue that the usage of architectural decisions by the corresponding stakeholders can enhance the quality of software architecture.

2.1 Introduction

Current research trends in software architecture focus on the treatment of architectural decisions [112, 114, 180] as first-class entities and their explicit representation in the architectural documentation. From this point of view, a software system’s architecture is no longer perceived as interacting components and connectors, but rather as a set of architectural decisions [100]. This paradigm shift has been initiated in order to alleviate a major shortcoming in the field of software architecture: Architectural Knowledge Vaporization [27, 189]. Architectural decisions are one of the most significant forms of architectural knowledge [189]. Consequently, architectural knowledge vaporizes because most of the architectural decisions are not documented in the architectural document and cannot be explicitly derived from the architectural models. They merely exist in the form of tacit knowledge in the heads of architects or other stakeholders, and inevitably dissipate. Note that this knowledge vaporization is accelerated if no architectural documentation is created or maintained in the first place. Architectural knowledge vaporization due to the loss of architectural decisions is most critical, as it leads to a number of problems that the software industry is struggling with:

• Expensive system evolution. As the systems need to change in order to deal with new requirements, new architectural decisions need to be taken. However, the documentation of existing architectural decisions that reflect the original intent of the architects is lacking. This in turn causes the adding, removing, or changing of decisions to be highly problematic. Architects may violate, override, or neglect to remove existing decisions, as they are unaware of them. This issue, which is also known as architectural erosion [149], results in high evolution costs.

• Lack of stakeholder communication. The stakeholders come from different backgrounds and have different concerns that the architecture document must address. If the architectural decisions are not documented and shared among the stakeholders, it is difficult to perform trade-offs, resolve conflicts, and set common goals, as the reasons behind the architecture are not clear to everyone.

• Limited reusability. Architectural reuse cannot be effectively performed when the architectural decisions are implicitly hidden in the architecture. To reuse architectural artifacts, we need to know the alternatives, and the rationale behind each of them, so as to avoid making the same mistakes. Otherwise the architects need to ‘re-invent the wheel’.

The complex nature and role of architectural decisions require a systematic and partially automated approach that can explicitly document and subsequently incorporate them in the architecting process. We have worked with industrial partners to understand the exact problems they face with respect to the loss of architectural decisions. We demonstrate exactly how the system stakeholders can use architectural decisions with the help of a use-case model.

The rest of this chapter is structured as follows: first, the notion of architectural decisions is introduced, followed by the vision of how to share and use these decisions by relevant stakeholders as the Knowledge Grid. In Section 2.3 we give an overview of how our industrial partners defined the needs for using and sharing architectural decisions. Section 2.4 presents the use case model, including the actors and system boundary. The ongoing validation of the use cases is described in Section 2.5. Section 2.6 discusses related work in this field and Section 2.7 sums up with conclusions and future work.

2.2 Architectural Decisions

To solve the problem of knowledge vaporization and attack the associated problems of expensive system evolution, lack of stakeholder communication, and limited reuse, we need to effectively upgrade the status of architectural decisions to first-class entities. However, first we need to understand their nature and their role in software architecture. Based on our earlier work [27, 100, 189], we have come to the following conclusions on architectural decisions so far:

• They are cross-cutting to a great part or the whole of the design. Each decision usually involves a number of architectural components and connectors and influences a number of quality attributes.

• They are interlaced in the context of a system’s architecture and they may have complex dependencies with each other. These dependencies are usually not easily understood, which further hinders modeling them and analyzing them (e.g. for consistency).

• They are taken to realize requirements (or stakeholders’ concerns), and conversely requirements must result in architectural decisions. This two-way traceability between requirements and decisions is essential for understanding why the architectural decisions were taken.

• They must result in architectural models, and conversely architectural models must be rationalized by architectural decisions. This two-way traceability between decisions and models is essential for understanding how the architectural decisions affect the system.

• They are derived in a rich context: they result from choosing one out of several alternatives, they usually represent a trade-off, they are accompanied by a rationale, and they have positive and negative consequences on the overall quality of the system architecture.
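The two-way traceability described in the list above can be illustrated with a small consistency check. The sketch below is an assumption-laden illustration, not part of the Griffin tooling: the function name `traceability_gaps`, the link names `realizes` and `results_in`, and the identifiers in the example are all hypothetical. It reports requirements and models that no decision links to, and decisions that link to no requirement or model.

```python
from typing import Dict, Set


def traceability_gaps(
    requirements: Set[str],
    models: Set[str],
    decisions: Dict[str, Dict[str, Set[str]]],
) -> Dict[str, Set[str]]:
    """Report missing links in both directions.

    `decisions` maps a decision name to its links:
    "realizes" (requirement ids) and "results_in" (model ids).
    """
    realized: Set[str] = set()
    rationalized: Set[str] = set()
    for links in decisions.values():
        realized |= links.get("realizes", set())
        rationalized |= links.get("results_in", set())
    return {
        # Requirements that did not result in any decision.
        "requirements_without_decision": requirements - realized,
        # Models that are not rationalized by any decision.
        "models_without_decision": models - rationalized,
        # Decisions that realize no requirement.
        "decisions_without_requirement": {
            name for name, links in decisions.items() if not links.get("realizes")
        },
        # Decisions that are not reflected in any model.
        "decisions_without_model": {
            name for name, links in decisions.items() if not links.get("results_in")
        },
    }


# Hypothetical usage with invented identifiers.
gaps = traceability_gaps(
    requirements={"R1", "R2"},
    models={"M1"},
    decisions={
        "D1": {"realizes": {"R1"}, "results_in": {"M1"}},
        "D2": {"realizes": set(), "results_in": set()},
    },
)
```

In this example, requirement R2 has no decision and decision D2 has neither a requirement nor a model, so both directions of the traceability show a gap.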

The exact properties and relationships of the architectural decisions [27, 119] are still the topic of ongoing research. Some properties [100, 142, 180] and relationships [112] have been identified. In this chapter, the definition from [189] is used for architectural decisions:

A description of the choice and considered alternatives that (partially) realize one or more requirements. Alternatives consist of a set of architectural additions, subtractions and modifications to the software architecture, the rationale, and the design rules, design constraints and additional requirements.

A description of an architectural decision can therefore be divided into two parts: a description of the choice and the associated alternatives. The description of the choice consists of elements like: problem, motivation, cause, context, choice (i.e. the decision), and the resulting architectural modification. The description of an alternative includes: design rules and constraints, consequences, and pros and cons of the alternative [189]. For a more in-depth description of how architectural decisions can be described, see [180, 189].
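As a rough illustration of the "additions, subtractions and modifications" named in the definition, the sketch below applies the delta of a chosen alternative to an architecture. It rests on strong simplifying assumptions: an architecture is reduced to a set of element names, a modification is a rename from an old element to a new one, and the names `AlternativeDelta` and `apply_alternative` are hypothetical. The model in [189] is considerably richer.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class AlternativeDelta:
    """Architectural changes an alternative would make (simplified)."""
    additions: Set[str] = field(default_factory=set)
    subtractions: Set[str] = field(default_factory=set)
    # Maps an existing element to the element that replaces it.
    modifications: Dict[str, str] = field(default_factory=dict)


def apply_alternative(architecture: Set[str], delta: AlternativeDelta) -> Set[str]:
    """Return the architecture that results from choosing the alternative."""
    touched = delta.subtractions | set(delta.modifications)
    missing = touched - architecture
    if missing:
        raise ValueError(f"elements not in architecture: {sorted(missing)}")
    result = set(architecture) - touched
    result |= delta.additions
    result |= set(delta.modifications.values())
    return result


# Hypothetical usage; element names are invented for illustration.
before = {"WebUI", "OrderService", "FileStore"}
delta = AlternativeDelta(
    additions={"Cache"},
    subtractions={"FileStore"},
    modifications={"OrderService": "OrderService v2"},
)
after = apply_alternative(before, delta)
```

The input set is left untouched, so the architecture before and after the decision can be compared, which is the kind of information that is lost when decisions are not recorded.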

2.3 From industrial needs to Use Cases

In order to support and semi-automate the introduction and management of ar-chitectural decisions in the architecting process an appropriate tool is required. Inspecific, this tool should be a Knowledge Grid [200]: “an intelligent and sustain-able interconnection environment that enables people and machines to effectivelycapture, publish, share and manage knowledge resources”.

Before pinpointing the specific requirements for a knowledge grid in the nextsections, it is useful to consider the more generic requirements by combining theareas of knowledge grids and architectural decisions. First, this system shouldsupport the effective collaboration of teams, problem solving, and decision mak-ing. It should also use ontologies to represent the complex nature of architecturaldecisions, as well as their dense inter-dependencies. Furthermore, it must effec-tively visualize architectural decisions and their relations from a number of dif-ferent viewpoints, depending on the stakeholders’ concerns. Finally, it must beintegrated with the tools used by architects, as it must connect the architecturaldecisions to documents written in the various tools or design environments, andthus provide traceability between them.

We are currently participating in the Griffin project that is working on tools,techniques and methods that will perform the various tasks needed for buildingthis knowledge grid. Until now, the project has produced two main results: a usecase model, and a domain model. The domain model describes the basic conceptsfor storing, sharing, and using architectural decisions and the relationships be-tween those concepts [61]. The use case model describes the required usages ofthe envisioned knowledge grid. The use cases are expressed in terms of the do-main model, in order to acquire a direct link between what should be described(the domain model), and how it should be used (use cases). The focus in thischapter is on the use case model.

Four different industrial partners participate in the Griffin project. They are all facing challenges associated with architectural knowledge vaporization. Although the companies differ in nature, they are all involved in constructing large software-intensive systems. They consider software architecture of paramount importance to their projects, and they all use highly sophisticated techniques for maintaining, sharing and assessing software architectures. Still, some challenges remain.

We conducted qualitative interviews with 14 employees of these industrial partners. Our goal was to analyze the problems they faced concerning sharing architectural knowledge, and to identify possible solutions to these problems. People with different roles were interviewed: architects (SA), project managers (PM), architecture reviewers (AR), and software engineers (SE). To guide the interviews, a questionnaire was used (see appendix). The questionnaire was not shown directly to the employees, but was used as a starting point and checklist for the interviewers.

The results of the interviews were summarized in interview reports that described the current challenges faced and the solutions envisioned by these companies. The interview reports contained a number of needs expressed by the interviewees, which included:

1. Find relevant information in large architectural descriptions (SA, PM, AR).

2. Add architectural decisions, relate them to other architectural knowledge like architectural documentation, or requirement documentation (SA, PM).

3. Search architectural decisions and the underlying reasons, construct (multiple) views where the decisions are represented (SA, PM, AR).

4. Identify what knowledge should minimally be made available to let developers work effectively (SA).

5. Identify the changes in architectural documentation (PM).

6. Identify what architectural decisions have been made in the past, to avoid redoing the decision process. This includes identifying what alternatives were evaluated and the issues that played a critical role at that time (SA, PM, AR, SE).

7. Reuse architectural decisions (SA, PM, SE).

8. Keep architecture up-to-date during development and evolution (SA, PM).

9. Get an overview of the architecture (SA, PM, AR).

The next section describes how these interviews were used to construct use cases for managing and sharing architectural knowledge.

2.4 The Use Case Model

This section elaborates on a set of use cases that roughly define the requirements for a potential knowledge grid. The starting point for the use cases was the interview reports and the requirements stated in these reports. First, we describe the actors of the knowledge grid, starting from the roles of our interviewees. After this, the primary actor and the scope are discussed. To understand the dependencies between the use cases, a use case model consisting of 27 use cases, including their relations, is presented in Figure 2.1. Besides presenting the dependencies among the use cases, the figure also relates the use cases to the identified needs described in the previous section.


[Figure 2.1 diagram. Actor rows: Project Manager, Architecture Reviewer, Architect, Maintainer, All. Goal-level columns: Summary, User-Goal, Subfunction. Numbered nodes 1 to 27 are the use cases listed below; the legend distinguishes use cases, actors, and includes relationships.]

Use Case Titles

1. Check implementation against architectural decisions (need 8)

2. Identify the subversive stakeholder (need 3)

3. Identify key architectural decisions for a specific stakeholder (need 1, 9)

4. Perform a review for a specific concern (need 3)

5. Check correctness (need 8, 9)

6. Identify affected stakeholders on change (need 3)

7. Identify unresolved concerns for a specific stakeholder (need 9)

8. Keep up-to-date (need 5)

9. Inform affected stakeholders (need 5)

10. Retrieve an architectural decision (need 6)

11. View the change of the architectural decisions over time (need 5)

12. Add an architectural decision (need 2)

13. Remove consequences of a cancelled architectural decision (need 8)

14. Reuse architectural decisions (need 7)

15. Recover architectural decisions (need 6, 7)

16. Perform incremental architectural review (need 1, 9)

17. Assess design maturity (need 1)

18. Evaluate impact of an architectural decision

19. Evaluate consistency (need 1)

20. Identify incompleteness (need 1)

21. Conduct a risk analysis

22. Detect patterns of architectural decision dependencies

23. Check for superfluous architectural decisions

24. Cleanup the architecture

25. Conduct a trade-off analysis (need 3)

26. Identify important architectural drivers (need 3)

27. Get consequences of an architectural decision (need 3, 6)

FIGURE 2.1: Use case diagram
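The link between use cases and interview needs in Figure 2.1 can be read as a small lookup table. The following is a minimal sketch, assuming a plain dictionary representation; the names `USE_CASE_NEEDS`, `use_cases_by_need` and `use_cases_without_need` are hypothetical and only a subset of the 27 use cases is included.

```python
from collections import defaultdict

# A subset of the titles in Figure 2.1: use case number -> (title, needs).
# The dictionary layout is an illustrative assumption, not part of the model.
USE_CASE_NEEDS = {
    1: ("Check implementation against architectural decisions", [8]),
    5: ("Check correctness", [8, 9]),
    10: ("Retrieve an architectural decision", [6]),
    12: ("Add an architectural decision", [2]),
    17: ("Assess design maturity", [1]),
    21: ("Conduct a risk analysis", []),
    27: ("Get consequences of an architectural decision", [3, 6]),
}


def use_cases_by_need(table):
    """Invert the table: need number -> use case numbers that address it."""
    index = defaultdict(list)
    for number, (_title, needs) in sorted(table.items()):
        for need in needs:
            index[need].append(number)
    return dict(index)


def use_cases_without_need(table):
    """Use cases that are not linked to any interview need."""
    return sorted(n for n, (_title, needs) in table.items() if not needs)


if __name__ == "__main__":
    print(use_cases_by_need(USE_CASE_NEEDS))
    print(use_cases_without_need(USE_CASE_NEEDS))
```

Inverting the table in this way makes it easy to see which needs are served by several use cases and which use cases were added without a direct interview need.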


2.4.1 Actors

We identified the following actors as relevant for the use cases, based on the roles of the interviewees.

• Architect. Architects should be able to create and manage an architecture, and get an overview of the status of the architecture. This results in demands for views that show the coverage of requirements or describe the consistency of the design. Also, the architect is responsible for providing stakeholders with sufficient information, to ensure that their concerns are met in the architecture design.

• Architecture Reviewer. Architecture reviewers are often interested in a specific view on the architecture. They can be colleagues, experts from a certain field, or reviewers from an external organization. They want to understand the architecture quickly and want to identify potential pitfalls in the architecture, like poorly founded architectural decisions, architectural incompleteness, or architectural inconsistency.

• Project Manager. The concerns of the project manager are usually driven by the planning: what is the status of the architecture, are there potential upcoming problems or risks, and how can we address them? The project manager also addresses people-related issues, e.g. which stakeholder is the biggest risk for the architecture?

• Developer. The primary concern of the developer is that the architecture should provide sufficient information for implementing the system. The descriptions must be unambiguous. Also, the developer must know where to look for the necessary knowledge; this can be in the architectural documentation, or by knowing which person to contact.

• Maintainer. The maintainer is often ignored as a stakeholder of an architecture. However, the maintainer is one of the most important actors when the architecture has to evolve. The maintainer has an interest in the evolution of the architecture (up-to-date information), and in the consequences of changes in the architecture.

We found that the different companies used different terms for the roles they have in the software development process. The list of actors presented above is an abstraction of those different roles.

2.4.2 Describing the use cases

We present the use cases, as mandated in [44], using the following elements:

• Scope. All the use cases are defined as an interaction with a knowledge grid type of system (see Section 2.3). From the use case model perspective, this system is considered a black-box system.

• Goal level. The descriptions from the interviews varied widely in their level of detail. As a consequence, some use cases describe a single interaction with the system (e.g. add an architectural decision), while others are very high-level demands on the system (e.g. perform an incremental architectural review). To describe this difference, we adopted three goal levels of decreasing abstraction from [44]: Summary, User-goal and Subfunction. A Summary use case can involve multiple User-goal use cases, and often has a longer time span (hours, days). A User-goal use case involves a primary actor using the system (in Business Process Management often called an elementary business process), often in one session of using the system. Subfunction use cases are required to carry out User-goal use cases. They typically represent an elementary action on the system, which is used by multiple User-goal use cases.

• Primary actor. The list of actors described in Section 2.4.1 is used to determine the primary actor for a specific use case. Sometimes, a use case can be performed by all actors (e.g. identify key architectural decisions). In these cases, the term All is used as a substitute for the primary actor. In other cases, when the type of actor affects the use case, the most suitable actor was selected as primary actor, and the others were left out.

• Main success scenario and steps. First, a short description of the use case was constructed. From this, a set of steps was defined, describing the main success scenario. Due to space constraints, this is not shown for all the use cases. In the next section, four use cases are described in detail.

• Includes relationships. The “include” relationships between the use cases are based on the steps defined for these use cases. This relationship expresses that a use case contains behavior defined in another use case, as defined in UML 2.0¹. When a use case includes another use case with a different primary actor, this typically means that the first actor will ask the second actor to perform the specified use case. For example, in use case 2 (Identify the subversive stakeholder), the project manager will ask the architect to conduct a risk analysis (use case 21). Of course, one person can also have multiple roles, and thus act as the primary actor in both use cases.

Figure 2.1 presents the characteristics (primary actor, goal level, and name) of the use case model, which consists of 27 use cases. Note that, to enhance readability, the uses relationship (between the actor and the use case) is not visualized with arrows, but by horizontal alignment. For example, the architecture reviewer acts as a primary actor for use cases 16, 4, 26, and 5. The use cases are vertically divided into the three goal levels: Summary, User-goal and Subfunction. For example, use case 16 is a Summary use case and use case 4 a User-goal use case.
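The elements above (primary actor, goal level, includes relationships) map naturally onto a small data structure. The sketch below is an illustration only, assuming hypothetical names (`UseCase`, `GoalLevel`, `included_closure`, `delegations`); it encodes the four use cases detailed in Section 2.5 and shows how an include across different primary actors can be read as a request from one actor to another.

```python
from dataclasses import dataclass
from enum import Enum


class GoalLevel(Enum):
    SUMMARY = "Summary"
    USER_GOAL = "User-goal"
    SUBFUNCTION = "Subfunction"


@dataclass(frozen=True)
class UseCase:
    number: int
    title: str
    primary_actor: str
    level: GoalLevel
    includes: tuple = ()  # numbers of the included use cases, in step order


# Only the use cases detailed in Section 2.5; the class layout is an assumption.
MODEL = {uc.number: uc for uc in [
    UseCase(17, "Assess design maturity", "Project Manager",
            GoalLevel.SUMMARY, (20, 5, 19)),
    UseCase(20, "Identify incompleteness", "Architect", GoalLevel.USER_GOAL),
    UseCase(5, "Check correctness", "Architect", GoalLevel.SUBFUNCTION),
    UseCase(19, "Evaluate consistency", "Architect", GoalLevel.USER_GOAL),
]}


def included_closure(model, number):
    """Every use case reachable through includes, in first-visit order."""
    seen = []
    stack = list(reversed(model[number].includes))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        stack.extend(reversed(model[current].includes))
    return seen


def delegations(model, number):
    """Included use cases whose primary actor differs from the including one."""
    owner = model[number].primary_actor
    return [(n, model[n].primary_actor)
            for n in model[number].includes
            if model[n].primary_actor != owner]


if __name__ == "__main__":
    print(included_closure(MODEL, 17))
    print(delegations(MODEL, 17))
```

Here every include of use case 17 is a delegation, because the project manager asks the architect to carry out the three checks.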

2.5 Use Case Validation

A use case model like the one presented in Section 2.4 cannot be formally validated. Instead, the use cases need to be applied in practice and their effect on the development process should be measured. However, before the use cases can be applied, they need further refinement to become usable. In this validation section, we present these refinements, show the relevance of the use cases, and present the improvement these use cases make to the development process.

¹ http://www.uml.org/

Currently, the Griffin project is conducting case studies at our industrial partners to validate the use cases. In this section we briefly present the Astron Foundation case study. Astron is currently engaged in the development of the LOw Frequency ARray (LOFAR) for radio astronomy². LOFAR pioneers the next generation of radio telescopes and will be the most sensitive radio observatory in the world. It uses many inexpensive antennas combined with software, instead of huge parabolic dishes, to observe the sky. This makes LOFAR a software-intensive telescope. LOFAR will consist of around 15,000 antennas distributed over 77 different stations. Each antenna will generate around 2 Gbps of raw data. The challenge for LOFAR is to communicate and process the resulting 30 Tbps data stream in real time for interested scientists.
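The aggregate data rate quoted above follows directly from the antenna count and the per-antenna rate. A short check, using the rounded figures from the text and assuming decimal units (1 Tbps = 1000 Gbps):

```python
ANTENNAS = 15_000          # "around 15,000 antennas"
RAW_RATE_GBPS = 2          # "around 2 Gbps of raw data" per antenna
STATIONS = 77

total_gbps = ANTENNAS * RAW_RATE_GBPS
total_tbps = total_gbps / 1_000          # decimal prefixes assumed

# Average per station; the actual distribution over stations is not given.
average_gbps_per_station = total_gbps / STATIONS

print(f"total: {total_tbps:.0f} Tbps")
print(f"average per station: {average_gbps_per_station:.1f} Gbps")
```

The per-station value is only an average derived from these rounded numbers, not a figure stated for LOFAR.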

In the LOFAR system, architectural decisions need to be shared and used over a time span of more than 25 years. This is due to the long development time (more than 10 years), and a required operational lifetime of at least 15 years. Astron is judged by external reviewers on the quality of the architecture. The outcome of these reviews influences the funding, and consequently the continuation of the project. Therefore, it is evident that the architecture has to meet high quality standards.

Together with Astron, we identified eight use cases as being of primary concern for the LOFAR case study: 5, 7, 10, 12, 15, 17, 19, and 20. This section focuses on assessing the design maturity, which is a major concern for Astron. Assessing the design maturity is a specialization of the earlier identified need for getting an overview of the architecture (see Section 2.3, need 9). The following use cases are relevant with regard to this assessment:

• Assess design maturity (UC 17, see Figure 2.2)

• Identify incompleteness (UC 20, see Figure 2.3)

• Check correctness (UC 5, see Figure 2.4)

• Evaluate consistency (UC 19, see Figure 2.5)

Of these four use cases, use case 17 is the only Summary level use case (see Figure 2.1). Use cases 5, 19, and 20 are used by this use case. In the remainder of this section, these use cases are presented in more detail. For each use case, the following is presented: the relevance to Astron, the current practice at Astron, a more elaborate description of the use case, and the envisioned concrete realization within Astron.

The elaborated use case descriptions make use of the concept of a knowledge entity. All the domain concepts defined within the knowledge grid are assumed to be knowledge entities. For this case study, this includes among others: architectural decisions, rationales, decision topics, alternatives, requirements, specifications, assumptions, rules, constraints, risks, artifacts, and the relationships among them.

² http://www.lofar.org/


UC 17: Assess design maturity

Description: This use case verifies whether a system conforming to the architecture can be made or bought. The architect wants to know when the architecture can be considered as finished, complete, and consistent.

Primary actor: Project Manager.

Scope: Knowledge grid.

Level: Summary.

Precondition: None.

Postcondition: The knowledge grid provides an overview of the maturity, and reports potential risks.

Main success scenario:

1. Identify incompleteness (UC 20)

2. Check correctness (UC 5)

3. Evaluate consistency (UC 19)

4. The grid generates a report based on the knowledge of the previous steps

Extensions: None

FIGURE 2.2: Use case 17

2.5.1 UC 17: Assess design maturity

Relevance: The design maturity is an important part of the quality of the LOFAR architecture. LOFAR is constructed using cutting-edge technology to reach maximum performance. However, due to the long development time, these cutting-edge technologies are typically still emerging when the initial architecture design is being made. So, the architecture has to be made with components that do not yet exist. It is therefore hard for Astron to judge whether the architecture is sufficiently mature to start actual construction.

Current practice: Within the LOFAR case study, the design maturity is first assessed for the various subsystems by each responsible architect. For each subsystem the main issues with regard to incompleteness, correctness, and consistency are reported. Based on these reports and the opinions of the architects and project management, it is decided whether the system is mature enough to be proposed to the external reviewers to proceed to the next project phase.

Use case realization: The design maturity use case is presented in Figure 2.2. This use case consists of three other use cases that in turn are used to check the architecture for completeness, correctness, and consistency. These use cases are presented in the remainder of this section.
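The main success scenario of use case 17 is a composition: run the three included use cases and collect their findings in a report. The sketch below shows that structure only; `MaturityReport`, `assess_design_maturity` and the stub findings are hypothetical, and the decision whether the design is mature enough is deliberately left to people, as in the current practice at Astron.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class MaturityReport:
    # Findings per included use case, e.g. "UC 20 ..." -> issue descriptions.
    findings: Dict[str, List[str]]

    @property
    def open_issue_count(self) -> int:
        return sum(len(issues) for issues in self.findings.values())


def assess_design_maturity(
    checks: Dict[str, Callable[[], List[str]]]
) -> MaturityReport:
    """Run the included use cases in order (steps 1-3), then report (step 4)."""
    return MaturityReport({name: check() for name, check in checks.items()})


# Stub checks standing in for UC 20, UC 5 and UC 19 (invented findings).
report = assess_design_maturity({
    "UC 20 Identify incompleteness": lambda: ["network topology decision is open"],
    "UC 5 Check correctness": lambda: [],
    "UC 19 Evaluate consistency": lambda: ["subband ordering differs"],
})

if __name__ == "__main__":
    for name, issues in report.findings.items():
        print(name, issues)
    print("open issues:", report.open_issue_count)
```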

2.5.2 Use Case 20: Identify incompleteness

Relevance: Use case 20 determines whether the architecture covers all (essential) requirements. For Astron this is relevant from a management perspective; incompleteness gives pointers to directions where additional effort should be concentrated.


Use Case 20: Identify incompleteness

Description: In this use case, the system provides a report about the structure of the architectural decisions.

Primary actor: Architect.

Scope: Knowledge grid.

Level: User-goal.

Precondition: The user is known within the knowledge grid.

Postcondition: The knowledge grid provides an overview of the incomplete knowledge entities.

Main success scenario:

1. The architect selects a part of the architecture.

2. The knowledge grid identifies the knowledge entities in the part.

3. The grid reports about the incompleteness of these knowledge entities.

Extensions: None

FIGURE 2.3: Use case 20


Current practice: Astron checks for the completeness of the architecture description by peer review and risk assessment. The peer review is done iteratively; fellow architects give feedback (among others) on the completeness of the architectural descriptions. A risk assessment is performed before every external review. The result of this process, a risk matrix, is used for the next iteration of the architectural description. During the design phase, the architect signifies specific points of incompleteness, typically by putting keywords like ‘tbd’ (to be determined), or ‘tbw’ (to be written) in the architecture documents.
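Keyword markers such as ‘tbd’ and ‘tbw’ are simple to collect mechanically. The following is a minimal sketch, assuming the architecture document is available as plain text; the function name `find_open_points` and the sample text are invented for illustration.

```python
import re

# Markers mentioned in the text; matching is case-insensitive on whole words.
MARKERS = {"tbd": "to be determined", "tbw": "to be written"}
_PATTERN = re.compile(r"\b(" + "|".join(MARKERS) + r")\b", re.IGNORECASE)


def find_open_points(text):
    """Return (line number, marker, line) for every marker occurrence."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((line_no, match.group(1).lower(), line.strip()))
    return hits


SAMPLE = (
    "3.2 Network topology: tbd\n"
    "The correlator receives data from all stations.\n"
    "3.3 Station protocol (TBW)\n"
)

if __name__ == "__main__":
    for line_no, marker, line in find_open_points(SAMPLE):
        print(f"line {line_no}: {MARKERS[marker]} -> {line}")
```

A list like this could feed the "to-do" style overview of open points; it finds only what the architect has explicitly marked.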

For example, in the central processor, the signals of the antennas should be correlated with each other. Therefore, the signals of all the stations should be routed all-to-all. However, the architectural decision on what network topology to use for this task is still incomplete, as some alternatives have been evaluated, but no suitable (cost-effective) solution could be selected so far.

Use case realization: The general use case is described in Figure 2.3. For Astron this is realized by the following:

Risks: Currently, the relationships between the identified risks (for example in the risk matrix) and the design (in the architectural documentation) are not explicitly clear. The knowledge grid allows the architect to relate risks to particular parts of the design and to architectural decisions. This use case enables the architect to partially check the completeness of the mitigation of risks, as every risk should be addressed by at least one architectural decision. Whether the risk is actually addressed by the decision is checked by UC 5, presented in the next subsection.

Requirements: An example incompleteness indicator is that every requirement should lead to one or more (mostly non-formal, usually textual) specifications. It can thus be automatically determined which requirements are not covered by any specifications.
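Both indicators above (every risk addressed by at least one decision, every requirement leading to at least one specification) are coverage checks over relationships between knowledge entities. A minimal sketch, assuming these relationships are available as dictionaries; all identifiers (`uncovered`, `R1`, `S1`, `K1`, `AD4`, ...) are invented for illustration.

```python
def uncovered(sources, links):
    """Return the ids in `sources` that are not linked to any target."""
    return sorted(s for s in sources if not links.get(s))


# Hypothetical knowledge entities and relationships.
requirements = {"R1", "R2", "R3"}
specifications_for = {"R1": ["S1"], "R2": []}          # R3 has no entry at all

risks = {"K1", "K2"}
decisions_addressing = {"K1": ["AD4"], "K2": ["AD7", "AD9"]}

if __name__ == "__main__":
    print("requirements without specification:",
          uncovered(requirements, specifications_for))
    print("risks without decision:",
          uncovered(risks, decisions_addressing))
```

Such a check only establishes that a link exists; whether the linked decision really addresses the risk remains a matter for the correctness check of UC 5.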

Visualization: Incompleteness can be visualized by a “to-do” list of open decision topics, or visual indicators in the documentation (e.g. icons, or coloring of text pieces).

Use Case 5: Check correctness

Description: In this use case, the knowledge grid supports the user in validating the correctness of the architectural decisions addressing the requirements.

Primary actor: Architect.

Scope: Knowledge grid.

Level: Subfunction.

Precondition: The knowledge grid contains incompleteness information of the design.

Postcondition: The knowledge grid contains markings about the correctness; an overview of incorrect knowledge entities is provided.

Main success scenario:

1. The architect selects a set of requirements in the knowledge grid.

2. The knowledge grid provides a list of related architectural decisions and other related knowledge entities (e.g. assumptions, rules, constraints).

3. The architect evaluates related elements and marks the incorrect elements.

4. The architect continues with the next requirement.

5. The knowledge grid provides an overview of incorrect architectural decisions and requirements.

Extensions:

3a. The elements are correct, the architect marks them as such.

FIGURE 2.4: Use case 5

2.5.3 Use Case 5: Check correctness

Relevance: Besides completeness, it is also important to know whether the architectural decisions actually address the requirements. In this sense, correctness is complementary to completeness. For example, completeness only means that decisions are taken with respect to all requirements, while correctness means that these decisions actually lead to a solution that meets these requirements. Astron spends considerable effort in verifying the correctness of the design. Prototypes of major hardware and software components are made and evaluated. Simulations and models are used as well. For example, to deal with the major concern of the enormous amounts of data to be processed, Astron has developed an elaborate performance model. This model allows the architects to simulate and validate the correctness of many different concepts for distributed data processing.

Current practice: Similar to the check for completeness, peer reviews are used to verify the correctness of the design description. Domain experts verify the design description created by the architect. Based on this feedback, the architect adapts the design description. Doubts about the correctness of parts of the design are typically annotated with the keyword ‘tbc’ (to be confirmed) or placed in a separate open issues section. If there is any doubt about the way in which the correctness is verified, keywords like ‘under discussion’ are typically used in the architectural documentation.

For example, there has been an incorrect assessment of the distributed behavior of the calibration algorithm used. It was expected that each node used 80% local data, and that for the remaining 20% all the data on the other nodes was needed. Based on this assessment, the architectural decision was made to use a distributed database grid. However, during performance tests it turned out that for this 20% the data of only one or two other nodes was needed, instead of all the other nodes. Consequently, the architectural decision turned out to be wrong, as a centralized database is a significantly better alternative. In retrospect, verification of the architectural decision by the correct domain expert could have prevented this situation from arising in the first place.

Use case realization: The knowledge grid itself cannot determine the correctness of the architectural decisions without in-depth semantic knowledge of the underlying architectural models. Therefore, this use case provides for assisting the architect in determining the correctness of the design, rather than having the knowledge grid determine the correctness itself.

Requirements: For each requirement or risk, the architect needs to find out if the involved architectural decisions correctly address the requirement or risk. This use case describes how this process can be supported.

Visualization: The visualization of the incompleteness can subsequently be used to visualize incorrect elements. However, since the checking of correctness is mostly a manual job for the architect, the results may vary when different people are checking the correctness. Integration of this information is then needed.
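How the correctness markings of different people should be integrated is not specified here. One possible policy, shown purely as an assumption, is to treat unanimous markings as settled and to flag any disagreement for discussion; the names `integrate_marks`, `AD1`, and the reviewer names are invented.

```python
from collections import defaultdict


def integrate_marks(marks):
    """Combine (reviewer, entity, is_correct) markings per knowledge entity.

    Assumed policy: unanimous marks are taken over, mixed marks are 'disputed'.
    A later mark by the same reviewer on the same entity replaces the earlier one.
    """
    by_entity = defaultdict(dict)
    for reviewer, entity, is_correct in marks:
        by_entity[entity][reviewer] = is_correct

    summary = {}
    for entity, votes in by_entity.items():
        values = set(votes.values())
        if values == {True}:
            summary[entity] = "correct"
        elif values == {False}:
            summary[entity] = "incorrect"
        else:
            summary[entity] = "disputed"
    return summary


MARKS = [
    ("alice", "AD1", True),
    ("bob", "AD1", True),
    ("alice", "AD2", False),
    ("bob", "AD2", True),
    ("alice", "AD3", False),
]

if __name__ == "__main__":
    print(integrate_marks(MARKS))
```

Disputed entities are the ones where a review discussion, or a domain expert, is most likely to add value.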

2.5.4 Use Case 19: Evaluate consistency

Relevance: This use case is concerned with the consistency between the architectural decisions themselves. As the LOFAR project consists of many components that are developed in parallel, detecting contradictions is important, as this provides an early warning for mistakes in the overall design. Inconsistencies make the design of the system harder to understand and create problems during the realization of the system.

Current practice: Checking for inconsistencies in textual descriptions is largely a manual job. The part of the design that is modeled (e.g. in the performance models) can automatically be checked for inconsistencies. However, these models only cover a very small part of the overall design, and therefore a small part of the architectural decisions. Most of the inconsistencies are found by inspection, either by the architect or reviewer.

For example, there has been an inconsistency in LOFAR between the protocol used by the central processor (the correlator of the radio signals) and the stations (the locations where the antennas reside). Although large efforts have been put into a consistent definition of the data packet header, versioning etc., the definition used for interpreting the subband data turned out to be inconsistent. For the station a subband was defined starting with the lowest frequency leading to the highest frequency of the subband, while for the central processor it was defined the other way around.

Use case realization: The architect is supported with relevant context information in the decision making process. For Astron, this will include the visualization of relevant requirements, and closely related architectural decisions and specifications. Techniques similar to the work of [83] could be used for this. Furthermore, once an inconsistency is detected, the architect is supported with a visualization of the relevant architectural decisions. This allows the architect not only to confirm an inconsistency, but also to detect its cause and consequently resolve it.

Use Case 19: Evaluate consistency

Description: In this use case, the knowledge grid supports the user in detecting inconsistencies in the architecture design.

Primary actor: Architect.

Scope: Knowledge grid.

Level: User-goal.

Precondition: The user is known within the knowledge grid.

Postcondition: The knowledge grid contains markings about the consistency; an overview of inconsistent knowledge entities is provided.

Main success scenario:

1. The architect selects a subset of architectural knowledge in the grid.

2. The architect selects a specific knowledge entity or a part of the design, and asks the knowledge grid for consistency assistance.

3. The knowledge grid provides a list of related (and potentially inconsistent) knowledge entities.

4. The architect marks the inconsistent knowledge entities.

5. The architect repeats steps 3 and 4 for the remaining knowledge entities.

6. The knowledge grid provides an overview of inconsistent knowledge.

Extensions:

4a. The knowledge entities are consistent, the architect marks them as such.

FIGURE 2.5: Use case 19
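The consistency assistance described for use case 19 has two parts: listing the knowledge entities related to a selected one, and pointing at candidates that may contradict each other. The sketch below illustrates both under strong assumptions: relationships are undirected pairs, and each entity carries a small dictionary of defined terms. The subband example mirrors the LOFAR case; all names and values are otherwise invented, and the final judgment stays with the architect.

```python
from itertools import combinations


def related_entities(relations, entity):
    """Entities directly related to `entity` (cf. step 3 of the scenario)."""
    related = set()
    for a, b in relations:
        if a == entity:
            related.add(b)
        elif b == entity:
            related.add(a)
    return sorted(related)


def conflicting_definitions(definitions):
    """Return (term, entity_a, entity_b) where two entities define a term differently."""
    conflicts = []
    for (name_a, terms_a), (name_b, terms_b) in combinations(
            sorted(definitions.items()), 2):
        for term in sorted(terms_a.keys() & terms_b.keys()):
            if terms_a[term] != terms_b[term]:
                conflicts.append((term, name_a, name_b))
    return conflicts


RELATIONS = [
    ("protocol decision", "station"),
    ("protocol decision", "central processor"),
]

DEFINITIONS = {
    "station": {"subband_order": "lowest-to-highest", "header_version": 1},
    "central processor": {"subband_order": "highest-to-lowest", "header_version": 1},
}

if __name__ == "__main__":
    print(related_entities(RELATIONS, "protocol decision"))
    print(conflicting_definitions(DEFINITIONS))
```

A check of this kind only works for the small part of the design that is captured in a structured form; for textual descriptions the list of related entities is the main support.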

2.6 Related Work

Software architecture design methods [16, 26] focus on describing how sound architectural decisions can be made. Architecture assessment methods, like ATAM [16], assess the quality attributes of a software architecture. The use cases presented in this chapter describe some assessment scenarios that could be reused from these design and assessment methods.

Software documentation approaches [42, 90] provide guidelines for the documentation of software architectures. However, these approaches do not explicitly capture the way architectural decisions are taken and the rationale behind those decisions. The presented use cases describe how stakeholders would like to work with this knowledge.

Architectural Description Languages (ADLs) [138] do not capture the decision-making process in software architecting either. An exception is the architectural change management tool Mae [88], which tracks changes of elements in an architectural model using a revision management system. However, this approach lacks the notion of architectural decisions and does not capture the considered alternatives or rationales, something the knowledge grid does.

Architectural styles and patterns [36, 164] describe common (collections of) architectural decisions, with known benefits and drawbacks. Tactics [16] are similar, as they provide clues and hints about what kind of techniques can help in certain situations. Use case 22 (Detect patterns of architectural decision dependencies) can be used to find these kinds of decisions.

Currently, there is more attention in the software architecture community for the decisions behind the architectural model. Tyree and Akerman [180] provide a first approach to documenting design decisions for software architectures. Concepts and guidelines for explicit representations of architectural decisions can be found in the work of Babar et al. [142] and our own work [100, 189]. Closely related to this is the work of Lago and van Vliet [123]. They model assumptions on which architectural decisions are often based, but not the architectural decisions themselves. Kruchten et al. [112] stress the importance of architectural decisions, and show classifications of architectural decisions and the relationships between them. They define some rough outlines for the use cases for describing how to use architectural knowledge. Furthermore, they provide an ontology-based visualization of the knowledge in the grid. We place more emphasis on the explicit modeling of the use cases and are validating a set of extended use cases in the context of a case study.

Integration of rationale and design is done in the field of design rationale. SEURAT [35] maintains rationales in a RationaleExplorer, which is loosely coupled to the source code. These rationales have been transferred to the design tool, to let the rationales of the architecture and implementation level be maintained correctly. DRPG [15] couples rationale of well-known design patterns with elements in a Java implementation. Just like SEURAT, DRPG also depends on the fact that the rationale of the design patterns is added to the system in advance. The importance of having support for design rationales was emphasized by the survey conducted by Tang et al. [171]. The results emphasized the current lack of good tool support for managing design rationales. The use cases presented in this chapter are an excellent start for requirements for such tools.

From the knowledge management perspective, a web-based tool for managing architectural knowledge is presented in [142]. This approach uses tasks to describe the use of architectural knowledge. These tasks are much more abstract than the use cases defined in this chapter (e.g. architectural knowledge use, architectural knowledge distribution).

Finally, another relevant approach is the investigation of the traceability from the architecture to the requirements [195]. Wang uses Concern Traceability maps to re-engineer the relationships between the requirements, and to identify the root causes. The results from such systems could be valuable input for defining the relationships between knowledge entities, as used in our validation.

2.7 Conclusions and Future Work

In order to upgrade the status of architectural decisions, we must first understand how they can be shared and used by a software development organization. For this purpose, we have proposed a use case model that came out of industrial needs and aims to fill specific gaps and in particular to alleviate the dissipation of architectural decisions. This use case model is considered as the black-box view of a knowledge grid type of system that is envisioned to enrich the architecting process with architectural decisions.

A reasonable question to reflect upon is: how exactly was the software architecture quality enhanced by the use case model proposed in this chapter? Although pinpointing what exactly constitutes the quality of software architecture per se is a difficult issue, we can identify five arguments in this case:

• Less expensive system evolution. As the systems need to change in order to deal with new requirements, new architectural decisions need to be taken. Adding, removing and modifying architectural decisions can be based on the documentation of existing architectural decisions that reflect the original intent of the architects. Moreover, architects may be less tempted to violate or override existing decisions, and they cannot neglect to remove them. In other words, the architectural decisions are enforced during evolution and the problem of architectural erosion [149] is reduced.

• Enhanced stakeholder communication. The stakeholders come from different backgrounds and have different concerns that the architecture document must address. Architectural decisions may serve the role of explaining the rationale behind the architecture to all stakeholders. Furthermore, the explicit documentation of architectural decisions makes it more effective to share them among the stakeholders, and subsequently perform tradeoffs, resolve conflicts, and set common goals.

• Improved intrinsic characteristics of the architecture. These concern attributes of the architecture, such as conceptual integrity, correctness, completeness and buildability [16]. Architectural decisions can support the development team in upgrading such attributes, because they give more complete knowledge and provide a clearer and bigger picture. In other words, architectural decisions provide much richer input to the formal (or less formal) methods that will be used to evaluate these attributes.

• Extended architectural reusability. Reuse of architectural artifacts, such as components and connectors, can be more effectively performed when the architectural decisions are explicitly documented in the architecture. To reuse architectural artifacts, we need to know why they were chosen, what their alternatives were, and what benefits and liabilities they bring about. Such reusability prevents the architects from re-making past mistakes or making new mistakes. Finally, architectural decisions per se can and should be reused, probably after slight modifications.

• Extended traceability between requirements and architectural models. Architectural decisions realize requirements (or stakeholders’ concerns) on the one hand, and result in architectural models on the other hand. Therefore, architectural decisions are the missing link between requirements and architectural models and provide a two-way traceability between them [189]. The architect and other stakeholders can thus reason which requirements are satisfied by a specific part of the system, and vice-versa, which part of the system realizes specific requirements.
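The two-way traceability described in the last argument can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the identifiers `R1`, `R2`, `D1`, `D2`, `A` and `B` and the functions `elements_for` and `requirements_for` are hypothetical and do not come from the use case model or from any tool.

```python
from typing import List, Set, Tuple

# A decision is sketched as (name, requirements it realizes, model elements it results in).
Decision = Tuple[str, Set[str], Set[str]]


def elements_for(requirement: str, decisions: List[Decision]) -> Set[str]:
    """Forward trace: which model elements realize the given requirement?"""
    found: Set[str] = set()
    for _name, requirements, elements in decisions:
        if requirement in requirements:
            found |= elements
    return found


def requirements_for(element: str, decisions: List[Decision]) -> Set[str]:
    """Backward trace: which requirements does the given model element satisfy?"""
    found: Set[str] = set()
    for _name, requirements, elements in decisions:
        if element in elements:
            found |= requirements
    return found


# Hypothetical data: two decisions linking two requirements to two elements.
decisions: List[Decision] = [
    ("D1", {"R1", "R2"}, {"A"}),
    ("D2", {"R2"}, {"B"}),
]
```

Both queries only walk the documented decisions, which is the point of the argument: without the decisions in between, neither direction of the trace can be answered.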

We are currently trying to validate the use case model in four industrial case studies to better understand the pragmatic industrial needs and make the use case model as relevant and effective as possible. After this validation, we plan to perform a second iteration of validation interviews with the original interviewees from the first iteration, as well as more stakeholders with different roles, in order to fully cover the most significant roles. Furthermore, external architects will also be asked to validate the use case model. In the meantime we have already attempted to implement parts of the knowledge grid in the form of tool support, which is used in the aforementioned case study of the Astron Foundation.

Acknowledgements

This research has partially been sponsored by the Dutch Joint Academic and Commercial Quality Research & Development (Jacquard) program on Software Engineering Research via contract 638.001.406 GRIFFIN: a GRId For inFormatIoN about architectural knowledge.

Appendix: Griffin questionnaire

This appendix contains the questionnaire that was used during the interview. The questionnaire was sent to most interviewees in advance and used by the Griffin research team to see whether all relevant subjects have been discussed during the interview.

Introduction of yourself

1. Can you describe your role and responsibilities within the organization?

2. Can you give an estimate on what percentage of your time is spent on activities related to architecture? Examples include capturing architectural knowledge, communicating architectural knowledge to stakeholders, et cetera.

3. With what kind of stakeholders in the architecting process do you interact most?


Architecture

1. For the sake of clarity: what is your definition of “(software) architecture”? Does this definition differ from the generally accepted definition within your organization?

2. Can you describe the software design process, and the place the architecture takes in it?

3. Are architectures kept up-to-date during evolution? What techniques are used in keeping the architectures up-to-date?

4. For what stakeholders are architecture documents created? What are (generally speaking) the most important stakeholders?

5. Are tools, methods, templates, or architectural description languages (ADLs) used in constructing an architecture?

6. Are you satisfied with the way these tools, methods, templates, or ADLs are being utilized during the architecture construction process? Can you mention any improvement points?

Architectural knowledge

1. What architectural decisions are documented, and how?

2. Can you quantify the impact of Architectural Knowledge that is lost / not present / too implicit? Can you give examples?

3. Could you provide a top-3 list of burdens in modelling architectures? What are your ideas on this?

Your architectures of today and in the past

1. What are the most important quality characteristics of your architecture (or architectures)?

2. What kind of solutions do you provide in your design? Do you reuse certain solutions (e.g. architectural patterns) in your architectures?

3. (If possible to mention commonalities): With what kinds of design aspects do you deal explicitly in your architectures? Examples of design aspects include interfaces, error handling, execution architecture, data consistency, and robustness.

4. From what sources do you obtain information for these design aspects?

5. Is there a topic on which you foresee a big change in the use of architectures in the future?


Architecting in daily practice

1. How is the availability of architectural information planned, managed, and reviewed?

2. What will be (in your opinion) a big change in the architect’s job in the future?

3. Looking back on the last few years, what would you reckon as a significant step forward in architecting support?

4. How would you prepare for this?

5. How is this change planned?


Chapter 3

Design Decisions: the Bridge between Rationale and Architecture

“But which is the stone that supports the bridge?”

- Kublai Khan

This chapter is based on: Jan Salvador van der Ven, Anton Jansen, Jos Nijhuis, and Jan Bosch. “Design Decisions: The Bridge between Rationale and Architecture”. In: Rationale Management in Software Engineering. Springer, 2006, pp. 329–348.¹

Abstract

Software architecture can be seen as a decision making process; it involves making the right decisions at the right time. Typically, these design decisions are not explicitly represented in the artifacts describing the design. They reside in the minds of the designers and are therefore easily lost. Rationale management is often proposed as a solution, but lacks a close relationship with software architecture artifacts. Explicit modeling of design decisions in the software architecture bridges this gap, as it allows for a close integration of rationale management with software architecture. This improves the understandability of the software architecture. Consequently, the software architecture becomes easier to communicate, maintain and evolve. Furthermore, it allows for analysis, improvement, and reuse of design decisions in the design process.

3.1 Introduction

Software design is currently seen as an iterative process. Often used phases in this process include: requirements discussions, requirements specification, software architecting, implementation and testing. The Rational Unified Process (RUP) is an example of an iterative design process split into several phases. In such an iterative design process, the software architecture has a vital role [149].

Architects describe the bare bones of the system by making high-level design decisions. Errors made in the design of the architecture generally have a huge impact on the final result. A lot of effort is spent on making the right design decisions in the initial design of the architecture. However, the argumentation underlying the architecture is usually not documented, because the focus is only on the results of the decisions (the architectural artifacts). Therefore, the evaluated alternatives, the trade-offs made, and the rationale behind the decisions remain in the heads of the designers. This tacit knowledge is easily lost. The lost architecture knowledge leads to evolution problems [75], increases the complexity of the design [27], and obstructs the reuse of design experience [114].

¹ The authors Jan Salvador van der Ven and Anton Jansen contributed equally to this research.

To solve the problem of lost architectural knowledge, techniques for managing rationale are often proposed. Experiments show that maintaining rationale in the architecture phase increases the understandability of the design [30]. However, creating and maintaining this rationale is very time-consuming. The connection to the architectural and design artifacts is usually very loose, making the rationale hard to use and keep up-to-date during the evolution of the system. Consequently, there seems to be a gap between rationale management and software architecture.

To bridge this gap, we unite rationale and architectural artifacts into the concept of a design decision, which couples rationale with software architecture. Design decisions are integrated with the software architecture design. By doing this, the rationale stays in the architecture, making it easier to understand, communicate, change, maintain, and evolve the design.

Section 3.2 of this chapter introduces software architectures. Section 3.3 discusses how rationale is used in software architectures. Section 3.4 introduces the concept of design decisions. Section 3.5 presents a concrete approach that uses this concept. After this, related and future work is discussed, followed by a summary, which concludes this chapter.

3.2 Software architecture

This section focuses on the knowledge aspects of software architectures. In the following subsection, the software architecture design process is discussed. Next, different ways are presented to describe software architectural knowledge. This section ends with a discussion on the issue of knowledge vaporization in software architecture.

3.2.1 The software architecture design process

A software architecture is based on the requirements for the system. Requirements define what the system should do, whereas the software architecture describes how this is achieved. Many software architecture design methods exist (e.g. [16] and [26]). They all use different methodologies for designing software architectures. However, they can all be summarized in the same abstract software architecture design process.

Figure 3.1 provides a view of this abstract software design process and its associated artifacts. The main input for a software architecture design process is the requirements document. During the initial design the software architecture is created, which satisfies (parts of) the requirements stated in the requirements document. After this initial design phase, the quality of the software architecture is assessed.


FIGURE 3.1: An abstract view on the software architecture design process

When the quality of the architecture is not sufficient, it is modified (architectural modification).

To modify the architecture, the architect can, among other things, employ a number of tactics [16] or adopt one or more architectural styles or patterns [164] to improve the design. This is repeated until the quality of the architecture is assessed as sufficient.
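The loop of initial design, assessment and architectural modification can be summarized in a short Python sketch. It is a minimal illustration of the abstract process only: the function `design_architecture`, its parameters, the iteration cap and the toy callbacks are our own assumptions, and a real assessment is of course far richer than a set comparison.

```python
from typing import Callable, List, Set

Architecture = Set[str]  # assumption: an architecture is reduced to the requirements it covers


def design_architecture(
    requirements: List[str],
    initial_design: Callable[[List[str]], Architecture],
    assess: Callable[[Architecture, List[str]], bool],
    modify: Callable[[Architecture, List[str]], Architecture],
    max_iterations: int = 10,
) -> Architecture:
    """Create an initial design, then assess and modify until the quality is sufficient."""
    architecture = initial_design(requirements)
    for _ in range(max_iterations):
        if assess(architecture, requirements):   # assessment: sufficient, so done
            return architecture
        architecture = modify(architecture, requirements)  # architectural modification
    raise RuntimeError("quality still insufficient after max_iterations")


# Toy callbacks (hypothetical): the first draft covers one requirement,
# and each modification covers one more.
def first_draft(reqs: List[str]) -> Architecture:
    return set(reqs[:1])


def covers_all(arch: Architecture, reqs: List[str]) -> bool:
    return set(reqs) <= arch


def cover_one_more(arch: Architecture, reqs: List[str]) -> Architecture:
    missing = next(r for r in reqs if r not in arch)
    return arch | {missing}


result = design_architecture(
    ["play CD", "update software", "show song information"],
    first_draft, covers_all, cover_one_more,
)
```

Note that the loop returns only the final architecture. The intermediate modifications, and the reasons for them, are discarded, which is exactly the loss of knowledge discussed later in this chapter.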

3.2.2 Describing software architectures

There is no general agreement of what a software architecture is and what it is not. This is mainly due to the fact that software architecture has many different aspects, which are either technically, process, organization, or business oriented [26]. Consequently, people perceive and express software architectures in many different ways. Due to the many different notions of software architectures, a combination of different levels of knowledge is needed for its description. Roughly, the following three levels are usually discerned:

• Tacit/Implicit knowledge. In many cases, (parts of) software architectures are not explicitly described or modeled, but remain as tacit information inside the head(s) of the designer(s). Making this implicit knowledge explicit is expensive, and some knowledge is not supposed to be written down, for example for political reasons. Consequently, (parts of) software architectures of many systems remain implicit.

• Documented knowledge. Documentation approaches provide guidelines on which aspects of the architecture should be documented and how this can be achieved. Typically, these approaches define multiple views on an architecture for different stakeholders [95]. Examples include: the Siemens four view [90], and the work of the Software Engineering Institute [42].

• Formalized knowledge. Formalized knowledge is a specialized form of documented knowledge. Architecture Description Languages (ADLs) [138], formulas and calculations concerning the system are examples of formalized knowledge. An ADL provides a clear and concise description of the used architectural concepts, which can be communicated, related, and reasoned about. The advantage of formalized knowledge is that it can be processed by computers.

Often, the different kinds of knowledge are used simultaneously. For example, although UML was not designed for this purpose, it is often used to model certain architectural concepts [42]. The model structure of UML contains formalized knowledge, which needs explanation in the form of documented knowledge. However, the use of the models is not unambiguous, and it is often found that UML is used in different ways. This implies the use of tacit knowledge to be able to understand and interpret the UML models in different contexts.

3.2.3 Problems in software architecture

There are several major problems with software architecture design [27, 99, 114]. These problems come from the large amount of tacit architectural knowledge. Currently, none of the existing approaches to describe software architectures (see previous subsection) gives guidelines for describing the design decisions underlying the architecture. Consequently, design decisions only exist in the heads of the designers, which leads to the following problems:

• Design decisions are cross cutting and intertwined: Typical design decisions affect multiple parts of the design. However, these design decisions are not explicitly represented in the architecture. So, the associated architectural knowledge is fragmented across various parts of the design, making it hard to find and change the decisions.

• Design rules and constraints are violated: During the evolution of the system, designers can easily violate design rules and constraints arising from previously taken design decisions. Violations of these rules and constraints lead to architectural drift [149], and its associated problems (e.g. increased maintenance costs).

• Obsolete design decisions are not removed: When obsolete design decisions are not removed, the system has the tendency to erode more rapidly. In the current design practice removing design decisions is avoided, because of the effort needed, and the unexpected effects this removal can have on the system.

Because of these problems, the developed systems have a high cost of change, and they tend to erode quickly. Also, the reusability of the architectural artifacts is limited if design decision knowledge vaporizes into the design. These problems are caused by the focus in the software architecture design process on the resulting artifacts (e.g. components and connectors), instead of the decisions that lead to them. Clearly, design decisions currently lack a first-class representation in software architecture designs.


FIGURE 3.2: An abstract view on the rationale management process

3.3 Rationale in software architecture

To tackle the problems described in the previous section, the use of rationale is often proposed. Rationale in the context of architectures describes and explains the used concepts, considered alternatives, and structures of systems [95]. This section describes the use of rationale in software architectures. First, an abstract rationale construction process is introduced. Then, the reasons for rationale use in software architecture are described. The section is concluded with a summary of problems for current rationale use in software architecture.

3.3.1 The rationale construction process

A general process for creating rationale is visualized in Figure 3.2. First, the problems are identified (problem identification) and described in a problem statement. Then, the problems are evaluated (problems remaining) one by one, and solutions are created (create solutions) for a problem. These solutions are evaluated and weighed for their suitability for solving the problem at hand (decision making). The best solution (for that situation) is chosen, and the choice is documented together with its rationale (Choice + Rationale). If new problems emerge from the decision made, they have to be written down and be solved within the same process.

This process is a generalized view from different rationale based approaches (like the ones described in Section 3.3). Take for example QOC, and the scenario described in [134]. The design of a scroll bar for a user interface is discussed. There are several questions (problems), like "Q1: How to display?". For this question, there are two options (solutions) described, "O1: permanent" and "O2: appearing". In the described example, the second option is considered as the best one, and selected. However, this option generated a new question (problem), "Q2: How to make it appear?". This new question needs to be solved in the same way. Other rationale management methods can be mapped on this process view too.
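The process of Figure 3.2 can be sketched as a loop over a queue of remaining problems. The Python below is an illustration under our own assumptions: the names `Record` and `run_rationale_process` are hypothetical, the rationale text is a placeholder, and the options listed for Q2 are placeholders as well, since the scenario as summarized here does not give them.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List, Tuple


@dataclass
class Record:
    """One documented 'Choice + Rationale' artifact, including the rejected solutions."""
    problem: str
    chosen: str
    rejected: List[str]
    rationale: str


def run_rationale_process(
    problem_statement: List[str],
    create_solutions: Callable[[str], List[str]],
    decide: Callable[[str, List[str]], Tuple[str, str]],
    follow_ups: Callable[[str, str], List[str]],
) -> List[Record]:
    remaining: Deque[str] = deque(problem_statement)
    records: List[Record] = []
    while remaining:                                   # problems remaining?
        problem = remaining.popleft()
        solutions = create_solutions(problem)          # create solutions
        chosen, rationale = decide(problem, solutions) # decision making
        rejected = [s for s in solutions if s != chosen]
        records.append(Record(problem, chosen, rejected, rationale))
        remaining.extend(follow_ups(problem, chosen))  # new problems re-enter the process
    return records


# The QOC scroll bar scenario. Options for Q2 are hypothetical placeholders.
options = {
    "Q1: How to display?": ["O1: permanent", "O2: appearing"],
    "Q2: How to make it appear?": ["option A (hypothetical)", "option B (hypothetical)"],
}
choices = {
    "Q1: How to display?": "O2: appearing",
    "Q2: How to make it appear?": "option A (hypothetical)",
}
new_problems = {("Q1: How to display?", "O2: appearing"): ["Q2: How to make it appear?"]}

records = run_rationale_process(
    ["Q1: How to display?"],
    create_solutions=lambda problem: options[problem],
    decide=lambda problem, solutions: (choices[problem], "rationale text (placeholder)"),
    follow_ups=lambda problem, chosen: new_problems.get((problem, chosen), []),
)
```

Selecting "O2: appearing" puts Q2 on the queue, so the run produces two records, one per question, each with its rejected options kept alongside the choice.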


3.3.2 Reasons for using rationale in software architecture

There are many reasons for using rationale in software projects. Here, the most important reasons are summarized and related to the problems existing in software architecture.

• Supporting reuse and change. During the evolution of a system and its architecture, often the rules and constraints from previous decisions are violated. Rationale needs to be used to give the architects insight in previous decisions.

• Improving quality. As posed in the previous section, design decisions tend to get cross-cut and intertwined. Rationale based solutions are used to check consistency between decisions. This helps in managing the cross-cutting concerns.

• Supporting knowledge transfer. Rationale can be used to communicate the design. Transfer of knowledge can be done over two dimensions: location (different departments or companies across the world) and time (evolution, maintenance). Transferring knowledge is one of the most important goals of an architecture.

3.3.3 Problems of rationale use in software architecture

As described in this section, rationale could be beneficial in architecture design. However, most methods developed for capturing rationale in architecture design suffer from the following problems:

• Capture overhead. Despite the attempt to automate the rationale capture process, both during and after the design, it is still a laborious process.

• For the designers, it is hard to see the clear benefit of documenting rationale about the architecture. Usually most of the rationale captured is not used by the designers themselves, and therefore capturing rationale is generally seen as boring and useless work.

• The rationale typically loses the context in which it was created. When rationale is communicated in documented or formalized form, additional tacit information about the context is lost.

• There is no clear connection from the architectural artifacts to the rationale. Because the rationale and the architectural artifacts are usually kept separated, it is very hard to keep them synchronized. Especially when the system is evolving, the design artifacts are updated, while the rationale documentation tends to deteriorate.

As a consequence of these problems, rationale based approaches are not often used in architecture design. However, as described in Section 3.2, there is a need for documenting the reasons behind the design. The following section describes an approach which couples rationale to architecture.


3.4 Design decisions: the bridge between rationale and architecture

The problems, as described in Subsection 3.2.3 and Subsection 3.3.3, can be addressed by the same solution. This is done by including rationale and architectural artifacts into one concept: the design decision. In the following subsection, the two processes from Subsections 3.2.1 and 3.3.1 are compared. Subsection 3.4.2 introduces design decisions by example, and Subsection 3.4.3 defines them. The last subsection discusses designing with design decisions.

3.4.1 Enriching architecture with rationale

The processes described in Subsections 3.2.1 and 3.3.1 have some clear resemblances. Problems (requirements) are handled by Solutions (software architectures / modifications), and the assessment determines if all the problems are solved adequately. The artifacts created in both processes tend to describe the same things (see Figure 3.3). However, the software architecture design process focuses on the results of the decision process, while the rationale management focuses on the path to the decision.

Some knowledge which is captured in the rationale management process is missing in the architecture design process (depicted as black boxes in Figure 3.3). There are two artifacts which contain knowledge that is not available in the software architecture artifact: not selected solutions and choice + rationale. On the other hand, the results of the design process (the architecture and architectural modifications), are missing in the rationale management process.

The concept of first-class represented design decisions, composed of rationale, architectural modifications, and alternatives, is used to bring the two processes together. A software architecture design process no longer results in a static design description of a system, but in a set of design decisions leading up to the system. The design decisions reflect the rationale used for the decision making process, and form the natural bridge between rationale and the resulting architecture.

3.4.2 CD player: a Design Decision Example

This subsection presents a simple case, which shows the impact of designing an architecture with design decisions. The example is based on the design of a compact disc (CD) player. Changing customers’ needs have made the software architecture of the CD player insufficient. Consequently, the architecture needs to evolve.

The software architecture of the CD player is presented in the top of Figure 3.4, the current design. The design decisions leading to the current design are not shown in Figure 3.4 and are instead represented as one design decision. The CD player’s architecture is visualized in a component and connector view [42]. The components are the principal computational elements that execute at run-time in the CD player. The connectors represent which component has a run-time pathway of interaction with another component.

Two functional additions to the software architecture are described. First, a software-update mechanism is added. This is used to update the CD player, to make it easier to fix bugs and add new functionality in the future. Second, the internet connection is used to download song information for the played CD, like song texts, additional artist information, etc.

FIGURE 3.3: Similarities between software architecture design process and the rationale management process (panels: software architecture design, rationale management; legend: process, artifact, decision, unrepresented design knowledge, corresponding artifacts)

As shown in Figure 3.4, design decisions are taken to add the described functionality. The design decisions contain the rationale and the functional solution, represented as documentation and an architectural component and connector view. Note that the rationale in the picture is heavily shortened because of space limitations. The added functionality is directly represented by two design decisions, Updater and SongDatabase.

The first idea for solving the internet connectivity was to add a component which handled the communication to the Patcher. This idea was rejected, and another alternative was considered, to create a change to the Hardware Controller. This change enabled the internet connectivity for the Internet song db too, and was considered better because it could reuse a lot of the functionality of the existing Hardware Controller. Note that the view on the current design shows a complete architecture, while it is also a set of design decisions. The resulting design (Figure 3.5) is visualized with the two design decisions taken: the Updater and the SongDatabase.

3.4.3 Design decisions

In the example of Subsection 3.4.2, the software architecture of the CD player is the set of design decisions leading to a particular design, as depicted in Figure 3.4. In the classical notion of system design, only the result depicted in Figure 3.5 is visible; the design decisions leading up to that design are not captured.
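The difference between the two notions can be illustrated with a small Python sketch in which the resulting design is derived by replaying design decisions in order. This is our own simplification: the function `replay` and the dictionary keys `adds`, `modifies` and `removes` are hypothetical, and only two components of the given architecture are listed, namely the ones that the Song Database decision is said to modify.

```python
from typing import Dict, List, Set

Decision = Dict[str, Set[str]]  # assumption: a decision reduced to its changes per kind


def replay(decisions: List[Decision]) -> Set[str]:
    """Derive the resulting set of components by applying the decisions in order."""
    components: Set[str] = set()
    for decision in decisions:
        touched = decision.get("modifies", set()) | decision.get("removes", set())
        missing = touched - components
        if missing:
            # A decision can only change what earlier decisions have introduced.
            raise ValueError(f"decision refers to unknown components: {sorted(missing)}")
        components -= decision.get("removes", set())
        components |= decision.get("adds", set())
    return components


# The given architecture, represented as one design decision (partial, assumed).
given_architecture: Decision = {"adds": {"Info Shower", "Internet Connection"}}

# The Song Database decision: one addition and two modifications.
song_database: Decision = {
    "adds": {"Internet Song Database"},
    "modifies": {"Info Shower", "Internet Connection"},
}

resulting_design = replay([given_architecture, song_database])
```

The value of `resulting_design` corresponds to the classical view of Figure 3.5: a flat set of components. The list passed to `replay` corresponds to the view of Figure 3.4 and still shows which decision introduced or changed which component.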

Although the term architectural design decision is often used [16, 42, 90], a precise definition is hard to find. Therefore, we define an architectural design decision as:

A description of the choice and considered alternatives that (partially) realize one or more requirements. Alternatives consist of a set of architectural additions, subtractions and modifications to the software architecture, the rationale, and the design rules, design constraints and additional requirements.

We detail this definition by describing the used elements:

• The considered alternatives are potential solutions to the requirement the design decision addresses. The choice is the decision part of an architectural design decision; it selects one of the considered alternatives. For example, Figure 3.4 contains two considered alternatives for the connectivity design decisions. The Ethernet Object alternative is not selected. Instead, the Internet Connectivity is selected.

• The architectural additions, subtractions, and modifications are the changes to the given architecture that the design decision makes. For example, in Figure 3.4 the Song Database design decision has one addition in the form of a new component (the Internet Song Database), and introduces two modifications to components (info shower and internet connection).


FIGURE 3.4: The architecture of a CD player with extended functionality


FIGURE 3.5: The result of the design decisions of Figure 3.4


• The rationale represents the reasons behind an architectural design decision. In Figure 3.4 the rationale is briefly described within the design decisions.

• The design rules and constraints are prescriptions for further design decisions. As an example of a rule, consider a design decision that is taken to use an object-oriented database. All components and objects that require persistence need to support the interface demanded by this database management system, which is a rule. However, this design decision may require that the complete state of the system is saved in this object-oriented database, which is a constraint.

• Timely fulfillment of requirements drives the design decision process. The requirements not only include the current requirements, but also include requirements expected in the future. They can be either explicit, e.g. mentioned in a requirements document, or implicit.

• A design decision may result in additional requirements to be satisfied by the architecture. Once a design decision is taken, new insights can lead to previously undiscovered requirements. For instance, the design decision to use the Internet as an interface to a system will cause security requirements like logins, secure transfer, etc.

The given architecture is a set of earlier made design decisions, which represent the architectural design at the moment the design decision is taken.

Architecture design decisions may be concerned with the application domain of the system, the architectural styles and patterns used in the system, COTS components and other infrastructure selections as well as other aspects described in classical architecture design. Consequently, architectural design decisions can have many different levels of abstraction. Furthermore, they involve a wide range of issues, from pure technical ones to organizational, business, political, and social ones.
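To make the elements of the definition concrete, they can be written down as a small data structure. The Python sketch below is only one possible reading of the definition: the class and field names are our own, it is not the Archium model, and the mapping of the rejected first idea onto the Ethernet Object alternative as well as the wording of the requirement are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alternative:
    """A considered alternative with the elements named in the definition."""
    name: str
    additions: List[str] = field(default_factory=list)
    subtractions: List[str] = field(default_factory=list)
    modifications: List[str] = field(default_factory=list)
    rationale: str = ""
    design_rules: List[str] = field(default_factory=list)
    design_constraints: List[str] = field(default_factory=list)
    additional_requirements: List[str] = field(default_factory=list)


@dataclass
class DesignDecision:
    name: str
    requirements: List[str]          # requirements that are (partially) realized
    alternatives: List[Alternative]  # all considered alternatives, selected or not
    choice: str                      # name of the selected alternative

    def chosen(self) -> Alternative:
        return next(a for a in self.alternatives if a.name == self.choice)

    def rejected(self) -> List[Alternative]:
        return [a for a in self.alternatives if a.name != self.choice]


# The connectivity decision of the CD player example (requirement wording assumed).
connectivity = DesignDecision(
    name="Connectivity",
    requirements=["internet connectivity for the CD player"],
    alternatives=[
        Alternative(
            name="Ethernet Object",
            additions=["component handling the communication to the Patcher"],
        ),
        Alternative(
            name="Internet Connectivity",
            modifications=["Hardware Controller"],
            rationale="reuses much of the functionality of the existing Hardware Controller",
        ),
    ],
    choice="Internet Connectivity",
)
```

Keeping the rejected alternative in the same record as the chosen one is the essential point: the Ethernet Object option stays visible after the decision, instead of disappearing from the design.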

3.4.4 Designing with design decisions

Existing design methods (e.g. [16] and [26]) describe ways in which alternatives are elicited and trade-offs are made. An architect designing with design decisions still uses these design methods. The main difference lies in the awareness of the architect, to explicitly capture the design decisions made and the associated design knowledge.

Subsection 3.2.3 presented key problems in software architecture. Designing with design decisions helps in handling these problems in the following way:

• Design decisions are cross cutting and intertwined. When designing with design decisions the architect explicitly defines design decisions, and the relationships between them. The architect is made aware of the cross cutting and intertwining of design decisions. In the short term, if the identified intertwining and cross cutting is not desirable, the involved design decisions can be reevaluated and alternative solutions can be considered before the design is further developed. In the long term, the architect can (re)learn which design decisions are closely intertwined with each other and what kind of problems are associated with this.

• Design rules and constraints are violated. Design decisions explicitly contain knowledge about the rules and constraints they impose on the architecture. Adequate tool support can make the architect aware of these rules and constraints and provide their associated rationale. This is mostly a long term benefit to the architect, as this knowledge is often forgotten and no longer available during evolution or maintenance of the system.

• Obsolete design decisions are not removed. In evolution and maintenance, explicit design decisions enable identification and removal of obsolete design decisions. The architect can predict the impact of the decision and the effort required for removal.

Designing with design decisions requires more effort from the architect, as the design decisions have to be documented along with their rationale. In traditional design, the architect forms the bridge between architecture and rationale. In designing with design decisions, this role is partially taken up by the design decisions.

Capturing the rationale of design decisions is a resource intensive process. To minimize the capture overhead, close integration between software architecture design, rationale, and design decisions is required. The following section presents an example of an approach that demonstrates this close integration.
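Once the relationships between design decisions are explicit, predicting the impact of removing an obsolete decision becomes a graph traversal. The Python sketch below assumes a simple "depends on" relationship between decisions; the function `removal_impact` is hypothetical, and the dependencies of Updater and SongDatabase on a connectivity decision are our reading of the CD player example, not something stated explicitly in it.

```python
from typing import Dict, List, Set


def removal_impact(depends_on: Dict[str, Set[str]], obsolete: str) -> Set[str]:
    """Return all decisions that directly or indirectly depend on the obsolete one."""
    impacted: Set[str] = set()
    frontier: List[str] = [obsolete]
    while frontier:
        current = frontier.pop()
        for decision, dependencies in depends_on.items():
            if current in dependencies and decision not in impacted:
                impacted.add(decision)
                frontier.append(decision)
    return impacted


# Assumed relationships for the CD player example.
depends_on: Dict[str, Set[str]] = {
    "CurrentDesign": set(),
    "Connectivity": {"CurrentDesign"},
    "Updater": {"Connectivity"},
    "SongDatabase": {"Connectivity"},
}
```

Under these assumptions, removing the connectivity decision affects both Updater and SongDatabase, whereas removing Updater affects no other decision. The size of the returned set gives a first, rough indication of the effort required for removal.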

3.5 Archium

The previous section presented a general notion of architectural design decisions. In this section, a concrete example realization of this notion is presented: Archium [100]. First, the basic concepts of Archium are presented, after which this approach is illustrated with an example.

3.5.1 Basic concepts of Archium

Archium is an extension of Java, consisting of a compiler and run-time platform. Archium consists of three different elements, which are integrated with each other. The first element is the architectural model, which formally defines the software architecture using ADL concepts [138]. Second, Archium incorporates a decision model, which models design decisions along with their rationale. Third, Archium includes a composition model, which describes how the different concepts are composed together.

The focus in this subsection is on the design decision model. For the composition and architectural model see [100]. The decision model (see Figure 3.6) uses an issue-based approach [126]. The issues are problems, which the solutions of the architectural design decisions (partially) solve. The rationale part of the decision model focuses on design decision rationale and not design rationale in general (see Section 3.3).


FIGURE 3.6: The Archium design decision model

Archium captures rationale in customizable rationale elements. They are described in natural text within the scope of a design decision. Rationale elements can explicitly refer to elements within this context, thereby creating a close relationship between rationale and design elements.

The motivation and cause elements provide rationale about the problem. The choice element chooses the right solution and makes a trade-off between the solutions. The choice results in an architectural modification.

To realize the chosen solution in an architectural design decision, the components and connectors of the architectural model can be altered. In this process, new elements might be required and existing elements of the design might be modified or removed. The architectural modification describes these changes, and thereby the history of the design. These architectural modifications are explicitly part of design decisions, which are first-class entities in Archium. This makes Archium capable of describing a software architecture as a set of design decisions [100].
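As an illustration only, the idea of describing an architecture as a set of design decisions can be approximated in plain Python. This is a minimal sketch and not Archium syntax: the class names, the fields, and the representation of an architectural modification as sets of added and removed element names are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Solution:
    name: str
    description: str = ""
    pros: list[str] = field(default_factory=list)
    cons: list[str] = field(default_factory=list)
    # Assumed encoding of an architectural modification:
    # names of elements the solution adds to or removes from the design.
    adds: frozenset[str] = frozenset()
    removes: frozenset[str] = frozenset()


@dataclass
class DesignDecision:
    name: str
    problem: str
    motivation: str = ""  # rationale elements are optional
    cause: str = ""
    solutions: list[Solution] = field(default_factory=list)
    choice: str = ""      # name of the chosen solution
    tradeoff: str = ""

    def chosen(self) -> Solution:
        """Return the solution selected by the choice element."""
        return next(s for s in self.solutions if s.name == self.choice)


def derive_architecture(initial: set[str], decisions: list[DesignDecision]) -> set[str]:
    """Apply the modification of each chosen solution, in order."""
    elements = set(initial)
    for decision in decisions:
        solution = decision.chosen()
        elements -= solution.removes
        elements |= solution.adds
    return elements


# Illustrative data loosely based on the Updater example; element names are hypothetical.
updater = DesignDecision(
    name="Updater",
    problem="The CD player should be updatable.",
    solutions=[
        Solution(
            name="InternetUpdate",
            description="The system updates itself using a downloaded patch.",
            pros=["Distribution of new patches is cheap, easy, and fast"],
            cons=["Requires a connection to the internet"],
            adds=frozenset({"Patcher", "InternetConnection"}),
        )
    ],
    choice="InternetUpdate",
)

architecture = derive_architecture({"CDPlayer"}, [updater])
print(sorted(architecture))
```

Replaying the list of decisions from the initial design reproduces the current design, which is the sense in which the modifications also record the history of the design.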

Rationale acquisition is a manual task in Archium. The approach tries to minimize the intrusiveness of the capturing process by letting the rationale elements of the design decisions be optional. The only intrusive factor is the identification and naming of design decisions.

The rationale elements are to a certain extent similar to those of DRL [126] (see Section 3.3). The Problem element is comparable to a Decision Problem in DRL. A Solution solves a Problem, as Alternatives do in DRL. The Motivation element gives more rationale about the Problem and is comparable to a supportive Claim in DRL. A Cause can be seen as a special instance of a Goal in DRL. The Consequence element is like a DRL Claim about the expected impact of a Solution. The Pro and Con elements are comparable to supporting and denying DRL Claims of a Solution (i.e. a DRL Alternative).

3.5.2 Example in Archium

An example of a design decision and the associated rationale in Archium is presented in Figure 3.7. It describes the Updater design decision of Figure 3.4. Rationale elements in Archium start with an @, which expresses rationale in natural text. In the rationale, any design element or requirement in the scope of the design decision can be referred to using square brackets (e.g. [iuc:patcher]). In this way, Archium allows architects to relate their rationale to their design in a natural way.

A design decision can contain multiple solutions. Each solution has a realization part, which contains programming code that realizes the solution. A realization can use other design decisions or change existing components. In the InternetUpdate solution the realization contains the InternetUpdateChange, which defines the Patcher component and the component modifications for the Internet Connection (see Figure 3.4). The IUCMapping defines how the InternetUpdateChange is mapped onto the current design, which is an argument of the design decision.

To summarize, the architectural design decisions contain specific rationale elements of the architecture, thereby not only describing how the architecture has become what it is, but also the reasons behind the architecture. Consequently, design decisions can be used as a bridge between the software architecture and its rationale. The Archium environment shows that it is feasible to create architectures with design decisions.

3.6 Related work and further developments

This section describes related and future work. The related work focuses on software architecture. After this, Subsection 3.6.2 describes future work on design decisions.

3.6.1 Related work

Software architecture design methods [16, 26] focus on describing how the right design decisions can be made, as opposed to our approach, which focuses on capturing these design decisions. Assessment methods, like ATAM [16], assess the quality attributes of a software architecture, and the outcome of such an assessment steers the direction of the design decision process.

Software documentation approaches [42, 90] provide guidelines for the documentation of software architectures. However, these approaches do not explicitly capture the way to and the reasons behind the software architecture.

Architectural Description Languages (ADLs) [138] do not capture the road leading up to the design either. An exception is formed by the architectural change management tool Mae [88], which tracks changes of elements in an architectural model using a revision management system. However, this approach lacks the notion of design decisions and does not capture considered alternatives or rationale about the design.

    design decision Updater(CurrentDesign design) {
      @problem {# The CD player should be updatable.[R4] #}
      @motivation {# The system can have unexpected bugs or require
                     additional functionality once it is deployed. #}
      @cause {# Currently this functionality is not present in the
                [design], as the market did not require this
                functionality before. #}
      @context {# The original [design]. #}

      potential solutions {
        solution InternetUpdate {
          architectural entities {
            InternetUpdateChange iuc;
            IUCMapping iucMapping;
          }
          @description {# The system updates itself using a patch,
                         which is downloaded from the internet. #}
          realization {
            iuc = new InternetUpdateChange();
            iucMapping = new IUCMapping(design,iuc);
            return design composed with iuc using iucMapping;
          }
          @design rules {# When the [iuc:patcher] fails to update,
                          the system needs to revert back to the
                          previous state. #}
          @design constraints {# #}
          @consequences {# The solution requires the system to
                          have a [iuc:internetConnection] to work. #}
          pros { @pro {# Distribution of new patches is cheap,
                        easy, and fast #} }
          cons { @con {# The solution requires a connection to
                        the internet to work. #} }
        }
        /* Other alternative solutions can be defined here */
      }

      choice {
        choice InternetUpdate;
        @tradeoff {# No economical other alternatives exist #}
      }
    }

FIGURE 3.7: The Updater design decision in Archium

Architectural styles and patterns [164] describe common (collections of) architectural design decisions, with known benefits and drawbacks. Tactics [16] are strategies for design decision making. They provide clues and hints about what kind of design decisions can help in certain situations. However, they do not provide a complete design decision perspective.

Currently, there is more attention in the software architecture community for the decisions behind the architectural design. Kruchten [114] stresses the importance of design decisions, and creates classifications of design decisions and the relationships between them. Tyree [180] provides a first approach to documenting design decisions for software architectures. Both approaches model design decisions separately and do not integrate them with design. Closely related to this is the work of Lago [123], who models assumptions on which design decisions are often based, but not the design decisions themselves.

Integration of rationale with the design is also done in the design rationale field. With the SEURAT [35] system, rationale can be maintained in a Rationale-Explorer, which is loosely coupled to the source code. This rationale has to be added to the design tool, to let the rationale of the architecture and the implementation be maintained correctly. DRPG [15] couples rationale of well-known design patterns with elements in a Java implementation. Like SEURAT, DRPG depends on the fact that the rationale of the design patterns is added to the system in advance.

3.6.2 Future work

The notion of design decisions as first-class entities in a software architecture design raises a couple of research issues. Rationale capture is very expensive, so how can we determine which design decisions are economically worth capturing? So far, we have assumed that all the design decisions can be captured; in practice this would often not be possible or feasible. How do we deal with the completeness and uncertainty of design decisions? How can we support addition, change, and removal of design decisions during evolution?

First, design decisions need to be adapted into commonly used design processes. Based on this, design decisions can be formalized and categorized. This will result in a thorough analysis of the types of design decisions. Also, dependencies need to be described between the requirements and design decisions, between the implementation and design decisions and between design decisions among themselves.

Experiments by others have already shown that rationale management helps in improving maintenance tasks. Whether the desired effects outweigh the costs of rationale capturing is still largely unproven. The fact that most of the benefits of design decisions will be measurable only after a longer period, when maintenance and evolution take place, complicates the validation process. We are currently working on a case study which focuses on a sequence of architectural design decisions taken during evolution. Additional industrial studies in different domains are planned in the context of an ongoing industrial research project, which will address some of the aforementioned questions.

3.7 Summary

This chapter presented the position of rationale management in software architecture design. Rationale is widely accepted as an important part of a software architecture. However, no strict guidelines or methods exist to structure this rationale. This leaves the rationale management task in the hands of the individual software architect, which makes it hard to reuse and communicate this knowledge. Furthermore, rationale is typically kept separate from architectural artifacts. This makes it hard to see the benefit of rationale and of maintaining it.

Giving design decisions a first-class representation in the architectural design creates the possibility to include problems, their solutions and the rationale of these decisions into one unified concept. This chapter described an approach in which decisions behind the architecture are seen as the new building blocks of the architecture. A first step is made by the Archium approach, which illustrated that designing an architecture with design decisions is possible. In the future, we think that rationale and architecture will be used together in design-decision-like concepts, bridging the gap between the rationale and the architecture.

Acknowledgements

This research has partially been sponsored by the Dutch Joint Academic and Commercial Quality Research and Development (Jacquard) program on Software Engineering Research via contract 638.001.406 GRIFFIN: a GRId For inFormatIoN about architectural knowledge.


Chapter 4

Enriching Software Architecture Documentation

“Either write something worth reading or do something worth writing.”

- Benjamin Franklin

This chapter is based on: Anton Jansen, Paris Avgeriou, and Jan Salvador van der Ven. “Enriching Software Architecture Documentation”. In: Journal of Systems and Software 82.8 (Aug. 2009), pp. 1232–1248.

Abstract

The effective documentation of Architectural Knowledge (AK) is one of the key factors in leveraging the paradigm shift toward sharing and reusing AK. However, current documentation approaches have severe shortcomings in capturing the knowledge of large and complex systems and subsequently facilitating its usage. In this chapter, we propose to tackle this problem through the enrichment of traditional architectural documentation with formal AK. We have developed an approach consisting of a method and an accompanying tool suite to support this enrichment. We evaluate our approach through a quasi-controlled experiment with the architecture of a real, large, and complex system. We provide empirical evidence that our approach helps to partially solve the problem and indicate further directions in managing documented AK.

4.1 Introduction

The knowledge about a software architecture and its environment is called Architectural Knowledge (AK) [119] and has resulted in a paradigm shift in the software architecture community [8, 9, 121]. The most important type of AK are architectural (design) decisions, which shape a software architecture [100]. Other types of AK include concepts from architectural design (e.g. components, connectors) [172], requirements engineering (e.g. risks, requirements), people (e.g. stakeholders and roles), and the development process (e.g. activities) [23].

There is a growing awareness both in industry and academia that effectively sharing AK, both inside the developing organization and with external actors, is one of the key factors for project success [8, 9, 121]. Organizations are already exploring this new paradigm by conducting research on the benefits of knowledge-based architecting [122]. The aim of this research is to bring enough evidence to convince the relevant stakeholders to embrace this new way of working by producing and consuming documented AK. Specifically, stakeholders need to spend significant effort in documenting the AK, and therefore must be convinced that they will get a good return on their investment. On the other hand, when consuming AK, stakeholders need to trust the credibility of the documented knowledge (e.g. maintainers should have confidence in how up-to-date the AK is).

Documenting AK is not new, but has been common practice in the software architecture community over the last years [42]. In both heavyweight processes (e.g. the Rational Unified Process [117]) and agile processes (e.g. XP, SCRUM [17, 160]), knowledge is documented to facilitate communication between stakeholders. The essential difference between the former and the latter is that heavyweight processes determine large documents up front, while agile processes produce less documentation, strictly when needed. In essence, the knowledge in both cases is transformed from implicit or tacit knowledge [144] into explicit knowledge [79]. Two types of explicit knowledge can be discerned: documented and formal knowledge. Documented knowledge is expressed in natural language and/or images, while formal knowledge is expressed in formal languages or models with clearly specified semantics (e.g. ADLs, domain models, etc.).

Architectural Knowledge is mainly represented as documented knowledge in the form of an Architecture Description [95] or Architecture Documentation [42]. An architecture document has several benefits for AK sharing as it allows for: (1) Asynchronous communication (not face-to-face) among stakeholders to negotiate and reason about the architecture; (2) Reducing the effect of AK vaporization [96]; (3) Steering and constraining the implementation; (4) Shaping the organizational structure; (5) Reuse of AK across organizations and projects; (6) Supporting the training of new project members.

However, when systems grow in size and complexity, so does the architectural documentation. In such large and complex systems, this documentation often consists of multiple documents, each of considerable size, i.e. tens to hundreds of pages. Moreover, it becomes more complex, as within and between these documents, there are many concepts and relationships, multiple views, different levels of abstraction, and numerous consistency issues. Current software architecture documentation approaches cannot efficiently cope with this size and complexity; they are faced with a number of challenges that are outlined here and elaborated in Section 4.2:

1. Creating understandable architecture documentation [42];

2. Locating relevant Architectural Knowledge [24];

3. Achieving traceability between different entities [91];

4. Performing change impact analysis [173];

5. Assessing the maturity of the design [16];


6. Trusting the credibility of the information [127].

The research problem we address in this chapter is how to manage AK documentation of large and complex systems, in order to deal with these challenges. To partially tackle this problem, we propose an approach that enriches documentation with formal knowledge. The approach consists of a method supported by a tool suite. The key idea of this approach is to enrich software architecture documents by making the AK they contain explicit, i.e. to capture this knowledge in a formal model. This formalized AK in turn is used to support the author and reader of the software architecture document in dealing with the aforementioned challenges. The proposed approach is complementary to current architecture documentation approaches, as it builds upon them in order to transform documented into formal knowledge.

The usage of the process and the tool are demonstrated through a large and complex industrial example. We provide empirical evidence for the benefits of the approach through a quasi-controlled experiment in the context of this example. For reasons of scope and length, we only focus on one of the challenges (understandability).

The rest of this chapter is organized as follows. Section 4.2 presents the aforementioned challenges of software architecture documentation in more detail. Section 4.3 introduces our method for enriching software architecture documentation with formal AK, while Section 4.4 presents the accompanying tool, the Knowledge Architect. Section 4.5 explains how our approach, i.e. our method and tool, addresses the aforementioned challenges. To exemplify the approach, Section 4.6 presents an example of the application of our method for a large and complex industrial system. We validate our approach with respect to one of the challenges using a quasi-controlled experiment in Section 4.7. In Section 4.8, related work is presented, and the limitations of the approach are discussed in Section 4.9. This chapter ends with directions for further work in Section 4.10.

4.2 Challenges for Software Architecture Documentation

As described in the previous section, the research problem we deal with is the inefficiency of current software architecture documentation approaches in dealing with large and complex systems. We have broken down this problem into a set of challenges, which are elaborated in the following paragraphs:

• Understandability: Documentation always loses some of the intentions of the author when someone else reads it. As the size of documentation increases when systems become larger and more complex, the understandability of the documents becomes more challenging [42]. Especially when stakeholders have different backgrounds, the language and concepts used to describe the architecture might not be understandable to everyone. Although good references and glossaries can help to improve the understandability, just reading the documentation often leads to ambiguities and differences in interpretation.


• Locating relevant AK: Finding relevant AK in (large) software architecture documentation is often problematic. The knowledge needed is often spread around multiple documents [24]. The first obstacle is to find the relevant documents in the big set of documents accompanying a system. The practice of informally sharing these documents through e-mails or shared directories complicates matters, leading to a situation where different people have different versions of the same document. The second obstacle is to locate the relevant AK within these documents. Although a clear documentation structure, glossary, and outline certainly help, software architecture documents lack the required finer granularity for locating the exact AK.

• Traceability: Providing traceability between different sources of documentation is difficult [91]. In practice, the lack of traceability usually occurs between requirements and software architecture documents, since it is often unclear how these documents relate to each other. Text and tables have a limited ability to communicate different relationships. Figures (e.g. in the form of models or views [42]) inside architectural documentation are more effective in communicating relationships within or between documents. However, the semantics of these models and views are usually not explicit, which decreases the understandability.

• Change impact analysis: It is often necessary to predict the impact of a change on the whole system. Therefore, we need to analyze which parts of the architecture are influenced when an architectural decision is made or reconsidered [171]. Since documentation usually does not make these decisions and their relationships explicit, making a reliable change impact analysis is often very hard. The lack of traceability between the different architecture elements further exacerbates this problem.

• Design maturity assessment: Evaluating the maturity of an architecture design is difficult as there is no overview of the status of the architecture with respect to its conceptual integrity, correctness, completeness and buildability [16, 183]. These types of qualities are different than run-time qualities (e.g. performance) or design-time qualities (e.g. modifiability) in that they are inherent to the architecture per se. Therefore they are quite complex qualities and usually difficult to assess through scenario-based evaluation methods [16]. To make matters worse, the size and complexity of an architecture document directly influences these qualities and their assessment.

• Trust: Architectural documentation is constantly evolving and needs to be kept up to date with changes in the implementation and the requirements. In large and complex systems, changes occur quite often and the cost of updating the architecture document is sometimes prohibitive. Therefore, the document is quickly rendered outdated and the different stakeholders (e.g. developers and maintainers) lose their confidence in the credibility of the information in it [127].
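To make the change impact analysis challenge concrete, the following minimal Python sketch shows what becomes possible once decisions and their relationships are explicit: the elements affected by reconsidering a decision can be found by a simple graph traversal. The relationship data and all names are hypothetical and serve only as an illustration.

```python
from collections import deque

# Hypothetical "depends on" relationships: each key depends on every listed value.
DEPENDS_ON = {
    "D2: use patch files": ["D1: update over the internet"],
    "Patcher component": ["D2: use patch files"],
    "Rollback rule": ["Patcher component"],
    "Audio decoder": [],
}


def impacted_by(changed, depends_on):
    """Return every item that directly or transitively depends on `changed`."""
    # Invert the edges so that we can walk from an item to its dependents.
    dependents = {}
    for item, targets in depends_on.items():
        for target in targets:
            dependents.setdefault(target, []).append(item)

    seen, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


impact = impacted_by("D1: update over the internet", DEPENDS_ON)
print(sorted(impact))
```

Without explicit relationships, the same answer has to be reconstructed by reading the documents, which is where the lack of traceability hurts most.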

The challenges comprise the starting point for the remaining sections in this chapter. An overview of the different sections and their relationships is illustrated in Figure 4.1. On the left, the challenges described in this section designate the problem statement. The next two sections describe our approach, consisting of a method (Section 4.3) and a tool (Section 4.4), for enriching documentation with formal AK. In Section 4.5, we describe how our approach partially resolves the six challenges. An industrial example presented in Section 4.6 helps to illustrate the approach while a (partial) validation through a quasi-controlled experiment is described in Section 4.7.

[Figure 4.1 boxes: Challenges (Section 4.2); Approach (Section 4.3 and Section 4.4); Address challenges (Section 4.5); Industrial example (Section 4.6); Experiment on (Section 4.7)]

FIGURE 4.1: Overview of the paper

4.3 Enriching Documentation with Formal AK

A major cause of the inefficiency of current software architecture documentation approaches is the fact that they focus on documented and not formal knowledge. While documented knowledge can be managed by humans, this management does not really scale up when the size and complexity of the documentation increases. On the other hand, formal knowledge is more appropriate for automated processing and can handle scalability issues much more effectively. Consequently, formal knowledge in large and complex systems can be automatically managed by appropriate tools that in turn support understanding AK, locating and tracing it, as well as analyzing it and keeping it up-to-date.

The key idea behind our approach is to add formal knowledge to existing documented knowledge in order to facilitate automated processing that scales efficiently and deals with the aforementioned challenges. Formal knowledge is added through annotating the existing documented AK sources according to a formal meta-model. This is different than creating formal AK from scratch, e.g. as done by [180], because we essentially reuse the existing AK and build formal AK upon it. Our approach comprises a method that describes the activities that need to be undertaken, accompanied by a tool that makes it possible to annotate documents. Next, the activities of our method are described.

1. Identify documentation issues: The first activity in our method concerns identifying the problems in managing AK, starting from the six generic challenges presented in Section 4.2. Each one of these challenges can be refined into the specific problems the organization is facing. Not all six challenges need to be dealt with; each organization can choose and emphasize specific challenges. Furthermore, the list of challenges discussed in this chapter is not exhaustive; additional challenges can be considered according to the specific organizational context. After the challenges have been described in an organization-specific way, a number of use cases for managing AK need to be identified that will help to address the challenges. For example, we can derive specialized use cases on tracing particular types of organization-dependent AK such as risks and assumptions. As a starting point for selecting use cases, we propose our previous work on an abstract AK use case model that describes several possible uses of AK [183]. Since these use cases are rather abstract, they also need to be translated into the particular context of the system, by taking into account the sources that contain the AK.

[Figure 4.2 elements: KE, Artifact, Fragment, Author; relationship labels: described by, contained in, creates, relates to]

FIGURE 4.2: The basic AK model

2. Derive a domain model: Based on the identified AK use cases, we derive a domain model consisting of concepts (i.e. Knowledge Entities (KE)) and their relationships that describe relevant AK. The domain model and the use case model are intertwined in the sense that the elements of the domain model should be used as specified in the identified AK use cases. Figure 4.2 presents the basic model that can be used while constructing a specific domain model. This activity aims at producing a domain model (and thus the relevant AK) that is organization-dependent. This allows for the reuse of existing concepts and terminology within an organization across different projects. It allows an organization to use the domain model as a “standard” reference model to synchronize their terminology within the organization.

3. Capture AK: Once a domain model is derived, AK can be captured that adheres to the domain model. It is very important to minimize the effort required to capture this knowledge. To achieve this, automation in the form of tool support plays a crucial role. Tools can substantially reduce the required effort by (semi-)automatically capturing AK. Typically, this involves information extraction techniques (e.g. [24]) and assisting a user with producing AK (e.g. [180, 202]).

4. Use AK: The goal of this activity is to put into practice the use cases identified in activity 1 and thus deal with the corresponding challenges presented in Section 4.2. This activity involves consuming both documented and formal AK. The combination of these two types of knowledge should deliver more value as compared to the sole consumption of documented AK.

5. Integrate AK: The domain model describes the relevant AK for a set of AK use cases. The different AK elements in the domain model are not always confined only to software architecture documents. Other sources may also contain valuable AK, e.g. analysis models, presentation slides, architectural models, wikis, discussion fora, and e-mails. Integrating the AK of software architecture documents with these sources enables a more complete representation of the knowledge.

6. Evolve AK: A software architecture constantly evolves due to new developments and insights. Hence, there is the need to evolve the AK. This includes removing outdated knowledge [93] and updating relevant knowledge. So, both documented and formal AK should be kept up-to-date and in sync with each other. The challenge is to streamline this process to reduce the effort required. For example, in the context of architecture documents, this means finding a way to deal with cut & paste actions inside architecture documents and reflecting this in the related formal model.
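The capture activity can be pictured with a small Python sketch of the basic AK model of Figure 4.2. It is an illustration under assumptions, not the implementation of the tool suite: we assume here that a KE is described by fragments, that a fragment is contained in an artifact, that an author creates KEs, and that KEs relate to each other; the class names, fields, and the `annotate` helper are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Artifact:
    name: str  # e.g. a software architecture document
    text: str


@dataclass(frozen=True)
class Fragment:
    artifact: Artifact  # the fragment is contained in this artifact
    start: int
    end: int

    def content(self) -> str:
        return self.artifact.text[self.start:self.end]


@dataclass
class KnowledgeEntity:
    name: str
    kind: str    # a concept from the domain model, e.g. "Decision" or "Risk"
    author: str  # the author who created this KE
    fragments: list[Fragment] = field(default_factory=list)          # described by
    related: list["KnowledgeEntity"] = field(default_factory=list)  # relates to


def annotate(artifact, phrase, name, kind, author):
    """Capture a KE by marking the first occurrence of `phrase` in the artifact."""
    start = artifact.text.index(phrase)
    fragment = Fragment(artifact, start, start + len(phrase))
    return KnowledgeEntity(name, kind, author, [fragment])


doc = Artifact(
    "architecture-document",
    "The system updates itself using a patch. This requires an internet connection.",
)
decision = annotate(doc, "The system updates itself using a patch",
                    "InternetUpdate", "Decision", "architect")
risk = annotate(doc, "This requires an internet connection",
                "NeedsConnection", "Risk", "architect")
decision.related.append(risk)

print(decision.fragments[0].content())
```

Because the annotated text stays in the document and the KE only points to it, the documented and formal knowledge can be consumed together, as intended in activity 4.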

The first four activities of the method (i.e. identify documentation issues, derive domain model, capture AK, and use AK) comprise the basic iteration where AK is produced and used. The final two activities comprise the next iteration where AK is integrated and evolved. In the remainder of this chapter, we only discuss the first part (activities 1, 2, 3, and 4) and leave out the integration and evolution activities (i.e. 5 and 6). We have made this selection in order to scope this work down to the basic iteration. By studying the first four activities, we can see whether the method works and brings the expected benefits. As further work, we plan to do additional research on the last two activities. Collecting a large amount of formal AK using the first four activities will provide us with a basis for experimenting with the integration and evolution activities. The next section presents a tool suite that supports the outlined method.

4.4 The Knowledge Architect

4.4.1 Introduction

The Knowledge Architect is a tool suite that supports our proposed method by creating, using, and managing AK across documentation, source code and other representations of architectures. We briefly outline the tool suite and then explain how its different parts support the different activities of the method. The heart of the tool suite is an AK repository which provides various interfaces for tools to store and retrieve AK. The AK itself is represented in terms of fundamental units: the Knowledge Entities (KEs). Different tools can interact with the AK repository to manipulate the KEs:


[Figure 4.3 elements: Document Knowledge Client, Excel plug-in, Excel importer/exporter, Python plug-in, Knowledge Explorer, Protégé, Domain model, Webservice layer, AK repository (Sesame OWL/RDF store, OWLIM)]

FIGURE 4.3: The Knowledge Architect tool suite

• The Document Knowledge Client is a plug-in for Microsoft Word and enables the capture (by annotation) and use of AK within software architecture documents. The validation experiment in Section 4.7 focuses on the Document Knowledge Client.

• The Analysis Model Knowledge Clients support capturing (by annotation) and using AK of quantitative analysis models. Specifically, there are two such clients: a plug-in for Microsoft Excel [98] and a plug-in for Python.

• The Knowledge Explorer is a tool for analyzing the relationships betweenKEs. It provides various visualizations to inspect KEs and their relationships.

Figure 4.3 presents an overview of how the various tools are related. The AK repository is the central point for storing and retrieving AK. It is built around Sesame, an open source RDF store [32]. Sesame offers functionality to store and query information about ontologies [6]. Domain models are modeled as ontologies, which are expressed in OWL (Web Ontology Language) [6]. The Protégé tool is used to create the OWL definition of the domain model, which is subsequently uploaded to Sesame. To provide some intelligence in the AK repository, Sesame is extended with the inferencer OWLIM [108], which offers OWL Lite [193] reasoning facilities. The inferencer is mostly used to automatically generate the inverse relationships that exist between KEs. In this way, a user does not have to define them manually. The Document Knowledge Client uses a custom layer on top of Sesame to access the KEs. This layer provides a more high-level interface to Sesame, so tool developers do not need to understand the querying and low-level storage mechanisms of Sesame.
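The effect of generating inverse relationships can be illustrated with a few lines of plain Python over subject-predicate-object triples. This sketch does not use Sesame or OWLIM and does not reflect their APIs; the relationship names and the `add_inverses` helper are hypothetical, chosen only to show what a user no longer has to enter by hand.

```python
# Hypothetical inverse-property declarations, in the spirit of owl:inverseOf.
INVERSE_OF = {
    "dependsOn": "isDependedOnBy",
    "realizes": "isRealizedBy",
}


def add_inverses(triples, inverse_of):
    """Return the stated triples plus the inverse of every triple that has one."""
    # Inverse declarations work in both directions.
    both_ways = dict(inverse_of)
    both_ways.update({inverse: prop for prop, inverse in inverse_of.items()})

    inferred = set(triples)
    for subject, predicate, obj in triples:
        if predicate in both_ways:
            inferred.add((obj, both_ways[predicate], subject))
    return inferred


# Triples as a user might state them; names are illustrative only.
stated = {
    ("Patcher", "realizes", "InternetUpdate"),
    ("InternetUpdate", "dependsOn", "R4"),
}

closed = add_inverses(stated, INVERSE_OF)
for triple in sorted(closed):
    print(triple)
```

With the inverses present, a query that starts from either end of a relationship finds the same link, which is what makes navigation between KEs symmetric for the user.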

The Knowledge Architect tool suite can be used to support the activities of the proposed method, except for the first one. Activity 1 (Identifying documentation issues) is not supported, since it is a manual activity of refining the challenges and selecting the relevant use cases for AK. The AK repository is used to store the domain model resulting from activity 2.


FIGURE 4.4: The Knowledge Architect Word plug-in button bar

The capturing of AK (activity 3) is supported by the Document Knowledge Client and the Analysis Model Knowledge Clients, which capture AK from Word documents, Excel analysis models, and Python programs.

Using AK (activity 4) is supported by different parts of the Knowledge Architect, depending on the specific use cases that have been selected. For example, the Knowledge Explorer can be used to search for specific AK elements, while the Document Knowledge Client can be used to assess the completeness of the AK.

Integration of AK (activity 5) is naturally implemented in the Knowledge Architect through the central Knowledge Repository that collects all AK, and the combination of the various plug-ins that store and retrieve the knowledge.

Evolving AK (activity 6) is mostly supported by the Knowledge Explorer, which visualizes the interdependencies between the AK elements and thus facilitates change impact analysis. The AK can then be changed using the Document Knowledge Client and the Analysis Model Knowledge Clients. The central Knowledge Repository is also useful for evolving the AK, allowing for easy management and identification of out-of-date AK and providing a history of its evolution.

4.4.2 Document Knowledge Client

The Document Knowledge Client¹ is a tool to capture and use explicit AK inside Microsoft Word 2003. The plug-in adds a custom button bar (see Figure 4.4) and provides additional options in some of the context-aware pop-up menus of Word. At start-up, the tool automatically adapts to the domain model used in the AK repository.

Figure 4.4 presents the buttons of the Word plug-in. In short, they give access to the following functionality:

1. Add current selected text and/or figure(s) as a new KE.

2. Add current selected text and/or figure(s) to an existing KE.

3. Create a KE table at the end of the document.

4. Color the text of the KEs based on their type.

¹ Downloadable from http://search.cs.rug.nl/griffin


5. Color the text of each KE based on its completeness.

6. Show a list of KEs in the current document.

7. Export KEs of the document to an XML file.

8. Import KEs from the document into the connected AK repository.

9. Connect to an AK repository.

10. Read annotations from the current active document, i.e. enable the plug-in for the current active document.

11. Open the settings menu.

12. Display the plug-in version & authors information.

4.4.3 Knowledge Explorer

Typically, an AK repository will be of considerable size, containing thousands of KEs. Finding the right AK in such a big collection of KEs is not trivial. Hence, there is a need for a tool to assist in exploring an AK repository. The Knowledge Architect Explorer is such a tool. In this subsection, we briefly explain how this tool works and what kind of techniques are used to deal with the size of an AK repository.

Figure 4.5 presents a screenshot of the Knowledge Explorer. The search functionality is shown on the left hand side. Users can use the see-as-you-type search box on the bottom left to look for specific KEs. The KEs resulting from this search action are shown in the list on the left hand side. The results can be filtered using the drop down box on the left, thereby reducing the number of results. The filtering is based on the type of the AK. The available options are determined by the domain model in use.

Double clicking on one of the search results focuses the visualization in the middle part of the figure on the selected KE. The selected KE (i.e. DD26) is indicated with a red background color. The middle visualization shows how the selected KE is related to other KEs. Double clicking on these related KEs changes the focus of the visualization accordingly.

The relationships that are shown depend on so-called “pillars”. The pillars are the concepts of the domain model that are selected from a list on the top right and visualized as gray pillars in the middle. In the case of Figure 4.5, these pillars are the Alternative, Decision Topic, and Requirements concepts. The pillar concept allows for easy inspection of whether a KE is (in)directly related to other KEs of a specific type. For example, this allows for checking whether a requirement eventually leads to a specification. This is simply achieved by only enabling the requirement and specification pillars. To get additional information about a KE, the mouse can be hovered over a KE and a pop-up window will present this information.


FIGURE 4.5: The Knowledge Explorer

Another way to deal with the size of the AK repository is by using the list found in the middle right. This list presents all the KE authors and provides the opportunity to either include or exclude KEs from specific authors in the visualization in the middle.

The last mechanism that helps to deal with the AK repository size is the slider on the middle right. This slider controls the distance at which a KE is no longer considered related to the selected KE. This distance is defined as the maximum number of relationships that may be followed to find a related KE. By moving the slider to the right, more distantly related KEs are visualized, whereas moving the slider to the left reduces this number.
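The distance controlled by the slider corresponds to a depth-limited breadth-first traversal of the KE relationships. The Python sketch below illustrates this idea under our own assumptions; it is not the code of the Knowledge Explorer, and the graph contents and KE identifiers are hypothetical.

```python
from collections import deque


def related_within(graph, selected, max_distance):
    """Return {KE: distance} for every KE that can be reached from `selected`
    by following at most `max_distance` relationships."""
    distances = {selected: 0}
    queue = deque([selected])
    while queue:
        current = queue.popleft()
        if distances[current] == max_distance:
            continue  # do not follow relationships beyond the slider setting
        for neighbour in graph.get(current, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[current] + 1
                queue.append(neighbour)
    del distances[selected]  # the selected KE itself is not a "related" KE
    return distances


# Hypothetical repository content. Relationships are listed in both
# directions, as inverse relationships are stored in the repository as well.
graph = {
    "DD26": ["ALT3", "DT7"],
    "ALT3": ["DD26", "DT7"],
    "DT7": ["DD26", "ALT3", "REQ1"],
    "REQ1": ["DT7", "CON2"],
    "CON2": ["REQ1"],
}
near = related_within(graph, "DD26", 1)  # slider to the left
far = related_within(graph, "DD26", 3)   # slider to the right
```

With a maximum distance of 1 only the directly related KEs are returned, while a larger value also includes KEs that are reachable through intermediate KEs.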

4.5 Resolved Challenges

The method and the tool of the proposed approach aim at resolving the challenges presented in Section 4.2. In this section we outline how this takes place at a general level, while in Section 4.6 we will go into the details of these challenges for a specific organization. Each challenge is addressed by the proposed method and tool as described below.

Understandability

• Method The domain model derived in activity 2 of the method provides a common language for communication. This makes an architecture design easier to understand, as all concepts are defined in a clear way and are related to other concepts. The understandability is further increased when people become aware that they have to be strict when annotating their text (activity 3). This increases the clarity and unambiguity of the text. Also, when accessing and using the annotated documents (activity 4), the understandability is expected to increase, as described in the experiment in Section 4.7.

• Tool The knowledge explorer enhances understandability by visualizing the relationships between the different KE instances throughout the documentation. This offers the opportunity to gain insight into the architecture in a way that is hard to achieve by simply reading a software architecture document. The document knowledge client improves understanding by offering traceability support, additional rationale, and meta-data about KE instances. In Section 4.7, we empirically validate whether this tool enhances the understandability.

Locating relevant AK

• Method The method makes finding relevant AK easier due to the classification of the knowledge and the relationships the KEs have with each other. The classification allows one to scope the search for relevant AK to specific types of knowledge. This improves the quality of search results. In addition, the formal AK model allows search results to be linked to other related AK, thereby making it easier to find and understand the context of the knowledge.

• Tool The knowledge explorer offers search by keyword and by KE category (see Section 4.4.3), in order to find knowledge. Also, relevant AK can be found by following the relationships of KEs. The document knowledge client can color KE instances, making them easy to find on a document page (see Section 4.4.2). In addition, the tool can create a table with the different KE instances, using different orderings, at the end of the document. The KE instances in this table are provided with navigable links to their source, making it easier to locate relevant AK.

Traceability

• Method The method does not only focus on capturing KE instances, but also on capturing the relationships among these instances. In doing so, the resulting AK model provides traceability among the AK (even through different sets of documentation, as described in activity 5).

• Tool The document knowledge client supports people in creating (see Section 4.6.4) and using traceability information inside documents (see Section 4.6.5). Apart from the documents, the knowledge explorer tool supports analysis of traceability knowledge (see Section 4.4.3).

Change impact analysis

• Method An important form of AK is the architectural (design) decision. Once these decisions are captured in a formal model (i.e. activity 3 of the method), assessing the impact of changing such a decision becomes easier. For example, techniques like Bayesian belief networks can then be employed to predict the impact of architectural design decisions [173].

• Tool The impact of changes can be analyzed in the knowledge explorer (see Section 4.4.3). Selecting a changed KE instance (e.g. a requirement) in the tool will visualize the related (and potentially affected) knowledge (e.g. decisions).

Design maturity assessment

• Method The method helps with assessing the maturity of a design. For completeness, automatic model checking can be used to assess what kind of AK is likely to be missing. To assist in assessing the correctness and consistency of the architecture design, extensive formalization is required to model the semantics of the behavior of the designed system.

• Tool The Document Knowledge Client offers a completeness check, status field, and space for review comments to support such an assessment. We refer to Section 4.6.5 for an in-depth description of how the client supports this assessment.

Trust

• Method The method helps with addressing the trust challenge by offering the possibility to attach meta-data to the captured and formalized AK. This enables the different stakeholders to investigate the author of the knowledge and the date it was created, and decide whether or not to trust it. Another example is aligning the process with KE instances by having a status field describing the status a KE has in this process. For example, [114] proposes to associate a status (e.g. Idea, Tentative, Decided, Approved, Challenged, Rejected, Obsolesced) with a decision.

• Tool The knowledge repository maintains a rich history of the KE instances, thus establishing how up to date they are. For example, the Document Knowledge Client can track the use of, changes to, and comments on individual KE instances, thereby providing a history that is suitable to judge the credibility of the knowledge. Also, by making it easier to assess the KEs of an architecture through the explorer, it is easier to gain trust in the document at hand.

Figure 4.6 presents a visual summary of the relationships between the challenges, activities, and tools. The challenges are depicted on the left side, the activities in the center, and the associated tools on the right side of the figure. Besides these relationships, the figure also illustrates the scope of the upcoming two sections: the industrial example of Section 4.6 and the quasi-controlled experiment of Section 4.7.

4.6 The LOFAR Example

In this section, we present an example of a real, large, and complex system. First, we present an introduction of this system. Then we present how activities 1, 2, 3 and 4 of our method (see Section 4.3) are applied in this context. We also outline, where appropriate, how the tooling of Section 4.4 is used to support the activities of our method.


FIGURE 4.6: Overview of the approach and its validation. [Diagram with three columns. Challenges: 1: Understandability, 2: Locating relevant AK, 3: Traceability, 4: Change impact analysis, 5: Design maturity assessment, 6: Trust. Activities: 1: Identify documentation issues, 2: Derive domain model, 3: Capture AK, 4: Use AK, 5: Integrate AK, 6: Evolve AK. Tools: Protégé, Document knowledge client, Excel plugin, Python plugin, Knowledge explorer. The figure also marks the scope of the industrial example and the scope of the controlled experiment.]


4.6.1 Introduction

The industrial example investigated in this chapter is LOFAR (LOw Frequency ARray)²: a new radio telescope under construction by ASTRON, the Netherlands Institute for Radio Astronomy. LOFAR is rapidly becoming a European effort, with France, Germany, the United Kingdom, and Sweden having funded stations, and with others to be added soon. What makes LOFAR interesting from a software architecture perspective is the fact that it is the first of a new generation of software telescopes [37]. Software is of paramount importance in the system design, as it is one of the crucial design factors for achieving the ability to communicate and process the 27 Tflop/s data stream in real-time to be fed into scientific applications. The architecture of this large and complex system is described in many different documents, ranging in scope from the entire system and particular subsystems to specific prototype analysis.

4.6.2 Activity 1: Identify Documentation Issues

The first activity entails identifying the current issues with respect to using the architecture documentation. The challenges outlined in Section 4.2 are manifested in the LOFAR project as follows:

• Understandability Creating a radio telescope that uses cutting edge technology involves many different specialists, each coming from a very different background, e.g. astronomers, high performance computing specialists, antenna specialists, industrial manufacturing experts, politicians, and embedded systems engineers. Hence, creating an understandable software architecture is vital for communicating and thereby creating consensus about the design among the stakeholders.

• Locating relevant AK The architectural documentation of the LOFAR system consists of multiple documents, which in total encompass over 1000 pages. Locating relevant AK is very hard simply due to the size of this documentation.

• Traceability The architecture description of LOFAR is split into separate documents for the top-level and individual sub-systems. Finding out what exactly the relationships are between these documents is very hard. It is especially difficult to understand how particular requirements are addressed in the architecture design.

• Change impact analysis Predicting the impact of a design change is a major issue for LOFAR, as it forms a critical part of risk management. For example, a major risk is a change in the available budget, which has ramifications for the viability of the telescope design. Change impact analysis is needed to identify these ramifications.

• Design maturity assessment At the time the investigation for this example took place, an important issue for ASTRON was to know whether the design was mature enough to be built or if additional design activities were needed.

² http://www.lofar.org/


• Trust The design time for the LOFAR telescope is around 10 years, with an expected minimal operating time of 15-20 years. During this period, the telescope and its software will be constantly upgraded to improve performance. Hence, having up-to-date, trustworthy AK will play a crucial role in the future of the telescope, as this partly defines the scientific relevance and success of the instrument.

After describing how the six challenges are manifested in the LOFAR project, we identified a number of use cases that help to address the challenges. We started from the use case list of [183] and derived a prioritized list of project-specific use cases. Based on this, we decided together with ASTRON to focus our effort on the use case: Perform incremental architectural review. ASTRON wants to perform better and more efficient architectural reviews. As stated in [183], this use case makes use of three other use cases: Perform a review for a specific concern, View the change of the architectural decisions over time and Identify important architectural drivers. This main use case touches upon three specific concerns (the aforementioned challenges): traceability, design maturity, and understandability.

Architectural reviews in ASTRON take place in two stages: first, the reviewers individually review one or more architectural documents and create comments about them; second, a review coordinator collects these comments and organizes a review meeting to discuss the most pressing issues. This use case focuses on supporting the first stage of the review process by enriching the used documentation using the Document Knowledge Client (see Section 4.4.2). This helps the reviewers to prepare better and more efficiently for the review meeting. To increase efficiency, the document review can take advantage of tracking which KEs have been found consistent, complete, and correct, i.e. assessing the design maturity. The coloring of these KEs allows the reviewers to focus more easily on that part of the architecture description that requires further attention. Furthermore, providing traceability and easy spotting of relevant AK can improve the understanding a reader has of the software architecture.

4.6.3 Activity 2: Derive a Domain Model

To discover what AK is relevant in the LOFAR system, we investigated the AK used and documented in the system, taking into account the use cases from the previous activity. Independently from each other, two of the authors and a software architect of ASTRON examined a part of the architecture documentation. With a marker pencil, they annotated the text and/or figures that represented KEs. In the margin of the document, they wrote down the name of the concept they believed this annotation to be an instance of. Prior to this, no deliberations were made on these concepts.

After completing this exercise, we compared the annotations and associated concepts with each other. The annotations made by the independent reviewers were surprisingly similar. Although the names of the concepts differed, the meaning of most of them was similar. Using affinity diagrams, we grouped the concepts. In case of doubt, the original pieces of annotated text were revisited and compared with each other. The aim of this exercise was to come up with the minimum set of concepts that was good enough to cover all the annotations.


FIGURE 4.7: A domain model for AK in documentation. [Diagram with the concepts Concern, Requirement, Risk, Decision Topic, Alternative, Decision, Quick Decision, and Specification, connected by the relationships "originates from", "raises", "creates", "addresses", and "chooses".]

The end result, i.e. the derived concepts and their relationships, is presented in Figure 4.7. Each concept inherits from a KE, as modeled in Figure 4.2. Therefore the domain model for AK, specific to the LOFAR architecture documentation, comprises the following concepts:

• Concern. A concern is an interest in the system's development, the system operation, or any other aspect that is critical or otherwise important to one or more stakeholders.

• Requirement. A requirement is something that is explicitly demanded from the system by a stakeholder.

• Risk. A risk is a special type of concern, which expresses a potential hazard that the system has to deal with.

• Decision Topic. A scoping of one or more concerns, such that a concrete problem is described. Often stated as a question, e.g. what is the contents of the data transport layer?

• Alternative. To solve the described problem (i.e. a decision topic), one or more potential alternatives can be thought up and proposed.

• Decision. For a decision topic there are sometimes multiple alternatives proposed, but only one of them can be chosen to address the described decision topic. The decision outlines the rationale for this choice.

• Quick Decision. Often only one alternative is described to address a decision topic. Rationale for such an alternative is often lacking. The mere fact that the architect only describes a single alternative in the document implicitly indicates that the architect has chosen this alternative as the one to use. Thus the alternative becomes a decision in its own right, i.e. a quick decision.

• Specification. A special kind of decision is a specification. It indicates the end of the refinement process for the software architecture. Any concerns coming up from the alternative chosen are in principle the responsibility of the detailed design.

4.6.4 Activity 3: Capture AK

Capturing AK with the Document Knowledge Client involves the Add KE and Add to existing KE buttons, but can also be performed by selecting a piece of text, right clicking, and choosing the appropriate option from the pop-up menu. When adding a new KE, a menu appears, which allows the user to provide the following additional information about a KE:

• Name that identifies the KE.

• Type of the KE, which is one of the concepts of the domain model being used. This can be selected through a pull-down menu.

• Status of the KE, which describes the level of validity of the KE, and is selected from the following options:

– To be reviewed the KE needs to be reviewed by someone other than the creator of the knowledge.

– Reviewed the KE has been reviewed, but no verdict has been reached yet.

– To be discussed the KE is controversial and should be discussed.

– To be checked additional analysis is still needed to support the validity of the KE.

– Validated the KE can be regarded as stable and trustworthy.

– Obsoleted the KE is no longer valid.

• Connections The user can add and remove relationships to other KEs. Based on the earlier defined KE type and the domain model, the tool determines the types of relationships that are available for new relationships to other KEs. Creating a relationship to a related KE is a four step process. To illustrate this process, we take as an example a new KE of the “Requirement” type. The first step is to choose the type of relationship. In our example, this could be either the “raises” or “created by” (the inverse of “creates”) relationship, as defined in the domain model (see Figure 4.7). The second step is to determine the scope in which the target KE of the relation can be found, which is either the current document or the whole AK. Usually, a self-contained architecture document will have most of its relationships to KEs within the current document. The third step is to search for the KE, which is based on partial name matching. The (intermediate) results of the search are presented in a table-like fashion, such that all details of the found KEs can be inspected. The fourth step is to select one or more of the search results and confirm the creation of a relationship. The inverse relationship(s) will be automatically created and maintained by the tool.

• Notes, which are additional textual information about the KE. Usually these contain pointers to more information or comments about the validity of the KE.

• Creator of the KE, which is automatically determined by the tool, based on the currently configured Word user.
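The information captured for a KE can be summarized as a simple record. The following Python sketch is illustrative only: the class name, the field names, the default status, and the example values are assumptions, and it does not represent the actual data model of the Document Knowledge Client.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    """The six status options listed above."""
    TO_BE_REVIEWED = "To be reviewed"
    REVIEWED = "Reviewed"
    TO_BE_DISCUSSED = "To be discussed"
    TO_BE_CHECKED = "To be checked"
    VALIDATED = "Validated"
    OBSOLETED = "Obsoleted"


@dataclass
class KnowledgeEntity:
    name: str       # identifies the KE
    ke_type: str    # a concept of the domain model, e.g. "Requirement"
    creator: str    # the tool derives this from the configured Word user
    status: Status = Status.TO_BE_REVIEWED  # assumed default
    notes: str = ""
    # Each connection is a (relationship type, name of target KE) pair.
    connections: list = field(default_factory=list)

    def connect(self, relation, target_name):
        """Record a relationship from this KE to another KE."""
        self.connections.append((relation, target_name))


# Hypothetical example: a new requirement that raises a decision topic.
req = KnowledgeEntity(name="REQ1", ke_type="Requirement", creator="jdoe")
req.connect("raises", "DT7")
```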

4.6.5 Activity 4: Use AK

The enriched documentation can be used to execute the use cases identified in the first activity. In this section we focus on the use case of performing an incremental architectural review, as discussed in Activity 1. We first describe how using AK during architectural reviews helps to deal with traceability and understandability issues. Next, we describe how the design maturity can be assessed during such a review.

Traceability & Understandability

A KE can be edited or removed by choosing the appropriate option from the pop-up menu when right-clicking on the text of the KE. In the same menu, the relationships among KEs can be followed. Thus useful traceability among KEs is provided. Figure 4.8 exemplifies this: under “Connections...” the pop-up menu lists the relationships that a KE has, while clicking on them moves the cursor to the appropriate piece of text. This allows for a hyper-link style of navigation inside an architecture document. Navigating back to the originating KE is easy due to the automatically created inverse relationships.

To enhance the understandability of the document, the tool facilitates the recognition of existing KEs by coloring the text based on the KE type (button 4 in Figure 4.4). Figure 4.8 gives an example of the effect of this coloring. The colors used for each type can be configured in each AK repository. This improves understanding in two ways. Firstly, simply browsing through an annotated document gives the reader a global understanding of where most relevant AK resides in the document. Secondly, by making the KEs and their type easy to spot, a reader (e.g. a reviewer) can straightforwardly guess the message that the architect tries to communicate.

Design Maturity Assessment

FIGURE 4.8: A software architecture document with colored KEs and pop-up menu for tracing the relationships of a KE

FIGURE 4.9: Incompleteness information of a KE

The Document Knowledge Client can support the architect in assessing the completeness of the architecture description. Based on the domain model, the tool performs model checks to identify incomplete parts. For each KE inside the document, a completeness level is determined. The completeness levels are named after the colors that are used to color the text of the KE. To find out why the tool deems a certain KE to be incomplete, the user can inspect the “Completeness...” option of the context pop-up menu to see which rules are not adhered to. Figure 4.9 presents an example of this. The tool distinguishes the following four completeness levels (ordered from high to low severity):

• Red One or more primary rules are violated.

• Orange The primary rules are adhered to, but one or more secondary rules are violated.

• Yellow Both primary and secondary rules are adhered to. However, the KE has not achieved the status of “validated” yet.

• Green Both primary and secondary rules are adhered to. In addition, the KE has been validated by a reviewer.
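The four levels above amount to a simple ordered decision, which can be sketched in Python as follows. The function name and its parameters are hypothetical; the sketch only restates the ordering of the levels and is not the implementation of the tool.

```python
def completeness_level(primary_violations, secondary_violations, status):
    """Map rule violations and KE status to a completeness color.

    `primary_violations` and `secondary_violations` are the lists of rules a
    KE violates; `status` is the status of the KE as a string.
    """
    if primary_violations:
        return "Red"
    if secondary_violations:
        return "Orange"
    if status != "Validated":
        return "Yellow"
    return "Green"


# Hypothetical example: all rules are adhered to, but the KE is only reviewed.
level = completeness_level([], [], "Reviewed")
```

Because the primary rules are checked first, a KE that violates both a primary and a secondary rule is reported at the most severe level.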

The distinction between primary and secondary rules is a pragmatic one. Primary rules are those that check whether the document is complete enough to provide a minimum level of traceability. This minimum level should ensure the existence of at least one reasoning path a reader could follow. Secondary rules focus more on the completeness of the architecture design. Both the primary and secondary rules depend on the specific domain model used, as they use the concepts and relationships of the domain model to detect missing information. The rules are evaluated inside the AK repository, which offers an infrastructure to easily add or remove rules at run-time. For the ASTRON LOFAR domain model (see Figure 4.7), the following primary rules are used:

• All Alternatives address one or more Decision Topics each.

• All Decision Topics are addressed by at least one Alternative.

• All Decisions choose exactly one Alternative. This rule is not applied for a Quick Decision.

• All Decision Topics have an originating Concern or Alternative.

The following secondary rules are used:

• A Concern raises at least one Decision Topic.

• Concerns, that are not Requirements or Risks, are created by Alternatives.

• Chosen Alternatives and Quick Decisions that are not Specifications either create at least one Concern, or raise at least one Decision Topic.

• Quick Decisions should not have “chooses” or “chosen by” relations to other KEs.

• A Quick Decision should be the only Alternative addressing a Decision Topic.

• Exactly one Alternative should be chosen for a Decision Topic.
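To illustrate how such rules can be evaluated over the captured KEs, the Python sketch below checks two of the primary rules on a small, hypothetical set of KEs. In the Knowledge Architect the rules are evaluated inside the AK repository; the data structures and names used here are assumptions. The sketch also assumes that a Quick Decision counts as an Alternative when it addresses a Decision Topic, in line with the secondary rules.

```python
# Concepts that may address a Decision Topic (assumption, see lead-in).
ALTERNATIVE_LIKE = {"Alternative", "Quick Decision"}


def check_primary_rules(types, relations):
    """Check two primary rules.

    `types` maps a KE name to its concept; `relations` is a set of
    (source, relationship, target) triples. Returns (KE, message) pairs.
    """
    violations = []
    for ke, concept in sorted(types.items()):
        if concept == "Decision Topic":
            # Rule: all Decision Topics are addressed by at least one Alternative.
            addressed_by = [s for s, r, t in relations
                            if r == "addresses" and t == ke
                            and types.get(s) in ALTERNATIVE_LIKE]
            if not addressed_by:
                violations.append((ke, "not addressed by any Alternative"))
        elif concept == "Decision":
            # Rule: all Decisions choose exactly one Alternative
            # (Quick Decisions have their own concept and are skipped).
            chosen = [t for s, r, t in relations
                      if r == "chooses" and s == ke
                      and types.get(t) == "Alternative"]
            if len(chosen) != 1:
                violations.append((ke, "does not choose exactly one Alternative"))
    return violations


# Hypothetical KEs: DT8 is a Decision Topic without any Alternative.
types = {"DT7": "Decision Topic", "DT8": "Decision Topic",
         "ALT3": "Alternative", "DD26": "Decision", "QD1": "Quick Decision"}
relations = {("ALT3", "addresses", "DT7"), ("DD26", "chooses", "ALT3")}
found = check_primary_rules(types, relations)
```

A KE for which this check reports a violation would be colored red by the completeness coloring described above.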


4.7 Quasi-Controlled Experiment

This section presents a quasi-controlled experiment to empirically validate a part of the presented approach. This experiment is conducted as an observational study. This section follows the controlled experiment reporting guidelines of [102]. Since the experiment is only part of this chapter, some parts of the reporting guidelines are already covered in other sections. Specifically, the content of the structured abstract is part of the introduction, related work is discussed in Section 4.8, and future work is presented in Section 4.10.

4.7.1 Motivation

To validate our approach, we conducted a quasi-controlled experiment. The experiment focused on one of the identified challenges (understandability, see Section 4.2) and on a specific use case (performing incremental architecture review, see Section 4.6.1). In addition, the focus was on the Document Knowledge Client and did not involve the Explorer.

Problem Statement and Research Objectives

The research question we answer with the quasi-controlled experiment is the following: Does our approach for enriching software architecture documentation with formal AK improve the understanding of a software architecture description? We present the research objective using the template suggested in [102]: Analyze the presented approach for the purpose of improving with respect to software architecture understanding from the point of view of the researcher in the context of the LOFAR example presented in Section 4.6.

Context

The context of the quasi-controlled experiment is the LOFAR system, as described in the previous section.

4.7.2 Experimental Design

Goals, Hypotheses and Parameters

In our experiment, we compare the understanding one has of the architecture when using a normal documentation approach as opposed to a documentation approach which includes the possibility of enriching the documentation. For this, we need a way to quantify the understanding (and associated communication) someone has of a software architecture. Achieving such a measurement for a topic as complex as a software architecture is very difficult. One activity in which the understanding of a software architecture plays a key role is that of an architectural review. Understanding the architecture is crucial for a reviewer's ability to judge an architecture. Hence, we can indirectly measure the understanding someone has of a software architecture by looking at how well he or she performs an architecture review.

Based on this assumption about the relationship between understandability and architectural review, we have formulated the following null hypotheses:

• H01: Consuming formal AK makes an architecture review less efficient, i.e. #comments(ConsFormAK) < #comments(ConsDoc)

• H02: Consuming and producing formal AK makes an architecture review less efficient, i.e. #comments(ConsProdFormAK) < #comments(ConsDoc)

• H03: Consuming formal AK degrades the quality of a review, i.e. qualityComments(ConsFormAK) < qualityComments(ConsDoc)

• H04: Consuming and producing formal AK degrades the quality of a review, i.e. qualityComments(ConsProdFormAK) < qualityComments(ConsDoc)

The associated alternative hypotheses are:

• H1: Consuming formal AK makes an architecture review more efficient, i.e. #comments(ConsFormAK) > #comments(ConsDoc)

• H2: Consuming and producing formal AK makes an architecture review more efficient, i.e. #comments(ConsProdFormAK) > #comments(ConsDoc)

• H3: Consuming formal AK improves the quality of a review, i.e. qualityComments(ConsFormAK) > qualityComments(ConsDoc)

• H4: Consuming and producing formal AK improves the quality of a review, i.e. qualityComments(ConsProdFormAK) > qualityComments(ConsDoc)

The experiment was embedded into ASTRON's normal development process and followed their normal procedures for an architectural review. This means that one person is the coordinator for the review. He or she receives the software architecture document from the architect and sends it out to the reviewers. The reviewers read the software architecture document and send their comments to the coordinator before a deadline. After all reviewers have sent in their comments, the coordinator makes a selection of these comments and arranges a meeting with the architect and reviewers to discuss the selected comments.

Independent Variables

The experiment consists of two independent variables: (1) the (none) use of thetool (2) the (none) production of formal knowledge. We call the combination ofthese two variables a situation, i.e. a treatment in empirical research. To determinethe effectiveness of our approach, we examine the following three situations:

• Situation 1: Consume documented/formal AK In this situation, the subjects use the Knowledge Architect Document Knowledge Client with an annotated version of the software architecture document. They are not allowed to create new annotations. Hence, the subjects only consume formal AK [121].

• Situation 2: Consume documented AK and produce formal AK In this situation, the subjects use the Document Knowledge Client on an unannotated version of the document. They are encouraged to make their own annotations alongside their review. Hence, the subjects produce formal AK and only consume their own produced formal AK and the documented knowledge from the document.

• Situation 3: Only consume documented AK The subjects do not use the Document Knowledge Client and do not consume formal AK. They merely review the document, as in a “normal” review, but still read the document from a computer screen.

TABLE 4.1: Experimental design: #subjects per situation

               Situation
Chapter        1     2     3
1              6     5     5
2 & 3          5     6     5
Total         11    11    10

Situation 3 acts as a baseline to compare the performance of situations 1 and 2.

Dependent Variables

The experiment uses two dependent variables for measuring the understanding of the architecture. They are based on the review comments of the subjects. First, we measure the broadness of this understanding by looking at the quantity of comments each subject makes in a limited amount of time, i.e. 1 hour. Second, we measure the deepness of this understanding by rating the quality of the comments. This latter quality is defined as the extent to which a comment helps to improve the architecture and its description. The comments are rated by two people: the architect and a very experienced architecture reviewer. They give each comment a rating on a scale of 1 to 5, with 1 being the lowest quality rating and 5 the highest.

Each subject performs two reviews. For this, we have split the software architecture document into two equally sized parts, i.e. Chapter 1 and Chapters 2 & 3, that describe different aspects of the architecture independently of each other. Each subject therefore performs one review for Chapter 1 and one for Chapters 2 & 3. Consequently, each subject participates in at most 2 out of 3 situations.

We designed the experiment in such a way that the subjects were evenly distributed over the 3 situations per document part. Table 4.1 presents this distribution. The experiment design is a semi-randomized design, as we put additional constraints on allowed assignments of subjects to situations. That is, each subject was randomly assigned to two different situations. The top and middle of Table 4.2 present the resulting assignments of subjects to situations using a sort card randomization.


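TABLE 4.2: Ratings of reviewer comments of the first hour
(each cell: number of comments given that weight by the architect / by the review expert)

Chapter 1
Subject     1     2     3     4     5     6     7     8     9     10    11    12    13    14    15    16
Situation   3     1     2     1     3     1     1     3     3     1     2     2     2     2     1     3
Weight 1    0/0   0/0   0/2   0/0   0/1   1/3   0/2   0/0   0/0   2/1   0/0   0/3   1/4   0/0   0/0   1/2
Weight 2    1/2   0/1   4/2   0/0   5/1   0/1   0/3   1/0   0/0   1/1   4/1   2/6   8/3   2/1   0/1   9/7
Weight 3    1/1   1/1   0/0   1/1   1/4   4/1   6/4   0/0   0/0   1/2   0/2   4/2   3/6   4/4   0/0   9/8
Weight 4    1/0   1/0   0/0   0/0   0/0   0/0   2/0   1/1   0/0   0/0   0/1   7/2   2/0   1/2   1/0   1/3
Weight 5    0/0   0/0   0/0   0/0   0/0   0/0   0/0   0/1   0/0   0/0   0/0   0/0   0/1   0/0   0/0   0/0

Chapters 2 & 3
Subject     1     2     3     4     5     6     7     8     9     10    11    12    13    14    15    16
Situation   2     3     1     3     2     2     2     2     2     3     1     1     3     1     3     1
Weight 1    0/0   2/4   1/1   0/1   0/1   0/0   0/1   0/0   1/1   0/2   0/0   1/3   2/6   1/3   0/0   3/12
Weight 2    0/0   6/3   5/2   1/2   1/0   2/1   0/1   0/0   0/0   3/3   1/0   3/10  4/5   5/3   1/1   5/11
Weight 3    0/0   0/2   3/4   1/0   1/3   0/2   2/0   0/0   0/0   2/0   0/0   6/2   6/0   6/5   1/1   12/2
Weight 4    1/1   1/0   2/3   0/0   2/0   1/0   0/0   0/0   0/0   0/0   0/1   3/1   1/2   0/1   0/0   5/0
Weight 5    0/0   0/0   0/1   1/0   0/0   0/0   0/0   0/0   0/0   0/0   0/0   3/0   0/0   0/0   0/0   0/0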
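The relation between the per-subject assignments in Table 4.2 and the distribution in Table 4.1 can be checked mechanically. The following is a minimal sketch, not part of the original experiment tooling: the two lists transcribe the Situation rows of Table 4.2 (subjects 1 to 16), and the function names are hypothetical.

```python
from collections import Counter

# Situation per subject (subjects 1..16), transcribed from the Situation rows of Table 4.2.
CHAPTER_1 = [3, 1, 2, 1, 3, 1, 1, 3, 3, 1, 2, 2, 2, 2, 1, 3]
CHAPTERS_2_3 = [2, 3, 1, 3, 2, 2, 2, 2, 2, 3, 1, 1, 3, 1, 3, 1]


def subjects_per_situation(assignment):
    """Count how many subjects reviewed under situations 1, 2 and 3."""
    counts = Counter(assignment)
    return [counts[s] for s in (1, 2, 3)]


def reviews_in_distinct_situations(first, second):
    """True if every subject performs the two reviews in two different situations."""
    return all(a != b for a, b in zip(first, second))


row_1 = subjects_per_situation(CHAPTER_1)
row_2_3 = subjects_per_situation(CHAPTERS_2_3)
totals = [a + b for a, b in zip(row_1, row_2_3)]

print("Chapter 1:     ", row_1)
print("Chapters 2 & 3:", row_2_3)
print("Total:         ", totals)
print("Two different situations per subject:",
      reviews_in_distinct_situations(CHAPTER_1, CHAPTERS_2_3))
```

The three printed rows correspond to the rows of Table 4.1, and the last line checks the design constraint that each subject is assigned to two different situations.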


Subjects

In total 16 persons participated in the experiment. The subjects had different backgrounds: senior software engineers (subjects 8 & 15) and software engineers working on the LOFAR system (subjects 3 & 7), master students in software engineering who have participated in a course on software architecture (subjects 1, 2, 4, 6, 9, 10, and 11), and academic researchers of AK (subjects 5, 12, 13, 14, and 16). Hence, 4 of the subjects are practitioners and the other 12 are academics. One of the master students (subject 6) knows the Document Knowledge Client, for he has been involved in its development. None of the other subjects were familiar with the tool.

Objects

The document used for the experiment was a recently created software architecture document of the LOFAR system (see Section 4.6.1). This document is not part of the set of documents we investigated for the domain model (see Section 4.6.3). Each subject was provided with a laptop on which the Document Knowledge Client and the document were available. Subjects who performed their review in situation 1 or 2 also received a one page supporting leaflet. The leaflet explained the LOFAR domain model (see Section 4.6.3) and contained a very short manual on the workings of the Document Knowledge Client.

Instrumentation

Before the actual review started, a 40 minute presentation explaining the experiment was given to the subjects. This presentation included an explanation of the Document Knowledge Client and the LOFAR domain model. It was followed by a small training exercise lasting 20 minutes. In this exercise, the subjects used the Document Knowledge Client to annotate and use formal AK in a sample document.

To guide the subjects in capturing comments during their review, a template was provided to them. The template was a simple table, with one row per comment. The reviewers were asked to fill in the following two significant columns: comment text and comment type. The first column contains the text of the comment. In the second column, the reviewer classified his/her comment as either a positive remark or as an improvement to the architecture (description).

To gather important qualitative information from the subjects, the experiment ends with a group discussion. We used the following checklist to ensure the discussion covered vital parts we were interested in:

• General remarks about the quasi-controlled experiment.

• Bugs the subjects encountered while using Document Knowledge Client.

• Improvements that could be made to our tool and approach.

• Domain model, i.e. how good or bad it was for the specific review task.

• Creating annotations besides the provided ones, a situation not covered in our quasi-controlled experiment.

• Future use, i.e. whether the subjects would like to use the Document Knowledge Client again the next time they perform an architectural review.

Data Collection Procedure

The experiment was performed three times on separate days. The experiment started with the aforementioned 40 minute presentation and a 20 minute training exercise. After this first hour, the subjects started with their reviews in their assigned situation. Once the two review sessions were completed, the experiment was concluded with a 15-30 minute wrap-up discussion session to collect the experiences of the subjects. All in all, including breaks, the entire experiment took 5 to 7 hours per person.

For each review, the subjects had two hours of time. The review comments were collected at the end of the first and second hour. Due to time constraints of the ASTRON engineers, we limited their review time to a single hour. This is why in the rest of this experiment the focus is on this first hour.

Analysis Procedure

In this experiment, we focus on the result of the review that was sent to the coordinator: a list of comments and remarks about the software architecture document. By judging these comments, we quantify how well a reviewer had performed the review, and thus indirectly measure how well they understood the architecture.

We simplified the experiment analysis by making several important assumptions. Without these assumptions, we should use a non-parametric statistical test. However, seeing that we have a very limited number of subjects, achieving significant results with such a test is most likely impossible. Hence, we want to use a parametric test. For this to work we need to make three assumptions. Firstly, as the metric for the quality of a comment the average score of both raters is used. Secondly, we assume that the number of comments per subject is an independent variable, i.e. the number of comments made by one reviewer does not influence the number of comments made by another reviewer. Thirdly, the number of comments for a situation has a normal distribution. Based on these three assumptions, we can use the student t-test [165] to statistically test whether the encountered differences are significant. We use the one-tailed variant of this test, as we want to measure whether the found differences are statistically significant.

The student t-test calculates the chance that a similar result will be found when the experiment is repeated. In this chapter, we call this chance the confidence level, which is defined as 1 − p with p being the so-called p-value. Most empirical researchers use a confidence threshold of 0.95 (i.e. 95% or α = 0.05) as the minimum level to accept a hypothesis. For this chapter, we use an α value of 0.05 to statistically accept a hypothesis, i.e. p < α. For results with confidence levels between 0.80 and 0.95, we regard the results to be strong indicators for their associated hypothesis.


Validity Evaluation

We improve the reliability and validity of the data collection in various ways. Firstly, we enabled the automatic file saving feature of Word on a short interval of 5 minutes to prevent losing either review comments or annotations due to crashes. Secondly, we ensured that assistance was available for the reviewers in case they were confronted with problems, e.g. with understanding the working of the tool.

4.7.3 Execution

Sample

Table 4.2 presents the raw data resulting from the first hour of the experiment. The table presents a number of things for the two reviews (i.e. Chapter 1 and Chapters 2 & 3). First, it displays in which situation a subject was performing a review. Second, it presents the results in this situation after 1 hour. Third, it shows for every subject the number of comments that have received a certain rating. The left number is the rating by the architect and the right number is the rating of the review expert. In total, 203 comments were collected for the first hour and 94 more in the second hour.

Preparation

The preparation went smoothly and followed the description outlined in the experimental design (see Section 4.7.2).

Data Collection Performed

The data collection performed followed the description of Section 4.7.2. There was one exception: subject 16 only performed the first hour of the review of Chapter 1, whereas 2 hours were planned. Since the analysis of this experiment concentrates on the first hour, this deviation has no influence on the experiment results.

Validity Procedure

No crashes occurred during the experiment. Assistance with the Document Knowledge Client was needed during the first execution of the experiment, as the color scheme used to color KEs according to their type in the document was not clear. The supplied one page leaflet was updated to include this information.

4.7.4 Analysis: Quantity

Descriptive Statistics

Based on the results presented in Table 4.2, we evaluate the quality and quantity of the comments. For the quantity, we count the number of comments per reviewer per situation. This number in turn is averaged over all the reviewers in a particular situation. Figure 4.10 presents the resulting average number of comments of the reviewers per situation.

[Bar chart "Average amount of comments per reviewer per situation"; y-axis: average #comments per reviewer. Situation 1: 7.91, Situation 2: 4.82, Situation 3: 6.3]

FIGURE 4.10: Average number of comments of the reviewers per situation

Data Set Reduction

The comments displayed in Table 4.2 are those comments the reviewers themselves labeled as improvements and not as positive remarks. Since the positive remarks do not add value to the review process, we have left these out.

Hypothesis Testing

Figure 4.10 shows that, in the experiment, on average the subjects make more comments when consuming formal and documented knowledge (situation 1) compared to a normal review (situation 3). This supports hypothesis H1. In addition, the subjects seem to make fewer comments when producing formal AK (situation 2) compared to a normal review in which only documented AK is consumed (situation 3). Hence, we reject hypothesis H2 and consider the associated null hypothesis H02. However, the question is whether these found differences are statistically significant.

To calculate the t-test value, the standard deviations of the results per situation are needed. In short order, these are: 7.66 (situation 1), 4.71 (situation 2), and 6.15 (situation 3). Based on these values, the data of Table 4.2, and Figure 4.10, we find the following confidence levels for hypotheses H1 and H02 as shown in Table 4.3.


TABLE 4.3: Confidence levels for H1 and H02

Hypothesis   Situations   Confidence   p
H1           1 > 3        0.6980       0.3020
H02          2 < 3        0.7296       0.2704
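The confidence levels for the quantity analysis can be recomputed from the comment counts. The sketch below is an illustration under stated assumptions rather than the procedure actually used in the experiment: it assumes the classic pooled-variance (equal variances) form of the one-tailed Student t-test, takes the number of comments per review from the column totals of Table 4.2, and evaluates the t distribution by numerical integration so that only the Python standard library is needed. All function names are hypothetical.

```python
import math
from statistics import mean, stdev

# Comments per review in the first hour, grouped by situation (column totals of
# Table 4.2, improvements only). Assumption: subject 7 counts with 9 comments.
COMMENTS = {
    1: [2, 1, 5, 9, 4, 1, 11, 1, 16, 12, 25],
    2: [4, 4, 13, 14, 7, 1, 4, 3, 2, 0, 1],
    3: [3, 6, 2, 0, 20, 9, 3, 5, 13, 2],
}


def pooled_t(a, b):
    """Two-sample Student t statistic (equal variances assumed) and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail_p(t, df, steps=2000):
    """One-tailed p-value P(T >= t), using Simpson's rule on [0, |t|]."""
    h = abs(t) / steps
    total = t_pdf(0, df) + t_pdf(abs(t), df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area


def confidence(larger, smaller):
    """Confidence level (1 - p) that `larger` has the higher mean."""
    t, df = pooled_t(larger, smaller)
    return 1 - upper_tail_p(t, df)


def interpret(conf, alpha=0.05):
    """Apply the thresholds of this chapter to a confidence level."""
    if conf > 1 - alpha:
        return "statistically accepted"
    if conf >= 0.80:
        return "strong indicator"
    return "weak indication"


for label, (high, low) in {"H1 (1 > 3)": (1, 3), "H02 (2 < 3)": (3, 2), "1 > 2": (1, 2)}.items():
    c = confidence(COMMENTS[high], COMMENTS[low])
    print(f"{label}: confidence = {c:.4f}, p = {1 - c:.4f} ({interpret(c)})")
```

Under these assumptions, the means and standard deviations of the three lists agree with the values reported for Figure 4.10 and in the text, and the computed confidence levels should land close to those of Table 4.3.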

[Bar chart "Average comment quality per situation"; y-axis: average comment quality (1-5). Situation 1: 2.62, Situation 2: 2.69, Situation 3: 2.32]

FIGURE 4.11: Average quality of comments of the reviewers per situation

4.7.5 Analysis: Quality

Descriptive Statistics

We analyze the results for the quality of the comments by calculating an average comment rating score for each subject in each situation. The Pearson Product Moment Correlation Coefficient [158] between the two raters is rather low, i.e. r = 0.29. Hence, using the average is a rather conservative way to deal with this. We calculate this average using the following equation:

\[
\frac{\sum_{i=0}^{n} cwa_i + \sum_{j=0}^{n} cwer_j}{2 \cdot n}
\]

In this equation, n is the number of comments of a subject. cwa_i and cwer_j are the ratings the architect and the review expert have given as quality of a comment, i.e. the values from Table 4.2. Thus we calculate an average comment rating for each subject in each situation. In turn, these averages are used to calculate the average quality of a comment per situation. The results of this calculation are presented in Figure 4.11.
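As a worked illustration of this calculation, the sketch below computes the average comment rating per review and the resulting average for situation 3. It is a minimal example and not the original analysis script: the rating counts are transcribed from Table 4.2, the function name `average_rating` is hypothetical, and the reviews of subjects 8 and 9 are left out of the situation average, following the data set reduction described in this section.

```python
from statistics import mean

# Rating counts per review: index k holds the number of comments rated k + 1.
# Each entry is (architect counts, review expert counts), transcribed from
# Table 4.2 for the reviews performed in situation 3.
SITUATION_3 = {
    "subject 1, Chapter 1": ([0, 1, 1, 1, 0], [0, 2, 1, 0, 0]),
    "subject 5, Chapter 1": ([0, 5, 1, 0, 0], [1, 1, 4, 0, 0]),
    "subject 16, Chapter 1": ([1, 9, 9, 1, 0], [2, 7, 8, 3, 0]),
    "subject 2, Chapters 2 & 3": ([2, 6, 0, 1, 0], [4, 3, 2, 0, 0]),
    "subject 4, Chapters 2 & 3": ([0, 1, 1, 0, 1], [1, 2, 0, 0, 0]),
    "subject 10, Chapters 2 & 3": ([0, 3, 2, 0, 0], [2, 3, 0, 0, 0]),
    "subject 13, Chapters 2 & 3": ([2, 4, 6, 1, 0], [6, 5, 0, 2, 0]),
    "subject 15, Chapters 2 & 3": ([0, 1, 1, 0, 0], [0, 1, 1, 0, 0]),
}
# Subject 8 also reviewed Chapter 1 in situation 3, but is excluded from the average.
SUBJECT_8_CHAPTER_1 = ([0, 1, 0, 1, 0], [0, 0, 0, 1, 1])


def average_rating(architect, expert):
    """Average comment rating of one review: (sum cwa + sum cwer) / (2 * n)."""
    n = sum(architect)
    if n == 0 or sum(expert) != n:
        raise ValueError("both raters must have rated the same non-zero number of comments")
    total = sum(count * (k + 1)
                for counts in (architect, expert)
                for k, count in enumerate(counts))
    return total / (2 * n)


per_review = {name: average_rating(a, e) for name, (a, e) in SITUATION_3.items()}
situation_3_average = mean(per_review.values())

for name, value in per_review.items():
    print(f"{name}: {value:.2f}")
print(f"Situation 3 average: {situation_3_average:.2f}")
print(f"Subject 8, Chapter 1 (excluded): {average_rating(*SUBJECT_8_CHAPTER_1):.2f}")
```

The function raises an error for a review without comments or with an unequal number of ratings per rater; such cases (subjects 8 and 9 without comments, and the one comment of subject 7 that the architect did not rate) need the special handling described under Data Set Reduction.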


TABLE 4.4: Confidence levels for H3 and H4

Hypothesis   Situations   Confidence   p
H3           1 > 3        0.9584       0.0416
H4           2 > 3        0.9292       0.0708

Data Set Reduction

Note that for subject 7 these numbers don’t add up, as one comment was not rated by the architect. Since the average of both reviewers is used as the metric of the quality of a comment, we use for this comment only the quality rating given by the review expert.

Another complication in the quality calculation is subjects that have no comments, i.e. subject 9 for Chapter 1 and subject 8 for Chapters 2 & 3. For these two subjects an average quality of their comments cannot be determined. Hence, we exclude them from the calculation of the average comment quality per situation.

Hypothesis Testing

Comparing Figure 4.11 with the quantitative results presented in Figure 4.10, it is surprising that the average quality of the comments in situation 2 is higher than that in situation 3, although there are fewer of them. This indicates that producing AK deepens the understanding of an architecture document, but reduces the broadness of this understanding.

To determine whether the found differences are statistically significant, we use the same student t-test as for the quantitative part. The standard deviations for the three situations are: 0.39 (situation 1), 0.62 (situation 2), and 0.28 (situation 3). Based on these numbers, we find the confidence levels for hypotheses H3 and H4 in Table 4.4.

4.7.6 Interpretation

Evaluation of Results and Implications

For the quantity of the comments, we have hypotheses H1 and H02. Based on the results, we cannot statistically accept hypotheses H1 and H02. However, the result does give a weak indication that an improvement in the number of comments for situation 1 over situation 3 is likely and that the opposite is the case for situation 2 compared to situation 3. For completeness, we also calculated whether situation 1 is an improvement over situation 2. The confidence we find for this improvement is 0.8661, which is not statistically significant, but still a strong indicator that this difference might exist.

Based on the results for the quality of the comments, we conclude that there is a strong indication for H4. However, the hypothesis lacks the confidence to be statistically accepted. For H3 this is different, as for this hypothesis the data does statistically support the hypothesis. Thus, the quality of the comments, on average, is better when consuming formal and documented AK than when only consuming documented AK. Consequently, we conclude that the understanding of the software architecture is deeper when using formal and documented AK than when using only documented AK.

Limitations of the Study: Threats to Validity

There are several threats to the validity of the quasi-controlled experiment. The following list presents these threats and our mitigation strategy to deal with them. We have categorized the threats as being either for internal or external validity.

Internal:

• Learning effect The presentation of the LOFAR domain model and the exercise with the Document Knowledge Client before the start of the reviews might influence the behavior of the subjects in situation 3, since the subjects learn to spot relevant KEs. In addition, the use of the tool in situation 1 and 2 for the first review might influence a subject’s second review performance for the same reason. Since we did not want to complicate the execution of the experiment, we did not explicitly mitigate this issue.

• Subjectivity of comment quality ratings The rating of a comment is a subjective measurement. This threat is mitigated by using the average rating of two persons instead of one to rate the quality of the comments.

• A comment is not a useful unit A reviewer can comment on different parts of the architecture in a single comment. Hence, a comment might not be a useful measurement unit, as the same comment could be represented by multiple other comments. To mitigate this threat, we manually inspect all the comments to detect and repair such cases by splitting them up into separate comments.

• Personal bias of subjects One subject might be a much better reviewer than another subject. Hence, there is a personal bias of the subjects that might influence the experiment. To mitigate this threat, we let each subject perform his/her two reviews in different situations. Consequently, the personal bias is (partially) mitigated, as 2 out of 3 situations are influenced by this bias. For example, subjects 8 and 9 are excluded from the analysis of the quality, as they have more than zero comments for only one situation. If these subjects were included in the analysis then H3 would have a confidence of 0.74 and H4 0.55, since one contributes with a very high (3.75) score for situation 3 and the other with a very low (1) score for situation 1. Hence, including them would introduce a big bias in our experiment, especially as they do not "compensate" in another situation.

• Background bias of subjects Besides the aforementioned personal bias, the subjects are also biased by their background, i.e. their education and experience. We mitigate this effect by evenly distributing the subjects over the situations based on their backgrounds.

• Bias of quality ratings The people rating the comments may favor one situation over another. We mitigate this effect by performing these ratings “blindly”, i.e. the person rating does not know from which situation the comments come. In addition, these people did not participate as subjects in the experiment nor are they the authors of this publication.

External:

• Domain dependency The experiment takes place in the context of radio telescopes. Hence, the findings might not be representative for other application domains. We try to mitigate this threat by having non-domain experts (i.e. master students) in our subject population. However, replication of the experiment in another application domain is advisable.

• Scalability of the findings The experiment is centered around one relatively small (± 50 pages) document. Hence, the question is whether these findings still hold for the entire architectural documentation. We have tried to partially address this by using a document that provides an overview of the entire software architecture and relates to more specialized documents.

Inferences

Besides the four hypotheses, there are some additional observations concerning the quasi-controlled experiment. Looking at Table 4.2 and recalling the standard deviations presented earlier for the different situations, we see big differences in the performance of our subjects. For example, subject 9 produces 0 comments in an hour about Chapter 1, whereas subject 16 manages to produce 20 in the same amount of time and situation. Hence, the primary factor influencing the experiment seems to be a subject’s reviewing capabilities. The situation in which this review takes place is more of a secondary influence.

4.7.7 Lessons Learned and Discussion

To collect the non-quantified experiences of the subjects, the experiment ended with a discussion session (see Section 4.7.2). In this subsection, we present these experiences.

The subjects have two remarks regarding the setup of the experiment. The first remark is that we test the reviewers’ performance in situation 3, i.e. only consume documented AK, with the software architecture document displayed on a computer screen. Many subjects prefer to review a document on paper as this offers reading benefits computer screens cannot deliver [145]. In the design of the experiment, we use computer screen reading for all situations so as not to influence the result by the distinction of reading on paper versus reading from a computer screen.

Secondly, some of the subjects felt restrained in situation 1, i.e. consume documented and formal AK, as they are not allowed to make annotations themselves besides the provided ones. We decided beforehand to leave out this case, so as to have a clear distinction between consuming and producing formal AK.

Based on some of the subjects’ comments, we have improved the usability of the Document Knowledge Client. In short, these comments have led to four significant changes. First, changing the type of a KE has become possible and no longer requires removal of the old KE. Second, a huge performance improvement has been made when switching between different documents. Third, the user interface menu for the connections is revised to make it easier to see and edit the connections a KE has. Fourth, the tool now supports some basic keyboard shortcuts.

The subjects indicate that in situation 2, i.e. producing formal AK and consuming documented and own produced formal AK, they operate in two different processes. In the first process, they read the document and try to make annotations of the AK they discover, i.e. they produce AK. In the second process, they read these annotations and the accompanying texts to distill review comments, i.e. mostly consume AK. Most subjects start off with the first process followed by the second process. However, in following iterations many subjects say they skip the first process altogether, as they feel slowed down by making annotations.

We asked the subjects for their opinion about the domain model. Generally speaking, the subjects were satisfied with the model. However, the software architecture document was lacking decision topics. Only with hard thinking could these be constructed. A problem is where to attach the annotations of these non-existing decision topics.

Two of the subjects indicate that they find it hard to get an overview of the software architecture with the Document Knowledge Client. The explorer tool (see Section 4.4.3) might provide this wanted overview. However, this tool was not part of the experiment. Testing the suitability of the explorer for this role is future work.

We also asked the subjects if they would use the tool again after the experiment. Several subjects indicate that they would like to use the tool when writing a software architecture document so as to improve its quality. The majority of the subjects would use the tool again for a review if the annotations were provided. The remaining minority prefers to use paper for their reviews.

4.8 Related Work

The approach presented in this chapter is based on annotating documents to make their knowledge explicit. Similar approaches exist in the context of the semantic web in the form of annotation tools, e.g. MnM [181] and Annotea [104]. However, all of them focus on annotating web pages and plain text, not the Word documents in which software architectures are typically written.

Closely related to annotating is the field of Information Extraction (IE) [46], where the challenge is to automatically annotate or extract information from documents or find relevant relationships between annotations and/or documents. Usually, this involves machine learning, natural language processing, and/or statistical techniques. For example, Ont-O-Mat [78] uses machine learning to semi-automatically annotate similar documents. Another example is the work of [24], who use latent semantic analysis [128] on (architectural) documents to find relationships between these documents. Based on these relationships an order of reading the documentation is suggested.

Most software architecture documentation approaches [42, 90, 116] use architectural views to describe different aspects of an architecture. This is reflected in the IEEE 1471 standard, i.e. a recommended practice for architecture description of software-intensive systems [95]. The approach presented in this chapter can be used in conjunction with such approaches. Documentation approaches define for each view what concerns are of interest and how they should be described. Our approach provides the necessary glue in the form of traceability to relate the views and their underlying decisions [53, 100] together.

Another way to support AK capturing in architecture documentation is the use of templates. [180] present such a template for architectural decisions. However, templates typically have difficulty in making multiple relationships between different elements clearly visible. Visualizations like the one in the Explorer (see Section 4.4.3) are much more capable of dealing with multiple relationships.

In the last couple of years, several AK management tools have been developed. This includes web-based tools like PAKME [12] and ADDSS [39], which focus on design patterns and architectural decisions, or more implementation focused tools like Archium [100] and AREL [172]. The main difference between the Knowledge Architect and these AK management tools is that the Knowledge Architect uses specialized plug-ins to integrate with different AK sources, something these other tools do not do.

Compared to the meta-models of these AK management models or general meta-models like Kruchten’s ontology [119], the template from [180], or the CORE model [23], the domain model used in this chapter is rather simplistic, both in the number of concepts and relationships and in the provided detail. This is due to two reasons. Firstly, our domain model focuses on AK in software architecture documents in a specific organization instead of AK in general. Secondly, we strived for a minimal model that was just good enough, so as to make it easier to understand.

The just-in-time AK sharing portal of [63] is similar to our AK repository: a central storing location for AK. The main difference between the two approaches is the assumed knowledge management strategy [13, 79]. The just-in-time AK sharing portal focuses on a personalization strategy, whereas the Knowledge Architect follows a codification strategy. Hence, the just-in-time AK sharing portal focuses on knowledge about where the AK can be found, instead of modeling this AK itself, as is done with the domain model in our approach.

Besides our quasi-controlled experiment, [59] performed an experiment regarding AK. They evaluated whether making decisions, goals, and alternatives explicit in the form of tables improves individual and team decision making. The results of their experiment indicated this indeed seems to be the case.

Another way to formally describe AK is to make use of an Architecture Description Language (ADL) [138]. ADLs offer a formal model to express certain concepts and relationships. Often the selection of concepts supported is limited to those of the component & connector [42] view. The Knowledge Architect could be extended with a domain model describing the concepts and relationships found in ADLs, thereby making integration with existing ADLs possible.

4.9 Limitations

The presented method and the supporting tool are based on codifying AK by enriching architecture documentation with formalized AK. Despite the benefits of resolving the challenges identified in Section 4.2, the approach suffers from the fundamental limitations of AK codification:

• Cost The biggest and foremost limitation is the cost of capturing and maintaining the AK by means of a formalism. Most knowledge management approaches assume that the knowledge is already formalized and readily available. However, in practice this is often not the case for AK. Hence, minimizing the cost of capturing and maintaining AK is as important as maximizing the benefits that formalized AK offers. Our approach attempts to minimize this cost by offering integrated tool support that automatically reuses as much context as possible. For example, with the Word plug-in the user does not have to retype a description; simply selecting the text is good enough. However, the cost of annotating an architecture document of a large and complex system remains substantial.

• Start-up problem For most approaches, formalization of knowledge only starts to provide tangible benefits once a significant part of the knowledge has been formalized [94]. Consequently, in the initial stages of knowledge capturing, no benefits are perceived from the point of view of knowledge creators. This in turn discourages people from capturing knowledge, which leads to less formalized knowledge and fewer benefits. Hence, an approach that is incremental in nature is needed. With the Document Knowledge Client we offer such an incremental approach, since a software architecture document does not have to be completely annotated to start having (limited) benefits from these annotations. This partially solves the start-up problem but does not eliminate it.

• Asymmetric benefit The people who capture AK during the architecting process (i.e. the producers) are often not the people using this knowledge (i.e. the consumers). Formalized AK can easily provide benefits for consumers. However, the producers do not usually perceive direct benefits for themselves. This results in an asymmetric benefit between producers and consumers of AK, and therefore a lack of motivation for producers to capture complete and high quality AK. Hence, a good codification approach should not only offer benefits for the consumers of the knowledge, but also for the producers. With the Document Knowledge Client we offer such benefits through the different types of checking (correctness, completeness and consistency). These features help architects writing a software architecture by reminding them which parts of the architecture still require further attention. However, providing motivation for knowledge producers also largely depends on organizational and process issues.

4.10 Future Work

In this chapter, we proposed an approach for enriching architecture documentation with formal AK. This approach addresses to some extent the challenges that current software architecture documentation approaches face: understandability, locating relevant AK, traceability, change impact analysis, design maturity assessment and trust. We illustrated the approach through a large industrial example, by following the method activities and demonstrating the corresponding tool support. Using a quasi-controlled experiment we presented evidence on how the proposed approach helps to tackle one of the challenges: understandability. Based on our experience from LOFAR, the associated ongoing empirical research, and the development of the Knowledge Architect tooling, we see several directions for future work.

In this chapter, the focus was on a subset of activities of our generic method (see Section 4.3). This limited focus left out two important topics concerning documentation enrichment with AK: integration (activity 5) and evolution (activity 6). For the first activity, we have already started to investigate the possibility of integrating different domain models as a way to integrate AK from different sources [131]. As for the second activity (evolution of AK), we have already combined the knowledge repository of our tool suite with a version management system that records the evolution of individual AK entities. The initial results for both the integration and the evolution activities look promising, but need to be validated with industrial case studies.

Another direction for future work is to investigate ways to improve the search functionality within the Document Knowledge Client. Currently, relating Knowledge Entities in the tool is based on keyword searches and concept classifications. The results of this search might be improved by using information extraction techniques [46] that make use of the context of the search, i.e. the Knowledge Entity to be related, its position inside the document, and relationships already defined to other KEs. Furthermore, we are looking into ways to make the Knowledge Architect interoperable with other AK management tools, e.g. the just-in-time AK sharing portal [63], in order to make more and different kinds of AK available.

In the quasi-controlled experiment, we left out the situation in which subjects could both consume existing formal AK and produce their own. It is interesting to investigate whether this situation is an improvement over the ones presented in this chapter. Another direction for the experiment is to replicate it, so as to create more samples. This will allow for stronger statistical evidence and testing of the assumption we made in this chapter about the normal distribution of our samples. Moreover, we plan to use the data from the experiment to investigate how people annotate AK inside documents and thereby better understand the process of producing formal AK. An interesting research question in this respect is whether there are differences in the way people annotate a software architecture document and how those differences affect the produced AK.

Acknowledgements

This research has been sponsored by the Dutch Joint Academic and Commercial Quality Research & Development (Jacquard) program on Software Engineering Research via contract 638.001.406 GRIFFIN: a GRId For inFormatIoN about architectural knowledge. We would like to thank the people from ASTRON who participated in this research, in particular, Kjeld van der Schaaf. Special mention goes to the master students Jens Rasmussen, Natasja Sterenborg, Hubert ten Hove, Alex Haan, Joris Best, and Marco van der Kooi for their work on the various parts of the Knowledge Architect tool suite. In addition, we would like to thank Industrial Software Systems of ABB Corporate Research for providing the opportunity to revise this publication.


Chapter 5

Exploring the Context of Architectural Decisions

“It’s tough to make predictions, especially about the future.”

- Yogi Berra

This chapter is based on: Jan Salvador van der Ven and Jan Bosch. “Architecture Decisions: Who, How, and When?” In: Agile Software Architecture. Ed. by Muhammad Ali Babar, Alan W. Brown, and Ivan Mistrik. Boston: Morgan Kaufmann, 2014, pp. 113–136.

5.1 Introduction

In the past decade, the creation of software systems has changed rapidly. Traditionally, long-lasting waterfall projects (>2 years) were standard, whereas now rapid development (<3 months) with fast-changing requirements is becoming the norm for creating software products. In both cases, the architecture of the system has to be taken into account, although when, how, and who is responsible differs significantly. Formerly, experienced architects created models and documentation for the system beforehand, so the development team had a solid base of decisions on which to build. Nowadays, in more agile projects, the architectural decisions are made just in time by the development team itself, often assisted by a participating architect. Alternatives for heavy template-based documentation, like wikis or photos, are used to document the decisions.

This leads to a change in the responsibilities, and in the role, of the architect. The responsibility for the architectural decision process shifts from the formal architect to the development team. The architect takes on more of an advisory servant role within the project, often participating in the development team as a designer or developer. This difference, and thereby the newly needed alignment between agile and architecture, is the topic of this chapter.

In recent research, architecture in agile software development has been a topic of hot debate [2, 31]. While some authors emphasize the importance of architecture with Agile [45], others have described their experiences with Agile in product lines [5, 29]. This chapter contributes to the debate by presenting a framework that helps in identifying alignment problems in agile and architecture. Our framework is validated by case studies.


We have conducted a literature search on architectural decisions, the role of the architect, and how these decisions are documented. On the basis of this search, together with our own experiences, we constructed our Triple-A (Agile, Architecture, Axes) Framework. This framework identifies three different axes. The first axis describes the person making the decision (often called the architect). The second axis shows the way in which the architectural decisions are communicated (e.g., the artifacts used). The third axis describes the length of the decision feedback loop, the periodicity of a decision. We show that these axes can be used to profile a project or case study and that positions on these axes can be indicators for the success of a project.

The contribution of this chapter is twofold. First, we present a framework that helps project teams and software development organizations understand how they handle architecture. Second, we provide case study material that shows the effects of changing the architecture decision process. This helps organizations that are gaining agility identify what points of their architecture process need improvement.

The next section describes the research methodology used, followed by a description of our Triple-A Framework. Then our industrial cases are described. On the basis of our evaluation of the shifts in these cases and their resulting effects on business, the Triple-A Framework is validated in Section 5.5. This chapter ends with related work, reflections on further research, and conclusions.

5.2 Research Methodology

We took the following steps to create the theory presented in this chapter. (1) During our participation in the industrial cases, we iteratively discovered changes in our case studies (for better or worse). (2) To emphasize the changes (described as shifts in Section 5.4), we have taken two points in time per case study and described the differences between them. We made the discovered changes explicit and categorized them. (3) Triggered by our experience while working on our cases, we conducted a thorough literature search of related projects and models. By generalizing our research and experiences, we created the axes that are the core of our Triple-A Framework. (4) We validated our framework with our case study material. (5) Finally, we identified several problems that occurred in our case studies and related them to changes in our model.

In our case studies, we used a comparative multi-case analysis methodology [55]. Our initial theory building and measurement was done as an iterative process during the full period of all the case studies. We used longitudinal case studies covering periods ranging from 9 to 48 months. In some cases, the qualitative data obtained by the participant observer was complemented by interviews with key participants in the project or product development team. In other cases, the qualitative data was discussed with participants of the projects to validate the findings.

For each case, the research started with a discussion about what happened during the case, resulting in the descriptions in Section 5.4. Two phases were identified, resulting in shifts that were rated according to business impact. While some of the cases involved a positive shift, other cases showed a negative shift between the phases. These shifts were used to validate our research-based Triple-A Framework. To determine the business impact of the shifts in the case studies, three success factors were used:

• The return on investment (profit minus investment cost)

• The speed of the project (whether the project progressed as scheduled and finished on time)

• The quality of the delivered project (whether the customer and the team were satisfied with the quality)

The return on investment is measured by the success of the projects: Did the project actually deliver? What were the investments and the cost? This evaluation was done after the fact and based on estimations of the researchers who participated in the project because first-hand financial data was not available to them. The speed of the project was measured by the speed of delivery of functionality (or changes in case of bugs) as perceived by the participant observer. The researchers based their quality assessment on discussions with customers and project team members.
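As an illustration only, the three success factors could be recorded per phase and compared across a shift as sketched below. The class name `PhaseEstimate`, the function `shift_direction`, the boolean encoding of speed and quality, and the rule for combining the factors are all assumptions made for this sketch; the chapter itself only defines the return on investment as profit minus investment cost and rates the shifts qualitatively.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseEstimate:
    """Researcher estimates for one phase of a case (hypothetical structure)."""
    profit: float                # estimated profit, arbitrary currency units
    investment_cost: float       # estimated investment cost, same units
    on_schedule: bool            # did the project progress as scheduled?
    quality_satisfactory: bool   # were customer and team satisfied?

    @property
    def return_on_investment(self) -> float:
        # Definition from the text: profit minus investment cost.
        return self.profit - self.investment_cost


def shift_direction(before: PhaseEstimate, after: PhaseEstimate) -> str:
    """Label a shift between two phases (assumed combination rule)."""
    deltas = [
        after.return_on_investment - before.return_on_investment,
        int(after.on_schedule) - int(before.on_schedule),
        int(after.quality_satisfactory) - int(before.quality_satisfactory),
    ]
    improved = any(d > 0 for d in deltas)
    worsened = any(d < 0 for d in deltas)
    if improved and worsened:
        return "mixed"
    if improved:
        return "positive"
    if worsened:
        return "negative"
    return "neutral"
```

Because the financial figures in the cases were estimated after the fact, any numbers fed into such a structure should be read as rough indications rather than measurements.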

The cases we use as a basis of our theory are not selected at random. From our experience, other cases could have been chosen. However, as stated by Eisenhardt [55], in case study research it is “neither necessary nor preferable to randomly select cases.” We have selected the cases that contained the clearest shifts, as described in Section 5.4. The cases are similar in that they all involved relatively small, collocated teams facing complex, real-life problems, but they cover a variety of situations: a small product company (case Epsilon), small projects at large companies (case Beta), moderate-size projects at large customer sites (cases Alpha and Gamma), and a large company changing its way of working (case Delta). Three of the cases primarily involved the development of new products (Alpha, Beta, Gamma), whereas two involved evolution and maintenance of the running system (Delta, Epsilon) in addition to the creation of new functionality.

5.3 The Agile Architecture Axis Framework

In software architecture literature, many models are used to describe the software architectural decision process. For example, different models [23], templates [180], or ontologies [114] are used to describe architectural knowledge. Several authors have compared the ability of available models to process architectural design decisions. For example, De Boer et al. [23] describe a “core model” for architectural knowledge. They present a model and validate it through interviews and by testing it against existing models in the literature. Bu et al. [34] analyzed nine different approaches for describing design decisions. Such topics are beyond the scope of this chapter; however, we have seen that there are three essential aspects of the architecture creation process that are rarely thoroughly described:

• The Architect. In architectural knowledge literature, the architect is referred to as the person responsible for the architecture, or the person who makes the architectural decisions. However, who this person is and what his or her skills are is rarely discussed. The effects of these skills on the results of the project are also seldom written about.

• Artifacts. This term is often used as an abstraction of all things that are created during the architecture development process. Examples of artifacts that are mentioned are documents, models, or source code. However, the effects of the types of artifacts used on the outcome of a project are rarely researched.

• Periodicity. The “decision loop” [23] describes how decisions lead to new decisions based on the alternatives chosen. However, the periodicity of the decisions is rarely described. What is the length of time between the actual decision and the resulting validation of that decision against the quality attributes that the system needs to comply with?

The Triple-A Framework consists of three axes for describing where the decision process can differ in projects or companies. Each axis focuses on a different aspect of the architecture decision process as described above. In the following subsections, each axis is discussed by first describing relevant literature, followed by the points that we identified on the axis.

5.3.1 Who makes the Architectural Decisions?

Although there has been some debate about the role of the architect [118], in most literature concerning architectural design decisions or architectural knowledge the term architect is used but not well defined. For example, in the survey about architectural knowledge [23], the term architect(s) is used 11 times, without defining who this person is or mentioning this person in the described core model. In another survey [34], the term is only used three times, again without a definition or explanation. In [62], the work of architects is described, based on a survey conducted with a large group of architects. The authors mention that they included architects of different types in their research (“...including software, IT, solution, enterprise, and infrastructure architects”), but do not mention what the exact skills are or what effect a certain architect could have on the project. Kruchten [118] emphasizes that the architect has a broader role than just making the architectural decisions. In this chapter, we focus on the actual decision-making, not on the other things architects do.

Often, we have seen that people other than the formally assigned architects make architectural decisions. For example, the product owner (customer or domain decisions) and the development team (technical decisions) are heavily involved in, and sometimes responsible for, the architecture decision process. We have extended the scopes for architects described in [135] (Enterprise, Domain, Application) with two additional roles we encountered in our industrial cases (Management, Development Team). Note that the described roles are not one-to-one mappings to position names. They should be interpreted more as baskets for skills that a person with this role possesses.

• Management: Management can consist of company or project managers. Managers can have significant influence on the architecture decision process. However, the main focus of management is on project properties (on-time, in-scope). Typically, management lacks knowledge of the actual technical background of the system or the customer’s specific functional demands. Because management often has a high position in the hierarchy of the organization, its decisions are hard to debate.

• Enterprise Architect: Enterprise or solution architects are typically responsible for the decisions at enterprise scope [135]. They often have a thorough background in theory and sometimes in practice.

• Domain Architect: Domain architects [135] are often customer-employed people who have a thorough understanding of what the customer actually needs. The role of product owner is typically present in Scrum [109] projects. This person functions like a domain expert, but is not necessarily part of the customer organization, and is often close to the team. Domain experts or product owners are often responsible for architectural decisions that have a high functional impact.

• Application Architect: The application architect [135] is typically an architect who also writes code as a team member. To make the right technical architectural decisions, he or she must have up-to-date knowledge. Typical role names that we encountered for these people are senior developer or technical architect.

• Development Team: The development team consists of the people involved in the actual designing, coding, testing, and deploying of the system. This includes people with architecting skills. The responsibility for the decisions lies with the whole team, in contrast to the previous roles.

Although often more than one role is involved in the decision process, in our experience there usually is one (sometimes assigned) role that has the formal or informal responsibility for making the decisions. The “who” axis describes which person or persons carry the main responsibility for the architecture decision process.

5.3.2 What Artifacts are used to Document the Decision?

Architectural design decisions are often traceable to, or represented in, artifacts. Many authors use the term artifact [23, 114] to describe that the decisions have a representation somewhere. Some authors emphasize that the decisions themselves should be first-class entities or artifacts in the design [100], or provide templates to make the artifacts for design decisions [180]. As described in [79], knowledge can be shared personally (remembering, talking, etc.), or more formally (documenting, modeling). We have identified the following points, ordered from heavy to lightweight documentation approaches:

• Mandatory Template-based Documentation: These artifacts are architecture documents that have to be created because they are part of an offer, agreement, or project plan. Examples of these documents are the functional and technical design documents.


• Facultative Template-based Documentation: The Rational Unified Process (RuP) [117] has a very extensive set of document templates that can be used in software projects. An example of an architectural deliverable is the software architecture document. Because they are not mandatory, it is easier to decide not to use one of the documents. The templates add unnecessary complexity to writing down architectural decisions. On the other hand, templates can help to make sure certain aspects of a design are not forgotten.

• Ad hoc Documentation: This documentation type consists of all the random documentation that does not comply with a template and that is present in almost every software development project. It can be structured on a common share or (semantic) wiki [80], or unstructured by e-mail or ad hoc sharing. Because such documents do not have to follow a template, they are typically quicker to write, but with the risk of forgetting important aspects.

• Meeting Notes (sketches, photos, etc.): Meeting notes, in written or visual form, can be used to create very lightweight documentation of design meetings or open spaces. This method has two advantages: it is very quick, and the details of a meeting are easier to remember when seeing the drawings from the meeting. However, it can be difficult for people who were not involved in the meeting to understand the decisions.

• Direct Communication: Direct communication is the tacit “documentation” that takes place every day. This can be in a chat, on the phone, or face-to-face. Direct communication is the richest form of communication because it is bidirectional (it is possible to ask for explanation) and multiple senses can be used.

Projects usually do not use just one of these communication approaches. However, there is usually one medium for reading and writing architectural decisions that tends to be preferred by the team. We use the items mentioned above as points on our “how” axis.

5.3.3 What is the Feedback Loop of an Architectural Decision?

During both the initial development and the evolution of the system, architectural decisions are made with a goal in mind. Often, these decisions are made to satisfy nonfunctional requirements or to bring the quality attributes to a certain level. However, sometimes the confirmation of the suitability of the decision takes a long time. Kruchten [118] describes antipatterns where architects are disconnected from the actual team. In our opinion, these patterns show what happens if the feedback loop from the decision to the actual validation of the decision becomes too long. Some authors have suggested the use of templates with “states” [114] or “status” information [180] to be able to document where the decision is in the feedback loop. However, neither of these representations of the state of the decision describes when a decision is actually implemented and validated.

To get a better idea of the validity of a decision, assessment methods like the Architecture Tradeoff Analysis Method (ATAM) [107] are used to increase the confidence level for the decision. However, final validation occurs when the decision is implemented in the system and is measured through usage. Typically, three points in time are relevant for an architectural decision: the time the decision was made (“Decided” in [114] and [180]), the realization of the decision, and the validation that it was a correct decision (the latter two are nonexistent in the described literature). We take the time between the initial decision and the validation as the periodicity of the decision.

• Long (>6 months): In long-running projects or offers, sometimes architectural decisions are made before the project starts. It can take several years before the decision is implemented and validated.

• Medium (1-6 months): Often, a proof of technology or Proof of Concept (PoC) is used to validate whether an architectural design decision has the desired result. These decisions are typically validated quicker than the long-running project decisions.

• Short (<1 month): In more agile settings, the validation of decisions can be much quicker. Especially when the technical infrastructure for continuous delivery is in place, the time between decision-making and validation can be decreased. The shortest cycle can be achieved during refactoring, where the decision is changed on the existing code base.

Of course, some decisions cannot be validated in the short term. Also, a typical project has more than one decision point. However, projects tend to lean more towards one point on this axis, characterized by the organization style of the project. We use these three points in our framework as the “when” axis.
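A minimal sketch of how the three points on the “when” axis could be derived from an observed feedback loop is given below. The names `Periodicity` and `classify_periodicity` are hypothetical, and treating exactly 1 month and exactly 6 months as “medium” is an assumption about the boundaries, which the text leaves open.

```python
from enum import Enum


class Periodicity(Enum):
    """Points on the "when" axis, as listed in the text."""
    SHORT = "short (<1 month)"
    MEDIUM = "medium (1-6 months)"
    LONG = "long (>6 months)"


def classify_periodicity(months_to_validation: float) -> Periodicity:
    """Map the time between decision and validation to an axis point.

    Assumption: the boundary values 1 and 6 months count as MEDIUM.
    """
    if months_to_validation < 0:
        raise ValueError("time to validation cannot be negative")
    if months_to_validation < 1:
        return Periodicity.SHORT
    if months_to_validation <= 6:
        return Periodicity.MEDIUM
    return Periodicity.LONG
```

A project contains many decisions, so in practice such a classification would be applied to the typical feedback loop of a project rather than to a single decision.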

5.3.4 Summary of the Axes

To summarize, we have identified three axes that can be used to describe the architecture decision process in projects. In Figure 5.1, we visualize our Triple-A Framework in a radar plot.

In the following section, five industrial cases are introduced. All of the cases have a representation on the Triple-A Framework, which is evaluated in Section 5.5.
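The three axes can also be read as a small data structure. The sketch below is one possible encoding, not the authors' tooling: the class names, the numeric positions assigned to the points on each axis, and the example profile are assumptions chosen for illustration. The order of the points follows the order in which they are listed in Sections 5.3.1 to 5.3.3.

```python
from dataclasses import dataclass
from enum import IntEnum


class Who(IntEnum):
    """Who makes the decisions (order as listed in Section 5.3.1)."""
    MANAGEMENT = 1
    ENTERPRISE_ARCHITECT = 2
    DOMAIN_ARCHITECT = 3
    APPLICATION_ARCHITECT = 4
    DEVELOPMENT_TEAM = 5


class How(IntEnum):
    """How decisions are documented, from heavy to lightweight."""
    MANDATORY_TEMPLATE = 1
    FACULTATIVE_TEMPLATE = 2
    AD_HOC_DOCUMENTATION = 3
    MEETING_NOTES = 4
    DIRECT_COMMUNICATION = 5


class When(IntEnum):
    """Length of the decision feedback loop."""
    LONG = 1
    MEDIUM = 2
    SHORT = 3


@dataclass(frozen=True)
class TripleAProfile:
    """Position of a project on the three axes (hypothetical encoding)."""
    who: Who
    how: How
    when: When

    def as_radar_positions(self) -> tuple:
        # Numeric positions, e.g. for drawing a radar plot.
        return (int(self.who), int(self.how), int(self.when))


def shift(before: TripleAProfile, after: TripleAProfile) -> dict:
    """Per-axis change in position between two phases of a case."""
    return {
        "who": int(after.who) - int(before.who),
        "how": int(after.how) - int(before.how),
        "when": int(after.when) - int(before.when),
    }


# Invented example profile; it does not describe any of the industrial cases.
example = TripleAProfile(Who.DEVELOPMENT_TEAM, How.MEETING_NOTES, When.SHORT)
```

The numeric values only express an ordering along each axis; the distances between points carry no meaning.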

5.4 Industrial Cases

This section describes the industrial cases that are used in this chapter. In every case, at least one of the authors was involved as a team member within the studied company over an extended period of time. This allowed us to study the response of the case organizations to the changes that we saw. This chapter focuses on issues related to the work of software architects, although we studied the cases more broadly, including other software development aspects. The cases are anonymized to protect the companies and customers involved. The roles that the researchers had in the cases varied from developer, architect, and team lead, to being part of the management of the organization.


FIGURE 5.1: The Triple-A Framework

For every case, the description starts with an introduction of the context, the customer, and the domain of the case. This is followed by a summary of project characteristics: the technology used, the type of architectural challenges that were tackled, the number of people involved, the duration of the project, and the process(es) used. Then a separate description of the two different phases is given. Every case description finishes by noting the shifts that occurred and the business impact of these shifts. After the description of the shifts, the differences between the phases are summarized.

5.4.1 Case Alpha

Case Alpha involved the construction of a software system that had to replace a legacy geographic information system. The new system had to be coupled with several legacy back-office systems. The customer, a large harbor company in the Netherlands, initiated the project. The solution was service oriented, and consisted of several systems communicating with each other through an enterprise service bus (ESB). This coupling was one of the most challenging issues in the project. Most of the software was written in (Oracle) Java. This case consisted of a PoC phase and a realization phase, lasting 3 and 6 months, respectively. Ten to twenty people were involved during the various phases of the project. In the PoC phase, a lightweight, iterative approach was used, whereas RuP [117] was adopted in the realization phase.


Phase One

During the first phase of case Alpha, an iterative approach was followed. The first deliverable of the project was a PoC that had to be ready on a predefined timeline. During this period, functionality was delivered every 2 weeks. In every iteration, new architectural challenges were tackled (e.g., how to process the data of the legacy system via the chosen ESB to the user interface, or how to correctly merge the real-time data from the ships’ locations with the static cargo data provided in one of the legacy systems). The customer was very happy with the result, a running technological PoC, and the supplier company was invited to participate in the execution of the next phase of the project.

Phase Two

In the second phase, the organization of the project radically changed. This was done because there was the potential that the project would need to be scaled up in team size. The result was that a group of eight architects from different companies was formed, assisted by a team of five project managers. They conducted thorough work in documenting all the possible situations, interfaces with other systems, etc. The development team that was involved in the initial PoC was rarely consulted, and was located on the other side of the Netherlands. The main results of this phase were the documents that were created: use case descriptions, architectural documentation, project plans, and more.

Shifts

The following were the most striking shifts that took place in case Alpha:

• Because of fear of making the wrong decisions, the focus was more on documentation and less on making working software.

• The iterations became longer (from 2-week iterations to a half-year release).

• Because the system had to be connected to the deprecated systems that were running, there was a tendency to over-think the architecture to prevent making mistakes.

• There was no longer a structural feedback loop from the development team to the architects and customers and vice versa.

By making the decision process more heavyweight and focusing on documentation, the project slowed down so much that it became paralyzed. After 6 months in the second phase, almost no working software had been produced (one use case was realized). The project was discontinued because the customer no longer had confidence in it.


5.4.2 Case Beta

The product developed in case Beta was an administrative case management system for a department of the Dutch national government. This system had to handle (changing) regulations, and was to be used by various departments that had different demands for it. A multidisciplinary team of five to ten people was involved for two periods of 6 months. The system was based on the Oracle Collaboration Suite, and the UI was developed with Java technology. The main architectural challenge in this project was the mapping of the desired functionality onto the technical infrastructure that was already in place at the customer site. The tradeoff between the specific solution and the generic components was also considered a major architectural challenge. The customer organization assisted the team with their operations and architecture teams. During the first phase of the project, an agile approach was used: high customer involvement, iterative delivery, and constant adaptation of the product. In the second phase of the project, a more traditional approach was chosen, based on the Prince II [84] project management technique.

Phase One

Case Beta is split up into two phases that differed mainly in the organization of the project. Phase one was facilitated by a very light RuP approach. The main goal of the project was to prove that case-based working, on an extendable technical solution, was applicable for the organization. There was a practical attitude towards documentation, and a focus on working software. Biweekly iterations were used to get the customer involved and to get feedback quickly. The results of this phase were a proof of technological validity of the solution and a first working version of the case management system. The customer was enthusiastic about the results and decided that a second phase should be initiated to complete the case management system for a specific department.

Phase Two

In the second phase, the team remained mostly the same, but the project management methodology changed because a new project manager was assigned. This phase was managed strictly in Prince II, by a project manager and a steering committee formed mostly of higher management from the customer and supplier organizations. The goal of this phase was to extend, customize, and implement the techniques from the first phase for a very small department in the organization. As prescribed by the development methodology used, the first aim of the project was to get the functionality and the architecture of the proposed system written down completely (functional design, technical design).

Shifts

The following shifts occurred in case Beta:

• In phase two, more emphasis was on (mandatory) documentation instead of working software.

• There was increased complexity of the organization around the development team.

• The architectural decisions needed to be made earlier in the design process, and the decisions were never validated.

Because the customer was afraid of missing something in the description and design, it took about half a year to complete the documentation. The results were so detailed and complex that the architect board that had to monitor the design was unable to determine whether the software could actually be created from the resulting documentation. The project became paralyzed because of the amount of documentation generated. Therefore, no value was delivered to the customer in the second phase.

5.4.3 Case Gamma

Case Gamma was conducted at a medium-sized product company in the Netherlands. The project involved a new administrative software system for specific departments in Dutch hospitals. Changing regulations and different working environments needed to be taken into account from the beginning. The project was executed by a multidisciplinary team of seven people, assisted by the architect of the company. A Java stack (JSF, Spring, EclipseLink) was used to create this product from scratch, while a different team of approximately seven people developed a part of the back end separately. This separate development was one of the most challenging architectural parts of the project. Product development took place over 12 months. The team used Scrum [109] with biweekly iterations to show results while being able to adapt to changes in functionality.

Phase One

The company involved in case Gamma had the vision to create a reusable architecture for its products. This generic, reusable architecture was implemented in parallel to the development of the system. This generic software was also used by another product at the same time. Architects in cooperation with the company’s management made these architectural decisions (and the resulting interrelationship of the projects) before the project started. The development team was notified about the decisions, but the concerns they had were never seriously heard. During the first phase of the development of the system, the team felt that they had no influence on the decision process. Therefore, the atmosphere in the development team became less constructive. The reusable architecture was blamed for every defect in the system, as well as for causing the system’s slow development.

Phase Two

The shift in this case came when the project team decided to take over responsibility and make the product independent of the decisions made before the project started. This included stubbing certain parts of the application, and sometimes even building functionality that was to be replaced by the generic software in the (near) future. New decisions were made within the development team, and the company architect was kept informed but was not held responsible anymore. Because the connection with the other projects stayed, the validation of the decisions was still delayed, but the decision loop was shortened significantly.

Shifts

The following shifts were identified for case Gamma:

• The responsibility for the architectural decision-making shifted from the management and architect to the development team.

• There was a quicker architecture decision process because the responsible persons were always present.

• There was less dependency on other projects within the company.

• There was less architectural documentation, and the documentation used was more lightweight.

Although the change caused the project to take more time in creating functionality (some functionality had to be made twice: once by the project team and once by the platform team), the overall speed of the project increased (more functionality was produced), and the team was much more committed to the result and the product, which increased its quality.

5.4.4 Case Delta

Case Delta is a Fortune 500 company developing software products and services operating primarily on personal computers. The company’s products address both consumer and business markets and the company releases several products per year, including new releases of existing products and completely new products. The products developed by the company range from multi-million to tens of millions of lines of code and tend to contain very complex components that implement national and international regulations. Although significant opportunities for sharing between different business units (BUs) exist, the company has organized its development based on a BU-centric approach. The products developed by each BU are typically based on a software product line. The company employs agile teams. It has new product development teams (who have no interdependency with other teams) and component teams for large established products in both North America and Asia.

The management and evolution of product architectures have been organized through architect teams that mentor and coach agile development teams. The organization arrived at this structure after falling into several traps around architecture and development efficiency. These traps included an overly complex software architecture for certain products, and software architects who had stopped coding and consequently had lost connection with the reality of software development that the team faced. The latter caused teams to be uninterested in the architectural design decisions; the software architecture documentation and the system deviated rapidly. Finally, there were some signs of architecture work that was done for the architects, rather than for the benefit of the team or customer.

Phase One

In the situation of case Delta, the company used software architects as liaisons between general management, product management, and the development teams. The architects’ role was to translate business needs into top-level architecture decisions before development of new products or the next (yearly) release of existing products started. Because of their role, architects spent very little time actually building systems and would often make decisions based on outdated understanding of technology and the implemented product architecture. This resulted in architectures that were more complex than necessary, due to the need to bring the designed and real, implemented software architectures together in one system. A second consequence was that teams were not committed to the designed architecture, because it had limited bearing on reality and the architecture work was viewed as being done for the architects’ sake.

Phase Two

The organization realized the challenges and made four main changes. First, architects returned to coding and spent a significant part of their time developing software together with the teams. Second, the teams got more autonomy and interacted much more with customers, and system-level architects started to act more in a coaching and mentoring role. Third, the organization accepted that “life happens” and that it often is better to refactor the architecture of products when the need arises than to try to predict everything beforehand. Finally, the latter also changed the perception around the importance of documentation and the organization focused much more on maintaining a stable product team that collectively holds the architectural knowledge in their heads.

Shifts

To summarize, the following shifts took place in case Delta:

• The architecture team got more connected to the development team.

• Customers were explicitly heard during the development.

• Architectural decisions were made more quickly, and were checked against the working software.

• Documentation was used less and more emphasis was put on the tacit knowledge of team members.

The shift in case Delta resulted in features that were better suited to the needs of customers, which was a significant benefit. In addition, because of agile development practices, customers could get access to features much earlier than in the traditional development model. Finally, the significantly shortened feedback cycles resulted in higher quality of the overall system, because customers reported issues quickly and the team had the mechanisms to address their issues promptly.

5.4.5 Case Epsilon

A small startup company that creates a web-based product for the consumer market was the scene for case Epsilon. The project contained high-risk technological challenges; the architecture needed to be flexible in the beginning, to be able to handle the expected high number of users. The application was created in Ruby/Rails with a NoSQL back end based on MongoDB and Redis. The main architectural challenges were to be able to scale up the application when many consumers are using the system, and to be able to adapt the system to changing requirements from the customer. There was one development team of seven people that conducted (bi)weekly iterations (using Scrum) over a period of 12 months.

Phase One

In case Epsilon, the architecture was not fixed from the beginning, the main architect was an experienced developer, and the product owner had significant influence on the decision process. Architecture decisions were made before the implementation phase of any iterative development methods. The architect also helped the team with the coding of the software. After 3 months, the architect moved and was unable to participate in the project anymore, while the product owner got less involved in the actual development.

Phase Two

This started the second phase of the project, where no formal architect was with the team. All of the team members felt responsible for the architecture. No large architecture documentation was written; architectural decisions were made when needed and summarized in the wiki, or photos of meetings were shared in the internal chat. Important architectural issues were formulated as functionality and put on the backlog as user stories, often after debating them with everyone interested. Simplicity guided the architecture. The result of the project was a working closed Beta version of the product.

Shifts

The following shifts occurred in case Epsilon:

• Responsibility for the architectural decision process changed from one architect to a team responsibility.

• There was a quicker decision process because the right people were always on-site.

• There was extremely lightweight documentation of architectural decisions and a focus on direct communication.

Case      Domain             Period      Team size    Experience of team
Alpha     Harbor             9 months    10-20 FTE    Experienced team and architects
Beta      Government         12 months   5-10 FTE     Moderate team and architects
Gamma     Hospital           12 months   7-14 FTE     Moderate team, experienced architects
Delta     Product company    12 months   >50 FTE      Experienced teams and architects
Epsilon   Consumer product   12 months   5-8 FTE      Moderate team, experienced architect

TABLE 5.1: Overview of Case Studies

Although the change was not as severe as in the other cases, a definite change in the flexibility of the team was noticed. This resulted in quicker responses to bug or feedback reports and more predictable delivery of functionality (both quality aspects).

5.4.6 Overview

As an overview of the case studies used, Table 5.1 summarizes the characteristics of all cases: the domain, the period, the team size, and the team experience.

In the following section, the case studies are mapped against the Triple-A Framework, followed by an analysis of the problems that can be generalized from the case studies.

5.5 Analysis

In this section, we will provide an analysis of the results by correlating them to problems that occur in software development. First, we will show how the cases can be mapped to the Triple-A Framework. From this, we will analyze what problems have been addressed in our work.

5.5.1 Mapping the Cases to the Triple-A Framework

To illustrate the value of the Triple-A Framework, we show that the changes in the cases from Section 5.4 can be seen as shifts along the axes of the Triple-A Framework. In this section, we identify what effects shifting along the axes has on the efficiency of the whole case.

          Axis              Result
Case      Who How When      RoI Speed Quality
Alpha     - - - - -
Beta      - - - -
Gamma     ++ + + + +
Delta     ++ ++ + + + +
Epsilon   + + + + +

TABLE 5.2: Mapping the Cases on the Triple-A Framework

All five cases have been mapped on the Triple-A Framework. In the Appendix, a visual representation of the mapping is given. The results of the cases are summarized in Table 5.2. In this table, the cases are described and the shifts on the axes are visualized by: “-” shift toward the center, “+” shift away from the center, and “++” a radical shift away from the center. The results are split into the three parts that were discussed above: Return on Investment (RoI), Speed, and Quality. In the result columns, the “-” means that the result was worse in the second phase, and the “+” means the results were better in the second phase. Empty fields indicate no difference has been found.

Generally, we saw that the shifts towards the center of the Triple-A Framework (cases Alpha and Beta) increased complexity and tended to decrease productivity and speed. In the other cases (Gamma, Delta, and Epsilon), the shift to the edges of the axes resulted in increased quality and development speed.
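As an illustration of the notation in Table 5.2, the following minimal Python sketch derives a shift symbol from two positions on one axis. It assumes a discrete ordinal scale on which a higher number lies further from the center, and it treats a move of two or more steps as radical. Both the scale and the threshold, as well as the names shift_symbol and radical_steps, are assumptions made for this example; the framework itself does not define a discrete scale.

```python
def shift_symbol(before, after, radical_steps=2):
    """Return the Table 5.2 symbol for a shift on one axis.

    before, after: hypothetical ordinal positions (higher = further from center).
    radical_steps: assumed threshold for calling a shift "radical".
    """
    delta = after - before
    if delta == 0:
        return ""    # empty field: no shift observed
    if delta < 0:
        return "-"   # shift toward the center
    # shift away from the center, radical if it spans enough steps
    return "++" if delta >= radical_steps else "+"


# Hypothetical positions in phase one and phase two of a project.
print(shift_symbol(1, 3))
print(shift_symbol(2, 1))
```

Keeping the threshold as a parameter makes the assumption explicit: a team that scores itself on a finer scale can raise it without changing the rest of the sketch.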

5.5.2 Identified problems

As we have seen in our cases, organizations seek to improve agility because of real business benefits that can be achieved. Companies invest in software architecture and software architects for the same reasons. Unfortunately, as we discussed in the introduction, these areas are occasionally combined in ways that cause the organization to fail in its ambitions. From the experience of the described cases, we have identified six main problems. They are discussed below. With every problem, we indicate which cases involved the specific problem, what axes are affected, and a short description of the identified problem.

Architects as single point of failure
Cases involved: Alpha, Gamma
Axis affected: Who

The architect(s) often represent a single point of failure. The architect has to be the one that (a) talks to the customer to understand the vision and the most significant requirements, (b) creates the main structure of the system, (c) makes sure the solution is future-proof, especially concerning the nonfunctional requirements, and (d) makes sure the development team creates software conforming to the architectural decisions made. These tasks are hard for one person to do, especially in larger projects. A project being dependent on one person presents a high risk.

Complexity instead of simplicity
Cases involved: Alpha, Beta, Gamma, Delta
Axes affected: Who and When

Architects are assumed to be the cleverest people in the team; therefore, they often create smart solutions that are more complex than they need to be to solve the problem at hand. The pressure from management and customers on the architect to create a “future-proof” architecture often enhances this effect. When groups of architects get too large compared to the rest of the team, the results look nice, but sometimes they are hard to implement. This is a typical example of the “create a perfect architecture, but too hard to implement” antipattern described by Kruchten [118].

Outdated software architects
Cases involved: Alpha, Gamma, Delta
Axis affected: Who

Often, the architectural decision-makers are not involved in developing the software anymore. This creates a lack of hands-on experience in the technology they are designing for. Because of this, their decisions are based on outdated assumptions and experiences, and because the decision-makers have no direct experience with any important design flaws, there is no incentive to change the design when necessary.

Uncommitted teams
Cases involved: Gamma, Delta, Epsilon
Axis affected: Who

If many important decisions are made solely by the architect(s), the development team does not feel primarily committed to the decisions made. If this happens, there is a lack of support for the decisions. In the worst case, the team opposes the decisions made and undermines the actual development of the system. This is a typical consequence of what Kruchten calls the “architects in their ivory tower” [118].

Static architectural decisions
Cases involved: Beta, Epsilon
Axis affected: When

As customer demands, technology, and organizations change, architectural decisions also need to change. Therefore, architectural decisions do not last forever. During the development and evolution of a system, architectural decisions are made and revised constantly. As architects in traditional settings are involved primarily in the earlier stages of development, the tendency is to make decisions earlier on and to keep to them for a long time. This makes it hard to adapt the system to new challenges.

Illusion of documentation
Cases involved: Alpha, Beta, Delta
Axis affected: How

It is very tough to have good documentation when it is used as a communication medium. Often, documentation is out of date and is badly read. Since architecture documentation especially needs to be created manually, it is typically outdated within weeks, if not days or hours. In addition, very few people, even architects, actually read architecture documentation. This is both because of its obsoleteness and because it fails to help the reader build a relevant understanding of the system. However, these symptoms are rarely acknowledged, and when things don’t go as planned, often more documentation is mandated.

5.5.3 Summary

The description of the problems above can help teams identify problems and understand which change (shift on an axis) could help improve the project. Based on this set of problems, we can conclude that the who axis is affected in four problems, the how axis in only one problem, and the when axis in two problems. This is an indication that the who axis could be the most influential one.
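The tally behind this summary can be reproduced with a small Python sketch. The mapping below copies the "Axis affected" entries from Section 5.5.2; the names PROBLEM_AXES and count_problems_per_axis are chosen for this example only.

```python
from collections import Counter

# Problems from Section 5.5.2 with the axes listed as affected for each.
PROBLEM_AXES = {
    "Architects as single point of failure": ["who"],
    "Complexity instead of simplicity": ["who", "when"],
    "Outdated software architects": ["who"],
    "Uncommitted teams": ["who"],
    "Static architectural decisions": ["when"],
    "Illusion of documentation": ["how"],
}


def count_problems_per_axis(problem_axes):
    """Count in how many problems each axis is listed as affected."""
    counts = Counter()
    for axes in problem_axes.values():
        counts.update(axes)
    return counts


axis_counts = count_problems_per_axis(PROBLEM_AXES)
for axis in ("who", "how", "when"):
    print(axis, axis_counts[axis])
```

Because one problem lists two axes, the counts add up to seven for six problems.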

5.6 Reflection

This section reflects on the findings from the previous section and discusses ques-tions on the validity of this research.

5.6.1 Findings

As with many solutions in software engineering research, the Triple-A Framework proposed in this chapter is no silver bullet. Often, the situation greatly influences the possibilities of companies, projects, teams, and individuals. However, the described axes can be used to see how responsibilities, timing, and communication methods can influence the results of a project. And, if a project or organization needs to change, the Triple-A Framework can help in identifying where the change can be made, by analyzing the current decision process and focusing on one of the axes where change can be achieved.

Some of the changes that derive from the Triple-A Framework have a big influence on how a project is run. When a project stops using certain templates, makes other people responsible for the decision process, or waits for design issues to occur, there is always the concern of trust involved. We have experienced that trusting people to do the right things is often very tough, especially within large organizations or large contract structures (as seen in cases Alpha, Beta, and Delta).

In traditional organizational setups, architects come in three archetypes that, although overlapping in some cases, have different responsibilities. The first type of architect acts as the bridge between the business strategy and customers on one side and the software development team on the other side. Secondly, with highly complex systems, architects often have the responsibility for the end-to-end quality requirements of the system and coach individual engineers to make sure that new features and functionality do not violate system properties. Finally, some organizations share responsibility for the team between a project manager focusing on milestones and people management and an architect who acts as the technical lead for the team. In our experience, and this is the position that we take in this chapter, there is a fourth archetype. In this type, the architect becomes the coach for the development team responsible for facilitating architectural decision-making. In an age where teams become increasingly self-selected, self-directed, and self-managed [7], it is important that architects move away from traditional hierarchical, formal leadership roles and adopt meritocracy-based leadership styles. This can be done in an iterative way by accurately shifting along the axes of the Triple-A Framework.

5.6.2 Questions of Validity

Several matters raise questions on the validity of our research. First of all, the participant/observer method does imply some subjectivity. The results are qualitative (not quantitative), and based on the experience of the researchers participating in the project. However, this is one of the accepted ways to gather case study material in software engineering research, and all of our cases involved real-life industrial software projects that could not have been studied in any other way.

Although we have identified three axes that influence the results of the case study projects, it is possible that these are not the only parameters affecting the results of the cases studied. Due to the nature of our research, and the fact that it was conducted in real industrial settings, it is likely that there were other factors involved. However, we have seen that in all of the cases the shifts did occur in the same direction along the defined axes.

We have used only five cases, which involved mainly small project teams. As such, the degree to which this research can be generalized is restricted to projects of this type. However, as seen in the context of case Delta, the axes also show impact in larger organizations.

5.7 Related and Future Work

In the research community, there is currently a debate on the usefulness of agile software development. The trend is to say that there are enough stories of successful and failed projects done with various methodologies, but insufficient (empirical) evidence to found a conclusion [31, 150]. This chapter contributes to this debate by describing additional case study results.

There has been much attention given to documenting software architectures [42, 90], as well as documentation templates [117] and computational modeling¹ for documenting relevant architectural knowledge. Recently, there has been a trend toward using semantic wikis [80], and some research experiments show promising results [73].

¹ https://www.omg.org/spec/UML/

Another topic that is being discussed is the role of the architect [62]. Here, often the architect is responsible for creating and maintaining the architecture documentation. In this chapter, we have shown that identifying who makes the decisions in projects is important for collaborative, multidisciplinary decision-making.

In architecture design decision research, hierarchical structures are used to model architectural knowledge [45] or design decisions [100, 189]. This research often emphasizes the recording of decisions and the extraction of decisions later in the development process. This chapter focuses on the decision process itself.

The agile community often doesn’t explicitly describe the role of architects, architectural documentation, or architectural decisions when explaining what they do [109, 155]. Although there have been brave initiatives for merging the two [45], most of the agile community still has an aversion to architects and architecture documentation. Some authors emphasize the importance of architecture even in agile settings [115]. In this chapter, we have shown changes for software architecture in agile software development.

In future research, we would like to extend our validation of the Triple-A Framework to other industrial cases, specifically to larger-sized projects and distributed settings where direct communication is more complicated. Second, we will conduct more research on other possible axes that influence the architecture decision process. Third, we would like the Triple-A Framework to be based on a more discrete scale, to be able to score teams or companies.

5.8 Conclusions

From our industrial cases, we have seen a trend take shape. First, we have seen that leaning heavily on architects as the persons that should solve the architectural problems leads to unproductive teams, or even no working software at all. Second, we have experienced that when the focus is too much on large (architectural) documentation, the speed and quality of the project decrease. Third, we have seen that making important architectural decisions early on in the project leads to architectural problems during development. Although the agile community makes these claims, they are rarely backed up by case material. We contribute to this debate by presenting initial case study material.

In this chapter, we have followed two steps toward a better understanding of what happens around architectural decision-making. First, based on our experience and existing literature, we have generalized the Triple-A Framework for assessing how the architecture is handled in projects or organizations. Second, we have described five industrial cases, and we have identified shifts in these cases. We have shown that these changes can be mapped to the three axes that we created within the Triple-A Framework. We have seen that the successes and failures of the cases were influenced by the shifts that were made. These axes can be used to help teams that are becoming more agile to align their architecture practices.

From our research conducted at the five case studies presented in this chapter, we can conclude that moving on the axes of the Triple-A Framework influences the success of a project. This means that by moving away from the center (development team, direct communication and short feedback loop), the projects became more successful (Gamma, Delta, Epsilon), while by moving towards the center (management, mandatory templates, long feedback loop), the cases became less successful (Alpha, Beta). We are planning to use our framework on additional cases to further validate our findings.

Appendix: Visual Representations of the Case Studies


Chapter 6

Busting Software Architecture Beliefs

“There are no rules of architecture for a castle in the clouds.”

- Gilbert K. Chesterton

This chapter is based on: Jan Salvador van der Ven and Jan Bosch. “Busting Software Architecture Beliefs: A Survey on Success Factors in Architecture Decision Making”. In: 42nd Euromicro Conference on Software Engineering and Advanced Applications (SEAA). Aug. 2016, pp. 42–49.

Abstract

As software development changes, the myths and beliefs around it also come and go. In different communities, different beliefs are held, usually strengthened by success or failure stories. In this research, we study the beliefs surrounding software architecture. The beliefs range from the amount of effort needed for architecture documentation, to the size of the team or the persons responsible for making the architectural decisions. Most beliefs are based on the idea that the outcome of the project is highly dependent on the methods used during the design and development of software. We conducted a survey with 39 architects where we evaluated 54 architectural decisions. In this survey, we assessed the way in which decisions were made, the success factors of the decisions, as well as the properties of the projects. We conducted statistical analysis in order to validate or invalidate some of the beliefs that currently exist in software development. We conclude that for most of the beliefs, no statistical evidence can be found, making these beliefs folklore rather than useful guidelines for predicting project success or failure.

6.1 Introduction

In the last decade, the creation of software systems has changed rapidly. Traditionally, long-lasting waterfall projects (>two years) were standard, while now rapid development (<three months) with fast changing requirements is becoming the norm for creating software products [25]. In both cases, the architecture of the system has to be taken into account, although when, how and who is responsible differs significantly [184]. Formerly, experienced architects created models and documentation for the system beforehand, so the development team had a solid base of decisions to build on. Nowadays, at more agile projects, the architectural decisions are made JIT (Just In Time) by the development team itself, often assisted by a participating architect. Alternatives for heavy template-based documentation [42], like wikis or photos, are used to document the decisions [184]. This led to a change in responsibilities and in the role of the architect. The responsibilities for the architectural decision process shift from the formal architect [62] to the development team. The architect gets more of an advisory servant role within the project, often participating in the development team as a designer or developer.

However, in both the traditional architecture community and the agile community, beliefs exist concerning software architecture and the decision-making process. In order to validate the accuracy of these beliefs, we have conducted a survey with architects that make architecture decisions on a daily basis. In the survey, we analyzed three main aspects. First, what was the success of the decision? Second, what were the properties of the decision based on the Triple-A Framework [184]? Third, what were the additional project and decision properties? In this survey, 39 participants provided details about a total of 54 architectural decisions. The data provided by the participants forms the basis for our analysis.

Software architecture is a challenging research field. As projects are difficult to compare, research is sometimes based on anecdotal evidence that comes from individual projects or persons. This increases the likelihood of the emergence of unfounded beliefs. This is reinforced by the architectural practice, where the success of projects is highly related to the image of the companies involved and successful systems are, sometimes unfoundedly, attributed to successful architectural decisions. Finally, in industry there are several movements, including programming paradigms, programming languages or development methods that lead to specific views on building software with occasional religious overtones. As a consequence, in our field of research it is difficult to distinguish between opinions and facts.

Our research focuses on software architecture. We have collected beliefs and checked their validity by conducting a survey with software architects. We have used questions on three distinct topics:

• Questions about the experience of the person and the characteristics of the project.

• Questions about the way in which architectural decisions were made (based on the Triple-A Framework).

• Questions on the success of the project.

The answers to these questions were used to validate or falsify the beliefs about software architecture decision making.

The contribution of this research is threefold. First, we provide an overview of the primary beliefs that exist currently around software architecture. Second, we show which of these beliefs are and which are not confirmed by our empirical data, and we provide specific details about what aspects of these beliefs can be validated. Finally, for the unfounded beliefs, we describe the impact that holding them has on the architecture community.

This chapter is organized as follows. First, the background for the beliefs, the Triple-A Framework and the project properties is given. Then the experimental setup is described. The results are discussed in the subsequent section, followed by a reflection on the results. This chapter ends with related work and some concluding words.

6.2 Background

6.2.1 Triple-A Framework

Our previous work describes the Triple-A Framework (Agile Architecture Axes Framework) [184] for characterizing the architectural decision process. In our current research, we use the Triple-A Framework as one of the aspects of an architectural decision, in order to classify the decisions and relate them to the beliefs. The Triple-A Framework consists of three orthogonal axes:

• Periodicity (When). The "decision loop" [23] describes how decisions lead to new decisions, based on the alternatives chosen. However, what is rarely described is the periodicity of the decision. What is the length of time between the actual decision and the resulting validation of this decision against the quality attributes that the system needs to comply with?

• The Architect (Who). In architectural knowledge literature [62], the architect is referred to as the person responsible for the architecture, or the person who makes the architectural decisions. However, who this person is, what role the person has in the project or organization, and what skills this person needs are very important. The effects of these skills on the results of the project are seldom written about.

• Artifacts (How). As an abstraction of all things that are created around the architecture development process, often the term artifact is used. Examples of artifacts are documents (e.g. SAD), models (e.g. UML), or source code.

The periodicity axis is split into two different parts: the time between the arising of the problem and the actual decision, and the time it took to validate that the decision was the right one. This resulted in four axes for the Triple-A Framework that we used in our survey, as shown in Appendix A. The questions about the position on the Triple-A Framework were multiple-choice questions.
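To make the four survey axes concrete, a single answer could be recorded as in the following Python sketch. The class name TripleAPosition, the field names, and the use of an option index per axis are assumptions made for this illustration; the actual answer options are those of the questionnaire in Appendix A and are not reproduced here.

```python
from dataclasses import dataclass

# The four axes used in the survey: Who, How, and the two parts of When.
SURVEY_AXES = ("who", "how", "when_decide", "when_validate")


@dataclass(frozen=True)
class TripleAPosition:
    """Position of one architectural decision on the four survey axes.

    Each value is the index of the selected multiple-choice option
    (hypothetical encoding, not taken from the questionnaire).
    """
    who: int            # who made the decision
    how: int            # which artifacts were used
    when_decide: int    # time between the problem arising and the decision
    when_validate: int  # time needed to validate the decision

    def as_dict(self):
        """Return the position as an axis-name to option-index mapping."""
        return {axis: getattr(self, axis) for axis in SURVEY_AXES}


# Hypothetical answer for one decision.
position = TripleAPosition(who=2, how=1, when_decide=0, when_validate=3)
print(position.as_dict())
```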

6.2.2 Success Factors of a Decision

The effect of a decision on project success can be defined by different criteria [196]. Agarwal [3] describes three main measures for success: cost, time and quality of the product. We have distilled four success factors from this. The Return on Investment represents the cost. The time measure is split into two factors: the amount of effort needed for the project, and the development speed. The last success factor is the quality of the product. In addition, we asked about the effect of the decision on four important quality attributes for software systems: performance, maintainability, security and usability. So, in total eight success factors were asked for, as shown in Appendix A. For all of these factors, a Likert scale with values 1-5 was used, where 1 meant strongly disagree and 5 meant strongly agree.

6.2.3 Project and Person Characteristics

In order to validate the beliefs, characteristics of the decision maker and the project needed to be measured. Therefore, the following project properties have been included in the questionnaire:

• Experience of the respondent. To be able to compare the experience of the respondents, two questions were included: experience in architecting, and experience as developer.

• Duration of the project. This parameter concerns how long the project had been ongoing.

• Project size. In order to get an idea about the size of the project, we asked about the project team size and the number of project partners. Additionally, we asked about the size of the project in lines of code and person-months, but many respondents did not know these figures, so they were left out of the analysis.

The complete list of characteristics is included in Appendix A. All of the questions except the one about the number of partners were multiple-choice questions where ranges could be selected. For the number of partners involved in the project, a number could be entered.

6.2.4 Beliefs

We have looked for several stubborn beliefs around software architecture. In this research, we focus on the beliefs that relate positions on the Triple-A Framework to project success. These beliefs generally take the form that the success of a project depends on what the team or person does. The latter is reflected in the position on the Triple-A Framework.

• B1. Making architectural decisions quicker leads to a worse quality product. Well-designed architectures are based on the most important architectural decisions [100, 201]. This implies the process for making these decisions [183] should be well thought out. The underlying belief for these important decisions is that the decision speed is of less importance than the decision outcome. Hence, it is expected that decisions that are made more hastily lead to projects of worse quality compared to projects based on decisions that took more time.

• B2. Making and validating architectural decisions quicker increases the development speed and RoI of the project. Another belief affecting the decision making speed is the belief that when making decisions quicker, the development speed can be increased without sacrificing the quality (as checked in B1), if the decisions are also validated quickly. Especially the lean startup community [136, 156] stresses the importance of validating assumptions as quickly as possible with customers. Architectural decisions can have a huge impact on the system and the business. Validating these decisions is considered essential, especially for learning about the business (RoI), and to increase development speed by knowing early on in the project what does and does not work.

• B3. People that code the system should design the system to achieve speed in development. The agile community stresses that one should avoid ivory-tower architects [118] making the important decisions for projects. The assumption is that they lack a feel for the code, and that nice looking designs are favored over practical solutions. The belief implies that the development speed decreases as more time is spent on the architecture design, while the architecture is harder to implement.

• B4. Decisions made by higher ranked architects have a higher RoI and better Quality. The roles people have in organizations vary. As we have shown in our previous work [184], it is possible to rank the decision-maker based on the role he or she has in the company or project. Belief B4 implies that when higher-ranked architects make decisions, it leads to better quality products that have a higher RoI.

• B5. Less architectural documentation decreases effort needed and speeds up development. Typically, more extensive architecture documentation takes more time to write and more time to review [76]. Also, it is harder to comply with the architecture when it is contained in extensive documentation. Hence, the tradeoff of making just enough documentation is very difficult [50, 147]. However, the belief that development speed increases and the effort spent on the project decreases when the architectural documentation is limited is generally held.

• B6. Less architectural documentation decreases project quality. On the other side, there is the belief that better documentation increases the quality of the system, as it is better thought out and decisions made previously are easier to reproduce. This belief is often voiced when weighing short-term benefits (quicker decision process) against long-term benefits (quality) [14]. As knowledge vaporization increases when decisions are not documented [100], it is expected that the quality of the system declines when documentation is omitted.

All of the described beliefs can be shown as a relationship between positions on the Triple-A Framework and imply specific success criteria for the belief. In the next section, we describe the experimental setup of this research.

6.3 Experimental Setup

6.3.1 Survey

In order to validate or invalidate the beliefs stated in the previous section, we conducted a survey. The survey consisted of three main parts, as described in the previous section:


• The context of the project: data about the respondent, such as background, experience, and current role, as well as project properties like the type of project, its size and its running time.

• The success factors of the decision, as described in the previous section.

• Finally, in the last part the position of the decision on the Triple-A Framework was determined to be able to classify and compare the different architectural decisions.

Appendix A shows all the relevant questions of the questionnaire. The respondents could fill in the decision details for one or two decisions.

The invitation for the survey was sent to potential participants from a personal email address of one of the researchers. We ended the survey with a question asking whether the respondent knew more people who would be qualified and interested in filling in the survey. The survey was sent to those people too. The invitees were selected from the connections of the researchers, based on the experience of the connections.

In total 25 emails were sent to Dutch speaking architects from the Netherlands, and 18 to English speaking architects in the Netherlands, Sweden and the USA. Some of the invited architects forwarded the invitation to other architects in the company that were better suited for the questionnaire. In addition, some of the invited architects responded to the invitation that they were not qualified to fill in this kind of survey, or that their company would not like them to provide details about the architecture of their systems. As a result, a total of 39 respondents provided data about 54 architectural decisions.

6.3.2 Data Preparation

In order to be able to do statistical analysis, the data from the survey was pre-processed for analysis in R. The survey results were exported to a CSV file that could be used as a data source in R. All the data was transformed to floats between 0 and 1. The position on the Triple-A Framework was codified: the different options per axis [184] were mapped onto numbers between 0 and 1, with equal distances between the steps. The success factors were already numbered (as they were multiple-choice questions with only one option possible), so they were normalized to numbers between 0 and 1. The project and architect properties were also codified and linearly normalized between 0 and 1. The result was a large data matrix that was used as the basis for our analysis.
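The coding step can be illustrated with a minimal Python sketch (the analysis itself was done in R). The function names are hypothetical, and the sketch assumes that the answer options are coded in the order in which they appear in the questionnaire; the exact coding used in the study follows [184] and is not reproduced here.

```python
# Answer options for axis A1, in questionnaire order (see Table 6.4).
A1_OPTIONS = [
    "Within a day",
    "Between one day and one week",
    "Between one week and one month",
    "Between one month and half a year",
    "More than half a year",
]


def ordinal_to_unit(options, answer):
    """Map an ordered answer option to an equally spaced value in [0, 1]."""
    if len(options) < 2:
        raise ValueError("need at least two options to define a scale")
    # The first option maps to 0.0, the last to 1.0, with equal steps between.
    return options.index(answer) / (len(options) - 1)


def likert_to_unit(score, low=1, high=5):
    """Linearly rescale a Likert score in [low, high] to [0, 1]."""
    if not low <= score <= high:
        raise ValueError("score outside the Likert range")
    return (score - low) / (high - low)
```

With five options per axis, the steps are 0.25 apart, so a decision taken 'Between one week and one month' is coded as 0.5, and a neutral Likert answer (3) is also coded as 0.5.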

6.3.3 Analysis Methodology

In order to validate the beliefs, we have sliced the data set into different groups. Based on the belief at hand, we use one of the axes, properties or success factors to split the data set. Then, we analyze the differences between the resulting data sets by conducting a Wilcoxon test [198], comparing the two groups on the specific property we are investigating. Only results with p < 0.05 are reported as significant.
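The split-and-compare procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not the script used in the study: the test is taken to be the two-sample Wilcoxon rank-sum test, the p-value uses the normal approximation with tie and continuity corrections, the helper names are hypothetical, and the example rows are invented.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided rank-sum test; returns (U statistic of x, approximate p)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction: sum of t^3 - t over each group of tied values.
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # all values identical: no evidence of a difference
        return u, 1.0
    # Continuity-corrected z score, clamped at zero.
    z = max(abs(u - n1 * n2 / 2) - 0.5, 0.0) / math.sqrt(variance)
    return u, math.erfc(z / math.sqrt(2))


def split_by(rows, key, threshold):
    """Split rows into (value <= threshold, value > threshold)."""
    low = [r for r in rows if r[key] <= threshold]
    high = [r for r in rows if r[key] > threshold]
    return low, high


# Invented, already normalized example rows: A1 = decision speed, S4 = quality.
rows = [
    {"A1": 0.0, "S4": 0.75}, {"A1": 0.25, "S4": 0.5}, {"A1": 0.25, "S4": 1.0},
    {"A1": 0.5, "S4": 0.75}, {"A1": 0.75, "S4": 0.5}, {"A1": 1.0, "S4": 0.75},
]
quick, slow = split_by(rows, "A1", 0.25)  # decided within one week vs. later
u, p = rank_sum_test([r["S4"] for r in quick], [r["S4"] for r in slow])
significant = p < 0.05
```

For small samples without ties, statistical packages may compute an exact p-value instead of this approximation, so the numbers need not match those of another tool exactly.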


6.4 Results

This section describes the results of the survey. First, some general observations on the data are given. Then, the data about the beliefs from Section 6.2.4 are analyzed. This section ends with some additional observations on the survey, based on correlations that were found but did not connect to the assessed beliefs.

6.4.1 Characteristics of the Projects and Participants

Details of the 39 architects and projects are shown in Figure 6.1. Generally, the respondents were experienced architects, both in development experience and in architecting experience. Most of the projects lasted less than 6 years. The team sizes of most projects were between 4 and 30 project members. A few (4) projects were very large (>100 project members).

FIGURE 6.1: Project and Architect Properties.


6.4.2 Assessing Beliefs

In this subsection, we describe all the beliefs that were identified previously in this chapter, and describe the survey data around it in order to validate or invalidate the belief.

• B1. Making architectural decisions quicker leads to a worse quality product. To investigate this belief we have split the data into two groups based on the speed of making decisions. The A1 axis is used to split the dataset into decisions that were made within one week (marked as quick decisions) and decisions that took more than one week to complete. We ran the Wilcoxon test on these groups to see if the quality of the product differed significantly between them. None of the quality attributes showed a significant correlation. Therefore, we conclude that our data cannot confirm this belief.

• B2. Making and validating architectural decisions quicker increases the development speed and RoI of the project. For assessing this belief, the data was again split into groups to identify differences between fast and slow decision-making. Again the split was placed at a periodicity of one week. We checked the two groups for the specific success factors: S1 (Return on Investment) and S3 (development speed). As with the first belief, no significant correlation could be found. Hence, belief B2 could not be confirmed by the data in this research: in our data, the speed at which the decisions are made does not affect the development speed or the RoI.

• B3. People that code the system should design the system to achieve speed in development. For this belief, the data was split into two groups by the A3 axis (who makes the decision). One group consisted of projects where the development team was responsible for the decision-making; in the other group, others in the organization were responsible (e.g. application architects, enterprise architects, management). We checked if there was a significant correlation between the groups and the speed of development (S3). No significant correlation was found, so this belief was also not supported by our data.

• B4. Decisions made by higher ranked architects have a higher RoI and better Quality.

– B4.1 Affecting the same axis (A3) as B3, this belief is also centered on the persons making the decisions. The groups were made based on the roles of the decision makers. The same grouping was used as with B3. The groups were checked for RoI (S1) and general quality (S4). Again, no significant correlations were found.

– B4.2 In order to further analyze this belief, we also looked for relationships based on the experience of the architect instead of the role in the organization. Here, one group consisted of people with less than 6 years of architecting experience while the other group consisted of people with more architecting experience. We found a significant correlation between the experience and the RoI (S1, p=0.038), but not with the quality of the product (S4). Hence, projects where more experienced architects made the decisions achieved a better RoI than other projects.

• B5. Less architectural documentation decreases effort needed and speeds up development.

– B5.1 For this belief, we looked at the fourth axis: how the architecture was documented. We split the data into one group that does document the architectural decisions and another group that does not explicitly document them (but relies on face-to-face communication, notes or photos of whiteboards). We checked the difference between these groups on the success factor that concerns the effort needed (S2), but no correlation was found.

– B5.2 Then, the same groups were compared on the other success factor, the development speed of projects (S3). A correlation was found: less documentation correlates with higher development speed (S3, p=0.041).

• B6. Less architectural documentation decreases project quality. On the same axis (A4), the amount of documentation was checked against the quality of the product (S4). Here, we used the same split as in the previous belief, dividing the data into one group that uses documentation and one that does not use it explicitly. Between these groups, no significant quality differences were detected. So, this belief also found no ground in our data.

In Table 6.1, the results of the analysis of the beliefs are summarized. The first column states the belief, the second one the properties of the decision that were used to do the assessments. The 'P' column gives the p-values for the correlations. The last column states the conclusion on the belief.

TABLE 6.1: Overview of Conclusions on Beliefs

Belief   Properties           P        Conclusion
B1       A1 - S4              > 0.05   No evidence found
B2       A1 & A2 - S1 & S3    > 0.05   No evidence found
B3       A3 - S3              > 0.05   No evidence found
B4.1     A3 - S1 & S4         > 0.05   No evidence found
B4.2     arch_exp - S1        0.038    Confirmed
B5.1     A4 - S2              > 0.05   No evidence found
B5.2     A4 - S3              0.041    Confirmed
B6       A4 - S4              > 0.05   No evidence found

6.4.3 Additional Observations

In addition to the evaluated beliefs and related parameters, we also checked the data for correlations that we did not expect. From this, we saw that the two positions on the when-axis of the Triple-A Framework (A1 and A2) were correlated. So, typically architects that make decisions quickly also validate their decisions quickly.

There were also correlations in the properties of the architects and projects. The size of the project seemed to correlate with many other properties: architecture experience (p=0.015), the duration of the project (p=0.000047) and the number of partners (p=0.00054). This means that larger projects typically run longer, with more partners and more experienced architects. Interestingly, there is no significant relationship between the size of the project and any of the success factors. However, the RoI (p=0.047) as well as the quality of the projects (p=0.0095) were higher in longer running projects. Also, longer running projects took more time to make decisions (A1, p=0.030).

Interestingly, development experience is an indicator of better quality in the project (p=0.024), while architecting experience is not.

6.5 Reflection

6.5.1 Threats to Validity

There are several threats to the validity of this research. First, this research is based on a limited set of participants (39). However, as the researchers controlled the distribution of the questionnaire, the seriousness and expertise of the participants were confirmed. This gives us confidence in the quality of the data from these participants.

When conducting a survey, the terminology used is very important. As the participant is not involved in a conversation with the researcher, there is no possibility to correct wrong interpretations. Especially when using terminology from the research community to question industry practitioners, this can lead to misinterpretations. As both authors of this research have much experience in industry, we used terminology familiar to the participants. However, in some cases we presume that interpretation may have caused the observed results. For example, the correlation between decision speed (A1) and decision validation speed (A2) may be caused by the usage of the term 'validation' in the context of decision making.

In this work, we have analyzed data to find correlations between specific parameters in the survey. We acknowledge that correlation does not imply causality, whereas the beliefs we investigated do assume causality. Often, the parameter that can be influenced (e.g. the way in which a team does documentation) is seen as the cause, while the property that is harder to influence (the RoI) is seen as the result. In our research, we investigated the believed relationships by assessing the correlation between the parameters involved. If no correlation is found, the causation direction is irrelevant (as with B1-B3 and B6). The partial confirmation for B4 and B5 does not have a cause-effect 'direction', but is an indication that the belief can be confirmed.

6.5.2 Implications for practitioners

The architecture community holds many beliefs, of which we assessed six. Four of these beliefs have been busted, while the other two only have been partially confirmed. Holding these unfounded beliefs has a severe economic, architectural and process impact.

For the decision speed, as belief B1 is held without warrant, there is a severe economic impact: the decision process takes too long, while the quality does not suffer if decisions are made quicker (B2). In fact, the speed of making decisions does not affect the RoI or the quality of the decision, so the process focus should be on limiting the effort spent on this, while continuing to look for aspects that do affect the RoI or quality.

Considering the persons making the decisions, the impact of this research is that it does not really matter if the architects making the decisions actually code the system (B3). However, if you have more experienced architects, the RoI is positively affected (B4). This means that the processes and roles used in organizations are irrelevant compared to the experience of the crew.

Last, the wrongly held beliefs about documenting architectural decisions also have a severe impact on the effort spent on system design. As having more documentation slows down the development speed (B5), while not adding anything to the quality of the product (B6), the economic implications are evident. Considering the practices of architects, it is interesting to see that there is a lot of research about preserving architectural knowledge [23] by documenting it, while the benefits of this documentation are doubtful. This also affects the ways of working for architects: what is the use of prescribed documents like the SAD? What is the relevance of an architecture review process when this process is based on documents that don’t contribute to a positive outcome of the project?

6.6 Related Work

In architecture design decision research, hierarchical structures are used to model architectural knowledge [23] or design decisions [100, 189]. This research often emphasizes the recording of architectural decisions or the extraction of these decisions later in the development process [186]. There has been much attention to documenting software architectures [42], as well as documentation templates [117] and Model Driven Architecture [146]. In the field of architectural knowledge, Poort et al. [153] conducted a survey to correlate project properties and architectural practices to project success. They focus on the reasons for sharing architectural knowledge, not the actual way in which architectural decisions are made (as analyzed by a position on the how-axis of the Triple-A Framework). They also don’t find strong correlations between the practices used and project success, but do conclude that interpersonal relations have a strong effect on this success.

In the research on the success of software projects, Chow and Cao [40] assessed the success factors within agile projects by conducting a survey with agile professionals in the field. In this work, they compared several practices that affect the project to the perceived success of the project. No architecture-related issues were identified, although one of the conclusions, that the team capability affects the time and effort (cost) success factors, can be related to our finding that the RoI is positively affected by having more experienced architects.

Somers and Nelson [170] assess a more traditional way of (ERP) software development. Interestingly, the most important Critical Success Factor mentioned in this research for the initial stage of the project is the 'architectural choices'. However, there is no assessment of how these decisions should be made in this research.

6.7 Conclusions

All knowledge is founded on assumptions and beliefs. However, these beliefs change as new evidence is found. We have investigated the beliefs around architecture decision making, especially the beliefs that imply that the success of a project is highly dependent on the way in which the decisions are made: who makes the decisions, when they are made, and how they are documented. We can conclude that our empirical evidence, based on the survey we conducted, cannot support most of these beliefs. How quickly or slowly a decision is made does not affect the quality of the end result. The architecting experience of the decision maker affects the RoI of a project, but the role of the decision maker, or whether the decision maker also codes the system, is irrelevant. Documenting the architecture does not seem to affect the success of the project. Even stronger, using less documentation seems to speed up projects without losing quality.

Beliefs can help to establish a common ground for discussions or plans. However, when these beliefs are not true, decisions are unfounded and projects can suffer. Especially discussions based on ungrounded beliefs can take a lot of time and effort. With this research, we hope to bury some of these debates by showing that the beliefs in question are busted.

Acknowledgements

We would like to thank all the participants of the survey for providing us with their insights and data. Also, we would like to thank Viktor Clerc for his help with the initial questionnaire.

Appendix: Selection of the Survey Questions


TABLE 6.2: Project and Person Characteristics

P1  Architecting experience of respondent
    Question in the questionnaire: Please indicate the number of years you have architecting experience.
    Multiple choice answers: 'Less than 1 year' / 'Between 1 and 3 years' / 'Between 3 and 6 years' / 'Between 6 and 10 years' / 'More than 10 years'

P2  Developer experience of respondent
    Question in the questionnaire: Please indicate the number of years you have experience as a software developer.

P3  Duration of project
    Question in the questionnaire: How long has this project been ongoing?

P4  Team size in project
    Question in the questionnaire: Please indicate the (average) size of the software development project team (architects, developers, business analysts, testers, DBA, project management, etc.).
    Multiple choice answers: '1 - 3 project members' / '4 - 8 project members' / '9 - 30 project members' / '31 - 100 project members' / 'More than 100 project members'

P5  Number of partners in project
    Question in the questionnaire: Please indicate the number of different organizations or partners involved in the project.
    Multiple choice answers: <number>


TABLE 6.3: Success Factors

Multiple choice answers (S1 to S8): Strongly disagree / Disagree / Neutral / Agree / Strongly agree

S1  Return on Investment
    This decision increased the return on investment of the project.

S2  Effort reduction
    Because this decision was made, we needed to spend less effort on the project.

S3  Development speed increase
    The decision made the project finish more quickly.

S4  Quality increase
    The quality of the product was increased by the decision.

S5  Performance
    The performance of the system was positively affected by the decision.

S6  Maintainability
    The maintainability of the system was positively affected by the decision.

S7  Security
    The security of the system was positively affected by the decision.

S8  Usability
    The usability of the system was positively affected by the decision.



TABLE 6.4: Triple-A Framework

A1  Periodicity of the making of the decision
    Question in the questionnaire: What was the time between the moment the concern emerged and the moment the architectural decision was taken to address the concern?
    Multiple choice answers: 'Within a day' / 'Between one day and one week' / 'Between one week and one month' / 'Between one month and half a year' / 'More than half a year'

A2  Periodicity of the validation of the decision
    Question in the questionnaire: What was the time between the moment the architectural decision was taken and the moment the architectural decision was validated?
    Multiple choice answers: 'Within a day' / 'Between one day and one week' / 'Between one week and one month' / 'Between one month and half a year' / 'More than half a year' / 'Not validated'

A3  The person making the decision
    Question in the questionnaire: Who was the main decision maker?
    Multiple choice answers: 'Development Team' / 'Application Architect' / 'Domain Architect / Product Owner' / 'Enterprise Architect' / 'Management Project / Organization'

A4  Artifacts used to preserve the decision
    Question in the questionnaire: How was this decision communicated?
    Multiple choice answers: 'Direct communication to the stakeholders (real-life)' / 'Direct communication to the stakeholders (via phone / telco)' / 'Not explicitly documented; Notes, sketches and photos of whiteboards' / 'Documented without a template' / 'In a document, based on a template chosen for this decision' / 'In a document, based on a template that is mandatory in this project'


Chapter 7

Pivots and Architectural Decisions: Two Sides of the Same Medal?

“The only way to win is to learn faster than anyone else.”

- Eric Ries

This chapter is based on: Jan Salvador van der Ven and Jan Bosch. “Pivots and Architectural Decisions: Two Sides of the Same Medal? What Architecture Research and Lean Startup can learn from Each Other”. In: Proceedings of International Conference on Software Engineering Advances (ICSEA 2013). 2013, pp. 310–317.

Abstract

Software architecture research has gained maturity over the last decades. It focuses on architectural knowledge, documentation, the role of the architect and rationale for the architecture decisions made. It is widely recognized that considering architecture decisions as first class entities helps in designing and maintaining architectures. In the entrepreneurial and new product development space, the lean startup movement is gaining momentum as one of the most notable ways to develop products. During new product development in highly uncertain environments, speed is the most important factor. Speed to get on the market, speed to learn from your customers, but also speed to tackle technological risks. Because the runway for new product development is short, it is important to experiment and make decisions quickly. The pivot plays a crucial role as a business decision for new product development. Both pivots and architectural design decisions can be seen as highly influential aspects for a product. In our research, we investigate what the fields of architecture research and lean startup could learn from each other. We focus our research on the two most important aspects of these movements: the architectural decision and the pivot, and show that they can be seen as two sides of the same medal representing the technical and the business side of the product.

7.1 Introduction

Every company changes direction multiple times during its lifetime. In the past, it took a company months or even years to change direction, especially in larger industry settings. In the last decade, the speed at which a company can adapt to changes has become one of the most competitive qualities [25]. The place where this effect is amplified is in new product development, either in small startups or in larger, established companies. Because these projects typically have a short runway to being successful, making decisions quickly is crucial.

Architects have the important role of aligning business strategy with the software architecture of the products [132]. Especially in the domain of new product development, this balance is an enormous challenge, because on the one hand the time to market is essential, and on the other hand the continuation of product and company is dependent on the solidity of the architecture. In new product development, there is also a bootstrapping problem. You need experiments with the Minimal Viable Product (MVP) in order to be able to validate your business assumptions, while you also need to have a piece of architecture to be able to create this MVP. This tension exists in many projects involving new product development.

As a software company, one of the most important aspects of your product is the software architecture, as it highly influences the capabilities (quality attributes) of the product. This architecture is formed by the decisions made during the development and maintenance [100]. Various authors emphasize the importance of these architectural decisions in software development [114, 183]. Models [23], classifications [184] and reasoning structures [179] have been posed to manage these decisions. Key concepts that are used in software architecture are: decision topic, rationale, alternatives, choice, and risk.

Research literature studying new product development and startups [18, 136, 156] identifies a key type of decision that is extensively (and explicitly) used, the pivot. A pivot is the result of a business decision that is made to change the direction of the product. These decisions are based on different kinds of implicit or explicit experiments [25], in order to validate hypotheses about the product, its users or its business case. For the research described in this chapter, we investigated what kind of decisions these pivots are, and what the relationships between pivots and the architectural decisions are. We currently focus on the pivots made at startups, because:

• At a startup, the runway is short, so the architecture of the system evolves very rapidly. Effects of pivots and architectural decisions are visible very quickly, and have a very high effect on the company’s success.

• Larger companies are adopting startup techniques [25] to improve their own time-to-market, especially for new product development. This makes our research relevant as a source of learning for large companies pursuing new product development.

The contribution of this chapter is threefold. First, we introduce a conceptual framework for new product development as an experiment system with pivots and architecture decisions as first class entities. Second, we identify the key concepts for architecture research and new product development, and identify the gaps between them. Third, we provide guidelines for the two fields that describe what they could learn from each other, based on the conceptual model and the identified concepts from both fields.


This chapter is organized as follows. First, we introduce our conceptual framework. Then, we sequentially describe the concepts of software architecture (Section 7.3) and new product development (Section 7.4) from a research and a practical perspective. In these sections, the key concepts are identified. Then, we describe the differences and similarities of the two as analysis in Section 7.5. Based on this we present our guidelines for both fields. This chapter ends with related and future work and some concluding words.

7.2 Conceptual Model

The premise of this chapter is that experimentation, both in the business domain and the technical domain, is a critical technique to increase the chances of success in new product development. Based on our literature findings, we have constructed a conceptual framework for running a new product business as a set of decisions. Our conceptual framework is visualized in Figure 7.1. At the top, two essential risks are shown as input for the business: market risk and technology risk. Which risks are most important depends on the context: the problem addressed, the market, the competitors, the solution chosen, the technical possibilities, etc. Based on which risks are most prominent, hypotheses are formulated to reduce the uncertainty of the associated risk. To test each hypothesis, one or more experiments are performed. These experiments can be explicit (e.g., conducting a planned usage test, running a Proof of Concept, predicting usage statistics), or implicit (e.g., a coincidental encounter, different product use by end-users). Then, based on the results of the experiments, decisions are made for the direction of the product. These decisions steer the direction of the product and the associated business, affecting the market and/or the software architecture, and in the end the product itself. In new product development, pivots are illustrative examples of these decisions. Therefore, the naming of the decision types is based on the phases described by Maurya [136]. In the initial Problem / Solution (P/S) fit stage, the decisions do not affect the system at all, since there is typically no product yet. In the second, Product / Market (P/M) fit stage, the focus of the experiments is to validate the Minimal Viable Product. This can result in pivots that influence the business as well as the product. For these decisions, the market fit is the most important, so the architectural impact is subordinate. In the following phase, assuming that the product / market fit is validated, experiments still need to be conducted to figure out how to scale the product when usage (e.g., number of users or usage per user) grows. Aside from direct business requirements, in each stage software architecture decisions need to be made, for example to support increasing scale, reduce technical debt or support an alternative use case after a solution pivot. This chapter focuses on pivots as decisions that arise from experiments and that affect the business as well as the architecture of a product.

The validation speed is very important in this context. Validating a hypothesis takes time and effort. This effort should result in new insights into the product or the market. If the product changes direction later (a pivot, or abandoning a pivot), the effort should pay for itself through what is learned from it. So, it is important to keep validation time short, and to create hypotheses focused on learning. This is why validation speed is essential in our model.


FIGURE 7.1: Conceptual Framework for Decision-based New Product Development

When looking at product development through our conceptual model, it is possible to see that pivots and architectural decisions are both ways to mitigate risks by experimentation. However, each addresses a different risk, while they affect each other constantly. So, they can be seen as two sides of the same coin, one side showing the market challenges, while the other side shows the associated technological risks. It is virtually impossible to encounter one without the other, as market risks are typically tackled with technological solutions (e.g., the business drivers for the architecture), and technology always affects the business.
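The chain in the conceptual framework (risk, hypothesis, experiment, decision) can be sketched as a small data model. This is only an illustration and not part of the framework itself: all class names, fields and example values below are assumptions chosen for readability, and the rule used in `is_validated` is a deliberately simple stand-in. Only the stage names follow Maurya [136].

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class RiskKind(Enum):
    MARKET = "market"
    TECHNOLOGY = "technology"


class Stage(Enum):
    PROBLEM_SOLUTION_FIT = "P/S fit"
    PRODUCT_MARKET_FIT = "P/M fit"
    SCALE = "scale"


@dataclass
class Experiment:
    description: str
    explicit: bool             # planned (True) or coincidental (False)
    supports_hypothesis: bool  # outcome of the experiment


@dataclass
class Hypothesis:
    statement: str
    risk: RiskKind
    experiments: List[Experiment] = field(default_factory=list)

    def is_validated(self) -> bool:
        # Illustrative rule (an assumption): at least one experiment was run
        # and every experiment supports the hypothesis.
        return bool(self.experiments) and all(
            e.supports_hypothesis for e in self.experiments
        )


@dataclass
class Decision:
    stage: Stage
    description: str
    based_on: Hypothesis

    def affects_system(self) -> bool:
        # In the P/S fit stage there is typically no product yet,
        # so decisions in that stage do not affect the system.
        return self.stage is not Stage.PROBLEM_SOLUTION_FIT


# Hypothetical example values, loosely modeled on case Gamma.
hypothesis = Hypothesis(
    "Users would like to see the results in a stream-like view",
    RiskKind.MARKET,
)
hypothesis.experiments.append(
    Experiment("planned usage test", explicit=True, supports_hypothesis=True)
)
pivot = Decision(Stage.PRODUCT_MARKET_FIT, "switch to a stream view", hypothesis)
print(hypothesis.is_validated(), pivot.affects_system())
```

Keeping the hypothesis and its experiments attached to the decision makes it visible afterwards which risk a pivot or architectural decision was meant to address.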

In the following sections, we will describe how this model can be used in both software architecture and new product development.

7.3 Software Architecture

7.3.1 Software Architecture Research

Software architecture has been researched extensively in the last decades [26, 42, 90]. In this research, architectural knowledge [23, 119] and more specifically architectural decisions [114, 180, 189] play a vital role. What we can distill from this research is that creating architectures is essentially a risk-mitigation process where a balance has to be found between non-functional requirements (e.g., quality attributes), business risks and technological challenges. Often the long-term view is more important than short-term project goals for making the right architectural decisions. In high-pressure situations (e.g., deadlines), it is easy to give in on these long-term issues, causing design erosion [75], technical debt [120] or even worse, project failure. In the next section, three cases are described that show how architecture decisions are used in practice. From this, we identify key concepts for comparing architectural decisions and pivots.

7.3.2 Cases

In order to be able to compare software architecture practices to the lean startup movement, we have to identify what parts are prominent in both fields. To do this for the software architecture space, we have conducted a literature study combined with our experience as participant researchers in several cases. We have analyzed the practices of software architecture in new product development in several cases [184]. In this chapter, we summarize the cases that contain relevant information about how software architecture is used in practice. The cases are anonymized to protect the companies and customers involved. The cases were not selected at random. From the experience of the authors, other cases could have been chosen. However, as Eisenhardt [55] poses, in case study research it is "neither necessary nor preferable to randomly select cases". We have chosen to discuss the cases that concerned new product development, while being large enough to be relevant as industrial cases. A more extensive description of these cases can be found in our previous work [184], where we focused on the role of the architect in the software development process. In this work, we describe our findings from these cases that concern pivots and architectural decisions.

Case Alpha involved the construction of a software system that had to replace a legacy Geographic Information System (GIS) for a large harbor. The new system had to be coupled with several legacy backoffice systems. The customer, a large harbor company in the Netherlands, initiated the project. The solution was service oriented, and consisted of several systems communicating with each other through an Enterprise Service Bus. Most of the software was written in Java. The coupling was one of the most challenging issues in the project. This case consisted of a pilot and a realization phase, of three and six months, respectively. Ten to twenty people were involved during the various phases of the project.

Case Alpha was a typical example of a project that was driven by risk management in order to get the architecture of the system right. Several techniques were used to experiment in order to mitigate risks. In the pilot phase, the time was fixed, and the goal was to show that the most important (technical) risks could be tackled. This resulted in biweekly iterations, each focused on tackling the top-priority risk. In this phase, a PoT (Proof of Technology) and a PoC (Proof of Concept) were made, involving many architectural decisions. Both the PoT and the PoC were demonstrated to the customer as well as the end-users to validate critical assumptions.

Case Beta was conducted at a medium-sized product company in the Netherlands. The project involved a new administrative software system for specific departments in Dutch hospitals. Changing regulations and different working environments needed to be taken into account. The project was executed by a multidisciplinary team of seven people, assisted by the architect from the company. A Java stack (JSF, Spring, EclipseLink) was used for creating this product from scratch, while a different team of approximately seven people developed a part of the backend separately. This separate development was one of the most challenging architectural parts of the project. The development of the product took place over a period of 12 months.

In case Beta, several architectural experiments were conducted, the major one concerning how to manage the introduced complexity of the platform. A prototype was constructed early on. Also, interviews were held with key users in the field. However, the experiments were often conducted ad hoc, without a concrete hypothesis to validate. The architectural question of whether the generic backend part of the system could be reused was validated continuously by using this component in another project, too.

A small startup company working on a web-based product for the consumer market was the scene for case Gamma. The project contained high-risk technological challenges, where the architecture needed to be flexible in the beginning, to be able to handle the expected high number of users. The application was created in Ruby on Rails¹ with a NoSQL backend based on MongoDB² and Redis³. The main architectural challenges were to be able to scale up the application when many consumers are using the system, while being able to adapt the system to changing requirements from the customers.

Case Gamma consisted of constant experimentation. As the product of the company was being developed, several hypotheses were considered, resulting in either small pivots (e.g., users would like to see the results in a stream-like view), or architectural decisions (e.g., the graph database could be best modeled in Redis). However, again the experiments were set up implicitly, e.g., without forming a hypothesis or validating whether the results were as expected.

We have seen the experimental nature in all of these cases. Also, in all of the cases a clear Build, Measure, Learn (BML) loop [156] was used. In cases Alpha and Beta, this loop was used implicitly (never mentioned), while in case Gamma the BML loop was known and explicitly used.

7.3.3 Key concepts

Several key concepts recur in most of the research about architectural decisions [186]:

• Architecture Design decision. Design decisions are the building blocks for software architecture. These decisions consist of the following parts:

• Decision topic. The decision topic is the actual problem that needs to be solved. Often, these topics arise from previous decisions (we decided to base our application on NoSQL technology, which specific database product are we going to use?), or from non-functional requirements (how are we going to ensure our uptime is high enough?).

• Choice. The choice, or decision, is the result of the decision process. Often, this is the only part that is communicated (discussed or documented).

• Alternatives. A typical decision has more than one alternative to choose from. Alternatives can be just named (e.g., different component names), or sometimes architecture parts are considered as alternatives (different styles or patterns, or comparing specific implementations of components). In rare cases, the alternatives are realized and compared as a Proof of Concept or Proof of Technology.

• Rationale. The rationale of a decision describes, often in plain text, why the chosen alternative(s) solve(s) the problem at hand, and why the chosen decision is the best solution.

¹ http://rubyonrails.org
² http://www.mongodb.org
³ http://redis.io

Based on our case material, we have seen two other key concepts that are important around software architecture design decisions:

• Risk. Decisions are often made to mitigate a risk. So, in order to address a concrete market or technological risk, certain decisions need to be made. Risks can be seen as triggers for decision topics.

• Experimentation. To make sure the right decisions are made, experiments are often conducted, in addition to the rationale already discussed, to make it plausible that the suggested solution is correct. This can be done as a PoT, a PoC or something else.
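The key concepts listed above can be captured in a single record. The following is a minimal sketch, assuming one possible layout: the class names, field names and the `undocumented_parts` helper are hypothetical and only serve to show which parts are lost when just the choice is communicated. The example values are illustrative and are not taken from the cases.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Alternative:
    name: str
    realized_as: Optional[str] = None  # e.g. "PoC" or "PoT" when actually built


@dataclass
class ArchitecturalDecision:
    topic: str                       # the problem that needs to be solved
    alternatives: List[Alternative]  # options that were considered
    choice: str                      # the selected alternative
    rationale: str                   # why the choice solves the topic
    risk: Optional[str] = None       # the risk that triggered the topic
    experiments: List[str] = field(default_factory=list)

    def undocumented_parts(self) -> List[str]:
        """Return the parts, besides the choice itself, that were not recorded."""
        missing = []
        if len(self.alternatives) < 2:
            missing.append("alternatives")
        if not self.rationale.strip():
            missing.append("rationale")
        if self.risk is None:
            missing.append("risk")
        if not self.experiments:
            missing.append("experimentation")
        return missing


# Hypothetical record in which only the choice and the alternatives are known.
decision = ArchitecturalDecision(
    topic="Which specific NoSQL database product are we going to use?",
    alternatives=[Alternative("MongoDB"), Alternative("Redis", realized_as="PoC")],
    choice="Redis",
    rationale="",
)
print(decision.undocumented_parts())
```

A check such as `undocumented_parts` is one way to make visible that, as noted above, the choice is often the only part of a decision that is communicated.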

In the following section, we will describe what the nature of new product development is and how the lean startup movement influences it.

7.4 New Product Development

7.4.1 Research

Experimentation in Research and Development (R&D) as a basis for decision-making is the normal approach in a variety of domains, including the manufacturing, automotive, mechanical engineering, medical, and pharmaceutical industries [175]. From the experiential perspective, frequent iterations of products in terms of prototypes or multiple design iterations, testing, and more frequent milestones are associated with faster product development [56]. In the software industry, innovation through experiments with customers is increasingly discussed [49], primarily in the web 2.0 and Software as a Service (SaaS) fields. However, in the software industry, these experiments are currently primarily performed in pilot stages for validating architectural decisions or for feature optimization.

7.4.2 Interview Setup

We have conducted interviews with founders and architects of startup companies, to identify what pivots were made in new product development, and what the nature of these decisions was. In our interviews, we have chosen to focus on pivots as an entry point to talk about the most important decisions and the decision process. We interviewed representatives of five different companies. In these interviews, we discussed a total of nine pivots. Two of the companies were located in the Netherlands, two in the USA, and one in Sweden. All the companies were product companies, delivering web-based software.

TABLE 7.1: Interview Questions

Question
Can you give a short description of the pivot?
Who was involved in the decision process?
What triggered the pivot?
Did you validate the success / results of the pivot? How did you do that?
How long did it take to do this validation?
Were there any alternatives evaluated? If so, what alternatives?
What were the results of the pivot?
Did the pivot affect the (software) architecture of your system / product?
What were the results on the architecture?

Our research has an exploratory nature. This is why we have chosen to use semi-structured interviews for acquiring our data. The interviews lasted from one to two hours. We have recorded all of the interviews to be able to listen again to the conversations during the analysis phase. In addition to this, the interviewer made notes during the interview. Based on the notes and the recordings, a log with results was created after the interview. These logs were the basis for our analyses.

The interviews were structured as follows. First, an introduction was given about the current status and the goal of the research. The interviewee was asked for permission to publish the results and to record the interview. Then general questions about the company and terminology were asked, after which the interviewee was asked to tell about several pivots they were involved in. The interviews in the Netherlands were done face-to-face in Dutch, while the interviews with Sweden and the USA were done via videoconference in English.

We have used interview questions as guidance through our open-ended interviews. First, basic questions about the interviewee and company were asked, including whether the company worked according to lean startup principles and whether the architecture of the system was considered explicitly. In order to relate the results of the different interviewees to each other, we have asked them to describe what they mean by three key terms in our research: pivot, architecture, and architecture decision. Then, we used a set of questions to let the interviewees reason about their pivots. As we wanted to focus on the decision process around pivots, we have not extensively questioned the technical details, but focused on the decision part of the pivots. The interview questions that were used are shown in Table 7.1.

These questions were used as a baseline for the interview. Where appropriate, additional questions were asked, or explanation was asked for. In some cases, when the answer to a question had already been given or when the question was irrelevant for the context, the question was skipped and its answer was later noted based on the recordings and notes.

TABLE 7.2: Overview Companies

| Company | Location | Role | Domain | Size |
|---|---|---|---|---|
| Voys | NLD | Founder | Voice over IP, telecom for small business | 23 |
| Certive | USA | Lead Engineer | Enterprise analytics software | 20 |
| Dataprovider | NLD | Founder | Data | 10 |
| Burt | SWE | Chief Architect | Analytics for publishers | 28 |
| Zevents | USA | Lead Engineer | Local search advertising | 50 |

For our research to be generic, we have selected a variety of interviewees and companies. On the other hand, we had to narrow our research in order to make sure the interview results would be comparable. We used the following criteria for selecting the companies:

• Companies from the software industry in the startup phase, or with a recent startup origin.

• Companies at least one year in business at the time of the discussed pivot(s).

• Companies that produce a product or service (no consulting).

• Companies with more than one employee.

This resulted in the selection of a set of five companies, as shown in Table 7.2. The columns give the company name, the geographical location, the role of the interviewee, the domain of the company and the size of the company (approximate number of employees). From each of the companies, we interviewed one of the key persons involved in the pivot(s) that occurred.

7.4.3 Interview results

First, we had to identify our interviewees' point of reference. To do this, we asked them what three key terms in this research mean to them.

• Pivot. Even though the term pivot is widely used in the software industry, there was some difference in the explanations of what a pivot is. Two points came back in all interviews: that it is a radical break with the 'previous' way of working/thinking, and that often different users/customers were targeted after a pivot. So, the business strategy of a company changed. One person emphasized that layoffs are often the result of a pivot, making it 'scary' for employees when a pivot occurs.

• Software Architecture. The traditional view on architecture was dominant among the interviewees. All of them identified connectors/interfaces as one of the most important parts of architecture. Also, the mapping of business (requirements) onto the technical design of the system was mentioned often.


• Architectural Design Decision. One of the interviewees had no idea what an architectural decision meant. The others noted that it is a conscious decision, where a specific direction is chosen for the architecture of a system (a branch point).

We have summarized the results from the interviews in Table 7.3. In this table, after the name of the company and a short description of the pivot, the risk that was tackled by the pivot is described. The next column describes what experiments were conducted to validate the pivot. This information was derived from what the interviewees discussed based on the interview questions (e.g., the trigger for the pivot and the alternatives evaluated). Then, the evaluated alternatives are shown, and in the last column of the table the results on the architecture are described.

Although Ries [156] identifies ten different types of pivots, he does not discuss the effects that pivots have on the architecture. From our interviews we have found that it is possible to typify pivots by the impact they have on the architecture, as described in our conceptual framework. Six of the pivots were business (product/market fit) pivots and three were scale pivots. Although all interviewees stressed the fast-paced, dynamic and uncertain nature of new product development, they recognized the importance of employing a structured, systematic approach to decision making.

7.4.4 Key Concepts

The following key concepts involving new product development are extracted from the literature:

• BML / Experiment. The basis of the lean startup lies in the Build Measure Learn (BML) loop, as described by Ries [156]. This means that in order to find a sustainable business, one has to continuously execute experiments (build), measure the effects, and learn from the results.

• MVP. The Minimal Viable Product (MVP) is the first version of the product that can be used to start the BML loop. This can be a first version of a product, but it can also be something else (e.g., a landing page, a video) as long as the hypotheses about the product can be validated.

• Hypotheses. In order to know whether one is going in the right direction, one has to know where one wants to go. This is posed in a hypothesis that can be tested by experimentation.

• Validation. Key to understanding the results of a build step is to identify how to validate or invalidate a hypothesis.

• Measuring. Even though validation is considered one of the most important parts of the BML loop, the measuring is always an arduous part. Measuring can be done either qualitatively (e.g., interviews) or quantitatively (e.g., surveys, usage measuring, A/B testing).


• Pivot. A pivot is a key concept in the lean startup movement: a decision to change direction for a product. Several types of pivots have been identified by Ries [156].

TABLE 7.3: Overview of Pivots

| Nr | Company | Pivot / Decision | Prioritized Risk | Experiments and Validation | Evaluated Alternatives | Results on Architecture |
|---|---|---|---|---|---|---|
| P1 | Voys | Business model change | Unknown | Accidentally showing internally used functionality to a customer. | None | The architecture became more of a 'Christmas tree' |
| P2 | Voys | Architecture reconstruction | Maintainability decrease | Technological exploration | 1) Buying functionality from other suppliers and 2) merging with other company | Reworked architecture, the system was now manageably growing |
| P3 | Voys | Change of product packaging | Customers misused the product | Usage testing and measuring | None | Unknown (currently in progress) |
| P4 | Certive | Radical change in business | Unknown | Demonstrating a mock-up to potential customers at a conference | Unknown | Moved more to hosted and cloud-based services |
| P5 | Dataprovider | Scaling the indexing possibilities | Technical possibility to scale product | Technological pilots, automated performance validation | All different kinds of NoSQL solutions were evaluated | Possible to index sites at a high speed. |
| P6 | Dataprovider | Enhance defect efficiency | Data not accurate enough | Usage measuring and experimentation at customer site | 1) External provider for data and 2) buying data from others | Not much, the major change is in the way the application was used (the customer can decide the error rate) |
| P7 | Burt | Change of customers from advertisers to publishers | Advertiser market is uncertain business | Usage measuring and discussion | 1) Stay on advertisers and 2) move to both publishers and advertisers | Better distributed scalable architecture. Many principles were decided on (e.g. Start with two on anything) |
| P8 | Burt | Change in product from advertiser tool to analysis tool for advertisers | Customers are not able to judge the market value of product | Prototype, demonstrate to potential customers | Several prototypes of different ideas were tried | Change from desktop to web based platform |
| P9 | Zevents | Change in focus on search instead of publisher oriented site | Business of publisher sites was going down. | Discussion, prototypes | Lot of discussion about other alternatives took place. One alternative was offering 'deals' for local companies. | Architecture and tooling became more 'generic', making it harder for the company to distinguish itself against others. |

Based on the interviews, an additional concept emerges:

• Risk. For most of the pivots discussed in the interviews, the interviewees mentioned that the pivot was made in order to mitigate some risk. The identification of this risk was often the starting point for the pivot.

7.5 Analysis

In this section, we summarize the similarities and differences between the architecture research space and the startup space, by comparing the most characteristic aspects of both: architectural decisions and pivots. The introduced concepts of both software architecture and lean startup / new product development are compared in Table 7.4.

One of the biggest differences is the focus. Whereas the architecture community focuses on long-term non-functional requirements, the lean startup community focuses on rapid validation of business assumptions (hypotheses). This also has a cost implication. For lean startups, the speed of validation is the most important aspect. So, the experiments should be as fast and cost-efficient as possible, to be able to change direction quickly if the market or the technology demands it. This contrasts with the approach of the architecture community, where the focus is much more on making correct decisions to reduce cost later in the development.

Several parts come back in both worlds. Both consider risks as primary triggers for making a decision, and both have an explicit description of what needs to be solved: the decision topic and the hypothesis, respectively. Further, both use experimentation to see if the decision is correct, even though these experiments take different forms. The minimal version needed to validate that a decision is correct also comes in different forms: in architecture this is often a technological proof, while in new product development this typically involves customers and end-users.

Further, as can be seen from the table, several concepts from one field seem to be nonexistent in the other field. The explicit parts of the decision in the software architecture field (Choice, Alternatives, Rationale) do not exist in the lean startup field. Alternatives are evaluated (as seen in the interviews) and rationale is used to argue for decisions or pivots, but decisions as first class entities are not common in the lean startup field. On the other side, the measuring and validation that are key in the lean startup field are not considered in the architecture space.


This research is based on a limited set of cases and interviews. To a certain extent, interviews bear some subjectivity, because an interview is a conversation between two individuals. Because of the exploratory nature of our research, using semi-structured interviews was a good way to validate our model. However, this research could be extended with more interviews, and by gathering more quantitative data based on surveys, as described in the future work.

TABLE 7.4: Concept Comparison

| Architecture Decision Concept | Lean Startup Concept | Architecture | New Product Development |
|---|---|---|---|
| Architectural Design Decision | - | First class entity for the architecture | - |
| - | Pivot | - | Radical change in business model |
| Decision topic | Hypotheses | Decision topics are typically hierarchical (caused by previous decisions), or caused by arising or expected risks. | - |
| Choice | - | Often referred to as the decision itself, this is the selection of the best alternative | The choice is not explicitly mentioned in the new product development space. |
| Alternatives | - | Are often made explicit in documentation | Alternatives are rarely made explicit. |
| Rationale | - | Existing in the heads of the developers, or (ideally) written down explicitly | Less relevant as the results are measured quickly. |
| Risk | Risk | Often the focus is on technological risks. Is addressed by reasoning, often the cause of a decision topic and thus a design decision | Focus is on the business risks. Is addressed by experimentation |
| Experimentation | BML / Experimentation | Automated testing (e.g. performance tests), Research, Discussion | Interviews, Usage measuring, Demonstration, Discussion, Prototyping, Research, Usage testing |
| PoC / PoT | Minimal viable product | In order to address certain risks, PoCs or PoTs are conducted. Main goal is to validate the viability of the concept or the technology, not the business | One of the main goals for a product under development. Main goal is to start validating the business model as quickly as possible. |
| - | Measuring | Rarely done | Measuring is the only way to validate the hypotheses |
| - | Validation | Is often not done; if it was done, it was done by reasoning. It is often hard to validate an NFR | Direct business validation. Often the existence of the company validates the pivot. |


For validity reasons, we have not presented our framework or model to our interviewees. This would have biased our interviewees, and possibly changed the way they described the pivots and answered the questions.

7.6 Guidelines

In addition to confirming the conceptual framework, the data presented in this chapter allowed us to derive a set of guidelines about what the fields of software architecture and new product development could learn from each other.

7.6.1 Solve both Business and Architecture as Experiments

For new product development, explicit experimentation is common. Architects can learn from this by doing similar explicit experiments to validate the architectural decisions at hand. This helps architects to speed up development and to develop the business more quickly.

7.6.2 Business as a Set of Decisions

As shown in our conceptual model, new product development can be treated as an iterative process of running market and technology experiments. The experiments are driven by the risks that need to be tackled, and the result of the experiments is a set of decisions that form the business and the product. As we have shown that an architecture can be seen as a set of decisions, we think this view can be extended by considering pivots as business decisions. In this view, the business can actually be seen as the set of decisions taken based on the results of experiments.

By making the decisions in new product development more explicit, it is possible to piggyback on the experience that software architecture research has already developed. Explicit decisions can, for example, be used to trace the decision process, to change decisions when the situation changes, and to see the dependencies that decisions have on each other.
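As an illustration of what explicit decisions make possible, the sketch below finds every decision that needs to be revisited when one decision changes. It assumes that the dependencies between decisions have been written down as a simple mapping. The function name and the decision log are hypothetical, and the entries in the log are invented examples rather than data from the interviews.

```python
from collections import deque
from typing import Dict, List, Set


def affected_decisions(depends_on: Dict[str, List[str]], changed: str) -> Set[str]:
    """Return all decisions that directly or indirectly depend on `changed`."""
    # Invert the edges: for each decision, which decisions build on it?
    dependents: Dict[str, List[str]] = {}
    for decision, prerequisites in depends_on.items():
        for prerequisite in prerequisites:
            dependents.setdefault(prerequisite, []).append(decision)

    # Breadth-first traversal over the inverted edges.
    affected: Set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical decision log: each key is a decision, each value lists the
# decisions it depends on. Business and architecture decisions are mixed.
decision_log = {
    "target publishers instead of advertisers": [],
    "web-based platform": ["target publishers instead of advertisers"],
    "use NoSQL storage": ["web-based platform"],
    "pick specific NoSQL product": ["use NoSQL storage"],
    "pricing model": ["target publishers instead of advertisers"],
}

print(sorted(affected_decisions(decision_log, "web-based platform")))
```

Because business decisions (pivots) and architecture decisions live in the same log, a change on the business side directly shows which architectural decisions are affected, and the other way around.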

7.6.3 Creative Validation of Architectural Decisions

Even though some efforts are made to validate architectural decisions, the field of software architecture could benefit considerably from the creative way in which lean startups validate their hypotheses. Of course, the time horizon of the two kinds of decisions is not always the same, but the tendency to validate an architectural decision by reasoning could be complemented by more objective ways of validation (e.g., usage statistics, A/B testing).
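To make "more objective ways of validation" concrete, the sketch below evaluates a hypothetical A/B test with a standard two-proportion z-test. The choice of this particular statistic is our assumption for the example and is not prescribed by the lean startup literature cited here. The scenario (two architecture variants compared on the share of requests answered within a response-time target) and all numbers are invented.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    success_a: int, total_a: int, success_b: int, total_b: int
) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    # Pooled proportion under the null hypothesis that both variants are equal.
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (p_b - p_a) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical measurements: requests answered within the response-time target.
z, p_value = two_proportion_z_test(
    success_a=120, total_a=1000,  # variant A: current architecture
    success_b=150, total_b=1000,  # variant B: candidate architecture
)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value suggests that the measured difference between the variants is unlikely to be due to chance alone. What counts as small enough, and whether the measured quantity reflects the quality attribute of interest, remain judgment calls for the architect.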

7.6.4 Sometimes, Architecture can be Added Later

We have seen that in highly uncertain environments pivots affect the balance in the development of new products. Since p/m pivots put the emphasis on validating the business, the architecture of the product is often minimally supported. This can cause design erosion and technical debt. However, we have seen that there are several strategies used at our investigated companies to overcome this:

• Pivot away. The first strategy we identified was that in some cases the pivot was so radical that the current architecture was thrown away. So, no matter how unbalanced the scale was, the complete business changed and the complete architecture of the system changed too. Of course, the experience of the team and the business knowledge are reused, but the system itself was largely or completely rebuilt. Sometimes a completely new technology stack was adopted (P2, P4, P8), while in other cases existing components were reused (P1, P5, P7).

• Add architecture later. When a product/market fit is found, but the architecture of the system is unable to facilitate the next phase (scale, as described in [136]), then architecture needs to be added later. So, in order to handle certain (non-functional) requirements for scaling, like performance or changeability, the architecture of the system needs to be improved. As we have seen in our interviews (P2, P5), this is possible even though it can be expensive.

7.7 Related Work

Although the field of new product development is not new, lean startup is quite new, and within the research community there has not been much research about this topic. The basis for our model, experimentation, lies in the work of Thomke [174] and Davenport [49]. This was extended with the methodologies from the lean startup community [18, 136, 156]. From our own work on architectural design decisions, we generalized the idea of running a business as an explicit set of decisions [184], based on the experiments [25].

The relationship between business and architecture has been extensively studied from the product line perspective, for example BAPO [132]. We have shown that two types of decisions are extremely important in new product development: business (e.g., pivot) and architecture decisions.

7.8 Future Work

Based on the encouraging results from our research, we are planning to extend it in several ways. First, we are planning to interview more people, to extend our data set and further validate and refine our findings. For example, none of our interviewees talked about hypotheses, even though the literature emphasizes hypothesis-based experimentation. Second, we are planning to extend our question set to a questionnaire that can be sent to a larger group of people for a more quantitative validation.

Also, we are planning to test the usage of our model in industrial settings. For this, we are planning to conduct case studies at several companies, where we would guide the company into using the conceptual model, and reflect on the efficiency. This could sharpen our framework and it would give further validation of the viability of our proposed work.


Last, we would like to extend our guidelines to even more actionable guidelines that could be used in the various stages a product can be in.

7.9 Conclusions

In this research, we have shown that new product development is based on two types of decisions: architectural decisions and pivots. We have presented a conceptual framework that addresses both decisions in the context of an experimental risk-based process. This framework can help practitioners to structure their new product development process. From our interviews we derived a set of guidelines that emphasized the importance of decisions in experiments. Both architectural decisions and pivots play a vital role in the development of new products, as two sides of the same coin representing the technical and the business part of a decision.

Acknowledgements

We would like to thank the following interviewees for taking the time to talk with us about their pivots: Mark Vletter, Gordon Rios, Christian Branbergen and Theo Hultberg.


Chapter 8

Towards Reusing Decisions by Mining Open Source Repositories

“The code is the architecture.”

- Hohpe et al. [92]

This chapter is based on:

• Jan Salvador van der Ven and Jan Bosch. “Making the Right Decision: Supporting Architects with Design Decision Data”. In: Proceedings of the 7th European Conference on Software Architecture (ECSA 2013). Ed. by Khalil Drira. Vol. 7957. Lecture Notes in Computer Science. Springer, 2013, pp. 176–183

• Jan Salvador van der Ven and Jan Bosch. “Towards Reusing Decisions by Mining Open Source Repositories”. Submitted to an international Software Engineering Journal, 2018

The structure of this chapter follows the latter. The description of the levels of architectural decisions is added from the ECSA 2013 publication.

Abstract

Frameworks and reusable components have caused a rapid increase in software development speed. Frameworks offer a proven architecture to work on, and enable easy integration of components. Components provide specific functionality: calculations, APIs, connectors, GUIs, etc. Selecting the right components is one of the most important decisions in software development projects, but also surprisingly hard to get right. These decisions are getting increasingly difficult as the availability of alternative components for similar functionality is growing. Luckily, this increased availability also holds for the number of projects that use these components, so an increasing set of example projects is available online. We show that data about previously made decisions in these projects can be made available for decision-makers, so they can learn from others making similar decisions. We provide a detailed analysis on the suitability of our approach for popular programming languages. We describe how the decision can be identified in the version history of open source projects. This data contains statistical data about the used components and can be used to base decisions on. Also, we show that decision-makers can be easily contacted for specific decision rationale. Our approach is exemplified by an implementation that mines decisions from Ruby projects.


8.1 Introduction

In the past years, the productivity of software development has increased. Recently developed component frameworks [148] for a range of programming languages help quick development and component integration. In basically all of the major development paradigms, frameworks exist that give developers a basis from which they can start development immediately. These frameworks actually implement a set of architectural decisions so the developer does not need to make these decisions over and over again. For example, all the modern web frameworks (Rails, Django, etc.) implement the MVC (Model View Controller) pattern [66], and base their APIs on SOAP [194] or REST [64]. In addition, these frameworks provide the 'plumbing' for the integration of third-party Commercial-Off-The-Shelf (COTS) [129] or Open Source (OS) components [10], making it easier to combine and reuse these components. The main challenge for developers and architects shifts from the choice for specific architectural patterns or styles to the selection of the right frameworks and component sets. The choice for the framework is often driven by non-project specific requirements like contracts with suppliers, experience of the team or preference of the customer organization. However, components come and go much quicker than frameworks. So, the challenge for developers and architects shifts from choosing the right architecture once, to making component related decisions continuously. This research focuses on the design decisions that involve the selection of these components: Component Selection Decisions. In this work, we define Component Selection Decisions as architectural decisions that involve the selection and evaluation of components. The abbreviation CSD will be used for the rest of this manuscript.

In software ecosystems, information on selecting the right components is critical for success [77]. Many of the decisions on component use have been made earlier by others working on similar systems. It would be great if decision-makers could access the decisions made by others, preferably in a data-driven fashion that would allow them to determine what selections were made from a set of alternatives and with what frequency. That would give decision-makers hard, quantified data to base their own decisions on. The question is: how can we access these decisions and the people that made them?

Interestingly, over the last decade or more, several open-source software repositories have achieved broad adoption and host hundreds of thousands of projects in virtually any programming language and application domain imaginable. Examples include SourceForge¹ with 430K projects and 3.7 million developers, and GitHub² with more than 57 million repositories and more than 8.5 million developers. On the one hand, this is part of the identified problem: how to locate the correct component in this huge set? On the other hand, as many of the projects in these repositories are public, there is a huge amount of data available about the structure of these projects as well as the evolution of these structures over time.

¹ http://sourceforge.net/
² https://github.com/

The version control systems (e.g. Git, svn, Mercurial) of these projects contain exactly this evolution data. In order to provide decision-makers with data about the CSDs that others made, these version control systems provide an excellent source of data. However, considering the sheer volume of data, this requires an automated, rather than manual, approach to derive the information. We propose an approach that harvests this big software data and makes it available to decision makers in a statistical way. With this, decision makers can avoid making bad decisions while finding relevant alternatives for the decision at hand.

The contribution of this chapter is threefold:

• First, we analyze the possibility of mining CSDs from the history of version control systems. We show the suitability of this approach for the set of most used programming languages.

• Second, we show that data from open source systems can be mined and made available to base CSDs on by describing our implementation for open source Ruby projects.

• Third, we demonstrate the applicability of the approach by showing how decision makers benefit from the acquired data.

This manuscript is organized as follows. First, the context of making decisions based on big software data is explained, ending with a vision on how to base decisions on data. Then, in the research approach, the theoretical background on mining repositories for decisions is described, resulting in three research questions. These questions are addressed in the subsequent sections. This work ends with a discussion and a conclusion.

8.2 Context

8.2.1 Components in Software: Libraries and Frameworks

As software development is maturing, strategic reuse of software can be the difference between success or failure [166] [157]. At the architecture level, patterns or styles, or COTS or OS components are the typical reusable artifacts. This means software engineering is changing for a large part from system development to component integration, where software developers are becoming component integrators [10]. There are several reasons why components are gaining attention. First, as hardware is getting cheaper, there is no need to create tailored components for specific situations that skimp on (hardware) resources. Instead, reusing existing components and gluing them together is a very good (and typically faster) alternative. Additionally, tools for sharing sources and reusing components have matured rapidly. It is nowadays easy to find numerous components that satisfy specific functionality. Examples of places where you can find components are RubyGems³ or CocoaPods⁴. Finally, frameworks have been developed that enable easy integration of these components into projects. These frameworks satisfy a tailored set of architectural decisions and provide plumbing to tie components together.

³ https://rubygems.org/
⁴ http://cocoapods.org/

This increased possibility for component integration creates a shift for the work of software architects and developers. Modern frameworks implement a set of specific architectural styles and patterns; therefore, these decisions are no longer up for debate. Also, the integration of components is simplified as the 'plumbing' is typically taken care of by these frameworks. The most important question to solve becomes what components to use in a situation. This places CSDs on a critical path in many software development projects. However, since the potential number of components is immense, the knowledge about these components is extremely difficult to acquire and it is hard to keep this knowledge up-to-date. Architects and developers seldom use a structured process when selecting components [130], and normative methods provided by the research community are rarely used [82]. Most component selection is done based on the experience from internal or external experts [10]. However, the context and implementation details are typically very different. So, the challenge is how to access the right knowledge, and how to contact decision-makers that faced similar challenges for sharing rationale on the decision.

8.2.2 Available Decision Data

There are various sources available to base component selection decisions on. Often, the initial search starts with the experience of the architect, or their colleagues (tacit knowledge). In addition, most components have online documentation that is consulted as an extra source of information. However, as the authors of a component typically write this documentation, it is questionable how independent this data is. A third source of data for selection consists of anecdotal experience reports in the form of tutorials or blog posts. These documents often describe a simplified use case of the component, making it hard to judge if the proposed solution will work in complex (real-life) situations. As a last source, many hosts of components provide metadata on the component: number of downloads, dependencies, versions, last modified date, etc. In Figure 8.1, an example of this data is given for the 'Rest' Ruby Gem⁵.

⁵ https://rubygems.org/gems/rest

FIGURE 8.1: Publicly available Component Data for the Rest Ruby Gem

Architects have to base their decisions on these sources, but to get real knowledge about the usage one has to experiment with a sample implementation, which is very time consuming and expensive. Our work fills this gap between the available documentation and the expensive experiments with a data driven solution that helps architects to acquire data about the usage of components without having to make the investment for the experimentation. The data is based on the actual implementations that have been done previously with specific components. Statistics on the usage can be used to find the possible alternatives, while the commit data can provide rationale or access to a decision maker who made a similar decision before. It helps decision-makers to avoid the 'first fit' trap [82] by making knowledge about alternatives easily accessible.

8.2.3 Vision: Decision Support based on Big Software Data

The previous subsection described how decision makers are forced to make decisions based on limited decision data. The needed data on made decisions is hidden in project histories. We propose a methodology that enables better decision making based on big data that is mined from real-world repositories. Because the data needs processing before it can be used for decision support, two parts need to exist: the acquisition and processing of the available data (Data Mining), and making the data accessible for architects (Decision Support). In Figure 8.2 these two aspects are visualized. When the Decision Maker faces a problem that involves a component selection, they can use the Decision Explorer to find similar decisions made by others previously (stored in the Decision Database). This database is filled based on the data from Project Repositories. From these repositories, decisions have been extracted by identifying relevant Deltas.

FIGURE 8.2: The Envisioned System: Decision Support and Data Mining

With this envisioned system, the decision maker can base the decisions on data of real-life projects. To the knowledge of the authors of this manuscript, a system like this does not currently exist. In this work, we investigate how to develop such a system.

8.3 Research Approach: Mining Decisions

8.3.1 Architectural Decisions

In research about architectural design decisions [23] [180], typically four aspects of decisions are considered: the decision topic, the choice, the alternatives that are considered and the rationale (sometimes formalized as ranking) of the decision. We discuss these four aspects of architectural decisions, and describe how we use these aspects to identify decisions in repository data of open source projects.

• Decision Topic. The decision topic is the actual problem that needs to be solved. Often, these topics arise from previous decisions (e.g. we decided to base our application on NoSQL technology, which specific database product are we going to use?), or from non-functional requirements (e.g. how are we going to ensure our up-time is high enough?).

• Choice. The choice, or decision, is the result of the decision process. Often, this is the only part that is communicated (discussed or documented).

• Alternatives. A typical decision has more than one alternative to choose from. Alternatives can be just named (e.g. different component names), or sometimes architecture parts are considered as alternatives (different styles or patterns, or comparing specific implementations of components). In rare cases, the alternatives are realized and compared as a Proof of Concept or Proof of Technology.

• Rationale. The rationale of a decision describes, often in plain text, why the chosen alternative(s) solve(s) the problem at hand, and why the chosen decision is the best solution.

TABLE 8.1: Reflection of Decisions in Version Management

Decision Concept   | Version Management Concept
Topic and Decision | Commit
Rationale          | Commit message and author information
Alternatives       | Structure of commits

The focus of this chapter is on design decisions that change after the initial implementation of the system, during development or maintenance. These decisions express themselves through changes in the version management system, i.e. commits of new and changed code. All of the previously mentioned aspects of a design decision can be located in the version history or implementation of the system. The decision topic and the choice are reflected in the (architecturally relevant) commits. The rationale for the decision is ideally reflected in the commit message, and the author of the commit can be contacted for additional rationale. Alternatives can be found in the history of the architecturally relevant commits. Table 8.1 summarizes this.

There are different abstraction levels of architectural decisions. As described by de Boer et al. [23], decisions are often related to each other, and this relationship typically forms a tree structure down from more abstract to more concrete (decisions cause new decision topics). Figure 8.3 symbolically visualizes such a graph. Generally speaking, three levels of decisions can be distinguished:

1. High-level decisions. High-level decisions affect the whole product, although they are not necessarily always the decisions that are debated or thought through the most. Often, people that are not involved in the realization of the project (e.g. management or enterprise architects) heavily affect these decisions. Typical examples of high-level decisions are the choice to adopt an architectural style (e.g. service-oriented), use a programming language, use high-level systems (e.g. service bus implementation) or a specific application server. Changing these decisions typically has a huge impact on the architecture of the system.

2. Medium level decisions. Medium level decisions involve the selection of specific components or frameworks, or describe how specific components map to each other according to specific architectural patterns. These decisions are often debated in the architecture and development teams and are evaluated, changed and discarded during development and maintenance of the system. They have a high impact on the (non-functional) properties of the product and are relatively expensive to change.

3. Realization level decisions. Realization level decisions involve the structure of the code, the location of specific responsibilities (e.g. design patterns), or the usage of specific APIs. These decisions are relatively easy to change, and have relatively low impact on the properties of the system.

FIGURE 8.3: Relationships between Decisions

As we have experienced in our industrial cases [184], the architectural decisions that are hardest to make are the medium level decisions, for the following reasons:

• they have a high impact on the functional and non-functional properties of the system;

• they change constantly, especially compared to high-level decisions that only change when remaking the system;

• they are costly to change because of the impact on the system;

• because new components and versions are created constantly, it is hard to stay knowledgeable about relevant alternatives;

• they have unpredictable results until they are implemented in the system.

The focus of this chapter is on medium level design decisions that change during development or maintenance.

8.3.2 Mining Big Software Data

Big Software Data [143] enables researchers to learn from the history of projects at a large scale. To understand component evolution based on project history, Figure 8.4 shows component changes in three projects schematically. This figure visualizes component changes in different projects over time, where a '+' means the addition of a component and a '-' means a component has been removed. This historical data provides a great source for finding CSDs. Component changes are reflected on several levels of abstraction in projects.

FIGURE 8.4: Project History concerning Component Change

• File level. When a project evolves, the source files of the system change. These changes are tracked in the version management system, often as changes per line (added, removed or modified lines). These changes can imply a change in component use.

• Commit level. A set of file changes (a commit) reflects a change, for example the change of a component. In earlier version management systems, this was the default unit of change. Changes were pushed to a server containing the latest version of the total system.

• Pull request level. In modern version management systems like Git or Mercurial, commits are grouped together as pull requests. This makes it possible to create small increments of changes by doing commits locally, while being able to deliver a set of related changes at once to the master server. However, as Kalliamvakou et al. point out [105], pull requests are not used with enough discipline to know if this abstraction level is usable.

• Release level. A set of commits or pull requests can be bundled as a release. When releasing to a production environment, the right component configuration needs to be in place. Like the pull request level, this level can be seen as a set of component-level changes.

Component changes can be found on all levels. For the remainder of this chapter, we consider commit-level CSDs, even though we do analyze single lines in the detection process. The commit is used because this is a level that is available as a first-class entity in all version management systems, and always has a date, a commit message and an author: information that is essential when looking for decisions and their rationale. We consider the changes based on pull requests and releases as potential future work.
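As an illustration of how single lines can be analyzed within a commit, the following minimal Python sketch extracts component changes from a unified diff. It assumes that the components of a Ruby project are declared as `gem` lines in a Gemfile; the regular expression, the function name `gem_changes` and the example diff are hypothetical and not taken from our implementation.

```python
import re

# Matches added or removed `gem "name"` declarations in a unified diff.
# Assumption: components are declared one per line in a Gemfile.
GEM_LINE = re.compile(r"""^([+-])\s*gem\s+['"]([^'"]+)['"]""")


def gem_changes(diff_text):
    """Return (added, removed) gem names found in a unified diff of a Gemfile."""
    added, removed = set(), set()
    for line in diff_text.splitlines():
        match = GEM_LINE.match(line)
        if match is None:
            continue  # context lines, hunk headers, "+++"/"---" file headers
        sign, name = match.groups()
        (added if sign == "+" else removed).add(name)
    # A gem that appears on both sides was only edited (e.g. a version bump).
    return added - removed, removed - added


# Hypothetical diff of one commit.
EXAMPLE_DIFF = """\
--- a/Gemfile
+++ b/Gemfile
@@ -1,4 +1,4 @@
 source 'https://rubygems.org'
-gem 'mysql2'
+gem 'pg'
-gem 'rest', '1.0'
+gem 'rest', '1.1'
"""

added, removed = gem_changes(EXAMPLE_DIFF)
print(sorted(added), sorted(removed))  # prints ['pg'] ['mysql2']
```

The version bump of `rest` is deliberately ignored, since it changes neither the set of added nor the set of removed components.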

The evolution of source code is captured in the version management systems of projects, and the decisions are also traceable in these systems. However, how they are represented and how they can be extracted depends on the version management system and the structure of the programming language. When looking at the basic operations on component changes in projects, we encounter three distinct situations [186]:

1. A component was added in a commit. This means that somewhere in the project, at least one line was added with a name of, or reference to, the component. A typical example of this is the C #include statement. Component addition is an example of the 'Existence decision' of Kruchten [114].

2. A component was removed in a commit. Removing a component is reflected in the version management system by modifying import statements, configuration files, and/or removing code and files. Kruchten calls this type of decision the 'Ban or non-existence' [114].

3. A component was replaced in a commit. In the third situation, an explicit decision was made to remove one component and introduce another one. If this happens in the same commit, this is a strong indication that one component was replaced by another, hence an alternative with potentially similar functionality.
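The three situations above can be told apart by comparing the set of components before and after a commit. The sketch below is a minimal Python illustration under the assumption that these two sets have already been extracted; the names `ChangeKind` and `classify_commit` are hypothetical.

```python
from __future__ import annotations

from enum import Enum


class ChangeKind(Enum):
    NONE = "none"
    ADDITION = "addition"
    REMOVAL = "removal"
    REPLACEMENT = "replacement"


def classify_commit(before: set[str], after: set[str]) -> ChangeKind:
    """Classify a commit by comparing its component sets before and after."""
    added = after - before
    removed = before - after
    if added and removed:
        # Something went out and something came in within the same commit.
        return ChangeKind.REPLACEMENT
    if added:
        return ChangeKind.ADDITION
    if removed:
        return ChangeKind.REMOVAL
    return ChangeKind.NONE


kind = classify_commit({"rails", "mysql2"}, {"rails", "pg"})
print(kind.value)  # prints replacement
```

Note that a commit classified as a replacement is only a candidate: the sketch cannot tell whether the added component really took over the role of the removed one.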

Adding components to a project happens the most often (in our initial study, 62% of the found changes were additions versus 38% removals [186]). However, the only information that one can get from this is that the component was introduced in a project. So, this data does not add very much to existing data sources like total downloads as shown in Figure 8.1. Sometimes, it is not even possible to check if the component is actually still used even though it is still included in the project (e.g. a library is included but it is never used in the source code).

The second situation (component removal) contains more valuable information for CSD detection: someone made an explicit choice to stop using a component. However, it is difficult to know whether this component was removed in this specific commit, or whether the project stopped using the component some time ago and the explicit removal happened during a cleanup of the code, long after the decision was made.

The third situation is the most promising: it is known that one component was added and another was removed in the same commit. This increases the probability of the two components being related and perhaps even being alternatives for each other.

We use the delta (∆) as a concept to describe replacements of components (adding one component and removing another in the same commit). In Figure 8.5 this concept is visualized. On the left side, the processed commits are shown for two distinct projects. The first commit (10defd0…) removed two components (Rest and jQuery UI), while adding one (Bootstrap). This leads to two deltas (and hence candidate CSDs): Rest → Bootstrap and jQuery UI → Bootstrap. The deltas can be calculated for all the commits in a set of projects. Then, the found candidate decisions can be summed across all projects to see what changes happened at what frequency. An example of a way to visualize this graph is shown in Figure 8.6, where only the interaction with Bootstrap is visualized for the example above.

FIGURE 8.5: The Design Decision Extraction Process. [Figure: the commits 10defd0… (- Rest, - jQuery UI, + Bootstrap), 5db61e4… (- jQuery UI, + Bootstrap) and bfd2806… (- MySQL, + PostgreSQL) from Project One and Project Two produce deltas (∆) that are grouped into the candidate Component Selection Decisions "Replace Rest by Bootstrap", "Replace jQueryUI by Bootstrap" and "Replace MySQL by PostgreSQL".]

FIGURE 8.6: Component Replacements. [Figure: graph with the components Rest, jQuery UI and Bootstrap as nodes; the edge labels (1 and 2) give the number of found deltas.]
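The delta computation described above can be sketched in a few lines of Python. The commit data mirrors the example of Figure 8.5; the dictionary layout and the function name `deltas` are assumptions made for this illustration only.

```python
from collections import Counter
from itertools import product


def deltas(removed, added):
    """Return every (removed, added) pair of one commit as a candidate CSD."""
    return list(product(sorted(removed), sorted(added)))


# Commits from the example of Figure 8.5 (data layout is hypothetical).
commits = [
    {"id": "10defd0", "removed": {"Rest", "jQuery UI"}, "added": {"Bootstrap"}},
    {"id": "5db61e4", "removed": {"jQuery UI"}, "added": {"Bootstrap"}},
    {"id": "bfd2806", "removed": {"MySQL"}, "added": {"PostgreSQL"}},
]

# Sum the candidate decisions over all commits: the edge weights of the graph.
graph = Counter()
for commit in commits:
    graph.update(deltas(commit["removed"], commit["added"]))

for (old, new), weight in sorted(graph.items()):
    print(f"{old} -> {new}: {weight}")
```

Commits that only add or only remove components yield no deltas, because the Cartesian product with an empty set is empty.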

Associated with such an identified decision, the commit can provide additional data. For example, the commit message can contain rationale on the decision. Also, the time stamp on the commit can help to search for trends in component replacement. Last, the author information provided by the commit can help to contact the decision maker for additional rationale on the decision.
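A candidate decision can thus be stored together with the commit data that accompanies it. The following Python sketch shows one possible record layout and a simple trend query on the time stamp; the class `CandidateDecision`, its field names and all example records are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CandidateDecision:
    removed: str            # component that was dropped
    added: str              # component that took its place
    commit_id: str
    message: str            # may contain rationale
    author: str             # person to contact for further rationale
    committed_at: datetime  # enables trend analysis


def replacements_per_year(decisions, removed, added):
    """Count how often `removed` was replaced by `added`, grouped by year."""
    return Counter(
        d.committed_at.year
        for d in decisions
        if d.removed == removed and d.added == added
    )


# Hypothetical records; ids, messages and authors are made up.
decisions = [
    CandidateDecision("MySQL", "PostgreSQL", "a1", "Switch DB for JSON support",
                      "dev-one@example.org", datetime(2014, 3, 2)),
    CandidateDecision("MySQL", "PostgreSQL", "b2", "Use pg",
                      "dev-two@example.org", datetime(2015, 6, 9)),
    CandidateDecision("MySQL", "PostgreSQL", "c3", "Migrate to Postgres",
                      "dev-three@example.org", datetime(2015, 11, 20)),
]
trend = replacements_per_year(decisions, "MySQL", "PostgreSQL")
```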

A single found deletion and addition is no evidence that the components were actually replacements for each other. For example, in the scenario used above the Bootstrap component is no replacement for the Rest component. If deltas occur often (e.g. in several unrelated commits from different projects), the chances decrease that the found replacements are coincidental. If this mechanism is applied to a large number of projects, a weighted graph can be created that provides insight into CSDs that have actually been applied across many projects.
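One simple way to reduce coincidental replacements is to keep only those deltas that were observed in several distinct projects. The Python sketch below illustrates this idea; the threshold `min_projects`, the function name and the example observations are assumptions, not values from our study.

```python
from collections import defaultdict


def supported_replacements(observations, min_projects=2):
    """Keep (removed, added) pairs seen in at least `min_projects` projects.

    `observations` is an iterable of (project, removed, added) tuples.
    Returns a dict mapping each kept pair to its number of distinct projects.
    """
    projects_per_pair = defaultdict(set)
    for project, removed, added in observations:
        projects_per_pair[(removed, added)].add(project)
    return {
        pair: len(projects)
        for pair, projects in projects_per_pair.items()
        if len(projects) >= min_projects
    }


# Hypothetical observations; project names are placeholders.
observations = [
    ("project-one", "Rest", "Bootstrap"),
    ("project-one", "jQuery UI", "Bootstrap"),
    ("project-two", "jQuery UI", "Bootstrap"),
    ("project-three", "jQuery UI", "Bootstrap"),
]
weighted_graph = supported_replacements(observations)
```

In this example only jQuery UI → Bootstrap survives, with a weight of three projects; the single Rest → Bootstrap observation is filtered out.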

8.3.3 Research Questions

In order to assess if the envisioned system can be constructed, several questions arise. First, the suitability of programming languages for mining decisions should be explored. Second, the extraction process should be tested: is it possible to identify the decisions and to relate them to each other, so that statistical data can be acquired on them? Based on our previous work with a small set of Ruby projects, we have an indication that this is feasible. However, is this approach effective and can it be scaled to a larger number of projects? This leads to the following research questions for this research.

• RQ1: How do different languages compare in their suitability for mining CSDs?

• RQ2: How effective can CSDs be mined in a scalable way?

• RQ3: Do the identified CSDs provide sufficient information to base the decision process on?

Because each of these questions required a different approach, we have included the experimental setup in the corresponding sections. In the following sections, the research questions are addressed sequentially.

8.4 Programming Language Comparison

8.4.1 Experimental Setup

In this section, the first research question is assessed: RQ1: How do different languages compare in their suitability for mining CSDs? First, the criteria for the programming languages are described. Then, the selection of the languages is described, followed by an analysis of the applicability of our approach to each of these languages. This section concludes with a comparison of the programming languages based on the described criteria.

8.4.2 Criteria for Programming Languages on Suitability for Mining Decisions

Based on the theoretical description of the mining process in the previous section, we describe the suitability of programming languages in this section. Three criteria are used to assess the suitability of a language:

1. There has to be a large number of available projects that use the language. These projects can contain the data for CSD mining systems. We investigated several hosting services for open source projects (e.g. GitHub, Sourceforge, BitBucket). However, as GitHub hosts 50 times more projects than the largest competitor, we decided to base our analysis on GitHub projects. We counted the total number of projects (from GitHub6) and the number of active projects (from Githut7) as a measure for the available data.

2. The structure of the language (and supporting tools) needs to enable mining of data based on the version history of projects. For this, we identify which component management systems (usually in the form of dependency management systems) are commonly used, and how suitable these systems are for discovering CSDs.

3. A language has to have a solid component ecosystem that facilitates intensive component (re)use. For this criterion, we analyze whether there is a generally accepted location for finding components, and check how many components these locations host.

When these three criteria are met, the programming language can potentially be used to create a decision database on which decision support can be based.

6 Checked on 18-4-2016 from http://github.com

7 Based on Q4 of 2014 from http://githut.info/


8.4.3 Programming Languages

We have analyzed a variety of programming languages, namely those with the most active repositories on GitHub8. Shell, R and VimL were skipped since they are not programming languages that are used extensively for consumer products. In addition, we skipped CSS, as this language is always integrated with other languages (e.g. HTML, JavaScript) and does not use components in the language itself. Next, we discuss each of these programming languages and describe where data on component selection is located. Per language (group), we describe the involved files and how they could be parsed to obtain relevant CSD data. We also assess the three criteria per language.

JavaScript was introduced in 1995 as a functional programming language that originated as a client-side scripting language for managing dynamics and interaction for HTML websites. More recently, JavaScript has received growing attention as a server-side language with asynchronous communication through Node.js9. A huge number of projects is available in JavaScript: 324K active projects and 2034K total repositories with JavaScript as their main language. Even though libraries and components are used in client-side JavaScript, they are usually included across different files of different types (e.g. JavaScript, static HTML or dynamic server-generated HTML). This makes it very difficult to process component decisions in client-side JavaScript. There are, however, tools that enable dependency management in one location for client-side JavaScript: Browserify10 or Bower11. With Bower, the components that a project uses are defined in a single JSON file: bower.json. This file would be well suited to mine for CSDs. The server-side Node.js framework uses npm12 to manage dependencies that are defined in one file: package.json. Here, the 'dependencies' section describes the components that the project uses. This data can be used to trace the CSDs. There is an active community around the Bower and npm tools that provides a large set of components.
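As an illustration of how the 'dependencies' section could be traced, the following minimal Python sketch compares two revisions of a package.json file and reports which components were added and removed. The function names and the two sample revisions are hypothetical; this is not the tooling used in this research.

```python
import json


def npm_dependencies(package_json_text):
    """Return {component: version_range} from the 'dependencies' section."""
    manifest = json.loads(package_json_text)
    return dict(manifest.get("dependencies", {}))


def dependency_delta(old_text, new_text):
    """Compare two revisions of package.json; return (added, removed) names."""
    old = npm_dependencies(old_text)
    new = npm_dependencies(new_text)
    return sorted(set(new) - set(old)), sorted(set(old) - set(new))


# Two hypothetical revisions of the same project's package.json.
before = '{"name": "demo", "dependencies": {"express": "^4.0.0", "jade": "^1.0.0"}}'
after = '{"name": "demo", "dependencies": {"express": "^4.0.0", "pug": "^2.0.0"}}'

added, removed = dependency_delta(before, after)
print("added:", added, "removed:", removed)
```

Because the file is structured JSON, comparing parsed revisions avoids having to interpret line-based diffs in context. The same idea would apply to bower.json.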

Java is used extensively in the open source world. With 223K active and 1890K available projects, a large dataset is available for mining CSDs. The Java programming language has a rich history in dependency and build management. Currently, three options are most often used: Maven, Ivy and Gradle. In all of these solutions, a structured file is used to describe the components that are used (Maven and Ivy use XML, and Gradle uses a Groovy-based build script). These files have a specific name and location (pom.xml, ivy.xml and build.gradle), so they are relatively easy to find and process. However, because there is no single accepted way to define components (many frameworks use different definitions of components), there is no single point where the components can be discovered with ease.
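A minimal sketch of how the components in a Maven pom.xml could be listed is given below. It uses Python's standard XML parser and only reads the direct dependencies element; the sample POM is hypothetical, and inherited or managed dependencies are deliberately ignored.

```python
import xml.etree.ElementTree as ET

# Maven POM files declare this default XML namespace.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def maven_dependencies(pom_text):
    """Return 'groupId:artifactId' for each direct dependency in a POM."""
    root = ET.fromstring(pom_text)
    deps = []
    for dep in root.findall("./m:dependencies/m:dependency", POM_NS):
        group = dep.findtext("m:groupId", default="", namespaces=POM_NS)
        artifact = dep.findtext("m:artifactId", default="", namespaces=POM_NS)
        deps.append(f"{group}:{artifact}")
    return deps


# Hypothetical pom.xml content used only for illustration.
sample_pom = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.12</version>
    </dependency>
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
      <version>19.0</version>
    </dependency>
  </dependencies>
</project>"""

components = maven_dependencies(sample_pom)
print(components)
```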

8 Based on data from http://githut.info

9 http://nodejs.org/

10 http://browserify.org/

11 http://bower.io/

12 https://www.npmjs.com/

Python is a popular language for open source development, industry, and academia. It has more than 100K available projects on GitHub alone. The pip package manager is commonly used to manage the required dependencies for projects. The requirements.txt and setup.py files define, per project, which libraries and components are used. These files can be used similarly to the Ruby Gemfile, as described in the previous section, to mine for CSDs. Components can be found easily on the pypi website13, where 56K components are hosted.
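Since requirements.txt lists one component per line, extracting the component names is straightforward. The sketch below is a simplified illustration: the regular expression, the handling of pip options, and the sample file are assumptions, and requirement lines that are URLs are not supported.

```python
import re

# A component name at the start of a line, up to the version specifier.
NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def requirement_names(text):
    """Return the lower-cased component names listed in a requirements file."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if not line or line.startswith("-"):   # skip blanks and pip options (-r, -e)
            continue
        match = NAME.match(line)
        if match:
            names.append(match.group(1).lower())
    return names


# Hypothetical requirements.txt content.
sample = """# web stack
Django==1.9.5
requests>=2.9
-r dev-requirements.txt
psycopg2  # PostgreSQL driver
"""

names = requirement_names(sample)
print(names)
```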

PHP is a rich language that has many extensions, components and frameworks. With 848K projects, it has a sufficient amount of project data available. Dependency management is not part of the language itself; packages can be included in basically any file in the project. However, the Composer library manager is used in many (professional) projects for managing the components in the project. It uses a composer.json file that defines in JSON which specific components are used in the project. This file could be used with our method to mine CSDs. Composer components are listed at packagist14, where around 52K components are available.

C and C++ are less popular in the open source community, with 73K and 87K active projects (481K and 536K total) available. Because the implementation of C / C++ parsers and tools varies across the different platforms, no real dependency manager exists. This makes it difficult to identify component changes across projects. The data exists, but it is scattered across potentially any file in the project with an #include statement. These #include statements can include external components, but they can also refer to other components within the same project, which can pollute the data. There is no single location available for discovering or downloading components.
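The pollution problem can be made concrete with a small sketch that separates #include statements by their delimiter. Treating angle-bracket includes as external and quoted includes as project-local is only a heuristic assumed here for illustration; it cannot, for example, tell standard library headers from third-party components.

```python
import re

# Matches '#include <header>' and '#include "header"', allowing extra whitespace.
INCLUDE = re.compile(r'^\s*#\s*include\s*([<"])([^>"]+)[>"]', re.MULTILINE)


def classify_includes(source):
    """Split included headers into (angle_bracket, quoted) lists."""
    external, local = [], []
    for opener, header in INCLUDE.findall(source):
        if opener == "<":
            external.append(header)
        else:
            local.append(header)
    return external, local


# Hypothetical C/C++ source fragment.
sample = '''#include <stdio.h>
#include <boost/regex.hpp>
#include "parser.h"
  # include "util/log.h"
'''

external, local = classify_includes(sample)
print("external candidates:", external)
print("project-local candidates:", local)
```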

C# has around 56K active projects on GitHub, and around 467K GitHub repositories are tagged as using C#. In .NET projects (C#), the typical structure is to have a solution file (.sln), which serves as the single point of entry for the project. This file refers to potentially multiple .csproj files (C# project files). These XML files define, among other things, the components that are used. The challenge with C# projects is that the data is scattered across the project, so it is hard to identify a single point where the changes can be found. Components for C# (.NET) projects are available at NuGet15, where around 35K unique components are available.

Objective-C is used mainly for creating mobile apps for the iOS platform. A small base of 37K active projects is available on GitHub, out of 301K in total. Components are defined and distributed as Cocoapods16. The used components are summarized in one file per project: the Podfile. This file has a similar structure to the Ruby Gemfile, as it is an unstructured summary of used components: a line containing 'pod x' means that component x is used in the project. Hence, Objective-C projects are well suited to be mined for CSDs. Components are available on cocoapods.org (around 7K).

13 https://pypi.python.org/pypi

14 https://packagist.org/

15 http://www.nuget.org/

16 http://cocoapods.org/

TABLE 8.2: Project Availability of Programming Languages

| Language | Active Projects | Total Repos | Conclusion |
|---|---|---|---|
| JavaScript / Node.js | 324 K | 2.034 K | ++ |
| Java | 223 K | 1.890 K | ++ |
| Python | 165 K | 1.041 K | ++ |
| PHP | 139 K | 848 K | + |
| Ruby | 133 K | 1.001 K | ++ |
| C++ | 87 K | 536 K | + |
| C | 73 K | 481 K | + |
| C# | 56 K | 476 K | + |
| Objective-C | 37 K | 301 K | + |
| Go | 22 K | 119 K | + |

Go, even though it is a very new language, already has 22K active projects available on GitHub (119K total). In Go, dependency management was implemented from the beginning. In the Godep JSON file, all the used components are defined. With the Godep tooling, this file is used to update the environment with the right components and versions of these components. The Godep file would be a perfect source for the component decisions. On Godoc17, already 60K components are available.

8.4.4 Results

This subsection provides a summary of the languages and their suitability for mining CSDs, as described in the previous section. All languages have an enormous number of projects available. Even the language with the fewest active projects (Go) has 22K projects available.

All of the modern, heavily used programming languages have the capability to provide the decision data as described earlier in this chapter. In some languages, the extraction process is fairly simple, as there is one file that describes the used components with one component per line (Python, Ruby, Objective-C). Other languages use a structured JSON or XML file that has the same expressive power (JavaScript Node.js, Java, PHP, Go). However, in order to acquire this data, parsing the changes is more complicated, as the specific context should be taken into account. Some other languages have a more complex structure to mine, where the component dependencies are scattered across the project files (JavaScript, C++, C), or scattered across specific files (C#). With these languages, changes to the whole project should be considered instead of commits on a specific file. However, the definitions of the component usages have a simple structure (one line per component), so parsing them once they are found is possible.

Most languages provide a specific location where data about components can be found. Typically, open source components are shared there. With Python, Ruby and Go, the use of the component mechanism and ecosystem is part of the best practice of the language, and therefore these provide the most interesting cases for our approach.

Table 8.5 shows the results of the three assessed criteria for all programming languages. To conclude, seven of the ten assessed programming languages are suitable for CSD extraction. For three of the C variants (C, C++ and C#), this is somewhat more difficult, mostly because the references to components are potentially scattered through the projects. So, concerning RQ1: How do different languages compare in their suitability for mining CSDs?, it can be concluded that the methodology is suitable for most of the assessed languages, but for some languages it will be easier to implement than for others.

17 https://godoc.org/

TABLE 8.3: Ease of Mining of Programming Languages

| Language | Tooling | Data Location | File Type | Conclusion |
|---|---|---|---|---|
| JavaScript / Node.js | N/A | Node.js: package.json or bower.json | JSON | +/- |
| Java | Maven, Ant/Ivy, Gradle | pom.xml | XML | + |
| Python | Pip | requirements.txt | Text | ++ |
| PHP | Composer | composer.json | JSON | + |
| Ruby | N/A | Gemfile | Text | ++ |
| C++ | MS studio | #include statements | N/A | -- |
| C | MS studio | #include, makefile, configure.in | N/A | -- |
| C# | MS studio | .sln / .csproject & .vbproject | Text/XML | - |
| Objective-C | cocoapods.org | Podfile | Text | ++ |
| Go | Godeps | Godeps.json | JSON | + |

TABLE 8.4: Programming Language Component Ecosystem

| Language | Component location | # Available Components | Conclusion |
|---|---|---|---|
| JavaScript / Node.js | https://www.npmjs.com/, http://bower.io/search/ | 475K / 22K | + |
| Java | https://search.maven.org | 216K | + |
| Python | https://pypi.python.org/pypi | 126K | ++ |
| PHP | https://packagist.org/ | 165K | + |
| Ruby | https://rubygems.org/ | 139K | ++ |
| C++ | N/A | N/A | - |
| C | N/A | N/A | - |
| C# | https://www.nuget.org/ | 102K | + |
| Objective-C | http://cocoapods.org/ | 7K | + |
| Go | http://godoc.org/ | 60K | ++ |

TABLE 8.5: Summary of Programming Language Suitability

| Language | Project Availability | Ease of Mining | Ecosystem |
|---|---|---|---|
| JavaScript / Node.js | ++ | +/- | +/- |
| Java | ++ | + | + |
| Python | ++ | ++ | ++ |
| PHP | + | + | + |
| Ruby | ++ | ++ | ++ |
| C++ | + | -- | - |
| C | + | -- | - |
| C# | + | - | + |
| Objective-C | + | ++ | + |
| Go | + | + | ++ |

8.5 Decision Mining

8.5.1 Introduction

This section addresses the second research question: RQ2: How effective can CSDs be mined in a scalable way? To show this, we have extended our work on the proof-of-concept on mining open source Ruby projects [186].


For the proof-of-concept implementation, a set of Ruby repositories from GitHub was cloned and the relevant history was processed into a database. As a start, all the projects were cloned (i.e. a copy of the whole project history was created) to a local computer. From these repositories, all the changes (commits) on the Gemfile18 were collected. The Gemfile contains a list of all the components that are used in a Ruby project, so the history of this Gemfile reflects the component usage in a project.

We looked at the lines that changed between commits on the Gemfiles of all analyzed projects. Since we were looking for decisions that involve component selection, we focused on the changed lines of the commits that represent a change of components (in this case, the lines that started with "gem").

18 http://bundler.io/gemfile.html

Every commit on the Gemfile is taken, and every line that changed in the Gemfile within the commit is processed. To do this, we have automatically processed the output from the git log command, which outputs the history of a file. As an example, a fragment of the git log output of one of the selected projects (factory_girl) is presented in Figure 8.7. The following data can be extracted from this fragment: the commit-id, the author of the commit (anonymized for privacy reasons), the date, and the commit message (in this case, "rr ⇒ mocha"). After that, the lines that changed are displayed (added lines with a plus (+) sign and removed lines with a minus (-) sign). Git log offers options to customize the output, which we used to process the data.

    commit 554e6ab378a3c10a28d9...
    Author: ##### <###@###.com>
    Date: Fri Aug 12 22:06:10 2011

    rr => mocha
    ...
    -gem "rr"
    +gem "mocha"
    +gem "bourne"
    ...

FIGURE 8.7: A Fragment of a Git Log Output Example

TABLE 8.6: Acquired Data

| Parameter | Initial Dataset | Final Dataset |
|---|---|---|
| # projects imported | 620 | 1,318 |
| # commits | 12,413 | 22,270 |
| # changed lines | 43,053 | 71,745 |
| # added lines | 26,665 | 44,662 |
| # removed lines | 16,388 | 27,083 |
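The processing of git log output described above can be sketched in Python as follows. This is a minimal illustration and not the Gitminer implementation: it assumes the default patch format produced by a command such as `git log -p -- Gemfile`, and the commit id, author and record layout in the sample are hypothetical.

```python
import re

# An added (+) or removed (-) Gemfile line that declares a gem.
GEM_LINE = re.compile(r'^([+-])\s*gem\s+["\']([^"\']+)["\']')


def parse_gemfile_log(log_text):
    """Turn patch-format git log output into a list of commit records."""
    commits, current, in_diff = [], None, False
    for line in log_text.splitlines():
        if line.startswith("commit "):
            current = {"id": line.split()[1], "author": "", "date": "",
                       "message": [], "added": [], "removed": []}
            commits.append(current)
            in_diff = False
        elif current is None:
            continue
        elif line.startswith("diff --git"):
            in_diff = True
        elif not in_diff and line.startswith("Author:"):
            current["author"] = line[len("Author:"):].strip()
        elif not in_diff and line.startswith("Date:"):
            current["date"] = line[len("Date:"):].strip()
        elif not in_diff and line.startswith("    "):
            # git indents commit message lines by four spaces
            current["message"].append(line.strip())
        else:
            match = GEM_LINE.match(line)
            if match:
                key = "added" if match.group(1) == "+" else "removed"
                current[key].append(match.group(2))
    for commit in commits:
        commit["message"] = " ".join(commit["message"])
    return commits


# Hypothetical log fragment modelled on Figure 8.7.
sample_log = """commit abc1234
Author: Jane Doe <jane@example.com>
Date:   Fri Aug 12 22:06:10 2011

    rr => mocha

diff --git a/Gemfile b/Gemfile
--- a/Gemfile
+++ b/Gemfile
@@ -1,3 +1,4 @@
 source "http://rubygems.org"
-gem "rr"
+gem "mocha"
+gem "bourne"
"""

commits = parse_gemfile_log(sample_log)
print(commits[0]["removed"], "->", commits[0]["added"])
```

Tracking whether the parser is inside the diff keeps deeply indented context lines from being mistaken for commit message lines.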

The described process was fully automated, so no human interpretation was necessary for acquiring the data sets. From the local project repositories, we created insert scripts for all the commits and all the lines that were changed in each commit. These were inserted into our database for further analysis. Our previous work describes the extraction process in more detail [186]. We used the data from our previous work as the initial dataset, which we extended with a new set of projects. A summary of the numbers of used projects and commits is presented in Table 8.6. The software for mining the repositories, as well as the insert scripts for replicating the used dataset, can be found in the replication package of this research19.
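How commits and changed lines might be stored can be sketched with an in-memory SQLite database. The schema, table names and sample record below are assumptions made for illustration only; the actual database layout is part of the replication package and is not reproduced here. Parameterized inserts are used instead of generated insert scripts.

```python
import sqlite3

# Hypothetical schema: one row per commit, one row per changed gem line.
SCHEMA = """
CREATE TABLE commits (
    id TEXT PRIMARY KEY, project TEXT, author TEXT, date TEXT, message TEXT);
CREATE TABLE changed_lines (
    commit_id TEXT REFERENCES commits(id), change TEXT, component TEXT);
"""


def store_commit(conn, project, commit):
    """Insert one commit record and its added/removed components."""
    conn.execute(
        "INSERT INTO commits VALUES (?, ?, ?, ?, ?)",
        (commit["id"], project, commit["author"], commit["date"], commit["message"]))
    rows = [(commit["id"], "+", name) for name in commit["added"]]
    rows += [(commit["id"], "-", name) for name in commit["removed"]]
    conn.executemany("INSERT INTO changed_lines VALUES (?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Hypothetical commit record, shaped like the fragment in Figure 8.7.
store_commit(conn, "factory_girl", {
    "id": "abc1234", "author": "anonymized", "date": "2011-08-12",
    "message": "rr => mocha", "added": ["mocha", "bourne"], "removed": ["rr"]})
conn.commit()

counts = conn.execute(
    "SELECT change, COUNT(*) FROM changed_lines GROUP BY change ORDER BY change"
).fetchall()
print(counts)
```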

Components often have dependencies on other components [1]. This can be a problem in our approach, as the dependency for a component can also become an identified change. However, when working with Ruby projects, the Gemfile identifies the required components, and the software that installs the components (Bundler) handles the dependencies. So, the required components are not explicit in the Gemfile and therefore do not corrupt the data.

19 http://jansalvador.nl/data/data.html


8.5.3 Results

This section addresses RQ2: How effective can CSDs be mined in a scalable way? To analyze the results, we have selected a subset of the components that affect the selection of database technology with the Relationship Visualizer [186]. Appendix B shows four graphs in two dimensions. The first dimension is the threshold (N) for the minimum number of times a specific change (e.g. component A is replaced by component B) occurred. So, only if more than 9 commits were found with this specific change, they are included in the upper part of the appendix. The other dimension is the number of projects used (and hence, the number of commits the graph is based on). This dimension distinguishes our initial and our final data set.

Even though the four graphs consider the same subject, they look quite different depending on the dataset and threshold. If the threshold is lower, more alternatives are shown (e.g. pg -> sqlite3 is shown in the bottom graphs but not in the graphs on top). If more projects are considered, more alternatives are found. If the number accompanying the arrow is higher, more decisions have been found for this specific CSD. The lower threshold views can be used as an exploration strategy to find non-trivial alternatives.
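The effect of the threshold can be illustrated with a small sketch that counts how often a component replacement occurs across commits and keeps only those that reach N. Pairing every removed component with every added component in the same commit is an assumption made here for illustration; the sample commits are hypothetical.

```python
from collections import Counter
from itertools import product


def count_replacements(commits):
    """Count (removed, added) component pairs over all commits."""
    counts = Counter()
    for commit in commits:
        # Pair each removed component with each component added in the same commit.
        counts.update(set(product(commit["removed"], commit["added"])))
    return counts


def frequent_replacements(counts, threshold):
    """Keep only the replacements that occurred at least `threshold` times."""
    return {pair: n for pair, n in counts.items() if n >= threshold}


# Hypothetical commits from different projects.
sample_commits = [
    {"removed": ["mysql"], "added": ["mysql2"]},
    {"removed": ["mysql"], "added": ["mysql2"]},
    {"removed": ["mysql"], "added": ["mysql2"]},
    {"removed": ["pg"], "added": ["sqlite3"]},
]

counts = count_replacements(sample_commits)
shown = frequent_replacements(counts, threshold=2)
print(shown)
```

With a threshold of 1 the pg -> sqlite3 replacement would also be shown, which mirrors how the lower threshold views reveal more alternatives.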

For the dataset shown, we used a threshold of > 5 to avoid counting incidental commits as CSDs. As we showed in our previous research, 62% of the identified commits were CSDs [186]. So, assuming the commits are independent, the chance that 6 commits describe the same change while none of them is a CSD is very low ($0.38^{6} \approx 0.003$). Increasing the dataset should be weighed against the time it takes to collect and analyze the data (in the current implementation, generating the views takes about twice as long if the data set doubles).
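The estimate above can be reproduced with a short calculation. The sketch assumes, as the argument implicitly does, that each commit is independently a CSD with probability 0.62; the function name is hypothetical.

```python
def chance_no_csd(precision, n_commits):
    """Chance that none of n independent commits is a real CSD."""
    return (1 - precision) ** n_commits


p = chance_no_csd(precision=0.62, n_commits=6)
print(f"{p:.4f}")
```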

In order to validate that the data becomes more meaningful when the number of projects increases, the data from our previous work [186] was extended with a new set of projects. We counted the total number of candidate CSDs we found. Table 8.7 summarizes the results. In this table, the first column shows the minimum number of times (N) a certain CSD was found in the data set. We wanted to make sure the proposed methodology scales. First, we checked whether the number of decisions increases when data from more projects is added. This would mean that the decisions found are not just coincidental. Second, we assume that there is a significant overlap between decisions made in different projects. This would imply that more common decisions have a higher number of occurrences if the total dataset size is increased. The validity of these assumptions was checked with the data in Table 8.7. It shows that the number of deltas grows when the number of commits grows.

Based on the previous research, where 62% of the found commits were decisions, we converted the deltas to (potential) CSDs. In Table 8.8, we show how much the number of CSDs grew in absolute terms and how much it grew relative to the total number of commits.

The number of CSDs that occur more often (higher N) grows if more projects are considered. This confirms that there is a significant overlap between decisions made in different projects. To conclude, by increasing the number of projects, the strength of the decisions increases (more identical decisions are located where rationale can be found), and the total dataset with decisions is larger, so more decisions can be located. So, concerning RQ2, we have shown that the CSD mining process can be scaled to larger quantities.


TABLE 8.7: Number of Deltas Identified

| N≥ | Initial Dataset | Final Dataset |
|---|---|---|
| 2 | 4,865 | 6,852 |
| 3 | 770 | 1,231 |
| 4 | 188 | 431 |
| 5 | 80 | 201 |
| 6 | 50 | 124 |
| 10 | 18 | 51 |

TABLE 8.8: Summary of CSD Growth

| N≥ | Initial: #CSD / #commits | Extended: #CSD / #commits | Absolute CSD Growth | CSD Growth per commit |
|---|---|---|---|---|
| 2 | 0.33549 | 0.26337 | 141% | 79% |
| 3 | 0.05862 | 0.05224 | 160% | 89% |
| 4 | 0.01483 | 0.01895 | 229% | 128% |
| 5 | 0.00639 | 0.00895 | 251% | 140% |
| 6 | 0.00402 | 0.00555 | 248% | 138% |
| 10 | 0.00145 | 0.00229 | 283% | 158% |
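The two growth columns of Table 8.8 appear to be consistent with the delta counts in Table 8.7 and the commit counts in Table 8.6. The sketch below recomputes them under that assumption: absolute growth as the ratio of final to initial deltas, and growth per commit as the ratio of deltas per commit. The #CSD / #commits columns are not recomputed here.

```python
# Commit counts from Table 8.6 and delta counts from Table 8.7.
COMMITS = {"initial": 12413, "final": 22270}
DELTAS = {2: (4865, 6852), 3: (770, 1231), 4: (188, 431),
          5: (80, 201), 6: (50, 124), 10: (18, 51)}


def growth(n):
    """Return (absolute growth %, growth per commit %) for threshold n."""
    initial, final = DELTAS[n]
    absolute = final / initial
    per_commit = (final / COMMITS["final"]) / (initial / COMMITS["initial"])
    return round(absolute * 100), round(per_commit * 100)


results = {n: growth(n) for n in DELTAS}
for n, (absolute, per_commit) in results.items():
    print(f"N>={n}: absolute {absolute}%, per commit {per_commit}%")
```

A per-commit value above 100% means that the number of recurring decisions grew faster than the number of commits, which is the case for N≥4 and higher.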

8.6 Accessing Decision Rationale


In the previous section, the extraction of decisions was confirmed. However, the applicability of these decisions was not assessed. Often, the rationale of the decision [189] is more important than the decision itself. This section describes how the found decisions can be used to access relevant rationale.

The first data source for the decisions is the statistical data from other people making similar decisions: how often did others make decisions, and how often did they choose alternatives? As a second source, the commit messages can be used to acquire rationale for the decision. When a decision is made more often, this dataset contains a lot of decision rationale, as every identified commit has a commit message describing why the changes were made.

In some situations, it can be necessary to have additional knowledge on the decision that was made. This can partially be found in the additional data from the mined CSDs. Every commit in a repository has an author. From this author, the name and email address are known. So, contacting a decision maker for additional rationale on the decision can be fairly easy. However, as this is data from open source projects, it is unknown if the decision makers will respond to a call for help. Also, if they respond, it is unknown if their answers contain useful rationale. In order to address the usefulness of the methodology, this section analyses the last research question: RQ3: Do the identified CSDs provide sufficient information to base the decision process on?

TABLE 8.9: Quantitative Results

| Group | % | Decision? | Rationale? | Alternatives? |
|---|---|---|---|---|
| All | % Yes | 61,75 % | 25,50 % | 4,75 % |
| All | % No | 38,00 % | 68,75 % | 84,00 % |
| All | % Empty | 0,25 % | 5,75 % | 11,25 % |
| Researchers | % Yes | 66,00 % | 23,50 % | 5,00 % |
| Subject Matter Experts | % Yes | 57,75 % | 27,50 % | 4,50 % |

We have conducted an experiment where we contacted decision makers for additional rationale. This experiment is not intended to enrich the current dataset with rationale data, but to see if the commit meta-data can be used to contact relevant peers for decision rationale.

8.6.2 Analysis of Rationale from Commit Messages

In order to identify whether commit messages and the information about removed and added components are good indicators of design decisions, we have presented 100 different commits to six subject matter experts. In order to get these commits, we randomly picked 100 commits from the Gitminer database that had commit messages of more than 30 characters (and therefore had a reasonable chance of containing rationale). We distributed the commits among our subject matter experts. Half of the experts got the first 50 of these commits, the other half got the last 50. Two researchers (one contributing to this chapter and one external) judged all 100 commits. The participants were experienced Ruby software developers, experienced software architects, and researchers with a software engineering background. We asked them to answer, per commit, the following questions:

• Does this commit involve a design decision?

• Does this fragment contain rationale for a decision?

• Does this fragment give relevant information about alternatives for a decision?

The core results of the validation are presented in Table 8.9. As shown in this table, according to our experts more than half (61,75%) of the selected messages contained decisions. It is interesting to note that there was a significant difference in the recognition of decisions between the researchers and the experts for the same data set. The researchers found decisions in 66% of the commits, and the experts in 57,75% of the commits. The existence of rationale in the data was discovered in more than a quarter of the messages (25,5%). So, about 56% of the identified architectural design decisions lacked any rationale. The subject matter experts discovered rationale slightly more often than the researchers (27,5% vs. 23,5%). The alternatives were much harder to find. Alternatives were only found in 4,75% of the commits. The distribution between researchers and experts was almost the same.

The participants were given the opportunity to describe their experiences. One software engineer was surprised about the succinctness of the commit messages, and the lack of information in them. His experience in company projects was that commit messages were used much more to communicate decisions. He was able to easily show examples of this to the researchers from a company repository. The lack of good information in all the commits could be caused by our selection of open source projects. One architect was really enthusiastic about the possibility to contact others that have made the same decision (through the author information provided with the commit).

During our analysis of the data collected with the Gitminer tool, we found qualitative results in addition to the quantitative results. We identified different aspects related to design decisions, which we used as expert validation:

• There were commit messages that indicated changes of components and rationale about them. E.g. "Bundler and Jeweler not playing well. Removing Jeweler" (jeweler), or "use mysql2 instead of mysql because of shit encoding" (mysql).

• Commit messages where a decision is made, but the rationale was clearly missing: "Changed to jeweler2" (jeweler), or "remove thin" (thin)

• Some commit messages indicated cleanup: "Don't need json gem dependency." (json), or "Do not depend on rack directly" (rack).

• Several messages described configuration issues: "Make compatible with ruby 1.9" (ruby-debug), or "Unfortunately, we can't put ruby-debug in the gemfile because it breaks 1.9.2 compatibility. Just put it back in locally when you want to use it, or figure out how to do a switch by ruby verison in the Gemfile" (ruby-debug)

• Some commits were done because of certain non-functional requirements: "rcov/coverage makes the specs take a) 2x as long to boot and b) slows down actual specs by about 25%" (rcov), or "Use 1.9.3-p194; replace rcov with simplecov. Future commits will turn simplecov on in all situations. (According to its documentation, simplecov is pretty fast.)" (rcov)

8.6.3 Acquiring Tacit Knowledge from Decision-Makers

Besides the rationale found in the commit messages, we were interested to see if decision-makers were willing to share additional tacit knowledge. For this, we used semi-automated emails to contact the authors of commits. Semi-automated means that we created templates that were automatically filled in, based on the commit content, and sent to the authors of the commits. The emails were sent to the authors of a random set of commits from the dataset. This enabled us to see if the email addresses were usable (and real), and to see if the authors are willing to provide rationale for the decision. We constrained the data set in the following way:


• To make sure we use diverse decisions, we have randomly selected the decisions from our database.

• We have taken decisions that were involved in at least two different commits (N>=2) to avoid coincidental component replacements as much as possible.

• The emails were sent to unique persons, so no author received more than one email, to avoid being reported as spam.

Two different templates were used to base our emails on (see Appendix A). In one of the emails we explicitly stated that we were conducting research and needed some help; in the other email we asked the author about the rationale for the decision we found. Two versions were used because we did not know how many reactions we would get. If the reply rate had been low, we could have sent more emails of the most successful version. We took the decisions that occurred most often in our data set, and prepared emails for them. We sent out a total of 100 emails to different authors, 50 of each version. The number 100 was based on a balance between several considerations. On the one hand, a larger number would give a better statistical result. On the other hand, as the intention of this experiment was not to create a large database of replies, it seems unethical to let a very large number of people write a serious answer without having a serious question. So, it was decided to send out 100 emails in order to validate the questions about the quality of the author data.
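The semi-automated preparation of emails can be sketched as follows. The template wording, field names and sample decisions are hypothetical (the templates actually used are in Appendix A); the sketch only illustrates filling a template from commit data and ensuring that each author receives at most one email.

```python
from string import Template

# Hypothetical template text, not the wording used in the experiment.
HELP_TEMPLATE = Template(
    "Hi $name,\n\n"
    "In commit $commit_id of $project you replaced $old with $new. "
    "I am facing the same choice; could you tell me why you switched?\n")


def prepare_emails(decisions, template):
    """Fill the template per decision, at most one email per author address."""
    seen, emails = set(), []
    for decision in decisions:
        address = decision["email"].lower()
        if address in seen:          # never contact the same author twice
            continue
        seen.add(address)
        emails.append((address, template.substitute(decision)))
    return emails


# Hypothetical decisions taken from commit meta-data.
decisions = [
    {"name": "Ann", "email": "ann@example.com", "commit_id": "abc1234",
     "project": "demo", "old": "rr", "new": "mocha"},
    {"name": "Ann", "email": "Ann@example.com", "commit_id": "def5678",
     "project": "demo2", "old": "mysql", "new": "mysql2"},
    {"name": "Bob", "email": "bob@example.com", "commit_id": "0a1b2c3",
     "project": "shop", "old": "rcov", "new": "simplecov"},
]

emails = prepare_emails(decisions, HELP_TEMPLATE)
print(len(emails), "emails prepared")
```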

8.6.4 Results

As shown in the previous section, roughly 60% of the commits on Gemfiles were considered as concerning a design decision. For our whole dataset, this would mean that 60% of the 7527 commit messages contain decisions (approximately 4500). Of course, the other commit messages (with < 30 characters) could also contain decision information, so this number could very well be higher. Calculated in the same way, about 1900 commit messages contain rationale about the decisions made. When relating this to the number of projects, on average every open source project we used contained 6 decisions in commits and 3 commit messages with relevant rationale.

However, this data is much more useful when looking at all the projects in the Gitminer system. When applying a threshold for the number of projects involved (say, 10 projects), the chance of finding relevant decisions or rationale is significant. Architects can use this information to make much better-informed decisions, because they are based on more projects that made a similar decision, thus having more statistical relevance.

From the 100 emails we sent out, 10 turned out to have invalid email addresses. Of the remaining 90 emails, 32 received a reply (response rate 35,6%, N=90). The vast majority of these replies (two thirds, N=32) came in within 24 hours. Other studies that contacted GitHub users by email to fill in surveys got lower response rates, even though they sent the emails to active users: 23%, N=1.160 and 14,1%, N=10.000 in the work of Singer et al. [167] and 19%, N=4.500 for the research conducted by Vasilescu et al. [182]. Compared to these studies, the response rate to our questions was significantly higher (Fisher's exact test for binomial distribution: p=3e-2, p=5e-6, p=2e-3, respectively). The researchers of this manuscript judged the replies by assessing whether they contained rationale on the decision, in order to be able to say something about the usefulness of the replies. About 31% (N=90) of the actually delivered emails got an answer containing rationale of the decision. Table 8.10 summarizes the results of the email experiment.

TABLE 8.10: Results of Email Experiment

| | Type 1 (Please help me) | Type 2 (Research) | Total |
|---|---|---|---|
| # total emails sent | 50 | 50 | 100 |
| # delivered emails | 43 | 47 | 90 |
| % delivered emails | 86% | 94% | 90% |
| # replied emails | 21 | 11 | 32 |
| % replied emails | 42% | 22% | 32% |
| % replied emails of the delivered | 49% | 23% | 36% |
| # replied emails with rationale | 20 | 8 | 28 |
| % replied emails with rationale | 40% | 16% | 28% |
| % replied emails with rationale of delivered | 47% | 17% | 31% |
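The percentages in Table 8.10 follow directly from the counts. The short sketch below recomputes the rates relative to the delivered emails; only the counts are taken from the table, and the variable and function names are hypothetical.

```python
# Counts from Table 8.10, per email type.
SENT = {"help": 50, "research": 50}
DELIVERED = {"help": 43, "research": 47}
REPLIED = {"help": 21, "research": 11}
WITH_RATIONALE = {"help": 20, "research": 8}


def pct(part, whole):
    """Percentage rounded to whole numbers, as reported in the table."""
    return round(100 * part / whole)


reply_rate = {kind: pct(REPLIED[kind], DELIVERED[kind]) for kind in SENT}
rationale_rate = {kind: pct(WITH_RATIONALE[kind], DELIVERED[kind]) for kind in SENT}
total_reply_rate = pct(sum(REPLIED.values()), sum(DELIVERED.values()))
total_rationale_rate = pct(sum(WITH_RATIONALE.values()), sum(DELIVERED.values()))

print(reply_rate, rationale_rate, total_reply_rate, total_rationale_rate)
```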

There was a difference between the reactions to the two types of questions. When asking the subject for help, we got far more replies containing rationale than when asking for assistance in scientific research (40%, n=50 vs. 16%, n=50). This means that in the real situation of a decision-maker needing rationale, 2 out of 5 emails can be expected to yield a reply with rationale. We can imagine this percentage will be even higher if the email contains more details about the actual problem at hand (something we could not do automatically). This clear distinction in reply rate implies that, if the right questions are asked, one has a reasonable chance of getting rationale data from decision makers. Researchers should be aware that mentioning research could influence the results significantly.

As a conclusion, the question described in the beginning of this section can be answered as follows: RQ3: Do the identified CSDs provide sufficient information to base the decision process on? 10% of the email addresses turned out to be incorrect. So, the vast majority of the email addresses can be used to contact decision makers. The decision makers replied to 32% of the sent emails, and most of them replied within 24 hours. So, the decision makers do want to reply, but it might be necessary to send a question to multiple commit authors in order to get a reply. The phrasing of the question influences the reply rate (40%, N=50 when asking for help vs. 16%, N=50 for the research question). Most of the answers contained relevant rationale about the decision at hand. Even though the numbers are small (28%, N=100), the results are promising: accessing rationale in this way is indeed plausible.


8.7 Discussion

8.7.1 Results

In the previous sections, we have shown that for many programming languages, it is possible to create a decision support system that assists decision makers in making their CSDs. We have seen that it is possible to scale the approach to increase the dataset as well as the accuracy of the data. Even though the work described in this chapter is based on a proof-of-concept for one programming language, the approach is promising for the software development industry. Being able to make better-founded decisions leads to more successful projects. Also, we have seen that the software engineering community is very helpful, providing rationale on the decisions in roughly one third of the enquiries for rationale. People are willing to help others, independent of what their own benefit could be.


As a threat to external validity, the selection of the projects is important. For this research, we focused on decisions concerning component selection. These decisions have a high impact on the system, while they are made constantly during development and maintenance of systems [186]. We then narrowed our focus to open source components specifically, because of the availability of these components in open source projects. Last, during our proof of concept, we restricted the scope to Ruby components. We chose Ruby projects because Ruby is used extensively in both the open source and the industrial world, making it well suited for real-world research. As using company source code often involves legal issues, we focused on open source projects. To validate that similar patterns occur in industry, our tooling was also run on a repository of the company where one of the researchers was working at the time. This showed some component replacements that the architect of the company immediately recognized as having been a subject of debate in the past. As this was only one project, no usable data was extracted, but the identification of decisions was confirmed.

Because cloning and mining for CSDs takes (processing) time, a limited set of open source projects was processed. To confirm the validity of our approach, we added an extra set of projects to see if our results are repeatable and improve with more data. We have shown that by extending the dataset, the precision as well as the number of potential alternatives increased. However, in order to be usable on a large scale, it would be necessary to have a system that is updated regularly with new projects and with new commits to existing projects. This is out of scope for the current research.

Concerning construct validity, several concerns can be pointed out. First, the research is based on a limited selection of projects. We selected the projects on statistical properties, not on specific knowledge about the projects, to make sure the projects were diverse and representative. We tried to make the dataset as representative as possible for generalizing to Ruby projects by selecting all active projects at a specific point in time. However, it is possible that due to the time of cloning, this dataset was accidentally biased (e.g. due to holidays there was more or less activity in certain types of projects), but we have not found any patterns in this direction.

The data on GitHub can be used for empirical research, but one has to be cautious in many respects, as the data is not always what it seems, as Kalliamvakou et al. point out [105]. We minimized these threats in the following way. We selected projects that had some community (> 1 watcher and > 1 fork) and that changed at least once in the month before we extracted the data. We did not look at the pull requests, but at the sequence of commits that were accepted in the project history. We did not analyze if the users were real persons, but judging from the answers to our emails, many commits were actually made by real humans.
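The selection criteria above can be expressed as a simple filter. The following is a minimal sketch, not the tooling used in this chapter: the `Project` record, the `is_candidate` function and the 31-day activity window are hypothetical choices made for illustration, and the dates are made up.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Project:
    """Hypothetical summary of repository metadata."""
    name: str
    watchers: int
    forks: int
    last_change: date


def is_candidate(project, extraction_date, activity_window_days=31):
    """Apply the criteria from the text: some community and recent activity.

    The window length of 31 days is an assumption standing in for
    "the month before we extracted the data".
    """
    has_community = project.watchers > 1 and project.forks > 1
    age = extraction_date - project.last_change
    recently_changed = timedelta(0) <= age <= timedelta(days=activity_window_days)
    return has_community and recently_changed


# Made-up example data, for illustration only.
extraction_date = date(2016, 1, 31)
projects = [
    Project("active_with_community", watchers=5, forks=3, last_change=date(2016, 1, 15)),
    Project("no_community", watchers=1, forks=0, last_change=date(2016, 1, 20)),
    Project("stale", watchers=8, forks=4, last_change=date(2015, 10, 1)),
]
selected = [p.name for p in projects if is_candidate(p, extraction_date)]
print(selected)
```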

When we contacted architects, we only emailed people once, even though they might be responsible for multiple CSDs. The selection of the people we emailed was done at random, based on the decisions we mined. One threat to our research is that we based our emails on old data (the initial data set), which was also mentioned in the replies by some of the decision-makers. However, we did get enough replies to validate our assumption that authors are willing to provide tacit knowledge. One of the major drawbacks of this approach is that the rationale from the decision maker is not part of the actual mined data. Acquiring this data will cost time for the person that needs the rationale. However, if this data is not available elsewhere, obtaining it is still better than not having the data, even though it might take some time. Some rationale is available instantly in the form of the previous commits and the commit messages accompanying these.

As a threat to internal validity, the following can be pointed out. Our definition of decisions based on adding and removing components has some constraints. First, we do not know if we found all CSDs that were actually made in the projects, because they might be reflected in different (sequential) commits. Additionally, German et al. [70] describe that developers sometimes change the history of git repositories, which can cause missing decisions. We acknowledge this, but to reach our research goals it is sufficient to work with the decisions that we actually identified. As future work, one could look at these changes in sets of commits (e.g. pull requests or commits within a certain time-span). Second, there is no guarantee that the commits we found are CSDs; they may instead reflect a coincidental addition and removal of components. However, this would imply that the commit contained different (unrelated) functionality, which is considered bad practice in software development. Moreover, to minimize this effect we counted how often a replacement occurred. The chance that unrelated replacements happen often is very small, so these changes will have a lower relative occurrence score when the number of projects increases. In our previous work with subject matter experts, we found that 62% of the identified commits concerned actual decisions [186].
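The definition discussed above, a commit that both removes and adds a component, together with the counting of replacements across projects, can be sketched as follows. This is an illustrative simplification and not the actual mining tooling: the regular expression only recognizes plain `gem '<name>'` lines, every removed component is paired with every added one, and the component names `comp_a`, `comp_b`, `web` and `test_lib` are made up.

```python
import re
from collections import Counter
from itertools import product

# Simplified matcher for lines such as: gem 'name' or gem "name", '~> 1.0'
GEM_LINE = re.compile(r"""^\s*gem\s+['"]([^'"]+)['"]""")


def components(gemfile_text):
    """Return the set of component names declared in a Gemfile text."""
    names = set()
    for line in gemfile_text.splitlines():
        match = GEM_LINE.match(line)
        if match:
            names.add(match.group(1))
    return names


def replacements(before, after):
    """Candidate replacements in one commit: every removed component
    paired with every added component (an assumption of this sketch)."""
    old, new = components(before), components(after)
    removed, added = sorted(old - new), sorted(new - old)
    return list(product(removed, added))


def count_replacements(histories):
    """Count candidate replacements over consecutive Gemfile versions
    of several projects; each history is a list of Gemfile texts."""
    counts = Counter()
    for versions in histories:
        for before, after in zip(versions, versions[1:]):
            counts.update(replacements(before, after))
    return counts


# Made-up histories of two projects.
project_one = [
    "gem 'web'\ngem 'comp_a'\n",
    "gem 'web'\ngem 'comp_b'\n",
]
project_two = [
    'gem "comp_a", "~> 1.0"\n',
    'gem "comp_b"\n',
    'gem "comp_b"\ngem "test_lib"\n',  # addition only: not a replacement
]
counts = count_replacements([project_one, project_two])
print(counts.most_common())
```

Pairs that occur in many projects are more likely to be deliberate decisions than coincidental co-changes, which is the intuition behind the occurrence score mentioned above.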

8.7.3 Related Work

In architecture design decision research, hierarchical structures are used to model architectural knowledge [23] or design decisions [100] [189]. This research often emphasizes the recording of decisions, and the extraction of made decisions later in the development process. There is a growing base of evidence that explicitly managing architecture decisions is effective [162] [111]. Traditionally, documenting software architectures [42], as well as documentation templates [117] and computational modeling [146] have been extensively used and researched. Due to the sometimes-poor quality of the documentation, Lopez et al. [133] present an ontology-based mining solution to structure and extract relevant decision data from a set of components. Van Heesch et al. [85] describe a tool that assists the architect in the decision and documentation process by providing different viewpoints on the architecture. Another research tool proposed to assist the architect is the Decision Buddy [69]. One of the fundamentals of this tool is the Solution Repository, where known solutions to decision problems are stored. This repository could very well be seeded with relevant component replacement data from our research, so the decision maker has access to the data at the right moment. Soliman and Riebisch [169] describe the reuse of decision data from a different angle, with a focus on the sharing of architecture knowledge. A topic that is being discussed heavily is the role of the architect [62] [184] and the role of 'the architecture document' in the design process [184]. Often the architect is responsible for creating and maintaining the architecture documentation. However, the decision-maker is not supported in making these decisions based on statistical data.

Van Vliet and Tang [191] describe the rationality of the decision-making process. They show that this process is not always as rational as commonly assumed, and use the term bounded rationality to describe decision making based on a finite set of options based on limited information. Decision makers can benefit from our research by basing their decisions on an extended dataset that is based on historical usage data of components.

The increasing amount of available data in online repositories is getting more attention in research. Kagdi et al. [103] provide a very extensive literature survey with a supporting taxonomy for mining software repositories. We have used this taxonomy to classify our research in Section 3. The popularity of the Working Conference on Mining Software Repositories [141] and the recent attention in the Empirical Software Engineering journal [151] are good examples of this growing interest. The work of Le et al. [125] assesses architectural changes from software repositories based on calculations of changes that are also seeded by the addition and removal of elements in the projects. For this, they also use different versions of open source projects as the source. Similarly, Kouroshfar et al. [110] use calculations on the model changes to identify co-changes in Module views to mine for architectural elements. However, the goal of these studies was not to assist architects in making decisions, but to identify what types of decisions are made in these projects. Voinea and Telea developed a generic framework for mining of software repositories [192]. Our work could benefit from this research if we want to extend it to larger sets of repositories.

Commercial-Off-The-Shelf or Open Source Software components are typical assets for software reuse [139]. For the discovery and selection of components, Ayala et al. [11] describe the high dependence on experience, either personal or from others. Most research investigates component selection by interviewing decision-makers [130] [82], instead of basing it on statistical data from open source projects. Li et al. [130] describe how the research community often has a different perspective on component selection than actual practitioners. We address this gap by providing decision support based on the actual decisions made, which can be used by practitioners instantly.

McMillan et al. [137] focus on finding relevant feature sets and open source components for rapid prototyping. They base their data on open source repositories, both textual and source-code. However, as they focus on identifying both features and modules, a lot of manual work (interpretation) is needed to create the dataset. Also, only the currently used components are assessed, in contrast to our approach, which is based on changes in the history of projects. Moura et al. [140] describe a combined mining and manual analysis approach for locating energy efficiency in source code. They also focus on the commit as an element for research; however, they do not process the number of commits we do, as their approach needs manual inspection. Dependencies between components have also been studied from a data-mining perspective. The work of Blincoe et al. [19] describes how dependencies between projects can be mined, based on the commit messages containing references to other projects. Using data mining to prevent incorrect configurations of components has been studied by several authors [1] [41]. Our work is not primarily aimed at discovering (incorrect) dependencies, but processes the historic metadata of projects to discover alternative decisions, using the commits as supporting rationale.

On the web, there are several initiatives that provide statistical data about projects. For example, there are tools that help developers in increasing code quality by providing statistics about the code [20] or that provide statistical information about how often a component is downloaded [159]. Git can be used to process historical information [72] and visualizations of the history [71]. However, to the best of our knowledge, there is no research or practical solution that actually extracts decisions from the version history of (open source) software projects. Dagenais and Robillard [48] investigated open source development for finding decisions. In this research, surveys and documentation were used as data sources to locate decisions, instead of the version history as in our research. Ding et al. [51] investigated mailing lists and related them to the architectural decisions made in open source projects. While not specifically focusing on CSDs, this research did identify architectural decisions based on available open source data. Another open source mining initiative involves searching relevant open source Java frameworks [176]. This research focuses on code fragments instead of CSDs, and only the usage of the code is analyzed, not the evolution of the code. We built our results on the history and accompanying structure of the system.

8.7.4 Future work

Data-driven analysis of architectural decisions based on open source data is a new way of conducting software architecture research. In this work, we have provided a first step to show the feasibility of this approach and provide the first evidence that this approach helps decision-makers. In order to be able to use the data of previously made decisions for CSDs in large settings, several things need to be done. First, the data set needs to be extended, and the data needs to be made publicly available. Second, the data needs to be kept up-to-date automatically when projects evolve (and new decisions are committed). Then, experiences with this data set in industrial settings need to be obtained. This could be done by extending the tooling described in this chapter or by enriching existing tools like Decision Buddy [69] or other tools for reusing decisions [87]. Based on these experiences, further empirical data about the usage of this approach can be acquired.

In order to provide even more precise data, additional reporting tools should be developed. For example, being able to see cohorts [47] of component changes over time would help to see trends that are occurring now, instead of counting what happened in the whole history of all the projects. Another very promising approach is to automatically classify the rationale (commit messages), so the rationale can be found independently of the specific decision.
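The cohort idea can be sketched by grouping mined replacements by the year of the commit, so that recent trends are not drowned out by older history. This is a hypothetical illustration and not an existing reporting tool: the function name `cohort_counts`, the yearly granularity and all example data are assumptions.

```python
from collections import Counter, defaultdict
from datetime import date


def cohort_counts(events):
    """Group replacement events into yearly cohorts.

    events: iterable of (commit_date, old_component, new_component).
    Returns a dict mapping year to {(old, new): count}, ordered by year.
    """
    cohorts = defaultdict(Counter)
    for commit_date, old, new in events:
        cohorts[commit_date.year][(old, new)] += 1
    return {year: dict(counter) for year, counter in sorted(cohorts.items())}


# Made-up replacement events with hypothetical component names.
events = [
    (date(2012, 3, 1), "comp_b", "comp_a"),
    (date(2014, 5, 9), "comp_a", "comp_b"),
    (date(2014, 11, 20), "comp_a", "comp_b"),
    (date(2015, 2, 2), "comp_a", "comp_c"),
]
for year, counts in cohort_counts(events).items():
    print(year, counts)
```

In this made-up data, a count over the whole history would hide that the early move towards `comp_a` was later reversed, which is the kind of trend a cohort view is meant to expose.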

As an alternative direction for future work, the data can be enriched. For example, decisions that occur over multiple commits, but happen within a single pull request or time-span, could be mined. Also, additional lines in the Gemfile that are now ignored could provide additional information on the relationships between components (e.g. the 'require' keyword). In addition to this, it would be interesting to see if changes in the structure of the system could help in finding other kinds of architectural decisions, like changes in architectural styles or patterns.
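Mining decisions that span multiple commits could, for instance, compare the component set before and after a burst of commits instead of before and after each single commit. The sketch below shows one possible way to do this and is not an implemented part of the tooling: the 24-hour gap, the function names and the example data are all assumptions.

```python
from datetime import datetime, timedelta


def group_by_time_span(commits, max_gap=timedelta(hours=24)):
    """Split chronologically ordered commits into groups in which
    consecutive commits are at most max_gap apart.

    commits: list of (timestamp, set_of_components_after_commit).
    """
    groups = []
    for commit in commits:
        if groups and commit[0] - groups[-1][-1][0] <= max_gap:
            groups[-1].append(commit)
        else:
            groups.append([commit])
    return groups


def windowed_changes(initial_components, commits, max_gap=timedelta(hours=24)):
    """Return (removed, added) pairs per commit group, keeping only
    groups in which components were both removed and added."""
    changes = []
    before = set(initial_components)
    for group in group_by_time_span(commits, max_gap):
        after = set(group[-1][1])
        removed, added = before - after, after - before
        if removed and added:
            changes.append((removed, added))
        before = after
    return changes


# Made-up history: comp_a is removed in one commit and comp_b is added
# an hour later, so no single commit both removes and adds a component.
start = datetime(2015, 6, 1, 9, 0)
history = [
    (start, {"web"}),
    (start + timedelta(hours=1), {"web", "comp_b"}),
    (start + timedelta(days=10), {"web", "comp_b", "test_lib"}),
]
print(windowed_changes({"web", "comp_a"}, history))
```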

The use of this kind of data could also be extended to domains other than decision-making. The data can help to pro-actively advise projects about potential component changes that happen often in similar projects. Or, when starting a new project, the default stack (including components) for a specific language could be provided automatically, based on the actual use in real-world projects. The data could also be used to manage the versions of components, assisting the decisions made for release management [89] of component-based software systems.

8.8 Conclusions

Strategic reuse of (external) components is extremely important for the success of software projects, as they increase in size and complexity. Preferably, decisions for these components are based on experience of others, either data-driven or qualitatively by discussion with experienced decision-makers that faced similar problems. However, this data is nearly impossible to acquire. Therefore, decision-makers are forced to make decisions based on incomplete data, making the decision process more expensive and error-prone.

In our research, we have shown that the version history of real-world open source projects contains valuable knowledge about the CSDs. We have shown that a data-driven methodology is applicable for most popular programming languages. The data about component usage provides insight into the made decisions and the rejected alternatives. Additionally, we have shown that tacit knowledge from architects is accessible by contacting them based on the acquired metadata. These decision-makers provided rationale for their decisions in roughly one third of the emails. The results are a first step to enable architects to make decisions based on real-world data.


Decision-makers that face a CSD benefit from this research in three ways. First, the data from real-world open source projects provides a wider range of options when discovering possible alternatives. Second, the data of changed components helps to evaluate alternatives. Last, decision-makers are willing to help with rationale for component selection problems, even when contacted in a semi-automated way. With these benefits, a wider knowledge base of real-world projects is opened up to base CSDs on. By using this knowledge, decision-makers are supported in making better decisions based on real-world data.

Acknowledgements

We would like to thank everyone that responded to our emails. We would also like to thank the people of Factlink and Crop-R, who helped us several times during the development of our theory and tooling, and Mircea Lungu for his help in shaping this manuscript.


Appendix A: Email Texts

Email Text 1

Dear <author_name>,

While looking at Github project <project_name>, I came accross your commit (<commit_url>). I was wondering if you could help me out with the following. I see that you have replaced "<Comp A>" for "<Comp B>". For my current project, I am also using "<Comp A>", and I was thinking about moving to "<Comp B>". Could you tell me what made you change this, and if you have any regrets on the change?

Thanks very much in advance for your help!

Kind regards,
Jan Salvador van der Ven
http://jansalvador.nl

Email Text 2

Dear <author_name>,

I am currently working on my PhD research on architectural design decisions. I am investigating if it is possible to find decisions based on commits on open source projects.

In my search, I found your commit (<commit_url>) in project <project_url>. I was wondering if your could help me out by answering the following questions:

- Would you think this commit involved an architectural decision?
- If so, could you tell me if you have evaluated alternatives?
- Do you remember what the reasons were for choosing this alternative?

You would help me very much if you could answer these questions. Thanks very much in advance for your cooperation!

Kind regards,
Jan Salvador van der Ven
http://jansalvador.nl


Appendix B: Dependencies for Database Components


Chapter 9

Conclusions

“The limits of my language means the limits of my world.”

- Ludwig Wittgenstein

9.1 Research Questions

This thesis consists of work on architectural knowledge vaporisation. We have addressed three different themes around this topic: artifacts, process and reuse. We started the research with exploring scenarios that involved architectural design decisions. The first research question we addressed was:


This research question was addressed in Chapter 2, where the industrial needs for architectural decisions were assessed. The data was extracted from interviews with 14 employees from four different companies. We created an overview of the relevant stakeholders and identified 27 distinct use cases where architectural knowledge was needed. Based on this work, we can conclude the following.

First, most of the industry needs for architectural decisions come from architects or architecture reviewers. Others do not seem to be very interested in or aware of architectural knowledge. Second, in 12 of the 27 identified use cases, the architectural decision was explicitly mentioned. Other use cases involved the consequences of the decisions (e.g. UC5. Check Correctness), or process-related issues (e.g. UC4. Perform a review for a specific concern). Architectural decisions can therefore be seen as the most important aspect of architectural knowledge.

The use cases were described as interactions on a fictitious Knowledge Grid. In this thesis, we have shown that the functionality of such a Knowledge Grid can be implemented in real tooling. One tool was developed as a proof of concept to address the second research question:

RQ2: How can tacit knowledge about architectural decisions be preserved for later use?

In Chapter 3, we related the processes from rationale management to that of software architecture design, creating a connection of rationale to (software) artifacts. We discovered that tacit knowledge from rationale management was missing in most software architecture artifacts (the Choice + Rationale and the Alternatives). We introduced a model for architectural decisions that included this under-represented tacit knowledge. This model connects the rationale of the decision to software artifacts like architectural documentation or source code. We showed how to manage alternatives in the code itself to demonstrate the preservation of tacit knowledge in currently-used artifacts.

The developed model was used in the work of Chapter 4, where we showed that it is possible to assist architects and reviewers in preserving tacit knowledge on architectural decisions. The developed tool enabled architects and reviewers to enrich the documentation they were using with formal architectural knowledge, in this case architectural decisions. This work showed that reviewers get a better understanding of the architecture when using this tool. In the next research question, the focus was on the decision process:

RQ3: How does the architecture decision process influence the decision results?

With the introduction of the Triple-A model in Chapter 5, we were able to see characteristics from decisions in the decision process. We showed that it is possible to characterize teams by who makes the decision, how the decision is documented and what the periodicity of the decision is. In the industrial cases from this chapter, we discovered interesting relationships between the factors from the Triple-A framework and the success of the decision process. Concerning the documentation, it is important that documents are kept short and to the point (The illusion of documentation and complexity instead of simplicity). Concerning the decision-maker, hands-on experience is essential to make up-to-date decisions, while the architect should never be the single point of failure. Last, architecting should be a continuous process to cope with changing needs.

In the work presented in Chapter 6, we conducted a survey to extract the most prominent factors that influence the architecture decision process. We challenged some beliefs from the architecture as well as the agile community. On the time axis, decisions do not become better (in terms of RoI or development speed) when more time is reserved to make them, and this factor does not influence the quality of the product significantly. On the who-axis, we refuted the belief that ivory-tower architects create worse architectures. There was no indication that the role of the person was relevant for the result of the project. The only relevant indicator was the amount of development and architecting experience; more experience resulted in higher quality products. Concerning documentation, using less documentation speeds up the development process, while the quality of the process does not seem to be affected.

In Chapter 7, the success factors of the lean startup movement have been studied to see what can be learned for the architectural decision process. This research is based on interviews with architects and founders at startups, to explore how they made decisions. We showed that the decision processes of business and software architecture are aligning. There is a trend towards making decisions as short-running experiments. As unknowns increase, it is necessary to conduct experiments on your assumptions as quickly as possible. There is no strict line between business and architecture in this. The most important aspect that can help in making better decisions is the availability of data. This is exactly the subject of the last research question.

RQ4: How can architecture decision makers reuse decision data?


The need for data to base decisions on is explored in Chapter 8. For this work, we have focused on a specific architectural decision: the decision to use or change a component in a software system. We showed that it is possible to mine data on these decisions from the history of open source Ruby projects. In addition, we have closed the gap to the tacit knowledge behind the decisions. We have seen that decision-makers can access this knowledge by contacting the authors of the commits. The authors responded with help for the decision-makers. The data on the occurrence of decisions in the past makes it possible to reuse decisions, which enables a faster decision process.

9.2 Contributions

In this thesis, we have shown that knowledge on architectural decisions can be found and used in all forms: tacit, documented and formalized. We described how to model tacit knowledge and add it to architectural artifacts. In addition, we have shown that formal data on architectural decisions can be mined from open source projects, and that it is possible to access the underlying tacit knowledge by contacting the decision-makers.

Table 9.1 summarizes the contribution of this thesis per chapter. The first column gives the chapter number, and the second column describes the main problem addressed. As described in the introduction of this thesis, the third column describes what IT artifacts [86] were used to address the problem, and the fourth column describes the type of artifact. The last column describes the evaluation of the artifacts.

As a first contribution of this thesis to the research community, we presented a metamodel for architectural decisions, which can be used when formalizing architectural decisions in tooling. We showed the feasibility of the metamodel by describing how we based our tools on it. With these tools, it is possible to experiment with design decisions; they can be preserved during the documentation process or recovered later on based on version history. We showed that the developed tools that were based on this metamodel can provide data for reusing decisions.

As a second contribution, we showed that there are no easy parameters to predict the success of architectural decisions. We showed that many beliefs on making architectural decisions do not hold. The only relevant indicator that predicts slightly better decisions is the experience of the decision-makers. We showed the similarities between architecture design and new product development, where decisions are based on data that is acquired from short-running experiments. The contribution of this thesis is that the experiment-based decision process can be adopted in architecture design to make better decisions, or fail more rapidly.

In addition, we have introduced a way of conducting experimental research based on practices from new product development: the A/B experiment. We have used this method to see if there were differences in responses to our emails when we phrased the email differently. This method is widely used in industry to measure the effectiveness of software, but to the knowledge of the author of this thesis, this methodology is not yet used in software architecture research. Therefore, an additional contribution of this thesis is that A/B testing can be used to generate data for research.


TABLE 9.1: Contributions per chapter

| Ch | Problem | IT Artifact | Type of Artifact | Evaluation |
|----|---------|-------------|------------------|------------|
| 2 | How are architectural decisions used? | A Use Case model | Construct | Industrial assessment |
| 3 | How can architectural decisions be better preserved and understood? | A model for Design Decisions and rationale | Model and Instantiation | Example implementation |
| 4 | How can architectural decisions be integrated with the documentation and implementation? | A tool that assists in managing architectural decisions | Method and Instantiation | Quasi-controlled experiment |
| 5 | How to predict the efficiency of an architectural decision? | A framework for classifying architectural decisions | Model | Industrial cases |
| 6 | Can we make better decisions by changing the decision process? | Guidelines for decision making | Method | Survey |
| 7 | How can we make architectural decisions faster? | Guideline for the decision process | Method | Interviews |
| 8 | Can we reuse architectural decisions? | Implementation for mining and exploring architectural decisions | Instantiation | Expert review and email experiment |


9.3 Future Work

In this thesis, we have seen a shift from opinion- and rationale-based decision-making to data-driven experimental decision-making. The term VUCA (Volatile, Uncertain, Complex and Ambiguous)¹ is often used for the current fast-changing world. In this world, there are many unknowns, and assumptions and ideas need to be validated very rapidly, based on data. We believe the work in this thesis is a first step in this direction for software architecture, where we showed how decision support can be constructed based on available data. As the amount of available data is increasing, we expect that assistance for decisions based on this kind of data will increase, in research as well as in industry. Specifically, Chapter 8 focused on component selection decisions. It would be interesting future work to see if other types of architectural decisions can be mined from available data. In addition, extending the current work to other programming languages would be a logical choice.

A challenging tension remains after this thesis concerning software architecture research. On the one hand, keeping and maintaining architectural decisions is important for project success (especially in the long run). On the other hand, we see that the adoption of tools that manage this is very low in industry. The main reason given is the effort needed to use the tooling [38]. In addition, the speed at which decisions must be made is increasing. So, the challenge remains to create non-intrusive tooling that extracts relevant architectural knowledge fast, in order to address knowledge vaporisation.

¹ https://en.wikipedia.org/wiki/Volatility,_uncertainty,_complexity_and_ambiguity


List of Figures

1.1 An architectural diagram used in the BIOSCOPE project
1.2 Domain model for Architectural Design Decisions
1.3 Research Questions

2.1 Use case diagram
2.2 Use case 17
2.3 Use case 20
2.4 Use case 5
2.5 Use case 19

3.1 An abstract view on the software architecture design process
3.2 An abstract view on the rationale management process
3.3 Similarities between software architecture design process and the rationale management process
3.4 The architecture of a CD player with extended functionality
3.5 The result of the design decisions of Figure 3.4
3.6 The Archium design decision model
3.7 The Updater design decision in Archium

4.1 Overview of the paper
4.2 The basic AK model
4.3 The Knowledge Architect tool suite
4.4 The Knowledge Architect Word plug-in button bar
4.5 The Knowledge Explorer
4.6 Overview of the approach and its validation
4.7 A domain model for AK in documentation
4.8 A software architecture document with colored KEs and pop-up menu for tracing the relationships of a KE
4.9 Incompleteness information of a KE
4.10 Average number of comments of the reviewers per situation
4.11 Average quality of comments of the reviewers per situation

5.1 The Triple-A Framework

6.1 Project and Architect Properties

7.1 Conceptual Framework for Decision-based New Product Development

8.1 Publicly available Component Data for the Rest Ruby Gem
8.2 The Envisioned System: Decision Support and Data Mining
8.3 Relationships between Decisions
8.4 Project History concerning Component Change
8.5 The Design Decision Extraction Process
8.6 Component Replacements
8.7 A Fragment of a Git Log Output Example


List of Tables

1.1 Historical Change in Software Engineering and Architecture
1.2 Structure of this Thesis

4.1 Experimental design: #subjects per situation
4.2 Ratings of reviewer comments of the first hour
4.3 Confidence levels for H1 and H02
4.4 Confidence levels for H3 and H4

5.1 Overview of Case Studies
5.2 Mapping the Cases on the Triple-A Framework

6.1 Overview of Conclusions on Beliefs
6.2 Project and Person Characteristics
6.3 Success Factors
6.4 Triple-A Framework

7.1 Interview Questions
7.2 Overview Companies
7.3 Overview of Pivots
7.4 Concept Comparison

8.1 Reflection of Decisions in Version Management
8.2 Project Availability of Programming Languages
8.3 Ease of Mining of Programming Languages
8.4 Programming Language Component Ecosystem
8.5 Summary of Programming Language Suitability
8.6 Acquired Data
8.7 Number of Deltas Identified
8.8 Summary of CDS Growth
8.9 Quantitative Results
8.10 Results of Email Experiment

9.1 Contributions per chapter


Abbreviations

AK: Architectural Knowledge

AD: Architectural Decision

ADD: Architectural Design Decision

ADL: Architectural Description Language

API: Application Programming Interface

ATAM: Architecture Tradeoff Analysis Method

BML: Build, Measure, Learn

BU: Business Unit

COTS: Commercial off-the-shelf

CSD: Component Selection Decision

DD: Design Decision

DRL: Decision Representation Language

ESB: Enterprise Service Bus

GIS: Geographic Information System

GUI: Graphical User Interface

IT: Information Technology

JSON: JavaScript Object Notation

KE: Knowledge Entity

LOFAR: LOw Frequency ARray

MVP: Minimal Viable Product

OS: Open Source

OWL: Web Ontology Language

PaaS: Platform as a Service

R&D: Research and Development


REST: REpresentational State Transfer

RoI: Return on Investment

RuP: Rational Unified Process

SAD: Software Architecture Document

SaaS: Software as a Service

SOAP: Simple Object Access Protocol

UC: Use Case

UML: Unified Modelling Language

PoC: Proof of Concept

PoT: Proof of Technology

RQ: Research Question

QOC: Questions, Options and Criteria

XML: Extensible Markup Language

XP: EXtreme Programming


Abstract

The software architecture is one of the most influential factors for the success or failure of a software system. The decisions made when managing the software architecture form the basis of a software system. Forgetting these architectural decisions, and the reasons behind these decisions, results in knowledge vaporisation. This architectural knowledge vaporisation has severe consequences. The evolution of a system is more expensive and it is hard to reuse parts of the system if architectural knowledge is missing.

Software architecture research emphasizes the importance of explicitly managing architectural decisions to cope with architectural knowledge vaporisation. Models, methods and tools have been proposed to address this issue. Despite these efforts, industry still seems to struggle with managing architectural decisions in real-world projects. We address this topic from three directions: artifacts, process and reuse. First, we show that architectural decisions can be connected with architectural artifacts, making the maintenance of these decisions easier. Second, we investigate the decision process used in industry to look for success factors for decision making. Last, we show that decisions can be reconstructed and reused from the history of systems.

As a start, we investigate the needs from industry concerning architectural knowledge. This investigation shows that the architectural decisions need to be taken into account in combination with existing artifacts like architectural documentation or system implementation. We show how explicit decisions can form the bridge between the tacit knowledge of architects and the artifacts that are used in software architecture. For example, one of the developed research tools described in this thesis enables annotation and management of tacit knowledge as meta-data when writing or reviewing architecture documentation. In this way, architectural decisions are connected with the documentation of the system, making it easier to access and maintain them.

Next, we investigate how the decision process takes place in practice. We look at the characteristics of the decision, like the person making the decision or the way in which the decision is preserved. We investigate the correlation between decision characteristics and the success of these decisions. We conclude that there are small indicators for success: development experience helps to make better decisions, while large documentation slows projects down.

It is not easy to get experienced architects on all projects. That is why we further investigate the possibility to reuse previously made architectural decisions. We show how architecture decision data can be made accessible from the history of the source code of open source software systems. We show that this data contains statistics on the occurrence of decisions, as well as tacit rationale of the decisions. The data was not documented explicitly during system development, but was extracted when the systems were finished. In this way, knowledge vaporization can be tackled from a different angle by basing new decisions on relevant knowledge from existing systems.


Summary (Samenvatting)

The software architecture is the backbone of software systems. The design decisions that are made form the basis of this architecture. If these decisions and the rationale behind them are forgotten, important knowledge about the system is lost: knowledge vaporization. The vaporization of architectural knowledge can have severe consequences. Evolving the system becomes much more expensive, and it is also much harder to reuse parts of the system when the underlying design decisions are unknown.

Research on software architecture emphasizes the importance of handling design decisions explicitly in order to prevent knowledge vaporization. Models, methods and tooling are used in an attempt to solve these problems. Despite these efforts, the problem still occurs frequently in everyday industrial practice. In this research we look at the problem from three different angles: artifacts, processes and reuse. First, we show that decisions about the architecture can be connected to commonly used architectural artifacts, so that less knowledge is lost. Second, we studied the architecture development process and examined which aspects of this process are important for successful decisions. Finally, we show that design decisions can be reused by deriving them from the version history of existing software systems.

We started by investigating the needs of industry with respect to architectural knowledge. This investigation shows that architectural decisions need to be connected to the artifacts currently used for architecture, such as documentation or the source code. We show that a bridge can be formed between the implicit design decisions and the artifacts by keeping track of design decisions explicitly. For example, one of the developed tools makes it possible to annotate and maintain architectural decisions as metadata within architecture documentation. In this way, the design decisions are connected to the documentation, which makes them easier to maintain.

Next, we studied the decision process in practice. We looked at the context of the decision process, such as who is responsible for the decision and how the decision is recorded. We looked at the correlation between this context and the success of the decision. Here we found small success factors: experience as a developer helps to make better decisions, while extensive documentation slows projects down.


Bibliography

[1] Pietro Abate, Roberto Di Cosmo, Louis Gesbert, Fabrice Le Fessant, Ralf Treinen, and Stefano Zacchiroli. “Mining Component Repositories for Installability Issues”. In: Proceedings of the 12th Working Conference on Mining Software Repositories. MSR ’15. Piscataway, NJ, USA: IEEE Press, 2015, pp. 24–33.

[2] Pekka Abrahamsson, Muhammad Ali Babar, and Philippe Kruchten. “Agility and Architecture: Can They Coexist?” In: IEEE Software 27.2 (2010), pp. 16–22.

[3] Nitin Agarwal and Urvashi Rathod. “Defining ‘success’ for software projects: An exploratory revelation”. In: International Journal of Project Management 24.4 (2006), pp. 358–370.

[4] Zoya Alexeeva, Diego Perez-Palacin, and Raffaela Mirandola. “Design Decision Documentation: A Literature Overview”. In: Software Architecture. Ed. by Bedir Tekinerdogan, Uwe Zdun, and Ali Babar. Cham: Springer International Publishing, 2016, pp. 84–101.

[5] Muhammad Ali Babar, Tuomas Ihme, and Minna Pikkarainen. “An industrial case of exploiting product line architectures in agile software development”. In: Proceedings of the 13th International Software Product Line Conference. SPLC ’09. Pittsburgh, PA, USA: Carnegie Mellon University, 2009, pp. 171–179.

[6] Grigoris Antoniou and Frank van Harmelen. A Semantic Web Primer. MIT Press, Apr. 2004.

[7] J. Appelo. Management 3.0: Leading Agile Developers, Developing Agile Leaders. Addison Wesley Signature Series. Addison Wesley, 2010.

[8] Paris Avgeriou, Philippe Kruchten, Patricia Lago, Paul Grisham, and Dewayne Perry. “Architectural knowledge and rationale: issues, trends, challenges”. In: ACM SIGSOFT Software Engineering Notes 32.4 (2007), pp. 41–46.

[9] Paris Avgeriou, Patricia Lago, and Philippe Kruchten. “Third international workshop on sharing and reusing architectural knowledge (SHARK 2008)”. In: ICSE Companion (2008), pp. 1065–1066.

[10] Claudia Ayala, Øyvind Hauge, Reidar Conradi, Xavier Franch, and Jingyue Li. “Selection of Third Party Software in Off-The-Shelf-based Software development - An Interview Study with Industrial Practitioners”. In: Journal of Systems and Software 84.4 (Apr. 2011), pp. 620–637.

[11] Claudia Ayala, Øyvind Hauge, Reidar Conradi, Xavier Franch, Jingyue Li, and Ketil Sandanger Velle. “Challenges of the Open Source Component Marketplace in the Industry”. In: Open Source Ecosystems: Diverse Communities Interacting. Ed. by Cornelia Boldyreff, Kevin Crowston, Björn Lundell, and Anthony I. Wasserman. Vol. 299. IFIP Advances in Information and Communication Technology. Springer Berlin Heidelberg, 2009, pp. 213–224.


[12] M.A. Babar, I. Gorton, and B. Kitchenham. “A Framework for Supporting Architecture Knowledge and Rationale Management”. In: Rationale Management in Software Engineering. Ed. by Allen H. Dutoit, Raymond McCall, Ivan Mistrík, and Barbara Paech. Springer-Verlag, Mar. 2006. Chap. 11, pp. 237–254.

[13] Muhammad Ali Babar, Remco C. de Boer, Torgeir Dingsøyr, and Rik Farenhorst. “Architectural Knowledge Management Strategies: Approaches in Research and Industry”. In: Proceedings of the 2nd Workshop on SHAring and Reusing architectural Knowledge - Architecture, rationale, and Design Intent (SHARK/ADI 2007). Minneapolis, MN, USA, May 2007.

[14] Muhammad Ali Babar, Torgeir Dingsøyr, Patricia Lago, and Hans van Vliet. Software Architecture Knowledge Management: Theory and Practice. 1st. Springer Publishing Company, Incorporated, 2009.

[15] Elisa L. A. Baniassad, Gail C. Murphy, and Christa Schwanninger. “Design Pattern Rationale Graphs: Linking Design to Source”. In: Proceedings of the 25th ICSE. Portland, Oregon, USA, May 2003, pp. 352–362.

[16] Len Bass, Paul Clements, and Rick Kazman. Software architecture in practice. 2nd ed. Addison Wesley, 2003.

[17] Kent Beck and Martin Fowler. Planning Extreme Programming. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 2000.

[18] S. Blank. The Four Steps to the Epiphany: Successful Strategies for Products that Win. Lulu.com, 2008.

[19] Kelly Blincoe, Francis Harrison, and Daniela Damian. “Ecosystems in GitHub and a Method for Ecosystem Identification Using Reference Coupling”. In: Proceedings of the 12th Working Conference on Mining Software Repositories. MSR ’15. Piscataway, NJ, USA: IEEE Press, 2015, pp. 202–207.

[20] Bluebox. Code Climate. 2014.

[21] Remco de Boer, Rik Farenhorst, Viktor Clerc, Jan Salvador van der Ven, Patricia Lago, and Hans van Vliet. “Structuring Architecture Project Memories”. In: Proceedings 8th International Workshop on Learning Software Organizations (LSO 2006). 2006, pp. 39–47.

[22] Remco C. de Boer and Rik Farenhorst. “In Search of ‘Architectural Knowledge’”. In: Proceedings of the 3rd International Workshop on Sharing and Reusing Architectural Knowledge. SHARK ’08. Leipzig, Germany: ACM, 2008, pp. 71–78.

[23] Remco C. de Boer, Rik Farenhorst, Patricia Lago, Hans van Vliet, Viktor Clerc, and Anton Jansen. “Architectural knowledge: getting to the core”. In: Proceedings of the Quality of software architectures 3rd international conference, 2007. QoSA ’07. Medford, MA: Springer-Verlag, 2007, pp. 197–214.

[24] Remco C. de Boer and Hans van Vliet. “Architectural knowledge discovery with latent semantic analysis: Constructing a reading guide for software product audits”. In: Journal of Systems and Software 81.9 (2008), pp. 1456–1469.

[25] Jan Bosch. “Building Products as Innovation Experiment Systems”. In: Proceedings of Software Business - Third International Conference, ICSOB 2012. Vol. 114. Springer, 2012, pp. 27–39.

[26] Jan Bosch. Design and use of software architectures: adopting and evolving a product-line approach. New York, NY, USA: ACM Press/Addison-Wesley Publishing Co., 2000.


[27] Jan Bosch. “Software Architecture: The Next Step”. In: Software Architecture. Ed. by Flavio Oquendo, Brian C. Warboys, and Ron Morrison. Berlin, Heidelberg: Springer Berlin Heidelberg, 2004, pp. 194–199.

[28] Jan Bosch. Speed, Data, and Ecosystems: Excelling in a Software-Driven World. Boca Raton, FL, USA: CRC Press, Inc., 2016.

[29] Jan Bosch and Petra M. Bosch-Sijtsema. “Introducing agile customer-centered development in a legacy software product line”. In: Software: Practice and Experience (SPE) 41.8 (July 2011), pp. 871–882.

[30] Lars Bratthall, Enrico Johansson, and Bjorn Regnell. “Is a Design Rationale Vital when Predicting Change Impact? A Controlled Experiment on Software Architecture Evolution”. In: Second International Conference on Product Focused Software Process Improvement (PROFES). Vol. 1840. LNCS. Oulu, Finland: Springer, 2000, pp. 126–139.

[31] Hongyu Pei Breivold, Daniel Sundmark, Peter Wallin, and Stig Larsson. “What Does Research Say about Agile and Architecture?” In: Proceedings of the 2010 Fifth International Conference on Software Engineering Advances. ICSEA ’10. Washington, DC, USA: IEEE Computer Society, 2010, pp. 32–37.

[32] Jeen Broekstra, Arjohn Kampman, and Frank van Harmelen. “Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema”. In: Proceedings of the First International Semantic Web Conference on The Semantic Web (ISWC 2002). London, UK: Springer-Verlag, 2002, pp. 54–68.

[33] Frederick P. Brooks Jr. The Mythical Man-month (Anniversary Ed.) Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 1995.

[34] Wanfeng Bu, Antony Tang, and Jun Han. “An analysis of decision-centric architectural design approaches”. In: Proceedings of the 2009 ICSE Workshop on Sharing and Reusing Architectural Knowledge. SHARK ’09. Washington, DC, USA: IEEE Computer Society, 2009, pp. 33–40.

[35] J. E. Burge and D. C. Brown. “An Integrated Approach for Software Design Checking using Design Rationale”. In: 1st International Conference on Design Computing and Cognition (DCC ’04). MIT, Cambridge, USA, July 2004, pp. 557–576.

[36] Frank Buschmann, Regine Meunier, Hans Rohnert, Peter Sommerlad, and Michael Stal. A system of patterns. John Wiley and Sons, Inc., 1996.

[37] Harvey R. Butcher. “LOFAR: first of a new generation of radio telescopes”. In: Proceedings of SPIE. 2004.

[38] Rafael Capilla, Anton Jansen, Antony Tang, Paris Avgeriou, and Muhammad Ali Babar. “10 Years of Software Architecture Knowledge Management: Practice and future”. In: J. Syst. Softw. 116.C (June 2016), pp. 191–205.

[39] Rafael Capilla, Francisco Nava, Sandra Pérez, and Juan C. Dueñas. “A web-based tool for managing architectural design decisions”. In: SIGSOFT Software Engineering Notes 31.5 (2006).

[40] Tsun Chow and Dac-Buu Cao. “A survey study of critical success factors in agile software projects”. In: Journal of Systems and Software 81.6 (2008). Agile Product Line Engineering, pp. 961–971.


[41] Maelick Claes, Tom Mens, Roberto Di Cosmo, and Jérôme Vouillon. “A Historical Analysis of Debian Package Incompatibilities”. In: Proceedings of the 12th Working Conference on Mining Software Repositories. MSR ’15. Piscataway, NJ, USA: IEEE Press, 2015, pp. 212–223.

[42] Paul Clements, David Garlan, Len Bass, Judith Stafford, Robert Nord, James Ivers, and Reed Little. Documenting Software Architectures: Views and Beyond. Pearson Education, 2002.

[43] Viktor Clerc. Architectural Knowledge Management in Global Software Development. VU University Amsterdam, 2011.

[44] Alistair Cockburn. Writing Effective Use Cases. 1st. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 2000.

[45] J. Coplien and G. Bjørnvig. Lean Architecture: For Agile Software Development. John Wiley and Sons, 2010.

[46] Jim Cowie and Wendy Lehnert. “Information extraction”. In: Commun. ACM 39.1 (1996), pp. 80–91.

[47] Alistair Croll and Benjamin Yoskovitz. Lean Analytics: Use Data to Build a Better Startup Faster. O’Reilly Media, Inc, 2013.

[48] Barthélémy Dagenais and Martin P. Robillard. “Creating and evolving developer documentation: understanding the decisions of open source contributors”. In: Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering. FSE ’10. New York, NY, USA: ACM, 2010, pp. 127–136.

[49] Thomas H. Davenport. “How to Design Smart Business Experiments”. In: Harvard Business Review 87.2 (Feb. 2009), pp. 68–76.

[50] J. Andres Diaz-Pace, Matias Nicoletti, Silvia Schiaffino, and Santiago Vidal. “Producing Just Enough Documentation: The Next SAD Version Problem”. In: Search-Based Software Engineering: 6th International Symposium, SSBSE 2014, Fortaleza, Brazil, August 26-29, 2014. Proceedings 1 (2014). Ed. by Claire Goues and Shin Yoo, pp. 46–60.

[51] Wei Ding, Peng Liang, Antony Tang, and Hans van Vliet. “Understanding the Causes of Architecture Changes using OSS Mailing Lists”. In: International Journal of Software Engineering and Knowledge Engineering 25.9&10 (2015), pp. 1633–1651.

[52] Aline Dresch, Daniel Pacheco Lacerda, and José Antônio Valle Antunes. Design Science Research: A Method for Science and Technology Advancement. Springer Publishing Company, Incorporated, 2014.

[53] Juan C. Dueñas and Rafael Capilla. “The Decision View of Software Architecture”. In: EWSA. 2005, pp. 222–230.

[54] Allen H. Dutoit, Raymond McCall, Ivan Mistrik, and Barbara Paech. Rationale Management in Software Engineering. Berlin, Heidelberg: Springer-Verlag, 2006.

[55] Kathleen M. Eisenhardt. “Building Theories from Case Study Research”. In: Academy of Management Review 14.4 (1989), pp. 532–550.

[56] Kathleen M. Eisenhardt and Behnam N. Tabrizi. “Accelerating Adaptive Processes: Product Innovation in the Global Computer Industry”. In: Administrative Science Quarterly 40.1 (1995), pp. 84–110.


[57] Hylke Faber, Menno Wierdsma, Richard Doornbos, Jan Salvador van der Ven, and Kevin de Vette. “Teaching Computational Thinking to Primary School Students via Unplugged Programming Lessons”. In: Journal of the European Teacher Education Network 12.0 (2017), pp. 13–24.

[58] Hylke H. Faber, Jan Salvador van der Ven, and Menno D.M. Wierdsma. “Teaching Computational Thinking to 8-Year-Olds Through ScratchJr”. In: Proceedings of the 2017 ACM Conference on Innovation and Technology in Computer Science Education. ITiCSE ’17. Bologna, Italy: ACM, 2017, pp. 359–359.

[59] Davide Falessi, Giovanni Cantone, and Martin Becker. “Documenting design decision rationale to improve individual and team design decision making: an experimental evaluation”. In: Proceedings of the 2006 ACM/IEEE international symposium on empirical software engineering (ISESE ’06). New York, NY, USA: ACM Press, 2006, pp. 134–143.

[60] Rik Farenhorst and Remco de Boer. Architectural Knowledge Management: Supporting Architects and Auditors. VU University Amsterdam, 2009.

[61] Rik Farenhorst, Remco C. de Boer, Robert Deckers, Patricia Lago, and Hans van Vliet. “What’s in Constructing a Domain Model for Sharing Architectural Knowledge?” In: Proceedings of the Eighteenth International Conference on Software Engineering & Knowledge Engineering (SEKE’2006), San Francisco, CA, USA, July 5-7, 2006. Ed. by Kang Zhang, George Spanoudakis, and Giuseppe Visaggio. 2006, pp. 108–113.

[62] Rik Farenhorst, Johan F. Hoorn, Patricia Lago, and Hans van Vliet. “The Lonesome Architect”. In: Journal of Systems and Software 84.9 (Sept. 2011), pp. 1424–1435.

[63] Rik Farenhorst, Ronald Izaks, Patricia Lago, and Hans van Vliet. “A Just-In-Time Architectural Knowledge Sharing Portal”. In: WICSA. IEEE Computer Society, 2008, pp. 125–134.

[64] Roy Thomas Fielding. REST: Architectural Styles and the Design of Network-based Software Architectures. University of California, Irvine, 2000.

[65] C. Gacek, A. Abd-Allah, B. Clark, and B. Boehm. “On the Definition of Software System Architecture”. In: Proceedings of ICSE 17 Software Architecture Workshop. Apr. 1995.

[66] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns: Elements of Reusable Object-oriented Software. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 1995.

[67] David Garlan, Felix Bachmann, James Ivers, Judith Stafford, Len Bass, Paulo Merson, and Paul Clements. Documenting Software Architectures: Views and Beyond. 2nd. Addison-Wesley Professional, 2010.

[68] David Garlan and Mary Shaw. An Introduction to Software Architecture. Tech. rep. Pittsburgh, PA, USA, 1994.

[69] Sebastian Gerdes, Mohamed Soliman, and Matthias Riebisch. “Decision Buddy: Tool Support for Constraint-Based Design Decisions During System Evolution”. In: Proceedings of the 1st International Workshop on Future of Software Architecture Design Assistants. FoSADA ’15. New York, NY, USA: ACM, 2015, pp. 13–18.

[70] Daniel M. German, Bram Adams, and Ahmed E. Hassan. “Continuously mining distributed version control systems: an empirical study of how Linux uses Git”. In: Empirical Software Engineering 21.1 (2015), pp. 260–299.


[71] Github. Github Network Graph Visualizer. 2014.

[72] Gitstats. Gitstats. 2014.

[73] Klaas Andries de Graaf, Antony Tang, Peng Liang, and Hans van Vliet. “Ontology-based Software Architecture Documentation”. In: WICSA / ECSA 2012. IEEE, 2012, pp. 121–130.

[74] Dawn G. Gregg, Uday R. Kulkarni, and Ajay S. Vinzé. “Understanding the Philosophical Underpinnings of Software Engineering Research in Information Systems”. In: Information Systems Frontiers 3.2 (June 2001), pp. 169–183.

[75] Jilles van Gurp and Jan Bosch. “Design erosion: problems and causes”. In: Journal of Systems and Software 61.2 (Mar. 2002), pp. 105–119.

[76] I. Hadar, S. Sherman, E. Hadar, and J. J. Harrison. “Less is more: Architecture documentation for agile development”. In: Cooperative and Human Aspects of Software Engineering (CHASE), 2013 6th International Workshop on. May 2013, pp. 121–124.

[77] Nicole Haenni, Mircea Lungu, Niko Schwarz, and Oscar Nierstrasz. “Categorizing Developer Information Needs in Software Ecosystems”. In: Proceedings of the 2013 International Workshop on Ecosystem Architectures. WEA 2013. New York, NY, USA: ACM, 2013, pp. 1–5.

[78] Siegfried Handschuh, Steffen Staab, and Fabio Ciravegna. “S-CREAM - Semi-automatic CREAtion of Metadata”. In: EKAW ’02: Proceedings of the 13th International Conference on Knowledge Engineering and Knowledge Management. Ontologies and the Semantic Web. London, UK: Springer-Verlag, 2002, pp. 358–372.

[79] Morten T. Hansen, Nitin Nohria, and Thomas Tierney. “What’s Your Strategy for Managing Knowledge?” In: Harvard Business Review 77.2 (1999), pp. 106–116.

[80] Hans-Jörg Happel and Stefan Seedorf. “Ontobrowse: A Semantic Wiki for Sharing Knowledge about Software Architectures”. In: SEKE. Knowledge Systems Institute Graduate School, Sept. 19, 2007, pp. 506–512.

[81] Neil B. Harrison, Paris Avgeriou, and Uwe Zdun. “Using Patterns to Capture Architectural Decisions”. In: IEEE Software 24.4 (July 2007), pp. 38–45.

[82] Øyvind Hauge, Thomas Osterlie, Carl-Fredrik Sorensen, and Marinela Gerea. “An Empirical Study on Selection of Open Source Software - Preliminary Results”. In: Proceedings of the 2009 ICSE Workshop on Emerging Trends in Free/Libre/Open Source Software Research and Development. FLOSS ’09. Washington, DC, USA: IEEE Computer Society, 2009, pp. 42–47.

[83] Jane Huffman Hayes, Alex Dekhtyar, and Senthil Karthekeyan Sundaram. “Improving After-the-Fact Tracing and Mapping: Supporting Software Quality Predictions”. In: IEEE Software 22.6 (Nov. 2005), pp. 30–37.

[84] B. Hedeman, H. Fredriksz, and G.V. van Heemst. Project Management: Based On Prince2. ITSM Library. Van Haren Publishing, 2005.

[85] U. van Heesch, P. Avgeriou, and R. Hilliard. “A Documentation Framework for Architecture Decisions”. In: J. Syst. Softw. 85.4 (Apr. 2012), pp. 795–820.

[86] Alan R. Hevner, Salvatore T. March, Jinsoo Park, and Sudha Ram. “Design Science in Information Systems Research”. In: MIS Q. 28.1 (Mar. 2004), pp. 75–105.

[87] Yoshiki Higo, Akio Ohtani, Shinpei Hayashi, Hideaki Hata, and Shinji Kusumoto. “Toward Reusing Code Changes”. In: Proceedings of the 12th Working Conference on Mining Software Repositories. MSR ’15. Piscataway, NJ, USA: IEEE Press, 2015, pp. 372–376.


[88] André van der Hoek, Marija Mikic-Rakic, Roshanak Roshandel, and Nenad Medvidovic. “Taming architectural evolution”. In: Proceedings of the 8th European software engineering conference. Vienna, Austria: ACM Press, 2001, pp. 1–10.

[89] André Van Der Hoek and Alexander L. Wolf. “Software Release Management for Component-Based Software”. In: Software: Practice and Experience (SPE) 33.1 (2003), pp. 77–98.

[90] C. Hofmeister, R. Nord, and D. Soni. Applied Software Architecture. The Addison-Wesley Object Technology Series. Addison-Wesley, 2000.

[91] Christine Hofmeister, Robert L. Nord, and Dilip Soni. “Global Analysis: moving from software requirements specification to structural views of the software architecture”. In: IEE Proceedings Software 4 (Aug. 2005), pp. 187–197.

[92] G. Hohpe, I. Ozkaya, U. Zdun, and O. Zimmermann. “The Software Architect’s Role in the Digital Age”. In: IEEE Software 33.6 (Nov. 2016), pp. 30–39.

[93] Pablo Martin de Holan and Nelson Phillips. “Organizational forgetting as a strategy”. In: Strategic Organization 2.4 (2004), pp. 423–433.

[94] J. Horner and M.E. Atwood. “Effective Design Rationale: Understanding the Barriers”. In: Rationale Management in Software Engineering. Ed. by Allen H. Dutoit, Raymond McCall, Ivan Mistrik, and Barbara Paech. Springer-Verlag, Mar. 2006. Chap. 3, pp. 73–88.

[95] IEEE/ANSI. Recommended Practice for Architectural Description of Software-Intensive Systems. IEEE Standard No. 1471-2000, Product No. SH94869-TBR. 2000.

[96] Anton Jansen. “Architectural Design Decisions”. PhD thesis. Institute of Mathematics and Computing Science, University of Groningen, July 2008.

[97] Anton Jansen, Paris Avgeriou, and Jan Salvador van der Ven. “Enriching Software Architecture Documentation”. In: Journal of Systems and Software 82.8 (Aug. 2009), pp. 1232–1248.

[98] Anton Jansen, Tjaard de Vries, Paris Avgeriou, and Martijn van Veelen. “Sharing the Architectural Knowledge of Quantitative Analysis”. In: Proceedings of the Fourth International Conference on the Quality of Software-Architectures (QoSA 2008). Vol. 5281. LNCS. Oct. 2008, pp. 220–234.

[99] Anton G. J. Jansen and Jan Bosch. “Evaluation of Tool Support for Architectural Evolution”. In: Proceedings of the 19th IEEE International Conference on Automated Software Engineering (ASE 2004). Linz, Austria: IEEE, Sept. 2004, pp. 375–378.

[100] Anton G. J. Jansen and Jan Bosch. “Software Architecture as a Set of Architectural Design Decisions”. In: Proceedings of the 5th IEEE/IFIP Working Conference on Software Architecture (WICSA 2005). Pittsburgh, Pennsylvania, USA: IEEE Computer Society, Nov. 2005, pp. 109–119.

[101] Anton G. J. Jansen, Jan van der Ven, Paris Avgeriou, and Dieter K. Hammer. “Tool support for Architectural Decisions”. In: Proceedings of the 6th IEEE/IFIP Working Conference on Software Architecture (WICSA 2007). Mumbai, India, Jan. 2007.

[102] A. Jedlitschka and D. Pfahl. “Reporting guidelines for controlled experiments in software engineering”. In: Empirical Software Engineering, 2005. 2005 International Symposium on (Nov. 2005), 10 pp.


[103] Huzefa Kagdi, Michael L. Collard, and Jonathan I. Maletic. “A survey and tax-onomy of approaches for mining software repositories in the context of softwareevolution”. In: Journal of Software Maintenance and Evolution: Research and Practice19.2 (2007), pp. 77–131.

[104] José Kahan and Marja-Ritta Koivunen. “Annotea: an open RDF infrastructure forshared Web annotations”. In: WWW ’01: Proceedings of the 10th international confer-ence on World Wide Web. New York, NY, USA: ACM, 2001, pp. 623–632.

[105] Eirini Kalliamvakou, Georgios Gousios, Kelly Blincoe, Leif Singer, Daniel M. Ger-man, and Daniela Damian. “An in-depth study of the promises and perils of min-ing GitHub”. In: Empirical Software Engineering (2015), pp. 1 –37.

[106] R. Kazman, L. Bass, G. Abowd, and M. Webb. “SAAM: a method for analyzing the properties of software architectures”. In: Proceedings of 16th International Conference on Software Engineering. May 1994, pp. 81–90.

[107] Rick Kazman, Mark Klein, Mario Barbacci, Tom Longstaff, Howard Lipson, and Jeromy Carriere. “The Architecture Tradeoff Analysis Method”. In: Proceedings of the Fourth IEEE International Conference on Engineering of Complex Computer Systems (ICECCS). 1998.

[108] Atanas Kiryakov, Damyan Ognyanov, and Dimitar Manov. “OWLIM - A Pragmatic Semantic Repository for OWL”. In: Proceedings of the Web Information Systems Engineering Workshop (WISE). Vol. 3807. LNCS. New York, USA, Nov. 2005, pp. 182–192.

[109] Henrik Kniberg. Scrum and XP from the Trenches: Enterprise Software Development. Lulu.com, 2007.

[110] Ehsan Kouroshfar, Mehdi Mirakhorli, Hamid Bagheri, Lu Xiao, Sam Malek, and Yuanfang Cai. “A Study on the Role of Software Architecture in the Evolution and Quality of Software”. In: Proceedings of the 12th Working Conference on Mining Software Repositories. MSR ’15. Piscataway, NJ, USA: IEEE Press, 2015, pp. 246–257.

[111] Heiko Koziolek and Thomas Goldschmidt. “Tool-Driven Technology Transfer to Support Software Architecture Decisions”. In: Software Engineering 2014, Fachtagung des GI-Fachbereichs Softwaretechnik, 25. Februar - 28. Februar 2014, Kiel, Deutschland. 2014, pp. 159–164.

[112] P. Kruchten, P. Lago, H. van Vliet, and T. Wolf. “Building up and Exploiting Architectural Knowledge”. In: 5th Working IEEE/IFIP Conference on Software Architecture (WICSA’05). 2005, pp. 291–292.

[113] P. Kruchten, H. Obbink, and J. Stafford. “The Past, Present, and Future for Software Architecture”. In: IEEE Software 23.2 (Mar. 2006), pp. 22–30.

[114] Philippe Kruchten. “An Ontology of Architectural Design Decisions in Software Intensive Systems”. In: 2nd Groningen Workshop Software Variability 2004. Oct. 2004, pp. 54–61.

[115] Philippe Kruchten. “Software architecture and agile software development: a clash of two cultures?” In: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 2, ICSE 2010. 2010, pp. 497–498.

[116] Philippe Kruchten. “The 4+1 View Model of Architecture”. In: IEEE Softw. 12.6 (Nov. 1995), pp. 42–50.

[117] Philippe Kruchten. The Rational Unified Process: An Introduction. 3rd ed. Boston: Addison-Wesley, 2003.

[118] Philippe Kruchten. “What do software architects really do?” In: Journal of Systems and Software 81.12 (Dec. 2008), pp. 2413–2416.

[119] Philippe Kruchten, Patricia Lago, and Hans Van Vliet. “Building up and Reasoning about Architectural Knowledge”. In: Proceedings of the Second International Conference on the Quality of Software Architectures (QoSA). Springer-Verlag, 2006, pp. 43–58.

[120] Philippe Kruchten, Robert L. Nord, and Ipek Ozkaya. “Technical Debt: From Metaphor to Theory and Practice”. In: IEEE Software 29.6 (2012), pp. 18–21.

[121] Patricia Lago and Paris Avgeriou. “First workshop on sharing and reusing architectural knowledge”. In: SIGSOFT Software Engineering Notes 31.5 (2006), pp. 32–36.

[122] Patricia Lago, Paris Avgeriou, Rafael Capilla, and Philippe Kruchten. “Wishes and Boundaries for a Software Architecture Knowledge Community”. In: Proceedings of the Seventh Working IEEE/IFIP Conference on Software Architecture (WICSA 2008). Washington, DC, USA: IEEE Computer Society, 2008, pp. 271–274.

[123] Patricia Lago and Hans van Vliet. “Explicit assumptions enrich architectural models”. In: ICSE ’05: Proceedings of the 27th international conference on Software engineering. St. Louis, MO, USA: ACM Press, 2005, pp. 206–214.

[124] Bruno Latour. Science in Action: How to Follow Scientists and Engineers Through Society. Harvard University Press, 1987.

[125] Duc Minh Le, Pooyan Behnamghader, Joshua Garcia, Daniel Link, Arman Shahbazian, and Nenad Medvidovic. “An Empirical Study of Architectural Change in Open-source Software Systems”. In: Proceedings of the 12th Working Conference on Mining Software Repositories. MSR ’15. Piscataway, NJ, USA: IEEE Press, 2015, pp. 235–245.

[126] Jintae Lee. “Extending the Potts and Bruns model for recording design rationale”. In: Proceedings of the 13th International Conference on Software Engineering (ICSE 1991). Austin, Texas, United States: IEEE, 1991, pp. 114–125.

[127] T.C. Lethbridge, J. Singer, and A. Forward. “How software engineers use documentation: the state of the practice”. In: Software, IEEE 20.6 (Nov. 2003), pp. 35–39.

[128] Todd A. Letsche and Michael W. Berry. “Large-scale information retrieval with latent semantic indexing”. In: Inf. Sci. 100.1-4 (1997), pp. 105–137.

[129] Jingyue Li, Finn Olav Bjørnson, Reidar Conradi, and Vigdis B. Kampenes. “An empirical study of variations in COTS-based software development processes in the Norwegian IT industry”. In: Empirical Software Engineering 11.3 (2006), pp. 433–461.

[130] Jingyue Li, Reidar Conradi, Christian Bunse, Marco Torchiano, Odd Petter N. Slyngstad, and Maurizio Morisio. “Development with Off-the-Shelf Components: 10 Facts”. In: IEEE Software 26.2 (2009), pp. 80–87.

[131] Peng Liang, Anton Jansen, and Paris Avgeriou. “Sharing architecture knowledge through models: quality and cost”. In: The Knowledge Engineering Review 24.3 (2009), pp. 225–244.

[132] Frank van der Linden, Jan Bosch, Erik Kamsties, Kari Känsälä, and J. Henk Obbink. “Software Product Family Evaluation”. In: SPLC. Ed. by Robert L. Nord. Vol. 3154. Lecture Notes in Computer Science. Springer Verlag, 2004, pp. 110–129.

[133] Claudia López, Víctor Codocedo, Hernán Astudillo, and Luiz Marcio Cysneiros. “Bridging the gap between software architecture rationale formalisms and actual architecture documents: An ontology-driven approach”. In: Science of Computer Programming 77.1 (2012), pp. 66–80.

[134] Allan MacLean, Richard M. Young, Victoria M.E. Bellotti, and Thomas P. Moran. “Questions, Options, and Criteria: Elements of Design Space Analysis”. In: Human-Computer Interaction 6.3&4 (1991), pp. 201–250.

[135] R. Malan and D. Bredemeyer. “Less is more with minimalist architecture”. In: IT Professional 4.5 (2002), pp. 48, 46–47.

[136] A. Maurya. Running Lean: Iterate from Plan A to a Plan That Works. Lean Series. O’Reilly Media, Incorporated, 2012.

[137] Collin McMillan, Negar Hariri, Denys Poshyvanyk, Jane Cleland-Huang, and Bamshad Mobasher. “Recommending Source Code for Use in Rapid Software Prototypes”. In: Proceedings of the 34th International Conference on Software Engineering. ICSE ’12. Piscataway, NJ, USA: IEEE Press, 2012, pp. 848–858.

[138] Nenad Medvidovic and Richard N. Taylor. “Classification and Comparison Framework for Software Architecture Description Languages”. In: IEEE Transactions on Software Engineering 26.1 (2000), pp. 70–93.

[139] Parastoo Mohagheghi and Reidar Conradi. “Quality, productivity and economic benefits of software reuse: a review of industrial studies”. In: Empirical Software Engineering 12.5 (2007), pp. 471–516.

[140] Irineu Moura, Gustavo Pinto, Felipe Ebert, and Fernando Castor. “Mining Energy-aware Commits”. In: Proceedings of the 12th Working Conference on Mining Software Repositories. MSR ’15. Piscataway, NJ, USA: IEEE Press, 2015, pp. 56–67.

[141] MSR ’15: Proceedings of the 12th Working Conference on Mining Software Repositories. Piscataway, NJ, USA: IEEE Press, 2015.

[142] Muhammad Ali Babar, Ian Gorton, and Ross Jeffery. Toward a Framework for Capturing and Using Architecture Design Knowledge. Tech. rep. UNSW-CSE-TR-0513. University of New South Wales, Australia and National ICT Australia Ltd., June 2005.

[143] O. Nierstrasz and M. Lungu. “Agile software assessment (Invited paper)”. In: 2012 20th IEEE International Conference on Program Comprehension (ICPC). June 2012, pp. 3–10.

[144] Ikujiro Nonaka and Hirotaka Takeuchi. Knowledge-Creating Company. How Japanese Companies Create the Dynamics of Innovation. New York: Oxford University Press, 1995.

[145] Kenton O’Hara and Abigail Sellen. “A comparison of reading paper and on-line documents”. In: CHI ’97: Proceedings of the SIGCHI conference on Human factors in computing systems. New York, NY, USA: ACM, 1997, pp. 335–342.

[146] OMG. UML Specification, Version 2.0. 2012.

[147] J. Andrés Díaz Pace, Christian Villavicencio, Silvia N. Schiaffino, Matias Nicoletti, and Hernán Ceferino Vázquez. “Producing Just Enough Documentation: An Optimization Approach Applied to the Software Architecture Domain”. In: J. Data Semantics 5.1 (2016), pp. 37–53.

[148] David Parsons, Awais Rashid, Alexandru Telea, and Andreas Speck. “An architectural pattern for designing component-based application frameworks”. In: Software: Practice and Experience (SPE) 36.2 (2006), pp. 157–190.

[149] Dewayne E. Perry and Alexander L. Wolf. “Foundations for the Study of Software Architecture”. In: SIGSOFT Softw. Eng. Notes 17.4 (Oct. 1992), pp. 40–52.

[150] Kai Petersen and Claes Wohlin. “A comparison of issues and advantages in agile and incremental development between state of the art and an industrial case”. In: Journal of Systems and Software 82.9 (Sept. 2009), pp. 1479–1490.

[151] Martin Pinzger and Sunghun Kim. “Guest editorial: mining software repositories”. In: Empirical Software Engineering 21.5 (2016), pp. 2033–2034.

[152] E. R. Poort. “Driving Agile Architecting with Cost and Risk”. In: IEEE Software 31.5 (Sept. 2014), pp. 20–23.

[153] Eltjo R. Poort, Agung Pramono, Michiel Perdeck, Viktor Clerc, and Hans van Vliet. “Successful Architectural Knowledge Sharing: Beware of Emotions”. In: Architectures for Adaptive Software Systems, 5th International Conference on the Quality of Software Architectures, QoSA 2009, East Stroudsburg, PA, USA, June 24-26, 2009, Proceedings. 2009, pp. 130–145.

[154] T. Punter, M. Ciolkowski, B. Freimut, and I. John. “Conducting on-line surveys in software engineering”. In: 2003 International Symposium on Empirical Software Engineering, 2003. ISESE 2003. Proceedings. Sept. 2003, pp. 80–88.

[155] J. Rasmusson. The Agile Samurai: How Agile Masters Deliver Great Software. Pragmatic Bookshelf Series. Pragmatic Bookshelf, 2010.

[156] E. Ries. The Lean Startup: How Constant Innovation Creates Radically Successful Businesses. Penguin Books Limited, 2011.

[157] David C. Rine and Nader Nada. “Three empirical studies of a software reuse reference model”. In: Software: Practice and Experience (SPE) 30.6 (2000), pp. 685–722.

[158] Joseph Lee Rodgers and W. Alan Nicewander. “Thirteen Ways to Look at the Correlation Coefficient”. In: The American Statistician 42.1 (Feb. 1988), pp. 59–66.

[159] RubyGems.org. Ruby Gems. 2014.

[160] Ken Schwaber and Mike Beedle. Agile Software Development with SCRUM. Prentice Hall, Oct. 2001.

[161] Carolyn B. Seaman. “Qualitative Methods”. In: Guide to Advanced Empirical Software Engineering. Ed. by Forrest Shull, Janice Singer, and Dag I. K. Sjøberg. London: Springer London, 2008, pp. 35–62.

[162] Mojtaba Shahin, Peng Liang, and Zengyang Li. “Do Architectural Design Decisions Improve the Understanding of Software Architecture? Two Controlled Experiments”. In: Proceedings of the 22nd International Conference on Program Comprehension. ICPC 2014. New York, NY, USA: ACM, 2014, pp. 3–13.

[163] Mary Shaw. “What makes good research in software engineering?” In: STTT 4.1 (2002), pp. 1–7.

[164] Mary Shaw and David Garlan. Software Architecture - Perspectives on an emerging discipline. Prentice Hall, 1996.

[165] David J. Sheskin. Handbook of Parametric and Nonparametric Statistical Procedures. 3rd ed. Chapman & Hall/CRC, 2003.

[166] Sajjan G. Shiva and Lubna Abou Shala. “Software Reuse: Research and Practice”. In: ITNG 2007. IEEE Computer Society, May 14, 2007, pp. 603–609.

[167] Leif Singer, Fernando Figueira Filho, and Margaret-Anne Storey. “Software Engineering at the Speed of Light: How Developers Stay Current Using Twitter”. In: Proceedings of the 36th International Conference on Software Engineering. ICSE 2014. New York, NY, USA: ACM, 2014, pp. 211–221.

[168] Marco Sinnema, Jan Salvador van der Ven, and Sybren Deelstra. “Using Variability Modeling Principles to Capture Architectural Knowledge”. In: SIGSOFT Softw. Eng. Notes 31.5 (Sept. 2006).

[169] Mohamed Soliman and Matthias Riebisch. “Modeling the Interactions between Decisions within Software Architecture Knowledge”. In: Software Architecture: 8th European Conference, ECSA 2014, Vienna, Austria, August 25-29, 2014. Proceedings (2014). Ed. by Paris Avgeriou and Uwe Zdun, pp. 33–40.

[170] T. M. Somers and K. Nelson. “The impact of critical success factors across the stages of enterprise resource planning implementations”. In: System Sciences, 2001. Proceedings of the 34th Annual Hawaii International Conference on. Jan. 2001, 10 pp.–.

[171] A. Tang, M. A. Babar, I. Gorton, and Jun Han. “A Survey of the Use and Documentation of Architecture Design Rationale”. In: 5th Working IEEE/IFIP Conference on Software Architecture (WICSA’05). 2005, pp. 89–98.

[172] Antony Tang, Yan Jin, and Jun Han. “A rationale-based architecture model for design traceability and reasoning”. In: Journal of Systems and Software 80.6 (June 2007), pp. 918–934.

[173] Antony Tang, Yan Jin, Jun Han, and Ann E. Nicholson. “Predicting Change Impact in Architecture Design with Bayesian Belief Networks”. In: Proceeding of the Fifth Working IEEE / IFIP Conference on Software Architecture (WICSA 2005). Pittsburgh, USA: IEEE Computer Society, Nov. 2005, pp. 67–76.

[174] S.H. Thomke. Experimentation Matters: Unlocking the Potential of New Technologies for Innovation. Innovation / Harvard Business School. Harvard Business School Press, 2003.

[175] Stefan Thomke. Enlightened Experimentation: The New Imperative for Innovation (HBR OnPoint Enhanced Edition). Harvard Business Review, Feb. 2001.

[176] Suresh Thummalapenta. “SpotWeb: Detecting Framework Hotspots via Mining Open Source Repositories on the Web”. In: Proceedings of the 2008 International Workshop on Mining Software Repositories (MSR). 2008, pp. 109–112.

[177] Dan Tofan. “Understanding and Supporting Software Architectural Decisions: for Reducing Architectural Knowledge Vaporization”. English. PhD thesis. University of Groningen, 2015.

[178] Dan Tofan, Matthias Galster, and Paris Avgeriou. “Difficulty of Architectural Decisions - a Survey with Professional Architects”. In: Proceedings of the 7th European Conference on Software Architecture (ECSA). Springer LNCS, 2013.

[179] Dan Tofan, Matthias Galster, and Paris Avgeriou. “Reducing Architectural Knowledge Vaporization by Applying the Repertory Grid Technique”. In: Proceedings of the 5th European Conference on Software Architecture (ECSA). Springer LNCS, 2011, pp. 244–251.

[180] Jeff Tyree and Art Akerman. “Architecture Decisions: Demystifying Architecture”. In: IEEE Software 22.2 (Mar. 2005), pp. 19–27.

[181] Maria Vargas-Vera, Enrico Motta, John Domingue, Mattia Lanzoni, Arthur Stutt, and Fabio Ciravegna. “MnM: Ontology Driven Semi-automatic and Automatic Support for Semantic Markup”. In: LNCS 2473 (2002), pp. 213–221.

[182] Bogdan Vasilescu, Vladimir Filkov, and Alexander Serebrenik. “Perceptions of Diversity on GitHub: A User Survey”. In: Proceedings of the Eighth International Workshop on Cooperative and Human Aspects of Software Engineering. CHASE ’15. Piscataway, NJ, USA: IEEE Press, 2015, pp. 50–56.

[183] Jan S. van der Ven, Anton G. J. Jansen, Paris Avgeriou, and Dieter K. Hammer. “Using Architectural Decisions”. In: Second International Conference on the Quality of Software Architecture (QoSA 2006). Karlsruhe University Press, 2006, pp. 1–10.

[184] Jan Salvador van der Ven and Jan Bosch. “Architecture Decisions: Who, How, and When?” In: Agile Software Architecture. Ed. by Muhammad Ali Babar, Alan W. Brown, and Ivan Mistrik. Boston: Morgan Kaufmann, 2014, pp. 113–136.

[185] Jan Salvador van der Ven and Jan Bosch. “Busting Software Architecture Beliefs: A Survey on Success Factors in Architecture Decision Making”. In: 42th Euromicro Conference on Software Engineering and Advanced Applications (SEAA). Aug. 2016, pp. 42–49.

[186] Jan Salvador van der Ven and Jan Bosch. “Making the Right Decision: Supporting Architects with Design Decision Data”. In: Proceedings of the 7th European Conference on Software Architecture (ECSA 2013). Ed. by Khalil Drira. Vol. 7957. Lecture Notes in Computer Science. Springer, 2013, pp. 176–183.

[187] Jan Salvador van der Ven and Jan Bosch. “Pivots and Architectural Decisions: Two Sides of the Same Medal? What Architecture Research and Lean Startup can learn from Each Other”. In: Proceedings of International Conference on Software Engineering Advances (ICSEA 2013). 2013, pp. 310–317.

[188] Jan Salvador van der Ven and Jan Bosch. “Towards Reusing Decisions by Mining Open Source Repositories”. In: Journal of Software Systems. 2018.

[189] Jan Salvador van der Ven, Anton Jansen, Jos Nijhuis, and Jan Bosch. “Design Decisions: The Bridge between Rationale and Architecture”. In: Rationale Management in Software Engineering. Springer, 2006, pp. 329–348.

[190] Hans van Vliet, Paris Avgeriou, Remco C. de Boer, Viktor Clerc, Rik Farenhorst, and Anton G. J. Jansen. “The GRIFFIN project: lessons learned”. In: Software Architecture Knowledge Management: Theory and Practice (2009), pp. 137–154.

[191] Hans van Vliet and Antony Tang. “Decision making in software architecture”. In: Journal of Systems and Software (2016), pp. –.

[192] Lucian Voinea and Alexandru Telea. “Visual querying and analysis of large software repositories”. In: Empirical Software Engineering 14.3 (2008), pp. 316–340.

[193] W3C. OWL Web Ontology Language - W3C Recommendation. http://www.w3.org/TR/owl-features/. 2004.

[194] Aaron E. Walsh, ed. Uddi, Soap, and Wsdl: The Web Services Specification Reference Book. Prentice Hall Professional Technical Reference, 2002.

[195] Zhenyu Wang, Khalid Sherdil, and Nazim H. Madhavji. “ACCA: An Architecture-centric Concern Analysis Method”. In: 5th Working IEEE/IFIP Conference on Software Architecture (WICSA). Pittsburgh, United States, Nov. 2005, pp. 99–108.

[196] John Wateridge. “How can IS/IT projects be measured for success?” In: International Journal of Project Management 16.1 (1998), pp. 59–63.

[197] R. Weinreich and I. Groher. “The Architect’s Role in Practice: From Decision Maker to Knowledge Manager?” In: IEEE Software 33.6 (Nov. 2016), pp. 63–69.

[198] Brian L. Wilcox. “Social support, life stress, and psychological adjustment: A test of the buffering hypothesis”. In: American Journal of Community Psychology 9.4 (1981), pp. 371–386.

[199] E. Woods. “Software Architecture in a Changing World”. In: IEEE Software 33.6 (Nov. 2016), pp. 94–97.

[200] Hai Zhuge. The Knowledge Grid: Toward Cyber-Physical Society. 2nd ed. River Edge, NJ, USA: World Scientific Publishing Co., Inc., 2012.

[201] Olaf Zimmermann. “Architectural Decisions as Reusable Design Assets”. In: IEEE Software 28.1 (Jan. 2011), pp. 64–69.

[202] Olaf Zimmermann, Uwe Zdun, Thomas Gschwind, and Frank Leymann. “Combining Pattern Languages and Reusable Architectural Decision Models into a Comprehensive and Comprehensible Design Method”. In: Proceedings of the seventh Working IEEE/IFIP Conference on Software Architecture (WICSA). Los Alamitos, CA, USA: IEEE Computer Society, 2008, pp. 157–166.
