
Enterprise Architecture Design and the Integrated Architecture Framework
Andrew Macaulay, CGE&Y
pp 04 – 09

Understanding Service Oriented Architecture
David Sprott and Lawrence Wilkes, CBDI Forum
pp 10 – 17

Business Process Decomposition and Service Identification using Communication Patterns
Gerke Geurts and Adrie Geelhoed, LogicaCMG
pp 18 – 27

Metadata-driven Application Design and Development
Kevin S Perera, Temenos
pp 28 – 38

Best Practice for Rule-Based Application Development
Dennis Merritt, Amzi! Inc.
pp 39 – 48

DasBlog: Notes from Building a Distributed .NET Collaboration System
Clemens Vasters, newtelligence AG
pp 49 – 58

Dear Architect

Adapting to change has always been important to business success, and today change is more rapid than ever. Companies merge, markets shift, competitors develop new products, and customers demand change, all with unprecedented speed. In today’s business environment, staying ahead means continuously adapting to change – and making change work in your favour. To do that, business leaders need IT systems that support and enable their strategic decisions. There is a growing consensus in the industry that the way to create this kind of adaptive, agile IT architecture is through Web services – discrete units of software, based on industry-standard protocols, that interoperate across platforms and programming languages.

Whether or not you build IT systems and Web service-based connectivity using Microsoft® Windows™ and Microsoft® .NET, you still need to connect together a broad range of personal and business technologies. These enable you to access and use important information, whenever and wherever it is needed. The final result you desire is an integrated, cost-effective IT architecture that empowers your business. Information once isolated in back-end systems is now available to all your employees via streamlined and automated processes that can span multiple systems.

It gives me great pleasure to introduce the very first issue of the new ‘Microsoft Architects Journal’ – a platform where authoritative software architects from all corners of Microsoft’s architect community will discuss the connection between opportunities once out of reach and the solutions that now make them possible.

We hope you will enjoy your journal.

Simon Brown
General Manager, Developer and Platform Evangelism Group, Microsoft EMEA

JOURNAL1 MICROSOFT ARCHITECTS JOURNAL JANUARY 2004 A NEW PUBLICATION FOR SOFTWARE ARCHITECTS


Editorial
By Arvindra Sehmi

Dear Architect

Welcome to the inaugural issue of JOURNAL! Software architecture is a tough thing – a vast, interesting and largely unexplored subject area. As an art, it requires intuition and an understanding of well-established architectural disciplines. As an engineering practice, it leads to the formation of system models consisting of parts, with descriptions of their shape and form in terms of properties, relationships and constraints. The rationale for their existence often derives from the system requirements. And of course, everyone has, or wants to say, something about it!

The richness of this topic is one of the reasons we have launched JOURNAL – the ‘Microsoft Architects Journal’. It will be a platform for thought leadership on a wide range of subjects in enterprise application architecture, design and development. Authors will discuss various business and ‘soft’ concerns that play a key role in enterprise systems development. It will provide a unique source of information previously not available through any other Microsoft offering.

The responsibilities and required capabilities of the architect vary, depending on the particular role that is being fulfilled during the enterprise solutions development cycle. Typically, during the early business and IT strategy phases, the architect provides a supporting role. This involvement serves to add value in visioning and scoping, helping to reduce complexity and risks, and ensuring the strategy is viable and feasible. The architect also gains knowledge of the business and its aspirations, which will provide the basis for architecture design. The architect can translate between the technology view and the business view of the strategy.

As the organisation moves into the enterprise and project architecture design phases, the architect role becomes much more significant. The architect is responsible for the structuring, modelling and design of the architecture.

Finally, as the organisation moves through subsequent phases to implement, deploy and maintain the solution, the architect performs a guiding and verification role: ensuring quality and the conformance of the architectural implementation to the design. The architect may also optimise the architecture as the problem domain becomes better understood. In all stages, the architect will perform as part of the assignment team, and frequently a team will fulfil the architecture role. An indicative list of typical architect responsibilities is:

– Support business visioning and scoping activities
– Translate between business and IT requirements
– Communicate with stakeholders, both within business and IT
– Weigh different interests
– Determine solution alternatives
– Create a viable and feasible design
– Choose solutions
– Manage quality
– Manage complexity
– Mitigate risks
– Communicate

This requires a diverse range of skills, including knowledge of architecture design, workgroup and communication skills, and consultancy skills. In this issue of JOURNAL, we reflect the wide variety of architectural issues in today’s software industry – from Service-Oriented Architecture and best practices for rule-based systems to the role of blogging in an enterprise solution. Our authors come from organizations throughout the world, and each offers a unique perspective on software architecture and design.

Andrew Macaulay, a technical architect at Cap Gemini Ernst & Young, writes about enterprise architecture design and the Integrated Architecture Framework, and describes a model for enterprise architecture and its importance in helping software architects understand the business as a whole.

David Sprott and Lawrence Wilkes, analysts at CBDI Forum, provide an insight into Service-Oriented Architecture. In their article they outline some of the principles of architecting solutions with services and emphasize the importance of a service-oriented environment.

Gerke Geurts and Adrie Geelhoed, architects at LogicaCMG, discuss communication patterns and their role in defining business processes and services. They assert that such patterns afford decomposition of business processes into business, informational and infrastructural services and the definition of their dependencies, thus providing a solid basis for enterprise information and application architecture.

Kevin Perera, systems architect with Temenos, presents a pragmatic ‘late-bound’ approach that uses metadata descriptions of artifacts at design time to make the development-time implementation of his applications extremely flexible.

Dennis Merritt, a principal in Amzi! Inc., discusses the problem of encoding logical rules, and argues for a rules-engine-based approach to business automation in which the rules are abstracted from the process.

Clemens Vasters, an executive team member at newtelligence AG and prominent ‘blogger’, expounds the merits of Weblogs as a means of sharing knowledge and ideas. He describes the lessons he learned while designing and implementing ‘dasBlog’ – a Web service-based Weblog engine built using Microsoft .NET technologies.

I’m certain you’ll find something of interest and value in JOURNAL. We’ll be keeping you updated with additional information at http://msdn.microsoft.com/architecture/journal, where you’ll also be able to download the articles for your added convenience.

Finally, if you have an idea for an article you’d like to submit for a future issue of JOURNAL, please send a brief outline and resume to me – [email protected].

Please also e-mail me any comments on JOURNAL.

Arvindra Sehmi
Architect, Developer and Platform Evangelism Group, Microsoft EMEA

Enterprise Architecture Design and the Integrated Architecture Framework
By Andrew Macaulay, CGE&Y

Enterprise Architecture in context

Over the past few years, as software and systems engineering has matured, it has become accepted that there is a clear need for an ‘architectural view’ of systems. This need has grown as a result of the increasing complexity of systems and their interactions within and between businesses. Furthermore, continued pressure to reduce IT costs and deliver real, quantifiable business benefit from solutions necessitates a clear understanding of how systems support and enable the business.

The ‘architectural view’ of systems (both business and IT systems) is defined in the ANSI/IEEE Standard 1471-2000 as ‘the fundamental organization of a system, embodied in its components, their relationships to each other and the environment, and the principles governing its design and evolution’. Further to this high-level definition, and in the same way as there are different levels of architecture within building (city plans, zoning plans and building plans), it is important to classify business and IT architecture into a number of different levels:

Enterprise Architecture – defining the overall form and function of systems (business and IT) across an enterprise (including partners and organisations forming the extended enterprise), and providing a framework, standards and guidelines for project-level architectures. The vision provided by the Enterprise Architecture allows the development of consistent and appropriate systems across the enterprise, with the ability to work together, collaborate or integrate where and when required.

Project-Level Architecture – defines the form and function of the systems (business and IT) in a project or programme, within the context of the enterprise as a whole and not just the individual systems in isolation. This project-level architecture will refine, conform to and work within the defined Enterprise Architecture.

Application Architecture – defines the form and function of the applications that will be developed to deliver the required functionality of the system. Some of this architecture may be defined in the Enterprise and Project-level Architectures (as standards and guidelines) to ensure best practice and conformance to the overall architecture.

When considering how organisations typically manage business change and IT enablement, traditional approaches to strategic business change use a top-down view of the business in terms of its people and processes. However, traditional software and systems engineering approaches tend to focus on identifying and delivering the specific functionality required to automate a task or activity. Less importance is attached to how the resulting system will interact with other systems and the rest of the business in order to deliver wider business benefit. As a result, there is often a gap between the high-level vision and structure of the business and the systems implemented to support it (in other words, the alignment between business and IT is poor).

“Over the past few years, and as software and systems engineering has matured, it has become accepted that there is a clear need for an ‘architectural view’ of systems.”

[Figure 1. Business and Systems Alignment – a diagram relating the Business Vision and Strategy (shaped by business drivers) to the Business Architecture (information, organisation and process) and, through the IS and IT Strategy, to the Enterprise and Technical Architecture; these provide a framework, standards and guidance for development projects (project architectures), which support and enable the business and are maintained and evolved through a change programme.]

To bridge this gap, many organisations are developing an enterprise architecture to provide a clear and holistic vision of how systems (both manual and automated) will support and enable their business. An effective enterprise architecture comprises a comprehensive view of the business, including its drivers, vision and strategy; the organisation and services required to deliver this vision and strategy; and the information, systems and technology required for the effective delivery of these services (see Figure 1).

Defining an enterprise architecture is complex, because it encompasses the systems within the context of the whole enterprise. To simplify this, an enterprise architecture is typically structured by considering a business or system as a series of components (or services) with inter-relationships, without having to consider the detailed design within the individual components.

Both the components and their inter-relationships must be viewed in terms of the services that they provide, and the characteristics (such as security, scalability, performance and integration) required of those services. These components can then be grouped by service characteristics, distribution and other business-driven aspects, as well as functionality.
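As a hypothetical sketch of this component-and-service view (all component and characteristic names below are invented for illustration, not taken from the article), components can be modelled with their services and required characteristics, and then grouped by shared characteristics:

```python
# Sketch: model enterprise components by the services they provide and the
# characteristics required of those services, then group by characteristic.
# Every name here is an invented example, not part of any real framework.
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    services: list[str]
    characteristics: set[str] = field(default_factory=set)


def group_by_characteristic(components: list[Component]) -> dict[str, list[str]]:
    """Group component names by each service characteristic they require."""
    groups: dict[str, list[str]] = {}
    for component in components:
        for characteristic in sorted(component.characteristics):
            groups.setdefault(characteristic, []).append(component.name)
    return groups


# An illustrative portfolio of two components.
portfolio = [
    Component("Customer Portal", ["logon", "order entry"], {"security", "scalability"}),
    Component("Billing", ["invoicing"], {"security", "integration"}),
]
```

Grouping by characteristics rather than by internal design lets the architect reason about, say, everything with a security requirement as one set, without descending into component detail.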

Although an enterprise architecture should ideally be designed using a top-down approach, many organisations have severe budget constraints for strategic IT initiatives which do not readily offer short-term return on investment or quantifiable business benefit. For this reason, many enterprise architectures are initially created as part of an approved large project or programme. Once in place, there are opportunities to refine the architecture further and to start demonstrating benefit and value by providing standards and guidelines for subsequent projects.

Cap Gemini Ernst & Young’s Integrated Architecture Framework

Cap Gemini Ernst & Young has, over the past 10 years, developed an approach to the analysis and development of enterprise and project-level architectures known as the Integrated Architecture Framework (IAF). This approach, now in its third major revision, has been developed at a global level, based on the experience of Cap Gemini Ernst & Young architects on real projects, together with a formal review process including academics. IAF has been successfully used on many hundreds of engagements, both large and small, across the globe.

IAF breaks down the overall problem into a number of related aspect areas covering Business (people and process), Information (including knowledge), Information Systems, and Technology Infrastructure, with two specialist areas addressing the Governance and Security aspects across all of these. Analysis of each of these areas is structured into four levels of abstraction: Contextual, Conceptual, Logical and Physical, as shown in Figure 2.

[Figure 2. Integrated Architecture Framework – a matrix of aspect areas (Business, Information, Information Systems and Technology Infrastructure, with Security and Governance cutting across all of them) against four levels of abstraction. Contextual answers the questions ‘why do we need an architecture?’ and ‘what is the overall context?’; Conceptual answers ‘what are the requirements and what is the vision of a solution?’; Logical answers ‘how are these requirements to be met?’; Physical answers ‘with what is the solution built?’]
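The aspect-area/abstraction-level matrix above can be sketched as a simple data model. This is purely an illustrative representation; the Python names are invented and not part of IAF itself:

```python
# Sketch of the IAF structure described in the text: four aspect areas,
# each analysed at four levels of abstraction, with Security and
# Governance cutting across every area. Names are illustrative only.

ASPECT_AREAS = [
    "Business",                  # people and process
    "Information",               # including knowledge
    "Information Systems",
    "Technology Infrastructure",
]

CROSS_CUTTING = ["Security", "Governance"]  # apply across all aspect areas

# Each abstraction level answers a guiding question.
ABSTRACTION_LEVELS = {
    "Contextual": "Why do we need an architecture, and what is the overall context?",
    "Conceptual": "What are the requirements, and what is the vision of a solution?",
    "Logical": "How are these requirements to be met?",
    "Physical": "With what is the solution built?",
}


def framework_cells() -> list[tuple[str, str]]:
    """Enumerate every (aspect area, abstraction level) analysis cell."""
    return [(area, level) for area in ASPECT_AREAS for level in ABSTRACTION_LEVELS]
```

Enumerating the cells makes the point in the text concrete: an IAF analysis is a grid of aspect areas against abstraction levels, and a given engagement may populate only the cells it needs.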


This approach allows the pragmatic deployment of the framework in many different scenarios, both by using only the relevant parts of the framework and by supporting iterative working across the streams. This flexibility minimises the traditional effects of a waterfall approach and ensures coherency across the aspect areas. For example, a project architecture using IAF will, in many cases, only need to use enough of the Business and Information aspect areas to provide the overall context for the project. An enterprise architecture will concentrate mainly on the contextual, conceptual and logical levels.

The Contextual level brings together the business and other drivers, vision and strategy, and their resulting priorities into a set of principles, all of which are described with their implications and priorities. This comprehensive set of statements is then used in a consistent manner in the decision-making process, providing traceability back to the original business drivers, strategy and vision, and demonstrating the required business-systems alignment.

Although much of the work done at this stage is concerned with data gathering, the importance of this stage cannot be overstressed. It will provide the basis for the entire architecture design by creating and documenting an understanding of the scope from an overall business perspective.

The Conceptual level details the services and the interactions between these services in support of the principles defined in the Contextual level. As the models defined in the Conceptual level are service-based (that is, they do not detail specific products or standards), they remain stable unless the business itself fundamentally changes its vision and objectives, providing a solid foundation from which the logical architecture can be derived. Key decisions taken at this level include:

– Which areas of the business should IT support?
– Which overall business architecture (e.g. moving to a front-office, mid-office, back-office model) will be used?
– How will systems reflect the organisation/business architecture – the level to which departmental systems are consolidated into a suite of core applications, or are allowed departmental flexibility with a central integration service?

The Logical level describes the solution as product-independent services or components, and includes a clear definition of the integration and collaboration contracts between these services or components. By remaining independent of products, this level of the architecture can remain relatively stable. It can change to reflect top-down changes, including new fundamental business models (for example, a move towards a customer-centric model), as well as bottom-up changes, such as opportunities for technology enablement (such as CRM) or fundamental changes in technology paradigm (for example, Services Architecture or Grid). Through this, the impact of change at the business or technology level can be assessed in a clear and consistent manner.

Key decisions taken at this level include:

– How should the logical components be grouped (for example, providing a multi-channel customer logon service with separate channel-specific/optimised authentication components alongside a common authentication support component using a central directory)?
– How will logical components be shared across systems (these components could be presented as Web Services)?
– How will a central integration hub support the various business systems, or how will collaboration tools be used alongside database applications and integration to support virtual teams?

Typically, at the logical level, there might be more than one way to approach the solution (reflecting the various drivers, for example cost, flexibility, security and manageability). The key decision at this level is then to select (with the business) the solution alternative that delivers the services required, in a way that best addresses the guiding principles.
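A minimal sketch of the product-independent service idea discussed above: the logical-level contract stays stable while physical-level product choices vary behind it. The service and class names here are hypothetical illustrations, not taken from IAF or the article's examples:

```python
# Sketch: a logical-level service contract with interchangeable
# physical-level implementations. All names are invented illustrations.
from abc import ABC, abstractmethod


class AuthenticationService(ABC):
    """Logical level: what the service does, not which product does it."""

    @abstractmethod
    def authenticate(self, user: str, credential: str) -> bool:
        ...


class CentralDirectoryAuth(AuthenticationService):
    """One physical choice: look users up in a central directory."""

    def __init__(self, directory: dict[str, str]):
        self._directory = directory  # stand-in for a real directory product

    def authenticate(self, user: str, credential: str) -> bool:
        return self._directory.get(user) == credential


class ChannelOptimisedAuth(AuthenticationService):
    """A channel-specific component delegating to a shared one."""

    def __init__(self, delegate: AuthenticationService):
        self._delegate = delegate

    def authenticate(self, user: str, credential: str) -> bool:
        # Channel-specific normalisation, then delegate to the common service.
        return self._delegate.authenticate(user.lower(), credential)
```

Because consumers depend only on `AuthenticationService`, swapping the directory product, or presenting the component as a Web Service, changes the physical level without disturbing the logical architecture.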

The Physical level details the design principles, standards and guidelines, including component groupings in critical areas as well as deployment models. This provides the framework within which the detailed design can be undertaken, as well as selection criteria (not functional specifications) for products to be either developed or purchased.

It is at this level that solution frameworks and architectures such as the Microsoft Systems Architecture (MSA) can be used to accelerate development of the physical architecture, improve the quality of the architecture (by using proven solutions) and reduce project risks.

Examples of key decisions taken at this level are:

– Which physical components will be part of a package solution (e.g. using physical components from the ERP solution)?
– What additional components will be required around this package, and what are the standards and guidelines for developing these components (e.g. language, tools, etc.)?
– What are the standards and detailed product selection criteria for the infrastructure products to be deployed? This leads on to a candidate list for selection; if there is a clear product or vendor strategy, this candidate list becomes the product standards to be taken forward.

Communication and Architecture Governance

Because of the large number of potential stakeholders, the enterprise architecture needs to be communicated at many different levels, using the appropriate visual representations and language. For senior management, the architecture must show how the business goals and drivers will be supported, and how benefit can be derived. The focus of business users is more on their own individual business areas and is more functionally biased. IT management and staff will want to focus on the technical components, including how they will be able to provide the required levels of support. Project-level architects will be concerned with the standards and guidelines which will provide re-use opportunities or impose constraints on their individual designs.

Programmes and projects must conform to the enterprise architecture to ensure that business benefits can be realised, and that the systems and software engineering activities can benefit from the analysis already done in defining the enterprise architecture. Furthermore, as with all architectures, the enterprise architecture requires ongoing maintenance, especially around the more physical areas such as technical standards. This ensures that the enterprise architecture remains valid and relevant to the business as it changes. The enterprise architecture should be under the control of an enterprise-wide governance function that ensures its maintenance and verifies the ongoing conformance of systems. Even where, for clear and justified business reasons, conformance is not possible, this function is able to make sure that the business understands the real costs of implementing non-conformant systems, for example increased running costs or lack of future flexibility.
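One way such a governance function might verify ongoing conformance is to check a project's proposed products against the enterprise standards and report deviations for justification. This is a hedged sketch with invented standards and categories, not a tool the article describes:

```python
# Sketch: automated conformance reporting for an architecture governance
# function. The standards, categories and products are all invented.

ENTERPRISE_STANDARDS: dict[str, set[str]] = {
    "database": {"SQL Server", "Oracle"},
    "integration": {"BizTalk"},
}


def conformance_report(proposal: dict[str, str]) -> list[str]:
    """Return a list of non-conformances; an empty list means conformant.

    Categories without an enterprise standard are ignored, mirroring the
    idea that governance constrains only the areas it has standardised.
    """
    issues: list[str] = []
    for category, product in proposal.items():
        approved = ENTERPRISE_STANDARDS.get(category)
        if approved is not None and product not in approved:
            issues.append(
                f"{category}: '{product}' is not an approved standard "
                f"(approved: {', '.join(sorted(approved))}); justify or change."
            )
    return issues
```

Reporting deviations rather than rejecting them outright matches the text: justified non-conformance is allowed, but its real cost must be made visible to the business.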

Linking Architecture and Design

As the use of the term ‘architecture’ has grown within business and IT, there have been many areas of confusion: for example, in many organisations, architecture and design are seen as being the same thing. It is Cap Gemini Ernst & Young’s view that this is not the case, because design and architecture offer different and complementary perspectives on the solution. In fact, Cap Gemini Ernst & Young uses IAF and the Rational Unified Process (RUP, a software development approach) on projects to deliver a complete design approach from architecture to code. Table 1 shows the comparison between architecture (IAF) and design (RUP, including application architecture).

Table 1. Comparing Architecture and Design

Architecture delivers:
– Non-functional requirements
– Functional scope and responsibilities (who does what)
– Key design and product choices
– High-level design
– Design constraints

Architecture does not deliver:
– Prototypes
– Comprehensive functional requirements
– Detailed data analysis
– Built and implemented systems

Design delivers:
– Functional requirements and how they will be met
– Detailed data analysis and data model as necessary
– System design documentation
– Built and implemented systems

Design does not deliver:
– Solution vision
– Comprehensive and traceable non-functional requirements
– Security and governance architectures

“Programmes and projects must conform to the enterprise architecture to ensure that business benefits can be realised.”

As with the architecture of buildings, software architecture and (detailed) design are, in fact, part of an overall ‘design continuum’ required to deliver a complete solution. Aligning the IAF and RUP processes provides a comprehensive and consistent framework in support of this. Furthermore, the enterprise architecture will provide much of the context and other inputs required by the project architecture, whilst the project architecture will cater for the unique requirements of the solution. In projects with significant potential risk, especially on complex or large projects, the use of the architectural approach at project level will mitigate many of the risks by ensuring that there is a clear and holistic understanding of the overall context of the solution, including external systems and drivers that may affect the solution.

With IAF, a project-level technical architecture uses the same basic approach as the enterprise architecture, albeit with different levels of detail, with more focus on the logical and physical levels of the information, information systems, technology infrastructure, governance and security aspects (using the context and business architecture defined in the enterprise architecture).

As a result, the output from the enterprise architecture can be used directly as the input into the project-level technical architecture. From the project-specific technical architecture, the detailed and specific design principles, guidelines, standards and constraints, which then guide the detailed software and systems engineering design activities, can be derived. In the case of IAF, the output from the project-level technical architecture can be mapped onto RUP design artefacts such as business use cases, system use cases and non-functional specifications. These mappings are typically not one-to-one, but they do provide traceability from the architecture through to the physical detailed design, as well as helping to accelerate the overall design process from architecture through to delivered systems. This helps accelerate the design/development effort whilst continuing to mitigate project risks.

[Figure 3. IAF and RUP – the RUP phases (Inception, Elaboration, Construction, Transition) and disciplines (Business Modelling, Requirements, Analysis & Design, Implementation, Test, Deployment, Configuration & Change Management, Environment, Project Management) shown alongside the IAF levels (Contextual, Conceptual, Logical and Physical Architecture).]
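The architecture-to-design traceability described above can be sketched as a many-to-many mapping from architecture outputs to RUP artefact types. The artefact names follow the article; the specific pairings below are illustrative assumptions, not the actual IAF-to-RUP mapping:

```python
# Sketch: many-to-many traceability from project-level architecture
# outputs to RUP design artefacts. Pairings are invented for illustration,
# which is why this is a list of pairs rather than a one-to-one dict.

IAF_TO_RUP: list[tuple[str, str]] = [
    ("business architecture context", "business use case"),
    ("logical component definition", "system use case"),
    ("logical component definition", "non-functional specification"),
    ("design constraints", "non-functional specification"),
]


def rup_artefacts_for(iaf_output: str) -> list[str]:
    """All RUP artefact types traceable from one architecture output."""
    return [rup for iaf, rup in IAF_TO_RUP if iaf == iaf_output]
```

Even a simple structure like this makes impact analysis mechanical: when an architecture output changes, the affected design artefacts can be enumerated rather than rediscovered.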

Summary

Enterprise architectures are becoming more important today as the level of complexity and inter-operation between systems and businesses increases, and as there is even more need for business-system alignment and cost-effective use of IT to deliver business benefit. Enterprise architecture (and project-level technical architecture) provides valuable input into application architecture and detailed design by helping architects understand the business as a whole and by placing the solution being designed into the overall business and technical context within which the project is being delivered.

The key objectives of an enterprise architecture are to understand:

– the relevant parts of the whole business, in context (including external partners)
– the end-to-end processes (including external processes/actors)
– the non-functional requirements (including security and governance)

which results in a solution that:

– supports the non-functional requirements; this may need specific component organization to support, for example, specific cross-domain security requirements, or a service-based approach to provide the required flexibility
– is seen in the context of the whole business and end-to-end processes; for example, service-level objectives may exist for overall transaction times that span more than one business – understanding the limitations of this allows these measures to be refined
– links to, and is traceable to, the business principles, so that the impact of changes, some of which may result from the design stage, can be evaluated in business terms
– drives, contextualises and constrains the design; the design for the application or infrastructure will need to be governed by the architecture in order to fully deliver the complete solution, including the non-functional requirements
– is clearly scoped and understood, and clearly defines the responsibilities of each element

and provides the rationale:

– for decisions, standards and product selections that support the business goals and drivers.

Further Reading
Cap Gemini Ernst & Young Technology Services
http://www.cgey.com/technology

Services Architecture
http://www.cgey.com/technology/sa/index.shtml

Adaptive IT
http://www.cgey.com/adaptive/solutions/adaptive-it-overview.shtml

Andrew Macaulay
[email protected]

Andrew Macaulay joined Cap Gemini Ernst & Young as a Technical Architect in 1993 following ten years as a Technical Consultant. He has been an Architect and Technical Consultant on many major engagements, at both the Enterprise and the Project level, and has been instrumental in developing, delivering and training Cap Gemini Ernst & Young's approach to architecture, the Integrated Architecture Framework.


JOURNAL1 | Understanding SOA 10

If there were a hit parade of IT acronyms, Service Oriented Architecture, or SOA, would surely be number one. Yet for all the media comment, how many really understand what SOA is? How does it affect what architects, CIOs, project managers, business analysts and lead developers do? In this article we provide a concise explanation that we anticipate will baseline the subject.

Introduction
It seems probable that eventually most software capabilities will be delivered and consumed as services. Of course they may be implemented as tightly coupled systems, but the point of usage – to the portal, to the device, to another endpoint, and so on – will use a service-based interface. We have seen the comment that architects and designers need to be cautious to avoid everything becoming a service. We think this is incorrect and muddled thinking. It might be valid right now, given the maturity of Web Service protocols and technology, to question whether everything should be implemented using Web services, but that doesn't detract from the need to design everything from a service perspective. The service is the major construct for publishing and should be used at the point of each significant interface. Service Oriented Architecture allows us to manage the usage (delivery, acquisition, consumption, and so on) in terms of, and in sets of, related services. This will have big implications for how we manage the software life cycle – right from specification of requirements as services, through design of services, acquisition and outsourcing as services, asset management of services, and so on.

Over time, the level of abstraction at which functionality is specified, published and consumed has gradually become higher and higher. We have progressed from modules, to objects, to components, and now to services. However, in many respects the naming of SOA is unfortunate. Whilst SOA is of course about architecture, it is impossible to constrain the discussion to architecture, because matters such as business design and the delivery process are also important considerations. A more useful nomenclature might be Service Orientation (or SO). There are actually a number of parallels with object orientation (OO) and component-based development (CBD):

– Like objects and components, services represent natural building blocks that allow us to organize capabilities in ways that are familiar to us.
– Similarly to objects and components, a service is a fundamental building block that:
  a. Combines information and behaviour.
  b. Hides the internal workings from outside intrusion.
  c. Presents a relatively simple interface to the rest of the organism.
– Where objects use abstract data types and data abstraction, services can provide a similar level of adaptability through aspect or context orientation.
– Where objects and components can be organized in class or service hierarchies with inherited behaviour, services can be published and consumed singly or as hierarchies and/or collaborations.

For many organizations, the logical starting place for investigating Service Oriented Architecture is the consideration of Web services. However, Web services are not inherently service oriented. A Web service merely exposes a capability which conforms to Web services protocols. In this article we will identify the characteristics of a well-formed service, and provide guidance for architects and designers on how to deliver service oriented applications.

Principles and Definitions
Looking around, we see the term or acronym SOA becoming widely used, but there's not a lot of precision in the way that it's used. The World Wide Web Consortium (W3C), for example, refers to SOA as 'a set of components which can be invoked, and whose interface descriptions can be published and discovered'. We see similar definitions being used elsewhere; it's a very technical perspective, in which architecture is considered a technical implementation. This is odd, because the term architecture is more generally used to describe a style or set of practices – the style in which something is designed and constructed, for example Georgian buildings, Art Nouveau decoration, or a garden by Sir Edwin Lutyens and Gertrude Jekyll.

CBDI believes a wider definition of Service Oriented Architecture is required. In order to reach this definition, let's start with some existing definitions, and compare some W3C offerings with CBDI recommendations. We'll begin by looking at definitions of basic Service concepts.

Understanding Service Oriented Architecture
By David Sprott and Lawrence Wilkes, CBDI Forum

Service
– A Component capable of performing a task. A WSDL service: a collection of end points (W3C).
– A type of capability described using WSDL (CBDI).

A Service Definition
– A vehicle by which a consumer's need or want is satisfied according to a negotiated contract (implied or explicit) which includes Service Agreement, Function Offered, and so on (CBDI).

A Service Fulfilment
– An instance of a capability execution (CBDI).

Web service
– A software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a format that machines can process (specifically WSDL). Other systems interact with the Web service in a manner prescribed by its description, using SOAP messages typically conveyed using HTTP with XML serialization in conjunction with other Web-related standards (W3C).
– A programmatic interface to a capability that is in conformance with WSnn protocols (CBDI).

From these definitions, it will be clear that the W3C have adopted a somewhat narrower approach to defining services and other related artefacts than CBDI. CBDI differs slightly insofar as not all Services are Components, nor do they all perform a task. Also, CBDI recommends it is useful to manage the type, definition and fulfilment as separate items. However, it is in the definition of SOA that CBDI really parts company with the W3C.
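CBDI's suggestion that the service type, its definition (the negotiated contract) and each fulfilment (an execution instance) be managed as separate items could be sketched as follows. This is a minimal illustration only: the class and field names are invented for this example and are not taken from any CBDI or W3C specification.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Illustrative names only; not part of any CBDI or W3C specification.

@dataclass(frozen=True)
class ServiceType:
    """A type of capability, e.g. as described by a WSDL portType."""
    name: str
    operations: tuple

@dataclass(frozen=True)
class ServiceDefinition:
    """The negotiated contract: service agreement plus function offered."""
    service_type: ServiceType
    service_agreement: str   # e.g. availability or response-time commitments
    function_offered: str

@dataclass
class ServiceFulfilment:
    """One instance of a capability execution under a given definition."""
    definition: ServiceDefinition
    invoked_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    succeeded: bool = False

# One type can back many definitions (contracts), and each definition can
# have many fulfilments - which is why managing them separately is useful.
pricing = ServiceType("PriceLookup", ("getPrice",))
retail_contract = ServiceDefinition(pricing, "99.5% availability", "retail price lookup")
call = ServiceFulfilment(retail_contract)
call.succeeded = True
```

The point of the separation is that the contract can be renegotiated, and individual executions monitored, without touching the underlying capability type.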

Service Oriented Architecture
– A set of components which can be invoked, and whose interface descriptions can be published and discovered (W3C).

CBDI rejects this definition on two counts: first, the components (or implementations) will often not be a set; second, the W3C definition of architecture only considers the implemented and deployed components, rather than the science, art or practice of building the architecture. CBDI recommends SOA is more usefully defined as:

The policies, practices and frameworks that enable application functionality to be provided and consumed as sets of services, published at a granularity relevant to the service consumer. Services can be invoked, published and discovered, and are abstracted away from the implementation using a single, standards-based form of interface. (CBDI)

CBDI defines SOA as a style resulting from the use of particular policies, practices and frameworks that deliver services conforming to certain norms – for example, a certain granularity, independence from the implementation, and standards compliance. What these definitions highlight is that any form of service can be exposed with a Web services interface. However, higher-order qualities such as reusability and independence from implementation will only be achieved by employing some science in a design and building process that is explicitly directed at incremental objectives beyond the basic interoperability enabled by the use of Web services.

SOA Basics
It would be easy to conclude that the move to Service Orientation really commenced with Web services – about three years ago. However, Web services were merely a step along a much longer road. The notion of a service is an integral part of component thinking, and it is clear that distributed architectures were early attempts to implement Service Oriented Architecture. What's important to recognize is that Web services are part of the wider picture that is SOA. The Web service is the programmatic interface to a capability that is in conformance with WSnn protocols. So Web services provide us with certain architectural characteristics and benefits – specifically platform independence, loose coupling, self-description, and discovery – and they can enable a formal separation between the provider and consumer because of the formality of the interface.

Service is the important concept. Web Services are the set of protocols by which Services can be published, discovered and used in a technology-neutral, standard form.

In fact, Web services are not a mandatory component of a SOA, although increasingly they will become so. SOA is potentially much wider in its scope than simply defining service implementation, addressing the quality of the service from the perspective of both the provider and the consumer. You can draw a parallel with CBD and component technologies. COM and UML component packaging address




components from the technology perspective, but CBD, or indeed Component Based Software Engineering (CBSE), is the discipline by which you ensure you are building components that are aligned with the business. In the same way, Web services are purely the implementation. SOA is the approach, not just the service equivalent of a UML component packaging diagram.

Many of these SOA characteristics were illustrated in a recent CBDI report1, which compared Web services published by two dotcom companies as alternatives to their normal browser-based access, enabling users to incorporate the functionality offered into their own applications. In one case it was immediately obvious that the Web services were meaningful business services – for example, enabling the Service Consumer to retrieve prices, generate lists, or add an item to the shopping cart. In contrast, the other organization's services were quite different. It implemented a general-purpose API, which simply provides Create, Read, Update, and Delete (CRUD) access to their database through Web services. While there is nothing at all wrong with this implementation, it requires that users understand the underlying model and comply with the business rules to ensure that data integrity is protected. The WSDL tells you nothing about the business or the entities. This is an example of Web services without SOA.
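The contrast between the two dotcoms' approaches might be sketched as below. The class and method names are invented for illustration; neither company's actual API is reproduced here.

```python
# Illustrative sketch only; all names are invented for this example.

class CrudCatalogApi:
    """Web services without SOA: a thin CRUD layer over the database.
    Callers must know the schema and enforce business rules themselves."""
    def __init__(self):
        self.rows = {}  # table rows keyed by primary key
    def create(self, key, row):
        self.rows[key] = row
    def read(self, key):
        return self.rows[key]
    def update(self, key, row):
        self.rows[key] = row
    def delete(self, key):
        del self.rows[key]

class ShoppingService:
    """A meaningful business service: operations are named in business
    terms, and the rules are enforced behind the interface."""
    def __init__(self, prices):
        self._prices = prices
        self._cart = []
    def retrieve_price(self, item):
        return self._prices[item]
    def add_item_to_cart(self, item):
        if item not in self._prices:  # business rule kept inside the service
            raise ValueError(f"unknown item: {item}")
        self._cart.append(item)
    def list_cart(self):
        return list(self._cart)

svc = ShoppingService({"book": 12.50, "pen": 1.20})
svc.add_item_to_cart("book")
```

A consumer of `ShoppingService` needs no knowledge of the underlying model; a consumer of `CrudCatalogApi` must understand the schema and re-implement the rules itself.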

SOA is not just an architecture of services seen from a technology perspective, but the policies, practices, and frameworks by which we ensure the right services are provided and consumed.

So what we need is a framework for understanding what constitutes a good service. If, as we have seen in the previous example, we have varying levels of usefulness, we need some Principles of Service Orientation that allow us to set policies, benchmarks and so on.

We can discern two obvious sets here:
– Interface-related principles – technology neutrality, standardization and consumability.
– Design principles – these are more about achieving quality services, meeting real business needs, and making services easy to use, inherently adaptable, and easy to manage.

Interestingly, the second set might have been addressed to some extent by organizations that have established mature component architectures. However, it's certainly our experience that most organizations have found this level of discipline hard to justify. While high-quality components have been created, perhaps for certain core applications where there is a clear case for widespread sharing and reuse, more generally it has been hard to incur what has been perceived as an investment cost with a short-term return on investment.

However, when the same principles are applied to services, there is now much greater awareness of the requirements, and frankly business and IT management have undergone a steep learning curve to better understand the costs and benefits of IT systems that are not designed for purpose. Here we have to be clear: not all services need all of these characteristics. However, it is important that if a service is to be used by multiple consumers (as is typically the case when a SOA is required), the specification needs to be generalized, the service needs to be abstracted from the implementation (as in the earlier dotcom case study), and developers of consumer applications shouldn't need to know about the underlying model and rules. The specification of obligations that client applications must meet needs to be formally defined and precise, and the service must be offered at a relevant level of granularity that combines appropriate flexibility with ease of assembly into the business process.

Table 1 shows principles of good service design that are enabled by characteristics of either Web services or SOA.

If the principles summarized in Table 1 are complied with, we get some interesting benefits:
– There is real synchronization between the business and IT implementation perspectives. For many years, business people haven't really understood the IT architecture. With well-designed services we can radically improve communications with the business, and indeed move beyond alignment and seriously consider convergence of business and IT processes.
– A well-formed service provides us with a unit of management that relates to business usage. Enforced separation of the service

1 Service Based Packaged Applications, http://www.cbdiforum.com/secure/interact/2003-07/service_based_pkd_apps.php3



provision provides us with a basis for understanding the life-cycle costs of a service and how it is used in the business.

– When the service is abstracted from the implementation, it is possible to consider various alternative options for delivery and collaboration models. No one expects that, at any stage in the foreseeable future, core enterprise applications will be acquired purely by assembling services from multiple sources. However, it is entirely realistic to assume that certain services will be acquired from external sources because it is more appropriate to acquire them. Authentication services are a good example of third-party commodity services that can deliver a superior service because of specialization, and because of the benefits of using a trusted external agency to improve authentication.

Process Matters
As indicated earlier, CBDI advises that good SOA is all about style – policy, practice and frameworks. This makes process matters an essential consideration.

Whilst some of the benefits of services might have been achieved by some organizations using components, there are relatively few organizations that rigorously enforce the separation of provision and consumption throughout the process. This gets easier with services because of the formality of the interface protocols, but we need to recognize that this separation needs managing. For example, it's all too easy to separate the build processes of the service and the consumer, but if the consumer is being developed by the same team as the service then it's all too easy to test the services in a manner that reflects understanding of the underlying implementation.

With SOA it is critical to implement processes that ensure that there are at least two different and separate processes – for provider and consumer.

However, current user requirements for seamless end-to-end business processes, a key driver for using Web Services, mean that there will often be clear separation between the providing and consuming organizations, and potentially many-to-many relationships where each participant has different objectives but

Table 1. Web services and SOA

Enabled by Web services:
– Technology neutral: endpoint platform independence.
– Standardized: standards-based protocols.
– Consumable: enabling automated discovery and usage.

Enabled by SOA:
– Reusable: use of the Service, not reuse by copying of code/implementation.
– Abstracted: the Service is abstracted from the implementation.
– Published: precise, published specification of the functionality of the service interface, not the implementation.
– Formal: a formal contract between endpoints places obligations on provider and consumer.
– Relevant: functionality presented at a granularity recognized by the user as a meaningful service.



nevertheless all need to use the same service. Our recommendation is that development organizations behave like this even when both the providing and consuming processes are in-house, to ensure they are properly designing services that accommodate future needs.

For the consumer, the process must be organized such that only the service interface matters, and there must be no dependence upon knowledge of the service implementation. If this can be achieved, considerable benefits of flexibility accrue, because the service designers cannot make any assumptions about consumer behaviours. They have to provide formal specifications and contracts within the bounds of which consumers can use the service in whatever way they see fit. Consumer developers only need to know where the service is, what it does, and how they can use it. The interface is really the only thing of consequence to the consumer, as this defines how the service can be interacted with.
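The consumer-side discipline described above – depending only on the published interface, never on the implementation – can be sketched as follows. The interface and provider names are hypothetical, invented for this illustration.

```python
from typing import Protocol

# Illustrative names only; no real service is implied.

class OrderService(Protocol):
    """The published interface: all the consumer is allowed to know."""
    def place_order(self, item: str, quantity: int) -> str: ...

class InHouseOrderService:
    """One provider implementation; the consumer never sees this class."""
    def __init__(self):
        self._next_id = 1
    def place_order(self, item: str, quantity: int) -> str:
        order_id = f"ORD-{self._next_id}"
        self._next_id += 1
        return order_id

def consumer_checkout(service: OrderService) -> str:
    # Written purely against the interface, so any conforming provider
    # can be substituted without changing this consumer code.
    return service.place_order("widget", 2)

order_id = consumer_checkout(InHouseOrderService())
```

Because `consumer_checkout` knows only the `OrderService` contract, the provider can re-implement, outsource or replace the service without the consumer changing at all.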

Similarly, whilst the provider has a very different set of concerns, it needs to develop and deliver a service that can be used by the Service Consumer in a completely separate process. The focus of attention for the provider is therefore again the interface – the description and the contract. Another way of looking at this is to think about the nature of the collaboration between provider and consumer. At first sight you may think that there is a clear divide between implementation and provisioning, owned by the provider, and consumption, owned by the consumer.

However, if we look at these top-level processes from the perspective of collaborations, then we see a very different picture.

What we have is a significant number of process areas where (depending on the nature of the service) there is deep collaboration between provider and consumer. Potentially we have a major re-engineering of the software delivery process. Although there are two primary parties to the service-based process, we conclude there are three major process areas which we need to manage. Of course these decompose, but it seems to us that the following are the primary top-level processes.

– The process of delivering the service implementation.
  – 'Traditional' development
  – Programming
  – Web services automated by tools

– The provisioning of the service – the life cycle of the service as a reusable artefact.
  – Commercial orientation
  – Internal and external view
  – Service level management

– The consumption process.
  – Business process driven
  – Service Consumer could be internal or external
  – Solution assembly from Services, not code
  – Increasingly graphical, declarative development approach
  – Could be undertaken by business analyst or knowledge worker

The advantage of taking this view is that the collaborative aspects of the process are primarily contained in the provisioning process area. And the provisioning area is incredibly important, because the nature of the agreement has a major influence on the process requirements. There are perhaps two major patterns for designing consumer/provider collaborations:

– Negotiated – consumer and provider jointly agree the service. When new services are developed, there is an opportunity for both provider and consumer to agree what the services should do and how they should work. In industries where there are many participants all dealing with each other, and where services are common to many providers, it is essential that the industry considers standardizing those services. Examples include:
  – Early adopters
  – New services
  – Close partners
  – Industry initiative – forming standards
  – Internal use

– Instantiated – 'this is it, take it or leave it'. One party in the collaborative scenario might simply dictate the services that must be used. Sometimes the service will already exist; you just choose to use it, or not. Examples include:
  – Dominant partner
  – Provider led – use this service or we can't do business
  – Consumer led – provide this service or we can't do business
  – Industry initiative – standards compliance
  – Existing system/interface



Architectures
This process view that we have examined is a prerequisite to thinking about the type of architecture required and the horizons of interest, responsibility and integrity. For SOA there are three important architectural perspectives, as shown in Figure 1.

– The Application Architecture. This is the business-facing solution which consumes services from one or more providers and integrates them into the business processes.
– The Service Architecture. This provides a bridge between the implementations and the consuming applications, creating a logical view of sets of services which are available for use, invoked by a common interface and management architecture.
– The Component Architecture. This describes the various environments supporting the implemented applications, the business objects and their implementations.

These architectures can be viewed from either the consumer or the provider perspective. Key to the architecture is that the consumer of a service should not be interested in the implementation detail of the service – just the service provided. The implementation architecture could vary from provider to provider yet still deliver the same service. Similarly, the provider should not be interested in the application that the service is consumed in. New, unforeseen applications will reuse the same set of services.

The consumer is focused on their application architecture and the services used, but not on the detail of the component architecture. They are interested, at some level of detail, in the general business objects that are of mutual interest – for example, provider and consumer need to share a view of what an order is. But the consumer does not need to know how the order component and database are implemented.

Similarly, the provider is focused on the component architecture and the service architecture, but not on the application architecture. Again, both parties need to understand certain information about the basic applications, for example to be able to set any sequencing rules and pre- and post-conditions. But the provider is not interested in every detail of the consuming application.

The Service Architecture
At the core of the SOA is the need to be able to manage services as first-order deliverables. It is the service, as we have constantly emphasized, that is the key to communication between provider and consumer. So we need a Service Architecture that ensures that services don't get reduced to the status of interfaces; rather, they have an identity of their own and can be managed individually and in sets.

CBDI developed the concept of the Business Service Bus (BSB) precisely to meet this need. The BSB is a logical view of the available and used services for a particular business domain, such as Human Resources or Logistics. It helps us answer questions such as:
– What service do I need?
– What services are available to me?
– What services will operate together? (common semantics, business rules)
– What substitute services are available?

[Figure 1. Three Architectural Perspectives – the Application Architecture, Service Architecture and Component Architecture, shown from both the Consumer and Provider sides, linked by the business process and the business service bus.]



– What are the dependencies between services and versions of services?

Rather than leaving developers to discover individual services and put them into context, the Business Service Bus is instead their starting point, guiding them to a coherent set of services that has been assembled for their domain.

The purpose of the BSB is to allow common specifications, policies, and so on to be made at the bus level, rather than for each individual service. For example, services on a bus should all follow the same semantic standards, adhere to the same security policy, and all point to the same global model of the domain. It also facilitates the implementation of a number of common, lower-level business infrastructure services that can be aggregated into other, higher-level business services on the same bus (for example, they could all use the same product code validation service). Each business domain develops a vocabulary and a business model of both process and object.
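A minimal sketch of the kind of domain-scoped lookup a Business Service Bus enables is shown below. The registry structure and the service names are invented for illustration; CBDI does not define a concrete BSB API.

```python
# Illustrative sketch of a Business Service Bus as a domain-scoped registry.
# All names are invented for this example.

class BusinessServiceBus:
    """Logical view of the services published for one business domain."""
    def __init__(self, domain: str):
        self.domain = domain
        self._services = {}  # service name -> metadata

    def publish(self, name: str, provides: str, version: str, depends_on=()):
        self._services[name] = {
            "provides": provides,
            "version": version,
            "depends_on": tuple(depends_on),
        }

    def available_services(self):
        """Answers: what services are available to me?"""
        return sorted(self._services)

    def find(self, capability: str):
        """Answers: what service do I need for this capability?"""
        return [name for name, meta in self._services.items()
                if capability in meta["provides"]]

    def dependencies(self, name: str):
        """Answers: what are the dependencies between services?"""
        return self._services[name]["depends_on"]

hr_bus = BusinessServiceBus("Human Resources")
hr_bus.publish("EmployeeRecords", "employee master data", "1.0")
hr_bus.publish("Payroll", "salary payment", "2.1", depends_on=("EmployeeRecords",))
```

Because every service on the bus is published with the same metadata shape, bus-level policies (semantics, security, versioning) have one place to attach.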

A key question for the Service Architecture is 'What is the scope of the service that is published to the Business Service Bus?' A simplistic answer is 'at a business level of abstraction'. However, this answer is open to interpretation; it is better to have some heuristics that ensure the service is the lowest common denominator that meets the criteria of the business, and is consumer oriented, agreed, and meaningful to the business. The key point here is that there is a process of aggregation and collaboration that should probably happen separately from the implementing component, as illustrated in Figure 2. By making it separate, there is a level of flexibility that allows the exposed service(s) to be adjusted without modifying the underlying components. In principle, the level of abstraction will be developed such that services are at a level that is relevant and appropriate to the consumer. The level might be one or all of the following:
– Business services
– Service consumer oriented
– Agreed by both provider and consumer
– Combine low-level implementation-based services into something meaningful to the business
– Coarser grained
– Suitable for external use
– Conforms to pre-existing connection design
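The aggregation step just described – combining fine-grained, implementation-based services into a coarser-grained business service – might be sketched like this. The service names and data are hypothetical.

```python
# Hypothetical fine-grained, implementation-based services; in practice
# these might be wrappers over an existing application.

def check_stock(item: str) -> bool:
    return item in {"widget", "gadget"}

def reserve_stock(item: str) -> str:
    return f"RES-{item}"

def price_item(item: str) -> float:
    return {"widget": 9.99, "gadget": 24.50}[item]

def submit_order(item: str, reservation: str, price: float) -> dict:
    return {"item": item, "reservation": reservation, "total": price}

# The coarse-grained business service exposed on the bus. Consumers see
# one meaningful operation; the fine-grained services underneath can be
# changed or re-aggregated without altering this published interface.
def place_order(item: str) -> dict:
    if not check_stock(item):
        raise ValueError(f"{item} is out of stock")
    reservation = reserve_stock(item)
    return submit_order(item, reservation, price_item(item))

order = place_order("widget")
```

The separation means the exposed `place_order` service can be adjusted (say, adding a credit check) without the consumer seeing any of the fine-grained services change.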

The SOA Platform
The key to separation is to define a virtual platform that is equally

[Figure 2. Levels of Abstraction – aggregation combines fine-grained, implementation-based services (existing applications behind service wrappers, infrastructure services, external services) into coarse-grained business services.]

David [email protected] Sprott, CEO and PrincipalAnalyst, CBDI Forum is a software

industry veteran, a well-knowncommentator and analyst specializingin advanced application delivery. Since1997 he has founded and led CBDI, an

independent analyst firm and think-tank, providing a focus for the industryon best practice in business softwarecreation, reuse and management.



relevant to a number of real platforms. The objective of the virtual platform is to enable the separation of services from the implementation to be as complete as possible, and to allow components built on various implementation platforms to offer services which have no implementation dependency.

The virtual SOA platform comprises a blueprint which covers the development and implementation platforms. The blueprint provides guidance on the development and implementation of applications to ensure that the published services conform to the same set of structural principles that are relevant to the management and consumer view of the services.

When a number of different applications can all share the same structure, and where the relationships between the parts of the structure are the same, then we have what might be called a common architectural style. The style may be implemented in various ways; it might be a common technical environment, a set of policies, frameworks or practices. Example components of a virtual platform include:
– Host environment
– Consumer environment
– Middleware
– Integration and assembly environment
– Development environment
– Asset management
– Publishing and discovery
– Service level management
– Security infrastructure
– Monitoring and measurement
– Diagnostics and failure
– Consumer/subscriber management
– Web service protocols
– Identity management
– Certification
– Deployment and versioning

The Enterprise SOA
The optimum implementation architecture for SOA is a component-based architecture. Many will be familiar with the concepts of process and entity components, and will understand the inherent stability and flexibility of this component architecture, which provides a one-to-one mapping between business entities and component implementations. Enterprise SOA (ESOA) brings the two main threads – Web services and CBD (or CBSE) – together. The result is an enterprise SOA that applies both to Web services made available externally and to core business component services built or specified for internal use. It is beyond the scope of this article to explore ESOA in more depth; for more on this topic there is a five-part CBDI report series on Enterprise SOA2.

Summary
The goal for a SOA is a world-wide mesh of collaborating services, which are published and available for invocation on the Service Bus. Adopting SOA is essential to deliver the business agility and IT flexibility promised by Web Services. These benefits are delivered not just by viewing service architecture from a technology perspective and adopting Web Service protocols; they require the creation of a Service Oriented Environment that is based on the following key principles we have articulated in this article:

– Service is the important concept.Web Services are the set of protocolsby which Services can be published,discovered and used in a technologyneutral, standard form.

– SOA is not just an architecture ofservices seen from a technologyperspective, but the policies,practices, and frameworks by whichwe ensure the right services areprovided and consumed.

– With SOA it is critical to implementprocesses that ensure that there are at least two different andseparate processes – for provider and consumer.

– Rather than leaving developers to discover individual services and put them into context, the Business Service Bus is instead their starting point that guides them to a coherent set that has been assembled for their domain.
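The Business Service Bus idea above can be sketched as a domain-scoped catalogue. This is a minimal illustration only; the class, domain and service names are invented for the example and are not part of any CBDI specification:

```python
# Minimal sketch of a Business Service Bus: a domain-scoped catalogue that
# hands developers a coherent set of services for their domain, rather than
# leaving them to discover individual services one by one.
class BusinessServiceBus:
    def __init__(self):
        self._domains = {}          # domain name -> {service name: endpoint}

    def publish(self, domain, service, endpoint):
        self._domains.setdefault(domain, {})[service] = endpoint

    def services_for(self, domain):
        """Return the assembled service set for one business domain."""
        return dict(self._domains.get(domain, {}))

bus = BusinessServiceBus()
bus.publish("finance", "CreditLimitService", "http://example.org/credit")
bus.publish("finance", "PaymentService", "http://example.org/pay")
bus.publish("logistics", "TransportService", "http://example.org/ship")

# A developer working in the finance domain starts from its coherent set:
print(sorted(bus.services_for("finance")))
# ['CreditLimitService', 'PaymentService']
```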

For further guidance on planning and managing the transition to Web Services and SOA, CBDI are providing the 'Web Services Roadmap', a set of resources that are freely available at http://roadmap.cbdiforum.com/

Lawrence Wilkes, [email protected]

Lawrence Wilkes, Technical Director and Principal Analyst, CBDI Forum, is a frequent speaker, lecturer and writer on Web Services, Service Oriented Architecture, Component Based Development and Application Integration approaches. Lawrence has over 25 years experience in IT, working both for end-user organizations in various industries, as well as for software vendors and as a consultant.

2 http://www.cbdiforum.com/secure/interact/2003-03/foundation.php3


JOURNAL1 | Business Process Decomposition 18

Introduction

Organisations use information and communication technology (ICT) as a means to reach their objectives. Paradoxically, many organisations find that yesterday's business ICT solutions hinder them in reaching today's objectives. An organisation might be able to modify, wrap or replace existing ICT solutions to meet today's requirements, but how can it avoid today's solutions once again becoming tomorrow's problems?

Many business ICT solutions tend to lack flexibility to deal with changing business requirements and technology. Business requirements change as organisations adjust their strategies, business processes and internal structures to deal with changes in their environment (for example because of competition or legislation) or because they choose to merge or outsource activities. New technologies are introduced to obtain and maintain competitive advantage or because older technologies become too cumbersome and expensive to support.

A main reason for the lack of sufficient flexibility in business ICT solutions is the failure to consider long-term dynamics of business and technology during solution development. Should enterprise fortune-telling therefore become a valued discipline within development projects? Not necessarily. To build ICT solutions that provide long-lasting support for organisations, it is essential to have models that help us to understand the structure and dynamics of the organisations we are trying to support.

Many of today's organisation and business modelling approaches are based on 'best practices' but to a certain extent remain a black art. What is often missing is a stable theoretical foundation such as the DEMO (Dynamic Essential Modelling of Organisations) framework [Dietz 2002]. By looking at organisations from a communication perspective, DEMO recognises recurring communication patterns between people and proceeds to describe how these patterns combine to form business processes.

DEMO provides business analysts and ICT solution architects with tools to decompose business processes into elementary business transactions. Each business transaction identifies a business role that operates as a miniature supply chain providing a well-defined service to other business roles. Thus when providing automated support to business processes, it makes sense to decompose business process logic according to the same patterns that occur in a purely human world. Therefore, the DEMO approach can be used to identify process-oriented business services in service-oriented and component-based ICT solution architectures.

In this article we will first discuss how organisations are made up of cooperating people who use communication to align their actions. By studying how people use communication to align their actions, we will encounter the business transaction communication pattern. We will then describe how business transactions are composed to form business processes and enable the identification of business roles. Each business role provides a well-defined service for other business roles within or outside of the organisation. By studying how business roles exchange and remember information, we will encounter organisational roles that provide informational and infrastructural services. This will result in a technology-independent information architecture that consists of business, informational and infrastructural services and describes their dependencies. We will conclude this article with some remarks regarding the realisation of services using human resources and/or technology.

Organisations

An organisation is a group of people who cooperate to achieve common objectives. In all but the simplest organisations no single person is able to perform all the work that must be done to achieve the common goals, so members specialise themselves and must cooperate for the organisation to be effective. To improve cooperation, organisations tend to formalise the roles that members play within the organisation and to agree upon common procedures to coordinate the work performed by different roles.

Business Process Decomposition and Service Identification using Communication Patterns
By Gerke Geurts and Adrie Geelhoed, LogicaCMG

A business process is an ordered execution of activities by people playing organisational roles in order to produce goods or provide services that add value in the organisation's environment or to the organisation itself. Each activity on its own can be regarded as a decomposable process, and as a sub-process of the containing process [Eriksson and Penker 2000:71]. For example, a sales process may contain a delivery activity that contains nested activities for the selection of a transport provider and the transport itself.

A person may play one or more roles within an organisation. Each role represents the elementary authority and responsibility a person must have in order to perform a particular production act. In practice, roles do not usually correspond directly with organisational functions. Often organisational functions map to multiple roles and roles may be played by various organisational functions. For example, 'secretary' and 'claim handler' may be organisational functions within a social service. Both the secretary and the claim handler are allowed to refuse incomplete benefit requests, but only the claim handler is allowed to grant benefits to claimants. So both organisational functions can play the 'claim admission' role, but only one organisational function can play the 'claim granting' role.

The roles participating in a business process perform two types of acts: production acts and coordination acts [Dietz 2002]. Production acts contribute to the realisation of goods and/or services for the environment of an organisation. The result of a production act can be material or immaterial. The manufacturing, storage and transport of goods are examples of material production acts. A verdict by a judge or a decision to grant a benefit claim are examples of immaterial products.

Different roles must cooperate when they are dependent on goods or services that are produced by other roles within or outside the organisation. By performing coordination acts people enter into and honour commitments towards each other regarding the execution of certain production acts [Dietz 2002]. An example of a commitment is an account manager promising to extend a credit limit when requested to do so by a client.

A successful production act causes a change in the 'production world' (the creation of a good or service), which is recorded as a production fact. Similarly, a successful coordination act causes a change in the 'coordination world' (the creation of, or compliance with, a commitment), which is recorded as a coordination fact. Examples of production facts are 'John's club membership has started to exist' and 'product P has been transported to location L'. Examples of coordination facts are 'John has requested Ferdinand to start John's club membership' or 'supplier representative S has promised customer representative C to transport product P to location L'. This principle is shown in Figure 1.

Figure 1. Coordination and Production Worlds (roles, carrying responsibility, authority and competence, perform coordination acts that are recorded as coordination facts in the coordination world, and production acts that are recorded as production facts in the production world)

Communication

A coordination act is performed by one role (the performer) and directed to another role (the addressee). Because people are not able to read one another's minds, the performer must communicate with the addressee to share his thoughts.

To introduce some communication (and information) concepts we will use an example. A bank client wants his account manager to arrange a higher credit limit on his bank account. To communicate with his account manager, the client formulates his desire for a higher credit limit (mental state of the speaker) in a message using a language that both he and the bank manager understand (common language). He then converts the message into perceivable signs by speaking out the message. Sound waves in the air (a communication channel) surrounding the client and the account manager transport the signs to the account manager. The account manager hears the signs and reconstructs the message. She subsequently interprets the message to determine its meaning (mental state of the hearer). She understands that her client wants to increase his credit limit, but as he has a private and a joint account she asks him which account is involved. The client indicates he wants to adjust the credit limit on his private account. Once she understands completely what her client wants her to do, she checks his credit rating and then promises him she will perform the requested action. Both parties understand that she has committed herself to increasing the credit limit on his private account and that she is expected to comply with her commitment (common social culture).

The example illustrates that communication takes place by the exchange of messages. However, the example also shows that communication is more than the mere exchange of information. When communicating, people try to influence each other's behaviour. By everything they say, they do something [Reijswoud and Dietz 1999:5], i.e. they perform communicative acts. This principle is the main tenet of the Language/Action Perspective (LAP) [Austin 1962; Habermas 1981; Searle 1969], a methodology that considers communication from an action perspective.

Messages or communicative acts have a form (syntax) and a meaning (content in the form, i.e. information). The form of a message consists of the language the message is expressed in and the substance (e.g. sound waves, electric charges, light pulses) that carries the message [Dietz and Mallens 2001]. The meaning of a message consists of an intention, a fact and a time-for-completion [Reijswoud and Dietz 1999:13]. A fact specifies a certain action or state of affairs (for example, the credit limit of client account 1-345 is €2000). The intention of a message reflects how a message is intended to be taken by the receiver (e.g. request, promise). Depending on their intention, messages can be grouped in five families:

– Assertives commit the speaker to something being the case (for example stating);

– Directives try to get the hearer to do something expressed in the message (for example asking, requesting and commanding);

– Commissives commit the speaker to one or more actions in the future (for example promising);

– Declaratives bring about a new state of affairs by merely declaring it (for example declaring);

– Expressives express the attitudes and/or feelings of the speaker about a state of affairs (for example apologising).
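The five families could be modelled as a small enumeration; a sketch in Python (the mapping of example intentions to families is illustrative, not exhaustive):

```python
from enum import Enum

# The five intention families from the Language/Action Perspective.
class Family(Enum):
    ASSERTIVE = "assertive"      # e.g. stating
    DIRECTIVE = "directive"      # e.g. asking, requesting, commanding
    COMMISSIVE = "commissive"    # e.g. promising
    DECLARATIVE = "declarative"  # e.g. declaring
    EXPRESSIVE = "expressive"    # e.g. apologising

# Illustrative mapping of a few concrete intentions onto their family.
FAMILY_OF = {
    "state": Family.ASSERTIVE,
    "ask": Family.DIRECTIVE,
    "request": Family.DIRECTIVE,
    "command": Family.DIRECTIVE,
    "promise": Family.COMMISSIVE,
    "declare": Family.DECLARATIVE,
    "apologise": Family.EXPRESSIVE,
}

print(FAMILY_OF["request"].value)  # directive
print(FAMILY_OF["promise"].value)  # commissive
```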

The time-for-completion of a message specifies how long the combination of the fact and the intention is valid. In day-to-day conversation the time-for-completion tends to be implicit and a default value of 'now' or 'as soon as possible' is assumed. Figure 2 shows the form and content of a sample message.

A communication process is successful when mutual understanding is reached, i.e. when the mental state of the speaker and the generated mental state in the hearer correspond [Reijswoud and Dietz 1999:14]. To achieve mutual understanding the speaker and the hearer may ask each other for clarification. Communication is only complete when the hearer confirms he has understood the speaker.

A conversation is a sequence of communicative acts between two people with a particular goal [Dietz and Mallens 2001]. Technically, conversations consist of communicative acts that share the same fact and time-for-completion (see Figure 3) [Reijswoud and Dietz 1999:20]. Within business processes two kinds of conversations are relevant. Performative conversations aim at letting other people do something. Informative conversations aim at the sharing of existing knowledge. The main difference between performative and informative conversations is that performative conversations are about the creation of new (production) facts, whereas informative conversations are about the distribution of knowledge. Examples of informative and performative conversations are shown in Figure 3.

Figure 2. Message form and content (example message: form = an English spoken sentence; content/meaning = intention 'promise', fact 'the house is clean', time-for-completion 'this evening (before partner is home)')

Business Events

A coordination act is a communicative act directed from the performer of the coordination act to an addressee. As a communication act consists of an intention, a fact and a time-for-completion, coordination acts can be fully specified as follows [Dietz 2002]:

<performer>: <intention>: <addressee>: <fact>: <time-for-completion>

A coordination act is always about a production act, or more precisely about the production fact whose creation marks successful completion of the production act. The fact in the specification of a coordination act therefore is a production fact.

Examples of concrete coordination acts are:

Mark: request: Esther: credit limit of account 1-345 is €2000: tomorrow

Greg: promise: Harvey: Greg's article for journal #1 is written: 13/10/2003

The successful completion of a coordination act is recorded as a coordination fact, which can be specified in the same way as the coordination act. For the performer of a coordination act, the coordination fact represents his/her expectation that the addressee will perform certain actions. From the perspective of the addressee the coordination fact is a 'business event', an event within the organisation or its environment that triggers the organisation to perform some action [Eriksson and Penker 2000:74]. An example of coordination (f)acts and business events is shown in Figure 4.

The Business Transaction Pattern

When studying conversations between people who are trying to coordinate their actions, the communicative acts that make up the conversation appear to follow the same conversation pattern, regardless of the business context. Dietz and Mallens [2001] describe this pattern as follows:

First of all, one agrees upon what is to be achieved. Next the execution takes place in the production world, for example, the shipment of goods ordered. The commitment still stands, however, until the result is accepted between the parties involved, meaning that there is a third phase. It is this pattern that we call a 'business transaction', the elementary block of a business process.

The three phases of the business transaction pattern are shown in Figure 5.

A business transaction involves two roles: a customer or initiator role starts the business transaction, and a supplier or executor role actually performs the intended action. The first phase of a business transaction, the order phase, is a performative conversation that starts with a request from the customer to perform a particular action and ends with a promise by the supplier. During the execution phase the supplier performs the requested action. The transaction is concluded with a performative conversation in the result phase, which starts with a statement from the supplier that the requested action has been performed and finishes when the customer accepts that the action actually has been performed as originally agreed.

Figure 3. Informative and performative conversation examples

Informative conversation (natural language):
Harry: What is the capital of the Netherlands?
Sally: Amsterdam.

Informative conversation (communicative acts):
Harry: asks: Sally: What is the capital of the Netherlands?
Sally: states: Harry: Amsterdam

Performative conversation (natural language):
Person W: The house will be clean when I come back home tonight, won't it?
Person H: Of course it will be.

Performative conversation (communicative acts):
Person W: requests: Person H: The house is clean tonight

Figure 4. Structure of coordination (f)acts or business events (example: performer = Harvey; intention = request; addressee = Greg; production fact = Greg's article for journal #1 is written; time-for-completion = before 13th October 2003)

The amount of interaction (negotiation) that takes place between customers and suppliers in the order and result phases can vary strongly. In practice there are instances where certain steps are taken implicitly and no communication occurs at all. For example, the expiry of the period in which a customer can return bought products to a shop can be regarded as the implicit acceptance by the customer that the shop has fulfilled its obligations. In other situations long negotiations may take place before suppliers promise certain actions or customers accept the outcome of a production action. The ICT industry (unfortunately) provides plenty of examples in this respect. Reijswoud and Dietz [1999:98-108] describe the business transaction pattern in greater detail.
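The order-execution-result structure of the pattern can be sketched as a small state machine. This is a simplification: the decline and reject outcomes are included, but the negotiation and failure layers of Figure 5 (quit, stop) are left out:

```python
# A minimal state machine for the business transaction pattern:
# order phase (request -> promise/decline), execution phase (execute),
# result phase (state -> accept/reject).
TRANSITIONS = {
    ("initial",   "request"): "requested",
    ("requested", "promise"): "promised",
    ("requested", "decline"): "declined",
    ("promised",  "execute"): "executed",
    ("executed",  "state"):   "stated",
    ("stated",    "accept"):  "accepted",
    ("stated",    "reject"):  "rejected",
}

def run(acts):
    """Replay a sequence of acts, enforcing the transaction pattern order."""
    state = "initial"
    for act in acts:
        if (state, act) not in TRANSITIONS:
            raise ValueError(f"{act!r} is not allowed in state {state!r}")
        state = TRANSITIONS[(state, act)]
    return state

print(run(["request", "promise", "execute", "state", "accept"]))  # accepted
```

A transaction that reaches 'accepted' corresponds to the creation of the production fact; any other terminal state means the commitment was not fulfilled.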

Figure 5. Activity diagram for the business transaction pattern (order phase: the initiator's request is answered by the executor's promise or decline; execution phase: the executor executes; result phase: the executor's statement is answered by the initiator's accept or reject; the diagram distinguishes success, negotiation and failure layers, with quit and stop as failure outcomes)

Business Processes and Roles

In each of the three phases of a business transaction, new business transactions can be initiated whose outcomes are required to continue with the original transaction. In this way it is possible to create arbitrarily complex structures of nested and chained business transactions. Dietz [2002] defines a business process as a structure of causally interconnected transactions for delivering a particular final product to the environment. In other words, a business process starts with a top-level business transaction which is initiated by a role in the environment of the organisation and which may include other business transactions. These child transactions produce intermediate goods or services (production facts) that are required to successfully complete the parent transaction.

Figure 6. Example business process [Dietz 2002] (two nested business transactions: a MemberRegistration transaction initiated by a Person, whose requested-promised-stated-accepted steps surround a nested MemberPayment transaction; the payment transaction produces the production fact 'fee for membership M is paid', after which the registration transaction produces the new production fact 'membership M')

Figure 6 shows an example business process for the registration of library members. The business process consists of two business transactions. The 'member registration' business transaction is initiated by a person who wants to become a library member and executed by employees of the library who are authorised to register new members, i.e. they are allowed to play the 'member registration' role. A person must pay before being registered as a member. For this reason the 'member registration' role initiates a second business transaction to obtain payment of the membership fee. The executor of the 'membership payment' transaction is the 'membership payment' role.

Though the 'membership payment' role usually is played by the same person who initiates the membership registration transaction, this is not necessarily the case, e.g. an altruistic friend or relative might pay the fee. This particular example illustrates an important principle: every business transaction has by definition one corresponding business role, which represents the elementary amount of authority a person must have to be an executor of the transaction. Dietz [2002] explores the subject of authorisation in greater detail.

Each business role that is responsible for the execution of a particular business transaction operates as an elementary supply chain. Actors in a particular business role can act as supplier in one business transaction and as a consumer in an arbitrary number of subordinate business transactions (see Figure 7). Every business role provides an elementary and well-defined business service and may consume one or more other elementary business services offered by other business roles.
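The nesting of transactions into a process can be sketched as follows, using the membership example: the executor of the parent transaction initiates its child transactions and collects their production facts before creating its own (class and fact names are our own):

```python
# Sketch of a business process as nested transactions: the executor of a
# parent transaction acts as initiator of child transactions whose
# production facts it needs before it can complete.
class Transaction:
    def __init__(self, name, produce, children=()):
        self.name = name
        self.produce = produce      # function creating this transaction's fact
        self.children = children    # transactions this executor initiates

    def execute(self, facts):
        for child in self.children:      # obtain intermediate facts first
            child.execute(facts)
        facts.append(self.produce())     # then create our own production fact
        return facts

payment = Transaction("MemberPayment",
                      lambda: "fee for membership M is paid")
registration = Transaction("MemberRegistration",
                           lambda: "membership M has started",
                           children=[payment])

print(registration.execute([]))
# ['fee for membership M is paid', 'membership M has started']
```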

Up until this point in our analysis of how organisations achieve their objectives, we have focused on how organisation members 'talk' to each other to align their acts. In the following sections we will concentrate on what people do between coordination actions.

Work Items and Action Rules

An organisation springs to life when a business event occurs – that is, when a coordination fact is created. The coordination fact is created within a particular business transaction instance and signals there is work to be done by one of the roles that participate in the transaction, unless the coordination fact signals the end of the business transaction instance (for example a business transaction has reached the 'accepted' or 'stopped' state).

Every business role has a work item list that contains prioritised entries for all coordination facts that the role must respond to. Whether a coordination fact results in a work item for the initiator or the executor of a business transaction depends on the intention of the coordination fact. For example, a request results in a work item for the executor, whereas a promise creates a work item for the initiator.
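The routing of coordination facts to work item lists could be sketched like this (the intention-to-responder mapping covers only the intentions discussed in the article):

```python
# Which party gets a work item for a new coordination fact depends on the
# fact's intention: a request obliges the executor to respond, while a
# promise or statement puts the ball back with the initiator.
RESPONDER = {
    "request": "executor",   # executor must promise or decline
    "promise": "initiator",  # initiator awaits execution
    "state":   "initiator",  # initiator must accept or reject
}

def route(intention, initiator_list, executor_list, fact):
    """Append the coordination fact to the right role's work item list."""
    target = initiator_list if RESPONDER[intention] == "initiator" else executor_list
    target.append((intention, fact))

person, registrar = [], []
route("request", person, registrar, "MemberRegistration(M)")
route("promise", person, registrar, "MemberRegistration(M)")
print(len(person), len(registrar))  # 1 1
```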

Figure 7. Business roles (a Person acts as customer of the Membership Registration business transaction, whose supplier is the MembershipRegistration business role; that role in turn acts as customer of the Membership Payment business transaction, whose supplier is the MembershipPayment business role)

An action rule (shown in Figure 8) specifies the actions that must be performed by a certain business role in response to a particular coordination fact [Dietz 2002; Reijswoud and Dietz 1999:117-128]. Though action rules are procedures for business roles, these roles always remain responsible for taking well-considered and socially acceptable decisions, even if that means deviating from the procedures! This is a fundamental reason why it is not possible to fully automate the execution of action rules. When automating action rules, the people who are responsible for the execution of these rules must be able to intervene in and overrule the automated procedures.

The daily life of a person in a business role consists of selecting the work item with the highest priority from his/her work item list and performing the applicable action rule. The execution of an action rule always ends with the performance of one or more coordination acts to pass the buck on to the next role that must play its part in the business process. The coordination act may be preceded by one or more actions to retrieve or calculate facts and/or a production act. Figure 8 provides some examples of action rules written in pseudo code.
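The first action rule of Figure 8 might look as follows in executable form. The age and membership limits are invented placeholders, and the `override` parameter is our own addition, illustrating the article's point that the responsible person must be able to overrule the automated procedure:

```python
# The membership-registration action rule sketched as a function: on a
# requested MembershipRegistration(M), promise if the applicant qualifies,
# otherwise decline. Limits below are illustrative, not from the article.
MIN_AGE, MAX_MEMBERS = 12, 500

def on_requested_registration(age, member_count, override=None):
    """Return the coordination act to perform: 'promise' or 'decline'."""
    if override is not None:       # a human decision wins over the procedure
        return override
    if age > MIN_AGE and member_count < MAX_MEMBERS:
        return "promise"
    return "decline"

print(on_requested_registration(30, 120))                     # promise
print(on_requested_registration(8, 120))                      # decline
print(on_requested_registration(8, 120, override="promise"))  # promise
```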

Business Knowledge

The participating roles in a business transaction must have knowledge about applicable business rules (restrictions), production facts and coordination facts in order to be able to act responsibly. The restriction that someone can only rent a video if he has paid for all previously completed rentals is an example where knowledge of production facts is needed. Video store employees need knowledge of coordination facts (in this case unfinished video return transactions) to be able to comply with the restriction that a customer may not rent any videos if he/she still has any unreturned videos.

In many cases knowledge of external rules and facts is also required. Legislation and standards are examples of external rules on how acts are to be performed. In any stage of a business transaction external knowledge may be required. For example, during the assessment of a benefit request, a social service may require additional information about the income, housing situation, household and health of an applicant. As the actual facts that must be known about an applicant might depend strongly on his or her situation, the applicant only has to provide a limited amount of information in the benefit request. Additional information will be requested from the applicant and/or other organisations when necessary. Figure 9 shows business process and business knowledge concepts.

Figure 8. Examples of action rules [Dietz 2002]

Business role: Membership Registration
Work item: on requested MembershipRegistration(M) with person P is member of new membership M
Action rule:
    if Age(P) > minimal age for membership
            and number of members < maximum number
    then promise MembershipRegistration(M)
    else decline MembershipRegistration(M)
    end if
end with

Business role: Membership Registration
Work item: on promised MembershipRegistration(M)
Action rule: request MembershipPayment(M) with fee(M) is remaining_fee(M)

Business role: Membership Registration
Work item: on accepted MembershipPayment(M)
Action rule:
    if promised MembershipRegistration(M)
    then <decide to start membership M>; state membership M has started
    end if
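The video store restriction above could be checked against a set of open coordination facts roughly like this (the data layout is invented for the illustration):

```python
# Sketch: enforcing the restriction that a customer with any unreturned
# videos may not rent. Open coordination facts reveal unfinished video
# return transactions; all data here is invented for the illustration.
def may_rent(customer, open_return_transactions):
    """A customer may rent only if no return transaction is still open."""
    return not any(t["customer"] == customer for t in open_return_transactions)

open_returns = [{"customer": "John", "video": "V42", "state": "promised"}]
print(may_rent("John", open_returns))   # False
print(may_rent("Mary", open_returns))   # True
```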

Three Layers of Services

We can view the actions that take place within a business process at three levels of abstraction: the business, informational and infrastructural level. So far we have focused on the business level. At this level we see business roles that are played by social actors: actors who are aware of the social implications of their actions. These actors are responsible for the coordination of production acts by entering into and complying with mutual commitments. They are also responsible for the execution of production acts.

Social actors use the services of rational actors to memorise and remember knowledge and to exchange messages with other actors. Rational actors can perform logical operations on knowledge, but have no awareness of the social context they operate in. They play roles at the informational level, as they are responsible for the gathering, remembering, providing and computing of knowledge. We can see what knowledge is remembered and distributed by these actors, but cannot see how the storage and transmission takes place. That is, we can see the meaning of documents and messages but cannot see their form.

Rational actors depend on the services of formal/physical actors to produce, distribute, store, copy and destroy data or documents that contain knowledge. These actors play infrastructural roles: they deal with data and documents but do not show any interest in their meaning.

The actors at the business, informational and infrastructural level provide three types of services, layered on top of each other. At the business level we find business transaction execution services that coordinate other business transaction execution services and informational services to produce a particular good or service. At the informational level we distinguish three categories of informational services: communication services enable the exchange of messages between business roles, computation services offer facilities to compute derived facts, and information management services provide long-term memory of (production, coordination and external) facts and business rules. Informational services depend on infrastructural services for the storage, distribution and calculation of data. The dependencies between these roles are shown in Figure 10.
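The layering might be sketched as three cooperating classes, where each layer only talks to the one beneath it (all class and method names are our own, chosen to mirror the roles in Figure 10):

```python
# Sketch of the three service layers: a business role delegates remembering
# facts to an informational service, which in turn relies on an
# infrastructural service for raw storage.
class DataStore:                # infrastructural: stores data, knows no meaning
    def __init__(self):
        self._rows = []
    def write(self, row):
        self._rows.append(row)
    def read_all(self):
        return list(self._rows)

class FactMemory:               # informational: remembers and recalls facts
    def __init__(self, store):
        self._store = store
    def remember(self, fact):
        self._store.write(fact)
    def recall(self, predicate):
        return [f for f in self._store.read_all() if predicate(f)]

class MembershipRole:           # business: enters into and honours commitments
    def __init__(self, memory):
        self._memory = memory
    def register(self, person):
        self._memory.remember(f"membership of {person} has started")

memory = FactMemory(DataStore())
role = MembershipRole(memory)
role.register("John")
print(memory.recall(lambda f: "John" in f))
# ['membership of John has started']
```

Note that the business layer never touches the data store directly; that separation is what keeps the decomposition technology-independent.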

Figure 9. Business process and business knowledge concepts (a class diagram relating Role – as initiator, executor, delegator and delegate – to Business Transaction, Work Item, Action Rule, Delegation, Coordination Activity and Production Activity, and to the fact types Production Fact, Coordination Fact, External Fact, Derived Fact and Original Fact, via associations such as 'performs', 'authorises execution of', 'schedules response to', 'generates' and 'uses')

The realisation of services at all three levels of abstraction can vary from being fully manual to highly automated. Human beings are able to perform roles at all levels, whereas machines are very capable of performing infrastructural and informational tasks, but have absolutely no social awareness. It is for this reason that machines cannot carry the responsibility for performing actions in a socially acceptable manner. In the end there must always be a person who is responsible for the execution of business transactions.

A human actor who plays a certain business role can delegate parts of the execution of business transactions to a machine, but he/she will always remain responsible for the result and have to stand by to deal with exceptional situations. For example, an e-commerce web site may handle the majority of orders without any human intervention. However, a customer who does not receive his order will contact a sales person to resolve the problem.

Conclusion
By analysing how and why communication takes place between cooperating persons, we have encountered the business transaction

[Figure content: the Business Role (Business Transaction Execution) is supported by Informational Roles (Message Exchange, Derived Fact Computation, Original Fact & Rule Memory), which in turn depend on Infrastructural Roles (Data Transfer, Calculator, Data Storage).]

Figure 10. Business role support by informational and infrastructural roles


communication pattern. We have seen how instances of this pattern are chained together to form business processes. The understanding that business processes are composed of business transactions enables us to perform the reverse action: the decomposition of business processes into business transactions.

Each business transaction identifies one business role, which represents the elementary amount of authority and responsibility to execute the transaction. These business roles are the building blocks for authorisation policies within organisations.

Each business role can be viewed as an elementary supply chain and provides a well-defined service. By studying how these services are realised we have encountered the need for additional informational and infrastructural services to enable storage and exchange of information.

The decomposition of business processes into business, informational and infrastructural services and the definition of their dependencies provide a solid basis for enterprise information and application architecture. As technology has not played a role in the decomposition of business processes into services, the resulting services are technology-independent. The degree to which services are automated may vary strongly, but this has no influence on the information architecture.

When implementing a service, we would recommend hiding from other interacting services both the chosen implementation technology and the degree of automation within the service. This approach provides an organisation with the flexibility to change the technology itself as well as the way it is applied in business process implementations.

We started this article with the question of how to avoid today’s solutions becoming tomorrow’s headaches. Focusing on communication patterns that stay the same regardless of technology and business process changes certainly seems like a good place to start.

Further Reading
More information on DEMO can be found on the DEMO web site (http://www.demo.nl). The DEMO handbook [Reijswoud and Dietz 1999] provides detailed information on the theoretical background of DEMO and also describes a business modelling approach.

References
Austin 1962
J L Austin, How to Do Things with Words, Harvard University Press, Cambridge MA, 1962

Dietz 2002
J L G Dietz, The Atoms, Molecules and Fibers of Organizations, Data & Knowledge Engineering, 2003, internet: www.demo.nl/documents/2003-DKE.pdf

Dietz and Mallens 2001
J L G Dietz, P J M Mallens, An Integrated, Business-Oriented Perspective on Facts and Rules, data2knowledge newsletter, January–May 2001

Eriksson and Penker 2000
H-E Eriksson, M Penker, Business Modeling with UML: Business Patterns at Work, Wiley Computer Publishing, 2000

Habermas 1981
J Habermas, Theorie des kommunikativen Handelns, Erster Band, Suhrkamp Verlag, Frankfurt am Main, 1981

Reijswoud and Dietz 1999
V E van Reijswoud, J L G Dietz, DEMO Modelling Handbook Volume 1, Delft University of Technology – Department of Information Systems, Version 2.0, May 1999

Searle 1969
J R Searle, Speech Acts: An Essay in the Philosophy of Language, Cambridge University Press, Cambridge, 1969

Gerke [email protected] Geurts is a technical architectwithin LogicaCMG. He has a keeninterest in the many facets of softwaredevelopment in an enterprise setting.His waking hours are spent coachingand trying to stay ahead of clients,software developers and his children.

Adrie [email protected] Geelhoed is an enterprisearchitect working for LogicaCMGFinancial Services and providesbusiness-centric architecture andsoftware development guidance tocustomers and consultants.


JOURNAL1 | Metadata Application Design 28

In this article, I present an overview of a pragmatic, metadata-driven approach to designing applications that has a practical presence in the development of applications. The article is presented in an open style and hence is not targeted at the deep technical community. I hope it will demonstrate why metadata-driven design is one of the most powerful tools at the disposal of IT professionals who are engaged in building a software product.

Remember 30?
Turning thirty was a brief affair. No run-up to a massive ‘farewell twenties’ party, no late night at a nightclub, and no enormous hangover in the morning. But it was no less memorable for the lack of pomp and ceremony. In place of a hangover was a small realisation that formulating patterns is important to anyone working with technology.

I was in Singapore, to assist a small team running a package selection and integration project for a bank. I had arrived shortly before my birthday, with the intention to lay low, get some reading done, maybe do a little work, and certainly to defer all celebrations until I returned home. A colleague of mine, Lyn, was clearly appalled by my attitude and had an altogether different idea. Post dinner on that ceremonious day I was duly press-ganged up to the seventieth-floor bar of our hotel. So there we sat, with the wonderful views out over Singapore harbour, me drinking the ubiquitous Singapore Sling and Lyn drinking a fruit cocktail, as she was nearly six months into her second pregnancy. Not a scenario I ever imagined for my thirtieth birthday.

It was the first time any of us had worked closely together and hence we sat discussing the path of professional fortune that had brought us to this project. Lyn had formerly worked for SWIFT and held much of the business knowledge the project would require. As for me, my previous engagement had been on a large custom application project, but this time my role was to define an architecture in which to position and integrate the various products to be employed on the project.

Clearly, Lyn and I were different characters with very different professional backgrounds. Business versus techie on the seventieth floor. The stage was set for a showdown, a tête-à-tête to the bone, but strangely we didn’t end up trying to kill each other. Business versus techie it may have been, but Lyn and I reached an agreement that evening on one thing that has influenced my professional work since I first started reading the patterns literature. When it comes to IT projects, they work on patterns. And these patterns are not just for techies.

How high are your patterns?
Patterns of analysis, design and implementation are virtually ubiquitous in the IT industry today and have received much publicity. Given the corpus of work freely available in the patterns area, there can be little doubt that the majority of key players in the IT industry believe applying patterns to the building of applications is a valuable and powerful approach. Many aspects of patterns feed directly into the common goals of methodologists, analysts, designers, implementers, testers and management, namely those of higher-quality, more maintainable, more reliable and faster-to-develop solutions.

Powerful as patterns can be, there is one aspect of application design and development that has arguably proven difficult to achieve. That aspect is a substantial improvement in traceability (and hence consistency), from the analysis of the solution domain through to the creation of an application based on a standard framework-based implementation.

It’s a simple but lofty goal: to make software quicker to develop and more reliable, the IT industry needs to learn to reuse software infrastructure more effectively; to reuse infrastructure more effectively, we need to define the allowable (or supportable?) patterns of use for the target infrastructure. Introduce more complete traceability into the software lifecycle and you get the potential to realise this, because traceability provides a formalised path to transform analysis-time artefacts right through to build-time artefacts. This is the key to being able to forward-engineer the analysis; you need to know the technical (infrastructure) environment you are planning to target with generated code.

Looking at the various infrastructure initiatives in the IT industry, across many technologies, it is clear that many software infrastructure providers are looking to productise the concepts around patterns into a framework for the general implementation of applications. To a certain extent, this has been going on in the IT industry for many years. However, this time the infrastructure providers are looking

Metadata-driven Application Design and Development
By Kevin S Perera, Temenos

“Traceability provides a formalised path to transform analysis-time artefacts right through to build-time artefacts.”


much higher in the layers of the application architecture. This time, they are out to steal the wind of the application architect and put it in the sails of their framework.

Looking at what is on offer today and what is clearly in the pipeline, application architects out there should be tracking this carefully; this time we have infrastructure products and tools that have been creeping up towards the base of the functional layers. And with this creeping comes a new adversary for the not-invented-here syndrome. The tools are not just helping you to model, but are helping you to speed your coding via systematic code generation.

Want some of this? Then you need to start working with patterns, at both the infrastructure and application layers.

This movement in the IT industry has been gathering strength for many years. A simplified view could be, naturally, simple.

– The higher the patterns of [reuse of] infrastructure climb, the more we can produce repeatable, higher-quality and faster time-to-market business applications.

The less simple, more technical view on this should take a longer-term perspective.

– Understand the patterns and we can build the supporting infrastructure.

– Productise the infrastructure and we can build high-productivity tools to support that infrastructure.

– Give the technical community the tools, and they will create more repeatable, higher-quality applications with a faster time to market than is typical today.

Patterns, patterns, everywhere
If our work in information technology is as heavily linked to patterns as the industry appears to believe, then working with patterns is a reality for most of us. We probably don’t realise the patterns we employ in our everyday roles, but this is the nature of a pervasive subject matter. If something is everywhere, after only a brief period most of us simply don’t see it anymore; especially if it’s related to tasks we know and perform frequently.

If they are everywhere, what does it mean to work with patterns? Many things to many people, I am sure, but for the purposes of this article it has two concrete meanings. In the design process, using patterns means basing the design on commonly occurring structures (‘patterns of design’)1, preferably using design tools capable of storing the design in some form of model representation such as UML. In the build process – where the real code is cut – using patterns means using software infrastructure products and tools, including code frameworks and code generators such as CodeSmith2.

Fortunately, the nature of architecture and design proffers the opportunity to identify and classify patterns. The valuable work done by other authors means I don’t need to engage in a long discourse on the value proposition of patterns in design. However, the same statement cannot be made carte blanche for the implementation phase of the software lifecycle, where patterns – or more specifically frameworks – have unfortunately gained the occasional bad reputation.

Effectively, frameworks suffer from ‘generics’. It is often possible to apply the use – or abuse – of a framework in many different ways. In providing for a reasonable number of alternative uses and styles of use, to provide a generic base service to those building applications, frameworks have gained an unjust reputation of being overly complex and hence difficult to understand, as IT professionals are swamped with information on the various approaches employed. Getting the level of a framework right, achieving the balance between prescription and flexibility, has proven a difficult challenge to meet. Get it wrong and you always face the question: do you need patterns to implement software-based systems?

The root cause of this issue is probably that most application teams are under pressure to produce higher-quality solutions as fast as possible. Naturally, this leads teams away from approaches that are potentially more complex and that hence may delay the onset of the application’s build. When working under pressure, nobody is looking for an approach that they feel has the potential to bring more stress.

Anyone for some stress reduction?
Thinking about stress brings a question to mind. Ever found a pattern to help relieve some stress? I have one that is absolutely fantastic. Go home, hide in the study away from the madding crowd, scan-read the remainder of the day’s emails and

1 For example, as per the definitions documented so very successfully in the ‘gang of four’ book ‘Design Patterns’.

2 http://www.ericjsmith.net/codesmith/


try to update the little work-and-play mind-map I keep hidden on my laptop. All the while sipping a good single malt or maybe a white port. Now that is a truly great pattern.

I have always found patterns in many places, from the manner in which start-of-day status meetings are run to the end-of-day wind-down. It is inherent in the nature of patterns that they help us to manage complexity. A good pattern can provide a good, understandable description of complex structures that can equally assist in the description of complex relations. This can make life so much more pleasant and stress-free for anyone who needs to work with complexity. Admittedly, doing so also has a couple of other useful side-effects, but we will investigate those later in the article.

If the use of software frameworks and infrastructure can be defined by patterns [of use], then metadata is the language used to describe those patterns. Hence, describing the patterns is the responsibility of metadata. Metadata is not a single language: many different dialects of metadata exist and may even cohabit. Metadata is the common term for the representation of the data models that describe patterns. Hence, for any given set of metadata, the quality of the description it offers is determined by the associated metadata model (literally, the model of the metadata).
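To make the distinction between metadata and its metadata model concrete, here is a minimal sketch in Python. All names are invented for illustration; the article itself prescribes no particular notation. The model constrains what a valid service description looks like, while one metadata record describes one particular pattern of use.

```python
# The metadata *model*: which fields a service description must carry,
# and of what type. (Illustrative, not from the article.)
SERVICE_METADATA_MODEL = {"name": str, "inputs": dict, "output": str}

def validate_metadata(metadata, model=SERVICE_METADATA_MODEL):
    """Check one metadata record against the metadata model."""
    for field, expected_type in model.items():
        if field not in metadata:
            raise ValueError(f"missing field: {field}")
        if not isinstance(metadata[field], expected_type):
            raise TypeError(f"field {field} must be {expected_type.__name__}")
    return True

# One piece of *metadata*, describing a hypothetical lookup service.
lookup_service = {
    "name": "CustomerLookup",
    "inputs": {"customerId": "string"},
    "output": "CustomerRecord",
}

validate_metadata(lookup_service)  # passes: the metadata fits the model
```

The quality of the description is bounded by the model: a richer model (types, cardinalities, constraints) yields richer metadata.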

It is worth noting at this point that patterns are not and should never be restricted to design-time. Previously, standards such as the Unified Modelling Language and initiatives such as Model Driven Architecture indicated to many that modelling and metadata were upstream in the project lifecycle. Not so. IBM, like many large technology organisations, started work on formalising a model of a total system based in UML during the late 90s. Many of these initiatives never became official external publications, but in October 2003 Microsoft broke this apparent silence and published details of its pending System Definition Model (SDM). With models such as SDM covering the entire scope of a system, you can expect to see model-driven and hence metadata-driven processes encroaching on almost every area of the software development lifecycle.

Describe your patterns to reap the benefits of functional, policy and service abstractions
Using metadata in the design and build processes – or metadata-driven design and development of applications – proffers one solution that can elegantly leverage the powerful nature of patterns, and hence leverage the high-value software framework and infrastructure products that are now available in the market.

Metadata helps to describe the patterns present in your application’s domain. Describing (or modelling) the patterns that the application must employ helps to promote understanding of what features the software infrastructure must provide and the style in which that infrastructure could best be utilised. This provides very early visibility of the infrastructure needs and also permits, very importantly, a safer, more controlled environment in which the software infrastructure can climb higher in the layers of an application’s architecture. Control over how infrastructure creeps up the layers is critical to providing higher-value services to the application developers. This provides your application developers with an increased level of functional abstraction.

Modelling the needs of an application using patterns (in turn described by metadata) may be applied to almost any area. Hence, metadata can be used to describe the mapping of object attributes to relational tables, through the definition of rules around the runtime session management of a pool of related objects, to the signatures of the services offered by a particular class of object. Such rules form the policy definitions used by application developers to control the underlying behaviour of the infrastructure.
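The first of these areas can be sketched as follows (a Python sketch; the mapping format, class and table names are invented for the example): metadata describing an object-to-relational mapping is enough for the infrastructure to derive behaviour, here the generation of SQL, from the mapping alone.

```python
# Hypothetical metadata describing how one class maps to a relational table.
ORDER_MAPPING = {
    "class": "Order",
    "table": "ORDERS",
    "attributes": {          # object attribute -> column name
        "orderId": "ORDER_ID",
        "customer": "CUSTOMER_ID",
        "total": "TOTAL_AMOUNT",
    },
}

def select_statement(mapping):
    """Generate a SELECT for the mapped class purely from its metadata."""
    columns = ", ".join(mapping["attributes"].values())
    return f"SELECT {columns} FROM {mapping['table']}"

print(select_statement(ORDER_MAPPING))
# SELECT ORDER_ID, CUSTOMER_ID, TOTAL_AMOUNT FROM ORDERS
```

The application code never embeds the table layout; changing the mapping metadata changes the generated SQL, which is the policy-definition idea in miniature.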

When carefully and judiciously applied to the modelling of your application, metadata may be employed in the description of both the infrastructure and application service boundaries and for the configuration data of the associated service implementations. Subsequently, the metadata descriptions associated with these services (and the metadata model) may be used in the forward-engineering of your services – into existing and future technology environments. This provides a rich – and reusable – approach to service definitions.

Using metadata in this manner, across the business (application) and technical (infrastructure) domains, requires an agreement on the overall metadata model between the business analysis and the software infrastructure. This is an agreement on a behavioural contract between the application and its infrastructure and permits both the detailed functional analysis and the

“The metadata concept may be applied to many areas of technology, and hence to many software products, in the business and technical layers.”


construction of software infrastructure to run in parallel. In such a scenario, the application development team may now tackle critical issues at a very early stage in the development process, which is known to be a key factor in improving the effectiveness of the software development lifecycle.

The metadata-driven approach also works well with other mainstream approaches such as use-case modelling and artefact-based methodologies (for example, the Rational Unified Process), has the potential to naturally lend itself to reuse and repeatability, and is independent of the target technology platform.

Technology
Before diving into the discussion about how metadata links to the Service Oriented Architecture and Web Services, it is probably worth a little pause at this point to discuss the technology behind metadata. In thinking how this should best be formulated, I came to the conclusion that there is no easy way to say it in a technology-related article. So let’s just put it out there – quite simply, there is no ‘technology’ in metadata.

Metadata itself is a concept. That is why many different models of metadata – or metadata models – can exist. The metadata concept may be applied to many areas of technology, and hence to many software products, in the business and technical layers. JNDI has a meta model, as do LDAP and Active Directory. They all use similar concepts in their meta models, but all have different (physical) metadata. What is clear is that design and development products using metadata also use (embed) or link (integrate) to tools. Frequently these tools are graphical and, more and more, are linked to improving development productivity. Next to the metadata itself, which clearly has primary importance, this use and integration of tools to support the storage, propagation and use of your metadata model across the software lifecycle is probably the most important aspect of technology in the design and build environments.

When selecting tools to assist with design and development, the most important criteria are often centred on:

– the ability of the tools to store and use metadata, both their own and yours.

– the ability of the tools to share metadata, which is particularly useful between design-time and build-time to help reduce the semantic gap between the design and the implementation.

– the integration of the development tooling with the underlying software infrastructure and runtime platform, particularly for code generation.

– the integration of the design and development tooling with code-generation technology, such as scripting languages that may be used to interrogate the metadata model.

– standards (MOF) compliance.
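The last two criteria can be illustrated with a toy code generator (a Python sketch; the service-description format and the shape of the emitted proxy are invented for the example): a script interrogates service metadata and emits the source of a client proxy, which is the essence of metadata-driven code generation.

```python
# Hypothetical service metadata: operations and their parameter names.
service = {
    "name": "AccountService",
    "operations": {
        "getBalance": ["accountId"],
        "close": ["accountId", "reason"],
    },
}

def generate_proxy(meta):
    """Emit Python source for a client proxy class, derived from metadata.

    The emitted code calls an assumed runtime function `invoke`; only the
    generated *text* is produced here, nothing is executed.
    """
    lines = [f"class {meta['name']}Proxy:"]
    for op, params in meta["operations"].items():
        arglist = ", ".join(["self"] + params)
        lines.append(f"    def {op}({arglist}):")
        lines.append(f"        return invoke('{meta['name']}', '{op}', locals())")
    return "\n".join(lines)

print(generate_proxy(service))
```

Change the metadata and regenerate, and the proxy tracks the service definition with no hand-written marshalling code.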

Linking SOA and Web Services using metadata
Instead of creating a several-thousand-word essay on why using metadata in design and development is so great, I’ll try an alternative, more ‘techie-friendly’ approach. Let’s assume we all agree that metadata-driven design and development is just fine. Patterns are great, patterns govern most of what we do in IT, metadata is a cool way to represent your patterns, and finally, metadata can be used in both design and build.

As we all agree that metadata-driven design and development is great, we don’t have to waste any more time debating its merits. In place of the essay, let’s take a look at what we could implement with this approach, based on a link between SOA and Web Services.

First, a cautionary note to the reader before continuing. The examples are exactly that – examples. Focussing on one aspect of using metadata in a system does not mean the same concepts do not apply to other areas. Many areas of systems already use accepted metadata models, particularly around authentication and authorisation using products such as Active Directory/LDAP and certificate management services.

The proposition
Walking through two example applications built on similar principles but different implementations, I am going to explore how a metadata-based approach can be used to define the service interfaces on an existing application and on a new application. The primary goal of this exercise is to propagate those interfaces to an additional technology environment using a combination of infrastructure products, a (code) framework for building applications (an application framework) and a metadata model seeded in the design phase.


Example applications – common ground
In this walkthrough, the two example applications perform similar functional roles and have a similar façade to the exterior world.

As I am focussing on the service interface layer on the core server, I’m going to make a few assumptions about the implementation of each application. First, both applications are running on the same database and data model – application #2 is the candidate replacement for application #1. Second, I am not concerned with the presentation interface or any middle-tier user interface components. Third, I am not concerned with any detailed deployment aspects or configuration of any communications mechanisms. Fourth, the server model is identical, with application sessions shared across all requests for services and server-side session handling embedded in the application or the runtime platform. Fifth, and last, the development of the application is carried out by multiple teams in several time zones.

While the actual (business) functionality may be similar between the applications, what is not common between the applications is the model for presenting their service interfaces on the system boundary.

Example application #1 – the original, the legacy
The server-side interface of the original application comprised a C header file. Each function on the interface employed a signature tailored to its purpose. External components called the interface functions using a DCE-based remote procedure call.

Adding a new function is relatively simple, but highly coupled to the DCE technology. Adding an alternative access path to provide a non-DCE invocation channel to the function is not possible without significant re-engineering.

Note that the data dictionaries of all services are exposed directly on the application’s interfaces and that function names are placed directly in the global namespace of the application (they must be unique across all functions being developed by the teams). This is shown in Figure 1.

Example application #1 – the evolution
The original application #1 was later wrapped by a set of C++ classes to expose the server-side interfaces on a CORBA bus. The C++ interfaces are not direct wrappers, but group ‘C’ functions into sets of services and provide a uniform ‘generic’ object as the request context.

Each service defines a number of classes derived from the request context class, to create a set of dedicated ‘containers’ for the parameters required by the underlying ‘C’ functions on the server side. Thus functions are grouped into services and the parameter definitions for all functions are encapsulated by the set of container class definitions. To facilitate the handling of these ‘more generic’ requests, a new server-side (application) component type was introduced to check the request context, validate that the correct parameters have been supplied, and map these parameters to the correct underlying ‘C’ function. The request mapper is the implementation of the service defined in the interface definition language (IDL). Normally, to permit parallel development, there is one request mapper per service IDL. However, this model is not enforced by the system.

Adding a new function to an existing service requires the derivation of a new container class and the addition of new code in the mapper to marshal the container data and make the underlying ‘C’ call. Adding a new function to a new service requires first that the service interface be defined in IDL and a mapper be created based on that service interface.

Note that the data dictionaries of all services are exposed directly on the application’s interfaces, but are scoped by service and request context definitions, as shown in Figure 2.

Example application #2 – the King of the Hill?
The key difference between application #2 and application #1 is that application #2 was built with a

[Figure content: an External Component invokes the Application Server Component via ‘C’ RPC calls fooA( p1, p2, p3 ) and fooB( p12, p5 ).]

Figure 1. Application #1


metadata-based approach. Application #2 is not restricted to reusing the existing ‘C’ functions.

Each server-side function in application #2 is written based on a common metadata model for the representation of a service request. This is similar to the evolved application #1 in concept, but the handling of the generic request contexts is embedded in the server code. Server-side functions must now understand the generic request context format. This is shown in Figure 3.

To represent the request context, concrete parameters are not desirable. In place of a concrete function signature, at runtime application #2 uses a generic structure for the request context. However, in this instance the request context needs to be similar in nature to a name-value-pair property bag. Property bags can be hierarchical and hence can be used to represent any arbitrary data structure, much like an XML document based on an XML Schema.
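Such a hierarchical property bag can be sketched with nested dictionaries (a Python sketch; the service and field names are illustrative, not from the article):

```python
# A hierarchical name-value property bag standing in for a request context.
request_context = {
    "service": "svcA",
    "payload": {
        "customer": {"id": "C-1001", "name": "A N Other"},
        "items": [{"sku": "X42", "qty": 2}],
    },
}

def get_value(bag, path):
    """Walk a '/'-separated path of names down through the property bag."""
    node = bag
    for key in path.split("/"):
        node = node[key]
    return node

print(get_value(request_context, "payload/customer/id"))  # C-1001
```

Because the structure is generic, one function signature can carry any request, exactly as the generic request context requires.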

The service name and request context are all that is required by the new server-side application components to execute the request. The service name, coupled to the definition of the permitted structure of the request context, is persisted in the database as a ‘service definition’. This definition, effectively an entry (row) in the database, belongs to that service alone.

A new generic request handler application module is included in the system. This module has been designed to interrogate the metadata repository and validate incoming requests against the associated service description metadata. It can also create the internal memory structure from the request context to pass to the target service implementation.
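A minimal sketch of such a generic request handler (Python; all names are invented, and a dictionary stands in for the database table of service definitions): it looks up the service description in the metadata repository, validates the incoming request context against it, and dispatches to the registered implementation.

```python
metadata_repository = {}   # service name -> required request-context keys
implementations = {}       # service name -> callable implementing the service

def register_service(name, required_keys, impl):
    """Persist the service definition and register its implementation."""
    metadata_repository[name] = required_keys
    implementations[name] = impl

def handle_request(service_name, request_context):
    """Validate the request against the service's metadata, then dispatch."""
    if service_name not in metadata_repository:
        raise LookupError(f"unknown service: {service_name}")
    missing = [k for k in metadata_repository[service_name]
               if k not in request_context]
    if missing:
        raise ValueError(f"request missing keys: {missing}")
    return implementations[service_name](request_context)

# Register a service, then invoke it through the single generic entry point.
register_service("echo", ["text"], lambda ctx: ctx["text"].upper())
print(handle_request("echo", {"text": "hello"}))  # HELLO
```

Because the validation rules live in the repository rather than in compiled signatures, adding a service needs only a new definition and an implementation, which is the registration step the article goes on to describe.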

Thus the interface definition is held in the metadata repository (in this case, the application’s database).

Adding a new service to the application is straightforward on the server side: implement the generic interface and manage the request context structure required for the service request, assign the service a name, register the name and request context description in the database, and perform any deployment actions required to deploy the new service code on the server.

The same cannot be stated for the client side. In place of the definitive client-side header file or IDL interface definitions of application #1, the client side of application #2 receives only a generic entry-point function via the same technical-interfacing technique. This is a big change, in that the service data dictionaries are no longer propagated directly onto the service interfaces. The service interfaces, from an external perspective, are all soft-set in the database. Note also that the scope of a service definition is now one-to-one with the service implementation.

Clearly, application #2 suffers from an imbalance. The server side is now very flexible and well defined, while the client side needs to understand the metadata before it is possible to understand what services are on offer.

Enter the world of service discovery. The services of application #2 need to be discoverable by clients wishing to use them. This is possible via the tried and trusted pattern called documentation: users (clients) of the application’s services can read the interface specification and therefore construct well-formed and valid requests.

Metadata-driven SOA
Obviously, example application #2 is very service-oriented at the system

[Figure content: an External Component invokes the Application Server Component through a Request Mapper via ‘CORBA’ calls svcA.doIt( 'fooC', txA ) and svcA.doIt( 'fooC', txB ).]

Figure 2. Application #1 evolved

[Figure content: an External Component calls application.doIt( ‘svcA’, requestContext ) and application.doIt( ‘svcB’, requestContext ); on the Application Server Component, a Request Handler consults the Metadata store.]

Figure 3. Application #2


boundary – it does not know anything else – but it would be unfair to claim example application #1 does not embody some aspects of a service-oriented architecture.

In its original form, application #1 is like many so-called legacy applications written in 'C' or COBOL. The code may have been modified many times, but within the domains of the technologies available at the time these applications were fundamentally service-oriented. 'C' programs had their extern function signatures; COBOL programs had their copy books. Include an infrastructure like IMS or CICS, and you certainly have a service-oriented system façade, as these TP monitors forced that model – a bucket of data in, run the request, a bucket of data out, and describe the data buckets via data structures. That is a service, right?

So what is the fundamental difference between applications #1 and #2? The answer [for me] is that they are both service-oriented at their system boundaries, but to be a true service-oriented application the fractal model must be applicable from the system boundary to the database, with service interfaces defined for each component or sub-system and each service treated as a black box by the caller.

However, concentrating on the system boundaries of the examples shows that the quality of the service description is vastly different between them.

– The original application #1 is doing nothing more than exposing library or executable end points, with its entry point function signatures propagated over the system boundary via the DCE technology.

– The evolved application #1 is doing the same, but with a more elegant end point description mechanism employing a simple abstraction. Entry point functions are exposed as a simple service-oriented abstraction via CORBA, with the link between the abstraction and the concrete implementation via coded marshalling.

– Application #2 has only one technical end point to expose. Exposing this end point in any technical environment in the same manner (via DCE, CORBA, COM, asynchronous messaging, etc.) will generate the same result – the data dictionary will never be exposed, only the signature of a generic service call. The value in this approach is to use the metadata-driven service definition, and the associated application infrastructure for invoking those services, as the implementation runtime for generating many different proxies in a number of different technologies, to permit wider usage of the services.

The metadata-driven nature of the services of application #2 leads the solution to a dead end if a pure technical 'code it' approach is taken to providing access to those services. In such a metadata-driven application, exposing functions is replaced by exposing metadata.

Sound a little dangerous, this idea of exposing metadata? Exposing the metadata itself is not the true intent of a metadata-driven application. Using the metadata to drive the propagation of services [functions] over the system boundary is a more accurate way of phrasing the approach that needs to be employed.

From SOA to CORBA, from SOA to Web Services, from SOA to …
To demonstrate how a real metadata-based service design can outrun the competition, let's propose some new modifications to our existing example applications.

– We have our basic server-side runtime setup for all three applications; this stays 'as is'.

– We would like to add a clean migration path to a very fashionable new integration capability based on Web Services.

– We also want a migration path for a number of the external applications that have been integrated via CORBA against a number of specific services of application #1.

– Someone asked for documentation,so let’s give them that too.

– These external applications are to be migrated to use the corresponding services of application #2, requiring that application #2 supports the same type of CORBA interface as employed in the evolved application #1.

Providing support for both CORBA and Web Services access channels to the services of application #2 may all be achieved efficiently, effectively and safely using an infrastructure and code generation. The same cannot be stated for either guise of application #1. If the same approach were to be applied to application #1, the most likely result would be increased complexity from the lack of a central model to describe its services. Note it is the central model that is important, much like the TModel

“… a bucket of data in, run the request, a bucket of data out and describe the data buckets via data structures. That is a service, right?”


concept in UDDI being the central model (as opposed to the physical data model used in MSDE or SQL Server as the repository).

Using the metadata service definitions coupled to an appropriate tools and infrastructure environment, a metadata-driven application is capable of providing this bridging approach to propagate its services into many technologies via code generation. This is a direct result of all services being regular (described in a common grammar and vocabulary) and of all service descriptions being available in a meta-format (a descriptive format – the metadata) at both build-time and runtime.

Design direct to implementation – linking metadata to tools
There are a number of prerequisites to this approach that are formidable enough subjects to warrant articles of their own. But to touch on them briefly, in a horribly simplified vein, we need to go shopping for the following items.

– Modelling tools that are capable of providing a graphical interface onto a model repository where we may store our metadata models and the associated metadata instances (not necessarily in the same physical store). Dependent on your application domain, the usual suspects are likely to be a UML-style tool, a process modelling tool or a combination of the two. The tools must provide a programmatic interface to their model storage.

– One or more target infrastructure products, preferably standards based or de facto third-party standards based, but home-grown if required. These products should be focussed on providing the bulk of the technical services required by an application, from persistent storage to session management (and that is session management on all tiers of the application, where possible).

– Development environment tools capable of supporting the programmatic interfaces of the modelling tools and the target infrastructure products.

– Development environment tools capable of supporting code generation. For those familiar with Lex and Yacc, this is not the type of 'tool' in question. Supporting code generation in this context needs to be at a higher level than the basis of regular expressions and context-free grammars. Ideally, code generation tools should embody some of the principles of dealing with metadata, as this helps significantly reduce the complexity of the code generator.

– One or two strong technical leaders, familiar with the concept and use of employing metadata to drive the application development lifecycle.

– A number of designers familiar with the concept of metadata modelling and the design of application infrastructure.

– A number of developers familiar with the concept of using metadata and who agree with the designers.

The modelling tools, infrastructure products and development environments all exist in today's marketplace. The final 'people-oriented' points are arguably the most difficult prerequisites to get right; as is the case with most software teams, it is the mix of people and approaches that often makes or breaks the software lifecycle.

Professor Belbin's test will help get the mix of 'people types' reasonably well balanced, but there is no compensating for a unified team3. This is critically important in a metadata-driven approach, as all team members must adhere to the model if the application is to achieve its goals.

Putting the whole thing together, we arrive at a process whereby it is viable for the solution analysis (sometimes termed the requirements analysis) to feed directly into the application design and the application infrastructure. The application design governs the overall schema of the component model and the metadata definitions. The application infrastructure looks to provide support for the application schema via a set of application framework APIs, the reuse of standard infrastructure or the building of bespoke infrastructure. Finally, the application tooling is responsible for interrogating the models and metadata of the application's design and generating code on top of the application infrastructure. This is shown in Figure 4.

Build-time
Looking a little more closely at the build-time, and once again focussing on the service interfaces, the metadata service definitions are used by the application tooling to provide code generation of the service interface implementations (the service proxies, in different technology environments).

The application's metadata is used directly to derive generated code. It is important to note the metadata feeds into the build of the application's services also, as does the common infrastructure of the application. The generated code

3 The potential impact on team dynamics and corresponding management techniques when applying these techniques is outside the scope of this paper, and is a substantial enough subject in its own right to warrant a dedicated article.


should be based on the same code base (the framework) as the core application. This helps to keep the volume of generated code to a minimum – code is generated on top of this infrastructure layer. This is important to help reduce the need for complete regression testing.

The role of the generated code is to provide the marshalling of requests from one format (such as a request originated from a CORBA peer, or a Web Services POST) to the generic internal metadata-oriented request format. The metadata must contain information about the service interface, such as parameter types and names, but can be extended to include default values and validation. If extended in this vein, the generated proxy will be able to use the same validation rules as the core service (on the assumption that the same metadata is used by the core service – which in this scenario, it should be!).
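A minimal sketch of what such a generated proxy does at runtime, assuming a metadata format extended with defaults and validation rules (the `priceQuote` service, its parameters and the `marshal` helper are all invented for illustration, not the article's actual schema):

```python
# Illustrative metadata: parameter names, types, a default value and a
# validation rule, exactly the kinds of extension described in the text.
METADATA = {
    "priceQuote": {
        "params": [
            {"name": "product", "type": str},
            {"name": "quantity", "type": int, "default": 1,
             "validate": lambda v: v > 0},
        ]
    }
}

def marshal(service, external_request):
    """What a generated proxy does: turn an external request (here a flat
    dict standing in for, say, a Web Services POST) into the generic
    internal metadata-oriented request format, applying defaults and
    validation drawn from the metadata."""
    context = {}
    for p in METADATA[service]["params"]:
        value = external_request.get(p["name"], p.get("default"))
        if value is None:
            raise ValueError(f"missing parameter: {p['name']}")
        if not isinstance(value, p["type"]):
            raise TypeError(f"bad type for parameter: {p['name']}")
        if not p.get("validate", lambda v: True)(value):
            raise ValueError(f"validation failed: {p['name']}")
        context[p["name"]] = value
    return {"service": service, "context": context}
```

Because the proxy and the core service would read the same metadata, both apply identical validation rules.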

The generic infrastructure is, fundamentally, extended to support a specific technology via systemic code generation. The code generation should simply be defining a wrapper on the underlying application infrastructure to marshal requests from 'one side to the other'. This is made possible by the presence of a metadata model, as that model defines an overall structural representation for the service interfaces. The result is that the infrastructure for a given technical environment can be tested independently of the code generation, a key factor in increasing the quality of a solution employing code generation. Deriving from the metadata also permits different code generation policies to be applied, such as 'include code generation for request validation' versus 'no code generation for validation'.
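To make the 'thin wrapper plus policy' idea concrete, here is a toy generator in that spirit. The template, the `infrastructure.dispatch` call and the policy flag are assumptions for illustration; a real generator would target CORBA or Web Services stubs rather than Python source.

```python
# Template for a generated proxy: nothing but marshalling on top of an
# independently tested infrastructure layer.
TEMPLATE = '''\
def {name}_proxy(request, infrastructure):
{validation}    return infrastructure.dispatch("{name}", request)
'''

def generate_proxy(service_name, required_params, validate=True):
    """Emit proxy source from metadata. The `validate` flag is the code
    generation policy: 'include request validation' versus 'none'."""
    if validate:
        checks = "".join(
            f'    assert "{p}" in request, "missing {p}"\n'
            for p in required_params)
    else:
        checks = ""
    return TEMPLATE.format(name=service_name, validation=checks)
```

The generated wrapper contains no business logic, so regenerating it after a metadata change cannot disturb the tested infrastructure beneath it.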

One downside to code generation is the need to ensure that all critical requirements of the target environment (to be bridged by the code generator) are well described in the service metadata model. If not, the metadata model needs to be revised or extended to support the concepts in the target environment that are absent from the metadata model (that is, the model is not complete).

Another downside to deriving the service proxies in this manner is that of versioning. Versioning and configuration management is often a very thorny subject, and certainly a subject that warrants dedicated treatment. However, the issues facing proxy versioning are not so different from the issues faced by more typical development approaches. If a service definition changes, the associated proxy will need to be regenerated and any integration against that proxy will need to be assessed for impact (all facilitated by the formalised traceability of the approach). This is no different to any other scenario, except here we need to ensure the core services, infrastructure and generated proxies all match what was intended in a deployment!

For the former, there is no real answer to this as it is all about the completeness of the metadata model being utilised. Looking for standards may help, but nothing will beat a well reviewed and documented (use cases!) model. Fortunately, for the latter, derivation from a metadata model means both packaging and impact analysis on changes to packages or packaging may be performed within the same tools-based regime. At code generation, the exact configuration of the build is well known and can be used to populate a deployment repository.

[Figure 4. A metadata-driven process: solution analysis feeds the application design and the core infrastructure design & build (drawing on infrastructure products); the application design populates the design-time model/metadata repository; the application tooling design & build, using development tools APIs and code generator tooling, works with the application framework APIs to produce the application infrastructure.]


Run-time
Returning to the original intent of showing what can be done with a metadata-driven approach to design and development of applications, the best place to review that is in the runtime. Assuming the build process used a common application infrastructure tailored towards the metadata service definitions, and that metadata was used in the creation (coding – no magic there!) of the core services of the application as well as the generation of service proxies, the consistency of the overall solution will be as complete as the metadata model itself.

A request originating from any supported external source is marshalled by the respective service proxy. In that marshalling, dependent on the code generation policy employed, the service proxy may perform some validation of the request based on the rules supplied via the metadata. This could even include authentication, authorisation and session management via a hand-off of metadata to delegated sub-systems.

Once marshalled into the generic service request format, the request is forwarded to the request manager for execution. In that execution, the request manager interrogates the request and matches it against the metadata for the target [core] service. If all is well, the request is accepted by the target service and that service may then use additional metadata in the processing of the request. This is shown in Figure 6.
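The request manager's interrogate-and-match step can be sketched as follows. The `openAccount` service, its required fields and the dispatch table are invented for illustration only.

```python
# Illustrative metadata and service table; in the article's application
# both would come from the database-held service definitions.
METADATA = {"openAccount": {"required": {"customer", "currency"}}}
SERVICES = {"openAccount": lambda ctx: f"opened for {ctx['customer']}"}

def request_manager(request):
    """Interrogate the generic request, match it against the metadata for
    the target core service, and only then hand it to that service."""
    service = request["service"]
    meta = METADATA.get(service)
    if meta is None:
        raise KeyError(f"no metadata for service: {service}")
    missing = meta["required"] - request["context"].keys()
    if missing:
        raise ValueError(f"request rejected, missing: {sorted(missing)}")
    return SERVICES[service](request["context"])
```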

The final result is an approach to solution design that flows from inception to implementation, providing a pragmatic view on how and why metadata-driven applications potentially have a longer life than their more traditional counterparts.

Documentation
If this approach permits the generation of code from service definitions, there is no reason not to generate the service interface specification documentation also. This has been common practice for many years, with top-down documentation generation from tools like Rational Rose (to Microsoft Office Word) or bottom-up via code-oriented tools such as AutoDuck or JavaDoc.
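Since the service definitions are machine-readable, the documentation generator is almost trivial. A sketch, with an invented metadata shape and output format:

```python
def generate_docs(definitions):
    """Render interface documentation straight from the same metadata
    used for code generation (format is illustrative only)."""
    lines = []
    for name, definition in sorted(definitions.items()):
        lines.append(f"Service: {name}")
        for p in definition["params"]:
            lines.append(f"  - {p['name']} ({p['type']})")
    return "\n".join(lines)
```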

Applicability
Worthy of a few observations is a brief summation of the organisational aspects involved in the application of this approach. To characterise the types of organisations that often appear drawn to a metadata-driven philosophy to developments, this approach probably is geared:

– for designers and developers that want a closer link between the design and the implementation;

– for organisations that are looking seriously at a real infrastructure-driven, higher productivity and higher quality approach to their development;

– for organisations that are not nervous about getting into bed with a couple of productivity enhancing tools;

– for organisations looking to build knowledge repositories and the

[Figure 5. Systemic code generation: the metadata service definitions drive a code generator that produces WS code-behind modules + WSDL and CORBA implementation classes & IDL as WS and CORBA specialisations of the common application infrastructure, alongside the hand-coding of the application services.]

[Figure 6. Request processing: requests to/from clients and peers arrive through the WS-based application infrastructure (WS code-behind modules, WSDL) or the CORBA-based application infrastructure (CORBA implementation classes, IDL), are marshalled into the common application infrastructure, and are dispatched by the request manager, guided by the metadata service definitions, to the application services.]


associated tools, to better describe their product(s) via a formalised description language;

– for organisations who need to support a range of technical environments, particularly at the system boundary.

Positive benefits?
This approach has been used to varying degrees on many projects I have discussed, reviewed and worked with. Many of these projects have sought the high goal of complete forward engineering from the models to the runtime, and some have found success when dealing with a specific domain.

At Temenos, we have been using this and related techniques in the production of the new 24x7-capable banking system, 'T24', launched late in 2003. More specifically, we have used these techniques in the production of a software development kit (programmatic APIs on existing functions) and Web Services deployment tooling. It might not be easy, and it might hurt your head from time to time. But looking at the model for the T24 solution, it is clear to us that a metadata-driven approach to the design and development of your applications will, when the next technology wave comes, help you engineer your existing services out of a hole and into the limelight.

Check, carefully, the initiatives of many of the IDE and tools vendors. Metadata representation and code generation is being courted once more. This time, however, the aim appears to be to help the development process become more productive by providing tools to manage the abstractions and complexity in today's technical environment.


Kevin S Perera
[email protected]@sandokan.co.uk

Working within the Technology & Research team at TEMENOS, a core systems provider to the finance industry, Kevin Perera is engaged in the role of Systems Architect. Advising and guiding the key design and development teams of the company, Kevin's primary goal is to ensure the overall consistency of the solution components and their alignment with the technical strategy of the TEMENOS product suite. His main areas of focus during 2003 have been XML interfaces and protocols, with a particular interest in the provision of APIs for 'business services' provided by the existing applications of the company in new technology domains such as .NET and Web Services.


JOURNAL1 | Rule-Based App Development 39

The word 'knowledge', like many words adapted for computer science, has a technical meaning that is different from its common meaning – and like many such words, it has been defined and re-defined many times to suit the needs of various trends in computer science.

This paper takes a high-level view of knowledge, using the word in its more general sense rather than as a specific technical term, and then looks at different types of knowledge and their mappings to executable computer code. The purpose is to gain insights into when and why rule engines provide advantages over conventional software development tools.

The three types of knowledge considered are factual, procedural, and logical. These divisions correspond to the capabilities of computers. The first two map naturally to a computer's architecture; the third does not.

Factual Knowledge
Factual knowledge is just that: facts, or data. It can be facts about customers, orders, products, or the speed of light.

Computers have memory and external storage devices. These are ideally suited to the storage and retrieval of factual knowledge. Database tools and programming languages that manipulate memory have evolved naturally from these basic components of machine architecture.

Factual knowledge appears in the computer as either elements in a database or variables and constants in computer programs, as shown in Figure 1.

Procedural Knowledge
Procedural knowledge is the knowledge about how to perform some task. It can be how to process an order, search the Web, or calculate a Fourier transform.

Computers have a central processing unit (CPU) that processes instructions one at a time. This makes a computer well suited to storing and executing procedures. Programming languages that make it easy to encode and execute procedural knowledge have evolved naturally from this basic computational component.

Procedural knowledge appears in a computer as sequences of statements in programming languages, as shown in Figure 2.
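The two mappings so far can be made concrete in a few lines. A sketch (the order-processing steps and the 5% tax rate are invented examples, not anything from the paper):

```python
# Factual knowledge: plain data, as it would sit in a database table or
# in program variables and constants.
SPEED_OF_LIGHT_M_S = 299_792_458                 # a fact about the world
orders = [{"id": 1, "total": 120.0}]             # facts about orders

# Procedural knowledge: an ordered sequence of steps the CPU executes.
def process_order(order, discount_rate):
    total = order["total"]
    total -= total * discount_rate               # step 1: apply discount
    total += total * 0.05                        # step 2: add 5% tax (assumed)
    return round(total, 2)                       # step 3: round to cents
```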

Logical Knowledge
Logical knowledge is the knowledge of relationships between entities. It can relate a price and market considerations, a product and its components, symptoms and a diagnosis, or the relationships between various tasks.

Unlike factual and procedural knowledge, there is no core architectural component of a computer that is well suited to the storage and use of logical knowledge.

Typically, there are many independent chunks of logical knowledge that are too complex to store in a database, and that lack an implied order of execution, which makes them ill-suited for programming. Because it doesn't map well to the underlying computer architecture (as shown in Figure 3), logical knowledge is difficult to encode and maintain using the conventional database and

Best Practices for Rule-Based Application Development
By Dennis Merritt, Amzi! Inc.

[Figure 1. Factual Knowledge: facts in the real world map to the computer via database tools (disks) and programming tools (memory).]

[Figure 2. Procedural Knowledge: step-by-step procedures in the real world map to the computer's CPU via procedural programming tools.]

[Figure 3. Logical knowledge does not map well to computer architecture: real-world relationships fit neither the database tools (disks) nor the procedural programming tools (CPU) of the computer.]

[Figure 4. Using virtual machines for logical knowledge: a logic/rule language maps real-world relationships to a logic base running on a virtual machine.]


programming tools that have evolved from a computer's architecture.

Specialized tools, which are effectively virtual machines better suited to logical knowledge, can often be used instead of conventional tools (as shown in Figure 4). Rule engines and logic engines are two examples.
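To see what such a virtual machine does, here is a minimal forward-chaining rule engine. It is a sketch of the general technique, not any particular vendor's product; the diagnostic rules are invented. Each rule is an independent chunk with no implied execution order, and the engine keeps firing rules until no new facts emerge.

```python
# Each rule: (condition over the known facts, fact to conclude).
RULES = [
    (lambda facts: "fever" in facts and "rash" in facts, "measles-suspected"),
    (lambda facts: "measles-suspected" in facts, "see-doctor"),
]

def run_rules(facts, rules=RULES):
    """Forward chaining: repeatedly fire any rule whose condition holds,
    adding its conclusion, until a fixed point is reached."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for condition, conclusion in rules:
            if conclusion not in facts and condition(facts):
                facts.add(conclusion)
                changed = True
    return facts
```

Note that reordering the rules in `RULES` does not change the result; the engine, not the programmer, finds the applicable rules.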

Conventional vs Specialized Tools
Logical knowledge is often at the core of business automation, and is often associated with the 'difficult' modules of an application. Consider, for example, a pricing module for phone calls or airline seats, or an order configuration module. Furthermore, logical knowledge is often changing. Government regulations are expressed as logical knowledge, as are the effects of changing market conditions. Business rules that drive an organization are almost always expressed as logical knowledge.

Because of the critical role logical knowledge can play, there are good arguments for using specialized tools which make the encoding of logical knowledge quicker, easier and more reliable. There are also, however, good arguments against them, foremost being the ready pool of talent that is familiar with conventional tools. There is a lot to be said for sticking with the familiar, although in general the cost is lengthy development times, tedious maintenance cycles, a higher than normal error rate, and often compromises in the quality of service the application provides. On the other hand, there are some well-known problems with rule engines and other tools designed for working with logical knowledge:

– There are many choices, and they are usually vendor specific. There isn't a standard rule language to use.

– Each tool is better suited for some types of logical knowledge than other types. Rules that diagnose a fault need to behave differently from rules that calculate a price, which in turn behave differently from rules that dictate how an order can be configured.

– Maintenance is not as easy as sometimes promised. It is important that the rules all use similar terms and definitions, otherwise the interrelationships between the rules don't work as intended – making maintenance difficult. Furthermore, because there is no order to the rules, tracking interrelationships can be difficult.

– There is no standard application program interface (API) for integrating a rule engine with other components of an application.

Given these difficulties, is the payoff from using rule-based tools worth the investment? In many cases, yes! For example, one organization that provides online mortgage services replaced 5000 lines of procedural code to price mortgages with 500 lines of logical rules. The logic-based solution was implemented in two months, as opposed to the year invested in the original module. Maintenance turnaround due to changing market conditions was reduced from weeks to hours, and the errors in the resulting code went to practically zero.

One reason for near-zero errors is simply that there's less code to go wrong, but the main reason is the code closely reflects the logical knowledge as it is expressed. There is no tricky translation from business specification to ill-suited procedural code.

The biggest win of all might be the flexibility the logic-based solution provided them with, allowing them to expand their product offerings. They could now offer more and better mortgage pricing options for their customers, including the option of customizing the pricing logic for each institutional customer.

This is not an uncommon story. The same benefits and 10-to-1 improvement ratio appear over and over again in the success stories of vendors of rule-based and logic technologies.

Semantic Gap
The concept of 'semantic gap' can be used to explain many of the issues with logical knowledge. A semantic gap refers to the difference between the way knowledge to be encoded in an application is naturally specified and the syntax of the computer language or tool used to do the encoding. For example, you can use assembler to code scientific equations. But it is tedious and error-prone because there is a large semantic gap between the syntax of assembler and an equation. The FORTRAN scientific programming language was invented to reduce the semantic gap. It allowed a programmer to code an equation in a way that was much closer to the way a scientist might write the equation on paper. The result was easier, quicker coding of engineering and scientific applications, and fewer errors.

Factual knowledge and procedural knowledge are both readily coded in computers because there is a reasonably small semantic gap

“Because of the critical role logical knowledge can play, there are good arguments for using specialized tools which make the encoding of logical knowledge quicker, easier and more reliable.”


between the way facts and procedures are described and the tools for encoding them. As pointed out previously, this is because computers are inherently good at facts and procedures. The semantics of logical knowledge, however, do not map readily to conventional tools. Consider this piece of knowledge:

The price of an airfare from Cincinnati to Denver is $741 if departing and returning midweek. It's $356 if the stay includes Saturday or Sunday.

The meaning, or semantics, of this knowledge is best captured in a pattern-matching sense. It really means that the details of a proposed trip should be matched against the conditions in the rule, and the appropriate rule should be used to determine the fare.

This sort of knowledge could be shoehorned into procedural code, but the semantics of procedural code are designed to express a sequence of operations, not a pattern-matching search. On the other hand, a rule engine is designed to interpret rules in a pattern-matching sense, so rules entered in such a tool will have a smaller semantic gap than rules encoded procedurally.
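The airfare knowledge above, captured declaratively rather than procedurally: each rule is a condition matched against the proposed trip, not a fork in a control flow. The `trip` representation is an assumption for illustration.

```python
# Each rule pairs a condition on the trip with a fare; the rules are data,
# in no particular execution order.
FARE_RULES = [
    {"when": lambda trip: trip["includes_weekend"], "fare": 356},
    {"when": lambda trip: not trip["includes_weekend"], "fare": 741},
]

def price(trip, rules=FARE_RULES):
    """Pattern-match the proposed trip against the fare rules."""
    for rule in rules:
        if rule["when"](trip):
            return rule["fare"]
    raise LookupError("no fare rule matched")
```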

There's If-Then, and Then There's If-Then
It is very tempting to store if-then logical relationships in procedural code, especially since procedural code has if-then statements. In fact, not only is it tempting, it can work reasonably well up to a point. However, there is a big difference between a logical relationship and a procedural if-then. A procedural if-then is really a branching statement, controlling the flow of execution of a procedure. If the condition is true, control goes one way, and if not control goes a different way. It's a fork in the road.

It's the road bit that causes the trouble. A logical relationship can be coded as a procedural if-then, but must be placed somewhere along the road of execution of the procedure it is in. Furthermore, if there are more logical relationships, they too must be placed at some point in the procedural path – and, by necessity, the placement of one affects the behaviour of another. It makes a difference which rule gets placed first, whether there are branches from previous rules, and which branch a following rule is placed on.

This is not a problem if the rules map easily to a decision tree, but in that case the knowledge is really procedural. It's also not a problem if there are a small number of rules, but as the number of rules increases it becomes very difficult to maintain them as forks in a procedural flow. The arbitrarily imposed thread of execution that links the various rules becomes extremely tangled, making the code difficult to write in the first place, and very difficult to maintain. This isn't to say it can't be done, or indeed that it isn't done; it often is. However, the module with the rules is often the most troublesome module in a system.

Once encoded procedurally, logical knowledge is no longer easily accessible; that is, it no longer looks like a collection of rules and declarative relationships. The knowledge resource has, in a sense, been lost and buried in the code, just as a scientific equation can no longer be read if it is coded in assembler.

The same is not true of either factual or procedural knowledge. In those cases, reading the code generally does show the underlying knowledge.

Databases for Rules

It is possible, in some cases, to shoehorn logical relationships into a database. If the relationships can be represented in tabular form, then a database table can be used to encode the rule. So for example, if the amount of discount a customer got was dependent on the amount of previous sales at a few different levels, this could be represented as a table and stored in a database. However, as with the procedural approach, the database approach is limited in that it only works for very clean sorts of logical relationships.
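As a sketch of the tabular approach, the hypothetical discount rule above maps cleanly onto rows of thresholds and rates, exactly the kind of data a database table can hold:

```python
# A hypothetical discount-tier rule stored as a table, the way it
# might live in a database: (minimum previous sales, discount rate).
DISCOUNT_TIERS = [
    (50000, 0.15),
    (10000, 0.10),
    (1000, 0.05),
    (0, 0.00),
]

def discount_for(previous_sales):
    # Scan tiers from highest threshold down; the first match wins.
    for threshold, rate in DISCOUNT_TIERS:
        if previous_sales >= threshold:
            return rate

print(discount_for(12500))  # 0.10
```

This works precisely because the relationship is "very clean": one input, one output, no exceptions. The moment a rule acquires conditions that do not fit the columns, the table breaks down.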

A Mixed Approach

Sometimes applications use a mixture of both the procedural and database approaches. Logical relationships that can be expressed in tables are stored in a database, and the remaining relationships are coded as procedural if-then statements.

This can simplify the coding task, but it makes maintenance harder because the logical knowledge is now spread across two different vehicles.

Despite these difficulties, there is a strong appeal to using data, procedure or both to encode logical knowledge: they are familiar techniques, and there are numerous individuals skilled in their use.

Artificial Intelligence

The problems with encoding logical relationships were first explored back in the 1970s by researchers at Stanford University. They were trying to build a system that advised physicians on courses of antibiotics for treating bacterial infections of the blood and meningitis. They found that the medical knowledge consists mainly of logical relationships that can be expressed as if-then rules.

JOURNAL1 | Rule-Based App Development 42

They attempted many times to encode the knowledge using conventional tools, and failed because of the problems described previously.

If the problem with coding logical knowledge is that the nature of a computer is not well-suited to expressing logical relationships, then clearly the answer is to create a machine that is. Building specialised hardware is not very practical, but it turns out a computer is a good tool for creating virtual computers. This is what the researchers at Stanford did. They effectively created a virtual machine that was programmed using logical rules. This type of virtual machine is often called a rule engine.

Why is a computer good at building a rule engine, but not the rules themselves? It is because the behaviour of a rule engine can be expressed in a procedural algorithm, along the lines of:

– Search for a rule that matches the pattern of data

– Execute that rule

– Go to top
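That three-step loop is the heart of a data-driven rule engine, and it can be sketched in a few lines of Python. The rule format and the configuration facts below are invented for illustration; real engines use far richer pattern matching.

```python
def run_engine(rules, facts):
    """rules: list of (condition, new_fact) pairs, where condition is
    a predicate over the current fact set and new_fact is added when
    the rule fires."""
    fired = True
    while fired:                           # "go to top"
        fired = False
        for condition, new_fact in rules:  # search for a matching rule
            if new_fact not in facts and condition(facts):
                facts.add(new_fact)        # execute that rule
                fired = True
                break
    return facts

# Invented configuration facts: a custom door leads to hinges, which
# lead to a choice of hinge type.
rules = [
    (lambda f: "custom door" in f, "needs hinges"),
    (lambda f: "needs hinges" in f, "ball-bearing hinges"),
]
print(sorted(run_engine(rules, {"custom door"})))
# ['ball-bearing hinges', 'custom door', 'needs hinges']
```

Note that the engine itself is entirely procedural, which is exactly why a computer is good at building one, while the rules it interprets remain declarative and order-independent.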

The Stanford researchers who were working on the first rule-based systems had originally called their work ‘heuristic programming’, which is, of course, a fancy way of saying rule-based programming. Because a large amount of human thought seems to involve the dynamic application of pattern-matching rules stored in our brains, the idea surfaced that this was somehow ‘artificial intelligence’. However, the real reason for the growth of the term was pure and simple marketing – it was easier to get Department of Defence funding for advanced research on Artificial Intelligence (AI) than it was for heuristic programming. The term ‘expert system’ was also invented at about this time for the same reasons.

The media, too, was very excited about the idea of Artificial Intelligence and expert systems, and the software industry went through a cycle of tremendous hype about AI, followed by disillusionment as the technology simply couldn’t live up to the hype. Those companies that survived and continue to market and sell the technology have found the term AI to be a detriment, so they looked for a different term. Now it is most often called rule-based programming.

Whether you call it heuristic programming, Artificial Intelligence, expert systems, or business rule processing, the underlying technology is the same – a virtual engine that uses pattern-matching search to find and apply the right logical knowledge at the right time.

Other Logical Virtual Engines

Virtual engines programmed with declarative rules are not the only example of specialized software designed to deal with logical knowledge. Other such programs dramatically altered the course of the history of computing.

In the early days of data processing, reports from a database had to be coded using COBOL. But reporting requirements were specified as logical relationships – these columns, these partial sums, and so on. Culprit was a report writer that let a user specify, in a declarative, logical way, the knowledge about a report. It would then generate the procedural code to make the report happen.

The benefits were exactly as we’ve discussed – users could now create and maintain their own reports without having to go through a programmer. The result was quicker reports, faster turnaround of new reports, and reporting that met users’ needs much better than procedural approaches channelled through programming groups.

The resistance to this technology was also exactly the same. Data processing departments did not want to use a separate tool for reports; they knew COBOL. The product only became a commercial success when it was marketed to the end users, and not data processing departments.

Culprit was the first commercial software product from a non-hardware company, launching the software industry.

The VisiCalc spreadsheet program was another example. It let users easily describe the logical relationships between cells without having to write procedural code. As with rule-based languages and report writers, the key was a virtual engine that translated the logical knowledge into executable procedural code.

Spreadsheet applications drove the early acceptance of personal computers.



Data & Process First, Then Logic

Recent work at Stanford has explored the question of why AI is not more widely spread in medicine, but their observations apply to application software in general. Logical knowledge expresses relationships between entities, and unless those entities are available for computation, the logic cannot be automated.

One can write a rule-based system that helps deal with patients and other aspects of health care, but without the underlying patient data to reason over, the logic base is of limited value.

The same is true for other application areas. Without a database of product components, you can’t build a configuration system, and without the raw data of phone call dates, times and durations, you can’t implement a pricing module.

The lack of underlying data reflects the delay in widespread adoption of new technologies. It is only recently, for example, that a higher percentage of doctors’ offices are using computer-based patient records.

Likewise, it is only in recent years that more and more data is becoming available in relational databases that lend themselves to multiple application uses, rather than the application-specific files that have been used for most of the history of computing.

Early AI success stories are related to applications that dynamically gather the data from users, as in diagnostic systems, or that work with organizations that have good computerized records, such as insurance and phone companies. As more and more data becomes readily accessible for multi-application use, there will be more and more applications deploying logical knowledge.

Logical Knowledge Tools

Tools for encoding and deploying logical knowledge are relatively straightforward. The two critical parts are a knowledge representation language and a reasoning engine.

Knowledge Representation Language

Knowledge representation is the syntax of a particular tool. Each tool allows the entering of logical knowledge in a certain format, which might be simple if-then statements referencing simple entities, or complex if-then statements that reference complex objects and properties. They can be in an English-like syntax or more closely resemble formal logic. The classic design tradeoffs of ease-of-use versus expressive power apply.

A tool might provide other means for expressing logical knowledge as well, such as hierarchical structures, and might include capabilities for expressing uncertainty. Uncertainty is useful for some types of applications, like diagnosis, where there isn’t a clear right answer, but just gets in the way for something like pricing, where there is only one right answer. Uncertainty itself comes in competing flavours – fuzzy, Bayesian, and the original ‘certainty factors’.

Reasoning Engine

The reasoning engine determines how rules will be applied at runtime. All reasoning engines are basically pattern-matching search engines, looping through the rules, deciding which to use next, and then repeating the process until some end condition is reached. However, there can be major differences in how patterns are searched for, and what happens when a rule is used.

The two basic strategies are goal-driven and data-driven. A goal-driven reasoning engine looks for a rule that provides an answer for a goal, such as price. Having found one, it then looks for any sub-goals that might be necessary, such as customer status. A data-driven reasoning engine looks at the currently known data and picks a rule that can do something with that data. It then fires the rule, which will add to or otherwise change the known data. For example, a customer might want to configure a custom door, which is the first bit of data, and matches a rule that adds the data that hinges are needed, which leads to a rule that decides what type of hinges.
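The goal-driven strategy can also be sketched in miniature: to establish a goal such as price, find a rule that concludes it, then recursively establish the sub-goals its condition needs. The rules and values below are invented for illustration; integer amounts are used to keep the arithmetic exact.

```python
# Hypothetical goal-driven (backward-chaining) rule base.
# Each goal maps to (sub-goals, function combining their values).
RULES = {
    "price": (("base_price", "status"),
              # gold customers get a flat 30 off the base price
              lambda base, status: base - (30 if status == "gold" else 0)),
    "status": (("previous_sales",),
               lambda sales: "gold" if sales > 10000 else "standard"),
}

def solve(goal, known):
    if goal in known:                 # the goal is already a known fact
        return known[goal]
    subgoals, combine = RULES[goal]   # find a rule that answers the goal
    values = [solve(g, known) for g in subgoals]  # recurse on sub-goals
    known[goal] = combine(*values)
    return known[goal]

print(solve("price", {"base_price": 200, "previous_sales": 15000}))  # 170
```

Note the contrast with the data-driven loop: here nothing fires until something asks for "price", and only the rules needed to answer that question are ever touched.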

Within these two basic schemes, there are many application-specific variations one might encounter. A diagnostic reasoning engine might have strategies that let it follow the most likely paths to a solution first, or the least expensive if there are requirements for the user to research more information.

A critical aspect of any reasoning engine is an API that can be called from other application components. This lets logic bases be integrated into an application context in a manner that lets the logic base be updated without requiring updates to the main application code.


Ontology

One of the biggest problems with maintaining a logic base is consistency of definitions. If you are writing a technical support system, for example, and one rule refers to Microsoft® Windows® and another to XP, they won’t communicate unless somehow the system ‘understands’ that XP is a type of Windows operating system. This, unfortunately, is the difficult part about maintaining a logic base. While it is easy to write and add rules, unless each rule author uses the same terminology the rules will not work as a cohesive unit.

A solution to the naming problem is an ontology. As with other terms, ontology is a perfectly normal word appropriated for use in computer science. The dictionary definition of ontology has little to do with the computer science use of the word. (Not surprisingly, the term ontology was coined by the same people who decided heuristic was a better word than rule.)

A logic base ontology is a collection of definitions and relationships between entities that can then be used by other components of an application. An ontology would have the information that XP is a type of Windows, and that Windows is a type of operating system. An ontology would also know that Windows 2000, Win2000, and Win2K are all synonyms. Given an ontology, rules can now be entered that refer to XP and be ‘understood’ to be referring to an operating system. For example, a rule might have the condition ‘if system is an operating system …’, and that rule will fire if the value of system is XP. An ontology provides an alternate way to represent logical knowledge relating to terminology that is a powerful adjunct to the more common rules.
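A toy version of such an ontology, using the Windows example: ‘is-a’ links plus synonym normalization. The data structures here are an illustration, not any particular product’s format.

```python
# A minimal ontology sketch: each term points at the broader term it
# "is a" kind of, and synonyms are normalized to a canonical name.
IS_A = {
    "XP": "Windows",
    "Windows 2000": "Windows",
    "Windows": "operating system",
}
SYNONYMS = {"Win2000": "Windows 2000", "Win2K": "Windows 2000"}

def normalize(term):
    return SYNONYMS.get(term, term)

def is_a(term, category):
    # Walk up the is-a chain from the normalized term.
    term = normalize(term)
    while term in IS_A:
        term = IS_A[term]
        if term == category:
            return True
    return False

print(is_a("Win2K", "operating system"))  # True
```

With this in place, a rule conditioned on ‘system is an operating system’ fires when the value of system is XP or Win2K, even though neither string appears in the rule itself.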

Custom Rule Engines

Given the wide variety of ways in which rule engines can represent knowledge, reason with that knowledge, and integrate with the surrounding environment, it is sometimes difficult to choose the right one. A general-purpose rule engine will fit a wide range of problems, but might not fit them very well, requiring some stuffing and bending around the corners. For this reason you will find rule engine products designed for specific applications. Microsoft BizTalk® Server is a perfect example. It is a tool designed for integrating business processes. It ‘knows’ about business processes, and passing messages between them, and can be used to express the rules of which process fires when, under what conditions, and which other processes need to be informed of what when it happens.

There are also products for pricing problems, support problems, configuration problems and a number of other common areas. Each of these will work better for the problem domain it is designed for, but won’t be much help for other problem areas.

There is another option to consider as well, and that is the creation of a custom solution for a particular application. Rule engines are not that difficult to write, and building one for a particular application allows for the best possible knowledge representation, reasoning strategy and integration with the main components of the application. The key advantage relates back to semantic gap. A custom knowledge representation language can provide the smallest possible semantic gap for an application, with the associated benefits of ease of development and ease of maintenance.

Short Case Studies

In order to better understand the advantages of rule-based solutions, consider the following case studies.

Workflow

A number of companies use logic engines or specialized rule-based tools for encoding logical knowledge about workflow. Typically, these tools integrate the rules governing workflow with the larger facilities of an application.

For example, one large supplier of workflow for the telecommunications industry has integrated the rules describing workflow with the facilities of the larger application context, so the rules can be directly applied to the tasks in the telecommunications domain.

This allows for the separation of the business logic defining workflow rules from the procedural knowledge of the actual processes that need to be performed, and it puts that workflow knowledge in a representation that can be easily understood and maintained.

Configuration

A vendor of windows and doors uses a logic base of rules to drive interactive product configuration through a Visual Basic interface. Contractors use the Visual Basic program to determine the best configuration for a job site, and then automatically connect with the company’s server for entering an order.



They have customized their own development front-end using Excel, allowing the experts to directly maintain the logical knowledge of product configuration using a familiar tool. The spreadsheet is translated to a lower-level rule language that is then used to deploy the knowledge.

Because the logic base is a separate entity from the main application code, it can be easily updated. Whenever the user, working with the Visual Basic program, connects to the server, updates to the configuration logic base are automatically downloaded.

The result is a very flexible and responsive architecture for providing their customers, the contractors, with a powerful tool for deciding on and ordering the best products for a particular job.

Mining

A sophisticated pattern-matching application determines if a geologic site has good mining potential. The rules that match geologic characteristics and mining potential are in a logic base that is maintained separately from the Visual Basic interface that graphically displays geologic maps and other information about the potential site. Key to this application is an ontology of definitions of mining terminology that allows geologic field data to be easily accessed by the pattern-matching rules. Without the ontology, it would be very difficult for the rules to make use of the field data entered by different geologists with different ways of expressing the same geologic concepts. The ontology is stored and maintained as part of the logic base.

The application is currently a stand-alone Visual Basic application but will be deployed on the Web using Visual Basic.NET.

Detailed Case Study – Vaccinations

Visual Data LLC provides a Windows software product called Office Practicum for paediatricians’ offices. It keeps medical records for patients and performs all of the ‘data’ and ‘processing’ functions you might expect.

One of the items it tracks for a patient is vaccination history. It turns out that one of the problems for a paediatrician is following all of the complex rules and regulations for vaccinations, and scheduling children for future appointments based on their vaccination needs.

Customers asked Visual Data to provide a feature in Office Practicum that would tell which vaccinations were up to date for a child on a visit, and which were due. It should also be able to provide reports on each child analyzing their vaccination histories, making sure they were in compliance with regulations for schools and summer camps. This took Visual Data into the realm of encoding logical knowledge. The knowledge about vaccinations is published in papers made available by the CDC. Each vaccine has one or more schedules of doses, based on the particular type of vaccine, and each has numerous exception rules that describe conditions when a vaccination may or may not be given. There are a number of interesting observations to be made about this application.

Data and Process First, then Logic

The first relates to the Stanford comment about AI in medicine, which was that AI had not advanced due to the lack of data. They observed that AI is really the encoding of logical relationships, but, without entities for the logical knowledge to reason over, there is no practical value in automating the logic. The vaccination program illustrates this.

People in the past have worked on AI systems to automate vaccination logic, but the patient data on vaccination history was not readily available. It had to be typed in by hand as input to the system in order to get a vaccination schedule. However, any medical practitioner experienced in vaccinations could figure out the schedule directly from the data in about the same time without having to engage a computer in the process. So there wasn’t much point.

Office Practicum provides enough help in the day-to-day business of running a paediatrician’s office that collecting data on patient histories comes naturally. Because that data is in the computer, and because the office is already using the computer for other aspects of managing the patient, it now makes sense to automate the logical knowledge for vaccination scheduling. In fact, it was the customers who started to ask for this feature after using the software. They noted that all the vaccination information was in the computer, so why couldn’t it automatically generate the vaccination schedules?

Procedural Code Works, but is Impractical

Visual Data first attacked the problem by attempting to encode the vaccination logic using procedural code. In their case the application is developed in Borland’s Delphi, and they used Pascal for the encoding. The software worked, but was difficult to write, was in a large complex module, and only provided some of the features they wanted to provide.

However, the world of vaccines kept changing. New vaccines were coming out that combined earlier vaccines in a single vaccination, with new, more complex rules about the interactions between the components. Customers wanted to know when the software would support Pediatrix, a new complex multi-disease vaccine. The software developers groaned.

While they were a Delphi shop, familiar with Delphi, and would love to do all their work in Delphi, they realized the vaccination module was just too difficult to maintain, so they opted for a logic base solution. The logic base reduced the code size from thousands of lines of code to hundreds of lines of easily understandable rules. It was the same 10:1 ratio seen so many times for these applications.

Further, the rules were now in a format that their resident paediatrician, not a programmer, could understand. The application was restructured so that the Delphi code called the logic base, much the same way it called the database. The ‘knowledge’ of vaccination scheduling was now completely outside of the core Delphi code. The logic base can be updated without affecting the main application, just as the database can be updated without changing the application.

Unlike the database, the logic base must be tested, and Office Practicum uses a tool set to independently test the rules. Regression tests are a part of the system, so that various scenarios can be automatically retested when changes are made to the logic base.

The Nature of Vaccine Logical Knowledge

Visual Data did not use an off-the-shelf rule engine for a couple of reasons. One was cost but, more important, the logical knowledge of vaccines seemed to require its own specific set of ways to represent knowledge. These included definitions, tables and rules. While all three could be stored in rules, some of the visual clarity of the mapping from documentation to logic base would be lost. It would be better if the logic base more directly expressed the CDC logic.

Further, most of the logical knowledge had to do with dates and date intervals expressed in any of the date units – days, months, weeks or years. The conditions in the rules needed to use the intervals and dates correctly, and assume age from context when appropriate.

Accordingly, the knowledge representation language for the vaccination system was designed to have:

– An ontology of terms to store definitions.

– A means of entering tabular knowledge.

– A means of entering rules.

– Language statements that recognized dates and date intervals.

This made it easy to add new definitions without affecting the rules, allowed for the direct encoding of the basic tables, and enabled the rules to refer to the tables and reason with concepts such as the last live virus vaccination.

Ontology

The ontology describes semantic relationships, such as the various types of vaccines that are Hib vaccines, as well as distinguishing between those that contain PRP-OMP and those that don’t. This is critical because different schedules are used for each.

It also describes which vaccines are live virus vaccines, another critical fact used in many of the rules concerned with the interaction between different vaccines that both contain live viruses.

Additionally, there are multi-vaccine products, such as a combined measles, mumps and rubella (MMR) vaccine. There are rules that are just concerned with, for example, whether or not a child had a measles vaccine, but the database might indicate MMR. This knowledge is all in the ontology as well.

Here, for example, are the definitions in the logic base that indicate Varicella and Small Pox are live virus vaccinations:

live_virus ->> 'Varicella'.
live_virus ->> 'Small Pox'.

Here are some different types of Hib.

'Hib' ->> 'HbOC'.
'Hib' ->> 'PRP-OMP'.
'Hib' ->> 'PRP-T'.

Tables

Standard tables provide the minimum age, recommended age, and minimum spacing interval for each dose of a vaccine. If this was all there was to the vaccination logic, then a database solution or other table lookup would have worked, although even the tables aren’t that simple. For a given vaccine, different tables apply depending on factors such as whether it is a multi-vaccine, what the active components are, and whether or not the child has followed a standard schedule.

Here’s an example of a table in the logic base that describes the Hib schedule for vaccines containing PRP-OMP:

table('Recommended B', [
   % Recommended Schedule B from 'DHS Hib 2003Mar'
   % for vaccines containing PRP-OMP
   % Dose  Minimum Age  Minimum Spacing  Recommended Age
   [1,     6 weeks,     none,            2 months],
   [2,     10 weeks,    4 weeks,         4 months],
   [3,     12 months,   8 weeks,         15 months]]).

Rules

The rules work in concert with the definitions and the tables. They are used to determine which table is appropriate in a given situation. They also provide coverage for all the exception cases, such as the fact that a given vaccine isn’t necessary after a certain age, or that a schedule can’t be kept if other live virus vaccines have been given, or what the corrective measures are if a previous vaccine was given earlier than allowed.

Here’s a relatively simple rule that fires when the Polio sequence is complete. Note that the ontology lets rules refer to either ‘Polio’ in general, or the two main vaccines, ‘IPV’ and ‘OPV’, separately. This rule describes when an OPV sequence is complete. The output includes an explanatory note that is displayed in verbose mode.

complete :-
    valid_count('OPV') eq 3,
    vaccination(last, 'OPV') *>= 4 years,
    !,
    output('Polio', [
        status = complete,
        citation = 'DHS IPV 2003Mar',
        note = [
            `Complete: An all OPV, three dose sequence is complete when`,
            `the last dose is given after 4 years of age.`
        ]]).

Modularization

Modularization was a key requirement for this application. The tables and rules for each vaccine were kept in separate modules. The ontology, on the other hand, was in a common module, as it was used by all the other modules.

Reasoning Engine

The reasoning engine for the vaccine logic base is designed to meet a variety of application needs. It takes as input the vaccination history of a child and then goes to the module for each vaccine in question and gets the status information for that vaccine. This includes an analysis of the past vaccinations with that vaccine; the status as of today, the current office visit; and the recommended range and minimum dates for the next vaccination with that vaccine.

Each module is designed with the same goal and output, and that goal is called by the reasoning engine. This allows for the easy addition of different vaccines, and the easy maintenance of any particular vaccine.

The reasoning engine has an application program interface (API) that is used by the calling application. The API provides the various reports required for different uses. For example, it can tell what vaccines need to be given on the day of an office visit, or what vaccines will be needed for scheduling a follow-up visit. It also allows for short and verbose reporting and explanations of the recommendations, and provides the historical analysis reporting required for camp and school forms.
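The article does not show Office Practicum’s actual API, but an interface of the kind described, history in, per-vaccine recommendations out, might be sketched as follows. The schedule data and all names here are invented for illustration; real CDC schedules also involve minimum ages, vaccine types and many exception rules.

```python
from datetime import date, timedelta

# Invented, simplified schedule: vaccine -> (total doses required,
# minimum spacing between doses). A real logic base holds far more.
SCHEDULE = {
    "DTaP": (5, timedelta(weeks=4)),
    "IPV": (4, timedelta(weeks=4)),
}

def due_today(history, today):
    """history maps vaccine name -> list of dates given; returns the
    vaccines that could be given at today's visit under the invented
    dose-count and spacing rules."""
    due = []
    for vaccine, (doses, spacing) in SCHEDULE.items():
        given = sorted(history.get(vaccine, []))
        if len(given) < doses and (not given or today - given[-1] >= spacing):
            due.append(vaccine)
    return due

history = {"DTaP": [date(2024, 1, 10)]}
print(due_today(history, date(2024, 3, 1)))  # ['DTaP', 'IPV']
```

The point of such an API is the one made earlier: the calling application asks questions and displays answers, while everything it knows about vaccine schedules lives behind the interface and can be replaced without touching application code.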

Cost Benefit

The benefits from the logic base approach have been:

– a 90% reduction in code used for vaccine logic rules,

– direct access to the knowledge by the in-house paediatrician,

– localization of all the vaccine logic, which used to be scattered in the different parts of the application with different needs,

– easy maintenance and quality assurance testing, and

– additional capabilities that were too hard to encode before, such as the complete analysis of past vaccination history and support for new multi-vaccine products.

All of these benefits add up to the one major benefit, which is that their software now provides better services for their customers in this area, which is critically important in the running of a paediatric office.


The costs were:

– time spent investigating and learning about various alternative approaches for encoding the vaccination logic,

– software license fees,

– two months’ development time, and

– time spent learning the new technology.

Conclusion

Logical knowledge, unlike factual or procedural knowledge, is difficult to encode in a computer. Yet the ability for an organization to successfully encode its logical knowledge can lead to better services for its users. The question, then, is how best to encode logical knowledge. It can be shoehorned into data- and procedure-based tools, but the encoding is difficult, the knowledge becomes opaque, and maintenance becomes a nightmare.

Rule-based and logic-based tools are better suited to the encoding of logical knowledge, but require the selection of the proper tool for the knowledge to be encoded, and the learning of how to use that tool.

Off-the-shelf rule or logic tools sometimes provide a good solution, but often the knowledge representation of the tool doesn’t fit the actual knowledge, or the reasoning engine doesn’t use the knowledge as it is supposed to be used. This leads to the same coding and maintenance problems experienced with conventional tools, though to a lesser extent depending on how big the semantic gap is between the knowledge and the tool.

A viable alternative is the building of a custom logic-based language and reasoning engine. This allows for the closest fit between the coding of the knowledge and the actual knowledge, and for the cleanest integration between the tool and the rest of the application context.

Resources

http://ksl-web.stanford.edu
Stanford’s Knowledge Systems Laboratory’s home page has information about their current work with ontologies and other research areas, as well as links to related sites and organizations.

www.aaai.org
The American Association of Artificial Intelligence (AAAI) is a non-profit scientific organization devoted to supporting AI research and development. They have conferences and journals and an excellent Web site, www.aaai.org/AITopics/aitopics.html, that introduces AI research and topics.

http://www.ddj.com/topics/ai
Dr. Dobb’s Journal’s Web site devoted to AI topics and links.

www.ainewsletter.com
Past issues of the DDJ AI Expert newsletter. You can subscribe to the newsletter at www.ddj.com.

http://www.pcai.com
PCAI Magazine has a wealth of information about AI technologies and products on their Web site.

A Google search for ‘business rule engine’ will yield links to a number of commercial offerings.

Dennis Merritt
[email protected]

Dennis Merritt is a principal in Amzi! Inc. (www.amzi.com), a small privately-funded company specializing in tools for developing knowledge-based and AI components that can be embedded in larger application contexts.

He has written two books and several articles on logic programming and expert systems, and is currently the editor of the Dr. Dobb's Journal AI Expert newsletter (www.ddj.com).


JOURNAL1 | DasBlog 49

Keeping a diary seems mostly an obsession of teenage girls, politicians, and people who are planning to write their memoirs at a later point in life. A diary is the place to write down a daily or weekly snapshot of very private thoughts that are nobody else’s business. No matter whether the diary is kept by a 15-year-old girl who is desperately in love with rock star Robbie Williams, or by the German chancellor, all diaries have one thing in common – they are top secret, locked away and never shared with anyone.

Looking at the newest trend on the Web, we might need to revise, or at least widen, our definition of a diary. Weblogs (usually shortened to ‘Blogs’) are the digital equivalent of the personal diary, and have taken the Internet by storm over the past two years. Mid-2003 estimates1 put the total number of Weblogs somewhere between 1.3 and 2.2 million, and these numbers are rapidly growing. The two most striking differences between diaries and Weblogs are that Weblogs are not only for teenage girls or politicians, and that they are not secret at all. On the contrary: Weblogs are out there for everyone to see!

So why is this topic being discussed in a magazine for software architects and information technology managers? There are two main reasons. First, there are a lot of architectural lessons that can be learned from the Weblog phenomenon and from the technologies that make the Weblog universe tick. In fact, the Weblog space as a whole has already grown to be the largest distributed XML and Web services application in existence. Second, Weblogs are becoming a strategic tool to improve communication and collaboration in the enterprise that may eventually turn out to be just as important as email.

In the first part of this article, I will analyze and explain the Weblog phenomenon and give some examples of how Weblogs can be used as collaboration systems in business environments. In the second part I will take a closer look at Weblog-related technologies, examine a concrete example of a Weblog engine in the form of newtelligence’s free Microsoft® ASP.NET based dasBlog application, and share some of the architectural lessons newtelligence learned from implementing, deploying and running Weblogs on this platform.

Part 1: The Weblog Phenomenon
Before we go into the more technical aspects, we should spend a bit of time analysing why Weblogs have become so wildly popular and why some people are publishing their once secret diary on the Internet, where it can potentially be read by anyone.

Keeping a diary, and therefore keeping a Weblog, is a very good idea, indeed. Formulating thoughts and ideas in writing requires more serious and intense consideration of a topic than just thinking about it. Keeping track of thoughts, ideas, problems, and solutions over time helps build a great growing resource of personal experience that you can turn back to in order to look up all of those details that were once well thought out and known but now almost forgotten.

However, being a good idea is not enough of a reason to explain the sudden popularity of Weblogs. What makes Weblogs really popular is that they provide a near-perfect personal publishing approach for everybody. Even with the most recent generation of HTML editing tools, maintaining a private Web site is far too complicated for most people, and the result is often disappointing when compared to the polished Web sites created by professional web-designers. Even if the technical design challenges can be overcome, there is often a ‘content dilemma’ regarding the information that should actually be included in the home page. Unless the owner is a fanatic hobbyist keen to share his expertise about postage stamps, model railways or his favourite movie star, the home page creation efforts usually result in little more than a few pictures, a personal greeting such as ‘Welcome to my home on the Web’ and some favourite links – with the frustrating consequence that nobody will ever look at it.

Weblogs – or more precisely, the tools that are used to create Weblogs – help overcome these hurdles to personal publishing. Most popular Weblog tools are in fact quite sophisticated personal content management systems that will take care of rendering content into a professional-looking HTML template. They usually provide content categorization as well as topical and historical navigation capabilities. This functionality, together with their ease of use, also seems to solve the content dilemma. If the effort to publish a quick thought of no more than three lines onto a Web site isn’t greater than that of writing an email, then it’s more likely that it will happen. The ease of publishing and the ability to publish random thoughts quickly results in a qualitative difference between home pages and Weblogs: home pages help show off HTML skills or knowledge and admiration for a particular subject; Weblogs exhibit personality and tell an ongoing story.

DasBlog: Notes from Building a Distributed .NET Collaboration System
By Clemens Vasters, newtelligence AG

“The Weblog space as a whole has already grown to be the largest distributed XML and Web services application in existence.”

1 http://www.blogcensus.net/ and http://dijest.com/bc/2003_06_23_bc.html

Moreover, Weblogs have turned into a public discussion platform where thoughts and ideas can be exchanged. Because Weblogs are formatted as hypertext, they allow links to other Weblogs, enabling Weblog authors to publicly comment on other authors’ entries, and to any other Web pages the authors want to highlight to their readers. Many Weblog tools also facilitate the collection of per-topic reader comments, allowing everybody, not only ‘Bloggers’, to participate in discussions. Weblog tools not only expose the content as Web sites, but also in XML. The most commonly used format is the Really Simple Syndication (RSS) 2.0 format (also referred to as RDF Site Summary or Rich Site Summary).

XML and RSS allow Weblog readers using special tools to consolidate all content of interest into a local content store that’s searchable, easily navigable, and provides pre-categorized views that are also available offline. The NewsGator2 plug-in for Microsoft Outlook® even goes so far as to integrate this ability into your everyday email client.
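What an aggregator does with such a feed can be sketched briefly. The minimal RSS 2.0 document below is invented for illustration; the parsing uses only Python's standard library, not any actual aggregator's code:

```python
import xml.etree.ElementTree as ET

# A minimal, hypothetical RSS 2.0 document of the kind a Weblog
# engine emits and an aggregator consumes.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Weblog</title>
    <link>http://example.org/blog</link>
    <description>Illustrative feed</description>
    <item>
      <title>Hello, Blogosphere</title>
      <link>http://example.org/blog/2003/08/01/hello</link>
      <pubDate>Fri, 01 Aug 2003 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

def item_titles(rss_xml):
    # An aggregator walks the channel/item elements to build the
    # searchable, categorized local store described above.
    root = ET.fromstring(rss_xml)
    return [item.findtext("title") for item in root.iter("item")]

print(item_titles(FEED))  # -> ['Hello, Blogosphere']
```

The simplicity of this structure, a flat list of dated items per channel, is a large part of why so many tools on so many platforms could adopt RSS.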

Weblogs in the Software Business
Recently, Weblogs have moved very much beyond being just a trendy tool for personal publishing and community discussion. Corporate knowledge and information workers already benefit greatly from Weblogs as an information source today. Software developers and architects as well as sales and marketing employees quite often find Weblog links amongst the top 10 entries of a Web search result – and sometimes Weblog entries seem to be the only good source for certain information.

Technology companies like Microsoft and Sun even have dedicated portals where their employees can host their Weblogs. However, while the Microsoft portal is quite liberal about Weblogs and only provides a set of links pointing to their employees’ personal blogs, reflecting their personal opinions and not the ‘official’ Redmond position, the Sun portal and Weblogs are much more of a developer marketing and developer education tool with a personal touch. Both approaches have upsides and downsides and neither can be called better or worse. Sun’s primary interest is a consistent, polished external corporate message, while Microsoft’s approach is to get and retain traction with customers on an informal, personal level and give them an uncensored insight into what’s happening behind the scenes.

Both the Microsoft and Sun approaches to public corporate Weblogs fill a gap in corporate messaging. The Weblog format, the personal nature, and the ease of publishing allow authors to post small, informative snippets about a product or solution path and, as demonstrated by the very liberal Microsoft approach, sometimes even without going through the usual corporate review, editing and publishing cycle. It’s obvious that even companies as large as Microsoft cannot document every problem solution and workaround using the ‘official channels’, with all of the requirements around localization, consistency and publication processes – letting employees augment the official documentation with their own insight is a brilliant way to complete the picture and fill those gaps.

The corporate use of Weblogs is most visible in the software industry, but it’s not restricted to it. There are signs that Weblogs are starting to fulfil similar roles in other fields of business such as law, the media industry, the fashion industry and various fields of engineering. Weblogs aid in bringing a more personal touch to the corporate message and are a very valuable marketing and public relations tool that demonstrates the personal competence and abilities of the people that are the drivers behind a company.

Weblogs as an Enterprise Communication and Collaboration Tool
An even more attractive application of Weblogs for corporations is information distribution and collaboration inside the corporate network. Although many enterprises have embraced and implemented portal solutions and ‘groupware’ as information distribution and discussion platforms for a long time now, Weblogs have the potential to substantially enhance the corporate information ecosystem and foster a more transparent corporate culture.

Let’s look at a few examples of how Weblogs and related technologies can be used in a corporate environment and how that use differs from more established solutions:

– Personal Weblogs Personal Weblogs are the most obvious application, but – depending on the corporate culture – also have the most potential for conflict. Personal Weblogs are a ‘personal portal’ and allow individuals to track the results of their own work and information they gather from third-party information sources such as external websites. Due to its chronological nature, a personal Weblog also provides a great way of documenting the history of decision processes and of assumptions about future events, and therefore provides a great foundation for post-mortem analysis in the event of success or, more importantly, failure. The conflict potential lies in issues like content ownership, supervision, freedom of personal expression, disciplinary consequences and similar issues.

2 NewsGator: http://www.newsgator.com

– Topic-Centric Knowledge Weblogs (K-Logs) Knowledge Logs are not person-centric but topic-centric. They are based on the same technology as Weblogs, but have multiple authors, usually from one team, but sometimes across teams or even divisions. K-Logs focus on certain subject areas and enable the aggregation of information and references to topical content as a growing and chronological repository. In this function K-Logs have substantial overlap with classic knowledge portal solutions. What makes K-Logs appealing is the relatively low software acquisition cost, the ease of use and the ability to distribute and aggregate the content via RSS. These advantages will be examined in detail later in this article.

– Team Weblogs Team Weblogs are team-centric and track the progress of the development of a certain product or project.

– Automated Weblogs Using features like ‘Mail-To-Weblog’ (discussed later in this article) or integrating with Web services exposed by Weblog engine software, Weblogs can serve as an easy-to-install and easy-to-maintain publishing point for automatically generated information. In the software industry this information could include daily reports generated from automated build and test processes; in manufacturing processes it could be statistical information, and so on. The benefit of automated Weblogs is that the effort for publishing such information in an accessible way is minimized and it is sufficient for the information provider to supply very simple plain-text fragments.

An analysis of your own corporate knowledge capturing and distribution needs might yield more possible applications of Weblogs. In addition to this, subscription-based RSS information services can help to improve information distribution inside the company without flooding email inboxes. Examples of this are automatically generated daily or hourly reports about system or machine activities, but also mundane yet important things such as today’s menu in the company’s canteen or the latest scores of the company football team.

Part 2: Weblog Technologies and Applications
The ‘Blogosphere’, as the Weblog space is also often called by Weblog aficionados, is powered by a set of core technologies and techniques that we need to explain before we can move on to the details of the concrete implementation in newtelligence’s dasBlog.

RSS: Publishing and Aggregating Information via XML
The most important Weblog technology is undoubtedly the Really Simple Syndication3 (RSS) XML format that has already been mentioned earlier in this article. RSS was initially created by Userland Software and Netscape as the XML format behind the ‘Sidebar’ feature of Netscape Navigator 6.0, which was the follow-up technology to the ‘Netcaster’ in the 4.0 generation of Netscape’s product. RSS can be seen as a response to Microsoft’s XML-based Channel Definition Format4 (CDF) for ‘Active Channels’ and the ‘Active Desktop’ that had already been introduced in Internet Explorer version 4.0, and both serve approximately the same purpose:

The common idea behind RSS and CDF was to provide machine-readable indices for websites that could be picked up by Netscape Navigator and Internet Explorer, allowing the browsers to display the current site highlights and headlines either in the Netscape Sidebar, in Internet Explorer’s Favourites View or on Windows’ ‘Active Desktop’. The vision and promise was that all news headlines and articles that a user was interested in could be imported in a quick online session and were then available offline – primarily as a convenient workaround to expensive, pay-per-minute Internet access cost. The long defunct Pointcast Network5 had a similar approach and was also (in part) using the CDF format to acquire information from its sources and to push subscribed channels to their client software. Unfortunately, none of these products and features were blessed with any great success; Pointcast went floating belly up and ‘channels’ built for either Netscape or Internet Explorer have become an extinct species. RSS survived thanks to the continued efforts of Userland Software and some other vendors focusing on small scale content management systems, from which today’s Weblog tools eventually evolved.

3 RSS 2.0 specification home: http://blogs.law.harvard.edu/tech/rss

4 CDF W3C Note: http://www.w3.org/TR/NOTE-CDFsubmit.html, Official reference: http://msdn.microsoft.com/workshop/delivery/cdf/reference/CDF.asp

5 Pointcast Webopedia entry: http://www.webopedia.com/TERM/P/PointCast.html

Despite its popularity and the fact that RSS has accumulated a critical mass of adoption that makes it hard to replace, it is widely recognized that RSS has several critical deficiencies requiring changes or additions. RSS lacks proper support for XML Namespaces, does not use the ISO time format mandated by the XML specification, has no normative XML Schema or even Document Type Definition, and the specification itself is ambiguous and lacks formality. These issues have prompted the formation of a working group6 around the IBM engineer Sam Ruby, which is working to replace RSS and to consolidate most Weblog technologies into a set of specifications under the name Atom.

Referrals: Sparking Discussion and Interaction
The public interaction between Bloggers that emerges in the Blogosphere is one of its greatest appeals and motivators. Discussions on certain topics quite often involve dozens of authors who independently publish their own views on their Weblogs, but by citing other authors and linking to their respective Weblogs they jointly create a hyperlinked mesh of views, information and opinions on a given topic.

Discussions spanning multiple Weblogs are formed by no rules other than chaos. This is distinctly different from discussions in Internet newsgroups, on mailing lists, or in public folders in a groupware system. In these, discussion participants must be active subscribers of a certain group, and the discussion usually goes unnoticed outside of the group. Google’s newsgroup archive exposes newsgroup discussions to the web to some degree, but still requires the user to explicitly search either in a particular newsgroup or for certain keywords.

The primary tool aiding the chaotic formation of discussion and interaction is a simple and well-known mechanism supported by all common Web browsers: the HTTP Referrer header. When someone comes across an interesting tip, a thought-provoking article or an opinion, they post their applause, concerns or supporting information to their own Weblog – and in doing so add a hyperlink to the cited entry on the other Weblog. Because almost all Weblog tools recognize and log the Referrer HTTP header, notifying the author of the original entry is as simple as clicking the link in a browser, because the Referrer header contains the URL of the page where the hyperlink was set. Most popular Weblog engines track and consolidate the referrals in easily accessible lists, ranking them by the number of visitors that have arrived at the Weblog through the external links and rendering the referrer URL as a clickable hyperlink – some Weblog engines can even notify their owner by email of every such referral. In that way, an author learns about external comments or citations and can post his own responses or additional comments – in the process linking back and possibly citing and linking Weblogs of third parties for examples or supporting opinions, causing the on-topic mesh to form and spread.
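The referral bookkeeping described above amounts to counting incoming `Referer` headers per target. A minimal sketch (Python for brevity, not dasBlog's actual C# code; the URLs are made up):

```python
from collections import Counter

# Tally of referring URLs, as a Weblog engine would keep per entry.
referrals = Counter()

def log_request(headers):
    # Browsers send the address of the linking page in the
    # (historically misspelled) 'Referer' header; engines count
    # and rank these to surface who is linking in.
    referrer = headers.get("Referer")
    if referrer:
        referrals[referrer] += 1

log_request({"Referer": "http://other.example/blog/entry-42"})
log_request({"Referer": "http://other.example/blog/entry-42"})
log_request({})  # direct visit: no referrer, nothing logged

print(referrals.most_common(1))
# -> [('http://other.example/blog/entry-42', 2)]
```

Rendering each counted URL as a clickable hyperlink is then all it takes to turn this passive log into the discovery mechanism the text describes.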

Because hyperlink referrals still require manual intervention in order to trigger notification, two additional and more instant notification mechanisms have gained the support of the Weblog community and tool builders: Pingback7 and Trackback8.

Pingback allows implementing automatic notifications between Weblog engines without having to rely on HTTP referrals. Pingback defines a Web service interface (using the XML-RPC Web services protocol, not its successor SOAP) and two auto-discovery mechanisms. The functional principle is very simple: when the Weblog author posts a new entry to their Weblog, the engine looks at the submitted HTML fragment and scans it for hyperlinks. It will then issue an HTTP GET request to each of those links, using one or both of the auto-discovery mechanisms, looking for an HTTP header or a special tag embedded in HTML, in order to find out whether the link target supports the Pingback protocol. If a Pingback endpoint is detected, the engine will submit a ping Web service call, supplying the URLs of both the pinged and the pinging Weblog entry. Pingback has the advantage of instant notification about citations and, just as important, about changes to these citations.
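The two auto-discovery mechanisms can be sketched as follows (Python; the header and link names follow the Pingback 1.0 specification, while the page content and endpoint URL are invented):

```python
import re

# Pingback 1.0 autodiscovery: a server advertises its XML-RPC endpoint
# either via an X-Pingback HTTP response header or a
# <link rel="pingback" href="..."> tag in the page's HTML.
LINK_RE = re.compile(r'<link\s+rel="pingback"\s+href="([^"]+)"\s*/?>',
                     re.IGNORECASE)

def discover_pingback(headers, html):
    if "X-Pingback" in headers:          # the header takes precedence
        return headers["X-Pingback"]
    match = LINK_RE.search(html)
    return match.group(1) if match else None

# Once an endpoint is found, the engine would invoke its pingback.ping
# method over XML-RPC (e.g. via xmlrpc.client.ServerProxy), passing
# the source and target entry URLs.
page = ('<html><head><link rel="pingback" '
        'href="http://example.org/pingback-endpoint"/></head></html>')
print(discover_pingback({}, page))  # -> http://example.org/pingback-endpoint
```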

Trackback aims to provide similar functionality, but with a slightly different spin. The protocol does not only provide the URL of the pinging entry, but optionally also the title and a short excerpt of the source entry along with the Weblog’s name. Contrary to Pingback, which is fully automatic, Trackback is typically used as an explicit, on-demand notification mechanism.

6 Project Atom: http://www.intertwingly.net/wiki/pie/FrontPage

7 Pingback 1.0: http://www.hixie.ch/specs/pingback/pingback

8 Trackback 1.1: http://www.movabletype.org/docs/mttrackback.html

The major technical difference between Pingback and Trackback is that Pingback employs an XML-RPC Web service interface, while a Trackback ping is technically equivalent to submitting a form in a browser – the information is posted using an HTTP POST request employing the application/x-www-form-urlencoded content type, precisely as is the case with HTML forms. Although different, both protocols succeed in achieving their goal: improving collaboration and communication.
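The wire format of a Trackback ping is therefore nothing more than a urlencoded form body. A sketch (Python; the field names follow the Trackback specification, the URLs and titles are made up):

```python
from urllib.parse import urlencode

# A Trackback ping is an HTTP POST with content type
# application/x-www-form-urlencoded – exactly what a browser sends
# for an HTML form.
def trackback_body(url, title=None, excerpt=None, blog_name=None):
    fields = {"url": url}              # the pinging entry's URL (required)
    if title:
        fields["title"] = title
    if excerpt:
        fields["excerpt"] = excerpt
    if blog_name:
        fields["blog_name"] = blog_name
    return urlencode(fields)

print(trackback_body("http://example.org/blog/entry-7",
                     title="On Weblog Protocols"))
# -> url=http%3A%2F%2Fexample.org%2Fblog%2Fentry-7&title=On+Weblog+Protocols
```

Because any HTTP client library (or even a hand-written form) can produce this body, Trackback is trivially easy to implement on both ends, which helped its adoption.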

To enable people who do not own their own Weblog to participate in Weblog discussions, most Weblog engines support user comments that can be added using a Web-based interface. Additionally, a widely adopted Web services interface for comments exists: the Comment API9. The Comment API is directly supported by some of the popular RSS aggregators like ‘RSS Bandit’10, which allow readers to post comments straight from the tool.

dasBlog: Implementing a Weblog Engine
Early in 2003, at newtelligence we decided that building our own Weblog engine would be a good thing to do. There were several motivations. As the most active and likely best known Weblog author at newtelligence, I primarily wanted to have a replacement for my previous Weblog tool for myself, with the side effect that the other colleagues at newtelligence could use it too. Writing our own blog engine also promised to give us a great set of example code to use for developer education and a platform to try out new technologies and techniques. Finally, developing a solution that supports all the described and a few more collaboration Web services seemed like a great experiment to participate in, and allowed us to research the reality of a distributed system that already implements a great deal of the Web services vision.

At the time I got around to starting the implementation of dasBlog in July 2003, with just 5 calendar weeks allocated to complete the job, there were two major Weblog engines for the Microsoft .NET Framework, our default platform: the engine powering http://weblogs.asp.net (now called ‘.Text’), for which code was not available at the time, and BlogX, which had been thrown together by a couple of people at Microsoft in their spare time, along with some community contributors.

The ‘make or take’ decision was a relatively easy one, because BlogX was already a working implementation with a file-based backend store and was relatively lightweight in terms of existing features, providing a good skeleton that made refactoring and adding new features relatively easy. Also, the license conditions for BlogX were largely equivalent to the BSD license, which we also favour for work that we publish for free and in source code form.

While our initial intent was to merge our changes back into the original BlogX code base, it turned out that the refactoring process led to the elimination of almost all of the original BlogX source code as the project progressed. Because merging the result into the code base would have amounted to a hostile takeover of that community project, we decided to give it a new name and to maintain it as a separate project, and so dasBlog was born. Version 1.0 of the new code base went public after 3 weeks, with the follow-up versions leading up to the complete feature set in version 1.2 being released after 5 weeks – in time and with a stability and quality that has convinced several dozen bloggers to abandon their old tools and switch to the dasBlog engine even in the early stages of the project.

dasBlog: Requirements, Considerations, and Solutions
Fundamentally, dasBlog is a small content management system that’s directly bound to a rendering engine, which renders all content just in time and based on the view that a visitor chooses.

Storage
The primary task of a Weblog system is to capture and present a chronology of events. The front page of a Weblog therefore presents a configurable number of Weblog entries, chronologically ordered, with the most recent entries at the top. This fundamental principle also requires that visitors can easily navigate through the history of the Weblog by date. This primary function immediately influences the design of the backend store, for which the most common lookup criterion is a date or a time span. At the same time, it must be possible to efficiently access individual Weblog entries by their identifier in order to attach ‘tracking’ information such as referrals, pingbacks and trackbacks, and to associate and display comments as explained earlier in this article.

9 Comment API: http://wellformedweb.org/story/9

10 RSS Bandit: http://www31.brinkster.com/rssbandit/

Because a Weblog is a person-centric, not a topic-centric, publishing point, it is also required to enable and ease by-topic navigation by introducing a categorization scheme. Creating and maintaining categories should be largely automated and should not require much administrative effort on behalf of the user.

Fulfilling these requirements would be very easy with a relational database system and a few simple indexes. However, to achieve the desired ease of initial deployment and future upgrades, coupled with minimal administrative effort, for anyone with a low-cost, ASP.NET enabled account at a web-hosting company as well as for users on corporate desktop machines running a local web server as their own publishing point, it’s not a good idea to depend on the existence of a full database system. Instead, the backend is factored in a way that a database could be supported if that requirement should arise, but the best and primary storage mechanism is quite simply the file system. This could have been achieved by using a file-based database like Microsoft’s well-known Jet or FoxPro engines, but such a built-in dependency would limit extensibility and impact the efforts required for upgrading to newer versions of the software as they appear. Once the road down the database route is taken, any changes or additions to the storage require schema updates for databases in the installed base, substantially increasing the administrative effort.

The resulting architectural decision, which was already pre-defined by the original BlogX code base and which we consequently decided to stick with, was to store all information in XML files in a subdirectory of the application. Because the lookup criteria are based on time, or at least a time interval, the content is stored following a ‘one day, one file’ scheme and the index is simply the file system’s directory information: the file names contain the date. The auxiliary indexes for the categories and the entries’ unique identifiers are stored in separate files. Additional information such as tracking data (referrals, pingbacks, and trackbacks) and comments is stored in files that are also named (and thus indexed) by date, but are kept separate from the actual content in order to limit concurrency issues and to address the differences in their characteristics:
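The ‘one day, one file’ scheme means an entry's storage location is computable from its timestamp alone, and a date-span query is a plain file-name comparison. A sketch (Python; the exact naming pattern is illustrative, not dasBlog's literal convention):

```python
from datetime import date
from pathlib import Path

# 'One day, one file': the date is encoded in the file name, so the
# file system's directory listing doubles as the primary index.
def day_file(content_dir, day):
    return Path(content_dir) / f"{day.isoformat()}.dayentry.xml"

def files_for_range(content_dir, start, end):
    # A by-date or date-span lookup needs no database – just compare
    # the ISO date prefix of each file name against the span.
    return [p for p in sorted(Path(content_dir).glob("*.dayentry.xml"))
            if start.isoformat() <= p.name[:10] <= end.isoformat()]

print(day_file("content", date(2003, 8, 1)).name)
# -> 2003-08-01.dayentry.xml
```

Because ISO dates sort lexicographically in chronological order, sorting file names already yields the chronology the front page needs.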

The core content has a very low update frequency (a few times a day), has very many reads, and must never be lost. Tracking information is updated very often, potentially concurrently, has many reads, and is less critical. Whenever changes to the core content occur, the engine persists them synchronously to be able to report any errors straight back to the user. These changes also cause all in-memory caches to be discarded and the auxiliary indexes to be rebuilt. All trackings, however, are processed asynchronously on a single secondary thread that is sequentially fed information through an in-memory queue. This can be done because the timeliness requirements for trackings are very relaxed: they need to be reflected in the Weblog eventually, but they don’t need to appear as the event occurs.
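This single-consumer tracking pipeline is a classic queue-and-worker pattern. A minimal sketch (Python threading for illustration; dasBlog itself does this in .NET):

```python
import queue
import threading

# Trackings (referrals, pingbacks, trackbacks) are queued by request
# threads and persisted sequentially by one background worker, so the
# backend store never sees concurrent tracking writes.
tracking_queue = queue.Queue()
processed = []

def tracking_worker():
    while True:
        item = tracking_queue.get()
        if item is None:           # sentinel: shut the worker down
            break
        processed.append(item)     # stand-in for 'append to tracking file'
        tracking_queue.task_done()

worker = threading.Thread(target=tracking_worker, daemon=True)
worker.start()

tracking_queue.put(("referral", "http://other.example/entry-42"))
tracking_queue.put(("pingback", "http://another.example/post-7"))
tracking_queue.put(None)
worker.join()
print(len(processed))  # -> 2
```

Serializing all tracking writes through one thread trades a little latency (acceptable, as the text notes) for freedom from file-locking contention.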

Because a new file is created each day, the resulting file sizes are quite small (typically substantially less than 100KB), and writes are quick with minimal locking. An aggressive approach to caching allows synchronous updates to the in-memory caches and the backend store, especially for the tracking information, and therefore further reduces concurrency problems. It should be noted, however, that the chosen storage model and the interaction between the in-memory caches and the backend store limit clustering or otherwise having multiple engines share a common store. That’s a deliberate and acceptable restriction, because even very popular Weblogs usually get only a few tens of thousands of hits per day. This is aided by the fact that most users read Weblogs through RSS aggregators, and due to infrequent updates of the core content, the RSS streams can be easily cached on the server side and even by upstream proxies.

Content Management
While we have already discussed the storage strategy, we haven’t yet covered how content is actually submitted to the engine. Here again, dasBlog had to fulfil multiple requirements for a variety of different usage scenarios.

The most obvious way to submit content into a Web-based application is to use a form on a Web page. dasBlog supports this for all browsers, but gives Microsoft’s Internet Explorer preferred treatment for a few quite simple reasons: users that access the editor web pages using Microsoft Internet Explorer are provided with a page that includes a set of client scripts and an inline frame, utilizing Internet Explorer’s inherent ability to act as an HTML editor. With this, users get rich, in-browser text editing capabilities and can style text using several fonts, typographic effects and colours. The editor also supports attaching files and embedding pictures. The binaries are uploaded using the standard HTML upload control and stored in a special directory below or alongside the content directory. Once the upload is complete, the picture or a hyperlink is inserted into the current text.

“All trackings, however, are processed asynchronously on a single secondary thread that is sequentially fed information through an in-memory queue.”

For users with other browsers, such as Opera or the Mozilla browser, the web-based editing capabilities are unfortunately much more restricted. With these browsers, users only get a standard multi-line text field and must write HTML markup explicitly. The decision to go with such a limited version for non-Microsoft browsers is based on Internet Explorer’s market share, the assumption that the users of dasBlog will run Windows on their desktops, and the fact that HTML editing support is not standardized across browsers. However, this limitation isn’t as significant as it might appear at first sight, because using the web form is only one of three ways to submit content into the engine. The first alternative to submitting content through a browser is to use one of a variety of offline Weblog editors such as Zempt11 or w.bloggar12 that directly support the Web services API developed by the makers of the popular Weblog environment ‘Blogger’. Any editor that can target Blogger servers can also target dasBlog. dasBlog implements the Blogger API along with extensions made for the competing MovableType Weblog software and extensions made by Userland, so that a wide range of tools can be used both to submit new entries and to edit existing entries from a rich client.

The Blogger API and its various extensions define Web services endpoints that do not use the SOAP protocol, but rather XML-RPC. The Atom working group that has already been mentioned in the previous discussion on RSS is planning to consolidate the partially overlapping functionality of what are essentially three APIs, and to define the resulting API to be SOAP based. However, the server-side XML-RPC protocol endpoints are sufficiently easy to target by client applications, because there are plenty of pre-built libraries supporting the protocol for practically all platforms that matter13, even if these libraries are not too well known. This entry point to the content management backend is not only well suited for interactive editing tools, but also serves as an interface to push automatically generated content into dasBlog from any other application, and even to synchronise content between Weblogs.
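To illustrate how easy such an XML-RPC endpoint is to target, the following Python sketch stands up a tiny local server exposing the Blogger API’s well-known `blogger.newPost` method and calls it with the standard library’s XML-RPC client. This is not dasBlog’s actual code (dasBlog is written in C# on .NET); the endpoint, credentials, and content store are placeholders.

```python
# Sketch: a minimal XML-RPC endpoint mimicking the Blogger API's
# blogger.newPost method, plus a client call against it.
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

posts = []  # stand-in for the engine's content store

def new_post(appkey, blogid, username, password, content, publish):
    """Server-side handler mimicking blogger.newPost; returns the new post ID."""
    posts.append(content)
    return str(len(posts))

server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(new_post, "blogger.newPost")
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Any Blogger-compatible editor performs the equivalent of this call.
client = ServerProxy(f"http://127.0.0.1:{port}/")
post_id = client.blogger.newPost("appkey", "0", "user", "secret",
                                 "Hello from an offline editor", True)
server.shutdown()
```

Because the protocol is plain XML over HTTP POST, any of the pre-built XML-RPC libraries mentioned above can play the client role here.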

The second alternative to submitting content through a browser, and by far the most attractive one, is email. When the dasBlog web application starts, it spins up a dedicated thread that will watch, at configurable intervals, a POP3 mail account defined by the Weblog owner. Whenever the engine polls the account, it processes every email item in the account, looking for emails whose subject line is prefixed with a configured passphrase. Emails with a matching passphrase are added to the content store. This is obviously a minimalist security measure, and the passphrase can even be empty, allowing any email to be published. dasBlog can handle HTML and plain-text formatted messages; it can extract, store and link attachments, handle embedded pictures, and, through a configuration switch, create and embed thumbnails for picture attachments.
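The selection rule described above can be sketched as a small filter over the polled mailbox: accept a message only when its subject starts with the configured passphrase, and strip the passphrase so it never appears in the published entry title. The function and field names here are illustrative, not dasBlog’s actual identifiers.

```python
# Sketch of the Mail-To-Weblog passphrase filter described above.
from email.message import EmailMessage

def select_postings(messages, passphrase):
    """Return (message, title) pairs for messages matching the passphrase.

    An empty passphrase matches every message, mirroring the engine's
    minimalist security model where any email may be published.
    """
    accepted = []
    for msg in messages:
        subject = msg.get("Subject", "") or ""
        if subject.startswith(passphrase):
            # Strip the passphrase prefix from the entry title.
            accepted.append((msg, subject[len(passphrase):].strip()))
    return accepted

def make_mail(subject, body):
    msg = EmailMessage()
    msg["Subject"] = subject
    msg.set_content(body)
    return msg

inbox = [make_mail("blog! First post", "Hello"),
         make_mail("spam offer", "Buy now")]
entries = select_postings(inbox, "blog!")
```

In the real engine this filter would run against messages fetched via POP3 on the background thread; only the matching subset reaches the content store.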

Email support provides the most flexible and instantly interoperable model for adding information to the Weblog, and is readily supported on every platform. It allows content creation and submission anywhere, with familiar tools, including SMS and MMS messages sent from mobile phones through an email gateway. With support for HTML and picture embedding, and using state-of-the-art email tools like Microsoft Office Outlook 2003, publishing rich content to the Web becomes easier than ever.

Rendering Engine and Localisation
The rendering engine is responsible for formatting the content for Web presentation. The core requirements are easy to define: navigation through the site should be easy and obvious for all visitors, the site should be accessible using virtually every current Web browser on any platform, and it should be possible to easily customise and enhance the site’s visual design.

These requirements led to an approach where dasBlog borrows heavily from the popular Radio Userland14 Weblog tool, and is indeed largely compatible with design templates created for that tool. The reason for taking this route was that there are many free and ready-to-use design templates available for Radio Userland, and also for Userland’s Manila15 content

11 Zempt: http://www.zempt.com

12 W.bloggar: http://www.wbloggar.com

13 XML-RPC implementations: http://www.xmlrpc.com/directory/1568/implementations

14 Radio Userland: http://radio.userland.com

15 Manila: http://manila.userland.com/


management system, and allowing reuse of these immediately provided a variety of widely known and appealing visual themes that fit many personal tastes. In addition to having a broad selection of ready-to-use themes, the basic navigation scheme is therefore also aligned with a large number of other public Weblogs, providing the desired instant familiarity and ease-of-use.

The installable version of dasBlog comes with a set of these templates already configured for use – enabling the user to focus on content and not on technical details of HTML right from the start. Still, because many users want to give their Weblogs a personal touch and are not afraid of HTML, customisation beyond simply selecting a template must be very simple.

Design templates for dasBlog use a combination of simple HTML, Cascading Style Sheets (CSS) and a set of macros that is implemented by the engine. A design template (or theme) always consists of three simple HTML files: homeTemplate.blogtemplate (or .txt or .html), dayTemplate.blogtemplate and itemTemplate.blogtemplate. The homeTemplate is used to render the content framework for every page, the dayTemplate is used as a frame for the content of a certain day, and the itemTemplate is used to render individual entries. The engine also supports separate templates for each content category.

The templates themselves are fairly easy to customize using a standard HTML editor with Cascading Style Sheet support. The supported macros are well documented16 and can be inserted into the page using special escape sequences. All dynamic elements such as the calendar or the category lists are partially hard-wired into the rendering engine because of their complexity, but their appearance can be extensively customized using Cascading Style Sheets.
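The macro mechanism can be sketched as a simple substitution pass over the template text: each escaped macro reference is replaced by the engine-generated markup for that macro. The `<% ... %>` escape syntax and the macro names below are illustrative only; dasBlog’s actual escape sequences and macro set are defined in its documentation.

```python
# Sketch of template macro expansion: engine-provided macros are looked up
# by name and their output spliced into the HTML. Syntax is illustrative.
import re

MACROS = {
    "siteName": lambda ctx: ctx["title"],
    "bodyText": lambda ctx: ctx["body"],
}

def render(template, ctx):
    """Replace each <%macro%> occurrence with the macro's output."""
    def expand(match):
        name = match.group(1)
        return MACROS[name](ctx) if name in MACROS else match.group(0)
    return re.sub(r"<%\s*(\w+)\s*%>", expand, template)

home_template = ("<html><head><title><%siteName%></title></head>"
                 "<body><%bodyText%></body></html>")
page = render(home_template, {"title": "My Weblog", "body": "<p>Hello</p>"})
```

Because the macros live in the engine while the surrounding HTML and CSS live in the theme, a user can restyle the site without touching any code.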

Another concern and requirement for dasBlog was to have good support for localization. Because newtelligence is a German company with customers all across the EMEA region, it was important for us to support full localization into German and English for ourselves, and into all EMEA-region languages, including the right-to-left languages Hebrew and Arabic, for our customers. To make localization work, dasBlog combines several techniques.

The engine looks at the Accept-Language HTTP header that all major browsers send with each request to indicate the user’s preferred language. The current culture of the ASP.NET thread handling the request is set to the language identifier with the highest preference, which subsequently causes all resources to be loaded from the most appropriate resource tables, with a default fallback to a neutral culture that contains all resources in English. This causes the date formatting and the calendar to be properly localized from within the .NET Framework itself, and causes all hardwired strings that the Weblog engine needs to emit to be rendered in the most appropriate language. For templates, the engine provides a macro that allows specifying multi-lingual strings within the HTML source. Switching the thread into the appropriate locale also enables right-to-left support for Arabic and Hebrew, because the resource tables for these locales contain a special flag that causes the engine to inject the appropriate additional dir attributes into the HTML streams.
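The negotiation step can be sketched in a platform-neutral way: parse the Accept-Language header, order the language identifiers by q-value, and pick the first one for which localized resources exist, falling back to a neutral English table otherwise. The resource dictionary below is a stand-in for .NET’s resource tables, including the right-to-left flag described above; all names are illustrative.

```python
# Sketch of Accept-Language negotiation with fallback to a neutral culture.
RESOURCES = {
    "de": {"archive": "Archiv"},
    "ar": {"archive": "\u0627\u0644\u0623\u0631\u0634\u064a\u0641", "dir": "rtl"},
    "":   {"archive": "Archive"},  # neutral fallback: English, left-to-right
}

def pick_culture(accept_language):
    """Return the highest-preference language for which resources exist."""
    prefs = []
    for part in accept_language.split(","):
        pieces = part.strip().split(";")
        lang = pieces[0].strip().lower()
        q = 1.0  # per HTTP, a missing q-value means q=1
        for param in pieces[1:]:
            if param.strip().startswith("q="):
                q = float(param.strip()[2:])
        prefs.append((q, lang))
    for _, lang in sorted(prefs, key=lambda p: -p[0]):
        if lang in RESOURCES:
            return lang
    return ""  # neutral culture

culture = pick_culture("fr;q=0.9, de;q=0.8, en;q=0.7")
```

A request preferring French falls through to German here because no French table exists, exactly the fallback behaviour the engine relies on.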

In addition to these general localization techniques, it is possible to explicitly set the language for every Weblog entry, so that only visitors who have this language listed in their browser preferences (and therefore indicate that they can understand it) will see this content. If an entry uses the invariant culture default setting, it is shown to all visitors, independent of language preference. If a language is set for individual entries, this is also reflected in the XML data streams rendered by dasBlog, where the respective elements are labelled with the appropriate xml:lang attribute.

The combination of these techniques demonstrates that flexible localization is very possible for Web sites in general, and we found that the effort it took to implement the complete localization support was very low when using the .NET Framework. In fact, the code required to make localization work was added to the already existing application and deployed in less than two days.

Threading Model and Hosting
dasBlog is hosted in an application domain inside the ASP.NET worker process on Windows 2000 and Windows XP, or in the Web Application runtime of Internet Information Server 6.0 on Windows Server 2003. Because of the focus on a single user’s Weblog and the resulting limited traffic, it is safe to minimize I/O workload by using aggressive caching, incrementally loading content at the time of the first request and keeping it cached for

16 DasBlog Macro Documentation: http://www.dasblog.net/documentation/CategoryView.aspx?category=Templates%20and%20Macros


further requests. Because users very rarely browse through the Weblog’s history, typically only the content of the last month is cached, and even for a very busy Weblog this means an in-memory data volume of well under 2MB. Requests for binaries and pictures are handled and served by the Web server directly and are therefore not a concern for the engine.

Because of these considerations, operation in a Web Farm where multiple servers operate on the same content store is not explicitly supported, and therefore not a test requirement. That said, all content updates cause a shared file to be updated. Whenever this file’s timestamp changes, indicating an update to the file’s content, the internal caches are discarded and incrementally reloaded, so clustered operation is possible without causing cache coherency problems. Both failover clustering and load balancing could be implemented by pointing the Website’s storage directory to a network drive, but support for this is, as explained, not a primary concern.
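The timestamp-based invalidation scheme can be sketched as follows: every reader remembers the marker file’s modification time at load, and discards its in-memory state whenever the observed time differs. File and class names here are illustrative; dasBlog’s store is an XML file set rather than the line-based file used in this sketch.

```python
# Sketch of cache invalidation driven by a shared marker file's timestamp.
import os
import tempfile

class ContentCache:
    def __init__(self, marker_path):
        self.marker_path = marker_path
        self.loaded_at = None
        self.entries = None

    def get_entries(self):
        stamp = os.path.getmtime(self.marker_path)
        if self.entries is None or stamp != self.loaded_at:
            # Timestamp changed: another server updated the store, so reload.
            self.entries = self._load_from_store()
            self.loaded_at = stamp
        return self.entries

    def _load_from_store(self):
        with open(self.marker_path) as f:
            return f.read().splitlines()

marker = tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False)
marker.write("first entry\n")
marker.close()

cache = ContentCache(marker.name)
before = list(cache.get_entries())

with open(marker.name, "w") as f:
    f.write("first entry\nsecond entry\n")
later = os.path.getmtime(marker.name) + 1
os.utime(marker.name, (later, later))  # force a visibly newer mtime for the sketch

after = list(cache.get_entries())
os.unlink(marker.name)
```

Because the check costs only one `stat` call per request, the scheme keeps readers cheap while still converging quickly after an update.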

A more central concern is to design an appropriate threading model to handle work inside the engine. Of course, the main purpose of the engine is to respond to synchronous HTTP requests for Web service invocations and HTML resources, and therefore we can rely on ASP.NET’s threading model for the vast majority of the work: each request is served by a thread that’s allocated from the ASP.NET thread pool.

However, there are several activities that the engine can and should perform asynchronously and in the background in order to maximize request/response performance. Because the asynchronous actions differ slightly in their execution characteristics, dasBlog employs three variants of thread use:

The first thread model is used by the Mail-To-Weblog feature and by another feature called the XML Storage System (XSS) update service, which we have not discussed in this article because it mainly serves to provide an easier migration path for users switching to dasBlog from Radio Userland; both features share similar characteristics. The threads for these two features perform periodic actions. Mail-To-Weblog periodically polls a POP3 account, and the XSS thread periodically pushes a static copy of the current RSS document to a remote Web service that fronts a distributed file storage system. The only difference between them is that the XSS thread can be explicitly triggered by signalling an event, so that content updates are quickly synchronized into the remote store. Neither of these threads is time critical and therefore both run with a below-normal thread priority, which means that they only get served when the system is not busy. Both threads are designed to start up immediately when the application domain is started and to spin indefinitely as long as the application domain runs. Both threads are robust against internal failures and will exit gracefully when the application domain shuts down and tears down all running background threads. Except for the signalling event of the XSS thread, the main application does not actively communicate with either thread; rather, the threads invoke functions on their own.
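This periodic-worker model can be sketched with an interval wait that an event can cut short, mirroring how the XSS thread is woken early when content changes. Python has no thread priorities, so the below-normal priority used in dasBlog has no direct equivalent here; all names are illustrative.

```python
# Sketch of the periodic background worker: runs its action on an interval,
# can be woken early via an event, and exits cleanly on shutdown.
import threading

class PeriodicWorker:
    def __init__(self, action, interval):
        self.action = action
        self.interval = interval
        self.wake = threading.Event()   # signalled to trigger an early run
        self.stop = threading.Event()
        self.thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self.stop.is_set():
            try:
                self.action()
            except Exception:
                pass  # robust against internal failures, as described above
            # Sleep for the polling interval, but wake immediately if signalled.
            self.wake.wait(self.interval)
            self.wake.clear()

    def start(self):
        self.thread.start()

    def shutdown(self):
        self.stop.set()
        self.wake.set()
        self.thread.join()

runs = []
worker = PeriodicWorker(lambda: runs.append(1), interval=60.0)
worker.start()
worker.wake.set()  # simulate a content update triggering an early push
worker.shutdown()
```

The same skeleton covers both features: Mail-To-Weblog simply never uses the wake event, while the XSS updater signals it on every content change.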

The second, slightly different model for threads is employed for handling incoming trackings (that is, referrals, Trackbacks, and Pingbacks). Incoming trackings are important to log, but there is no urgency to get them into the data store. One somewhat critical factor relating to detailed tracking of referrers is that it results in a write operation for every incoming request, which creates concurrency issues on the files. Because the required locking results in sequential access to the file store, all trackings are written to an in-memory queue watched by a single thread that runs for the entire application domain lifespan and that processes all trackings in sequence, eliminating concurrency issues. The same strategy is used for sending notification emails to the administrator. While this technique conveniently decouples foreground and background tasks, the primary motivation of this model is to serialize access to limited resources.
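The single-writer pattern described above can be sketched with a thread-safe queue: request threads enqueue tracking records and return immediately, while one long-lived consumer drains the queue and is the only code path that touches the store, so no file locking is ever contended. Names are illustrative.

```python
# Sketch of the single-writer tracking queue: many producers, one consumer.
import queue
import threading

tracking_queue = queue.Queue()
log = []  # stand-in for the referral/Trackback/Pingback file store

def tracking_writer():
    while True:
        record = tracking_queue.get()
        if record is None:         # sentinel: application domain shutting down
            break
        log.append(record)         # the only code path that writes the store
        tracking_queue.task_done()

writer = threading.Thread(target=tracking_writer, daemon=True)
writer.start()

# Request-handling threads just enqueue and return immediately.
for referrer in ("http://example.org/a", "http://example.org/b"):
    tracking_queue.put(referrer)

tracking_queue.join()   # wait until the writer has drained the queue
tracking_queue.put(None)
writer.join()
```

Serializing through the queue trades a little write latency (which does not matter for logging) for complete freedom from file-level concurrency issues.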

The third usage scenario for threads covers the Pingbacks, Trackbacks and change notifications that dasBlog actively sends to other sites. Serialization of requests is not required here, because notifications are usually targeted at several different sites, but decoupling from the main thread is imperative, because sites might be very slow or unreachable, causing delays of several seconds that can quickly add up when a series of automated Pingbacks needs to be issued based on a single post. Doing this in the thread processing the request (submitting a new or changed post) would mean that the response is delayed by the cumulative time needed for the notifications, which is clearly unacceptable. Therefore, the data that is required to execute these


external notifications is packaged into jobs that are submitted to the .NET thread pool for execution.
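The same fire-and-forget dispatch can be sketched with a shared thread pool: the request thread only schedules the notification jobs, and the pool runs them concurrently so one slow site delays neither the posting response nor the other notifications. The `notify` function and target list below are illustrative stand-ins for real Pingback/Trackback calls.

```python
# Sketch of dispatching outbound notifications to a shared thread pool.
import time
from concurrent.futures import ThreadPoolExecutor

def notify(site):
    """Simulate an outbound notification that may be slow."""
    time.sleep(0.01)  # stand-in for network latency to the remote site
    return f"pinged {site}"

targets = ["http://siteA.example", "http://siteB.example", "http://siteC.example"]

# The request thread only schedules the jobs; the pool runs them concurrently.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(notify, targets))
```

The pool cap plays the role of ASP.NET’s global throttle: it bounds how many notifications run at once, limiting the potential to overstress the machine.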

Using the .NET thread pool permits concurrent execution once threads become available for servicing, but does not create undue stress by recreating new threads all the time. Instead, the active threads in the pool are being reused. This thread pool is shared with the ASP.NET runtime, which by itself puts a global throttle on the threads that can be executed concurrently, limiting the potential to overstress the machine. At the same time, this approach creates a limited risk of drying up the ASP.NET thread pool and causing external requests to be queued up – at the extreme it may even cause the ASP.NET application domain to recycle (shut down and restart). As in many application designs, this is a case where the advantages (easy programming model) must be weighed against the disadvantages and risks. The special use-case here does justify the chosen model, but if these actions weren’t confined to the rare case of updating and adding content, the model used for handling incoming trackings described earlier would be a better choice.

Summary
Weblogs are an extremely promising, but much-hyped new phenomenon. There are overly enthusiastic claims that Weblogs will change and shake the foundations of journalism and even democracy. While these ideas might dramatically overstate their importance, Weblogs present some real advantages and opportunities for collaboration, and a quick and easy way to publish content in an organized manner.

However, as we have shown, creating a Weblog engine that is able to act as a node in an ever changing, growing, distributed, and cross-platform application network comes with a few architectural challenges, even if the resulting application is relatively small. It must be easy to deploy, easy to use, and easy to customize; navigation must be simple, intuitive and responsive; and it must be able to interact and integrate with a wide variety of systems across many platforms. Weblogs are proof that the first generation of Web services is working and that deep interaction between foreign systems is not just a vision of the future, but today’s reality – deployed across thousands of servers running Unix, Linux, MacOS X, Windows and many more platforms. They are also proof that commitment to make integration work across all technology camps in a concerted grassroots effort does indeed yield tangible results. None of the Weblog technologies highlighted in this article has ever seen a formal standards committee.

Resources
An installable version of dasBlog and the source code of the latest release can be downloaded from the project’s documentation website at http://www.dasblog.net

Clemens Vasters
[email protected]

Clemens is co-founder and executive team member of newtelligence AG, a developer services company headquartered in Germany. He is a Microsoft Regional Director for Germany. A well-known developer and architect trainer, he is a popular conference speaker, author/co-author of several books, and maintains a widely read and frequently referenced Weblog focused on architectural and development topics at http://staff.newtelligence.com/clemensv


Microsoft is a registered trademark of Microsoft Corporation

JOURNAL1

Executive Editor & Program Manager
Arvindra Sehmi
Architect, Developer and Platform Evangelism Group, Microsoft EMEA

Managing Editor
Graeme Malcolm
Principal Technologist, Content Master Ltd

Editorial Board
Christopher Baldwin
Principal Consultant, Developer and Platform Evangelism Group, Microsoft EMEA
Gianpaolo Carraro
Architect Evangelist, Developer and Platform Evangelism Group, Microsoft EMEA
Simon Guest
Program Manager, .NET Enterprise Architecture Team, Microsoft Corporation
Wilfried Grommen
General Manager, Business Strategy, Microsoft EMEA
Neil Hutson
Director of Windows Evangelism, Platform Strategy and Partner Group, Microsoft Corporation
Eugenio Pace
Principal Consultant, Microsoft Consulting Services, Microsoft Argentina
Michael Platt
Architect Evangelist, Developer and Platform Evangelism Group, Microsoft Ltd
Philip Teale
Partner Strategy Manager, Enterprise Partner Group, Microsoft Ltd

Project Management
Content Master Ltd
www.contentmaster.com

Design Direction
venturethree, London
www.venturethree.com

Orchestration
Devinia Hudson
Projects Coordinator, Developer and Platform Evangelism Group, Microsoft EMEA

Foreword Contributor
Simon Brown
General Manager, Developer and Platform Evangelism Group, Microsoft EMEA

The information contained in this Architects Journal (‘Journal’) is for information purposes only. The material in the Journal does not constitute the opinion of Microsoft or Microsoft’s advice and you should not rely on any material in this Journal without seeking independent advice. Microsoft does not make any warranty or representation as to the accuracy or fitness for purpose of any material in this Journal and in no event does Microsoft accept liability of any description, including liability for negligence (except for personal injury or death), for any damages or losses (including, without limitation, loss of business, revenue, profits, or consequential loss) whatsoever resulting from use of this Journal. The Journal may contain technical inaccuracies and typographical errors. The Journal may be updated from time to time and may at times be out of date. Microsoft accepts no responsibility for keeping the information in this Journal up to date or liability for any failure to do so. This Journal contains material submitted and created by third parties. To the maximum extent permitted by applicable law, Microsoft excludes all liability for any illegality arising from or error, omission or inaccuracy in this Journal and Microsoft takes no responsibility for such third party material.

All copyright, trade marks and other intellectual property rights in the material contained in the Journal belong, or are licensed to, Microsoft Corporation. Copyright © 2003 All rights reserved. You may not copy, reproduce, transmit, store, adapt or modify the layout or content of this Journal without the prior written consent of Microsoft Corporation and the individual authors. Unless otherwise specified, the authors of the literary and artistic works in this Journal have asserted their moral right pursuant to Section 77 of the Copyright Designs and Patents Act 1988 to be identified as the author of those works.