Architecture centered publishing systems

Architecture bridges the semantic gap between the requirements and software.

The use of an architecture–centered development process for delivering information technology began with the introduction of client/server based systems. Early client/server and legacy mainframe applications did not provide the architectural flexibility needed to meet the changing business requirements of the modern publishing organization. With the introduction of Object Oriented systems, the need for an architecture–centered process became a critical success factor. Object reuse, layered system components, data abstraction, web based user interfaces, CORBA, and rapid development and deployment processes all provide economic incentives for object technologies. However, adopting the latest object oriented technology, without an adequate understanding of how this technology fits a specific architecture, risks the creation of an instant legacy system.

Publishing software systems must be architected in order to deal with the current and future needs of the business organization. Managing software projects using architecture–centered methodologies must be an intentional step in the process of deploying information systems – not an accidental by–product of the software acquisition and integration process.

Glen B. Alleman
Niwot Ridge Consulting
www.niwotridge.com
Niwot, Colorado 80503

Copyright © 2000, All Rights Reserved

Presented: IFRA Publishing Platforms Symposium, December 4th, 2000, Zurich, Switzerland





Table of Contents

–CENTERED DESIGN
    Architectural Principles
    Architectural Styles
4 + 1 ARCHITECTURE
MOVING FROM 4+1 ARCHITECTURE TO METHODOLOGIES
STRUCTURE MATTERS
REFERENCES
END NOTES

Figures

Figure 1 – Integrating Diverse Newspaper Systems Components
Figure 2 – Classification of EAI Options
Figure 3 – Domains of the Information Manufacturing Process
Figure 4 – The 4+1 Architecture as Defined by [Kruc95]


INTRODUCTION

Much of the discussion in today’s literature centers on applying building–architecture analogies to the design and deployment of information systems.[1] This analogy glosses over many of the difficulties involved in formulating, defining, and maintaining the architectural consistency associated with acquiring and integrating Commercial Off The Shelf (COTS) applications. The successful deployment of a COTS based system requires not only that the current business needs be met, but also that the foundation for the future needs of the organization be laid.

In many COTS products, the vendor has defined an architecture that may or may not match the architecture of the user domain. It is unlikely the vendor’s architecture will be a match for an organization that has mature business processes and legacy systems in place. The result is an over–constrained problem. [2]

By acquiring a COTS application and installing it, the end user may have unknowingly acquired the architecture of the application’s vendor. The consequences of this decision may not be known for some time. If the differences between the target architecture of the business and the architecture supplied by the vendor are not determined before the acquisition of the COTS product, these gaps will be revealed during the system’s operation, much to the disappointment of the purchaser.

THE SYSTEMS INTEGRATION PROBLEM

In today’s news and information publishing environment, content resides on many platforms distributed throughout the organization. Gaining access to the right information at the right time is a daunting task. Integrating these diverse information sources and the processes that manipulate them in a seamless manner is called Enterprise Application Integration (EAI).

Enterprise integration should be viewed as a strategic initiative for every large enterprise. The complexity of and the demands on modern computing environments require a systematic, centralized approach to integration requirements to achieve optimal value for the totality of enterprise information assets. Message brokers are an emerging class of products that can be increasingly applied at the enterprise level.

— Patricia Seybold Group

Traditional methods of integrating systems involve building code that connects the system components at the Application Programming Interface (API) level. The result is a tightly coupled system that may or may not provide the needed flexibility for future growth and adaptation. It also results in a system in which the number of connections grows on the order of n², where n is the number of components. At the same time, established enterprise solutions (usually ERP based approaches) focus on solving business problems in a predefined manner with commercial off the shelf products.
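The quadratic growth in connections can be checked with simple arithmetic. The following is a minimal sketch, not taken from the paper: it assumes that every pair of components needs its own API-level link, and compares that count with a hypothetical hub arrangement (such as a message broker) in which each component needs only one link. The function names are illustrative.

```python
def point_to_point_links(n: int) -> int:
    """Links needed when every pair of n components is wired directly."""
    return n * (n - 1) // 2


def hub_links(n: int) -> int:
    """Links needed when each of n components connects only to a shared hub."""
    return n


if __name__ == "__main__":
    # Print component count, pairwise links, and hub links side by side.
    for n in (4, 8, 16):
        print(n, point_to_point_links(n), hub_links(n))
```

Under these assumptions, doubling the number of components roughly quadruples the pairwise links, while the hub count only doubles.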

The subject of the integration of heterogeneous publishing systems is not only complex, it is convoluted and confusing. By focusing on the architecture of the system, the design and development processes have a place to return to when confusion sets in.

The problem in automating the publishing enterprise is not developing the underlying technology, or discovering new requirements, or even deciding how to address these requirements with products or services. The problem is controlling the complexity of the integrated system that results from all these activities. One solution to the complexity issue is to abstract the problem into a set of reusable components.

A new system deployment paradigm is needed to address the expanding publishing needs for electronic commerce, Internet and multi–media publishing, the competitive pressures of the market, and the unceasing demand for customer service.

This new paradigm is based on system architecture and the use of industry standards. [3] These system integration standards include:

• CORBA

• MQ Series messaging systems

• Enterprise Java Beans

• XML based interprocess standards

STANDARDS VERSUS GUIDELINES

There is a major difference between standards and guidelines. A standard is a rule that must be followed. A guideline is a recommendation that should be followed in most cases. If an organization fails to follow a standard, a negative outcome should be expected. If the organization fails to follow a guideline, there is usually little risk to the outcome. Most organizations allow deviations from standards only through a formal review and approval process. Usually a deviation requires some form of sign–off by a manager who understands the consequences.

In addition to the system integration standards there are numerous standards for content interchange. When combined with the hardware, software, messaging, database, distributed computing, and data and metadata standards, the simple approach of picking a set of applications and assembling them into a system may seem hopeless.

System architecture is intimately related to life cycle cost. A well–designed system represents a valuable investment that yields adaptability to new requirements and technologies over the system life.

WHAT IS SOFTWARE ARCHITECTURE?

Software architecture is defined as the generation of the plans for information systems, analogous to the plans for an urban dwelling space. Christopher Alexander [Alex77], [Alex79] (the inventor of design patterns) observed that macro–level architecture is made up of many repeated design patterns. Software architecture is different from software design. Software architecture is a view of the system as a whole rather than a collection of components assembled into a system. This holistic view forms the basis of the architecture–centered approach to information systems. Architecture becomes the planning process that defines the foundation for the information system.

The term architecture is so overused in the software business that it has become a cliché. There are “official” descriptions of software architecture and architects. Much of the architecture work has taken place inside development organizations and academia. In this paper, the description of architecture is taken from a variety of reliable sources.

Distributed object computing and the Internet have fundamentally changed the traditional assumptions for architecting information systems. The consequences of these changes are not yet fully understood by either the developers or the consumers of these systems. The current distributed computing (client/server) and Internet–based systems are complex, vulnerable, and failure–prone. This complexity is the unavoidable consequence of the demand for ever–increasing flexibility and adaptability. These rapidly changing technologies require a different planning mechanism, one based on fundamentally new principles. Because technology is changing at a rapid pace and user business requirements are becoming more demanding, architecting these new systems is now essential. No longer can systems be simply assembled from components without consideration of the whole [Foot97].

Architecture is not the creation of boxes, circles, and lines, laid out in presentation slides [Shaw96], [Shaw96a]. Architecture imposes decisions and constraints on the process of designing, developing, and deploying information systems [Adow95], [Alle94], [Kazm96], [Perr92]. Architecture must define the parts, the essential external characteristics of each part, and the relationships between these parts in order to assure a viable outcome.

Architecture is the set of decisions about any system that keeps its implementers and maintainers from exercising needless creativity.

The architecture of a system consists of the structure(s) of its parts, the nature and relevant externally visible properties of those parts, and the relationships and constraints between them [D’Sou99].

ARCHITECTURE BASED INFORMATION TECHNOLOGY STRATEGIES

Much has been written about software and system architecture [Sei00]. But the question remains, what is architecture and why is it important to the design and deployment of software applications in the publishing domain?

Publishing systems possess a unique set of requirements, which are continuously increasing in complexity. In the past, it was acceptable to assemble a set of applications that functioned in a serial manner, making format transformations across application boundaries and performing the workflow processes by hand. In the current publishing environment, timeliness, seamless workflow, and the multi–purpose nature of content have become critical success factors [4] in the overall business process.

In the past, publishing information was usually provided through a monolithic set of applications (standalone applications integrated through file systems on a network). [5] This critical data was trapped inside the applications, which were originally designed to liberate the workforce from mundane tasks [Bryn98], [Bryn93]. However, without flexibility and adaptability, the users were forced to adapt their behaviors to the behaviors of the system. The result was a recurring non–recoverable cost burden on the organization. What was originally the responsibility of the software – as defined in the Systems Requirements Analysis – became the burden of the user [Bryn98], [Schr97], [Bryn96]. In the publishing environment, this includes multiple and inconsistent data and image formats, inconsistent or duplicate database contents, islands of information and automation, and the inability to adapt to the changing needs of the organization.


An effective publishing system architecture must combine content creation, content management, content publishing, and information systems. [6] Most approaches in the past elected to keep these systems separate but linked, adapting them to make intersystem communication transparent.

However, this strategy fails to address the important problem of how to restructure the publishing operations to meet the demand of future operations. The complexity of this customization in a just–in–time publishing environment means that the software components, and the work processes they support, are in constant flux. For an integrated publishing system to function in this way, software systems must be continuously expanded, modified, revised, tested, and repaired. The software components must be integrated quickly and reliably in response to rapidly changing requirements. Finally, such a system must cooperate in addressing these changing objectives. All these requirements define a highly flexible, adaptive architecture capable of operating in rapidly changing conditions [Mori98].

In order to address these needs, the system architecture must:

• Define the goals of the business in a clear and concise manner, using a notation that is readable by both humans and machines.

• Identify the Information Technology already in place that meets these goals.

• Identify gaps in the levels of Information Technology that fail to meet these goals.

• Identify the organizational structure needed to support the implementation of the strategy.

• Define a layered framework for connecting the system components.
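The first three items above amount to a gap analysis, which can be illustrated with a small sketch. This is an assumption-laden illustration rather than a notation proposed in the paper: the `Goal` record, the `find_gaps` function, and the capability names in the usage note are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    """A business goal and the IT capabilities assumed necessary to meet it."""
    name: str
    required_capabilities: frozenset


def find_gaps(goals, installed_capabilities):
    """Map each goal name to the required capabilities that are not yet in place.

    Goals that are fully covered by the installed capabilities are omitted.
    """
    installed = set(installed_capabilities)
    gaps = {}
    for goal in goals:
        missing = set(goal.required_capabilities) - installed
        if missing:
            gaps[goal.name] = sorted(missing)
    return gaps
```

For example, a goal named "single-source print and web" that requires `{"content store", "pagination", "xml export"}`, checked against an installation offering only `{"content store", "pagination"}`, would be reported with `["xml export"]` as its gap.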

INFORMATION MANUFACTURING DOMAIN

With the availability of Commercial Off The Shelf (COTS) publishing applications, the integration of these best of breed components becomes the primary strategy for many rapidly developing software markets. The newspaper automation systems market is an example of this trend. The state–of–the–art tools for pagination, editorial text creation, digital asset management, database management, and distributed object technologies are readily available for a variety of platforms.

Figure 1 is a simplified view of the problem domain. Best of Breed COTS products are available for each of these problem domains. However, these components operate in disjointed ways, with proprietary data formats and processing models.

With the advent of Enterprise Application Integration, the design and implementation of integrated systems is guided by academically sound and field–proven techniques.

[Figure 1 – Integrating Diverse Newspaper Systems Components. Components shown: Editorial, Pagination, Asset Management, Advertising.]

INTEGRATING HETEROGENEOUS ENVIRONMENTS

The individual components shown in Figure 1 are typical in a publishing environment. Each component provides a unique set of features and functions. Some components may be assembled from COTS products; some may be purpose built by the vendor or system integrator. Each may have individual database schemas, or they may share a central database and the related tables and schemas.

The diversity of publishing system components creates a unique problem for the system integration designer. Assembling a heterogeneous group of products exposes problems not normally found in the traditional product integration environment. The diverse set of application paradigms, application programming interfaces, and inversion of control issues creates the need for a unique approach to constructing seamless product suites with commercial applications.

In this heterogeneous COTS environment:

• Each application domain makes use of a specific implementation paradigm. The editorial domain assumes documents are kept in a database and indexed using content–based retrieval of document specific attributes. The Digital Asset Manager (DAM) stores digital images and graphics, indexing them with traditional attribute databases. Browser–based clients access content through URLs or GUIDs. Locating this content first requires a query to a SQL database. The pagination system provides tools for manipulating pages, but stores the page image and the entities on the page in a proprietary format, accessible only through an API. Each application assumes this private paradigm can be syndicated to other application domains. This assumption is rarely true, since the semantics of the data and control processes are unique for each application. [7]

• Each application domain assumes it is independent from other application domains. This independence includes the flow of control within the domain as well as the announcement of events and changes in state across the domain boundaries. This domain independence creates an inversion of control or the lack of explicit control across these application boundaries. [8] This inversion of control is a fundamental architectural flaw found in tightly coupled integration architectures, where multiple information producers and consumers are present.

• Each application domain makes use of unique file formats and communication protocols. Pagination systems, editorial systems, asset management systems, wire services, databases, and CORBA integration components all provide different file formats and protocols. In some cases, the format of the data is also unique to the operating system or the underlying database management system. [10]

• The event signaling, thread control, exception handling, and other low level software behaviors are difficult to control – even if the systems are intentionally designed to be compatible. [9] Each application domain assumes a unique mechanism for controlling the thread of execution. The synchronization of these execution threads across application boundaries is one of the most difficult problems in the integration of heterogeneous systems.

• Exceptions are reported within each application domain using the semantics of that application. Simple error codes and user response processes are unique to each domain. The unification of the exception semantics and the handling of the exceptions is a difficult task. [10]
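One way to picture the unification of exception semantics described in the last item is a translation table that maps each domain's native error codes onto a shared vocabulary. The sketch below is illustrative only: the domain names, the native codes (`17`, `"E_BUSY"`, and so on), and the shared kinds are invented for the example and do not come from any real product.

```python
class IntegrationError(Exception):
    """Common exception type used across application boundaries."""

    def __init__(self, domain, kind, detail=""):
        super().__init__(f"[{domain}] {kind}: {detail}")
        self.domain = domain
        self.kind = kind
        self.detail = detail


# Hypothetical native error codes for two domains, mapped to shared kinds.
ERROR_MAPS = {
    "editorial": {17: "not_found", 23: "locked"},
    "pagination": {"E_NOPAGE": "not_found", "E_BUSY": "locked"},
}


def normalize_error(domain, native_code, detail=""):
    """Translate a domain-specific error code into the shared exception type.

    Codes (or domains) that are not in the table map to the kind "unknown".
    """
    kind = ERROR_MAPS.get(domain, {}).get(native_code, "unknown")
    return IntegrationError(domain, kind, detail)
```

A caller on the other side of the boundary can then branch on `error.kind` without knowing whether the originating system reports a locked item as `23` or as `"E_BUSY"`. The hard part the paper points to, agreeing on what the shared kinds mean, is not solved by the table itself.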

THE IMPEDANCE MISMATCH

In this heterogeneous integration environment, the influence of each domain – as well as many other intangible influences – creates an impedance mismatch between the COTS components. [11] The challenge for the system architect is to define a mechanism that integrates the disparate components while maintaining their unique functionality. [12]

In addition to the impedance mismatch between the COTS components, these components are also undergoing continuous change. [13] These changes create a moving target for the seamless integration of heterogeneous systems.

There are only two things we know about the future: 1) it cannot be known, and 2) it will be different from what exists now and from what we expect. Any attempt to base today’s actions and commitments on predictions of the future events is futile. But precisely because the future is going to be different and cannot be predicted, it is possible to make the unexpected and unpredicted come to pass

— Peter Drucker

The inability to predict the future requirements of a system, as well as the underlying changes in the behaviors of its components, means that the integration mechanism for these COTS components must provide sufficient flexibility to deal with these unknown requirements.

Impedance mismatches are unavoidable. The role of the UNA is to “bridge” these mismatches by normalizing the interfaces between each application domain.

Forecasting the needs of future software requirements and integration problems is “sporty” business at best. The development of a Framework for this integration requires careful consideration of tangible and intangible aspects of software design.
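Normalizing the interfaces between domains is commonly done with adapters that present each system's private paradigm through one shared interface. The following is a minimal sketch of that general idea, not a description of how the UNA is implemented: the `ContentSource` interface, the two adapter classes, and the `editorial://` and `dam://` locator schemes are all assumptions made for illustration.

```python
from typing import Protocol


class ContentSource(Protocol):
    """Shared interface every domain adapter is assumed to provide."""

    def locate(self, item_id: str) -> str: ...


class EditorialAdapter:
    """Wraps a hypothetical editorial system that keys stories by number."""

    def __init__(self, stories: dict):
        self._stories = stories  # item id -> story number

    def locate(self, item_id: str) -> str:
        return f"editorial://stories/{self._stories[item_id]}"


class AssetManagerAdapter:
    """Wraps a hypothetical asset manager that keys files by volume and name."""

    def __init__(self, assets: dict):
        self._assets = assets  # item id -> (volume, file name)

    def locate(self, item_id: str) -> str:
        volume, name = self._assets[item_id]
        return f"dam://{volume}/{name}"


def locate_all(sources: dict, requests: list) -> list:
    """Resolve (domain, item id) pairs through the shared interface only."""
    return [sources[domain].locate(item_id) for domain, item_id in requests]
```

The calling code in `locate_all` never sees how either system stores its content, so a replacement system needs only a new adapter rather than changes to every consumer.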


APPLICATION INTEGRATION OVERVIEW

The tools available for Enterprise Application Integration (EAI) are described in Figure 2. This simple matrix isolates both the problem domain and the solutions. [14] There are distinctly different problem areas in EAI:

• The purpose of the integration – which can be to update the information in the integrated systems in order to maintain cross–system consistency, or simply to read data in one system from another in order to provide a unified view of the distributed data.

• The level at which EAI can be performed – which can be through messages at the application level or through the exchange of data through a shared database.

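              | Update (Cross System Consistency)  | Read (Unified Views)
--------------+------------------------------------+--------------------------------
Data Level    | TP Monitor                         | Data Federation Systems
              | Application Server                 | Data Warehouse and Data Marts
--------------+------------------------------------+--------------------------------
Event Level   | Message–Oriented Middleware (MOM)  | Publish and Subscribe

Figure 2 – Classification of EAI Options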

The facilities available to address the purpose and level of integration can be partitioned into four quadrants.

• TP Monitors and Application Servers allow applications to push data to multiple data sources while maintaining update consistency.

• Message–Oriented Middleware provides a one–to–one message passing process between applications. When an event occurs in one system, information is pushed to another system to ensure consistency.

• Publish and Subscribe systems push messages asynchronously between applications when events occur.

• Data Federation and Warehousing systems perform integration by moving data from two or more locations to a third location. Data Federation Systems (DFS) are pull systems that request data on demand from the underlying systems.
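The pull behavior attributed to Data Federation Systems in the last item can be sketched as a view that holds no data of its own and queries the underlying systems only when asked. This is a simplified illustration under stated assumptions, not a model of any commercial product: the `FederatedView` class and its `register` and `lookup` methods are hypothetical, and each source is represented by a plain callable.

```python
class FederatedView:
    """Read-only unified view that pulls from underlying systems on demand."""

    def __init__(self):
        self._sources = {}  # source name -> callable(key) returning a record or None

    def register(self, name, fetch):
        """Add a source; fetch(key) must return a record, or None if the key is unknown."""
        self._sources[name] = fetch

    def lookup(self, key):
        """Query every registered source at call time; nothing is copied in advance."""
        merged = {}
        for name, fetch in self._sources.items():
            record = fetch(key)
            if record is not None:
                merged[name] = record
        return merged
```

Because each lookup goes back to the sources, a change made in an underlying system is visible on the next query. A warehouse, by contrast, would copy the data to a third location and answer from the copy.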

EAI provides a mechanism to connect heterogeneous systems through several commercial solutions.

APPLICATION INTEGRATION AND THE PUBLISHING DOMAIN

Depending on the business domain, there are advantages and disadvantages to each purpose and level of integration shown in Figure 2. A publishing system based on the Enterprise Application Integration (EAI) of Commercial Off The Shelf (COTS) components must: [15]

• Provide access to a networked set of heterogeneous information sources, without relocating the original information or forcing changes to the information format. This integration must provide open access to the heterogeneous services of the integrated environment, including business application logic, data format transparency, event synchronization, error detection and handling, and fault tolerance.

• Create and manage the meta–data for retrieval of the information. This meta–data describes the location and other content attributes of the information. The meta–data repository becomes a clearing house for applications creating, reading, updating, and deleting information.

• Provide scalable systems without concern for the information’s locality. By distributing the information and meta–data, the CORBA based system can be scaled to meet the demands of the publishing process. Multiple information resources can be added to the system without impacting the existing components. The underlying CORBA services can be scaled to meet the demands of the information processing systems.

• Provide neutral programming interfaces between the various services, which provide state, status, and content manipulation. Using the CORBA IDL interface specification, programming language independence is established.

• Provide scalable services for each application domain, as well as fault tolerance, load sharing, configuration management, and distributed operations. These services are provided through the core components of CORBA.

• Create a separable set of functions for the core architecture. By this, it is meant that each business domain operates independently of the other application domains, while together forming an integrated system. This architecture is based on several important principles [16]:

    • Producers and consumers of data are separate. Neither producers nor consumers have knowledge of the internal behavior of other application domains. This is accomplished by interposing an intermediary persistent data store between producers and consumers.

    • A shared persistent data store provides an abstract interface to the information needed by producers and consumers. The abstract interface reduces the data coupling between data producers and data consumers.

    • Data producers are publishers and data consumers are subscribers to this abstract persistent data store. When a producer publishes a piece of data, all subscribers are notified. The subscribers can then retrieve the data and use it in a manner appropriate for the application domain. In this way, the data store is not passive, but tracks the production and consumption of the data.
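The three principles that close the list above (separate producers and consumers, a shared store with an abstract interface, and publish and subscribe notification with tracking) can be sketched in a few lines. This is a minimal in-memory illustration, not the paper's design: persistence and the CORBA services are deliberately left out, and the `SharedDataStore` class and its method names are hypothetical.

```python
from collections import defaultdict


class SharedDataStore:
    """Intermediary store: producers publish, subscribers are notified and retrieve."""

    def __init__(self):
        self._items = {}                        # key -> latest published value
        self._subscribers = defaultdict(list)   # topic -> notification callbacks
        self._retrievals = defaultdict(list)    # key -> consumers that fetched it

    def subscribe(self, topic, callback):
        """Register callback(key) to be called when something is published on topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, key, value):
        """Store the value, then notify subscribers with the key only."""
        self._items[key] = value
        for callback in self._subscribers[topic]:
            callback(key)

    def retrieve(self, key, consumer):
        """Return the stored value and record which consumer asked for it."""
        self._retrievals[key].append(consumer)
        return self._items[key]

    def consumers_of(self, key):
        """List the consumers that have retrieved the item so far."""
        return list(self._retrievals[key])
```

Subscribers receive only a key and decide for themselves whether and when to retrieve the data, so a producer never calls a consumer's internals directly. The `consumers_of` method is one simple reading of a store that "tracks the production and consumption of the data".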

The EAI requirements for a publishing system are different from the requirements found in business transaction processing systems. The diverse sources of information, the real time aspects of information content, the heterogeneous data and control formats all create a unique set of requirements for the Publishing Enterprise Application Integration architecture.

THE INFORMATION MANUFACTURING DOMAIN

The interaction between large grained components (subsystems) and the work processes that define these interactions is an important consideration in the design of the meta–system. The creation of a newspaper or other published information media is similar in many ways to hard goods manufacturing. [17] Raw materials are gathered through a procurement process. These materials are inventoried and scheduled for assembly. The raw materials or partially finished goods are assembled into the final product. These products are transported to the customers or users. Figure 3 describes the partitioning of these domains. The creation of a seamless integration environment between each participant in the Information Manufacturing process is the role of the Universal NewsGram Architecture (UNA). [18]

[Figure 3 – Domains of the Information Manufacturing Process. Labels shown: Digital Asset Manager; Pagination System; Object Store; Editorial and Classified System; Story; Page; Photo; Content Management; Reports, Stringers, Wire Services, etc.; Photographers; Paginators; Content Procurement; Creation of Content; Assignment of Content to the Publication; Content Assembly; Content Delivery; Output Management; Publishing the Paper to Specific Media.]

The concept of Information Manufacturing is derived from hard goods manufacturing. Raw materials are manipulated into finished goods, which are then distributed to customers.

• Content Procurement – the raw materials of the publishing industry include new advertisements, stories, photos, and graphics. These items are captured, created, and acquired independent from their use. In this domain, the format of the information varies across the gathering and authoring tools. The primary attributes of this domain include:

    • The format of the arriving information is specified by an external entity. Some of these specifications are industry standards; some are proprietary to a specific product.

    • The arrival rates of the information cannot be throttled by the Insiight system. Either the information arrives in a real–time manner from a network, or reporters cover news events occurring in a spontaneous manner, creating heavy demands on the system.

• Content Management – the procured items are managed as raw or semi–finished goods. Since they are not yet assigned to a publication, the content management domain can manipulate them without concern to the final output. The format and semantics of the content are irrelevant to the management of the content. The applications that make use of the content gain access to the content through the NewsGram Object Store and the URLs referencing the content. The primary attributes of this domain include:

    • Creation of meta–data descriptions of the content and the storage of this meta–data in a system neutral location and format.

    • Creation of connections between the various content domains that convey state, status, and location, but not the actual content information.

    • The pagination process assembles these raw and semi–finished elements into a finished product. The NewsGram Object Store is the repository that holds the connections between these independent components.

n Content Assembly – publishable entities are assigned to a publication. At this point, the formats and content are targeted to a specific output paradigm. Again, the format and semantics of the content are related to the application domain. The UNA manages the state, status, and relationships of the content. The primary attributes of this domain include:

n The construction of the connections and topology of these connections, independent of the content of the entities.

n The reporting of status changes to content that occur in the content domain, using a standard event protocol. This reporting process will follow the IFRA standard workflow data schema. [19]

n Content Delivery – publishable entities are delivered to a specific medium. The publication process takes the finished goods from the information manufacturing process and delivers them to the customers in the form and format required. At this point the format, syntax, and semantics of the content are used to produce the product. The primary attributes of this domain include:

n Conversion from the native publishing format to XML, PDF, and PostScript for distribution to the target devices.

n Re–purposing the published output to various devices with different formatting and content display capabilities.

Each of the material and product manipulation domains imposes specific data formats, data semantics, and workflow processes on the individual entities. The role of the UNA and its infrastructure is to normalize the interfaces between each domain and remove the impedance mismatch between these domains, but not necessarily between individual entities produced in these domains. [20]
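The separation described above, in which a shared store holds metadata, connections, and status while the content itself stays where it was created, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the UNA's actual data model: the names `ContentRecord`, `StatusEvent`, and `ObjectStore`, their fields, the status values, and the example URLs are all hypothetical, and the event shape is not taken from the IFRATrack schema.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class StatusEvent:
    """A status change reported by a domain (hypothetical shape, not IFRATrack)."""
    content_id: str
    domain: str       # e.g. "procurement", "management", "assembly", "delivery"
    new_status: str


@dataclass
class ContentRecord:
    """Metadata about one item; the content itself stays at `url`."""
    content_id: str
    kind: str                                        # "story", "photo", "page", ...
    url: str
    status: str = "procured"                         # assumed initial status
    links: List[str] = field(default_factory=list)   # ids of connected items


class ObjectStore:
    """Holds records, connections, and status, never the content bytes."""

    def __init__(self) -> None:
        self._records: Dict[str, ContentRecord] = {}
        self._listeners: List[Callable[[StatusEvent], None]] = []

    def register(self, record: ContentRecord) -> None:
        self._records[record.content_id] = record

    def connect(self, parent_id: str, child_id: str) -> None:
        """Record a connection between two items, independent of their content."""
        self._records[parent_id].links.append(child_id)

    def subscribe(self, listener: Callable[[StatusEvent], None]) -> None:
        self._listeners.append(listener)

    def report(self, event: StatusEvent) -> None:
        """Apply a status change and notify every subscribed domain."""
        self._records[event.content_id].status = event.new_status
        for listener in self._listeners:
            listener(event)

    def get(self, content_id: str) -> ContentRecord:
        return self._records[content_id]


def demo() -> ObjectStore:
    """Assign a story to a page and report the change."""
    store = ObjectStore()
    store.register(ContentRecord("story-1", "story", "http://example.invalid/story-1"))
    store.register(ContentRecord("page-A1", "page", "http://example.invalid/page-A1"))
    store.connect("page-A1", "story-1")
    store.report(StatusEvent("story-1", "assembly", "assigned"))
    return store
```

In this sketch the pagination and delivery applications would see only identifiers, URLs, and status values; each application remains free to interpret the content behind a URL in its own format.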


CHARACTERISTICS OF OPEN INFORMATION SYSTEMS TECHNOLOGIES

There are several characteristics of publishing systems that are shared by all systems with good architectural foundations [Witt94], [Rech97], [Shaw96], [Garl95]. These properties may appear abstract and not very useful at first. However, they are measurable attributes of a system that can be used to evaluate how well the architecture meets the needs of the user community.

n Openness – enables portability and internetworking between components of the system.

n Integration – incorporates various systems and resources into a whole without ad–hoc development.

n Flexibility – supports system evolution, including the existence and continued operation of legacy systems.

n Modularity – the parts of a system are autonomous but interrelated. This property forms the foundation of flexibility.

n Federation – combining systems from different administrative or technical domains to achieve a single objective.

n Manageability – monitoring, controlling, and managing a system’s resources in order to support configuration, Quality of Service (QoS), and accounting policies.

n Security – ensures that the system’s facilities and data are protected against unauthorized access.

n Transparency – masks from the applications the details of how the system works.

MOTIVATIONS FOR ARCHITECTURE–CENTERED DESIGN

The application of architecture–centered design to publishing systems makes several assumptions about the underlying software and its environment:

n Large systems need sound architecture. As the system grows in complexity and size, the need for a strong architectural foundation grows as well.

n Software architecture deals with abstraction, decomposition and composition, style, and aesthetics. With complex heterogeneous systems, managing the system’s architecture provides the means for controlling this complexity and is a critical success factor for any system deployment.

n Software architecture deals with the design and implementation of systems at the highest level. Postponing the detailed programming and hardware decisions until the architectural foundations are laid is a critical success factor in any system deployment.

The publishing domain creates a unique set of requirements, not found in other business information system environments. By focusing on the non-functional requirements for publishing systems, the operational aspects of the software can be isolated from the underlying infrastructure. This isolation provides the means to move the system forward through its evolutionary lifecycle, while minimizing the impacts on the operational aspects of the business processes.


Architectural Principles

Software architecture is more of an art than a science. This paper does not attempt to present the subject of software architecture in any depth, since the literature is rich with software architecture material [Tanu98], [Witt94], [Zach87], [Shaw96], [Hofm97], [Gunn98], [Sei00]. There are, however, several fundamental principles to keep in mind:

n Abstraction / Simplicity – simplicity is the most important architectural quality. Simplicity is the visible characteristic of a software architecture that has successfully managed system complexity.

n Interoperability – is the ability to exchange functionality and interpretable data between two software entities. Interoperability is defined by four enabling requirements: [21]

n Communication Channel – the mechanisms used to communicate between the system components.

n Request Generation Verbs – used in the communication process.

n Data Format Nouns – the syntax used for the nouns.

n Semantics – the intended meaning of the verbs and nouns.

n Extensibility – is the characteristic of architecture that supports unforeseen uses and adapts to new requirements. Extensibility is a very important property for long life cycle architectures where changing requirements will be applied to the system.

Interoperability and extensibility are sometimes conflicting requirements. Interoperability requires constrained relationships between the software entities, which provides guarantees of mutual compatibility. A flexible relationship is necessary for extensibility, which allows the system to be easily extended into areas of incompatibility.

n Symmetry – is essential for achieving component interchange and reconfigurability. Symmetry is the practice of using a common interface for a wide range of software components. It can be realized as a common interface implemented by all subsystems or as a common base class with specializations for each subsystem.

n Component Isolation – is the architectural principle that limits the scope of changes as the system evolves. Component isolation means that a change in one subsystem will not require a change in another.

n Metadata – is self–descriptive information that can describe both services and information. Metadata is essential for reconfigurability. With metadata, new services can be added to a system and discovered at runtime.

n Separation of Hierarchies – good software architecture provides a stable basis for components and system integration. By separating the architecture into pieces, the stability of the whole may sometimes be enhanced.
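Symmetry and metadata, as described in the list above, can be illustrated together. The following is a minimal Python sketch, assuming a hypothetical common interface (`Subsystem`) with two methods, `describe` and `handle`, and a hypothetical `Registry` that discovers services at runtime from the metadata each subsystem returns. The subsystem names echo Figure 3, but the verbs, message fields, and return values are invented for illustration and are not part of the UNA.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Subsystem(ABC):
    """Common interface implemented by every subsystem (symmetry)."""

    @abstractmethod
    def describe(self) -> Dict[str, object]:
        """Return self-descriptive metadata about the services offered."""

    @abstractmethod
    def handle(self, verb: str, noun: dict) -> dict:
        """Carry out a request expressed as a verb and a data noun."""


class PaginationSystem(Subsystem):
    def describe(self) -> Dict[str, object]:
        return {"name": "pagination", "verbs": ["place"]}

    def handle(self, verb: str, noun: dict) -> dict:
        if verb != "place":
            raise ValueError(f"unsupported verb: {verb}")
        return {"page": noun["page"], "placed": noun["story"]}


class DigitalAssetManager(Subsystem):
    def describe(self) -> Dict[str, object]:
        return {"name": "assets", "verbs": ["fetch"]}

    def handle(self, verb: str, noun: dict) -> dict:
        if verb != "fetch":
            raise ValueError(f"unsupported verb: {verb}")
        return {"asset": noun["id"], "url": f"asset://{noun['id']}"}


class Registry:
    """Routes requests using only the metadata each subsystem publishes."""

    def __init__(self) -> None:
        self._by_verb: Dict[str, Subsystem] = {}

    def add(self, subsystem: Subsystem) -> None:
        # Discovery: the registry learns the verbs from describe(), not from
        # any knowledge of the concrete class.
        for verb in subsystem.describe()["verbs"]:
            self._by_verb[verb] = subsystem

    def verbs(self) -> List[str]:
        return sorted(self._by_verb)

    def dispatch(self, verb: str, noun: dict) -> dict:
        if verb not in self._by_verb:
            raise LookupError(f"no subsystem offers verb: {verb}")
        return self._by_verb[verb].handle(verb, noun)
```

Because callers depend only on `Registry` and the `Subsystem` interface, a new subsystem can be added with `registry.add(...)` without changing any existing caller, which is also the intent of component isolation.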


Architectural Styles

Architectural style in software is analogous to an architectural style in buildings. An architectural style defines a family of systems or system components in terms of their structural organization. An architectural style expresses components and relationships between these components, with constraints on their application, their associated composition, and the design rules for their construction [Perr92], [Shaw96], [Garl93].

Architectural style is determined by:

n The component types that perform some function at runtime (e.g. a data repository, a process, or a procedure).

n The topological description of these components indicating their runtime interrelationships (e.g. a repository hosted by a SQL database, processes running on middleware, and procedures created through user interaction with a graphic interface).

n The semantic constraints that will restrict the system behavior (e.g. a data repository is not allowed to change the values stored in it).

n The connectors that mediate communication, coordination, or cooperation among the components (e.g. protocols, interface standards, and common libraries).
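The four determinants listed above (component types, topology, semantic constraints, and connectors) can be written down as a small data structure and checked mechanically. The sketch below is one possible encoding in Python, not a notation from the cited literature. All class and function names are hypothetical, and the two rules are illustrative constraints chosen for the example: a repository never initiates a connection, and every connector must join declared components.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Component:
    name: str
    kind: str        # component type: "repository", "process", or "procedure"


@dataclass(frozen=True)
class Connector:
    source: str      # name of the initiating component
    target: str      # name of the component being reached
    protocol: str    # e.g. "sql" or "event"


@dataclass
class Style:
    components: List[Component]
    connectors: List[Connector]   # together these give the topology
    constraints: List[Callable[["Style"], List[str]]]

    def violations(self) -> List[str]:
        """Run every constraint and collect the problems it reports."""
        problems: List[str] = []
        for rule in self.constraints:
            problems.extend(rule(self))
        return problems


def repositories_are_passive(style: Style) -> List[str]:
    """Illustrative constraint: a repository never initiates a connection."""
    repos = {c.name for c in style.components if c.kind == "repository"}
    return [
        f"{k.source} is a repository but initiates a connection"
        for k in style.connectors
        if k.source in repos
    ]


def connectors_join_known_components(style: Style) -> List[str]:
    """Illustrative constraint: both ends of a connector must be declared."""
    known = {c.name for c in style.components}
    return [
        f"connector {k.source} -> {k.target} references an undeclared component"
        for k in style.connectors
        if k.source not in known or k.target not in known
    ]
```

A description built this way makes the "boxes and lines" explicit: each box has a declared type, each line has a declared protocol, and the rules that restrict how they may be combined are stated rather than implied.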

There are several broad architectural styles in use in modern distributed systems and several detailed substyles within each broad grouping [Shaw96], [Abow95].

Because practical systems are not constructed from one style, but from a mixture of styles, it is important to understand the interrelationship between styles and their effect on system behavior.

This architectural style analysis [Adow93]:

n Brings out significant differences that affect the suitability of a style for various tasks, so the architect can make better informed selections.

n Shows which styles are variations of others, so the architect can be more confident in choosing appropriate combinations of styles.

n Allows the features used to classify styles to serve as a checklist of topics, helping the designer focus on important design and integration issues.


4 + 1 ARCHITECTURE

In many projects, a single diagram is presented to capture the essence of the system architecture. Looking carefully at the boxes and lines in these diagrams, the reader is not sure of the meaning of the components. Do the boxes represent computers? Blocks of executing code? Application interfaces? Business processes? Or just logical groupings of functionality? [Shaw96a]

One approach to managing architectural style is to partition the architecture into multiple views based on the work of [Kruc95], [Adow93], [Perr92], [Witt94]. The 4+1 Architecture describes the relationship between the four views of the architecture and the Use Cases that connect them. [22] A view is nothing more than a projection of the system description, producing a specific perspective on the system’s components.

[Figure 4 diagram. Views: Logical Architecture (End User), Development Architecture (Developer / Integrator), Process Architecture (Abilities), Physical Architecture (System Engineers), connected by System Usage Scenarios (Use Cases).]

Figure 4 – The 4+1 Architecture as Defined by [Kruc95]

Figure 4 describes the 4+1 architecture as defined in [Kruc95]. The 4+1 architecture is focused on the development of systems rather than the assembly of COTS based solutions. The 4+1 paradigm will be further developed during the ARCHITECTURAL PLANNING phase using the ISO/IEC 10746 guidelines.

There are many architectural paradigms in the market place. The 4+1 paradigm is used here to focus the system architecture on the decomposition of the architectural components into a COTS based view of the system.


The system architecture is the structure of a software system. It is described as a set of software components and the relationships between them. For a complete description of an architecture several views are needed, each describing a different set of structured aspects [Hofm97].

For the moment, the 4+1 Architecture provides the following views:

n Logical – the functional requirements of the system as seen by the user.

n Process – the non–functional requirements of the system described as abilities.

n Development – the organization of the software components and the teams that assemble them.

n Physical – the system’s infrastructure and components that make use of this infrastructure.

n Scenarios – the Use Cases that describe the sequence of actions between the system and its environment or between the internal objects involved in a particular execution of the system.

There are numerous techniques used to describe the architectural views of a system: algebraic specifications [Wirs90], entity relationship diagrams [Chen76], automata [Hopc79], class diagrams [Rumb91], message sequence diagrams [ITU94], data flow diagrams [DeMarc79], as well as many others. In this paper, the Unified Modeling Language (UML) is used because it combines many of these notations and concepts into a coherent notation and semantics [Booc99], [Fowl97], [D’Sou99].

MOVING FROM 4+1 ARCHITECTURE TO METHODOLOGIES

Now that the various components of system architecture are established, the development of these four architectural components must be placed within a specific context. This is the role of an architectural methodology.

In the 4+1 architecture the arrangements of the system components are described in constructive terms – what the components are made of. The next step in the process is to introduce a business requirements architecture process. The business requirements will drive the architecture. Without consideration for these business requirements, the architecture of the system would be context free. By introducing the business requirements, the architecture can be made practical in the context of the business and can therefore become generative.

These business requirements are not the business functions, but rather the functional and non–functional requirements of a system to support the business functions.


STRUCTURE MATTERS

From the beginnings of software engineering, structure has been the foundation of good architecture [Parn72], [Dijk68]. There are some basic tenets that can be used to guide the architecture–centered deployment [Clem96]:

n Systems can be built in a rapid, cost–effective manner by importing (or generating) large externally developed components.

n It is possible to predict certain qualities about a system by studying its architecture, even in the absence of detailed design documents.

n Enterprise–wide systems can be deployed by sharing a common architecture. Large–scale reuse is possible through architectural level planning.

n The functionality of a system component can be separated from the component’s interconnection mechanisms. Separating data and process is a critical success factor for any well architected system.

REFERENCES

[Abow95] “Formalizing Style to Understand Descriptions of Software Architecture,” G. Abowd, R. Allen, and D. Garlan, ACM Transactions on Software Engineering and Methodology, 4(4), pp. 319–364, 1995.

[Adow93] “Using Style to Understand Descriptions of Software Architecture,” G. Abowd, R. Allen, and D. Garlan, ACM Software Engineering Notes, December, 1993, pp. 9–20.

[Alle94] “Formalizing Architectural Connection,” R. Allen and D. Garlan in Proceedings of the 16th International Conference on Software Engineering, 1994.

[Alex79] The Timeless Way of Building, C. Alexander, Oxford University Press, 1979.

[Alex77] A Pattern Language: Towns, Buildings, Construction, C. Alexander, S. Ishikawa, and M. Silverstein, Oxford University Press, 1977.

[Bryn98] “Beyond the Productivity Paradox,” E. Brynjolfsson and L. M. Hitt, Communications of the ACM, 41(8), pp. 49–55, August 1998.

[Bryn93] “The Productivity Paradox of Information Technology,” E. Brynjolfsson, Communications of the ACM, 36(12), pp. 66-77, December 1993.

[Chen76] “The Entity Relationship Model – Towards a Unified View of Data,” P. Chen, ACM Transactions on Database Systems, 1(1), 1976, pp. 9–36.

[Clem96] “Coming Attractions in Software Architecture,” P. C. Clements, CMU/SEI–96–TR–008, Software Engineering Institute, Carnegie Mellon University, Pittsburgh, PA, January, 1996.

[DeMarc79] Structured Analysis and System Specification, T. DeMarco, Prentice Hall, 1979.

[Dijk68] “The Structure of the T.H.E. Multiprogramming System,” E. W. Dijkstra, Communications of the ACM, 11(5), May, 1968, pp. 341–346.

[Foot97] “Big Ball of Mud,” B. Foote and J. Yoder, University of Illinois at Urbana–Champaign, September 1997.

[Garl93] “An Introduction to Software Architecture,” D. Garlan and M. Shaw, Advances in Software Engineering and Knowledge Engineering, Volume 1, World Scientific, 1993.

[Garl95] “Architectural Mismatch or Why It’s Hard to Build Systems Out of Existing Parts,” D. Garlan, R. Allen, and J. Ockerbloom, Proceedings of the Seventeenth International Conference on Software Engineering, April 1995.

[Gunn98] “The Architect’s Role in Package Application Integration,” S. Gunnell, Sun World, August 1998.

[Fowl97] Analysis Patterns: Reusable Object Models, M. Fowler, Addison–Wesley, 1997.

[Hofm97] “Approaches to Software Architecture,” C. Hofmann, E. Horn, W. Keller, K. Renzel, and M. Schmidt, in Software Architecture and Design Patterns in Business Applications, edited by M. Broy, E. Denert, K. Renzel, and M. Schmidt, Technical University at München, TUM–I9746, November, 1997.

[Hopc79] Introduction to Automata Theory, Languages and Computation, J. E. Hopcroft and J. D. Ullman, Addison Wesley, 1979.

[ITU94] International Telecommunications Union: Message Sequence Charts, ITU–T, Z.120, 1994.

[Jaco92] Object–Oriented Software Engineering: A Use Case Driven Approach, I. Jacobson, Addison Wesley, 1992.

[Kazm96] “Classifying Architectural Elements,” R. Kazman, P. Clements, G. Abowd, and L. Bass, Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering, 1996.

[Kruc95] “The 4+1 View Model of Architecture,” P. Kruchten, IEEE Software, 12(6), pp. 42–50, 1995.

[Mori98] “Applications in Rapidly Changing Environments,” K. Mori, IEEE Computer, April 1998, Volume 31, Number Four, pp. 42–44.

[Parn72] “On the Criteria to be Used in Decomposing Systems into Modules,” D. Parnas, Communications of the ACM, Vol. 15, pp. 1053–1058, December 1972.

[Perr92] “Foundations for the Study of Software Architecture,” D. E. Perry and A. L. Wolf, ACM Software Engineering Notes, October, 1992, pp. 40–52.

[Rech97] The Art of Systems Architecting, E. Rechtin and M. W. Maier, CRC Press, 1997.

[Rumb91] Object–Oriented Modeling and Design, J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, and W. Lorensen, Prentice Hall, 1991.

[Sei98] Continuous Risk Management, Software Engineering Institute, 1998.

[Sei00] “Software Architecture Bibliographies,” Software Engineering Institute, http://www.sei.cmu.edu/architecture/bibliography.html

[Schr97] “The Real Problem with Computers,” M. Schrage, Harvard Business Review, 75(5), November/December, 1997, pp. 178–183.


[Schn98] Applying Use Cases: A Practical Guide, G. Schneider and J. P. Winters, Addison Wesley, 1998.

[Shaw96] Software Architecture: Perspectives on an Emerging Discipline, M. Shaw, and D. Garlan, Prentice–Hall, 1996.

[Shaw96a] “A Field Guide to Boxology: Preliminary Classification of Architectural Styles for Software Systems,” M. Shaw and P. Clements, Proceedings of the 2nd International Software Architecture Workshop, October 1996.

[D’Sou99] Objects, Components, and Frameworks with UML: The Catalysis Approach, D. F. D’Souza and A. C. Wills, Addison Wesley, 1999.

[Tanu98] “Software Architecture in the Business Software Domain: The Descartes Experience,” M. Tanuan, Proceedings of ISAW3, ACM 1998, pp. 145–148.

[Wirs90] “Algebraic Specifications in Formal Methods and Semantics,” Handbook of Theoretical Computer Science, M. Wirsing, Elsevier, 1990, pp. 675–788.

[Witt94] Software Architecture and Design Principles, Models, and Methods, B. I. Witt, F. T. Baker, and E. W. Merritt, Van Nostrand Reinhold, 1994.

[Zach87] “A Framework for Information Systems Architecture,” J. Zachman, IBM Systems Journal, 26, Number 3, 1987.


END NOTES

1 Many of these architectural analogies are based on mapping the building architecture paradigm to the software architecture paradigm. If this were the actual case, software systems would be built on rigid immovable foundations, with fixed frameworks and inflexible habitats and user requirements. In fact, software architecture is more analogous to urban planning. The macro level design of urban spaces is provided by the planner, with infrastructure (utilities, transportation corridors, and habitat topology) defined before any buildings are constructed. The actual dwelling spaces are built to broad standards and codes. The individual buildings are constructed to meet specific needs of their inhabitants. The urban planner visualizes the city–scape on which the individual dwellings will be constructed. The dwellings are reusable (remodeling) structures that are loosely coupled to the urban infrastructure. Using this analogy, dwellings are the reusable components of the city–scape, similar to application components in the system architecture. In both analogies, the infrastructure forms the basis of the architectural guidelines. This includes utilities, building codes, structural limitations, materials limitations, and local style [Alex79], [Alex77].

2 In many real life applications, there does not exist a solution to a problem that satisfies all the constraints. Such systems are called over–constrained systems. An example might be the selection of matching clothes (shirt, shoes, and pants). There are red and white shirts, cordovan and sneaker shoes, and blue, denim, and gray pants. If the following matching constraints are used – Shirts and Pants: {(red, gray), (white, blue), (white, denim)}; Shoes and Pants: {(sneakers, denim), (cordovans, gray)}; Shirts and Shoes: {(white, cordovans)} – there is no solution: the only allowed shirt and shoe pairing is white with cordovans, but cordovans require gray pants and a white shirt requires blue or denim pants.

3 The concept of integration standards is complex and fraught with misunderstandings and over simplifications, mostly provided by vendors. One source of a set of standards guidelines can be found at www.computer.org/standards/sesc/MasterPlan/index.htm. These include:

§ Architectural standards for software.
§ Correlation of product standards to architectural standards.
§ Identification of critical constraints on the system.
§ Precise definitions of all software-software interfaces.
§ Precise definitions of functions and outputs.
§ Precise definitions of all software-hardware interfaces.
§ Domain specific user interface metaphors.
§ Conventions for object-oriented messages.
§ Language bindings that provide integration of the software packages.
§ Process and criteria for preparing software integration estimates.
§ Clear and interoperable relationships between hardware and software standards.
§ An understanding of the copyright and patents.
§ Criteria for assessing quality of the integrated product.
§ Criteria for assessing the reliability of the integrated product.
§ Criteria for assessing the maintainability of the integrated product.
§ Criteria for assessing the usability of the integrated product.
§ Criteria for assessing the functionality of the integrated product.
§ Criteria for assessing the performance of the integrated product.

4 Dr. John Rockart from MIT's Sloan School of Management is the source of the concept of Critical Success Factors (CSF). The CSFs for a business are associated with its industry, competitive strategy, internal and external environmental changes, managerial principles, and a CEO perspective.

5 The mainframe environment has been tagged with the monolithic label for some time. Now that mature client / server applications have been targeted for replacement, they too have been labeled monolithic. It is not the mainframe environment that creates the monolithic architecture, it is the application's architecture itself that results in a monolithic system being deployed. This behavior occurs when the data used by the application is trapped inside the code. Separating the data from the application is one of the primary goals of good software architecture. This separation however, must take into account the semantics of the data, that is the meaning of the data. This meaning is described through a meta–data dictionary which is maintained by the architect.

6 At this point, the separation between content management and information management systems is somewhat artificial. The separation between content management and information management is defined through the access facilities of the system. Many systems provide some form of access to the underlying database. What is not provided is a uniform data model of both the semantics and the syntax of this data. This is usually referred to as the meta–data or meta–model. This architecture is the foundation of an open standards based integration strategy. Without such a meta approach, the integration process creates an instant legacy system in the same way the previous generation systems functioned. With a meta–model, the system becomes open and adaptive in a manner that supports adaptation and integration with other similar systems.

7 “Integrating Islands of Automation,” Michael Stonebraker, eaiJournal, September / October 1999. www.eaijournal.com.

8 “Object–Oriented Application Frameworks,” Mohamed E. Fayad and Douglas C. Schmidt, Communications of the ACM, 40(10), October 1997, pp. 32–38.

9 “Michael Stonebraker on the Importance of Data Integration,” Lee Garber, IEEE IT Pro, May / June, 1999

10 “Framework Integration: Problems, Causes, Solutions,” Michael Mattsson, Jan Bosch, and Mohamed E. Fayad, Communications of the ACM, 42(10), October 1999, pp. 81–87.

11 The term impedance mismatch is widely used in the database domain to describe the mismatch between SQL database technologies and Object Oriented technologies. This mismatch comes about through the query and updating processes that are distinctly different in their semantics. In addition, the concept of architectural mismatch is now well understood and is the cause of many of the problems found in the industry. To understand this issue and how it affects the design of UNA, some background materials are available and should be read by anyone intending to make changes to the UNA: “Detecting Architectural Mismatches During System Composition,” Cristina Gacek, Center for Software Engineering, Computer Science Department, University of Southern California, USC/CSE–97–TR–506, July 8, 1997; “Composing Heterogeneous Software Architectures,” Ahmed Abd–el–Shaft Abd–Allah, PhD Thesis, University of Southern California, August 1996; and “Attribute–Based Architectural Styles,” Mark Klein and Rick Kazman, Software Engineering Institute, CMU/SEI–99–TR–022, October 1999.

12 “Architectural Mismatch, or Why It’s Hard to Build Systems Out Of Existing Parts,” D. Garlan, R. Allen, and J. Ockerbloom, Proceedings of ICSE ‘95, IEEE Computer Society Press, April 23–30 1995, pp. 179–185.


13 “Architectural Issues, other Lessons Learned in Component–Based Software Development,” Will Tracz, Crosstalk, January 2000, pp. 4–8. www.stsc.hill.af.mil/CrossTalk/index.asp.

14 “Integrating Islands of Information,” Michael Stonebraker, EAI Journal, September/October 1999.

15 There are formal integration strategies for other business domains defined by the Object Management Group (OMG). The publishing business domain has not been addressed yet by any formal standards body.

16 The importance of an academically sound, field proven architecture cannot be over emphasized in the commercial market place. Many of the developmental, operational and maintenance problems in commercial software products can be traced to poor architecture and poor implementation of architectures. The Universal NewsGram Architecture is constructed using several sub–architectures. Each of these architectures is carefully chosen to meet the requirements of the publishing domain, while providing a clear and concise means of extending the system into new markets.

The UNA architecture is based on the following foundations. Anyone intending to extend the UNA or alter the use of the UNA components is cautioned to gain a full understanding of the motivation and foundation of the architecture by reading the following as well as all the other references in this paper:

“Attribute Based Architectural Styles,” Mark Klein and Rick Kazman, CMU/SEI–99–TR–022, Software Engineering Institute, October, 1999.

“A Unified Framework for Coupling Measurement in Object–Oriented Systems,” L. Briand, J. Daly, and J. Wuest, IEEE Transactions on Software Engineering, 25(1), January/February 1999.

“SAAM: A Method for Analyzing the Properties of Software Architectures,” R. Kazman, G. Abowd, L. Bass, M. Webb, Proceedings of the 16th International Conference of Software Engineering, May 1994, pp. 81–90.

“Attribute Based Architecture Style,” M. Klein, R. Kazman, L. Bass, S. J. Carriere, M. Barbacci, and H. Lipson, Software Architecture. Proceedings of the First Working IFIP Conference on Software Architecture, February 1999, pp. 225–243.

17 This analogy is not far from the truth. The development of just in time manufacturing control systems is based on the Theory of Constraints, in which the materials for the finished product are identified, managed, and delivered just in time for the manufacturing operations to perform their value added processing. The theory of these manufacturing systems is well understood and used to reduce cost, improve performance of the capital and labor assets, and shorten delivery times. No formal theory of newspaper production has been developed, so this analogy is anecdotal at best. See Manufacturing Planning & Control Systems 4th Edition, Thomas Vollmann, McGraw Hill, 1997.

18 The detailed process workflow between these components is the subject of another White Paper, Insiight Theory of Operations. These workflows will not be described here, but the UNA’s role in this process will be.

19 The IFRATrack specification describes the data schema for tracking the production status of a newspaper. IFRATrack 2.0 is a specification for the interchange of status and management information between local and global production management systems in newspaper production. The key here is the exchange of information between production management systems. This assumes that two production management systems exist and that information can be exchanged between them using IFRATrack. There is the common misunderstanding that IFRATrack schemas are primarily designed for the internal representation of status and state information. The UNA and the NOS provide IFRATrack compliant capabilities. The semantics of the status and state information maintained in the NOS through the NewsGram match as closely as possible the IFRATrack specifications found in IFRATrack 2.0, published in IFRA Special Report 6.21.2. This data schema represents a logical newspaper, with specific relationships between the departments. In addition, this schema is print–centric and not extensible to other media.

20 This is a fundamental distinction between an integrated system and a federated system.

21 The Channel, Verb, Noun, Semantics approach to defining interoperability is a high level concept that can be used for nearly any architectural approach to system design.

22 The Use Case notation has become popular in object oriented design and development. The Use Case specifies the sequence of actions, including any variants, that a system can perform, interacting with actors of the system [D’Sou99], [Jaco92], [Schn98]. Use Cases provide a functional description of the system but may not be appropriate for the non–functional requirements specification. When combined with sequence diagrams, Use Cases can describe the components of the system and the interactions between these components. These components include software, users, administrators, databases, and communication channels.