RainMan: A Workflow System for the Internet Santanu Paul ... · Santanu Paul, Edwin Park, and Jarir...

12
Abstract As individuals and enterprises get interconnected via global networks, workflows that scale beyond traditional organizational boundaries and execute seamlessly across these networks will become relevant. We address the problem of designing a scalable workflow infrastructure for the Internet that supports both flexibility in workflow participation and interoperability between heterogeneous workflow system components. RainMan is a distributed workflow system developed in Java that lives naturally on the Internet. RainMan is a loosely-coupled collection of independent services that cooperate with each other rather than a monolithic system. Some of the useful features of RainMan are browser-based workflow specification, participation, and management, and dynamic workflow modification. The RainMan system is based on RainMaker, our generic workflow framework that defines a core set of well-defined interfaces for workflow components. 1. Introduction Workflow systems are essential to organizations that need to automate their business processes [Silver95]. The attraction of these systems is that they help organizations specify, execute, and monitor their business processes in an efficient manner over enterprise-level networks [FlMa, VisFlo, InConc, ActWork]. Workflow systems provide improved throughput and tracking of processes, and better utilization of organizational resources. As individuals and organizations become interconnected via global networks such as the Internet, they will attempt to team up in various ways to share work and processes across traditional organizational boundaries. An Internet infrastructure that enables such interactions as workflows would be of great value; unfortunately, traditional workflow systems designed for centralized workflow execution do not lend themselves well to these new workflow applications. Internet-wide workflows require an infrastructure that supports decentralized workflow execution, workflow system interoperability, dynamic workflow modification, and low-cost workflow participation. The proprietary and monolithic design of current workflow systems makes it very difficult to address these requirements. The Workflow Management Coalition (WfMC) has proposed a reference architecture and defined interfaces for vendors of traditional workflow systems to interoperate [WfMC]. The WfMC standard is a useful step in the direction of interoperability, however, its main shortcoming is that it promotes a monolithic workflow system architecture that is not flexible or scalable enough to address the needs of Internet-wide workflow applications which must necessarily be loosely-coupled. We designed RainMan as a distributed workflow system for the Internet. The system comprises a collection of independent and lightweight services on the network. Our Java implementation uses open standards and Web browser-based user interfaces. We are using RainMan to experiment with a range of interesting features such as dynamic workflow modification which allows a workflow graph to be changed during execution, disconnected participation which allows participants, designers and administrators on the Internet to be infrequently connected to the network, and downloadable workflow execution which allows workflow subprocesses to be downloaded across the Internet on demand. RainMan is based on RainMaker, our generic workflow framework that defines a core set of abstract interfaces for workflow components. The main purpose of RainMaker is to facilitate the design and implementation of flexible, interoperable, and scalable workflow system components. EvalApp EvalPaymentPlan EvalBusinessPlan CreditCheck Decision Figure 1: A Business Loan Approval Workflow RainMan: A Workflow System for the Internet Santanu Paul, Edwin Park, and Jarir Chaar IBM T. J. Watson Research Center P.O. Box 704 Yorktown Heights, NY 10598

Transcript of RainMan: A Workflow System for the Internet Santanu Paul ... · Santanu Paul, Edwin Park, and Jarir...

Page 1: RainMan: A Workflow System for the Internet Santanu Paul ... · Santanu Paul, Edwin Park, and Jarir Chaar IBM T. J. Watson Research Center P.O. Box 704 Yorktown Heights, NY 10598.

Abstract

As individuals and enterprises get interconnected viaglobal networks, workflows that scale beyondtraditional organizational boundaries and executeseamlessly across these networks will become relevant.We address the problem of designing a scalableworkflow infrastructure for the Internet that supportsboth flexibility in workflow participation andinteroperability between heterogeneous workflowsystem components.

RainMan is a distributed workflow system developed inJava that lives naturally on the Internet. RainMan is aloosely-coupled collection of independent services thatcooperate with each other rather than a monolithicsystem. Some of the useful features of RainMan arebrowser-based workflow specification, participation,and management, and dynamic workflow modification.The RainMan system is based on RainMaker, ourgeneric workflow framework that defines a core set ofwell-defined interfaces for workflow components.

1. Introduction

Workflow systems are essential to organizations thatneed to automate their business processes [Silver95].The attraction of these systems is that they helporganizations specify, execute, and monitor theirbusiness processes in an efficient manner overenterprise-level networks [FlMa, VisFlo, InConc,ActWork]. Workflow systems provide improvedthroughput and tracking of processes, and betterutilization of organizational resources.

As individuals and organizations becomeinterconnected via global networks such as the Internet,they will attempt to team up in various ways to sharework and processes across traditional organizationalboundaries. An Internet infrastructure that enables suchinteractions as workflows would be of great value;unfortunately, traditional workflow systems designedfor centralized workflow execution do not lendthemselves well to these new workflow applications.Internet-wide workflows require an infrastructure thatsupports decentralized workflow execution, workflow

system interoperability, dynamic workflowmodification, and low-cost workflow participation. Theproprietary and monolithic design of current workflowsystems makes it very difficult to address theserequirements.

The Workflow Management Coalition (WfMC) hasproposed a reference architecture and defined interfacesfor vendors of traditional workflow systems tointeroperate [WfMC]. The WfMC standard is a usefulstep in the direction of interoperability, however, itsmain shortcoming is that it promotes a monolithicworkflow system architecture that is not flexible orscalable enough to address the needs of Internet-wideworkflow applications which must necessarily beloosely-coupled.

We designed RainMan as a distributed workflowsystem for the Internet. The system comprises acollection of independent and lightweight services onthe network. Our Java implementation uses openstandards and Web browser-based user interfaces. Weare using RainMan to experiment with a range ofinteresting features such as dynamic workflowmodification which allows a workflow graph to bechanged during execution, disconnected participationwhich allows participants, designers and administratorson the Internet to be infrequently connected to thenetwork, and downloadable workflow execution whichallows workflow subprocesses to be downloaded acrossthe Internet on demand.

RainMan is based on RainMaker, our generic workflowframework that defines a core set of abstract interfacesfor workflow components. The main purpose ofRainMaker is to facilitate the design andimplementation of flexible, interoperable, and scalableworkflow system components.

EvalApp

EvalPaymentPlan

EvalBusinessPlan

CreditCheck Decision

Figure 1: A Business Loan Approval Workflow

RainMan: A Workflow System for the InternetSantanu Paul, Edwin Park, and Jarir Chaar

IBM T. J. Watson Research CenterP.O. Box 704

Yorktown Heights, NY 10598

Page 2: RainMan: A Workflow System for the Internet Santanu Paul ... · Santanu Paul, Edwin Park, and Jarir Chaar IBM T. J. Watson Research Center P.O. Box 704 Yorktown Heights, NY 10598.

2. Workflow Systems Background The traditional use of workflow systems has been in theautomation of highly structured, process-basedapplications that are common to banks and insurancecompanies. Figure 1 shows a typical (simplified)BusinessLoanApproval workflow in a bank. Workflowsystems provide extensive support for the design andautomation of such applications in an efficient manner.Some of the important features they provide are:

1. Specification Tools: These are used to specifyprocesses using high-level specification languages.There is no commonly-accepted basis for specificationlanguages, each workflow system imposes its ownlanguage which can be based on directed acyclicgraphs, rules, state machines, Petri-nets, or other anyformalism. In general, it is common practice to programworkflows visually as a directed graph where the nodesrepresent activities and the arcs represent control anddata flows between activities.

2. Execution environments: A workflow systemprovides an execution environment for workflowinstances; this involves interpreting the process bystepping through the activities and assigning them toappropriate organizational resources such as humans,applications, and other workflow systems.

3. Audit Logs and Tools: These are necessary tomonitor and track the progress of workflow instancesthrough the workflow system.

4. Worklist Management: Management of worklists;worklists are persistent storage locations where humanparticipants receive work assigned to them.

In effect, traditional workflow systems are sophisticatedprocess specification and execution environments thatallow organizations to run process-based applicationsefficiently, within the context of a closed environmentover which the workflow system has significantauthority.

In response to a growing number of proprietaryworkflow systems, the Workflow ManagementCoalition has defined a set of interfaces to enableworkflow system interoperability. These interfaces aretied to a workflow system architecture known as theWfMC Reference Model (Figure 2) which definesinterfaces between workflow servers and othercomponents such as process definition or specificationtools, workflow client applications, invokedapplications, administrative tools, and other workflowservers. The specific details of these interfaces aredescribed in [WfMC].

Workflow Server

Process DefinitionTool

Admin/ Monitoring

Tools

Workflow Client

ApplicatonsInvoked

Applications

Workflow Server

interface 1

interface 2interface 3

interface 4interface 5

Figure 2: WfMC Reference Model

The architecture defined by the WfMC standard is notwell suited for workflow execution on the Internet. Themain problem is that the workflow server is monolithic.It is responsible for process execution, auditing,management of the organizational directory, anddistribution of activities to appropriate participants(performing role resolution as necessary).

Most importantly, the server is responsible for hostingand managing worklists on behalf of all participants.This is problematic; since worklists are hidden withinthe workflow server (not externally addressable),activities can be sent only to worklists that reside in thesame workflow server. Clients participating in multipleworkflows on heterogeneous servers must have aworklist on each server and must manage each of theworklists (Figure 3).

interface 2

Workflow Client

Directory

process interpreter

worklists

WorkflowServer

Directory

process interpreter

worklists

WorkflowServer

Internet

WorkflowServer

WorkflowServer

WorkflowServer

Figure 3: The Pull Model

In addition, the workflow client application mustmanage multiple connections to each of these workflowservers since the architecture assumes a continuouslyconnected workflow client. This is not a scalable orflexible solution, especially if one is to participate in alarge number of workflows on the Internet, or use thinclients or PDAs, or operate in a disconnected mode. Ananalysis of the WfMC architecture reveals additionalproblems for decentralized workflow execution;interested readers may examine them elsewhere[Paula97, Schu96].

Page 3: RainMan: A Workflow System for the Internet Santanu Paul ... · Santanu Paul, Edwin Park, and Jarir Chaar IBM T. J. Watson Research Center P.O. Box 704 Yorktown Heights, NY 10598.

3. New Workflows on the Internet

In RainMan, we explore the potential of the Internet toenable decentralized workflow execution viainteroperable workflow components that reside acrossthis global infrastructure. We hope to enable new kindsof workflows involving dispersed individuals, multipleorganizations, scattered network resources, andheterogeneous workflow systems. A few motivatingexamples of workflows that should be possible over theInternet are presented here.

Consider a virtual team of IT consultants from differentorganizations, scattered across the globe, working on aproject that is coordinated via workflow. Theconsultants may be mobile and intermittently connectedto the network. Irrespective of location, the projectleader must monitor the workflow and work assigned toconsultants must appear on their worklists. Eachconsultant may participate in many other workflows atthe same time.

In such decentralized workflows, participants mustreceive work from multiple workflows usingheterogeneous clients ranging from desktops to laptopsto PDAs (Figure 4). The work distribution model needsto be different from that proposed by WfMC. Instead ofparticipants connecting to workflow servers explicitly topull their work, the workflow infrastructure mustautomatically route work to them over a global network.

Internet

Workflow Server

Workflow Server

Workflow Server

Workflow Client

Workflow Client

Workflow Client (disconnected)Network services

Figure 4: Decentralized Workflow Execution

Next, consider the BusinessLoanApproval workflow inFigure 1 running on a bank’s workflow server. TheCreditCheck step may itself be a nested subprocess thatis delegated to a CreditEvaluation firm with a workflowserver supplied by a different vendor. During execution,the bank’s server would notify the firm’s server to startan instance of its local CheckCreditRatings workflow.On completion of the subprocess, the bank’s serverwould resume its suspended workflow (Figure 5). Withglobal connectivity, these peer-to-peer workflows

between heterogeneous servers are possible. In fact,Interface 4 of the WfMC standard is designed to enablesuch workflow interactions. However, the drawback ofthe Interface 4 approach is the high cost of setting upsuch interactions, and significant pre-agreementrequired between participating servers.

Bank: Loan Approval Process

Eval Firm: Check Credit Ratings Process

Internet

Start_Process (Data) Process_Complete (Results)

EvalApp

EvalPaymentPlan

EvalBusinessPlan

CreditCheck Decision

EvalAppl

CheckAssets

CheckDebts

Results

Figure 5: Peer-to-Peer Workflow Execution

Finally, as electronic commerce applications becomeavailable on the Internet, it is conceivable thatworkflows may be downloaded for just-in-timeexecution. Downloadable workflows make senseespecially in cases where pre-installing and maintainingworkflows is not cost effective. For example, consideran education brokerage service on the Internet[Hama96] that specializes in locating custom educationservices (Figure 6).

Internet

Requirements

Request for Proposal A

Evaluate Decision

Request for Proposal B

Education Brokerage

Content Provider A

Content Provider B

Budget

Start Complete

Educational Requirements

Download

Start Complete

Course material

Cost

Download

Download

Figure 6: Downloadable Workflow Execution

In a plausible scenario, the Brokerage workflowdownloads a RequirementGathering subworkflow to theclient organization and requests its execution. Next, theBrokerage workflow downloads the requirementsgathered to content providers along with aRequestForProposal subworkflow that the latter mustexecute to create the proposal. At the end, the brokeragecompares all the proposals received and notifies theclient.

To enable workflows such as these on the Internet,significant issues related to security and organizationalprivacy must be addressed. Such services, based onstate-of-art distributed systems security considerations,

Page 4: RainMan: A Workflow System for the Internet Santanu Paul ... · Santanu Paul, Edwin Park, and Jarir Chaar IBM T. J. Watson Research Center P.O. Box 704 Yorktown Heights, NY 10598.

will have to coexist with the workflow infrastructure.

4. RainMaker Workflow Framework

RainMaker is a generic workflow framework we havedeveloped to build interoperable workflow components.The details of the framework are available elsewhere[Paulb97]; only a brief outline is presented in thissection.

RainMaker identifies four important abstractions withinthe workflow domain. Workflow instances areconsidered Sources, or service requesters. Sourcesgenerate Activities, or service requests that aredelegated out. Instances of humans, applications,organizations, and other entities that handle thesedelegated requests are considered Performers, orservice providers. Performers manage units of workcalled Tasks, which implement the delegated requestsissued by Sources. It is an important characteristic ofthe workflow domain that Tasks can be long-running. Acentral feature of RainMaker is that a Task may be aworkflow instance that recursively acts as a Source; thisprovides natural support for delegation of subprocesses.

The core RainMaker interfaces that help support thisexecution model are:�

PerformerAgent: An abstract interface that isimplemented by Performers on the network. Theinterface provides mechanisms for delegating,controlling, and querying Tasks on the Performer.

SourceAgent: An abstract interface implemented bySources on the network. It provides a callbackmechanism for Performers to return the results ofTasks to Sources.

For the remainder of the paper, the italicizedSourceAgent and PerformerAgent refer to theRainMaker abstract interfaces. The terms Source andPerformer are used generically to refer to entities (orobjects) that implement the RainMaker SourceAgentand PerformerAgent interfaces respectively.

The PerformerAgent and SourceAgent interfacesdescribe how proprietary Sources and proprietaryPerformers can interact with each other (Tables 1 and2). In essence, the PerformerAgent interface concealsthe internals of how a Performer or service provideractually performs Tasks in response to the Taskrequests. Symmetrically, the SourceAgent interface is acallback interface implemented to conceal the internaldetails of how the Source actually generates Activities,

issues Task requests, and handles responses fromPerformers. The interfaces also describe the controlmechanisms by which Tasks can be suspended,resumed, and aborted; and query mechanisms by whichtheir status can be tracked. The interaction betweenSources and Performers is shown in Figure 7.

List(TaskID)::listTasks()

TaskStatus::queryTask(TaskID taskid)

ResumeID::resumeTask(TaskID taskid)

SuspendID::suspendTask(TaskID taskid)

AbortID::abortTask(TaskID taskid)

TaskID::createTask(SourceAgent source, ActivityID activityid, TaskRequest taskreq)

List(TaskDefinition)::listTaskDefinitions()

PerformerAgent interface

Table 1: PerformerAgent interface

Boolean::seekPermissionToStartTask( PerformerAgent performer, ActivityID activityid)

ForwardedID::forwardedTask( PerformerAgent performer, ActivityID activityid, PerformerAgent newperformer, TaskID taskid)

RefusedID::refusedTask( PerformerAgent performer, ActivityID activityid)

CompletedID::completedTask( PerformerAgent performer, ActivityID activityid, TaskResponse taskresp)

SourceAgent interface

Table 2: SourceAgent interface

create Task

Task complete

Source

SourceAgent

Taskrequest

Taskresponse

Activity

Performer

Task

PerformerAgentActivity Task

Task

Figure 7: Source and Performer Interaction

In addition to the core interfaces, RainMaker alsodefines the Worklist interface (Table 3). Worklists arean important workflow metaphor similar to electronicmail boxes; they provide a mechanism for humanparticipants to access work assigned to them byworkflow systems. In the case of RainMaker, Performerentities that represent human participants on thenetwork can implement the Worklist interface and allowparticipants to view and access the Task Requests sentto them by various Sources.

Page 5: RainMan: A Workflow System for the Internet Santanu Paul ... · Santanu Paul, Edwin Park, and Jarir Chaar IBM T. J. Watson Research Center P.O. Box 704 Yorktown Heights, NY 10598.

void::forwardWorkItem( PerformerAgent newperformer, TaskID taskid)

void::refusedWorkItem(WorkItemID wid)

void::completedWorkItem( WorkItemID wid, WorkItemResponse wresp)

WorkItemRequest::getWorkItem(WorkItemID wid)

List(WorkItemDescriptor)::getWorklistIndex()

Worklist interface

Table 3: Worklist interface

The strength of the RainMaker framework lies in itsgeneralized push model of Task distribution that allowsTasks to be delegated to Performers independent oftheir implementation. This aids the interoperability ofheterogeneous workflow systems and components.Implementations of the PerformerAgent interface canrepresent arbitrary service providers: humans,applications, roles, organizational units, and workflowsystems. Implementations of the SourceAgent interfacecan embody heterogeneous workflow applicationsdescribed as rules, as control/data flow graphs, and evennon-workflow applications such as collaborations orhuman Sources.

Worklist Client Applet

RainMan Directory

Performer

Performers

Builder & Monitor Applet

WAN

Worklists

Worklist Client Applet

post

post

post

start

Source

Performer

Performer

Source

Figure 8: RainMan System Design

5. The RainMan System

The RainMan system (see Figure 8) is a distributedworkflow system prototype written in Java usingRainMaker interfaces. It consists of a loosely- coupledcollection of distributed lightweight services required todeliver workflow functionality to Internet users. Itcurrently consists of approximately 60 Java classesdeveloped with Java JDK 1.1.4 and it uses Java’sRemote Method Invocation (RMI) for transportbetween distributed components. The transport has been

designed to be replaceable; it can be reimplemented ontop of any messaging system. RainMan runs on aTCP/IP token ring network of Wintel machines.

5.1 RainMan User Interface Components

The user interface components of RainMan allowworkflow users to interact with the runtimeenvironment. The currently available ones are theBuilder, the Worklist Client, and the Administrator, allimplemented as Java applets. This makes anyJava-compliant Web browser a usable client for allRainMan user interfaces, and has the additional benefitof making RainMan user interface components portableand hardware-independent. The Builder and WorklistClient are described in this section.

5.1.1 RainMan Builder Applet

The RainMan Builder (see Figure 9) in its currentincarnation is a single Java applet that combines threefunctions. It is an interactive graphical environment forspecifying workflows as directed, acyclic graphs.Performers are assigned from the RainMan directoryservice view available within the Builder. Theservice-specific aspects of an Activity are handled usingthe JavaBeans component object model. Workflowgraphs can be saved to and loaded from a workflowspecification repository.

The Builder also acts as a workflow graph interpreter(Source) and implements the SourceAgent interface. Foran executing workflow, the Builder posts Task requeststo specified Performers. On receiving notification ofresults, it inspects the workflow graph and posts thenext set of Task requests. Since the Builder dynamicallyinterprets the workflow graph, it is possible to changeor refine the ‘downstream’ specification of theworkflow graph even after the workflow has beenstarted. This is a useful feature that enables thedefinition and execution of ad hoc processes, anddistinguishes RainMan from traditional workflowsystems. It allows for dynamic updates to the workflowgraph to deal with emerging scenarios.

This is particularly relevant to decentralized workflowexecution on the Internet, because we expectPerformers to have significant autonomy over theirinternal domains and hence be capable of raisingexceptions in response to Task requests from Sources.In addition, Performers can change their capabilities, aswell as join or leave the network at will. Workflowsystems that cannot be dynamically modified can bevery limiting in such situations.

Page 6: RainMan: A Workflow System for the Internet Santanu Paul ... · Santanu Paul, Edwin Park, and Jarir Chaar IBM T. J. Watson Research Center P.O. Box 704 Yorktown Heights, NY 10598.

Support for groups and roles is an important part ofworkflow execution. Workflow systems are used toimprove process throughput; one of the ways to achievethat objective is to assign an Activity to a role at buildtime. A role represents a group of ‘fungible’ Performers(i.e. ‘designer’, ‘tester’, ‘manager’, etc.); the binding ofan Activity to a specific Performer is postponed untilexecution. In RainMan, we are experimenting with theuse of special-purpose Performers to manage roles;section 5.2.3 covers this topic in some more detail.

The choice of a good workflow specification languageremains a matter of considerable debate within theworkflow community. RainMan uses a vanilla workflowspecification language based on directed, acyclicgraphs. It is important to realize that RainMan offers anexcellent infrastructure for deploying a wide range ofspecification tools based on heterogeneous languages,since the details (and potential idiosyncrasies) of aspecification language are completely encapsulatedwithin a Source implementation and not exposed eitherto the Performers or to the RainMan infrastructure. Theclean separation of responsibilities of workflow routingand Task execution between Sources and Performersrespectively makes RainMan a suitable infrastructure

for plug-and-play operation with heterogeneous Sourcesand Performers.

Finally, the Builder also helps users monitor the state ofa workflow execution. It provides visual cues to theowner or administrator about the global state of aworkflow in execution at a coarse level of granularity.

Even though the three functions of Builder, Source, andMonitor are currently physically part of the same Javaapplet, they remain distinct logical entities in ourworkflow architecture. The next version of ourprototype will separate some of these functions,allowing for features such as disconnected and remotebuilding and monitoring.

5.1.2 RainMan Worklist Client Applet

The RainMan Worklist Client is a Java applet thatallows users to access their Worklists from within aWeb browser (see Figure 10). A Worklist is a remoteJava object on the network that implements thePerformerAgent and Worklist interfaces and acts as thePerformer for a human participant. The participant canuse the Worklist Client to view the contents of theremote Worklist, and selectively download specificTask requests and perform them.

Figure 9: Builder Applet

Page 7: RainMan: A Workflow System for the Internet Santanu Paul ... · Santanu Paul, Edwin Park, and Jarir Chaar IBM T. J. Watson Research Center P.O. Box 704 Yorktown Heights, NY 10598.

Appropriate Task Handlers are automatically launchedto allow the human to interact with Task requests. Eachparticipant or Performer has a set of capabilities inRainMan. For example, for humans capable ofdocument processing, the Worklist Client currentlyhandles incoming Document_Processing Task requestsvia a specialized Task Handler that can locate areferenced document from a network data store andlaunch the necessary local application (word processor)needed by the human to interact with the document. Oncompletion, the Task Handler returns the response to itsWorklist, which in turn returns it to the requestingSource.

Since each Task request received from the Worklist ishandled locally on the client machine, disconnectedoperation is handled easily. The client applet needs toconnect to its remote Worklist only for the purposes ofreceiving and returning Activities. However, aconnection to the network may still be needed if theTask needs to reach certain data elements on the

network in the course of its execution, or if theapplications needed to perform it are not locallyavailable.

The RainMan Worklist Client Applet is significantbecause it offers a valuable proof-of-concept thatdemonstrates how human performers can receive workfrom heterogeneous workflow sources using a single,unified user interface. Current workflow systemsusually come with proprietary client applications thatprovide worklist access by pulling work from theirproprietary servers. In contrast, RainMan requires thatthe Worklist Client pull work only from its designatedWorklist on the network, to which Task requests arepushed from various workflow backends. The RainManapproach imposes a much lower burden on the WorklistClient since it no longer has to handle multipleconnections, one with each workflow server it isreceiving work from. In an interoperable workflowworld, an equivalent of the RainMan Worklist ClientApplet could replace proprietary worklist clients.

Figure 10: Worklist Client Applet

Page 8: RainMan: A Workflow System for the Internet Santanu Paul ... · Santanu Paul, Edwin Park, and Jarir Chaar IBM T. J. Watson Research Center P.O. Box 704 Yorktown Heights, NY 10598.

Task priorities and deadlines, while not yetimplemented in RainMan, can be supported. Suchconstraints, based on convention and agreementsbetween Sources and Performers, can be specified aspart of the Task request message sent to a Performer. APerformer unable to meet these constraints may refuseto service the request. Similarly, a Source may use thedeadline information associated with an Activity to timeout on a Performer that does not respond by thedeadline. However, attempts to generalize the behaviorof arbitrary Sources and Performers with respect topriorities and deadlines leads to the broader question ofnegotiation protocols between Sources and Performers.Such protocols, while immensely useful, have not beenstudied in the context of the RainMaker framework.

5.2 RainMan Runtime Environment

The RainMan runtime environment provides acollection of distributed services that run on the Internetinfrastructure. The user components described in theprevious section allow workflow users - workflowdesigners, participants, and administrators - to interactwith these distributed services on the network. Theservices currently available are the Directory Service,the Worklist Service, and a variety of Performers.

5.2.1 RainMan Directory Service

The RainMan directory service plays the key role of atrading service in the RainMan system. It containsinformation about Performers on the network and theircapabilities. When a Source needs to issue a Taskrequest to a Performer, it first performs a lookupoperation on the directory service to locate theappropriate Performer on the network. The directoryservice interface also provides methods to register andunregister Performers. Using the Administrator applet, aRainMan administrator can manage the directoryservice and its contents. Mobile Performers use thedirectory service to locate their Worklists as well, thusenabling true location-independent workflowparticipation. For expediency, the RainMan directoryservice is currently prototyped as a custom Javaapplication. We are planning an LDAP-based[Yeong95] directory service for the next version of theRainMan prototype.

For widespread workflow deployment on the Internet, itis necessary for the RainMan directory service to bedistributed. The design of distributed directory servicessuch as DNS [Mock87] and X.500 [CCITT] can beused as a guideline for RainMan directory design.RainMan directories in individual domains (a domaincould be an organization, a geographical location, a

collection of smaller domains, etc.) can behierarchically arranged so that Sources can findPerformers across domains. Performers within a domainwould always be registered with their local RainMandirectory. Sources would always lookup their localRainMan directories for Performers; the localdirectories could in turn access other RainMandirectories by traversing the directory hierarchy tolocate matching Performers. A related example is thatof the Corba Trading Service [OMG97]; multiple CorbaTraders can be explicitly linked into a network fortransparent navigation.

5.2.2 RainMan Worklist Service

In workflow systems, worklists are inboxes associatedwith humans. In RainMan, a Worklist is a Java objectthat implements the RainMaker Worklist andPerformerAgent interfaces. Therefore, a RainManWorklist is a Performer that is owned by and representsa human on the network (this is not a limitingassumption - Worklists can easily represent applicationsas well). Worklists offer persistent storage of Taskrequests posted to humans from Sources.

Client

WAN

Worklist

Source

Source

Performer

Source

Source

get/complete

Figure 11: Reusable Worklists

In RainMan, Worklists are treated as addressablenetwork objects, and provide a level of separationbetween Sources and actual human performers, whichmakes asynchronous exchange of requests andresponses between them possible. In other words,requests can be posted to a Worklist on the networkeven when the human performer is not connected, andthe human can access and perform them withoutconnecting to the Source. Most importantly, in contrastto traditional workflow systems, a RainMan Worklist isa first class entity that can be reused by multipleSources, thus eliminating the need for explicit,dedicated connections to workflow servers on the partof the performer (see Figure 11). This is in sharpcontrast to the architecture shown in Figure 4.

Page 9: RainMan: A Workflow System for the Internet Santanu Paul ... · Santanu Paul, Edwin Park, and Jarir Chaar IBM T. J. Watson Research Center P.O. Box 704 Yorktown Heights, NY 10598.

Worklists on the network are managed by anindependent RainMan Worklist Service. At the presenttime, the Worklist Service consists of a single Javaapplication on the network, called a Worklist Server,that manages a large pool of Worklists. The WorklistServer is analogous to a POP [Myers96] or an IMAP[Cris93] server that stores e-mail boxes for multipleusers. The clear separation of the Worklist Service fromthe Workflow Server is a novel design point inRainMan. This is useful because the Worklist Service isnow streamlined to perform a dedicated function in anautonomous manner, and a Worklist Server can run ondedicated powerful computational resources thatguarantee performance, availability, security, andreliability. Furthermore, since Worklists are practicallyindependent of each other, we expect it to be relativelyeasy in this architecture to address scalability in termsof distributed workflow participants by implementingthe Worklist Service as a distributed service; newWorklist Servers can be added to the network to hostWorklists as the number of participants increases. Weplan to experiment with these issues in the near future.

5.2.3 RainMan Performers

The Performer abstraction is useful in modelinghumans, software applications, groups and roles,workflow servers, and entire organizations that performTasks on behalf of a workflow (see Figure 12). As wehave seen, Worklists act as Performers for humans. Inaddition, we are building a host of other Performers thatcan be used by RainMan workflows. An interestingPerformer class is the SMTPGatewayPerformerthat allows RainMan workflows to send out e-mail overthe Internet. This Performer is an SMTP client writtenin Java that implements the PerformerAgent interface. Itcan receive SendEmail Task requests from Sources,connect to a SMTP server, and submit outgoing e-mailrequests. Another implemented Performer class is theDatabaseQueryPerformer that can receive SQLqueries from a Source and interface with a relationaldatabase at the back end via JDBC. APalmPilotPerformer class has also beenimplemented that allows Worklist access from a USRobotics Palm Pilot.

With the growing use of cell phones, pagers, and PDAs,it is conceivable that a Performer and its humanparticipant may communicate via other metaphors suchas publish/subscribe (also known as observable/observer or model/view), and its variations. ThePerformer and the human may be co-located on a singlemachine, or communicate over distances via a widevariety of networks (e.g., wireless, infrared, and so on).

In workflows, groups and roles are used to distributeTask requests to participants according to certainpolicies. The commonly supported scenario in currentworkflow systems is the case where an Activity isassigned to all members of a role, say insuranceunderwriters, and once a role member assumesresponsibility for the Activity, it is retracted from theother role members. In RainMan, we take the view thata wide variety of distribution and retraction policiesmay be meaningful, depending on the application. Forexample, consider a company that wishes to post aRequest_for_Quotation to each of its suppliers (thesecan be modeled as Performers). The company may wishto get results back from each of these suppliers before itcan proceed with the next step in its workflowapplication. Alternately, it may just wish to receiveresponses from a ‘majority’ of its suppliers. We aredesigning concrete classes for Performers thatimplement such Task distribution and retraction policiesbased on roles; additional classes can be implementedbased on the needs of specific applications.

Source Human Performer

Application Performer

WorkflowServer Performer

Role PerformerPerformer1

Performer2

Figure 12: Heterogeneous Performers

6. Considerations in Workflow

System Design

6.1 Performance

To the best of our knowledge, there are no publisheddata on the performance and scalability of commerciallyavailable, proprietary workflow systems. However, withthe increasing deployment of these workflow systems,concerns have been raised about the lack of adequateperformance and scalability in even ‘production’workflow systems (an industry label for workflowsystems that specifically address the automation ofhigh-volume, highly repetitive processes such as thosein banks and insurance companies).

Our view is that the Internet will further aggravate theperformance problems of centralized workflowarchitectures as they try to interoperate.WfMC-compliant, monolithic workflow servers are

Page 10: RainMan: A Workflow System for the Internet Santanu Paul ... · Santanu Paul, Edwin Park, and Jarir Chaar IBM T. J. Watson Research Center P.O. Box 704 Yorktown Heights, NY 10598.

designed to handle process management, worklistmanagement, auditing, and directory servicessimultaneously for hundreds of workflow instances atany given time. This architecture may performacceptably as long as the number of participants islimited. However, the performance will degrade rapidlyas the number of participants grows large.

The field of workflow urgently needs meaningfulperformance benchmarks that can be used to evaluateexisting systems and their architectures. Until suchbenchmarks are established, it is not meaningful tocompare quantitatively RainMan’s performance withthat of traditional architectures. However, we believethat a compelling qualitative argument in favor ofRainMan can still be made. The RainMan designdismantles the traditional monolithic server into acollection of related components. This can help ineliminating the potential performance bottlenecks ofworkflow servers. For example, the clear separationbetween process management (Source-side) andworklist management (Performer-side) will allow eachof these to be optimized independently. As more humanPerformers join the RainMan system, additionalWorklist Servers can be pressed into service at differentparts of the network. In a traditional system, additionalworklists would all be added on the central workflowserver, thus burdening the server itself. In contrast, theRainMan approach has no impact on the Sources andtheir performance.

6.2 Failure Handling and Compensations

Much of workflow research in recent years has focusedon the transactional aspects of workflows [Rus94, Al96,Ley95]. The basic objective is to ensure therecoverability of workflows, since workflows arelong-running applications that can execute over days,weeks, or even months. It is fairly well-accepted indatabase transactions literature that classical flat ACIDtransactions are not viable in the context oflong-running applications. Workflow researchers havethus borrowed concepts from nested transactions, sagas,and spheres of compensation to address the needs ofworkflows. While many of these ideas remain untestedin commercial systems, they appear to be viable in thecontext of commercial workflow systems of todaywhere the workflow server can exercise significantcontrol over workflow participants.

The decentralization of the workflows, as proposed inRainMan, alters the picture significantly. In particular,if Source workflows are to execute on the Internet, theautonomy of Performers and their non-proximity toeach other and to the Source itself must be taken as agiven. A Source in this context has no global authority

or jurisdiction over remote, heterogeneous, autonomousPerformers, and has very limited visibility of theirinternal resources and mechanisms. In effect, all notionsof compensation must be addressed in terms ofcommitments between peers; in effect, the interactionbetween a Source and a Performer must be sufficientlyrich to handle failures and provide compensations forpast services as necessary. To use a classical example, aSource may be a Travel Booking workflow thatinteracts with a Performer such as a Hotel Server. TheHotel Server may provide two basic operations -reserve and cancel - where cancel is a compensation fora reserve. Because of a subsequent change in travelplans, the Source may return to the Performer with acancel request on its past reserve request. The HotelServer, within the limits of its service contract, wouldhave to fulfill this request. The idea of long-runningconversations between autonomous network entitiesthat engage in business transactions has been exploredin the Coyote project [Dan97]. The Coyote view thatmeaningful business transactions can occur despitelimited authority of each participant is a useful one forthe Internet; we are exploring how RainMan Sourcesand Performers can benefit from the idea ofconversations.

6.3 Decentralized Execution

Traditional workflow systems are based on a model ofcentralized workflow execution; the workflow system isresponsible for managing workflow coordination aswell as activity execution by invoking resources orparticipants entirely within its scope of authority -applications, other workflow servers, or humanworklists.

A diametrically opposite model of workflow executionthat can decentralize both workflow coordination andactivity execution has been proposed in the context ofthe Arjuna project [Ran97]. This execution modeldecentralizes the coordination of a process by installing‘task controller’ objects in different domains thatcoordinate with each other to deliver workflow routingfunctionality. Each task controller is a workflow‘router’ that understands a piece of the overall workflowgraph. This execution model eliminates a central pointof failure in a workflow; moreover, workflows canproceed even in the face of partial network failures. Themain consequence of the Arjuna approach is thatdecentralized workflow control requires participantdomains (i.e. service provider domains) to participate inworkflow routing on behalf of the workflow using apre-agreed coordination language (a workflow routingprotocol); this imposes computational burdens onparticipant domains that would be unacceptable if the

Page 11: RainMan: A Workflow System for the Internet Santanu Paul ... · Santanu Paul, Edwin Park, and Jarir Chaar IBM T. J. Watson Research Center P.O. Box 704 Yorktown Heights, NY 10598.

domains were autonomous. Decentralized control canalso be expensive to manage; it is harder to maintainglobal state and make dynamic changes to the workflowwhen the workflow script itself is decentralized.

The RainMan execution model strikes a middle ground.It separates the responsibility of workflow coordinationfrom activity execution by creating two classes ofentities, Sources and Performers. In effect, while thecoordination of each process remains localized within aSource object, the actual execution of activities isdecentralized across a network of Performers overwhich Sources have very limited control. The leveragein this model arises from the ability of heterogeneousSources to share heterogeneous, autonomousPerformers. This approach respects the autonomy ofeach Performer, assumes that the environments in whichSources and Performers execute will necessarily beheterogeneous, and makes it easier to keep track ofglobal state as well as support dynamic workflowmodifications.

6.4 Security Considerations

For workflows to run across wide area networks andespecially across organizations, multiple securityconcerns must be addressed. First, an authenticationmechanism must exist to validate the identity of bothSource and Performer domains. This would allow basicfunctions such as Worklist access and Performerinvocation to be done in a secure fashion only byauthorized users or components. Second, access controlrights need to be described and enforced in a scalablefashion to control access to methods on Performers andSources. Third, the integrity and privacy of Taskrequests and responses exchanged between Sources andPerformers should be maintained. Finally, support fornonrepudiability and enforcement of terms andconditions is needed. We are currently exploring theseissues by drawing from the state-of-practice indistributed systems security. Many of these problemscan be easily alleviated in the case of workflowsbetween trusted parties by setting up private channels(Intranets or Extranets) between the participantindividuals and organizations.

7. Conclusions

Our research is directed at designing an Internetworkflow infrastructure that is scalable, flexible, andinteroperable. This is a relevant and important problemsince individuals and organizations are rapidly gettinginterconnected. This widespread interconnectivity can

be exploited to enable new kinds of process-basedapplications.

The RainMaker framework defines the essentialabstractions of a workflow system and facilitatesinterpretable workflow components. Using theRainMaker framework, we have implemented RainMan,a distributed, object-oriented workflow system writtenin Java. Workflow management, activity distribution,directory services, and worklist management are alltreated as independent services that work together todeliver workflow functionality to Internet users. This isa radical departure from traditional workflow systemsbased on monolithic, server-centric architectures. TheRainMan system uses open standards and Web-browserbased user interface components. The system is beingused to experiment with a range of interesting featuressuch as decentralized workflow execution, dynamicworkflow modification, and disconnected participation.

While RainMan has been designed as an infrastructurefor workflow execution, it offers insights into thebroader problem of designing long-running applicationson a network. In effect, RainMan highlights theimportance of separating the responsibilities of servicerequesters (in this case, workflows) from serviceproviders (in this case, humans, applications,organizations, etc.) via clean interfaces (i.e.SourceAgent and PerformerAgent), and assuming thatentities that implement these interfaces areheterogeneous and autonomous. It offers a peer-to-peerTask delegation model with a nice recursive behavior; aPerformer that receives Task requests from a Sourcecan itself act as a Source of Tasks for other Performerson the network.

8. Acknowledgments

We thank David Hutches and Sastry Duri for variousdiscussions on workflows and compensations. We alsothank Carl Staelin and the reviewers of this paper fortheir valuable comments.

Page 12: RainMan: A Workflow System for the Internet Santanu Paul ... · Santanu Paul, Edwin Park, and Jarir Chaar IBM T. J. Watson Research Center P.O. Box 704 Yorktown Heights, NY 10598.

Action Technologies, Action Workflow,http://www.actiontech.com

[ActWork]

G. Alonso, D. Agrawal, A. El Abbadi, M.Kamath, R. Gunthor, and C. Mohan,Advanced Transactional Models inWorkflow Contexts, In Proceedings ofICDE, 1996.

[Al96]

CCITT/ISO, X.500, The Directory -Overview of Concepts, Models andServices, CCITT/ISO IS 9594.

[CCITT]

M. Crispin, IETF RFC 2060, InternetMessage Access Protocol version 4 rev 1,December 1993

[Cris93]

A. Dan, and F. Parr, The Coyote Approachto Network Centric Service Applications,7th International Workshop on HighPerformance Transaction Systems,Asilomar, California, September 14-17,1997.

[Dan97]

IBM Corporation, FlowMark Workflow,http://www.software.ibm.com/ad/flowmark

[FlMa]

M. Hamalainen, A. B. Whinston, and S.Vishik, Electronic Markets for Learning:Education Brokerages on the Internet,CACM, Vol. 39, Number 6, June 1996.

[Hama96]

InConcert Inc., InConcert Workflow,http://www.inconcert.com

[InConc]

F. Leymann, Supporting BusinessTransactions via Partial BackwardRecovery in Workflow Management, inProceedings of BTW’95, Dresden,Germany, 1995, Springer Verlag.

[Ley95]

P. Mockapetris, IETF RFC 1034/1035,Domain Names - Concepts and Facilities,Implementation and Specification,November 1987.

[Mock87]

J. Myers, and M. Rose, IETF RFC 1939,Post Office Protocol - version 3, May1996.

[Myers96]

OMG Trading Object Service, CORBAServices: Common Object ServicesSpecification, Chapter 16, July 1997.

[OMG97]

S. Paul, E. Park, and J. Chaar, EssentialRequirements for a Workflow Standard,OOPSLA Workshop on Business ObjectsDesign & Implementation, October 6th,1997,http://www.tiac.net/users/jsuth/oopsla97

[Paula97]

S. Paul, E. Park, D. Hutches, and J. Chaar,RainMaker: Workflow Execution UsingDistributed, Interoperable Components,IBM Research Report nbr. 21008, October1997.

[Paulb97]

F. Ranno, S.K. Shrivastava and S.M.Wheater, A System for Specifying andCoordinating the Execution of ReliableDistributed Applications, InternationalWorking Conference on DistributedApplications and Interoperable Systems(DAIS’97), September 1997.

[Ran97]

M. Rusinkiewicz and A. Sheth,Specification and Execution ofTransactional Workflows, In W. Kim,Editor, Modern Database Systems: TheObject Model, Interopreability andBeyond, ACM Press, 1994.

[Rus94]

W. Schulze, M. Bohm, and K. Meyer-Wegener, Services of Workflow Objectsand Workflow Meta-objects in OMGcompliant Environments, OOPSLAWorkshop on Business Objects Design andImplementation, 1996.

[Schu96]

B. Silver, The BIS Guide to WorkflowSoftware, BIS Strategy Decisions, OneLongwater Circle, Norwell, MA 02061,1995.

[Silver95]

FileNet Corporation, FileNet VisualWorkflo, http://www.filenet.com/products/vwtext.html

[VisFlo]

Workflow Management Coalition,http://www.aiai.ed.ac.uk/WfMC

[WfMC]

W. Yeong, T. Howes, and S. Kill e, IETFRFC 1777, Light Weight Directory Access,March 1995.

[Yeong95]

References