
ISSUES IN DISTRIBUTED PROGRAMMING LANGUAGES: THE EVOLUTION OF SR (CONCURRENT).

Item Type text; Dissertation-Reproduction (electronic)

Authors Olsson, Ronald Arthur

Publisher The University of Arizona.

Rights Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.

Download date 08/03/2021 21:09:57

Link to Item http://hdl.handle.net/10150/183888


INFORMATION TO USERS

This reproduction was made from a copy of a manuscript sent to us for publication and microfilming. While the most advanced technology has been used to photograph and reproduce this manuscript, the quality of the reproduction is heavily dependent upon the quality of the material submitted. Pages in any manuscript may have indistinct print. In all cases the best available copy has been filmed.

The following explanation of techniques is provided to help clarify notations which may appear on this reproduction.

1. Manuscripts may not always be complete. When it is not possible to obtain missing pages, a note appears to indicate this.

2. When copyrighted materials are removed from the manuscript, a note appears to indicate this.

3. Oversize materials (maps, drawings, and charts) are photographed by sectioning the original, beginning at the upper left hand corner and continuing from left to right in equal sections with small overlaps. Each oversize page is also filmed as one exposure and is available, for an additional charge, as a standard 35mm slide or in black and white paper format. *

4. Most photographs reproduce acceptably on positive microfilm or microfiche but lack clarity on xerographic copies made from the microfilm. For an additional charge, all photographs are available in black and white standard 35mm slide format. *

*For more information about black and white slides or enlarged paper reproductions, please contact the Dissertations Customer Services Department.

UMI Dissertation Information Service

University Microfilms International
A Bell & Howell Information Company, 300 N. Zeeb Road, Ann Arbor, Michigan 48106


8623877

Olsson, Ronald Arthur

ISSUES IN DISTRIBUTED PROGRAMMING LANGUAGES: THE EVOLUTION OF SR

The University of Arizona

Ph.D. 1986

University Microfilms International
300 N. Zeeb Road, Ann Arbor, MI 48106

Copyright 1986 by Olsson, Ronald Arthur

All Rights Reserved


Issues in Distributed Programming Languages:

The Evolution of SR

by

Ronald Arthur Olsson

A Dissertation Submitted to the Faculty of the

DEPARTMENT OF COMPUTER SCIENCE

In Partial Fulfillment of the Requirements For the Degree of

DOCTOR OF PHILOSOPHY

In the Graduate College

THE UNIVERSITY OF ARIZONA

1986

© Copyright 1986 Ronald Arthur Olsson


THE UNIVERSITY OF ARIZONA GRADUATE COLLEGE

As members of the Final Examination Committee, we certify that we have read the dissertation prepared by Ronald Arthur Olsson entitled Issues in Distributed Programming Languages: The Evolution of SR and recommend that it be accepted as fulfilling the dissertation requirement for the Degree of Doctor of Philosophy.


Final approval and acceptance of this dissertation is contingent upon the candidate's submission of the final copy of the dissertation to the Graduate College.

I hereby certify that I have read this dissertation prepared under my direction and recommend that it be accepted as fulfilling the dissertation requirement.

Dissertation Director Date



Statement by Author

This dissertation has been submitted in partial fulfillment of requirements for an advanced degree at The University of Arizona and is deposited in the University Library to be made available to borrowers under rules of the Library.

Brief quotations from this dissertation are allowable without special permission, provided that accurate acknowledgement of source is made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the copyright holder.

SIGNED:


Acknowledgements

I am deeply indebted to my advisor, Greg Andrews. He has made immeasurable contributions to this research and to my development as a researcher. I thank him for the enormous amount of time and effort he has spent working with me. His ideas and insightful suggestions have greatly enriched the technical content of this work; his persistent pursuit of clarity has greatly enriched its presentation.

I am also grateful to the other members of my committee. Rick Schlichting gave feedback on this work as it developed over the years. He and Pete Downey each carefully read earlier drafts of this dissertation. Their constructive criticisms improved the technical quality of this work and the lucidity of its presentation. I also thank the minor members of my committee, Ted Williams and Fred Hill, for their assistance with my graduate program.

I am also indebted to my fellow graduate students who helped implement SR: Mike Coffin, Irv Elshoff, Kelvin Nilsen, and Titus Purdin. They and the other members of the Saguaro project (Stella Atkins, Nick Buchholz, Roger Hayes, and Steve Manweiler) provided useful feedback on SR. Phil Kaslo also assisted by explaining details of the implementation of the original SR.

Finally, I thank my family and friends for their support, understanding, and constant encouragement over the years. I would like to acknowledge each one's contributions individually, but to do so would require more pages than in this dissertation. I hope it is sufficient to say that I truly appreciate their help.


TABLE OF CONTENTS

Abstract
1 Introduction
  1.1 Programming Network Computers
  1.2 The New SR
  1.3 Dissertation Organization
2 Issues in Distributed Programming Languages
  2.1 Program Structure
    2.1.1 Component Structure
      2.1.1.1 Modules and Processes
      2.1.1.2 Separating Module Specification and Implementation
      2.1.1.3 Shared Variables
    2.1.2 Module Activation
    2.1.3 Module Placement
  2.2 Communication and Synchronization
    2.2.1 Process Structure and Message-Passing Primitives
    2.2.2 Synchronizing Access to Shared Variables
    2.2.3 Additional Communication Mechanisms
  2.3 Coping With Failures
3 SR Language Overview
  3.1 Global Components and Resources
  3.2 Operations and Communication Primitives
    3.2.1 Basic Invocation Statements
    3.2.2 Servicing Operations
    3.2.3 Semantics of Shared Operations
    3.2.4 Additional Communication Primitives
  3.3 Failure Handling
  3.4 Types, Declarations, and Sequential Statements
  3.5 Signatures and Type Checking
  3.6 Implementation Specific Mechanisms
    3.6.1 Input/Output in the UNIX Implementation
    3.6.2 Device Control in Stand-Alone Implementations
      3.6.2.1 Data Types and Variable Declarations
      3.6.2.2 Operations and Interrupt Handlers
      3.6.2.3 Sketch of a Disk Driver
4 Examples
  4.1 Sort Program
  4.2 N ≤ 8 Queens
    4.2.1 Sequential Solution
    4.2.2 Adding Concurrency
  4.3 Bounded Buffer
  4.4 Dining Philosophers
    4.4.1 Centralized Approach
    4.4.2 First Decentralized Approach
    4.4.3 Second Decentralized Approach
  4.5 Network Topology
  4.6 Components of the Saguaro File System
5 Implementation Overview
  5.1 Supporting Separate Compilation
  5.2 Resource Creation and Destruction
  5.3 Operations
    5.3.1 Invocation Statements
    5.3.2 The Input Statement
    5.3.3 Optimizations
    5.3.4 Completion Status and Failed
  5.4 Status, Plans, and Statistics
6 Discussion
  6.1 Integration of Language Constructs
  6.2 Global Components and Resources
    6.2.1 Global Components
    6.2.2 The Resource as an Abstraction
    6.2.3 Resource Initialization and Finalization
    6.2.4 Resource Parameters
    6.2.5 Import Mechanism
    6.2.6 Resource Code Loading
  6.3 Operations
    6.3.1 Operation Declarations
    6.3.2 Operation Invocation
    6.3.3 Operation Implementation
  6.4 Issues Related to Program Distribution
  6.5 Sequential Control Statements
  6.6 Caveats to the Programmer
7 Conclusions
  7.1 Summary
  7.2 Future Research
Appendix A: Synopsis of the SR Language
Appendix B: Manual Pages
References


Abstract

This dissertation examines fundamental issues that face the designers of any distributed programming language. It considers how programs are structured, how processes communicate and synchronize, and how hardware failures are represented and handled. We discuss each of these issues and argue for a particular approach based on our application domain: distributed systems (such as distributed operating systems) and distributed user applications. We conclude that a language for such applications should include the following mechanisms: dynamic modules, shared variables (within a module), dynamic processes, synchronous and asynchronous forms of message passing, rendezvous, concurrent invocation, and early reply.

We then describe the current SR language, which has evolved considerably based on our experience. SR provides the above mechanisms in a way that is expressive yet simple. SR resolves the tension between expressiveness and simplicity by providing a variety of mechanisms based on only a few underlying concepts. The main language constructs are still resources and operations. Resources encapsulate processes and the variables they share; operations provide the primary mechanism for process interaction. One way in which SR has changed is that both resources and processes are now created dynamically. Another change is that all the common mechanisms for process interaction (local and remote procedure call, rendezvous, dynamic process creation, asynchronous message passing, and semaphores) are now supported by a novel integration of the mechanisms for invoking and servicing operations. Many small and several larger examples illustrate SR's mechanisms and the interplay between them; these examples also demonstrate the language's expressiveness and flexibility.

We then describe our implementation of SR. The compiler, linker, and run-time support are summarized. We then focus on how the generated code and run-time support interact to provide dynamic resources and to generate and service invocations. We also describe optimizations for certain operations. Measurements of the implementation's size and cost are given. The implementation has been in use since November 1985 and is currently being improved.

Finally, we justify SR's syntax and semantics and examine how its mechanisms compare to other approaches to distributed programming. We also discuss how SR balances expressiveness, simplicity, and efficiency.


CHAPTER 1

Introduction

A concurrent program is a collection of processes and objects they share. Each process executes a sequential program. The processes execute concurrently and cooperate, using the shared objects, to perform a particular task. A shared object is either memory or a message. A concurrent program may be executed in hardware environments ranging from one processor shared by all processes to one processor for each process. On a single processor, the required concurrency is achieved by executing processes one at a time in an interleaved manner; at the other extreme, it is achieved simply by the inherent parallel execution. A distributed program is a concurrent program that executes on one or more processors that communicate solely by exchanging messages. A common kind of hardware environment for distributed programs is provided by a network computer: a collection of processors connected by a high-speed local-area network. We are interested in languages for programming network computers.
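These definitions can be made concrete with a small sketch. The following is illustrative Python (not SR, and not from the dissertation): two "processes", here simulated by threads, cooperate on a task while interacting only through message queues, never through shared variables.

```python
# Illustrative sketch only (Python threads standing in for processes).
# The two sides share no variables; they cooperate solely by exchanging
# messages through queues, as in the message-passing model described above.
import threading
import queue

def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Repeatedly receive a list of numbers and reply with its sum; None stops."""
    while True:
        msg = inbox.get()          # receive a message
        if msg is None:            # conventional shutdown message
            break
        outbox.put(sum(msg))       # send the result back as a message

def run_demo() -> int:
    inbox: queue.Queue = queue.Queue()
    outbox: queue.Queue = queue.Queue()
    t = threading.Thread(target=worker, args=(inbox, outbox))
    t.start()
    inbox.put([1, 2, 3])           # distribute two parts of the task
    inbox.put([4, 5])
    total = outbox.get() + outbox.get()
    inbox.put(None)                # ask the worker to terminate
    t.join()
    return total

if __name__ == "__main__":
    print(run_demo())              # prints 15
```

On a real network computer the two sides would run on separate processors and the queues would be replaced by actual network messages; the structure of the interaction is the same.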

Because a network computer generally contains several processors, it has the potential of providing greater levels of concurrency and reliability than can a stand-alone processor. A program might execute faster, for example, if it is split into parts that are executed on separate processors. Also, a program executing on a network computer might be able to continue executing correctly even if one of the processors on which it is executing fails. Should the processor fail by crashing, its role might be assumed by some other processor or processors in the network. Should the processor fail by giving erroneous results, the other processors could outvote it. On the other hand, when a stand-alone processor fails, its programs would simply stop or would continue and give further erroneous results. A network computer has further potential for reliability because the likelihood that some of its processors will be operational is greater than the likelihood that a stand-alone processor will be operational. However, besides processor failures, a network computer is also susceptible to failures of the communication network. Although such failures can effectively form partitions of the network computer's processors, its reliability may still be better than that of a stand-alone processor.
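The outvoting idea mentioned above can be stated concretely. The following is a hypothetical sketch (ours, not the dissertation's) that picks the value reported by a majority of replicated processors, masking one erroneous result among three:

```python
# Hypothetical majority-voting sketch: replicas is the list of results
# reported by processors that each computed the same request.
from collections import Counter

def vote(replicas):
    """Return the majority value, or None if no strict majority exists."""
    value, count = Counter(replicas).most_common(1)[0]
    return value if count > len(replicas) // 2 else None

if __name__ == "__main__":
    print(vote([42, 42, 7]))   # the faulty processor's 7 is outvoted: 42
    print(vote([1, 2, 3]))     # no majority: None
```

With three replicas this masks a single erroneous processor; tolerating f such failures requires 2f + 1 replicas in this scheme.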

One additional advantage provided by a network computer is that it can allow increased access to peripherals and increased availability of information. For example, if a disk is attached to one of the processors in the network, the other processors can communicate with that processor to read and write (indirectly) information on its disk. This has the benefit of reducing overall computing costs: separate disks need not be purchased for each processor. However, if the processor to which the disk is attached should crash, the disk and its information would not be available to any of the processors, probably preventing further productive activity in the network computer. This problem is typically solved by having several disks, each attached to different processors, and replicating important information on some or all of these disks. Note, though, that this kind of solution is not perfect: the information needed by a processor might not be on its own disk (if it has one) nor available from other processors because they or the network crashed.

The potential provided by a network computer has resulted in distributed systems that offer their users much of the same potential. Distributed operating systems and distributed database systems offer their users increased reliability, better performance, and increased availability of data (i.e., file system or database). In addition, the system designers have the flexibility of replicating or distributing the data present in the system or allowing the users of the system to make such decisions. Other applications, such as numerical algorithms and searching algorithms used in some AI programs, generally take advantage of only the additional concurrency. New applications are sure to appear that will further exploit the potential of network computers.

This dissertation examines the topic of distributed programming on network computers. We are concerned with the programming problems presented by such a hardware environment and what mechanisms a language should provide to solve these problems. We survey existing languages and present a new one that we feel has many advantages over existing languages.

1.1. Programming Network Computers

There are two main approaches to distributed programming: what we call the primitives approach and the language approach. In this section, we describe these approaches and give a few examples of each.

In the primitives approach, a distributed program consists of several independent programs, one or more located on each node in the network. These programs communicate using operating system facilities or an inter-process communication (IPC) kernel like the V kernel [Cher84]. From the programmer's point of view, the language used to write distributed programs is just a sequential language augmented with message passing primitives. For example, many of the "remote" commands (e.g., rwho, ruptime, rcp) provided in Berkeley 4.3 UNIX¹ are implemented in this manner; such commands use UNIX-provided sockets to communicate with other programs (processes) on remote machines. Specifically, the rwho command, which outputs what users are logged on to what machines, is implemented using information maintained in a local table by a daemon process on each machine. Each daemon process periodically broadcasts the names of the users currently logged on to its machine. It also receives such broadcast messages from daemons on other machines and updates its table accordingly. The daemon processes are coded in C and use sockets to send and receive the broadcast messages. Note that the actual rwho command is executed entirely on the local machine; it simply gathers its output from the table maintained by the local daemon process.

¹ UNIX is a trademark of AT&T Bell Laboratories.
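The daemon's bookkeeping is simple enough to sketch. The following is a hypothetical Python rendering of the idea (the real rwho daemon is written in C and uses UNIX broadcast sockets; the message format, host names, and function names here are invented for illustration):

```python
# Hypothetical sketch of rwho-style bookkeeping. Sending and receiving
# over broadcast sockets is elided; only the table maintenance is shown.
import json
import time

def encode_status(host, users):
    """Build the message a daemon periodically broadcasts for its own machine."""
    return json.dumps({"host": host, "users": users, "sent": time.time()}).encode()

def update_table(table, packet):
    """Record a broadcast received from another machine's daemon."""
    msg = json.loads(packet.decode())
    table[msg["host"]] = {"users": msg["users"], "heard": msg["sent"]}

def rwho(table):
    """The rwho command itself is purely local: it just reads the table."""
    return sorted((host, user)
                  for host, entry in table.items()
                  for user in entry["users"])

if __name__ == "__main__":
    table = {}
    update_table(table, encode_status("hostA", ["alice"]))
    update_table(table, encode_status("hostB", ["bob", "carol"]))
    print(rwho(table))
```

Each entry also records when the broadcast was heard, which lets a real daemon age out machines that have stopped broadcasting.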

In the language approach, a distributed program is a single program, pieces of which are placed on different nodes in the network. Furthermore, these pieces communicate using mechanisms provided by the distributed programming language in which the program is written. Some distributed programming languages have been developed from existing languages by adding special mechanisms for distributed programming. For example, Concurrent C [Geha85] extends C, StarMod [Cook80] extends Modula [Wirt77], and EPL [Blac84] extends Concurrent Euclid [Holt83]. On the other hand, some other languages have been developed with distributed programming as a major design goal from the beginning. Examples of such languages include Ada [Ada83], Argus [Lisk83a], Linda [Gele85], NIL [Parr83, Stro83], and the original SR language [Andr81, Andr82b], which we now call SR0.

There are of course significant differences between different distributed programming languages. Many of these differences can be attributed to the application domain of the language, i.e., the problems that the language is intended to solve. For example, Argus is intended for programming transaction processing systems and SR0 is intended for programming both distributed systems software (such as an operating system for a network computer) and distributed user applications. Argus and SR0 support different views of the hardware environment and provide different language mechanisms, but each is appropriate for its application domain. Among languages with similar application domains, significant differences include the form and semantics of the mechanisms included in a particular language and how they are integrated into the language. The various distributed programming languages are examined in Chapter 2.

Of the above two approaches to distributed programming, we feel that the primitives approach is less attractive than the language approach for three reasons. First, the primitives approach is too low-level. For example, messages are not type checked, which makes programming less secure, and are often limited or fixed in length, which makes programming more difficult. Second, the primitives approach can be system dependent. For example, the primitives might depend on the underlying operating system for their implementation. Thus, the run-time support for a language that uses these primitives would need to include the underlying operating system. A distributed operating system written in such a language would therefore be inherently inefficient. It also might be prevented by the underlying operating system from directly controlling hardware devices. On the other hand, the implementation of a distributed programming language can be realized without underlying operating system support. Finally, it is often difficult to integrate the primitives with the rest of the language; their form, types of parameters, and semantics might differ considerably from similar constructs in the language. We will be primarily concerned with the language approach in the remainder of this dissertation; however, we will discuss some mechanisms in the primitives approach for comparison purposes.

1.2. The New SR

As mentioned above, SR0 is a distributed programming language intended for programming both distributed systems software and distributed user applications. We used SR0 to program prototypes of several major components of the Saguaro distributed operating system [Andr86] and several modest size distributed applications. This experience substantiated the general appropriateness of the language, but also pointed out several deficiencies. These deficiencies included:

(1) The static nature of resources (modules) and processes greatly complicated the solutions to some problems that could easily be solved if these constructs were instead dynamic.

(2) The semantics of procedures allowed at most one instance of the procedure to be active at a time; hence recursion was not possible. In addition, procedures and operations transmitted results via parameters instead of return values.

(3) The lack of a failure handling mechanism required the programmer to program time-out facilities. This low-level approach greatly complicated programs concerned with failures.

(4) The language was too terse. For example, many programmers found SR0 cumbersome to use because it does not contain a control statement for definite iteration; such control must be programmed explicitly using indefinite iteration.

Consequently, we reexamined the entire language and redesigned several parts. The revised SR has the same flavor and application domain as SR0, but provides more power and flexibility. The main language constructs, resources and operations, remain the same, although they are now used in different ways.

The redesign of SR has been guided by three major concerns: expressiveness, sim­

plicity, and efficiency [Hoar73]. By expressiveness we mean that it should be possible to

solve distributed programming problems in the most straightforward possible way. This

argues for having a flexible set of language mechanisms, both for writing individual

modules and for combining modules to form a program. Three factors make distributed

programs generally much more complex than sequential programs. First, sequential pro­

grams usually have a hierarchical structure; distributed programs often have a web-like

structure in which components interact more as equals than as master and slave. Second,


sequential programs usually contain a fixed number of components since they execute on a

fixed hardware configuration; distributed programs often need to grow and shrink dynami­

cally in response to changing hardware configurations and hardware failures. Finally,

sequential programs have a single thread of control; distributed programs have multiple

threads of control. Thus, a distributed programming language necessarily contains more

language mechanisms than a sequential programming language.

One way to make a language expressive is to provide a plethora of mechanisms.

However, this conflicts with our second concern, simplicity. As Hoare has so aptly

observed, if programs are to be reliable, the language in which they are written must be

simple [Hoar81]. SR resolves this tension between expressiveness and simplicity by provid­

ing a variety of mechanisms that are based on only a few underlying concepts. Moreover,

these concepts are generalizations of those that have been found useful in sequential pro­

gramming, and they are integrated with the sequential aspects of SR so that similar con­

cepts are expressed in similar ways. The main components of SR programs are parameter­

ized resources, which generalize modules such as those in Modula-2 [Wirt82]. Resources

interact by means of operations, which generalize procedures. Operations are invoked by

means of synchronous call or asynchronous send. Operations are implemented or,

equivalently, invocations of operations are serviced by procedure-like proc's or by in state­

ments. In different combinations, these mechanisms support local and remote procedure

call, dynamic process creation, rendezvous, message passing, and semaphores-all of which

we have found to be useful. The concurrent and sequential components of SR are

integrated in numerous additional ways in an effort to make the language easy to learn and

understand and hence easy to use.
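The interplay of call, send, and invocation servicing can be sketched in ordinary threaded Python. All names below are illustrative; this models the semantics informally, not SR's implementation: a synchronous call enqueues an invocation and waits for a reply, while an asynchronous send enqueues one and returns immediately.

```python
import queue
import threading

class Operation:
    """Sketch of an SR-like operation: invocations queue up and are
    serviced by whoever reads the queue (a proc or an 'in' statement)."""
    def __init__(self):
        self._invocations = queue.Queue()

    def send(self, *args):
        # asynchronous: enqueue the invocation and return immediately
        self._invocations.put((args, None))

    def call(self, *args):
        # synchronous: enqueue the invocation and wait for a reply
        reply = queue.Queue(maxsize=1)
        self._invocations.put((args, reply))
        return reply.get()

    def service(self, handler):
        # service one invocation, like one arm of an SR 'in' statement
        args, reply = self._invocations.get()
        result = handler(*args)
        if reply is not None:
            reply.put(result)

op = Operation()
threading.Thread(target=lambda: op.service(lambda x: x * 2), daemon=True).start()
print(op.call(21))   # prints 42
op.send(99)          # asynchronous: no reply expected
```

The same queue-plus-servicer shape, specialized in different ways, yields the procedure call, rendezvous, message passing, and semaphore behaviors named above.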

A further consequence of basing SR on a small number of underlying concepts is

good performance. In particular, the size of the compiler and run-time support is much


smaller and their execution rate is much faster than would be possible for a larger, more

complex language. We have also designed the language and implemented the compiler and

run-time support in concert, revising the language when a construct was found to have an

implementation cost that outweighed its utility. Unfortunately, this feedback step is often

ignored or not possible in language design, sometimes with unpleasant results. For example, much of the complexity in Ada results from the language being frozen before its implementation was complete enough to provide feedback.

Our initial implementation of SR was completed in November 1985 and has been

in use since then. It was used successfully in two graduate classes in which students

designed and coded moderate-sized (500-1000 lines) and large (5000-6000 lines) distributed

programs. Our implementation is also currently being used to program the Saguaro distri­

buted operating system. The feedback from these uses has been encouraging and has

helped us to fine-tune the language and implementation. We expect current and future

uses of SR to further our understanding of distributed programming and distributed pro­

gramming languages.

1.3. Dissertation Organization

Any distributed programming language must be concerned with how programs

are structured, how processes communicate and synchronize, and how failures are

represented and handled. The remainder of this dissertation concerns itself with these

issues and how SR addresses them in light of the language design goals described previ­

ously. Chapter 2 discusses the above issues and describes how various languages approach

them. The issues and approaches discussed in this chapter motivate many of the mechan­

isms in SR. Chapter 3 describes the SR language itself. This chapter defines the form and

semantics of the language mechanisms and provides numerous small examples to illustrate

their use. Chapter 4 illustrates the interplay between the language mechanisms by means


of several larger examples. These examples demonstrate how many of the issues raised in

Chapter 2 are resolved in SR. The examples include several examples representative of the

kind of programming found in large distributed programs. Chapter 5 describes our imple­

mentation of SR. This chapter details how the major language mechanisms have been

implemented; it also provides some measurements of the cost of our implementation.

Chapter 6 discusses how well SR resolves the issues discussed in Chapter 2 and how well

SR meets the language design goals described above. This chapter argues for the appropri­

ateness of the specific mechanisms in SR as well as the language as a whole. Finally,

Chapter 7 contains some concluding remarks and future research suggested by this work.


CHAPTER 2

Issues in Distributed Programming Languages

The main issues addressed by any distributed programming language are how

programs are structured, how processes communicate and synchronize, and how failures

are represented and handled. This chapter explores these issues and describes how various

languages approach them. The issues and approaches discussed in this chapter motivate

many of the mechanisms in SR, which is described in Chapter 3.

The approach taken in a particular language is determined by its application

domain and its designer. The application domain dictates what functionality the language

must provide; the language designer decides how that functionality is provided. The

language design process is therefore inherently subjective: a language reflects the views of

its designer. This subjectivity explains in part why there are so many programming

languages. It, together with the relative newness of distributed programming, also explains

why the number of distributed programming languages is large and growing.

Although there are many differences between distributed programming languages,

there are at least some similarities in their basic mechanisms. These similarities consider­

ably simplify our study of different mechanisms. For example, in this chapter we shall dis­

cuss remote procedure call in general rather than look at how it is represented in each of

many different distributed programming languages.

The rest of this chapter examines the issues enumerated above: how programs are

structured, how processes communicate and synchronize, and how failures are represented


and handled. For each issue, we describe the general problem, discuss possible approaches

to the problem, and argue for a particular approach. We try to be as objective as possible

in our arguments but, as suggested above, a certain amount of subjectivity is unavoidable.

2.1. Program Structure

The structure of a distributed program can be viewed as a graph with nodes

corresponding to one or more processes and arcs corresponding to communication channels.

In this section we consider how the nodes and arcs are specified and how a program graph

is generated. In Section 2.2 we examine the internal structure of the nodes and how they

access the arcs.

2.1.1. Component Structure

2.1.1.1. Modules and Processes

The first issue in the design of any programming language is deciding what the

fundamental component is to be, i.e., the main unit from which programs are constructed.

Two possibilities exist in sequential programming languages: procedures or modules con­

taining procedures. Two similar possibilities exist in distributed programming languages:

processes or modules containing processes.

Processes are the fundamental component in many distributed programming

languages (e.g., CSP [Hoar78], NIL [Parr83, Stro83], and Concurrent C [Geha85]). This

choice is perfectly adequate for simple applications such as text-processing filters. Also,

more complex components can always be implemented by collections of interacting

processes. However, we believe that a better choice for the fundamental component is a

construct that contains processes; below we explain why.


Although some components in distributed systems are filters, most, especially in

systems programs, are servers. For example, all components in Saguaro [Andr86] are

servers that manage objects such as files and service client requests to access these objects.

Even Saguaro's command interpreters, which normally only initiate activity, also provide

operations that are used to control command execution.

Modules provide a more natural mechanism for programming servers. Using this

approach, a server can be viewed as an abstract provider of services. The details of how a

server provides a service are transparent to its clients. Also, groups of related processes can

be placed together in one module. Thus, if the implementation of a service is most easily

expressed as several cooperating processes, they can be encapsulated in the same module,

hidden from the view of the client. Moreover, the other objects-such as types, constants,

procedures, and variables-that are used in implementing a service can also appear within

the module, hidden from the outside if appropriate. Modules, therefore, impose an extra

level of abstraction above processes and other objects. This extra level of abstraction,

which consists of only modules and their interconnections, can actually make large programs easier to understand because many of the unimportant details have been hidden

within modules. With processes as the fundamental component, no such hiding is possible;

therefore, the top-level view of a large program is cluttered with unimportant details.

An example illustrating how modules provide abstraction can be seen in Saguaro.

Saguaro contains several directory managers. Among other things, these directory

managers handle requests to open files. To clients, a directory manager is a single abstract

object that services open requests. Such an abstract object can be programmed as a single

module. How an open request can be serviced within a directory manager module and

more generally how modules can be structured internally is discussed later in Section 2.2.


SR's use of modules as the fundamental component is similar to the approach

taken in procedure-based languages such as Euclid [Lamp77] and Modula-2 [Wirt82]. The fundamental component in many distributed programming languages (e.g., Distributed Processes [Brin78], StarMod [Cook80], Argus [Lisk83a], EPL [Blac84], and SRo [Andr81, Andr82b]) is a module-like construct that contains processes.

2.1.1.2. Separating Module Specification and Implementation

Most languages (e.g., Euclid, Modula, Ada) that provide modules allow them to

be written as two parts: a specification part and an implementation part (or body). The

specification part, sometimes called the definition or interface part, defines what objects a

module provides (exports) to other modules and what objects it uses (imports) from other

modules. The implementation part contains executable code that implements the services

described in the module's specification; it might also contain other, internal services.

Being able to separate a module's specification from its implementation simplifies

program development, especially for large programs. A module's specification can be

viewed as a contract between the module's writer and its users. Declarations of objects in

a module's specification tell what objects the module's user can use and tell what objects

the module's writer must provide in the body. Thus, the details of how a module provides

a service can be hidden from the user. This allows modules to be developed independently

and the implementation of a module to be changed without affecting its users or requiring

their code to be recompiled.
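The contract idea can be made concrete with a small Python sketch (the class and method names are invented for illustration): clients are written against an abstract specification alone, so the body can change without touching, or recompiling, client code.

```python
from abc import ABC, abstractmethod

# Sketch: a module 'specification part' as an abstract interface; the
# 'implementation part' (body) is supplied separately. Names illustrative.
class DirectorySpec(ABC):
    @abstractmethod
    def open(self, name):
        """Return a descriptor for the named file."""

class DirectoryBody(DirectorySpec):
    def __init__(self):
        self._next_fd = 3
        self._open_files = {}

    def open(self, name):
        fd = self._next_fd
        self._next_fd += 1
        self._open_files[fd] = name
        return fd

def client(directory):
    # depends only on DirectorySpec's contract, never on DirectoryBody
    return directory.open("notes.txt")
```

Swapping in a different body that honors the same specification leaves `client` untouched, which is the development benefit the text describes.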

The nature of distributed programming illustrates another advantage in separat­

ing specification and implementation. Sequential programs almost always have a hierarch­

ical structure. On the other hand, distributed programs often have a web-like structure in

which components interact more as equals than as master and slave. When two or more

modules are to interact as equals, their specifications are mutually dependent, i.e., a


circularity exists in their specifications. For example, in the Saguaro file system, the direc­

tory manager and file server are mutually dependent; the directory manager creates a file

server when necessary and the file server informs the directory manager when the server's

client closes its file. As a second example, mutual dependence between modules is needed

to support the upcall [Clar85], in which data flows from a server back to its client. The

only way to handle such circularities between modules, at least using standard compiler

techniques and with reasonable efficiency, is to allow the specification to be written

separately from its body. Thus, both specifications can be compiled before either body,

which provides the compiler with enough information to compile a body completely. Being

able to separate a module's specification and body is therefore essential in a distributed

programming language.
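The directory-manager/file-server circularity can be sketched in Python (the class and method names are invented to mirror the Saguaro example, not taken from it): each side is written against only the other's interface, and the file server performs an upcall to the manager when its client closes the file.

```python
# Sketch of mutually dependent modules: each references only the other's
# 'specification' (here, an informal protocol), so neither body needs the
# other's body to be written first. Names invented for illustration.
class DirectoryManager:
    def __init__(self):
        self.open_servers = []

    def open_file(self, name):
        server = FileServer(name, manager=self)   # creates a file server
        self.open_servers.append(server)
        return server

    def file_closed(self, server):
        self.open_servers.remove(server)          # target of the upcall

class FileServer:
    def __init__(self, name, manager):
        self.name = name
        self.manager = manager

    def close(self):
        self.manager.file_closed(self)            # upcall to the manager
```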

2.1.1.3. Shared Variables

It is tempting in a distributed programming language to prohibit shared variables

since this mirrors the absence of shared memory in network computers. It also avoids the

need to synchronize access to shared variables. However, the reality, especially in systems

programs, is that shared variables facilitate efficient solutions to many problems. For

example, processes in Saguaro's file servers share file descriptors and buffers. Also,

processes in disk server modules share disk buffers. In each case, however, only processes in

the same server, executing on the same processor, share variables. Thus, it seems reason­

able to allow variables to be shared within a module but not between modules. This

encapsulates the use of such variables and makes access to them efficient because processes

that share variables are guaranteed to execute on the same processor. (We assume here

that all processes within a module execute on the same processor; see Section 2.1.3 for a

discussion of this issue.) Note that processes might need to synchronize their access to

shared variables; this is discussed in Sec. 2.2.2.
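The structure being argued for, variables shared by co-located processes inside one module under explicit synchronization, looks roughly like this in threaded Python (a sketch; the buffer-pool interface is invented for illustration):

```python
import threading

# Sketch: processes (threads) within one 'module' share a pool of buffers;
# a lock synchronizes access to the shared variable.
class BufferPool:
    def __init__(self, count):
        self._free = list(range(count))   # the shared variable
        self._lock = threading.Lock()

    def acquire(self):
        with self._lock:
            return self._free.pop() if self._free else None

    def release(self, buf):
        with self._lock:
            self._free.append(buf)
```

Because all threads touching `_free` run on one processor, the lock is cheap; that is the efficiency argument for module-local sharing.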


If shared variables were not allowed within modules, common data would either

have to be managed by a separate caretaker process with which multiple server processes

could then interact, or all clients that needed access to common data would have to be ser-

viced by a single server process. Neither of these two approaches is desirable, as described

below.

The first approach is typically programmed as follows. A caretaker process con­

tains the declaration of the common data and a loop that repeatedly services a read or a

write operation on the common data. The read operation simply returns the value of the

common data; the write operation simply updates the value of the common data. A pro-

cess reads or writes the common data by invoking the caretaker's read or write operation,

respectively. This approach is clumsy to use because processes must know about the care-

taker; it is inefficient because processes must exchange messages with the caretaker. Furth-

ermore, it requires a separate caretaker for each common data item. Alternatively, a single

caretaker could manage more than one such item. However, this would require that the

caretaker's read and write operations be parameterized to indicate which item is to be read

or written; such parameterization makes this approach even more clumsy.
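The caretaker pattern just described can be sketched concretely, with Python threads and a message queue standing in for processes and message passing (the request format is invented for illustration):

```python
import queue
import threading

# Sketch of the caretaker pattern: one process owns the common data and
# repeatedly services read/write requests sent to it as messages.
def caretaker(requests):
    data = 0                        # the common data, private to the caretaker
    while True:
        op, value, reply = requests.get()
        if op == "write":
            data = value
        elif op == "read":
            reply.put(data)

requests = queue.Queue()
threading.Thread(target=caretaker, args=(requests,), daemon=True).start()

def write(value):
    requests.put(("write", value, None))

def read():
    reply = queue.Queue(maxsize=1)
    requests.put(("read", None, reply))
    return reply.get()              # wait for the caretaker's answer
```

Every access costs a full message exchange with the caretaker, which is exactly the inefficiency, relative to a shared variable, that the text objects to.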

The second approach employs a single process to service all clients that need

access to common data. This server process is quite complex because it has to multiplex

the possibly different activities of several clients. Suppose, for example, that a file system

allows more than one client process to read or write a particular file at the same time, as in

UNIX [Thom78].1 Then, each open file would require a separate file server process to allow

1 For simplicity, we assume that each process opens the file separately. In UNIX,

processes can also share files as a result of forking: a forked process inherits its parent's open files and shares its parent's I/O pointer for each open file. This necessitates a system-wide open file table. Note that this serves as another example of why sharing is desirable.


information, such as buffers and disk location, to be shared between all clients that are

accessing the file. The server would also need to maintain client-specific information; e.g.,

each client requires its own I/O pointer. The body of the server would service read, write,

seek, or close operations for each of its clients; thus, the identity of the client would need to

be available to the server, perhaps as part of the service request, which would clutter the

interface. The single server process might also introduce unnecessary delay. For example,

the server might delay when servicing a read request if no buffer is available. Conse­

quently, the server will be unable to service, say, a subsequent close operation invoked by

another client until it obtains a buffer for its read request. In fact, deadlock could occur if

the pending close would release buffers that were needed to satisfy the read request. Such

delay and deadlock can be avoided with careful coding; however, such code would be quite

complex. Ideally, each process should manage one activity. Sometimes shared variables

are needed to make this both feasible and efficient. In the above example, each client

should be serviced by its own file server process; Sec. 4.6 illustrates this structure.
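The recommended structure, one server process per client sharing the per-file state, can be sketched as follows (Python threads; the OpenFile/FileServer names are invented to mirror the discussion):

```python
import queue
import threading

# Sketch: each client gets its own file server thread. The servers for one
# file share the file's state, but each keeps its client's own I/O pointer,
# so one client's delay does not block another's requests.
class OpenFile:
    def __init__(self, data):
        self.data = data                  # shared among this file's servers

class FileServer(threading.Thread):
    def __init__(self, open_file, requests, replies):
        super().__init__(daemon=True)
        self.file = open_file
        self.requests = requests
        self.replies = replies
        self.pos = 0                      # client-specific I/O pointer

    def run(self):
        while True:
            op, arg = self.requests.get()
            if op == "close":
                return
            if op == "read":
                chunk = self.file.data[self.pos:self.pos + arg]
                self.pos += len(chunk)
                self.replies.put(chunk)
```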

To summarize, allowing variables to be shared only within a module falls between

the extremes of prohibiting all sharing and allowing arbitrary sharing. This middle-ground

approach to shared variables can be more efficient than either of the two extremes.

2.1.2. Module Activation

The next key issue is whether modules are static or dynamic. By static, we mean

that the number of instances of a module is fixed and specified at compile time. By

dynamic, we mean that the number of instances of a module can vary and is determined at

run-time; new instances are created, and possibly later destroyed, by executing statements.2

The static approach has several advantages over the dynamic approach. First,

the number of instances of each kind of module in a program is specified directly in and is

easily discerned from the source program. In SRo, for example, the number of instances of

a resource (module) appears in the resource's heading. Second, modules and objects

declared in the specification parts of modules can be referenced directly by their names.3

Finally, optimizations in interactions between modules that are loaded on the same proces­

sor might be possible. For example, suppose it is known at compile time that two modules

will be loaded on the same processor. Then, invocations between those two modules might

be able to use conventional procedure call (as in sequential languages) instead of the more

general invocation primitives that use message passing; e.g., the invoking process might use

its own stack for the arguments and directly execute the code in the invoked procedure.

Despite these advantages of the static approach, the dynamic approach is more

expressive. Moreover, this additional expressiveness can be realized with a small impact on

a language's simplicity and efficiency. The dynamic approach also has several other advan-

tages over the static approach.

One advantage of dynamic modules is that it is possible to bring up and test

different configurations of a program without recompiling all modules in the program. For

example, a program's configuration can be determined at run-time based on input values

2 In Ada, tasks that are declared within a procedure become active just prior to execution of the first statement in the procedure; we view this kind of module activation to be the result of executing the statement that called the procedure. Note that Ada tasks can also be explicitly created by executing a statement that allocates a new instance of a task type. Thus, we consider Ada tasking to be included in the dynamic approach.

3 An object's name might need to be qualified by the name of its defining module to avoid ambiguities.


or can be determined at link time by linking in one of several different main modules, each

of which creates different configurations of the other modules in the program. Second, it is

easier to reuse existing modules in new programs, again because in the dynamic approach

the number of instances of a module is separated from the code for the module. Finally, it

is possible for a program to grow and shrink during execution. In Saguaro, for example,

many servers are created and destroyed dynamically in response to changing levels of user

activity and to hardware failures. In the static approach, such growth is not so easily

attained. Typically, dynamic growth is emulated by creating a predetermined maximum

number of service modules together with an allocator module that manages that pool of

service modules. The disadvantages of this approach are that it requires an allocator (usu­

ally a separate allocator for each kind of module), clients must interact with the allocator,

and service modules persist while unallocated, occupying valuable resources such as

memory. In addition, determining the maximum number of modules a priori is sometimes

difficult; if the chosen number is not sufficiently large, unnecessary delays are likely to

result.

The additional expressiveness of dynamic modules is realizable with a small

increase in the complexity of a language, requiring only the addition of statements or

built-in functions for creating and destroying instances of modules. In addition, because

instances of modules are to be created dynamically, they must be referenced indirectly.

Such an indirect reference is represented in various languages as a pointer (e.g., Ada) or a

capability (e.g., EPL). Note that indirect reference is not necessarily a new language con­

cept. In Ada, for example, pointers are also used to refer to other dynamically allocated

objects such as nodes in a linked list.

Dynamic modules are more expensive to implement than static modules, but not

unreasonably so. Dynamic modules impose two additional requirements on the run-time


support. First, the run-time support must provide primitives for creating and destroying

modules. Note that the run-time support for a language with static modules might already

contain a primitive that creates a module. This primitive would be executed when a pro­

gram began to execute, once for each module in the program. The code in this primitive is

similar to the kind of code required to create an instance of a dynamic module; the

differences are essentially bookkeeping details. (The above approach was used in SRo; see

[Andr82b] for details.) The second additional requirement is that when an object (such as a

procedure) is used outside the instance of the module that contains it, the run-time sup­

port must first verify that that instance still exists.
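Both run-time requirements, the create/destroy primitives and the existence check behind every indirect reference, can be sketched together (Python; the RunTimeSupport/capability API is invented, loosely modeling what such run-time support might provide):

```python
# Sketch: run-time support for dynamic modules. Instances are created and
# destroyed at run time and referenced indirectly through 'capabilities';
# every indirect use first verifies that the instance still exists.
class RunTimeSupport:
    def __init__(self):
        self._instances = {}
        self._next_cap = 0

    def create(self, module_class, *args):
        cap = self._next_cap              # capability: an indirect reference
        self._next_cap += 1
        self._instances[cap] = module_class(*args)
        return cap

    def invoke(self, cap, op_name, *args):
        instance = self._instances.get(cap)
        if instance is None:              # the run-time existence check
            raise RuntimeError("instance has been destroyed")
        return getattr(instance, op_name)(*args)

    def destroy(self, cap):
        del self._instances[cap]

class Counter:                            # a trivial 'module' for illustration
    def __init__(self, start=0):
        self.value = start

    def inc(self):
        self.value += 1
        return self.value
```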

The only real efficiency loss related to dynamic modules is that it is no longer pos­

sible at compile-time to optimize interactions between modules that reside on the same

processor; e.g., the conventional procedure call described above cannot be used. This loss, however, is not too significant because (1) even without optimizations, intraprocessor

interactions are inherently far more efficient than interprocessor interactions and (2) such

interactions can possibly be optimized at run-time. For example, the run-time support

might allow a calling process to execute a procedure in another module located on the same

processor instead of creating a new process to do so, as would be required if the module

were located on a different processor. Note that since the run-time support is involved, this

is less efficient than the regular procedure call used in the static approach.

A third possible approach to module activation, not mentioned above, is to

specify the number of instances of modules at link time. This approach falls between the

dynamic and static approaches. It is similar to the dynamic approach in that modules are

referenced indirectly; it is similar to the static approach in that the number of instances of

modules is constant at run-time. The dynamic approach is therefore more expressive.


2.1.3. Module Placement

The next major concern is where modules are placed, i.e., on which processors

they are located. The key placement issues are what is the unit of placement, how and

when placement is specified, and whether placement units can move during program execu­

tion (i.e., migrate). By unit of placement, we mean the smallest program unit whose com­

ponents are guaranteed to execute on the same processor. The possible reasonable choices

are the process, the module, and a collection of modules.

The choice between process and module as the unit of placement is related to the

choice between process and module as the fundamental program unit (see Sec. 2.1.1.1). In

a language where the process is the fundamental program unit, the process is logically also

the placement unit since it is an abstraction of a processor. In a language where the

module is the fundamental program unit, either the process or the module could serve as

the placement unit. Several reasons favor the module. First, processes within a module

can share variables, which can only be implemented efficiently if the processes are located

on the same processor (see Sec. 2.1.1.3). Second, processes in the same module cooperate to

accomplish a particular task; placing such processes on the same processor allows interac­

tions between them to be optimized. Third, it would be cumbersome to specify where each

process within a module is to be placed, especially if processes are created dynamically.

Fourth, placing processes in the same module on different processors can confuse failure­

handling. Should one of the processors fail, only part of the module has failed. It is

not clear whether the remaining part can or should continue execution. For example,

• Are the module's shared variables accessible to continuing processes?

• How would recovery (either automatic or user-defined) deal with these processes?

For example, what happens to continuing processes if a new instance of the module

is created as part of recovery?


Finally, the implementation is more complicated if processes within the same module are

located on different processors. For one thing, it is difficult to coordinate access to shared

objects, some of which (e.g., queues of invocations waiting to be serviced) are managed by

the run-time support. Also, if modules are dynamic, the run-time support also needs to

maintain the location of each process so that if a module is destroyed, its processes can be

located.

Another choice for placement unit is a collection of modules. For example, a file

system contains several different kinds of modules (e.g., directory managers, file servers,

disk servers, etc.). These modules form a logical collection and could be placed together.

However, it is often important to place individual modules on different processors. For

example, a disk server must be placed on the processor to which the disk is attached while

a file server might be placed on the same processor as the client using the server. Thus, the

individual module is the more appropriate placement unit.

The next key placement issue is how and when placement is specified. The exam­

ple in the previous paragraph also illustrates that the programmer should have dynamic

control over the placement of modules; specifically, the placement of the file server on the

same processor as its client can only be accomplished during program execution. If

modules are created dynamically, it makes most sense to specify their location when each

instance is created. This provides maximum flexibility and is straightforward to imple­

ment. The utility of being able to specify the location of a module is one of the lessons

learned from Eden [Blac85]. There, they found it valuable to be able to specify the proces­

sor on which an object executes even though their overall philosophy is to provide an

environment in which objects are location independent. Being able to specify the location

of a module also provides a basic tool for load sharing.


The final issue related to placement is whether modules should be able to change

location during their existence. Such migration is useful, for example, to distribute the

load more evenly among the processors in the network, or to move a module closer to (or

even onto the same processor as) modules with which it interacts closely. However, the

implementation of migration is not simple [Powe83]. Basically, in order for a process to

migrate, its current state must be frozen on the source processor and moved to the

target processor, and future messages for it must be forwarded to the target processor.

Thus, process migration is too expensive to be built into a systems programming language

(although it might be available in a system written in such a language provided the

language allows for specifying module location).

2.2. Communication and Synchronization

We argued in the previous section that the basic unit in a distributed program­

ming language should be a module that contains processes, and that such modules should

be dynamic. We now turn attention to the internal structure of modules. We examine

process structure, basic message-passing primitives, approaches to synchronizing access to

shared variables, and a few additional communication primitives.

2.2.1. Process Structure and Message-Passing Primitives

The number of processes per module could be static, or at least bounded at com­

pile time. In this case, processes would be implicitly created when the containing module

was created. CSP [Hoar78] and Ada [Ada83] have such a process structure. Alternatively,

the number of processes could be dynamic, with new processes created as needed. Several

languages provide dynamic processes: DP [Brin78], Mesa [Mitc79], and StarMod [Cook80].

A hybrid approach is also possible. In this case, some processes, such as those performing

background tasks, are created implicitly and others are created explicitly. Argus [Lisk83a]


uses this approach. Note that if the hybrid approach is used with dynamic modules, then

the background processes are created when their defining module is created. The effect

then is that they are created dynamically. Thus, in reality, all processes are created

dynamically if modules are dynamic.4

Processes in different modules must exchange messages to interact.5 Message pass-

ing can be provided by send/receive primitives or by remote procedure call. With

send/receive primitives, data values flow in one direction between a sending and a receiv­

ing process. The sender of a message either continues immediately after issuing the send

as in PLITS [Feld79], or delays until the message has been received as in CSP [Hoar78]. In

the former case, send is a non-blocking primitive, so message passing is asynchronous. In

the latter case, send is a blocking primitive, so message passing is synchronous. Receive

is invariably a blocking primitive, although a non-blocking variant is often provided.

With remote procedure call, data can flow in two directions: the process that calls

an operation sends data to a process that services the call, then waits for the results of the

call (if any) to be returned. Thus, remote procedure call provides a structured equivalent

of the send/receive and receive/send sequences used by clients and servers. The server

side of a remote procedure call can be provided by a process that is dynamically created to

service the call, as in DP or Argus. Alternatively, an already existing process could engage

in a rendezvous with the caller, as is done in Ada and SRo. StarMod supports both

possibilities.

4 The issues of dynamic modules and dynamic processes are orthogonal. If a language provides only one, the other has to be simulated, which can lead to artificial program structures.

5 Processes in the same module can exchange messages or use shared variables to in­teract.


Different languages include different combinations of these choices for process

structure and message passing. The question is: What combination is best? Theoretically,

the question is moot since it is possible to solve any problem using any combination of

these mechanisms. However, different combinations are better suited to solving some prob­

lems than others. The ideal combination is one that allows each problem to be solved in

the most straightforward and efficient manner possible. More precisely, it should not be

necessary to introduce superfluous processes or work queues, or to obfuscate the interface

to a module merely to overcome the limitations imposed by a particular choice.

For example, suppose a language provides remote procedure call and rendezvous,

but not send. Simulating send could be accomplished by the invoker calling a procedure

that simply inserts its arguments onto a work queue; the caller would therefore continue

after its arguments have been queued, thus effecting send. Another process would then

service elements on this work queue; it would remove an element from the queue and ser­

vice that element as appropriate. As another example, suppose a language provides send

and receive, but not remote procedure call. Then remote procedure call could be

simulated by using send/receive and receive/send, as suggested before. Note, however, that

the interface to the server operation would also be different. If the server operation is to be

invoked by different clients, one of its parameters would need to specify the operation to

which it is to send its reply.
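The second simulation can be sketched concretely. The Python fragment below is an illustrative analogy, not SR code (the queues, the `server_op` name, and the doubling operation are all invented): it simulates a remote procedure call on top of send/receive by passing a reply operation as the extra parameter the text describes.

```python
import queue
import threading

# The server's operation, modeled as a message queue.  Each request carries
# the extra "reply" parameter discussed above, naming the operation (here,
# another queue) to which the server should send its result.
server_op = queue.Queue()

def server():
    while True:
        args, reply_to = server_op.get()    # receive a request
        if args is None:                    # shutdown sentinel
            break
        reply_to.put(args * 2)              # send the reply to the named operation

threading.Thread(target=server, daemon=True).start()

def remote_call(x):
    """Simulated remote procedure call: send, then receive the reply."""
    reply = queue.Queue()
    server_op.put((x, reply))               # send (the caller does not block)
    return reply.get()                      # receive (the caller blocks here)

result = remote_call(21)
```

Note how the reply queue is exactly the complication the text warns about: it is visible in the operation's interface, whereas a true call would hide it.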

The question of which combination of choices is best has been addressed at length

by Liskov, et al. [Lisk86]. Their observations are:

(1) Send/receive is viable with either static or dynamic process structure. However,

the client/server relationship is the dominant one in most distributed programs.

Since clients almost always require answers to their service requests, remote pro­

cedure call provides by far the most convenient, and also familiar, client interface.


(2) The combination of rendezvous and static process structure is not sufficiently rich

to be attractive by itself. In particular, there are two classes of problems that are

difficult to solve with rendezvous: local delay and remote delay. Local delay occurs

in a server when an object needed to service a request is not currently available.

Remote delay occurs when a server, in processing a request, calls another server and

encounters a delay. In both cases, it may be necessary for the server to honor other

requests in order to remove the conditions that led to the delay. It is possible to

cope with local delay as long as the rendezvous mechanism allows the choice of

which operation to service next to be based on attributes of pending operations,

including their parameters. Rendezvous alone, however, is not capable of dealing

effectively with remote delay, since the server cannot predict when such a delay

will occur.

For these reasons, Liskov, et al. conclude that either rendezvous or static process structure

has to be abandoned. This is a reasonable conclusion if only one combination of primitives

is available. It is also a reasonable conclusion in Argus' application domain of transaction

processing systems [Lisk83b]. However, rendezvous and static process structure are ade­

quate, provided a language has a rich enough set of primitives. Moreover, we believe it is

beneficial to have a variety of primitives, especially to program distributed operating sys­

tems such as Saguaro (see also [Scot83]). Our experience with SRo bears this out, as

described below.

SRo has a static process structure. Operations are invoked by asynchronous send

or by call; operations are serviced by a rendezvous mechanism called the in statement.

The in statement allows the choice of which operation to service to be based on the attri­

butes of pending invocations, including their parameters. Thus, local delay can be handled

in SRo. Moreover, send can be used to avoid remote delay: instead of calling an operation


that could delay, send to it and later use a rendezvous to receive the reply.6 This technique

converts potential remote delays into local delays since the server waits only to receive new

messages. It also obviates the need to create a process to service the remote operation.
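This conversion of remote delay into local delay can be sketched as follows (Python rather than SR; the queues and the slow "backend" are invented stand-ins for operations). The server never calls the potentially slow backend; it sends the request and treats the eventual reply as just another incoming message, so its only blocking point is the receive on its own inbox.

```python
import queue
import threading
import time

# Hypothetical backend server whose servicing may be slow (the "remote delay").
backend = queue.Queue()

def backend_server():
    while True:
        x, reply_to = backend.get()
        time.sleep(0.05)                 # simulate a delay at the remote server
        reply_to.put(("reply", x * x))

threading.Thread(target=backend_server, daemon=True).start()

inbox = queue.Queue()                    # the server's only blocking point
log = []

# Two client requests arrive.
inbox.put(("request", 7))
inbox.put(("request", 8))

pending = 2
while pending:
    tag, value = inbox.get()             # the server waits only to receive
    if tag == "request":
        backend.put((value, inbox))      # asynchronous send: no remote delay
    else:                                # a backend reply, received later
        log.append(value)
        pending -= 1
```

While the backend works, the server remains free to pick up further requests, which is precisely the point of the technique.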

In addition to making it possible to avoid remote delay, send also has other uses.

For example, it can be used to program pipelines of filter processes. A filter is a data

transformer that consumes streams of input values, performs some computation based on

those values, and produces streams of results. Filters are most simply and efficiently pro-

grammed using asynchronous send and receive. This is because it is never necessary to

delay when a message is produced: no assurance is required that the message has been

received, and no return value is needed. Producer delay can be avoided when synchronous

send or remote procedure call is used, but this requires programming extra buffer

processes. In Saguaro, send is used for implementing pipelines of filters and for situations,

such as writing windows or files, where it is not necessary to delay the invoking process.
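A minimal sketch of such a pipeline (Python threads and queues standing in for SR processes and operations; the stage functions are invented) shows why asynchronous send suffices: each filter produces its output without waiting for its consumer.

```python
import queue
import threading

def filter_stage(transform, inp, out):
    """A filter: consume a stream of inputs, transform, produce a stream."""
    for item in iter(inp.get, None):     # None marks end of stream
        out.put(transform(item))         # asynchronous send: never delays
    out.put(None)                        # propagate end of stream

# A two-stage pipeline, wired together with queues as message channels.
q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
threading.Thread(target=filter_stage, args=(lambda x: x + 1, q1, q2)).start()
threading.Thread(target=filter_stage, args=(lambda x: x * 10, q2, q3)).start()

for x in [1, 2, 3]:
    q1.put(x)                            # producer never blocks
q1.put(None)

results = list(iter(q3.get, None))
```

With synchronous send, each `put` would stall until the next stage performed a matching receive; the unbounded queues here play the role of the extra buffer processes the text mentions.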

To summarize, we have found SRo's combination of static process structure, ren­

dezvous, call and send to be adequate. However, we now feel that the programmer should

be able to choose whether a new process is created to service an operation, or whether it is

serviced by performing a rendezvous with a previously created process. Furthermore, the

programmer should also be able to choose whether the invoker waits for a newly created

process to complete or whether it continues execution, in which case the invoker and the

new process execute concurrently. Note that these choices are like the ones for rendezvous:

the invoker can wait for an operation to be serviced (e.g., using call) or can continue execu-

tion (e.g., using send). Also note that it is useful to pass parameters to a process created

6 Note that send must not be synchronous or else the sender would still encounter delay until the message is received.


to service an operation. For a process for which the invoker waits, the parameters might

be used to calculate some result that is returned to the invoker. For a process that exe­

cutes concurrently with its creator, the parameters might be used to initialize the created

process's local data. These similarities between operations and processes suggest that

language mechanisms for the two should be similar; Sec. 3.2 describes how SR unifies these

concepts.

Creating a new process to service an operation that was called (i.e., remote pro­

cedure call) has a semantics analogous to that of procedure call in sequential languages.

Since local (and recursive) procedures are still important in distributed programming, this

semantic similarity suggests that the mechanisms for calling remote and local procedures

and for servicing their calls should be very similar if not identical.

Essentially the same arguments as those presented earlier for dynamic modules

apply here to dynamic processes: they provide maximum flexibility in programming a

module body, and they can be implemented at modest cost by making minor alterations to

the language's run-time support. In particular, the run-time support must contain a prim­

itive for process creation; it typically allocates space for the process and makes the process

eligible for execution. For static processes, this primitive is invoked as part of module crea­

tion. For dynamic processes, this primitive is invoked when a process is to be created. The

actions in either case are identical except perhaps for the information the run-time support

maintains.

In a language with dynamic modules and dynamic processes, communication

paths between modules and processes are necessarily dynamic. Dynamic communication

paths provide needed flexibility in programming. Even in SRo, where modules and

processes are static, communication paths can vary dynamically (by using capabilities for

individual operations). Communication paths must be dynamic in order to support certain


applications, e.g., I/O redirection like that in UNIX. Specifically, the standard input for an

execution of a command can come from a file or a device. To allow the command to be

written independently of its source of input, it is passed a communication path that is

bound to the read operation in a file server or a device driver. This binding is done dynam­

ically before the command begins execution.
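The same dynamic binding can be sketched in Python (an invented analogy: plain callables stand in for capabilities bound to read operations). The command is written once, independent of its input source, and bound just before it runs.

```python
# Two possible sources of standard input, each exporting a "read" operation.
# Both names are hypothetical.
def file_read():               # a file server's read operation
    yield from ["line from file"]

def device_read():             # a device driver's read operation
    yield from ["line from device"]

def command(read):
    """Written independently of its input source: it simply uses whatever
    communication path (here, a callable) it was dynamically bound to."""
    return [line.upper() for line in read()]

# The binding is done dynamically, before the command begins execution,
# just as with I/O redirection in UNIX.
out1 = command(file_read)
out2 = command(device_read)
```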

2.2.2. Synchronizing Access to Shared Variables

We argued in Sec. 2.1.1.3 that processes within a module should be able to share

variables. Since processes will often need to synchronize access to shared variables, a

language should provide some mechanism to support such access. However, a language

should not require that access to shared variables be synchronized because there are situa­

tions in which synchronization is not required. For example, suppose that many processes

read from a shared table but only one process updates the table. Provided that readers can

tolerate slightly out-of-date information, as is the case in some applications, no synchroni­

zation is needed between readers and the writer. Thus, a language should just provide the

mechanisms with which access to shared variables can be synchronized if so

desired.

There are two simple approaches that require no additional language constructs.

The first approach uses shared variables to program synchronization for other shared vari­

ables, e.g., using Peterson's algorithm [Pete81]. Besides being awkward, this approach is

very inefficient because of its busy-waiting. The second approach employs a distinct pro­

cess for shared variable synchronization. Whenever a process requires synchronized access

to the shared variables, it sends a request to this synchronizer process. The synchronizer

then either performs the access itself (in which case it is similar to the caretaker process

described in Sec. 2.1.1.3), or grants access permission to the other process. Especially if a

language contains a rendezvous mechanism with selection control, as in Ada or SRo, the


synchronizer and the interface to it are easy to program. Unfortunately, the overhead of

exchanging messages with the synchronizer outweighs most of the performance gain that

can result from the use of shared variables. Special cases could be recognized and optim­

ized, but these cases are hard to detect, especially when processes can be created dynami­

cally.
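For concreteness, the first approach looks roughly like the following Python rendering of Peterson's two-process algorithm [Pete81]. This is illustrative only: it leans on CPython's effectively sequentially consistent execution of these loads and stores (real shared memory would need fences), and the busy-wait loop is exactly the inefficiency noted above.

```python
import threading

# Peterson's algorithm for two processes: mutual exclusion built from
# shared variables alone, at the price of busy-waiting.
flag = [False, False]   # flag[i]: process i wants to enter
turn = 0                # whose turn it is to defer
counter = 0             # the shared variable being protected

def worker(me):
    global turn, counter
    other = 1 - me
    for _ in range(2000):
        flag[me] = True                      # announce intent to enter
        turn = other                         # yield priority to the other
        while flag[other] and turn == other:
            pass                             # busy-wait
        counter += 1                         # critical section
        flag[me] = False                     # exit protocol

threads = [threading.Thread(target=worker, args=(i,)) for i in (0, 1)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the entry and exit protocols, the two unsynchronized increments would interleave and updates would be lost; with them, every increment survives.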

Another approach to synchronizing access to shared variables is to employ atomic

data types and atomic actions upon them, as in Argus [Lisk83a]. However, support for this

approach requires a fairly substantial, special-purpose operating system. Hence, this

approach is inappropriate for a language intended for writing operating systems.

Intermediate approaches can be classified by whether exclusion to shared vari­

ables is provided implicitly or must be programmed. In DP [Brin78] and LYNX [Scot86],

exclusion is implicit and results from their view of intra-module concurrency. In both these

languages, there is a single thread of control within each module. The thread executes in a

particular procedure until a blocking statement is encountered; e.g., in LYNX, the accept

statement blocks if no message is available. The thread then continues to execute in a

different procedure. Thus, access to shared variables is synchronized by preventing context

switches between procedures. This approach is simple and efficient to implement. How­

ever, it only provides mutual exclusion between blocking statements. This requires the

programmer to be aware of which statements can block and to be careful when modifying

existing code not to introduce blocking statements into a critical section or into a pro­

cedure called from within a critical section. Furthermore, interrupt handling is difficult to

integrate into this approach because the thread executing at the time of the interrupt must

continue until it encounters a blocking statement. Unless threads have short durations,

interrupts might not be handled in a timely fashion.


The other kind of intermediate approach requires that exclusion to shared vari­

ables be programmed. Languages in this category include additional mechanisms

(semaphores, conditional critical regions, or monitors [Andr83]) for programming such

exclusion. The use of these mechanisms is fairly well understood. Also, semaphores and

monitors can be implemented very efficiently, which is important if the potential efficiency

provided by shared variables is to be realized. StarMod provides semaphores as a basic

data type [Cook80]; Ada also provides semaphores by means of a predefined package. EPL

provides monitors since it is based on Concurrent Euclid [Holt83].

The choice between semaphores and monitors is a difficult one. Monitors enable

shared variables to be encapsulated with the operations on them and make mutual exclu­

sion of these operations automatic. Semaphores require that mutual exclusion be explicitly

programmed but provide more flexibility exactly because mutual exclusion is not

automatic. Since we have found that one of the primary uses of shared variables is for

tables that are mostly read, often concurrently, we prefer using semaphores. Moreover, the

encapsulation provided by monitors is of marginal benefit since the shared variables and

processes that access them are already encapsulated by the module construct. Another

argument for semaphores is that they might already be supported by a language's com­

munication primitives; we shall see in Sec. 3.2.2 that this is the case in SR.

2.2.3. Additional Communication Mechanisms

We argued in Sec. 2.2.1 that each of remote procedure call, rendezvous, asynchro­

nous message passing, and dynamic process creation has a place in a distributed program­

ming language. In this section, we consider a few other communication primitives that a

distributed programming language might include.

It is convenient in distributed programs to be able to invoke several operations

concurrently since this can be much more efficient than invoking them sequentially. Often


it is desirable to perform a multicast in which several operations are passed the same argu­

ments, e.g., to update all copies of a replicated file. Other times it is desirable to pass

different arguments to different operations, e.g., to read several different records from a dis­

tributed database. In both these cases, it is appropriate for the invoker to wait for all the

operations to be serviced before proceeding. This is not always the case, however. For

example, when reading from a replicated file, it may only be necessary to wait for one of

the reads to complete. Or, if weighted voting is used with a replicated file [Giff79], just

some subset of the replicas need to be reached on each read or write. It is also desirable to

record results from such invocations as they complete. For example, it might be useful to

remember which copy of a replicated database responded first or to count replicas' votes.

The need to wait for either all or some invocations to complete and the desire to record

results from invocations as they complete suggest a statement that has an action associ­

ated with the completion of each invocation; this action could be used to terminate the

statement early or to record results. The coenter statement in Argus [Lisk83a] provides

such functionality; a similar statement is included in SR (see Sec. 3.2.4). The V kernel also

provides primitives that support this functionality.
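A coenter-like statement can be approximated with futures. The Python sketch below is an analogy, not Argus or SR (replica names, delays, and the quorum size are made up): it invokes all replicas concurrently, runs an action as each invocation completes, and terminates early once a quorum has answered.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical replicated-read: each replica returns its name and a value
# after some delay.
def read_replica(name, delay, value):
    time.sleep(delay)
    return name, value

replicas = [("a", 0.05, 42), ("b", 0.01, 42), ("c", 0.2, 42)]
quorum = 2
votes = []

with ThreadPoolExecutor() as pool:
    futures = [pool.submit(read_replica, *r) for r in replicas]
    for fut in as_completed(futures):     # an action per completed invocation
        name, value = fut.result()
        votes.append(name)                # record results as they complete
        if len(votes) >= quorum:          # early termination, as in coenter
            break
```

The per-completion body is where one would count weighted votes or remember which replica answered first, as described above.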

As mentioned above, a sequence of call invocations might not be as efficient as a

concurrent invocation: the invoker needs to wait for each invocation to complete before

beginning the next one. In addition, if one of the invocations does not complete due to

failure, the invoker might not be able to continue (depending on how failures are handled

in the language-see Sec. 2.3). To increase concurrency, the sequence of call invocations

could be replaced by a sequence of send invocations followed by receive's of replies. How­

ever, this would complicate the program because the interface to the invoked operations

would be more complicated (a reply parameter would need to be included, like that dis­

cussed in Sec. 2.2.1) and the code would need to be written so as not to confuse replies to

earlier sequences of send's with replies to the current sequence of send's. That is, a reply to

one of the send's in an earlier sequence might be mistaken as a reply to one of the send's

in the current sequence. Of course, sequence numbers could be used to distinguish between

replies, but this makes the code and the interface to the invoked operations even more

complicated. Furthermore, if the interface is changed, then normal call invocations that

previously invoked the operation would need to pass dummy parameters for a reply and

sequence number, or to invoke a different version of the same operation. In either case, the

resulting program is clumsy. Thus, there is a need for a concurrent invocation mechanism

in a distributed programming language.

It is also convenient to allow a server of a call invocation to perform an early

reply. That is, the server of a remote procedure call or rendezvous is allowed to reply

before it has completed providing service. This allows the server to return results to the

invoker and then execute concurrently with the invoker. An important use of early reply is

to program a conversation, i.e., a repeated exchange of messages between a client process

and a server process. In Saguaro, for example, a client process wanting to open a file calls a

procedure in the file manager and then engages in a conversation with a process in a file

server. Their conversation consists of a sequence of read, write, and seek invocations fol­

lowed by a close invocation, which terminates the conversation. (Sec. 4.6 shows how this is

programmed in SR.) Early reply allows the server to return names of operations that the

client is to use for the conversation.
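The shape of such a conversation can be sketched in Python (queues playing the role of SR operations; the write/read/close operation names follow the Saguaro example, but the code itself is an invented analogy). The open request is answered early with the conversation's operations, after which server and client execute concurrently.

```python
import queue
import threading

open_op = queue.Queue()   # the file manager's "open" operation

def file_server():
    reply_to = open_op.get()
    requests = queue.Queue()
    reply_to.put(requests)              # early reply: hand back the ops
    data = []                           # then keep servicing the conversation
    while True:
        op, arg, ans = requests.get()
        if op == "write":
            data.append(arg)
            ans.put(len(data))
        elif op == "read":
            ans.put(data[arg])
        elif op == "close":
            ans.put("closed")           # close terminates the conversation
            break

threading.Thread(target=file_server, daemon=True).start()

# Client side: open, then a sequence of invocations on the returned ops.
reply = queue.Queue()
open_op.put(reply)
ops = reply.get()

def invoke(op, arg=None):
    ans = queue.Queue()
    ops.put((op, arg, ans))
    return ans.get()

invoke("write", "hello")
got = invoke("read", 0)
status = invoke("close")
```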

Note that early reply could be simulated using send and receive. Specifically, a

client could send to the server, which would send the results back to the client and then

continue executing. However, the interface is not as clean as it would be if early reply were

used because a reply operation in the client would need to be passed to the server. Furth­

ermore, early reply provides a better abstraction and more flexibility in programming:


whether the reply to a call invocation is the result of the server completing normally or exe­

cuting an early reply is not known outside of the implementing module. The programmer

is therefore free to program the server in the most appropriate way.

The implementation of concurrent invocation and early reply mechanisms is

reasonably inexpensive. The run-time support must already provide primitives for invok­

ing an operation and for replying to an invocation. These same basic primitives can be

used to implement the new mechanisms. Details of how this is accomplished in SR are

described in Sec. 5.3.

Two additional communication mechanisms, what we call defer and forward,

might also be included in a distributed programming language. The first, defer, would be

used within a rendezvous to defer processing of an operation. This would allow invoca­

tions to be replied to in a different order than they were selected, so-called "out of order"

replies. The main use of defer would be to avoid local delay (Sec. 2.2.1). However, as

described in Sec. 2.2.1, local delay can be avoided if a language's rendezvous mechanism

contains a powerful selection mechanism. Moreover, defer can be simulated using send

and receive. There is therefore no need for defer in a language that has a powerful ren­

dezvous mechanism or send and receive. Furthermore, defer is a somewhat complex

mechanism because the arguments of a deferred operation must remain accessible outside

the conventional scope of a rendezvous, and there is no general way to ensure that a

deferred invocation is eventually replied to, or that it is not replied to more than once.

The second mechanism, forward, would allow an invocation being serviced to be

partially processed and then passed to another operation. This would forward responsibil­

ity to reply to the original invocation. An example of where forward would be useful can

be found in some file systems. To read from a file, a user invokes the read operation in a

file server. The file server in turn invokes the read operation in the appropriate disk server.


The disk server replies to the file server, which then replies to the user. If forward were

available, the file server could forward the user's request to the disk server, which could

then reply directly to the user, thereby eliminating the superfluous reply to the file server.7

The V-system, which is built on the V kernel, uses forward for such a purpose. Note that

forward can be simulated using send and receive. In the above example, the user would

pass an additional parameter specifying an operation to which results are to be sent. The

file server would pass this operation to the disk server, which would send results directly

to that specified operation in the user. Unfortunately, this simulation requires the user to

use a send/receive pair instead of the cleaner call interface. Although forward has its

uses and can provide a cleaner interface, it is not easily integrated into a high-level

language. In particular, an invocation of an operation should only be forwarded to another

operation that has the same number and type of value/result and result parameters and

return value so that they can be properly assigned.
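The send/receive simulation of forward described above can be sketched as follows (Python, with invented queues standing in for the read operations): the file server passes the user's reply operation along to the disk server, which then replies directly to the user.

```python
import queue
import threading

file_op = queue.Queue()   # the file server's read operation
disk_op = queue.Queue()   # the disk server's read operation

def file_server():
    block, reply_to = file_op.get()      # partially process the request...
    disk_op.put((block, reply_to))       # ...then forward it, and move on

def disk_server():
    block, reply_to = disk_op.get()
    reply_to.put(f"data in block {block}")   # reply goes straight to the user

threading.Thread(target=file_server, daemon=True).start()
threading.Thread(target=disk_server, daemon=True).start()

# The user must use a send/receive pair rather than a clean call interface,
# which is the drawback the text points out.
reply = queue.Queue()
file_op.put((7, reply))
result = reply.get()
```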

Note that one of the reasons we argued against including defer and forward in a

language was that each could be simulated using send and receive. While the same is

true of concurrent invocation and early reply, those mechanisms are more useful and

simpler; thus a language should make the tradeoff to include them.

2.3. Coping With Failures

Programs that execute on network computers have the potential to be more reli-

able than programs that execute on single processors. Reliability is, in fact, one of the pri-

mary goals of many distributed systems. Saguaro, Eden [Alme85], Locus [Walk83], and

Clouds [Allc83], among others, attempt to provide service despite failures such as

7 In the Saguaro file system, the reply to the file server is needed because the file server performs some buffering.


processor, network, and device crashes. Consequently, a language for programming distri­

buted systems must provide mechanisms for dealing with such failures.

When a system fails, it either simply stops executing or it continues executing but

produces erroneous results. The first kind of failure is called fail-stop failure [Schl83]; the

latter is called Byzantine failure [Lamp82]. With Byzantine failure, erroneous results can

be produced; for example, a bad memory board might not correctly maintain the value of a

variable or a communications network might garble, lose, or spuriously generate messages.

Byzantine failures can be detected and handled using a Byzantine algorithm [Lamp82].

Such an algorithm, however, is very expensive to implement and can be application depen­

dent. Byzantine algorithms therefore are not implicit in any language we know; instead

the user must explicitly program such algorithms. Thus, we are only concerned in the

remainder of this dissertation with language mechanisms to handle fail-stop failures.

Fail-stop failures can be handled in a distributed programming language in one of

three general ways. One approach is to provide high-level language mechanisms that either

hide the occurrence of failures or are used to recover from them. Examples of this

approach are atomic actions [Lisk83aj and replicated procedure calls [Coop84j. Each of

these techniques is useful, especially for applications programs. However, each requires an

extensive run-time support that is in essence a special-purpose operating system. Thus,

these approaches are too high-level for systems programming.

At the opposite extreme, a language could merely provide a time-out mechanism by which a process can avoid indefinite delay while waiting for an invocation to complete or to arrive. This approach is supported in Ada by the delay statement. SRo also employed this approach, but at an even lower level than Ada. In particular, in SRo one had to program an interval timer and then use it to provide a delay facility. The SRo


approach is flexible and requires very little language run-time support8, but it is relatively clumsy.
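The timed-wait idea can be illustrated outside SR. The following sketch, in Python rather than SR, is a hypothetical analogue of a receive guarded by a delay: the waiting process bounds how long it will wait for an invocation to arrive, and treats expiry of the bound as evidence of a possible failure.

```python
import queue

# A channel of pending invocations, modeled as a thread-safe queue.
def receive_with_timeout(invocations, seconds):
    """Wait for an invocation, but avoid indefinite delay."""
    try:
        return ("served", invocations.get(timeout=seconds))
    except queue.Empty:
        # Timed out: the sender may have failed; recovery goes here.
        return ("timed out", None)

invocations = queue.Queue()
invocations.put("request A")
served = receive_with_timeout(invocations, 0.05)   # one invocation pending
missing = receive_with_timeout(invocations, 0.05)  # nothing pending: times out
```

As the surrounding text notes, this style is flexible and cheap but clumsy: every waiting point must be programmed with an explicit bound and explicit recovery.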

An intermediate approach is also possible. Although failure detection ultimately relies on the use of timeout, what the programmer requires is some way to ascertain that a failure has occurred. The abstract concept is thus that a component has failed. The language's run-time support is charged with detecting the failure. The programmer is charged with handling the failure once it has been detected. The language must therefore provide mechanisms that can be used to inform the programmer of an invocation that failed to complete, of an expected invocation that failed to arrive, or of a module creation or destruction that failed. For example, if one module has detected that another has failed, say because one of its invocations did not complete successfully, it may decide to create a new instance of the failed module. One proposal [Schl86] along these lines defines handler operations that can be bound to events such as a processor crash. When the run-time support detects such an event, it invokes the handler operation, which can then handle the failure as it sees fit.

We feel that the intermediate approach is the most appropriate approach for systems programming. It is simpler to use than timeouts because it supports a higher-level abstraction, which frees the programmer from dealing with low-level, implementation details. It is also more efficient than the higher-level failure-handling mechanisms. Note, however, that there is little experience with writing distributed programs that handle failures. More experience in this area is required to demonstrate the appropriateness or

8 The run-time support maps each timer interrupt into a send invocation of the operation attached to the timer interrupt location. Such an operation is implemented as a semaphore. The send invocation that results from the interrupt is implemented as a V and the in statement that services the invocation is implemented as a P.


inappropriateness of current approaches and possibly lead to new approaches.

In addition to hardware failures, exceptions can also occur in programs. An exception is an unexpected or infrequent event such as a zero divisor, a subscript out of range, or a server process that terminates before completing its client's request. Consequently, some languages (e.g., CLU [Lisk81], Mesa [Mitc79], Ada [Ada83], and Modula-2+ [Rovn85]) provide exception handling mechanisms. On the other hand, others (e.g., Black [Blac82]) have argued that such mechanisms are unnecessary and that most exceptions can be prevented by careful coding or guarded against by explicit error checks. The approaches to exception handling in current distributed programming languages are based on and extend those in sequential programming languages. In Ada, for example, if an exception occurs in a called procedure, the exception is raised in the called procedure, where it is handled by a local handler or propagated upward along the call-chain until a handler for it is encountered. If two Ada tasks (processes) are engaged in a rendezvous and an exception occurs in the server, then the exception is raised in both the server and the client. In some cases, this extension of sequential behavior is reasonable. However, its inflexibility limits how the programmer can structure a program. For example, it might be desirable to have the exception handled by an entirely different process, perhaps one that is monitoring the progress of the other two.9

It is also not clear how to apply the Ada approach to a language that includes asynchronous message passing. For example, suppose that two processes are simulating a rendezvous using send and receive. That is, the client uses a send to transmit arguments and a receive to obtain results; the server uses a receive to obtain the request and a send

9 This possible structure results from the ability to have multiple processes in a program.


to return results. Suppose further that an exception occurs in the server between its receive and send. Then, should the exception be raised only in the server or also in the client? If it were raised only in the server, then the client might also need to know. However, automatically raising the exception in the client is difficult because there is no information inherently available to an implementation that would identify those clients that are waiting for a reply from the server. Of course, the client could be notified and the desired flexibility could be achieved through explicit programming (e.g., by passing additional parameters that indicate the exception to be raised by the local handler). However, this would complicate the code considerably, thus negating that would-be advantage of exception handling mechanisms.
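The explicit-programming alternative mentioned above can be sketched concretely. The following Python fragment (a stand-in for send/receive, not SR) shows a server that catches an exception between its receive and its send and passes a status indication back, so that the client's local handler can re-raise it; the exact names are illustrative only.

```python
import queue
import threading

def server(requests):
    divisor, reply = requests.get()            # receive the request
    try:
        reply.put(("ok", 100 // divisor))      # send results back
    except ZeroDivisionError as e:
        # Exception between receive and send: explicitly notify the client.
        reply.put(("exception", repr(e)))

def call(requests, divisor):
    reply = queue.Queue()
    requests.put((divisor, reply))             # send: transmit arguments
    status, value = reply.get()                # receive: obtain results
    if status == "exception":
        raise RuntimeError(value)              # local handler re-raises
    return value

requests = queue.Queue()
threading.Thread(target=server, args=(requests,)).start()
result = call(requests, 4)                     # normal case: 100 // 4
```

Even this small sketch shows the cost the text alludes to: every request/reply pair must carry and check an extra status field.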

Besides the uncertainties raised above, exception handling mechanisms also considerably complicate a language and its implementation. For these reasons, we feel that while there might be a need for exception handling mechanisms, none of the current approaches is appropriate. As with failure-handling mechanisms, further work and experience with exception handling mechanisms are needed.


CHAPTER 3

SR Language Overview

In Chapter 2, we argued that a language for programming distributed systems should provide certain kinds of mechanisms: dynamic modules, shared variables (within a module), dynamic processes, call and send forms of message passing, rendezvous, concurrent invocation, and early reply. In this chapter, we describe the SR language. We show how these mechanisms are realized in SR in a way that is expressive yet simple. SR resolves the tension between expressiveness and simplicity by providing a variety of mechanisms based on only a few underlying concepts. This chapter also justifies the specific syntax and semantics we have chosen for mechanisms in SR. Some of this justification, however, is deferred until Chapter 6 so as (1) not to digress too far from the exposition of the language, (2) to be able to discuss the interplay between mechanisms, and (3) to be able to discuss the implementation of the mechanisms, which is described in Chapter 5.

The main components of SR programs are parameterized resources, which serve the role of the abstract module discussed in Chapter 2. Resources interact by means of operations, which generalize procedures. Operations are invoked by means of synchronous call or asynchronous send. Operations are implemented by procedure-like proc's or in statements; equivalently, invocations of operations are serviced by proc's or in statements. In different combinations, these mechanisms support local and remote procedure call, dynamic process creation, rendezvous, message passing, and semaphores.


We describe resources in Sec. 3.1 and operations and communication primitives in Sec. 3.2. The remaining sections of this chapter describe failure handling mechanisms (Sec. 3.3); types, declarations, and sequential statements (Sec. 3.4); signatures and type checking (Sec. 3.5); and implementation-specific mechanisms (Sec. 3.6). Our description of the language is fairly complete but somewhat imprecise and informal; e.g., we do not give precise syntax for parameter lists. The complete language is described more precisely and formally in [Andr85]; Appendix A contains a synopsis of the language. This chapter contains many small examples that illustrate particular points about the language mechanisms. Chapter 4 contains several large examples that illustrate the interplay between language mechanisms.

3.1. Global Components and Resources

SR provides two kinds of separately-compiled components: globals and resources. A global component groups together related declarations that are needed by more than one resource. It is therefore purely declarative and exists only during program compilation. A global component may contain declarations of symbolic constants, user-defined types, and user-defined communications channels (optype's; see Sec. 3.2.1). When a global component is imported into a resource, the objects declared in that component become available within the resource. The general form of a global component is:

global identifier
  constant, type, and optype declarations
end

Global components are analogous to '.h' files in C programs; they differ in that they are separately compiled and imported rather than being merely textually included. A simple


example of a global component is:1

global People
  type person = rec( id: int; name: string(10); age: int; next: ptr person )
  const age_unknown = -1
end

People declares a record type, person, and an integer constant, age_unknown. The record

type person consists of two integer fields, id and age; a string field, name, which is at most

10 characters long; and a pointer field, next, which points to a person record. A larger

example of a global component appears in Sec. 4.6.

A resource consists of an interface part and an implementation part. The interface

part specifies what other components the resource uses; declares operations, constants, and

types provided by the resource; and specifies the types of resource parameters. The general

form of the interface part is:

spec identifier
  import component_identifiers
  operation, constant, type, and optype declarations
resource identifier( formal_parameters )
separate

The identifier following spec defines the name of the resource; the identifier following

resource must be the same. The components listed in import statements are imported

into the spec; the objects they declare can be used within this interface part and the

corresponding implementation part. All objects declared in the interface part are exported

from the resource; they may be used in other resources that import this resource. Objects

exported from a resource's interface part are also visible within its implementation part.

Note that a resource can be parameterized. These parameters are typically used to pass

1 In presenting fragments of SR programs, we follow the convention of using boldface for reserved words, italics for user-declared identifiers, and Roman for predefined functions, types, and enumeration literals.


communications paths or sizes of arrays to resource instances when they are created.

Consider the following resource specification, which illustrates the import

mechanism.

spec database
  import People
  op query( key: int ) returns info: People.person
resource database()
separate

This resource imports the global component People (presented earlier) and declares a query operation with one parameter, key, and a return value, info. The type of key is integer; the type of info is person, which was declared in People. Note the use of standard "dot notation" to specify the type of info. Such qualification is only required to disambiguate objects with the same name defined in different components. In the above example, People.person could be replaced with just person because there is only one person in the specification's scope. However, if database were to import another component that also defined a person, then each reference to person would need to be qualified by the name of the component that defines the desired person.

The implementation part (body) of a resource contains the processes that implement the resource, declarations of objects shared by those processes, and initialization and finalization code. The general form of the body is:

body identifier
  declarations of shared objects
  initial block end
  processes
  final block end
end

A body's identifier must correspond to the identifier of a previously compiled spec. The

shared objects may be constants, types, optype's, variables, and operations; none of these

shared objects are visible outside the body. A block is a sequence of declarations and


statements. A block can declare the same kinds of objects that can be declared in the body of a resource; none of these objects are visible outside the block. Note that variables cannot be declared within the specification of a resource; they may only be declared within the body of a resource or within a block. Thus, variables cannot be shared by processes in different resources.

Most of the pieces in the interface and implementation parts of a resource are optional and may occur in any order. An object can be referenced any time after it has been declared up to the end of its defining block, except in a nested block if its name is hidden by another object with the same name. This permits the values of constants to depend on previously declared objects, e.g.,

const N = 10
const TwoN = 2 * N

Statements and declarations can also be intermixed. This permits sizes of arrays to depend

on input values. For example, the statements and declarations

var n : int
write( 'number of integers?' )
read( n )
var nums[1:n] : int

dynamically allocate the array nums after n is read. This example uses the predefined

implementation-specific operations 'read' and 'write' to read or write their arguments (see

Sec. 3.6).

The general ordering rule in SR is "declare before use"; i.e., an object must be

declared before being referenced. There is, however, one important exception to this rule.

Globals and specifications can be compiled in any order, except that a specification must be

compiled after any component whose objects are referenced in the specification itself. That

is, a specification can reference the name of an as-yet uncompiled component but may not


reference an object declared in that component. This exception exists to permit resources to employ each other's operations (or to create each other), which is needed in many distributed programs.

The following two resource specifications illustrate the exception to the declare-before-use rule.

spec A
  import B
  type t = rec( t1, t2: int )
  op f( x: cap B )
resource A()
separate

spec B
  import A
  type u = rec( u1: char )
  op g( y: A.t )
resource B()
separate

Assume that A is compiled before B; this is an exception to the declare-before-use rule because B has not yet been compiled when A is being compiled. As shown, A can import B and use B (e.g., to declare parameter x of operation f). However, A cannot use objects declared in B, such as u, because A is compiled before B. On the other hand, B can use A and the objects declared in A, such as t. Sec. 4.6 presents a more realistic example of a program that requires this exception.

Compilation of resource bodies does abide by the general declare-before-use rule. Specifically, a resource body must be compiled after its resource specification and after all global components or resource specifications that its specification imports. Furthermore, if a global component or specification is changed after having been compiled, it must be recompiled. Also, any other component that imports the changed component and references objects it contains must be recompiled; viz., if a resource's specification is changed, its body must be recompiled.


The following interface part of a resource defines a queue of integers.2

spec Queue
  op insert( item: int )
  op remove() returns item: int
resource Queue( size: int )
separate

Queue exports two operations and is parameterized by the size of the queue. The implementation part of Queue is:

body Queue
  var store[0:size-1] : int
  var front, rear, count : int := 0, 0, 0

  proc insert( item )
    if count < size -> store[rear] := item; rear := (rear+1)%size; count++
    [] else -> # take actions appropriate for overflow
    fi
  end

  proc remove() returns item
    if count > 0 -> item := store[front]; front := (front+1)%size; count--
    [] else -> # take actions appropriate for underflow
    fi
  end
end

The implementation of Queue employs the common array-based representation. Note that

the structure of this solution is very similar to the structure one would find in languages

such as Modula-2 [Wirt82], Euclid [Lamp77], and Ada [Ada83]. The actual forms of

declarations and statements are different, however, for reasons mostly having to do with

our goal of integrating the sequential and concurrent mechanisms of SR.
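To underline how conventional this structure is, the same array-based representation can be rendered in a sequential language such as Python (a hypothetical translation, not part of the dissertation's SR code):

```python
class BoundedQueue:
    """Fixed store with front/rear indices and an element count,
    mirroring the array-based representation used by resource Queue."""
    def __init__(self, size):
        self.store = [None] * size
        self.size = size
        self.front = self.rear = self.count = 0

    def insert(self, item):
        if self.count < self.size:
            self.store[self.rear] = item
            self.rear = (self.rear + 1) % self.size
            self.count += 1
        else:
            raise OverflowError("queue full")      # overflow action

    def remove(self):
        if self.count > 0:
            item = self.store[self.front]
            self.front = (self.front + 1) % self.size
            self.count -= 1
            return item
        raise IndexError("queue empty")            # underflow action

q = BoundedQueue(2)
q.insert(10)
q.insert(20)
first = q.remove()
q.insert(30)        # rear wraps around the fixed store
second = q.remove()
```

The logic is identical; what differs between the languages is only the surface form of declarations and statements, as the text observes.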

2 Semicolons are optional in SR; our convention is to use them as separators for declarations or statements that appear on the same line. Comments begin with '#' and terminate with the end of the line on which the comment begins or the next occurrence of '#'.


The interface and implementation parts of a resource may also be combined if the

specification contains no imports, or the imported globals and resource specifications have

already been compiled. For example, Queue can be written as:

spec Queue
  op insert( item: int )
  op remove() returns item: int
resource Queue( size: int )
  var store[0:size-1] : int
  var front, rear, count : int := 0, 0, 0

  proc insert( item )
    if count < size -> store[rear] := item; rear := (rear+1)%size; count++
    [] else -> # take actions appropriate for overflow
    fi
  end

  proc remove() returns item
    if count > 0 -> item := store[front]; front := (front+1)%size; count--
    [] else -> # take actions appropriate for underflow
    fi
  end
end

If the interface part of a resource does not declare any objects (i.e., constants, types, optype's, or operations) and does not import any components, then only the resource's name and parameterization must be specified. This is illustrated by the following simple resource.

resource Hello( count: int )
  initial
    fa i := 1 to count ->
      write( "Hello World" )
    af
  end
end

When created, Hello outputs its message count times. Note that Hello can still be imported

into other resources, which can create instances of it. Also note that simple, one-resource


programs typically have the above form.

Resource instances are created dynamically by means of the create statement.

Execution of

variable := create resource_identifier( arguments )

causes a new instance of the named resource to be created on the creator's machine. Arguments are passed by value to the new instance and then the resource's initialization code, if any, is executed (as a process). Execution of create terminates when the initialization code terminates or executes a reply statement (see Sec. 3.2.4). A capability for the instance is returned and assigned to variable. This capability can be used to invoke the operations exported by the resource, can be copied and passed to other resources, and can be used to

destroy the instance. For example, given

var qcap : cap Queue

which declares a capability for Queue, execution of

qcap := create Queue(20)

creates a 20 element Queue on the creator's machine and assigns to qcap a capability for it.

Subsequently, an item can be inserted into the queue by executing

qcap.insert( item )

or removed by executing

item := qcap.removeO

where item is of type int. Essentially qcap is a record, the fields of which are capabilities for the two operations exported by Queue. If it becomes appropriate to destroy this instance of Queue,


destroy qcap

can be executed. The destroy statement terminates after the resource's finalization code

(if any) terminates and space allocated to the resource has been freed.

An optional clause on the create statement is used to specify the machine on which the resource is to be created. (The default, as mentioned above, is the creator's machine.) For example, execution of

qcap := create Queue(40) on Barrel

creates a 40 element Queue on the machine named 'Barrel'. The general form of the on clause permits an arbitrary expression of type machine, which is a special, user-defined enumeration type. The machine type defines all possible machines in a given program. In a stand-alone implementation, each enumeration literal represents a physical machine; in an implementation built on top of an operating system, each enumeration literal defines a virtual machine that is mapped (after linking but before executing the program) to a physical machine (see Appendix B for more details). All resources in a program must be compiled using the same or no machine type. The machine type is typically defined in a global component and imported into each resource that needs it. Such a global component might look like

global Machines
  type machine = enum( Barrel, Cholla, Organpipe, Paloverde, Saguaro )
end

A resource capability variable can only be assigned capabilities for instances of

the resource specified in the variable's declaration; e.g., qcap in the above example can only

be assigned capabilities for instances of Queue. In addition, resource capability variables

can be assigned the special values null or noop. Destroying a null capability variable or

invoking an operation through a field in a null capability variable is an error, but doing


either of these for a noop capability variable has no effect. The initial value of each resource capability variable is null.

3.2. Operations and Communication Primitives

Resources are patterns for objects; operations are patterns for actions on objects.

Operations are declared in op declarations as illustrated in the Queue resource above.

(Arrays of operations are also supported.) Such declarations can appear in resource

specifications, within resources, or within processes. Operations are invoked by call or

send statements; they are serviced by proc's or in (input) statements. In the four possible

combinations, these primitives have the following effects:

Invocation   Service   Effect
call         proc      procedure call (possibly remote or recursive)
call         in        rendezvous
send         proc      dynamic process creation
send         in        message passing

In addition, semaphores can be simulated using send and in with parameter-less operations. This section shows how these effects are achieved.
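The semaphore simulation can be sketched outside SR. In the following Python analogue (illustrative only, not SR), each send deposits one pending invocation of a parameter-less operation, playing the role of a V, and each in-statement consumes one pending invocation, blocking if none exists, playing the role of a P:

```python
import queue

class ParameterlessOp:
    """A parameter-less operation serviced only by in statements:
    the count of pending invocations acts as a semaphore's value."""
    def __init__(self):
        self._pending = queue.Queue()

    def send(self):                  # V: deposit one pending invocation
        self._pending.put(None)

    def serve(self, timeout=None):   # P: consume one pending invocation
        self._pending.get(timeout=timeout)

mutex = ParameterlessOp()
mutex.send()          # initialize the "semaphore" to 1
mutex.serve()         # P: enter the critical section
# ... critical section ...
mutex.send()          # V: leave the critical section
mutex.serve()         # a later P succeeds without blocking
entered = True
```

The correspondence is exactly the one the SR run-time support exploits when it implements such operations directly as semaphores.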

3.2.1. Basic Invocation Statements

The invocation statements have the forms

call operation( arguments )
send operation( arguments )

where operation is a field of a resource capability, an operation capability (see below), or the name of an operation declared in the scope of the invocation statement. The keyword call is optional and is omitted when operations that have return values are used in expressions. Arguments are passed by value (val, the default), result (res), or value/result (var); which parameter passing mode is to be used for each parameter is specified as part


of an operation's declaration (see the sort example below). A call invocation terminates

when the operation has been serviced and results have been returned (or when a failure has

been detected as discussed in Sec. 3.3). A send invocation terminates when the arguments

have been delivered to the machine on which the resource that services the operation

resides. Thus, call is synchronous whereas send is semi-synchronous [Bern86].

We have chosen semi-synchronous rather than asynchronous semantics for send for two reasons. First, this is the semantics that is invariably implemented when the sender and receiver of an operation execute on the same machine. Second, semi-synchronous semantics provides the sender with assurance that adequate buffer space for the invocation exists and that the servicing resource exists and was reachable at the time the operation was invoked. Of course, the programmer still has no assurance that the invocation will be serviced.

By default, an operation may be invoked by either call or send, although result parameters and return values are only available to call invocations. It is possible to restrict invocation to just one of these by appending the operation restriction "{call}" or "{send}" to the operation's declaration.

SR provides capabilities for individual operations in addition to capabilities for entire resources. Operation capabilities provide a mechanism similar to procedure-valued variables and parameters, and can be used to support finer-grained control over the communication paths between resources (see Sec. 2.2.1). An operation capability variable can only be assigned capabilities for operations that have structurally the same parameterization and the same operation restriction. For example, the declarations

op quick_sort( var a[1:*]: int ) {call}
op bubble_sort( var b[1:*]: int ) {call}
var sort : cap ( var c[1:*]: int ) {call}


define two operations and an operation capability variable. (The '*' in the declarations means that the formal parameter inherits its size from the corresponding actual parameter; see Sec. 3.4.) Since the declaration of sort matches the declaration of both operations, sort can be assigned a capability for either of the operations, e.g.,

sort := bubble_sort

Subsequently, sort can be used to invoke the operation to which it is bound, e.g.,

call sort( x )

where x is an array of integers. Like resource capability variables, operation capability

variables can also be assigned the special values null or noop. Invoking a null capability

variable is an error, but invoking a noop capability variable has no effect. The initial

value of each operation capability variable is null.
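The null/noop distinction can be made concrete with a hypothetical Python rendering (the class names are inventions for illustration): invoking through null is an error, invoking through noop silently does nothing, and a bound capability behaves like the operation itself.

```python
class NullCap:
    """Invoking a null capability is an error."""
    def __call__(self, *args):
        raise RuntimeError("invocation through a null capability")

class NoopCap:
    """Invoking a noop capability has no effect."""
    def __call__(self, *args):
        return None

def bubble_sort(a):          # stand-in for an operation with a matching signature
    a.sort()

sort = NullCap()             # a capability's initial value is null
try:
    sort([3, 1, 2])
    null_error = False
except RuntimeError:
    null_error = True        # invoking null raised an error

data = [3, 1, 2]
sort = NoopCap()
sort(data)                   # no effect: data is left unchanged

sort = bubble_sort           # bind a real operation to the capability
sort(data)                   # now the invocation actually sorts
```

The noop value is convenient for disabling a communication path without guarding every invocation with a conditional.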

To simplify operation and capability declarations, SR includes an optype declaration. An optype declaration is to operations what a type declaration is to variables: it declares a pattern that will be used more than once. An optype specifies the types of the parameters and return value, and the operation restriction. For example, the declarations in the previous paragraph can be simplified by using an optype:

optype sort_form = ( var a[1:*]: int ) {call}
op quick_sort sort_form
op bubble_sort sort_form
var sort : cap sort_form

Larger examples illustrating the use of optype's appear in Chapter 4.

3.2.2. Servicing Operations

An operation is serviced either by a proc or by one or more in statements.3 A

3 An operation cannot be serviced by both a proc and in statements for reasons discussed in Sec. 6.3.3.


proc is a generalization of a procedure: it is declared like a procedure and may be called

like a procedure but has the semantics of a process. Its general form is

proc operation_identifier( formal_identifiers ) returns result_identifier
  block
end

where the operation identifier is the same as that in an op declaration in the resource containing the proc.4 Whenever that operation is invoked, an instance of the proc is created. This instance executes as a process and uses the formal and result identifiers (if any) to access the arguments of the invocation and to construct the result. If the operation was called, the caller waits for the instance to terminate (or reply); the effect is thus like a procedure call and is in fact implemented like a procedure call whenever possible (e.g., for recursive calls; see Sec. 5.3.3). However, if the operation was invoked by send, the sender and instance of the proc execute concurrently; the effect in this case is like forking a process. The Queue resource (Sec. 3.1) contains two proc's; further examples are shown

below.

The other way to service operations is to employ in statements, which have the general form

in operation_command [] ... [] operation_command ni

Each operation command is structurally like a proc except it may also contain a synchronization expression and a scheduling expression:

4 However, the formal and result identifiers in a proc or an in statement (see below) need not be the same as those in the corresponding op declaration. This is discussed further in Sec. 6.3.1.


operation_identifier( formal_identifiers ) returns result_identifier
  and synchronization_expression by scheduling_expression -> block

An in statement delays the executing process until some invocation is selectable; then the corresponding block is executed. An invocation is selectable if the boolean-valued synchronization expression in the corresponding operation command is true; the synchronization expression is optional and is implicitly true if omitted. If more than one operation command has selectable invocations, one operation command is non-deterministically chosen. From the chosen operation command, the oldest selectable invocation is serviced. This can be overridden by the use of by, which causes selectable invocations of the associated operation to be serviced in ascending order of the arithmetic scheduling expression following by. How access to invocations is controlled among processes that service the same operation is described in Sec. 3.2.3. Note that synchronization and scheduling expressions can reference parameters in invocations.

Two examples illustrate the input statement. First, the input statement

in query( key ) returns info ->
    # use key to find info
    info := ...
ni

services the single operation query and returns information based on its parameter. Of

course, since in is a statement, it must be contained in a proc (or initial or final code).

As a second, more complicated example, the following input statement uses both a syn­

chronization expression and a scheduling expression to allocate some object to its clients

giving preference to larger requests.

in request( size ) and free by -size -> free := false
[] release() -> free := true
ni


This input statement implements two operations, request and release. The object can only

be allocated when it is available, which is represented by the variable free and controlled

by the synchronization expression in the operation command for request. The size of a

request is given by its size parameter, which is used in the scheduling expression to give

preference to larger requests.
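As a rough illustration of these semantics, the following Python sketch models one arm of such an input statement: pending request invocations are held in a queue, the synchronization expression gates selection, and the scheduling expression by -size picks the largest pending request first. The Allocator class and its method names are invented for illustration; they are not part of SR or its runtime.

```python
class Allocator:
    """Single-process analogue of the allocator's in statement (illustrative)."""

    def __init__(self):
        self.free = True
        self.pending = []            # pending request(size) invocations, arrival order

    def request(self, size):         # analogue of: send request(size)
        self.pending.append(size)

    def release(self):               # analogue of: send release()
        self.free = True

    def service_one(self):
        """One attempted selection from the request arm of the in statement."""
        if self.free and self.pending:                    # synchronization expression
            best = min(self.pending, key=lambda s: -s)    # "by -size": largest first
            self.pending.remove(best)
            self.free = False
            return best
        return None                                       # no selectable invocation

alloc = Allocator()
for s in (3, 10, 7):
    alloc.request(s)
print(alloc.service_one())   # 10: the largest pending request is serviced first
print(alloc.service_one())   # None: not selectable again until release()
alloc.release()
print(alloc.service_one())   # 7
```

Note that the guard is re-evaluated at every selection attempt, mirroring the way an in statement delays until some invocation is selectable.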

A simple example will help clarify the mechanisms for invoking and servicing

operations. Following is a resource that implements a bounded buffer of integers.

spec BoundedBuffer
    op insert( item: int )
    op remove() returns item: int
resource BoundedBuffer( size: int )
    op bb()
    initial
        send bb()
    end
    proc bb()
        var store[0:size-1] : int
        var front, rear, count : int := 0, 0, 0
        do true ->
            in insert( item ) and count < size ->
                store[rear] := item; rear := (rear+1)%size; count++
            [] remove() returns item and count > 0 ->
                item := store[front]; front := (front+1)%size; count--
            ni
        od
    end
end

The interface part of BoundedBuffer is identical to that of Queue, which is appropriate

since a bounded buffer is just a synchronized queue. The implementation of BoundedBuffer

contains one proc, bb. Initially one instance of bb is activated (using send); that instance

executes as a process that repeatedly services invocations of insert and remove. Invocations

of insert can be selected as long as there is room in the buffer; invocations of remove can be

selected as long as the buffer is not empty. Note that insert and remove are serviced by a


single process and thus execute with mutual exclusion. Also note that the implementation

is not visible to resources that use instances of BoundedBuffer; the resource body could equally well have used a monitor-like implementation in which insert and remove are serviced by proc's and semaphores are used to synchronize these proc's (see below for how

semaphores can be simulated).
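The guarded arms of bb's input statement can be approximated in Python with a monitor: the synchronization expressions count < size and count > 0 become condition-variable waits. This is a hedged analogue of the SR code, not a translation of its runtime; the class and method names are illustrative.

```python
import threading

class BoundedBuffer:
    """Monitor-style analogue of the bb process (illustrative sketch)."""

    def __init__(self, size):
        self.store = [0] * size
        self.size = size
        self.front = self.rear = self.count = 0
        lock = threading.Lock()
        self.not_full = threading.Condition(lock)    # guard: count < size
        self.not_empty = threading.Condition(lock)   # guard: count > 0

    def insert(self, item):
        with self.not_full:
            while self.count == self.size:           # wait until guard holds
                self.not_full.wait()
            self.store[self.rear] = item
            self.rear = (self.rear + 1) % self.size
            self.count += 1
            self.not_empty.notify()

    def remove(self):
        with self.not_empty:
            while self.count == 0:                   # wait until guard holds
                self.not_empty.wait()
            item = self.store[self.front]
            self.front = (self.front + 1) % self.size
            self.count -= 1
            self.not_full.notify()
            return item

buf = BoundedBuffer(2)
out = []
producer = threading.Thread(target=lambda: [buf.insert(i) for i in range(5)])
consumer = threading.Thread(target=lambda: [out.append(buf.remove()) for _ in range(5)])
producer.start(); consumer.start()
producer.join(); consumer.join()
print(out)   # [0, 1, 2, 3, 4]: FIFO order is preserved
```

Because a single lock protects both conditions, insert and remove still execute with mutual exclusion, as they do when serviced by the single bb process.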

Resources often contain "worker" processes such as bb above. SR provides a pro­

cess declaration to simplify programming such processes. For example, the above resource

could be coded more compactly by deleting the declaration of the bb operation, deleting the

initialization code, and replacing the line

proc bbO

by

process bb

Process declarations are thus an abbreviation for the specific pattern that was employed in

BoundedBuffer. In particular, one instance of each process in a resource is created

automatically when an instance of the resource is created.

Another useful abbreviation is provided by the receive statement. In particular,

receive operation( v1, ..., vN )

is an abbreviation for an in statement that waits for an invocation of operation and then

assigns the values of the formal parameters to variables v1, ..., vN. Together with the

send form of invocation, receive supports asynchronous message passing in a familiar

way. Synchronous message passing is supported by receive together with the call form of

invocation. send and receive can also be used to simulate semaphores. For example,

given the declaration


op semaphore() {send}

the following statements simulate the semaphore P and V operations:

receive semaphore()    # P operation
send semaphore()       # V operation

This simulation can be used within resources to synchronize access to shared resource vari-

ables, as discussed in Sec. 2.2.2. In fact, operations declared and used in this way are

implemented just as if they were semaphores (see Sec. 5.3.3).
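This simulation can be mimicked in Python by treating the semaphore operation's invocation queue as a counting semaphore: V is "send" (enqueue a null message) and P is "receive" (dequeue, blocking if the queue is empty). The names below are illustrative, not SR's.

```python
import queue

sem = queue.Queue()   # the invocation queue of the semaphore() operation

def V():              # send semaphore(): deposit one "invocation" (a permit)
    sem.put(None)

def P():              # receive semaphore(): consume one, blocking if none pending
    sem.get()

V()                   # one permit available
P()                   # acquired immediately
print(sem.qsize())    # 0: no pending "invocations" remain
```

The number of pending invocations plays exactly the role of the semaphore's counter, which is why SR can implement such operations as semaphores.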

Note that operations can be declared within a proc or even within a block. The

lifetime of such an operation is the same as it is for a variable declared in a proc or a

block: a new instance is created when the operation's declaration is encountered and ceases

to exist when the proc or block containing the operation is exited. Because objects

declared within a block are visible only within that block, these operations are essentially

useless unless capabilities for them are passed outside the defining proc or block. (Little is

accomplished if a process sends itself a message; even less is accomplished if a process calls

itself!) The primary use for such local operations is to give a process its own communica­

tions channels. This is especially useful for programming conversations, as is illustrated in

Sec. 4.6.

3.2.3. Semantics of Shared Operations

An operation that is declared global to proc's and processes and is implemented

by input statements is called a resource operation. Such an operation may be implemented

by input statements in more than one process; e.g., operations that simulate semaphores

are implemented in more than one process. Thus, several processes in the same resource

instance might compete for invocations of a shared resource operation. To identify and

control possible conflicts in access to invocations by such competing processes, operations


are grouped into classes based on their appearance in input statements. We define over the

set of operations in a resource pattern the relation

same = { (a,b) | a and b are implemented by the same input statement }

We consider same to be reflexive; i.e., (a,a) is in same for each operation a. It is also obvi-

ously symmetric. The transitive closure of same, then, is an equivalence relation, the

equivalence classes of which are our classes of operations. Note that operations declared

within a proc or a block are also included in the domain of same; such an operation may

be in the same class as a resource operation and, if so, may consequently be in the same

class as an operation declared within another proc or block. Also note that the only way

an operation is in a class by itself is if it is implemented only by input statements that

implement no other operation.

To illustrate classes, consider the following resource. (This resource is presented

for illustrative purposes only; although it is legal, it is useless.)

spec z
    op a()
resource z()
    op b(); op c(); op d()
    process p1
        op e(); op f()
        in b() -> ni
        in a() -> [] e() -> ni
        in d() -> ni
        in f() -> ni
    end
    process p2
        op g()
        in b() -> [] c() -> ni
        in a() -> [] g() -> ni
        in d() -> ni
    end
end


The four classes formed by the operations in z are {a,e,g}, {b,c}, {d}, and {f}. The class

{a,e,g} results from a and e being serviced by the same input statement, and a and g being serviced by the same input statement. That is, (a,e) and (a,g) are in same, and the only other pairs in same involving any of a, e, and g are (a,a), (e,e), and (g,g). The transitive closure of same therefore contains the class {a,e,g}. Note that the above input statements

employ empty blocks. These are sometimes useful when developing programs.
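The class construction is just an equivalence-class computation, which can be sketched with a union-find over the same relation; operations serviced by one input statement are merged. The groupings below transcribe the input statements of p1 and p2, and the function name is invented for illustration.

```python
def classes(ops, input_statements):
    """Union-find over the 'same' relation; returns the operation classes."""
    parent = {op: op for op in ops}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for group in input_statements:          # each group = ops of one in statement
        for op in group[1:]:
            parent[find(op)] = find(group[0])

    result = {}
    for op in ops:
        result.setdefault(find(op), set()).add(op)
    return sorted(result.values(), key=lambda s: sorted(s))

groups = [['b'], ['a', 'e'], ['d'], ['f'],   # input statements in p1
          ['b', 'c'], ['a', 'g'], ['d']]     # input statements in p2
print([sorted(c) for c in classes('abcdefg', groups)])
# [['a', 'e', 'g'], ['b', 'c'], ['d'], ['f']]
```

This reproduces the four classes {a,e,g}, {b,c}, {d}, and {f} computed in the text.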

The classes defined by same are static and can be determined at compile time

from the text of resource patterns. Classes also have a dynamic (run-time) aspect because

instances of resources are created dynamically. In particular, each resource instance has its

own set of classes, which contains instances of the static classes. This set's membership

varies during execution. When a resource instance is created, an instance of each class that

contains a resource operation is created; there will be only one instance of each such class in

the set. In the previous paragraph, for example, the instantiation of z results in three

classes being created: one for a, one for b and c, and one for d. When a proc (or process)

is instantiated or a block is entered, each of its local operations is assigned to a class that

already exists or that must be created. Continuing the previous example, each time p1 is

instantiated, the new instance of e is added to the class containing a and a new class for

the instance of f is created; each time an instance of p1 completes, e is removed from the

class containing a and f's class ceases to exist. Consequently, the operations in an instance

of a class can change during execution and can contain several instances of the same named

operation; e.g., the class containing a contains all instances of e. Also, the number of

classes for an instance of a resource can be unbounded; e.g., an instance of the class for f is

created for each instantiation of p1.

Static classes therefore define patterns for dynamic classes that can exist during

execution and determine what operations can possibly be in the same dynamic class. In the


remainder of this dissertation, the unqualified term class refers to a dynamic class.

At most one process at a time is allowed to access the queues that contain invoca­

tions of operations in a given class. That is, for a given class, at most one process at a time

can be selecting an invocation to service or appending a new invocation. Access to the

invocations in a class is assumed to be fair in the sense that processes are given access in a

first-come/first-served (FCFS) order; moreover, processes are given access to new invoca­

tions in a FCFS order. Thus, a process waiting to access the invocations will eventually

obtain access as long as all functions in synchronization and scheduling expressions in

input statements eventually terminate.

3.2.4. Additional Communication Primitives

Two additional communication primitives-concurrent invocation and early

reply-were advocated in Sec. 2.2.3. SR provides these primitives and in addition a return

primitive, which generalizes the return statement found in most procedure-based

languages. All are useful, simple, and efficient.

SR's return and reply statements provide flexibility in servicing invocations.

Execution of return causes the smallest enclosing in statement or proc to terminate early.

If the invocation being serviced was called, the corresponding call statement also ter­

minates and results are returned to the caller. Execution of reply causes the call invoca­

tion being serviced in the smallest enclosing in statement or proc to terminate.5 In con­

trast to return, however, the process executing reply continues with the next statement.

An important use of reply is to allow a proc to transmit return values to its caller yet

continue to exist and execute after replying. This facilitates programming conversations,

5 Execution of reply has no effect for operations that are invoked by send.


as described in Sec. 2.3.3. An example of a conversation is presented in Sec. 4.6.

The final communication primitive is the co statement, which supports con-

current invocations. The form of a co statement is

co concurrent_invocation -> post_processing // ... // concurrent_invocation -> post_processing oc

A concurrent invocation is a call or send statement, or an assignment statement that con-

tains only a single invocation of a user-defined function. The post-processing blocks are

optional. Execution of co first starts all invocations. Then, as each invocation terminates,

the corresponding post-processing block is executed (if there is one); post-processing blocks are executed one at a time. Execution of co terminates when all invocations and post-processing blocks have terminated or when some post-processing block executes exit.

Note that concurrent send invocations can have post-processing blocks. Since

send invocations terminate immediately, their post-processing blocks will be executed

immediately after all invocations are started. Thus, such post-processing blocks generally

serve no purpose; their code could be placed before the co statement. An important excep-

tion is that if a send invocation fails, its post-processing block could detect and handle the

failure.

If a co statement terminates before all its invocations have terminated, uncom-

pleted invocations are not terminated prematurely. This is because such an invocation

could be being serviced, in which case terminating it could put the server process in an

unpredictable state. Also, it is sometimes useful to have uncompleted invocations get ser-

viced even after co terminates; for example, all reachable copies of a replicated database

should be updated even if the updater terminates after only a majority of the copies have

been updated.


As an example, the simple co statement

co lsum := left.sum( ... ) // rsum := right.sum( ... ) oc

invokes two operations concurrently and assigns the result of each invocation to variables

that can be used after it completes. Neither concurrent command has a post-processing

block.

A concurrent invocation can also be preceded by a quantifier, which implicitly

declares a bound variable and specifies a range of values for that variable.6 Within co, a

quantifier provides a compact notation for specifying multiple invocation/post-processing

pairs. For example, a replicated file might be updated by executing

co (i := 1 to N) call file[i].update(values) oc

where file is an array of capabilities containing one entry for each file resource. Similarly,

reading a replicated file, terminating when one copy has been read might be programmed

as

co (i := 1 to N) call file[i].read(arguments) -> exit oc

In both cases, the quantifier's bound variable i is accessible in both the invocation state­

ment and post-processing block. Thus, the last example could be modified to record which

of the file[i] was the first to respond by saving the value of i in a variable global to co

before executing exit.

6 Quantifiers can also be used within in statements to facilitate servicing elements of an array of operations, and they are the basis for one of the iterative statements discussed in Sec. 3.4.
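The quantified co statement maps naturally onto Python futures: all N invocations are started, then each post-processing step runs (one at a time) as its invocation completes. The replica "files" below are stand-in dictionaries; names and the update function are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

N = 4
replicas = [dict(version=0) for _ in range(N)]   # stand-ins for file[1..N]

def update(i, values):
    """Stand-in for file[i].update(values)."""
    replicas[i]["version"] = values
    return i

# Analogue of: co (i := 1 to N) call file[i].update(values) oc
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(update, i, 7) for i in range(N)]
    for f in as_completed(futures):   # post-processing runs per completion,
        pass                          # one at a time; none needed here

print([r["version"] for r in replicas])   # [7, 7, 7, 7]
```

The read-one-copy variant corresponds to breaking out of the as_completed loop after the first result, the futures analogue of a post-processing block executing exit; unlike SR's co, however, Python gives no guarantee about what happens to still-running workers when the pool is shut down early.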


3.3. Failure Handling

As described in Sec. 2.3, any language for programming distributed systems must

contain mechanisms for dealing with failures such as processor and network crashes. SR's

middle-ground approach provides two specific mechanisms for detecting such failures.

First, invocation and resource control statements (call, send, create, and destroy) return

an implicit completion status. All these statements employ capabilities so the completion

status is stored in a hidden field in the capability that is used; the predefined function

'status' can subsequently be used to examine the value. The possible status values are:

Success      statement terminated normally
Crash        statement cannot finish due to processor or network failure
NoSpace      insufficient space to create resource or proc, or to store invocation
Terminated   resource or operation server no longer exists
Undefined    initial value and value while statement is executing7

For example, a call invocation might use 'status' as follows.

call read( ... )
if status( read ) != Success ->
    # handle the failure in some way
fi

Here, if the call to read does not succeed, the body of the if statement can do something to

handle the failure; e.g., it might invoke another read operation located on a different processor.
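A hedged Python analogue of this implicit completion status is a call wrapper that records Success or Crash in a status field instead of raising an exception. The status names mirror the table above, but the Capability class and its use of OSError as a stand-in for processor or network failure are invented for illustration.

```python
Success, Crash, Undefined = "Success", "Crash", "Undefined"

class Capability:
    """Illustrative wrapper: a callable plus a hidden completion-status field."""

    def __init__(self, fn):
        self.fn = fn
        self.status = Undefined      # initial value, as in SR

    def call(self, *args):
        try:
            result = self.fn(*args)
            self.status = Success
            return result
        except OSError:              # stand-in for processor/network failure
            self.status = Crash
            return None

read = Capability(lambda: "data")
read.call()
if read.status != Success:
    pass                             # handle the failure in some way
print(read.status)                   # Success
```

As in SR, the status is examined after the fact rather than propagated as an exception, so the caller decides whether and how to recover.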

The second mechanism is a predefined boolean function, 'failed'. The argument

to 'failed' is either a capability or a machine name. Invocation of 'failed' returns true if the

7 The 'Undefined' value is visible to the programmer for concurrent invocations that have not yet completed.


resource, operation, or machine indicated by its argument does not exist or is presumed to

have crashed or to be unreachable; otherwise 'failed' returns false. A process can invoke

'failed' at any time to test the status of a resource, operation, or machine.

A process can also use 'failed' within in statements. This is done by including

exception commands of the form

failed(argument) -+ block

An exception command is selectable when the value of 'failed' is true. This use of 'failed'

enables a process to be informed when a failure occurs. One use is to avoid waiting forever

for an operation to be invoked. For example, a file server might be written as

do true ->
    in read( ... )  -> ...
    [] write( ... ) -> ...
    [] seek( ... )  -> ...
    [] close( ... ) -> exit
    [] failed(Barrel) -> exit
    ni
od

to ensure that it terminates should its client fail. We assume for illustrative purposes that

the server's client is located on the machine named 'Barrel'. In a more realistic program,

the server would need to determine the location of its client at run-time. The 'invoker'

predefined function is used for such a purpose; 'invoker' takes no arguments and returns a

value of type machine. Inside a proc or an in, 'invoker' returns the machine on which the

invoker of the operation resides; inside initial or final code, 'invoker' returns the machine

on which the creator or destroyer, respectively, resides. In the above example, if the proc

containing the in statement was created directly by the client, the server would use

'invoker' (before the do statement) to determine the client's machine. On the other hand,

if the proc was created indirectly, say through a file manager, the proc's creator would


use 'invoker' and pass its result as a parameter to the proc.

3.4. Types, Declarations, and Sequential Statements

SR provides a variety of data types, operators, and sequential statements. The

kinds that are provided are those found in conventional languages. However, their form

and many of their details are different. Largely this is to facilitate the integration of the

sequential and concurrent components of SR. It also results from our desire to make it

easy to program commonly occurring algorithmic patterns.

There are five builtin types: bool, char, int, real, and string. In addition,

users can define enumeration types, records, pointers, capabilities, and unions. Arrays are

not provided as type constructors; instead arrays are introduced with variable and opera­

tion declarations, as shown earlier in both the Queue and BoundedBuffer examples. In this

respect SR is more like C than Pascal. The result is a great deal more flexibility than is

provided by Pascal-like languages. In particular, one array can be assigned to another as

long as they have the same base type and number of components even if they have

different subscript types. For example, the three arrays

var al[O:4] : int var a2['f':'j']: int type color = enum( red, green, blue, orange, yellow, magenta, brown) var a9[green:magenta]: int

can be assigned to one another because each is an array of five integers (see Sec. 3.5 for

more details about type checking). Additional examples are given below; SR's treatment of

arrays is discussed in detail in [Andr82a].

SR allows type declarations to indicate how objects of the type can be used. The

default is public, which imposes no restriction. The other option is private. Use of a

private type is unrestricted in the component in which it is declared. However, in other


components, objects of private type can only appear in declarations and expressions. In

particular, such objects cannot be assigned to. This allows a resource to export a type,

while ensuring that only instances of the resource can alter objects of that type.

Most expressions in SR compute a single value. SR provides a variety of opera­

tors for use in expressions. In addition to the standard arithmetic, boolean, and relational

operators, SR provides a concatenation operator and the unary operator '?'. The concate-

nation operator is described later in this section. The '?' operator returns the number of

pending invocations of an operation;8 it can only be used within the scope of the operation.

One use of the '?' operator is to impose an order in which to service invocations in an in

statement. For example, the in statement

in a() and ?b = 0 -> ...
[] b() -> ...
ni

will select an invocation of a only if there are no pending invocations of b.

SR also provides a number of predefined functions that may be used as operands

in expressions. Many of these functions are defined for any ordered type, i.e., bool, char,

int, or a user-declared enumeration type. The following list summarizes the predefined functions; details appear in [Andr85].

• absolute value of an integer or real expression.

8 The value returned by'?' is the number of pending invocations at the time '?' is evaluated. Between the time of this evaluation and the time the returned value is used, the actual number of pending invocations might change due to the arrival of new invoca­tions or the selection of invocations by other processes, which can happen for a shared resource operation.


• minimum and maximum values of a list of values of an ordered type.

• smallest and largest value in an ordered type; predecessor and successor of a value in an

ordered type.

• lower and upper bounds of an array; length of an array or string.

• allocation and deallocation of memory for user-defined data structures; number of bytes

in an object; binary representation of an object.

• resource capability for the executing resource; machine on which an invocation ori­

ginated; completion status of a capability; detection of failure of a resource or machine.

In addition to these predefined functions, one conversion function is implicitly associated

with each user-defined enumeration type. In particular,

type_identifier( x )

converts x to the named enumeration type (i.e., type_identifier), where x is a value of an

ordered type and the result is within the range of values of the named type. Enumeration

literals are implicitly numbered 0 up to one less than the number of literals in the type.

Similar conversion functions exist for the builtin types; e.g., bool( x), where x is an integer

variable, is true if x is non-zero and false if x is zero.

Constructors compute lists of values. An array constructor computes a list of

values of the same type; a record constructor computes a list of values of possibly different

types. Constructors are used in declarations of array or record variables to initialize such

variables; they are also used in assignments to array or record variables, or to formal

parameters. For example, the declaration

var x[1:10]: char := ([4] 'a', 'b', [5] 'c')

declares the ten element character array x. An array constructor is used to initialize x's


first four elements to 'a', fifth element to 'b', and remaining five elements to 'c'. As

another example, the declarations

type point = rec( x, y: int; label: char )
var p: point := rec( 300, -40, '8' )

declares the record type point and the variable p of that type. A record constructor is used

to initialize p; specifically, p.x is initialized to 300, p.y to -40, and p.label to '8'. Array and record constructors can also be used in expressions; e.g., the assignment statement

line := ( [profits] '$', [N-profits] ' ' )

assigns dollar signs to the first profits elements of line and blanks to the remaining ele­

ments. (The array line is assumed to have N elements, and profits is assumed to be

between 0 and N.)
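SR's array constructor expands replicated elements such as [4] 'a' into runs of identical values; a direct Python counterpart uses list repetition and concatenation. The values chosen for N and profits below are illustrative.

```python
# x[1:10] : char := ([4] 'a', 'b', [5] 'c')
x = ['a'] * 4 + ['b'] + ['c'] * 5
print(''.join(x))        # aaaabccccc

# line := ( [profits] '$', [N-profits] ' ' ), with illustrative N and profits
N, profits = 8, 3
line = ['$'] * profits + [' '] * (N - profits)
print(''.join(line))     # '$$$' followed by five blanks
```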

SR supports the declaration of variables, constants, types, optype's, and operations. Variable declarations can include initialization expressions, as shown above. Constant declarations define symbolic names for expressions. SR's constants are like read-only variables; they may reference previously declared objects and may even have values that

are not computable until run-time. This is illustrated in the following proc.

proc differ( arr1, arr2 ) returns place
    const len = min( ub(arr1), ub(arr2) )
end

Here, the constant len is initialized to the smaller of the upper bounds of the array parameters arr1 and arr2.

SR's sequential statements include ones for assignment, alternation, and iteration. The basic assignment statement is the multiple assignment; the conventional single

assignment is merely a special case. Additional special cases are single-variable increment

and decrement, examples of which appear in Queue and BoundedBuffer. Arbitrary-sized


slices of arrays as well as entire arrays can be assigned. For example, if matrix has N

columns numbered 1 through N, the multiple assignment statement

matrix[i,1:N], matrix[j,1:N] := matrix[j,1:N], matrix[i,1:N]

swaps its i'th and j'th rows using array slices. Also, one of the bounds of the source or tar­

get array in an assignment can be unspecified; e.g.,

arr[1:*] := 'this is a string'

assigns the indicated string to character array arr beginning at subscript 1 (assuming arr is

large enough to hold the string). Formal parameters and results in operation declarations

can also employ '*' in a similar way to permit variable-size arguments or return values.

For example, in the operation declaration

op differ( arr1[1:*], arr2[1:*]: char ) returns place: int

the upper bounds of arr1 and arr2 are determined by the corresponding arguments in the

invocation. (As discussed earlier, the subscripts on the actual arguments are not required

to be integers or have a lower bound of one.) Similarly, in the operation declaration

op fill( filler: char ) returns filled[1:*]: char

filled's upper bound is the size of the target of the assignment resulting from the function

invocation of fill; e.g., for the invocation

line[3:8] := fill('a')

filled's upper bound is six.
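Python offers close counterparts of these slice assignments, assuming 0-based lists in place of SR's arbitrary subscript ranges. The simultaneous row swap maps directly onto tuple assignment, and a slice target whose length comes from the source plays the role of the '*' bound.

```python
# matrix[i,1:N], matrix[j,1:N] := matrix[j,1:N], matrix[i,1:N]
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
i, j = 0, 2
matrix[i], matrix[j] = matrix[j], matrix[i]   # simultaneous swap of rows i and j
print(matrix[0])                              # [7, 8, 9]

# arr[1:*] := 'this is a string' -- the slice length comes from the source
arr = [' '] * 20
s = 'this is a string'
arr[0:len(s)] = list(s)
print(''.join(arr).rstrip())                  # this is a string
```

The simultaneous form matters for the swap: both right-hand slices are evaluated before either assignment takes effect, just as in SR's multiple assignment.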

The concatenation operator, '||', takes two vectors and returns the result of

appending the second to the first. For example,

a[1:6] := 'abc' || 'def'

assigns the string 'abcdef' to elements one through six of the character array a and


b[1:10] := b[2:10] || b[1]

left shifts elements one through ten of the array b. Note that the operands of concatenation can have any type; they are not limited to be only strings or arrays of characters as is

required by many other languages.
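SR's vector concatenation corresponds to list (or string) concatenation in Python, and the left shift of b is slicing plus appending the displaced element; 0-based Python slices stand in for SR's 1-based ranges here.

```python
# a[1:6] := 'abc' || 'def'
a = [None] * 6
a[0:6] = list('abc') + list('def')
print(''.join(a))         # abcdef

# b[1:10] := b[2:10] || b[1] -- a left rotation by one
b = list(range(1, 11))
b = b[1:] + b[:1]
print(b)                  # [2, 3, 4, 5, 6, 7, 8, 9, 10, 1]
```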

The builtin type string provides variable length strings; they are similar to vary-

ing strings in PL/I [IBM66]. The declaration of a string variable specifies its maximum

length. For example,

var weather: string(10)

declares the variable weather, which can hold at most 10 characters. An implicit length is

associated with each string variable. It is set automatically as part of assignment to the

variable; its current value is obtained using the predefined function 'length'. The value of

the string variable s is the string in its first 'length(s)' positions. The statements

weather := 'abcdefghij'
weather := 'cloudy'
write( weather, length(weather) )

output 'cloudy' and 6, for example. Substrings can be extracted from or assigned to

strings using slices. For example, the statements

weather := 'cloudy'
write( weather[2:5] )
weather[1], weather[4:5] := 's', 'pp'
write( weather )

output 'loud' and 'sloppy'. String variables are like character arrays; the difference is that

the length of a character array is fixed, whereas the length of a string variable varies and is

automatically maintained. Strings and character arrays can be assigned to one another.

Strings can also be declared as formal parameters with unspecified size (i.e., '*'). The

actual maximum size of such a string is determined by the maximum size of the


corresponding argument in the invocation.
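The weather example carries over to Python almost directly, once SR's 1-based, inclusive slices are shifted to Python's 0-based, exclusive ones: SR's weather[2:5] is Python's weather[1:5]. Because Python strings are immutable, the in-place character assignments are done through a list here.

```python
weather = 'cloudy'
print(weather[1:5])        # loud  (SR: weather[2:5])

chars = list(weather)
chars[0] = 's'             # SR: weather[1] := 's'
chars[3:5] = 'pp'          # SR: weather[4:5] := 'pp'
print(''.join(chars))      # sloppy
```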

The single conditional statement is the if statement, which is similar to Dijkstra's

guarded-command if [Dijk75]. We provide only this one conditional statement since it

supports all of if-then, if-then-else, and case. Also, its structure and non-deterministic

semantics are similar to those of the in statement. The general form of if is

if boolean_expression -> block [] ... [] boolean_expression -> block fi

Execution of if selects one boolean expression that is true then executes the corresponding

block. Selection is non-deterministic if more than one boolean expression is true. If no

boolean expression is true, if has no effect (unlike Dijkstra's if, which aborts in this situation). We allow the last component of if to have the form

    else -> block

where else is interpreted as the conjunction of the negations of the other boolean expressions.
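As a small illustrative example of our own (not from the text), a three-way sign test can be written with two guards and an else:

    if x > 0  -> sign := 1
    [] x < 0  -> sign := -1
    [] else   -> sign := 0
    fi

When x is 0, neither boolean expression is true, so the else block is selected.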

Two iteration statements are provided: do and fa (for-all). The do statement is

similar to Dijkstra's do, again to be compatible with the forms of in and if. Like the if

and in statements, the last component of a do statement can be an else block that is executed if no boolean expression is true. Note that a do statement with an else block will

loop infinitely unless a return or exit statement (see below) is executed in one of its

blocks. As an example of the do statement, the program fragment

    X, Y := x, y
    do X < Y -> Y := Y - X
    [] Y < X -> X := X - Y
    od
    gcd := X

assigns to gcd the greatest common divisor of positive-valued x and y.



The fa statement is unique to SR, although it is similar to for-like iterative statements in other languages. The form of fa is

    fa quantifier, ..., quantifier -> block af

where quantifiers have the general form

    bound_variable := initial_expression direction final_expression st boolean_expression

Each quantifier implicitly declares a new bound variable whose type is derived from the initial and final expressions. The scope of this variable is from the point of its implicit

declaration to the end of the for-all statement. The quantifier specifies that the bound

variable is to range from the specified initial value up to or down to the specified final value

according to whether the specified direction is to or downto, respectively. The bound

variable usually takes on all values in the range; if the optional such-that clause (st) is

present, the bound variable only takes on those values for which the expression following

st evaluates to true. Thus, the block within a fa statement is executed for each combination of values of the bound variables (or until an exit statement is executed; see below).

Note that the right-most bound variable varies the most rapidly, then the next to right-most, etc.

For example, the single fa statement

    fa i := lb(a) to ub(a)-1,
       j := i+1 to ub(a) st a[i] > a[j] ->
         a[i], a[j] := a[j], a[i]
    af

sorts array a into ascending order. Each of the two quantifiers in this statement employs

the predefined array lower-bound ('lb') and upper-bound ('ub') functions. Note how the

range of values for the second bound variable j depends on i. Also note the use of st, which limits execution of the body of fa to those values of i and j for which



a[i] > a[j].

The two final sequential statements are exit and next. The exit statement is

used to force early termination of the smallest enclosing iterative or co statement. The

next statement is used to force return to the loop control of the smallest enclosing iterative

statement. It can also be used within a post-processing block in a co statement to leave

that block and continue execution by waiting for the next invocation (if any) to complete.

Note how these statements, like quantifiers, have consistent uses in support of both

sequential and concurrent programming.
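The following sketch of exit and next working together is ours, not an example from the text; the variable names are illustrative. The loop sums non-negative input values and stops at end of input:

    var x, sum : int := 0, 0
    do true ->
      if read(x) ≠ 1 -> exit fi   # end of input: leave the do loop
      if x < 0 -> next fi          # skip this value; back to loop control
      sum := sum + x
    od

Here exit terminates the smallest enclosing do, while next abandons the rest of the block and returns to the loop control.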

3.5. Signatures and Type Checking

Type checking in SR is based on what is called structural equivalence. An object

can be assigned to another object provided they have the same structure. Two objects

have the same structure if they contain the same number of components and each component has the same type or the same structure. Two resource capabilities have the same

structure if they are capabilities for the same named resource pattern.

For type checking, each object or expression has a signature with the general form

[ size] definition

The size field gives the number of elements. For scalar variables or values, this is '[1]'. For

arrays or array slices of fixed size, this is the total number of elements. For arrays or array

slices of arbitrary size, the size field is '[*]'. The definition field specifies the type structure

of the object. This is derived from the declaration itself if the type is anonymous or built-in; it is derived from the type declaration for user-defined types.

An expression can be assigned to a target variable, formal parameter, or return

value if their signatures are compatible. Expression and target signatures are compatible if:



(1) their sizes are the same, or exactly one is [*]; and

(2) their definitions are identical.

If the signatures are compatible, and all explicit or implicit array subscripts are within range and have the required subscript type, an assignment copies size elements from the expression to the target. Assignments from vectors to matrices are performed in row-major order. Note that whether two signatures have identical definition parts can always be determined at compile time, but whether their sizes are identical and their subscripts within range sometimes cannot be determined until run-time.

To illustrate signatures and type checking, consider the following declarations.

    var i : int
    var a1[0:4] : int
    var a2['f':'j'] : int
    type color = enum( red, green, blue, orange, yellow, magenta, brown )
    var a9[green:magenta] : int
    type one = rec( a1, a2 : int; a9 : char )
    var r1 : one
    var r2 : rec( x, y : int; z : char )

The signatures of the above variables and some of their uses are:

    Object        Signature
    i             [1] int
    a1            [5] int
    a2            [5] int
    a9            [5] int
    a1[2:3]       [2] int
    a2['g':*]     [*] int
    a9[orange]    [1] int
    r1            [1] rec( int; int; char )
    r2            [1] rec( int; int; char )

Note that only the type and the number of elements in an array or a slice are included in signatures; the types of subscripts are not. Also note that only the structure is included in



the signature of a record; the names of the record and its fields are not. Based on the

above signatures, the following assignments are legal and have the indicated effects.

    a2['g':*] := i          # a2['g'] is assigned i.
    a2['g':*] := a1[2:3]    # a2['g':'h'] is assigned a1[2:3].
    i := a9[orange]         # i is assigned a9[orange].
    a1 := a2                # the entire array a2 is copied into a1.
    r1 := r2                # the entire record r2 is copied into r1.

On the other hand, the following assignments are illegal for the reasons cited.

    i := a1[2:3]            # sizes are different.
    a2['g':*] := a1[2:*]    # only one '*' is allowed.
    a2['g':*] := a1         # subscripts on a2 are out of range.
    a9[i] := 3              # wrong subscript type.

Finally, the legality of the following assignments must be determined at run-time for the

reasons cited.

    a1[i] := 3              # i must be between 0 and 4.
    a2['g':'h'] := a1[i:*]  # i must be between 0 and 3.

3.6. Implementation Specific Mechanisms

The language mechanisms defined in the previous sections are assumed to exist in

any implementation of SR. In addition, each implementation must provide some means

for communicating with input/output devices. There are two essentially different kinds of

implementations and hence two different collections of input/output mechanisms.

Our current implementation is built on top of UNIX: SR programs run as UNIX

processes on one or more interconnected machines. In this or any similar implementation,

input and output are supported by a collection of predefined functions that provide access

to the underlying file system(s).



Ultimately, SR programs are intended to be able to run stand-alone on a network

of interconnected processors. In this environment there is no resident operating system, so

input and output facilities must be programmed in SR. This is supported by a variant of

resources called real resources.

The remainder of this section describes these two collections of input/output

mechanisms. Note, however, that except for the way in which input and output are programmed, any SR program can run without change in either kind of implementation.

3.6.1. Input/Output in the UNIX Implementation

The I/O mechanisms in the UNIX implementation of SR essentially make the

underlying UNIX I/O mechanisms available to the SR programmer. I/O in this or any similar implementation is supported by an additional data type, file, and several operations on

this type. A variable of type file contains a descriptor for a UNIX file; it can be thought of

as being a capability for the associated file. There are five predefined file literals, all of which

are reserved words. Three of the literals name the corresponding UNIX files:

    stdin     the standard input device (usually the keyboard)
    stdout    the standard output device (usually the display)
    stderr    the standard error device (usually the display)

The other file literals are null and noop. Attempting to access a file whose descriptor has

value null results in an error; null is the initial value of variables of type file. Attempting

to access a file whose descriptor has value noop has no effect.
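One plausible use of noop, offered as our own sketch rather than something prescribed by the text, is optional tracing: because writes to a noop file are silently ignored, the tracing calls themselves need no guards. The file name 'trace.out' and the flag debugging are hypothetical:

    var trace : file := noop               # writes to trace are ignored by default
    if debugging ->
      if not open('trace.out', WRITE, trace) -> trace := noop fi
    fi
    write( trace, 'starting phase 1' )     # has no effect while trace is noop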

In addition to the basic type file, the predefined enumeration type

    type accessmode = enum( READ, WRITE, READWRITE )

defines enumeration literals used as arguments to the file open operation. Finally, there are

two predefined integer constants:


    const EOF = -1
    const ERR = -2

These are used as return values on file access operations, as described below.


Files are opened, closed, created, and removed by the following predefined functions, all of which return a boolean value.

open( pathname[1:*] : char; mode : accessmode; res f : file ) returns s : bool
    If mode is 'READ', open an existing file for reading. If mode is 'WRITE', create a new file for writing, or truncate an existing file. If mode is 'READWRITE', open an existing file for both reading and writing. In all cases, the read/write pointer starts at the beginning of the file. The parameter pathname is the absolute or relative pathname of the file to be opened. If successful, open sets f to the appropriate file descriptor and returns true; if unsuccessful, open sets f to null and returns false.

close( f : file ) returns s : bool
    Close file f, which should have been open. Returns true if the file can be successfully closed; otherwise returns false. Open files are implicitly closed when a program terminates.

remove( pathname[1:*] : char ) returns s : bool
    Remove file pathname from the file system. The file should not be open; if it is, 'remove' will not take effect until the file is closed or the program terminates. Returns true if remove is successful; otherwise returns false.

Open files can be accessed by predefined operations for formatted and stream

I/O, and for read/write pointer manipulation (e.g., seek).9 Below, we give sketches of only

the predefined operations for formatted I/O; full descriptions of these and the other I/O operations are given in [Andr85]. 'Read' and 'write' provide a simple, formatted I/O facility. Each of these operations takes a variable number of arguments; each argument

9 The three files stdin, stdout, and stderr are already open when execution of an SR program begins.


(except for the optional file argument) must have type bool, char, int, real, or string.

read( v1 : T; ...; vN : T ) returns cnt : int
read( f : file; v1 : T; ...; vN : T ) returns cnt : int
    Read N literals from stdin (the default) or from file f and store them in variables v1, ..., vN. The literals must be separated by whitespace. 'Read' returns the number of literals successfully read, which might be 0 in the event of an error reading the first literal. 'Read' returns 'EOF' if the end of file is reached before any literals are found. 'Read' returns 'ERR' if f is not open for reading.

write( e1 : T; ...; eN : T ) returns cnt : int
write( f : file; e1 : T; ...; eN : T ) returns cnt : int
    Write N expressions to stdout (the default) or to file f, converting them to their literal ASCII form. One blank is inserted between each output value; a newline is appended after the last output value. 'Write' returns the number of expressions that were successfully written, or 'ERR' if f is not open for writing.


In addition to these operations on files, two predefined operations provide access

to the arguments on the command-line that invoked execution of the SR program. These

can be used, for example, to parameterize a program.

numargs() returns cnt : int
    'Numargs' returns the number of arguments on the command line. As in C, the command name (argument 0) is counted.

getarg( n : int; arg : T ) returns cnt : int
    Read the n'th argument into arg. As for 'read', the argument must be a literal of type bool, char, int, real, or string. If successful, 'getarg' returns the number of characters contained in the argument. 'Getarg' returns 0 if the conversion cannot be performed or 'EOF' if there is no n'th argument.

Simple uses of the predefined operations 'read' and 'write' have been shown earlier in this chapter. The following resource illustrates how a file can be opened and written,

and how command-line arguments can be accessed using the predefined operations

described above. It takes two command-line arguments: a filename and an integer. The



resource writes the specified number of Fibonacci numbers, in order, to the specified file,

one per line.

    resource Fibonacci()
      initial
        var pathname[1:200] : char; var count : int
        var f : file

        # get command-line arguments.
        var len : int
        len := getarg(1, pathname)
        if len < 1 -> return fi
        if getarg(2, count) < 1 -> count := 0 fi

        # open file f.
        if not open(pathname[1:len], WRITE, f) -> return fi

        # write Fibonacci numbers 1 ... count to file f.
        var fib1, fib2 : int := 0, 1
        fa i := 1 to count ->
          if write(f, fib2) ≠ 1 -> return fi
          fib2, fib1 := fib1 + fib2, fib2
        af
      end
    end

The command-line arguments are represented in Fibonacci by the variables pathname and

count, respectively, and are obtained using the 'getarg' function. The program then

attempts to open the named file and write the required lines. The if statements are used

above to detect errors in command-line arguments and I/O. In most cases, Fibonacci handles such an error by executing a return, which terminates the program. However, if the

second argument (count) is not specified or is invalid, the default 0 is used. Note that the

program does not explicitly close the file when it terminates; if open, the file will be closed

automatically.

3.6.2. Device Control in Stand-Alone Implementations

In a stand-alone implementation of SR, a programmer needs to be able to control

hardware facilities such as devices and memory. This is supported by a variant of



resources called real resources. Such resources are distinguished from regular resources by

prepending the keyword real to a resource specification. Real resources are structurally

identical to regular resources, and can use all the same language mechanisms. In addition,

they can use the special mechanisms described in this section.

3.6.2.1. Data Types and Variable Declarations

Variables in real resources can be bound to specific addresses by appending 'at

address_expression' to the variable's declaration. The main use of such variables is for

accessing devices' control and status registers.

An additional type, byte, may also be used within real resources for variables

and parameters. For example, a memory allocator might declare a large array of bytes to

serve as the space it manages. Arrays of bytes are universal in the sense that any variable

can be assigned to a slice of a byte array and vice versa, the only requirement being that

the size of the variable is the same as the size of the slice. Thus, a formal parameter declared as an arbitrarily sized array of bytes is compatible with any actual parameter;

e.g., the operation declaration

    op clear( var space[1:*] : byte )

specifies that space matches any actual parameter.
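A brief sketch of how such an operation might be invoked; the caller-side declarations here are our illustration, not from the text:

    var word : int
    var table[1:100] : real
    call clear( word )    # an int passed as a (small) byte array
    call clear( table )   # the entire real array passed as bytes

In each call the actual parameter is viewed as a byte array whose size matches the variable's size, as described above.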

3.6.2.2. Operations and Interrupt Handlers

Since the role of real resources is to support programming device control facilities,

for which efficiency is a major concern, parameters to real resources can be passed by reference. The parameter kind in this case is ref. Pointer types may also appear in the

specification parts of real resources. However, real resources that export operations having

ref parameters or employ pointer types may only be invoked by processes executing on the

same machine; cross-machine addresses are not supported in this case.



An operation in a real resource can also be bound to a specific machine address by

appending 'at address_expression' to the declaration of the operation. This is used to

declare interrupt operations. Conceptually, a physical device is viewed as a process that

services I/O operations and invokes an interrupt operation. An interrupt operation should

be declared as being call-only or send-only by using the appropriate operation restriction.

If a device's interrupt operation is call-only, then when the device signals an interrupt, it

waits for the interrupt to be serviced before continuing. This effectively inhibits further

interrupts from the device. By contrast, if an interrupt operation is send-only, the device

controller is free to continue and might therefore signal another interrupt. Interrupt

operations may not be invoked directly within an SR program.

If an interrupt operation is implemented by a proc, the operation should be call-only to avoid the overhead of creating a process to service the interrupt. If an interrupt

operation is implemented by input statements, such statements should not contain complex operation commands, again to minimize interrupt-handling overhead.

Interrupt operations might have parameters, and proc's or processes that service interrupts might have an associated interrupt priority level. Whether or how these

mechanisms are provided is implementation dependent since they depend heavily on the

underlying hardware architecture.

3.6.2.3. Sketch of a Disk Driver

To illustrate how the special mechanisms available in real resources might be

used, this section presents a sketch of a disk driver for a machine such as a Vax, in which

addresses of device registers and interrupts are memory mapped. The disk driver would

typically be used by a file server to read or write blocks on the disk.


    real spec disk
      type iotype = enum( read, write )
      const BS = 1024    # disk block size.
      op io( rw : iotype; bn : int; ref buffer[1:BS] : byte ) returns error : bool
        # rw is the kind of disk operation: read or write.
        # bn is the block number.
        # buffer is the area that the I/O command is to read into or write from.
        # error is set to true iff could not perform the operation successfully.

    resource disk( dr_addr, di_addr : ptr byte )
      # dr_addr is the address of the device register.
      # di_addr is the address of the device interrupt vector.

      type dr_type = rec( ... )     # layout of the device register.
      var dr : dr_type at dr_addr   # the device register.
      op interrupt() at di_addr     # the interrupt operation, invoked by the device.

      process driver
        do true ->
          in io( rw, bn, buffer ) returns error ->
            # setup for the command;
            # e.g., calculate cylinder, track, and sector from block number;
            # store those, address of buffer, byte count, and command
            # in device register.

            # start the command;
            # e.g., by setting some bits in the device register.

            # wait for interrupt that signals completion of command.
            receive interrupt()

            # check status information returned by command.
          ni
        od
      end
    end


This resource exports the operation io, which allows its user to read or write a disk block.

The buffer parameter to io is declared as a ref parameter. Thus, the actual I/O buffer will

be located in a resource that invokes io; only its address will be passed from the invoker to

disk. The two resource parameters, dr_addr and di_addr, represent the addresses of the



device register and device interrupt vector, respectively. Note that an address is declared

as a pointer to a byte; although quite common, this is machine-dependent. The variable

dr (the device register) is placed at the location given by dr_addr; similarly, the operation

interrupt, which is invoked when the device interrupts, is placed at the location given by

di_addr. The driver process services a single io invocation at a time. For each invocation, it

starts the command, waits for the invocation of interrupt (which signals the completion of

the command), and then checks the status information returned by the command. If the

command did not succeed, the driver could choose to restart the command, although that

is not shown above.


CHAPTER 4

Examples

In this chapter, we present six larger examples that together illustrate most of the

language mechanisms and the interplay between them. These examples demonstrate how

many of the issues raised in Chapter 2 are resolved in SR. The first example presents a

complete, albeit simple, program that sorts a set of input data. It shows the basic structure of a sequential SR program and illustrates the use of the I/O primitives. The second

example finds the smallest number of queens required to cover a chessboard. It illustrates

additional sequential aspects of SR. We also describe how concurrency can be added to our

solution; co statements are used to create concurrently executing processes, and a semaphore operation is used for synchronization between those processes. The third example

shows how a bounded buffer resource might be written and used. It illustrates the basic

synchronization mechanisms in the language. The fourth example presents three solutions

to the classic dining philosophers problem [Dijk68]. These solutions employ many of the

SR communication mechanisms and illustrate different ways to structure solutions to synchronization problems. The fifth example uses a "probe-echo" algorithm to determine the

global topology of a network. It further illustrates the communication mechanisms, and

also shows how local operations, capabilities for individual capabilities, and optype

declarations can be used. The final example outlines parts of a simplified version of the

Saguaro file system [Andr86]. This example illustrates the use of additional mechanisms

including global components, mutually dependent resources, and the reply statement.





4.1. Sort Program

This single resource program illustrates many of the sequential aspects of SR and

the use of several of the predefined operations, including several of the input/output operations. The sorter resource sorts a list of integers into ascending order. First, it prompts for

the size of the list and for each integer in the list. Then, it outputs the original list, sorts

the list, and outputs the sorted list.

    resource sorter()
      op print_array( a[1:*] : int )
      op sort( var a[1:*] : int )

      process main_routine
        var n : int
        write('number of integers?'); read( n )
        var nums[1:n] : int   # dynamically allocated after n is read

        # Read in numbers.
        fa i := 1 to n -> write('?'); read( nums[i] ) af

        write('original numbers'); print_array( nums )
        sort( nums )
        write('sorted numbers'); print_array( nums )
      end

      proc print_array( a )
        # Print elements of array a.
        fa i := lb(a) to ub(a) -> write( a[i] ) af
      end

      proc sort( a )
        # Sort a into non-decreasing order.
        fa i := lb(a) to ub(a)-1,
           j := i+1 to ub(a) st a[i] > a[j] ->
             a[i], a[j] := a[j], a[i]
        af
      end
    end

Because sorter is by itself an entire program, it neither imports nor exports any

objects; hence, it contains no spec. Each operation defined in the resource, print_array and sort, is implemented by a proc and has as a parameter an array whose upper bound




is '*'. Thus, the value of the upper bound in a particular invocation of print_array or sort

is determined by the actual argument; the code in each proc uses the predefined function

ub to determine the actual upper bound. (Such code also uses lb to get the lower bound,

although that is always one in this case.) Sort's parameter is declared as a var parameter

so that changes made to it will be copied back into the actual argument.

The main routine in sorter is a single process that reads the input and then calls

print_array and sort. The optional keyword call is omitted from all the operation calls;

this choice is purely stylistic. Note that array nums is declared after its size is read; this is

permitted and results in nums having a size that is based on the input. For-all statements

are used throughout sorter to range over elements in arrays. Several different forms of for-all statements are employed. Note in particular the one in proc sort, which contains a

such-that clause that selects specific values of i and j for which to execute the assignment

that swaps a[i] and a[j]. In most languages, two loops enclosing an if statement would be

required to program the actions of this single for-all statement.

4.2. N ≤ 8 Queens

The "N ≤ 8 Queens" problem is to find the smallest number of queens that completely covers a chessboard [Wexe86]. A square on a chessboard is covered if it is controlled

by a queen. We first present a sequential solution to this problem. It illustrates array

slices, recursion, and for-all statements. We then show how that solution can be modified

to use concurrency. We change sequences of invocations into concurrent invocations. This

change uses co statements for the concurrent invocations and a semaphore operation to

provide mutual exclusion between concurrently executing processes.



4.2.1. Sequential Solution

The following is a sequential solution to the "N ≤ 8 Queens" problem. The general case of our algorithm places a queen at a currently vacant square. If the board is still

not covered, the algorithm recursively tries placing a new queen at each of the vacant

squares on the board. In doing so, only solutions that could be better than the current best

solution are tried. If the board is now covered, then we have a best solution.

The program presented below consists of the single resource Queens. It solves the

problem for an N by N chessboard, where N is a declared constant. Queens defines the single operation place, which takes the following parameters: the location in which to place a

queen; the number of queens placed so far; and two boolean matrices, qboard and cboard,

that together represent a chessboard. The location of the placed queens is recorded in

qboard, and the squares they cover are recorded in cboard. The resource variable

best_placed is used to record the fewest number of queens found so far that cover the board.

It is initialized to N since a solution using N queens is known to exist, e.g., one queen

placed in each row. The initialization code simply invokes place once for each possible

placement of the first queen on an empty board, and then prints best_placed.

    resource Queens()
      const N = 8
      op place( i, j : int; cboard[1:N,1:N], qboard[1:N,1:N] : bool; placed : int )
      var best_placed : int := N
      var best_qboard[1:N,1:N] : bool

      initial
        fa i := 1 to N, j := 1 to N ->
          place( i, j, ([N] ([N] false)), ([N] ([N] false)), 0 )
        af
        write("the smallest number of queens is", best_placed)
      end


      proc place( i, j, cboard, qboard, placed )
        # mark square as occupied and update number of queens placed.
        qboard[i,j] := true; placed++

        # mark row i, column j, and diagonals as covered.
        cboard[i,1:N] := ([N] true); cboard[1:N,j] := ([N] true)
        fa k := 0 to N-1 ->
          if (i-k)>0 & (j-k)>0 -> cboard[i-k,j-k] := true fi
          if (i-k)>0 & (j+k)<N+1 -> cboard[i-k,j+k] := true fi
          if (i+k)<N+1 & (j-k)>0 -> cboard[i+k,j-k] := true fi
          if (i+k)<N+1 & (j+k)<N+1 -> cboard[i+k,j+k] := true fi
        af

        # check if board is now covered.
        var covered : bool := true
        fa i := 1 to N, j := 1 to N st not cboard[i,j] ->
          covered := false; exit
        af

        # either update best_placed or place another queen
        # if that could give a better solution than current best.
        if covered ->
          best_placed := placed; best_qboard := qboard
        [] else ->
          if placed+1 < best_placed ->
            fa i := 1 to N, j := 1 to N st not qboard[i,j] ->
              place( i, j, cboard, qboard, placed )
            af
          fi
        fi
      end
    end


The procedure place first places a queen at row i and column j, updates qboard accordingly,

marks squares in cboard that this queen covers, and determines if the board is now covered.

Then, if the board is covered, we have a new best solution so best_placed is updated.1 Otherwise, place tries to find a solution by placing another queen. It recursively calls itself,

1 The current qboard is also saved in the resource variable best_qboard so that it can be printed at the end of the program. However, printing the board is not shown above.



88

trying a queen at each of the vacant squares. To prevent attempts that could not be better
than the current best solution, place compares the number of queens already placed on the
board (placed) with the best solution so far (best_placed) and continues only if one more
queen placed on the board would result in fewer queens than best_placed. Note that a
vacant square that is already covered is still a candidate for having a queen placed on it.
Placing a queen on such a square might allow the queen to cover squares that would otherwise
require more than one queen to cover. For example, if N is 4, then an optimal solution
is to place queens on squares (2, 2) and (4, 4). In this solution, the second queen is placed
on a square that is already covered by the first queen.
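The pruning rule can be checked outside SR with a small sketch. The following Python program (an illustration of the same branch-and-bound idea, not a translation of the SR code) tries a queen on every vacant square and cuts off any branch where one more queen could not beat the best covering found so far:

```python
def covers(i, j, k, l):
    # A queen on (i, j) covers (k, l) along its row, column, or a diagonal.
    return i == k or j == l or abs(i - k) == abs(j - l)

def min_covering_queens(n):
    best = n * n  # worst case: a queen on every square

    def place(queens):
        nonlocal best
        uncovered = [(k, l) for k in range(n) for l in range(n)
                     if not any(covers(i, j, k, l) for i, j in queens)]
        if not uncovered:
            best = min(best, len(queens))
            return
        # Prune: placing one more queen cannot beat the best solution so far.
        if len(queens) + 1 >= best:
            return
        # As in the text, try every vacant square, covered or not.
        for k in range(n):
            for l in range(n):
                if (k, l) not in queens:
                    place(queens + [(k, l)])

    place([])
    return best

print(min_covering_queens(4))  # 2: e.g. queens on (2,2) and (4,4) in 1-based terms
```
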

Note the use of array constructors in Queens. They are used in initial in the

invocation of place, and in place on the right-hand sides of the assignments that mark row i and

column j as covered. Also note that the left hand sides of those assignments use array

slices to indicate a row and a column of cboard.

Using two matrices (cboard and qboard) to represent the board makes it easy to

mark squares as covered. Alternatively, the board could be represented as a single matrix

with each square being occupied by a queen, vacant and uncovered, or vacant and covered.

However, marking would then need to handle an occupied square and a vacant square

differently.

4.2.2. Adding Concurrency

We now modify the above solution so that at each stage, different placements of a

queen on the board are considered concurrently rather than sequentially. The basic
algorithm remains the same, though.

The first modification is to change invocations of place to use co instead of fa;

this makes the searches concurrent. Specifically, the invocations in initial now become


co (i := 1 to N, j := 1 to N) place(i,j, ([N*N] false), ([N*N] false), 0) oc

and the invocations in place now become

co (i := 1 to N, j := 1 to N st not qboard[i,j]) place(i,j, cboard, qboard, placed)
oc


Then, we introduce the semaphore operation mutex; it is used to prevent multiple processes
from updating the resource variables best_placed and best_qboard at the same time. Its

declaration

op mutex() {send}

appears with the resource variables; it is initialized in initial by

send mutex()

This semaphore is used in place whenever best_placed or best_qboard is modified. That is,
the code in place that decides whether to try placing another queen becomes

if covered ->
    receive mutex()
    best_placed := placed; best_qboard := qboard
    send mutex()
[] else ->
    if placed+1 < best_placed ->
        co (i := 1 to N, j := 1 to N st not qboard[i,j])
            place(i,j, cboard, qboard, placed)
        oc
    fi
fi

Note that the comparison of placed with best_placed is not protected because best_placed is
monotonic: its value never increases. Thus, if best_placed is changed during that comparison,
the only bad effect is that extra, useless tries might be made.
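The same reasoning can be checked in a conventional threads setting. In this hypothetical Python sketch, updates to the shared best value are bracketed by a lock (playing the role of the receive mutex / send mutex pair), while the pruning comparison reads the value unprotected, which is safe precisely because the value only ever decreases:

```python
import threading

best = 1_000_000
mutex = threading.Lock()

def record_solution(placed):
    # Mirrors: receive mutex(); best_placed := placed; send mutex()
    global best
    with mutex:
        if placed < best:
            best = placed

def worth_trying(placed):
    # Unprotected read, as in the text: safe because best never increases,
    # so a stale value can only cause extra work, never a wrong answer.
    return placed + 1 < best

threads = [threading.Thread(target=record_solution, args=(v,)) for v in (7, 3, 5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(best)             # 3, under any interleaving of the three updates
print(worth_trying(2))  # False: one more queen could not improve on 3
```
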


4.3. Bounded Buffer

The next example presents a bounded buffer resource and illustrates how

processes in different resources might communicate and synchronize.² Each instance of

BoundedBuffer provides two operations: insert and remove. A producer process calls insert

to insert an item into the buffer; a consumer process calls remove to retrieve an item from

the buffer. Invocations of insert and remove are synchronized to ensure that messages are
removed in the order in which they were inserted, are not removed until inserted, and

are not overwritten.

spec BoundedBuffer
    op insert(item: int)
    op remove() returns item: int

resource BoundedBuffer(size: int)
    process bb
        var store[0:size-1]: int
        var front, rear, count: int := 0, 0, 0
        do true ->
            in insert(item) and count < size ->
                store[rear] := item; rear := (rear+1)%size; count++
            [] remove() returns item and count > 0 ->
                item := store[front]; front := (front+1)%size; count--
            ni
        od
    end
end

The two operations defined by BoundedBuffer are declared in its spec and hence
are visible outside the resource. When an instance of BoundedBuffer is created, the size the
buffer is to have is passed as an argument and an instance of the background process bb is
created. This process loops around a single input statement, which implements insert and
remove. The synchronization expressions in the input statement ensure that the buffer
does not overflow or underflow as described above.

² This example includes a BoundedBuffer resource that is nearly identical to the one shown in Sec. 3.2; it differs only in that it uses the process abbreviation. We present it here again mainly to show how it may be used by another resource.
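For comparison, the same discipline can be sketched with a monitor-style condition variable in Python (an analogue, not a translation of the SR): the while-loops on count play the role of the input statement's synchronization expressions.

```python
import threading

class BoundedBuffer:
    def __init__(self, size):
        self.store = [None] * size
        self.size = size
        self.front = self.rear = self.count = 0
        self.cond = threading.Condition()

    def insert(self, item):
        with self.cond:
            while self.count == self.size:      # "and count < size"
                self.cond.wait()
            self.store[self.rear] = item
            self.rear = (self.rear + 1) % self.size
            self.count += 1
            self.cond.notify_all()

    def remove(self):
        with self.cond:
            while self.count == 0:              # "and count > 0"
                self.cond.wait()
            item = self.store[self.front]
            self.front = (self.front + 1) % self.size
            self.count -= 1
            self.cond.notify_all()
            return item

buf = BoundedBuffer(2)
buf.insert(1); buf.insert(2)
print(buf.remove(), buf.remove())  # 1 2: items come out in insertion order
```
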

The following resource outlines how instances of BoundedBuffer might be created

and used. It illustrates dynamic process creation and the use of capability variables.

spec User
    import BoundedBuffer

resource User()
    var buf: cap BoundedBuffer
    op p() {send}
    const N = 10

    initial
        # Create a buffer with room for 20 items.
        buf := create BoundedBuffer(20)
        # Create N instances of proc p.
        fa i := 1 to N ->
            send p()
        af
    end

    proc p()
        var it: int
        # Do some insert's into and remove's from the buffer.
        buf.insert(it)
        it := buf.remove()
    end

    final
        destroy buf
    end
end


User imports BoundedBuffer so that it can create instances of BoundedBuffer and invoke

operations in those instances. The create statement in the initialization code creates a

bounded buffer resource with 20 elements and assigns to buf a capability for that instance;

buf is a resource variable and hence is shared by all processes in User. These processes

invoke operations in the instance of BoundedBuffer by using the capability stored in buf,

e.g., buf.insert refers to the insert operation. The final code in User ensures that if User is

destroyed, then the BoundedBuffer it created will also be destroyed.

4.4. Dining Philosophers

In the dining philosophers problem, N philosophers (typically 5) sit around a circular
table set with N forks, one between each pair of philosophers. Each philosopher
alternately thinks and eats. To eat, a philosopher must first acquire the forks to his
immediate left and right.

This problem can be solved in at least three basic ways in a distributed programming
language. In these solutions, philosophers are represented by processes. The

approaches differ in how forks are managed. The first, centralized approach is to have a

single servant process that manages all N forks. The second, decentralized approach is to

distribute the forks among N servant processes, with each servant managing one fork. The

third approach is similar to the second, but employs one servant per philosopher instead of

one servant per fork. Each approach can be readily programmed in SR, as shown below.

4.4.1. Centralized Approach

This approach employs a single servant process that manages all N forks. Each

philosopher requests forks from the servant, eats, and then releases the forks back to the

servant.


Our solution employs three resources: Servant, Philosopher, and Main, which are

compiled in that order. One instance of Main is created when execution of the program

begins. It prompts for input about the number of philosophers (n) and the number of
"sessions" each philosopher is to execute (t). Main then creates one instance of Servant and n

instances of Philosopher; each philosopher is passed a capability for the Servant.

spec Main
    import Philosopher, Servant

resource Main()
    initial
        var n, t: int
        write("how many Philosophers? "); read(n)
        write("how many sessions per Philosopher? "); read(t)
        # create the Servant and Philosophers
        var s: cap Servant
        var ph: cap Philosopher
        s := create Servant(n)
        fa i := 1 to n ->
            ph := create Philosopher(s, i, t)
        af
    end
end

Note that the number of Philosophers (n) is passed to the Servant as a resource parameter,

and that a capability for the Servant is passed to each Philosopher as a resource parameter.

Each instance of Philosopher alternately eats and thinks for t sessions. Before

eating, a Philosopher calls the Servant's getforks operation; after eating, it calls the

Servant's relforks operation.


spec Philosopher
    import Servant

resource Philosopher(s: cap Servant; id, t: int)
    process phil
        fa i := 1 to t ->
            s.getforks(id)
            write("Philosopher", id, "is eating")     # eat
            s.relforks(id)
            write("Philosopher", id, "is thinking")   # think
        af
    end
end


The Servant services invocations of getforks and relforks from all instances of Phi-

losopher. Each Philosopher passes its id to these operations to allow the Servant to distin­

guish between Philosophers. A philosopher is permitted to eat when neither of its neigh-

bors is eating.

spec Servant
    op getforks(id: int)
    op relforks(id: int)

resource Servant(n: int)
    process server
        # called by Philosophers
        var eating[1:n]: bool := ([n] false)
        do true ->
            in getforks(id) and not eating[id%n+1] and not eating[(n+id-2)%n+1] ->
                eating[id] := true
            [] relforks(id) ->
                eating[id] := false
            ni
        od
    end
end

Notice the Servant is passed the number of Philosophers as a resource parameter (n). It
uses this value to allocate the array eating, which indicates the status of each Philosopher,
and to determine a Philosopher's neighbors using modular arithmetic. The server process
continually services the operations getforks and relforks. The synchronization expression
on getforks uses the invocation parameter id to determine whether either of a Philosopher's
neighbors is eating. If neither neighboring Philosopher is eating, server grants the Philosopher
requesting forks permission to eat and updates the Philosopher's entry in eating.
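The 1-based modular arithmetic in that synchronization expression is easy to get wrong, so a quick Python check of the two neighbor formulas may help (illustrative only):

```python
def neighbors(id, n):
    # The text's formulas: right neighbor id%n+1, left neighbor (n+id-2)%n+1,
    # for philosophers numbered 1..n around the table.
    right = id % n + 1
    left = (n + id - 2) % n + 1
    return left, right

print(neighbors(1, 5))  # (5, 2): philosopher 1 sits between 5 and 2
print(neighbors(5, 5))  # (4, 1): the table wraps around
```
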

Note that the above solution is deadlock-free: in effect, getforks allocates both

forks at the same time. However, a Philosopher can starve if its two neighbors "conspire"

against it, i.e., if at any time at least one of them is eating.

4.4.2. First Decentralized Approach

The following is a decentralized solution that employs one servant per fork. Each

philosopher interacts with two servants to obtain the forks it needs. A philosopher that is

hungry may eat when it obtains a fork from each of the servants.

Our solution to this approach again employs the three resources Servant, Philosopher,
and Main. Main is similar to Main in the previous solution. It differs in that Main

now creates n instances each of Philosopher and Servant and passes them capabilities so

they can communicate with each other.


spec Main
    import Philosopher, Servant

resource Main()
    initial
        var n, t: int
        write("how many Philosophers? "); read(n)
        write("how many sessions per Philosopher? "); read(t)
        var s[1:n]: cap Servant
        var ph: cap Philosopher
        # create the Servants
        fa i := 1 to n ->
            s[i] := create Servant(i)
        af
        # create the Philosophers
        # note: their servants must be asymmetric or deadlock could occur
        fa i := 1 to n-1 ->
            ph := create Philosopher(s[i], s[i%n+1], i, t)
        af
        ph := create Philosopher(s[1], s[n], n, t)
    end
end


Note that the size of array s depends on input value n. Also note the asymmetric way in

which Servant parameters are passed to instances of Philosopher; this makes deadlock easy

to avoid, as discussed later.

The Philosopher resource is similar to its counterpart in the previous example.

The differences are that it is now passed capabilities for two Servants, and that it now

invokes getfork and relfork in each of those two Servants.


spec Philosopher
    import Servant

resource Philosopher(l, r: cap Servant; id, t: int)
    process phil
        fa i := 1 to t ->
            l.getfork(); r.getfork()
            write("Philosopher", id, "is eating")     # eat
            l.relfork(); r.relfork()
            write("Philosopher", id, "is thinking")   # think
        af
    end
end


Each instance of Servant services invocations of getfork and relfork from its two

associated instances of Philosopher. A philosopher is permitted to eat when it obtains a

fork from each of its two servants.

spec Servant
    op getfork()
    op relfork()

resource Servant(id: int)
    process server
        do true ->
            receive getfork()
            receive relfork()
        od
    end
end

The server process continually services the getfork operation and then the relfork operation.

This ensures that the Servant's fork is allocated to at most one of the Philosophers at any

time.

The above solution is deadlock-free. When Philosopher resources are created in

Main, they are passed capabilities for their left and right Servants. The order of these is

switched for the last Philosopher (i.e., Philosopher n). This causes the last Philosopher to


request its right fork first, whereas each other Philosopher requests its left fork first.

Requests from Philosophers, therefore, cannot form a cycle in the fork allocation graph.

Furthermore, the above solution prevents starvation since forks are allocated one at a time

and invocations of getfork are serviced in order of their arrival.
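The acyclicity argument can be made concrete with a small Python sketch (hypothetical numbering, with fork i taken to sit at philosopher i's left): every philosopher but the last requests its left fork first, and the switch for the last philosopher means every philosopher's first request is for its lower-numbered fork, so the requests cannot form a cycle.

```python
def fork_order(i, n):
    # Forks numbered 1..n; fork i is philosopher i's left fork and
    # fork i%n+1 is its right fork (illustrative numbering, not from the text).
    left, right = i, i % n + 1
    if i < n:
        return (left, right)   # philosophers 1..n-1: left fork first
    return (right, left)       # philosopher n: right fork first

n = 5
orders = [fork_order(i, n) for i in range(1, n + 1)]
print(orders)  # [(1, 2), (2, 3), (3, 4), (4, 5), (1, 5)]
# Everyone's first request is for the lower-numbered of its two forks,
# so the waits-for relation over forks cannot contain a cycle.
assert all(first < second for first, second in orders)
```
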

4.4.3. Second Decentralized Approach

The following is a solution that employs one servant per philosopher. Each philosopher
interacts with its own personal servant; that servant interacts with its two neighboring
servants. Each individual fork is either held by one of the two servants that might
need it or is in transit between them. A philosopher that is hungry may eat when his servant
holds two forks. The specific algorithm that the servants employ is adapted from
[Chan84]. It has the desirable properties of being fair and deadlock-free. The basic solution
strategy also has application to other, realistic problems such as file replication and
database consistency.

Our solution again employs three resources Servant, Philosopher, and Main. Main

is similar to Main in the previous section. The differences are that Main now creates n

instances each of Philosopher and Servant, and also sends each Servant the initial values for

its local variables.


spec Main
    import Philosopher, Servant

resource Main()
    initial
        var n, t: int
        write("how many Philosophers? "); read(n)
        write("how many sessions per Philosopher? "); read(t)
        var s[1:n]: cap Servant
        var ph[1:n]: cap Philosopher
        # create the Servants and Philosophers
        fa i := 1 to n ->
            s[i] := create Servant(i)
            ph[i] := create Philosopher(s[i], i, t)
        af
        # give each Servant its links to neighboring Servants
        fa i := 1 to n ->
            send s[i].links(s[(n+i-2)%n + 1], s[i%n + 1])
        af
        # initialize each Servant's forks;
        # note: this must be asymmetric or deadlock could occur
        send s[1].forks(true, false, true, false)
        fa i := 2 to n-1 ->
            send s[i].forks(false, false, true, false)
        af
        send s[n].forks(false, false, false, false)
    end
end


Note that Servant capabilities are passed to Philosophers as resource parameters, but that

separate operations are required to pass each Servant capabilities for its neighbor servants.

The second step is required in the latter case since a resource has to be created before Main

has a capability for it.

As in the previous two solutions, each instance of Philosopher alternately eats and
thinks for t sessions. Before eating, a Philosopher calls the getforks operation of its personal
Servant; after eating, it calls that Servant's relforks operation. Thus, the Philosopher here
is identical to Philosopher in the centralized solution. (Its code, therefore, is not shown


again.) In both, a Philosopher interacts with the single Servant that it is passed as a

resource parameter. In the centralized solution, that Servant is shared by all Philosophers.

Here, however, each Servant serves only one Philosopher.

Instances of Servant service invocations of getforks and relforks from their associated
instance of Philosopher; they communicate with neighboring instances of Servant
using the needL, needR, passL, and passR operations. A philosopher is permitted to eat
when its servant has acquired two forks. A servant may already have both forks when
getforks is called, or it may need to request one or both from the appropriate neighbor
servant. Two variables are used to record the status of each fork: haveL (haveR) and dirtyL
(dirtyR). Starvation is avoided by having servants give up forks that are dirty; a fork
becomes dirty when it is used by a philosopher. Details of the algorithm employed by
servants are in [Chan84].
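A minimal Python sketch of that dirty-fork rule may help (the class and method names are illustrative, not from [Chan84] or the SR code): a requested fork is surrendered only if it is dirty, and eating dirties both forks, which is what lets a hungry neighbor eventually claim them.

```python
class ForkHolder:
    # The state of one fork at one servant (illustrative sketch).
    def __init__(self, have, dirty):
        self.have, self.dirty = have, dirty

    def handle_request(self):
        # A servant surrenders a requested fork only if it holds it and it is dirty.
        if self.have and self.dirty:
            self.have = self.dirty = False  # the fork is cleaned as it is passed
            return True                     # fork goes to the neighbor
        return False                        # a clean fork is kept: we eat first

    def eat(self):
        self.dirty = True                   # using a fork dirties it

f = ForkHolder(have=True, dirty=True)
print(f.handle_request())   # True: dirty forks are given up, so no one starves
f = ForkHolder(have=True, dirty=False)
print(f.handle_request())   # False: a clean fork is kept until its holder eats
```
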

spec Servant
    op getforks()                  # called by Philosophers
    op relforks()
    op needL()                     # sent by neighboring Servants
    op needR()
    op passL()
    op passR()
    op links(l, r: cap Servant)    # links to neighbors
    op forks(haveL, dirtyL, haveR, dirtyR: bool)   # initial fork values


resource Servant(id: int)
    var l, r: cap Servant
    var haveL, dirtyL, haveR, dirtyR: bool
    op hungry()
    op eat()

    proc getforks()
        send hungry()    # let server know Philosopher is hungry
        receive eat()    # wait for permission to eat
    end

    process server
        receive links(l, r)
        receive forks(haveL, dirtyL, haveR, dirtyR)
        do true ->
            in hungry() ->
                # ask for forks I don't have; I ask my right neighbor for his left fork,
                # and my left neighbor for his right fork
                if not haveR -> send r.needL() fi
                if not haveL -> send l.needR() fi
                # wait until I have both forks
                do not (haveL & haveR) ->
                    in passR() -> haveR := true; dirtyR := false
                    [] passL() -> haveL := true; dirtyL := false
                    [] needR() & dirtyR ->
                        haveR := false; dirtyR := false
                        send r.passL(); send r.needL()
                    [] needL() & dirtyL ->
                        haveL := false; dirtyL := false
                        send l.passR(); send l.needR()
                    ni
                od
                # let my Philosopher eat; wait for him to finish
                send eat(); dirtyL := true; dirtyR := true
                receive relforks()
            [] needR() ->
                # right neighbor needs his left fork, which is my right fork
                haveR := false; dirtyR := false; send r.passL()
            [] needL() ->
                # left neighbor needs his right fork, which is my left fork
                haveL := false; dirtyL := false; send l.passR()
            ni
        od
    end
end


Notice the various combinations of invocation and service statements that are employed.
For example, getforks provides a procedure-call-like interface and hides the fact that
getting forks requires sending a hungry() message and receiving an eat() message. Also, server
processes use send to invoke the need and pass operations serviced by neighboring servers;
call cannot be used for this or deadlock could result if two neighboring servers invoked
each other's operations at the same time.

Note that the structure of the servants and their interaction in the above example
is typical of that found in some distributed programs, such as those that implement
distributed voting schemes. Such a program might contain a collection of voter processes, each
of which can initiate an election among all in the collection. After a voter process initiates
an election, it tallies the votes from the others. While a voter process is waiting for votes
to arrive for its election, it must also be able to vote in elections initiated by other voter
processes. As pointed out in Chapter 2, and as illustrated above, this can only be
accomplished easily using asynchronous send.

Also note that in the above example, when a servant is attempting to acquire
both forks for its philosopher, it might give up a fork it already possesses (because the fork
is dirty). In this case, it passes the fork to its neighbor and then immediately requests the
fork's return. To reduce the number of messages exchanged, the request for the fork's
return could be combined with the passing of the fork. In particular, the pass operations
could be parameterized with a boolean that indicates whether or not the servant, when its
philosopher has finished eating, should automatically pass the fork back to its neighbor.

4.5. Network Topology

In this problem, there are N nodes of an Arpanet-like network. Nodes are numbered
from 1 to N. Each node has a few neighbors, and the network is connected (there is
a path from each node to every other). Each node knows only its own neighbors and can


communicate only with them. The problem is for a given node to determine the topology

of the entire network, i.e., the neighbors for each of the N nodes.

Our solution employs a "probe-echo" algorithm [Chan79]. In the probe phase,

the initiating node sends a probe message to each of its neighbors, who in turn send a

probe message to all other neighbors, etc. In the echo phase, a node returns to its prober

the union of its local topological information and the topological information returned

from its probes. Thus, the union of the initiating node's local topological information with

the echoes it receives contains the entire network topology.

A node can receive probes from more than one of its neighbors. For the first
probe a node receives, it probes all other of its neighbors, and then waits for an echo back
from each of these. Then, the node echoes the union of information in the echoes it has
received back to the node from whom it received the first probe. While waiting for the
echoes, however, the node may get other probes from different neighbors. These it answers
immediately by simply echoing the empty set; it does not propagate this probe to its
neighbors.

A key point in arguing that this algorithm terminates is that a node receives
echoes back from all neighbors that it probes. This is true even if two neighbors both
receive their first probe at about the same time, and thus probe each other. They are
guaranteed to get echoes back from each other because a node does not terminate until it
has received echoes back from all its neighbors. Thus, each of the two nodes is still
active, waiting for echoes, when it receives its neighbor's probe, to which it responds
with an empty echo.
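Setting the message passing aside, the information flow of the probe-echo algorithm can be sketched sequentially in Python, with recursion standing in for probes and return values standing in for echoes (an illustration of the idea, not of the SR program):

```python
def probe_echo(graph, start):
    # graph maps each node to its neighbor list; returns the set of directed
    # links (node, neighbor) discovered, i.e. the network topology.
    visited = set()

    def probe(node, frm):
        if node in visited:
            return set()                       # a later probe: echo the empty set
        visited.add(node)
        topology = {(node, nb) for nb in graph[node]}   # local information
        for nb in graph[node]:
            if nb != frm:                      # don't probe the prober
                topology |= probe(nb, node)    # union in the neighbor's echo
        return topology

    return probe(start, None)

graph = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
print(sorted(probe_echo(graph, 1)))  # all eight directed links of the network
```

Whichever node initiates, the union of the echoes covers the whole connected network.
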

The following global component contains information that is needed by both the

resource that implements the algorithm and the user of that resource. (We do not show

such a user below, however.) The global component defines the number of nodes in the


network, a type for topological information, and a constant that represents the empty

topology. Sets are represented as arrays of boolean: an element is in a set if the

corresponding element in the array is true.

global Net
    const N = ...              # number of nodes in the network.
    type Top = rec(a[1:N,1:N]: bool)
    const Empty_Top := rec(([N*N] false))
end

The type Top is declared as a record that contains a matrix field.³ An entry in row i and
column j of this matrix is true if node i is a neighbor of node j.

Each node in the network is represented by an instance of the resource node.

spec node
    import Net
    op startup(neighbors[1:N]: cap node) {send}
    op initiate() returns top: Top {call}
    optype echo_type = (top: Top) {send}
    op probe(from: int; echo_back: cap echo_type) {send}

resource node(me: int)
    proc initiate() returns top
        op echo echo_type
        send probe(me, echo); receive echo(top)
    end

³ It is necessary to define Top as a record containing a matrix because matrix information is part of the name of an object and not part of its type. This precludes defining the Top type directly as a matrix.


    process probe_handler
        var neighbors[1:N]: cap node
        receive startup(neighbors)
        do true ->
            in probe(from, echo_back) ->
                var top: Top := Empty_Top
                op echo echo_type
                var probed: int := 0

                # send out probe to each neighbor except the prober.
                fa i := 1 to N st i ≠ from and neighbors[i] ≠ null ->
                    send neighbors[i].probe(me, echo)
                    probed++; top.a[me,i] := true
                af

                # receive echoes from neighbors and union in the results.
                # also, handle probes by echoing back empty.
                do probed > 0 ->
                    in echo(t) ->
                        fa i := 1 to N, j := 1 to N ->
                            top.a[i,j] := top.a[i,j] or t.a[i,j]
                        af
                        probed--
                    [] probe(other, echo_back) ->
                        send echo_back(Empty_Top)
                    ni
                od

                # mark me as a neighbor of my first prober
                # (except if probe is from initiate)
                # and echo back result.
                if me ≠ from -> top.a[me,from] := true fi
                send echo_back(top)
            ni
        od
    end
end


The specification of node defines an optype and three operations. The optype
echo_type defines the parameterization and operation restrictor for the echoes; it is used in
the declarations of the echo operations and of a parameter to probe. The first operation,
startup, is used at the beginning of probe_handler to receive an array of capabilities for the


node's neighbors. This operation is invoked by the node's creator after it has created all
nodes, and therefore has their capabilities. (This is similar to the way Servants were

created and then given capabilities for their neighboring Servants in Sec. 4.4.3.) An entry in

the neighbors array is null if the corresponding node is not a neighbor. The code assumes

that a node is not a neighbor of itself; thus, a special test to ensure that a node does not

probe itself is not required.

The second operation, initiate, initiates a computation: a set of probes and echoes

to determine the network's topology. The process executing initiate sends a probe to the
probe operation in the same instance of node. It then waits for the echo, and returns to
initiate's invoker the result returned to echo. Note that initiate provides a procedure-call-like
interface, which hides the fact that a computation requires sending a probe message

and receiving an echo message.

The final operation in node's specification is probe. It is serviced by in statements

within the probe_handler process, which implements the general algorithm described above.

Note the use of the local operation echo. It is passed as part of the probe to a node's
neighbors; they send their echoes to it. Probe messages are sent rather than called so that a

node can both collect echoes and handle incoming probes. This use of send is similar to its

use in Sec. 4.4.3 for the need and pass messages between neighboring servants.

The above solution is designed so that a computation can be initiated from any

node. However, the program does not work correctly if more than one computation is

active at the same time, whether initiated from the same node or from different nodes. The

problem is that probe_handler does not distinguish between probes generated from different

initiators. One way to remedy this problem is to allow only one computation from a given

node to be active at a given time, and to use N probe_handlers, one for each possible
concurrent computation.⁴ The probe operation would then be declared as an array of
operations. Each probe_handler would service an element of probe, and a prober would invoke
the element of probe that corresponds to the initiator's node number.

4.6. Components of the Saguaro File System

Our final example outlines components of a simplified version of the Saguaro file

system [Andr86] to illustrate additional uses of SR. In particular, it illustrates how a large

program consisting of several interacting resources can be constructed. We give particular

attention to how interdependencies between resources are represented using separate

specifications and bodies; in particular, this example illustrates how the circularities

between modules discussed in Sec. 2.1.1.2 can be expressed in SR. We also employ a
global component that contains declarations used by all the resources. Finally, this example

demonstrates several other features of the language, including operation types, operation

capabilities, and operations declared local to proc's.

Files in Saguaro, like those in Unix, include ordinary data files, devices such as

terminals and disks, and a generalization of pipes called channels. Different kinds of files

have different representations and are serviced by different resources. However, all files are

streams of bytes and are accessed using the same operations: read, write, seek, and close.

A client accesses a file by using a file descriptor, which contains capabilities for operations

supported by that file. These file system specifications, which are common to the different

kinds of files, can be declared in SR using a global component.

⁴ A simple alternative to multiple probe_handlers is to let the single probe_handler work on just one computation at a time. However, this can lead to deadlock.


global File

    global File
      optype Read  = (res buf[0:*] : char; count : int) returns actual_count : int
      optype Write = (buf[0:*] : char; count : int) returns actual_count : int
      optype Seek  = (kind : int; offset : int)
      optype Close = ()
      type FileDesc = rec( read : cap Read; write : cap Write;
                           seek : cap Seek; close : cap Close )
      # other global file declarations
    end

The optype declarations define the types of parameters and return values for the various file operations. The declaration of FileDesc defines a file descriptor to be a record that contains capabilities for the various types of file operations. Each field of FileDesc can be bound to any instance of an operation of the specified optype, as is shown below in the FileServer.

Files are managed by a DirectoryManager resource. A client acquires access to a file by calling the open operation provided by DirectoryManager. In response to an invocation of open, the DirectoryManager allocates a resource to service the file. Here, we consider only the case of ordinary data files, which are serviced by FileServer resources (there are also resources that service the other kinds of files). When allocating a FileServer, the DirectoryManager determines if there is an existing FileServer for the file. If so, the DirectoryManager allocates that FileServer; if not, the DirectoryManager creates a new FileServer. In either case, DirectoryManager returns a file descriptor, the fields of which are bound to the operations provided by the FileServer. Thus, each instance of FileServer services all clients who have opened the same file; this allows file-specific information, such as buffers, to be shared. When a client is finished with a file, it invokes the close operation in the associated FileServer. That FileServer then informs its DirectoryManager that the user has finished by invoking the DirectoryManager's close operation. (The invocation of close in the DirectoryManager is therefore an upcall [Clar85].) If there are no other clients that have the file open, the DirectoryManager then destroys the FileServer.

In accord with the above description, the DirectoryManager resource has the following specification:

    spec DirectoryManager
      import File, FileServer, TerminalServer, ...
      op open(path_name[0:*] : char; ... ) returns fd : File.FileDesc
      op close( ... )
      # directory operations
    resource DirectoryManager( disk : cap DiskServer; ... ) separate

Note that DirectoryManager imports the global declarations in File and uses the file descriptor type File.FileDesc. DirectoryManager also imports the declarations exported by FileServer, TerminalServer, and other resources such as disk servers that implement file system modules. Its spec is separate from its body because of the dependencies between it and FileServer, as discussed below.

As mentioned, each (ordinary) file that is open is managed by an instance of a FileServer resource. The FileServer resource exports one operation, fsopen, which is called by a DirectoryManager each time the file handled by the server is opened. The fsopen operation returns a file descriptor that is in turn given to the client who opened the file. FileServer resources have the following outline:


    spec FileServer
      import File, DirectoryManager, DiskServer
      op fsopen( ... ) returns fd : File.FileDesc
    resource FileServer( dm : cap DirectoryManager; disk : cap DiskServer; ... )
      # Declarations of shared objects,
      # such as buffers and utility procedures.

      proc fsopen( ... ) returns fd
        op read  File.Read
        op write File.Write
        op seek  File.Seek
        op close File.Close
        # Other declarations and initialization.

        # Assign capabilities for locally declared file operations
        # to fd, the return value.
        fd.read := read
        fd.write := write
        fd.seek := seek
        fd.close := close

        # Return those capabilities.
        reply

        # Service file operations until client invokes close.
        do true ->
          in read( ... ) -> ...
          [] write( ... ) -> ...
          [] seek( ... ) -> ...
          [] close( ... ) -> exit
          ni
        od

        # Clean up before terminating;
        # e.g., call close( ... ) in dm (the DirectoryManager).
        dm.close( ... )
      end
    end


Above, a new process is created to service each invocation of fsopen (this happens since fsopen is implemented by a proc). This process declares instances of the different file-access operations. Capabilities for these operations are assigned to the fields of fd; then reply is used to return the file descriptor to dm, the DirectoryManager that called fsopen. After replying, the process services the file-access operations until the file is closed. Note that the fact that reply is used is transparent to the caller of fsopen. Also note that here the parameterization of each file-access operation comes from the previously specified optype declaration in File. Note further that the operation declarations are local to fsopen; thus, each instance of proc fsopen has its own set of local operations. This provides each file server client with private channels for communicating with one instance of fsopen. Finally, note that the first parameter of the FileServer resource is a capability for a DirectoryManager. This is used by the FileServer to invoke the DirectoryManager's close operation when a user closes a file.

Recall that DirectoryManager exports two operations: open and close. Clients call open to acquire access to a file managed by some instance of FileServer. Instances of FileServer call close as described above when a client closes an open file. The body of the DirectoryManager resource is:


    body DirectoryManager
      op local_open( ... ) returns fsc : cap FileServer
      # other shared declarations

      initial
        # Initialize shared variables
      end

      proc open(pn, ... ) returns fd
        var fsc : cap FileServer
        # Search path pn to find file location, size, etc.

        # Get capability for a FileServer.
        fsc := local_open( ... )

        # Open the FileServer; get back a file descriptor and return it to caller.
        fd := fsc.fsopen( ... )
      end

      process count_manager
        do true ->
          in local_open( ... ) returns fsc ->
            # Increment reference count for file; create FileServer if necessary.
            # Return capability for FileServer.
            if use existing FileServer -> fsc := saved value
            [] else -> fsc := create FileServer( myresource(), ... )
            fi
          [] close( ... ) ->
            # Decrement reference count; destroy FileServer if necessary.
          ni
        od
      end
    end

Since open is implemented by a proc, a new instance of this proc is created each time open is called. This allows opens to be processed concurrently, except when they access the DirectoryManager's shared variables within local_open. In contrast, close is serviced by an in statement within process count_manager. One instance of this process is created automatically when DirectoryManager is initialized; it repeatedly services calls of close as well as local_open. Thus, invocations of local_open and close execute with mutual exclusion, which ensures that reference counts are accurate. Note that neither local_open nor


count_manager is exported from DirectoryManager. Also note that when the DirectoryManager creates a FileServer, it passes a capability for itself to the FileServer; this capability is obtained by invoking the myresource predefined operation.
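The reference-counting protocol that count_manager carries out can be sketched outside SR. The following Python fragment is illustrative only: the class name, callback parameters, and dictionary representation are our own inventions, not part of Saguaro or SR. A single lock stands in for the mutual exclusion that the lone count_manager process provides; a FileServer is created on first open, reused thereafter, and destroyed when the last client closes.

```python
import threading

class DirectoryManagerSketch:
    """Illustrative sketch (not SR): serialize local_open and close so
    that per-file reference counts stay accurate."""

    def __init__(self, create_server, destroy_server):
        self._lock = threading.Lock()   # stands in for the single count_manager process
        self._servers = {}              # path -> (server, reference count)
        self._create = create_server    # hypothetical callback: create a FileServer
        self._destroy = destroy_server  # hypothetical callback: destroy a FileServer

    def local_open(self, path):
        with self._lock:
            if path in self._servers:   # reuse the existing FileServer
                server, count = self._servers[path]
                self._servers[path] = (server, count + 1)
            else:                       # first open: create a new FileServer
                server = self._create(path)
                self._servers[path] = (server, 1)
            return server

    def close(self, path):
        with self._lock:
            server, count = self._servers[path]
            if count == 1:              # last client closed: destroy the FileServer
                del self._servers[path]
                self._destroy(server)
            else:
                self._servers[path] = (server, count - 1)
```

A second open of the same path returns the same server object, mirroring the way one FileServer instance services all clients that have the file open.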

DirectoryManager and FileServer are mutually dependent: DirectoryManager creates and invokes operations in FileServer, and FileServer invokes an operation in DirectoryManager. Such dependency requires that the spec's for both resources be compiled before the body of either. Thus, as shown above, DirectoryManager is written with a separate spec and body and FileServer is written as a combined resource. Alternatively, FileServer could have been written with a separate spec and body and DirectoryManager could have been written as a combined resource, or they both could have been written with a separate spec and body. In this example, any of the alternatives is possible because neither resource uses in its spec an object declared in the spec of the other. If this were not the case, the spec for the resource that defines such an object would need to be compiled before the other spec, and it would therefore need to be separate from its body.


CHAPTER 5

Implementation Overview

Our SR implementation consists of three major components: the compiler, linker, and run-time support. In this chapter, we describe how SR programs are built and executed using these components and then how the major language features are implemented. We conclude by giving the status of the current implementation and some measurements of its size and performance. The same basic philosophy that guided the design of the language has guided its implementation: common uses of language features should have a simple, efficient implementation.

The compiler takes as input SR source files that each contain one or more language components: entire resources, resource specs, resource bodies, or globals. Compilation takes place in the context of a special directory named 'Interfaces'. The 'Interfaces' directory contains files that collectively provide symbol table information for all components that have been compiled. Internally, the compiler has a traditional structure: a lexical analyzer, recursive-descent parser, and machine-code generator. The lexical analyzer and parser employ common techniques. The code generator uses the Amsterdam Compiler Kit (ACK) [Tane83] to produce generated code (GC). Optimizations are done by the parser at the intermediate code level and by ACK at the machine code level. For example, at the intermediate code level the parser determines whether a multiple assignment statement might require saving temporary copies of the values of any of its variables; if not, a series of regular assignments is generated. At the machine code level, ACK performs optimizations such as replacing instructions by less expensive equivalents (e.g., replacing a


"multiply by two" instruction by a "shift left one bit" instruction) and removing unreachable code and jumps to jumps.
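The strength-reduction example just mentioned can be sketched in a few lines. This Python fragment is a hedged illustration, not ACK's actual representation or algorithm; instructions are simply (opcode, register, constant) tuples, and a multiply by any power of two is rewritten as a shift.

```python
def strength_reduce(instrs):
    # Replace "multiply by 2^k" with "shift left k bits", the example
    # optimization from the text; other instructions pass through unchanged.
    out = []
    for op, reg, val in instrs:
        if op == "mul" and val > 0 and (val & (val - 1)) == 0:
            out.append(("shl", reg, val.bit_length() - 1))
        else:
            out.append((op, reg, val))
    return out
```

The power-of-two test `(val & (val - 1)) == 0` is the standard bit trick; a multiply by a non-power such as 6 is left alone.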

The linker provides the means by which the user constructs a program from previously compiled resources. The resources in a given program must have been compiled in the context of the same 'Interfaces' directory and with the same user-defined machine enumeration type (or none). The input to the linker is a list of resource patterns to be included on each machine enumerated in machine. If a program does not define a machine type, the linker assumes the program is to execute on the machine on which the linking is performed. The linker parses and verifies the legality of its input (e.g., it checks to make sure that each named resource pattern has been compiled) and then uses the standard UNIX linker to create a load module for each machine. The input to the linker also designates one of the machines as the program's "main" machine and one of the resource patterns on that machine as the program's "main" resource. One instance of the main resource is created on the main machine when the program begins. (See Appendix B for more details.)

The run-time support (RTS) provides the environment in which the GC executes. The RTS provides primitives for resource creation and destruction, invocation and servicing of operations, and memory allocation; it also supports the implementation-specific language mechanisms described in Sec. 5.4. Internally, the RTS contains a nugget: a small collection of indivisible process management and synchronization primitives. The RTS hides the details of the network from the GC; i.e., the number of machines and their topology are transparent to the GC. When the RTS receives a request for a service provided on another machine (e.g., create a resource or invoke an operation), it simply forwards the request to the destination machine. Upon arrival at that machine, the local RTS processes the request just as though it had been generated locally. Results from such requests are transmitted back in a similar fashion.


The following table illustrates the hierarchical relationship between the generated code, run-time support, nugget, and host operating system (i.e., UNIX). In particular, a component calls a service provided in another component if and only if the calling component is above and abuts the servicing component.

    +---------------------------------------------+
    |               Generated Code                |
    +---------------------------------------------+
    |              Run-time Support               |
    +----------------------+----------------------+
    |  Network Interface   |    I/O Interface     |
    +----------+-----------+----------------------+
    |  Nugget  |       Host Operating System      |
    +----------+----------------------------------+

The following three examples illustrate the kinds of relationships represented in the above

table.

• The GC uses the RTS to create resources and uses the nugget's P and V primitives for

operations implemented as semaphores.

• The RTS uses the nugget for synchronization and uses the host operating system to allocate memory. It also uses network and I/O interfaces.

• The network interface uses the nugget for synchronization and uses the host operating system to transport messages to other machines.

The vast majority of each of the three implementation components is written in machine-independent C. The exceptions are that a few awk and sed tools are used to simplify maintenance of the compiler, lex is used to generate the lexical analyzers in the compiler and linker, and the RTS nugget contains some assembly language code for process management (e.g., stack setup and context switches).


5.1. Supporting Separate Compilation

As described above, a component is compiled in the context of an 'Interfaces'

directory. When a resource spec or global component is compiled, the compiler creates in

the 'Interfaces' directory a file in which it records symbol table information about that

component; such a file is called an interface file. The information stored in an interface file

is a linearization of the symbol table entries for the component, with absolute pointers

(addresses) in the symbol table replaced by relative pointers (integers) in the interface file.

Linearizing the symbol table is in fact somewhat complicated because it can contain cycles

resulting from pointers in declarations and circularities between resources; see below for

details. When a resource body is compiled separately from its specification, the compiler

uses the specification's interface file to initialize the symbol table preparatory to compiling

the body. Similarly, when one component imports another the compiler uses the imported

component's interface file to build symbol table entries for the objects exported by that

component. In both cases, as the symbol table entries are made, the relative pointers that

were stored in the interface file are changed to absolute pointers.

SR permits mutual dependency between resources in the sense that one resource

can import another before the second resource has been compiled (this was described in

Sec. 3.1 and illustrated in Sec. 4.6). Recall that this is restricted, however, to preclude a

resource from referencing components exported from an as-yet uncompiled resource. As a

specific example, suppose A imports B and B imports A, and that the specification for A

was compiled before the specification for B. Then, A can import and use B (e.g., it can

declare a parameter with type cap B) but cannot use any object declared in B. On the

other hand, B can use A and the objects declared in A.

This mutual dependency complicates maintenance of the interface files. Continuing the previous example, suppose both A and B have been compiled once and that A is


now being recompiled. Then, when importing B into A, the import mechanism would find that B in turn imports A, so it would try to import A into B, and so on. To avoid an infinite import loop, the compiler maintains a list of already-imported components and where they are located in the symbol table. When a component is to be imported, the compiler first consults this list. If it finds the component on the list, it does not attempt to read the component's interface file; instead, it uses the symbol table entries already present for the component. Note that this approach also enables the compiler to avoid reading in duplicate information and building duplicate symbol table information for components that have already been imported. For example, if A imports B and C and if B and C both import D, then the interface file for D is read in only once and the symbol table information for D is built only once.
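The already-imported list amounts to a memoized, depth-first import. The following Python fragment is illustrative only; the dictionary-based "interface file" format is invented for the sketch. Each component's interface is read at most once, so circular imports terminate and shared imports are not duplicated.

```python
def import_component(name, interfaces, symtab, imported=None):
    """interfaces maps a component name to (its entries, the names it
    imports); symtab is the growing symbol table (a list)."""
    if imported is None:
        imported = {}
    if name in imported:               # on the list: reuse existing entries
        return imported[name]
    entries, deps = interfaces[name]   # read the interface file (once)
    imported[name] = entries           # record before recursing, so cycles stop
    symtab.extend(entries)
    for dep in deps:                   # import what the component itself imports
        import_component(dep, interfaces, symtab, imported)
    return entries
```

Recording the component before recursing is what breaks the A-imports-B, B-imports-A loop: when the recursion comes back around to A, it is already on the list.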

To write an interface file, the compiler starts at the beginning of the symbol table and writes each entry that it has not yet written. Entries for certain kinds of declarations point at other entries that also need to be written. For example, the entry for a record type points to its fields and the entry for an operation points to its parameters. Thus, when writing such entries, the compiler follows these pointers recursively. Since the symbol table may contain cycles, the compiler marks each symbol table entry as it is written to avoid infinite loops. For example, in the declaration

    type Node = rec( info : int; next : ptr Node; prev : ptr Node )

the entry for next points at Node. Thus, the compiler writes the entry for next and then follows next's pointer to Node. Since Node is marked as having already been written, the compiler does not write that entry again; instead, it returns and continues by writing the entry for prev. A symbol table entry for an imported component points at the objects defined in the resource's specification or defined within the global component. Only the entry for the component needs to be written; the entries for its objects do not because, as


described earlier, the import mechanism reads them from the imported component's interface file. However, these entries are marked as having been written so that they will not be written to the current component's interface file. Consider, for example, the following two components.

    global Geometry
      type Coord = rec( x, y : int )
    end

    spec User
      import Geometry
      op f( p : Geometry.Coord; q : int )
    resource User() separate

When compiling User, the symbol table entry for p points at the entry for Coord, which

was imported as part of Geometry. The compiler writes the entry for p and then follows

p's pointer to Coord. Since Coord has been marked, the compiler does not write it; instead,

the compiler continues by writing the entry for q.

Each interface file entry is assigned a unique number relative to all other entries in interface files in a particular 'Interfaces' directory; these numbers are used instead of addresses to represent pointers in the symbol table. Thus, in the above example, the pointer to Coord in the symbol table entry for p is replaced by Coord's unique number. When an interface file is later read to compile the body of the resource or to import its specification, these numbers are used to create the corresponding linkages in the symbol table. New numbers are assigned each time an interface file is written; this makes it possible to detect when an imported component has been recompiled without its importers also being recompiled. (This also allows the unique numbers to be assigned sequentially.)
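The write/read cycle can be sketched concretely. In this illustrative Python fragment (our own representation, not the SR compiler's), a symbol table entry is a dict whose values may reference other entries, possibly cyclically. Writing replaces each reference by the target's sequentially assigned unique number and marks entries so none is written twice; reading turns the numbers back into absolute (object) pointers.

```python
def write_interface(roots):
    numbers, records = {}, []
    def write(entry):
        if id(entry) in numbers:       # marked: already written, reuse its number
            return numbers[id(entry)]
        num = len(records)             # unique numbers assigned sequentially
        numbers[id(entry)] = num       # mark before recursing, so cycles stop
        records.append(None)           # reserve this entry's slot
        records[num] = {k: ("ref", write(v)) if isinstance(v, dict) else v
                        for k, v in entry.items()}
        return num
    for root in roots:
        write(root)
    return records

def read_interface(records):
    entries = [dict(r) for r in records]
    for entry in entries:              # relative numbers -> absolute pointers
        for k, v in entry.items():
            if isinstance(v, tuple) and v[0] == "ref":
                entry[k] = entries[v[1]]
    return entries
```

Marking an entry before recursing into the entries it points at is exactly what keeps a self-referential declaration like the Node example from looping forever.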


5.2. Resource Creation and Destruction

On each processor, the RTS maintains a table of resource patterns and a table of active resource instances. The pattern table on a particular processor indicates which patterns are loaded on the processor; this table's value is set by the linker and remains constant throughout execution. The instance table contains one entry for each active resource instance on that processor. A resource capability thus consists of (1) a processor identity, a pointer into a resource instance table, and a sequence number and (2) an operation capability for each of the operations declared in the resource's specification (operation capabilities are described below in Sec. 5.3). The sequence number for a resource is assigned when the instance is created; it is stored in the resource instance table. The RTS uses sequence numbers to determine whether a resource capability refers to a resource instance that still exists. For example, when the GC requests that a resource instance be destroyed, the RTS compares the sequence number in the resource capability provided by the GC with the sequence number in the appropriate entry in the instance table to determine whether the referenced resource instance has already been destroyed.
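Sequence-number checking amounts to what is now often called a generation counter. The following Python sketch uses an invented representation (the real instance table holds much more per entry): destroying an instance bumps the freed entry's sequence number, so an old (index, sequence) pair no longer matches and is detected as stale even after the entry is reused.

```python
class InstanceTable:
    """Illustrative sketch of sequence-numbered table entries."""

    def __init__(self, size):
        self.seq = [0] * size          # sequence number per table entry
        self.live = [False] * size

    def create(self):
        idx = self.live.index(False)   # allocate a free entry
        self.live[idx] = True
        return (idx, self.seq[idx])    # the capability's (pointer, seqno) part

    def valid(self, cap):
        idx, seq = cap                 # capability matches only a live entry
        return self.live[idx] and self.seq[idx] == seq

    def destroy(self, cap):
        if not self.valid(cap):        # already destroyed: detected as invalid
            return False
        idx = cap[0]
        self.live[idx] = False
        self.seq[idx] += 1             # invalidate all outstanding capabilities
        return True
```

A second destroy of the same capability fails the sequence-number comparison, which is the double-destroy check described above.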

To create a resource instance, the GC for create builds a creation block that contains the identity of the pattern to be created (i.e., its index in the pattern table1), the processor on which it is to be created, and the values of any parameters. This block is passed to the RTS, which transmits it to the designated processor. When the creation block arrives at the designated processor, the (local) RTS verifies that the pattern exists on the processor and, if so, allocates a table entry for the instance and fills in the first part of the resource capability accordingly. The RTS then creates a process to execute the resource's

1 Since the index in the pattern table is not known until link time, the compiler uses a symbolic constant that is defined by the linker.


initialization code. (All instances of a particular resource on the same machine share a single copy of the resource's GC.)

The GC for every resource includes initialization code even if there is no user-specified initialization code. The key functions of such code are to allocate memory for resource variables (the size of which may depend on the parameters in the resource heading), to initialize resource variables that have initialization expressions as part of their declaration, and to create operations declared in the resource's specification or outer level of its body. To accomplish operation creation, the GC interacts with the RTS. For each operation that is being created, the RTS allocates and initializes an entry in its operation table (see Sec. 5.3); if the operation is in the resource's specification, the RTS also fills in the appropriate field in the resource capability that will be returned from create. After this implicit initialization code executes, any user-specified initialization code is executed. When the initialization process completes the user code or executes a return, it executes additional implicit initialization code that creates any background processes (i.e., those declared with process) in the resource. When the initialization process terminates or executes a reply statement, the RTS passes the new resource capability back to the creator.

To destroy a resource instance, the GC passes the RTS a capability for the instance. If the resource contains finalization code, the RTS creates a process to execute that code. When that process terminates, or if there was no finalization code, the RTS uses the resource instance table to locate processes, operations, and memory that belong to the resource instance. The RTS then kills the processes, frees the entries in the resource and operation tables, and frees the resource's memory. The sequence number in each freed entry is incremented so that future references to a resource that has been destroyed or to one of its operations can be detected as being invalid.


When an SR program begins execution, first the nugget and then the RTS initialize themselves. Then an instance of the main resource is created on the main machine in much the same way that any other resource instance is created.

5.3. Operations

The RTS also maintains an operation table on each processor. This table contains an entry for each operation that is serviced on that processor and is currently active. The entry indicates whether the operation is serviced by a proc or by input statements. For an operation serviced by a proc, the entry contains the address of the code for the proc. For an operation serviced by input statements, the entry points to its list of pending invocations. An operation capability thus consists of a processor identity, an index into an operation table, and a sequence number. The sequence number serves a purpose analogous to the sequence number in a resource capability: it enables the RTS to determine whether an invocation refers to an operation that still exists. (An operation exists until its defining resource is destroyed or its defining block is exited.)

5.3.1. Invocation Statements

To invoke an operation, the GC first builds an invocation block, which consists of header information and actual parameter values. The GC fills in the header with the kind of invocation (call, send, concurrent call, or concurrent send) and the capability for the operation being invoked. Then, the GC passes the invocation block to the RTS. If necessary, the RTS transmits the invocation block to the machine on which the operation is located (recall that capabilities contain processor identities). The RTS then uses the index in the operation capability to locate the entry in the operation table, and thus determine how the operation is serviced. For an operation serviced by a proc, the RTS creates a process and passes it the invocation block.2 For an operation serviced by input statements, the RTS places the invocation block onto the list of invocations for the operation; then it determines if any process is waiting for the invocation and, if so, awakens such a process. In either case, for a call invocation the RTS blocks the calling process; when the operation has been serviced, that process is awakened and retrieves any results from the invocation block.
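The two servicing disciplines can be sketched with threads standing in for SR processes. This Python fragment is a loose illustration; the dict-based operation table entries and their field names are invented. A proc-serviced invocation gets a fresh thread; an input-serviced one is queued for a servicer; and a call blocks the invoker until the invocation block carries results back.

```python
import queue, threading

def rts_invoke(op_entry, inv_block, kind):
    """Illustrative sketch: dispatch an invocation block according to
    how the target operation is serviced."""
    done = threading.Event()
    inv_block["done"] = done
    if op_entry["kind"] == "proc":          # create a process to run the proc body
        def run():
            op_entry["code"](inv_block)
            done.set()
        threading.Thread(target=run).start()
    else:                                   # append to the pending-invocation list;
        op_entry["pending"].put(inv_block)  # a waiting input statement is awakened
    if kind == "call":
        done.wait()                         # caller blocks until serviced
        return inv_block.get("result")
```

A send invocation with this sketch simply returns without waiting, matching the asynchronous semantics described in the text.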

The implementation of co statements builds on the implementation of call and send statements. First, the GC informs the RTS when it begins executing a co statement. The RTS then allocates a structure in which it maintains the number of outstanding call invocations (i.e., those that have been started but have not yet completed) and a list of call invocations that have completed but have not been returned to the GC. Second, the GC performs all the invocations without blocking. For each call invocation the GC places an arm number (the index of the concurrent command within the co statement) in the invocation block. Third, since send invocations complete immediately, the GC executes the post-processing block (if any) corresponding to each send invocation. The GC then repeatedly calls an RTS primitive to wait until call invocations complete. For each completed call invocation, the GC executes the post-processing block (if any) corresponding to the invocation; specifically, it uses the arm number in the invocation block as an index into a jump table of post-processing blocks. When all invocations have completed, or when one of the post-processing blocks executes exit, the GC informs the RTS that the co statement has terminated. The RTS then discards any remaining completed call invocations and arranges to discard any call invocations for this co statement that might complete in the future. The infrequent situation in which a post-processing block itself contains a co statement is handled by a slight generalization of the above implementation.

2 In some cases, the RTS can avoid creating a process; see Sec. 5.3.3 for details.
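Stripped of RTS bookkeeping, the arm-number dispatch can be sketched with a thread pool. This Python fragment is illustrative only; concurrent.futures stands in for the RTS structure, and the invocation and post-block representations are invented. Each call invocation is tagged with its arm number, completed invocations are matched to post-processing blocks through that number, and a block returning 'exit' ends the co statement early.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def co_statement(invocations, post_blocks):
    """invocations: list of (function, args) pairs, one per arm;
    post_blocks: one callable (or None) per arm, applied to the result."""
    with ThreadPoolExecutor() as pool:
        arm_of = {pool.submit(fn, *args): arm     # arm number per invocation
                  for arm, (fn, args) in enumerate(invocations)}
        for fut in as_completed(arm_of):
            arm = arm_of[fut]                     # index into the "jump table"
            block = post_blocks[arm]
            if block is not None and block(fut.result()) == "exit":
                break                             # stop running post-processing blocks
```

Storing the arm number with each submitted invocation is the analogue of placing it in the invocation block: it lets completions arrive in any order and still find their post-processing code.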

Concurrent invocations within a co statement may also contain quantifiers,

which complicate only the GC in the above implementation. For a quantifier that precedes

a call invocation, the complier generates the appropriate loop(s) around the usual invoca­

tion code. Bound variables can be referenced in the post-processing block; their (initial)

values there are the same as their values when the invocation whose completion triggered

execution of the block was generated. Therefore, the GC places the values of the bound

variables in the invocation block; it later uses these values for the bound variables during

execution of the post-processing block. For a quantifier that precedes a send invocation,

the compiler generates a loop around the invocation code just as it does for a call invoca­

tion. In addition, the compiler generates a second loop, identical to the first, around the

post-processing block. This loop is executed immediately after all invocations in the co

statement have been started. Since two copies of the loop indicated by the quantifer are

needed, care is taken to ensure that expressions in the quantifier are evaluated just once:

their values are calculated before the first loop and saved for use in the second loop. Note

that the technique of storing the values of the bound variables in the invocation block and

later extracting them, as was done for call invocations, would not work for send invoca­

tions because the RTS does not return send invocation blocks to the GC.
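The two-loop scheme described above can be sketched in C. This is an illustrative sketch only, not the actual generated code; the names `start_send`, `run_post_block`, and `quantified_send` are invented, and the key point is that the quantifier's bound expressions are evaluated once and their saved values drive both loops.

```c
/* Sketch of the compiler's two-loop scheme for a quantified send
 * invocation inside a co statement. All names are illustrative. */
#include <stddef.h>

int sent[64];        /* records which invocations were started  */
int posted[64];      /* records which post-blocks were executed */
int nsent, nposted;

static void start_send(int i)     { sent[nsent++] = i; }
static void run_post_block(int i) { posted[nposted++] = i; }

/* lb_val and ub_val stand for the quantifier's bound expressions,
 * already evaluated once before the first loop; the same saved
 * values are reused for the second loop over the post-processing
 * block, which runs only after all sends have been started. */
void quantified_send(int lb_val, int ub_val)
{
    int i;
    /* first loop: start all send invocations */
    for (i = lb_val; i <= ub_val; i++)
        start_send(i);
    /* second loop (identical bounds): run the post-processing block */
    for (i = lb_val; i <= ub_val; i++)
        run_post_block(i);
}
```

Evaluating the bounds once before both loops is what guarantees that the quantifier's expressions produce the same iteration set for the sends and for their post-processing blocks.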

A different approach to handling bound variables in concurrent invocations

would be to record their values in a table or list. These values could be associated with an

invocation by (1) storing the address of the corresponding values in each invocation block

or (2) storing the address of the invocation block along with each table or list entry. Nei­

ther of these is better than our approach because they still would store something in the

invocation block or would search a table or list for each invocation that completes. In

addition, if a table were used, it would need to be allocated dynamically after determining


how many invocations are in the entire co statement; if a list were used, a node would need

to be allocated as each invocation is started. Our approach is much simpler and more

efficient.

5.3.2. The Input Statement

The input statement is the most complicated statement in the language. In its

most general form, a single input statement can service one of several operations and can

use synchronization and scheduling expressions to select the invocation it wants. Moreover, an operation can be serviced by input statements in more than one process, which

thus compete to service invocations. For these reasons, the implementation of input state­

ments is more complicated than the implementation of the other language features. How­

ever, as we shall see, the implementation of simple, commonly occurring cases is optimized.

Classes are fundamental to the implementation of input statements. Recall that

a static class of operations is an equivalence class of the transitive closure of the relation

"serviced by the same input statement" (see Sec. 3.2.3). They are used to identify and con­

trol conflicts between processes that are trying to service the same invocations. At compile

time, the compiler groups operations into static classes based on their appearance in input

statements. At run-time, a dynamic class is represented by a class structure, which is

maintained by the RTS. Each operation table entry points to its operation's class struc­

ture.

A class structure contains a flag indicating whether or not some process currently

has access to the class structure, old and new invocation lists, and old and new process

lists. The invocation and process lists are used to ensure that processes obtain access to

invocations in FCFS order as required by the semantics (Sec. 3.2.3). The old invocation

list contains pending invocations of operations in the class that interested processes can

search. The new invocation list contains invocations that arrive while a process is


searching the old invocation list; such invocations are not immediately available to the

searching process so that processes that have been waiting longer can examine them first.

The old process list contains processes that have searched the old invocation list but did

not find a selectable invocation; they are, therefore, waiting for new invocations to arrive.

The new process list contains processes that are waiting to search the old invocation list.

The process that is currently searching the old invocation list is not on either process list.
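The class structure just described might be rendered in C as follows. This is a hypothetical sketch: the struct, field, and type names (`class_struct`, `inv_node`, `proc_node`, `class_init`) are invented for illustration and are not taken from the actual RTS source.

```c
/* Hypothetical C rendering of the RTS class structure described
 * above; all names are invented for illustration. */
#include <stddef.h>

struct inv_node  { struct inv_node  *next; /* pending invocation block   */ };
struct proc_node { struct proc_node *next; /* blocked process descriptor */ };

struct class_struct {
    int in_use;                  /* some process currently has access     */
    struct inv_node  *old_invs;  /* pending invocations open to search    */
    struct inv_node  *new_invs;  /* arrived while a search was in progress */
    struct proc_node *old_procs; /* searched old list, found nothing      */
    struct proc_node *new_procs; /* waiting to search the old list        */
};

void class_init(struct class_struct *c)
{
    c->in_use    = 0;
    c->old_invs  = c->new_invs  = NULL;
    c->old_procs = c->new_procs = NULL;
}
```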

To illustrate how the RTS maintains these lists and provides the required FCFS

order, consider what happens when a new invocation arrives. The RTS first finds the class

structure associated with the operation. It does so by mapping the capability for the

invoked operation (in the invocation block) to the operation's operation table entry; it then

uses a pointer in that entry to locate the class structure. If no process is currently search­

ing the class structure's old invocation list, the invocation is added to the old invocation

list and the old process list is moved to the new process list. On the other hand, if some

process is currently searching, the invocation is added to the new invocation list. When

that process completes its search, the new invocation list is moved to the end of the old

invocation list, and the old process list, together with the searching process if the search

found no selectable invocation, are moved to the beginning of the new process list. In

either case, the first process (if any) on the new process list is awakened; that process will

then search the old invocation list.
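The arrival-time bookkeeping can be sketched as follows. This is a simplified illustration under assumed names (`cls`, `node`, `invocation_arrives`, `append`); it shows only the list movements, not the capability-to-class lookup or the wakeup of the first waiting process.

```c
/* Sketch of the FCFS bookkeeping performed when a new invocation
 * arrives for a class; names and list representation are illustrative. */
#include <stddef.h>

struct node { struct node *next; int id; };

struct cls {
    int searching;              /* nonzero while some process searches old_invs */
    struct node *old_invs, *new_invs;
    struct node *old_procs, *new_procs;
};

/* append list b to the end of list a, returning the new head */
static struct node *append(struct node *a, struct node *b)
{
    struct node *p = a;
    if (!a) return b;
    while (p->next) p = p->next;
    p->next = b;
    return a;
}

void invocation_arrives(struct cls *c, struct node *inv)
{
    inv->next = NULL;
    if (!c->searching) {
        /* no search in progress: make the invocation available and
         * let the previously unsuccessful processes re-examine it */
        c->old_invs  = append(c->old_invs, inv);
        c->new_procs = append(c->old_procs, c->new_procs);
        c->old_procs = NULL;
    } else {
        /* a search is in progress: hold the invocation back so that
         * longer-waiting processes get to examine it first */
        c->new_invs = append(c->new_invs, inv);
    }
}
```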

The RTS and nugget together provide seven primitives that the GC uses for input statements. These primitives are tailored to support common cases of input statements and have straightforward and efficient implementations. They are:

access(class) - Acquire exclusive access to class, which is established as the current class structure for the executing process. That process is blocked if another process already has access to class. The RTS will release access when this process blocks in trying to get an invocation or when this process executes remove (see below).

get_invocation() - Return a pointer to the invocation block the executing process should examine next. This invocation is on the old invocation list in the current class structure of the executing process; successive calls of this primitive return successive invocations. If there is no such invocation, the RTS releases access to the executing process's current class structure and blocks that process.

get_namedinv(op_cap) - Get the next invocation of operation op_cap (an operation capability) in the executing process's current class; a pointer to the invocation block is returned.

get_namedinv_nb(op_cap) - Get an invocation of op_cap. This primitive is identical to get_namedinv except that it does not block the executing process if no invocation is found; instead it returns a null pointer in that case. It is used when the input statement contains a scheduling expression.

remove(invocation) - Remove the invocation block pointed at by invocation from the invocation list of the executing process's current class. The RTS also releases access to the executing process's current class structure.

input_done(invocation) - Inform the RTS that the GC has finished executing the command body in an input statement and is therefore finished with the invocation block pointed at by invocation. If that invocation was called, the RTS passes the invocation block back to the invoking process and awakens that process.

receive(class) - Get and then remove the next invocation in class. This primitive is a combination of access(class), invocation := get_invocation(), and remove(invocation); hence, it returns a pointer to an invocation block. It is used for simple input statements and for receive statements.


The ways in which these primitives are used by the GC are illustrated below by four examples. More complicated input statements are implemented using appropriate combinations

of the primitives.

Consider the simple input statement:

in q(x) -> ... ni

This statement delays the executing process until there is some invocation of q, then ser­

vices the oldest such invocation. (Note that receive statements expand into this form of

input statement.) For this statement, if q is in a class by itself the GC executes

invocation := receive(q's class). If q is not in a class by itself, the GC executes access(q's class), invocation := get_namedinv(q), and remove(invocation). In either case, the GC then

executes the command body associated with q, with parameter x bound to the value for x

in the invocation block, and finally executes input_done(invocation).

As the second example, consider:

in q(x) -> ... [] r(y,z) -> ... ni

This statement services the first pending invocation of either q or r. Note that q and r are

in the same class because they appear in the same input statement. Here, the GC first uses

access(q's class) and then invocation := get_invocation() to look at each pending invocation

in the class to determine if it is an invocation of q or r (there might be other operations in

the class). If the GC finds an invocation of q or r, it calls remove(invocation), then executes

the corresponding command body with the parameter values from the selected invocation

block, and finally executes input_done(invocation). If the GC finds no pending invocation

of q or r, the executing process blocks in get_invocation until an invocation in the class

arrives. When such an invocation arrives, the RTS awakens the process, which then

repeats the above steps.


As the third example, consider an input statement with a synchronization expression:

in q(x) and x > 3 -> ... ni

This statement services the first pending invocation of q for which parameter x is greater

than three. The GC first uses access( q's class) to obtain exclusive access to q's class. The

GC then uses invocation := get_invocation() or invocation := get_namedinv(q) to obtain invocations of q one at a time; the first primitive is used if q is in a class by itself, otherwise the

second is used. For each such invocation, the GC evaluates the synchronization expression

using the value of the parameter in the invocation block. If the synchronization expression

is true, the GC notifies the RTS of its success by calling remove(invocation), executes the

command body associated with q, and calls input_done(invocation). If the synchronization

expression is false, the GC repeats the above steps to obtain the next invocation.
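The loop the GC executes for this example can be sketched in C. The stand-ins below are assumptions for illustration: `pending` replaces the class's invocation list, and `get_namedinv_stub` replaces the blocking RTS primitive (returning NULL when the list is exhausted, where the real primitive would block the process).

```c
/* Sketch of the GC's loop for "in q(x) and x > 3 -> ... ni",
 * with the RTS primitives replaced by a simple array walk. */
#include <stddef.h>

/* stand-in for the pending invocations of q (their x parameters) */
static int pending[] = {1, 2, 7, 5};
static int next_inv = 0;

static int *get_namedinv_stub(void)
{
    if (next_inv >= (int)(sizeof pending / sizeof pending[0]))
        return NULL;          /* real primitive would block here */
    return &pending[next_inv++];
}

/* returns the first pending x with x > 3, or -1 if none selectable */
int service_q(void)
{
    int *inv;
    while ((inv = get_namedinv_stub()) != NULL) {
        if (*inv > 3) {       /* synchronization expression x > 3 */
            /* remove(inv); run command body; input_done(inv); */
            return *inv;
        }
        /* expression false: examine the next invocation */
    }
    return -1;
}
```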

As the final example, consider an input statement with a scheduling expression:

in q(x) by x -> ... ni

This statement services the (oldest) pending invocation of q that has the smallest value of

parameter x. In this case, the GC uses the same steps as in the previous example to obtain

the first invocation of q. It then evaluates the scheduling expression using the value of the

parameter in the invocation block; this value and a pointer, psave, to the invocation block

are saved. The GC then obtains the remaining invocations by repeatedly calling

invocation := get_namedinv_nb(q). For each of these invocations, the GC evaluates the

scheduling expression and compares it with the saved value, updating the saved value and

pointer if the new value is smaller. When there are no more invocations (i.e., when

get_namedinv_nb returns a null pointer), psave points to the invocation with the smallest

scheduling expression. The GC acquires that invocation by calling remove(psave), then executes the command body associated with q, and finally calls input_done(psave).
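The minimum-selection scan can be sketched as follows. This is an illustrative version under assumed names (`select_by_min`, with the invocation list flattened into an array of parameter values); it shows the essential logic of saving the first value and pointer and updating them only on strictly smaller values, so ties keep the older invocation.

```c
/* Sketch of the GC's selection loop for "in q(x) by x -> ... ni":
 * evaluate the scheduling expression for the first invocation, then
 * scan the rest, keeping psave pointing at the smallest so far. */
#include <stddef.h>

/* selects, from n pending parameter values, the oldest invocation
 * with the smallest x; returns its index, or -1 if n == 0 */
int select_by_min(const int *x, int n)
{
    int i, psave, saved;
    if (n == 0) return -1;
    psave = 0;                   /* first invocation (get_namedinv)   */
    saved = x[0];                /* its scheduling-expression value   */
    for (i = 1; i < n; i++) {    /* get_namedinv_nb until NULL        */
        if (x[i] < saved) {      /* strictly smaller: ties keep the   */
            saved = x[i];        /* older invocation                  */
            psave = i;
        }
    }
    /* remove(psave); run command body; input_done(psave); */
    return psave;
}
```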


Note that synchronization and scheduling expressions are evaluated by the GC,

not the RTS. We do this for two reasons. First, these expressions can reference objects

such as local variables for which the RTS would need to establish addressing if it were to

execute the code that evaluates the expression. Second, these expressions can contain invo-

cations; it would greatly complicate the RTS to handle such invocations in a way that does

not cause the RTS to block itself. A consequence of this approach to evaluating synchroni-

zation and scheduling expressions is that the overhead of evaluating such expressions is

paid for only by processes that use them.

5.3.3. Optimizations

Three kinds of optimizations are applied to certain uses of operations. First, for a

call invocation of a proc that is in the same resource as the caller3 and that does not con-

tain a reply statement, the compiler generates conventional procedure-call code instead of

going through the RTS, which would create a process.4 The compiler generates code that

builds an invocation block on the calling process's stack and passes the block's address to

the called proc. Thus, the code in the proc is independent of whether it is executed by the

calling process or as a separate process. A similar optimization is performed for a call

invocation of a proc that is located on the same machine as the caller and that does not

contain a reply statement. In this case, however, the RTS must be entered since the com-

piler cannot determine whether an operation in another resource is located on the same

3 The compiler only detects such invocations that are invoked directly by the name of the operation, not indirectly through a capability variable. Detecting invocations that use capability variables would require extensive analysis of the user code, and some could only be detected at run-time.

4 A proc that executes reply executes concurrently with its caller after replying; hence the proc must execute as a process in this case.


machine as its caller (recall that program linking follows and is independent of compila-

tion).5 Also, the invoking process must create an invocation block since it is possible that

the invoking process might be in a resource that is destroyed before the invoked proc completes.
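The same-resource call optimization can be sketched in C. Everything here is illustrative (`inv_block`, `f_body`, `call_f_local` are invented names); the point is that the invocation block lives on the caller's stack and the proc body receives only its address, so the body is independent of whether it runs in the calling process or in a separate one.

```c
/* Sketch of the same-resource call optimization: the generated code
 * builds the invocation block on the caller's stack and passes its
 * address, avoiding the RTS entirely. All names are illustrative. */
struct inv_block { int x; int result; };

/* body generated for a hypothetical "proc f(x) returns result" */
static void f_body(struct inv_block *ib) { ib->result = ib->x + 1; }

int call_f_local(int x)
{
    struct inv_block ib;   /* invocation block on the caller's stack */
    ib.x = x;
    f_body(&ib);           /* conventional procedure call, no RTS    */
    return ib.result;
}
```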

The third optimization is that certain operations are implemented using sema­

phores rather than the general mechanisms described above. The main criteria that an

operation must satisfy to be classified as a semaphore operation are that it: (1) is invoked

only using send, (2) has no parameters or return value, (3) is serviced by input (or

receive) statements in which it is the only operation and in which there are no synchroni­

zation or scheduling expressions, and (4) is declared at the resource level. Note that these

criteria are relatively simple to check. Furthermore, they capture typical uses of operations that provide intra-resource synchronization, such as controlling access to shared variables.

For an operation that the compiler has decided to implement using a semaphore,

the compiler generates special code to create, invoke, and service the operation. Such code

uses the semaphores provided by the nugget. (The nugget provides semaphores that the

RTS uses internally for synchronization.) The GC for a send operation is simply a V on

the semaphore and the GC for an input statement that services a semaphore operation is

simply a P on the semaphore.
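The mapping can be sketched with a minimal counting semaphore. This single-threaded sketch (the `sem`, `P`, and `V` names follow the text; the counter representation is an assumption) shows only the counting behavior; the real nugget P blocks the executing process when the count is zero rather than returning.

```c
/* Sketch of the semaphore optimization: "send g()" compiles to a V
 * and "receive g()" to a P on a nugget semaphore. */
typedef struct { int count; } sem;

static void V(sem *s) { s->count++; }

/* returns 1 if the P succeeded, 0 where the real P would block */
static int P(sem *s)
{
    if (s->count > 0) { s->count--; return 1; }
    return 0;
}
```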

5 In fact, the compiler might not even know whether an operation is implemented as a proc. That is, it might not yet have compiled the body of the resource containing the invoked operation.


5.3.4. Completion Status and Failed

On each machine, the RTS maintains a table containing the current state (up or

down) of each machine in the network. This information is updated as the result of

"heartbeat" messages that each active machine periodically broadcasts. Of course, this

information only approximates the true state of the network, but that is sufficient for

implementing completion status and 'failed'.

Each resource or operation capability contains, in addition to the fields already

described, a status field. When a capability is used in an invocation or resource control

statement, the RTS initializes its status to 'Undefined'. The RTS then checks its state

table to determine the state of the machine indicated in the capability. If that machine is

down or if the RTS cannot transmit the request to the machine and receive a reply within

an implementation-specific time interval, the RTS sets the status field in the capability to

'Crash' and returns it to the GC. If the message is delivered, the status field is set to one of

the other three status values: 'NoSpace', 'Terminated', or 'Success'. If there is insufficient

space to create a resource or proc or to store the invocation, the status is set to 'NoSpace'.

If the resource or proc indicated by a capability has already terminated or terminates

while the request is pending, the status is set to 'Terminated'. Otherwise, the status is set

to 'Success'.
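The order in which the RTS settles on a status value can be sketched as a simple decision chain. The enum and function below are illustrative assumptions, not the actual RTS code; the inputs stand for the conditions described above (machine state, timely reply, available space, target termination).

```c
/* Sketch of how the RTS might set a capability's completion status,
 * following the rules above; all names are illustrative. */
typedef enum { UNDEFINED, CRASH, NOSPACE, TERMINATED, SUCCESS } status_t;

status_t set_status(int machine_up, int reply_received,
                    int space_ok, int target_terminated)
{
    /* status starts as 'Undefined' and is overwritten below */
    if (!machine_up || !reply_received)
        return CRASH;       /* machine down, or request timed out        */
    if (!space_ok)
        return NOSPACE;     /* no room for resource, proc, or invocation */
    if (target_terminated)
        return TERMINATED;  /* already gone, or died while pending       */
    return SUCCESS;
}
```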

The 'failed' function can be used as both a regular function and as a guard in

input statements; its single argument can be either a machine name or a resource or opera­

tion capability. The implementation of 'failed' uses the machine state table to determine

the status of other machines. It also uses probe messages to other machines to determine

whether a specific resource or operation still exists. The result of a probe message is true if

a positive reply is received within a certain interval; otherwise it is false. When 'failed' is

used as a predefined function, the GC simply asks the RTS to return the status of the


argument to 'failed'. For a machine argument, the RTS examines its machine-state table

and returns the appropriate value. For a capability argument, the RTS examines the

capability to determine whether it points to a local or remote object. For a local object,

the RTS examines the local resource instance table or operation table; for a remote object,

the RTS sends a probe message to the machine indicated by the capability and the RTS on

that machine consults its resource instance table or operation table. The implementation

of 'failed' is similar when it appears as a guard in an input statement. In this case, if the

GC is informed that 'failed' is true, it executes the corresponding command body in the

input statement. Note that when 'failed' is used in an input statement, its value may

become true while the input statement is delayed. Thus, the RTS periodically has to reevaluate uses of 'failed' that appear in guards.

5.4. Status, Plans, and Statistics

The initial implementation of SR ran under 4.3BSD Vax UNIX and was completed

in November 1985. It was then ported to Sun UNIX version 3.0. The port was relatively

easy to do since most of our code is written in machine-independent C and the ACK code­

generator contains a Sun machine-code generator; we needed to rewrite only about half of

the nugget. The current implementation fixes bugs found in the initial implementation

and provides all the language's features except for a few minor ones. At present, the linker

combines the compiled resources in an SR program into one or more load modules. There

is one load module for each machine in the user-specified machine enumeration type. Each

load module executes as a single UNIX process in which concurrency is simulated by the

nugget. Load modules exchange messages using UNIX sockets; an execution manager sets

up the sockets and passes them to the load modules so they can communicate. The current

implementation also includes facilities to invoke C functions as operations, thereby gaining

access to underlying UNIX system calls.


Work is currently underway to extend the implementation. We are modifying

the execution manager to allow multiple load modules produced for a single program to be

located on different UNIX machines if so desired. We are also modifying the RTS to run on

top of the V kernel [Cher84] on the Suns. Eventually, the implementation will provide

load modules that execute stand-alone on a bare machine or on a network of bare

machines. We expect that the UNIX versions of the implementation will be completed by

Fall 1986, at which time they will be made available to interested groups.

The design and coding of the initial implementation took six people working

half-time about a year, i.e., about 3 man-years total. It was used successfully in a graduate

class in concurrent programming in which students designed and coded moderate-sized

(500-1000 lines) distributed programs such as card games, automatic teller machines, and

simple airline reservation systems; some of these projects implemented replicated data­

bases. Work on the implementation has continued and the current implementation was

used in another graduate class in which students designed and coded a command interpreter and a file system for a distributed operating system. Each of these programs was

5000-6000 lines of SR source code.

The compiler consists of approximately 20,000 lines of source code. (This does

not include the source code for ACK.) The linker consists of approximately 1700 lines of

source code. The RTS consists of about 3300 lines of source code, which includes the I/O

interface to UNIX. The nugget consists of about 600 lines of source code.

The SR compiler processes 1200 lines per minute on a Vax 8600. To give some

comparison, the C compiler processes about 6400 lines per minute. Thus, the SR compiler

is about 5.3 times as slow, which is not surprising since SR is a higher-level language that,

among other things, is strongly typed and has a more complex inter-module interface.

Moreover, most of the SR compiler's time is spent in the ACK machine-code generator.


When machine-code generation is turned off, the compiler processes 9400 lines per minute,

which means that about 85% of the compiler's time is spent in ACK. Using a faster

machine-code generator would obviously greatly increase the compilation rate.

At run-time, the RTS (including the nugget) requires about 16K for text, 3K for

static data, and 48K for table space. The large amount of table space results from using

static allocation; we are currently changing to a dynamic allocation scheme for resource

instances, operations, and processes. The RTS also contains an additional 9K of text and

2K of data for the I/O routines used in the UNIX implementation.

The cost of an invocation depends on whether it is a call or a send and whether

it is serviced by a proc or an in. Below, we show five SR programs that use different kinds

of invocation and service and a table showing times for each program. For comparison

purposes, we also include a C program. Each program generates and services 10,000 invocations.

# A: call to proc; procedure call.
resource A()
  op f()
  proc f() end
  process aa
    fa i := 1 to 10000 ->
      call f()
    af
  end
end

The above call statement is implemented as a procedure call.


# A1: send to proc; new process.
resource A1()
  op f()
  proc f() end
  process aa
    fa i := 1 to 10000 ->
      send f()
    af
  end
end


Here, a new process is created for each invocation of f. This would also be the case for A if

its proc f contained a reply statement.

# B: call to receive; 2 processes.
resource B()
  op g()
  process bb
    fa i := 1 to 10000 ->
      call g()
    af
  end
  process gg
    do true ->
      receive g()
    od
  end
end

Two processes are required above because a process cannot service its own call invocation

since it is delayed as part of call.


# B1: send to receive; 1 process.
resource B1()
  op g()
  process bb
    fa i := 1 to 10000 ->
      send g()
      receive g()
    af
  end
end


In the above, process bb sends invocations to itself. Operation g has the default operation

restriction {call,send}, which prevents g from being implemented as a semaphore.6

# B2: send to receive (semaphore); 1 process.
resource B2()
  op g() {send}   # semaphore operation.
  process bb
    fa i := 1 to 10000 ->
      send g()
      receive g()
    af
  end
end

Here, g is declared with the send restriction so it can be implemented as a semaphore.

6 Even though g has the {call,send} restrictor, it is never invoked using call. In determining whether an operation can be implemented using a semaphore, the compiler looks only at the operation restriction, not at how the operation is actually used.


/* C: C program equivalent to A. */
main() {
    int i;
    for (i = 1; i <= 10000; i++) {
        f();
    }
}

f() { }


The following table summarizes the times the above programs needed to generate

and service invocations. These times, expressed in milliseconds per invocation, were

obtained on a Vax 8600 and are averages of ten executions of each program.

Program   Description                               msec/inv   Relative to C
C         C equivalent of A                          .0050          1
A         call to proc; procedure call               .15           30
A1        send to proc; new process                  .45           90
B         call to receive; 2 processes               .36           72
B1        send to receive; 1 process                 .20           40
B2        send to receive (semaphore); 1 process     .024           5

The "Relative to C" column shows the ratio of the time per invocation for the given pro-

gram to that for the C program C; e.g., A is 30 times slower than C. The times above

include the cost of initializing the RTS, creating an instance of the resource, and executing

the loop control. This cost is so small (.002 msec) compared to the execution time of any of

the above programs that it does not significantly affect the averages.
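The "Relative to C" column is simply each program's time per invocation divided by the C figure of .0050 msec, rounded to the nearest integer (e.g., .15/.0050 = 30, and .024/.0050 = 4.8, reported as 5). A quick arithmetic check, with an invented helper name:

```c
/* Check of the "Relative to C" column: ratio of a program's
 * msec/inv to C's .0050 msec/inv, rounded to nearest integer. */
int relative_to_c(double msec_per_inv)
{
    return (int)(msec_per_inv / 0.0050 + 0.5);
}
```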

Each invocation of a non-semaphore operation in the above SR programs requires

several procedure calls by the GC. As described in Sec. 5.3.1, the GC first calls the RTS

allocate primitive to obtain space for the invocation block. It then fills in the block with

header information and parameter values and calls the RTS invoke primitive. Upon

return from a call invocation, the GC calls the RTS deallocate primitive to free the invocation block. (The RTS frees send invocation blocks.) Also, at the end of each proc or in

statement the GC informs the RTS of its completion by calling, for example, input_done

(see Sec. 5.3.2). Each RTS primitive typically calls several nugget primitives or other RTS

routines, e.g., to create a process, obtain mutual exclusion to access an RTS table, or

record that a process belongs to a particular instance of a resource.

It is primarily the procedure calls described above that make A so much slower

than C. As of this writing (May 1986), the compiler does not yet optimize calls within the

same resource (as we described in Sec. 5.3.3). Thus, the GC calls the RTS invoke primi­

tive, which determines that the invoked operation is on the same machine as its invoker

and therefore lets the calling process execute the invoked proc's code. Compared to the

possible compiler optimization, this involves one extra procedure call, some RTS checking

and bookkeeping, and allocation and deallocation of the invocation block. Thus, the

actual cost of a local invocation should be considerably less than the cost shown for A.

Note that the cost of invoking a proc in a different resource on the same machine should

be very close to the cost shown for A.

When a process is created and destroyed for each invocation (as in A1 or if the

proc in A contained a reply), the cost is about 3 times that for procedure call (as in A);

this increase is a result of the extra work the RTS must do to maintain tables, allocate

memory, and schedule the new process. Given that many processes will have long life­

times, this overhead might not be unreasonable.

B illustrates calling an operation serviced by a receive statement. The main cost

in B is context switching because each of the two processes constantly invokes or services

an operation and then blocks. Compare this with Bl, which has only one process that

sends to itself. There the cost of invocation is slightly more than that for procedure call

(A). More work is done in Bl than in A because the RTS inserts each invocation into a


queue of pending invocations and the GC executes additional code to obtain each invoca­

tion as described earlier in Secs. 5.3.1 and 5.3.2.

Each invocation or service of a semaphore operation requires a few instructions in

the GC to call the RTS with a specific semaphore. Also, the RTS uses a procedure call to

reach the nugget primitives that do the actual P or V; these primitives are implemented

within the nugget to ensure mutual exclusion. These nugget primitives implement P and

V in the traditional way, which requires just a few instructions unless a process needs to be

blocked or awakened. B2 shows the cost of a semaphore operation for which there is no

blocking or waking up. Note that B2 is 5 times slower than C. This is quite reasonable

given that B2 executes one P and one V on each iteration, and each P or V requires several

procedure calls and several instructions; by contrast, C executes only a single procedure

call on each iteration. Also note that B2 is about 8 times faster than Bl, which is the same

program but is implemented using the standard primitives instead of semaphores.

The above measurements, although somewhat crude, give some idea of the cost of

different kinds of invocation and service. The comparisons of the SR programs with the C

program, however, are somewhat meaningless because SR is quite a different language. For

example, SR has mechanisms, such as send and in, that have no counterparts in C; SR

also has dynamic resources and processes, which require a more complicated RTS. On the

other hand, it is desirable that SR programs such as A that use only C-like mechanisms

should not be too much slower than their C counterparts.

The reported measurements showed the costs of intra-resource invocations.

Although not shown, most of these measurements also apply to equivalent inter-resource

invocations on the same machine. The exception is B2, which uses semaphores; inter­

resource operations are not implemented using semaphores because it is not known until

run-time whether both resources will be located on the same machine. Also, we have not


included measurements for inter-machine invocations because such measurements would

include the cost of the underlying inter-process communications mechanism provided by

UNIX (i.e., sockets), over which we have no control. However, the basic cost of invocations

(i.e., ignoring the cost of using sockets) would be the same.

The implementation of SR could be improved to speed up invocations. For

instance, each process could have a pre-allocated invocation block that it would use for

most of its call invocations; the GC would no longer need to call the RTS to allocate and

later to free a block for each call invocation. This is the approach used in SRo and is

described in detail in [Olss84b]. Another way to speed up invocations (and in general all

programs) would be to use a better code generator, or at least perform some peephole

optimizations on the GC produced by ACK. ACK does not track register usage in the GC

and therefore generates many redundant loads. Note that some of the procedure calls

within the RTS and nugget could be eliminated; however, doing so would make their inter­

nal organization less well-structured.
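The pre-allocated invocation block optimization mentioned above can be sketched as follows. This is a hypothetical model of the SRo approach, not the code from [Olss84b]; the type and field names are invented, and the size constant is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

#define PREALLOC_ARGS 64      /* assumed size; enough for most calls */

/* One invocation block pre-allocated per process and reused for most
 * call invocations; the RTS is called only on the slow path. */
typedef struct {
    int  in_use;
    char args[PREALLOC_ARGS];
} InvBlock;

typedef struct {
    InvBlock fixed;           /* allocated once, when the process is created */
} Process;

static void *get_inv_block(Process *p, size_t argbytes)
{
    if (!p->fixed.in_use && argbytes <= PREALLOC_ARGS) {
        p->fixed.in_use = 1;  /* fast path: no RTS allocation call */
        return p->fixed.args;
    }
    return malloc(argbytes);  /* slow path: allocate as before */
}

static void free_inv_block(Process *p, void *blk)
{
    if (blk == p->fixed.args)
        p->fixed.in_use = 0;  /* just mark the fixed block free */
    else
        free(blk);
}
```

The fast path replaces a pair of RTS calls per call invocation with a flag test and store, which is where the speedup would come from.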


CHAPTER 6

Discussion

SR lies in between recent languages in which a distributed program appears to

execute on one virtual machine (e.g., Linda [Gele85] and NIL [Parr83, Stro83]) and more

conventional languages in which a distributed system is built from distinct programs, one

per machine. In SR, the invoker of an operation need not be concerned with where that

operation is serviced, but mechanisms are provided to enable the programmer to exert

some control over a program's execution environment. For example, the programmer can

control where a resource is created and can determine whether a resource or machine has

failed. Thus, SR is similar in level to the V kernel [Cher84]. However, SR and the V kernel

take quite different approaches. SR is a strongly typed language with an integrated set of

mechanisms for sequential and distributed programming; the V kernel is a type-less collec­

tion of message-passing primitives that are invoked from a sequential language such as C.

The V kernel has been designed with efficiency being the most important criterion; SR has

been designed to balance expressiveness, simplicity, and efficiency.

Chapter 2 discussed the general kind of mechanisms a distributed programming

language should provide. Chapter 3 presented the specific mechanisms in SR, and Chapter

5 described their implementation. The remainder of this chapter discusses additional

aspects of SR's mechanisms and how they relate to other approaches to distributed pro­

gramming. In addition, we examine some of the language design decisions that influenced

the development of SR.


6.1. Integration of Language Constructs

There is a large similarity between the sequential and concurrent mechanisms in

SR. For example, the if, do, and in statements have similar appearances and the same

underlying non-deterministic semantics. The fa, co, and in statements all use quantifiers

to specify repetition. Finally, the exit and next statements are interpreted uniformly

within iterative statements (do and fa) and the co statement. CSP [Hoar78] has a similar

integration of mechanisms. By way of contrast, Ada [Ada83] provides distinct mechanisms

for sequential and concurrent programming. In Ada, packages are used for sequential

modules, but tasks are used for concurrent modules. Also, if-then-else (used for selecting

alternative conditions) is deterministic, but select (used for selecting alternative

entries) is non-deterministic; thus, what should be similar constructs are quite dissimilar.

As a specific example, the Queue and BoundedBuffer resources presented in

Chapter 3 could be programmed in Ada as follows.1

1 In presenting fragments of Ada programs, we follow the common convention of using all upper-case for identifiers and lower-case boldface for reserved words.


generic
    SIZE: INTEGER;
package QUEUE is
    procedure INSERT(ITEM: in INTEGER);
    function REMOVE return INTEGER;
end QUEUE;

package body QUEUE is
    STORE: array(0..SIZE-1) of INTEGER;
    FRONT, REAR, COUNT: INTEGER := 0;

    procedure INSERT(ITEM: in INTEGER) is
    begin
        if COUNT < SIZE then
            STORE(REAR) := ITEM;
            REAR := (REAR+1) mod SIZE;
            COUNT := COUNT + 1;
        else
            -- take actions appropriate for overflow
        end if;
    end INSERT;

    function REMOVE return INTEGER is
        ITEM: INTEGER;
    begin
        if COUNT > 0 then
            ITEM := STORE(FRONT);
            FRONT := (FRONT+1) mod SIZE;
            COUNT := COUNT - 1;
            return ITEM;
        else
            -- take actions appropriate for underflow
        end if;
    end REMOVE;
end QUEUE;


QUEUE is defined as a generic package so that it can be parameterized by its size when

instantiated.


task type BOUNDED_BUFFER is
    entry INSERT(ITEM: in INTEGER);
    entry REMOVE(ITEM: out INTEGER);
end BOUNDED_BUFFER;

task body BOUNDED_BUFFER is
    SIZE: constant INTEGER := 50;
    STORE: array(0..SIZE-1) of INTEGER;
    FRONT, REAR, COUNT: INTEGER := 0;
begin
    loop
        select
            when COUNT < SIZE =>
                accept INSERT(ITEM: in INTEGER) do
                    STORE(REAR) := ITEM;
                end INSERT;
                REAR := (REAR+1) mod SIZE;
                COUNT := COUNT + 1;
        or
            when COUNT > 0 =>
                accept REMOVE(ITEM: out INTEGER) do
                    ITEM := STORE(FRONT);
                end REMOVE;
                FRONT := (FRONT+1) mod SIZE;
                COUNT := COUNT - 1;
        end select;
    end loop;
end BOUNDED_BUFFER;


BOUNDED_BUFFER is defined as a task type so that multiple instances of it can be

created. (An object declared simply as a task represents a single task; multiple instances of

it cannot be created.) SIZE is declared as a constant in BOUNDED_BUFFER's body

because Ada does not allow generic tasks. Thus, the size of the buffer is fixed, whereas it is

determined by a parameter to the QUEUE package and to the Queue and BoundedBuffer

resources. The differences between the Ada QUEUE and BOUNDED_BUFFER are

marked; in particular, notice the differences in their specification parts and in their imple-

mentation parts. By contrast, the differences between the SR Queue and BoundedBuffer


are minimal; their specification parts are in fact identical. A similar difference between the

sequential and concurrent components of a language results whenever an existing sequen-

tial language is extended with concurrency constructs (e.g., StarMod [Cook80] or Con­

current C [Geha85]).

The SR mechanisms for communication and synchronization are also well

integrated. Operations support all of local and remote procedure call, rendezvous, dynamic

process creation, asynchronous message passing, multicast, and semaphores. In addition,

there is just one way (capabilities) in which an operation is named in an invocation.2

Moreover, capabilities are used for both entire resources and individual operations

(another example of similar mechanisms for similar concepts) and are first-class objects in

the language. Such integration and flexibility is not achieved in languages like Ada and

EPL [Blac84] where many mechanisms, each having special rules and restrictions, are used

to achieve the same effects that are realizable with just a few SR mechanisms. As shown in

the above example, Ada uses procedures and functions in packages, but entries in tasks;

also, Ada does not permit entries to have a return value. In SR, the notion of operation

subsumes procedures, functions, and entries, and any operation can have a return value.

Ada's lack of integration makes how an operation is implemented more visible. Conse-

quently, modifying programs is more difficult. In particular, if a program using QUEUE

were to be changed to use BOUNDED_BUFFER, all invocations of REMOVE would need

to be changed to reflect the change from REMOVE being a parameter-less function with a

return value in QUEUE to being an entry with a result parameter in

BOUNDED_BUFFER.

2 An operation capability has one of three forms: a field in a resource capability, a capability for an individual operation, or the name of an operation in the current scope. The name of an operation is treated as a capability constant for the named operation.


Similar concepts are represented in similar ways in SR. The constants null and

noop, for example, have uniform interpretations for resource capabilities, operation capa-

bilities, and even the implementation-specific I/O file type.3 As another example, return

and reply statements are allowed anywhere and have uniform interpretations in different

contexts. Their meanings within proc's and in statements were described in Sec. 3.2.4.

Their meanings within initialization and finalization components are similar to those for

proe's. Executing a return statement in such a component has the same effect as it would

in a proc: the component terminates early. Executing a reply statement in such a com-

ponent causes the creator or destroyer of the resource to continue; the process executing

the component also continues.

By integrating the various language mechanisms, SR is a relatively simple

language. Simplicity plus the almost total lack of restrictions make the language easy to

learn and use. In the Fall of 1985, SR was used for term projects in a course on principles

of concurrent programming. Students in that course were second-year graduate students

with a C and Pascal programming background. They were able to learn SR and design

and code their projects in about 2 weeks. Although the projects were of modest size (500-

1000 lines), they used most of the language features that would be used in "real" con-

current programs. These features (mutually dependent resources, resource

creation/destruction, capabilities, invocations, operations) caused the students few con-

ceptual difficulties. As another measure of SR's simplicity, the language was implemented

by only 6 people working part-time for a year. Finally, the complete language description

[Andr85] is only 36 pages long, including numerous examples. In all these aspects, SR is

3 Unfortunately, in the UNIX implementation, the SR null file might not be what an experienced UNIX user would expect. Writing to null is an error, whereas writing to the noop file is semantically equivalent to writing to the UNIX file /dev/null.


much simpler than a language like Ada.

6.2. Global Components and Resources

In this section, we discuss the structure and semantics of global components and

resources. We also discuss SR's import mechanism.

6.2.1. Global Components

Recall that global components declare constants and types that are needed by one

or more resources. Global components are separately compiled and imported into resources

rather than being merely textually included. We chose this approach because SR is a

strongly-typed language. In particular, a type declaration in included text could depend

on a constant defined earlier in the includer. Thus, two objects declared in different files

using the same global type could have different types. This would defeat the purpose of

global declarations and would require type checking to be performed at link-time instead of

compile time. Furthermore, it is easier to verify that the importers of a global component

have all been compiled with the same version of that component than it is to verify

that they have included the same version of an include file. Besides, global components

add very little complexity to the language: the mechanisms needed for global components

are already present for resources. In fact, a global component can be viewed as a degen­

erate resource specification, one that contains only declarations of constants and types.

6.2.2. The Resource as an Abstraction

The structure of the resource construct is similar to that of modular constructs

in procedure-based languages, such as Euclid [Lamp77] and Modula-2 [Wirt82], and other

distributed programming languages, such as Distributed Processes [Brin78], StarMod,

Argus [Lisk83a], and EPL [Blac84]. Since a resource has separate interface and implemen­

tation parts, the way in which it implements its services is hidden from the resource's


users. This structure also permits flexibility in how a resource provides a service and

allows processes in the same resource to share variables efficiently.

Resources provide the only data-abstraction mechanism in SR. They are used to

program sequential "abstract data types" such as the Queue resource as well as concurrent

data types such as BoundedBuffer. Having just one abstraction mechanism makes the

language simpler than it would be if two separate mechanisms were provided, one for

sequential types and one for concurrent types. There is one disadvantage though: the

implementation of sequential types is not as efficient as it might be since a resource that

implements a type might be located on a different machine than its clients. We are able to

perform some optimizations when a resource and its clients are located on the same

machine, but not as many as would be possible if "sequential" resources were distinguished

as such in the language and were forced to be located on the same machine as their clients.

A second potential shortcoming of resources is that they are not polymorphic: they may

not have types as parameters. We have not, however, found many situations in which a

generic resource facility would justify its large implementation cost. Furthermore, we have

found the resource parameterization provided in SR to be sufficient, as illustrated by the

Queue and BoundedBuffer resources in Sec. 6.1.

We do not allow resources to be nested, primarily because nesting is not needed.

If one resource needs the services provided by another, it can either create an instance of

the needed resource or be passed a capability to it. Precluding nesting results in a much

simpler language and implementation. One disadvantage, though, is that different

resources cannot share variables (the resources might execute on different machines). How­

ever, if information sharing is necessary, resources can exchange messages or be combined

into a single resource. Another disadvantage of not allowing nested resources is that

resources that are used only by a single resource are visible to all resources. Although it


might be desirable for such resources to be hidden (e.g., to avoid potential resource name

conflicts), this has caused few problems in practice. We also do not allow processes to be

nested, for essentially the same reasons. In contrast, Ada allows arbitrary nesting of tasks,

packages, and subprograms. This makes Ada more complex than SR, its implementation

much more complicated and costly, and many programs more difficult to understand

[Clar80].

6.2.3. Resource Initialization and Finalization

A resource can contain initialization and finalization code. Initialization code

gives the programmer a way to control the order in which initialization is done. This is

important for two reasons. First, resource variables should be initialized before processes

that access the variables are created in the resource. This can be accomplished by simply

initializing the resource variables before creating any processes. (This is also the reason

that background processes are created automatically at the end of initialization code.)

Second, resource variables should be initialized before processes outside the resource have a

capability for its operations; otherwise, such an outside process could invoke a proc that

accesses resource variables before they are initialized. This can be accomplished by not

executing a reply statement in the initialization code until resource variables have been

initialized. Initialization and finalization code are executed as processes so they can use

any of the language mechanisms (another instance of our aversion to imposing restric­

tions). For example, initialization code can service operations, create other resources, or do

whatever else might be required.

Finalization code provides a means by which a resource can "cleanup" before it

disappears. For example, if a resource has obtained a lock for a file, it can record that it

owns the lock; its finalization code can then release that lock if the resource is ever des­

troyed. Finalization code is executed as a process, again so it can use any of the language


mechanisms. Our approach is similar to that in NIL. A different approach is used in Ada.

When an Ada task is aborted, it does not get control; it is just destroyed.4 Thus, in the

above example, there is no way the task itself can release the lock; such a release can only

be done by another task that is monitoring the task that was aborted.

Note that a potential race condition exists in the above example: a process could

be trying to obtain the lock when the finalization code tests if the lock should be released.

This can be avoided by enclosing such code in critical sections, using a semaphore opera-

tion to implement mutual exclusion.
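The race avoidance described above can be sketched as follows. This is a hypothetical illustration in C, with invented names (`holds_lock`, `acquire_lock`, `finalize`); the binary semaphore is modeled as a simple flag so the sketch stays single-threaded, whereas in SR a real semaphore operation would guard the critical sections.

```c
#include <assert.h>

static int mutex = 1;         /* binary semaphore, initially open */
static int holds_lock = 0;    /* does this resource hold the file lock? */
static int lock_released = 0; /* for illustration: was a release issued? */

static void P(void) { assert(mutex == 1); mutex = 0; }
static void V(void) { mutex = 1; }

/* A process obtaining the lock records that fact in a critical section. */
static void acquire_lock(void)
{
    P();
    /* ... obtain the file lock from the file server ... */
    holds_lock = 1;
    V();
}

/* Finalization code tests and releases the lock in the same critical
 * section, so it cannot interleave with a process acquiring the lock. */
static void finalize(void)
{
    P();
    if (holds_lock) {
        /* ... tell the file server to release the lock ... */
        lock_released = 1;
        holds_lock = 0;
    }
    V();
}
```

Because both the test and the release happen while the semaphore is held, the finalization code can never observe a half-completed acquisition.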

The destroy statement has the following meaning. First, the finalization code

begins execution. Then, when and if it terminates, all other activity in the resource ceases.

Note that this mirrors the create statement. Also, it is not too difficult to implement

because there is just one point at which all activity in a resource must be halted.

We originally gave the destroy statement a different meaning. The idea was

that when a resource was being destroyed, it should be shut off from external activity.

First, any active proc's in the resource were halted, and further invocations of the

resource's operations were prevented. Then, the finalization code in the resource was exe-

cuted. However, we found this semantics harder to implement because of its two parts.

Also, it is not as useful as the current semantics. In particular, it is sometimes useful for

the finalization code to interact with the active processes in resources, for which operations

are essential, or to receive an invocation from an outside resource. For example, the finali-

zation code might tell a server process it should terminate by invoking an operation in the

server; the server could then stop its interaction with its client and release any objects that

4 Task destruction may not be immediate; e.g., a task is allowed to complete servicing a rendezvous before the task is destroyed.


it holds.

We also considered representing the initialization and finalization components as

operations serviced by proc's (such an approach is taken in [Olss84a]). This approach

separates the creation of the resource from its initialization; thus, it provides more flexibil-

ity than the current approach. The two operations, initial and final, would be implicitly

declared in the resource specification. To create a resource, the create statement would be

used as it is presently. However, the resource's creator would also need to call the

resource's initial operation before invoking the resource's other operations.5 To destroy a

resource, the final operation would be invoked; the destroy statement would no longer be

needed. We decided against this approach for three reasons. First, there is very little

gained by separating a resource's creation and initialization; if present, resource initializa-

tion code always needs to be executed. Second, the creator would always need to invoke

the initial code. This would make programs a bit messier, introduce a potential source of

error (i.e., if the user forgets to invoke initial), and require an extra invocation compared to

the present creation scheme. Finally, too much extra meaning is attached to these two

proc's; they would have to be treated specially. In particular, the initial proc also automatically

creates the background processes; the final proc also kills other processes.

6.2.4. Resource Parameters

Recall that resource parameters are declared after the other objects in the inter­

face part. This prevents exported types from depending on parameters. Otherwise, sizes

in exported types might not be known at compile time and different instances of the same

5 The creator could also send to the initial operation. However, there would be no guarantee that resource variables were initialized before the creator invokes the resource's other operations.


resource could have different types, which would necessitate run-time type checking. An

alternative is to place the parameters at the beginning of the specification but disallow

their use within the specification. This, however, would violate the simple usage rule that

applies uniformly throughout SR: an object can be used anywhere after its declaration.

Requiring that the resource parameter list be placed at the end of the

specification leads to three possible forms for specifications. The first possibility is to place

the resource name after its specification and before its parameters:

spec

resource identifier(formal-parameters) separate

However, this would prevent a resource's name from being used within its specification.

For example, an operation parameter that is a capability for the resource being defined

could not be declared. Also, placing the resource's name only at the end of the spec would

make it difficult to see what resource the spec is for.

The two remaining possible forms for the resource specification both place the

resource name at the beginning of the spec. The first of these separates the name from its

parameter list:

spec identifier

resource (formal-parameters) separate

The second repeats the name at the start of the parameter list:

spec identifier

resource identifier(formal-parameters) separate

We chose the latter because it fits with the form of simple resources that have no imports

and no declarations in their specifications, i.e.,

Page 168: ISSUES IN DISTRIBUTED PROGRAMMING LANGUAGES: THE … · 2020. 4. 2. · 3 SR Language Overview ... tributed programming language. It considers how programs are structured, how processes

resource identifier(formal-parameters)
    # body of the resource
end

6.2.5. Import Mechanism


As described in Sec. 3.1, resource specifications can be mutually dependent.

Assume we have two resource specifications, A and B, with A compiled before B. Then, A

can import B and use B itself, but not objects declared within B; on the other hand, B can

import A and can use A itself and objects declared within A. The reason for this asym­

metry in what is available to each of A and B is that it simplifies the implementation

without sacrificing expressiveness. This asymmetry allows a specification to be compiled in

its entirety because all objects from imported components will have already been compiled.

To illustrate this, suppose on the contrary that A were allowed to use objects declared in

B. Then, since these objects have not yet been compiled when A is compiled, the compila­

tion of A cannot be completed until B is compiled. Thus, when B is compiled, the compila-

tion of A would need to be finished. In addition to being difficult to implement (especially

if there are more than two specifications), such circularities in specifications can lead to cir­

cularities in type declarations. For example, a type in A could be defined using a type in B,

which is in turn defined using the original type in A. Such erroneous circularities in types

could be detected, although to do so would be costly. Finally, there is no need for such cir-

cularities in specifications. An object can always be declared in a global component, and

that global component can be imported.

In SR, entire resource specifications or global components are imported, which

makes all their objects visible. This is a change from SRo's import mechanism. There, an

exported object (i.e., constant, operation, or type) is by default visible to all importing

resources, but visibility can be limited by specifying which objects are visible to which


resources. Also, the SRo import statement names which objects are to be imported from a

particular resource. We found the power provided by the SRo import to be more

than we needed. Although it does provide additional security by controlling what

resources can use certain objects, we have not found this kind of defensive programming

necessary, especially in a systems programming environment. In addition, it is cumber­

some to change a resource's export list whenever a new resource needs to use one of its

objects. SRo's import mechanism has complicated rules and implementation; SR has con­

siderably simpler rules and implementation.

Recall that SR allows type declarations to include a type restrictor (public or

private) that indicates how objects of the type can be used. This allows a resource to

export a type, while ensuring that only instances of the resource can alter objects of that

type. (Types in SRo can be similarly restricted.)

6.2.6. Resource Code Loading

The input to the SR linker is a list of resources to be included on each machine in

the user-specified machine enumeration type. Thus, the machines on which a particular

resource can execute are fixed at link-time. This has the advantage that the code necessary

to execute a program is already loaded on each machine in the network when program exe­

cution begins. Alternatively, this decision could be deferred until run-time, in which case

the input to the linker would simply be a single list of the resources that comprise the pro­

gram.

This latter approach can be implemented in two ways. The first way is to load

the code for all resources onto each machine before the program begins to execute. How­

ever, this is very wasteful of memory, especially for large programs such as operating sys­

tems. The second way is to fetch code at run-time as needed, i.e., when the first instance of

a particular resource is created on a machine, that machine's run-time support fetches the


code for the resource. However, generally some processors in the network will be diskless.

Thus, such processors would need to obtain the code from a processor that has a disk. To

ensure that code is available despite processor failures, it would need to be stored on more

than one disk, if not on all disks. This implementation would complicate the run-time sup­

port, and cause extra network traffic during program execution. We have not yet found

situations where this extra flexibility justifies its extra implementation cost.

6.3. Operations

Operations in SR provide much flexibility in how processes communicate and syn-

chronize. The following table illustrates how SR compares to some other distributed pro­

gramming languages. The table's heading lists four important mechanisms in distributed

programming. These mechanisms were described in Sec 2.2.1. The table entry for a partic­

ular language indicates whether or not each of the mechanisms is provided directly in the

language. We are not interested in whether a mechanism can be simulated by other

mechanisms because simulations are generally inelegant and inefficient.

                  RPC    rendezvous    asynchronous    dynamic
                                       message         processes
                                       passing

    Ada            N         Y              N              Y
    Argus          Y         N              N              Y
    Concurrent C   N         Y              N              Y
    DP             Y         N              N              N
    EPL            N         Y              Y              Y
    NIL            N         Y              Y              Y
    StarMod        Y         Y              Y              Y
    SRo            N         Y              Y              N
    SR             Y         Y              Y              Y

The above table shows that only StarMod and SR provide all the listed communication

mechanisms. However, SR provides these mechanisms in a way that is better integrated


than StarMod. For example, semaphores in SR are just a special kind of operation,

whereas they are a distinct mechanism in StarMod.

The rest of this section gives the rationale for the particular syntax and semantics

we have chosen for operations. We discuss issues related to operation declaration, invoca­

tion, and implementation.

6.3.1. Operation Declarations

Operations are declared in op declarations; their implementations are later

defined by proc's or in statements. This allows resource specifications to be written and

used without concern as to how an operation is implemented. The op declaration specifies

the names and types of the operation's formal parameters and return value. The heading

on the proc or in statement specifies the names of the formals and return value to be used

within the proc or in block. Thus, both the operation declaration and heading specify

names for the formal parameters and return value. However, the names given in the opera-

tion heading can differ from those in the operation declaration. This is allowed to facilitate

programming nested operation implementations, which are sometimes necessary. For

example, suppose a resource services the two operations

op f( x, y: int )    op g( x: int )

Then they might be implemented as:

proc f( x, y )
  ...
  in g( z ) -> ... ni
  ...
end

Inside the body of the in statement, the formals of both f and g can be accessed: x and y are

f's parameters, z is g's. Being able to give different names is also necessary to allow in


statements that implement the same operation to be nested and still provide access to the

parameters of both invocations. For example, in the innermost block of

op swap( var x : int )

in swap( x1 ) ->
    in swap( x2 ) ->
        x1, x2 := x2, x1
    ni
ni

parameters to both invocations of swap are available because they are given different local

names. Note that the same effect could be achieved by copying parameters to local variables

that would be accessible in the nested block; however, this leads to cumbersome programs.
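The nested rendezvous above can be imitated outside SR. The following Python fragment is a loose analogue (the names and the queue-based protocol are inventions for illustration, not SR's implementation): each invocation carries a private reply queue, and the server holds two swap invocations open at once, which is exactly why the formals of the two invocations need distinct local names.

```python
import queue
import threading

invocations = queue.Queue()        # pending invocations of 'swap'

def call_swap(value):
    """Synchronous invocation: send the argument plus a private reply queue,
    then block until the server replies (the effect of SR's 'call')."""
    reply = queue.Queue()
    invocations.put((value, reply))
    return reply.get()

def swap_server():
    # The outer 'in swap(x1)' receives one invocation; the nested
    # 'in swap(x2)' receives a second while the first is still open,
    # so both parameter sets are simultaneously in scope.
    x1, reply1 = invocations.get()
    x2, reply2 = invocations.get()
    reply1.put(x2)                 # x1 := x2
    reply2.put(x1)                 # x2 := x1

threading.Thread(target=swap_server, daemon=True).start()
results = {}
t1 = threading.Thread(target=lambda: results.__setitem__('a', call_swap(1)))
t2 = threading.Thread(target=lambda: results.__setitem__('b', call_swap(2)))
t1.start(); t2.start(); t1.join(); t2.join()
# each caller receives the other's value: results == {'a': 2, 'b': 1}
```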

We considered two ways to simplify op declarations and headings. First, we con­

sidered an abbreviation in the heading of proc and in statements. If the formal names

were omitted in an operation heading, the names from the operation declaration would be

used within the proc or in statement. For example,

op f( a, b: int )

proc f
  # body of f
end

would mean that a and b would be used for the formals within the proc. We chose not to

allow this abbreviation because the names of a proc's formals would not be visible by examining only the proc code; the operation's declaration would need to be examined.6

6 Ada requires that the names and types in procedure, function, and entry declarations in the specification part be repeated in the operation heading in the implementation part. Having to repeat the types makes programs longer without being more readable, and requires the compiler to ensure that the types are the same.


The second way we considered to simplify op declarations and headings was to

allow parameter names in operation declarations to be omitted, and instead require only the

parameter's signature. The idea behind this simplification is that since formal names are

specified in the operation heading, parameter names in operation declarations serve no pur-

pose other than for documentation. For example, instead of

op g( c: char; d: int )

we could write

op g( char; int )

However, there is a problem in doing so for array parameters. Recall that the signature of

an array gives its size and type, but does not give any hint as to the type of its subscripts

and its actual bounds. For example, the signatures of

e1['a':'e']: int    e2[3:7]: int

are each

[5] int

Such subscript information could be encoded into the signature; e.g., the signatures of e1

and e2 could be

['a':'e'] int    [3:7] int

respectively. However, we did not want to change signature rules just for this. In addi-

tion, allowing signatures in operation declarations could make some operation declarations

hard to understand and would complicate parsing. Suppose, for example, that type t has

already been declared. Then,


op f1( t )    op f2( t: int )

declares f1 to have an unnamed parameter of type t and f2 to have an integer parameter

named t. Further,

op g1( t [1:4] int )    op g2( t [1:4]: int )

declares g1 to have two unnamed parameters, the first has type t and the second is an

array of integers; and g2 to have a single parameter named t, which is an array of integers.7

Such declarations are clearly hard to understand; in addition, they require arbitrary look­

ahead in parsing (because subscript specifications can contain arbitrary expressions).

A semaphore operation is declared as an op just like any other operation. The

compiler determines whether an operation can be implemented using semaphores. Two

alternatives we considered were to have a special semaphore operation restriction, such as

"{sem}", or a distinguished operation declaration, such as "semaphore identifier". We

chose our approach mainly to avoid introducing the extra concept of semaphore to the

language when a semaphore is just a special kind of operation. This has the additional

benefit that programs are easier to modify. For example, if a parameter is added to a

semaphore operation, it can no longer be implemented using semaphores; if there were a

separate kind of semaphore declaration, it would need to be changed to an operation

declaration. Of course, if special semaphore declarations were desired, a preprocessor could

be used to convert semaphore declarations to operation declarations. By contrast, Star-

Mod provides semaphores as a basic data type, and Ada provides semaphores by means of

a predefined package.

7 Recall that semicolons are optional. The declaration of g1 is therefore legal.
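The semaphore-as-operation view described above can be made concrete in a few lines of Python. This is a hypothetical sketch, not SR's run-time code: a parameterless operation is a queue of empty invocations, send of one is V, and receive of one is P.

```python
import queue
import threading

class OperationSemaphore:
    """A semaphore modeled as a parameterless operation: 'send' queues an
    empty invocation (V); 'receive' blocks for one (P)."""
    def __init__(self, initial=0):
        self._invocations = queue.Queue()
        for _ in range(initial):
            self.send()

    def send(self):                 # V
        self._invocations.put(None)

    def receive(self):              # P
        self._invocations.get()

mutex = OperationSemaphore(initial=1)
counter = [0]

def worker():
    for _ in range(1000):
        mutex.receive()             # enter the critical section
        counter[0] += 1
        mutex.send()                # leave it

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter[0])                   # 4000
```

Adding a parameter to the operation would simply make the queue carry data, at which point this is ordinary message passing rather than a semaphore, mirroring the compiler's decision described above.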


6.3.2. Operation Invocation

Operations in SR can be invoked either synchronously (call) or asynchronously

(send). Providing both these forms of invocations provides the programmer with flexibil­

ity without unduly complicating the implementation. Examples have illustrated that both

are useful.

The co statement allows the invoker to invoke several operations at the same

time and to continue when an appropriate combination of replies has been received. The

exact combination of replies can be determined dynamically. For example, a majority vot-

ing scheme might be coded as follows.

ayes, nays := 0, 0
co (i := 1 to N) ballot := constituent[i].vote() ->
    if ballot ->
        ayes++; if ayes > N/2 -> exit; fi
    [] else ->
        nays++; if nays > N/2 -> exit; fi
    fi
oc

Here, constituent is an array of capabilities for resources that are to vote; voting uses the

operation vote, which returns a boolean result. The above co statement terminates when a

majority of aye votes or nay votes has been received and the post-processing block executes

an exit statement, or when all votes have been received and there is a tie. The post-

processing block associated with each concurrent invocation in a co statement allows the

programmer to handle the reply from each invocation in a manner appropriate to that

invocation. In addition to being useful, co is relatively simple to implement since its

implementation uses the basic invoke and reply primitives in the RTS. Thus, co illustrates how

opening up the implementation provides additional, useful flexibility. Note that the co

statement is similar to Argus's coenter statement and to the V kernel's multicast mechan-

isms [Cher85].
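The shape of the voting co statement can be approximated with a thread pool. The sketch below is an analogy in Python (the names are invented, and concurrent.futures stands in for the RTS invoke and reply primitives): there is one post-processing step per reply, with an early return playing the role of exit once either tally reaches a majority.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def majority_vote(constituents):
    """Invoke every constituent's vote concurrently; stop examining replies
    as soon as either side has a majority (the role of 'exit' in the co
    statement's post-processing block)."""
    n = len(constituents)
    ayes = nays = 0
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(vote) for vote in constituents]
        for fut in as_completed(futures):   # one post-processing block per reply
            if fut.result():
                ayes += 1
                if ayes > n // 2:
                    return True             # 'exit'
            else:
                nays += 1
                if nays > n // 2:
                    return False            # 'exit'
    return None                             # all replies received: a tie

print(majority_vote([lambda: True, lambda: True, lambda: False]))   # True
```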


6.3.3. Operation Implementation

Recall that proc's support procedure call and process forking, depending on

whether the operation serviced by a proc is invoked by call or send. Input statements

support rendezvous or message receipt, also depending on whether an operation serviced by

in is invoked by call or send. Thus, input statements combine aspects of both Ada's

select statement and CSP's guarded input statement. They are even more powerful, how­

ever, since the synchronization expression may reference formal parameters, and thus selec­

tion can be based on parameter values. Input statements may also contain scheduling

expressions, which may also reference formal parameters; thus scheduling can also be based

on parameter values. These mechanisms greatly simplify solving many synchronization

problems. For example, the first and third solutions to the dining philosophers problem in

Sec. 4.4 use synchronization expressions in their input statements. Also, an example in

Sec. 3.2.2 uses a scheduling expression to give preference to larger requests. As we saw in

Chapter 5, synchronization and scheduling expressions are implemented quite efficiently.

Of particular importance is that accessing invocation parameters in such expressions has

very little cost. The process executing the input statement is given a pointer to the invoca­

tion block, which it uses to access the invocation's parameters.
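The selection rule induced by synchronization and scheduling expressions can be sketched directly over a list of pending invocations. The Python fragment below is an illustrative model only (guard and priority are invented names): the guard plays the synchronization expression, the priority function plays the scheduling expression, and ties fall back to arrival (FCFS) order.

```python
def select_invocation(pending, guard, priority):
    """Pick the invocation an input statement would service: only invocations
    whose parameters satisfy the guard are eligible, and among those the one
    with the smallest priority key is chosen. min() is stable, so ties are
    broken in FCFS order."""
    candidates = [inv for inv in pending if guard(inv)]
    if not candidates:
        return None                     # the servicing process would block
    best = min(candidates, key=priority)
    pending.remove(best)
    return best

# Hypothetical allocator: accept requests that fit in the free space,
# preferring larger ones (negating the size makes larger mean smaller key).
free = 10
pending = [{'id': 1, 'size': 4}, {'id': 2, 'size': 12}, {'id': 3, 'size': 7}]
chosen = select_invocation(pending,
                           guard=lambda inv: inv['size'] <= free,
                           priority=lambda inv: -inv['size'])
print(chosen['id'])                     # 3: request 2 is too large, 7 beats 4
```

The cheapness noted above comes from the fact that both callables read the invocation's fields in place; nothing is copied to evaluate them.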

An operation is restricted to be implemented by a single proc or by input state­

ments, but not by both a proc and by input statements. The reasons for these restrictions

are both practical and conceptual. Allowing two proc's to implement the same operation

has no practical use because both would need to provide the same functionality to their

invokers. Allowing an operation to be implemented by both a proc and in statements

would raise several semantic questions for which there are no good answers. Specifically,

does an invocation of such an operation always result in a new process? If it does, when is

the in statement given an invocation? A rule such as


"An invocation of an operation implemented by both a proc and by in statements is given to an in statement if a process is waiting for the invocation; otherwise, a process is created to execute the proc."


could be defined8, but such rules complicate the otherwise simple and consistent semantics.

Besides, a practical use for such an operation remains to be seen.

Operations can be declared within a process (a local operation) or at the

resource-level (a resource operation). Local operations support the programming of conver-

sations, as shown in Sec. 4.6. Resource operations provide the most commonly used form.

Of importance is that resource operations, like resource variables, may be shared; i.e., they

can be serviced by in statements in more than one process. Shared resource operations are

almost a necessity given that multiple instances of a proc can service the same resource

operation. For example, in the resource

body r
  op f( ... )

  proc p( ... )
    ...
    in f( ... ) -> ... ni
    ...
  end
end

each instance of p services the resource operation f, of which there is only one instance;

instances of p therefore share f.

Shared resource operations are also useful since they can be used to implement

conventional semaphores, "data-containing" semaphores, and server work queues. A

8 This rule is not complete because it does not define what happens if the synchronization expression on the input statement is found to be false for the invocation.


data-containing semaphore is a semaphore that contains data as well as a synchronization

signal. As an example, we use such semaphores to implement buffer pools in Saguaro. A

buffer is produced by sending its address to a shared operation; a buffer is consumed by

receiving its address from the shared operation. A shared operation can also be used to

permit multiple servers to service the same work queue. Clients request service by invok­

ing a shared operation. Server processes (in the same resource) wait for invocations of the

shared operation; which server actually receives and services a particular invocation is

transparent to the clients. In addition to being useful, shared resource operations can be

implemented almost as efficiently as non-shared operations; see Sec. 5.3.2.
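A shared operation serviced by several processes behaves like the work queue just described. The Python sketch below is an analogy, not SR's implementation (the names and the shutdown sentinel are inventions): several server threads receive from one queue, and which server handles a given invocation is invisible to the clients. The same queue, carrying buffer addresses instead of work items, is a data-containing semaphore.

```python
import queue
import threading

work = queue.Queue()    # the shared operation's pending invocations
done = queue.Queue()

def server(name):
    while True:
        item = work.get()           # 'in request(item) -> ... ni'
        if item is None:            # shutdown sentinel (not an SR feature)
            break
        done.put((name, item * item))

servers = [threading.Thread(target=server, args=(i,)) for i in range(3)]
for s in servers:
    s.start()
for n in range(5):                  # clients: 'send request(n)'
    work.put(n)
for _ in servers:                   # one sentinel per server
    work.put(None)
for s in servers:
    s.join()

results = sorted(v for _, v in (done.get() for _ in range(5)))
print(results)                      # [0, 1, 4, 9, 16]
```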

We considered several possible semantics for shared resource operations before

deciding on the one described in Sec. 3.2.3. Our main concern was finding the right balance

between expressiveness and implementation efficiency. Some semantics are inexpensive to

implement, but too weak to be useful. For example, the semantics could state that if

several processes are waiting at an in statement for the same operation and if an invoca­

tion of that operation arrives, then one of the processes will proceed. This is inexpensive to

implement because the order in which processes wait for invocations need not be main­

tained by the run-time support. However, this is too weak, for example, to guarantee that

a process will ever obtain entry to a critical section guarded by a semaphore operation. On

the other hand, some semantics are more powerful, but are expensive to implement. For

example, the semantics could provide the same FCFS ordering as in our semantics but

allow two processes servicing disjoint operations in the same class to search pending invo­

cations concurrently. This is stronger than our semantics because we allow only one pro­

cess at a time to access invocations in a given class. (Note that this stronger semantics has

little benefit in realistic programs.) However, it is more expensive to record for what invo­

cations processes are searching, and to maintain the FCFS order among processes (since


there is no guarantee that processes will finish their searches in the same order in which

they started).

The particular semantics we chose has the advantage that simple uses of opera-

tions have simple semantics and an efficient implementation. For example, processes

implementing a semaphore-like operation are given access to pending invocations in FCFS

order; they use the receive RTS primitive. Grouping operations into classes shows which

input statements can potentially compete for invocations. Recall that at run-time, invoca­

tions are placed onto a list of invocations according to their class. In our implementation,

this has the disadvantage that when a new invocation arrives, processes that do not imple­

ment the particular operation might be awakened. For example, suppose process p1 is exe-

cuting the in statement

in a() -> ... ni

and process p2 is executing the in statement

in a() -> ...
[] b() -> ...
ni

Suppose further that p1 was waiting before p2 and that there are no pending invocations of

a or b. Then, when an invocation of b arrives, p1 will be awakened even though it does not

service b. This disadvantage is not too significant, however, because realistic programs

rarely contain input statements that do not service all the operations in one class.
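The spurious wakeup just described can be reproduced with a condition variable. The sketch below is a Python model with invented names, not the RTS itself: p1 services only a, but because a and b share a class (here, one condition variable), an invocation of b wakes it anyway; it rechecks its condition and goes back to sleep.

```python
import threading
import time

cond = threading.Condition()        # one lock/queue per class of operations
pending = []                        # pending invocations in the class
stats = {'wakeups': 0, 'got': None}
ready = threading.Event()

def p1():
    with cond:
        ready.set()                 # set while holding the lock: no lost signal
        while 'a' not in pending:   # p1 services only operation a
            cond.wait()
            stats['wakeups'] += 1   # counts wakeups, useful or not
        pending.remove('a')
        stats['got'] = 'a'

t = threading.Thread(target=p1)
t.start()
ready.wait()
with cond:                          # an invocation of b arrives first
    pending.append('b')
    cond.notify_all()               # wakes p1, uselessly
time.sleep(0.2)
with cond:                          # now an invocation of a arrives
    pending.append('a')
    cond.notify_all()
t.join()
# p1 serviced a, and was typically awakened once for b as well
```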

The form of the return statement is simply

return

rather than

return expression


as found in some other languages. This is because an operation's return value has a

declared name (in an operation heading). It is treated just like a result parameter of the

operation; i.e., it can be assigned to and used in expressions in the body of the operation

implementation. Therefore, it is unnecessary to give an expression that is to be returned.

Moreover, treating the return value as we do has the added advantage that it obviates the

need for extra local variables. For example, if an SR operation is to return an array of ten

integers, the name of the return value can be used to build the result. If there were no

name for the return value, a local array in which to build the result would have to be allo­

cated. Furthermore, SR allows operations to be declared using '*' as the size of the return

value. For example, the operation

op fill( filler: char ) returns filled[1:*]: char

might be invoked by

a[1:5] := fill('c')

Without a name on the return value, there would be no way for the implementation of the

operation to know the actual upper bound of an invocation (unless it were passed as an

extra parameter). Note that to be consistent with the return statement, the reply state­

ment is also expression-less.

6.4. Issues Related to Program Distribution

We argued in Chapter 2 that distributed programs sometimes have components

that interact as equals. The file system example in Sec. 4.6 illustrated such interaction.

There, two resources-DirectoryManager and FileServer-both provide operations used by

the other and use operations provided by the other. SR supports this kind of interaction

because circularities are allowed in spec's and import's. In addition, they are not too

costly to implement. By contrast, most other languages (e.g., Ada, Modula-2) support only


hierarchical structures.

We also argued that it is often important in distributed programs to be able to

specify the machine on which the different parts of a program are to execute. SR supports

programmer control over placement since the location for a resource can be specified when

the resource is created; Argus provides similar support. By contrast, Ada provides no sup-

port for placement of tasks. (Ada also allows arbitrary tasks to share variables, which

further limits Ada's use for writing distributed programs.)

If the kind of placement described above is to be possible, the programmer must

be provided with some way to specify machines both when writing and linking programs.

Recall that in SR the notion of the network of machines on which a program is to execute

is represented by the user-defined enumeration type machine. This allows the user to

define virtual machines, one for each literal in machine. The linker creates a load module

for each virtual machine; these load modules (virtual machines) are mapped to physical

machines after linking. (See Appendix B for details.) This approach is reasonably simple to

implement and provides sufficient flexibility for constructing programs. By contrast, Ada

does not specify how machines are to be represented or how its programs are to be linked.

Finally, we argued that a distributed programming language must provide sup-

port for detecting and handling failures. SR provides two failure-handling mechanisms:

the completion status returned from statements that use capabilities and the 'failed' func-

tion. These mechanisms are higher level than a simple timeout mechanism, such as that

found in Ada, and lower level than mechanisms like atomic actions [Lisk83a] and replicated

procedure call [Coop84]. We feel that our approach is appropriate for the intended appli-

cation domain of SR. The SR mechanisms are simpler to use than timeout since they sup-

port a higher-level abstraction: the presumed status of a component rather than the passage of

an interval of time, which may indicate failure. Timeout intervals are used to implement


'failed', but the programmer need not be concerned with such low-level details. The SR

mechanisms are much more efficient than higher-level failure-handling mechanisms, and

hence they are more appropriate for a systems programming language. In fact, the SR

mechanisms can be used to implement high-level mechanisms such as atomic actions.
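The status-instead-of-timeout idea can be sketched as a wrapper that converts a low-level timeout into a completion status. Everything here is hypothetical Python scaffolding, not SR's mechanism: the caller sees only 'ok' or 'failed', while the timeout interval stays buried inside the wrapper, much as timeouts stay beneath 'failed' in SR.

```python
import queue
import threading

def call(op_queue, args, timeout=1.0):
    """Invoke an operation and return (status, result): a timeout underneath
    becomes the status 'failed'; the caller never handles an interval of time."""
    reply = queue.Queue()
    op_queue.put((args, reply))
    try:
        return 'ok', reply.get(timeout=timeout)
    except queue.Empty:
        return 'failed', None

live = queue.Queue()                    # an operation somebody services

def server():
    args, reply = live.get()
    reply.put(args * 2)

threading.Thread(target=server, daemon=True).start()
r1 = call(live, 21)                     # ('ok', 42)
dead = queue.Queue()                    # an operation nobody services
r2 = call(dead, 21, timeout=0.1)        # ('failed', None)
print(r1, r2)
```

An atomic-action layer could be built over such a primitive by retrying or aborting on a 'failed' status, which is the sense in which the SR mechanisms can implement higher-level ones.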

6.5. Sequential Control Statements

The sequential control statements in SR-if, do, and fa-evolved from those in

SR0. In SR0, the only control statements were if and do statements essentially identical to

Dijkstra's guarded commands. Although pure guarded commands are certainly adequate

to express any computation, there are three situations in which they result in awkward

program structure.

First, to prevent an SR0 if statement from aborting, it is often necessary to

include "else -> skip" as the last guarded command.9 Almost all the people who have

written SR0 programs have quickly (and vociferously) tired of having to do this. There-

fore, we defined SR's if statement so that executing if has no effect if no guard is true.

This makes SR's if similar to the if statement in other languages. Moreover, this change

does not violate the spirit of guarded commands since the programmer still has to consider

the case in which no guard is true.

The second deficiency of having only guarded commands for control constructs is

that iterative loops can be more complex than necessary. For example, to iterate through a

one-dimensional array, the SR0 programmer has to declare a variable to serve as the loop

index, initialize the variable, and write a do statement that expresses the termination con­

dition and updates the loop index. We have found that programmers often forget one or

9 else, like otherwise in Ada, is the negation of the disjunction of the other guards.


more of these steps (often the update of the loop index); also, they quickly became as

annoyed as when coding "else -> skip" in if statements. Consequently, we added the fa

(for-all) statement to support definite iteration.

Finally, it is awkward in SR0 to program early exits from loops or to go to the

next loop iteration before reaching the end of the loop body. For example, to exit a loop

prematurely, it is necessary to use a flag variable, which must be declared, initialized, set,

and tested. To avoid this, we added exit and next statements to SR. Including these

statements makes the programmer's job easier and makes programs more readable, yet

does not destroy the structure provided by the iterative statements. We also allow exit

and next to be used within post-processing blocks in the co statement since they are useful

and also since post-processing blocks are executed one after another in an iterative fashion.

Once again we have a mechanism that is simple, widely applicable, and efficient to imple-

ment.

Recall that the form of the fa statement is

fa quantifier, ..., quantifier -> block af

where a quantifier has the general form

bound_variable := initial_expression direction final_expression

st boolean_expression

The form we originally considered for the fa statement was

fa quantifier block af

where a quantifier has the general form

( bound_variables := initial_expressions to final_expressions st boolean_expression )

For example, using this original form, the statement


fa ( i, j := 1,5 to 2,7 ) write( i, j ) af

would output the sequence

1 5  1 6  1 7  2 5  2 6  2 7

Two factors persuaded us to change to the present form of the fa statement, and

specifically to the present form of the quantifier. First, it is hard to read the old form of

quantifier because the bound variables are separated from their corresponding initial and

final expressions. Second, nested fa statements were needed to express iteration in which

one bound variable depends on the value of the other; such iteration could not be expressed

in co and in statements because they allow only a single quantifier. For example, the fa

statement using the present quantifier

fa i := lb(a) to ub(a)-1,
   j := i+1 to ub(a) st a[i] > a[j] ->
     a[i], a[j] := a[j], a[i]
af

would be written using the old quantifier as two fa statements

fa ( i := lb(a) to ub(a)-1 )
  fa ( j := i+1 to ub(a) st a[i] > a[j] )
    a[i], a[j] := a[j], a[i]
  af
af

Besides these two factors, the present quantifier is simpler to parse and easier to generate


intermediate code for.10 Furthermore, it can lead to more efficient programs. For example,

using the old quantifier, the statement

fa ( i, j := 1,4 to 10,7 st i ≠ 5 ) write( i, j ) af

implies that the test "i ≠ 5" is to be performed in the innermost loop, which is inefficient;

that test should be performed in the outer loop. (The such-that expression must generally

be evaluated in the innermost loop; this happens unless the compiler optimizes

such tests, which is complicated.) To avoid such inefficiencies, the user could use two fa

statements, although this partially defeats the brevity provided by the fa abbreviation.

This problem is solved in the present approach because each quantifer can have its own

such-that clause; the above fa statement can be expressed as:

fa i := 1 to 10 st i ≠ 5, j := 4 to 7 -> write( i, j ) af

Note that quantifiers implicitly declare their bound variables, which removes some of the

tedium from programming.
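The efficiency argument can be checked mechanically. In the Python list comprehensions below (standing in, loosely, for the two forms of fa), attaching the test to the outer quantifier is exactly hoisting it out of the inner loop; counting evaluations makes the saving concrete.

```python
calls = {'inner': 0, 'outer': 0}

def keep_inner(i):
    calls['inner'] += 1
    return i != 5

def keep_outer(i):
    calls['outer'] += 1
    return i != 5

# such-that tested in the innermost loop (the old, single-quantifier reading)
inner = [(i, j) for i in range(1, 11) for j in range(4, 8) if keep_inner(i)]
# such-that attached to the outer quantifier (the present form)
outer = [(i, j) for i in range(1, 11) if keep_outer(i) for j in range(4, 8)]

assert inner == outer            # the same pairs are produced...
print(calls)                     # {'inner': 40, 'outer': 10}
```

The ten-versus-forty evaluation count is the whole of the argument: the per-quantifier such-that clause buys the hoisted form without requiring the compiler to optimize.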

6.6. Caveats to the Programmer

SR provides a good deal of flexibility and freedom in how its mechanisms can be

used. In some situations, however, this can lead to unexpected program behavior, make

debugging more difficult, or provide the programmer with "enough rope to hang himself".

Most of these could be precluded by placing restrictions upon the use of the language's

mechanisms. However, we have chosen not to do so because we believe a language should

not impose a "police state" on its use; i.e., it should not unnecessarily restrict the program­

mer. Such restrictions can also greatly complicate a language's definition and implementa­

tion. Besides, some of these situations actually have practical uses, and few are likely to

10 The direction of iteration can also now be to or downto.


occur in realistic programs. This section enumerates and discusses these situations. Note

that these kinds of situations exist in almost any other language, although the specific

causes differ.

• An operation that has variable or result parameters, or a return value can be invoked

using send. However, such parameters and return value will not be copied back. The

compiler warns about such invocations.

• An operation with a return value can be invoked in a call statement. In this case, the

return value is discarded. The compiler warns about such invocations.

• An operation can be invoked using either call or send, if so specified. We have

presented examples of where this is useful (Sec. 2.2.1). The programmer should be aware

of how an operation can be invoked when coding its implementation.

• Grouping operations into classes that are accessed by one process at a time can have

unexpected effects on the execution of input statements. In addition, the evaluation of

synchronization and scheduling expressions within in statements can have side effects.

These topics are described in detail below.

• A reply statement can be executed for an operation invoked by send; however, it has

no effect. In addition, replies made after the initial reply to any invocation have no

effect.

• An operation declared within a proc that is never assigned to a capability variable is

essentially useless. The compiler warns about such operations when it can detect them.

• A resource that has finished executing continues to exist even if there are no capabilities

for it or for any of its operations. Such a useless resource consumes internal run-time

support objects, such as memory and semaphores. The programmer should therefore

destroy such resources so that the run-time support can reclaim these objects. The


run-time support does not attempt to detect and eliminate such useless resources

automatically because to do so would greatly complicate our implementation. In partic­

ular, capabilities would need to be maintained by the run-time support. This requires

that the run-time support be entered each time a capability variable is assigned or ceases

to exist (because its defining block is exited). On each such entry, the run-time support

would update a reference count for the capability. If that count became zero and there

was no activity in the resource, the run-time support could eliminate the resource.

• The private type restrictor can be used within a global component. However, the type

is then essentially useless because no other component can see the type's representation.

• The facilities provided by real resources and pointers allow a programmer to subvert the

type checking and write programs to overwrite arbitrary memory locations. These facil­

ities should be used with care.
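For comparison, the reference-counting scheme that the run-time support deliberately omits (described in the bullet on useless resources above) can be sketched in a few lines. The class and method names below are hypothetical, and a real implementation would also have to track capabilities passed between virtual machines.

```python
class Resource:
    """Sketch of capability reference counting: every capability assignment
    or destruction enters the run-time support, and the resource is
    reclaimed once the count is zero and no process is active inside it."""
    def __init__(self):
        self.refcount = 0
        self.active = False         # a process executing inside the resource?
        self.destroyed = False

    def bind_capability(self):      # a capability variable is assigned
        self.refcount += 1

    def drop_capability(self):      # its defining block is exited
        self.refcount -= 1
        if self.refcount == 0 and not self.active:
            self.destroyed = True   # reclaim memory, semaphores, ...

r = Resource()
r.bind_capability(); r.bind_capability()
r.drop_capability()
assert not r.destroyed              # one capability still exists
r.drop_capability()
assert r.destroyed                  # count hit zero while idle: reclaimed
```

Even this toy shows the cost the text alludes to: every assignment and every block exit becomes an entry into the run-time support.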

As mentioned above, grouping operations into classes that are accessed by one

process at a time can have unexpected effects on the execution of in statements. Consider,

for example, the in statement

in a( x ) and f( x ) -> ... ni

where a is a resource operation. If f is in the same class as a, or the implementation of f

attempts to service an operation in the same class as a, then deadlock will result.

Specifically, the original process holds the class lock until it has finished accessing the class;

the process implementing f would be unable to obtain the lock, so it would delay forever.
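This deadlock can be demonstrated with an ordinary non-reentrant lock. In the Python sketch below (an analogy only; SR's class lock is internal to the RTS, and the names are invented), the guard's helper tries to take the lock that the enclosing input statement already holds; a timeout stands in for the unbounded delay.

```python
import threading

class_lock = threading.Lock()       # one lock per class of operations

def service_f():
    """Evaluating f(x) in the guard: the process servicing f needs the same
    class lock the enclosing input statement is holding."""
    acquired = class_lock.acquire(timeout=0.2)  # in SR this waits forever
    if acquired:
        class_lock.release()
    return acquired

# 'in a(x) and f(x) -> ...' holds a's class lock while the guard runs:
with class_lock:
    progressed = service_f()
print(progressed)                   # False: the nested acquire never succeeds
```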

The evaluation of synchronization and scheduling expressions can have side

effects. For example, consider the following input statement, where p is a variable local to

the process containing the in statement and r is a resource variable.


in a( x ) and f( x, p ) by g( x, r ) -> ... ni

Possible side effects, and examples thereof, are:

• Modification of variables local to the process containing this in statement or

modification of resource variables. For example, p might be modified in the evaluation

of f if the second formal parameter of f is a var or res parameter. Similarly, r might be

modified in the evaluation of g.

• Modification of variables local to the process implementing the synchronization expres­

sion or the scheduling expression. For example, variables local to the process implement­

ing f might be modified.

• Generation of new invocations. For example, the implementation of f might invoke

other operations.

• Modification of parameters in the invocation. For example, x might be modified in the

evaluation of f if the first parameter of f is a var parameter.

This last side effect can have an undesirable consequence: the parameter in an invocation can be modified without the invocation being accepted. (Although the new value of the parameter might make the invocation acceptable!) The semantics could define a rule prohibiting the evaluation of a synchronization expression or scheduling expression from modifying the parameters of the invocation. However, if such a rule were defined, some of the other side effects would still be possible. Therefore, we do not attempt to restrict side effects, although we do advise, in general, against writing operations that have side effects and that are used in synchronization expressions or in scheduling expressions.
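To make the hazard concrete, the following Python sketch (the dispatcher and all names are hypothetical illustrations, not SR's actual run-time code) models an input statement whose synchronization expression mutates the var parameter of each pending invocation it examines; a parameter can thus change even when its invocation is never accepted.

```python
# Minimal model of an SR-style input statement whose synchronization
# expression may have side effects.  All names here are illustrative.

def select_invocation(pending, sync_expr, sched_expr=None):
    """Return an invocation whose synchronization expression is true
    (the best-scheduled one if a scheduling expression is given);
    None if no invocation qualifies."""
    candidates = [inv for inv in pending if sync_expr(inv)]
    if not candidates:
        return None
    if sched_expr is not None:
        candidates.sort(key=sched_expr)
    chosen = candidates[0]
    pending.remove(chosen)
    return chosen

# Each invocation carries mutable parameters, like var parameters in SR.
pending = [{"x": 5}, {"x": 7}]

def f(inv):
    # Side effect: the synchronization expression modifies the
    # invocation's parameter, whether or not it is later accepted.
    inv["x"] += 1
    return inv["x"] > 7

accepted = select_invocation(pending, f)
# Only the invocation whose (updated) x exceeds 7 is accepted,
# but the rejected invocation's parameter was still modified.
print(accepted)   # {'x': 8}
print(pending)    # [{'x': 6}]
```

The rejected invocation remains pending with a silently changed parameter, which is exactly the consequence the text warns about.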

Note that the above described deadlock and most of these side effects are unlikely to occur in typical programs. (In fact, we have yet to see a program that has encountered any of these problems.) Thus, most programmers do not need to be concerned with these potential problems.


CHAPTER 7

Conclusions

This dissertation has discussed general issues in distributed programming languages, and described the SR language and its implementation. This chapter summarizes the main points and discusses future research suggested by this work.

7.1. Summary

We argued in Chapter 2 that a language for programming distributed systems should provide certain kinds of mechanisms: dynamic modules, shared variables (within a module), dynamic processes, synchronous and asynchronous forms of message passing, rendezvous, concurrent invocation, and early reply. A module provides one or more services; within a module, one or more processes cooperate to provide these services. The user of a module sees only the module's services, not how they are implemented. Allowing modules and processes to be created dynamically provides additional flexibility in programming, without adversely affecting performance. The two forms of message passing, synchronous and asynchronous, allow processes to interact in the most appropriate way; inelegant and inefficient programs result if only one of these is included in a language. Concurrent invocation and early reply mechanisms provide additional, needed flexibility in process communication and synchronization.

We then presented the SR language and showed how it provides these mechanisms. The resource is SR's module; it is the main unit of abstraction and encapsulation. A resource is a parameterized pattern, instances of which are created dynamically. Resources define operations and are implemented by one or more processes that execute on the same processor. Processes interact by means of operations; processes in the same resource can also share variables. Operations generalize procedures. They are invoked by means of synchronous call or asynchronous send. Operations are implemented by procedure-like proc's or in statements. In different combinations, these mechanisms support local and remote procedure call, dynamic process creation, rendezvous, message passing, and semaphores. The co statement provides concurrent invocation, and the reply statement provides early reply. We also described SR's sequential statements, how variables and types are declared, and type checking rules. Many small examples and several larger ones were given to illustrate particular mechanisms and the interplay between them. These examples showed the expressiveness and flexibility of SR's mechanisms.
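As one illustration of such a combination, the Python sketch below (an analogy under assumed semantics, not SR code) models an operation that is only ever invoked by send and serviced by receive; it behaves exactly like a semaphore, with send playing the role of V and receive the role of P.

```python
import queue

class Operation:
    """Model of an SR operation used as a semaphore: each pending
    invocation is a queued, parameterless message."""
    def __init__(self, initial=0):
        self._invocations = queue.Queue()
        for _ in range(initial):
            self.send()

    def send(self):
        # Asynchronous invocation: deposit a message (like V).
        self._invocations.put(())

    def receive(self, timeout=None):
        # Message receipt: block until a message exists (like P).
        self._invocations.get(timeout=timeout)

# A binary semaphore protecting a critical section.
mutex = Operation(initial=1)
mutex.receive()   # P: acquire
# ... critical section ...
mutex.send()      # V: release
```

The point of the analogy is that no special semaphore construct is needed: the pairing of send and receive on one operation already has the required semantics, which is what makes the semaphore optimization described in Chapter 5 possible.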

Chapter 5 described our implementation of SR. It was designed to make common cases efficient. The esoteric cases, such as input statements that contain synchronization and scheduling expressions, are necessarily more expensive to implement. However, the implementation was designed so that these more complicated uses do not affect the performance of simpler cases. This chapter briefly discussed the roles of the compiler, linker, and run-time support. It then focused attention on how the generated code and run-time support interact to create and destroy resources, and to generate and service invocations of operations. In particular, we showed how different kinds of in statements, some with synchronization or scheduling expressions, are implemented. We also described optimizations that allow us to use semaphores for some operations, and procedure call instead of the general invocation primitives for some call invocations. Finally, this chapter presented measurements of the costs to generate and service different kinds of invocations. The implementation has been in use since November 1985 and is currently being improved.


Chapter 6 described why we chose the particular syntax and semantics for SR's mechanisms, and how these mechanisms compare to mechanisms in other approaches to distributed programming. We examined how SR mechanisms strike a balance between our language design goals of expressiveness, simplicity, and efficiency. In particular, SR's mechanisms are well-integrated and based on only a few concepts; they can be used in flexible ways with virtually no restrictions. The comparisons of SR's mechanisms to those in other languages showed the expressiveness and flexibility of SR's mechanisms; they also illustrated some of the shortcomings of other distributed programming languages.

Balancing our language design goals of expressiveness, simplicity, and efficiency led us to mechanisms that fall in the middle of other approaches to distributed programming. Variables can be shared within a resource, but not between resources. The location of a resource can be specified when it is created, but resources cannot migrate. Iterative statements can be terminated early (exit) or can be continued with the next iteration (next), but there is no goto statement. Failure mechanisms detect failures, but no special mechanism is provided to handle such failures.

The work in this dissertation makes the following contributions. Few of the mechanisms in SR are novel; mechanisms similar to them have been used in other languages. What is novel about SR, however, is how these mechanisms are provided. The language mechanisms are well-integrated and have uniform meanings. This integration and uniformity pervade the language, from small things, such as the uniform meaning of null, to larger things, such as the integration of operations. All the pieces of the language fit together well and there are virtually no restrictions on how a mechanism can be used. Moreover, we designed and programmed an implementation of the language. The implementation showed that the mechanisms can be achieved at a reasonable cost. It also provides us with a tool for writing distributed programs.


7.2. Future Research

As indicated in Chapter 2, more programming experience is needed with failure handling mechanisms. SR's failure handling mechanism represents an intermediate approach. It allows a program to detect failures, but does not provide assistance in handling failures. The programmer must therefore explicitly include tests for failure and actions for failure handling after each statement that may fail; this leads to cumbersome programs. One possible improvement might be to use recovery operations as described in [Schl86]; these gather the failure detection and handling in one place, separate from the mainline code.
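The contrast can be sketched in Python as follows (the remote call stub and failure model are hypothetical, not SR's mechanism); the first style scatters a failure test after every statement, while the second gathers detection and handling in one place, in the spirit of recovery operations.

```python
# Contrast of the two failure-handling styles discussed above.
# The RPC stub and its failure model are invented for illustration.

class RemoteFailure(Exception):
    pass

def call(op, *args, fail=False):
    """Stand-in for a remote invocation that may fail."""
    if fail:
        raise RemoteFailure(op)
    return (op, args)

# Style 1: an explicit test after each statement that may fail
# (cumbersome, since it grows with every new statement).
def transfer_explicit():
    try:
        call("debit", 10)
    except RemoteFailure:
        return "debit failed"
    try:
        call("credit", 10, fail=True)
    except RemoteFailure:
        return "credit failed"
    return "ok"

# Style 2: failure detection and handling gathered in one place,
# separate from the mainline code.
def transfer_recovery():
    def recover(failed_op):
        return f"{failed_op} failed"
    try:
        call("debit", 10)
        call("credit", 10, fail=True)
        return "ok"
    except RemoteFailure as e:
        return recover(e.args[0])

print(transfer_explicit())   # credit failed
print(transfer_recovery())   # credit failed
```

Both versions detect the same failure; the second keeps the mainline code free of failure tests, which is the benefit recovery operations aim for.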

Another area that needs to be explored further is the performance of the implementation. The measurements of the costs to generate and service invocations presented in Chapter 5 are admittedly crude. Additional tests should be performed, e.g., to determine how efficient it is to pass parameters of different sizes. These measurements should indicate where improvements to the implementation should be made. They should also indicate the relative costs of the different mechanisms, and might serve as a guide to the programmer concerned with obtaining maximum efficiency. Also, they should be compared with performance tests for other distributed programming languages to see how specific mechanisms and equivalent programs compare.
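Such a test might take the following shape. This Python sketch is only the skeleton of the measurement (the invoke stub is a placeholder for generating and servicing one invocation, and absolute numbers are machine-dependent): it times a stand-in call over a range of parameter sizes and reports the mean cost per invocation.

```python
import time

def invoke(payload):
    """Placeholder for generating and servicing one invocation
    with the given parameter bytes."""
    return len(payload)

def measure(param_size, trials=10000):
    """Mean cost (seconds) of one invocation with param_size bytes."""
    payload = bytes(param_size)
    start = time.perf_counter()
    for _ in range(trials):
        invoke(payload)
    elapsed = time.perf_counter() - start
    return elapsed / trials

# Sweep parameter sizes to see how cost grows with size.
for size in (0, 64, 1024, 16384):
    cost = measure(size)
    print(f"{size:6d} bytes: {cost * 1e6:.3f} us/invocation")
```

Replacing the stub with real call and send invocations, and repeating the sweep for each mechanism, would yield the per-mechanism, per-size cost table the text calls for.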

As described in Chapter 5, we perform two optimizations for call invocations of operations implemented by a proc. Such calls from within the same resource use conventional procedure call; such calls between resources enter the RTS, which allows the calling process to execute the proc's code. A significant part of the cost in the latter case is that the generated code must obtain (and later release) memory for the invocation block from the RTS; the cost of entering the RTS for the actual invocation and the checking it performs is minimal. The problem with such inter-resource invocations is that the location of resources cannot generally be determined at compile time. One way to improve such invocations is to recognize certain common patterns of usage. For example, if one resource creates another on the same machine and then invokes proc's in that resource, those invocations can use conventional procedure call. Observe, however, that how an operation is implemented might not be known at compile time, e.g., perhaps only the resource's specification has been compiled. Therefore, to make such optimizations, the decision as to what kind of code to use for a particular invocation should be put off until link time. This can be done by having the compiler generate both kinds of invocation code; the linker could then decide which code to include. Further study of these ideas is required to assess their viability.
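The link-time choice can be sketched as follows. In this Python model (the call sites, the co-location table, and both code sequences are hypothetical simplifications of the scheme, not the actual compiler output), the "compiler" supplies both invocation sequences and the "linker" selects one per call site according to whether the callee is known to be co-located.

```python
# Sketch of deferring the invocation-code decision to link time.
# The two code sequences and the link-time table are illustrative.

def direct_call(proc, args):
    """Conventional procedure call: the cheapest path."""
    return proc(*args)

def rts_invoke(proc, args):
    """General path: obtain an invocation block via the RTS, then
    let the calling process execute the proc's code."""
    invocation_block = {"proc": proc, "args": args}   # RTS allocation
    return invocation_block["proc"](*invocation_block["args"])

def link(call_sites, colocated):
    """Link-time choice: pick the direct sequence only when the
    callee's resource is known to be on the same machine."""
    return {site: (direct_call if colocated[site] else rts_invoke)
            for site in call_sites}

def double(x):
    return 2 * x

dispatch = link(["site1", "site2"], {"site1": True, "site2": False})
print(dispatch["site1"](double, (21,)))   # 42, via direct call
print(dispatch["site2"](double, (21,)))   # 42, via the RTS path
```

Both sequences compute the same result; only their cost differs, which is why the choice can safely be postponed until the linker knows the resource placement.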

More experience is needed writing distributed programs in SR. SR is currently being used to program the Saguaro distributed operating system [Andr86]. This experience has already pointed out the need for a facility to execute a UNIX load module from within an SR program; this facility will be used to execute user code. The design of such a facility is currently underway. The experience gained in writing Saguaro and other distributed programs should point out the appropriateness or inappropriateness of the mechanisms in SR.

In addition to work on the language and implementation, further work could also provide tools to aid the SR programmer. A debugger for SR programs would be very helpful. Such a debugger might provide facilities to trace message traffic, and process and resource creation. Another tool that would be useful is a tool like UNIX's 'make'. Such a tool should be able to infer dependencies among components by using SR's import mechanism and automatically compile those that need it.
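A core piece of such a tool is the transitive-dependency computation. The following Python sketch (the component graph is invented for illustration, and a real tool would derive it from import clauses and compare timestamps) marks every component that directly or transitively imports a changed component as needing recompilation.

```python
# Sketch of a make-like tool that infers dependencies from SR's
# import mechanism.  The component graph below is hypothetical.

def components_to_recompile(imports, changed):
    """Given a map component -> set of imported components, return
    every component that (transitively) imports a changed one."""
    # Invert the edges: an importer depends on each importee.
    dependents = {c: set() for c in imports}
    for comp, deps in imports.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(comp)
    # Walk outward from the changed components.
    stale, frontier = set(changed), list(changed)
    while frontier:
        for comp in dependents.get(frontier.pop(), ()):
            if comp not in stale:
                stale.add(comp)
                frontier.append(comp)
    return stale

imports = {
    "main": {"stack", "queue"},
    "stack": {"memory"},
    "queue": {"memory"},
    "memory": set(),
}
print(sorted(components_to_recompile(imports, {"memory"})))
# ['main', 'memory', 'queue', 'stack']
```

Changing a leaf component like memory forces everything that imports it, directly or through another component, to be recompiled, while a change to stack alone would touch only stack and main.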

As illustrated by some of the examples in Chapters 3 and 4, SR is an attractive language to use even for sequential programming. This has led us to consider defining a sequential subset of SR. This subset language would be simpler than the full language and much more efficiently implemented.

Another related idea is to define a family of languages. Each member of the family would be intended for a different level of programming. For example, a family might contain a member for each of five levels: machine, sequential, concurrent, distributed, and transaction. The machine level language would provide facilities usually found in assembly languages, but would be as machine-independent as possible. The sequential level language would provide facilities like those found in C. The concurrent level would provide processes and shared variables; its programs would execute on a single processor. The distributed level language would be SR. The transaction level language would provide facilities for atomic transactions. The key points about such a family are that throughout the family similar mechanisms would be expressed in similar ways, and that mechanisms would scale up from one level to the next. For example, procedure call in the sequential level would scale up to call as it is defined in SR. Moreover, a program could be formed from components programmed in several different levels. This would lead to more efficient programs; e.g., an SR program could directly use an assembly language procedure. In addition, the compilers, linkers, and run-time supports for the levels might be able to share some common code.


APPENDIX A

Synopsis of the SR Language

The following are the general forms of the main language constructs. Here, optional items are enclosed in brackets, and plurals and ellipses are used to indicate one or more occurrences of items.

Components

global:
    global identifier
        constants or types
    end

resource specification:
    [spec identifier
        [import component_identifiers]
        [constants, types, or operations]]

resource body:
    resource identifier([parameters]) [separate]
    [body identifier]
        [declarations]
        [initial block end]
        procs
        [final block end]
    end

proc:
    proc identifier([formal_identifiers]) [returns result_identifier]
        block
    end

block:
    [declarations] statements


Declarations

constant:
    const identifier = expression

type:
    type identifier = type_specification

operation type:
    optype identifier = ([parameters]) [returns result]

variable:
    var identifier[subscripts], ... : type [:= expression, ...]

operation:
    op identifier[subscripts]([parameters]) [returns result]
    op identifier[subscripts] optype_identifier

Statements

sequential:
    skip
    variable, ... := expression, ...
    variable++
    variable--
    if boolean_expression -> block [] ... fi
    do boolean_expression -> block [] ... od
    fa quantifier -> block af
    exit
    next

operation invocation:
    [call] operation([actuals])
    send operation([actuals])
    co [(quantifier)] call_invocation [-> block] // ... oc

operation service:
    in [(quantifier)] operation([formal_identifiers]) [& boolean_expression] [by expression] ->
        block [] ... ni
    receive operation([variables])
    return
    reply

resource control:
    capability := create resource([actuals]) [on machine]
    destroy capability


APPENDIX B

Manual Pages

This appendix contains the manual pages for the major components of our UNIX implementation: the compiler (src), linker (srl), and execution manager (srx).


SRC(1L)                    UNIX Programmer's Manual                    SRC(1L)

NAME
    src - SR compiler

SYNOPSIS
    src [ options ] file ...

DESCRIPTION
    Src is the compiler for SR programs. It takes a list of SR file names and
    produces one object file for each SR resource. (In the following, "res"
    will be used to denote an arbitrary resource name.) The srl(1L) command
    is then used to link the program into one or more load modules, which may
    be executed directly or by invoking srx(1L).

    Src takes a number of option flags, which are specified in arguments
    beginning with a hyphen. Multiple flags can appear in an argument and any
    number of option arguments can be given. The following options are
    currently recognized:

    -I  The next argument specifies the directory that contains the
        Interfaces directory for this compilation. The default is the current
        directory.
    -d  Information for debugging the compiler, of use only to the
        implementors of SR, is written to stderr.
    -e  The files res.e are not removed after assembly. These files contain
        assembly language for the em pseudomachine.
    -m  The next argument ("sun" or "vax") specifies for what machine code is
        to be produced. The default is the machine on which the compilation
        is performed.
    -q  Normally, src prints each file name before it begins to compile the
        file; this option silences the compiler on this subject.
    -s  The files res.vax.s (or res.sun.s) are not removed after assembly.
        These files contain Vax (or Sun) assembly language.
    -t  Compilation stops after the res.e files are produced. Neither ACK
        (which translates em pseudomachine code into Vax or Sun assembly
        language) nor the assembler is invoked.

DIAGNOSTICS
    Src is supposed to emit useful and self-explanatory error messages. It
    also occasionally emits messages of the form "Internal Compiler Error:
    ..."; these indicate bugs in the compiler itself, and should be reported
    to the implementors of SR.

FILES
    file.sr      SR source file
    res.e        assembly language for the em pseudomachine
    res.vax.s    Vax assembly language
    res.sun.s    Sun assembly language
    res.vax.o    Vax object file
    res.sun.o    Sun object file
    Interfaces   directory containing interface and linker files

SEE ALSO
    srl(1L), srx(1L), ack(1L), as(1)

BUGS
    Yes.


SRL(1L)                    UNIX Programmer's Manual                    SRL(1L)

NAME
    srl - SR linker

SYNOPSIS
    srl [ options ] package-name [ spec-file ]
    srl options exec-file res-name ...
    re-srl

DESCRIPTION
    Srl is the linker for SR programs. It takes a list of resources and
    machines and produces a group of binary files that can be executed
    together as a distributed program. The srx(1L) command is then used to
    execute the program. An important special case handles single-machine SR
    programs and produces directly executable binaries. This case is quite
    different, and hence treated separately.

    Re-srl is a shell script that, when executed, will relink a program in
    the manner specified by the last srl done in the current working
    directory. This is useful when changes not affecting the resource or
    machine structure of the program are made to the source files and a quick
    relink is desired. However, if any machine enumeration or resource name
    or spec is changed, re-srl may no longer be applicable, so srl should be
    used. The re-srl file is produced by srl in the current directory every
    time it is invoked.

    Srl takes a number of option flags, which are specified in arguments
    beginning with a hyphen. Multiple flags can appear in an argument and any
    number of option arguments may be given. The following options are
    recognized:

    -1  Link a single-machine program. Note that this is a one, not a lower
        case L.
    -i  Use the following argument as the directory that contains the
        Interfaces directory to be used in linking. The default is the
        current working directory.
    -t  Turn on verbose mode. The linker will print some information about
        what it is doing. Also, when the program is finally executed, the
        run-time support on each machine will say when it starts and ends
        executing. Normally the linker and execution are silent unless an
        error occurs.
    -e  Use the experimental run-time support. This option is intended for
        the implementors of SR only.

    Normally srl produces load modules to run on a VAX under 4.3bsd UNIX. The
    target machine and operating system can be specified with the following
    options:

    -s  Link object code for the Sun Workstation.
    -v  Link object code for the VAX-11.
    -V  Link the V Kernel version of the run-time support.
    -S  Link the stand-alone version of the run-time support for bare
        machines.
    -U  Link the UNIX version of the run-time support.

    In the multi-machine case, the input to the linker is a list of virtual
    machine specifications, which are read from the standard input or
    spec-file if given. The output is one virtual machine load file (VMLF)
    containing the appropriate resource patterns for each machine and a
    package file (named package-name) that contains configuration
    information.


    The VMLF's are executed using the srx(1L) command, which uses the package
    file to determine which machines are in the program. The VMLF's are named
    with the machine names in the specification file. Furthermore, these
    names must match the names declared in the SR program. Specifically, the
    machine enumeration type in each component of the SR program must be
    identical, and every name in the enumeration must appear in the machine
    specification and vice-versa. This is strictly enforced by the linker.

    The specification consists of one or more lines of the following form: a
    space-separated list of machines, an equal sign, and a space-separated
    list of resource names. Each machine name may be followed by an optional
    target clause. The target clause is a space-separated list of elements of
    {Sun, sun, VAX, Vax, vax, UNIX, Unix, unix, V, SA, sa} enclosed in square
    brackets. The "V" stands for the V Kernel and "SA" means stand-alone.
    These flags are similar to the command-line options, but apply only to
    one machine rather than to the whole program. Target clauses may also be
    given on a line by themselves, in which case they change the target of
    all subsequent machines.

    The specification directs the linker to assemble all of the resource
    patterns listed on the right-hand side of the equal sign into each of the
    VMLF's listed on the left of the equal sign. The first resource of the
    first specification line is taken to be the main resource. It should not
    have any parameters. The first machine on the first line is the main
    machine. It will get control when the SR program begins execution; all
    other machines will be idle and waiting for messages.

    In the single-machine case, the linker assembles the resources named on
    the command line into the executable file exec-file. This file can be run
    as any other UNIX command. There is no concept of a remote machine in
    this case, so no machine enumeration may be given in the components of
    the SR program. The first resource specified on the command line is taken
    to be the main resource. It should not have any parameters.

EXAMPLES
    In order to clarify these ideas a couple of examples are given. For
    instance, assume there is a file called spec that has the following
    contents:

        alpha beta [Sun UNIX] = red blue
        delta = green

    Then the command "srl -V pack spec" will produce five files: re-srl, the
    quick relink script; pack, the package file; alpha, a VMLF destined to
    run on the VAX under the V kernel and containing patterns for resources
    red and blue; beta, a VMLF targeted to UNIX on the Sun and containing the
    same patterns as alpha; and delta, a VMLF for the V Kernel on the VAX and
    containing only the pattern for resource green. The main resource is red
    and the main machine is alpha.

    The command "srl -1i /tmp foo bar zap qaz" will produce a single
    executable file called foo that contains patterns for resources bar, zap,
    and qaz. Bar will be the main resource. The linker will look for the
    Interfaces directory in /tmp (but foo will be created in the current
    working directory).
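The grammar of these specification lines is simple enough to sketch a parser for. The following Python function is a hypothetical illustration (not part of the actual srl source); for each line it returns the machines with their target clauses and the resources assigned to them.

```python
import re

def parse_spec(text):
    """Parse virtual machine specification lines into a list of
    (machine -> target flags, resource names) pairs.  A line holding
    only a target clause updates the default for later machines."""
    entries = []
    default = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # A bare "[...]" line changes the default target clause.
        if line.startswith("[") and line.endswith("]") and "=" not in line:
            default = line[1:-1].split()
            continue
        left, right = line.split("=", 1)
        resources = right.split()
        machines = {}
        current = None
        # Tokens are machine names or bracketed target clauses.
        for tok in re.findall(r"\[[^\]]*\]|\S+", left):
            if tok.startswith("["):
                if current is not None:
                    machines[current] = tok[1:-1].split()
            else:
                current = tok
                machines[current] = list(default)
        entries.append((machines, resources))
    return entries

spec = """alpha beta [Sun UNIX] = red blue
delta = green"""
entries = parse_spec(spec)
print(entries[0])   # ({'alpha': [], 'beta': ['Sun', 'UNIX']}, ['red', 'blue'])
print(entries[1])   # ({'delta': []}, ['green'])
```

On the example spec above, the parser recovers exactly the assignment the text describes: red and blue go to both alpha and beta (with beta targeted to UNIX on the Sun), and green goes to delta alone.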

SEE ALSO
    src(1L), srx(1L), ld(1)


DIAGNOSTICS
    Srl does quite a bit of checking to make sure that the machine names
    declared in the SR program and those given in the virtual machine
    specifications are consistent. Also, the SR source files are checked to
    make sure that they have not been modified since last compiled. If the
    linker aborts, it should be clear from the error messages what went
    wrong.

    Srl does not actually do the linking itself; it is merely a front-end for
    the UNIX linker ld(1). As such, the UNIX linker may produce error
    messages on its own. All messages not beginning with "SRL" can be
    attributed to ld(1).

BUGS
    It is not possible to change an SR program from single-machine to
    multi-machine (or vice-versa) without modifying and recompiling the
    source.

    While the linker supports different hardware and operating system
    targets, for now a compiler and run-time support are only available under
    UNIX. Support for the V Kernel is upcoming.

    The user interface is new (and vastly different from the old srld). Any
    comments or suggestions for improvement would be welcome.


SRX(1L)                    UNIX Programmer's Manual                    SRX(1L)

NAME
    srx - SR execution manager

SYNOPSIS
    srx package-name

DESCRIPTION
    Srx is used to execute a group of virtual machine load files (VMLF's)
    produced by the SR linker srl(1L). It determines which VMLF's to use from
    the package file produced by srl. For each machine in the package, srx
    creates a UNIX process to execute the code for the machine. Once all
    machines have been started, srx sets up communications between all
    machines using the Berkeley UNIX Internet IPC protocols. Finally, srx
    gives control to the initial code of the main resource on the main
    machine. Srx terminates only after each of the machines has terminated.

SEE ALSO
    src(1L), srl(1L)

DIAGNOSTICS
    Srx will abort if it cannot find the package file or VMLF's or if it does
    not have read and execute permissions for those files, respectively. Srx
    can also fail if Internet addresses cannot be created or bound (this
    usually indicates a system problem). In any case, an appropriate error
    message will be produced.

BUGS
    The full functionality of this command has not yet been realized. The
    implicit termination detection facility is not yet available. The program
    will therefore "hang" when it is finished. This can be remedied by
    killing the program with the DELETE key or inserting a stop statement at
    the appropriate place in the program.


References

[Ada83] Reference Manual for the Ada Programming Language. ANSI/MIL-STD-1815A, January 1983.

[Allc83] Allchin, J.E. and McKendry, M.S. Synchronization and recovery of actions. Proc. of the Second Annual ACM Symp. on Prin. of Dist. Comp., Montreal, Canada (Aug. 1983), 31-44.

[Alme85] Almes, G.T., Black, A.P., Lazowska, E.D., and Noe, J.D. The Eden system: A technical review. IEEE Trans. on Soft. Engr. SE-11, 1 (Jan. 1985), 43-58.

[Andr81] Andrews, G.R. Synchronizing Resources. ACM Trans. on Prog. Lang. and Systems 3, 4 (Oct. 1981), 405-430.

[Andr82a] Andrews, G.R. An alternative approach to arrays. Software-Practice and Experience 12, 5 (May 1982), 475-485.

[Andr82b] Andrews, G.R. The distributed programming language SR-mechanisms, design and implementation. Software-Practice and Experience 12, 8 (Aug. 1982), 719-754.

[Andr83] Andrews, G.R. and Schneider, F.B. Concepts and notations for concurrent programming. ACM Computing Surveys 15, 1 (March 1983), 3-43.

[Andr85] Andrews, G.R. and Olsson, R.A. Report on the distributed programming language SR. TR 85-23, Dept. of Computer Science, The University of Arizona, November 1985.

[Andr86] Andrews, G.R., Schlichting, R.D., Hayes, R., and Purdin, T. The design of the Saguaro distributed operating system. IEEE Trans. Softw. Eng. SE-12, 12 (Dec. 1986), to appear.

[Bern86] Bernstein, A.J. Predicate transfer and timeout in message passing systems. Information Processing Letters, 1986, to appear.

[Blac82] Black, A.P. Exception handling: The case against. Ph.D. dissertation, Oxford University. Available as TR 82-01-02, Dept. of Computer Science, The University of Washington, January 1982.

[Blac84] Black, A.P., Hutchinson, N., McCord, B.C., and Raj, R.K. EPL programmer's guide. Eden Project, Dept. of Computer Science, University of Washington, June 1984.

[Blac85] Black, A.P. Supporting distributed applications: Experience with Eden. Proc. 10th Symposium on Operating Systems Principles, Orcas Island, WA, December 1985, 181-193.

[Brin78] Brinch Hansen, P. Distributed processes: A concurrent programming construct. Comm. ACM 21, 11 (Nov. 1978), 934-941.

[Chan79] Chang, E.J. Decentralized algorithms in distributed systems. Ph.D. dissertation, Technical Report CSRG-103, Dept. of Computer Science, University of Toronto, October 1979.

[Chan84] Chandy, K.M. and Misra, J. The drinking philosophers problem. ACM Trans. on Prog. Lang. and Systems 6, 4 (October 1984), 632-646.

[Cher84] Cheriton, D.R. The V kernel: A software base for distributed systems. IEEE Software 1, 2 (April 1984), 19-42.

[Cher85] Cheriton, D.R. and Zwaenepoel, W. Distributed process groups in the V kernel. ACM Trans. on Computer Systems 3, 2 (May 1985), 77-107.

[Clar85] Clark, D.D. The structuring of systems using upcalls. Proc. 10th Symposium on Operating Systems Principles, Orcas Island, WA, December 1985, 171-180.

[Clar80] Clarke, L.A., Wileden, J.C., and Wolf, A.L. Nesting in Ada programs is for the birds. Proc. of the ACM-SIGPLAN Symposium on the Ada Programming Language, Boston, MA, December 1980, 139-145.

[Cook80] Cook, R. *Mod-a language for distributed programming. IEEE Trans. Softw. Eng. SE-6, 6 (Nov. 1980), 563-571.

[Coop84] Cooper, E.C. Replicated procedure call. Proc. 3rd ACM Symp. on Principles of Distributed Computing, Vancouver, BC, August 1984, 220-232.

[Dijk68] Dijkstra, E.W. Cooperating sequential processes. In F. Genuys (Ed.), Programming Languages. Academic Press, New York, 1968.

[Dijk75] Dijkstra, E.W. Guarded commands, nondeterminacy, and formal derivation of programs. Comm. ACM 18, 8 (Aug. 1975), 453-457.

[Feld79] Feldman, J.A. High level programming for distributed computing. Comm. ACM 22, 6 (June 1979), 353-368.

[Geha85] Gehani, N.H. and Roome, W.D. Concurrent C. AT&T Bell Laboratories Report, 1985.

[Gele85] Gelernter, D. Generative communication in Linda. ACM Trans. on Prog. Lang. and Systems 7, 1 (Jan. 1985), 80-112.

[Giff79] Gifford, D.K. Weighted voting for replicated data. Proc. 7th Symposium on Operating Systems Principles, Pacific Grove, CA, December 1979, 150-162.

[Hoar73] Hoare, C.A.R. Hints on programming language design. SIGACT/SIGPLAN Symposium on Principles of Programming Languages, Boston, October 1973.

[Hoar78] Hoare, C.A.R. Communicating sequential processes. Comm. ACM 21, 8 (Aug. 1978), 666-677.

[Hoar81] Hoare, C.A.R. The emperor's old clothes. Comm. ACM 24, 2 (Feb. 1981), 75-83.

[Holt83] Holt, R.C. Concurrent Euclid, the Unix system, and Tunis. Addison-Wesley, 1983.

[IBM66] IBM System/360 operating system PL/I language specifications. IBM form C28-6571-4, 1966.

[Lamp82] Lamport, L., Shostak, R., and Pease, M. The Byzantine generals problem. ACM Trans. on Prog. Lang. and Systems 4, 3 (July 1982), 382-401.

[Lamp77] Lampson, B.W., Horning, J.J., London, R.L., Mitchell, J.G., and Popek, G.J. Report on the programming language Euclid. SIGPLAN Notices 12, 2 (Feb. 1977), 1-79.


[Lisk81] Liskov, B. et al. CLU Reference Manual. Lecture Notes in Computer Science 114, Springer-Verlag, Berlin, 1981.

[Lisk83a] Liskov, B. and Scheifler, R. Guardians and actions: Linguistic support for robust, distributed programs. ACM Trans. on Prog. Lang. and Systems 5, 3 (July 1983), 381-404.

[Lisk83b] Liskov, B. and Herlihy, M. Issues in process and communications structure for distributed programs. Proc. Third Symposium on Reliability in Distributed Software and Database Systems, Clearwater Beach, Florida, October 1983, 123-132.

[Lisk86] Liskov, B., Herlihy, M., and Gilbert, L. Limitations of remote procedure call and static process structure for distributed computing. Proc. 13th ACM Symp. on Principles of Programming Languages, St. Petersburg, Florida, January 1986.

[Mitc79] Mitchell, J.G., Maybury, W., and Sweet, R. Mesa language manual, version 5.0. Rep. CSL-79-3, Xerox Palo Alto Research Center, April 1979.

[Olss84a] Olsson, R.A. and Andrews, G.R. SuccessoR: Refinements to SR. TR 84-3, Dept. of Computer Science, The University of Arizona, March 1984.

[Olss84b] Olsson, R.A. and Andrews, G.R. An implementation of SuccessoR. TR 84-4, Dept. of Computer Science, The University of Arizona, March 1984.

[Parr83] Parr, F.N. and Strom, R.E. NIL: A high-level language for distributed systems programming. IBM Systems Journal 22, 1/2 (1983), 111-127.

[Pete81] Peterson, G.L. Myths about the mutual exclusion problem. Information Processing Letters 12, 3 (June 1981), 115-116.

[Powe83] Powell, M.L. and Miller, B.P. Process migration in DEMOS/MP. Proc. 9th SIGOPS Symp. on Operating Systems Principles, Bretton Woods, NH (Oct. 1983), 110-119.

[Rovn85] Rovner, P., Levin, R., and Wick, J. On extending Modula-2 for building large, integrated systems. Technical Report 3, Digital Equipment Corporation Systems Research Center, January 1985.

[Schl83] Schlichting, R.D. and Schneider, F.B. Fail-stop processors: An approach to designing fault-tolerant computing systems. ACM Trans. on Computer Systems 1, 3 (Aug. 1983), 222-238.

[Schl86] Schlichting, R.D. and Purdin, T.D.M. Failure handling in distributed programming languages. Proc. of the Fifth Symposium on Reliability in Distributed Software and Database Systems, Los Angeles, January 1986, 59-66.

[Scot83] Scott, M.L. Messages vs. remote procedures is a false dichotomy. SIGPLAN Notices 18, 5 (May 1983), 57-62.

[Scot86] Scott, M.L. Language support for loosely-coupled distributed programs. TR 183, Dept. of Computer Science, The University of Rochester, January 1986.

[Stro83] Strom, R.E. and Yemini, S. NIL: An integrated language and system for distributed programming. Research Report RC 9949, IBM Research Division, April 1983.

[Tane83] Tanenbaum, A.S., van Staveren, H., Keizer, E.G., and Stevenson, J.W. A practical tool kit for making portable compilers. Comm. ACM 26, 9 (Sept. 1983), 654-660.

[Thom78] Thompson, K. UNIX implementation. Bell System Technical Journal 57, 6, part 2 (July-August 1978), 1931-1946.

[Walk83] Walker, B., Popek, G., English, R., Kline, C., and Thiel, G. The LOCUS distributed operating system. Proc. 9th SIGOPS Symp. on Operating Systems Principles, Bretton Woods, NH (Oct. 1983), 49-70.

[Wexe86] Wexelblat, R.L. Editorial. SIGPLAN Notices 21, 3 (March 1986), 1.

[Wirt77] Wirth, N. Modula: A language for modular multiprogramming. Software: Practice and Experience 7 (1977), 3-35.

[Wirt82] Wirth, N. Programming in Modula-2. Springer-Verlag, New York, 1982.