The Tool Daemon Protocol: Defining the Interface Between Tools and Process Management Systems


PTools Annual Meeting, Knoxville, TN, 10-12 September 2002

Paradyn Group, Condor Group
{paradyn,condor}@cs.wisc.edu
Computer Sciences Department, University of Wisconsin
Madison, Wisconsin 53705, USA

Ana Cortés, Miquel A. Senar
{miquelangel.senar,ana.cortes}@uab.es
Departament d’Informàtica, Universitat Autònoma de Barcelona
Barcelona, Spain

Presented by Philip C. Roth
[email protected]

2 Tool Daemon Protocol

The Current Situation

Consider a job submitted to a process management system (e.g., Condor, PBS, Globus, MPICH’s MPD). The process manager…

• …starts the job’s processes
  • Sets up file I/O
  • Sets up standard I/O
• …monitors process status
• …controls the job

[Diagram: the Process Manager Daemon monitors and controls two Application Processes.]

3 Tool Daemon Protocol

The Current Situation

Next, consider a tool wanting to monitor the job. The tool…

• …also may want to start the processes (or attach to them)
• …also needs to monitor process status
• …also may want to control the job
• …also may want access to file I/O or standard I/O
• …needs to communicate with its front-end

[Diagram: the Process Manager Daemon monitors and controls two Application Processes; the Tool Daemon's links to them are marked "?".]

4 Tool Daemon Protocol

The Current Situation

So, who wins?

[Diagram: the Process Manager Daemon monitors and controls two Application Processes; the Tool Daemon's links to them are marked "?".]

5 Tool Daemon Protocol

The Current Situation

• Process managers are many and varied
  • E.g., IBM POE, SGI Origin MPI, and MPICH all work differently
• Some process managers have support for specific tools
  • E.g., MPICH support for the TotalView debugger
• Heading for an m × n combination of m process managers and n tools

Bottom line: process managers and tools need a standard interface in order to coexist: the Tool Daemon Protocol (TDP)

6 Tool Daemon Protocol

TDP: The Tool Daemon Protocol

• Defines an API between the process management system and tool processes for…
  1. Creating processes
  2. Controlling processes
  3. Sharing information between processes

• Pilot implementation: trying out ideas to see what works

7 Tool Daemon Protocol

TDP Job Startup Sequence

[Diagram: Execution Host and Local Host; components: Tool Front-End, Process Manager Daemon; arrow labeled "Create job".]

1. Tool submits job request to process management system

8 Tool Daemon Protocol

TDP Job Startup Sequence

[Diagram: Execution Host and Local Host; components: Tool Front-End, Process Manager Daemon, Application Process.]

2. Process manager creates the application process, leaving it suspended (“pause on exec”)

9 Tool Daemon Protocol

TDP Job Startup Sequence

[Diagram: Execution Host and Local Host; components: Tool Front-End, Process Manager Daemon, Tool Daemon, Application Process; label: TDP.]

3. PM daemon creates tool daemon process (if necessary)

10 Tool Daemon Protocol

TDP Job Startup Sequence

[Diagram: Execution Host and Local Host; components: Tool Front-End, Process Manager Daemon, Tool Daemon, Application Process; arrow labeled "PID, host/port pairs".]

4. PM daemon passes information to tool daemon (e.g., process PID, front-end host/port, standard I/O host/port)

11 Tool Daemon Protocol

TDP Job Startup Sequence

[Diagram: Execution Host and Local Host; components: Tool Front-End, Process Manager Daemon, Tool Daemon, Application Process.]

5. Tool daemon examines the application process (e.g., parses symbols, discovers static call graph)

12 Tool Daemon Protocol

TDP Job Startup Sequence

[Diagram: Execution Host and Local Host; components: Tool Front-End, Process Manager Daemon, Tool Daemon, Application Process.]

6. App process is allowed to run
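The six steps above can be approximated on a single POSIX host with ordinary fork, stop, and continue calls. The sketch below is not the TDP pilot implementation: the function names `spawn_paused` and `resume_and_wait` are hypothetical, and stopping the child with SIGSTOP just before `exec` only approximates a true pause on exec, which would stop the process after the new image has been loaded.

```python
import os
import signal
import sys


def spawn_paused(argv):
    """Step 2 (approximation): create the application process, left stopped."""
    pid = os.fork()
    if pid == 0:
        # Child: stop ourselves before exec so nothing runs until resumed.
        os.kill(os.getpid(), signal.SIGSTOP)
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # reached only if exec failed
    # Parent: block until the child reports that it has stopped.
    _, status = os.waitpid(pid, os.WUNTRACED)
    if not os.WIFSTOPPED(status):
        raise RuntimeError("child did not stop as expected")
    return pid


def resume_and_wait(pid):
    """Step 6: let the application run, then collect its exit code."""
    os.kill(pid, signal.SIGCONT)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else None


if __name__ == "__main__":
    app_pid = spawn_paused([sys.executable, "-c", "raise SystemExit(7)"])
    # Steps 3-5 would happen here: start the tool daemon, hand it app_pid,
    # and let it examine the stopped process before it is resumed.
    print(resume_and_wait(app_pid))
```

In this arrangement the process manager stays the parent of the application, so it is the one that collects the exit status.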

13 Tool Daemon Protocol

TDP Pilot Implementation

• Goals
  • To try out TDP ideas and see what makes sense in a real environment
  • To collect informed suggestions for a standard

• The software
  • Two well-established packages at U. Wisconsin-Madison
    • Paradyn performance tool
    • Condor resource management system

14 Tool Daemon Protocol

Challenges

1. Process startup
2. Notification of exited processes
3. Inter-process communication
  • Mechanism
  • Identification of information to be transferred
  • Asynchronous notifications
4. Private networks and firewalls
  • Tool daemon communicating to front-end
  • Application process sending standard I/O

15 Tool Daemon Protocol

Challenge: Process Startup

• Most functionality already in place, but not in the right place
  • Need to refactor process startup logic between process manager daemon and tool daemon

• Control handoff (process manager daemon to tool daemon) difficult under some OSs
  • E.g., Linux: two scheduling race conditions between application process and tool daemon

16 Tool Daemon Protocol

Challenge: Exit Process Notification

• Want the starter to be aware if the app or tool daemon process exits
• Process exit notification (e.g., SIGCHLD to the parent under UNIX/Linux)

[Diagram: the starter is the parent of both paradynd and the App; SIGCHLD from each goes to the starter.]
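As a minimal illustration of exit notification on UNIX/Linux, a parent can install a SIGCHLD handler and reap its children there. In this sketch the script itself plays the starter's role; the handler and helper names are hypothetical, and this shows the signal mechanism only, not Condor's starter code.

```python
import os
import signal
import time

exited = {}  # pid -> wait status, filled in by the SIGCHLD handler


def on_sigchld(signum, frame):
    """Reap every child that has exited so far (signals may coalesce)."""
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children left
        if pid == 0:
            return  # children exist, but none has exited yet
        exited[pid] = status


def run_child_and_wait(exit_code, timeout=5.0):
    """Fork a child that exits at once; return its status once notified."""
    previous = signal.signal(signal.SIGCHLD, on_sigchld)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(exit_code)
        deadline = time.monotonic() + timeout
        while pid not in exited and time.monotonic() < deadline:
            time.sleep(0.01)
        return exited.get(pid)
    finally:
        # Put back whatever handler was installed before.
        signal.signal(
            signal.SIGCHLD,
            previous if previous is not None else signal.SIG_DFL,
        )
```

The handler loops with `WNOHANG` because several child exits can be reported by a single SIGCHLD delivery.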

17 Tool Daemon Protocol

Challenge: Exit Process Notification

• Parental relationships may change when the tool daemon attaches
  • E.g., Linux: the daemon process becomes the app process’ parent

On the app process’ termination, SIGCHLD is sent to paradynd, NOT to the Condor starter

[Diagram: the starter is the parent of paradynd, and paradynd is the parent of the App; the App's SIGCHLD goes to paradynd.]

18 Tool Daemon Protocol

Challenge: Exit Process Notification

[Diagram: paradynd, App, and starter; SIGCHLD reaches the starter.]

• SIGCHLD delivered to Condor starter only if paradynd calls wait()

Condor must trust the monitoring daemon or poll the application process’ state
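If the starter cannot rely on the tool daemon to call wait(), the fallback named above is polling. A minimal sketch, assuming a POSIX host: sending signal 0 with `os.kill` checks whether a PID still exists without affecting the process. The helper names are hypothetical, and a process that has exited but has not yet been reaped (a zombie) still counts as existing.

```python
import os
import subprocess
import sys
import time


def process_exists(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but is owned by another user
    return True


def poll_until_gone(pid, interval=0.05, timeout=5.0):
    """Poll until the PID disappears; return False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not process_exists(pid):
            return True
        time.sleep(interval)
    return False


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    child.wait()  # reap the child so that its PID really disappears
    print(poll_until_gone(child.pid))
```

Polling of this kind cannot recover the exit status, and PIDs can be reused, which is part of why it is a fallback rather than a replacement for exit notification.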

19 Tool Daemon Protocol

Challenge: Information Transfer

• “Attribute Space”
  • {name, value} pairs shared between processes
  • Mainly, intra-host sharing between process manager daemon and tool daemon
  • Also sharing between tool front-end and tool daemon
    • E.g., application PIDs for front end
  • Basic idea from MPICH
  • Not a Linda tuple space
  • Not a global shared environment space

20 Tool Daemon Protocol

Attribute Space (Execution Host)

[Diagram: Process Manager Daemon, Tool Daemon, and Application Process on the execution host, with the attribute space contents shown below.]

PID=2473
FE_host=cham.cs.wisc.edu
FE_port=7331

tdp_put("PID", "2473")
tdp_put("FE_host", "cham.cs.wisc.edu")
tdp_put("FE_port", "7331")

21 Tool Daemon Protocol

Attribute Space (Execution Host)

[Diagram: Process Manager Daemon, Tool Daemon, and Application Process on the execution host, with the attribute space contents shown below.]

PID=2473
FE_host=cham.cs.wisc.edu
FE_port=7331

tdp_get("PID")
tdp_get("FE_host")
tdp_get("FE_port")
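The put/get exchange on these two slides can be mimicked with an in-memory dictionary. This is only a sketch of the idea: the `AttributeSpace` class is hypothetical, the real attribute space is shared between separate processes, and the slides show no more of the `tdp_put` and `tdp_get` signatures than the calls above.

```python
import threading


class AttributeSpace:
    """In-memory stand-in for a per-host space of {name, value} pairs."""

    def __init__(self):
        self._attrs = {}
        self._lock = threading.Lock()

    def tdp_put(self, name, value):
        """Store or overwrite the value published under a name."""
        with self._lock:
            self._attrs[name] = value

    def tdp_get(self, name, default=None):
        """Return the value for a name, or default if it was never put."""
        with self._lock:
            return self._attrs.get(name, default)


space = AttributeSpace()

# Publishing side: the values shown in the attribute space above.
space.tdp_put("PID", "2473")
space.tdp_put("FE_host", "cham.cs.wisc.edu")
space.tdp_put("FE_port", "7331")

# Consuming side: look up the same names and convert as needed.
pid = int(space.tdp_get("PID"))
front_end = (space.tdp_get("FE_host"), int(space.tdp_get("FE_port")))
```

Values are kept as strings here because that is how they appear in the slide's calls; any richer typing would be a further assumption.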

22 Tool Daemon Protocol

Challenge: Asynchronous Notification

• Uses attribute space
• In process interested in event notification, register action
  tdp_register_notify(handle, event, action)
• In event-generating process, deliver event to attribute space
  tdp_put(event, value)
• Value available in action function
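The register-then-put pattern above can be sketched with callbacks attached to an attribute space. Several things here are assumptions: the `NotifyingAttributeSpace` class is hypothetical, the object itself stands in for the `handle` argument, and actions run synchronously inside `tdp_put`, whereas the real mechanism delivers notifications between processes.

```python
class NotifyingAttributeSpace:
    """Attribute space stand-in that runs registered actions on put."""

    def __init__(self):
        self._attrs = {}
        self._actions = {}  # event name -> list of action callables

    def tdp_register_notify(self, event, action):
        """Ask for action(event, value) to run whenever event is put."""
        self._actions.setdefault(event, []).append(action)

    def tdp_put(self, event, value):
        """Store the value, then notify every action registered for it."""
        self._attrs[event] = value
        for action in list(self._actions.get(event, ())):
            action(event, value)

    def tdp_get(self, name, default=None):
        return self._attrs.get(name, default)


if __name__ == "__main__":
    space = NotifyingAttributeSpace()
    # Interested process: register an action for a hypothetical event name.
    space.tdp_register_notify("app_exited", lambda e, v: print(e, v))
    # Event-generating process: deliver the event with its value.
    space.tdp_put("app_exited", "0")
```

The value passed to `tdp_put` is handed straight to the action, matching the last bullet above.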

23 Tool Daemon Protocol

Challenge: Firewalls and Private Nets

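[Diagram: Remote Host and Local Host separated by a Firewall; components: Tool Front-End, Process Manager Daemon, Tool Daemon, Application Process; communication across the Firewall is blocked (X).]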

24 Tool Daemon Protocol

Challenge: Firewalls and Private Nets

[Diagram: Remote Host and Local Host separated by a Firewall; components: Tool Front-End, Process Manager Daemon, Tool Daemon, Application Process; a Comm Proxy sits at the Firewall.]
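The slides name a communication proxy but do not describe how it works. As a generic illustration of the idea only, the sketch below relays a single TCP connection between a client and a target that the client cannot reach directly. The function names are hypothetical, and nothing here reflects the actual TDP or Condor proxy design.

```python
import socket
import threading


def _pump(src, dst):
    """Copy bytes from src to dst until src reaches end-of-stream."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass the end-of-stream along
        except OSError:
            pass


def start_proxy(target_host, target_port, listen_host="127.0.0.1"):
    """Relay one connection to the target; return the listening port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((listen_host, 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def serve():
        client, _ = listener.accept()
        listener.close()
        upstream = socket.create_connection((target_host, target_port))
        back = threading.Thread(
            target=_pump, args=(upstream, client), daemon=True
        )
        back.start()
        _pump(client, upstream)
        back.join()
        client.close()
        upstream.close()

    threading.Thread(target=serve, daemon=True).start()
    return port
```

A tool daemon would then connect to the proxy's port instead of the front-end's, with the proxy running on a host that is allowed through the firewall.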

25 Tool Daemon Protocol

Status

• Pilot implementation nearly complete
  • Paradyn with jobs submitted to Condor
  • Linux 2.4
  • “Create process” model
  • Condor “vanilla” and “MPI” universes
  • Remaining work: library packaging, documentation

• Periodic planning meetings
  • Paradyn (Miller)
  • Condor (Livny)
  • U. Barcelona (Cortés, Senar)
  • TUM (Wismüller)
  • U. Vienna (Fahringer)
  • U. Tennessee (Moore)
  • MPICH (Butler, Gropp, Lusk)
  • Etnus (Cownie, Delsignore)
  • Globus (Kesselman)
  • HP/Compaq (Balle)
  • Pallas (Vampir group)

26 Tool Daemon Protocol

The Path Forward

• Identify necessary information exchange between principals

• Complete design, implement attribute space as standalone package

• Get other tool builders, process management system builders involved
  • Integrate TDP ideas into their systems to see what works

27 Tool Daemon Protocol

Summary

• TDP standardizes the interface between process management systems and tools
  • API for tools and management systems
  • Support libraries
  • Distributed attribute space

• Avoids the propagation of tool- and process manager-specific interfaces

• Pilot implementation nearly complete

28 Tool Daemon Protocol

TDP: The Tool Daemon Protocol

These are the early stages of this important effort, and we want your participation!
• Draft report in progress, available for review and comments soon
• Web: http://www.cs.wisc.edu/tdp
• Email: [email protected]

Pilot Implementation Team

• Barton Miller, Philip Roth, Brandon Schendel, Victor Zandy
• Miron Livny, Todd Tannenbaum, Derek Wright
• Ana Cortés, Miquel A. Senar