Collaborating with Petaobjects: The TJNAF Virtual Experiment Environment

Executive Summary

This proposal seeks to develop a key collaboratory environment for a central DOE application – modern, sophisticated high-energy and nuclear physics experiments. The primary goal of this project is to develop a collaborative, problem-solving environment so that everything associated with the Hall D experiment being planned for the Thomas Jefferson National Accelerator Facility — the entire complex, evolving network including the detector, experimental measurements, subsequent analyses, computer systems, technicians, and experimenters — can be integrated into a simple, collaborative fabric of information resources and tools. This Virtual Experiment Environment (VEE) will make it possible for groups of distributed collaborators to conduct, analyze, and publish experiments based on the composition and analysis of these resource objects. The scientific focus of this project is an experimental search for gluonic excitations among hadrons produced in photoproduction, with the ultimate goal of understanding the nature of confinement in quantum chromodynamics, i.e., why quarks are forever bound in the hadrons of which they are constituents.

Our approach has several key and novel features and is designed to address issues coming from both previous research and a detailed analysis of major commercial tools in the collaboration and object management area. The foundation of this system is the Garnet Collaborative Portal (GCP), which uses an integrated distributed object framework to specify all the needed object properties, including their rendering and their collaborative features. Our existing Gateway system is being integrated into GCP to provide a computing portal supporting collaborative job preparation and visualization. GCP is implemented in a Grid Service framework that includes several key ideas, including the systematic use of small XML-based objects containing just the necessary meta-data to allow scalable management and sharing of the quadrillions of objects. This is implemented as a Web environment, MyXoS, controlled by XML scripts initially built using RDF. We address high performance at several levels, from the design of the object system to the use of a reconfigurable server network to support the Grid message service and peer-to-peer network on which MyXoS is built. A single publish/subscribe message service extending the industry-standard JMS supports synchronous and asynchronous collaboration. A hierarchical XML schema covers events, portalML (the user view) and resourceML (basic resources). The proposal combines innovative research into these issues with a staged deployment allowing for careful evaluation and user feedback on the VEE functionality, performance and implementation.

This project is critical to the Hall D scientific and computing efforts and needs to begin as soon as possible. It will help establish an overall structure for the organization of the Hall D computing efforts, and it will make it practical for the Hall D collaboration to create an efficient grid-computing environment that reduces computing costs and attracts collaborators. The current computing and data management effort in Hall B at TJNAF faces challenges similar to Hall D (at roughly 1/10 the expected Hall D rate). The similarity of the computing and collaboration needs in Halls B and D provides an opportunity to support both efforts simultaneously. The early application of the VEE concept to simulations and first-pass analysis within Hall B will allow us to make an important step in improving Hall B's computing environment.

The FSU and Indiana University principal investigators include the physics (Alex Dzierba) and computing (Larry Dennis) leadership of the Hall D experiment, together with the expertise of Geoffrey Fox, who leads the design and implementation of the collaborative environment and has participated in several major high-energy physics experiments. This team is working in partnership with Jefferson Lab personnel to define and create the Hall D computing environment at Jefferson Lab. Florida State University has provided significant matching support (~$240,000) for this project.

The effort required to successfully complete this three-year project is significant. We have adopted a very aggressive schedule in order to provide this problem-solving environment as early as possible. There are several reasons for this: this system needs to be in place early enough so that physicists can develop additional software tools that work with it; there is a significant computing effort that needs to take place before the experiments begin and the effectiveness of this effort relies on feedback from scientists using the VEE.

1. Introduction

This proposal seeks to develop a novel collaboratory environment for a central DOE application – a modern, sophisticated high-energy nuclear physics experiment. The work will contribute directly to computer science research – in particular the nature of collaboration services required for Grid-based applications. Further, there will be direct benefit to nuclear physics, as the project will develop new approaches to both experimental control and analysis software and indeed to the operational model for the large worldwide teams that are needed today. Finally, there will be contributions from the integration of the computer science and physics research; we believe that the application requirements are critical input to research in collaborative systems – we need to know what objects to share and in what fashion. Here we note that although this is a research proposal, we will develop a collaboratory that is robust and functional so that the physicists can and will use it. Lessons from this use will be a major driving force for the computer scientists.

The FSU, Indiana, and Jefferson Lab team brings together the physics and computing leadership of the Hall D [1] experiment. Dzierba (Indiana) is the scientific leader of the Hall D project and Dennis (FSU) is a member of the Hall D collaboration board and leader of the Hall D computing group. Further, Fox, who leads the design and implementation of the collaborative portal, participated in several major high-energy physics experiments, including two at Fermilab (E110 and E260), where as a physicist he led the analysis and simulation activities, writing most of the software and collaborating with Dzierba while they were both at Caltech. Riccardi, a computer scientist who has been a member of the Hall B collaboration for approximately 10 years, was instrumental in creating databases for recording online, analysis, simulation, and calibration information. Erlebacher, from FSU's new school of Computational Science and Information Technology, will lead the work on hand-held interfaces and collaborative visualization; he will be a major participant in developing XML infrastructure.

1.1 Experiments at Jefferson Lab

The physics experimental program is a search for gluonic excitations among hadrons produced in photoproduction, with the ultimate goal of understanding the nature of confinement in quantum chromodynamics, i.e., why quarks are forever bound in the hadrons of which they are the constituents. This search is being planned for Hall D at the Thomas Jefferson National Accelerator Facility (JLab), which like many modern scientific endeavors will produce large volumes of complex, high-quality data from many sources. These data include experimental measurements, information on the state of the experimental conditions, information describing the status and results of data analysis, and simulations of the detector required to understand its response during the experiment. The exploration of physical phenomena with Hall D depends critically upon our ability to efficiently extract information from the large volume of distributed data using a complex set of interrelated computing activities. This experiment is brand-new but relatively near-term; it can be designed to use new and different methodologies, and we expect to see immediate benefits and feedback on this. The experiment will not take data for about 5 years, but simulations are already being run and hardware and software decisions are being made. The lessons from this work will be broadly applicable to physics experiments in the nuclear and high-energy areas, and we will demonstrate this by applying some of this technology to the existing Hall B program.

Hall B at TJNAF faces similar challenges, except that it is currently in operation and generates experimental and simulation results at a rate of approximately 300 Tbytes per year, roughly a factor of 10 below the expected rate for Hall D. The similarity of the computing and collaboration needs in Halls B and D provides an opportunity to support both efforts simultaneously and gives us the chance to refine our middleware services in a production environment.

This project will develop a set of middleware services that create an environment in which everything associated with the Hall D experiment—the entire complex, evolving network including the detector, experimental measurements, subsequent analyses, computer systems, technicians, and experimenters—will be integrated into a simple, collaborative fabric of information resources and tools. The resultant Virtual Experiment Environment (VEE) will make it possible for groups of distributed collaborators to conduct, analyze, and publish results from experiments based on the composition and analysis of these resource objects.

One advantage of a new project like Hall D is that one can decide to build the entire infrastructure around new ideas without worrying too much about legacy concepts. Applications within the VEE will accept only properly described distributed objects as input and produce corresponding objects as output. While it is possible to achieve this from the start for Hall D, the conversion to distributed objects in Hall B will focus on particular applications, such as simulations and first-pass analysis. We estimate that the Hall D activities will create about 10^15 distinct objects for each year of its operation, with micro-objects like detector signals and processed information like track segments and particle identifications being grouped into larger event objects. In addition there will be objects describing reports and presentations and the information needed to specify the input and output of simulations initiated by any of the geographically distributed researchers involved in the experiment.

1.2 Overview of the Technical Approach

Our approach has several key and novel features that have been designed to address issues coming from both previous research [2] and a detailed analysis [3] of major commercial tools in the collaboration and object management area. We are building a system that provides a web interface to access and manipulate the Hall D objects and further allows this to be done collaboratively. This capability is formulated as the Garnet Collaborative Portal (GCP), which uses an integrated distributed object framework to specify all the needed object properties, including both their rendering and their collaborative features. This builds on our existing Gateway computing portal [4], which is being integrated into GCP with collaborative job preparation and visualization. Sharing a complex object is difficult, and systems not designed from scratch to integrate all object features will not be as effective. Including rendering information in an object's description allows one to customize rendering to different clients and so build collaborative environments where one can share the same object between hand-held and desktop devices.

We describe the technical approach in section 3 but give highlights here. We assume that we are building on a computational grid infrastructure and so can layer our high-level services on top of the capabilities under development by projects such as the Particle Physics Data Grid [5] and GriPhyN [6-7]. Users, Computers, Software applications, Sessions, and all forms of information (from physics DSTs to recordings of audio/video conferences) are all objects, which can be accessed from GCP. We estimate that, after aggregation of the logged events into runs, we will need to handle several tens of millions of explicit objects. These will all be self-defining; that is, they make explicit all the necessary metadata to enable GCP to perform needed functions such as searching, accessing, unpacking, rendering, sharing, specifying of parameters, and streaming data in and out of them. This metadata is defined using a carefully designed XML schema, GXOS, and exploits the new RDF framework. Typically GCP only manipulates the meta-objects formed from this metadata, so that we build a high-performance middleware that only performs control functions. This idea has been used successfully in our Gateway computing portal. The XML meta-objects that define the GCP point to the location of the object they define and can initiate computations and data transfers on them. Objects can be identified by a URI and referenced with it in either RDF resource links (such as <rdf:description about="URI" ...>) or fields in the GXOS specification. Three important URIs are the GXOS name, such as gndi://gxosroot/HallD/users/…, and the web locations of the meta-object and of the object itself. All objects in GXOS must have a unique name specified in a familiar (from file systems) hierarchical syntax.

Our software is largely written in Java (using Enterprise JavaBeans in the middle tier), but Java/XML is only the execution object model of the meta-objects; one can load persistently stored meta-objects or control target base objects formed by flat files, CORBA, .net (SOAP) or any distributed object system to which we can build a Java gateway. Our successful Gateway computational portal has used this strategy already; there all object interfaces are defined in XML but CORBA access is generated dynamically. Further, this system also only uses meta-objects and invokes programs and files using classic HPCC technology such as MPI. This strategy ensures we combine the advantages of highly functional commodity technologies and high-performance HPCC technologies.
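
To illustrate the meta-object idea in the Java middle tier, the following is a minimal sketch (our illustration only; the MetaObject class and its fields are hypothetical and not part of the existing Gateway or GCP code) of a meta-object that carries just enough metadata to locate and open its base object, whether that object is a local flat file or a web-accessible resource:

import java.io.FileInputStream;
import java.io.InputStream;
import java.net.URL;

// Hypothetical middle-tier meta-object: it holds only metadata (GNDI name, type,
// location) and defers all access to the base object until explicitly requested.
public class MetaObject {
    private final String gndiName;   // e.g. "gndi://gxosroot/HallD/users/..."
    private final String mimeType;   // e.g. "application/x-halld-run"
    private final String location;   // "file:" or "http:" URL of the base object

    public MetaObject(String gndiName, String mimeType, String location) {
        this.gndiName = gndiName;
        this.mimeType = mimeType;
        this.location = location;
    }

    public String getGndiName() { return gndiName; }
    public String getMimeType() { return mimeType; }

    // Open the base object only when a service actually needs its content.
    public InputStream openBaseObject() throws Exception {
        if (location.startsWith("file:")) {
            return new FileInputStream(location.substring("file:".length()));
        }
        return new URL(location).openStream();   // http:, ftp:, etc.
    }
}

The middleware would pass such meta-objects around and only call openBaseObject() when a computation or data transfer is actually initiated, which is the control-only behavior described above.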

1.3 Collaboration Technologies

GCP uses the shared event model of collaboration, where these events use the same base XML schema as the meta-objects describing the entities in the system. The uniform treatment of events and meta-objects enables us to use a simple universal persistency model, obtained by having a database client (shown in Figure 1) subscribe to all collaborative applications. Integration of synchronous and asynchronous collaboration is achieved by the use of the same publish/subscribe mechanism to support both modes. Hierarchical XML-based topic objects matched to XML-based subscription profiles specified in RDF (the Resource Description Framework from W3C) control this. Topics and profiles are also specified in GXOS and managed in the same way as meta-objects. These ideas imply new message and event services for the Grid, which must integrate events between applications and between clients and servers. This GMS (Grid Message Service) will
be one major focus of our research. One important extension, GMSME (GMS Micro Edition), handles messages and events on hand-held and other small devices. This assumes an auxiliary (personal) server or adaptor handling the interface between GMS and GMSME and offloading computationally intense chores from the handheld device. Currently we use JMS (Java Message Service) to provide publish/subscribe services for events in our prototype GCP but have already found serious limitations that we will address in GMS. The event-based synchronous collaboration model handles both the classic microscopic state changes (such as a change in the specification of the viewpoint of a visualization) and the transmitted frame-buffer updates for shared display, which our experience has shown to be the most generally useful sharing mode for objects. We also support shared export, where objects are converted to a common intermediate form for which a powerful general shared viewer is built; shared PDF, SVG, Java3D, HTML and image formats are important export formats.
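
As a concrete illustration of this single publish/subscribe mechanism, the sketch below uses only the standard JMS API on which our current prototype relies (the topic name, the JNDI lookup names, and the ArchiveSubscriber class are our own hypothetical examples, not GCP code); a database archiver obtains the universal persistency model simply by subscribing to the same topic as the interactive clients:

import javax.jms.*;
import javax.naming.InitialContext;

// Hypothetical archiver: subscribes to a collaboration topic and stores every
// XML event it receives, giving asynchronous clients a complete session record.
public class ArchiveSubscriber implements MessageListener {

    public void onMessage(Message message) {
        try {
            if (message instanceof TextMessage) {
                String xmlEvent = ((TextMessage) message).getText();
                store(xmlEvent);                      // e.g. insert into an event archive table
            }
        } catch (JMSException e) {
            e.printStackTrace();
        }
    }

    private void store(String xmlEvent) {
        // Database insert omitted in this sketch.
    }

    public static void main(String[] args) throws Exception {
        InitialContext jndi = new InitialContext();   // JNDI names are deployment specific
        TopicConnectionFactory factory =
            (TopicConnectionFactory) jndi.lookup("TopicConnectionFactory");
        Topic topic = (Topic) jndi.lookup("gcp/events/HallD");

        TopicConnection connection = factory.createTopicConnection();
        TopicSession session =
            connection.createTopicSession(false, Session.AUTO_ACKNOWLEDGE);
        session.createSubscriber(topic).setMessageListener(new ArchiveSubscriber());
        connection.start();                           // events now flow into onMessage()
    }
}

Because the archiver is just another subscriber on the session topic, replay and asynchronous access require no machinery beyond the message service itself.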

1.4 Distributed Programming Models

Although the use of XML-based objects is relatively well understood, there appears to be less consensus as to the distributed programming or execution model needed to build the services required by applications. In other words, what is the distributed operating system for the objects and meta-objects? The Ninja project [8] at UC Berkeley is addressing such issues with a philosophy similar to our approach, which is termed MyXoS and supports GCP with such capabilities as the creation, access, copying and editing of meta-objects. MyXoS has a "shell" similar to that provided by UNIX but specified (at a low level) by RDF statements and aimed at manipulating GXOS objects rather than UNIX files. W3C likes to talk about the Semantic Web [9] formed by the synergistic interaction of web resources, and this intriguing concept is an underlying research issue for systems like Ninja and MyXoS. Another important trend is peer-to-peer computing (P2P), with recent work typified by JXTA from Bill Joy at Sun Microsystems [10]. As shown in Figure 1, collaborative systems create P2P networks, although in our approach (and most other systems) this is an "illusion": the P2P environment is created by the routing of messages through a network of servers. Here another interesting research issue is how best to perform this mix of software and hardware multicast and where the servers should be placed; MyXoS allows the dynamic instantiation of servers to support clusters of clients with similar subscription profiles. The message routing strategy needs to integrate the published topic and subscription profile objects and is quite complicated for heterogeneous client subscriptions.

1.5 Research Challenges

The GCP prototype available in May 2001 will only implement rudimentary versions of these ideas, and the extensions needed for Hall D imply major research and implementation challenges. Research areas include the Grid Message Service as well as the Grid server architecture implementing the "illusion of P2P". We will also research the structure of MyXoS and the scalable management of large numbers of meta-objects in an efficient, effective fashion. Indeed, whether a model with separate objects and meta-objects is reasonable at all will be an important lesson from this research. Our use of RDF as the scripting language of the Semantic Web will challenge this relatively simple meta-data model and perhaps point the way to improvements. Integrating the event models of the different subsystems and of the synchronous and asynchronous
collaboration modes is another hard area. We will build in general support for shared export, but each case is nontrivial and implementation will need substantial interaction with users and careful software design.

From the application side, a major task will be to establish the formal specifications and procedures necessary to integrate all archived and derived information (distributed objects) into a common access framework and to apply them to create the required VEEs. We need to build extensions to the current GXOS XML Schema to handle the physics-specific resources. Our overarching goal is to incorporate collaborative object technology into a Grid-based, problem-solving framework that gives groups of experimenters the ability to conduct and manage complex, distributed, large-scale simulation and data analysis problems that involve quadrillions of individual objects. Thus, the VEE seeks to dramatically improve the end-user environment for managing the needed collaboration, data access, and computational tools that experimental teams use to conduct scientific research.

We believe there are significant benefits resulting from this effort and that this project is crucial to the successful conduct of the scientific program in Hall D. This effort will enable the Hall D collaboration to achieve quality and efficiency gains that translate directly into higher-quality physics results. By making it possible to locate some of the collaboration’s computing resources at facilities other than TJNAF, the collaboration can bring more of the scientific effort directly into these institutions, create the opportunity for university support for the activities and attract more students to the effort. This research is using very general technologies and we expect our ideas to be applicable to general collaboratories.

2. Description of the Hall D Virtual Experiment Environment

2.1 Hall D Virtual Experiment Environment

The Hall D VEE will serve as the organizational focal point for the analysis of the Hall D experiment and scientific research. The VEE will provide a collaborative end-user problem-solving environment to manage the needed collaboration, information sharing, large-scale data management, and computational tools that experimental teams use to conduct their scientific research and derive their scientific results. The environment integrates common synchronous collaboration tools such as instant messaging, chat, whiteboards, slide shows, and video and audio conferencing, as well as some less common features such as shared monitoring of the detector or other detailed experimental information. The VEE also serves as the portal for asynchronous collaboration tools such as internal newsgroups, document sharing and electronic logbooks. It will also incorporate large-scale computing resources into this collaborative environment. Doing so makes it possible to better manage software and computational resources and to immediately share the results of computations throughout the collaboration. By extending the VEE to include computing, we extend the information resources managed by the VEE to include virtual data (data that could be created as the result of some action by the user, but has not yet been created).

Figure 2. Hierarchical building blocks of the Hall D VEE.

In order to create this environment and then fully deploy it in a real experiment, we must develop a set of standards and procedures that physicists can use to meet their needs. These underlying standards are intended to ensure that the VEE is extensible to other uses within the Hall D project and is adaptable by other projects. Figure 2 illustrates the basic plan of building a comprehensive VEE from generic, enterprise, computational, and education and training portal technology.

To facilitate this integration, the experiment has committed to making all information within the Hall D project available as distributed objects with corresponding metadata descriptions. In addition, within the VEE, every application will accept only properly described distributed objects as input and produce corresponding objects as output. This immense collection of objects will be integrated by a set of tools that allows for their production, archiving, sharing and display. In some cases, in order to attain the high performance required for this vast object set, collections of objects would be "wrapped" with associated metadata to provide for the most common searching capabilities. This allows for many rapid searches without processing the entire object collection. Interfaces for some legacy codes would be created so that their inputs and outputs can be manipulated as objects while the codes function as before. Other codes would be developed to use this object technology from the start and thus could be better integrated into the VEE environment. Access to this information in the form of objects will allow us to create standard sets of tools for creating, locating and sharing the objects and to develop a host of other novel applications.
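
As a sketch of this wrapping idea (our illustration only; the CollectionWrapper class and its summary fields are hypothetical and do not correspond to existing Hall D software), a file holding many event objects can be fronted by a small summary object so that the most common searches never open the collection itself:

import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper: summary metadata for a stored collection of event objects,
// sufficient to answer common queries without reading the underlying file.
public class CollectionWrapper {
    private final String location;      // where the full collection is stored
    private final int runNumber;
    private final long eventCount;
    private final long startTime;       // covered time range, milliseconds since epoch
    private final long endTime;

    public CollectionWrapper(String location, int runNumber,
                             long eventCount, long startTime, long endTime) {
        this.location = location;
        this.runNumber = runNumber;
        this.eventCount = eventCount;
        this.startTime = startTime;
        this.endTime = endTime;
    }

    public long getEventCount() { return eventCount; }
    public String getLocation() { return location; }

    // A search over wrappers touches only this summary information.
    public boolean matches(int wantedRun, long notBefore, long notAfter) {
        return runNumber == wantedRun && endTime >= notBefore && startTime <= notAfter;
    }

    // Select the collections worth opening; only these are then fetched and unpacked.
    public static List select(List wrappers, int run, long notBefore, long notAfter) {
        List hits = new ArrayList();
        for (int i = 0; i < wrappers.size(); i++) {
            CollectionWrapper w = (CollectionWrapper) wrappers.get(i);
            if (w.matches(run, notBefore, notAfter)) hits.add(w.getLocation());
        }
        return hits;
    }
}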

2.2 Hall D Grid

The approximately 25 geographically distributed institutions involved in the Hall D experimental program will collaborate to build the detector and develop data acquisition and analysis software, run the experiments, analyze the measurements and publish the results. When the detector is completed, physicists from these remote institutions will travel to JLab (Newport News, VA) to conduct experiments year-round (whenever the accelerator can deliver beam). It is reasonable to assume that these experiments will continue for approximately 10 years.

The Hall D project is now being considered by the Nuclear Science Advisory Committee (NSAC) and will be ranked among other projects as this committee deliberates the long-range plan for the nuclear physics community. The decision will be known by early April 2001. There is good reason to expect strong approval. Jefferson Lab has made the Hall D project the premier science case for its accelerator upgrade project. A committee of high-energy and nuclear physicists reviewed the project and issued a report in early 2000 stating that this experiment would be the definitive search for gluonic excitations and that the approach proposed by the collaboration was sound.

Funding for these activities comes primarily from the Department of Energy, the National Science Foundation, and various universities. The effort described in this proposal should not be confused with the much larger effort required to provide the entire computer infrastructure and application software required by Hall D. The Hall D computing effort will require millions of dollars in computer hardware and hundreds of years of effort by a large group of people. This proposal seeks funding to research, and to test through prototype deployment, a critical piece of the overall Hall D computing effort that will be of greatest value if it is completed well before data taking begins in Hall D. In addition to the funding requested here, we will seek funds and collaborators in the application of this project from a variety of other resources, including but not limited to DOE high-energy and nuclear physics programs and university resources. If DOE approves the Hall D project, it is reasonable to expect that there will be a significant redirection of currently funded nuclear physics research efforts, of Jefferson Lab resources and of the research efforts of our collaborators towards Hall D computing efforts. The initial estimated computing equipment budget for Hall D is over $4M, excluding personnel – who make up the bulk of the cost of the project. The timing of hires and the roles of computing personnel are currently being evaluated by the Hall D collaboration and Jefferson Lab.

Figure 3. Schematic showing several major components of the Hall D Grid.

As participants in the Hall D experiment, scientists are required to jointly develop plans for detector construction, software, and experiments. Analysis of the experimental data and the production and publication of final results require participation of the entire collaboration. Such activities place a premium on close collaboration and as a result have historically been conducted primarily by research groups that reside near the experiment or that are able to place scientists in residence at the experiment.

We will integrate this research project into the Hall D management plan to help ensure the successful completion of both this and the Hall D project. The integration helps us leverage the resources necessary to define and create the required Hall D objects and to ensure that the real-world needs of the Hall D collaboration are met.

The primary Hall D computing activities are quite diverse and must be supported by a variety of high-performance computer systems. Experimental data acquisition collects measurements from the detector components, eliminates unnecessary data items, organizes the resulting information as objects, and stores the results. First-pass offline analysis converts the experimental information (each event is about 1000 smaller raw-data micro-objects) into more meaningful physical information (processed objects), such as particles, momenta, and energies. Detector calibrations provide information for converting measured electrical pulses to times, energies, and positions. Event simulations model the detector’s response to particles of known masses, energies, and momenta. Physics analyses use the measured particles, momenta, and energies together with simulated detection efficiencies to reconstruct the reactions taking place.

The current plan for the operation of Hall D indicates that it should generate approximately three Pbytes/year from three main sources: experimental data (~0.75 Pbytes/year), offline analysis (~1.5 Pbytes/year) and event simulations (~0.75 Pbytes/year). This implies that each year the experiment will generate some 10^15 basic objects (nuggets or micro-objects of information) organized into 10^12 macro-objects such as events or analysis instances. Even though the Hall D detector will generate experimental information at a very high rate (up to 100 Mbytes/s), computer programs such as the offline analysis and event simulations will generate the majority of the information.
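
A rough consistency check of these rates (our own back-of-envelope estimate; the ~10^7 live seconds per year of data taking is our assumption and is not stated in this proposal):

100 Mbytes/s × ~10^7 s/year ≈ 1 Pbyte/year of raw data, consistent with the ~0.75 Pbytes/year of experimental data above;
3 Pbytes/year ÷ 10^12 macro-objects/year ≈ 3 Kbytes per event;
3 Kbytes per event spread over 50-1000 micro-objects gives a few bytes to a few tens of bytes per micro-object.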

The distributed computing resources required for the Hall D experiment will be created as a set of computer systems, each designed to address a specific computing task. The computing activities are described here and shown schematically in Figure 3. This illustrates the scope and diversity of the information involved in the Hall D project and emphasizes the natural organization of Hall D information as distributed objects and the suitability of the VEE for this application. This proposal does not include any funds for creating these computing resources or for developing the specific computer software needed to carry out these steps.

Experimental data acquisition takes place on computers dedicated to the task. These computers acquire the event information from the detector and distribute it to a computer cluster for filtering in real time. The filtered data are stored on a tape archive at Jefferson Lab. During data acquisition, one of the critical real-time activities is the automated collection and recording of information about the state of the 50,000 active components that comprise the Hall D detector. Data calibration requires repeated analysis of small subsets of the experimental data. The acquisition system will archive approximately 1% of the Hall D data for calibrations and carry out the detector calibrations.

For the experiments currently running at Jefferson Lab, data location, efficiency, and quality control issues have forced most first-pass analysis activities to be performed at Jefferson Lab. Jefferson Lab computing facilities will be expanded to meet the archival and computing needs for the first-pass analysis of the Hall D data.

Event simulations are required to quantify the systematic errors in the Hall D experiment. We anticipate that the quality of our results will be limited by systematic errors and will therefore be highly dependent on
the quality and quantity of simulations. Our plan is to distribute the simulation effort among several of the institutions within the Hall D collaboration.

Traditionally, large distributed teams of graduate students and researchers develop specialized computer programs and software systems to conduct an experiment's computing activities. Such systems generally have an unacceptably high dependence on the original authors, require long ramp-up times to use, and are very labor intensive to produce and maintain. Furthermore, the entire process does not scale well with the data size and complexity because the programs and computing environments are frequently lax about the way they keep track of what has been done. As a result, a significant fraction of each scientist's time in conducting such research must be devoted to learning how to use the programs and to routine computing activities such as code management, installation, and data management.

3. Technical Approach

3.1 Synchronous Collaboration

In this and the following two sections, we define the key concepts and components of our proposed system. This is done in a glossary fashion for the broad categories of synchronous collaboration, portals and collaborative environments, and distributed object technology.

3.1.1 Synchronous Collaboration Capabilities

This refers to object sharing in real time, with events recording state changes transmitted from a "master" instantiation to replicas on other clients in the same session. Fox at Syracuse produced a research system of this type, Tango [11], and includes lessons from it in GCP. As detailed in an FSU survey [3], the three leading commercial systems, Centra, Placeware and WebEx, are quite similar to one another and to Tango. Such systems typically support:

- Shared documents using shared event, shared export or shared display; note that "document" here includes visualizations, web pages, Microsoft Word files, etc.
- Text chat / instant messenger / polling / surveys / attention-getting tools
- Whiteboard and annotations (transparent whiteboard) of shared documents
- Audio/video conferencing
- User registration
- Recording of sessions

The Garnet system GCP has these capabilities, using either HearMe or the Access Grid for the conferencing function. The application of this capability to project management and experimental problem solving (such as when detector experts are not on site) provides numerous benefits to the collaboration. Extending this capability into the classrooms at the various institutions makes it much easier for students who wish to remain in residence at Jefferson Lab to take courses at their home institutions. Fox successfully employed Tango in this fashion for a set of courses given at Jackson State University in Mississippi while he taught from Syracuse or Florida State.

3.1.2 Shared Display

Shared display is the simplest method for sharing documents, with the frame buffer corresponding to either a window or the entire desktop replicated among the clients. Modest client dependence is possible, with PDAs, for example, receiving a reduced-size image. Some collaboration systems support remote manipulation, with user interactions on a machine holding a replica frame buffer transmitted back to the instance holding the original object. This is an important capability in help-desk or consulting applications, similar to situations that occur frequently in the training of students. As this works for all applications without modifying them, it is the basic shared document mechanism in GCP. The public domain VNC [12] and Microsoft NetMeeting were two of the earliest collaboration systems to implement this capability.

3.1.3 Shared Export

Shared display does not allow significant flexibility; for instance, different clients cannot examine separate viewpoints of a scientific visualization. More flexible sharing is possible by sharing object state updates among clients, with each client able to choose how to realize the update on its version of the object. This is very time consuming to develop if one must do it (as, say, in Tango) separately for each shared application. The shared export model filters the output of each application to one of a set of common formats and builds a custom shared event viewer for these formats. This allows a single collaborative viewer to be leveraged among several different applications. WebEx uses a shared virtual printer; the equivalent is achieved with shared Acrobat PDF export in GCP. The scalable formats SVG and PDF are particularly interesting, and support of collaborative viewers for them is a major advantage of GCP. Scalability implies that each client can resize and scroll while
preserving correct positions of pointers and annotations. SVG is useful as it is already available for Adobe Illustrator, and we can expect both PowerPoint and Macromedia Flash to be exportable to this syntax. Currently there is a Flash (a binary 2D vector graphics format) to SVG converter from the University of Nottingham; Office 2000 (save as web page) already exports PowerPoint to VML – an early proposal in the W3C SVG process. We recommend building SVG exports into tools like whiteboards and 2D scientific visualizations to allow convenient interchange among these different 2D presentation tools. We can expect Java3D and X3D [13] to allow similar general collaborative viewers to support collaborative 3D visualization. Although Tango (our Syracuse collaboration system) supported shared VRML (a forerunner of X3D) and a Java3D visualization system, we don't see these standards as clearly accepted yet by the community. GCP will be enhanced in these directions when it seems appropriate.
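
To make the shared export event model concrete, the following minimal Java sketch (our illustration only; the class names and XML attribute names are hypothetical and not taken from GCP) shows the kind of small state-update event a shared SVG or PDF viewer might publish when one client scrolls or zooms, and how another client's viewer would apply it:

// Hypothetical state-update event for a shared-export viewer: each client publishes
// its viewpoint changes as a small XML message rather than shipping the frame buffer.
public class ViewpointEvent {
    private final String documentUri;   // which shared export document this refers to
    private final double zoom;          // zoom factor chosen by the master client
    private final double scrollX;       // scroll position as a fraction of page width
    private final double scrollY;       // scroll position as a fraction of page height

    public ViewpointEvent(String documentUri, double zoom, double scrollX, double scrollY) {
        this.documentUri = documentUri;
        this.zoom = zoom;
        this.scrollX = scrollX;
        this.scrollY = scrollY;
    }

    // Serialize to the small XML payload carried by the message service.
    public String toXml() {
        return "<viewpoint document=\"" + documentUri + "\" zoom=\"" + zoom
             + "\" scrollX=\"" + scrollX + "\" scrollY=\"" + scrollY + "\"/>";
    }

    // Each receiving client realizes the update on its own copy of the document,
    // so pointers and annotations keep their correct positions at any client size.
    public void applyTo(SharedViewer viewer) {
        viewer.setZoom(zoom);
        viewer.scrollTo(scrollX, scrollY);
    }

    // Minimal viewer interface that PDF, SVG or Java3D viewers could implement.
    public interface SharedViewer {
        void setZoom(double zoom);
        void scrollTo(double fractionX, double fractionY);
    }
}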

3.1.4 Access Grid

The Access Grid [14] is a very successful community audio/video conferencing system developed by Argonne National Laboratory. We intend to use this in GCP and the Hall D VEE, although we will substitute for (or augment) its shared PowerPoint capability with the shared document capability from GCP and the commercial synchronous collaboration systems.

3.1.5 HearMe

HearMe [15] is a leading commercial Internet desktop audio conferencing service supporting both PC and telephone clients, with archiving and replay. We have installed a HearMe system at FSU. Note that audio quality is a critical problem for Internet collaboration, as audio needs negligible bandwidth but excellent quality of service, which is often not available. Thus we use a system that allows phones as an integrated backup – note that HearMe archives all audio, whether it comes from phones or is purely Internet based. The archived audio can be replayed using streaming formats (such as RealAudio), with the W3C SMIL syntax integrating this with shared documents. We are investigating desktop video solutions, but experience has found this to be less critical than audio and so it is currently a lower priority; we will not develop this ourselves but will use the best academic or commercial practice. We intend to study, over the summer, ways of using the SIP and H.323 (two standards for conferencing tools) compatibility of HearMe to bridge it to the Access Grid. This should allow desktop users to link to Access Grid sessions in a convenient fashion.

3.2 Portals and Collaborative Environments

3.2.1 Education and Commodity Portals

WebCT and Blackboard are leading education portals and are typical of managed information portals. The GXOS schema extends ideas present in the IMS [16] and ADL [17] initiatives for "learning object standards". We can expect conformance to these standards to allow exchange of course material between different management systems. GXOS does not actually agree in detailed syntax with these standards but rather has a Schema that allows GXOS objects to be mapped (by XSLT) into the IMS and ADL standards. For instance, GXOS views education-specific structure as an extension to a framework designed for general meta-objects, messages and events, whereas IMS and ADL take an education-centric view.
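
Such a mapping is a standard XSLT transformation; the sketch below (our illustration; the stylesheet and file names are hypothetical) uses the JAXP transformation API to turn a GXOS meta-object document into an IMS-conformant one:

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Sketch: apply a (hypothetical) gxos-to-ims.xsl stylesheet to a GXOS document,
// producing course metadata in the IMS syntax for import into another system.
public class GxosToIms {
    public static void main(String[] args) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource("gxos-to-ims.xsl"));
        transformer.transform(new StreamSource("course-object.xml"),
                              new StreamResult("course-object-ims.xml"));
    }
}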

We have also examined the structure of commodity portals such as Yahoo and Excite and the structure of news sites from CNN, the New York Times, etc. These are supported by the hierarchical topics (channels) in GXOS, with customization using user profiles in GXOS; we call this part of GXOS portalML. It is clear that different people within the collaboration will require different views of the Hall D VEE depending on their tasks. For example, those calibrating the Cerenkov counters will require a very different view from those trying to replace phototubes, and a very different one again from those puzzling over an anomaly in a partial wave. It is very likely that a single individual will want different views of the Hall D VEE depending on the particular task they are performing at the time.

3.2.2 Computing Portals

Computing portals provide web-based computing or problem-solving environments. Sixteen recent projects of this type have been gathered together by the Grid Forum Computing Environment working group [18]. This includes the FSU Gateway activity [4], which is to be integrated into GCP, initially using shared display and shared Java server pages. The need for such systems is clear when one considers that the computing resources required to conduct meaningful simulations are not likely to be located at every university. The use of computing portals helps provide the means for managing the production-level computing projects that Halls B and D must perform.

3.2.3 Collaborative Portal

A collaborative portal is a system that provides a portal (web-based access to a particular application and/or set of Internet resources viewed as distributed objects) combined with the ability to share the accessed objects.
This sharing includes synchronous and asynchronous access, with the latter involving "channels" or "bulletin boards", or simply posting a web page and informing interested parties in an informal fashion. Both portals and collaboration require high-level metadata about the accessed objects and the users involved. Thus we combine both concepts in GCP with a single supporting information management service.

3.2.4 GCP Garnet Collaborative Portal

GCP is the research system embodying the ideas described in sections 3.3, 3.4 and 3.5. Key functions include the integration of synchronous and asynchronous collaboration, both in terms of topic publishing (channels) and object management. Thus it combines the capabilities of synchronous collaboration systems and portals like Gateway, WebCT or Blackboard. It uses the Access Grid or HearMe systems for conferencing. It supports hand-held devices and the archiving and replay of collaborative sessions. Image (JPEG, GIF, PNG), SVG, PDF and Java3D shared export viewers are planned; the latter three formats are scalable and support separate viewpoints (zooming) on each client. A prototype of GCP will be available in May 2001. GCP is designed in a modular fashion with a clean interface for collaborative applications, which use the common GMS mechanism and GXOS Schema for exchanging state update events. Thus we can interface other novel DOE collaborative components, such as the Electronic Notebook [19], into GCP. It is quite likely that scientists will require several notebooks, depending on the task at hand. Different groups could organize their efforts with the assistance of electronic notebooks for various detectors or different physics analyses. We will also work with Dennis Gannon of Indiana University to interface his computing portal with GCP.
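
The clean interface for collaborative applications might look like the following minimal Java sketch (our illustration only; GCP calls such applications "sharedlets", but the Sharedlet interface shown here and its method names are hypothetical): each sharedlet simply consumes and produces GXOS-conformant XML events carried by the message service.

// Hypothetical interface a collaborative application ("sharedlet") could implement.
// State updates travel as small XML events through the common publish/subscribe service.
public interface Sharedlet {

    // GNDI name of the object this sharedlet renders, e.g. a shared SVG document.
    String getObjectName();

    // Called when an XML state-update event arrives on a subscribed topic.
    void handleEvent(String xmlEvent);

    // Produce an XML event describing a local state change, to be published to the
    // session topic so that replicas (and the archiving subscriber) stay in step.
    String describeStateChange();

    // Full state as XML, used to bring a late joiner or a replayed session up to date.
    String snapshotState();
}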

3.2.5 VEE Virtual Experiment Environment

The VEE is the digital networked environment proposed here that supports physics experiments and is based on GCP, Gateway and MyXoS.

3.3 Distributed Object Technology

3.3.1 Hall D Physics Events

The digital representation of measured physics events produced using the Jefferson Lab accelerator will be stored in raw and processed form. Hall D will gather approximately 10^12 physics events per year, each containing approximately 5 Kbytes of information.

3.3.2 Hall D Micro-objects

Micro-objects are the smallest objects, such as individual detector signals or tracks. Each Hall D event consists of some 50-1000 micro-objects, depending on the level of detail with which the event is examined.

3.3.3 Hall D Gallimaufry

The Gallimaufry (hodgepodge or jumble) is the heterogeneous collection of Hall D knowledge made up from a multitude of sources, including electronic mail, reports, presentations, partial wave analyses, archived visualizations and meetings. It explicitly excludes the very structured and numerous Hall D physics events.

3.3.5 Aggregates

Aggregates are collections of either objects or meta-objects that are usefully considered as a single unit – often because a group of objects is stored together in a single file. In GNDI, an aggregate is the collection of all meta-objects stored as children of a GXOS node. An aggregate is defined in GXOS as a general sub-tree. Examples of aggregates might include the contents of a particular HBOOK file (using the CERN terminology for a file containing a collection of histograms). Such aggregates would include a series of one- and two-dimensional histograms. We use aggregates to join related objects together into larger objects and so reduce the total number of meta-objects that users and MyXoS must explicitly manipulate.

3.3.6 GXOS Garnet eXtensible Object Specification

GXOS is an object specification realized as a collection of XML Schemas in a single namespace defining a general hierarchical data structure, where each node supports extensions to define different application domains; Users, Security, Computers, GMS, IMS/ADL (Education) and Hall D are particular extensions. We can view GXOS as having three basic capabilities expressed with the same overall structure and three different sets of extensions: resourceML defines the base objects (users, documents, computers); portalML describes the virtual environment with topics, user profiles and client renderings; GMS describes the messages that communicate between the subsystems.

3.3.7 Meta-object

Meta-objects are the basic units of GXOS, and they can be at either leaf or internal nodes of a GXOS tree. Meta-objects typically contain only meta-data and use the GXOS Object Realization Schema type to specify access to the "original object". There are three ways an object can be related to a meta-object. Small objects such as Hall D micro-objects or GCP events would be self-contained, i.e., the GXOS schema specifies the object and there is no distinction between meta-object and object. Secondly, the object can be specified outside GXOS but its realization can be internal to GXOS (e.g., an RDF literal data type). Finally, GXOS can reference a
specification in any distributed object framework such as Java, CORBA, .net, or the general SOAP protocol. One of the most common Hall D objects, the experimental run, will be stored in a hierarchical tape storage system and falls into the third category: there would likely be significant meta-data describing the runs and their access, but the meta-objects would not contain the runs themselves.

Note that we will specify all properties of a physics event in GXOS but manipulate them in aggregate form, with the metadata just summarizing the events. The events would be stored efficiently and referenced as external objects. Some parts of the analysis will want to generate the native XML version of an event, and MyXoS supports this multi-resolution view of information – one just needs to specify which part of the tree one wishes to look at. Note that all nodes of a GXOS tree, whether internal or leaf, have metadata and can be viewed as meta-objects.

3.3.8 GNDI Garnet Naming and Directory Interface

All GXOS Meta-objects have a unique name with a hierarchical structure and a URI of the form:

gndi://gxosroot/HallD/jlab/run137/aggregate007/event31416/fsuanalysis2/number_of_particles
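
Because GNDI names are simply hierarchical URIs, they can be manipulated with ordinary string handling. The sketch below (our illustration; the GndiName helper class is hypothetical and not part of GNDI) shows the kind of parsing a naming service or MyXoS shell command would perform:

// Hypothetical helper for hierarchical GNDI names of the form
// gndi://gxosroot/HallD/jlab/run137/.../number_of_particles
public final class GndiName {
    private final String uri;
    private final String[] path;

    public GndiName(String uri) {
        if (!uri.startsWith("gndi://")) {
            throw new IllegalArgumentException("Not a GNDI URI: " + uri);
        }
        this.uri = uri;
        this.path = uri.substring("gndi://".length()).split("/");
    }

    public String root() { return path[0]; }                  // e.g. "gxosroot"
    public String leaf() { return path[path.length - 1]; }    // e.g. "number_of_particles"

    // Name of the enclosing node, used to walk up the tree toward an aggregate.
    public String parent() {
        int cut = uri.lastIndexOf('/');
        return uri.substring(0, cut);
    }

    // True if this name lies under the given prefix, e.g. gndi://gxosroot/HallD/jlab/run137
    public boolean isUnder(String prefix) {
        return uri.startsWith(prefix);
    }
}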

3.3.9 GMS Garnet or Grid Message Service

GMS is the publish/subscribe message-based infrastructure used to support GCP and MyXoS. It supports XML-based publication topics and subscription profiles as well as a sophisticated distributed server network supporting fault tolerance and performance of message delivery. A research prototype is described in the June 2001 Syracuse PhD thesis of Pallickara [20], while our initial "deployment" of GCP uses JMS (Java Message Service) as an interim solution. All messages are archived in an Oracle database. We expect to switch from JMS to a more powerful model as both our research and the work of the Grid Forum evolve. We are working through the Grid Computational Environment and Performance working groups toward a consensus on a grid event service.

3.3.10 GCP Events

The events exchanged by the clients in GCP are transmitted as time-stamped messages by GMS and routed between clients using the publish/subscribe mechanism. The GXOS schema fully specifies GCP event objects, with all properties provided by the Schema.

3.3.11 GMSME or GMS Micro-edition

GMSME is the customization of GMS to small clients, which connect to MyXoS through "an adaptor" linked to a PDA or cell-phone class device via our HHMP (Hand Held Message Protocol). The adaptor (running on a conventional MyXoS server) performs functions such as XML and SVG parsing and the rescaling of images. Typically the processing of any collaborative application (called a sharedlet in GCP) is split between adaptor and client in a fashion that depends on client capabilities. A Windows CE Instant Messenger needs fewer services from the adaptor than the cell-phone IM interface. An adaptor looks like a GMS client to MyXoS, and so this creates the illusion that GMS directly connects to small clients.
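
One such adaptor chore is producing a reduced frame for a PDA screen; a minimal sketch (our illustration only; the HandheldAdaptor class is hypothetical, and HHMP framing and delivery are omitted) is:

import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical adaptor function: downscale a shared-display frame so that only a
// small image, sized for the hand-held client, travels over the slow last link.
public class HandheldAdaptor {

    public BufferedImage rescaleForClient(BufferedImage fullFrame, int clientWidth) {
        int clientHeight = fullFrame.getHeight() * clientWidth / fullFrame.getWidth();
        BufferedImage reduced =
            new BufferedImage(clientWidth, clientHeight, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = reduced.createGraphics();
        g.drawImage(fullFrame, 0, 0, clientWidth, clientHeight, null);  // scaled draw
        g.dispose();
        return reduced;   // the adaptor would then encode and send this over HHMP
    }
}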

3.3.12 P2P Peer-to-Peer Systems

P2P refers to a linkage of computers "at the edge" of the Internet. As shown in Figure 1, this can be achieved by routing through one or more servers. JXTA is a technology initiative by Sun [10] in this area, and systems like Napster are popular P2P environments. In the current GCP, we use the same simple client/single-server architecture used by the commercial collaboration systems and by our original Tango system. We are continuing to research this issue and to develop approaches based on optimized routing of GMS messages for a given server configuration and the particular published topics and subscribed profiles. Dynamic instantiation of servers also seems an important capability, which should be supported by MyXoS. For instance, if the clients are at widely separated (in Internet terms) locations, then a single server could be appropriate; if there are many clients in a given location, then MyXoS should dynamically generate a server at that location to implement optimal local P2P routing. Hardware multicast should of course be used if available.

3.3.13 MyXoS My eXtensible Web Operating System

MyXoS refers to the total environment, including both the collaborative portal GCP and the suite of administrative tools to manage the dynamic information infrastructure. At a low level, MyXoS is driven by scripted XML written in the W3C RDF syntax and referencing GXOS objects. MyXoS includes the sophisticated search capabilities described below and poses an extremely interesting research challenge: defining how it brings referenced meta-objects into memory as requested by executing programs. This is termed the MyXoS execution model below. MyXoS will provide core system services such as copy, create, grep (i.e., search), diff, etc., familiar from UNIX and Windows.

3.3.14 RDF Resource Description Framework
RDF [21] is a W3C standard for metadata allowing any resource labeled by a URI to be given a value (which is either a literal or another resource) for a property. This can be used in MyXoS to specify or modify distributed tree fragments in a fashion similar to that used for distributed data sources in the Mozilla (Netscape 6) browser [22]. Each data source stores a fragment of tree; these are glued together by MyXoS as its distributed servers combine their information. Typical RDF uses in MyXoS are illustrated by the examples below.

1. Specify a value for a property in the GXOS tree:

<rdf:description about="gndi://gxosroot/resourcename">
  <gxos:property rdf:parseType="literal">somevalue</gxos:property>
</rdf:description>

2. Specify a profile by linking between GXOS tree elements:

<rdf:description about="gndi://gxosroot/sessionname">
  <gxos:userprofile rdf:resource="gndi://uri_of_user" gxos:customize="sessionspecificstuff" />
</rdf:description>

3. Specify a MyXoS copy command for meta-objects:

<rdf:description rdf:about="gndi://gxosroot/system/bin/cp"
  system:source="gxosobject1" system:destination="gxosobject2" gxos:execute="true" />

4. Specify alternative locations at which to find all FSU users:

<rdf:description aboutEachPrefix="gndi://gxosroot/users/fsu">
  <gxos:metaobjectlocation>
    <rdf:Alt>
      <rdf:li resource="http://main_fsuweblocation" />
      <rdf:li resource="http://backup_fsuweblocation" />
    </rdf:Alt>
  </gxos:metaobjectlocation>
</rdf:description>

Note that RDF is not an essential part of MyXoS; we can replace it by other XML-based tools. Some people have expressed reservations about RDF because more powerful forms of knowledge representation may replace it. We will monitor W3C and community activities and evolve our practice appropriately.

3.4 Asynchronous Collaboration and Object Management

GCP and MyXoS provide a unified approach to sharing and managing objects, functions that are traditionally treated separately. For instance, Centra is currently the leading e-learning collaboration system, but it has weak management capabilities; Blackboard and WebCT are the leading learning management systems in universities, but neither has strong collaborative capabilities. We combine support for sharing and management because both require accurate metadata, and we can achieve this with the same infrastructure, MyXoS. Asynchronous collaboration is supported by uniform use of a single publish/subscribe mechanism: currently JMS, to be extended to GMS. In principle all forms of asynchronous collaboration would be included in GMS by wrapping "foreign objects" such as email as GMS events where necessary. We will do this as needed, but it is not realistic for us to redefine ways of working that are already adequate. GCP does use the public-domain Jabber instant messenger [23] and has modified it to interface with GMS. Synchronous and asynchronous collaboration are integrated by using the same publish/subscribe mechanism.
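As an illustration of wrapping a "foreign object" as an event on the shared publish/subscribe channel, the sketch below uses the standard JMS API; the JNDI names, topic name, and XML payload are assumptions for illustration, not the actual GMS configuration.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;
import javax.naming.InitialContext;

// Hedged sketch: wrap an e-mail as a small XML event and publish it on a JMS
// topic, so asynchronous objects travel over the same publish/subscribe
// channel used for synchronous collaboration.
public class EmailEventPublisher {
    public static void main(String[] args) throws Exception {
        InitialContext jndi = new InitialContext();
        ConnectionFactory factory = (ConnectionFactory) jndi.lookup("GMSConnectionFactory"); // assumed name
        Topic topic = (Topic) jndi.lookup("halld/asyncEvents");                              // assumed name

        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(topic);

            // The "foreign object": an e-mail reduced to a small XML event that
            // points back at the archived original.
            String eventXml =
                "<event type=\"email\">" +
                "<from>analyst@example.org</from>" +
                "<subject>Partial wave fit results</subject>" +
                "<archiveRef>gndi://gxosroot/halld/mail/someMessage</archiveRef>" +
                "</event>";

            TextMessage message = session.createTextMessage(eventXml);
            message.setStringProperty("objectType", "email"); // lets subscribers filter by type
            producer.send(message);
        } finally {
            connection.close();
        }
    }
}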

Registration of Meta-Objects

MyXoS maintains queues (topics) to which aggregates and meta-objects can post their location. These messages are similar in function to those used in Jini and are automatically purged when their posted validity expires. They contain one or more RDF statements identifying the location of the meta-objects with a certain value or range of URIs (GNDI names). Currently this is most easily specified using the about or aboutEachPrefix attribute of the rdf:Description tag, but we expect to generalize this rather simple syntax. These locations contain either the meta-objects themselves or further RDF statements giving directions. In this fashion MyXoS uses distributed engines, subscribing to the registration topics, to build up indices that map a GNDI name to a meta-object location. The meta-object is either the desired object itself or uses the GXOS object realization schema to specify access to the original object.
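A minimal sketch of such a registration message is given below (again standard JMS; the topic, GNDI prefix, and location URL are illustrative assumptions). The message time-to-live plays the role of the posted validity, so the registration is purged automatically when it expires.

import javax.jms.DeliveryMode;
import javax.jms.Message;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;

// Hedged sketch: post an RDF registration statement for a range of GNDI names
// to a registration topic, with the validity implemented as JMS time-to-live.
public class MetaObjectRegistrar {

    /** Publish one registration; 'session' would come from an open GMS/JMS connection. */
    public void register(Session session, Topic registrationTopic) throws Exception {
        String rdfStatement =
            "<rdf:Description rdf:aboutEachPrefix=\"gndi://gxosroot/halld/raw/run01234\">" +
            "<gxos:metaobjectlocation rdf:resource=\"http://some_fsu_server/metaobjects/\" />" +
            "</rdf:Description>";

        TextMessage registration = session.createTextMessage(rdfStatement);
        MessageProducer producer = session.createProducer(registrationTopic);

        long validityMillis = 60L * 60L * 1000L; // registration must be renewed hourly (assumed policy)
        producer.send(registration, DeliveryMode.NON_PERSISTENT, Message.DEFAULT_PRIORITY, validityMillis);
    }
}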

Efficient Handling of Objects

We need to combine the high flexibility and functionality of distributed objects with the performance associated with traditional physics analysis systems that use simple flat files with customized formats, while satisfying the requirements of security, collaboration, and distributed dynamic objects. The most important general strategy is our use of small meta-objects, which contain the essential information for implementing MyXoS services; the original object is only accessed when necessary. The information stored in the meta-object is application dependent and requires careful design of the GXOS extension for each object type.


As part of this project, we will design a suite of Hall D schemas for the different aspects of the experiment. Hall D has a quadrillion objects, and we need to specialize the architecture for a problem of this size. We first divide the Hall D objects into the two classes (Hall D physics events and the Gallimaufry) defined above. A trillion events are taken each year, and these are combined into aggregates of about 10^5 events. Each aggregate will be stored as a separate file with a rich metadata abstract. These abstracts (around 10^7 per year) are viewed as MyXoS meta-objects and handled by the base MyXoS system. This technique is used for both raw data and all processed versions with tracks, particle identifications, and momenta. We also have the Hall D Gallimaufry, which includes everything else, such as reports and partial wave analyses. These objects are fewer in number but much more heterogeneous in content and scattered over file systems around the world. Hall D events are expected to be consolidated in a few places (initially Jefferson Lab) and so are amenable to special processing. Thus we expect MyXoS to be required to handle explicitly tens of millions of meta-objects, and this seems realistic.
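To indicate the kind of information such an abstract might carry, the hypothetical sketch below lists plausible fields for the meta-object describing one aggregate of roughly 10^5 events; the actual content will be fixed by the Hall D GXOS extensions designed in this project.

// Hypothetical meta-object ("abstract") for one aggregate file of ~10^5 events.
// Every field name here is illustrative; the real schema is to be designed.
public class AggregateAbstract {
    String gndiName;         // e.g. "gndi://gxosroot/halld/raw/run01234/agg0007" (made-up name)
    String fileLocation;     // where the flat file holding the events actually lives
    long firstEventNumber;   // range of events contained in this aggregate
    long lastEventNumber;
    String processingStage;  // "raw", "tracked", "particle-id", ... for this version
    String[] triggerSummary; // coarse selection information usable without opening the file
    String checksum;         // integrity check for the underlying file
    String[] versionHistory; // GXOS versioning metadata required of all Hall D objects
}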

It is computationally practical to index down to the micro-objects inside the event records. This would be a special service handled by a variant of today's web search engines. Note that we have roughly a trillion events per year, each with up to a thousand attributes. This is similar in scope to today's web if we substitute event objects for web pages and micro-objects for the text and metadata in each page. Thus a cluster of some 100 workstations would be able to search this information on demand in a fraction of a second. The cluster size could be reduced substantially either by allowing a longer wait time or by "waiting on" the inevitable improvement in PC and workstation performance that will occur by the time Hall D comes online. The Gallimaufry would be included in this search; both aggregates and Gallimaufry objects would filter their metadata and store it in a form optimized for the search. This part of the service is equivalent to web robot searches, but substantially easier and more accurate. Hall D objects are required to have metadata and to register with MyXoS through their associated meta-objects. The metadata includes a versioning history, and updated search information will be generated at registration.

The requirement that all objects have metadata has important implications as an enabling technology for the management of Hall D computing activities. The scale of the computing problem requires that physicists undergo a phase transition in the way they think about many of their computing tasks. It is no longer possible for a graduate student or a post-doc to manage the computations on even a medium-sized cluster. Computing for experiments such as those in Hall B and Hall D must be approached as a production-level problem or it will not get done. The creation of metadata by all the programs in the Hall D production environment makes it possible to track the Hall D computing activities automatically.

3.5 MyXoS Execution Model

We are currently researching different ways of reading the XML meta-objects into memory as they are needed by programs running under MyXoS. SAX and DOM XML parsers are not efficient for tens of millions of XML instances at a time. Converting XML schemas into Java data structures is possible [24], but efficiency requires that this be combined with "lazy" parsing, so that we expand GXOS trees only as needed to refine our access. Recall that the use of multi-resolution aggregates, "stopping" at a certain level in the GXOS tree, is absolutely essential for an efficient system. We see this as a particularly challenging problem with important programming-style implications as we look at new computing paradigms where data structures are defined in XML rather than directly as C++ or Java classes.
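The following sketch (plain Java; all names hypothetical) illustrates the lazy style we have in mind: a GXOS node keeps its subtree as unparsed text and expands only the branch that a running program actually requests, so that millions of meta-objects are never fully materialized in memory at once.

import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of lazy expansion of a GXOS tree: a node holds its
// children as raw XML and parses one level only when a child is requested.
public class LazyGxosNode {
    private final String name;
    private final String rawXml;                       // unparsed subtree
    private Map<String, LazyGxosNode> children = null; // filled on first access

    public LazyGxosNode(String name, String rawXml) {
        this.name = name;
        this.rawXml = rawXml;
    }

    /** Expand one level of the tree only when a child is actually requested. */
    public LazyGxosNode child(String childName) {
        if (children == null) {
            children = parseOneLevel(rawXml); // parse direct children only
        }
        return children.get(childName);
    }

    /** Placeholder for a real (SAX-style) one-level parse of the subtree. */
    private Map<String, LazyGxosNode> parseOneLevel(String xml) {
        return new HashMap<>();
    }

    public String getName() { return name; }
}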

Elsewhere we intend to look at areas such as "Parallel Computing in MyXoS", with intelligent XML-based data structures interpreted by Java agents, enabling more powerful decomposition strategies (of XML structures and of algorithms expressed in Java or a more powerful version of RDF) and the use of distributed dynamic resources for parallel execution. In this model we would, as now, produce MPI-based SPMD codes, but with a very different way of specifying the problem. We believe the research proposed here (mainly aimed at collaborative information systems) would be synergistic with such other applications of emerging Web operating environments like MyXoS. Another general capability needed by all these problems is "packed" or binary XML, which can most efficiently represent XML structured for optimal parsing.

4. Timeline

The Hall D project is currently being evaluated by the Nuclear Science Advisory Committee as part of the long-range planning process for nuclear physics. If the project receives approval from DOE to begin as soon as possible, it is expected to begin taking data in 2007. The Hall D VEE is a key component of the computing plan for the Hall D experiment. The effort described in this proposal covers three years and will be completed in time to be fully deployed in the Hall D computing effort and to have a substantial impact on Hall B computing.

As the VEE is developed, we expect members of the Hall D Collaboration and Jefferson Lab personnel to review it formally on at least two occasions. Reviewers from outside the Hall D community will also review these plans within the context of the entire Hall D computing effort. Finally, as part of this effort we will seek informal reviews of the software from the users as they begin to use the VEE, and we will use this feedback to help refine the VEE and the underlying standards.

The FSU group will be primarily responsible for developing the framework and standards for the VEE, a prototype collaboration system, and the prototype applications for Hall B simulations and analysis. The IU group will be primarily responsible for applying the standards to the most important Hall D computing applications. Dividing the effort in this way provides us with critical feedback on the usefulness of the framework from people not directly involved in its development. We have established an aggressive schedule to reach this project's goals.

4.1 Initial Deployment

We have about five years before we need to package our system and deploy it in a totally robust fashion to support Hall D data taking and production-level analysis and simulation. During this time we will incrementally deploy technologies, starting with those that support simulations and other aspects of what we have called the Gallimaufry. Note that this means it is not this proposal but rather a follow-on that must generate the production VEE with all its functionality. Here we will focus on research, but with enough deployment to validate the concepts in practice. In the first year of this proposal, this will be done for both the mature Hall B and the new Hall D experiments. Initially we will take existing technologies and modify them to support immediate use in early fall 2001. This involves:

The prototype GCP collaborative portal or, if this is not quite ready, one of the major commercial synchronous systems discussed above. This will supply basic tools and will support shared documents, using PDF and web-page export, and shared display.

Selected installations of Access Grid technology to support group meetings; in other projects we have developed training and deployment support through Argonne and NCSA.

Pervasive availability of desktop audio (and, if necessary, video) using HearMe or whatever is best practice in fall 2001.

GCP will support modest document management capabilities and archiving of sessions. This will be used to gain experience with Gallimaufry objects. We will use our Gateway portal to provide an initial computing portal for Hall D simulations and Hall B analysis runs. This system already has a wizard to define metadata, and this will be modified in the next few months to generate GXOS meta-objects. This plan could be modified based on the work of the Grid Forum in establishing best practice and standards in this area. An initial enhancement would involve using the Gateway portal's visualization capability for both traditional physics event displays and statistical analysis displays. Note that Gateway already has extensive security functions, and these have been used in the design of the current security extension for GMS.

4.2 Ongoing Research and Development

We will combine use of our technology, research, hardening, and further deployment in an iterative fashion. Above we described an initial deployment of proven technology through which we will gain initial experience with a VEE for both Hall B and Hall D. This deployment will be done almost immediately when the project begins. We will learn from its use by the experimental physicists as we research, evolve, and deploy new components.

Below we describe our plan for technology research and development and for deployment. We assume that both the computer science and physics communities will review our progress, and workshops and programmatic reviews could well be combined. We list formal requirements for evaluation of the physics functionality, as these are needed to ensure that we can integrate our work with the Hall B and Hall D schedules. In the initial deployment the three major functional areas (synchronous collaboration, asynchronous collaboration, and portals) are not well integrated, and each area will be incomplete; the most advanced will be the computing portal and the synchronous collaboration support. If our project is approved, we will work with DOE on a broader review process.



4.3 Milestones

The Hall D VEE is a key component of the computing plan for the Hall D experiment. The effort proposed covers three years and will be completed in time to be extended for full deployment in the Hall D computing effort and to have a substantial impact on Hall B computing. The milestones given below provide clear goals that need to be met in order to ensure successful completion of the project.

We have established an aggressive schedule to reach this project's goals. During the first year the synchronous collaboration support will be improved by completing the different shared export modes. We will also design the initial GXOS extensions to support Hall D objects, especially for the simulation area. The major year 1 activity, however, will be the construction of the initial MyXoS environment, which will enable us to integrate the three collaboratory components (synchronous collaboration, asynchronous collaboration, and portals).

In year 2 we will complete the Hall D object specifications and refine the synchronous collaboration system so that it can be used in the full experiment. This will be evaluated with an attempt to perform an end-to-end analysis of Hall D data. We will get our initial user feedback from the use of the MyXoS-based environment that fully integrates Gateway and GCP. Full support of hand-held devices will be included. Research will be ongoing into the next set of MyXoS capabilities to support fully dynamic registration and searching in a fashion that scales to tens of millions of objects.

In year 3 we will deploy a MyXoS environment with the full proposed capabilities, with the major emphasis on user feedback, documentation, and deployability rather than increased functionality. We will also critically evaluate the scalability of the system by exercising it with a small fraction of the objects that will eventually be included in the Hall D VEE (this fraction is limited primarily by our ability to produce the required number of objects with the available computing and storage resources, not by the VEE system itself).

0-6 months:
Deploy a skeleton VEE system, mainly collaborative tools.
Create two Grid "nodelets", one at FSU and one at IU, for use as the development platform for this effort.
Design the initial GXOS extensions to support Hall D objects, especially for the simulation area.

6-12 months:
Integrate Hall B and Hall D simulations into the Gateway system.
Construct the initial MyXoS environment, which will enable us to integrate the three different collaboratory components.
Complete the different shared export modes.

12-18 months:
Extend the capabilities to include additional computing applications, collaboration tools, and integration with an experiment database.
Begin evaluation of database performance.
Begin creating user-interface customization tools.
Integrate hand-held devices.

18-24 months:
Deploy the second prototype.
Create the developer interface.
Create wrappers for many standard systems, including the slow-control software, the CEBAF online data acquisition system, the JLab hierarchical storage system, the queuing system, and the accelerator electronic logbook.
Deploy for Hall B first-pass analysis.
Complete the Hall D object specifications.
Refine the synchronous collaboration system to a form that can be used in the full experiment.
Evaluate VEE performance.
Attempt an end-to-end evaluation of Hall D data using the VEE.
Conduct the first workshop.
Continue research into the next set of MyXoS capabilities to support fully dynamic registration and searching in a fashion that scales to tens of millions of objects.

24-30 months:
Create a customizable user interface.
Extend functionality; demonstrate integration of slow controls, CODA, the JLab computer system, and data transfer.
Evaluate developer tools.
Conduct an internal evaluation with a 2% object challenge: try to accommodate 2% of the Hall D objects in the VEE.
Conduct the second workshop.

30-36 months:
Deploy the third prototype (the first version).
Finish documentation on the framework, standards, user's guide, and best-practices guide.

4.4 Outreach

This project by its nature provides one avenue of outreach: in order to complete the project successfully, we must educate the TJNAF community about the nature and structure of the Hall D VEE. To accomplish this in a more direct manner, we intend to conduct two workshops during years two and three of this project. These workshops will bring members of the Hall B and Hall D collaborations together to use the standards and tools created to build, extend, and apply the VEE. From these focused workshops we will solicit feedback about the VEE and seek advice concerning VEE functionality, applications, and best practices. In addition, we intend to start using the prototype during the project construction phase as a collaboration tool and to provide access to a limited number of computing resources.

In addition, members of this research project are active participants in the Global Grid Forum and can use this involvement to help direct, and adhere to, best practices for this technology.

4.5 Scientific Team

The scientific team consists of key leaders of the Hall D collaboration. Alex Dzierba is spokesman for the Hall D collaboration. Larry Dennis is co-coordinator of the Hall D computing effort. Geoffrey Fox has extensive expertise in the development of collaborative software tools; he wrote the simulation and analysis software for three major Fermilab experiments (E110, E260, E350) and is very familiar with the application requirements. Larry Dennis has been involved in the Hall B collaboration since 1987 and currently has three students working on the simulation and analysis of experiments in Hall B. He served as co-coordinator for Hall B software until 1997 and as chairman of the Jefferson Lab Physics Computing Advisory Committee in 1996-97. Computer scientist Greg Riccardi of FSU has been a member of the Hall B collaboration at TJNAF for 10 years and is a major contributor to its data and computation management. Mathematician Gordon Erlebacher is an expert in the use of XML, XSL, Schema, Cocoon, and hand-held devices. The FSU group has extensive experience in cluster and distributed computing. The IU group has extensive experience in cluster computing, simulations, and analysis of nuclear physics experiments. The members of this collaboration are working closely with the Jefferson Lab staff to define and create the Hall D and Jefferson Lab computing environment.

4.6 Partnerships

Critical partnerships for this effort are with the Jefferson Lab computing center, headed by Ian Bird, and the Jefferson Lab Particle Physics Data Grid project group, headed by Chip Watson. Both have been involved in the development of the Hall D computing plan from the start and have agreed to help advise and support this effort.

4.7 Matching Support

The Florida State University School of Computational Science and Information Technology is providing matching support for this proposal in the form of one full time research scientist for three years (estimated cost of $180 K) and the FSU Department of Physics is providing funds totaling $20 K.


Bibliography

1) Hall D, http://dustbunny.physics.indiana.edu/Hall D/
2) Portals and Frameworks for Web-based Education and Computational Science, http://www.new-npac.org/users/fox/documents/pajavaapril00/
3) General Review of Portals by G. Fox, http://aspen.csit.fsu.edu/collabtools/CollabReviewfeb25-01.html
4) Gateway Computational Portal, http://www.gatewayportal.org
5) Particle Physics Data Grid, http://www.cacr.caltech.edu/ppdg/
6) GriPhyN Project Site, http://www.griphyn.org/
7) GriPhyN site, http://www.phys.ufl.edu/~avery/griphyn/
8) Ninja Project, http://ninja.cs.berkeley.edu/
9) Semantic Web, http://www.w3.org/2001/sw/
10) Sun Microsystems' JXTA, http://conferences.oreilly.com/p2p/p2p_brochure.pdf
11) The Tango System, http://www.new-npac.org/users/fox/documents/generalportalmay00/erdcportal.html
12) Virtual Network Computing System, http://www.uk.research.att.com/vnc
13) X3D, http://www.web3d.org/x3d.html
14) Access Grid, http://www.mcs.anl.gov/fl/accessgrid/
15) HearMe, http://www.hearme.com
16) Instructional Management Systems (IMS), http://www.imsproject.org
17) Advanced Distributed Learning Initiative, http://www.adlnet.org
18) Grid Forum Computing Environment Working Group, http://www.computingportals.org/cbp.html
19) DOE 2000 Electronic Notebook, http://doecollaboratory.pnl.gov/research/homepage.html
20) Grid Message System, http://aspen.csit.fsu.edu/users/shrideep/mspaces/
21) Resource Description Framework (RDF), http://www.w3.org/TR/REC-rdf-syntax/
22) RDF implementation in the Netscape Browser, http://www.mozilla.org/rdf/doc/
23) Jabber Instant Messenger, http://www.jabber.org
24) Castor: XML Schema to Java Data Structure Converter, http://castor.exolab.org/sourcegen.html


Computing and Networking Facilities

The Florida State University School of Computational Science and Information Technology (CSIT) is the home of a comprehensive high-performance computing environment. Workstations comprise a growing component of the computing resources utilized by CSIT scientists. The public-domain UNIX operating system for PC platforms, Linux, is proving to be a mature enough environment to place on scientists' desks, and a number of high-powered PCs were purchased for this reason. CSIT has organized most of the machines into "compute clusters" that are allocated to scientists as required. The cluster includes:

30 2-CPU Intel platforms with Linux; 8 DEC 3000 Model 600s; 10 DEC 3000 Model 400s; ~90 miscellaneous workstations (Intel-based PCs, Sun, and Silicon Graphics); 2 4-CPU AlphaServer 2100s; 1 1-CPU Alpha; 1 1-CPU AlphaServer 2100; a 2-CPU Alpha; 2 4-CPU Alphas with 8 GB of memory and ~100 GB of disk each; 1 2-CPU HP 9000 series with 8 GB of memory; 1 4-CPU HP 9000 series with 8 GB of memory; and 1 2-CPU Sun Enterprise 3500 with 1 GB of memory, ~90 GB of disk, and gigabit Ethernet to support file serving.

The sixteen-node IBM 9076 SP2 has eight wide nodes (each with one Gigabyte of RAM and ten Gigabytes of disk storage). There are five four-R10000-processor Origin 200s at 180 MHz, each with one Gigabyte of memory and 27 Gigabytes of disk storage; one Origin 200 with a single CPU (180 MHz) acting as an X-terminal server; one SGI Maximum Impact with one Gigabyte of memory; and an SGI O2 workstation with one Gigabyte of memory. Our high-end system is an IBM multiprocessor supercomputer with 42 4-way nodes, each CPU running at 375 MHz with 0.5 Gigabytes of memory; the peak speed is ¼ Teraflop. In phase two (18 months from now), the machine will be upgraded to 512 processors with a peak performance of 2 Teraflops. The visualization laboratory has the following resources:

One Onyx with 2 pipes of InfiniteReality2E graphics, 4 R12000 processors operating at 250 MHz with 8 Mbyte cache, 2 Gigabytes of RAM, 128 Mbytes of texture memory, and a 200 Gigabyte disk farm; eight of the 10 disks are striped pairwise for faster I/O.
1 Octane with 2 Gigabytes of RAM, MXE graphics, and a 27 Gigabyte disk.
3 Visual NTs (two with a 450 MHz CPU, one with a 550 MHz CPU), each with 1 Gigabyte of memory and 27 Gigabytes of disk.
1 O2 with 325 Mbytes of RAM and an R10000 at 180 MHz.
1 SGI Indigo2 Maximum Impact with one Gigabyte of memory and an O2 workstation with 1 Gigabyte of memory.
1 rear-projection 8'x16' powerwall, capable of stereographic display; it resides in the seminar room and is used together with two 24" monitors for visualization research, classroom activities, and presentations.

CSIT networking is a mix of technologies, with Ethernet dominating and FDDI taking a legacy role. CSIT inherited a 24-port FDDI-based Gigaswitch, which will remain in operation until the FDDI-based systems are retired. In addition, there are two Ethernet switches (one from SCRI and one from CSE) that are linked via gigabit Ethernet. One switch has 2 100BaseFX ports and 24 10/100 ports in addition to the 1000BaseSX port linking it to the second switch; the second switch has a second 1000BaseSX port supporting the Sun ES3500 server and 48 10/100 ports. A Cisco 7507 router connects the FDDI subnet, via a full-duplex interface, to the switched Ethernet subnet through a pair of full-duplex 100BaseTX Ethernet interfaces. The 7507 also has an ATM interface to the campus backbone operating in full duplex at 155 Mbps. The campus-wide backbone is a dual-technology network with both ATM cell and Ethernet frame switches. The campus network joins CSIT to the Computer Science Department Sun network, the Meteorology computer systems, the Physics Department clusters, the FSU/FAMU College of Engineering networks, the National High Magnetic Field Laboratory, the Center for Ocean-Atmospheric Prediction Studies, and Academic Computing and Network Services, as well as the rest of campus. The network provides access to regional, national, and worldwide networks including ESnet, Internet2, FIRN, and Floridanet.


Other Related Facilities - FSU Department of Physics Cluster

The FSU Physics Department maintains and operates a 57-node, dual-CPU cluster of Intel Pentium processors for computational physics, augmented with a 600 GByte file server, additional storage totaling 1.4 TBytes, a small interactive cluster, and a firewall. Both the Physics cluster and the CSIT Pentium cluster were designed and built by Drs. Dennis and Riccardi and their students.

Indiana University Computing Cluster

The IU cluster consists of 16 nodes with dual 800 MHz Intel Pentium III processors and two 45 GB hard drives each, for a total of 1.44 TBytes of disk storage. The nodes are connected to a 100 Mbit/sec switch, which is in turn connected to the campus 100 Mbit/sec backbone.