Collaboratory Testbed for Macromolecular Crystallography at SSRL
Peter Kuhn, Stanford Synchrotron Radiation Laboratory, [email protected]
SSRL is funded by the US Dept. of Energy and the National Institutes of Health
NIH-NCRR Advisory Panel Meeting, August 11, 2000
Agenda for the NCRR Collaboratory Advisory Meeting
• Overview, History, Evolution of the Collaboratory
• Demonstration of the Current Tools
• Introduction to the Assessment System (Frank Topper)
• Report on the previously prioritized success indicators
• Development of new success indicators
  – What is missing from the previous set
  – What are the global, evolutionary goals
  – What are the new development goals
• Coffee Break
• Ranking of new success indicators
• Collaboratory Software and its use at other synchrotrons and within other disciplines
  – Short report on current status
  – Maintenance vs. development; service vs. collaboration
  – What is needed to develop a Collaboratory environment
• Future Directions and the Interface with High-Throughput Data Collection and the Joint Center for Structural Genomics
The Collaboratory for Protein Crystallography
Goals
• Allow a team of researchers distributed anywhere in the world to perform a complete crystallographic experiment, from data collection to structure publication.
• Enhance productivity by allowing remote collaborators to participate in experimental choices at the beam line.
• Facilitate collaborative experiments in such areas as drug design and structural genomics.
• Fully utilize National resources for crystallographic experiments.
A Collaborative Research Environment
Synchrotron Source
Local Users
Compute Servers
Scientific Instruments
San Diego Supercomputer Center: Data Archive
Local/Remote Users
Video Feed
Web-based Data Viewer
File and Project Management
Data Reduction and Structure Analysis
Data Collection
Design Choices for Collaboratory Implementation
• Distributed architecture. Collaboratory services will be hosted by a large number of computers at the National labs, but this infrastructure will be transparent to the remote scientist, who will see an integrated view of the experiment through client software.
• Platform independence. The remote scientist will be able to run the client software on any widely available computer operating system, and Collaboratory servers will be designed to avoid obsolescence when computer hardware is upgraded.
• Network performance. “Thin” clients will make optimal use of the available network bandwidth.
• Secure access. Remote access will be via a secure channel, and users will be able to specify who can access the data.
• Crystallographic applications. A full suite of widely used crystallographic software will be made available to remote users through a Windows Terminal Server platform.
• Permanent archive. Raw data will be written to a 1000-Terabyte tape storage system at the San Diego Supercomputer Center. Permanent data access will permit more accurate structural analysis.
• Tiered approach
  – WWW applications for minimal access
  – Windows Terminal Server ICA environment for access to the full suite of X-ray software
  – BLU-ICE in native client-server mode for a full-performance environment
History of the Collaboratory
• Initial Proposal: March 1998
• Initiation of Funding: September 1998
• Staffing:
  – Nick Sauter and Limin Yang were hired in late 1998; both moved on in winter 1999, to LBNL and to a software company.
  – In fall 1999, significant responsibility for the Collaboratory was assumed by Timothy McPhillips (software design), Scott McPhillips (software engineering), and Peter Kuhn (scientific direction).
  – Thomas Eriksson joined in March 2000 as Systems Developer.
  – Fred Bertsch will join on August 14, 2000 as Scientific Software Developer.
  – An offer to a candidate for the lead-scientist position is currently being drafted, with an expected starting date of October 2000.
• Development Progress (planned)
  – 1998: Assessment of basic needs
  – 1999: Design, evaluation of needs, and testing of existing software
  – 2000: Implementation of standalone communication tools, networking of crystallographic software, and development of Collaboratory backbone
  – 2001: Beta-testing of the Collaboratory backbone
  – 2002: End of testing and launch of Collaboratory
• Development Progress (implemented)
  – 1998: Assessment of basic needs
  – 1999: Advisory Panel Meeting for definition of priorities and development plan; Diffraction Image Viewer as first web application; design and specification of BLU-ICE as the unified control and data collection environment; implementation of single OS environment at the beam lines.
  – 2000: Access to data collection, data analysis, image viewing, and office software via a single user account with fully integrated file access; BLU-ICE launched on BL9-2 (fall 1999) and full launch on all beam lines in November 2000; prototype developments of database implementations for data collection.
WWW-Diffraction Image Viewer
• Prototype Web-application
• Logon via authorized Unix Account
• Browse live data directory
• View JPG of diffraction images
• Zoom, Contrast controls
• Image viewer is now implemented at NSLS
Legacy X-Windows Applications Can Run Within the ICA Client without Modification
SGI Desktop at home lab
Citrix ICA Client Showing SGI Desktop at SSRL
Data Analysis Application Running at SSRL
Citrix ICA Client as an Example of a “Good” Thin Client
ICA client anywhere in the world
Modem or Internet, < 20 Kbps
Collaboratory Citrix server
X11 protocol over Gigabit LAN
Unix beam line computers and central CPU and file servers
Complete working environment
• Feels like a complete workstation in a window.
• Supports multiple graphical applications running simultaneously.
• User need only install the free ICA client.
High performance
• X11 performance and responsiveness in ICA session comparable to a local workstation.
Cross-platform
• Client available for all popular operating systems, including DOS, Windows, MacOS, Linux, and many flavors of Unix.
• Applications run identically on all client platforms.
Integrated with Client Computer
• Local file systems and printers on client computer are automatically accessible in ICA session.
Thin
• Does not take up significant CPU or memory resources on client machine.
• Only 20 kbit/sec of bandwidth needed for full performance
Robust
• Does not hang or cause client computer to crash.
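The 20 kbit/s figure above can be put in perspective with a rough calculation. The sketch below estimates how long it would take to move one raw diffraction image over such a link, which is the kind of transfer a thin client avoids by sending only screen updates. The 8 MB image size is a hypothetical value chosen for illustration, not a number from this presentation, and protocol overhead is ignored.

```python
def transfer_seconds(size_bytes: int, link_bits_per_s: float) -> float:
    """Time to move size_bytes over a link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_s


# Assumption: a hypothetical 8 MB raw image (not a figure from the slides).
RAW_IMAGE_BYTES = 8_000_000
# From the slide: an ICA session needs roughly 20 kbit/s.
ICA_LINK_BITS_PER_S = 20_000

seconds = transfer_seconds(RAW_IMAGE_BYTES, ICA_LINK_BITS_PER_S)
print(f"Raw image over a 20 kbit/s link: {seconds:.0f} s ({seconds / 60:.1f} min)")
```

Under these assumptions a single raw image would take close to an hour, so keeping the data and the applications at SSRL and shipping only the display is what makes a modem-speed connection usable.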
BLU-ICE – a unified data collection interface
• Insert movie here
Assessment of the Current Status of the Collaboratory
• Evolutionary History of the SSRL Collaboratory
• Previous success indicators ranked by importance
• Previous success indicators grouped by ‘theme’
• Unfulfilled success indicators
• Largest project: Archival System
• New success indicators
  – What are the global, evolutionary goals
    • What will be the future bottlenecks in SMB
    • What are the projects that need particular attention
    • What are the projects that benefit the most from a collaborative environment
  – What are the new development goals
    • Database environment for all system parameters to enter imgCIF; database will be made available and become an information source for http://biosync.sdsc.edu web-sites and SSRL internal web-sites
    • Integrated account system that enables individual user accounts and shared group accounts
• Grouping and prioritization of success indicators
Top 30 Success Indicators from 2/5/1999 Meeting
1.25 Ability to transfer data in and out of Collaboratory
1.27 24/7 availability
1.31 Rapid feedback to the user during the data collection run
1.44 Beam line safety
1.47 Ease of use; friendliness of user environment
1.53 Security and reliability (free of malicious and accidental interruption)
1.53 Responsiveness to user suggestions
1.57 Increased percentage of successful experiments
1.63 User-friendly interface for camera and beam line motion control
1.63 Database of methods, tutorials, and example files
1.63 Responsive, high-speed user interface at remote (worldwide) locations
1.63 Rapid processing of CPU-intensive jobs
1.81 Early characterization of user needs and wishes
1.86 Reduced time from data collection to structure solution
1.87 Availability and quality of training: safety, hardware, crystallography methods, software
1.88 Ability of researcher at the home lab to monitor & participate in real-time strategic decisions at the beam line
1.88 Access to computer resources from remote locations once the data collection run has ended
2.06 Availability of complete toolkit for solving X-ray structure
2.19 Permanent archiving of data
2.20 Compliance with IUCr standards
2.25 24/7 user support
2.27 Ability of researchers at multiple locations applications on screen, e.g., molecular modeling.
2.31 Turn-key operation instead of traditional methods
2.40 Increased throughput: number of user groups and number of data frames collected
2.50 Availability of all legacy applications for solving X-ray structure
2.53 Scalability and ability of other synchrotron sources to use Collaboratory model
2.53 Willingness of users to collaborate & involve more researchers on a given project
2.59 High resolution video feed to monitor microscope and goniometer
2.67 Beam line control from remote location
Grouped success indicators – 1
• Capabilities
  Remote control & video presence:
  • High resolution video feed to monitor microscope and goniometer
    – under development
  • Beam line control from remote location
    – available now through the Citrix ICA client and on native platforms Q1 2001
    – currently developing the security protocols needed to allow remote access
  • Ability of researcher at the home lab to monitor & participate in real-time strategic decisions at the beam line
    – current tools allow test users to experiment with this capability, WWW-image viewer gives all users access to their data
  • User-friendly interface for detector and beam line motion control
    – available now at beam line 9-2 with BLU-ICE
  Data processing:
  • Availability of complete tool kit for solving X-ray structure
    – MAD structures are routinely solved at BL9-2; not yet implemented in collaborative way
    – It will require a larger effort to integrate software from different sources
  • Access to computer resources from remote locations once the data collection run has ended
    – Available now for test users, but requirements are not yet defined for larger scale
  • Access to sufficient compute resources for rapid data processing during the experiment
    – Part of ‘regular’ user operations; three 4-processor 667MHz systems will support the 5 beam lines from Nov
  • Permanent archiving of data
    – all image data will be in imgCIF format from Nov 2000; archival under development
  • Ability to transfer data in and out of Collaboratory (see archival)
  • Turn-key operation instead of traditional methods
    – 120 second movie shows impact of advanced instrumentation and software environments. Still developing methods for rapid determination of ‘best’ energies for MAD and general data collection strategies
  • Database of methods, tutorials, and example files
    – To be developed
Grouped success indicators - 2
• Accessibility
  Availability:
  • 24/7 availability
    – Test systems are ‘standalone’, production systems will include multi-system failover environment
    – Distributed control system has built in ‘watch-dog’ and other safety features that enhance high uptime
    – Software engineering principles result in high robustness
  Security and reliability:
  • Free of malicious and accidental interruption
    – implemented basic security precautions for all remote access avenues
  • Beam line safety
    – X-ray accidents are not possible because regular hutch safety protocols are never circumvented.
  Responsiveness:
  • Responsive, high-speed user interface at remote locations
    – Tested from Singapore, Hong Kong, Erice (Italy), and numerous places in the US
  • Rapid processing of CPU-intensive jobs
    – Adequate CPU power for current use but expansion of capabilities required for post-experimental access
  • 24/7 user support
    – Support at the beam lines is 24/7, extension to remote users is under study
    – Collaboratory development has triggered equipping all support staff with cell phones and high-speed internet connections to home locations
  • Compliance with IUCr standards
    – Compliance with mmCIF standards and SDSC archive formats; mmCIF will be used from Nov 2000
Grouped success indicators - 3
• User Experience
  Rapid feedback to the user during data collection
  • Users generally process their data in real-time at the beam line. Test users process data remotely.
  Ease of use; friendliness of user environment
  • User feedback has been positive
  Responsiveness to user suggestions
  • Software upgrades and system improvements based on user feedback, BLU-ICE for data collection was initiated in mid-1999, developed with user feedback, launched in Nov 1999, revised with user feedback and will control all beam lines by Nov 2000
  Early characterization of user needs and wishes
  • See above
  Availability and quality of training: safety, hardware, crystallography methods, software
  • Expansion of smb.slac.stanford.edu web pages; the remote collaboration tools will be documented in full when they become available; enhanced scientific support through additional support from NIGMS; SMB School2000 in September 2000.
• Scientific Progress
  Increased percentage of successful experiments
  Reduced time from data collection to structure solution
  Increased throughput; number of user groups and data frames collected
Current Plans from Previous Success Indicators
• Capabilities
  Remote control & video presence:
  • High resolution video feed to monitor microscope and goniometer
    – under development
  • Ability of researcher at the home lab to monitor & participate in real-time strategic decisions at the beam line
    – full collaborative environment under development; BLU-ICE client server to include data reduction
  Data processing:
  • Availability of complete tool kit for solving X-ray structure
    – MAD structures are routinely solved at BL9-2; not yet implemented in collaborative way
    – Specialized RUNS window within BLU-ICE that allows auto-selection of MAD energies, ultra-high resolution strategies, integration of Kappa strategy
  • Permanent archiving of data
    – under development; highest demand project because it carries responsibility for the data
  • Ability to transfer data in and out of Collaboratory (see archival)
  • Database of methods, tutorials, and example files
    – To be developed; SMB Team is collaborating with outside groups
A Distributed Architecture for Data Archiving
Data Collection Software
Collaboratory File Browser
Unix Command Line Interface
SSRL Data Archive Server
SSRL Data Archive Database
Storage Resource Broker (SRB) at SDSC
HPSS at SDSC
SSRL RAID System
Hard Disk at Home Lab
Web Browser Interface
Metadata for Diffraction Images
Image File Header
Thumbnail View
File Parameters
• Creation date
• Access control list
• Tape archive status
• User annotation
• Annotation by data processing software
• Move, rename, and copy tracking
Larger JPEG View
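The file parameters listed above could be held in a per-image record along the following lines. This is only a sketch: the class name `ImageRecord`, its field names, and the example paths and user names are hypothetical, and nothing here reflects the actual SSRL archive schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Set, Tuple


@dataclass
class ImageRecord:
    """Hypothetical archive record for one diffraction image."""
    path: str
    created: datetime                                         # creation date
    acl: Set[str] = field(default_factory=set)                # access control list
    on_tape: bool = False                                     # tape archive status
    user_notes: List[str] = field(default_factory=list)       # user annotation
    software_notes: List[str] = field(default_factory=list)   # annotation by processing software
    history: List[Tuple[str, str, str]] = field(default_factory=list)  # (operation, old, new)

    def can_read(self, user: str) -> bool:
        """Only users named in the access control list may read the image."""
        return user in self.acl

    def rename(self, new_path: str) -> None:
        """Rename in place and record the change for tracking."""
        self.history.append(("rename", self.path, new_path))
        self.path = new_path

    def copy_to(self, new_path: str) -> "ImageRecord":
        """Return a new record for the copy; the copy inherits the tracking history."""
        return ImageRecord(
            path=new_path,
            created=self.created,
            acl=set(self.acl),
            on_tape=False,  # assumption: a fresh copy has not been archived yet
            user_notes=list(self.user_notes),
            software_notes=list(self.software_notes),
            history=self.history + [("copy", self.path, new_path)],
        )


rec = ImageRecord("run1/img_001.img", datetime(2000, 8, 11), acl={"alice"})
rec.user_notes.append("ice rings near the edge")
rec.rename("run1/lysozyme_001.img")
dup = rec.copy_to("backup/lysozyme_001.img")
print(rec.can_read("alice"), rec.can_read("bob"), len(dup.history))
```

Keeping the move, rename, and copy history on the record itself means a file can still be traced back to the original data collection run after it has been reorganized.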
Prototype GUI for Archive Access
New Success Indicators
• What are the global, evolutionary goals, how do we define scientific progress
  – Increased percentage of successful experiments
  – Reduced time from data collection to structure solution
  – Increased throughput; number of user groups and data frames collected
  – What will be the future bottlenecks in SMB
  – What are the projects that need particular attention
  – What are the projects that benefit the most from a collaborative environment
• What are the new development goals
  – Database environment for all system parameters to enter imgCIF; database will be made available and become an information source for http://biosync.sdsc.edu web-sites and SSRL internal web-sites
  – Integrated account system that enables individual user accounts and shared group accounts for data sharing and project separation as needed
  – New WWW-Diffraction data viewer, additional WWW tools
  – Integration with the scheduling process
  – Integration with the proposal review process
• Closing the chapter of BLU-ICE
  – Next revision as full production version
  – Only additional modules but no new developments
Future Plans and Directions
• Collaboratory Software and its use at other synchrotrons and within other disciplines
  – XAS-Collaboratory
  – Short report on current status
    • SRRC, Spring8
    • Canadian Light Source
    • Implementation of BLU-ICE on rotating anode systems
    • ALS MCF BL5.0.
    • ALS superbending magnet beam lines
  – Maintenance vs. development; service vs. collaboration
    • SRRC MoU for multi-year collaboration
  – What is needed to develop a Collaboratory environment
• Future Directions and the Interface with High-Throughput Data Collection and the Joint Center for Structural Genomics
  – Overview of JCSG
  – Overview of ASAP
ASAP - Automated Structural Analysis of Proteins
ASAP Software Architecture
User Interface Layer
• Operator Interface: allows monitoring and control of ASAP operations
• Structural Genomics Interface: allows monitoring and interfacing with other cores
Main Control Layer
• Processor: makes scientific decisions and allocates resources
• Database: maintains system state
Agent Layer
• Crystal Characterization Agent(s)
• Data Collection Agent(s)
• Data Reduction Agent(s)
• Structure Determination Agent(s)
• Tracing & Model Building Agent(s)
• Refinement Agent(s)
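The division of labor in the architecture above (a processor that decides what runs next, a database that holds system state, and one agent per task) might be sketched as follows. Everything in this block is an illustrative assumption: the class names, the strictly linear stage order, and the idea that an agent simply reports success or failure are not taken from the ASAP design itself.

```python
from typing import Callable, Dict, List, Tuple

# Stage names follow the agent boxes on the slide; the linear order is an assumption.
STAGES: List[str] = [
    "crystal characterization",
    "data collection",
    "data reduction",
    "structure determination",
    "tracing & model building",
    "refinement",
]

Agent = Callable[[str], bool]  # takes a sample id, returns True on success


class Database:
    """Maintains system state: index of the next stage for each sample."""

    def __init__(self) -> None:
        self.next_stage: Dict[str, int] = {}


class Processor:
    """Decides which agent runs next and records the outcome in the database."""

    def __init__(self, db: Database, agents: Dict[str, Agent]) -> None:
        self.db = db
        self.agents = agents

    def step(self, sample: str) -> str:
        i = self.db.next_stage.get(sample, 0)
        if i >= len(STAGES):
            return "done"
        stage = STAGES[i]
        if self.agents[stage](sample):
            self.db.next_stage[sample] = i + 1  # advance only on success
            return stage
        return "failed: " + stage

    def run(self, sample: str) -> List[str]:
        log: List[str] = []
        while True:
            result = self.step(sample)
            log.append(result)
            if result == "done" or result.startswith("failed"):
                return log


calls: List[Tuple[str, str]] = []


def make_agent(name: str) -> Agent:
    """Build a stand-in agent that records its invocation and always succeeds."""
    def agent(sample: str) -> bool:
        calls.append((name, sample))
        return True
    return agent


db = Database()
processor = Processor(db, {name: make_agent(name) for name in STAGES})
log = processor.run("sample-1")
print(log)
```

Because progress lives in the database rather than in the agents, a failed stage can be retried later by calling `run` again without repeating the stages that already succeeded.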
ASAP Agents, Resources, and Services
Layers: Agent Layer, Resource Broker Layer, Service Layer
Diagram labels:
• Agent
• Beamline Control Broker (DCS)
• File Storage Resource Broker
• Compute Resource Broker
• Robot, CCD, Beam Line Motors
• HPSS Tape Archive, RAID
• SSRL Computers, SDSC Computers
• Third Party Software
• Function Libraries