Simplified Experiment
[Diagram: the experiment cycle – Submit Proposal → Do Expt → Data Analysis → Feedback → Results → Excited Users – supported by Beamline Control with GDA, Automation Software, Data Storage, a Local Cluster, High Performance Computing, Data Analysis, and Data Visualization Portals and Facility Interfaces, all for the Authenticated and Authorized User.]
Diamond Overall Requirements
0) Users are uniquely identified and should need to log in once only for all aspects of the experiment.
1) Users can move from beamline to beamline as easily as possible, so a common scripting environment is necessary.
2) Remote access, including role-based access control.
3) Data migration is automatic from beamlines to an externally accessible repository.
4) Data evaluation and reduction as close to online as possible.
5) Integration of data reduction and analysis workflows.
6) Metadata in files sufficient for data analysis.
7) Ability to perform science-specific analysis/acquisition.
8) Seamless access to remote large computing resources.
9) Continuous Integration and User Acceptance Testing.
Single Sign On
1. The aim of this project was to provide a mechanism for uniquely identifying users of UK large scientific facilities irrespective of their method of access.
2. All users of the major facilities will need only one username/password combination to access any of the facilities.
3. These credentials or an automatically generated certificate or token will allow access to any computing technology given the correct authorization.
4. The authorization will be performed locally by the facility involved based on the single unique identifier derived from 1-3.
5. Normally we use either CAS (originally Yale, now JASIG) or MyProxy to perform user authentication – http://www.ja-sig.org/products/cas/index.html
6. A Java web service filter uses the authenticated user name with Active Directory and/or a local LDAP to determine the user's roles.
7. Partners: STFC, e-Science, SRS, ISIS, Diamond.
8. Users can now reset their own passwords using a “bank-type” web application.
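To make point 5 concrete, the sketch below shows how a service can recover the authenticated username from the XML that a CAS server returns from its `serviceValidate` endpoint. The response format follows the CAS 2.0 protocol; the sample username is invented, and a real client would first fetch the XML over HTTPS.

```python
# Minimal sketch of CAS 2.0 service-ticket validation.
# A real client would issue:
#   GET https://<cas-host>/cas/serviceValidate?service=<url>&ticket=<ST>
# and parse the XML response to recover the authenticated username.
import xml.etree.ElementTree as ET

CAS_NS = {"cas": "http://www.yale.edu/tp/cas"}

def extract_cas_user(response_xml: str):
    """Return the authenticated username, or None on failure."""
    root = ET.fromstring(response_xml)
    success = root.find("cas:authenticationSuccess", CAS_NS)
    if success is None:
        return None
    user = success.find("cas:user", CAS_NS)
    return user.text if user is not None else None

# Example success response in the shape defined by the CAS 2.0 protocol
# (the username is invented):
SAMPLE = """<cas:serviceResponse xmlns:cas="http://www.yale.edu/tp/cas">
  <cas:authenticationSuccess>
    <cas:user>fed12345</cas:user>
  </cas:authenticationSuccess>
</cas:serviceResponse>"""

print(extract_cas_user(SAMPLE))  # -> fed12345
```

A failure response carries a `cas:authenticationFailure` element instead, so the same function returns `None` and the caller can deny access.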
Data Analysis Framework
The central concept is to allow data processing to proceed in a series of discrete steps, with a decision process possible between each. Typically the overview of this data processing pipeline would be a UML modelling diagram or, more commonly, a flow diagram. The advantages of separating the data analysis into a discrete sequence of steps are:
1. The processing programs themselves may be step based.
2. The programs may be available in binary only for a particular computer architecture.
3. The programs may be distributed over different machines, particularly should their processing requirements be large.
4. Assuming that Single Sign On (SSO) is functioning, it should be practical to distribute this processing to GRID resources such as SCARF or HPCx and avoid the necessity to enter authentication at every step.
5. Diamond now has 200 TB of short-term storage and a cluster using a Lustre interconnect; jobs are scheduled with Sun Grid Engine.
6. It is possible to use the decision process to prescribe different processing branches depending on the results of a particular sequence step.
7. Potentially large numbers of processing steps can be automated and performed without user intervention.
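The pipeline idea above – discrete steps with a decision between each – can be sketched as pairs of (step, decide) functions. The step names, the background-subtraction example and the clipping decision below are all invented for illustration; they are not Diamond's actual processing software.

```python
# Sketch of a step-based pipeline with a decision process between steps.
# Each element is a (step_fn, decide_fn) pair: the step transforms the
# data, then the decision function inspects the result and may redirect
# it down a corrective branch before the next step runs.
def run_pipeline(data, steps):
    for step, decide in steps:
        data = decide(step(data))
    return data

# Illustrative steps: subtract a background, then normalise the counts.
reduce_step = lambda counts: [c - 10 for c in counts]        # background subtraction
clip_negatives = lambda counts: [max(c, 0) for c in counts]  # decision: prune unphysical values
normalise = lambda counts: [c / max(sum(counts), 1) for c in counts]

pipeline = [
    (reduce_step, clip_negatives),  # decision step branches on negative results
    (normalise, lambda x: x),       # no decision needed after normalising
]

result = run_pipeline([12, 30, 5], pipeline)
```

Because each step is just a function, individual steps can wrap binary-only programs or remote submissions (e.g. to a batch scheduler) without changing the pipeline driver.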
Data Processing Cycle
[Diagram: the data processing cycle connects Beamline Control with GDA and Automation Software to Data Storage, a Local Cluster, High Performance Computing, Data Analysis, and Data Visualization Portals and Facility Interfaces for the Authenticated and Authorized User.]
Automatic Authentication and Authorization
[Diagram: beamlines at Diamond, ESRF and Soleil. Each beamline has user PCs and Linux instrument control systems connected over 1000BaseT or faster networks and wireless access points to local processing (a cluster) and a disk array, and onward to a central data store and central CPU cluster. The user authenticates once only.]
Current Position
Experimental Data Flow
Web Site Implementation CAS
[Diagram: web authentication service. Step 1: users authenticate to the CAS server (Tomcat) using username/password, SPNEGO or Kerberos. Step 2: applications (Apache 2/JSP server) validate the returned ticket against the CAS server. Step 3: authorization is resolved against Active Directory and LDAP.]
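The final authorization step maps the authenticated username to roles held in Active Directory/LDAP. The sketch below illustrates only that lookup logic; the in-memory dictionary stands in for the directory, and all usernames and role names are invented.

```python
# Sketch of role-based authorization after CAS authentication.
# An in-memory dict stands in for Active Directory/LDAP; in production
# the roles would come from directory group memberships.
DIRECTORY = {
    "fed12345": {"beamline-user", "proposal-viewer"},
    "fed67890": {"beamline-staff", "scheduler"},
}

def authorize(username, required_role):
    """Grant access only if the directory lists the required role for the user."""
    roles = DIRECTORY.get(username, set())
    return required_role in roles

print(authorize("fed12345", "proposal-viewer"))  # True
print(authorize("fed12345", "scheduler"))        # False
```

Keeping authorization local to each facility, as described earlier, means only this lookup changes between sites while the CAS authentication step stays common.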
Administration System 1 - Duodesk
• Authentication using Active Directory (VAS, CAS)
• Authorization based on Active Directory roles
• Create new or edit existing proposals
• Create/edit user and establishment details
• Modify or add proposal details, such as participants, or change beamlines
• Allocate time
• Schedule proposals to beamlines
• Administer users on site – travel and subsistence, production of HID authorization entry cards
• Data automatically extracted to set up beamline accounts
• Database information automatically incorporated into files acquired on the associated beamlines and placed into ICAT.
Administration System 2 - Duodesk
• Authentication using Active Directory (VAS, CAS)
• Authorization based on Active Directory roles
• Administration views for health, safety and radiation protection
• Administration views for review panels
• Administration views for beamline staff (soon)
• Administration views for goods inward and Experimental Hall Coordinators
• Interaction with ISPyB
Technologies - Duops and Duodesk
Duops II, first release Q3 2009:
• Spring MVC and Hibernate
• Allows upload of PDF proposals and science cases
• Upload/download of many Excel sample sheets
• Web interface for sample input
• Currently two-level CAS: 1) fedid 2) email
• Foreseen three-level CAS: 1) fedid 2) OpenID 3) email
Duodesk:
• Initially a collaboration with ESRF, but now with large modifications
• Initially EJB2, now mainly EJB3, Struts 2, Oracle, Eclipse and JBoss
• Allows upload of PDF proposals, science cases and Excel files
• Increasing integration with Magnolia (user self-registration, report generation and claims for Q3 2009)
• Probable user self-scheduling
Nexus at Diamond
• Installed by default on all beamlines in GDA version 7.14
• Will be used from October 2009 on beamlines I12 (JEEP: Joint Engineering, Environmental and Processing) and I20 (LOLA: X-ray spectroscopy)
• Already a significant number of applications for browsing the created Nexus files. (see next slide)
• New applications written by the Data Analysis team are or will be able to read and write Nexus files.
• Practical experience indicates that Nexus needs to be enhanced to support area detectors for MX at least.
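To show the file layout that the browsing applications above navigate, the sketch below models a minimal NeXus tree with plain dictionaries (no HDF5 library required). The group names and `NX_class` attributes follow the standard NeXus convention (NXentry/NXinstrument/NXdetector/NXdata); the detector counts are invented.

```python
# Illustrative model of a minimal NeXus hierarchy using plain dicts.
# Real NeXus files are HDF5; only the group layout and NX_class
# attributes shown here follow the NeXus convention.
def nx_group(nx_class, **children):
    """Build a group carrying its NX_class attribute plus child entries."""
    return {"@NX_class": nx_class, **children}

nexus_file = nx_group(
    "NXroot",
    entry=nx_group(
        "NXentry",
        instrument=nx_group(
            "NXinstrument",
            detector=nx_group("NXdetector", data=[101, 99, 240]),  # invented counts
        ),
        data=nx_group("NXdata", counts=[101, 99, 240]),
    ),
)

# Walk the path /entry/instrument/detector as a NeXus browser would:
det = nexus_file["entry"]["instrument"]["detector"]
print(det["@NX_class"], det["data"])  # NXdetector [101, 99, 240]
```

The point about area detectors for MX is visible here: the `data` field would need to hold large multi-dimensional image stacks, which is where the format required enhancement.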
Nexus browser integrated into Generic Data Acquisition system