Status Report of the RD45 Project
Based on LCB Review, November 1999
Overview
• Initial Goals of the Project
• Summary of April 1998 LCB Review
• Work on Milestones
• Evolution of O(R)DBMS market
• Risk Analysis: Summary and Conclusions
• Future Activities
• Summary
RD45: Introduction
Proposed in 1994; approved early 1995
Goal: provide persistency for LHC data
• Objects: event, calibrations, histograms, etc.
Assumed: O-O environment, C++ (initially)
• Now also Java, including interoperability issues
Emphasis: standards (OMG, ODMG, ...), potential use of commercial solutions
Anticipated 3 project phases:
• Requirements gathering, limited prototyping
• Detailed prototyping and evaluation
• Implementation
Guiding Principles
“In particular, the data should be presented in as consistent a way as possible. The data themselves may be stored in a variety of formats but this should be hidden from the user…”
“The ODMG ... binding is based on one fundamental principle: the programmer should perceive the binding as a single language for expressing both database and programming operations, not two separate languages with arbitrary boundaries between them.”
Capability of scaling to LHC data volumes & rates
Capable of satisfying wide variety of HEP needs
• DAQ, SIM, REC, Analysis, ...
Use of “standard”, widely-used solutions if applicable
CMS
Phase 2 - Milestones
Impact of using an ODBMS
• Object model, physical data organisation, use of CASE tools, 3rd party class libraries, C++ application code
Evaluation of ODBMS features & suitability for HEP
• Schema Evolution, Object Versioning, Data Replication
Performance comparisons with existing solutions
• PAW + Ntuples
Use of ODBMS for typical simulation, reconstruction and analysis scenarios with data volumes of up to 1TB.
Impact of ODBMS on end-user physicist. (Including private schema & collections for simulation, reconstruction and analysis.)
Demonstrate the feasibility of using an ODBMS and MSS at data rates sufficient for ATLAS and CMS 1997 test-beam requirements.
Milestone years (slide margin labels): 1996, 1997
LCB Review, April 1998
“The project has achieved the initial R&D goal of investigating and identifying potential solutions to the problem of persistent data storage for LHC experiments.”
“The proposed solution: ODBMS (Objectivity/DB) is now adopted for data persistency not only by all the LHC experiments but by many others (BaBar, NA45, COMPASS, RHIC) ready to take data in 1-2 years.”
No longer valid
(except ALICE)
Milestones (April 98)
Provide, together with the IT/PDP group, production data management services based on Objectivity/DB and HPSS with sufficient capacity to solve the requirements of ATLAS and CMS test beam and simulation needs, COMPASS and NA45 tests for their '99 data taking runs.
Develop and provide appropriate database administration tools, (meta-)data browsers and data import/export facilities, as required for
Develop and provide production versions of the HepODBMS class libraries, including reference and end-user guides.
Continue R&D, based on input and use cases from the LHC collaborations to produce results in time for the next versions of the collaborations' Computing Technical Proposals (end 1999).
Milestones - Results
Production servers set up; used for CDR and other activities. Milestones: ATLAS: 1TB, CMS: 100MB/s. COMPASS, CHORUS, NA45, others, ...
Federated DB backup tool developed (based on multiple FDs); numerous DB browsers (CERN DRO_Tool, SLAC BDB, Micram Hudson, …); DB import/export based on SLAC model
New release of HepODBMS including scalable event collections + import of BaBar “conditions DB”. Revised user doc (XML) + ref. manual (DOC++), CSC tutorials + examples
R&D activities:
• Database usage over a wide area network
• Clustering and re-clustering strategies
• Multi-user, multi-federation issues
• Database integration with MSS
MONARC
Examples follow...
Multi-user, Multi-FD Issues
Multiple FDs used mainly to work around limitations in Objectivity/DB, e.g.
• lock contention on global resources (catalogue), e.g. online / offline systems (BaBar, CMS, …)
• lack of private schema/catalogue, e.g. user schema / data
• lack of security (see Objy V5.2)
• lack of support of partial backups (described above)
One of the main issues for the meeting with Objy in late Feb
Multi-FD Example: CMS
[Diagram labels: Online (Prod FD, Prod Boot, RunDB, LogDB, BConfDB); Clone FD; Offline - cmsc01 (Prod FD, Prod Boot, RunDB, LogDB, BConfDB); DB1, DB2, DB3, DBn]
Multiple FDs & User Data
Production FD cloned by users
Users can add private data / schema
Can share data
Scalability?
[Diagram labels: Prod FD (Prod Boot; DB1, DB2, DB3, DBn); Clone FD; User1 FD (User1 Boot; U1DB1, U1DB2); User2 FD (User2 Boot; U2DB1, U2DB2); Private Schema; Objects using new Schema]
Approach also used by other large Objy users, e.g. COMPASS, Space telescope
Federation Backup Procedure
Is production FD in a consistent state?
Copy all relevant DB files
Install on backup FD
Check consistency
Copy to tape
[Diagram: 1. FD Copy & Test; 2. Copy to Tape. Labels: Production Lock Server; Fallback Lock Server; Disk Server A; Federation Catalogue & Schema; DB1, DB2, DB3; DB5, DB6, DB7; HPSS Controlled Tapes (DB8, DB9, DB10); Backup Tapes; Backup Tapes 1.7.99]
Partial backup procedure high on wish-list of Objy customers (ETF)
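The five backup steps listed on this slide can be sketched roughly as follows. This is only an illustration at plain file level, not the RD45 backup tool and not an Objectivity/DB API: the function names, the `is_consistent` callback, the use of SHA-256 checksums as the "consistency check", and the tar archive standing in for tape are all assumptions made for the example.

```python
import hashlib
import shutil
import tarfile
from pathlib import Path
from typing import Callable, Iterable


def sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_federation(
    db_files: Iterable[Path],
    staging_dir: Path,
    archive_path: Path,
    is_consistent: Callable[[], bool],
) -> Path:
    """Hypothetical file-level sketch of the five steps on the slide."""
    # Step 1: ask whether the production FD is in a consistent state.
    # How this is decided is an assumption left to the caller.
    if not is_consistent():
        raise RuntimeError("production federation is not in a consistent state")

    staging_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for source in db_files:
        # Steps 2 and 3: copy each relevant file into the backup (staging) area.
        # Files are assumed to have unique names.
        target = staging_dir / source.name
        shutil.copy2(source, target)
        # Step 4: a checksum comparison stands in for the consistency check.
        if sha256(source) != sha256(target):
            raise RuntimeError(f"copy of {source.name} does not match the original")
        staged.append(target)

    # Step 5: a tar archive stands in for the copy to tape.
    with tarfile.open(archive_path, "w") as archive:
        for target in staged:
            archive.add(target, arcname=target.name)
    return archive_path
```

A real procedure would check consistency at database level rather than by file checksum, and would write to a mass storage system instead of a local archive. The sketch only shows the ordering: nothing is copied to "tape" until the staged copy has been verified.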
Database Production Service - What is missing?
Transparent non-blocking interface with MSS
User capability to:
• export, extract, replicate data and schema
• manipulate data and schema outside production database and while accessing data and schema from production database
Fully functional, reliable high-quality database system including
• VLDB support (>>1PB)
• management tools
From L. Silvestris: Review of application software services for the LHC era, FOCUS 07/10/99
Objy V5.2
Objy V6?
BaBar
O(R)DBMS Evolution
From CMS Computing Technical Proposal:
“If the ODBMS industry flourishes it is very likely that by 2005 CMS will be able to obtain products, embodying thousands of man-years of work, that are well matched to its worldwide data management and access needs. The cost of such products to CMS will be equivalent to at most a few man-years. We believe that the ODBMS industry and the corresponding market are likely to flourish. However, if this is not the case, a decision will have to be made in approximately the year 2000 to devote some tens of man-years of effort to the development of a less satisfactory data management system for the LHC experiments.”
[Diagram comparing ODBMS / RDBMS / ORDBMS; labels as extracted:]
RDBMS + “object extensions”
Can store ADTs
“Methods” on server
Complex Data with Queries
$8B in 1996
Likely to become dominant DBMS technology
Complex Data
Performance, scalability
Tight Language Binding
OQL - SQL3 query subset
Growth similar to RDBMS in ’80s
~$1B market by 2001
~$100M?
Risk Analysis: Issues
Choice of Technology
• ODBMS, ORDBMS, RDBMS, light-weight POM, files + meta-data etc.
Choice of Vendor
• #1 Objectivity, #2 Versant
The Home-Grown approach
• Estimate resources required
• Implies proof-of-concept prototype
Versant
Risk Analysis: Summary of Options
Evaluate C++ binding to e.g. ORACLE
Add ESCROW clause to Objectivity contract
Pursue possibility of source license
Visit key Objectivity customers
Produce new requirements list
Estimate manpower to support Objy in house
Estimate manpower for “clean-sheet” solution
Continue to monitor alternatives
The LCB agrees with the other suggested steps to mitigate risk, with the addition of trying to ensure that user code in reconstruction and analysis programs is kept as standards compliant as possible.
Risk Analysis: Conclusions
A solution is certainly possible!
How much should we align ourselves with industry trends / standards?
ODBMS unlikely to dominate DBMS market
• Likely to survive foreseeable future - market!
Need to complete current prototype to make meaningful manpower estimates
• Target: end-1999; present at this workshop!
Future Activities
Production Services
• Considered essential by several experiments
• Tools, documentation, regular releases, … general production level support
• Push for VLDB and other enhancements
“2001” milestone
• Revise requirements
• Visit other HEP labs (BNL, FNAL, SLAC, …)
• Provide ODBMS-independent s/w layer
• Estimate man-power for alternative POM
• Evaluate ORDBMS technology
Feb. meeting at Objy
Summary (+)
We have a good understanding of ODBMS technology & Objectivity/DB in particular
System has been demonstrated to work in production up to level of today’s (BaBar) experiments
Many enhancements have been delivered, others in pipeline
Production experience will be invaluable for LHC (product enhancements, tools, etc.)
Summary (-)
The ODBMS market has not taken off as was previously predicted
We need to assure ourselves that there is sufficient non-HEP demand (and $$$)
We need to (in any case) understand how an eventual migration could be handled
We need to develop at least one realistic fallback scenario
Conclusions
R&D phase of RD45 has now led to production ODBMS services
Risks of current strategy well understood - risk management must continue
We are well placed to prepare for “2001 milestone”
Future focus: Production Road-map to 2001 and beyond