Jens G Jensen
Atlas Petabyte store
Supporting Multiple Interfaces to Mass Storage
Providing Tape and Mass Storage to Diverse Scientific Communities
Requirements
Data Management
Locate data
Retrieve data
Requirements
Analyse data on the Grid
Dual access – Grid and non-Grid
Requirements
Curation
Long-term storage and archival
Requirements
Very large volumes
Very large real-time data rates
Requirements
Recover data in emergency
= backup
What is Atlas?
• 1.2 Petabyte capacity
• Tape
• STK Powderhorn
• Interfaces, services
• New robot coming in
We support – Grid protocols
• SRB
– Data management interface
– Metadata, data
• SRM
– Built for very large data volumes
– Very high transfer rates via GridFTP
What is the Grid?
Distributed computing
Distributed collaborations and Virtual Organisations
What is the Grid?
Access is brokered
job
Data is replicated
What is the Grid?
Well-defined protocols (sort of)
• File access
• Information providers
• Job submission
• Security
Grid Architecture, SRB
Atlas
SRB
Scientist
Grid Architecture, SRB
Atlas
Local SRB
Scientist
Group files into container
Slow network
Fast network
Store container
Remote SRB
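
The "group files into container" step on this slide can be illustrated with a minimal sketch. It assumes a plain tar archive as a stand-in for an SRB container; it does not use the SRB client API, and the names `pack_container` and `list_container` are hypothetical. The point is only that many small files become one object before being stored.

```python
import io
import tarfile


def pack_container(files):
    """Pack a {name: bytes} mapping into a single in-memory tar container.

    Assumption: a tar archive stands in for an SRB container here.
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as container:
        # Sort names so the container layout is deterministic.
        for name, payload in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            container.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def list_container(blob):
    """Return the member names held in a container built by pack_container."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as container:
        return container.getnames()


if __name__ == "__main__":
    blob = pack_container({"run2.dat": b"second", "run1.dat": b"first"})
    print(list_container(blob))
```

Storing one container instead of many small files means the mass store handles a single large object, which is the motivation the slide suggests.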
Grid Architecture, SRM
Storage Element (SRM)
File Transfer Service
Replica Manager
Replica Catalogue
Application
Information Services
Components fit together to provide Grid services
Atlas
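
As a rough illustration of how a replica catalogue and a replica manager might fit together, the sketch below maps a logical file name to the storage URLs that hold copies of it, then picks one. Everything in it is an assumption made for illustration: the class `ReplicaCatalogue`, the function `choose_replica`, the host names and the `srm://` URLs are hypothetical and are not taken from the slides or from any Grid middleware API.

```python
from urllib.parse import urlparse


class ReplicaCatalogue:
    """Hypothetical in-memory catalogue: logical file name -> replica URLs."""

    def __init__(self):
        self._replicas = {}

    def register(self, logical_name, storage_url):
        """Record that a copy of logical_name exists at storage_url."""
        self._replicas.setdefault(logical_name, []).append(storage_url)

    def lookup(self, logical_name):
        """Return all known replica URLs for logical_name (possibly empty)."""
        return list(self._replicas.get(logical_name, []))


def choose_replica(catalogue, logical_name, preferred_host):
    """Pick a replica on preferred_host if there is one, else the first known."""
    replicas = catalogue.lookup(logical_name)
    if not replicas:
        raise LookupError(f"no replica registered for {logical_name}")
    for url in replicas:
        if urlparse(url).hostname == preferred_host:
            return url
    return replicas[0]


if __name__ == "__main__":
    catalogue = ReplicaCatalogue()
    catalogue.register("lfn:/demo/run1.dat", "srm://se-a.example.org/data/run1.dat")
    catalogue.register("lfn:/demo/run1.dat", "srm://se-b.example.org/data/run1.dat")
    print(choose_replica(catalogue, "lfn:/demo/run1.dat", "se-b.example.org"))
```

In a real deployment the chosen URL would be handed to a file transfer service; that step is left out of this sketch.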
We support – non-Grid protocols
Tape
Disk cache
vtp, rfio, dcap,…
Who are the customers
GridPP Tier 1
CCLRC facilities
Research councils
e-Science projects
Who are the customers
Small – a few gigabytes
To large – hundreds of Terabytes
Different customers drive different areas of service
Be all things to all people?
Community
User group meetings
Helpdesk
How to tie the community together?
Conclusions
• Supporting multiple communities via multiple interfaces
– Grid interfaces and non-Grid
– Multiple requirements
• Diversity is good – (up to a point?)
– Volume and rates driven by GridPP
– Metadata driven by e-Science projects and RCUK