1. Sanger Institute Site Report Nov 2007
Guy Coates
[email_address]
2. About the Institute
- Funded by the Wellcome Trust.
  - 2nd largest research charity in the world.
- Large-scale genomic research.
  - Sequenced 1/3 of the human genome (largest single contributor).
- We have active cancer, malaria, pathogen and genomic variation studies.
- All data is made publicly available.
  - Websites, ftp, direct database access, programmatic APIs.
3. Why are we here?
- You have particle accelerators which throw out massive amounts of data; we have sequencing machines.
- Different science, same problems.
4. Managing Growth
- We have exponential growth in storage and compute.
  - Storage doubles every 12 months.
  - We will have at least 2 PB of disk next year.
- New sequencing technologies are a huge challenge.
  - ~50x increase in data production in the space of 6 months.
- New sequencing tech is still growing.
  - Higher data output from our current machines.
- New big science projects are just a good idea away...
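The doubling claim above can be turned into a quick back-of-the-envelope projection. A minimal sketch (function name is ours, not from the slides), assuming the 12-month doubling period holds steady:

```python
# Exponential storage growth: capacity doubles every `doubling_years`.
def projected_storage_pb(current_pb: float, years: float,
                         doubling_years: float = 1.0) -> float:
    return current_pb * 2 ** (years / doubling_years)

# ~1 PB today doubles to ~2 PB in a year, and would pass
# 8 PB within three years if the trend continues.
print(projected_storage_pb(1.0, 1))  # 2.0
print(projected_storage_pb(1.0, 3))  # 8.0
```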
5. Data centre
- Overhead aircon, power and networking.
  - Allows counter-current cooling.
- 1 data centre is an empty shell.
  - Rotate into the empty room every 4 years.
  - Refurb one of the in-use rooms with the current state of the art.
6. Storage
- Dual Brocade fabric (27 switches per fabric).
- ~1 PB in production today.
- Disk arrays hold the bulk of our data (~1 PB).
  - Dual controller, fibre channel disks, ~50 TB per array.
  - Virtual RAID5 (effectively RAID 6).
  - No need to worry about raidset size being nice multiples of physical disk size, etc.
  - Allows rapid allocation of storage to projects as required.
- Storage is either directly attached, or used with cluster filesystems.
- NFS serving for home directories and storage which needs concurrent Windows / Linux access.
  - EVA storage at the back end.
- Veritas NetBackup to a StorageTek SL8500 library.
  - 12 drives (LTO-2 & LTO-3), 1500 slots.
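For context on the tape library figures above, a rough native-capacity estimate. This assumes the usual marketed native (uncompressed) cartridge capacities of 200 GB for LTO-2 and 400 GB for LTO-3; the function name is ours:

```python
# Native cartridge capacities in GB (assumed marketed figures).
LTO_NATIVE_GB = {"LTO-2": 200, "LTO-3": 400}

def library_capacity_tb(slots: int, media: str) -> float:
    """Native capacity of a fully-populated library, in decimal TB."""
    return slots * LTO_NATIVE_GB[media] / 1000

# All 1500 SL8500 slots filled with LTO-3 gives ~600 TB native,
# i.e. less than the ~1 PB of disk in production.
print(library_capacity_tb(1500, "LTO-3"))  # 600.0
```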
7. Compute
- 3800 cores in >1500 blades and rack mount servers.
- Blades preferred due to ease of management, space and power efficiency.
- Mostly x86_64 servers, some older x86 systems.
  - Single, dual and quad core.
- Token ia64 for large memory machines.
  - (SGI Altix 350, 16 CPUs, 192 GB memory).
- We use Debian Linux as the primary OS.
  - Badly burned by proprietary OSes and file-systems.
  - We still have legacy Alpha / Tru64 / AdvFS data and apps which require migration to Linux.
  - 99% of systems run Debian Sarge / Etch.
  - Run 64-bit on x86_64 CPUs.
  - SLES9 on the Oracle servers to stay inside the support matrix.
- ~300 users, diverse workload.
  - Typically IO-bound, integer-intensive, single-threaded.
  - Scales well on clusters (apart from the IO bit).
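The workload shape described above (many independent, single-threaded, integer-intensive jobs) is exactly what scales well across a farm. A minimal sketch of the pattern, with `score` as a hypothetical stand-in for a real analysis kernel (a toy GC-content count, not an actual Sanger pipeline):

```python
from multiprocessing import Pool

def score(seq: str) -> int:
    """Toy integer-intensive kernel: count G/C bases in a sequence."""
    return sum(1 for base in seq if base in "GC")

def run_farm(sequences, workers: int):
    """Fan independent jobs out across workers; throughput grows
    almost linearly with cores until shared storage (IO) saturates."""
    with Pool(workers) as pool:
        return pool.map(score, sequences)

if __name__ == "__main__":
    seqs = ["GATTACA", "GGCC", "ATAT"]
    print(run_farm(seqs, workers=2))  # [2, 4, 0]
```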
8. Infrastructure / Management
- Debian FAI automated installer, integrated with the blade management systems for fire-and-forget deployment.
  - ~2 minutes for a complete OS install.
- Used by many software development and science teams within the Institute, as well as the systems team.
  - External engineers and collaborators have access.
- 2-8 node clusters for high availability.
  - Mostly MySQL + Apache using SAN storage for failover.
9. Tera-scale Oracle
- Sequencing Trace archive.
  - Holds results from all DNA sequencing experiments, everywhere.
  - Mirrored with the NCBI trace archive.
  - Currently ~60 TB / 8 billion traces.
  - Doubles in size every 12 months.
- Originally data was on the file-system, meta-data in Oracle.
  - Billions of small files (20-80 kB).
  - Hard to back up, hard to manage space.
  - All on Tru64 / AdvFS (a dead architecture).
- We decided to move everything into Oracle.
  - Tera-scale databases are common (according to Oracle).
10. Tera-scale Oracle
- 4 node Oracle 10g RAC cluster (4 core x86_64, 16 GB RAM).
- 60 TB of EVA / fibre-channel storage, Oracle ASM clustered file-system.
- Replicate the database to a secondary database with Oracle Dataguard.
  - 2 node RAC cluster with 60 TB of MSA1000 storage (cheap-n-cheerful fibre-channel).
  - 15 minute delay in replication to protect against finger trouble.
- Secondary database is the primary backup (disk-to-disk, fast).
  - We can run off the secondary if we need to.
  - Dumps to tape are taken from the secondary.
- Oracle is not well tested (especially by Oracle!) at this scale.
- How will we cope with future growth of the database?
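The 15-minute replication delay works because the standby buffers log records and only applies those older than the configured lag, so an accidental `DROP TABLE` can be stopped before it reaches the replica. A toy illustration of the idea (class and method names are ours, not Oracle's API):

```python
from collections import deque

class DelayedApplyStandby:
    """Sketch of delayed log apply: records wait in a queue and are
    applied only once they are older than `delay_seconds`."""

    def __init__(self, delay_seconds: float):
        self.delay = delay_seconds
        self.pending = deque()  # (received_at, record)
        self.applied = []

    def receive(self, record, now: float):
        self.pending.append((now, record))

    def apply_due(self, now: float):
        # Apply everything that has aged past the configured delay.
        while self.pending and now - self.pending[0][0] >= self.delay:
            self.applied.append(self.pending.popleft()[1])

# With a 900 s (15 min) delay, a record received at t=0 is still
# pending at t=600 and only applied at t=900.
standby = DelayedApplyStandby(delay_seconds=900)
standby.receive("INSERT trace 1", now=0)
standby.apply_due(now=600)
print(standby.applied)  # []
standby.apply_due(now=900)
print(standby.applied)  # ['INSERT trace 1']
```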
11. Compute farm
- 588 IBM HS20/LS20 (42 chassis), 128 HP BL460c (8 chassis).
- 2224 cores (mix of 32 and 64 bit), 2 GB memory / core.
- Debian Sarge + custom kernel.
- LSF used for job scheduling.
  - Typically 10k-100k jobs in the system.
- Systems distributed across data centres.
12. Farm lustre storage
-
- In house client port to Debian.
- Lustre for work / scratch areas.
-
-
- Dual tailed SCSI (highly available).
-
- Reliability sacrificed for performance.
-
- Lustre random access / meta-data performance is rubbish.
-
- NFS for home directories.
13. Lustre performance
- Sustained 11-12 Gbit/s peak.
  - This is real work, not a benchmark.
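To put the sustained figure above in perspective, a quick conversion from line rate to daily data volume (decimal units; function name is ours):

```python
def gbit_s_to_tb_per_day(gbit_s: float) -> float:
    """Convert a sustained rate in Gbit/s to TB moved per day."""
    bytes_per_s = gbit_s * 1e9 / 8       # gigabits -> bytes
    return bytes_per_s * 86_400 / 1e12   # bytes/day -> TB

# 11 Gbit/s sustained for a full day moves ~119 TB.
print(round(gbit_s_to_tb_per_day(11), 1))  # 118.8
```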
14. Supporting New Sequencing
- We have 20 Illumina (née Solexa) sequencing machines.
  - These will produce 20-30 TB per day.
  - We need to keep raw data for ~2 weeks for analysis and QC.
- 320 TB Lustre staging area.
  - 8 EVA8000 arrays, 28 OSSs, 160 OSTs.
  - (The 8 LUNs per OSS limit required more OSSs than planned.)
  - 3 x 100 TB file-systems for production + a 50 TB file-system for development.
  - Smaller file-systems hedge against EVA failure.
- 256 HP BL460c blades: 600 cores, mixture of dual / quad core.
- Extreme Networks Black Diamond 8810 switch (360 non-blocking GigE ports).
- 25 TB SFS20 Lustre scratch area for ad-hoc analysis.
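The staging-area sizing above follows from the quoted rates: a simple retention check (function name is ours) shows 320 TB covers the ~2 week target at the low end of the 20-30 TB/day range, and about 11 days at the high end:

```python
def retention_days(capacity_tb: float, rate_tb_per_day: float) -> float:
    """Days of raw data a staging area can hold at a given production rate."""
    return capacity_tb / rate_tb_per_day

print(retention_days(320, 20))            # 16.0 days at 20 TB/day
print(round(retention_days(320, 30), 1))  # 10.7 days at 30 TB/day
```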
15. Data pull
- LSF reconfig allows processing and alignment capacity to be interchanged.
- Lustre clients have 2x GigE; 4x GigE trunks from each chassis to the core switch.
- (Diagram: sequencers 1-20 feed "sucker" and processing/alignment blade chassis, backed by the 320 TB EVA Lustre datastore, the 25 TB SFS20 Lustre scratch area, and an NFS final repository.)
16. Acknowledgements
HP Life sciences / SFS
Sanger Institute