
The EOSDIS Data Server

Jeanne Behnke
Earth Science Data Information Systems / NASA GSFC
Code 423, Greenbelt MD 20771
+1-301-614-5326

jeanne.behnke@gsfc.nasa.gov

Presented at the THIC Meeting at the Naval Surface Warfare Center Carderock
Bethesda MD
October 3, 2000


A Big Mission!

• EOS = Earth Observing System
  – Centerpiece of NASA’s Earth Science Enterprise
  – Collect Earth remote sensing data for a 15-year global change research program
• EOSDIS = EOS Data and Information System
  – Software architecture is designed to receive, process, archive, and distribute several terabytes of science data on a daily basis
  – User community consists of several thousand science and non-science users
  – 7 major facilities across the US: the Distributed Active Archive Centers (DAACs)


EOS Terra in orbit


Terra Image


TERRA Image - MODIS

Cyclone Hudah in the Indian Ocean, taken March 29, 2000


TERRA Image - ASTER

Reno, Nevada, taken April 18, 2000


TERRA Image - MISR


Landsat Browse Image


EOSDIS Concept

• EOSDIS has 3 segments: Networks, Flight Ops, and the Science Data Processing System
• EOSDIS is composed of several geographically distributed elements that will appear as a single, integrated, logical entity
• EOSDIS is working with NOAA and other agencies to ensure long-term availability of Earth science data


Distributed Active Archive Centers for EOS Data

[Map: locations of the DAACs across the US - JPL, ASF, EDC, LaRC, GSFC, NSIDC, ORNL]


Predicted Data Volumes for 2000

• Expect launch of EOS-Aqua and ADEOS satellites this year
• 260 different data products and sets of raw instrument data
• 1.6 TB of processed data stored daily by end of 2000

Data Center | Archive Volume (GB/day) | Granules per Day | Archive Volume (TB/year) | Granules per Year (cumulative) | Distribution via Network (GB/day) | Distribution via Tape (GB/day)
EDC   | 522   | 6,886  | 190 | 2,513,390 | 194 | 159
GSFC  | 688   | 5,545  | 251 | 2,023,925 | 226 | 226
LaRC  | 312   | 2,945  | 114 | 1,074,925 | 102 | 102
NSIDC | 22    | 1,083  | 8   | 395,295   | 6   | 6
Total | 1,544 | 16,459 | 563 | 6,007,535 | 528 | 493
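As a quick consistency check on these predictions, the daily and yearly archive columns agree to within rounding: EDC's 522 GB/day × 365 days ≈ 190 TB/year, and the 1,544 GB/day total is in line with the roughly 1.6 TB of processed data stored daily cited above.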


SDPS System Goals

• Flexible, scalable, reliable
• Use open system standards
• Support a standard interface to Earth science data to enable coordinated data analysis
• Maximize the use of COTS packages and respond to technological advances and techniques
• Accommodate inevitable change and new additions
• Architecture to support these goals: the EOSDIS Core System (ECS)


ECS Context Diagram

[Diagram: ECS context. The Data Server Subsystem is connected to the Ingest, Data Processing, Planning, Data Management, Interoperability, and Management subsystems and to the Client. Data providers deliver data collections for ingest; internal and external users search, access, and acquire data, place on-demand requests, and receive data, documents, advertisements, and services. User registration, order status, user profiles, plans, discovery/advertisements, and data processing requests flow between the subsystems.]


Data Server Subsystem
“Heart of the ECS system”

• Object-oriented C++ on a multi-platform environment of SUNs and SGIs
• Three Software Configuration Items (CIs); see the sketch after this list
  – Science DataServer CI: DBMS, geospatial search, inventory
  – Storage Management CI: manages all peripherals, including robotic silos
  – Data Distribution CI: places data in the distribution location
• The Ingest Subsystem CI is also significant to data archiving
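Since the subsystem is object-oriented C++, the three-way split above can be illustrated with a small C++ sketch. Every class, method, and identifier below is hypothetical and invented for illustration; this is not ECS code, only a rough model of how a search/stage/distribute request might pass through the three CIs.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical sketch of the three configuration items; names and interfaces
// are invented for illustration, not the actual ECS classes.

struct Granule {                  // one archived unit of science data
    std::string id;
    std::string scienceDataType;
};

class ScienceDataServerCI {       // DBMS, geospatial search, inventory
public:
    std::vector<Granule> search(const std::string& dataType,
                                double lat, double lon) const {
        std::cout << "Inventory/spatial query for " << dataType
                  << " near (" << lat << ", " << lon << ")\n";
        return {};                // the real CI would query the DBMS/spatial engine
    }
};

class StorageManagementCI {       // manages peripherals and robotic silos
public:
    std::string stage(const Granule& g) const {
        std::cout << "Staging " << g.id << " from the silo to disk cache\n";
        return "/staging/" + g.id;   // hypothetical push/pull staging path
    }
};

class DataDistributionCI {        // places data in the distribution location
public:
    void distribute(const std::string& stagedPath) const {
        std::cout << "Distributing " << stagedPath << " via network or media\n";
    }
};

int main() {
    ScienceDataServerCI sdsrv;
    StorageManagementCI stmgt;
    DataDistributionCI  ddist;

    // A user order flows through the three CIs in turn.
    Granule g{"G0001", "Landsat7"};
    sdsrv.search(g.scienceDataType, 39.0, -76.8);
    ddist.distribute(stmgt.stage(g));
    return 0;
}
```

In the real subsystem the Science DataServer CI issues its queries against the commercial DBMS and spatial engine, and the Storage Management CI drives the file storage manager and robotic silos rather than printing messages.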


COTS Packages

• System uses ~75 off-the-shelf packages from commercial and government sources
• Principal COTS that impact design:
  – Sybase Relational DBMS/SQS: DBMS and spatial query system
  – AMASS: file storage management system for robotic storage devices
  – AutoSys: scheduling software for the processing system
  – Tivoli: system management tools
  – HP OpenView: graphical tool for system management
  – RogueWave: libraries used to map components to objects
  – DCE: distributed computing environment
  – ClearCase: CM tool to manage completion of different builds
  – Remedy: trouble-ticketing software used across the project


Data Server Subsystem Hardware Context Diagram

[Diagram: Data Server Subsystem hardware; the configuration is similar at all sites. Components shown: Data Server Manager (SGIs), Repository Manager (SGIs), Data Repository (STK silo) with tape drives and disk storage, inventory database, push/pull staging, Data Distribution Manager (SUNs), and distribution peripherals such as CD-ROM writers.]


Data Repository

Archive Robotic Storage

DAAC  | Make/Model                | Qty | Drive Types                  | Media Capacity | Total # of Media in Silos
GSFC  | StorageTek STK Powderhorn | 4   | 13 D3 drives, 10 9840 drives | 50 GB / 40 GB  | 12,000
NSIDC | StorageTek STK Powderhorn | 1   | 3 D3 drives                  | 50 GB          | 600
EDC   | StorageTek STK Powderhorn | 3   | 14 D3 drives, 8 9840 drives  | 50 GB / 40 GB  | 7,700
LaRC  | StorageTek STK Powderhorn | 2   | 8 D3 drives, 8 9840 drives   | 50 GB / 40 GB  | 3,100

Distribution Systems

DAAC  | Make/Model                                         | Qty     | Drive Type            | Media Capacity
GSFC  | Exabyte tape drives, CD-ROM writers                | 8, 2    | 8mm tape, CD          | 50 GB, 600 MB
NSIDC | Exabyte tape drives, CD-ROM writers                | 2, 2    | 8mm tape, CD          | 50 GB, 600 MB
EDC   | Exabyte tape drives, CD-ROM writers, D3 tape drive | 8, 2, 1 | 8mm tape, CD, D3 tape | 50 GB, 600 MB, 50 GB
LaRC  | Exabyte tape drives, CD-ROM writers                | 2, 2    | 8mm tape, CD          | 50 GB, 600 MB


STK Silo



Drive Cabinet


Typical Cartridge Media for Silo


Mass Storage I/O System

• Consists of silo, RAID disk, and server hosts
  – Capable of 40 MB/s throughput sustained at all times (3.5 TB of data per day)
• Able to push 3 to 4 times as much data because of the double-buffer mechanism in the storage management system design (see the sketch after this list)
  – Minimizes stress on robotics
  – Creates our own persistent cache
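The double-buffer behavior can be sketched as below. This is a toy model under assumed details (two fixed buffers, one file per mount), not the actual storage management code: while one buffer is being filled from the silo, the other still holds the previous file and can serve requesters, so the robotics see one mount per file and the staged copies act as a persistent cache. (For scale, 40 MB/s sustained is about 40 MB/s × 86,400 s ≈ 3.5 TB/day, matching the figure above.)

```cpp
#include <array>
#include <iostream>
#include <string>
#include <vector>

// Toy model of double-buffered staging between the tape silo and RAID disk.
int main() {
    std::vector<std::string> tapeFiles = {"fileA", "fileB", "fileC", "fileD"};
    std::array<std::string, 2> buffers;   // two staging areas on RAID disk
    int active = 0;                       // buffer currently being filled from tape

    for (const std::string& f : tapeFiles) {
        buffers[active] = f;              // one robotic mount fills the active buffer
        std::cout << "Staged " << f << " into buffer " << active << "\n";

        // The other buffer still holds the previously staged file and acts as a
        // persistent cache, so repeat reads never go back to the robotics.
        const std::string& cached = buffers[1 - active];
        if (!cached.empty()) {
            std::cout << "Serving " << cached << " from buffer " << (1 - active) << "\n";
        }
        active = 1 - active;              // swap buffer roles for the next file
    }
    return 0;
}
```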


Mass Storage I/O System

• Utilize volume groups (groups of tapes managed together); see the sketch after this list
  – Group tapes by ‘science data type’ (for example, all Landsat data in a silo is grouped together on a specified collection of tapes)
  – Enables load balancing
  – Assures minimum performance levels
  – Allows logical management of the archive
• Additional information in two poster papers at this conference:
  – Fault Tolerant Design
  – Scalable Architecture for Maximizing Concurrency
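A minimal sketch of the volume-group idea, assuming a plain map from science data type to a named group of tape cartridges; the data-type keys and tape labels below are invented for illustration and are not ECS identifiers.

```cpp
#include <iostream>
#include <map>
#include <string>
#include <vector>

// Sketch: tapes grouped by science data type, so that (for example) all
// Landsat data in a silo lands on one designated collection of cartridges.
int main() {
    // science data type -> its volume group of tape labels (labels invented)
    std::map<std::string, std::vector<std::string>> volumeGroups = {
        {"Landsat7", {"L7-0001", "L7-0002"}},
        {"MODIS_L1", {"MD-0001", "MD-0002", "MD-0003"}},
        {"MISR",     {"MI-0001"}},
    };

    std::string incomingType = "Landsat7";
    auto it = volumeGroups.find(incomingType);
    if (it != volumeGroups.end()) {
        // Write to a tape belonging to this data type's group (here simply the
        // first one), keeping the product's archive logically contiguous.
        std::cout << "Archiving " << incomingType
                  << " data to volume group tape " << it->second.front() << "\n";
    }
    return 0;
}
```

Because each science data type lands on its own set of cartridges, a large order for one product stays on a predictable group of tapes, which is what makes the load balancing and logical archive management above workable.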


Archive Operations

• Strive for an automated archive system
  – Continuous connection to the archive systems by operations personnel
  – At least two operations personnel at each site
    • Principal activities include error notification, backup, monitoring, and problem resolution
  – Strive for lights-out administration (see the sketch after this list)
• Support several modes at each site for system upgrades
  – One operational mode, two test modes
• Scheduled maintenance includes hardware monitoring, media monitoring, formatting, and cleanup
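As a toy illustration of the lights-out monitoring activity, the poll-and-notify loop might look like the sketch below. The health check, polling interval, and notification path are invented placeholders; in practice the DAACs rely on tools such as HP OpenView for monitoring and Remedy for trouble tickets, as listed on the COTS slide.

```cpp
#include <chrono>
#include <iostream>
#include <string>
#include <thread>

// Placeholder health check: pretend the third polling cycle finds a fault.
bool archiveHealthy(int cycle) {
    return cycle != 3;
}

// Placeholder notification: in practice this would open a trouble ticket
// or page the on-site operators.
void notifyOperators(const std::string& message) {
    std::cout << "NOTIFY ops: " << message << "\n";
}

int main() {
    for (int cycle = 0; cycle < 5; ++cycle) {
        if (!archiveHealthy(cycle)) {
            notifyOperators("archive fault detected on cycle " + std::to_string(cycle));
        }
        // Poll again after a short delay (shortened so the sketch runs quickly).
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }
    return 0;
}
```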


Conclusion: So How Are We Doing?

Archive Size to Date

DAAC  | Data Types                     | Archive Size
EDC   | Landsat 7 / ASTER / MODIS Land | 20 TB (274 GB/day)
NSIDC | MODIS Land (snow products)     | 9 TB (25 GB/day)
GSFC  | MODIS L1 / Atmosphere / Ocean  | 50 TB (300 GB/day)
LaRC  | MISR                           | 18 TB (88 GB/day)