Managing processing and storage for MODIS and OMI
Edward Masuoka
Terrestrial Information Systems Branch
What our data systems support
• Integration of science team software
• Development of Level 1 and Level 3 products
• Processing science products
• Product Quality Assessment
• Archiving/distributing data products and imagery
In ’98, 47 TB was a lot of storage
4 Petabytes in 4 racks in 2010
3 SGI systems in 1995
> 700 Linux servers in 2010
MODIS and OMI hardware information in one place
Tables updated daily
System Administration Tools
• Depot http://www.cs.cmu.edu/~help/unix_linux/software_collections/local_depot.html
– Manages software under /usr/local
• SATE (System Administrator Tool Environment)
– Handles all user accounts, integrated with NAMS
– Property database (location, value, description)
• Problem Queue
– Email-based problem tracking
– Automated notifications from h/w are sent here
• System Administrator Wiki (MediaWiki)
– Procedures for managing systems and storage
Development Team Tools/Process
• Subversion – Configuration management
• Bugzilla – Software Bug Tracking
• PCR (PGE Change Request) Process
– Science S/W PGE (Product Generation Executive) delivered
– Unit test (specified by developer)
– Science test (defined by science discipline lead)
– Science test results reviewed by Quality Assessment Team
– Science disciplines and Science Team leader review
– If approved, put into production
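The PCR steps above read as a sequence of gates that a delivered PGE has to pass before it reaches production. A minimal Python sketch of that idea follows. The stage names are taken from the slide, but the `ChangeRequest` class, the `advance` method and the "rejected" state are hypothetical illustrations, not part of the actual PCR tooling.

```python
from dataclasses import dataclass, field

# Stage names follow the slide; treating them as a strict linear
# sequence is an assumption made for this illustration.
STAGES = [
    "delivered",
    "unit_test",
    "science_test",
    "qa_review",
    "science_team_review",
    "production",
]


@dataclass
class ChangeRequest:
    """Hypothetical record for one PGE Change Request."""
    pge_name: str
    stage: str = "delivered"
    history: list = field(default_factory=list)

    def advance(self, passed: bool) -> str:
        """Record the outcome of the current stage and move on.

        A passed stage moves the request to the next stage in STAGES;
        a failed stage marks the request as rejected.
        """
        if self.stage in ("production", "rejected"):
            raise ValueError(f"{self.pge_name} is already {self.stage}")
        self.history.append((self.stage, passed))
        if passed:
            self.stage = STAGES[STAGES.index(self.stage) + 1]
        else:
            self.stage = "rejected"
        return self.stage


if __name__ == "__main__":
    pcr = ChangeRequest("PGE_example")  # hypothetical PGE name
    while pcr.stage != "production":
        print(pcr.stage, "->", pcr.advance(passed=True))
```

Keeping the outcome of every stage in `history` mirrors the review trail the process implies: each gate leaves a record before the PGE moves on.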
MODAPS H/W Architecture
[Architecture diagram. Labeled components: Fast Storage Compute Servers; Black Diamond Ethernet Switch; Production Database Server; Distribution Database Server; Web, ftp and proxy servers; High Volume Storage; Archive Servers; Outside Network; Head Nodes (Ingest and Staging Server)]
Monitoring performance with Ganglia
Activity on one Cluster
Moving processing resources between Clusters
• Operations monitors level of use on clusters with Ganglia
• Extreme Networks Black Diamond switches enable compute servers to be shifted between clusters supporting activities as needed to meet demand in different areas
• Software on compute servers can be quickly reconfigured via Depot and sync from Subversion repository
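The decision described above (watch cluster load, then shift compute servers toward the busier cluster) can be sketched as a simple heuristic. This is only an illustration under stated assumptions: the load figures stand in for whatever normalized metric operations reads from Ganglia, and the cluster names, the `min_gap` threshold and the `suggest_move` function are hypothetical. The slides do not say the decision is automated.

```python
def suggest_move(load_by_cluster, servers_by_cluster, min_gap=0.25):
    """Suggest moving one compute server between clusters.

    load_by_cluster:    cluster name -> normalized load (0.0 to 1.0)
    servers_by_cluster: cluster name -> number of compute servers
    min_gap:            smallest load difference worth acting on

    Returns (source, destination), or None if no move is suggested.
    """
    if len(load_by_cluster) < 2:
        return None
    busiest = max(load_by_cluster, key=load_by_cluster.get)
    # Only clusters that would still have a server left may donate one.
    donors = [
        name for name in load_by_cluster
        if name != busiest and servers_by_cluster.get(name, 0) > 1
    ]
    if not donors:
        return None
    idlest = min(donors, key=load_by_cluster.get)
    if load_by_cluster[busiest] - load_by_cluster[idlest] < min_gap:
        return None
    return idlest, busiest


if __name__ == "__main__":
    # Hypothetical cluster names and load values.
    load = {"forward": 0.9, "reprocessing": 0.3, "distribution": 0.5}
    servers = {"forward": 10, "reprocessing": 10, "distribution": 10}
    print(suggest_move(load, servers))
```

The threshold keeps servers from being shuffled back and forth when the clusters are roughly balanced.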
Databases are the foundations for processing and distribution
• Processing databases track files on production systems and processing jobs
• LAADS database supports search/order and custom product generation
• Processing and distribution are decoupled
– Searches don’t impact production rates or delivery to end-users via ftp push
– Processing at 100x doesn’t impact product searches or post-processing on LAADS
Current Evaluations
• Running iRODS + FUSE as replacement for NFS for presenting the all_data tree (/Collection/Mission/Product/…/granule) to ftp users
– Prevents ftp from hanging if storage node is down
• Running PGEs in the Nebula Cloud
– Need to understand cost with respect to computing, storage and network bandwidth
– May be used to handle peak demand
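The cost question for cloud processing splits into the three terms named above: computing, storage and network bandwidth. A rough sketch of that arithmetic is shown here. The function name, its parameters and every rate in the usage example are placeholders chosen for illustration; none are actual Nebula prices or MODAPS workloads.

```python
def estimate_monthly_cost(cpu_hours, storage_tb, egress_tb,
                          cpu_rate, storage_rate, egress_rate):
    """Split a monthly cloud bill into computing, storage and network terms.

    Rates are per CPU-hour, per TB stored for the month, and per TB
    transferred out, all in the same currency unit.
    """
    costs = {
        "computing": cpu_hours * cpu_rate,
        "storage": storage_tb * storage_rate,
        "network": egress_tb * egress_rate,
    }
    costs["total"] = sum(costs.values())
    return costs


if __name__ == "__main__":
    # All quantities and rates below are made-up placeholders.
    estimate = estimate_monthly_cost(
        cpu_hours=1000, storage_tb=10, egress_tb=2,
        cpu_rate=1, storage_rate=50, egress_rate=100,
    )
    for category, cost in estimate.items():
        print(f"{category}: {cost}")
```

Keeping the three terms separate shows which one dominates for a given workload, which matters when the cloud is only used to absorb peak demand.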
Finish in 2011
• Migrate all servers to CentOS Linux
• Migrate LAADS database to PostgreSQL
– Databases migrated for most production systems
• Single set of security plans for MODIS and OMI
Quality Assessment Team
http://landweb.nascom.nasa.gov/cgi-bin/QA_WWW/newPage.cgi
Quality Assessment
• Global browse – Images of Daily and Multi-day products
• Golden Tiles (9 tiles over key land cover types)
– Browse images
– Time Series Plots allow comparisons between different years and reprocessing campaigns
• Tools written in C interfaced to ENVI facilitate manipulation and assessment of MODIS standard products
Acronyms
ENVI GIS and image processing COTS s/w
FUSE File System in User Space
iRODS integrated Rule-Oriented Data System
NFS Network File System
LAADS Level 1 and Atmosphere Archive and Distribution System
MODIS Moderate-resolution Imaging Spectroradiometer
OMI Ozone Monitoring Instrument