Data Hub Evolution for Off-line Access - ESA - Sentinel
Collaborative Workshop #16 - [email protected]
Data Hub Evolution for Off-line Access
Slide 2 - ESA UNCLASSIFIED - For Official Use
Introduction - The Data Challenge
10% Buffer for Cluster Operation
16.25% Protection Overhead (RAIN Redundancy Node)
2% DataHub Operation
10.8 PB (9833 TiB)
For Sentinel Products
On-line
Dell EMC Isilon Cluster
4 years of Successful Operation!
Average Disk Fill-up Rate (considering ramp-up from S1A to S3B, until today)
209 TiB/Month
Storage Full in Oct-2018
14.7 PB RAW Storage (Core DataHub Center @ T-Systems)
Rolling Archive Size in line with Commitments
Slide 3
Introduction - The Data Challenge
10% Buffer for Cluster Operation
16.25% Protection Overhead (RAIN Redundancy Node)
2% DataHub Operation
10.8 PB (9833 TiB)
For Sentinel Products
On-line
Dell EMC Isilon Cluster
Looking at the future…
Nominal Disk Fill-up Rate (all Sentinel Products including S2 L2A Global, as of Jan-2019)
700 TiB/Month
On-Line Capacity
14 Months of Data
14.7 PB RAW Storage (Core DataHub Center @ T-Systems)
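The 14-month figure follows from dividing the on-line capacity by the nominal fill-up rate. A minimal Python check, using only numbers quoted on the slides (the function and constant names are illustrative):

```python
# Rough check of the on-line capacity horizon, using figures from the slides.
USABLE_TIB = 9833          # on-line capacity for Sentinel products (10.8 PB)
NOMINAL_RATE_TIB = 700     # nominal fill-up rate per month, as of Jan-2019
AVERAGE_RATE_TIB = 209     # average fill-up rate per month during ramp-up


def months_of_data(capacity_tib: float, rate_tib_per_month: float) -> float:
    """Return how many months of data fit in the given capacity."""
    return capacity_tib / rate_tib_per_month


nominal_months = months_of_data(USABLE_TIB, NOMINAL_RATE_TIB)
average_months = months_of_data(USABLE_TIB, AVERAGE_RATE_TIB)
print(f"At 700 TiB/month: about {nominal_months:.0f} months of data on-line")
print(f"At 209 TiB/month: about {average_months:.0f} months of data on-line")
```

At the historical average rate the same capacity lasts roughly four years, which is consistent with the storage filling up in Oct-2018.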
Slide 4
Content
Access to Off-Line Products from LTA
• Road Map
• Current Status & Statistics
• Plan
Remote DHUS Data Store
• Concept
• Plan
Slide 5
Access to Off-Line Products From LTA - Road Map
• First version of the Service deployed in Oct-2018
  -> Used for Sentinel-1
  -> Interfaces tailored to the specificities of the Sentinel-1 PDGS and LTA
• Improved version in development. Deployment starts Q1 2019
  -> Generic for all Sentinels
  -> To be released as Open Source on DHUS GitHub
  -> Could be adopted by Collaborative partners for their own purposes
Slide 6
Off-Line Products From LTA – Status
First version of the Service activated early Oct-2018 for Sentinel-1 A/B. The following products have been evicted (physically removed and flagged “off-line”):
• SLC, GRD, OCN: from mission start to end Dec-2016
• RAW: from mission start to end Apr-2017
[Chart: Sentinel-1 Data Population per Sensing Date, with markers “Off-Line” and “Today”]
Slide 7
Off-line Products from LTA – Rollout Plan (1/2)
• Jan-2019: S1 12 Months Rolling Activation – First Version
• Apr-2019: S3 12 Months Rolling Activation - Improved Version
• Jul-2019: S2 L1C 12 Months Rolling Activation
• Sep-2019: S2 L2A 18 Months Rolling Activation
• 2020 (TBD): S5P 12 Months Rolling Activation
Applicable to OpenHub, CopHub, DiasHub#1
Commitments:
• Sentinel-1 All products: 12 month rolling archive
• Sentinel-2 Level 1C products: 12 month rolling archive
• Sentinel-2 Level 2A products: 18 month rolling archive
• Sentinel-3 All products: 12 month rolling archive
• Sentinel-5P All products: 12 month rolling archive
[Chart: Data Population per Sensing Date. Example: State on Jun-2019, showing the Rolling Window]
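The rolling-archive commitments can be read as a cutoff on the sensing date. The following is a minimal Python sketch of that reading, assuming the window is fully activated for the product type; the dictionary keys, the helper names and the inclusive boundary are illustrative assumptions, not part of the DHUS software.

```python
from datetime import date

# Committed rolling-archive sizes in months, as listed on the slide.
# The keys are illustrative labels, not official product-type identifiers.
ROLLING_WINDOW_MONTHS = {
    "S1": 12,
    "S2-L1C": 12,
    "S2-L2A": 18,
    "S3": 12,
    "S5P": 12,
}


def months_back(day: date, months: int) -> date:
    """Return the date `months` earlier, with the day clamped to 28."""
    index = day.year * 12 + (day.month - 1) - months
    return date(index // 12, index % 12 + 1, min(day.day, 28))


def is_online(product_type: str, sensing_date: date, today: date) -> bool:
    """True if a product sensed on `sensing_date` is still inside the window."""
    cutoff = months_back(today, ROLLING_WINDOW_MONTHS[product_type])
    return sensing_date >= cutoff


# Example: state on Jun-2019 for a product sensed in May-2018.
print(is_online("S1", date(2018, 5, 1), date(2019, 6, 15)))      # outside 12 months
print(is_online("S2-L2A", date(2018, 5, 1), date(2019, 6, 15)))  # inside 18 months
```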
Slide 8
Off-line Products from LTA – Rollout Plan (2/2)
Plan has been optimised by keeping data online as long as possible:
• 1 mission at a time
• Gradually lowering the rolling window to the committed size.
[Chart: Online Archive projection, with markers “Rollout period” and “Today”]
Slide 9
Off-line Products from LTA – First Version
[Diagram labels: Users; DataHub: DHUS; S1-PDGS: S1 PDGS Data Delivery (GMP), S1 Production Service, LTA (PAC); Internal Interface with shared DB]
Flow shown in the diagram:
• Request Product from configured PAC [Internal Interface]
• Retrieval from LTA [Internal Interface]
• Product retrieval
• Product ingestion
User steps:
1- Download Offline Product
2- Monitor by Searching Catalogue
3- Download once Product is back online
Current set-up used for Sentinel-1:
• Ad-hoc S1 PDGS Data Delivery component.
• One DHUS instance can retrieve products from a single LTA/PAC.
• Once retrieved, a product is restored back in the DataHub (status changes from off-line to on-line).
• Restored products are kept on-line for 3 days.
• Users have to monitor the catalogue until the product is back on-line.
Slide 10
Off-line Products from LTA - Current Operational Setup
[Diagram: users' requests for off-line products arrive at the DHuS Services (SciHub, APIHub, DIAS/CopHub) and are passed to the S1 PDGS, which retrieves products from LTA 1 and LTA 2 (PAC 1 (UK), PAC 2 (DLR)). Each connection carries a configurable maximum number of open requests; the values shown are Max 100, Max 100 and Max 200.]
Slide 11
Off-line Products from LTA - Statistics
• 12450 Products retrieved from the LTA since activation of the service (since 06-Oct-2018)
• Products typically available within 1 to 5 hours. Some glitches under investigation.
• Only one LTA used so far (UK-PAC). Performance expected to more than double with the use of the second LTA (DLR), started 3-Dec-2018.
• Offline/Online product access ratio, 2018-11-22 00:00:00 UTC to 2018-11-29 00:00:00 UTC:
          RAW     GRDM    GRDH    SLC      OCN     Overall
scihub    0.67%   3.51%   0.45%   6.47%    6.18%   3.1%
apihub    4.77%   2.46%   1.28%   1.18%    6.44%   5.35%
cophub    0       0       0       17.65%   0       3.65%
Slide 12
Off-line Products from LTA – Next Steps
• Archive curation: Clean-up of some “old” obsolete products from the early phase of the S1A mission
• Fine tuning of the quota management
• Automation of the statistic generation and reporting
  • System performance
  • User behaviour
• Integration of statistics in Dashboard (Q2 2019)
• Implementation and roll-out of an improved version of the Off-line Product Access Service (Q1/Q2 2019)
  • For all Sentinels
  • Including ColHub
Slide 13
Off-line Products from LTA – Improved Version
Improvements:
• DHUS Back-end Interface is generic: hides the underlying LTA Interfaces.
• LTA Broker responsible for load balancing between LTAs. Strategy transparent to DHUS.
• Different DataHubs retrieve products from the same LTA Broker cache.
• Users can monitor their requests.
• Multi-mission approach (1 LTA Broker per mission foreseen, but this could evolve).
• Off-line product access function included in Open Source DHUS Software.
[Diagram labels: Users; DataHub: DHUS; LTA Broker with Cache; PDGS: LTA (PAC#1), LTA (PAC#2)]
Flow shown in the diagram:
• Request Product [Generic LTA Broker Interface]
• Monitor Request
• Retrieve Product from appropriate LTA [Internal Interface]
• Product Retrieval & ingestion
User steps:
1- Request Offline Product
2- Monitor request
3- Download once Request is completed
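The broker's role (a shared cache in front of several LTAs, with load balancing hidden from DHUS) can be illustrated with a toy model. The slides do not say which balancing strategy is used, so the round-robin choice below is purely an assumption, as are the class and method names.

```python
from itertools import cycle


class LtaBrokerSketch:
    """Toy broker: shared cache in front of several LTAs (hypothetical design)."""

    def __init__(self, lta_names):
        self._next_lta = cycle(lta_names)   # round-robin is an assumption
        self._cache = {}                    # product name -> LTA that served it

    def request(self, product_name):
        """Return (source, lta): 'cache' on a hit, otherwise the LTA chosen."""
        if product_name in self._cache:
            return "cache", self._cache[product_name]
        lta = next(self._next_lta)
        self._cache[product_name] = lta     # pretend retrieval succeeded at once
        return "lta", lta


broker = LtaBrokerSketch(["PAC#1", "PAC#2"])
print(broker.request("product-A"))   # first DataHub asks: goes to an LTA
print(broker.request("product-A"))   # second DataHub asks: served from the cache
print(broker.request("product-B"))   # next cache miss goes to the other LTA
```

Because the cache sits in the broker rather than in each DHUS, a product restored for one DataHub does not have to be fetched from the LTA again for another.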
Slide 14
Off-line Products from LTA - Future Operational Setup
[Diagram: users' requests for off-line products arrive at the DHuS Services (SciHub, APIHub, DIAS/CopHub, ColHub) and are passed to the S1 LTA Broker, which retrieves products from LTA 1 and LTA 2 in the S1 PDGS (*). Each connection carries a configurable maximum number of open requests; all values are still TBD.]
(*) Set-up is similar for Sentinel 2 and 3
Slide 15
Generic DHUS/LTA Broker Interface - Concept
[Sequence diagram between DHUS User, DHUS System and LTA Broker. The User API links the DHUS User to the DHUS System; the Generic LTA Broker Interface links the DHUS System to the LTA Broker.]
DHUS User -> DHUS System (User API):
• Off-line Product Request (GET Product); response: GET Product Response (Request Id)
• Monitor Request (GET Request); response: Request Status (In Progress), later Request Status (Completed)
• Download Product
DHUS System (internal):
• Check if product already restored
• Check Request Feasibility (Quota)
• Retrieve Product
• Restore Product
DHUS System -> LTA Broker (Generic LTA Broker Interface):
• Product Retrieval Request (GET Product); response: GET Product Response (Job Id)
• Monitor Requests (GET All Jobs); response: Job Statuses
LTA Broker (internal):
• Check if product already in Cache
• Check Request Feasibility (Quota)
• Product retrieval from LTA Initiated
• Product retrieval from LTA Completed
DHUS Operator:
• Monitor Request (GET Job)
• Delete Request (DELETE Job)
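The checks the DHUS System performs before contacting the broker can be sketched as a single decision function. This is an illustration of the concept only: the function name, the return values and the way a request id is derived from the broker's job id are all assumptions.

```python
def handle_offline_request(product, restored, quota_ok, submit_to_broker):
    """Decide what the DHUS does with a user's off-line product request.

    restored         -- set of product names already back on-line
    quota_ok         -- callable returning True if the request is allowed
    submit_to_broker -- callable sending GET Product to the broker, returns a job id
    """
    if product in restored:
        return {"status": "online"}              # nothing to retrieve
    if not quota_ok():
        return {"status": "rejected"}            # quota exhausted
    job_id = submit_to_broker(product)
    # The user sees a request id; deriving it from the job id is an assumption.
    return {"status": "in_progress", "request_id": f"req-{job_id}"}


result = handle_offline_request(
    "product-B",
    restored={"product-A"},
    quota_ok=lambda: True,
    submit_to_broker=lambda name: "S1-LB001-0001234",
)
print(result)
```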
Slide 16
Generic DHUS/LTA Broker Interface - Protocol
Simple HTTP RESTful/JSON-based protocol

GET Product Request
https://s1-ltabroker.eo.esa.int/products/S1B_IW_SLC__1SDV_20181018T174205_20181018T174235_013210_0186A6_7404

GET Product Response
HTTP/1.1 202 Accepted
{
  "job_id": "S1-LB001-0001234",
  "job_uri": "https://s1-ltabroker.eo.esa.int/jobs/S1-LB001-0001234",
  "submission_time": "2018-10-27T03:45:10Z",
  "estimated_time": "2018-10-27T04:02:51Z"
}

GET Job Request
https://s1-ltabroker.eo.esa.int/jobs/S1-LB001-0001234

GET Job Response (In Progress)
HTTP/1.1 200 OK
{
  "status_code": "in_progress",
  "status_message": "request is under processing",
  "job_id": "S1-LB001-0001234",
  "job_uri": "https://s1-ltabroker.eo.esa.int/jobs/S1-LB001-0001234",
  "product_name": "S1B_IW_SLC__1SDV_20181018T174205_20181018T174235_013210_0186A6_7404",
  "submission_time": "2018-10-27T03:45:10Z",
  "estimated_time": "2018-10-27T04:02:51Z"
}

GET Job Response (Completed)
HTTP/1.1 200 OK
{
  "status_code": "completed",
  "status_message": "request is completed",
  "job_id": "S1-LB001-0001234",
  "job_uri": "https://s1-ltabroker.eo.esa.int/jobs/S1-LB001-0001234",
  "product_name": "S1B_IW_SLC__1SDV_20181018T174205_20181018T174235_013210_0186A6_7404",
  "submission_time": "2018-10-27T03:45:10Z",
  "estimated_time": "2018-10-27T04:02:51Z",
  "actual_time": "2018-10-27T04:04:30Z",
  "product_url": "http://s1-ltabroker.eo.esa.int/S1B_IW_SLC__1SDV_20181018T174210_20181018T174252_013210_0186A6_7404.zip"
}

Messages are fully described in the DHUS/LTA Generic ICD
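A client of this protocol submits a GET Product request, then polls the returned job URI until the status is "completed". The sketch below shows only the polling half, using the field names from the example messages. The `fetch` callable is an assumption that stands in for whatever HTTP client is used, and status codes other than the two shown on the slide are treated as failures, which the ICD may define differently.

```python
import json
import time


def wait_for_product(job_uri, fetch, poll_seconds=60.0, max_polls=120):
    """Poll a job until it completes and return the product URL.

    `fetch` is any callable taking a URI and returning the JSON body as text;
    it is injected so the sketch does not depend on a particular HTTP client.
    """
    for _ in range(max_polls):
        job = json.loads(fetch(job_uri))
        if job["status_code"] == "completed":
            return job["product_url"]
        if job["status_code"] != "in_progress":
            # Other status codes are not shown on the slide; treat as failure.
            raise RuntimeError(job.get("status_message", "unknown status"))
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_uri} did not complete in time")


# Usage with canned responses instead of a live broker.
canned = iter([
    '{"status_code": "in_progress", "status_message": "request is under processing"}',
    '{"status_code": "completed", "product_url": "http://example.invalid/product.zip"}',
])
print(wait_for_product("https://example.invalid/jobs/1",
                       lambda uri: next(canned), poll_seconds=0))
```

The "estimated_time" field in the response could be used to choose a less aggressive polling interval.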
Slide 17
End-User API for Off-Line Product Access
Same Concept/Protocol as the DHUS/LTA Broker Interface:
• Get Product Request
• Get Job Request (monitoring)
• Download using the URL provided by the Get Job Response
Quota Management in DHUS:
• Max number of requests per period of time (per user)
• Max overall number of requests (all users)
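The two quota rules can be combined in one small limiter. This is a sketch under stated assumptions rather than the DHUS implementation: the per-user rule is modelled as a sliding time window, the overall rule as a cap on requests that are still open, and all names are hypothetical.

```python
from collections import defaultdict, deque


class RequestQuota:
    """Per-user sliding-window limit plus a cap on all open requests."""

    def __init__(self, per_user_limit, period_seconds, overall_limit):
        self.per_user_limit = per_user_limit
        self.period_seconds = period_seconds
        self.overall_limit = overall_limit
        self._history = defaultdict(deque)   # user -> timestamps of recent requests
        self._open = 0                       # requests not yet completed

    def try_acquire(self, user, now):
        """Record a request at time `now` (seconds) if both limits allow it."""
        recent = self._history[user]
        while recent and now - recent[0] >= self.period_seconds:
            recent.popleft()                 # drop requests outside the period
        if len(recent) >= self.per_user_limit or self._open >= self.overall_limit:
            return False
        recent.append(now)
        self._open += 1
        return True

    def release(self):
        """Call when a request completes, freeing one overall slot."""
        self._open = max(0, self._open - 1)


quota = RequestQuota(per_user_limit=2, period_seconds=3600, overall_limit=100)
print(quota.try_acquire("alice", now=0))    # allowed
print(quota.try_acquire("alice", now=10))   # allowed
print(quota.try_acquire("alice", now=20))   # refused: per-user limit reached
```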
Slide 18
Off-line Product Access - Applicability To Collaborative Hubs
Collaborative Hubs can implement their own LTA Service…
[Diagram: a Collaborative Hub running DHUS Open Source is connected through the Generic Interface to a Broker with an optional Cache (“Your Implementation”), which uses a Proprietary Interface to a Proprietary LTA Solution, e.g.:
• Cloud based (“AWS Glacier” like)
• Local (Tape Archive)]
ESA will share:
• DHUS Software with Generic DHUS/Broker back-end interface and end-user API -> Release on GitHub planned for Feb-2019
• Draft DHUS/LTA Broker Generic ICD: Now
• Consolidated DHUS/LTA Broker Generic ICD: mid Dec-2018
You’re invited to provide comments and new requirements.
Slide 19
Content
Access to Off-Line Products from LTA
• Road Map
• Current Status & Statistics
• Plan
Remote DHUS Data Store
• Concept
• Plan
Slide 20
Remote DHUS Data Store - Concept
Overview:
• Allows a DHuS instance to use the DataStore of another DHuS instance as if it were local.
  → for downloads and node browsing via OData (i.e. read only)
• Easy to use and configure
• Transparent to end users
Set-up:
• Remote DataStore is configured on DHuS-A (v0.14.5+ needed)
• A user account is created on DHuS-B for DHuS-A (without quotas)
• Product UUIDs must be the same on DHuS-A and DHuS-B: achieved via Synchronisation.
• Products are “Safe Evicted” from DHuS-A
Synchronisation:
1- Remote DHuS copies products from Front-End
2- Copied products are “Safe Evicted” from Front-End
Safe Eviction: A Soft Eviction which does not physically delete a product if it is not present on the remote data store.
[Diagram: DHuS A (Front-End, Local Data Store) receives Product Download / Node Inspection [OData Requests] and sends Forwarded OData Requests to DHuS B (Remote, Remote Data Store); the two are kept aligned by Synchronisation.]
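The safe eviction rule reduces to one guard: a product is physically deleted from the front-end only if the remote data store already holds it. A minimal sketch, modelling both stores as sets of product UUIDs (an assumption made for illustration; the function name is hypothetical):

```python
def safe_evict(local_store, remote_store, product_uuid):
    """Delete a product locally only if the remote store also holds it.

    Both stores are modelled as sets of product UUIDs. Returns True if the
    product was physically deleted from the local store.
    """
    if product_uuid not in remote_store:
        return False                     # keep what may be the only copy
    local_store.discard(product_uuid)
    return True


local = {"uuid-1", "uuid-2"}
remote = {"uuid-1"}
print(safe_evict(local, remote, "uuid-1"))   # synchronised, so it is deleted
print(safe_evict(local, remote, "uuid-2"))   # not yet copied, so it is kept
print(sorted(local))
```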
Slide 21
Remote DHUS Data Store – Administration Details
Remote Data Store Configuration Example:
{
  "@odata.type": "#OData.DHuS.RemoteDHuSDataStore",
  "Name": "RemoteDHuS",
  "ServiceUrl": "http://REMOTE_DHUS/odata/v1",
  "Login": "username",
  "Password": "password"
}
Safe Eviction Example:
HTTP POST on http://DHUS/odata/v2/Evictions('MySoftEviction')/OData.DHuS.QueueEviction
With body:
{
  "TargetDataStore": "MyDataStore",
  "SafeMode": true
}
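The safe eviction call can be prepared with the Python standard library. The sketch below only builds the request and does not send it. The URL pattern and body come from the example above, while the use of HTTP Basic authentication and the function name are assumptions.

```python
import base64
import json
import urllib.request


def build_safe_eviction_request(base_url, eviction_name, target_data_store,
                                username, password):
    """Build (but do not send) the POST that queues a safe eviction."""
    url = (f"{base_url}/odata/v2/Evictions('{eviction_name}')"
           "/OData.DHuS.QueueEviction")
    body = json.dumps({"TargetDataStore": target_data_store,
                       "SafeMode": True}).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # HTTP Basic authentication is an assumption, not stated on the slide.
            "Authorization": f"Basic {token}",
        },
    )


request = build_safe_eviction_request(
    "http://DHUS", "MySoftEviction", "MyDataStore", "username", "password")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the object to `urllib.request.urlopen(request)` would submit it to a reachable DHuS instance.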
Slide 22
Remote DHUS Data Store for Sentinel-2
Goal: Contingency plan to free space in OpenHub core infrastructure
• A subset of the L1C Products is synchronised to a remote DHuS instance
• Products are physically deleted from OpenHub (via “safe eviction”)
• Data to be downloaded/browsed from the remote instance instead
  → Front-End is still OpenHub
  → Transparent to users
• Target: Jan-2019
[Diagram: End Users send OData requests to the OpenHub Front-end (Local Data Store). If the physical product is remote, the Front-End forwards the OData request to the S2 L1C DataHub Back-end (Remote Data Store, ~300 TiB of L1C Products) and relays the response. Front-End and Back-End are kept aligned by Synchronisation.]
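From the front-end's point of view, the routing decision is whether the physical product is still in the local data store. The sketch below illustrates that decision only; DHuS performs it internally, and the function name, the example host names and the use of the usual DHuS OData download path `Products('<uuid>')/$value` are assumptions.

```python
def physical_source_url(product_uuid, local_uuids, front_end_url, remote_service_url):
    """Return the OData URL from which the front-end would read the product.

    local_uuids        -- UUIDs still physically held in the local data store
    front_end_url      -- OData root of the front-end (illustrative value below)
    remote_service_url -- "ServiceUrl" of the configured remote data store
    """
    root = front_end_url if product_uuid in local_uuids else remote_service_url
    return f"{root}/Products('{product_uuid}')/$value"


local_uuids = {"uuid-local"}
front_end = "http://FRONT_END/odata/v1"      # placeholder host
remote = "http://REMOTE_DHUS/odata/v1"       # placeholder host
print(physical_source_url("uuid-local", local_uuids, front_end, remote))
print(physical_source_url("uuid-evicted", local_uuids, front_end, remote))
```

End users keep addressing the front-end in both cases, which is what makes the remote store transparent to them.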
Slide 23
Remote Data Store - Applicability To Collaborative Hubs
Collaborative Hubs could benefit from the Remote Data Store function in different scenarios…
• Data Hub Relays can “augment” their own Product offerings by publishing products available on other relays.
• A Data Hub Relay can deploy its on-line archive in different locations and still provide a unified view through a single access point.
[Diagrams: several interconnected Data Hub Relays; one Data Hub Relay with a Front-End and two DHUS instances]
Will be released in DHUS v2.0 on GitHub, Feb-2019
Slide 24
Thank you for your attention!
Questions?