Data Hub Evolution for Off-line Access - ESA - Sentinel


Page 1: Data Hub Evolution for Off-line Access - ESA - Sentinel

Collaborative Workshop #16 - [email protected]

Data Hub Evolution for Off-line Access

Page 2: Data Hub Evolution for Off-line Access - ESA - Sentinel

Slide 2 - ESA UNCLASSIFIED - For Official Use

Introduction - The Data Challenge

Dell EMC Isilon Cluster: 14.7 PB RAW Storage (Core DataHub Center @ T-Systems)

• 10.8 PB (9833 TiB) On-line For Sentinel Products
• 16.25% Protection Overhead (RAIN Redundancy Node)
• 10% Buffer for Cluster Operation
• 2% DataHub Operation

4 years of Successful Operation!

• Average Disk Fill-up Rate (considering ramp-up from S1A to S3B, until today): 209 TiB/Month
• Storage Full in Oct-2018
• Rolling Archive Size in line with Commitments

Page 3: Data Hub Evolution for Off-line Access - ESA - Sentinel

Introduction - The Data Challenge

Dell EMC Isilon Cluster: 14.7 PB RAW Storage (Core DataHub Center @ T-Systems)

• 10.8 PB (9833 TiB) On-line For Sentinel Products
• 16.25% Protection Overhead (RAIN Redundancy Node)
• 10% Buffer for Cluster Operation
• 2% DataHub Operation

Looking at the future…

• Nominal Disk Fill-up Rate (all Sentinel Products including S2 L2A Global, as of Jan-2019): 700 TiB/Month
• On-Line Capacity: 14 Months of Data
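
The capacity figures on these two slides can be cross-checked with simple arithmetic. The sketch below is only an illustration: the function name is made up here, and the inputs are the 9833 TiB on-line capacity and the two fill-up rates quoted on the slides, assuming a constant monthly rate.

```python
def months_to_fill(capacity_tib: float, fill_rate_tib_per_month: float) -> float:
    """Months needed to fill `capacity_tib` at a constant monthly fill-up rate."""
    if fill_rate_tib_per_month <= 0:
        raise ValueError("fill-up rate must be positive")
    return capacity_tib / fill_rate_tib_per_month


ONLINE_CAPACITY_TIB = 9833  # on-line capacity for Sentinel products (from the slides)

# Average rate during ramp-up (209 TiB/month): roughly 47 months, i.e. about 4 years.
print(round(months_to_fill(ONLINE_CAPACITY_TIB, 209), 1))

# Nominal rate as of Jan-2019 (700 TiB/month): roughly 14 months of data.
print(round(months_to_fill(ONLINE_CAPACITY_TIB, 700), 1))
```

At the nominal rate the same storage that lasted about four years holds only about 14 months of data, which is what motivates the rolling archive.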

Page 4: Data Hub Evolution for Off-line Access - ESA - Sentinel

Content

• Access to Off-Line Products from LTA
-> Road Map
-> Current Status & Statistics
-> Plan

• Remote DHUS Data Store
-> Concept
-> Plan

Page 5: Data Hub Evolution for Off-line Access - ESA - Sentinel

Access to Off-Line Products From LTA - Road Map

• First version of the Service deployed in Oct-2018
-> Used for Sentinel-1
-> Interfaces tailored to the specificities of the Sentinel-1 PDGS and LTA

• Improved version in development. Deployment starts Q1 2019
-> Generic for all Sentinels
-> To be released as Open Source on DHUS GitHub
-> Could be adopted by Collaborative partners for their own purposes

Page 6: Data Hub Evolution for Off-line Access - ESA - Sentinel

Off-Line Products From LTA – Status

First version of the Service activated early Oct-2018 for Sentinel-1 A/B. The following products have been evicted (physically removed and flagged “off-line”):

• SLC, GRD, OCN: from mission start to end Dec-2016
• RAW: from mission start to end Apr-2017

[Figure: Sentinel-1 Data Population per Sensing Date, marking the Off-Line range and Today]

Page 7: Data Hub Evolution for Off-line Access - ESA - Sentinel

Off-line Products from LTA – Rollout Plan (1/2)

Commitments:
• Sentinel-1 All products: 12 month rolling archive
• Sentinel-2 Level 1C products: 12 month rolling archive
• Sentinel-2 Level 2A products: 18 month rolling archive
• Sentinel-3 All products: 12 month rolling archive
• Sentinel-5P All products: 12 month rolling archive

Rollout:
• Jan-2019: S1 12 Months Rolling Activation – First Version
• Apr-2019: S3 12 Months Rolling Activation - Improved Version
• Jul-2019: S2 L1C 12 Months Rolling Activation
• Sep-2019: S2 L2A 18 Months Rolling Activation
• 2020 (TBD): S5P 12 Months Rolling Activation

Applicable to OpenHub, CopHub, DiasHub#1

[Figure: Data Population per Sensing Date with the Rolling Window. Example: State on Jun-2019]
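
A rolling archive can be read as a cutoff on the sensing date: products sensed before "today minus N months" become candidates for eviction. The sketch below illustrates that reading only; the dictionary keys, function names and the day-of-month clamping are assumptions made for this example, not part of the DHUS software.

```python
from datetime import date

# Rolling archive commitments in months, as listed on the slide.
ROLLING_WINDOW_MONTHS = {
    "S1": 12,
    "S2_L1C": 12,
    "S2_L2A": 18,
    "S3": 12,
    "S5P": 12,
}


def months_before(day: date, months: int) -> date:
    """Date `months` calendar months before `day` (day of month clamped to 28)."""
    index = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(index, 12)
    return date(year, month_index + 1, min(day.day, 28))


def is_within_rolling_window(product_type: str, sensing_date: date, today: date) -> bool:
    """True if a product sensed on `sensing_date` is still inside the committed window."""
    cutoff = months_before(today, ROLLING_WINDOW_MONTHS[product_type])
    return sensing_date >= cutoff


# State on 15 Jun 2019, for a product sensed on 1 Jan 2018:
print(is_within_rolling_window("S1", date(2018, 1, 1), date(2019, 6, 15)))      # outside 12 months
print(is_within_rolling_window("S2_L2A", date(2018, 1, 1), date(2019, 6, 15)))  # inside 18 months
```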

Page 8: Data Hub Evolution for Off-line Access - ESA - Sentinel

Off-line Products from LTA – Rollout Plan (2/2)

Plan has been optimised by keeping data online as long as possible:
• 1 mission at a time
• Gradually lowering the rolling window to the committed size.

[Figure: Online Archive projection, marking Today and the Rollout period]

Page 9: Data Hub Evolution for Off-line Access - ESA - Sentinel

Off-line Products from LTA – First Version

Current set-up used for Sentinel-1:

• Ad-hoc S1 PDGS Data Delivery component.
• One DHUS instance can retrieve products from a single LTA/PAC.
• Once retrieved, a product is restored back in the DataHub (status changes from off-line to on-line).
• Restored products are kept on-line for 3 days.
• Users have to monitor the catalogue until the product is back on-line.

User steps:
1- Download Offline Product
2- Monitor by Searching Catalogue
3- Download once Product is back online

[Diagram components: Users; DHUS (DataHub); S1 PDGS Data Delivery (GMP), S1 Production Service and LTA (PAC) in the S1-PDGS. Diagram labels: Internal Interface with shared DB; Request Product from configured PAC [Internal Interface]; Retrieval from LTA [Internal Interface]; Product retrieval; Product ingestion]
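
In this first version the user has to poll until the product is back on-line. A minimal polling sketch follows. The `is_online` callable stands in for whatever catalogue query the user runs and is purely hypothetical, as are the default interval and number of attempts.

```python
import time
from typing import Callable


def wait_until_online(
    is_online: Callable[[], bool],
    poll_seconds: float = 600.0,   # assumed polling interval, not from the slides
    max_polls: int = 36,           # assumed limit, not from the slides
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `is_online` until it returns True or `max_polls` checks have been made."""
    for attempt in range(max_polls):
        if is_online():
            return True
        if attempt < max_polls - 1:
            sleep(poll_seconds)  # wait before the next catalogue check
    return False
```

A caller would pass a function that searches the catalogue for the product and reports its on-line flag, then start the download once `wait_until_online` returns True. Since restored products are kept on-line for 3 days, the download should follow soon after.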

Page 10: Data Hub Evolution for Off-line Access - ESA - Sentinel

Off-line Products from LTA - Current Operational Setup

[Diagram: the DHuS Services (SciHub, APIHub, DIAS/CopHub) pass users' requests for offline products to the S1 PDGS, with PAC 1 (UK) and PAC 2 (DLR) and their archives LTA 1 and LTA 2. Each connection carries a configurable maximum number of Open Requests; values shown: Max 100, Max 100, Max 200.]

Page 11: Data Hub Evolution for Off-line Access - ESA - Sentinel

Off-line Products from LTA - Statistics

• 12450 Products retrieved from the LTA since activation of the service (since 06-Oct-2018)

• Products typically available within 1 to 5 hours. Some glitches under investigation.

• Only one LTA used so far (UK-PAC). Performance expected to more than double with the use of the second LTA (DLR), started 3-Dec-2018.

• Offline/Online product access ratio, 2018-11-22 00:00:00 UTC to 2018-11-29 00:00:00 UTC:

          RAW     GRDM    GRDH    SLC      OCN     Overall
scihub    0.67%   3.51%   0.45%   6.47%    6.18%   3.1%
apihub    4.77%   2.46%   1.28%   1.18%    6.44%   5.35%
cophub    0       0       0       17.65%   0       3.65%

Page 12: Data Hub Evolution for Off-line Access - ESA - Sentinel

Off-line Products from LTA – Next Steps

• Archive curation: Clean-up of some “old” obsolete products from the early phase of the S1A mission

• Fine tuning of the quota management

• Automation of the statistic generation and reporting
-> System performance
-> User behaviour

• Integration of statistics in Dashboard (Q2 2019)

• Implementation and roll-out of an improved version of the Off-line Product Access Service (Q1/Q2 2019)
-> For all Sentinels
-> Including ColHub

Page 13: Data Hub Evolution for Off-line Access - ESA - Sentinel

Off-line Products from LTA – Improved Version

Improvements:

• DHUS Back-end Interface is generic: hides the underlying LTA Interfaces.
• LTA Broker responsible for load balancing between LTAs. Strategy transparent to DHUS.
• Different DataHubs retrieve products from the same LTA Broker cache.
• Users can monitor their requests.
• Multi-mission approach (1 LTA Broker per mission foreseen, but this could evolve).
• Off-line product access function included in OpenSource DHUS Software.

User steps:
1- Request Offline Product
2- Monitor request
3- Download once Request is completed

[Diagram components: Users; DHUS (DataHub); LTA Broker with Cache; LTA (PAC#1) and LTA (PAC#2) in the PDGS. Diagram labels: Request Product [Generic LTA Broker Interface]; Monitor Request; Retrieve Product from appropriate LTA [Internal Interface]; Product Retrieval & ingestion]
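
The broker's two roles, a shared cache and load balancing between LTAs, can be sketched as follows. The slides do not state which balancing strategy is used; the "least open requests" rule, the class names and the per-LTA limits below are assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field


@dataclass
class Lta:
    name: str
    max_open_requests: int
    open_requests: int = 0


@dataclass
class LtaBroker:
    ltas: list
    cache: set = field(default_factory=set)  # products already retrieved, shared by all DataHubs

    def request(self, product_name: str) -> str:
        """Return 'cache' on a cache hit, else the name of the LTA chosen for retrieval."""
        if product_name in self.cache:
            return "cache"
        candidates = [lta for lta in self.ltas if lta.open_requests < lta.max_open_requests]
        if not candidates:
            raise RuntimeError("all LTAs are at their open-request limit")
        # Assumed strategy: pick the LTA with the fewest open requests.
        chosen = min(candidates, key=lambda lta: lta.open_requests)
        chosen.open_requests += 1
        return chosen.name

    def complete(self, product_name: str, lta_name: str) -> None:
        """Mark a retrieval as finished: free the LTA slot and cache the product."""
        for lta in self.ltas:
            if lta.name == lta_name:
                lta.open_requests -= 1
        self.cache.add(product_name)
```

Because the choice of LTA happens inside `request`, a DHUS instance calling the broker never needs to know how many LTAs exist, which is the sense in which the strategy is transparent to DHUS.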

Page 14: Data Hub Evolution for Off-line Access - ESA - Sentinel

Off-line Products from LTA - Future Operational Setup

[Diagram: the DHuS Services (SciHub, APIHub, DIAS/CopHub, ColHub) pass users' requests for offline products to the S1 LTA Broker, which sits in front of LTA 1 and LTA 2 in the S1 PDGS (*). Each connection carries a configurable maximum number of Open Requests; all values shown as Max TBD.]

(*) Set-up is similar for Sentinel 2 and 3

Page 15: Data Hub Evolution for Off-line Access - ESA - Sentinel

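Generic DHUS/LTA Broker Interface - Concept

[Sequence diagram. Participants: DHUS User, DHUS System, LTA Broker, DHUS Operator. The User API sits between DHUS User and DHUS System; the Generic LTA Broker Interface sits between DHUS System and LTA Broker. Labels, in approximate order:]

1- DHUS User to DHUS System: Off-line Product Request (GET Product)
2- DHUS System: Check if product already restored; Check Request Feasibility (Quota)
3- DHUS System to LTA Broker: Product Retrieval Request (GET Product)
4- LTA Broker: Check if product already in Cache; Check Request Feasibility (Quota); Product retrieval from LTA Initiated
5- LTA Broker to DHUS System: GET Product Response (Job Id)
6- DHUS System to DHUS User: GET Product Response (Request Id)
7- DHUS User to DHUS System: Monitor Request (GET Request); answer: Request Status (In Progress)
8- DHUS System to LTA Broker: Monitor Requests (GET All Jobs); answer: Job Statuses
9- DHUS User to DHUS System: Monitor Request (GET Request); answer: Request Status (In Progress)
10- LTA Broker: Product retrieval from LTA Completed
11- DHUS System: Retrieve Product; Restore Product
12- DHUS User to DHUS System: Monitor Request (GET Request); answer: Request Status (Completed)
13- DHUS User: Download Product

DHUS Operator: Monitor Request (GET Job); Delete Request (DELETE Job)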
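
The DHUS-side checks named in the diagram (already restored, quota, then hand-over to the broker) can be sketched as a small decision function. Everything here is illustrative: the function, the `Outcome` values and the `submit_to_broker` callable are hypothetical names, not the DHUS implementation.

```python
from enum import Enum


class Outcome(Enum):
    ALREADY_ONLINE = "already_online"   # product was restored earlier, no retrieval needed
    REJECTED_QUOTA = "rejected_quota"   # request not feasible under the configured quota
    SUBMITTED = "submitted"             # retrieval request passed on to the LTA Broker


def handle_offline_request(product, restored_products, open_requests,
                           max_open_requests, submit_to_broker):
    """DHUS-side handling of a user's GET Product for an off-line product."""
    if product in restored_products:
        return Outcome.ALREADY_ONLINE, None
    if len(open_requests) >= max_open_requests:
        return Outcome.REJECTED_QUOTA, None
    job_id = submit_to_broker(product)   # broker answers with a Job Id
    open_requests[product] = job_id      # remembered so the request can be monitored
    return Outcome.SUBMITTED, job_id
```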

Page 16: Data Hub Evolution for Off-line Access - ESA - Sentinel

Generic DHUS/LTA Broker Interface - Protocol

Simple HTTP RESTful/JSON based Protocol

GET Product Request
https://s1-ltabroker.eo.esa.int/products/S1B_IW_SLC__1SDV_20181018T174205_20181018T174235_013210_0186A6_7404

GET Product Response
HTTP/1.1 202 Accepted
{
  "job_id": "S1-LB001-0001234",
  "job_uri": "https://s1-ltabroker.eo.esa.int/jobs/S1-LB001-0001234",
  "submission_time": "2018-10-27T03:45:10Z",
  "estimated_time": "2018-10-27T04:02:51Z"
}

GET Job Request
https://s1-ltabroker.eo.esa.int/jobs/S1-LB001-0001234

GET Job Response (In Progress)
HTTP/1.1 200 OK
{
  "status_code": "in_progress",
  "status_message": "request is under processing",
  "job_id": "S1-LB001-0001234",
  "job_uri": "https://s1-ltabroker.eo.esa.int/jobs/S1-LB001-0001234",
  "product_name": "S1B_IW_SLC__1SDV_20181018T174205_20181018T174235_013210_0186A6_7404",
  "submission_time": "2018-10-27T03:45:10Z",
  "estimated_time": "2018-10-27T04:02:51Z"
}

GET Job Response (Completed)
HTTP/1.1 200 OK
{
  "status_code": "completed",
  "status_message": "request is completed",
  "job_id": "S1-LB001-0001234",
  "job_uri": "https://s1-ltabroker.eo.esa.int/jobs/S1-LB001-0001234",
  "product_name": "S1B_IW_SLC__1SDV_20181018T174205_20181018T174235_013210_0186A6_7404",
  "submission_time": "2018-10-27T03:45:10Z",
  "estimated_time": "2018-10-27T04:02:51Z",
  "actual_time": "2018-10-27T04:04:30Z",
  "product_url": "http://s1-ltabroker.eo.esa.int/S1B_IW_SLC__1SDV_20181018T174210_20181018T174252_013210_0186A6_7404.zip"
}

Messages are fully described in the DHUS/LTA Generic ICD
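
A client of this protocol only needs to read two fields of the GET Job response to decide whether it can download. The sketch below parses a response body with the standard `json` module. The field names `status_code` and `product_url` come from the slide; the helper name and the shortened sample bodies (including the `example.zip` URL) are made up for this illustration, and the authoritative message definitions remain in the ICD.

```python
import json
from typing import Optional


def product_url_if_completed(job_response_body: str) -> Optional[str]:
    """Return the download URL from a GET Job response body, or None while the job is still running."""
    job = json.loads(job_response_body)
    if job.get("status_code") == "completed":
        return job.get("product_url")
    return None


# Shortened, hypothetical response bodies in the shape shown on the slide.
in_progress = '{"status_code": "in_progress", "job_id": "S1-LB001-0001234"}'
completed = (
    '{"status_code": "completed", "job_id": "S1-LB001-0001234", '
    '"product_url": "http://s1-ltabroker.eo.esa.int/example.zip"}'
)

print(product_url_if_completed(in_progress))
print(product_url_if_completed(completed))
```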

Page 17: Data Hub Evolution for Off-line Access - ESA - Sentinel

End-User API for Off-Line Product Access

Same Concept/Protocol as the DHUS/LTA Broker Interface:
• Get Product Request
• Get Job Request (monitoring)
• Download using the URL provided by the Get Job Response

Quota Management in DHUS:
• Max number of requests per period of time (per user)
• Max overall number of requests (all users)
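
The two quota rules (a per-user limit over a period of time and an overall limit across all users) can be combined in one small class. This is a sketch under assumptions: the class, its parameters and the sliding-window interpretation of "per period of time" are illustrative choices, not the DHUS quota implementation.

```python
import time
from collections import defaultdict, deque


class QuotaManager:
    def __init__(self, max_per_user, period_seconds, max_open_overall, clock=time.monotonic):
        self.max_per_user = max_per_user          # max requests per user within the period
        self.period_seconds = period_seconds      # length of the period
        self.max_open_overall = max_open_overall  # max open requests across all users
        self.clock = clock
        self.user_requests = defaultdict(deque)   # user -> timestamps of recent requests
        self.open_requests = 0

    def try_accept(self, user: str) -> bool:
        """Accept the request if both the per-user and the overall quota allow it."""
        now = self.clock()
        recent = self.user_requests[user]
        while recent and now - recent[0] >= self.period_seconds:
            recent.popleft()                      # forget requests older than the period
        if len(recent) >= self.max_per_user or self.open_requests >= self.max_open_overall:
            return False
        recent.append(now)
        self.open_requests += 1
        return True

    def complete(self) -> None:
        """Call when a request finishes, freeing one slot of the overall quota."""
        self.open_requests = max(0, self.open_requests - 1)
```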

Page 18: Data Hub Evolution for Off-line Access - ESA - Sentinel

Off-line Product Access - Applicability To Collaborative Hubs

Collaborative Hubs can implement their own LTA Service…

[Diagram: Collaborative Hub (DHUS OpenSource) -> Generic Interface -> Broker with optional Cache (Your Implementation) -> Proprietary Interface -> Proprietary LTA Solution]

Proprietary LTA Solution, e.g.:
• Cloud based (“AWS Glacier” like)
• Local (Tape Archive)

ESA will share:
• DHUS Software with Generic DHUS/Broker back-end interface and end-user API
-> Release on GitHub planned for Feb-2019
• Draft DHUS/LTA Broker Generic ICD: Now
• Consolidated DHUS/LTA Broker Generic ICD: mid Dec-2018

You’re invited to provide comments and new requirements.

Page 19: Data Hub Evolution for Off-line Access - ESA - Sentinel

Content

• Access to Off-Line Products from LTA
-> Road Map
-> Current Status & Statistics
-> Plan

• Remote DHUS Data Store
-> Concept
-> Plan

Page 20: Data Hub Evolution for Off-line Access - ESA - Sentinel

Remote DHUS Data Store - Concept

Overview:
• Allows using another DHuS instance's DataStore as if it were local.
-> For downloads and node browsing via OData (i.e. read only)
• Easy to use and configure
• Transparent to end users

Set-up:
• Remote DataStore is configured on DHuS-A (v0.14.5+ needed)
• A user account is created on DHuS-B for DHuS-A (without quotas)
• Product UUIDs must be the same between DHuS-A and DHuS-B: achieved via Synchronisation.
• Products are “Safe Evicted” from DHuS-A

Synchronisation:
1- Remote DHuS copies products from Front-End
2- Copied products are “Safe Evicted” from Front-End

Safe Eviction: A Soft Eviction which does not physically delete a product if it is not present on the remote data store.

[Diagram: DHuS A (Front-End) with its Local Data Store and a Remote Data Store entry; DHuS B (Remote) with its Data Store. Labels: Product Download / Node Inspection [ODATA Requests]; Forwarded ODATA Requests; Synchronisation]
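
The safe-eviction rule reduces to one condition: a product is removed locally only if the remote data store already holds it. The sketch below models both stores as sets of product UUIDs; the function name and the in-memory stores are assumptions for illustration and say nothing about how DHuS implements it internally.

```python
def safe_evict(local_store: set, remote_store: set, candidates: list) -> list:
    """Evict candidates from `local_store` only when a copy exists in `remote_store`.

    Returns the UUIDs that were actually evicted.
    """
    evicted = []
    for uuid in candidates:
        if uuid in local_store and uuid in remote_store:
            local_store.discard(uuid)   # physical deletion is safe: remote copy exists
            evicted.append(uuid)
        # otherwise the product stays on the front-end until synchronisation catches up
    return evicted


local = {"uuid-a", "uuid-b"}
remote = {"uuid-a"}
print(safe_evict(local, remote, ["uuid-a", "uuid-b"]))
print(sorted(local))
```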

Page 21: Data Hub Evolution for Off-line Access - ESA - Sentinel

Remote DHUS Data Store – Administration Details

Remote Data Store Configuration Example:

{
  "@odata.type": "#OData.DHuS.RemoteDHuSDataStore",
  "Name": "RemoteDHuS",
  "ServiceUrl": "http://REMOTE_DHUS/odata/v1",
  "Login": "username",
  "Password": "password"
}

Safe Eviction Example:

HTTP POST on http://DHUS/odata/v2/Evictions('MySoftEviction')/OData.DHuS.QueueEviction

With body:

{
  "TargetDataStore": "MyDataStore",
  "SafeMode": true
}
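
The safe-eviction call above can be prepared with Python's standard library. This sketch only builds the request object and does not send it; the helper name and the `Content-Type` header are assumptions, `http://DHUS` is the placeholder host from the slide, and authentication is left out.

```python
import json
import urllib.request


def build_safe_eviction_request(base_url: str, eviction_name: str,
                                target_data_store: str) -> urllib.request.Request:
    """Build (but do not send) the POST that queues an eviction in safe mode."""
    url = f"{base_url}/odata/v2/Evictions('{eviction_name}')/OData.DHuS.QueueEviction"
    body = json.dumps({"TargetDataStore": target_data_store, "SafeMode": True}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},  # assumed header
    )


request = build_safe_eviction_request("http://DHUS", "MySoftEviction", "MyDataStore")
print(request.get_method(), request.full_url)
```

Sending it would be a call to `urllib.request.urlopen(request)` against a real DHuS host with administrator credentials.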

Page 22: Data Hub Evolution for Off-line Access - ESA - Sentinel

Remote DHUS Data Store for Sentinel-2

Goal: Contingency plan to free space in OpenHub core infrastructure
• A subset of the L1C Products is synchronised to a remote DHuS instance
• Products are physically deleted from OpenHub (via “safe eviction”)
• Data to be downloaded/browsed from the remote instance instead
-> Front-End is still OpenHub
-> Transparent to users
• Target: Jan-2019

[Diagram: End Users send [OData Requests] to the OpenHub Front-end, which has a Local Data Store and a Remote Data Store entry. If the Physical Product is Remote, the Front-End sends [OData Requests] to the S2 L1C DataHub Back-end and receives the [Response]. The Back-End's Local Data Store holds ~300 TiB of L1C Products. Front-End and Back-End are linked by Synchronisation.]
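
From the front-end's point of view, serving a download comes down to deciding where the physical product lives. A minimal sketch of that decision, with both data stores modelled as sets of product UUIDs (the function name and return values are illustrative assumptions):

```python
def resolve_download_source(uuid: str, local_store: set, remote_store: set) -> str:
    """Decide where the front-end serves a product from."""
    if uuid in local_store:
        return "local"    # served directly by the front-end
    if uuid in remote_store:
        return "remote"   # OData request forwarded to the remote instance
    raise KeyError(f"product {uuid} not found in any data store")
```

The end user sends the same request in both cases, which is why the set-up stays transparent to users.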

Page 23: Data Hub Evolution for Off-line Access - ESA - Sentinel

Remote Data Store - Applicability To Collaborative Hubs

Collaborative Hubs could benefit from the Remote Data Store function in different scenarios…

• Data Hub Relays can “augment” their own Product offerings by publishing products available on other relays.
• A Data Hub Relay can deploy its on-line archive on different locations, and yet, provide a unified view through a single access point.

[Diagrams: a group of Data Hub Relays; a Data Hub Relay with a Front-End and two DHUS instances]

Will be released in DHUS v2.0 on GitHub, Feb-2019

Page 24: Data Hub Evolution for Off-line Access - ESA - Sentinel

Thank you for your attention!

Questions?