REST VOL for HSDS and HDF Sharded Data Storage · 5 • Convenience of having one file vs lots of...

18
Proprietary and Confidential. Copyright 2018, The HDF Group. REST VOL for HSDS and HDF Sharded Data Storage John Readey

Transcript of REST VOL for HSDS and HDF Sharded Data Storage · 5 • Convenience of having one file vs lots of...

  • Proprietary and Confidential. Copyright 2018, The HDF Group.

    REST VOL for HSDS and HDF Sharded Data Storage

    John Readey

  • 2

    • Sharded Data Storage• REST VOL

    • Direct Access

    Overview

  • 3

    Instead of managing HDF5 objects (datasets, groups, chunks) within a POSIX file store them as separate files (or as objects within an object storage system such as S3)

    Sharded data concept

    For meta data (datasets and groups), a self-descriptive format such as JSON can be used

    For chunks, store as binary objects for efficiency

  • 4

    • Limit maximum size of any object

    • -> Object storage systems typically don’t support partial writes, so large objects

    are inefficient to update

    • Supporting parallelism is easier

    • -> no file locking needed

    • No need to manage free space, key-value mappings, etc

    • -> storage systems have gotten pretty good at doing this for you

    • No need to worry about system crash leaving you with a corrupted HDF5 files

    • -> worse case you lose one object, with object storage not even that

    Why a sharded data format?

  • 5

    • Convenience of having one file vs lots of small files

    • Maybe not as important given tooling to abstract this from the user

    • E.g. hstouch, hscopy, hsrm tools

    • Filesystems have trouble dealing with large number of files (particularly within one directory)

    • Not sure this is a problem with modern Linux filesystems

    • Certainly not an issue with object storage systems

    Case against sharded storage

  • 6HSDS shard schema example

    root_obj_id/group.jsonobj1_id/

    group.jsonobj2_id/

    dataset.json0_00_1

    obj3_id/dataset.json0_0_20_0_3

    Observations:• Metadata is stored as JSON• Chunk data stored as binary blobs• Self-explanatory• One HDF5 file can translate to lots of

    objects• Flat hierarchy – supports HDF5

    multilinking• Can limit maximum size of an object• Can be used with Posix or object

    storageSchema is documented here: https://github.com/HDFGroup/hsds/blob/master/docs/design/obj_store_schema/obj_store_schema_v2.md

    https://github.com/HDFGroup/hsds/blob/master/docs/design/obj_store_schema/obj_store_schema_v2.md

  • 7

    • It can be useful to divvy up the objects within an HDF5 domain into roughly equal size collections – for instance we have n workers and we’d like to perform some

    action on the domain

    • CRUSH algorithm approach: hash key and take modulo of number of workers

    • Decentralized, no book keeping required

    Storage Partitioning

  • 8

    • The tool “hsload” will convert an HDF5 file to the sharded format (using HSDS)• Conversely, the “hsget” tool will take the shaded format and reconstruct the HDF5

    file

    • Data is preserved after a round trip

    Conversion from HDF5 files to sharded files

  • 9HDF5 file linking

    • Converting large HDF5 files (or a large collection of files) to the sharded format is time consuming and effectively doubles the storage requirements

    • Rather than converting the entire file to the HDF Schema, just the metadata can be imported (typically

  • 10REST VOL Plugin

    • The HDF5 VOL architecture is a plugin layer for HDF5• Public API stays the same, but different back ends can be implemented

    • REST VOL substitutes REST API requests for file i/o actions

    • C/Fortran applications should be able to run with minor tweaks• Downloadable from: https://github.com/HDFGroup/vol-rest

    For HDF5 1.12, use the hdf5_1_12_update branch

    https://github.com/HDFGroup/vol-rest

  • 11Features not yet supported

    • *Need new VOL api?

    Feature HSDS h5pyd RESTVOLObject reference J J LRegion Reference L L LFill Value L J LVirtual Datasets L L LVariable Length Datatypes J J LDataset SQL Query* J J L

  • 12REST VOL Wishlist

    Interesting things that would be nice to have:• Support multi-threaded clients

    • Support asynchronous API

    • REST VOL activation based on file prefix – e.g. “hdf5://”• Paginated read/write for large dataset operations

    • Retry logic for HTTP timeouts, Service Unavailable• Support for other languages: Java, .Net, R, etc.

  • 13Pros and cons of running a service

    • Accessing a sharded data store via a service (HSDS) is nice:• Server mediates access to the storage system

    • Server can speed things up by caching recently accessed data

    • Only the data the client needs needs to be transmitted outside the data center• HSDS running on a large server or cluster can provide more processing capacity

    than a client might have • Unless it’s not:

    • Don’t want to bother setting up, running service

    • Challenge to scale capacity of service to clients

  • 14Direct Access Project

    Provide equivalent functionality of HSDS in a library • SN code would run in a sub-process• DN code would run in one or more sub-processes (e.g.

    based on number of cores) • Sub-processes would directly access storage system• Communication between parent processes and sub-

    processes would be http via localhost• Sub-processes shutdown when last file is closed• The same HSDS storage schema would be used• Can switch between direct access and server as

    needed

  • 15Diect Access System Diagram

  • 16Direct Access VOL plugin

    • For C/C++ apps, the direct access model could be implemented as a VOL connector• Other than launching the sub-processes the VOL would work in the same way as the

    REST VOL, so it probably makes sense to include this functionality in the REST VOL

    rather than create a new VOL• With direct access HDF5 lib + REST VOL enables sharded data as an alternative to

    the HDF5 file format• Enables multi-threading

    • Cloud optimized storage

    • Crash-proof

  • 17Questions?

  • 18Try it out!

    Get the software here:

    • HSDS: https://github.com/HDFGroup/hsds• H5pyd: https://github.com/HDFGroup/h5pyd• REST VOL: https://github.com/HDFGroup/vol-rest• REST API documentation:

    https://github.com/HDFGroup/hdf-rest-api• Example programs:

    https://github.com/HDFGroup/hdflab_examples

    https://github.com/HDFGroup/hsdshttps://github.com/HDFGroup/h5pydhttps://github.com/HDFGroup/vol-resthttps://github.com/HDFGroup/hdflab_examples