1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Schema Registry
Satish Duggana, Hortonworks
DataWorks Summit 2017, Munich
Introduction
What is Schema Registry?
• A shared repository of schemas that allows applications to flexibly interact with each other
What value does Schema Registry provide?
– Data governance
  • Provides reusable schemas
  • Defines relationships between schemas
  • Enables generic format conversion and generic routing
– Operational efficiency
  • Avoids attaching a schema to every piece of data
  • Lets producers and consumers evolve at different rates
Example use
– Register schemas for Kafka topics so that consumers of those topics (e.g. Nifi, StreamLine) can use them
Schema Registry Concepts
• Schema Group: a logical grouping/container for schemas of a similar type, or grouped by any criteria the customer uses to manage schemas
• Schema Metadata: metadata associated with a named schema
• Schema Version: the actual versioned schema text associated with a schema metadata definition
[Diagram: a Schema Group (Group Name) contains Schema Metadata (Schema Name, Schema Type, Description, Compatibility Policy, Serializers, Deserializers). Each Schema Metadata has Schema Versions 1, 2, 3, each with a version, text and fingerprint.]
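The three concepts above can be sketched as a small data model. This is a minimal illustration, not the registry's actual classes: the class and field names, the default policy value, and the use of a SHA-256 hash of the raw schema text as the fingerprint are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class SchemaVersion:
    version: int       # 1, 2, 3, ... within one schema metadata entry
    text: str          # the schema text itself
    fingerprint: str   # assumed here: hash of the raw schema text


@dataclass
class SchemaMetadata:
    name: str                        # Schema Name
    schema_type: str                 # Schema Type, e.g. "avro"
    group: str                       # name of the Schema Group it belongs to
    description: str = ""
    compatibility: str = "BACKWARD"  # hypothetical default policy
    versions: List[SchemaVersion] = field(default_factory=list)

    def add_version(self, text: str) -> SchemaVersion:
        """Add a new version unless the same text is already registered."""
        fingerprint = hashlib.sha256(text.encode("utf-8")).hexdigest()
        for existing in self.versions:
            if existing.fingerprint == fingerprint:
                return existing
        created = SchemaVersion(len(self.versions) + 1, text, fingerprint)
        self.versions.append(created)
        return created


book = SchemaMetadata(name="book", schema_type="avro", group="kafka")
first = book.add_version('{"type": "string"}')
again = book.add_version('{"type": "string"}')
second = book.add_version('{"type": "int"}')
```

In this sketch the fingerprint is what lets the model recognise that a schema text has already been registered and return the existing version instead of creating a duplicate.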
Schema Registry Component Architecture
[Diagram: component architecture]
• SR Web Server: Schema Registry Web App, REST API
• Schema Registry Client: Java Client
• Integrations: Nifi Processors, Kafka Ser/Des, StreamLine
• Schema Storage (pluggable storage): MySQL, Postgres, In-Memory
• Serializer/Deserializer Jar Storage: Local File System, HDFS
Writer/Reader schemas
Writer schema
– Senders/producers serialize payloads according to this schema, known as the writer's schema
Reader/Projection schema
– Receivers use this schema to project a received payload that was written with a writer schema
[Diagram: the Sender writes with the Writer Schema; the Receiver reads with the Writer Schema and a Projection Schema]
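Projection can be illustrated with plain dictionaries. The sketch below assumes a simplified rule set (fields matched by name only, no type promotion or aliases), and the function name `project` is hypothetical. It keeps the fields the reader schema asks for, fills in defaults for fields the writer did not write, and drops everything else.

```python
def project(record, writer_fields, reader_fields):
    """Project a record written with the writer schema onto the reader schema.

    Both field lists use Avro-style field dictionaries with a "name" key
    and an optional "default" key.
    """
    written = {f["name"] for f in writer_fields}
    projected = {}
    for f in reader_fields:
        name = f["name"]
        if name in written:
            projected[name] = record[name]   # field present in the payload
        elif "default" in f:
            projected[name] = f["default"]   # reader supplies its default
        else:
            raise ValueError("no value or default for field: " + name)
    return projected


writer_fields = [
    {"name": "id", "type": "string"},
    {"name": "color", "type": "string", "default": "blue"},
]
reader_fields = [
    {"name": "id", "type": "string"},
    {"name": "pages", "type": "int", "default": -1},
]

result = project({"id": "b1", "color": "red"}, writer_fields, reader_fields)
print(result)
```

Here the reader does not ask for `color`, so it is dropped, and `pages` is filled from the reader's default because the writer never wrote it.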
Schema evolution
[Diagram: producers and consumers running different schema versions side by side: Producer v2, Consumer v2, Producer v1, Producer v4, Consumer v5, Producer v1, Consumer v7]
Schema Compatibility Policies
What is a compatibility policy?
– Defines the rules for how a schema can evolve
– Subsequent version updates have to honor the schema's original compatibility policy
Policies supported
– Backward
– Forward
– Both
– None
Backward compatibility
A new version of a schema is compatible with earlier versions of that schema: data written with an earlier version of the schema can be read with the new version.
V2 adds the field "pages" with a default, so a V2 reader can fill it in when reading V1 data.
V1
{
  "type": "record",
  "name": "book",
  "namespace": "registry.example",
  "fields": [
    { "name": "id", "type": "string" },
    { "name": "color", "type": "string", "default": "blue" }
  ]
}
V2
{
  "type": "record",
  "name": "book",
  "namespace": "registry.example",
  "fields": [
    { "name": "id", "type": "string" },
    { "name": "color", "type": "string", "default": "blue" },
    { "name": "pages", "type": "int", "default": -1 }
  ]
}
Forward compatibility
An existing schema is compatible with future versions of that schema: data written with a new version of the schema can still be read with the old version.
V2 adds the field "pages" without a default. A V1 reader simply ignores the extra field, but a V2 reader cannot read V1 data, so this change is forward compatible only.
V1
{
  "type": "record",
  "name": "book",
  "namespace": "registry.example",
  "fields": [
    { "name": "id", "type": "string" },
    { "name": "color", "type": "string", "default": "blue" }
  ]
}
V2
{
  "type": "record",
  "name": "book",
  "namespace": "registry.example",
  "fields": [
    { "name": "id", "type": "string" },
    { "name": "color", "type": "string", "default": "blue" },
    { "name": "pages", "type": "int" }
  ]
}
Both/Full compatibility
A new version of the schema is both backward and forward compatible.
V2 adds the fields "pages" and "title", both with defaults.
V1
{
  "type": "record",
  "name": "book",
  "namespace": "registry.example",
  "fields": [
    { "name": "id", "type": "string" },
    { "name": "color", "type": "string", "default": "blue" }
  ]
}
V2
{
  "type": "record",
  "name": "book",
  "namespace": "registry.example",
  "fields": [
    { "name": "id", "type": "string" },
    { "name": "color", "type": "string", "default": "blue" },
    { "name": "pages", "type": "int", "default": -1 },
    { "name": "title", "type": "string", "default": "" }
  ]
}
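The compatibility policies can be checked mechanically for simple record schemas. The sketch below is a simplification and not the registry's actual checker: it looks only at added and removed fields (no type promotion, aliases or nested records), and the function names `can_read` and `classify` are hypothetical.

```python
def can_read(reader, writer):
    """True if data written with `writer` can be read with `reader`.

    Every reader field must either exist in the writer schema or carry a
    default. Fields the reader does not know about are ignored.
    """
    written = {f["name"] for f in writer["fields"]}
    return all(f["name"] in written or "default" in f for f in reader["fields"])


def classify(old, new):
    """Return the strongest policy that the change from old to new satisfies."""
    backward = can_read(reader=new, writer=old)  # new schema reads old data
    forward = can_read(reader=old, writer=new)   # old schema reads new data
    if backward and forward:
        return "BOTH"
    if backward:
        return "BACKWARD"
    if forward:
        return "FORWARD"
    return "NONE"


def book(*fields):
    return {"type": "record", "name": "book",
            "namespace": "registry.example", "fields": list(fields)}


ID = {"name": "id", "type": "string"}
COLOR = {"name": "color", "type": "string", "default": "blue"}

v1 = book(ID, COLOR)
v2_with_default = book(ID, COLOR, {"name": "pages", "type": "int", "default": -1})
v2_no_default = book(ID, COLOR, {"name": "pages", "type": "int"})

print(classify(v1, v2_with_default))  # BOTH
print(classify(v1, v2_no_default))    # FORWARD
```

Under this simplified rule, adding a field with a default satisfies both directions, while adding a field without a default is forward compatible only.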
Schema composition
• Schemas can be shared and reused within existing schemas
• Built-in support in the default serializer/deserializer for building effective schemas
{
  "name": "account",
  "namespace": "com.hortonworks.example.types",
  "includeSchemas": [
    { "name": "utils" }
  ],
  "type": "record",
  "fields": [
    { "name": "name", "type": "string" },
    { "name": "id", "type": "com.hortonworks.datatypes.uuid" }
  ]
}
{
  "name": "uuid",
  "type": "record",
  "namespace": "com.hortonworks.datatypes",
  "doc": "A Universally Unique Identifier, in canonical form in lowercase. This is generated from java.util.UUID Example: de305d54-75b4-431b-adb2-eb6b9e546014",
  "fields": [
    { "name": "value", "type": "string", "default": "" }
  ]
}
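One way to picture how an effective schema could be assembled from "includeSchemas" is sketched below. The slides do not show how the default serializer/deserializer actually does this, so the resolution order, the function name `effective_schemas`, and the assumption that the uuid record is registered under the schema name "utils" are all illustrative.

```python
import json


def effective_schemas(name, registry, seen=None):
    """Return the schemas needed to parse `name`, dependencies first.

    `registry` maps a registered schema name to its schema text.
    """
    if seen is None:
        seen = set()
    if name in seen:          # already resolved, also guards against cycles
        return []
    seen.add(name)
    schema = json.loads(registry[name])
    ordered = []
    for include in schema.pop("includeSchemas", []):
        ordered.extend(effective_schemas(include["name"], registry, seen))
    ordered.append(schema)    # the schema itself comes after its includes
    return ordered


registry = {
    "utils": json.dumps({
        "name": "uuid", "type": "record",
        "namespace": "com.hortonworks.datatypes",
        "fields": [{"name": "value", "type": "string", "default": ""}],
    }),
    "account": json.dumps({
        "name": "account", "namespace": "com.hortonworks.example.types",
        "includeSchemas": [{"name": "utils"}],
        "type": "record",
        "fields": [
            {"name": "name", "type": "string"},
            {"name": "id", "type": "com.hortonworks.datatypes.uuid"},
        ],
    }),
}

resolved = effective_schemas("account", registry)
print([s["name"] for s in resolved])
```

Returning the included schemas first means a parser that processes the list in order has already seen `com.hortonworks.datatypes.uuid` by the time the account record refers to it.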
Sender/Receiver flow
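[Diagram: the Sender passes data to a Serializer, which uses a Schema Registry Client with a local schema/serdes cache and writes version + payload to the Message Store. The Receiver's Deserializer reads version + payload from the Message Store and uses its own Schema Registry Client with a local schema/serdes cache. Both clients talk to a set of SchemaRegistry instances, which are backed by Schema Storage and SerDes Storage.]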
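The flow in the diagram can be sketched with an in-memory stand-in for the registry. Everything here is hypothetical (class names, the message layout as a dictionary, JSON as the payload encoding). The point is only to show the version travelling with the payload and the receiver's local cache avoiding repeated registry lookups.

```python
import json


class InMemoryRegistry:
    """Stand-in for the remote Schema Registry service."""

    def __init__(self):
        self._versions = {}   # (schema name, version) -> schema text
        self.lookups = 0      # counts remote fetches, for illustration

    def register(self, name, text):
        version = 1 + sum(1 for (n, _) in self._versions if n == name)
        self._versions[(name, version)] = text
        return version

    def get(self, name, version):
        self.lookups += 1
        return self._versions[(name, version)]


class CachingClient:
    """Client that keeps fetched schemas in a local cache."""

    def __init__(self, registry):
        self._registry = registry
        self._cache = {}

    def schema_for(self, name, version):
        key = (name, version)
        if key not in self._cache:
            self._cache[key] = self._registry.get(name, version)
        return self._cache[key]


def send(record, name, version):
    # The schema itself is not attached, only its name and version.
    return {"schema": name, "version": version, "payload": json.dumps(record)}


def receive(message, client):
    schema = json.loads(client.schema_for(message["schema"], message["version"]))
    record = json.loads(message["payload"])
    return {f["name"]: record[f["name"]] for f in schema["fields"]}


registry = InMemoryRegistry()
version = registry.register("book", json.dumps({
    "type": "record", "name": "book",
    "fields": [{"name": "id", "type": "string"},
               {"name": "color", "type": "string"}],
}))
client = CachingClient(registry)
messages = [send({"id": "b1", "color": "red"}, "book", version),
            send({"id": "b2", "color": "blue"}, "book", version)]
records = [receive(m, client) for m in messages]
```

After both messages are received, `registry.lookups` is 1, because the second message is served from the client's cache.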
Serializers/Deserializers
Snapshot based serializer/deserializer
– Serializes the complete payload
– Deserializes the payload to the respective type
Pull based serializer/deserializer
– Serializes only the elements that are required and ignores the others
– Pulls out only the elements required to build the desired object
Push based deserializer
– Gives callbacks to receive parsing events for the respective fields in the schema
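The difference between the three styles can be sketched on a JSON payload. The function names are hypothetical and this is not the registry's serializer API. For brevity the pull and push variants below parse the whole payload first, whereas a real implementation would presumably avoid decoding fields it does not need.

```python
import json


def snapshot_deserialize(data):
    """Snapshot style: turn the complete payload into one object."""
    return json.loads(data)


def pull_deserialize(data, wanted):
    """Pull style: the caller names the fields it wants and gets only those."""
    record = json.loads(data)
    return {name: record[name] for name in wanted}


def push_deserialize(data, schema, on_field):
    """Push style: the caller receives one callback per field in the schema."""
    record = json.loads(data)
    for f in schema["fields"]:
        on_field(f["name"], record[f["name"]])


payload = json.dumps({"id": "b1", "color": "red", "pages": 120}).encode("utf-8")
book_schema = {"fields": [{"name": "id"}, {"name": "color"}, {"name": "pages"}]}

whole = snapshot_deserialize(payload)
only_id = pull_deserialize(payload, ["id"])
events = []
push_deserialize(payload, book_schema, lambda name, value: events.append((name, value)))
```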
Schema registry client
REST based client
Caching
– Metadata
– Schema versions
– Ser/des libs and class loaders
URL selectors
– Round robin
– Failover
– Custom
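The round robin and failover strategies listed above can be sketched as two small classes sharing the same two methods. The class names, method names and URLs are hypothetical and do not reflect the client's real selector interface.

```python
import itertools


class RoundRobinSelector:
    """Hands out registry URLs in turn, spreading requests across instances."""

    def __init__(self, urls):
        if not urls:
            raise ValueError("at least one URL is required")
        self._cycle = itertools.cycle(list(urls))

    def select(self):
        return next(self._cycle)

    def failed(self, url):
        pass  # nothing to do: the next call moves on anyway


class FailoverSelector:
    """Sticks to one URL and moves to the next only after a failure."""

    def __init__(self, urls):
        if not urls:
            raise ValueError("at least one URL is required")
        self._urls = list(urls)
        self._index = 0

    def select(self):
        return self._urls[self._index]

    def failed(self, url):
        # Ignore stale reports about a URL that is no longer the current one.
        if url == self._urls[self._index]:
            self._index = (self._index + 1) % len(self._urls)


urls = ["http://sr1.example:9090", "http://sr2.example:9090"]
round_robin = RoundRobinSelector(urls)
failover = FailoverSelector(urls)
```

A custom selector would implement the same two methods with its own policy, for example preferring instances in the same data centre.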
HA
Storage provider
– Depends on the transactional support of the underlying SQL stores
– Spin up as many schema registry instances as required
Supports HA at SchemaRegistry
– Using ZK/Curator
– Automatic failover of master
– Master gets all writes
– Slaves receive only reads
[Diagram: two deployments, each with three SchemaRegistry instances sharing one storage]
Integration of Schema Registry
Kafka
– Using the producer/consumer API for serializer/deserializer
Nifi processors for Schema Registry
– Fetch schema
– Serialize/deserialize with schema
StreamLine
– Look up the schema of a Kafka topic
Kafka integration
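[Diagram: the Producer uses KafkaAvroSerializer, which uses a Schema Registry Client with a local schema/serdes cache and writes version + payload to Kafka. The Consumer uses KafkaAvroDeserializer, which reads version + payload from Kafka and uses its own Schema Registry Client with a local schema/serdes cache. Both clients talk to a set of SchemaRegistry instances.]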
Kafka Avro ser/des protocol
Ser/des can be implemented with different protocols
The default ser/des sends the protocol and schema versions as part of the binary payload of Kafka messages
– Can be enhanced to use headers/metadata instead of the message payload
– Custom ser/des can be registered for schemas
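Carrying the protocol and schema version inside the message payload amounts to prefixing the serialized record with a small fixed header. The layout below (1-byte protocol id, 8-byte schema id, 4-byte schema version, big-endian) is an assumption chosen for illustration; the slides do not specify the actual byte layout.

```python
import struct

# Assumed header layout: protocol id (1 byte), schema id (8 bytes),
# schema version (4 bytes), all big-endian with no padding.
HEADER = struct.Struct(">BQI")


def encode(protocol_id, schema_id, version, payload):
    """Prefix the serialized payload with the schema header."""
    return HEADER.pack(protocol_id, schema_id, version) + payload


def decode(message):
    """Split a message into its header fields and the remaining payload."""
    protocol_id, schema_id, version = HEADER.unpack_from(message)
    return protocol_id, schema_id, version, message[HEADER.size:]


message = encode(1, 42, 3, b"avro-bytes")
decoded = decode(message)
```

The deserializer reads the header first, fetches (or finds in its cache) the schema for that id and version, and only then decodes the remaining bytes. Moving the same fields into message headers would leave the payload as pure serialized data.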
Nifi integration
Nifi Controller Service
Nifi processors
– Transforms
  • Avro – CSV
  • Avro – Json
  • Json – CSV
– Extracting Avro fields
Schema Registry UI
WIP/Future enhancements
Security
– Kerberos support
– Default authorizers and Apache Ranger support
Archiving schemas
Notifications
– New versions
– Archiving
Converters
Try it out!
https://github.com/hortonworks/registry
https://groups.google.com/forum/#!forum/registry
Open sourced under the Apache license
Apache incubation soon
Contributions are welcome
Q & A