The Operational Data Hub - DAMA Chicago · 2018. 10. 17.
19 October 2018 © MARKLOGIC CORPORATION
The Operational Data Hub
Ken Krupa, CTO, MarkLogic
Data Integration
SLIDE: 3
EDW – What You Get
INTEGRATION PATTERN FOR ANALYSIS
§ Bill Inmon: “a subject oriented, nonvolatile, integrated, time variant collection of data in support of management's decisions”
§ Integration of multiple upstream OLTP line-of-business systems for downstream analysis
§ Typically quantitative in nature
§ Accompanied by decision support dashboards
§ A cross-enterprise view in support of the observe-the-business function
SLIDE: 4
Enterprise Application Integration (EAI)
INTEGRATION PATTERN FOR OPERATIONS
§ Emerged in various flavors from the late 90s into the 2000s (point-to-point, SOA, ESB, etc.)
§ Application-oriented, focusing on interoperability at a coarse-grained level
§ Data copying and enrichment from endpoint to endpoint
§ Addressed integration for the run-the-business functions
SLIDE: 5
Enterprise Data Flow
THE ENTERPRISE AT 35K FT.
§ Distinction between run-the-business and observe-the-business functions
§ Enterprise Data Management functions exist to manage transformation, master data and distribution
§ Data is transformed (and copied) to suit various needs
SLIDE: 6
Another Look
WHERE THINGS STAND TODAY
1. ETL: Costly, time-consuming and brittle
2. Master Data Management: High failure rate (~75%)
3. Data Warehouse: Stale data and agility challenges
4. Data Marts: A reaction to Enterprise Data Warehouse shortcomings
5. SOA and similar: Extreme function focus “papers over” data implications
6. Data Distribution: Time-to-delivery challenges exacerbated by change
Overall Impact of Analysis on Operations: An ever-increasing distance between discovery and operations
SLIDE: 7
A Typical Enterprise
Root Cause
SLIDE: 9
The Problem Is …
SLIDE: 10
“5 Whys” & Root Cause: Example 1
Example 1: Customer Service Call Center
Problem: The amount of time customer service representatives spend trying to find customer information is very high.
1. Why? – They may need to search across 16 different systems to find what they’re looking for.
2. Why? – The systems all have data about a customer, yet they each have different data models.
3. Why? – Because they serve different operational functions and the data has yet to be integrated.
4. Why? – Each system contains overlapping data as well as distinctly different parts of the data. Combining that data to allow for a customer 360 has proved difficult.
5. Why? – Because a relational database model must account for all data model variances up front, and the schema must be created before development can begin.
SLIDE: 11
“5 Whys” & Root Cause: Example 2
Example 2: Investment Bank - Derivatives Post-trade Processing, Project Delivery
Problem: After more than 18 months, the project team still has not started development and does not expect to start for another 3 to 6 months.
1. Why? – The data model is not finished.
2. Why? – They haven’t accounted for all of the asset classes.
3. Why? – Every time they look at a new asset class, the model has to be redone.
4. Why? – The shapes of each entity are very different and that causes difficulty for the modeling team.
5. Why? – Because a relational database model must account for all data model variances up front, and the schema must be created before development can begin.
SLIDE: 12
SLIDE: 13
Fixing Key Flaws
SLIDE: 15
Point-to-point “Integration”
RUN THE BUSINESS
§ Pass messages around between endpoints
§ Gloss over data persistence (“black box”)
§ No holistic view
However… the messages tend to be richly modeled and contextual
SLIDE: 16
The ETL Quagmire
OBSERVE THE BUSINESS
§ Create one model to “rule them all”
§ Force fit data into that model
§ Throw away what doesn’t fit
However… this is data-centric
SLIDE: 17
There’s a pattern here…
THE “BOW TIE”
Many inputs, many outputs … across common data
SLIDE: 18
SLIDE: 19
Operational Data Hub (ODH)
NEW ENTERPRISE INTEGRATION PATTERN
ODH Principles
SLIDE: 21
Document/Object Model
ODH PRINCIPLES
§ Self-describing schema: JSON or XML
§ Makes data easy to ingest
§ Models are data: All models are relevant
{
  "customer": {
    "id": 123,
    "first": "Frodo",
    "last": "Baggins",
    "town": "The Shire",
    "post": "097364",
    "gis_coords": "FECABEBA879"
  }
}
{
  "customer": {
    "customer_id": 456,
    "fname": "Samwise",
    "lname": "Gamgee",
    "postal": "768410",
    "spouse": 789
  }
}
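As a minimal sketch of the "easy to ingest" point, assuming Python and its standard `json` module: both customer documents from this slide load without any predeclared schema, and their field names can then be inspected as ordinary data. The helper name `field_names` is hypothetical and used only for illustration.

```python
import json

# The two customer documents from the slide, with straight quotes restored.
DOC_A = (
    '{"customer": {"id": 123, "first": "Frodo", "last": "Baggins", '
    '"town": "The Shire", "post": "097364", "gis_coords": "FECABEBA879"}}'
)
DOC_B = (
    '{"customer": {"customer_id": 456, "fname": "Samwise", "lname": "Gamgee", '
    '"postal": "768410", "spouse": 789}}'
)


def field_names(raw: str) -> set:
    """Parse one document and return the field names found under "customer"."""
    return set(json.loads(raw)["customer"].keys())


fields_a = field_names(DOC_A)
fields_b = field_names(DOC_B)

# Field names the two source models have in common (none, in this example).
shared = fields_a & fields_b
```

Both documents parse even though their models share no field names, which is the sense in which the source models themselves are data that can be loaded first and reconciled later.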
SLIDE: 22
Data Harmonization
ODH PRINCIPLES
§ Create canonical models as you go
§ Retain source models
§ Capture lineage
{
  "metadata": {
    "source": "system-1",
    "load-date": "2017-09-13"
  },
  "canonical": {
    "id": 123,
    "name": "Frodo Baggins",
    "postal": "097364"
  },
  "customer": {
    "id": 123,
    "first": "Frodo",
    "last": "Baggins",
    "town": "The Shire",
    "post": "097364",
    "gis_coords": "FECABEBA879"
  }
}
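The envelope above can be produced by a small harmonization step. The following is a sketch only, in Python, under stated assumptions: the function names, the `MAPPERS` table, and the label `system-2` for the second source are hypothetical, and the mapping rules are inferred from the sample documents rather than taken from any MarkLogic API.

```python
from typing import Any, Callable, Dict

Record = Dict[str, Any]


def canonical_from_system_1(customer: Record) -> Record:
    """Map the first source model (id/first/last/post) to the canonical shape."""
    return {
        "id": customer["id"],
        "name": f'{customer["first"]} {customer["last"]}',
        "postal": customer["post"],
    }


def canonical_from_system_2(customer: Record) -> Record:
    """Map the second source model (customer_id/fname/lname/postal)."""
    return {
        "id": customer["customer_id"],
        "name": f'{customer["fname"]} {customer["lname"]}',
        "postal": customer["postal"],
    }


# Hypothetical registry: one mapping function per source system.
MAPPERS: Dict[str, Callable[[Record], Record]] = {
    "system-1": canonical_from_system_1,
    "system-2": canonical_from_system_2,  # assumed source name
}


def harmonize(source: str, load_date: str, customer: Record) -> Record:
    """Wrap a source record in an envelope: lineage, canonical view, original."""
    return {
        "metadata": {"source": source, "load-date": load_date},
        "canonical": MAPPERS[source](customer),
        "customer": customer,  # the source model is retained untouched
    }


frodo = {"id": 123, "first": "Frodo", "last": "Baggins", "town": "The Shire",
         "post": "097364", "gis_coords": "FECABEBA879"}
envelope = harmonize("system-1", "2017-09-13", frodo)
```

Because the source record is kept alongside the canonical view, a new canonical field can be added later by extending a mapper and re-running it, without going back to the upstream system.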
SLIDE: 23
RDF Triples: Relationships & Metadata
ODH PRINCIPLES
§ W3C standard
§ Relationships with context
§ Metadata with context: e.g. PROV-O
{
  "metadata": {
    "source": "system-1",
    "load-date": "2017-09-13"
  },
  "triple": {
    "subject": "/canonical/customer/123",
    "predicate": "http://example.org/friendOf",
    "object": "/canonical/customer/456"
  },
  "canonical": {
    "id": 123,
    "name": "Frodo Baggins",
    "postal": "097364"
  },
  "customer": {
    "id": 123,
    "first": "Frodo",
    "last": "Baggins",
    "town": "The Shire",
    "post": "097364",
    "gis_coords": "FECABEBA879"
  }
}
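To illustrate how a triple stored in the envelope expresses a relationship that can be followed, here is a small Python sketch. It uses plain dictionaries and a linear scan, not a real RDF store or SPARQL engine; the function name `objects_of` is hypothetical, and the envelope is trimmed to the fields the lookup needs.

```python
from typing import Any, Dict, List

FRIEND_OF = "http://example.org/friendOf"

# One envelope from the slide, reduced to metadata, triple, and canonical view.
envelopes: List[Dict[str, Any]] = [
    {
        "metadata": {"source": "system-1", "load-date": "2017-09-13"},
        "triple": {
            "subject": "/canonical/customer/123",
            "predicate": FRIEND_OF,
            "object": "/canonical/customer/456",
        },
        "canonical": {"id": 123, "name": "Frodo Baggins", "postal": "097364"},
    },
]


def objects_of(docs: List[Dict[str, Any]], subject: str, predicate: str) -> List[str]:
    """Return the object of every triple matching the given subject and predicate."""
    return [
        doc["triple"]["object"]
        for doc in docs
        if "triple" in doc
        and doc["triple"]["subject"] == subject
        and doc["triple"]["predicate"] == predicate
    ]


friends = objects_of(envelopes, "/canonical/customer/123", FRIEND_OF)
```

Since the triple travels inside the same envelope as the metadata, the relationship carries its lineage (source and load date) with it.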
SLIDE: 24
Operational & Real-time
ODH PRINCIPLES
§ Indexing to support query and search
- Raw data
- Curated data
§ Support for bi-directional data access
- Transactions
- “Read/write” queries
- Cross-LoB operations
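The idea of indexing both raw and curated data can be sketched with a toy in-memory index in Python. This is an illustration under assumptions, not a description of how any particular database builds its indexes: the names `walk`, `build_index`, and `lookup`, and the document URI, are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterator, List, Set, Tuple


def walk(node: Any, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted path, leaf value) pairs for every leaf in a JSON-like tree."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from walk(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item, path)
    else:
        yield path, node


def build_index(docs: Dict[str, Any]) -> Dict[Tuple[str, Any], Set[str]]:
    """Map each (path, value) pair to the set of document URIs that contain it."""
    index: Dict[Tuple[str, Any], Set[str]] = defaultdict(set)
    for uri, doc in docs.items():
        for path, value in walk(doc):
            index[(path, value)].add(uri)
    return index


def lookup(index: Dict[Tuple[str, Any], Set[str]], path: str, value: Any) -> List[str]:
    """Return the sorted URIs of documents where the path holds the value."""
    return sorted(index.get((path, value), set()))


docs = {
    "/customer/123.json": {
        "metadata": {"source": "system-1", "load-date": "2017-09-13"},
        "canonical": {"id": 123, "name": "Frodo Baggins", "postal": "097364"},
        "customer": {"id": 123, "first": "Frodo", "last": "Baggins", "post": "097364"},
    },
}
index = build_index(docs)

by_curated = lookup(index, "canonical.postal", "097364")  # curated field
by_raw = lookup(index, "customer.post", "097364")         # raw source field
```

Both the curated path and the raw source path resolve to the same document, so queries can be written against either view.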
SLIDE: 25
And Secure
ODH PRINCIPLES
§ This system will hold the crown jewels
§ Robust role-based security
§ Encryption on the wire
§ Encryption at rest (great for cloud)
§ … and Governed by Default
SLIDE: 26
ADVANCED SECURITY
Safe Data Access & Data Sharing
§ The most secure modern database
§ Proven in the world’s most demanding security environments
§ Fine-grained access controls so the right data is shared with the right people
§ Protects from hackers and insider threats
ODH: In the Wild
SLIDE: 28
ODH under another name?
“…if any of these extracted insights are going to be used to optimize transactions, they have to be fed back into Systems of Engagement and Systems of Record. All this calls for a platform solution, … labeled a System of Operation.”
Geoffrey Moore – Feb 23, 2017
https://www.linkedin.com/pulse/intelligent-computing-systems-how-enterprise-evolve-geoffrey-moore
SLIDE: 29
Financial Services Data Flows
ODH IN ACTION
ENTITY DATA
TRADE DATA
PRICING ENGINE
MARKET DATA
POLICY RULES
OTHER DATA FEEDS
TRANSACTIONAL APPS
- PRE AND POST TRADE PROCESSING
- CONTENT PUBLISHING
- CUSTOMER APPS
ANALYTICS & BI
- REGULATORY COMPLIANCE
- RISK MANAGEMENT
OTHER DOWNSTREAM SYSTEMS
SLIDE: 30
Entertainment Studio
ODH IN ACTION
MATERIAL REQUESTS
CLIENT PROFILES
DIGITAL ASSET MANAGEMENT
ARCHIVE
MASTER DATA (MDM)
PARTNER PORTAL
DISTRIBUTION PARTNER
USER WORKFLOW
STUDIO SHOPS
STUDIO CUSTOM PRODUCTS
SLIDE: 31
ODH for Defense
ODH IN ACTION
SLIDE: 32
Healthcare.Gov
ODH IN ACTION
ODH in the cloud
SLIDE: 34
POWERFUL CLOUD SERVICE
MarkLogic Data Hub Service
§ The fastest way to integrate data
§ No infrastructure to buy or manage
§ Fixed and predictable cost for varying and unpredictable workloads
§ Fully automated operations for a fully integrated service
SLIDE: 35
Questions?
SLIDE: 36
Further Reading
http://marklogic.com/odh-ebook
Thank you!
@kenkrupa