Digging Deep: Discover and Excavate Your Data Artifacts

Ron Huizenga & Rob Loranger

EMBARCADERO TECHNOLOGIES

Agenda

• Complex data landscape

• Artifact discovery and identification

• Conquer complexity

– Approach

– Tools

– Techniques for different platforms

• Derive understanding from naming and classification

• Business meaning and value

Evolution of Systems: Analogy

Evolution:

• 38 years of construction

• 147 builders

• No Blueprints

• No Planning

Result:

• 7 stories

• 65 doors to blank walls

• 13 staircases abandoned

• 24 skylights in floors

• 160 rooms, 950 doors

• 47 fireplaces, 17 chimneys

• Miles of hallways

• Secret passages in walls

• 10,000 window panes (all bathrooms are fitted with windows)

Complex Data Landscape

• Composed of:

– Proliferation of disparate systems

– Mismatched departmental solutions

– Many database platforms

– Big data platforms

– ERP, SaaS

– Obsolete legacy systems

• Compounded by:

– Poor decommissioning strategy

– Point-to-point interfaces

– Data warehouse, data marts, ETL …

Artifact Discovery and Identification

• Identify candidate databases

• Reverse engineer existing databases into models

• Apply naming standards (comprehension)

• Classify through metadata

• Analyze redundancies & gaps

• Data lineage / chain of custody
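
As one way to picture the "analyze redundancies & gaps" step, the sketch below compares two reverse-engineered models held as plain JavaScript objects. The model shape, the table and column names, and the `compareModels` helper are all hypothetical; a modeling tool carries far richer metadata than a list of column names.

```javascript
// Hypothetical, simplified models: table name -> list of column names.
const crmModel = {
  customer: ["customer_id", "name", "email"],
  contact: ["contact_id", "customer_id", "phone"],
};
const billingModel = {
  customer: ["customer_id", "name", "tax_code"],
  invoice: ["invoice_id", "customer_id", "amount"],
};

// Report tables present in both models (candidate redundancies)
// and tables present in only one model (candidate gaps).
function compareModels(a, b) {
  const namesA = Object.keys(a);
  const namesB = Object.keys(b);
  const shared = namesA.filter((t) => namesB.includes(t));
  const onlyInA = namesA.filter((t) => !namesB.includes(t));
  const onlyInB = namesB.filter((t) => !namesA.includes(t));
  // For each shared table, list the columns both definitions have in common.
  const overlap = {};
  for (const t of shared) {
    overlap[t] = a[t].filter((c) => b[t].includes(c));
  }
  return { shared, onlyInA, onlyInB, overlap };
}

const report = compareModels(crmModel, billingModel);
console.log(JSON.stringify(report, null, 2));
```

Matching on exact names is the simplest possible rule; it only becomes useful once naming standards have made names comparable across systems.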

ER/Studio – Tools to Conquer Complexity

• True multi-level sub-models (hierarchy)

• Reverse engineering (extensive list of platforms)

• Comprehensive metadata extensions (attachments)

• Naming standards

• Universal mappings

• Business glossaries

• Macros

• Data lineage

• Business Architect – data context (processes)

Reverse Engineering & MetaWizard

• The ability to create a data model by connecting to an existing data store

– Native connector

• Relational

• Big data

– ODBC

– Can also use a SQL script rather than a direct connection

– Other models and metadata repositories

• Vital to map & analyze complex data landscapes
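
To illustrate the idea of reverse engineering from a SQL script rather than a live connection, here is a minimal sketch that turns simple `CREATE TABLE` statements into a list of tables and columns. The `parseDdl` helper and the sample script are hypothetical, and the parser handles only a narrow subset of DDL; it is not how ER/Studio or MetaWizard work internally.

```javascript
// Hypothetical DDL script standing in for a live database connection.
const ddl = `
CREATE TABLE patron (
  patron_id INT PRIMARY KEY,
  name VARCHAR(100) NOT NULL
);
CREATE TABLE book (
  book_id INT PRIMARY KEY,
  title VARCHAR(200),
  publisher_id INT
);
`;

// Parse a narrow subset of DDL: CREATE TABLE name ( col type ..., ... );
// Table-level constraints and types containing commas, such as
// DECIMAL(10,2), are deliberately out of scope for this sketch.
function parseDdl(script) {
  const model = [];
  const tablePattern = /CREATE\s+TABLE\s+(\w+)\s*\(([\s\S]*?)\)\s*;/gi;
  for (const match of script.matchAll(tablePattern)) {
    const columns = match[2]
      .split(",")
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
      .map((line) => {
        // The first token is the column name, the second its data type.
        const [name, type] = line.split(/\s+/);
        return { name, type };
      });
    model.push({ table: match[1], columns });
  }
  return model;
}

console.log(JSON.stringify(parseDdl(ddl), null, 2));
```

A real script would need a proper SQL parser; the point is only that the model is derived from the data store's own definitions, not written by hand.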

Native Support for Big Data

• Ability to model big data constructs

– Nested objects

– Nested object arrays

• Natively reverse engineer big data platforms

– Internal to tool as opposed to MetaWizard

• Forward engineering

ER/Studio Native Big Data Support

• MongoDB

– Diagramming

– Reverse & Forward Engineering (JSON, BSON)

– MongoDB certification for 2.x and 3.0

• Certified for HDP 2.1

– Forward and reverse engineering

– Hive DDL

Hive & ER/Studio

Containment Relationship: Array of Nested Objects

// One patron document; "address" is an array of nested objects
db.patron.insert(
  {
    "_id": ObjectId("5367ddc4228cd006ab2bc60c"),
    name: "Joe Bookreader",
    address: [
      {
        street: "123 Fake Street",
        city: "Faketon",
        state: "MA"
      },
      {
        street: "1 Someother Street",
        zip: "12345"
      }
    ]
  })

// Two book documents; "checkout" is an array of nested objects
// whose "by" field holds the _id of the patron above
db.book.insert([
  {
    title: "MongoDB: The Definitive Guide",
    author: [ "Kristina Chodorow", "Mike Dirolf" ],
    published_date: ISODate("2010-09-24"),
    pages: 216,
    language: "English",
    publisher_id: ObjectId("5367dd99228cd006ab2bc60b"),
    available: 3,
    checkout: [ { by: ObjectId("5367ddc4228cd006ab2bc60c"), date: ISODate("2012-10-15") } ]
  },
  {
    title: "50 Tips and Tricks for MongoDB Developer",
    author: [ "Kristina Chodorow" ],
    published_date: ISODate("2011-05-06"),
    pages: 68,
    language: "English",
    publisher_id: ObjectId("5367dd99228cd006ab2bc60b")
  }])

ER/Studio – Big Data Notation Enhancement

• Physical Model

– Objects instead of tables

• Nested Objects

– “is contained in” relationship type
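
The following sketch suggests how nested objects in a document could be turned into model objects linked by "is contained in" relationships. The `inferContainment` helper and the output shape are hypothetical, and the sample is a trimmed copy of the patron document shown earlier; it is an illustration, not ER/Studio's reverse-engineering algorithm.

```javascript
// Trimmed copy of the patron document (string values only).
const patron = {
  name: "Joe Bookreader",
  address: [
    { street: "123 Fake Street", city: "Faketon", state: "MA" },
    { street: "1 Someother Street", zip: "12345" },
  ],
};

// Special BSON types such as ObjectId or dates would need extra handling;
// this check only separates plain nested objects from arrays and scalars.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Walk a document and record each nested object as its own model object,
// linked to its parent by an "is contained in" relationship.
function inferContainment(name, doc, model = { objects: {}, relationships: [] }) {
  const fields = (model.objects[name] = model.objects[name] || new Set());
  for (const [key, value] of Object.entries(doc)) {
    fields.add(key);
    // An array of objects is treated as many nested objects of one kind.
    const isArray = Array.isArray(value);
    const children = isArray ? value.filter(isPlainObject) : isPlainObject(value) ? [value] : [];
    for (const child of children) {
      const known = model.relationships.some((r) => r.child === key && r.parent === name);
      if (!known) {
        model.relationships.push({ child: key, parent: name, type: "is contained in", many: isArray });
      }
      inferContainment(key, child, model);
    }
  }
  return model;
}

const model = inferContainment("patron", patron);
console.log(model.relationships);
console.log([...model.objects.address]);
```

Because the two address entries carry different fields, the inferred `address` object ends up with the union of both, which is one reasonable choice when documents in a collection are not uniform.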

What about ERP and SaaS?

• Cryptic table and column names

• Internal data dictionaries

• Thousands of tables

• Often don’t implement referential integrity in the database
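
Since the business meaning of a packaged application's tables tends to live in its internal data dictionary rather than in the table names, one rough way to picture the lookup is sketched below. The table codes, descriptions, and the `describeTables` helper are invented for illustration and do not correspond to any particular ERP product.

```javascript
// Hypothetical extract of an ERP package's internal data dictionary:
// cryptic physical name -> business description.
const dictionary = new Map([
  ["CU01", "Customer Master"],
  ["SO10", "Sales Order Header"],
  ["SO11", "Sales Order Line"],
]);

// Hypothetical list of table names found in the database catalog.
const catalogTables = ["CU01", "SO10", "SO11", "ZTMP9"];

// Attach a business name where the dictionary has one, and flag the rest
// so they can be investigated by hand.
function describeTables(tables, dict) {
  return tables.map((physicalName) => ({
    physicalName,
    businessName: dict.get(physicalName) ?? null,
    documented: dict.has(physicalName),
  }));
}

const described = describeTables(catalogTables, dictionary);
console.log(described);
console.log(described.filter((t) => !t.documented).map((t) => t.physicalName));
```

The same dictionary is usually also the only place where relationships between tables are recorded when the database itself declares no foreign keys.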

Safyr – Technology Partnership

Safyr: How it works

[Diagram; recoverable labels: ERX, ER/Studio]

Comprehension: Naming Standards

• Extremely important

– Define, apply, enforce

• Real world objects

• Typically composed of

– Business terms

– Abbreviation for each

– Template (specify order)

– Case, prefixes, suffixes

Naming Standards Setup/Usage

• Typical use case: logical -> physical

– Entity name -> table name

– Attribute name -> column name

• Landscape mapping: physical -> logical

– Table name -> entity name

– Column name -> attribute name
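
A naming standard built from business terms, an abbreviation for each, and case rules can be applied in both directions. The sketch below assumes a tiny, hypothetical abbreviation list and two hypothetical helpers, `toPhysical` and `toLogical`; the template is simplified to "keep word order, join with underscores", which is an assumption and not ER/Studio's naming standards implementation.

```javascript
// Hypothetical abbreviation list: business term -> abbreviation.
const abbreviations = new Map([
  ["customer", "CUST"],
  ["account", "ACCT"],
  ["number", "NBR"],
  ["date", "DT"],
]);

// Reverse lookup used for physical -> logical mapping.
const expansions = new Map([...abbreviations].map(([term, abbr]) => [abbr, term]));

// Logical -> physical: abbreviate each word, upper-case, join with underscores.
function toPhysical(logicalName) {
  return logicalName
    .trim()
    .split(/\s+/)
    .map((word) => abbreviations.get(word.toLowerCase()) ?? word.toUpperCase())
    .join("_");
}

// Physical -> logical: expand each abbreviation and title-case the words.
function toLogical(physicalName) {
  return physicalName
    .split("_")
    .map((part) => expansions.get(part.toUpperCase()) ?? part.toLowerCase())
    .map((word) => word.charAt(0).toUpperCase() + word.slice(1))
    .join(" ");
}

console.log(toPhysical("Customer Account Number")); // CUST_ACCT_NBR
console.log(toLogical("CUST_ACCT_NBR")); // Customer Account Number
console.log(toLogical("ORDER_DT")); // Order Date
```

Words without an entry pass through unchanged apart from case, so gaps in the abbreviation list show up as unabbreviated names rather than errors.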

Naming Standards

Real-time update while typing

Classification Through Metadata: Attachments

Correlation & Duplication: Universal Mappings

Business Meaning: Glossary/Terms

Glossary Integration

• Associate ER/Studio Data Architect objects with Team Server glossary terms

– Model, submodel

– Entity, Table

– Attribute, Column

– Domain

– View

• Push terms to glossary

Data Lineage

Data Source Mapping

Demo

Thank you! You have chosen … wisely.

• Learn more about the ER/Studio product family: http://www.embarcadero.com/data-modeling

• Trial Downloads: http://www.embarcadero.com/downloads

• To arrange a demo, please contact Embarcadero Sales: [email protected], (888) 233-2224
