Digging Deep: Discover and Excavate Your Data Artifacts
Transcript of Digging Deep: Discover and Excavate Your Data Artifacts
EMBARCADERO TECHNOLOGIES
Digging Deep
Discover & Excavate Your Data Artifacts
Ron Huizenga & Rob Loranger
Agenda
• Complex data landscape
• Artifact discovery and identification
• Conquer complexity
– Approach
– Tools
– Techniques for different platforms
• Derive understanding from naming and classification
• Business meaning and value
Evolution of Systems: Analogy
Evolution:
• 38 years of construction
• 147 builders
• No Blueprints
• No Planning
Result:
• 7 stories
• 65 doors to blank walls
• 13 staircases abandoned
• 24 skylights in floors
• 160 rooms, 950 doors
• 47 fireplaces, 17 chimneys
• Miles of hallways
• Secret passages in walls
• 10,000 window panes (all bathrooms are fitted with windows)
Complex Data Landscape
• Composed of:
– Proliferation of disparate systems
– Mismatched departmental solutions
– Many database platforms
– Big data platforms
– ERP, SaaS
– Obsolete legacy systems
• Compounded by:
– Poor decommissioning strategy
– Point-to-point interfaces
– Data warehouse, data marts, ETL …
Artifact Discovery and Identification
• Identify candidate databases
• Reverse engineer existing databases into models
• Apply naming standards (comprehension)
• Classify through metadata
• Analyze redundancies & gaps
• Data lineage / chain of custody
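The redundancy analysis step can be sketched in a few lines. This is a minimal illustration, not how ER/Studio performs the comparison: the `inventory` object, its database names, and its table names are hypothetical stand-ins for the output of reverse engineering several databases, and matching is done on table name alone.

```javascript
// Hypothetical inventory: database name -> table names found by reverse engineering.
const inventory = {
  crm: ["customer", "contact", "order"],
  billing: ["customer", "invoice"],
  warehouse: ["order", "shipment"],
};

// Report every table name that occurs in more than one database.
function findRedundancies(inv) {
  const seen = new Map(); // table name -> databases that contain it
  for (const [db, tables] of Object.entries(inv)) {
    for (const table of tables) {
      const key = table.toLowerCase();
      if (!seen.has(key)) seen.set(key, []);
      seen.get(key).push(db);
    }
  }
  const result = {};
  for (const [name, dbs] of seen) {
    if (dbs.length > 1) result[name] = dbs;
  }
  return result;
}

const redundancies = findRedundancies(inventory);
console.log(redundancies);
```

Name matching only finds candidates. Tables with the same name can hold different data, and the same data can hide behind different names, which is where naming standards and metadata classification come in.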
ER/Studio – Tools to Conquer Complexity
• True multi-level sub-models (hierarchy)
• Reverse engineering (extensive list of platforms)
• Comprehensive metadata extensions (attachments)
• Naming standards
• Universal mappings
• Business glossaries
• Macros
• Data lineage
• Business Architect – data context (processes)
Reverse Engineering & MetaWizard
• The ability to create a data model by connecting to an existing data store
– Native connector (relational, big data)
– ODBC
– Can also be SQL script rather than direct connection
– Other models and metadata repositories
• Vital to map & analyze complex data landscapes
Native Support for Big Data
• Ability to model big data constructs
– Nested objects
– Nested object arrays
• Natively reverse engineer big data platforms
– Internal to tool as opposed to MetaWizard
• Forward engineering
ER/Studio Native Big Data Support
• MongoDB
– Diagramming
– Reverse & Forward Engineering (JSON, BSON)
– MongoDB certification for 2.x and 3.0
• Certified for HDP 2.1 (Hortonworks Data Platform)
– Forward and reverse engineering
– Hive DDL
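To make forward engineering to Hive DDL concrete, here is a small sketch that turns a model with a nested object array into a `CREATE TABLE` statement. The model shape (`fields`, `arrayOf`) and the function names are assumptions made for illustration; the DDL that ER/Studio actually generates may differ.

```javascript
// Hypothetical model: a patron object with an array of nested address objects.
const patronModel = {
  name: "patron",
  fields: [
    { name: "name", type: "STRING" },
    {
      name: "address",
      arrayOf: [
        { name: "street", type: "STRING" },
        { name: "city", type: "STRING" },
        { name: "state", type: "STRING" },
        { name: "zip", type: "STRING" },
      ],
    },
  ],
};

// Map a field to a Hive type; nested object arrays become ARRAY<STRUCT<...>>.
function hiveType(field) {
  if (field.arrayOf) {
    const members = field.arrayOf.map((f) => `${f.name}:${hiveType(f)}`);
    return `ARRAY<STRUCT<${members.join(", ")}>>`;
  }
  return field.type;
}

// Emit one column definition per top-level field.
function toHiveDdl(model) {
  const cols = model.fields.map((f) => `  ${f.name} ${hiveType(f)}`);
  return `CREATE TABLE ${model.name} (\n${cols.join(",\n")}\n)`;
}

const ddl = toHiveDdl(patronModel);
console.log(ddl);
```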
Containment Relationship: Array of Nested Objects
db.patron.insert(
  {
    "_id": ObjectId("5367ddc4228cd006ab2bc60c"),
    name: "Joe Bookreader",
    address: [
      {
        street: "123 Fake Street",
        city: "Faketon",
        state: "MA"
      },
      {
        street: "1 Someother Street",
        zip: "12345"
      }
    ]
  })

db.book.insert([
  {
    title: "MongoDB: The Definitive Guide",
    author: [ "Kristina Chodorow", "Mike Dirolf" ],
    published_date: ISODate("2010-09-24"),
    pages: 216,
    language: "English",
    publisher_id: ObjectId("5367dd99228cd006ab2bc60b"),
    available: 3,
    checkout: [ { by: ObjectId("5367ddc4228cd006ab2bc60c"), date: ISODate("2012-10-15") } ]
  },
  {
    title: "50 Tips and Tricks for MongoDB Developer",
    author: [ "Kristina Chodorow" ],
    published_date: ISODate("2011-05-06"),
    pages: 68,
    language: "English",
    publisher_id: ObjectId("5367dd99228cd006ab2bc60b")
  }
])
ER/Studio – Big Data Notation Enhancement
• Physical Model
– Objects instead of tables
• Nested Objects
– “is contained in” relationship type
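As a rough illustration of how nested objects and their "is contained in" relationships might be derived from a sample document, the sketch below walks a patron-like document and records each nested object with its parent. The `discover` function and its output shape are hypothetical and say nothing about ER/Studio's internals. It assumes plain JSON values; shell types such as `ObjectId` and `ISODate` are left out so the code runs outside the mongo shell.

```javascript
// Sample document shaped like the patron example, without the ObjectId.
const patronDoc = {
  name: "Joe Bookreader",
  address: [
    { street: "123 Fake Street", city: "Faketon", state: "MA" },
    { street: "1 Someother Street", zip: "12345" },
  ],
};

const isPlainObject = (v) =>
  v !== null && typeof v === "object" && !Array.isArray(v);

// Walk a document and record each object, its parent, and the union of the
// fields seen across all of its instances.
function discover(doc, name, parent = null, found = new Map()) {
  if (!found.has(name)) {
    found.set(name, { parent, array: false, fields: new Set() });
  }
  const entry = found.get(name);
  for (const [key, value] of Object.entries(doc)) {
    entry.fields.add(key);
    const children = Array.isArray(value)
      ? value.filter(isPlainObject)
      : isPlainObject(value) ? [value] : [];
    for (const child of children) {
      const childName = `${name}.${key}`;
      discover(child, childName, name, found);
      found.get(childName).array = Array.isArray(value);
    }
  }
  return found;
}

const objects = discover(patronDoc, "patron");
for (const [name, info] of objects) {
  const rel = info.parent
    ? ` is contained in ${info.parent}${info.array ? " (array)" : ""}`
    : "";
  console.log(`${name}${rel}: ${[...info.fields].join(", ")}`);
}
```

Because the two address elements carry different fields, the nested object ends up with the union of both (street, city, state, zip), which is the kind of schema variation a reverse-engineering pass over document data has to account for.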
What about ERP and SaaS?
• Cryptic table and column names
• Internal data dictionaries
• Thousands of tables
• Often don’t implement referential integrity in the database
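One way to cope with cryptic names is to join the physical catalog against the application's internal data dictionary. The sketch below assumes a hypothetical dictionary extract; the table and column names are written in the style of ERP packages, and the labels are illustrative only, not taken from any vendor's dictionary.

```javascript
// Hypothetical extract of an application's internal data dictionary.
const dictionary = {
  KNA1: {
    label: "Customer Master",
    columns: { KUNNR: "Customer Number", NAME1: "Name" },
  },
  VBAK: {
    label: "Sales Document Header",
    columns: { VBELN: "Sales Document Number", KUNNR: "Customer Number" },
  },
};

// Physical tables as a reverse-engineering pass might report them.
const physical = [
  { table: "KNA1", columns: ["KUNNR", "NAME1"] },
  { table: "VBAK", columns: ["VBELN", "KUNNR", "ZZCUSTOM"] },
];

// Attach business labels, falling back to the physical name when the
// dictionary has no entry.
function describe(tables, dict) {
  return tables.map(({ table, columns }) => {
    const entry = dict[table] || { label: table, columns: {} };
    return {
      table,
      label: entry.label,
      columns: columns.map((c) => ({ column: c, label: entry.columns[c] || c })),
    };
  });
}

const described = describe(physical, dictionary);
console.log(JSON.stringify(described, null, 2));
```

Columns that share a dictionary label across tables, such as the customer number here, are candidate relationships worth reviewing when the database itself declares no foreign keys.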
Comprehension: Naming Standards
• Extremely important
– Define, apply, enforce
• Real world objects
• Typically composed of
– Business terms
– Abbreviation for each
– Template (specify order)
– Case, prefixes, suffixes
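The pieces of a naming standard listed above can be sketched as a small configuration plus one function. Everything in `standard` (the abbreviations, the separator, the `T_` prefix) is a made-up example, and the sketch keeps the word order of the logical name rather than applying an ordering template.

```javascript
// Hypothetical naming standard: business terms with abbreviations, a
// separator, and a table prefix. Case is fixed to upper case.
const standard = {
  abbreviations: { customer: "CUST", number: "NBR", address: "ADDR" },
  separator: "_",
  tablePrefix: "T_",
};

// Convert a logical name such as "Customer Address" to a physical name.
// Words without an abbreviation are upper-cased unchanged.
function toPhysical(logicalName, std, prefix = "") {
  const words = logicalName.trim().split(/\s+/);
  const parts = words.map(
    (w) => std.abbreviations[w.toLowerCase()] || w.toUpperCase()
  );
  return prefix + parts.join(std.separator);
}

console.log(toPhysical("Customer Address", standard, standard.tablePrefix)); // T_CUST_ADDR
console.log(toPhysical("Customer Number", standard)); // CUST_NBR
```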
Naming Standards Setup/Usage
• Typical use case
– Logical -> physical
– Entity name -> table name
– Attribute name -> column name
• Landscape mapping
– Physical -> logical
– Table name -> entity name
– Column name -> attribute name
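For landscape mapping the same standard is applied in reverse: physical names are expanded into readable logical names. A minimal sketch, assuming a hypothetical abbreviation list and an optional table prefix to strip:

```javascript
// Hypothetical abbreviation list: physical abbreviation -> business term.
const expansions = { CUST: "Customer", NBR: "Number", ADDR: "Address" };

// Expand a physical name such as "T_CUST_ADDR" into a logical name.
// Parts with no known expansion are converted to title case.
function toLogical(physicalName, abbrevs, prefix = "") {
  const bare = physicalName.startsWith(prefix)
    ? physicalName.slice(prefix.length)
    : physicalName;
  return bare
    .split("_")
    .filter(Boolean)
    .map(
      (part) =>
        abbrevs[part.toUpperCase()] ||
        part.charAt(0).toUpperCase() + part.slice(1).toLowerCase()
    )
    .join(" ");
}

console.log(toLogical("T_CUST_ADDR", expansions, "T_")); // Customer Address
console.log(toLogical("CUST_NBR", expansions)); // Customer Number
console.log(toLogical("ORDER_DATE", expansions)); // Order Date
```

Names that fall through to the title-case fallback are a useful signal: they mark terms missing from the abbreviation list and are candidates for review.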
Glossary Integration
• Associate ER/Studio Data Architect objects to Team Server glossary terms
– Model, submodel
– Entity, Table
– Attribute, Column
– Domain
– View
• Push terms to glossary
Thank you! You have chosen … wisely.
• Learn more about the ER/Studio product family: http://www.embarcadero.com/data-modeling
• Trial Downloads: http://www.embarcadero.com/downloads
• To arrange a demo, please contact Embarcadero Sales: [email protected], (888) 233-2224