Abstract
• This data governance presentation focuses on data and metadata standards. The intention of the presentation is to identify new standards or modernize existing standards for both data and metadata.
© 2011
Biography
• Antonio Amorin, President, Data Innovations, Inc.
  – Nineteen years of data modeling experience
  – Eleven years of data profiling experience
  – Delivered data modeling and data profiling solutions to numerous clients in the Midwest and East Coast
  – Presented at national and international conferences, user groups, webcasts, and at client sites
  – Founded Data Innovations, Inc. in 2002
Data Innovations, Inc.
• Established in 2002
• Based in northwest suburbs
• Professional Services:
  – Data Modeling
  – Data Profiling
  – Data Architecture
  – Metadata
  – Database Administration
  – ETL
• CA Service Partner in 2004
• CA Commercial Reseller in 2006
• CA Enterprise Solution Provider in 2007
Data Standards
• Documented agreements on representations, formats, and definitions of business data
Data Standards
• Benefits
  – Improved data quality
  – Improved data compatibility
  – Improved consistency and efficiency of data collection, use, and sharing
  – Reduced data redundancy
Data Standards
• Data Stewards
  – Role or position
  – Responsible for overseeing stewardship of the data and metadata
  – Likely to be on both the business and IT sides of the organization
  – Gatekeepers
Data Standards
• Council or Board
  – Data stewards and representatives of the various business areas
  – Responsible and/or accountable for specific data for the organization
Data Standards
• Types of Standards
  – Data definitions
  – Data rules
  – Data values
  – Data quality
  – Data standardization
  – Data security
Data Standards
• Data Definitions and Rules
  – Provide a consistent, clear understanding of what data content is expected
  – Centralize or publish across the organization
  – Enterprise data dictionary or metadata repository
Data Standards
• Data Values
  – Valid values lists
    • Static or rarely changed data
    • Codes
    • Indicators
  – Master reference data
    • Customer
    • Product
    • Etc.
  – Centralize
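A centralized valid values list can be applied as a simple membership check. The sketch below is illustrative only: the status codes and the function name `find_invalid_values` are assumptions, not taken from the presentation.

```python
# Hypothetical valid values list for a status code column (illustrative only).
VALID_STATUS_CODES = {"A", "I", "P"}


def find_invalid_values(values, valid_values):
    """Return each value missing from the valid values list, with its count."""
    invalid = {}
    for value in values:
        if value not in valid_values:
            invalid[value] = invalid.get(value, 0) + 1
    return invalid


# Example: "X" and lowercase "x" are reported as separate invalid values.
sample = ["A", "I", "X", "P", "x", "A", "X"]
print(find_invalid_values(sample, VALID_STATUS_CODES))
```

Keeping the list in one shared place means every system validates against the same codes.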
Data Standards
• Data Quality
  – Leverage data profiling
    • Column/Field
      – Value analysis
      – Pattern analysis
      – Data type analysis
    • Table/File
      – Validate key structure
      – Determine dependencies
    • Cross-table
      – Validate foreign keys
      – Valid values
    • Cross-system
Data Standards
• Data Quality Assessments
  – Standardize the process through detailed analysis procedures
  – Identify the different data quality problems using standardized notation
  – Summarize the analysis in reports to communicate to others
  – Create detailed examples to coincide with the analysis procedures
Data Standards
• Data Standardization
  – Address
    • Leverage address standardization software
  – Phone and Email
    • Leverage data quality software to standardize
  – Business data
    • Leverage valid values and master reference data to standardize data across the organization
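As a rough illustration of what phone and email standardization involves, the sketch below normalizes values to a single format. It is not a substitute for data quality software. The target phone format, the US-only assumption, and the choice to lowercase the whole email address are all assumptions made for this example.

```python
import re


def standardize_us_phone(raw):
    """Return a 10-digit US number as (NNN)NNN-NNNN, or None if it cannot be standardized."""
    digits = re.sub(r"\D", "", raw)  # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop a leading country code
    if len(digits) != 10:
        return None
    return f"({digits[0:3]}){digits[3:6]}-{digits[6:]}"


def standardize_email(raw):
    """Trim and lowercase an address; None if it lacks a basic local@domain.tld shape."""
    candidate = raw.strip().lower()  # assumption: case is not significant here
    if re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", candidate):
        return candidate
    return None
```

Values that return `None` would be routed to a steward for review rather than silently dropped.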
Data Standards
• Data Security
  – Identify sensitive data
  – Clearly define and publish procedure for requesting access
  – Identify and maintain lists of users with access rights
  – Validate regularly that the user still needs access
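The regular access validation could be driven by a list of grants with their last review dates. This is a minimal sketch: the record layout (`user`, `last_reviewed`) and the 90-day review interval are assumptions for illustration, not a stated policy.

```python
from datetime import date, timedelta


def grants_due_for_review(grants, today, max_age_days=90):
    """Return access grants whose last review is older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [g for g in grants if g["last_reviewed"] < cutoff]


# Hypothetical access list for a sensitive data set.
grants = [
    {"user": "u1", "last_reviewed": date(2011, 1, 1)},
    {"user": "u2", "last_reviewed": date(2011, 5, 1)},
]
print(grants_due_for_review(grants, today=date(2011, 6, 1)))
```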
Metadata Standards
• Documented agreements on representations, formats, and definitions of metadata
Metadata Standards
• Metadata Stewards
  – Generally IT resources fill this role or position
  – Responsible for overseeing stewardship of the metadata
  – Standards are generally integrated into the SDLC
Model Metadata
• Business metadata
  – Business requirements
  – Functional requirements
  – Data requirements
• Data profiling metadata
  – Column profiling
  – Table profiling
  – Cross-table profiling
  – Cross-system profiling
• Data quality metadata
  – Data quality statistics
• Data modeling metadata
  – Enterprise data models
  – Logical models
  – Physical models
• Mapping metadata
  – Source-to-target mapping
  – Data Flow Diagrams
• Database metadata
  – Data Definition Language
Metadata Standards
• Data Requirements
  – Align with the business requirements
  – Each business requirement is likely to have matching data requirements
  – Clearly define the data content to be captured
  – Profile existing data sources
Metadata Standards
• Data Profiling
  – Identify standards for utilization
    • Create a step-by-step process for preparing the data, profiling the data, and analyzing the results
    • Identify and document the communication method to the business and IT
Metadata Standards
• Data Profiling
  – Column Profiling
    • Identify both valid and invalid
      – Values
      – Patterns
      – Data types
      – Lengths
    • Standardize notation
      – Descriptions
      – Problems
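Column profiling of values, patterns, and lengths can be sketched in a few lines. The pattern notation used here (`9` for a digit, `A` for a letter, other characters kept as-is) is one common convention chosen for illustration. It is an assumption, not the notation the presentation standardizes on, and the function names are hypothetical.

```python
from collections import Counter


def value_pattern(value):
    """Map each character to a class: 9 for digits, A for letters, others unchanged."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)


def profile_column(values):
    """Summarize a string column: null count, value counts, pattern counts, length range."""
    non_null = [v for v in values if v is not None]
    lengths = [len(v) for v in non_null]
    return {
        "null_count": len(values) - len(non_null),
        "values": Counter(non_null),
        "patterns": Counter(value_pattern(v) for v in non_null),
        "min_length": min(lengths) if lengths else None,
        "max_length": max(lengths) if lengths else None,
    }
```

Profiling a postal code column this way would surface a value such as `6O601` (letter O instead of zero) as the unexpected pattern `9A999`.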
Metadata Standards
• Data Profiling
  – Table Profiling
    • Validate key structure
    • Identify candidate keys
    • Identify natural keys
    • Identify and document exceptions or violations
  – Cross-Table Profiling
    • Identify redundant data
    • Validate foreign keys
    • Identify orphaned rows
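The key and foreign key checks above can be sketched against rows held as dictionaries. This is a simplified illustration: the function names and the sample tables are hypothetical, and a profiling tool would normally run equivalent checks inside the database.

```python
def is_candidate_key(rows, columns):
    """True if the column combination is non-null and unique across all rows."""
    seen = set()
    for row in rows:
        key = tuple(row.get(col) for col in columns)
        if any(part is None for part in key) or key in seen:
            return False
        seen.add(key)
    return True


def find_orphaned_rows(child_rows, child_col, parent_rows, parent_col):
    """Return child rows whose non-null foreign key has no matching parent row."""
    parent_keys = {row[parent_col] for row in parent_rows}
    return [
        row for row in child_rows
        if row[child_col] is not None and row[child_col] not in parent_keys
    ]


# Hypothetical sample tables.
customers = [{"id": 1, "email": "a@x.com"}, {"id": 2, "email": "a@x.com"}]
orders = [{"order_id": 10, "customer_id": 1}, {"order_id": 11, "customer_id": 3}]
print(is_candidate_key(customers, ["id"]))
print(find_orphaned_rows(orders, "customer_id", customers, "id"))
```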
Metadata Standards
• Data Profiling
  – Cross-system Profiling
    • Identify redundant data
    • Identify inconsistent data
    • Identify common matching criteria
Metadata Standards
• Data Quality
  – Consider requiring as part of all profiling initiatives
  – Capture and store in metadata repository
  – Establish thresholds
  – Trend monitoring
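Thresholds and trend monitoring can be illustrated with a single statistic. The sketch below uses the null rate of a column. The choice of metric, the "lower is better" reading, and the function names are assumptions made for this example.

```python
def null_rate(values):
    """Fraction of values that are None (0.0 for an empty column)."""
    if not values:
        return 0.0
    return sum(1 for v in values if v is None) / len(values)


def check_threshold(metric_value, threshold):
    """Return 'pass' if the metric is at or below the threshold, else 'fail'."""
    return "pass" if metric_value <= threshold else "fail"


def trend(history):
    """Compare the latest measurement with the previous one (lower is better)."""
    if len(history) < 2 or history[-1] == history[-2]:
        return "stable"
    return "improving" if history[-1] < history[-2] else "worsening"
```

Storing each measurement in the metadata repository gives `trend` the history it needs from one profiling run to the next.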
Metadata Standards
• Data Modeling
  – Enterprise Data Model
    • Identify high level view of where the data lives across the enterprise
    • Centralize to make accessible across the organization
    • Consider identifying enterprise-level entities for important data
Metadata Standards
• Data Modeling
  – Model Standards
    • Standardized development process
    • Model naming convention
    • Name standards
    • Data type standards
    • Clearly documented review process
Metadata Standards
• Data Modeling
  – Logical/Physical Models Standards
    • Model or project narrative
    • Subject area
    • Entity
    • Relationships
    • Attribute
    • Identifier
    • Derived and BI Elements
Metadata Standards
• Data Modeling
  – Metadata Validation
    • Column level
      – Values
      – Patterns
      – Data types
      – Lengths
    • Table level
      – Key validation
    • Cross-table level
      – Foreign key relationships
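Column-level metadata validation compares actual data with what the model declares. The sketch below assumes a hypothetical column specification with `max_length`, `pattern`, and `nullable` entries. Real model metadata would come from the modeling tool or the DDL.

```python
import re


def validate_column(values, spec):
    """Return (value, problem) pairs for values that violate the column spec."""
    problems = []
    for value in values:
        if value is None:
            if not spec.get("nullable", True):
                problems.append((value, "null not allowed"))
            continue
        if "max_length" in spec and len(value) > spec["max_length"]:
            problems.append((value, "too long"))
        if "pattern" in spec and not re.fullmatch(spec["pattern"], value):
            problems.append((value, "pattern mismatch"))
    return problems


# Hypothetical spec for a five-digit postal code column.
zip_spec = {"max_length": 5, "pattern": r"\d{5}", "nullable": False}
print(validate_column(["60601", "606011", None, "6O601"], zip_spec))
```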
Metadata Standards
• Mapping
  – Standardize mapping process
  – Standardize format of mapping document
  – Require data profiling as part of the mapping process or to validate mapping
Recommendations
• Publish or centralize data and metadata standards
• Integrate data and metadata standards into the SDLC
• Include standards review during onboarding
• Identify and publish the list of stewards
• Enforce standards with offshore teams
Summary
• Data and metadata standards need to be developed and supported by both IT and the business
• Well-defined standards will enhance the development of new applications and simplify the integration of data across the organization
Thank You!
• Antonio C. Amorin
  – [email protected]
  – (847)975-0217
• Data Innovations, Inc.
  – www.dataprofilers.com
  – (888)438-3717