Introduction to Master Data Management
Introduction to MDM
William El Kaim Oct. 2016 - V 2.0
This Presentation is part of the
Enterprise Architecture Digital Codex
http://www.eacodex.com/
Copyright © William El Kaim 2016
Plan
• Introduction to Data Governance
• Introduction to Data Quality
• Introduction to MDM
• MDM Delivery Model
• MDM Architecture
• Master Data Value
• MDM project Mgt.
• Conclusion
The Data Management Context
• Large global organizations, with a multitude of business processes and
systems to process transactions, often face the challenge of not
having a “Single Source of Truth” for their Master Data.
• Systems Architecture and Data Architecture objectives appear to be divergent and
tactical rather than cohesive and strategic.
• Data is an enterprise asset used to make strategic business decisions.
• Very often, poor accuracy, completeness, accessibility, and security of data prevent effective
business decision making.
• 80% of data in Transactions is Master and Reference Data!
• Organizations naturally accumulate isolated pools of data that are not
leveraged together, so the parts never add up to a coherent whole.
The Quest for Trusted Data
• Trusted data are data that business stakeholders use to support their
processes or decisions with no reservations as to their relevance, freshness,
accuracy, integrity, and other previously agreed-upon definitions of quality.
• Delivering trusted data requires you to:
• Ensure data quality through appropriate processes and best practices.
• Break traditional functional and IT silos to share data across the enterprise.
• Introduce the right tools and platform.
• Unfortunately:
• Ignoring the need for trusted data is common until the lack of it impacts your business.
So What Exactly Is Data Governance?
• Data governance is a set of processes ensuring that important data assets
are formally managed throughout the enterprise.
• It formalizes the “fiduciary” duty for the management of data assets critical to the enterprise's success.
• Data governance ensures that data can be trusted and that people can be
made accountable for any adverse event that happens because of low data
quality.
• It is about putting people in charge of fixing and preventing issues with data so that the
enterprise can become more efficient.
• Data governance also describes an evolutionary process for a company,
altering the company’s way of thinking and setting up the processes to
handle information so that it may be utilized by the entire organization.
• It also means using technology, in whatever form is needed, to support the process.
The Responsibilities of Data Stewards
• Stewards should be considered data subject-matter experts for their
respective business functions and processes.
• Stewards are responsible for guiding the effort, not necessarily executing it
themselves.
• Stewards have other roles and responsibilities and therefore cannot effect significant
change on their own.
• Their roles as stewards should be to guide and influence others in implementing the
changes necessary to improve data quality. They should be viewed as the leaders of the
data quality improvement effort, not necessarily the “doers.”
• Stewards should define and monitor quality measures to justify the program
but also must have specific goals for data quality improvement.
• Stewards must be accountable.
• Stewardship should be based on manageable subsets of data.
Data Stewardship Metrics: Objective Dimensions
• Accuracy
• Whether the data values being held reflect the properties of the real-world object or
event that the data is intended to model.
• Consistency
• Whether the values of attributes managed or presented in multiple locations are the
same.
• Existence
• Whether a value is being held for a particular attribute.
• Integrity
• Whether all expected relationships between data in multiple data stores, tables and files
are intact.
• Validity
• Whether the values held fall within the allowable domain of values established for an
attribute.
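Most of the objective dimensions above lend themselves to simple automated checks. The following is a minimal Python sketch, assuming hypothetical customer records held in two stores (`crm` and `billing`), a made-up set of account identifiers, and an illustrative list of allowed country codes; none of these names come from the presentation.

```python
# Hypothetical sample data: the same customers held in two systems.
crm = {
    "C1": {"name": "Acme", "country": "FR", "account_id": "A1"},
    "C2": {"name": "Globex", "country": "XX", "account_id": "A9"},
    "C3": {"name": None, "country": "US", "account_id": "A2"},
}
billing = {
    "C1": {"name": "Acme", "country": "FR"},
    "C2": {"name": "Globex Corp", "country": "DE"},
}
accounts = {"A1", "A2"}                  # known account identifiers (assumed)
ALLOWED_COUNTRIES = {"FR", "US", "DE"}   # allowable domain of values (assumed)


def existence(records, attribute):
    """Share of records holding a value for the attribute."""
    held = sum(1 for r in records.values() if r.get(attribute) is not None)
    return held / len(records)


def validity(records, attribute, domain):
    """Share of records whose value falls within the allowable domain."""
    valid = sum(1 for r in records.values() if r.get(attribute) in domain)
    return valid / len(records)


def consistency(left, right, attribute):
    """Share of records present in both stores that hold the same value."""
    shared = left.keys() & right.keys()
    same = sum(1 for k in shared
               if left[k].get(attribute) == right[k].get(attribute))
    return same / len(shared)


def integrity(records, foreign_key, targets):
    """Share of records whose reference points to an existing target."""
    intact = sum(1 for r in records.values() if r.get(foreign_key) in targets)
    return intact / len(records)
```

Accuracy is deliberately left out of the sketch: checking it requires comparing the stored value with the real-world object, which usually needs an external reference or a manual review.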
Data Stewardship Metrics: Subjective Dimensions
• Believability
• The degree to which users of the data believe and trust it.
• Interpretability
• The degree of ease with which data is consumed and understood.
• Relevance
• The degree to which the data supports and furthers the goals and objectives of users,
processes and the organization.
• Timeliness
• The degree to which the latency of data delivery matches the needs of the consuming
individuals or processes.
Data Governance Framework Example
Source: SAS
The Role Of Technology In Data Governance
• Data profiling and data quality software supports data stewards in:
• Profiling and analyzing source data.
• Defining and capturing standard definitions.
• Standardizing lists of values.
• Defining and implementing cleansing, standardization, validation, enrichment,
matching, and merging business rules for automatic data quality validation and
remediation.
• Defining and implementing exception rule parameters where manual intervention is
required.
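The rule types listed above can be sketched as a small pipeline. This is an illustrative Python sketch only, not the behavior of any particular data quality product: the field names, the standard list of country values, and the email pattern are assumptions made for the example. Records that fail validation are routed to an exception queue for manual intervention.

```python
import re

# Standard list of values (assumed): free-text variants mapped to one code.
COUNTRY_STANDARD = {"france": "FR", "fr": "FR", "germany": "DE", "de": "DE"}
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def cleanse(record):
    """Trim whitespace and collapse inner runs of spaces in text fields."""
    return {k: " ".join(v.split()) if isinstance(v, str) else v
            for k, v in record.items()}


def standardize(record):
    """Replace country variants with the standard code when one is known."""
    out = dict(record)
    key = (out.get("country") or "").lower()
    out["country"] = COUNTRY_STANDARD.get(key, out.get("country"))
    return out


def validate(record):
    """Return the list of rule violations for a record."""
    problems = []
    if record.get("country") not in set(COUNTRY_STANDARD.values()):
        problems.append("unknown country")
    if not EMAIL_PATTERN.match(record.get("email") or ""):
        problems.append("malformed email")
    return problems


def process(records):
    """Apply the rules; route failing records to a manual exception queue."""
    accepted, exceptions = [], []
    for raw in records:
        record = standardize(cleanse(raw))
        problems = validate(record)
        if problems:
            exceptions.append((record, problems))
        else:
            accepted.append(record)
    return accepted, exceptions


accepted, exceptions = process([
    {"name": "  Acme   SA ", "country": "France", "email": "info@acme.example"},
    {"name": "Globex", "country": "Atlantis", "email": "not-an-email"},
])
```

Keeping the exception queue separate from the accepted records mirrors the split between automatic remediation and cases where a data steward has to decide.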
Data governance is not an IT project: It is a business strategy that can be optimized with the appropriate use of enabling technologies.
Synthesis
• Data is the “raw material” used everywhere
• Data Stewardship is the recognition that data is a resource that needs to be
managed.
• It involves recognizing the criticality of the data quality and making stewardship of it a
jointly shared responsibility of the business and IT.
• Data Stewardship is one of the key enablers of turning data into information that can be
used for strategic advantage.
• There has been a quality revolution that has redefined quality from being an
optional characteristic to a basic requirement for both goods and services.
• When the level of data quality is equal among the competition, the competitive battle
lines are drawn in other areas.
• However, organizations have been redefining the role of data and data quality, placing
data at the heart of the competition.
Introduction to Data Quality
Data Quality: What Is Being Measured?
• Timeliness. While more of an operational quality metric, timeliness addresses whether the delivery of data from one environment to another meets user expectations.
• Accuracy. Data must be consistent with the intended goal.
• Completeness. Having missing or invalid data leads to problems.
• Integrity. Not having the expected relationships between multiple data sets intact presents data integrity issues.
• Hierarchical Relationship Accuracy. Parent-child relationships can be overlooked, leading to data quality issues.
• Consistency and standardization. Delivering data that doesn’t conform to defined formats and standards can lead to chaos.
• Third-party enrichment. Not all data exists inside the enterprise, and it often must be appended with third-party information.
• Freshness. A different metric than timeliness, freshness focuses on the age of the data, which may have varying levels of usefulness depending on its type.
• Uniqueness. While data will be scattered throughout the enterprise, not all of it should be considered unique.
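Two of these measures, freshness and uniqueness, are easy to illustrate. The sketch below is a hedged Python example over made-up customer rows; the `updated` field, the one-year age threshold, and matching on a normalized email are all assumptions chosen for illustration, and real matching rules are usually richer.

```python
from datetime import date

# Hypothetical customer rows; rows 2 and 3 describe the same customer.
rows = [
    {"id": 1, "name": "Acme SA", "email": "info@acme.example",
     "updated": date(2016, 9, 1)},
    {"id": 2, "name": "Globex", "email": "sales@globex.example",
     "updated": date(2014, 1, 15)},
    {"id": 3, "name": "GLOBEX ", "email": "Sales@Globex.example",
     "updated": date(2016, 6, 30)},
]


def stale_ids(rows, today, max_age_days):
    """Freshness: ids of rows not updated within max_age_days."""
    return [r["id"] for r in rows
            if (today - r["updated"]).days > max_age_days]


def duplicate_groups(rows):
    """Uniqueness: group the ids of rows sharing a normalized email."""
    groups = {}
    for r in rows:
        groups.setdefault(r["email"].strip().lower(), []).append(r["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


stale = stale_ids(rows, today=date(2016, 10, 1), max_age_days=365)
duplicates = duplicate_groups(rows)
```

Note that freshness here looks only at the age of the stored value, whereas timeliness would measure how long delivery between environments takes.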
Data Quality Software Supports Trusted Data
• Data quality software (DQS) provides the technology enabler for
implementing many of the data quality rules and processes defined through
your data governance efforts.
Data Quality Solutions Market
Introduction to MDM
What is MDM?
• Definition
• A business capability enabling an organization to first identify trusted master data and
then leverage master data to improve business processes and decisions.
• Identify trusted master data
• MDM defines and/or derives the most trusted and unique “version” of important enterprise data (e.g., vendor, customer, product, employee, asset, material, location, etc.).
• Leverage master data to improve business processes and decisions
• MDM incorporates this master version of the data within functional business processes (sales, marketing, finance, support, etc.) that will provide direct benefit to employees, customers, partners, or other relevant stakeholders within an organization.
• Master data alone provides little value
• Hence, anticipation of how the data will be consumed by other applications or systems within the context of a business process provides the most value.
• Master data management begins where data quality software leaves off!
• MDM is a business capability enabled through the integration of multiple technologies
and business processes.
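Deriving the most trusted and unique “version” of a record can be sketched with a simple survivorship rule: for each attribute, keep the non-empty value from the most trusted source, and prefer the most recent record when trust is equal. The Python below is an illustrative sketch; the source names, the trust ranking, and the sample values are assumptions, not part of the presentation.

```python
from datetime import date

# Assumed trust ranking: a lower number means a more trusted source.
SOURCE_TRUST = {"crm": 0, "erp": 1, "web": 2}

# Hypothetical records describing the same customer in three systems.
source_records = [
    {"source": "web", "updated": date(2016, 9, 20),
     "name": "ACME", "phone": "+33 1 23 45 67 89", "vat": None},
    {"source": "crm", "updated": date(2016, 5, 2),
     "name": "Acme SA", "phone": None, "vat": None},
    {"source": "erp", "updated": date(2016, 8, 11),
     "name": "Acme S.A.", "phone": None, "vat": "FR00123456789"},
]


def golden_record(records, attributes):
    """Per attribute, keep the value from the most trusted, then newest, source."""
    # Sort so that the preferred record comes first.
    ranked = sorted(
        records,
        key=lambda r: (SOURCE_TRUST[r["source"]], -r["updated"].toordinal()),
    )
    master = {}
    for attribute in attributes:
        for record in ranked:
            if record.get(attribute) is not None:
                # Keep the originating source for traceability.
                master[attribute] = {"value": record[attribute],
                                     "source": record["source"]}
                break
    return master


master = golden_record(source_records, ["name", "phone", "vat"])
```

The resulting record only becomes valuable once it is consumed within business processes, which is the point the slide makes about master data alone providing little value.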
Shared Data by Businesses & Systems
• Spreadsheet effect: R&D, Operations, Sales, Marketing, Procurement, Finance & HR
• Silo effect: Data Warehouse, Finance, Human Resources, Sales, Operations
• Shared data: Partners, Products, Items, Services, Customers, Channels, Pricing, Locations, Stores, Organization, Employees, Suppliers, Assets, Finance, Accounts, Codes, Hierarchies
Why Is MDM Complex?
Figure: master data (MD) is held in separate applications in the USA and in Europe, in several locales (en_US, fr_FR, de_DE), and in silos: Data Warehouse, Finance, Human Resources, Sales & Marketing, Operations (BW, SAP App, CRM App).
• Master data are dispersed and redundant. Where is the truth?
• Products, Accounts, Employees, Org, Customers, Items, and Locations are repeated across silos
• No integrity between data silos
• ID, Account num, Name, Invoicing, .../...
• ID, Label, Description, Price, Promotions, .../...
• ID, Hierarchies, Markets, .../...
• Lots of life cycles, not in sync!
• Version 1, Version 1.1, Version 2, Version 1, Version 3
MDM Delivery Model
MDM Delivery Model: Issues
• MDM as a business capability has been difficult to achieve due to:
• The complexity of integration and architecture alternatives,
• A lack of data governance and business ownership,
• Existing processes that impede the capture of high-quality data,
• Prohibitive implementation costs paired with poor scoping and prioritization.
• As an added bit of irony, this solution that helps to enable a single version of
the truth does not itself boast a single version of the truth regarding its own
market definition.
MDM Delivery Model: Analytical MDM
• Analytical MDM focuses on providing a one-directional business view of
information through version-controlled hierarchy management and
dimensional modeling capabilities.
• For example, product families, sales channels, and sales regions are common views
managed in these environments.
• Many customers begin their MDM journey with analytical MDM.
• Analytical MDM is easier to tackle and is a recommended first step because it is
primarily about the data.
• Its one-directional nature introduces much less risk and complexity than
attempting to bi-directionally synchronize master data with critical production applications.
• Analytical MDM usually corresponds with the third level of Forrester’s MDM
Maturity Model.
MDM Delivery Model: Operational MDM
• Operational MDM
• Goes beyond analytical MDM, which focuses on consolidating data from disparate upstream
data sources into a reconciled analytical environment (usually a data warehouse or
operational data store) for reporting and analysis.
• Bi-directionally synchronizes trusted master data in real time, across heterogeneous
information environments.
• Requires synchronizing business processes as well as data, which is much more
challenging.
• Operational MDM typically corresponds with levels four and five of the MDM
maturity model.
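The difference between the two delivery models can be sketched in a few lines of Python. This is a toy illustration under strong assumptions: the class names are hypothetical, and real-time transport, conflict resolution, and business process synchronization are not modeled at all.

```python
class AnalyticalHub:
    """One-directional: data flows from sources into the hub only."""

    def __init__(self):
        self.master = {}

    def consolidate(self, source_data):
        # Sources are read, never written back to.
        self.master.update(source_data)


class OperationalHub:
    """Bi-directional: a change from any system reaches all the others."""

    def __init__(self):
        self.master = {}
        self.systems = []

    def register(self, system):
        self.systems.append(system)

    def publish(self, key, value, origin):
        # Record the change, then push it to every system except its origin.
        self.master[key] = value
        for system in self.systems:
            if system is not origin:
                system.data[key] = value


class System:
    """A toy application holding its own copy of master data."""

    def __init__(self, hub=None):
        self.data = {}
        self.hub = hub
        if hub is not None:
            hub.register(self)

    def update(self, key, value):
        self.data[key] = value
        if self.hub is not None:
            self.hub.publish(key, value, origin=self)


# Analytical: the hub learns the change, the ERP does not.
crm_a, erp_a, analytical = System(), System(), AnalyticalHub()
crm_a.update("customer:42/city", "Lyon")
analytical.consolidate(crm_a.data)

# Operational: the same change reaches the ERP through the hub.
operational = OperationalHub()
crm_o, erp_o = System(operational), System(operational)
crm_o.update("customer:42/city", "Lyon")
```

Even in this toy form, the operational variant writes into production systems, which is where the extra risk mentioned above comes from.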
Source: May 16, 2008, “Trends 2008: Master Data Management”
MDM Maturity Model: Forrester
Figure labels: Analytical MDM, Operational MDM, Operational MDM
MDM Maturity Model: Gartner
MDM Architecture
MDM Ecosystem
• Do not Confuse Delivery Methods with MDM Technology Options
Introducing the MDM Ecosystem
• The MDM ecosystem consists of upstream, downstream, and core
components.
• The MDM ecosystem includes:
• Sources. Source systems capture the raw materials (data) used to build the master
record.
• Centralized data management factories. Technologies and processes to collect,
standardize, consolidate, aggregate, and apply business rules leading to the finished
product (master data).
• Business processes, systems, and access tools. Package and deliver master data to
support contextual business consumption.
• Transportation systems. Information integration technologies ensure data seamlessly
navigates through these components.
Introducing the MDM Ecosystem
Without effective governance, upstream business processes pollute downstream data requirements
Source: October 2, 2008, “It’s Time To Invest In Upstream Data Quality” Forrester report
MDM ecosystem is complex
Source: Forrester, April 28, 2008, “Making MDM And SOA Better Together”
Breadth of Data Impacts Architecture
Architectural Approach to MDM
Architectural Approach to MDM: Gartner Vision
Data Integration Problem Space for MDM
Integration Services for MDM
Master Data Value
Criteria For Identifying Master Data
• A data item is reference/master data (1) if:
• It is duplicated inside several systems
• Common examples: Customer address, Organization, Product description, etc.
• It is keyed before being used by transactional systems
• Common examples: table of labels for products by regions, technical and functional
parameterization, etc.
• Systems generate and handle many reference/master data because of
• Several data duplications inside many functional and technical silos
• Several configuration and parameterization tools such as
• Excel spreadsheets, direct SQL coding, specific in-house tools, parameterization tools brought by
software packages, etc.
(1) Similar terms in the context of this presentation. In some articles and surveys, “Reference Data” is used for code-label data and “Master Data” is used for business and more complicated data (structures, life cycles).
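The first criterion, duplication inside several systems, can be checked mechanically once an inventory of what each system stores exists. The Python sketch below assumes such an inventory is available as a simple mapping; the system and attribute names are hypothetical.

```python
from collections import defaultdict

# Hypothetical inventory of the attributes each system stores.
system_attributes = {
    "crm": {"customer_address", "customer_name", "opportunity_stage"},
    "erp": {"customer_address", "product_description", "invoice_total"},
    "web_shop": {"customer_address", "product_description", "cart_id"},
}


def master_data_candidates(inventory, min_systems=2):
    """Attributes held by at least min_systems systems, with their holders."""
    holders = defaultdict(set)
    for system, attributes in inventory.items():
        for attribute in attributes:
            holders[attribute].add(system)
    return {attribute: sorted(systems)
            for attribute, systems in holders.items()
            if len(systems) >= min_systems}


candidates = master_data_candidates(system_attributes)
```

In practice the hard part is building the inventory, since the same attribute rarely carries the same name in every system.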
Usual Information System Architecture
• Complexity of data mapping
• Difficulty for managing referential integrity rules connecting data
across several systems
• Duplication of business rules for validating data
• Lack of auditability and traceability regarding the use of data
Updating data in a point-to-point mode between systems without a pivot format (a.k.a. Common Information Model)
MDM tackles those drawbacks
Figure: systems using four different formats (Format #1 to Format #4) propagate data updates to one another through mappings in the middleware (ETL, EAI, ESB) in point-to-point mode.
Improvement #1
• Reduction and simplification of data mapping processing
• Management of referential integrity rules that connect data across
systems
• Unification of business rules for validating data
• Auditability and traceability regarding the use of data
A Common Information Model (CIM) is required and modeled with help from a suitable method
The CIM is a shared model
Figure: systems using four different formats (Format #1 to Format #4) converge on the MDM through mappings to the Common Information Model. Data is stored with the Common Information Model (allowing for better traceability), reference data is administered in the MDM (data governance), and data updates are propagated across systems through the middleware (ETL, EAI, ESB).
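The reduction in mapping effort can be made concrete. If each of n systems has its own format and every system exchanges data with every other one, up to n(n-1) directed mappings are needed, whereas with a pivot format each system only maps to and from the CIM, that is 2n mappings. The Python sketch below shows the arithmetic and a translation routed through the pivot; the field names are hypothetical and not taken from any real package.

```python
def point_to_point_mappings(n):
    """Directed mappings when every system maps to every other system."""
    return n * (n - 1)


def pivot_mappings(n):
    """Directed mappings when each system maps only to and from the CIM."""
    return 2 * n


# Hypothetical field names: each system declares how its format maps to the CIM.
TO_CIM = {
    "crm": {"cust_nm": "customer_name", "zip": "postal_code"},
    "erp": {"NAME": "customer_name", "POSTCODE": "postal_code"},
}


def translate(record, source, target):
    """Convert a record between two system formats through the CIM."""
    pivot = {TO_CIM[source][field]: value for field, value in record.items()}
    from_cim = {cim: field for field, cim in TO_CIM[target].items()}
    return {from_cim[cim]: value for cim, value in pivot.items()}


erp_record = translate({"cust_nm": "Acme SA", "zip": "75001"}, "crm", "erp")
```

For the four systems in the figure this gives at most 12 point-to-point mappings against 8 with the CIM; the gap widens quickly, with 90 against 20 for ten systems.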
Improvement #2
• Reference/Master data administration (a.k.a. governance) is unified
• The MDM’s user interface is used for data feeding, version management, comparison and merge of versions, deployment of versions, querying of data, traceability, reporting, etc.
Figure: active governance of reference data. Governance feeds data into the MDM, structured by the Common Information Model, depending on execution contexts: versions and variants (organization, channel, regions, etc.). Values are propagated to systems through mappings in the middleware (ETL, EAI, ESB). One system is overhauled and takes advantage of direct access to the MDM (direct reading of the MDM).
Who Is Responsible For Updating? It Depends On The IS Strategy And IT Ability
Figure (steps 1 to 5): an Address is exchanged between the CRM, the MDM, and other systems. Labels: Checks, cleans…; Ok, Ko, Result…; Updating depending on the result; Updating + COMMIT; Push the data.
The Address should be associated with a state so as to indicate its validity
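Associating the Address with a validity state can be sketched as follows. This Python example is an assumption-laden illustration: the state names, the check performed by the stand-in MDM function, and the exact ordering of the update and push steps are invented for the example, since the original figure's step order is not fully recoverable from the transcript.

```python
from enum import Enum


class AddressState(Enum):
    PENDING = "pending"    # captured, not yet checked by the MDM
    VALID = "valid"        # the check returned Ok
    REJECTED = "rejected"  # the check returned Ko


def mdm_check(address):
    """Stand-in for the MDM check-and-clean step; returns (ok, cleaned)."""
    cleaned = {k: v.strip() for k, v in address.items()}
    ok = bool(cleaned.get("street")) and bool(cleaned.get("postal_code"))
    return ok, cleaned


def submit_address(crm_store, mdm_store, other_systems, key, address):
    """Run one address update through the check, update, and push steps."""
    crm_store[key] = {"address": address, "state": AddressState.PENDING}
    ok, cleaned = mdm_check(address)
    if not ok:
        crm_store[key]["state"] = AddressState.REJECTED
        return AddressState.REJECTED
    # Update depending on the result, then commit in the MDM and push.
    crm_store[key] = {"address": cleaned, "state": AddressState.VALID}
    mdm_store[key] = cleaned
    for system in other_systems:
        system[key] = cleaned
    return AddressState.VALID


crm, mdm, billing = {}, {}, {}
result = submit_address(
    crm, mdm, [billing], "cust-42",
    {"street": " 1 rue de la Paix ", "postal_code": "75002"},
)
```

Carrying the state next to the address lets consuming systems decide whether to use a value that has been captured but not yet validated.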
Who Is Responsible For Updating? It Depends On The IS Strategy And IT Ability
Figure (steps 1 to 5): a New address is exchanged between the CRM, the MDM, and other systems. Labels: Updating + COMMIT (step 2); Push the data; Updating + COMMIT (step 4); Push the data; synchronization.
In a theoretical world relying strongly on SOA, the Address shouldn’t be recorded in the CRM (stage 4) since a service interaction with the MDM allows for getting the Address
Who Is Responsible For Updating? It Depends On The IS Strategy And IT Ability
Figure: a Product is exchanged between the ERP, the MDM, and other systems. Labels: Updating Master Data (step 1); Push the data (step 2); Updating with additional transactional data (step 2).
Why now?
• Existing systems are reaching end of life, having aged badly under the
successive software layers added in recent years
• So many functional and technical silos
• Retirement of some key business users and IT specialists
• Loss of business and IT knowledge regarding existing assets
• Lack of documentation
• Loss of modelling knowledge
• Needless maintenance complexity due to the lack of IT alignment with the business
• MDM comes into play not only for increasing data quality!
• Misunderstanding this is a risk that will reduce the benefits of MDM
• Let’s take a hospitality industry example (next slides)
Example – Carlson Hospitality
The CDI profile hub consists of a database and services that are provided in real-time or batch
MDM project Mgt.
MDM Project Mgt.: Data Governance is Key
MDM Project Mgt.: Strong Program Management Is Critical
• Key program management skills include:
• Defining and executing change management strategies.
• Clearly defining roles and responsibilities.
• Data stewardship training.
• Rapid issue resolution by executive steering committees.
• Strategic communications planning.
• Data Stewardship has, as its main objective, the management of the
enterprise’s data assets to facilitate a common understanding and acceptance of the data.
• The purpose of doing this is to maximize the business return on the investment made in
the data resources.
• The expected results are improved reusability, accessibility and quality of the data.
MDM project Mgt.
Data Stewardship Responsibilities
• Document and implement business-naming standards.
• Create and maintain business metadata definitions for business users.
• Develop consistent data definitions and data aliases.
• Document standard calculations and derivations.
• Document the business rules related to the data, for example, edit and validation rules.
• Monitor development efforts for adherence to standards.
• Ensure ownership and responsibility for the maintenance of data quality standards.
• Look for common data problems and find ways to solve them.
• Perform duplicate suspect processing of guest profile data (merges, unmerges).
• Send defects back to the data owners or the source that created the bad data.
• Use metrics to check the quality of the data and of the data process.
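Duplicate suspect processing needs merges that can be undone when a steward later finds that two profiles were in fact different guests. The following Python sketch shows one way to keep that option open by snapshotting the surviving profile before each merge; the class, the fill-in-missing-values merge rule, and the sample profiles are hypothetical.

```python
class ProfileStore:
    """Toy store of guest profiles supporting merge and unmerge."""

    def __init__(self, profiles):
        # Copy the profiles so the caller's dictionaries are never mutated.
        self.profiles = {pid: dict(p) for pid, p in profiles.items()}
        self.history = []  # one entry per merge, most recent last

    def merge(self, survivor_id, duplicate_id):
        """Fold the duplicate into the survivor, filling only missing values."""
        survivor = self.profiles[survivor_id]
        duplicate = self.profiles.pop(duplicate_id)
        # Snapshot both sides so the merge can be reversed later.
        self.history.append((survivor_id, dict(survivor), duplicate_id, duplicate))
        for attribute, value in duplicate.items():
            if survivor.get(attribute) is None:
                survivor[attribute] = value

    def unmerge(self):
        """Reverse the most recent merge."""
        survivor_id, snapshot, duplicate_id, duplicate = self.history.pop()
        self.profiles[survivor_id] = snapshot
        self.profiles[duplicate_id] = duplicate


store = ProfileStore({
    "g1": {"name": "Ann Lee", "email": None},
    "g2": {"name": "Ann Lee", "email": "ann@example.com"},
})
store.merge("g1", "g2")
merged_view = dict(store.profiles["g1"])
store.unmerge()
```

A production implementation would also have to re-point the transactions attached to the merged profiles, which this sketch ignores.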
The Questions That Must Be Asked!
• Existing tools for managing reference/master data
• Direct SQL, Excel, Specific tools, ERP configuration, etc.
• Execution environments and related life-cycles
• Often many parameters with different values depending on those environments: test, UAT
(User Acceptance Test), training, run-time, etc.
• Existing processes for data integration
• EAI, ESB, ETL processing
• The level of maturity in Common Information Model
• Does it exist? Via an Operational Data Store?
• The lack of data quality
• Duplication, wrong values, errors when using data due to lack of
business documentation
• The lack of IT alignment with business
• Taking into account external constraints stemming from business
regulation (SOX, Basel II, Solvency II...)
First Project Possible Scope
• Objectives
• Acquiring MDM modeling procedures in an operational way
• Using governance features brought by the MDM tool
• Version management, variant management, permission management, approval processes, etc.
• By avoiding
• Staying in the scope of the IT department only
• Governance features must be used by business users, not only by IT specialists
• Building up data models that will not be reusable
• Be cautious with the quick-win approach. We prefer to adopt an approach that fosters sustainable results
• Being too conceptual
• Fostering an iterative approach by validating models with help from the MDM’s User Interface. It requires a Model-driven approach
First Project Metrics
• Duration of 4 to 6 weeks
• From modelling to roll-out in
production and use by users
• Fewer than 100 data items located within 3 Business Objects (BO)
• Fewer than 5 referential integrity constraints between these BO
• 1 Business Object = a set of entities tightly coupled in terms of semantics (coarse-grained
object)
• 2 synchronizations between MDM and systems
• It is more secure if an infrastructure such as EAI/ESB/ETL is already available
• Use of a suitable MDM that encourages rapid implementation by
parameterization rather than a rigid software development lifecycle
• This is the case for the Orchestra Networks tool
CIM Modelling Lifecycle
• Top-down approach (Data Enterprise Architecture)
• Workshops to build up the semantic data architecture
relying on Domains of business objects
• Bottom-up re-engineering
• Tools for automatic analysis of existing databases to help the modeling of the semantic data => HELP WHEN NEEDED BUT NO MORE!
• Iterative cycle (N, N+1), relying on the Data Enterprise Architecture
• Progressive modeling by business objects
• Automatic loading into the MDM (MDM Test, MDM Prod.)
• UI of the MDM is used to support data validation
• Help for validating data models
CIM Modelling Lifecycle
• Iterative (bottom-up)
• Incremental modeling with frequent
loading in the MDM for validating via the MDM’s User Interface
• An MDM tool that allows for automatic loading from data models is required => Model-driven MDM
• Avoiding the tunnel effect
• Allowing data validation by users with help from the MDM’s UI
• Risks arise from modifications of data models during cycles, which involve data migration between the successive versions of data models
• Enterprise Architecture (top-down)
• A global effort to build up a global data
architecture (a data map)
• Better stability and upgradeability of data models
• Mastering data modelling relying on EA and business architecture is required
• Management at the level of the enterprise is needed (global act)
• The launching phase IS COMPLEX
• It requires strong competency in EA and data
modelling
• After the launching phase, data modelling relies on the global data architecture.
• Then the iterative life-cycle can be started in a secure way
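The data migration between successive model versions mentioned for the iterative approach can be sketched as a declarative change set applied to each record. The Python below is illustrative only: the renamed and added attributes are invented, and a real model-driven MDM tool would typically generate or assist this step rather than rely on hand-written code.

```python
# Hypothetical change between model version N and version N+1:
# "zip" is renamed to "postal_code" and a "country" attribute is added.
MIGRATION_N_TO_N1 = {
    "rename": {"zip": "postal_code"},
    "add": {"country": None},  # new attribute, left empty for stewards to fill
}


def migrate(record, migration):
    """Return a copy of the record shaped for the next model version."""
    migrated = {migration["rename"].get(name, name): value
                for name, value in record.items()}
    for attribute, default in migration["add"].items():
        migrated.setdefault(attribute, default)
    return migrated


v_n = {"name": "Acme SA", "zip": "75001"}
v_n1 = migrate(v_n, MIGRATION_N_TO_N1)
```

Keeping each change set as data makes it possible to replay the migrations from version N to any later version, which limits the risk introduced by frequent model changes.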
Conclusion
Evolution of Data Awareness
Source: SAS
Master Data Synthesis
Common Barriers Hindering MDM Success
• Considering MDM as purely a technology initiative
• Assuming that dirty data is just an IT problem
• Managing the vast complexity of multiple data domains without proper
techniques, including common data models, integration APIs, and Web-
service-enabled features
• Lacking focus on data governance, prioritization, people, and process
• Underestimating the level of executive sponsorship required for success
• Ineffectively prioritizing funding and managing costs
Recommendations
• Consider data quality strategies that support enterprise demands:
• Prioritize your data quality objectives by focusing on data elements supporting your most
business-critical processes.
• Get started with project-based data quality.
• Ride the coattails of cross-enterprise data management initiatives.
• Adopt data governance to allow you to evolve from project-based DQ to enterprise-class
MDM.
• Master Data will become the focal point in the SOA architecture ‘battle’
• Application Independent MDM solutions will provide a richer context for an SOA than
Application Specific approaches (e.g., SAP, Oracle)
The 7 Building Blocks of MDM
http://www.twitter.com/welkaim
SlideShare
http://www.slideshare.net/welkaim
EA Digital Codex
http://www.eacodex.com/
http://fr.linkedin.com/in/williamelkaim
Claudine O'Sullivan