Ds04 data quality
Previously known as
Think Big. Move Fast.
SolidQ
• Born in 2002 in USA and Spain
• Established in 2007 in Italy
• More than 1000 customers and more than 200 consultants worldwide
• Dedicated to Data Management on the Microsoft Platform
• Book Authors, Conference Speakers, SQL Server MVPs and Regional Directors
• www.solidq.com
Davide Mauri
• 18 Years of experience on the SQL Server Platform
• Specialized in Data Solution Architecture, Database Design, Performance Tuning, Business Intelligence
• Microsoft SQL Server MVP
• President of UGISS (Italian SQL Server UG)
• Mentor @ SolidQ
• Video, Book & Article Author
• Regular Speaker @ SQL Server events
• Projects, Consulting, Mentoring & Training
Data Quality
The BIG problem
• What’s the key asset of a company?
  • Data that leads to Information and then to Knowledge
• With the mass adoption of Business Intelligence / Analytics, problems with Data Quality arise and become evident
  • Wrong, incomplete or incoherent data leads to wrong decisions
• Managers cannot “trust” native data
• Data needs to be reworked a lot in order to be usable
  • In my experience, almost 50% of the time spent developing a BI solution is used just to solve Data Quality problems.
The BIG problem
• Gartner research states that «Organizations estimated they are losing an average of $8.2 million annually as a result of data quality issues»
  • 22% report estimated losses of $20 million
  • 4% report estimated losses of $100 million
Data Quality Concepts
| Data Quality Issue | Sample Data Problem |
| --- | --- |
| Standard: Are data elements consistently defined and understood? | Gender code = M, F, U in one system and Gender code = 0, 1, 2 in another system |
| Complete: Is all necessary data present? | 20% of customers’ last names are blank, 50% of zip codes are 99999 |
| Accurate: Does the data accurately represent reality or a verifiable source? | A Supplier is listed as ‘Active’ but went out of business six years ago |
| Valid: Do data values fall within acceptable ranges? | Salary values should be between 60,000 and 120,000 |
| Unique: Does data appear several times? | Both John Ryan and Jack Ryan appear in the system. Are they the same person? |
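Some of the checks in the table above can be sketched in a few lines of Python. This is only an illustration and not part of the original deck: the record layout, the field names and the `profile` function are assumptions made for the example. It covers the Standard, Complete and Valid rows; Accurate and Unique need an external reference or matching logic and are left out.

```python
from collections import Counter

# Hypothetical customer records; the field names are assumptions for this sketch.
customers = [
    {"last_name": "Ryan", "zip": "20121", "gender": "M", "salary": 75000},
    {"last_name": "", "zip": "99999", "gender": "1", "salary": 130000},
    {"last_name": "Rossi", "zip": "00184", "gender": "F", "salary": 61000},
    {"last_name": "Ryan", "zip": "20121", "gender": "U", "salary": 90000},
]

STANDARD_GENDER = {"M", "F", "U"}   # Standard: one agreed code set
PLACEHOLDER_ZIPS = {"99999"}        # Complete: dummy values count as missing
SALARY_RANGE = (60_000, 120_000)    # Valid: acceptable range from the table


def profile(rows):
    """Count the rows that violate each data quality check."""
    issues = Counter()
    for row in rows:
        if not row["last_name"].strip():
            issues["missing_last_name"] += 1
        if row["zip"] in PLACEHOLDER_ZIPS:
            issues["placeholder_zip"] += 1
        if row["gender"] not in STANDARD_GENDER:
            issues["non_standard_gender"] += 1
        if not SALARY_RANGE[0] <= row["salary"] <= SALARY_RANGE[1]:
            issues["salary_out_of_range"] += 1
    return dict(issues)


report = profile(customers)
```

Here only the second record is flagged, once for each of the four checks. In a real project the rules would come from the business rather than from constants in code.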
Master Data Management
• A solution to the problem is offered by Master Data Management (MDM)
  • It is a Discipline and a Process supported by Technology
• MDM aims to discover and define non-transactional lists of data, with the goal of compiling maintainable master lists that will become the reference data.
Master Data Management
• What are Master Data?
  • Master Data: “Slowly changing reference data shared across systems”
  • Master Data != Transactional Data
  • Master Data != Metadata
• Reference Data:
  • Products, Customers, Suppliers, Geography, etc.
  • The Dimensions of a Data Warehouse
Master Data Services
• Introduced with SQL Server 2008 R2
• With SQL Server 2012, a lot of improvements
  • Web Interface improved *a lot*
  • Silverlight based
• Integrated with Excel 2007 and later
  • Killer Application!
• Installed with SQL Server 2012 but must be configured prior to use
  • Needs IIS, WCF and so on…
• No changes in 2014
Master Data Services
• Allows the management and the definition of Master Data
  • “Model” Definition
  • Entities, Attributes, Hierarchies, etc.
  • Business Rules
• Data Stewardship
  • Through the Excel Add-in or the Web Portal
• Integration
  • Batch and/or WCF Service
What is Master Data Management?
[Diagram: a Master Data Hub surrounded by ERP, CRM, Warehouse MGMT, Invoicing System, BI and External System; other labels: Integration, Web Service, Data Hub, Data Steward]
Master Data Services
Data Quality Services
• A new service introduced with SQL Server 2012
  • Enables the verification of «new» data against an established Knowledge Base
  • Has its own client
• Must be installed after the SQL Server 2012 installation
  • «Data Quality Server Installer»
  • Three dedicated SQL Server databases (DQS prefix)
Data Quality Services
• Helps to
  • Define a Knowledge Base
  • Through «Knowledge Discovery» and «Domain Management»
  • Perform Data Cleansing & Data Matching (De-Duplication)
• Integrated with
  • Integration Services
  • Master Data Services (via Excel Add-in)
Data Cleansing
• Master (Reference) Data is needed for Data Cleansing
  • Can be provided directly from our data (Customer Names, for example)
  • Can be supplied by third-party companies: Azure Marketplace
  • *Very* nice feature.
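The idea of cleansing incoming values against reference data can be sketched as follows. This is a simplified illustration, not how DQS is implemented: the `KNOWLEDGE_BASE` structure, the `cleanse` function and the status names are assumptions made for the example.

```python
# Hypothetical knowledge base for a "Country" domain:
# the accepted values plus known synonyms and misspellings.
KNOWLEDGE_BASE = {
    "valid": {"Italy", "Spain", "United States"},
    "synonyms": {"USA": "United States", "Italia": "Italy"},
}


def cleanse(value, kb=KNOWLEDGE_BASE):
    """Return (output_value, status) for one incoming value."""
    candidate = value.strip()
    if candidate in kb["valid"]:
        return candidate, "correct"
    if candidate in kb["synonyms"]:
        return kb["synonyms"][candidate], "corrected"
    # Unknown values are flagged for a data steward instead of being guessed.
    return candidate, "new"


results = [cleanse(v) for v in ["Italy", " USA", "Italia", "Itly"]]
```

The first value passes through unchanged, the next two are mapped to their reference value, and `Itly` is reported as new so that a person can decide whether it is a misspelling worth adding to the knowledge base.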
Data Quality Services
Identity Mapping & De-Duplication
• DQS is not the only solution for Identity Mapping & De-Duplication
  • MDS has some built-in functions
• SSIS Fuzzy Lookup is great for this
  • Great Performance & Results!
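To give a feel for fuzzy matching, here is a small sketch that uses Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure. It is not the algorithm used by SSIS Fuzzy Lookup, DQS or MDS; the sample names, the `candidate_duplicates` helper and the 0.85 threshold are assumptions made for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical customer names to check for duplicates.
names = ["John Ryan", "Jack Ryan", "Jon Ryan", "Maria Bianchi"]


def similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_duplicates(values, threshold=0.85):
    """Return the pairs whose similarity reaches the threshold, for human review."""
    return [
        (a, b, round(similarity(a, b), 2))
        for a, b in combinations(values, 2)
        if similarity(a, b) >= threshold
    ]


pairs = candidate_duplicates(names)
```

With this threshold only `John Ryan` / `Jon Ryan` is proposed as a candidate pair. `John Ryan` and `Jack Ryan` score about 0.67 and are not matched, which suggests that nickname-style duplicates need domain knowledge (for example a synonym list) on top of pure string similarity.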
De-Duplication with Integration Services
Conclusions
• Bad Data Quality means Bad Business
• Start working on Data Quality ASAP
  • Define a business process to achieve Data Quality
  • MDS and DQS will help to support it
• Integration with existing applications via
  • Batch
  • SOA
• High Data Quality will be a must-have!
Links
• MDS Homepage: http://msdn.microsoft.com/en-us/sqlserver/ff943581.aspx
• 3rd Party Client Application: http://profisee.com/
• Data Quality & Data Science: http://www.solidq.com/consulting/