Semantic Technology for the Data Warehousing Practitioner
©2013, Cognizant
Shattering Traditional DW/BI Best Practices to Drive Intelligent Analytics
Speaker: Thomas Kelly, Practice Director, Life Sciences Enterprise Information Management, Cognizant Technology Solutions, Inc.
June 2-5, 2013
Agenda
1. Why the Data Warehouse is Important
2. Observations on Project Execution
3. Changing DW/BI Practices through Semantic Technology
4. A New Generation of Data Warehousing and Business Intelligence
Data Warehousing Keeps the Business Running, While Delivering Information & Insights
Objectives
• Performance
• Predictable Results; Consistent Reports
• Maintain History
Traditional Enablers
• Most data was well-understood
• New data sources emerged only occasionally
• Relational and well-structured data (sources)
Many Successful DW Projects have had Challenges
• Lengthy time-to-business value
• Get the … right, or the … breaks
• Speed of business change
• More data is available, from new sources
• We need data faster – closer to time of creation
• Developing expertise
Semantic Technology is about Standards, Products, and Techniques
Semantic Technology Features that enable Agile Data Warehousing
• Expert Knowledge
• Extensible Ontologies
• Linked Data
• Provenance
• Entity Resolution
• Data Virtualization
• Data Federation
Now that we’ve discussed my requirements, how soon can I get my new reports?
Fresh, Never Frozen Requirements
• Semantic Technology provides highly flexible and extensible features
• Better support for agile development (ongoing requirements definition and prioritization, managed through timeboxing)
• Extend the data model without breaking dependent data loading and analytics functions
Requirements described during the “Requirements” phase
Requirements that are unearthed in later phases
Evolutionary Data Modeling
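Figure: a book data model shown in successive forms.
• One table: ISBN, Title, Author, Publisher
• Separate tables: ISBN; ISBN, Language, Title; ISBN, Language, Author; ISBN, Language, Publisher
• One table with language: ISBN, Language, Title, Author, Publisher
• Graph: a bookURI node with title and ISBN properties, where each title carries a language tag:
“A Game of Thrones (A Song of Ice and Fire, Book 1)”@en
“Le trône de fer : L'intégrale, tome 1”@fr
“漫画系列•冰与火之歌漫画:权力的游戏(第1卷) [平装]”@zh
“Игра Престолов”@ru
“Juego de Tronos”@es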
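The graph form of the book example can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a real RDF library: the `add` and `values` helpers, the `book:1` identifier, and the publisher value are all hypothetical. It shows that adding a new property later leaves an existing title lookup unchanged.

```python
# Minimal in-memory statement store; a sketch, not a real RDF library.
triples = set()

def add(subject, predicate, value, lang=None):
    """Store one statement; lang is an optional language tag for text values."""
    triples.add((subject, predicate, value, lang))

def values(subject, predicate, lang=None):
    """Return values for a subject/predicate pair, optionally filtered by language."""
    return sorted(
        v for (s, p, v, l) in triples
        if s == subject and p == predicate and (lang is None or l == lang)
    )

BOOK = "book:1"  # hypothetical bookURI

# Titles taken from the slide, each with its language tag.
add(BOOK, "title", "A Game of Thrones (A Song of Ice and Fire, Book 1)", "en")
add(BOOK, "title", "Juego de Tronos", "es")
add(BOOK, "title", "Игра Престолов", "ru")

english_before = values(BOOK, "title", "en")

# A later requirement: track publishers. No table is altered and no
# existing statement is rewritten; one new statement is simply added.
add(BOOK, "publisher", "Example Publisher")  # placeholder value

english_after = values(BOOK, "title", "en")
print(english_before == english_after)
```

The title lookup never mentions the publisher property, so it is unaffected when that property appears; this is the sense in which the model can be extended without breaking dependent functions.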
Multi-Dimensional Data Quality Management
Figure: two panels, “Traditional Data Warehouse” and “Semantic Data Warehouse”. Each shows three sources (Data Store or Data Source A, B, C) feeding a warehouse (Enterprise Data Warehouse or Data Warehouse) that serves views at several levels: Division-Level View, Department-Level View, My View, Your View. One panel carries a single “Data Quality Happens Here” callout; the other repeats it at every layer (“Data Quality Happens Here”, “And Here”, “And Here”).
Minimizing Data Movement
Figure: two panels.
• Traditional Data Warehouse: Data Source, then a Data Warehouse made up of Landing Zone, Staging Area, Integrated Store, Analytics Layer, and Data Marts
• Semantic Data Warehouse: Data Source, then Data Warehouse
Build Links, Not Storage Farms
Figure: two panels.
• Traditional Data Warehouse: six data sources around a Data Warehouse labeled “The Preferred Repository”
• Semantic Data Warehouse: six data sources around a Semantic Data Warehouse labeled “One Stop Access Point”
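One way to picture an access point that links to sources rather than copying them is the sketch below. It is an illustration under stated assumptions, not a description of any product: the `AccessPoint` class, its `link` and `get` methods, and the pricing source are hypothetical names. The small cache reflects the deck's later note about caching locally for performance and reliability.

```python
class AccessPoint:
    """Single entry point that links to sources instead of copying their data."""

    def __init__(self):
        self._sources = {}  # source name -> callable that fetches one record
        self._cache = {}    # (source name, key) -> record fetched earlier

    def link(self, name, fetch):
        """Register a source by name; no data is moved at this point."""
        self._sources[name] = fetch

    def get(self, name, key):
        """Fetch from the linked source on first use, then serve from the cache."""
        cache_key = (name, key)
        if cache_key not in self._cache:
            self._cache[cache_key] = self._sources[name](key)
        return self._cache[cache_key]


calls = []  # records every trip to the (hypothetical) remote source

def pricing_source(key):
    calls.append(key)
    return {"sku": key, "price": 10.0}  # made-up record

hub = AccessPoint()
hub.link("pricing", pricing_source)

first = hub.get("pricing", "sku-1")   # goes to the source
second = hub.get("pricing", "sku-1")  # served from the local cache
print(len(calls))
```

A real deployment would also need cache expiry and error handling, which are omitted here to keep the idea visible.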
Integrating Expertise: Selecting for Hypothyroidism
SNOMED Clinical Terms Ontology
Subject         Predicate         Object
sno:40930008    ID                40930008
sno:40930008    Preferred Name    Hypothyroidism
icd9:244        ID                244
icd9:244        Preferred Name    Acquired hypothyroidism
icd9:244.8      ID                244.8
icd9:244.8      Preferred Name    Other specified acquired hypothyroidism
ind:4093008     ID                40930008
ind:4093008     Defined By        sno:40930008
ind:4093008     Inclusion ICD     icd9:244, icd9:244.8
ind:4093008     Exclusion ICD     icd9:631, icd9:633
Source: SNOMED-CT Ontology, IHTSDO
SPARQL query (abbreviated):
SELECT DISTINCT ?patientID ?patientName WHERE { ?patient ?indication "HYPOTHYROIDISM" }
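To show how the ontology statements let a query ask for an indication by name instead of listing codes, here is a small Python sketch. It is not SPARQL and not the presenter's implementation: the predicate spellings (`preferredName`, `definedBy`, `inclusionICD`, `exclusionICD`), the helper functions, and the three patients are assumptions made for illustration. The identifiers are copied from the slide as given.

```python
# Statements transcribed from the slide; predicate spellings are assumed.
TRIPLES = [
    ("sno:40930008", "preferredName", "Hypothyroidism"),
    ("icd9:244", "preferredName", "Acquired hypothyroidism"),
    ("icd9:244.8", "preferredName", "Other specified acquired hypothyroidism"),
    ("ind:4093008", "definedBy", "sno:40930008"),
    ("ind:4093008", "inclusionICD", "icd9:244"),
    ("ind:4093008", "inclusionICD", "icd9:244.8"),
    ("ind:4093008", "exclusionICD", "icd9:631"),
    ("ind:4093008", "exclusionICD", "icd9:633"),
]

def objects(subject, predicate):
    """All objects stated for a subject/predicate pair."""
    return {o for (s, p, o) in TRIPLES if s == subject and p == predicate}

def indications_named(name):
    """Indications whose defining concept has the given preferred name."""
    concepts = {s for (s, p, o) in TRIPLES
                if p == "preferredName" and o.upper() == name.upper()}
    return {s for (s, p, o) in TRIPLES if p == "definedBy" and o in concepts}

def select_patients(patient_codes, name):
    """Patients with at least one inclusion code and no exclusion code."""
    selected = set()
    for indication in indications_named(name):
        include = objects(indication, "inclusionICD")
        exclude = objects(indication, "exclusionICD")
        for patient_id, codes in patient_codes.items():
            if codes & include and not codes & exclude:
                selected.add(patient_id)
    return sorted(selected)

# Hypothetical patients and their diagnosis codes.
patients = {
    "p1": {"icd9:244"},
    "p2": {"icd9:244.8", "icd9:633"},
    "p3": {"icd9:250"},
}
print(select_patients(patients, "HYPOTHYROIDISM"))  # ['p1']
```

The caller supplies only the word HYPOTHYROIDISM; the inclusion and exclusion codes come from the ontology, so updating the expert definition changes the result without touching the query.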
Embedding Expert Knowledge

Case Medications
Levothyroxine, synthroid, levoxyl, unithroid, armour thyroid, desiccated thyroid, cytomel, triostat, liothyronine, synthetic triiodothyronine, liotrix, thyrolar

ICD-9 Codes for Hypothyroidism
244, 244.8, 244.9, 245, 245.2, 245.8, 245.9

ICD-9 Codes for Secondary Causes of Hypothyroidism
244.0, 244.1, 244.2, 244.3

Abnormal Lab Values
TSH > 5 OR FT4 < 0.5

Case Definition
All three conditions required:
1. ICD-9 code for hypothyroidism OR abnormal TSH/FT4
2. Thyroid replacement medication use
3. At least 2 instances of either medication or lab, with at least 3 months between the first and last instance of medication and lab

Case Exclusions
Exclude if the following information occurs at any time in the record:
• Secondary causes of hypothyroidism
• Post surgical or post radiation hypothyroidism
• Other thyroid diseases
• Thyroid altering medication

Case Exclusions
Time dependent case exclusions:
• Recent pregnancy TSH/FT4
• Recent contrast exposure

Pregnancy Exclusion ICD-9 Codes
Any pregnancy billing code or lab test if all Case Definition codes, labs, or medications fall within 6 months before pregnancy to one year after pregnancy
V22.1, V22.2, 631, 633, 633.0, 633.00, 633.1, 633.10, 633.20, 633.8, 633.80, 633.9, 633.90, 645.1, 645.2, 646.8, etc.

Exclusion Keywords
Optiray, radiocontrast, iodine, omnipaque, visipaque, hypaque, ioversol, diatrizoate, iodixanol, isovue, iopamidol, conray, iothalamate, renografin, sinografin, cystografin, iodipamide

ICD-9 Codes for Post Surgical or Post Radiation Hypothyroidism
193*, 242.0, 242.1, 242.2, 242.3, 242.9, 244.0, 244.1, 244.2, 244.3, 258*

CPT Codes for Post Radiation Hypothyroidism
77261, 77262, 77263, 77280, 77285, 77290, 77295, 77299, 77300, 77301, 77305, 77310, etc.

Exclusion Keywords
Multiple endocrine neoplasia, MEN I, MEN II, thyroid cancer, thyroid carcinoma

Thyroid-Altering Medications
Phenytoin, Dilantin, Infatabs, Dilantin Kapseals, Dilantin-125, Phenytek, Amiodarone, Pacerone, Cordarone, Lithium, Eskalith, Lithobid, Methimazole, Tapazole, Northyx, Propylthiouracil, PTU

Source: Conway et al.; Denny et al. Reprinted with permission from Jyotishman Pathak, Ph.D., Mayo Clinic
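The case definition and two of the any-time exclusions from this slide could be encoded roughly as follows. This is a hedged sketch, not the published algorithm: the `Patient` structure and function names are hypothetical, the medication sets are short subsets of the slide's lists, "3 months" is approximated as 90 days, and condition 3 is read as pooling medication and lab events, which is one possible interpretation. Lab units are taken as given on the slide.

```python
from dataclasses import dataclass, field
from datetime import date

HYPOTHYROIDISM_ICD9 = {"244", "244.8", "244.9", "245", "245.2", "245.8", "245.9"}
SECONDARY_CAUSE_ICD9 = {"244.0", "244.1", "244.2", "244.3"}
# Subsets of the slide's medication lists, lower-cased for matching.
REPLACEMENT_MEDS = {"levothyroxine", "synthroid", "levoxyl", "unithroid", "liothyronine"}
THYROID_ALTERING_MEDS = {"phenytoin", "amiodarone", "lithium", "methimazole", "propylthiouracil"}

@dataclass
class Patient:  # hypothetical record layout
    icd9: set = field(default_factory=set)
    labs: list = field(default_factory=list)  # (date, test name, value)
    meds: list = field(default_factory=list)  # (date, medication name)

def abnormal(test, value):
    """Abnormal lab rule from the slide: TSH > 5 OR FT4 < 0.5."""
    return (test == "TSH" and value > 5) or (test == "FT4" and value < 0.5)

def is_case(p):
    """All three case-definition conditions must hold."""
    dx_or_lab = bool(p.icd9 & HYPOTHYROIDISM_ICD9) or any(
        abnormal(test, value) for _, test, value in p.labs)
    med_dates = [d for d, name in p.meds if name.lower() in REPLACEMENT_MEDS]
    lab_dates = [d for d, test, _ in p.labs if test in ("TSH", "FT4")]
    events = sorted(med_dates + lab_dates)  # assumption: pool both event types
    spread = len(events) >= 2 and (events[-1] - events[0]).days >= 90
    return dx_or_lab and bool(med_dates) and spread

def is_excluded(p):
    """Two any-time exclusions: secondary-cause codes, thyroid-altering medication."""
    altering = any(name.lower() in THYROID_ALTERING_MEDS for _, name in p.meds)
    return bool(p.icd9 & SECONDARY_CAUSE_ICD9) or altering

# Made-up patient for illustration.
patient = Patient(
    icd9={"244.9"},
    labs=[(date(2012, 1, 10), "TSH", 7.2)],
    meds=[(date(2012, 1, 15), "Levothyroxine"), (date(2012, 6, 1), "Levothyroxine")],
)
print(is_case(patient) and not is_excluded(patient))  # True
```

The time-dependent exclusions (pregnancy and contrast exposure), the CPT codes, and the keyword searches are left out; they would add date-window and text-matching logic on top of the same structure.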
Researching, Analyzing, Justifying, Socializing, and Pleading for Approval of the Project Business Case
• Justify an exploratory project to prove and demonstrate value
• Focus efforts on incremental improvements that achieve a positive result
• Justify further work based on success
Semantic Technology-enabled Data Warehousing: Features at a Glance
Requirements Gathering and Analysis
  Traditional Technology and Practices:
  • Capture requirements and freeze early
  • Manage change
  Semantic Technology and Practices:
  • Capture and validate initial requirements
  • Adjust and fine tune
  • Prioritize new requirements
Data Modeling
  Traditional Technology and Practices: Thorough upfront analysis to avoid rework later
  Semantic Technology and Practices:
  • Expect change
  • Agile, evolutionary
Data Latency
  Traditional Technology and Practices: “Yesterday’s data is available today, if it all loaded on time”
  Semantic Technology and Practices: “The pricing data is updated in real-time”
Data Infrastructure
  Traditional Technology and Practices: Bring all of the data in house
  Semantic Technology and Practices:
  • Leverage external data
  • Cache locally to address performance / reliability
Deliver Business Value
  Traditional Technology and Practices: Correctly define, design, build, and document everything before delivering value
  Semantic Technology and Practices: Deliver value early and often
Questions?
Thank you
Speaker
Thomas (Tom) Kelly, Practice Director, EIM Life Sciences, Cognizant
Thomas is a Practice Leader in Cognizant’s Enterprise Information Management (EIM) Practice with over 30 years of experience. He leads Data Warehousing, Business Intelligence, and Big Data projects that deliver value to clients in Life Sciences and related health industries.