Semantic Technology for the Data Warehousing Practitioner

Semantic Technology for the Data Warehousing Practitioner: Shattering Traditional DW/BI Best Practices to Drive Intelligent Analytics. Speaker: Thomas Kelly, Practice Director, Life Sciences Enterprise Information Management, Cognizant Technology Solutions, Inc. June 2-5, 2013. ©2013, Cognizant.

description

Semantic Technology for the Data Warehousing Practitioner: Shattering Traditional DW/BI Best Practices to Drive Intelligent Analytics

Current DW/BI best practices optimize technologies that were conceived two to three decades ago. To successfully leverage semantic technology, DW/BI professionals will change (even reverse) many of these practices.

Many organizations use data warehousing and business intelligence to monitor their operations and guide tactical and strategic decision making. Data warehouses continue to have challenges in:

- keeping the data organized in sync with the organization's analytics needs,
- delivering data to decision makers in a timely manner,
- managing constantly evolving data quality requirements,
- integrating new data sets into the data warehouse,
- reusing expert knowledge that is embedded in end-user analytics, and
- organizing internal data assets, data from cloud applications, and data from business partners into a common access method.

Many of today's DW/BI practices were developed to optimize technologies that were conceived in the 1970s, 1980s, and 1990s. This presentation examines key features of semantic technology and how DW/BI practices are likely to change to successfully deliver intelligent DW/BI projects.

Transcript of Semantic Technology for the Data Warehousing Practitioner

Page 1: Semantic Technology for the Data Warehousing Practitioner

©2013, Cognizant

Semantic Technology for the Data Warehousing Practitioner

Thomas Kelly Practice Director, Life Sciences Enterprise Information Management Cognizant Technology Solutions, Inc.

Shattering Traditional DW/BI Best Practices to Drive Intelligent Analytics

Speaker

June 2-5, 2013

Page 2: Semantic Technology for the Data Warehousing Practitioner

Agenda

1. Why the Data Warehouse is Important
2. Observations on Project Execution
3. Changing DW/BI Practices through Semantic Technology
4. A New Generation of Data Warehousing and Business Intelligence

Page 3: Semantic Technology for the Data Warehousing Practitioner

Data Warehousing Keeps the Business Running, While Delivering Information & Insights

Objectives

• Performance
• Predictable Results; Consistent Reports
• Maintain History

Traditional Enablers

• Most data was well-understood
• New data sources emerged only occasionally
• Relational and well-structured data (sources)

Page 4: Semantic Technology for the Data Warehousing Practitioner

Many Successful DW Projects have had Challenges

• Lengthy time-to-business value

• Get the … right, or the … breaks

• Speed of business change

• More data is available, from new sources

• We need data faster – closer to time of creation

• Developing expertise

Page 5: Semantic Technology for the Data Warehousing Practitioner

Semantic Technology is about Standards, Products, and Techniques

Page 6: Semantic Technology for the Data Warehousing Practitioner

Semantic Technology Features that enable Agile Data Warehousing

• Expert Knowledge
• Extensible Ontologies
• Linked Data
• Provenance
• Entity Resolution
• Data Virtualization
• Data Federation

Page 7: Semantic Technology for the Data Warehousing Practitioner

Fresh, Never Frozen Requirements

“Now that we’ve discussed my requirements, how soon can I get my new reports?”

• Semantic Technology provides highly flexible and extensible features

• Better support for agile development (ongoing requirements definition and prioritization, managed through timeboxing)

• Extend the data model without breaking dependent data loading and analytics functions

Requirements described during the “Requirements” phase

Requirements that are unearthed in later phases

Page 8: Semantic Technology for the Data Warehousing Practitioner

Evolutionary Data Modeling

[Figure: relational table layouts shown: (ISBN, Title, Author, Publisher); (ISBN); (ISBN, Language, Title); (ISBN, Language, Author); (ISBN, Language, Publisher); (ISBN, Language, Title, Author, Publisher). Graph labels: bookURI, title, ISBN.]

Titles attached to the book, one per language tag:

“A Game of Thrones (A Song of Ice and Fire, Book 1)” @en
“Le trône de fer : L'intégrale, tome 1” @fr
“漫画系列•冰与火之歌漫画:权力的游戏(第1卷) [平装]” @zh
“Игра Престолов” @ru
“Juego de Tronos” @es
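As a rough illustration of the idea on this slide, the sketch below stores a book as subject/predicate/object triples with language-tagged titles, then adds a new language and a new kind of fact without touching the existing lookup. The URI, the ISBN and publisher placeholders, the predicate names, and the `titles` helper are all hypothetical and do not come from the presentation.

```python
# Hypothetical identifiers throughout: the URI, the ISBN value, and the
# predicate names are placeholders for illustration only.
BOOK = "ex:book/a-game-of-thrones"

# Each fact is one (subject, predicate, object) triple; titles carry a language tag.
triples = [
    (BOOK, "isbn", "ISBN-PLACEHOLDER"),
    (BOOK, "title", ("A Game of Thrones (A Song of Ice and Fire, Book 1)", "en")),
    (BOOK, "title", ("Juego de Tronos", "es")),
]


def titles(store, subject, lang=None):
    """Return the titles of a subject, optionally filtered by language tag."""
    found = []
    for s, p, o in store:
        if s == subject and p == "title":
            text, tag = o
            if lang is None or tag == lang:
                found.append(text)
    return found


before = titles(triples, BOOK, lang="en")

# The model evolves: a new language and a new kind of fact are simply appended.
triples.append((BOOK, "title", ("Игра Престолов", "ru")))
triples.append((BOOK, "publisher", "PUBLISHER-PLACEHOLDER"))

# The existing English-title lookup is unaffected by the additions.
after = titles(triples, BOOK, lang="en")
```

In a relational design, the same change would typically mean altering a table or adding a new one. Here the existing lookup keeps working because it only reads the predicates it already knows about.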

Page 9: Semantic Technology for the Data Warehousing Practitioner

Multi-Dimensional Data Quality Management

[Figure: two panels, "Traditional Data Warehouse" and "Semantic Data Warehouse". Labels: Data Source A, B, C; Data Warehouse; Data Store A, B, C; Enterprise Data Warehouse; three Division-Level Views; Department-Level View; My View; Your View. One panel is annotated "Data Quality Happens Here" at a single point; the other is annotated "Data Quality Happens Here" followed by "And Here" at six further points.]

Page 10: Semantic Technology for the Data Warehousing Practitioner

Minimizing Data Movement

[Figure: two panels. Traditional Data Warehouse: Data Source, Landing Zone, Staging Area, Integrated Store, Analytics Layer, Data Warehouse, Data Mart, Data Mart. Semantic Data Warehouse: Data Source, Data Warehouse.]

Page 11: Semantic Technology for the Data Warehousing Practitioner

Build Links, Not Storage Farms

[Figure: two panels. Traditional Data Warehouse: six Data Sources and a Data Warehouse labeled “The Preferred Repository”. Semantic Data Warehouse: six Data Sources and a Data Warehouse labeled "One Stop Access Point".]
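One way to picture a "one stop access point" that links to sources instead of copying them is the minimal sketch below. The `AccessPoint` class, the source names, and the sample records are hypothetical and only illustrate the linking idea; they are not taken from the presentation or from any particular product.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]
Source = Callable[[str], Iterable[Record]]  # takes a key, returns matching records


class AccessPoint:
    """Hypothetical access point: it stores links to sources, not copies of their data."""

    def __init__(self) -> None:
        self._sources: Dict[str, Source] = {}

    def link(self, name: str, source: Source) -> None:
        """Register a source under a name; no data is loaded at this point."""
        self._sources[name] = source

    def lookup(self, key: str) -> List[Record]:
        """Ask every linked source for the key at query time and merge the answers."""
        results: List[Record] = []
        for name, source in self._sources.items():
            for record in source(key):
                results.append({"source": name, **record})
        return results


# Two stand-in sources, held in memory for the sake of the example.
crm = {"p1": {"name": "Pat"}}
labs = {"p1": {"tsh": "7.2"}}

hub = AccessPoint()
hub.link("crm", lambda key: [crm[key]] if key in crm else [])
hub.link("labs", lambda key: [labs[key]] if key in labs else [])

first = hub.lookup("p1")
labs["p1"]["tsh"] = "4.1"  # the source changes after it was linked
second = hub.lookup("p1")  # the new value is visible without any reload
```

Because nothing is copied when a source is linked, the second lookup sees the updated value. A real deployment would presumably add local caching for performance and reliability, which the sketch leaves out.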

Page 12: Semantic Technology for the Data Warehousing Practitioner

Embedding Expert Knowledge

SNOMED Clinical Terms Ontology (Source: SNOMED-CT Ontology, IHTSDO)

sno:40930008   ID               40930008
sno:40930008   Preferred Name   Hypothyroidism
icd9:244       ID               244
icd9:244       Preferred Name   Acquired hypothyroidism
icd9:244.8     ID               244.8
icd9:244.8     Preferred Name   Other specified acquired hypothyroidism
ind:4093008    ID               40930008
ind:4093008    Defined By       sno:40930008
ind:4093008    Inclusion ICD    icd9:244, icd9:244.8
ind:4093008    Exclusion ICD    icd9:631, icd9:633

SPARQL query (abbreviated):

SELECT DISTINCT ?patientID ?patientName WHERE { ?patient ?indication "HYPOTHYROIDISM" }

Integrating Expertise: Selecting for Hypothyroidism

Conway et al.; Denny et al. Reprinted with permission from Jyotishman Pathak, Ph.D., Mayo Clinic

Case Definition

All three conditions required:
1. ICD-9 code for hypothyroidism OR abnormal TSH/FT4
2. Thyroid replacement medication use
3. Require at least 2 instances of either medication or lab with at least 3 months between the first and last instance of medication and lab

Case Medications: Levothyroxine, synthroid, levoxyl, unithroid, armour thyroid, desiccated thyroid, cytomel, triostat, liothyronine, synthetic triiodothyronine, liotrix, thyrolar

ICD-9 Codes for Hypothyroidism: 244, 244.8, 244.9, 245, 245.2, 245.8, 245.9

Abnormal Lab Values: TSH > 5 OR FT4 < 0.5

Case Exclusions

Exclude if the following information occurs at any time in the record:
• Secondary causes of hypothyroidism
• Post surgical or post radiation hypothyroidism
• Other thyroid diseases
• Thyroid altering medication

Time dependent case exclusions:
• Recent pregnancy TSH/FT4
• Recent contrast exposure

ICD-9 Codes for Secondary Causes of Hypothyroidism: 244.0, 244.1, 244.2, 244.3

Pregnancy Exclusion ICD-9 Codes: V22.1, V22.2, 631, 633, 633.0, 633.00, 633.1, 633.10, 633.20, 633.8, 633.80, 633.9, 633.90, 645.1, 645.2, 646.8, etc. Any pregnancy billing code or lab test if all Case Definition codes, labs, or medications fall within 6 months before pregnancy to one year after pregnancy.

Exclusion Keywords: Optiray, radiocontrast, iodine, omnipaque, visipaque, hypaque, ioversol, diatrizoate, iodixanol, isovue, iopamidol, conray, iothalamate, renografin, sinografin, cystografin, conray, iodipamide

ICD-9 Codes for Post Surgical or Post Radiation Hypothyroidism: 193*, 242.0, 242.1, 242.2, 242.3, 242.9, 244.0, 244.1, 244.2, 244.3, 258*

CPT Codes for Post Radiation Hypothyroidism: 77261, 77262, 77263, 77280, 77285, 77290, 77295, 77299, 77300, 77301, 77305, 77310, etc.

Exclusion Keywords: Multiple endocrine neoplasia, MEN I, MEN II, thyroid cancer, thyroid carcinoma

Thyroid-Altering Medications: Phenytoin, Dilantin, Infatabs, Dilantin Kapseals, Dilantin-125, Phenytek, Amiodarone, Pacerone, Cordarone, Lithium, Eskalith, Lithobid, Methimazole, Tapazole, Northyx, Propylthiouracil, PTU
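The three-part case definition on this slide can be sketched as a small function to show how such expert rules might be encoded once and reused. This is a sketch under stated assumptions, not the published algorithm. The record layout is hypothetical, the medication list is abridged, "3 months" is read as 90 days, condition 3 is read as "at least two medication or TSH/FT4 lab events spanning at least 90 days", and only the secondary-cause exclusion is included.

```python
from datetime import date

# Code lists and thresholds copied from the slide (medication list abridged).
HYPOTHYROIDISM_ICD9 = {"244", "244.8", "244.9", "245", "245.2", "245.8", "245.9"}
SECONDARY_CAUSE_ICD9 = {"244.0", "244.1", "244.2", "244.3"}
CASE_MEDICATIONS = {"levothyroxine", "synthroid", "liothyronine", "cytomel"}


def abnormal_lab(name, value):
    """Slide rule: TSH > 5 OR FT4 < 0.5 (units are not stated on the slide)."""
    return (name == "TSH" and value > 5) or (name == "FT4" and value < 0.5)


def is_case(icd9_codes, labs, meds):
    """Apply the case definition to one patient record.

    icd9_codes: set of str; labs: list of (name, value, date);
    meds: list of (name, date). This layout is hypothetical.
    """
    abnormal_dates = [d for name, value, d in labs if abnormal_lab(name, value)]
    med_dates = [d for name, d in meds if name.lower() in CASE_MEDICATIONS]

    # 1. ICD-9 code for hypothyroidism OR abnormal TSH/FT4
    cond1 = bool(icd9_codes & HYPOTHYROIDISM_ICD9) or bool(abnormal_dates)
    # 2. Thyroid replacement medication use
    cond2 = bool(med_dates)
    # 3. Assumed reading: at least 2 medication or TSH/FT4 lab events spanning >= 90 days
    lab_dates = [d for name, _, d in labs if name in ("TSH", "FT4")]
    events = sorted(med_dates + lab_dates)
    cond3 = len(events) >= 2 and (events[-1] - events[0]).days >= 90

    # Only one exclusion is sketched here: secondary causes of hypothyroidism.
    excluded = bool(icd9_codes & SECONDARY_CAUSE_ICD9)
    return cond1 and cond2 and cond3 and not excluded


example = is_case(
    {"244.9"},
    labs=[("TSH", 7.2, date(2012, 1, 10))],
    meds=[("Levothyroxine", date(2012, 6, 1))],
)
```

Keeping the code lists as named data and leaving them out of each individual report is the point of the slide: the definition can be reviewed and updated in one place.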

Page 13: Semantic Technology for the Data Warehousing Practitioner

Researching, Analyzing, Justifying, Socializing, and Pleading for Approval of the Project Business Case

• Justify an exploratory project to prove and demonstrate value

• Focus efforts on incremental improvements that achieve a positive result

• Justify further work based on success

Page 14: Semantic Technology for the Data Warehousing Practitioner

Semantic Technology-enabled Data Warehousing: Features at a Glance

Requirements Gathering and Analysis
Traditional Technology and Practices:
• Capture requirements and freeze early
• Manage change
Semantic Technology and Practices:
• Capture and validate initial requirements
• Adjust and fine tune
• Prioritize new requirements

Data Modeling
Traditional Technology and Practices:
• Thorough upfront analysis to avoid rework later
Semantic Technology and Practices:
• Expect change
• Agile, evolutionary

Data Latency
Traditional Technology and Practices:
• “Yesterday’s data is available today, if it all loaded on time”
Semantic Technology and Practices:
• “The pricing data is updated in real-time”

Data Infrastructure
Traditional Technology and Practices:
• Bring all of the data in house
Semantic Technology and Practices:
• Leverage external data
• Cache locally to address performance / reliability

Deliver Business Value
Traditional Technology and Practices:
• Correctly define, design, build, and document everything before delivering value
Semantic Technology and Practices:
• Deliver value early and often

Page 15: Semantic Technology for the Data Warehousing Practitioner

Questions?

Page 16: Semantic Technology for the Data Warehousing Practitioner

Thank you

Page 17: Semantic Technology for the Data Warehousing Practitioner

Speaker

Thomas (Tom) Kelly, Practice Director, EIM Life Sciences, Cognizant

Thomas is a Practice Leader in Cognizant’s Enterprise Information Management (EIM) Practice, with over 30 years of experience, focusing on leading Data Warehousing, Business Intelligence, and Big Data projects that deliver value to Life Sciences and related health industries clients.