Data Integration - COCS 6421 Advanced Database Systems

71
Introduction Data Integration Summary Data Integration COCS 6421 Advanced Database Systems Przemyslaw Pawluk CSE, York University March 20, 2008 Przemyslaw Pawluk Data Integration

Transcript of Data Integration - COCS 6421 Advanced Database Systems

Page 1: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Data IntegrationCOCS 6421 Advanced Database Systems

Przemyslaw Pawluk

CSE, York University

March 20, 2008

Przemyslaw Pawluk Data Integration

Page 2: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Agenda

1 IntroductionProblem descriptionProblems

2 Data IntegrationIntegration processTheoretical perspectivePractical solutions

3 SummaryOpen questions and future workConclusionBibliography

Przemyslaw Pawluk Data Integration

Page 3: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Problem descriptionProblems

Introduction

Przemyslaw Pawluk Data Integration

Page 4: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Problem descriptionProblems

Why to do it?

Many data sources and applications in one organization

The need of fast access and understanding of informationfrom different sources.

Elimination of bottlenecks in the data-flow (many pointswhere data is transformed).

Przemyslaw Pawluk Data Integration

Page 5: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Problem descriptionProblems

Why to do it?

Many data sources and applications in one organization

The need of fast access and understanding of informationfrom different sources.

Elimination of bottlenecks in the data-flow (many pointswhere data is transformed).

Przemyslaw Pawluk Data Integration

Page 6: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Problem descriptionProblems

Why to do it?

Many data sources and applications in one organization

The need of fast access and understanding of informationfrom different sources.

Elimination of bottlenecks in the data-flow (many pointswhere data is transformed).

Przemyslaw Pawluk Data Integration

Page 7: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Problem descriptionProblems

Goals of data integration

To answer queries in most efficient way. (When we knowexactly the queries we want to answer and what data areavailable)

To discover the knowledge. (When don’t know the queries inadvance or we don’t know all data sources in advance)

In both we need to analyze data and present results accordingto requirements of users.

Przemyslaw Pawluk Data Integration

Page 8: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Problem descriptionProblems

Goals of data integration

To answer queries in most efficient way. (When we knowexactly the queries we want to answer and what data areavailable)

To discover the knowledge. (When don’t know the queries inadvance or we don’t know all data sources in advance)

In both we need to analyze data and present results accordingto requirements of users.

Przemyslaw Pawluk Data Integration

Page 9: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Problem descriptionProblems

Goals of data integration

To answer queries in most efficient way. (When we knowexactly the queries we want to answer and what data areavailable)

To discover the knowledge. (When don’t know the queries inadvance or we don’t know all data sources in advance)

In both we need to analyze data and present results accordingto requirements of users.

Przemyslaw Pawluk Data Integration

Page 10: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Problem descriptionProblems

Problems in data integration

Identification of the best data sourceHow to get data from the source?

Interface to integrate schemesFormulation of queries to different data sources

Problems with dataincomplete and incorrect datainconsistent dataincomprehensible dataduplicatesdifferent formats

Security issues

Quality of DataQuality of Service

reliabilityefficiency. . .

Przemyslaw Pawluk Data Integration

Page 11: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Problem descriptionProblems

Problems with data–Example

Many data formats i.e. John Brown, Brown J., Brown John

Two different phone no. or addresses where integrityconstraint allows only one

Zip codes, phone numbers, SIN etc. with ‘–‘ or without

Bad news!

AI cannot solve all problems. Knowledge of meaning of the data isrequired!

Przemyslaw Pawluk Data Integration

Page 12: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Problem descriptionProblems

Problems with data–Example

Many data formats i.e. John Brown, Brown J., Brown John

Two different phone no. or addresses where integrityconstraint allows only one

Zip codes, phone numbers, SIN etc. with ‘–‘ or without

Bad news!

AI cannot solve all problems. Knowledge of meaning of the data isrequired!

Przemyslaw Pawluk Data Integration

Page 13: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Problem descriptionProblems

Problems with data–Example

Many data formats i.e. John Brown, Brown J., Brown John

Two different phone no. or addresses where integrityconstraint allows only one

Zip codes, phone numbers, SIN etc. with ‘–‘ or without

Bad news!

AI cannot solve all problems. Knowledge of meaning of the data isrequired!

Przemyslaw Pawluk Data Integration

Page 14: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Problem descriptionProblems

Problems with data–Example

Many data formats i.e. John Brown, Brown J., Brown John

Two different phone no. or addresses where integrityconstraint allows only one

Zip codes, phone numbers, SIN etc. with ‘–‘ or without

Bad news!

AI cannot solve all problems. Knowledge of meaning of the data isrequired!

Przemyslaw Pawluk Data Integration

Page 15: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Data Integration

Przemyslaw Pawluk Data Integration

Page 16: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Integration Process

Przemyslaw Pawluk Data Integration

Page 17: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Elements of the integration process

Understanding

meta-data analysisdiscover inconsistencies and dependencies

Standardization

target schemadata repair (to achieve consistent data)data identification constraints (How to identify object?)

Specification

build mapping(s)

Execution

materialization (ETL)federation (data exchange or query reformulation)indexing

Przemyslaw Pawluk Data Integration

Page 18: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Elements of the integration process

Understanding

meta-data analysisdiscover inconsistencies and dependencies

Standardization

target schemadata repair (to achieve consistent data)data identification constraints (How to identify object?)

Specification

build mapping(s)

Execution

materialization (ETL)federation (data exchange or query reformulation)indexing

Przemyslaw Pawluk Data Integration

Page 19: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Elements of the integration process

Understanding

meta-data analysisdiscover inconsistencies and dependencies

Standardization

target schemadata repair (to achieve consistent data)data identification constraints (How to identify object?)

Specification

build mapping(s)

Execution

materialization (ETL)federation (data exchange or query reformulation)indexing

Przemyslaw Pawluk Data Integration

Page 20: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Elements of the integration process

Understanding

meta-data analysisdiscover inconsistencies and dependencies

Standardization

target schemadata repair (to achieve consistent data)data identification constraints (How to identify object?)

Specification

build mapping(s)

Execution

materialization (ETL)federation (data exchange or query reformulation)indexing

Przemyslaw Pawluk Data Integration

Page 21: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Techniques

Przemyslaw Pawluk Data Integration

Page 22: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Materialization

DS1

DS2

DS3

ODS

ETL CDB DM2

DM1

DM3

Present.

The information pipeline[1]

Przemyslaw Pawluk Data Integration

Page 23: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Materialization

DS1

DS2

DS3

ODS ETL

CDB DM2

DM1

DM3

Present.

The information pipeline[1]

Przemyslaw Pawluk Data Integration

Page 24: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Materialization

DS1

DS2

DS3

ODS ETL CDB

DM2

DM1

DM3

Present.

The information pipeline[1]

Przemyslaw Pawluk Data Integration

Page 25: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Materialization

DS1

DS2

DS3

ODS ETL CDB DM2

DM1

DM3

Present.

The information pipeline[1]

Przemyslaw Pawluk Data Integration

Page 26: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Materialization

DS1

DS2

DS3

ODS ETL CDB DM2

DM1

DM3

Present.

The information pipeline[1]

Przemyslaw Pawluk Data Integration

Page 27: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Federation

DS1 DS2 DS3

Query processor

Presentation

Do not materialize data

Usually use queryreformulation or unfolding

Przemyslaw Pawluk Data Integration

Page 28: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Federation

DS1 DS2 DS3

Query processor

Presentation

Do not materialize data

Usually use queryreformulation or unfolding

Przemyslaw Pawluk Data Integration

Page 29: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Federation

DS1 DS2 DS3

Query processor

Presentation

Do not materialize data

Usually use queryreformulation or unfolding

Przemyslaw Pawluk Data Integration

Page 30: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Federation

DS1 DS2 DS3

Query processor

Presentation

Do not materialize data

Usually use queryreformulation or unfolding

Przemyslaw Pawluk Data Integration

Page 31: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Indexing

XML1 XML2 XML3

Index

Query processor

Presentation

Do not materialize data

Use index to find properdocument

Przemyslaw Pawluk Data Integration

Page 32: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Indexing

XML1 XML2 XML3

Index

Query processor

Presentation

Do not materialize data

Use index to find properdocument

Przemyslaw Pawluk Data Integration

Page 33: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Indexing

XML1 XML2 XML3

Index

Query processor

Presentation

Do not materialize data

Use index to find properdocument

Przemyslaw Pawluk Data Integration

Page 34: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Indexing

XML1 XML2 XML3

Index

Query processor

Presentation

Do not materialize data

Use index to find properdocument

Przemyslaw Pawluk Data Integration

Page 35: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Theoretical perspective

Przemyslaw Pawluk Data Integration

Page 36: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Views

Przemyslaw Pawluk Data Integration

Page 37: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Views in data integration

Architecture based on global schema (GS) and set of sources

Data in the data sources (DS)

Global schema as virtual view of DS

Each DS and GS can be expressed in a different language

The goal is to define some mappings between DSs and GS

Two approaches:

Global as viewLocal as view

Przemyslaw Pawluk Data Integration

Page 38: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Global as view

Global schema expressed in terms of data sources. Each element ofthe global schema is defined as a view over the sources.

Requirement!

Stable set of data sources!

Przemyslaw Pawluk Data Integration

Page 39: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Global as view

Global schema expressed in terms of data sources. Each element ofthe global schema is defined as a view over the sources.

Requirement!

Stable set of data sources!

Przemyslaw Pawluk Data Integration

Page 40: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

GAV–benefits

The benefits of GAV:

Easier to understand

Mapping explicitly tells how to retrieve data from DSs

Przemyslaw Pawluk Data Integration

Page 41: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Local as view

Global schema defined independently from data-sources, and therelationships between the global schema and the sources areexpressed by defining every source as a view over the globalschema.

Requirement!

Stable global schema!

Przemyslaw Pawluk Data Integration

Page 42: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Local as view

Global schema defined independently from data-sources, and therelationships between the global schema and the sources areexpressed by defining every source as a view over the globalschema.

Requirement!

Stable global schema!

Przemyslaw Pawluk Data Integration

Page 43: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

LAV–benefits

The immediate benefits of LAV:

Easier describing sources(doesn’t require any knowledge aboutother DS)

New DS can be easily added

Describing precise constraints on the contents of DS anddescribing sources that have different relational structure thanthe GS is easier

Przemyslaw Pawluk Data Integration

Page 44: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

GAV vs. LAV

The LAV system can be transformed into a GAV one andvice-versa.

LAV to GAV. . .

. . . can be used to derive theprocedural specification fromdeclarative one.

GAV to LAV. . .

. . . can be useful to derive adeclarative characterization ofthe content of the sourcesstarting from proceduralspecification.

Przemyslaw Pawluk Data Integration

Page 45: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

GAV vs. LAV

The LAV system can be transformed into a GAV one andvice-versa.

LAV to GAV. . .

. . . can be used to derive theprocedural specification fromdeclarative one.

GAV to LAV. . .

. . . can be useful to derive adeclarative characterization ofthe content of the sourcesstarting from proceduralspecification.

Przemyslaw Pawluk Data Integration

Page 46: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

GAV vs. LAV

The LAV system can be transformed into a GAV one andvice-versa.

LAV to GAV. . .

. . . can be used to derive theprocedural specification fromdeclarative one.

GAV to LAV. . .

. . . can be useful to derive adeclarative characterization ofthe content of the sourcesstarting from proceduralspecification.

Przemyslaw Pawluk Data Integration

Page 47: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Answering queries using views

Depends on chosen technique

GAV

gives simple solution how to get the data. Just unfold!

LAV

uses view-based query rewriting or view-based query answering.More complicated than GAV, uses reasoning in the presence ofincomplete information

Przemyslaw Pawluk Data Integration

Page 48: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Schema mapping

Przemyslaw Pawluk Data Integration

Page 49: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Schema mapping

Data exchange–How to move data from one schema to theother one?

Data integration–How to integrate data from many differentschemes?

Przemyslaw Pawluk Data Integration

Page 50: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Schema mapping in data exchange problem

Przemyslaw Pawluk Data Integration

Page 51: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

LAV and GAV–Example

In general elements of the set Σ can have a formφ(x) → ∃yΨ(x , y)For example(Student(s)∧Enrolls(s, c)) → ∃t∃g(Teaches(t, c)∧Grade(s, c , g))In LAV there is only one element on the left sideP(x) → ∃yΨ(x , y)In GAV there is only one element on the right sideφ(x) → R(x)

Przemyslaw Pawluk Data Integration

Page 52: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Schema composing

Przemyslaw Pawluk Data Integration

Page 53: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Schema inverting in schema evolution

Przemyslaw Pawluk Data Integration

Page 54: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Schema mapping–Example of changes in schemas

Why evolution?

new atributeadd/remove constraint. . .

Example

new information about student or course requiredchanges in the grading system. . .

Przemyslaw Pawluk Data Integration

Page 55: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Practical solutions

Przemyslaw Pawluk Data Integration

Page 56: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Example–banking system

Our point of view: CRM system

Questions:

The purpose of the systemThe source/sources of data for the system?. . .

Przemyslaw Pawluk Data Integration

Page 57: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Example–Banking systems

Customers DB

WebSys. Data Account DB Contracts

Cards DB

CRM System

Przemyslaw Pawluk Data Integration

Page 58: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Example–Banking systems

Customers DB

WebSys. Data Account DB Contracts

Cards DBCRM System

Przemyslaw Pawluk Data Integration

Page 59: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Example–Banking systems

Different formats and accuracy of data

Different Database Management Systems

Different owners (security levels)

. . .

Przemyslaw Pawluk Data Integration

Page 60: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

What is Master Data?

Facts describing core business entities

Provides business context

Does not imply a particular style of usage

Not everything is Master Data . . .

Przemyslaw Pawluk Data Integration

Page 61: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

What Is Master Data Management?

An approach that decouples master information fromindividual applications and unifies it.

Central application- and process-neutral resource

Ensures consistent, up-to-date master information acrossbusiness processes, and systems

Simplifies ongoing integration tasks and new applicationdevelopment

Przemyslaw Pawluk Data Integration

Page 62: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Problems . . .

Przemyslaw Pawluk Data Integration

Page 63: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Integration processTheoretical perspectivePractical solutions

Proposed solution

Master Data Management brakes walls between the systemsand provides fast access to the consistent information.

Przemyslaw Pawluk Data Integration

Page 64: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Open questions and future workConclusionBibliography

Summary

Przemyslaw Pawluk Data Integration

Page 65: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Open questions and future workConclusionBibliography

Open questions

Automatic schema mapping

Reference reconciling

Model management

Peer-to-Peer Data Management

Przemyslaw Pawluk Data Integration

Page 66: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Open questions and future workConclusionBibliography

Conclusion

Data integration–theoretical as well as business problem

Some commercial solutions are available but not sufficient

Many issues without solution = reach research field

Przemyslaw Pawluk Data Integration

Page 67: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Open questions and future workConclusionBibliography

Bibliography

Przemyslaw Pawluk Data Integration

Page 68: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Open questions and future workConclusionBibliography

Brazhnik, O., and Jones, J. F.Anatomy of data integration.J. of Biomedical Informatics 40, 3 (2007), 252–269.

Fagin, R., Kolaitis, P. G., Tan, W.-C., and Popa, L.Composing schema mappings: second-order dependencies tothe rescue.In PODS ’04: Proceedings of the twenty-third ACMSIGMOD-SIGACT-SIGART symposium on Principles ofdatabase systems (New York, NY, USA, 2004), ACM,pp. 83–94.

Haas, L. M.Beauty and the beast: The theory and practice of informationintegration.In ICDT (2007), T. Schwentick and D. Suciu, Eds., vol. 4353of Lecture Notes in Computer Science, Springer, pp. 28–43.

Przemyslaw Pawluk Data Integration

Page 69: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Open questions and future workConclusionBibliography

Halevy, A., Rajaraman, A., and Ordille, J.Data integration: the teenage years.In VLDB ’06: Proceedings of the 32nd international conferenceon Very large data bases (2006), VLDB Endowment, pp. 9–16.

Halevy, A. Y.Answering queries using views: A survey.The VLDB Journal 10, 4 (2001), 270–294.

Kolaitis, P. G.Schema mappings, data exchange, and metadatamanagement.In PODS ’05: Proceedings of the twenty-fourth ACMSIGMOD-SIGACT-SIGART symposium on Principles ofdatabase systems (New York, NY, USA, 2005), ACM,pp. 61–75.

Lenzerini, M.

Przemyslaw Pawluk Data Integration

Page 70: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Open questions and future workConclusionBibliography

Data integration: a theoretical perspective.In PODS ’02: Proceedings of the twenty-first ACMSIGMOD-SIGACT-SIGART symposium on Principles ofdatabase systems (New York, NY, USA, 2002), ACM,pp. 233–246.

Pankowski, T.Integracja danych w teorii i praktyce–przeglad problemow irozwiazan.In Bazy Danych: Nowe Technologie (2007), WK, pp. 45–55.

Ra, Y.-G., and Rundensteiner, E. A.A transparent schema-evolution system based onobject-oriented view technology.IEEE Transactions on Knowledge and Data Engineering 9, 4(1997), 600–624.

Shahri, H. H., and Shahri, S. H.

Przemyslaw Pawluk Data Integration

Page 71: Data Integration - COCS 6421 Advanced Database Systems

IntroductionData Integration

Summary

Open questions and future workConclusionBibliography

Eliminating duplicates in information integration: An adaptive,extensible framework.IEEE Intelligent Systems 21, 5 (2006), 63–71.

Przemyslaw Pawluk Data Integration