Intersection Schemas as a Dataspace Integration Technique, 8/21/2014
Intersection Schemas as a Dataspace Integration Technique
04/11/2023
Richard Brownlow, Alex Poulovassilis
Contribution
• A new methodology for lightweight data integration in an incremental pay-as-you-go environment, based on the concept of “Intersection Schemas”, utilising bidirectional transformations at a schema level.
• Improve on existing workflows for data integration, to increase the productivity of the incremental data integration process.
• Development of a demonstrator and user interface to aid the data integrator.
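To illustrate what “bidirectional” means at the schema level, the sketch below models a transformation pathway as a list of primitive steps, each with a known inverse, so that the reverse pathway can be derived automatically. The primitive names (add/delete, extend/contract, rename) follow those usually described for AutoMed; the `Step` class, the `reverse_pathway` helper and the object names are hypothetical and are not AutoMed's API.

```python
from dataclasses import dataclass

# Each primitive is paired with the primitive that undoes it (assumed naming).
INVERSE = {"add": "delete", "delete": "add",
           "extend": "contract", "contract": "extend",
           "rename": "rename"}

@dataclass(frozen=True)
class Step:
    kind: str      # one of the keys of INVERSE
    obj: str       # schema object the step acts on
    arg: str = ""  # query text for add/delete, new name for rename

    def inverse(self) -> "Step":
        if self.kind == "rename":
            # rename(a, b) is undone by rename(b, a)
            return Step("rename", self.arg, self.obj)
        return Step(INVERSE[self.kind], self.obj, self.arg)

def reverse_pathway(pathway):
    """Invert every step and apply the steps in reverse order."""
    return [step.inverse() for step in reversed(pathway)]

# Hypothetical pathway from a source schema towards an integrated schema.
forward = [
    Step("add", "protein", "query over proteinhit"),
    Step("rename", "protein", "Protein"),
    Step("delete", "proteinhit", "query over Protein"),
]
backward = reverse_pathway(forward)
```

Reversing `backward` again yields `forward`, which is the property that lets a single manually defined pathway serve in both directions.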
Intersection Schemas
• Implements a framework for incremental data integration, as a component within the existing AutoMed data integration framework.
• Introduces a new “pay-as-you-go” technique of Intersection Schemas. This allows the integrator to incrementally identify intersections between schemas and integrate them into the Global Schema.
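A minimal sketch of the idea, assuming a schema can be simplified to a set of object names and that the integrator supplies the mappings by hand. The function `intersection_schema`, the schema contents and the shared name are all hypothetical; real AutoMed schemas and pathways carry considerably more information.

```python
def intersection_schema(schema_a, schema_b, mappings):
    """Build an intersection schema from two schemas and integrator-supplied mappings.

    schema_a, schema_b: sets of object names.
    mappings: list of (object_in_a, object_in_b, shared_name) triples.
    Returns a dict shared_name -> (object_in_a, object_in_b).
    """
    result = {}
    for obj_a, obj_b, shared in mappings:
        if obj_a not in schema_a or obj_b not in schema_b:
            raise ValueError(f"mapping refers to unknown object: {obj_a!r}, {obj_b!r}")
        result[shared] = (obj_a, obj_b)
    return result

# Hypothetical, simplified schemas (object names are illustrative only).
source_a = {"proseq", "protein_description", "peplabel"}
source_b = {"sequence", "organism", "proteinhit"}
mappings = [("proseq", "sequence", "ProteinSequence")]

shared = intersection_schema(source_a, source_b, mappings)
```

Only the objects the integrator has chosen to map appear in the result; everything else stays in the source schemas until a later iteration.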
AutoMed Architecture
Data Integration via Union-compatible Schemas
Intersection Schema
Integrated Intersection and Extensional Schemas
Global Schema derived from Intersection and Extensional Schemas
Case Study
iSpider
• Proteomics data from three different data sources
• Mappings defined by domain experts
• Mappings constitute the domain knowledge
Illustrative Use Case
Based on iSpider Datasets
o Three data sources:
• gpmDB
• Pedro
• Pepseeker
Illustrative Use Case
GUI
Workflow
1. Identify the extensional schemas representing the set of data sources that are to be integrated.
2. Initially a federated schema is created from the schemas identified in Step 1.
3. Inspect the schemas identified in Step 1 and select two of them from which to derive an intersection schema.
4. Identify mappings between these two schemas and create an intersection schema.
5. A new Global Schema is created automatically from the Intersection Schema and the extensional schemas by our tool. The user may optionally elect for any redundant objects in the new Global Schema to be dropped.
6. The user may test the Intersection Schema or Global Schema at this stage by running queries on it.
7. Repeat Steps 3 to 6 for each integration iteration.
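The steps above can be sketched roughly as follows, again treating a schema as a plain set of object names. The helpers `federated_schema` and `integrate`, the source names `s1` to `s3` and all object names are assumptions made for illustration; they do not reproduce the demonstrator's behaviour or the iSpider schemas.

```python
def federated_schema(sources):
    """Step 2: every object of every source, prefixed by its source name."""
    return {f"{name}.{obj}" for name, objs in sources.items() for obj in objs}

def integrate(global_schema, mappings, drop_redundant=True):
    """Steps 4-5: add intersection objects; optionally drop the objects they cover.

    mappings: dict shared_name -> list of qualified source objects it covers.
    """
    new_global = set(global_schema) | set(mappings)
    if drop_redundant:
        covered = {obj for objs in mappings.values() for obj in objs}
        new_global -= covered
    return new_global

# Step 1: hypothetical extensional schemas (names are illustrative only).
sources = {
    "s1": {"proseq", "label"},
    "s2": {"sequence", "organism"},
    "s3": {"species", "score"},
}
g = federated_schema(sources)  # Step 2

# Iteration 1 (Steps 3-5): intersect s1 and s2.
g = integrate(g, {"ProteinSequence": ["s1.proseq", "s2.sequence"]})

# Iteration 2 (Step 7 repeats Steps 3-6): intersect s2 and s3.
g = integrate(g, {"Organism": ["s2.organism", "s3.species"]})

print(sorted(g))
# ['Organism', 'ProteinSequence', 's1.label', 's3.score']
```

Unmapped objects such as `s1.label` remain available in the Global Schema throughout, which is what makes the process pay-as-you-go: each iteration adds integration only where the integrator has invested effort.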
Evaluation
• Comparison of the Intersection Schema methodology versus a “classical” ladder-based integration methodology:
• For ladder-based integration:
  • 95 manually defined transformations
• For Intersection Schema based integration:
  • 26 manually defined transformations
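For scale, the two counts above can be turned into an absolute and a relative saving. This is simple arithmetic on the reported figures, not an additional measurement.

```python
ladder = 95        # manually defined transformations, ladder-based integration
intersection = 26  # manually defined transformations, intersection-schema-based

saved = ladder - intersection
reduction = saved / ladder

print(f"{saved} fewer transformations ({reduction:.1%} reduction)")
# 69 fewer transformations (72.6% reduction)
```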
Conclusions
We have demonstrated the technique on a real-world data integration scenario and have seen that the number of user-defined steps required to perform the integration is significantly reduced (26 manually defined transformations, against 95) compared to the original data integration methodology used by the domain experts on that project.
We have shown how the AutoMed toolkit and bidirectional schema transformations can be used to underpin a new lightweight data integration technique within an incremental pay-as-you-go data integration process.
Future Work
• Extending the methodology so that intersections can be created between any number of source schemas at each iteration of the process, rather than just two as at present.
• Detailed user evaluations.
Any Questions?
Appendix
Example iSpider transformations from original project.