Object Databases and Object Persistence Framework for openEHR

Student: Travis Muirhead
Supervisor: Jan Stanek
Associate Supervisors: Chunlan Ma, Heath Frankel

Overview

• Overview
• Background
  • openEHR foundation and architecture
• Motivation
  • Structural and semantic issues relating to openEHR systems
  • Performance issues in XML and relational databases storing complex object structures
• Preliminary Evaluation
  • Linear recursion test
  • OODB product comparison
• Final Evaluation
  • Real test data and common queries
  • Scope and bounds
• Conclusion

openEHR foundation

• Non-profit organisation

• Produces open specifications for Electronic Health Records

• Specifications address many challenges in EHR such as:
  • Semantic Interoperability
  • Continual Change and Complexity
  • Maintainability

openEHR Architecture

• Separation of knowledge and information

Archetype Software Meta-Architecture (Beale, T 2002)

openEHR Architecture

• Two-Level Modelling Approach

A Two-Level Modelling Paradigm (Beale, T 2002)

openEHR Architecture

[Figure: openEHR package structure (Beale, T & Heard, S 2007c); labels on the slide include Archetype Query Language and Terminology Subset Syntax]

• My scope: The Reference Model (RM)

openEHR Architecture

• Complex Structure – The persistence problem

Elements of an openEHR Composition (Beale, T & Heard, S 2007c)

openEHR Architecture

• AQL (Path Based Querying)
  • Navigation based, similar to XML query languages

SELECT
    o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value AS Systolic,
    o/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value AS Diastolic
FROM EHR [ehr_id=$ehrUid]
    CONTAINS COMPOSITION c[openEHR-EHR-COMPOSITION.encounter.v1]
    CONTAINS OBSERVATION o[openEHR-EHR-OBSERVATION.blood_pressure.v1]
WHERE
    o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value >= 140
    OR
    o/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value/value >= 90

A typical EQL query (Ma, C, Frankel, H, Beale, T & Heard, S 2007)
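
The query addresses values by archetype path. As a rough illustration of what "navigational" means here, the sketch below resolves such a path against a small nested Python structure. The dictionary layout, the `resolve_path` helper and the sample readings are assumptions made for illustration; this is neither the openEHR Reference Model nor an AQL engine.

```python
import re

# Hypothetical, heavily simplified stand-in for an OBSERVATION: each attribute
# maps to a list of child nodes identified by an archetype node id.
observation = {
    "data": [{
        "node_id": "at0001",
        "events": [{
            "node_id": "at0006",
            "data": [{
                "node_id": "at0003",
                "items": [
                    {"node_id": "at0004", "value": {"value": 150}},  # systolic
                    {"node_id": "at0005", "value": {"value": 85}},   # diastolic
                ],
            }],
        }],
    }],
}

SEGMENT = re.compile(r"^(\w+)(?:\[(\w+)\])?$")

def resolve_path(node, path):
    """Follow 'attr[node_id]/attr/...' segments from node; return None if absent."""
    for segment in path.split("/"):
        match = SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"bad path segment: {segment!r}")
        attr, node_id = match.groups()
        child = node.get(attr) if isinstance(node, dict) else None
        if child is None:
            return None
        if node_id is not None:
            if not isinstance(child, list):
                return None
            # Pick the child whose archetype node id matches the predicate.
            child = next((c for c in child if c.get("node_id") == node_id), None)
            if child is None:
                return None
        node = child
    return node

BASE = "data[at0001]/events[at0006]/data[at0003]/items"
systolic = resolve_path(observation, f"{BASE}[at0004]/value/value")
diastolic = resolve_path(observation, f"{BASE}[at0005]/value/value")
print(systolic, diastolic, systolic >= 140 or diastolic >= 90)
```

Each path segment is one step from a parent object to a child, which is why a store that keeps direct references between objects is a natural fit for this style of query.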

Motivation for study

• Only known implementation of the persistence layer is in Microsoft SQL Server 2005
  • Hybrid XML / Relational Approach
  • Extensive mappings
  • Doesn’t use the native XML support
• Issues with the current approach
  • Parsing XML is usually slow
  • Smallest unit that can be retrieved is the top level container
  • Even the native query approach has limitations
    • For example: Querying facilities require improvement

What about pure relational databases?

• Join Operations
  • Slow, especially for deep tree structures
  • Can try flattening structures but this results in many NULL fields
  • Difficult to join many tables and maintain semantics

Example from the Objectivity white paper (bioinformatics case study): finding all the amino acids associated with a protein (Objectivity, I 2007)
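
To make the join cost concrete, the sketch below reaches the same deep node in two ways: with one self-join per level in a relational table, and by following object references. It uses Python's built-in `sqlite3` module purely as a convenient relational store; the `node` table, the `Node` class and the depth of 5 are assumptions for illustration. It shows the shape of the two approaches and is not a benchmark.

```python
import sqlite3

DEPTH = 5  # arbitrary depth chosen for the illustration

# Relational form: one row per node, linked to its parent by a foreign key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, label TEXT)")
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?)",
    [(i, i - 1 if i > 0 else None, f"level{i}") for i in range(DEPTH + 1)],
)

def leaf_label_sql(depth):
    """Reach the node `depth` levels below the root with one self-join per level."""
    joins = "".join(
        f" JOIN node n{i} ON n{i}.parent_id = n{i - 1}.id" for i in range(1, depth + 1)
    )
    sql = f"SELECT n{depth}.label FROM node n0{joins} WHERE n0.parent_id IS NULL"
    return conn.execute(sql).fetchone()[0], sql.count("JOIN")

# Object form: each node holds a direct reference to its child.
class Node:
    def __init__(self, label, child=None):
        self.label = label
        self.child = child

root = None
for i in range(DEPTH, -1, -1):
    root = Node(f"level{i}", root)

def leaf_label_objects(node, depth):
    """Reach the same node by following `depth` references."""
    for _ in range(depth):
        node = node.child
    return node.label

label_sql, join_count = leaf_label_sql(DEPTH)
print(label_sql, join_count, leaf_label_objects(root, DEPTH))
```

The number of joins in the generated SQL grows with the depth of the structure, whereas the object version only dereferences pointers that are already stored with each node.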

Alternative persistence solutions

• Object-Relational Mapping (ORM)
  • Framework to map classes written in an OO language to relational database tables
  • Object-relational conversion adds overhead
  • e.g. Hibernate, TopLink
• Object-Relational Databases
  • Additional OO features as layers on relational databases
  • Data still resides in tables and tables are not generated by classes
  • e.g. UniSQL, Oracle, Sybase

Alternative persistence solutions

• Object-Oriented Databases (OODB)
  • Transparent persistence
  • No impedance mismatch
  • Removes overhead of querying XML structures with the relational approach using XML blobs
  • Removes overhead of complex joins on deep hierarchical structures
  • Improves navigational access
  • Handles recursive and deep hierarchical tree structures well
  • Maps well to the openEHR specification
  • e.g. Caché, db4o, Objectivity/DB

Selection of OODB Products – Db4o

Opportunities:
• Transparent persistence API
• XTEA (eXtended Tiny Encryption Algorithm): fast, reasonably secure
• Lack of authentication and access control
• Small overhead – Ideal for mobile health care
• No Administration and schema evolution
• Tight integration with C# and Java

Limitations:
• Query Scope
• Activation implementation
• Little support for high availability
• dRS replication service only form of distribution
• Manage your own pooling
• Optimal blocksize 8kb – Only 16GB maximum file size

Selection of OODB Products – Caché

Opportunities:
• Post-Relational (Multidimensional, Object and Relational views)
• Transparent distribution with ECP
• Implicit Locking
• Role-Based Access Control (RBAC) – similar to openEHR
• AES encryption
• Journal Roll Forward
• Clustering
• 24/7 Support

Limitations:
• Only provides tight integration with Java
• Some mapping required
• Steeper learning curve

Selection of OODB Products – Objectivity/DB

[Figure from Objectivity for Java Programmer’s Guide (Objectivity, I 2006a)]

Selection of OODB Products – Objectivity/DB

Opportunities:
• Tight integration with many programming languages
• Very scalable (BaBar system)
• Possible to have four times as many databases across a federation as in Caché
• Parallel queries
• Schema evolution – a possibility to map ADL definitions to data structures
• Partitions – resource sharing, replication

Limitations:
• Container level locking
• Little support for security features
• High availability package is a separate product
• Rollback journaling only

Preliminary Evaluation

Linear Recursive Structure Testing
• Evaluate some aspects of openEHR structures in several databases
• Assist in identifying implementation and performance issues before settling on a database and implementing a complex domain model
• Little success with using own structures for lookups in db4o
  • Query scope was too coarse-grained for complex objects
  • Forced to use IDs and db4o OIDs

[Class diagram: Head (Long headID; Node node) and Node (Long headID; Long position; Node next), with multiplicities 0..* and 0..1 on the associations]
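
The test structure can be sketched as a singly linked chain hanging off a head object. The version below is a minimal Python rendering of the Head and Node classes; the 1-based positions, the helper names and the in-memory dictionary index are assumptions for illustration and do not reproduce the db4o, Caché or Objectivity/DB code used in the tests. It contrasts finding a node by walking the chain with finding it by a direct (headID, position) lookup.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

@dataclass
class Node:
    head_id: int
    position: int
    next: Optional["Node"] = None

@dataclass
class Head:
    head_id: int
    node: Optional[Node] = None

def build_head(head_id, length):
    """Create a head whose chain holds nodes at positions 1..length (assumed 1-based)."""
    head = Head(head_id)
    previous = None
    for position in range(1, length + 1):
        current = Node(head_id, position)
        if previous is None:
            head.node = current
        else:
            previous.next = current
        previous = current
    return head

def find_by_traversal(head, position):
    """Walk the chain from the head; the number of steps grows with the position."""
    node = head.node
    while node is not None and node.position != position:
        node = node.next
    return node

def build_index(heads):
    """Index every node by (head_id, position) for direct lookup."""
    index: Dict[Tuple[int, int], Node] = {}
    for head in heads:
        node = head.node
        while node is not None:
            index[(node.head_id, node.position)] = node
            node = node.next
    return index

heads = [build_head(head_id, 100) for head_id in range(2495, 2506)]
index = build_index(heads)
target = next(h for h in heads if h.head_id == 2500)
print(find_by_traversal(target, 50) is index[(2500, 50)])
```

Traversal touches every node before the target, while the indexed lookup costs the same for any position.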

Preliminary Evaluation

Bulk Insertion Time

Insert 10,000 head objects containing 100 nodes each

Preliminary Evaluation

Insertion at Fixed Intervals

Insert 10,000 head objects followed by 50 head objects (committing one at a time) and repeat 50 times
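
One way to read this protocol is sketched below as a small timing harness. The `InMemoryStore` class is a hypothetical stand-in for a database session, and two details are assumptions rather than facts from the study: that each bulk batch is written in a single commit, and that the reported figure is the mean time of the single-object commits in each round. The demonstration call uses small numbers so that it runs quickly.

```python
import time

class InMemoryStore:
    """Hypothetical stand-in for a database session; commit only moves objects."""
    def __init__(self):
        self.objects = []
        self.pending = []
        self.commits = 0

    def store(self, obj):
        self.pending.append(obj)

    def commit(self):
        self.objects.extend(self.pending)
        self.pending.clear()
        self.commits += 1

def run_interval_insertion(store, make_head, bulk=10_000, probes=50, rounds=50):
    """Per round: bulk-insert `bulk` heads in one commit (assumed), then time
    `probes` single-object commits. Returns the mean probe time in seconds per round."""
    next_id = 0
    means = []
    for _ in range(rounds):
        for _ in range(bulk):
            store.store(make_head(next_id))
            next_id += 1
        store.commit()
        elapsed = 0.0
        for _ in range(probes):
            start = time.perf_counter()
            store.store(make_head(next_id))
            store.commit()
            elapsed += time.perf_counter() - start
            next_id += 1
        means.append(elapsed / probes)
    return means

store = InMemoryStore()
means = run_interval_insertion(store, lambda i: {"headID": i}, bulk=100, probes=5, rounds=3)
print(len(means), len(store.objects), store.commits)
```

Plotting the per-round means against the number of objects already stored would show how single-object insertion time changes as the database grows.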

Preliminary Evaluation

Find Single Node (Non-Cached results)

Find node at position #50 within a head object with hID = 2500

Preliminary Evaluation

Find Single Node (Cached Results)

Find node at position #50 within a head object from hID = 2000 to hID = 2999

Preliminary Evaluation

Find Group Node (Non-Cached Results)

Find all nodes inside head objects with identifiers between 2500 and 2500 + gap size. The test was performed with gap sizes: 5, 10, 20, 50, 100, 200

A further test showed that traversal to node 1 took ~6 ms on average, whereas traversal to node 99 took 369 ms on average

Lookup took ~4 ms in both cases

Final Evaluation

• Implementation in InterSystems Caché
  • Implement openEHRV1 in Caché and C#
  • Better scalability than db4o
  • Administrative characteristics of distribution favoured over Objectivity/DB
  • Availability of license
• Test Data
  • Trying to obtain some real test data to use
• Evaluation
  • Also finding statistics for most common database operations
  • More specific than the preliminary evaluation
  • Statistical analysis

Conclusion

• Difficult to map openEHR architecture to a Relational Model. Hybrid XML approach limited
• OODBs provide a closer mapping and better performance due to references (particularly with path based queries)
• Development time is decreased
• Some of the most scalable systems use OODBMSs
• Final Evaluation will provide some insight into the performance aspects within a controlled environment
• Scope for future work in comparing distributed solutions

References

• Provided for figures:
  • Beale, T 2002, Archetypes: Constraint-based Domain Models for Future-proof Information Systems.
  • Beale, T & Heard, S 2007c, Architecture Overview, openEHR Foundation.
  • Ma, C, Frankel, H, Beale, T & Heard, S 2007, 'EHR Query Language (EQL) - A Query Language for Archetype-Based Health Records', MEDINFO.
  • Objectivity, I 2007, Whitepaper: Objectivity/DB in Bioinformatics Applications, Objectivity, California, p. 9.
  • Objectivity, I 2006a, Objectivity for Java Programmer’s Guide Release 9.3, Objectivity, Sunnyvale.
• See minor thesis for complete list

Questions?