
SALUS “Scalable, Standard based Interoperability Framework for Sustainable Proactive Post Market Safety Studies”

SPECIFIC TARGETED RESEARCH PROJECT PRIORITY Objective ICT-2011.5.3b Tools and environments enabling the re-use of electronic health records

SALUS D4.2.1 SALUS Common Set of Data Elements for Post Market Safety Studies - R1

Due Date: March 31, 2013
Actual Submission Date: March 31, 2013

Project Dates:
Project Start Date: February 01, 2012
Project End Date: January 31, 2015
Project Duration: 36 months

Deliverable Leader: SRDC

Project co-funded by the European Commission within the Seventh Framework Programme (2007-2013)

Dissemination Level

PU: Public (X)
PP: Restricted to other programme participants (including the Commission Services)
RE: Restricted to a group specified by the consortium (including the Commission Services)
CO: Confidential, only for members of the consortium (including the Commission Services)

FP7-287800 SALUS

SALUS-FP7-287800 • D4.2.1 • Version 1.0, dated March 31, 2013 Page 2 of 67

Document History:

Version | Date | Changes | From | Review
v0.1 | 2013-03-04 | The structure of the deliverable with initial content. | SRDC | All
v0.2 | 2013-03-15 | Design & Implementation of SALUS MDR has been inserted to the deliverable. | SRDC | All
v0.3 | 2013-03-20 | Common Set of SALUS Data Elements | SRDC | All
v0.4 | 2013-03-25 | ISO/IEC 11179 based design principles and methodology. IHE DEX Profile. | SRDC | All
v1.0 | 2013-03-26 | General improvements and finalization | SRDC | All

Contributors (Benef.) Gokce B. Laleci, Mustafa Yuksel, A. Anil Sinaci, Anil Pacaci (SRDC)

Responsible Author A. Anil Sinaci Email [email protected]

Beneficiary SRDC Phone +903122101763

SALUS Consortium Contacts:

Beneficiary | Name | Phone | Fax | E-Mail
SRDC | Gokce Banu Laleci Erturkmen | +90-312-2101763 | +90(312)2101837 | [email protected]
EUROREC | Georges De Moor | +32-9-2101161 | +32-9-3313350 | [email protected]
UMC | Niklas Norén | +4618656060 | +46 18 65 60 80 | [email protected]
OFFIS | Wilfried Thoben | +49-441-9722131 | +49-441-9722111 | [email protected]
AGFA | Dirk Colaert | +32-3-4448408 | +32 3 444 8401 | [email protected]
ERS | Gerard Freriks | +31 620347088 | +31 847371789 | [email protected]
LISPA | Alberto Daprà | +390239331605 | +39 02 39331207 | [email protected]
INSERM | Marie-Christine Jaulent | +33142346983 | +33153109201 | marie-[email protected]
TUD | Peter Schwarz | +49 351 458 2715 | +49 351 458 7319 | Peter.Schwarz@uniklinikum-dresden.de
ROCHE | Jamie Robinson | +41-61-687 9433 | +41 61 68 88412 | [email protected]

EXECUTIVE SUMMARY

This deliverable defines the first version of the SALUS Common Data Elements (CDEs) and the initial implementation of the CDE Repository, which has been built on the SALUS Semantic Metadata Repository (MDR). To elicit the SALUS CDEs and design the SALUS Semantic MDR, Task 4.2 has worked on the SALUS use case requirements provided in Deliverable 8.1.1 and developed the first set of CDEs, with appropriate mappings to the data elements of the content models presented in SALUS Deliverable 4.1.1.

The SALUS semantic interoperability approach has been designed to enable information exchange between clinical research and clinical care domain applications through a central layer, built on a common ontology, instead of one-to-one transformations between several different content models. The SALUS CDEs elicited within this deliverable can be considered the semantic dictionary of the SALUS components, of which each interoperating application should be aware. The CDE Repository serves as the metadata repository of the SALUS framework, so that the SALUS components and interoperating applications can consume the CDE definitions in a machine-processable way and handle the data exchange based on the CDEs that they correspond to.

During the design and implementation of the CDE Repository, other efforts in the literature that develop data element models for interoperability in the eHealth domain have been examined in addition to the SALUS CDEs. This has led to a Semantic Metadata Repository implementation that goes beyond the SALUS-specific requirements. This deliverable also presents the requirements, design and implementation details of the first version of the federated Semantic MDR architecture. The ultimate goal of this architecture is to support interoperability between the CDEs and content models defined by disparate systems and organizations in the eHealth domain.

The SALUS Project, in particular Task 4.2, is in close cooperation with the IHE Quality, Research and Public Health Domain (QRPH) Technical Framework for the development of a new IHE interoperability profile, namely the Data Element Exchange (DEX) profile. The IHE DEX profile mainly addresses data interoperability issues between the patient care and clinical research domains, and discusses a standard-based, expressive and scalable framework that is grounded on the power of metadata registries. The SALUS Semantic MDR is going to be one of the first implementations of the IHE DEX Profile.

TABLE OF CONTENTS

EXECUTIVE SUMMARY
TABLE OF CONTENTS
1 PURPOSE
  1.1 Definitions and Acronyms
2 Introduction
3 ISO/IEC 11179
  3.1 Metadata & Metadata Repository
  3.2 Common Data Element
  3.3 ISO/IEC 11179 Metamodel
4 SALUS Common Data Elements
5 The Requirement for a Semantic MDR framework
  5.1 How this semantic MDR can be exploited for enabling interoperability across domains? [23]
6 Design & Implementation of SALUS SEMANTIC MDR
  6.1 Ontology of ISO/IEC 11179 Metamodel
  6.2 MDR Knowledge Base
    6.2.1 Triple Store
    6.2.2 Semantic Data Manipulation API
    6.2.3 MDR API
  6.3 Importers
    6.3.1 OMOP Common Data Model Importer
    6.3.2 SDTM Importer
    6.3.3 CDASH Importer
    6.3.4 HITSP Importers
  6.4 REST API
  6.5 Graphical User Interface
    6.5.1 Authentication Service
    6.5.2 REST Services
    6.5.3 Graphical User Interface
7 IHE DEX Profile
  7.1 DEX Actors, Transactions
  7.2 DEX Profile Use Cases
    7.2.1 Use Case #1: Pre-population of a Research Case Report Form
    7.2.2 Use Case #2: Eligibility Determination
    7.2.3 Use Case #3: Observational Study
    7.2.4 Use Case #4: Public Health Case Reporting
  7.3 Current status of IHE DEX Profile
8 Conclusion & Future Work
REFERENCES

1 PURPOSE

The main purpose of this deliverable is to create the common set of data elements, the so-called SALUS Common Data Elements (CDEs), based on the content models defined in SALUS deliverable D4.1.1 – SALUS Content Models for the Functional Interoperability Profiles for Post Market Safety Studies – R1 [1]. In addition to the SALUS CDEs, this deliverable introduces the detailed design and implementation of the SALUS CDE Repository, which serves as a semantic metadata repository for maintaining and managing the SALUS CDEs.

1.1 Definitions and Acronyms

Table 1 - List of Abbreviations and Acronyms

Abbreviation/Acronym | Definition
13606 | CEN/ISO 13606 EHR-Communication standard
ADE | Adverse Drug Event
AIFA | Italian Medicines Agency
CCD | Continuity of Care Document
CDA | Clinical Document Architecture
CDE | Common Data Element
CEN | European Committee for Standardization
ContSys | System of Concepts for Continuity of Care (CEN/ISO 13940)
DEX | Data Element Exchange
DWH | Data Warehouse
E2B (R2) | ICH message standard based on HL7 for Individual Case Safety Reports
EMA | The European Medicines Agency
FDA | Food and Drug Administration
HISA | CEN/ISO Health Information Services Architecture
HITSP | Health Information Technology Standards Panel
HL7 | Health Level Seven
ICSR | Individual Case Safety Report
IHE | Integrating the Healthcare Enterprise
IHE DEX | IHE Data Element Exchange Profile
ISO | International Organization for Standardization
MDR | Metadata Repository
OMOP | Observational Medical Outcomes Partnership
OMOP CDM | Observational Medical Outcomes Partnership - Common Data Model
PCC | Patient Care Coordination
SIAMM | Semantic Interoperability Artefact Modeling Method

2 INTRODUCTION

The SALUS semantic interoperability framework aims to achieve seamless information exchange between clinical research and clinical care applications. SALUS deliverable D4.1.1 [1] has identified the formal definitions of the content models in use within the SALUS use cases. At a very high level, the goal is to transform data conforming to one content model into the format required by another content model. The SALUS semantic interoperability approach has been designed to enable this information exchange through a central layer, built on a common ontology, instead of one-to-one transformations between several different content models. For this purpose, this deliverable presents the common set of data elements, the so-called SALUS Common Data Elements (CDEs), together with the metadata repository developed to maintain the CDEs semantically. This work is the result of Task 4.2 of the SALUS project. These common data elements are then formally expressed as semantic resources to create the so-called SALUS Harmonized Ontology in Task 4.3.

SALUS deliverable D8.1.1 [2] provides the pilot application scenarios and the associated data requirements. During Task 4.2, the SALUS CDEs have been elicited through a comprehensive analysis of the data requirements of the SALUS pilot scenarios. In addition, the CDEs have been mapped to the data elements of the content models identified within Task 4.1 and presented in D4.1.1.

The SALUS CDE Repository, which maintains the SALUS CDEs, constitutes an important part of the SALUS semantic interoperability framework. The SALUS CDEs can be regarded as the semantic dictionary of the SALUS components, of which each interoperating application should be aware. For this, the CDEs should be maintained in a machine-processable manner, and the CDE Repository meets this expectation. The CDE Repository serves as the metadata repository (MDR) of the SALUS framework, so that the applications interoperating through the SALUS framework can interact with the repository to consume the SALUS CDEs and act appropriately during information exchange.

The design and implementation of the CDE Repository go beyond the requirements of the SALUS interoperability framework. During the elicitation of the SALUS CDEs, several other common data element models have been analysed. One major deficiency observed is that most of these models are published as PDF documents or spreadsheets and hence are not accessible in a machine-processable way. The main objective of the SALUS CDE Repository is to maintain and manage the SALUS CDEs. However, given this wider interoperability challenge, in which many disparate data element models exist and none of them is in machine-processable form, a much more comprehensive metadata registry has been designed and implemented. In this respect, the SALUS Semantic MDR, which is the underlying technology of the CDE Repository, aims to present a federated metadata registry architecture in which machine-processable definitions of CDEs across domains can be shared, re-used and semantically interlinked. As a result, the SALUS Semantic MDR can be used not only by the SALUS project but also by other efforts and projects that address semantic interoperability by maintaining a set of common data elements.

The SALUS Semantic MDR is based on the ISO/IEC 11179 [3] metamodel and, at the same time, exploits semantic Web technologies. ISO/IEC 11179 follows a bottom-up strategy: the common data elements that are the building blocks of different data models are identified first, and larger data elements are then built through the aggregation and association of the smaller ones. This is in line with the objectives of the SALUS CDE Repository. However, to support semantic interoperability in a wider area, the repository should deal with several annotations and links to the external world, because several vocabularies, classification schemes and terminology systems are currently in use in the clinical care and research domains. This is an important requirement for the Semantic MDR: providing semantic interoperability between different data models that use different terminology systems should follow the characteristics of the Linked Data¹ approach.

The SALUS Semantic MDR is designed based on the idea of Common Data Elements; however, it introduces a completely new methodology to support data interoperability between different data models. The novelty lies in the methodology for populating the repository. Identifying the CDEs of different data models and accumulating them within the repository requires intensive work by domain experts. Moreover, the clinical research and clinical care domains are linked through several terminology systems, so human experts also have to deal with all these terminologies and with the mappings between the CDEs of different domain models. The CDE Repository introduces importers for the semi-automatic identification of the CDEs of domain models and the population of the CDE Knowledge Base. Human interaction is reduced because CDE identification is handled through the associated importers, and inter-relations and mappings are handled semi-automatically within the CDE Knowledge Base.

As mentioned above, the SALUS Semantic MDR addresses a much more challenging level of interoperability, considering several different data element models in a federated framework through semantic Web technologies within the Linked Open Data cloud. The first version provides easy-to-use Graphical User Interfaces (GUIs) for CDE management on top of the CDE Knowledge Base, together with several content model importers such as the OMOP CDM [4] importer and the CDISC SDTM [5] importer. The first version of the open source implementation is available in a public GitHub repository².

The SALUS Project, in particular Task 4.2, is in close cooperation with the IHE Quality, Research and Public Health Domain (QRPH) Technical Framework [6]. This collaboration is working on a new IHE profile named the IHE Data Element Exchange (DEX) profile [7], which argues that integrating the patient care and clinical research domains requires a standard-based, expressive and scalable semantic interoperability framework that allows dynamic mappings between data elements and the semantics of varying data sources through a metadata repository. Section 7 presents the details of this work.

In the following sections, after a summary of the required background information in Section 3, the development methodology and the SALUS Common Data Elements are presented in Section 4. Afterwards, Section 5 analyses the requirements of a semantic MDR architecture. The design and implementation details of the Semantic MDR are presented in Section 6. Section 7 presents the outcome of the IHE DEX profile proposal, and Section 8 concludes the deliverable with a discussion of future work and the second version of the SALUS Common Set of Data Elements (D4.2.2).

1 http://linkeddata.org/
2 https://github.com/sinaci/semanticMDR

3 ISO/IEC 11179

The ISO/IEC 11179 [3] family of specifications introduces a standard model for metadata registries to increase the interoperability of applications through the use of data elements. The main idea is to make disparate systems use the same set of data elements, with very well-defined methodologies, so that different information systems can be built through the aggregation and association of the same data elements. The standard defines a metadata registry and specifies how data is described, stored, classified and managed. ISO/IEC 11179 comes in six parts that address the semantics, representation and registration of data elements (the metadata). In the 2nd edition of the standard, these are listed as follows:

1. Framework: Contains an overview of the standard and describes the basic concepts
2. Classification: Describes how to manage a classification scheme in a metadata registry
3. Registry metamodel and basic attributes: Provides the basic conceptual model, including the basic attributes and relationships, for a metadata registry
4. Formulation of data definitions: Rules and guidelines for forming quality definitions for data elements and their components
5. Naming and identification principles: Describes how to form conventions for naming data elements and their components
6. Registration: Specifies the roles and requirements for the registration process in an ISO/IEC 11179 metadata registry

Applying the ISO/IEC 11179 specifications throughout metadata management provides several improvements in terms of data interoperability. The standard lists them as follows:

• Standard description of data
• Common understanding of data across organizational elements and between organizations
• Re-use and standardization of data over time, space, and applications
• Harmonization and standardization of data within an organization and across organizations
• Management of the components of data
• Re-use of the components of data

The SALUS Common Data Elements (CDEs) and the CDE Repository have been designed in accordance with the ISO/IEC 11179 specifications. The SALUS Common Data Elements correspond to the metadata that will be used within the SALUS use cases, while the CDE Repository corresponds to the metadata repository that maintains the CDEs of SALUS. The following sub-sections (3.1, 3.2, 3.3) present the background and the methodology of SALUS CDE development. Section 4 then lists the SALUS CDEs together with their textual descriptions and classifications based on ISO/IEC 11179 metamodel constructs.

3.1 Metadata & Metadata Repository

Metadata has a common definition: “data about data”. However, this is a very generic and outdated definition. Today’s systems make a distinction between structural and descriptive metadata. Structural metadata gives information about the syntactic nature of the data (data about the containers of data), while descriptive metadata provides semantics for the data. Metadata and metadata management are very important for data interoperability between different applications. To be able to exchange data, and to process data once it has been exchanged, the interoperating systems should agree on the metadata.

Figure 1 presents an example of the use of data and metadata within an application. In the figure, data about a person is presented through fields such as citizenship number, surname and gender. In this example, “Gender” is the metadata and “male” is the data, that is, the value whose meaning is given by the metadata “Gender”. During data exchange between two different applications, when “male” is received, it is crucial that the receiving application knows that this data indicates the “Gender” of the person. Apart from that, the application also needs to know that this “Gender” data indicates the gender of a “Person”. All this syntactic and semantic information is encoded in the associated metadata. Hence, interoperating applications should agree on metadata before they start to exchange data.

Figure 1 - Importance of Metadata: Same data annotated through different metadata can lead to inconsistencies

Most systems implement their own information model; hence each application has its own metadata. It is even highly probable that two different interfaces of the same application consume data through different metadata. This situation is illustrated in Figure 1, where the birth date of a person is annotated with “Date of Birth” in one interface and “B. Date” in the second interface.
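The label mismatch described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DataElement` class, the two interface label maps and the sample values are hypothetical and are not taken from the SALUS implementation. The “Date of Birth” and “B. Date” labels come from the example above; the “Sex” label of the second interface is an assumption added for illustration. The sketch shows that two records become comparable once their local labels are replaced with agreed metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """Shared metadata: which property of which object class a value describes."""
    object_class: str  # e.g. "Person"
    prop: str          # e.g. "Date of Birth"


# Hypothetical data elements that both interfaces have agreed on.
BIRTH_DATE = DataElement("Person", "Date of Birth")
GENDER = DataElement("Person", "Gender")

# Each interface keeps its own local labels, mapped to the shared elements.
INTERFACE_A = {"Date of Birth": BIRTH_DATE, "Gender": GENDER}
INTERFACE_B = {"B. Date": BIRTH_DATE, "Sex": GENDER}  # "Sex" label is assumed


def annotate(record, label_map):
    """Replace interface-specific labels with the agreed data elements."""
    return {label_map[label]: value for label, value in record.items()}


from_a = annotate({"Date of Birth": "1980-05-17", "Gender": "male"}, INTERFACE_A)
from_b = annotate({"B. Date": "1980-05-17", "Sex": "male"}, INTERFACE_B)

# Once annotated with shared metadata, the two records are directly comparable.
print(from_a == from_b)
```

Without the shared elements, a receiver would have to guess that “B. Date” and “Date of Birth” carry the same meaning; with them, the comparison is a plain equality check.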

Figure 2 - Metadata management in an ISO/IEC 11179 based Metadata Repository

In order to agree on the metadata, the metadata should be available to the interoperating applications. Since metadata is itself data, it needs to be managed through a well-established mechanism. ISO/IEC 11179 defines the required mechanism for managing metadata within Metadata Repositories. Figure 2 illustrates metadata management, where the structural and descriptive metadata about the data itself (i.e. Patient information structured with First name, Last name, Date of Birth and Sex fields) is managed under an ISO/IEC 11179 based Metadata Repository (MDR). In this kind of setting, the structure of each entity (i.e. Patient), such as the fields it contains, is described and managed under the MDR together with the meaning of each information field, such as the Sex field.
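As a rough sketch of the setting in Figure 2, the following Python snippet keeps both kinds of metadata for a Patient entity in one place and checks incoming records against it. Everything here is an assumption made for illustration: the `MDR` dictionary is a hypothetical in-memory stand-in, not the SALUS MDR or an ISO/IEC 11179 API, and the definitions and permitted values are invented examples.

```python
# Hypothetical in-memory stand-in for a metadata repository (not the SALUS MDR API).
# Structural metadata: which fields an entity has and their datatypes.
# Descriptive metadata: what each field means and which values it may take.
MDR = {
    "Patient": {
        "First name": {"datatype": str, "definition": "Given name of the patient"},
        "Last name": {"datatype": str, "definition": "Family name of the patient"},
        "Date of Birth": {"datatype": str, "definition": "Date on which the patient was born"},
        "Sex": {
            "datatype": str,
            "definition": "Administrative sex of the patient",
            "permitted_values": {"male", "female"},  # assumed value set
        },
    }
}


def validate(entity, record):
    """Return a list of problems found when checking a record against the MDR."""
    problems = []
    fields = MDR[entity]
    # Fields the repository does not know about.
    for name in record:
        if name not in fields:
            problems.append(f"unknown field: {name}")
    # Fields the repository expects, with type and value checks.
    for name, meta in fields.items():
        if name not in record:
            problems.append(f"missing field: {name}")
            continue
        value = record[name]
        if not isinstance(value, meta["datatype"]):
            problems.append(f"wrong type for field: {name}")
        elif "permitted_values" in meta and value not in meta["permitted_values"]:
            problems.append(f"value not permitted for field: {name}")
    return problems


good = {"First name": "Ada", "Last name": "Yilmaz", "Date of Birth": "1980-05-17", "Sex": "male"}
bad = {"First name": "Ada", "Last name": "Yilmaz", "B. Date": "1980-05-17", "Sex": "male"}

print(validate("Patient", good))
print(validate("Patient", bad))
```

The second record uses the local label “B. Date”, so the check reports one unknown and one missing field, which is the kind of disagreement a shared repository is meant to surface before data is exchanged.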

3.2 Common Data Element

ISO/IEC 11179 defines a data element as the basic container for data. In SALUS, a Common Data Element is defined in a similar fashion: the smallest meaningful data container in a context. The granularity of the common data elements changes with the context and is tightly bound to the data requirements of the domain applications running in that context. Since SALUS addresses interoperability between the clinical research and clinical care domains for post market safety studies, the SALUS Common Data Elements have been derived in the context of clinical research and clinical care content models.

In D4.1.1 [1], a number of content models have been presented according to the data requirements of the SALUS pilot application scenarios. These content models can be regarded as collections of data elements that have been brought together in specific contexts. During the creation of the SALUS CDEs, each CDE is mapped to the corresponding data element identified within the content models of D4.1.1.

Apart from the SALUS content models listed in D4.1.1, such as OMOP CDM [4] or ASTM/HL7 CCD [8], many organizations publish specific information models that are disparate, meaning that the data within each system is stand-alone and not interoperable. As stated by ISO, “One of the prerequisites for a correct and proper use and interpretation of data is that both users and owners of data have a common understanding of the meaning and descriptive characteristics (e.g., representation) of that data. To guarantee this shared view, a number of basic attributes have to be defined”. In line with this vision, many of the efforts that try to facilitate the exchange of EHRs for better patient care, or to enable secondary use of EHRs for supporting clinical research and patient safety studies, have already been developing common data element (CDE) models. A few examples can be summarized as follows:

• Health Information Technology Standards Panel (HITSP) has defined the C154: Data Dictionary Component [9] as a library of the HITSP defined data elements to facilitate the consistent use of these data elements across various HITSP selected standards. These data elements are served through PDF documents and spreadsheets. For example, HITSP C32 [10] which describes the HL7 Continuity of Care Document (CCD) [8] content for the purpose of health information exchange, marks the elements in CCD document with the corresponding HITSP C154 data elements to establish common understanding of the meaning of the CCD elements.

• The Federal Health Information Model (FHIM) [11] develops a common Computationally Independent Model (CIM) for EHRs.

• The Transitions of Care Initiative (ToC) [12] maintains the S&I Clinical Element Data Dictionary (CEDD) [13] as a repository of data elements to improve the electronic exchange of core clinical information among authorized entities in support of meaningful use and improvement in the quality of care. The Query Health [14] initiative extends this data dictionary, and establishes Query Health CEDD to enable an architecture for querying distributed EHRs in order to aggregate healthcare data for collecting quality measures and monitoring disease outbreaks.

• Clinical Data Interchange Standards Consortium (CDISC) provides common dataset definitions in (a) Study Data Tabulation Model (SDTM) [5] for enabling the submission of the result data sets of regulated clinical research studies to the FDA and in (b) Clinical Data Acquisition Standards Harmonization (CDASH) [15] for integrating SDTM data requirements into the Case Report Forms.

• The Biomedical Research Integrated Domain Group (BRIDG) [16] developed the Domain Analysis Model (DAM), which harmonizes CDISC data standards with the HL7 Reference Information Model (RIM) [17]. The BRIDG DAM unifies the concepts in the clinical care and research domains and creates a shared generic representation for each data element.

• Mini-Sentinel [18] is a pilot project to create an active surveillance system to monitor the safety of FDA-regulated medical products by accessing pre-existing electronic healthcare records. It proposes a Common Data Model (CDM) so that analytic applications can run on a uniform model. This CDM is maintained in a PDF document and partner EHR Systems are expected to translate the EHR data to this CDM.


There are other similar efforts to define CDEs and accompanying content models, such as the GE/Intermountain Healthcare Clinical Element Models (CEM) [19], the National E-Health Transition Authority (NEHTA) Clinical Models [20] and the I2B2 data model [21]. These are defined either as data dictionaries or through abstract data models, ensuring interoperability within the boundaries of these initiatives. For instance, the query services, analysis methods or data exchange protocols envisioned by these initiatives can seamlessly run on top of the agreed common data element models, which are sets of core data elements. However, when it comes to achieving a broader range of interoperability, these efforts fall short: the proliferation of common data element models does not help to solve the interoperability problem. Exchange of EHRs for the care of patients or secondary use of EHRs is not directly possible across these initiatives. For example, it is not directly possible to query an EHR which conforms to the FHIM model through the query services provided by Query Health unless a mapping to the Query Health CEDD is achieved first. Similarly, when a researcher defines the data set to be collected for an observational study through CDISC SDTM variables, it does not become readily possible to extract these data sets from EHRs which provide medical summaries of eligible patients through HITSP C32 documents. The use of different sets of CDEs, such as CDISC SDTM variables and HITSP Data Elements, does not solve the problem of interoperability; yet it is not practical to expect all of these diverse initiatives and projects to stick to the same common model and to use the same set of CDEs.

The main objective of the SALUS CDE Repository is to maintain and manage the Common Data Elements extracted from the content models of SALUS use-cases. However, given this wide interoperability issue, where many disparate data element models exist and these models are not in machine-processable form, a much more comprehensive metadata registry has been designed and implemented to support interoperability. The SALUS Semantic MDR aims to present a federated metadata registry architecture where machine-processable definitions of CDEs across domains can be shared, re-used, and semantically interlinked with each other to address this semantic interoperability challenge.

3.3 ISO/IEC 11179 Metamodel

ISO/IEC 11179 (Part 3 of the standard) presents a relational data model which describes metadata registries through entity-relationship diagrams. This metamodel is designed to be generic; hence any data element model can be represented through it, regardless of the level of granularity. In Figure 3, the decomposition of a data element is presented according to the metamodel of ISO/IEC 11179. Please note that this figure corresponds to a very small part of the metamodel exposed by the ISO standard. Apart from this decomposition, the metamodel includes the machinery to manage administration and identification, different contexts, naming and definition, and classification.

Figure 3 - Decomposition of a data element according to ISO/IEC 11179


SALUS Common Data Elements can be represented through the ISO/IEC 11179 metamodel. Figure 4 illustrates the decomposition of “Patient.DateOfBirth.Date”, which is a SALUS CDE. SALUS CDEs are presented together with their descriptions and corresponding decompositions in Section 4.

Figure 4 - An example of decomposition of a SALUS Common Data Element: Patient.DateOfBirth.Date

As presented in Figure 4, the concept of the CDE and its representation are separate in the metamodel. These are modelled through Data Element Concepts and Value Domains, respectively. A Data Element Concept is further decomposed into an Object Class and a Property. In the given example, “Patient” is the Object Class and “Date of Birth” is the Property; together they constitute the concept “Patient.DateOfBirth”. This is the concept of the data element regardless of its representation, which is dictated through a Value Domain. It is important to notice that the metamodel of ISO/IEC 11179 inherently supports the re-use of resources. For example, the “Patient” Object Class is re-used while forming the “Patient.Address” Data Element Concept with the “Address” Property. Moreover, the “Address” Property is re-used in several other CDEs such as “HealthcareProvider.Address.Address”.
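The decomposition and re-use described above can be illustrated with a minimal Python sketch. The class and method names (`ObjectClass`, `Property`, `DataElementConcept`, `ValueDomain`, `DataElement`, `label`) are chosen here for illustration only; they follow the ISO/IEC 11179 terms used in this section but are not taken from the SALUS Semantic MDR implementation, and the administration, naming and classification parts of the metamodel are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectClass:
    name: str  # e.g. "Patient"


@dataclass(frozen=True)
class Property:
    name: str  # e.g. "DateOfBirth"


@dataclass(frozen=True)
class DataElementConcept:
    """The concept of a data element, independent of its representation."""
    object_class: ObjectClass
    prop: Property

    def label(self) -> str:
        return f"{self.object_class.name}.{self.prop.name}"


@dataclass(frozen=True)
class ValueDomain:
    """The representation of a data element (simplified to a data type name)."""
    datatype: str  # e.g. "Date"


@dataclass(frozen=True)
class DataElement:
    concept: DataElementConcept
    value_domain: ValueDomain

    def label(self) -> str:
        # Mirrors the ObjectClass.Property.Representation naming seen in SALUS CDE names.
        return f"{self.concept.label()}.{self.value_domain.datatype}"


# Shared resources are created once and re-used across data elements.
patient = ObjectClass("Patient")
provider = ObjectClass("HealthcareProvider")
address = Property("Address")

date_of_birth = DataElement(
    DataElementConcept(patient, Property("DateOfBirth")), ValueDomain("Date")
)
patient_address = DataElement(
    DataElementConcept(patient, address), ValueDomain("Address")
)
provider_address = DataElement(
    DataElementConcept(provider, address), ValueDomain("Address")
)

print(date_of_birth.label())
print(provider_address.label())
```

In this sketch the same `patient` and `address` instances take part in several data elements, which corresponds to the re-use of the “Patient” Object Class and the “Address” Property discussed above.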


4 SALUS COMMON DATA ELEMENTS

Analysis of the SALUS content models presented in D4.1.1 [1] has led to the identification of the SALUS Common Data Elements. The total number of identified CDEs is 199. These are presented in Table 2 with their descriptions:

Table 2 – SALUS Common Data Elements

Data Element Name | Description

Patient.ID.II | Identifier of the patient
Patient.Title.String | Title/prefix of the patient
Patient.GivenName.String | Given name of the patient
Patient.FamilyName.String | Family name of the patient
Patient.Gender.CD | Gender of the patient
Patient.DateOfBirth.Date | Birth date of the patient
Patient.MaritalStatus.CD | Marital status of the patient
Patient.ReligiousAffiliation.CD | Religion of the patient
Patient.Race.CD | Race of the patient
Patient.Ethnicity.CD | Ethnicity of the patient
Patient.PlaceOfBirth.Address | Birth place of the patient
Patient.Address.Address | Address (e.g. home, work place, postal) of the patient
Patient.Telecom.Tele | Telecommunication means details (i.e. telephone, fax, mobile phone, email) of the patient
Patient.HealthcareProvider.HealthcareProvider | Health professional that takes part in the care of the patient
Patient.ProviderOrganization.Organization | Healthcare provider organization that takes part in the care of the patient
Patient.DataReporter.DataReporter | Human data reporter of the patient. This information is mandatory when reporting an Adverse Drug Event (ADE).
Patient.InsuranceProvider.InsuranceProvider | Insurance provider of the patient
Patient.Encounter.Encounter | Encounter of the patient
Patient.Allergy.Allergy | Allergy / intolerance / adverse event of the patient
Patient.Condition.Condition | Condition (i.e. problem, diagnosis, finding, symptom) of the patient
Patient.FamilyHistory.FamilyHistory | Family history of the patient
Patient.Immunization.Immunization | Immunization of the patient
Patient.Medication.Medication | Medication of the patient
Patient.PlanOfCare.PlannedEvent | Care plan event of the patient
Patient.Pregnancy.Pregnancy | Pregnancy history of the patient
Patient.Procedure.Procedure | Procedure of the patient
Patient.Result.Result | Lab result of the patient
Patient.SocialHistory.SocialHistory | Social history of the patient


Patient.VitalSign.Result | Vital sign of the patient

Address.NullFlavor.CD | An indicator that the address is null, together with the flavor (i.e. cause) of null.
Address.Use.CD | Type of the address; e.g. home, work place, postal.
Address.StreetAddressLine.String | Street information of the address
Address.PostalCode.String | Postal code of the address
Address.City.String | City of the address
Address.State.String | State (region) of the address
Address.Country.String | Country of the address

Tele.NullFlavor.String | An indicator that the telecom information is null, together with the flavor (i.e. cause) of null.
Tele.Use.CD | Type of the telecom information; e.g. telephone, fax, mobile phone, email.
Tele.Value.String | Value of the telecom information

HealthcareProvider.DateRange.IVLTS | Time interval that the healthcare provider is involved in the care of the patient
HealthcareProvider.ID.II | Identifier of the healthcare provider
HealthcareProvider.Role.CD | Role of the healthcare provider; e.g. GP, surgeon, nurse.
HealthcareProvider.Title.String | Title/prefix of the healthcare provider
HealthcareProvider.GivenName.String | Given name of the healthcare provider
HealthcareProvider.FamilyName.String | Family name of the healthcare provider
HealthcareProvider.Address.Address | Address (e.g. home, work place, postal) of the healthcare provider
HealthcareProvider.Telecom.Tele | Telecommunication means details (i.e. telephone, fax, mobile phone, email) of the healthcare provider
HealthcareProvider.Organization.Organization | Organization that the healthcare provider is associated with
HealthcareProvider.patientID.II | Healthcare provider specific identifier of the patient
HealthcareProvider.Comment.String | Further free text comments / information about the healthcare provider

Organization.ID.II | Identifier of the organization
Organization.Address.Address | Address (e.g. work place, postal) of the organization
Organization.Telecom.Tele | Telecommunication means details (i.e. telephone, fax, mobile phone, email) of the organization
Organization.Name.String | Name of the organization

DataReporter.ID.II | Identifier of the data reporter
DataReporter.Title.String | Title/prefix of the data reporter
DataReporter.GivenName.String | Given name of the data reporter


DataReporter.FamilyName.String | Family name of the data reporter
DataReporter.Qualification.CD | Qualification of the data reporter. Suggested values are Physician; Pharmacist; Other Health Professional; Lawyer; Consumer or other non health professional.
DataReporter.Organization.Organization | Organization that the data reporter is associated with
DataReporter.Address.Address | Address (e.g. home, work place, postal) of the data reporter
DataReporter.Telecom.Tele | Telecommunication means details (i.e. telephone, fax, mobile phone, email) of the data reporter

InsuranceProvider.GroupNumber.II | Policy or group contract number identifying the contract between a health plan sponsor and the health plan
InsuranceProvider.HealthInsuranceType.CD | Coded type of the health plan covering the individual
InsuranceProvider.Payer.Organization | Payer organization of the health plan insurance
InsuranceProvider.Member.Member | Patient who is covered by the health plan
InsuranceProvider.FinancialResponsibilityPartyType.CD | Coded type of the financial responsibility party (i.e. guarantor)
InsuranceProvider.Subscriber.Subscriber | Human subscriber (i.e. the actual member or health plan contract holder; the true subscriber) of the health plan
InsuranceProvider.Guarantor.Guarantor | Guarantor of the health plan
InsuranceProvider.HealthPlanName.String | Name of the health plan
InsuranceProvider.Comment.String | Further free text comments / information about the insurance provider / health plan

Encounter.ID.II | Identifier of the encounter
Encounter.Type.CD | Coded type of the encounter (e.g. ambulatory, inpatient, surgical)
Encounter.TimeInterval.IVLTS | Effective time interval of the encounter
Encounter.Provider.HealthcareProvider | Healthcare provider involved in the encounter
Encounter.Organization.Organization | Healthcare provider organization where the encounter takes place
Encounter.ReasonForVisit.Condition | Condition of the patient that causes the encounter

Allergy.AdverseEventType.CD | Coded type of the allergy / intolerance / adverse event (e.g. drug allergy, food intolerance)
Allergy.TimeInterval.IVLTS | Effective time interval of the allergy / intolerance / adverse event
Allergy.Product.CD | Product (i.e. substance) that causes the allergy / intolerance / adverse event (e.g. egg protein, dust, nifedipine)
Allergy.Reaction.Condition | The condition which occurs as a reaction to the allergy / intolerance / adverse event; can be any condition
Allergy.Status.CD | Coded status of the allergy / intolerance / adverse event (e.g. active, inactive, resolved)

Allergy.Severity.CD | Coded severity of the allergy / intolerance / adverse event (e.g. low, moderate, high)
Allergy.Comment.String | Further free text comments / information about the allergy / intolerance / adverse event

Condition.TimeInterval.IVLTS | Effective time interval of the condition
Condition.ProblemType.CD | Coded type of the condition (e.g. problem, diagnosis, finding, symptom)
Condition.ProblemName.String | Free text name of the condition
Condition.ProblemCode.CD | Coded name of the condition. In case the Time of Death is provided, this is the coded cause of death.
Condition.ProblemStatus.CD | Coded status of the condition (e.g. active, inactive, resolved)
Condition.ProblemSeverity.CD | Coded severity of the condition (e.g. low, moderate, high)
Condition.TimeOfDeath.Datetime | Time when the patient died
Condition.TreatingProvider.HealthcareProvider | Healthcare provider involved in the diagnosis / treatment of the condition
Condition.Comment.String | Further free text comments / information about the condition

FamilyHistory.ObservationDate.Datetime | Time of entry of the family history
FamilyHistory.KinshipType.CD | Coded type of kinship with the patient
FamilyHistory.ObservationCode.CD | Coded name of the family history observation
FamilyHistory.AgeAtOnset.Integer | Age of the kin when the family history observation became effective
FamilyHistory.Comment.String | Further free text comments / information about the family history

Immunization.AdministeredDate.Datetime | Time when the immunization is applied to the patient
Immunization.MedicationSeriesNumber.Integer | Medication series number of the immunization
Immunization.Route.CD | Coded route of the immunization (e.g. intravenous)
Immunization.Dose.PQ | Dose of the immunization
Immunization.Site.CD | Coded approach site of the immunization (e.g. upper arm structure)
Immunization.Reaction.CD | Coded reaction that the patient has to the immunization
Immunization.Performer.HealthcareProvider | Healthcare provider that applies the immunization to the patient
Immunization.MedicationInformation.MedicationInformation | Details (e.g. coded product name, coded active ingredient, free text brand name) of the immunization
Immunization.Comment.String | Further free text comments / information about the immunization

Medication.TimeInterval.IVLTS | Effective time interval when the medication is used
Medication.AdministeredTiming.PIVLTS | Specific description of use time of the medication

Medication.Route.CD | Coded route of the medication (e.g. oral inhalation)
Medication.Dose.PQ | Dose of the medication
Medication.Site.CD | Coded approach site of the medication (e.g. nose)
Medication.DoseRestriction.IVLPQ | Minimum and maximum doses at which the medication can be taken
Medication.ProductForm.CD | Coded product form of the medication (e.g. tablet)
Medication.DeliveryMethod.CD | Coded description of how the medication is administered
Medication.MedicationInformation.MedicationInformation | Details (e.g. coded product name, coded active ingredient, free text brand name) of the medication
Medication.Indication.Condition | Condition of the patient that causes the medication administration
Medication.PatientInstructions.String | Free text instructions to the patient (e.g. "keep in the refrigerator")
Medication.Reaction.CD | Coded reaction that the patient has to the medication
Medication.Order.Order | Details of the order (i.e. prescription)
Medication.FulfillmentInstructions.String | Free text instructions to the dispensing pharmacist or nurse (e.g. "instruct patient on the use of occlusive dressing")
Medication.FulfillmentHistory.FulfillmentHistory | Details of the fulfillment (i.e. dispensation)
Medication.Comment.String | Further free text comments / information about the medication

PlannedEvent.ID.II | Identifier of the care plan event
PlannedEvent.TimeInterval.IVLTS | Effective time interval of the care plan event
PlannedEvent.Type.CD | Coded type of the care plan event
PlannedEvent.EventCode.CD | Coded name of the care plan event
PlannedEvent.Comment.String | Further free text comments / information about the care plan event

Pregnancy.ObservationDate.Datetime | Time of entry of the pregnancy history
Pregnancy.LastMenstrualPeriodDate.Datetime | Time of last menstrual period of the patient
Pregnancy.DeliveryDate.Datetime | Time of delivery of the patient
Pregnancy.Comment.String | Further free text comments / information about the pregnancy history

Procedure.ID.II | Identifier of the procedure
Procedure.TimeInterval.IVLTS | Time when the procedure is performed
Procedure.Type.CD | Coded name of the procedure
Procedure.Type.String | Free text name of the procedure
Procedure.Status.CD | Coded status of the procedure (e.g. completed, active, aborted, cancelled)
Procedure.Site.CD | Coded target site of the procedure (e.g. hip joint)
Procedure.Provider.HealthcareProvider | Healthcare provider that performs the procedure

Procedure.Indication.Condition | Condition of the patient that causes the procedure
Procedure.RelatedEncounter.Encounter | Associated encounter in which the procedure is performed
Procedure.Comment.String | Further free text comments / information about the procedure

Result.ID.II | Identifier of the lab result / vital sign
Result.TimeInterval.IVLTS | Effective time interval of the lab result / vital sign
Result.Type.CD | Coded name of the lab result / vital sign
Result.Value.CD | Coded value of the lab result / vital sign
Result.Value.String | Free text value of the lab result / vital sign
Result.Value.PQ | Physical quantity value (i.e. value with unit) of the lab result / vital sign
Result.Interpretation.CD | Coded interpretation of the lab result / vital sign (e.g. abnormal, high, below low threshold)
Result.Provider.HealthcareProvider | Healthcare provider that provides the lab result / vital sign value
Result.ReferenceRange.IVLPQ | Reference range of the lab result / vital sign
Result.RelatedCondition.Condition | Condition of the patient that causes the lab result / vital sign to be measured / observed
Result.Comment.String | Further free text comments / information about the lab result / vital sign

SocialHistory.TimeInterval.IVLTS | Effective time interval of the social history
SocialHistory.ObservationCode.CD | Coded name of the social history (e.g. alcohol intake, tobacco use)
SocialHistory.ObservationValue.CD | Coded value of the social history
SocialHistory.ObservationValue.String | Free text value of the social history
SocialHistory.ObservationValue.PQ | Physical quantity value (i.e. value with unit) of the social history
SocialHistory.Comment.String | Further free text comments / information about the social history

Member.HealthPlanCoverageDates.IVLTS | Effective time interval of the health plan covering the member
Member.ID.II | Identifier assigned by the health plan to the member who is covered by the health plan
Member.RelationshipToSubscriber.CD | Coded relationship of the member to the subscriber
Member.Address.Address | Address (e.g. home, work place, postal) of the member
Member.Tele.Tele | Telecommunication means details (i.e. telephone, fax, mobile phone, email) of the member
Member.Title.String | Title/prefix of the member
Member.GivenName.String | Given name of the member
Member.FamilyName.String | Family name of the member
Member.DateOfBirth.Date | Birth date of the member


       

Subscriber.ID.II | Identifier assigned by the health plan to the actual member or health plan contract holder (the true subscriber)
Subscriber.Address.Address | Address (e.g. home, work place, postal) of the subscriber
Subscriber.Tele.Tele | Telecommunication means details (i.e. telephone, fax, mobile phone, email) of the subscriber
Subscriber.Title.String | Title/prefix of the subscriber
Subscriber.GivenName.String | Given name of the subscriber
Subscriber.FamilyName.String | Family name of the subscriber
Subscriber.DateOfBirth.Date | Birth date of the subscriber

Guarantor.ResponsibilityEffectiveDate.IVLTS | Effective time interval of the responsibility of the guarantor (i.e. financial responsibility party)
Guarantor.Address.Address | Address (e.g. home, work place, postal) of the guarantor
Guarantor.Tele.Tele | Telecommunication means details (i.e. telephone, fax, mobile phone, email) of the guarantor
Guarantor.Title.String | Title/prefix of the guarantor
Guarantor.GivenName.String | Given name of the guarantor
Guarantor.FamilyName.String | Family name of the guarantor

MedicationInformation.ProductName.CD | Coded product name of the medication
MedicationInformation.ProductName.String | Free text product name of the medication
MedicationInformation.ActiveIngredient.CD | Coded active ingredient of the medication
MedicationInformation.BrandName.CD | Coded brand name of the medication
MedicationInformation.BrandName.String | Free text brand name of the medication
MedicationInformation.DrugManufacturer.Organization | Pharmaceuticals organization that manufactured the medication

Order.Number.II | Identifier of the order (i.e. prescription)
Order.Provider.HealthcareProvider | Healthcare provider that wrote this order
Order.FillNumber.Integer | Number of times that the medication can be dispensed
Order.QuantityOrdered.PQ | Amount of product that can be dispensed
Order.ExpirationDateTime.Datetime | Time when the order is no longer valid
Order.Datetime.Datetime | Time when the ordering provider wrote the order

FulfillmentHistory.PrescriptionNumber.II | Identifier of the corresponding order (i.e. prescription)
FulfillmentHistory.DispensingProvider.HealthcareProvider | Healthcare provider that dispensed the medication
FulfillmentHistory.DispenseDate.Datetime | Time when the dispensing provider dispensed the medication
FulfillmentHistory.QuantityDispensed.PQ | Amount of product that was dispensed


FulfillmentHistory.FillNumber.Integer | Fill number of the dispensation
FulfillmentHistory.FillStatus.CD | Coded status of the fill event (e.g. completed, aborted)

As stated and exemplified in Section 3.3, SALUS CDEs have been mapped to the ISO/IEC 11179 metamodel. This decomposition and mapping is presented through Table 3. During the mapping to ISO/IEC 11179 metamodel constructs, the Value Domain of the CDE is represented through a Data Type and a Conceptual Domain (CD) as shown in Table 3. Data type definitions have been adopted from ISO 21090 – Health Informatics Harmonized data types for Information Interchange [22]. In the CD column, the encoding is as follows:

• NE: Non Enumerated – The value domain of the CDE is not enumerated, hence the value of the CDE is not restricted to an enumerated set.

• E: Enumerated – The value of the CDE should be selected from a closed enumeration.

• OC: Object Class – The value of the CDE is another Object Class within the metadata registry. This is not an explicit ISO/IEC 11179 construct; it has been added by the SALUS Semantic MDR in order to increase the re-use of CDE resources.
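The NE / E / OC encoding above can be read as three different rules for what a CDE value may be. The following Python sketch illustrates that reading under stated assumptions: the names `DomainKind`, `ValueDomain` and `accepts` are hypothetical, the gender codes are placeholders rather than the SALUS value set, and data type checking (e.g. against ISO 21090) is left out.

```python
from dataclasses import dataclass
from enum import Enum


class DomainKind(Enum):
    NE = "Non Enumerated"
    E = "Enumerated"
    OC = "Object Class"


@dataclass(frozen=True)
class ValueDomain:
    datatype: str                      # e.g. "characterstring", "CD", "Address"
    kind: DomainKind
    permitted_values: frozenset = frozenset()  # only meaningful when kind is E


def accepts(domain, value, object_classes):
    """Return True if `value` fits `domain` under the NE/E/OC encoding."""
    if domain.kind is DomainKind.NE:
        # Not restricted to an enumerated set (data type checks omitted here).
        return True
    if domain.kind is DomainKind.E:
        # The value must come from the closed enumeration.
        return value in domain.permitted_values
    # OC: the value is another Object Class, so the data type must name an
    # Object Class known to the registry; the value itself is not inspected.
    return domain.datatype in object_classes


# Object Classes assumed to be registered (a small subset, for illustration).
object_classes = {"Patient", "Address", "Tele"}

title = ValueDomain("characterstring", DomainKind.NE)
gender = ValueDomain("CD", DomainKind.E, frozenset({"F", "M", "UN"}))  # placeholder codes
address = ValueDomain("Address", DomainKind.OC)

print(accepts(title, "Dr.", object_classes))
print(accepts(gender, "X", object_classes))
print(accepts(address, None, object_classes))
```

For the OC case the sketch only checks that the referenced Object Class is registered, which reflects the idea that such a CDE points to another Object Class rather than to a primitive value.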

Table 3 – Decomposition of SALUS CDEs according to the ISO/IEC 11179 metamodel

Data Element Name | Object Class (Data Element Concept) | Property (Data Element Concept) | Data type (Value Domain) | CD (Value Domain)

Patient.ID.II | Patient | Identifier | II | NE
Patient.Title.String | Patient | Title | characterstring | NE
Patient.GivenName.String | Patient | Given Name | characterstring | NE
Patient.FamilyName.String | Patient | Family Name | characterstring | NE
Patient.Gender.CD | Patient | Gender | CD | E
Patient.DateOfBirth.Date | Patient | Date of Birth | date | NE
Patient.MaritalStatus.CD | Patient | Marital Status | CD | E
Patient.ReligiousAffiliation.CD | Patient | Religious Affiliation | CD | E
Patient.Race.CD | Patient | Race | CD | E
Patient.Ethnicity.CD | Patient | Ethnicity | CD | E
Patient.PlaceOfBirth.Address | Patient | Place of Birth | Address | OC
Patient.Address.Address | Patient | Address | Address | OC
Patient.Telecom.Tele | Patient | Telecom | Tele | OC
Patient.HealthcareProvider.HealthcareProvider | Patient | Healthcare Provider | HealthcareProvider | OC
Patient.ProviderOrganization.Organization | Patient | Provider Organization | Organization | OC
Patient.DataReporter.DataReporter | Patient | Data Reporter | DataReporter | OC
Patient.InsuranceProvider.InsuranceProvider | Patient | Insurance Provider | InsuranceProvider | OC


Patient.Encounter.Encounter | Patient | Encounter | Encounter | OC
Patient.Allergy.Allergy | Patient | Allergy | Allergy | OC
Patient.Condition.Condition | Patient | Condition | Condition | OC
Patient.FamilyHistory.FamilyHistory | Patient | Family History | FamilyHistory | OC
Patient.Immunization.Immunization | Patient | Immunization | Immunization | OC
Patient.Medication.Medication | Patient | Medication | Medication | OC
Patient.PlanOfCare.PlannedEvent | Patient | Plan of Care | PlannedEvent | OC
Patient.Pregnancy.Pregnancy | Patient | Pregnancy | Pregnancy | OC
Patient.Procedure.Procedure | Patient | Procedure | Procedure | OC
Patient.Result.Result | Patient | Result | Result | OC
Patient.SocialHistory.SocialHistory | Patient | Social History | SocialHistory | OC
Patient.VitalSign.Result | Patient | Vital Sign | Result | OC

Address.NullFlavor.CD | Address | Null Flavor | CD | E
Address.Use.CD | Address | Use | CD | E
Address.StreetAddressLine.String | Address | Street Address Line | characterstring | NE
Address.PostalCode.String | Address | Postal Code | characterstring | NE
Address.City.String | Address | City | characterstring | NE
Address.State.String | Address | State | characterstring | NE
Address.Country.String | Address | Country | characterstring | NE

Tele.NullFlavor.String | Tele | Null Flavor | characterstring | NE
Tele.Use.CD | Tele | Use | CD | E
Tele.Value.String | Tele | Value | characterstring | NE

HealthcareProvider.DateRange.IVLTS | HealthcareProvider | Date Range | IVLTS | NE
HealthcareProvider.ID.II | HealthcareProvider | Identifier | II | NE
HealthcareProvider.Role.CD | HealthcareProvider | Role | CD | E
HealthcareProvider.Title.String | HealthcareProvider | Title | characterstring | NE
HealthcareProvider.GivenName.String | HealthcareProvider | Given Name | characterstring | NE
HealthcareProvider.FamilyName.String | HealthcareProvider | Family Name | characterstring | NE
HealthcareProvider.Address.Address | HealthcareProvider | Address | Address | OC
HealthcareProvider.Telecom.Tele | HealthcareProvider | Telecom | Tele | OC
HealthcareProvider.Organization.Organization | HealthcareProvider | Organization | Organization | OC

HealthcareProvider.patientID.II   HealthcareProvider   Patient Identifier   II   NE
HealthcareProvider.Comment.String   HealthcareProvider   Comment   characterstring   NE

Organization.ID.II   Organization   Identifier   II   NE
Organization.Address.Address   Organization   Address   Address   OC
Organization.Telecom.Tele   Organization   Telecom   Tele   OC
Organization.Name.String   Organization   Name   characterstring   NE

DataReporter.ID.II   DataReporter   Identifier   II   NE
DataReporter.Title.String   DataReporter   Title   characterstring   NE
DataReporter.GivenName.String   DataReporter   Given Name   characterstring   NE
DataReporter.FamilyName.String   DataReporter   Family Name   characterstring   NE
DataReporter.Qualification.CD   DataReporter   Qualification   CD   E
DataReporter.Organization.Organization   DataReporter   Organization   Organization   OC
DataReporter.Address.Address   DataReporter   Address   Address   OC
DataReporter.Telecom.Tele   DataReporter   Telecom   Tele   OC

InsuranceProvider.GroupNumber.II   InsuranceProvider   Group Number   II   NE
InsuranceProvider.HealthInsuranceType.CD   InsuranceProvider   Health Insurance Type   CD   E
InsuranceProvider.Payer.Organization   InsuranceProvider   Payer   Organization   OC
InsuranceProvider.Member.Member   InsuranceProvider   Member   Member   OC
InsuranceProvider.FinancialResponsibilityPartyType.CD   InsuranceProvider   Financial Responsibility Party Type   CD   E
InsuranceProvider.Subscriber.Subscriber   InsuranceProvider   Subscriber   Subscriber   OC
InsuranceProvider.Guarantor.Guarantor   InsuranceProvider   Guarantor   Guarantor   OC
InsuranceProvider.HealthPlanName.String   InsuranceProvider   Health Plan Name   characterstring   NE
InsuranceProvider.Comment.String   InsuranceProvider   Comment   characterstring   NE

Encounter.ID.II   Encounter   Identifier   II   NE
Encounter.Type.CD   Encounter   Type   CD   E
Encounter.TimeInterval.IVLTS   Encounter   Time Interval   IVLTS   NE
Encounter.Provider.HealthcareProvider   Encounter   Provider   HealthcareProvider   OC

Encounter.Organization.Organization   Encounter   Organization   Organization   OC
Encounter.ReasonForVisit.Condition   Encounter   Reason For Visit   Condition   OC

Allergy.AdverseEventType.CD   Allergy   Adverse Event Type   CD   E
Allergy.TimeInterval.IVLTS   Allergy   Time Interval   IVLTS   NE
Allergy.Product.CD   Allergy   Product   CD   E
Allergy.Reaction.Condition   Allergy   Reaction   Condition   OC
Allergy.Status.CD   Allergy   Status   CD   E
Allergy.Severity.CD   Allergy   Severity   CD   E
Allergy.Comment.String   Allergy   Comment   characterstring   NE

Condition.TimeInterval.IVLTS   Condition   Time Interval   IVLTS   NE
Condition.ProblemType.CD   Condition   Problem Type   CD   E
Condition.ProblemName.String   Condition   Problem Name   characterstring   NE
Condition.ProblemCode.CD   Condition   Problem Code   CD   E
Condition.ProblemStatus.CD   Condition   Problem Status   CD   E
Condition.ProblemSeverity.CD   Condition   Problem Severity   CD   E
Condition.TimeOfDeath.Datetime   Condition   Time of Death   datetime   NE
Condition.TreatingProvider.HealthcareProvider   Condition   Treating Provider   HealthcareProvider   OC
Condition.Comment.String   Condition   Comment   characterstring   NE

FamilyHistory.ObservationDate.Datetime   FamilyHistory   Observation Date   datetime   NE
FamilyHistory.KinshipType.CD   FamilyHistory   Kinship Type   CD   E
FamilyHistory.ObservationCode.CD   FamilyHistory   Observation Code   CD   E
FamilyHistory.AgeAtOnset.Integer   FamilyHistory   Age At Onset   integer   NE
FamilyHistory.Comment.String   FamilyHistory   Comment   characterstring   NE

Immunization.AdministeredDate.Datetime   Immunization   Administered Date   datetime   NE
Immunization.MedicationSeriesNumber.Integer   Immunization   Medication Series Number   integer   NE
Immunization.Route.CD   Immunization   Route   CD   E
Immunization.Dose.PQ   Immunization   Dose   PQ   NE

Immunization.Site.CD   Immunization   Site   CD   E
Immunization.Reaction.CD   Immunization   Reaction   CD   E
Immunization.Performer.HealthcareProvider   Immunization   Performer   HealthcareProvider   OC
Immunization.MedicationInformation.MedicationInformation   Immunization   Medication Information   MedicationInformation   OC
Immunization.Comment.String   Immunization   Comment   characterstring   NE

Medication.TimeInterval.IVLTS   Medication   Time Interval   IVLTS   NE
Medication.AdministeredTiming.PIVLTS   Medication   Administered Timing   PIVLTS   NE
Medication.Route.CD   Medication   Route   CD   E
Medication.Dose.PQ   Medication   Dose   PQ   NE
Medication.Site.CD   Medication   Site   CD   E
Medication.DoseRestriction.IVLPQ   Medication   Dose Restriction   IVLPQ   NE
Medication.ProductForm.CD   Medication   Product Form   CD   E
Medication.DeliveryMethod.CD   Medication   Delivery Method   CD   E
Medication.MedicationInformation.MedicationInformation   Medication   Medication Information   MedicationInformation   OC
Medication.Indication.Condition   Medication   Indication   Condition   OC
Medication.PatientInstructions.String   Medication   Patient Instructions   characterstring   NE
Medication.Reaction.CD   Medication   Reaction   CD   E
Medication.Order.Order   Medication   Order   Order   OC
Medication.FulfillmentInstructions.String   Medication   Fulfillment Instructions   characterstring   NE
Medication.FulfillmentHistory.FulfillmentHistory   Medication   Fulfillment History   FulfillmentHistory   OC
Medication.Comment.String   Medication   Comment   characterstring   NE

PlannedEvent.ID.II   PlannedEvent   Identifier   II   NE
PlannedEvent.TimeInterval.IVLTS   PlannedEvent   Time Interval   IVLTS   NE
PlannedEvent.Type.CD   PlannedEvent   Event Type   CD   E
PlannedEvent.EventCode.CD   PlannedEvent   Event Code   CD   E
PlannedEvent.Comment.String   PlannedEvent   Comment   characterstring   NE

Pregnancy.ObservationDate.Datetime   Pregnancy   Observation Date   datetime   NE
Pregnancy.LastMenstrualPeriodDate.Datetime   Pregnancy   Last Menstrual Period Date   datetime   NE

Pregnancy.DeliveryDate.Datetime   Pregnancy   Delivery Date   datetime   NE
Pregnancy.Comment.String   Pregnancy   Comment   characterstring   NE

Procedure.ID.II   Procedure   Identifier   II   NE
Procedure.TimeInterval.IVLTS   Procedure   Time Interval   IVLTS   NE
Procedure.Type.CD   Procedure   Type   CD   E
Procedure.Type.String   Procedure   Type   characterstring   NE
Procedure.Status.CD   Procedure   Status   CD   E
Procedure.Site.CD   Procedure   Site   CD   E
Procedure.Provider.HealthcareProvider   Procedure   Provider   HealthcareProvider   OC
Procedure.Indication.Condition   Procedure   Indication   Condition   OC
Procedure.RelatedEncounter.Encounter   Procedure   Related Encounter   Encounter   OC
Procedure.Comment.String   Procedure   Comment   characterstring   NE

Result.ID.II   Result   Identifier   II   NE
Result.TimeInterval.IVLTS   Result   Time Interval   IVLTS   NE
Result.Type.CD   Result   Type   CD   E
Result.Value.CD   Result   Value   CD   E
Result.Value.String   Result   Value   characterstring   NE
Result.Value.PQ   Result   Value   PQ   NE
Result.Interpretation.CD   Result   Interpretation   CD   E
Result.Provider.HealthcareProvider   Result   Provider   HealthcareProvider   OC
Result.ReferenceRange.IVLPQ   Result   Reference Range   IVLPQ   NE
Result.RelatedCondition.Condition   Result   Related Condition   Condition   OC
Result.Comment.String   Result   Comment   characterstring   NE

SocialHistory.TimeInterval.IVLTS   SocialHistory   Time Interval   IVLTS   NE
SocialHistory.ObservationCode.CD   SocialHistory   Observation Code   CD   E
SocialHistory.ObservationValue.CD   SocialHistory   Observation Value   CD   E
SocialHistory.ObservationValue.String   SocialHistory   Observation Value   characterstring   NE
SocialHistory.ObservationValue.PQ   SocialHistory   Observation Value   PQ   NE
SocialHistory.Comment.String   SocialHistory   Comment   characterstring   NE

Member.HealthPlanCoverageDates.IVLTS   Member   Health Plan Coverage Dates   IVLTS   NE

Member.ID.II   Member   Identifier   II   NE
Member.RelationshipToSubscriber.CD   Member   Relationship To Subscriber   CD   E
Member.Address.Address   Member   Address   Address   OC
Member.Tele.Tele   Member   Tele   Tele   OC
Member.Title.String   Member   Title   characterstring   NE
Member.GivenName.String   Member   Given Name   characterstring   NE
Member.FamilyName.String   Member   Family Name   characterstring   NE
Member.DateOfBirth.Date   Member   Date of Birth   date   NE

Subscriber.ID.II   Subscriber   Identifier   II   NE
Subscriber.Address.Address   Subscriber   Address   Address   OC
Subscriber.Tele.Tele   Subscriber   Tele   Tele   OC
Subscriber.Title.String   Subscriber   Title   characterstring   NE
Subscriber.GivenName.String   Subscriber   Given Name   characterstring   NE
Subscriber.FamilyName.String   Subscriber   Family Name   characterstring   NE
Subscriber.DateOfBirth.Date   Subscriber   Date of Birth   date   NE

Guarantor.ResponsibilityEffectiveDate.IVLTS   Guarantor   Responsibility Effective Date   IVLTS   NE
Guarantor.Address.Address   Guarantor   Address   Address   OC
Guarantor.Tele.Tele   Guarantor   Tele   Tele   OC
Guarantor.Title.String   Guarantor   Title   characterstring   NE
Guarantor.GivenName.String   Guarantor   Given Name   characterstring   NE
Guarantor.FamilyName.String   Guarantor   Family Name   characterstring   NE

MedicationInformation.ProductName.CD   MedicationInformation   Product Name   CD   E
MedicationInformation.ProductName.String   MedicationInformation   Product Name   characterstring   NE
MedicationInformation.ActiveIngredient.CD   MedicationInformation   Active Ingredient   CD   E
MedicationInformation.BrandName.CD   MedicationInformation   Brand Name   CD   E
MedicationInformation.BrandName.String   MedicationInformation   Brand Name   characterstring   NE
MedicationInformation.DrugManufacturer.Organization   MedicationInformation   Drug Manufacturer   Organization   OC

Order.Number.II   Order   Number   II   NE
Order.Provider.HealthcareProvider   Order   Provider   HealthcareProvider   OC
Order.FillNumber.Integer   Order   Fill Number   integer   NE
Order.QuantityOrdered.PQ   Order   Quantity Ordered   PQ   NE
Order.ExpirationDateTime.Datetime   Order   Expiration Date Time   datetime   NE
Order.Datetime.Datetime   Order   Datetime   datetime

FulfillmentHistory.PrescriptionNumber.II   FulfillmentHistory   Prescription Number   II   NE
FulfillmentHistory.DispensingProvider.HealthcareProvider   FulfillmentHistory   Dispensing Provider   HealthcareProvider   NE
FulfillmentHistory.DispenseDate.Datetime   FulfillmentHistory   Dispense Date   datetime   NE
FulfillmentHistory.QuantityDispensed.PQ   FulfillmentHistory   Quantity Dispensed   PQ   NE
FulfillmentHistory.FillNumber.Integer   FulfillmentHistory   Fill Number   integer   NE
FulfillmentHistory.FillStatus.CD   FulfillmentHistory   Fill Status   CD   E

5 THE REQUIREMENT FOR A SEMANTIC MDR FRAMEWORK

As described in Section 3.2, there are a number of recent efforts in both patient care and clinical research focused on defining metadata and vocabulary standards for clinical information and thereby on building Common Data Elements (CDEs). Although these efforts ensure interoperability within the selected domain for the selected use cases, interoperability across application domain boundaries is not automatically possible. This stems from the following facts [23]:

• Common data element model development efforts are often disparate from each other. Although previous efforts are examined, most of the time a common model is created from scratch.

• Most of the time, the specifications for these CDE sets and common models are in unstructured text files.

• Some of these efforts examine previous ones and re-use some CDEs proposed by others, and sometimes provide partial mappings to other CDE dictionaries. For example, S&I CEDD re-uses elements from HITSP C154, NEHTA and FHIM; HITSP C32 provides a mapping from HITSP C154 data elements to the elements of HL7 CCD. However, these mappings are maintained in several different spreadsheets or in PDF documents. Hence, it is not possible to process or query these data.

We believe there is a need for a more coordinated approach that allows machine-processable definitions of CDEs from different efforts to be searched, re-used and linked with each other, so that the mappings among CDEs in different domains can be queried to address semantic interoperability. In the SALUS Project, our aim is to develop a framework that facilitates all of these through federated, semantically enabled metadata registries (MDRs) conforming to the ISO/IEC 11179 standard, where CDEs maintained in different MDRs can be uniquely identified, queried and linked with each other through Linked Data principles. The details of this approach are presented in the paper [23] titled “A Federated Semantic Metadata Registry Framework for Enabling Interoperability across Clinical Research and Care Domains”.

In this section we briefly describe the challenges that need to be addressed to develop a semantic MDR and give a high-level overview of how we address them in the SALUS Semantic MDR implementation. Details of these challenges and the proposed solutions can be found in our related SALUS publication [23]. Details of the implementation choices and results are presented in Section 6.

The first challenge in developing a semantic MDR is to maintain the definitions of CDEs in a machine-processable manner, rather than keeping them in PDF documents or spreadsheets, so that it becomes possible to search and query them. For this we have decided to adopt the ISO/IEC 11179 Metadata Registries (MDR) standard and to provide the necessary extensions when required. A semantic MDR framework should enable the following basic functionalities [23]:

• Searching CDEs maintained by different MDRs
• Retrieving the standard specification of a selected CDE from an MDR
• Re-using CDEs maintained in a different MDR by referencing the respective CDE

To facilitate semantic interoperability more effectively across domains, a semantically linked federated MDR framework should support some additional functionalities:

• It should be possible to link and semantically associate the CDEs across different MDRs in reference to well-accepted knowledge organization system (KOS) ontologies and terminology systems.

• It should be possible to easily query these semantic relationships within and across MDRs.

We have chosen to apply Linked Open Data (LOD) principles as the basis of this semantically linked federated MDR framework. Linked Data is a recommended best practice for exposing, sharing, and connecting pieces of data, information and knowledge on the Semantic Web using URIs and Resource Description Framework (RDF). It provides a natural way to expose the CDEs maintained in different MDRs openly in the LOD cloud, and interrelate them with each other as depicted in Figure 5.

Figure 5 – Federated semantic MDR framework [23]

In particular the following principles are followed:

• Each CDE should be uniquely identified by a URI.

• Each CDE should be dereferenceable, that is, MDRs shall provide the necessary HTTP-REST services for looking up CDEs by using their unique URIs.

• Each MDR should provide semantic RDF descriptions of CDEs, which are accessible through the HTTP services provided. When a CDE is looked up through its URI, the RDF description of the CDE should be returned, where all context of the CDE is presented in RDF: each RDF property is interpreted as a hyperlink to other linked open MDR resources. This automatically enables access to more data, which is usually referred to as the “follow-your-nose principle”. To enable this, we have created an OWL ontology from the ISO/IEC 11179 meta-model, as described in detail in Section 6.1. When a CDE is looked up, its RDF description in conformance to the ISO/IEC 11179 meta-model should be returned.

• In an ISO/IEC 11179 MDR, it is possible to annotate CDEs with external terminology systems. In line with LOD approach, in a semantic MDR, links of CDEs to terminology system codes should also be referred through their unique URIs in the LOD cloud.

• In a semantic MDR, it should be possible to set other semantic links between the CDEs maintained in different MDRs as a part of semantic description of the CDE.
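The principles above can be illustrated with a minimal sketch. It is not the SALUS implementation: the two registries, their base URIs and the CDE names are hypothetical, and an in-memory dictionary stands in for the HTTP-REST lookup services. Only the SKOS and RDFS property URIs are real vocabulary terms.

```python
# Hypothetical stand-in for two MDRs; every example.org URI is invented.
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

MDR_A = "http://mdr-a.example.org/cde/"
MDR_B = "http://mdr-b.example.org/cde/"

# Each CDE URI maps to its RDF description as (predicate, object) pairs.
DESCRIPTIONS = {
    MDR_A + "LabResult": [
        (RDFS_LABEL, "Result of a lab test"),
        (SKOS_EXACT_MATCH, MDR_B + "Result.Value"),
    ],
    MDR_B + "Result.Value": [
        (RDFS_LABEL, "Result value"),
    ],
}


def dereference(uri):
    """Stand-in for an HTTP GET on a CDE URI; returns its RDF description."""
    return DESCRIPTIONS.get(uri, [])


def follow_your_nose(start_uri):
    """Visit every CDE reachable by treating object URIs as hyperlinks."""
    seen, queue = [], [start_uri]
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.append(uri)
        for _predicate, obj in dereference(uri):
            # Only objects that are themselves dereferenceable CDEs are followed.
            if obj in DESCRIPTIONS:
                queue.append(obj)
    return seen
```

Calling `follow_your_nose(MDR_A + "LabResult")` starts at a CDE in one registry and reaches the linked CDE in the other purely by dereferencing URIs found in the descriptions.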

One additional functionality we would like to enable through a federated semantic MDR framework is retrieving “extraction specifications” for a CDE defined in a selected domain from a content model in a different domain. CDEs are often abstract data element definitions, which are later used to annotate the actual data elements in implementation dependent models that carry clinical content. These implementation dependent models are called “Content Models”. For example, HITSP C154 data elements are used to annotate parts of CCD content models to indicate the unambiguous meaning of CCD elements. Maintaining the links between these abstract CDEs and the implementation dependent content models through an MDR architecture would facilitate retrieving machine processable extraction specifications that can be used to enable dynamic interoperability across different domains. For example, through a semantic federated MDR framework, it should be possible to extract SDTM-annotated data sets from a medical summary conforming to the HITSP C32 content model specifications (annotated with C154 data elements).

5.1 How can this semantic MDR be exploited for enabling interoperability across domains? [23]

In our hypothetical scenario, a study data manager in a pharmaceutical company aims to design the data collection set for a new observational study. She searches the local MDR of her company to retrieve data element descriptions for the selected set of variables in the data collection set. The local MDR returns a list of data element descriptions, including the unique URIs of the matching SDTM CDEs maintained by the MDR managed by CDISC. The study manager prepares the study protocol as a CDISC ODM document annotated with SDTM CDEs and sends it to a Safety Analysis Tool. The Safety Analysis Tool automatically processes the study protocol and tries to map the data items identified in the data collection set to the parts of the HL7 CCD medical summary documents of study patients that it collects from the participating care organizations, as follows (Figure 6):

• The Safety Analysis Tool queries the federated MDR framework for extraction specifications of the selected SDTM CDEs from HL7 CCD format. The service asks the registered MDRs for the extraction specifications of each SDTM CDE through their RESTful interfaces.

• None of the MDRs directly provides the extraction specification of the selected SDTM CDE (say LBORRES which stands for “results of a lab test”) from HL7 CCD format.

• The query service asks the registered MDRs for the Semantic Links of LBORRES. In our example scenario, a semantic MDR maintaining BRIDG DAM elements [16] provides a mapping between the “LBORRES” CDE in the CDISC SDTM domain and the “PerformedObservationResult.value.Any” CDE in the BRIDG domain. It also maintains a mapping between the “PerformedObservationResult.value.Any” CDE and the “Result.Value” CDE from the HITSP domain. Hence, when the federated query service asks for the Semantic Links of LBORRES, the BRIDG MDR returns two URIs:

1. “PerformedObservationResult.value.Any” from BRIDG
2. “Result.Value” from HITSP

• “Result.Value” CDE is served in a semantic MDR hosted by HITSP which is linked with “PerformedObservationResult.value.Any” CDE through “SKOS:exactMatch” semantic relationship.

• The federated MDR search system now looks up the HITSP MDR to retrieve the extraction specification of the “Result.Value” CDE in RDF format; the extraction specification for the HL7 CCD content model is available as the XPath query “cda:observation[cda:templateId/@root='2.16.840.1.113883.10.20.1.31']/cda:value”.

• In this way, the Safety Analysis Tool is able to retrieve the required data elements in the data collection set from the HL7 CCD documents provided for each study visit by the participating organizations.
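The lookup steps of this scenario can be sketched as follows. This is an illustrative approximation only: the dictionary layout and the function names are assumptions, and plain dictionaries replace the RESTful MDR interfaces. The CDE names, the semantic links and the XPath expression are those given in the scenario.

```python
# XPath from the scenario text; the registry layout below is assumed.
CCD_RESULT_VALUE_XPATH = (
    "cda:observation[cda:templateId/@root='2.16.840.1.113883.10.20.1.31']/cda:value"
)

# registry name -> CDE name -> semantic links and extraction specifications
REGISTRIES = {
    "CDISC": {"LBORRES": {"links": [], "extraction": {}}},
    "BRIDG": {
        "LBORRES": {
            "links": ["PerformedObservationResult.value.Any", "Result.Value"],
            "extraction": {},
        },
        "PerformedObservationResult.value.Any": {
            "links": ["Result.Value"],
            "extraction": {},
        },
    },
    "HITSP": {
        "Result.Value": {
            "links": ["PerformedObservationResult.value.Any"],
            "extraction": {"HL7 CCD": CCD_RESULT_VALUE_XPATH},
        },
    },
}


def direct_spec(cde, content_model):
    """Ask every registered MDR for a direct extraction specification."""
    for name, registry in REGISTRIES.items():
        entry = registry.get(cde)
        if entry and content_model in entry["extraction"]:
            return name, entry["extraction"][content_model]
    return None


def semantic_links(cde):
    """Collect the semantic links of a CDE from every registered MDR."""
    links = []
    for registry in REGISTRIES.values():
        for target in registry.get(cde, {}).get("links", []):
            if target not in links:
                links.append(target)
    return links


def find_extraction_spec(cde, content_model):
    """Breadth-first search over semantic links until a specification is found."""
    visited, queue = set(), [cde]
    while queue:
        current = queue.pop(0)
        if current in visited:
            continue
        visited.add(current)
        found = direct_spec(current, content_model)
        if found:
            return current, found[0], found[1]
        queue.extend(semantic_links(current))
    return None
```

With this data, `find_extraction_spec("LBORRES", "HL7 CCD")` finds no direct specification, follows the links returned by the BRIDG registry, and resolves to the specification held for “Result.Value” in the HITSP registry.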

A similar flow can be achieved through retrieving the RDF descriptions of CDEs by calling the CDE endpoints, and by processing these RDF descriptions where semantic links and links to extraction specifications are already available. As depicted in the example scenario, through the proposed federated MDR framework, it is possible to facilitate interoperability across clinical research and care domains although different standards and different CDEs are in use.
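Once an extraction specification such as the XPath expression of the scenario has been retrieved, it can be applied to a CCD document. The sketch below assumes a small, invented CCD fragment and uses only the Python standard library. Because `xml.etree.ElementTree` supports only a subset of XPath, the `templateId` predicate is evaluated by hand rather than passed as an XPath string; a full XPath engine would accept the expression directly.

```python
import xml.etree.ElementTree as ET

CDA_NS = {"cda": "urn:hl7-org:v3"}
# Template root taken from the XPath expression in the scenario.
RESULT_OBSERVATION_TEMPLATE = "2.16.840.1.113883.10.20.1.31"

# Invented fragment for illustration; real CCD documents are far richer.
SAMPLE_CCD_FRAGMENT = """
<section xmlns="urn:hl7-org:v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <entry>
    <observation classCode="OBS" moodCode="EVN">
      <templateId root="2.16.840.1.113883.10.20.1.31"/>
      <value xsi:type="PQ" value="13.2" unit="g/dL"/>
    </observation>
  </entry>
  <entry>
    <observation classCode="OBS" moodCode="EVN">
      <templateId root="9.9.9"/>
      <value xsi:type="PQ" value="1" unit="1"/>
    </observation>
  </entry>
</section>
"""


def extract_result_values(xml_text):
    """Return (value, unit) pairs of observations carrying the result template."""
    root = ET.fromstring(xml_text.strip())
    values = []
    for observation in root.iter("{urn:hl7-org:v3}observation"):
        template_roots = [
            t.get("root") for t in observation.findall("cda:templateId", CDA_NS)
        ]
        # Hand-written equivalent of [cda:templateId/@root='...'].
        if RESULT_OBSERVATION_TEMPLATE in template_roots:
            for value in observation.findall("cda:value", CDA_NS):
                values.append((value.get("value"), value.get("unit")))
    return values
```

Only the first observation in the sample carries the result observation template, so only its value is extracted.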

Figure 6 – Step-by-step representation of the hypothetical scenario [23]

6 DESIGN & IMPLEMENTATION OF SALUS SEMANTIC MDR

This section presents the design and the implementation details of the SALUS MDR. SALUS MDR is an open source project and the source code can be found at its GitHub repository¹. It is also available in the local source code repository of the SALUS Project maintained by SRDC.

[Figure 7 diagram labels: Common Data Element (CDE) Repository; MDR Knowledge Base; SALUS MDR Web GUI; UML Model Importer; Schema Model Importer; Semantic Model Importer]

Figure 7 – Components of the SALUS MDR

Figure 7 presents the high-level interactions among the components of the SALUS MDR. The main sub-component providing the semantic MDR functionalities is the MDR Knowledge Base. The MDR Knowledge Base consists of the triple store, which serves as the backend for the SALUS MDR, and two layers of Java API exposing interfaces to manage the SALUS MDR.

The SALUS MDR Web GUI is the web-based graphical user interface that exposes the functionalities of the SALUS MDR to the outside world in an easy-to-use way. It also introduces an authentication mechanism, which allows controlling who gains access to the SALUS MDR and makes the registration and administration of ISO/IEC 11179 items easier.

SALUS MDR also provides several importers in order to process and import some content models. These importers are used for auto-population of SALUS MDR with CDEs extracted from external content models.

¹ https://github.com/sinaci/semanticMDR

6.1 Ontology of ISO/IEC 11179 Metamodel

As stated previously, the CDE Repository is designed as a semantic MDR and follows the Linked Data (http://linkeddata.org) principles. However, the ISO/IEC 11179 standard provides a relational model for the structure of metadata registries through its entity-relationship diagrams. To be able to add semantic capabilities such as handling inter-links between CDEs and handling external links to other repositories, terminology systems, classification schemes etc., the SALUS Semantic MDR has been built on top of a triple store. The Apache Jena (http://jena.apache.org) framework for Semantic Web applications has been adopted as the triple store interface to be used as the backend of the CDE Knowledge Base. One of the core functionalities of Apache Jena is its Java API for manipulating RDF graphs. It also has native support for OWL (http://www.w3.org/TR/owl2-overview/) ontologies.

The ISO/IEC 11179 ontology has been built by following the specifications in ‘ISO/IEC 11179-3:2003 Metadata registries (MDR) - Part 3: Registry metamodel and basic attributes’. OWL was chosen to represent the semantics between ISO/IEC 11179 constructs because Jena supports it natively and because it is a well-known and accepted standard. Each metamodel object defined by ISO/IEC 11179 is modelled on one of the metamodel constructs: classes, attributes, relationships and association classes. While building the ISO/IEC 11179 ontology, an OWL resource is created for each metamodel construct and these mappings are reflected in the resulting ontology. The mappings used while constructing the ontology are listed in Table 4:

Table 4 – Mapping of ISO/IEC 11179 metamodel constructs to OWL constructs

ISO/IEC 11179 metamodel construct   OWL construct
class   owl:Class
attribute   owl:DatatypeProperty
composite attribute   owl:ObjectProperty
relationship   owl:ObjectProperty

By using the mappings given in Table 4, each class in the ISO/IEC 11179 metamodel is defined as an owl:Class. All ontological resources derived from the metamodel are created under the namespace http://www.salusproject.eu/iso11179-3/mdr# with the prefix “mdr”. The parent-child relationships between classes in the metamodel are reproduced as the same class hierarchy in the ontology by using the rdfs:subClassOf property. Figure 8 presents the class hierarchy of Administered Items in the ISO/IEC 11179 metamodel, which is reflected in the ontology.

Figure 8 – Class hierarchy in ISO/IEC 11179 metamodel
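The mapping of Table 4 and the class hierarchy rule can be sketched as a small Turtle generator. This is a hedged illustration in plain Python, not the Jena-based code of the SALUS MDR: the function names are hypothetical, the class local names (for example `AdministeredItem` and `DataElement`) are assumed spellings, and the hierarchy is expressed with `rdfs:subClassOf`, the standard RDFS property for subclassing. The namespace is the one given in the text.

```python
MDR_NS = "http://www.salusproject.eu/iso11179-3/mdr#"

# Table 4: ISO/IEC 11179 metamodel construct -> OWL construct
CONSTRUCT_TO_OWL = {
    "class": "owl:Class",
    "attribute": "owl:DatatypeProperty",
    "composite attribute": "owl:ObjectProperty",
    "relationship": "owl:ObjectProperty",
}


def class_triples(name, parent=None):
    """Turtle statements for one metamodel class and its optional parent."""
    lines = [f"mdr:{name} rdf:type {CONSTRUCT_TO_OWL['class']} ."]
    if parent is not None:
        lines.append(f"mdr:{name} rdfs:subClassOf mdr:{parent} .")
    return lines


def to_turtle(hierarchy):
    """Serialize (class, parent) pairs as a Turtle document."""
    header = [
        f"@prefix mdr: <{MDR_NS}> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    body = []
    for name, parent in hierarchy:
        body.extend(class_triples(name, parent))
    return "\n".join(header + body)


if __name__ == "__main__":
    # Assumed excerpt of the Administered Item hierarchy, for illustration.
    print(to_turtle([
        ("AdministeredItem", None),
        ("DataElement", "AdministeredItem"),
        ("DataElementConcept", "AdministeredItem"),
    ]))
```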

In the process of building OWL constructs from metamodel constructs, all properties in the metamodel are reflected in the ontology so that a one-to-one correspondence between the two models can be maintained. Attributes with literal values are modelled as owl:DatatypeProperty resources of the owning classes. Complex attributes, which have another class as their value, are modelled as owl:ObjectProperty resources of the corresponding ontology class. Figure 9 shows the RDF serialization of a DataElementConcept from the Semantic MDR ontology. Notice that the data_element_concept_ property corresponds to an object property, since its value is another class from the metamodel, while object_class_qualifier corresponds to a datatype property, since its values are string literals.

Figure 9 – ER Diagram and RDF serializations of DataElementConcept and its attributes
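The rule for choosing between the two property kinds can be sketched as below. The set of metamodel classes, the XSD type table and the attribute name `has_object_class` are hypothetical and chosen only for illustration; `object_class_qualifier` is the attribute named in the text.

```python
# Assumed excerpt of metamodel class names and literal types.
METAMODEL_CLASSES = {"DataElementConcept", "ObjectClass", "Property", "ConceptualDomain"}
XSD_TYPES = {"string": "xsd:string", "integer": "xsd:integer", "datetime": "xsd:dateTime"}


def attribute_to_turtle(owner, attribute, value_type):
    """Emit Turtle for one attribute, choosing the OWL property kind by value type."""
    if value_type in METAMODEL_CLASSES:
        # The value is another metamodel class: object property.
        kind, value_range = "owl:ObjectProperty", f"mdr:{value_type}"
    else:
        # The value is a literal: datatype property.
        kind, value_range = "owl:DatatypeProperty", XSD_TYPES[value_type]
    return [
        f"mdr:{attribute} rdf:type {kind} .",
        f"mdr:{attribute} rdfs:domain mdr:{owner} .",
        f"mdr:{attribute} rdfs:range {value_range} .",
    ]


if __name__ == "__main__":
    for line in attribute_to_turtle("DataElementConcept", "object_class_qualifier", "string"):
        print(line)
    # 'has_object_class' is a hypothetical attribute name.
    for line in attribute_to_turtle("DataElementConcept", "has_object_class", "ObjectClass"):
        print(line)
```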

As stated previously in this section of the document, relationships between classes are modelled in the ontology by using owl:ObjectProperty’s. For each relationship defined, source and the target class of the property is provided in the ontology by using the rdfs:domain and rdfs:range constructs from RDFS1. Also, cardinality constraints are specified for the relationships and attributes in the ISO/IEC 11179 metamodel. Since ontology built for the semantic MDR is based on OWL DL and OWL DL allows cardinality constraints on properties, all given cardinality constraints in metamodel are reflected to ontology by using owl:minCardinality, owl:maxCardinality and owl:cardinality OWL constructs. If a cardinality of a relationship or an attribute is defined as 0..*, which means any number of values, then nothing is provided in the ontology.

Figure 10 – ISO/IEC 11179 ER Diagram of Data Element

Figure 10 shows that the data_element_concept_expression relationship between Data Element and Data Element Concept has cardinality constraints. This relationship is modelled in the ontology by two object properties with defined cardinalities, expressing and expressedBy. The two object properties are also linked with owl:inverseOf to indicate that each is the inverse of the other. Figure 11 shows the part of the RDF serialization of the constructed ontology that includes this relationship.

1 http://www.w3.org/TR/rdf-schema/

Figure 11 – Parts from the RDF serialization of Data Element and its relationships

By following the principles illustrated in Figure 11, the whole metamodel given by the entity-relationship diagrams in ISO/IEC 11179 Part 3 is modelled as an RDF graph using OWL constructs. The result is an OWL DL document consisting of 1441 triples and 67 owl:Class definitions, with no individuals. Since the metamodel describes all entities in terms of classes, attributes and relationships, the same principle is adopted while constructing the ontology. As specified in Table 4, the entire metamodel is modelled using OWL constructs, and since the ontology contains no individuals, it can serve as a schema for the SALUS Semantic MDR.

According to the ISO/IEC 11179 specifications, each Administered Item should have at least one Context in which it is named and defined. This requirement also holds for Contexts themselves. The solution proposed by ISO/IEC 11179 is to have one parent Context, the Context of all Contexts, which represents the repository itself. An instance of Context is therefore added to the ISO/IEC 11179 ontology, later to be used as the repository Context, or Context of Contexts. So as not to violate the ISO/IEC 11179 based model, a few default resources that an AdministeredItem requires, such as Organization, Registration Authority and Language Section, are also created. These default resources are hidden from the users; the mechanism is described in section 6.2.3.

6.2 MDR Knowledge Base

The MDR Knowledge Base serves as the repository for the SALUS CDE Repository. It is designed to handle the interoperability requirements of the SALUS CDE Repository. The MDR Knowledge Base is a semantic metadata repository based on the ISO/IEC 11179 metamodel: it fully implements the ISO/IEC 11179 family of standards and, in addition, provides semantic capabilities.

[Diagram labels: Triple Store (Jena TDB | Virtuoso); Semantic Data Manipulation API (Pure ISO 11179 Mapping); MDR API (Easy-to-use Semantic ISO 11179 Mapping); Java API; REST API; Data; Semantic MDR]

Figure 12 – Components of the MDR Knowledge Base

Figure 12 presents the sub-components of the MDR Knowledge Base and the interactions among them. At the bottom, a triple store serves as the backend of the MDR Knowledge Base. Although ISO/IEC 11179 employs a relational model, an OWL ontology fully conforming to the metamodel is constructed and used as the schema for the triple store in order to add semantic capabilities. Above the triple store, a three-layered API performs semantic operations on it. The following sub-sections describe each sub-component of the MDR Knowledge Base in detail.

6.2.1 Triple Store

The triple store sub-component serves as the backend of the MDR Knowledge Base. Because semantic capabilities that follow the Linked Data approach require more sophisticated data management than the relational model offers, a triple store is chosen as the data backend. The previous section describes how the ontology used as the schema is constructed from the ISO/IEC 11179 metamodel. This section focuses on the details of the triple store used in the Knowledge Base.

As already mentioned, Apache Jena¹ is selected as the RDF framework for the MDR Knowledge Base. Jena provides a Java API for manipulating RDF graphs and has native support for OWL ontologies, with Java counterparts for OWL constructs such as owl:Class and owl:cardinality. Apache Jena has a built-in triple store backend, Jena TDB². TDB is a native, disk-based store specialized for holding RDF triples. It provides a scalable, high-performance triple store for RDF graphs and supports functionality such as indexing graphs and running SPARQL³ and full-text queries over the dataset. It also handles the memory-to-disk mapping of the graph automatically, so code using the Java API does not need to deal with synchronization issues. Because of these features, Jena TDB is used as one of the triple store backends of the SALUS Semantic MDR.

1 http://jena.apache.org 2 http://jena.apache.org/documentation/tdb/index.html 3 http://www.w3.org/TR/rdf-sparql-query/

Another widely used high-performance triple store is the Virtuoso Triple Store¹. Virtuoso is in fact a high-performance, scalable universal database server; however, it has an RDF layer on top of its relational database, so it can be used as a triple store too. Because of its performance and scalability, it performs better than most other triple stores although it is not a native triple store. Virtuoso provides a SPARQL endpoint to perform semantic operations on RDF graphs. To support common RDF frameworks, Virtuoso provides drivers for Jena, Sesame and others. With the Jena driver, the Jena Java API can manipulate RDF graphs stored in Virtuoso without any additional effort.

Figure 13 – Class Diagram for Triple Store Sub Component

Since both triple stores can be used through the Jena Java API, a JenaStore interface is created to unify triple store operations. Currently there are two implementations of this interface, TDBStore and VirtuosoStore, as seen in Figure 13. Any triple store accessible through the Jena Java API can be added to the MDR Knowledge Base simply by implementing the JenaStore interface. Replacing the triple store backend with another one requires only a few programmatic adjustments in the Semantic Data Manipulation API. The Triple Store sub-component not only provides basic triple store functionality such as creation, removal and retrieval of models in the underlying dataset, but also provides a SPARQL endpoint and a text index whose form depends on the implementation. Virtuoso has a built-in full-text index and its own syntax for full-text queries, whereas Jena, and therefore TDB, provides the Lucene² indexing engine through LARQ¹.
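The idea of hiding both backends behind one interface can be illustrated with the following minimal sketch. The method names of the real JenaStore interface are not given in this document, so StoreSketch and its methods are assumptions, and both implementations keep model names in memory instead of talking to Jena TDB or Virtuoso.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical stand-in for the JenaStore interface; the method names are assumptions. */
interface StoreSketch {
    void createModel(String name);
    boolean removeModel(String name);
    Set<String> listModels();
    String backendName();
}

/** Shared in-memory bookkeeping so that the sketch runs without Jena or Virtuoso. */
abstract class InMemoryStoreSketch implements StoreSketch {
    private final Set<String> models = new LinkedHashSet<>();

    @Override public void createModel(String name) { models.add(name); }
    @Override public boolean removeModel(String name) { return models.remove(name); }
    @Override public Set<String> listModels() { return Collections.unmodifiableSet(models); }
}

/** Placeholder for a TDB-backed store. */
final class TdbStoreSketch extends InMemoryStoreSketch {
    @Override public String backendName() { return "Jena TDB"; }
}

/** Placeholder for a Virtuoso-backed store. */
final class VirtuosoStoreSketch extends InMemoryStoreSketch {
    @Override public String backendName() { return "Virtuoso"; }
}

final class StoreSketchDemo {
    /** Code written against the interface does not depend on the backend that is plugged in. */
    static int registerDefaults(StoreSketch store) {
        store.createModel("mdr");
        store.createModel("mdr-index");
        return store.listModels().size();
    }

    public static void main(String[] args) {
        StoreSketch[] stores = { new TdbStoreSketch(), new VirtuosoStoreSketch() };
        for (StoreSketch store : stores) {
            System.out.println(store.backendName() + " holds " + registerDefaults(store) + " models");
        }
    }
}
```

Adding a third backend in this style would mean adding one more class that implements the interface, which mirrors the extension point described above.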

1 http://virtuoso.openlinksw.com/ 2 http://lucene.apache.org

6.2.2 Semantic Data Manipulation API

The Semantic Data Manipulation API is the first layer over the triple store. As previously mentioned, the Triple Store provides storage for RDF graphs and operations to manipulate them. However, the Triple Store knows nothing about the ISO/IEC 11179 relational model or the SALUS CDE Repository. The Semantic Data Manipulation API is a direct implementation of the ISO/IEC 11179 relational model. Every class, attribute and relationship in the metamodel is directly implemented as Java classes and attributes for full conformance with ISO/IEC 11179. In addition, each Java class in this layer is an ontological class of the Jena Java API, so that its instances are automatically resources in the underlying triple store. Figure 14 presents the design of the abstract AdministeredItem class.

Figure 14 – Class Diagram for Administered Item

As stated previously in this document, the Semantic Data Manipulation API is designed both to enforce the constraints of the ISO/IEC 11179 relational model and to support semantic operations that follow the Linked Data approach. The first step of the design process was to create Java classes and interfaces that map directly to the entity-relationship diagrams of the relational model. The class hierarchy in the metamodel was then reflected directly in the Java classes, as was done when building the ontology. The class hierarchy of administered items is presented in Figure 15.

1 http://jena.sourceforge.net/ARQ/lucene-arq.html

Figure 15 – Java Class Hierarchy of Semantic Data Manipulation API

Figure 16 – Administered Item Class Diagram, Semantic Data Manipulation API

Figure 16 shows the public members of the AdministeredItemResource interface. This interface represents AdministeredItems in the Semantic Data Manipulation API, or low level API. All methods in this interface correspond to attributes or relationships in the metamodel and are used as setter and getter methods for them. As the class diagram shows, the arguments and return types of the methods correspond to the class or the literal data type of the related attribute or relationship. These operations do not enforce any data integrity rules or constraints other than those given in the entity-relationship diagrams.

The Semantic Data Manipulation API is also called the low level API, for two reasons. First, the classes and interfaces on this level are tightly coupled with the underlying triple store. All resources in this API are also ontological classes, since they extend the OntClass interface of the Jena Java API. In this way, all resources automatically have the methods and capabilities of the Jena OntClass interface. Moreover, since a triple store supporting the Jena Java API is used as the backend of the MDR Knowledge Base, all resources in this low level API can be stored directly in the underlying triple store without additional effort such as mapping Java objects to RDF resources.

The second reason this API is called the low level API is that the methods it exposes are not easy to use. Since the Semantic Data Manipulation API directly manipulates the RDF triples in the underlying triple store, it requires knowledge of triple stores and the Jena Java API. For the same reason, it is possible to leave the system in an inconsistent state even by performing valid operations.

One of the important facilities provided by this API is the MDRDatabase. MDRDatabase is an abstraction of the underlying triple store that also adds some registry/repository functionality to the Semantic Data Manipulation API. Figure 17 gives the class diagram of MDRDatabase and shows the methods it provides.

Figure 17 – MDRDatabase Class Diagram

MDRDatabase can be considered one of the main entities of the MDR Knowledge Base. It holds the triple store used as the backend, and through the utility functions it exposes, such as getOntModel, commit and sync, the triple store can be used uniformly. As can be seen in Figure 14, each AbstractMDRResource has a protected attribute that holds a reference to the MDRDatabase. Since AbstractMDRResource is the parent class of all the implementations, each ISO/IEC 11179 entity automatically has a reference to the MDRDatabase in which it was created. A further benefit is that the utility methods of the MDRDatabase can be used from each resource.

Another important facility provided by the Semantic Data Manipulation API is the MDRResourceFactory. Each MDRDatabase has a single MDRResourceFactory, which employs the factory method pattern to create resources on the MDRDatabase. In this way, all ISO/IEC 11179 resources are created from a single source. The factory methods also perform very basic consistency checks to keep the triple store in a consistent state. Since creating an ISO/IEC 11179 resource may require creating a number of other resources, any inconsistency in the arguments would cause an inconsistency in the underlying triple store if constructors were used for object creation. With MDRResourceFactory, this risk is eliminated and the constraints of the ISO/IEC 11179 metamodel are enforced during object creation.
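A minimal sketch of the factory method idea follows. It is not the MDRResourceFactory itself: the types FactorySketch and ResourceSketch, their method names and the particular checks are assumptions made for illustration. The point it shows is that arguments are validated before anything is registered, so a rejected call leaves the store unchanged.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, heavily simplified resource; the real classes carry many more attributes. */
final class ResourceSketch {
    final String type;
    final String name;
    final FactorySketch owner;

    ResourceSketch(String type, String name, FactorySketch owner) {
        this.type = type;
        this.name = name;
        this.owner = owner;
    }
}

/** Sketch of a factory in the spirit of MDRResourceFactory; all method names are assumptions. */
final class FactorySketch {
    private final List<ResourceSketch> created = new ArrayList<>();

    ResourceSketch createObjectClass(String name) { return register("ObjectClass", name); }

    ResourceSketch createProperty(String name) { return register("Property", name); }

    ResourceSketch createDataElementConcept(String name, ResourceSketch objectClass, ResourceSketch property) {
        // Validate every argument first, so that a failed call never leaves partial state behind.
        requireType(objectClass, "ObjectClass");
        requireType(property, "Property");
        return register("DataElementConcept", name);
    }

    /** Number of resources created so far; stands in for the content of the triple store. */
    int size() { return created.size(); }

    private void requireType(ResourceSketch resource, String expectedType) {
        if (resource == null || !expectedType.equals(resource.type) || resource.owner != this) {
            throw new IllegalArgumentException("expected a " + expectedType + " created by this factory");
        }
    }

    private ResourceSketch register(String type, String name) {
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("a name is required");
        }
        ResourceSketch resource = new ResourceSketch(type, name, this);
        created.add(resource);
        return resource;
    }

    public static void main(String[] args) {
        FactorySketch factory = new FactorySketch();
        ResourceSketch person = factory.createObjectClass("Person");
        ResourceSketch gender = factory.createProperty("Gender");
        factory.createDataElementConcept("Person Gender", person, gender);
        System.out.println("resources created: " + factory.size());
    }
}
```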

6.2.3 MDR API

The Semantic Data Manipulation API provides full conformance with the ISO/IEC 11179 relational model, but it is problematic to expose as an API for MDR users. Hence the MDR API is designed on top of the Semantic Data Manipulation API to expose easy-to-use interfaces and registry/repository capabilities. The MDR API is essentially one more layer of interfaces over the Semantic Data Manipulation API. These interfaces provide easy-to-use methods for manipulating the metadata registry/repository by using the low level API. With this design, the MDR API is completely decoupled from the underlying triple store, which yields a domain independent API for semantic MDRs. Figure 18 shows part of the class diagram of the MDR API to illustrate the design.

Figure 18 – Part of the MDR API Class Diagram

Figure 18 repeats the class diagram for Administered Item (Figure 14), except for the rightmost part. As the diagram shows, the interfaces of the Semantic Data Manipulation API are extensions of the interfaces exposed by the MDR API. Because of this class hierarchy, the same Java object is also available through the MDR API, and only the desired methods, those defined in the MDR API interfaces, are exposed to the outside world. The implementations of the MDR API interface methods use only the interfaces exposed by the Semantic Data Manipulation API. Therefore, this API level has nothing to do with the underlying triple store or the Jena Java API. This design choice decouples the MDR API from the underlying triple store and makes it a domain independent API for semantic metadata registries/repositories. Using the MDR API requires no knowledge of either the Jena RDF API or the underlying triple store. Decoupling also makes the lower levels of the MDR Knowledge Base interchangeable without any effect on the MDR API.

In the MDR API, Repository is designed to be the entry point of the metadata registry/repository. The Repository object is designed as a singleton, so that within a single execution only one repository exists and all operations are executed on that same repository. The singleton instance is obtained by the RepositoryManager#getInstance()#getRepository() call. Repository holds a single MDRDatabase instance, through which operations executed on the Repository are reflected in the underlying triple store. Figure 19 presents the Java class diagrams of this design.
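The singleton access path can be sketched as follows. Only the call chain getInstance() followed by getRepository() comes from the text; the class bodies, the holder idiom used for lazy initialization and the database name are assumptions of this sketch.

```java
/** Hypothetical repository; the string stands in for its single MDRDatabase instance. */
final class RepositorySketch {
    private final String databaseName;

    RepositorySketch(String databaseName) { this.databaseName = databaseName; }

    String databaseName() { return databaseName; }
}

/** Sketch of a manager that hands out one repository per execution. */
final class RepositoryManagerSketch {
    /** Initialization-on-demand holder: the instance is created lazily and thread-safely. */
    private static final class Holder {
        static final RepositoryManagerSketch INSTANCE = new RepositoryManagerSketch();
    }

    // The database name is a placeholder chosen for this sketch.
    private final RepositorySketch repository = new RepositorySketch("mdr-database");

    private RepositoryManagerSketch() { }

    static RepositoryManagerSketch getInstance() { return Holder.INSTANCE; }

    RepositorySketch getRepository() { return repository; }

    public static void main(String[] args) {
        RepositorySketch first = RepositoryManagerSketch.getInstance().getRepository();
        RepositorySketch second = RepositoryManagerSketch.getInstance().getRepository();
        System.out.println("same repository: " + (first == second));
    }
}
```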

Figure 19 – Java Class Diagrams for the Repository

As presented in Figure 19, Repository has methods for creating and retrieving the AdministeredItems managed in the Repository. The purpose of the MDR API is to provide an easy-to-use, domain independent API for semantic metadata registries/repositories. To achieve this, an approach different from directly implementing the ISO/IEC 11179 metamodel is followed, without any conflict with the metamodel. Since each AdministeredItem has to have a Context, Context is designed to be the base for the other AdministeredItem types. Contexts are created on the Repository, and other types of AdministeredItems are created on the Context to which they belong.

Repository is designed to be the entry point, as mentioned. Contexts are designed to represent the data models or data element dictionaries to which DataElements and other AdministeredItems belong. With this design, the MDR API both enforces these semantics of the ISO/IEC 11179 metamodel and makes them easy for users to work with.

According to the ISO/IEC 11179 specifications, ConceptualDomains are sets of ValueMeanings, which may be enumerated or expressed by a description, and can be viewed as logical code sets. They are used to support DataElementConcepts and focus on semantics. ValueDomains, on the other hand, are the physical code sets, or sets of PermissibleValues; they focus on the physical representation of those concepts and are used to support DataElements. Following this specification, ConceptualDomains are considered semantic domains independent of any physical representation, whereas Contexts are data models consisting of physical elements. ConceptualDomains are thus context independent resources capturing the semantics, while ValueDomains are defined on a Context and focus on the representation of those semantics. In the MDR API, ConceptualDomain objects are managed by the Repository, whereas ValueDomains are managed by the Context, which corresponds to a data model. In this way, these semantics of ISO/IEC 11179 are reflected naturally in the API and offered to the user in an easy-to-use way, as can be seen in Figure 19.

According to the ISO/IEC 11179 family of standards, each AdministeredItem should have a Context in which it is named and defined. AdministeredItems other than Context and ConceptualDomain are already created on a Context, so this requirement is satisfied naturally for them. For full conformance with ISO/IEC 11179, a parent Context, which is the Context of all Contexts and ConceptualDomains, is created and automatically assigned to them when they are created on the Repository. This Context is embedded in the ISO/IEC 11179 ontology, as described in section 6.1. Other resources, such as Organization and Language Section, are also created for the parent Context. However, all these resources are hidden from the user. For example, when the user executes Repository#listContexts(), the Contexts defined on the parent Context are returned. All these operations are handled internally, for ease of use and to enforce the semantics of ISO/IEC 11179.
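The hidden parent Context can be pictured with the short sketch below. The class and method names, the name "repository-context" and the in-memory list are hypothetical; the sketch only shows that every user-created Context is silently attached to a built-in root, and that listing Contexts returns the children of that root rather than the root itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical repository that keeps a built-in parent Context out of the user's sight. */
final class HiddenRootRepositorySketch {
    /** Minimal Context: a name plus the Context in which it is itself named and defined. */
    static final class Ctx {
        final String name;
        final Ctx definedIn;

        Ctx(String name, Ctx definedIn) {
            this.name = name;
            this.definedIn = definedIn;
        }
    }

    // The Context of all Contexts; its name is made up for this sketch.
    private final Ctx root = new Ctx("repository-context", null);
    private final List<Ctx> stored = new ArrayList<>();

    HiddenRootRepositorySketch() { stored.add(root); }

    /** The caller never mentions the root; it is assigned automatically. */
    Ctx createContext(String name) {
        Ctx context = new Ctx(name, root);
        stored.add(context);
        return context;
    }

    /** Returns the Contexts defined on the root, never the root itself. */
    List<String> listContexts() {
        List<String> names = new ArrayList<>();
        for (Ctx context : stored) {
            if (context.definedIn == root) {
                names.add(context.name);
            }
        }
        return names;
    }

    /** Number of stored Contexts including the hidden root. */
    int storedCount() { return stored.size(); }

    public static void main(String[] args) {
        HiddenRootRepositorySketch repository = new HiddenRootRepositorySketch();
        repository.createContext("OMOP");
        repository.createContext("SDTM");
        System.out.println("visible: " + repository.listContexts() + ", stored: " + repository.storedCount());
    }
}
```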

Repository has methods for managing ConceptualDomains and Contexts, as already explained. Management of the other types of AdministeredItems follows the same approach: DataElementConcepts are created on an ObjectClass by specifying a Property, and ValueDomains are created on a Context by specifying the ConceptualDomain they represent.

Figure 20 – Sequence Diagram of DataElement Creation

The previous paragraphs describe the approach followed in designing the Repository and implementing the ISO/IEC 11179 relational model in the MDR API. The sequence diagram in Figure 20 summarizes this approach. As the diagram shows, creating a DataElement from scratch follows a fixed path. First, a Context is created on the Repository; then an ObjectClass is created on that Context. By adding a Property, a DataElementConcept is created. As the last step, a ValueDomain is specified on the DataElementConcept to create the DataElement. The same path is followed while browsing the repository. This design makes it easier to create and manage CDEs in the SALUS MDR because it requires fewer parameters than the low level API. In the low level API the factory pattern is employed, and all attributes of a resource must be passed as parameters when it is created. In the MDR API, some arguments can be set to their default values and others can be inferred from the sequence in which methods are called. In this way the MDR API becomes easy to use and enforces the semantics of the ISO/IEC 11179 model.

The MDR API also provides a few extra utilities. All DataElements in the MDR can be retrieved, or, by specifying a Context, only the DataElements defined on that Context. All of these retrieval methods accept a limit and an offset, so that the resulting collections can be paginated.
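The creation path and the paginated retrieval described above can be sketched as a chain of calls. All types below (FlowRepository, FlowContext and so on) and their method signatures are hypothetical stand-ins for the MDR API, kept in memory so that the example is self-contained. Each step receives only the one new piece of information it needs, and the rest is inferred from the object on which the method is called.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-ins for MDR API types; names and signatures are assumptions. */
final class FlowRepository {
    final List<FlowContext> contexts = new ArrayList<>();

    FlowContext createContext(String name) {
        FlowContext context = new FlowContext(name);
        contexts.add(context);
        return context;
    }
}

final class FlowContext {
    final String name;
    final List<FlowDataElement> dataElements = new ArrayList<>();

    FlowContext(String name) { this.name = name; }

    FlowObjectClass createObjectClass(String objectClassName) {
        return new FlowObjectClass(this, objectClassName);
    }

    /** Returns at most 'limit' DataElements, skipping the first 'offset'. */
    List<FlowDataElement> getDataElements(int limit, int offset) {
        int from = Math.min(Math.max(offset, 0), dataElements.size());
        int to = (int) Math.min((long) from + Math.max(limit, 0), dataElements.size());
        return new ArrayList<>(dataElements.subList(from, to));
    }
}

final class FlowObjectClass {
    final FlowContext context;
    final String name;

    FlowObjectClass(FlowContext context, String name) {
        this.context = context;
        this.name = name;
    }

    /** Adding a Property to an ObjectClass yields a DataElementConcept. */
    FlowConcept createDataElementConcept(String propertyName) {
        return new FlowConcept(this, propertyName);
    }
}

final class FlowConcept {
    final FlowObjectClass objectClass;
    final String propertyName;

    FlowConcept(FlowObjectClass objectClass, String propertyName) {
        this.objectClass = objectClass;
        this.propertyName = propertyName;
    }

    /** Specifying a ValueDomain on a DataElementConcept yields a DataElement in the same Context. */
    FlowDataElement createDataElement(String valueDomainName) {
        FlowDataElement element = new FlowDataElement(this, valueDomainName);
        objectClass.context.dataElements.add(element);
        return element;
    }
}

final class FlowDataElement {
    final FlowConcept concept;
    final String valueDomainName;

    FlowDataElement(FlowConcept concept, String valueDomainName) {
        this.concept = concept;
        this.valueDomainName = valueDomainName;
    }

    String describe() {
        return concept.objectClass.context.name + ": " + concept.objectClass.name
                + "." + concept.propertyName + " [" + valueDomainName + "]";
    }
}
```

A call such as `new FlowRepository().createContext("OMOP").createObjectClass("Person").createDataElementConcept("Gender").createDataElement("HL7 Administrative Gender")` walks the whole path, and `getDataElements(limit, offset)` on the Context then pages through the result.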

One of the most important features of the MDR API is its search utilities. Each Repository object comes with an abstract QueryFactory, which holds several SPARQL query templates to be executed on the underlying triple store. The Semantic Data Manipulation API offers methods to manage only the attributes and relationships of the relational model, since it is just an implementation of that model. In the MDR API, by contrast, any relation in the underlying triple store can be extracted with these query templates. For example, in the Semantic Data Manipulation API, the DataElements of a DataElementConcept can be retrieved. Via QueryFactory, DataElements can additionally be obtained from the whole Repository, for a given Context, and so on.

Another benefit of the QueryFactory is full-text search. As described in section 6.2.1, the graph stored in the dataset is indexed. The methods of QueryFactory execute SPARQL templates that perform text search on the graph. In this way, the Repository interface provides methods to search DataElements, Properties, ConceptualDomains, etc.
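A query template of the kind held by a QueryFactory might look like the sketch below. The namespace http://example.org/mdr#, the property mdr:definedInContext and the class name QueryTemplateSketch are placeholders, since the actual ontology IRIs and template texts are not reproduced in this document. The sketch fills a SPARQL SELECT template with a Context IRI, a limit and an offset, and rejects IRIs that could break out of the angle brackets.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical SPARQL template holder in the spirit of QueryFactory. */
final class QueryTemplateSketch {
    // Placeholder namespace and property names; the real ontology IRIs are not given in the text.
    static final String DATA_ELEMENTS_OF_CONTEXT =
            "PREFIX mdr: <http://example.org/mdr#>\n"
            + "SELECT ?de WHERE {\n"
            + "  ?de a mdr:DataElement ;\n"
            + "      mdr:definedInContext <%s> .\n"
            + "}\n"
            + "ORDER BY ?de\n"
            + "LIMIT %d OFFSET %d";

    // Characters that must not appear inside <...> in a SPARQL IRI reference.
    private static final Pattern ILLEGAL_IRI_CHARS = Pattern.compile("[<>\"{}|\\\\^`\\s]");

    /** Fills the template; throws if the IRI or the paging values are unusable. */
    static String dataElementsOfContext(String contextIri, int limit, int offset) {
        if (contextIri == null || contextIri.isEmpty() || ILLEGAL_IRI_CHARS.matcher(contextIri).find()) {
            throw new IllegalArgumentException("not a usable IRI: " + contextIri);
        }
        if (limit < 0 || offset < 0) {
            throw new IllegalArgumentException("limit and offset must not be negative");
        }
        // Locale.ROOT keeps the digits ASCII regardless of the platform locale.
        return String.format(Locale.ROOT, DATA_ELEMENTS_OF_CONTEXT, contextIri, limit, offset);
    }

    public static void main(String[] args) {
        System.out.println(dataElementsOfContext("http://example.org/context/omop", 10, 20));
    }
}
```

The resulting string would be handed to whatever SPARQL engine the triple store exposes; executing it is outside the scope of this sketch.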

6.3 Importers

One of the important features of the SALUS MDR is the automatic population of the MDR Knowledge Base by importing pre-defined content models from external resources. For this purpose, the SALUS MDR has importer interfaces for different types of content models, such as an ontology importer, an XSD¹ importer and a DDL² importer, and also provides implementations for pre-defined content models. Importers have domain dependent implementations, since each content model comes with its own specification. Even if two content models are both represented through XML schemas, the structure of their content differs, so it would not be feasible to implement one XSD Importer that parses all content models defined through XML schemas. In the following sub-sections, the implementation of the importer is detailed for each content model.

Importers use the MDR API provided by the MDR Knowledge Base to operate on the repository. Since the MDR API enforces a particular way of creating and managing CDEs in the repository, as described in the previous section, importers follow the same pattern. First, a Context representing the content model is created. Next, the other resources necessary to create DataElements, such as ObjectClass, Property and ValueDomain, are created on the Context. As the last step, the DataElements in the content model are imported into the Context. With this structure, importers serve as a good example of MDR API usage.

6.3.1 OMOP Common Data Model Importer

The Observational Medical Outcomes Partnership (OMOP) is a public-private partnership designed to protect human health by improving the monitoring of drugs and other regulated medical products for safety and effectiveness. To achieve this aim, OMOP created a set of tools, including a common data model to organize and standardize observational data. The OMOP Importer is designed and implemented to populate the SALUS MDR Repository with CDEs extracted from this Common Data Model (CDM).

The OMOP Common Data Model is specified through a relational model. There are entity-relationship diagrams specifying the OMOP CDM tables and their relations, and DDL files defining the schema of the common data model tables. In the OMOP CDM, most elements are defined through well-known external vocabularies such as ICD-10³ and MedDRA⁴. OMOP also defines the vocabularies, and the concepts used from them, through CSV⁵ or DDL files. All the documents and files mentioned above and used by the OMOP Importer are publicly available on the project website⁶.

In the SALUS MDR, the OMOP Importer is an implementation of the DDL Importer, since the OMOP Common Data Model is specified through a relational model and DDL files. As the first step of the process, the DDL file containing the schema of the OMOP CDM is parsed using the DDL parser from the dbmigrate⁷ project. Tables in the OMOP CDM are treated as the ObjectClasses of the DataElements, so an ObjectClass is created on the OMOP Context for each table in the common data model. Each column of a table is treated as an attribute of that table, that is, as a Property in the OMOP Context that later forms a DataElementConcept together with the ObjectClass. A Property is therefore created for each column, and the DataElementConcepts are created on the OMOP Context. Since each column in the CDM has a pre-defined data type, ValueDomains and data types for the DDL types are created directly.
However, two issues remain to be solved. First, as mentioned earlier, some columns take their values from external vocabularies, so they should be handled as enumerated value sets. Second, almost all tables reference other tables through foreign keys, as the relational model implies. Both can be seen in Figure 21, which shows part of the OMOP CDM entity-relationship diagram.

1 http://www.w3.org/2001/XMLSchema 2 http://en.wikipedia.org/wiki/Data_definition_language 3 http://www.who.int/classifications/icd/en/ 4 http://www.meddramsso.com/index.asp 5 http://en.wikipedia.org/wiki/Comma-separated_values 6 http://omop.fnih.org/CDMvocabV4 7 https://code.google.com/p/dbmigrate/

Figure 21 – Part of the OMOP Relational Model

In the table shown in Figure 21, provider_id is a foreign key reference to the provider table. To handle such references, a special conceptual domain, named ObjectClass, is created on the Repository. Then, to specify that the provider_id column takes its values from the provider table, a data type specific to the Provider ObjectClass is created. In this way, foreign key references are handled correctly without any conflict with the ISO/IEC 11179 metamodel.

The second issue for the OMOP importer is references to external vocabularies. OMOP uses a vocabulary table to specify an external terminology. From the DDL files provided on the OMOP website, an actual database is created for the OMOP model and vocabularies. Then, with simple SQL queries, these vocabularies are retrieved from the database and an EnumeratedConceptualDomain is created for each of them. For example, gender_concept_id of the person table takes its values from HL7 Administrative Gender¹, so the ValueDomain of Person Gender corresponds to the HL7 Administrative Gender ConceptualDomain. Its data type is specified as Coded Description to indicate that the values are codes from a coded value set.

By following the principles explained for the Person ObjectClass and its DataElements, an ObjectClass for each table and a DataElement for each column of each table are created on the OMOP Context. The EnumeratedConceptualDomains referencing external vocabularies are created independently of any Context.
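The table-to-ObjectClass and column-to-Property mapping can be illustrated with the following sketch. It is not the OMOP Importer: instead of the dbmigrate DDL parser it uses a small regular expression that only understands simple CREATE TABLE statements, and it guesses foreign keys from a naming convention (a column named x_id in another table refers to table x), which is an assumption of this sketch and not how the importer detects references. Types with commas, such as NUMERIC(10,2), are not supported.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical, simplified DDL-to-MDR mapping; a regex stands in for a real DDL parser. */
final class DdlMappingSketch {
    private static final Pattern CREATE_TABLE = Pattern.compile(
            "CREATE\\s+TABLE\\s+(\\w+)\\s*\\((.*?)\\)\\s*;",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns table name -> (column name -> DDL type), in declaration order. */
    static Map<String, Map<String, String>> parse(String ddl) {
        Map<String, Map<String, String>> tables = new LinkedHashMap<>();
        Matcher matcher = CREATE_TABLE.matcher(ddl);
        while (matcher.find()) {
            Map<String, String> columns = new LinkedHashMap<>();
            for (String definition : matcher.group(2).split(",")) {
                String[] parts = definition.trim().split("\\s+");
                if (parts.length >= 2) {
                    columns.put(parts[0], parts[1].toUpperCase(Locale.ROOT));
                }
            }
            tables.put(matcher.group(1), columns);
        }
        return tables;
    }

    /** One line per would-be DataElement: ObjectClass (table), Property (column), ValueDomain. */
    static List<String> toDataElements(Map<String, Map<String, String>> tables) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> table : tables.entrySet()) {
            for (Map.Entry<String, String> column : table.getValue().entrySet()) {
                String name = column.getKey();
                // Naming-convention guess: "<table>_id" in another table references <table>.
                String referenced = name.endsWith("_id") ? name.substring(0, name.length() - 3) : null;
                boolean foreignKey = referenced != null
                        && tables.containsKey(referenced)
                        && !referenced.equals(table.getKey());
                String valueDomain = foreignKey ? "reference to ObjectClass " + referenced : column.getValue();
                lines.add(table.getKey() + "." + name + " -> " + valueDomain);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String ddl = "CREATE TABLE provider (provider_id INTEGER, npi VARCHAR(20));\n"
                + "CREATE TABLE person (person_id INTEGER, gender_concept_id INTEGER, provider_id INTEGER);";
        for (String line : toDataElements(parse(ddl))) {
            System.out.println(line);
        }
    }
}
```

Columns such as gender_concept_id, which take their values from a vocabulary, would additionally need a lookup against the vocabulary tables; that step is left out here.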

6.3.2 SDTM Importer

SDTM (Study Data Tabulation Model) defines a standard structure for human clinical trial data tabulations that are to be submitted as part of a product application to a regulatory authority such as the United States Food and Drug Administration (FDA)². SDTM is defined by the Submission Data Standards team of the Clinical Data Interchange Standards Consortium (CDISC)³. The SDTM model consists of domains and the variables in these domains, and it defines a data type for each variable. The domains can be regarded as tables and the variables as columns. In the ISO/IEC 11179 model, domains correspond to ObjectClasses and variable names correspond to Properties. Combined with the data type specified for the variable, they constitute a DataElement. SDTM defines its domains and variables in PDF documents available on the CDISC website. Although the documentation is complete and well defined, it is not machine processable.

1 http://phinvads.cdc.gov/vads/ViewValueSet.action?id=8DE75E17-176B-DE11-9B52-0015173D1785 2 http://www.fda.gov 3 http://www.cdisc.org

The Biomedical Research Integrated Domain Group (BRIDG)¹ is a collaborative effort that aims to produce a shared view of the dynamic and static semantics of the domain of protocol-driven research and its associated regulatory artifacts. Since CDISC is one of the stakeholders, mappings of the CDISC data models, including SDTM, to the BRIDG model are provided. These mappings are provided as Excel sheets and are available on the project website². Table 5 shows part of these mappings; the first four columns can be used as a complete list of SDTM elements.

Table 5 – Part of SDTM - BRIDG Mapping

SDTM Domain Prefix | SDTM Variable Name | Data Type |  | Class Name | Attribute Name | Data Type | Mapping Path | Mapping Tag
DM | STUDYID | Char | Unique identifier for a study. | DocumentIdentifier | identifier | II | PerformedActivity.PerformedObservation > StudyProtocolVersion > DocumentVersion.StudyProtocolDocumentVersion > Document > DocumentIdentifier.identifier | DM.STUDYID

To obtain a machine-processable list of SDTM Data Elements, the first four columns presented above are exported as a CSV file. SDTM defines its domains under the general observation class. This general observation class is divided into three sub-classes, namely Interventions, Events and Findings, each of which has further sub-classes in a hierarchy. Besides the three main observation classes, there is a special purpose class for domains that are not observations, such as demographics. Figure 22 shows the hierarchy of the CDISC domains that are used for SDTM data elements.

1 http://www.bridgmodel.org 2 http://bridgmodel.nci.nih.gov/download_model/bridg-releases/release-3-2/release-package


Figure 22 - CDISC Standard Domain Hierarchy

Another ongoing open-source project is cdisc2rdf1. Its aim is to make the CDISC standards available through the generic standards of the World Wide Web Consortium, that is, the Semantic Web stack. cdisc2rdf defines the DataElements of SDTM in ontological form using RDF. Although SDTM itself does not define Data Elements for the general observation classes, cdisc2rdf provides data elements for these classes together with their semantic relations.

1 http://cdisc2rdf.com


Although the SDTM Data Elements are produced from CSV files or Excel sheets, SDTM is treated as an ontological model. For this reason, the SDTM importer is designed as an implementation of the ontology importer. As the first step, a context named SDTM is created in the repository to serve as the context of all imported resources. Since the Data Elements to be imported form a flat list, every line of the CSV file corresponds to one DataElement. After the CSV file is parsed with Java utilities, the first value of every line is created as an ObjectClass in the SDTM Context. Next, a Property corresponding to each variable is created in the Context and combined with the ObjectClass of its domain to form a DataElementConcept. In SDTM, variables take either text or numerical values. Unlike OMOP, there is no reference to other domains in the model or to an external terminology, so only two NonEnumeratedValueDomains with the appropriate data type are created in the Context. Finally, each DataElementConcept is combined with the proper value domain to form a DataElement.

As mentioned earlier in this section, the cdisc2rdf project represents the general observation classes and their domains as ontological resources. It defines data elements at the level of the Findings, Events and Interventions classes, without going into the sub-domains of the hierarchy. In the SALUS Semantic MDR, CDEs are created at the finest level of granularity, that is, with Object Classes such as Adverse Events, Exposures and Vital Signs. To comply with cdisc2rdf, extra functionality is added to the MDR API to create a hierarchy between Concepts, and hence between Object Classes. In the MDR API this relationship is exposed to the user as a parent-child relationship. At the triple level, since all AdministeredItem classes are also ontological classes, the parent-child relationship is expressed with rdfs:subClassOf. In this way, compliance with cdisc2rdf is ensured without breaking any standards. For example, in the Semantic MDR, CDEs with the Object Class Adverse Event (AE) also have the parent Object Class Events through the hierarchy of ISO/IEC 11179 Concepts.
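The import steps described above can be sketched as follows. This is only an illustration in Python, not the Java importer of the SALUS MDR: the DataElement record, the VALUE_DOMAINS and PARENT_CLASS tables and the function names are hypothetical, the CSV is assumed to have no header row, and the AE row in the sample is added for illustration next to the DM.STUDYID row of Table 5.

```python
import csv
import io
from dataclasses import dataclass


# Hypothetical stand-in for an ISO/IEC 11179 DataElement; not the MDR API class.
@dataclass(frozen=True)
class DataElement:
    context: str
    object_class: str    # SDTM domain prefix, e.g. "DM"
    property_name: str   # SDTM variable name, e.g. "STUDYID"
    value_domain: str    # one of the two non-enumerated value domains


# SDTM variables are either text or numeric, so two value domains are enough.
VALUE_DOMAINS = {"Char": "Text", "Num": "Numeric"}

# Assumed parents for a few domains, following the general observation classes.
PARENT_CLASS = {"AE": "Events", "EX": "Interventions", "VS": "Findings"}


def import_sdtm(csv_text, context="SDTM"):
    """Create one DataElement per CSV line and collect the object classes."""
    object_classes = set()
    data_elements = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        domain, variable, data_type = (cell.strip() for cell in row[:3])
        object_classes.add(domain)  # the first value of the line is the ObjectClass
        data_elements.append(
            DataElement(context, domain, variable, VALUE_DOMAINS[data_type])
        )
    return object_classes, data_elements


def subclass_triples(object_classes):
    """Express the parent-child relation as rdfs:subClassOf statements."""
    return [
        (child, "rdfs:subClassOf", PARENT_CLASS[child])
        for child in sorted(object_classes)
        if child in PARENT_CLASS
    ]


sample = (
    "DM,STUDYID,Char,Unique identifier for a study.\n"
    "AE,AESEQ,Num,Sequence number\n"
)
classes, elements = import_sdtm(sample)
triples = subclass_triples(classes)
```

Keeping the hierarchy in a separate step mirrors the text: the flat list yields the fine-grained Object Classes first, and the link to the cdisc2rdf observation classes is added afterwards.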

6.3.3 CDASH Importer

The Clinical Data Acquisition Standards Harmonization (CDASH) defines basic standards for the collection of trial data. Like SDTM, CDASH is a standard defined by CDISC, and the two define the same domains and variables with only minor differences. The CDASH standard is also published as PDF documents on the CDISC website. For most data elements, the specification gives a direct mapping to SDTM elements; however, this specification is again only human readable. Since CDASH is part of BRIDG and closely resembles SDTM, a mapping to the BRIDG model is also defined for the CDASH Data Elements. The CDASH Importer therefore follows almost the same methodology: a CSV file is first exported from the BRIDG mapping; the exported file is then parsed, Concepts are created from the domains and Properties from the domain variables; finally, a DataElement is created in the CDASH Context from each line of the CSV file. As only a few literal data types (textual, numerical, date-time, etc.) exist in the CDASH model, creating the DataElements and ValueDomains was as trivial as in the SDTM Importer.

6.3.4 HITSP Importers

According to the explanation on their website, “The Healthcare Information Technology Standards Panel (HITSP) is a cooperative partnership between the public and private sectors. The Panel was formed for the purpose of harmonizing and integrating standards that will meet clinical and business needs for sharing information among organizations and systems.” For this purpose, HITSP defines a set of Data Elements which are used in HITSP documents. The document containing the specifications and the complete list of Data Elements is available on the HITSP website1.

However, this document is only available in PDF format and, as with the other content models used by the importers, there is no machine-processable list of its Data Elements. We therefore constructed a machine-processable version of the Data Element dictionary in XML format by parsing the PDF documents. A part of this XML is shown in Figure 23.

Figure 23 – HITSP Data Elements XML File

Creating the Data Elements is a straightforward process, as in the SDTM and CDASH importers. First, Object Classes are created from the parent elements; the children of these elements are then added as Properties to form Data Element Concepts. Data Elements are then created by combining these with the proper Value Domains. In this way, the HITSP C 154 Data Dictionary is imported easily.

One difference between the HITSP C 154 model and the other content models is the CDAMappings part seen in the XML in Figure 23. A CDAMappings element gives the mapping from a HITSP C 154 Data Element to elements of the Clinical Document Architecture2 (CDA) model. Since the CDA model is based on XML, the mappings are given as the locations of XML nodes expressed in XPath. Normally, these mappings are given in another specification named HITSP C 83, but C 83 and C 154 were combined while constructing the machine-processable XML. These mappings correspond to the extraction specifications for the Data Elements as explained in Section 0. In the ISO/IEC 11179 model, Classification Scheme and Classification Scheme Item are used to define extraction specifications. Since the mappings follow the CDA content model, a Classification Scheme named CDA is created in the HITSP C 154 Context. Then, for each extraction specification, a Classification Scheme Item is created as a child of the CDA Classification Scheme, its type attribute is set to XPATH, and its value is set to the value of the CDAMappings attribute. In this way, all extraction specifications are defined using ISO/IEC 11179 constructs.

Another difference of the HITSP C 154 Data Elements is the constraints on the Data Elements. Some constraints are given in human-readable form, but most define the value set of a Data Element in terms of well-known terminologies, which are identified by OID3. The complete list of terminologies used by HITSP Data Elements is given in another document, named HITSP C 80. This list is also exported to XML to make it machine processable; the Conceptual Domains and corresponding Value Domains in the HITSP C 154 Context are then created by parsing this XML file. When a Data Element has a constraint given by an OID, its Value Domain is set using this OID.
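As an illustration of the HITSP import described above, the following Python sketch walks a small XML fragment and produces Data Elements together with Classification Scheme Items of type XPATH. The XML layout is an assumption, since Figure 23 is not reproduced here: only the CDAMappings element name comes from the text, while the Module and DataElement elements, the attribute names, the identifier and the OID are placeholders.

```python
import xml.etree.ElementTree as ET

# Assumed layout; only the CDAMappings element name is taken from the text.
SAMPLE = """
<DataElements>
  <Module name="Personal Information">
    <DataElement id="X.01" name="Gender" valueSetOid="1.2.3.4.5">
      <CDAMappings xpath="cda:patient/cda:administrativeGenderCode"/>
    </DataElement>
    <DataElement id="X.02" name="Comment"/>
  </Module>
</DataElements>
"""


def import_hitsp(xml_text):
    """Return (data elements, classification scheme items) for the fragment."""
    root = ET.fromstring(xml_text.strip())
    data_elements, scheme_items = [], []
    for module in root.findall("Module"):
        object_class = module.get("name")  # parent element becomes the Object Class
        for child in module.findall("DataElement"):
            prop = child.get("name")       # child element becomes the Property
            oid = child.get("valueSetOid")
            data_elements.append({
                "objectClass": object_class,
                "property": prop,
                # A constraint given by OID selects the Value Domain;
                # otherwise fall back to a plain text value domain.
                "valueDomain": oid if oid else "Text",
            })
            for mapping in child.findall("CDAMappings"):
                # One Classification Scheme Item per extraction specification.
                scheme_items.append({
                    "scheme": "CDA",
                    "type": "XPATH",
                    "value": mapping.get("xpath"),
                    "dataElement": f"{object_class}.{prop}",
                })
    return data_elements, scheme_items


hitsp_elements, hitsp_items = import_hitsp(SAMPLE)
```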

1 http://www.hitsp.org/ConstructSet_Details.aspx?&PrefixAlpha=4&PrefixNumeric=154 2 http://en.wikipedia.org/wiki/Clinical_Document_Architecture 3 http://en.wikipedia.org/wiki/Object_identifier


6.4 REST1 API

The previous sub-sections focus on the internal details of the SALUS MDR and the Java API it exposes. However, the Java API is not the only interface of the SALUS MDR; it is designed to be used by other components of the SALUS project. As explained in the overall architecture of the SALUS MDR, two other interfaces are exposed to the outside world, namely a RESTful API and a graphical user interface. This section describes the internal details of the RESTful API.

The purpose of the RESTful API is to provide platform-independent services for using the functionality of the SALUS MDR. With these services, the SALUS MDR also ensures compatibility with the Linked Data approach. To follow the Linked Data approach, every resource should have a unique dereferenceable URI and resources should link to other resources. According to the ISO/IEC 11179 specification, each Administered Item should have a unique identifier. In the SALUS MDR, ISO/IEC 11179 is modelled as an ontology and all items are represented as RDF resources, so the SALUS MDR assigns a URI to every item. To make these URIs dereferenceable, the REST API provides the following services:

• URI: returns a serialization of the resource specified by URI.
• UUID2: returns a serialization of the resource specified by UUID.
• SPARQL3: returns the result of executing the given SPARQL query.

The first service is exposed under the URL /sparql/serizalization/uri/{uri}. The trailing {uri} is a path parameter that specifies the URI of the desired resource. This service executes a SPARQL CONSTRUCT query on the underlying triple store: the SPARQL query executor takes {uri} as a parameter, constructs a new graph by adding all properties of the given resource to the model, and returns a serialization of the resulting model.

The second service returns the serialization of the resource specified by UUID. As already mentioned, the SALUS MDR generates a UUID for each Administered Item to follow the ISO/IEC 11179 specification. The generated UUID is stored as the data identifier in the Administration Record of the Administered Item. A simple SPARQL SELECT query finds the resource with the specified data identifier (UUID); passing the result to the first service yields the serialization of that resource. This service is deployed under the path /sparql/serialization/uuid/{uuid}.

The last service provided by the RESTful API is a SPARQL endpoint. It is deployed under /sparql/query, accepts well-formed SPARQL 1.0 queries, and returns the serialization of the result obtained by executing the query on the underlying triple store.

The services of the RESTful API only allow retrieval of items in the repository and do not allow any modification of the underlying dataset. Therefore, the current implementation of the RESTful API does not require authentication and responds to all well-formed HTTP requests.
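The queries behind the first two services might look roughly like the following Python sketch. It is an assumption about their shape, not the queries used by the SALUS MDR: the MDR namespace and the administrationRecord and dataIdentifier property names are hypothetical, and the base URL passed to uuid_service_url is a placeholder.

```python
from urllib.parse import quote

MDR = "http://example.org/mdr#"  # hypothetical namespace for the MDR ontology


def construct_query(uri):
    """CONSTRUCT query that copies every property of one resource into a new graph."""
    return f"CONSTRUCT {{ <{uri}> ?p ?o }} WHERE {{ <{uri}> ?p ?o }}"


def uuid_lookup_query(uuid):
    """SELECT query that finds the item whose Administration Record holds the UUID."""
    return (
        "SELECT ?item WHERE { "
        f"?item <{MDR}administrationRecord> ?record . "
        f'?record <{MDR}dataIdentifier> "{uuid}" '
        "}"
    )


def uuid_service_url(base, uuid):
    """URL of the UUID service; the UUID is percent-encoded as a path parameter."""
    return f"{base}/sparql/serialization/uuid/{quote(uuid, safe='')}"


def encode_path_parameter(uri):
    """A resource URI must be fully percent-encoded before use as a path segment."""
    return quote(uri, safe="")


example_query = construct_query("http://example.org/mdr/item1")
example_url = uuid_service_url(
    "http://localhost:8080", "123e4567-e89b-12d3-a456-426614174000"
)
```

Resolving a UUID is then a two-step process, as in the text: run the SELECT query to obtain the item URI, then run the CONSTRUCT query for that URI.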

6.5 Graphical User Interface

As presented in the overall component diagrams, the SALUS MDR also provides a Web-based Graphical User Interface. The aim of the Web GUI is to provide an easy-to-use interface for the basic operations of the SALUS MDR, such as running importers, browsing contexts and data elements, searching data elements, and basic create and update operations. It also adds authentication, since it allows modification of the MDR Knowledge Base.

1 http://en.wikipedia.org/wiki/Representational_state_transfer 2 http://en.wikipedia.org/wiki/Universally_unique_identifier 3 http://www.w3.org/TR/rdf-sparql-query/


The following sub-sections focus on the authentication service, the REST services and the design of the graphical user interface, respectively.

6.5.1 Authentication Service

The GUI provides not only browsing but also modification of the MDR Knowledge Base. In addition, according to the ISO/IEC 11179 specification, each Administered Item should have a record of its registration and administration; in other words, the registering and administering user and organization should be tracked. For these reasons, an authentication mechanism is introduced to control modifications of the MDR Knowledge Base and to keep track of their source. The mechanism can be summarized as follows: a user first logs in to the system with his/her credentials through the authentication service. After validating the credentials, this service returns a session id associated with the user. Subsequently, a request is executed on the MDR only if the session id sent as a cookie is valid. This process is shown as a sequence diagram in Figure 24.

Figure 24 – Sequence Diagram for Authentication Mechanism

To keep the user and session information, an Apache Derby1 relational database is created with three tables, namely User, Organization and Session. Apache Derby is chosen because it can run in embedded mode: it does not have to run as a separate service but can be started inside the Java Virtual Machine, in which case it is accessible only from that virtual machine through JDBC2. When a user registers, entries are created in the User and Organization tables. Later, when the user makes a request to the authentication service, the credentials are validated against the database. If the user exists, a random session id associated with the user is generated and stored in the Session table; it is valid for one day or for 10 years, according to the user’s choice. The client should then set a cookie named SID and send it with every request. In this way the SALUS MDR authenticates the user and can extract the user information from the request. If the login credentials or the session id are not valid, a response with status 401 is sent back.
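The session logic described above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the SALUS implementation: in-memory dictionaries stand in for the Derby User and Session tables, the sample user and the function names are hypothetical, passwords are compared in plain text only for brevity, and 10 years is approximated as 3650 days.

```python
import secrets
from datetime import datetime, timedelta, timezone

USERS = {"alice": "s3cret"}  # stand-in for the User table (plain text for brevity)
SESSIONS = {}                # stand-in for the Session table: sid -> (user, expiry)

ONE_DAY = timedelta(days=1)
TEN_YEARS = timedelta(days=3650)


def login(username, password, remember=False, now=None):
    """Return (status, sid); sid is None when the credentials are rejected."""
    now = now or datetime.now(timezone.utc)
    if username not in USERS or USERS[username] != password:
        return 401, None
    sid = secrets.token_hex(16)  # random session id
    lifetime = TEN_YEARS if remember else ONE_DAY
    SESSIONS[sid] = (username, now + lifetime)
    return 200, sid


def authenticate(cookies, now=None):
    """Return (status, username) for a request carrying an SID cookie."""
    now = now or datetime.now(timezone.utc)
    session = SESSIONS.get(cookies.get("SID"))
    if session is None or session[1] <= now:
        return 401, None  # unknown or expired session id
    return 200, session[0]


status, sid = login("alice", "s3cret")
```

A client would place the returned value in a cookie named SID and send it with each later request, which authenticate then checks before the request reaches the MDR.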

1 http://db.apache.org/derby/ 2 http://en.wikipedia.org/wiki/Java_Database_Connectivity


6.5.2 REST Services

The previous sections describe the implementation of the SALUS MDR in detail and present the Java API that exposes its functionality. As stated at the beginning of this section, the SALUS MDR also exposes REST Services specialized for easy use of the SALUS MDR functionality. These REST Services can be considered a web service implementation of the MDR API provided by the MDR Knowledge Base. Apart from authentication, the services perform the necessary operations on the MDR Knowledge Base through the MDR API, so most services can be considered RESTful versions of the related Java methods. In addition, there are a couple of services for running the importers via REST requests. REST services responsible for semantically similar functionality are grouped under the same path. For example, methods of the Repository in the MDR API such as createContext(), createConceptualDomain() and listContexts() correspond to services under the path /repository.

In the ISO/IEC 11179 relational model, entities are complex and there are many attributes and relationships among entities. It is therefore not practical to use the complete model in the Web-based GUI. Instead, relatively simple models are extracted from the ISO/IEC 11179 objects and used in the REST Services, which makes the services simpler and faster. Figure 25 shows the attributes of Data Element chosen from the complete model.

Figure 25 – Class Diagram of Web GUI Simpler Models

JSON1 is chosen as the serialization format for these simpler models, for several reasons. First, it is simpler and less verbose than counterparts such as XML. Second, JavaScript2 is chosen for client-side programming and provides native support for JSON. Third, JSON serialization on the Java side is done automatically by Jersey3, the JAX-RS4 framework used for the Web Service implementation, thanks to the bean-like models of the Web GUI. Figure 26 presents a Data Element imported by the OMOP Importer.
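The idea of reducing a rich ISO/IEC 11179 record to a simple, JSON-friendly model can be sketched in Python as follows. The field selection and the structure of the full record are assumptions made for illustration (Figures 25 and 26 define the actual attributes), the definition text is invented, and the standard json module stands in for the automatic serialization that Jersey performs on the Java side.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SimpleDataElement:
    """Reduced view of a Data Element; the chosen fields are assumptions."""
    id: str
    name: str
    definition: str
    context: str


def to_simple(full_record):
    """Pick a few attributes out of a richer, nested record."""
    return SimpleDataElement(
        id=full_record["administrationRecord"]["dataIdentifier"],
        name=full_record["designation"]["name"],
        definition=full_record["definition"]["text"],
        context=full_record["context"]["name"],
    )


# Hypothetical full record with nesting that the GUI does not need.
full = {
    "administrationRecord": {
        "dataIdentifier": "123e4567-e89b-12d3-a456-426614174000",
        "registrationStatus": "Recorded",
    },
    "designation": {"name": "PERSON GENDER", "language": "en"},
    "definition": {"text": "Gender of the person (illustrative text)."},
    "context": {"name": "OMOP"},
}

payload = json.dumps(asdict(to_simple(full)), sort_keys=True)
```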

1 http://www.json.org 2 http://en.wikipedia.org/wiki/JavaScript 3 http://jersey.java.net 4 http://en.wikipedia.org/wiki/Java_API_for_RESTful_Web_Services


Figure 26 – JSON Serialization of a Data Element, PERSON GENDER

Complete set of REST services with descriptions can be found at: https://github.com/sinaci/semanticMDR/blob/master/web/README.MD

6.5.3 Graphical User Interface

The Web GUI is designed to be the user interface of the MDR API exposed to the outside world. Users of the SALUS MDR should be able to issue their requests through the Web GUI and see the responses of the SALUS MDR. Hence, the Web GUI provides the functionality of the MDR API in a graphical user interface through Web Services. This section presents the design and implementation details of the SALUS MDR Web GUI.

For client-side implementation, there are numerous JavaScript frameworks providing different functionality at different levels. For the SALUS MDR, Backbone1 and its extension Marionette2 are chosen. Backbone is essentially an MVC3 architecture that provides a rich API for models and collections of models, and easy integration with existing JSON-based REST Services. Because of its model-view separation and JSON-based REST support, Backbone fits the requirements of the SALUS MDR well. Marionette is an extension built upon the constructs provided by Backbone; its aim is to simplify building large-scale, heavy JavaScript applications. It provides many utilities for managing browser history, handling JavaScript events and synchronizing with the server side through REST. It also provides application and module constructs that make JavaScript code more structured. Because of these advantages, Marionette is chosen as the main implementation framework for the Web-based graphical user interface of the SALUS MDR.

Section 6.5.2 describes the simplified models that the REST Services use for the graphical user interface. These simplified models are mapped to Backbone.Models on the client side: for each simplified model sent through the REST Services, a Backbone.Model is defined and the URL attribute of the model is set to the URL of the related service. In this way, Backbone handles client-server interactions automatically through JSON serializations. The necessary views for each item or collection are then defined so that they can be rendered on the screen as desired.

1 http://backbonejs.org 2 http://marionettejs.org 3 http://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller


Figure 27 – Main Screen of Web GUI

Figure 27 presents the main screen of the SALUS MDR Web GUI. In the upper right corner, information about the authenticated user is shown along with a logout button that invalidates the session. At the top is the main menu of the SALUS MDR. The right side of the menu shows the current context and provides a dropdown menu to change it. Below are the back and forward buttons for navigating the history, followed by the main content of the GUI. Finally, below the content, there is a pagination bar for paginated browsing of repository items.

When a user first logs in to the system, the list of contexts is retrieved from the Repository and the Object Classes of the first context are shown on the main screen. Whenever the context is changed through the dropdown, the Object Classes of the chosen context are displayed in the main content, so navigation always starts from the Object Classes. From the browse menu, the user can also choose to navigate Data Elements, in which case the Data Elements of the chosen context are displayed. The browse menu can also display the Conceptual Domains of the Repository, since a Conceptual Domain is independent of Context.

Another way to navigate the Data Elements of a Context is to start from an Object Class. When the right arrow of a Concept is clicked, the Data Element Concepts derived from that Object Class are displayed. When the right arrow of a Data Element Concept is clicked, a detailed view of the Data Element of that Data Element Concept is displayed. Figure 28 and Figure 29 present this path from the OMOP Context to the PERSON GENDER Data Element.

Figure 28 – Data Element Concepts of Person Object Class


Figure 29 – PERSON GENDER Detailed View

Each item displayed on the main screen has an edit button on the left. This button opens a modal window filled with the details of the item and allows the user to modify the item within certain limitations. Figure 30 presents a screenshot of the update view of a Data Element Concept.

Figure 30 – Modal Window to Modify PERSON GENDER

The same modal windows are also used for creating ISO/IEC 11179 items. The Add Data Element Concept button can be seen in Figure 28. When this button is clicked, the same modal window is opened with an empty model, and a new object is created in the MDR Knowledge Base in the chosen Context with the given information.

The main menu has another dropdown menu for the importers. Currently there are 4 importers, as explained in section 6.3. From this dropdown, importers can be triggered to populate the SALUS MDR. When the import process finishes, the page is reloaded and the newly imported Context appears in the Context dropdown.

As described in section 6.2.3, the SALUS MDR supports full-text search over the Data Elements. This functionality is also provided in the Web GUI so that users can search Data Elements by name.


Figure 31 shows the Data Elements found by a search with the keyword Person. For each Data Element, the Name, Definition and Context are shown. The arrow at the right side of each row opens the detailed view of the corresponding Data Element.

As seen in all figures of the graphical user interface, there are back and forward buttons on each page. These simply trigger the browser’s back and forward buttons. The browser history can be controlled with the Backbone.history module: any location in the graphical user interface can be saved in the browser’s history using hash tags, and this history can later be navigated with the utilities provided by Backbone.history. In this way, the SALUS MDR Web GUI provides a navigation utility for a better user experience.

Figure 31 – Result of Person Search


7 IHE DEX PROFILE

The SALUS Project, in particular Task 4.2, is in close cooperation with the IHE Quality, Research and Public Health Domain (QRPH) Committee [6]. In particular, Gokce B. Laleci (SRDC), Ali Anil Sinaci (SRDC), Christel Daniel (INSERM) and Dr. Landen Bain (CDISC) are collaboratively developing a new interoperability profile, namely the IHE Data Element Exchange (DEX) Profile. The objective of the DEX profile is to enable re-use of the Electronic Health Record (EHR) in clinical research and public health studies, such as:

• lessening the burden of clinical trial data collection and optimizing it through the targeted re-purposing of EHR data during a trial’s execution phase (Pre-population of a Research Case Report Form (CRF));
• leveraging routinely collected clinical data for adverse event detection and reporting (Screening clinical data for ADE detection and notification and Pre-population of ICSR reports);
• providing a better understanding of the available cohorts based on the trial’s Inclusion and Exclusion criteria during trial design (Eligibility Determination); and
• use of routinely collected clinical data for conducting retrospective observational studies.

An existing set of IHE profiles (RFD, CRD, and Redaction) provides a way to export standard EHR documents on demand and to map the elements to a research specification, namely CDISC CDASH annotated Operational Data Model (ODM) files. The problem is that the mapping is done using a ‘one size fits all’ approach: an XSLT that maps a generic CCD to generic CDASH annotated CRF forms in ODM format. This generic approach provides one map to use for any and all case report forms, despite the wide variance among these forms. The IHE DEX profile argues that integrating the patient care and clinical research domains requires a standards-based, expressive and scalable semantic interoperability framework that allows dynamic mappings between data elements and the semantics of varying data sources. This can be achieved through a metadata registry architecture where machine-processable definitions of data elements across domains can be shared, re-used, and semantically interlinked with each other, addressing this semantic interoperability challenge and moving towards EHR-enabled research. Through the DEX profile, we aim to enable retrieving “extraction specifications” for a data element defined in a selected domain (like SDTM data elements) from an implementation-dependent content model in another domain (like HL7 CCD).

The DEX Profile will support study feasibility, patient eligibility and recruiting, adverse event reporting, retrospective observational studies as well as case report form pre-population. DEX is especially useful in making use of existing standard export documents such as CCD and CCDA.

7.1 DEX Actors, Transactions

Figure 32 – IHE DEX Profile transactions

As presented in Figure 32 and Table 6, two actors, namely “Metadata Consumer” and “Metadata Source”, interact through a single transaction, “Retrieve Metadata”. The Metadata Consumer is responsible for importing the metadata created by the Metadata Source. The Metadata Source is responsible for creating metadata on request from the Metadata Consumer, and is associated with a metadata repository.

[Figure 32 diagram: Metadata Consumer → Retrieve Metadata [QRPH-Y1] → Metadata Source]


Table 6 – DEX Profile - Actors and Transactions

| Actors | Transactions | Optionality | Reference |
|---|---|---|---|
| Metadata Consumer | Retrieve Metadata [QRPH-Y1] | R | QRPH TF-2: 3.Y1 |
| Metadata Source | Retrieve Metadata [QRPH-Y1] | R | QRPH TF-2: 3.Y1 |

7.2 DEX Profile Use Cases

The fundamental concept of DEX is the re-use of EHR data in support of a clinical research study. This support applies to clinical study feasibility, eligibility determination, subject recruiting, repurposing of EHR data for observational studies, and data capture during clinical study execution. In the data capture use case, the EHR data are used to pre-populate, where possible, the data elements of a case report form. This set of data elements is collectively called pre-population data.

Use Case #1 is patient-centric, since it concerns a patient who has been recruited into a given clinical trial. The source system is the EHR. The patient has given full informed consent for the extraction of data from his EHR and for the addition of new information to the patient record.

Use Cases #2 and #3 are population-centric. For these use cases, the EHR is usually not an ideal source system, since EHRs are typically built to look at data on single patients, not data across combinations of many patients. Unlike transaction systems that are optimized to show data regarding single patients, clinical data warehouses support queries that cut across multiple patients. In clinical data warehouses, queries can be challenging to specify, and these queries have complex implications for the privacy of the patients. However, as described in Use Case #3b, after the eligible patients are selected, EHRs can also provide the medical summaries of eligible patients through existing standard export documents such as CCD as a means to establish clinical data sets.

In most real-world implementations, a research system responsible for creating protocols would host the Metadata Consumer actor, and a metadata repository would host the Metadata Source actor.

7.2.1 Use Case #1: Pre-population of a Research Case Report Form

This use case describes how a researcher can create an extraction specification to extract specific data elements from a standard electronic health record (EHR) export document such as a CCD. The extraction specification is used to pre-populate a case report form for a research study. In this use case, the Metadata Consumer would likely be enacted by an electronic data capture system or research protocol design system. The Metadata Source would be provided by a metadata registry such as CDISC’s SHARE.

7.2.1.1 Pre-population of a Research Case Report Form Use Case Description

A research forms designer is building a case report form for a particular research study. The designer refers to an on-line metadata registry of research data elements, e.g. SHARE, and selects the desired data elements from a set of research-friendly elements such as CDASH. Using the unique identifier of each data element, the designer retrieves the metadata defined by the metadata registry into an annotated case report form. The metadata includes the exact specification, using XPath, for finding the corresponding data element in the HL7 Continuity of Care Document (CCD) specification as extended in the IHE Clinical Research Document (CRD) profile. Using the XPath statements, the research system creates an extraction specification for all elements to be extracted from the CCD. This extraction specification provides a map that enables re-use of the proper data within a CCD with precision and without inappropriate access to extraneous information. The extraction specification could then be used with RFD and Redaction to pre-populate the case report form.
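To make the idea of an XPath-based extraction specification concrete, the following Python sketch applies a small specification to a CCD-like fragment. It is an illustration only: the two mappings, the EXTRACTION_SPEC structure and the extract function are assumptions rather than content of the DEX profile, and because the standard ElementTree module supports only a subset of XPath, each entry pairs an element path with the attribute to read.

```python
import xml.etree.ElementTree as ET

NS = {"cda": "urn:hl7-org:v3"}  # HL7 v3 namespace used by CDA documents

# Minimal CCD-like fragment, invented for illustration.
CCD = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <recordTarget><patientRole><patient>
    <administrativeGenderCode code="F" codeSystem="2.16.840.1.113883.5.1"/>
    <birthTime value="19700101"/>
  </patient></patientRole></recordTarget>
</ClinicalDocument>"""

PATIENT = "cda:recordTarget/cda:patientRole/cda:patient"

# Hypothetical extraction specification: data element id -> (path, attribute).
EXTRACTION_SPEC = {
    "DM.SEX": (f"{PATIENT}/cda:administrativeGenderCode", "code"),
    "DM.BRTHDTC": (f"{PATIENT}/cda:birthTime", "value"),
}


def extract(document_xml, spec):
    """Return the value found for each data element, or None when it is absent."""
    root = ET.fromstring(document_xml)
    values = {}
    for element_id, (path, attribute) in spec.items():
        node = root.find(path, NS)
        values[element_id] = node.get(attribute) if node is not None else None
    return values


prepopulated = extract(CCD, EXTRACTION_SPEC)
```

Elements that come back as None correspond to form fields that are not available in the export and would have to be entered manually.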


7.2.1.2 Pre-population of a Research Case Report Form Process Flow

Figure 33 – Basic Process Flow in DEX Profile

Pre-conditions:

The research designer uses a blank template to design a case report form to meet the requirements of the study protocol.

Main Flow:

The research forms designer designs the case report form by selecting data elements from the metadata registry (such as CDASH data elements) and retrieving the accompanying metadata. Not all elements of the form will be available in the EHR; those elements must be entered by the site research coordinator.

Post-conditions:

An annotated case report form is created that contains the exact location of each pre-population data element. This annotated case report form is then converted to an extraction specification, which is used to pre-populate the case report form automatically from the retrieved EHR export.
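The conversion from an annotated case report form to an extraction specification can be pictured as follows. This is a minimal sketch, not the profile's data model: the `AnnotatedField` class, its attribute names, and the sample XPath strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AnnotatedField:
    """One field of an annotated case report form (hypothetical structure)."""
    element_id: str
    label: str
    ccd_xpath: Optional[str]  # None when the registry gives no EHR location


def to_extraction_spec(fields):
    """Split a form into an extraction spec and a list of manual-entry fields."""
    spec = {f.element_id: f.ccd_xpath for f in fields if f.ccd_xpath is not None}
    manual = [f.element_id for f in fields if f.ccd_xpath is None]
    return spec, manual


annotated_form = [
    AnnotatedField("SEX", "Sex", ".//hl7:patient/hl7:administrativeGenderCode"),
    AnnotatedField("BRTHDTC", "Date of birth", ".//hl7:patient/hl7:birthTime"),
    AnnotatedField("VISITDAT", "Visit date", None),  # not available in the EHR
]

spec, manual = to_extraction_spec(annotated_form)
```

Here `manual` lists the fields that the site research coordinator would still have to enter by hand.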

7.2.2 Use Case #2: Eligibility Determination

This use case creates eligibility criteria that are intelligible to an EHR.

7.2.2.1 Research Eligibility Determination Description

Eligibility determination for feasibility studies

A research worker seeks to find eligible subjects for a research study by searching an EHR or a clinical data warehouse. The worker expresses the eligibility criteria defined by the research protocol as inclusion/exclusion criteria, using a research standard such as CDISC’s Study Design Model (SDM). The eligibility criteria are drawn down from a metadata registry that includes the exact mappings to the corresponding data elements in the EHR or clinical data warehouse. Eligibility determination is performed on anonymized clinical data warehouses or on EHRs. Using the exact mappings retrieved from the metadata registry (as XPath, as SQL, or as SPARQL if the schema of the clinical data warehouse is in RDF), the research system constructs the Eligibility Determination Specification to be run on EHRs or on a clinical data warehouse.

The eligibility determination specification could be run, as a part of other profiles, against an EHR or against a clinical data warehouse established for clinical research purposes (anonymized data), returning summary information only (e.g. counts and percentages). Summary information might be cross-tabulated by a number of key inclusion/exclusion criteria. For instance, the number of eligible participants might be returned for combinations of gender (male/female) and diabetes status (not diabetic/type I/type II). Data will be returned only if counts are sufficiently large to protect privacy.
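The cross-tabulated summary with small-count suppression can be illustrated with an in-memory SQLite database. This is a sketch under assumptions: the `patient` table, its columns, and the threshold `MIN_CELL_COUNT` are invented for the example; the text above does not state what count is "sufficiently large".

```python
import sqlite3

# Hypothetical privacy threshold; the actual value would be set by policy.
MIN_CELL_COUNT = 3


def eligible_counts(conn, min_age, max_age, min_cell=MIN_CELL_COUNT):
    """Count eligible patients per gender and diabetes status.

    Cells with fewer than min_cell patients are suppressed entirely.
    """
    return conn.execute(
        """
        SELECT gender, diabetes_status, COUNT(*) AS n
        FROM patient
        WHERE age BETWEEN ? AND ?
        GROUP BY gender, diabetes_status
        HAVING COUNT(*) >= ?
        ORDER BY gender, diabetes_status
        """,
        (min_age, max_age, min_cell),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient (gender TEXT, diabetes_status TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO patient VALUES (?, ?, ?)",
    [
        ("female", "type II", 54),
        ("female", "type II", 61),
        ("female", "type II", 47),
        ("male", "type I", 35),  # single-patient cell, will be suppressed
        ("male", "not diabetic", 50),
        ("male", "not diabetic", 58),
        ("male", "not diabetic", 66),
        ("male", "not diabetic", 90),  # outside the age range below
    ],
)

summary = eligible_counts(conn, min_age=18, max_age=75)
```

Only aggregate rows leave the warehouse; the single male type I patient does not appear in the result because his cell falls below the threshold.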

[Figure 33 diagram: the Metadata Consumer selects data element(s), retrieves their metadata from the Metadata Source, and creates the extraction specification.]


Patient recruitment

Once a trial design has been finalized, all clinical trial approvals have been obtained, clinical investigators have been recruited and contracts have been completed, there is the opportunity to use routinely collected patient data to facilitate the identification of potentially eligible recruits for the trial. The eligibility determination specification created as described above could be used in the subsequent workflow to create a list of eligible candidates using additional profiles such as Research Matching.

7.2.2.2 Eligibility Determination Process Flow

Figure 34 – Basic Process Flow in DEX Profile

Pre-conditions:

The research designer selects the data elements representing the research eligibility criteria for a particular study.

Main Flow:

The research designer retrieves the metadata of the selected data elements from the metadata registry.

Post-conditions:

The eligibility determination specification can now be created to extract a list of candidates for inclusion in a study.
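One way to picture the creation of an eligibility determination specification is as a parameterized SQL query assembled from the selected criteria and the mappings returned by the registry. The sketch below is an assumption-laden illustration: the criterion tuples, the `COLUMN_MAPPINGS` dictionary, and the `patient` table are hypothetical, and a real specification could equally be expressed in XPath or SPARQL.

```python
import sqlite3

ALLOWED_OPERATORS = {"=", "<>", "<", "<=", ">", ">="}

# Hypothetical registry mappings: data element id -> warehouse column.
COLUMN_MAPPINGS = {"AGE": "age", "SEX": "gender", "PREGNANT": "pregnant"}


def _clause(criterion, mappings):
    """Translate one (element_id, operator, value) criterion to SQL."""
    element_id, operator, value = criterion
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    return f"{mappings[element_id]} {operator} ?", value


def build_count_query(inclusion, exclusion, mappings, table="patient"):
    """Build a COUNT query: all inclusion criteria hold, no exclusion does."""
    clauses, params = [], []
    for criterion in inclusion:
        sql, value = _clause(criterion, mappings)
        clauses.append(sql)
        params.append(value)
    for criterion in exclusion:
        sql, value = _clause(criterion, mappings)
        clauses.append(f"NOT ({sql})")
        params.append(value)
    where = " AND ".join(clauses) if clauses else "1 = 1"
    return f"SELECT COUNT(*) FROM {table} WHERE {where}", params


query, params = build_count_query(
    inclusion=[("AGE", ">=", 18), ("SEX", "=", "female")],
    exclusion=[("PREGNANT", "=", 1)],
    mappings=COLUMN_MAPPINGS,
)

# Run the generated query against a toy warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (age INTEGER, gender TEXT, pregnant INTEGER)")
conn.executemany(
    "INSERT INTO patient VALUES (?, ?, ?)",
    [(30, "female", 0), (25, "female", 1), (17, "female", 0), (40, "male", 0)],
)
eligible = conn.execute(query, params).fetchone()[0]
```

Column names come only from the registry mappings and values are passed as bound parameters, so the criteria themselves never end up spliced into the SQL text.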

7.2.3 Use Case #3: Observational Study

This use of DEX enables direct extraction of data on patients for observational studies without the need for supplemental data entered by a human.

7.2.3.1 Observational Study Description

[Figure 34 diagram, from Section 7.2.2.2: the Metadata Consumer selects eligibility criteria, retrieves their metadata from the Metadata Source, and creates the eligibility determination specification.]

Alternative A

A research worker, having identified eligible patients for a research study by searching an EHR or a clinical data warehouse (Use Case #2), selects research-defined data elements from a metadata registry that includes the exact mappings to the corresponding data elements in the clinical data warehouse. The aim is to create project-specific mini-databases (“data marts”) that make highly detailed data on these specific patients available to the investigators (Use Case #3). Using the exact mappings retrieved from the metadata registry (as XPath, as SQL, or as SPARQL if the schema of the clinical data warehouse is in RDF), the research system constructs the electronic query to be run on the clinical data warehouse to collect the required data sets. The query would be run against a clinical data warehouse and would return pseudonymized individual patient records containing patient-level information on key inclusion/exclusion criteria and other variables of interest. The records would not contain any patient identifiers (for example, date of birth would be converted into age and recorded to the nearest year). The protocol of the observational study will be reviewed and restricted by the Institutional Review Board.
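The pseudonymization of individual records mentioned in Alternative A might look like the following sketch. The record layout, the keyed-hash pseudonym and the secret key are assumptions made for illustration; the text only requires that identifiers be removed and that date of birth be reduced to an age. For simplicity the helper returns age in completed years.

```python
import hashlib
import hmac
from datetime import date


def age_in_years(birth, on):
    """Age in completed years on the given reference date."""
    years = on.year - birth.year
    if (on.month, on.day) < (birth.month, birth.day):
        years -= 1
    return years


def pseudonymize(record, secret, on):
    """Return a copy of the record without direct identifiers.

    The patient id is replaced by a keyed hash (hypothetical scheme) and
    the date of birth by an age in years.
    """
    token = hmac.new(
        secret, record["patient_id"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return {
        "pseudonym": token[:16],
        "age": age_in_years(record["date_of_birth"], on),
        "gender": record["gender"],
        "diabetes_status": record["diabetes_status"],
    }


source_record = {
    "patient_id": "MRN-0001",
    "name": "Polly Example",
    "date_of_birth": date(1970, 3, 15),
    "gender": "female",
    "diabetes_status": "type II",
}

released = pseudonymize(source_record, secret=b"demo-key", on=date(2013, 3, 14))
```

Using a keyed hash means the same patient receives the same pseudonym across extractions, so records can be linked within a study without exposing the original identifier.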

Alternative B

A research worker, having identified eligible patients for a research study by searching an EHR or a clinical data warehouse (Use Case #2), selects research-defined data elements from a metadata registry and creates the data collection set specification. The research system retrieves the metadata of the selected data elements, which include the exact specifications, expressed in XPath, for finding the corresponding data elements in a medical summary document expressed as an HL7 Continuity of Care Document (CCD). Using the XPath statements, the research system creates a complete extraction specification for all elements to be extracted from the CCD. This extraction specification provides a map that enables re-use of the proper data within a CCD with precision and without inappropriate access to extraneous information, making highly detailed data on these specific patients available to the investigators for observational studies. The researcher can then collect the data sets in project-specific mini-databases (“data marts”) and run safety analysis methods on top of them. The protocol of the observational study will be reviewed and restricted by the Institutional Review Board.

After the eligible patients are identified, the EHR systems, which are already capable of producing medical summaries of patients in standard content models such as IHE CCD templates, will share the pseudonymized medical summaries with the research systems. As the data collection set is already annotated with extraction specifications for retrieving the data sets from medical summary documents, the research data of interest can easily be collected from these medical summaries and stored in the project-specific databases to run the clinical research methods of interest.

7.2.3.2 Observational Study Process Flow

Figure 35 – Basic Process Flow in DEX Profile

[Figure 35 diagram: the Metadata Consumer selects data elements for the data collection set, retrieves their metadata from the Metadata Source, and creates the extraction specification for data extraction.]

Pre-conditions:

The research worker uses a blank template to design a data collection set to meet the requirements of the observational study protocol by selecting the data elements to be included.

Main Flow:

The research worker retrieves the metadata of the data elements in the data collection set from the metadata registry (such as CDASH data elements).

Post-conditions:

The data collection set is annotated with the exact location of each research-defined data element in a clinical data warehouse or in a pseudonymized medical summary. This annotated data collection set could either be used to query a clinical data warehouse, or be converted to an extraction specification to retrieve the data elements from medical summaries of eligible patients exported from an EHR.

7.2.4 Use Case #4: Public Health Case Reporting

7.2.4.1 Use Case Description

Current State

Patient Polly appears in Doctor Toci’s physician office, in the great state of Nirvana, with fever and a cough with an unusual whooping sound. A culture is taken and sent to the laboratory. The patient is instructed to return in two days. Upon her return, the lab result is positive for pertussis. The physician prescribes a course of Erythromycin and instructs the patient to return in one week for follow-up. The provider knows that pertussis is a reportable condition and knows to report the case to the local, state and federal authorities. Fortunately, Dr Toci’s EHR has RFD capabilities that can access the pertussis case reporting form through the Form Manager hosted by the Bliss county health department. Fortunately, the Form Manager supports the Public Health Reporting Initiative content profile, which enables pre-population of 30% of the form through a transform of the CCD. Once the form is completed, it is submitted to the Form Receiver. Randy, the software guy, has enabled the Bliss software to submit variants of the case reporting form to the Nirvana state health agency and to the Centers for Disease Control.

Desired State

Dave the forms designer has upgraded the Bliss county health department’s pertussis form. He designed the form by drawing down data elements from a metadata registry that builds in the explicit path to the data elements in the CCD. Now the pre-population completes 60% of the form, using the same pre-population export document. Dr Andy Antiquated has an EHR that can only generate a CCR, which it provides for pre-population. The Form Manager is unable to do any pre-population with this non-compliant document. The Form Manager must unscramble two different pre-population documents and three different recipient documents.

A PhD epidemiologist at CDC has developed a case reporting form of 92 elements for pertussis reporting. An employee of the state of Nirvana with a master’s degree in public health has defined a more concise form of 80 elements. A semi-retired physician, Dr Quack, has a form that overlaps with 40 of the state’s data elements, and insists on two elements for Bliss County that neither the state nor the federal jurisdiction specifies, but which are in the HITSP data dictionary. The CDC data elements are contained in an agency MDR, which contains maps to the corresponding elements in a CCD. The state uses an MDR from the Public Health Data Standards Consortium, which maps to the CDC’s MDR and to the CCR, but not to the CCD. Dr Quack uses no MDR, but his data elements are a subset of the state elements, except for the two data elements that are normally in a CCD.

7.2.4.2 Public Health Reporting Process Flow

Pre-conditions:

There are three different Metadata Sources: an interface to the MDR managed by CDC, an interface to the MDR managed by the Public Health Data Standards Consortium (PHDSC), and an interface to the HITSP MDR. The Form Designer selects the data elements to be included in the Form from the data elements maintained by these MDRs. The MDRs managed by CDC and PHDSC also maintain the exact paths of the data elements in the different Case Report Forms they expect to receive.

Main Flow:

• Form Designer queries the CDC MDR to retrieve metadata of the CDC data elements, and as a result the mappings to CCD documents. In the same response message the mappings to PHDSC data elements are also returned, and hence he has the mappings of all PHDSC elements to CCD documents.

• Form Designer then queries the Public Health Data Standards Consortium MDR to retrieve the metadata of PHDSC data elements and as a result the mappings to CCR. In this step we have the mappings of a subset of CDC data elements (80 of them) to CCR documents too.

• Form Designer then queries the HITSP MDR to retrieve the metadata of HITSP data elements and, as a result, the mappings to CCD documents.

• As a result, the Form Designer annotates the Form, where 80 of the data elements have a mapping to both CCD and CCR, 12 CDC data elements have a mapping to CCD, and 2 HITSP data elements have a mapping to CCD.

• While querying the CDC and PHDSC MDRs, the Form Designer also receives the exact paths of the corresponding data elements in the Case Report Forms managed by CDC and PHDSC. These are also added to the annotated Form.

Post-conditions:

A Form Manager holding the annotated Form retrieves the pre-population data in CCR and CCD format and, by making use of the annotations (including the mappings to CCD and CCR documents), pre-populates the form with the data retrieved from EHRs. A Form Receiver, on receiving the annotated and filled Form, creates the different Case Report Forms by making use of the annotations (i.e. the mappings of the data elements to the different Case Report Forms).
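The tally in the main flow (80 elements mapped to both CCD and CCR, 12 CDC elements and 2 HITSP elements mapped to CCD only) can be reproduced by merging the mappings returned by the three Metadata Sources. The sketch below uses synthetic identifiers and placeholder paths, and it assumes the PHDSC-to-CDC element correspondence has already been resolved so that both registries can be keyed by the same identifier.

```python
from collections import Counter


def merge_mappings(*sources):
    """Merge per-registry mappings into one annotation per data element."""
    merged = {}
    for source in sources:
        for element_id, targets in source.items():
            merged.setdefault(element_id, {}).update(targets)
    return merged


# Synthetic stand-ins for the three registry responses.
cdc = {f"CDC-{i:03d}": {"CCD": f"ccd-path-{i}"} for i in range(1, 93)}
phdsc = {f"CDC-{i:03d}": {"CCR": f"ccr-path-{i}"} for i in range(1, 81)}
hitsp = {f"HITSP-{i}": {"CCD": f"ccd-path-h{i}"} for i in (1, 2)}

annotated_form = merge_mappings(cdc, phdsc, hitsp)

# Count elements by the set of document formats they can be filled from.
coverage = Counter(tuple(sorted(t)) for t in annotated_form.values())
ccd_only_cdc = sum(
    1
    for element_id, targets in annotated_form.items()
    if element_id.startswith("CDC-") and set(targets) == {"CCD"}
)
```

With these inputs the annotated form holds 94 elements in total; a Form Manager receiving a CCR could pre-populate only the 80 elements that carry a CCR mapping.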

7.3 Current status of IHE DEX Profile

DEX Volume 1 is complete. The latest version of the Data Element Exchange profile can be found on the IHE FTP site [7]. Volume 2 is currently being prepared by the SALUS Task 4.2 team. The profile is intended to be released for public comment in May 2013 and for Trial Implementation in July 2013. The SALUS MDR implementation will be one of the first implementations of the IHE DEX Profile.

The EHR4CR project also aims to use a metadata registry to share and maintain the data elements it defines for use in eligibility criteria specifications. In this respect, the SALUS and EHR4CR projects aim to collaborate: the SALUS MDR can act as the Metadata Source in the IHE DEX profile, while the EHR4CR Data Element Definitions Creation Editor can act as the Metadata Consumer. The projects also aim to test these capabilities through a Projectathon activity.


8 CONCLUSION & FUTURE WORK

In this deliverable, the initial set of SALUS Common Data Elements is described in conformance with ISO/IEC 11179 guidelines. In addition, other efforts in the literature that develop Common Data Element models for achieving interoperability in the eHealth domain are examined. The requirements for a semantic federated metadata registry to address interoperability between the CDEs and content models defined by disparate efforts are elaborated. As a result, the design of a semantic federated MDR is presented briefly; it is discussed in more detail in a SALUS publication [23]. Finally, the current implementation of the SALUS semantic MDR is described in detail.

As described in Section 6, the initial version of the SALUS MDR, in which it is possible to define, search, visualize and navigate CDEs, is ready. In addition, we provide a number of importers to populate the SALUS MDR with new CDEs. In the second release of this tool (due in Month 24), we will finalize the semantic federated MDR envisioned in [23] and briefly discussed in Section 0. In particular, REST services to ease the semantic query of CDEs across MDRs will be implemented. The full list of proposed REST services is presented in [23]. Through these services, it becomes possible to perform federated queries on MDRs to retrieve semantic descriptions of CDEs and to process these for achieving semantic interoperability across domains.

Finally, as described in Section 7, the SALUS team is in close cooperation with the IHE Quality, Research and Public Health Domain (QRPH) Committee for the preparation of a new IHE interoperability profile, namely the Data Element Exchange (DEX) Profile. The IHE DEX profile argues that integrating the patient care and clinical research domains requires a standards-based, expressive and scalable semantic interoperability framework that allows dynamic mappings between data elements and the semantics of varying data sources. This can be achieved through a metadata registry architecture in which machine-processable definitions of data elements across domains can be shared, re-used, and semantically interlinked, addressing the semantic interoperability challenge and moving towards EHR-enabled research. The SALUS Semantic MDR implementation will be one of the first implementations of the IHE DEX Profile.


REFERENCES

1. SALUS Deliverable 4.1.1 “SALUS Content models for the Functional Interoperability Profiles for Post Market Safety Studies - R1”
2. SALUS Deliverable 8.1.1 “Pilot Application Scenario and Requirement Specifications of the Pilot Application”
3. ISO/IEC. ISO/IEC 11179: Information Technology – Metadata Registries (MDR) Parts 1–6 (2nd Edition).
4. OMOP CDM. Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). http://omop.fnih.org/
5. CDISC. Study Data Tabulation Model (SDTM). http://www.cdisc.org/sdtm
6. IHE Quality, Research and Public Health Domain (QRPH) Committee. http://www.ihe.net/qrph/index.cfm
7. IHE Data Element Exchange (DEX) Profile, Volume I. ftp://ftp.ihe.net/Quality/2013_2014_YR_7/QRPH_Technical/DataElementExchange/2013-03-19_IHE-DEX-Suppl_lb+gl.doc
8. HL7/ASTM. Continuity of Care Document (CCD). http://www.hl7.org/documentcenter/public_temp_DC68F8CB-1C23-BA17-0CB6B9727B87B502/pressreleases/20070212.pdf
9. HITSP. C 154: HITSP Data Dictionary. http://www.hitsp.org/ConstructSet_Details.aspx?&PrefixAlpha=4&PrefixNumeric=154
10. HITSP. C 32: HITSP Summary Documents Using HL7 Continuity of Care Document (CCD) Component. http://www.hitsp.org/ConstructSet_Details.aspx?&PrefixAlpha=4&PrefixNumeric=32
11. FHIMS. Federal Health Information Model. http://www.fhims.org/content/420A62FD03B6_root.html
12. S&I Framework. Transitions of Care Initiative (ToC). http://wiki.siframework.org/Transitions+of+Care+(ToC)+Initiative
13. S&I Framework. S&I Clinical Element Data Dictionary (CEDD) WG. http://wiki.siframework.org/S%26I+Clinical+Element+Data+Dictionary+WG
14. S&I Framework. Query Health. http://wiki.siframework.org/Query+Health
15. CDISC. Clinical Data Acquisition Standards Harmonization (CDASH). http://www.cdisc.org/cdash
16. BRIDG. The Biomedical Research Integrated Domain Group (BRIDG) Model. http://www.bridgmodel.org/
17. HL7. HL7 Reference Information Model (RIM). http://www.hl7.org/implement/standards/rim.cfm
18. FDA. Sentinel Initiative – Mini-Sentinel. http://mini-sentinel.org/
19. GE/Intermountain Healthcare. Clinical Element Models (CEM). http://www.clinicalelement.com/
20. NEHTA. Detailed Clinical Models. http://www.nehta.gov.au/connecting-australia/terminology-and-information/detailed-clinical-models
21. I2B2. I2B2 star schema. https://www.i2b2.org/events/slides/Workshop1.pdf
22. ISO 21090. Health Informatics – Harmonized data types for Information Interchange.
23. Sinaci A. A., Laleci G. B. A Federated Semantic Metadata Registry Framework for Enabling Interoperability across Clinical Research and Care Domains. 2013. Submitted to the Journal of Biomedical Informatics.