
GMP Data Warehouse System Documentation and Architecture

Ladislav Dušek, Jana Klánová, Jakub Gregor, Richard Hůlek, Jana Borůvková, Daniel Klimeš, Jiří Jarkovský, Jiří Kalina, Daniel Schwarz, Petr Holub, Kateřina Šebková


Content

1. Introduction
2. Principal concept of the GMP data warehouse
3. IT background and database implementation
3.1. Web-based application (“thin client”) with central data repository
3.2. Standardized components of the GMP DWH
3.3. Principles of the TrialDB’s architecture – the "EAV" design
3.4. System functionalities
4. Data security and protection
4.1. Physical security
4.2. Authorized access
4.3. Data backup procedures
4.4. Data ownership
4.5. Service level agreement (SLA)
4.6. License agreement
5. Data collection procedure
5.1. Reporting via MS Excel sheets
5.2. Reporting via on-line system
5.3. Transfer of data from public on-line databases
6. Database structure
7. Reporting and visualization
7.1. Summary tables
7.2. Visualization
8. Additional services for GMP DWH users
8.1. DWH manager, formal verification of inserted data
8.2. Help desk
8.3. Track changes
8.4. Data validation
8.5. Data export
Annexes
Annex 1. GMP DWH – structure and description of parameters
Annex 2. GMP DWH – user guide


1. Introduction

This document describes a data warehouse developed for the purposes of the Stockholm Convention’s Global Monitoring Plan for Persistent Organic Pollutants (hereinafter referred to as GMP), particularly for the second data collection campaign, which is to begin in 2014.

Building the GMP Data Warehouse (hereinafter referred to as GMP DWH) was one of the important conclusions of the meetings of experts and members of the Regional Organization Groups and the Global Coordination Group for GMP, held in Brno in June 2012 and in Geneva in October 2012. Establishment of such a data warehouse is also required by the updated Guidance on the Global Monitoring Plan for Persistent Organic Pollutants (Chapter 6.5.2, GMP data storage), adopted at the 6th meeting of the Conference of the Parties to the Stockholm Convention in May 2013 in Geneva (document number UNEP/POPS/COP.6/INF/31) and in decision SC-6/23.

The data reporting model suggested in the updated Guidance involves compiling and archiving primary GMP data within a “regional data repository” in each of the 5 UN regional groups. In addition, regional data centres and a single GMP “data warehouse” should be established to compile and archive aggregated data, data products and results, including supplementary data that would be used in the Stockholm Convention effectiveness evaluation.

Furthermore, the need for a regional dimension and for attention to the specificities of data collection and handling was underlined both in the recommendations of the expert meetings and in the Guidance document. Separately administered regional nodes should be made available to all regions, in particular to those which currently collect data without the support of sophisticated data handling tools.

Based on the facts stated above, a multi-modular, on-line working data warehouse has been developed for data collection, processing and reporting within the next GMP campaigns. The proposed standardized GMP DWH is based on fully parametric data sheets. It has been designed in order to improve quality of the collected global data sets on POPs concentrations, to determine their fate in the environment and to strengthen the position and responsibility of the local data administrators.


2. Principal concept of the GMP data warehouse

The GMP DWH has been designed to address the main challenges associated with the organization of long-term environmental programs and with the evaluation of their performance and impact. Lack of data standards, incompleteness of archived datasets and insufficient statistical power are the most important limits on the functionality of monitoring networks. To avoid these failures, the GMP DWH structure was optimized to incorporate data management as an integral part. Such complex functionality requires a multi-modular structure comprising four main layers, which focus on different aspects of data collection, archiving, validation, processing and reporting (Figure 2.1):

1. data layer – import / export, on-line data capture systems, archiving, basic validation of data – data standards

2. core DWH layer – data management: data validation, recoding, transformation, background for data services including GIS.

3. analytical layer – data workflow, statistical tools, data pre-processing and processing

4. presentation and communication layer – reporting tools, web services

The established DWH is optimized to handle documentation in terms of standardized coding, data formats, metadata coding and consistency of records over time. It incorporates a conceptual model that facilitates the integration and analysis of data on POPs concentrations through a multilayer hierarchy of entities (POPs as nomenclature classes, “observation – measurement” pairs as content classes). A robust set of statistical methods for processing time series of concentration data has been customized for practical use within the data collection campaign. It consists of the following components: baseline pollution estimates, uncertainty analyses, spatial extrapolations, effect size estimates, and time trend identification and quantification. The individual layers respect the already established and approved data flow. The GMP DWH databases are interlinked with additional supporting tools and processes, which are essential to ensure predefined standards for the collected data, their validation, processing and publication, as well as management of user accounts and rights during the data flow.
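To illustrate the time trend component, the following minimal sketch fits a log-linear trend to annual concentration values by ordinary least squares. The choice of model, the function names and the sample numbers are assumptions made for illustration only; this document does not specify which trend method the GMP DWH applies.

```python
import math

def log_linear_trend(years, concentrations):
    """Fit ln(c) = intercept + slope * year by ordinary least squares."""
    if len(years) != len(concentrations) or len(years) < 2:
        raise ValueError("need at least two (year, concentration) pairs")
    logs = [math.log(c) for c in concentrations]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def halving_time(slope):
    """Years needed for the concentration to halve; None if not decreasing."""
    return math.log(2) / -slope if slope < 0 else None

# Hypothetical annual median concentrations (arbitrary units).
years = [2008, 2009, 2010, 2011, 2012]
values = [8.0, 6.9, 6.1, 5.2, 4.6]
intercept, slope = log_linear_trend(years, values)
print(f"annual change: {100 * (math.exp(slope) - 1):.1f} %")
print(f"halving time: {halving_time(slope):.1f} years")
```

A negative slope indicates a decreasing trend; the halving time is only defined in that case.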

Up-to-date trends in building knowledge-based infrastructures are incorporated in all components and in the data repositories for all examined matrices. The GMP DWH has been designed to allow reliable collection of data on POPs concentrations in four core matrices: air, human milk, human blood, and water. Outcomes of large established environmental monitoring programmes are the preferred data sources for the purposes of the GMP; however, certain national projects can also serve as sources of POPs data.

The GMP DWH is based on standardized data forms. Properly completed and validated data records thereby enter the databases, supported by specific processes and services that ensure standardized inputs, handling and outputs. With this background, the GMP DWH is capable of covering data from a relatively wide range of heterogeneous sources, from local monitoring programs to large nationwide networks.


Figure 2.1. Architecture of the GMP Data Warehouse


3. IT background and database implementation

3.1. Web-based application (“thin client”) with central data repository

Projects realized or supported by the Institute of Biostatistics and Analyses of the Masaryk University (IBA MU) usually build on an on-line web-based information system with a central data repository to collect relevant data (the TrialDB system). The TrialDB system was developed in cooperation with Yale University.[1,2,3]

A complex generic environment for on-line data management is recommended for the development of on-line databases with multiple touch-points collecting primary data. A multi-tier architecture (client - application server - database server) is the most commonly used approach (see Figure 3.1). A standard web browser (e.g. Internet Explorer) serves as the client in this architecture, so an internet connection is necessary for all participating users. On the other hand, a web browser is standard equipment on personal computers, so there is no need to install any specialized software on the users' workstations. Users can thereby access all functions of the system (data manipulation, entry, editing, viewing, reporting, analyses, etc.) through their web browser.

Communication between the client and the server is always carried out via the secured (encrypted) HTTPS protocol (128-bit encryption is used). Access is controlled in a standard way: the system administrator assigns a login and password to each user, and the access level (scope) can be specified separately for each account. Service level, licensing and property rights are defined in contracts between the users and the GMP DWH provider.

Main advantages of the centralized on-line solution:

• Regular support by trained staff
• Security and availability for sharing among clients (when permitted)
• No need for upgrades; all changes in the web application are made centrally
• Monitoring the progress of the project and addressing potential problems as soon as possible

3.2. Standardized components of the GMP DWH

Client - web browser

• Mozilla Firefox, Google Chrome, or Internet Explorer (recent versions)
• Data collecting, viewing, etc.
• Client side scripts

Application server - web server

• Microsoft Windows Server, Microsoft IIS / Apache web server
• Web application
• Server side script

Central data storage - database server

• Central data repository
• Definition and design of data forms
• Definition and administration of user accounts and user rights
• Validation rules
• Oracle 11g
• SUSE Linux Enterprise Server

[1] Nadkarni PM, Brandt C, Frawley S, Sayward FG, Einbinder R, Zelterman D, Schacter L, Miller PL. 1998. Managing attribute-value clinical trials data using the ACT/DB client-server database system. J Am Med Inform Assoc 5(2):139-151.
[2] Nadkarni PM, Brandt CM, Marenco L. 2000. WebEAV: automatic metadata-driven generation of web interfaces to entity-attribute-value databases. J Am Med Inform Assoc 7(4):343-356.
[3] Nadkarni PM, Marenco L. 2001. Easing the transition between attribute-value databases and conventional databases for scientific data. Proc AMIA Symp:483-487.

Figure 3.1. On-line solution of GMP DWH data collection system

3.3. Principles of the TrialDB’s architecture – the "EAV" design

TrialDB runs on a relational database engine (Oracle), but it structures its data using an "entity-attribute-value" (hereinafter EAV) design. In this design, data names (such as "chemical group" or "value of LOQ") are not stored as table column headers, as is the case in a traditional relational database; they are stored as data.

Metadata describing each data element are stored in a data library. Each row in the library holds information on the entity (site ID, year, date, etc.), the attribute (the name or ID of the parameter being recorded), and the value of the parameter. Such an arrangement allows users to easily create, view and edit data item definitions. Furthermore, the EAV design accommodates new protocols (including new data items) without additional programming. The only additional task is to add a description of each new data element to the data library. Finally, the data library also records how to present data to the end users; the data are then displayed as if they were organized into regular tables.
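The following minimal sketch illustrates the EAV idea in a simplified form: values are stored as (entity, attribute, value) rows and are presented to the user as a regular table. The entity key (site ID, year), the sample values and the function name are hypothetical and do not reflect the actual TrialDB schema.

```python
from collections import defaultdict

# Each EAV row: (entity, attribute, value). The entity is identified here by
# a hypothetical (site_id, year) pair; the values are illustrative only.
eav_rows = [
    ((101, 2012), "chemical group", "PCBs"),
    ((101, 2012), "value of LOQ", 0.05),
    ((102, 2012), "chemical group", "DDTs"),
]

# Simplified stand-in for the data library: known attributes in display order.
data_library = ["chemical group", "value of LOQ"]

def pivot(rows, attributes):
    """Present EAV rows as a regular table: one record (dict) per entity."""
    table = defaultdict(dict)
    for entity, attribute, value in rows:
        if attribute not in attributes:
            raise KeyError(f"attribute not described in data library: {attribute}")
        table[entity][attribute] = value
    # Attributes without a stored value appear as None, like an empty cell.
    return {
        entity: {name: cells.get(name) for name in attributes}
        for entity, cells in table.items()
    }

for entity, record in pivot(eav_rows, data_library).items():
    print(entity, record)
```

In this sketch, supporting a new data item requires only a new entry in `data_library`; no table structure changes.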


TrialDB extends the basic EAV model in several aspects:

• All data are stored in several EAV tables classified by data type (not in a single table); this allows faster searching based on stored values as well as easier data validation.

• An extensive metadata component (data element library) allows for preparing case studies and for defining questions/parameters and their logical grouping into data forms. TrialDB uses this information to automatically generate data forms and browsing tables for the web.

• TrialDB stores certain data, such as site descriptions, in conventional table form rather than in EAV form, to facilitate the storage and retrieval of data elements that are common to all projects.
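The first point above, separate EAV tables per data type, can be sketched as follows. SQLite is used here only as a self-contained stand-in for the Oracle database; the table and column names are assumptions and not the TrialDB schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE eav_number (entity_id INTEGER, attribute TEXT, value REAL);
    CREATE TABLE eav_text   (entity_id INTEGER, attribute TEXT, value TEXT);
    CREATE INDEX idx_number_value ON eav_number (attribute, value);
""")

def store(entity_id, attribute, value):
    """Route a value to the EAV table that matches its data type."""
    if isinstance(value, bool):
        raise TypeError("boolean values are not handled in this sketch")
    if isinstance(value, (int, float)):
        table = "eav_number"
    elif isinstance(value, str):
        table = "eav_text"
    else:
        raise TypeError(f"unsupported type: {type(value).__name__}")
    # The table name comes from the fixed choices above, never from user input.
    conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?)", (entity_id, attribute, value))

store(1, "value of LOQ", 0.05)
store(2, "value of LOQ", 0.50)
store(1, "chemical group", "PCBs")

# A numeric comparison runs on the numeric table only, without casting text.
low_loq = conn.execute(
    "SELECT entity_id FROM eav_number WHERE attribute = ? AND value < ?",
    ("value of LOQ", 0.1),
).fetchall()
print(low_loq)
```

Because numeric values live in their own typed column, they can be indexed and compared directly, which is the search and validation benefit described above.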

3.4. System functionalities

The proposed system is equipped with a number of useful functionalities for data reporting, data handling and evaluation. In addition, it offers a track changes tool (displays recent changes in a form or in a group of forms), a data validation tool (provides a list of incomplete records and of missing items) and other tools. All system functionalities are described in detail in Chapter 8 and in the GMP DWH User Guide (see Annex 2 to this document).
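As a rough illustration of what a data validation tool of this kind reports, the sketch below lists incomplete records together with their missing items. The set of obligatory items and the record layout are hypothetical; the actual parameters are described in Annex 1.

```python
# Hypothetical obligatory items; the real list is defined in Annex 1.
REQUIRED_ITEMS = ("site_id", "year", "parameter", "value", "unit")

def find_incomplete(records):
    """Return {record index: [missing item names]} for incomplete records."""
    report = {}
    for index, record in enumerate(records):
        missing = [
            item for item in REQUIRED_ITEMS
            if record.get(item) in (None, "")
        ]
        if missing:
            report[index] = missing
    return report

records = [
    {"site_id": 101, "year": 2012, "parameter": "PCB 28", "value": 1.2, "unit": "pg/m3"},
    {"site_id": 102, "year": 2012, "parameter": "PCB 28", "value": None, "unit": ""},
]
print(find_incomplete(records))
```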

Please note that the system allows data reporting in two formats:

Aggregated data can be reported either by using on-line forms via internet browser, or in standardized MS Excel sheets sent by e-mail to the database administrator (see Chapter 5). Primary data should be reported in MS Excel sheets only. For description of the database structure see Chapter 6 or Annex 1 of this document.


4. Data security and protection

Data quality and security are prerequisites for the correct functioning of the proposed system.

4.1. Physical security

High-quality servers with hardware encryption of the disc array are used. The servers are operated in the internal IBA/RECETOX data centre within the Masaryk University network, with multilevel security:

• Central university network traffic monitoring
• Firewall of the IBA/RECETOX subnet and dedicated firewalls on each server
• Servers are placed in a separate subnet dedicated only to server traffic
• Server operating system is updated on a regular basis
• Physical access to the data centre is strictly limited to authorized personnel
• The data centre is under nonstop camera monitoring, access is possible only via electronic cards, and the building has its own physical security
• Servers have a backup power supply and there is a specialized mode in case of fire or accident
• Additional measures have been taken to prevent potential data loss or damage in case of unexpected events that are not directly related to information technology. These measures involve a fire-stop system, air-conditioned server rooms, etc.
• Finally, both the system configuration and the stored data are subject to a regular backup procedure. Therefore, even in case of a system breakdown, the entire system including data can be promptly restored.

4.2. Authorized access

Access to the system is permitted to authorized users only, via username and password.

User accounts are administered with a system of different user rights. Users can be assigned various levels of authorization so that they have access to all or only selected functions or parts of the system. This is particularly important for projects such as GMP data collection, which involve both a horizontal structure (different countries and UN regional groups) and a vertical hierarchy of users (data providers, data managers, ROG members, GCG members).
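A minimal sketch of how such a two-dimensional authorization check might look is given below. The role names follow the user hierarchy listed above, but the permissions attached to each role, the region codes and the function name are assumptions made for illustration, not the actual GMP DWH configuration.

```python
from dataclasses import dataclass

# Permissions per role are hypothetical and serve only to show the mechanism.
ROLE_PERMISSIONS = {
    "data provider": {"insert"},
    "data manager": {"insert", "approve"},
    "ROG member": {"view", "approve"},
    "GCG member": {"view"},
}

@dataclass(frozen=True)
class User:
    name: str
    role: str    # vertical hierarchy
    region: str  # horizontal structure; "*" means all regions

def is_allowed(user, action, record_region):
    """Allow an action only if both the region and the role permit it."""
    in_scope = user.region in ("*", record_region)
    return in_scope and action in ROLE_PERMISSIONS.get(user.role, set())

manager = User("example manager", "data manager", "CEE")
print(is_allowed(manager, "approve", "CEE"))     # same region, permitted action
print(is_allowed(manager, "approve", "GRULAC"))  # outside the user's region
```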

An encryption protocol is used for data transfer between the user and the central database to prevent eavesdropping on the communication between the client and the server (for example, interception of the user's login and password). For this reason, all communication between the client and the server takes place via the secure HTTPS protocol, using SSL (Secure Sockets Layer) encryption.

In addition, automatic log-off takes place after a predefined period of user inactivity. This function is intended to prevent misuse of an unattended computer if the user forgets to log out.
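The inactivity rule can be sketched as follows. The class name, the 15-minute timeout and the injectable clock are assumptions for illustration; the document does not state the actual timeout or how it is implemented.

```python
import time

class Session:
    """Track user activity and expire after a period of inactivity."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock
        self._last_activity = clock()

    def touch(self):
        """Record user activity (for example, a page request)."""
        self._last_activity = self._clock()

    def is_expired(self):
        """True once the inactivity period has reached the timeout."""
        return self._clock() - self._last_activity >= self.timeout_seconds

# Usage with a fake clock so the example does not have to wait.
now = [0.0]
session = Session(timeout_seconds=900, clock=lambda: now[0])
now[0] = 600.0
session.touch()              # activity after 10 minutes resets the timer
now[0] = 1400.0
print(session.is_expired())  # 800 s since last activity, still logged in
now[0] = 1500.0
print(session.is_expired())  # 900 s since last activity, log-off is due
```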

All development and implementation steps are guaranteed by ISO certificates:

• EN ISO 9001:2010 (Quality Management Systems)
• ISO/IEC 20000-1:2012 (IT Service Management)
• ISO/IEC 27001:2006 (Information Security Management Systems)


4.3. Data backup procedures

Procedures and safeguards are enforced to ensure that backup data cannot be accidentally overwritten. The latest copies of data from the servers are always available in case of an adverse event to ensure timely resumption of service. Unauthorized access to backups is prevented. Any security incident is addressed and resolved immediately. Servers are backed up before any changes to the system are carried out. Backups are made every night in order to limit any data loss to at most one working day.

4.4. Data ownership

The data and any further documents or information provided will be stored in the GMP DWH, located on servers of the Masaryk University, Brno. All data records remain in the ownership of their providers (institutions). Participation in data collection in the GMP DWH is voluntary and any data provider has the right to withdraw. In such a case, all data records that the provider entered into the GMP DWH will be deleted. The data owners can store their own local copies of all reported records, metadata records and descriptive files. All reports generated from the data can be downloaded as well.

4.5. Service level agreement (SLA)

A service level agreement is a common part of the contract between IBA MU and a client / data warehouse user. It guarantees availability of the service and immediate resolution of any technical problems.

The SLA specifies the monthly availability of the service, the events subject to the SLA, the time at which an event begins, the maximum time to find a solution, and the total time of service unavailability due to planned maintenance, upgrades, etc. Ways of communication and event reporting are also defined.

The GMP DWH performance is under continuous monitoring; the administrators are therefore immediately informed when any technical problem or other events occur and can adopt measures for their quick solution.

4.6. License agreement

The license agreement defines the conditions under which the Masaryk University (“Licensor”) grants the right to use the object of the license agreement (in this case the GMP Data Warehouse, or “Software”) to the Secretariat of the Stockholm Convention, members of the Regional Organization Groups and other users determined by these two parties (“Licensee”). The following paragraphs briefly summarize the most important points of the License Agreement.

The Software is implemented on-line and the Licensor guarantees access to the Software through the Internet. Access to the Software is restricted: only persons selected by the Licensee and provided with access data by the Licensor can access and use the system. The Software is secured, which prevents third parties (non-authorized persons) from tracking, changing, or copying communication between the users and the Software.

Users provided with access to the Software are selected by the Licensee and approved by the Licensor. Every user is then provided with access data (login, password) and a defined set of access rights and system roles.


The Licensee can use the Software only in full agreement with the System Documentation, in particular with the User Guide and other related documents delivered by the Licensor. The system documentation can be used only for purposes of correct use of the Software. The Licensee is not allowed to copy and distribute the system documentation to third parties (excluding users of the Software outside the Secretariat and members of the Regional Organization Groups and Global Coordination Group).

The Licensee must ensure that the hardware and software requirements defined in the System Documentation are fulfilled. Failing that, the Licensor cannot guarantee correct functioning of the Software on the user’s side.

The Licensee is allowed to use all components of the Software (particularly reporting and visualization tools) for purposes of fulfilling the GMP objectives: compilation of monitoring reports, effectiveness evaluation and others. The outputs can also be used for scientific and other non-profit activities. Commercial use of any component of the Software is prohibited.


5. Data collection procedure

In accordance with the updated GMP Guidance (UNEP/POPS/COP.6/INF/31), MS Excel sheets were prepared for reporting the analytical data, in both primary and aggregated forms. The GMP DWH also allows direct insertion of aggregated data into the system. In aggregated data reporting, the data structure is identical for the on-line and the MS Excel forms.

ROG members can also authorize DWH managers to use public on-line environmental databases, from which the data records will be extracted and imported into the GMP DWH (such as EMEP data, MONET data etc.).

5.1. Reporting via MS Excel sheets

The prepared MS Excel sheets will be distributed to all data providers nominated by individual ROGs and will also be available in the GMP DWH (after logging into the system) and on the web portal www.pops-gmp.org. There are separate files for each matrix (air, human milk, human blood, water) and type of data. They contain:

1) Columns described identically to the defined data structure (see Chapter 6).
2) Illustrative data sets to facilitate insertion of data to be reported.
3) Full code lists (e.g., types of site, individual chemicals, etc.).

It is essential to respect the prescribed data format and code lists to ensure full compatibility with the GMP DWH and standardized data.
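A data provider could pre-check a sheet against the prescribed columns and code lists before sending it. The sketch below works on a CSV export of a sheet, because the Python standard library has no Excel reader; the column names and code list entries are hypothetical placeholders for those given in the distributed sheets and in Annex 1.

```python
import csv
import io

# Hypothetical columns and code lists, for illustration only.
EXPECTED_COLUMNS = ["site_id", "site_type", "chemical", "value"]
CODE_LISTS = {
    "site_type": {"background", "urban", "rural"},
    "chemical": {"PCB 28", "PCB 52", "p,p'-DDE"},
}

def check_sheet(text):
    """Return a list of problems found in a CSV export of a reporting sheet."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        return [f"unexpected columns: {reader.fieldnames}"]
    problems = []
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        for column, allowed in CODE_LISTS.items():
            if row[column] not in allowed:
                problems.append(
                    f"line {line_number}: {column}={row[column]!r} not in code list"
                )
    return problems

sheet = (
    "site_id,site_type,chemical,value\n"
    "101,background,PCB 28,1.2\n"
    "102,suburban,PCB 52,0.8\n"
)
print(check_sheet(sheet))
```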

Completed sheets should be sent to the database administrator (leader of the GMP DWH helpdesk), Dr. Jakub Gregor ([email protected]), or uploaded through the GMP DWH system. The data will subsequently be validated by the DWH manager, imported into the data warehouse and labelled as “pending”. The responsible data manager or data manager assistant will then be informed by e-mail and asked to supervise the appropriate data sets in the GMP DWH.

5.2. Reporting via on-line system

Aggregated data can be inserted directly into the GMP DWH through prepared on-line forms. The on-line forms are separated and optimized for each matrix (air, human milk, human blood, water) and are divided into three parts:

1) Site – a key component to which all other information and data are linked.
2) Sampling attributes – define the time period for which the data are reported, the sampling methodology and, where applicable, a monitoring programme.
3) Measurement – defines the specific chemical compound (= parameter), reported concentration values and descriptive statistics.

All data fields are filled in as text, as a number, or as a selection from a defined code list (drop-down menu). The person filling in the data records can label them as “pending” (= potentially subject to further changes) or “completed” (= subject to supervision by the appropriate data manager from the institution providing the data).
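The three-part form structure can be pictured as nested records, with everything linked to a site. The sketch below is illustrative only: the class and field names are assumptions, and attaching the "pending"/"completed" label to the sampling record is a simplification; the authoritative field list is given in Chapter 6 and Annex 1.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Measurement:
    parameter: str  # specific chemical compound
    value: float    # reported concentration value
    unit: str

@dataclass
class SamplingAttributes:
    year: int                        # time period for which data are reported
    method: str                      # sampling methodology
    programme: Optional[str] = None  # monitoring programme, where applicable
    status: str = "pending"          # "pending" or "completed"
    measurements: List[Measurement] = field(default_factory=list)

@dataclass
class Site:
    site_id: int
    name: str
    samplings: List[SamplingAttributes] = field(default_factory=list)

def set_status(sampling, status):
    """Label a record; only the two states described above are accepted."""
    if status not in ("pending", "completed"):
        raise ValueError(f"unknown status: {status}")
    sampling.status = status

site = Site(site_id=101, name="Example site")
sampling = SamplingAttributes(year=2012, method="passive air sampling")
sampling.measurements.append(Measurement("PCB 28", 1.2, "pg/m3"))
site.samplings.append(sampling)
set_status(sampling, "completed")
print(site)
```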


5.3. Transfer of data from public on-line databases

A substantial part of POPs monitoring data is available in public on-line databases (e.g., EBAS, NatChem). ROGs may authorize DWH managers to extract data collected in a particular region from the identified database, import them into the GMP DWH, validate them and submit them for approval by the respective ROG members.

The data sources and sets to be extracted and imported should be identified by a ROG member either by direct e-mail to DWH management ([email protected]) or through a simple on-line form available in the GMP DWH system (together with other options and documents for data reporting).


6. Database structure

Development and adjustment of systems for the collection, analysis, and visualization of environmental data must cope with a relatively high heterogeneity of collected data, e.g. in terms of data sources (institutions, projects, purposes), different matrices (ambient air, water, soil, sediments, human tissues), or chemical parameters (terminology of parent compounds, isomers, degradation products). It is therefore important to strictly define the data structure and code lists to ensure reliability of all collected and analysed data, and to provide detailed guidance and support to all users participating in the data collection process. Furthermore, sufficiently complex conceptual models that advance the development of formal ontologies for environmental and epidemiological data acquisition systems are needed.[4]

Analysis and handling of GMP data collected during the first campaign revealed serious problems caused by a lack of standardization in coding and in the validation process, which led to the loss of some of the reported data for interregional comparisons. The losses were due to time-related heterogeneity of data reports, non-standardized nomenclature of reported chemicals and monitoring methods, and sometimes also incomplete obligatory data items. Moreover, as the quantitative description of reported measurements and values was not standardized, unclear units, LOQs or values recalculated to different bases hampered interpretation of the GMP reports.

All the above-mentioned problems limit data processing and reduce the information value of reported data. Consequently, data series allowing for representative spatial and time comparisons are rather scarce.

Fully standardized system for future data collection campaigns is thereby proposed building on all of the above. The system is based on hierarchical structure of data fields of standardized parameters with predefined content in all ontology dimensions necessary for future data processing (values, units, measurement method, LOQs, description of data aggregation, etc.). We believe it reduces the risk of the loss of any reported data to a minimum and that standardization of data collection would also facilitate the retrospective control of already reported GMP records.

The scope of the data capture system is adjusted to data from monitored environmental matrices (ambient air, human milk, human blood, and water). Options to check and correct, should need be, the GMP1 data are also included.

There are two principal ways for data insertion into the database:

a) Data sets of primary POPs concentration data at a given site and at given time points; these records should be sent to DWH managers via standardized MS Excel sheets, which are available for download in the GMP DWH system.

b) Data sets of aggregated data on mean (median) annual POPs concentrations with proper variability measures can be sent via on-line forms inside the GMP DWH, or using the standardized MS Excel sheets, which are available for download in the GMP DWH system as well.

4 Dušek L, et al. Estimating impacts of environmental interventions in monitoring programs requires conceptual data models and robust statistical processing. IFIP Advances in Information and Communication Technology, ISESS 2013. (in press)


Table 6.1. Overview of the database structure. See Annex 1 for more details.

Air
• Site ID (number)
• Site name (text)
• Longitude (number)
• Latitude (number)
• Region (code list)
• Country (code list)
• Site type (code list)
• Potential source type (code list)
• Year (number)
• Start of sampling (number)
• End of sampling (number)
• Sampling frequency (code list)
• Largest gap (number)
• Type of sampling (code list)
• Type of passive sampling (code list)
• Recalculation (code list)
• Calibration description (text)
• Monitoring programme/network (text)
• Chemical – group (code list)
• Parameter (code list)
• Method (code list)
• LOQ (number)
• No. of values (number) [A]
• No. under LoQ (number) [A]
• Value (number) [P]
• Value (mean) (number) [A]
• Value (median) (number) [A]
• Minimum (number) [A]
• Maximum (number) [A]
• 5th percentile (number) [A]
• 95th percentile (number) [A]
• SD (number) [A]
• Laboratory (text)

Human milk
• Site ID (number)
• Site name (text)
• Region (code list)
• Country (code list)
• Year (number)
• Start of sampling (number)
• End of sampling (number)
• Type of sample (code list)
• Monitoring programme/network (text)
• Chemical – group (code list)
• Parameter (code list)
• Method (code list)
• LOQ (number)
• No. of values (number) [A]
• No. under LoQ (number) [A]
• Value (number) [P]
• Value (mean) (number) [A]
• Value (median) (number) [A]
• Minimum (number) [A]
• Maximum (number) [A]
• 5th percentile (number) [A]
• 95th percentile (number) [A]
• SD (number) [A]
• Laboratory (text)

Human blood
• Site ID (number)
• Site name (text)
• Region (code list)
• Country (code list)
• Year (number)
• Start of sampling (number)
• End of sampling (number)
• Blood source (code list)
• Fraction (code list)
• Monitoring programme/network (text)
• Chemical – group (code list)
• Parameter (code list)
• Method (code list)
• LOQ (number)
• No. of values (number) [A]
• No. under LoQ (number) [A]
• Value (number) [P]
• Value (mean) (number) [A]
• Value (median) (number) [A]
• Minimum (number) [A]
• Maximum (number) [A]
• 5th percentile (number) [A]
• 95th percentile (number) [A]
• SD (number) [A]
• Laboratory (text)

Water
• Site ID (number)
• Site name (text)
• Region (code list)
• Country (code list)
• Surface water type
• Longitude (number)
• Latitude (number)
• Region (code list)
• Country (code list)
• Ocean or sea (code list)
• Site type (code list)
• Discharges (code list)
• Year (number)
• Start of sampling (number)
• End of sampling (number)
• Sampling frequency (code list)
• Largest gap (number)
• Type of sampling (code list)
• Depth – minimum (number)
• Depth – maximum (number)
• Temperature (number)
• Salinity (number)
• Monitoring programme/network (text)
• Chemical – group (code list)
• Parameter (code list)
• Method (code list)
• LOQ (number)
• No. of values (number) [A]
• No. under LoQ (number) [A]
• Value (number) [P]
• Value (mean) (number) [A]
• Value (median) (number) [A]
• Minimum (number) [A]
• Maximum (number) [A]
• 5th percentile (number) [A]
• 95th percentile (number) [A]
• SD (number) [A]
• Laboratory (text)

[A] – the item is valid for aggregated data reporting only
[P] – the item is valid for primary data reporting only

The structure of the data fields covers important parameters that must be reported in a fully standardized and parametric way such as: geographical identification and time of reported data, “measurement – value – unit” chain and definition of data aggregation (if applied).

The following list highlights the most important data fields and information items, which are required as obligatory in the GMP DWH:

• Contact identification of the data administrator responsible for data insertion into GMP DWH;
• Identification of site reported and identification of any type of spatial aggregation (if used);
• Predefined set of reported chemicals (POPs);
• Definition of method used, including corresponding LOQ;
• Identification of units used for reported concentration values;
• Description of time aggregation (if used);
• Definition of variability (an obligatory field for aggregated data).

New data templates are proposed to allow direct and immediate data validation during the reporting process. Local data administrators will be informed about obligatory data fields through automated electronic queries. Each completed record must therefore contain all required information, e.g., the relevant unit as an obligatory attribute of an inserted concentration value.
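The obligatory-field check described above can be sketched as follows. This is a minimal illustration, not the GMP DWH implementation: the field names in `OBLIGATORY_FIELDS`, the dict-based record and the function `missing_obligatory_fields` are all assumptions made for the example.

```python
# Hypothetical field names, loosely following the list of obligatory items above.
OBLIGATORY_FIELDS = ("administrator_contact", "site_id", "parameter", "method", "loq", "unit")
# Assumption: aggregated records additionally require a variability measure.
AGGREGATED_ONLY = ("sd",)


def missing_obligatory_fields(record, aggregated=False):
    """Return the names of obligatory fields that are absent or empty."""
    required = OBLIGATORY_FIELDS + (AGGREGATED_ONLY if aggregated else ())
    return [name for name in required if record.get(name) in (None, "")]


# Illustrative record: a concentration value was entered without its unit.
record = {
    "administrator_contact": "admin@example.org",
    "site_id": "GMP-XX-00001",
    "parameter": "PCB 28",
    "method": "GC-MS",
    "loq": 0.5,
    "value": 12.3,
}
print(missing_obligatory_fields(record))                   # ['unit']
print(missing_obligatory_fields(record, aggregated=True))  # ['unit', 'sd']
```

A non-empty result is what would trigger a query back to the local data administrator in this sketch.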

Table 6.1 provides an overview of the database structure. The detailed structure of the database, including a description of individual items and code lists, is available in Annex 1 to this document.


7. Reporting and visualization

A set of reporting and visualization tools has been prepared for various groups of GMP DWH users to allow easy browsing and inspection of data records, from their insertion into the database to final publication. The tools are closely linked to the system of user rights and roles, so that each user can see and work with only those records that fall under his/her responsibility in terms of geographical origin and data flow status. In practice, this means, e.g., that a data manager (a member of an institution performing environmental monitoring and providing the data) can browse and edit only his/her data records which have not yet been formally verified by the relevant DWH manager, or that a ROG member can browse and ratify only data records from his/her UN Regional Group which have already been formally verified by the DWH manager.
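The two visibility rules given as examples above can be sketched as a simple filter. The `Record` class, the role names and the function `visible_records` are hypothetical and only illustrate the idea; the actual rights model of the GMP DWH is not described in this document.

```python
from dataclasses import dataclass


@dataclass
class Record:
    institution: str
    region: str      # UN Regional Group, e.g. "CEEC"
    verified: bool   # formally verified by the DWH manager


def visible_records(records, role, institution=None, region=None):
    """Filter records according to the two example rules in the text."""
    if role == "data_manager":
        # Own institution's records that have not yet been verified.
        return [r for r in records if r.institution == institution and not r.verified]
    if role == "rog_member":
        # Already verified records from the member's UN Regional Group.
        return [r for r in records if r.region == region and r.verified]
    raise ValueError(f"unknown role: {role}")


records = [
    Record("Institute A", "CEEC", verified=False),
    Record("Institute A", "CEEC", verified=True),
    Record("Institute B", "WEOG", verified=True),
]
print(len(visible_records(records, "data_manager", institution="Institute A")))  # 1
print(len(visible_records(records, "rog_member", region="CEEC")))                # 1
```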

The tools have been designed to facilitate completion of data records and compilation of monitoring reports by individual ROGs and by the GCG. Most of the displayed outputs are available for download as MS Excel sheets and/or graphics (PNG files).

7.1. Summary tables

The summary tables are available after logging in to the GMP DWH. Their main purpose is to summarize the number of data records available to a particular user.

For example: data managers see the number of data records from their institution (inserted, completed…), while ROG members see the number of data records from their region that have already been approved, are ready for approval, etc.
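Such a summary amounts to counting records per workflow status. A minimal sketch, assuming records are dicts with hypothetical `region` and `status` keys and made-up status names:

```python
from collections import Counter


def summary_table(records, region=None):
    """Count records per status, optionally restricted to one region."""
    return Counter(
        r["status"] for r in records if region is None or r["region"] == region
    )


# Illustrative data only; the status names are assumptions.
records = [
    {"region": "CEEC", "status": "inserted"},
    {"region": "CEEC", "status": "approved"},
    {"region": "CEEC", "status": "approved"},
    {"region": "WEOG", "status": "ready for approval"},
]
print(summary_table(records, region="CEEC")["approved"])  # 2
```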

7.2. Visualization

Some visualization tools for the GMP2 campaign are identical to those prepared for GMP1 (see www.pops-gmp.org/visualization). They provide an overview of data available for particular regions, matrices, POPs, etc. Additional tools are designed to present measured concentrations and assess their time trends.

World map – monitoring overview

An animated map shows the geographical coverage of data, with predefined filters for the available datasets: matrix, compound and year. For the year, the user can choose either a single point in time or a time interval. The output is the data coverage for a selected matrix and point or interval in time, or for a particular point/interval in time, matrix and compound.

Available data – parameters

A chart shows the sampling frequency for each compound in a particular (user-defined) year. Compounds are listed on x-axis, countries on y-axis. The predefined filters for the selection include: matrix, sampling method (for air), type of measurement site (for air – urban, rural, remote etc.), UN region, compound and year. Detailed data are accessible by clicking on a chart cell. Multiple selections are enabled in the predefined filters.


Available data – years

A chart shows the sampling frequency in a user defined time period. Years are listed on x-axis, countries on y-axis. The predefined filters for the selection include: matrix, sampling method (for air), type of measurement site (for air – urban, rural, remote etc.), UN region, compound and time period (minimum three years). Detailed data are accessible by clicking on a chart cell. Multiple selections are enabled in the predefined filters.

Reported values – summary statistics

A chart shows summary statistics for each reported value (mean, median, min, max). Concentration values are displayed in the form of box-and-whisker plots with mean/median value, minimum and maximum. Concentrations are listed on x-axis, reported sites/countries on y-axis. The predefined filters for the selection include: matrix, sampling method (for air), type of measurement site (for air – urban, rural, remote etc.), UN region, compound and year.

Reported values – advanced interactive map visualization

The output is a map with bar charts at each sampling site corresponding to the measured concentrations (the user can choose what to plot: mean, median, etc.). Filters for the analysis include: matrix, sampling method (for air), type of measurement site (for air – urban, rural, remote etc.), compound, year, UN region. Multiple selections are enabled for the list of compounds / years. The map allows panning and zooming.

Time series analysis

The output is a report showing analysis of the trends in concentration data and statistical evaluation of the trend significance via standard tests (according to the GMP guidance document, Chapter 3: “Statistical Considerations and the Technical Note to Chapter 3”). Filters for the analysis include: matrix, compound and measurement site. A standardized option to adjust the analysis settings (e.g., selection of the method for trend evaluation and data smoothing) is available.
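As an illustration of what a trend significance test on an annual concentration series looks like, the sketch below implements the Mann-Kendall test. This document does not name the tests used in the GMP DWH, so Mann-Kendall is chosen here only as a familiar non-parametric example. The function name and the sample data are hypothetical, and the normal approximation used (without tie correction) is usually recommended only for series of about ten or more values.

```python
import math


def mann_kendall(values):
    """Return (S, Z, two-sided p-value) of the Mann-Kendall trend test.

    Uses the normal approximation without a correction for tied values.
    """
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under normality
    return s, z, p


# Made-up annual mean concentrations at one site, oldest year first.
annual_means = [10.0, 9.1, 8.7, 8.9, 7.5, 7.0, 6.2, 6.4, 5.1, 4.8]
s, z, p = mann_kendall(annual_means)
print(f"S = {s}, Z = {z:.2f}, p = {p:.4f}")
```

A negative S with a small p-value indicates a significant decreasing trend in this sketch.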


8. Additional services for GMP DWH users

A set of additional functions and services has been designed for GMP DWH users, particularly for ROG members, to facilitate the process of data browsing, approval and preparing/selecting them for use in a regional monitoring report.

8.1. DWH manager, formal verification of inserted data

Due to high requirements on data standards and overall quality, the role of DWH manager has been established to perform data verification. The data are verified after they have been completed by the individual data managers (on the side of data providers) and before they are sent for approval to ROG members. The procedure should ensure that the data format is in full agreement with the predefined data structure and code lists, and that the logical hierarchy and ontology of data records have been respected.

The DWH manager validates all data records regardless of the way they have been sent into the GMP DWH (via on-line forms or MS Excel sheets).

8.2. Help desk

A help desk is available for all GMP DWH users and other persons interested in the project. Help desk operators can immediately provide the information or assistance required or, in case of more complicated requests, pass them on to the relevant person. The help desk performance is in agreement with the requirements of the ISO/IEC 20000-1:2012 standard.

8.3. Track changes

Tracking changes is a system function which enables users to track the history of changes made in a particular data record. This facilitates, e.g., contact with data providers if discrepancies in a data record appear during data verification/validation by the DWH manager.
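The idea of a per-record change history can be sketched as below. The class `TrackedRecord` and its layout are hypothetical; this document does not describe how the GMP DWH stores its change log.

```python
from datetime import datetime, timezone


class TrackedRecord:
    """A data record that remembers every change made to its fields."""

    def __init__(self, **fields):
        self.fields = dict(fields)
        self.history = []  # entries: (timestamp, user, field, old value, new value)

    def update(self, user, field, new_value):
        old_value = self.fields.get(field)
        if old_value == new_value:
            return  # nothing changed, nothing to log
        timestamp = datetime.now(timezone.utc)
        self.history.append((timestamp, user, field, old_value, new_value))
        self.fields[field] = new_value


rec = TrackedRecord(parameter="PCB 28", value=12.3, unit=None)
rec.update("data_manager_1", "unit", "pg/m3")
for timestamp, user, field, old, new in rec.history:
    print(f"{user} changed {field}: {old!r} -> {new!r}")
```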

8.4. Data validation

Data validation is a system function which identifies data records that have not fulfilled the criteria for a completed record, particularly due to missing obligatory items. The service is available for both Sites (e.g., missing characteristics, coordinates) and Data records (e.g., missing values, methods…). A standardized table report is displayed after each data validation process.

8.5. Data export

Data forming the charts displayed in some visualization tools can be downloaded in CSV format. The downloaded file can then be opened in conventional software such as MS Excel. ROG members can also ask for other data exports, in particular if they need more extensive data sets or data sets selected by more complex filter criteria than those predefined.
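A CSV export of chart data can be sketched with Python's standard `csv` module. The column names and rows below are made up for illustration and do not reflect the actual export layout of the GMP DWH.

```python
import csv
import io


def chart_data_to_csv(rows, fieldnames):
    """Serialize a list of dict rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative rows only.
rows = [
    {"country": "Czech Republic", "year": 2012, "n_records": 24},
    {"country": "Tanzania, United Republic of", "year": 2012, "n_records": 6},
]
text = chart_data_to_csv(rows, ["country", "year", "n_records"])
print(text)
```

Values that contain a comma, such as some country names in the code list, are quoted automatically by the writer, so the file still opens correctly in spreadsheet software.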


Annexes

Annex 1. GMP DWH – structure and description of parameters

Annex 2. GMP DWH – user guide


Data structure – Air

The aim of this document is to provide a short and clear description of parameters (data items) that are to be reported in the data collection forms of the Global Monitoring Plan (GMP) data collection campaigns 2013–2014. Data itself should be reported by means of MS Excel sheets as suggested in the document UNEP/POPS/COP.6/INF/31, chapter 2.3, p. 22. Aggregated data can also be reported via on-line forms available in the GMP data warehouse (GMP DWH).

Structure of the database and associated code lists are based on following documents, recommendations and expert opinions (as adopted by the Stockholm Convention COP6 in 2013):

• Guidance on the Global Monitoring Plan for Persistent Organic Pollutants UNEP/POPS/COP.6/INF/31 (version January 2013)

• Conclusions of the Meeting of the Global Coordination Group and Regional Organization Groups for the Global Monitoring Plan for POPs, held in Geneva, 10–12 October 2012

• Conclusions of the Meeting of the expert group on data handling under the global monitoring plan for persistent organic pollutants, held in Brno, Czech Republic, 13-15 June 2012

The individual reported data component is inserted as:

• text or number (e.g. Site name, Monitoring programme, Value)
• a defined input selected from a particular code list (e.g., Country, Chemical – group, Sampling). All code lists (i.e., allowed values for individual parameters) are enclosed in this document, either in a particular section (e.g., Region, Method) or listed separately in the annexes below (Country, Chemical – group, Parameter) for your reference.
• multiple selection from a particular code list, i.e., more than one option can be selected (Potential source type)

Site

• Site ID (number)
Description: Identification code of the site generated by the GMP DWH system in the format GMP-XX-XXXXX

• Site name (text)
Description: Name of the site. Note: When providing data from a site that was already reported, the name used must be identical to that already contained in the GMP DWH.

• Longitude (number)
Description: Longitude of the site in decimal format (XX,XXXXXE or XX,XXXXXW)

• Latitude (number)
Description: Latitude of the site in decimal format (XX,XXXXXN or XX,XXXXXS)

• Region (code list)
Description: List of UN regional groups
o WEOG
o CEEC
o GRULAC
o Africa
o Asia and Pacific

• Country (code list)
Description: Country in which the site is located
o code list – see “Country” code list

• Site type (code list)
Description: Character of the site with respect to the population density defined in the document UNEP/POPS/COP.6/INF/31, p. 38
o Urban
o Sub-urban
o Rural
o Remote
o High altitude
o Polar

• Potential source type (code list, multiple selection)
Description: Character of the site with respect to potential sources of POPs defined in the document UNEP/POPS/COP.6/INF/31, p. 36
o Industrial
o Traffic
o Residential
o Agricultural
o Waste sector
o No specific source
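The coordinate format above (decimal comma plus a hemisphere letter) can be converted to a signed decimal number as sketched below. The function name is hypothetical, and the sign convention (south and west negative) is the usual one, assumed here rather than taken from this document.

```python
def parse_coordinate(text):
    """Convert e.g. '16,60796E' to 16.60796 and '33,45000S' to -33.45."""
    text = text.strip()
    if len(text) < 2 or text[-1].upper() not in "NSEW":
        raise ValueError(f"expected a number followed by N, S, E or W: {text!r}")
    hemisphere = text[-1].upper()
    magnitude = float(text[:-1].replace(",", "."))  # decimal comma -> decimal point
    limit = 90.0 if hemisphere in "NS" else 180.0
    if not 0.0 <= magnitude <= limit:
        raise ValueError(f"coordinate out of range: {text!r}")
    return -magnitude if hemisphere in "SW" else magnitude


print(parse_coordinate("16,60796E"))  # 16.60796
print(parse_coordinate("33,45000S"))  # -33.45
```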

Sampling attributes

• Year (number)
Description: Year in the format YYYY

• Start of sampling (number)
Description: Date in the format DD.MM.YYYY

• End of sampling (number)
Description: Date in the format DD.MM.YYYY

• Sampling frequency (code list)
Description: Periodicity of sampling within one year

• Largest gap (number)
Description: Largest gap between the end and the start of individual samples within the aggregated year (months)

• Type of sampling (code list)
Description: Type of air sampling
o Active
o Passive

• Type of passive sampling (code list)
Description: A method (filter) used for passive sampling; allowed for “Type of sampling” = “Passive”
o PUF
o SIP
o XAD

• Recalculation (code list)
Description: A method used to recalculate values from air-passive sampling to those of air-active sampling; allowed for “Type of sampling” = “Passive”
o PRC
o Calibration

• Calibration description (text)
Description: Short description of the calibration process; allowed for “Type of sampling” = “Passive”

• Monitoring programme/network (text)
Description: Name of the monitoring programme or network that provided this data record.
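The “Largest gap” attribute can be derived from the start and end dates of the individual samples, as sketched below. The conversion from days to months (an average month of 30.4375 days) and the function name are assumptions; this document only states that the gap is reported in months.

```python
from datetime import datetime

DATE_FORMAT = "%d.%m.%Y"   # DD.MM.YYYY, as required for start/end of sampling
DAYS_PER_MONTH = 30.4375   # assumption: average month length


def largest_gap_months(samples):
    """samples: list of (start, end) date strings; returns the largest gap in months."""
    periods = sorted(
        (datetime.strptime(start, DATE_FORMAT), datetime.strptime(end, DATE_FORMAT))
        for start, end in samples
    )
    # Gap between the end of one sample and the start of the next one.
    gaps = [
        (next_start - prev_end).days
        for (_, prev_end), (next_start, _) in zip(periods, periods[1:])
    ]
    # Overlapping samples give negative gaps; treat those as no gap.
    return max(max(gaps, default=0), 0) / DAYS_PER_MONTH


samples = [
    ("01.01.2013", "29.01.2013"),
    ("01.03.2013", "29.03.2013"),
    ("01.04.2013", "29.04.2013"),
]
print(round(largest_gap_months(samples), 2))  # 1.02
```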

Measurement

• Chemical – group (code list)
Description: Persistent organic pollutants (POPs) included in Annexes of the Stockholm Convention as defined in the document UNEP/POPS/COP.6/INF/31, chapter 2.1, p. 16. Please note that indicator and coplanar PCBs are separated.
o code list – see “Chemical – group” code list in the annex below

• Parameter (code list)
Description: Parent POPs, isomers and transformation products of POPs listed in the Stockholm Convention, and summations defined in the document UNEP/POPS/COP.6/INF/31, chapter 2.2, p. 19–21. The parameters are directly linked with units. Please note that each parameter should be reported in pg per m3, or in fg per m3 for dioxins and furans.
o code list – see “Parameter” code list

• Method (code list)
Description: Analytical method used for determination of the concentration
o GC-ECD
o GC-ECNI-MS
o GC-HRMS
o GC-MS
o HPLC
o HPLC-MS-MS

• LOQ (non-negative real number)
Description: Number representing Limit of quantification value

• No. of values (non-negative integer)
Description: Number representing amount of values aggregated

• No. under LoQ (non-negative integer)
Description: Number representing amount of values in this aggregation that were smaller than the LoQ value

• Value (mean) (non-negative real number)
Description: Number; Mean of aggregated values

• Value (median) (non-negative real number)
Description: Number; Median of aggregated values

• Minimum (non-negative real number)
Description: Number; Minimum value in this aggregation

• Maximum (non-negative real number)
Description: Number; Maximum value in this aggregation

• 5th percentile (non-negative real number)
Description: Number; Value on the 5% position of the aggregated data set (sorted from the lowest to highest concentration)

• 95th percentile (non-negative real number)
Description: Number; Value on the 95% position of the aggregated data set (sorted from the lowest to highest concentration)

• SD (non-negative real number)
Description: Number; Standard deviation of aggregated values

• Laboratory (text)
Description: Name of the laboratory performing analysis of this data record
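The aggregated items listed above can all be derived from a set of primary values, as the following sketch shows. Several choices are assumptions not fixed by this document: percentiles use the nearest-rank method (one reading of “value on the 5% position” of the sorted data), SD is the sample standard deviation, and values below the LOQ enter the statistics as reported, without substitution. The function and key names are hypothetical.

```python
import math
import statistics


def nearest_rank_percentile(sorted_values, percent):
    """Value at the given percent position of an ascending list (nearest-rank)."""
    rank = max(1, math.ceil(percent * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def aggregate(values, loq):
    """Compute the aggregated-data items from a non-empty list of primary values."""
    ordered = sorted(values)
    return {
        "no_of_values": len(ordered),
        "no_under_loq": sum(v < loq for v in ordered),
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "minimum": ordered[0],
        "maximum": ordered[-1],
        "p5": nearest_rank_percentile(ordered, 5),
        "p95": nearest_rank_percentile(ordered, 95),
        # Sample SD (n - 1 in the denominator); undefined for a single value.
        "sd": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
    }


# Made-up primary concentrations (pg/m3) and LOQ.
result = aggregate([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0], loq=2.5)
print(result["no_of_values"], result["no_under_loq"], result["mean"], result["median"])
# 10 2 5.5 5.5
```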


“Country” Code List

Afghanistan; Albania; Algeria; American Samoa; Andorra; Angola; Anguilla; Antarctica; Antigua and Barbuda; Argentina; Armenia; Aruba; Australia; Austria; Azerbaijan; Bahamas; Bahrain; Bangladesh; Barbados; Belarus; Belgium; Belize; Benin; Bermuda; Bhutan; Bolivia; Bosnia and Herzegovina; Botswana; Bouvet Island; Brazil; British Indian Ocean Territory; Brunei Darussalam; Bulgaria; Burkina Faso; Burundi; Cambodia; Cameroon; Canada; Cape Verde; Cayman Islands; Central African Republic; Ceuta; Cocos Islands (or Keeling Islands); Colombia; Comoros; Congo; Congo, Democratic Republic of; Cook Islands; Costa Rica; Côte d’Ivoire; Croatia; Cuba; Cyprus; Czech Republic; Denmark; Djibouti; Dominica; Dominican Republic; Ecuador; Egypt; El Salvador; Equatorial Guinea; Eritrea; Estonia; Ethiopia; Falkland Islands; Faroe Islands; Fiji; Finland; Former Yugoslav Republic of Macedonia; France; French Polynesia; French Southern Territories; Gabon; Gambia; Georgia; Germany; Ghana; Gibraltar; Greece; Greenland; Grenada; Guam; Guatemala; Guinea; Guinea-Bissau; Guyana; Haiti; Heard Island and McDonald Islands; Holy See; Honduras; Hong Kong; Hungary; Chad; Chile; China, People’s Republic of; Christmas Island; Iceland; India; Indonesia; Iran, Islamic Republic of; Iraq; Ireland; Israel; Italy; Jamaica; Japan; Jordan; Kazakhstan; Kenya; Kiribati; Korea, Democratic People’s Republic of; Korea, Republic of; Kosovo; Kuwait; Kyrgyzstan; Lao People’s Democratic Republic; Latvia; Lebanon; Lesotho; Liberia; Libyan Arab Jamahiriya; Liechtenstein; Lithuania; Luxembourg; Macao; Madagascar; Malawi; Malaysia; Maldives; Mali; Malta; Marshall Islands; Mauritania; Mauritius; Mayotte; Melilla; Mexico; Micronesia, Federated States of; Moldova, Republic of; Mongolia; Montenegro; Montserrat; Morocco; Mozambique; Myanmar; Namibia; Nauru; Nepal; Netherlands; Netherlands Antilles; New Caledonia; New Zealand; Nicaragua; Niger; Nigeria; Niue; Norfolk Island; Northern Mariana Islands; Norway; Occupied Palestinian Territory; Oman; Pakistan; Palau; Panama; Papua New Guinea; Paraguay; Peru; Philippines; Pitcairn; Poland; Portugal; Qatar; Romania; Russian Federation; Rwanda; Saint Helena; Saint Kitts and Nevis; Saint Lucia; Saint Pierre and Miquelon; Samoa; San Marino; Sao Tome and Principe; Saudi Arabia; Senegal; Serbia; Seychelles; Sierra Leone; Singapore; Slovakia; Slovenia; Solomon Islands; Somalia; South Africa; South Georgia and South Sandwich Islands; Spain; Sri Lanka; St Vincent and the Grenadines; Sudan; Suriname; Swaziland; Sweden; Switzerland; Syrian Arab Republic; Taiwan; Tajikistan; Tanzania, United Republic of; Thailand; Timor-Leste; Togo; Tokelau; Tonga; Trinidad and Tobago; Tunisia; Turkey; Turkmenistan; Turks and Caicos Islands; Tuvalu; Uganda; Ukraine; United Arab Emirates; United Kingdom; United States; United States Minor Outlying Islands; Uruguay; Uzbekistan; Vanuatu; Venezuela; Viet-Nam; Virgin Islands (US); Virgin Islands, British; Wallis and Futuna; Yemen; Zambia; Zimbabwe


“Chemical – group” Code List

Aldrin
Alpha-hexachlorocyclohexane (α-HCH)
Beta-hexachlorocyclohexane (β-HCH)
Chlordane
Chlordecone
Dichlorodiphenyltrichloroethane (DDT)
Dieldrin
Endosulfan
Endrin
Gamma-hexachlorocyclohexane (γ-HCH)
Heptachlor
Hexabromobiphenyl (HBB)
Hexabromocyclododecane (HBCD)
Hexabromodiphenyl ether and heptabromodiphenyl ether (c-octa PBDE)
Hexachlorobenzene (HCB)
Mirex
Pentachlorobenzene (PeCBz)
Perfluorooctane sulfonic acid (PFOS)
Polychlorinated biphenyls (PCB) – indicator
Polychlorinated biphenyls (PCB) – coplanar
Polychlorinated dibenzodioxins (PCDD)
Polychlorinated dibenzofurans (PCDF)
Tetrabromodiphenyl ether and pentabromodiphenyl ether (c-penta PBDE)
Toxaphene


“Parameter” Code List

Aldrin (pg/m3); Alpha-HCH (pg/m3); Beta-HCH (pg/m3); cis-Chlordane (= alpha) (pg/m3); trans-Chlordane (= gamma) (pg/m3); Oxychlordane (pg/m3); cis-Nonachlor (pg/m3); trans-Nonachlor (pg/m3); Chlordecone (pg/m3); o,p-DDT (pg/m3); o,p-DDD (pg/m3); o,p-DDE (pg/m3); p,p-DDT (pg/m3); p,p-DDD (pg/m3); p,p-DDE (pg/m3); Sum p,p-DDTs (pg/m3); Sum 6 DDTs (pg/m3); Dieldrin (pg/m3); Endosulfan I (alpha) (pg/m3); Endosulfan II (beta) (pg/m3); Endosulfan SO4 (pg/m3); Endrin (pg/m3); Gamma-HCH (pg/m3); Heptachlor (pg/m3); Sum 2 heptachlorepoxides (pg/m3); cis-Heptachlorepoxide (= exo, B) (pg/m3); trans-Heptachlorepoxide (= endo, A) (pg/m3); PBB 153 (pg/m3); Alpha-HBCD (pg/m3); Beta-HBCD (pg/m3); Gamma-HBCD (pg/m3); BDE 153 (pg/m3); BDE 154 (pg/m3); BDE 175/183 (pg/m3); HCB (pg/m3); Mirex (pg/m3); PeCB (pg/m3); PFOS (pg/m3); PFOSA (pg/m3); NMeFOSA (pg/m3); NEtFOSA (pg/m3); NMeFOSE (pg/m3); NEtFOSE (pg/m3); PCB 28 (pg/m3); PCB 52 (pg/m3); PCB 101 (pg/m3); PCB 138 (pg/m3); PCB 153 (pg/m3); PCB 180 (pg/m3); PCB 77 (pg/m3); PCB 81 (pg/m3); PCB 105 (pg/m3); PCB 114 (pg/m3); PCB 118 (pg/m3); PCB 123 (pg/m3); PCB 126 (pg/m3); PCB 156 (pg/m3); PCB 157 (pg/m3); PCB 167 (pg/m3); PCB 169 (pg/m3); PCB 189 (pg/m3); Sum 6 PCBs (pg/m3); Sum 7 PCBs (pg/m3); PCB WHO98-TEQ (pg/m3); PCB WHO2005-TEQ (pg/m3); 1,2,3,4,6,7,8-HpCDD (fg/m3); 1,2,3,4,7,8-HxCDD (fg/m3); 1,2,3,6,7,8-HxCDD (fg/m3); 1,2,3,7,8,9-HxCDD (fg/m3); 1,2,3,7,8-PeCDD (fg/m3); 2,3,7,8-TCDD (fg/m3); OCDD (fg/m3); 1,2,3,4,6,7,8-HpCDF (fg/m3); 1,2,3,4,7,8,9-HpCDF (fg/m3); 1,2,3,4,7,8-HxCDF (fg/m3); 1,2,3,6,7,8-HxCDF (fg/m3); 1,2,3,7,8,9-HxCDF (fg/m3); 1,2,3,7,8-PeCDF (fg/m3); 2,3,4,6,7,8-HxCDF (fg/m3); 2,3,4,7,8-PeCDF (fg/m3); 2,3,7,8-TCDF (fg/m3); OCDF (fg/m3); Sum 7 PCDDs (fg/m3); Sum 10 PCDFs (fg/m3); Sum 17 PCDDs/Fs (fg/m3); PCDDs WHO98-TEQ (fg/m3); PCDFs WHO98-TEQ (fg/m3); PCDDs/Fs WHO98-TEQ (fg/m3); PCDDs WHO2005-TEQ (fg/m3); PCDFs WHO2005-TEQ (fg/m3); PCDDs/Fs WHO2005-TEQ (fg/m3); BDE 17 (pg/m3); BDE 28 (pg/m3); BDE 47 (pg/m3); BDE 99 (pg/m3); BDE 100 (pg/m3); Parlar 26 (pg/m3); Parlar 50 (pg/m3); Parlar 40/41 (pg/m3); Parlar 44 (pg/m3); Parlar 62 (pg/m3)


Data structure – Human milk

The aim of this document is to provide a short and clear description of parameters (data items) that are to be reported in the data collection forms of the Global Monitoring Plan of the Stockholm Convention (GMP) data collection campaigns 2013–2014. Data itself should be reported by means of MS Excel sheets as suggested in the document UNEP/POPS/COP.6/INF/31, chapter 2.3, p. 22. Aggregated data can also be reported via on-line forms available in the GMP data warehouse (GMP DWH).

Structure of the database and associated code lists are based on following documents, recommendations and expert opinions adopted by the Stockholm Convention COP6 in 2013:

• Guidance on the Global Monitoring Plan for Persistent Organic Pollutants UNEP/POPS/COP.6/INF/31 (version January 2013)

• Conclusions of the Meeting of the Global Coordination Group and Regional Organization Groups for the Global Monitoring Plan for POPs, held in Geneva, 10–12 October 2012

• Conclusions of the Meeting of the expert group on data handling under the global monitoring plan for persistent organic pollutants, held in Brno, Czech Republic, 13-15 June 2012

The individual reported data component is inserted as:

• text or number (e.g. Site name, Monitoring programme, Value)
• a defined input selected from a particular code list (e.g., Country, Chemical – group, Sampling). All code lists (i.e., allowed values for individual parameters) are enclosed in this document, either in a particular section (e.g., Region, Method) or listed separately in the annexes below (Country, Chemical – group, Parameter) for your reference.
• multiple selection from a particular code list, i.e., more than one option can be selected (Potential source type)

Site

• Site ID (number)
Description: Identification code of the site generated by the GMP DWH system in the format GMP-XX-XXXXX

• Site name (text)
Description: Name of the site. Note: When providing data from a site that was already reported, the name used must be identical to that already contained in the GMP DWH.

• Region (code list)
Description: List of UN regional groups
o WEOG
o CEEC
o GRULAC
o Africa
o Asia and Pacific

• Country (code list)
Description: Country in which the site is located.
o code list – see “Country” code list


Sampling attributes

• Year (number)
Description: Year in the format YYYY

• Start of sampling (number)
Description: Date in the format DD.MM.YYYY

• End of sampling (number)
Description: Date in the format DD.MM.YYYY

• Type of sample (code list)
Description: Indicates whether the samples measured were from one individual or pooled from several donors.
o Individual
o Pooled

• Monitoring programme/network (text)
Description: Name of the monitoring programme or network that provided this data record.

Measurement

• Chemical – group (code list) Description: Persistent organic pollutants (POPs) included in the Annexes of the Stockholm Convention and defined in the document UNEP/POPS/COP.6/INF/31, chapter 2.1, p. 16. Please note that indicator and coplanar PCBs are separated.

o code list – see “Chemical – group” code list

• Parameter (code list) Description: Parent POPs, isomers and transformation products of POPs listed in the Stockholm Convention, and summations defined in the document UNEP/POPS/COP.6/INF/31, chapter 2.2, p. 19–21. The parameters are directly linked with units. Please note that each parameter can be reported in units per litre or per g of fat.

o code list – see “Parameter” code list

• Method (code list) Description: Analytical method used for determination of the concentration

o GC-ECD o GC-ECNI-MS o GC-HRMS o GC-MS o HPLC o HPLC-MS-MS

• LOQ (non-negative real number) Description: Number representing the limit of quantification (LoQ) value

• No. of values (positive integer) Description: Number of values aggregated

• No. under LoQ (positive integer) Description: Number of values in this aggregation that were smaller than the LoQ value

• Value (mean) (non-negative real number)

Description: Number; Mean of aggregated values


• Value (median) (non-negative real number) Description: Number; Median of aggregated values

• Minimum (non-negative real number) Description: Number; Minimum value in this aggregation

• Maximum (non-negative real number) Description: Number; Maximum value in this aggregation

• 5th percentile (non-negative real number) Description: Number; Value on the 5% position of the aggregated data set (sorted from the lowest to highest concentration)

• 95th percentile (non-negative real number) Description: Number; Value on the 95% position of the aggregated data set (sorted from the lowest to highest concentration)

• SD (non-negative real number) Description: Number; Standard deviation of aggregated values

• Laboratory (text) Description: Name of the laboratory performing analysis of this data record
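The aggregated Measurement fields above can be derived from a set of individual concentrations. The sketch below is one possible way to do so with the Python standard library. Several choices are assumptions, since this document does not prescribe them: percentiles use linear interpolation (the "inclusive" method), SD is the sample standard deviation, and values below the LoQ are counted but used as supplied, without substitution. The function name `aggregate` is hypothetical.

```python
import statistics


def aggregate(values, loq):
    """Summarise individual concentrations into the aggregated fields."""
    if not values:
        raise ValueError("at least one value is required")
    ordered = sorted(values)
    if len(ordered) >= 2:
        # 19 cut points at 5 %, 10 %, ..., 95 % (assumed interpolation).
        cuts = statistics.quantiles(ordered, n=20, method="inclusive")
        p5, p95 = cuts[0], cuts[-1]
        sd = statistics.stdev(ordered)  # sample SD (assumption)
    else:
        p5 = p95 = ordered[0]
        sd = 0.0
    return {
        "LOQ": loq,
        "No. of values": len(ordered),
        "No. under LoQ": sum(1 for v in ordered if v < loq),
        "Value (mean)": statistics.mean(ordered),
        "Value (median)": statistics.median(ordered),
        "Minimum": ordered[0],
        "Maximum": ordered[-1],
        "5th percentile": p5,
        "95th percentile": p95,
        "SD": sd,
    }
```

A laboratory that uses a different percentile definition or a substitution rule for values below the LoQ would need to adapt the corresponding lines.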


“Country” Code List Afghanistan Albania Algeria American Samoa Andorra Angola Anguilla Antarctica Antigua and Barbuda Argentina Armenia Aruba Australia Austria Azerbaijan Bahamas Bahrain Bangladesh Barbados Belarus Belgium Belize Benin Bermuda Bhutan Bolivia Bosnia and Herzegovina Botswana Bouvet Island Brazil British Indian Ocean Territory Brunei Darussalam Bulgaria Burkina Faso Burundi Cambodia Cameroon Canada Cape Verde Cayman Islands Central African Republic Ceuta Cocos Islands (or Keeling Islands) Colombia Comoros Congo Congo, Democratic Republic of Cook Islands Costa Rica Côte d’Ivoire Croatia Cuba Cyprus Czech Republic Denmark Djibouti Dominica Dominican Republic Ecuador Egypt El Salvador Equatorial Guinea Eritrea Estonia Ethiopia Falkland Islands Faroe Islands Fiji Finland Former Yugoslav Republic of Macedonia France French Polynesia French Southern Territories Gabon Gambia Georgia Germany Ghana Gibraltar

Greece Greenland Grenada Guam Guatemala Guinea Guinea-Bissau Guyana Haiti Heard Island and McDonald Islands Holy See Honduras Hong Kong Hungary Chad Chile China, People’s Republic of Christmas Island Iceland India Indonesia Iran, Islamic Republic of Iraq Ireland Israel Italy Jamaica Japan Jordan Kazakhstan Kenya Kiribati Korea, Democratic People’s Republic of Korea, Republic of Kosovo Kuwait Kyrgyzstan Lao People’s Democratic Republic Latvia Lebanon Lesotho Liberia Libyan Arab Jamahiriya Liechtenstein Lithuania Luxembourg Macao Madagascar Malawi Malaysia Maldives Mali Malta Marshall Islands Mauritania Mauritius Mayotte Melilla Mexico Micronesia, Federated States of Moldova, Republic of Mongolia Montenegro Montserrat Morocco Mozambique Myanmar Namibia Nauru Nepal Netherlands Netherlands Antilles New Caledonia New Zealand Nicaragua Niger Nigeria Niue Norfolk Island

Northern Mariana Islands Norway Occupied Palestinian Territory Oman Pakistan Palau Panama Papua New Guinea Paraguay Peru Philippines Pitcairn Poland Portugal Qatar Romania Russian Federation Rwanda Saint Helena Saint Kitts and Nevis Saint Lucia Saint Pierre and Miquelon Samoa San Marino Sao Tome and Principe Saudi Arabia Senegal Serbia Seychelles Sierra Leone Singapore Slovakia Slovenia Solomon Islands Somalia South Africa South Georgia and South Sandwich Islands Spain Sri Lanka St Vincent and the Grenadines Sudan Suriname Swaziland Sweden Switzerland Syrian Arab Republic Taiwan Tajikistan Tanzania, United Republic of Thailand Timor-Leste Togo Tokelau Tonga Trinidad and Tobago Tunisia Turkey Turkmenistan Turks and Caicos Islands Tuvalu Uganda Ukraine United Arab Emirates United Kingdom United States United States Minor Outlying Islands Uruguay Uzbekistan Vanuatu Venezuela Viet-Nam Virgin Islands (US) Virgin Islands, British Wallis and Futuna Yemen Zambia Zimbabwe


“Chemical – group” Code List Aldrin Alpha-hexachlorocyclohexane (α-HCH) Beta-hexachlorocyclohexane (β-HCH) Chlordane Chlordecone Dichlorodiphenyltrichloroethane (DDT) Dieldrin Endosulfan Endrin Gamma-hexachlorocyclohexane (γ-HCH) Heptachlor Hexabromobiphenyl (HBB) Hexabromocyclododecane (HBCD) Hexabromodiphenyl ether and heptabromodiphenyl ether (c-octa PBDE) Hexachlorobenzene (HCB) Mirex Pentachlorobenzene (PeCBz) Perfluorooctane sulfonic acid (PFOS) Polychlorinated biphenyls (PCB) – indicator Polychlorinated biphenyls (PCB) – coplanar Polychlorinated dibenzodioxins (PCDD) Polychlorinated dibenzofurans (PCDF) Tetrabromodiphenyl ether and pentabromodiphenyl ether (c-penta PBDE) Toxaphene


“Parameter” Code List Reporting in ng or pg/g fat is preferred.

Aldrin (ng/g fat) Aldrin (ng/l) Alpha-HCH (ng/g fat) Alpha-HCH (ng/l) Beta-HCH (ng/g fat) Beta-HCH (ng/l) cis-Chlordane (= alpha) (ng/g fat) cis-Chlordane (= alpha) (ng/l) trans-Chlordane (= gamma) (ng/g fat) trans-Chlordane (= gamma) (ng/l) cis-Nonachlor (ng/g fat) cis-Nonachlor (ng/l) Oxychlordane (ng/g fat) Oxychlordane (ng/l) trans-Nonachlor (ng/g fat) trans-Nonachlor (ng/l) Chlordecone (ng/g fat) Chlordecone (ng/l) o,p-DDT (ng/g fat) o,p-DDT (ng/l) o,p-DDD (ng/g fat) o,p-DDD (ng/l) o,p-DDE (ng/g fat) o,p-DDE (ng/l) p,p-DDT (ng/g fat) p,p-DDT (ng/l) p,p-DDD (ng/g fat) p,p-DDD (ng/l) p,p-DDE (ng/g fat) p,p-DDE (ng/l) Sum p,p-DDTs (ng/g fat) Sum p,p-DDTs (ng/l) Sum 6 DDTs (ng/g fat) Sum 6 DDTs (ng/l) Dieldrin (ng/g fat) Dieldrin (ng/l) Endosulfan I (alpha) (ng/g fat) Endosulfan I (alpha) (ng/l) Endosulfan II (beta) (ng/g fat) Endosulfan II (beta) (ng/l) Endosulfan SO4 (ng/g fat) Endosulfan SO4 (ng/l) Endrin (ng/g fat) Endrin (ng/l) Gamma-HCH (ng/g fat) Gamma-HCH (ng/l) Heptachlor (ng/g fat) Heptachlor (ng/l) Sum 2 heptachlorepoxides (ng/g fat) Sum 2 heptachlorepoxides (ng/l) cis-Heptachlorepoxide (= exo, B) (ng/g fat) cis-Heptachlorepoxide (= exo, B) (ng/l) trans-Heptachlorepoxide (= endo, A) (ng/g fat) trans-Heptachlorepoxide (= endo, A) (ng/l) PBB 153 (ng/g fat) PBB 153 (ng/l) Alpha-HBCD (ng/g fat) Alpha-HBCD (ng/l) Beta-HBCD (ng/g fat) Beta-HBCD (ng/l) Gamma-HBCD (ng/g fat) Gamma-HBCD (ng/l) BDE 153 (ng/g fat) BDE 153 (ng/l) BDE 154 (ng/g fat) BDE 154 (ng/l) BDE 175/183 (ng/g fat) BDE 175/183 (ng/l)

HCB (ng/g fat) HCB (ng/l) Mirex (ng/g fat) Mirex (ng/l) PeCB (ng/g fat) PeCB (ng/l) PFOS (ng/g fat) PFOS (ng/l) PFOSA (ng/g fat) PFOSA (ng/l) NMeFOSA (ng/g fat) NMeFOSA (ng/l) NEtFOSA (ng/g fat) NEtFOSA (ng/l) NMeFOSE (ng/g fat) NMeFOSE (ng/l) NEtFOSE (ng/g fat) NEtFOSE (ng/l) PCB 28 (ng/g fat) PCB 28 (ng/l) PCB 52 (ng/g fat) PCB 52 (ng/l) PCB 101 (ng/g fat) PCB 101 (ng/l) PCB 138 (ng/g fat) PCB 138 (ng/l) PCB 153 (ng/g fat) PCB 153 (ng/l) PCB 180 (ng/g fat) PCB 180 (ng/l) PCB 77 (ng/g fat) PCB 77 (ng/l) PCB 81 (ng/g fat) PCB 81 (ng/l) PCB 105 (ng/g fat) PCB 105 (ng/l) PCB 114 (ng/g fat) PCB 114 (ng/l) PCB 118 (ng/g fat) PCB 118 (ng/l) PCB 123 (ng/g fat) PCB 123 (ng/l) PCB 126 (ng/g fat) PCB 126 (ng/l) PCB 156 (ng/g fat) PCB 156 (ng/l) PCB 157 (ng/g fat) PCB 157 (ng/l) PCB 167 (ng/g fat) PCB 167 (ng/l) PCB 169 (ng/g fat) PCB 169 (ng/l) PCB 189 (ng/g fat) PCB 189 (ng/l) Sum 6 PCBs (ng/g fat) Sum 6 PCBs (ng/l) Sum 7 PCBs (ng/g fat) Sum 7 PCBs (ng/l) PCBs WHO98-TEQ (pg/g fat) PCBs WHO98-TEQ (pg/l) PCBs WHO2005-TEQ (pg/g fat) PCBs WHO2005-TEQ (pg/l) 1,2,3,4,6,7,8-HpCDD (pg/g fat) 1,2,3,4,6,7,8-HpCDD (pg/l) 1,2,3,4,7,8-HxCDD (pg/g fat) 1,2,3,4,7,8-HxCDD (pg/l) 1,2,3,6,7,8-HxCDD (pg/g fat) 1,2,3,6,7,8-HxCDD (pg/l) 1,2,3,7,8,9-HxCDD (pg/g fat) 1,2,3,7,8,9-HxCDD (pg/l)

1,2,3,7,8-PeCDD (pg/g fat) 1,2,3,7,8-PeCDD (pg/l) 2,3,7,8-TCDD (pg/g fat) 2,3,7,8-TCDD (pg/l) OCDD (pg/g fat) OCDD (pg/l) 1,2,3,4,6,7,8-HpCDF (pg/g fat) 1,2,3,4,6,7,8-HpCDF (pg/l) 1,2,3,4,7,8,9-HpCDF (pg/g fat) 1,2,3,4,7,8,9-HpCDF (pg/l) 1,2,3,4,7,8-HxCDF (pg/g fat) 1,2,3,4,7,8-HxCDF (pg/l) 1,2,3,6,7,8-HxCDF (pg/g fat) 1,2,3,6,7,8-HxCDF (pg/l) 1,2,3,7,8,9-HxCDF (pg/g fat) 1,2,3,7,8,9-HxCDF (pg/l) 1,2,3,7,8-PeCDF (pg/g fat) 1,2,3,7,8-PeCDF (pg/l) 2,3,4,6,7,8-HxCDF (pg/g fat) 2,3,4,6,7,8-HxCDF (pg/l) 2,3,4,7,8-PeCDF (pg/g fat) 2,3,4,7,8-PeCDF (pg/l) 2,3,7,8-TCDF (pg/g fat) 2,3,7,8-TCDF (pg/l) OCDF (pg/g fat) OCDF (pg/l) Sum 7 PCDDs (pg/g fat) Sum 7 PCDDs (pg/l) Sum 10 PCDFs (pg/g fat) Sum 10 PCDFs (pg/l) Sum 17 PCDDs/Fs (pg/g fat) Sum 17 PCDDs/Fs (pg/l) PCDDs WHO98-TEQ (pg/g fat) PCDDs WHO98-TEQ (pg/l) PCDFs WHO98-TEQ (pg/g fat) PCDFs WHO98-TEQ (pg/l) PCDDs/Fs WHO98-TEQ (pg/g fat) PCDDs/Fs WHO98-TEQ (pg/l) PCDDs WHO2005-TEQ (pg/g fat) PCDDs WHO2005-TEQ (pg/l) PCDFs WHO2005-TEQ (pg/g fat) PCDFs WHO2005-TEQ (pg/l) PCDDs/Fs WHO2005-TEQ (pg/g fat) PCDDs/Fs WHO2005-TEQ (pg/l) BDE 17 (ng/g fat) BDE 17 (ng/l) BDE 28 (ng/g fat) BDE 28 (ng/l) BDE 47 (ng/g fat) BDE 47 (ng/l) BDE 99 (ng/g fat) BDE 99 (ng/l) BDE 100 (ng/g fat) BDE 100 (ng/l) Parlar 26 (ng/g fat) Parlar 26 (ng/l) Parlar 50 (ng/g fat) Parlar 50 (ng/l) Parlar 40/41 (ng/g fat) Parlar 40/41 (ng/l) Parlar 44 (ng/g fat) Parlar 44 (ng/l) Parlar 62 (ng/g fat) Parlar 62 (ng/l)
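Because every entry of the “Parameter” code list carries its unit in trailing parentheses, an entry can be split into analyte name and unit mechanically. The sketch below shows one way to do this for the ng or pg units used in this list; the function name and the "fat"/"volume" labels are illustrative assumptions, and the pattern tolerates optional spaces around the slash.

```python
import re

# Name, then a final parenthesised unit: ng or pg, per g fat or per litre.
PARAMETER_PATTERN = re.compile(
    r"^(?P<name>.+) \((?P<unit>(?:ng|pg) ?/ ?(?:g fat|l))\)$"
)


def split_parameter(entry):
    """Split a code-list entry into (name, unit, basis)."""
    match = PARAMETER_PATTERN.match(entry.strip())
    if match is None:
        raise ValueError(f"unrecognised parameter entry: {entry!r}")
    # Normalise spacing around the slash, e.g. "ng /l" becomes "ng/l".
    unit = re.sub(r"\s*/\s*", "/", match.group("unit"))
    basis = "fat" if unit.endswith("g fat") else "volume"
    return match.group("name"), unit, basis
```

Names that themselves contain parentheses, such as the chlordane isomers, are handled because only the last parenthesised group is treated as the unit.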


Data structure – Human blood

The aim of this document is to provide a short and clear description of the parameters (data items) that are to be reported in the data collection forms of the Global Monitoring Plan (GMP) data collection campaigns 2013–2014. The data itself should be reported by means of MS Excel sheets as suggested in the document UNEP/POPS/COP.6/INF/31, chapter 2.3, p. 22. Aggregated data can also be reported via on-line forms available in the GMP data warehouse (GMP DWH).

The structure of the database and the associated code lists are based on the following documents, recommendations and expert opinions as adopted by the Stockholm Convention COP6 in 2013:

• Guidance on the Global Monitoring Plan for Persistent Organic Pollutants UNEP/POPS/COP.6/INF/31 (version January 2013)

• Conclusions of the Meeting of the Global Coordination Group and Regional Organization Groups for the Global Monitoring Plan for POPs, held in Geneva, 10–12 October 2012

• Conclusions of the Meeting of the expert group on data handling under the global monitoring plan for persistent organic pollutants, held in Brno, Czech Republic, 13–15 June 2012

The individual reported data component is inserted as:

• text or number (e.g., Site name, Monitoring programme, Value)

• a defined input selected from a particular code list (e.g., Country, Chemical – group, Sampling). All code lists (i.e., allowed values for individual parameters) are enclosed in this document, either in a particular section (e.g., Region, Method) or listed separately in the annexes below (Country, Chemical – group, Parameter) for your reference.

• multiple selection from a particular code list, i.e., more than one option can be selected (Potential source type)

Site

• Site ID (number) Description: Identification code of the site generated by the GMP DWH system in the format GMP-XX-XXXXX

• Site name (text) Description: Name of the site. Note: When providing data from a site that was reported previously, the name used this time must be identical to that already contained in the GMP DWH.

• Region (code list) Description: list of UN regional groups

o WEOG o CEEC o GRULAC o Africa o Asia and Pacific

• Country (code list) Description: Country in which the site is located.

o code list – see “Country” code list


Sampling attributes

• Year (number) Description: Year in the format YYYY

• Start of sampling (number) Description: Date in the format DD.MM.YYYY

• End of sampling (number) Description: Date in the format DD.MM.YYYY

• Blood source (code list) Description: Specification of the blood that was sampled.

o Blood – other o Blood – maternal o Blood – children o Blood – cord

• Fraction (code list) Description: Specification of the blood fraction which was analysed.

o Plasma o Serum o Whole blood

• Monitoring programme/network (text) Description: Name of the monitoring programme or network that provided this data record.

Measurement

• Chemical – group (code list) Description: Persistent organic pollutants (POPs) included in the Annexes of the Stockholm Convention and defined in the document UNEP/POPS/COP.6/INF/31, chapter 2.1, p. 16. Please note that indicator and coplanar PCBs are separated.

o code list – see “Chemical – group” code list

• Parameter (code list) Description: Parent POPs, isomers and transformation products of POPs listed in the Stockholm Convention, and summations defined in the document UNEP/POPS/COP.6/INF/31, chapter 2.2, p. 19–21. The parameters are directly linked with units. Please note that each parameter can be reported in units per litre or per g of fat.

o code list – see “Parameter” code list

• Method (code list) Description: Analytical method used for determination of the concentration

o GC-ECD o GC-ECNI-MS o GC-HRMS o GC-MS o HPLC o HPLC-MS-MS

• LOQ (non-negative real number) Description: Number representing Limit of quantification value


• No. of values (positive integer) Description: Number of values aggregated

• No. under LoQ (positive integer) Description: Number of values in this aggregation that were smaller than the LoQ value

• Value (mean) (non-negative real number) Description: Number; Mean of aggregated values

• Value (median) (non-negative real number) Description: Number; Median of aggregated values

• Minimum (non-negative real number) Description: Number; Minimum value in this aggregation

• Maximum (non-negative real number) Description: Number; Maximum value in this aggregation

• 5th percentile (non-negative real number) Description: Number; Value on the 5% position of the aggregated data set (sorted from the lowest to highest concentration)

• 95th percentile (non-negative real number) Description: Number; Value on the 95% position of the aggregated data set (sorted from the lowest to highest concentration)

• SD (non-negative real number) Description: Number; Standard deviation of aggregated values

• Laboratory (text) Description: Name of the laboratory performing analysis of this data record


“Country” Code List Afghanistan Albania Algeria American Samoa Andorra Angola Anguilla Antarctica Antigua and Barbuda Argentina Armenia Aruba Australia Austria Azerbaijan Bahamas Bahrain Bangladesh Barbados Belarus Belgium Belize Benin Bermuda Bhutan Bolivia Bosnia and Herzegovina Botswana Bouvet Island Brazil British Indian Ocean Territory Brunei Darussalam Bulgaria Burkina Faso Burundi Cambodia Cameroon Canada Cape Verde Cayman Islands Central African Republic Ceuta Cocos Islands (or Keeling Islands) Colombia Comoros Congo Congo, Democratic Republic of Cook Islands Costa Rica Côte d’Ivoire Croatia Cuba Cyprus Czech Republic Denmark Djibouti Dominica Dominican Republic Ecuador Egypt El Salvador Equatorial Guinea Eritrea Estonia Ethiopia Falkland Islands Faroe Islands Fiji Finland Former Yugoslav Republic of Macedonia France French Polynesia French Southern Territories Gabon Gambia Georgia Germany Ghana Gibraltar

Greece Greenland Grenada Guam Guatemala Guinea Guinea-Bissau Guyana Haiti Heard Island and McDonald Islands Holy See Honduras Hong Kong Hungary Chad Chile China, People’s Republic of Christmas Island Iceland India Indonesia Iran, Islamic Republic of Iraq Ireland Israel Italy Jamaica Japan Jordan Kazakhstan Kenya Kiribati Korea, Democratic People’s Republic of Korea, Republic of Kosovo Kuwait Kyrgyzstan Lao People’s Democratic Republic Latvia Lebanon Lesotho Liberia Libyan Arab Jamahiriya Liechtenstein Lithuania Luxembourg Macao Madagascar Malawi Malaysia Maldives Mali Malta Marshall Islands Mauritania Mauritius Mayotte Melilla Mexico Micronesia, Federated States of Moldova, Republic of Mongolia Montenegro Montserrat Morocco Mozambique Myanmar Namibia Nauru Nepal Netherlands Netherlands Antilles New Caledonia New Zealand Nicaragua Niger Nigeria Niue Norfolk Island

Northern Mariana Islands Norway Occupied Palestinian Territory Oman Pakistan Palau Panama Papua New Guinea Paraguay Peru Philippines Pitcairn Poland Portugal Qatar Romania Russian Federation Rwanda Saint Helena Saint Kitts and Nevis Saint Lucia Saint Pierre and Miquelon Samoa San Marino Sao Tome and Principe Saudi Arabia Senegal Serbia Seychelles Sierra Leone Singapore Slovakia Slovenia Solomon Islands Somalia South Africa South Georgia and South Sandwich Islands Spain Sri Lanka St Vincent and the Grenadines Sudan Suriname Swaziland Sweden Switzerland Syrian Arab Republic Taiwan Tajikistan Tanzania, United Republic of Thailand Timor-Leste Togo Tokelau Tonga Trinidad and Tobago Tunisia Turkey Turkmenistan Turks and Caicos Islands Tuvalu Uganda Ukraine United Arab Emirates United Kingdom United States United States Minor Outlying Islands Uruguay Uzbekistan Vanuatu Venezuela Viet-Nam Virgin Islands (US) Virgin Islands, British Wallis and Futuna Yemen Zambia Zimbabwe


“Chemical – group” Code List Aldrin Alpha-hexachlorocyclohexane (α-HCH) Beta-hexachlorocyclohexane (β-HCH) Chlordane Chlordecone Dichlorodiphenyltrichloroethane (DDT) Dieldrin Endosulfan Endrin Gamma-hexachlorocyclohexane (γ-HCH) Heptachlor Hexabromobiphenyl (HBB) Hexabromocyclododecane (HBCD) Hexabromodiphenyl ether and heptabromodiphenyl ether (c-octa PBDE) Hexachlorobenzene (HCB) Mirex Pentachlorobenzene (PeCBz) Perfluorooctane sulfonic acid (PFOS) Polychlorinated biphenyls (PCB) – indicator Polychlorinated biphenyls (PCB) – coplanar Polychlorinated dibenzodioxins (PCDD) Polychlorinated dibenzofurans (PCDF) Tetrabromodiphenyl ether and pentabromodiphenyl ether (c-penta PBDE) Toxaphene


“Parameter” Code List Reporting in ng or pg/g fat is preferred.

Aldrin (ng/g fat) Aldrin (ng/l) Alpha-HCH (ng/g fat) Alpha-HCH (ng/l) Beta-HCH (ng/g fat) Beta-HCH (ng/l) cis-Chlordane (= alpha) (ng/g fat) cis-Chlordane (= alpha) (ng/l) trans-Chlordane (= gamma) (ng/g fat) trans-Chlordane (= gamma) (ng/l) cis-Nonachlor (ng/g fat) cis-Nonachlor (ng/l) Oxychlordane (ng/g fat) Oxychlordane (ng/l) trans-Nonachlor (ng/g fat) trans-Nonachlor (ng/l) Chlordecone (ng/g fat) Chlordecone (ng/l) o,p-DDT (ng/g fat) o,p-DDT (ng/l) o,p-DDD (ng/g fat) o,p-DDD (ng/l) o,p-DDE (ng/g fat) o,p-DDE (ng/l) p,p-DDT (ng/g fat) p,p-DDT (ng/l) p,p-DDD (ng/g fat) p,p-DDD (ng/l) p,p-DDE (ng/g fat) p,p-DDE (ng/l) Sum p,p-DDTs (ng/g fat) Sum p,p-DDTs (ng/l) Sum 6 DDTs (ng/g fat) Sum 6 DDTs (ng/l) Dieldrin (ng/g fat) Dieldrin (ng/l) Endosulfan I (alpha) (ng/g fat) Endosulfan I (alpha) (ng/l) Endosulfan II (beta) (ng/g fat) Endosulfan II (beta) (ng/l) Endosulfan SO4 (ng/g fat) Endosulfan SO4 (ng/l) Endrin (ng/g fat) Endrin (ng/l) Gamma-HCH (ng/g fat) Gamma-HCH (ng/l) Heptachlor (ng/g fat) Heptachlor (ng/l) Sum 2 heptachlorepoxides (ng/g fat) Sum 2 heptachlorepoxides (ng/l) cis-Heptachlorepoxide (= exo, B) (ng/g fat) cis-Heptachlorepoxide (= exo, B) (ng/l) trans-Heptachlorepoxide (= endo, A) (ng/g fat) trans-Heptachlorepoxide (= endo, A) (ng/l) PBB 153 (ng/g fat) PBB 153 (ng/l) Alpha-HBCD (ng/g fat) Alpha-HBCD (ng/l) Beta-HBCD (ng/g fat) Beta-HBCD (ng/l) Gamma-HBCD (ng/g fat) Gamma-HBCD (ng/l) BDE 153 (ng/g fat) BDE 153 (ng/l) BDE 154 (ng/g fat) BDE 154 (ng/l) BDE 175/183 (ng/g fat) BDE 175/183 (ng/l) HCB (ng/g fat) HCB (ng/l)

Mirex (ng/g fat) Mirex (ng/l) PeCB (ng/g fat) PeCB (ng/l) PFOS (ng/g fat) PFOS (ng/l) PFOSA (ng/g fat) PFOSA (ng/l) NMeFOSA (ng/g fat) NMeFOSA (ng/l) NEtFOSA (ng/g fat) NEtFOSA (ng/l) NMeFOSE (ng/g fat) NMeFOSE (ng/l) NEtFOSE (ng/g fat) NEtFOSE (ng/l) PCB 28 (ng/g fat) PCB 28 (ng/l) PCB 52 (ng/g fat) PCB 52 (ng/l) PCB 101 (ng/g fat) PCB 101 (ng/l) PCB 138 (ng/g fat) PCB 138 (ng/l) PCB 153 (ng/g fat) PCB 153 (ng/l) PCB 180 (ng/g fat) PCB 180 (ng/l) PCB 77 (ng/g fat) PCB 77 (ng/l) PCB 81 (ng/g fat) PCB 81 (ng/l) PCB 105 (ng/g fat) PCB 105 (ng/l) PCB 114 (ng/g fat) PCB 114 (ng/l) PCB 118 (ng/g fat) PCB 118 (ng/l) PCB 123 (ng/g fat) PCB 123 (ng/l) PCB 126 (ng/g fat) PCB 126 (ng/l) PCB 156 (ng/g fat) PCB 156 (ng/l) PCB 157 (ng/g fat) PCB 157 (ng/l) PCB 167 (ng/g fat) PCB 167 (ng/l) PCB 169 (ng/g fat) PCB 169 (ng/l) PCB 189 (ng/g fat) PCB 189 (ng/l) Sum 6 PCBs (ng/g fat) Sum 6 PCBs (ng/l) Sum 7 PCBs (ng/g fat) Sum 7 PCBs (ng/l) PCBs WHO98-TEQ (pg/g fat) PCBs WHO98-TEQ (pg/l) PCBs WHO2005-TEQ (pg/g fat) PCBs WHO2005-TEQ (pg/l) 1,2,3,4,6,7,8-HpCDD (pg/g fat) 1,2,3,4,6,7,8-HpCDD (pg/l) 1,2,3,4,7,8-HxCDD (pg/g fat) 1,2,3,4,7,8-HxCDD (pg/l) 1,2,3,6,7,8-HxCDD (pg/g fat) 1,2,3,6,7,8-HxCDD (pg/l) 1,2,3,7,8,9-HxCDD (pg/g fat) 1,2,3,7,8,9-HxCDD (pg/l) 1,2,3,7,8-PeCDD (pg/g fat) 1,2,3,7,8-PeCDD (pg/l)

2,3,7,8-TCDD (pg/g fat) 2,3,7,8-TCDD (pg/l) OCDD (pg/g fat) OCDD (pg/l) 1,2,3,4,6,7,8-HpCDF (pg/g fat) 1,2,3,4,6,7,8-HpCDF (pg/l) 1,2,3,4,7,8,9-HpCDF (pg/g fat) 1,2,3,4,7,8,9-HpCDF (pg/l) 1,2,3,4,7,8-HxCDF (pg/g fat) 1,2,3,4,7,8-HxCDF (pg/l) 1,2,3,6,7,8-HxCDF (pg/g fat) 1,2,3,6,7,8-HxCDF (pg/l) 1,2,3,7,8,9-HxCDF (pg/g fat) 1,2,3,7,8,9-HxCDF (pg/l) 1,2,3,7,8-PeCDF (pg/g fat) 1,2,3,7,8-PeCDF (pg/l) 2,3,4,6,7,8-HxCDF (pg/g fat) 2,3,4,6,7,8-HxCDF (pg/l) 2,3,4,7,8-PeCDF (pg/g fat) 2,3,4,7,8-PeCDF (pg/l) 2,3,7,8-TCDF (pg/g fat) 2,3,7,8-TCDF (pg/l) OCDF (pg/g fat) OCDF (pg/l) Sum 7 PCDDs (pg/g fat) Sum 7 PCDDs (pg/l) Sum 10 PCDFs (pg/g fat) Sum 10 PCDFs (pg/l) Sum 17 PCDDs/Fs (pg/g fat) Sum 17 PCDDs/Fs (pg/l) PCDDs WHO98-TEQ (pg/g fat) PCDDs WHO98-TEQ (pg/l) PCDFs WHO98-TEQ (pg/g fat) PCDFs WHO98-TEQ (pg/l) PCDDs/Fs WHO98-TEQ (pg/g fat) PCDDs/Fs WHO98-TEQ (pg/l) PCDDs WHO2005-TEQ (pg/g fat) PCDDs WHO2005-TEQ (pg/l) PCDFs WHO2005-TEQ (pg/g fat) PCDFs WHO2005-TEQ (pg/l) PCDDs/Fs WHO2005-TEQ (pg/g fat) PCDDs/Fs WHO2005-TEQ (pg/l) BDE 17 (ng/g fat) BDE 17 (ng/l) BDE 28 (ng/g fat) BDE 28 (ng/l) BDE 47 (ng/g fat) BDE 47 (ng/l) BDE 99 (ng/g fat) BDE 99 (ng/l) BDE 100 (ng/g fat) BDE 100 (ng/l) Parlar 26 (ng/g fat) Parlar 26 (ng/l) Parlar 50 (ng/g fat) Parlar 50 (ng/l) Parlar 40/41 (ng/g fat) Parlar 40/41 (ng/l) Parlar 44 (ng/g fat) Parlar 44 (ng/l) Parlar 62 (ng/g fat) Parlar 62 (ng/l)


Data structure – Water

The aim of this document is to provide a short and clear description of the parameters (data items) that are to be reported in the data collection forms of the Global Monitoring Plan (GMP) data collection campaigns 2013–2014. The data itself should be reported by means of MS Excel sheets as suggested in the document UNEP/POPS/COP.6/INF/31, chapter 2.3, p. 22. Aggregated data can also be reported via on-line forms available in the GMP data warehouse (GMP DWH).

The structure of the database and the associated code lists are based on the following documents, recommendations and expert opinions as adopted by the Stockholm Convention COP6 in 2013:

• Guidance on the Global Monitoring Plan for Persistent Organic Pollutants UNEP/POPS/COP.6/INF/31 (version January 2013)

• Conclusions of the Meeting of the Global Coordination Group and Regional Organization Groups for the Global Monitoring Plan for POPs, held in Geneva, 10–12 October 2012

• Conclusions of the Meeting of the expert group on data handling under the global monitoring plan for persistent organic pollutants, held in Brno, Czech Republic, 13–15 June 2012

The individual reported data component is inserted as:

• free text or number (e.g., Site name, Monitoring programme, Value)

• a defined item selected from a particular code list (e.g., Country, Chemical – group, Sampling). All code lists (i.e., allowed values for individual parameters) are enclosed in this document, either in a particular section (e.g., Region, Method) or listed separately in the annexes below (Country, Chemical – group, Parameter) for your reference.

• multiple selection from a particular code list, i.e., more than one option can be selected (Discharges)

Site

• Site ID (number) Description: Identification code of the site generated by the GMP DWH system in the format GMP-XX-XXXXX

• Site name (text) Description: Name of the site. Note: When providing data from a site that was reported previously, the name used this time must be identical to that already contained in the GMP DWH. The recommended format is “name of the water body – name of the sampling site”, e.g. “Elbe – Decin” or “Donau – Bratislava”.

• Surface water type (code list) Description: Specification of water body which was sampled

o Fresh water – lake o Fresh water – reservoir o Fresh water – river o Fresh water – channel o Marine water – coastal o Marine water – open ocean o Transitional


• Point coordinates allowed for “Type of spatial data” = “Point”

o Longitude (number) Description: Longitude of the site in decimal format (XX,XXXXXE or XX,XXXXXW)

o Latitude (number) Description: Latitude of the site in decimal format (XX,XXXXXN or XX,XXXXXS)

• Region (code list) Description: list of UN regional groups

o WEOG o CEEC o GRULAC o Africa o Asia and Pacific

• Country (code list) Description: Country in which the site is located. Not allowed for “Surface water type” = “Marine water – open ocean”.

o code list – see “Country” code list

• Ocean or sea (code list) Description: Ocean or sea in which the site is located. Allowed for “Surface water type” = “Marine water – open ocean” or “Marine water – coastal” or “Transitional”.

o code list – see “Ocean or sea” code list

• Site type (code list) Description: Character of the site with respect to the population density (for rivers and channels, state the highest density along the whole stream). Allowed for “Surface water type” = “Fresh water – lake” or “Fresh water – reservoir” or “Fresh water – river” or “Fresh water – channel”.

o Urban o Rural o Remote o High altitude o Polar

• Discharges (code list, multiple selection) Description: Character of the site with respect to potential sources of POPs. Allowed for “Surface water type” = “Fresh water – lake” or “Fresh water – reservoir” or “Fresh water – river” or “Fresh water – channel”.

o Municipal o Industrial o Agricultural o None
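The "allowed for" and "not allowed for" conditions in the Site section make some fields depend on the value of “Surface water type”. The sketch below expresses those conditions as a small check that lists the fields filled in where they are not permitted. The record is assumed to be a plain dictionary keyed by field name, and the function name is hypothetical; how the GMP DWH itself enforces these rules is not described here.

```python
FRESH_WATER_TYPES = {
    "Fresh water – lake",
    "Fresh water – reservoir",
    "Fresh water – river",
    "Fresh water – channel",
}
OPEN_OCEAN = "Marine water – open ocean"
OCEAN_OR_SEA_TYPES = {OPEN_OCEAN, "Marine water – coastal", "Transitional"}


def disallowed_fields(record):
    """Return names of filled-in fields not allowed for the water type."""
    water_type = record.get("Surface water type")
    problems = []
    # Country is not allowed for open-ocean sites.
    if water_type == OPEN_OCEAN and record.get("Country"):
        problems.append("Country")
    # Ocean or sea is allowed only for marine and transitional waters.
    if water_type not in OCEAN_OR_SEA_TYPES and record.get("Ocean or sea"):
        problems.append("Ocean or sea")
    # Site type and Discharges are allowed only for fresh waters.
    if water_type not in FRESH_WATER_TYPES:
        for field in ("Site type", "Discharges"):
            if record.get(field):
                problems.append(field)
    return problems
```

For example, a river site with a Country, a Site type and a Discharges selection passes, whereas an open-ocean record that also names a Country is reported as `["Country"]`.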

Sampling attributes

• Year (number) Description: Year in the format YYYY


• Start of sampling (number) Description: Date in the format DD.MM.YYYY

• End of sampling (number) Description: Date in the format DD.MM.YYYY

• Sampling frequency (code list) Description: Periodicity of sampling within one year

• Largest gap (number) Description: Largest gap between the end of one individual sample and the start of the next within the aggregated year (months)

• Type of sampling (code list) Description: Type of water sampling

o Active o Passive

• Depth – minimum (non-negative real number) Description: Number; Minimum depth at which the sample was collected (m).

• Depth – maximum (non-negative real number) Description: Number; Maximum depth at which the sample was collected (m).

• Temperature Description: Temperature of collected water (°C).

o Value (mean) (non-negative real number) Description: Number; Mean of aggregated values

o Minimum (non-negative real number) Description: Number; Minimum value in this aggregation

o Maximum (non-negative real number) Description: Number; Maximum value in this aggregation

• Salinity Description: Salinity of collected water (‰).

o Value (mean) (non-negative real number) Description: Number; Mean of aggregated values

o Minimum (non-negative real number) Description: Number; Minimum value in this aggregation

o Maximum (non-negative real number) Description: Number; Maximum value in this aggregation

• Monitoring programme/network/cruise (text) Description: Name of the monitoring programme, network or cruise that provided this data record
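The "Largest gap" attribute above can be derived from the individual sampling periods that make up one aggregated year. The following sketch shows one possible calculation. It assumes that a month is counted as 30 days and that periods are given as (start, end) date pairs; neither convention is specified in this document, and the function name is hypothetical.

```python
from datetime import date

DAYS_PER_MONTH = 30  # assumed conversion from days to months


def largest_gap_months(periods):
    """Largest gap between the end of one sample and the start of the next."""
    ordered = sorted(periods)
    gaps = [
        (next_start - prev_end).days
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]
    # With fewer than two samples, or only overlapping ones, the gap is zero.
    longest = max(gaps, default=0)
    return max(longest, 0) / DAYS_PER_MONTH
```

A provider using calendar months or another rounding rule would replace the final division accordingly.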

Measurement

• Chemical – group (code list) Description: Persistent organic pollutants (POPs) included in the Annexes of the Stockholm Convention as defined in the document UNEP/POPS/COP.6/INF/31, chapter 2.1, p. 16. Please note that indicator and coplanar PCBs are separated.

o code list – see “Chemical – group” code list

• Parameter (code list) Description: Parent POPs, isomers and transformation products of POPs listed in the Stockholm Convention, and summations defined in the document UNEP/POPS/COP.6/INF/31, chapter 2.2, p. 19–21. The parameters are directly linked with units. Please note that each parameter should be reported in pg or fg (for dioxins and furans) per litre.

o code list – see “Parameter” code list

• Method (code list) Description: Analytical method used for determination of the concentration

o GC-ECD o GC-ECNI-MS o GC-HRMS o GC-MS o HPLC o HPLC-MS-MS

• LOQ (non-negative real number) Description: Number representing Limit of quantification value

• No. of values (positive integer) Description: Number of values aggregated

• No. under LoQ (positive integer) Description: Number of values in this aggregation that were smaller than the LoQ value

• Value (mean) (non-negative real number) Description: Number; Mean of aggregated values

• Value (median) (non-negative real number) Description: Number; Median of aggregated values

• Minimum (non-negative real number) Description: Number; Minimum value in this aggregation

• Maximum (non-negative real number) Description: Number; Maximum value in this aggregation

• 5th percentile (non-negative real number) Description: Number; Value on the 5% position of the aggregated data set (sorted from the lowest to highest concentration)

• 95th percentile (non-negative real number) Description: Number; Value on the 95% position of the aggregated data set (sorted from the lowest to highest concentration)

• SD (non-negative real number) Description: Number; Standard deviation of aggregated values

• Laboratory (text) Description: Name of the laboratory performing analysis of this data record
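The aggregated fields listed above can be derived from a set of raw concentrations. The following is a minimal sketch in Python, not the GMP DWH implementation: the function and key names are hypothetical, the percentile uses linear interpolation between sorted values (the guide does not state which percentile method is used), and the sample standard deviation is assumed.

```python
import math
import statistics


def percentile(sorted_values, fraction):
    """Percentile by linear interpolation on an already sorted list (assumed method)."""
    position = fraction * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def aggregate(values, loq):
    """Summarise raw concentrations into the aggregated fields of one record."""
    if not values:
        raise ValueError("at least one value is required")
    if loq < 0 or any(v < 0 for v in values):
        raise ValueError("LOQ and concentrations must be non-negative")
    ordered = sorted(values)
    return {
        "loq": loq,
        "n_values": len(ordered),
        # Values strictly smaller than the LOQ, as in the field description.
        "n_under_loq": sum(1 for v in ordered if v < loq),
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "minimum": ordered[0],
        "maximum": ordered[-1],
        "p05": percentile(ordered, 0.05),
        "p95": percentile(ordered, 0.95),
        # Sample standard deviation; reported as 0.0 for a single value (assumption).
        "sd": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
    }


summary = aggregate([1.0, 2.0, 3.0, 4.0, 5.0], loq=1.5)
print(summary)
```

A different percentile definition (for example nearest rank) would give slightly different 5th and 95th percentile values for small data sets, so the method should be agreed with the data provider before comparing aggregations.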


“Country” Code List

Afghanistan
Albania
Algeria
American Samoa
Andorra
Angola
Anguilla
Antarctica
Antigua and Barbuda
Argentina
Armenia
Aruba
Australia
Austria
Azerbaijan
Bahamas
Bahrain
Bangladesh
Barbados
Belarus
Belgium
Belize
Benin
Bermuda
Bhutan
Bolivia
Bosnia and Herzegovina
Botswana
Bouvet Island
Brazil
British Indian Ocean Territory
Brunei Darussalam
Bulgaria
Burkina Faso
Burundi
Cambodia
Cameroon
Canada
Cape Verde
Cayman Islands
Central African Republic
Ceuta
Cocos Islands (or Keeling Islands)
Colombia
Comoros
Congo
Congo, Democratic Republic of
Cook Islands
Costa Rica
Côte d'Ivoire
Croatia
Cuba
Cyprus
Czech Republic
Denmark
Djibouti
Dominica
Dominican Republic
Ecuador
Egypt
El Salvador
Equatorial Guinea
Eritrea
Estonia
Ethiopia
Falkland Islands
Faroe Islands
Fiji
Finland
Former Yugoslav Republic of Macedonia
France
French Polynesia
French Southern Territories
Gabon
Gambia
Georgia
Germany
Ghana
Gibraltar
Greece
Greenland
Grenada
Guam
Guatemala
Guinea
Guinea-Bissau
Guyana
Haiti
Heard Island and McDonald Islands
Holy See
Honduras
Hong Kong
Hungary
Chad
Chile
China, People's Republic of
Christmas Island
Iceland
India
Indonesia
Iran, Islamic Republic of
Iraq
Ireland
Israel
Italy
Jamaica
Japan
Jordan
Kazakhstan
Kenya
Kiribati
Korea, Democratic People's Republic of
Korea, Republic of
Kosovo
Kuwait
Kyrgyzstan
Lao People's Democratic Republic
Latvia
Lebanon
Lesotho
Liberia
Libyan Arab Jamahiriya
Liechtenstein
Lithuania
Luxembourg
Macao
Madagascar
Malawi
Malaysia
Maldives
Mali
Malta
Marshall Islands
Mauritania
Mauritius
Mayotte
Melilla
Mexico
Micronesia, Federated States of
Moldova, Republic of
Mongolia
Montenegro
Montserrat
Morocco
Mozambique
Myanmar
Namibia
Nauru
Nepal
Netherlands
Netherlands Antilles
New Caledonia
New Zealand
Nicaragua
Niger
Nigeria
Niue
Norfolk Island
Northern Mariana Islands
Norway
Occupied Palestinian Territory
Oman
Pakistan
Palau
Panama
Papua New Guinea
Paraguay
Peru
Philippines
Pitcairn
Poland
Portugal
Qatar
Romania
Russian Federation
Rwanda
Saint Helena
Saint Kitts and Nevis
Saint Lucia
Saint Pierre and Miquelon
Samoa
San Marino
Sao Tome and Principe
Saudi Arabia
Senegal
Serbia
Seychelles
Sierra Leone
Singapore
Slovakia
Slovenia
Solomon Islands
Somalia
South Africa
South Georgia and South Sandwich Islands
Spain
Sri Lanka
St Vincent and the Grenadines
Sudan
Suriname
Swaziland
Sweden
Switzerland
Syrian Arab Republic
Taiwan
Tajikistan
Tanzania, United Republic of
Thailand
Timor-Leste
Togo
Tokelau
Tonga
Trinidad and Tobago
Tunisia
Turkey
Turkmenistan
Turks and Caicos Islands
Tuvalu
Uganda
Ukraine
United Arab Emirates
United Kingdom
United States
United States Minor Outlying Islands
Uruguay
Uzbekistan
Vanuatu
Venezuela
Viet-Nam
Virgin Islands (US)
Virgin Islands, British
Wallis and Futuna
Yemen
Zambia
Zimbabwe


“Ocean or sea” Code List

Atlantic ocean
Arctic ocean
Indian ocean
Pacific ocean
Southern ocean
Adriatic Sea
Aegean Sea
Alboran Sea
Amundsen Gulf
Amundsen Sea
Andaman Sea
Arabian Sea
Arafura Sea
Aral Sea
Archipelago Sea
Argentine Sea
Baffin Bay
Balearic Sea
Baltic Sea
Banda Sea
Barents Sea
Bass Strait
Bay of Bengal
Bay of Biscay
Bay of Campeche
Bay of Fundy
Beaufort Sea
Bellingshausen Sea
Bering Sea
Bismarck Sea
Black Sea
Bohai Sea
Bohol / Mindanao Sea
Bothnian Sea
Camotes Sea
Caribbean Sea
Caspian Sea
Catalan Sea
Celebes Sea
Celtic Sea
Central Baltic Sea
Ceram Sea
Chesapeake Bay
Chilean Sea
Chukchi Sea
Cilician Sea
Cooperation Sea
Coral Sea
Cosmonauts Sea
Davis Sea
Davis Strait
Dead Sea
Denmark Strait
Drake Passage
D'Urville Sea
East China Sea
East Siberian Sea
English Channel
Flores Sea
Great Australian Bight
Greenland Sea
Gulf of Aden
Gulf of Alaska
Gulf of Bothnia
Gulf of California / Sea of Cortéz
Gulf of Carpentaria
Gulf of Finland
Gulf of Guinea
Gulf of Maine
Gulf of Mexico
Gulf of Oman
Gulf of Riga
Gulf of Sidra
Gulf of St. Lawrence
Gulf of Thailand
Gulf of Venezuela
Gulf St Vincent
Halmahera Sea
Hudson Bay
Ionian Sea
Irish Sea
James Bay
Java Sea
Kara Sea
Kara Strait
King Haakon VII Sea
Koro Sea
Labrador Sea
Laccadive Sea
Laptev Sea
Lazarev Sea
Levantine Sea
Libyan Sea
Ligurian Sea
Lincoln Sea
Mar de Grau
Marmara Sea
Mawson Sea
Mediterranean Sea
Mirtoon Sea
Molucca Sea
Mozambique Channel
North Sea
Norwegian Sea
Oresund Strait
Pechora Sea
Persian Gulf
Philippine Sea
Prince Gustav Adolf Sea
Red Sea
Riiser-Larsen Sea
Ross Sea
Salish Sea
Salton Sea
Sargasso Sea
Savu Sea
Scotia Sea
Sea of Åland
Sea of Azov
Sea of Chiloé
Sea of Crete
Sea of Japan
Sea of Okhotsk
Sea of Sardinia
Sea of Sicily
Sea of the Hebrides
Seto Inland Sea
Sibuyan Sea
Solomon Sea
Somov Sea
South China Sea
Spencer Gulf
Sulu Sea
Tasman Sea
Thracian Sea
Timor Sea
Tyrrhenian Sea
Visayan Sea
Wadden Sea
Wandel Sea
Weddell Sea
White Sea
Yellow Sea


“Chemical – group” Code List

Aldrin
Alpha-hexachlorocyclohexane (α-HCH)
Beta-hexachlorocyclohexane (β-HCH)
Chlordane
Chlordecone
Dichlorodiphenyltrichloroethane (DDT)
Dieldrin
Endosulfan
Endrin
Gamma-hexachlorocyclohexane (γ-HCH)
Heptachlor
Hexabromobiphenyl (HBB)
Hexabromocyclododecane (HBCD)
Hexabromodiphenyl ether and heptabromodiphenyl ether (c-octa PBDE)
Hexachlorobenzene (HCB)
Mirex
Pentachlorobenzene (PeCBz)
Perfluorooctane sulfonic acid (PFOS)
Polychlorinated biphenyls (PCB) – indicator
Polychlorinated biphenyls (PCB) – coplanar
Polychlorinated dibenzodioxins (PCDD)
Polychlorinated dibenzofurans (PCDF)
Tetrabromodiphenyl ether and pentabromodiphenyl ether (c-penta PBDE)
Toxaphene


“Parameter” Code List

Aldrin (pg/l)
Alpha-HCH (pg/l)
Beta-HCH (pg/l)
cis-Chlordane (= alpha) (pg/l)
trans-Chlordane (= gamma) (pg/l)
Oxychlordane (pg/l)
cis-Nonachlor (pg/l)
trans-Nonachlor (pg/l)
Chlordecone (pg/l)
o,p-DDT (pg/l)
o,p-DDD (pg/l)
o,p-DDE (pg/l)
p,p-DDT (pg/l)
p,p-DDD (pg/l)
p,p-DDE (pg/l)
Sum p,p-DDTs (pg/l)
Sum 6 DDTs (pg/l)
Dieldrin (pg/l)
Endosulfan I (alpha) (pg/l)
Endosulfan II (beta) (pg/l)
Endosulfan SO4 (pg/l)
Endrin (pg/l)
Gamma-HCH (pg/l)
Heptachlor (pg/l)
Sum 2 heptachlorepoxides (pg/l)
cis-Heptachlorepoxide (= exo, B) (pg/l)
trans-Heptachlorepoxide (= endo, A) (pg/l)
PBB 153 (pg/l)
Alpha-HBCD (pg/l)
Beta-HBCD (pg/l)
Gamma-HBCD (pg/l)
BDE 153 (pg/l)
BDE 154 (pg/l)
BDE 175/183 (pg/l)
HCB (pg/l)
Mirex (pg/l)
PeCB (pg/l)
PFOS (pg/l)
PFOSA (pg/l)
NMeFOSA (pg/l)
NEtFOSA (pg/l)
NMeFOSE (pg/l)
NEtFOSE (pg/l)
PCB 28 (pg/l)
PCB 52 (pg/l)
PCB 101 (pg/l)
PCB 138 (pg/l)
PCB 153 (pg/l)
PCB 180 (pg/l)
PCB 77 (pg/l)
PCB 81 (pg/l)
PCB 105 (pg/l)
PCB 114 (pg/l)
PCB 118 (pg/l)
PCB 123 (pg/l)
PCB 126 (pg/l)
PCB 156 (pg/l)
PCB 157 (pg/l)
PCB 167 (pg/l)
PCB 169 (pg/l)
PCB 189 (pg/l)
Sum 6 PCBs (pg/l)
Sum 7 PCBs (pg/l)
PCB WHO98-TEQ (pg/l)
PCB WHO2005-TEQ (pg/l)
1,2,3,4,6,7,8-HpCDD (fg/l)
1,2,3,4,7,8-HxCDD (fg/l)
1,2,3,6,7,8-HxCDD (fg/l)
1,2,3,7,8,9-HxCDD (fg/l)
1,2,3,7,8-PeCDD (fg/l)
2,3,7,8-TCDD (fg/l)
OCDD (fg/l)
1,2,3,4,6,7,8-HpCDF (fg/l)
1,2,3,4,7,8,9-HpCDF (fg/l)
1,2,3,4,7,8-HxCDF (fg/l)
1,2,3,6,7,8-HxCDF (fg/l)
1,2,3,7,8,9-HxCDF (fg/l)
1,2,3,7,8-PeCDF (fg/l)
2,3,4,6,7,8-HxCDF (fg/l)
2,3,4,7,8-PeCDF (fg/l)
2,3,7,8-TCDF (fg/l)
OCDF (fg/l)
Sum 7 PCDDs (fg/l)
Sum 10 PCDFs (fg/l)
Sum 17 PCDDs/Fs (fg/l)
PCDDs WHO98-TEQ (fg/l)
PCDFs WHO98-TEQ (fg/l)
PCDDs/Fs WHO98-TEQ (fg/l)
PCDDs WHO2005-TEQ (fg/l)
PCDFs WHO2005-TEQ (fg/l)
PCDDs/Fs WHO2005-TEQ (fg/l)
BDE 17 (pg/l)
BDE 28 (pg/l)
BDE 47 (pg/l)
BDE 99 (pg/l)
BDE 100 (pg/l)
Parlar 26 (pg/l)
Parlar 50 (pg/l)
Parlar 40/41 (pg/l)
Parlar 44 (pg/l)
Parlar 62 (pg/l)
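Each entry of the “Parameter” code list ends with its unit in parentheses, either pg/l or fg/l. As an illustration only, the sketch below splits such a label into the parameter name and unit and converts a value to pg/l (1 pg = 1000 fg). The function names are hypothetical and are not part of the GMP DWH.

```python
import re

# A label is "<name> (<unit>)", where the unit is pg/l or fg/l.
_LABEL = re.compile(r"(?P<name>.+) \((?P<unit>pg/l|fg/l)\)")

# Conversion factors to pg/l; 1 pg = 1000 fg.
_TO_PG_PER_L = {"pg/l": 1.0, "fg/l": 0.001}


def split_parameter(label):
    """Split a code-list label into (parameter name, unit)."""
    match = _LABEL.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"unrecognised parameter label: {label!r}")
    return match.group("name"), match.group("unit")


def to_pg_per_litre(value, unit):
    """Express a concentration in pg/l regardless of the reporting unit."""
    return value * _TO_PG_PER_L[unit]


name, unit = split_parameter("2,3,7,8-TCDD (fg/l)")
print(name, unit, to_pg_per_litre(2500.0, unit))
```

The greedy name group keeps inner parentheses such as “(= alpha)” inside the parameter name, so only the final parenthesised unit is split off.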


GMP Data Warehouse

USER GUIDE

Created by:

Research Centre for Toxic Compounds in the Environment, Faculty of Science, Masaryk University, Brno, Czech Republic

Institute of Biostatistics and Analyses, Faculty of Medicine and Faculty of Science, Masaryk University, Brno, Czech Republic


GMP DWH: User Guide

Online Data Collection – General Information

Online data collection is based on the TRIALDB system developed at Yale University, Connecticut, USA, which is widely used for this purpose.

The system is user-friendly; all data can be inserted directly into the database using the web forms.

Data can be inserted from any computer with internet access and a web browser that supports communication secured with 128-bit encryption.

It is not necessary to install any additional software.

The proposed database (GMP DWH) can only be accessed by authorized users, each using their user name and unique password. All authorized users have full control over their access passwords, rights and recorded data.

The GMP DWH database design meets all requirements for data protection and safety policy set by the existing legislation and ISO 27001 standards. The authorized data providers keep all ownership rights to the stored data sets. The data providers, such as participating institutions, the Regional Organisation Groups, Global Coordination Group members and other stakeholders or participating legal entities, can access their data at any time.

Any data transfer is encrypted and the on-line collection system is designed to prevent any unauthorized usage during the data transfer. The submitted data forms are also available in a printable format.

Data will be stored in an Oracle 11g database on the central server at Masaryk University in Brno, Czech Republic. The system supports local copies or local safety back-ups required by the users.

The Institute of Biostatistics and Analyses, Masaryk University, Brno, Czech Republic (IBA MU, www.iba.muni.cz) acts as the technical provider and also supports the central data management.


Contents

Online Data Collection – General Information

1 Database Connection

2 Basic Application Window

3 Adding New Site

4 Site Search

5 Working Window of the Application

6 Adding New Entry

7 Entry Completion

8 Work with Submitted Entries

9 Track changes

10 Data validation

11 Technical Support


1 Database Connection

The GMP DWH database can be accessed from a web page www.pops-gmp.org/dwh after selecting one of the matrices (see Figure 1).

Figure 1. How to enter the database

Comment 1:

To access the database, use an Internet browser supporting JavaScript and secured communication (https protocol). Internet Explorer version 5.5 or higher or Mozilla Firefox version 2.0 or higher are examples of such browsers. In this user guide, the web browser Mozilla Firefox version 8.0 is used to demonstrate the work with the GMP DWH database.

After entering the correct login information and clicking the “Login” button, the user is logged into the system. The work then starts by selecting the environmental matrix into which the user wishes to insert data (see the red mark in Figure 2).

Login: enter your user name

Password: enter your password

Click on „Login“


Figure 2. Matrix selection


2 Basic Application Window

After a successful login to the database, the initial application window is shown on the screen (see Figure 3). This window allows you to search existing sites and their forms (see Chapter 4, Site Search), to add a new site (see Chapter 3, Adding New Site), to use supporting tools, such as changing the user's password (see A in Figure 3), and to display Help (see B in Figure 3). You can log out of the system by using the „Log Out“ button in the upper right corner of the screen (encircled in red in Figure 3).

Figure 3. Initial page of the system

Comment 2:

Automatic logout occurs after a certain period of inactivity. This function prevents unauthorized access to the system. To continue previous work, it is necessary to log in again.

The time remaining until the automatic logout is displayed in the middle of the top row on the screen (see blue circle in Figure 3). Any activity in the GMP DWH database restarts the countdown.
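The countdown behaviour described in Comment 2 can be pictured with a small sketch. This is an illustration under stated assumptions, not the application's code: the class name is hypothetical, and the timeout length is a parameter because the guide does not state the actual duration.

```python
import time


class InactivityTimer:
    """Tracks the time left until automatic logout; any activity restarts it."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so the behaviour can be tested
        self._last_activity = clock()

    def touch(self):
        """Record user activity, which restarts the countdown."""
        self._last_activity = self._clock()

    def remaining(self):
        """Seconds left before automatic logout, never below zero."""
        elapsed = self._clock() - self._last_activity
        return max(0.0, self._timeout - elapsed)

    def expired(self):
        """True once the inactivity period has fully elapsed."""
        return self.remaining() == 0.0


# Example with an assumed 10-minute timeout.
timer = InactivityTimer(600)
print(timer.remaining() <= 600)
```

Using a monotonic clock avoids surprises if the system time is adjusted while a session is open.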

All data inserted into the GMP DWH database are linked to a particular site (air) or country (human milk and blood). Therefore, any sampling site recorded previously can be selected from the list of available sites (see Chapter 4, Site Search) or a new site can be added (see Chapter 3, Adding New Site) to enter data from a new location.


3 Adding New Site

Clicking the „Add New Site“ link (marked in red in Figure 4) on the initial page of the system displays a form for submitting information about a new site. Please insert all required information into the form; the system will subsequently generate a unique ID for this new site.

Identification of the new site consists of the following:

1. „Site name“

2. „User“

3. „Region“

4. „Country“

Please select the „User“ mode from the list (see green parenthesis in Figure 4) when creating a new site. However, should you just wish to try working with the database, please tick the „Training site“ box (see brown rectangle in the upper part of Figure 4) as well.

Clicking the „Save“ button (encircled in blue in Figure 4) generates a unique “Site ID” (marked in purple in the lower part of Figure 4).

The Site ID will be generated in the following form, for example GMP-00-00001:

GMP – Global Monitoring Plan abbreviation (project identifier)

00 – Site identification

00001 – Site code in the database (generated automatically from 00001 to 99999)

Example of a new training site is shown in Figure 4.
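The Site ID layout described above (project identifier, two-digit site identification, five-digit site code) can be expressed as a short sketch. The helper names below are hypothetical; only the GMP-00-00001 pattern and the 00001 to 99999 code range come from this guide.

```python
import re

# GMP-<two-digit site identification>-<five-digit site code>
_SITE_ID = re.compile(r"GMP-(?P<ident>\d{2})-(?P<code>\d{5})")


def format_site_id(ident, code):
    """Build a Site ID string from its numeric parts."""
    if not 0 <= ident <= 99:
        raise ValueError("site identification must fit in two digits")
    if not 1 <= code <= 99999:
        raise ValueError("site code must be between 00001 and 99999")
    return f"GMP-{ident:02d}-{code:05d}"


def parse_site_id(text):
    """Return (site identification, site code) or raise ValueError."""
    match = _SITE_ID.fullmatch(text)
    if match is None or int(match.group("code")) == 0:
        raise ValueError(f"not a valid Site ID: {text!r}")
    return int(match.group("ident")), int(match.group("code"))


print(format_site_id(0, 1))
print(parse_site_id("GMP-00-00001"))
```

A check of this kind can be useful when Site IDs are typed by hand into the search field, since a malformed ID is rejected before any search is attempted.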


Figure 4. Adding new site

Comment 3:

Should you wish to work with the GMP DWH database (e.g., to amend a form of an existing site), click on the Save button. This transfers you to the main part of the system (see Chapter 5, Working Window of the Application) that stores all forms. The main part can also be accessed by clicking the „Data forms“ button in the header menu.

01 – Region A . . . . . 99 – Region Z

User AA . . . . . User ZZ


4 Site Search

There are two ways to find previously registered sites (see „Search Site“, encircled in red in Figure 7):

A) If you know the Site ID, please insert this ID into the relevant field and click the „Search“ button (see A in Figure 7). The record of this site will be displayed.

B) Search for the site according to other known parameters (such as UN region, country, type of the site, etc.). Please fill in the fields and click the „Search“ button (see B in Figure 7). Records of all sites registered in a particular country that match your search will be displayed. If too many records are displayed, we recommend adding as much additional known information about the site as possible, such as „Site name“ or „Region“, and clicking the „Search“ button again. The number of records found will be notably reduced.

Clicking only the „Search“ button lists all submitted site records to which you have access and which correspond to the submitted criteria. A maximum of 50 site records is shown on one page. To move to the next page of records found, click on the range of numbers highlighted in orange (see red rectangle in Figure 5).
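The paging of search results into ranges of at most 50 records can be illustrated as follows. This is a sketch only: the function name is hypothetical, and the way the application labels its page ranges is assumed from the description above.

```python
PAGE_SIZE = 50  # maximum number of site records per page, as stated above


def page_ranges(total_records, page_size=PAGE_SIZE):
    """Return the (first, last) 1-based record numbers shown on each page."""
    return [
        (start, min(start + page_size - 1, total_records))
        for start in range(1, total_records + 1, page_size)
    ]


print(page_ranges(120))
```

For 120 matching records the sketch yields three ranges, the last of which is shorter than a full page.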

Figure 5. Moving to the next page of the records found

After you have found the desired site, click „Open“. This command brings you to the electronic form for that site.

Comment 4:

For quick access to recently opened site records, there is a table in the right part of the application screen (see Figure 6). The site records are displayed chronologically (the most recent at the top).


Figure 6. History of opened site records

Figure 7. Site search


Comment 5:

Search results (see Figure 8) can be sorted in ascending or descending order by various criteria: Site ID (see A – descending order according to Site ID), Region, Site name, Country, Date of Entry, and Entered by, i.e. the name of the person who submitted the site record (see B – ascending order according to the criterion Entered by).
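Sorting by one of the listed criteria in either direction can be sketched as below. The record layout (a dictionary per site) and the key names are assumptions made for the example and do not reflect the internal data model of the GMP DWH.

```python
from operator import itemgetter

# Criteria named in Comment 5, written here as hypothetical dictionary keys.
SORTABLE = {"site_id", "region", "site_name", "country", "date_of_entry", "entered_by"}


def sort_results(records, criterion, descending=False):
    """Return a new list of records ordered by the chosen criterion."""
    if criterion not in SORTABLE:
        raise ValueError(f"cannot sort by {criterion!r}")
    return sorted(records, key=itemgetter(criterion), reverse=descending)


records = [
    {"site_id": "GMP-00-00002", "entered_by": "Novak"},
    {"site_id": "GMP-00-00003", "entered_by": "Adams"},
    {"site_id": "GMP-00-00001", "entered_by": "Meyer"},
]
by_id_desc = sort_results(records, "site_id", descending=True)
by_person_asc = sort_results(records, "entered_by")
print([r["site_id"] for r in by_id_desc])
print([r["entered_by"] for r in by_person_asc])
```

The two calls mirror the examples marked A and B in Figure 8: descending order by Site ID and ascending order by Entered by.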

Figure 8. Search results and examples of site ordering


5 Working Window of the Application

The working window of the application is divided into two interconnected sections:

a) Site

b) Folders and Entries

The Site section contains basic information about the site. The information displayed in the form is copied automatically from the form submitted upon creation of the site (see Chapter 3, Adding New Site). It is also possible to edit the site information in this section by clicking „Site description“ (marked in red in Figure 9).

Figure 9. Site information update

All site forms are stored in the Folders and Entries section (purple parenthesis in Figure 9). These forms are organized in “Folders” (e.g. “All primary data”, “2012” and “2011”). The content of an individual folder can be displayed by clicking on the folder; the list of individual entries (forms) will be shown (see Figure 10).


Figure 10. Folder content

There are two options when working with Entries:

a) Create a new entry (see Chapter 6, Adding New Entry)

b) Edit an existing entry (see Chapter 8, Work with Submitted Entries)


6 Adding New Entry

After you have found an existing site or created a new one and have opened the respective Folder (see Chapter 5, Working Window of the Application), you can create a new form for data insertion by clicking on „Add new entry“ (encircled in red in Figure 11).

Figure 11. Adding new entry


7 Entry Completion

Each entry form page is divided into three sections (see example in Figure 12):

1) header – contains basic information about the selected site

2) main section – contains data forms

3) footer – stores information on the completeness and validity of the entry

The Header contains basic information, such as Site ID, Country, Region, Entered by and Date of entry.

The Main section contains a form to insert values. It is divided into two subsections, Sampling attributes and Measurement, marked in bold (encircled in blue in Figure 12). You can move within the form by using the arrows of the scrollbar on the right. The form contains white and grey fields to insert the data.

The Footer of the form contains a dropdown menu to specify the status of the form: Pending, Completed and Incomplete – objective reasons. The selected value provides information about the status of the user's work on the form. Data inserted into the form can be saved by clicking the „Save“ button (encircled in blue in Fig. 12).

• In case you did not manage to complete the whole form for any reason, select the „Pending“ option (see green parenthesis in Fig. 12). You can continue completing the form later.

• Upon completion of all fields of the form, select the „Completed“ option.

• In case it is not possible to obtain all information necessary for completion of the form, select the option „Incomplete – objective reasons“.
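The three form statuses and the guidance on when to choose each can be summarised in a short sketch. The enumeration values are the labels from the dropdown menu; the helper function and its two boolean parameters are hypothetical and only restate the guidance above.

```python
from enum import Enum


class EntryStatus(Enum):
    PENDING = "Pending"
    COMPLETED = "Completed"
    INCOMPLETE_OBJECTIVE_REASONS = "Incomplete – objective reasons"


def suggest_status(all_fields_filled, missing_data_obtainable):
    """Suggest the status to select in the form footer."""
    if all_fields_filled:
        return EntryStatus.COMPLETED
    if missing_data_obtainable:
        # Work can continue later, so the form stays pending.
        return EntryStatus.PENDING
    # The remaining information cannot be obtained at all.
    return EntryStatus.INCOMPLETE_OBJECTIVE_REASONS


print(suggest_status(all_fields_filled=False, missing_data_obtainable=True).value)
```

In the application itself the status is always chosen by the user; the function only makes the decision rule explicit.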


Figure 12. Form completion

Comment 6:

The fields marked with a red asterisk are obligatory and must be filled in before the form is saved (see Figure 12, main section). If you try to leave the form with some obligatory fields incomplete, an error warning appears and returns you to the form.
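The check on obligatory fields can be pictured as follows. This is a minimal sketch: the field names are hypothetical examples, and the guide does not list which fields actually carry the red asterisk.

```python
def missing_obligatory(form, obligatory):
    """Return the obligatory field names that are absent or left empty."""
    return [name for name in obligatory if form.get(name) in (None, "")]


# Hypothetical field names, used for illustration only.
OBLIGATORY = ["parameter", "method", "value_mean"]

form = {"parameter": "PCB 28 (pg/l)", "method": "", "value_mean": 0.0}
problems = missing_obligatory(form, OBLIGATORY)
if problems:
    print("Please complete:", ", ".join(problems))
```

Note that a numeric value of 0.0 counts as filled in; only missing or empty entries are reported.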


Data insertion is the same in all forms. All forms contain entry fields shaded white or grey. In general, white fields are either filled in directly (see A in Figure 13) or completed by selecting an option from the dropdown box (see B in Figure 13). Grey fields become active only if a relevant option in the preceding fields has been selected (see C and D in Figure 13).
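The dependency between grey fields and the options selected in preceding fields can be modelled as a simple rule table. The sketch below is an assumption about how such behaviour might be represented; the field names and option values are hypothetical and do not come from the GMP DWH forms.

```python
def is_active(field, form, rules):
    """Return True if the field can currently be edited.

    `rules` maps a grey field to (controlling field, set of enabling options).
    Fields without a rule are treated as white fields, which are always active.
    """
    rule = rules.get(field)
    if rule is None:
        return True
    controlling_field, enabling_options = rule
    return form.get(controlling_field) in enabling_options


# Hypothetical rule: a detail field opens only after a matching selection.
RULES = {"sampler_detail": ("sampler_type", {"Passive", "Active"})}

print(is_active("sampler_detail", {"sampler_type": "Passive"}, RULES))
print(is_active("sampler_detail", {}, RULES))
```

Keeping the rules in a table rather than in code makes it easy to see which selection activates which grey field.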

Figure 13. Field types


Multiple values for a particular site, year, and parameter can be inserted into one form using „Add new entry“ and „Delete“ (encircled in red and green, respectively, in Figure 14). Click the „Add new entry“ button to add a new line below the existing record.

To erase a line, click into any field of that line and then select „Delete“ in the Action setting. Please note that the line will be irreversibly deleted after you confirm the warning alert (see the bottom part of Figure 14).

Figure 14. Inserting multiple values


8 Work with Submitted Entries

Previously submitted forms can be edited by using „Open“ (see A in Figure 15), printed by using „Printable“ (see B in Figure 15) or deleted by using „Delete“ (see C in Figure 15).

CAUTION: Deletion of a form and all its records is irreversible! We recommend using this function only after careful consideration. Deletion of a whole site, including its ID, from the GMP DWH database can be performed exclusively by the HelpDesk upon the user's request. Such a request has to be sent to the HelpDesk by email. The site to be deleted must not contain any entries (all previously submitted forms must first be deleted by using the “Delete” option in the Action menu).

Figure 15. Work with the form


9 Track changes

Track changes is a function to search for all changes made in one or more forms. Select the site first, then move to the „Tools“ menu (top right corner) and select „Track changes“ there (encircled in red in Figure 16). If you want to track all changes made in all forms of a particular site, leave both fields empty and press the „Search“ button (marked in green in Figure 16). Any changes made in the forms will then be listed in the Track changes window (see A in Figure 16). If you want to reduce the list of tracked changes to a specific form or action, fill in the fields „Available forms“ and „Available questions“ (see B in Figure 16). All relevant changes will appear in the table (see C in Figure 16).
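The filtering logic of Track changes, where empty fields mean "all changes" and filled fields narrow the list, can be sketched as follows. The record structure and names are hypothetical; only the two filter fields (forms and questions) are taken from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    form: str
    question: str
    old_value: str
    new_value: str


def filter_changes(changes, form=None, question=None):
    """Return the changes matching the filters; a filter left as None matches everything."""
    return [
        change
        for change in changes
        if (form is None or change.form == form)
        and (question is None or change.question == question)
    ]


log = [
    Change("2011", "LOQ", "0.5", "0.4"),
    Change("2012", "LOQ", "0.4", "0.3"),
    Change("2012", "Method", "GC-MS", "GC-HRMS"),
]
print(len(filter_changes(log)))
print(filter_changes(log, form="2012", question="Method"))
```

Calling the function without filters corresponds to leaving both fields empty and pressing „Search“.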

Figure 16. Track changes


10 Data validation

To search for valid or invalid data, go to the „Search Site“ form (see Chapter 4, Site Search) and select the Valid or Not Valid option in the Validation result dropdown menu. Then click on „Search“ (marked in red in Figure 17) or „Not valid“ (marked in blue in Figure 17). The GMP DWH database will show all identified problems and data gaps of the selected site in detail. Should you wish to correct the selected data immediately, click View data in the Action menu (marked in green in the bottom part of Figure 17) and the system will show the form to add or correct data.

Figure 17. Searching the invalid sites


The “Data validation” function is also available in the main menu under the item “Selected site”. This function allows validating all entries under the selected site directly without going through the “Search Site” form.

The “Validate your region” function validates all sites in your region (marked in red in Figure 18).

If there is a site with a “Not known” attribute, it can be validated simply by clicking on the “Validate” button. All sites are automatically validated once a day, at midnight.

Figure 18. Data validation


11 Technical Support


Institute of Biostatistics and Analyses

Faculty of Medicine and the Faculty of Science

Masaryk University, Brno

Kamenice 126/3, 625 00 Brno

http://www.iba.muni.cz

In case of technical problems or questions please contact:

Dr. Jakub Gregor, Ph.D.
E-mail: [email protected]
Phone: (+420) 549 49 5164