EMODnet Thematic Lot n° 4 - Chemistry

EMODnet Phase IV
Updated guidelines for SeaDataNet ODV production

M. Lipizer, M. Vinci, A. Giorgetti, L. Buga, M. Fichaut, J. Gatti, S. Iona, M. Larsen, R. Schlitzer, D. Schaap, M. Wenzer, M.E. Molina Jack

Date: 15/06/2020



Index

History
Introduction
SeaDataNet ODV import format
    How to check your SeaDataNet ODV file format?
    Vocabularies
    How to choose the correct P01?
Additional recommendations for dataset preparation
    General indications
        Flagging of Data Below Detection Limits and Data Below Limit of Quantification
    Eutrophication datasets
    Contaminants datasets
        Contaminants in the Sediment
        Contaminants in Biota
Other general issues and inventory of common errors


History

Authors | Date | Comments
M. Lipizer, M. Vinci, A. Giorgetti, L. Buga, M. Fichaut, J. Gatti, S. Iona, M. Larsen, R. Schlitzer, D. Schaap, M. Wenzer | 27/06/2017 | Created
M. Lipizer, A. Giorgetti | 2018 | Updated: LOD/LOQ clarifications
M. E. Molina Jack, M. Lipizer, A. Giorgetti | 12/04/2018 | Updated: Inclusion of SDN tools for vocabularies
M. E. Molina Jack, M. Lipizer, A. Giorgetti, M. Fichaut, R. Schlitzer | 15/06/2020 | Updated: New recommendations for dataset preparation

Acknowledgements: We acknowledge the fundamental contribution of the British Oceanographic Data Centre (BODC) to the design, development and continuous management and update of the Vocabularies, and its provision of the web services.

How to cite this document: M. Lipizer, M. Vinci, A. Giorgetti, L. Buga, M. Fichaut, J. Gatti, S. Iona, M. Larsen, R. Schlitzer, D. Schaap, M. Wenzer, M. E. Molina Jack, 2020, EMODnet Phase IV - Updated guidelines for SeaDataNet ODV production, 15/06/2020, 25 pp., DOI: 10.6092/259c43eb-4ba4-419b-bb38-df00e189bd35


Introduction

EMODnet Chemistry uses the SeaDataNet infrastructure (https://www.seadatanet.org/) for its technical set-up. In particular, it adopts:

A set of standards for metadata description and data formats:

• SeaDataNet (SDN) Standards for metadata (https://www.seadatanet.org/Standards/Metadata-formats)
• Common Vocabularies (i.e. standardised terms that cover a broad spectrum of disciplines) to allow consistency and interoperability (https://www.seadatanet.org/Standards/Common-Vocabularies)
• Common Data Index mechanism (CDI) to access data in line with the data policy (https://www.seadatanet.org/Standards/Metadata-formats/CDI)
• Ocean Data View format (ODV) for data exchange (https://www.seadatanet.org/Standards/Data-Transport-Formats)
• Data Quality Control procedures (https://www.seadatanet.org/Standards/Data-Quality-Control)

A set of software tools specifically developed for metadata and data formatting, data exchange and visualisation:

• MIKADO to prepare XML metadata files
• NEMO to convert any type of ASCII format to the SeaDataNet ODV and MedAtlas ASCII formats as well as the SeaDataNet NetCDF (CF) format
• OCTOPUS to convert files from one SeaDataNet format to another (e.g. ODV to NetCDF, MedAtlas to NetCDF, MedAtlas to ODV) and to check the compliance of SeaDataNet MedAtlas and ODV file formats
• ODV as the fundamental data analysis and visualisation software
• DIVA to spatially interpolate (or analyse) observations on a regular grid in an optimal way

SeaDataNet ODV import format

All data entering EMODnet Chemistry must be converted to the SeaDataNet ODV import format, which is the common standard format. Delivery of data to users requires common data transport formats, which interact with other SeaDataNet standards (Vocabularies, Quality Flag Scale…) and SeaDataNet analysis and presentation tools (ODV, DIVA). Detailed guidelines on data transport formats, both for data in the water column and specifically for biological data, are available at https://www.seadatanet.org/Standards/Data-Transport-Formats (Fig. 1). All SDN ODV files are .txt files.


Fig. 1: SeaDataNet datafile formats documentation

As described in the SDN documentation on datafile formats, the fundamental data model underlying the format is the ASCII spreadsheet, i.e. a collection of rows each having the same fixed number of columns. ODV spreadsheet files may contain comments, and the column separation character may be TAB or semicolon (https://odv.awi.de/fileadmin/user_upload/odv/misc/odv4Guide.pdf). There are three different types of column:

• Metadata columns
• Primary variable data columns (one column for the value plus one for the qualifying flag)
• Data columns (one column for the value plus one for the qualifying flag)

The metadata columns are stored at the left-hand end of the row, followed by the primary variable columns and then the data columns.

There are four different types of rows (the semantic header being a special set of comment rows):

• Comment rows
• Semantic header rows
• Column header row
• Data rows

Comment lines start with two slashes (//) as the first two characters of the line and may contain arbitrary text in free format. Comment lines may, in principle, appear anywhere in the file; most commonly, however, they are placed at the beginning of the file and contain descriptions of the data, information about the originator or definitions of the variables included in the file.


The Semantic header consists of a set of ‘special’ mandatory comment lines that must appear before the column header row. Their function is to map the text strings used to label the metadata and data columns to standardized SeaDataNet concepts, which is necessary if data files from different sources are to be combined in a meaningful way (Fig. 2, Fig. 3).
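The row types described above can be told apart with a few lines of Python. The following is a minimal sketch, not part of the SeaDataNet tools: it assumes that semantic header lines are the comment lines that start with //SDN_parameter_mapping or carry a <subject>…</subject> mapping, and that the first non-comment line is the column header. The function name and the shortened example content are hypothetical.

```python
def classify_rows(lines):
    """Label each line as 'comment', 'semantic_header', 'column_header', 'data' or 'blank'."""
    labels = []
    header_seen = False
    for line in lines:
        text = line.strip()
        if not text:
            kind = "blank"
        elif text.startswith("//"):
            # Assumption: semantic header lines are the comment lines that
            # announce or carry a <subject>/<object>/<units> mapping.
            is_semantic = text.startswith("//SDN_parameter_mapping") or "<subject>" in text
            kind = "semantic_header" if is_semantic else "comment"
        elif not header_seen:
            kind = "column_header"  # first non-comment line labels the columns
            header_seen = True
        else:
            kind = "data"
        labels.append(kind)
    return labels


# Hypothetical, heavily shortened file content used only for illustration.
example_lines = [
    "//Illustrative file, not a real dataset",
    "//SDN_parameter_mapping",
    "//<subject>SDN:LOCAL:Depth</subject><object>SDN:P01::ADEPZZ01</object><units>SDN:P06::ULAA</units>",
    "Cruise\tStation\tType",
    "CRUISE01\tST01\tB",
]
print(classify_rows(example_lines))
```

Such a check is only a quick pre-screening; it does not replace the format check performed by OCTOPUS.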

Fig. 2: Example of SDN ODV file, with depth as primary variable (in column 10)

The metadata values are followed by the primary variable value (in the 10th column), its flag and then the data value plus qualifying flag pairs for each parameter. Flag values are taken from the SeaDataNet vocabulary for qualifying flags (L20 at http://vocab.nerc.ac.uk/collection/L20/current).

Important: there MUST be exact correspondence between the Semantic header and the Column header. The order and labels of the first 9 metadata columns MUST be respected (i.e. Cruise; Station; Type; YYYY-MM-DDThh:mm:ss.sss; Longitude [degrees_east]; Latitude [degrees_north]; LOCAL_CDI_ID; EDMO_code; Bot. Depth [m]).
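The order and labels of the 9 metadata columns can be verified before a file is submitted. The sketch below compares the first 9 labels of a column header row with the list given above; the function name is hypothetical and a TAB separator is assumed (a semicolon can be passed instead).

```python
EXPECTED_METADATA = [
    "Cruise", "Station", "Type", "YYYY-MM-DDThh:mm:ss.sss",
    "Longitude [degrees_east]", "Latitude [degrees_north]",
    "LOCAL_CDI_ID", "EDMO_code", "Bot. Depth [m]",
]


def check_metadata_columns(header_line, sep="\t"):
    """Return (position, expected, found) for every mismatch in the first 9 columns."""
    labels = [label.strip() for label in header_line.split(sep)]
    problems = []
    for i, expected in enumerate(EXPECTED_METADATA):
        found = labels[i] if i < len(labels) else None  # None: column is missing
        if found != expected:
            problems.append((i + 1, expected, found))
    return problems


# Illustrative header with longitude and latitude swapped by mistake.
bad_header = "\t".join(
    EXPECTED_METADATA[:4]
    + ["Latitude [degrees_north]", "Longitude [degrees_east]"]
    + EXPECTED_METADATA[6:]
)
for position, expected, found in check_metadata_columns(bad_header):
    print(f"column {position}: expected {expected!r}, found {found!r}")
```

An empty result means that the first 9 labels are in the required order; it says nothing about the primary variable or the data columns.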


Fig. 3: Example of SDN ODV file showing correspondence between SDN_parameter_mapping lines and measured parameters.

The primary variable depends on the type of dataset:

Time series: point time series have row_groups made up of measurements from a given instrument at different times. The primary variable (column 10) is time. When data come from samples taken at the same position at a fixed frequency, the dataset type can be considered as “time series” and the SDN ODV file must provide time as the primary variable in the 10th column. This is particularly suitable for regular monitoring stations and is commonly used for mooring data (e.g. oceanographic buoys).

Vertical profile: profile data have row_groups made up of measurements at different depths. The primary variable (in column 10) is the ‘z co-ordinate’, which is either depth in meters or pressure in decibars (for water column data), or depth below seabed (for sediment data). It is the choice for stations that are usually not repeated regularly over time and at which several depths are sampled. Typical examples: CTD profiles, bottle stations (chemical data, chlorophyll, plankton, …).

However, this distinction is not always clear: when the same position is sampled over time at different depths, some data providers prefer to treat the data as “repeated profiles” stations.

Lastly, when data are not available for the same position at more or less regular time intervals, the dataset type is regarded as “vertical profiles” even if there is only one sampling depth and it is not a real profile. This can be the case for sediment samples taken during occasional surveys, for example.
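The choice of primary variable can be summarised as a small decision rule. The sketch below is a simplified reading of the paragraphs above, not an official algorithm: the function name, the matrix labels "water" and "sediment" and the boolean argument are hypothetical, and the ambiguous “repeated profiles” case is left to the data provider.

```python
def primary_variable(matrix, regular_revisits):
    """Suggest (dataset type, primary variable for column 10).

    matrix: "water" or "sediment" (labels used only in this sketch)
    regular_revisits: True when the same position is sampled at a fixed frequency
    """
    if matrix not in ("water", "sediment"):
        raise ValueError(f"unknown matrix: {matrix!r}")
    if regular_revisits:
        # Regular monitoring stations, moorings: time is the primary variable.
        return ("time series", "time")
    if matrix == "sediment":
        return ("vertical profile", "depth below seabed")
    return ("vertical profile", "depth or pressure")


print(primary_variable("water", regular_revisits=True))
print(primary_variable("sediment", regular_revisits=False))
```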

In order to produce correct SDN ODV datasets for EMODnet Chemistry, the use of NEMO software is STRONGLY RECOMMENDED. The User Manual, NEMO presentation and NEMO examples are available at https://www.seadatanet.org/Software/NEMO, and the lessons on NEMO given during the first EMODnet Chemistry 3 training (18-19 May 2017, Trieste) are available at https://www.emodnet-chemistry.eu/help/videotutorial.

SeaDataNet ODV dataset templates for water column data (e.g. CTD, nutrients, dissolved oxygen, …) are available at https://www.seadatanet.org/Standards/Data-Transport-Formats (Fig. 4, 5, 6).

Fig. 4: SeaDataNet ODV Dataset templates for water column data.

Fig. 5: Example of time series of nutrient data in the water column. NB: To facilitate reading, the above example shows the Column header (CH) displayed in several lines, but the standard SeaDataNet ODV file in .txt has the header on a single row.


Fig. 6: Example of vertical profile of CTD data in the water column. NB: To facilitate reading, the above example shows the Column header (CH) displayed in several lines, but the standard SeaDataNet ODV file in .txt has the header on a single row.

How to check your SeaDataNet ODV file format?

- SeaDataNet import of ODV software
- OCTOPUS

OCTOPUS (https://www.seadatanet.org/Software/OCTOPUS) checks the compliance of your SeaDataNet ODV files with the format. Once the directory of files has been chosen, the format of the file(s) can be checked by clicking on the "Check the input format" button (Fig. 7).


Fig. 7: Example of Octopus working window.

Vocabularies

The use of common vocabularies in all metadatabases and data formats is an important prerequisite for consistency and interoperability. Common vocabularies consist of lists of standardised terms that cover a broad spectrum of disciplines of relevance to the oceanographic and wider community (https://www.seadatanet.org/Standards/Common-Vocabularies). The British Oceanographic Data Centre (BODC) manages and updates all vocabularies used by EMODnet Chemistry (http://seadatanet.maris2.nl/v_bodc_vocab_v2/welcome.asp).

The basic term used to describe a parameter is the P01 (BODC Parameter Usage Vocabulary), which is made up of 3 main elements:

• The property observed (e.g. “concentration”)
• The entity observed (e.g. a chemical substance like “phosphate”, a physical phenomenon like “waves” or a biological entity like “Skeletonema costatum”)
• The matrix (e.g. water body and its various phases, sediments and its various components, atmosphere and its various components, etc.)

The P01 Parameter Usage Vocabulary is based on a semantic model. This model uses a defined set of controlled vocabularies (semantic components).


Sometimes the P01 also contains information on sampling, filtration and analytical method (Fig. 8). These additional components of the P01 improve the information about the submitted data and allow comparability between datasets during aggregation into collections.

Fig. 8: Example of P01.

How to choose the correct P01?

NEMO makes it possible to find the correct P01 starting from the more general P02 Vocabulary, the SDN Parameter Discovery Vocabulary (http://seadatanet.maris2.nl/v_bodc_vocab_v2/vocab_relations.asp?lib=P02) (Fig. 9), or from P09 (MEDATLAS Parameter Usage Vocabulary).

Fig. 9: Example: P02 to P01

For eutrophication, the user can also use the ad hoc P35 Vocabulary (EMODnet Chemistry aggregated parameter names, http://seadatanet.maris2.nl/v_bodc_vocab_v2/vocab_relations.asp?lib=P35 and http://vocab.nerc.ac.uk/collection/P35/current/, Fig. 10) developed for EMODnet Chemistry and find the correct P01 from the aggregated terms (Fig. 11).

Fig. 10: Extract from P35 vocabulary (EMODnet Chemistry aggregated parameter names).


Fig. 11: Example: from P35 to P01

The British Oceanographic Data Centre provides a user-friendly Vocabulary search tool at https://www.bodc.ac.uk/resources/vocabularies/vocabulary_search/, offering simple and advanced searches both for a vocabulary and within a vocabulary (Fig. 12-14).

Fig. 12: Example of “Vocabulary search” tool (https://www.bodc.ac.uk/resources/vocabularies/vocabulary_search/).


Fig. 13: Example of “Simple search within a vocabulary”: nitrate in the P01 Vocabulary tool.

The user can also search for the correct P01 term with the P01 vocabulary facet search on semantic components (http://seadatanet.maris2.nl/bandit/browse_step.php) (Fig. 14a) and with the BODC vocabulary search (https://www.bodc.ac.uk/resources/vocabularies/vocabulary_search/) (Fig. 14b).


Fig. 14a: Output of search of “nitrate” in P01 Vocabulary from facet search (SeaDataNet).


Fig. 14b: Output of search of “nitrate” in P01 Vocabulary (BODC)


Additional recommendations for dataset preparation

General indications

• Bottom depth cannot be 0, 9999 or -9999. It must be left empty if the information is not available.

• Sampling depth must always be provided if the information is available, also for «Time-series» data types (depth below seabed for sediment and depth for water and biota). In this case it can be provided as a parameter (not the primary variable).
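The bottom depth rule can be applied automatically while a file is being prepared. The following sketch blanks the placeholder values named above; the function name is hypothetical, and treating non-numeric or non-finite entries as missing is an assumption of this example, not a rule taken from the guidelines.

```python
import math

PLACEHOLDER_DEPTHS = {0.0, 9999.0, -9999.0}


def clean_bottom_depth(raw):
    """Return the text for the "Bot. Depth [m]" column, or "" when no real value exists."""
    text = (raw or "").strip()
    if not text:
        return ""
    try:
        value = float(text)
    except ValueError:
        return ""  # assumption: non-numeric entries are treated as missing
    if not math.isfinite(value) or value in PLACEHOLDER_DEPTHS:
        return ""  # placeholders must be left empty, not written to the file
    return text


for raw in ["125.5", "0", "-9999", "9999.0", "", None]:
    print(repr(raw), "->", repr(clean_bottom_depth(raw)))
```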

Flagging of Data Below Detection Limits and Data Below Limit of Quantification

The SDN L20 (SeaDataNet measure and qualifier flags) vocabulary, which includes Quality Flag concepts and definitions, has been updated with a new Quality Flag for data with concentrations “below limit of quantification (LOQ)”. The new flag “Q” (value below limit of quantification) is used when “The level of the measured phenomenon was less than the limit of quantification (LoQ). The accompanying value is the limit of quantification for the analytical method.”

This must not be confused with QF 6 (value below the limit of detection), which indicates that “the level of the measured phenomenon was less than the limit of detection (LoD) for the method employed to measure it. The accompanying value is the detection limit for the technique or zero if that value is unknown.”
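The two definitions translate into a simple rule for the value and flag written to the file. The sketch below follows the quoted L20 definitions; the function name and the "LOD"/"LOQ" labels are hypothetical, and rejecting flag Q without a known limit is an assumption of this example.

```python
def report_censored(kind, limit=None):
    """Return (value, quality flag) for a result reported as below a method limit.

    kind: "LOD" (below limit of detection) or "LOQ" (below limit of quantification)
    limit: the numeric limit of the analytical method, if known
    """
    if kind == "LOD":
        # Flag 6: the value is the detection limit, or zero if it is unknown.
        return (limit if limit is not None else 0.0, "6")
    if kind == "LOQ":
        if limit is None:
            # Assumption of this sketch: flag Q is only usable with a known LOQ.
            raise ValueError("flag Q requires the limit of quantification as value")
        # Flag Q: the value is the limit of quantification.
        return (limit, "Q")
    raise ValueError(f"unknown limit type: {kind!r}")


print(report_censored("LOD", 0.02))
print(report_censored("LOD"))
print(report_censored("LOQ", 0.05))
```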

Data originators must make sure that they use the correct QF.

Eutrophication datasets

The inclusion of CTD data (temperature, salinity, …) at bottle depths in the same file as nutrients, chlorophyll and oxygen is highly recommended.

Data (of the same type of parameters and from the same data originators) from the same station (same spatial and temporal coordinates) must be prepared in a single file and not split into several files, because splitting makes comparisons (e.g. Total Nitrogen versus DIN) as well as some steps of the data QC loop impossible.
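One way to avoid splitting a station across files is to group all measurements by their spatial and temporal coordinates before writing. The sketch below is only an illustration of that idea: the record keys (longitude, latitude, time, depth, parameter, value) are hypothetical, and the records are assumed to come from a single data originator already.

```python
from collections import defaultdict


def group_by_station(records):
    """Group records so that every station (same position and time) forms one unit.

    Returns {(longitude, latitude, time): {depth: {parameter: value}}}.
    """
    stations = defaultdict(dict)
    for rec in records:
        key = (rec["longitude"], rec["latitude"], rec["time"])
        # Parameters measured at the same depth end up on the same row.
        row = stations[key].setdefault(rec["depth"], {})
        row[rec["parameter"]] = rec["value"]
    return dict(stations)


# Invented values, for illustration only.
records = [
    {"longitude": 13.5, "latitude": 45.6, "time": "2019-05-01T10:00:00",
     "depth": 5.0, "parameter": "nitrate", "value": 1.2},
    {"longitude": 13.5, "latitude": 45.6, "time": "2019-05-01T10:00:00",
     "depth": 5.0, "parameter": "phosphate", "value": 0.1},
    {"longitude": 13.7, "latitude": 45.7, "time": "2019-05-02T09:30:00",
     "depth": 1.0, "parameter": "nitrate", "value": 0.8},
]
for station, rows in group_by_station(records).items():
    print(station, rows)
```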

Contaminants datasets

It is recommended to keep data related to the same matrix together, as provided by originators, and not to separate different parameters. Conversely, data related to different matrices should be split.

It is important to choose the most complete P01 possible, so that it includes all the relevant information for the measured parameters (grain size of sediment, biota size…). The use of an overly generic P01 might make the information unusable.


Contaminants in the Sediment

For the management of data in sediments, the protocols proposed within the EU project Geo-Seas (Pan-European infrastructure for management of marine and ocean geological and geophysical data, http://www.geo-seas.eu/) have been taken into consideration and adopted, and previous experience gained during EMODnet phases 1 and 2 has been used. An example of an SDN ODV template provided by Geo-Seas is attached to this report.

Samples can be collected in the sediment with several devices, and data may refer to different depth layers below the seabed (e.g. Fig. 15c).

Fig. 15: Examples of sampling devices: Grab sampler (a), box corer (b), corer (c) and multicorer (d)

The “depth” variable represents a depth INSIDE the seabed and should be indicated as “COREDIST”, which is the distance of a sensor or sampling point below the floor of a water body (Fig. 16). The preferred unit for COREDIST is “m (= meters)”.

Important: DO NOT use depth as ADEPZZ01, which corresponds to “The distance of a sensor or sampling point below the sea surface”.


Fig. 16: Example of a vertical profile in the sediment. For SeaDataNet ODV depth must be coded as “COREDIST” and expressed in “m (meters)”.

The following table contains several terms used to describe the depth where sediment samples were collected (Tab. 1). The depth of the sample is indicated as:

Entryterm | P01 entrykey | Definition
Depth below surface of the bed | COREDIST | The distance of a sensor or sampling point below the floor of a water body
Sample length | SEGMLENG | The physical length of a sample upon which measurements have been made
Minimum depth below surface of the bed | MINCDIST | The distance between the top of a core sample and the seabed.
Maximum depth below surface of the bed | MAXCDIST | The distance between the base of a core sample and the seabed. For an unsegmented core with its top coincident with the bed this is equivalent to the core length.

Table 1: List of P01 terms and definitions for “depth” parameters.

Example:

• COREDIST = 0 in case of surface sediment data
• COREDIST ≠ 0 when samples are taken below the sea floor (e.g. depth of sample A, B, … in a core cut into sections, Fig. 17, and depth of the sample in the unsegmented core, Fig. 18)


Fig. 17: Example of core cut into sections.

Fig. 18: Example of unsegmented core.

P01 terms:

• COREDIST: http://vocab.nerc.ac.uk/collection/P01/current/COREDIST/ - The distance of a sensor or sampling point below the floor of a water body. MANDATORY

• MINCDIST: http://vocab.nerc.ac.uk/collection/P01/current/MINCDIST/ - The distance between the top of a core sample and the seabed. OPTIONAL (STRONGLY RECOMMENDED)

• MAXCDIST: http://vocab.nerc.ac.uk/collection/P01/current/MAXCDIST/ - The distance between the base of a core sample and the seabed. For an unsegmented core with its top coincident with the bed this is equivalent to the core length (SEGMLENG). OPTIONAL (STRONGLY RECOMMENDED)

• SEGMLENG: http://vocab.nerc.ac.uk/collection/P01/current/SEGMLENG/ - The physical length of a sample upon which measurements have been made. OPTIONAL

As a general rule:

COREDIST = MINCDIST + (MAXCDIST-MINCDIST)/2 and SEGMLENG = MAXCDIST - MINCDIST.
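The general rule above can be applied directly when only the top and base of a sediment slice are known. The sketch below computes COREDIST and SEGMLENG from MINCDIST and MAXCDIST, all in meters; the function name and the example slice are hypothetical.

```python
def core_depths(mincdist, maxcdist):
    """Derive COREDIST and SEGMLENG (in meters) from the top and base of a core slice."""
    if mincdist < 0 or maxcdist < mincdist:
        raise ValueError("expected 0 <= MINCDIST <= MAXCDIST")
    segmleng = maxcdist - mincdist        # SEGMLENG = MAXCDIST - MINCDIST
    coredist = mincdist + segmleng / 2    # COREDIST = MINCDIST + (MAXCDIST - MINCDIST)/2
    return {
        "COREDIST": coredist,
        "MINCDIST": mincdist,
        "MAXCDIST": maxcdist,
        "SEGMLENG": segmleng,
    }


# Illustrative slice between 0.25 m and 0.75 m below the seabed.
print(core_depths(0.25, 0.75))
```

With this rule COREDIST is the mid-point of the slice, so it always lies between MINCDIST and MAXCDIST.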

The depth of the seafloor should be entered in the metadata column "Bot. Depth [m]" (9th column in SDN ODV format, see Tab. 2).

Table 2: Template for SDN ODV dataset of sediment profile. NB: To facilitate reading, the above example shows the Column header (CH) displayed in 2 lines, but the standard SDN ODV file in .txt has the header on a single row.

It is also recommended to include additional useful data: proportion of particle sizes (e.g. % clay), parameters related to granularity, water content, organic matter content and sedimentation rates. Because grain size is heterogeneous, missing information on grain size, together with missing station depth and sample thickness, strongly affects the QC of contaminants in the sediment matrix. Organic carbon, aluminium content, wet weight/dry weight ratio and grain size are relevant supplementary data required for QC and for the application of normalization procedures.


Contaminants in Biota

According to the experience gained during the previous phase of EMODnet Chemistry, contaminants in biota are mostly regarded as time series, because most of the time no depth information is linked to biota data. In this case, the SDN ODV file must provide time as the primary variable, in the 10th column (see example in Table 3).

However, sometimes several measurements are taken at the same position and time and are distinguished only by their sample_id (as in ICES datasets; see example in Table 4). This means that the data type is neither profile nor time series. In this case, ODV files need to include the following comment:

//SeaDataNet file without vertical reference

The column for the vertical reference should be included in the file as the first column after bottom depth (10th column); all values in this column will be null and have a corresponding flag 9.
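A data row for such a file can be assembled as follows. This is a minimal sketch under stated assumptions: the function name, the TAB separator and the example values are hypothetical; only the comment line, the empty 10th column and its flag 9 come from the text above.

```python
NO_VERTICAL_REFERENCE = "//SeaDataNet file without vertical reference"


def row_without_vertical_reference(metadata, data_pairs, sep="\t"):
    """Build one data row whose vertical reference (column 10) is empty with flag 9.

    metadata: the 9 metadata values, in the mandatory order
    data_pairs: list of (value, flag) tuples, one per measured parameter
    """
    if len(metadata) != 9:
        raise ValueError("exactly 9 metadata values are expected")
    cells = [str(value) for value in metadata]
    cells += ["", "9"]  # null vertical reference and its flag 9 (missing value)
    for value, flag in data_pairs:
        cells += [str(value), str(flag)]
    return sep.join(cells)


# Invented metadata and measurements, for illustration only.
metadata = ["CRUISE01", "ST01", "B", "2019-05-01T10:00:00.000",
            13.5, 45.6, "LOCAL_ID_001", 9999, ""]
print(NO_VERTICAL_REFERENCE)
print(row_without_vertical_reference(metadata, [(0.12, 1), (3.4, 1)]))
```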

During the preparation of the datasets it is also recommended to include additional parameters related to biometrics (biota sizes, sample ids, sex, life stage…), water and lipid contents, and wet weight/dry weight ratio. All this information is very relevant for normalization procedures, comparability and QC.


Table 3: Template of SDN ODV dataset of time series of contaminants in biota (IFREMER example 01950406_ODV_biota_TS.txt). NB: To facilitate reading, the above example shows the Column header (CH) displayed in several lines, but the standard SDN ODV file in .txt has the header on a single row.


Table 4: Template of SDN ODV dataset of vertical profile of contaminants in biota (note however that in this example value of depth is missing and QV is 9 = missing value). NB: To facilitate reading, the above example shows the Column header (CH) displayed in several lines, but the standard SDN ODV file in .txt has the header on 1 single row.


Other general issues and inventory of common errors

OCTOPUS detects most common format errors in the datasets. It is very important to use it to check datasets. Common errors and suggestions to avoid them are listed in Tab. 5.


Table 5: List of common errors and suggestions to avoid them.

Other errors:

• Files without primary variable. All datasets must have a time or depth/pressure column as the first column after bottom depth (10th column), and this column should contain non-null values.

• Wrong primary variable used: sediment data must not contain water depth/pressure as the primary variable. The correct P01 for sediment is COREDIST (Depth below seabed).

• Wrong correspondence between parameter (P01) and units (P06).

• Spatial coordinates on land. This error can be detected by importing data into ODV for a quick visual check.

• «0» concentrations with QF = 1 should be carefully checked: it is proposed to label them with QF = 6.

• Inconsistencies between CDI and ODV files.
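Two of the errors listed above, an empty primary variable and «0» concentrations flagged as good, can be screened for before running OCTOPUS. The sketch below works on data rows that have already been split into cells; the function name is hypothetical, and it assumes the column layout described earlier (9 metadata columns, the primary variable and its flag, then value/flag pairs).

```python
def find_suspect_rows(rows, primary_index=9, first_data_index=11):
    """Return (row number, message) for rows that deserve a manual check.

    rows: list of data rows, each a list of cell strings
    """
    issues = []
    for number, cells in enumerate(rows, start=1):
        if cells[primary_index].strip() == "":
            issues.append((number, "empty primary variable"))
        # Data columns come in (value, flag) pairs after the primary variable and its flag.
        for i in range(first_data_index, len(cells) - 1, 2):
            value, flag = cells[i].strip(), cells[i + 1].strip()
            try:
                is_zero = value != "" and float(value) == 0.0
            except ValueError:
                is_zero = False  # non-numeric cells are ignored in this sketch
            if is_zero and flag == "1":
                issues.append((number, f"zero value with QF=1 in column {i + 1}"))
    return issues


# Invented rows, for illustration only: 9 metadata cells, then primary variable and data.
meta = ["CRUISE01", "ST01", "B", "2019-05-01T10:00:00.000",
        "13.5", "45.6", "LOCAL_ID_001", "9999", "25"]
rows = [
    meta + ["5", "1", "0", "1", "2.3", "1"],
    meta + ["", "9", "1.1", "1"],
]
for issue in find_suspect_rows(rows):
    print(issue)
```

Rows reported for a zero value still need expert judgement: the guidelines propose QF = 6 for them, but only after a careful check.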