

The BioImage metadata framework
Based on the <indecs> metadata framework

Contents

1 Introduction
  1.1 The four axioms
2 Principles
  2.1 The principle of unique identification
  2.2 The principle of functional granularity
3 Entities
4 Attributes
  4.1 Labels
  4.2 Quantities
  4.3 Qualities
5 Relations
  5.1 Roles
  5.2 Types
  5.3 Events
6 Parties
7 Creations - a model of making
  7.1 Creation types
  7.2 Creation qualities
  7.3 Creation-to-creation relation roles
8 Non-textual metadata
9 Metadata dictionary

Title: The BioImage metadata framework
Description: This document is based on Godfrey Rust and Mark Bide's <indecs> metadata framework
Funding: The International DOI Foundation, www.doi.org
Author: Steffen Lindek, [email protected]
File Name: METADATA.RTF
Date: 05.12.00


1 Introduction

BioImage (www.bioimage.org) is a database for multidimensional microscopy data from biological samples and their metadata. The EC funded the development of BioImage for three years (Dec. 1996 to Nov. 1999) in its 4th Framework Programme. During that time, a database model was designed, implemented and populated for demonstration purposes. The prototype database is available at two sites (Madrid, Spain: www.bioimage.org; Heidelberg, Germany: www-embl.bioimage.org).

The BioImage metadata framework is the BioImage-specific version of the <indecs> metadata framework. It documents the BioImage model in an <indecs>-compliant way and is designed to be both an application and an extension of the general framework to a specific genre.

1.1 The four axioms

The <indecs> metadata framework rests on four axioms, which apply to BioImage as follows.

1.1.1 Axiom 1: Metadata is critical

BioImage has recognized from the very beginning that metadata is critical, not so much for e-commerce (in its commercial meaning), but especially for scientific communication, for the exchange of information relating to scientific experiments. In fact, BioImage is more about the metadata of scientific microscopy data than about the data itself. The major motivation and goal of BioImage is to make the raw data (not the images derived from that data) available for scientific research and thereby promote reuse and cross-subject evaluation of the data. For this purpose, the microscopy data itself is of no value if it is not accompanied by additional information on how it was acquired, how the sample was prepared, etc. [1] A considerable amount of time was therefore dedicated to the definition and the organization of the metadata.

1.1.2 Axiom 2: Stuff is complex

BioImage is a good example of how complex stuff can be. The top-level entity, the BioImage study, is generally composed of several datasets, which are groups of data files acquired during the same experiment. The data in the files may be two-, three- or even four-dimensional (three spatial coordinates and one time coordinate) and have one or more channels (each channel, which holds specific information on the sample, is generally represented in a different color). All these entities have their own set of attributes. In most cases, scientists are not involved in all parts of an experiment (and may therefore have different rights, although this differentiation is currently not applied in scientific publications). Biological samples, chemicals and instruments used for the experiments need to be described individually. A BioImage study may therefore easily involve more than a hundred different entities, all of them with their own set of attributes.

Stuff is also complex because of the great variety of biological samples (from entire organisms to macromolecular complexes) and of microscopes and imaging techniques. The fundamental assumption of BioImage is that it is possible to set up a generic system to handle complex metadata for all different image data types, starting with microscopy data from a wide variety of different instruments.

[1] In this respect, scientific (microscopy) data is very different from art or music, which can be appreciated without additional information.

1.1.3 Axiom 3: Metadata is modular

BioImage is based on a modular model. The biological (sample-related) and technical (instrument-related) information is distinct and can be modelled separately. The BioImage database can therefore be decomposed into five different modules:

1. the administrative and organizational data (authors, institutions, etc.)
2. the image metadata (size, format, etc.)
3. the technical information (microscope, components, etc.)
4. the biological information (specimen information)
5. the experimental details (sample preparation steps with biological and physical parameters)

This metadata is generally produced and provided by different authors. While the leading scientist will be able to describe the biological information and the experimental details, the manager of the microscope facility or the operator of the microscope will be more competent to provide the technical details of the microscope and its settings during data acquisition.

1.1.4 Axiom 4: Transactions need automation

The amount of work that is necessary to document a microscopy study for the BioImage database is enormous if one wishes the submission to be as complete as possible. The time required for the entry of the data could be considerably reduced if the technical parameters (the microscope settings for data acquisition), which are known to the computer that nowadays controls many microscopes, were transmitted automatically to the BioImage database.

This type of automated transaction requires the interoperability postulated in the <indecs> framework, which allows information created in one context (the microscope control) to be used in another one (the image library), in a way that is as highly automated as possible. Here, it becomes obvious how important the standardization of these parameters is. This standardization should be worked out by the image libraries so that instrument and software manufacturers can incorporate the resulting standards.

2 Principles

The <indecs> framework describes four guiding principles for the development of well-formed metadata. BioImage lends itself to comments on the first two principles.

2.1 The principle of unique identification

Unique identification has been implemented into BioImage from the very beginning. Each study, each dataset, each data file is uniquely identified by an ID. On the one hand, this qualifies the BioImage ID for use as a digital object identifier (DOI). On the other hand, it indicates the importance of some kind of standard (legacy) identifier to be introduced for digital image data.

As stressed in the <indecs> metadata framework, unique identification should apply at all levels, including the use of controlled vocabularies for values of properties such as measures. This is even more important in the context of scientific applications like BioImage. However, the difficulty in finding a common cross-subject scheme and the desire to give the submitter as much freedom as possible in the choice of his vocabulary and its structure led to the incorporation of some free-text fields other than names or titles.

2.2 The principle of functional granularity

The functional granularity implemented in BioImage is deep. As mentioned above, studies, datasets and files are uniquely identified, and this unique ID can be used to access the entity and its metadata directly. However, the granularity mostly originates in constraints imposed by, or derived from, the database model, and is not motivated purely by the content. Hence abstracts (or summaries), which are uniquely identified in scientific publications by some publishers, are not tagged with an ID in the BioImage database, but are treated as a simple attribute.
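The identification scheme described in this section can be illustrated with a minimal sketch. The class names, the ID strings and the `Registry` helper are hypothetical and are not taken from the BioImage implementation; the sketch only shows that studies, datasets and files each carry a unique ID that gives direct access to the entity, while the summary remains a plain attribute without an ID of its own.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    id: str
    name: str


@dataclass
class Dataset:
    id: str
    files: list = field(default_factory=list)


@dataclass
class Study:
    id: str
    title: str
    summary: str = ""  # simple attribute, not separately identified
    datasets: list = field(default_factory=list)


class Registry:
    """Maps every unique ID to its entity (hypothetical helper)."""

    def __init__(self):
        self._entities = {}

    def register(self, entity):
        # Reject a second entity with an ID that is already in use.
        if entity.id in self._entities:
            raise ValueError(f"duplicate ID: {entity.id}")
        self._entities[entity.id] = entity
        return entity

    def register_study(self, study):
        # Register the study and every dataset and file it contains.
        self.register(study)
        for dataset in study.datasets:
            self.register(dataset)
            for data_file in dataset.files:
                self.register(data_file)

    def get(self, entity_id):
        return self._entities[entity_id]


# Illustrative IDs; the real BioImage ID format is not specified here.
f1 = DataFile("F-0001", "cells_ch1.dat")
study = Study("S-0001", "Example study", "Free-text summary",
              [Dataset("D-0001", [f1])])
registry = Registry()
registry.register_study(study)
```

With this structure, `registry.get("F-0001")` returns the file entity directly, without navigating through its study or dataset.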

3 Entities

BioImage takes into account two of the views described in the <indecs> framework: the general view and the commerce view. The additional view, the generic attribute structure, is also used (see section 4).


The general view classifies entities into three basic types: percepts, which are perceived with the senses, concepts, which are conceived in the mind, and relations, which connect them. Percepts are animate (beings) or inanimate (things), and relations are dynamic (events) or static (situations). Figure 3 further details the entities that are important for BioImage.

Figure 3 BioImage entities

percept
  being (specimen)
    animal
    plant
    humanBeing (person)
  thing, artefact
    manifestation
      recording (file)
        data
        film
        video
      derivation (file)
        image
      compilation
        study
        dataset
        figure
relation
  event
    creatingEvent
    samplePreparation
    mounting
    recordingEvent
    compilingEvent
    derivingEvent
    processing
  situation
    possessingSituation
    association
      affiliation
attribute
  label (see section 4.1)
  quantity (see section 4.2)
  quality (see section 4.3)
  role
    agent (person)
      contributor
        recorder, operator
        producer
        director
    input
      tool
        microscope
      material
        sample (specimen)
      subject
    output
      recording
      compilation
      derivation
    context
      time
      place (institution)


The commerce view focuses on how things are made. The creating events and the creations are therefore the key elements (see section 7).

4 Attributes

The <indecs> metadata framework proposes five types of attributes: labels (strings, such as identifiers and names), quantities (numbers + value qualifiers, such as dimensions, durations, etc.), qualities (adjectives such as language, colour, etc.), types (nouns reflecting a classification) and roles (agent, input, output, context).

Here are some more examples of these attributes that are relevant to BioImage. Note that attributes with an <indecs> identifier (as superscript) are defined in the <indecs> metadata framework. The other attributes are introduced by BioImage.

4.1 Labels

Table 4.1 lists some identifiers and names used in the BioImage database. Names can be classified into titles, which have an identifying character, and descriptors, which have no identifying meaning.

Table 4.1 BioImage labels

| label | subtypes | examples |
| --- | --- | --- |
| identifier^26 | | doi^174, email, url^190, catalogNo^620, enzymeCommissionNo |
| name^29 | title^303, descriptor | name*, synonym, acronym, symbol, grantNo, breed, strain, title^303, shortTitle, summary |

* name is used as a generic attribute for all the names used in BioImage: firstName, lastName, cellName, taxonomyName, etc.

4.2 Quantities

Table 4.2 lists some quantities and measures. Some of them are not used in the BioImage database and are just shown to complete the structure.

The numeric value of a quantity can be detailed through value qualifiers. In the BioImage model we use four value qualifiers: the measure and the precision, which are also identified by the <indecs> metadata framework, and the origin and the measurementTimePoint. The origin is a new element that is introduced because many measurements are not absolute (such as the height of an object), but relative (such as the age, where the origin is generally the birth, or the temperature (Fahrenheit vs. Celsius)). Often, the origin is part of the measure. The measurementTimePoint indicates the time point when the measurement of a dynamic value was made (weight of an animal at a measurementTimePoint).
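As an illustration of the four value qualifiers, a quantity could be modelled as a small record. This is a sketch under stated assumptions: the class and field names, the unit strings and the example values are hypothetical and do not reproduce the BioImage database schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Quantity:
    value: float
    measure: str                                  # unit, e.g. "d" or "g"
    precision: Optional[float] = None             # error, in the same measure
    origin: Optional[str] = None                  # reference point of a relative value
    measurement_time_point: Optional[str] = None  # when a dynamic value was measured

    def describe(self) -> str:
        # Build a readable description from the qualifiers that are set.
        text = f"{self.value:g} {self.measure}"
        if self.precision is not None:
            text += f" +/- {self.precision:g}"
        if self.origin is not None:
            text += f" relative to {self.origin}"
        if self.measurement_time_point is not None:
            text += f" (measured at {self.measurement_time_point})"
        return text


# Hypothetical example values.
age = Quantity(5, "d", precision=0.5, origin="fertilization")
weight = Quantity(21.3, "g", measurement_time_point="start of experiment")
```

Here `age.describe()` yields `5 d +/- 0.5 relative to fertilization`, which makes the otherwise implicit origin explicit.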

There are many ways to classify quantities. Here, the approach chosen in the <indecs> model is refined and extended. Note that the base quantities dimension, duration, angle, mass, temperature, charge, and count [2] are complemented by another entity, the rate, which is the parent of most of the derived quantities. Many of these can be grouped into spatial rates (something measured per dimension, i.e. densities) and temporal rates (something measured per duration, i.e. frequencies, velocities). This schema does not allow some quantities to be unambiguously classified. The force, for example, can be considered to be the time derivative of the momentum or the spatial derivative of the energy.

[2] Note that this is not the base of the SI (Système International d'Unités). The SI is also based on the luminous intensity. The planeAngle has been chosen as a fundamental quantity by the UCUM (Unified Code for Units of Measure).
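The ambiguity in classifying the force can be checked with simple dimensional bookkeeping. The following sketch represents each quantity by its exponents over three of the base quantities (dimension, duration, mass); the helper names are hypothetical and the representation is an illustration, not part of the BioImage model.

```python
# Exponent vector over the base quantities (dimension, duration, mass).
def dim(dimension=0, duration=0, mass=0):
    return (dimension, duration, mass)


def multiply(a, b):
    # Multiplying quantities adds their exponents.
    return tuple(x + y for x, y in zip(a, b))


def divide(a, b):
    # Dividing quantities subtracts their exponents.
    return tuple(x - y for x, y in zip(a, b))


distance = dim(dimension=1)
duration = dim(duration=1)
mass = dim(mass=1)

velocity = divide(distance, duration)                   # distance/duration
momentum = multiply(mass, velocity)                     # kg.m/s
energy = multiply(mass, multiply(velocity, velocity))   # J

# Force seen as a temporal rate (momentum per duration) ...
force_as_temporal_rate = divide(momentum, duration)
# ... and as a spatial rate (energy per dimension).
force_as_spatial_rate = divide(energy, distance)
```

Both derivations give the same exponent vector `(1, -2, 1)`, i.e. kg.m/s2 (the newton), so the force fits equally well under temporal and under spatial rates.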

Table 4.2 BioImage quantities and measures

| quantity | subtypes / measure | examples |
| --- | --- | --- |
| dimension^50 | distance / m | length, width, height, thickness, spacing, diameter, radius, wavelength, bandwidth, amplitude, resolution |
| | area / m² | |
| | volume / m³ | |
| duration^57 | age / s | |
| | interval / s | feedbackTime, pulseLength |
| angle | planeAngle / ° | angle, tiltAngle, orientation |
| | solidAngle / sr | |
| mass | mass / kg | |
| temperature | temperature / K | |
| charge | electricCharge / C | |
| count^61 | digitalCount / bit | dynamicRange, fileSize |
| | 1Dcount | pixelNumber, noise (readOutNoise, thermalNoise) |
| | amount / mol | |
| | 2DCount | binning, mesh |
| | multiplicity (or factor) | oligomerization, stoichiometry, cycles, passageNo, magnification, gain |
| | sequence^119 | order |
| rate^62 | percentage / % | transmittance, efficiency, compressionDegree |
| | frequency (count/duration) / Hz | resonanceFrequency, oscillationFrequency, pulseFrequency |
| | velocity (distance/duration) / m/s | scanSpeed |
| | concentration (count/dimension, mass/dimension, etc.) | concentration, density, osmolarity, electronDose |
| | energy / J | |
| momentum | momentum / kg·m/s | |
| | angularMomentum / kg·m²/s | |
| force^59 (momentum/duration, energy/dimension) | force / N | |
| | torque / N·m | |
| pressure (force/dimension) | pressure / Pa | |
| | springConstant / N/m | |
| | power (energy/duration) / W | |
| | current (charge/duration) / A | darkCurrent, tunnelCurrent |
| | electricPotential (energy/charge) / V | highVoltage, accelerationVoltage |
| other | electricResistance / Ω | |
| | electricConductance / S | |
| | conductivity / S/m | |
| | polarity | |
| | hydrophobicity | |
| | refractiveIndex | |
| | acidity / pH | |
| | numericalAperture | |
| | molecularWeight | |

4.3 Qualities

Table 4.3 shows some qualities used in the BioImage model. Note that some of them can be quantified and hence be used as quantities (e.g. the charge).

Table 4.3 BioImage qualities

| quality | subtypes | examples |
| --- | --- | --- |
| grade | academicTitle | Dr., PhD, MD, Prof. |
| | salutation | Mr., Mrs., Ms., Miss |
| state | synchronized | yes/no |
| | functionalState | functional/inhibited |
| | aggregationState | filament |
| | cellCycleStage | metaphase |
| | lifeCycleStage | embryo |
| method | scanningMethod | beam scanning |
| | geometricMode | transmission |
| | imageMode | dark-field |
| | scanDirection | perpendicular, parallel |
| ? | charge | +/- |
| | polarization | circular, linear, elliptical |
| | pulsed | yes/no |
| | polarity | +/- |
| shape | | triangular |
| material | | gold |

5 Relations

5.1 Roles

In the <indecs> framework, relations consist of two or more entities which play roles in relation to one another. Four generic roles are identified: agent, input, output and context. All these roles have many subtypes, and the BioImage subtypes are described in this section.
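The role model can be sketched as a relation object that groups entities under the four generic roles. The `Relation` class and the entity labels below are hypothetical illustrations; only the four role names and the recordingEvent type come from this framework.

```python
from dataclasses import dataclass, field

GENERIC_ROLES = ("agent", "input", "output", "context")


@dataclass
class Relation:
    type: str
    roles: dict = field(
        default_factory=lambda: {role: [] for role in GENERIC_ROLES})

    def add(self, role, entity):
        # Only the four generic roles are accepted in this sketch.
        if role not in GENERIC_ROLES:
            raise ValueError(f"unknown role: {role}")
        self.roles[role].append(entity)
        return self


# A recordingEvent described through the entities playing roles in it.
recording = (
    Relation("recordingEvent")
    .add("agent", "operator")
    .add("input", "microscope XYZ")   # tool
    .add("input", "sample")           # material
    .add("output", "data file")       # recording
    .add("context", "ABC Institute")  # place
)
```

Subtypes such as tool and material would refine the generic input role; they are left as comments here to keep the sketch short.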

5.1.1 Agent roles

In BioImage, agent roles are fulfilled by persons and institutions, who are mainly contributors.

While there are areas in which there is a tendency to precisely describe the role of individual contributors (the <indecs> example of the "third assistant graphic art director"), there is little role differentiation in scientific publications. In the life sciences, the convention is that the firstAuthor is the leading author (most credit goes to him, since his name will appear in all abbreviated author lists), while the last author is the senior author, the head of the group or institute, but usually neither the first nor the senior author is explicitly identified. The only role explicitly stated is the role of the correspondingAuthor.


Table 5.1.1 BioImage agent roles

| element | definition | genealogy |
| --- | --- | --- |
| contributor^69 | A party contributing to the making of something; synonym for BioImage author | |
| creator^70 | A party contributing to the making of an original creation; synonym for BioImage author | |
| author | A party contributing to the making of a (BioImage) publication (study or dataset) | creator/ |
| correspondingAuthor | An author who is the contact for a BioImage study or dataset | author/ |
| firstAuthor | An author who is the first in the author list | author/ |
| modifier^71 | A party contributing to the making of a modification (a role for some of the authors) | |
| excerpter^72 | A party contributing to the making of an excerpt (a role for some of the authors) | |
| compiler^73 | A party contributing to the making of a compilation (a role for some of the authors) | |
| producer^75 | A contributor responsible for the realisation of a creation (the senior author) | |
| funder | A contributor providing funds for the making of a creation | contributor/ |
| director^76 | A contributor directing the activity of others (the principal investigator or firstAuthor or senior author) | |
| operator^78 | A contributor operating equipment (e.g. a microscope); in BioImage synonym for recorder | |
| recorder^79 | A contributor recording an event (e.g. with a microscope); in BioImage synonym for operator | |
| facilitator^80 | A contributor providing support services to other contributors (e.g. the manager of a microscopy facility) | |
| possessor^84 | A party retaining possession of an entity; synonym for BioImage owner (of a microscope) | |
| owner | A party retaining possession of an entity; synonym for possessor (of a microscope) | party/ |
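The genealogy column of Table 5.1.1 can be read as a parent pointer for each element introduced by BioImage. The sketch below assumes that a genealogy entry such as "author/" names the direct parent of the element; the `lineage` function is a hypothetical helper. Elements defined in the <indecs> framework have no genealogy entry in the table, so the chain stops there.

```python
# Parent element of each BioImage-introduced agent role, as given in the
# genealogy column of Table 5.1.1 (interpretation: "x/" means parent x).
PARENT = {
    "author": "creator",
    "correspondingAuthor": "author",
    "firstAuthor": "author",
    "funder": "contributor",
    "owner": "party",
}


def lineage(element):
    """Return the genealogy path from the topmost known ancestor down."""
    path = [element]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return "/".join(reversed(path))
```

For example, `lineage("correspondingAuthor")` returns `creator/author/correspondingAuthor`, and `lineage("funder")` returns `contributor/funder`.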

5.1.2 Input roles

In BioImage the input roles are mainly tools (e.g. microscopes) and materials (e.g. specimens and chemicals). Note that specimens could also be considered as being patients.

Table 5.1.2 BioImage input roles

| element | definition | genealogy |
| --- | --- | --- |
| patient^86 | An entity which is the immediate object of the act in an event, or is possessed or associated in a situation (e.g. a specimen) | |
| tool^90 | A bounded thing used directly by a contributor (e.g. a microscope) | |
| material^91 | An unbounded thing used directly by a contributor (e.g. a chemical) | |
| microscope | An instrument used as a tool in a magnified visual recordingEvent | input/tool |
| chemical | A chemical substance | input/material |

5.1.3 Output roles

Outputs are entities that result from an event. In BioImage, output roles are fulfilled by creations (see section 7).


5.1.4 Context roles

Context roles are time and place. They are treated only briefly in the <indecs> metadata framework, which is the reason why this section comments on them generally, i.e. beyond the BioImage framework. The comments are inspired by sections 3.7 and 5.4 of version 3 of the <indecs> metadata model (5.7.1999).

Although it seems at first glance that time and place are just simple attributes (12 December 1995, 10:30:56, Mount Everest), which have different resolution (1995 vs. the exact date and time; Himalaya vs. Mount Everest), a deeper investigation reveals that they are in fact very complex elements. Time is not merely a quantity that can be decomposed into a number (1995), a measure (years), and an origin or reference point (Christ's birth; as for other quantities, the origin is often implicit, e.g. A.D.), or a double quantity if the time (number 10:30:56; implicit measure hh:mm:ss; origin may be GMT) is considered independently from the date (note that any time and date can be expressed by a single quantity: number + measure + origin). First, like many other quantities, it can be extended by a value qualifier, the precision (<indecs> metadata model), which is a quantity, too (the error). On the other hand, precision values such as "before" or "after" are only qualitative, which means that the precision can also be a quality. [3] Second, time can also be a period, a range (<indecs> metadata model), which can be described by a name (the Renaissance) or by explicit start and end dates. This leads to a third value qualifier, the measurementTimePoint (e.g. the start of the event, which would indicate that the time is a startTime). If the precision of the number is not high in comparison with the duration of an event (e.g. August 1995 for a two-day event), the time point does not matter. However, if the precision is high, the measurementTimePoint should be indicated. [4] Obviously, the temporal context of an event can consist of several times.

This leads us to consider time as an entity, a concept, with the label name and the quantity number with its qualifiers measure, origin, qualitativePrecision, quantitativePrecision and measurementTimePoint (qualities).

The place is complex, too. It is also a concept that can be described by quantifiers (e.g. geographical coordinates or postal codes) and labels (names). The qualitative and quantitative value qualifiers qualitativePrecision and quantitativePrecision also apply, as well as the range (between two or more places). Additional problems arise from the fact that geographical names are neither unique (e.g. Perth in Australia and in the UK) nor stable (places are denoted by different names at the same time or at different times) and therefore do not qualify as identifiers.

Time and place are important for BioImage in two contexts. The places described in the model are the institutions where the experiments have been done. This does not constitute an unusual problem. However, while we are used to using birth as the age reference point when speaking about humans and animals, this becomes subtler when dealing with very young organisms in biomedical research. Does the age of five days refer to the birth, to the fertilization, or to another biological time point? This stresses the importance of the origin.

5.1.5 Role qualification

BioImage also uses role qualifiers. The sequence in which authors are listed in a study or a dataset is an example of such a role qualifier. The quantity plays a role where components of macromolecules are counted (see count).

5.2 Types

Types do not play a major role in the BioImage model. Although persons may often be authors or suppliers, they are not characterized as such. The same holds for the creations: data files are recordings, and datasets and studies are compilations.

[3] This is also the case for some quantities (see section 4.3).
[4] These considerations also apply to all measurements of dynamic values (see section 4.2).


5.3 Events

The events in BioImage are the steps that form the samplePreparation procedure, the mounting, the data acquisition (recordingEvent), and the data processing. Further steps describe transformations of the data.

Table 5.3 BioImage event types

| element | definition | genealogy |
| --- | --- | --- |
| step | A single element of an experiment | experiment/ |
| samplePreparation | A step which is the modification of a specimen and results in the making of a microscopy sample | step/ |
| mounting | A step in which a sample is mounted | step/ |
| recordingEvent | A step in which an image file of a sample is recorded | step/, creatingEvent |
| processing | A step in which a file is processed | step/, transformingEvent, modificationEvent |
| compilingEvent | A step in which a figure, a dataset or a study is compiled | step/ |
| derivingEvent | A step in which an image is derived from a file | step/ |

An experimental procedure may contain any number of steps, which may in turn be composed of more elementary steps. In principle, the procedure can be subdivided into atomic steps that consist of one gesture (which makes sense for a washing step, in which a biological sample is rinsed, but not for the setting of a single instrumental parameter).
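The nesting of steps can be sketched as a recursive structure. The `Step` class is a hypothetical illustration, and the sub-steps "fixation" and "staining" are invented examples; only the step types and the washing step are named in this document.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    substeps: list = field(default_factory=list)

    def atomic_steps(self):
        """Return the names of all steps that are not subdivided further."""
        if not self.substeps:
            return [self.name]
        names = []
        for substep in self.substeps:
            names.extend(substep.atomic_steps())
        return names


procedure = Step("experiment", [
    Step("samplePreparation", [
        Step("fixation"),   # hypothetical sub-step
        Step("washing"),
        Step("staining"),   # hypothetical sub-step
    ]),
    Step("mounting"),
    Step("recordingEvent"),
    Step("processing"),
])
```

Calling `procedure.atomic_steps()` flattens the procedure into its elementary steps in the order in which they are performed.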

6 Parties

In the BioImage model, all the principal party types defined in the <indecs> framework play a role. Authors (or more generally persons) are humanBeings, biological specimens can be animals or plants, the institutions (universities, institutes, laboratories) are organizations, and research groups can be considered to be ensembles (groups of authors).

The metadata that BioImage uses for parties conforms to the generic attribute structure: labels (name), quantities (age), qualities (lifeCycleStage), and roles in relations (affiliation).

Table 6 BioImage party types

| element | definition | genealogy |
| --- | --- | --- |
| person | A man or woman of the species Homo sapiens; synonym for humanBeing^17 | being/ |
| specimen | An animal or a plant used in a BioImage study | being/ |
| institution | A group of human beings; synonym for organization^615 | group_party; concept/ |

7 Creations - a model of making

Microscopy data is a creation. The generation of the data can be described by a model of making as presented in Section 8 of the <indecs> metadata framework.


7.1 Creation types

Basically, the making of a scientific study involves three entities: the idea (a concept), the experiment (or performing the calculations for a theoretical work; an expression), and the publication (a classical printed publication or a BioImage study; a percept/manifestation). While the idea is not documented in the BioImage database, the other entities can be subdivided. The experiment is composed of different steps: samplePreparation, mounting, data acquisition (recordingEvent) and data or image processing steps. The direct creations are the microscopy data and other (supplemental) files, which are documented as one entity file in the BioImage database. The indirect creations are the classical printed publication based on the experiment, and the BioImage entities dataset and study that are based on the data and the publication (there is a one-to-one relationship between the printed publication and the BioImage study). These are indirect in the sense that they are compilations of microscopy data and non-microscopical material.

Note that, as mentioned in the <indecs> metadata framework, the creatingEvents themselves (the recordingEvent, for example) are not treated as expressions (or intellectual property) in the BioImage model.

Table 7.1 BioImage creation types

| element | definition | genealogy |
| --- | --- | --- |
| publication | A creation that is published | artefact/manifestation |
| study | A compilation of one or more datasets; an indirect BioImage creation | publication/ |
| dataset | A compilation of files; an indirect BioImage creation | publication/ |
| file | A BioImage creation | creation/ |
| recording | A data file resulting from a recordingEvent | creation/ |
| figure | A compilation of images | creation/ |
| image | A derivation from a data file | creation/ |
| experiment | An event which is a scientific creation | event/expression/ |
| step | A single instance of an experiment | experiment/ |

Two examples illustrate this model:

An engineer has an idea for a new instrument. He describes the instrument in a patent. Then, he orders and assembles all parts of the instrument and produces some test data. Based on the information on the instrument and on the data that he uses to produce a plot, he writes a manual.

A scientist has an idea for a novel experiment. His experiment consists of growing cells, mounting them on a slide and observing them with a microscope. The data he has recorded is in digital form. He can therefore process it on his PC. He selects the two best recordings and combines them into a single figure. Then, he writes an article about his findings. The article is entered into the BioImage database as a study with two data files (his two best recordings) and a supplemental file (the figure he composed).

An issue that is BioImage-specific is the distinction between generic entities and experiment-specific entities. A cell, for example, is described in general terms (cell type, cell name and tissue from which the cell is taken) and in terms that are specific to a certain instance of the cells (e.g. cell cycle stage and cell culture parameters). A microscope is described as a generic (virtual) instrument (the microscope XYZ described by the manufacturer's catalogue), and as an existing instrument (the microscope XYZ at the ABC Institute, which may not have all the parts that the manufacturer sells for that microscope; on the other hand, it may be extended by parts that are not original). In the case of the microscope, there is even a third layer: experiment-specific settings of the microscope are stored in separate tables (these are basically attributes of the recordingEvent grouped into microscope-component-specific complexes). This schema has proven to be very useful.

This instantiation can be described as a simple role between a concept (the generic entity) and a percept (the specific entity). Since it is always a one-to-many relationship, the role is described as seen from the specific entity. Then, the generic entity is the source (input/sourceCreation) or parent (input/patient/parent) of the specific entity (percept).
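The three layers described for the microscope can be sketched as follows. All class names, field names and part names are hypothetical; the sketch only illustrates that the specific instrument points back to its generic source (the one-to-many relationship seen from the specific entity) and that experiment-specific settings form a separate layer.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GenericMicroscope:
    """Concept: the instrument as described in the manufacturer's catalogue."""
    model: str
    catalogue_parts: frozenset


@dataclass
class MicroscopeInstance:
    """Percept: an existing instrument at an institution."""
    source: GenericMicroscope  # role described as seen from the specific entity
    institution: str
    missing_parts: set = field(default_factory=set)
    extra_parts: set = field(default_factory=set)

    def parts(self):
        # Catalogue parts actually present, plus non-original extensions.
        return (set(self.source.catalogue_parts) - self.missing_parts) | self.extra_parts


@dataclass
class RecordingSettings:
    """Third layer: experiment-specific settings of one recordingEvent."""
    instrument: MicroscopeInstance
    values: dict


# Hypothetical part names and setting value.
xyz = GenericMicroscope("XYZ", frozenset({"objective", "laser", "detector"}))
xyz_at_abc = MicroscopeInstance(xyz, "ABC Institute",
                                missing_parts={"laser"},
                                extra_parts={"custom stage"})
settings = RecordingSettings(xyz_at_abc, {"scanSpeed": 2.0})
```

Many `MicroscopeInstance` objects can share one `GenericMicroscope`, and many `RecordingSettings` can refer to one instance, so the generic description is stored only once.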

7.2 Creation qualities

The most important BioImage creation qualities and their values are given in the following table:

Table 7.2 BioImage creation qualities

| element | remarks | values |
| --- | --- | --- |
| mode^46 | most creations are visual, films may be audiovisual | visual^162, audiovisual^295 |
| origination^209 | studies, datasets, and figures are compiled; images are derived; processed files are modified; thumbnails (previews) are excerpted | original^214, compiled^212, derived, modified^213, excerpted^211 |
| genre^34 | | lexical^288, pictorial^283, audiovisual^295 |
| substance^37 | | digital^133 |
| infixion^33 | | bitEncoded^135 |
| continuity^39 | except for the compilations, which can change over time (data may be added to studies and datasets), the creations are static | dynamic^138, static^139 |

7.3 Creation-to-creation relation roles

The next table lists the output roles which creations play in events.

Table 7.3 Creation-to-creation relation roleselement definition

compilation99 A creation made from two or more pre-existing creations of other types (e.g.a study)

derivation A creation derived from a creation of another type (e.g. an image)

recording A creation resulting from a recordingEvent

excerpt95 A creation which is made by extraction from a pre-existing creation (e.g. athumbnail)

modification97 A creation made by changing a pre-existing creation of the same type (e.g. aprocessed file)


8 Non-textual metadata

For BioImage, as for all other image databases, non-textual metadata is crucial. Previews (thumbnail images), images (views or projections of multidimensional data), figures (composed images as published in printed publications), and additional material such as diagrams are key elements of the added-value provided by BioImage. While some of these are attributes (previews), others are entities with their own attributes (images, figures, and diagrams are treated as files with their own set of attributes, such as the name, the format, the copyright owner).

9 Metadata dictionary

The metadata dictionary holds information on the metadata elements used for BioImage (name, definition, relationship with other elements).

Table 9.1 includes terms defined in the <indecs> metadata framework document that are modified or extended in the context of BioImage, and terms newly introduced for BioImage. The element names try to follow both the original nomenclature developed by BioImage (which is in turn as close as possible to the scientific use) and the generic system presented in the <indecs> metadata framework. Wherever BioImage uses an <indecs> term with another definition or wherever BioImage uses another term equivalently to an <indecs> term, this is explicitly mentioned.

Table 9.1 BioImage framework basic metadata dictionary

element | description | genealogy | iid
1DCount | A number measuring the occurrence of an attribute | count/ |
2DCount | A pair of numbers measuring the occurrence of an attribute | count/ |
academicTitle | An academic grade attributed to a person | grade/ |
acronym | An abbreviated label by which an entity is known (often identifier within a restricted namespace) | name/title/ |
age | A duration with an origin | duration/ |
angle | A base quantity; a number measuring some orientational aspect of an entity | quantity/ |
area | A two-dimensional dimension | dimension/ |
author | A creator of a BioImage creation | creator/ |
breed | A label for beings | name/descriptor/ |
cellCycleStage | The stage of a cell within its life cycle | state/ |
charge | A base quantity; a number measuring some electromagnetic aspect of an entity | quantity/ |
chemical | A chemical substance | input/material/ |
compilingEvent | A step in which a figure, a dataset or a study is compiled | step/ |
concentration | A spatial rate; a number measuring something per dimension | rate/ |
correspondingAuthor | An author who is the contact for a BioImage creation | author/ |
current | A number measuring the flow of charge | rate/ |
dataset | A compilation of BioImage files | publication/ |


derivation | A creation derived from a creation of another type | creation/ |
derivingEvent | A step in which an image is derived from a file | step/ |
digitalCount | A number measuring digital information | count/ |
distance | A one-dimensional dimension | dimension/ |
electricPotential | A number measuring the energy per charge | rate/ |
email | An electronic address that can generally be used to identify a party | identifier/ |
energy | A rate | rate/ |
enzymeCommissionNo | An identifier for enzymes | identifier/ |
experiment | An event which is a scientific creation | event/expression/ |
figure | A compilation of images | creation/ |
file | A BioImage creation | creation/ |
firstAuthor | An author who is the first in the author list | author/ |
force | A rate | rate/ |
frequency | A temporal rate; a number measuring something per duration | rate/ |
funder | A contributor providing funds for the making of a creation | contributor/ |
grade | An attribute qualifying the rank of a party | quality/ |
grantNo | A label by which a grant is identified within the funder's namespace | name/title/ |
image | A derivation from a data file | creation/ |
institution | A group of human beings; synonym for organization | party/; concept/ |
instrument | A tool | tool/ |
interval | A duration without an origin | duration/ |
lifeCycleStage | The stage of a being within its life cycle | state/ |
mass | A base quantity; a number measuring some physical aspect of an entity | quantity/ |
material | An unbounded thing that makes up an entity | quality/ |
measurementTimePoint | A quality for an extended quantity (a value qualifier) | |
method | A configuration of an instrument | quality/ |
microscope | An instrument used as a tool in a magnified visual recordingEvent | input/tool/ |
momentum | A rate | rate/ |
mounting | A step in which a sample is mounted | step/ |
multiplicity | A number measuring the occurrence of an entity | count/ |
operator | A contributor operating equipment to create content in a creation - in BioImage synonym for recorder | contributor/ | 78
origin | A reference point for a measure (a value qualifier) | |
owner | A party retaining possession of an entity - synonym for possessor | party/ |
percentage | A rate which can be expressed in percent | rate/ |


person | A man or woman of the species Homo sapiens; BioImage synonym for humanBeing | being/ |
possessor | A party retaining possession of an entity - synonym for BioImage owner | party/ | 84
power | A number measuring the energy per duration | rate/ |
precision | A quality or a quantity for the exactness of a number (a value qualifier) | |
pressure | A number measuring the force per dimension | rate/ |
processing | A step in which a file is processed | step/ |
publication | A creation that is published | manifestation/ |
range | A quality for an extended quantity (a value qualifier) | |
recorder | A contributor recording an event in the making of a creation - in BioImage synonym for operator | contributor/ | 79
recording | A creation resulting from a recordingEvent | creation/ |
recordingEvent | A step in which an image file of a sample is recorded | step/ |
salutation | A qualifier used for a person | grade/ |
samplePreparation | A step which is the modification of a specimen and results in the making of a microscopy sample | step/ |
shortTitle | A quality relating to the aspect or form of an entity | quality/ |
shortTitle | A shortened title name by which a creation is known | name/title/ |
specimen | An animal or a plant used in a BioImage study | being/ |
state | A quality that describes a configuration or a status of an entity | quality/ |
step | A single element of an experiment | experiment/ |
strain | A label for beings | name/descriptor/ |
study | A compilation of one or more BioImage datasets | publication/ |
summary | A brief text that describes a creation | name/descriptor/ |
symbol | A short label by which an entity is known (often identifier within a restricted namespace) | name/title/ |
synonym | A label by which an entity is also known | name/title/ |
temperature | A base quantity; a number measuring some thermal aspect of an entity | quantity/ |
velocity | A temporal rate; a number measuring a dimension per duration | rate/ |
volume | A three-dimensional dimension | dimension/ |
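The genealogy column of Table 9.1 can be read mechanically. The sketch below assumes, as a reading of the table rather than a rule stated by the framework, that "/" separates ancestors with the nearest parent last, and that ";" separates alternative genealogies (as in "party/; concept/"). The function names `parse_genealogy` and `full_paths` are hypothetical.

```python
def parse_genealogy(cell):
    """Split a genealogy cell into a list of ancestor chains.

    Assumed conventions: ';' separates alternatives, '/' separates ancestors.
    An empty cell yields an empty list.
    """
    chains = []
    for alternative in cell.split(";"):
        ancestors = [part for part in alternative.strip().split("/") if part]
        if ancestors:
            chains.append(ancestors)
    return chains


def full_paths(element, cell):
    """Return the element appended to each of its ancestor chains."""
    return ["/".join(chain + [element]) for chain in parse_genealogy(cell)]
```

For example, `full_paths("microscope", "input/tool/")` yields `["input/tool/microscope"]`, and an element with two genealogies yields two paths.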

Table 9.2 lists more BioImage entities with their genealogy, but without a definition.

Table 9.2 BioImage metadata list

element genealogy

affiliation situation/

anatomicalStructure thing/


authorship situation/

bioContent percept/

cantilever thing/microscope/

cell being/

cellType attribute/type/; concept/

chamber thing/

channel thing/

complexComponent thing/

component thing/

country attribute/quality/; concept/

cover thing/

database thing/

datasetType attribute/type/; concept/

detector thing/

electronSource thing/source/

expression (biological) situation/

fileFormat attribute/type/; concept/

filter thing/

funding situation/

gas thing/

genericCell concept/

genericMolecule concept/

genotype concept/

lens thing/

lightSource thing/source/

macromolecularComplex thing/

modification concept/

organ thing/anatomicalStructure/

organelle thing/

organism being/

parameter concept/

pipette thing/microscope/

position concept/

scanner thing/microscope/

setup (configuration of an instrument) situation/

software thing/; input/tool/

solution thing/

source thing/

stage thing/microscope/


stepType attribute/type/; concept/

substance thing/

substrate thing/

suppliedItem percept/

supply (situation defining the supplier) situation/

support thing/

supporting (for supporting files) situation/

taxonomyName attribute/label/; concept/

technique attribute/quality/; concept/

tip thing/microscope/

tissue thing/anatomicalStructure/

unit (synonym for measure) attribute/quality/; concept/

Table 9.3 is a coarse tabular description of the BioImage model. It lists entities with a short description and the attributes. The roles are derived from the relationships that the BioImage model defines between entities.

Table 9.3 BioImage model

element: description
    label/quantity/quality/type/role (multiplicity 0,1,n in BioImage database)

anatomicalStructure: An anatomical structure or tissue
    label/name
    quality/tax
    input/patient>anatomicalStructure (0..1)

bioContent: A biological feature within a specimen
    input/patient>anatomicalStructure (0..1)
    input/patient>genericCell (0..1)
    input/patient>organelle (0..1)
    input/patient>genericMolecule (0..1)
    input/patient>specimen (1)

cell: A cell
    quality/cellCycleStage
    quantity/passageNo
    quantity/cultureAge
    quality/synchronized
    label/cultureDensity
    label/strain
    input/patient>organism (0..1)
    input/patient>specimen (1)

cellType (Controlled Vocabulary): The cell types
    label/name
    input/patient>cellType (0..1)

chamber: A chamber used for mounting a sample
    label/name
    quality/sealed
    quantity/volume
    label/material
    label/moreInfo

channel: A channel of a file
    label/content
    quantity/depth
    label/label
    quantity/axisSizex|y|z|t
    measure/axisUnitxyz|t
    quantity/spacingx|y|z|t
    quantity/resolutionx|y|z|t
    input/patient>file (1)
    input/subject>bioContent (0..n)

complexComponent: A component of a macromolecular complex
    quantity/oligomerization
    quantity/stoichiometry
    quantity/molecularWeight
    label/sequenceMutation
    label/sequenceFragment
    label/sequenceComment
    quality/functionalState
    quantity/acMolecule
    label/acChain
    label/acComment
    quality/modification
    quantity/modifiedFrom
    quantity/modifiedTo
    label/modificationComment
    input/patient>specimen (1)
    input/patient>complex (0..1)
    input/patient/target>specimen (0..1)
    input/patient/modification>specimen (0..n)
    input/agent/sequenceProvider>database (accessNo,textString) (0..1)
    input/agent/atomicCoordinatesProvider>database (accessNo,textString) (0..1)
    input/patient/fusion>complexComponent (order) (0..n)

component: A part of the microscope
    label/name
    quality/class
    context/place>place (microscope,instrument,settings(channel,step)) (0..n)

contact: An abstract class for institutions and persons
    type/contactType

country (Controlled Vocabulary): The countries
    label/countryCode
    label/name
    label/phoneCode
    label/mailCode

cover: A cover used for mounting a sample
    label/name
    quantity/spacerThickness
    quantity/refractiveIndex
    quantity/coverThickness
    label/material
    label/moreInfo

crystal (Attribute Entity): The attributes of a crystal
    quantity/cellDimensionsx|y|z
    quantity/cellAnglea|b|c
    type/groupSymmetry

database: A reference database
    label/name
    label/url
    label/databaseTail1
    label/databaseTail2

dataset: A set of files in the BioImage database
    label/title
    label/runningTitle
    label/abstract
    label/version
    label/preview
    quality/technique
    type/datasetType
    label/educationLevel
    agent/contributor/funder>institution (grantNumber,recipient>person) (0..n)
    input/subject>study (1)
    input/subject>publication (0..n)
    input/subject>specimen (1..n)

datasetType (Controlled Vocabulary): The dataset types
    label/name

emSettings (Attribute Entity): The settings for an electron microscopy experiment
    quantity/accelerationVoltage
    quantity/electronDose
    quality/illuminationType
    quantity/magnification
    quantity/defocus

expression: The expression of a macromolecule
    label/vector
    label/vector.quality/type
    quality/nativeExpression
    input/patient>complComponent (1)
    input/source>genericCell (tax) (0..1)

file: A file (microscopy data file or figure, table, etc., which are derived from or support the microscopy data). Currently subdivided into nonvideoFile and videoFile
    label/doi
    label/location
    quality/type
    label/description
    label/preview
    label/educationLevel
    quality/format
    quantity/size
    attribute/processingQuality
    label/compressionAlgorithm
    quantity/compressionDegree
    agent/contributor/transformer>person (0..n)
    input/subject>dataset (1)

fileFormat (Controlled Vocabulary): The file formats (data files and image files)
    label/name
    label/extension

gas: A gas in the atmosphere surrounding the sample
    label/name

genericCell: A cell (generic)
    type/cellType
    label/name
    quality/anatomicalStructure

genericMolecule: A molecule (generic)
    label/name
    quality/type
    label/synonyms
    label/enzymeCommissionNo
    quantity/molecularWeight

genotype: The genotype of a specimen
    label/name
    label/alleleSymbol
    label/transfection
    label/karyotype
    label/moreInfo
    input/subject>publication (0..1)

institution: An institution
    label/name
    label/acronym
    label/streetAddress
    label/poBox
    label/postcode
    label/city
    label/state
    quality/country
    label/email
    label/url
    label/fax
    label/phone


instrument: An instrument, e.g. a microscope, that is not further specified (generic)
    label/name
    quality/type
    quality/technique (n)

macromolecularComplex: A macromolecular complex
    type/complexType
    label/name
    quality/aggregationState
    attribute/crystal
    quantity/NoOfComponents
    input/patient>specimen (1)

miscComponent (Controlled Vocabulary): A miscellaneous component
    label/name

modification (Controlled Vocabulary): The modification of a component of a macromolecular complex
    label/name

mounting: The step in which the sample is mounted for observation in the microscope
    input/tool>cover (0..n)
    input/tool>chamber (0..n)
    input/tool>support (0..n)
    input/tool>substrate (0..n)

nearFieldSettings (Attribute Entity): The settings for a near-field microscopy experiment
    label/nearFieldMode
    quantity/height
    quantity/force
    quantity/tunnelCurrent
    quantity/biasVoltage
    quantity/angularDisplacement
    quantity/oscillationFrequency
    quantity/oscillationAmplitude
    quality/oscillationDirection
    quantity/deflectionAmplitude
    label/servo

organelle: An organelle (or sub-cellular structure)
    label/name
    quality/tax
    input/patient>organelle (0..1)

organism: An organism
    label/breed
    label/strain
    quantity/age
    age.quality/origin
    age.quality/measurementTimePoint
    quality/lifeCycleStage
    quality/sex
    input/patient>specimen (1)

ownedMicroscope: A microscope that belongs to an institution or to a person
    agent/user/possessor>contact (1)
    input/tool>software (0..1)
    input/sourceCreation>instrument (1)

parameter (Controlled Vocabulary): The experimental parameters
    label/name
    label/symbol
    quality/unit

person: A person
    label/lastName
    label/firstName
    label/middleName
    label/suffix
    quality/academicTitle
    quality/salutation
    label/email
    label/url
    label/fax
    label/phone
    context/place>institution (jobTitle,groupName,affiliationDate) (1..n)


position: The position of a component within a microscope
    type/pathType
    quantity/incomingPath
    quantity/outgoingPath
    quantity/placeNo

processing: A step in which microscopy data is processed
    input/sourceCreation>file (1..n)
    input/tool>software (1..n)
    output>file (1..n)

processingQuality (Attribute Entity): The attributes for the processing quality of a data file
    type/measurementType
    quantity/x|y|z
    quantity/x|y|z.measure/unit
    label/method
    input/subject>publication (0..1)

publication: A non-BioImage publication
    label/doi
    agent/provider>database (accessNo,textString) (0..n)

recordingEvent: A step in which the sample is observed in the microscope and in which data is recorded
    quality/technique
    quality/scanningMethod
    quality/geometricMode
    quality/imageMode
    quality/otherMode
    attribute/settings
    input/tool>microscope (1)
    output>file (1..n)

samplePreparation: A sample preparation step
    input/tool>substance (0..1)
    input/patient/target>bioContent (0..1)

software: Software for image processing or for microscope control
    label/name
    label/version

solution: A solution used as a medium for the sample
    label/name
    quality/polarity
    quantity/conductivity
    quantity/proteinLipidRatio
    quantity/osmolarity
    quantity/acidity
    quantity/density
    quantity/refractiveIndex
    input/source>substance (0..n)

specimen: A biological specimen used in an experiment
    type/SpecimenType
    quality/tax
    label/moreInfo
    label/taxParents
    input/patient>anatomicalStructure (0..1)
    input/patient>genericCell (0..1)
    input/patient>organelle (0..1)
    input/patient>genericMolecule (0..1)
    input/patient>genotype (0..n)

step: An experimental step: sample preparation, mounting, data acquisition (recording) or processing
    quantity/order
    type/stepType
    quantity/cycles
    quantity/duration
    label/moreInfo
    agent/contributor>person (0..n)
    input/material>specimen (1..n)
    input/subject>publication (0..n)
    input/tool>instrument (1)
    input/material>gas (concentration,measure) (0..n)
    input/material>solution (amount,measure) (0..1)
    quantity>parameter (floatValue,integerValue,textValue) (0..n)
    input/subject>dataset (1)
    context/time>absoluteTime
    context/place>institution (0..n)

stepType (Controlled Vocabulary): The experimental steps
    label/name
    label/purpose
    label/method
    quality/class
    input/patient>technique (0..n)
    input/patient>parameter (0..n)

study: A set of BioImage datasets
    label/title
    label/runningTitle
    label/abstract
    label/version
    label/preview
    agent/contributor/author>person (authorType,order,institution) (1..n)
    agent/contributor/funder>institution (grantNumber,recipient>person) (0..n)
    input/sourceCreation>dataset (0..n)
    input/subject>publication (1)

substance: A solution, a molecule or another substance
    quantity/concentration
    quantity/concentration.measure
    input/patient>miscComponent
    input/patient>solution
    input/patient>genericMolecule

substrate: A substrate used for mounting a sample
    label/name
    quantity/hydrophobicity
    label/changes
    quantity/thickness
    label/material
    label/moreInfo

suppliedItem: An item that has a supplier
    quality/type
    agent/contributor/supplier>contact (catalogNo,context/place>database) (0..n)
    input/patient>substance (0..1)
    input/patient>solution (0..1)
    input/patient>specimen (0..1)
    input/patient>cover (0..1)
    input/patient>chamber (0..1)
    input/patient>support (0..1)
    input/patient>substrate (0..1)
    input/patient>instrument (0..1)
    input/patient>component (0..1)
    input/subject>publication (0..1)

support: A support used for mounting a sample
    label/name
    label/manufacturingMethod
    quantity/mesh
    label/changes
    quantity/conductivity
    quality/charge
    quantity/hydrophobicity
    label/material
    label/moreInfo

tax (Controlled Vocabulary): The NCBI taxonomy. Consists of two tables taxCV and taxNames
    label/name
    label/uniqueName
    quality/class
    quality/rank
    input/patient>tax (0..1)

technique (Controlled Vocabulary): All microscopy techniques
    label/name
    quality/class
    quality/group
    quality/radiation

unit (Controlled Vocabulary): The units
    label/name
    label/symbol
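The role entries in Table 9.3 follow a regular pattern: a role path, ">", the target entity, optional relationship attributes in parentheses, and an optional multiplicity such as (1), (0..1) or (0..n). A minimal parsing sketch is given below. The function name `parse_role` and the returned dictionary keys are hypothetical, and the grammar is inferred from the table entries rather than defined by the framework.

```python
import re

# Multiplicity at the end of an entry: (1), (n), (0..1), (1..n), ...
MULTIPLICITY_RE = re.compile(r"\s*\((\d+|n)(?:\.\.(\d+|n))?\)$")
# Role path, '>', target entity, then optional parenthesised attributes.
HEAD_RE = re.compile(r"^([\w/]+)>(\w+)\s*(?:\((.*)\))?$")


def parse_role(entry):
    """Parse one role entry of Table 9.3 into its parts.

    The attribute list is split naively on commas, so nested parentheses
    such as 'settings(channel,step)' are NOT handled by this sketch.
    """
    entry = entry.strip()
    low = high = None
    match = MULTIPLICITY_RE.search(entry)
    if match:
        low = match.group(1)
        high = match.group(2) or match.group(1)  # "(1)" means exactly one
        entry = entry[:match.start()]
    head = HEAD_RE.match(entry)
    if head is None:
        raise ValueError(f"not a role entry: {entry!r}")
    role, target, attributes = head.groups()
    return {
        "role": role,
        "target": target,
        "attributes": attributes.split(",") if attributes else [],
        "multiplicity": (low, high),
    }
```

The multiplicity is stripped from the end first so that a bare "(1)" is not mistaken for an attribute list. Plain attributes such as label/name contain no ">" and are rejected with a ValueError.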