Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other...

77
C. Troupin GHER-University of Liège Notebooks, reproducibility and other topics in ocean sciences

Transcript of Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other...

Page 1: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

C. Troupin

GHER-University of Liège

Notebooks, reproducibility andother topics in ocean sciences

Page 2: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

The material (slides, exercises) are made available through GitHubathttps://github.com/gher-ulg/COST-EUMETSAT-Training

Page 3: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Who knows/uses …?

GitHub

orcID

Page 4: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Who knows/uses …?

GitHub orcID

Page 5: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Who knows/uses …?

GitHub orcID

Page 6: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Who knows/uses …?

GitHub orcID

Page 7: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Who knows/uses …?

GitHub orcID

Page 8: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through
Page 9: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Problem:how to guarantee reproducibility of

results?

Page 10: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

How to go from data to results?

▶ Read the publication?▶ Read the manual?▶ Get and re-use code referenced in publication?

Page 11: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

How to go from data to results?

▶ Read the publication?

▶ Read the manual?▶ Get and re-use code referenced in publication?

Page 12: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

How to go from data to results?

▶ Read the publication?▶ Read the manual?

▶ Get and re-use code referenced in publication?

Page 13: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

How to go from data to results?

▶ Read the publication?▶ Read the manual?▶ Get and re-use code referenced in publication?

Page 14: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Notebooks: interactive computational environments

Notebooks combine:1 code fragments that can be executed,2 text for the description of the application and3 figures illustrating the data or the results.

Page 15: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Notebooks: interactive computational environments

Notebooks combine:1 code fragments that can be executed,2 text for the description of the application and3 figures illustrating the data or the results.

”Digital Playground”

”Data Story Telling”

”Computational Narratives”

Page 16: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Notebooks: interactive computational environments

http://www.nature.com/news/interactive-notebooks-sharing-the-code-1.16261

Page 17: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Interactiveenvironments:

what exists today?

Page 18: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through
Page 19: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

http://rmarkdown.rstudio.com/

Possible to export in journal or presentation formats https://github.com/rstudio/rticles

LATEX templates for different journals

Page 20: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through
Page 21: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

https://zeppelin.apache.org/

Languages can be mixed in the same notebook Users can write their own interpreter (language backend)

Page 22: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through
Page 23: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

http://jupyter.org/ More than 40 language kernels available Can be used as a multi-user server ( jupyterhub )

Page 24: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through
Page 25: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

http://beakernotebook.com/

Usage of different languages in different cells,within the same notebook

Installation and multi-language

Page 26: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through
Page 27: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

https://cocalc.com/ ”Collaborative Calculation in the Cloud” Support of many languages Users to upload their files on the platform

Page 28: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Exercise 1subsetting using nco

Page 29: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Goal: extract a regional subset from fieldTool: nco

Page 30: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

NetCDF subsetting

Notebook file: NetCDF-regridding/netCDF-subsetting.ipynbLanguage: bash (not usual)

1 Run the notebook cell-by-cell

2 Create a new notebook File / New Notebook3 Modify the bounding box and perform subsetting4 Repeat the operations on one of your files5 Export the notebook as a pdf file

Page 31: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Exercise 2regridding using nco

Page 32: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Goal: re-interpolateTool: nco + ESMF

Page 33: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

NetCDF regridding

Notebook file: NetCDF-regridding/netCDF-regrid.ipynbLanguage: bash

1 Run the notebook cell-by-cell

2 Create a new grid file with a different resolution

3 Perform again the regridding4 Try the regridding on one of your file

use create-netCDF-grid.ipynbif you don’t know how to create such a file

Page 34: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Another step

towards reproducibility

Page 35: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

How to identify scientists/researchers?

Let’s work with ORCID

Source: Academicons

Page 36: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

How to identify scientists/researchers?

Let’s work with ORCID

Source: Academicons

Page 37: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

How to identify datasets and publications?

Digital object identifier

Page 38: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Ocean Observation Science

Generate Dataset Software tool PublicationAnalysis Results

Source code, notebook

Upload

Login

Upload

Page 39: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Ocean Observation Science

Generate Dataset

Software tool PublicationAnalysis Results

Source code, notebook

Upload

Login

Upload

Page 40: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Ocean Observation Science

Generate Dataset Software tool PublicationAnalysis Results

Source code, notebook

Upload

Login

Upload

Page 41: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Ocean Observation Science

Generate Dataset Software tool PublicationAnalysis Results

Source code, notebook

Upload

Login

Upload

Page 42: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Ocean Observation Science

Generate Dataset Software tool PublicationAnalysis Results

Source code, notebook

Upload

Login

Upload

Page 43: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Ocean Observation Science

Generate Dataset Software tool PublicationAnalysis Results

Source code, notebook

Upload

Login

Upload

Page 44: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Ocean Observation Science

Generate Dataset Software tool PublicationAnalysis Results

Source code, notebook

Upload

Login

Upload

Page 45: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Ocean Observation Science

Generate Dataset Software tool PublicationAnalysis Results

Source code, notebook

Upload

Login

Upload

Page 46: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Zeno-what???

Page 47: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Zeno-what???

Zenodo – http://zenodo.org/

A platform to upload papers, datasets, software codes…and to get permanent identifiers

Page 48: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Zeno-what???After Zenodotus1st superintendent of the Library of Alexandria and 1st critical editor of Homer

By O. Von Corven - Tolzmann, Don Heinrich, Alfred Hessel and Reuben Peiss. The Memory of Mankind. NewCastle, DE: Oak Knoll Press, 2001, Public Domain,https://commons.wikimedia.org/w/index.php?curid=2307486

Page 49: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through
Page 50: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through
Page 51: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through
Page 52: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through
Page 53: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through
Page 54: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through
Page 55: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Ocean Observation Science

Generate Dataset Software tool PublicationAnalysis Results

Source code, notebook

Upload

Login

Upload

Page 56: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

A recent (and real) example: MedSea Atlas

The publication? ”Mediterranean Sea Hydrographic Atlas…” 10.5194/essd-2018-9

The author? Sissy Iona (Hellenic Center for Marine Research) 0000-0001-6878-4671

The data? MedSea – T and S observation collection V2 10.12770/8c3bd19b-9687-429c-a232-48b10478581c

The products? MedSea – T and S Annual Climatology 10.5281/zenodo.1146976 via

The method/tool? DIVA Version 4.6.11 10.5281/zenodo.400970 via

Page 57: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

A recent (and real) example: MedSea Atlas

The publication? ”Mediterranean Sea Hydrographic Atlas…” 10.5194/essd-2018-9

The author? Sissy Iona (Hellenic Center for Marine Research) 0000-0001-6878-4671

The data? MedSea – T and S observation collection V2 10.12770/8c3bd19b-9687-429c-a232-48b10478581c

The products? MedSea – T and S Annual Climatology 10.5281/zenodo.1146976 via

The method/tool? DIVA Version 4.6.11 10.5281/zenodo.400970 via

Page 58: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

A recent (and real) example: MedSea Atlas

The publication? ”Mediterranean Sea Hydrographic Atlas…” 10.5194/essd-2018-9

The author? Sissy Iona (Hellenic Center for Marine Research) 0000-0001-6878-4671

The data? MedSea – T and S observation collection V2 10.12770/8c3bd19b-9687-429c-a232-48b10478581c

The products? MedSea – T and S Annual Climatology 10.5281/zenodo.1146976 via

The method/tool? DIVA Version 4.6.11 10.5281/zenodo.400970 via

Page 59: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

A recent (and real) example: MedSea Atlas

The publication? ”Mediterranean Sea Hydrographic Atlas…” 10.5194/essd-2018-9

The author? Sissy Iona (Hellenic Center for Marine Research) 0000-0001-6878-4671

The data? MedSea – T and S observation collection V2 10.12770/8c3bd19b-9687-429c-a232-48b10478581c

The products? MedSea – T and S Annual Climatology 10.5281/zenodo.1146976 via

The method/tool? DIVA Version 4.6.11 10.5281/zenodo.400970 via

Page 60: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

A recent (and real) example: MedSea Atlas

The publication? ”Mediterranean Sea Hydrographic Atlas…” 10.5194/essd-2018-9

The author? Sissy Iona (Hellenic Center for Marine Research) 0000-0001-6878-4671

The data? MedSea – T and S observation collection V2 10.12770/8c3bd19b-9687-429c-a232-48b10478581c

The products? MedSea – T and S Annual Climatology 10.5281/zenodo.1146976 via

The method/tool? DIVA Version 4.6.11 10.5281/zenodo.400970 via

Page 61: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

A recent (and real) example: MedSea Atlas

The publication? ”Mediterranean Sea Hydrographic Atlas…” 10.5194/essd-2018-9

The author? Sissy Iona (Hellenic Center for Marine Research) 0000-0001-6878-4671

The data? MedSea – T and S observation collection V2 10.12770/8c3bd19b-9687-429c-a232-48b10478581c

The products? MedSea – T and S Annual Climatology 10.5281/zenodo.1146976 via

The method/tool? DIVA Version 4.6.11 10.5281/zenodo.400970 via

Page 62: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Conclusions Suggestions

1 Create an account in ORCID(https://orcid.org/) 2 minutes

2 Create an account in Zenodo(http://zenodo.org/) 2 minutes

3 Publish your code along with your paper 15 minutes4 Make your data public ∞ minutes

Page 63: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Conclusions Suggestions

1 Create an account in ORCID(https://orcid.org/) 2 minutes

2 Create an account in Zenodo(http://zenodo.org/) 2 minutes

3 Publish your code along with your paper 15 minutes4 Make your data public ∞ minutes

Page 64: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Conclusions Suggestions

1 Create an account in ORCID(https://orcid.org/) 2 minutes

2 Create an account in Zenodo(http://zenodo.org/) 2 minutes

3 Publish your code along with your paper 15 minutes4 Make your data public ∞ minutes

Page 65: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Conclusions Suggestions

1 Create an account in ORCID(https://orcid.org/) 2 minutes

2 Create an account in Zenodo(http://zenodo.org/) 2 minutes

3 Publish your code along with your paper 15 minutes

4 Make your data public ∞ minutes

Page 66: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Conclusions Suggestions

1 Create an account in ORCID(https://orcid.org/) 2 minutes

2 Create an account in Zenodo(http://zenodo.org/) 2 minutes

3 Publish your code along with your paper 15 minutes4 Make your data public ∞ minutes

Page 67: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Anything missing?

Page 68: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Working with

in situ observations

Page 69: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

”Without sufficient observations,useful prediction will likely never bepossible”

Page 70: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

”Models will evolve and improve, but,without data, will be untestable, andobservations not taken today are lostforever.”

C. Wunsch et al. (2010) PNAS

Page 71: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

”Without data assimilation, anyattempt to produce reliable forecastsis almost certain to end in failure.”

https://www.metoffice.gov.uk/learning/making-a-forecast/first-steps

Page 72: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

How to access the data?Copernicus Marine Environment Monitoring Service (CMEMS)catalog: http://marine.copernicus.eu

Page 73: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

How to access the data?

1 Create user2 Login into the system3 Select the data set of interest from the catalog4 Download from FTP

Page 74: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

Exercise 3: find sea watertemperature near Hamburg

using in situ data

Page 75: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

In situ temperature

Data: folder INSITU_BAL_NRT_OBSERVATIONS_013_032index_latest.txt: list of files with bounding box, timecoverage etc latest directory: netCDF files

Tool: notebook CMEMS_INSTAC/read_CMEMS_indexfile.ipynbTasks:

1 Read the index file2 Represent the data points on a figure (map)3 Find the closest data point4 Read the temperature at the point

Page 77: Notebooks, reproducibility and other topics in ocean ... · Notebooks, reproducibility and other topics in ocean sciences. The material (slides, exercises) are made available through

How to access the data?In Situ Thematic Assembly Center:http://www.marineinsitu.eu/dashboard/