
Cyberinfrastructure and its Applications in the Czech Republic

Petr Holub, Luděk Matyska

I2 Fall meeting, 1.10.2012


Czech Cyberinfrastructure

Three cyberinfrastructure institutions in the Roadmap of large infrastructure for Research and Development, approved by the Czech Government:

- CESNET
  - National Research and Education Network (NREN) provider
  - National Grid Infrastructure (NGI) coordinator
  - Moving into basic data provisioning
  - Independent legal body owned by public universities and the Academy of Sciences of the Czech Republic
- Centre CERIT-SC (CERIT Scientific Cloud)
  - Largest Grid and Cloud provider
  - Flexible experimental compute and storage facility for the development of new algorithms, tools, and applications
  - Part of Masaryk University in Brno
- Centre IT4Innovations
  - Supercomputing centre, under setup (no services provided yet)
  - Part of the Technical University in Ostrava

Holub, Matyska (CESNET & CERIT-SC) Czech Cyberinfrastructure October 1, 2012 2 / 15


Cyberinfrastructure Parameters

- Optical network connecting all the major Czech cities and all the public and most private universities
  - Multi-10 Gbps backbone with several 100 Gbps lines planned
  - 10 Gbps line to Geant (EU) and 5 Gbps to I2
  - Shared traffic, but dedicated research lines/lambdas available
- National Grid Infrastructure
  - Almost 6000 CPUs in total coordinated through CESNET
  - 2200 CPUs provided by CERIT-SC
    - With plans to almost double the figure next year
- Data facilities
  - More than 5 PB in early deployment (HSM and MAID at CERIT-SC)
  - Additional 5–8 PB (HSM) in the pipeline
- Supercomputing facilities
  - Some 4000+ CPUs next year
  - 32 thousand CPUs in 2014


Collaboration and Partnerships

- CESNET and CERIT-SC share part of the workforce
  - NGI originally conceived at Masaryk University
  - Complementary activities in data provisioning and development
  - Common Cloud Task Force
  - Common Identity Management and AAI infrastructure
  - Complementary high-level application activities
- The following information is valid for both


Collaboration with third parties

- Partnership, not just a provider and user relationship
  - Joint activities
  - Joint projects
  - Strong involvement of postgraduate students in the process
    - Both from computer science and from the application areas
- Currently mostly in the areas of computing and data manipulation and processing
- Examples follow


Brain Neurology Models

- Brain dynamic causal models based on intracranial EEG
  - Intracranial electrodes, up to 128 signal channels from one electrode, frequency up to 1 kHz
  - Analysis of complex systems made from coupled systems
  - Correlation of signals
  - Causal (directional) relationships
  - Bivariate and multivariate analyses
- Study of anatomic connectivity of brain tissue through diffusion tensor imaging
  - Provides an anatomical model of the brain (brain fiber tracts)
  - Very computationally demanding
- Faculty of Medicine and University Hospital, Masaryk University; Institute of Scientific Instruments


Bivariate models

- Floating (sliding) window
  - FFT, power spectrum, Hilbert transform, waves
  - Separate repetition for each frequency channel
- Synchronization indexes
  - Many variants: regression (R², h²), Shannon entropy, ...
  - Always only two channels analyzed

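\[
R^2 = \max_{\tau} \frac{\operatorname{cov}^2\bigl(x(t),\, y(t+\tau)\bigr)}{\operatorname{var}\bigl(x(t)\bigr)\,\operatorname{var}\bigl(y(t+\tau)\bigr)}
\]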
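A minimal sketch of the R² index above for one window and one pair of channels, assuming equally long signals given as plain Python lists and a lag search range that the slide does not specify. The function name `r2_max_over_lags`, the `max_lag` parameter, and the synthetic sine signals are illustrative assumptions, not part of the original analysis pipeline.

```python
import math

def r2_max_over_lags(x, y, max_lag):
    """Return (best R^2, lag) maximizing cov^2(x(t), y(t+lag)) / (var x * var y).

    Assumes len(x) == len(y) and max_lag well below the signal length.
    """
    best_r2, best_lag = 0.0, 0
    for lag in range(-max_lag, max_lag + 1):
        # Pair x(t) with y(t + lag), keeping only indices valid for both signals.
        if lag >= 0:
            xs, ys = x[:len(x) - lag], y[lag:]
        else:
            xs, ys = x[-lag:], y[:len(y) + lag]
        n = min(len(xs), len(ys))
        if n < 2:
            continue
        xs, ys = xs[:n], ys[:n]
        mx, my = sum(xs) / n, sum(ys) / n
        # The 1/n normalisation cancels in the ratio, so plain sums suffice.
        cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
        var_x = sum((a - mx) ** 2 for a in xs)
        var_y = sum((b - my) ** 2 for b in ys)
        if var_x == 0.0 or var_y == 0.0:
            continue
        r2 = cov * cov / (var_x * var_y)
        if r2 > best_r2:
            best_r2, best_lag = r2, lag
    return best_r2, best_lag

# Synthetic example (assumption): y is x delayed by 5 samples.
x = [math.sin(0.2 * t) for t in range(200)]
y = [math.sin(0.2 * (t - 5)) for t in range(200)]
r2, lag = r2_max_over_lags(x, y, max_lag=8)
```

In a windowed analysis this function would be called once per window and per channel pair, which is one reason the bivariate approach becomes expensive as the number of channels grows.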


Multivariate Models

For more than two channels:

- Multivariate methods
  - One possibility is to use bivariate methods for all combinations of channels
  - Visualization problems, correctness
- Covariance matrix and eigenvalues
  - Covariance matrix over all signals for each window
  - Only the eigenvalues are visualized
  - Approximate, but gives a proper impression
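The covariance-and-eigenvalues approach can be sketched as follows: for each window, build the covariance matrix over all channels and keep only its eigenvalues. This is an illustrative sketch, not the authors' implementation. The window length, the step, and the use of a cyclic Jacobi rotation method (chosen only to keep the example free of external libraries) are assumptions, as are all function names.

```python
import math

def covariance_matrix(channels):
    """Sample covariance matrix of equally long channels (lists of floats)."""
    n = len(channels[0])
    centered = [[v - sum(ch) / n for v in ch] for ch in channels]
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
            for ci in centered]

def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, descending."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the off-diagonal entry a[p][q].
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

def windowed_eigenvalues(channels, window, step):
    """Eigenvalue spectrum of the channel covariance matrix for each window."""
    n = len(channels[0])
    return [jacobi_eigenvalues(covariance_matrix([ch[s:s + window] for ch in channels]))
            for s in range(0, n - window + 1, step)]

# Synthetic example (assumption): two identical channels are fully synchronized,
# so all variance collapses into the first eigenvalue in every window.
x = [math.sin(0.3 * t) for t in range(40)]
spectra = windowed_eigenvalues([x, x], window=20, step=10)
```

One dominant eigenvalue per window suggests strongly coupled channels, while a flat spectrum suggests largely independent ones. This is why the eigenvalues alone can give a useful, if approximate, picture.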


Global Climate Change

Tree reconstruction from a laser scan

- Search for a new algorithm for the reconstruction of a tree from a cloud of 3D points from a laser scan
- Tree scanned by a laser scanner (LIDAR)
  - Output is a 4D map of XYZ coordinates plus reflection intensity (different for a trunk, leaves, ...)
  - On the order of thousands of points per tree
- Expected output
  - Tree structure
  - In a format suitable for further digital processing
- Institute of Global Climate Change


Tree Reconstruction

- Major problems
  - Data quality
    - Combination of scans from different angles: precision, movement due to the wind, ...
    - Overlaps ⇒ gaps in data
- Adjacency graph → independent reconstruction of promising identified areas → combination of reconstructed areas into a tree
- Use of neural networks to fill in gaps in data
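The first step of that pipeline, building an adjacency graph over the point cloud and splitting it into areas that can be processed independently, might look roughly like the sketch below. The slides do not say how adjacency is defined or how promising areas are selected. The distance threshold `radius`, the use of connected components as "areas", and the toy coordinates are all assumptions made for illustration.

```python
import math
from collections import deque

def build_adjacency(points, radius):
    """Connect every pair of 3D points closer than `radius` (brute force, O(n^2))."""
    adjacency = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= radius:
                adjacency[i].append(j)
                adjacency[j].append(i)
    return adjacency

def connected_components(adjacency):
    """Split the graph into connected areas using breadth-first search."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(sorted(component))
    return components

# Toy point cloud (assumption): a short "trunk" segment and a detached "branch".
points = [(0, 0, 0.0), (0, 0, 0.1), (0, 0, 0.2), (5, 5, 5.0), (5, 5, 5.1)]
areas = connected_components(build_adjacency(points, radius=0.15))
```

Gaps in the scan show up here as separate components. Bridging them is the part the slide assigns to the later combination step and to the neural-network gap filling.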


Virtual Microscope

http://atlases.muni.cz/

- Collection of tissue scans with a resolution of up to 170000x140000 pixels (gigapixel range)
  - More than 3000 samples in more than 150 million files
  - Currently more than 30 thousand tiles (= independent scans) split over 1 million images per picture
- Accessible through a web interface
  - Simulation of a real microscope
  - Fine-grained focus
- JPEG2000 version under development
  - GPGPU (CUDA) accelerated processing
  - Interesting research in the perceived picture quality of JPEG versus JPEG2000
- Institute of Pathology, Faculty of Medicine, Masaryk University
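To give a feel for the scale, the sketch below works out the pixel count of the largest scan and the number of image tiles a multi-resolution pyramid would need for it. The slides do not state the tile size or the storage layout of the atlas. The 256-pixel tile size and the halving pyramid are assumptions borrowed from common web-viewer practice.

```python
def tiles_per_level(width, height, tile_size=256):
    """Return the tile count of each pyramid level, from full resolution down."""
    counts = []
    while True:
        cols = -(-width // tile_size)   # ceiling division
        rows = -(-height // tile_size)
        counts.append(cols * rows)
        if cols == 1 and rows == 1:
            break
        # Each coarser level halves both dimensions (rounding up).
        width, height = (width + 1) // 2, (height + 1) // 2
    return counts

width, height = 170_000, 140_000          # largest resolution quoted on the slide
gigapixels = width * height / 1e9
levels = tiles_per_level(width, height)   # tile_size=256 is an assumed default
total_tiles = sum(levels)
```

Under these assumptions a single focal plane already needs several hundred thousand tiles, so a figure on the order of a million images per picture is plausible once multiple focus levels are stored.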


Molecular Modeling

Haptic models of interaction of a large biomolecule with a smaller agent

- Energy gradient is mapped onto the haptic force field
- Needs fast response (1 kHz)
- Realistic simulation → computationally intensive

Electric charges at atoms in a molecule

- Extremely computationally intensive for large molecules
- Electronegativity equalization
- Based on ab initio parameters

Large multipoint interactions

- Long-distance interaction of millions of particles
- Necessary for realistic molecular dynamics simulation of large biomolecules

NCBR (CEITEC)


ELIXIR and ELIXIR_CZ

- European Bioinformatics Infrastructure
  - Extremely large volumes of data
    - Thousands of genomic sequencers foreseen in Europe
    - Each capable of producing petabyte(s) of data yearly
- Participation in the ELIXIR_CZ node setup
  - Collaboration on data storage, management (including access control), processing, and long-term preservation
  - Combined with computationally intensive simulations
- Many institutions throughout the country, coordinated through the Institute of Organic Chemistry and Biochemistry
  - Cyberinfrastructure institutions are founding members


BBMRI_CZ

- National biobanking infrastructure
  - Distributed infrastructure: both from the geographic and the organizational perspective
  - Gathering of anonymized data about the stored samples

BBMRI-CZ: National Biobanking Infrastructure in the Czech Republic

Dalibor Valík (1), Petr Holub (2), Kristína Greplová (1), Dana Knoflíčková (1)

(1) RECAMO, Masarykův onkologický ústav, Žlutý kopec 7, 656 53 Brno

(2) CERIT-SC, Ústav výpočetní techniky MU, Botanická 68a, 602 00 Brno

[Figure: BBMRI-CZ architecture diagram. Labels: Hospital (hospital information system, data anonymization); Biobank (sample storage information, biobank administrator, biobank monitoring system); export to central storage; Central BBMRI-CZ infrastructure (central BBMRI-CZ index, search interface, sample request/approval interface); Researcher (approved research projects, sample request, approval/denial).]

IT Infrastructure for BBMRI-CZ

Data gathering as well as sample requesting by the researchers in the BBMRI-CZ infrastructure is a distributed process that spans several independent institutions and involves patients' data, thus requiring a complex IT infrastructure to support and protect it. While the biobanks will collect samples and store them in cryoboxes, the IT infrastructure will collect metadata for each sample from several heterogeneous sources (hospital information systems, the biobanks themselves, the national oncology register), anonymize the metadata, and index it to allow researchers to find samples of interest using both simple and complex queries. The IT infrastructure will leverage European and Czech e-infrastructures: the CESNET2 high-speed backbone network, distributed storage technologies and computing systems provided by CERIT-SC, and authentication and authorization infrastructures based on federation principles at the national level. The IT infrastructure is being developed jointly by the RECAMO and CERIT-SC partner projects and is scheduled for deployment in 2012.

About BBMRI-CZ

BBMRI-CZ is a national biobank project committed to providing researchers with material for medical and biological research, focusing mainly on oncology. Coordinated by the Masaryk Memorial Cancer Institute in Brno, the geographically distributed facility will consist of at least 5 biobanks (Brno, Prague, Olomouc, Hradec Králové, and Plzeň). The project is part of the BBMRI European Research Infrastructure Consortium (ERIC).

[Figure: map of the CESNET2 network nodes across the Czech Republic, with external connections to GÉANT, AMS-IX, SANET, ACONET, PIONIER, NIX, and the Internet. Legend: DWDM, 10 Gb/s, 1–2.5 Gb/s, 100 Mb/s, <100 Mb/s. Caption: BBMRI-CZ (green dots) on top of the CESNET2 national research and education network.]


BBMRI_CZ

- National biobanking infrastructure
- CERIT-SC helps to build the underlying IT infrastructure
- R&D problems:
  - coherent data gathering from two layers of institutions
  - distributed pseudonymization architecture (bijective)
  - k-anonymization for rare cases/diseases
  - extraction of data from hospital information systems
  - data protection during transmission and storage
  - long-term data preservation
  - use of distributed AAI
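The k-anonymization problem for rare cases can be illustrated with a small check: a data set is k-anonymous if every combination of quasi-identifier values is shared by at least k records, and rare diseases are exactly where small groups appear. The sketch below only tests the property and lists the offending groups. It does not perform the generalization or suppression that an actual anonymization step would need. The field names and the toy records are invented for illustration and are not BBMRI-CZ data.

```python
from collections import Counter

def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    return all(size >= k for size in group_sizes(records, quasi_identifiers).values())

def rare_groups(records, quasi_identifiers, k):
    """Combinations that occur fewer than k times and could identify a patient."""
    sizes = group_sizes(records, quasi_identifiers)
    return sorted(group for group, size in sizes.items() if size < k)

# Toy records with hypothetical field names (assumption, not real data).
records = [
    {"age_band": "40-49", "region": "Brno", "diagnosis": "C50"},
    {"age_band": "40-49", "region": "Brno", "diagnosis": "C18"},
    {"age_band": "60-69", "region": "Plzeň", "diagnosis": "C50"},
]
quasi = ["age_band", "region"]
ok = is_k_anonymous(records, quasi, k=2)
offenders = rare_groups(records, quasi, k=2)
```

A search interface could run such a check on each result set before returning it, and coarsen or withhold groups that fall below the chosen k.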


Conclusions

- Strong cyberinfrastructure in the Czech Republic
  - Coverage from network up to application layers
- Research work based on a partnership with other communities
  - Bringing students into the process
- Intensive international collaboration
  - Networks through Geant
  - Grids through EGI
  - Supercomputing through PRACE
  - Also additional, more narrowly focused projects
    - Including collaboration with the US, although without external funding
