AWG 2014 Data Model Description Christian Nieke – IT-DSS 24.9.2014 AWG 2014: Christian Nieke1.


AWG 2014

Data Model Description

Christian Nieke – IT-DSS


Disclaimer

•General ideas to start up the discussion

•Nothing is set in stone!


Architecture


Open questions

•Data dumping
  • Time interval (daily, monthly…)
  • Uniform format?

•Format of processed files
  • CSV for now (simple)
  • Technology review going on (Parquet, Avro…)

•Advanced export interfaces
  • Open research question


Data Description – WHAT do we need?

•Absolute Minimum:

  • List of data sets (EOS, LSF, …)
    • Contact address for each data set!

  • List of attributes in processed data
    • Attribute names
    • Understandable description
    • Data type & unit
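The minimum description listed above could be captured in a small structure like the following. This is only a sketch to support the discussion: the class names, the attribute names (`bytes_read`, `open_time`) and the contact address are hypothetical placeholders, not part of any agreed model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    """One attribute of a processed data set."""
    name: str
    description: str   # understandable description
    dtype: str         # data type, e.g. "int", "float", "str"
    unit: str = ""     # unit, empty if not applicable


@dataclass
class DataSet:
    """A data set with its contact address and attribute list."""
    name: str
    contact: str
    attributes: List[Attribute] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Names of attributes that lack a description or a data type."""
        return [a.name for a in self.attributes
                if not a.description or not a.dtype]


# Hypothetical example entry; all values are placeholders.
eos = DataSet(
    name="EOS",
    contact="eos-contact@example.org",
    attributes=[
        Attribute("bytes_read", "Bytes read during the file access", "int", "bytes"),
        Attribute("open_time", "", "float", "s"),  # description still missing
    ],
)

print(eos.missing_fields())
```

A check like `missing_fields()` would make it easy to spot attributes that do not yet meet the minimum.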


Data Description – WHAT do we need?

•Short Term:

  • Links between data sets
    • EOS::FileServer <-> LanDB::name

  • List of attributes in raw data
    • For discovery of features that should be extracted
    • Names, optionally additional descriptions
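A link such as EOS::FileServer <-> LanDB::name amounts to a join key between two data sets. The sketch below shows one possible way to apply it, assuming both data sets are available as lists of records. Only the `FileServer` and `name` fields come from the slide; the other fields and all values are made up for illustration.

```python
def link_records(eos_rows, landb_rows):
    """Attach LanDB details to each EOS record via FileServer == name."""
    landb_by_name = {row["name"]: row for row in landb_rows}
    linked = []
    for row in eos_rows:
        merged = dict(row)
        match = landb_by_name.get(row["FileServer"])
        if match is not None:
            # Prefix LanDB fields so their origin stays visible.
            merged.update({f"LanDB::{key}": value for key, value in match.items()})
        linked.append(merged)
    return linked


# Hypothetical sample records.
eos_rows = [
    {"FileServer": "server-a", "bytes_read": 1024},
    {"FileServer": "server-x", "bytes_read": 2048},  # no LanDB entry
]
landb_rows = [
    {"name": "server-a", "building": "B1"},
]

linked = link_records(eos_rows, landb_rows)
for record in linked:
    print(record)
```

Records without a matching LanDB entry are kept unchanged, so missing links stay visible instead of silently dropping data.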


Data Sources

•Graphical overview


[Figure: data sources and the keys that link them]

  • LsHost (node configuration)
  • LanDB (network details & position)
  • LSF (scheduling)
  • EOS (I/O details)
  • Experiment-Dashboards (semantics)

Link keys shown in the figure: job_id, worker_node, file_server/worker_node, user+host+time


Where to put it - TWiki Page?


TWiki

•Pros:
  • Central
  • Access for everybody
    • To manage one's own data
    • … but also to improve other people's descriptions on demand
  • Hyperlinks
  • Fast & easy for the start

•Cons:
  • Requires manual changes


Later: Machine Readable Descriptions

•Simple description files

  • in the filesystem
  • description stored where the data is
  • can be used with a parser
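As one possible illustration of a description file that lives next to the data and is read by a parser, the sketch below stores a JSON description beside a CSV file and uses it to convert column types. The JSON layout, the `.desc.json` naming convention, and the column names are assumptions made for this example, not a proposed standard.

```python
import csv
import json
import tempfile
from pathlib import Path

# Supported type names in the (assumed) description format.
CASTS = {"int": int, "float": float, "str": str}


def description_path(data_path: Path) -> Path:
    """Description file sits beside the data: <file>.desc.json (assumed convention)."""
    return data_path.with_name(data_path.name + ".desc.json")


def load_with_description(data_path: Path):
    """Read a CSV file and cast each column as stated in its description file."""
    desc = json.loads(description_path(data_path).read_text())
    casts = {attr["name"]: CASTS[attr["type"]] for attr in desc["attributes"]}
    with data_path.open(newline="") as handle:
        return [
            {key: casts[key](value) for key, value in row.items()}
            for row in csv.DictReader(handle)
        ]


# Demo with a temporary directory and hypothetical content.
workdir = Path(tempfile.mkdtemp())
data_file = workdir / "eos_sample.csv"
data_file.write_text("file_server,bytes_read\nserver-a,1024\n")
description_path(data_file).write_text(json.dumps({
    "dataset": "EOS",
    "attributes": [
        {"name": "file_server", "type": "str", "unit": "",
         "description": "Host name of the file server"},
        {"name": "bytes_read", "type": "int", "unit": "bytes",
         "description": "Bytes read during the access"},
    ],
}))

rows = load_with_description(data_file)
print(rows)
```

Keeping the description in the same directory as the data means that whoever copies or moves the data also moves its description.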


Discussion and demo…
