The Big Metadata

Post on 22-Mar-2017

Stories from the dark underbelly of data operations

By Daniela Tomova

Origin Story

ID=112056 Name=Julia’s file

What was Julia’s file?

Who was Julia?

What is metadata?

Data: qualifies or quantifies a concept or a real-world occurrence, often in the form of a variable across time. Used to measure and understand.

Metadata: classifies and describes data. Used to understand, structure, track and manipulate data.

What is metadata?

Metadata and “dumb” data

This commentator is basically your average data.

What is metadata?

ID       Time Dimension 1      Time Dimension 2      Value
112056   27-11-2006 23:00:00   28-11-2006 01:00:00   830
112056   27-11-2006 23:00:00   28-11-2006 02:00:00   12.7

Descriptive or Semantic Metadata:

Commodity, Variable, Contract type, Facility type, Technology, Geography, Sector, etc.

Structural or Technical Metadata:

Creation date, Origin system, Set ID, Publication freq., Value freq., Variable type, Change date, Source, Source file, etc.
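One way to picture the split between data and its two kinds of metadata is a record that keeps them side by side. The following is a minimal Python sketch, not the system described in the talk: the class names, field names, and every metadata value are hypothetical, and only the ID and the two data rows are taken from the table above.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Observation:
    """One raw data row: two time dimensions and a value."""
    time_1: datetime
    time_2: datetime
    value: float


@dataclass
class Curve:
    """A data set together with the metadata that describes it."""
    curve_id: int
    descriptive: dict[str, str]  # what the data means (hypothetical keys)
    structural: dict[str, str]   # how the data is stored and delivered (hypothetical keys)
    observations: list[Observation] = field(default_factory=list)


def parse_time(text: str) -> datetime:
    """Parse the day-month-year timestamps used in the table."""
    return datetime.strptime(text, "%d-%m-%Y %H:%M:%S")


# Metadata values below are placeholders, not facts about curve 112056.
curve = Curve(
    curve_id=112056,
    descriptive={"variable": "example variable", "geography": "example region"},
    structural={"origin_system": "example system", "value_freq": "example frequency"},
)
curve.observations.append(
    Observation(parse_time("27-11-2006 23:00:00"), parse_time("28-11-2006 01:00:00"), 830.0)
)
curve.observations.append(
    Observation(parse_time("27-11-2006 23:00:00"), parse_time("28-11-2006 02:00:00"), 12.7)
)
```

The raw rows say nothing about what 830 or 12.7 measure; in this sketch that meaning lives only in the `descriptive` and `structural` dictionaries.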

Precisely!

We cannot afford not to use metadata:

- Structure, traceability and common standards save time and resources. The more data – the greater the savings.

- Metadata removes the human bottleneck. Enables data use and reuse by both people and processes.

But that’s even more data! Don’t we have enough/too much already?

No.

- Aggregation. Easier to process than the underlying data even across sets and dimensions.

- Abstraction. Easier for people with different levels of experience to understand.

- Tool. It has a bi-directional relationship with its subject and can be used to manipulate it.

Just data about data?

- Julia’s File or WeaCity.ECeENS_Europe.Precip;;WeaCity;PC;EC.Ens;F;H.12;UTC;SVK.SK01.BRATIS;Wea.Precip;mm;H.6;;03;
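A key like the one above can be taken apart mechanically, which is what makes it usable by processes as well as people. The Python sketch below only splits the key into positional fields; the slide does not say what each position means, so no field names are assigned, and the function name `split_key` is hypothetical.

```python
RAW_KEY = (
    "WeaCity.ECeENS_Europe.Precip;;WeaCity;PC;EC.Ens;F;H.12;UTC;"
    "SVK.SK01.BRATIS;Wea.Precip;mm;H.6;;03;"
)


def split_key(raw: str) -> list[str]:
    """Split a semicolon-delimited key into positional fields.

    Empty fields are kept so that positions stay stable; only the single
    empty string produced by the trailing semicolon is dropped.
    """
    parts = raw.split(";")
    if parts and parts[-1] == "":
        parts.pop()
    return parts


fields = split_key(RAW_KEY)
for position, value in enumerate(fields):
    print(position, repr(value))
```

Keeping the empty fields matters: if they were dropped, every later attribute would shift position and a reader of the key could no longer rely on "field 10 is the unit" style conventions.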

How do we use it?

Common standard

Result

Multiple tuples linked to a curve ID:

Application dictionary

Easy, powerful, and robust Matlab queries.

Easy groupings of data in containers: charts, files, tables.

Reusable and pivotable code.

Efficient manipulation of groups of curves.

Powerful and scalable monitoring and debugging of large amounts of heterogeneous data.
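The application dictionary idea can be sketched as metadata tuples keyed by curve ID, with selection and grouping done on the metadata alone. This is a minimal Python illustration rather than the Matlab queries mentioned above; the curve IDs, attribute names, and attribute values are all made up, and `select` and `group_by` are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical application dictionary: curve ID -> metadata attributes.
CURVES = {
    101: {"variable": "precipitation", "geography": "SVK", "unit": "mm"},
    102: {"variable": "precipitation", "geography": "AUT", "unit": "mm"},
    103: {"variable": "temperature", "geography": "SVK", "unit": "degC"},
}


def select(curves, **criteria):
    """Return the sorted IDs of curves whose metadata matches every criterion."""
    return sorted(
        curve_id
        for curve_id, meta in curves.items()
        if all(meta.get(key) == wanted for key, wanted in criteria.items())
    )


def group_by(curves, attribute):
    """Group curve IDs by the value of one metadata attribute."""
    groups = defaultdict(list)
    for curve_id, meta in sorted(curves.items()):
        groups[meta.get(attribute)].append(curve_id)
    return dict(groups)


precip_ids = select(CURVES, variable="precipitation")
by_geography = group_by(CURVES, "geography")
print(precip_ids)
print(by_geography)
```

Because both helpers take the attribute as a parameter, the same code can be pivoted to another dimension (unit, variable, geography) without being rewritten, and neither helper ever touches the underlying values.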

Human dictionary

A store of analyst knowledge about the data in a common vocabulary.

Searchable

Some cool stuff which would be impossible without meta

Smart homes and IoT, Machine learning, Natural language processing, Bitcoin operations and new uses for the blockchain meta, Targeted online content, Smart grids, Big data analysis, Modern video and audio libraries, iTunes

Future uses

Emergent algorithms – like those underpinning swarm intelligence behavior and artificial neural networks

Emergent technology – technology whose effects are greater than those of its building blocks

Singularity?

Summary

Humans are not optimised for raw data processing.

We think in abstractions, relationships and tool manipulation. If we want to keep up with data, we need to shape it to the way our brains work.

That’s what metadata does.

“I've seen things you people wouldn't believe…” – Roy in Blade Runner

Questions?