Efficient Ti喘imeiieries S...Efficient Ti喘imeiieries S with PostgreSQL iTeveiiimps Sont...

Post on 20-May-2020

1 views 0 download

Transcript of Efficient Ti喘imeiieries S...Efficient Ti喘imeiieries S with PostgreSQL iTeveiiimps Sont...

Efficient Ti喘imeiieries Swith PostgreSQL

iTeveiiimps Sont steve@smpsn.net FOSDEM PGDay 2018

Overview

Overview

Background

Overview

Background Complexity

Overview

Background Complexity Time Series

Overview

Background

Schema

Indexing

Normalisation

Summarisation

Partitioning

Complexity Time Series

Bafickgrount d

Developers S,iDevelopers S,iDevelopers S

Expent s Sivei喘oys S

Mont iTorint g

Vis SibiliTyiint ToiTheioperaTiont iofihardwareiant dis SofTware

e.g. web site, database, cluster, disk drive

Mont iTorint g

Vis SibiliTyiint ToiTheioperaTiont iofihardwareiant dis SofTware

e.g. web site, database, cluster, disk drive

! Look !! Graphs !

Mont iTorint gi–i喘imeiieries S

ComplexiTy

“Mificros Servifices SiHell”

PostgreSQL

Redis

FrontendMachine

Learning Bit

Login / IAM

Backend APICache

ConfigurationStore

ElasticsearchCassandra

Search APIService

Discovery(consul, etcd)

Container Orchestration EngineDocker Swarm / Kubernetes

MySQL

MoreiCowbelli-iOpent iTafick

Heat HorizonNova Keystone

Conductor

Scheduler

Compute

GlanceCinder

KVM

MySQL

RabbitMQ

...

...

...

...

......

...

...

Ceph

MoreiCowbelli-iOpent iTafick

Heat HorizonNova Keystone

Conductor

Scheduler

Compute

GlanceCinder

KVM

MySQL

RabbitMQ

...

...

...

...

......

...

...

Ceph

AWS

i i

Mont iTorint g

Software

Network

Storage

Servers

MeTrifics S

Logs S

i i

Middleware

Mont iTorint g

Software

Network

Storage Log API

Kafka

Logstash Elastic

InfluxDB

Metric APIAlerting

Grafana

Kibana

MySQL

SQLite

Servers

MeTrifics S

Logs S

Zookeeper

Storm

i i

Reficap

PostgreSQL

Redis

FrontendMachine

Learning Bit

Login / IAM

Backend APICache

ConfigurationStore

ElasticsearchCassandra

Search APIService

Discovery(consul, etcd)

Container Orchestration Engine - Docker Swarm / Kubernetes

MySQL

i i

Reficap

PostgreSQL

Redis

FrontendMachine

Learning Bit

Login / IAM

Backend APICache

ConfigurationStore

ElasticsearchCassandra

Search APIService

Discovery(consul, etcd)

Container Orchestration Engine - Docker Swarm / Kubernetes

MySQL

Heat HorizonNova Keystone

Conductor

Scheduler

Compute

GlanceCinder

KVM

MySQL

RabbitMQ

...

.........

.........

...

Ceph

i i

Reficap

PostgreSQL

Redis

FrontendMachine

Learning Bit

Login / IAM

Backend APICache

ConfigurationStore

ElasticsearchCassandra

Search APIService

Discovery(consul, etcd)

Container Orchestration Engine - Docker Swarm / Kubernetes

MySQL

Heat HorizonNova Keystone

Conductor

Scheduler

Compute

GlanceCinder

KVM

MySQL

RabbitMQ

...

.........

.........

...

Ceph

Middleware

Log API

Kafka

Logstash Elastic

InfluxDB

Metric APIAlerting

Grafana

Kibana

MySQL

SQLite

Zookeeper

Storm

i i

Middleware

Mont iTorint g

Software

Network

Storage Log API

Kafka

Logstash Elastic

InfluxDB

Metric APIAlerting

Grafana

Kibana

MySQL

SQLite

Servers

MeTrifics S

Logs S

Zookeeper

Storm

i i

Middleware

Mont iTorint g

Software

Network

Storage Log API

Kafka

Logstash Elastic

InfluxDB

Metric APIAlerting

Grafana

Kibana

MySQL

SQLite

Servers

MeTrifics S

Logs S

Zookeeper

Storm

i i

Mont iTorint g

Commendable “right tool for the job” attitude, but…

ATileas ST,ificouldis SomeiofiTheipers Sis STent ficeibeiunt ifedd

Fewerifailureimodes S

Feweribafickupis STraTegies S

Feweriredunt dant ficy/replificaTiont iproToficols S

Ont eis SeTiofificont s Sis STent TidaTais Semant Tifics S

Re-us Seiexis STint gioperaTiont aliknt owledge

i i

Ont eis Simpleiques STiont …

Is PostreSQL a Time Series Database?

喘imeiieries S

喘imeiieries SiDaTabas Ses Si–i喘heiChoifice!

● Ganglia (RRDtool)● Graphite (Whisper)● OpenTSDB (HBase)● KairosDB (Cassandra)● InfuxDB● Prometheus● Gnocchi● Atlas● Heroic

● Hawkular (Cassandra)● MetricTank (Cassandra)● Riak TS (Riak)● Bluefood (Cassandra)● DalmatinerDB● Druid● BTrDB● Warp 10 (Hbase)● Tgres (PostgreSQL!)

喘imeiieries SiDaTabas Ses Si–i喘heiChoifice!

● Ganglia [Berkley]● Graphite [Orbitz]● OpenTSDB [Stubleupon]● KairosDB● InfuxDB● Prometheus [SoundCloud]● Gnocchi [OpenStack]● Atlas [Netfix]● Heroic [Spotify]

● Hawkular [Redhat]● MetricTank [Raintank]● Riak TS [Basho]● Bluefood [Rackspace]● DalmatinerDB● Druid● BTrDB● Warp 10● Tgres

喘imeiieries SiDaTabas Ses S

喘imeiieries SiDaTabas Ses S

2000

喘imeiieries SiDaTabas Ses S

2000 2010

喘imeiieries SiDaTabas Ses S

2000 2010

2013 - 2017

喘imeiieries Si–iPeriodific

time value

10:01 15%

10:02 100%

10:03 86%

ColleficTor

10:04 60%

10:05 0%

ColleficTor

喘imeiieries Si–iPeriodific

ColleficTortime value

10:01 15%

MeTrific

MeTa

name dimensions

cpu { host:prod1 }

10:01 1% cpu { host:prod2 }

10:01 24° temp { sensor:rack }

ColleficTor

10:02 100% cpu { host:prod1 }

10:02 87% cpu { host:prod2 }

10:02 26° temp { sensor:rack }

ColleficTor

喘imeiieries Si–iiporadific

ColleficTortime value

10:07 13

MeTrific

MeTa

name dimensions meta

log { host:prod1 }

Event TMeTa

{ msg:… }

11:39 1 log { host:prod2 } { msg:… }

11:50 2 alarm { host:prod1 } { reason:… }

Event Ts S

14:02 1 alarm { host:prod1 } { reason:… }

"metric": { "timestamp": 1232141412, "value": 42, "name": "cpu.percent", "dimensions": { "hostname": "dev-01" }, "value_meta": { … }}

喘imeiieries Si–iInt ges STiDaTa

● JiONiEnt ficoded● 喘imes STampi(Unt ix/IiO/..)● Meas Surement TiValue● MeTrificiIdent TifficaTiont

– Name– Dimensions (or “Tags”)

● OpTiont aliEvent TiDaTa

喘imeiieries Si–iQueries S

10:01 10:02 10:0320

21

22

23

24

25

26

27

28

29

30

{ sensor:rack }

tem

p °

C

time value

10:01 15%

name dimensions

cpu { host:prod1 }

10:01 1% cpu { host:prod2 }

10:01 24° temp { sensor:rack }

10:02 100% cpu { host:prod1 }

10:02 87% cpu { host:prod2 }

10:02 26° temp { sensor:rack }

10:03 100% cpu { host:prod1 }

10:03 50% cpu { host:prod2 }

10:03 27° temp { sensor:rack }

喘imeiieries Si–iQueries S

10:01 10:02 10:030

10

20

30

40

50

60

70

80

90

100

{ host:prod1 } { host:prod2 }

cpu

%

time value

10:01 15%

name dimensions

cpu { host:prod1 }

10:01 1% cpu { host:prod2 }

10:01 24° temp { sensor:rack }

10:02 100% cpu { host:prod1 }

10:02 87% cpu { host:prod2 }

10:02 26° temp { sensor:rack }

10:03 100% cpu { host:prod1 }

10:03 50% cpu { host:prod2 }

10:03 27° temp { sensor:rack }

? ? ?? ? ?? ? ?

? ? ?? ? ?? ? ?? ? ?? ? ?? ? ?

10:0010:0010:00

喘imeiieries Si–iQueries S

10:01 10:02 10:030

10

20

30

40

50

60

70

80

90

100

{ host:prod1 } { host:prod2 }

cpu

%

time value

10:01 15%

name dimensions

cpu { host:prod1 } 10:01 1% cpu { host:prod2 } 10:01 24° temp { sensor:rack } 10:02 100% cpu { host:prod1 } 10:02 87% cpu { host:prod2 } 10:02 26° temp { sensor:rack } 10:03 100% cpu { host:prod1 } 10:03 50% cpu { host:prod2 } 10:03 27° temp { sensor:rack } 10:0410:0410:0410:0510:0510:05

喘imeiieries Si–iificalint g

time value name dimensions ● ReTent Tiont i(Volume)– How long is data kept– Data volume stored

● MeTrificiAmount T– Number of series names– Additional dimensions

● QueryiComplexiTy– Time range size– Number of metrics

10:01 15% cpu {host:prod1}

10:01 1% cpu {host:prod2}

10:01 24° temp {sensor:rack}

10:02 100% cpu {host:prod1}

10:02 87% cpu {host:prod2}

10:02 26° temp {sensor:rack}

10:03 100% cpu {host:prod1}

10:03 50% cpu {host:prod2}

10:03 27° temp {sensor:rack}

RelaTiont aliModel

CREATE TABLE measurements ( timestamp TIMESTAMPTZ, value FLOAT8, name VARCHAR, dimensions JSONB, value_meta JSON);

RelaTiont aliModeli–iDent ormalis Sediifichema

● iToreiint is Sint gleiTable– Trivial mapping of data– Row is a measurement

● NaTiveiformaTiTime– Normalise to UTC timezone– “Just use TIMESTAMPTZ”

● FloaTint gipoint Tivalue– Typical for sensor use cases– Strict accuracy: use NUMERIC

● e.g. Money!

CREATE TABLE measurements ( timestamp TIMESTAMPTZ, value FLOAT8, name VARCHAR, dimensions JSONB, value_meta JSON);

RelaTiont aliModeli–iDent ormalis Sediifichema

● ArbiTraryilent gThint ame– Fixed length: no real efect

● Key/valueidiment s Siont s S– JSONB is binary encoded– Efficient access to children– Best if querying contents

● Key/valueimeTadaTa– JSON just validated text– Best if purely payload

SELECT DISTINCT name, dimensionsFROM measurementsWHERE name = 'cpu.percent' dimensions @> '{"host": "dev-01"}'::JSONB

RelaTiont aliModeli–iieries SiLis STint giQuery

● ieleficTimeTrificimeTadaTa● ihowieafichimeTrificiont fice● OpTiont allyiflTer

– All metrics with name– All metrics for a host

SELECT TIME_ROUND(timestamp, 60), AVG(value)FROM measurementsWHERE timestamp BETWEEN '2015-01-01Z00:00:00' AND '2015-01-01Z01:00:00' AND name = 'cpu.percent' AND dimensions @> '{"host": "dev-01"}'::JSONBGROUP BY 1

RelaTiont aliModeli–iiint gleiieries SiQuery

● Fint dimeas Surement Ts S– Specify time range– Specify metric name– Specify dimension value

● AggregaTeidaTaipoint Ts S– Round to desired interval– Group by that interval– Take average of all data

points in that interval

RelaTiont aliModeli–iPerformant ficeiAnt alys Sis S

Query Duration (seconds)

Data Volume(M/rows)

喘argeT:i<100ms S(QueryiDuraTiont )

Query Duration (seconds)

Time Range(seconds)

RelaTiont aliModeli–iiint gleiieries SiQueryi(vs Si喘imeiRant ge)

0 1000 2000 3000 4000 5000 6000 7000 8000 90000.00

0.05

0.10

0.15

0.20

0.25

0.30

0.35

0.40

3M Rows

2M Rows

1M Rows

Query Time Range (seconds)

Qu

ery

Du

ratio

n (

seco

nd

s)

RelaTiont aliModeli–iiint gleiieries SiQueryi(vs SiDaTaiVolume)

0 1 2 3 4 5 6 7 8 9 100.00

0.20

0.40

0.60

0.80

1.00

1.20

1.40

Data Volume (M-rows)

Qu

ery

Du

ratio

n (

seco

nd

s)

RelaTiont aliModeli-iAnt alys Sis S

✔ QueryiTimeifxediregardles Ss SiofiTimeirant ge

✔ Ont iTargeTifor<i~1Mirows S

✗ QueryiTimeis Sficales Silint earlyiwiThidaTaivolume

✗ Everyiqueryireads Sieveryirow✗ Full table scan

Int dexint g

Int dexint g

● 喘imes STamps Siareies Ss Sent Tiallyiint Tegers S● Pos STgreiQLihas Simant yiint dexiTypes SB喘REE,iHAiH,iBRIN,iGIN,iGIi喘

● B喘REEiexficellent TiforiEqualiTyiant diBeTweent

Index Table

1

63 2

Int dexint gi–iB喘REE

3 three2 two4 four6 six

1 one5 five8 eight7 seven

45

78

Index Table

1

63 2

Int dexint gi–iB喘REE

3 three2 two4 four6 six

1 one5 five8 eight7 seven

45

78

=7

Index Table

1

63 2

Int dexint gi–iB喘REE

3 three2 two4 four6 six

1 one5 five8 eight7 seven

45

78

>=6<=8

SELECT TIME_ROUND(timestamp, 60), AVG(value)FROM measurementsWHERE timestamp BETWEEN '2015-01-01Z00:00:00' AND '2015-01-01Z01:00:00' AND name = 'cpu.percent' AND dimensions @> '{"host": "dev-01"}'::JSONBGROUP BY 1

Int dexint gi–iiint gleiieries SiQuery

● BE喘WEENipredificaTe● Elimint aTes SihugeiporTiont iofiTableificont Tent Ts S– High selectivity

● Exficellent Tificant didaTeiforiint dexificreaTiont

Int dexint gi-i喘imes STamp

CREATE INDEX ON measurements USING BTREE (timestamp);

● ipeficifyiTableiToiint dex● ipeficifyiint dexiType

– Optional: BTREE is default● ipeficifyificolumnt iToiint dex

Int dexint gi–iiint gleiieries SiQueryi(vs SiDaTaiVolume)

0 1 2 3 4 5 6 7 8 9 100.00

0.20

0.40

0.60

0.80

1.00

1.20

1.40

No Index

With Index

Data Volume (millions/rows)

Qu

ery

Du

ratio

n (

seco

nd

s)

Int dexint gi–iiint gleiieries SiQueryi(vs SiDaTaiVolume)

0 1 2 3 4 5 6 7 8 9 100.000

0.005

0.010

0.015

0.020

0.025

9000

8000

7000

Data Volume (millions/rows)

Qu

ery

Du

ratio

n (

seco

nd

s)

Int dexint gi–iiint gleiieries SiQueryi(vs Si喘imeiRant ge,i10MiRows S)

0 1000 2000 3000 4000 5000 6000 7000 8000 9000 100000.00

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

0.09

0.10

1 Metric

10 Metrics

Query Time Range (seconds)

Qu

ery

Du

ratio

n (

seco

nd

s)

Int dexint gi–iiint gleiieries SiQueryi(vs Si喘imeiRant ge,i10MiRows S)

0 1000 2000 3000 4000 5000 6000 7000 8000 9000 100000.00

0.05

0.10

0.15

0.20

0.25

1 Metric

10 Metrics

100 Metrics

Query Time Range (seconds)

Qu

ery

Du

ratio

n (

seco

nd

s)

Int dexint gi-iAnt alys Sis S

✔ DaTaiVolume

✔ To 10M✔ 喘imeiRant ge

✔ To 9000s (10 Metrics)✔ QueryiTimeis STableias SiDaTaiVolumeiint ficreas Ses S

✗ 喘imeiRant ge

✗ Over 4000s (100 Metrics)✗ Nowiapparent TiqueryiduraTiont iint ficreas Ses Sias Si喘imeiRant geigrows S

✗ Int ficreas Sint gint umberiofimeTrifics Sidras STificallyiafeficTs SiqueryiduraTiont

✗ Data for each uninteresting series must be fltered out

SELECT TIME_ROUND(timestamp, 60), AVG(value)FROM measurementsWHERE timestamp BETWEEN '2015-01-01Z00:00:00' AND '2015-01-01Z01:00:00' AND name = 'cpu.percent' AND dimensions @> '{"host": "dev-01"}'::JSONBGROUP BY 1

Int dexint gi–iiint gleiieries SiQuery

● Moreiint dexint gd– name– dimensions

Int dexint gi–iAddiTiont al

CREATE INDEX ON measurements USING BTREE (name);

CREATE INDEX ON measurements USING GIN (dimensions);

● CreaTeint ewiint dexes Siont imeas Surement Ts SiTable

● ipeficifyint ame– Equality: Use BTREE

● ipeficifyidiment s Siont s S– Containment: Use GIN– Find contents of JSON

Int dexint gi–iieries SiQueryi(vs Si喘imeiRant ge,i10MiRows S,i100iMeTrifics S)

0 1000 2000 3000 4000 5000 6000 7000 8000 9000 100000.00

0.05

0.10

0.15

0.20

0.25

Time & Metric

Time Index

Query Time Range (seconds)

Qu

ery

Du

ratio

n (

seco

nd

s)

Int dexint g

✗ Ohidear

✗ 喘ooimufichiint dexint gificant ibeiharmful

Normalis SaTiont

CREATE TABLE measurements ( timestamp TIMESTAMPTZ, value FLOAT8, name VARCHAR, dimensions JSONB, value_meta JSON);

Normalis SaTiont

CREATE TABLE values ( timestamp TIMESTAMPTZ, value FLOAT8, metric_id INT, value_meta JSON);

CREATE TABLE metrics ( id SERIAL, name VARCHAR, dimensions JSONB, UNIQUE (name, dimensions));

Normalis SaTiont

● Values Sis SToredibyiint Tegeriid– References entry in metric table– The name/dimensions for each

metric are only stored once– Eliminates repeated bulky data

in measurements table● MeTrificiTableidefnt es Siid

– SERIAL produces incrementing integers to allot id values

– UNIQUE constraint is useful during normalisation

● Implicitly creates suitable index

CREATE TABLE values ( timestamp TIMESTAMPTZ, value FLOAT8, metric_id INT, value_meta JSON);

CREATE TABLE metrics ( id SERIAL, name VARCHAR, dimensions JSONB, UNIQUE (name, dimensions));

Normalis SaTiont i–iView

CREATE VIEW measurementsAS SELECT timestamp, value, name, dimensions, value_meta FROM values INNER JOIN metrics ON (metric_id = id);

● Mimificimeas Surement Ts S– Views can be queried in

the same way as tables● Defnt ediwiThiiELEC喘

– Query to run which produces contents of view

● Joint int ormalis SediTables S● Cant ire-us Seis Sameiqueries Sias Sibefore

CREATE RULE measurements_insert AS ON INSERT TO measurements DO INSTEAD INSERT INTO values ( timestamp, value, metric_id, value_meta ) VALUES ( NEW.timestamp, NEW.value, create_metric ( NEW.name, NEW.dimensions), NEW.value_meta );

Normalis SaTiont i–iViewiInt s SerT

● Cant ’Tiint s SerTidaTaiint Toiviews SibyidefaulT

● Cant is Speficifyiant iaficTiont iToiperformiont iINiER喘

● Int s SerTiint Toivalues S● HelperiproficedureiToialloficaTeimeTrific_id

● Normalis SaTiont iis SiTrant s Sparent Tiforius Ser

CREATE FUNCTION create_metric ( in_name VARCHAR, in_dims JSONB) RETURNS INT LANGUAGE plpgsql AS $_$DECLARE out_id INT;BEGIN SELECT id INTO out_id FROM metrics AS m WHERE m.name = in_name AND m.dimensions = in_dims; IF NOT FOUND THEN INSERT INTO metrics ("name", "dimensions") VALUES (in_name, in_dims) RETURNING id INTO out_id; END IF; RETURN out_id;END; $_$;

Normalis SaTiont i–iMeTrificiLookup

● iTorediproficedure– Take name/dimensions– Returns metric_id

● Fint diexis STint gimeTrific– Return existing id

● Ifint ew,iThent iINiER喘– Allocates new id– Return the new id

CREATE INDEX ON values USING BTREE (timestamp);

CREATE INDEX ON values USING BTREE (metric_id);

Normalis SaTiont i-iInt dexint g

● 喘imes STampiint dex– Same as before

● Newiint dexiont imeTrific_id– Allow efficient fltering of

metrics during JOIN– Serves similar purpose to

existing metric indexing

Normalis SaTiont i–iieries SiQueryi(vs Si喘ime,i10MiRows S,i100iMeTrifics S)

0 1000 2000 3000 4000 5000 6000 7000 8000 90000.00

0.05

0.10

0.15

0.20

0.25

Normalised

Denormalised (Time Index)

Denormalised (Extra Index)

Query Time Range (seconds)

Qu

ery

Du

ratio

n (

seco

nd

s)

Normalis SaTiont

✔ Normalis SaTiont ielimint aTedioverheadiofiaddiTiont alimeTrificiint dexint g

✗ 喘heimeTrificiint dexint gis STillidoes Snt ’Tihaveiaipos SiTiveiefeficT

time value metric

10:01 . 1

10:01 . 2

10:02 . 1

10:02 . 2

10:03 . 1

10:03 . 2

10:04 . 1

10:04 . 2

A

B

C

D

E

F

G

H

Normalis SaTiont i–iBiTmapiInt dexiificant

timeindex

metricindex

C

E

F

B

D

F

H

D

D

F

2:02:03

time value metric

10:01 . 1

10:01 . 2

10:02 . 1

10:02 . 2

10:03 . 1

10:03 . 2

10:04 . 1

10:04 . 2

A

B

C

D

E

F

G

H

Normalis SaTiont i–iMulTi-Columnt iInt dexint g

timemetricindex

D

F

2:02:03

CREATE INDEX ON values USING BTREE (timestamp);

CREATE INDEX ON values USING BTREE (metric_id);

Normalis SaTiont i–iMulTi-Columnt iInt dexint g

CREATE INDEX ON values USING BTREE (timestamp, metric_id);

CREATE INDEX ON values USING BTREE (metric_id, timestamp);

Normalis SaTiont i–iieries SiQueryi(vs SiRant ge,i10MiRows S,i100iMeTrifics S)

0 1000 2000 3000 4000 5000 6000 7000 8000 90000.00

0.05

0.10

0.15

0.20

0.25

Normalised (Single Index)

Normalised

Denormalised (Time Index)

Denormalised (Extra Index)

Query Time Range (seconds)

Qu

ery

Du

ratio

n (

seco

nd

s)

Normalis SaTiont

● Int ficreas Seivolumei10MiToi100M– @ 1Hz / 100 Metrics: 1M seconds

● Before: ~1.15 days● Now: ~11.5 days

● Int ficreas SeimaxiTimeirant ges Sifromi9000s SiToi90,000s S– Before: 2.5 hours– Now: 1.04 days

Normalis SaTiont i–iieries SiQueryi(vs SiVolume,i10iMeTrifics S)

0 10 20 30 40 50 60 70 80 90 1000

0.02

0.04

0.06

0.08

0.1

0.12

10000

20000

30000

Data Volume (M-rows)

Qu

ery

Du

ratio

n (

seco

nd

s)

Normalis SaTiont i–iieries SiQueryi(vs SiRant ge,i100MiRows S)

0 10000 20000 30000 40000 50000 60000 70000 80000 900000.00

0.02

0.04

0.06

0.08

0.10

0.12

0.14

0.16

1 Metric

10 Metrics

Query Time Range (seconds)

Qu

ery

Du

ratio

n (

seco

nd

s)

Normalis SaTiont i–iieries SiQueryi(vs SiRant ge,i100MiRows S)

0 10000 20000 30000 40000 50000 60000 70000 80000 900000.00

0.10

0.20

0.30

0.40

0.50

0.60

0.70

0.80

0.90

1 Metric

10 Metrics

100 Metrics

Query Time Range (seconds)

Qu

ery

Du

ratio

n (

seco

nd

s)

Normalis SaTiont i–iieries SiQueryi(vs SiRant ge,i100MiRows S)i(+Cont fg)

0 10000 20000 30000 40000 50000 60000 70000 80000 900000.00

0.10

0.20

0.30

0.40

0.50

0.60

0.70

0.80

0.90

1 Metric

10 Metrics

100 Metrics

Query Time Range (seconds)

Qu

ery

Du

ratio

n (

seco

nd

s)

Normalis SaTiont i-iAnt alys Sis S

✔ DaTaiVolume

✔ To 100M✔ 喘imeiRant ge

✔ To 90,000s (10 Metrics)

✗ 喘imeiRant ge

✗ Over 30,000s (100 Metrics)

✗ Over 90,000s✗ NeediaibeTTeris STraTegyiforis Servificint gilargeriTimeirant ges S

iummaris Sint g

iummaris Sint gi-iProblem

● Fori100imeTrifics S,is Someiqueries Simis Ss SiTargeTi100ms S– Over ~40Ks (~11 days)

● QueryimighTibeireTurnt int giupiToi40Kipoint Ts S● Is SiThis SiaficTuallyint efices Ss Saryd

– Especially if data is simply used for visualisation– An average 1080p monitor only has ~2000 pixels

● LeTs Sis Sayi4000ipoint Ts Siareient ough,iorievent i400

values_2

iummaris Sint gi-iExample

time sum metric

10:00 10 1

10:00 2 2

10:02 5 1

10:02 4 2

values

time value metric

10:00 10 1

10:00 2 2

10:01 20 1

10:01 6 2

10:02 5 1

10:02 4 2

10:03 15 1

10:03 1 2

30

8

20

5

30

8

20

5

✔ iummaryiTableijus STiaifraficTiont iofiTheis Size

CREATE TABLE values_10 ( timestamp TIMESTAMPTZ, metric_id INT, sum FLOAT8, count FLOAT8, min FLOAT8, max FLOAT8,

UNIQUE (metric_id, timestamp));

iummaris Sint gi

● CreaTeivalues SiTable– Use for 10:1 summary

● Ont eient Try/Timeiperiod– Per metric– UNIQUE provide indexing

● MulTipleiaggregaTes S– SUM– COUNT– MIN– MAX

CREATE VIEW summary_10 AS SELECT * FROM values_10 INNER JOIN metrics ON (metric_id = id);

iummaris Sint g

● CreaTeiaiviewias Sibefore– Only storing metric_id

● iimplifes Siqueries S● Joint s SimeTrificidefnt iTiont s S

CREATE FUNCTION summarise_10 () RETURNS TRIGGER LANGUAGE plpgsql AS $_$BEGIN :END; $_$;

CREATE TRIGGER summarise_10_t AFTER INSERT ON values FOR EACH ROW EXECUTE PROCEDURE summarise_10 ();

iummaris Sint gi–i喘riggeriDefnt iTiont

● BoilerplaTe● Defnt eiTriggerifunt ficTiont

– Stored procedure– Contents omitted

● 喘riggeriToiexeficuTe…– On INSERT– To values table– Data passed to procedure

INSERT INTO values_10 VALUES ( TIME_ROUND(NEW.timestamp, 10), NEW.metric_id, NEW.value, 1, NEW.value, NEW.value)ON CONFLICT (metric_id, timestamp)DO UPDATE SET sum = sum + EXCLUDED.sum, count = count + EXCLUDED.count, min = LEAST (min,EXCLUDED.min), max = GREATEST(max,EXCLUDED.max);

iummaris Sint gi–i喘riggeriAficTiont

● Int s SerTiint Tois Summary– NEW is inserted data

● Rount diTimeiToiperiod– 10 seconds

● Int iTialiaggregaTeivalues S● Ifient Tryiexis STs Sialready● UpdaTeiint s STead

– EXCLUDED is current row– Combine new value with

existing aggregate value

SELECT TIME_ROUND(timestamp, 60), (SUM(sum) / SUM(count)) AS avgFROM summary_10WHERE timestamp BETWEEN '2015-01-01Z00:00:00' AND '2015-01-01Z01:00:00' AND name = 'cpu.percent' AND dimensions @> '{"host": "dev-01"}'::JSONBGROUP BY 1

iummaris Sint gi–iiint gleiieries SiQuery

● Mos STlyiunt fichant ged● Queryis SummaryiTable,int oTirawimeas Surement Ts S

● HaveiToiaggregaTeiTheiparTialiaggregaTiont s S– MIN: MIN(min)

– MAX: MAX(max)

– SUM: SUM(sum)

– COUNT: SUM(count)

– AVG: SUM(sum)/SUM(count)

iummaris Sint gi–iieries SiQueryi(vs SiRant ge,i100MiRows S)

0 10000 20000 30000 40000 50000 60000 70000 80000 900000.00

0.02

0.04

0.06

0.08

0.10

0.12

1 Metric

10 Metrics

100 Metrics

Query Time Range (seconds)

Qu

ery

Du

ratio

n (

seco

nd

s)

iummaris Sint gi–iieries SiQueryi(vs SiRant ge,i100MiRows S)

0 10000 20000 30000 40000 50000 60000 70000 80000 900000.00

0.02

0.04

0.06

0.08

0.10

0.12

1 Metric

10 Metrics

100 Metrics

1 Metric

10 Metrics

100 Metrics

Query Time Range (seconds)

Qu

ery

Du

ratio

n (

seco

nd

s)

iummaris Sint g

● Int ficreas SeivolumeiToi1BN– @ 1Hz / 100 Metrics: 10M seconds– Before: 11.5 days– Now: ~115 days : 16½ weeks

● Int ficreas SeimaxiTimeirant ges Sifromi90Ks SiToi900Ks S– Before: ~1.04 days– Now: ~10.4 days

iummaris Sint gi–iieries SiQueryi(vs SiVolume;i100M-1BN)

0 100 200 300 400 500 600 700 800 900 10000.000

0.020

0.040

0.060

0.080

0.100

0.120

100000

200000

300000

Data Volume (M-rows)

Qu

ery

Du

ratio

n (

seco

nd

s)

iummaris Sint gi–iieries SiQueryi(vs SiRant ge,i1BNiRows S)

0 100000 200000 300000 400000 500000 600000 700000 800000 9000000

0.02

0.04

0.06

0.08

0.1

0.12

Query Time Range (seconds)

Qu

ery

Du

ratio

n (

seco

nd

s)

✔ DaTaiVolume

✔ To 1BN – ~16 weeks ✔ 喘imeiRant ge

✔ To ~10 days

✔ 喘ois SficaleifurTherdi喘ryi100:1is Summary

iummaris Sint g

Clos Sint giNoTes S

WhaTiNexTd

● ParTiTiont int g– By timestamp and metric_id– Retention window (i.e. keep last 6 months)

● Int ges STiopTimis SaTiont – Compute aggregates in batches (e.g. every 5min)– Non trigger-based summarisation– Possibility to abuse logical replication

Is SiITiWorThiITd

● Yes S– Reduce complexity in terms of number of technologies– Reuse existing infrastructure and operational knowledge– No loss in consistency – ACID all the way

● Nod– Increase complexity in terms of database design– Is an out-of-the box tool already available?

i i 103

喘hant ks S

s STeve@s Smps Snt .nt eT