Influx db talk-20150415
richard-elling
Features
HTTP(S) API with user access controls
Scalability
Billions of data points
Hundreds of thousands of series
Multiple nodes
Managed retention policies
Simple to install and manage — no external dependencies
Dev Features
github.com/influxdb
Written in Go
SQL-like query language
Client libraries available for your favorite dev environment
python, javascript, node.js, java, R, ruby, C#, PHP, …
HTTP: curl, httpie, wget
MIT license
Ops Features
Security model separates admins from users
Active and vibrant community
Flexible data retention policies
Time-based sharding
Downsample data using different time windows
Expand storage space by adding nodes
Why We Chose InfluxDB?
Need telemetry, events, and status from systems
Information, not just numbers
100k+ metrics per system, 2-4k are interesting to measure forever
Events and configuration
Collecting more relational data, extensible, JSON works well
Requirements rule out many "metrics-oriented" time-series solutions
Feed from collectd and HTTP POST
Open source, redistribution and contribution friendly license (MIT)
Deployment Architecture
Schema
Version 0.8
Embed metadata into series name
Similar to graphite
name1.value1.name2.value2.metric
datacenter.0.server.elvis.temperature
Version 0.9
Spoiler alert
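The 0.8 naming convention above can be sketched in a few lines of Python 3. This is only an illustration: the helper `series_name` is hypothetical and not part of InfluxDB or its client library, and rejecting dots inside keys or values is an assumption made here so the name stays parseable.

```python
def series_name(pairs, metric):
    """Build a 0.8-style series name: name1.value1.name2.value2.metric."""
    parts = []
    for key, value in pairs:
        key, value = str(key), str(value)
        # Assumption: a dot inside a key or value would be mistaken for a separator.
        if "." in key or "." in value:
            raise ValueError("dots are reserved as separators")
        parts.extend([key, value])
    parts.append(metric)
    return ".".join(parts)


name = series_name([("datacenter", 0), ("server", "elvis")], "temperature")
print(name)
```

With the pairs from the slide, this produces the same name as the example, datacenter.0.server.elvis.temperature.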
Queries
SQL-like query language
select * from series_name
select value from series_name where time > '2015-04-15'
select value from series_name where time > now() - 1h
select value from series_name where time > now() - 1d limit 100
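Since the deck lists curl, httpie, and wget as clients, it may help to see how one of these queries could be encoded for an HTTP GET. The sketch below (Python 3) assumes the 0.8 query endpoint has the form /db/<database>/series with `u`, `p`, and `q` parameters; check that against the API docs for your version. The helper name `query_url` and the credentials are placeholders.

```python
from urllib.parse import urlencode


def query_url(host, port, db, user, password, query):
    """Return a GET URL for a query (endpoint layout is an assumption, see above)."""
    params = urlencode({"u": user, "p": password, "q": query})
    return "http://%s:%d/db/%s/series?%s" % (host, port, db, params)


url = query_url(
    "localhost", 8086, "db_name", "user", "password",
    "select value from series_name where time > now() - 1d limit 100",
)
print(url)
```

The point is mainly that the query text has to be URL-encoded: spaces, `>`, and parentheses are not safe to paste into a URL as-is.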
Regular expressions are handy
select * from /.*\.elvis\..*/ limit 10
select value from /^MyCompany\..*/ limit 1
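To see which series names the two patterns above would select, the same expressions can be tried locally with Python's `re` module. This is a rough check only: InfluxDB is written in Go and its regex engine is not identical to Python's, though these two simple patterns behave the same in both. Apart from the first one, the series names are made up for illustration.

```python
import re

# Hypothetical series names; only the first appears in the slides.
series = [
    "datacenter.0.server.elvis.temperature",
    "datacenter.0.server.priscilla.temperature",
    "MyCompany.web.requests",
]


def matching(pattern, names):
    """Return the names in which the pattern is found anywhere (unanchored search)."""
    rx = re.compile(pattern)
    return [n for n in names if rx.search(n)]


print(matching(r".*\.elvis\..*", series))
print(matching(r"^MyCompany\..*", series))
```

Note the escaped dots: an unescaped `.` matches any character, so `/.elvis./` would also match a name such as `xelvisx`.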
Queries do math
count, top, bottom
min, max, mean, mode, median, stddev
distinct
percentile
histogram
first, last, difference, sum, derivative
select mean(value) from series_name where time > now() - 1h
select derivative(value) from series_name where time > now() - 1h group by time (60s) order asc
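As a rough mental model of the last query, the sketch below groups points into 60-second windows and computes a rate of change per window. It assumes the derivative is taken between the first and last point in each window; that is an assumption made for illustration, not a statement of how InfluxDB computes it. The function names and sample points are made up.

```python
from collections import defaultdict


def group_by_time(points, window):
    """Bucket (timestamp_seconds, value) points into fixed-size windows."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % window].append((ts, value))
    return buckets


def derivative_per_window(points, window):
    """Rate of change between first and last point in each window (assumed model)."""
    out = {}
    for start, pts in sorted(group_by_time(points, window).items()):
        pts.sort()
        (t0, v0), (t1, v1) = pts[0], pts[-1]
        if t1 > t0:  # a window with a single point has no slope
            out[start] = (v1 - v0) / (t1 - t0)
    return out


points = [(0, 10.0), (30, 40.0), (60, 70.0), (90, 130.0)]
rates = derivative_per_window(points, 60)
print(rates)
```

For a monotonically increasing counter, such as the cpu_nsec counters collectd reports, this kind of per-window rate is usually what you want to graph rather than the raw value.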
Continuous Queries
Useful for downsampling
Choices:
downsample every time you query
downsample in advance and store the results
Restricted query: only admins can create continuous queries
Powerful with many different options and applications
select mean(value) from series_name group by time(5m) into series_name.mean.5m
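The "downsample in advance and store the results" choice can be pictured with a small in-memory sketch. A plain dict stands in for the database here, and `downsample_mean` is a hypothetical helper, so this shows the idea behind the continuous query above rather than how InfluxDB implements it.

```python
def downsample_mean(points, window_seconds):
    """Average (timestamp_seconds, value) points per fixed window."""
    sums = {}
    for ts, value in points:
        start = ts - ts % window_seconds
        total, count = sums.get(start, (0.0, 0))
        sums[start] = (total + value, count + 1)
    return [(start, total / count) for start, (total, count) in sorted(sums.items())]


# The dict plays the role of the database: series name -> list of points.
store = {"series_name": [(0, 1.0), (120, 3.0), (300, 10.0), (540, 20.0)]}

# Equivalent in spirit to: ... group by time(5m) into series_name.mean.5m
store["series_name.mean.5m"] = downsample_mean(store["series_name"], 300)
print(store["series_name.mean.5m"])
```

Later queries can read the small `series_name.mean.5m` series instead of re-aggregating the raw points each time, which is the trade-off the slide describes.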
Python plugin
import json
from influxdb import InfluxDBClient

client = InfluxDBClient('localhost', 8086, 'user', 'password', 'db_name')
print(json.dumps(client.query('list series'), indent=4))

[
    {
        "points": [
            [
                0,
                "Node.elvis.CPU_stats.0.derive.cpu_nsec_idle"
            ],
            …
        ],
        "name": "list_series_result",
        "columns": [
            "time",
            "name"
        ]
    }
]
Managing Shards
Set up shard spaces when creating databases (!)
{
  "spaces": [{
    "name": "detail",
    "retentionPolicy": "10d",
    "shardDuration": "2d",
    "regex": "/.*/",
    "replicationFactor": 1,
    "split": 1
  }]
}
object {
  array {
    object {
      string name;              // space name
      string retentionPolicy;   // minimum time to keep
      string shardDuration;     // max expected group by time()
      number replicationFactor; // number of replicas
      number split;             // shards per period
    };
  } spaces;
};
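A quick way to sanity-check a shard space before creating the database is to work out how many shards the settings imply. The sketch below builds the same configuration as a Python dict and does the arithmetic. It assumes durations use a single-letter suffix (s, m, h, d) and that shard count is simply retention divided by shard duration, times split; the real count can differ, for example around the edges of the retention window.

```python
import json

UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def seconds(duration):
    """Convert a duration such as '10d' to seconds (suffix set is an assumption)."""
    return int(duration[:-1]) * UNITS[duration[-1]]


space = {
    "name": "detail",
    "retentionPolicy": "10d",
    "shardDuration": "2d",
    "regex": "/.*/",
    "replicationFactor": 1,
    "split": 1,
}
config = {"spaces": [space]}

# Ceiling division: a partial period still needs a shard.
periods = -(-seconds(space["retentionPolicy"]) // seconds(space["shardDuration"]))
shards = periods * space["split"]
copies = shards * space["replicationFactor"]

print(json.dumps(config, indent=2))
print(periods, shards, copies)
```

For the values on the slide this gives 5 periods of 2 days each, so about 5 shards with one copy apiece.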
InfluxDB Version 0.8
current stable release
end of the road for 0.8 (0.8.8)
database back-ends: LevelDB (use this), RocksDB, HyperLevelDB, and LMDB
caveat: clustering is completely redesigned in 0.9
Futures
Version 0.9 in release-candidate stage (start testing now!)
Significant redesign: migration may be challenging
Tags for fast, efficient queries — see docs and begin schema planning now
Dropping multiple database backends — using BoltDB
Clustering, replication, high-availability
Streaming raft implementation
Role = broker for raft consensus
Role = data for hosting data, answer queries
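For schema planning, one possible starting point is to split existing 0.8 series names back into a metric name plus key/value pairs that could become tags. The helper below is hypothetical and only handles names that follow the name1.value1.name2.value2.metric convention from the Schema slide. It does not use any InfluxDB API and makes no claim about the 0.9 write format.

```python
def split_series_name(name):
    """Split 'k1.v1.k2.v2.metric' into (metric, {k1: v1, k2: v2})."""
    parts = name.split(".")
    # key/value pairs plus one metric means an odd number of parts
    if len(parts) % 2 == 0:
        raise ValueError("name does not follow the key.value...metric convention")
    metric = parts[-1]
    tags = dict(zip(parts[0:-1:2], parts[1:-1:2]))
    return metric, tags


metric, tags = split_series_name("datacenter.0.server.elvis.temperature")
print(metric, tags)
```

Names that do not follow the pairing convention, such as the collectd-style name shown in the Python plugin output, are rejected here and would need their own mapping.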
www.influxdb.com
https://groups.google.com/forum/#!forum/influxdb
@InfluxDB
#richardelling
Demo and Questions