Devoxx France 2015: InfluxDB
Uploaded by nicolas-muller
@zepouet#InfluxDB
:: InfluxDB ::
@zepouet http://www.treeptik.fr http://www.cloudunit.fr http://www.labaixbidouille.com
:: InfluxDB :: Time Series ::
• About Me
• What is a time series?
• State of the art in 2015
• Why yet another product for time series?
• Live demo
• Q&A
About Me
• Treeptik
• MarsJUG
• LabAixBidouille
What is a time series?
Things happening in time…
Events, events… events
• Measurements (physical sensors…)
• Exceptions (applications)
• Page views
• User actions
• Git commits
• Web app deployments
• Things happening in time
State of the Art :: 2015
What do we have for storage?
• At the moment, we have:
• Graphite
• OpenTSDB (events, Hadoop, HBase…)
• KairosDB (events, a rewrite of OpenTSDB)
• Ganglia (more common in Big Data/Hadoop environments)
• And others…
What do we have for collection?
• At the moment, we have:
• CollectD
• Sensu
• Dropwizard Metrics
• JMXTrans
• Jolokia
Something missing…
Because in 2015, we need:
• A product that is simple to install and manage
• To store millions of points (IoT is here)
• Native HTTP support (JSON)
• Built around an API
• Automatic clean-up of old data
• Easy to scale: cloud is a buzzword
Use case: Fablab
wiki.labaixbidouille.com/index.php?title=Domotique
Feedback
• Data volume:
  • 1 event / sensor / minute
  • 1 * 60 * 24 = 1,440 events per day
  • 43,200 events per month (30 days)
  • 518,400 events per year (12 months of 30 days)
• First mistake: using MySQL
• Second mistake: using a bad pattern with InfluxDB
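The data-volume arithmetic can be checked with a few lines of Python. This is only a sketch: it assumes a 30-day month and a year of 12 such months, which is what the yearly figure implies, and the helper `total_events` is a hypothetical name added for illustration.

```python
# Event volume for one sensor reporting once per minute.
# Assumption: a month is 30 days and a year is 12 such months.
EVENTS_PER_MINUTE = 1

events_per_day = EVENTS_PER_MINUTE * 60 * 24
events_per_month = events_per_day * 30
events_per_year = events_per_month * 12


def total_events(sensors: int, days: int) -> int:
    """Total events produced by `sensors` sensors over `days` days."""
    return sensors * events_per_day * days


print(events_per_day, events_per_month, events_per_year)
```

With several sensors the volume grows linearly, for example `total_events(10, 365)` for ten sensors over a calendar year.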
1.21 GIGAWATTS
About InfluxDB
• An open-source distributed time series database
• Built by Errplane
• MIT License
• Written in Go
• Young but awesome project
InfluxDB :: design goals
• Simple to install and manage, thanks to Go.
• No external dependencies such as ZooKeeper or Hadoop.
• HTTP(S) interface for reading and writing data.
• Horizontally scalable.
• On disk and in memory. Most data is cold.
• Compute percentiles and other functions on the fly.
• Downsample data over different windows of time.
InfluxDB :: installing
• MacOS: $ brew install influxdb
• Debian: $ sudo dpkg -i influxdb_latest_amd64.deb
• CentOS: $ sudo rpm -ivh influxdb-latest-1.x86_64.rpm
• Docker: $ docker run tutum/influxdb
• Coming soon: ARM and Windows
InfluxDB :: running
• $ influxdb -config=/usr/local/etc/influxdb.conf
• Ports
  • 8083: UI
  • 8086: API
  • 8090: cluster management (Raft)
  • 8099: cluster management (Protobuf)
InfluxDB :: design
• Database (like in MySQL, Postgres…)
• Time series (kind of like tables, with time, sequence number and columns)
• A time series is composed of points or events (kind of like rows)
• Primary index is always time
• Null values are not stored
• You can have millions of series
InfluxDB :: security
• Cluster admins
• Database admins
• Database users
  • Read permissions
    • only certain series
    • only queries with a column having a specific value (e.g. customer_id = 32)
  • Write permissions
    • only certain series
    • only columns having a specific value
InfluxDB :: create points
curl -X POST -d '[{"name":"temp","columns":["celsius"],"points":[[23]]}]' 'http://localhost:8086/db/mydb/series?u=root&p=root'
curl -G 'http://localhost:8086/db/mydb/series?u=root&p=root' --data-urlencode "q=select * from temp"
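For Java or Python clients, the same two HTTP calls are easy to reproduce. Below is a minimal Python sketch that only builds the requests, using the standard library. The helper names `build_write_request` and `build_query_url` are hypothetical; the URL, database, credentials and JSON layout are the ones from the curl commands above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8086"  # API port, as in the curl commands
DB, USER, PASSWORD = "mydb", "root", "root"  # values from the curl commands


def build_write_request(series, columns, points):
    """Build the POST request that writes points into one series."""
    body = json.dumps([{"name": series, "columns": columns, "points": points}])
    url = f"{BASE_URL}/db/{DB}/series?" + urlencode({"u": USER, "p": PASSWORD})
    return Request(url, data=body.encode("utf-8"), method="POST")


def build_query_url(query):
    """Build the GET URL for a query, URL-encoding the `q` parameter."""
    params = urlencode({"u": USER, "p": PASSWORD, "q": query})
    return f"{BASE_URL}/db/{DB}/series?{params}"


write_req = build_write_request("temp", ["celsius"], [[23]])
query_url = build_query_url("select * from temp")
# To actually send the write against a running server:
#   urllib.request.urlopen(write_req)
```

Nothing is sent until `urlopen` is called, so the sketch can be inspected without a running server.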
InfluxDB :: Pitfalls
• Schemaless warning
• Data partitioning with a single series

Time     Name     Host   Metrics
3236765  cpu      web0   78
3236765  disk_io  web0   98344
3236765  load     db1    5
3236765  eth_0    ldap0  8755
Time     Name     Host   Metrics
3236765  disk_io  web0   98344
3236766  disk_io  web0   98354
3236767  disk_io  web0   98224
3236768  disk_io  web0   98994

Time     Name     Host   Metrics
3236765  eth_0    ldap0  8755
3236766  eth_0    ldap0  8721
3236767  eth_0    ldap0  8734
3236768  eth_0    ldap0  8723

Time     Name     Host   Metrics
3236765  cpu      web0   78
3236766  cpu      web0   77
3236767  cpu      web0   79
3236768  cpu      web0   76

Time     Name     Host   Metrics
3236765  load     db1    5
3236766  load     db1    6
3236767  load     db1    5
3236768  load     db1    7
InfluxDB :: Why so many series?
• To take advantage of the storage engines
• Points are indexed by time, not by any other column
• Tip: this layout is easy to work with in Grafana
InfluxDB works best with a large number of series, each with few columns.
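The move from one wide series to many narrow ones can be sketched in Python. This is an illustration only: the rows come from the tables above, while the function name `partition_by_series` and the `<host>.<name>` key are assumptions chosen for the example, not an InfluxDB convention.

```python
from collections import defaultdict

# Mixed rows as in the single-series table: (time, name, host, value).
rows = [
    (3236765, "cpu", "web0", 78),
    (3236765, "disk_io", "web0", 98344),
    (3236765, "load", "db1", 5),
    (3236765, "eth_0", "ldap0", 8755),
    (3236766, "cpu", "web0", 77),
    (3236766, "disk_io", "web0", 98354),
]


def partition_by_series(rows):
    """Group rows into many narrow series keyed by '<host>.<name>'."""
    series = defaultdict(list)
    for time, name, host, value in rows:
        # Each series keeps only time and value; name and host live in the key.
        series[f"{host}.{name}"].append((time, value))
    return dict(series)


series = partition_by_series(rows)
for key in sorted(series):
    print(key, series[key])
```

Each resulting series holds a single metric ordered by time, which matches the time-only index.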
:: Query Language
• select * from /.*/ limit 1
• select val1, val2 from serverA
• select cpu from /server.*/
• select * from /.*/ where time > now() - 1h
• select * from /.*/ where time > '2013-08-12 23:32:00'
• select * from /.*/ group by time(10m)
• select count(val) from /.*/ group by time(10m)
• select percentile(val, 95) from /.*/ group by time(10m)
• select count(distinct(val)) from /.*/
:: Query Language
• DELETE
  • delete from response_times where time < now() - 1h
  • delete from /^stats.*/ where time < now() - 7d
  • drop series response_times
• GROUP BY
  • select count(type) from events group by time(10m);
  • select count(type), type from events group by time(10m), type;
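As a rough mental model of `group by time(10m), type`, the Python sketch below buckets events into 10-minute windows and counts them per type. It is not how InfluxDB implements the query; the sample events and the function name `count_by_window_and_type` are hypothetical.

```python
from collections import Counter

WINDOW_SECONDS = 10 * 60  # corresponds to time(10m)

# Hypothetical events: (unix_timestamp_seconds, type)
events = [
    (0, "click"), (30, "click"), (599, "view"),
    (600, "click"), (1250, "view"), (1260, "view"),
]


def count_by_window_and_type(events, window=WINDOW_SECONDS):
    """Count events per (window start, type) pair."""
    counts = Counter()
    for ts, event_type in events:
        bucket_start = ts - ts % window  # floor the timestamp to its window
        counts[(bucket_start, event_type)] += 1
    return dict(counts)


counts = count_by_window_and_type(events)
for key in sorted(counts):
    print(key, counts[key])
```

Dropping the type from the key gives the equivalent of the first GROUP BY query, which counts per window only.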
:: Visualize and summarize
• Graphs
  • Last 10 minutes
  • Last 4 hours
  • Last 24 hours
  • Past week
  • Past month
  • All time
:: Merging :: Series
• select count(type) from user_events merge admin_events group by time(10m)
• select mean(value) from merge(/.*az\.1.*\.cpu/) group by time(1h)
:: Joining :: Series
• select hosta.value + hostb.value from cpu_load as hosta inner join cpu_load as hostb where hosta.host = 'hosta.influxdb.org' and hostb.host = 'hostb.influxdb.org';
• select errors_per_minute.value / page_views_per_minute.value from errors_per_minute inner join page_views_per_minute
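The second join computes an error rate from two series. A simplified Python sketch of the idea follows. It assumes both series share exact timestamps, which is a simplification for illustration and not a statement about how InfluxDB aligns points; the sample data and the function name `inner_join_ratio` are hypothetical.

```python
# Hypothetical per-minute series: {minute: value}
errors_per_minute = {1: 2.0, 2: 0.0, 3: 5.0}
page_views_per_minute = {1: 100.0, 2: 80.0, 4: 50.0}


def inner_join_ratio(numerator, denominator):
    """Divide values for timestamps present in both series (inner join)."""
    shared = sorted(numerator.keys() & denominator.keys())
    # Skip timestamps where the denominator is zero to avoid division errors.
    return {t: numerator[t] / denominator[t] for t in shared if denominator[t] != 0}


error_rate = inner_join_ratio(errors_per_minute, page_views_per_minute)
print(error_rate)
```

Minutes 3 and 4 appear in only one series each, so an inner join leaves them out.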
:: Naming Strategy :: 0.8
• Tag versus value
• Rule: <tagName>.<tagValue>.seriesName
• Examples:
  arduino.uno.shield.ethernet.sensor.dht11.temperature
  arduino.uno.shield.wifi.sensor.dht22.humidity
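The 0.8 naming rule is simple to apply in code. The Python sketch below builds a dotted series name from ordered tag pairs and shows how a regular expression can then select series by tag. The function name `series_name` and the regex are illustrative assumptions; the tag values are the ones from the examples.

```python
import re


def series_name(tags, measurement):
    """Join ordered (tagName, tagValue) pairs and the series name with dots."""
    parts = []
    for tag_name, tag_value in tags:
        parts.extend([tag_name, tag_value])
    parts.append(measurement)
    return ".".join(parts)


name = series_name(
    [("arduino", "uno"), ("shield", "ethernet"), ("sensor", "dht11")],
    "temperature",
)
print(name)

# Selecting by tag then becomes a regex on the name, in the spirit of
# `select * from /regex/` in the query language.
temperature_on_uno = re.compile(r"^arduino\.uno\..*\.temperature$")
print(bool(temperature_on_uno.match(name)))
```

The tag order has to stay consistent across all writers, otherwise the same sensor ends up in differently named series.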
:: Naming Strategy :: 0.9+
• Migration process
• Rule: seriesName = seriesName (tags are no longer encoded in the name)
• Tags are defined in the JSON payload and indexed
{
  "database": "domotic",
  "points": [
    {
      "name": "temperature_x",
      "tags": {
        "arduino": "uno",
        "shield": "wifi",
        "position": "indoor",
        "sensor": "dht22"
      },
      "timestamp": "2015-03-28T14:50:00Z",
      "fields": {
        "celsius": 23.2,
        "fahrenheit": 73.76
      }
    }
  ]
}
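A migration from 0.8-style dotted names to tagged points could look like the Python sketch below. It assumes names strictly follow the `<tagName>.<tagValue>.seriesName` rule and reuses the JSON layout from the example above, which may differ from the final 0.9 API. The function name `to_point` and the `percent` field value are hypothetical.

```python
import json


def to_point(dotted_name, timestamp, fields):
    """Split a 0.8-style dotted name into a series name plus tags."""
    *tag_parts, measurement = dotted_name.split(".")
    if len(tag_parts) % 2 != 0:
        raise ValueError("expected <tagName>.<tagValue> pairs before the series name")
    # Even positions are tag names, odd positions are tag values.
    tags = dict(zip(tag_parts[0::2], tag_parts[1::2]))
    return {"name": measurement, "tags": tags, "timestamp": timestamp, "fields": fields}


point = to_point(
    "arduino.uno.shield.wifi.sensor.dht22.humidity",
    "2015-03-28T14:50:00Z",
    {"percent": 41.5},  # hypothetical reading
)
payload = json.dumps({"database": "domotic", "points": [point]}, indent=2)
print(payload)
```

Names with an odd number of tag segments are rejected rather than guessed at, since they do not follow the rule.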
Continuous Queries
:: Continuous Queries
• select count(type) from events group by time(10m), type into events.count_per_type.10m
DOWNSAMPLING
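One way to picture a continuous query is as a job that periodically aggregates completed time windows and writes the result into a target series. The Python sketch below follows that mental model only; it is not InfluxDB's implementation, and the function name `run_continuous_query`, the in-memory `store` and the sample events are assumptions. The target name is the one from the query above.

```python
from collections import Counter

WINDOW = 600  # seconds, matching time(10m)
TARGET = "events.count_per_type.10m"  # target series from the query


def run_continuous_query(events, now, store):
    """Write per-type counts for every window that has fully elapsed at `now`."""
    current_window = now - now % WINDOW
    counts = Counter()
    for ts, event_type in events:
        bucket = ts - ts % WINDOW
        if bucket < current_window:  # skip the window that is still open
            counts[(bucket, event_type)] += 1
    store.setdefault(TARGET, {}).update(counts)
    return store


# Hypothetical raw events: (unix_timestamp_seconds, type)
events = [(10, "deploy"), (20, "commit"), (30, "commit"), (610, "commit"), (1215, "deploy")]
store = run_continuous_query(events, now=1300, store={})
print(store[TARGET])
```

Dashboards can then read the small downsampled series instead of scanning the raw events.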
Next release
Coming in April 2015
• New clustering model
• Influx shell
• Indexed tags
• Backup
For Java Devs and DevOps
Libraries
• https://github.com/influxdb/influxdb-java : official Java client
• https://github.com/davidB/metrics-influxdb : a reporter for Metrics that sends measurements to an InfluxDB server
• https://github.com/vietj/vertx-influxdb-metrics : proof of concept of reporting to InfluxDB
davidB/metrics-influxdb
Unofficial plugin for https://github.com/dropwizard/metrics
Carbon-influxdb
https://github.com/dropwizard/metrics
Demo
Q & A