Introduction to InfluxDB

JornJambers 165 views 43 slides Apr 27, 2020
Slide 1
Slide 1 of 169
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43
Slide 44
44
Slide 45
45
Slide 46
46
Slide 47
47
Slide 48
48
Slide 49
49
Slide 50
50
Slide 51
51
Slide 52
52
Slide 53
53
Slide 54
54
Slide 55
55
Slide 56
56
Slide 57
57
Slide 58
58
Slide 59
59
Slide 60
60
Slide 61
61
Slide 62
62
Slide 63
63
Slide 64
64
Slide 65
65
Slide 66
66
Slide 67
67
Slide 68
68
Slide 69
69
Slide 70
70
Slide 71
71
Slide 72
72
Slide 73
73
Slide 74
74
Slide 75
75
Slide 76
76
Slide 77
77
Slide 78
78
Slide 79
79
Slide 80
80
Slide 81
81
Slide 82
82
Slide 83
83
Slide 84
84
Slide 85
85
Slide 86
86
Slide 87
87
Slide 88
88
Slide 89
89
Slide 90
90
Slide 91
91
Slide 92
92
Slide 93
93
Slide 94
94
Slide 95
95
Slide 96
96
Slide 97
97
Slide 98
98
Slide 99
99
Slide 100
100
Slide 101
101
Slide 102
102
Slide 103
103
Slide 104
104
Slide 105
105
Slide 106
106
Slide 107
107
Slide 108
108
Slide 109
109
Slide 110
110
Slide 111
111
Slide 112
112
Slide 113
113
Slide 114
114
Slide 115
115
Slide 116
116
Slide 117
117
Slide 118
118
Slide 119
119
Slide 120
120
Slide 121
121
Slide 122
122
Slide 123
123
Slide 124
124
Slide 125
125
Slide 126
126
Slide 127
127
Slide 128
128
Slide 129
129
Slide 130
130
Slide 131
131
Slide 132
132
Slide 133
133
Slide 134
134
Slide 135
135
Slide 136
136
Slide 137
137
Slide 138
138
Slide 139
139
Slide 140
140
Slide 141
141
Slide 142
142
Slide 143
143
Slide 144
144
Slide 145
145
Slide 146
146
Slide 147
147
Slide 148
148
Slide 149
149
Slide 150
150
Slide 151
151
Slide 152
152
Slide 153
153
Slide 154
154
Slide 155
155
Slide 156
156
Slide 157
157
Slide 158
158
Slide 159
159
Slide 160
160
Slide 161
161
Slide 162
162
Slide 163
163
Slide 164
164
Slide 165
165
Slide 166
166
Slide 167
167
Slide 168
168
Slide 169
169

About This Presentation

Learn How To Use The #1 DevOps Open Source Time Series DB Platform for Metrics & Events (Time Series Data).

Presentation used in Udemy training: https://www.udemy.com/course/influxdb-time-series-database/?referralCode=09D0B30F92258262D4C6

If you're looking to setup a system to store your ...


Slide Content

InfluxDB

InfluxDB
InfluxDB
Introduction InfluxDB basics Query Language Advanced InfluxDB Integrations
What is time series dataKey Concepts InfluxQL Storage Engine Chronograph
What is the TICK stackSchema design SELECT
Writing / Reading data
with HTTP API
InfluxDB + Grafana
What is InfluxDB WHERE
Authentication &
Authorization
InfluxDB use cases GROUP BY Retention policies
Installing InfluxDB
locally
INTO Continuous queries
Installing InfluxDB
remote in the cloud
Subqueries
Sample data JOINS
Import data
A first query

InfluxDB: Introduction

InfluxDB
Introduction overview
•What is time series data
•TICK stack
•InfluxDB
•InfluxDB use cases
•Installing InfluxDB locally
•Installing InfluxDB on the cloud
•Sample data
•Import data
•A first query

InfluxDB
Introduction
•A lot of applications and devices are generating timestamped data
•smartphones, fitbits, machine monitoring, application monitoring, IOT
•Super valuable for business and insights of your data
•Need for a time series database
•Relational databases are not cutout for this
•Comparison to databases
•missing features in existing ones
•extensive developer efforts

InfluxDB
Introduction
•Isn’t it possible to do this with an existing database?
•Limited options for TSDB’s (time series databases)
•InfluxDB is able to handle high write and read requests
•Written in go and compiles into single binary
•Mature
•Installation = piece of cake

InfluxDB: Time Series

InfluxDB
What is time series data
•A time series is a series of data points indexed in time order.
•Time series is a sequence taken at the same interval
•Time series are very frequently plotted via line charts.
•Time series are used in statistics, signal processing, pattern recognition,
econometrics, mathematical finance, …
•Applied in science / engineering / applications and server metrics

InfluxDB
What is time series data
•Time series data is a collection of datapoints within a certain time
•All datapoints are variable measurements within a certain time interval
•Interval between datapoints should be the same(saves disk)
•Time between one measurement and the next should be the same
•Each datapoint only has one measurement

InfluxDB
What is time series data
Regular timeseries
VS
Irregular timeseries

InfluxDB
What is time series data
•Regular time series (Orange datapoints)
•Measure points happen at the same interval
Examples: DevOps monitoring like cpu, ram

InfluxDB
What is time series data
•Irregular time series (Grey datapoints)
•Measure points happen at random moments in time
Examples: User clicks, orders in restaurant

InfluxDB: Tick Stack

InfluxDB
TICK Stack
© 2017 InfluxData, Inc.

InfluxDB
TICK Stack: Telegraph
© 2017 InfluxData, Inc.

InfluxDB
Tick Stack: Telegraph
•metric collection daemon
•plugin-driven
•Written in Go, no need for external dependencies, no npm, pip, gem
•over 100 plugins (Docker, Apache,Kafka, Kubernetes, php, redis,…)
•parse the input data formats
•Telegraf’s flexibility makes it an easy decision to implement.

InfluxDB
TICK Stack: InfluxDB
© 2017 InfluxData, Inc.

InfluxDB
TICK Stack: Chronograf
© 2017 InfluxData, Inc.

InfluxDB
Tick Stack: Chronograf
•Complete dashboarding solution
•Example dashboards
•Visual administrative tool for your InfluxData deployments
•Provides user management and user authentication
•User interface for Kapacitor
•Telegraf’s flexibility makes it an easy decision to implement.

InfluxDB
TICK Stack:Kapacitor
© 2017 InfluxData, Inc.

InfluxDB
Tick Stack: Kapacitor
•More than just dashboarding and operator alerts
•Call User Defined Functions
•Kapacitor very flexible to act as a control pannel
•Autoscaling, stock reordering, IoT device control
•Designed to process streaming data in real-time
•Downsampling / performing advanced analytics
•Easy to use yet so powerful
•Kapacitor integrations

InfluxDB
TICK Stack: InfluxDB
© 2017 InfluxData, Inc.

InfluxDB
InfluxDB
•InfluxDB is a time series, metrics, and analytics database for any use case
involving large or small amounts of timestamped data.
•It’s written in Go and has no external dependencies: That means once you
install it there’s nothing else to manage (such as Redis, ZooKeeper, Cassandra,
HBase, …).
•Built from the ground up to handle high write & query loads.
•Targeted at use cases for DevOps, metrics, sensor data, and real-time analytics.
•Offers a SQL-like query language for interacting with data.
•Provides a high performance write and query HTTP or https API

InfluxDB
InfluxDB
•InfluxDB can handle millions of data points per second.
•Some extraordinary features: automatically expiring & deleting any
unwanted data from the system.

InfluxDB
InfluxDB use cases
Custom DevOps Real Time
Analytics
IoT and Sensor
Data
Cloud &
OpenStack
MonitoringAnomaly Detection Messaging Equities Trading
Personalization
Municipal
Infrastructure
Management
GPS Services
Quantum Physics
Research
Point of Sale
Systems
Manufacturing &
Home
Automation …

InfluxDB
InfluxDB use cases

InfluxDB: Installation

InfluxDB
Installing InfluxDB locally (Docker)
•Installation procedure for Mac

InfluxDB
Installing InfluxDB locally (Docker)
•Installation procedure for Windows

InfluxDB
Installing InfluxDB in the cloud
•Create account at DigitalOcean
•Log-on the machine
•Install InfluxDB

InfluxDB
InfluxDB: sample data
•Travel data from my last trip with my kids to France
My Speed


Example:
speed_ms,location=biron_france speed=13.570000,speed description="speed in meter per second" 1502208946.000170
speed_ms,location=biron_france speed=13.570000,speed description="speed in meter per second" 1502208946.256510
speed_ms,location=biron_france speed=13.320000,speed description="speed in meter per second" 1502208947.000170
speed_ms,location=biron_france speed=13.110000,speed description="speed in meter per second" 1502208948.000170
speed_ms,location=biron_france speed=12.790000,speed description="speed in meter per second" 1502208949.000170
speed_ms,location=biron_france speed=12.210000,speed description="speed in meter per second" 1502208950.000170
speed_ms,location=biron_france speed=10.760000,speed description="speed in meter per second" 1502208951.000170
speed_ms,location=biron_france speed=9.740000,speed description="speed in meter per second" 1502208952.000170
The Altitude I was driving


Example:
altitude_m,location=biron_france altitude=193.342822,altitude description="altitude in meter" 1502208946.000170
altitude_m,location=biron_france altitude=193.342822,altitude description="altitude in meter" 1502208946.256510
altitude_m,location=biron_france altitude=193.379115,altitude description="altitude in meter" 1502208947.000170
altitude_m,location=biron_france altitude=193.139504,altitude description="altitude in meter" 1502208948.000170
altitude_m,location=biron_france altitude=193.397174,altitude description="altitude in meter" 1502208949.000170
altitude_m,location=biron_france altitude=193.087170,altitude description="altitude in meter" 1502208950.000170
altitude_m,location=biron_france altitude=192.850863,altitude description="altitude in meter" 1502208951.000170
altitude_m,location=biron_france altitude=192.850863,altitude description="altitude in meter" 1502208952.000170
altitude_m,location=biron_france altitude=192.850357,altitude description="altitude in meter" 1502208953.000170

InfluxDB
InfluxDB: importing the sample data
•Travel data from my last trip with my kids to France

InfluxDB
InfluxDB: a first query
•Travel data from my last trip with my kids to France

InfluxDB
Introduction recap
•What is time series data
•TICK stack
•InfluxDB
•InfluxDB use cases
•Installing InfluxDB locally
•Installing InfluxDB on the cloud
•Sample data
•Import data
•A first query

InfluxDB: Basics

InfluxDB: Key concepts

InfluxDB
InfluxDB: Key concepts
•Database
•Field key, set, value
•Point
•Tag key, set, value
•Series
•Timestamp
•Measurement
•Retention Policy

InfluxDB
InfluxDB: Key concepts
•Database
•An InfluxDB database is similar to traditional relational databases and
serves as a logical container for users, retention policies, continuous
queries, and, of course, your time series data

InfluxDB
InfluxDB: Key concepts
•Fields
time location speed speed description
---- -------- ------- -----------------
1502208946 biron_france 13.57 speed in meter per second
•Field values is the actual data this can be a strings, floats, integers, or
booleans, timestamp.
Example: 13.57
•Field key are are strings and they store metadata
Example: speed
•Field Set is field key + field value
Example: speed=13.57

InfluxDB
InfluxDB: Key concepts
•Point
•A point is the field set in the same series with the same timestamp.

InfluxDB
InfluxDB: Key concepts
•Tags(optional)
time location car. speed speed description
---- -------- ------- ------- -----------------
1502208946 biron_france audi 13.57 speed in meter per second
1502209075 paris bmw 13.57 speed in meter per second
•Tag value extra meta data
Example: biron_france, Paris, bmw and Audi
•Tag key extra meta data
Example: location or car
•Tag set is the different combinations of all the tag key-value pairs
Example: location=paris, car=audi
location=biron_france, car=bmw
location=paris, car=bmw
location=biron_france, car=audi

InfluxDB
InfluxDB: Key concepts
•Series
•A series is the collection of data that share a retention policy,
measurement, and tag set.

InfluxDB
InfluxDB: Key concepts
•Timestamp
•Since InfluxDB is a time series datastore every measurement does
have a timestamp.
•Epoch time

InfluxDB
InfluxDB: Key concepts
•Measurement
•Acts as a container for tags, fields and time column(comparison with a
table in sql)

InfluxDB
InfluxDB: Key concepts
•Retention policy
• A retention policy describes how long InfluxDB keeps data
(DURATION) and how many copies of those data are stored in the
cluster (REPLICATION).

InfluxDB: Datamodel

InfluxDB
InfluxDB’s Datamodel: Tags
•car_speed maps to a measurement
•For mysql users, a measurement is like a table
Time
Speed
car_speed

InfluxDB
InfluxDB’s Datamodel: Tags
•The legend contains the meta data of what the measurement represents,
metadata like this = tag (driver=jorn)
•Tags are index and thus faster to query on than on fields
Time
Speed
car_speed

InfluxDB
InfluxDB’s Datamodel: Collection of Tags
•A collection of all tags = a tag set(example: driver=jorn,location=biron)
•A tag set are all available combinations of key value pairs
Time
Speed
car_speed

InfluxDB
InfluxDB’s Datamodel: Y-axis
•The values on the y-axis are called fields and are also a key value pair
Key=Speed, Value=14.3
•Possible values: floats, ints, strings, bools
Time
Speed
car_speed

InfluxDB
InfluxDB’s Datamodel: Collection of fields
•The collection of fields is called a field set
example: speed=14.3,altitude=123
Time
Speed
car_speed

InfluxDB
InfluxDB’s Datamodel: X-axis
•The x-axis is a timestamp
•Represented as a number in unix epoch time
Time
Speed
car_speed

InfluxDB
InfluxDB’s Datamodel: Lineprotocol
speed_ms,location=biron_france speed=13.570000,speed description="speed in meter per second" 1502208946.000170
•Speed_ms = MEASUREMENT
•Location = TAG SET
•Speed = FIELD VALUE
•Description = FIELD VALUE
•Speed + description = FIELD SET
• Timestamp

InfluxDB: Schema design

InfluxDB
InfluxDB: Schema design
•Don’t encode data into the measurement names
Bad:
speed_ms.jorn.biron.france speed=13.57 1502208946
cpu.server-1337.eu-west.amazon value=23 1502208946
Good:
speed_ms,driver=jorn,location=biron,country=france speed=13.570000 1502208946
cpu,server=server-1337,region=eu-west,provider=amazon value=23 1502208946

InfluxDB
InfluxDB: Schema design
•I can’t change the way we send metrics to influxdb!
No worries -> Plugins

Example Graphite Plugin:
Takes input:
cpu.server-1337.eu-west.amazon value=23 1502208946
Translates it:
cpu,server=server-1337,region=eu-west,provider=amazon value=23 1502208946

InfluxDB
InfluxDB: Schema design
•Don’t overload tags with metadata
Bad:
cpu.server=server-1337.eu-west.amazon value=23 1502208946
Good:
cpu,host=server-1337,region=eu-west,provider=amazon value=23 1502208946

InfluxDB
InfluxDB: Schema design
Other best practices:
•Functions only accept fields
•Regular expressions only work on tags
•GROUP BY only on tags

InfluxDB
InfluxDB: Schema design
Other best practices:
•Don’t use the same name for a field and tag
•Use enough tags
cpu,region=eu-west host=server-1000,provider=amazon value=23 1502204738 -> gets silently overwritten because same tag-set and timestamp!
cpu,region=eu-west host=server-1000,provider=amazon value=50 1502204738
•Get the precision correct: ms vs s = 1970 vs 2185
•Way more network traffic in ms vs s
•Larger on disk
•Write throughput can get saturated
•Functions only accept fields

InfluxDB
InfluxDB: Schema design
Other tips
•Anything in GROUP BY needs to be a tag
•If you pass anything to a function its a field
•Losing info can be due to wrong datatype
•10 measurements in a database = OK
•1000 measurements in a 1 database = NOK

InfluxDB: Hardware?

InfluxDB
InfluxDB: Hardware
•Load profiles
•Low:
<25K writes, <5 queries/sec, <100K unique series
Recommended: 2-4 cores, 2-4GB RAM, 500 IOPS
•Moderate
<200K writes, <25 queries/sec, <1M unique series
Recommended: 4-6 cores, 8-32GB RAM, 500-1000 IOPS
•High
>200K writes, >25 queries/sec, >1M unique series
Recommended: 8+ cores, 32+GB RAM, 1000 IOPS

InfluxDB
InfluxDB: Hardware
•Capacity planning
•SSD in AWS is cheap
•1 Bilion points = 3GB(best) on disk
•3 bytes per point
•Strings = variable in size as of compression
•Regular time series takes up less space

InfluxDB: InfluxQL

InfluxDB
InfluxQL: Introduction
•This section covers InfluxQL, InfluxDB’s SQL-like query language for
interacting with data in InfluxDB.
•Schema exploration
•Data exploration

InfluxDB
InfluxQL: Introduction
ALL ALTER ANALYZE ANY AS ASC
BEGIN BY CREATE CONTINUOUS DATABASE DATABASES
DEFAULT DELETE DESC DESTINATIONS DIAGNOSTICS DISTINCT
DROP DURATION END EVERY EXPLAIN FIELD
FOR FROM GRANT GRANTS GROUP GROUPS
IN INF INSERT INTO KEY KEYS
KILL LIMIT SHOW MEASUREMENT MEASUREMEN
TS
NAME
OFFSET ON ORDER PASSWORD POLICY POLICIES
PRIVILEGES QUERIES QUERY READ REPLICATION RESAMPLE
RETENTION REVOKE SELECT SERIES SET SHARD
SHARDS SLIMIT SOFFSET STATS SUBSCRIPTIONSUBSCRIPTIO
NSTAG TO USER USERS VALUES WHERE
WITH WRITE
•KEY WORDS

InfluxDB
InfluxQL: Data exploitation
•Schema exploitation

SHOW DATABASES
SHOW SERIES
SHOW MEASUREMENTS

InfluxDB
InfluxQL: Data exploitation, SELECT
•Data exploitation

SELECT <field_key>[,<field_key>,<tag_key>]
FROM <measurement_name>[,<measurement_name>]

InfluxDB
InfluxQL: SELECT
Jorns-MacBook-Pro:1.3 jornjambers$ docker run --rm --link=influxdb -it influxdb influx -host influxdb
Connected to http://influxdb:8086 version 1.3.5
InfluxDB shell version: 1.3.5
> show databases;
name: databases
name
----
_internal
holiday_france
> use holiday_france
Using database holiday_france
> select * from altitude_m limit 2;
name: altitude_m
time altitude altitude description location
---- -------- -------------------- --------
1502208946 193.342822 altitude in meter biron_france
1502208947 193.379115 altitude in meter biron_france
> select * from speed_ms limit 2;
name: speed_ms
time location speed speed description
---- -------- ----- -----------------
1502208946 biron_france 13.57 speed in meter per second
1502208947 biron_france 13.32 speed in meter per second
>
•Select all fields and tags from a single measurement

InfluxDB
InfluxQL: SELECT
> select time, altitude from altitude_m limit 2;
name: altitude_m
time altitude
---- --------
1502208946 193.342822
1502208947 193.379115
> select time, speed from speed_ms limit 2;
name: speed_ms
time speed
---- -----
1502208946 13.57
1502208947 13.32
>
•Select specific tags and fields from a single measurement

InfluxDB
InfluxQL: SELECT
Using database holiday_france
> SELECT "description"::field,"location"::tag,"altitude"::field FROM "altitude_m" LIMIT 3
name: altitude_m
time description location altitude
---- ----------- -------- --------
1502208946 biron_france 193.342822
1502208947 biron_france 193.379115
1502208948 biron_france 193.139504
>
•Select specific tags and fields from a single measurement, and provide their
identifier type

InfluxDB
InfluxQL: SELECT
> SELECT *::field FROM "altitude_m" LIMIT 3
name: altitude_m
time altitude altitude description
---- -------- --------------------
1502208946 193.342822 altitude in meter
1502208947 193.379115 altitude in meter
1502208948 193.139504 altitude in meter
•Select all fields from a single measurement

InfluxDB
InfluxQL: SELECT
> SELECT ("altitude" / 1000) FROM "altitude_m" LIMIT 3
name: altitude_m
time altitude
---- --------
1502208946 0.19334282200000003
1502208947 0.19337911500000002
1502208948 0.193139504
•Select a specific field from a measurement and perform basic arithmetic

InfluxDB
InfluxQL: SELECT
> SELECT * FROM "altitude_m","speed_ms" LIMIT 3;
name: altitude_m
time altitude altitude description location speed speed description
---- -------- -------------------- -------- ----- -----------------
1502208946 193.342822 altitude in meter biron_france
1502208947 193.379115 altitude in meter biron_france
1502208948 193.139504 altitude in meter biron_france
name: speed_ms
time altitude altitude description location speed speed description
---- -------- -------------------- -------- ----- -----------------
1502208946 biron_france 13.57 speed in meter per second
1502208947 biron_france 13.32 speed in meter per second
1502208948 biron_france 13.11 speed in meter per second
>
•Select all data from more than one measurement

InfluxDB
InfluxQL: Data exploitation, WHERE
•Data exploitation

SELECT_clause FROM_clause WHERE <conditional_expression> [(AND|
OR) <conditional_expression> [...]]

InfluxDB
InfluxQL: WHERE
> SELECT * FROM "altitude_m" WHERE "altitude" > 194 LIMIT 3
name: altitude_m
time altitude altitude description location
---- -------- -------------------- --------
1502208964 194.254053 altitude in meter biron_france
1502208965 194.71817 altitude in meter biron_france
1502208966 194.81373 altitude in meter biron_france
•Select data that have specific field key-values

InfluxDB
InfluxQL: WHERE
> SELECT * FROM "speed_ms" WHERE "speed description" = 'speed in meter per second' LIMIT 3
name: speed_ms
time location speed speed description
---- -------- ----- -----------------
1502208946 biron_france 13.57 speed in meter per second
1502208947 biron_france 13.32 speed in meter per second
1502208948 biron_france 13.11 speed in meter per second
>
•Select data that have a specific string field key-value

InfluxDB
InfluxQL: WHERE
> SELECT * FROM "speed_ms" WHERE "speed" + 1 > 12 LIMIT 3
name: speed_ms
time location speed speed description
---- -------- ----- -----------------
1502208946 biron_france 13.57 speed in meter per second
1502208947 biron_france 13.32 speed in meter per second
1502208948 biron_france 13.11 speed in meter per second
> SELECT * FROM "speed_ms" WHERE "speed" + 1 < 12 LIMIT 3
name: speed_ms
time location speed speed description
---- -------- ----- -----------------
1502208951 biron_france 10.76 speed in meter per second
1502208952 biron_france 9.74 speed in meter per second
1502208953 biron_france 8.51 speed in meter per second
>
•Select data that have a specific field key-value and perform basic arithmetic

InfluxDB
InfluxQL: WHERE
> SELECT * FROM "speed_ms" WHERE "location" = 'biron_france' LIMIT 3
name: speed_ms
time location speed speed description
---- -------- ----- -----------------
1502208946 biron_france 13.57 speed in meter per second
1502208947 biron_france 13.32 speed in meter per second
1502208948 biron_france 13.11 speed in meter per second
•Select data that have specific field key-values and tag key-values

InfluxDB
InfluxQL: WHERE
> SELECT * FROM "speed_ms" WHERE "location" = 'biron_france' AND ("speed" + 1 < 12 OR "speed" + 1 > 13 )LIMIT 10
name: speed_ms
time location speed speed description
---- -------- ----- -----------------
1502208946 biron_france 13.57 speed in meter per second
1502208947 biron_france 13.32 speed in meter per second
1502208948 biron_france 13.11 speed in meter per second
1502208949 biron_france 12.79 speed in meter per second
1502208950 biron_france 12.21 speed in meter per second
1502208951 biron_france 10.76 speed in meter per second
1502208952 biron_france 9.74 speed in meter per second
1502208953 biron_france 8.51 speed in meter per second
1502208954 biron_france 7.53 speed in meter per second
1502208955 biron_france 6.87 speed in meter per second
>
•Select data that have a specific tag key-value

InfluxDB
InfluxQL: WHERE
> SELECT * FROM "speed_ms" WHERE time > now() - 150d;
> SELECT * FROM "speed_ms" WHERE time >= '2015-08-18'
> SELECT * FROM "speed_ms" WHERE time >= '2000-08-18'
•Select data that have specific timestamps
•Why don’t following queries return anything?

InfluxDB
InfluxQL: WHERE
> SELECT * FROM "speed_ms" WHERE time >= '1970-01-01'
name: speed_ms
time location speed speed description
---- -------- ----- -----------------
1502208946 biron_france 13.57 speed in meter per second
1502208947 biron_france 13.32 speed in meter per second
1502208948 biron_france 13.11 speed in meter per second
1502208949 biron_france 12.79 speed in meter per second
1502208950 biron_france 12.21 speed in meter per second
•But there is data in there?!?

InfluxDB
InfluxQL: WHERE
docker run -v "$PWD":/tmp/testdata/ --rm --link=influxdb -it influxdb influx -precision s -import -path=/tmp/testdata/
sample_data_course -host influxdb

> SELECT * FROM "speed_ms" WHERE time > '2017-06-30T21:24:00Z' LIMIT 3
name: speed_ms
time location speed speed description
---- -------- ----- -----------------
1502208946000000000 biron_france 13.57 speed in meter per second
1502208947000000000 biron_france 13.32 speed in meter per second
1502208948000000000 biron_france 13.11 speed in meter per second
•Rerun import with the correct precision:

InfluxDB
InfluxQL: WHERE
> SELECT * FROM "speed_ms" WHERE time > now() - 100d LIMIT 3
name: speed_ms
time location speed speed description
---- -------- ----- -----------------
1502208946000000000 biron_france 13.57 speed in meter per second
1502208947000000000 biron_france 13.32 speed in meter per second
1502208948000000000 biron_france 13.11 speed in meter per second
•Select data that have specific timestamps

InfluxDB
InfluxQL: WHERE
> SELECT * FROM "speed_ms" WHERE time > now() - 150d;
> SELECT * FROM "speed_ms" WHERE time >= '2015-08-18'
> SELECT * FROM "speed_ms" WHERE time >= '2000-08-18'
•Select data that have specific timestamps
•Why don’t following queries return anything?

InfluxDB
InfluxQL: Data exploitation, GROUP BY
•Data exploitation

SELECT_clause FROM_clause [WHERE_clause] GROUP BY [* |
<tag_key>[,<tag_key]]

InfluxDB
InfluxQL: GROUP BY
docker run -v "$PWD":/tmp/testdata/ --rm --link=influxdb -it influxdb influx -precision s -import -path=/tmp/testdata/
sample_data_course2 -host influxdb
2017/10/01 14:49:53 Processed 1 commands
2017/10/01 14:49:53 Processed 76290 inserts
2017/10/01 14:49:53 Failed 0 inserts
Jorns-MacBook-Pro:1.3 jornjambers$ docker run --rm --link=influxdb -it influxdb influx -host influxdb
Connected to http://influxdb:8086 version 1.3.5
InfluxDB shell version: 1.3.5
> show databases
name: databases
name
----
_internal
holiday_france
NOAA_water_database
> use NOAA_water_database
Using database NOAA_water_database
>
•Import bigger sample data

InfluxDB
InfluxQL: GROUP BY
> SELECT MEAN("water_level") FROM "h2o_feet" GROUP BY "location"
name: h2o_feet
tags: location=coyote_creek
time mean
---- ----
0 5.359342451341401
name: h2o_feet
tags: location=santa_monica
time mean
---- ----
0 3.530863470081006
•Group query results by a single tag

InfluxDB
InfluxQL: GROUP BY
> SELECT MEAN("water_level") FROM "h2o_feet" GROUP BY "location" LIMIT 2;
name: h2o_feet
tags: location=coyote_creek
time mean
---- ----
0 5.359342451341401
name: h2o_feet
tags: location=santa_monica
time mean
---- ----
0 3.530863470081006
•Group query results by more than one tag

InfluxDB
InfluxQL: GROUP BY
> SELECT MEAN("index") FROM "h2o_quality" GROUP BY * LIMIT 2;
name: h2o_quality
tags: location=coyote_creek, randtag=1
time mean
---- ----
0 50.69033760186263
name: h2o_quality
tags: location=coyote_creek, randtag=2
time mean
---- ----
0 49.661867544220485
• Group query results by all tags

InfluxDB
InfluxQL: GROUP BY
> SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z'
AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m)
name: h2o_feet
time count
---- -----
1439856000000000000 2
1439856720000000000 2
1439857440000000000 2
•Group query results into 12 minute intervals

InfluxDB
InfluxQL: Data exploitation, INTO
•Data exploitation

SELECT_clause INTO <measurement_name> FROM_clause
[WHERE_clause] [GROUP_BY_clause]

InfluxDB
InfluxQL: INTO
> SELECT * INTO "copy_speed_ms" FROM "speed_ms" GROUP BY *
name: result
time written
---- -------
1970-01-01T00:00:00Z 1170
•Rename a database
•Rename in smaller pieces when dataset is huge
SELECT *
INTO <destination_database>.<retention_policy_name>.<measurement_name>
FROM <source_database>.<retention_policy_name>.<measurement_name>
WHERE time > now() - 100w and time < now() - 90w GROUP BY *
SELECT *
INTO <destination_database>.<retention_policy_name>.<measurement_name>
FROM <source_database>.<retention_policy_name>.<measurement_name>}
WHERE time > now() - 90w and time < now() - 80w GROUP BY *

InfluxDB
InfluxQL: INTO
> SELECT "speed" INTO copy_speed FROM "speed_ms" WHERE "location"='biron_france'
name: result
time written
---- -------
1970-01-01T00:00:00Z 1170
> show measurements
name: measurements
name
----
altitude_m
copy_speed
copy_speed_ms
speed_ms
>
•Write the results of a query to a measurement

InfluxDB
InfluxQL: INTO
> CREATE DATABASE copy_speed_ms
> SELECT "speed" INTO "copy_speed_ms"."autogen"."speed_ms" FROM "speed_ms" WHERE "location" = 'biron_france'
name: result
time written
---- -------
1970-01-01T00:00:00Z 1170
> USE copy_speed_ms
Using database copy_speed_ms
> SHOW measurements
name: measurements
name
----
speed_ms
> SELECT * FROM speed_ms LIMIT 3;
name: speed_ms
time speed
---- -----
1970-01-01T00:00:01.502208946Z 13.57
1970-01-01T00:00:01.502208947Z 13.32
1970-01-01T00:00:01.502208948Z 13.11
•Write the results of a query to a fully qualified measurement

InfluxDB
InfluxQL: INTO
> SELECT MEAN("speed") INTO "speed_ms_downsample" FROM "speed_ms" WHERE "location"='biron_france' AND time >=
'2017-08-08T16:15:46Z' AND time <= '2017-08-08T16:25:23Z' GROUP BY time(10s);
name: result
time written
---- -------
1970-01-01T00:00:00Z 59
> SELECT * FROM speed_ms_downsample LIMIT 3
name: speed_ms_downsample
time mean
---- ----
2017-08-08T16:15:40Z 13.1975
2017-08-08T16:15:50Z 7.226000000000001
2017-08-08T16:16:00Z 1.6549999999999998
•Downsampling

InfluxDB
InfluxQL: Data exploitation, Subqueries
•Data exploitation

SELECT_clause FROM ( SELECT_statement ) [...]

InfluxDB
InfluxQL: Subqueries
> SELECT SUM("max") FROM (SELECT MAX("speed") FROM "speed_ms" GROUP BY "location")
name: speed_ms
time sum
---- ---
1970-01-01T00:00:00Z 26.25
>
•Calculate the SUM() of several MAX() values

InfluxDB
InfluxQL: Subqueries
> SELECT MEAN("difference") FROM (SELECT "cats" - "dogs" AS "difference" FROM "pet_daycare")
name: pet_daycare
time mean
---- ----
1970-01-01T00:00:00Z 1.75
•Calculate the MEAN() difference between two fields

InfluxDB
InfluxQL: Subqueries
> SELECT "all_the_means" FROM (SELECT MEAN("water_level") AS "all_the_means" FROM "h2o_feet" WHERE time >=
'2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m) ) WHERE "all_the_means" > 5
name: h2o_feet
time all_the_means
---- -------------
2015-08-18T00:00:00Z 5.07625

> SELECT MEAN("water_level") AS "all_the_means" FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <=
'2015-08-18T00:30:00Z' GROUP BY time(12m)
name: h2o_feet
time all_the_means
---- -------------
2015-08-18T00:00:00Z 5.07625
2015-08-18T00:12:00Z 4.950749999999999
2015-08-18T00:24:00Z 4.80675
•Calculate several MEAN() values and place a condition on those mean values

InfluxDB
InfluxQL: Subqueries
> SELECT SUM("water_level_derivative") AS "sum_derivative" FROM (SELECT DERIVATIVE(MEAN("water_level")) AS
"water_level_derivative" FROM "h2o_feet" WHERE time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY
time(12m),"location") GROUP BY "location"
name: h2o_feet
tags: location=coyote_creek
time sum_derivative
---- --------------
1970-01-01T00:00:00Z -0.4950000000000001
name: h2o_feet
tags: location=santa_monica
time sum_derivative
---- --------------
1970-01-01T00:00:00Z -0.043999999999999595
•Calculate the SUM() of several DERIVATIVE() values

InfluxDB
InfluxQL: JOINS
•Joins don’t exist in InfluxDB
•Your schema should reflect that
•A measurement is like a sql table with a primary index set to pre-set time
•Influxdb timestamps must be UNIX epoch or as date-time RFC3339

InfluxDB: Advanced

InfluxDB: Storage engine

InfluxDB
Storage engine
•In-Memory Index
•WAL
•Cache
•TSM Files
•FileStore
•Compactor
•Compaction Planner
•Compression
•Writers/Readers

InfluxDB: HTTP API

InfluxDB
HTTP API: Create database
curl -i -XPOST http://localhost:8086/query --data-urlencode "q=CREATE DATABASE mydb"
•Create a database

InfluxDB
HTTP API: Writing data
curl -i -XPOST 'http://localhost:8086/write?db=mydb' —data-binary 'cpu_load_short,host=server01,region=us-west value=0.64
1434055562000000000'
•Writing a single point
•Writing multiple points
curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary 'cpu_load_short,host=server02 value=0.67
cpu_load_short,host=server02,region=us-west value=0.55 1422568543702900257
cpu_load_short,direction=in,host=server01,region=us-west value=2.0 1422568543702900257’
•Writing points from file
curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary @cpu_data.txt

InfluxDB
HTTP API: Reading data
curl -G 'http://localhost:8086/query?pretty=true' --data-urlencode "db=mydb" --data-urlencode "q=SELECT \"value\" FROM
\"cpu_load_short\" WHERE \"region\"='us-west'"
•A single query

InfluxDB
HTTP API: Reading data
curl -G 'http://localhost:8086/query?pretty=true' --data-urlencode "db=mydb" --data-urlencode "q=SELECT \"value\" FROM
\"cpu_load_short\" WHERE \"region\"='us-west';SELECT count(\"value\") FROM \"cpu_load_short\" WHERE \"region\"='us-west'"
•Multiple queries

InfluxDB
HTTP API: Reading data
curl -G 'http://localhost:8086/query' --data-urlencode “db=holiday_france" --data-urlencode "chunked=true" --data-urlencode
“chunk_size=500" --data-urlencode "q=SELECT * FROM speed_ms”
•Chunking

InfluxDB
HTTP API: Responses
•2xx: If your write request received HTTP 204 No Content, it was a success!
•You Rock
•4xx: InfluxDB could not understand the request.
•You screwed up
•5xx: The system is overloaded or significantly impaired.
•Operations screwed up

InfluxDB: Authentication and Authorization

InfluxDB
InfluxDB: Authentication and Authorization
•By default, authentication is disabled!
•All users have all privileges
•All credentials are silently ignored
•Authorization is only enforced once you’ve enabled authentication.
•Very similar to SQL user authentication and authorization
•For CLI as well as the HTTP API
•3 Ways for the HTTP API
• Authenticate with Basic Authentication
• Authenticate by providing query parameters or via the body

InfluxDB
InfluxDB: Authentication and Authorization
Difference between authentication and authorization
•Authentication
•Who can come in
•Authorization
•Who can access what

InfluxDB
InfluxDB: Authentication and Authorization
Error codes
•HTTP 401 Unauthorized
• No authentication credentials or incorrect credentials
•403 Forbidden
•Requests by unauthorized users

InfluxDB
InfluxDB: Authentication and Authorization
General user management
•Set password
•> SET PASSWORD FOR jorn = 'mpsywqv9bPduE'
•Remove user
•> DROP USER jorn

InfluxDB: Authentication

InfluxDB
InfluxDB: Authentication
•Setup:
•Create an admin user first using following syntax:

> CREATE USER "jorn" WITH PASSWORD
'G{3jsgmXVmupvE' WITH ALL PRIVILEGES

> show users
user admin
---- -----
jorn true

InfluxDB
InfluxDB: Authentication
•Setup:
•Enable authentication by setting the auth-enabled to true in the [http] section of the
configuration file

vi /etc/influxdb/influxdb.conf

[http]
enabled = true
bind-address = ":8086"
auth-enabled = true # <———
log-enabled = true
write-tracing = false
pprof-enabled = false
https-enabled = false
https-certificate = "/etc/ssl/influxdb.pem"

InfluxDB
InfluxDB: Authentication
•CLI:

•influx -username jorn -password G{3jsgmXVmupvE

InfluxDB
InfluxDB: Authentication
•CLI
•Environment vars:


export INFLUX_USERNAME jorn
export INFLUX_PASSWORD G{3jsgmXVmupvE
echo $INFLUX_USERNAME $INFLUX_PASSWORD
jorn G{3jsgmXVmupvE

influx
Connected to http://influxdb:8086 version 1.3.5
InfluxDB shell version: 1.3.5
>

InfluxDB
InfluxDB: Authentication
•CLI
•Influx CLI:


influx
Connected to http://influxdb:8086 version 1.3.5
InfluxDB shell version: 1.3.5
> auth
username: jorn
password: G{3jsgmXVmupvE
>

InfluxDB
InfluxDB: Authentication
•HTTP API
•Authenticate with Basic Authentication:
•Format  RFC 2617, Section 2

curl -G http://localhost:8086/query -u jorn:G{3jsgmXVmupvE
--data-urlencode "q=SHOW DATABASES"

InfluxDB
InfluxDB: Authentication
•HTTP API
•Authenticate by providing query parameters:

curl -G "http://localhost:8086/query?u=jorn&p=G{3jsgmXVmupvE"
--data-urlencode "q=SHOW DATABASES"
•Authenticate by using request body:

curl -G http://localhost:8086/query --data-urlencode "u=jorn" --data-
urlencode "p=G{3jsgmXVmupvE" --data-urlencode "q=SHOW
DATABASES"

InfluxDB
InfluxDB: Authentication
•HTTP API
•Authenticate by providing query parameters:

curl -G "http://localhost:8086/query?u=jorn&p=G{3jsgmXVmupvE"
--data-urlencode "q=SHOW DATABASES"
•Authenticate by using request body:

curl -G http://localhost:8086/query --data-urlencode "u=jorn" --data-
urlencode "p=G{3jsgmXVmupvE" --data-urlencode "q=SHOW
DATABASES"

InfluxDB: Authorization

InfluxDB
InfluxDB: Authorization
•Authorization is only enforced once you’ve enabled
authentication.
•By default, authentication is disabled,
•All credentials are silently ignored if authentication is disabled
•All users have all privileges if authentication is disabled

InfluxDB
InfluxDB: Authorization
User Types and Privileges
•Admin users
•Have READ and WRITE access to all databases and full
access to the administrative queries
•Non-admin users
•Non-admin users can have one of the following three
privileges: READ, WRITE ,ALL(READ+WRITE)

InfluxDB
InfluxDB: Authorization
Create user
•Admin users
•> CREATE USER "jorn" WITH PASSWORD 'G{3jsgmXVmupvE' WITH ALL PRIVILEGES
•Non-admin users
•> CREATE USER "jorn_read" WITH PASSWORD ‘JEUDYQJD#kd’
> GRANT READ ON “holiday_france” TO “jorn_read”
•> CREATE USER "jorn_rw" WITH PASSWORD ‘JEUDYQJD#kd’
> GRANT ALL ON “holiday_france” TO “jorn_rw”

InfluxDB
InfluxDB: Authorization
Revoke user
•Non-admin users
•> CREATE USER "jorn_read" WITH PASSWORD ‘JEUDYQJD#kd’
> GRANT READ ON “holiday_france” TO “jorn_read”
•> REVOKE ALL ON "holiday_france" FROM "jorn_read"’

InfluxDB
InfluxDB: Authorization
Show user
•Show permissions for a user
•> CREATE USER "jorn_read" WITH PASSWORD ‘JEUDYQJD#kd’
> GRANT READ ON “holiday_france” TO “jorn_read”
•> SHOW GRANTS FOR “jorn_read”
•> SHOW GRANTS FOR “jorn_read"
database privilege
holiday_france READ

InfluxDB: Retention policies

InfluxDB
InfluxDB: Create Retention policies
•Retention policies
CREATE RETENTION POLICY <retention_policy_name> ON <database_name> DURATION <duration>
REPLICATION <n> [SHARD DURATION <duration>] [DEFAULT]

InfluxDB
InfluxDB: Create Retention policies
•Retention policy 1 day
CREATE RETENTION POLICY "one_day_only" ON "NOAA_water_database" DURATION 1d REPLICATION 1

InfluxDB
InfluxDB: Create Retention policies
•Retention policy DEFAULT
CREATE RETENTION POLICY "one_day_only" ON "NOAA_water_database" DURATION 23h60m REPLICATION 1 DEFAULT

InfluxDB
InfluxDB: ALTER Retention policies
•Alter retention policy
ALTER RETENTION POLICY <retention_policy_name> ON <database_name> DURATION <duration> REPLICATION <n>
SHARD DURATION <duration> DEFAULT

InfluxDB
InfluxDB: ALTER Retention policies
•Alter retention policy example
> CREATE RETENTION POLICY "what_is_time" ON "NOAA_water_database" DURATION 2d REPLICATION 1
>ALTER RETENTION POLICY "what_is_time" ON "NOAA_water_database" DURATION 3w SHARD DURATION 30m DEFAULT

InfluxDB
InfluxDB: DROP Retention policies
•Drop(delete) retention policy
> DROP RETENTION POLICY <retention_policy_name> ON <database_name>

InfluxDB
InfluxDB: DROP Retention policies
•Drop(delete) retention policy example
> DROP RETENTION POLICY "what_is_time" ON "NOAA_water_database"

InfluxDB: Continuous queries

InfluxDB
InfluxDB: Continuous queries
•Continuous queries syntax
CREATE CONTINUOUS QUERY <cq_name> ON <database_name>
BEGIN
<cq_query>
END

InfluxDB
InfluxDB: Continuous queries
•Continuous queries basic syntax
SELECT <function[s]> INTO <destination_measurement> FROM <measurement> [WHERE <stuff>] GROUP BY time(<interval>)
[,<tag_key[s]>]

InfluxDB
InfluxDB: Continuous queries
•Continuous queries example automatically downsample
•CREATE CONTINUOUS QUERY "cq_basic" ON "transportation"
BEGIN
SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h)
END

InfluxDB
InfluxDB: Continuous queries
•Continuous queries example automatically downsample data into another retention policy

CREATE CONTINUOUS QUERY "cq_basic_rp" ON "transportation"
BEGIN
SELECT mean("passengers") INTO "transportation"."three_weeks"."average_passengers" FROM "bus_data" GROUP BY time(1h)
END

InfluxDB
InfluxDB: Continuous queries
•Continuous queries with offset and time boundaries
CREATE CONTINUOUS QUERY "cq_basic_offset" ON "transportation"
BEGIN
SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h,15m)
END

InfluxDB
InfluxDB: Continuous queries
•Continuous queries advanced syntax
CREATE CONTINUOUS QUERY <cq_name> ON <database_name>
RESAMPLE EVERY <interval> FOR <interval>
BEGIN
<cq_query>
END

InfluxDB
InfluxDB: Continuous queries
•Continuous queries advanced syntax example
CREATE CONTINUOUS QUERY "cq_advanced_every" ON "transportation"
RESAMPLE EVERY 30m
BEGIN
SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h)
END

InfluxDB
InfluxDB: Continuous queries
•Continuous queries advanced syntax example
CREATE CONTINUOUS QUERY "cq_advanced_for" ON "transportation"
RESAMPLE FOR 1h
BEGIN
SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(30m)
END

InfluxDB
InfluxDB: Continuous queries
•Continuous queries advanced syntax example
CREATE CONTINUOUS QUERY "cq_advanced_every_for" ON "transportation"
RESAMPLE EVERY 1h FOR 90m
BEGIN
SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(30m)
END

InfluxDB
InfluxDB: Continuous queries
•Continuous queries advanced syntax example
CREATE CONTINUOUS QUERY "cq_advanced_for_fill" ON "transportation"
RESAMPLE FOR 2h
BEGIN
SELECT mean("passengers") INTO "average_passengers" FROM "bus_data" GROUP BY time(1h) fill(1000)
END

InfluxDB: InfluxDB Integrations

InfluxDB: Chronograph

InfluxDB
InfluxDB: Chronograph

InfluxDB: InfluxDB + Grafana

InfluxDB
InfluxDB: Grafana

InfluxDB: Use Cases

InfluxDB: Monitoring

InfluxDB
InfluxDB: Monitoring
•Free
•Open-source monitoring
•Scalable
•Tick stack
•Telegraf collects time-series data from a variety of sources.
•InfluxDB stores time-series data.
•Chronograf visualizes and graphs your time-series data.
•Kapacitor provides alerting and detects anomalies in time-series data.

InfluxDB
InfluxDB: Monitoring
•Lab
•Prerequisites:
- One Ubuntu 16.04 server(DigitalOcean)
- An SMTP server(sendmail)

InfluxDB
InfluxDB: Monitoring
•Lab
•Steps:
- Adding the TICK Stack Repository
- Install InfluxDB and Configuring Authentication(!)
- Install and Configure Telegraf
- Install Kapacitator
- Install and Configure Chronograf
- Configure Alerts

InfluxDB
InfluxDB: Monitoring
•Lab
•Steps:
- Adding the TICK Stack Repository
- Install InfluxDB and Configuring Authentication(!)
- Install and Configure Telegraf
- Install Kapacitator
- Install and Configure Chronograf
- Configure Alerts

InfluxDB
InfluxDB: Monitoring
•Lab
•Conclusion
- Easy to use
- Free
- A lot of plugins
- Alerting(Anomaly detection)

InfluxDB: Chronograf Authentication

InfluxDB
InfluxDB: Chronograf Authentication
•Prerequisites for this lab
•Running TICK stack
•Github account

InfluxDB
InfluxDB: Chronograf Authentication
•Chronograf with OAuth 2.0 Authentication
•Providers: GitHub Google Heroku Auth0
•JWT Signature
•Random String
•Needed for all OAuth2 providers
•Verify user authorization
•Allow users to stay logged in
•Force logout users, change this TOKEN_SECRET on restart

InfluxDB
InfluxDB: Chronograf Authentication
•SystemD
•Config file
•/lib/systemd/system/chronograf.service
•Environment vars
•EnvironmentFile=-/etc/default/chronograf

InfluxDB
InfluxDB: Chronograf Authentication
•SystemD
•Config file
•/lib/systemd/system/chronograf.service
•Environment vars
•EnvironmentFile=-/etc/default/chronograf

InfluxDB
InfluxDB: Chronograf Authentication
Let’s get started!