You can design distributed systems with 400 microservices, but if you can't understand what's going on with them and how the application behaves because of them, you can't control anything. You can deploy 100 times a day, but without visibility into how your code impacts the system, you're headed for failure.
How fast we're able to predict and detect failures or new behaviors is an important piece of the puzzle. That's why I'm fascinated by time series and monitoring systems. In this article, I will give you a high-level overview of InfluxDB and the TICK stack, a set of open-source projects focused on time series and monitoring.
What Is InfluxDB and How Does the TICK Stack Work?
TICK is an acronym for Telegraf, InfluxDB, Chronograf, and Kapacitor. It's a set of open-source tools that can be combined together or used separately to collect, store, visualize, and manipulate any time series data.
First, we need to understand what a time series is. Essentially, it is a collection of points, and every point has a special label called time. Time can be stored at different levels of precision: second, nanosecond, and so on.
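As a rough sketch of what precision means in practice (the helper below is illustrative, not part of any InfluxDB client), the same instant can be represented at different precisions by scaling the Unix timestamp:

```python
# Sketch: the same instant expressed at different timestamp precisions.
# InfluxDB interprets the integer according to the precision you declare
# when writing (nanoseconds by default over the HTTP API).

def to_precision(seconds: int, precision: str) -> int:
    """Convert a Unix timestamp in seconds to the given precision."""
    factors = {"s": 1, "ms": 10**3, "us": 10**6, "ns": 10**9}
    return seconds * factors[precision]

t = 1439862840  # seconds since the epoch
print(to_precision(t, "s"))   # 1439862840
print(to_precision(t, "ns"))  # 1439862840000000000
```

Finer precision lets you store more than one point per second for the same series, at the cost of larger timestamps.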
The data structure in InfluxDB looks like this:
h2o_feet,location=coyote_creek water_level=5.617,level_description="between 3 and 6 feet" 1439862840
There is a measurement, h2o_feet; it represents a set of points -- it's the time series. As you can see, the last value is always the time: 1439862840. There are two other concepts: field and tag. A tag is an indexed key-value pair. Fields are not indexed, and they can have different types of values such as integer, boolean, or float.
The schema is measurement,tags fields timestamp
. With the previous example:
measurement:
h2o_feet
tags:
location=coyote_creek
fields:
water_level=5.617,level_description="between 3 and 6 feet"
timestamp:
1439862840
You can store more than one tag by separating them with a comma, like so: location=coyote_creek,region=us.
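To make the escaping rules concrete, here is a small illustrative sketch (not an official client; the helper names are made up) that assembles a line-protocol string like the one above. Note how a space inside a field key must be escaped with a backslash, and string field values are double-quoted:

```python
# Sketch: building an InfluxDB line-protocol string by hand.
# Simplified: real clients also handle integer suffixes and quoting
# edge cases; this covers the escaping shown in the article.

def escape(token: str) -> str:
    """Escape commas, spaces, and equals signs in keys and tag values."""
    return token.replace(",", "\\,").replace(" ", "\\ ").replace("=", "\\=")

def format_field(value) -> str:
    """String fields are quoted; booleans and numbers are written as-is."""
    if isinstance(value, str):
        return '"{}"'.format(value)
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)

def line(measurement, tags, fields, timestamp):
    tag_part = ",".join("{}={}".format(escape(k), escape(v)) for k, v in tags)
    field_part = ",".join("{}={}".format(escape(k), format_field(v)) for k, v in fields)
    return "{},{} {} {}".format(escape(measurement), tag_part, field_part, timestamp)

print(line("h2o_feet",
           [("location", "coyote_creek")],
           [("water_level", 5.617), ("level description", "between 3 and 6 feet")],
           1439862840))
# h2o_feet,location=coyote_creek water_level=5.617,level\ description="between 3 and 6 feet" 1439862840
```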
Why do we need a database focused on time series? Can't we use MongoDB, MySQL, or Elasticsearch? This is a common question, and there are published benchmarks comparing them.
The first reason is specialization: time series is a specific problem, and if we only need to cover that data shape, we can offer queries and solutions tailored to this use case while using resources efficiently.
The second reason is that the database needs to be very fast, since a monitoring system receives a lot of points in a short amount of time. It also needs to be able to serve reads without blocking writes.
InfluxDB is the first and best-known project in the TICK stack, and it's the storage engine. It handles points and time series, supports two different protocols (UDP and TCP), and provides an API to write and query time series.
Getting Started with InfluxDB
I use Docker a lot, and with it, it's easy to show how we can run some containers together. Let's start InfluxDB on an isolated network called tsdb-test, just to be sure that we can configure the internal communication without trouble:
docker network create tsdb-test
docker run -d -it --name influxdb -p 8086:8086 --network tsdb-test influxdb
Let's test it:
$ curl -I http://localhost:8086/ping
HTTP/1.1 204 No Content
Content-Type: application/json
Request-Id: e1096af5-6bb3-11e7-8001-000000000000
X-Influxdb-Version: 1.2.4
Date: Tue, 18 Jul 2017 12:23:17 GMT
By default, port 8086 serves the HTTP API. We called the /ping endpoint, just to double-check. InfluxDB has a powerful CLI called influx, and now we will use it to run some other tests.
Copy and paste this into a file, ~/influx_data.txt:
# DDL
CREATE DATABASE NOAA_water_database
# DML
# CONTEXT-DATABASE: NOAA_water_database
h2o_feet,location=coyote_creek water_level=2.943,level\ description="below 3 feet" 1439870400
h2o_feet,location=coyote_creek water_level=2.831,level\ description="below 3 feet" 1439870760
h2o_feet,location=coyote_creek water_level=2.717,level\ description="below 3 feet" 1439871120
h2o_feet,location=coyote_creek water_level=2.625,level\ description="below 3 feet" 1439871480
h2o_feet,location=coyote_creek water_level=2.533,level\ description="below 3 feet" 1439871840
h2o_feet,location=coyote_creek water_level=2.451,level\ description="below 3 feet" 1439872200
h2o_feet,location=coyote_creek water_level=2.385,level\ description="below 3 feet" 1439872560
h2o_feet,location=coyote_creek water_level=2.339,level\ description="below 3 feet" 1439872920
h2o_feet,location=coyote_creek water_level=2.293,level\ description="below 3 feet" 1439873280
h2o_feet,location=coyote_creek water_level=2.287,level\ description="below 3 feet" 1439873640
h2o_feet,location=coyote_creek water_level=2.290,level\ description="below 3 feet" 1439874000
h2o_feet,location=coyote_creek water_level=2.313,level\ description="below 3 feet" 1439874360
h2o_feet,location=coyote_creek water_level=2.359,level\ description="below 3 feet" 1439874720
h2o_feet,location=coyote_creek water_level=2.425,level\ description="below 3 feet" 1439875080
h2o_feet,location=coyote_creek water_level=2.513,level\ description="below 3 feet" 1439875440
h2o_feet,location=coyote_creek water_level=2.608,level\ description="below 3 feet" 1439875800
h2o_feet,location=coyote_creek water_level=2.703,level\ description="below 3 feet" 1439876160
h2o_feet,location=coyote_creek water_level=2.822,level\ description="below 3 feet" 1439876520
h2o_feet,location=coyote_creek water_level=2.927,level\ description="below 3 feet" 1439876880
h2o_feet,location=coyote_creek water_level=3.054,level\ description="between 3 and 6 feet" 1439877240
h2o_feet,location=coyote_creek water_level=3.176,level\ description="between 3 and 6 feet" 1439877600
h2o_feet,location=coyote_creek water_level=3.304,level\ description="between 3 and 6 feet" 1439877960
h2o_feet,location=coyote_creek water_level=3.432,level\ description="between 3 and 6 feet" 1439878320
h2o_feet,location=coyote_creek water_level=3.570,level\ description="between 3 and 6 feet" 1439878680
h2o_feet,location=coyote_creek water_level=3.720,level\ description="between 3 and 6 feet" 1439879040
h2o_feet,location=coyote_creek water_level=3.881,level\ description="between 3 and 6 feet" 1439879400
h2o_feet,location=coyote_creek water_level=4.049,level\ description="between 3 and 6 feet" 1439879760
h2o_feet,location=coyote_creek water_level=4.209,level\ description="between 3 and 6 feet" 1439880120
h2o_feet,location=coyote_creek water_level=4.383,level\ description="between 3 and 6 feet" 1439880480
h2o_feet,location=coyote_creek water_level=4.560,level\ description="between 3 and 6 feet" 1439880840
h2o_feet,location=coyote_creek water_level=4.744,level\ description="between 3 and 6 feet" 1439881200
h2o_feet,location=coyote_creek water_level=4.915,level\ description="between 3 and 6 feet" 1439881560
h2o_feet,location=coyote_creek water_level=5.102,level\ description="between 3 and 6 feet" 1439881920
h2o_feet,location=coyote_creek water_level=5.289,level\ description="between 3 and 6 feet" 1439882280
h2o_feet,location=coyote_creek water_level=5.469,level\ description="between 3 and 6 feet" 1439882640
h2o_feet,location=coyote_creek water_level=5.643,level\ description="between 3 and 6 feet" 1439883000
h2o_feet,location=coyote_creek water_level=5.814,level\ description="between 3 and 6 feet" 1439883360
h2o_feet,location=coyote_creek water_level=5.974,level\ description="between 3 and 6 feet" 1439883720
h2o_feet,location=coyote_creek water_level=6.138,level\ description="between 6 and 9 feet" 1439884080
h2o_feet,location=coyote_creek water_level=6.293,level\ description="between 6 and 9 feet" 1439884440
h2o_feet,location=coyote_creek water_level=6.447,level\ description="between 6 and 9 feet" 1439884800
h2o_feet,location=coyote_creek water_level=6.601,level\ description="between 6 and 9 feet" 1439885160
Now we can import that data:
$ docker run -it --rm --network tsdb-test -v ${HOME}:${HOME} -w ${HOME} influxdb influx -host influxdb -import -path ./influx_data.txt
2017/07/18 12:35:20 Processed 1 commands
2017/07/18 12:35:20 Processed 42 inserts
2017/07/18 12:35:20 Failed 0 inserts
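To illustrate where those counters come from, here is a simplified sketch (my own approximation, not the actual influx -import code) of how such a file splits into DDL commands and line-protocol inserts:

```python
# Sketch: how an import file divides into commands and inserts.
# Comment lines (#) are skipped; lines after "# DDL" count as commands,
# lines after "# DML" count as line-protocol inserts.

def count_import_file(text: str):
    commands = inserts = 0
    section = None
    for raw in text.splitlines():
        stripped = raw.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            upper = stripped.upper()
            if "DDL" in upper:
                section = "ddl"
            elif "DML" in upper:
                section = "dml"
            continue  # comments and section markers are not counted
        if section == "ddl":
            commands += 1
        else:
            inserts += 1
    return commands, inserts

sample = """\
# DDL
CREATE DATABASE NOAA_water_database
# DML
# CONTEXT-DATABASE: NOAA_water_database
h2o_feet,location=coyote_creek water_level=2.943 1439870400
h2o_feet,location=coyote_creek water_level=2.831 1439870760
"""
print(count_import_file(sample))  # (1, 2)
```

Run against the full file above, this would report 1 command and 42 inserts, matching the importer's output.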
At this point, we can start the Influx CLI to make some queries:
show databases lists all the databases. use NOAA_water_database moves the scope of the CLI to a specific database, in our case NOAA_water_database.
$ docker run -it --rm --network tsdb-test -v ${HOME}:${HOME} -w ${HOME} influxdb influx -host influxdb
Connected to http://influxdb:8086 version 1.2.4
InfluxDB shell version: 1.2.4
> show databases
name: databases
name
----
_internal
NOAA_water_database
> use NOAA_water_database
Using database NOAA_water_database
select * from h2o_feet limit 10 gets 10 points from the h2o_feet measurement.
> select * from h2o_feet limit 10
name: h2o_feet
time       level description location     water_level
----       ----------------- --------     -----------
1439870400 below 3 feet      coyote_creek 2.943
1439870760 below 3 feet      coyote_creek 2.831
1439871120 below 3 feet      coyote_creek 2.717
1439871480 below 3 feet      coyote_creek 2.625
1439871840 below 3 feet      coyote_creek 2.533
1439872200 below 3 feet      coyote_creek 2.451
1439872560 below 3 feet      coyote_creek 2.385
1439872920 below 3 feet      coyote_creek 2.339
1439873280 below 3 feet      coyote_creek 2.293
1439873640 below 3 feet      coyote_creek 2.287
As you probably noticed, the queries look like SQL. The similarity is deliberate: it makes interacting with the database feel familiar. You can read more about the query language in the InfluxDB documentation.
Collect Information From Any Server with Telegraf
Building an efficient engine to store and manage time series is one of the challenges of a monitoring system. How to collect and store information from a source is also important. There are several different sources that you need to collect data from: your application, hardware, virtual machines, and so on.
Telegraf is an agent written in Go, and its main focus is to simplify this task. It's made of input and output plugins. Input plugins include MySQL, CouchDB, Spark, HAProxy, Disqus, Docker, AWS, and so on -- the list is very long. The input plugins represent the services you can get data from.
Output plugins are the stores where you can save your data, and InfluxDB is just one of them. As an open-source and standalone project, Telegraf supports storage platforms other than InfluxDB, such as AMQP, Kafka, Kinesis, MQTT, OpenTSDB, Prometheus, and others.
Telegraf is configuration driven: there is a configuration file where you describe which systems and services to get info from:
[global_tags]
  department = "it"

[agent]
  interval = "10s"
  round_interval = true
  metric_buffer_limit = 5000
  flush_buffer_when_full = true
  collection_jitter = "0s"
  flush_interval = "30s"
  flush_jitter = "30s"
  debug = false
  hostname = ""

# Send metrics to the monitoring instance
[[outputs.influxdb]]
  urls = ["http://influxdb:8086"]
  database = "telegraf"
  retention_policy = "autogen"
  precision = "s"
  timeout = "10s"
  username = ""
  password = ""

[[inputs.cpu]]
  percpu = false
  totalcpu = true
  fieldpass = ["usage_idle", "usage_user", "usage_system"]

[[inputs.diskio]]

[[inputs.diskio]]
  name_prefix = "local_"

[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"
  container_names = []
  namepass = [
    "docker",
    "docker_container_cpu",
    "docker_container_mem"
  ]

[[inputs.mem]]
[[inputs.netstat]]
[[inputs.system]]
Copy this one into $HOME/telegraf.conf. This is a TOML file that contains some main sections:
global_tags is a free set of tags that will be added to every point. For this Telegraf, we are using department to identify the laptop in a company. But if you think about an infrastructure, you could use provider if your application is running on bare metal, AWS, or Google Cloud, or region if you have a multi-region architecture.

agent contains information about the single Telegraf agent.

output and input describe, as said before, the sources and destinations of your points. This is a very easy and standard configuration, with the system, memory, CPU, Docker, and network input plugins and InfluxDB as the output.
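As a rough model of what [global_tags] does (an illustrative sketch, not Telegraf's actual code), every point's tags are merged with the global ones, with point-level tags winning on conflicts:

```python
# Sketch: how global tags end up on every point Telegraf collects.
# Tags defined on the point itself override global tags with the same key.

GLOBAL_TAGS = {"department": "it"}

def apply_global_tags(point_tags: dict, global_tags: dict = GLOBAL_TAGS) -> dict:
    merged = dict(global_tags)
    merged.update(point_tags)  # point-level tags override global ones
    return merged

print(apply_global_tags({"host": "my-laptop", "cpu": "cpu-total"}))
# {'department': 'it', 'host': 'my-laptop', 'cpu': 'cpu-total'}
```

This is why every measurement written by this agent carries department=it alongside its own tags, which you can later use to group or filter queries.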
There are different ways to install and run Telegraf depending on your distribution. As before, we will use Docker to run our example:
docker run -it -d --network tsdb-test \
  --name telegraf --hostname my-laptop \
  -v /sys:/rootfs/sys:ro \
  -v /proc:/rootfs/proc:ro \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -v /var/run/utmp:/var/run/utmp:ro \
  -v ${HOME}/telegraf.conf:/etc/telegraf/telegraf.conf:ro \
  telegraf
We shared a couple of volumes from the host -- sys
, proc
, utmp
-- because Telegraf needs to get process and system information from the host itself. If you don't share these directories, Telegraf will get this info from the container itself. For this example, that's not the right behavior because we are using Telegraf to monitor our laptop.
As you can see, we shared the Docker socket because Telegraf has a plugin capable of getting events and useful information from Docker itself, such as the number of running containers, images, and so on.
Now that our Telegraf is running, we can log into InfluxDB to have a look at the metrics stored by Telegraf.
$ docker run -it --rm --network tsdb-test -v ${HOME}:${HOME} -w ${HOME} influxdb influx -host influxdb
Connected to http://influxdb:8086 version 1.2.4
InfluxDB shell version: 1.2.4
> show databases
name: databases
name
----
_internal
telegraf
Now you can see that we have two databases; telegraf
is the one used by the agent.
> use telegraf
Using database telegraf
> show measurements
name: measurements
name
----
cpu
diskio
docker
docker_container_cpu
docker_container_mem
local_diskio
mem
netstat
system
You can see that we have some measurements, and they are related to the plugins that we enabled in the configuration file. You can go deeper and query them if you like.
What Is Chronograf?
Now we know how to get and store data, but we need to make them useful. One of the ways is to be able to read them in some nice graph and collect more of them in dashboards to share within our company. One of the most famous open-source tools to create dashboards with is Grafana; there is another one built for InfluxDB and the Tick Stack called Chronograf.
docker run -it -p 8888:8888 -d --name chronograf --network tsdb-test chronograf
Open your browser on http://localhost:8888
and properly configure the first source:
url: influxdb:8086
name: influx
password and username are empty for this example.
You can also notice that it asks for the telegraf plugin. Leave that part as it is, because we are using the default configuration.
If you are asking yourself, "When do I use Chronograf versus Grafana?", Chronograf is part of the TICK stack, and there are some utilities that increase the interoperability between these projects. For example, you can see the list of Telegraf agents storing information in InfluxDB. They're split by hostname, and from this page, you already have high-level visibility into your cluster. The green circle tells you that the agent is running as expected; it turns red if Telegraf stops sending data.
As I said, Chronograf is built to work well with the TICK stack, and if you look at the Apps column, you can see system and docker. Chronograf detects what you are monitoring via Telegraf, and it has a set of built-in dashboards for these plugins.
You can create dashboards and graphs to go deeper and combine information not only from a server but also from your applications -- to analyze, for example, how a specific application behavior can change how a server works, or vice versa.
Chronograf only works with InfluxDB. It's designed to be the single UI to manage and interact with the entire stack. We saw the powerful integration with Telegraf, but you can also look at the query builder to create InfluxDB queries in an easy, step-by-step way. You can manage your InfluxDB instance, configure ACLs, and create new users.
Using Kapacitor to Send Alerts
Kapacitor is the last piece of the puzzle. We now know how to store, collect, and read time series; now we need to act on them, with alerting or proactive monitoring.
Alerting is easy to understand. If your server runs at high CPU usage (say, more than 70 percent), you can page somebody via Slack, PagerDuty, email, or other channels in order to get a human on the problem.
For some tasks, you really don't need a human. If you're on the cloud, or if you can spin up more servers via an API, you can trigger an action via an HTTP POST, for example.
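The threshold logic behind such an alert can be sketched like this (an illustrative approximation of what Kapacitor evaluates, not its actual implementation). It only emits events on state transitions, which is why you also get a recovery (OK) event when the value drops back under the threshold:

```python
# Sketch: threshold alerting with recovery, Kapacitor-style.
# The handler here just collects messages; a real one would POST
# to Slack, PagerDuty, a log file, or your own scaling API.

def alert_stream(values, threshold=70.0):
    events, level = [], "OK"
    for v in values:
        new_level = "CRITICAL" if v > threshold else "OK"
        if new_level != level:  # only alert on state transitions
            events.append("high cpu is {} value: {}".format(new_level, v))
            level = new_level
    return events

print(alert_stream([30.0, 77.1, 82.5, 29.2]))
# ['high cpu is CRITICAL value: 77.1', 'high cpu is OK value: 29.2']
```

Note that 82.5 produces no event: the series is already CRITICAL, so nobody gets paged twice for the same incident.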
Let's start Kapacitor via a Docker container:
docker run -d --network tsdb-test \
  -p 9092:9092 --hostname kapacitor \
  -e KAPACITOR_INFLUXDB_0_URLS_0=http://influxdb:8086 \
  --name kapacitor \
  kapacitor
Chronograf has a set of features to manage and interact with Kapacitor. You can open the Alerting tab in Chronograf, and it will show you how to add a new Kapacitor instance.
url: kapacitor:9092
name: My kapacitor
You can configure the target of your alerts, and as I said before, Kapacitor supports different services like Slack, HipChat, and PagerDuty.
You can use the alert builder to create rules. In this example, we trigger an alert, written to a file, whenever the CPU for the laptop with hostname my-laptop is over 70 percent.
I use stress, a utility that you can easily install on Ubuntu/Debian via apt, to stress my laptop and trigger an alert:
$ stress -c 3
You can leave it running for three minutes. After that, look at the page Alerting > Alert history, where you will see that Kapacitor triggered some alerts and also recovered them (status OK) once the alert resolved -- in our case, when CPU usage came back under 70 percent.
If you remember, we configured Kapacitor to store alerts in a file, in my case /tmp/alerts.log. If I look at the file in the Kapacitor container, the alerts are stored there:
$ docker exec -it kapacitor cat /tmp/alerts.log
{"id":"high cpu:nil","message":" high cpu:nil is CRITICAL value: 77.1220492215092","details":"{\"Name\":\"cpu\",\"TaskName\":\"chronograf-v1-8b222593-a176-4678-b94d-88043c79e289\",\"Group\":\"nil\",\"Tags\":{\"cpu\":\"cpu-total\",\"department\":\"it\",\"host\":\"my-laptop\"},\"ServerInfo\":{\"Hostname\":\"kapacitor\",\"ClusterID\":\"9eca4366-b17f-4b9e-9e16-930931353272\",\"ServerID\":\"59574edb-5a0c-4269-ab5c-6e57cf7b49c2\"},\"ID\":\"high cpu:nil\",\"Fields\":{\"value\":77.1220492215092},\"Level\":\"CRITICAL\",\"Time\":\"2017-07-31T13:26:40Z\",\"Message\":\" high cpu:nil is CRITICAL value: 77.1220492215092\"}\n","time":"2017-07-31T13:26:40Z","duration":0,"level":"CRITICAL","data":{"series":[{"name":"cpu","tags":{"cpu":"cpu-total","department":"it","host":"my-laptop"},"columns":["time","value"],"values":[["2017-07-31T13:26:40Z",77.1220492215092]]}]}}
{"id":"high cpu:nil","message":" high cpu:nil is OK value: 29.250830989511382","details":"{\"Name\":\"cpu\",\"TaskName\":\"chronograf-v1-8b222593-a176-4678-b94d-88043c79e289\",\"Group\":\"nil\",\"Tags\":{\"cpu\":\"cpu-total\",\"department\":\"it\",\"host\":\"my-laptop\"},\"ServerInfo\":{\"Hostname\":\"kapacitor\",\"ClusterID\":\"9eca4366-b17f-4b9e-9e16-930931353272\",\"ServerID\":\"59574edb-5a0c-4269-ab5c-6e57cf7b49c2\"},\"ID\":\"high cpu:nil\",\"Fields\":{\"value\":29.250830989511382},\"Level\":\"OK\",\"Time\":\"2017-07-31T13:26:50Z\",\"Message\":\" high cpu:nil is OK value: 29.250830989511382\"}\n","time":"2017-07-31T13:26:50Z","duration":10000000000,"level":"OK","data":{"series":[{"name":"cpu","tags":{"cpu":"cpu-total","department":"it","host":"my-laptop"},"columns":["time","value"],"values":[["2017-07-31T13:26:50Z",29.250830989511382]]}]}}
{"id":"high cpu:nil","message":" high cpu:nil is CRITICAL value: 77.1220492215092","details":"{\"Name\":\"cpu\",\"TaskName\":\"chronograf-v1-8b222593-a176-4678-b94d-88043c79e289\",\"Group\":\"nil\",\"Tags\":{\"cpu\":\"cpu-total\",\"department\":\"it\",\"host\":\"my-laptop\"},\"ServerInfo\":{\"Hostname\":\"kapacitor\",\"ClusterID\":\"9eca4366-b17f-4b9e-9e16-930931353272\",\"ServerID\":\"59574edb-5a0c-4269-ab5c-6e57cf7b49c2\"},\"ID\":\"high cpu:nil\",\"Fields\":{\"value\":77.1220492215092},\"Level\":\"CRITICAL\",\"Time\":\"2017-07-31T13:26:40Z\",\"Message\":\" high cpu:nil is CRITICAL value: 77.1220492215092\"}\n","time":"2017-07-31T13:26:40Z","duration":0,"level":"CRITICAL","data":{"series":[{"name":"cpu","tags":{"cpu":"cpu-total","department":"it","host":"my-laptop"},"columns":["time","value"],"values":[["2017-07-31T13:26:40Z",77.1220492215092]]}]}}
.....
Chronograf offers an easy-to-use rule builder for Kapacitor and a historical visualization of what Kapacitor does, but Kapacitor is a standalone project: it runs an API on port 9092 by default, and you can interact with it via its CLI.
$ docker exec -it kapacitor kapacitor list tasks
ID                                                 Type      Status    Executing Databases and Retention Policies
chronograf-v1-8b222593-a176-4678-b94d-88043c79e289 stream    enabled   true      ["telegraf"."autogen"]
kapacitor list tasks, for example, is the command that shows the tasks currently managed by Kapacitor. We have only one task, created via Chronograf.
docker exec -it kapacitor kapacitor show chronograf-v1-8b222593-a176-4678-b94d-88043c79e289
ID: chronograf-v1-8b222593-a176-4678-b94d-88043c79e289
Error:
Template:
Type: stream
Status: enabled
Executing: true
Created: 31 Jul 17 13:21 UTC
Modified: 31 Jul 17 13:27 UTC
LastEnabled: 31 Jul 17 13:27 UTC
Databases Retention Policies: ["telegraf"."autogen"]
TICKscript:
var db = 'telegraf'
var rp = 'autogen'
var measurement = 'cpu'
var groupBy = []
var whereFilter = lambda: ("cpu" == 'cpu-total') AND ("host" == 'my-laptop')
var name = 'high cpu'
var idVar = name + ':{{.Group}}'
var message = ' {{.ID}} is {{.Level}} value: {{ index .Fields "value" }}'
var idTag = 'alertID'
var levelTag = 'level'
var messageField = 'message'
.....
kapacitor show task-id prints the output of a task. It shows useful information like the raw TICKscript and a representation of how many points the script is handling and the alerts triggered.
DOT:
digraph chronograf-v1-8b222593-a176-4678-b94d-88043c79e289 {
graph [throughput="0.00 points/s"];

stream0 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
stream0 -> from1 [processed="6795"];

from1 [avg_exec_time_ns="60.552µs" errors="0" working_cardinality="0" ];
from1 -> eval2 [processed="6795"];

eval2 [avg_exec_time_ns="18.618µs" errors="0" working_cardinality="1" ];
eval2 -> alert3 [processed="6795"];

alert3 [alerts_triggered="10" avg_exec_time_ns="1.75714ms" crits_triggered="5" errors="0" infos_triggered="0" oks_triggered="5" warns_triggered="0" working_cardinality="1" ];
alert3 -> http_out5 [processed="10"];
alert3 -> influxdb_out4 [processed="10"];

http_out5 [avg_exec_time_ns="5.085µs" errors="0" working_cardinality="1" ];

influxdb_out4 [avg_exec_time_ns="3.02µs" errors="0" points_written="10" working_cardinality="0" write_errors="0" ];
}
Conclusion
If you are looking for a set of free, open-source projects to manage your monitoring system, now you know that the TICK stack offers storage, a collector, visualization tools, and an alert system. All of them offer APIs, so you can build your own implementation if you need a specific or different approach for one of these tools.
Monitoring is a very hot topic -- you cannot manage a fast-growing system without a deep understanding of how your applications are working. When you consider a monitoring system, the hard part is that it needs to be up when all your other systems are down. If your monitoring goes down with your infrastructure, you have an even bigger problem. That's why you need the best tools and methodology available to manage your applications on your own.