By Chris Tozzi
So you’ve got a ClickHouse DB, and you’re looking for a tool to monitor it. You’ve come to the right place. In this blog post (which is part 2 of a three-part series on ClickHouse monitoring), we’ll identify and discuss the various ClickHouse monitoring tools available today.
(To learn more about which metrics to monitor for ClickHouse, check out part 1 of this series. You can learn how to monitor ClickHouse with Sematext in part 3.)
The Challenge of Finding ClickHouse Monitoring Tools
ClickHouse itself provides no built-in monitoring solution other than exposing metrics in various database tables that you can query yourself. It does generate metrics that you can collect but offers no feasible way to keep track of ClickHouse alerts or interpret them.
In addition, many of the major commercial monitoring, log analysis and APM tools available today, such as Datadog and Splunk, lack ClickHouse integrations. This is probably due partly to the fact that ClickHouse, which became open source only in 2016, remains a relatively young DBMS.
Still, you can’t responsibly or effectively use ClickHouse without monitoring it. Let’s look at metrics which are available within the ClickHouse installation, and an example of how you can monitor the health of your system.
Accessing ClickHouse Metrics with SQL Queries
A ClickHouse database includes an extensive collection of system tables which contain important monitoring information. You can find a comprehensive list of these tables here.
One table which is useful when checking the health of your database is system.replicas. The replicas table which contains information about replicated tables on the local server. Executing a query such as the one below can alert you to any problems within your system, such as locked tables, extended queue sizes or inactive replicas.
SELECT database, table, is_leader, is_readonly, is_session_expired, future_parts, parts_to_check, columns_version, queue_size, inserts_in_queue, merges_in_queue, log_max_index, log_pointer, total_replicas, active_replicas FROM system.replicas WHERE is_readonly OR is_session_expired OR (future_parts > 20) OR (parts_to_check > 10) OR (queue_size > 20) OR (inserts_in_queue > 10) OR ((log_max_index - log_pointer) > 10) OR (total_replicas < 2) OR (active_replicas < total_replicas)
Figure 1. Query to Identify Problems Within Your Database
Ideally, this query should return 0 rows if everything is working correctly.
Ok. 0 rows in set. Elapsed: 0.005 sec.
Figure 2. Ideal Response to the Query Above
The following tables contain additional metrics and useful performance information:
While useful for one-off ClickHouse monitoring checks, queries such as these aren’t practical for long-term monitoring of your database. Let’s investigate other options which you can consider.
Graphite and Grafana for ClickHouse Monitoring
Graphite is a popular enterprise monitoring tool. While Graphite itself does not natively integrate with ClickHouse, it is possible to integrate Graphite with ClickHouse by using custom integrations such as this one from Altinity.
Figure 3. Viewing DBMS Performance Using the Altinity Graphite Integration.
Grafana is primarily a visualization tool which you can pair with Graphite, or use independently to view performance metrics from your ClickHouse system.
Here again, however, this approach is not practical for long-term monitoring. While these tools make the monitoring process somewhat more accessible and more visually appealing than executing command line SQL queries as described above, they don’t support automated alerting, which is a crucial element of an effective monitoring solution.
ClickHouse Monitoring with Prometheus
Another approach which is worth exploring is Prometheus, an open-source monitoring tool which can ingest ClickHouse metrics and provide the user with visualizing and alerting support. An export server, such as this clickhouse exporter, written in Go can be used to query the system tables mentioned earlier in this article and expose the statistics through an HTTP which can be ingested by your Prometheus instance.
Using Zabbix to Monitor ClickHouse
Zabbix is another popular open source monitoring tool. Like Graphite, it does not directly integrate with ClickHouse, but you can configure Zabbix to monitor ClickHouse manually and use a predefined temple which is available here.
The following steps are required to install the Zabbix client on your ClickHouse DB server and begin monitoring your Database.
- Install xmllint
- Install the clickhouse-client
- Clone the repo for the Zabbix Agent
- Edit the zabbix_agentd.conf file in /etc/zabbix/ and add the following line, where
- /PATH/TO/zbx_clickhouse_monitor.sh is the location where you cloned the Zabbix Agent.
- HOST_WHERE_CH_IS_RUNNING is the host where the ClickHouse server is running.
UserParameter=ch_params[*],sh /PATH/TO/zbx_clickhouse_monitor.sh "$1" "HOST_WHERE_CH_IS_RUNNING"
Finally, import the zbx_clickhouse_template.xml temple file into Zabbix to begin to begin monitoring the ClickHouse instance.
Figure 4. Monitoring the Rate of Data Inserted into ClickHouse, through Zabbix
Sematext as a ClickHouse Monitoring Tool
Last but not least is Sematext, one of the few monitoring and log management tools explicitly designed to support ClickHouse monitoring.
Sematext offers several advantages:
- Support for monitoring more than seventy ClickHouse-related metrics.
- Official ClickHouse integration support direct from the tool vendor. Moreover, the Sematext ClickHouse integration is customizable and open source.
- The open source Sematext Monitoring Agent is customized for a variety of Linux distributions, as well as Kubernetes, OpenShift and Docker Swarm.
- Support for a broad range of ClickHouse metrics (detailed here).
- Built-in data visualization.
- Alerting based on Thresholds, Anomaly Detection, and Server Health
In part 3 of this blog post series, you’ll find details on using Sematext for monitoring ClickHouse.
ClickHouse remains a relatively new DBMS, and monitoring tools for ClickHouse are few in number at this time. Most of the monitoring tools that support ClickHouse at all lack official integrations with ClickHouse from their vendors, and in many cases the number of metrics that they can collect is limited. Sematext supports the ClickHouse DB system as a first-class citizen and as such provides an official ClickHouse integration and supports the collection of a broad range of ClickHouse metrics and alerts.
Chris Tozzi has worked as a journalist and Linux systems administrator. He has particular interests in open source, agile infrastructure, and networking. He is Senior Editor of content and a DevOps Analyst at Fixate IO. His latest book, For Fun and Profit: A History of the Free and Open Source Software Revolution, was published in 2017.