Presentation: Top Node.js Metrics to Watch

Fresh from Germany’s largest Node.js Meetup, hosted by Wikimedia in Berlin, is the latest presentation from Sematext DevOps Evangelist Stefan Thies — “Top Node.js Metrics To Watch”. The event was shared with the Node.js Meetup in London via video-live stream, the full recording is available on YouTube.

Stefan’s talk goes through challenges of developing Node.js monitoring solutions, such as SPM for Node.js  and key metrics including examples from monitoring Kibana’s Node.js Server

Here is the video:

and the slides:

Please note, the article “ Top Node.js Metrics to Watch” was originally published by O’Reilly Radar.

Have a look at our other Node.js posts — there is a lot more interesting material to discover, like  MongoDB monitoring made with Node.js.

Questions or Feedback?
If you have any questions or feedback for us, please contact us by email or hit us on Twitter.  We love talking about performance monitoring — and log management!


Slack Analytics & Search with Elasticsearch, Node.js and React

Sematext team is highly distributed. We are ex-Skype users who recently switched to Slack for team collaboration. We’ve been happy with Slack features and especially integrations for watching our Github repositories, Jenkins, or receiving SPM or Logsene Alerts from our production servers through their ChatOps support. The ability to add custom integrations is really awesome! Being search experts it is hard for us to accept any limitation in search functionality in tools we use. For example, I personally miss the ability to search over all teams and all channels and I really miss having no analytics on user activity or channel usage. Elasticsearch has become a popular data store for analytical queries.  What if we could take all Slack messages and index them into Elasticsearch? This would make it possible to perform advanced analytics with Kibana or Grafana, such as getting like top terms used, most active users or channels. Finally, a simple mobile web page to access only the indexed data from various Teams and Channels might be handy to have, too.

In this post we’re going to see how to build what we just described.  We’ll use the Slack API, Node.js, React and Elasticsearch in 3 steps:

  • Index Data from Slack
  • Analyse Data from Slack
  • Create a custom Web-App for searchslack-indexing-logsene.png

Index Data from Slack

The Slack API provides several ways to access data. For example, outgoing webhook. This looks useful at first, however, this needs a setup per channel or keywords as trigger. Then I discovered a better way – the Node.js Slack Client.  Simply log in with your Slack account and get all Slack messages! I wrote a little Node.js app to dump the relevant information as JSON to the console or to a file.  Having the JSON output, it can be piped to Logagent-js a smart log shipper written in Node.js. I packaged this as “slack-elasticsearch-indexer” so it’s super easy to run:

npm install slack-elasticsearch-indexer
# Set Elasticsearch Server, btw. the Logsene Receiver is the default
# 1 - Slack API Token from
# 2 - Index name or Logsene Token from

The LOGSENE_TOKEN is what you can get from Logsene – the “ELK log management service”.  Using Logsene means you don’t have to bother running your own Elasticsearch, plus the volume of most team’s Slack data is probably so small that it fits in Logsene’s free plan! 🙂

Once you run the above you should see new Slack Messages on the console.  At the same time the messages will also be sent to Logsene and you will see them in the Logsene UI (or your local Elasticsearch server or cluster) right away.

Analyze Slack Messages in Logsene

Now that our Slack messages are in Logsene we can build our Kibana Dashboards to visualize channel utilization, top terms, the chattiest people, and so on.  But … did you know, that Logsene comes with a nice ad-hoc charting function? Simply open one of the Slack messages in Logsene, and click on the little chart symbol in the field userName and channel (see below).


This will very quickly render top users and channels for you:


Slack Alerting

Imagine a support chat channel – wouldn’t it be nice to be notified when people start mentioning “Error”, “Problems” and “Broken” things increasingly frequently? This is where we can make use of Logsene Alerts and its ability to do anomaly detection. Any triggered alerts can be delivered via email, PagerDuty, Slack, HipChat or WebHooks:

logsene-alert-definition.pngWhile Logsene is great for alerts, analytics and Slack message search, as a general ‘data viewer’ the message rendering in Logsene does not show application-specific things like users’ profile pictures, which would allow much faster recognition of user messages. Thus, as our next step, we’ll create a simple Web Client with nice rendering of indexed Slack messages. Let’s see how this can be done very quickly using some cutting edge Web technology together with Logsene.

Create a Custom Web-App for Search

We recently started using Facebook’s React.js for rendering of various UI parts like the views for Top Database Operations and we came across a new set of React UI Components for Elasticsearch called SearchKit. Thanks to Logsene’s Elasticsearch API SearchKit works out of the box with Logsene!
After a few lines of CSS and some JavaScript a simple Slack Search UI is born. Check it out!


Edit the source code

You just need to use your Logsene token as the Elasticsearch index name to run this app on your own data. For production we recommend adding a proxy to Elasticsearch (or Logsene) on the server side as described in the SearchKit UI documentation to hide connection details from the client application.

While this post shows how to index your Slack messages in Logsene for the purpose of archiving, searching, and analytics, we hope it also serves as an inspiration to build your own custom Search application with SearckKit, React, Node.js and Logsene?

If you haven’t used Logsene before, give it try – you can get a free account and have your logs and other event data in Logsene in no time. Drop us an email or hit us on Twitter with suggestions, questions or comments.



Top Node.js Metrics to Watch

Monitoring Node.js Applications has special challenges. The dynamic nature of the language provides many “opportunities” for developers to produce memory leaks, and a single function blocking the event queue can have a huge impact on the overall application performance. Parallel execution of jobs is done using multiple worker processes using the “cluster” functionality to take full advantage of multi-core CPUs – but the master and worker processes belong to a single application, which means that they should be monitored together. Let’s have a deep look at the Top Metrics in Node.js Applications to get a better understanding of why they are so important to monitor.

Note: All images in this post are from Sematext’s SPM Performance Monitoring solution and its Node.js integration.

Garbage Collection & Process Memory – Node.js is based on Google’s Chrome V8 Javascript engine. Garbage Collection reclaims memory used by objects that are not longer required. The V8 garbage collection stops the program execution. Incremental GC cycles (scavenging) process only a part of the Heap and are very fast. Full GC cycles deal with objects that survived multiple Incremental GC runs. Full GC runs are executed less frequently to minimize pauses in the program execution.

With regard to garbage collection metrics, we should first measure all the time spent for garbage collection. In addition, it is useful to see how often a full GC cycle — or incremental GC cycle — is executed. The size of heap memory can be compared with the size of the last GC run to see if there is a growing trend. That’s why the following metrics should be monitored:

  • Time consumed for garbage collection
  • Counters for full GC cycles
  • Counters for incremental GC cycles


Garbage Collection Metrics

Aside from how often GC happens and how long it takes, we can measure the effect on memory by providing the following metrics:

  • Released memory between GC cycles (see above)
  • Process Heap Size and Heap Usage


Process Memory Information

Event Loop – The secret of Node.js’s performance is its ability to be CPU bound and use async operations; in that way CPU can be highly utilized and doesn’t waste cycles waiting for I/O operations. This means a server can take many connections and will not be blocked for async operations. As soon as the operation is finished, callback functions are used to continue processing.   The implementation is based on a single event loop, which processes the async function calls in a separate thread. Using synchronous operations drags down performance because other operations need to wait to be executed.  That’s why the golden rule for Node.js performance is “don’t block the event loop”.

The metric to watch is the Latency to process the next event:

  • Slowest Event Handling (Max Latency)
  • Average Event Loop Latency
  • Fastest Event Handling (Min Latency)


Slowest, average and fastest event processing

A high latency in the event loop might indicate the use of blocking (sync) or time-consuming functions in event handlers, which could impact the performance of the whole Node.js application.

Cluster Mode and number of processes – To scale Node.js beyond the capacity of a single process the use of master and worker processes is required – the so called “cluster” mode. Master processes share sockets with the forked worker process and can exchange messages with it. A typical use case for web servers is forking N worker processes, which operate on the shared server socket and handle the requests in round robin (since Node v0.12). In many cases programs choose N with the number of CPUs the server provides – that’s why a constant number of worker processes should be the regular case.  If this number changes it means worker processes have been terminated for some reason.  In the case of processing queues, workers might be started on demand.  In this scenario it would be normal that the number of workers changes all the time, but it might be interesting to see how long a higher number of workers was active. Using a tool like SPM for Node.js lets one track the number of workers. When picking a monitoring solution or developing your own monitoring for Node.js, make sure it is capable of filtering by hostname and worker ID.  Keep in mind Node.js workers can have a very short lifetime that traditional monitoring tools may not be able to handle well.


Worker Count

For example, to compare event loop latency in different Node.js sub-processes, we need to be able to select workers we want to compare:


Event Loop Latency for each Worker

Web Frameworks – There is a steadily growing number of frameworks to build web services using Node.js.  The most popular are: Express, Hapi.js, Restify,, Meteor, and many more. When doing HTTP monitoring, here are some of the key metrics to pay attention to:

  • Response time (http/https)
  • Request rate
  • Error rates (total, error categories)
  • Request/Response content size


HTTP/HTTPS Metrics Overview

Of course, Node.js apps don’t run in a vacuum.  They connect to other services, other types of applications, caches, data stores, etc.  As such, while knowing what key Node.js metrics are, monitoring Node.js alone or monitoring it separately from other parts of the infrastructure is not the best practice.  If there is one piece of advice I can give to anyone looking into (Node.js) monitoring it is this: when you buy a monitoring solution — or if you are building it for your own use — make sure you end up with a solution that is capable of showing you the big picture.  For example, Node.js is often used with Elasticsearch (see Top 10 Elasticsearch Metrics to Watch post), Redis, etc.  Seeing metrics for all the systems that surround Node.js apps is precious.  Here is just a small example of a dashboard showing a few Node.js and Elasticsearch metrics together.


Combined Dashboard of Node.js and Elasticsearch Metrics

So, those are our top Node.js metrics — what are YOUR top 10 metrics? We’d love to know so we can compare and contrast them with ours in a future post.  Please leave a comment, or send them to us via email or hit us on Twitter: @sematext.

And…if you’d like try SPM to monitor Node.js yourself, check out a Free 30-day trial by registering here.  There’s no commitment and no credit card required. Small startups, startups with no or very little outside funding, non-profit and educational institutions get special pricing – just get in touch with us.

[Note: this post originally appeared on]

How to Add Performance Monitoring to Node.js Applications

[Updated: Instructions for Node.js 4.x / 5.x / 6.x]

We have been using Node.js here at Sematext and, since eating one’s own dogfood is healthy, we wanted to be able to monitor our Node.js apps with SPM (we are in performance monitoring and big data biz). So, the first thing to do in such a case is to add monitoring capabilities for technology we use in-house (like we did for Java, Solr, Elasticsearch, Kafka, HBase, NGINX, and others).  For example we monitor Kibana4 servers (based on Node.js), which we have in production for our “1-click ELK stack”.

You may have seen our post about SPM for Node.js  —  but I thought I’d share a bit about how we monitor Node.js to help others with the same DevOps challenges when introducing new Node.js apps, or even the additional challenge of operating large deployments with a mix of technologies in the application stack:

1) Install the monitoring agent
npm i spm-agent-nodejs
It’s open-sourced on Github: sematext/spm-agent-nodejs

2) add a new SPM App for Node.js — each monitored App is identified by its App-Token (and yes — there is an API to automate this step)

3) set the Environment variable for the application token


4) add one line to the beginning of your source code when using Node 0.12 – Node 4.x / 5.x got a better option/see below …

var spmAgent = require ('spm-agent-nodejs')

5) Run your app and after 60 seconds you should start seeing node.js metrics in SPM

At this point what do I get? I can see pre-defined metric charts like these, with about 5 minutes of work 🙂


I saved time already —there’s  no need to define metric queries/widgets/dashboards

Now I can set up alerts on Latency or Garbage Collection, or I can have anomaly detection tell me when the number of Workers in a dynamic queue changes drastically. I typically set ‘Algolerts’ (basically machine learning-based anomaly detection) to get notified (e.g. via PagerDuty) when a service suddenly slows down because  they produce less noise than regular threshold alerts. In addition, I recommend adding Heartbeat alerts for each monitored service to be notified of any server outages or network problems. In our case, where a Node.js app runs tasks on Elasticsearch, it makes sense to create a custom dashboard to see Elasticsearch and Node.js metrics together (see 2nd screenshot above) — of course, this is applicable for other applications in the stack like NGINX, Redis or HAProxy — and can be combined with Docker container metrics


In fact, you can use the application token for multiple servers  to see how your cluster behaves using the “birds eye view” (a kind of top + df to show the health of all your servers)

Now, let’s have a look at how the procedure differs when using Node 4.x or Node 5.x

Node 4.x/5.x/6.x Supports Preloading Modules

When we use the new node.js preload command-line option, we can add instrumentaion without adding the require statement for ‘spm-agent-nodejs’ to the source code:

That’s why Step 4) could be done even better with  node 4.x/5.x:

node -r spm-agent-nodejs yourApp.js

This is just a little feature but it shows how the node.js community is listening to the needs of users and is able to release such things quickly.

If you want to try node 4.x/5.x, here is how to install it using n:

npm i n -g 
n lts      # for node 4.x or
n stable   # for node 6.x

The ‘node’ executable is now linked to the new node version 4.x/5.x /6.x — to switch back to node 0.12 simply use

n 0.12 

I hope this helps.  If you’d like to see some Node.js metrics that are currently not being captured by SPM then please hit me on Twitter — @seti321 — or drop me an email.  Or, even better, simply open an issue here:   Enjoy!

Monitoring Kibana 4's Node.js App

The release of Kibana 4.x has had an impact on monitoring and other related activities.  In this post we’re going to get specific and show you how to add Node.js monitoring to the Kibana 4 server app.  Why Node.js?  Because Kibana 4 now comes with a little Node.js server app that sits between the Kibana UI and the Elasticsearch backend.  Conveniently, you can monitor Node.js apps with SPM, which means SPM can monitor Kibana in addition to monitoring Elasticsearch.  Futhermore, Logstash can also be monitored with SPM, which means you can use SPM to monitor your whole ELK Stack!  But, I digress…

A few important things to note first:

  • the Kibana 4 project moved from Ruby to pure browser app to Node.js on the server side, as mentioned above
  • it now uses the popular Express Web Framework
  • the server component has a built-in proxy to Elasticsearch, just like it did with the Ruby app
  • when monitoring Kibana 4, the proxy requests to Elasticsearch are monitored at the same time

OK, here’s how to add Node.js monitoring to the Kibana 4 server-side app.

1) Preparation

Get an App Token for SPM by creating a new Node.js SPM App in SPM.

Kibana 4 currently ships with Node.js version 0.10.35 in a subdirectory – so please make sure your Node.js is on 0.10 while installing SPM Agent for Node.js (it compiles native modules, which need to fit to Kibana’s 0.10 runtime).

  npm-install n -g
  n 0.10.35

After finishing the described installation below you can easily switch back to 0.12 or io.js 2.0 by using “n 0.12” or “n io 2.0” – because Kibana will use its own node.js sub-folder.

2) Install SPM Agent for Node.js

Switch over to your Kibana 4 installation directory.  It has a “src” folder where the Node.js modules are installed.

  cd src
  npm install spm-agent-nodejs

Add the following line to ./src/app.js

  var spmAgent = require ('spm-agent-nodejs')

Add the following line to bin/kibana shell script at the beginning

export spmagent_tokens__spm=YOUR-SPM-APP-TOKEN

3) Run Kibana


4) Check results in SPM

After a minute you should see the performance metrics such as EventLoop Latencies, Memory Usage, Garbage Collection details and HTTP statistics of your Kibana 4 Server app in SPM.

Kibana 4 - monitored with SPM for Node.js
Kibana 4 – monitored with SPM for Node.js

SPM for Node.js Monitoring – Details, Screenshots and more

For more specific details about SPM’s Node.js monitoring integration, check out this blog post.

That’s all there is to it!  If you’ve got questions or feedback to this post, please let us know!

Custom Metrics from Node.js Apps

We recently added support for Node.js and io.js monitoring to SPM and have received great feedback.  While SPM for Node.js monitors all key Node.js metrics, most applications have additional metrics one often wants to track — things like: the number of concurrent users, the number of items placed in a shopping cart, or any other kind of IT metric, business transaction or KPI.  SPM already provides a Custom Metrics API and libraries that make shipping custom metrics from Java and from Ruby applications a snap.  But why leave Node.js behind?  Meet spm-metrics-js (it’s on Github) – the npm module for sending custom metrics from Node.js apps to SPM.  

This JavaScript module supports measurements using counters, meters, timers, and histograms. These helpers calculate values of metrics objects and ship them to SPM, where they are then turned into charts and inputs to alert rules and anomaly detection algorithms.

Here’s an example for counting users on login and logout:

Sending custom metrics is really that easy!

Now, let’s have a look at the options used when creating a custom metric object:

  • name – the name of the metric you can find in SPM’s user interface
  • aggregation – the aggregation type: ‘avg’, ‘sum’, ‘min’ or ‘max’ used in SPM’s aggregations server
  • filter1 – the SPM user interface provides two filter criteria; the value will be available in the UI as the first filter
  • filter2 – the filter value for the second filter field in SPM’s UI
  • interval – time in ms to call save() periodically. Defaults to no automatic call to save(). The save() function captures the metric and resets meters, histograms, counters or timers.
  • valueFilter – array of property names for calculated values. Only specified fields are sent to SPM (e.g. [‘count’, ‘min’, ‘max’].

Additional measurement functions are available to extend the custom metric object automatically with additional calculated properties:

  • Meter – measure rates and provide the following calculated properties:
    • mean: the average rate since the meter was started
    • count: the total of all values added to the meter
    • currentRate: the rate of the meter since the meter was started
    • 1MinuteRate: the rate of the meter biased toward the last 1 minute
    • 5MinuteRate: the rate of the meter biased toward the last 5 minutes
    • 15MinuteRate: the rate of the meter biased toward the last 15 minutes
  • Histogram – build percentile, min, max, & sum aggregations over time
    • min: the lowest observed value
    • max: the highest observed value
    • sum: the sum of all observed values
    • variance: the variance of all observed values
    • mean: the average of all observed values
    • stddev: the stddev of all observed values
    • count: the number of observed values
    • median: 50% of all values in the reservoir are at or below this value.
    • p75: see median, 75% percentile
    • p95: see median, 95% percentile
    • p99: see median, 99% percentile
    • p999: see median, 99.9% percentile
  • Timer – measures time and captures rates in an internal meter and histogram

If this is more than you actually need, we recommend selecting only the relevant properties (using the ‘valueFilter’ option). Please note that Custom Metrics are aggregated by the specified aggregation type (‘avg’, ‘sum’, ‘min’, ‘max’).  Moreover, the aggregation type for each property can be defined – for further details please check the package documentation.

Adding instrumentation always raises the question of performance; in spm-metrics-js all metrics are buffered and efficiently ship metrics to SPM in bulk using asynchronous functions. We recommend using a transmit time of 60 seconds.

Once you send custom metrics to SPM you can create alerts on them, have SPM detect and alert you about anomalies, put charts with those metrics on dashboards, share charts with those metrics publicly or just with your team or organization, etc.

Actions for Metrics – e.g. define alerts using anomaly detection
Dashboard with Custom Metric and other Metrics

Please note the free plan has no limits on the number of monitored Applications, Processes, Dashboards or Users and you can share Accounts with your whole DevOps team and integrate SPM with Slack, HipChat, PagerDuty, Webhooks, etc. If you don’t use SPM yet, grab a free account to start monitoring your Node.js and io.js applications and benefit from all standard SPM features such as alerting, anomaly detection, event and log correlation, unlimited dashboards, secure information sharing, etc. Check out spm-metrics-js (or on Github) and drop us a line (or tweet 140 characters to @sematext) — we’d love to hear from you!

Node.js and io.js Monitoring Support

Node.js and io.js are increasingly being used to run JavaScript on the server side for many types of applications, such as websites, real-time messaging and controllers for small devices with limited resources. For DevOps it is crucial to monitor the whole application stack and Node.js is rapidly becoming an important part of the stack in many organizations. Sematext has historically had a strong support for monitoring big data applications such as Elastic (aka Elasticsearch), Cassandra, Solr, Spark, Hadoop, and HBase, as well as more traditional databases, web servers like Nginx, Nginx Plus and Apache, Java applications, cache servers like Redis and Memcached, messaging middleware like everyone’s darling Kafka, etc.  With such rapid adoption of Node.js and now io.js, we’d be remiss not to add performance monitoring, alerting, and anomaly detection for them in SPM!


SPM for Node.js

We’re happy to announce we’ve just added Node.js monitoring to this growing list of SPM integrations.  SPM for Node.js covers key Node.js metrics such as Event Loop, Garbage Collection, CPU, Memory and web services metrics.  All metrics are organized in out-of-the-box charts, which can be put on additional dashboards and placed next to performance charts for other parts of the application stack.

Overview for top node.js and io.js metrics
Overview for top node.js and io.js metrics

Of course, you can view your Node.js metrics in a larger context.  For example, here is a dashboard that shows Node.js metrics together with Elasticsearch metrics, making it easier to correlate performance across multiple tiers of the application stack.  You could also get your event and log charts on the same dashboard for an even more thorough correlation.

Dashboard with node.js HTTP response time and Elasticsearch query latency

Needless to say, we made sure everything works for the latest versions of Node.js (0.12) and io.js (1.6). Installation is as easy as integration of any other module using npm.  If you are not using SPM yet, you can sign up with no commitment or credit card.  You have 30-days free on any new app you create.  If you are already using SPM, you can simply add a new SPM App for Node.js and see all your Node.js metrics in just a few minutes.  Don’t see something in SPM for Node.js?  Please let us know (@sematext) or comment below, we are looking for feedback!