What is the easiest way to parse, ship and analyze my web server logs? You should know that I’m a Node.js fanboy and not very thrilled with the idea of running a heavy process like Logstash on the low-memory server that hosts my private Ghost blog. I looked into Filebeat, a very lightweight log forwarder written in Go with an impressively low memory footprint of only a few MB, but Filebeat ships only unparsed log lines to Elasticsearch. In other words, it still needs something like Logstash to parse web server logs, which include many fields and numeric values! Of course, structuring logs is essential for analytics. The setup for rsyslog with Elasticsearch and regex parsers is a bit more time-consuming, but the result is very efficient compared to Logstash. Are there any better alternatives, with a quick setup, well-structured logs and a low memory footprint?
Guess what? There is! Meet logagent-js – a log parser and shipper with log patterns for a number of popular log formats – from various Docker Images including Nginx, Apache, Linux and Mac system logs, to Elasticsearch, Redis, Solr, MongoDB and more. Logagent-js detects the log format automatically using the built-in pattern definitions (and also lets you provide your own, custom patterns).
Logagent-js includes a command line tool with default settings for Logsene as the Elasticsearch backend for storing the shipped logs. Logsene is compatible with the Elasticsearch API, but can do much more, such as role-based access control, account sharing for DevOps teams, ad-hoc charts in the Logsene UI, alerts on logs, and finally it integrates Kibana to ease the life of everybody dealing with log data!
Getting the ingredients
Now let’s see what I run on my private blog site: logagent-js as a single command to tail, parse and ship logs, all with less than 40 MB of RAM. Compare that to Logstash, which would not even start with just 40 MB of JVM heap. Logagent-js can be installed as a command line tool with npm, which is included in Node.js (>0.12):
npm i logagent-js -g
Logagent-js needs only the Logsene Token as a parameter to ship logs to Logsene. When running it as a background process or daemon, it makes sense to limit the Node.js memory with --max-old-space-size, set to something between 60 and 100 (the value is in MB), just in case. Without such a setting Node.js could consume more memory to improve performance in a long-running process:
node --max-old-space-size=60 /usr/local/bin/logagent -s -t your-logsene-token-here logs/access_log &
You can also run logagent-js as an upstart or systemd service, of course.
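For the systemd case, a unit file might look roughly like the one this script generates. This is a minimal sketch, not an official logagent-js unit: the unit name logagent.service, the node location /usr/bin/node and the log path /var/log/nginx/access.log are assumptions for illustration, while the command line itself is the one shown above.

```shell
#!/bin/sh
# Sketch: write a systemd unit for logagent-js into the current directory.
# Assumptions (not from the article): node lives at /usr/bin/node, the unit is
# named logagent.service, and the access log is /var/log/nginx/access.log.
UNIT_FILE="${UNIT_FILE:-./logagent.service}"
LOGSENE_TOKEN="${LOGSENE_TOKEN:-your-logsene-token-here}"
ACCESS_LOG="${ACCESS_LOG:-/var/log/nginx/access.log}"

cat > "$UNIT_FILE" <<EOF
[Unit]
Description=logagent-js log shipper
After=network.target

[Service]
# Same command line as the background example, but with absolute paths,
# since the service does not start in your shell's working directory.
ExecStart=/usr/bin/node --max-old-space-size=60 /usr/local/bin/logagent -s -t $LOGSENE_TOKEN $ACCESS_LOG
Restart=always

[Install]
WantedBy=multi-user.target
EOF

echo "Wrote $UNIT_FILE"
# To install (requires root):
#   cp logagent.service /etc/systemd/system/
#   systemctl daemon-reload
#   systemctl enable logagent.service
#   systemctl start logagent.service
```

Restart=always is a design choice rather than a requirement: it brings the shipper back if Node.js exits after hitting the memory limit.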
A few seconds after you start it you’ll see all your logs, parsed and structured into fields, with correct timestamps, numeric fields, etc., all without any additional configuration! A real gift and a huge time saver for busy ops people!
Next, let’s create some fancy charts with data from our logs. Logsene has ad-hoc charting functions (look for the little blue chart icons in the above screenshot) that let you draw Pie, Area, Line, Spline, Bar, and other types of charts. Logsene is smart and automatically chooses pie charts to display distinct values and bar/line charts for numeric values over time.
In the above screenshot we see the top viewed pages and the distribution of HTTP status codes. We were able to generate these charts literally with just a few mouse clicks. The charts use the current query, so we could search for specific URLs and exclude e.g. images, stylesheets or traffic from robots using Logsene’s query language (e.g. ‘NOT css AND NOT jpg AND NOT png AND NOT seoscanners’ or, more simply: -css -jpg -png -seoscanners).
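If you want a rough idea of what such an exclusion query leaves over before opening the UI, you can approximate it locally on the raw log file. This is only a sketch: grep matches plain substrings anywhere in the line, whereas Logsene matches terms in indexed fields, so the results will not be identical. The function name filter_noise is made up for this example.

```shell
# Rough local stand-in for the query "-css -jpg -png -seoscanners":
# drop every line that contains one of the excluded terms.
filter_noise() {
  grep -vE 'css|jpg|png|seoscanners'
}

# Usage: show the first remaining requests from the access log
#   filter_noise < logs/access_log | head
```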
This visualization could be saved and added to a Kibana dashboard. If you know Kibana this takes a few minutes per visualization. The result is a stored dashboard that could be shared with colleagues who might not know how to create such dashboards.
The final thing I usually do is define alert queries e.g. to get notified about a growing number of HTTP error messages. For my private blog I use e-mail notifications, but Logsene integrates well with PagerDuty, HipChat, Slack or arbitrary WebHooks.
Finally, a few more words about logagent-js, which I consider a ‘swiss army knife’ for logs. It integrates seamlessly with Logsene, while at the same time it can also work with other log destinations. It provides what I believe is a good compromise in terms of performance and setup time. I’d say it’s somewhere between rsyslog and Logstash.
All tools for log processing require memory for this processing, but looking at the initial memory usage after starting the tools gives you an impression of the minimum resource usage. Here are some numbers taken from my server:
- rsyslog 2.2 MB using https://github.com/megastef/rsyslog-logsene on Docker
- logagent-js 39 MB (Node.js 4.2)
- logstash 263 MB (Java 1.8)
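To collect comparable numbers on your own server, one option is to read the resident set size (RSS) of each process right after it starts. The post does not say how the figures above were measured, so treat this as one possible method; the helper name rss_mb is made up for this example.

```shell
# Print the resident set size (RSS) of a process in MB.
# ps reports RSS in kilobytes; the "rss=" form suppresses the header line.
rss_mb() {
  ps -o rss= -p "$1" | awk '{ printf "%.1f MB\n", $1 / 1024 }'
}

# Example: memory used by the current shell
rss_mb $$
```

For a running agent you would pass its process ID instead, for example rss_mb "$(pgrep -f logagent | head -n 1)".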
Contributions to the pattern library for even more log formats are welcome. We are happy to help with additional log formats or input sources besides the existing inputs (standard input, file, Heroku, CloudFoundry and syslog UDP). Feel free to contact me @seti321 or @sematext to get up and running with your special setup!
If you don’t want to run and manage your own Elasticsearch cluster but would like to use Kibana for log and data analysis, then give Logsene a quick try by registering here – we do all the backend heavy lifting so you can focus on what you want to get out of your data and not on infrastructure. There’s no commitment and no credit card required.
We are happy to answer questions or receive feedback. Please drop us a line or reach us @sematext.