Last updated on Dec 14, 2017
In the context of centralizing logs (say, to Sematext Logs or your own Elasticsearch), we often get the question of whether one should log directly from the application (e.g. via an Elasticsearch or syslog appender) or use a dedicated log shipper.
In this post, we’ll look at the advantages of each approach, so you’ll know when to use which.
Logging Libraries
Most programming languages have libraries to assist you with logging. Most commonly, they support local files or syslog, but more “exotic” destinations are often added to the list, such as Elasticsearch/Sematext Logs. Here’s why you might want to use them:
- Convenience: you’ll want a logging library anyway, so why not go with it all the way, without having to set up and manage a separate application for shipping? (well, there are some reasons below, but you get the point)
- Fewer moving parts: logging from the library means you don’t have to manage the communication between the application and the log shipper
- Lighter: logs serialized by your application can be consumed by Elasticsearch/Sematext Logs directly, instead of having a log shipper in the middle to deserialize/parse them and then serialize them again
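To make the library approach concrete, here is a minimal Python sketch of a logging handler that serializes each record to JSON and sends it straight to the destination. The `send` callable, the `DirectJsonHandler` and `build_logger` names, and the field names in the document are assumptions made for illustration; in a real setup `send` would be whatever network call your destination expects (for example an HTTP request to Elasticsearch).

```python
import json
import logging
from datetime import datetime, timezone


class DirectJsonHandler(logging.Handler):
    """Serialize each record to JSON and hand it to `send` right away.

    `send` is a hypothetical callable standing in for the network call;
    it is not part of any real client library.
    """

    def __init__(self, send):
        super().__init__()
        self.send = send

    def emit(self, record):
        doc = {
            "@timestamp": datetime.fromtimestamp(
                record.created, timezone.utc
            ).isoformat(),
            "severity": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        try:
            # One call per record, made on the application's own thread:
            # if `send` is slow, the application waits.
            self.send(json.dumps(doc))
        except Exception:
            self.handleError(record)


def build_logger(send, name="app"):
    """Return a logger whose only handler ships records via `send`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    logger.addHandler(DirectJsonHandler(send))
    return logger
```

Calling `build_logger(print).info("user %s logged in", "alice")` prints one JSON document per log call. Note that nothing here buffers or retries, which is exactly the trade-off discussed in the rest of this post.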
Log Shippers
Your log shipper can be Logstash or one of its alternatives. A logging library is still needed to get logs out of your application, but you’ll only write locally, either to a file or to a socket. A log shipper will take care of taking that raw log all the way to Elasticsearch/Sematext Logs:
- Reliability: most log shippers have buffers of some form. Whether it tails a file and remembers where it left off, or keeps data in memory or on disk, a log shipper is more resilient to network issues or slowdowns. Buffering can be implemented by a logging library too, but in reality most either block the thread/application or drop data
- Performance: buffering also means a shipper can process data and send it to Elasticsearch/Sematext Logs in bulk. This design supports higher throughput. Once again, logging libraries may have this functionality too (only tightly integrated into your app), but most will just process logs one by one
- Enriching: unlike most logging libraries, log shippers can often do additional processing, such as adding the host name or tagging IPs with Geo information
- Fanout: logging to multiple destinations (e.g. local file + Sematext Logs) is normally easier with a shipper
- Flexibility: you can always change your log shipper to one that suits your use-case better. Changing the library you use for logging may be more involved
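The reliability, performance and enriching points above can be sketched in a few lines of Python. This is a toy, not a replacement for a real shipper: it tails a file, remembers its byte offset in a side file, adds the host name to every event, and packs the batch into the newline-delimited format used by Elasticsearch's bulk API. The function names, the offset file, the `send` callable and the `message`/`host` field names are all assumptions for illustration, and log rotation is not handled.

```python
import json
import os
import socket


def read_new_lines(log_path, offset_path):
    """Return (complete lines appended since the saved offset, new offset)."""
    offset = 0
    if os.path.exists(offset_path):
        with open(offset_path) as f:
            offset = int(f.read().strip() or 0)
    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only complete lines; a partial last line waits for the next run.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").split("\n")[:-1]
    return lines, offset + end


def to_bulk_body(lines, index="logs", host=None):
    """Build a bulk request body: one action line plus one document per event."""
    host = host or socket.gethostname()  # enrichment: tag events with the host
    out = []
    for line in lines:
        out.append(json.dumps({"index": {"_index": index}}))
        out.append(json.dumps({"message": line, "host": host}))
    return "\n".join(out) + "\n" if out else ""


def ship_once(log_path, offset_path, send, index="logs"):
    """Ship everything new in one batch; return the number of lines shipped."""
    lines, new_offset = read_new_lines(log_path, offset_path)
    if lines:
        # If `send` raises, the offset is not saved and the batch is retried
        # on the next call, so a network hiccup does not lose data.
        send(to_bulk_body(lines, index))
        with open(offset_path, "w") as f:
            f.write(str(new_offset))
    return len(lines)
```

The application keeps writing to its local file at full speed whether or not the destination is reachable; the file itself acts as the buffer, and the saved offset is what lets the shipper pick up where it left off.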
Conclusions
Design-wise, the difference between the two approaches is simply tight vs loose coupling, but the way most libraries and shippers are actually implemented is more likely to influence your decision on sending data to Elasticsearch/Sematext Logs:
- logging directly from the library might make sense for development: it’s easier to set up, especially if you’re not (yet) familiar with a log shipper
- in production you’ll likely want to use one of the available log shippers, mostly because of buffers: blocking the application or dropping data (immediately) are often non-options in a production deployment
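To see what "blocking or dropping" looks like inside a library, here is a small sketch built on Python's standard `logging.handlers.QueueHandler` with a bounded in-memory queue. The `CountingQueueHandler` name and the `dropped` counter are additions for illustration; the queue stands in for a buffer that some background sender (not shown) would drain.

```python
import logging
import logging.handlers
import queue


class CountingQueueHandler(logging.handlers.QueueHandler):
    """QueueHandler that counts records dropped because the buffer is full."""

    def __init__(self, q):
        super().__init__(q)
        self.dropped = 0

    def enqueue(self, record):
        try:
            # Option 1: never block the caller; drop when the buffer is full.
            self.queue.put_nowait(record)
        except queue.Full:
            self.dropped += 1
        # Option 2 would be self.queue.put(record), which blocks the
        # application thread until the buffer has room again.


def build_buffered_logger(max_buffered, name="app"):
    """Return (logger, handler, buffer) with at most `max_buffered` records."""
    buffer = queue.Queue(maxsize=max_buffered)
    handler = CountingQueueHandler(buffer)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    logger.addHandler(handler)
    return logger, handler, buffer
```

With a buffer of two records and nothing draining it, the third and later log calls are silently counted in `handler.dropped`. Once the in-memory buffer fills up, the library has to pick one of the two options, and neither is attractive in production.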
If logging isn’t critical to your environment (i.e. you can tolerate the occasional loss of data), you may want to fire-and-forget your logs to Sematext Logs’ UDP syslog endpoint. This takes reliability out of the equation, meaning you can use a shipper if you need enriching or support for other destinations, or a library if you just want to send the raw logs (which may well be JSON).
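For the fire-and-forget case, Python's standard library is enough. The sketch below sends JSON-formatted messages over UDP using `logging.handlers.SysLogHandler`. The `udp_syslog_logger` and `JsonFormatter` names and the JSON field names are assumptions; the endpoint's host and port are deliberately left as parameters, and any token or message layout your destination requires is not shown.

```python
import json
import logging
import logging.handlers
import socket


class JsonFormatter(logging.Formatter):
    """Render a record as a small JSON object (field names are illustrative)."""

    def format(self, record):
        return json.dumps(
            {"severity": record.levelname, "message": record.getMessage()}
        )


def udp_syslog_logger(host, port, name="app"):
    """Return a logger that sends each record as one UDP syslog datagram."""
    handler = logging.handlers.SysLogHandler(
        address=(host, port), socktype=socket.SOCK_DGRAM
    )
    # By default the handler appends a NUL byte to every message;
    # turn that off so the payload ends with the JSON itself.
    handler.append_nul = False
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    logger.addHandler(handler)
    return logger
```

Because UDP has no acknowledgements, a log call returns as soon as the datagram is handed to the operating system. If the network or the receiver is down, the message is simply lost, which is the trade-off you accept with this approach.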
Whether you choose a shipper or a library, you can send logs to Sematext Logs with anything that can talk to Elasticsearch or syslog. No credit card or commitment is required to sign up, and we offer 30-day trials for all plans, in addition to the free ones.
If, on the other hand, you enjoy working with logs, metrics and/or search engines, come join us: we’re hiring worldwide.