Logging Libraries vs Log Shippers

In the context of centralizing logs (say, to Logsene or your own Elasticsearch), we often get the question of whether one should log directly from the application (e.g. via an Elasticsearch or syslog appender) or use a dedicated log shipper.

In this post, we’ll look at the advantages of each approach, so you’ll know when to use which.

Logging Libraries

Most programming languages have libraries to assist you with logging. Most commonly, they support local files or syslog, but more “exotic” destinations are often added to the list, such as Elasticsearch/Logsene. Here’s why you might want to use them:

  • Convenience: you’ll want a logging library anyway, so why not go with it all the way, without having to set up and manage a separate application for shipping? (well, there are some reasons below, but you get the point)
  • Fewer moving parts: logging from the library means you don’t have to manage the communication between the application and the log shipper
  • Lighter: logs serialized by your application can be consumed by Elasticsearch/Logsene directly, instead of having a log shipper in the middle to deserialize/parse them and then serialize them again
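
To make the "log directly from the library" option concrete, here is a minimal Python sketch built on the standard `logging` module. The `JsonSenderHandler` class and its `send` callable are hypothetical names for illustration: in a real setup `send` would wrap whatever client talks to Elasticsearch/Logsene, while here it simply appends to a list so the example runs on its own.

```python
import json
import logging


class JsonSenderHandler(logging.Handler):
    """Hypothetical handler: serializes each record to JSON and hands it to `send`."""

    def __init__(self, send):
        super().__init__()
        # `send` is an assumed callable that takes one JSON string.
        # It stands in for a real client and is not part of any library.
        self.send = send

    def emit(self, record):
        doc = {
            "timestamp": record.created,
            "severity": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        try:
            # Serialized once, inside the application: no shipper re-parses it.
            self.send(json.dumps(doc))
        except Exception:
            self.handleError(record)


sent = []  # stand-in destination for the sketch
logger = logging.getLogger("direct-example")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(JsonSenderHandler(sent.append))

logger.info("user %s logged in", "alice")
```

Note that `send` runs in the calling thread, so a slow destination slows the application down. That is the trade-off the next section is about.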

Log Shippers

Your log shipper can be Logstash or one of its alternatives. A logging library is still needed to get logs out of your application, but you’ll only write locally, either to a file or to a socket. The log shipper then takes that raw log the rest of the way to Elasticsearch/Logsene. Here’s what that buys you:

  • Reliability: most log shippers have buffers of some form. Whether it tails a file and remembers where it left off, or keeps data in memory or on disk, a log shipper is more resilient to network issues or slowdowns. Buffering can be implemented by a logging library too, but in practice most either block the thread/application or drop data
  • Performance: buffering also means a shipper can process data and send it to Elasticsearch/Logsene in bulk. This design supports higher throughput. Once again, logging libraries may have this functionality too (only tightly integrated into your app), but most will just process logs one by one
  • Enriching: unlike most logging libraries, log shippers are often capable of doing additional processing, such as adding the host name or tagging IPs with Geo information
  • Fanout: logging to multiple destinations (e.g. local file + Logsene) is normally easier with a shipper
  • Flexibility: you can always change your log shipper to one that suits your use-case better. Changing the library you use for logging may be more involved
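
As a rough illustration of the buffering, bulking and enriching points above, here is a toy shipper pass in Python. It is a sketch under stated assumptions, not how Logstash or any real shipper is implemented: the function names, the offset file, the `logs` index name and the `send_bulk` callable are all hypothetical. The one external convention it relies on is the Elasticsearch `_bulk` body format (an action line followed by a document line, newline-delimited).

```python
import json
import os
import socket


def read_new_lines(log_path, offset_path):
    """Return (lines, new_offset) for complete lines appended since the saved offset."""
    offset = 0
    if os.path.exists(offset_path):
        with open(offset_path) as f:
            offset = int(f.read().strip() or 0)
    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only up to the last newline; a half-written line waits for the next pass.
    end = data.rfind(b"\n") + 1
    lines = [raw.decode("utf-8", errors="replace") for raw in data[:end].split(b"\n")[:-1]]
    return lines, offset + end


def to_bulk_body(lines, index="logs"):
    """Build an Elasticsearch _bulk body, enriching each document with the host name."""
    host = socket.gethostname()
    out = []
    for line in lines:
        out.append(json.dumps({"index": {"_index": index}}))
        out.append(json.dumps({"message": line, "host": host}))
    return "\n".join(out) + "\n"


def ship_once(log_path, offset_path, send_bulk, batch_size=500):
    """Ship everything new in batches; `send_bulk` (assumed) must raise on failure."""
    lines, new_offset = read_new_lines(log_path, offset_path)
    for i in range(0, len(lines), batch_size):
        send_bulk(to_bulk_body(lines[i:i + batch_size]))
    # The offset is saved only after every batch went through, so a failed pass
    # is retried from the same position (which can mean duplicates downstream).
    with open(offset_path, "w") as f:
        f.write(str(new_offset))
    return len(lines)
```

Calling `ship_once` periodically gives the application a file to write to at full speed, while network trouble only delays the offset from advancing. The sketch ignores file rotation and truncation, which real shippers have to handle.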

Conclusions

Design-wise, the difference between the two approaches is simply tight vs loose coupling, but the way most libraries and shippers are actually implemented is more likely to influence your decision on sending data to Elasticsearch/Logsene:

  • logging directly from the library might make sense for development: it’s easier to set up, especially if you’re not (yet) familiar with a log shipper
  • in production you’ll likely want to use one of the available log shippers, mostly because of buffers: blocking the application or dropping data (immediately) are often non-options in a production deployment
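
The "block or drop" choice mentioned above is easy to see with a bounded in-memory buffer. The sketch below uses Python's standard `logging.handlers.QueueHandler` with a `queue.Queue`; the two subclasses are hypothetical names for illustration, and nothing consumes the queue, which simulates a destination that has stalled.

```python
import logging
import logging.handlers
import queue


class DroppingQueueHandler(logging.handlers.QueueHandler):
    """Never blocks the application: counts and drops records when the buffer is full."""

    def __init__(self, q):
        super().__init__(q)
        self.dropped = 0

    def enqueue(self, record):
        try:
            self.queue.put_nowait(record)
        except queue.Full:
            self.dropped += 1


class BlockingQueueHandler(logging.handlers.QueueHandler):
    """Never loses records: the logging call waits until the buffer has room."""

    def enqueue(self, record):
        self.queue.put(record)


buffer = queue.Queue(maxsize=2)  # tiny on purpose, to fill up quickly
handler = DroppingQueueHandler(buffer)

logger = logging.getLogger("bounded-example")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

for i in range(5):
    logger.info("event %d", i)

print(buffer.qsize(), handler.dropped)  # 2 buffered, 3 dropped
```

With `BlockingQueueHandler` in the same situation, the third `logger.info` call would simply hang until something drained the queue. A shipper with a disk-backed buffer sidesteps both outcomes for much longer, which is why it tends to be the safer production choice.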

If logging isn’t critical to your environment (i.e. you can tolerate the occasional loss of data), you may want to fire-and-forget your logs to Logsene’s UDP syslog endpoint. This takes reliability out of the equation, meaning you can use a shipper if you need enriching or support for other destinations, or a library if you just want to send the raw logs (which may well be JSON).
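
For the library route, fire-and-forget over UDP syslog can be sketched with Python's standard `SysLogHandler`. The receiver below is a local stand-in so the example runs by itself; the address of a real UDP syslog endpoint, and any token or tag format it expects, are deliberately left out because they depend on your service.

```python
import logging
import logging.handlers
import socket

# Local stand-in for a remote UDP syslog endpoint (placeholder host and port).
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
host, port = receiver.getsockname()

# UDP datagrams: no connection, no acknowledgement, no retry.
handler = logging.handlers.SysLogHandler(
    address=(host, port), socktype=socket.SOCK_DGRAM
)
# "myapp" is a hypothetical syslog tag; the payload itself may well be JSON.
handler.setFormatter(logging.Formatter("myapp: %(message)s"))

logger = logging.getLogger("udp-example")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

logger.info('{"event": "login", "user": "alice"}')

datagram, _ = receiver.recvfrom(4096)
print(datagram)

handler.close()
receiver.close()
```

If the endpoint is unreachable, the application is not told and the message is simply lost, which is exactly the trade-off being accepted here.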

Shippers or libraries, if you want to send logs with anything that can talk to Elasticsearch or syslog, you can sign up for Logsene here. No credit card or commitment is required, and we offer 30-day trials for all plans, in addition to the free ones.

If, on the other hand, you enjoy working with logs, metrics and/or search engines, come join us: we’re hiring worldwide.

2 thoughts on “Logging Libraries vs Log Shippers”

  1. A few more arguments:

    Log libraries (log pushing):
    – Don’t have to write logs to local disk (reduced I/O), so there’s no risk of filling the disk in an emergency
    – But often need extra libraries to handle a specific protocol (Elasticsearch bulking, Kafka, GELF…), which can mean messing with the Java classpath, for example. These libraries may also pollute the application at runtime (threads, memory consumption).
    – But need to handle log receiver failures: what should I do with logs when I can’t send them to the server (network or server failure): retry, drop?

    Log shippers (log pulling):
    – Log buffering is transparently handled through local disk (this is what you call reliability)
    – A single agent can handle multiple applications on the same host, and share the same connection (resources) to the log server.
    – Even supports software that is not log-friendly (this is what you call flexibility). You can even collect logs from dumb shell scripts, because any language can write to disk!
    – But need to deploy and monitor another piece of software (this is what you call moving parts)
