Handling Stack Traces with Logstash

Here at Sematext we use Java (8!) and rely a lot on Logsene, our hosted ELK logging SaaS. We like them so much that we regularly share our logging experience with everyone and help others with logging, especially with the ELK stack. Centralized logging plays nice with Java (and anything else that can write pretty logs). However, there is one tricky thing that can be hard to get right: properly capturing exception stack traces. Unlike your typical single-line log events, stack traces have multiple lines and they aren’t always perfectly uniform.  As such, most log shippers don’t handle them properly out of the box and typically treat each stack trace line as a separate event – clearly the wrong thing to do (n.b., if you are sending logs to Logsene this way, you are going to hit your plan limits faster, so make sure you send stack traces properly!).  Thus, in most cases, a special configuration is needed in order to get stack traces right. Let’s see how to configure Logstash for this task.  We’ll be using Logstash 1.5, but this should also work with Logstash 2.x.

Prerequisites

We start from the following assumptions:

  • you have a working Logstash setup
  • local logs are written to a file named: /var/log/test.log
  • the conversion pattern for log4j/logback/log4j2 is: “%d %p %m%n”
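
If you are setting this up from scratch, a minimal log4j 1.x properties file that produces this layout could look roughly like the sketch below (the appender name and root logger level are assumptions; logback and log4j2 users would put the same conversion pattern in their respective config formats):

# minimal log4j 1.x sketch writing to the file Logstash will tail
log4j.rootLogger=INFO, file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.File=/var/log/test.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
# the conversion pattern from the prerequisites above
log4j.appender.file.layout.ConversionPattern=%d %p %m%n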

Based on the above pattern, logs should look like this, for example:

2015-03-23 11:11:50,778 INFO Initializing Systems Cache
2015-03-23 11:11:58,666 ERROR Processing request failed
kafka.common.FailedToSendMessageException: Failed to send messages after 3 tries.
    at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:90)
    at kafka.producer.Producer.send(Producer.scala:77)
    at kafka.javaapi.producer.Producer.send(Producer.scala:42)
    ...

Extracting Exception Stack Traces Correctly

Stack traces are multiline messages or events. Logstash has the ability to parse a log file and merge multiple log lines into a single event. You can do this using either the multiline codec or the multiline filter, depending on the desired effect. A codec is attached to an input and a filter can process events from multiple inputs.
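
For completeness, the filter flavor looks roughly like the sketch below. It is not the configuration used in the rest of this post (we stick to the codec, which keeps line merging close to the input that produced the lines); it relies on the common heuristic that a line starting with whitespace belongs to the previous line:

filter {
  multiline {
    # any line that starts with whitespace is glued to the line before it
    pattern => "^\s"
    what => "previous"
  }
}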

The most popular example, which matches log lines that are part of an exception, uses a not-so-simple regexp:

input {
  file {
    path => "/var/log/test.log"
    start_position => "beginning"
    codec => multiline {
      pattern => "(^\d+\serror)|(^.+Exception: .+)|(^\s+at .+)|(^\s+... \d+ more)|(^\s*Caused by:.+)"
      what => "previous"
    }
  }
}

If the log line is part of an exception, it is appended to the last received log line. Once a new log line is received, the complete exception is sent to be processed. The downside of this approach is that it might not match all exceptions correctly. Sure, you could keep fine-tuning and expanding the regexp to try to capture every possible variant of the stack trace format, but there is a better way:

input {
  file {
    path => "/var/log/test.log"
    start_position => "beginning"
    codec => multiline {
      pattern => "^%{TIMESTAMP_ISO8601}"
      negate => true
      what => "previous"
    }
  }
}

The pattern looks for log lines starting with a timestamp and, because the negate parameter is set to true, every line that does not match the pattern is appended to the previous one. In other words, everything up to the next timestamp is collected into a single event.

Structuring Logs with Grok

To structure the information before storing the event, a filter section should be used for parsing the logs.

filter {
  mutate {
    gsub => [ "message", "\r", "" ]
  }
  grok {
    match => [ "message", "(?m)%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:severity} %{GREEDYDATA:message}" ]
    overwrite => [ "message" ]
  }
  date {
    match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss,SSS" ]
  }
}

output {
  stdout { codec => rubydebug }
}

The output section, in this case, is only used for debugging.  In the next section we show how to actually ship your logs.

First, the mutate filter removes any “\r” characters from the event. Usually this is something you want to do, to prevent issues later when storing and visualizing the logs, where a stray “\r” could be interpreted as an “\n”.

The grok filter splits the event content into 3 parts: timestamp, severity and message (which overwrites the original message). The (?m) at the beginning of the regexp enables multiline matching; without it, only the first line would be matched. The named patterns (TIMESTAMP_ISO8601, LOGLEVEL, GREEDYDATA) ship with Logstash and should be used whenever possible to keep regexps simple.

By default, an event’s timestamp is the moment when the log line is read from the file. The date filter fixes that by replacing it with the timestamp matched earlier by the grok filter.
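
Putting it all together, running Logstash with the configuration above against the sample log from the prerequisites would print the ERROR event roughly like this through the rubydebug output (a sketch only: fields such as @version, host and path are omitted, the @timestamp value assumes a UTC host, and the message is truncated):

{
       "message" => "Processing request failed\nkafka.common.FailedToSendMessageException: Failed to send messages after 3 tries.\n    at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:90)\n    ...",
    "@timestamp" => "2015-03-23T11:11:58.666Z",
     "timestamp" => "2015-03-23 11:11:58,666",
      "severity" => "ERROR",
          "tags" => [ "multiline" ]
}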

Shipping the Logs

Finally, now that you’ve got all your logs extracted, even those with multiple lines like exception stack traces, it’s time to ship them to wherever you keep your logs.  Here’s how you’d ship them to Logsene:

output {
  elasticsearch {
    host => "logsene-receiver.sematext.com"
    ssl => true
    port => 443
    index => "YOUR LOGSENE APP TOKEN GOES HERE"
    protocol => "http"
    manage_template => false
  }
}
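
The options above are the Logstash 1.5 ones. If you are on Logstash 2.x, the elasticsearch output replaced host, port and protocol with a single hosts option, so an equivalent sketch (same idea, with your Logsene app token as the index) would be:

output {
  elasticsearch {
    hosts => ["logsene-receiver.sematext.com:443"]
    ssl => true
    index => "YOUR LOGSENE APP TOKEN GOES HERE"
    manage_template => false
  }
}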

Final Words

If you are looking for a way to ship logs containing stack traces or other complicated multi-line events, Logstash is currently the simplest way to do it.

It is written in JRuby, which makes it possible for many people to contribute to the project. The downside of this ease of use and maintainability is that it is not the fastest tool for the job and it is also quite resource-hungry (in both CPU and memory). Depending on the log volume that needs to be shipped, though, this might not be a problem.

For those looking for tips on how to ship multiline logs with rsyslog or syslog-ng, subscribe to this blog or follow @sematext – tips for handling stack traces with rsyslog and syslog-ng are coming.

If your organization needs help with logging (the ELK stack, but also rsyslog, Fluentd, Flume…), let us know.  If you like to build tools and products around logs and logging, join us – we’re always looking for good toolmakers world-wide.

6 thoughts on “Handling Stack Traces with Logstash”

  1. Hello HAKMANC,

    I am new to ELK and dealing with logs containing java exceptions and other multiline events.
    Have multiline codec working fine to build message field with multiple lines.
    What I would like is to be able to split this message field into other fields. I have managed to do that only for the 1st line of the field.
    This is my filter section:

    filter {
      if [type] == "weblogic" {
        multiline {
          pattern => "^%{DATESTAMP}"
          negate => true
          what => "previous"
        }
        grok {
          match => { "source" => "appli-acetp-%{WORD:environment}-%{WORD:wls_instance}\.log$" }
        }
        grok {
          match => { "message" => "%{DATESTAMP} +%{NUMBER} +%{WORD:log_severity} +%{JAVACLASS:java_class} +- +%{DATA:app_message}$" }
        }
        grok {
          match => { "app_message" => "Caused by: +%{JAVACLASS:exception_class}: +%{DATA:exception_message}$" }
        }
      }
    }

    So, “message” is multiline but “app_message” contains only information until end of line, and therefore, I cannot use it to extract java exception class (which comes on further lines).
    How can I create multiline “app_message”?

    Thanks for your time.

  2. SB:2016-03-13 19:53:39,902 [http-bio-0.0.0.0-8003-exec-1486] INFO Start:No Error occured.
    SB:2016-03-13 19:53:39,902 [http-bio-0.0.0.0-8003-exec-1486] ERROR (NewError.java:470) – Start:Malformed PAC logon info.
    org.jaaslounge.decoding.DecodingException: Malformed PAC logon info.
    at org.jaaslounge.decoding.pac.PacLogonInfo.(PacLogonInfo.java:209)
    at org.jaaslounge.decoding.pac.Pac.(Pac.java:46)
    at org.jaaslounge.decoding.kerberos.KerberosPacAuthData.(KerberosPacAuth.java:13)
    at org.jaaslounge.decoding.kerberos.KerberosAuthData.parse(KerberosAuth.java:21)
    at org.jaaslounge.decoding.kerberos.KerberosRelevantAuthData.(KerberosRelevantAuthData.java:44)
    at org.jaaslounge.decoding.kerberos.KerberosAuthData.parse(KerberosAuth.java:18)
    at org.jaaslounge.decoding.kerberos.KerberosEncData.(KerberosEnc.java:138)
    at org.jaaslounge.decoding.kerberos.KerberosTicket.(KerberosTicket.java:104)
    at org.jaaslounge.decoding.kerberos.KerberosApRequest.(KerberosApRequest.java:63)
    at org.jaaslounge.decoding.kerberos.KerberosToken.(KerberosToken.java:52)
    at com.eleven38.bos.filter.SpnegoAuthenticator.doSpnegoAuth(SpnegoAuthenticator.java:430)
    at com.eleven38.bos.filter.SpnegoAuthenticator.authenticate(SpnegoAuthenticator.java:265)
    at com.eleven38.bos.filter.BosSpnegoHttpFilter.doFilter(BosSpnegoHttpFilter.java:261)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1023)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:312)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
    Caused by: java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:197)
    at java.io.DataInputStream.readFully(DataInputStream.java:169)
    at org.jaaslounge.decoding.pac.PacDataInputStream.readFully(PacDataInputStream.java:33)
    at org.jaaslounge.decoding.pac.PacDataInputStream.readSid(PacDataInputStream.java:132)
    Caused by: java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:197)
    at java.io.DataInputStream.readFully(DataInputStream.java:169)
    at org.jaaslounge.decoding.pac.PacDataInputStream.readFully(PacDataInputStream.java:33)
    at org.jaaslounge.decoding.pac.PacDataInputStream.readSid(PacDataInputStream.java:132)

    This is my sample log. I have tried a lot to extract the “Caused by” and exception class values, including with the multiline filter, but could not find a solution. Please help me.

  3. I am struggling to write a multiline parser to parse the following log. The file is filled with such rows, and anything between the “:DCTransactionEnds” rows needs to be put into a single line. Can somebody help with the multiline Grok filter for this? Thanks in advance.

    :DCTransactionEnds
    time:514711710621294
    node_name:v_hunger_node0001
    session_id:ip-172-31-23-237.us-2976:0x93b7
    user_id:45035996273704962
    user_name:dbadmin
    transaction_id:45035996273747374
    number_of_statements:2
    is_committed:true
    end_begin_time:514711710620535
    epoch_close_time:-9223372036854775808
    end_epoch:1
    is_ddl:false
    ros_rows_written:0
    dvros_rows_written:0
    wos_rows_written:0
    dvwos_rows_written:0
    .
    :DCTransactionEnds
    time:51471171062208

    1. In this case, the filter section should look like this:
      filter {
        mutate {
          gsub => [ "message", "\n", " " ]
        }
        grok {
          match => [ "message", ":DCTransactionEnds %{GREEDYDATA:data} \.$" ]
        }
      }

      You will also need to use a multiline codec on your input, with negate => true and pattern => "^:DCTransactionEnds".

  4. I have a setup of ELK with F5 HSL pools for Logstash, consuming logs from our Cloud Foundry. Logs for exception stack traces come down across multiple lines, so they get sent to multiple instances of Logstash. So I cannot really use a multiline codec with the negate parameter to append to the previous line on a particular instance of Logstash. What do you think would be a solution for such a problem? As per the 12-factor app, logs must be a single line per event, with an exception for stack traces.
