
Logstash Tutorial: Getting Started Guide for Shipping Logs

July 7, 2019


Looking to learn about Logstash as quickly as possible? This Logstash Tutorial is for you: we’ll install Logstash and push some Apache logs to Elasticsearch in less than 5 minutes.

1. The Basics: What Is Logstash and How Does It Work?

Logstash is a free and open source data collection engine that reads data from a variety of sources, processes it, then sends it to one or more destinations, the most popular one being Elasticsearch. It’s also one of the easiest log management tools to get started with (so it’s perfect for beginners), which is exactly what this guide is about.

Once logs are structured and stored in Elasticsearch, you can start searching and visualizing with Kibana, completing the Elastic Stack (formerly known as ELK Stack).

Use Logstash to send logs to Sematext Logs, our log management & analysis solution.

2. Installing and Running Logstash

After you download Logstash (be careful which version you download: there is an Apache Software License version and an Elastic License version, and the former is free), you’d start it with bin/logstash -f config_file.conf. If you choose the RPM/DEB package, you’d put the config file in /etc/logstash/conf.d/ and start Logstash via service logstash start.
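
Before wiring up real inputs, it can help to confirm that the installation works. Here is a minimal sketch, assuming you save it under a hypothetical name such as test.conf and start it with bin/logstash -f test.conf. It reads lines typed on standard input and prints the resulting events to the console:

input {
  stdin { }   # read events from the terminal, one per line
}

output {
  stdout {
    codec => rubydebug   # pretty-print each event with all its fields
  }
}

Type a line and press Enter. Logstash should echo it back as a structured event, including fields such as message and @timestamp.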

3. Logstash Configuration & Quick Example

In less than 5 minutes, you’ll learn how to set up Logstash, send logs from a file, parse them to extract metrics, and send them to Sematext Logs, our hosted ELK logging service. Don’t worry: because Sematext Logs exposes the Elasticsearch API, the same steps will work if you have a local Elasticsearch cluster.

Further reading: If you don’t end up liking Logstash, be sure to check out our Logstash alternatives article. One of those alternatives is Logagent: if Logstash is easy, Logagent really gets you started in a minute.

SIDE NOTE: We run Elasticsearch and ELK trainings, which may be of interest to you and your teammates.

3.1. Overview

As an example, we’ll take an Apache log, written in its combined logging format. Your Logstash configuration would be made up of three parts: an input, a filter, and an output.
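
As a rough sketch, the three parts map to three top-level blocks in the configuration file. This is only a skeleton, not a runnable configuration on its own. The comments name the plugins used in the rest of this tutorial:

input {
  # where events come from (section 3.2 uses the file input)
}

filter {
  # how events are parsed and enriched (section 3.3 uses the grok filter)
}

output {
  # where events are sent (section 3.4 uses the elasticsearch output)
}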

3.2. Logstash File Input

The first part of your configuration file would be about your inputs. Inputs are Logstash plugins responsible for ingesting data. You can use the file input to tail your files. There are a lot of options around this input, all of them covered in the file input plugin documentation.

For now, let’s assume you want to send the existing contents of that file, in addition to the new content. To do that, you’d set start_position to beginning. Here’s what the whole input configuration will look like:

input {
  file {
    path => "/var/log/apache.log"
    type => "apache-access"  # a type to identify those logs (will need this later)
    start_position => "beginning"
  }
}
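
One thing to keep in mind when experimenting: the file input remembers how far it has read each file (in a so-called sincedb file), so start_position only applies to files Logstash has not seen before. Here is a sketch for local testing, assuming you want the file re-read from the top on every restart and that you are on a Unix-like system where /dev/null exists:

input {
  file {
    path => "/var/log/apache.log"
    type => "apache-access"
    start_position => "beginning"
    sincedb_path => "/dev/null"   # testing only: don't persist read positions
  }
}

For production you would normally leave sincedb_path at its default, so that a restart continues where Logstash left off instead of shipping the same lines again.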

3.3. Logstash Grok Filter

Filters are modules that can take your raw data and try to make sense of it. Logstash has lots of such plugins, and one of the most useful is grok. Grok makes it easy for you to parse logs with regular expressions, by assigning labels to commonly used patterns. One such label is called COMBINEDAPACHELOG, which is exactly what we need:

filter {
  if [type] == "apache-access" {   # this is where we use the type from the input section
    grok {
      match => [ "message", "%{COMBINEDAPACHELOG}" ]
    }
  }
}

If you need to use more complicated Grok patterns, we suggest trying the Grok debugger.
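
For logs that don’t match a predefined label, you can assemble a pattern from smaller building blocks using the %{SYNTAX:field_name} notation. Here is a minimal sketch, assuming a hypothetical log line such as 55.3.244.1 GET /index.html 15824 0.043 and a hypothetical type value of "custom-access". The field names (client, method and so on) are illustrative choices:

filter {
  if [type] == "custom-access" {   # hypothetical type, set in the matching input
    grok {
      match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes:int} %{NUMBER:duration:float}" }
    }
  }
}

The optional :int and :float suffixes ask grok to store the captured values as numbers instead of strings, which is useful if you want to aggregate on them later.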


3.4. Logstash Elasticsearch Output

To send logs to Sematext Logs (or your own Elasticsearch cluster) via HTTP, you can use the elasticsearch output. You’ll need to specify the host and port of an Elasticsearch server. Before version 2.0 you also had to select the HTTP protocol explicitly; that setting has since been removed.

For Sematext Logs, those would be logsene-receiver.sematext.com and port 443. Another Sematext Logs-specific requirement is to specify the access token for your Sematext Logs app as the Elasticsearch index. You can find that token in your Sematext account, under Logs ⇒ All Logs Apps.

The complete output configuration would be:

output {
  elasticsearch {
    hosts => "logsene-receiver.sematext.com:443" # it used to be "host" and "port" pre-2.0
    ssl => "true"
    index => "your Logsene app token goes here"
    manage_template => false
    #protocol => "http" # removed in 2.0
    #port => "443" # removed in 2.0
  }
}
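
If you run your own cluster instead, the same output needs little more than a host and an index name. Here is a sketch under assumptions: Elasticsearch listening on localhost:9200 without TLS or authentication, and a hypothetical daily index name starting with apache-logs-. The stdout output is optional and only there to let you see events while testing:

output {
  elasticsearch {
    hosts => ["localhost:9200"]            # assumed local, unsecured node
    index => "apache-logs-%{+YYYY.MM.dd}"  # hypothetical name, one index per day
  }
  stdout {
    codec => rubydebug                     # print each event to the console for debugging
  }
}

If your cluster has security enabled, you would also need to configure credentials and TLS settings on the elasticsearch output.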


Wrapping Up


With your logs in Elasticsearch, you can download Kibana, point it to your Elasticsearch (elasticsearch.hosts in config/kibana.yml, or elasticsearch.url in versions before 7.0) and start it via bin/kibana.

If you’re using Sematext Logs, our log management tool, you can simply go to your Logs App and start exploring logs through either the native UI or Kibana. Remember, Sematext Logs is free to play with and it frees you up from having to manage your own Elasticsearch cluster.

 

NOTE: If this sort of stuff excites you, we are hiring worldwide for positions from DevOps and core product engineering to marketing and sales.
