Elastic Stack (Filebeat, Logstash, Elasticsearch, Kibana) Tutorial

Logstash

Logstash is an event processing framework. With Logstash you can import data from different kinds of inputs and sources. In the next step you can filter and modify the data, and in the last step you can export the data to different kinds of formats and frameworks. For example, you can read data from a framework like Beats and forward it to Elasticsearch for storage and further processing. Within Logstash you can process, filter and map the different input and output formats:

Logstash Preparation

Install JDK
Download Logstash *.zip
Install Postman
Install an IDE e.g. Visual Studio Code
Install XAMPP

High Level Architecture of Logstash

Use the following command to run Logstash with a pipeline configuration passed directly on the command line. The pipeline listens on stdin and writes the data to stdout:

C:\tools\logstash-8.3.2>bin\logstash -e "input { stdin { } } output { stdout { } }"

Type e.g. "Hello World" into your CMD to see the corresponding stdout.

Create a new pipeline configuration:

# Listen for new stdin inputs e.g. "Hello World"
input {
    stdin {

    }
}

# Print out the stdin
output {
    stdout {
        #codec => rubydebug
    }
}

Run Logstash using the pipeline configuration file. With --config.reload.automatic the pipeline is reloaded after every modification.

C:\tools\logstash-8.3.2>bin\logstash -f "C:\tools\logstash_pipelines\pipeline.conf" --config.reload.automatic

Add JSON input support to the pipeline. Note that CMD does not support multi-line JSON.

input {
    stdin {
        codec => json
    }
}

output {
    stdout {
        #codec => rubydebug
    }
}

Insert a single-line JSON document:
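For example (a hypothetical document; any single-line JSON works — the "plz" field is reused in the mutate example further below):

```json
{"name": "example", "plz": "70173"}
```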

Send a new JSON request via Postman (which supports multi-line JSON). You will see the response in your Logstash stdout.

Write stdout to an external file:

input {
    stdin {
        codec => json
    }
}

output {
    stdout {
        #codec => rubydebug
    }
    file {
        path => "C:\tools\logstash_pipelines\output.txt"
    }
}

Additionally accept incoming HTTP requests:

input {
    stdin {
        codec => json
    }
    http {
        host => "127.0.0.1"
        port => 8080
    }
}

output {
    stdout {
        #codec => rubydebug
    }

    file {
        path => "C:\tools\logstash_pipelines\output.txt"
    }
}

Use the "mutate" filter plugin to convert incoming data, e.g. string to integer. Let's change the PLZ (German postal code) field from string to integer:

input {
    stdin {
        codec => json
    }
    http {
        host => "127.0.0.1"
        port => 8080
    }
}

filter {
    mutate {
        convert => { "plz" => "integer" }
    }
}

output {
    stdout {
        #codec => rubydebug
    }
   
    file {
        path => "C:\tools\logstash_pipelines\output.txt"
    }
}

Check the following link for the full documentation of the mutate filter plugin:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html
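What the convert option does to a single event can be sketched in plain Python (an illustration only, not Logstash code; the field name and value follow the PLZ example above):

```python
def mutate_convert(event, field, target_type):
    """Sketch of mutate { convert => ... } for a single event:
    cast the named field to the target type if it is present."""
    casts = {"integer": int, "float": float, "string": str}
    if field in event:
        event[field] = casts[target_type](event[field])
    return event

# Incoming JSON event with "plz" as a string ...
event = {"name": "example", "plz": "70173"}
# ... leaves the filter with "plz" as an integer.
event = mutate_convert(event, "plz", "integer")
print(event)  # {'name': 'example', 'plz': 70173}
```

Fields that are missing from an event are simply left untouched, which matches the tolerant behavior of the real filter.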

Apache

Next I want to read the Apache logs instead of generating the input manually.
Start Apache via XAMPP so that Logstash can process the Apache logs:

We are going to process the access.log file within the apache logs folder:
C:\xampp\apache\logs\access.log

Let's create and run a new pipeline file called pipelineApache.conf.
This pipeline reads the Apache access.log. Note that the Logstash file input expects forward slashes even in Windows paths.

input {
    file {
        path => "C:/xampp/apache/logs/access.log"
        start_position => "beginning"
    }

    http {
        
    }
}

output {
    stdout {
        #codec => rubydebug 
    }
}

Run the new pipelineApache.conf file:

C:\tools\logstash-8.3.2>bin\logstash -f "C:\tools\logstash_pipelines\pipelineApache.conf" --config.reload.automatic

Grok Project

Use predefined patterns to handle e.g. the Apache access.log with the help of the GitHub project "logstash-patterns-core", which provides the patterns used by the grok filter.

input {
    file {
        path => "C:/xampp/apache/logs/access.log"
        start_position => "beginning"
    }

    http {

    }
}

filter {
    grok {
        match => { "message" => "%{HTTPD_COMBINEDLOG}" }
    }
}

output {
    stdout {
        codec => rubydebug 
    }
}

I will send the following request via Postman to the new pipeline to test the grok filter:

184.252.108.229 - - [20/Sep/2017:13:22:22 +0200] "GET /products/view/123 HTTP/1.1" 200 12798 "https://codingexplained.com/products" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36"

The grok-filtered stdout is shown on the left side. On the right side you can see the stdout without grok:
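To illustrate what grok extracts from such a line, here is a simplified Python stand-in for the HTTPD_COMBINEDLOG pattern (a sketch only; the real grok pattern is more permissive):

```python
import re

# Simplified stand-in for Logstash's HTTPD_COMBINEDLOG grok pattern.
COMBINED = re.compile(
    r'(?P<clientip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) (?P<httpversion>\S+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

line = ('184.252.108.229 - - [20/Sep/2017:13:22:22 +0200] '
        '"GET /products/view/123 HTTP/1.1" 200 12798 '
        '"https://codingexplained.com/products" '
        '"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
        '(KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36"')

# One unstructured log line becomes a dict of named fields.
fields = COMBINED.match(line).groupdict()
print(fields["clientip"])  # 184.252.108.229
print(fields["request"])   # /products/view/123
```

Grok does exactly this kind of named-group extraction, just with a large library of reusable, predefined sub-patterns.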

Beats

So far we have used the file input of Logstash to read the log data:

input {
    file {
        path => "C:/xampp/apache/logs/access.log"
        start_position => "beginning"
    }
....

Now we are going to use Beats agents to collect the needed data, e.g. Filebeat to collect log files, and to send the data to output destinations or on for further processing, e.g. to Logstash.

Check the following link to get more information about the different kind of Beats: https://www.elastic.co/guide/en/beats/libbeat/current/beats-reference.html

By using Beats we move the whole input handling out of Logstash. From now on Logstash will only be used for data processing, e.g. extracting and manipulating fields. This also allows us to move Logstash to a different server (distributed architecture).

Preparation

Install Filebeat

With the default settings, Filebeat transfers the data to Elasticsearch. Instead we would like to send the data to Logstash:

Forward the logs to Logstash instead of Elasticsearch:
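In filebeat.yml this means commenting out the Elasticsearch output and enabling the Logstash output (5044 is the conventional Beats port; adjust the host to your setup):

```yaml
# ---------------------------- Elasticsearch Output ----------------------------
#output.elasticsearch:
#  hosts: ["localhost:9200"]

# ------------------------------ Logstash Output -------------------------------
output.logstash:
  hosts: ["localhost:5044"]
```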

Further changes within filebeat.yml:

# ============================== Filebeat inputs ===============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

# filestream is an input for collecting log messages from files.
#- type: filestream
- type: log

  # Unique ID among all inputs, an ID is required.
  id: my-filestream-id

  # Change to true to enable this input configuration.
  # enabled: false
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/*.log
    #- c:\programdata\elasticsearch\logs\*

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #prospector.scanner.exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

# ============================== Filebeat modules ==============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  # reload.enabled: false
  reload.enabled: true

  # Period on which files under path should be checked for changes
  reload.period: 10s

Check your modifications:

C:\>filebeat test config -c "C:/tools/filebeat/filebeat.yml"
Config OK

Apache

In the next step we are going to configure Filebeat to read the Apache access.log. First of all we need to activate the Filebeat Apache support:

Check your local folder for all the available modules:

Remember we are using XAMPP Apache. The default path of Apache within the Beats configuration does NOT work for the XAMPP installation:

It is NOT possible to change the default path of the access.log inside the file above (manifest.yml). Instead we need to open apache.yml to override the default path of the Apache access.log file. We have also enabled the access.log support in the same step:
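The override in modules.d/apache.yml could look like this (enable the module first with `filebeat modules enable apache`; the path below assumes the default XAMPP layout):

```yaml
# modules.d/apache.yml
- module: apache
  # Access logs
  access:
    enabled: true
    # Override the default path with the XAMPP location (glob based):
    var.paths: ["C:/xampp/apache/logs/access.log"]

  # Error logs
  error:
    enabled: false
```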

Logstash preparation

The next step is to configure the Logstash pipeline to allow and process the incoming log data (apache access.log) from Beats. You will find the list of all pipelines which will be loaded within pipelines.yml:

Add the following lines to pipelines.yml to load your own Logstash pipeline configuration file:

- pipeline.id: apache_access_log
  path.config: "C:/tools/logstash_pipelines/pipelineApache.conf"

Starting Beats

Filebeat is now going to forward the Apache access.log to Logstash.

C:\>filebeat -e -c "C:/tools/filebeat/filebeat.yml"

Let's deactivate the "Non-zero metrics in the last 30s" message that keeps cluttering the output.
Add the following line to filebeat.yml:

logging.metrics.enabled: false

The new Logstash pipeline "pipelineApache.conf" will be changed as in the example below:

input {
    beats {
        port => 5044
        
        # Listen on all local interfaces
        host => "0.0.0.0"
    }
}

output {
    stdout {
        codec => rubydebug {
            metadata => true
        }
    }
}

Start the new pipeline (pipelineApache.conf):

C:\tools\logstash-8.3.2>bin\logstash -f "C:\tools\logstash_pipelines\pipelineApache.conf" --config.reload.automatic

Logstash is now listening for the incoming Apache log data from Beats:

Open one of your Apache pages to transfer the new access log entries from Filebeat to Logstash:

Currently you only need these two commands to start Beats and Logstash:
1. Run Filebeat as CMD Admin: C:\>filebeat -e -c "C:/tools/filebeat/filebeat.yml"
2. Run Logstash: C:\tools\logstash-8.3.2\bin>logstash -f "C:\tools\logstash_pipelines\pipelineApache.conf" --config.reload.automatic

Elasticsearch

Download Elasticsearch

Run Elasticsearch
This command will also generate all needed credentials. The Kibana enrollment token is only valid for 30 minutes.

C:\tools\elasticsearch-8.3.2\bin>elasticsearch.bat
------------
-> Elasticsearch security features have been automatically configured!
-> Authentication is enabled and cluster connections are encrypted.

->  Password for the elastic user (reset with `bin/elasticsearch-reset-password -u elastic`):
  G1l7H*CS_0PbHAQPXrPD

->  HTTP CA certificate SHA-256 fingerprint:
  510ea91f8f88e66627a7c3ac8e0132fe46b0667f696856a6f6f774ac2fd91587

->  Configure Kibana to use this cluster:
* Run Kibana and click the configuration link in the terminal when Kibana starts.
* Copy the following enrollment token and paste it into Kibana in your browser (valid for the next 30 minutes):
  eyJ2ZXIiOiI4LjMuMiIsImFkciI6WyIxNzIuMjIuMTYuMTo5MjAwIl0sImZnciI6IjUxMGVhOTFmOGY4OGU2NjYyN2E3YzNhYzhlMDEzMmZlNDZiMDY2N2Y2OTY4NTZhNmY2Zjc3NGFjMmZkOTE1ODciLCJrZXkiOiJaMU8yTUlJQmhtOHlKV0pxWWNjUDp4QWdFV2F1dlFNLXJVSUlmMlRyNjZBIn0=

->  Configure other nodes to join this cluster:
* On this node:
  - Create an enrollment token with `bin/elasticsearch-create-enrollment-token -s node`.
  - Uncomment the transport.host setting at the end of config/elasticsearch.yml.
  - Restart Elasticsearch.
* On other nodes:
  - Start Elasticsearch with `bin/elasticsearch --enrollment-token <token>`, using the enrollment token that you generated.
------------

Kibana

Download Kibana

Run Kibana

C:\tools\kibana-8.3.2\bin>kibana.bat

Open Kibana and insert the enrollment token generated by Elasticsearch:

http://localhost:5601/?code=399019

The credentials have been generated by Elasticsearch:


->  Password for the elastic user (reset with `bin/elasticsearch-reset-password -u elastic`):
  G1l7H*CS_0PbHAQPXrPD

At the end of the configuration you will be forwarded to the welcome page:

Stop Kibana & Elasticsearch. We need to configure both tools in the next steps.

Elasticsearch Configuration

Open the Elasticsearch configuration file (elasticsearch.yml) and deactivate all security features for local testing:

C:\tools\elasticsearch-8.3.2\config\elasticsearch.yml
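The security section at the end of elasticsearch.yml, which the 8.x auto-configuration generated on first start, can be switched off by setting the corresponding flags to false (for local testing only, never in production):

```yaml
# Disable security for local testing only
xpack.security.enabled: false
xpack.security.enrollment.enabled: false

xpack.security.http.ssl:
  enabled: false

xpack.security.transport.ssl:
  enabled: false
```

After the restart, Elasticsearch should answer on http://localhost:9200 without asking for credentials.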

Restart Elasticsearch and check if Elasticsearch is available:

to be continued
