ELK stands for:
- Elasticsearch - stores all of the logs
- Logstash - processes incoming logs
- Kibana - web interface for searching and visualizing logs, proxied through Nginx
Filebeat - installed on client servers that will send their logs to Logstash. Filebeat is a good fit for smaller log collection; other log-shipper options include Fluentd.
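A minimal filebeat.yml sketch (Filebeat 6.x syntax; older releases use input_type, newer ones use filebeat.inputs), assuming a Beats input on port 5044 of a host called elk-server (both placeholders; point it at your Logstash server):

filebeat.prospectors:
  - type: log
    paths:
      - /var/log/syslog
      - /var/log/auth.log
output.logstash:
  hosts: ["elk-server:5044"]   # placeholder host; use your Logstash server's address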
I installed via apt-get, so I'm running these as systemd services with systemctl, e.g.
sudo systemctl start kibana
sudo systemctl start elasticsearch
sudo systemctl start nginx
If running locally, make sure to set vm.max_map_count to at least 262144:
sudo sysctl -w vm.max_map_count=262144
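To make that persist across reboots, add it to /etc/sysctl.conf (or a drop-in under /etc/sysctl.d/):

echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p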
Elasticsearch is configured through /etc/elasticsearch/elasticsearch.yml
Kibana is configured through /etc/kibana/kibana.yml or /opt/kibana/config/kibana.yml
Set up your /etc/elasticsearch/elasticsearch.yml file so that it has
network.host: localhost
(In the docker-elk setup described below, the same settings live in elasticsearch/config/elasticsearch.yml.)
Set up your /opt/kibana/config/kibana.yml file with
server.host: "localhost"
For Nginx, also install apache2-utils (which provides htpasswd):
sudo apt-get install nginx apache2-utils
# Create an admin user, e.g. 'kibanaadmin' (but use a different name)
sudo htpasswd -c /etc/nginx/htpasswd.users kibanaadmin
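A rough sketch of the Nginx server block that proxies Kibana behind that htpasswd file, assuming Kibana is listening on localhost:5601 and example.com stands in for your server name:

server {
    listen 80;
    server_name example.com;    # placeholder; use your server's hostname or IP

    auth_basic "Restricted Access";
    auth_basic_user_file /etc/nginx/htpasswd.users;

    location / {
        # forward everything to Kibana
        proxy_pass http://localhost:5601;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}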
The Logstash settings are stored in logstash/config/logstash.yml (/etc/logstash/logstash.yml on an apt install).
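logstash.yml only holds process-level settings; the pipelines themselves live in separate .conf files (typically /etc/logstash/conf.d/ on an apt install, or logstash/pipeline/ in the docker-elk repo below). A minimal pipeline sketch that accepts Beats on 5044 and raw TCP on 5000 and writes to a local Elasticsearch:

input {
  beats {
    port => 5044
  }
  tcp {
    port => 5000
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"   # daily indices, matching the logstash-* pattern used later
  }
}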
Kibana’s Getting Started - https://www.elastic.co/guide/en/kibana/current/getting-started.html
Shakespeare Data
{
    "line_id": INT,
    "play_name": "String",
    "speech_number": INT,
    "line_number": "String",
    "speaker": "String",
    "text_entry": "String",
}
Accounts Data
{
    "account_number": INT,
    "balance": INT,
    "firstname": "String",
    "lastname": "String",
    "age": INT,
    "gender": "M or F",
    "address": "String",
    "employer": "String",
    "email": "String",
    "city": "String",
    "state": "String"
}
Schema for Logs Data
{
    "memory": INT,
    "geo.coordinates": "geo_point"
    "@timestamp": "date"
}
Setup mapping for the datasets
Shakespeare Mapping
curl -XPUT 'localhost:9200/shakespeare?pretty' -H 'Content-Type: application/json' -d'
{
 "mappings": {
  "doc": {
   "properties": {
    "speaker": {"type": "keyword"},
    "play_name": {"type": "keyword"},
    "line_id": {"type": "integer"},
    "speech_number": {"type": "integer"}
   }
  }
 }
}
'
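You can sanity-check the result with:

curl -XGET 'localhost:9200/shakespeare/_mapping?pretty'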
The accounts data doesn't require a mapping.
Log Mappings
curl -XPUT 'localhost:9200/logstash-2015.05.18?pretty' -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "log": {
      "properties": {
        "geo": {
          "properties": {
            "coordinates": {
              "type": "geo_point"
            }
          }
        }
      }
    }
  }
}
'
Load data sets into Elasticsearch using the Elasticsearch API:
curl -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/bank/account/_bulk?pretty' --data-binary @accounts.json
curl -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/shakespeare/doc/_bulk?pretty' --data-binary @shakespeare_6.0.json
curl -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/_bulk?pretty' --data-binary @logs.jsonl
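Then verify that everything loaded; each index should show a non-zero docs.count:

curl -XGET 'localhost:9200/_cat/indices?v'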
To see all of these pieces working together, try the docker-compose setup below.
So far the best docker-elk setup I’ve seen is here:
https://github.com/deviantony/docker-elk
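Roughly, assuming Docker and docker-compose are already installed:

git clone https://github.com/deviantony/docker-elk.git
cd docker-elk
docker-compose up -d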
You'll see the following on your localhost:

- 5601 - Kibana
- 9300 - Elasticsearch TCP transport
- 9200 - Elasticsearch HTTP
- 5000 - Logstash TCP input
Elasticsearch
will@xps ~ $ curl localhost:9200
{
  "name" : "vCF3oXg",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "q49MoAYWRzSTgpnV2VU2rw",
  "version" : {
    "number" : "6.2.2",
    "build_hash" : "10b1edd",
    "build_date" : "2018-02-16T19:01:30.685723Z",
    "build_snapshot" : false,
    "lucene_version" : "7.2.1",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
First we need to inject some log entries, e.g. by piping a log file into the Logstash TCP input with netcat:
$ nc localhost 5000 < /path/to/logfile.log
After getting some logs, we’ll create an index pattern via the Kibana API:
$ curl -XPOST -D- 'http://localhost:5601/api/saved_objects/index-pattern' \
    -H 'Content-Type: application/json' \
    -H 'kbn-version: 6.2.2' \
    -d '{"attributes":{"title":"logstash-*","timeFieldName":"@timestamp"}}'
There are a lot of different logging formats and systems; the two you'll run into most on Linux are syslog and journald.
Syslog is the standard solution for logging on UNIX.
In a normal configuration, log messages are written to plain text files. It's simple, but plain text files lack structure. If there is a lot of unrelated information in one file, it's hard to filter down to the topics you want. If you instead split files up by pre-defined topics, you end up with many smaller files and no easy way to correlate information between them.
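For example, to write a test message through syslog and then dig it back out of the flat file (the path is the Debian/Ubuntu default):

logger -p user.info "test message from logger"
grep "test message" /var/log/syslog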
Advantages:
Disadvantages:
Journald replaces text files with a more structured format while retaining full syslog compatibility (e.g. forwarding plain-text versions of messages to an existing syslog).
Advantages:
Disadvantages:
Journald is queried with journalctl. See https://tools.ietf.org/html/rfc5424 for the structured-logging RFC (the syslog protocol), which defines common fields all logs should have, including hostname, app name, datetime, message, and priority.
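A few journalctl queries that show off that structure (the unit name is just an example):

journalctl -u elasticsearch.service --since "1 hour ago"   # filter by unit and time
journalctl -p err -b                                       # only err priority and above, since last boot
journalctl -n 5 -o json-pretty                             # dump the structured fields as JSON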
In the Kibana query bar, you can search the index pattern using one of:
By default, the query bar uses the Lucene query syntax; you can also filter with the full Elasticsearch Query DSL.
Search for the word ‘foo’ in the ‘title’ field
title:foo
Search for phrase ‘foo bar’ in the ‘title’ field
title:"foo bar"
Search for phrase ‘foo bar’ in the ‘title’ field AND the phrase ‘quick fox’ in the ‘body’ field
title:"foo bar" AND body: "quick fox"
Each set of data loaded to Elasticsearch has an index pattern. An index pattern is a string with optional wildcards that can match multiple indices.
Index names often contain a date in the YYYY.MM.DD format, so an index pattern matching everything from May 2018 might look like logstash-2018.05*.
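The same wildcard works directly against the Elasticsearch API if you want to see which concrete indices a pattern matches:

curl -XGET 'localhost:9200/_cat/indices/logstash-2018.05*?v'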