ELK stands for:
Elasticsearch - stores all of the logs
Logstash - processes incoming logs
Kibana - web interface for searching and visualizing logs (proxied through Nginx)
Filebeat - installed on client servers to ship their logs to Logstash. Filebeat is good for smaller log collection; other options for log shippers include Fluentd.
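A minimal filebeat.yml sketch for shipping a local log file to Logstash might look like the following (the paths, Logstash host, and port 5044 are assumptions to adapt; older 6.x Filebeat releases use filebeat.prospectors instead of filebeat.inputs):
filebeat.inputs:
  - type: log
    paths:
      - /var/log/*.log                     # assumed: which files to ship
output.logstash:
  hosts: ["logstash.example.com:5044"]     # assumed Logstash host; 5044 is the usual Beats port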
I installed via apt-get, so I manage these as systemd services with systemctl, e.g.
sudo systemctl start kibana
sudo systemctl start elasticsearch
sudo systemctl start nginx
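If you want them to come back after a reboot, enable the units as well:
sudo systemctl enable elasticsearch kibana nginx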
If running locally, make sure to set vm.max_map_count to at least 262144:
sudo sysctl -w vm.max_map_count=262144
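That only lasts until the next reboot; to make it persistent you can drop it into a sysctl config file (the file name here is just an example):
echo "vm.max_map_count=262144" | sudo tee /etc/sysctl.d/99-elasticsearch.conf
sudo sysctl --system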
Elasticsearch is configured through /etc/elasticsearch/elasticsearch.yml
Kibana is configured through /etc/kibana/kibana.yml
or /opt/kibana/config/kibana.yml
Set up your /etc/elasticsearch/elasticsearch.yml file so that it has
network.host: localhost
For a Docker or tarball install, Elasticsearch’s configuration is in elasticsearch/config/elasticsearch.yml instead.
Set up your /opt/kibana/config/kibana.yml file with
server.host: "localhost"
For nginx, also install apache2-utils
sudo apt-get install nginx apache2-utils
# Create an admin user, e.g. 'kibanaadmin' (but pick another name)
sudo htpasswd -c /etc/nginx/htpasswd.users kibanaadmin
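A sketch of the Nginx server block that proxies Kibana behind that htpasswd file (the server_name is an assumption; adjust for your host):
server {
    listen 80;
    server_name example.com;                       # assumed hostname or IP

    auth_basic "Restricted Access";
    auth_basic_user_file /etc/nginx/htpasswd.users;

    location / {
        proxy_pass http://localhost:5601;          # Kibana's default port
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}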
The Logstash configuration is stored in logstash/config/logstash.yml
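Note that logstash.yml holds Logstash's settings; the actual pipeline (inputs, filters, outputs) lives in a separate pipeline config, e.g. under /etc/logstash/conf.d/ on an apt install. A minimal sketch, assuming Filebeat ships to the default Beats port 5044:
input {
  beats {
    port => 5044                             # assumed: Filebeat's output.logstash points here
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"       # daily indices, matching the logstash-* pattern
  }
}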
Kibana’s Getting Started - https://www.elastic.co/guide/en/kibana/current/getting-started.html
Shakespeare Data
{
"line_id": INT,
"play_name": "String",
"speech_number": INT,
"line_number": "String",
"speaker": "String",
"text_entry": "String",
}
Accounts Data
{
"account_number": INT,
"balance": INT,
"firstname": "String",
"lastname": "String",
"age": INT,
"gender": "M or F",
"address": "String",
"employer": "String",
"email": "String",
"city": "String",
"state": "String"
}
Schema for Logs Data
{
"memory": INT,
"geo.coordinates": "geo_point"
"@timestamp": "date"
}
Set up mappings for the datasets
Shakespeare Mapping
curl -XPUT 'localhost:9200/shakespeare?pretty' -H 'Content-Type: application/json' -d'
{
"mappings": {
"doc": {
"properties": {
"speaker": {"type": "keyword"},
"play_name": {"type": "keyword"},
"line_id": {"type": "integer"},
"speech_number": {"type": "integer"}
}
}
}
}
'
The accounts data doesn’t require a mapping.
Log Mappings
curl -XPUT 'localhost:9200/logstash-2015.05.18?pretty' -H 'Content-Type: application/json' -d'
{
"mappings": {
"log": {
"properties": {
"geo": {
"properties": {
"coordinates": {
"type": "geo_point"
}
}
}
}
}
}
}
'
Load data sets into Elasticsearch using the Elasticsearch API:
curl -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/bank/account/_bulk?pretty' --data-binary @accounts.json
curl -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/shakespeare/doc/_bulk?pretty' --data-binary @shakespeare_6.0.json
curl -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/_bulk?pretty' --data-binary @logs.jsonl
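Then check that the indices were created and the documents loaded:
curl 'localhost:9200/_cat/indices?v'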
So remember what we're trying to see: logs flowing into Elasticsearch, searchable and visualizable in Kibana. Try it with the docker-compose setup below.
So far the best docker-elk setup I’ve seen is here:
https://github.com/deviantony/docker-elk
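To bring it up (assuming Docker and docker-compose are installed):
git clone https://github.com/deviantony/docker-elk.git
cd docker-elk
docker-compose up -d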
You’ll see these ports exposed on localhost:
5601 - Kibana web interface
9300 - Elasticsearch TCP transport
9200 - Elasticsearch HTTP
5000 - Logstash TCP input
Elasticsearch
curl localhost:9200
will@xps ~ $ curl localhost:9200
{
"name" : "vCF3oXg",
"cluster_name" : "docker-cluster",
"cluster_uuid" : "q49MoAYWRzSTgpnV2VU2rw",
"version" : {
"number" : "6.2.2",
"build_hash" : "10b1edd",
"build_date" : "2018-02-16T19:01:30.685723Z",
"build_snapshot" : false,
"lucene_version" : "7.2.1",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
First we need to inject some log entries.
$ nc localhost 5000 < /path/to/logfile.log
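If you don't have a log file handy, a single test line works too; the -q 1 flag (close the connection after stdin ends) depends on your netcat variant:
$ echo "test log entry $(date)" | nc -q 1 localhost 5000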
After getting some logs, we’ll create an index pattern via the Kibana API:
$ curl -XPOST -D- 'http://localhost:5601/api/saved_objects/index-pattern' \
-H 'Content-Type: application/json' \
-H 'kbn-version: 6.2.2' \
-d '{"attributes":{"title":"logstash-*","timeFieldName":"@timestamp"}}'
There are a lot of different logging formats, including:
Syslog is the standard solution for logging on UNIX.
In a normal configuration, log messages are written to plain text files. It's simple, but plain text lacks structure: if a lot of unrelated information ends up in one file, it's hard to filter down to the topics you want, and if you split logs into pre-defined topics, you end up with many smaller files and no easy way to correlate information between them.
Advantages:
Disadvantages:
Journald replaces text files with a more structured format while retaining full syslog compatibility (e.g. forwarding plain-text versions of messages to an existing syslog).
Advantages:
Disadvantages:
Journald's logs are queried with the journalctl command.
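For example (the unit name is only an illustration):
journalctl -u elasticsearch.service --since today   # logs for a single unit
journalctl -p err -b                                 # errors since the current boot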
Go to https://tools.ietf.org/html/rfc5424 for the RFC on structured logging (the syslog protocol). It defines common fields all logs should have, including: hostname, app name, datetime, message, priority.
In the Kibana query bar, you can search the index pattern using either the Lucene query syntax (the default) or the full Elasticsearch query DSL.
Search for the word ‘foo’ in the ‘title’ field
title:foo
Search for phrase ‘foo bar’ in the ‘title’ field
title:"foo bar"
Search for phrase ‘foo bar’ in the ‘title’ field AND the phrase ‘quick fox’ in the ‘body’ field
title:"foo bar" AND body: "quick fox"
Each set of data loaded to Elasticsearch has an index pattern. An index pattern is a string with optional wildcards that can match multiple indices.
Index names often contain a date in YYYY.MM.DD format, so an index pattern matching all of May 2018 might look like logstash-2018.05*.