Docker container log aggregation with Elasticsearch, Kibana & Filebeat

Related tags

Loggingepilog
Overview

Epilog

>> Dead simple container log aggregation with ELK stack <<

python elasticsearch kibana kibana github_actions

Preface

Epilog aims to demonstrate a language-agnostic, non-invasive, and straightforward way to add centralized logging to your stack. Centralized logging can be difficult depending on how much control you need over the log messages, how robust you need the logging system to be, and how you want to display the data to the consumer.

Why?

Invasive logging usually entails you having to build a logging pipeline and integrate that into your application. Adding an extensive logging workflow directly to your application is non-trivial for a few reasons:

  • The workflow becomes language-specific and hard to scale as your application gets decentralized over time and starts to take advantage of multiple languages.

  • The logging pipeline gets tightly coupled with the application code.

  • Extensive logging in a blocking manner can significantly hurt the performance of the application.

  • Doing logging in a non-blocking state is difficult and usually requires a non-trivial amount of application code changes when the logging requirements change.

This repository lays out a dead-simple but extensible centralized logging workflow that collects logs from docker containers in a non-invasive manner. To achieve this, we've used the reliable ELK stack which is at this point, an industry standard.

Features

  • Asynchronous log-aggregation pipeline that's completely decoupled from the app instances generating the logs.

  • Zero effect on performance if the app instances aren't doing expensive synchronous logging operations internally.

  • Horizontal scaling is achievable by adding more nodes to the Elasticsearch cluster.

  • To keep the storage requirements at bay, log messages are automatically deleted after 7 days. This is configurable.

  • Synchronization during container startup to reduce the number of missing logs.

  • All the Log messages can be filtered and queried interactively from a centralized location via the Kibana dashboard.

Architecture

This workflow leverages Filebeat to collect the logs, Elasticsearch to store and query the log messages, and Kibana to visualize the data interactively. The following diagram explains how logs flow from your application containers and becomes queryable in the Kibana dashboards:

epilog_arch

Here, the Application is a dockerized Python module that continuously sends log messages to the standard output.

On a Unix machine, Docker containers save these log messages in the /var/lib/docker/containers/*/*.log directory. In this directory, Filebeat listens for new log messages and sends them to Elasticsearch in batches. This makes the entire logging workflow asynchronous as Filebeat isn't coupled with the application and is lightweight enough to be deployed with every instance of your application.

The log consumer can make query requests via the Kibana dashboards and interactively search and filter the relevant log messages. The Caddy reverse proxy server is helpful during local development as you won't have to memorize the ports to access Elasticsearch and Kibana. You can also choose to use Caddy instead of Ngnix as a reverse proxy and load balancer in your production orchestration.

Installation

  • Make sure you have Docker, Docker compose V2 installed on your system.

  • Clone the repo.

  • Go to the root directory and run:

    make up
    

    This will spin up 2 Elasticsearch nodes, 1 Filebeat instance, 1 log emitting app instance, and the reverse proxy server.

  • To shut down everything gracefully, run:

    make down
    
  • To kill the container processes and clean up all the volumes, run:

    make kill && make clean
    

Exploration

Once you've run the make up command:

  • To access the Kibana dashboard, go to https://kibana.localhost. Since our reverse proxy adds SSL to the localhost, your browser will complain about the site being unsafe. Just ignore it and move past.

  • When prompted for credentials, use elastic as username and debian as password. You can configure this in the .env file.

  • Once you're inside the Kibana dashboard, head over to the Logs panel under the Observability section on the left panel.

    kibana_1

  • You can filter the logs by container name. Once you start typing container.name literally, Kibana will give you suggestions based on the names of the containers running on your machine.

    kibana_2 )

  • Another filter you might want to explore is filtering by hostname. To do so, type host.name and it'll show the available host identifiers in a dropdown. In this case, all the containers live in the same host. So there's only one available host to filter by. These filters are defined in the processors segment of the filebeat.yml file. You can find a comprehensive list of processors here.

    kibana_3

Maintenance & Extensibility

  • If you need log transformation, adding Logstash to this stack is quite easy. All you'll have to do is add a Logstash instance to the docker-compose.yml file and point Filebeat to send the logs to Logstash instead of Elasticsearch. Logstash will then transform the logs and save them in the Elasticsearch search cluster.

  • To scale up the Elasticsearch cluster, you can follow the configuration of es02 node in the docker-compose file. More nodes can be added similarly to achieve horizontal scaling.

  • In a production setup, your app will most likely live in separate hosts than the Elasticsearch clusters. In that case, a Filebeat instance should live with every instance of the log generating app and these will send the logs to a centralized location—directly to Elasticsearch or first to Logstash and then to Elasticsearch clusters—depending on your need.

Disclaimer

  • This pipleline was tested in a Unix-like system, mainly Ubuntu and macOS. Also, the bash scripts might not work out of the box on Windows.

  • This setup only employs a rudimentary password-based authentication system. You should add TLS encryption to your production ELK stack. Here's an example of how you might be able to do so.

  • For demonstration purposes, this repository has .env file in the root directory. In your production application, you should never add the .env files to your version control system.

Resources

🍰
Owner
Redowan Delowar
Skeptical Empiricist. Indefatigable Walker. Software Artisan. Opinions are an amalgamation of diverse multifaceted factors.
Redowan Delowar
Json Formatter for the standard python logger

This library is provided to allow standard python logging to output log data as json objects. With JSON we can make our logs more readable by machines and we can stop writing custom parsers for syslo

Zakaria Zajac 1.4k Jan 04, 2023
ClusterMonitor - a very simple python script which monitors and records the CPU and RAM consumption of submitted cluster jobs

ClusterMonitor A very simple python script which monitors and records the CPU and RAM consumption of submitted cluster jobs. Usage To start recording

23 Oct 04, 2021
A very basic esp32-based logic analyzer capable of sampling digital signals at up to ~3.2MHz.

A very basic esp32-based logic analyzer capable of sampling digital signals at up to ~3.2MHz.

Davide Della Giustina 43 Dec 27, 2022
HTTP(s) "monitoring" webpage via FastAPI+Jinja2. Inspired by https://github.com/RaymiiOrg/bash-http-monitoring

python-http-monitoring HTTP(s) "monitoring" powered by FastAPI+Jinja2+aiohttp. Inspired by bash-http-monitoring. Installation can be done with pipenv

itzk 39 Aug 26, 2022
Command-line tool that instantly fetches Stack Overflow results when an exception is thrown

rebound Rebound is a command-line tool that instantly fetches Stack Overflow results when an exception is thrown. Just use the rebound command to exec

Jonathan Shobrook 3.9k Jan 03, 2023
Espion is a mini-keylogger tool that keeps track of all keys a user presses on his/her keyboard

Espion is a mini-keylogger tool that keeps track of all keys a user presses on his/her keyboard. The details get displayed on the terminal window and also stored in a log file.

Anurag.R.Simha 1 Apr 24, 2022
Track Nano accounts and notify via log file or email

nano-address-notifier Track accounts and notify via log file or email Required python libs

Joohansson (Json) 4 Nov 08, 2021
Soda SQL Data testing, monitoring and profiling for SQL accessible data.

Soda SQL Data testing, monitoring and profiling for SQL accessible data. What does Soda SQL do? Soda SQL allows you to Stop your pipeline when bad dat

Soda Data Monitoring 51 Jan 01, 2023
Splunk Add-On to collect audit log events from Github Enterprise Cloud

GitHub Enterprise Audit Log Monitoring Splunk modular input plugin to fetch the enterprise audit log from GitHub Enterprise Support for modular inputs

Splunk GitHub 12 Aug 18, 2022
APT-Hunter is Threat Hunting tool for windows event logs

APT-Hunter is Threat Hunting tool for windows event logs which made by purple team mindset to provide detect APT movements hidden in the sea of windows event logs to decrease the time to uncover susp

824 Jan 08, 2023
Multi-processing capable print-like logger for Python

MPLogger Multi-processing capable print-like logger for Python Requirements and Installation Python 3.8+ is required Pip pip install mplogger Manual P

Eötvös Loránd University Department of Digital Humanities 1 Jan 28, 2022
A small utility to pretty-print Python tracebacks. ⛺

TBVaccine TBVaccine is a utility that pretty-prints Python tracebacks. It automatically highlights lines you care about and deemphasizes lines you don

Stavros Korokithakis 365 Nov 11, 2022
A simple, transparent, open-source key logger, written in Python, for tracking your own key-usage statistics.

A simple, transparent, open-source key logger, written in Python, for tracking your own key-usage statistics, originally intended for keyboard layout optimization.

Ga68 56 Jan 03, 2023
Lazy Profiler is a simple utility to collect CPU, GPU, RAM and GPU Memory stats while the program is running.

lazyprofiler Lazy Profiler is a simple utility to collect CPU, GPU, RAM and GPU Memory stats while the program is running. Installation Use the packag

Shankar Rao Pandala 28 Dec 09, 2022
Python bindings for g3log

g3logPython Python bindings for g3log This library provides python3 bindings for g3log + g3sinks (currently logrotate, syslog, and a color-terminal ou

4 May 21, 2021
A lightweight logging library for python applications

cakelog a lightweight logging library for python applications This is a very small logging library to make logging in python easy and simple. config o

2 Jan 05, 2022
Token Logger with python

Oxy Token Stealer Features Grabs discord tokens Grabs chrome passwords Grabs edge passwords Nothing else, I don't feel like releasing full on malware

oxy 1 Feb 12, 2022
👻 - Simple Keylloger with Socket

Keyllogs 👻 - Simple Keylloger with Socket Keyllogs 🎲 - Run Keyllogs

Bidouffe 3 Mar 28, 2022
A colored formatter for the python logging module

Log formatting with colors! colorlog.ColoredFormatter is a formatter for use with Python's logging module that outputs records using terminal colors.

Sam Clements 778 Dec 26, 2022
Greppin' Logs: Leveling Up Log Analysis

This repo contains sample code and example datasets from Jon Stewart and Noah Rubin's presentation at the 2021 SANS DFIR Summit titled Greppin' Logs. The talk was centered around the idea that Forens

Stroz Friedberg 20 Sep 14, 2022