Fully reproducible, Dockerized, step-by-step, tutorial on how to mock a "real-time" Kafka data stream from a timestamped csv file. Detailed blog post published on Towards Data Science.

Overview

time-series-kafka-demo

img

Mock stream producer for time series data using Kafka.

I walk through this tutorial and others here on GitHub and on my Medium blog. Here is a friend link for open access to the article on Towards Data Science: Make a mock “real-time” data stream with Python and Kafka. I'll always add friend links on my GitHub tutorials for free Medium access if you don't have a paid Medium membership (referral link).

If you find any of this useful, I always appreciate contributions to my Saturday morning fancy coffee fund!

This repo demos how to convert a csv file of timestamped data into a real-time stream useful for testing streaming analytics. An example input file with random time series data and a script for generating the file are included in the data directory.

The producer and consumer Python scripts use Confluent's Kafka client for Python, which is installed in the Docker image built with the accompanying Dockerfile, if you choose to use it.

Requires Docker and Docker Compose.

Usage

Clone repo and cd into directory.

git clone https://github.com/mtpatter/time-series-kafka-demo.git
cd time-series-kafka-demo

Start the Kafka broker

docker compose up --build

Build a Docker image (optionally, for the producer and consumer)

From the main root directory:

docker build -t "kafkacsv" .

If you want to use Docker for the python scripts, this should now work:

docker run -it --rm kafkacsv python bin/sendStream.py -h

Start a consumer

To start a consumer for printing all messages in real-time from the stream "my-stream":

python bin/processStream.py my-stream

or with Docker:

docker run -it --rm \
      -v $PWD:/home \
      --network=host \
      kafkacsv python bin/processStream.py my-stream

Produce a time series stream

Send time series from data/data.csv to topic “my-stream”, and speed it up by a factor of 10.

python bin/sendStream.py data/data.csv my-stream --speed 10

or with Docker:

docker run -it --rm \
      -v $PWD:/home \
      --network=host \
      kafkacsv python bin/sendStream.py data/data.csv my-stream --speed 10

Shut down and clean up

Stop the consumer with Return and Ctrl+C.

Shutdown Kafka broker system:

docker compose down
Course Materials for Math 340

UBC Math 340 Materials This repository aims to be the one repository for which you can find everything you about Math 340. Lecture Notes Lecture Notes

2 Nov 25, 2021
Searches a document for hash tags. Support multiple natural languages. Works in various contexts.

ht-getter Searches a document for hash tags. Supports multiple natural languages. Works in various contexts. This package uses a non-regex approach an

Rairye 1 Mar 01, 2022
Python Advanced --- numpy, decorators, networking

Python Advanced --- numpy, decorators, networking (and more?) Hello everyone 👋 This is the project repo for the "Python Advanced - ..." introductory

Andreas Poehlmann 2 Nov 05, 2021
Plugins for MkDocs.

Plugins for MkDocs and Python Markdown pip install neoteroi-mkdocs This package includes the following plugins and extensions: Name Description Type m

35 Dec 23, 2022
This is the repository that includes the code material for the ESweek 2021 for the Education Class Lecture A3 "Learn to Drive (and Race!) Autonomous Vehicles"

ESweek2021_educationclassA3 This is the repository that includes the code material for the ESweek 2021 for the Education Class Lecture A3 "Learn to Dr

F1TENTH Autonomous Racing Community 29 Dec 06, 2022
Explain yourself! Interrogate a codebase for docstring coverage.

interrogate: explain yourself Interrogate a codebase for docstring coverage. Why Do I Need This? interrogate checks your code base for missing docstri

Lynn Root 435 Dec 29, 2022
Python Programming (Practical) (1-25) Download 👇🏼

BCA-603 : Python Programming (Practical) (1-25) Download zip 🙂 🌟 How to run programs : Clone or download this repo to your computer. Unzip (If you d

Milan Jadav 2 Jun 02, 2022
YAML metadata extension for Python-Markdown

YAML metadata extension for Python-Markdown This extension adds YAML meta data handling to markdown with all YAML features. As in the original, metada

Nikita Sivakov 14 Dec 30, 2022
Members: Thomas Longuevergne Program: Network Security Course: 1DV501 Date of submission: 2021-11-02

Mini-project report Members: Thomas Longuevergne Program: Network Security Course: 1DV501 Date of submission: 2021-11-02 Introduction This project was

1 Nov 08, 2021
the project for the most brutal and effective language learning technique

- "The project for the most brutal and effective language learning technique" (c) Alex Kay The langflow project was created especially for language le

Alexander Kaigorodov 7 Dec 26, 2021
Seamlessly integrate pydantic models in your Sphinx documentation.

Seamlessly integrate pydantic models in your Sphinx documentation.

Franz Wöllert 71 Dec 26, 2022
Coursera learning course Python the basics. Programming exercises and tasks

HSE_Python_the_basics Welcome to BAsics programming Python! You’re joining thousands of learners currently enrolled in the course. I'm excited to have

PavelRyzhkov 0 Jan 05, 2022
Word document generator with python

In this study, real world data is anonymized. The content is completely different, but the structure is the same. It was a script I prepared for the backend of a work using UiPath.

Ezgi Turalı 3 Jan 30, 2022
Generate a single PDF file from MkDocs repository.

PDF Generate Plugin for MkDocs This plugin will generate a single PDF file from your MkDocs repository. This plugin is inspired by MkDocs PDF Export P

198 Jan 03, 2023
LotteryBuyPredictionWebApp - Lottery Purchase Prediction Model

Lottery Purchase Prediction Model Objective and Goal Predict the lottery type th

Wanxuan Zhang 2 Feb 14, 2022
MkDocs Plugin allowing your visitors to *File > Print > Save as PDF* the entire site.

mkdocs-print-site-plugin MkDocs plugin that adds a page to your site combining all pages, allowing your site visitors to File Print Save as PDF th

Tim Vink 67 Jan 04, 2023
A PyTorch implementation of Deep SAD, a deep Semi-supervised Anomaly Detection method.

Deep SAD: A Method for Deep Semi-Supervised Anomaly Detection This repository provides a PyTorch implementation of the Deep SAD method presented in ou

Lukas Ruff 276 Jan 04, 2023
Second version of SQL-PYTHON-Practicas

SQLite-Python Acerca de | Autor Sobre el repositorio Segunda version de SQL-PYTHON-Practicas 💻 Tecnologias Visual Studio Code Python SQLite3 📖 Requi

1 Jan 06, 2022
Python-samples - This project is to help someone need some practices when learning python language

Python-samples - This project is to help someone need some practices when learning python language

Gui Chen 0 Feb 14, 2022
Sphinx-performance - CLI tool to measure the build time of different, free configurable Sphinx-Projects

CLI tool to measure the build time of different, free configurable Sphinx-Projec

useblocks 11 Nov 25, 2022