5 Steps to Speed Up Your Data-Analysis on a Single Core

Material for my talk at PyConDE & PyData Berlin 2022.

Description

Your data analysis pipeline works. Nice.
Could it be faster? Probably.
Do you need to parallelize? Not yet.

We'll go through optimization steps that boost the performance of your data analysis pipeline on a single core, reducing time & costs. This walkthrough shows tools and strategies to identify and mitigate bottlenecks, and demonstrates them in an example. The 5 steps cover:

1. Identifying bottlenecks: Profiling
2. Efficient IO
3. Vectorization
4. Memory & Precision Tradeoffs
5. JIT-ting with Numba
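
As a rough illustration of the first step, here is a minimal profiling sketch using only the standard library's cProfile and pstats. The functions `pipeline` and `slow_sum_of_squares` are hypothetical stand-ins and are not taken from the talk's notebooks; the point is only how a profile report can show which stage dominates the runtime.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    # Deliberately naive pure-Python loop, standing in for a bottleneck.
    total = 0
    for i in range(n):
        total += i * i
    return total


def pipeline(n):
    # Hypothetical two-stage pipeline: one cheap stage, one expensive stage.
    cheap = sum(range(100))
    expensive = slow_sum_of_squares(n)
    return cheap + expensive


def profile_pipeline(n):
    # Collect timing data only while the pipeline itself is running.
    profiler = cProfile.Profile()
    profiler.enable()
    result = pipeline(n)
    profiler.disable()

    # Render the five most expensive entries, sorted by cumulative time.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


result, report = profile_pipeline(200_000)
print(report)
```

In the printed report, the cumulative-time column should point at `slow_sum_of_squares` as the stage worth optimizing first.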

This talk is suited for data scientists at a beginner or intermediate level, typically working with a numpy/scipy/… stack or similar. It gives strategies & concrete suggestions for speeding up an existing analysis pipeline, demonstrated on a practical example that shows the speed improvement gained at each step.
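
To give a flavour of two of these steps, vectorization and the memory/precision tradeoff, here is a small NumPy sketch. It is an assumption-laden illustration rather than the example from the talk: the helpers `center_loop` and `center_vectorized` and the random test data are made up for this purpose.

```python
import numpy as np


def center_loop(values):
    # Pure-Python version: subtract the mean element by element.
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def center_vectorized(values):
    # Vectorized version: NumPy does the loop in compiled code.
    return values - values.mean()


# Hypothetical input data: one million random floats in [0, 1).
rng = np.random.default_rng(seed=0)
data64 = rng.random(1_000_000)

# Both versions compute the same result on a small slice.
sample = data64[:1000]
same_result = np.allclose(center_loop(sample.tolist()), center_vectorized(sample))

# Memory/precision tradeoff: float32 halves the memory footprint
# at the cost of a small rounding error per value.
data32 = data64.astype(np.float32)
max_err = np.abs(data64 - data32).max()

print(f"float64: {data64.nbytes} bytes, float32: {data32.nbytes} bytes")
print(f"largest rounding error after the cast: {max_err:.2e}")
```

Whether the reduced precision is acceptable depends on the analysis; measuring the error on representative data, as above, is one way to decide.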

Installation & Usage

python3 -m pip install poetry
poetry install
poetry run python -m jupyterlab

Dev

./format.sh

Folders and files

data
profiles
utils
.gitignore
0 - Preparation.ipynb
1 - Data Analysis Speedup.ipynb
2 - Speedup Jit.ipynb
Data Analysis Speedup.pdf
README.md
format.sh
poetry.lock
pyproject.toml

jstriebel/data-analysis-speedup
