img2dataset


Easily turn large sets of image urls into an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Also supports saving captions for url+caption datasets.

Install

pip install img2dataset

Usage

First, get an image url list. For example:

echo 'https://placekitten.com/200/305' >> myimglist.txt
echo 'https://placekitten.com/200/304' >> myimglist.txt
echo 'https://placekitten.com/200/303' >> myimglist.txt

Then, run the tool:

img2dataset --url_list=myimglist.txt --output_folder=output_folder --thread_count=64 --image_size=256

The tool will then automatically download the urls, resize the images, and store them in the following format:

  • output_folder
    • 0
      • 0.jpg
      • 1.jpg
      • 2.jpg

or in this format if webdataset is chosen:

  • output_folder
    • 0.tar containing:
      • 0.jpg
      • 1.jpg
      • 2.jpg

with each number being the position of the image in the list. The subfolders avoid having too many files in a single folder.

If captions are provided, they will be saved as 0.txt, 1.txt, ...

This can then easily be fed into machine learning training or any other use case.
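
For example, a webdataset output can be streamed into a training loop with the webdataset library. A minimal sketch (it assumes the webdataset and Pillow packages are installed, that --output_format webdataset was used, and that captions were provided):

# Minimal sketch, not part of img2dataset itself.
import webdataset as wds

dataset = (
    wds.WebDataset("output_folder/0.tar")  # tar produced by img2dataset
    .decode("pil")                         # decode jpg bytes into PIL images
    .to_tuple("jpg", "txt")                # yield (image, caption) pairs
)

for image, caption in dataset:
    print(image.size, caption)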

If the save_metadata option is turned on (it is by default), then .json files named 0.json, 1.json, ... are saved with these keys:

  • url
  • caption
  • key
  • shard_id
  • status : whether the download succeeded
  • error_message
  • width
  • height
  • original_width
  • original_height
  • exif

A .parquet file will also be saved with the same name as the subfolder/tar file, containing the same metadata. It can be used to analyze the results efficiently.
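
For instance, a quick sketch of such an analysis (assuming pandas is installed; 0.parquet mirrors the shard naming described above):

# Minimal sketch, assuming pandas is installed.
import pandas as pd

df = pd.read_parquet("output_folder/0.parquet")  # one metadata file per shard
print(df["status"].value_counts())               # how many downloads succeeded
print(df["error_message"].value_counts())        # most frequent failure reasons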

Integration with Weights & Biases

Performance metrics are monitored through Weights & Biases.

W&B metrics

In addition, the most frequent errors are logged for easier debugging.

W&B table

Other features are available:

  • logging of environment configuration (OS, python version, CPU count, Hostname, etc)
  • monitoring of hardware resources (GPU/CPU, RAM, Disk, Networking, etc)
  • custom graphs and reports
  • comparison of runs (convenient when optimizing parameters such as number of threads/cpus)

When running the script for the first time, you can decide to either associate your metrics with your account or log them anonymously.

You can also log in (or create an account) beforehand by running wandb login.

API

This module exposes a single function, download, which takes the same arguments as the command line tool (a sketch of calling it follows the parameter list):

  • url_list A file with the list of urls of images to download. It can be a folder of such files. (required)
  • image_size The size to resize images to (default 256)
  • output_folder The path to the output folder. If existing subfolders are present, the tool will continue with the next number. (default "images")
  • processes_count The number of processes used for downloading the pictures. Setting this high is important for performance. (default 1)
  • thread_count The number of threads used for downloading the pictures. Setting this high is important for performance. (default 256)
  • resize_mode The way to resize pictures, can be no, border, keep_ratio or center_crop (default border)
    • no doesn't resize at all
    • border will make the image image_size x image_size and add a border
    • keep_ratio will keep the ratio and make the smallest side of the picture image_size
    • center_crop will keep the ratio and center crop the largest side so the picture is squared
  • resize_only_if_bigger resize pictures only if bigger than the image_size (default False)
  • output_format decides how to save pictures (default files)
    • files saves as a set of subfolders containing pictures
    • webdataset saves as tars containing pictures
  • input_format decides how to load the urls (default txt)
    • txt loads the urls as a text file of urls, one per line
    • csv loads the urls and optional caption as a csv
    • tsv loads the urls and optional caption as a tsv
    • parquet loads the urls and optional caption as a parquet
  • url_col the name of the url column for parquet and csv (default url)
  • caption_col the name of the caption column for parquet and csv (default None)
  • number_sample_per_shard the number of samples that will be downloaded in one shard (default 10000)
  • save_metadata if true, saves one parquet file per folder/tar and json files with metadata (default True)
  • save_additional_columns list of additional columns to take from the csv/parquet files and save in metadata files (default None)
  • timeout maximum time (in seconds) to wait when trying to download an image (default 10)
  • wandb_project name of W&B project used (default img2dataset)
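
For example, a minimal sketch of calling it from Python, mirroring the command line example above (the values are illustrative):

import img2dataset

img2dataset.download(
    url_list="myimglist.txt",
    output_folder="output_folder",
    thread_count=64,
    image_size=256,
)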

How to tweak the options

The default values should be good enough for small datasets. For larger ones, these tips may help you get the best performance (a tuned example follows the list):

  • set the processes_count as the number of cores your machine has
  • increase thread_count as long as your bandwidth and cpu are below the limits
  • I advise setting output_format to webdataset if your dataset has more than 1M elements; it is easier to manipulate a few tars than millions of files
  • keeping save_metadata at True can be useful to check which items were already saved and avoid redownloading them
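
For instance, a configuration following these tips might look like this (a sketch only; large_url_list.parquet is a hypothetical input file and the exact values depend on your machine and bandwidth):

import multiprocessing

import img2dataset

img2dataset.download(
    url_list="large_url_list.parquet",            # hypothetical input file
    input_format="parquet",
    url_col="url",
    caption_col="caption",
    output_folder="output_folder",
    output_format="webdataset",                   # a few tars instead of millions of files
    image_size=256,
    processes_count=multiprocessing.cpu_count(),  # one process per core
    thread_count=256,                             # raise while bandwidth and cpu allow
    save_metadata=True,                           # keeps track of what was already saved
)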

Road map

This tool works very well in the current state for up to 100M elements. Future goals include:

  • a benchmark for 1B pictures, which may require:
    • further optimization of the resizing part
    • better multi node support
    • integrated support for incremental downloads (only download new elements)

Architecture notes

This tool is designed to download pictures as fast as possible. This puts stress on various kinds of resources. Some numbers, assuming 1350 images/s (a back-of-the-envelope check follows the list):

  • Bandwidth: downloading a thousand average images per second requires about 130MB/s
  • CPU: resizing one image may take several milliseconds; several thousand per second can use up to 16 cores
  • DNS querying: millions of urls mean millions of domains; default OS settings are usually not enough. Setting up a local bind9 resolver may be required
  • Disk: if using resizing, up to 30MB/s write speed is necessary. If not using resizing, up to 130MB/s. Writing to a few tar files makes it possible to use rotational drives instead of an SSD.
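
As a back-of-the-envelope check of these numbers (the average image sizes are implied by the stated throughput, not measured):

rate = 1350              # images per second (observed)
download_bw = 130e6      # ~130MB/s of incoming bandwidth
resized_write_bw = 30e6  # ~30MB/s written to disk when resizing

print(download_bw / rate)       # ~96KB average original image
print(resized_write_bw / rate)  # ~22KB average resized image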

With this information in mind, the design choices were made as follows:

  • the list of urls is split into N shards. N is usually chosen as the number of cores
  • N processes are started (using a multiprocessing process pool)
    • each process starts M threads. M should be maximized in order to use as much network as possible while keeping cpu usage below 100%.
    • each of these threads downloads 1 image and returns it
    • the parent thread handles resizing (which means there are at most N resizes running at once, using up the cores but not more)
    • the parent thread saves to a tar file that is different from those of the other processes

This design makes it possible to use CPU resources efficiently by doing only 1 resize per core, to reduce disk overhead by opening only 1 file per core, and to use as much bandwidth as possible with M threads per process.
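
A highly simplified sketch of this layout (not the actual img2dataset implementation; resizing is reduced to a comment and details are kept to the bare minimum):

# Simplified sketch of the N-processes x M-threads layout described above;
# not the actual img2dataset implementation.
import io
import tarfile
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from multiprocessing import Pool


def download_one(entry):
    # runs in one of the M threads: fetch a single image, return its bytes
    index, url = entry
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return index, response.read()
    except Exception:
        return index, None


def process_shard(args):
    # runs in one of the N processes: download a shard with M threads,
    # then resize (omitted here) and write to this process's own tar file
    shard_id, urls = args
    with ThreadPoolExecutor(max_workers=256) as threads, \
            tarfile.open(f"{shard_id}.tar", "w") as tar:
        for index, data in threads.map(download_one, enumerate(urls)):
            if data is None:
                continue
            info = tarfile.TarInfo(name=f"{index}.jpg")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return shard_id


if __name__ == "__main__":
    shards = [
        (0, ["https://placekitten.com/200/305"]),
        (1, ["https://placekitten.com/200/304"]),
    ]
    with Pool(processes=2) as processes:  # N is usually the core count
        processes.map(process_shard, shards)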

Setting up a bind9 resolver

In order to keep the success rate high, it is necessary to use an efficient DNS resolver. I tried several options: systemd-resolved, dnsmasq and bind9, and reached the conclusion that bind9 gives the best performance for this use case. Here is how to set this up on ubuntu:

sudo apt install bind9
sudo vim /etc/bind/named.conf.options

Add this inside the options block:
        recursive-clients 10000;
        resolver-query-timeout 30000;
        max-clients-per-query 10000;
        max-cache-size 2000m;

sudo systemctl restart bind9

sudo vim /etc/resolv.conf

Put this content:
nameserver 127.0.0.1

This makes it possible to keep a high success rate while doing thousands of dns queries. You may also want to set up bind9 logging in order to check that few dns errors happen.

For development

Either locally, or in gitpod (do export PIP_USER=false there)

Set up a virtualenv:

python3 -m venv .env
source .env/bin/activate
pip install -e .

to run tests:

pip install -r requirements-test.txt

then

python -m pytest -v tests -s

Benchmarks

10000 image benchmark

cd tests
bash benchmark.sh

18M image benchmark

Download the first part of the crawling at home dataset, then:

cd tests
bash large_bench.sh

It takes 3.7h to download 18M pictures.

1350 images/s is the currently observed performance. 4.8M images per hour, 116M images per 24h.

36M image benchmark

Downloading 2 parquet files of 18M items each (936GB result) took 7h24, an average of 1345 images/s.

190M benchmark

Downloading 190M images from the crawling at home dataset took 41h (5TB result), an average of 1280 images/s.

Comments
  • Downloader is not producing full set of expected outputs

    Heya, I was trying to download the LAION400M dataset and noticed that I am not getting the full set of data for some reason.

    Any tips on debugging further?

    TL;DR - I was expecting ~12M files to be downloaded, only seeing successes in *_stats.json files indicating ~2M files were actually downloaded

    For example - I recently tried to download this dataset in a distributed manner on EMR:

    https://deploy.laion.ai/8f83b608504d46bb81708ec86e912220/dataset/part-00000-5b54c5d5-bbcf-484d-a2ce-0d6f73df1a36-c000.snappy.parquet

    I applied some light NSFW filtering on it to produce a new parquet

    # rest of the script is redacted, but there is some code before this to normalize the NSFW row to make filtering more convenient
    sampled_df = df[df["NSFW"] == "unlikely"]
    sampled_df.reset_index(inplace=True)
    

    Verified its row count is ~12M samples:

    import glob
    import json
    from pyarrow.parquet import ParquetDataset
    
    files = glob.glob("*.parquet")
    
    d = {}
    
    for file in files:
        d[file] = 0
        dataset = ParquetDataset(file)
        for piece in dataset.pieces:
            d[file] += piece.get_metadata().num_rows
    
    print(json.dumps(d, indent=2, sort_keys=True))
    
    {
      "part00000.parquet": 12026281
    }
    

    Ran the download, and scanned over the output s3 bucket:

    aws s3 cp\
    	s3://path/to/s3/download/ . \
    	--exclude "*" \
    	--include "*.json" \
    	--recursive
    

    Ran this script to get the total count of images downloaded:

    import json
    import glob
    
    files = glob.glob("/path/to/json/files/*.json")
    
    count = {}
    successes = {}
    
    for file in files:
        with open(file) as f:
            j = json.load(f)
            count[file] = j["count"]
            successes[file] = j["successes"]
    
    rate = 100 * sum(successes.values()) / sum(count.values())
    print(f"Success rate: {rate}. From {sum(successes.values())} / {sum(count.values())}")
    

    which gave me the following output:

    Success rate: 56.15816066896948. From 1508566 / 2686281
    

    The high error rate here is not of major concern; I was running at a low worker node count for experimentation so we have a lot of dns issues (I'll use a knot resolver later)

    unknown url type: '21nicrmo2'                                                      1.0
    <urlopen error [errno 22] invalid argument>                                        1.0
    encoding with 'idna' codec failed (unicodeerror: label empty or too long)          1.0
    http/1.1 401.2 unauthorized\r\n                                                    4.0
    <urlopen error no host given>                                                      5.0
    <urlopen error unknown url type: "https>                                          11.0
    incomplete read                                                                   14.0
    <urlopen error [errno 101] network is unreachable>                                38.0
    <urlopen error [errno 104] connection reset by peer>                              75.0
    [errno 104] connection reset by peer                                              92.0
    opencv                                                                           354.0
    <urlopen error [errno 113] no route to host>                                     448.0
    remote end closed connection without response                                    472.0
    <urlopen error [errno 111] connection refused>                                  1144.0
    encoding issue                                                                  2341.0
    timed out                                                                       2850.0
    <urlopen error timed out>                                                       4394.0
    the read operation timed out                                                    4617.0
    image decoding error                                                            5563.0
    ssl                                                                             6174.0
    http error                                                                     62670.0
    <urlopen error [errno -2] name or service not known>                         1086446.0
    success                                                                      1508566.0
    

    I also noticed there were only 270 json files produced, but given that each shard should contain 10,000 images, I expected ~1,200 json files to be produced. Not sure where this discrepancy is coming from

    > ls
    00000_stats.json  00051_stats.json  01017_stats.json  01066_stats.json  01112_stats.json  01157_stats.json
    00001_stats.json  00052_stats.json  01018_stats.json  01067_stats.json  01113_stats.json  01159_stats.json
    ...
    > ls -l | wc -l 
    270
    
    opened by PranshuBansalDev 33
  • Increasing mem and no output files

    Currently using your tool to download laion dataset, thank you for your contribution. The program grows in memory until it uses all of my 32G of RAM and 64G of SWAP. No tar files are ever output. Am I doing something wrong?

    Using the following command (slightly modified from official command provided by laion):

    img2dataset --url_list laion400m-meta --input_format "parquet" \
        --url_col "URL" --caption_col "TEXT" --output_format webdataset \
        --output_folder webdataset --processes_count 1 --thread_count 12 --image_size 384 \
        --save_additional_columns '["NSFW","similarity","LICENSE"]'

    opened by pbatk 23
  • feat: support tfrecord

    Add support for tfrecords.

    The webdataset format is not very convenient on TPU's due to bad support of pytorch dataloaders in multiprocessing at the moment so tfrecords allow better usage of CPU's.

    opened by borisdayma 22
  • Download stall at the end

    I'm trying to download the CC3M dataset on an AWS Sagemaker Notebook instance. I first do pip install img2dataset. Then I fired up a terminal and do

    img2dataset --url_list cc3m.tsv --input_format "tsv"\
             --url_col "url" --caption_col "caption" --output_format webdataset\
               --output_folder cc3m --processes_count 16 --thread_count 64 --resize_mode no\
                 --enable_wandb False
    

    Code runs and downloads but stalls towards the end. I tried terminating by restarting the instance; as a result, some .tar files give a read error "Unexpected end of file" when using the tar files for training. I also tried to terminate it using Ctrl-C on a second run, which resulted in the same read error when using the tar files for training. The difference between the two termination methods is that the latter seemed to do some cleanup which removed the "_tmp" folder within the download folder.

    opened by xiankgx 13
  • Respect noai and noimageai directives when downloading image files

    Media owners can use the X-Robots-Tag header to communicate usage directives for the associated media, including instruction that the image not be used in any indexes (noindex) or included in datasets used for machine learning purposes (noai).

    This PR makes img2dataset respect such directives by not including associated media in the generated dataset. It also updates the useragent string, introducing an img2dataset user agent token so that requests made using the tool are identifiable by media hosts.

    Refs:

    • https://developers.google.com/search/docs/crawling-indexing/robots-meta-tag#xrobotstag
    • https://www.deviantart.com/team/journal/A-New-Directive-for-Opting-Out-of-AI-Datasets-934500371
    opened by raincoastchris 12
  • How to download SBUcaptions and Visual Genome (VG) dataset in webdataset format

    For Vision and Language pretraining cc3m, mscoco, SBUcaptions and VG are very relevant datasets. I haven't been able to download SBU captions and VG. Here are my questions.

    1. How to download SBU captions and VG's metadata?
    2. How to download these datasets on webdataset format?

    Could you also please provide me with a tutorial or just some hints to download it in webdataset format using img2dataset? Thank you in advance.

    opened by sanyalsunny111 8
  • clip-retrieval-getting-started.ipynb giving errors (Urgent)

    Hello there, I am new to the world of deep learning. I am trying to run clip-retrieval-getting-started.ipynb but getting the error attached as a snip. Please help, it's urgent.

    opened by minakshimathpal 8
  • Decrease memory usage

    Currently the memory usage is about 1.5GB per core. That's way too much; it must be possible to decrease it. Figure out what's using all that ram (is it because the resize queue is full? should there be some backpressure on the downloader? etc.) and solve it.

    opened by rom1504 8
  • Interest in supporting video datasets?

    Hi. Thanks for the amazing repository. It really makes the workflow very easy. I was wondering if you are considering adding video datasets as well. Some are based on urls, while others are derived from youtube or segments from youtube.

    opened by TheShadow29 7
  • Add checksum of image

    I think it could be useful to add a checksum in the parquet files since we're downloading the images anyway and it's fast to compute. It would help us do a real deduplication, not only on urls but on actual image content.

    opened by borisdayma 7
  • add list of int, float feature in TFRecordSampleWriter

    We use list-of-int and list-of-float attributes in the coyo-labeled-300m dataset (it will be released soon). To create the dataset using img2dataset in tfrecord format, we need to add the above features.

    opened by justHungryMan 6
  • Figure out how to timeout

    I implemented some new metrics and found that many urls time out after 20s, which clearly slows down everything

    here are some examples:

    Downloaded (12, 'http://www.herteldenbirname.com/wp-content/uploads/2014/05/Italia-Independent-Flocked-Aviator-Sunglasses-150x150.jpg') in 10.019284009933472
    Downloaded (124, 'http://image.rakuten.co.jp/sneak/cabinet/shoes-03/cr-ucrocs5-a.jpg?_ex=128x128') in 10.01184344291687
    Downloaded (146, 'http://www.slicingupeyeballs.com/wp-content/uploads/2009/05/stoneroses452.jpg') in 10.006474256515503
    Downloaded (122, 'https://media.mwcradio.com/mimesis/2013-03/01/2013-03-01T153415Z_1_CBRE920179600_RTROPTP_3_TECH-US-GERMANY-EREADER_JPG_475x310_q85.jpg') in 10.241626739501953
    Downloaded (282, 'https://8d1aee3bcc.site.internapcdn.net/00/images/media/5/5cfb2eba8f1f6244c6f7e261b9320a90-1.jpg') in 10.431355476379395
    Downloaded (298, 'https://my-furniture.com.au/media/catalog/product/cache/1/small_image/295x295/9df78eab33525d08d6e5fb8d27136e95/a/u/au0019-stool-01.jpg') in 10.005694150924683
    Downloaded (300, 'http://images.tastespotting.com/thumbnails/889506.jpg') in 10.007027387619019
    Downloaded (330, 'https://www.infoworld.pk/wp-content/uploads/2016/02/Cool-HD-Valentines-Day-Wallpapers-480x300.jpeg') in 10.004335880279541
    Downloaded (361, 'http://pendantscarf.com/image/cache/data/necklace/JW0013-(2)-150x150.jpg') in 10.00539231300354
    Downloaded (408, 'https://www.solidrop.net/photo-6/animorphia-coloring-books-for-adults-children-drawing-book-secret-garden-style-relieve-stress-graffiti-painting-book.jpg') in 10.004313945770264

    Let's try to implement request timeout

    I tried #153 , eventlet and #260 and none of them can timeout properly

    A good value for timeout is 2s

    opened by rom1504 16
  • Add asyncio implementation of downloader

    #252 #256

    The implementation of the asyncio downloader. It also runs properly on Windows (without a 3rd party dns resolver) with an average of 500~600Mbps (on a 1Gbps network).

    Use the command arg --downloader to choose the type of downloader ("normal", "async"):

    img2dataset --downloader async
    

    mscoco download test

    opened by KohakuBlueleaf 3
  • opencv-python => opencv-python-headless

    This PR replaces opencv-python with opencv-python-headless to remove the dependency on GUI-related libraries (see: https://github.com/opencv/opencv-python/issues/370#issuecomment-671202529). I tested this working on the python:3.9 Docker image.

    opened by shionhonda 2