Generate images from texts. In Russian

Last update: Dec 31, 2022

Overview

ruDALL-E

Generate images from texts

pip install rudalle==1.1.0rc0

🤗 HF Models:

ruDALL-E Malevich (XL)
ruDALL-E Emojich (XL) (readme here)
ruDALL-E Surrealist (XL)

Minimal Example:

Example usage of ruDALL-E Malevich (XL) with 3.5 GB of VRAM!

Finetuning example

generation by ruDALLE:

import ruclip
from rudalle.pipelines import generate_images, show, super_resolution, cherry_pick_by_ruclip
from rudalle import get_rudalle_model, get_tokenizer, get_vae, get_realesrgan
from rudalle.utils import seed_everything

# prepare models:
device = 'cuda'
dalle = get_rudalle_model('Malevich', pretrained=True, fp16=True, device=device)
tokenizer = get_tokenizer()
vae = get_vae(dwt=True).to(device)

# pipeline utils:
realesrgan = get_realesrgan('x2', device=device)
clip, processor = ruclip.load('ruclip-vit-base-patch32-384', device=device)
clip_predictor = ruclip.Predictor(clip, processor, device, bs=8)
text = 'радуга на фоне ночного города'

seed_everything(42)
pil_images = []
scores = []
for top_k, top_p, images_num in [
    (2048, 0.995, 24),
]:
    _pil_images, _scores = generate_images(text, tokenizer, dalle, vae, top_k=top_k, images_num=images_num, bs=8, top_p=top_p)
    pil_images += _pil_images
    scores += _scores

show(pil_images, 6)
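The `top_k` and `top_p` arguments above control how tokens are sampled. As a rough illustration of the general idea (not ruDALL-E's actual implementation, which may differ in details), top-k keeps only the most probable candidates and top-p then keeps the shortest prefix of them whose cumulative probability reaches the threshold. The helper name `top_k_top_p_filter` is hypothetical.

```python
def top_k_top_p_filter(probs, top_k, top_p):
    """Return indices of tokens that survive top-k, then top-p filtering.

    Hypothetical helper for illustration only; not part of the rudalle package.
    """
    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # top-k: keep at most `top_k` of the most probable tokens.
    ranked = ranked[:top_k]
    # top-p: keep the shortest prefix whose cumulative probability reaches `top_p`.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept


print(top_k_top_p_filter([0.5, 0.3, 0.15, 0.05], top_k=3, top_p=0.75))  # [0, 1]
```

With `top_k=2048` and `top_p=0.995`, as in the example above, the filter is very permissive, so most of the distribution stays available for sampling.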

auto cherry-pick by ruCLIP:

top_images, clip_scores = cherry_pick_by_ruclip(pil_images, text, clip_predictor, count=6)
show(top_images, 3)

super resolution:

sr_images = super_resolution(top_images, realesrgan)
show(sr_images, 3)

text, seed = 'красивая тян из аниме', 6955

Image Prompt

see jupyters/ruDALLE-image-prompts-A100.ipynb

text, seed = 'Храм Василия Блаженного', 42
skyes = [red_sky, sunny_sky, cloudy_sky, night_sky]

Aspect ratio images -->NEW<--

🚀 Contributors 🚀

@bes shared a great idea and an implementation with IDWT for decoding images at a higher quality of 512x512! 😈 💪 Thanks a lot for your constructive advice, much appreciated
@neverix thanks a lot for contributing the inference speed-up
@Igor Pavlov trained the model and prepared the super-resolution code
@oriBetelgeuse thanks a lot for the easy API for generation using image prompts
@Alex Wortega created the first FREE Colab notebook for fine-tuning ruDALL-E Malevich (XL) on the sneakers domain 💪
@Anton Lozhkov integrated the model into Hugging Face Spaces with Gradio, see here

Comments

Smaller / Distilled model?

Will there be a smaller or distilled model release? The problem with inference in Google Colab is speed: 4:32 for one image on a P100, and over 2 hours for 3 images on a K80.

opened by johnpaulbin 10

RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR

I use the default code and get an error after generation reaches 100%. Please help. I use Windows and conda.

◼️ Malevich is 1.3 billion params model from the family GPT3-like, that uses Russian language and text+image multi-modality.
x4 --> ready
tokenizer --> ready
Working with z of shape (1, 256, 32, 32) = 262144 dimensions.
vae --> ready
ruclip --> ready
100%|██████████████████████████████████████████████████████████████████████████████| 1024/1024 [00:46<00:00, 22.14it/s]
Traceback (most recent call last):
  File "gen.py", line 29, in <module>
    _pil_images, _scores = generate_images(text, tokenizer, dalle, vae, top_k=top_k, images_num=images_num, top_p=top_p)
  File "C:\Users\1\anaconda3\lib\site-packages\rudalle\pipelines.py", line 60, in generate_images
    images = vae.decode(codebooks)
  File "C:\Users\1\anaconda3\lib\site-packages\rudalle\vae\model.py", line 38, in decode
    img = self.model.decode(z)
  File "C:\Users\1\anaconda3\lib\site-packages\rudalle\vae\model.py", line 98, in decode
    quant = self.post_quant_conv(quant)
  File "C:\Users\1\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\1\anaconda3\lib\site-packages\torch\nn\modules\conv.py", line 399, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "C:\Users\1\anaconda3\lib\site-packages\torch\nn\modules\conv.py", line 395, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR
You can try to repro this exception using the following code snippet. If that doesn't trigger the error, please include your original repro script when reporting this issue.

import torch
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.benchmark = True
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.allow_tf32 = True
data = torch.randn([3, 256, 32, 32], dtype=torch.float, device='cuda', requires_grad=True).to(memory_format=torch.channels_last)
net = torch.nn.Conv2d(256, 256, kernel_size=[1, 1], padding=[0, 0], stride=[1, 1], dilation=[1, 1], groups=1)
net = net.cuda().float().to(memory_format=torch.channels_last)
out = net(data)
out.backward(torch.randn_like(out))
torch.cuda.synchronize()

ConvolutionParams
    data_type = CUDNN_DATA_FLOAT
    padding = [0, 0, 0]
    stride = [1, 1, 0]
    dilation = [1, 1, 0]
    groups = 1
    deterministic = true
    allow_tf32 = true
input: TensorDescriptor 0000020481F094B0
    type = CUDNN_DATA_FLOAT
    nbDims = 4
    dimA = 3, 256, 32, 32,
    strideA = 262144, 1, 8192, 256,
output: TensorDescriptor 0000020481F09590
    type = CUDNN_DATA_FLOAT
    nbDims = 4
    dimA = 3, 256, 32, 32,
    strideA = 262144, 1, 8192, 256,
weight: FilterDescriptor 000001FFD2E76AF0
    type = CUDNN_DATA_FLOAT
    tensor_format = CUDNN_TENSOR_NHWC
    nbDims = 4
    dimA = 256, 256, 1, 1,
Pointer addresses:
    input: 0000001538C7D000
    output: 000000153B87D000
    weight: 00000014D3BB0000

opened by bitcoin5000 7

Auto cut pictures into separated images

Are there any arguments that will automatically cut up the generated pictures and save them as separate images?
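One way to get separate files, sketched here under the assumption that the generated images are PIL images (anything with a `save(path)` method works). The helper `save_images_separately` and the file naming scheme are hypothetical and not part of rudalle. The v0.0.1rc5 release notes on this page also mention a `save_dir` parameter on `show`, which may already cover this case.

```python
from pathlib import Path


def save_images_separately(images, out_dir, prefix="image"):
    """Save each image to its own PNG file and return the written paths.

    Hypothetical helper: `images` is any iterable of objects exposing
    `save(path)`, such as PIL.Image.Image instances.
    """
    out_path = Path(out_dir)
    # Create the output directory (and parents) if it does not exist yet.
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, image in enumerate(images):
        target = out_path / f"{prefix}_{index:03d}.png"
        image.save(str(target))
        saved.append(target)
    return saved
```

For example, `save_images_separately(pil_images, "generated")` would write `generated/image_000.png`, `generated/image_001.png`, and so on.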

opened by Sidiusz 4

Gradient checkpointing

This patch enables gradient checkpointing for ruDALLE.

It's possible to use up to 3x higher batch sizes in memory-limited environments during training.

Passing gradient_checkpointing to model.forward makes a checkpoint every gradient_checkpointing layers; 6 is a good starting value.
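To make "a checkpoint every N layers" concrete, here is a small sketch of how a layer stack is split into segments. In gradient checkpointing generally, only the activations at segment boundaries are kept during the forward pass, and the activations inside each segment are recomputed during the backward pass. The helper `checkpoint_segments` and the layer count of 24 are illustrative assumptions, not code or values taken from the patch.

```python
def checkpoint_segments(num_layers, every):
    """Split layer indices 0..num_layers-1 into (start, end) segments.

    Hypothetical helper: each segment covers at most `every` consecutive
    layers, and `end` is exclusive.
    """
    if every <= 0:
        raise ValueError("every must be a positive integer")
    return [
        (start, min(start + every, num_layers))
        for start in range(0, num_layers, every)
    ]


# Assumed layer count, for illustration only.
print(checkpoint_segments(24, 6))  # [(0, 6), (6, 12), (12, 18), (18, 24)]
```

A smaller value means more stored boundaries and less recomputation; a larger value saves more memory at the cost of recomputing longer segments.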

opened by neverix 3

Feature/dwt vae

Adds support for VAE decoding with DWT (discrete wavelet transform):

allows restoring 512x512 images

thanks a lot @bes for issue https://github.com/sberbank-ai/ru-dalle/issues/42 with this idea 👍

vae = get_vae(dwt=True)

opened by shonenkov 3

optimize image prompts

This enables caching for image prompts. For some reason, the results change slightly. I tried looking for off-by-one bugs in this, but couldn't find one myself.

opened by neverix 3

Error in the ruDALL-E code published on Kaggle

Running the ruDALL-E code in the Kaggle notebook (as published) in a GPU session ends with an error:

ModuleNotFoundError                       Traceback (most recent call last)
/tmp/ipykernel_29/1914141142.py in <module>
----> 1 from rudalle.pipelines import generate_images, show, super_resolution, cherry_pick_by_clip
      2 from rudalle import get_rudalle_model, get_tokenizer, get_vae, get_realesrgan, get_ruclip
      3 from rudalle.utils import seed_everything

ModuleNotFoundError: No module named 'rudalle'

The error message refers to this code:

!pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html > /dev/null
!pip install rudalle==0.0.1rc1 > /dev/null

opened by XieBaoshi 3

Constantly having to redownload models

Hi, I've noticed that running it on a local Jupyter instance always redownloads the model. Is there a way to avoid this? I don't want to wait for the download to finish every time. Thanks.

opened by JohnnyRacer 2

Problem about the PyTorch vision?

I have looked through the issues but can't find the same problem, so sorry to bother you. GPU: my Python environment: pytorch=1.8.0 & torchvision=0.9.0, cudatoolkit=11.3.1 & cudnn=8.2.1. I have tried rudalle=0.3.0, just following the readme.md, and 0.0.1rc5 with RTX3090.ipynb, but I only got the following error!

So I want to know whether there is any problem in my environment. Waiting for your reply!

opened by Wang-Xiaodong1899 2

image_prompts.py – borders crop not working properly

From the official documentation:

borders (dict[str] | int): borders that we croped from pil_image example: {'up': 4, 'right': 0, 'left': 0, 'down': 0} (1 int eq 8 pixels)
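As a quick illustration of the units in that docstring ("1 int eq 8 pixels"), the sketch below converts a borders dict into pixel sizes. The helper `borders_to_pixels` is hypothetical and only handles the dict form; it is not part of rudalle.

```python
BORDER_UNIT_PX = 8  # from the docstring above: "1 int eq 8 pixels"


def borders_to_pixels(borders):
    """Convert a borders dict in 8-pixel units into pixel sizes.

    Hypothetical helper for illustration; only the dict form is handled.
    """
    return {side: units * BORDER_UNIT_PX for side, units in borders.items()}


print(borders_to_pixels({'up': 4, 'right': 0, 'left': 0, 'down': 0}))
# {'up': 32, 'right': 0, 'left': 0, 'down': 0}
```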

The "up" crop works just fine. But if I pass any crop argument other than "up", I get an AssertionError.

Thank you for a fantastic algo ✨

opened by DenisSergeevitch 2

generate_images does not run

I'm trying to run it with device = 'cpu', using the very first example from the README.

It crashes with the following traceback. What am I doing wrong?

◼️ Malevich is 1.3 billion params model from the family GPT3-like, that uses Russian language and text+image multi-modality.
x4 --> ready
tokenizer --> ready
Working with z of shape (1, 256, 32, 32) = 262144 dimensions.
vae --> ready
ruclip --> ready
  0%|          | 0/1024 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "%projectfolder%\test\venv\lib\site-packages\rudalle\pipelines.py", line 46, in generate_images
    logits, has_cache = dalle(out, attention_mask,
  File "%projectfolder%\test\venv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "%projectfolder%\test\venv\lib\site-packages\rudalle\dalle\fp16.py", line 51, in forward
    return fp16_to_fp32(self.module(*(fp32_to_fp16(inputs)), **kwargs))
  File "%projectfolder%\test\venv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "%projectfolder%\test\venv\lib\site-packages\rudalle\dalle\model.py", line 150, in forward
    transformer_output, present_has_cache = self.transformer(
  File "%projectfolder%\test\venv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "%projectfolder%\test\venv\lib\site-packages\rudalle\dalle\transformer.py", line 76, in forward
    hidden_states, present_has_cache = layer(hidden_states, mask, has_cache=has_cache, use_cache=use_cache)
  File "%projectfolder%\test\venv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "%projectfolder%\test\venv\lib\site-packages\rudalle\dalle\transformer.py", line 146, in forward
    layernorm_output = self.input_layernorm(hidden_states)
  File "%projectfolder%\test\venv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "%projectfolder%\test\venv\lib\site-packages\torch\nn\modules\normalization.py", line 173, in forward
    return F.layer_norm(
  File "%projectfolder%\test\venv\lib\site-packages\torch\nn\functional.py", line 2346, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
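The traceback goes through `rudalle/dalle/fp16.py` and ends in a 'Half' kernel error, which suggests the model was loaded in half precision on a device whose PyTorch build has no half-precision LayerNorm. A likely workaround, sketched here as an assumption rather than a confirmed fix, is to request `fp16` only on CUDA devices. The helper `should_use_fp16` is hypothetical.

```python
def should_use_fp16(device):
    """Return True only for CUDA devices, where half precision is expected to work.

    Hypothetical helper; accepts strings such as 'cpu', 'cuda' or 'cuda:0'.
    """
    return str(device).startswith("cuda")


device = "cpu"
fp16 = should_use_fp16(device)
print(device, fp16)  # cpu False

# Assumed usage, mirroring the README call with only the fp16 argument changed:
# dalle = get_rudalle_model('Malevich', pretrained=True, fp16=fp16, device=device)
```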

opened by Xoma163 2

Add optional resume_download argument to help download large models

It's a pain to download large models over an unstable network connection. For instance, I've started seeing this type of error (see screenshot). It breaks the download, and you have to start again from zero bytes.

However, the cached_download(..) function in huggingface_hub has a resume_download argument that can be used to restart a download without losing progress. See this line. So I think it would be helpful to add it as an optional argument (defaulting to False) to get_rudalle_model(..), so users can turn it on if they have an unstable internet connection.
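For context, resumable downloads generally work by asking the server only for the bytes that are not yet on disk, using an HTTP Range header. The sketch below shows that general mechanism; it is not huggingface_hub's code, the helper `resume_headers` is hypothetical, and it assumes the server supports range requests.

```python
import os


def resume_headers(partial_path):
    """Build HTTP headers that request only the bytes missing from a partial file.

    Hypothetical helper: returns an empty dict when nothing has been
    downloaded yet, so the caller fetches the whole file.
    """
    already = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    return {"Range": f"bytes={already}-"} if already > 0 else {}
```

The response body would then be appended to the partial file instead of overwriting it.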

opened by Rexhaif 0

kandinsky model not available

Nice to see the update! There is an auth error with the kandinsky model. Not sure if this is intended, as there seems to be a token requirement. Could you clarify?

opened by xavierleung 0

RuntimeError: nvrtc: error: failed to open libnvrtc-builtins.so.11.1.

What might be causing this?

RuntimeError: nvrtc: error: failed to open libnvrtc-builtins.so.11.1. Make sure that libnvrtc-builtins.so.11.1 is installed correctly. nvrtc compilation failed:

#define NAN __int_as_float(0x7fffffff)
#define POS_INFINITY __int_as_float(0x7f800000)
#define NEG_INFINITY __int_as_float(0xff800000)


template<typename T>
__device__ T maximum(T a, T b) {
  return isnan(a) ? a : (a > b ? a : b);
}

template<typename T>
__device__ T minimum(T a, T b) {
  return isnan(a) ? a : (a < b ? a : b);
}


#define __HALF_TO_US(var) *(reinterpret_cast<unsigned short *>(&(var)))
#define __HALF_TO_CUS(var) *(reinterpret_cast<const unsigned short *>(&(var)))
#if defined(__cplusplus)
  struct __align__(2) __half {
    __host__ __device__ __half() { }

  protected:
    unsigned short __x;
  };

  /* All intrinsic functions are only available to nvcc compilers */
  #if defined(__CUDACC__)
    /* Definitions of intrinsics */
    __device__ __half __float2half(const float f) {
      __half val;
      asm("{  cvt.rn.f16.f32 %0, %1;}\n" : "=h"(__HALF_TO_US(val)) : "f"(f));
      return val;
    }

    __device__ float __half2float(const __half h) {
      float val;
      asm("{  cvt.f32.f16 %0, %1;}\n" : "=f"(val) : "h"(__HALF_TO_CUS(h)));
      return val;
    }

  #endif /* defined(__CUDACC__) */
#endif /* defined(__cplusplus) */
#undef __HALF_TO_US
#undef __HALF_TO_CUS

typedef __half half;

extern "C" __global__
void fused_mul_mul_mul_mu_5065363705190979294(half* t0, half* aten_mul) {
{
  float t0_1 = __half2float(t0[(8192 * (((512 * blockIdx.x + threadIdx.x) / 8192) % 128) + ((512 * blockIdx.x + threadIdx.x) / 1048576) * 1048576) + (512 * blockIdx.x + threadIdx.x) % 8192]);
  aten_mul[(8192 * (((512 * blockIdx.x + threadIdx.x) / 8192) % 128) + ((512 * blockIdx.x + threadIdx.x) / 1048576) * 1048576) + (512 * blockIdx.x + threadIdx.x) % 8192] = __float2half((t0_1 * 0.5f) * ((tanhf((t0_1 * 0.7978845834732056f) * ((t0_1 * 0.04471499845385551f) * t0_1 + 1.f))) + 1.f));
}
}

opened by c0ffymachyne 1

Bad syntax in Colab

In https://colab.research.google.com/drive/1wGE-046et27oHvNlBNPH07qrEQNE04PQ?usp=sharing#scrollTo=GdOYJvwZSB-D

the value of the text parameter is missing its double quotes ("):

text = Что бы ни # @param

Should be:

text = "Что бы ни" # @param

Thanks!

opened by Jakeukalane 1

Releases(v1.1.0)

v1.1.0(Jun 22, 2022)
add kandinsky

v0.0.1rc7(Nov 9, 2021)
added parameters for DWT decoding of images at a higher quality of 512x512, by @bes, source idea here

v0.0.1rc6(Nov 7, 2021)
adapted the cache for image-prompt generation

fixed bugs with left/right image prompts

added support for crop_first with left/right/down in image-prompt generation

v0.0.1rc5(Nov 5, 2021)
added a save_dir parameter to the show method for saving pictures separately

added caching in image prompts with the "up" setting

added a fine-tuning notebook on sneakers and an inference notebook with eng --> rus translation

v0.0.1rc4(Nov 3, 2021)
@neverix thanks a lot for contributing the inference speed-up

added image prompts

fix some bug with to device

v0.0.1-rc1(Nov 2, 2021)


Owner

AI Forever

Creating ML for the future. AI projects you already know. We are a non-profit organization with members from all over the world.

GitHub Repository https://rudalle.ru/
