RuCLIP-SB (Russian Contrastive Language–Image Pretraining SWIN-BERT) is a multimodal model for computing image-text similarity and for ranking captions and pictures. Unlike other versions of the model, it uses BERT as the text encoder and a Swin Transformer as the image encoder.

Last update: Apr 13, 2022

Overview

ruCLIP-SB


Our model achieves 37.02% zero-shot accuracy on CIFAR-100 and has 39,543,907 parameters.

Download URL: ruCLIP-SB

Example usage:
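As a rough illustration of what "image and text similarity" and "ranking captions" mean for a CLIP-style model, the sketch below scores a few caption embeddings against one image embedding with cosine similarity and turns the scores into probabilities with a softmax. It is not the ruCLIP-SB API: the embedding vectors are made up, and `cosine_similarity`, `softmax`, `rank_captions` and the `temperature` value of 0.07 are all assumptions chosen for the example. In practice the vectors would come from the Swin image encoder and the BERT text encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_captions(image_embedding, caption_embeddings, temperature=0.07):
    """Return (caption_index, probability) pairs, best match first.

    `temperature` is an assumed scaling value, not one taken from ruCLIP-SB.
    """
    sims = [cosine_similarity(image_embedding, c) for c in caption_embeddings]
    probs = softmax([s / temperature for s in sims])
    return sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)


# Toy embeddings, invented purely for illustration.
image = [0.9, 0.1, 0.0]
captions = [
    [0.0, 1.0, 0.0],  # caption 0: weakly related
    [1.0, 0.0, 0.0],  # caption 1: points almost the same way as the image
    [0.0, 0.0, 1.0],  # caption 2: orthogonal to the image
]

ranking = rank_captions(image, captions)
for index, probability in ranking:
    print(f"caption {index}: {probability:.4f}")
```

Ranking pictures for a given caption works the same way with the roles swapped: one text embedding is compared against several image embeddings.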

Finetuning:
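The page does not state which objective is used for fine-tuning. CLIP-style models are usually trained with a symmetric contrastive (cross-entropy) loss over a batch of matching image-text pairs, so the sketch below shows that loss in plain Python under that assumption. The function names, the toy similarity matrices and the `temperature` of 0.07 are illustrative and are not taken from the ruCLIP-SB code.

```python
import math


def log_softmax(row):
    """Log-softmax of a list of floats, computed stably."""
    peak = max(row)
    log_total = peak + math.log(sum(math.exp(v - peak) for v in row))
    return [v - log_total for v in row]


def clip_contrastive_loss(similarity, temperature=0.07):
    """Symmetric contrastive loss for a square similarity matrix.

    similarity[i][j] is the cosine similarity of image i and text j;
    the matching pairs are assumed to lie on the diagonal.
    """
    n = len(similarity)
    logits = [[v / temperature for v in row] for row in similarity]

    # Image-to-text direction: each row should pick its own column.
    loss_i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n

    # Text-to-image direction: each column should pick its own row.
    columns = [list(col) for col in zip(*logits)]
    loss_t2i = -sum(log_softmax(columns[j])[j] for j in range(n)) / n

    return (loss_i2t + loss_t2i) / 2


# Toy batches of two pairs: one well aligned, one with the pairs swapped.
aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]

print(clip_contrastive_loss(aligned))
print(clip_contrastive_loss(swapped))
```

The loss is close to zero when each image is most similar to its own caption and grows when the pairs are mismatched, which is the signal a fine-tuning loop would minimise.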

ONNX example:

We trained the model on 2 million images.

Thanks to Sber AI for help.

Owner

Shahmatov Arseniy

GitHub Repository

