Implementation of the Hybrid Perception Block and Dual-Pruned Self-Attention block from the ITTR paper for Image to Image Translation using Transformers

Last update: Dec 23, 2022

Overview

ITTR - Pytorch

Implementation of the Hybrid Perception Block (HPB) and Dual-Pruned Self-Attention (DPSA) block from the ITTR paper for Image to Image Translation using Transformers.

Install

$ pip install ITTR-pytorch

Usage

They had 9 blocks of Hybrid Perception Block (HPB) in the paper

import torch
from ITTR_pytorch import HPB

block = HPB(
    dim = 512,              # dimension
    dim_head = 32,          # dimension per attention head
    heads = 8,              # number of attention heads
    attn_height_top_k = 16, # number of top indices to select along height, for the attention pruning
    attn_width_top_k = 16,  # number of top indices to select along width, for the attention pruning
    attn_dropout = 0.,      # attn dropout
    ff_mult = 4,            # expansion factor of feedforward
    ff_dropout = 0.         # feedforward dropout
)

fmap = torch.randn(1, 512, 32, 32)

out = block(fmap) # (1, 512, 32, 32)

You can also use the dual-pruned self-attention as so

import torch
from ITTR_pytorch import DPSA

attn = DPSA(
    dim = 512,         # dimension
    dim_head = 32,     # dimension per attention head
    heads = 8,         # number of attention heads
    height_top_k = 48, # number of top indices to select along height, for the attention pruning
    width_top_k = 48,  # number of top indices to select along width, for the attention pruning
    dropout = 0.       # attn dropout
)

fmap = torch.randn(1, 512, 32, 32)

out = attn(fmap) # (1, 512, 32, 32)

Citations

@inproceedings{Zheng2022ITTRUI,
  title   = {ITTR: Unpaired Image-to-Image Translation with Transformers},
  author  = {Wanfeng Zheng and Qiang Li and Guoxin Zhang and Pengfei Wan and Zhongyuan Wang},
  year    = {2022}
}

Comments

The image size and channel number of each layer of network are discussed

Hello！Thank you very much for publishing the core code of ITTR. But I would like to know the number of channels and image size of each layer, thank you very much！

opened by SHNsunhenan 0

Dual languaged (rus+eng) tool for packing and unpacking archives of Silky Engine.

SilkyArcTool English Dual languaged (rus+eng) GUI tool for packing and unpacking archives of Silky Engine. It is not the same arc as used in Ai6WIN. I

5 Sep 15, 2022

A PyTorch implementation of paper "Learning Shared Semantic Space for Speech-to-Text Translation", ACL (Findings) 2021

Chimera: Learning Shared Semantic Space for Speech-to-Text Translation This is a Pytorch implementation for the "Chimera" paper Learning Shared Semant

43 Dec 28, 2022

Repository for Graph2Pix: A Graph-Based Image to Image Translation Framework

Graph2Pix: A Graph-Based Image to Image Translation Framework Installation Install the dependencies in env.yml $ conda env create -f env.yml $ conda a

18 Nov 17, 2022

This python module is an easy-to-use port of the text normalization used in the paper "Not low-resource anymore: Aligner ensembling, batch filtering, and new datasets for Bengali-English machine translation". It is intended to be used for normalizing / cleaning Bengali and English text.

normalizer This python module is an easy-to-use port of the text normalization used in the paper "Not low-resource anymore: Aligner ensembling, batch

23 Nov 30, 2022

Implementaion of our ACL 2022 paper Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation

Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation This is the implementaion of our paper: Bridging the

20 Dec 12, 2022

A Paper List for Speech Translation

Keyword: Speech Translation, Spoken Language Processing, Natural Language Processing

138 Dec 24, 2022

The repository for the paper: Multilingual Translation via Grafting Pre-trained Language Models

Graformer The repository for the paper: Multilingual Translation via Grafting Pre-trained Language Models Graformer (also named BridgeTransformer in t

22 Dec 14, 2022

Summarization, translation, sentiment-analysis, text-generation and more at blazing speed using a T5 version implemented in ONNX.

Summarization, translation, Q&A, text generation and more at blazing speed using a T5 version implemented in ONNX. This package is still in alpha stag

211 Dec 28, 2022

Implementation of the Hybrid Perception Block and Dual-Pruned Self-Attention block from the ITTR paper for Image to Image Translation using Transformers

Related tags

Overview

ITTR - Pytorch

Install

Usage

Citations

You might also like...

Dual languaged (rus+eng) tool for packing and unpacking archives of Silky Engine.

A PyTorch implementation of paper "Learning Shared Semantic Space for Speech-to-Text Translation", ACL (Findings) 2021

A PyTorch implementation of paper "Learning Shared Semantic Space for Speech-to-Text Translation", ACL (Findings) 2021

Repository for Graph2Pix: A Graph-Based Image to Image Translation Framework

This python module is an easy-to-use port of the text normalization used in the paper "Not low-resource anymore: Aligner ensembling, batch filtering, and new datasets for Bengali-English machine translation". It is intended to be used for normalizing / cleaning Bengali and English text.

Implementaion of our ACL 2022 paper Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation

A Paper List for Speech Translation

The repository for the paper: Multilingual Translation via Grafting Pre-trained Language Models

Summarization, translation, sentiment-analysis, text-generation and more at blazing speed using a T5 version implemented in ONNX.

Comments

The image size and channel number of each layer of network are discussed

Releases(0.0.4)

0.0.4(Apr 1, 2022)

0.0.3(Apr 1, 2022)

0.0.2(Apr 1, 2022)

0.0.1(Apr 1, 2022)

Owner

Phil Wang

Precision Medicine Knowledge Graph (PrimeKG)

Tensorflow implementation of paper: Learning to Diagnose with LSTM Recurrent Neural Networks.

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

MiCECo - Misskey Custom Emoji Counter

KLUE-baseline contains the baseline code for the Korean Language Understanding Evaluation (KLUE) benchmark.

PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".

[NeurIPS 2021] Code for Learning Signal-Agnostic Manifolds of Neural Fields

Code for Findings of ACL 2022 Paper "Sentiment Word Aware Multimodal Refinement for Multimodal Sentiment Analysis with ASR Errors"

Fake news detector filters - Smart filter project allow to classify the quality of information and web pages

nlabel is a library for generating, storing and retrieving tagging information and embedding vectors from various nlp libraries through a unified interface.

Stand-alone language identification system

Implementation of the Hybrid Perception Block and Dual-Pruned Self-Attention block from the ITTR paper for Image to Image Translation using Transformers

PyTorch Implementation of "Non-Autoregressive Neural Machine Translation"

Arabic speech recognition, classification and text-to-speech.

Rhythm-Finder is a unsupervised ML driven python powered web-application that can find the songs that suits you.

Nateve compiler developed with python.

Honor's thesis project analyzing whether the GPT-2 model can more effectively generate free-verse or structured poetry.

PyTorch impelementations of BERT-based Spelling Error Correction Models.

BERT score for text generation

FB ID CLONER WUTHOT CHECKPOINT, FACEBOOK ID CLONE FROM FILE