LightSeq is a high-performance training and inference library for sequence processing and generation, implemented in CUDA

Overview

LightSeq: A High Performance Library for Sequence Processing and Generation

[LightSeq logo]

[2021/06/18] 🎉 🎉 🎉 LightSeq now supports fast training for models in the Transformer family; please check out here for details.


LightSeq is a high-performance training and inference library for sequence processing and generation, implemented in CUDA. It enables highly efficient computation of modern NLP models such as BERT, GPT, and Transformer. It is therefore well suited to machine translation, text generation, dialogue, language modeling, sentiment analysis, and other tasks involving sequence data.

The library is built on top of the official CUDA libraries (cuBLAS, Thrust, CUB) and custom kernel functions that are specially fused and optimized for the Transformer model family. In addition to model components, the inference library also provides easy-to-deploy model management and a serving backend based on TensorRT Inference Server. With LightSeq, one can easily develop a modified Transformer architecture with little additional code.
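As an illustration of how little code is needed, the sketch below drops a fused LightSeq encoder layer into a PyTorch model. It follows the pattern of the training examples in this repository; the exact parameter names accepted by get_config may differ between LightSeq versions, so treat it as a sketch rather than a reference.

import torch
from lightseq.training import LSTransformerEncoderLayer

# Build a fused Transformer encoder layer (parameter names follow the
# repository's training examples and may vary across versions).
config = LSTransformerEncoderLayer.get_config(
    max_batch_tokens=4096,
    max_seq_len=256,
    hidden_size=1024,
    intermediate_size=4096,
    nhead=16,
    attn_prob_dropout_ratio=0.1,
    activation_dropout_ratio=0.1,
    hidden_dropout_ratio=0.1,
    pre_layer_norm=True,
    fp16=True,
    local_rank=0,
)
layer = LSTransformerEncoderLayer(config).cuda().half()

x = torch.randn(8, 256, 1024, dtype=torch.half, device="cuda")
padding_mask = torch.zeros(8, 256, dtype=torch.half, device="cuda")  # 1.0 marks padded positions
out = layer(x, padding_mask)  # same shape as x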

Features

>>> Training

The following is a support matrix of the LightSeq training library compared with DeepSpeed.

[Training feature support matrix]

>>> Inference

The following is a support matrix of the LightSeq inference library compared with TurboTransformers and FasterTransformer.

[Inference support matrix]

Performance

>>> Training

Here we present the experimental results on the WMT14 English-to-German translation task based on Transformer-big models. We train Transformer models of different sizes on eight NVIDIA Tesla V100 / NVIDIA Ampere A100 GPUs with data parallelism and fp16 mixed precision. Fairseq with Apex is chosen as our baseline.

We compute the speedup at different batch sizes using WPS (real words per second) as the metric.
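In other words, for each batch size the reported speedup is simply the ratio of throughputs; a trivial illustration with made-up numbers:

def speedup(lightseq_wps: float, baseline_wps: float) -> float:
    # WPS counts real (non-padding) words processed per second.
    return lightseq_wps / baseline_wps

# Hypothetical throughputs, for illustration only.
print(speedup(lightseq_wps=150_000, baseline_wps=100_000))  # 1.5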

More results are available here.

>>> Inference

Here we present the experimental results on neural machine translation based on Transformer-base models using beam search. We choose TensorFlow and FasterTransformer for comparison; the implementation from tensor2tensor is used as the TensorFlow baseline.

More results are available here.

Quick Start

Fast training from Fairseq

You can experience lightning-fast training by running the following commands. First, install the requirements:

pip install lightseq fairseq sacremoses

Then you can train a translation model on the WMT14 en2de dataset by running the following script:

sh examples/training/fairseq/ls_fairseq_wmt14en2de.sh

To compare LightSeq with Fairseq, remove the arguments with the ls_ prefix to use the original Fairseq implementation.

More usage examples are available here.

Fast inference from HuggingFace bart

We provide an end-to-end bart-base example to show how fast LightSeq is compared to HuggingFace. First, install the requirements:

pip install torch tensorflow transformers lightseq
cd examples/inference/python

Then you can check the performance by simply running the following commands. hf_bart_export.py converts the PyTorch weights to LightSeq's protobuf format.

python hf_bart_export.py
python ls_bart.py

LightSeq installation from PyPI only supports Python 3.6 to 3.8 on Linux for now. Consider building from source if you have other environments.
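For reference, ls_bart.py essentially builds a LightSeq model from the exported file and calls infer on tokenized inputs. The sketch below is a minimal, illustrative version of that flow (it assumes the export step produced lightseq_bart_base.hdf5; see ls_bart.py for the authoritative code, including how the outputs are decoded):

import lightseq.inference as lsi
from transformers import BartTokenizer

# Load the exported LightSeq model; the second argument is the max batch size,
# mirroring the call in ls_bart.py.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = lsi.Transformer("lightseq_bart_base.hdf5", 128)

inputs = tokenizer(["I love that girl, but <mask> does not <mask> me."],
                   return_tensors="np", padding=True)
outputs = model.infer(inputs["input_ids"])
print(outputs)  # generated token ids (and scores); ls_bart.py shows how to decode them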

More usage examples are available here.

Cite Us

If you use LightSeq in your research, please cite the following paper.

@InProceedings{wang2021lightseq,
    title = "{L}ight{S}eq: A High Performance Inference Library for Transformers",
    author = "Wang, Xiaohui and Xiong, Ying and Wei, Yang and Wang, Mingxuan and Li, Lei",
    booktitle = "Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers (NAACL-HLT)",
    month = jun,
    year = "2021",
    publisher = "Association for Computational Linguistics",
    pages = "113--120",
}

Contact

For any questions or suggestions, please feel free to contact us at [email protected], [email protected], [email protected], [email protected], [email protected], [email protected].

Issues
  • RuntimeError: Parse weights from [lightseq_bart_base.hdf5] failed

    When I tried to run the example case like this

    python hf_bart_export.py
    python ls_bart.py
    

    It has some errors

    initializing bart tokenizer...
    creating lightseq model...
    Traceback (most recent call last):
      File "ls_bart.py", line 102, in <module>
        main()
      File "ls_bart.py", line 69, in main
        ls_model = lsi.Transformer("lightseq_bart_base.hdf5", 128)
    RuntimeError: Parse weights from [lightseq_bart_base.hdf5] failed.
    

    Alright, I tried to run another case, huggingface gpt2 in examples:

    python hf_gpt2_export.py
    python ls_gpt.py
    

    It had some errors again:

    initializing gpt tokenizer...
    Downloading: 100%|โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ| 1.04M/1.04M [00:00<00:00, 1.81MB/s]
    Downloading: 100%|โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ| 456k/456k [00:00<00:00, 1.36MB/s]
    Downloading: 100%|โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ| 1.36M/1.36M [00:00<00:00, 2.29MB/s]
    lightseq tokenizer pad token id: 50257
    huggingface tokenizer pad token id: 50256
    creating lightseq model...
    Traceback (most recent call last):
      File "ls_gpt.py", line 119, in <module>
        main()
      File "ls_gpt.py", line 79, in main
        ls_model = lsi.Gpt("lightseq_gpt2_base.hdf5", max_batch_size=16)
    TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
        1. lightseq.inference.Gpt(weight_path: str, max_batch_size: int, max_step: int)
    
    Invoked with: 'lightseq_gpt2_base.hdf5'; kwargs: max_batch_size=16
    

    I don't know how to fix them. Can you give me some advice? Thank you very much.

    opened by juha0 21
  • lightseq inference abnormal using ls_fs_transformer_export.py exported model

    Hi, I used python export/ls_fs_transformer_export.py to export a LightSeq-trained NMT model to do inference, but I found the result is quite abnormal. These are some details output by the test part of ls_fs_transformer_export.py.

    generator config
    beam size: 4
    extra decode length(max decode length - src input length): 50
    length penalty: 0.6
    diverse lambda: 0
    sampling method: beam_search
    topk: 1
    topp: 0.75
    Allocated 882MB GPU buffer for transformer
    decoder buffer init start
    decoder buffer init succeed
    pb results: (array([[[ 4, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 6]]], dtype=int32), array([[0.]], dtype=float32))
    hdf5 results: (array([[[ 4, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 164, 6]]], dtype=int32), array([[0.]], dtype=float32))

    I also tested more examples, and it continued to generate some repeated logits, and when I decoded the array with my tgt_dict, it generated something like this:

    thesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesamesame.

    I used fairseq 0.10.2 and lightseq 2.1.4, and the lightseq-generate result seems normal. I think something may have gone wrong in the export procedure. Looking forward to your reply.

    opened by dearchill 18
  • fix pos embedding index bug

    Fixed the implementation of position embedding

    • the size of the position matrix is determined by the max_positions parameter
    • all padding tokens are ignored when calculating token positions
    • position indices begin from padding_idx + 1, consistent with the fairseq implementation (see the sketch below)
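
    For reference, the padding-aware indexing described above matches fairseq's utils.make_positions; this is a minimal standalone sketch of the reference semantics, not LightSeq's CUDA kernel:

    import torch

    def make_positions(tokens: torch.Tensor, padding_idx: int) -> torch.Tensor:
        # Non-pad tokens get positions padding_idx + 1, padding_idx + 2, ...
        # while pad tokens keep padding_idx, matching the fairseq behavior.
        mask = tokens.ne(padding_idx).long()
        return torch.cumsum(mask, dim=1) * mask + padding_idx

    # Example with padding_idx = 1: pads stay at 1, real tokens start at 2.
    print(make_positions(torch.tensor([[5, 6, 7, 1, 1]]), padding_idx=1))
    # tensor([[2, 3, 4, 1, 1]])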
    opened by nomadlx 13
  • Gpt exceeds maximum protobuf size of 2GB: 3096122166

    When I use lightseq (2.0) to export gpt2-large, it raises an error: ValueError: Message Gpt exceeds maximum protobuf size of 2GB: 3096122166

    hf_gpt2_export.py is as follows:

    
    if __name__ == "__main__":
        output_lightseq_model_name = "lightseq_gpt2_large.pb"
        input_huggingface_gpt_model = "gpt2-large"
        head_number = 36
        # generation_method should be "topk" or "topp"
        generation_method = "topk"
        topk = 1
        topp = 0.75
        # default eos_id from https://huggingface.co/transformers/model_doc/gpt2.html#gpt2lmheadmodel
        eos_id = 50256
        pad_id = 50257
        extract_gpt_weights(
            output_lightseq_model_name,
            input_huggingface_gpt_model,
            head_num=head_number,  # layer number
            generation_method=generation_method,
            topk=topk,
            topp=topp,
            eos_id=eos_id,
            pad_id=pad_id,
        )
    
    
    ['transformer.h.34.mlp.c_proj.bias'] -> ffn_second_bias, shape: (1280,), convert finished.
    ['transformer.h.35.ln_1.weight'] -> multihead_norm_scale, shape: (1280,), convert finished.
    ['transformer.h.35.ln_1.bias'] -> multihead_norm_bias, shape: (1280,), convert finished.
    ['transformer.h.35.attn.c_attn.weight'] -> multihead_project_kernel_qkv, shape: (1280, 3840), convert finished.
    ['transformer.h.35.attn.c_attn.bias'] -> multihead_project_bias_qkv, shape: (3840,), convert finished.
    ['transformer.h.35.attn.c_proj.weight'] -> multihead_project_kernel_output, shape: (1280, 1280), convert finished.
    ['transformer.h.35.attn.c_proj.bias'] -> multihead_project_bias_output, shape: (1280,), convert finished.
    ['transformer.h.35.ln_2.weight'] -> ffn_norm_scale, shape: (1280,), convert finished.
    ['transformer.h.35.ln_2.bias'] -> ffn_norm_bias, shape: (1280,), convert finished.
    ['transformer.h.35.mlp.c_fc.weight'] -> ffn_first_kernel, shape: (1280, 5120), convert finished.
    ['transformer.h.35.mlp.c_fc.bias'] -> ffn_first_bias, shape: (5120,), convert finished.
    ['transformer.h.35.mlp.c_proj.weight'] -> ffn_second_kernel, shape: (5120, 1280), convert finished.
    ['transformer.h.35.mlp.c_proj.bias'] -> ffn_second_bias, shape: (1280,), convert finished.
    ['transformer.ln_f.weight'] -> norm_scale, shape: (1280,), convert finished.
    ['transformer.ln_f.bias'] -> norm_bias, shape: (1280,), convert finished.
    ['transformer.wte.weight'] -> token_embedding, shape: (50257, 1280), convert finished.
    ['transformer.wpe.weight'] -> position_embedding, shape: (1024, 1280), convert finished.
    Wrting to lightseq_gpt2_large.pb
    Traceback (most recent call last):
      File "hf_gpt2_export.py", line 127, in <module>
        pad_id=pad_id,
      File "hf_gpt2_export.py", line 100, in extract_gpt_weights
        fout.write(gpt.SerializeToString())
    ValueError: Message Gpt exceeds maximum protobuf size of 2GB: 3096122166
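
    For context, Protocol Buffers cannot serialize a single message larger than 2 GiB, which is why gpt2-large fails here; large checkpoints need the HDF5 export path used elsewhere in the examples (e.g. lightseq_bart_base.hdf5). A hypothetical guard around the existing write call, assuming gpt is the populated protobuf message:

    # Sketch only: protobuf caps one message at 2 GiB, so very large models
    # should be exported to HDF5 instead of using SerializeToString().
    MAX_PB_BYTES = 2 * 1024**3
    if gpt.ByteSize() >= MAX_PB_BYTES:
        raise ValueError("model too large for protobuf export; use the HDF5 export path")
    with open(output_lightseq_model_name, "wb") as fout:
        fout.write(gpt.SerializeToString())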
    
    opened by zmingshi 8
  • Questions about beam search

    Hi guys,

    Two questions related to beam search confuse me, and I am looking forward to a reply 😊.

    1. Is your beam search the same as T2T's?
    2. Does length_penalty == 1.0 mean no length penalty?

    Thx
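
    For context, T2T-style beam search typically uses the GNMT length penalty, under which alpha = 0 (rather than 1.0) disables length normalization; whether LightSeq follows exactly this convention is what the question is asking. A minimal sketch of that formula:

    def gnmt_length_penalty(length: int, alpha: float) -> float:
        # GNMT/T2T convention: beam scores are divided by ((5 + length) / 6) ** alpha,
        # so alpha = 0 means no penalty, while alpha = 1.0 still normalizes by length.
        return ((5.0 + length) / 6.0) ** alpha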

    opened by gongel 6
  • Can you provide a docker file that can test training and inference code the lightseq?

    I tried to set up LightSeq in Docker (4x RTX 2080 Ti or 2x A100) but failed even after 8 hours of trying.

    Therefore, please upload a Dockerfile or image that can be used to test the LightSeq system.

    (I tested with the images nvcr.io/nvidia/pytorch:21.08, 20.12, 20.10, taka23/lightseq .. etc. but didn't succeed.)

    opened by pdh930105 6
  • Support for VIT-small (hidden_dim=384)

    Hello, thank you for your contribution. I want to replace the encoders in ViT-small with LSHFTransformerEncoderLayer. For each encoder, num_attention_heads = 6 and hidden_dim = 384. However, there is an error saying that hidden_dim must be an integer multiple of 256. Why does LSHFTransformerEncoderLayer have this restriction? Is there any solution for using LSHFTransformerEncoderLayer in ViT-small? Correct me if I am wrong. Thanks!

    opened by woskii 6
  • Lightseq model inference for fairseq task after training

    Hi, I could not find any details about LightSeq model inference for a fairseq task after training; did I miss something? I mean, after training the model arch is ls_transformer, so I can't use the native fairseq-generate command for inference, and I don't find anything like lightseq-generate. The inference examples I find are for huggingface models such as bart and gpt2, and no documents are provided on inference for fairseq models after training. Could someone tell me how to do this?

    opened by dearchill 6
  • Example/Support of converting Fairseq Model to run in LightSeq

    I am curious about trying LightSeq to speed up inference for a vanilla Transformer encoder-decoder (Vaswani 17) model. My original model was trained with Fairseq (or OpenNMT-py). Is there any example or resource you can point me to that would help me convert my Transformer model to a format compatible with running in LightSeq?

    opened by pttzty 6
  • How to convert model like BertForTokenClassification by huggingface transformer to pb format

    I only saw how to convert BART and gpt2.

    After searching for some pkl-to-onnx-to-pd conversion code, they all have only one input, but for tasks that need an attention mask or token_seq_id, I don't know how to convert.

    opened by cdhx 5
  • ModuleNotFoundError: No module named 'lightseq.inference'; 'lightseq' is not a package

    I could use the "pip install lightseq" command to install the lightseq package, and I could import lightseq in a Python file. But I encountered "ModuleNotFoundError: No module named 'lightseq.inference'; 'lightseq' is not a package" when I tried to run "import lightseq.inference as lsi". Did I miss some steps? Thanks.

    opened by gm3g11 5
  • [CUDA][ERROR]: misaligned address

    Hi, I have a question. When a large amount of text is sent to the model, it starts to run properly, but after running for a period of time the program reports an error: [CUDA][ERROR] /tmp/build-via-sdist-uagdfpbf/lightseq-2.2.1/lightseq/inference/pywrapper/gpt.cc.cu(160): misaligned address.

    opened by fc20567 5
  • Running examples meet error, with cuda 11.6

    /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=lightseq_layers -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1013" -I/opt/conda/lib/python3.8/site-packages/lightseq/training/csrc/kernels/includes -I/opt/conda/lib/python3.8/site-packages/lightseq/training/csrc/ops/includes -I/opt/conda/lib/python3.8/site-packages/3rdparty/cub -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -DTHRUST_IGNORE_CUB_VERSION_CHECK -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -c /opt/conda/lib/python3.8/site-packages/lightseq/training/csrc/kernels/cuda_util.cu -o cuda_util.cuda.o

    /usr/local/cuda/include/cub/detail/device_synchronize.cuh(26): error: expected a ";"

    /usr/local/cuda/include/cub/detail/device_synchronize.cuh(33): error: this pragma must immediately precede a declaration

    /usr/local/cuda/include/cub/detail/device_synchronize.cuh(65): error: expected a declaration

    /usr/local/cuda/include/thrust/system/cuda/detail/util.h(61): error: expected a declaration

    /usr/local/cuda/include/thrust/system/cuda/detail/util.h(206): error: expected a ";"

    /usr/local/cuda/include/thrust/system/cuda/detail/util.h(208): error: expected a declaration

    /usr/local/cuda/include/thrust/system/cuda/detail/util.h(239): error: variable "cuda_cub" has already been defined

    opened by FeixLiu 2
  • how to use lightseq inference engine

    Hello, I have a question about how to use the LightSeq inference engine.

    I trained an en2de model based on fairseq which is a variant of the Transformer, i.e., I modified the FFN layer. Can I use the LightSeq inference engine to speed up the model? If not, what should I do?

    Looking forward to your reply, thank you!

    opened by lyzKF 7
  • module "lightseq" has no attribute "training"

    After installing in the officially documented way, running native_fs_transformer_export.py from the corresponding directory gives a "No module named 'export'" error. After working around it with an ln -s symlink, it then reports module "lightseq" has no attribute "training".

    opened by princelisjtu 7
  • GPT2 training example

    I tried to run the GPT2 training script. The version is 2.2.0, but there is no "lightseq.training.ops.pytorch.quantization" module. In which version can I run the GPT2 training example?

    opened by panyan96 1
Releases(v2.2.0)
  • v2.2.0(Oct 26, 2021)

    Inference

    Support more multi-language models #209

    Fixes

    Fix inference error on HDF5 #208
    Fix training error when batch_size=1 #192
    Other minor fixes: #205 #202 #193

  • v2.1.3(Aug 19, 2021)

    This version contains several features and bug fixes.

    Training

    relax restriction of layer norm hidden size #137 #161
    support inference during training for transformer #141 #146 #147

    Inference

    Add inference support and examples for BERT #145

    Fixes

    fix save/load for training with pytorch #139
    fix pos embedding index bug #144

  • v2.1.0(Jul 19, 2021)

    This version contains several features and bug fixes.

    Training

    support BertEncoder #116
    support torch amp and apex amp #100

    Inference

    support big models like gpt2-large and bart-large #82

    Fixes

    fix adam bug when param size < 1024 #98
    fix training compiling fail in cuda < 11 #80

  • v2.0.2(Jun 25, 2021)

  • v2.0.1(Jun 24, 2021)

  • v2.0.0(Jun 20, 2021)

    It's been a long time since our last release (v1.2.0). For the past six months, we have focused on training efficiency.

    In this release, LightSeq supports fast training for models in the Transformer family!

    We provide highly optimized custom operators for PyTorch and TensorFlow, which cover the entire training process for Transformer-based models. Users of LightSeq can use these operators to build their own models with efficient computation.

    In addition, we integrate our custom operators into popular training libraries like Fairseq, Hugging Face, and NeurST, which enables a 1.5x-3x end-to-end speedup compared to the native versions.

    With only a small amount of code, you can enjoy the excellent performance provided by LightSeq. Try it now!

    Training

    • support lightseq-train to accelerate fairseq training, including optimized transformer model, adam, and label smoothed loss
    • huggingface bert training example
    • neurst transformer training example for Tensorflow users

    Inference

    • support GPT python wrapper
    • inference APIs are moved to lightseq.inference

    This release includes an API change for inference: all inference APIs have moved to lightseq.inference. For example, use import lightseq.inference and model = lightseq.inference.Transformer("$PB_PATH", max_batch_size).

  • v1.2.0(Dec 24, 2020)

  • v1.1.0(Oct 29, 2020)

  • v1.0.0(Dec 6, 2019)

Owner
Bytedance Inc.