Production First and Production Ready End-to-End Speech Recognition Toolkit

Overview

WeNet

Chinese version

Discussions | Docs | Papers | Runtime (x86) | Runtime (android) | Pretrained Models

We share neural Net together.

The main motivation of WeNet is to close the gap between research and production end-to-end (E2E) speech recognition models, to reduce the effort of productionizing E2E models, and to explore better E2E models for production.

Highlights

  • Production first and production ready: The Python code of WeNet meets the requirements of TorchScript, so a model trained with WeNet can be exported directly by Torch JIT and run with LibTorch for inference. There is no gap between the research model and the production model. Neither model conversion nor additional code is required for model inference.
  • Unified solution for streaming and non-streaming ASR: WeNet implements the Unified Two Pass (U2) framework to achieve an accurate, fast, and unified E2E model, which is favorable for industry adoption.
  • Portable runtime: Several demos will be provided to show how to host WeNet-trained models on different platforms, including server (x86) and on-device (Android).
  • Light weight: WeNet is designed specifically for E2E speech recognition, with clean and simple code. It is all based on PyTorch and its corresponding ecosystem. It has no dependency on Kaldi, which simplifies installation and usage.
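
As a rough illustration of the two-pass idea behind U2, the sketch below rescoring a first-pass n-best list with a second-pass score. It assumes the final score is a weighted sum of a CTC score and an attention-decoder score; the function name, the default weights, and the toy scores are hypothetical and are not WeNet's API.

```python
from typing import List, Tuple


def rescore_nbest(
    nbest: List[Tuple[str, float, float]],
    ctc_weight: float = 0.5,
    rescoring_weight: float = 1.0,
) -> Tuple[str, float]:
    """Pick the best hypothesis from a first-pass n-best list.

    Each entry is (text, ctc_log_score, attention_log_score).
    The weighted-sum combination is an assumption made for this sketch.
    """
    best_text, best_score = "", float("-inf")
    for text, ctc_score, att_score in nbest:
        score = ctc_weight * ctc_score + rescoring_weight * att_score
        if score > best_score:
            best_text, best_score = text, score
    return best_text, best_score


# Toy log-scores (hypothetical): the first pass slightly prefers the wrong
# hypothesis, and the second pass corrects the ranking.
nbest = [
    ("hello word", -4.0, -9.0),
    ("hello world", -4.5, -6.0),
]
best_text, best_score = rescore_nbest(nbest)
print(best_text, best_score)
```

In a streaming setup the first pass can emit partial results chunk by chunk, and a rescoring step like this one runs only once the utterance ends.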

Performance Benchmark

Please see examples/$dataset/s0/README.md for benchmarks on different speech datasets.

Installation

  • Clone the repo
git clone https://github.com/wenet-e2e/wenet.git
  • Create a Conda environment and install the dependencies
conda create -n wenet python=3.8
conda activate wenet
pip install -r requirements.txt
conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c conda-forge
  • Optionally, if you want to use the x86 runtime or a language model (LM), build the runtime as follows. Otherwise, you can skip this step.
# runtime build requires cmake 3.14 or above
cd runtime/server/x86
mkdir build && cd build && cmake .. && cmake --build .
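
Before building, it can help to confirm the CMake requirement mentioned in the comment above. This is a minimal sketch that assumes `cmake --version` prints a first line of the form `cmake version X.Y.Z`; the helper names are hypothetical.

```python
import re
import subprocess
from typing import Optional, Tuple

MIN_CMAKE = (3, 14)  # minimum version stated in the build instructions


def parse_cmake_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract (major, minor[, patch]) from `cmake --version` output."""
    match = re.search(r"cmake version (\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups() if part is not None)


def installed_cmake_version() -> Optional[Tuple[int, ...]]:
    """Return the local CMake version, or None if cmake cannot be run."""
    try:
        result = subprocess.run(
            ["cmake", "--version"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_cmake_version(result.stdout)


def is_new_enough(version: Optional[Tuple[int, ...]]) -> bool:
    """True if the version is known and at least MIN_CMAKE."""
    return version is not None and version[:2] >= MIN_CMAKE


if __name__ == "__main__":
    version = installed_cmake_version()
    status = "ok" if is_new_enough(version) else "too old or missing"
    print("cmake:", version, status)
```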

Discussion & Communication

Please visit Discussions for further discussion.

For Chinese users, you can also scan the QR code on the left to follow the official WeNet account. We created a WeChat group for better discussion and quicker response. Please scan the personal QR code on the right; that person is responsible for inviting you to the chat group.

If you cannot access the QR image, please access it on Gitee.

Or you can discuss directly on GitHub Issues.

Contributors

Acknowledgements

  1. We borrowed a lot of code from ESPnet for transformer based modeling.
  2. We borrowed a lot of code from Kaldi for WFST based decoding for LM integration.
  3. We referred to EESEN for building the TLG based graph for LM integration.
  4. We referred to OpenTransformer for python batch inference of e2e models.

Citations

@inproceedings{yao2021wenet,
  title={WeNet: Production oriented Streaming and Non-streaming End-to-End Speech Recognition Toolkit},
  author={Yao, Zhuoyuan and Wu, Di and Wang, Xiong and Zhang, Binbin and Yu, Fan and Yang, Chao and Peng, Zhendong and Chen, Xiaoyu and Xie, Lei and Lei, Xin},
  booktitle={Proc. Interspeech},
  year={2021},
  address={Brno, Czech Republic},
  organization={IEEE}
}

@article{zhang2020unified,
  title={Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition},
  author={Zhang, Binbin and Wu, Di and Yao, Zhuoyuan and Wang, Xiong and Yu, Fan and Yang, Chao and Guo, Liyong and Hu, Yaguang and Xie, Lei and Lei, Xin},
  journal={arXiv preprint arXiv:2012.05481},
  year={2020}
}

@article{wu2021u2++,
  title={U2++: Unified Two-pass Bidirectional End-to-end Model for Speech Recognition},
  author={Wu, Di and Zhang, Binbin and Yang, Chao and Peng, Zhendong and Xia, Wenjing and Chen, Xiaoyu and Lei, Xin},
  journal={arXiv preprint arXiv:2106.05642},
  year={2021}
}
Comments
  • BUG for ONNX inference

    When I run inference with u2++_conformer, after processing just 50 wav files, the following error is thrown:

    I0515 00:10:40.909694 146005 decoder_main.cc:67] num frames 1118
    I0515 00:10:41.026697 146005 decoder_main.cc:86] Partial result: 在机关
    I0515 00:10:41.056061 146005 decoder_main.cc:86] Partial result: 在机关服务
    I0515 00:10:41.085124 146005 decoder_main.cc:86] Partial result: 在机关围剿
    I0515 00:10:41.110785 146005 decoder_main.cc:86] Partial result: 在机关围剿和
    I0515 00:10:41.136417 146005 decoder_main.cc:86] Partial result: 在机关围剿和
    I0515 00:10:41.176227 146005 decoder_main.cc:86] Partial result: 在机关围剿和工程
    I0515 00:10:41.217715 146005 decoder_main.cc:86] Partial result: 在机关围剿和工程多处的
    I0515 00:10:41.251241 146005 decoder_main.cc:86] Partial result: 在机关围剿和工程多处的战斗中
    I0515 00:10:41.282459 146005 decoder_main.cc:86] Partial result: 在机关围剿和工程多处的战斗中太勇敢
    I0515 00:10:41.311969 146005 decoder_main.cc:86] Partial result: 在机关围剿和工程多处的战斗中太勇敢坚定
    I0515 00:10:41.341024 146005 decoder_main.cc:86] Partial result: 在机关围剿和工程多处的战斗中太勇敢坚定是
    I0515 00:10:41.398414 146005 decoder_main.cc:86] Partial result: 在机关围剿和工程多处的战斗中太勇敢坚定是一军的
    I0515 00:10:41.429834 146005 decoder_main.cc:86] Partial result: 在机关围剿和工程多处的战斗中太勇敢坚定是一军的
    I0515 00:10:41.462321 146005 decoder_main.cc:86] Partial result: 在机关围剿和工程多处的战斗中太勇敢坚定是一军的主要将领
    Segmentation fault (core dumped)

    This file should have been processed completely. I will dig deeper to locate the bug.

    onnxruntime: 1.10.0 and 1.11.1

    opened by Fred-cell 26
  • LibTorch gpu cmake error

    Hello, when I execute "mkdir build && cd build && cmake -DGRPC=ON ..", the following error is reported.

    Native environment: CentOS 7.9, nvidia: 11.3, cuda version: 11

    (wenet_gpu) [[email protected] build]$ cmake -DGPU=ON ..
    -- The C compiler identification is GNU 4.8.5
    -- The CXX compiler identification is GNU 4.8.5
    -- Check for working C compiler: /usr/bin/cc
    -- Check for working C compiler: /usr/bin/cc -- works
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Check for working CXX compiler: /usr/bin/c++
    -- Check for working CXX compiler: /usr/bin/c++ -- works
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Populating libtorch
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /home/ZYJ/WeNet/wenet_gpu/wenet/runtime/LibTorch/fc_base/libtorch-subbuild
    [ 11%] Performing download step (download, verify and extract) for 'libtorch-populate'
    -- verifying file...
         file='/home/ZYJ/WeNet/wenet_gpu/wenet/runtime/LibTorch/fc_base/libtorch-subbuild/libtorch-populate-prefix/src/libtorch-shared-with-deps-1.10.0%2Bcu113.zip'
    -- File already exists and hash match (skip download):
         file='/home/ZYJ/WeNet/wenet_gpu/wenet/runtime/LibTorch/fc_base/libtorch-subbuild/libtorch-populate-prefix/src/libtorch-shared-with-deps-1.10.0%2Bcu113.zip'
         SHA256='0996a6a4ea8bbc1137b4fb0476eeca25b5efd8ed38955218dec1b73929090053'
    -- extracting...
         src='/home/ZYJ/WeNet/wenet_gpu/wenet/runtime/LibTorch/fc_base/libtorch-subbuild/libtorch-populate-prefix/src/libtorch-shared-with-deps-1.10.0%2Bcu113.zip'
         dst='/home/ZYJ/WeNet/wenet_gpu/wenet/runtime/LibTorch/fc_base/libtorch-src'
    -- extracting... [tar xfz]
    -- extracting... [analysis]
    -- extracting... [rename]
    -- extracting... [clean up]
    -- extracting... done
    [ 22%] No patch step for 'libtorch-populate'
    [ 33%] No update step for 'libtorch-populate'
    [ 44%] No configure step for 'libtorch-populate'
    [ 55%] No build step for 'libtorch-populate'
    [ 66%] No install step for 'libtorch-populate'
    [ 77%] No test step for 'libtorch-populate'
    [ 88%] Completed 'libtorch-populate'
    [100%] Built target libtorch-populate
    -- Looking for pthread.h
    -- Looking for pthread.h - found
    -- Looking for pthread_create
    -- Looking for pthread_create - not found
    -- Looking for pthread_create in pthreads
    -- Looking for pthread_create in pthreads - not found
    -- Looking for pthread_create in pthread
    -- Looking for pthread_create in pthread - found
    -- Found Threads: TRUE
    -- Found CUDA: /usr/local/cuda-11.3 (found version "11.3")
    -- Caffe2: CUDA detected: 11.3
    -- Caffe2: CUDA nvcc is: /usr/local/cuda-11.3/bin/nvcc
    -- Caffe2: CUDA toolkit directory: /usr/local/cuda-11.3
    CMake Error at fc_base/libtorch-src/share/cmake/Caffe2/public/cuda.cmake:75 (message):
      Caffe2: Couldn't determine version from header:
      Change Dir: /home/ZYJ/WeNet/wenet_gpu/wenet/runtime/LibTorch/build/CMakeFiles/CMakeTmp

    Run Build Command(s):/usr/bin/gmake cmTC_3d968/fast

    /usr/bin/gmake -f CMakeFiles/cmTC_3d968.dir/build.make CMakeFiles/cmTC_3d968.dir/build

    gmake[1]: Entering directory '/home/ZYJ/WeNet/wenet_gpu/wenet/runtime/LibTorch/build/CMakeFiles/CMakeTmp'

    Building CXX object CMakeFiles/cmTC_3d968.dir/detect_cuda_version.cc.o

    /usr/bin/c++ -I/usr/local/cuda-11.3/include -std=c++14 -pthread -fPIC -o CMakeFiles/cmTC_3d968.dir/detect_cuda_version.cc.o -c /home/ZYJ/WeNet/wenet_gpu/wenet/runtime/LibTorch/build/detect_cuda_version.cc

    c++: error: unrecognized command line option '-std=c++14'

    gmake[1]: *** [CMakeFiles/cmTC_3d968.dir/detect_cuda_version.cc.o] Error 1

    gmake[1]: Leaving directory '/home/ZYJ/WeNet/wenet_gpu/wenet/runtime/LibTorch/build/CMakeFiles/CMakeTmp'

    gmake: *** [cmTC_3d968/fast] Error 2

    Call Stack (most recent call first):
      fc_base/libtorch-src/share/cmake/Caffe2/Caffe2Config.cmake:88 (include)
      fc_base/libtorch-src/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
      cmake/libtorch.cmake:52 (find_package)
      CMakeLists.txt:35 (include)

    -- Configuring incomplete, errors occurred!
    See also "/home/ZYJ/WeNet/wenet_gpu/wenet/runtime/LibTorch/build/CMakeFiles/CMakeOutput.log".
    See also "/home/ZYJ/WeNet/wenet_gpu/wenet/runtime/LibTorch/build/CMakeFiles/CMakeError.log".

    What should I do?

    opened by zhaoyinjiang9825 16
  • Streaming performance issues on upgrading to release v2.0.0

    Describe the bug

    On updating to release v2.0.0, I've been noticing some performance issues when running real-time audio streams against a quantized e2e model (no LM) via runtime/server/x86/bin/websocket_server_main. For some stretches of time, performance may be comparable between v1 and v2, but there are points where I can expect to see upwards of 20s delay on a given response. Outside of a few minor updates related to the switch, nothing else (e.g. resource allocations) has been changed on my end.

    Thus far, I haven't been able to pinpoint much of a pattern to the lag, except that it seems to consistently happen (in addition to other times) at the start of the stream. Have you observed any similar performance issues between v1 and v2, or is there some v2-specific runtime configuration I may have missed?

    Expected behavior

    Comparable real-time performance between releases v1 and v2.

    Screenshots

    The following graphs show the results from a single test. The x-axes represent the progression of the audio file being tested, and the y-axes represent round-trip response times from wenet minus some threshold, i.e. any data points above 0 indicate additional round-trip latency above an acceptable threshold (in my case, 500ms). As you can see, in the v1 graph responses are largely generated and returned below the threshold time (with the exception of a few final-marked transcripts). However, in the v2 graph, there are several lengthy periods during which responses take an unusually long time to return (I've capped the graph at 2s for clearer viewing, but in reality responses are taking up to 20s to return).

    Wenet v1 (image: Snag_4f1b28)

    Wenet v2 (image: Snag_508b72)

    Additional context

    Both tests were run with wenet hosted via AWS ECS/EC2. So far as I've seen, increasing CPU + memory allocations to the wenet container doesn't seem to resolve the issue.
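
The y-axis quantity described in this report (round-trip time minus an acceptable threshold) can be computed as in the sketch below. The sample round-trip times are hypothetical, and applying the 2 s cap to the excess value rather than to the raw round-trip time is an assumption; only the 500 ms threshold and the 2 s cap come from the report.

```python
from typing import List

THRESHOLD_MS = 500.0  # acceptable round-trip latency stated in the report
CAP_MS = 2000.0       # plotting cap stated in the report


def excess_latency(round_trip_ms: List[float],
                   threshold_ms: float = THRESHOLD_MS) -> List[float]:
    """Round-trip time minus threshold; values above 0 are too slow."""
    return [rt - threshold_ms for rt in round_trip_ms]


def capped(values: List[float], cap_ms: float = CAP_MS) -> List[float]:
    """Clip large values so the slow periods do not flatten the plot."""
    return [min(v, cap_ms) for v in values]


# Hypothetical round-trip times in milliseconds.
samples_ms = [320.0, 480.0, 750.0, 20000.0]
excess = excess_latency(samples_ms)
late = [v for v in excess if v > 0]

print(excess)          # negative values are within the threshold
print(capped(excess))  # values as they would be plotted
print(len(late), "responses exceeded the threshold")
```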

    opened by kangnari 16
  • onnx runtime error 2: not enough space: expected 318080, got 314240

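    Describe the bug

    This bug may be a Triton server problem. I deployed the model with the GPU production service (Triton server) provided in the code. When testing the encoder module directly, I need to send fbank features straight to the server. If three threads send requests concurrently and each request has a random number of steps (that is, the fbank sequences have different numbers of time steps), transcription is fairly slow but no error is raised. My guess is that because the requests have different numbers of steps, they cannot be grouped into a batch, so the server-side dynamic_batching spends a long time waiting to form one. I therefore set max_queue_delay_microseconds to 70000, i.e. stop waiting for a batch after 70 ms and run inference directly. With this setting, the client fails with some probability, with the following exception:

    Traceback (most recent call last):
      File "debug_encoder.py", line 30, in input_numpy
        response = triton_client.infer("encoder",
      File "/opt/conda/lib/python3.8/site-packages/tritonclient/grpc/__init__.py", line 1156, in infer
        raise_error_grpc(rpc_error)
      File "/opt/conda/lib/python3.8/site-packages/tritonclient/grpc/__init__.py", line 62, in raise_error_grpc
        raise get_error_grpc(rpc_error) from None
    tritonclient.utils.InferenceServerException: [StatusCode.INTERNAL] onnx runtime error 2: not enough space: expected 318080, got 314240

    In this case the three fbank features I sent had 482, 497, and 485 steps, with dims 80 and batch_size 1. 318080 is exactly 497*80*8, so the model inexplicably ran out of space while predicting the 497-step request. The error is intermittent across repeated concurrent runs, and after it occurs, subsequent requests may still succeed. If I send requests one at a time instead of concurrently, no error occurs. If the concurrent requests have a fixed size, no error occurs either. The error appears only when concurrent requests have varying lengths and max_queue_delay_microseconds is relatively small.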
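
The size arithmetic in this report (318080 = 497 × 80 × 8) can be checked with a short sketch. The factor 8 is taken from the report as-is; what it represents is not stated there, and the helper names are hypothetical.

```python
DIMS = 80   # fbank feature dimension from the report
FACTOR = 8  # multiplier from the report's 497*80*8; its meaning is not stated


def buffer_size(steps: int, dims: int = DIMS, factor: int = FACTOR) -> int:
    """Size implied by a request with the given number of time steps."""
    return steps * dims * factor


def steps_for_size(size: int, dims: int = DIMS, factor: int = FACTOR) -> float:
    """Invert buffer_size: how many steps a given size corresponds to."""
    return size / (dims * factor)


# Sizes implied by the three concurrent requests in the report.
for steps in (482, 497, 485):
    print(steps, buffer_size(steps))

print(steps_for_size(318080))  # the "expected" size in the error message
print(steps_for_size(314240))  # the "got" size in the error message
```

Under this arithmetic the "got" size corresponds to 491 steps, which matches none of the three request lengths. It happens to equal the mean of 497 and 485, though that observation alone does not establish the cause.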

    Desktop (please complete the following information):

    • triton server: 21.11
    • Server: 16 GB RAM and a T4 GPU with 16 GB of memory, so running out of GPU or host memory should not be possible
    opened by piekey1994 15
  • Runtime: words containing non-ASCII characters are concatenated without space

    The runtime outputs decoded words containing non-ASCII characters as concatenated with neighbouring words: e.g. "aa ää xx yy" is transformed to "aaääxx yy".

    This is caused by the code block starting at https://github.com/wenet-e2e/wenet/blob/604231391c81efdf06454dbc99406bbc06cb030d/runtime/core/decoder/torch_asr_decoder.cc#L217

    I understand that this is done in order to output Chinese "words" correctly (i.e., without spaces). However, this should at least be configurable, as currently it breaks wenet runtime for most other languages (i.e. those that have words with non-ASCII characters and where words are separated by spaces in the orthography).
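
The reported behavior can be reproduced with a simplified model in which a space is inserted only between two neighbouring all-ASCII words. This is a sketch inferred from the example input and output above, not a copy of the code in torch_asr_decoder.cc; both function names and the `space_separated` option are hypothetical.

```python
from typing import List


def join_ascii_only_spacing(words: List[str]) -> str:
    """Insert a space only between two neighbouring all-ASCII words."""
    sentence = ""
    prev_ascii = False
    for word in words:
        now_ascii = word.isascii()
        if prev_ascii and now_ascii:
            sentence += " " + word
        else:
            sentence += word
        prev_ascii = now_ascii
    return sentence


def join_words(words: List[str], space_separated: bool = True) -> str:
    """Hypothetical configurable variant.

    space_separated=True suits languages that separate words with spaces;
    False falls back to the ASCII-only spacing rule.
    """
    if space_separated:
        return " ".join(words)
    return join_ascii_only_spacing(words)


words = ["aa", "ää", "xx", "yy"]
print(join_ascii_only_spacing(words))  # aaääxx yy
print(join_words(words))               # aa ää xx yy
print(join_words(["你", "好"], space_separated=False))  # 你好
```

Making the joining rule an explicit option is one way to keep Chinese output unchanged while letting other languages opt into plain space separation.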

    opened by alumae 14
  • cmake compile server/x86 error

    Describe the bug A clear and concise description of what the bug is.

    environment: centos7
    gcc version 7.5.0
    cmake version: 3.18.3
    CUDA version: 10.2
    gpu version:  Quadro RTX 8000
    
    
    install steps:
    $ conda create -n wenet python=3.8
    $ conda activate wenet
    $ pip install -r requirements.txt
    $ conda install pytorch==1.6.0 cudatoolkit=10.2 torchaudio -c pytorch
    
    $ cd wenet/runtime/server/x86/
    $ mkdir build && cd build && cmake .. && cmake --build .
    

    ERROR is as follows:

    [ 50%] Linking CXX executable ctc_prefix_beam_search_test
    /home4/md510/cmake-3.18.3/bin/cmake -E cmake_link_script CMakeFiles/ctc_prefix_beam_search_test.dir/link.txt --verbose=1
    /home3/md510/gcc-7.5.0/bin/g++  -std=c++14 -pthread -fPIC -D_GLIBCXX_USE_CXX11_ABI=1 -DC10_USE_GLOG -L/cm/shared/apps/cuda10.2/toolkit/10.2.89/lib64 CMakeFiles/ctc_prefix_beam_search_test.dir/decoder/ctc_prefix_beam_search_test.cc.o -o ctc_prefix_beam_search_test   -L/home3/md510/w2020/wenet_20210512/wenet/runtime/server/x86/build/openfst/lib  -Wl,-rpath,/home3/md510/w2020/wenet_20210512/wenet/runtime/server/x86/build/openfst/lib:/home3/md510/w2020/wenet_20210512/wenet/runtime/server/x86/fc_base/libtorch-src/lib lib/libgtest_main.a lib/libgmock.a libdecoder.a lib/libgtest.a ../fc_base/libtorch-src/lib/libtorch.so -Wl,--no-as-needed,/home3/md510/w2020/wenet_20210512/wenet/runtime/server/x86/fc_base/libtorch-src/lib/libtorch_cpu.so -Wl,--as-needed ../fc_base/libtorch-src/lib/libc10.so -lpthread -Wl,--no-as-needed,/home3/md510/w2020/wenet_20210512/wenet/runtime/server/x86/fc_base/libtorch-src/lib/libtorch.so -Wl,--as-needed ../fc_base/libtorch-src/lib/libc10.so kaldi/libkaldi-decoder.a kaldi/libkaldi-lat.a kaldi/libkaldi-util.a kaldi/libkaldi-base.a libutils.a -lfst 
    /home3/md510/w2020/wenet_20210512/wenet/runtime/server/x86/fc_base/libtorch-src/lib/libtorch_cpu.so: undefined reference to `[email protected]_2.23'
    /home3/md510/w2020/wenet_20210512/wenet/runtime/server/x86/fc_base/libtorch-src/lib/libtorch_cpu.so: undefined reference to `[email protected]_2.23'
    collect2: error: ld returned 1 exit status
    gmake[2]: *** [ctc_prefix_beam_search_test] Error 1
    gmake[2]: Leaving directory `/home3/md510/w2020/wenet_20210512/wenet/runtime/server/x86/build'
    gmake[1]: *** [CMakeFiles/ctc_prefix_beam_search_test.dir/all] Error 2
    gmake[1]: Leaving directory `/home3/md510/w2020/wenet_20210512/wenet/runtime/server/x86/build'
    gmake: *** [all] Error 2
    
    

    Could you help me solve it?

    opened by shanguanma 14
  • DLL load failed while importing _wenet: The specified module could not be found.

    I installed wenet with pip install wenet, and the installation reported success. I then ran the example program for recognition. The program is:

    import sys
    import wenet

    def get_text_from_wav(dir, wav):
        model_dir = dir
        wav_file = wav
        decoder = wenet.Decoder(model_dir)
        ans = decoder.decode_wav(wav_file)
        print(ans)

    if __name__ == '__main__':
        dir = "./models"
        wav = "./1.wav"
        get_text_from_wav(dir, wav)

    But running it fails with the following error:

    Traceback (most recent call last):
      File "D:\codes\speech2word\main.py", line 2, in <module>
        import wenet
      File "D:\codes\speech2word\venv\lib\site-packages\wenet\__init__.py", line 1, in <module>
        from .decoder import Decoder # noqa
      File "D:\codes\speech2word\venv\lib\site-packages\wenet\decoder.py", line 17, in <module>
        import _wenet
    ImportError: DLL load failed while importing _wenet: The specified module could not be found.

    How can I solve this?

    opened by billqu01 13
  • [Draft] Cache control v2

    This is not a merge-ready PR. I am just pushing my testing code for discussion and further evaluation (such as GPU perf, ONNX export, ...).

    Performance on CPU (Intel i7-10510U @ 1.80GHz): RTF drops from 0.1 to 0.07, about a 30% improvement. [image]
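
For reference, the quoted improvement follows from the two RTF values. The sketch below uses the usual definition of real-time factor (processing time divided by audio duration); the helper names are hypothetical and only the two RTF values come from this PR.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent decoding / duration of the decoded audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a metric dropped relative to its old value."""
    return (before - after) / before


rtf_before, rtf_after = 0.1, 0.07  # values reported in this PR
print(f"{relative_reduction(rtf_before, rtf_after):.0%}")  # 30%

# Example: decoding 100 s of audio in 7 s gives an RTF of 0.07.
print(real_time_factor(7.0, 100.0))
```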

    Detailed descriptions (in Chinese): https://horizonrobotics.feishu.cn/sheets/shtcniLh77AgP6NJAXhd5UHXDwh

    Test code:

    bash rtf.sh --api 1 > log.txt.1
    bash rtf.sh --api 2 > log.txt.2
    grep "RTF:" log.txt.1
    grep "RTF:" log.txt.2
    

    u2++_conformer.zip: https://horizonrobotics.feishu.cn/file/boxcnO50Ea8m0rR2p9FwJ8ZHEIc

    words.txt: https://horizonrobotics.feishu.cn/file/boxcnBpSEOWoBSIgLdlHetsjOFd

    opened by xingchensong 13
  • Use DDP training to get stuck

    Describe the bug

    DDP training hangs with my own copy of wenet and my own data. It hangs (GPU utilization at 100%) at the beginning of the second epoch every time. After debugging, I found that it hangs at this position:

    # wenet/utils/executor.py
    with torch.cuda.amp.autocast(scaler is not None):
        loss, loss_att, loss_ctc = model(
            feats, feats_lengths, target, target_lengths)
    

    Environment

    • CentOS Linux release 7.8.2003 (Core)
    • GPU Driver Version: 450.80.02
    • CUDA Version: 10.2
    • torch==1.8.0
    • torchaudio==1.8.1
    • torchvision==0.9.0

    Some Attempts

    I made some further attempts and found:

    • 1 GPU: no problem; multiple GPUs: stuck
    • static batch: no problem; dynamic batch: stuck
    • conformer: no problem; unified_conformer: stuck

    Other attempts:

    • Upgrading PyTorch to 1.9.0 or 1.10.0 did not help
    • Setting num_workers=0/1 did not help
    • Switching from V100 to P40 did not help
    • Sleeping for 1 minute after completing an epoch did not help
    • NCCL hangs completely without any error log

    GLOO error log:

    2021-12-07 11:36:17,011 INFO Epoch 0 CV info cv_loss 115.3632936241356
    2021-12-07 11:36:17,011 INFO Epoch 1 TRAIN info lr 6.08e-06
    2021-12-07 11:36:17,014 INFO using accumulate grad, new batch size is 8 times larger than before
    2021-12-07 11:36:17,335 INFO Epoch 0 CV info cv_loss 115.36239801458647
    2021-12-07 11:36:17,335 INFO Epoch 1 TRAIN info lr 6.200000000000001e-06
    2021-12-07 11:36:17,338 INFO using accumulate grad, new batch size is 8 times larger than before
    2021-12-07 11:36:17,579 INFO Epoch 0 CV info cv_loss 115.36309641650827
    2021-12-07 11:36:17,579 INFO Epoch 1 TRAIN info lr 5.96e-06
    2021-12-07 11:36:17,582 INFO using accumulate grad, new batch size is 8 times larger than before
    2021-12-07 11:36:17,926 INFO Epoch 0 CV info cv_loss 115.36275817930736
    2021-12-07 11:36:17,926 INFO Checkpoint: save to checkpoint exp/conformer/0.pt
    2021-12-07 11:36:18,889 INFO Epoch 1 TRAIN info lr 6.32e-06
    2021-12-07 11:36:18,892 INFO using accumulate grad, new batch size is 8 times larger than before
    terminate called after throwing an instance of 'gloo::EnforceNotMet'
      what():  [enforce fail at /opt/conda/conda-bld/pytorch_1614378062065/work/third_party/gloo/gloo/transport/tcp/pair.cc:490] op.preamble.length <= op.nbytes. 939336 vs 4
    ./run.sh: line 165:  7108 Aborted                 (core dumped) python wenet/bin/train.py --gpu $gpu_id --config $train_config --data_type $data_type --symbol_table $dict --train_data data/$train_set/data.list --cv_data data/dev/data.list ${checkpoint:+--checkpoint $checkpoint} --model_dir $dir --ddp.init_method $init_method --ddp.world_size $world_size --ddp.rank $rank --ddp.dist_backend $dist_backend --num_workers 8 $cmvn_opts --pin_memory
    /homepath/envs/anaconda3/lib/python3.8/multiprocessing/process.py:108: ResourceWarning: unclosed file <_io.BufferedReader name='/homepath/tools/wenet-uio/examples/aishell/s0/data/train/shards/shards_000000002.tar'>
      self._target(*self._args, **self._kwargs)
    ResourceWarning: Enable tracemalloc to get the object allocation traceback
    /homepath/envs/anaconda3/lib/python3.8/multiprocessing/process.py:108: ResourceWarning: unclosed file <_io.BufferedReader name='/homepath/tools/wenet-uio/examples/aishell/s0/data/train/shards/shards_000000110.tar'>
      self._target(*self._args, **self._kwargs)
    ResourceWarning: Enable tracemalloc to get the object allocation traceback
    /homepath/envs/anaconda3/lib/python3.8/multiprocessing/process.py:108: ResourceWarning: unclosed file <_io.BufferedReader name='/homepath/tools/wenet-uio/examples/aishell/s0/data/train/shards/shards_000000112.tar'>
      self._target(*self._args, **self._kwargs)
    ResourceWarning: Enable tracemalloc to get the object allocation traceback
    /homepath/envs/anaconda3/lib/python3.8/multiprocessing/process.py:108: ResourceWarning: unclosed file <_io.BufferedReader name='/homepath/tools/wenet-uio/examples/aishell/s0/data/train/shards/shards_000000075.tar'>
      self._target(*self._args, **self._kwargs)
    ResourceWarning: Enable tracemalloc to get the object allocation traceback
    /homepath/envs/anaconda3/lib/python3.8/multiprocessing/process.py:108: ResourceWarning: unclosed file <_io.BufferedReader name='/homepath/tools/wenet-uio/examples/aishell/s0/data/train/shards/shards_000000001.tar'>
      self._target(*self._args, **self._kwargs)
    ResourceWarning: Enable tracemalloc to get the object allocation traceback
    /homepath/envs/anaconda3/lib/python3.8/multiprocessing/process.py:108: ResourceWarning: unclosed file <_io.BufferedReader name='/homepath/tools/wenet-uio/examples/aishell/s0/data/train/shards/shards_000000086.tar'>
      self._target(*self._args, **self._kwargs)
    ResourceWarning: Enable tracemalloc to get the object allocation traceback
    Traceback (most recent call last):
      File "wenet/bin/train.py", line 277, in <module>
        main()
      File "wenet/bin/train.py", line 250, in main
        executor.train(model, optimizer, scheduler, train_data_loader, device,
      File "/homepath/tools/wenet-uio/wenet/utils/executor.py", line 71, in train
        loss.backward()
      File "/homepath/envs/anaconda3/lib/python3.8/site-packages/torch/tensor.py", line 245, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
      File "/homepath/envs/anaconda3/lib/python3.8/site-packages/torch/autograd/__init__.py", line 145, in backward
        Variable._execution_engine.run_backward(
    RuntimeError: [/opt/conda/conda-bld/pytorch_1614378062065/work/third_party/gloo/gloo/transport/tcp/pair.cc:575] Connection closed by peer [11.88.165.7]:54008
    Traceback (most recent call last):
      File "wenet/bin/train.py", line 277, in <module>
        main()
      File "wenet/bin/train.py", line 250, in main
        executor.train(model, optimizer, scheduler, train_data_loader, device,
      File "/homepath/tools/wenet-uio/wenet/utils/executor.py", line 71, in train
        loss.backward()
      File "/homepath/envs/anaconda3/lib/python3.8/site-packages/torch/tensor.py", line 245, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
      File "/homepath/envs/anaconda3/lib/python3.8/site-packages/torch/autograd/__init__.py", line 145, in backward
        Variable._execution_engine.run_backward(
    RuntimeError: Application timeout caused pair closure
    

    To Reproduce

    Finally, I pulled the latest wenet code and reproduced the above problem with the aishell recipe:

    data_type=shard
    train_config=conf/train_unified_conformer.yaml
    cmvn=false
    dynamic batch
    accum_grad=8
    

    How should this be solved? Thank you.

    opened by 601222543 12
  • Decoding hangs when using LM rescoring

    I'm following this tutorial to use LM rescoring for decoding: https://github.com/wenet-e2e/wenet/blob/23a61b212bf2c3886546925913f5574f779f474a/examples/librispeech/s0/run.sh#L234

    I didn't re-train a model; instead, I use the pre-trained conformer model. I had no problem building the TLG.fst, but ./tools/decode.sh hangs forever when evaluating on the test set. Could you suggest where the problem might be and how to debug it?

    Below is the code I used for LM rescoring (taken from run.sh):

    pretrained_model=wenet/models/20210216_conformer_exp
    dict=$pretrained_model/words.txt
    bpemodel=$pretrained_model/train_960_unigram5000
    
    lm=data/local/lm
    lexicon=data/local/dict/lexicon.txt
    mkdir -p $lm
    mkdir -p data/local/dict
    
    # 7.1 Download & format LM
    which_lm=3-gram.pruned.1e-7.arpa.gz
    if [ ! -e ${lm}/${which_lm} ]; then
        wget http://www.openslr.org/resources/11/${which_lm} -P ${lm}
    fi
    echo "unzip lm($which_lm)..."
    gunzip -k ${lm}/${which_lm} -c > ${lm}/lm.arpa
    echo "Lm saved as ${lm}/lm.arpa"
    
    # 7.2 Prepare dict
    unit_file=$dict
    bpemodel=$bpemodel
    # use $dir/words.txt (unit_file) and $dir/train_960_unigram5000 (bpemodel)
    # if you download pretrained librispeech conformer model
    cp $unit_file data/local/dict/units.txt
    if [ ! -e ${lm}/librispeech-lexicon.txt ]; then
        wget http://www.openslr.org/resources/11/librispeech-lexicon.txt -P ${lm}
    fi
    echo "build lexicon..."
    tools/fst/prepare_dict.py $unit_file ${lm}/librispeech-lexicon.txt \
        $lexicon $bpemodel.model
    echo "lexicon saved as '$lexicon'"
    
    # 7.3 Build decoding TLG
    tools/fst/compile_lexicon_token_fst.sh \
       data/local/dict data/local/tmp data/local/lang
    tools/fst/make_tlg.sh data/local/lm data/local/lang data/lang_test || exit 1;
    
    # 7.4 Decoding with runtime
    echo "Start decoding..."
    fst_dir=data/lang_test
    dir=$pretrained_model
    recog_set="test_clean"
    for test in ${recog_set}; do
        ./tools/decode.sh --nj 2 \
            --beam 10.0 --lattice_beam 5 --max_active 7000 --blank_skip_thresh 0.98 \
            --ctc_weight 0.5 --rescoring_weight 1.0 --acoustic_scale 1.2 \
            --fst_path $fst_dir/TLG.fst \
            data/$test/wav.scp.10 data/$test/text.10 $dir/final.zip $fst_dir/words.txt \
            $dir/lm_with_runtime_${test}
        tail $dir/lm_with_runtime_${test}/wer
    done
    
    opened by boliangz 12
  • macOS M1 support?

    [ 96%] Linking CXX shared library libwenet_api.dylib
    ld: warning: ignoring file ../../../fc_base/libtorch-src/lib/libtorch.dylib, building for macOS-arm64 but attempting to link with file built for macOS-x86_64

    opened by jinfagang 11
  • When using libtorch, gpu decoding is slower than cpu.

    When decoding on the GPU, GPU memory gets allocated but gpu-util rises only after a long time. For example, when decoding 600 voices, progress is very slow until about the 100th, and then speeds up from the point when gpu-util rises. Increasing the number of threads in decoder_main.cc makes it faster, but I'd like to fix the problem in the single-threaded case. What should I do?

    cpu = 24 cores, gpu = RTX A5000 (24 GB) x 2, Ubuntu 20.04.4

    opened by hms1205 0
  • Quantized model under checkpoint mode performs quite different from the one under jit mode

    I trained an ASR model and converted it into a quantized model in both JIT mode (named asr_quant.zip) and checkpoint mode (named asr_quant_checkpoint.pt). But the results from the JIT mode and the checkpoint mode are quite different.

    Quantized model in jit mode: test Final result: 甚至出现交易几乎停滞的情况

    Quantized model in checkpoint mode: INFO BAC009S0764W0121 ▁LAWS骑钰阐易ISH燕▁CRITIC▁QUANTITY▁GOING骑燕▁MORE鲨ANSISH致▁GOING燕▁GOING燕▁GOING▁DESIRED▁GOING▁BREATH▁CRITIC俏尺骑▁GOING骑▁PERFECTION燕▁GOING燕▁SH燕▁SH谊▁PERFECTION敷唬诊▁SH定▁OVEN▁ORDERS尹O▁IGNORISH▁PRESIDENTO锣OKA▁PERFECTIONISH燕▁EIGHTEEN笛燕何▁PERFECTION▁INFORMEDLAND何骑▁PRETTY燕湿O▁PERFECTION尺O燕汐辆女何燕翼鲨O▁PERFECTION▁FIRST架燕绘翼盘锣▁THIS▁PRETTY▁SONG▁PERFECTION唬▁INFORMED障渲▁EIGHTEEN锣燕咏劈赌盘涉燕轧▁ABSORB汐O▁PERFECTION锣▁EIGHTEEN燕▁SH燕▁SH敷▁PRESIDENT书敷诊唬治唬唯轧辆▁IGNOR▁DOESN▁PERFECTION▁IGNOR洒翼O▁SAVE▁FIRST▁KISS▁PERFECTION锣▁PERFECTION备惭骑企洒▁PERFECTION洒慌▁SH▁CANDLE▁CHIN▁CANDLE企▁CHIN▁LIBERTY锣▁WEATHER▁FIRST▁COUNTRY敷▁CLERK

    opened by PPGGG 2
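  • No recognition output on Windows, and no error reported

    python version = 3.8.5

    I first installed the runtime:

    pip install wenetruntime

    Then I used the following script:

    import sys
    import torch
    import wenetruntime as wenet

    wav_file = sys.argv[1]
    decoder = wenet.Decoder(lang='chs')
    ans = decoder.decode_wav(wav_file)
    print(ans)

    Running the script on an audio.wav file produces no output and no error message; the script simply exits. Does anyone know why? Am I missing some environment configuration?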

    opened by zhhl9101 1
  • Efficient Conformer implementation

    This PR is about our implementation of Efficient Conformer for WeNet encoder structure and runtime.

    • Original paper: https://arxiv.org/pdf/2109.01163.pdf
    • Original code: https://github.com/burchim/EfficientConformer

    At 58.Com Inc, using Efficient Conformer reduces CER by 6% relative to Conformer and increases inference speed by 10% (CPU JIT runtime). Combined with int8 quantization, the inference speed can be improved by 50~70%. More details of our work: https://mp.weixin.qq.com/s/7T1gnNrVmKIDvQ03etltGQ

    Added features

    • [X] Efficient Conformer Encoder structure
      • [X] StrideConformerEncoderLayer for "Progressive Downsampling to the Conformer encoder"
      • [X] GroupedRelPositionMultiHeadedAttention for "Grouped Attention"
      • [X] Conv2dSubsampling2 for 1/2 Convolution Downsampling
    • [X] Recognize and JIT export
      • [X] forward_chunk and forward_chunk_by_chunk in wenet/efficient_conformer/encoder.py
    • [X] Streaming inference at JIT runtime
      • [X] TorchAsrModelEfficient in runtime/core/decoder for Progressive Downsampling
    • [X] Configuration file of Aishell-1
      • [X] train_u2++_efficonformer_v1.yaml for our online deployment
      • [X] train_u2++_efficonformer_v2.yaml for the original paper

    Developers

    • Efficient Conformer Encoder structure: ( Yaru Wang & Wei Zhou )
    • Recognize and JIT export: ( Wei Zhou )
    • Streaming inference at JIT runtime: ( Yongze Li )
    • Configuration file of Aishell-1: ( Wei Zhou )

    TODO

    • [ ] ONNX export and runtime
    • [x] Aishell-1 experiment
    opened by zwglory 2
  • ONNX export fails with export_onnx_gpu.py

    ONNX export fails with export_onnx_gpu.py

    error.log (attached) was generated with verbose output enabled.

    I tried several onnxruntime versions, and all of them gave the same errors. A shortened log follows:

    python3 wenet/bin/export_onnx_gpu.py --config=/home/ricky/heqing/8w-hours/squeezeformer-8whr-avg2/train.yaml --checkpoint=/home/ricky/heqing/8w-hours/squeezeformer-8whr-avg2/avg_10_156000_13_196000.pt --cmvn_file=/home/ricky/heqing/8w-hours/squeezeformer-8whr-avg2/global_cmvn --ctc_weight=0.5 --output_onnx_dir=/tmp

    Failed to import k2 and icefall. Notice that they are necessary for hlg_onebest and hlg_rescore
    Update ctc weight to 0.5
    /home/ricky/wenet_train_res/wenet_tools_git/wenet/utils/mask.py:213: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      max_len = max_len if max_len > 0 else lengths.max().item()
    /home/ricky/wenet_train_res/wenet_tools_git/wenet/transformer/embedding.py:96: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      assert offset + size < self.max_len
    /home/ricky/wenet_train_res/wenet_tools_git/wenet/squeezeformer/attention.py:187: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      if cache.size(0) > 0:
    /home/ricky/wenet_train_res/wenet_tools_git/wenet/squeezeformer/attention.py:119: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      if mask.size(2) > 0: # time2 > 0
    /home/ricky/wenet_train_res/wenet_tools_git/wenet/squeezeformer/convolution.py:140: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      if mask_pad.size(2) > 0: # time > 0
    /home/ricky/wenet_train_res/wenet_tools_git/wenet/squeezeformer/convolution.py:171: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      if mask_pad.size(2) > 0: # time > 0
    /home/ricky/wenet_train_res/wenet_tools_git/wenet/squeezeformer/subsampling.py:159: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      if L - T < 0:
    [0, 0, 0] [-1, -1, -1]
    2022-12-21 19:30:33.720608405 [W:onnxruntime:, constant_folding.cc:150 ApplyImpl] Unsupported output type of N11onnxruntime22SequenceTensorTypeBaseE. Can't constant fold SequenceEmpty node 'SequenceEmpty_2506'
    2022-12-21 19:30:33.768034651 [W:onnxruntime:, constant_folding.cc:150 ApplyImpl] Unsupported output type of N11onnxruntime22SequenceTensorTypeBaseE. Can't constant fold SequenceEmpty node 'SequenceEmpty_2506'
    2022-12-21 19:30:33.812875437 [W:onnxruntime:, constant_folding.cc:150 ApplyImpl] Unsupported output type of N11onnxruntime22SequenceTensorTypeBaseE. Can't constant fold SequenceEmpty node 'SequenceEmpty_2506'
    2022-12-21 19:30:35.151413519 [E:onnxruntime:, sequential_executor.cc:333 Execute] Non-zero status code returned while running MatMul node. Name:'MatMul_2528' Status Message: Not satisfied: K_ == right_shape[right_num_dims - 2] || transb && K_ == right_shape[right_num_dims - 1] matmul_helper.h:42 ComputeMatMul dimension mismatch
    Traceback (most recent call last):
      File "wenet/bin/export_onnx_gpu.py", line 574, in <module>
        onnx_config = export_enc_func(model, configs, args, logger, encoder_onnx_path)
      File "wenet/bin/export_onnx_gpu.py", line 331, in export_offline_encoder
        ort_outs = ort_session.run(None, ort_inputs)
      File "/home/ricky/anaconda3/envs/wenet/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 124, in run
        return self.sess.run(output_names, input_feed, run_options)
    onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running MatMul node. Name:'MatMul_2528' Status Message: Not satisfied: K_ == right_shape[right_num_dims - 2] || transb && K_ == right_shape[right_num_dims - 1] matmul_helper.h:42 ComputeMatMul dimension mismatch
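The final error is a shape check on a MatMul node. The condition quoted in the message can be restated in a few lines of Python to make clear what has to hold between the two operand shapes. This is only a restatement of the quoted condition, not onnxruntime code. `matmul_inner_dims_match` is a hypothetical name, and the shapes in the usage note are invented for illustration.

```python
def matmul_inner_dims_match(left_shape, right_shape, transb=False):
    """Restate the check quoted in the log.

    K is the last dimension of the left operand. It must equal the
    second-to-last dimension of the right operand, or, when the right
    operand is transposed (transb), its last dimension.
    """
    if len(right_shape) < 2:
        raise ValueError("this sketch expects a right operand with >= 2 dims")
    k = left_shape[-1]
    return k == right_shape[-2] or (transb and k == right_shape[-1])
```

For example, `matmul_inner_dims_match((1, 100, 256), (256, 512))` holds, while `matmul_inner_dims_match((1, 100, 80), (256, 512))` does not. One possible source of such a mismatch, given the TracerWarning lines above, is a size that was frozen as a constant during tracing and no longer matches the test input, though the log alone does not confirm this.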

    opened by rickychanhoyin 9
  • undefined value chunk_masks: in Squeezeformer

    undefined value chunk_masks: in Squeezeformer

    I just pulled the latest wenet code and tried out Squeezeformer. Training fails with the log below. Any suggestions would be helpful. Thanks.

    the number of model params: 135,220,418
    Traceback (most recent call last):
      File "wenet/bin/train.py", line 309, in <module>
        main()
      File "wenet/bin/train.py", line 205, in main
        script_model = torch.jit.script(model)
      File "/home/bsen/miniconda3/envs/wenet/lib/python3.8/site-packages/torch/jit/_script.py", line 1257, in script
        return torch.jit._recursive.create_script_module(
      File "/home/bsen/miniconda3/envs/wenet/lib/python3.8/site-packages/torch/jit/_recursive.py", line 451, in create_script_module
        return create_script_module_impl(nn_module, concrete_type, stubs_fn)
      File "/home/bsen/miniconda3/envs/wenet/lib/python3.8/site-packages/torch/jit/_recursive.py", line 517, in create_script_module_impl
        create_methods_and_properties_from_stubs(concrete_type, method_stubs, property_stubs)
      File "/home/bsen/miniconda3/envs/wenet/lib/python3.8/site-packages/torch/jit/_recursive.py", line 368, in create_methods_and_properties_from_stubs
        concrete_type._create_methods_and_properties(property_defs, property_rcbs, method_defs, method_rcbs, method_defaults)
      File "/home/bsen/miniconda3/envs/wenet/lib/python3.8/site-packages/torch/jit/_recursive.py", line 869, in compile_unbound_method
        create_methods_and_properties_from_stubs(concrete_type, (stub,), ())
      File "/home/bsen/miniconda3/envs/wenet/lib/python3.8/site-packages/torch/jit/_recursive.py", line 368, in create_methods_and_properties_from_stubs
        concrete_type._create_methods_and_properties(property_defs, property_rcbs, method_defs, method_rcbs, method_defaults)
    RuntimeError: undefined value chunk_masks:
      File "/home/bsen/wenet_new/examples/squeezformer/wenet/squeezeformer/encoder.py", line 379
            pos_emb = recover_pos_emb
            mask_pad = recover_mask_pad
            xs = xs.masked_fill(~chunk_masks[:, 0, :].unsqueeze(-1), 0.0)
                                 ~~~~~~~~~~~ <--- HERE

            factor = self.calculate_downsampling_factor(i)

    'SqueezeformerEncoder.forward_chunk' is being compiled since it was called from 'ASRModel.forward_encoder_chunk'
      File "/home/bsen/wenet_new/examples/squeezformer/wenet/transformer/asr_model.py", line 776
            """
            return self.encoder.forward_chunk(xs, offset, required_cache_size,
                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                              att_cache, cnn_cache)
                                              ~~~~~~~~~~~~~~~~~~~~ <--- HERE
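The log shows why this only surfaces at `torch.jit.script` time: the script compiler resolves every name in `forward_chunk` up front, whereas plain Python only complains when the offending line actually runs. The following is a plain-Python analogy of that difference, not TorchScript itself. It uses the standard `dis` module to list global names a function reads that are defined nowhere. `unresolved_globals` and `forward_chunk_sketch` are hypothetical names invented for this sketch.

```python
import builtins
import dis


def unresolved_globals(func):
    """List global names `func` reads that exist neither as module globals
    nor as builtins (found by inspecting bytecode, without calling func)."""
    known = set(func.__globals__) | set(vars(builtins))
    names = {
        ins.argval
        for ins in dis.get_instructions(func)
        if ins.opname == "LOAD_GLOBAL" and ins.argval not in known
    }
    return sorted(names)


def forward_chunk_sketch(xs, apply_mask=False):
    """Toy stand-in: one branch uses a name that is never defined."""
    if apply_mask:
        xs = xs * chunk_masks  # noqa: F821 (deliberately undefined)
    return xs
```

Here `forward_chunk_sketch(2.0)` runs without complaint because the faulty branch is skipped, yet `unresolved_globals(forward_chunk_sketch)` already reports `chunk_masks`. Eager execution hides such a bug until the branch is taken, while an ahead-of-time compile step reports it immediately.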
    
    opened by senbukai0203 2
Releases(v2.1.0)
  • v2.1.0(Nov 25, 2022)

    What's Changed

    • allow instantiate multiple models in #1580
    • do not pack libtorch.so in python binding to reduce wheel in #1573 and #1576
    • support iOS by @Ma-Dan in #1549 🛫
    • support HLG decode by @aluminumbox in #1521 💯
    • support squeezeformer by @yygle in #1519 👍
    • support XPU by @imoisture in #1455 🚀
    • and so on ...
  • v2.0.1(Jun 21, 2022)

  • v2.0.0(Jun 14, 2022)

    The following features are stable.

    • [x] U2++ framework for better accuracy
    • [x] n-gram + WFST language model solution
    • [x] Context biasing(hotword) solution
    • [x] Very large-scale data training support with UIO
    • [x] More dataset support, including WenetSpeech, GigaSpeech, HKUST and so on.
  • v1.0.0(Jun 21, 2021)

    Model

    • propose and support U2++, as the following graph shows, which uses both forward and backward information at training and decoding.

    image

    • support dynamic left chunk training and decoding, so the number of history chunks can be limited at decoding to save memory and computation.
    • support distributed training.
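To make the "left chunk" idea concrete, the sketch below builds a chunk-based attention mask in plain Python: each frame may attend to its own chunk and to a limited number of chunks before it. It is an illustration of the concept only, not WeNet's own implementation. The function name `chunk_attention_mask` and the convention that a negative `num_left_chunks` means "unlimited history" are assumptions of this sketch.

```python
def chunk_attention_mask(size, chunk_size, num_left_chunks=-1):
    """Return a size x size boolean mask; mask[i][j] is True when frame i
    may attend to frame j.

    Frame i sees its own chunk plus `num_left_chunks` chunks of history
    (all history when `num_left_chunks` is negative), and never sees
    frames from later chunks.
    """
    mask = []
    for i in range(size):
        chunk_index = i // chunk_size
        if num_left_chunks < 0:
            start = 0
        else:
            start = max((chunk_index - num_left_chunks) * chunk_size, 0)
        end = min((chunk_index + 1) * chunk_size, size)
        mask.append([start <= j < end for j in range(size)])
    return mask
```

With `size=6`, `chunk_size=2`, and `num_left_chunks=1`, the last frame attends to frames 2 through 5 only, so the cached history stays bounded no matter how long the utterance grows.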

    Dataset

    We now support the following five standard speech datasets, with results that are SOTA or close to SOTA.

    | Dataset     | Language | Data size (h) | Test set   | CER/WER | SOTA          |
    |-------------|----------|---------------|------------|---------|---------------|
    | aishell-1   | Chinese  | 200           | test       | 4.36    | 4.36(WeNet)   |
    | aishell-2   | Chinese  | 1000          | test_ios   | 5.39    | 5.39(WeNet)   |
    | multi-cn    | Chinese  | 2385          | /          | /       | /             |
    | librispeech | English  | 1000          | test_clean | 2.66    | 2.10(EspNet)  |
    | gigaspeech  | English  | 10000         | test       | 11.0    | 10.80(EspNet) |

    Productivity

    Here are some features related to productivity.

    • LM support. Here is the system design for LM support. WeNet can work with or without an LM, depending on your application and scenario.

    image

    • timestamp support.
    • n-best support.
    • endpoint support.
    • gRPC support
    • further refine x86 server and on-device android recipe.
  • v0.1.0(Feb 4, 2021)

Owner
Production First and Production Ready End-to-End Speech Toolkit