This is a PyTorch implementation of the paper: Self-Supervised Graph Transformer on Large-Scale Molecular Data.

Last update: Dec 25, 2022

Self-Supervised Graph Transformer on Large-Scale Molecular Data

Requirements

Python 3.6.8
For the other packages, please refer to the requirements.txt. To resolve PackageNotFoundError, please add the following channels before creating the environment.

   conda config --add channels pytorch
   conda config --add channels rdkit
   conda config --add channels conda-forge
   conda config --add channels rmg

Then run the following command to create the conda environment.

conda create --name chem --file requirements.txt

We also provide a Dockerfile to build the environment. Please refer to the Dockerfile for more details.

Pretrained Model Download

We provide the pretrained models used in the paper.

Usage

The whole framework supports pretraining, finetuning, prediction, fingerprint generation, and evaluation functions.

Pretraining

Pretrain the GTransformer model on unlabelled molecular data.

Data Preparation

We provide an input example of unlabelled molecular data at exampledata/pretrain/tryout.csv.

Semantic Motif Label Extraction

The semantic motif label is extracted by scripts/save_features.py with the feature generator fgtasklabel.

python scripts/save_features.py --data_path exampledata/pretrain/tryout.csv  \
                                --save_path exampledata/pretrain/tryout.npz   \
                                --features_generator fgtasklabel \
                                --restart

Contributing guide: you are welcome to register your own feature generator to add more semantic motifs for the graph-level prediction task. For more details, please refer to grover/data/task_labels.py.
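As a rough illustration of what registering a feature generator can look like, the sketch below uses a plain name-to-function registry. All names in it (FEATURE_GENERATORS, register_generator, generate, and the toy halogen_count generator) are hypothetical and are not taken from grover/data/task_labels.py; please check that file for the actual interface.

```python
from typing import Callable, Dict, List

# Hypothetical registry: maps a generator name to a function that turns
# a SMILES string into a list of numeric labels.
FEATURE_GENERATORS: Dict[str, Callable[[str], List[float]]] = {}


def register_generator(name: str):
    """Decorator that stores the wrapped function under `name`."""
    def decorator(func: Callable[[str], List[float]]):
        if name in FEATURE_GENERATORS:
            raise ValueError(f"generator {name!r} is already registered")
        FEATURE_GENERATORS[name] = func
        return func
    return decorator


@register_generator("halogen_count")
def halogen_count(smiles: str) -> List[float]:
    # Toy motif label: counts halogen symbols by plain substring matching.
    # A real generator would parse the molecule instead.
    return [float(smiles.count(sym)) for sym in ("F", "Cl", "Br", "I")]


def generate(name: str, smiles: str) -> List[float]:
    """Look up a registered generator by name and apply it."""
    return FEATURE_GENERATORS[name](smiles)
```

With such a registry, a command-line option like --features_generator only needs to carry the name of the generator, and new generators can be added without touching the calling script.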

Atom/Bond Contextual Property (Vocabulary)

The atom/bond Contextual Property (Vocabulary) is extracted by scripts/build_vocab.py.

python scripts/build_vocab.py --data_path exampledata/pretrain/tryout.csv  \
                             --vocab_save_folder exampledata/pretrain  \
                             --dataset_name tryout

The outputs of this script are vocabulary dicts of atoms and bonds, tryout_atom_vocab.pkl and tryout_bond_vocab.pkl, respectively. For more options for contextual property extraction, please refer to scripts/build_vocab.py.

Data Splitting

To accelerate data loading and reduce the memory cost in the multi-GPU pretraining scenario, the unlabelled molecular data need to be split into several parts using scripts/split_data.py.

Note: This step is required for single-gpu pretraining scenario as well.

python scripts/split_data.py --data_path exampledata/pretrain/tryout.csv  \
                             --features_path exampledata/pretrain/tryout.npz  \
                             --sample_per_file 100  \
                             --output_path exampledata/pretrain/tryout

For large datasets, it is better to set a larger sample_per_file.

The output dataset folder will look like this:

tryout
  |- feature # the semantic motif labels
  |- graph # the smiles
  |- summary.txt
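To make the effect of sample_per_file concrete, here is a minimal sketch of cutting a dataset into fixed-size parts. It is not the logic of scripts/split_data.py; the function chunk_rows and the placeholder rows are assumptions made for illustration only.

```python
from typing import Iterator, List, Sequence


def chunk_rows(rows: Sequence[str], sample_per_file: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `sample_per_file` rows."""
    if sample_per_file <= 0:
        raise ValueError("sample_per_file must be positive")
    for start in range(0, len(rows), sample_per_file):
        yield list(rows[start:start + sample_per_file])


rows = [f"mol_{i}" for i in range(250)]  # placeholder rows, not real SMILES
parts = list(chunk_rows(rows, sample_per_file=100))
print([len(p) for p in parts])  # three parts with 100, 100 and 50 rows
```

A larger sample_per_file produces fewer, bigger parts, which is why it suits large datasets.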

Running Pretraining on Single GPU

Note: There are more hyper-parameters which can be tuned during pretraining. Please refer to add_pretrain_args in util/parsing.py.

python main.py pretrain \
               --data_path exampledata/pretrain/tryout \
               --save_dir model/tryout \
               --atom_vocab_path exampledata/pretrain/tryout_atom_vocab.pkl \
               --bond_vocab_path exampledata/pretrain/tryout_bond_vocab.pkl \
               --batch_size 32 \
               --dropout 0.1 \
               --depth 5 \
               --num_attn_head 1 \
               --hidden_size 100 \
               --epochs 3 \
               --init_lr 0.0002 \
               --max_lr 0.0004 \
               --final_lr 0.0001 \
               --weight_decay 0.0000001 \
               --activation PReLU \
               --backbone gtrans \
               --embedding_output_type both

Running Pretraining on Multiple GPU

We have implemented distributed pretraining on multiple GPUs using horovod. To start the distributed pretraining, please refer to this link. To enable multi-GPU pretraining, add the --enable_multi_gpu flag to the command line above.

Training & Finetuning

The finetune dataset is organized as a .csv file. This file should contain a column named smiles.
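Before running the scripts, it can help to check that a dataset file really has the smiles column. The sketch below does this with the standard csv module; read_smiles and the two-row example (including the label column name) are illustrative assumptions, not part of the repository.

```python
import csv
import io
from typing import List


def read_smiles(csv_text: str, column: str = "smiles") -> List[str]:
    """Return the values of the `column` column, or raise if it is missing."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise ValueError(f"expected a column named {column!r}")
    return [row[column] for row in reader]


# Made-up two-row example; the label column name is illustrative.
example = "smiles,label\nCCO,1\nc1ccccc1,0\n"
print(read_smiles(example))
```

For a file on disk, the same function can be fed the result of open(path).read().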

(Optional) Molecular Feature Extraction

Given a labelled molecular dataset, it is possible to extract the additional molecular features in order to train & finetune the model from the existing pretrained model. The feature matrix is stored as .npz.

python scripts/save_features.py --data_path exampledata/finetune/bbbp.csv \
                                --save_path exampledata/finetune/bbbp.npz \
                                --features_generator rdkit_2d_normalized \
                                --restart

Finetuning with Existing Data

Given the labelled dataset and the molecular features, we can use the finetune function to finetune the pretrained model.

Note: There are more hyper-parameters which can be tuned during finetuning. Please refer to add_finetune_args in util/parsing.py.

python main.py finetune --data_path exampledata/finetune/bbbp.csv \
                        --features_path exampledata/finetune/bbbp.npz \
                        --save_dir model/finetune/bbbp/ \
                        --checkpoint_path model/tryout/model.ep3 \
                        --dataset_type classification \
                        --split_type scaffold_balanced \
                        --ensemble_size 1 \
                        --num_folds 3 \
                        --no_features_scaling \
                        --ffn_hidden_size 200 \
                        --batch_size 32 \
                        --epochs 10 \
                        --init_lr 0.00015

The final finetuned model is stored in model/finetune/bbbp and will be used in the subsequent prediction and evaluation tasks.

Prediction

Given the finetuned model, we can use it to make predictions for the target molecules. The final prediction is the average of the predictions of all sub models (num_folds * ensemble_size).
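The averaging step can be sketched as follows. This is a simplified illustration with made-up scores, not the repository's prediction code; average_predictions is a hypothetical helper.

```python
from typing import List


def average_predictions(sub_model_preds: List[List[float]]) -> List[float]:
    """Average per-molecule predictions over all sub models.

    sub_model_preds[i][j] is the prediction of sub model i for molecule j.
    """
    if not sub_model_preds:
        raise ValueError("need at least one sub model")
    n_models = len(sub_model_preds)
    n_mols = len(sub_model_preds[0])
    if any(len(p) != n_mols for p in sub_model_preds):
        raise ValueError("all sub models must predict the same molecules")
    return [sum(p[j] for p in sub_model_preds) / n_models for j in range(n_mols)]


# num_folds = 3 and ensemble_size = 1 give 3 sub models (made-up scores).
preds = [[0.9, 0.2], [0.7, 0.4], [0.8, 0.3]]
print(average_predictions(preds))
```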

(Optional) Molecular Feature Extraction

Note: If the finetuned model uses the molecular feature as input, we need to generate the molecular feature for the target molecules as well.

python scripts/save_features.py --data_path exampledata/finetune/bbbp.csv \
                                --save_path exampledata/finetune/bbbp.npz \
                                --features_generator rdkit_2d_normalized \
                                --restart

Prediction with Finetuned Model

python main.py predict --data_path exampledata/finetune/bbbp.csv \
               --features_path exampledata/finetune/bbbp.npz \
               --checkpoint_dir ./model \
               --no_features_scaling \
               --output data_pre.csv

Generating Fingerprints

The pretrained model can also be used to generate the molecular fingerprints.

Note: We provide three ways to generate the fingerprint.

atom: The mean pooling of atom embedding from node-view GTransformer and edge-view GTransformer.
bond: The mean pooling of bond embedding from node-view GTransformer and edge-view GTransformer.
both: The concatenation of atom and bond fingerprints. Moreover, the additional molecular features are appended to the output of GTransformer as well if provided.

python main.py fingerprint --data_path exampledata/finetune/bbbp.csv \
                           --features_path exampledata/finetune/bbbp.npz \
                           --checkpoint_path model/tryout/model.ep3 \
                           --fingerprint_source both \
                           --output model/fingerprint/fp.npz
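The three fingerprint sources can be pictured with the toy sketch below. It only shows mean pooling and concatenation on plain Python lists; it ignores the node-view and edge-view distinction and the optional molecular features, and the names mean_pool and fingerprint are hypothetical rather than taken from the code base.

```python
from typing import List

Vector = List[float]


def mean_pool(embeddings: List[Vector]) -> Vector:
    """Average a non-empty list of equal-length vectors element-wise."""
    if not embeddings:
        raise ValueError("cannot pool an empty list of embeddings")
    n = len(embeddings)
    return [sum(col) / n for col in zip(*embeddings)]


def fingerprint(atom_emb: List[Vector], bond_emb: List[Vector],
                source: str = "both") -> Vector:
    """Build a toy fingerprint from per-atom and per-bond embeddings."""
    if source == "atom":
        return mean_pool(atom_emb)
    if source == "bond":
        return mean_pool(bond_emb)
    if source == "both":
        # Concatenate the atom and bond fingerprints.
        return mean_pool(atom_emb) + mean_pool(bond_emb)
    raise ValueError(f"unknown source: {source}")


atoms = [[1.0, 3.0], [3.0, 5.0]]  # two atoms, embedding size 2
bonds = [[0.0, 2.0]]              # one bond, embedding size 2
print(fingerprint(atoms, bonds, "both"))
```

Note that with source set to both, the output is twice as long as with atom or bond alone.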

The Results

The classification datasets (metric: auc).

Model	BBBP	SIDER	ClinTox	BACE	Tox21	ToxCast
GROVER_base	0.936(0.008)	0.656(0.006)	0.925(0.013)	0.878(0.016)	0.819(0.020)	0.723(0.010)
GROVER_large	0.940(0.019)	0.658(0.023)	0.944(0.021)	0.894(0.028)	0.831(0.025)	0.737(0.010)

The regression datasets (metric: rmse; mae for QM7 and QM8).

Model	FreeSolv	ESOL	Lipo	QM7	QM8
GROVER_base	1.592(0.072)	0.888(0.116)	0.563(0.030)	72.5(5.9)	0.0172(0.002)
GROVER_large	1.544(0.397)	0.831(0.120)	0.560(0.035)	72.6(3.8)	0.0125(0.002)

The Reproducibility Issue

Due to the non-deterministic behavior of the function index_select_nd (see link), it is hard to exactly reproduce the finetuning process. Therefore, we provide the finetuned models for eleven datasets to guarantee the reproducibility of the experiments.

BBBP: BASE, LARGE
SIDER: BASE, LARGE
ClinTox: BASE, LARGE
BACE: BASE, LARGE
Tox21: BASE, LARGE
ToxCast: BASE, LARGE
FreeSolv: BASE, LARGE
ESOL: BASE, LARGE
Lipo: BASE, LARGE
QM7: BASE, LARGE
QM8: BASE, LARGE

We provide the eval function to reproduce the experiments. Suppose the finetuned model is placed in model/finetune/.

python main.py eval --data_path exampledata/finetune/bbbp.csv \
                    --features_path exampledata/finetune/bbbp.npz \
                    --checkpoint_dir model/finetune/bbbp \
                    --dataset_type classification \
                    --split_type scaffold_balanced \
                    --ensemble_size 1 \
                    --num_folds 3 \
                    --metric auc \
                    --no_features_scaling

Note: The default metric is rmse for regression tasks. For the QM7 and QM8 datasets, you need to set metric to mae to reproduce the results. For classification tasks, you need to set metric to auc.
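For readers unfamiliar with the three metric names, the following is a minimal pure-Python sketch of what they compute. It is not the evaluation code of this repository; the function names simply mirror the metric names, and the AUC is computed by direct pair counting, which is only practical for small inputs.

```python
import math
from typing import List


def rmse(y_true: List[float], y_pred: List[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: List[float], y_pred: List[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(labels: List[int], scores: List[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half a correct ranking.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


print(auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))  # 3 of 4 pairs ranked correctly: 0.75
```

Lower is better for rmse and mae, while higher is better for auc.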

Known Issues

Compared with the original implementation, we add a dense connection in the MessagePassing layer of GTransformer. If you do not want the dense connection in the MessagePassing layer, please change it at L256 of model/layers.py.

Roadmap

Implementation of GTransformer in DGL / PyG.
The improvement of self-supervised tasks, e.g. more semantic motifs.

Reference

@article{rong2020self,
  title={Self-Supervised Graph Transformer on Large-Scale Molecular Data},
  author={Rong, Yu and Bian, Yatao and Xu, Tingyang and Xie, Weiyang and Wei, Ying and Huang, Wenbing and Huang, Junzhou},
  journal={Advances in Neural Information Processing Systems},
  volume={33},
  year={2020}
}

Disclaimer

This is not an officially supported Tencent product.

Comments

Package not found error

Sorry I'm new to this. But can I install grover on my gpu-supported windows laptop?

I do add conda channels in advance but it still gives me package not found error.

conda create --name pretrain --file requirements.txt

Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  - torchvision==0.3.0=py36_cu9.0.176_1
  - python==3.6.8=h0371630_0
  - numpy-base==1.16.4=py36hde5b4d6_0
  - boost==1.68.0=py36h8619c78_1001
  - numpy==1.16.4=py36h7e9f1db_0
  - rdkit==2019.03.4.0=py36hc20afe1_1
  - pandas==0.25.0=py36hb3f55d8_0
  - pytorch==1.1.0=py3.6_cuda9.0.176_cudnn7.5.1_0
  - readline==7.0=h7b6447c_5
  - boost-cpp==1.68.0=h11c811c_1000
  - scikit-learn==0.21.2=py36hcdab131_1
  - scipy==1.3.0=py36h921218d_1

Current channels:

  - https://conda.anaconda.org/rmg/win-64
  - https://conda.anaconda.org/rmg/noarch
  - https://conda.anaconda.org/conda-forge/win-64
  - https://conda.anaconda.org/conda-forge/noarch
  - https://conda.anaconda.org/rdkit/win-64
  - https://conda.anaconda.org/rdkit/noarch
  - https://conda.anaconda.org/pytorch/win-64
  - https://conda.anaconda.org/pytorch/noarch
  - http://conda.anaconda.org/gurobi/win-64
  - http://conda.anaconda.org/gurobi/noarch
  - https://repo.anaconda.com/pkgs/main/win-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/win-64
  - https://repo.anaconda.com/pkgs/r/noarch
  - https://repo.anaconda.com/pkgs/msys2/win-64
  - https://repo.anaconda.com/pkgs/msys2/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

opened by shangzhu-cmu 0

The incorrect implementation of multi-head attention!

Assume the number of attention heads is 4. I find that self-attention is computed between different heads rather than between different atoms. The attention scores have shape (num_atoms, 4, 4, 4), but should have shape (batch_size, max_num_atoms, max_num_atoms). The flattened atom features (num_atoms, node_fdim) should first be converted into padded batch data (batch_size, max_num_atoms, node_fdim).

Below I extract only the main code to illustrate why the self-attention is computed between different heads rather than between different atoms.

For MTBlock class:

# in the __init__ function
self.attn = MultiHeadedAttention(h=num_attn_head,
                                 d_model=self.hidden_size,
                                 bias=bias,
                                 dropout=dropout)
for _ in range(num_attn_head):
    self.heads.append(Head(args, hidden_size=hidden_size, atom_messages=atom_messages))

# in the forward function
for head in self.heads:
    q, k, v = head(f_atoms, f_bonds, a2b, a2a, b2a, b2revb)
    queries.append(q.unsqueeze(1))
    keys.append(k.unsqueeze(1))
    values.append(v.unsqueeze(1))

queries = torch.cat(queries, dim=1)  # (num_atoms, 4, hidden_size)
keys = torch.cat(keys, dim=1)  # (num_atoms, 4, hidden_size)
values = torch.cat(values, dim=1)  # (num_atoms, 4, hidden_size)

x_out = self.attn(queries, keys, values, past_key_value)  # multi-headed attention

Now, the queries, keys and values will be fed into multi-head attention to get new results. For MultiHeadedAttention class:

# in the __init__ function
self.attention = Attention()
self.d_k = d_model // h  # equals hidden_size // num_attn_head
self.linear_layers = nn.ModuleList([nn.Linear(d_model, d_model) for _ in range(3)])  # why 3: query, key, value

# in the forward function
# 1) Do all the linear projections in batch from d_model => h x d_k
query, key, value = [l(x).view(batch_size, -1, self.h, self.d_k).transpose(1, 2)
                     for l, x in zip(self.linear_layers, (query, key, value))]  # q, k, v shapes will be (num_atoms, 4, 4, d_k)
x, _ = self.attention(query, key, value, mask=mask, dropout=self.dropout)

For the Attention class:

import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class Attention(nn.Module):
    """
    Compute 'Scaled Dot Product Self-Attention'
    """

    def forward(self, query, key, value, mask=None, dropout=None):
        scores = torch.matmul(query, key.transpose(-2, -1)) \
                 / math.sqrt(query.size(-1))  # scores shape is (num_atoms, 4, 4, 4)

        if mask is not None:
            scores = scores.masked_fill(mask == 0, -1e9)

        p_attn = F.softmax(scores, dim=-1)

        # the new output is (num_atoms, 4, 4, d_k), which will be processed into (num_atoms, 4, hidden_size)
        return torch.matmul(p_attn, value), p_attn

As you can see, the scores shape is (num_atoms, 4, 4, 4) which is computed between different heads rather than different atoms. That is, each atom's representation is the combination of different heads' information which is meaningless.
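The shape argument above can be checked with plain arithmetic, without running the model. The sketch below tracks the shapes through the view, transpose and matmul calls quoted above. It assumes that batch_size in the forward function is the size of the first input dimension, which the excerpt does not show, and attention_shapes is a hypothetical helper, not a function of the repository.

```python
from typing import Dict, Tuple


def attention_shapes(qkv_shape: Tuple[int, int, int], h: int) -> Dict[str, Tuple[int, ...]]:
    """Track tensor shapes through multi-head attention.

    qkv_shape is the (dim0, dim1, d_model) shape of query, key and value.
    Assumption: batch_size is taken from dim0.
    """
    batch, seq, d_model = qkv_shape
    if d_model % h != 0:
        raise ValueError("d_model must be divisible by h")
    d_k = d_model // h
    # view(batch, -1, h, d_k): (batch, seq, d_model) -> (batch, seq, h, d_k)
    # transpose(1, 2):         (batch, seq, h, d_k)  -> (batch, h, seq, d_k)
    projected = (batch, h, seq, d_k)
    # query @ key^T: (batch, h, seq, d_k) @ (batch, h, d_k, seq) -> (batch, h, seq, seq)
    scores = (batch, h, seq, seq)
    return {"projected": projected, "scores": scores}


# Input as built in MTBlock: 10 atoms, 4 stacked heads, hidden size 100.
print(attention_shapes((10, 4, 100), h=4)["scores"])  # (10, 4, 4, 4)
# Padded batch as proposed: 2 molecules, at most 7 atoms each.
print(attention_shapes((2, 7, 100), h=4)["scores"])   # (2, 4, 7, 7)
```

In the first call the two trailing dimensions of the scores both have size 4, the number of stacked heads, while in the second they have size 7, the number of atoms per molecule.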

opened by ZhuYun97 0

I couldn't install horovod in grover conda environment.

I want to use grover in a multi-GPU environment, so I tried to install horovod in my grover conda env, but I got this error message.

Import Error

Traceback (most recent call last):
  File "", line 1, in <module>
  File "/home/eung0/miniconda3/envs/grover_1/lib/python3.6/site-packages/horovod/torch/__init__.py", line 44, in <module>
    from horovod.torch.mpi_ops import allreduce, allreduce_async, allreduce_, allreduce_async_
  File "/home/eung0/miniconda3/envs/grover_1/lib/python3.6/site-packages/horovod/torch/mpi_ops.py", line 31, in <module>
    from horovod.torch import mpi_lib_v2 as mpi_lib
ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by /home/eung0/miniconda3/envs/grover_1/lib/python3.6/site-packages/horovod/torch/mpi_lib_v2.cpython-36m-x86_64-linux-gnu.so)

command

HOROVOD_WITH_PYTORCH=1 HOROVOD_WITHOUT_TENSORFLOW=1 HOROVOD_GPU_OPERATIONS=NCCL pip install --no-cache-dir --force-reinstall horovod[pytorch]==0.19.5

grover conda env

_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_kmp_llvm conda-forge
blas 1.0 mkl conda-forge
boost 1.68.0 py36h8619c78_1001 conda-forge
boost-cpp 1.68.0 h11c811c_1000 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
ca-certificates 2020.6.20 hecda079_0 rmg
cairo 1.16.0 h18b612c_1001 conda-forge
descriptastorus 2.2.0 py_0 rmg
expat 2.4.8 h27087fc_0 conda-forge
fontconfig 2.14.0 h8e229c2_0 conda-forge
freetype 2.10.4 h0708190_1 conda-forge
gettext 0.19.8.1 hf34092f_1004 conda-forge
glib 2.66.3 h58526e2_0 conda-forge
icu 58.2 hf484d3e_1000 conda-forge
joblib 1.1.0 pyhd8ed1ab_0 conda-forge
jpeg 9e h166bdaf_1 conda-forge
lcms2 2.12 hddcbb42_0 conda-forge
lerc 3.0 h9c3ff4c_0 conda-forge
libblas 3.9.0 8_mkl conda-forge
libboost 1.67.0 h46d08c1_4
libcblas 3.9.0 8_mkl conda-forge
libdeflate 1.10 h7f98852_0 conda-forge
libffi 3.2.1 he1b5a44_1007 conda-forge
libgcc-ng 12.1.0 h8d9b700_16 conda-forge
libgfortran-ng 7.5.0 h14aa051_20 conda-forge
libgfortran4 7.5.0 h14aa051_20 conda-forge
libglib 2.66.3 hbe7bbb4_0 conda-forge
libiconv 1.16 h516909a_0 conda-forge
liblapack 3.9.0 8_mkl conda-forge
libpng 1.6.37 h21135ba_2 conda-forge
libstdcxx-ng 12.1.0 ha89aaad_16 conda-forge
libtiff 4.3.0 h0fcbabc_4 conda-forge
libuuid 2.32.1 h7f98852_1000 conda-forge
libwebp-base 1.2.2 h7f98852_1 conda-forge
libxcb 1.13 h7f98852_1004 conda-forge
libzlib 1.2.12 h166bdaf_0 conda-forge
llvm-openmp 14.0.4 he0ac6c6_0 conda-forge
lz4-c 1.9.3 h9c3ff4c_1 conda-forge
mkl 2020.4 h726a3e6_304 conda-forge
mkl_fft 1.0.10 py36_0 conda-forge
mkl_random 1.1.1 py36h830a2c2_0 conda-forge
ncurses 6.3 h27087fc_1 conda-forge
numpy 1.16.4 py36h7e9f1db_0
numpy-base 1.16.4 py36hde5b4d6_0
olefile 0.46 pyh9f0ad1d_1 conda-forge
openjpeg 2.4.0 hb52868f_1 conda-forge
openssl 1.1.1g h516909a_0 rmg
pandas 0.25.0 py36hb3f55d8_0 conda-forge
pcre 8.45 h9c3ff4c_0 conda-forge
pillow 8.3.2 py36h676a545_0 conda-forge
pip 21.3.1 pyhd8ed1ab_0 conda-forge
pixman 0.38.0 h516909a_1003 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
py-boost 1.67.0 py36h04863e7_4
python 3.6.8 h0371630_0
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python_abi 3.6 2_cp36m conda-forge
pytz 2022.1 pyhd8ed1ab_0 conda-forge
rdkit 2020.03.3.0 py36hc20afe1_1 rmg
readline 7.0 h7b6447c_5
scikit-learn 0.21.2 py36hcdab131_1 conda-forge
scipy 1.3.0 py36h921218d_1 conda-forge
setuptools 58.0.4 py36h5fab9bb_2 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
sqlite 3.28.0 h8b20d00_0 conda-forge
tk 8.6.12 h27826a3_0 conda-forge
torch 1.7.1+cu101 pypi_0 pypi
torchaudio 0.7.2 pypi_0 pypi
torchvision 0.8.2+cu101 pypi_0 pypi
tqdm 4.32.1 py_0 conda-forge
typing 3.6.4 py36_0 conda-forge
typing_extensions 3.10.0.2 pyha770c72_0 conda-forge
wheel 0.37.1 pyhd8ed1ab_0 conda-forge
xorg-kbproto 1.0.7 h7f98852_1002 conda-forge
xorg-libice 1.0.10 h7f98852_0 conda-forge
xorg-libsm 1.2.3 hd9c2040_1000 conda-forge
xorg-libx11 1.7.2 h7f98852_0 conda-forge
xorg-libxau 1.0.9 h7f98852_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xorg-libxext 1.3.4 h7f98852_1 conda-forge
xorg-libxrender 0.9.10 h7f98852_1003 conda-forge
xorg-renderproto 0.11.1 h7f98852_1002 conda-forge
xorg-xextproto 7.3.0 h7f98852_1002 conda-forge
xorg-xproto 7.0.31 h7f98852_1007 conda-forge
xz 5.2.5 h516909a_1 conda-forge
zlib 1.2.12 h166bdaf_0 conda-forge
zstd 1.5.2 h8a70e8d_1 conda-forge

which version should I install?
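The ImportError above says that the system libstdc++.so.6 does not provide the GLIBCXX_3.4.30 symbol version that the compiled horovod extension asks for. One way to see which versions a given library file provides is to scan its bytes for GLIBCXX tags, as in the sketch below. The helper names are hypothetical, the byte string is a stand-in for a real library file, and this only diagnoses the mismatch; it does not say which horovod version to install.

```python
import re
from typing import List, Tuple


def glibcxx_versions(binary: bytes) -> List[Tuple[int, ...]]:
    """Return the sorted, de-duplicated GLIBCXX version tags found in `binary`."""
    tags = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)*)", binary))
    return sorted(tuple(int(x) for x in t.decode().split(".")) for t in tags)


def provides(binary: bytes, required: Tuple[int, ...]) -> bool:
    """Check whether the exact version tag `required` is present."""
    return required in glibcxx_versions(binary)


# Stand-in bytes. In practice, read the library file named in the error,
# e.g. open("/usr/lib/x86_64-linux-gnu/libstdc++.so.6", "rb").read()
fake_lib = b"\x00GLIBCXX_3.4\x00GLIBCXX_3.4.28\x00GLIBCXX_3.4.29\x00"
print(provides(fake_lib, (3, 4, 30)))  # False: 3.4.30 is missing
```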

opened by eungyeonglee-dev 0
File call issues

When I use

python scripts/build_vocab.py --data_path exampledata/pretrain/tryout.csv \
                              --vocab_save_folder exampledata/pretrain \
                              --dataset_name tryout

an error is reported:

Traceback (most recent call last):
  File "scripts/build_vocab.py", line 6, in <module>
    from grover.data.torchvocab import MolVocab
ModuleNotFoundError: No module named 'grover'

opened by uestc-lese 1

How to install env with python3.7, cuda 11.1

install.sh

# Installing env with python3.7

conda create --name envgrover37 python=3.7
conda activate envgrover37

pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html

(cuda version: 11.1)

conda install -c conda-forge boost
conda install -c conda-forge boost-cpp
conda install -c rmg descriptastorus
conda install -c acellera rdkit=2019.03.4.0
conda install -c conda-forge tqdm
conda install -c anaconda typing
conda install -c anaconda scipy=1.3.0
conda install -c anaconda scikit-learn=0.21.2

how to fix 'module grover not found' error

Add

import sys
sys.path.append('/user/grover/')

to build_vocab.py, save_features.py and split_data.py.
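A variant that avoids the hard-coded '/user/grover/' path is to derive the repository root from the location of the script itself. This is only a sketch: add_repo_root_to_path is a hypothetical helper, and it assumes the scripts live one folder below the repository root, as the scripts/ paths above suggest.

```python
import sys
from pathlib import Path


def add_repo_root_to_path(script_file: str, levels_up: int = 1) -> str:
    """Put an ancestor folder of the script at the front of sys.path.

    levels_up=0 is the script's own folder, levels_up=1 is its parent
    (the repository root for a script stored in scripts/).
    """
    root = str(Path(script_file).resolve().parents[levels_up])
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


# At the top of scripts/build_vocab.py one would call:
#     add_repo_root_to_path(__file__)
# before the line `from grover.data.torchvocab import MolVocab`.
```

The call has to run before any import from the grover package, otherwise the ModuleNotFoundError is raised first.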

opened by funihang 0

Releases(1.0.0)

1.0.0(Jan 18, 2021)

This is the original implementation.
