PyTorch implementation of Integrating Tree Path in Transformer for Code Representation

Last update: Dec 23, 2022

Overview

This is the official PyTorch implementation of the approaches proposed in:

Han Peng, Ge Li, Wenhan Wang, Yunfei Zhao, Zhi Jin, “Integrating Tree Path in Transformer for Code Representation”

which appeared at NeurIPS 2021 [Paper Link] [Poster] [Slides].

In this paper, we investigate the interaction between absolute and relative path encodings and propose a novel code representation model, TPTrans, together with its variants. These models introduce path encoding as an inductive bias in the attention module of the Transformer, enabling it to capture the structure of source code.
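To make the two kinds of paths concrete, here is a minimal sketch on a toy syntax tree. It assumes that an absolute path runs from the root to a node and that a relative path runs between two nodes through their lowest common ancestor. The tree, the node names, and both helper functions are hypothetical illustrations, not code from this repository.

```python
from typing import Dict, List, Optional

# Toy syntax tree stored as child -> parent links (hypothetical node names).
PARENT: Dict[str, Optional[str]] = {
    "module": None,
    "assignment": "module",
    "identifier_x": "assignment",
    "binary_op": "assignment",
    "identifier_a": "binary_op",
    "identifier_b": "binary_op",
}


def absolute_path(node: str) -> List[str]:
    """Return the nodes from the root down to `node`."""
    path: List[str] = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return path[::-1]


def relative_path(src: str, dst: str) -> List[str]:
    """Return the nodes from `src` up to the lowest common ancestor, then down to `dst`."""
    down = absolute_path(dst)          # root ... dst
    ancestors_of_dst = set(down)
    climb: List[str] = []
    for node in reversed(absolute_path(src)):  # src ... root
        climb.append(node)
        if node in ancestors_of_dst:   # reached the lowest common ancestor
            break
    lca = climb[-1]
    return climb + down[down.index(lca) + 1:]


if __name__ == "__main__":
    print(absolute_path("identifier_a"))
    print(relative_path("identifier_x", "identifier_a"))
```

How such paths are turned into vectors and fed to the attention module is described in the paper and is not shown here.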

Please cite our paper if you use the model, experimental results, or our code in your own work.

1.1 Raw data

To run experiments with TPTrans and its variants, first create the datasets from the raw code snippets of the CodeSearchNet (CSN) dataset. Download and unzip the raw jsonl data of CSN into the raw_data dir as follows:

├── raw_data        
│   ├── python         
│   │   ├── train    
│   │   │   ├── XXXX.jsonl...
│   │   ├── test    
│   │   ├── valid   
│   ├── ruby          
│   ├── go        
│   ├── javascript
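As a quick sanity check of the layout above, the following sketch iterates over the unzipped jsonl files of one language and split. The function name iter_csn_records is hypothetical and not part of this repository. The sketch only assumes that each line of a .jsonl file is one JSON object, as in the CodeSearchNet release.

```python
import json
from pathlib import Path
from typing import Dict, Iterator


def iter_csn_records(raw_root: str, language: str, split: str) -> Iterator[Dict]:
    """Yield one dict per non-empty line of every .jsonl file under raw_root/language/split."""
    split_dir = Path(raw_root) / language / split
    for jsonl_path in sorted(split_dir.glob("*.jsonl")):
        with jsonl_path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:  # skip blank lines
                    yield json.loads(line)


if __name__ == "__main__":
    # Count the training records for Python, assuming raw_data is in the working dir.
    print(sum(1 for _ in iter_csn_records("raw_data", "python", "train")))
```

If the count is zero, the files are probably still compressed or sit one directory level too deep.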

1.2 Tree-Sitter

Tree-sitter is an open-source parser that supports multiple programming languages. Please install it, then download the grammar files for the four programming languages into the vendor dir as follows:

├── vendor        
│   ├── tree-sitter-python  (from https://github.com/tree-sitter/tree-sitter-python)         
│   ├── tree-sitter-javascript  (from https://github.com/tree-sitter/tree-sitter-javascript)     
│   ├── tree-sitter-go  (from https://github.com/tree-sitter/tree-sitter-go)
│   ├── tree-sitter-ruby  (from https://github.com/tree-sitter/tree-sitter-ruby)

After that, run multi_language_parse.py in the parser dir to parse the raw code snippets into the data dir.
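Before parsing, it can help to confirm that all four grammar directories are in place. The sketch below is a hypothetical helper, not part of this repository. It only checks that the directories named above exist under vendor, and does not verify their contents.

```python
from pathlib import Path
from typing import List

# Grammar directories expected under vendor, as listed in this README.
GRAMMARS = [
    "tree-sitter-python",
    "tree-sitter-javascript",
    "tree-sitter-go",
    "tree-sitter-ruby",
]


def missing_grammars(vendor_dir: str = "vendor") -> List[str]:
    """Return the expected grammar directories that are absent from vendor_dir."""
    root = Path(vendor_dir)
    return [name for name in GRAMMARS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_grammars()
    if missing:
        print("Missing grammars:", ", ".join(missing))
    else:
        print("All four grammars found.")
```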

1.3 Training

After preprocessing, run main.py to train the model.

To run TPTrans, specify relation_path=True and absolute_path=False.

To run TPTrans-\alpha, specify relation_path=True and absolute_path=True.
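The two settings can be summarized as a small lookup table. This is only a sketch: the names VARIANT_FLAGS and flags_for are hypothetical, and how the values reach main.py (command-line options or edited defaults) is not stated here, so check the inline comments in the code.

```python
from typing import Dict

# Flag values per model variant, taken from the two sentences above.
VARIANT_FLAGS: Dict[str, Dict[str, bool]] = {
    "TPTrans": {"relation_path": True, "absolute_path": False},
    "TPTrans-alpha": {"relation_path": True, "absolute_path": True},
}


def flags_for(variant: str) -> Dict[str, bool]:
    """Return a copy of the flag settings for the given variant name."""
    if variant not in VARIANT_FLAGS:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {sorted(VARIANT_FLAGS)}"
        )
    return dict(VARIANT_FLAGS[variant])


if __name__ == "__main__":
    print(flags_for("TPTrans"))
```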

For other options, please refer to the inline comments in the code for details.

Contact

If you have any questions, please contact me via email ([email protected]) or open an issue on GitHub.

Owner

Han Peng
