CorNet: Correlation Networks for Extreme Multi-label Text Classification

Prerequisites

  • python==3.6.3
  • pytorch==1.2.0
  • torchgpipe==0.0.5
  • click==7.0
  • ruamel.yaml==0.16.5
  • numpy==1.16.2
  • scipy==1.2.1
  • scikit-learn==0.20.3
  • gensim==3.7.2
  • nltk==3.2.4
  • tqdm==4.31.1
  • joblib==0.13.2
  • logzero==1.5.0
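
The pins above can be turned into `pip` requirement lines. The following is a minimal sketch, not part of the repository: the helper name `requirement_lines` is hypothetical, and it assumes that the `pytorch` entry corresponds to the PyPI distribution named `torch`. The Python interpreter itself (3.6.3) is not a pip package and has to be installed separately.

```python
# Pins copied from the Prerequisites list (the python==3.6.3 entry is
# omitted because the interpreter is not installed through pip).
PINS = {
    "pytorch": "1.2.0",
    "torchgpipe": "0.0.5",
    "click": "7.0",
    "ruamel.yaml": "0.16.5",
    "numpy": "1.16.2",
    "scipy": "1.2.1",
    "scikit-learn": "0.20.3",
    "gensim": "3.7.2",
    "nltk": "3.2.4",
    "tqdm": "4.31.1",
    "joblib": "0.13.2",
    "logzero": "1.5.0",
}

# Assumption: the list says "pytorch", but the distribution on PyPI is "torch".
PYPI_NAMES = {"pytorch": "torch"}


def requirement_lines(pins):
    """Return one 'name==version' line per pinned package."""
    return [
        "{}=={}".format(PYPI_NAMES.get(name, name), version)
        for name, version in pins.items()
    ]


if __name__ == "__main__":
    # Redirect this output to a file to get a requirements file.
    print("\n".join(requirement_lines(PINS)))
```

Saving the printed lines to a file (for example `requirements.txt`, a name chosen here for illustration) lets `pip install -r` install the same versions in a fresh environment.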

Datasets

Pretrained Word Embeddings in gensim format

Run

Preprocess (the EUR-Lex dataset is already tokenized)

./scripts/preprocess_eurlex.sh

or (the other datasets need to be tokenized using NLTK)

./scripts/preprocess_others.sh

Train and evaluate

./scripts/run_models.sh
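
The steps above can also be chained so that a failed step stops the run. This is only a sketch under stated assumptions: the script paths come from this README, while `run_pipeline` and `STEPS` are hypothetical names, and the scripts are assumed to be executable from the repository root. Drop whichever preprocessing script does not apply to your datasets.

```python
import subprocess

# Script paths as listed in the Run section; the ordering and the helper
# below are illustrative, not part of the repository.
STEPS = [
    "./scripts/preprocess_eurlex.sh",
    "./scripts/preprocess_others.sh",
    "./scripts/run_models.sh",
]


def run_pipeline(steps, runner=subprocess.run):
    """Run each script in order and stop at the first non-zero exit code."""
    completed = []
    for step in steps:
        result = runner([step])
        if result.returncode != 0:
            raise RuntimeError(
                "{} failed with exit code {}".format(step, result.returncode)
            )
        completed.append(step)
    return completed


# From the repository root:
# run_pipeline(STEPS)
```

The `runner` parameter exists only so the helper can be exercised without executing the real scripts.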

Baselines

The code for the baseline models is adapted from the following repositories: XML-CNN, BERT, MeSHProbeNet, and AttentionXML.

Owner
Guangxu Xun