NLP Algorithms

Description

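This repository covers five mainstream NLP tasks (text classification, sequence labeling, relation extraction, text matching, and text similarity matching) and includes 22 related models and algorithms.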

Framework Structure

File Structure

all_models
├── Base_line
│   ├── __init__.py
│   ├── base_data_process.py
│   ├── base_evaluation.py
│   └── single_tokenizer.py
│
├── Texts_Classification
│   ├── 机器学习_文本分类
│   ├── fasttext_文本分类
│   ├── textcnn_文本分类
│   ├── lstm_文本分类
│   ├── han_文本分类
│   ├── bert_文本分类
│   └── 数据准备
│
├── Sequence_Labeling
│   ├── crf_suite
│   ├── lstm_crf
│   ├── bert_lstm_crf
│   ├── bert_mrc
│   └── 数据准备
│
├── Relation_Extraction
│   ├── CasRel
│   ├── multihead_joint_extraction
│   ├── R-bert_relation_recognition
│   ├── attention_lstm_relation_recognition
│   ├── attention_lstm_relation_recognition_for_single_sentence
│   ├── tagging_scheme_joint_extraction
│   ├── entity_extraction_bert_lstm_crf
│   └── 数据准备
│
├── Text_Matching
│   ├── DSSM
│   ├── ARC-II
│   ├── ESIM
│   ├── bert
│   └── 数据准备
│
├── Text_Similarity_Matching
│   ├── tfidf
│   ├── BM25
│   ├── pysparnn
│   └── commodity_title.txt
│
├── 记录
├── .gitignore
└── README.md
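
As a rough illustration of the kind of task the `Text_Similarity_Matching/tfidf` directory targets, the following is a minimal sketch of TF-IDF cosine matching using only the Python standard library. It is not taken from the repository: the function names (`tfidf_vectors`, `cosine`, `most_similar`), the whitespace tokenization, and the sample titles are all assumptions made for illustration. Chinese text would need a word segmenter in place of `str.split`.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            # Term frequency times a smoothed IDF, so a term that appears in
            # every document still keeps a non-zero weight.
            term: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, titles):
    """Return (best_title, score) for the title closest to the query.

    Assumption: whitespace tokenization of lower-cased text is good enough
    for the example data.
    """
    docs = [title.lower().split() for title in titles]
    query_tokens = query.lower().split()
    # Fit the vectors on the titles plus the query so they share one vocabulary.
    vectors = tfidf_vectors(docs + [query_tokens])
    query_vec = vectors[-1]
    scores = [cosine(query_vec, vec) for vec in vectors[:-1]]
    best = max(range(len(titles)), key=scores.__getitem__)
    return titles[best], scores[best]


if __name__ == "__main__":
    # Hypothetical product titles; the contents of commodity_title.txt are not shown here.
    sample_titles = [
        "red cotton shirt",
        "blue denim jeans",
        "wireless bluetooth headphones",
    ]
    print(most_similar("cotton shirt", sample_titles))
```

A sparse dictionary representation keeps the sketch dependency-free; for larger title collections a vectorized library implementation would likely be a better fit.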


Owner

zuxinqi
