A repo for materials relating to the tutorial of CS-332 NLP

Last update: Feb 15, 2022

Overview

CS-332-NLP

Tutorial 1:
- Introduction
- Corpus
- Regular expressions
- Tokenization

Tutorial 2:
- Normalization
- Parsing
- Morphemes
- Stemming
- Lemmatization
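
As a rough illustration of a few of the topics listed above (regular expressions, tokenization, normalization and stemming), here is a minimal sketch using only Python's standard library. It is not taken from the tutorial materials: the pattern `TOKEN_PATTERN`, the helper names and the suffix list are assumptions made for this example, and the stemmer is deliberately naive.

```python
import re

# Hypothetical token pattern (an assumption for this sketch):
#   1. a word, optionally with one internal apostrophe part (e.g. "aren't")
#   2. an integer or decimal number
#   3. any single character that is neither a word character nor whitespace
TOKEN_PATTERN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?|\d+(?:\.\d+)?|[^\w\s]")

# Hypothetical suffix list, checked in order (longest first).
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Split raw text into tokens with a single regular expression."""
    return TOKEN_PATTERN.findall(text)


def normalize(tokens):
    """Case-fold every token; a very small form of normalization."""
    return [token.lower() for token in tokens]


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


if __name__ == "__main__":
    tokens = normalize(tokenize("The cats aren't running, 3.5 times!"))
    print(tokens)
    print([naive_stem(token) for token in tokens])
```

Note that this stemmer over-strips some words ("running" becomes "runn", "times" becomes "tim"), which is one reason rule-based stemmers such as Porter's use many more rules, and why lemmatization relies on a vocabulary instead.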

Acknowledgements

- Jurafsky, D., & Martin, J. H. Speech and Language Processing (2nd and 3rd editions).
- Marcus, M. P., Santorini, B., & Marcinkiewicz, M. A. (1994). Building a large annotated corpus of English: The Penn Treebank. Using Large Corpora, 273.
- http://su.diva-portal.org/smash/record.jsf?pid=diva2%3A686162&dswid=9114

Owner

Alok Singh
