Implementation of the TF-IDF algorithm to measure document similarity using cosine similarity

Last update: Aug 25, 2022

Overview

NLP learning

Trying to learn NLP to use in my projects!

Table of Contents

About The Project
- Built With
Getting Started
- Requirements
- Run
Usage
License
Contact

About The Project

There are many ways and algorithms for machines to understand language, but first of all we should convert our words to vectors, because we need to do some calculations on them.

Here are some NLP topics that I have learned so far:

Using classic AI algorithms like Naive Bayes
Using TF-IDF to convert words to vectors
Using word2vec to convert words to vectors

Of course, the list above is not complete, but we will expand it in the future.
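As a rough illustration of the TF-IDF item above, here is a minimal sketch of turning documents into vectors with NumPy. It is not necessarily how main.py does it: the function name `tfidf_matrix`, the whitespace tokenizer, and the unsmoothed `log(N / df)` weighting are all assumptions made for this example.

```python
import numpy as np


def tfidf_matrix(docs):
    """Return (vocab, matrix); matrix[i, j] is the TF-IDF weight of term j in document i.

    Hypothetical helper for illustration only.
    """
    # Assumption: lowercase + whitespace split is a good enough tokenizer here.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {term: j for j, term in enumerate(vocab)}

    # Raw term counts per document.
    counts = np.zeros((len(docs), len(vocab)))
    for i, toks in enumerate(tokenized):
        for tok in toks:
            counts[i, index[tok]] += 1

    # Term frequency: count divided by document length (guarding empty documents).
    lengths = counts.sum(axis=1, keepdims=True)
    tf = counts / np.maximum(lengths, 1)

    # Inverse document frequency: log(N / df), with no smoothing.
    # Every vocabulary term occurs in at least one document, so df >= 1.
    df = (counts > 0).sum(axis=0)
    idf = np.log(len(docs) / df)

    return vocab, tf * idf


vocab, matrix = tfidf_matrix(["the cat sat", "the dog sat", "the cat ran"])
print(vocab)
print(matrix.round(3))
```

With this weighting, a word that occurs in every document (such as "the" in the sample) gets a weight of zero, while words that are rare across documents get larger weights.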


Built With

This section should list any major frameworks, libraries, and tools used to implement this project.


Getting Started

This is an example of how you may give instructions on setting up your project locally. To get a local copy up and running follow these simple example steps.

Requirements

We use NumPy for its array and math functions.

numpy
```
pip install numpy
```

Run

$ python3 main.py


Usage

With the TF-IDF algorithm implemented, you can measure the similarity between different documents, so you can use it in chatbots and search engines.

For more examples, please refer to the Documentation
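To make the search-engine use case concrete, the following is a small sketch of ranking documents against a query by cosine similarity. It assumes the documents and the query have already been converted to vectors of equal length (for example TF-IDF vectors); `cosine_similarity` and `rank_documents` are hypothetical names for this example, not functions taken from the project.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 0.0
    return float(np.dot(a, b) / denom)


def rank_documents(query_vec, doc_vecs):
    """Return (document index, score) pairs sorted from most to least similar."""
    scores = [cosine_similarity(query_vec, vec) for vec in doc_vecs]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)


# Toy vectors, made up for illustration; real ones would come from TF-IDF.
doc_vecs = [
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
]
query_vec = [1.0, 1.0, 0.0]

for doc_index, score in rank_documents(query_vec, doc_vecs):
    print(doc_index, round(score, 3))
```

Cosine similarity compares the direction of two vectors rather than their length, so a long document and a short one that use the same words in similar proportions still score as similar.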


License

Distributed under the MIT License. See LICENSE.md for more information.


Contact

Faraz Farangizadeh - [email protected]

Project Link: https://github.com/farazff/NLP-Learning

