Create a semantic search engine with a neural network (e.g. BERT) whose knowledge base can be updated

Last update: Mar 20, 2022

Overview

Neural Search

Description: Create a semantic search engine with a neural network (e.g. BERT) whose knowledge base can be updated. The engine can later serve downstream NLP tasks such as Q&A, summarization, generation, and natural language understanding (NLU).
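
To illustrate what an updatable knowledge base could look like, here is a minimal sketch in plain Python. It is not code from this repository: `KnowledgeBase`, `embed`, and `cosine` are hypothetical names, and `embed` is a toy bag-of-words stand-in for a real BERT encoder so that the example runs without any model download.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a BERT encoder: a bag-of-words count vector (assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class KnowledgeBase:
    """Hypothetical in-memory index whose documents can be added or removed."""

    def __init__(self):
        self._docs = {}  # doc_id -> (text, vector)

    def add(self, doc_id, text):
        # Adding an existing id overwrites it, which doubles as an update.
        self._docs[doc_id] = (text, embed(text))

    def remove(self, doc_id):
        self._docs.pop(doc_id, None)

    def search(self, query, k=3):
        """Return up to k document ids, most similar first, skipping zero scores."""
        q = embed(query)
        scored = [(cosine(q, vec), doc_id) for doc_id, (_, vec) in self._docs.items()]
        scored.sort(reverse=True)
        return [doc_id for score, doc_id in scored[:k] if score > 0]


kb = KnowledgeBase()
kb.add("a", "bert is a transformer language model")
kb.add("b", "coffee orders at the cafe")
first = kb.search("transformer model")

kb.add("c", "transformer model for search")  # update the knowledge base
second = kb.search("transformer model")
print(first, second)
```

With a real encoder, only `embed` and the vector type would change; the add/remove/search surface could stay the same.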

bert_search

Status: WIP
Source: https://towardsdatascience.com/building-a-search-engine-with-bert-and-tensorflow-c6fdc0186c8a
Description: Use a pre-trained BERT model checkpoint to build a general-purpose text feature extractor, then apply it to nearest-neighbor search.
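
The nearest-neighbor step can be sketched independently of the model. The example below is an illustration under assumptions, not the source article's code: `nearest_neighbors` and `l2_normalize` are hypothetical helpers, and the 3-dimensional vectors are placeholders for the feature vectors a BERT extractor would produce (768-dimensional for BERT-base).

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def nearest_neighbors(query, corpus, k=2):
    """Return (index, cosine similarity) for the k corpus vectors closest to query."""
    q = l2_normalize(query)
    scores = []
    for i, vec in enumerate(corpus):
        v = l2_normalize(vec)
        # The dot product of unit vectors equals their cosine similarity.
        scores.append((sum(a * b for a, b in zip(q, v)), i))
    scores.sort(key=lambda s: s[0], reverse=True)
    return [(i, score) for score, i in scores[:k]]


# Placeholder feature vectors (assumption); real ones would come from the extractor.
corpus_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
]
neighbors = nearest_neighbors([1.0, 0.05, 0.0], corpus_vectors, k=2)
print(neighbors)
```

This brute-force scan is linear in the corpus size, which is fine for small collections; larger ones would typically use an approximate nearest-neighbor index instead.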

bert_tfhub

Status: Completed
Source: https://www.tensorflow.org/hub/tutorials/bert_experts
Description: Use a matching preprocessing model to tokenize raw text and convert it to ids, generate the pooled and sequence outputs from the token input ids using the loaded BERT model, and compare the semantic similarity of the pooled outputs of different sentences.
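
As a rough illustration of the final comparison step, the sketch below builds a pairwise cosine-similarity matrix. It assumes the pooled outputs have already been computed; the vectors and sentences here are made-up placeholders, and `cosine_similarity_matrix` is a hypothetical helper rather than a TensorFlow Hub API.

```python
import math


def cosine_similarity_matrix(vectors):
    """Return a square matrix whose [i][j] entry is cos(vectors[i], vectors[j])."""
    normed = []
    for vec in vectors:
        norm = math.sqrt(sum(x * x for x in vec))
        normed.append([x / norm for x in vec])
    return [[sum(a * b for a, b in zip(u, v)) for v in normed] for u in normed]


# Placeholder "pooled outputs" for three sentences (assumption, not model output).
sentences = ["a cat sat", "a kitten sat", "stock prices fell"]
pooled = [
    [0.8, 0.6, 0.1],
    [0.7, 0.7, 0.1],
    [0.1, 0.2, 0.9],
]

sim = cosine_similarity_matrix(pooled)
for sentence, row in zip(sentences, sim):
    print(sentence, [round(x, 2) for x in row])
```

Each row shows how close one sentence is to all the others; the diagonal is 1 because every vector is identical to itself.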

finetune_bert

Status: Completed
Source: https://www.tensorflow.org/text/tutorials/fine_tune_bert
Description: Work through fine-tuning a BERT model using the tensorflow-models pip package. The pretrained BERT model is hosted on TensorFlow Hub.

text_summarization_encoderdecoder

Status: Abandoned (No source code to reference)
Source: https://towardsdatascience.com/text-summarization-from-scratch-using-encoder-decoder-network-with-attention-in-keras-5fa80d12710e
Description: Summarize text from news articles to generate meaningful headlines using an Encoder-Decoder with Attention in Keras.

Owner

Diego
