Create a semantic search engine with a neural network (i.e. BERT) whose knowledge base can be updated

Last update: Mar 20, 2022

Overview

Neural Search

Description: Create a semantic search engine with a neural network (i.e. BERT) whose knowledge base can be updated. This engine can later be used for downstream tasks in NLP such as Q&A, summarization, generation, and natural language understanding (NLU).

bert_search

Status: WIP
Source: https://towardsdatascience.com/building-a-search-engine-with-bert-and-tensorflow-c6fdc0186c8a
Description: Use a pre-trained BERT model checkpoint to build a general-purpose text feature extractor, then apply it to nearest-neighbor search.
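
The nearest-neighbor step can be sketched independently of BERT. The following is a minimal illustration, not the code from the referenced article: `VectorIndex`, its method names, and the three-dimensional vectors are all hypothetical stand-ins, and real BERT features would have hundreds of dimensions. It assumes the feature extractor has already turned each text into a vector, and it shows how entries can be added, replaced, or removed so the knowledge base stays updatable.

```python
import math


class VectorIndex:
    """Tiny in-memory nearest-neighbor index (hypothetical helper, brute force)."""

    def __init__(self):
        # doc_id -> (unit-length vector, original text)
        self._entries = {}

    @staticmethod
    def _normalize(vec):
        # Scale to unit length so a dot product equals cosine similarity.
        norm = math.sqrt(sum(x * x for x in vec))
        if norm == 0.0:
            raise ValueError("cannot index a zero vector")
        return [x / norm for x in vec]

    def upsert(self, doc_id, vec, text):
        # Adding an existing doc_id overwrites the old entry.
        self._entries[doc_id] = (self._normalize(vec), text)

    def remove(self, doc_id):
        self._entries.pop(doc_id, None)

    def search(self, query_vec, k=3):
        # Score every stored vector against the query and keep the top k.
        q = self._normalize(query_vec)
        scored = [
            (sum(a * b for a, b in zip(q, v)), doc_id, text)
            for doc_id, (v, text) in self._entries.items()
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:k]


# Toy vectors standing in for BERT features (assumed values, not model output).
index = VectorIndex()
index.upsert("a", [0.9, 0.1, 0.0], "how to train a model")
index.upsert("b", [0.0, 1.0, 0.2], "best pizza in town")
index.upsert("c", [0.8, 0.3, 0.1], "fine-tuning neural networks")
top = index.search([1.0, 0.2, 0.0], k=2)
for score, doc_id, text in top:
    print(f"{doc_id}\t{score:.3f}\t{text}")
```

A brute-force scan like this is only reasonable for small collections; a larger knowledge base would likely need an approximate nearest-neighbor library instead.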

bert_tfhub

Status: Completed
Source: https://www.tensorflow.org/hub/tutorials/bert_experts
Description: Use a matching preprocessing model to tokenize raw text and convert it to ids, generate the pooled and sequence outputs from the token input ids using the loaded BERT model, and compare the semantic similarity of the pooled outputs of different sentences.
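
Comparing pooled outputs comes down to a pairwise cosine similarity between sentence vectors. The sketch below assumes the pooled outputs have already been computed and converted to plain Python lists; the `pooled` values are made-up placeholders rather than real model output, and the function names are illustrative only.

```python
import math


def cosine_similarity(u, v):
    # Cosine of the angle between two vectors: dot(u, v) / (|u| * |v|).
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    # Entry [i][j] holds the similarity between sentence i and sentence j.
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Placeholder vectors, one per sentence (assumed values).
pooled = [
    [1.0, 0.0, 1.0],
    [2.0, 0.0, 2.0],
    [0.0, 3.0, 0.0],
]
matrix = similarity_matrix(pooled)
for row in matrix:
    print(" ".join(f"{value:.2f}" for value in row))
```

Sentences whose vectors point in the same direction score close to 1 regardless of vector length, while orthogonal vectors score 0.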

finetune_bert

Status: Completed
Source: https://www.tensorflow.org/text/tutorials/fine_tune_bert
Description: Work through fine-tuning a BERT model using the tensorflow-models pip package. The pretrained BERT model is available on TensorFlow Hub.

text_summarization_encoderdecoder

Status: Abandoned (No source code to reference)
Source: https://towardsdatascience.com/text-summarization-from-scratch-using-encoder-decoder-network-with-attention-in-keras-5fa80d12710e
Description: Summarize text from news articles to generate meaningful headlines using an Encoder-Decoder with Attention in Keras.

Owner

Diego