Text classification on IMDB dataset using Keras and Bi-LSTM network

Overview

Text classification on the IMDB movie-review dataset using Keras and a Bi-LSTM network.

Usage

python3 main.py

Hyperparameters

Epoch: 12
Batch size: 128
Dropout: 0.5
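
A minimal sketch of how these hyperparameters might be wired into a Keras Bi-LSTM model for IMDB. The layer sizes, vocabulary cutoff, and sequence length below are illustrative assumptions, not values taken from main.py:

# Minimal sketch, assuming TensorFlow/Keras. Layer sizes, vocab size and
# sequence length are illustrative guesses, not values from main.py.
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dropout, Dense

VOCAB_SIZE = 20000   # assumed vocabulary cutoff
MAX_LEN = 200        # assumed review length after padding

# Load the IMDB reviews as integer sequences and pad to a fixed length.
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=VOCAB_SIZE)
x_train = pad_sequences(x_train, maxlen=MAX_LEN)
x_test = pad_sequences(x_test, maxlen=MAX_LEN)

model = Sequential([
    Embedding(VOCAB_SIZE, 128),
    Bidirectional(LSTM(64)),
    Dropout(0.5),                     # dropout from the table above
    Dense(1, activation="sigmoid"),   # binary sentiment output
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Epochs and batch size from the table above.
model.fit(x_train, y_train, epochs=12, batch_size=128,
          validation_data=(x_test, y_test))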

Model Performance

Loss: 0.0574
Accuracy: 0.9809
Validation Loss: 0.6073
Validation Accuracy: 0.8534

img.png

Terminology

Recurrent Neural Network

A recurrent neural network (RNN) is a type of neural network that uses previous information during training. It remembers the order of the data and uses the patterns in that sequence to make predictions.

An RNN uses feedback loops, which set it apart from other neural networks. These loops let the network process data as a sequence: information gathered at earlier steps is carried forward and influences later predictions. This carried-forward state acts as a kind of memory.

The loop structure is what lets the network consume a sequence of inputs while sharing information between steps: the hidden state produced at one time step is fed back in as an input to the next time step.

rnn.png
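
A toy sketch of that recurrence in plain NumPy (made-up dimensions and random weights) makes the feedback loop concrete: the same weights are reused at every step, and the hidden state h carries information forward:

import numpy as np

# Toy RNN step: illustrative sizes and random weights, just to show the loop.
rng = np.random.default_rng(0)
input_dim, hidden_dim = 8, 16
W_x = rng.normal(size=(hidden_dim, input_dim))   # input-to-hidden weights
W_h = rng.normal(size=(hidden_dim, hidden_dim))  # hidden-to-hidden (the feedback loop)
b = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)                    # initial hidden state ("memory")
sequence = rng.normal(size=(5, input_dim))  # 5 time steps of input

for x_t in sequence:
    # The previous hidden state h feeds back into the new one.
    h = np.tanh(W_x @ x_t + W_h @ h + b)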

Long Short-Term Memory

Long short-term memory networks (LSTM) are a special kind of RNN, introduced to avoid the long-term dependency problem: a standard RNN often fails to connect information seen many steps earlier to the prediction it needs to make now. If an RNN could reliably bridge that gap, it would be far more useful; this gap is what is meant by long-term dependency.

The repeating module in a standard RNN contains a single layer. An LSTM has a similar chain structure, but its repeating module is different: instead of one layer, it contains several interacting gates that let the network retain information over long periods by default. The block diagram of the repeating module looks like the image below.

lstm.png
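
A minimal sketch of the gate computations inside one LSTM cell (NumPy, with illustrative dimensions and random weights; this follows the usual textbook formulation, not the exact code in main.py):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One LSTM step: illustrative sizes, random weights.
rng = np.random.default_rng(0)
input_dim, hidden_dim = 8, 16

def weights():
    return (rng.normal(size=(hidden_dim, input_dim)),
            rng.normal(size=(hidden_dim, hidden_dim)),
            np.zeros(hidden_dim))

(Wf, Uf, bf), (Wi, Ui, bi), (Wo, Uo, bo), (Wc, Uc, bc) = (weights() for _ in range(4))

def lstm_step(x_t, h_prev, c_prev):
    f = sigmoid(Wf @ x_t + Uf @ h_prev + bf)        # forget gate: what to drop
    i = sigmoid(Wi @ x_t + Ui @ h_prev + bi)        # input gate: what to add
    o = sigmoid(Wo @ x_t + Uo @ h_prev + bo)        # output gate: what to expose
    c_tilde = np.tanh(Wc @ x_t + Uc @ h_prev + bc)  # candidate cell state
    c = f * c_prev + i * c_tilde                    # cell state carries long-term memory
    h = o * np.tanh(c)                              # new hidden state
    return h, c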

Bi-Directional Long Short-Term Memory

Bidirectional long short-term memory (Bi-LSTM) makes the sequence information available to the network in both directions, backward (future to past) and forward (past to future).

In a bidirectional network the input flows in two directions, which is what distinguishes a Bi-LSTM from a regular LSTM. A regular LSTM processes the input in only one direction, either forward or backward. A bidirectional LSTM runs over the input in both directions, preserving both past and future context. An example makes this clearer.

Given only the fragment "boys go to…", we cannot fill in the blank. But once we also see the later sentence "boys come out of school", we can easily predict the missing word. This is exactly what we want our model to do, and a bidirectional LSTM allows the network to use context from both directions in the same way.

bi-lstm.png
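
In Keras this is just the Bidirectional wrapper around an LSTM layer. A small sketch (illustrative sizes) shows that the forward and backward passes are concatenated, doubling the output width:

import numpy as np
from tensorflow.keras.layers import LSTM, Bidirectional

# Illustrative: batch of 2 sequences, 5 time steps, 8 features each.
x = np.random.rand(2, 5, 8).astype("float32")

bi = Bidirectional(LSTM(16))  # default merge_mode="concat"
print(bi(x).shape)            # (2, 32): forward and backward outputs concatenated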

Owner
Hamza Rashid
PHP, Laravel, Symfony, MySQL, Python, JavaScript, jQuery, Bootstrap, Sass, Git