DeepSpamReview: Detection of Fake Reviews on Online Review Platforms using Deep Learning Architectures. Summer Internship project at CoreView Systems.

Overview

Detection of Fake Reviews on Online Review Platforms using Deep Learning Architectures

PWC

Dataset: https://s3.amazonaws.com/fast-ai-nlp/yelp_review_polarity_csv.tgz
https://www.kaggle.com/rtatman/deceptive-opinion-spam-corpus
The data includes 1,569,264 samples from the Yelp Dataset Challenge 2015. This subset has 280,000 training samples and 19,000 test samples in each polarity.
**Also, if you happen to refer my work, a citation would do wonders for me. Thanks! **
The following implementations are done:

  1. Bidirectional LSTM with GLoVE 50D word embeddings
  2. LSTM with GLoVE 100D word embeddings
  3. LSTM with GLoVE 300D word embeddings
  4. CNN-LSTM with Doc2Vec and TF-IDF
  5. Attention mechanism with GLoVe 100D word embeddings
  6. Logistic Regression
  7. Multinomial Naive Bayes
  8. Support Vector Machine - Stochastic Gradient Descent (SGD)

The results obtained were as follows:

Sr. No. Model Accuracy (%) Precision Score Recall Score F1 Score
1 MultinomialNB 90.25 0.9325 0.8601
2 Stochastic Gradient Descent (SGD) 87.75 0.8913 0.8497
3 Logistic Regression 87.00 0.8691 0.8601
4 Support Vector Machine 56.25 0.525 0.9792
5 Gaussian Naive Bayes 63.5 0.6424 0.6169
6 K-Nearest Neighbour 57.5 0.8604 0.1840
7 Decision tree 68.5 0.6681 0.7412
Model Training accuracy(%) Testing accuracy(%)
Bidirectional LSTM + GLoVe(50D) 92.17 88.13
LSTM + GLoVe(100D) 99.18 85.75
CNN + LSTM + Doc2Vec +TF-IDF 96.23 92.19
CNN + Attention + GLoVe(100D) 99.00 90.25
BiLSTM + Attention + GLoVe(100D) 99.18 89.27
CNN + BiLSTM + Attention + GLoVe(100D) 99.75 81.25
LogisticRegression + TF-IDF 99.11 87.21

Future scope includes improvement in the attention layer to increase testing accuracy. BERT and XLNet can be implemented to improve the performance further.

Owner
Ashish Salunkhe
Full Stack Developer. Interests in NLP, Knowledge Graphs and Product Dev
Ashish Salunkhe
Codebase for "Revisiting spatio-temporal layouts for compositional action recognition" (Oral at BMVC 2021).

Revisiting spatio-temporal layouts for compositional action recognition Codebase for "Revisiting spatio-temporal layouts for compositional action reco

Gorjan 20 Dec 15, 2022
NeuroGen: activation optimized image synthesis for discovery neuroscience

NeuroGen: activation optimized image synthesis for discovery neuroscience NeuroGen is a framework for synthesizing images that control brain activatio

3 Aug 17, 2022
This project provides an unsupervised framework for mining and tagging quality phrases on text corpora with pretrained language models (KDD'21).

UCPhrase: Unsupervised Context-aware Quality Phrase Tagging To appear on KDD'21...[pdf] This project provides an unsupervised framework for mining and

Xiaotao Gu 146 Dec 22, 2022
MutualGuide is a compact object detector specially designed for embedded devices

Introduction MutualGuide is a compact object detector specially designed for embedded devices. Comparing to existing detectors, this repo contains two

ZHANG Heng 103 Dec 13, 2022
Self-Supervised Learning with Data Augmentations Provably Isolates Content from Style

Self-Supervised Learning with Data Augmentations Provably Isolates Content from Style [NeurIPS 2021] Official code to reproduce the results and data p

Yash Sharma 27 Sep 19, 2022
Dataset and Code for the paper "DepthTrack: Unveiling the Power of RGBD Tracking" (ICCV2021), and "Depth-only Object Tracking" (BMVC2021)

DeT and DOT Code and datasets for "DepthTrack: Unveiling the Power of RGBD Tracking" (ICCV2021) "Depth-only Object Tracking" (BMVC2021) @InProceedings

Yan Song 55 Dec 15, 2022
Text to image synthesis using thought vectors

Text To Image Synthesis Using Thought Vectors This is an experimental tensorflow implementation of synthesizing images from captions using Skip Though

Paarth Neekhara 2.1k Jan 05, 2023
Unofficial PyTorch implementation of Google AI's VoiceFilter system

VoiceFilter Note from Seung-won (2020.10.25) Hi everyone! It's Seung-won from MINDs Lab, Inc. It's been a long time since I've released this open-sour

MINDs Lab 883 Jan 07, 2023
Official implementation of the network presented in the paper "M4Depth: A motion-based approach for monocular depth estimation on video sequences"

M4Depth This is the reference TensorFlow implementation for training and testing depth estimation models using the method described in M4Depth: A moti

Michaël Fonder 76 Jan 03, 2023
OntoProtein: Protein Pretraining With Ontology Embedding

OntoProtein This is the implement of the paper "OntoProtein: Protein Pretraining With Ontology Embedding". OntoProtein is an effective method that mak

ZJUNLP 80 Dec 14, 2022
To provide 100 JAX exercises over different sections structured as a course or tutorials to teach and learn for beginners, intermediates as well as experts

JaxTon 💯 JAX exercises Mission 🚀 To provide 100 JAX exercises over different sections structured as a course or tutorials to teach and learn for beg

Rohan Rao 512 Jan 01, 2023
Dense Prediction Transformers

Vision Transformers for Dense Prediction This repository contains code and models for our paper: Vision Transformers for Dense Prediction René Ranftl,

Intel ISL (Intel Intelligent Systems Lab) 1.3k Dec 28, 2022
Code for the paper Open Sesame: Getting Inside BERT's Linguistic Knowledge.

Open Sesame This repository contains the code for the paper Open Sesame: Getting Inside BERT's Linguistic Knowledge. Credits We built the project on t

9 Jul 24, 2022
Code for the paper: On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations

Non-Parametric Prior Actor-Critic (N-PPAC) This repository contains the code for On Pathologies in KL-Regularized Reinforcement Learning from Expert D

Cong Lu 5 May 13, 2022
Pip-package for trajectory benchmarking from "Be your own Benchmark: No-Reference Trajectory Metric on Registered Point Clouds", ECMR'21

Map Metrics for Trajectory Quality Map metrics toolkit provides a set of metrics to quantitatively evaluate trajectory quality via estimating consiste

Mobile Robotics Lab. at Skoltech 31 Oct 28, 2022
A repository for generating stylized talking 3D and 3D face

style_avatar A repository for generating stylized talking 3D faces and 2D videos. This is the repository for paper Imitating Arbitrary Talking Style f

Haozhe Wu 191 Dec 22, 2022
Gems & Holiday Package Prediction

Predictive_Modelling Gems & Holiday Package Prediction This project is based on 2 cases studies : Gems Price Prediction and Holiday Package prediction

Avnika Mehta 1 Jan 27, 2022
Geneva is an artificial intelligence tool that defeats censorship by exploiting bugs in censors

Geneva is an artificial intelligence tool that defeats censorship by exploiting bugs in censors

Kevin Bock 1.5k Jan 06, 2023
Matthew Colbrook 1 Apr 08, 2022
This repository includes the official project for the paper: TransMix: Attend to Mix for Vision Transformers.

TransMix: Attend to Mix for Vision Transformers This repository includes the official project for the paper: TransMix: Attend to Mix for Vision Transf

Jie-Neng Chen 130 Jan 01, 2023