The proliferation of disinformation across social media has led the application of deep learning techniques to detect fake news.

Overview

Fake News Detection

Overview

The proliferation of disinformation across social media has led the application of deep learning techniques to detect fake news. However, it is difficult to understand how deep learning models make decisions on what is fake or real news, and furthermore these models are vulnerable to adversarial attacks. In this project, we test the resilience of a fake news detector against a set of adversarial attacks. Our results indicate that a deep learning model remains vulnerable to adversarial attacks, but also is alarmingly vulnerable to the use of generic attacks: the inclusion of certain sequences of text whose inclusion into nearly any text sample can cause it to be misclassified. We explore how this set of generic attacks against text classifiers can be detected, and explore how future models can be made more resilient against these attacks.

Dataset Description

Our fake news model and dataset are taken from this github repo.

  • train.csv: A full training dataset with the following attributes:

    • id: unique id for a news article
    • title: the title of a news article
    • author: author of the news article
    • text: the text of the article; could be incomplete
    • label: a label that marks the article as potentially unreliable
      • 1: unreliable
      • 0: reliable
  • test.csv: A testing training dataset with all the same attributes at train.csv without the label.

Adversarial Text Generation

It's difficult to generate adversarial samples when working with text, which is discrete. A workaround, proposed by J. Gao et al. has been to create small text perturbations, like misspelled words, to create a black-box attack on text classification models. Another method taken by N. Papernot has been to find the gradient based off of the word embeddings of sample text. Our approach uses the algorithm proposed by Papernot to generate our adversarial samples. While Gao’s method is extremely effective, with little to no modification of the meaning of the text samples, we decided to see if we could create valid adversarial samples by changing the content of the words, instead of their text.

Methodology

Our original goal was to create a model that could mutate text samples so that they would be misclassified by the model. We accomplished this by implementing the algorithm set out by Papernot in Crafting Adversarial Input Sequences. The proposed algorithm generates a white-box adversarial example based on the model’s Jacobian matrix. Random words from the original text sample are mutated. These mutations are determined by finding a word in the embedding where the sign of the difference between the original word and the new word are closest to the sign of the Jacobian of the original word. The resulting words have an embedding direction that most closely resemble the direction indicated as being most impactful according to the model’s Jacobian.

A fake news text sample modified to be classified as reliable is shown below:

Council of Elders Intended to Set Up Anti-ISIS Coalition by Jason Ditz, October said 31, 2016 Share This ISIS has killed a number of Afghan tribal elders and wounded several more in Nangarhar Province’s main city of Jalalabad today, with a suicide bomber from the group targeting a meeting of the council of elders in the city. The details are still scant, but ISIS claims that the council was established in part to discuss the formation of a tribal anti-ISIS coalition in the area. They claimed 15 killed and 25 wounded, labeling the victims “apostates.” Afghan 000 government officials put the toll a lot lower, saying only four were killed and seven mr wounded in the attack. Nangarhar is the main base of operations for ISIS forces in Afghanistan, though they’ve recently begun to pop up around several other provinces. Whether the council was at the point of establishing an anti-ISIS coalition or not, this is in keeping with the group mr's reaction to any sign of growing local resistance, with ISIS having similarly made an example of tribal groups in Iraq and Syria during their establishment there. Last 5 posts by Jason Ditz

We also discovered a phenomena where adding certain sequences of text to samples would cause them to be misclassified without needing to make any additional modifications to the original text. To discover additional sequences, we took three different approaches: generating sequences based on the sentiments of the word bank, using Papernot’s algorithm to append new sequences, and creating sequences by hand.

Modified Papernot

Papernot’s original algorithm had been trained to mutate existing words in an input text to generate the adversarial text. However, our LSTM model pads the input, leaving spaces for blank words when the input length is small enough. We modify Papernot’s algorithm to mutate on two “blank” words at the end of our input sequence. This will generate new sequences of text that can then be applied to other samples, to see if they can serve as generic attacks.

The modified Papernot algorithm generated two-word sequences of the words ‘000’, ‘said’, and ‘mr’ in various orders, closely resembling the word substitutions created by the baseline Papernot algorithm. It can be expected that the modified Papernot will still use words identified by the baseline method, given that both models rely on the model’s Jacobian matrix when selecting replacement words. When tested against all unreliable samples, sequences generated are able to shift the model’s confidence to inaccurately classify a majority of samples as reliable instead.

Handcraft

Our simplest approach to the generation was to manually look for sequences of text by hand. This involved looking at how the model had performed on the training set, how confident it was on certain samples, and looking for patterns in samples that had been misclassified. We tried to rely on patterns that appear to a human observer to be innocuous, but also explored other patterns that would change the meaning of the text in significant ways.

Methodology Sample Sequence False Discovery Rate
Papernot mr 000 0.37%
Papernot said mr 29.74%
Handcraft follow twitter 26.87%
Handcraft nytimes com 1.70%

Conclusion

One major issue with the deployment of deep learning models is that "the ease with which we can switch between any two decisions in targeted attacks is still far from being understood." It is primarily on this basis that we are skeptical of machine learning methods. We believe that there should be greater emphasis placed on identifying the set of misclassified text samples when evaluating the performance of fake news detectors. If seemingly minute perturbations in the text can change the entire classification of the sample, it is likely that these weaknesses will be found by fake news distributors, where the cost of producing fake news is cheaper than the cost of detecting it.

Our project also led to the discovery of the existence of a set of sequences that could be applied to nearly any text sample to then be misclassified by the model, resembling generic attacks from the cryptography field. We proposed a modification of Papernot’s Jacobian-based adversarial attack to automatically identify these sequences. However, some of these generated sequences do not feel natural to the human eye, and future work can be placed into improving their generation. For now, while the eyes of a machine may be tricked by our samples, the eyes of a human can still spot the differences.

References

Owner
Kushal Shingote
Android Developer📱📱 iOS Apps📱📱 Swift | Xcode | SwiftUI iOS Swift development📱 Kotlin Application📱📱 iOS📱 Artificial Intelligence 💻 Data science
Kushal Shingote
An Explainable Leaderboard for NLP

ExplainaBoard: An Explainable Leaderboard for NLP Introduction | Website | Download | Backend | Paper | Video | Bib Introduction ExplainaBoard is an i

NeuLab 319 Dec 20, 2022
Final Project for the Intel AI Readiness Boot Camp NLP (Jan)

NLP Boot Camp (Jan) Synopsis Full Name: Prameya Mohanty Name of your School: Delhi Public School, Rourkela Class: VIII Title of the Project: iTransect

TheCodingHub 1 Feb 01, 2022
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

CRNN paper:An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition 1. create your ow

Tsukinousag1 3 Apr 02, 2022
A very simple framework for state-of-the-art Natural Language Processing (NLP)

A very simple framework for state-of-the-art NLP. Developed by Humboldt University of Berlin and friends. Flair is: A powerful NLP library. Flair allo

flair 12.3k Jan 02, 2023
DeepPavlov Tutorials

DeepPavlov tutorials DeepPavlov: Sentence Classification with Word Embeddings DeepPavlov: Transfer Learning with BERT. Classification, Tagging, QA, Ze

Neural Networks and Deep Learning lab, MIPT 28 Sep 13, 2022
Tool to check whether a GCP bucket is public or not.

Tool to check publicly accessible GCP bucket. Blog https://justm0rph3u5.medium.com/gcp-inspector-auditing-publicly-exposed-gcp-bucket-ac6cad55618c Wha

DIVYANSHU SHUKLA 7 Nov 24, 2022
Speach Recognitions

easy_meeting Добро пожаловать в интерфейс сервиса автопротоколирования совещаний Easy Meeting. Website - http://cf5c-62-192-251-83.ngrok.io/ Принципиа

Maksim 3 Feb 18, 2022
Analyse japanese ebooks using MeCab to determine the difficulty level for japanese learners

japanese-ebook-analysis This aim of this project is to make analysing the contents of a japanese ebook easy and streamline the process for non-technic

Christoffer Aakre 14 Jul 23, 2022
Coreference resolution for English, German and Polish, optimised for limited training data and easily extensible for further languages

Coreferee Author: Richard Paul Hudson, msg systems ag 1. Introduction 1.1 The basic idea 1.2 Getting started 1.2.1 English 1.2.2 German 1.2.3 Polish 1

msg systems ag 169 Dec 21, 2022
TextFlint is a multilingual robustness evaluation platform for natural language processing tasks,

TextFlint is a multilingual robustness evaluation platform for natural language processing tasks, which unifies general text transformation, task-specific transformation, adversarial attack, sub-popu

TextFlint 587 Dec 20, 2022
sangha, pronounced "suhng-guh", is a social networking, booking platform where students and teachers can share their practice.

Flask React Project This is the backend for the Flask React project. Getting started Clone this repository (only this branch) git clone https://github

Courtney Newcomer 17 Sep 29, 2021
Implementing SimCSE(paper, official repository) using TensorFlow 2 and KR-BERT.

KR-BERT-SimCSE Implementing SimCSE(paper, official repository) using TensorFlow 2 and KR-BERT. Training Unsupervised python train_unsupervised.py --mi

Jeong Ukjae 27 Dec 12, 2022
A BERT-based reverse-dictionary of Korean proverbs

Wisdomify A BERT-based reverse-dictionary of Korean proverbs. 김유빈 : 모델링 / 데이터 수집 / 프로젝트 설계 / back-end 김종윤 : 데이터 수집 / 프로젝트 설계 / front-end Quick Start C

Eu-Bin KIM 94 Dec 08, 2022
[Preprint] Escaping the Big Data Paradigm with Compact Transformers, 2021

Compact Transformers Preprint Link: Escaping the Big Data Paradigm with Compact Transformers By Ali Hassani[1]*, Steven Walton[1]*, Nikhil Shah[1], Ab

SHI Lab 367 Dec 31, 2022
Training code of Spatial Time Memory Network. Semi-supervised video object segmentation.

Training-code-of-STM This repository fully reproduces Space-Time Memory Networks Performance on Davis17 val set&Weights backbone training stage traini

haochen wang 128 Dec 11, 2022
Flexible interface for high-performance research using SOTA Transformers leveraging Pytorch Lightning, Transformers, and Hydra.

Flexible interface for high performance research using SOTA Transformers leveraging Pytorch Lightning, Transformers, and Hydra. What is Lightning Tran

Pytorch Lightning 581 Dec 21, 2022
Production First and Production Ready End-to-End Keyword Spotting Toolkit

Production First and Production Ready End-to-End Keyword Spotting Toolkit

223 Jan 02, 2023
Stanford CoreNLP provides a set of natural language analysis tools written in Java

Stanford CoreNLP Stanford CoreNLP provides a set of natural language analysis tools written in Java. It can take raw human language text input and giv

Stanford NLP 8.8k Jan 07, 2023
Deep learning for NLP crash course at ABBYY.

Deep NLP Course at ABBYY Deep learning for NLP crash course at ABBYY. Suggested textbook: Neural Network Methods in Natural Language Processing by Yoa

Dan Anastasyev 597 Dec 18, 2022
💫 Industrial-strength Natural Language Processing (NLP) in Python

spaCy: Industrial-strength NLP spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest researc

Explosion 24.9k Jan 02, 2023