Code for the paper "Combining Textual Features for the Detection of Hateful and Offensive Language"

Last update: Aug 04, 2022

Overview

This repository provides the source code for the paper "Combining Textual Features for the Detection of Hateful and Offensive Language", submitted to HASOC 2021 English Subtask 1A.

Publication

Installation (requires Python >= 3.6)

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Download the 'resources.zip' file here: https://drive.google.com/file/d/1X88cMrLVpAcJd5Z4Gg6MfTLclIuGF-d6/view?usp=sharing
Extract the contents of 'resources.zip'.
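
If you prefer to script the extraction step, the following is a minimal sketch using only the Python standard library. The helper name `extract_resources` is hypothetical and not part of the repository. It assumes that 'resources.zip' has already been downloaded and that it should be unpacked next to 'config.json', which matches the relative path "resources/hatebert" used in the configuration but is not confirmed by the source.

```python
import zipfile
from pathlib import Path


def extract_resources(archive="resources.zip", target_dir="."):
    """Unpack the archive into target_dir and return its top-level entry names.

    Hypothetical helper: the archive location and target directory are
    assumptions, not something the repository prescribes.
    """
    archive = Path(archive)
    if not archive.is_file():
        raise FileNotFoundError(f"{archive} not found; download it first")
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
        # Collect the first path component of every member, e.g. "resources".
        return sorted({name.split("/")[0] for name in zf.namelist()})
```

Calling `extract_resources()` from the repository root returns the top-level names in the archive, which is a quick way to check that the folder expected by "bert_model_dir" is present.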

Training and Evaluation on HASOC datasets (2019, 2020, 2021)

Execute the following command to train and evaluate the model. The evaluation results are saved in the 'results' folder.

python main.py -c config.json

Optimizing Hyperparameters

The "config.json" file contains hyperparameters that can be changed to train different variants of the model.

{
  "base_dir": "",
  "batch_size": 64,
  "epochs": 20,
  "epoch_patience": 5,
  "bert_model_dir": "resources/hatebert",
  "monitor": "loss",
  "tweet_text_seq_len": 80,
  "tweet_text_char_len": 128,
  "char_size": 29,
  "max_learning_rate": 0.001,
  "end_learning_rate": 0.0000001,
  "rnn_type": "lstm",
  "rnn_layer_size": 200,
  "text_models": ["char_emb", "bert", "hate_words"],
  "normalize_text": true,
  "dataset_year": "2021",
  "optimizer": "adam",
  "text_use_attention": false,
  "oversample": true,
  "feature_normalization_layer_size": 512,
  "min_feature_normalization_layer_size": 64
}
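
To train several variants without editing "config.json" by hand, one option is to generate a separate configuration file per variant. The sketch below shows one way to do that; `write_variant` is a hypothetical helper and not part of the repository, and the output file name is only an example.

```python
import json
from pathlib import Path


def write_variant(base_config, out_path, **overrides):
    """Copy base_config, apply the overrides, and save the result as JSON.

    Hypothetical helper. Overrides must name keys that already exist in
    the base configuration, so a misspelled key is caught early.
    """
    unknown = set(overrides) - set(base_config)
    if unknown:
        raise KeyError(f"unknown hyperparameters: {sorted(unknown)}")
    variant = {**base_config, **overrides}  # base_config itself is not modified
    Path(out_path).write_text(json.dumps(variant, indent=2))
    return variant


# Example usage (assumes config.json is in the current directory):
#   base = json.loads(Path("config.json").read_text())
#   write_variant(base, "config_gru.json", rnn_type="gru")
# The new file can then be passed to the training script:
#   python main.py -c config_gru.json
```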

bert_model_dir

"bert_model_dir": "resources/hatebert"
	OR
"bert_model_dir": "resources/bert-base"

dataset_year

"dataset_year": "2019"
	OR
"dataset_year": "2020"
	OR
"dataset_year": "2021"

text_models

"text_models": ["hate_words"]
	OR
"text_models": ["bert"]
	OR
"text_models": ["char_emb"]
	OR
"text_models": ["char_emb", "bert", "hate_words"]

rnn_type

"rnn_type": "lstm"
	OR
"rnn_type": "gru"
	OR
"rnn_type": "bi-gru"
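
The option lists above can be turned into a small sanity check that runs before training. This is a sketch only: `check_config` is a hypothetical helper, the allowed values are just those listed in this README (the code itself may accept more), and treating any non-empty subset of the three text models as valid is an assumption.

```python
# Values taken from the option lists in this README; the training code
# may accept others, so treat a reported problem as a hint, not an error.
ALLOWED_VALUES = {
    "bert_model_dir": {"resources/hatebert", "resources/bert-base"},
    "dataset_year": {"2019", "2020", "2021"},
    "rnn_type": {"lstm", "gru", "bi-gru"},
}
TEXT_MODELS = {"char_emb", "bert", "hate_words"}


def check_config(config):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for key, allowed in ALLOWED_VALUES.items():
        value = config.get(key)  # a missing key is reported as None
        if value not in allowed:
            problems.append(f"{key}={value!r} is not one of {sorted(allowed)}")
    models = config.get("text_models") or []
    if not models:
        problems.append("text_models must list at least one feature")
    unknown = sorted(set(models) - TEXT_MODELS)
    if unknown:
        problems.append(f"text_models contains unknown entries: {unknown}")
    return problems
```

Running `check_config` on the dictionary loaded from "config.json" and printing the returned list gives a quick overview of any values that differ from the documented options.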

Owner

Sherzod Hakimov
