Not Suitable for Work (NSFW) classification using deep neural network Caffe models.

Overview

Open nsfw model

This repo contains code for running Not Suitable for Work (NSFW) classification deep neural network Caffe models. Please refer our blog post which describes this work and experiments in more detail.

Not suitable for work classifier

Detecting offensive / adult images is an important problem which researchers have tackled for decades. With the evolution of computer vision and deep learning the algorithms have matured and we are now able to classify an image as not suitable for work with greater precision.

Defining NSFW material is subjective and the task of identifying these images is non-trivial. Moreover, what may be objectionable in one context can be suitable in another. For this reason, the model we describe below focuses only on one type of NSFW content: pornographic images. The identification of NSFW sketches, cartoons, text, images of graphic violence, or other types of unsuitable content is not addressed with this model.

Since images and user generated content dominate the internet today, filtering nudity and other not suitable for work images becomes an important problem. In this repository we opensource a Caffe deep neural network for preliminary filtering of NSFW images.

Demo Image

Usage

  • The network takes in an image and gives output a probability (score between 0-1) which can be used to filter not suitable for work images. Scores < 0.2 indicate that the image is likely to be safe with high probability. Scores > 0.8 indicate that the image is highly probable to be NSFW. Scores in middle range may be binned for different NSFW levels.
  • Depending on the dataset, usecase and types of images, we advise developers to choose suitable thresholds. Due to difficult nature of problem, there will be errors, which depend on use-cases / definition / tolerance of NSFW. Ideally developers should create an evaluation set according to the definition of what is safe for their application, then fit a ROC curve to choose a suitable threshold if they are using the model as it is.
  • Results can be improved by fine-tuning the model for your dataset/ use case / definition of NSFW. We do not provide any guarantees of accuracy of results. Please read the disclaimer below.
  • Using human moderation for edge cases in combination with the machine learned solution will help improve performance.

Description of model

We trained the model on the dataset with NSFW images as positive and SFW(suitable for work) images as negative. These images were editorially labelled. We cannot release the dataset or other details due to the nature of the data.

We use CaffeOnSpark which is a wonderful framework for distributed learning that brings deep learning to Hadoop and Spark clusters for training models for our experiments. Big thanks to the CaffeOnSpark team!

The deep model was first pretrained on ImageNet 1000 class dataset. Then we finetuned the weights on the NSFW dataset. We used the thin resnet 50 1by2 architecture as the pretrained network. The model was generated using pynetbuilder tool and replicates the residual network paper's 50 layer network (with half number of filters in each layer). You can find more details on how the model was generated and trained here

Please note that deeper networks, or networks with more filters can improve accuracy. We train the model using a thin residual network architecture, since it provides good tradeoff in terms of accuracy, and the model is light-weight in terms of runtime (or flops) and memory (or number of parameters).

Docker Quickstart

This Docker quickstart guide can be used for evaluating the model quickly with minimal dependency installation.

Install Docker Engine

Build a caffe docker image (CPU)

docker build -t caffe:cpu https://raw.githubusercontent.com/BVLC/caffe/master/docker/cpu/Dockerfile

Check the caffe installation

docker run caffe:cpu caffe --version
caffe version 1.0.0-rc3

Run the docker image with a volume mapped to your open_nsfw repository. Your test_image.jpg should be located in this same directory.

cd open_nsfw
docker run --volume=$(pwd):/workspace caffe:cpu \
python ./classify_nsfw.py \
--model_def nsfw_model/deploy.prototxt \
--pretrained_model nsfw_model/resnet_50_1by2_nsfw.caffemodel \
test_image.jpg

We will get the NSFW score returned:

NSFW score:   0.14057905972

Running the model

To run this model, please install Caffe and its python extension and make sure pycaffe is available in your PYTHONPATH.

We can use the classify.py script to run the NSFW model. For convenience, we have provided the script in this repo as well, and it prints the NSFW score.

python ./classify_nsfw.py \
--model_def nsfw_model/deploy.prototxt \
--pretrained_model nsfw_model/resnet_50_1by2_nsfw.caffemodel \
INPUT_IMAGE_PATH 

Disclaimer

The definition of NSFW is subjective and contextual. This model is a general purpose reference model, which can be used for the preliminary filtering of pornographic images. We do not provide guarantees of accuracy of output, rather we make this available for developers to explore and enhance as an open source project. Results can be improved by fine-tuning the model for your dataset.

License

Code licensed under the [BSD 2 clause license] (https://github.com/BVLC/caffe/blob/master/LICENSE). See LICENSE file for terms.

Contact

The model was trained by Jay Mahadeokar, in collaboration with Sachin Farfade , Amar Ramesh Kamat, Armin Kappeler and others. Special thanks to Gerry Pesavento for taking the initiative for open-sourcing this model. If you have any queries, please raise an issue and we will get back ASAP.

Owner
Yahoo
This organization is the home to many of the active open source projects published by engineers at Yahoo Inc.
Yahoo
L-Verse: Bidirectional Generation Between Image and Text

Far beyond learning long-range interactions of natural language, transformers are becoming the de-facto standard for many vision tasks with their power and scalabilty

Kim, Taehoon 102 Dec 21, 2022
Credit fraud detection in Python using a Jupyter Notebook

Credit-Fraud-Detection - Credit fraud detection in Python using a Jupyter Notebook , using three classification models (Random Forest, Gaussian Naive Bayes, Logistic Regression) from the sklearn libr

Ali Akram 4 Dec 28, 2021
Art Project "Schrödinger's Game of Life"

Repo of the project "Team Creative Quantum AI: Schrödinger's Game of Life" Installation new conda env: conda create --name qcml python=3.8 conda activ

ℍ◮ℕℕ◭ℍ ℝ∈ᛔ∈ℝ 2 Sep 15, 2022
Tensorflow implementation of "BEGAN: Boundary Equilibrium Generative Adversarial Networks"

BEGAN in Tensorflow Tensorflow implementation of BEGAN: Boundary Equilibrium Generative Adversarial Networks. Requirements Python 2.7 or 3.x Pillow tq

Taehoon Kim 922 Dec 21, 2022
On Evaluation Metrics for Graph Generative Models

On Evaluation Metrics for Graph Generative Models Authors: Rylee Thompson, Boris Knyazev, Elahe Ghalebi, Jungtaek Kim, Graham Taylor This is the offic

13 Jan 07, 2023
Plato: A New Framework for Federated Learning Research

a new software framework to facilitate scalable federated learning research.

System <a href=[email protected] Lab"> 192 Jan 05, 2023
A unified framework for machine learning with time series

Welcome to sktime A unified framework for machine learning with time series We provide specialized time series algorithms and scikit-learn compatible

The Alan Turing Institute 6k Jan 08, 2023
Isaac Gym Reinforcement Learning Environments

Isaac Gym Reinforcement Learning Environments

NVIDIA Omniverse 714 Jan 08, 2023
🔮 Execution time predictions for deep neural network training iterations across different GPUs.

Habitat: A Runtime-Based Computational Performance Predictor for Deep Neural Network Training Habitat is a tool that predicts a deep neural network's

Geoffrey Yu 44 Dec 27, 2022
Attention mechanism with MNIST dataset

[TensorFlow] Attention mechanism with MNIST dataset Usage $ python run.py Result Training Loss graph. Test Each figure shows input digit, attention ma

YeongHyeon Park 12 Jun 10, 2022
A highly modular PyTorch framework with a focus on Neural Architecture Search (NAS).

UniNAS A highly modular PyTorch framework with a focus on Neural Architecture Search (NAS). under development (which happens mostly on our internal Gi

Cognitive Systems Research Group 19 Nov 23, 2022
AOT (Associating Objects with Transformers) in PyTorch

An efficient modular implementation of Associating Objects with Transformers for Video Object Segmentation in PyTorch

162 Dec 14, 2022
AFLNet: A Greybox Fuzzer for Network Protocols

AFLNet: A Greybox Fuzzer for Network Protocols AFLNet is a greybox fuzzer for protocol implementations. Unlike existing protocol fuzzers, it takes a m

626 Jan 06, 2023
Learning to Stylize Novel Views

Learning to Stylize Novel Views [Project] [Paper] Contact: Hsin-Ping Huang ([ema

34 Nov 27, 2022
This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning].

CG3 This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning]. R

12 Oct 28, 2022
MIMO-UNet - Official Pytorch Implementation

MIMO-UNet - Official Pytorch Implementation This repository provides the official PyTorch implementation of the following paper: Rethinking Coarse-to-

Sungjin Cho 248 Jan 02, 2023
Codebase for INVASE: Instance-wise Variable Selection - 2019 ICLR

Codebase for "INVASE: Instance-wise Variable Selection" Authors: Jinsung Yoon, James Jordon, Mihaela van der Schaar Paper: Jinsung Yoon, James Jordon,

Jinsung Yoon 50 Nov 11, 2022
Automatically align face images 🙃→🙂. Can also do windowing and warping.

Automatic Face Alignment (AFA) Carl M. Gaspar & Oliver G.B. Garrod You have lots of photos of faces like this: But you want to line up all of the face

Carl Michael Gaspar 15 Dec 12, 2022
Paddle implementation for "Cross-Lingual Word Embedding Refinement by ℓ1 Norm Optimisation" (NAACL 2021)

L1-Refinement Paddle implementation for "Cross-Lingual Word Embedding Refinement by ℓ1 Norm Optimisation" (NAACL 2021) 🙈 A more detailed readme is co

Lincedo Lab 4 Jun 09, 2021
A pytorch implementation of Detectron. Both training from scratch and inferring directly from pretrained Detectron weights are available.

Use this instead: https://github.com/facebookresearch/maskrcnn-benchmark A Pytorch Implementation of Detectron Example output of e2e_mask_rcnn-R-101-F

Roy 2.8k Dec 29, 2022