Official implementation of the paper "Steganographer Detection via a Similarity Accumulation Graph Convolutional Network"

Overview

SAGCN - Official PyTorch Implementation

| Paper | Project Page

This is the official implementation of the paper "Steganographer detection via a similarity accumulation graph convolutional network". NOTE: We are refactoring this project to the best practice of engineering.

Abstract

Steganographer detection aims to identify guilty users who conceal secret information in a number of images for the purpose of covert communication in social networks. Existing steganographer detection methods focus on designing discriminative features but do not explore relationship between image features or effectively represent users based on features. In these methods, each image is recognized as an equivalent, and each user is regarded as the distribution of all images shared by the corresponding user. However, the nuances of guilty users and innocent users are difficult to recognize with this flattened method. In this paper, the steganographer detection task is formulated as a multiple-instance learning problem in which each user is considered to be a bag, and the shared images are multiple instances in the bag. Specifically, we propose a similarity accumulation graph convolutional network to represent each user as a complete weighted graph, in which each node corresponds to features extracted from an image and the weight of an edge is the similarity between each pair of images. The constructed unit in the network can take advantage of the relationships between instances so that common patterns of positive instances can be enhanced via similarity accumulations. Instead of operating on a fixed original graph, we propose a novel strategy for reconstructing and pooling graphs based on node features to iteratively operate multiple convolutions. This strategy can effectively address oversmoothing problems that render nodes indistinguishable although they share different instance-level labels. Compared with the state-of-the-art method and other representative graph-based models, the proposed framework demonstrates its effectiveness and reliability ability across image domains, even in the context of large-scale social media scenarios. Moreover, the experimental results also indicate that the proposed network can be generalized to other multiple-instance learning problems.

Roadmap

After many rounds of revision, the project code implementation is not elegant. Thus, in order to help the readers to reproduce the experimental results of this paper quickly, we will open-source our study following this roadmap:

  • refactor and open-source all the model files, training files, and test files of the proposed method for comparison experiments.
  • refactor and open-source the visualization experiments.
  • refactor and open-source the APIs for the real-world steganographer detection in an out-of-box fashion.

Quick Start

Dataset and Pre-processing

We use the MDNNSD model to extract a 320-D feature from each image and save the extracted features in different .mat files. You should check ./data/train and ./data/test to confirm you have the dataset ready before experiments. For example, cover.mat and suniward_01.mat should be placed in the ./data/train and ./data/test folders.

Then, we provide a dataset tool to distribute image features and construct innocent users and guilty users as described in the paper, for example:

python preprocess_dataset.py --target suniward_01_100 --guilty_file suniward_01 --is_train --is_test --is_reset --mixin_num 0

Train the proposed SAGCN

To obtain our designed model for detecting steganographers, we provide an entry file with flexible command-line options, arguments to train the proposed SAGCN on the desired dataset under various experiment settings, for example:

python main.py --epochs 80 --batch_size 100 --model_name SAGCN --folder_name suniward_01_100 --parameters_name=sagcn_suniward_01_100 --mode train --learning_rate 1e-2 --gpu 1
python main.py --epochs 80 --batch_size 100 --model_name SAGCN --folder_name suniward_01_100 --parameters_name=sagcn_suniward_01_100 --mode train --learning_rate 1e-2 --gpu 1

Test the proposed SAGCN

For reproducing the reported experimental results, you just need to pass command-line options of the corresponding experimental setting, such as:

python main.py --batch_size 100 --model_name SAGCN --parameters_name sagcn_suniward_01_100 --folder_name suniward_01_100 --mode test --gpu 1

Visualize

If you set summary to True during training, you can use tensorboard to visualize the training process.

tensorboard --logdir logs --host 0.0.0.0 --port 8088

Requirement

  • Hardware: GPUs Tesla V100-PCIE (our version)
  • Software:
    • h5py==2.7.1 (our version)
    • scipy==1.1.0 (our version)
    • tqdm==4.25.0 (our version)
    • numpy==1.14.3 (our version)
    • torch==0.4.1 (our version)

Contact

If you have any questions, please feel free to open an issue.

Contribution

We thank all the people who already contributed to this project:

  • Zhi ZHANG
  • Mingjie ZHENG
  • Shenghua ZHONG
  • Yan LIU

Citation Information

If you find the project useful, please cite:

@article{zhang2021steganographer,
  title={Steganographer detection via a similarity accumulation graph convolutional network},
  author={Zhang, Zhi and Zheng, Mingjie and Zhong, Sheng-hua and Liu, Yan},
  journal={Neural Networks},
  volume={136},
  pages={97--111},
  year={2021}
}
Owner
ZHANG Zhi
日知其所亡,月无忘其所能
ZHANG Zhi
Code for "Learning to Segment Rigid Motions from Two Frames".

rigidmask Code for "Learning to Segment Rigid Motions from Two Frames". ** This is a partial release with inference and evaluation code.

Gengshan Yang 157 Nov 21, 2022
EssentialMC2 Video Understanding

EssentialMC2 Introduction EssentialMC2 is a complete system to solve video understanding tasks including MHRL(representation learning), MECR2( relatio

Alibaba 106 Dec 11, 2022
Language Models for the legal domain in Spanish done @ BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).

Spanish legal domain Language Model ⚖️ This repository contains the page for two main resources for the Spanish legal domain: A RoBERTa model: https:/

Plan de Tecnologías del Lenguaje - Gobierno de España 12 Nov 14, 2022
AbelNN: Deep Learning Python module from scratch

AbelNN: Deep Learning Python module from scratch I have implemented several neural networks from scratch using only Numpy. I have designed the module

Abel 2 Apr 12, 2022
A model to classify a piece of news as REAL or FAKE

Fake_news_classification A model to classify a piece of news as REAL or FAKE. This python project of detecting fake news deals with fake and real news

Gokul Stark 1 Jan 29, 2022
(AAAI2020)Grapy-ML: Graph Pyramid Mutual Learning for Cross-dataset Human Parsing

Grapy-ML: Graph Pyramid Mutual Learning for Cross-dataset Human Parsing This repository contains pytorch source code for AAAI2020 oral paper: Grapy-ML

54 Aug 04, 2022
Capture all information throughout your model's development in a reproducible way and tie results directly to the model code!

Rubicon Purpose Rubicon is a data science tool that captures and stores model training and execution information, like parameters and outcomes, in a r

Capital One 97 Jan 03, 2023
Deep generative models of 3D grids for structure-based drug discovery

What is liGAN? liGAN is a research codebase for training and evaluating deep generative models for de novo drug design based on 3D atomic density grid

Matt Ragoza 152 Jan 03, 2023
Code accompanying "Evolving spiking neuron cellular automata and networks to emulate in vitro neuronal activity," accepted to IEEE SSCI ICES 2021

Evolving-spiking-neuron-cellular-automata-and-networks-to-emulate-in-vitro-neuronal-activity Code accompanying "Evolving spiking neuron cellular autom

SOCRATES: Self-Organizing Computational substRATES 2 Dec 02, 2022
Aydin is a user-friendly, feature-rich, and fast image denoising tool

Aydin is a user-friendly, feature-rich, and fast image denoising tool that provides a number of self-supervised, auto-tuned, and unsupervised image denoising algorithms.

Royer Lab 99 Dec 14, 2022
PyTorch implementation of the Flow Gaussian Mixture Model (FlowGMM) model from our paper

Flow Gaussian Mixture Model (FlowGMM) This repository contains a PyTorch implementation of the Flow Gaussian Mixture Model (FlowGMM) model from our pa

Pavel Izmailov 124 Nov 06, 2022
Greedy Gaussian Segmentation

GGS Greedy Gaussian Segmentation (GGS) is a Python solver for efficiently segmenting multivariate time series data. For implementation details, please

Stanford University Convex Optimization Group 72 Dec 07, 2022
This is a five-step framework for the development of intrusion detection systems (IDS) using machine learning (ML) considering model realization, and performance evaluation.

AB-TRAP: building invisibility shields to protect network devices The AB-TRAP framework is applicable to the development of Network Intrusion Detectio

Lab-C2DC - Laboratory of Command and Control and Cyber-security 17 Jan 04, 2023
Unofficial implementation of Google "CutPaste: Self-Supervised Learning for Anomaly Detection and Localization" in PyTorch

CutPaste CutPaste: image from paper Unofficial implementation of Google's "CutPaste: Self-Supervised Learning for Anomaly Detection and Localization"

Lilit Yolyan 59 Nov 27, 2022
FedScale: Benchmarking Model and System Performance of Federated Learning

FedScale: Benchmarking Model and System Performance of Federated Learning (Paper) This repository contains scripts and instructions of building FedSca

268 Jan 01, 2023
Code for ICML 2021 paper: How could Neural Networks understand Programs?

OSCAR This repository contains the source code of our ICML 2021 paper How could Neural Networks understand Programs?. Environment Run following comman

Dinglan Peng 115 Dec 17, 2022
PyTorch implementation of DeepDream algorithm

neural-dream This is a PyTorch implementation of DeepDream. The code is based on neural-style-pt. Here we DeepDream a photograph of the Golden Gate Br

121 Nov 05, 2022
Toward Spatially Unbiased Generative Models (ICCV 2021)

Toward Spatially Unbiased Generative Models Implementation of Toward Spatially Unbiased Generative Models (ICCV 2021) Overview Recent image generation

Jooyoung Choi 88 Dec 01, 2022
MDMM - Learning multi-domain multi-modality I2I translation

Multi-Domain Multi-Modality I2I translation Pytorch implementation of multi-modality I2I translation for multi-domains. The project is an extension to

Hsin-Ying Lee 107 Nov 04, 2022
GLODISMO: Gradient-Based Learning of Discrete Structured Measurement Operators for Signal Recovery

GLODISMO: Gradient-Based Learning of Discrete Structured Measurement Operators for Signal Recovery This is the code to the paper: Gradient-Based Learn

3 Feb 15, 2022