(SIGIR2020) “Asymmetric Tri-training for Debiasing Missing-Not-At-Random Explicit Feedback”

Last update: Dec 01, 2022

About

This repository accompanies the real-world experiments conducted in the paper "Asymmetric Tri-training for Debiasing Missing-Not-At-Random Explicit Feedback" by Yuta Saito, which has been accepted at SIGIR2020 as a full paper.

If you find this code useful in your research, please cite:

@inproceedings{saito2020asymmetric,
  title={Asymmetric tri-training for debiasing missing-not-at-random explicit feedback},
  author={Saito, Yuta},
  booktitle={Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval},
  year={2020}
}

Dependencies

numpy==1.17.2
pandas==0.25.1
scikit-learn==0.22.1
tensorflow==1.15.2
optuna==0.17.0
pyyaml==5.1.2
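
The pinned versions above can be captured in a requirements file so that the environment is reproducible. This is a minimal sketch, not part of the repository: the file name requirements.txt and the use of pip with a virtual environment are assumptions, and tensorflow 1.15 is likely to need an older Python interpreter (3.7 or earlier).

```shell
# Write the pinned versions listed above to a requirements file.
# (requirements.txt is an assumed file name, not shipped with the repository.)
cat > requirements.txt <<'EOF'
numpy==1.17.2
pandas==0.25.1
scikit-learn==0.22.1
tensorflow==1.15.2
optuna==0.17.0
pyyaml==5.1.2
EOF

# Then install into an isolated environment, for example:
#   python -m venv .venv && . .venv/bin/activate
#   pip install -r requirements.txt
```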

Running the code

To run the simulation with real-world datasets, first prepare the data:

Download the Coat dataset from https://www.cs.cornell.edu/~schnabts/mnar/ and put the train.ascii and test.ascii files into the ./data/coat/ directory.
Download the Yahoo! R3 dataset from https://webscope.sandbox.yahoo.com/catalog.php?datatype=r and put the train.txt and test.txt files into the ./data/yahoo/ directory.
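
Before launching the experiments, it can help to confirm that the dataset files are where the commands expect them. The following is a minimal sketch rather than part of the repository: the function name check_dataset is hypothetical, and the paths are the ones given above, taken relative to the repository root.

```shell
# Hypothetical helper (not part of the repository): report every expected
# file that is missing from a dataset directory.
# Usage: check_dataset DIR FILE...
check_dataset() {
  dir=$1
  shift
  missing=0
  for f in "$@"; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f"
      missing=1
    fi
  done
  # Non-zero exit status if at least one file was not found.
  return $missing
}

# Paths as listed above, relative to the repository root.
check_dataset ./data/coat train.ascii test.ascii || echo "Coat dataset is incomplete"
check_dataset ./data/yahoo train.txt test.txt || echo "Yahoo! R3 dataset is incomplete"
```

Run it from the repository root; it prints nothing when all four files are in place.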

Then, run the following commands in the ./src/ directory:

For the MF-IPS models without asymmetric tri-training:

for data in yahoo coat
do
  for model in uniform user item both nb nb_true
  do
    python main.py -d $data -m $model
  done
done

For the MF-IPS models with asymmetric tri-training (our proposal):

for data in coat yahoo
do
  for model in uniform-at user-at item-at both-at nb-at nb_true-at
  do
    python main.py -d $data -m $model
  done
done

Here, (uniform, user, item, both, nb, nb_true) correspond to (uniform propensity, user propensity, item propensity, user-item propensity, NB (uniform), NB (true)), respectively, and the -at suffix selects the variant trained with asymmetric tri-training.
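
The two loops can also be combined into a single dry run that lists every command before anything is executed. This is only a sketch: the function name list_commands is hypothetical, and it assumes, as in the loops above, that each tri-training variant is named by appending -at to the base model name.

```shell
# Hypothetical helper: print one "python main.py" command per
# (dataset, model) pair, covering each base model and its "-at" variant.
list_commands() {
  for data in coat yahoo; do
    for base in uniform user item both nb nb_true; do
      for model in "$base" "${base}-at"; do
        echo "python main.py -d $data -m $model"
      done
    done
  done
}

# Dry run: show the commands without executing them.
list_commands
```

Once the list looks right, piping it to a shell from the ./src/ directory (list_commands | sh) would run the configurations one after another.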

These commands run the real-world experiments reported in Section 5 of the paper. The tuned hyperparameters for all models can be found in ./hyper_params.yaml.
(Adding the -t option to the above commands re-runs the hyperparameter tuning procedure with Optuna.)

Once the simulations have finished running, the summarized results can be obtained by running the following command in the ./src/ directory:

python summarize_results.py -d coat yahoo

This creates ./paper_results/.

Owner

yuta-saito
