Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model

Last update: Dec 07, 2022

Overview

Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model

About

This repository contains the code to replicate the synthetic experiment conducted in the paper "Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model" by Haruka Kiyohara, Yuta Saito, Tatsuya Matsuhiro, Yusuke Narita, Nobuyuki Shimizu, and Yasuo Yamamoto, which has been accepted to WSDM2022.

If you find this code useful in your research then please site:

@inproceedings{kiyohara2022doubly,
  author = {Kiyohara, Haruka and Saito, Yuta and Matsuhiro, Tatsuya and Narita, Yusuke and Shimizu, Nobuyuki and Yamamoto, Yasuo},
  title = {Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model},
  booktitle = {Proceedings of the 15th International Conference on Web Search and Data Mining},
  pages = {xxx--xxx},
  year = {2022},
}

Dependencies

This repository supports Python 3.7 or newer.

numpy==1.20.0
pandas==1.2.1
scikit-learn==0.24.1
matplotlib==3.4.3
obp==0.5.2
hydra-core==1.0.6

Note that the proposed Cascade-DR estimator is implemented in Open Bandit Pipeline (obp.ope.SlateCascadeDoublyRobust).

Running the code

To conduct the synthetic experiment, run the following commands.

(i) run OPE simulations with varying data size, with the fixed slate size.

python src/main.py setting=n_rounds

(ii), (iii) run OPE simulations with varying slate size and policy similarities, with the fixed data size.

python src/main.py

Once the code is finished executing, you can find the results (squared_error.csv, relative_ee.csv, configuration.csv) in the ./logs/ directory. Lower value is better for squared error and relative estimation error (relative-ee).

Visualize the results

To visualize the results, run the following commands. Make sure that you have executed the above two experiments (by running python src/main.py and python src/main.py setting=default) before visualizing the results.

python src/visualize.py

Then, you will find the following figures (slate size (standard/cascade/independent).png, evaluation policy similarity (standard/cascade/independent).png, data size (standard/cascade/independent).png) in the ./logs/ directory. Lower value is better for the relative-MSE (y-axis).

reward structure	Standard	Cascade	Independent
varying data size (n)
varying slate size (L)
varying evaluation policy similarity (λ)

Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model

Related tags

Overview

Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model

About

Dependencies

Running the code

Visualize the results

Owner

Haruka Kiyohara

This Repostory contains the pretrained DTLN-aec model for real-time acoustic echo cancellation.

Qlib is an AI-oriented quantitative investment platform

Official Pytorch implementation of ICLR 2018 paper Deep Learning for Physical Processes: Integrating Prior Scientific Knowledge.

MetaTTE: a Meta-Learning Based Travel Time Estimation Model for Multi-city Scenarios

Learning Time-Critical Responses for Interactive Character Control

Codes for [NeurIPS'21] You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership.

Run Keras models in the browser, with GPU support using WebGL

This is the codebase for the ICLR 2021 paper Trajectory Prediction using Equivariant Continuous Convolution

PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models

Strongly local p-norm-cut algorithms for semi-supervised learning and local graph clustering

PyTorch implementation of the method described in the paper VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop.

Autonomous racing with the Anki Overdrive

Flappy bird automation using Neuroevolution of Augmenting Topologies (NEAT) in Python

Pytorch code for our paper Beyond ImageNet Attack: Towards Crafting Adversarial Examples for Black-box Domains)

Codebase for "ProtoAttend: Attention-Based Prototypical Learning."

We will release the code of "ConTNet: Why not use convolution and transformer at the same time?" in this repo

Does Pretraining for Summarization Reuqire Knowledge Transfer?

Object classification with basic computer vision techniques

The project of phase's key role in complex and real NN

Dataset used in "PlantDoc: A Dataset for Visual Plant Disease Detection" accepted in CODS-COMAD 2020