Consumer Fairness in Recommender Systems: Contextualizing Definitions and Mitigations

Overview

Consumer Fairness in Recommender Systems: Contextualizing Definitions and Mitigations

reproducibility task

This is the repository for the paper Consumer Fairness in Recommender Systems: Contextualizing Definitions and Mitigation, developed by Giacomo Medda, PhD student at University of Cagliari, with the support of Gianni Fenu, Full Professor at University of Cagliari, Mirko Marras, Non-tenure Track Assistant Professor at University of Cagliari, and Ludovico Boratto, Tenure Track Assistant Professor at University of Cagliari.

The goal of the paper was to find a common understanding and practical benchmarks on how and when each procedure of consumer fairness in recommender systems can be used in comparison to the others.

Repository Organization

  • reproducibility_study

    This is the directory that contains the source code of each reproduced paper identified by the author names of the respective paper.

    • Ashokan and Haas: Fairness metrics and bias mitigation strategies for rating predictions
    • Burke et al: Balanced Neighborhoods for Multi-sided Fairness in Recommendation
    • Ekstrand et al: All The Cool Kids, How Do They Fit In. Popularity and Demographic Biases in Recommender Evaluation and Effectiveness
    • Frisch et al: Co-clustering for fair recommendation
    • Kamishima et al: Recommendation Independence
    • Li et al: User-oriented Fairness in Recommendation
    • Rastegarpanah et al: Fighting Fire with Fire. Using Antidote Data to Improve Polarization and Fairness of Recommender Systems
    • Wu et al: Learning Fair Representations for Recommendation. A Graph-based Perspective
  • Preprocessing

    Contains the scripts to preprocess the raw datasets and to generate the input data for each reproduced paper.

  • Evaluation

    Contains the scripts to load the predictions of each reproduced paper, compute the metrics and generate plots and tables in latex and markdown forms.

  • Other Folders

    The other folders not already mentioned are part of the codebase that supports the scripts contained in Preprocessing and Evaluation. These directories and their contents are described by README_codebase, since the structure and code inside these folders is only used to support the reproducibility study and it is independent from the specific implementation of each paper.

Reproducibility Pipeline

  • Code Integration.

    The preprocessing of the raw datasets is performed by the scripts.

    The commands to preprocess each dataset are present at the top of the related dataset script, but the procedure is better described inside the REPRODUCE.md. The preprocessed datasets will be saved in data/preprocessed_datasets.

    Once the MovieLens 1M and the Last.FM 1K dataset have been processed, we can pass to the generation of the input data for each reproduced paper:

    The commands to generate the input data for each preprocessed dataset and sensitive attribute are present at the top of the script, but the procedure is better described inside the REPRODUCE.md). The generated input data will be saved in Preprocessing/input_data.

  • Mitigation Execution

    Each paper (folder) listed in the subsection reproducibility_study of Repository Organization contains a REPRODUCE.md file that describes everything to setup, prepare and run each reproduced paper. In particular, instructions to install the dependencies are provided, as well as the specific subfolders to fill with the input data generated in the previous step, in order to properly run the experiments of the selected paper. The procedure for each source code is better described in the already mentioned REPRODUCE.md file.

  • Relevance Estimation and Metrics Computation

    The REPRODUCE.md file contained in each "paper" folder describes also where the predictions can be found at the end of the mitigation procedure and guide the developer on following the instructions of the REPRODUCE.md of Evaluation that contains:

    • metrics_reproduced: script that loads all the predictions of relevance scores and computes the metrics in form of plots and latex tables This is the script that must be configured the most, since the paths of the specific predictions of each paper and model could be copied and pasted inside the script if the filenames do not correspond to what we expect and prepare. The REPRODUCE.MD already mentioned better described these steps and specifying which are the commands to execute to get the desired results.

Installation

Considering the codebase and the different versions of libraries used by each paper, multiple Python versions are mandatory to execute properly this code.

The codebase (that is the code not inside reproducibility_study, Preprocessing, Evaluation) needs a Python 3.8 installation and all the necessary dependencies can be installed with the requirements.txt file in the root of the repository with the following command in Windows:

pip install -r requirements.txt

or in Linux:

pip3 install -r requirements.txt

The installation of each reproducible paper is thoroughly described in the REPRODUCE.md that you can find in each paper folder, but every folder contains a requirements.txt file that you can use to install the dependencies in the same way. We recommend to use virtual environments at least for each reproduced paper, since some require specific versions of Python (2, 3, 3.7) and a virtual environment for each paper will maintain a good order in the code organization. Virtual environments can be created in different ways depending on the Python version and on the system. The Python Documentation describes the creation of virtual environments for Python >= 3.5, while the virtualenv Website can be used for Python 2.

Results

Top-N Recommendation Gender

Top-N Recommendation Gender

Top-N Recommendation Age

Top-N Recommendation Age

Rating Prediction Gender

Rating Prediction Gender

Rating Prediction Age

Rating Prediction Age

Official code for "End-to-End Optimization of Scene Layout" -- including VAE, Diff Render, SPADE for colorization (CVPR 2020 Oral)

End-to-End Optimization of Scene Layout Code release for: End-to-End Optimization of Scene Layout CVPR 2020 (Oral) Project site, Bibtex For help conta

Andrew Luo 41 Dec 09, 2022
tf2onnx - Convert TensorFlow, Keras and Tflite models to ONNX.

tf2onnx converts TensorFlow (tf-1.x or tf-2.x), tf.keras and tflite models to ONNX via command line or python api.

Open Neural Network Exchange 1.8k Jan 08, 2023
Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs"

This is the codebase for the paper: Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs Directory Structur

Peter Hase 19 Aug 21, 2022
Official PyTorch implementation of PICCOLO: Point-Cloud Centric Omnidirectional Localization (ICCV 2021)

Official PyTorch implementation of PICCOLO: Point-Cloud Centric Omnidirectional Localization (ICCV 2021)

16 Nov 19, 2022
Third party Pytorch implement of Image Processing Transformer (Pre-Trained Image Processing Transformer arXiv:2012.00364v2)

ImageProcessingTransformer Third party Pytorch implement of Image Processing Transformer (Pre-Trained Image Processing Transformer arXiv:2012.00364v2)

61 Jan 01, 2023
RodoSol-ALPR Dataset

RodoSol-ALPR Dataset This dataset, called RodoSol-ALPR dataset, contains 20,000 images captured by static cameras located at pay tolls owned by the Ro

Rayson Laroca 45 Dec 15, 2022
This is a repository for a No-Code object detection inference API using the OpenVINO. It's supported on both Windows and Linux Operating systems.

OpenVINO Inference API This is a repository for an object detection inference API using the OpenVINO. It's supported on both Windows and Linux Operati

BMW TechOffice MUNICH 68 Nov 24, 2022
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

eXtreme Gradient Boosting Community | Documentation | Resources | Contributors | Release Notes XGBoost is an optimized distributed gradient boosting l

Distributed (Deep) Machine Learning Community 23.6k Dec 31, 2022
Improving Deep Network Debuggability via Sparse Decision Layers

Improving Deep Network Debuggability via Sparse Decision Layers This repository contains the code for our paper: Leveraging Sparse Linear Layers for D

Madry Lab 35 Nov 14, 2022
Selfplay In MultiPlayer Environments

This project allows you to train AI agents on custom-built multiplayer environments, through self-play reinforcement learning.

200 Jan 08, 2023
GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models

GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Model This repository is the official PyTorch implementation of GraphRNN, a graph gene

Jiaxuan 568 Dec 29, 2022
Nonnegative spatial factorization for multivariate count data

Nonnegative spatial factorization for multivariate count data This repository contains supporting code to facilitate reproducible analysis. For detail

Will Townes 24 Dec 19, 2022
PyTorch-centric library for evaluating and enhancing the robustness of AI technologies

Responsible AI Toolbox A library that provides high-quality, PyTorch-centric tools for evaluating and enhancing both the robustness and the explainabi

24 Dec 22, 2022
The official project of SimSwap (ACM MM 2020)

SimSwap: An Efficient Framework For High Fidelity Face Swapping Proceedings of the 28th ACM International Conference on Multimedia The official reposi

Six_God 2.6k Jan 08, 2023
The Illinois repository for Climatehack (https://climatehack.ai/). We won 1st place!

Climatehack This is the repository for Illinois's Climatehack Team. We earned first place on the leaderboard with a final score of 0.87992. An overvie

Jatin Mathur 20 Jun 09, 2022
Face detection using deep learning.

Face Detection Docker Solution Using Faster R-CNN Dockerface is a deep learning face detector. It deploys a trained Faster R-CNN network on Caffe thro

Nataniel Ruiz 181 Dec 19, 2022
CS_Final_Metal_surface_detection - This is a final project for CoderSchool Machine Learning bootcamp on 29/12/2021.

CS_Final_Metal_surface_detection This is a final project for CoderSchool Machine Learning bootcamp on 29/12/2021. The project is based on the dataset

Cuong Vo 1 Dec 29, 2021
FIRM-AFL is the first high-throughput greybox fuzzer for IoT firmware.

FIRM-AFL FIRM-AFL is the first high-throughput greybox fuzzer for IoT firmware. FIRM-AFL addresses two fundamental problems in IoT fuzzing. First, it

356 Dec 23, 2022
Learnable Multi-level Frequency Decomposition and Hierarchical Attention Mechanism for Generalized Face Presentation Attack Detection

LMFD-PAD Note This is the official repository of the paper: LMFD-PAD: Learnable Multi-level Frequency Decomposition and Hierarchical Attention Mechani

28 Dec 02, 2022
Code for the paper "Training GANs with Stronger Augmentations via Contrastive Discriminator" (ICLR 2021)

Training GANs with Stronger Augmentations via Contrastive Discriminator (ICLR 2021) This repository contains the code for reproducing the paper: Train

Jongheon Jeong 174 Dec 29, 2022