Consumer Fairness in Recommender Systems: Contextualizing Definitions and Mitigations

Overview

Consumer Fairness in Recommender Systems: Contextualizing Definitions and Mitigations

reproducibility task

This is the repository for the paper Consumer Fairness in Recommender Systems: Contextualizing Definitions and Mitigation, developed by Giacomo Medda, PhD student at University of Cagliari, with the support of Gianni Fenu, Full Professor at University of Cagliari, Mirko Marras, Non-tenure Track Assistant Professor at University of Cagliari, and Ludovico Boratto, Tenure Track Assistant Professor at University of Cagliari.

The goal of the paper was to find a common understanding and practical benchmarks on how and when each procedure of consumer fairness in recommender systems can be used in comparison to the others.

Repository Organization

  • reproducibility_study

    This is the directory that contains the source code of each reproduced paper identified by the author names of the respective paper.

    • Ashokan and Haas: Fairness metrics and bias mitigation strategies for rating predictions
    • Burke et al: Balanced Neighborhoods for Multi-sided Fairness in Recommendation
    • Ekstrand et al: All The Cool Kids, How Do They Fit In. Popularity and Demographic Biases in Recommender Evaluation and Effectiveness
    • Frisch et al: Co-clustering for fair recommendation
    • Kamishima et al: Recommendation Independence
    • Li et al: User-oriented Fairness in Recommendation
    • Rastegarpanah et al: Fighting Fire with Fire. Using Antidote Data to Improve Polarization and Fairness of Recommender Systems
    • Wu et al: Learning Fair Representations for Recommendation. A Graph-based Perspective
  • Preprocessing

    Contains the scripts to preprocess the raw datasets and to generate the input data for each reproduced paper.

  • Evaluation

    Contains the scripts to load the predictions of each reproduced paper, compute the metrics and generate plots and tables in latex and markdown forms.

  • Other Folders

    The other folders not already mentioned are part of the codebase that supports the scripts contained in Preprocessing and Evaluation. These directories and their contents are described by README_codebase, since the structure and code inside these folders is only used to support the reproducibility study and it is independent from the specific implementation of each paper.

Reproducibility Pipeline

  • Code Integration.

    The preprocessing of the raw datasets is performed by the scripts.

    The commands to preprocess each dataset are present at the top of the related dataset script, but the procedure is better described inside the REPRODUCE.md. The preprocessed datasets will be saved in data/preprocessed_datasets.

    Once the MovieLens 1M and the Last.FM 1K dataset have been processed, we can pass to the generation of the input data for each reproduced paper:

    The commands to generate the input data for each preprocessed dataset and sensitive attribute are present at the top of the script, but the procedure is better described inside the REPRODUCE.md). The generated input data will be saved in Preprocessing/input_data.

  • Mitigation Execution

    Each paper (folder) listed in the subsection reproducibility_study of Repository Organization contains a REPRODUCE.md file that describes everything to setup, prepare and run each reproduced paper. In particular, instructions to install the dependencies are provided, as well as the specific subfolders to fill with the input data generated in the previous step, in order to properly run the experiments of the selected paper. The procedure for each source code is better described in the already mentioned REPRODUCE.md file.

  • Relevance Estimation and Metrics Computation

    The REPRODUCE.md file contained in each "paper" folder describes also where the predictions can be found at the end of the mitigation procedure and guide the developer on following the instructions of the REPRODUCE.md of Evaluation that contains:

    • metrics_reproduced: script that loads all the predictions of relevance scores and computes the metrics in form of plots and latex tables This is the script that must be configured the most, since the paths of the specific predictions of each paper and model could be copied and pasted inside the script if the filenames do not correspond to what we expect and prepare. The REPRODUCE.MD already mentioned better described these steps and specifying which are the commands to execute to get the desired results.

Installation

Considering the codebase and the different versions of libraries used by each paper, multiple Python versions are mandatory to execute properly this code.

The codebase (that is the code not inside reproducibility_study, Preprocessing, Evaluation) needs a Python 3.8 installation and all the necessary dependencies can be installed with the requirements.txt file in the root of the repository with the following command in Windows:

pip install -r requirements.txt

or in Linux:

pip3 install -r requirements.txt

The installation of each reproducible paper is thoroughly described in the REPRODUCE.md that you can find in each paper folder, but every folder contains a requirements.txt file that you can use to install the dependencies in the same way. We recommend to use virtual environments at least for each reproduced paper, since some require specific versions of Python (2, 3, 3.7) and a virtual environment for each paper will maintain a good order in the code organization. Virtual environments can be created in different ways depending on the Python version and on the system. The Python Documentation describes the creation of virtual environments for Python >= 3.5, while the virtualenv Website can be used for Python 2.

Results

Top-N Recommendation Gender

Top-N Recommendation Gender

Top-N Recommendation Age

Top-N Recommendation Age

Rating Prediction Gender

Rating Prediction Gender

Rating Prediction Age

Rating Prediction Age

A self-supervised learning framework for audio-visual speech

AV-HuBERT (Audio-Visual Hidden Unit BERT) Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction Robust Self-Supervised A

Meta Research 431 Jan 07, 2023
Official implementation of Self-supervised Image-to-text and Text-to-image Synthesis

Self-supervised Image-to-text and Text-to-image Synthesis This is the official implementation of Self-supervised Image-to-text and Text-to-image Synth

6 Jul 31, 2022
PyTorch implementation of the ideas presented in the paper Interaction Grounded Learning (IGL)

Interaction Grounded Learning This repository contains a simple PyTorch implementation of the ideas presented in the paper Interaction Grounded Learni

Arthur Juliani 4 Aug 31, 2022
This is a repository of our model for weakly-supervised video dense anticipation.

Introduction This is a repository of our model for weakly-supervised video dense anticipation. More results on GTEA, Epic-Kitchens etc. will come soon

2 Apr 09, 2022
Malware Env for OpenAI Gym

Malware Env for OpenAI Gym Citing If you use this code in a publication please cite the following paper: Hyrum S. Anderson, Anant Kharkar, Bobby Fila

ENDGAME 563 Dec 29, 2022
Tackling data scarcity in Speech Translation using zero-shot multilingual Machine Translation techniques

Tackling data scarcity in Speech Translation using zero-shot multilingual Machine Translation techniques This repository is derived from the NMTGMinor

Tu Anh Dinh 1 Sep 07, 2022
This is project is the implementation of the DeepShift: Towards Multiplication-Less Neural Networks paper

DeepShift This is project is the implementation of the DeepShift: Towards Multiplication-Less Neural Networks paper, that aims to replace multiplicati

Mostafa Elhoushi 88 Dec 23, 2022
MiraiML: asynchronous, autonomous and continuous Machine Learning in Python

MiraiML Mirai: future in japanese. MiraiML is an asynchronous engine for continuous & autonomous machine learning, built for real-time usage. Usage In

Arthur Paulino 25 Jul 27, 2022
[ICML 2021, Long Talk] Delving into Deep Imbalanced Regression

Delving into Deep Imbalanced Regression This repository contains the implementation code for paper: Delving into Deep Imbalanced Regression Yuzhe Yang

Yuzhe Yang 568 Dec 30, 2022
Implementation of ICCV 2021 oral paper -- A Novel Self-Supervised Learning for Gaussian Mixture Model

SS-GMM Implementation of ICCV 2021 oral paper -- Self-Supervised Image Prior Learning with GMM from a Single Noisy Image with supplementary material R

HUST-The Tan Lab 4 Dec 05, 2022
Code-free deep segmentation for computational pathology

NoCodeSeg: Deep segmentation made easy! This is the official repository for the manuscript "Code-free development and deployment of deep segmentation

André Pedersen 26 Nov 23, 2022
Code for "Adversarial Attack Generation Empowered by Min-Max Optimization", NeurIPS 2021

Min-Max Adversarial Attacks [Paper] [arXiv] [Video] [Slide] Adversarial Attack Generation Empowered by Min-Max Optimization Jingkang Wang, Tianyun Zha

Jingkang Wang 12 Nov 23, 2022
Code repo for "Transformer on a Diet" paper

Transformer on a Diet Reference: C Wang, Z Ye, A Zhang, Z Zhang, A Smola. "Transformer on a Diet". arXiv preprint arXiv (2020). Installation pip insta

cgraywang 31 Sep 26, 2021
Making Structure-from-Motion (COLMAP) more robust to symmetries and duplicated structures

SfM disambiguation with COLMAP About Structure-from-Motion generally fails when the scene exhibits symmetries and duplicated structures. In this repos

Computer Vision and Geometry Lab 193 Dec 26, 2022
Tracking Progress in Question Answering over Knowledge Graphs

Tracking Progress in Question Answering over Knowledge Graphs Table of contents Question Answering Systems with Descriptions The QA Systems Table cont

Knowledge Graph Question Answering 47 Jan 02, 2023
CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability

This is the official repository of the paper: CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability A private copy of the

Fadi Boutros 33 Dec 31, 2022
PyTorch implementation of 'Gen-LaneNet: a generalized and scalable approach for 3D lane detection'

(pytorch) Gen-LaneNet: a generalized and scalable approach for 3D lane detection Introduction This is a pytorch implementation of Gen-LaneNet, which p

Yuliang Guo 233 Jan 06, 2023
Open source implementation of AceNAS: Learning to Rank Ace Neural Architectures with Weak Supervision of Weight Sharing

AceNAS This repo is the experiment code of AceNAS, and is not considered as an official release. We are working on integrating AceNAS as a built-in st

Yuge Zhang 6 Sep 07, 2022
Implementing a simplified copy of Shazam application from scratch using MinHashing and LSH.

Building Shazam from scratch In this repository we tried to implement a simplified copy of the Shazam application able to tell you the name of a song

Arturo Ghinassi 0 Nov 17, 2022
AquaTimer - Programmable Timer for Aquariums based on ATtiny414/814/1614

AquaTimer - Programmable Timer for Aquariums based on ATtiny414/814/1614 AquaTimer is a programmable timer for 12V devices such as lighting, solenoid

Stefan Wagner 4 Jun 13, 2022