Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-wise Distributed Data based on Pytorch Framework

Last update: Sep 18, 2022

Overview

VFedPCA+VFedAKPCA

This is the official source code for the paper "Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-wise Distributed Data based on Pytorch Framework".

Despite enormous research interest and the rapid application of federated learning (FL) to various areas, existing studies mostly focus on supervised federated learning under the horizontally partitioned local dataset setting. This paper studies unsupervised FL under the vertically partitioned dataset setting, where each client holds a different subset of the features.

Server-Clients Architecture

Figure: Server-Clients Architecture

Master Branch

VFedPCA+VFedAKPCA
├── case                        // Case Studies
│   ├── figs                    // Save experimental results' figures in '.eps' / '.png' format
│   │   ├── img_name*.eps
│   │   └── img_name*.png
│   ├── main.py
│   ├── model.py
│   └── utils.py
├── dataset                     // Put downloaded dataset in this folder
├── figs                        // Save experimental results' figures in '.eps' / '.png' format
│   ├── img_name*.eps
│   └── img_name*.png
├── README.md
├── main.py                     // Experiment on Structured Dataset
├── model.py
└── utils.py

Environments

python = 3.8.8
numpy = 1.20.1
pandas = 1.2.4
scikit-learn = 0.24.1
scipy = 1.6.2
imageio = 2.9.0
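
To confirm that a local setup matches the versions listed above, a small check along the following lines may help. This is a minimal sketch and not part of the repository: the `PINNED` dictionary and the `check_environment` helper are hypothetical names introduced here, and the PyPI package names are assumed to match the names in the list.

```python
from importlib import metadata

# Versions copied from the Environments list; the dict name is hypothetical.
PINNED = {
    "numpy": "1.20.1",
    "pandas": "1.2.4",
    "scikit-learn": "0.24.1",
    "scipy": "1.6.2",
    "imageio": "2.9.0",
}


def check_environment(pinned=PINNED):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_environment().items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means every listed package is installed at the listed version; newer versions may still work, but that is not stated in this README.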

Prepare Dataset

To evaluate our method, we use five types of real-world datasets of distinct nature:

structured datasets from different domains;
medical image dataset;
face image dataset;
gait image dataset;
person re-identification image dataset.

Step 1: Download Dataset from the Google Drive URL

Step 2: Specify Dataset Path by Command Argument

$ python main.py --data_path="./dataset/xxx"

Experiments

We conduct extensive experiments on structured datasets to examine the effects of feature size, local iterations, warm-start power iterations, and the weight scaling method. Furthermore, we present case studies on image datasets to demonstrate the effectiveness of VFedPCA and VFedAKPCA.

A. Experiment on Structured Dataset

First, choose the dataset.

python main.py --data_path './dataset/College.csv' --batch_size 160

Then set flag, p_list, iter_list, and sampler_num to examine the effects of feature size, local iterations, warm-start power iterations, and the weight scaling method on structured datasets. For example:

flag = 'clients'
p_list = [3, 5, 10]         # the number of involved clients
iter_list = [100, 100, 100] # the number of local power iterations
sampler_num = 5
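
To make the roles of the client count and the local power iterations concrete, the following is a minimal NumPy sketch of power iteration on vertically partitioned data. It is not the code in model.py. It assumes that each client runs power iteration on the sample-space matrix built from its own feature block, and that the server weights each local vector by its share of the leading eigenvalues; both are assumptions made for illustration, as are all function names and the synthetic data.

```python
import numpy as np


def split_features(X, num_clients):
    """Vertically partition X (samples x features) into per-client column blocks."""
    return np.array_split(X, num_clients, axis=1)


def local_gram(X_i):
    """Sample-space matrix (n x n) built from one client's centered features."""
    centered = X_i - X_i.mean(axis=0)
    return centered @ centered.T / X_i.shape[0]


def power_iteration(A, num_iters, seed=0):
    """Approximate the leading eigenpair of a symmetric PSD matrix A."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = A @ v
        v = w / np.linalg.norm(w)
    return float(v @ A @ v), v  # Rayleigh quotient and unit vector


def federated_leading_vector(X, num_clients, num_iters):
    """Assumed aggregation: weight each local vector by its eigenvalue share."""
    results = [power_iteration(local_gram(X_i), num_iters)
               for X_i in split_features(X, num_clients)]
    eigvals = np.array([val for val, _ in results])
    weights = eigvals / eigvals.sum()
    reference = results[0][1]
    merged = np.zeros_like(reference)
    for w, (_, vec) in zip(weights, results):
        # Eigenvectors are sign-ambiguous, so align each one with the first client.
        sign = 1.0 if vec @ reference >= 0 else -1.0
        merged += w * sign * vec
    return weights, merged / np.linalg.norm(merged)


# Synthetic stand-in data (60 samples, 12 features), not one of the paper's datasets.
X = np.random.default_rng(42).standard_normal((60, 12))
weights, u = federated_leading_vector(X, num_clients=3, num_iters=100)
```

Here `num_clients=3` and `num_iters=100` mirror the first entries of p_list and iter_list above. Because every client sees the same samples, all local vectors have length equal to the number of samples, which is what makes a weighted sum on the server possible in this sketch.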

B. Case Studies

python main.py --data_path '../dataset/Image/DeepLesion' \
               --client_num 8 \
               --iterations 100 \
               --re_size 512
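
For readers who want to see how such flags are typically consumed, here is a minimal argparse sketch. The flag names come from the command above; the types, defaults, help texts, and the `build_parser` helper are assumptions for illustration and may differ from what case/main.py actually defines.

```python
import argparse


def build_parser():
    """Build a parser for the case-study flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Case-study runner (illustrative sketch)")
    parser.add_argument("--data_path", type=str, required=True,
                        help="path to the image dataset folder")
    parser.add_argument("--client_num", type=int, default=8,
                        help="number of clients (assumed meaning)")
    parser.add_argument("--iterations", type=int, default=100,
                        help="number of power iterations (assumed meaning)")
    parser.add_argument("--re_size", type=int, default=512,
                        help="image resize target (assumed meaning)")
    return parser


# Parse the same arguments as the command above, passed as a list for demonstration.
args = build_parser().parse_args([
    "--data_path", "../dataset/Image/DeepLesion",
    "--client_num", "8",
    "--iterations", "100",
    "--re_size", "512",
])
```

In a real script, `parse_args()` would be called without a list so that the values come from the command line.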

Citation

@inproceedings{cheung2021vfedpca,
  title = {{Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-wise Distributed Data}},
  author = {Yiu-ming Cheung and Feng Yu and Jian Lou},
  year = 2021
}
