ICML 21 - Voice2Series: Reprogramming Acoustic Models for Time Series Classification

Last update: Jan 03, 2023

Overview

Voice2Series-Reprogramming

Voice2Series: Reprogramming Acoustic Models for Time Series Classification

International Conference on Machine Learning (ICML), 2021 | Paper | Colab Demo

Environment

Tensorflow 2.2 (CUDA=10.0) and Kapre 0.2.0.

Noted: Echo to many interests from the community, we will also provide Pytorch V2S layers and frameworks around this September, incoperating the new torch audio layers. Feel free to email the authors for further collaboration.
option 1 (from yml)

conda env create -f V2S.yml

option 2 (from clean python 3.6)

pip install tensorflow-gpu==2.1.0
pip install kapre==0.2.0
pip install h5py==2.10.0

Training

This is tengible Version. Please also check the paper for actual validation details. Many Thanks!

python v2s_main.py --dataset 0 --eps 100 --mapping 3

Result

seg idx: 0 --> start: 0, end: 500
seg idx: 1 --> start: 5000, end: 5500
seg idx: 2 --> start: 10000, end: 10500
Tensor("AddV2_2:0", shape=(None, 16000, 1), dtype=float32)
--- Preparing Masking Matrix
Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 500, 1)]     0                                            
__________________________________________________________________________________________________
zero_padding1d (ZeroPadding1D)  (None, 16000, 1)     0           input_1[0][0]                    
__________________________________________________________________________________________________
tf_op_layer_AddV2 (TensorFlowOp [(None, 16000, 1)]   0           zero_padding1d[0][0]             
__________________________________________________________________________________________________
zero_padding1d_1 (ZeroPadding1D (None, 16000, 1)     0           input_1[0][0]                    
__________________________________________________________________________________________________
tf_op_layer_AddV2_1 (TensorFlow [(None, 16000, 1)]   0           tf_op_layer_AddV2[0][0]          
                                                                 zero_padding1d_1[0][0]           
__________________________________________________________________________________________________
zero_padding1d_2 (ZeroPadding1D (None, 16000, 1)     0           input_1[0][0]                    
__________________________________________________________________________________________________
tf_op_layer_AddV2_2 (TensorFlow [(None, 16000, 1)]   0           tf_op_layer_AddV2_1[0][0]        
                                                                 zero_padding1d_2[0][0]           
__________________________________________________________________________________________________
art_layer (ARTLayer)            (None, 16000, 1)     16000       tf_op_layer_AddV2_2[0][0]        
__________________________________________________________________________________________________
reshape_1 (Reshape)             (None, 16000)        0           art_layer[0][0]                  
__________________________________________________________________________________________________
model (Model)                   (None, 36)           1292911     reshape_1[0][0]                  
__________________________________________________________________________________________________
tf_op_layer_MatMul (TensorFlowO [(None, 6)]          0           model[1][0]                      
__________________________________________________________________________________________________
tf_op_layer_Shape (TensorFlowOp [(2,)]               0           tf_op_layer_MatMul[0][0]         
__________________________________________________________________________________________________
tf_op_layer_strided_slice (Tens [()]                 0           tf_op_layer_Shape[0][0]          
__________________________________________________________________________________________________
tf_op_layer_Reshape_2/shape (Te [(3,)]               0           tf_op_layer_strided_slice[0][0]  
__________________________________________________________________________________________________
tf_op_layer_Reshape_2 (TensorFl [(None, 2, 3)]       0           tf_op_layer_MatMul[0][0]         
                                                                 tf_op_layer_Reshape_2/shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mean (TensorFlowOpL [(None, 2)]          0           tf_op_layer_Reshape_2[0][0]      
==================================================================================================
Total params: 1,308,911
Trainable params: 217,225
Non-trainable params: 1,091,686
__________________________________________________________________________________________________
Epoch 1/100
2021-07-19 01:43:32.690913: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-07-19 01:43:32.919343: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
113/113 [==============================] - 6s 50ms/step - loss: 0.0811 - accuracy: 1.0000 - val_loss: 1.5589e-04 - val_accuracy: 1.0000
Epoch 2/100
113/113 [==============================] - 5s 41ms/step - loss: 5.0098e-05 - accuracy: 1.0000 - val_loss: 1.0906e-05 - val_accuracy: 1.0000

Class Activation Mapping

python cam_v2s.py --dataset 5 --weight wNo5_map6-88-0.7662.h5 --mapping 6 --layer conv2d_1

Reference

Voice2Series: Reprogramming Acoustic Models for Time Series Classification

@InProceedings{pmlr-v139-yang21j,
  title = 	 {Voice2Series: Reprogramming Acoustic Models for Time Series Classification},
  author =       {Yang, Chao-Han Huck and Tsai, Yun-Yun and Chen, Pin-Yu},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {11808--11819},
  year = 	 {2021},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
}

ICML 21 - Voice2Series: Reprogramming Acoustic Models for Time Series Classification

Related tags

Overview

Voice2Series-Reprogramming

Environment

Training

Class Activation Mapping

Reference

Owner

On-device wake word detection powered by deep learning.

An OpenAI-Gym Package for Training and Testing Reinforcement Learning algorithms with OpenSim Models

Repo for parser tensorflow(.pb) and tflite(.tflite)

Deep motion transfer

Python package facilitating the use of Bayesian Deep Learning methods with Variational Inference for PyTorch

Alias-Free Generative Adversarial Networks (StyleGAN3) Official PyTorch implementation

An Inverse Kinematics library aiming performance and modularity

FS2KToolbox FS2K Dataset Towards the translation between Face

Kaggleship: Kaggle Notebooks

Repositório da disciplina de APC, no segundo semestre de 2021

Probabilistic Programming and Statistical Inference in PyTorch

This project helps to colorize grayscale images using multiple exemplars.

Keyhole Imaging: Non-Line-of-Sight Imaging and Tracking of Moving Objects Along a Single Optical Path

ROS-UGV-Control-Interface - Control interface which can be used in any UGV

The source code for 'Noisy-Labeled NER with Confidence Estimation' accepted by NAACL 2021

Multi-Scale Aligned Distillation for Low-Resolution Detection (CVPR2021)

3rd Place Solution for ICCV 2021 Workshop SSLAD Track 3A - Continual Learning Classification Challenge

(JMLR'19) A Python Toolbox for Scalable Outlier Detection (Anomaly Detection)

Official Implementation of SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations

ECCV2020 paper: Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code and Data.