Visualizer using audio and semantic analysis to explore BigGAN (Brock et al., 2018) latent space.

Last update: Nov 21, 2022

Overview

BigGAN Audio Visualizer

Description

This visualizer explores BigGAN (Brock et al., 2018) latent space by using pitch/tempo of an audio file to generate and interpolate between noise/class vector inputs to the model. Classes are chosen manually or optionally using semantic similarity on BERT encodings of a lyrics corpus.

Usage:

usage: visualize.py [-h] -s SONG [--resolution {128,256,512}] [-d DURATION]
               [-ps [200-295]] [-ts [0.05-0.8]]
               [--classes CLASSES [CLASSES ...]] [-n NUM_CLASSES]
               [--jitter [0-1]] [--frame_length i*2^6] [--truncation [0.1-1]]
               [--smooth_factor [10-30]] [--batch_size BATCH_SIZE]
               [-o OUTPUT_FILE] [--use_last_vectors] [--use_last_classes]
               [-l LYRICS]

Arguments

short	long	default	range	help
`-h`	`--help`			show this help message and exit
`-s`	`--song`	`input/romantic.mp3`		path to input audio file
	`--resolution`	`512`	`{128,256,512}`	output video resolution
`-d`	`--duration`	`None`		output video duration
`-ps`	`--pitch_sensitivity`	`220`	`[200-295]`	controls the sensitivity of the class vector to changes in pitch
`-ts`	`--tempo_sensitivity`	`0.25`	`[0.05-0.8]`	controls the sensitivity of the noise vector to changes in volume and tempo
	`--classes`	`None`		manually specify [--num_classes] ImageNet classes
`-n`	`--num_classes`	`12`		number of unique classes to use
	`--jitter`	`0.5`	`[0-1]`	controls jitter of the noise vector to reduce repitition
	`--frame_length`	`512`	`i*2^6`	number of audio frames to video frames in the output
	`--truncation`	`1`	`[0.1-1]`	BigGAN truncation parameter controls complexity of structure within frames
	`--smooth_factor`	`20`	`[10-30]`	controls interpolation between class vectors to smooth rapid flucations
	`--batch_size`	`30`		BigGAN batch_size
`-o`	`--output_file`			name of output file stored in output/, defaults to [--song] path base_name
	`--use_last_vectors`	`False`		set flag to use previous saved class/noise vectors
	`--use_last_classes`	`False`		set flag to use previous classes
`-l`	`--lyrics`	`None`		path to lyrics file; setting [--lyrics LYRICS] computes classes by semantic similarity under BERT encodings

Visualizer using audio and semantic analysis to explore BigGAN (Brock et al., 2018) latent space.

Related tags

Overview

BigGAN Audio Visualizer

Description

Usage:

Arguments

Owner

Rush Kapoor

PyTorch implementation of the TTC algorithm

A Collection of LiDAR-Camera-Calibration Papers, Toolboxes and Notes

[ICCV 2021] Our work presents a novel neural rendering approach that can efficiently reconstruct geometric and neural radiance fields for view synthesis.

Open Source Differentiable Computer Vision Library for PyTorch

A brand new hub for Scene Graph Generation methods based on MMdetection (2021). The pipeline of from detection, scene graph generation to downstream tasks (e.g., image cpationing) is supported. Pytorch version implementation of HetH (ECCV 2020) and TopicSG (ICCV 2021) is included.

Commonality in Natural Images Rescues GANs: Pretraining GANs with Generic and Privacy-free Synthetic Data - Official PyTorch Implementation (CVPR 2022)

Implementation of Uniformer, a simple attention and 3d convolutional net that achieved SOTA in a number of video classification tasks

Code for "Learning Structural Edits via Incremental Tree Transformations" (ICLR'21)

Sharing of contents on mitochondrial encounter networks

GBK-GNN: Gated Bi-Kernel Graph Neural Networks for Modeling Both Homophily and Heterophily

Pytorch implementation of our method for regularizing nerual radiance fields for few-shot neural volume rendering.

This is an implementation for the CVPR2020 paper "Learning Invariant Representation for Unsupervised Image Restoration"

High-Fidelity Pluralistic Image Completion with Transformers (ICCV 2021)

TSDF++: A Multi-Object Formulation for Dynamic Object Tracking and Reconstruction

cisip-FIRe - Fast Image Retrieval

Multi Agent Reinforcement Learning for ROS in 2D Simulation Environments

Created as part of CS50 AI's coursework. This AI makes use of knowledge entailment to calculate the best probabilities to win Minesweeper.

A curated list of awesome game datasets, and tools to artificial intelligence in games

Reimplementation of Dynamic Multi-scale filters for Semantic Segmentation.

PyTorch implementation of SQN based on CloserLook3D's encoder