clipit

Yet Another VQGAN-CLIP Codebase

This started as a fork of @nerdyrodent's VQGAN-CLIP code, which was itself based on the notebooks of @RiversWithWings and @advadnoun, but it quickly morphed into a version tuned up with slightly different behavior and features. It runs at the command line, in a notebook, or (soon) in batch mode.

Basically this is a version of the notebook with opinionated defaults and slightly different internals. You are welcome to use it if you'd like.

For now, check out THE DEMO NOTEBOOKS, especially the super simple "Start Here" colab.

Citations

@misc{unpublished2021clip,
    title  = {CLIP: Connecting Text and Images},
    author = {Alec Radford and Ilya Sutskever and Jong Wook Kim and Gretchen Krueger and Sandhini Agarwal},
    year   = {2021}
}

@misc{esser2020taming,
      title={Taming Transformers for High-Resolution Image Synthesis}, 
      author={Patrick Esser and Robin Rombach and Björn Ommer},
      year={2020},
      eprint={2012.09841},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Katherine Crowson - https://github.com/crowsonkb

Adverb - https://twitter.com/advadnoun

CLIP + VQGAN / PixelDraw

Owner

dribnet
