Repo for my Tensorflow/Keras CV experiments. Mostly revolving around the Danbooru20xx dataset

Overview

SW-CV-ModelZoo

Repo for my Tensorflow/Keras CV experiments. Mostly revolving around the Danbooru20xx dataset


Framework: TF/Keras 2.7

Training SQLite DB built using fire-egg's tools: https://github.com/fire-eggs/Danbooru2019

Currently training on Danbooru2021, 512px SFW subset (sans the rating:q images that had been included in the 2022-01-21 release of the dataset)

Reference:

Anonymous, The Danbooru Community, & Gwern Branwen; “Danbooru2021: A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset”, 2022-01-21. Web. Accessed 2022-01-28 https://www.gwern.net/Danbooru2021


Journal

06/02/2022: great news crew! TRC allowed me to use a bunch of TPUs!

To make better use of this amount of compute I had to overhaul a number of components, so a bunch of things are likely to have fallen to bitrot in the process. I can only guarantee NFNet can work pretty much as before with the right arguments.
NFResNet changes should have left it retrocompatible with the previous version.
ResNet has been streamlined to be mostly in line with the Bag-of-Tricks paper (arXiv:1812.01187) with the exception of the stem. It is not compatible with the previous version of the code.

The training labels have been included in the 2021_0000_0899 folder for convenience.
The list of files used for training is going to be uploaded as a GitHub Release.

Now for some numbers:
compared to my previous best run, the one that resulted in NFNetL1V1-100-0.57141:

  • I'm using 1.86x the amount of images: 2.8M vs 1.5M
  • I'm training bigger models: 61M vs 45M params
  • ... in less time: 232 vs 700 hours of processor time
  • don't get me started on actual wall clock time
  • with a few amenities thrown in: ECA for channel attention, SiLU activation

And it's all thanks to the folks at TRC, so shout out to them!

I currently have a few runs in progress across a couple of dimensions:

  • effect of model size with NFNet L0/L1/L2, with SiLU and ECA for all three of them
  • effect of activation function with NFNet L0, with SiLU/HSwish/ReLU, no ECA

Once the experiments are over, the plan is to select the network definitions that lay on the Pareto curve between throughput and F1 score and release the trained weights.

One last thing.
I'd like to call your attention to the tools/cleanlab_stuff.py script.
It reads two files: one with the binarized labels from the database, the other with the predicted probabilities.
It then uses the cleanlab package to estimate whether if an image in a set could be missing a given label. At the end it stores its conclusions in a json file.
This file could, potentially, be used in some tool to assist human intervention to add the missing tags.

You might also like...
Human head pose estimation using Keras over TensorFlow.
Human head pose estimation using Keras over TensorFlow.

RealHePoNet: a robust single-stage ConvNet for head pose estimation in the wild.

Graph Neural Networks with Keras and Tensorflow 2.

Welcome to Spektral Spektral is a Python library for graph deep learning, based on the Keras API and TensorFlow 2. The main goal of this project is to

QKeras: a quantization deep learning library for Tensorflow Keras

QKeras github.com/google/qkeras QKeras 0.8 highlights: Automatic quantization using QKeras; Stochastic behavior (including stochastic rouding) is disa

Hyperparameter Optimization for TensorFlow, Keras and PyTorch
Hyperparameter Optimization for TensorFlow, Keras and PyTorch

Hyperparameter Optimization for Keras Talos • Key Features • Examples • Install • Support • Docs • Issues • License • Download Talos radically changes

MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch Onnx and CoreML.
MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch Onnx and CoreML.

MMdnn MMdnn is a comprehensive and cross-framework tool to convert, visualize and diagnose deep learning (DL) models. The "MM" stands for model manage

Deep GPs built on top of TensorFlow/Keras and GPflow

GPflux Documentation | Tutorials | API reference | Slack What does GPflux do? GPflux is a toolbox dedicated to Deep Gaussian processes (DGP), the hier

tf2onnx - Convert TensorFlow, Keras and Tflite models to ONNX.

tf2onnx converts TensorFlow (tf-1.x or tf-2.x), tf.keras and tflite models to ONNX via command line or python api.

Build tensorflow keras model pipelines in a single line of code. Created by Ram Seshadri. Collaborators welcome. Permission granted upon request.
Build tensorflow keras model pipelines in a single line of code. Created by Ram Seshadri. Collaborators welcome. Permission granted upon request.

deep_autoviml Build keras pipelines and models in a single line of code! Table of Contents Motivation How it works Technology Install Usage API Image

Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)
Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)

Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)

Releases(models_db2021_5500_2022_10_21)
  • models_db2021_5500_2022_10_21(Oct 21, 2022)

    ConvNext B, ViT B16
    Trained on Danbooru2021 512px SFW subset, modulos 0000-0899
    top 5500 tags (2021_0000_0899_5500/selected_tags.csv)
    alpha to white
    padding to make the image square is white
    channel order is BGR, input is 0...255, scaled to -1...1 within the model

    | run_name | definition_name | params_human | image_size | thres | F1 | |:---------------------------------|:------------------|:---------------|-------------:|--------:|-------:| | ConvNextBV1_09_25_2022_05h13m55s | B | 93.2M | 448 | 0.3673 | 0.6941 | | ViTB16_09_25_2022_04h53m38s | B16 | 90.5M | 448 | 0.3663 | 0.6918 |

    Source code(tar.gz)
    Source code(zip)
    ConvNextBV1_09_25_2022_05h13m55s.7z(322.58 MB)
    ViTB16_09_25_2022_04h53m38s.7z(312.96 MB)
  • convnexts_db2021_2022_03_22(Mar 22, 2022)

    ConvNext, T/S/B
    Trained on Danbooru2021 512px SFW subset, modulos 0000-0899
    alpha to white
    padding to make the image square is white
    channel order is BGR, input is 0...255, scaled to -1...1 within the model

    | run_name | definition_name | params_human | image_size | thres | F1 | |:---------------------------------|:------------------|:---------------|-------------:|--------:|-------:| | ConvNextBV1_03_10_2022_21h41m23s | B | 90.01M | 448 | 0.3372 | 0.6892 | | ConvNextSV1_03_11_2022_17h49m56s | S | 51.28M | 384 | 0.3301 | 0.6798 | | ConvNextTV1_03_05_2022_15h56m42s | T | 29.65M | 320 | 0.3259 | 0.6595 |

    Source code(tar.gz)
    Source code(zip)
    ConvNextBV1_03_10_2022_21h41m23s.7z(311.29 MB)
    ConvNextSV1_03_11_2022_17h49m56s.7z(177.36 MB)
    ConvNextTV1_03_05_2022_15h56m42s.7z(102.96 MB)
  • nfnets_db2021_2022_03_04(Mar 4, 2022)

    NFNet, L0/L1/L2 (based on timm Lx model definitions) Trained on Danbooru2021 512px SFW subset, modulos 0000-0899 alpha to white padding to make the image square is white channel order is BGR, input is 0...255, scaled to -1...1 within the model

    | run_name | definition_name | params_human | image_size | thres | F1 | |:---------------------------------|:------------------|:---------------|-------------:|--------:|-------:| | NFNetL2V1_02_20_2022_10h27m08s | L2 | 60.96M | 448 | 0.3231 | 0.6785 | | NFNetL1V1_02_17_2022_20h18m38s | L1 | 45.65M | 384 | 0.3259 | 0.6691 | | NFNetL0V1_02_10_2022_17h50m14s | L0 | 27.32M | 320 | 0.3190 | 0.6509 |

    Source code(tar.gz)
    Source code(zip)
    NFNetL0V1_02_10_2022_17h50m14s.7z(94.98 MB)
    NFNetL1V1_02_17_2022_20h18m38s.7z(157.97 MB)
    NFNetL2V1_02_20_2022_10h27m08s.7z(210.49 MB)
  • nfnet_tpu_training(Feb 6, 2022)

  • NFNetL1V1-100-0.57141(Dec 31, 2021)

    • NFNet, L1 (based on timm Lx model definitions), 100 epochs, F1 @ 0.4 at the end of the 100th epoch was 0.57141
    • Trained on Danbooru2020 512px SFW subset, modulos 0000-0599
    • 320px per side
    • alpha to white
    • padding to make the image square is white
    • channel order is BGR, scaled to 0-1
    • mixup alpha = 0.2 during epochs 76-100
    • analyze_metrics on Danbooru2020 original set, modulos 0984-0999: {'thres': 0.3485, 'F1': 0.6133, 'F2': 0.6133, 'MCC': 0.6094, 'A': 0.9923, 'R': 0.6133, 'P': 0.6133}
    • analyze_metrics on image IDs 4970000-5000000: {'thres': 0.3148, 'F1': 0.5942, 'F2': 0.5941, 'MCC': 0.5892, 'A': 0.9901, 'R': 0.5940, 'P': 0.5943}
    Source code(tar.gz)
    Source code(zip)
    NFNetL1V1-100-0.57141.7z(158.09 MB)
Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images"

GANInversion_with_ConsecutiveImgs Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images" https://a

QingyangXu 38 Dec 07, 2022
A semismooth Newton method for elliptic PDE-constrained optimization

sNewton4PDEOpt The Python module implements a semismooth Newton method for solving finite-element discretizations of the strongly convex, linear ellip

2 Dec 08, 2022
Data and Code for paper Outlining and Filling: Hierarchical Query Graph Generation for Answering Complex Questions over Knowledge Graph is available for research purposes.

Data and Code for paper Outlining and Filling: Hierarchical Query Graph Generation for Answering Complex Questions over Knowledge Graph is available f

Yongrui Chen 5 Nov 10, 2022
CVPR 2021: "The Spatially-Correlative Loss for Various Image Translation Tasks"

Spatially-Correlative Loss arXiv | website We provide the Pytorch implementation of "The Spatially-Correlative Loss for Various Image Translation Task

Chuanxia Zheng 89 Jan 04, 2023
This is the code for HOI Transformer

HOI Transformer Code for CVPR 2021 accepted paper End-to-End Human Object Interaction Detection with HOI Transformer. Reproduction We recomend you to

BigBangEpoch 124 Dec 29, 2022
PyTorch Implementation of Backbone of PicoDet

PicoDet-Backbone PyTorch Implementation of Backbone of PicoDet Original Implementation is implemented on PaddlePaddle. Example picodet_l_backbone = ES

Yonghye Kwon 7 Jul 12, 2022
Supplementary code for SIGGRAPH 2021 paper: Discovering Diverse Athletic Jumping Strategies

SIGGRAPH 2021: Discovering Diverse Athletic Jumping Strategies project page paper demo video Prerequisites Important Notes We suspect there are bugs i

54 Dec 06, 2022
Emblaze - Interactive Embedding Comparison

Emblaze - Interactive Embedding Comparison Emblaze is a Jupyter notebook widget for visually comparing embeddings using animated scatter plots. It bun

CMU Data Interaction Group 77 Nov 24, 2022
The fastest way to visualize GradCAM with your Keras models.

VizGradCAM VizGradCam is the fastest way to visualize GradCAM in Keras models. GradCAM helps with providing visual explainability of trained models an

58 Nov 19, 2022
Multiband spectro-radiometric satellite image analysis with K-means cluster algorithm

Multi-band Spectro Radiomertric Image Analysis with K-means Cluster Algorithm Overview Multi-band Spectro Radiomertric images are images comprising of

Chibueze Henry 6 Mar 16, 2022
Multitask Learning Strengthens Adversarial Robustness

Multitask Learning Strengthens Adversarial Robustness

Columbia University 15 Jun 10, 2022
Dataset and Source code of paper 'Enhancing Keyphrase Extraction from Academic Articles with their Reference Information'.

Enhancing Keyphrase Extraction from Academic Articles with their Reference Information Overview Dataset and code for paper "Enhancing Keyphrase Extrac

15 Nov 24, 2022
통일된 DataScience 폴더 구조 제공 및 가상환경 작업의 부담감 해소

Lucas coded by linux shell 목차 Mac버전 CookieCutter (autoenv) 1.How to Install autoenv 2.폴더 진입 시, activate 구현하기 3.폴더 탈출 시, deactivate 구현하기 4.Alias 설정하기 5

ello 3 Feb 21, 2022
[CVPR2021 Oral] End-to-End Video Instance Segmentation with Transformers

VisTR: End-to-End Video Instance Segmentation with Transformers This is the official implementation of the VisTR paper: Installation We provide instru

Yuqing Wang 687 Jan 07, 2023
Streamlit tool to explore coco datasets

What is this This tool given a COCO annotations file and COCO predictions file will let you explore your dataset, visualize results and calculate impo

Jakub Cieslik 75 Dec 16, 2022
'Aligned mixture of latent dynamical systems' (amLDS) for stimulus decoding probabilistic manifold alignment across animals. P. Herrero-Vidal et al. NeurIPS 2021 code.

Across-animal odor decoding by probabilistic manifold alignment (NeurIPS 2021) This repository is the official implementation of aligned mixture of la

Pedro Herrero-Vidal 3 Jul 12, 2022
A toolset for creating Qualtrics-based IAT experiments

Qualtrics IAT Tool A web app for generating the Implicit Association Test (IAT) running on Qualtrics Online Web App The app is hosted by Streamlit, a

0 Feb 12, 2022
A unified 3D Transformer Pipeline for visual synthesis

Overview This is the official repo for the paper: NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion. NÜWA is a unified multimodal p

Microsoft 2.6k Jan 06, 2023
Official implementation of AAAI-21 paper "Label Confusion Learning to Enhance Text Classification Models"

Description: This is the official implementation of our AAAI-21 accepted paper Label Confusion Learning to Enhance Text Classification Models. The str

101 Nov 25, 2022
Attention over nodes in Graph Neural Networks using PyTorch (NeurIPS 2019)

Intro This repository contains code to generate data and reproduce experiments from our NeurIPS 2019 paper: Boris Knyazev, Graham W. Taylor, Mohamed R

Boris Knyazev 242 Jan 06, 2023