A CNN implementation using only numpy. Supports multidimensional images, stride, etc.

Last update: Nov 30, 2021

CNN from scratch

The most interesting part is the file neural_networks/layers.py: code for a convolutional neural network based only on NumPy (no PyTorch or TensorFlow). It is therefore quite foundational and illustrates how CNNs work mathematically.

The CNN is compatible with colour images (3-channel RGB), includes pooling layers (class Pool2D), and works with any given (valid) stride.
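As a rough illustration of what a "valid" stride could mean, here is a minimal sketch of the usual output-size arithmetic for a convolution or pooling window. The helper name conv_output_size and the rule that the window must tile the padded input exactly are assumptions made for this example, not taken from the repository.

```python
def conv_output_size(n, k, s, p=0):
    """Output length along one axis for input length n, window size k,
    stride s and symmetric padding p (hypothetical helper).

    Assumption: a stride counts as valid only if the window tiles the
    padded input exactly, i.e. (n + 2p - k) is divisible by s.
    """
    span = n + 2 * p - k
    if span < 0:
        raise ValueError("window is larger than the padded input")
    if span % s != 0:
        raise ValueError("stride does not tile the padded input exactly")
    return span // s + 1


# A 7-pixel axis with a 3-pixel kernel and stride 2 gives 3 output positions.
rows = conv_output_size(7, 3, 2)
# Padding 2 with a 5-pixel kernel and stride 1 preserves the input size.
same = conv_output_size(32, 5, 1, p=2)
```

The same arithmetic applies independently to rows and columns, so non-square images and kernels only need two calls.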

neural_networks/activations.py contains basic activation functions, such as ReLU and SoftMax, with the forward and backward implementations (including the Jacobian computation) needed for backpropagation.
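To make the forward / backward pairing concrete, here is a minimal sketch of ReLU and softmax with their backward passes. The function names and the (batch, classes) layout are assumptions for this example and may differ from the classes in activations.py. The softmax backward pass builds the per-sample Jacobian diag(s) - s s^T explicitly, which is easy to read but not the cheapest option.

```python
import numpy as np


def relu_forward(Z):
    # Element-wise max(0, z).
    return np.maximum(0, Z)


def relu_backward(Z, dY):
    # The Jacobian of ReLU is diagonal, so the backward pass is a mask.
    return dY * (Z > 0)


def softmax_forward(Z):
    # Subtract the row-wise max for numerical stability.
    e = np.exp(Z - Z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


def softmax_backward(Z, dY):
    S = softmax_forward(Z)
    k = S.shape[1]
    # Per-sample Jacobian J[i] = diag(S[i]) - outer(S[i], S[i]).
    J = np.einsum('ij,jk->ijk', S, np.eye(k)) - np.einsum('ij,ik->ijk', S, S)
    # Chain rule: dZ[i] = dY[i] @ J[i].
    return np.einsum('ij,ijk->ik', dY, J)


rng = np.random.default_rng(0)
Z = rng.normal(size=(4, 3))
dY = rng.normal(size=(4, 3))
S = softmax_forward(Z)
dZ = softmax_backward(Z, dY)
```

The explicit Jacobian can be avoided with the equivalent closed form S * (dY - sum(dY * S, axis=1)), which gives the same result without the (batch, classes, classes) intermediate.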

Many functions rely heavily on array slicing, which speeds up training significantly. See, for example, Conv2D.forward:

for x in range(out_rows):
    for y in range(out_cols):
        out[:,x,y,:] = np.apply_over_axes(np.sum, W[None]*X_pad[:,x*s:x*s+kernel_height,y*s:y*s+kernel_width,:][...,None], [1,2,3])[:,0,0,0,:]

This is the sliced version of a depth-6 nested for loop and therefore allows for a significant speedup (on my computer, more than 20x for the given training data).
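The following sketch wraps the sliced loop above in a standalone function and checks it against explicit element-by-element loops. The function names, the array layouts (X_pad as batch, rows, cols, channels; W as kernel_height, kernel_width, in_channels, out_channels) and the omission of a bias term are assumptions inferred from the snippet, not a copy of Conv2D.forward.

```python
import numpy as np


def conv_forward_sliced(X_pad, W, s):
    """Sliced forward pass; X_pad is assumed to be padded already."""
    n, in_rows, in_cols, _ = X_pad.shape
    kernel_height, kernel_width, _, n_filters = W.shape
    out_rows = (in_rows - kernel_height) // s + 1
    out_cols = (in_cols - kernel_width) // s + 1
    out = np.zeros((n, out_rows, out_cols, n_filters))
    for x in range(out_rows):
        for y in range(out_cols):
            # Window of shape (n, kernel_height, kernel_width, in_channels).
            window = X_pad[:, x*s:x*s+kernel_height, y*s:y*s+kernel_width, :]
            # Broadcast against W, then sum over the kernel and channel axes.
            prod = W[None] * window[..., None]
            out[:, x, y, :] = np.apply_over_axes(np.sum, prod, [1, 2, 3])[:, 0, 0, 0, :]
    return out


def conv_forward_naive(X_pad, W, s):
    """Reference version with one explicit loop per index."""
    n, in_rows, in_cols, in_ch = X_pad.shape
    kernel_height, kernel_width, _, n_filters = W.shape
    out_rows = (in_rows - kernel_height) // s + 1
    out_cols = (in_cols - kernel_width) // s + 1
    out = np.zeros((n, out_rows, out_cols, n_filters))
    for i in range(n):
        for x in range(out_rows):
            for y in range(out_cols):
                for f in range(n_filters):
                    for a in range(kernel_height):
                        for b in range(kernel_width):
                            for c in range(in_ch):
                                out[i, x, y, f] += X_pad[i, x*s+a, y*s+b, c] * W[a, b, c, f]
    return out


rng = np.random.default_rng(0)
X_pad = rng.normal(size=(2, 7, 7, 3))   # two small RGB-like images
W = rng.normal(size=(3, 3, 3, 4))       # four 3x3 filters
out_sliced = conv_forward_sliced(X_pad, W, s=2)
out_naive = conv_forward_naive(X_pad, W, s=2)
```

Both functions return the same array; only the sliced one keeps the inner sums inside NumPy, which is where the speedup comes from.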

In losses.py, CrossEntropy is the most important function. To speed it up, the expression was simplified mathematically as far as possible, yielding

loss = -1.0/m *np.trace(np.matmul(Y,np.log(Y_hat.T)))

for the forward pass and

-1/m*(np.divide(Y,Y_hat))

for the backward pass.
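As a sanity check on these two expressions, the sketch below compares the trace form with the element-wise sum it stands for and verifies the gradient by finite differences. It assumes Y holds one-hot labels and Y_hat holds predicted probabilities, both of shape (m, classes); the function names are made up for this example.

```python
import numpy as np


def cross_entropy_forward(Y, Y_hat):
    m = Y.shape[0]
    # trace(Y @ log(Y_hat).T) sums Y[i, j] * log(Y_hat[i, j]) over all i, j.
    return -1.0 / m * np.trace(np.matmul(Y, np.log(Y_hat.T)))


def cross_entropy_backward(Y, Y_hat):
    m = Y.shape[0]
    # Gradient of the loss with respect to Y_hat.
    return -1.0 / m * np.divide(Y, Y_hat)


rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 3))
Y_hat = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
Y = np.eye(3)[[0, 2, 1, 0]]            # one-hot labels

loss = cross_entropy_forward(Y, Y_hat)
loss_elementwise = -1.0 / 4 * np.sum(Y * np.log(Y_hat))
grad = cross_entropy_backward(Y, Y_hat)

# Central finite difference on a single entry of Y_hat.
eps = 1e-6
bump = np.zeros_like(Y_hat)
bump[0, 0] = eps
numeric = (cross_entropy_forward(Y, Y_hat + bump)
           - cross_entropy_forward(Y, Y_hat - bump)) / (2 * eps)
```

Note that the trace form builds an m x m matrix only to read its diagonal, so for large batches the element-wise sum np.sum(Y * np.log(Y_hat)) computes the same value with less work.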

This is based on a project for CS289 at UC Berkeley.
