Implementation of parameterized soft-exponential activation function.

Last update: Feb 23, 2022

Overview

Soft-Exponential-Activation-Function:

This is an implementation of the parameterized soft-exponential activation function. In this implementation, every neuron's parameter $\alpha$ is initialized to the same value, -0.01. The activation function revolves around the idea of a "soft" exponential: a function that closely resembles the exponential function but is not as steep at the beginning and more gradual at the end. It is aimed at neural networks with many connections and many neurons.

The underlying idea is that a single smooth function can behave logarithmically, linearly, or exponentially, depending on the value of $\alpha$.

The equation for the soft-exponential function is:

$$ f(\alpha,x)= \left\{ \begin{array}{ll} -\frac{\ln(1-\alpha(x + \alpha))}{\alpha} & \alpha < 0 \\ x & \alpha = 0 \\ \frac{e^{\alpha x} - 1}{\alpha} + \alpha & \alpha > 0 \end{array} \right. $$
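As a plain reference, the three branches of the equation can be written as a scalar Python function. This is only a minimal sketch using the standard library, not the Keras layer from this repository; the name `soft_exponential` is chosen here for illustration.

```python
import math


def soft_exponential(alpha: float, x: float) -> float:
    """Scalar soft-exponential, following the piecewise equation above."""
    if alpha < 0:
        # Logarithmic branch. math.log raises ValueError when its
        # argument 1 - alpha * (x + alpha) is zero or negative.
        return -math.log(1.0 - alpha * (x + alpha)) / alpha
    if alpha == 0:
        # Linear branch: the identity.
        return x
    # Exponential branch.
    return (math.exp(alpha * x) - 1.0) / alpha + alpha


# Print a few values for one alpha from each branch.
for alpha in (-0.5, 0.0, 0.5):
    values = [round(soft_exponential(alpha, x), 4) for x in (0.0, 1.0, 2.0)]
    print(f"alpha={alpha}: {values}")
```

A trainable layer would apply the same branches element-wise on tensors, with one $\alpha$ per unit.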

Problems faced:

1. Misinformation about the function

In the paper "A continuum among logarithmic, linear, and exponential functions, and its potential to improve generalization in neural networks", Figure 2 shows the soft-exponential function as a logarithmic function. This is not the case.

The real figure should be shown here:

Here we can see that the soft-exponential function is undefined for some combinations of $\alpha$ and $x$, and that it is not a constant function of $\alpha$ and $x$.

2. Negative values inside logarithm

Here comes the tricky part. The soft-exponential function is supposed to be defined for all values of $\alpha$ and $x$. However, for $\alpha < 0$ the argument of the logarithm can become zero or negative, and the logarithm is not defined there.
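For $\alpha < 0$, the logarithm's argument $1-\alpha(x+\alpha)$ is positive only when $x > 1/\alpha - \alpha$. The sketch below computes that bound for a few values of $\alpha$. The helper names `log_argument` and `domain_lower_bound` are hypothetical and used only for this illustration.

```python
def log_argument(alpha: float, x: float) -> float:
    """Argument passed to ln() in the alpha < 0 branch."""
    return 1.0 - alpha * (x + alpha)


def domain_lower_bound(alpha: float) -> float:
    """For alpha < 0, the log branch is defined only for x above this value."""
    if alpha >= 0:
        raise ValueError("the logarithmic branch only applies for alpha < 0")
    # 1 - alpha * (x + alpha) > 0  <=>  x > 1 / alpha - alpha  (since alpha < 0)
    return 1.0 / alpha - alpha


for alpha in (-0.01, -0.5, -1.0):
    bound = domain_lower_bound(alpha)
    print(f"alpha={alpha}: log branch defined only for x > {bound:.4f}")
```

The closer $\alpha$ is to zero, the further the bound moves toward large negative $x$, so a small negative $\alpha$ leaves most inputs inside the valid range.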

In the Keras issues, one person suggested using $\sinh^{-1}()$ instead of $\ln()$.
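The exact form of that suggestion is not reproduced here, so the following is only one possible reading: replace $\ln$ with $\sinh^{-1}$ in the $\alpha < 0$ branch and leave the other branches unchanged. The function name `soft_exponential_asinh` is made up for this sketch.

```python
import math


def soft_exponential_asinh(alpha: float, x: float) -> float:
    """Soft-exponential with asinh in place of ln (assumed variant)."""
    if alpha < 0:
        # asinh accepts every real number, so this branch never fails
        # on a zero or negative argument.
        return -math.asinh(1.0 - alpha * (x + alpha)) / alpha
    if alpha == 0:
        return x
    return (math.exp(alpha * x) - 1.0) / alpha + alpha


# With alpha = -1 and x = -5 the ln() argument would be -5, which is
# outside the logarithm's domain; asinh still returns a finite value.
print(soft_exponential_asinh(-1.0, -5.0))
```

This swap removes the domain error, but it also changes the output wherever the logarithm was already defined, so it is a different activation rather than a drop-in fix.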

3. Initialization of alpha

With an initial value of -0.01, the soft-exponential function is steep at the beginning and more gradual at the end. This turned out to be a good choice.
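One way to see what this starting value does is to evaluate the logarithmic branch at $\alpha = -0.01$ and compare it with the input. The check below is an illustrative sketch with a hypothetical helper name, `soft_exponential_log_branch`; it uses only the standard library.

```python
import math


def soft_exponential_log_branch(alpha: float, x: float) -> float:
    """The alpha < 0 branch of the soft-exponential function."""
    return -math.log(1.0 - alpha * (x + alpha)) / alpha


ALPHA_INIT = -0.01  # initial value used in this implementation

for x in (-1.0, 0.0, 1.0, 5.0):
    y = soft_exponential_log_branch(ALPHA_INIT, x)
    print(f"x={x:+.1f}  f(x)={y:+.4f}  difference={y - x:+.4f}")
```

For inputs of moderate size the output stays close to the input, so each unit starts out nearly linear and training moves $\alpha$ from there.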

Performance:

The first picture shows the accuracy of the soft-exponential function.

The second picture shows the loss of the soft-exponential function.

Model Structure:

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 28, 28)]          0         
                                                                 
 flatten (Flatten)           (None, 784)               0         
                                                                 
 dense_layer (Dense_layer)   (None, 128)               100480    
                                                                 
 parametric_soft_exp (Parame  (None, 128)              128       
 tricSoftExp)                                                    
                                                                 
 dense_layer_1 (Dense_layer)  (None, 128)              16512     
                                                                 
 parametric_soft_exp_1 (Para  (None, 128)              128       
 metricSoftExp)                                                  
                                                                 
 dense (Dense)               (None, 10)                1290      
                                                                 
=================================================================
Total params: 118,538
Trainable params: 118,538
Non-trainable params: 0
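The parameter counts in the summary can be checked by hand. The sketch below assumes that each dense layer has a weight matrix plus a bias vector and that each soft-exponential layer holds one $\alpha$ per unit, which is what the 128 parameters per activation layer suggest.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


layers = [
    ("dense_layer", dense_params(28 * 28, 128)),
    ("parametric_soft_exp", 128),    # assumed: one alpha per unit
    ("dense_layer_1", dense_params(128, 128)),
    ("parametric_soft_exp_1", 128),  # assumed: one alpha per unit
    ("dense", dense_params(128, 10)),
]

for name, count in layers:
    print(f"{name:<24}{count:>8}")

total = sum(count for _, count in layers)
print(f"{'Total params':<24}{total:>8}")
```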


Owner

Shuvrajeet Das
