Implementation of parameterized soft-exponential activation function.

Last update: Feb 23, 2022

Overview

Soft-Exponential-Activation-Function:

This repository implements the parameterized soft-exponential activation function. In this implementation, every neuron has its own trainable parameter $\alpha$, and all of them start from the same initial value of -0.01. The activation revolves around the idea of a "soft" exponential function: a function that is very similar to the exponential function, but is not as steep at the beginning and is more gradual at the end. The soft-exponential function is a good choice for neural networks that have a lot of connections and a lot of neurons.

The idea behind this activation function is that a single smooth function can behave logarithmically, linearly, or exponentially, depending on the value of $\alpha$.

The equation for the soft-exponential function is:

$$ f(\alpha,x)= \left\{ \begin{array}{ll} -\frac{\ln(1-\alpha(x + \alpha))}{\alpha} & \alpha < 0 \\ x & \alpha = 0 \\ \frac{e^{\alpha x} - 1}{\alpha} + \alpha & \alpha > 0 \end{array} \right. $$
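
As a rough illustration, the three branches of the equation can be written as a scalar Python function. This is a minimal sketch using only the standard library, not the Keras layer from this repository; the name `soft_exponential` is hypothetical.

```python
import math


def soft_exponential(alpha: float, x: float) -> float:
    """Evaluate the piecewise soft-exponential formula for scalar inputs."""
    if alpha < 0:
        # Logarithmic branch; math.log raises ValueError if its argument is <= 0.
        return -math.log(1.0 - alpha * (x + alpha)) / alpha
    if alpha == 0:
        # Linear branch: the identity function.
        return x
    # Exponential branch.
    return (math.exp(alpha * x) - 1.0) / alpha + alpha


# Evaluate one point on each branch.
for a in (-1.0, 0.0, 1.0):
    print(a, soft_exponential(a, 0.5))
```

A real layer would apply the same formula element-wise to tensors and treat `alpha` as a trainable weight.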

Problems faced:

1. Misinformation about the function

In the paper "A continuum among logarithmic, linear, and exponential functions, and its potential to improve generalization in neural networks", Figure 2 shows the soft-exponential function as a logarithmic function. This is not the case.

The real figure should be shown here:

Here we can see that the soft-exponential function is undefined for some combinations of $\alpha$ and $x$, and that the boundary of the undefined region is not constant: it changes with $\alpha$.

2. Negative values inside logarithm

Here comes the tricky part. The soft-exponential function is supposed to be defined for all values of $\alpha$ and $x$. However, the logarithm is not defined for negative values, so the $\alpha < 0$ branch fails whenever $1-\alpha(x + \alpha) \le 0$.
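
To make the problem concrete, the inequality $1-\alpha(x + \alpha) > 0$ can be solved for $x$ when $\alpha < 0$, which gives $x > 1/\alpha - \alpha$. The sketch below checks this bound numerically; the helper names `log_argument` and `domain_lower_bound` are hypothetical and not part of this repository.

```python
def log_argument(alpha: float, x: float) -> float:
    """Argument of the logarithm in the alpha < 0 branch."""
    return 1.0 - alpha * (x + alpha)


def domain_lower_bound(alpha: float) -> float:
    """Exclusive lower bound on x for which the alpha < 0 branch is defined."""
    if alpha >= 0:
        raise ValueError("only the alpha < 0 branch uses a logarithm")
    # 1 - alpha * (x + alpha) > 0  <=>  x > 1 / alpha - alpha  (since alpha < 0)
    return 1.0 / alpha - alpha


alpha = -0.5
bound = domain_lower_bound(alpha)  # 1 / -0.5 - (-0.5) = -1.5
for x in (bound - 1.0, bound + 1.0):
    # The argument is negative below the bound and positive above it.
    print(x, log_argument(alpha, x) > 0)
```

For the initial value $\alpha = -0.01$ the bound is about $-99.99$, so the problem only appears for strongly negative inputs at the start of training, but the bound moves as $\alpha$ is learned.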

In a Keras issue thread, one person suggested using $\sinh^{-1}()$ instead of $\ln()$.
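
The appeal of $\sinh^{-1}$ is that it is defined for every real argument, since $\sinh^{-1}(z) = \ln(z + \sqrt{z^2 + 1})$. The snippet below only demonstrates that domain difference with the standard library; how exactly the suggestion is applied inside the activation is not specified here, so no modified formula is shown.

```python
import math

z = -2.0  # a negative value, as can occur inside the logarithm

try:
    math.log(z)
    log_ok = True
except ValueError:
    # math.log rejects non-positive real arguments.
    log_ok = False

# asinh(z) = ln(z + sqrt(z**2 + 1)) is defined for every real z.
asinh_value = math.asinh(z)
closed_form = math.log(z + math.sqrt(z * z + 1.0))

print(log_ok, asinh_value, closed_form)
```

Note that $\sinh^{-1}$ is a different function from $\ln$, so swapping it in changes the shape of the activation rather than just extending its domain.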

3. Initialization of alpha

With an initial value of $\alpha = -0.01$, the soft-exponential function is steep at the beginning and more gradual at the end. This initialization turned out to be a good choice.
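
One way to see this shape is through the slope of the $\alpha < 0$ branch, which works out to $1 / (1-\alpha(x + \alpha))$ and therefore shrinks as $x$ grows. The following sketch evaluates the function and its slope at $\alpha = -0.01$; the function names are hypothetical and the code is illustrative only.

```python
import math

ALPHA_INIT = -0.01  # initial value stated in the text


def soft_exp_log_branch(alpha: float, x: float) -> float:
    """alpha < 0 branch of the soft-exponential function."""
    return -math.log(1.0 - alpha * (x + alpha)) / alpha


def slope_log_branch(alpha: float, x: float) -> float:
    """Analytic derivative with respect to x of the alpha < 0 branch."""
    return 1.0 / (1.0 - alpha * (x + alpha))


# The slope starts close to 1 near x = 0 and decreases for larger x.
for x in (0.0, 1.0, 10.0, 100.0):
    print(x, soft_exp_log_branch(ALPHA_INIT, x), slope_log_branch(ALPHA_INIT, x))
```

Near $x = 0$ the slope is close to 1, so at initialization the activation behaves almost like the identity for small inputs.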

Performance:

The first figure shows the accuracy of the model with the soft-exponential function.

The second figure shows the loss of the model with the soft-exponential function.

Model Structure:

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 28, 28)]          0         
                                                                 
 flatten (Flatten)           (None, 784)               0         
                                                                 
 dense_layer (Dense_layer)   (None, 128)               100480    
                                                                 
 parametric_soft_exp (Parame  (None, 128)              128       
 tricSoftExp)                                                    
                                                                 
 dense_layer_1 (Dense_layer)  (None, 128)              16512     
                                                                 
 parametric_soft_exp_1 (Para  (None, 128)              128       
 metricSoftExp)                                                  
                                                                 
 dense (Dense)               (None, 10)                1290      
                                                                 
=================================================================
Total params: 118,538
Trainable params: 118,538
Non-trainable params: 0
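
The parameter counts in the summary can be reproduced with a little arithmetic. The sketch below assumes each dense layer has one bias per unit and each soft-exponential layer has one $\alpha$ per unit, which is consistent with the numbers shown; the helper `dense_params` is hypothetical.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


counts = {
    "dense_layer": dense_params(28 * 28, 128),
    "parametric_soft_exp": 128,  # assumed: one alpha per unit
    "dense_layer_1": dense_params(128, 128),
    "parametric_soft_exp_1": 128,  # assumed: one alpha per unit
    "dense": dense_params(128, 10),
}
total = sum(counts.values())
print(counts, total)
```

The two activation layers together add only 256 trainable parameters to the model.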

Owner

Shuvrajeet Das
