Self-adapting activation function

Implementation of the parameterized soft-exponential activation function. In this implementation, every neuron has its own trainable parameter α, and all of them start from the same initial value, −0.01. The activation revolves around the idea of a "soft" exponential: a single parameterized curve that behaves like a logarithm for negative α, like the identity for α = 0, and like an exponential for positive α, interpolating smoothly between these regimes as α varies. Because each neuron can learn where on this continuum it should sit, the function is an appealing choice for networks with many neurons and connections.
The soft-exponential function is defined piecewise:

$$
f(\alpha, x) =
\begin{cases}
-\dfrac{\ln\left(1 - \alpha\,(x + \alpha)\right)}{\alpha} & \text{for } \alpha < 0 \\
x & \text{for } \alpha = 0 \\
\dfrac{e^{\alpha x} - 1}{\alpha} + \alpha & \text{for } \alpha > 0
\end{cases}
$$
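As a quick sanity check, here is a minimal NumPy sketch of the piecewise definition above. This is the plain function, not the trainable layer, and the name soft_exponential is mine:

```python
import numpy as np

def soft_exponential(alpha, x):
    """Evaluate f(alpha, x) element-wise for a fixed scalar alpha."""
    if alpha < 0.0:
        # Logarithmic regime; only defined where 1 - alpha*(x + alpha) > 0.
        return -np.log(1.0 - alpha * (x + alpha)) / alpha
    if alpha == 0.0:
        # Identity regime.
        return x
    # Exponential regime.
    return (np.exp(alpha * x) - 1.0) / alpha + alpha

x = np.linspace(-2.0, 2.0, 5)
print(soft_exponential(-0.01, x))  # almost identical to x when alpha is near 0
```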
The function comes from the paper "A continuum among logarithmic, linear, and exponential functions, and its potential to improve generalization in neural networks" by Godfrey and Gashler. In Figure 2 of that paper, the soft-exponential function is drawn as a logarithmic curve. This is not the case.
The figure should actually look like this:
Here we can see that the soft-exponential function is undefined for some values of x: for α < 0, the argument of the logarithm, 1 − α(x + α), must remain positive, so the function only exists for x > 1/α − α.
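To make the domain concrete, a small check with the initial value used here (the numbers follow from the condition 1 − α(x + α) > 0; the variable names are illustrative):

```python
import numpy as np

alpha = -0.01
boundary = 1.0 / alpha - alpha        # the function exists only for x > boundary
print(boundary)                       # -99.99

x_bad = boundary - 1.0                # one step outside the domain
print(1.0 - alpha * (x_bad + alpha))  # -0.01: log argument is negative -> NaN
```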
Here comes the tricky part: inside a network, the activation has to be defined for every input value x it might receive, otherwise training quickly produces NaNs.
In the Keras GitHub issues, one user suggested a formulation that works around this problem.
Starting from the initial value α = −0.01, the function is mildly logarithmic in shape, steep at first and more gradual further out, and this turned out to be a good starting point.
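The exact snippet from the issue is not reproduced here, so below is a hedged sketch of one way to implement a numerically safe version as a custom Keras layer. The class name ParametricSoftExp matches the layer name in the model summary below, but the clamping constant and the safe_alpha guard are my own assumptions:

```python
import tensorflow as tf
from tensorflow import keras

class ParametricSoftExp(keras.layers.Layer):
    """Soft-exponential activation with one trainable alpha per unit."""

    def __init__(self, alpha_init=-0.01, **kwargs):
        super().__init__(**kwargs)
        self.alpha_init = alpha_init

    def build(self, input_shape):
        # One alpha per neuron, all initialized to the same value (-0.01).
        self.alpha = self.add_weight(
            name="alpha",
            shape=(input_shape[-1],),
            initializer=keras.initializers.Constant(self.alpha_init),
            trainable=True,
        )

    def call(self, x):
        # Avoid dividing by zero when alpha == 0 (that branch returns x anyway).
        safe_alpha = tf.where(
            tf.equal(self.alpha, 0.0), tf.ones_like(self.alpha), self.alpha
        )
        # Clamp the log argument so the unselected branch cannot produce NaNs.
        log_branch = -tf.math.log(
            tf.maximum(1.0 - self.alpha * (x + self.alpha), 1e-7)
        ) / safe_alpha
        exp_branch = (tf.exp(self.alpha * x) - 1.0) / safe_alpha + self.alpha
        return tf.where(
            self.alpha < 0.0,
            log_branch,
            tf.where(self.alpha > 0.0, exp_branch, x),
        )
```

The tf.maximum clamp matters because tf.where still evaluates both branches, so an unguarded logarithm would inject NaNs into the gradient even for neurons whose α has moved out of the logarithmic regime.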
The first plot shows the accuracy of the model trained with the soft-exponential activation.
The second plot shows the corresponding loss.
Model Structure:
_________________________________________________________________
Layer (type)                              Output Shape        Param #
=================================================================
input_1 (InputLayer)                      [(None, 28, 28)]    0
flatten (Flatten)                         (None, 784)         0
dense_layer (Dense_layer)                 (None, 128)         100480
parametric_soft_exp (ParametricSoftExp)   (None, 128)         128
dense_layer_1 (Dense_layer)               (None, 128)         16512
parametric_soft_exp_1 (ParametricSoftExp) (None, 128)         128
dense (Dense)                             (None, 10)          1290
=================================================================
Total params: 118,538
Trainable params: 118,538
Non-trainable params: 0
_________________________________________________________________
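For completeness, a sketch of how a model with this structure could be assembled. The layer sizes are taken from the summary above; the summary shows a custom Dense_layer class, but plain Dense layers with the same shapes are used here, and the optimizer, loss, and softmax output are assumptions:

```python
from tensorflow import keras

# Reuses the ParametricSoftExp layer sketched earlier.
inputs = keras.Input(shape=(28, 28))
x = keras.layers.Flatten()(inputs)
x = keras.layers.Dense(128)(x)          # 784 * 128 + 128 = 100480 params
x = ParametricSoftExp()(x)              # 128 trainable alphas
x = keras.layers.Dense(128)(x)          # 128 * 128 + 128 = 16512 params
x = ParametricSoftExp()(x)              # 128 trainable alphas
outputs = keras.layers.Dense(10, activation="softmax")(x)

model = keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```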