Activation
The Activation class represents an activation function that is applied in a NeuralNet.
Use in initialising a NeuralNet
When initialising a NeuralNet using NeuralNetFactory, an Activation is supplied to each Hidden Layer and Output Layer. To see an example, click here.
Provided Activations
There are five Activations provided by default:
ReLU activation
ReLU activation is provided through ReluActivation. ReluActivation has a leak of zero by default, but can be given a leak value through the optional argument leak.
ReLU activation works well for most projects. If you are unsure of what activation function to use, ReLU activation (with little or no leak) is usually a good option.
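As a rough sketch of the function that ReluActivation is assumed to apply elementwise (the Relu helper below is illustrative and not part of the library's API):
using System;

// Illustrative sketch, not the library's internals: (leaky) ReLU applied to a single value.
// relu(x) = x when x >= 0, and leak * x when x < 0.
static double Relu(double x, double leak = 0.0) => x >= 0.0 ? x : leak * x;

Console.WriteLine(Relu(2.5));         // positive inputs pass through unchanged: 2.5
Console.WriteLine(Relu(-2.5));        // negative inputs are clipped when the leak is zero
Console.WriteLine(Relu(-2.5, 0.01));  // a small leak lets a fraction of negative inputs through: -0.025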
When initialising a NeuralNet, the neural layers can be set to use ReluActivation by setting the layer structure to, for example:
List<NeuralLayerConfig> layerStructure = new ()
{
...
new HiddenLayer(size: 100, activation: new ReluActivation()),
new HiddenLayer(size: 100, activation: new ReluActivation(leak: 0.01)),
...
};
Sigmoid activation
Sigmoid activation (here, the logistic sigmoid) is provided through SigmoidActivation. SigmoidActivation takes no arguments.
The outputs of the sigmoid activation function lie between 0 and 1. This range of values has led to sigmoid activation often being used in statistics.
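As a rough sketch of the function that SigmoidActivation is assumed to apply elementwise (the Sigmoid helper below is illustrative and not part of the library's API):
using System;

// Illustrative sketch, not the library's internals: the logistic sigmoid.
// sigmoid(x) = 1 / (1 + e^(-x)), which always lies strictly between 0 and 1.
static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

Console.WriteLine(Sigmoid(0.0));   // 0.5
Console.WriteLine(Sigmoid(4.0));   // ~0.982 (large positive inputs approach 1)
Console.WriteLine(Sigmoid(-4.0));  // ~0.018 (large negative inputs approach 0)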
When initialising a NeuralNet, the neural layers can be set to use SigmoidActivation by setting the layer structure to, for example:
List<NeuralLayerConfig> layerStructure = new ()
{
...
new HiddenLayer(size: 100, activation: new SigmoidActivation()),
...
};
Tanh activation
Tanh activation is provided through TanhActivation. TanhActivation takes no arguments.
Tanh activation can be thought of as rescaling sigmoid activation so that the output lies between -1 and 1. The fact that tanh activation maps zero to zero gives tanh superior properties in gradient descent compared to sigmoid activation (see here).
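As a rough sketch of that relationship (the helpers below are illustrative and not part of the library's API):
using System;

// Illustrative sketch, not the library's internals: tanh as a rescaled logistic sigmoid.
// tanh(x) = 2 * sigmoid(2x) - 1, so outputs lie between -1 and 1 and tanh(0) = 0.
static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));
static double RescaledSigmoid(double x) => 2.0 * Sigmoid(2.0 * x) - 1.0;

Console.WriteLine(RescaledSigmoid(1.0));  // ~0.7616
Console.WriteLine(Math.Tanh(1.0));        // ~0.7616 (matches the built-in tanh)
Console.WriteLine(RescaledSigmoid(0.0));  // 0 (zero maps to zero)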
When initialising a NeuralNet, the neural layers can be set to use TanhActivation by setting the layer structure to, for example:
List<NeuralLayerConfig> layerStructure = new ()
{
...
new HiddenLayer(size: 100, activation: new TanhActivation()),
...
};
Identity activation
The identity function, also known in machine learning as "linear activation", is provided through IdentityActivation. The identity function does not process its input: it simply returns whatever input was given to it.
If you do not want to force your output values to lie in a particular range, then you should use IdentityActivation. IdentityActivation is used in OutputLayer by default.
When initialising a NeuralNet, the output layer can be set to use IdentityActivation by setting the layer structure to:
List<NeuralLayerConfig> layerStructure = new ()
{
...
new OutputLayer(size: 100)
};
or, more explicitly:
List<NeuralLayerConfig> layerStructure = new ()
{
...
new OutputLayer(size: 100, activation: new IdentityActivation())
};
Softmax activation
Softmax activation is provided through SoftmaxActivation. SoftmaxActivation takes no arguments.
Softmax activation returns a vector with each entry between 0 and 1. Put simply, the i-th output entry measures how large the i-th input entry is compared to all the other input entries, as a ratio between 0 and 1. More precisely:
- the greater the input entry, the closer the output entry is to 1
- all the output entries add to 1
This makes softmax activation useful to apply to the output layer if the output vector should be a vector of ratios or a vector of probabilities.
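As a rough sketch of the function that SoftmaxActivation is assumed to apply to the whole output vector (the Softmax helper below is illustrative and not part of the library's API):
using System;
using System.Linq;

// Illustrative sketch, not the library's internals: softmax over a vector.
// softmax(x)[i] = exp(x[i]) / (sum over j of exp(x[j])),
// so every entry lies between 0 and 1 and all the entries sum to 1.
static double[] Softmax(double[] x)
{
    double max = x.Max();  // subtracting the maximum avoids overflow without changing the result
    double[] exps = x.Select(v => Math.Exp(v - max)).ToArray();
    double sum = exps.Sum();
    return exps.Select(v => v / sum).ToArray();
}

double[] output = Softmax(new[] { 1.0, 2.0, 3.0 });
Console.WriteLine(string.Join(", ", output));  // ~0.09, 0.24, 0.67 (the largest input gets the largest share)
Console.WriteLine(output.Sum());               // 1 (up to floating-point rounding)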
When initialising a NeuralNet using NeuralNetFactory, the output layer can be set to use SoftmaxActivation by setting the layer structure to:
List<NeuralLayerConfig> layerStructure = new ()
{
...
new OutputLayer(size: 100, activation: new SoftmaxActivation())
};