Activation
The Activation class represents an activation function that is applied in a NeuralNet.
Use in initialising a NeuralNet
When initialising a NeuralNet using NeuralNetFactory, an Activation is supplied to each Hidden Layer and Output Layer. To see an example, click here.
Provided Activations
There are five Activations provided by default:
ReLU activation
ReLU activation is provided through ReluActivation. ReluActivation has a leak of zero by default, but can be given a leak value through the optional argument leak.
ReLU activation works well for most projects. If you are unsure of what activation function to use, ReLU activation (with little or no leak) is usually a good option.
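As a rough sketch of the function that ReluActivation is assumed to apply elementwise (the Relu helper below is illustrative and not part of the library's API):
using System;

// Illustrative sketch, not the library's internals: (leaky) ReLU applied to a single value.
// relu(x) = x when x >= 0, and leak * x when x < 0.
static double Relu(double x, double leak = 0.0) => x >= 0.0 ? x : leak * x;

Console.WriteLine(Relu(2.5));         // positive inputs pass through unchanged: 2.5
Console.WriteLine(Relu(-2.5));        // negative inputs are clipped when the leak is zero
Console.WriteLine(Relu(-2.5, 0.01));  // a small leak lets a fraction of negative inputs through: -0.025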
When initialising a NeuralNet, the neural layers can be set to use ReluActivation by setting the layer structure to, for example:
List<NeuralLayerConfig> layerStructure = new ()
{
...
new HiddenLayer(size: 100, activation: new ReluActivation()),
new HiddenLayer(size: 100, activation: new ReluActivation(leak: 0.01)),
...
};
Sigmoid activation
Sigmoid activation (here, the logistic sigmoid) is provided through SigmoidActivation. SigmoidActivation takes no arguments.
The outputs of the sigmoid activation function lie between 0 and 1. This range of values has led to sigmoid activation often being used in statistics.
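As a rough sketch of the function that SigmoidActivation is assumed to apply elementwise (the Sigmoid helper below is illustrative and not part of the library's API):
using System;

// Illustrative sketch, not the library's internals: the logistic sigmoid.
// sigmoid(x) = 1 / (1 + e^(-x)), which always lies strictly between 0 and 1.
static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

Console.WriteLine(Sigmoid(0.0));   // 0.5
Console.WriteLine(Sigmoid(4.0));   // ~0.982 (large positive inputs approach 1)
Console.WriteLine(Sigmoid(-4.0));  // ~0.018 (large negative inputs approach 0)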
When initialising a NeuralNet, the neural layers can be set to use SigmoidActivation by setting the layer structure to, for example:
List<NeuralLayerConfig> layerStructure = new ()
{
...
new HiddenLayer(size: 100, activation: new SigmoidActivation()),
...
};
Tanh activation
Tanh activation is provided through TanhActivation. TanhActivation takes no arguments.
Tanh activation can be thought of as rescaling sigmoid activation so that the output lies between -1 and 1. The fact that tanh activation maps zero to zero gives tanh superior properties in gradient descent compared to sigmoid activation (see here).
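As a rough sketch of that relationship (the helpers below are illustrative and not part of the library's API):
using System;

// Illustrative sketch, not the library's internals: tanh as a rescaled logistic sigmoid.
// tanh(x) = 2 * sigmoid(2x) - 1, so outputs lie between -1 and 1 and tanh(0) = 0.
static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));
static double RescaledSigmoid(double x) => 2.0 * Sigmoid(2.0 * x) - 1.0;

Console.WriteLine(RescaledSigmoid(1.0));  // ~0.7616
Console.WriteLine(Math.Tanh(1.0));        // ~0.7616 (matches the built-in tanh)
Console.WriteLine(RescaledSigmoid(0.0));  // 0 (zero maps to zero)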
When initialising a NeuralNet, the neural layers can be set to use TanhActivation by setting the layer structure to, for example:
List<NeuralLayerConfig> layerStructure = new ()
{
...
new HiddenLayer(size: 100, activation: new TanhActivation()),
...
};
Identity activation
The identity function, also known in machine learning as "linear activation", is provided through IdentityActivation. The identity function does not process its input: it simply returns whatever input was given to it.
If you do not want to force your output values to lie in a particular range, then you should use IdentityActivation. IdentityActivation is used in OutputLayer by default.
When initialising a NeuralNet, the output layer can be set to use IdentityActivation by setting the layer structure to:
List<NeuralLayerConfig> layerStructure = new ()
{
...
new OutputLayer(size: 100)
};
or, more explicitly:
List<NeuralLayerConfig> layerStructure = new ()
{
...
new OutputLayer(size: 100, activation: new IdentityActivation())
};
Softmax activation
Softmax activation is provided through SoftmaxActivation. SoftmaxActivation takes no arguments.
Softmax activation returns a vector with each entry between 0 and 1. Put simply, the i-th output entry measures how large the i-th input entry is compared to all the other input entries, as a ratio between 0 and 1. More precisely:
- the greater the input entry, the closer the output entry is to 1
- all the output entries add to 1
This makes softmax activation useful to apply to the output layer if the output vector should be a vector of ratios or a vector of probabilities.
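As a rough sketch of the function that SoftmaxActivation is assumed to apply to the whole output vector (the Softmax helper below is illustrative and not part of the library's API):
using System;
using System.Linq;

// Illustrative sketch, not the library's internals: softmax over a vector.
// softmax(x)[i] = exp(x[i]) / (sum over j of exp(x[j])),
// so every entry lies between 0 and 1 and all the entries sum to 1.
static double[] Softmax(double[] x)
{
    double max = x.Max();  // subtracting the maximum avoids overflow without changing the result
    double[] exps = x.Select(v => Math.Exp(v - max)).ToArray();
    double sum = exps.Sum();
    return exps.Select(v => v / sum).ToArray();
}

double[] output = Softmax(new[] { 1.0, 2.0, 3.0 });
Console.WriteLine(string.Join(", ", output));  // ~0.09, 0.24, 0.67 (the largest input gets the largest share)
Console.WriteLine(output.Sum());               // 1 (up to floating-point rounding)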
When initialising a NeuralNet using NeuralNetFactory, the output layer can be set to use SoftmaxActivation by setting the layer structure to:
List<NeuralLayerConfig> layerStructure = new ()
{
...
new OutputLayer(size: 100, activation: new SoftmaxActivation())
};