Cost Function

The CostFunction class represents a cost function used when fitting a NeuralNet.

Use in initializing a NeuralNet

When initializing a NeuralNet using NeuralNetFactory, a cost function is supplied at the final step.
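
As a rough illustration, the cost function is constructed first and then handed over in the factory's final step. The NeuralNetFactory call in the sketch below is hypothetical and may not match the real API; only the placement of the cost function is the point.

// Hypothetical sketch: the exact NeuralNetFactory call is an assumption and may
// not match the real API; it only illustrates the cost function being supplied last.
CostFunction costFunction = new MSECost();
NeuralNet net = NeuralNetFactory.Build(/* layers, activations, ... */ costFunction);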

Provided cost functions

There are three cost functions provided by default:

  • Mean Squared Error
  • Huber cost
  • Cross Entropy

Mean Squared Error

Mean squared error is provided by MSECost. Mean squared error is the simplest of the three provided cost functions. It is a good cost function to use when the output values are not forced to lie in a small range, for example when using IdentityActivation.

When initializing a NeuralNet, the cost function can be set to MSECost as follows:

CostFunction costFunction = new MSECost();
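
For reference, the sketch below shows the textbook mean squared error calculation for a single training example: the squared differences between predicted and expected outputs are averaged. It is illustrative only; MSECost itself may scale the sum slightly differently.

// Illustrative only: textbook mean squared error for one training example.
// MSECost itself may use a slightly different scaling of the sum.
double MeanSquaredError(double[] predicted, double[] expected)
{
    double sum = 0.0;
    for (int i = 0; i < predicted.Length; i++)
    {
        double error = predicted[i] - expected[i];
        sum += error * error;
    }
    return sum / predicted.Length;
}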

Huber cost

Huber cost is provided by HuberCost. Huber cost is an adaptation of mean squared error where the cost is linear, instead of quadratic, when the error is larger than a given value. This means that Huber cost is less affected by extreme errors (outliers) than mean squared error.

To use Huber cost, you need to set the size of error at which Huber cost becomes linear. Set this value, called outlierBoundary, to the smallest error you wish to treat as an outlier in your project.

When initializing a NeuralNet, the cost function can be set to HuberCost as follows, for example with an outlier boundary of 1:

CostFunction costFunction = new HuberCost(outlierBoundary: 1.0);
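
The sketch below shows the standard Huber calculation: errors within the outlier boundary are penalized quadratically, while larger errors grow only linearly. It is illustrative only; HuberCost itself may differ in detail.

// Illustrative only: standard Huber cost with a configurable outlier boundary.
// Errors within the boundary are quadratic; larger errors grow only linearly.
double Huber(double[] predicted, double[] expected, double outlierBoundary)
{
    double sum = 0.0;
    for (int i = 0; i < predicted.Length; i++)
    {
        double error = Math.Abs(predicted[i] - expected[i]);
        sum += error <= outlierBoundary
            ? 0.5 * error * error
            : outlierBoundary * (error - 0.5 * outlierBoundary);
    }
    return sum;
}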

Cross Entropy

Cross entropy is provided by CrossEntropyCost. Cross entropy measures the error in learning probabilities.

If you are using CrossEntropyCost, it is recommended to apply softmax activation to your output layer. This ensures that your output vector is a vector of probabilities.

Warning

When using cross entropy cost, the expected output vector must be a vector of probabilities: its entries must lie between 0 and 1. Supplying an expected output vector with non-positive entries will cause NaNs to appear.

When initializing a NeuralNet, the cost function can be set to CrossEntropyCost as follows:

CostFunction costFunction = new CrossEntropyCost();
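
For reference, the sketch below shows the textbook cross entropy between an expected probability vector and a predicted probability vector (such as a softmax output). It is illustrative only; CrossEntropyCost itself may be implemented differently.

// Illustrative only: textbook cross entropy between expected and predicted
// probability vectors. Taking the logarithm of a non-positive entry produces
// NaN or -Infinity, so the vectors should contain valid probabilities.
double CrossEntropy(double[] predicted, double[] expected)
{
    double sum = 0.0;
    for (int i = 0; i < predicted.Length; i++)
    {
        sum -= expected[i] * Math.Log(predicted[i]);
    }
    return sum;
}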