Softmax

What is Softmax?

Softmax is a mathematical function that converts a vector of real numbers (logits) into a vector of probabilities, where each output probability is proportional to the exponential of the corresponding input value. The output values lie between 0 and 1 and sum to 1. Softmax is commonly used as the final activation function of a neural network, normalizing the network's raw outputs into a probability distribution over the predicted output classes.
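Concretely, softmax(z)_i = exp(z_i) / Σ_j exp(z_j). A minimal sketch of this in plain Python (the max-subtraction step is a standard numerical-stability trick; it leaves the result unchanged but prevents exp() from overflowing on large logits):

```python
import math

def softmax(logits):
    """Convert a list of real-valued logits into probabilities.

    Subtracting the maximum logit before exponentiating does not change
    the result (it cancels in the ratio) but avoids overflow in exp().
    """
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)       # each value lies in (0, 1)
print(sum(probs))  # the values sum to 1
```

Note that softmax preserves the ordering of the inputs: the largest logit always receives the largest probability.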

Where did the term "Softmax" come from?

The function generalizes the logistic (sigmoid) function from two classes to many. It was named "softmax" in the context of neural networks by John Bridle around 1990, but the mathematical form originates in statistical mechanics, where it appears as the Boltzmann distribution.

How is "Softmax" used today?

Softmax is the standard activation function for the output layer of multi-class classification neural networks. It is essential for tasks where the model needs to choose one class out of many (e.g., classifying an image as a dog, cat, or bird).
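The dog/cat/bird example above can be sketched as follows. The class names and logit values here are hypothetical, standing in for the final-layer outputs of some trained classifier:

```python
import math

def softmax(logits):
    # Numerically stable softmax: subtract the max logit first.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical final-layer logits for a three-class image classifier.
classes = ["dog", "cat", "bird"]
logits = [3.2, 1.3, 0.2]

probs = softmax(logits)
# The predicted class is the one with the highest probability,
# which is always the one with the highest logit.
prediction = classes[max(range(len(probs)), key=probs.__getitem__)]
print(prediction)
```

In training, these probabilities are typically fed into a cross-entropy loss, which compares the predicted distribution against the true class label.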

Related Terms