Vanishing Gradient Problem

What is the Vanishing Gradient Problem?

The vanishing gradient problem is a difficulty encountered when training artificial neural networks with gradient-based learning methods and backpropagation. In such methods, each weight receives, in every training iteration, an update proportional to the partial derivative of the error function with respect to that weight. Backpropagation computes this derivative with the chain rule, as a product of one factor per layer between the weight and the output. When those factors are smaller than 1, as with saturating activation functions such as the sigmoid, the product shrinks exponentially with depth. The gradient can then become vanishingly small, effectively preventing the weight from changing its value.
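The shrinking can be illustrated with a minimal sketch. It assumes a deliberately simplified network: a chain of scalar sigmoid layers that all share the same weight and the same pre-activation value. The names gradient_scale, pre_activation, and weight are hypothetical and chosen only for this illustration. The default pre-activation of 0 is the most favorable case for the sigmoid, since its derivative peaks there at 0.25.

```python
import math


def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Derivative of the sigmoid; its maximum value is 0.25, reached at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def gradient_scale(depth, pre_activation=0.0, weight=1.0):
    """Factor by which a gradient is scaled after passing backward through
    `depth` identical scalar layers (an assumption made for illustration).

    Each layer contributes one chain-rule factor: weight * sigmoid'(pre_activation).
    """
    factor = weight * sigmoid_derivative(pre_activation)
    return factor ** depth


if __name__ == "__main__":
    for depth in (1, 5, 10, 20):
        print(f"depth={depth:2d}  gradient scale={gradient_scale(depth):.3e}")
```

Even in this best case the scale falls by a factor of four per layer, so weights in the early layers of a deep sigmoid network receive almost no update.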

Where did the term "Vanishing Gradient Problem" come from?

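The name describes the symptom: as the error signal is propagated backward through many layers, the gradient shrinks toward zero, or "vanishes". The problem was a major obstacle in the development of deep neural networks.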

How is "Vanishing Gradient Problem" used today?

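Today the problem is largely addressed by using the ReLU activation function, whose derivative is exactly 1 for positive inputs and therefore does not shrink the gradient as it passes backward through a layer.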
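As a rough comparison, the sketch below multiplies the activation derivatives along a backward path for the sigmoid and for ReLU. It assumes 20 layers that all have the same positive pre-activation of 0.5 and ignores the weights entirely. The helper name backprop_factor and the chosen values are hypothetical, for illustration only.

```python
import math


def sigmoid_derivative(x):
    """Derivative of the logistic sigmoid; never larger than 0.25."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def relu_derivative(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def backprop_factor(derivative, pre_activations):
    """Product of activation derivatives along one backward path
    (weights are left out to isolate the effect of the activation)."""
    product = 1.0
    for z in pre_activations:
        product *= derivative(z)
    return product


if __name__ == "__main__":
    pre_activations = [0.5] * 20  # assumed: every layer is in its active region
    print("sigmoid:", backprop_factor(sigmoid_derivative, pre_activations))
    print("relu:   ", backprop_factor(relu_derivative, pre_activations))
```

With ReLU the product stays at 1 as long as every unit on the path is active. A unit with a non-positive pre-activation contributes a factor of 0 instead, which blocks the gradient along that path.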

Related Terms

Activation Functions
ReLU (Rectified Linear Unit)
Backpropagation