ReLU (Rectified Linear Unit)

What is ReLU (Rectified Linear Unit)?

The most widely used activation function in deep learning, defined as f(x) = max(0, x). It outputs the input directly when the input is positive and zero otherwise. Its simplicity enables efficient computation, and because its gradient is exactly 1 for positive inputs, it helps mitigate the 'vanishing gradient' problem that plagued earlier functions like Sigmoid and Tanh, allowing much deeper neural networks to be trained.
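The definition above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation; the gradient at exactly x = 0 is undefined mathematically, and this sketch follows the common convention of setting it to 0 there:

```python
import numpy as np

def relu(x):
    """ReLU: f(x) = max(0, x), applied element-wise."""
    return np.maximum(0, x)

def relu_grad(x):
    """Derivative of ReLU: 1 where x > 0, else 0 (convention at x = 0)."""
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(x))       # [0.  0.  0.  1.5 3. ]
print(relu_grad(x))  # [0. 0. 0. 1. 1.]
```

Note how the gradient is a constant 1 for all positive inputs, unlike Sigmoid or Tanh, whose gradients shrink toward 0 for large |x|.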

Where did the term "ReLU (Rectified Linear Unit)" come from?

Popularized as an activation for deep networks by Nair & Hinton (2010) and cemented by its central role in AlexNet (2012).

How is "ReLU (Rectified Linear Unit)" used today?

It remains the default activation function for Convolutional Neural Networks (CNNs) and many other deep architectures.

Related Terms