A component that adds a layer's input to its output (y = f(x) + x), creating a 'highway' along which gradients can flow unchanged through the identity path. This largely mitigates the vanishing gradient problem and makes very deep networks much easier to optimize.
Popularized by ResNet (He et al., 2015).
Enabled trainable networks to grow from roughly 20 layers to 100+ (e.g. the 152-layer ResNet-152).
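To make y = f(x) + x concrete, here is a minimal NumPy sketch. It is an illustration rather than the ResNet architecture: the inner function f (one linear map followed by ReLU), the 8-unit width, the 0.01 weight scale, the 50-layer depth, and the reuse of one weight matrix at every layer are all assumptions picked to keep it short. It compares how much of the input signal survives a plain stack versus a residual stack. The same reasoning applies to gradients, because the Jacobian of each residual block is df/dx + I rather than df/dx alone.

```python
import numpy as np

def layer(x, w):
    """Hypothetical inner transformation f(x): linear map followed by ReLU."""
    return np.maximum(0.0, x @ w)

def residual_block(x, w):
    """Return y = f(x) + x; the '+ x' term is the skip connection."""
    return layer(x, w) + x

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))             # batch of 4 inputs, 8 features each
w = 0.01 * rng.normal(size=(8, 8))      # deliberately small weights (assumption)
depth = 50                              # illustrative depth (assumption)

# Plain stack: every layer shrinks the signal, so it collapses towards zero.
h_plain = x
for _ in range(depth):
    h_plain = layer(h_plain, w)

# Residual stack: the identity path carries the input through every block.
h_res = x
for _ in range(depth):
    h_res = residual_block(h_res, w)

plain_norm = np.linalg.norm(h_plain)
res_norm = np.linalg.norm(h_res)
print(f"plain stack norm:    {plain_norm:.3e}")
print(f"residual stack norm: {res_norm:.3e}")

# With f(x) = 0 (all-zero weights) a residual block is exactly the identity.
identity_out = residual_block(x, np.zeros((8, 8)))
```

In this toy setup the plain stack's output norm should be vanishingly small while the residual stack's stays on the order of the input's norm. That is the 'highway' effect in miniature.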