Which problem occurs when the learning signal becomes so small that earlier parts stop learning?

Prepare for the GARP Risk and AI (RAI) Exam. Master concepts with flashcards and multiple-choice questions, each with hints and clarifications. Get exam-ready with extensive practice!

Multiple Choice

Which problem occurs when the learning signal becomes so small that earlier parts stop learning?

Explanation:
When the learning signal becomes so small that earlier parts stop learning, what’s happening is vanishing gradients. In deep networks, as the error is backpropagated through many layers, the derivatives multiplied at each step can shrink, especially with activation functions like sigmoid or tanh. This causes the gradient that reaches the initial layers to become almost zero, so those early layers hardly update and fail to learn useful features. The network then struggles to improve from the bottom up, limiting overall performance. Other issues describe different problems. Exploding gradients involve gradients becoming very large and causing unstable updates, not the tiny signals at the start. Local optima refers to the loss surface having a suboptimal peak, which is about the landscape rather than the magnitude of the learning signal. Underfitted models describe insufficient capacity or training data to capture patterns, not specifically the problem of vanishing signals during backpropagation. Mitigations include using non-saturating activations like ReLU, applying normalization techniques such as batch normalization, or introducing skip connections (as in residual networks) to preserve signal strength during backpropagation.

When the learning signal becomes so small that earlier parts stop learning, what’s happening is vanishing gradients. In deep networks, as the error is backpropagated through many layers, the derivatives multiplied at each step can shrink, especially with activation functions like sigmoid or tanh. This causes the gradient that reaches the initial layers to become almost zero, so those early layers hardly update and fail to learn useful features. The network then struggles to improve from the bottom up, limiting overall performance.

Other issues describe different problems. Exploding gradients involve gradients becoming very large and causing unstable updates, not the tiny signals at the start. Local optima refers to the loss surface having a suboptimal peak, which is about the landscape rather than the magnitude of the learning signal. Underfitted models describe insufficient capacity or training data to capture patterns, not specifically the problem of vanishing signals during backpropagation.

Mitigations include using non-saturating activations like ReLU, applying normalization techniques such as batch normalization, or introducing skip connections (as in residual networks) to preserve signal strength during backpropagation.

Subscribe

Get the latest from Examzify

You can unsubscribe at any time. Read our privacy policy