Question 4
A learning rate that is too large typically causes:
Correct answer: B
Explanation
A learning rate that is too large makes each update step overshoot the minimum, so the parameters bounce back and forth across it instead of settling. That is the behavior option B describes, "divergence or oscillation around the loss landscape": the parameter updates are too aggressive for training to converge smoothly.
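The overshooting effect can be seen on a toy problem. The sketch below is illustrative only and is not part of the original question: it assumes a one-dimensional loss f(x) = x**2 and plain gradient descent, and the function name `gradient_descent` and the three learning rates are hypothetical choices made for this example.

```python
def gradient_descent(lr, x0=1.0, steps=20):
    """Run plain gradient descent on f(x) = x**2 and return every visited point.

    Assumed toy setup: the minimum is at x = 0 and the gradient is 2 * x.
    """
    x = x0
    path = [x]
    for _ in range(steps):
        grad = 2.0 * x       # derivative of x**2
        x = x - lr * grad    # standard update: step against the gradient
        path.append(x)
    return path


if __name__ == "__main__":
    # Small, borderline, and too-large learning rates (hypothetical values).
    for lr in (0.1, 1.0, 1.1):
        path = gradient_descent(lr)
        final_x = path[-1]
        print(f"lr={lr}: final x = {final_x:.4g}, final loss = {final_x ** 2:.4g}")
```

On this particular loss, each update multiplies x by (1 - 2 * lr). With lr = 0.1 that factor is 0.8, so x shrinks toward the minimum. With lr = 1.0 the factor is -1, so x flips between +1 and -1 without ever settling. With lr = 1.1 the factor is -1.2, so x changes sign and grows in magnitude at every step, and the loss increases. Real loss surfaces are not this simple, but the same mechanism applies.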
Why each option is right or wrong
A. Faster convergence to the global optimum
B. Divergence or oscillation around the loss landscape
In gradient-based optimization, the learning rate is the step-size multiplier applied to each parameter update. If it is set too high, an update can jump past the minimum rather than move toward it. Training then becomes unstable: the loss increases, or the parameters bounce from one side of the minimum to the other, which is the behavior usually described as divergence or oscillation on the loss surface.
C. Lower memory usage
D. Better generalization