Question 37
Domain 4: Model Deployment
Which serving style best matches an application that needs a prediction immediately after a user clicks Submit?
Correct answer: A
Explanation
Real-time inference is used when an application needs an immediate prediction in response to a user action. It serves results synchronously, so the model can return an output right after the user clicks Submit.
Why each option is right or wrong
A. Real-time inference
Real-time (online) inference returns a model output synchronously, within the same request that the user action triggers. The deciding condition here is the immediate turnaround after Submit. Batch and asynchronous serving return their results later, outside the interaction that triggered the request, so neither fits.
B. Offline batch inference
Offline batch inference scores many records together at a later time; it does not return an instant prediction for a single user action.
C. Monthly reporting
Monthly reporting summarizes historical data on a schedule; it is not a way to serve live predictions.
D. Feature-table creation
Feature-table creation prepares model inputs for training or serving; it is not the prediction step.
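The difference between options A and B can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_model`, `handle_submit`, and `run_batch_job` are hypothetical names, and the weighted sum stands in for a real trained model. No particular serving framework is implied.

```python
from typing import Callable, Dict, List

Features = Dict[str, float]


def toy_model(features: Features) -> float:
    """Hypothetical stand-in for a trained model: a fixed weighted sum."""
    return 0.5 * features["age"] + 2.0 * features["visits"]


def handle_submit(
    features: Features,
    model: Callable[[Features], float] = toy_model,
) -> float:
    """Real-time inference: one request in, one prediction out, in the same call."""
    return model(features)


def run_batch_job(
    records: List[Features],
    model: Callable[[Features], float] = toy_model,
) -> List[float]:
    """Offline batch inference: score many stored records in one later run."""
    return [model(record) for record in records]


# The user clicks Submit; the caller blocks until the prediction comes back.
prediction = handle_submit({"age": 30.0, "visits": 4.0})
print(prediction)  # 23.0
```

In the real-time path the caller waits on `handle_submit` and gets its answer inside the same interaction. `run_batch_job` would typically be triggered by a scheduler, with results written to storage for later use.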