Question 37
Domain 5
You have recently trained an XGBoost model that you intend to deploy for online inference in production. Before sending a predict request to your model's binary, you need to perform a straightforward data preprocessing step. This step should expose a REST API that can accept requests within your internal VPC Service Controls and return predictions. Your goal is to configure this preprocessing step while minimizing both cost and effort. What should you do?
Correct answer: D
Explanation
Vertex AI Endpoints provide online prediction behind a REST API, and a custom predictor lets you add the preprocessing step before inference. Using a custom container based on a Vertex built-in image minimizes effort, while storing the pickled model in Cloud Storage keeps deployment simple and low cost for serving within the VPC Service Controls boundary.
Why each option is right or wrong
A. Store a pickled model in Cloud Storage. Develop a Flask-based application, package the application in a custom container image, and then deploy the model to Vertex AI Endpoints.
B. Create a Flask-based application, package the application and a pickled model in a custom container image, and deploy the model to Vertex AI Endpoints.
C. Develop a custom predictor class based on XGBoost Predictor from the Vertex AI SDK, package it along with a pickled model in a custom container image based on a Vertex built-in image, and deploy the model to Vertex AI Endpoints.
D. Design a custom predictor class based on XGBoost Predictor from the Vertex AI SDK; package the handler in a custom container image based on a Vertex built-in container image. Store a pickled model in Cloud Storage and deploy the model to Vertex AI Endpoints.
Vertex AI serves online predictions from a model deployed to a Vertex AI Endpoint, which exposes the REST predict interface and can sit inside a VPC Service Controls perimeter. With the Vertex AI SDK, a custom predictor can subclass the XGBoost predictor to run the preprocessing logic before calling the model. Packaging that handler in a custom container derived from a Vertex built-in serving image avoids building a bespoke serving stack, which is the extra effort the Flask-based options A and B require. Storing the serialized model in Cloud Storage and deploying it to an Endpoint is the lowest-effort path because Vertex can load the artifact directly, with no separate serving infrastructure or additional runtime management. Options B and C instead bake the pickled model into the container image, so each retrained model means rebuilding the image.
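The preprocess-then-predict flow described above can be illustrated with a minimal sketch. It does not use the Vertex AI SDK: `StubXgboostPredictor` is a hypothetical stand-in for the SDK's XGBoost predictor, the weighted-sum "model" and the `gs://example-bucket/model/` path are placeholders, and the missing-value rule is an assumed preprocessing step. The method names `load`, `preprocess`, `predict`, and `postprocess` are meant to mirror the SDK's predictor interface, but check the SDK reference before relying on them.

```python
from typing import Any, Dict, List


class StubXgboostPredictor:
    """Hypothetical stand-in for the SDK's XGBoost predictor base class.

    It mimics only the method names used here so the sketch runs without
    the Vertex AI SDK or a real model artifact.
    """

    def load(self, artifacts_uri: str) -> None:
        # A real predictor would fetch and deserialize the model stored
        # under artifacts_uri (for example, a Cloud Storage path).
        self._weights = [0.5, 0.25]  # placeholder "model", not a booster

    def preprocess(self, prediction_input: Dict[str, Any]) -> List[List[Any]]:
        return prediction_input["instances"]

    def predict(self, instances: List[List[float]]) -> List[float]:
        # Placeholder scoring: a weighted sum instead of a booster call.
        return [sum(w * x for w, x in zip(self._weights, row)) for row in instances]

    def postprocess(self, prediction_results: List[float]) -> Dict[str, Any]:
        return {"predictions": prediction_results}


class PreprocessingPredictor(StubXgboostPredictor):
    """Adds a simple preprocessing step ahead of inference."""

    def preprocess(self, prediction_input: Dict[str, Any]) -> List[List[float]]:
        instances = super().preprocess(prediction_input)
        # Assumed preprocessing rule: replace missing values with 0.0.
        return [[0.0 if v is None else float(v) for v in row] for row in instances]


def handle_request(predictor: StubXgboostPredictor, body: Dict[str, Any]) -> Dict[str, Any]:
    """Runs the three stages in the order a serving handler would."""
    return predictor.postprocess(predictor.predict(predictor.preprocess(body)))


predictor = PreprocessingPredictor()
predictor.load("gs://example-bucket/model/")  # hypothetical path, never read here
response = handle_request(predictor, {"instances": [[2, None], [4, 8]]})
print(response)
```

Only `preprocess` is overridden, which is the point of option D: the model loading and serving logic come from the base class and the built-in image, and the model artifact stays outside the container.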