Question 36
Domain 5
You used Vertex AI Workbench user-managed notebooks to develop a TensorFlow model. The model pipeline accesses data from Cloud Storage, performs feature engineering and training locally, and outputs the trained model in Vertex AI Model Registry. The end-to-end pipeline takes 10 hours on the attached optimized instance type. You want to introduce model and data lineage for automated re-training runs for this pipeline only while minimizing the cost to run the pipeline. What should you do?
Correct answer: A
Explanation
Vertex AI Experiments and Vertex ML Metadata let you track model and data lineage for a specific workflow without moving the pipeline to a more expensive managed service. Logging metadata from the notebook during each run records that lineage in Vertex ML Metadata, and a scheduled recurring execution automates re-training while reusing the existing notebook setup, which minimizes cost.
Why each option is right or wrong
A. 1. Use the Vertex AI SDK to create an experiment for the pipeline runs and save metadata throughout the pipeline. 2. Configure a scheduled recurring execution for the notebook. 3. Access data and model metadata in Vertex ML Metadata.
Vertex AI Experiments is the supported mechanism for tracking run-level metadata, parameters, metrics, and artifacts for a specific notebook workflow, and it stores that lineage in Vertex ML Metadata. The SDK methods log metadata during the run, so the trained model and its input data can be traced back for each re-training. Because the workload already runs for 10 hours in a user-managed notebook on an attached optimized instance, scheduling the notebook to recur avoids migrating to a managed pipeline service. Costs stay limited to the existing instance, and the pipeline still gets automated re-training and lineage capture.
B. 1. Use the Vertex AI SDK to create an experiment, launch a custom training job in Vertex training service with the same instance type configuration as the notebook, and save metadata throughout the pipeline. 2. Configure a scheduled recurring execution for the notebook. 3. Access data and model metadata in Vertex ML Metadata.
C. 1. Refactor the pipeline code into a TensorFlow Extended (TFX) pipeline. 2. Load the TFX pipeline in Vertex AI Pipelines and configure the pipeline to use the same instance type configuration as the notebook. 3. Use Cloud Scheduler to configure a recurring execution for the pipeline. 4. Access data and model metadata in Vertex AI Pipelines.
D. 1. Create a Cloud Storage bucket to store metadata. 2. Write a function that saves data and model metadata by using TensorFlow ML Metadata in one time-stamped subfolder per pipeline run. 3. Configure a scheduled recurring execution for the notebook. 4. Access data and model metadata in Cloud Storage.
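Step 1 of option A (create an experiment and save metadata throughout the pipeline) might look roughly like the sketch below. It is a minimal illustration, not the notebook's actual code: the function name `log_tracked_run` and all argument values are hypothetical, and `ai` is assumed to be the `google.cloud.aiplatform` module, passed in as a parameter so the function can be tried with a stub. Recording the input data location as a run parameter is a simplification of full artifact-level lineage.

```python
from typing import Any, Dict


def log_tracked_run(
    ai: Any,
    project: str,
    location: str,
    experiment: str,
    run_name: str,
    params: Dict[str, Any],
    metrics: Dict[str, float],
) -> None:
    """Record one pipeline run under a Vertex AI experiment.

    `ai` is assumed to be the `google.cloud.aiplatform` module. The function
    name and every argument value are illustrative, not taken from the source.
    """
    # Associate subsequent logging calls with the named experiment.
    ai.init(project=project, location=location, experiment=experiment)

    # Open a run, log its inputs and results, and always close it,
    # even if logging raises.
    ai.start_run(run=run_name)
    try:
        ai.log_params(params)    # e.g. input data URI, hyperparameters
        ai.log_metrics(metrics)  # e.g. evaluation accuracy
    finally:
        ai.end_run()


# Hypothetical usage inside the notebook (placeholder values):
#
#   from google.cloud import aiplatform
#   log_tracked_run(
#       aiplatform,
#       project="my-project",
#       location="us-central1",
#       experiment="retraining-pipeline",
#       run_name="run-001",
#       params={"data_uri": "gs://my-bucket/training-data", "epochs": 10},
#       metrics={"accuracy": 0.93},
#   )
```

With a scheduled recurring execution, each automated re-training run would make these calls with a fresh run name, so every run appears as a separate entry under the same experiment.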