Question 37
Content Domain 4: Machine Learning Implementation and Operations
A company is building a predictive maintenance model based on machine learning (ML). The data is stored in a fully private Amazon S3 bucket that is encrypted at rest with AWS Key Management Service (AWS KMS) CMKs. An ML specialist must run data preprocessing by using an Amazon SageMaker Processing job that is triggered from code in an Amazon SageMaker notebook. The job should read data from Amazon S3, process it, and upload it back to the same S3 bucket. The preprocessing code is stored in a container image in Amazon Elastic Container Registry (Amazon ECR). The ML specialist needs to grant permissions to ensure a smooth data preprocessing workflow. Which set of actions should the ML specialist take to meet these requirements?
Correct answer: A
Explanation
SageMaker Processing jobs run under an IAM role, so the role attached to the notebook must allow creating SageMaker Processing jobs and reading from and writing to the S3 bucket. Because the bucket is encrypted with AWS KMS CMKs and the preprocessing code is a container image in Amazon ECR, the role also needs KMS permissions to decrypt and encrypt the objects and ECR permissions to pull the image.
Why each option is right or wrong
A. Create an IAM role that has permissions to create Amazon SageMaker Processing jobs, S3 read and write access to the relevant S3 bucket, and appropriate KMS and ECR permissions. Attach the role to the SageMaker notebook instance. Create an Amazon SageMaker Processing job from the notebook.
Amazon SageMaker Processing jobs execute with the IAM role supplied at job creation, so the notebook's execution role must be allowed to call `sagemaker:CreateProcessingJob` and to pass that role to SageMaker via `iam:PassRole`. Because the input and output data are in a private S3 bucket encrypted with AWS KMS CMKs, the same role also needs `s3:GetObject`, `s3:PutObject`, and the KMS actions required to decrypt and encrypt the objects, typically `kms:Decrypt`, `kms:Encrypt`, `kms:GenerateDataKey`, and `kms:DescribeKey` on the CMK. Since the preprocessing container is stored in Amazon ECR, the role must also be able to pull the image, which requires ECR read permissions such as `ecr:GetAuthorizationToken`, `ecr:BatchGetImage`, and `ecr:GetDownloadUrlForLayer`.
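As a rough illustration of the permissions listed for option A, the sketch below assembles an IAM policy document as a plain Python dictionary. The function name, the bucket name, and all ARNs are hypothetical placeholders, and the statement layout is one possible arrangement rather than an official AWS template. A real role would likely need further permissions that the explanation does not list (for example, logging), and those are deliberately left out.

```python
import json


def build_processing_role_policy(bucket, kms_key_arn, ecr_repo_arn, role_arn):
    """Return a policy document (dict) covering the actions named for option A.

    All arguments are hypothetical placeholders supplied by the caller.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Lets code in the notebook start Processing jobs.
                "Sid": "CreateProcessingJobs",
                "Effect": "Allow",
                "Action": ["sagemaker:CreateProcessingJob"],
                "Resource": "*",
            },
            {   # Lets the caller hand this role to SageMaker, and only to SageMaker.
                "Sid": "PassRoleToSageMaker",
                "Effect": "Allow",
                "Action": ["iam:PassRole"],
                "Resource": role_arn,
                "Condition": {
                    "StringEquals": {"iam:PassedToService": "sagemaker.amazonaws.com"}
                },
            },
            {   # Read input objects and write processed output to the same bucket.
                "Sid": "ReadWriteObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {   # Decrypt inputs and encrypt outputs with the bucket's CMK.
                "Sid": "UseBucketKey",
                "Effect": "Allow",
                "Action": [
                    "kms:Decrypt",
                    "kms:Encrypt",
                    "kms:GenerateDataKey",
                    "kms:DescribeKey",
                ],
                "Resource": kms_key_arn,
            },
            {   # The ECR auth token action is not scoped to a repository.
                "Sid": "EcrAuth",
                "Effect": "Allow",
                "Action": ["ecr:GetAuthorizationToken"],
                "Resource": "*",
            },
            {   # Pull the preprocessing image from one repository.
                "Sid": "EcrPull",
                "Effect": "Allow",
                "Action": ["ecr:BatchGetImage", "ecr:GetDownloadUrlForLayer"],
                "Resource": ecr_repo_arn,
            },
        ],
    }


if __name__ == "__main__":
    policy = build_processing_role_policy(
        bucket="example-maintenance-data",  # hypothetical bucket name
        kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
        ecr_repo_arn="arn:aws:ecr:us-east-1:111122223333:repository/example-preprocess",
        role_arn="arn:aws:iam::111122223333:role/ExampleNotebookRole",
    )
    print(json.dumps(policy, indent=2))
```

The point of the sketch is that all four permission groups (SageMaker, S3, KMS, ECR) sit on one role, which is what distinguishes option A from the other options.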
B. Create an IAM role that has permissions to create Amazon SageMaker Processing jobs. Attach the role to the SageMaker notebook instance. Create an Amazon SageMaker Processing job with an IAM role that has read and write permissions to the relevant S3 bucket, and appropriate KMS and ECR permissions.
The notebook role can start jobs, but this option splits the permissions across two roles instead of granting a single role the complete set needed to create the job and to access S3, KMS, and ECR.
C. Create an IAM role that has permissions to create Amazon SageMaker Processing jobs and to access Amazon ECR. Attach the role to the SageMaker notebook instance. Set up both an S3 endpoint and a KMS endpoint in the default VPC. Create Amazon SageMaker Processing jobs from the notebook.
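VPC endpoints provide network connectivity only. They do not replace the IAM permissions needed to read and write the S3 data or to use the KMS key that encrypts the objects, and the role in this option has neither.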
D. Create an IAM role that has permissions to create Amazon SageMaker Processing jobs. Attach the role to the SageMaker notebook instance. Set up an S3 endpoint in the default VPC. Create Amazon SageMaker Processing jobs with the access key and secret key of the IAM user with appropriate KMS and ECR permissions.
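SageMaker jobs are created with an IAM role, not with an IAM user's access key and secret key, so supplying user credentials is not the proper role-based permission model. The S3 endpoint adds network connectivity but grants no permissions.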
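To make the role-based model concrete, the sketch below builds a request dictionary in the shape accepted by the `CreateProcessingJob` API (for example, via boto3's `create_processing_job`). The request carries a role ARN and has no field for an access key or secret key. The function name, job name, instance settings, container paths, and all ARNs and URIs are hypothetical placeholders, and the sketch only constructs the dictionary without calling AWS.

```python
def build_processing_job_request(job_name, role_arn, image_uri,
                                 input_s3_uri, output_s3_uri, kms_key_arn):
    """Return keyword arguments for a CreateProcessingJob call.

    Every value passed in is a hypothetical placeholder; nothing is sent to AWS.
    """
    return {
        "ProcessingJobName": job_name,
        # The job runs under this IAM role; no user credentials are supplied.
        "RoleArn": role_arn,
        # Preprocessing code packaged as a container image in Amazon ECR.
        "AppSpecification": {"ImageUri": image_uri},
        "ProcessingResources": {
            "ClusterConfig": {
                "InstanceCount": 1,
                "InstanceType": "ml.m5.xlarge",  # assumed instance type
                "VolumeSizeInGB": 30,            # assumed volume size
            }
        },
        # Data is copied from S3 into the container before the code runs.
        "ProcessingInputs": [
            {
                "InputName": "raw-data",
                "S3Input": {
                    "S3Uri": input_s3_uri,
                    "LocalPath": "/opt/ml/processing/input",
                    "S3DataType": "S3Prefix",
                    "S3InputMode": "File",
                },
            }
        ],
        # Results are uploaded back to S3 and encrypted with the given key.
        "ProcessingOutputConfig": {
            "Outputs": [
                {
                    "OutputName": "processed-data",
                    "S3Output": {
                        "S3Uri": output_s3_uri,
                        "LocalPath": "/opt/ml/processing/output",
                        "S3UploadMode": "EndOfJob",
                    },
                }
            ],
            "KmsKeyId": kms_key_arn,
        },
    }


if __name__ == "__main__":
    request = build_processing_job_request(
        job_name="example-preprocessing-job",
        role_arn="arn:aws:iam::111122223333:role/ExampleNotebookRole",
        image_uri="111122223333.dkr.ecr.us-east-1.amazonaws.com/example-preprocess:latest",
        input_s3_uri="s3://example-maintenance-data/raw/",
        output_s3_uri="s3://example-maintenance-data/processed/",
        kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
    )
    print(sorted(request))
```

In a notebook, a dictionary like this could be passed as `boto3.client("sagemaker").create_processing_job(**request)`, assuming the notebook's role holds the permissions described for option A.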