Question 1
Domain 5: Security, Compliance, and Governance for AI Solutions
A bank is fine-tuning a large language model (LLM) on Amazon Bedrock to assist customers with questions about their loans. The bank wants to ensure that the model does not reveal any private customer data. Which solution meets these requirements?
Correct answer: B
Explanation
PII must be removed before fine-tuning because a fine-tuned model can memorize its training data. If raw customer records are in the training set, the model can learn and later reproduce sensitive details. Sanitizing the corpus during data preparation keeps private customer data out of the training set in the first place.
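As a rough illustration of sanitizing at data-prep time, the sketch below masks a few PII types in training records before they are written out for fine-tuning. The regex patterns, the `redact_pii` and `sanitize_records` helpers, and the prompt/completion record shape are all assumptions made for this example. They are not an AWS API, and a real pipeline would need far broader coverage (names, addresses, account numbers) or a managed PII-detection service.

```python
import re

# Hypothetical patterns for illustration only; real data needs broader coverage.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def redact_pii(text: str) -> str:
    """Replace every match of each pattern with a placeholder such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def sanitize_records(records: list[dict[str, str]]) -> list[dict[str, str]]:
    """Redact every text field of every training record."""
    return [{key: redact_pii(value) for key, value in record.items()}
            for record in records]


# Assumed record shape: one prompt/completion pair per training example.
raw = [{
    "prompt": "What is the balance on my loan? "
              "Reach me at jane.doe@example.com or 555-123-4567.",
    "completion": "The loan for SSN 123-45-6789 has a balance of $12,000.",
}]

clean = sanitize_records(raw)
print(clean[0]["prompt"])
print(clean[0]["completion"])
```

Only the sanitized records would then be uploaded as the fine-tuning dataset, so the model never sees the original values.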
Why each option is right or wrong
A. Use Amazon Bedrock Guardrails.
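Guardrails can block or mask sensitive content in prompts and responses at inference time, but they do not clean the fine-tuning dataset, so the model still learns any private data it contains.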
B. Remove personally identifiable information (PII) from the customer data before fine-tuning the LLM.
Fine-tuning on Amazon Bedrock adapts the model to the customer’s own training data, so any raw customer records in that data can be learned and potentially reproduced by the model. Removing or masking the sensitive fields during data preparation, before the fine-tuning job starts, keeps private customer data out of the training set, so the model has nothing private to reveal.
C. Increase the Top-K parameter of the LLM.
Top-K changes token sampling diversity; it does not remove sensitive training data.
D. Store customer data in Amazon S3. Encrypt the data before fine-tuning the LLM.
Encryption protects the data at rest in Amazon S3, but the data is decrypted for training, so any PII it contains still reaches the model.
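To make the point about option C concrete, here is a toy sketch of Top-K filtering. The next-token distribution and the `top_k_candidates` helper are invented for illustration and are not taken from any real model or from the Bedrock API. The sketch assumes the model has memorized a sensitive token and shows that changing K only changes how many candidates are kept.

```python
def top_k_candidates(probs: dict[str, float], k: int) -> dict[str, float]:
    """Keep the k most probable tokens and renormalize their probabilities."""
    top = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {token: p / total for token, p in top}


# Hypothetical distribution from a model that memorized an account number.
next_token_probs = {"4417": 0.70, "your": 0.15, "the": 0.10, "a": 0.05}

narrow = top_k_candidates(next_token_probs, k=1)
wide = top_k_candidates(next_token_probs, k=4)

# The memorized token stays in the candidate set at both settings.
print("4417" in narrow, "4417" in wide)
```

A larger K spreads probability across more tokens, but the memorized value remains a candidate, which is why sampling parameters cannot substitute for cleaning the training data.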