Question 20
Domain 2: Evaluation, Tuning, and Quality Optimization
What evaluation metrics best assess this RAG system?
Correct answer: A
Explanation
Retrieval metrics like "Precision@K, Recall@K, MRR" measure whether the system finds the right supporting documents, which is the retrieval step in RAG. Generation metrics like "Faithfulness, Answer Relevance, Context Precision" assess whether the answer stays grounded in the retrieved context and addresses the question, covering the generation step.
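The retrieval metrics named above can be computed from ranked document IDs and a labeled set of relevant IDs. The following is a minimal sketch, not a reference implementation: the function names, the document IDs, and the two sample queries are all hypothetical and chosen only for illustration. It assumes binary relevance labels and uses the common convention of dividing Precision@K by K even when fewer than K documents are returned.

```python
from typing import List, Sequence, Set, Tuple


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved IDs that are relevant (divides by k)."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant IDs that appear in the top-k results."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    if not relevant:
        return 0.0  # assumption: recall is reported as 0 when nothing is labeled relevant
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ret, rel) for ret, rel in runs) / len(runs)


# Hypothetical evaluation data: ranked retriever output and labeled relevant IDs.
runs = [
    (["d3", "d1", "d7", "d2"], {"d1", "d2"}),  # first relevant hit at rank 2
    (["d5", "d9", "d4"], {"d5"}),              # first relevant hit at rank 1
]

first_retrieved, first_relevant = runs[0]
print("P@3:", precision_at_k(first_retrieved, first_relevant, k=3))
print("R@3:", recall_at_k(first_retrieved, first_relevant, k=3))
print("MRR:", mean_reciprocal_rank(runs))
```

For the first sample query, one of the top three results is relevant, so Precision@3 is 1/3 and Recall@3 is 1/2; across both queries the reciprocal ranks are 1/2 and 1, giving an MRR of 0.75. Generation metrics such as Faithfulness and Answer Relevance usually require an LLM-based or human judge and are not covered by this sketch.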
Why each option is right or wrong
A. Use retrieval metrics (Precision@K, Recall@K, MRR) for retrieval quality, and generation metrics (Faithfulness, Answer Relevance, Context Precision) for generation quality.
RAG is evaluated in two distinct stages, so the metric set must cover both the retriever and the generator. Precision@K, Recall@K, and MRR are standard information-retrieval measures: the first two indicate whether the top-K passages contain the needed evidence, and MRR indicates how highly the first relevant result is ranked. Faithfulness and Answer Relevance then test whether the produced answer is supported by the retrieved context and actually addresses the query rather than hallucinating. Context Precision, as defined in frameworks such as Ragas, checks whether the relevant chunks are ranked near the top of the context passed to the generator.
B. Use only BLEU and ROUGE scores to evaluate output quality.
C. Manually review 100 random outputs.
D. Measure user satisfaction scores only.