NCP-AAI Practice Q20

A. Use retrieval metrics (Precision@K, Recall@K, MRR) for retrieval quality, and generation metrics (Faithfulness, Answer Relevance, Context Precision) for generation quality.

RAG is evaluated in two distinct stages, so the metric set must cover both the retriever and the generator. Precision@K, Recall@K, and MRR are standard information-retrieval measures: the first two indicate whether the top-K passages contain the needed evidence, and MRR reflects how highly the first relevant result is ranked. Faithfulness and Answer Relevance then test whether the produced answer is supported by the retrieved context and actually addresses the query rather than hallucinating, while Context Precision checks how much of the context handed to the generator is relevant to the query.

B. Use only BLEU and ROUGE scores to evaluate output quality.

C. Manually review 100 random outputs.

D. Measure user satisfaction scores only.
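
The retrieval metrics named in option A can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, the document IDs, and the convention of dividing Precision@K by K (rather than by the number of results actually returned) are assumptions made for the example. The generation metrics are omitted because they usually rely on an LLM or human judge.

```python
from typing import Sequence, Set, Tuple


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved items that are relevant (divides by k)."""
    if k <= 0:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant items that appear in the top-k results."""
    if not relevant or k <= 0:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant item, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Sequence[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ret, rel) for ret, rel in runs) / len(runs)


# Hypothetical data: two queries with made-up document IDs.
query_1 = (["d3", "d1", "d7", "d2"], {"d1", "d2"})
query_2 = (["d5", "d9"], {"d5"})

p_at_3 = precision_at_k(*query_1, k=3)   # one relevant hit in the top 3
r_at_3 = recall_at_k(*query_1, k=3)      # one of two relevant docs found
mrr = mean_reciprocal_rank([query_1, query_2])
```

In this made-up example the first query places its first relevant document at rank 2 and the second query at rank 1, so the two reciprocal ranks are 0.5 and 1.0.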
