Question 32
Domain 2: Core Machine Learning, AI, and Transformer Foundations
When fine-tuning an LLM for a specific application, why is Exploratory Data Analysis (EDA) essential?
Correct answer: B
Explanation
EDA is used to "uncover patterns and anomalies in the dataset," which helps identify data quality issues, class imbalance, and feature relationships before fine-tuning. This matters because an LLM fine-tuned on flawed or skewed data can reproduce those flaws and perform poorly on the target application.
Why each option is right or wrong
A. To increase the model's parameter count
B. To uncover patterns and anomalies in the dataset
Exploratory Data Analysis is the pre-modeling step used to inspect the fine-tuning dataset for structure, outliers, missing values, skew, and class imbalance before any fine-tuning begins. An LLM inherits whatever distributional problems are present in its fine-tuning data, so EDA is the stage that reveals those issues early enough to correct them before training.
C. To reduce overall training time
D. To eliminate the need for pre-trained models
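As an illustration of the checks named under option B (missing values, duplicates, length skew, class imbalance), here is a minimal sketch of a pre-fine-tuning EDA pass. It assumes a hypothetical record format with `text` and `label` keys; the function name `summarize_dataset` and the toy data are invented for this example and are not part of the question. A real project would likely use a dataframe library and inspect many more properties.

```python
from collections import Counter
from statistics import mean, median


def summarize_dataset(examples):
    """Return basic EDA statistics for a list of {"text", "label"} records.

    The record layout is an assumption made for this sketch.
    """
    texts = [ex.get("text") for ex in examples]

    # Missing or blank inputs cannot be used for fine-tuning.
    missing = sum(1 for t in texts if t is None or not t.strip())
    present = [t for t in texts if t is not None and t.strip()]

    # Count exact duplicates after trimming whitespace and lowercasing.
    text_counts = Counter(t.strip().lower() for t in present)
    duplicates = sum(c - 1 for c in text_counts.values() if c > 1)

    # Whitespace-split word counts as a rough proxy for sequence length.
    lengths = [len(t.split()) for t in present]

    # Label frequencies expose class imbalance.
    label_counts = Counter(ex.get("label") for ex in examples)

    return {
        "n_examples": len(examples),
        "missing_text": missing,
        "duplicate_texts": duplicates,
        "mean_length": mean(lengths) if lengths else 0,
        "median_length": median(lengths) if lengths else 0,
        "max_length": max(lengths) if lengths else 0,
        "label_counts": dict(label_counts),
    }


# Hypothetical toy dataset: one blank input, one near-duplicate, skewed labels.
toy_data = [
    {"text": "Reset my password please", "label": "account"},
    {"text": "reset my password please", "label": "account"},
    {"text": "Where is my order?", "label": "shipping"},
    {"text": "", "label": "account"},
    {"text": "I was charged twice for the same order last week", "label": "account"},
]

report = summarize_dataset(toy_data)
for key, value in report.items():
    print(f"{key}: {value}")
```

On this toy data the report flags one blank input, one duplicate, and a 4-to-1 label skew toward `account`. These are the kinds of anomalies that are cheaper to fix before fine-tuning than to diagnose afterward.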