Question 25
Domain 3: Train and evaluate models
The Daily Bugle is a news organization led by J. Jonah Jameson. One of the current projects is creating a new experiment in Azure Machine Learning Studio. One class has a much smaller number of observations than the other classes in the training set. You need to choose an appropriate data sampling strategy to compensate for the class imbalance. The lead developer, Peter Parker, has used the Stratified split sampling mode. Will Peter’s solution meet the goal?
Correct answer: B
Explanation
No. Stratified split keeps the same class distribution in each partition, so it preserves the class proportions rather than changing them. The minority class therefore remains underrepresented and the class imbalance is not compensated for. A resampling method, such as oversampling the minority class or undersampling the majority classes, is needed.
Why each option is right or wrong
A. Yes. Stratified split automatically oversamples the minority class so the training data becomes balanced.
B. No. Stratified split preserves the class proportions in each partition, but it does not correct class imbalance.
Azure Machine Learning Studio’s Stratified split sampling mode is designed to preserve the label distribution across the resulting partitions, not to alter it. If one class is underrepresented in the training data, the split keeps that minority class at the same relative proportion in each partition. For example, a 90/10 class ratio in the source data remains roughly 90/10 in every partition. The split therefore does not perform the oversampling or undersampling needed to address the imbalance.
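
The difference between preserving and changing class proportions can be illustrated with a minimal, standard-library Python sketch. This is not the Azure Machine Learning Studio module or its API: `stratified_split` and `random_oversample` are hypothetical helpers written for this illustration, and the 90/10 toy labels are made up.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, train_fraction, seed=0):
    """Hypothetical helper: split row indices so each class keeps its share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return train, test


def random_oversample(labels, indices, seed=0):
    """Hypothetical helper: duplicate rows of smaller classes until all match the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index in indices:
        by_class[labels[index]].append(index)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap up to the largest class size.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Toy data (assumption): 90 majority rows and 10 minority rows.
labels = ["majority"] * 90 + ["minority"] * 10

train_idx, test_idx = stratified_split(labels, train_fraction=0.8)
train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)

balanced_idx = random_oversample(labels, train_idx)
balanced_counts = Counter(labels[i] for i in balanced_idx)

print("train after stratified split:", dict(train_counts))
print("test after stratified split:", dict(test_counts))
print("train after oversampling:", dict(balanced_counts))
```

In this sketch the stratified split leaves the training partition at 72 majority rows to 8 minority rows, the same 9:1 ratio as the source data. Only the separate oversampling step brings the two classes to equal counts, which is why option B is correct.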