Question 20
Which sampling technique preserves class proportions across train and test sets?
Correct answer: B
Explanation
Stratified sampling divides the data so each split keeps the same class distribution as the full dataset. This preserves class proportions across the train and test sets, which is why it is used when classes are imbalanced.
Why each option is right or wrong
A. Random sampling
B. Stratified sampling
In standard machine-learning practice, stratification is the splitting method that keeps the class-frequency distribution in each partition the same as in the full dataset. In scikit-learn, for example, `train_test_split(..., stratify=y)` uses the target labels to allocate samples so that each split mirrors the original class proportions as closely as the sample counts allow, which is especially important when the classes are imbalanced.
C. Cluster sampling
D. Systematic sampling
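
The stratified split described under option B can be sketched with scikit-learn as follows. This is a minimal illustration, not a reference implementation: the toy dataset (90 samples of class 0, 10 of class 1), the 20% test size, and the `random_state` value are all assumptions chosen for the example.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Toy imbalanced dataset, made up for illustration:
# 90 samples of class 0 and 10 samples of class 1.
X = [[i] for i in range(100)]
y = [0] * 90 + [1] * 10

# stratify=y asks scikit-learn to keep the class ratio of y in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Count how many samples of each class landed in each split.
train_counts = Counter(y_train)
test_counts = Counter(y_test)
print("train:", dict(train_counts))
print("test:", dict(test_counts))
```

With these numbers the training set should hold 72 samples of class 0 and 8 of class 1, and the test set 18 and 2, so both splits keep the original 9:1 ratio. Dropping `stratify=y` gives a plain random split, where the minority-class count in the test set can vary from run to run.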