MLS-C01 Practice Q36

A. Apply binning to group the text into numeric ranges

Binning groups numeric values into intervals; it does not split free text into language units for modeling.
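
For contrast, here is a minimal sketch of what binning does to a numeric column. The prices, bin edges, and labels are hypothetical and are not part of the question; the point is only that binning needs numbers to start from.

```python
from bisect import bisect_right

# Hypothetical bin edges and labels, chosen only for illustration.
EDGES = [10, 50, 100]
LABELS = ["under_10", "10_to_49", "50_to_99", "100_plus"]

def bin_value(value: float) -> str:
    """Map a numeric value to the label of the interval it falls in."""
    # bisect_right counts how many edges are <= value, which is the bin index.
    return LABELS[bisect_right(EDGES, value)]

prices = [4.99, 10, 49.5, 250]
binned = [bin_value(p) for p in prices]
print(binned)
```

A product description has no position on a numeric scale, so there is nothing for the bin edges to compare against.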

B. Use tokenization to split the text into smaller units

The source material identifies tokenization as a feature engineering technique specifically suited to text data. Because the field contains free-text product descriptions, the appropriate first step is to break the text into smaller units (words or subwords) that can later be represented as features for the model.
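
A minimal sketch of word-level tokenization, using only the Python standard library. The sample description and the regular expression are assumptions made for illustration; production pipelines typically rely on a dedicated tokenizer, and subword schemes work differently.

```python
import re
from typing import List

def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    # Match runs of letters/digits, optionally followed by an apostrophe suffix
    # so that "men's" stays one token. Punctuation and hyphens act as separators.
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())

# Hypothetical product description, not taken from the question.
description = "Lightweight, water-resistant hiking boots for men's trails."
tokens = tokenize(description)
print(tokens)
```

Each token can then be counted, indexed, or embedded, which is what "represented as features" amounts to in later steps.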

C. Use one-hot encoding to detect unusual text values

One-hot encoding represents categorical values; it does not first divide raw text into words or subwords.
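
The sketch below shows one-hot encoding on a small categorical column, assuming a hypothetical list of color values. It is a hand-rolled illustration, not any particular library's encoder.

```python
def one_hot(values):
    """Return the sorted category list and one indicator row per input value."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one position is set per value
        rows.append(row)
    return categories, rows

# Hypothetical categorical column with a small, fixed set of values.
colors = ["red", "blue", "red", "green"]
categories, encoded = one_hot(colors)
print(categories)
print(encoded)
```

Applied directly to free-text descriptions, this would create one column per distinct description, which carries no information about the words they share.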

D. Reduce dimensionality before extracting any text elements

Dimensionality reduction is applied after features exist; raw text must first be converted into usable components.
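
The ordering can be sketched as follows. The token lists are hypothetical, and pruning rare terms is used here only as a simple stand-in for dimensionality reduction; it is not PCA or any specific technique named by the question.

```python
from collections import Counter

# Hypothetical token lists, as if produced by an earlier tokenization step.
docs = [
    ["red", "cotton", "shirt"],
    ["blue", "cotton", "shirt"],
    ["red", "wool", "scarf"],
]

# Step 1: features must exist first. Count how many documents contain each token.
doc_freq = Counter(tok for doc in docs for tok in set(doc))
vocab = sorted(doc_freq)

# Step 2: only now can dimensions be dropped. Keep tokens seen in at least 2 documents.
kept = [tok for tok in vocab if doc_freq[tok] >= 2]

def vectorize(doc, terms):
    """Build a bag-of-words count vector over the given terms."""
    counts = Counter(doc)
    return [counts[term] for term in terms]

full = [vectorize(doc, vocab) for doc in docs]
reduced = [vectorize(doc, kept) for doc in docs]
print(len(full[0]), len(reduced[0]))
```

Without the token lists there would be no vocabulary to prune, which is why reduction cannot come before extraction.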
