NCA-GENL Practice Q9

A. Handling variable sequence lengths

B. Lack of inherent sequential order in parallel processing

Transformers apply self-attention to all tokens in parallel, so without an added position signal the attention layers are permutation-equivariant: reordering the input tokens merely reorders the outputs, and the model cannot distinguish one word order from another. Positional encoding adds position information to each token representation, which lets the model preserve sequence structure and interpret different arrangements of the same words differently (for example, "dog bites man" versus "man bites dog").

C. Memory limitations

D. Computational complexity
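
The explanation under option B can be checked with a small sketch. It assumes the sinusoidal encoding from the original Transformer paper and a heavily simplified single-head attention with identity query/key/value projections. The function names, toy token vectors, and tolerance are illustrative choices, not part of the exam material.

```python
import math

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding; assumes d_model is even."""
    pe = []
    for pos in range(seq_len):
        row = []
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            row += [math.sin(angle), math.cos(angle)]
        pe.append(row)
    return pe

def self_attention(x):
    """Single-head attention with identity Q/K/V projections (a simplification)."""
    scale = math.sqrt(len(x[0]))
    out = []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in x]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[j] for w, v in zip(weights, x)) / total
                    for j in range(len(q))])
    return out

def add(x, pe):
    """Element-wise sum of token vectors and their positional encodings."""
    return [[a + b for a, b in zip(r, p)] for r, p in zip(x, pe)]

def close(a, b, tol=1e-9):
    """True if two lists of vectors match element-wise within tol."""
    return all(abs(u - v) < tol for r, s in zip(a, b) for u, v in zip(r, s))

# Toy "embeddings" for three tokens (made-up values).
tokens = [[1.0, 0.0, 0.5, -0.5],
          [0.0, 1.0, -1.0, 0.5],
          [0.5, -0.5, 1.0, 0.0]]
perm = [2, 0, 1]
shuffled = [tokens[i] for i in perm]

# Without positions: shuffling the inputs only shuffles the outputs.
out = self_attention(tokens)
out_shuffled = self_attention(shuffled)
order_blind = close([out[i] for i in perm], out_shuffled)

# With positions: the same shuffle now produces different outputs.
pe = positional_encoding(len(tokens), len(tokens[0]))
out_pe = self_attention(add(tokens, pe))
out_pe_shuffled = self_attention(add(shuffled, pe))
order_aware = not close([out_pe[i] for i in perm], out_pe_shuffled)
```

Under these assumptions, `order_blind` and `order_aware` should both come out `True`: attention alone treats the shuffled sentence as the same set of tokens, while adding the position signal makes the two orderings distinguishable. Learned or rotary position embeddings would serve the same purpose; the sinusoidal form is used here only because it needs no training.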
