Question 23
Domain 2: Fundamentals of Generative AI
Under a token-based pricing model for an LLM API, which factors determine the cost of a single request?
Correct answer: A
Explanation
Token-based pricing charges per input token and per output token, usually with output tokens priced higher; the cost of a request is the sum of the two charges. Latency, hardware, and call count are not direct billing dimensions.
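The arithmetic behind this can be sketched in a few lines of Python. This is a minimal illustration, not any provider's actual billing logic: the function name `request_cost`, the per-1,000-token rate convention, and the rates used in the example are all hypothetical.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate_per_1k: float, output_rate_per_1k: float) -> float:
    """Cost of one request when input and output tokens are billed separately."""
    input_charge = (input_tokens / 1000) * input_rate_per_1k
    output_charge = (output_tokens / 1000) * output_rate_per_1k
    return input_charge + output_charge


# Hypothetical rates chosen only for illustration (output priced higher than input).
cost = request_cost(2000, 500, input_rate_per_1k=0.003, output_rate_per_1k=0.015)
print(f"{cost:.4f}")  # prints 0.0135
```

Note that latency, GPU memory, and the number of calls do not appear as parameters: under this model, only the token counts and their rates change the result.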
Why each option is right or wrong
A. Number of input tokens + number of output tokens, often at different per-token rates
Cost is the sum of the input-token charge and the output-token charge, with output tokens usually priced higher.
B. Wall-clock latency only
Latency affects user experience, not token-based billing.
C. GPU memory consumed
GPU memory is a compute resource, not a billing unit under token-based pricing.
D. Number of API calls made, regardless of token count
Charges depend on tokens consumed, not simply the number of API calls.