Generative AI Leader Practice Q26

A. Veo

Veo is Google's video-generation foundation model, trained to generate video from text prompts rather than still images or text-only output. That sets it apart from image-focused models such as Imagen and from general multimodal models, so Veo is the correct answer: it is the option purpose-built for text-to-video generation.

B. Imagen

Imagen is associated with text-to-image generation, not text-to-video output.

C. Gemma

Gemma is a family of lightweight open language models for text-centric tasks, not video generation.

D. Chirp

Chirp is used for speech and audio understanding tasks, not for generating video from prompts.
