DOLLAR: Few-Step Video Generation via Distillation and Latent Reward Optimization

Ablation Study

Prompt Length (VBench)

VBench incorporates the prompt optimization technique introduced in CogVideoX, which uses GPT-4o to expand short prompts into more descriptive versions while preserving their original meaning. We refer to these as "long prompts", distinguishing them from the original "short prompts". Video samples generated with short and long prompts are compared below. With long prompts, text-video alignment, understanding of object relationships, and depiction of motion are generally more robust and accurate.

Short Prompt

Long Prompt