Generate interleaved text and image content in a structured format you can directly pass to downstream APIs. You can also use this to control how many “draft” text tokens and “imagination” image tokens the model generates first before it starts generating the final output.