View Cartesia Narrator Example on GitHub

Check out the complete Cartesia Narrator example in our GitHub repository.

In this example, we build a storytelling narrator agent using Cartesia’s Sonic 3 TTS with Vision Agents. The agent narrates stories with highly expressive speech, leveraging Cartesia’s audio markup tags to customize the output with emotions, pauses, and vocal effects.
Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.
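
To make the idea of audio markup concrete, here is a minimal sketch of how narration text might be assembled with inline tags before it is sent to the TTS. It uses plain string handling only and does not call Vision Agents or Cartesia. The tag shapes (`<emotion value="..."/>`, `<break time="..."/>`), the emotion names, and the helper names `emotion_tag`, `break_tag`, and `build_narration` are assumptions made for illustration; check the Cartesia documentation for the markup Sonic 3 actually accepts.

```python
from typing import Iterable, Optional, Tuple

# One narration segment: (text, optional emotion name, pause in seconds after the text).
Segment = Tuple[str, Optional[str], float]


def emotion_tag(name: str) -> str:
    """Build an emotion tag. The tag shape is an assumption, not a confirmed API."""
    return f'<emotion value="{name}"/>'


def break_tag(seconds: float) -> str:
    """Build a pause tag. The tag shape is an assumption, not a confirmed API."""
    return f'<break time="{seconds:g}s"/>'


def build_narration(segments: Iterable[Segment]) -> str:
    """Join story segments into one marked-up string for a TTS request."""
    parts = []
    for text, emotion, pause_after in segments:
        if emotion:
            # Place the emotion tag before the text it should colour.
            parts.append(emotion_tag(emotion))
        parts.append(text.strip())
        if pause_after > 0:
            # Add a pause after the segment for dramatic effect.
            parts.append(break_tag(pause_after))
    return " ".join(parts)


# Hypothetical emotion names, chosen only for the example.
story = build_narration([
    ("Once upon a time, a lighthouse stood alone.", "calm", 1),
    ("Then the storm arrived!", "scared", 0.5),
    ("By morning, the sea was still.", None, 0),
])
print(story)
```

Keeping the markup in small helpers like these means the story content stays readable and the tag syntax can be corrected in one place if it differs from what the provider expects.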

What You Will Build

Next Steps

Cartesia Integration

Explore Cartesia’s TTS configuration and audio markup

Live Video Try-On

Real-time virtual try-on with Decart’s Lucy-2 model