Documentation Index
Fetch the complete documentation index at: https://visionagents.ai/llms.txt
Use this file to discover all available pages before exploring further.
View Simple Agent Example on GitHub
Check out the complete Simple Agent example in our GitHub repository
Vision Agents requires a Stream account
for real-time transport. Most providers offer free tiers to get started.
What You Will Build
- Listen to user speech and convert it to text with Deepgram STT
- Process conversations using OpenAI GPT-4o-mini
- Respond with natural-sounding speech via ElevenLabs TTS
- Detect when the user has finished speaking with Smart Turn detection
- Run on Stream’s low-latency edge network
Next Steps
AI Golf Coach
Add video processing with YOLO pose detection
Integrations
Swap in any of 25+ supported AI providers

