Featured
The default model for most agent workflows in 2026.
The model to reach for when the job involves images, video, audio, or very long documents.
When you need token latency measured in milliseconds, not hundreds of milliseconds, Groq is it.
Still the default for a lot of production agents.
Useful when an agent needs to swap models without rewiring its caller.
The default when you want to run a specific open model in production without standing up your own GPU fleet.
LLM Inference