What LLM do you guys use for fast inference for voice/phone agents?

What LLM do you guys use for fast inference for voice/phone agents? I feel like to get really good latency I need to "cheat" with Cerebras, Groq, or SambaNova.

Haiku 4.5 is very good, but it still seems to add a second of latency.