Hacker News
ilaksh, 8 months ago, on: Claude Haiku 4.5
What LLM do you guys use for fast inference for voice/phone agents? I feel like to get really good latency I need to "cheat" with Cerebras, Groq, or SambaNova.
Haiku 4.5 is very good but still seems to be adding a second of latency.