Can you share some of the parameters you use to enable tool calling and agentic usage?
Or, higher level, some philosophies on what approaches you are using for tuning to get better tool calling and/or agentic usage?
I'm having surprisingly good success with unsloth/Qwen3.6-27B-GGUF:Q4_K_M (love unsloth guys) on my RTX3090/24GB using opencode as the orchestrator.
It concocts some misleading paths, but the code often compiles, and I consider that a victory.
You have to watch it like you would watch a 14-year-old boy who says he is doing his homework while you hear the sound effects of explosions.
My config is similar to: https://github.com/noonghunna/club-3090/blob/master/docs/eng...
I need to try out some of the other setups mentioned in this repo for increased TPS.
Both the 27B and the A3B have handled all my production work beautifully (at Q8). I don't think any model is good at Q4.
Qwen 3.5 122b surpasses both of them tho.