
Related: live demo of DeepSeek v4 Flash running on my 128GB MacBook. It's in Italian, with English subtitles.

https://www.youtube.com/watch?v=todMmp6AGCE



For many models, llama.cpp performance on Mac is 20-40% lower than MLX. Have you tried MLX? There are MLX 2-bit quants on HF, at least. Unfortunately I only have 64GB, so I can't test it myself.


I'm not using llama.cpp there; it's my own inference engine, written specifically for DeepSeek v4. The goal is to optimize it as much as possible.


That's cool!

I knew the name sounded familiar, thank you for SDS!



