DeepSeek V4 Pro feels like Claude Opus 4.6 in its personality, but here's what I found out about costs:
I cut DeepSeek V4 loose on a decent-sized TypeScript codebase and asked it to focus on a single endpoint, go through it in depth layer by layer (API, DTOs, service, database models), form a complete picture of the types involved and introduced, and make sure no ad hoc types were being introduced.
It produced a very brief but to-the-point summary of the types being introduced, which of them were redundant, etc.
Then I asked it to simplify it all.
It obviously went through lots of files in both prompts but total cost? Just $0.09 for the Pro version.
On Claude Opus, I think (from past experience, before the price hikes) these two prompts alone would easily have burned somewhere between $9 and $13, with not much added benefit.
Note: I didn't use OpenRouter; I used the DeepSeek API directly, because OpenRouter itself was being rate limited by DeepSeek.
I find a lot of the inefficiency also comes from the model just randomly poking around and grepping all the time, which is the fault of the harness. I ended up building a Prolog-based MCP server where I use tree-sitter to parse the code into a graph, and then the model can just ask questions like 'what are all the functions connected to this function'. So if you're trying to focus on what a particular endpoint is doing, you can trivially and predictably trace the whole subgraph of calls.
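To make the "trace the whole subgraph" idea concrete, here is a minimal sketch in Python (not the Prolog implementation described above). The edge list and every function name in it are hypothetical, hand-written stand-ins for what a tree-sitter pass over a codebase might extract; the traversal itself is a plain breadth-first search.

```python
from collections import deque

# Hypothetical call edges (caller -> callees). In a real setup these would be
# extracted from the AST; here they are written by hand for illustration.
CALLS = {
    "getOrderHandler": ["parseOrderDto", "OrderService.getOrder"],
    "OrderService.getOrder": ["OrderRepo.findById", "toOrderDto"],
    "OrderRepo.findById": [],
    "parseOrderDto": [],
    "toOrderDto": [],
    "unrelatedJob": ["OrderRepo.findById"],
}

def reachable_from(graph, start):
    """Return every function reachable from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        fn = queue.popleft()
        for callee in graph.get(fn, []):
            if callee not in seen:
                seen.add(callee)
                order.append(callee)
                queue.append(callee)
    return order

# Everything the endpoint handler can end up calling, directly or indirectly.
subgraph = reachable_from(CALLS, "getOrderHandler")
print(subgraph)
```

The point of answering this with one deterministic query is that the model gets the full set of relevant functions in a single tool call, instead of discovering them through repeated greps.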
I don’t know if it exists already, but Bazel would be very useful for the same type of MCP server. Since all dependencies are explicit, you can pretty easily do a Bazel (r)deps query to find related targets.
Similar idea. I find tree-sitter is nice because it already supports a bunch of languages and it's easily extensible. Once you have the AST, you can really have the LLM go to town with it.
I've been having the same experience. Tasks like "go through this entire module and pedantically make it match my preferred styleguide exactly" were not worth a couple of dollars with frontier models. It's nice to be able to put DeepSeek Flash on stupid, unnecessary, or highly speculative tasks without thinking about the cost.
DeepSeek V4 Pro's pricing is blowing me away, particularly with how effective the cache is. I just burned 2M tokens and the total cost was 30¢. On Claude Code, I'd have used up multiple 5 hour windows by now, or else horrific amounts of API consumption, around $20-$30 I'm guessing.
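As a rough sanity check on those numbers, here is a small sketch of the implied blended rate. The function name is made up for illustration, and the $20 to $30 range is the guess quoted above, not a measured figure.

```python
def cost_per_million(total_cost_usd: float, total_tokens: int) -> float:
    """Blended cost per one million tokens (input, output, and cache hits mixed)."""
    return total_cost_usd / total_tokens * 1_000_000

# 2M tokens for 30 cents, as reported above.
deepseek_rate = cost_per_million(0.30, 2_000_000)

# How many times more the guessed $20-$30 would be for the same work.
guess_low = 20 / 0.30
guess_high = 30 / 0.30

print(f"${deepseek_rate:.2f} per 1M tokens")
print(f"roughly {guess_low:.0f}x to {guess_high:.0f}x cheaper than the guess")
```

Because the figure mixes cached and uncached tokens, it says more about how much of the session hit the cache than about the list price.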
> It obviously went through lots of files in both prompts but total cost? Just $0.09 for the Pro version.
When people say that LLMs aren't worth it, it kills me.
A lot of us, on average, make $100+ an hour. $0.09 is < 4 seconds of our time.
You can't even read the vast majority of prompt responses that fast.
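The wage comparison is easy to verify. A minimal sketch, assuming the $100 per hour figure used above (the function name is just for illustration):

```python
def seconds_of_wage(cost_usd: float, hourly_rate_usd: float) -> float:
    """How many seconds of paid time a given cost corresponds to."""
    return cost_usd / hourly_rate_usd * 3600

secs = seconds_of_wage(0.09, 100)
print(f"{secs:.2f} seconds")
```

At $100 an hour, $0.09 comes out to about 3.24 seconds, which matches the "< 4 seconds" claim.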
LLMs will continue to get better (though I doubt it will be at previous rates; all indications are that progress is slowing and costs are increasing disproportionately).
It seems like >50% of devs think LLMs provide less than 0 value. I just do not get it.
Did they use an LLM one time 3 years ago and decide it's never going to be worth it? Have they even tried? Or have they only ever tried it on one giant, monolithic proprietary codebase where they're a total expert, and decided that an LLM isn't as good as them, so it's "completely worthless"?
They are shockingly unhelpful on my company's codebase.
But that doesn't mean they are flat-out worthless.
I know I'm guilty of making this sort of argument sometimes, but it's just not valid.
I don't get paid for every waking hour of every day. Often I'm using an LLM for something that's uncompensated, so my hourly wage equivalent is irrelevant.
And for times when we might use an LLM for something related to paid work, it's still money out of your paycheck (unless the employer is paying for it; go nuts in that case). And it's not like using the LLM lets you go home early if it saves you time. You just end up doing more work.
I still use them because they're a useful tool sometimes. But I don't pretend it has negligible or no cost. (Not to mention the externalities around electricity use, crazy data center buildout, skyrocketing GPU and RAM prices, etc.)
I don't understand. Your employer doesn't pay for your AI use? If my employer didn't pay for it, I just wouldn't use it at all, out of principle. Just as I don't buy my own work laptop.
I'm guessing downvoted because OpenRouter was mentioned in the note (which may not have been there originally), but aside from that this is a perfectly legitimate question. In order to reproduce we need to know how. Was it a coding agent like opencode, an IDE, or something else?
Microsoft just announced the availability of OpenAI GPT-5.5, for which they are charging 30x. In contrast, they charge 7.5x for Claude Opus 4.6 and 1x for OpenAI GPT-5.4.
Check out the token-based pricing, and compare GPT-5.5 with all other models.
When I check GH Copilot right now, it looks like the Opus 4.7 multiplier was increased to 15x (I think it was 6x just a few days ago), but 4.6 is still at 3x. These relatively cheap multipliers exist only until the end of the month, though.
That’s the classic phenomenon of cheaper pricing due to offshoring! If your expenses are in dollars then for sure recovery is going to be in dollars as well. Why is that a surprise to anyone?
The only similarity it has to Opus 4.6 is the 4 in the name. I do not understand these dishonest comparisons. OSS models are cool, cheap, and promising for the future, but why are we pretending they are better than they are?
Speak for yourself. I found switching from Opus 4.7 to be completely painless and in fact, due to the reliability of Anthropic’s API, less friction despite slower response times. Zero issues on a large monorepo.
Hi, I am happy it works well for you. Personally, I struggle to find good use cases in general for these OSS models. I am lightly technical but I do not code manually. So my flow is /grill-me (can take hours), make a plan, review the plan with a second model, implement, review after implementation.
Maybe it is because my tasks are usually chunkier, or because I can't code myself, that I struggle with cheaper models. It feels like at every stage of this process a SOTA model improves things by 5%, which adds up.
But maybe I am ignorant of Opus's level. My main driver is 5.5, and Opus is there for frontend and a second opinion. In the past I also used Claude models for the chatting phase, but 5.5 took over recently. Maybe DeepSeek is closer to Opus and I just overestimated the model compared to 5.5? I tried to give it the benefit of the doubt on being similar.
Recently I started experimenting with DeepSeek Flash, hoping that if the plan is solid enough it can implement quickly and cheaply, but for now it does not feel worth it.
How do you use the model to see the benefits? Have you tried 5.5 and can you compare to that one as well?
In my experience, DeepSeek models are massively overrated in terms of how good they actually are at agentic usage, coding, and writing, just because they are kind of the first open-source entrant and the name a lot of people know. Try GLM 5.1.
What provider are you using? I gave it a shot through OpenRouter and saw some weird half-formed words coming through occasionally. Would love to switch over and give it a proper go.