DeepSeek V4 Pro feels like Claude Opus 4.6 in its personality but here's what I...

yogthos · 2026-05-02T13:58:22 1777730302

I find a lot of the inefficiency also comes from the model just randomly poking around and grepping all the time, which is the fault of the harness. I ended up building a Prolog-based MCP where I use tree-sitter to parse the code into a graph, and then the model can just ask questions like 'what are all the functions connected to this function'. So, in case you're trying to focus on what a particular endpoint is doing, you can trivially and predictably trace the whole subgraph of calls.
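To illustrate the kind of query meant here, below is a minimal Python sketch of tracing everything reachable from one function over a call graph that has already been extracted. It is not how chiasmus is implemented (that is Prolog over a tree-sitter parse); the `CALL_GRAPH` contents and the `reachable_calls` helper are hypothetical names made up for the example.

```python
from collections import deque

# Hypothetical call graph: caller -> list of callees.
# In a real tool this would be extracted from the parsed source code.
CALL_GRAPH = {
    "handle_request": ["parse_body", "save_user"],
    "parse_body": ["validate"],
    "save_user": ["validate", "db_insert"],
    "validate": [],
    "db_insert": [],
    "unrelated_job": ["db_insert"],
}


def reachable_calls(graph, start):
    """Return every function transitively called from `start`, in BFS order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        fn = queue.popleft()
        for callee in graph.get(fn, []):
            if callee not in seen:
                seen.add(callee)
                order.append(callee)
                queue.append(callee)
    return order


print(reachable_calls(CALL_GRAPH, "handle_request"))
```

The point is that the answer is deterministic and comes back in one call, instead of the model issuing a series of greps and guessing at the connections.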

https://github.com/yogthos/chiasmus

__turbobrew__ · 2026-05-02T18:57:39 1777748259

I don’t know if it exists already, but bazel would be very useful for the same type of MCP server. Since all dependencies are explicit you can pretty easily do a bazel (r)deps query to find related targets.

yogthos · 2026-05-02T22:04:28 1777759468

Similar idea. I find tree-sitter is nice because it already supports a bunch of languages and it's easily extensible. Once you have the AST, you can really have the LLM go to town with it.

fragmede · 2026-05-04T12:34:06 1777898046

yeah, lsp integration is way better than grep

mark_l_watson · 2026-05-02T15:12:11 1777734731

Chiasmus looks very cool. I might have a use for it because I like to use LLM harnesses to explore code. Thanks.

yogthos · 2026-05-02T15:20:55 1777735255

Awesome, and feel free to open issues if you find anything missing that would be useful.

jbritton · 2026-05-02T16:19:42 1777738782

This sounds great. I’m going to play with it.

soerxpso · 2026-05-02T14:23:11 1777731791

I've been having the same experience. Tasks like "go through this entire module and pedantically make it match my preferred styleguide exactly" were not worth a couple dollars with frontier models. It's nice to be able to put deepseek flash on stupid, unnecessary or highly speculative tasks without thinking about the cost.

trollbridge · 2026-05-08T23:57:18 1778284638

DeepSeek V4 Pro's pricing is blowing me away, particularly with how effective the cache is. I just burned 2M tokens and the total cost was 30¢. On Claude Code, I'd have used up multiple 5 hour windows by now, or else horrific amounts of API consumption, around $20-$30 I'm guessing.

onlyrealcuzzo · 2026-05-02T18:17:55 1777745875

> It obviously went through lots of files in both prompts but total cost? Just $0.09 for the Pro version.

When people say that LLMs aren't worth it, it kills me.

A lot of us, on average, make $100+ an hour. $0.09 is < 4 seconds of our time.
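As a quick check of that arithmetic, here is a small Python sketch. The $100/hour rate and the $0.09 cost are the figures quoted above; the variable names are just for illustration.

```python
# Figures taken from the comment: $100 per hour, $0.09 for the run.
hourly_rate_usd = 100.0
run_cost_usd = 0.09

# Convert the cost into the equivalent number of seconds of paid time.
seconds_of_time = run_cost_usd / hourly_rate_usd * 3600

print(round(seconds_of_time, 2))
```

That works out to about 3.24 seconds, consistent with the "< 4 seconds" claim.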

You can't even read the vast majority of prompt responses that fast.

LLMs will continue to get better (I'm doubtful at previous rates, all indications are showing that progress is slowing and costs are increasing disproportionately).

It seems like >50% of devs think LLMs provide less than 0 value. I just do not get it.

Did they use an LLM one time 3 years ago and decide it's never going to be worth it? Have they even tried? Or have they only ever tried it on one giant, monolithic proprietary codebase where they're a total expert, and decided that an LLM isn't as good as them, so it's "completely worthless"?

They are shockingly unhelpful on my company's codebase.

But that doesn't mean they are flat-out worthless.

kelnos · 2026-05-02T18:56:50 1777748210

I know I'm guilty of making this sort of argument sometimes, but it's just not valid.

I don't get paid for every waking hour of every day. Often I'm using an LLM for something that's uncompensated, so my hourly wage equivalent is irrelevant.

And for times when we might use an LLM for something related to paid work, it's still money out of your paycheck (unless the employer is paying for it; go nuts in that case). And it's not like using the LLM lets you go home early if it saves you time. You just end up doing more work.

I still use them because they're a useful tool sometimes. But I don't pretend it has negligible or no cost. (Not to mention the externalities around electricity use, crazy data center buildout, skyrocketing GPU and RAM prices, etc.)

killingtime74 · 2026-05-03T12:08:58 1777810138

I don't understand, your employer doesn't pay for your AI use? If my employer didn't pay for it I just wouldn't use it at all, out of principle. Just as I don't buy my own work laptop.

fragmede · 2026-05-04T12:35:29 1777898129

> You just end up doing more work.

Might want to dig into that one a bit deeper there.

ifwinterco · 2026-05-03T12:16:18 1777810578

Biggest issue with Opus for me is not so much that it's expensive (though it is), but the fact that it's slow, especially during US working hours.

I prefer using slightly worse but significantly quicker models on a tighter leash and iterating faster. It feels more productive.

culopatin · 2026-05-05T12:58:53 1777985933

100+ on average?! That hurt.

mastermage · 2026-05-04T05:24:08 1777872248

Very American centered POV

stavros · 2026-05-02T09:52:24 1777715544

How did you use it? OpenRouter, or provider directly?

freedomben · 2026-05-02T12:25:05 1777724705

I'm guessing downvoted because OpenRouter was mentioned in the note (which may not have been there originally), but aside from that this is a perfectly legitimate question. In order to reproduce we need to know how. Was it a coding agent like opencode, an IDE, or something else?

wg0 · 2026-05-02T16:53:53 1777740833

OpenCode + Direct Deepseek API.

TacticalCoder · 2026-05-02T15:44:22 1777736662

> would have burned somewhere between $9 to $13 easily with not much benefit

With not much benefit compared to DeepSeek v4 Pro @ 9 cents (1/100th of the price) or did neither offer any benefit?

ithkuil · 2026-05-02T08:14:52 1777709692

Even taking into account the fact that they are billing at a 75% discount, it's still quite a bit cheaper.

amelius · 2026-05-02T09:19:49 1777713589

Aren't they all billing at discount?

stavros · 2026-05-02T09:46:14 1777715174

Anthropic's and OpenAI's costs seem to include a fairly ok margin, from the very fourth hand info I have.

vdfs · 2026-05-02T11:56:24 1777722984

In total, how many hands do you have?

gessha · 2026-05-02T12:03:08 1777723388

Enough to reach the bottom of the rabbit hole.

rzzzt · 2026-05-03T14:44:07 1777819447

Counting turtles on the way down?

sumeno · 2026-05-02T18:43:50 1777747430

If I was a betting man I'd bet that at least one of those hands is an LLM

utopiah · 2026-05-02T11:58:30 1777723110

Those aren't their hands.

locknitpicker · 2026-05-02T13:03:39 1777727019

> Aren't they all billing at discount?

Microsoft just announced the availability of OpenAI GPT-5.5, for which they are charging 30x. In contrast, they charge 7.5x for Claude Opus 4.6 and 1x for OpenAI GPT-5.4.

Check out the token-based pricing, and compare GPT-5.5 with all other models.

https://docs.github.com/en/copilot/reference/copilot-billing...

giwook · 2026-05-03T14:13:54 1777817634

Actually it is $30 for GPT 5.5, $25 for both Opus 4.6 and 4.7.

If you're referring to the multipliers that are used for subscription-based usage, GPT 5.5 is not available yet (according to https://docs.github.com/en/copilot/reference/copilot-billing...) and Opus will be at 27x at the end of the month.

When I check GH Copilot right now, it looks like Opus 4.7 multiplier was increased to 15x (I think it was 6x just a few days ago) but 4.6 is still at 3x. But these relatively cheap multipliers exist only until the end of the month.

mandeepj · 2026-05-03T15:43:28 1777823008

That’s the classic phenomenon of cheaper pricing due to offshoring! If your expenses are in dollars then for sure recovery is going to be in dollars as well. Why is that a surprise to anyone?

baldai · 2026-05-02T09:01:27 1777712487

The only similarity it has to Opus 4.6 is the 4 in the name. I do not understand these dishonest comparisons. OOS models are cool, cheap and promising for the future -- but why are we pretending they are better than they are?

gmerc · 2026-05-02T09:38:35 1777714715

Speak for yourself. I found switching from Opus 4.7 to be completely painless and in fact, due to the reliability of Anthropic’s API, less of a friction despite slower response times. Zero issues on a large monorepo.

baldai · 2026-05-02T14:29:57 1777732197

Hi, I am happy it works well for you. Personally, I struggle to find good use cases in general for these OOS models. I am lightly technical but I do not manually code. So my flow is /grill-me (can take hours), make plan, review plan with a second model, implement, review after implementation.

Maybe it is because my tasks are usually chunkier, or because I can't code myself, that I struggle using cheaper models. It feels like at every stage of this process a SOTA model improves it by 5%, which adds up.

But I am maybe ignorant of Opus level. My main driver is 5.5, and Opus is there for frontend and a second opinion. In the past I also used Claude models for the chatting phase, but 5.5 took over recently. Maybe Deepseek is closer to Opus and I just overestimated the model compared to 5.5? I tried to give it the benefit of being similar.

Recently I started experimenting with Deepseek Flash, hoping that if the plan is solid enough it can implement quickly and cheaply, but for now it feels not worth it.

How do you use the model to see the benefits? Have you tried 5.5 and can you compare to that one as well?

Thanks.

logicprog · 2026-05-02T17:34:09 1777743249

In my experience, DeepSeek models are massively overrated in terms of how good they actually are at agentic usage, coding and writing, just because they are kind of the first open source entrant and the name a lot of people know. Try GLM 5.1.

baldai · 2026-05-03T14:38:51 1777819131

Isn't 5.5 Low just better? It's so fast, and needs so few tool calls to get work done.

Reviving1514 · 2026-05-02T10:41:22 1777718482

What provider are you using? I gave it a shot through OpenRouter and saw some weird half-formed words coming through occasionally. Would love to switch over and give it a proper go.

gmerc · 2026-05-03T04:12:37 1777781557

Direct API

Reviving1514 · 2026-05-08T06:20:16 1778221216

Thank you!