DeepSeek’s official API has a cache hit rate of over 99% if you use it continuou...

halfwhey · 2026-05-02T13:09:58 1777727398

Might be a dumb question, but do you have to read the files in the same order in new sessions to ensure the correct prefix for the cache?

weiliddat · 2026-05-02T13:34:10 1777728850

Also curious. With tool calls reading and searching different files, and possibly compaction when working through a large codebase or long threads, I can't imagine how you hit a 99% cache rate.

WatchDog · 2026-05-02T13:38:13 1777729093

Yes, you have to use the same session. I guess you could load up a bunch of context, then fork the session into a few different tasks, although I haven't tried it.
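
To make the ordering question concrete, here is a toy sketch of prefix matching. It assumes the cache is keyed on the exact leading tokens of a request, which is how prefix caches generally behave; the token lists and the `shared_prefix_len` helper are made up for illustration and are not DeepSeek's actual implementation.

```python
def shared_prefix_len(a, b):
    """Count how many leading items two token sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

# Hypothetical "tokens": a system prompt followed by two files.
system = ["sys"]
file_a = ["a1", "a2", "a3"]
file_b = ["b1", "b2"]

first_session = system + file_a + file_b

# A later request that reads the files in the same order, then asks something new.
same_order = system + file_a + file_b + ["new question"]

# A later request that reads the same files in the opposite order.
swapped = system + file_b + file_a + ["new question"]

hit_same = shared_prefix_len(first_session, same_order)
hit_swapped = shared_prefix_len(first_session, swapped)

print(hit_same, hit_swapped)
```

In this toy model the same-order request reuses all six earlier tokens, while the swapped one only reuses the system prompt, because the match stops at the first differing token.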

naaqq · 2026-05-02T13:35:21 1777728921

Sorry, I was wrong here. I meant a single long session. There's no compression either; the 1M context is only half used.

gbgarbeb · 2026-05-03T16:29:22 1777825762

Then where did 200M come from? 200,000 tokens?

naaqq · 2026-05-04T07:56:02 1777881362

Not all of the tokens read are in the context at once; most of them are cache hits on the same prefix, counted again on every request. I hit the cache many times, so the total grew to 200M. The number comes from the API platform.
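
A rough sketch of how that adds up, assuming every request resends the whole conversation and everything sent earlier is served from cache. The turn sizes are invented for illustration, output tokens are ignored, and `simulate_session` is a hypothetical helper, not part of any API.

```python
def simulate_session(turn_sizes):
    """Return (final context size, cumulative cache-hit tokens, cache-miss tokens).

    Each turn sends the full context so far plus `new` fresh tokens.
    Assumption: the earlier context is a cache hit, the fresh tokens are a miss.
    """
    context = 0
    hits = 0
    misses = 0
    for new in turn_sizes:
        hits += context    # the existing prefix is read from cache again
        misses += new      # only the newly appended tokens are uncached
        context += new
    return context, hits, misses

# Invented numbers: a 10k-token start, then 490 turns adding 1k tokens each.
turns = [10_000] + [1_000] * 490
context, hits, misses = simulate_session(turns)
hit_rate = hits / (hits + misses)

print(context, hits, misses, round(hit_rate, 4))
```

With these made-up numbers the context ends at 500,000 tokens, but the cumulative cache-hit count is about 125M and the hit rate is above 99%, since each turn re-reads the whole prefix and only adds a small uncached tail.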