
tamnd · 2026-06-14T23:23:54 1781479434

This could be a nice code golf project. It only needs a webview, a ZIM reader, and a way to append data to an existing binary and read it back.

I did something like that a very long time ago (of course, I have forgotten the details).
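A minimal sketch of the "append data to an existing binary and read it back" part, assuming a simple trailer layout: the payload is appended to the end of the file, followed by its length and a magic marker. The `KAGEPACK` marker and both function names are hypothetical and are not taken from Kage.

```python
import os
import struct

# Hypothetical trailer: payload length (u64, little-endian) + 8-byte magic.
# This is an illustrative layout, not Kage's actual on-disk format.
MAGIC = b"KAGEPACK"
TRAILER = struct.Struct("<Q8s")


def append_payload(binary_path, payload: bytes) -> None:
    """Append payload plus trailer to the end of an existing file."""
    with open(binary_path, "ab") as f:
        f.write(payload)
        f.write(TRAILER.pack(len(payload), MAGIC))


def read_payload(binary_path) -> bytes:
    """Read the payload back by parsing the trailer at the end of the file."""
    with open(binary_path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        if size < TRAILER.size:
            raise ValueError("no payload trailer found")
        f.seek(size - TRAILER.size)
        length, magic = TRAILER.unpack(f.read(TRAILER.size))
        if magic != MAGIC or length > size - TRAILER.size:
            raise ValueError("no payload trailer found")
        f.seek(size - TRAILER.size - length)
        return f.read(length)
```

At runtime the executable would call `read_payload` on its own path to get the archive bytes. Putting the length at the very end means the reader never has to parse the executable format itself.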

tamnd · 2026-06-14T23:20:42 1781479242

For sharing, it is better to use the HTML folder or the ZIM format; Kage supports both.

tamnd · 2026-06-14T23:16:28 1781478988

I have a project for creating and archiving RSS feeds, keeping the full history from the time the crawler starts. I need to clean it up a bit, then I will open source it soon.

tamnd · 2026-06-14T23:13:54 1781478834

Exactly. For downloading, Kage requires Chrome or Chromium. Running it inside Docker makes setup easier and keeps cleanup simple:

https://github.com/tamnd/kage/blob/main/Dockerfile

Btw, let me think about a way to enable this only when running inside Docker.

nikisweeting · 2026-06-15T00:28:28 1781483308

Docker is designed to be undetectable by default. The best way I have found is to set env IN_DOCKER=True manually in your Dockerfile, then check that there is no $DISPLAY configured and that you're on Linux. Usually, if all or most of those are true, you can safely add --no-sandbox --disable-setuid-sandbox --disable-dev-shm-usage and the other Docker-specific flags. That's what we do in https://github.com/ArchiveBox/ArchiveBox/blob/dev/Dockerfile...
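The heuristic above could be sketched roughly as follows. This is only an illustration: the function names are hypothetical, and this version requires all three signals to be true, which is stricter than "all/most". The `env` and `platform` parameters exist only to make the check easy to test.

```python
import os
import sys

# Flags named in the comment above; appended only when Docker is detected.
DOCKER_CHROME_FLAGS = [
    "--no-sandbox",
    "--disable-setuid-sandbox",
    "--disable-dev-shm-usage",
]


def probably_in_docker(env=None, platform=None) -> bool:
    """Heuristic: IN_DOCKER set by the Dockerfile, no $DISPLAY, and Linux."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    in_docker = env.get("IN_DOCKER", "").lower() in ("1", "true", "yes")
    no_display = not env.get("DISPLAY")
    on_linux = platform.startswith("linux")
    return in_docker and no_display and on_linux


def chrome_flags(base_flags, env=None, platform=None):
    """Return base_flags, plus the Docker-specific flags when detected."""
    flags = list(base_flags)
    if probably_in_docker(env, platform):
        flags += [f for f in DOCKER_CHROME_FLAGS if f not in flags]
    return flags
```

The Dockerfile side is a single `ENV IN_DOCKER=True` line; outside the container the variable is unset, so the sandbox stays enabled.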

tamnd · 2026-06-14T23:08:59 1781478539

Making docs available offline was one of my main motivations for building this tool. I will try Apple Docs too.

I previously downloaded the Snowflake docs, and it was something like tens or even hundreds of thousands of pages; I do not remember exactly. The output ended up being very large.

By the way, I forgot to add zstd compression support to my ZIM reader/writer. I will implement that in the next version.

tamnd · 2026-06-14T23:04:22 1781478262

Kiwix has readers for almost every platform: Android, desktop, iPhone. That's why I made Kage produce ZIM files.

The executable file is mostly for people who don't have Kiwix installed yet, or just want to run the archive directly.

tamnd · 2026-06-14T23:00:40 1781478040

This brings back memories. Around twenty years ago, internet access was still expensive dial-up, so I used to go to an internet cafe, run HTTrack to download websites and manga, copy everything onto my tiny 128MB USB stick (which felt very large at the time), then bring it home and read offline ;))

tamnd · 2026-06-14T22:54:48 1781477688

You could use python -m http.server instead. I haven't tried it yet, but it should work.
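For reference, a small sketch of the same idea done programmatically with the standard library, assuming the output is a plain folder of static HTML files. The `serve_folder` helper is hypothetical. On the command line, `python -m http.server --directory <folder> 8000` does roughly the same thing.

```python
import functools
import http.server
import threading


def serve_folder(directory, port=0):
    """Serve a static folder on localhost in a background thread.

    port=0 lets the OS pick a free port; read it from server.server_address.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Call `server.shutdown()` and `server.server_close()` when done. Binding to 127.0.0.1 keeps the archive from being exposed to the rest of the network.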

Actually, Kage has two parts: a crawler that crawls pages and converts them to clean HTML by capturing the DOM after rendering in Chrome/Chromium, and a pack/serve component that packages the result as either a ZIM file for Kiwix or an executable file.

tamnd · 2026-06-14T22:50:26 1781477426

I have a bunch of opinionated/personal-use binaries like this in my $HOME/bin/, like delete-all-npm, clean-rust-cache, download-youtube-playlist, and get-markdown <url>. It feels good, and I don't need to remember any commands. Sometimes my coding agent can figure out how to call some of those tools too ;))

tamnd · 2026-06-14T18:02:33 1781460153

Hacker News was the right place to submit this! Thanks for your idea. I will consider implementing that :)

Also, I already have a script to convert HTML to Markdown, so Kage could store everything on disk as a folder of Markdown files and then commit them to a Git repo.
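A rough sketch of what the Markdown-folder-plus-Git flow might look like, assuming the HTML-to-Markdown conversion has already happened elsewhere and that the target folder is already a Git repository. All function names and the URL-to-path mapping are hypothetical, not part of Kage.

```python
import subprocess
from pathlib import Path
from urllib.parse import urlsplit


def url_to_md_path(url: str) -> Path:
    """Map a page URL to a relative .md path, e.g. host/docs/intro.md."""
    parts = urlsplit(url)
    # Drop empty, "." and ".." segments so a path cannot escape the root.
    segments = [s for s in parts.path.split("/") if s not in ("", ".", "..")]
    if not segments or parts.path.endswith("/"):
        segments.append("index")
    name = segments[-1]
    for ext in (".html", ".htm"):
        if name.endswith(ext):
            name = name[: -len(ext)]
            break
    segments[-1] = name + ".md"
    return Path(parts.netloc, *segments)


def write_pages(root, pages: dict) -> list:
    """Write {url: markdown_text} under root and return the written paths."""
    written = []
    for url, markdown in pages.items():
        target = Path(root) / url_to_md_path(url)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(markdown, encoding="utf-8")
        written.append(target)
    return written


def commit_snapshot(root, message: str) -> None:
    """Stage and commit everything under root (must already be a Git repo).

    `git commit` exits non-zero when there is nothing to commit, so with
    check=True this raises CalledProcessError in that case.
    """
    subprocess.run(["git", "-C", str(root), "add", "-A"], check=True)
    subprocess.run(["git", "-C", str(root), "commit", "-m", message], check=True)
```

One commit per crawl would give a diffable history of how each page changed over time.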

mgiampapa · 2026-06-14T21:53:11 1781473991

I think the ZIM flow was perfect for offline use. I know I will be making use of it as soon as I can figure out how to pass Chrome the cookies so I can be signed into the site. I didn't see it on the page, but I haven't looked closely yet.

tamnd · 2026-06-14T23:37:05 1781480225

Cookies are not supported yet, since I created this tool for shadowing public websites first. I will add options to pass cookies later. Kage will pass them to the underlying Chrome/Chromium process, so it should not be hard to do.

mcdonje · 2026-06-15T02:43:50 1781491430

Not to load you up with too many ideas, but a Markdown folder sounds a lot like Obsidian, which has a plugin system now.

EPUB would also be a great target.

smeej · 2026-06-15T01:41:50 1781487710

I would use the shit out of this. I'm a heavy user of Logseq (OG, the md file-based version). Would LOVE to save my favorite web resources this way.