Tag Archives: httrack

Complete mirror of this blog

I’ve been blogging here for 13 years. If you take any random post from that first year, the majority of the links to other web sites are broken. The default outcome for any site that isn’t maintained — including the one you’re reading right now — is for it to vanish. Permanence doesn’t exist on the web.

We can solve this, but it will take time. For now I think mirroring our writing is a great solution, to guard against domain names expiring and other inevitable failures. But where to mirror to?

Only 2 companies keep coming to mind: WordPress.com and GitHub. I believe both will last for decades, maybe even 100 years, and both embrace the open web in a way that most other centralized web sites do not.

Even though I self-host this weblog on WordPress, I’ve chosen to mirror to GitHub because of their focus on simple, static publishing via GitHub Pages. It has the best chance of running for a long time without intervention.

I exported all of manton.org with the httrack command-line tool and checked it into GitHub, with a CNAME for mirror.manton.org. It works perfectly. I still need to automate this process so that it updates regularly, but I’m very happy to finally have a complete mirror for the first time.