There has been a lot of distressing news lately, so it’s refreshing to read a story about developers who are just quietly making the web better. The team at the Internet Archive has fixed 9 million broken links on Wikipedia by scanning pages for dead URLs and updating them to point to the Wayback Machine’s copy:
And for the past 3 years, we have been running a software robot called IABot on 22 Wikipedia language editions looking for broken links (URLs that return a ‘404’, or ‘Page Not Found’). When broken links are discovered, IABot searches for archives in the Wayback Machine and other web archives to replace them with.
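IABot’s actual implementation is more involved, but the core lookup it describes can be sketched against the Wayback Machine’s public Availability API. This is a minimal illustration, not IABot’s code; the function names are mine:

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

WAYBACK_API = "https://archive.org/wayback/available"

def snapshot_url(api_response: dict) -> Optional[str]:
    """Pull the closest available snapshot URL out of an
    Availability API response, or None if nothing is archived."""
    closest = api_response.get("archived_snapshots", {}).get("closest", {})
    if closest.get("available"):
        return closest.get("url")
    return None

def wayback_replacement(dead_url: str) -> Optional[str]:
    """Ask the Wayback Machine for an archived copy of a dead URL.
    A bot would substitute this archive URL for the broken link."""
    query = f"{WAYBACK_API}?url={urllib.parse.quote(dead_url, safe='')}"
    with urllib.request.urlopen(query) as resp:
        return snapshot_url(json.load(resp))
```

A link-checking bot would first confirm the original URL really returns a 404 before swapping in the archived copy, since transient errors shouldn’t trigger a rewrite.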
I wrote a blog post in 2012 about how fragile web pages are. This is always on my mind. It has informed a couple of Micro.blog features, such as our automatic mirroring of your blog to a GitHub repository. I hope that as the web evolves, this issue of broken links can be tackled more directly, so that the full responsibility for fixing it doesn’t rest only on the Internet Archive.