A major Wikipedia project fixed millions of old, broken links

Wikipedia’s enormous army of editors do their best to jump on pages showing erroneous information or quickly rewrite entries that have been tampered with by a miscreant, but occasionally the false information stays up for longer than you’d like.

With that in mind, many people who use the online encyclopedia like to follow the third-party links at the bottom of the page, which point to the sources for the information in the main article. Those links should not only confirm the information in the Wikipedia article but also offer more depth on the subject, and so are an invaluable resource for those wishing to dig deeper into a particular topic.

The trouble is that sometimes those articles — whether from the news media, educational institutions, businesses, or research establishments — are taken offline, resulting in a broken link. This can undermine the credibility of Wikipedia for those looking to verify information appearing in the listing.

The good news is that a team of volunteers from the Internet Archive has been able to restore a colossal nine million broken links on Wikipedia, helping to make those annoying “404 error/page not found” messages a thing of the past.

The Internet Archive is a non-profit digital library that’s been archiving web pages since 1996, when the web as we know it today was in its earliest stages of development. So yes, among its staggering 338 billion archived web pages are many of the ones that Wikipedia linked to but which have since been taken offline.

The Internet Archive’s Mark Graham explained in a blog post this week how it’s been archiving nearly every URL referenced on different Wikipedia sites the moment those links are added or changed — at the rate of about 20 million URLs a week.

It’s also been running a software robot called IABot on more than 20 Wikipedia language editions searching for broken links, Graham wrote. When it finds broken links, IABot looks for archives in the Wayback Machine — a searchable database for web pages — and other web archives to replace them with.
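
For readers curious what such a lookup might involve, here is a minimal Python sketch. It is not IABot's actual code. It assumes the Wayback Machine's public availability endpoint at archive.org/wayback/available and the JSON shape that endpoint is documented to return; the function names (build_query, closest_snapshot, find_archive) are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Wayback Machine's public availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(url, timestamp=None):
    """Build the lookup URL for a possibly dead link.

    timestamp is an optional YYYYMMDD string asking for the snapshot
    closest to that date.
    """
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(payload):
    """Return the archived URL from a decoded response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_archive(url, timestamp=None):
    """Ask the archive for a snapshot (this performs a network request)."""
    with urlopen(build_query(url, timestamp), timeout=10) as resp:
        return closest_snapshot(json.load(resp))
```

A bot along these lines would call find_archive() for each broken link and, if it gets a snapshot URL back, swap that URL into the citation. The real IABot also consults other web archives, which this sketch leaves out.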

“Restoring links ensures Wikipedia remains accurate and verifiable and thus meets one of Wikipedia’s three core content policies: ‘Verifiability,’” Graham wrote.

The team plans to continue with its efforts to check and fix links on more Wikipedia sites and increase the speed of its system, as well as look at how it can extend its operation beyond the online encyclopedia.

On a side note, the Wayback Machine is a fun tool anyone can use. Besides helping you access information from old sites, it also lets you see how a site’s design has changed over the years. All you need to do is enter the site’s URL. Enter “youtube.com”, for example, and then click on different dates on the calendar to see just how clunky the streaming service used to look. The archived pages aren’t dynamic; each one is a snapshot of how the site appeared on a particular day.

Many people who use Wikipedia and know about the Wayback Machine already use the tool to access a snapshot of the lost page, but the Internet Archive’s work to re-establish the links has helped to improve the usability of the site and also boost its credibility in the process.

Trevor Mogg
Contributing Editor