
PSA: My favorite browser extension - A Web Archiving Tutorial: Wayback Machine plugin and alternatives


Contents

Motivation
Installation Tips
Backing up YouTube videos and Reddit posts
Don't forget to back up your own browser history!

How to back up these browser extensions
Other useful data hoarding tools
In Closing





Motivation
A sobering statistic: 25% of web pages posted between 2013 and 2023 have vanished

In this post I want to promote my favorite browser extension (The Internet Archive's official Wayback Machine extension), and a couple of alternative tools for web archiving in case it ever stops working properly for you. After all, archiving (affectionately referred to as "data hoarding") is all about backups, and that means backups for your backups and for the backup tools as well.


Why am I sharing this extension? Well, for one thing, let's just say I had a small influence on it being so useful (I won't go into detail), so not only do I love it, but I'm also proud of it.

More importantly, the fact that it does its job even half decently only because of effort that I (a virtual nobody) put into improving it continually reminds me that so few people are doing data archiving that every person's contribution counts - especially yours, dear reader.


Besides allowing quick access to archived versions of dead links, the real point of the extension is its auto-backup functionality for pages you visit.

It won't back up Hive resource files (as far as I know), but it will back up helpful threads, including ones with trigger code contained directly in the thread. It'll also help you figure out who to credit if you use someone's resource but its download page has since vanished.


If this extension had existed sooner and more people who used wc3c had used it, we'd have a lot more helpful modding information preserved than we do today. We might also have images of old models and be able to recreate them. Heck, some archives of images of said models possibly DO exist, but without the URLs, or without Google actually searching inside Wayback Machine archives (rather than the Wayback Machine just letting you view old versions of sites whose URLs you already know), we may never find any record of those old models.

Some examples of tutorials that have been saved thanks to the Wayback Machine (and some that have been lost):



Installation Tips

While AFAIK you don't need an Internet Archive account in order to use the manual webpage backup button, for hands-free auto-archiving, you might need to make an Internet Archive account and then log into it via the extension.


Once you've done that, be sure to enable auto-archive via the gear settings --> General.


Occasionally, if the Wayback Machine logs you out, auto-archive will stop working and you'll need to log back in.


The extension popup menu looks like this. I usually have "Outlinks" and "Screenshot" checked, but if you find that you're backing a lot of pages up and getting throttled, temporarily disabling saving outlinks helps.

2537e407-eb2c-48bb-96ed-55f9b6210b96.png

I highly recommend going into the gear settings (bottom left icon) and setting it to back up any page that has no archive from the last 90 days, at a minimum (since you'd be very surprised how often pages change), and occasionally setting it to make more frequent backups when you're keen on preserving something that changes rapidly - like if someone keeps editing a post, or, idk, a crypto site keeps changing what's on their homepage. You can also manually force a backup if the automatic one fails, by clicking the Save Page Now button on the top-level screen of the extension popup, shown above.

74d7a0ac-4a49-4288-9aa9-c3c856f16780.png
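
If you're curious what a "no archive in the last 90 days" check could look like outside the extension, here's a rough Python sketch. It is not the extension's actual code. It assumes the Internet Archive's public availability endpoint (`https://archive.org/wayback/available`) returns the most recent snapshot under `archived_snapshots.closest` when no timestamp is given; the function names are my own, made up for illustration.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timedelta, timezone

# Assumed endpoint: the Internet Archive's public availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def latest_snapshot_time(availability_json):
    """Return the UTC datetime of the reported snapshot, or None if there is none."""
    closest = availability_json.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    # Snapshot timestamps look like "20240131123456" (YYYYMMDDhhmmss).
    parsed = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


def needs_backup(availability_json, max_age_days=90, now=None):
    """True if the page has no snapshot, or its snapshot is older than max_age_days."""
    now = now or datetime.now(timezone.utc)
    snapshot_time = latest_snapshot_time(availability_json)
    if snapshot_time is None:
        return True
    return now - snapshot_time > timedelta(days=max_age_days)


def fetch_availability(url):
    """Ask the availability endpoint about one URL (makes a network request)."""
    query = urllib.parse.urlencode({"url": url})
    with urllib.request.urlopen(f"{AVAILABILITY_ENDPOINT}?{query}", timeout=30) as response:
        return json.load(response)
```

Usage would be something like `needs_backup(fetch_availability("https://www.hiveworkshop.com/"))`; lowering `max_age_days` corresponds to the "more frequent backups" setting mentioned above.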

I recommend enabling the Wayback Machine backup count (it'll display a number on the extension icon), since that'll tip you off as to whether an auto-backup has failed and needs to be re-triggered manually, or whether other Internet-goers have already sufficiently backed up a given page you're looking for an older version of (such as, e.g., a youtube* video from 5 years ago that's since been privated). Why do auto-backups fail? Generally because you're opening too many pages at once and being throttled by the Internet Archive, and it needs a 1-5 minute break.

9162d94a-52fc-4594-a3e4-7bfaccf8f5f7.png
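
If you ever script your own backups instead of relying on the extension, the same "give it a 1-5 minute break" idea can be sketched as a retry loop. This is only an illustration: `save_page` is a hypothetical function you'd supply yourself (returning True on success), and the wait times are just my reading of the 1-5 minute range above.

```python
import time


def save_with_breaks(save_page, url, waits=(60, 120, 300), sleep=time.sleep):
    """Try save_page(url); after each failure, rest for the next wait and retry.

    save_page is a hypothetical callable that returns True when the backup
    succeeded. waits is in seconds (1, 2, then 5 minutes by default).
    """
    if save_page(url):
        return True
    for wait_seconds in waits:
        sleep(wait_seconds)  # give the throttled service a break
        if save_page(url):
            return True
    return False  # still failing: retry later, or trigger it manually
```

Passing `sleep` as a parameter is just so the loop can be tried out without actually waiting five minutes.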


Lastly, to be extra safe, turn on the blacklist ("Exclude URLs" under "Auto Save Page") to prevent archiving of sensitive pages, for example if you have any unsecured API keys that are for some reason accessible via links without a login. For inspiration, here's an excerpt from my own blacklist:

Code:
archive.org*
web.archive.org*
google.com*
*.google.com*
mail.yahoo.com*
duckduckgo.com/?q=*
*.proton.me*
*youtube.com/feed/library
*account*
*login*
*facebook.com*
*dropbox.com*
*.reddit.com/message/*
*.amplenote.com/notes*
*/user/*
*/inbox/*
*chatgpt.com*
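
To sanity-check a blacklist like this before trusting it, you could test some URLs against the patterns yourself. The sketch below is a guess at the semantics, not the extension's real matching code: it assumes `*` means "any run of characters", that everything else is literal, and that the scheme (`https://`) is ignored. The helper names are made up.

```python
import re

# A few patterns copied from the excerpt above.
EXCLUDE_PATTERNS = [
    "archive.org*",
    "*.google.com*",
    "mail.yahoo.com*",
    "*login*",
    "*/inbox/*",
]


def pattern_to_regex(pattern):
    """Turn a '*' wildcard pattern into a regex; all other characters are literal."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(literal_parts))


def strip_scheme(url):
    """Drop a leading 'https://' or similar (assumed to be ignored when matching)."""
    return re.sub(r"^[a-zA-Z][a-zA-Z0-9+.-]*://", "", url)


def is_excluded(url, patterns=EXCLUDE_PATTERNS):
    """True if the URL (minus scheme) fully matches any exclusion pattern."""
    target = strip_scheme(url)
    return any(pattern_to_regex(p).fullmatch(target) for p in patterns)
```

Under these assumptions, `archive.org*` would not cover `web.archive.org`, which is presumably why both entries appear in the list.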


Backing up YouTube videos and Reddit posts
*YouTube and Reddit archiving by the Wayback Machine has stopped functioning properly in the last couple of years. The YouTube problem might be temporary (it seems like sometimes things do get new backups), but it's very unreliable these days, and in any case, most YouTube videos from the past few years aren't being preserved anymore, so you'll want to use yt-dlp to download anything you really care about never losing. In case you're wondering why wc3 modders would care about backing up YouTube videos, just see this tutorial for a bunch of videos that unfortunately do not have backups and are probably lost forever.
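
For anyone who hasn't used yt-dlp before, here's a small Python sketch that builds and runs a yt-dlp command which also saves the video's metadata, description, and thumbnail. It assumes yt-dlp is installed and on your PATH; the folder name and output template are just my suggestions, and the video URL in the usage note is a placeholder.

```python
import subprocess
from pathlib import Path


def build_yt_dlp_command(video_url, output_dir="youtube_backups"):
    """Build a yt-dlp command line that keeps metadata alongside the video."""
    return [
        "yt-dlp",
        "--write-info-json",    # title, uploader, upload date, etc. as JSON
        "--write-description",  # the video description as a text file
        "--write-thumbnail",
        # Remember which videos were already downloaded, so reruns skip them.
        "--download-archive", f"{output_dir}/downloaded.txt",
        # Output template: uploader folder, then "title [id].ext".
        "-o", f"{output_dir}/%(uploader)s/%(title)s [%(id)s].%(ext)s",
        video_url,
    ]


def download_video(video_url, output_dir="youtube_backups"):
    """Run yt-dlp (requires it to be installed; makes network requests)."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_yt_dlp_command(video_url, output_dir), check=True)
```

You'd call it as `download_video("https://www.youtube.com/watch?v=VIDEO_ID")`, with a real video ID in place of `VIDEO_ID`.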


As for Reddit, this separate extension will properly preserve a Reddit thread to an archive similar to the Wayback Machine*. The only caveat is that this separate extension's backup feature requires you to manually click on the extension icon, so it's only practical for things you really want to preserve. If you like, you can also save Reddit comments (not OPs) locally to your computer by installing the Reddit Enhancement Suite and using the save-RES button that it adds below comments:

6c4b4d5c-b09b-43aa-bd2d-f72512d0ceab.png


*The Wayback Machine itself is currently only able to preserve OPs of new posts (not comments), as of a couple years ago - I believe it's partly b/c Reddit started preventing scraping thanks to companies using it to train their AI models.



Don't forget to back up your own browser history!
To solve the "I forgot the URL so I can't look it up on the Wayback Machine" problem I mentioned near the top of this post, you can either:

a) check the "Save to My Web Archive" button in the General settings of the Wayback Machine extension (I don't recommend this unless you use an anonymous username, since I don't think it's possible to hide your My Web Archive page from the public)
OR
b) use something like History Trends Unlimited to locally back up your entire browsing history. Turns out, Chromium-based browsers generally delete browsing history older than 3 months. So, this extension is useful for saving more than that. Only annoying thing about that particular extension is that you have to be sure to click on the extension at least once every 3 months b/c for some reason it's not automatic. If you have a searchable database of your browsing history like this provides, you'll be able to locate URLs to old things, and then even if they're dead, you can just plug them into the Wayback Machine, and, if you had my favorite extension installed, be in business.
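
Separately from that extension, you can also snapshot and search the browser's own history file by hand. The sketch below is not how History Trends Unlimited works. It assumes the usual Chromium layout as far as I know it: a SQLite file named `History` in your profile folder, with a `urls` table whose `last_visit_time` is in microseconds since 1601-01-01 UTC. Copy the file first, since the browser tends to lock it while running.

```python
import shutil
import sqlite3
from datetime import datetime, timedelta, timezone

# Assumption: Chromium stores visit times as microseconds since 1601-01-01 UTC.
WEBKIT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def webkit_to_datetime(microseconds):
    """Convert a Chromium history timestamp to a timezone-aware datetime."""
    return WEBKIT_EPOCH + timedelta(microseconds=microseconds)


def snapshot_history(live_history_path, backup_path):
    """Copy the live History file somewhere safe before reading it."""
    shutil.copy2(live_history_path, backup_path)
    return backup_path


def search_history(history_copy_path, keyword):
    """Return (url, title, last_visit) rows whose URL or title contains keyword."""
    connection = sqlite3.connect(history_copy_path)
    try:
        rows = connection.execute(
            "SELECT url, title, last_visit_time FROM urls "
            "WHERE url LIKE ? OR title LIKE ? "
            "ORDER BY last_visit_time DESC",
            (f"%{keyword}%", f"%{keyword}%"),
        ).fetchall()
    finally:
        connection.close()
    return [(url, title, webkit_to_datetime(ts)) for url, title, ts in rows]
```

Run `snapshot_history` on a schedule (to a dated filename, say) and you end up with searchable copies that outlive the browser's own 3-month cutoff.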


How to back up these browser extensions
One last tip I'll leave you with: back up your archiving tools. Sometimes even the best browser extensions, including open source ones, suffer from enshittification. They can become paywalled, or your most headache-saving features can be removed entirely (as happened to a lot of extensions thanks to the very very stupid switch over to Manifest v3*), or the extension can even get turned into malware (as happened with The Great Suspender).

*even one extension I liked, that had a Firefox version, removed my favorite feature from that version under the pretense of updating to Manifest v3, presumably b/c the extension developer had a shared codebase b/t the Chromium and Firefox versions; so, Firefox users, who don't have to use Manifest v3, were by no means safe.


To prevent getting screwed by extensions changing due to auto-updates, for my absolute favorite extensions, I follow the below steps to back them up:

1) after I first install them, I find their ID by checking my browser's Extensions settings page:
1779335257865.webp

2) I go into their folders on my computer (e.g., under AppData\Local\Microsoft\Edge\User Data\Default\Extensions or AppData\Local\Google\Chrome\User Data\Default\Extensions), and copy the folder with that ID to a completely separate location (e.g., on another drive).
3) I open up the manifest.json of the copy, and if there's an "update_url" in the file, I change the URL written after "update_url" to "https://localhost/", to ensure the extension won't be updated (not guaranteed to work, but it helps - you'll have to DYOR about how to prevent auto-update in the future).
4) After that, I enable "Developer mode" in my browser's Extensions settings page, and then load the copy as an "unpacked extension." Then I disable the original version I downloaded. Alternatively, you can just save the backup initially, and wait to install it in this manner until if and when the original extension becomes enshittified (since otherwise you'll potentially miss out on new, good features if your favorite extension miraculously gets better instead of worse).
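
Steps 2 and 3 above can be roughly sketched in Python. This is just an illustration, not a tested tool: the function name is made up, and it assumes the manifest is plain JSON (it will choke on a manifest with comments in it). Steps 1 and 4 still have to be done by hand in the browser.

```python
import json
import shutil
from pathlib import Path


def back_up_extension(extension_dir, backup_dir):
    """Copy an extension's ID folder to backup_dir and neutralize update_url.

    extension_dir is the folder named after the extension ID (step 2).
    Returns the copy's path and the list of manifest files that were changed.
    """
    source = Path(extension_dir)
    destination = Path(backup_dir) / source.name
    shutil.copytree(source, destination)  # fails if the backup already exists

    patched = []
    # Search the whole copy for manifest.json, in case it sits in a subfolder.
    for manifest_path in destination.rglob("manifest.json"):
        manifest = json.loads(manifest_path.read_text(encoding="utf-8-sig"))
        if "update_url" in manifest:
            manifest["update_url"] = "https://localhost/"  # step 3
            manifest_path.write_text(
                json.dumps(manifest, indent=2, ensure_ascii=False),
                encoding="utf-8",
            )
            patched.append(manifest_path)
    return destination, patched
```

Only the copy is edited; the original folder in the browser profile is left alone, so the installed extension keeps working until you decide to switch over.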



Other useful data hoarding tools
  • Discord History Tracker for creating local, browsable backups of Discord servers, attachments, and chats
  • Insanity_AI's Patchwork tool for converting WC3 maps to text format, combined with git version control for local backups (full tutorial) + Github for online storage (you can make private repositories for free)
  • Backblaze is the best service at the moment that I know of for full, unlimited-size backups of all your drives' data; but you should never get too comfortable with any given service
  • Always remember the 3-2-1 rule of data hoarding


In Closing
So many internet forums have vanished in recent years that I've even started keeping a "never forget" list of names of lost media whose loss has personally affected me, to remind me that while the fight against the Ephemeral Internet is never-ending, every single page archived is worth the effort (especially if it's easy with extensions that auto-archive things!).


Happy archiving. And even happier "finding that old thing you were looking for that got archived."
 