Personal Archiving is an Ongoing Battle Against Web Kipple

While cheering on Tim as he migrated UMW Blogs on Saturday, I discovered that my personal website jimgroom.net was returning a 404. It’s something of an online resume with an RSS feed of recent blog posts that I set up using WordPress back in 2006 or 2007. I don’t update it much anymore, but for a while I was using it to document presentations, articles, projects, etc. While I have always considered bavatuesdays my homepage on the web, this site was a fun kind of splash page for my online presence that has served me well for near on 14 years. I never tired of the minimalist aesthetic, but the WordPress theme has begun to show its age. I eventually got it back online and then immediately grabbed an HTML copy of the site using SiteSucker. I archived the WordPress files and database, and swapped out the WordPress site for the archived HTML copy.

This episode has pushed me to finally redesign that site. And after troubleshooting John Unsworth’s OG HTML personal homepage, I have been inspired to try to rebuild the site entirely in HTML and CSS, with possibly some JavaScript if I am feeling crazy. It’s a good project for me right now given I have been wanting to play with HTML includes, and I will only learn stuff like that if I have a project, not to mention Tommaso has been pushing me to learn JavaScript.

But as is often the case, once I got the site back online I realized a couple of MediaWiki instances I was running for courses from 2013 (Hardboiled) and 2014 (True Crime) were also broken. Upgrading MediaWiki can be a bear, but I was able to get those sites back online (one with a 3 GB database, which means the spam cometh). I decided to Sitesuck them as well and archive the MediaWiki instances.

But a couple of hours later I was feeling the weight of managing legacy sites. I have been fairly good about this process, but it takes time. I am currently consolidating two different cPanel accounts I have had with the idea of moving everything into Reclaim Cloud, and the work of remembering what’s where after so many different moves can be trying. Philip K. Dick talks about kipple as a kind of accumulation of broken, useless crap, and this is exactly what database-driven sites can be like. You turn around after any amount of time and a version needs updating, PHP needs upgrading, themes and/or plugins have broken the site, etc. It is a constant game of cleaning up after the fact. If left unattended, your whole online life can quickly become a pile of kipple. I imagine it’s inevitable in the long run, but pound-for-pound HTML is a lot less overhead once the dynamic, database-driven site has been retired.

But I must admit, the talk about using containers to manage this kind of archiving, so you can have the environment as it was X number of years ago, is exciting. Those kinds of next-generation web archiving possibilities are thrilling for me. And the stuff Jason Scott has done with arcade console emulation on the Internet Archive suggests just how cool it could be. But in the interim I’m still managing many of these sites in a cPanel environment, and it turns out many of the popular PHP and MySQL apps from the aughts can be unforgiving 10 years later.

This entry was posted in Archiving, design, digital identity.

3 Responses to Personal Archiving is an Ongoing Battle Against Web Kipple

  1. Chris L says:

    Dude, you have GOT to stop worrying and learn to love the temporary nature of things. Stop trying to grab water with your fists!

  2. Tim Owens says:

    This post reminds me of a conversation we should have soon with Chip German about EaaSI. The project is doing that emulation type stuff and UVA is part of that network https://www.softwarepreservationnetwork.org/two-cohorts-and-two-different-frames-for-software-preservation-and-emulation-practice/. I would *love* to know if there could be a way to package up an emulated container that had an app that would spin up temporarily for users at the time they want to view it but be read-only. I know Ilya at Webrecorder was also playing with these ideas with Stanford’s digital publishing group. I gotta believe the cloud gives us some possibilities there.
