Archiving with Amber


Image credit: Paul Ritchie’s “Broken Web Connections? Welcome To 2009…”

I was reading John Johnston’s blog this morning (gotta keep up with my European peeps) and his post about fighting linkrot hit home. Every morning I wake up to an email letting me know a few more links on the bava have died; I pour out a little Louis and mourn the day. As of now I have 2,801 posts with 19,583 links. Of those links, 2,721 are broken. That’s just about 14% of all the links on the bava lost to the annals of time. BASTARDS!!!
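For anyone who wants to check the math, the rot rate is just broken links over total links. Here is a quick sketch in Python; the function name `rot_rate` is mine and is not part of any plugin.

```python
def rot_rate(broken: int, total: int) -> float:
    """Return broken links as a percentage of all links."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * broken / total


# Counts taken from the paragraph above: 2,721 broken out of 19,583 links.
print(f"{rot_rate(2721, 19583):.1f}%")  # 13.9%
```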

While I was reading John’s post about linkrot I was reminded of a conversation I had with Kin Lane and Tim Owens over a year ago about how Kin has a link archiver he programmed that preserves all the connections on his various sites by taking screenshots of links before they die. I was very jealous, because every morning I still wake up to the bava obituary of links in my inbox. So, when I read about the Berkman Center’s new tool called Amber, which prevents just this kind of linkrot by taking screenshots of existing links, I was fired up. What an awesome tool for them to build for folks. Inspired, I decided then and there to finally put a tourniquet on the bava link hemorrhaging. The fact that they have a WordPress plugin that allowed me to send the screenshots to an S3 bucket and the Internet Archive made it that much cooler. It was dead simple to set up, and I feel like I just started down the road of preserving my past thinking, the links to which will only get more fragile with time. Nothing like a little amber to preserve the action:

100-million-year-old spider attack captured in amber.


This entry was posted in Archiving, AWS, WordPress.

10 Responses to Archiving with Amber

  1. Alan Levine says:

    I meant to give Amber a try after reading about it on Dave Weinberger’s blog. I wasn’t clear if it saves an image snapshot or a web one? And what’s the +/- on doing Internet Archive vs Amazon? If John has angered a dead link in the archive will Amber on mine make another backup or will it reference his?

    Guess I better just shut up and try it!

  2. Reverend says:

    Why is John angering links? I think that’s the whole problem to begin with 🙂

    I am running a scan of all the links and it’s catching screenshots of everything, if that link dies the idea is you could hover or click on it and get a screenshot of the site with a note that the link is dead and this is a local archival copy taken on X day.

    In terms of Internet Archive vs S3, I just like to have the double protection, and maybe the Internet Archive ones are helping make the Wayback Machine better?

  3. Those Scots are always angry… “archive”! So it’s an image. That’s okay.

    Maybe if people just claimed all their links and took care of them…

    • Reverend says:

      And maybe if there were world peace, and food for everyone, and economic equality, blah, blah, blah

      • Reverend says:

        Although, thinking about this, the fact that folks like Internet Archive, Kin Lane, and I’m sure many, many others are doing just this points to the fact that those who do personal archiving are doing a much bigger thing, especially if they make what they archive available. So, we will never live in a world where everyone will do it, but we do live in a world where some will and do, and that in many ways is often good enough. Deep thought for the day, oh yeah and pedagogy is radical and learning must not be scaffolded or anything like that or you will be a pawn of the system you dirty fascists 🙂

  4. Get offa my scaffolding, punk! I’m trying to build somethin’

    I thought the memento approach interesting, sort of a quasi distributed wayback machine+ but never got the server side stuff http://timetravel.mementoweb.org/

    And you really will dig this epic on fandom http://idlewords.com/talks/fan_is_a_tool_using_animal.htm

  5. Hope these links prove interesting

    http://www.ariadne.ac.uk/issue62/davis

    http://hiberlink.org/Insight.htm

    Amber is a good addition to the struggle to recognise that what’s on the Web today is liable to be gone tomorrow – unless some proactive steps are taken.

    As coined in the Mellon-funded Hiberlink – see http://hiberlink.org – ‘reference rot’ is the combined effect of link rot (those 404s) and content drift (where what was cited has changed or is no longer available).

    We set out to demonstrate the extent of the threat of reference rot for the integrity of scholarly work. Our focus was on two information objects that are well-established: the journal article and the doctoral e-thesis. Using Memento we assessed the status/existence of content at the end of URIs extracted from several large corpora of full-text works [in short, the rot is significant after only two weeks of publication].

    But the threat and reality of reference rot for OERs is huge, as it is for much that is web resident.

    The good news is that we have also derived some solutions: these are based on transactional archiving at the time of access to web-based content, and hiberlinks that have a Robust Link structure: the original URI + the URI of the archived content + the DateTime stamp. Two papers are instructive. The first is the research paper in PLOS, http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0115253 . The second is a discursive Insight paper on the Hiberlink website. The latter contains ‘robust links’ within its citations, to guard against reference rot. The publishing platforms for PLOS and for the article in Insight, http://doi.org/10.1629/uksg.237, did not allow this, as a remedy has yet to be implemented! However I’m speaking about this at: http://www.alpsp.org/Seminars/Standing-on-the-Digits-of-Giants/31078#sthash.ii0VRFEw.dpuf

    Plug-ins have been written for Zotero and OJS, with HiberActive providing the middleware and a scheme for Link Decoration. More on this, both oodles of PPTs and papers, at hiberlink.org .

    Hope this can benefit discussion, now and in OER16 & all.

  6. Mike C. says:

    Connected copies, man. I’ve been telling you people for six years now. 😉

    It will be the architecture of the next evolution of the web.

  7. For how many of those 19,583 links was Amber able to preserve a snapshot? On my site, I found that Amber was only able to preserve snapshots for 39 of 515 links, but I have no idea if that’s typical. I just posted my notes on Amber here: https://aribadernatal.com/2016/02/15/saving-favorites-amber-edition/
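
To make the Robust Link structure from comment 5 concrete (original URI + URI of the archived content + DateTime stamp), here is a minimal sketch in Python. It is an illustration only, not Hiberlink's or Amber's actual code: the class name `RobustLink`, the example URLs, and the choice of `data-versionurl` / `data-versiondate` as attribute names are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from html import escape


@dataclass(frozen=True)
class RobustLink:
    """Hypothetical container for the three parts named in comment 5."""
    original_uri: str      # the URI that was originally cited
    archived_uri: str      # the URI of the archived copy
    archived_at: datetime  # timezone-aware time the copy was taken

    def to_html(self, text: str) -> str:
        """Render an anchor that carries all three parts.

        The attribute names are an assumption for illustration.
        """
        stamp = self.archived_at.astimezone(timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%SZ"
        )
        return (
            f'<a href="{escape(self.original_uri, quote=True)}" '
            f'data-versionurl="{escape(self.archived_uri, quote=True)}" '
            f'data-versiondate="{stamp}">{escape(text)}</a>'
        )


# Made-up example values; neither URL refers to a real archive.
link = RobustLink(
    original_uri="https://example.org/post",
    archived_uri="https://archive.example.org/snap/1",
    archived_at=datetime(2016, 2, 15, 12, 0, tzinfo=timezone.utc),
)
print(link.to_html("a post"))
```

The point of keeping all three pieces together is that a reader (or a script) can still reach the archived copy, and knows when it was captured, even after the original URI has rotted or drifted.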
