Reclaim Cloud Case Study: Containing TEI Publisher in the Cloud

It started out as an innocent enough ticket into Reclaim Hosting: Dr. Laura Morreale, whose work involves transcribing and translating texts from medieval manuscripts using online digital facsimiles, asked if we could run eXist-db on her cPanel account in shared hosting. In particular, she needed to run TEI Publisher, an open source application that is described as follows in its documentation:

The motivation behind TEI Publisher was to provide a tool which enables scholars and editors to publish their materials without becoming programmers, but also does not force them into a one-size-fits-all framework. Experienced developers will benefit as well by writing less code, avoiding redundancy, improve maintenance and interoperability – to just name a few. TEI Publisher is all about standards, modularity, reusability and sustainability!

A quick look at the basic installation documentation for eXist-db told me it was a Java app, which is a hard no for cPanel. But avoiding hard NOs when someone comes asking for help is one of the main reasons we started Reclaim Cloud. A cursory search for a Docker container for this application led me to one that seemed outdated. I responded suggesting we could try installing it on the Cloud if they had a current Docker image, which I was not finding. Turns out I wasn’t looking hard enough: it was linked from the eXist-db homepage, right in front of my eyes. I was wrong, and Dr. Morreale responded that she was becoming increasingly frustrated trying to get this application running online, saying (and I misquote for comic effect): “Dammit Jim, I am a Medievalist, not a server admin!” She was right, and this was why we started the Cloud in the first place; I needed to try harder. What’s more, I appreciated the fact that she was so determined to make this work. So much so that soon after the last email I sent to try and get this working, she sent me a link to the right Docker container on the recommendation of the folks at eXist-db:

That was all we needed. I simply searched for this container in the Docker area when creating a new environment in Reclaim Cloud:

I clicked “Next,” added the subdomain for this test environment, in my example teipublisher.us.reclaim.cloud (now deleted), and then clicked “Create.”

And within moments I was able to access the site at that subdomain:

The eXist-db splash page redirects to a suite of tools, including TEI Publisher!

A click on that icon brings us into that application:

While there are still a few things to work out with regard to user management for the application, it seems like we may have a winner with this Docker container. In fact, Dr. Morreale’s struggle highlights a pain point for many humanities PhDs who need to run an application that demands a bespoke server environment. This is when the value of containers is extremely evident. Running a Java server environment for an application that affords a stable and citable publication venue for a Medievalist’s transcriptions and translations is a perfect case in point. Dr. Morreale was kind enough to furnish me with some insight into her work, process, and challenges for this post:

Like a growing number of humanities PhDs, I am an independent scholar who maintains relationships with several programs and institutions. I am currently affiliated in an official capacity with Fordham, Georgetown, and Harvard Universities, and am also engaged in ongoing projects with partners at Stanford and Princeton Universities. My medievalist practice has always been characterized by a physical distance from both the repositories that hold sources which I study, and the institutions where my scholarly work finds its home. For this reason, digital methods have offered me a solution for my scholarly work when I had few others.

Some of the most rewarding efforts, which have in turn informed much of my traditional analytical work, involve transcribing and translating texts found in medieval manuscripts using online digital facsimiles. Using a tool called FromThePage combined with IIIF image technology, I can now easily choose digitized manuscript images from any online repository, upload them, then immediately begin to transcribe the text from the medieval source. I can also translate my own transcription after it is complete, and I have undertaken both individual and collaborative translation projects using this method. Right now my projects include a corpus of early 13th-century aristocratic legal codes from Crusader Cyprus, a rarely-cited history of Florence that was buried in a late 14th-century letter from a father to his son, and a little-known work by Renaissance Florentine Leon Battista Alberti, found in a larger manuscript that has been broken up, with parts of it now housed at Harvard’s Houghton Library.

The one difficulty has been to find a stable and citable publication venue for these transcriptions and translations. I have tried several different programs over the years, but could never easily publish all the work I had done to bring more attention to these texts and manuscripts. Using Reclaim Hosting and a program called TEI Publisher allows me to create the kind of edition I would like, and allows me to integrate images, notes, and other explanatory materials into my online editions.

In the end, the fact that we could help Dr. Morreale get what she needed fairly seamlessly is a thrill, and it highlights everything we hoped Reclaim Cloud would be. I am planning on turning this Docker container into a one-click application for the Reclaim Cloud marketplace so that other folks can hopefully scratch a similar itch. And special thanks to Dr. Morreale for so generously sharing her process and work to complete this post. Avanti!

3 Responses to Reclaim Cloud Case Study: Containing TEI Publisher in the Cloud

  1. Joern Turner says:

    Hi,

    being one of the developers of TEI-Publisher, I’m so happy to see that TEI-Publisher is getting more and more traction and is useful for researchers. Making it available in the cloud certainly is a big improvement for many people in the field.

    Just one word of caution: TEI-Publisher runs on top of eXist-db, which as a database is obviously an I/O-intensive application. Docker is not ideal for such applications and will slow things down if your application uses the database heavily.

    If you encounter performance problems with your application, don’t hesitate to contact eXist Solutions for help or file a ticket on https://github.com/eeditiones/tei-publisher-app.

    • Reverend says:

      Joern,

      Thanks for the comment as well as the heads up about scaling up using Docker. That is definitely something for folks running eXist-db to consider. In the Cloud we also have the option to create a custom Java environment with scalable, clustered database and application instances. I wonder if we could get some recommended setups, given we have limited experience with Java environments.

  2. duncdrum says:

    For clusters with vertical and horizontal scaling, eXist-db has limitations that have nothing to do with whether it runs inside a container. How much overhead containers actually produce largely depends on your configuration and your typical workloads for eXist. I often find that the benefits of orchestrating eXist containers offset the overhead, but YMMV. eXist-db’s Slack channel has a few container-heavy users who can share their experiences.
