Some Notes on Docker Up and Running (Day 1)

Earlier this week I participated in a 3-hour class offered through O’Reilly’s Live Online Training platform to push myself to get more familiar with Docker. The course was called “Docker: Up and Running” and was taught by Sean Kane, who happens to be an architect at New Relic, an application Reclaim Hosting has become a huge fan of as of late. The push to get more familiar with Docker has certainly been precipitated by the launching of Reclaim Cloud this summer, but also by the fact that I finally have a bit of free time to dig in more deeply (thanks Lauren!). I have been nibbling around the edges of Docker, just figuring out enough to get by (or asking Tim), but I was hoping this course would provide me with a more fundamental understanding, and I was not disappointed.

I will not go over the details of day 1 at length, but I do want to highlight a couple of things that helped me to conceptually think through Docker and containers more generally. First off, when describing the value of containers Kane noted that for him, more than security, isolation, etc. (which were all factors), the real power of containers generally, and Docker specifically, was repeatability: the idea that so many people, across multiple environments, can get software up and running with one command. This helped me tremendously. I knew it, but I have not been able to articulate it like this, and I appreciated him doing so. This is the power behind Reclaim Cloud, a platform that provides folks the ability to launch and host an application that before would have been a one-off, bespoke environment that was virtually impossible to replicate or migrate easily. Applications using containers, by contrast, are not only eminently portable but endlessly repeatable, which is why it has been fairly easy for us to spin up a whole bunch of one-click applications in the Reclaim Cloud marketplace based on existing Docker containers. This idea also allowed another session I sat in on a couple of weeks back, Scaling a Data Science MOOC with Digital Ocean, to make sense beyond the specifics of the Data Incubator example. Using Kubernetes to orchestrate an environment with 20K+ Jupyterhub containers is premised entirely on the repeatability at the heart of this new era of infrastructure. How do you quickly spin up 20K+ identical applications specifically for one course? Using containers.
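
To make that repeatability concrete for myself, here is a rough sketch of what it looks like in practice. The image name is just an example (one of the public Jupyter images on Docker Hub), but the same two commands work on a laptop, a VPS, or a Reclaim Cloud node, and an orchestrator like Kubernetes is essentially stamping out that second command thousands of times:

# pull the static image once from the registry
docker pull jupyter/base-notebook:latest
# run it as a container; repeat on any host that has Docker installed
docker run -d --name my-notebook -p 8888:8888 jupyter/base-notebook:latest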

The other bit that might seem obvious to some, but was a terminology issue for me, was distinguishing between images and containers. The image is the core application that can be repeated innumerable times; in other words, the image is what you pull onto your server and then run as a container. The image is static, the container is the running instance of it, and that is where the difference lies. And the way in which you optimize an image has everything to do with how quickly you can get x-amount of containers up and running, which becomes a crucial question of efficiency when you are scaling a course to 20K+ Jupyterhub containers. Optimizing a manifest file for an environment can be the difference between several seconds and several minutes of creation time. With this, the Docker infrastructure started to make some more conceptual sense to me, and that was worth the time, because I can often find myself getting lost in the command line details of an issue rather than seeing the whole of the environment.
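
A quick way I have been keeping the terminology straight is to run the same image more than once. The image and container names below are just examples, but they show that one static image can back any number of running containers, and that Docker lists the two separately:

# one image, pulled once
docker image pull nginx:latest
# two containers running from that same image
docker container run -d --name web1 -p 8081:80 nginx:latest
docker container run -d --name web2 -p 8082:80 nginx:latest
# images and containers are listed with separate commands
docker image ls
docker container ls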

Another highlight from day 1 was pushing an image of the Docker Balance game to my own Docker Hub page, which I then got running online in a Docker Engine instance on Reclaim Cloud using the command below. That felt pretty awesome, and if you are so inclined you can play the Docker Balance game:

docker container run -d --rm --name balance_game \
  --publish mode=ingress,target=80,published=80 \
  jimgroom/balance_game:latest
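
For anyone curious how the image got to Docker Hub in the first place, the workflow is roughly the following. The build context and tag here are my reconstruction rather than the exact commands from the class:

# build the image locally from the project's Dockerfile
docker build -t jimgroom/balance_game:latest .
# authenticate against Docker Hub
docker login
# push the tagged image to my Docker Hub account
docker push jimgroom/balance_game:latest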

I’ll write about Day 2 once I work through that video, since I was not able to watch it live due to a couple of meetings and other demands. But having sat in on day 1 live, I am certain watching the Day 2 archive will not impact the experience at all, which is an interesting realization pedagogically, at least for the delivery of this course.
