The benefits of integrating Docker and Cloud IDEs in the GitHub Classroom workflow

(Ugo Pattacini) #1

Preamble

Hi there :wave:

I’d like to share a methodology that can greatly reduce the installation burden at the beginning of your class: instead of setting up their own machines, your students need only a browser to get aligned and ready to code!

This is especially relevant to crash courses or hackathons where time is a critical factor :hourglass_flowing_sand: and participants should really focus on what matters, rather than wasting precious hours fetching the latest and fanciest applications on their laptops and then desperately crawling StackOverflow together with mentors in search of last-resort fixes :sweat:
You may distribute the download instructions weeks in advance, recommend that everyone follow your detailed guide closely, and plan intermediate checks to make sure that everyone is on the same page. Despite all these efforts, you will always end up dealing with laggards who show up on day 1 of the BYOL (Bring Your Own Laptop) event with a wonky system or, in the worst case, with nothing prepared!

:point_right: You might skip this post if your students work on their assignments using the school’s computers during class, or if teaching how to correctly configure a machine is part of your course (as it should be most of the time), although I believe it could still be interesting and worth reading!

VMs and Docker containers

Providing a Virtual Machine ahead of time with pre-installed software is a valid possibility, but it doesn’t come for free.
The students are still required to install VM clients that may differ based on the OS, may use diverse image formats, and quite often need troubleshooting, depending on the status and the resources of the host (and we know how “clean” students’ systems can be :wink:).

In recent years, Docker containers :whale: have established themselves as a handy alternative to VMs for lots of good reasons; in particular, within our scope, they can be installed on Linux systems with almost a single command. However, containers do not yet get along seamlessly with Windows.

The methodology

A very effective solution is to embed a Docker container into a Cloud IDE.

  • Through Docker, we can build and maintain a system starting from a complete recipe in a reproducible way.
  • Through a Cloud IDE, students need only their browser to develop and test the code to solve the assignment.

Cloud IDEs

I’m sure many of you are familiar with Cloud IDEs, and some of you use them daily, even in your teaching activities. In essence, through your browser, you get your hands on a full-fledged workspace (mostly an Ubuntu system) running on a remote server, plus a complete and usually super-slick IDE for developing code in whatever language you like. Also, you can share the workspace with peers very easily and watch the collaboration happen in real time.

This is great for pair programming, carrying out technical interviews remotely, or simply testing the latest patch to your web front-end.

Cloud9, the ancestor

In this context, Harvard’s CS50 workspace hosted at Cloud9 is a remarkable and successful example of such an approach.
Cloud9 itself was later acquired by Amazon and integrated as a component into AWS.
Nonetheless, at least to my knowledge, the original Cloud9 didn’t allow for incorporating external Docker containers in a friendly manner.

What I could do in Cloud9 was:

  1. first, spawn an online workspace from a bunch of template recipes, based for example on Ubuntu trusty (not a recent release, actually);
  2. then, customize the workspace with a sequence of sudo apt install operations;
  3. finally, ask students to sign up and clone the workspace to get their own copies.

This process turns out to be a bit cumbersome and not very maintainable.

Cloud9 is going to be discontinued this June in favor of AWS Cloud9, so I started looking for good replacements. I try to stay away from the gigantic world of AWS, where I bet that sooner or later I will have to pay quite some money for a service that used to be free (a little lie: I’m currently paying $1/month for the educational tier on the old Cloud9 portal).

I’ve therefore landed on Gitpod and Codenvy and I admit I fell in love with both :heart:

Gitpod, the GitHub friend

Gitpod is a brand new Cloud IDE provider that completed its beta phase this April. In the post that officially launched the product you’ll find an extensive review of the tool.

The main trait is that, with literally no setup and a single click from within your repository on GitHub, you can turn your own Docker image into a fully functioning Ubuntu system that runs live in the cloud and sports a slick VS Code-like environment for editing and sharing files :sparkles:

It is sufficient to:

  1. drop a .gitpod.yml file pointing to the Docker image into your repository;
  2. visit the URL https://gitpod.io/#https://github.com/username/repository and you’re done!
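
The URL in step 2 is simply the Gitpod address followed by "#" and the address of the repository. As a minimal sketch, the two helper functions below (hypothetical names, not part of any Gitpod tooling) build it for a given owner and repository:

```python
GITPOD_PREFIX = "https://gitpod.io/#"  # prefix taken from the URL in step 2


def github_repository_url(owner, repository):
    """Return the address of a repository on GitHub."""
    return f"https://github.com/{owner}/{repository}"


def gitpod_url(repository_url):
    """Prefix a GitHub repository address with the Gitpod launcher address."""
    return GITPOD_PREFIX + repository_url.strip()


if __name__ == "__main__":
    # "username" and "repository" are placeholders, as in step 2.
    print(gitpod_url(github_repository_url("username", "repository")))
```

The resulting string can be pasted as is into a README.md or an announcement to the class.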

The Docker image can be made available through Docker Hub, or you can even ask Gitpod to build it on demand by providing the Dockerfile inside the GitHub repository. If you’re a Chrome user, you may want to install this nice extension, which adds a button to GitHub that opens the URL above.
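
To make the two options concrete, here is a rough sketch that renders the content of a .gitpod.yml for either case. It assumes the "image" key (for a prebuilt image) and the nested "image: file:" key (for a Dockerfile in the repository) as I understand them from the Gitpod configuration format, so please double-check against the current Gitpod documentation. The function name and the image name "username/classroom-image" are hypothetical.

```python
def render_gitpod_yml(docker_hub_image=None, dockerfile=None):
    """Return the text of a .gitpod.yml pointing to exactly one image source.

    docker_hub_image: name of a prebuilt image, e.g. on Docker Hub.
    dockerfile: path, relative to the repository root, of a Dockerfile
                that Gitpod is asked to build on demand.
    """
    if (docker_hub_image is None) == (dockerfile is None):
        raise ValueError("provide exactly one of docker_hub_image or dockerfile")
    if docker_hub_image is not None:
        # Assumed key: a plain "image" entry naming the prebuilt image.
        return f"image: {docker_hub_image}\n"
    # Assumed key: "file" nested under "image" naming the Dockerfile.
    return f"image:\n  file: {dockerfile}\n"


if __name__ == "__main__":
    print(render_gitpod_yml(docker_hub_image="username/classroom-image"))
    print(render_gitpod_yml(dockerfile="Dockerfile"))
```

Building from the Dockerfile keeps the whole recipe in the repository, whereas a prebuilt image spares Gitpod the build step.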

Now, think of the following workflow that integrates smoothly with GitHub Classroom:

  • Preparing the assignment
  1. Sketch out your assignment and note down the system requirements.
  2. Design the Dockerfile to build an image containing the required software components.
  3. Create the GitHub repository in your organization to define the assignment.
  4. Drop the .gitpod.yml in the repository along with the Dockerfile. Alternatively, you may push the Docker image to Docker Hub.
  5. Detail what students are supposed to do in the README.md, including the steps to get their Gitpod workspaces. To this end, you ought to make the Gitpod URL explicit in the README.md (make no assumptions about the availability of browser extensions).
  6. Run some sanity tests (pretending to be a student).
  7. Once ready, publish the assignment on your GitHub Classroom.
  • Running the assignment
  1. Students click on the usual button to accept the assignment.
  2. They go through the instructions in README.md to know what needs to be done.
  3. They click on the Gitpod URL and immediately start working out the solution within their own Gitpod instances. When the Gitpod workspace is spawned, the GitHub repository is automatically cloned and the selected branch is checked out. The README.md is opened up and receives focus in the editor.
  4. If in trouble, students can file an issue on GitHub and mention the teachers to ask for help. In turn, one of the teachers may connect remotely to the student’s Gitpod workspace, observe the status and give hints or feedback in a live session. Moreover, the workspace sharing mechanism can also be useful when students team up to solve group assignments.
  5. Finally, students turn in the solution within the deadline.
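
Part of the sanity tests in step 6 of the preparation phase can be automated before publishing. The sketch below is only one possible check, under my own assumptions: it looks at a local clone of the assignment repository and reports whether the .gitpod.yml is there and whether the README.md spells out a Gitpod URL. The function name is hypothetical, and the check does not replace actually opening the workspace as a student would.

```python
from pathlib import Path

GITPOD_PREFIX = "https://gitpod.io/#"  # every explicit Gitpod URL starts with this


def check_assignment_repo(repo_dir):
    """Return a list of problems found in a local clone of the assignment."""
    root = Path(repo_dir)
    problems = []

    # The workspace definition must be in the repository root.
    if not (root / ".gitpod.yml").is_file():
        problems.append("missing .gitpod.yml")

    # The instructions must exist and show the Gitpod URL explicitly.
    readme = root / "README.md"
    if not readme.is_file():
        problems.append("missing README.md")
    elif GITPOD_PREFIX not in readme.read_text(encoding="utf-8"):
        problems.append("README.md does not contain an explicit Gitpod URL")

    return problems


if __name__ == "__main__":
    # Check the current directory, assumed to be the assignment clone.
    for problem in check_assignment_repo("."):
        print("WARNING:", problem)
```

An empty list means that the basic ingredients are in place; the workspace itself still needs to be launched once by hand.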

What about graphics? Sometimes we simply cannot restrict our assignments to console-oriented programs or web services. We do need access to X11!

In this respect, Gitpod makes publicly available a series of Dockerfiles supporting its platform, among which you’ll find this one particularly appealing: https://github.com/gitpod-io/workspace-images/tree/master/full-vnc. In short, this Dockerfile installs X11-related components plus a web-based VNC service that lets users see the graphical desktop from within their browser.

I took the chance to tailor the Gitpod full-vnc image to match my personal needs as I teach robotics with GitHub Classroom.

Here’s a sneak peek of Gitpod’s potential:

[image: gitpod]

Not too bad, is it? :rocket:

If you want to play with it yourself, here’s my Gitpod workspace: https://gitpod.io/#https://github.com/pattacini/technical-evaluation.
(yeah, the Docker image was more or less cloned from the backbone we use for our technical evaluations :smile:)

Relevance to the scientific community

The flow is so smooth that we have just drafted a procedure to use Gitpod as an accompanying tool (when applicable) for submitting and publishing journal and conference articles. Sadly, the scientific community still struggles to agree on a unified methodology for ensuring the reproducibility of research outcomes. As a test, we’re therefore working on a self-contained Gitpod workspace that showcases an experiment we carried out on a custom filtering technique applied to visual-tactile localization with a humanoid robot :robot: Eventually, anyone will be able to click and run our filter to process a dataset recorded in a real setting, without installing anything!

The portal Code Ocean provides similar opportunities, although I didn’t find the same freedom offered by native Docker images in customizing the environment.

Gitpod cons :warning:

Gitpod is fantastic but comes with some constraints that may affect your work and your students’ work:

  • Obviously, you’d need to learn how to deal with Dockerfiles and do some preliminary tests on your side. Getting accustomed to Docker might take some time :hourglass_flowing_sand:
  • In a Gitpod workspace, the user has no sudo credentials. This is understandable and represents good practice, but sometimes it’d be convenient to install packages straight away from the console without the burden of adjusting the Dockerfile once more.
  • Gitpod is free only for public GitHub repositories. This is ok for public assignments but could be a hitch for private ones. An easy workaround could be to ask students to clone their private repositories only after the workspace is ready.
  • There’s a threshold of 100 hours/month per user. Quite reasonable, but much depends on your classroom workload.
  • When you share a running workspace, you also share your access to GitHub, which might not be desirable! This is a serious inconvenience during remote technical interviews, for example.
  • Finally, don’t expect to have quantum computers available from free Cloud IDE services. After all, there’s a trade-off and resources are limited (e.g. no GPU), even though Gitpod seems quite generous in this sense.
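
To get a feel for the 100 hours/month threshold, a back-of-the-envelope estimate may help. The sketch below uses purely hypothetical figures (hours spent in class and on homework each week, and an average of 4.5 weeks per month); only the threshold itself comes from the list above.

```python
GITPOD_FREE_HOURS_PER_MONTH = 100  # threshold per user quoted in the list above


def monthly_workspace_hours(class_hours_per_week, homework_hours_per_week,
                            weeks_per_month=4.5):
    """Estimate the hours a single student keeps a workspace open in a month."""
    return (class_hours_per_week + homework_hours_per_week) * weeks_per_month


def fits_free_tier(hours):
    """Tell whether the estimated monthly usage stays within the threshold."""
    return hours <= GITPOD_FREE_HOURS_PER_MONTH


if __name__ == "__main__":
    # Hypothetical course: 6 hours of class and 10 hours of homework per week.
    hours = monthly_workspace_hours(6, 10)
    print(hours, "hours/month, within the threshold:", fits_free_tier(hours))
```

With these made-up figures a student would stay below the threshold, while an intensive course with longer daily sessions could exceed it.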

Codenvy, the earnest brother

Some of the cons described above can be circumvented by using Codenvy, an equivalent Cloud IDE with a steeper learning curve that gives you pretty much the same benefits as Gitpod, but with a little more fiddling.

I came across Codenvy a few years ago but I only scratched the surface, since Cloud9 was sufficient for my goals. I didn’t imagine that all this was achievable at the price of digging into its guts.

To create a workspace in Codenvy, you need to describe its recipe in a JSON file that collects pointers to the Docker image along with the descriptions of the scripts you want to make available inside the workspace. This recipe is called a Factory in Codenvy’s jargon and generates a URL for the user to click on: the same story!

The stack of components I put together is available at https://github.com/pattacini/dockerfiles/tree/master/technical-evaluation-stack/codenvy

If you click on http://tiny.cc/iit-codenvy-factory, you’ll be able to launch our robot simulator also in Codenvy :tada:

[image: codenvy]

Hope you enjoyed reading!
