Our Solution

We are creating a command line interface tool to help scientists containerize their code.

Key Features

presto create

Generates a Dockerfile in one step. A Dockerfile is the file Docker reads when it starts building a container. The command asks only for the programming language of the main code.

example showing CLI usage
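As a rough illustration of how a Dockerfile could be produced from nothing but a language name, here is a minimal sketch. The base images, the `main.py` / `main.R` entry points, and the function names are all assumptions made for this example; they are not the templates presto actually ships.

```python
from pathlib import Path

# Hypothetical per-language settings, chosen only for illustration.
TEMPLATES = {
    "python": {"base": "python:3.11-slim", "cmd": ["python", "main.py"]},
    "r": {"base": "r-base:latest", "cmd": ["Rscript", "main.R"]},
}


def render_dockerfile(language: str) -> str:
    """Return Dockerfile text for the given language (case-insensitive)."""
    key = language.strip().lower()
    if key not in TEMPLATES:
        raise ValueError(f"unsupported language: {language!r}")
    spec = TEMPLATES[key]
    # CMD uses the JSON "exec form", so every part is double-quoted.
    cmd = ", ".join(f'"{part}"' for part in spec["cmd"])
    lines = [
        f"FROM {spec['base']}",
        "WORKDIR /app",
        "COPY . /app",
        f"CMD [{cmd}]",
    ]
    return "\n".join(lines) + "\n"


def write_dockerfile(language: str, directory: str = ".") -> Path:
    """Write the rendered Dockerfile into `directory` and return its path."""
    path = Path(directory) / "Dockerfile"
    path.write_text(render_dockerfile(language))
    return path
```

Keeping the per-language differences in one table means that supporting another language is a matter of adding one entry.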

presto build

Takes the Dockerfile and builds an image from it. An image is the object Docker uses to create containers.

example showing CLI usage

presto run

Starts the container, waits for it to finish, and sends back the results.

example showing CLI usage

View your output easily

Results are returned as a zip file. Once the archive is uncompressed, all of the files inside can be viewed directly.

example showing CLI usage
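For readers who prefer to unpack the results from a script, the following sketch uses Python's standard `zipfile` module. The function name and the archive layout are assumptions for illustration; the document specifies only that the results arrive as a zip file.

```python
import zipfile


def extract_results(archive: str, dest: str) -> list[str]:
    """Extract a results archive into `dest` and return the file names inside."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        # Directory entries end with "/", so skip them and list files only.
        return sorted(name for name in zf.namelist() if not name.endswith("/"))
```

Calling `extract_results("results.zip", "output")` would unpack everything under `output/` and return the list of extracted file names; both paths here are placeholders.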

presto upload

Uploads an image once the user has verified that it works correctly. After the image is uploaded, any user can download it and run their own containers without hassle.

example showing CLI usage

Overall Solution

Fossilized Controller

This tool, which we call the Fossilized Controller, provides a user interface for creating, running, and managing PReSto containers. Its main function is to guide scientists through the process of creating their container. Instead of having to learn about containerization or Docker, scientists simply run our commands, which create the containers for them and give them an easy way to share their climate models. Ample documentation will be provided so that scientists have a good understanding of how to use the tool.

example showing CLI usage
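The command structure described above (create, build, run, upload) maps naturally onto subcommands. The sketch below shows one way to lay that out with Python's standard `argparse` module. The `--language` and `--tag` options and their defaults are hypothetical; the real tool may prompt interactively instead.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per presto feature."""
    parser = argparse.ArgumentParser(
        prog="presto", description="Containerize climate model code."
    )
    sub = parser.add_subparsers(dest="command", required=True)

    create = sub.add_parser("create", help="generate a Dockerfile")
    # Hypothetical flag: the language of the main code.
    create.add_argument("--language", choices=["python", "r", "matlab"], required=True)

    # The remaining commands all act on an image, identified here by a
    # hypothetical --tag option.
    for name, help_text in [
        ("build", "build an image from the Dockerfile"),
        ("run", "run a container and collect its results"),
        ("upload", "upload a verified image"),
    ]:
        command = sub.add_parser(name, help=help_text)
        command.add_argument("--tag", default="presto-model")

    return parser
```

With this layout, `build_parser().parse_args(["create", "--language", "r"])` yields a namespace whose `command` is `"create"` and whose `language` is `"r"`, which a dispatcher can then route to the matching function.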

PReSto Containers

Our solution to help paleoclimatologists is a tool that facilitates the containerization of their climate reconstruction programs into what we call a PReSto Container. This is a regular Docker container with components specific to the Fossilized Controller. The first component of a PReSto Container is the climate model program within it. The scientist only has to pass any required parameters through the Fossilized Controller to the PReSto Container, and the container can then return output files from the model. Aside from a few prompts at the beginning, the user has minimal involvement in creating the PReSto Container.

A PReSto Container is a Docker container with a climate model

System Architecture

These pieces come together to form our entire system. Scientists use the Fossilized Controller to send files to the container and to receive output files from it. Climate reconstruction programs are written in a number of different languages, mainly Python, R, and MATLAB, and our tool should work for all of them. To achieve this, the communication between the Fossilized Controller and PReSto Containers doesn't use any language-specific features; it uses an HTTP connection. This allows us to ignore the contents of the container as long as it properly communicates with the HTTP client in our Controller. To make this possible, we are creating adapter libraries. Scientists add a few lines of code, provided to them, that create a local server within the container so that the Fossilized Controller can establish a connection to it.

project diagram
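To make the Controller-to-container exchange concrete, here is a minimal sketch of both sides. It uses only Python's standard library (`http.server` and `http.client`) so that it runs anywhere; it is not the project's adapter library. The `/run` path, the JSON payload, and every function name are assumptions for illustration.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_model(params: dict) -> dict:
    """Stand-in for the scientist's model; it only echoes its input."""
    return {"received": params, "status": "finished"}


class AdapterHandler(BaseHTTPRequestHandler):
    """Container side: accept JSON parameters via POST and reply with JSON."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        params = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(run_model(params)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging for the example


def start_adapter(port: int = 0) -> HTTPServer:
    """Start the server on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), AdapterHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_parameters(host: str, port: int, params: dict) -> dict:
    """Controller side: POST the parameters and return the decoded reply."""
    conn = HTTPConnection(host, port, timeout=10)
    try:
        payload = json.dumps(params).encode("utf-8")
        conn.request("POST", "/run", body=payload,
                     headers={"Content-Type": "application/json"})
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

Because the only contract is "POST parameters, receive a reply", the same Controller code could talk to a container whose server is written in R or any other language.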

External Services

The Controller will be downloaded from our GitHub repository. Once installed, it can be used to start the containerization process and eventually upload the container to a hosting service such as Docker Hub. To aid in the use of our tool, we also plan to host documentation on a website that covers all of the features as well as the technical internals.

Diagram showing external services

Technologies

Docker

Docker is the tool that creates containers. Other tools can also create containers, but Docker is the most popular among them, and our client has specified that they want to use it. The most beneficial part of Docker for us is the Docker Software Development Kits (SDKs), which allow us to communicate with the Docker daemon. The SDK is what allows us to create tools that can essentially run Docker commands.

Docker Logo
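As a sketch of what "running Docker commands" through the SDK can look like, the function below builds an image, runs it once, and collects the exit code and logs. It assumes the Docker SDK for Python and takes the client as a parameter, so the example itself does not need a running daemon. The function name and the tag are hypothetical, and this is not necessarily how the Fossilized Controller is implemented.

```python
def build_and_run(client, context_dir: str, tag: str) -> tuple[int, str]:
    """Build an image from `context_dir`, run it once, return (exit code, logs).

    `client` is expected to behave like the object returned by
    `docker.from_env()` in the Docker SDK for Python.
    """
    # images.build returns the built image and a generator of build logs.
    _image, _build_logs = client.images.build(path=context_dir, tag=tag)
    # detach=True returns a Container object instead of blocking on output.
    container = client.containers.run(tag, detach=True)
    try:
        status = container.wait()  # blocks until the container exits
        output = container.logs().decode("utf-8", errors="replace")
    finally:
        container.remove()  # clean up the stopped container
    return status["StatusCode"], output
```

With the SDK installed and a daemon running, a call might look like `build_and_run(docker.from_env(), ".", "presto-model")` after `import docker`.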

Python

We have decided to use Python for the CLI since it is the language we are most comfortable with. Scientists at our client's lab are also familiar with Python, so building the tool in it will allow the lab to easily add new features after the capstone ends. One of our adapter libraries also has to be written in Python.

Python Logo

R

R is commonly used for creating climate models, so we are creating an adapter library for it.

R Logo

HTTP

We are going to use HTTP for communication between the Fossilized Controller and PReSto Containers. HTTP connections provide good security and don't require the Controller to run on the same filesystem as the PReSto Containers.

HTTP Logo

Flask

Flask is a microframework for building HTTP servers quickly in Python. Because it is compact, it won't inflate container sizes, and it makes it very simple to build an HTTP server that sends and receives files.

Flask Logo

httpuv

httpuv is a package for the R programming language that creates servers. It is easy to use and provides all of the components required for creating an HTTP connection and sending files through it.

Package

LiPDverse

LiPDverse is a library available in Python and R that provides functionality for interacting with LiPD files. We are going to use it so that we can properly work with the different files associated with climate models.

LiPD Logo

Schedule

Gantt Chart

We have remained on track for the Spring semester and have finished major development of the Fossilized Controller.