First Steps with Docker


What is Docker?

In recent years, hardly any other technology has been as popular as Docker. Nearly every conference features a Docker-related topic. But what is Docker, and why is it so popular?

The Problem

Some years ago, web and IT projects were simpler, and the requirements could be met with a few simple components, e.g. a LAMP stack (Linux, Apache, MySQL, PHP). When a developer wanted to add a new feature, they built a local development environment and installed all required components. Since only a few were needed, the effort was acceptable. Short downtimes for deployments were also acceptable, since nobody was yet used to everything always being online.

Times have changed. Online shops like Amazon and services like Google Search have raised users' expectations. Everybody expects everything to be online and working all the time. At the same time, more and more complex features are expected, and a website needs more components to run, e.g.:

  • A webserver that delivers the website
  • A database that stores the customer data
  • A search server like Elasticsearch that delivers relevant search results
  • A cache like Redis that caches parts of the website for good performance
  • A tracking system that is used to generate statistics from the users' behaviour
  • ...

More complex requirements mean more complex projects. Changes should be tested and applied continuously, and performance should scale as the number of users increases.

All of this is a challenge for development, since a manual setup takes longer and longer. Manual deployments with more and more services are also more error-prone.

The Solution

Others have seen these changes too and addressed them, e.g. with VirtualBox and Vagrant. A developer can download a preconfigured box and have a development instance with the latest development state running in a few minutes. With tools like Ansible or Chef, the infrastructure can be described as code, and a plain Linux system can be provisioned into a project-specific instance with all services.

A Vagrant box is a virtual machine, which means that the guest system believes it is running on a particular simulated environment. The advantage of this approach is that a project can be split into one or more virtual machines and is logically encapsulated from the rest of the system and from other projects.

Using a Vagrant box offers the following advantages:

  • The environment is logically separated, and a new instance can be started on another system in a few minutes.
  • The environment in a Vagrant box can be more similar to the live system, since you can, e.g., use the same OS or database version.

But using Vagrant boxes also has disadvantages:

  • A box can be large, and the download can take some time
  • The VM has a fixed size; it can be changed afterwards, but you may reserve resources that remain unused

How Docker solves it

Docker uses a different abstraction. It borrows a common term from the transport business: the "container". But what advantages do containers have?

  • Containers have a well-defined size and interface. Therefore they can be carried on trucks, ships and trains.
  • To handle a container, you do not need to care about its content. If you have something that can handle a container, you can handle the content, regardless of what is inside the container.

What would happen if we could see a service in our project as a container? A container has an interface to the outside, and we do not need to take care of the content as long as it does what it should do. If we need more power for one service, we just use more containers...

This is the concept of Docker. The interfaces are ports. A container exposes ports to the outside world. Behind these ports, services are running.
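
As a rough preview in Dockerfile syntax (explained in the next section), a service whose only interface is a port could be described as follows. This is a minimal sketch: the base image python:3.12-slim, the port 8000 and the use of Python's built-in demo web server are assumptions chosen for illustration and are not part of the project described here.

```dockerfile
# Illustrative sketch only; image, port and command are assumptions
FROM python:3.12-slim

# The port is the container's interface to the outside world
EXPOSE 8000

# The service running behind that port: Python's built-in static file server
ENTRYPOINT ["python", "-m", "http.server", "8000"]
```

Note that EXPOSE only declares the port. It is actually published on the host when the container is started, for example with the -p option of docker run.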

Unlike a VirtualBox machine, a container is NOT a virtual machine. It runs on the same kernel as the host system but is isolated from other processes.

The Dockerfile

A Docker image is described by a Dockerfile. The structure of a Dockerfile is simple. The following example shows the Dockerfile for a Tika server from Logicalspark:


FROM ubuntu:latest

# ENV instructions that define TIKA_VERSION and TIKA_SERVER_URL belong here (values omitted)

RUN	apt-get update \
	&& apt-get install openjdk-8-jre-headless curl gdal-bin tesseract-ocr \
		tesseract-ocr-eng tesseract-ocr-ita tesseract-ocr-fra tesseract-ocr-spa tesseract-ocr-deu -y \
	&& curl -sSL -o /tmp/tika.asc \
	&& gpg --import /tmp/tika.asc \
	&& curl -sSL "$TIKA_SERVER_URL.asc" -o /tmp/tika-server-${TIKA_VERSION}.jar.asc \
	&& NEAREST_TIKA_SERVER_URL=$(curl -sSL${TIKA_SERVER_URL#}\?asjson\=1 \
		| awk '/"path_info": / { pi=$2; }; /"preferred":/ { pref=$2; }; END { print pref " " pi; };' \
		| sed -r -e 's/^"//; s/",$//; s/" "//') \
	&& echo "Nearest mirror: $NEAREST_TIKA_SERVER_URL" \
	&& curl -sSL "$NEAREST_TIKA_SERVER_URL" -o /tika-server-${TIKA_VERSION}.jar \
	&& apt-get clean -y && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

EXPOSE 9998
ENTRYPOINT java -jar /tika-server-${TIKA_VERSION}.jar -h

What happens in the Dockerfile?

  • FROM indicates that this image inherits from another Docker image. In this case the image inherits from the ubuntu image
  • The ENV instructions define environment variables
  • The RUN instruction executes several commands. In this case it installs some dependencies of Tika, such as Java, and finally downloads Tika
  • The EXPOSE 9998 instruction declares that the service listens on port 9998, so that this port can be made available from outside
  • The ENTRYPOINT definition makes the container executable. The defined entrypoint is executed when the container starts. If the entrypoint is written in exec form (a JSON array), the arguments of the docker run command are appended to it; the shell form used above runs the command through /bin/sh -c and does not pass these arguments on
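
Whether arguments reach the entrypoint depends on how ENTRYPOINT is written. The following is a minimal sketch of the exec form, assuming a hypothetical image whose only purpose is to wrap curl; the base image tag and the choice of curl are illustrative assumptions, not taken from the Tika image.

```dockerfile
# Hypothetical image for illustration: wraps the curl command
FROM ubuntu:22.04

# Build time: install curl and remove the package lists again
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Exec form (JSON array): arguments given to `docker run` are appended
ENTRYPOINT ["curl"]

# Default arguments, used only when `docker run` receives none
CMD ["--version"]
```

Started without arguments, such a container would run curl --version. Started with arguments, for example --help, those arguments replace the CMD defaults and the container runs curl --help.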

What advantages does that have?

The described structure and concept of docker have the following advantages:

  • A Dockerfile is short and easy to understand
  • Inheritance allows you to build an image on top of another image, so not every service needs to be installed again
  • Each instruction in the Dockerfile produces one layer. When you download a Docker image, only the layers that do not already exist locally are fetched. Layers that have not changed are also reused during a build. Therefore a build of a Dockerfile can be very fast.
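
Because layers are reused in order, the order of instructions matters. The sketch below illustrates the idea under stated assumptions: the file app.jar, the Java package and the base image tag are hypothetical and only serve the example.

```dockerfile
# Hypothetical example; file names, package and image tag are assumptions
FROM ubuntu:22.04

# Rarely changes: this layer is reused from the cache on most rebuilds
RUN apt-get update \
    && apt-get install -y --no-install-recommends openjdk-17-jre-headless \
    && rm -rf /var/lib/apt/lists/*

# Changes often: placed late so that only this layer and the following ones are rebuilt
COPY app.jar /app.jar

ENTRYPOINT ["java", "-jar", "/app.jar"]
```

If only app.jar changes, the expensive package installation is taken from the cache and just the COPY step and everything after it are executed again.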

You have now taken your first steps with Docker. In one of our next blog posts we will show how you can build an image from a Dockerfile and create a container from an image.