Introduction

For software developers who use Docker, it’s often useful to have an isolated environment in which to experiment with Docker. This allows playing with Docker configurations and images without affecting the underlying host.

While this can be done by deploying a VM, it’s a heavy-weight solution.

An alternative is to use Docker’s official Docker-in-Docker (DinD) image to run Docker inside a Docker container. But it’s not the best solution, because it’s insecure and too restrictive, as described later in this article.

The Nestybox container runtime, Sysbox, provides a better solution. It integrates with Docker and enables it to deploy system containers. These containers support running Docker inside, but securely and with fewer restrictions.

This article shows how to quickly and easily deploy Docker sandbox environments using Nestybox system containers.

Why use Docker Sandboxes?

If you are a software developer or system administrator, using a Docker container as a “Docker sandbox” allows you to:

  • Play and experiment with Docker images and configurations without affecting the underlying host.

  • Deploy multiple Docker sandboxed environments within the same host machine, isolated from each other.

  • Avoid the need for resource-hungry VMs for this same purpose.

  • Split Docker image storage across storage devices.

Problems with Docker’s “Docker-In-Docker” Solution

Docker offers an official “docker-in-docker” (DinD) image for deploying Docker inside a Docker container.

But it has the following drawbacks & limitations:

  • It requires deploying a privileged Docker container, which is insecure and provides no isolation between the container and the host. An error inside the container can mess up your host’s configuration (e.g., root inside the container is root on the host, with all capabilities enabled and with access to all host devices!).

  • It runs a single service (Docker) inside the Docker container. This may be too restrictive for some users.

  • It leads to Docker volume sprawl on the host. Each time you start a DinD image, Docker creates a volume on the host to store the inner container images. When you stop and remove the container, the volume remains, wasting storage on your machine.

In addition:

  • It does not support using a Dockerfile to build a Docker container sandbox that comes pre-loaded with inner images.

  • And it does not support snapshots of the outer Docker container that include inner container images.
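
To get a feel for the volume sprawl mentioned above, you can count the volumes on the host that no container references any more. The sketch below is only an illustration: the helper name count_dangling_volumes is made up for this article, and it counts every dangling volume on the host, not only those left behind by DinD containers.

```shell
# Count volumes on the host that are not referenced by any container.
# (count_dangling_volumes is a hypothetical helper name.)
count_dangling_volumes() {
  docker volume ls --quiet --filter dangling=true | grep -c .
}

# Usage on the host:
#   count_dangling_volumes     # number of leftover volumes
#   docker volume prune        # remove unused volumes (asks for confirmation)
```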

Nestybox’s Solution

The Nestybox container runtime, Sysbox, solves all of the limitations listed in the prior section.

It integrates with Docker to enable it to deploy system containers. A system container can be used as a secure Docker sandbox, with the following benefits:

  • Quickly deploy multiple lightweight, fully isolated Docker sandbox environments on the same host (using Docker itself).

  • Run one or more services inside the system container Docker sandbox, according to your needs.

  • Avoid using insecure privileged containers. An error within the system container won’t affect the underlying host.

  • Give unprivileged users on the host access to their own instance of a Docker engine, without granting them root or Docker group access on the host.

  • Add an extra layer of security between your application containers and the underlying host.

  • Use a Dockerfile to build system containers that come pre-loaded with inner container images (saving you the need to pull inner images from the network every time).

  • Use docker commit to snapshot the system container contents, including inner images.

The following sections show examples of how to use system containers to realize the benefits listed above.

If you want to try the examples that follow, you must first install the Nestybox system container runtime, Sysbox, on your machine. Sysbox is available for free from Nestybox. Once you install it, you simply deploy system containers with Docker as shown in the examples below.
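
Before running the examples, it can help to confirm that Docker knows about the runtime. The sketch below assumes that the Sysbox installation registers the runtime with Docker under the name sysbox-runc (the name used with --runtime in this article) and that it therefore shows up in the output of docker info. The helper name has_sysbox_runtime is made up for illustration.

```shell
# Succeeds if "docker info" mentions the sysbox-runc runtime.
# (has_sysbox_runtime is a hypothetical helper name.)
has_sysbox_runtime() {
  docker info 2>/dev/null | grep -q 'sysbox-runc'
}

# Usage on the host:
#   has_sysbox_runtime && echo "sysbox-runc is registered with Docker"
```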

Quick Word on System Containers running Multiple Services

Normally Docker containers run a single application or micro-service inside of them.

In contrast, Nestybox system containers are typically configured with multiple services, as they are often used as virtual host environments. Of course, you can always run a single system-level service in them. You decide what’s best.

Using a System Container as a Docker Sandbox

To use a Nestybox system container as a Docker sandbox, simply deploy a system container image that includes Docker inside of it.

Nestybox has examples of such images in its DockerHub repository. The corresponding Dockerfiles are here.

For example, we have a light-weight Alpine-based system container image that has Docker in it.

You deploy the system container with Docker itself, by pointing it to the Nestybox system container runtime Sysbox:

$ docker run --runtime=sysbox-runc -it --hostname=syscont nestybox/alpine-docker:latest
/ #

Within the system container, you can then start Dockerd as follows:

/ # which docker
/usr/bin/docker

/ # dockerd > /var/log/dockerd.log 2>&1 &

/ # tail /var/log/dockerd.log
time="2019-10-23T20:48:51.960846074Z" level=warning msg="Your kernel does not support cgroup rt runtime"
time="2019-10-23T20:48:51.960860148Z" level=warning msg="Your kernel does not support cgroup blkio weight"
time="2019-10-23T20:48:51.960872060Z" level=warning msg="Your kernel does not support cgroup blkio weight_device"
time="2019-10-23T20:48:52.146157113Z" level=info msg="Loading containers: start."
time="2019-10-23T20:48:52.235036055Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.18.0.0/16. Daemon option --bip can be used to set a preferred IP address"
time="2019-10-23T20:48:52.324207525Z" level=info msg="Loading containers: done."
time="2019-10-23T20:48:52.476235437Z" level=warning msg="Not using native diff for overlay2, this may cause degraded performance for building images: failed to set opaque flag on middle layer: operation not permitted" storage-driver=overlay2
time="2019-10-23T20:48:52.476418516Z" level=info msg="Docker daemon" commit=0dd43dd87fd530113bf44c9bba9ad8b20ce4637f graphdriver(s)=overlay2 version=18.09.8-ce
time="2019-10-23T20:48:52.476533826Z" level=info msg="Daemon has completed initialization"
time="2019-10-23T20:48:52.489489309Z" level=info msg="API listen on /var/run/docker.sock"

/ # docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
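
Because dockerd was started in the background, it needs a moment before it accepts requests (about half a second in the log above). Instead of watching the log, a script can poll the daemon until it responds. The sketch below is a generic retry loop; wait_until is a made-up helper name, and it assumes that docker info exits with a non-zero status while the daemon is not yet reachable.

```shell
# Run a probe command once per second until it succeeds,
# giving up after the given number of attempts.
# (wait_until is a hypothetical helper name.)
wait_until() {
  attempts=$1
  shift
  while [ "$attempts" -gt 0 ]; do
    "$@" >/dev/null 2>&1 && return 0
    attempts=$((attempts - 1))
    [ "$attempts" -gt 0 ] && sleep 1
  done
  return 1
}

# Usage inside the system container:
#   dockerd > /var/log/dockerd.log 2>&1 &
#   wait_until 30 docker info && docker ps
```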

Once Dockerd is started, you can start using it to deploy inner containers:

/ # docker run -it busybox
Unable to find image 'busybox:latest' locally
latest: Pulling from library/busybox
7c9d20b9b6cd: Pull complete
Digest: sha256:fe301db49df08c384001ed752dff6d52b4305a73a7f608f21528048e8a08b51e
Status: Downloaded newer image for busybox:latest
/ #

The inner container (busybox in this example) runs within the system container. The Docker engine on the host is completely unaware of its existence.

You can deploy multiple system containers this way, and each will act as a Docker sandbox that is securely isolated from the underlying host and from all other system containers. Within each sandbox you can configure Docker as you wish, without affecting the rest of the system.
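
As a rough sketch of how several sandboxes could be started at once, the helper below prints one docker run command per sandbox name. It is an illustration, not a Nestybox tool: sandbox_cmds is a made-up name, and starting the container detached with -d -t assumes that the image’s default command is a shell that keeps running while a terminal is allocated.

```shell
# Print one "docker run" command per sandbox name given as an argument.
# (sandbox_cmds is a hypothetical helper name.)
sandbox_cmds() {
  for name in "$@"; do
    echo "docker run --runtime=sysbox-runc -d -t --name=$name --hostname=$name nestybox/alpine-docker:latest"
  done
}

# Usage on the host: review the printed commands, then run them.
#   sandbox_cmds dev1 dev2 | sh
#   docker exec -it dev1 sh      # open a shell in the first sandbox
```

Printing the commands first makes it easy to check what will be launched before anything is started.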

Docker Sandbox with Systemd, Docker, and sshd

In the prior example, we used a system container image that was based on Alpine and did not contain a process manager within it.

Let’s improve on this by including a process manager. We’ve chosen Systemd for this example as it gives you a system-container that resembles a physical host or VM.

Alternatively you can choose a lighter-weight process manager such as Supervisord. The Nestybox DockerHub repository has system container images for this.

Let’s deploy the system container image that includes Systemd, Dockerd, and sshd. The image is called nestybox/ubuntu-bionic-systemd-docker:latest and it’s in the Nestybox DockerHub repository.

$ docker run --runtime=sysbox-runc -it --rm -P --hostname=syscont nestybox/ubuntu-bionic-systemd-docker:latest
systemd 237 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid)
Detected virtualization container-other.
Detected architecture x86-64.

Welcome to Ubuntu 18.04.3 LTS!

Set hostname to <syscont>.

...

[  OK  ] Started Docker Application Container Engine.
[  OK  ] Reached target Multi-User System.
[  OK  ] Reached target Graphical Interface.
         Starting Update UTMP about System Runlevel Changes...
[  OK  ] Started Update UTMP about System Runlevel Changes.

Ubuntu 18.04.3 LTS syscont console

syscont login:

The -P option tells Docker to publish all system container ports exposed by the Docker image, which in our case is port 22 (the ssh port).

In the system container image we are using, we’ve configured the default console login and password to be admin/admin. You can always change this in the image’s Dockerfile.

syscont login: admin
Password:
Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 5.0.0-31-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage
This system has been minimized by removing packages and content that are
not required on a system that users do not log into.

To restore this content, you can run the 'unminimize' command.

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

admin@syscont:~$

Now verify that Docker is running inside the system container:

admin@syscont:~$ systemctl status docker.service
● docker.service - Docker Application Container Engine
   Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2019-11-01 18:48:21 UTC; 16s ago
     Docs: https://docs.docker.com
 Main PID: 1008 (dockerd)
    Tasks: 13
   CGroup: /system.slice/docker.service
           └─1008 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock

admin@syscont:~$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

And now let’s run an inner container (busybox):

admin@syscont:~$ docker run -it busybox
Unable to find image 'busybox:latest' locally
latest: Pulling from library/busybox
0f8c40e1270f: Pull complete
Digest: sha256:1303dbf110c57f3edf68d9f5a16c082ec06c4cf7604831669faf2c712260b5a0
Status: Downloaded newer image for busybox:latest
/ #

You can also ssh into the system container (since it’s running sshd). Because we used the -P option in the docker run command, Docker maps the system container’s sshd port (22) to an arbitrary port on the host machine.

Let’s find out what that arbitrary port is. From the host, type:

$ docker ps
CONTAINER ID        IMAGE                                          COMMAND             CREATED             STATUS              PORTS                   NAMES
bc959ac83ecf        nestybox/ubuntu-bionic-systemd-docker:latest   "/sbin/init"        54 seconds ago      Up 50 seconds       0.0.0.0:32779->22/tcp   suspicious_curie

It’s port 32779. Now let’s ssh into the system container from a different machine. In this example my host machine is at IP address 10.0.0.230 and I will ssh into port 32779:

$ ssh admin@10.0.0.230 -p 32779
admin@10.0.0.230's password:
Last login: Fri Nov  1 18:48:25 2019
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

admin@syscont:~$

And now let’s verify that busybox is in fact running inside the system container:

admin@syscont:~$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
bddfafc62679        busybox             "sh"                2 minutes ago       Up About a minute                       musing_williamson

There it is!
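
The host port that -P assigns changes from run to run, so reading it off the docker ps output gets tedious. The docker port command prints the mapping for a given container port as ADDRESS:PORT, and the small filter below extracts the port number from its first line. The name mapped_port is made up for illustration, and the container name and IP address in the usage comment are the ones from the example above.

```shell
# Read "ADDRESS:PORT" lines on stdin and print the port of the first one.
# (mapped_port is a hypothetical helper name.)
mapped_port() {
  head -n 1 | sed 's/.*://'
}

# Usage on the host:
#   port=$(docker port suspicious_curie 22/tcp | mapped_port)
#   ssh admin@10.0.0.230 -p "$port"
```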

Creating A Snapshot Of A Docker Sandbox

With Nestybox, you can create a snapshot of a system container that includes the inner Docker container images.

For example, say you deploy a system container for use as a Docker sandbox, and have used the inner Docker to pull a few images. You can now use the docker commit command to take a snapshot of the system container, and that snapshot will include the inner Docker images.

This is helpful as a way of saving work or exporting a working system container for deployment on another machine (i.e., commit the image, docker push to a repo, and docker pull from another machine).

Here is how to do this.

First, let’s deploy a system container; we will use a Docker image that contains systemd + dockerd (the same one we used in the prior section):

$ docker run --runtime=sysbox-runc -it --rm -P --hostname=syscont nestybox/ubuntu-bionic-systemd-docker:latest
systemd 237 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid)
Detected virtualization container-other.
Detected architecture x86-64.

Welcome to Ubuntu 18.04.3 LTS!

Set hostname to <syscont>.

...

[  OK  ] Started Docker Application Container Engine.
[  OK  ] Reached target Multi-User System.
[  OK  ] Reached target Graphical Interface.
         Starting Update UTMP about System Runlevel Changes...
[  OK  ] Started Update UTMP about System Runlevel Changes.

Ubuntu 18.04.3 LTS syscont console

syscont login:

The login and password are pre-configured as admin:admin (you can always change this via the image’s Dockerfile).

syscont login: admin
Password:
Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 5.0.0-32-generic x86_64)
...
admin@syscont:~$

Now let’s pull some Docker images inside the system container (busybox and alpine):

admin@syscont:~$ docker image ls
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE

admin@syscont:~$ docker pull busybox
Using default tag: latest
latest: Pulling from library/busybox
7c9d20b9b6cd: Pull complete
Digest: sha256:fe301db49df08c384001ed752dff6d52b4305a73a7f608f21528048e8a08b51e
Status: Downloaded newer image for busybox:latest

admin@syscont:~$ docker pull alpine
Using default tag: latest
latest: Pulling from library/alpine
89d9c30c1d48: Pull complete
Digest: sha256:c19173c5ada610a5989151111163d28a67368362762534d8a8121ce95cf2bd5a
Status: Downloaded newer image for alpine:latest

Now, from the host, let’s use Docker to “commit” the system container image (i.e., take a snapshot of its contents):

$ docker ps
CONTAINER ID        IMAGE                                          COMMAND             CREATED             STATUS              PORTS                   NAMES
dcf36150bcf5        nestybox/ubuntu-bionic-systemd-docker:latest   "/sbin/init"        5 minutes ago       Up 5 minutes        0.0.0.0:32769->22/tcp   exciting_albattani

$ docker commit exciting_albattani nestybox/syscont-with-inner-containers:latest
sha256:e66e830759d95ece211ada0c2d27777baf46f56b4bdba673d854354e4a38ff80

The commit operation may take anywhere from a second to a minute, depending on how many changes were made in the container’s root filesystem since it was created.

Let’s run the newly committed system container image. If all is well, it should contain the inner container images for busybox and alpine within it.

$ docker run --runtime=sysbox-runc -it --rm -P --hostname=syscont2 nestybox/syscont-with-inner-containers
systemd 237 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid)
Detected virtualization container-other.
Detected architecture x86-64.

Welcome to Ubuntu 18.04.3 LTS!

Set hostname to <syscont2>.

[  OK  ] Reached target Graphical Interface.
         Starting Update UTMP about System Runlevel Changes...
[  OK  ] Started Update UTMP about System Runlevel Changes.

Ubuntu 18.04.3 LTS syscont2 console

syscont2 login:

Let’s log in to this new system container:

syscont2 login: admin
Password:
Last login: Mon Nov 11 18:31:47 UTC 2019 on console
Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 5.0.0-32-generic x86_64)

...

admin@syscont2:~$

And now let’s verify the snapshot worked; the inner container images for busybox and alpine should be there:

admin@syscont2:~$ docker image ls
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
busybox             latest              020584afccce        11 days ago         1.22MB
alpine              latest              965ea09ff2eb        3 weeks ago         5.55MB

There they are, as expected!

While taking a snapshot of a regular Docker container via docker commit is nothing new, taking a snapshot of a Docker container that includes inner Docker container images is more challenging to implement, due to low-level issues related to running Docker inside the container. In fact, Docker’s docker-in-docker solution (which uses insecure privileged containers) does not support snapshotting the container image (i.e., the snapshot won’t include inner containers).

The Nestybox system container runtime, Sysbox, not only allows running Docker more securely, but also ensures that the Docker commit captures the contents of the system container, including inner Docker images, without problem.
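
The commit-and-push workflow described in this section can be wrapped in a small script. The sketch below uses only the standard docker commit and docker push commands; the helper names run and snapshot_and_push, the DRY_RUN switch, and the repository name in the usage comment are made up for illustration.

```shell
# Execute a command, or only print it when DRY_RUN=1 is set.
# (run and snapshot_and_push are hypothetical helper names.)
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Snapshot a running system container and push the resulting image.
snapshot_and_push() {
  run docker commit "$1" "$2" &&
    run docker push "$2"
}

# Usage on the host (the repository name is a placeholder):
#   DRY_RUN=1 snapshot_and_push exciting_albattani myrepo/syscont-snapshot:latest
#   snapshot_and_push exciting_albattani myrepo/syscont-snapshot:latest
```

On another machine with Sysbox installed, the image can then be fetched with docker pull and started with --runtime=sysbox-runc as in the examples above.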

Pre-loading Inner Container Images

Another novel feature introduced by Nestybox is the ability to build a system container image that includes inner container images within it, using a Dockerfile.

This allows you to build Docker sandbox environments that come pre-loaded with inner images, saving you the need to pull those inner images from the network each time the system container is deployed.

The steps to do this are described in a separate Nestybox blog post.

Persistence of Inner Container Images

The Docker instance running inside a system container stores its images in the /var/lib/docker directory inside the container. This is called the Docker image cache.

When the system container is removed (i.e., not just stopped, but actually removed via docker rm), the contents of that directory will also be removed. In other words, the inner Docker’s image cache is destroyed when the associated system container is removed.

But what if you wish to persist the inner Docker’s image cache after the system container is destroyed, so that you may reuse it in a future system container?

This can be easily done by mounting host storage into the system container’s /var/lib/docker. For example:

1) Create a Docker volume on the host to serve as the persistent image cache for the Docker daemon inside the system container.

$ docker volume create my-image-cache
my-image-cache

$ docker volume list
DRIVER              VOLUME NAME
local               my-image-cache

2) Launch the system container and mount the volume into the system container’s /var/lib/docker directory. We will use the same system container image that includes systemd and dockerd used in prior examples:

$ docker run --runtime=sysbox-runc -it --rm -P --hostname=syscont --mount source=my-image-cache,target=/var/lib/docker nestybox/ubuntu-bionic-systemd-docker:latest
systemd 237 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid)
Detected virtualization container-other.
Detected architecture x86-64.

Welcome to Ubuntu 18.04.3 LTS!

Set hostname to <syscont>.

Ubuntu 18.04.3 LTS syscont console

syscont login: admin
Password:
Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 5.0.0-32-generic x86_64)

...

admin@syscont:~$

3) Pull an inner container image (e.g. busybox):

admin@syscont:~$ docker pull busybox
Using default tag: latest
latest: Pulling from library/busybox
0f8c40e1270f: Pull complete
Digest: sha256:1303dbf110c57f3edf68d9f5a16c082ec06c4cf7604831669faf2c712260b5a0
Status: Downloaded newer image for busybox:latest
docker.io/library/busybox:latest

admin@syscont:~$ docker image ls
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
busybox             latest              020584afccce        11 days ago         1.22MB

4) Exit the system container. Since it was started with the --rm option, Docker will remove the system container from the system. However, the contents of the system container’s /var/lib/docker will persist since they are stored in volume my-image-cache.

5) Start a new system container and mount the my-image-cache volume:

$ docker run --runtime=sysbox-runc -it --rm -P --hostname=syscont --mount source=my-image-cache,target=/var/lib/docker nestybox/ubuntu-bionic-systemd-docker:latest
systemd 237 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid)
Detected virtualization container-other.
Detected architecture x86-64.

Welcome to Ubuntu 18.04.3 LTS!

Set hostname to <syscont>.

syscont login: admin
Password:
Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 5.0.0-32-generic x86_64)

6) Let’s verify the inner container image for busybox is there already:

admin@syscont:~$ docker image ls
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
busybox             latest              020584afccce        11 days ago         1.22MB

As shown, the inner container image for busybox persisted across the life-cycle of the system container. This is cool because it means that a system container can leverage an existing Docker image cache stored somewhere on the host, and thereby avoid having to pull inner Docker images from the network each time a new system container is started.

In the above example we used a Docker volume, but we could also use an arbitrary directory on the host and bind-mount it into the system container’s /var/lib/docker.
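
For the bind-mount variant, the only change is the --mount argument, which takes type=bind and a host directory as the source. The sketch below prints the corresponding command; bind_cache_cmd is a made-up helper name, the directory in the usage comment is a placeholder, and the host directory is assumed to exist before the container starts.

```shell
# Print the "docker run" command that bind-mounts a host directory
# over the system container's /var/lib/docker.
# (bind_cache_cmd is a hypothetical helper name.)
bind_cache_cmd() {
  echo "docker run --runtime=sysbox-runc -it --rm -P --hostname=syscont --mount type=bind,source=$1,target=/var/lib/docker nestybox/ubuntu-bionic-systemd-docker:latest"
}

# Usage on the host (the directory path is a placeholder without spaces):
#   mkdir -p /home/me/image-cache
#   bind_cache_cmd /home/me/image-cache    # prints the command to run
```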

A couple of caveats exist, however:

  • A persistent Docker image cache must only be mounted on a single system container at any given time. This is a restriction imposed by the Docker daemon, which does not allow its image cache to be shared concurrently among multiple daemon instances. Sysbox will check for this and issue an appropriate error if this rule is violated when launching a system container.

  • When using a persistent Docker image cache as described above, creating a snapshot of the system container (as described in the snapshot section above) will not capture the contents of the Docker image cache. That’s because the snapshot only includes the contents of the system container’s root filesystem, not the contents of host directories mounted on top of it.
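
Sysbox reports an error if the first caveat is violated, but a launch script can also check up front whether the cache volume is already mounted by a running container. The sketch below relies on the volume filter of docker ps; cache_in_use is a made-up helper name, and the check only covers running containers on the same host.

```shell
# Succeeds if a running container has the given volume mounted.
# (cache_in_use is a hypothetical helper name.)
cache_in_use() {
  [ -n "$(docker ps --quiet --filter "volume=$1")" ]
}

# Usage on the host:
#   if cache_in_use my-image-cache; then
#     echo "my-image-cache is already mounted by another container" >&2
#   fi
```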

Conclusion

As shown in the examples above, it’s super easy to deploy a Docker sandbox environment using Nestybox system containers, quickly enabling you to get the benefits described earlier in this article.

The Nestybox container runtime, Sysbox, is working under the covers to enable functionality such as:

  • Ensuring the system container can run Docker and Systemd properly and with strong isolation from the underlying system.

  • Ensuring a Docker commit captures the system container contents including inner Docker images.

  • Ensuring persistent mounts into the system container’s /var/lib/docker work as expected.

Try it for Free!

You can try Sysbox for free! Check our website for info on how to get it.

We are looking for early adopters and your feedback would be much appreciated!