This report describes the benefits of containerization and how it helps us solve several problems we face in a typical application development and deployment cycle. While some of the topics we discuss are generic in nature, we focus mainly on Docker because of its overwhelming popularity and its adoption as a de facto standard. Docker is an open-source containerization platform that packages a Linux application and its dependencies as a container, making the application easy to distribute, scale, and manage.
Containerization is the process of distributing and deploying applications in a portable and predictable manner. Many factors often stand in the way of moving an application smoothly through the development cycle and into production. Besides getting the application itself to work correctly in each environment, we may also face issues with tracking down dependencies, scaling the application, and updating individual components without affecting the rest of the application.
Containerization and Docker’s service-oriented design attempt to solve these problems. Applications can be broken up into manageable, scalable, functional components, packaged individually with all of their dependencies, and deployed into production easily without worrying about differences in the underlying infrastructure or architecture.
A Little Background
Containerization is not a new concept in the computing world. On Linux, LXC, the building block that formed the foundation for later containerization technologies, was released in 2008, building on kernel features such as cgroups and namespaces. Later, Docker was introduced as a way of simplifying the tooling required to create and manage containers. Docker started its life as an open-source project at dotCloud, a cloud-centric PaaS company, in early 2013. dotCloud later changed its name to Docker Inc. and announced that it was shifting its focus to the development of Docker. The picture below illustrates Docker’s rising popularity in Google searches.
Benefits of Containers
Containers come with many attractive benefits for both development and operations teams. Some of the common benefits are:
Abstraction of the host system away from the containerized application
Containers are meant to be simple and standardized. This means that the container connects to the underlying host and to anything outside of the container through defined interfaces. A containerized application should not depend on or be concerned with details about the underlying host’s resources or architecture. Likewise, to the host, every container is a black box: the host does not care about the application details inside. In the picture below you can see how containers relate to the host system.
Because of the abstraction between the underlying host and the container, scaling can be simple and straightforward. Developers can run a few containers in their staging or testing environment and then scale out when the application goes into production.
Easy Dependency Management and Application Versioning
Containers allow a developer to bundle an application along with all of its dependencies as a unit. The host system does not have to worry about the dependencies needed to run a specific application. As long as the host can run Docker, it should be able to run any Docker container.
Lightweight and isolated environments for execution
While containers do not provide the same level of isolation and resource management as virtualization technologies, they have an extremely lightweight execution environment. Since containers are isolated at the process level and share the underlying host’s kernel, they do not include a complete operating system. This means a developer can run a large number of containers, potentially hundreds, on a workstation without running into resource issues.
Container images are built in “layers”, as shown in the container architecture picture below. When multiple containers are based on the same layer, they share that underlying layer without duplication, so later images consume very little additional disk space.
Infrastructure as Code
Dockerfiles allow us to define the exact actions to be performed while creating a new container image. This allows us to describe our execution environment as code.
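To make the saving from shared layers concrete, the following minimal Python sketch compares the disk space two images would need with and without layer sharing. The layer names and sizes are hypothetical values chosen for illustration, not measurements, and the model ignores details such as compression and per-container writable layers.

```python
# Hypothetical layer sizes in MB; names and numbers are illustrative only.
LAYER_SIZES_MB = {
    "base-os": 120,
    "python-runtime": 60,
    "web-app": 15,
    "worker-app": 10,
}

# Each image is an ordered stack of layers, bottom layer first.
IMAGES = {
    "web": ["base-os", "python-runtime", "web-app"],
    "worker": ["base-os", "python-runtime", "worker-app"],
}


def naive_disk_usage(images, sizes):
    """Disk usage if every image kept a private copy of each of its layers."""
    return sum(sizes[layer] for layers in images.values() for layer in layers)


def shared_disk_usage(images, sizes):
    """Disk usage when identical layers are stored only once."""
    unique_layers = {layer for layers in images.values() for layer in layers}
    return sum(sizes[layer] for layer in unique_layers)


if __name__ == "__main__":
    print("without sharing:", naive_disk_usage(IMAGES, LAYER_SIZES_MB), "MB")
    print("with sharing:   ", shared_disk_usage(IMAGES, LAYER_SIZES_MB), "MB")
```

With these assumed numbers, the two images need 385 MB if stored independently but only 205 MB when the common base and runtime layers are shared.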
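As a rough illustration of treating the execution environment as code, the Python sketch below renders the text of a small Dockerfile from a few parameters. The helper name render_dockerfile, the base image tag, the package list, and the paths are all assumptions made for this example; only the Dockerfile instructions themselves (FROM, RUN, COPY, WORKDIR, CMD) are standard.

```python
import json


def render_dockerfile(base_image, packages, app_dir, command):
    """Return Dockerfile text for a simple Python application image.

    Hypothetical helper for illustration; not part of Docker itself.
    """
    lines = [f"FROM {base_image}"]
    if packages:
        # Install dependencies in their own layer so it can be cached and shared.
        lines.append("RUN pip install --no-cache-dir " + " ".join(packages))
    lines.append(f"COPY . {app_dir}")
    lines.append(f"WORKDIR {app_dir}")
    # The exec form of CMD is written as a JSON array.
    lines.append("CMD " + json.dumps(command))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(
        render_dockerfile(
            base_image="python:3.12-slim",  # assumed base image
            packages=["flask"],             # assumed dependency
            app_dir="/app",
            command=["python", "app.py"],
        )
    )
```

Because the resulting file is plain text, it can be versioned, reviewed, and rebuilt like any other source file, which is the main point of describing the environment as code.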
Containers vs. Virtualization
How is a container different from a virtual machine? To put it simply, containers virtualize at the OS level, whereas hypervisor-based virtualization works at the hardware level. While the effects are similar, the differences are important and significant.
Both containers and VMs are virtualization tools. On the VM side, a hypervisor makes siloed slices of hardware available. There are generally two types of hypervisors, Type 1 and Type 2:
“Type 1” runs directly on the bare metal of the hardware. Examples: open-source Xen and VMware’s ESX.
“Type 2” runs as an additional layer of software on top of a host OS. Examples: Oracle’s open-source VirtualBox and VMware Workstation.
Type 1 is a better candidate for comparison to Docker containers. Containers, in contrast, make available protected portions of the OS and effectively virtualize it. Two containers running on the same host don’t know that they are sharing resources because each has its own networking layer, processes and so on.
Operating Systems and Resources:
Since virtualization provides access only to the hardware, we still need to install an OS on top of it. As a result, multiple full-fledged operating systems are running, one in each VM, which quickly consumes host resources such as RAM, CPU, and bandwidth. Containers instead rely on an already running OS as their host environment. This has two significant benefits. First, resource utilization is much more efficient: a container that is not executing anything consumes almost no resources. Second, containers are cheap and therefore fast to create and destroy. There is no need to boot and shut down a whole OS; a container merely has to start or terminate the processes running in its isolated space. Consequently, starting and stopping a container is more akin to starting and quitting an application, and is just as fast.
Both types of virtualization and containers are illustrated below:
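The resource argument above can be checked with simple arithmetic. The Python sketch below estimates memory use for the same workload run as VMs and as containers; the per-guest OS footprint and per-application figures are assumed round numbers for illustration, not benchmarks, and the model deliberately ignores hypervisor and container-runtime overhead.

```python
def vm_memory_mb(instances, guest_os_mb, app_mb):
    """Each VM carries its own guest OS in addition to the application."""
    return instances * (guest_os_mb + app_mb)


def container_memory_mb(instances, app_mb):
    """Containers share the host kernel, so only the application is counted."""
    return instances * app_mb


if __name__ == "__main__":
    instances = 10      # assumed number of application instances
    guest_os_mb = 512   # assumed footprint of one guest OS
    app_mb = 100        # assumed footprint of one application instance

    print("VMs:       ", vm_memory_mb(instances, guest_os_mb, app_mb), "MB")
    print("containers:", container_memory_mb(instances, app_mb), "MB")
```

Under these assumptions, ten VMs need 6120 MB while ten containers need 1000 MB; the difference is entirely the duplicated guest operating systems.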
Isolation for Performance and Security:
Processes executing in a Docker container are isolated from processes running on the underlying host OS and in other Docker containers. Nevertheless, all processes execute in the same kernel, so a kernel-level flaw can affect every container on the host. Docker is also changing rapidly, and frequent change makes security harder to assure.
There are various ways in which a container can be integrated into the development and deployment process. The figure below illustrates a sample workflow.
A developer in our company might be running Ubuntu with Docker installed. The developer might pull Docker images from the public registry to use as the base for installing their own code and the company’s proprietary software, and then push the resulting images to the company’s private registry.
Finally, the company hosts its production environment in the cloud, namely on AWS, for scalability and elasticity. Amazon Linux is also running Docker, which is managing various containers.
All three environments are running different versions of Linux, all of which are compatible with Docker. Moreover, the environments are running various combinations of containers. However, since each container compartmentalizes its own dependencies, there are no conflicts, and all the containers happily coexist.
Docker promotes an application-centric container model, so containers should run individual applications or services rather than a whole slew of them. Containers are also fast and resource-cheap to create and run.
Because containers offer lightweight packaging and deployment of applications and their dependencies, they are quickly being adopted by the Linux community and are making their way into production environments, which is why many regard them as the future of computing. Docker provides the tools necessary not only to build containers but also to manage and share them with new users or hosts. While containerized applications provide the necessary process isolation and packaging to assist in deployment, many other components are available to manage and scale containers over a distributed cluster of hosts.
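The developer’s part of this workflow can be sketched as a sequence of Docker CLI commands. The Python snippet below only builds the command lines; it does not run them. The image names and the registry host registry.example.com/acme are hypothetical placeholders, while docker pull, docker build, docker tag, and docker push are standard Docker commands.

```python
import shlex


def registry_workflow(base_image, app_image, private_registry):
    """Return the docker commands for a pull, build, tag, push cycle.

    Hypothetical helper for illustration; the commands are returned, not executed.
    """
    private_ref = f"{private_registry}/{app_image}"
    return [
        ["docker", "pull", base_image],              # fetch the public base image
        ["docker", "build", "-t", app_image, "."],   # build from the local Dockerfile
        ["docker", "tag", app_image, private_ref],   # name it for the private registry
        ["docker", "push", private_ref],             # publish it for other environments
    ]


if __name__ == "__main__":
    commands = registry_workflow(
        base_image="ubuntu:22.04",                     # assumed public base image
        app_image="billing-service:1.0",               # assumed application image name
        private_registry="registry.example.com/acme",  # assumed private registry
    )
    for argv in commands:
        print(shlex.join(argv))
```

The test and production hosts would then pull the same tagged image from the private registry, which is what keeps the environments consistent.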