Category: docker

Containers and persistent storage

Containers are a method of operating system virtualization that allows you to run an application and its dependencies in resource-isolated processes. Containers let you package an application’s code, configurations, and dependencies into easy-to-use building blocks that deliver environmental consistency, operational efficiency, developer productivity, and version control. Containers are immutable, meaning they help you deploy applications in a reliable and consistent way, independent of the deployment environment.

As containers continue to rise in popularity beyond the developer populace, the ways in which they are used become increasingly varied. Especially (but not exclusively) with enterprise applications, the question of persistent storage comes up more and more. It is a fallacy to think that only stateless applications can or should be containerized. If you take a look at https://hub.docker.com/explore/ you’ll see that about half of the most popular applications on Docker Hub are stateful, databases for example. (Post hoc ergo propter hoc?) Think about monolithic applications versus micro-services: a monolithic application typically requires state, and if you break that application up into micro-services, some of those services can be stateless containers but others will still require state.

I’ll mainly use Docker as the example for this post, but many other container technologies exist, such as LXD, rkt, and OpenVZ. Even Microsoft offers containers with Windows Server Containers, Hyper-V isolation, and Azure Container Service.

Running a stateless container using Docker is quite straightforward:

$ docker run --name demo-mysql mysql

When you execute docker run, the container process that runs is isolated in that it has its own file system, its own networking, and its own isolated process tree separate from the (local or remote) host.

A Docker container is created from a read-only template called a Docker image. The “mysql” part of the command refers to this image, i.e. the containerized application that you want to run, which is pulled from the registry. The data you create inside a container is stored on a thin writable layer, called the container layer, that sits on top of the stack of read-only layers, called the image layers, present in the base Docker image. When the container is deleted the writable layer is deleted with it, so your data does not persist. In Docker, the storage driver is responsible for enabling and managing both the read-only image layers and the writable layer. Reads and writes that go through the storage driver are generally considered slow compared to native filesystem performance.
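The layering described above can be illustrated with a small model. This is only a sketch in Python, not how a Docker storage driver is implemented: the layers are plain dictionaries, the file names are made up for illustration, and ChainMap stands in for the union filesystem (lookups fall through from the writable layer to the image layers, writes only ever touch the writable layer).

```python
from collections import ChainMap

# Read-only image layers, lowest first (file path -> content).
# Paths and contents are illustrative placeholders.
base_layer = {"/etc/os-release": "debian", "/usr/bin/mysqld": "binary v1"}
config_layer = {"/etc/mysql/my.cnf": "default config"}


def start_container(*image_layers):
    """Return a container view: a fresh writable layer on top of the image layers."""
    writable = {}
    # ChainMap searches its maps left to right, so the writable layer shadows
    # the image layers and upper image layers shadow lower ones.
    # Writes always go to the first map only.
    return ChainMap(writable, *reversed(image_layers))


container = start_container(base_layer, config_layer)
container["/var/lib/mysql/data.ibd"] = "rows"    # new file lands in the writable layer
container["/etc/mysql/my.cnf"] = "tuned config"  # the change shadows the image's copy

# Deleting the container discards its writable layer. A new container started
# from the same image sees none of the changes.
replacement = start_container(base_layer, config_layer)

print(container["/etc/mysql/my.cnf"])    # tuned config
print(replacement["/etc/mysql/my.cnf"])  # default config
```

The image layers are never modified, which is why many containers can share one image, and also why anything that has to outlive the container needs to be stored somewhere other than the container layer.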

Assuming you want persistent data for your containers, there are several ways to go about it. You can add a storage directory to a container’s virtual filesystem and map that directory to a directory on the host server. The data you create inside that directory in the container is saved on the host, allowing it to persist after the container shuts down. This directory can also be shared between containers. In Docker this is made possible by volumes. You can also use bind mounts, but these depend on the directory structure of the host machine, whereas volumes are completely managed by Docker itself. Keep in mind though that these volumes don’t move with container workloads, as they are local to the host. Alternatively, you can use volume drivers (Docker Engine volume plugins) to store data on remote systems instead of on the Docker host itself. If you are only interested in storing data in the container’s writable layer (i.e. on the Docker host itself), you can use Docker storage drivers, which then determine which filesystem is supported.

Typically you would create a volume using the volume driver of your choice in the following manner:

$ docker volume create --driver=pure -o size=32GB testvol1

And then start a container and attach the volume to it:

$ docker run -ti -v testvol1:/data mysql
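The -v shorthand used above can also be written with the more explicit --mount option of docker run, which makes the difference between a named volume and a bind mount visible. The helper below is a hypothetical convenience function, written in Python purely to show how the two forms differ; the host path /srv/mysql is a made-up example, and the rule "a source starting with / is a host path" is an assumption of this sketch.

```python
def mount_flag(source, target, read_only=False):
    """Build a `docker run --mount` argument.

    Assumption of this sketch: a source starting with "/" is a host path
    (bind mount); anything else is the name of a Docker-managed volume.
    """
    kind = "bind" if source.startswith("/") else "volume"
    parts = [f"type={kind}", f"source={source}", f"target={target}"]
    if read_only:
        parts.append("readonly")
    return "--mount " + ",".join(parts)


# Named volume, managed by Docker (or by a volume plugin such as the one above).
print("docker run -ti", mount_flag("testvol1", "/data"), "mysql")
# Bind mount, tied to the directory layout of this particular host.
print("docker run -ti", mount_flag("/srv/mysql", "/data"), "mysql")
```

The first command is equivalent to the -v testvol1:/data form; the second only works on hosts where /srv/mysql exists, which is the portability trade-off mentioned earlier.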

Storage Vendors and Persistent Container Storage

Storage vendors have an incentive to make consuming their particular storage as easy as possible for these types of workloads so many of them are providing plug-ins to do just that.

One example is Pure Storage, which provides a Docker Volume Plugin for its FlashArray and FlashBlade systems. Currently it supports Docker, Swarm, and Mesos. Most other big-name storage vendors also have plugins available.

Then there are things like REX-Ray, an open source storage management solution that was born out of the now defunct {code} by Dell EMC team. It allows you to use multiple different storage backends and serve those up as persistent storage for your container workloads.

On the virtualization front VMware has something called the vSphere Docker Volume Service which consists of two parts, the Docker Volume Plugin and a vSphere Installation Bundle (VIB) to install on the ESXi hosts. This allows you to serve up vSphere Datastores (be it Virtual SAN, VMFS, NFS based) as persistent storage to your container workloads.

Then there are newer companies that have focused solely on providing persistent storage for container workloads; one of them is Portworx. Portworx wants to provide another abstraction layer between the storage pool and the container workload. The idea is that they provide a “storage” container that can then be integrated with the “application” containers. You can do this manually or you can integrate with a container scheduler like Docker Swarm, using Docker Compose for example (Portworx provides a volume driver).

Docker itself has built specific plugins as well; Cloudstor is one such volume plugin. It comes pre-installed and pre-configured in Docker swarms deployed through Docker for AWS. Data volumes can be backed by either EBS or EFS. Workloads running in a Docker service that require access to low latency/high IOPS persistent storage, such as a database engine, can use a “relocatable” Cloudstor volume backed by EBS. When multiple swarm service tasks need to share data in a persistent storage volume, you can use a “shared” Cloudstor volume backed by EFS. Such a volume and its contents can be mounted by multiple swarm service tasks since EFS makes the data available to all swarm nodes over NFS.

Container Orchestration Systems and Persistent Storage

As most enterprise production container deployments will use some container orchestration system, we should also look at how external persistent storage is managed at this level. Kubernetes, for example, supports a volume plugin system (FlexVolumes) that makes it relatively straightforward to consume different types of block and file storage. Additionally, Kubernetes recently started supporting an implementation of the Container Storage Interface (CSI), which helps accelerate vendor support for these storage plug-ins. Volume plugins are currently part of the core Kubernetes code and ship with the core Kubernetes binaries, meaning that vendors wanting to add support for their storage system to Kubernetes (or even fix a bug in an existing volume plugin) must align themselves with the Kubernetes release process. With the adoption of the Container Storage Interface, the Kubernetes volume layer becomes extensible: third-party storage developers can now write and deploy volume plugins exposing new storage systems in Kubernetes without having to touch the core Kubernetes code.

When CSI is used with Docker, it relies on shared mounts (not Docker volumes) to provide access to external storage. With a mount, the external storage is mounted into the container; with volumes, a new directory is created within Docker’s storage directory on the host machine, and Docker manages that directory’s contents.

To use CSI you will need to deploy a CSI driver; a number of storage vendors have these available in various stages of development. For example, there is a Container Storage Interface (CSI) Storage Plug-in for VMware vSphere.

Pre-packaged container platforms

Another way vendors are trying to make it easier for enterprises to adopt these new platforms, including solving for persistence, is by providing packaged (i.e. turnkey) solutions. This is not new of course; not too long ago we saw the same thing happening with OpenStack through the likes of VIO (VMware Integrated OpenStack), Platform9, Blue Box (acquired by IBM), etc. The public cloud providers, meanwhile, are moving more towards container as a service (CaaS) models with Azure Container Service, Google Container Engine, etc.

One example of a packaged container platform is the Cisco Container Platform. It is provided as an OVA for VMware (meaning it provisions containers inside virtual machines, not on bare metal, at the moment). Initially it is supported on Cisco’s HyperFlex platform, which provides the persistent storage layer via a Kubernetes FlexVolume driver. It can then communicate externally via Contiv, including talking to other components on the HX platform such as VMs that are running non-containerized workloads. For the load-balancing piece (between k8s masters for example) they are bundling NGINX, and for logging and monitoring they are bundling Prometheus (monitoring) and an ELK stack (logging and analytics).

Another example would be VMware PKS which I wrote about in my previous post.

Conclusion

Containers are ready for enterprise use today; however, there are some areas that could do with a bit more maturity, one of them being storage. I fully expect to see continued innovation and tighter integrations as we figure out the validity of these use-cases. A lot of progress has been made in the toolkits themselves, leading to the demise of earlier attempts like ClusterHQ/Flocker. As adoption continues, so will the maturity of these frameworks and plugins.

Docker networking overview

Introduction

There are of course a lot of blog posts out there already regarding Docker networking. I don’t want to replicate that work, but instead want to provide a clear overview of what is possible with Docker networking today by showing some examples of the different options.

In general the networking piece of Docker, and arguably Docker itself, is still quite young, so things move fast and will likely change over time. A lot of progress has been made via the SocketPlane acquisition last year and its subsequent pluggable model, but more about that later.

Docker containers are ephemeral by design (pets vs cattle), which leads to several potential issues. Not the least of these is being unable to keep your firewall configuration up to date because IP address management is difficult. It’s also hard to connect to services that might disappear at any moment, and no, using DNS as a stopgap is not a good solution (DNS as a SPOF, don’t go there). Of course there are several options and methods available to overcome these things.

Single host Docker networking

You basically have 4 options for single host Docker networking: Bridge mode, Host mode, Container mode, and No networking.
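As a rough orientation before going through the modes one by one, the sketch below maps each mode to the value it passes to the --network option of docker run (older releases spell the option --net). It is written in Python only to build the command lines; the busybox image and the container name web are placeholders chosen for this example, not something taken from the screenshots in this post.

```python
# The four single-host modes and the value each passes to `docker run --network`.
NETWORK_MODES = {
    "bridge": "bridge",                 # default: attach to the docker0 bridge
    "host": "host",                     # share the host's network namespace
    "container": "container:{target}",  # reuse another container's namespace
    "none": "none",                     # no interfaces except loopback
}


def run_command(mode, image="busybox", target=None):
    """Return the argv for starting `image` in the given networking mode."""
    value = NETWORK_MODES[mode]
    if mode == "container":
        if target is None:
            raise ValueError("container mode needs the name or ID of a running container")
        value = value.format(target=target)
    # `ip addr` lists the interfaces the container can see.
    return ["docker", "run", "--rm", "--network", value, image, "ip", "addr"]


for mode in NETWORK_MODES:
    print(" ".join(run_command(mode, target="web")))
```

Running the printed commands on a Docker host and comparing the interface lists is a quick way to see the differences that the following sections describe.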

Bridge mode (the default Docker networking mode)

The Docker daemon creates “docker0”, a virtual Ethernet bridge that forwards packets between all interfaces attached to it. All containers on the host are attached to this internal bridge through a pair of virtual interfaces: one end becomes the container’s “eth0” interface and the other end lives in the host’s namespace (think VRF). The container gets a private IP address assignment. To prevent ARP collisions on the local network, the Docker daemon derives the container’s MAC address from the allocated IP address. In the example below Docker assigns the private IP 172.17.0.1 to the container.
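The MAC derivation mentioned above can be sketched as follows. This is a minimal Python illustration, assuming the scheme used by Docker's default bridge driver in the versions I am aware of: a fixed 02:42 prefix followed by the four octets of the container's IPv4 address. The function name is made up for this example.

```python
import ipaddress


def mac_from_ip(ip):
    """Derive a MAC address from an IPv4 address, Docker-bridge style.

    The first two bytes are the fixed prefix 02:42 (0x02 marks the address
    as locally administered); the remaining four bytes are the IPv4 octets.
    """
    octets = ipaddress.IPv4Address(ip).packed
    return ":".join(f"{b:02x}" for b in (0x02, 0x42, *octets))


print(mac_from_ip("172.17.0.1"))  # 02:42:ac:11:00:01
```

Because IP addresses are unique on the bridge, MAC addresses built this way are unique too, which is what avoids the ARP collisions.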

[Figure: bridge mode example]

Host mode

In this mode the container shares the networking namespace of the host, directly exposing it to the outside world. Services in the container bind straight to the host’s ports, so there is no port mapping to set up, but you have to coordinate port usage on the host yourself. In Bridge mode, by contrast, you need port mapping to reach services inside the container, and Docker can automatically assign host ports for it. In the example below the Docker host has the IP 10.0.0.4 and as you can see the container shares this IP address.

[Figure: host mode example]

Container mode

This mode forces Docker to reuse the networking namespace of another container. It is used if you want to provide custom networking from said container; this is for example what Kubernetes uses to provide networking for multiple containers. In the example below the container whose namespace the subsequent containers join has the IP 172.17.0.2, and as you can see the container being launched has the same IP address.

[Figures: container mode example]

No networking

This mode does not configure networking. It is useful for containers that don’t require network access, but it can also be used to set up custom networking.
This is the mode Nuage Networks leverages pre-Docker 1.9 (more info here).
In the example below you can see that our new container did not get any IP address assigned.

[Figure: no networking example]

By default Docker has inter-container communication enabled (--icc=true), meaning that containers on a host are free to communicate without restrictions, which could be a security concern. Communication to the outside world is controlled via iptables and IP forwarding.

Multi-host Docker networking

In a real-world scenario you will most likely end up using Docker containers across multiple hosts, depending on the needs of your containerized application. So now you need to build container networks across these hosts to let your distributed application communicate internally and externally.

As alluded to above, in March of 2015 Docker, Inc. acquired the SDN startup SocketPlane. That has given rise to Libnetwork and the Container Network Model, meant to be the default multi-host networking setup going forward.

Libnetwork

Libnetwork provides a native Go implementation for connecting containers. The goal of libnetwork is to deliver a robust Container Network Model that provides a consistent programming interface and the required network abstractions for applications.

One of the benefits of Libnetwork is that it uses a driver / plugin model to support many underlying network technologies while still exposing a simple and consistent network model to the end-user (common API). Nuage Networks leverages this model by having a remote plugin.

Libnetwork also introduces the Container Network Model (CNM) to provide interoperation between networks and containers.

[Figure: Container Network Model]

The CNM defines a network sandbox, an endpoint, and a network. The Network Sandbox is an isolated environment where the networking configuration for a Docker container lives. The Endpoint is a network interface that can be used for communication over a specific network. Endpoints join exactly one network, and multiple endpoints can exist within a single Network Sandbox. The Network is a uniquely identifiable group of endpoints that are able to communicate with each other. You could create a “Frontend” and a “Backend” network and they would be completely isolated.
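The relationship between the three CNM objects can be made concrete with a toy model. This is a Python sketch of the concepts only; libnetwork itself is written in Go and its actual types and method names differ. The container names web, app, and db are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """A uniquely identifiable group of endpoints that can talk to each other."""
    name: str
    endpoints: list = field(default_factory=list, repr=False, compare=False)


@dataclass
class Endpoint:
    """A network interface; it belongs to exactly one network."""
    name: str
    network: Network

    def __post_init__(self):
        self.network.endpoints.append(self)


@dataclass
class Sandbox:
    """The isolated network configuration of one container."""
    container: str
    endpoints: list = field(default_factory=list, repr=False, compare=False)

    def join(self, network):
        """Create an endpoint on `network` and place it in this sandbox."""
        endpoint = Endpoint(f"{self.container}-{network.name}", network)
        self.endpoints.append(endpoint)
        return endpoint

    def can_reach(self, other):
        """Two sandboxes can communicate if they share at least one network."""
        mine = {ep.network.name for ep in self.endpoints}
        theirs = {ep.network.name for ep in other.endpoints}
        return bool(mine & theirs)


frontend, backend = Network("Frontend"), Network("Backend")
web, app, db = Sandbox("web"), Sandbox("app"), Sandbox("db")
web.join(frontend)
app.join(frontend)
app.join(backend)   # one sandbox, two endpoints, two networks
db.join(backend)

print(web.can_reach(app), app.can_reach(db), web.can_reach(db))  # True True False
```

The web and db containers stay isolated from each other even though both can talk to app, which is the Frontend/Backend separation described above.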

Unifying Docker Container and VM networking

Introduction

Most environments are not homogeneous. Typically you have multiple types of workloads, and I believe this will only increase in the near future with the rise of containers, PaaS, VMs, bare metal, and so on. In this brief overview I wanted to demonstrate how you can connect Virtual Machines and Containers on the same overlay network in an automated manner via our SDN solution. This way, every time you spin up a new workload it will automatically get its network and security policy applied and behave like any other endpoint on the network.

Docker networking

There are multiple options for networking in Docker. Typically a container (running a specific service) can be exposed externally by mapping an internal port to an external one. When you install Docker, it creates three networks automatically (bridge, none, and host); when you run a container you can use the --net flag to specify which network you want to run it on. By default the Docker daemon connects containers to the bridge network. If you run ifconfig on the host you can see the bridge as part of the host’s network stack.

The none network adds a container to a container-specific network stack; this is what we will use in the case of Nuage Networks to connect the Docker container to our VRS (Open vSwitch).

The host network adds a container to the host’s network stack. You’ll find the network configuration inside the container is identical to the host’s.

Nuage Networks and Docker containers

In the case of Nuage Networks we will attach every container to a tenant (overlay) network which is provided by our centralised management (VSD) and control (VSC) plane and configured on the Docker host in our VRS (Open vSwitch). This allows us to use our centralised networking and security policies providing IP configuration, firewall rules, QoS, etc. If traffic leaves the Docker host it is encapsulated in VXLAN, so from a management point of view this is no different from how we work with Virtual Machines.

Demo

I’ve created an L2 network (called DockerSN below) in Nuage (synced to OpenStack) where I’m connecting both my containers and VM workloads. The subnet has a range of 192.168.200.0/24.

[Figure: the DockerSN L2 network in Nuage]

So when I spin up a new container on my Docker host and connect it to the Nuage VRS I’ll automatically get the policies from that construct applied.

[Figure: new container gloomy_jang started on the Docker host]

So as you can see above, my new container (gloomy_jang) has gotten the IP address 192.168.200.190. If we go back to our Virtualized Services Architect interface we can see 2 containers (one I created earlier) and a VM attached to the same subnet (they could also be attached to separate subnets of course).

[Figure: two containers and a VM attached to the same subnet]

We can drill down on the newly created container and get all the network and security policy details.

[Figure: network and security policy details of the new container]

We now have connectivity between our container workloads and VM (192.168.200.2).

# docker exec 152f5660e56a ping -c 3 192.168.200.2
PING 192.168.200.2 (192.168.200.2) 56(84) bytes of data.
64 bytes from 192.168.200.2: icmp_seq=1 ttl=64 time=0.650 ms
64 bytes from 192.168.200.2: icmp_seq=2 ttl=64 time=0.450 ms
64 bytes from 192.168.200.2: icmp_seq=3 ttl=64 time=0.450 ms
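The addresses in this demo can be checked against the DockerSN range with a few lines of Python. This is just a sanity check using the standard ipaddress module and the values quoted in the post; it does not talk to Nuage or Docker.

```python
import ipaddress

subnet = ipaddress.ip_network("192.168.200.0/24")  # DockerSN range from the demo
workloads = {
    "container gloomy_jang": "192.168.200.190",
    "VM": "192.168.200.2",
}

for name, addr in workloads.items():
    inside = ipaddress.ip_address(addr) in subnet
    print(f"{name}: {addr} in {subnet} -> {inside}")

# Both workloads sit in the same /24, so traffic between them needs no routed
# hop. That is consistent with ttl=64 in the ping replies, assuming the VM
# uses the common Linux default initial TTL of 64.
```

Both lines report True, matching what the management interface shows: container and VM are endpoints on one and the same L2 segment.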

Networking Field Day 10 – Nuage Networks

Introduction

Networking Field Day 10 was held from August 19 till 21, 2015, in Silicon Valley. NFD is part of the Tech Field Day series of events organized by Stephen Foskett and team, and aims to bring together independent bloggers and IT product vendors to share information and opinions in a presentation format. In this setting demos and whiteboard sessions are usually appreciated more than slide-ware and marketing pitches. Assuming you are an independent blogger/influencer (i.e. you don’t work for a vendor), you can ask to become a TFD delegate and, when selected, get to experience these sessions first hand.

One of the vendors at NFD 10 was Nuage Networks. Nuage has appeared at TFD events on numerous occasions (10 by my count), and that’s how I first heard about and got interested in their solutions.

So what did they show at NFD 10?

First up was Sunil Khandekar, Founder and CEO, with an introduction to Nuage Networks and an update on the products. He talked about building software defined, programmable, automated data centers and how Nuage, through its declarative policy based automation, is applicable to all types of workloads, be it bare metal, virtual machines, or containers. He further mentioned how Nuage is a complete SDN solution, seeing that it hits on all the key tenets, namely abstraction, automation, control, and visibility. Then he went on to give an update on Nuage Networks.

Nuage was started in January 2012 with the idea that networking should be as instantaneous and consumable as compute has become. The solution, VSP, was launched about a year later in April 2013, delivering on this thesis. Currently Nuage is on its 4th release and has seen great customer traction.

The Virtual Services Platform (VSP) has 3 main components, the Virtual Services Directory (VSD), the Virtual Services Controller (VSC), and the Virtual Routing & Switching engine (VRS). Additionally Nuage also optionally provides a hardware VTEP gateway, the 7850 Virtualized Services Gateway (VSG), to connect legacy networking to VXLAN overlays.

To get a more detailed overview of the solution please see my Nuage Compendium page*.

Sunil also announced the VSP SDK (VSPK), available on https://github.com/nuagenetworks, with the idea of fostering open collaboration around the platform. Through this proposed GitHub collaboration, custom scripts for network automation, control, or visibility can be developed and shared by the customer community.

It is also important to understand that Nuage takes the concept of network automation beyond the data center and extends it to the WAN (branch office) with Virtual Network Services (VNS).

Virtual Network Services (VNS) basics

Next up was Rotem Salomonovitch, head of product management for VNS, talking about how it works in some more detail. The idea of VNS is to make setting up a new branch stupidly easy and independent of the underlying (carrier) technology, essentially SD-WAN with a lot of automation. VNS is available as a hardware box, a VM, and as a software-only package to install on a bare metal server.

VNS uses the same control plane components as the data center (VSD / VSC) but has a different data plane, i.e. a forwarding entity called the Network Services Gateway (NSG). The reason is that data planes in branches typically differ from those in data centers, where you have GigE, 10GigE, 40GigE and beyond. In the branch you have different interface types and things like encryption in the WAN, additional security requirements, etc. The NSG is meant to be a platform; initial applications of this platform are networking services (routing, QoS, FW,…) but the goal is to go beyond network connectivity and enable application flexibility (see section below).
The appliance has multiple WAN ports and also USB connectivity so you could do things like extend it with external LTE connectivity for example.

Rotem then moved on to talk about different abstractions for different types of audiences, e.g. a developer just wants his application to be “connected” whereas the network design team are concerned with connection points, bandwidth consumption etc. To support this he talked about both vertical abstractions, for multi-tenancy, and horizontal abstractions for different audience types within each tenant. The idea is to expose only those abstractions that a particular audience is interested in (until we break all IT silos and everyone is a unicorn of course).

The idea of abstractions is not to have each team work in its own silo, but rather to have the system interpret a certain audience’s set of abstractions and translate those to complete the end-to-end policy setup. For example, the application developer uses the Application Designer to create a new app; he only defines the app concepts (e.g. front-end, middle-tier, database) and the system then translates those to subnets, ACLs, etc.
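To make the idea of translating app concepts into network constructs a bit more tangible, here is a purely hypothetical sketch in Python. It is not the Nuage API and does not reflect how the Application Designer works internally; the function name, the 10.10.0.0/16 address pool, the one-/24-per-tier rule, and the "each tier may only talk to the next one" policy are all assumptions made up for this illustration.

```python
import ipaddress


def translate(app_name, tiers, supernet="10.10.0.0/16"):
    """Turn an ordered list of app tiers into subnets and ACL entries.

    Hypothetical rules for this sketch: each tier gets its own /24 carved
    from `supernet`, a tier may only talk to the tier directly behind it,
    and everything else is denied.
    """
    pool = ipaddress.ip_network(supernet).subnets(new_prefix=24)
    subnets = {tier: next(pool) for tier in tiers}
    acls = [
        {"action": "allow", "src": str(subnets[a]), "dst": str(subnets[b])}
        for a, b in zip(tiers, tiers[1:])
    ]
    acls.append({"action": "deny", "src": "any", "dst": "any"})
    return {
        "app": app_name,
        "subnets": {tier: str(net) for tier, net in subnets.items()},
        "acls": acls,
    }


policy = translate("webshop", ["front-end", "middle-tier", "database"])
for tier, net in policy["subnets"].items():
    print(tier, net)
for rule in policy["acls"]:
    print(rule)
```

The developer only supplies the list of tiers; subnets and ACLs fall out of the translation step, which is the division of labour between audiences that the abstractions are meant to provide.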

Another example is the VPN designer, where you can bring up a new site by linking the device object in the GUI to a location. Then, depending on the authentication method, the only thing that needs to happen is physically connecting the WAN and LAN ports on the device at the branch, and the device will “bootstrap” automatically and pick up its configuration. The forwarding engine in the device is multi-tenant capable; each subnet is created as an L2 EVPN construct (remember that the idea of VNS is to be independent of the WAN technology), and the access ports of the box are piped into those.

Application Flexibility at the Branch

VNS not only provides network connectivity but also allows you to run containerized workloads on top of it (it’s an Intel Atom based system). The main idea is to automate the attachment of existing container based applications to the branch network. One example would be to run external network operations tools that perform local logging of LAN elements and then set up a single encrypted connection over the WAN, or that do local auditing of running configurations. Another would be to run a user simulation at the branch before go-live to validate user experience and adjust as needed.

Theoretically you can run any containerized app (pulled from Docker Hub) on VNS, provided that you have enough resources to run it. Nuage takes care of the multi-tenant network aspects of running multiple containers on a single host.

Boundary-less Wide Area Networking

Next up was Hussein Khazaal, solution director, talking about extending connectivity from the data center to the branch, to the public cloud, making a VPC your own personal branch office with consistent business policies between all of them.

The way it works is that you get (in this case) a virtual NSG from Nuage, which comes as an Amazon AMI (Amazon Machine Image), and deploy it like any other instance to your VPC. Now you can use the NSG-v in Amazon just like any other NSG from your application designer / VPN designer. The classic use case would be to load-balance your front-end (web) application between your data center and Amazon in times of increased load, essentially cloud-bursting made real. If you wanted to do something like this without Nuage you would need to figure out how to translate your business policies to the available constructs at each public cloud provider; the NSG-v, by contrast, consumes the centralized policies from the VSD just like any other forwarding engine.

*The Nuage Compendium page is under construction, and will be expanded over time.