
Containers and persistent storage

Containers are a method of operating system virtualization that allow you to run an application and its dependencies in resource-isolated processes. Containers allow you to easily package an application’s code, configurations, and dependencies into easy-to-use building blocks that deliver environmental consistency, operational efficiency, developer productivity, and version control. Containers are immutable, meaning they help you deploy applications reliably and consistently, independent of the deployment environment.

As containers continue to rise in popularity beyond the developer populace, the way these constructs are used becomes increasingly varied, and especially (but not exclusively) for enterprise applications the question of persistent storage comes up more and more. It is a fallacy to think only stateless applications can or should be containerized. If you take a look at https://hub.docker.com/explore/ you’ll see that about half of the most popular applications on Docker Hub are stateful, databases for example. (Post hoc ergo propter hoc?) If you think about monolithic applications versus micro-services: a monolithic application typically requires state, and if you pull this application apart into micro-services, some of those services can be stateless containers but others will still require state.

I’ll mainly use Docker as the example for this post, but many other container technologies exist, like LXD, rkt, and OpenVZ; even Microsoft offers containers with Windows Server Containers, Hyper-V isolation, and Azure Container Service.

Running a stateless container using Docker is quite straightforward (note that the official mysql image needs a root password, supplied via an environment variable, before it will start):

$ docker run --name demo-mysql -e MYSQL_ROOT_PASSWORD=my-secret-pw mysql

When you execute docker run, the container process that runs is isolated: it has its own file system, its own networking, and its own process tree, all separate from the (local or remote) host.
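
A trivial way to see that isolation for yourself (a quick sketch using the public alpine image):

$ docker run --rm alpine ps aux
PID   USER     TIME  COMMAND
    1 root      0:00 ps aux

Inside the container ps runs as PID 1 and sees no host processes at all.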

The Docker container is created from a read-only template called a Docker image. The “mysql” part of the command refers to this image, i.e. the containerized application you want to run, which is pulled from the registry. The data you create inside a container is stored on a thin writable layer, called the container layer, that sits on top of the stack of read-only layers, called the image layers, present in the base Docker image. When the container is deleted the writable layer is deleted with it, so your data does not persist. In Docker, the storage driver is responsible for enabling and managing both the read-only image layers and the writable layer; both read and write speeds through it are generally considered slow.
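
A minimal sketch to make that concrete (container and file names are just illustrative):

$ docker run --name demo -d alpine sh -c 'echo hello > /data.txt && sleep 3600'
$ docker exec demo cat /data.txt
hello
$ docker rm -f demo
$ docker run --name demo -d alpine sleep 3600
$ docker exec demo cat /data.txt
cat: can't open '/data.txt': No such file or directory

The second container starts from the same read-only image layers, but the file written to the first container’s writable layer is gone with it.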

[Figure: Docker image layers with the writable container layer on top]

Assuming you want persistent data for your containers, there are several ways to go about this. You can add a storage directory to a container’s virtual filesystem and map that directory to a directory on the host server. The data you create inside that directory on the container is saved on the host, allowing it to persist after the container shuts down, and the directory can also be shared between containers. In Docker this is made possible by using volumes. You can also use bind mounts, but these are dependent on the directory structure of the host machine, whereas volumes are completely managed by Docker itself. Keep in mind though that these volumes don’t move with container workloads, as they are local to the host. Alternatively, you can use volume drivers (Docker Engine volume plugins) to store data on remote systems instead of on the Docker host itself. If you are only interested in storing data in the container writable layer (i.e. on the Docker host itself), the Docker storage driver you select determines which filesystems are supported.
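
As a quick sketch of the difference (the host path and volume name are just examples):

# bind mount: tied to a specific directory on this particular host
$ docker run -d -e MYSQL_ROOT_PASSWORD=my-secret-pw -v /srv/mysql-data:/var/lib/mysql mysql

# named volume: created and fully managed by Docker
$ docker volume create mysql-data
$ docker run -d -e MYSQL_ROOT_PASSWORD=my-secret-pw -v mysql-data:/var/lib/mysql mysql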

Typically you would create a volume using the volume driver of your choice in the following manner:

$ docker volume create --driver=pure -o size=32GB testvol1

And then start a container and attach the volume to it:

$ docker run -ti -e MYSQL_ROOT_PASSWORD=my-secret-pw -v testvol1:/data mysql
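
You can then verify the volume exists and see which driver backs it:

$ docker volume ls
$ docker volume inspect testvol1

Anything the container writes under /data now lives in the volume, so it survives the container being deleted and can later be attached to a new container.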

Storage Vendors and Persistent Container Storage

Storage vendors have an incentive to make consuming their particular storage as easy as possible for these types of workloads, so many of them provide plug-ins to do just that.

One example is Pure Storage, who provide a Docker Volume Plugin for their FlashArray and FlashBlade systems; currently they support Docker, Swarm, and Mesos. Most other big-name storage vendors also have plugins available.

Then there are projects like REX-Ray, an open source storage management solution born out of the now defunct {code} by Dell EMC team. It allows you to use multiple different storage backends and serve those up as persistent storage for your container workloads.

On the virtualization front, VMware has something called the vSphere Docker Volume Service, which consists of two parts: the Docker Volume Plugin and a vSphere Installation Bundle (VIB) to install on the ESXi hosts. This allows you to serve up vSphere datastores (be it Virtual SAN, VMFS, or NFS based) as persistent storage for your container workloads.

Then there are newer companies that focus solely on providing persistent storage for container workloads; one of them is Portworx. Portworx wants to provide another abstraction layer between the storage pool and the container workload. The idea is that they provide a “storage” container that can then be integrated with the “application” containers. You can do this manually, or you can integrate with a container scheduler like Docker Swarm using Docker Compose, for example (Portworx provides a volume driver).

Docker itself has built specific plugins as well; Cloudstor is one such volume plugin. It comes pre-installed and pre-configured in Docker swarms deployed through Docker for AWS, and data volumes can be backed by either EBS or EFS. Workloads running in a Docker service that require access to low-latency/high-IOPS persistent storage, such as a database engine, can use a “relocatable” Cloudstor volume backed by EBS. When multiple swarm service tasks need to share data in a persistent storage volume, you can use a “shared” Cloudstor volume backed by EFS; such a volume and its contents can be mounted by multiple swarm service tasks, since EFS makes the data available to all swarm nodes over NFS.
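
As a sketch of what consuming Cloudstor looks like (the option names below may differ between plugin versions, so treat them as illustrative rather than definitive):

# “relocatable” volume backed by EBS, e.g. for a single-task database service
$ docker volume create -d "cloudstor:aws" --opt ebstype=gp2 --opt size=25 dbvol

# “shared” volume backed by EFS, mountable by multiple swarm tasks at once
$ docker volume create -d "cloudstor:aws" --opt backing=shared sharedvol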

Container Orchestration Systems and Persistent Storage

As most enterprise production container deployments will utilize some container orchestration system, we should also look at how external persistent storage is managed at that level. Kubernetes, for example, supports a volume plugin system (FlexVolume) that makes it relatively straightforward to consume different types of block and file storage. Additionally, Kubernetes recently started supporting an implementation of the Container Storage Interface (CSI), which helps accelerate vendor support for these storage plug-ins. Volume plugins are currently part of the core Kubernetes code and shipped with the core Kubernetes binaries, meaning that vendors wanting to add support for their storage system to Kubernetes (or even fix a bug in an existing volume plugin) must align themselves with the Kubernetes release process. With the adoption of the Container Storage Interface, the Kubernetes volume layer becomes extensible: third-party storage developers can write and deploy volume plugins exposing new storage systems in Kubernetes without ever having to touch the core Kubernetes code.
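
From the consuming side Kubernetes abstracts all of this away nicely: a pod claims storage through a PersistentVolumeClaim and the configured volume plugin (in-tree, FlexVolume, or CSI) takes care of provisioning and attaching it. A minimal sketch (the storage class name is hypothetical and depends on which driver you deploy):

$ cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 32Gi
  storageClassName: csi-example
EOF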

When using CSI with Docker, it relies on shared mounts (not Docker volumes) to provide access to external storage. With a mount, the external storage is mounted into the container; with a volume, a new directory is created within Docker’s storage directory on the host machine, and Docker manages that directory’s contents.
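
In Docker CLI terms the distinction looks like this (paths and names are illustrative):

# mount: an existing, externally managed path is made visible inside the container
$ docker run --rm --mount type=bind,source=/mnt/external,target=/data alpine ls /data

# volume: Docker itself creates and manages the backing directory
$ docker run --rm --mount type=volume,source=testvol1,target=/data alpine ls /data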

To use CSI you will need to deploy a CSI driver; a number of storage vendors have these available in various stages of development. For example, there is a Container Storage Interface (CSI) Storage Plug-in for VMware vSphere.

Pre-packaged container platforms

Another way vendors are trying to make it easier for enterprises to adopt these new platforms, including solving for persistence, is by providing packaged solutions (i.e. making it turnkey). This is not new, of course; not too long ago we saw the same thing happening with OpenStack through the likes of VIO (VMware Integrated OpenStack), Platform9, Blue Box (acquired by IBM), etc. The public cloud providers, meanwhile, are moving more towards container-as-a-service (CaaS) models with Azure Container Service, Google Container Engine, etc.

One example of a packaged container platform is the Cisco Container Platform. It is provided as an OVA for VMware (meaning it provisions containers inside virtual machines, not on bare metal at the moment), and initially it is supported on their HyperFlex platform, which provides the persistent storage layer via a Kubernetes FlexVolume driver. It communicates externally via Contiv, including with other components on the HX platform, like VMs running non-containerized workloads. For the load-balancing piece (between Kubernetes masters, for example) they bundle NGINX, and for monitoring and logging they bundle Prometheus (monitoring) and an ELK stack (logging and analytics).

Another example would be VMware PKS, which I wrote about in my previous post.

Conclusion

Containers are ready for enterprise use today, but some areas could do with a bit more maturity, one of them being storage. I fully expect to see continued innovation and tighter integrations as we figure out the validity of these use cases. A lot of progress has been made in the toolkits themselves, leading to the demise of earlier attempts like ClusterHQ/Flocker. As adoption continues, so will the maturity of these frameworks and plugins.

New Year, New Job.

I’m super excited to be taking on a new role in the NSBU at VMware: as of the 1st of January I’ll officially be joining the team as a Sr. Systems Engineer for the Benelux. I’ll be focused mainly on VMware NSX, including its integrations with other solutions (like vRA and OpenStack, for example).

Unofficially I’ve been combining this function with my “real” job for a couple of months now, ever since a dear and well-respected colleague decided to leave VMware. Recently I was fortunate enough to get the opportunity to attend a 2-week training at our Palo Alto campus on NSX-v, NSX-MH, OpenStack, VIO, OVS,…


The experience was face-meltingly good; I definitely learned a lot and got the opportunity to meet many wonderful people. One conclusion: the NSX team certainly is a very interesting and exciting place to be in the company.

In the last few months I got my feet wet by training some of our partner community on NSX (most are very excited about the possibilities, even the die-hard hardware fanatics), staffing the NSX booth at VMworld Europe, and taking on some speaking engagements, like my NSX session at the Belgian VMUG.


So why NSX?

In the past I’ve worked on a wide variety of technologies (being in a very small country and working for small system integrators you need to be flexible, and I guess it’s also just the way my mind works #squirrel!), but networking and virtualisation are my two main fields of interest, so how convenient that both are colliding!
I’ve been a pure networking consultant in the past, mainly working with Cisco and Foundry/HP ProCurve, and then moved more into application networking at Citrix ANG and Riverbed.

The whole network virtualisation and SDN field (let’s hold off the discussion of what’s what for another day) is on fire at the moment and is making the rather tedious and boring (actually I’ve never really felt that, but I’m a bit of a geek) field of networking exciting again. The possibilities and promise of SDN have lots of potential to disrupt and change an industry, and I’d like to wholeheartedly and passionately contribute and be a part of that.

As NSX is an enabling technology for a lot of other technologies, it needs to integrate with a wide variety of solutions. Two solutions from VMware that will have NSX integrated, for example, are EVO:RACK and VIO. I look forward to also working on those and hopefully finding some time to blog about them as well.

Other fields are also looking to the promise of SDN to enable new ways of getting things done; SocketPlane, for example, is trying to bring together Open vSwitch and Docker to provide pragmatic software-defined networking for container-based clouds. As VMware takes on a bigger and bigger role in the cloud-native apps space, it certainly will be interesting to help support all these efforts.

“If you don’t cannibalise yourself, someone else will.”
-Steve Jobs

I’m enjoying a few days off with my family and look forward to returning in 2015 to support the network virtualisation revolution!


Horizon Branch Office Desktop Architecture

VMware has a number of virtual desktop architectures that give a prescriptive approach to matching a company’s specific use case to a validated design. These architectures are not price-list bundles; they combine VMware’s own products with 3rd-party solutions, with the goal of bringing customers from the pilot phase all the way into production.

At the moment there are four different architectures focused on different use cases: the Mobile Secure Workplace, the AlwaysOn Workplace, the Branch Office Desktop, and the Business Process Desktop.


In this article I want to focus on the Branch Office Desktop; the partner solutions built around it are covered below.

Seeing that there are over 11 million branch offices across the globe, a lot of people are working with remote, or distributed, IT infrastructures, which potentially have a lot of downsides (no remote IT staff, slow and unreliable connectivity, no centralised management,…).


With the Horizon Branch Office Desktop you have some options to alleviate those concerns and bring the remote workers into the fold. Depending on your specific needs, you could look at several approaches.

If you have plenty of bandwidth and low latency, using a traditional centralised Horizon View environment is going to be the most cost-effective and easiest path to pursue. There are of course additional options if you have bandwidth concerns but still want to provide a centralised approach.

Optimized WAN connectivity delivered by F5 Networks.

The F5 solution offers simplified access management, hardened security, and optimized WAN connectivity between the branch locations and the primary datacenter, using a Virtual Edition of F5’s Traffic Manager in the branch combined with a physical appliance in the datacenter.


The solution provides secure access management via BIG-IP APM (Access Policy Manager), an SSL-VPN solution with integrated AAA services and SSO capabilities. BIG-IP LTM (Local Traffic Manager) is an application delivery networking solution that provides load balancing for the Horizon View Security and Connection Servers. The solution can also provide WAN optimisation through its WAN Optimization Manager (WOM) module, in this case focused on other, non-PCoIP branch traffic.

If you find that ample bandwidth is not available, however, you still have other options, like the architectures combining Horizon with Riverbed, Cisco, and IBM, which I’ll focus on in this article.

Riverbed for the VMware (Horizon) Branch Office Desktop.

With Riverbed’s architecture we essentially take your centralised storage (a LUN from your existing SAN array) and “project” this storage across the WAN towards the branch office. In the branch we have an appliance called the Granite Edge (a Steelhead EX with Granite), which presents this “projected” LUN to any server, including itself (the Granite Edge appliance is also an x86 server running VMware ESXi). If we install the virtual desktops on the LUN we have just “projected” out from the central SAN environment, these desktops are now essentially locally available in the branch office. This means that, from the point of view of the end user, they set up a local (LAN) PCoIP connection towards the virtual desktop and can work with the same local performance one would expect in the datacenter location.


The end result is that from a management perspective you keep (or gain) centralised control, and from an end-user perspective you get the same performance as if you were local. For more details on this architecture you can download a deployment guide here: Deployment Guide: Riverbed for the VMware Branch Office Desktop …

Cisco Office in a Box.

With Cisco’s Office in a Box architecture you take their Integrated Services Routers Generation 2 (ISR G2) platforms (Cisco 2900 and 3900 Series ISRs) and the Cisco UCS E-Series Servers, and combine those into one physical platform that can host up to 50 virtual desktops in a Branch Office.


In this case you have essentially built a remote desktop appliance that sits in the branch office; all virtual machines share the direct-attached storage (DAS) of the Cisco UCS E-Series blade. So the management domain is not stretched across the WAN; instead you have a “pod-like” design that includes everything you need to run virtual desktops in the branch.


For more information on Cisco’s architecture please see: http://www.cisco.com/c/en/us/products/collateral/servers-unified-computing/ucs-e-series-servers/white_paper_c11-715347.html

IBM Branch Office Desktop.

IBM has another validated approach that combines VMware Mirage and VMware Horizon View technologies to address the varying requirements within the branch office.

With VMware Mirage you can centrally manage OS images for both persistent virtual desktops and physical endpoints, while ensuring employees have fast, secure access to applications and data. With centralized images and layered single image management, the same image can be deployed in a server-hosted virtual desktop for remote execution and natively to a physical PC or client hypervisor for local execution.

This approach lets you deliver centrally managed desktops with LAN-like performance and disaster recovery capabilities to locations with robust and reliable as well as unreliable wide area networks.

These components run on IBM’s (Lenovo’s) System x and Flex System compute nodes, IBM storage, and IBM System Networking components.


For more information on the IBM architecture please see: http://thoughtsoncloud.com/2012/10/vmware-robo-solution-ibm-vmworld/

Alternatively (or in conjunction with all the architectures mentioned), you can also independently leverage Horizon Mirage for the branch office, specifically if you have to deal with frequently disconnected users (laptop users that are not always in the office, for example) or physical devices.

For more information on all these Branch Office architectures please see: http://www.vmware.com/remote-branch/remote-branch-office  and http://www.vmware.com/be/nl/remote-branch/partners.html for the partner extended capabilities.