Introduction to Apache Mesos and Mesosphere DCOS

A little history…

Mesos started as a research project at UC Berkeley in early 2009 by Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy Katz, Scott Shenker, and Ion Stoica. Benjamin then brought Mesos to Twitter where it now runs their datacenters, and later became Chief Architect at Mesosphere which builds the Mesosphere Datacenter Operating System (DCOS).

The motivation for building Mesos was to try and increase the performance and utilization of clusters; the team believed that static partitioning of resources in the datacenter should be considered harmful. For example:

Let's say your datacenter has 9 hosts:

[Screenshot: 9 datacenter hosts]

And you statically partition your hosts, assigning 3 each to 3 applications (in this example Hadoop, Spark, and Ruby on Rails):

[Screenshot: hosts statically partitioned across the 3 applications]

Then the utilization of those hosts would be sub-optimal:

[Screenshot: sub-optimal utilization under static partitioning]

But if you use all resources, in this case the 9 hosts, as a shared pool that you can schedule against as needed, utilization improves:

[Screenshot: improved utilization with a shared resource pool]
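To make the utilization argument concrete, here is a toy calculation in Python (the demand numbers are hypothetical; the figures above don't state actual loads): each app is capped at its own 3 hosts under static partitioning, while the shared pool can absorb one app's peak with another app's idle capacity.

```python
# Hypothetical instantaneous demand, in hosts, per application.
demand = {"hadoop": 1, "spark": 4, "rails": 1}

PARTITION = 3   # hosts per application under static partitioning
POOL = 9        # hosts in the shared pool

# Static partitioning: each app can use at most its own 3 hosts,
# so Spark's burst to 4 hosts is clipped even though 4 hosts sit idle.
served_static = sum(min(d, PARTITION) for d in demand.values())

# Shared pool: demand is served as long as the whole pool has capacity.
served_shared = min(sum(demand.values()), POOL)

print("static:", served_static, "of", sum(demand.values()), "hosts of demand served")
print("shared:", served_shared, "of", sum(demand.values()), "hosts of demand served")
```

Under static partitioning one host of Spark's demand goes unserved while the Hadoop and Rails partitions sit partly idle; the shared pool serves everything.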

The second premise the team held was that a new framework for distributed systems was needed, i.e. not every use case lends itself well to something like Map/Reduce (which in itself led to the birth of Spark, but that's another story for another day), so a new, more straightforward and universally applicable framework was required.

What does the framework (distributed system) of Mesos look like?
[Screenshot: a framework's coordinator and its task-executing workers]

Typically in a distributed system you have a coordinator (scheduler) and workers (task execution).
The coordinator runs processes/tasks simultaneously (distribution), handles process failures (fault-tolerance), and optimizes performance (elasticity). In other words it coordinates the execution of whatever you are trying to run in the datacenter (which does not need to be an entire program; it could also be a computation of some sort). Mesos calls this combination scheduling.
[Screenshot: the scheduler coordinating task execution in the datacenter]

In other words a Mesos framework is a distributed system that has a scheduler.

What Mesos really provides is a level of abstraction between the scheduler and the machines on which it is trying to execute its tasks.

[Screenshot: Mesos as an abstraction layer between the scheduler and the machines]

So in Mesos the scheduler communicates with the Mesos layer (via APIs) instead of directly with the machines. The idea is to fix the issues of static partitioning: you no longer have a scheduler for each specific workload talking to its designated workers; instead the schedulers talk to Mesos, which in turn talks to the entire pool of resources.

[Screenshot: multiple schedulers talking to Mesos instead of directly to machines]

The immediate benefit is that you can run multiple distributed systems on the same cluster of machines and dynamically share those resources more efficiently (no more static partitioning).

Secondly, because of this abstraction it provides common functionality (failure detection, task distribution, task starting, task monitoring, task killing, task cleanup) that each distributed system typically tries to implement in its own unique way.

Where does Mesos fit as an abstraction layer in the datacenter?

The Mesos layer wants to make it easier to build and run these frameworks by scheduling and sharing resources.
IaaS' abstraction is machines, i.e. you ask for x machines and it provides them, and it thus sits at a lower level of abstraction than Mesos.
PaaS is concerned with deploying and managing applications/services and does not care about the underlying infrastructure; it thus sits at a higher level of abstraction than Mesos. In terms of interactions, a PaaS is typically used by developers, whereas Mesos is driven by software through APIs.

In other words you could build a PaaS on top of Mesos (like Marathon for example, which is more PaaS-like than an actual PaaS, but anyway) and you could run Mesos on top of an IaaS (like OpenStack for example). The idea of breaking down hard partitioning comes back again if you run your Mesos layer on top of a combination of systems (OpenStack + hardware + VMs for example) and are able to schedule your workloads across all of them. In that sense you can think of Mesos as a sort of datacenter kernel, i.e. it abstracts away machines and allows you to build distributed systems on top of any of those underlying components.

So Apache Mesos is a distributed system for running and building other distributed systems (like Spark, for example).

Architectural details of Mesos

In Mesos, the framework's scheduler requests what is needed at that specific time. This differs from traditional distributed systems, where the person making the request (in Mesos it is the framework making the request, not a person) would need to figure out the specification beforehand and ask for those resources up front, even though the requirements typically change over time (think of Map/Reduce, where the required resources differ between the Map and the Reduce phases).

Mesos then offers the best approximation of those resources immediately, instead of waiting until it can fulfil the request completely/exactly. (It wants to be non-blocking, as in most cases this is sufficient, i.e. you don't need the exact amount of requested resources immediately.)

Now the framework (distributed system) uses the offers from Mesos to perform its own scheduling, i.e. "two-level scheduling".

This is then turned into a task and submitted to run somewhere in the datacenter.

The reason to have this system of "two-level scheduling" is to be able to support multiple distributed systems at the same time. Mesos provides the resource allocations to, in this case, Spark, and Spark makes the decision about which tasks to run given the available resources (i.e. "I want to run these maps now because I can satisfy their requirements").
[Screenshot: two-level scheduling with Spark on Mesos]
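As a sketch of this flow, here is a toy two-level scheduler in Python. This is an illustration only, not the real Mesos API (the slave names, task names, and resource shapes are all made up): the Mesos-like layer offers each slave's free resources, and the framework decides which of its pending tasks fit each offer.

```python
# Toy two-level scheduling: a Mesos-like layer makes resource offers,
# the framework decides which tasks to launch against each offer.
# (Illustrative only; the real Mesos API differs.)

slaves = {"slave1": {"cpus": 4, "mem": 8}, "slave2": {"cpus": 2, "mem": 4}}

# Pending framework tasks and what each one needs.
pending = [("map-1", {"cpus": 2, "mem": 2}),
           ("map-2", {"cpus": 2, "mem": 2}),
           ("reduce-1", {"cpus": 2, "mem": 4})]

launched = []  # (task, slave) decisions made by the framework

for slave, free in slaves.items():          # level 1: Mesos offers resources
    for task, need in list(pending):        # level 2: framework picks tasks
        if all(free[r] >= need[r] for r in need):
            for r in need:                  # accept: subtract from the offer
                free[r] -= need[r]
            launched.append((task, slave))
            pending.remove((task, need))

print(launched)   # tasks the framework chose to run, and where
print(pending)    # tasks left waiting for a future offer
```

Note how the framework, not the offering layer, decides placement: the two map tasks fill the first offer, and the reduce task waits for the second.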

So once the tasks are submitted from the framework to Mesos, they need to be executed. The Mesos master hands the task to one of the slaves, where an agent is running to manage launching those tasks (i.e. if it's a command it runs it; if the task needs specific resources, like .jar files, the agent pulls those down into a sandbox and then launches the task).
Alternatively, the framework can decide to run its tasks with an executor (a layer of indirection owned by the framework, which can also be used to run tasks as threads).

To provide resource isolation Mesos has built-in support for cgroups and namespaces, or you can give it a Docker container to run as a task. This gives you multi-tenancy (multiple frameworks) across your pooled resources, both across machines and inside individual machines.

You can use reservations if you want, but then you are of course getting back into hard-partitioning territory. If you have stateful applications you need reservations (the task always needs to run on the same machine(s)) and persistent volumes (data needs to survive restarts), both of which are possible with Mesos.

Mesosphere DCOS

DCOS (Data Center Operating System) takes the Mesos "kernel" and builds around/upon it with additional services and functionality: add-on modules like mesos-dns, tooling like a CLI, a GUI, a repository for the packages that you want to run, and frameworks like Marathon (a.k.a. distributed init), Chronos (a.k.a. distributed cron), etc.

Like the name implies it is meant to be an operating system that spans all of the machines in your datacenter or cloud. DCOS should run in any modern Linux environment: public and private cloud, virtual machines, and bare metal (supported platforms are Amazon AWS, Google Compute Engine, Microsoft Azure, OpenStack, VMware, RedHat, CentOS, CoreOS, and Ubuntu). Currently around 40 services are available for DCOS in the public repository (Hadoop, Spark, Cassandra, Jenkins, Kafka, MemSQL, …).


VMware AppCatalyst, Bonneville, and Photon.

VMware has lots and lots of customers, running lots and lots of workloads, both dev/test workloads and production workloads; you know, like, super duper important stuff that cannot, under any circumstances, break.

As the whole DevOps "movement" makes clear, developers and operations teams have different requirements and different responsibilities. Developers want speed, operations teams want stability, and both are trying to respond to the demands of the business, which wants faster time to market. A gross oversimplification, I know, but since DevOps is getting a lot of attention lately you'll have no problem digging up articles and blog posts on it.


Now, as VMware's customers rightfully expect, the idea is to try and marry both worlds and make these new methods enterprise-grade without losing the original benefits. I personally notice a lot of resistance: enterprise customers understand the benefits, and some internal teams are pushing hard to incorporate them, but a lot of people still seem to adopt a "let's wait and see / it's just another fad" attitude.


VMware has done a lot of work in the past to make its products more "DevOps"-friendly, with things like vRealize CloudClient, a command-line utility that provides verb-based access with a unified interface across vCloud Automation Center APIs, and vRealize Code Stream, which provides release automation and continuous delivery to enable frequent, reliable software releases while reducing operational risk.

You don’t have meetings with other teams, you talk to their API instead.
-Adrian Cockcroft

And now at the recent DockerCon in San Francisco VMware announced the tech-preview of AppCatalyst and Project Bonneville.

VMware AppCatalyst is an API and Command Line Interface (CLI)-driven MacOS type 2 hypervisor (based on VMware Fusion but without the GUI, 3D graphics support, virtual USB support, and Windows guest support) that is purpose-built for developers, with the goal of bringing the datacenter environment to the desktop. Currently a technology preview, VMware AppCatalyst offers developers a fast and easy way to replicate a private cloud locally on their desktop for building and testing containerized and microservices-based applications. The tool includes Project Photon (already announced in April), an open source minimal Linux container host, Docker Machine and integration with Vagrant. Panamax and Kitematic support are planned in the near future. AppCatalyst uses MacOS as its host operating system (i.e., the user must use MacOS 10.9.4 or later as their host operating system to use AppCatalyst).

You can download the Tech Preview of AppCatalyst here; it comes with an installer so it's pretty easy to get up and running. Once installed, AppCatalyst does not appear under your Applications folder; instead, use your Terminal to navigate to /opt/vmware/appcatalyst

[Screenshot: AppCatalyst installed under /opt/vmware/appcatalyst]

As mentioned above AppCatalyst comes pre-bundled with Project Photon – VMware’s compact container host Linux distribution. When you download AppCatalyst, you can point docker-machine at it, start up a Photon instance almost instantly (since there’s no Linux ISO to download), and start using Docker.

Another common use of the desktop hypervisor is with Vagrant. Developers build Vagrantfiles and then "vagrant up" their deployments. Vagrant creates and configures virtual development environments and can be seen as a higher-level wrapper around AppCatalyst. You can find the plugin for Vagrant here.

Since Project Photon is included in AppCatalyst it’s pretty easy to get started with deploying a Photon Linux Container Host.

appcatalyst vm create photon1
Info: Cloned VM from '/opt/vmware/appcatalyst/photonvm/photon.vmx' to '/Users/filipv/Documents/AppCatalyst/photon1/photon1.vmx'

appcatalyst vm list
Info: VMs found in '/Users/filipv/Documents/AppCatalyst'

appcatalyst vmpower on photon1
2015-06-27T10:18:38.530| ServiceImpl_Opener: PID 2949
Info: Completed power op 'on' for VM at '/Users/filipv/Documents/AppCatalyst/photon1/photon1.vmx'

appcatalyst guest getip photon1

I can now SSH into the VM:

[Screenshot: SSH session into the Photon VM]

And securely launch a Docker container via Project Photon:

[Screenshot: launching a Docker container on the Photon VM]

As mentioned before, the idea is to interface with AppCatalyst via REST API calls; you can enable this by first starting the appcatalyst-daemon and then going to port 8080 on your localhost.

Once the daemon is running we can start making REST API calls, for example to retrieve the IP address of the Docker VM we previously created:

[Screenshot: REST API response containing the VM's IP address]
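The same call can of course be scripted. Below is a hedged Python sketch; the '/api/vms/{id}/ipaddress' route and the response shape are assumptions for illustration, not documented AppCatalyst endpoints (check the daemon's own API listing on localhost:8080 for the real routes), so the demo runs against a canned response instead of a live daemon.

```python
import json
from urllib.request import urlopen

BASE = "http://localhost:8080"

def get_vm_ip(vm_id, fetch=urlopen):
    """Ask the AppCatalyst daemon for a VM's IP address.
    NOTE: the '/api/vms/{id}/ipaddress' path is a hypothetical example,
    not a documented route -- consult the daemon's own API listing."""
    with fetch(f"{BASE}/api/vms/{vm_id}/ipaddress") as resp:
        return json.load(resp)["message"]

# Offline demonstration with a canned response, so no daemon is needed:
class FakeResponse:
    def __init__(self, body): self.body = body
    def read(self): return self.body
    def __enter__(self): return self
    def __exit__(self, *a): return False

def fake_fetch(url):
    # Assumed response shape, modelled on the screenshot's JSON reply.
    return FakeResponse(b'{"code": 200, "message": "192.168.2.128"}')

print(get_vm_ip("photon1", fetch=fake_fetch))  # -> 192.168.2.128
```

With the daemon actually running, dropping the fetch argument would issue the HTTP call against localhost:8080 instead.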

Since the last VMworld VMware has been talking about this concept of containers and VMs being better together, which led to a lot of discussion about hypervisor overhead, each container needing its own OS, potential lock-in, etc. But again, this is where VMware is trying to marry Dev with Ops and make the use of containers feasible in the enterprise environment. Project Bonneville takes another step in this direction by making containers first-class citizens on the vSphere hypervisor.

Bonneville orchestrates all the back-end systems: VM template (with Photon), storage, network, Docker image cache, etc. It can manage and configure native ESX storage and network primitives automatically as part of a container deploy.

Bonneville is a Docker daemon with custom VMware graph, execution and network drivers that delivers a fully-compatible API to vanilla Docker clients. The pure approach Bonneville takes is that the container is a VM, and the VM is a container. There is no distinction, no encapsulation, and no in-guest virtualization. All of the necessary container infrastructure is outside of the VM in the container host. The container is an x86 hardware virtualized VM – nothing more, nothing less.

[Screenshot: Project Bonneville architecture]

Bonneville uses VMFork (Instant Clone / Project Fargo) to spin up a new VM every time a container is launched. By doing this the operations team now sees VM instances in its environment that it can treat, i.e. "operationalize", just like regular virtual machines (Bonneville updates VM names and metadata fields for the container VMs it creates, for full transparency in vCenter and any vSphere ecosystem products). The obvious added benefit is that while each container is a VM, each container is not dragging along a full-blown Linux host OS. Instant Cloned VMs are powered on and fully booted in under a second and initially use no physical memory.

You can see a demo of Project Bonneville below:


VMware NSX and Palo Alto NGFW

VMware NSX is a platform for network and security virtualization, and as such it can integrate onto its platform certain functionality that is not delivered by VMware itself. One such integration point is Palo Alto Networks' Next-Generation Firewall.

VMware NSX has built-in L2-4 stateful firewall capabilities, both in the distributed firewall running directly in the ESXi hypervisor for east-west traffic and in the Edge Services Gateway VM for north-south traffic. If L2-4 is not sufficient for your specific use case, you can use NSX's Service Composer to steer traffic towards a third-party solution provider for additional inspection.

At a high level the solution requires 3 components: VMware NSX, the Palo Alto Networks VM-series VM-1000-HV, and Palo Alto Networks' central management system, Panorama.

[Screenshot: solution components (NSX, VM-series, Panorama)]

Currently the VM-1000-HV supports 250,000 sessions (8,000 new sessions per second) and 1 Gbps of firewall throughput (with App-ID enabled). The VM-series firewall is installed on each host of the cluster where you want to protect virtual machines with Palo Alto's NGFW. Each VM-series firewall takes 2 vCPUs and 5 GB of RAM.

[Screenshot: VM-series firewall installed on each host in the cluster]

If you look (using summarize-dvfilter) at each ESXi host after installation, you should see the VM-series show up in the dvfilter slowpath section.

[Screenshot: summarize-dvfilter output with the VM-series in the slowpath section]

We can also look at the Panorama central management console and verify that our VM-series are listed under managed devices.

[Screenshot: VM-series listed under managed devices in Panorama]

Deciding which traffic to pass to the VM-series is configured using the Service Composer in NSX. The Service Composer provides a framework that allows you to dictate what you want to protect by creating security groups, and then deciding how to protect the members of this group by creating and linking security policies.

[Screenshot: Service Composer security groups and policies]

It is perfectly feasible to use security policy to first let NSX's distributed firewall deal with certain types of traffic (up to layer 4) and only steer other "interesting" traffic towards the Palo Alto VM-series; this way you simultaneously benefit from the distributed throughput of the DFW and the higher-level capabilities of the Palo Alto Networks NGFW.

Using the Service Composer, we create a security policy and use the Network Introspection Service to select which external third-party service we want to steer traffic to. In this case we select the Palo Alto Networks NGFW and can further select the source, destination, and specific traffic (protocol/port) that we want the VM-series to handle.


Today only the traffic itself is passed to the external service, but it would be feasible to pass along more metadata that the third-party provider could act upon. For example, what if we could pass along that the VM we are protecting is running Windows Server 2003 and thus needs certain additional security measures applied?

So now that we have a policy that redirects traffic to the VM-series, we need to apply it to a specific group. The power of combining NSX with Palo Alto Networks lies in the fact that we can use dynamic groups (both in NSX and in Panorama) and that the members of the dynamic groups are synced (about every 60 seconds) between both solutions. This means that if we add or remove VMs from groups, the firewall rules are automatically updated. No more dealing with large lists of outdated firewall rules relating to decommissioned applications that nobody is willing to risk deleting because no one is sure what the impact would be.

For example, we could create a security group using dynamic membership based on a security tag; this tag could easily be applied as metadata by a cloud management platform (vRealize Automation, for example) at the time the VM is created (or you can manually add/remove security tags using the vSphere Web Client).

[Screenshot: security group with dynamic membership based on a security tag]

In Panorama we also have the concept of dynamic address groups; these are linked one-to-one with security groups in NSX.


If we look at the group membership of the address groups in Panorama we will see the IP address of the VM; this can then be leveraged to apply firewall rules in Palo Alto Networks.

[Screenshot: dynamic address group membership in Panorama showing the VM's IP]

NOTE: if I removed the VM from the security group in NSX, about 60 seconds later the IP address in Panorama would disappear.

Traffic is redirected by the filtering and traffic-redirection modules that run between the VM and the vNIC. The filtering module is an extension of the NSX distributed firewall; the traffic-redirection module defines which traffic is steered to the third-party service VM (the VM-series VM in our case).


If we use the same dvfilter command (summarize-dvfilter) on the ESXi host as before, we can see which slots are occupied:

Slot 0: implements vDS Access Lists.
Slot 1: Switch Security module (swsec); captures DHCP ACK and ARP messages and forwards this info to the NSX Controller.
Slot 2: NSX Distributed Firewall.
Slot 4: Palo Alto Networks VM-series.

[Screenshot: summarize-dvfilter output showing the occupied slots]

So now that we are able to steer traffic towards the Palo Alto Networks NGFW we can apply security policies. As an example, we have built some firewall rules blocking ICMP and allowing SSH between two security groups.

[Screenshot: firewall rules blocking ICMP and allowing SSH between the two security groups]

As you could see from the picture earlier, the VM in the SG-PAN-WEB group has an IP address matching the member IP seen above in the dynamic group DAG-WEB in Panorama.

We are not allowed to ping a member of the dynamic group DAG-APP as dictated by the firewall rules on the VM-series firewall.

Since SSH is allowed we can test this by trying to connect to a VM in the DAG-APP group.


We can also verify that this session shows up on the VM-series firewall by opening the console in the vSphere Web Client.

[Screenshot: the SSH session visible on the VM-series console]

And finally if we look at the monitoring tab on Panorama we can verify that our firewall rules are working as expected.

[Screenshot: Panorama monitoring tab showing the traffic logs]

So that's it for this brief overview of using the Palo Alto Networks NGFW in combination with VMware NSX. As you can see from the screenshot below, NSX allows a broad list of third-party solutions to be integrated, so the solution is very extensible and true to its goal of being a network and security platform for the next-generation data center.

[Screenshot: list of third-party services available for NSX integration]


Burn the heretic!

Talking about new ways to "fix" old problems, or about enabling functionality that wasn't even possible before, by introducing something that goes against established doctrine can be an interesting experience.

The flip side of the coin is that a lot of "new ways" are greatly oversold; of course a certain technology or product can't fix ALL your problems, so keep asking the tough questions.

But by keeping an open mind maybe you can see value that wasn’t there before, maybe by embracing change your world can become a whole lot more interesting, maybe you can become the automator instead of eventually becoming the automated.

“You can either be a victim of the world or an adventurer in search of treasure.”

-Paulo Coelho

But, but, if we implemented that we would lose x-y-z…

Back in the days of the mainframe (wait, didn't IBM just release the z13? anyway…) you could do end-to-end performance tracing. You could issue an I/O and follow the transaction throughout the system (connect time, disconnect time, seek time, rotational delay, …). This worked because the mainframe was a single monolithic system: it had a single clock against which the I/O transaction could be measured, and the protocol carrying the I/O cared about this metadata and allowed it to be traced. Today we have distributed systems; tracing something end to end is a whole lot trickier, but that hasn't stopped us from evolving, because we saw the value and promise of the "new way". What we have gained is much greater than what we have lost. Where you differentiate yourself has changed, the value you can get from IT has moved up the stack, and we live in a world of abundance now, not a world of scarcities.


It's all about the use case, stupid!

I work for a vendor, and I evangelise a new way of looking at networking through network virtualisation/software-defined networking (God, I hate that I need to write both terms in order not to upset people; who cares what we call it, seriously). Obviously this stirs up a lot of controversy among "traditional" networking people, some of it warranted, some of it not. Just like it did when we first started talking about server virtualisation. In the end it comes down to the use case: every technology ever created was made with a specific set of use cases in mind. If those use cases make sense for your organisation, if they can move your business forward somehow, maybe it is worth a (second) look.

I won’t spend time talking about my specific solutions’ use cases, that’s not what this blogpost is about.

A very interesting new way (at least in my humble opinion) of looking at things in a software-defined world is the concept of machine learning: systems that use data to make predictions or decisions rather than requiring explicit programming. What if the network could look at what is happening on the network, combine this with historical data (historical analytics), and make autonomous decisions about the best future configuration? Maybe do things like redirect traffic by predicting congestion (using near real-time analytics), rearrange paths based on the workload's requirements (using predictive analytics), etc.

This kind of thinking requires a capable and unbound platform, something that can quickly adapt and incorporate new functionality. We now have big data platforms that can give us these insights using analytics, combine this with a programmable network and we have a potent solution for future networking.

Someone who is very active and vocal in this space is David Meyer, currently CTO for Service Providers and Chief Scientist at Brocade. I highly recommend checking out some of his recent talks, transcripts of which you can find on his webpage, or have a look at the YouTube video below for his presentation during Network Field Day 8, where he talks about the concept of Software Defined Intelligence.

*Regarding the title: yes, I am a Warhammer 40K fanboy ;-)


VMware NSX with Trend Micro Deep Security 9.5

I recorded a brief demonstration showing the integration between Trend Micro Deep Security and VMware NSX.


vSphere 6 – vMotion nonpareil

I remember when I was first asked to install ESX 2.x for a small internal project; it must have been somewhere around 2004-2005. I was working at a local systems integrator and got the task because no one else was around at the time. While researching what this thing was and how it actually worked I got sucked in, and some time later, when I saw VMotion (sic) for the first time, I was hooked.

Sure, I've implemented some competing hypervisors over the years, and even worked for the competition for a brief period (in my defence, I was in the application networking team ;-) ). But somehow, at the time, it was like driving a kit car when you knew the real deal was parked just around the corner.

So today VMware announced the upcoming release of vSphere 6, arguably the most recognised product in the VMware stable and the foundation of many of its other solutions.

A lot of new and improved features are available, but I wanted to focus specifically on the feature that impressed me the most so many years ago: vMotion.

In terms of vMotion, vSphere 6 now gives us:

  • Cross vSwitch vMotion
  • Cross vCenter vMotion
  • Long Distance vMotion
  • Increased vMotion network flexibility

Cross vSwitch vMotion

Cross vSwitch vMotion allows you to seamlessly migrate a VM across different virtual switches while performing a vMotion operation; this means you are no longer restricted by the network you created on the vSwitches when you vMotion a virtual machine. It also works across a mix of standard and distributed virtual switches. Previously, you could only vMotion from vSS to vSS or within a single vDS.

[Screenshot: Cross vSwitch vMotion]

The following Cross vSwitch vMotion migrations are possible:

  • vSS to vSS
  • vSS to vDS
  • vDS to vDS (including metadata, i.e. statistics)
  • vDS to vSS is not possible

The main use case for this is data center migrations, whereby you can migrate VMs to a new vSphere cluster with a new virtual switch without disruption. It does require the source and destination portgroups to share the same L2; the IP address within the VM will not change.

Cross vCenter vMotion

Expanding on the Cross vSwitch vMotion enhancement, vSphere 6 also introduces support for Cross vCenter vMotion.

vMotion can now perform the following changes simultaneously:

  • Change compute (vMotion) – Performs the migration of virtual machines across compute hosts.
  • Change storage (Storage vMotion) – Performs the migration of the virtual machine disks across datastores.
  • Change network (Cross vSwitch vMotion) – Performs the migration of a VM across different virtual switches.

and finally…

  • Change vCenter (Cross vCenter vMotion) – Moves the VM to a different vCenter Server instance, i.e. changes which vCenter manages the VM.

Like Cross vSwitch vMotion, Cross vCenter vMotion requires L2 network connectivity since the IP of the VM will not change. This functionality builds upon Enhanced vMotion; shared storage is not required.

[Screenshot: Cross vCenter vMotion]

The use cases are migration from a Windows-based vCenter to the vCenter Server Appliance (also take a look at the scale improvements of the vCSA in vSphere 6), i.e. no more Windows and SQL licenses needed, replacement of vCenters without disruption, and the possibility to migrate VMs across local, metro, and cross-continental distances.

Long Distance vMotion

Long Distance vMotion is an extension of Cross vCenter vMotion, targeted at environments where vCenter servers are spread across large geographic distances and the latency between sites is 150 ms or less.

The requirements for Long Distance vMotion are the same as for Cross vCenter vMotion, with the addition that the maximum latency between the source and destination sites must be 150 ms or less and that 250 Mbps of bandwidth is available.

The operation is serialized, i.e. it is a per-VM operation, each requiring 250 Mbps.

The VM network will need to be a stretched L2 because the IP of the guest OS will not change; if the destination portgroup is not in the same L2 domain as the source, you will lose network connectivity to the guest OS. This means that in some topologies, such as metro or cross-continental, you will need a stretched L2 technology in place. The stretched L2 technology is not specified (VXLAN is an option here, as are NSX L2 gateway services); any technology that can present the L2 network to the vSphere hosts will work, as ESXi is unaware of how the physical network is configured.
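To get a feel for why 250 Mbps is the floor, here is a hedged back-of-envelope model of iterative pre-copy in Python. The VM size and page dirty rate are hypothetical, and real vMotion uses optimizations this ignores, so treat it purely as an illustration of why bandwidth and dirty rate both matter:

```python
# Back-of-envelope pre-copy model: copy all memory, then keep re-copying
# pages dirtied during the previous pass until the remainder is small.
# Simplified illustration only -- not how vMotion is actually implemented.

link_mbps = 250                   # minimum bandwidth for Long Distance vMotion
link_bytes_s = link_mbps * 1_000_000 / 8

vm_mem = 8 * 1024**3              # 8 GiB VM (hypothetical)
dirty_bytes_s = 20 * 1024**2      # 20 MiB/s dirtied while copying (hypothetical)

remaining = vm_mem
total_seconds = 0.0
passes = 0
while remaining > 64 * 1024**2 and passes < 30:  # stop when ~64 MiB remain
    t = remaining / link_bytes_s        # time to send this pass
    total_seconds += t
    remaining = dirty_bytes_s * t       # what got dirtied meanwhile
    passes += 1

print(f"{passes} passes, ~{total_seconds/60:.1f} minutes before switch-over")
```

The loop only converges because the link drains memory faster than the guest dirties it; with a dirty rate near or above the link rate, the remainder would never shrink, which is one intuitive reason a minimum bandwidth is mandated.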

Increased vMotion network flexibility

ESXi 6 will have multiple TCP/IP stacks; this enables vSphere to improve scalability and offers flexibility by isolating vSphere services on their own stacks. It also allows vMotion to work over a dedicated Layer 3 network, since vMotion can now have its own memory heap, ARP and routing tables, and default gateway.


In other words, the VM network still needs L2 connectivity since the virtual machine has to retain its IP, but the vMotion, management, and NFC networks can all be L3 networks.


Introducing VMware vCloud Air Advanced Networking Services

VMware just announced some additions to its public cloud service, vCloud Air; one of the additions is Advanced Networking Services, powered by VMware NSX. Today the networking capabilities of vCloud Air are based on vCNS features; moving forward these will be provided by NSX.

If you look at the connectivity options from your Data Center towards vCloud Air today you have:

  • Direct connect which is a private circuit such as MPLS or Metro Ethernet.
  • IPSec VPN
  • or via the WAN to a public IP address (think webhosting)

By switching from vCNS Edge devices to NSX Edge devices vCloud Air is adding SSL VPN connectivity from client devices to the vCloud Air Edge.


VMware is adding, by using the NSX Edge Gateway, dynamic routing support (OSPF, BGP) and a full-fledged L4-L7 load balancer (based on HA Proxy) that also provides SSL offloading.
As mentioned before, SSL VPN to the vCloud Air network is also available, with clients for Mac OS X, Windows, and Linux.
Furthermore, the number of available interfaces has been greatly increased from 9 to 200 sub-interfaces, and the system now also provides distributed firewall capabilities (firewall policy linked to the virtual NIC of the VM).


The NSX Edge can use BGP to exchange routes between vCloud Air and your on-premises equipment over Direct Connect. NSX Edge can also use OSPF to exchange internal routes with other NSX Edges or an L3 virtual network appliance.


NSX also introduces the concept of Micro-Segmentation to vCloud Air. This allows implementation of firewall policy at the virtual NIC level. In the example below we have a typical 3-tier application with each tier placed on its own L2 subnet.

[Screenshot: 3-tier application with each tier on its own L2 subnet]

With micro-segmentation you can easily restrict traffic to the application and database tiers while allowing traffic to the web tier, even though (just as an example) all these VMs sit on the same host. Assuming that at some point you move these VMs from one host to another, the security policy will follow the VMs without any reconfiguration in the network. Or you can implement policy that does not allow VMs to talk to each other even though they sit on the same L2 segment.


If you combine this with the security policy capabilities in the NSX Edge Gateway you can easily implement firewall rules for both north-south and east-west traffic. The system will also allow you to build a "container" by grouping certain VMs together and applying policies to the group as a whole. For example you could create 2 applications, each consisting of 1 VM from each tier (web, app, db), and set policies at a granular level. As a service provider you can very easily create a system that supports multiple tenants in a secure fashion. Furthermore, this would also allow you to set policies and move VMs from on-premises to vCloud Air while retaining network and security configurations.
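The per-vNIC policy described above can be sketched as a simple rule table (a toy evaluator in Python; the tier names, ports, and rule format are illustrative, not NSX syntax):

```python
# Toy micro-segmentation: rules keyed on (source tier, destination tier).
# Policy follows the VM's group membership, not its host or subnet.

rules = [
    ("any", "web", "tcp/443", "allow"),   # clients may reach the web tier
    ("web", "app", "tcp/8080", "allow"),  # web tier may call the app tier
    ("app", "db",  "tcp/3306", "allow"),  # app tier may query the database
]  # everything else is denied by default

def allowed(src_tier, dst_tier, service):
    for src, dst, svc, action in rules:
        if src in (src_tier, "any") and dst == dst_tier and svc == service:
            return action == "allow"
    return False  # default deny

print(allowed("web", "app", "tcp/8080"))  # True: permitted east-west flow
print(allowed("web", "db", "tcp/3306"))   # False: web may not reach the DB
print(allowed("web", "web", "tcp/22"))    # False: even within the same tier
```

Because the rules reference tiers rather than IPs or hosts, the same table keeps working when a VM moves between hosts, which is the essence of policy following the workload.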
