Erasure Coding – a primer

A surefire way to find yourself looking for another job in IT is to lose important data. Typically, when a user in any organisation stores data, he or she expects that data to be safe and always retrievable, even though, as we all know, component failures in storage systems are unavoidable. Data also keeps growing; a corollary to Parkinson’s law is that data expands to fill the space available for storage, just like clutter around your house.

Because of this constant growth there is a greater need both to protect data and to store it in a more space-efficient way. Look at large web-scale companies like Google, Facebook, and Amazon: they need to store and protect incredible amounts of data, yet they do not rely on traditional data protection schemes like RAID, simply because RAID is not a good match for the hard disk capacity increases of late.

Sure sure, but I’m not Google…

Fair point, but take a look at the way modern data architectures are built and applied even in the enterprise space. Most hyper-converged infrastructure players, for example, employ a storage replication scheme to protect the data that resides on their platforms; they simply cannot afford the long rebuild times associated with multi-terabyte hard disks in a RAID-based scheme. The same goes for most object storage vendors. As an example, let’s take a 1 TB disk: its typical sequential write speed sits around 115 MB/s, so 1,000,000 MB / 115 MB/s ≈ 8,700 seconds, which is nearly two and a half hours. If you are using 4 TB disks your rebuild time will be closer to ten hours. And this even ignores the RAID calculation that needs to happen simultaneously and the other I/O in the system that the storage controllers need to deal with.
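A quick back-of-the-envelope sketch of that calculation (Python, using the same assumed 115 MB/s sequential write speed; real rebuilds are slower because of parity math and competing I/O):

# Rough rebuild-time estimate: time to sequentially rewrite a whole disk
# at its sustained write speed (ignores RAID parity math and other I/O).

def rebuild_time_hours(capacity_tb: float, write_mb_per_s: float = 115.0) -> float:
    capacity_mb = capacity_tb * 1_000_000  # 1 TB ~= 1,000,000 MB (decimal units)
    return capacity_mb / write_mb_per_s / 3600

for size in (1, 4, 8):
    print(f"{size} TB disk: ~{rebuild_time_hours(size):.1f} hours")
# 1 TB disk: ~2.4 hours
# 4 TB disk: ~9.7 hours
# 8 TB disk: ~19.3 hours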

RAID 5 protection example.

Let’s say we have 3 HDDs in a RAID 5 configuration: data is spread over 2 drives and the 3rd one is used to store the parity information (strictly speaking RAID 5 rotates the parity across all drives, but the principle is the same). The parity is basically an exclusive or (XOR) function:

Let’s say I have 2 bits of data that I write to the system: disk 1 holds the first bit, disk 2 the second bit, and disk 3 holds the parity bit (the XOR of the two). Now I can lose any one of the disks and the system is able to reconstruct the missing bit, as demonstrated by the XOR truth table below:

[Image: XOR truth table]

Let’s say I write bit 1 and bit 0 to the system: 1 is stored on disk A and 0 is stored on disk B. If I lose disk A [1], I still have disk B [0] and the parity disk [1]. According to the table, B [0] XOR parity [1] = 1, so I can still reconstruct my data.
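The same worked example in a few lines of Python, just to make the XOR reconstruction explicit:

# Minimal sketch of the XOR parity idea: the parity bit is A XOR B, and any
# single missing bit can be rebuilt from the two that remain.

a, b = 1, 0
parity = a ^ b            # stored on the third disk

# Disk A is lost: reconstruct its bit from disk B and the parity disk.
rebuilt_a = b ^ parity
assert rebuilt_a == a

# Works the other way around too.
rebuilt_b = a ^ parity
assert rebuilt_b == b
print(rebuilt_a, rebuilt_b)  # 1 0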

But as we have established that rebuilding these large disks is not feasible, what the HCI builders do instead is replicate all data, typically 3 times, across their architecture to protect against multiple component failures. This is of course great from an availability point of view, but not so much from a usable capacity point of view.

Enter erasure coding.

So, from a high level, what happens with erasure coding is that when data is written to the system, instead of using RAID or simply replicating it multiple times to different parts of the environment, the system applies slightly more complex mathematical functions (including matrix and Galois-field arithmetic) compared to the simple XOR we saw in RAID (strictly speaking, RAID is also an implementation of erasure coding).

There are multiple ways to implement erasure coding, of which Reed-Solomon seems to be the most widely adopted one right now; Microsoft Azure and Facebook’s cold storage, for example, are said to have implemented it.

Since the calculation of the erasure code is more complex, the often-quoted drawback is that it is more CPU intensive than RAID. Luckily Intel is not only churning out more capable and efficient CPUs but is also contributing tools, like the Intelligent Storage Acceleration Library (Intel ISA-L), to make implementations more feasible.

As the video above mentions you roughly get 50% more capacity with erasure coding compared to a triple mirrored system.

Erasure Coding 4,2 example.

Erasure codes are typically quite flexible in the way you can implement them, meaning that you can specify (typically as the implementor, not the end-user, but in some cases both) the ratio of data blocks to parity blocks. This in turn determines the protection level and the drive/node requirement. For example, if you choose to implement a 4,2 scheme, each file is split into 4 data chunks and 2 parity chunks are calculated over those 4 chunks; a 4,2 setup therefore requires 6 drives/nodes and can survive the loss of any 2 of them.
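A small sketch of the capacity arithmetic behind that claim, comparing a 4,2 scheme with triple mirroring (the 1.5x and 3x overhead figures follow directly from the chunk counts above):

# Raw capacity needed per unit of usable data for a k,m erasure-coding scheme
# versus n-way replication.

def raw_per_usable(data_chunks: int, parity_chunks: int) -> float:
    return (data_chunks + parity_chunks) / data_chunks

ec_4_2 = raw_per_usable(4, 2)   # 6 chunks stored for every 4 chunks of data -> 1.5x
mirror_3x = 3.0                 # three full copies -> 3x

print(f"4,2 EC needs {ec_4_2:.1f}x raw, 3-way mirroring needs {mirror_3x:.1f}x raw")
print(f"Raw-capacity saving of EC vs triple mirror: {(1 - ec_4_2 / mirror_3x):.0%}")
# -> 50% less raw capacity for the same usable data, while still surviving
#    the loss of any two of the six drives/nodes.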

The logic behind it can seem quite complex; I have linked to a nice video explanation by Backblaze below.


Backup is Boring!

Yep, until it’s not.

When I was a consultant at a VAR a couple of years ago I implemented my fair share of backup and recovery solutions, products of different vendors which shall remain nameless, but one thing that always became clear was how excruciatingly painful the processes involved ended up being. Convoluted tape rotation schemas, figuring out backup windows in environments that were supposed to be operating 24/7, running out of capacity, missed pickups for offsite storage,… the experience consistently sucked.


I think it’s fair to say that there has not been a lot of innovation in this market for the last decade or so. Sure, vendors put out new versions of their solutions on a regular basis and some new players have entered the market, but the core concepts have largely remained unchanged. How many times do you sit around at lunch with your colleagues and discuss exciting new developments in the data protection space… exactly…

So when is the “until it’s not” moment then?

I’m obviously biased here, but I think this market is ripe for disruption. If we take some (or most) of the pain out of the data protection process and make it a straightforward affair, I believe we can bring real value to a lot of people.

Rubrik does this by providing a simple, converged data management platform that combines traditionally disparate backup software pieces (backup SW, backup agents, catalog management, backup proxies,…) and globally deduplicated storage in one easily deployable and scalable package.

No more jumping from interface to interface to configure and manage something that essentially should be an insurance policy for your business (i.e. the focus should be on recovery, not backup). No more pricing and sizing individual pieces based on guesstimates; rather, scale out (and in) if and when needed, with all options included in the base package.

Because it is optimized for the modern datacenter (i.e. virtualization, scale-out architectures, hybrid cloud environments, flash based optimizations,…) it is possible to consume data management as a service rather than through manual configuration. All interactions with the solution are available via REST APIs, and several other consumption options are already making good use of this via community driven initiatives like the PowerShell module and the VMware vRO plugin.
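Just to illustrate what API-driven consumption looks like, a minimal sketch using Python’s requests library is shown below; the host, endpoint paths and token are purely hypothetical placeholders, not the actual Rubrik API.

# Hypothetical sketch only: host, endpoint paths and token are placeholders,
# not the real API. It just illustrates "everything is available via REST".
import requests

BASE = "https://cluster.example.com/api/v1"        # placeholder address
HEADERS = {"Authorization": "Bearer <api-token>"}  # placeholder token

# e.g. list protected VMs and kick off an on-demand snapshot for one of them
vms = requests.get(f"{BASE}/vm", headers=HEADERS).json()
first_vm = vms["data"][0]["id"]
requests.post(f"{BASE}/vm/{first_vm}/snapshot", headers=HEADERS)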


So essentially it gives you the ability to say no to the “we have always done it this way” mantra; it is time to bring (drag?) backup and recovery into the modern age.


Fortinet integration with Nuage Networks SDN

Nuage Networks VSP, with the emphasis on P for Platform, provides many integration points for 3rd party network and security providers (among others); this way the customer can leverage the SDN platform and build end-to-end automated services in support of his/her application needs.

One of these integration partners is Fortinet, whereby we integrate with the FortiGate Virtual Appliances to provide NGFW (next-generation firewall) services and automated firewall management via FortiManager.

Integration example

In the example setup below we are using OpenStack as the Cloud Management System and KVM as the hypervisor on the OpenStack compute hosts.
We have the FortiGate Virtual Appliance connected to a management network (orange), an untrusted interface (red), and a trusted/internal interface (purple).
On the untrusted network we have a couple of virtual machines connected, and on the internal network


Dynamic group membership

Since we are working in a cloud environment where new workloads are spun up and down at a regular pace, resulting in a dynamic allocation of IP addresses, we need to make sure that the firewall policy stays intact. To do this we use dynamic group membership that adds and deletes the IP addresses of the virtual machines based on group membership on both platforms. The added benefit is that the security policy does not go stale: when workloads are decommissioned in the cloud environment, their address information is automatically removed from the security policies, resulting in a more secure and stable environment overall.
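Conceptually the synchronisation boils down to something like the sketch below; the two client objects and their methods are hypothetical, purely to illustrate the mechanism, not actual Nuage or Fortinet SDK calls.

# Conceptual sketch of dynamic address-group synchronisation; the "nuage" and
# "fortinet" clients and their methods are hypothetical, not real SDK calls.

def sync_policy_group(nuage, fortinet, group_name: str) -> None:
    """Make the FortiGate address group mirror the Nuage policy group."""
    desired = set(nuage.policy_group_members(group_name))    # IPs of current VMs
    current = set(fortinet.address_group_members(group_name))

    for ip in desired - current:       # new workloads spun up
        fortinet.add_address(group_name, ip)
    for ip in current - desired:       # workloads decommissioned
        fortinet.remove_address(group_name, ip)  # policy never goes stale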

Looking at the FortiGate below, we are using dynamic address groups to define source and destination policy rules. The membership of the address groups is synchronised between the Nuage VSP environment and Fortinet.

[Image: FortiGate policy rules using dynamic address groups]

If we look at group membership in Fortinet we can see the virtual machines that are currently a member of the group. As you can see in the image below, the group “PG1 – Address group 1 – Internal” currently has 1 virtual machine member.


If we now create a new virtual machine instance in OpenStack and make that instance a member of the corresponding Policy Group in Nuage Networks VSP it will be automatically synced back to Fortinet.

[Image: new OpenStack instance added to the policy group]

Looking at Nuage Networks VSP we can see that the new OpenStack virtual machine is a member of the Policy Group “PG1 – Address Group 1 – Internal”.


If we now go back to our FortiGate we can see the membership of the corresponding group has been updated.


Traffic Redirection

Now that we have dynamic address groups we can use them to create dynamic security policy. In order to selectively forward traffic from Nuage Networks VSP to the FortiGate we need to create a Forward Policy in VSP.

In the picture below we define a new redirection target pointing to the FortiGate Virtual Appliance; in this case we opted for L3 service insertion, but this could also be virtual-wire based.


Now we need to define which traffic to classify as “interesting” and forward to the FortiGate. Because Nuage Networks VSP has a built-in distributed stateful L4 firewall, we can create a security baseline that handles common east-west traffic locally and only forwards traffic that demands a higher level of inspection to the FortiGate virtual appliance.


In the picture above we can select the protocol; in this case I’m forwarding all traffic, but we could just as easily select, for example, TCP and define interesting traffic based on source and destination ports. We then need to select the origin and destination network; in this case we use the dynamic address groups that are synced with Fortinet, but this could also be based on more static network information. Finally we select the forward action and point to the Fortinet Virtual Appliance.
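In code form, such a forward-policy entry is conceptually just a match-and-action tuple, along the lines of the hypothetical sketch below (field values are made-up placeholders, not actual VSP objects):

# Hypothetical representation of the ACL redirect rule described above, only
# to make the match/action structure explicit; not real VSP objects.

REDIRECT_RULES = [
    {
        "protocol": "ANY",            # could be "TCP" plus port ranges instead
        "source": "Untrusted address group",
        "destination": "PG1 - Address group 1 - Internal",
        "redirect_target": "FortiGate-VM01",   # the L3 service-insertion target
    },
]

def handle(flow: dict) -> str:
    """Return where the flow should be processed."""
    for rule in REDIRECT_RULES:
        if (rule["protocol"] in ("ANY", flow["protocol"])
                and rule["source"] == flow["src_group"]
                and rule["destination"] == flow["dst_group"]):
            return f"forward to {rule['redirect_target']}"
    return "handled locally by the distributed L4 firewall"  # east-west baseline

print(handle({"protocol": "ICMP",
              "src_group": "Untrusted address group",
              "dst_group": "PG1 - Address group 1 - Internal"}))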

We have a couple of policies defined; as you could see in the picture at the top of the post, we are allowing ICMP traffic between the untrusted network and the internal network. In the picture below I’ve logged on to one of the untrusted clients and am pinging the internal server.

[Images: ping from the untrusted client to the internal server]

Since this traffic matches the ACL redirect rule we have configured in Nuage Networks VSP, we can see a flow redirection at the Open vSwitch level pointing to the FortiGate virtual appliance.

[Image: Open vSwitch flow redirection to the FortiGate appliance]

We can also see that the forward statistics counters in VSP are increasing, and furthermore that the traffic is logged for reference.

[Image: VSP forward statistics and logging]

If we look at FortiGate we can see that the traffic is indeed received and allowed by the virtual appliance.


The same quick test with SSH, which is blocked by our FortiGate security policy.

[Images: SSH attempt blocked by the FortiGate security policy]

So, as you can see, this is a very solid integration between the Nuage Networks datacenter SDN solution and Fortinet to secure dynamic cloud environments in an automated way.

Docker networking overview

There are of course a lot of blog posts out there already regarding Docker networking; I don’t want to replicate that work but instead want to provide a clear overview of what is possible with Docker networking today by showing some examples of the different options.

In general the networking piece of Docker, and arguably Docker itself, is still quite young, so things move fast and will likely change over time. A lot of progress has been made via the SocketPlane acquisition last year and its subsequent pluggable model, but more about that later.

Docker containers are ephemeral by design (pets vs cattle). This leads to several potential issues, not the least of which is not being able to keep your firewall configuration up to date because of difficult IP address management; it’s also hard to connect to services that might disappear at any moment, and no, using DNS as a stopgap is not a good solution (DNS as a SPOF, don’t go there). Of course there are several options and methods available to overcome these things.

Single host Docker networking

You basically have 4 options for single-host Docker networking: Bridge mode, Host mode, Container mode, and No networking.

Bridge mode (the default Docker networking mode)

The Docker daemon creates “docker0”, a virtual Ethernet bridge that forwards packets between all interfaces attached to it. All containers on the host are attached to this internal bridge, which assigns one interface as the container’s “eth0” interface and another interface in the host’s namespace (think VRF). The container gets a private IP address assignment, and to prevent ARP collisions on the local network, the Docker daemon generates a random MAC address from the allocated IP address. In the example below Docker assigns the private IP to the container.
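You can see the same behaviour from code with the Docker SDK for Python; a minimal sketch (the image name is just an example, not from the original post):

# Sketch: run a throwaway container on the default bridge network and print
# the IP address Docker assigned to its eth0 interface (Docker SDK for Python).
import docker

client = docker.from_env()
out = client.containers.run("alpine", "ip -4 addr show eth0", remove=True)
print(out.decode())   # shows the private IP handed out on the docker0 bridge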


Host mode

In this mode the container shares the networking namespace of the host, directly exposing it to the outside world. This means you do not need port mapping to reach services inside the container; in Bridge mode, by contrast, ports have to be published (Docker can automatically assign them) to make services reachable. In the example below the Docker host has the IP and as you can see the container shares this IP address.


Container mode

This mode forces Docker to reuse the networking namespace of another container. It is used if you want to provide custom networking from said container; this is, for example, what Kubernetes uses to provide networking for multiple containers in a pod. In the example below the container to which we are going to connect the subsequent containers has the IP and as you can see the container being launched has the same IP address.


No networking

This mode does not configure networking, which is useful for containers that don’t require network access, but it can also be used to set up custom networking.
This is the mode Nuage Networks leverages pre-Docker 1.9 (more info here).
In the example below you can see that our new container did not get any IP address assigned.
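The three remaining modes expressed with the Docker SDK for Python, again as a sketch (image and container names are just examples):

# Sketch of host, container and none networking modes via the Docker SDK.
import docker

client = docker.from_env()

# Host mode: the container shares the host's network namespace.
client.containers.run("alpine", "ip addr", network_mode="host", remove=True)

# Container mode: reuse the network namespace of an existing container "web".
web = client.containers.run("nginx", detach=True, name="web")
client.containers.run("alpine", "ip addr",
                      network_mode=f"container:{web.name}", remove=True)

# None: no networking configured at all (only a loopback interface).
client.containers.run("alpine", "ip addr", network_mode="none", remove=True)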


By default Docker has inter-container communication enabled (--icc=true), meaning that containers on a host are free to communicate without restrictions, which could be a security concern. Communication to the outside world is controlled via iptables and ip_forwarding.

Multi-host Docker networking

In a real-world scenario you will most likely end up using Docker containers across multiple hosts, depending on the needs of your containerized application. So now you need to build container networks across these hosts to have your distributed application communicate internally, and externally.

As alluded to above, in March 2015 Docker, Inc. acquired the SDN startup SocketPlane, which has given rise to Libnetwork and the Container Network Model, meant to be the default multi-host networking setup going forward.


Libnetwork provides a native Go implementation for connecting containers. The goal of libnetwork is to deliver a robust Container Network Model that provides a consistent programming interface and the required network abstractions for applications.

One of the benefits of Libnetwork is that it uses a driver/plugin model to support many underlying network technologies while still exposing a simple and consistent network model to the end-user (common API); Nuage Networks leverages this model by providing a remote plugin.

Libnetwork also introduces the Container Network Model (CNM) to provide interoperation between networks and containers.


The CNM defines a network sandbox, an endpoint, and a network. The Network Sandbox is an isolated environment where the networking configuration for a Docker container lives. The Endpoint is a network interface that can be used for communication over a specific network; an endpoint joins exactly one network, and multiple endpoints can exist within a single Network Sandbox. And the Network is a uniquely identifiable group of endpoints that are able to communicate with each other. You could create a “Frontend” and a “Backend” network and they would be completely isolated.
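Using the Docker SDK for Python, that “Frontend”/“Backend” example maps directly onto CNM networks and endpoints; a sketch (image and container names are examples only):

# Sketch: two CNM networks; containers on "frontend" cannot reach "backend".
import docker

client = docker.from_env()
frontend = client.networks.create("frontend", driver="bridge")
backend = client.networks.create("backend", driver="bridge")

web = client.containers.run("nginx", detach=True, name="web1", network="frontend")
db = client.containers.run("redis", detach=True, name="db1", network="backend")

# Joining a network creates an endpoint in the container's sandbox; a single
# sandbox can hold endpoints on several networks at once:
backend.connect(web)   # now web1 has one sandbox with two endpoints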

Intel and Micron 3D XPoint

My day job is in networking but I do consider myself (on the journey to become) a full-stack engineer and like to dabble in lots of different technologies like, I’m assuming, most of us geeks do. Intel and Micron have been working on what seems to be a breakthrough that combines memory and storage in one non-volatile device that is cheaper than DRAM (typically computer memory) and faster than NAND (typically an SSD drive).

3D XPoint

3D XPoint, as the name implies, is a crosspoint structure, meaning 2 wires crossing each other with “some material*” in between; it does not use transistors (like DRAM does), which makes it easier to stack (hence the 3D): for every 3 lines of metal you get 2 layers of this memory.

[Image: 3D XPoint crosspoint structure]

The columns contain a memory cell (the green section in the picture above) and a selector (the yellow section), connected by perpendicular wires (the greyish sections), allowing you to address each column individually by using one wire at the top and one wire at the bottom. These grids can be stacked three-dimensionally to maximise density.
The memory can be accessed/modified by sending varied voltages to each selector; in contrast, DRAM requires a transistor at each memory cell to access or modify it. This results in 3D XPoint being 10x more dense than DRAM and 1000x faster than NAND (at the array level, not at the individual device level).

3D XPoint can be connected via PCIe NVMe and has little wear effect over its lifetime compared to NAND. Intel will commercialise this in its Optane range, both as an SSD and as DIMMs. (The difference between Optane and 3D XPoint is that 3D XPoint refers to the type of memory, while Optane includes the memory plus a controller package.)

1000x faster, really?

In reality Intel is getting about 7x performance today compared to a NAND MLC SSD (on NVMe) at 4 KB reads, and that is because of the inefficiencies of the storage stack we have today.

[Image: latency contributions of the storage stack]

The I/O passes through the filesystem, storage stack, driver, bus/platform link (transfer and protocol, i.e. PCIe/NVMe), controller firmware, controller hardware (ASIC), the transfer from NAND to the buffers inside the SSD, etc. So 1000x is a theoretical number (and will show up on a lot of vendor marketing slides no doubt) but reality is a bit different.

So the focus is, and has been, on reducing latency; for example, the move to NVMe already reduced controller latency by roughly 20 microseconds (no HBA latency, and the command set is much simpler).

[Image: AHCI (SATA) versus NVMe latency]

The picture above shows the impact of the bus technology: on the left side you see AHCI (SATA) and on the right NVMe, and as you can see there is a significant latency difference between the two. NVMe also provides a lot more bandwidth than SATA (about 6x more on PCIe NVMe Gen3 and more than 10x on Gen4).

Another thing that is hindering the speed improvements of 3D XPoint is replication latency across nodes (it’s storage, so you typically want redundancy). To address this issue, work is underway on things like “NVMe over Fabrics” to develop a standard for low-overhead replication. Other improvements in the pipeline are optimisations of the storage stack, mostly at the OS and driver level. For example, because the paging algorithms of today were not designed with SSDs in mind, they try to optimise for seek-time reduction and other things that are irrelevant here, so reducing paging overhead is a possibility.

They are also exploring “partial synchronous completion”: 3D XPoint is so fast that doing an asynchronous return, i.e. setting up for an interrupt and then waiting for interrupt completion, takes more time than polling for data (we have to ignore queue depth here, i.e. assume that it will be 1).
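A toy model makes the trade-off clear; the numbers below are my own illustrative assumptions, not Intel’s figures.

# Toy model, purely illustrative numbers: if the device answers faster than the
# cost of arming an interrupt and taking the wake-up, busy-polling wins.

device_read_us = 10          # assumed media + controller latency for 3D XPoint
interrupt_overhead_us = 15   # assumed cost of interrupt setup + context switch

async_completion = device_read_us + interrupt_overhead_us   # ~25 us to completion
sync_polling = device_read_us                               # ~10 us to completion

print(f"interrupt-driven: ~{async_completion} us, polling: ~{sync_polling} us")
# With NAND latencies (~100 us) the interrupt overhead is noise; at ~10 us media
# latency it dominates, hence the interest in partial synchronous completion.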

[Image: synchronous polling versus asynchronous completion]

Persistent memory

One way to overcome this “it’s only 7x faster” problem altogether is to move to persistent memory. In other words, you skip the storage stack latency by using 3D XPoint as DIMMs, i.e. for your typical reads and writes there is no software involved; what little latency remains is now caused entirely by the memory and the controller itself.

[Image: 3D XPoint as persistent memory]

To enable this “storage class memory” you need to change/enable some things, like a new programming model, new libraries, new instructions, etc. So that’s a little further away, but it’s being worked on. What will probably ship this year is the SSD model (the 7x improvement), which is already pretty cool I think.

* It’s not really clear right now what those materials entail exactly, which is part of its allure I guess 😉


SDN with Underlay – Overlay visibility via VSAP

When using virtualization of any kind you introduce a layer of abstraction that people in operations need to deal with; i.e. if you run a couple of virtual machines on a physical server you need to be able to understand the impact that changes (or failures) in one layer have on the other. The same is true for network virtualization: we need to be able to understand the impact that, say, a failure of a physical router port has on the overlay virtual network.

To make this possible with Nuage Networks we can use our Virtualized Services Assurance Platform or VSAP for short. VSAP builds upon intellectual property from our parent company, Alcatel-Lucent (soon to be Nokia), using the 5620 Service Aware Manager as a base to combine knowledge of both the physical and virtual networks.

VSAP Key Components

VSAP communicates with Nuage Networks firstly via SNMP, connecting into the Virtualized Services Controller (VSC, the centralized SDN controller); this gives VSAP an understanding of the network topology in the Nuage SDN solution. Secondly it uses a CPAM (Control Plane Assurance Manager) module via the 7701 CPAA, which acts as a listening device in the network to collect all the IGP and BGP updates, allowing it to model the underlay network based on the IGP.

[Image: VSAP key components]

The solution has both a Web GUI and a Java GUI; pictured below is the SAM Java GUI, which in this example is displaying the physical network including the data center routers, the VSCs, and the Top-of-Rack switches (again, all learned via SNMP).

[Image: SAM Java GUI showing the physical network topology]

You can also look at the IGP (in this case OSPF) topology (picture below); again, this is built by the CPAM based on the routing information in the network. We can use this model to map the underlay (what you see in the picture below) to the overlay information from the Nuage Networks SDN solution.

[Image: OSPF topology view built by the CPAM]

You can also get a closer look at the services in the network, in the example below the VPLS (L2) and VPRN (L3) constructs. These services are created by the Nuage Networks SDN solution.

[Image: VPLS and VPRN service view]

We can also drill deeper into the Virtual Switch constructs provided by Nuage (for example to see which virtual ports and access ports are attached, or which faults are recorded at the vSwitch level,…).

[Image: Virtual Switch details]

You can also see how traffic is passed across the virtual service; you have the option to highlight it on the IGP topology view and see the virtual service superimposed on top of the physical underlay.

[Image: virtual service highlighted on the IGP topology]

On the picture below you can see the virtual overlay service path on top of the physical network.

[Image: overlay service path on top of the physical network]

As mentioned before, there is also the option to use a Web-based GUI if you don’t like a Java-based client (ahem 😉). In the example below we are looking at the VRSs (virtual switches) inside the L3 VPRN construct, so you can see where the virtual machines connect to the Open vSwitch based VRS on each hypervisor host.

[Image: Web GUI showing VRSs inside the L3 VPRN construct]

You also have an inventory view allowing you to search on all objects in the physical and virtual space.

[Image: inventory view]

You can also look at the inventory topology.

[Image: inventory topology]

Fault correlation

One of the powerful options you now have is that you can perform fault correlation between the physical and virtual constructs. In the example below a switch failure is introduced and you can see the impact on overlay and underlay, up to and including the virtual machine level.

Pictured below is the overview in the Web GUI showing the alarm that has been raised, the impact, and the probable cause.

[Image: Web GUI alarm showing impact and probable cause]

The same type of indication on the Java client is depicted below.

[Image: the same alarm in the Java client]

For additional information and a demo of how this works I recommend having a look at the recorded Tech Field Day Extra session from ONUG Spring 2015 here.

Policy Based Abstractions through SDN

As I’m sure you’re tired of hearing by now, IT is typically divided into multiple silos which don’t always see eye to eye. Sometimes people are afraid of needing to adjust perceived best practices in their own domain to better collaborate with the rest of the organization; in many cases though it’s simply a matter of not understanding each other because you are not speaking the same language.

The ideal scenario would be a world where each practice exposes its infrastructure, built on best practices, through APIs so other teams can interact with it in the most optimal way.

At Nuage Networks we provide API based access to our components making full scale automation a possibility but we can also bring together teams speaking different languages via our abstraction based policies.

Nuage Networks Application Designer 

Application Designer is built for use by people with an understanding of application constructs who don’t necessarily need to understand, or care about, the underlying networking constructs; these are automatically abstracted by the Nuage platform.

In this example we initially start off with a clean slate: no network constructs have been created beyond the L3 domain.

[Image: empty network design, only the L3 domain]

If we go to Application Designer we can see the application services that are available; these would typically be created by the network team and are an abstract representation of a network service. For example, below we are creating the application service https, providing TCP communication to port 443.

[Image: creating the https application service]

The application team can now use these application service abstractions to build out their application. In the example below we start by creating a 3-tier application called Banking App.

[Image: creating the Banking App application]

Next we can start to add our application tiers and interconnect them by using the application services abstractions that were previously created by the networking team. You do this by dragging and dropping items from the library onto the canvas.

[Image: application tiers on the canvas]

Once you have your application tiers mapped out you can use the application services to create flow security policy (what type of traffic is allowed between these 2 points) simply by drawing a line between the 2 tiers.

[Image: drawing a flow between two application tiers]

In this case we are indicating we want HTTPS to be allowed from the Internet to the front-end application tier.

Once you have your application mapped out and interconnected (you could also drag and drop other complete applications onto the canvas and specify connectivity between those as well) you can add workloads to the tiers; these will then automatically adhere to the policies you have applied.

[Image: workloads added to the application tiers]

Since the system translates these different abstractions into the correct networking constructs, we can look at the network design and verify that our application model has been completely mapped to a set of networking policies.

[Image: the resulting network design]

Furthermore, looking at the security policies we can see these have been translated as well, making it easy for different teams with different knowledge domains to focus on their area of expertise while at the same time tying everything together via our policy based abstractions.

[Image: the resulting security policies]

Unifying Docker Container and VM networking

Most environments are not homogeneous; typically you have multiple types of workloads, and I believe this will only increase in the near future with the rise of containers, PaaS, VMs, bare metal,… In this brief overview I wanted to demonstrate how you can connect virtual machines and containers on the same overlay network in an automated manner via our SDN solution. This way, every time you spin up a new workload it automatically gets its network and security policy applied and behaves like any other endpoint on the network.

[Image: VMs and containers on the same overlay network]

Docker networking

There are multiple options for networking in Docker; typically a container (running a specific service) can be exposed externally by mapping an internal port to an external one. When you install Docker, it creates three networks automatically (bridge, null, and host), and when you run a container you can use the --net flag to specify which network you want to run it on. By default the Docker daemon connects containers to the bridge network; if you run ifconfig on the host you can see the bridge as part of the host’s network stack.

The none network adds a container to a container-specific network stack; this is what we will use in the case of Nuage Networks to connect the Docker container to our VRS (Open vSwitch).

The host network adds a container to the host’s network stack. You’ll find the network configuration inside the container is identical to the host’s.

Nuage Networks and Docker containers

In the case of Nuage Networks we attach every container to a tenant (overlay) network which is provided by our centralised management (VSD) and control (VSC) planes and configured on the Docker host in our VRS (Open vSwitch). This allows us to use our centralised networking and security policies providing IP configuration, firewall rules, QoS, etc. If traffic leaves the Docker host it is encapsulated in VXLAN, so from a management point of view this is no different from how we work with virtual machines.


I’ve created an L2 network (called DockerSN below) in Nuage (synced to OpenStack) where I’m connecting both my container and VM workloads. The subnet has a range of

[Image: the DockerSN subnet in Nuage]

So when I spin up a new container on my Docker host and connect it to the Nuage VRS I’ll automatically get the policies from that construct applied.

[Image: spinning up a container connected to the Nuage VRS]

So as you can see above, my new container (gloomy_jang) has gotten the IP address. If we go back to our Virtualized Services Architect interface we can see 2 containers (one I created earlier) and a VM attached to the same subnet (they could also be on separate subnets, of course).

[Image: containers and a VM attached to the same subnet in VSA]

We can drill down on the newly created container and get all the network and security policy details.

[Image: network and security policy details for the container]

We now have connectivity between our container workloads and the VM:

[root@dockerhost ~]# docker exec 152f5660e56a ping -c 3
PING ( 56(84) bytes of data.
64 bytes from icmp_seq=1 ttl=64 time=0.650 ms
64 bytes from icmp_seq=2 ttl=64 time=0.450 ms
64 bytes from icmp_seq=3 ttl=64 time=0.450 ms

Nuage Networks Service Insertion demos

During the OpenStack Summit in Tokyo, Nuage Networks announced 5 new partnerships (Fortinet, vArmour, Citrix, Guardicore, CounterTack). To give a quick overview of what these new (and some existing) partnerships represent from a technical point of view, some demo videos were made available on YouTube.

Nuage Networks VSP and Citrix NetScaler VPX Demo In a Red Hat OpenStack environment

Nuage Networks VSP and F5 Big-IP Virtual Edition demo in a Red Hat OpenStack environment

Nuage Networks VSP and Palo Alto Networks Virtualized Firewall Demo in a Red Hat OpenStack environment

Nuage Networks VSP, and both Palo Alto Networks Virtualized Firewall and F5 Big-IP VE Demonstration

Nuage Networks VSP and Fortinet FortiGate Virtual Appliance Demo in a Red Hat OpenStack environment

Nuage Networks VSP and GuardiCore Data Center Security Suite Demo in a Red Hat OpenStack environment

Nuage Networks VSP and CounterTack Sentinel Demo in a Red Hat OpenStack environment

Networking Field Day 10 – Nuage Networks

Networking Field Day 10 was held from August 19 to 21, 2015, in Silicon Valley. NFD is part of the Tech Field Day series of events organized by Stephen Foskett and team, and aims to bring together independent bloggers and IT product vendors to share information and opinions in a presentation format. In this setting, demos and whiteboard sessions are usually appreciated more than slide-ware and marketing pitches. Assuming you are an independent blogger/influencer (i.e. you don’t work for a vendor), you can ask to become a TFD delegate and, when selected, get to experience these sessions first hand.

One of the vendors at NFD 10 was Nuage Networks; Nuage has appeared at various TFD events on numerous occasions (10 times by my count) and that’s how I first heard about and got interested in their solutions.

So what did they show at NFD 10?

First up was Sunil Khandekar, Founder and CEO, with an introduction to Nuage Networks and an update on the products. He talked about building software defined, programmable, automated data centers and how Nuage, through its declarative policy based automation, is applicable to all types of workloads, be it bare metal, virtual machines, or containers. Further he mentioned how Nuage is a complete SDN solution, seeing that it hits on all the key tenets, namely abstraction, automation, control, and visibility. Then he went on to give an update on Nuage Networks.

[Image: Nuage Networks update]

Nuage was started in January 2012 with the idea that networking should be as instantaneous and consumable as compute has historically become; the solution, VSP, was launched about a year later in April 2013, delivering on this thesis. Currently Nuage is on its 4th release and has seen great customer traction.

The Virtual Services Platform (VSP) has 3 main components, the Virtual Services Directory (VSD), the Virtual Services Controller (VSC), and the Virtual Routing & Switching engine (VRS). Additionally Nuage also optionally provides a hardware VTEP gateway, the 7850 Virtualized Services Gateway (VSG), to connect legacy networking to VXLAN overlays.

[Image: VSP components overview]

To get a more detailed overview of the solution please see my Nuage Compendium page*.

Sunil also announced the VSP SDK (VSPK), available on GitHub, with the idea of fostering open collaboration around the platform. Through this proposed GitHub collaboration, custom scripts for network automation, control, or visibility can be developed and shared by its customer community.

It is also important to understand that Nuage takes the concept of network automation beyond the data center and extends it to the WAN (branch office) with Virtual Network Services (VNS).

Virtual Network Services (VNS) basics

Next up was Rotem Salomonovitch, head of product management for VNS, talking about how it works in some more detail. The idea of VNS is to make setting up a new branch stupidly easy and independent of the underlying (carrier) technology, essentially SD-WAN with a lot of automation. VNS is available as a hardware box, as a VM, and as a software-only package to install on a bare metal server.

[Image: Virtual Network Services overview]

VNS uses the same control plane components as the data center (VSD/VSC) but has a different data plane, i.e. a forwarding entity called the Network Services Gateway (NSG). The reason is that data planes in branches typically differ from those in data centers, where you have GigE, 10GigE, 40GigE and beyond; in the branch you have different interface types and things like encryption in the WAN, additional security requirements, etc. The NSG is meant to be a platform: the initial applications are networking services (routing, QoS, FW,…) but the goal is to go beyond network connectivity and enable application flexibility (see the section below).
The appliance has multiple WAN ports and also USB connectivity, so you could extend it with external LTE connectivity, for example.

Rotem then moved on to talk about different abstractions for different types of audiences, e.g. a developer just wants his application to be “connected” whereas the network design team is concerned with connection points, bandwidth consumption, etc. To support this he talked about both vertical abstractions, for multi-tenancy, and horizontal abstractions for different audience types within each tenant. The idea is to expose only those abstractions that a particular audience is interested in (until we break all IT silos and everyone is a unicorn, of course).

[Image: vertical and horizontal abstractions]

The idea of abstractions is not to have each team work in their own silo, but rather to have the system interpret a certain audience’s set of abstractions and translate those to complete the end-to-end policy setup; e.g. the application developer uses the Application Designer to create a new app, defining only the application concepts (e.g. front-end, middle-tier, database), and the system translates those to subnets, ACLs, etc.

Another example is the VPN designer, where you can bring up a new site by linking the device object in the GUI to a location. Then, depending on the authentication method, the only thing that needs to happen is physically connecting the WAN and LAN ports on the device at the branch; the device will “bootstrap” automatically and pick up its configuration. The forwarding engine in the device is multi-tenant capable: each subnet is created as an L2 EVPN construct (remember that the idea of VNS is to be independent of the WAN technology) and the access ports of the box are piped into those.

Application Flexibility at the Branch

The VNS does not only provide network connectivity but also allows you to run containerized workloads on top of it (it’s an Intel Atom based system). The main idea is to automate the attachment of existing container based applications to the branch network. One example would be to run external network operations tools to perform local logging of LAN elements and then set up a single encrypted connection over the WAN, or to do local auditing of running configurations. Another would be to run a user simulation at the branch before go-live to validate the user experience and adjust as needed.

[Image: application flexibility at the branch]

Theoretically you can run any containerized app (pull it from Docker Hub) on VNS, provided that you have enough resources to run it; Nuage takes care of the multi-tenant network aspects of running multiple containers on a single host.

Boundary-less Wide Area Networking

Next up was Hussein Khazaal, solution director, talking about extending connectivity from the data center to the branch, to the public cloud, making a VPC your own personal branch office with consistent business policies between all of them.

[Image: boundary-less wide area networking]

The way it works is that you get (in this case) a virtual NSG from Nuage, which comes as an Amazon AMI (Amazon Machine Image), and deploy it like any other instance in your VPC. Now you can use the NSG-v in Amazon just like any other NSG from your application designer / VPN designer. The classic use case would be to load-balance your front-end (web) application between your data center and Amazon in times of increased load, essentially cloud-bursting made real. If you wanted to do something like this without Nuage you would need to figure out how to translate your business policies to the available constructs at each public cloud provider; with the NSG-v it consumes the centralized policies from the VSD just like any other forwarding engine.

*The Nuage Compendium page is under construction, and will be expanded over time.