Category: tfd

Atlas distributed filesystem, think outside the box.

Rubrik recently presented at Tech Field Day 12, and one of the sessions focused on our distributed filesystem, Atlas. As one of the SEs at Rubrik I'm in the field every day, (proudly) representing my company but also competing with other, more traditional backup and recovery vendors. It is increasingly apparent that these traditional vendors are also going down the appliance route to sell their solutions, and as a result I sometimes get pushback from potential customers who say they can also get an appliance-based offer from their current supplier, or who don't immediately grasp why this model can be beneficial to them.
A couple of things to clarify first. When I say "also going down the appliance route," I need to make clear that for us the appliance is purely a way to bring the solution to market; there is nothing special about the appliance as such. All of the intelligence in Rubrik's case lies in the software, and we recently even started offering a software-only version, in the form of a virtual appliance, for ROBO use cases.
Secondly, some traditional vendors can indeed deliver their solution in an appliance-based model, be it their own branded box or one pre-packaged through a partnership with a traditional hardware vendor. There is nothing inherently bad about this; simplifying the acquisition of a backup solution via an appliance-based model is great. But that is where the comparison stops: it is still a legacy architecture with disparate software components. These components, think media server, database server, search server, storage node, and so on, need individual love and care, daily babysitting if you will, to keep them going.
Lastly, from a component point of view our appliance consists of multiple independent (masterless) nodes that are each capable of running all tasks of the data management solution. In other words, there is no need to protect, or indeed worry about, individual software and hardware components, as everything runs distributed and can sustain multiple failures while remaining operational.

There is no spoon (box)

So the difference lies in the software architecture, not in the packaging. As such, we need to look beyond the box itself and dive into why starting from a clustered, distributed system as a base makes much more sense in today's information era.

The session at TFD12 was presented by Adam Gee and Roland Miller. Adam is the lead of Rubrik's distributed filesystem, Atlas, which shares some architectural principles with a previous filesystem Adam worked on while at Google, called Colossus. Most items you store while using Google services end up on Colossus, which is itself the successor to the Google File System (GFS), bringing the concept of a masterless cluster to it and making it much more scalable. Not a lot is available on the web in terms of technical detail around Colossus, but you can read a high-level article on Wired about it here.

Atlas, which sits at the core of the Rubrik architecture, is a distributed filesystem built from scratch with the Rubrik data management application in mind. It takes all the local storage (DAS) resources available on all nodes in the cluster and pools them together into a global namespace. As nodes are added to the cluster, the global namespace grows automatically, increasing the capacity of the cluster. The local storage resources on each node consist of both SSDs and HDDs; the metadata of Atlas (and of the data management application) lives in the metadata store (Callisto), which also runs distributed across all nodes on the SSD layer. The nodes communicate internally using RPCs, which are presented to Atlas by the cluster management component (Forge) in a topology-aware manner, giving Atlas the ability to provide data locality and to spread data correctly throughout the cluster for redundancy. For example, assuming we are using a triple mirror, we need to store data on 3 different nodes in the appliance; if the cluster then grows beyond 1 appliance, it makes more sense from a failure-domain point of view to move one copy of the data from one of the local nodes to the other appliance.
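To make that failure-domain reasoning concrete, here is a minimal sketch (in Python, and emphatically not Rubrik's actual code) of how a triple-mirror placement policy might prefer spreading replicas across appliances once more than one brik is present. The node/appliance model and function names are illustrative assumptions only.

```python
import itertools
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    node_id: str
    appliance_id: str  # the brik this node lives in (its failure domain)

def place_replicas(nodes, replication_factor=3):
    """Pick nodes for a triple mirror, preferring distinct appliances.

    Illustrative only: a real placement engine would also weigh capacity,
    health, and load, not just the failure-domain topology.
    """
    # Group candidate nodes by appliance (failure domain).
    by_appliance = {}
    for node in nodes:
        by_appliance.setdefault(node.appliance_id, []).append(node)

    # Walk round-robin across appliances so copies land in different briks first.
    chosen = []
    for group in itertools.zip_longest(*by_appliance.values()):
        for node in group:
            if node is not None and len(chosen) < replication_factor:
                chosen.append(node)
    if len(chosen) < replication_factor:
        raise RuntimeError("not enough nodes to satisfy the replication factor")
    return chosen

# Example: a two-brik cluster; the three copies end up spanning both appliances.
cluster = [Node(f"n{i}", "brik-1") for i in range(4)] + \
          [Node(f"n{i + 4}", "brik-2") for i in range(4)]
print([(n.node_id, n.appliance_id) for n in place_replicas(cluster)])
```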


The system is self-healing in the sense that Forge publishes disk and node health status and Atlas reacts to it. Again assuming a triple mirror, if a node or an entire brik (appliance) fails, Atlas creates a new copy of the data on another node to make sure the requested failure tolerance is met. Additionally, Atlas runs a background task that checks the CRC of each chunk of data to ensure that what you have written to Rubrik is available at time of recovery. See the article "How To Kill A Supercomputer: Dirty Power, Cosmic Rays, and Bad Solder" for why that matters.
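As a rough illustration of what such a background integrity task does, the sketch below recomputes a checksum for each stored chunk and flags any whose CRC no longer matches the value recorded at write time. The storage layout and the repair hook are hypothetical; only the idea of a scrub is taken from the talk.

```python
import zlib

def scrub_chunks(chunks, recorded_crcs, repair):
    """Background scrub sketch: recompute each chunk's CRC32 and compare it
    with the checksum recorded when the chunk was written. Any mismatch
    (bit rot, bad solder, cosmic rays...) triggers a repair from a healthy
    replica. Purely illustrative of the concept, not Atlas internals."""
    for chunk_id, data in chunks.items():
        if zlib.crc32(data) != recorded_crcs[chunk_id]:
            repair(chunk_id)  # e.g. re-replicate this chunk from a good copy

# Example usage with one artificially corrupted chunk.
chunks = {"c1": b"backup data", "c2": b"more data"}
crcs = {cid: zlib.crc32(data) for cid, data in chunks.items()}
chunks["c2"] = b"mote data"  # simulate silent corruption
scrub_chunks(chunks, crcs, repair=lambda cid: print(f"repairing {cid}"))
```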


The Atlas filesystem was designed with the data management application in mind: essentially, the application takes backups and places them on Atlas, building snapshot chains (Rubrik performs an initial full backup and incrementals forever after that). The benefit is that we can instantly materialize any point-in-time snapshot without the need to re-hydrate data.


In the example above you have the first full backup at t0, and then 4 incremental backups after that. Let's assume you want to instantly recover data at point t3: this is simply a metadata operation, resolving pointers to the last time each block making up t3 was mutated; there is no data movement involved.
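A simplified way to picture this metadata-only materialization: each incremental records only the blocks it changed, and resolving a point in time just walks the chain per block. The structures below are illustrative, not Atlas's on-disk format.

```python
def materialize(chain, point):
    """Resolve the full block map for snapshot index `point`.

    `chain` is a list of {block_number: data_pointer} dicts, one per
    snapshot: chain[0] is the initial full (t0), later entries hold only
    the blocks that changed in that incremental. Walking the chain from
    t0 up to the requested point yields, for every block, a pointer to
    the last time it was mutated -- no data is copied or re-hydrated.
    """
    view = {}
    for snapshot in chain[: point + 1]:
        view.update(snapshot)
    return view

# t0 full backup of 3 blocks, then incrementals each touching a single block.
chain = [
    {0: "t0/b0", 1: "t0/b1", 2: "t0/b2"},  # t0: full
    {1: "t1/b1"},                           # t1: block 1 changed
    {2: "t2/b2"},                           # t2: block 2 changed
    {0: "t3/b0"},                           # t3: block 0 changed
]
print(materialize(chain, 3))  # {0: 't3/b0', 1: 't1/b1', 2: 't2/b2'}
```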

Taking it a step further, let's now assume you want to use t3 as a basis and start writing new data to it from that point on. Any new data that you write (redirect-on-write) goes to a log file; no content from the original snapshot is changed, as it needs to remain an immutable copy (compliance). The use case here could be copy data management, where you want to present a copy of a dataset to internal dev/test teams.
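A rough sketch of that redirect-on-write idea, building on the previous example: reads consult a per-clone log of new writes first and fall through to the immutable snapshot view, so the original data is never touched. The class and names are illustrative, not Rubrik's implementation.

```python
class WritableClone:
    """Minimal redirect-on-write sketch over an immutable snapshot view.

    Writes land only in the clone's log; reads prefer the log and fall
    back to the frozen snapshot, which is never modified."""

    def __init__(self, snapshot_view):
        self._snapshot = dict(snapshot_view)  # immutable base (e.g. the t3 view)
        self._log = {}                        # new writes are redirected here

    def write(self, block, data_pointer):
        self._log[block] = data_pointer       # redirect-on-write

    def read(self, block):
        return self._log.get(block, self._snapshot.get(block))

clone = WritableClone({0: "t3/b0", 1: "t1/b1", 2: "t2/b2"})
clone.write(1, "log/b1")                  # a dev/test team mutates block 1
print(clone.read(1), clone.read(2))       # log/b1 t2/b2
```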


Atlas also dynamically prioritizes performance for certain operations in the cluster; for example, a backup ingest job gets a higher priority than a background maintenance task. Because each node also has a local SSD, Atlas can place critical files on a higher-performance tier and is also capable of tiering all data, placing hot blocks on SSD. It also understands that each node has 3 HDDs and will use these to stripe the data of a file across all 3 on a single node, taking advantage of the aggregate disk bandwidth and improving performance on large sequential reads and writes by utilizing read-ahead and write buffering respectively.
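To illustrate the striping point, the sketch below splits a file's chunks round-robin across the three HDDs in a node so a large sequential read can fan out over all spindles. The disk count comes from the talk; the code itself is purely illustrative.

```python
def stripe_chunks(chunks, disks=("hdd0", "hdd1", "hdd2")):
    """Assign consecutive chunks of a file round-robin across a node's
    three HDDs so large sequential I/O can use the aggregate bandwidth
    of all spindles (illustrative sketch, not Atlas's layout logic)."""
    layout = {disk: [] for disk in disks}
    for index, chunk in enumerate(chunks):
        layout[disks[index % len(disks)]].append(chunk)
    return layout

# A file cut into 7 chunks lands on all three disks of the node.
print(stripe_chunks([f"chunk{i}" for i in range(7)]))
```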

For more information on Atlas you can find the blogpost Adam Gee wrote on it here, or watch the TFD12 recording here.

Networking Field Day 10 – Nuage Networks

Introduction

Networking Field Day 10 was held from August 19 to 21, 2015, in Silicon Valley. NFD is part of the Tech Field Day series of events organized by Stephen Foskett and team, and aims to bring together independent bloggers and IT product vendors to share information and opinions in a presentation format. In this setting, demos and whiteboard sessions are usually appreciated more than slide-ware and marketing pitches. Assuming you are an independent blogger/influencer (i.e. you don't work for a vendor), you can ask to become a TFD delegate and, when selected, get to experience these sessions first hand.

One of the vendors at NFD 10 was Nuage Networks. Nuage has appeared at TFD events on numerous occasions (10 times by my count), and that's how I first heard about and got interested in their solutions.

So what did they show at NFD 10?

First up was Sunil Khandekar, Founder and CEO, with an introduction to Nuage Networks and an update on the products. He talked about building software-defined, programmable, automated data centers and how Nuage, through its declarative policy-based automation, is applicable to all types of workloads, be it bare metal, virtual machines, or containers. He further mentioned how Nuage is a complete SDN solution, seeing that it hits on all the key tenets: abstraction, automation, control, and visibility. He then went on to give an update on Nuage Networks.


Nuage was started in January 2012 with the idea that networking should become as instantaneous and consumable as compute has historically become. The solution, VSP, was launched about a year later, in April 2013, delivering on this thesis. Currently Nuage is on its 4th release and has seen great customer traction.

The Virtual Services Platform (VSP) has 3 main components: the Virtual Services Directory (VSD), the Virtual Services Controller (VSC), and the Virtual Routing & Switching engine (VRS). Additionally, Nuage optionally provides a hardware VTEP gateway, the 7850 Virtualized Services Gateway (VSG), to connect legacy networking to VXLAN overlays.


To get a more detailed overview of the solution please see my Nuage Compendium page*.

Sunil also announced the VSP SDK (VSPK), available on https://github.com/nuagenetworks, with the idea of fostering open collaboration around the platform. Through this GitHub collaboration, custom scripts for network automation, control, or visibility can be developed and shared by the customer community.
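As a flavour of the kind of script this enables, here is a minimal sketch using the VSPK Python bindings; the imported version module, the credentials, and the VSD URL are assumptions (check the nuagenetworks GitHub repo for the bindings matching your VSD release).

```python
# Minimal VSPK sketch: authenticate against a VSD and list the enterprises
# (tenants) visible to this user. Version module and credentials are
# placeholders, not values from the presentation.
from vspk import v5_0 as vspk

session = vspk.NUVSDSession(
    username="csproot",
    password="csproot",
    enterprise="csp",
    api_url="https://vsd.example.com:8443",
)
session.start()

for enterprise in session.user.enterprises.get():
    print(enterprise.name)
```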

It is also important to understand that Nuage takes the concept of network automation beyond the data center and extends it to the WAN (branch office) with Virtual Network Services (VNS).

Virtual Network Services (VNS) basics

Next up was Rotem Salomonovitch, head of product management for VNS, talking about how it works in more detail. The idea of VNS is to make setting up a new branch stupidly easy and independent of the underlying (carrier) technology, essentially SD-WAN with a lot of automation. VNS is available as a hardware appliance, as a VM, and as a software-only package to install on a bare metal server.


VNS uses the same control plane components as the data center (VSD/VSC) but has a different data plane, i.e. forwarding entity, called the Network Services Gateway (NSG). The reason is that data planes in branches typically differ from those in data centers, where you have GigE, 10GigE, 40GigE and beyond; in the branch you have different interface types and concerns like encryption across the WAN, additional security requirements, etc. The NSG is meant to be a platform: its initial applications are networking services (routing, QoS, firewalling, ...), but the goal is to go beyond network connectivity and enable application flexibility (see the section below).
The appliance has multiple WAN ports and also USB connectivity, so you could, for example, extend it with external LTE connectivity.

Rotem then moved on to talk about different abstractions for different types of audiences: a developer just wants his application to be "connected," whereas the network design team is concerned with connection points, bandwidth consumption, and so on. To support this he talked about both vertical abstractions, for multi-tenancy, and horizontal abstractions, for different audience types within each tenant. The idea is to expose only those abstractions that a particular audience is interested in (until we break all IT silos and everyone is a unicorn, of course).


The idea of abstractions is not to have each team work in its own silo, but rather to have the system interpret a given audience's set of abstractions and translate it to complete the end-to-end policy setup. For example, the application developer uses the Application Designer to create a new app and only defines the application concepts (e.g. front-end, middle-tier, database); the system then translates those to subnets, ACLs, etc.
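As a thought experiment (not Nuage's actual translation logic), the sketch below shows how an application-level description of tiers could be expanded into the kind of network constructs the system provisions behind the scenes. All names and the allow-adjacent-tiers rule are hypothetical.

```python
def expand_app_design(app_name, tiers):
    """Hypothetical translation of an application designer's view
    (front-end, middle-tier, database) into network-level constructs:
    one subnet per tier and ACL entries allowing only adjacent tiers
    to talk. Illustrative only; the real system derives far richer policy."""
    subnets = {tier: f"{app_name}-{tier}-subnet" for tier in tiers}
    acls = [
        {"from": subnets[a], "to": subnets[b], "action": "allow"}
        for a, b in zip(tiers, tiers[1:])  # only adjacent tiers may communicate
    ]
    return subnets, acls

subnets, acls = expand_app_design("webshop", ["front-end", "middle-tier", "database"])
print(subnets)
print(acls)
```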

Another example is the VPN Designer, where you can bring up a new site by linking the device object in the GUI to a location. Then, depending on the authentication method, the only thing that needs to happen is physically connecting the WAN and LAN ports on the device at the branch; the device will "bootstrap" automatically and pick up its configuration. The forwarding engine in the device is multi-tenant capable: each subnet is created as an L2 EVPN construct (remember that the idea of VNS is to be independent of the WAN technology), and the access ports of the box are piped into those.

Application Flexibility at the Branch

The VNS does not only provide network connectivity, it also allows you to run containerized workloads on top of it (it's an Intel Atom-based system). The main idea is to automate the attachment of existing container-based applications to the branch network. One example would be to run external network operations tools that perform local logging of LAN elements and then set up a single encrypted connection over the WAN, or do local auditing of running configurations. Another would be to run a user simulation at the branch before go-live to validate the user experience and adjust as needed.


Theoretically you can run any containerized app on VNS (pull it from Docker Hub), provided you have enough resources to run it; Nuage takes care of the multi-tenant network aspects of running multiple containers on a single host.
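For what "pull it from Docker Hub and run it" looks like in practice, here is a minimal sketch using the Docker SDK for Python; the image and command are placeholders for whatever ops tool you would actually run at the branch, and the Nuage multi-tenant network wiring is omitted entirely.

```python
# Minimal sketch, assuming the "docker" Python package is installed and a
# Docker engine is reachable from this host; image and command are placeholders.
import docker

client = docker.from_env()

# Pull a small image from Docker Hub and run it once, capturing its output.
output = client.containers.run("alpine:latest", ["echo", "hello from the branch"])
print(output.decode())
```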

Boundary-less Wide Area Networking

Next up was Hussein Khazaal, solution director, talking about extending connectivity from the data center to the branch and to the public cloud, making a VPC your own personal branch office with consistent business policies across all of them.


The way it works is that you get (in this case) a virtual NSG from Nuage, which comes as an Amazon AMI (Amazon Machine Image), and deploy it like any other instance into your VPC. Now you can use the NSG-v in Amazon just like any other NSG from your Application Designer / VPN Designer. The classic use case would be to load-balance your front-end (web) application between your data center and Amazon in times of increased load, essentially cloud bursting made real. If you wanted to do something like this without Nuage, you would need to figure out how to translate your business policies to the constructs available at each public cloud provider; the NSG-v, by contrast, consumes the centralized policies from the VSD just like any other forwarding engine.

*The Nuage Compendium page is under construction, and will be expanded over time.