My employer, Scalar Decisions, was lucky enough to acquire one of the first NetApp HCI units in Canada for internal testing and training, customer demos, etc. Here’s how the setup process went for me, with some thoughts on why I think NetApp has finally got hyper-converged right.
NetApp HCI is built on a foundation of Element OS, the operating system that also powers NetApp’s all-flash storage offering, SolidFire. Element OS runs on a number of hardware platforms and delivers predictable workload performance across an entire cluster through per-volume quality-of-service controls. Storage nodes make up one half of the solution; the other half is compute nodes, regular x86 servers pre-deployed with a hypervisor (VMware-only at the time of this writing). The minimum configuration is four storage nodes providing 50,000 IOPS each and two compute nodes to satisfy HA requirements.
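That predictable performance boils down to per-volume QoS settings (minimum, maximum, and burst IOPS). As a rough sketch of what that looks like at the API level, here’s roughly how a ModifyVolume request carrying QoS settings is shaped for the Element JSON-RPC API; the volume ID and IOPS figures below are invented for illustration.

```python
import json

def modify_volume_qos_payload(volume_id, min_iops, max_iops, burst_iops):
    """Build an Element API ModifyVolume request body that pins a volume's QoS.

    The method and parameter names follow the Element JSON-RPC convention;
    the values passed in are illustrative, not recommendations.
    """
    return {
        "method": "ModifyVolume",
        "params": {
            "volumeID": volume_id,
            "qos": {
                "minIOPS": min_iops,      # guaranteed performance floor
                "maxIOPS": max_iops,      # sustained ceiling
                "burstIOPS": burst_iops,  # short-term burst ceiling
            },
        },
        "id": 1,
    }

payload = modify_volume_qos_payload(42, 500, 15000, 20000)
print(json.dumps(payload, indent=2))
```

In practice this body would be POSTed to the cluster’s JSON-RPC endpoint with admin credentials; the point here is simply that the performance guarantees are a first-class, per-volume knob rather than a cluster-wide average.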
The configuration that sits in my company’s lab consists of four compute nodes and four storage nodes, all from the 300 series. For complete node specs, go here. Each node consumes one quarter of a 2RU chassis built on Supermicro’s BigTwin, one of the more common OEM’d chassis in the industry. Each storage node has a requisite 1/10GbE RJ-45 connection for management and dual 10/25GbE SFP28 connections for iSCSI traffic. The compute nodes have exactly this, plus a pair of 10/25GbE SFP28 connections for all your VMware traffic. There is also an optional second management connection for redundancy, should you choose to cable it, as well as a 1Gbps out-of-band management connection for everything you server admins out there would typically do with such a port. Since my nodes shipped with v1.0 of the NetApp HCI image and I wanted v1.1, I used this connection to mount the upgrade ISO on my laptop as a virtual optical drive; you can also upgrade via USB key if you wish.
Packaging and shipping
My eight nodes came in a total of eight boxes: two contained a chassis each, one with a storage node and the other with a compute node already installed, while the remaining six boxes held the rest of the nodes in the configuration. Any box that contained a storage node also had its six SSDs, and the chassis boxes included one of the easiest rail kits I’ve ever used.
Racking and cabling
Thanks to the fantastic rail kits included, racking was quick and painless. Since I had a single dedicated 10GbE switch racked between my two chassis and only 2m Twinax cables, my cabling situation wasn’t ideal. This is a lab, however, and the re-wiring is already being planned, as we have since taken possession of two new 10GbE switches, so we can conform to best practices here as well. It should be noted that the only cables NetApp includes are power cables, so be sure to have cabling sorted out, along with port requirements, prior to installation day.
First boot and firmware upgrades
There are two different images available: one for compute nodes and one for storage nodes. Under the hood, the storage node image consists of Element OS and the NetApp Deployment Engine (NDE). The compute node ISO contains everything required to install ESXi 6.0 or 6.5 and vCenter, as well as NDE to make it all happen without much user intervention.
I’ve only found the time to run one complete flash/install cycle, but it was a fairly easy exercise, complicated somewhat by my incessant need to tick the “Advanced” checkbox. If a person were truly embracing agility and eschewing server hugging, they’d let NDE and DHCP make most of the decisions for them.
Most of the knowledge required for a successful NetApp HCI deployment isn’t storage knowledge at all, but rather networking and vSphere. Of paramount importance is understanding the network requirements around LACP (or not) prior to starting. I won’t go into details here, as that sort of thing may change over time and I wouldn’t want this content to become an outdated technical reference. Let’s just say I got it wrong the first time around by trying to over-complicate it…my bad.
To get started on the configuration, point your browser at the IP address of a management port on a storage node, and the browser-based NDE configuration utility takes it from there. The interface is slick. The error checking is a little aggressive, but it can be toggled off in the top right-hand corner.
You’ll click your way through the prerequisites and EULAs for all the software in use. Next, it’s time to configure vCenter: enter the credentials you’d like to use and confirm your hardware inventory. Here’s where you’ll know whether the switch configuration was done properly. Once that’s all done, you’ll be at the review screen; if the button labelled “Looks Good, Let’s Go” is blue, click it and sit back. About 30-45 minutes later you should be presented with this screen:
It’s so easy that it’s actually a little anticlimactic. Feel free to click the Launch vSphere Client button and have a gander at your shiny new vCenter. You’ll find you have six datastores: a local VMFS6 datastore on each compute host, plus two 2TB VMFS6 iSCSI datastores presented from the storage nodes. Note that I chose vSphere 6.5 at deployment, so VMFS6 datastores are supported; these would be VMFS5 had I chosen 6.0.
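If you’d rather verify that layout programmatically than eyeball it, a small sanity check like the one below does the trick. The inventory here is hard-coded for illustration; in a real environment you’d pull datastore names and types from vCenter (via pyVmomi or PowerCLI, for instance), and the datastore names are made up.

```python
# Expected post-deployment layout: one local VMFS6 datastore per compute
# host, plus two shared iSCSI datastores backed by the storage cluster.
# Tuples are (name, filesystem, backing) and purely illustrative.
inventory = [
    ("local-esx01", "VMFS-6", "local"),
    ("local-esx02", "VMFS-6", "local"),
    ("local-esx03", "VMFS-6", "local"),
    ("local-esx04", "VMFS-6", "local"),
    ("NetApp-HCI-Datastore-01", "VMFS-6", "iscsi"),
    ("NetApp-HCI-Datastore-02", "VMFS-6", "iscsi"),
]

local = [d for d in inventory if d[2] == "local"]
shared = [d for d in inventory if d[2] == "iscsi"]
assert len(local) == 4, "expected one local datastore per compute node"
assert len(shared) == 2, "expected two shared iSCSI datastores"
print(f"{len(inventory)} datastores: {len(local)} local, {len(shared)} shared")
# → 6 datastores: 4 local, 2 shared
```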
At this point, the only thing that differs from a non-NetApp HCI vCenter is the presence of the NetApp SolidFire Configuration and Management plugins on the home screen. The Management plugin is where you manage your datastores. New datastores created through the Create Datastore button are VMFS5 by default, and VVols are disabled by default, but creating VMFS6 datastores or enabling VVols is pretty simple should you choose to venture in. The next step is to get vSphere properly licensed, as everything is deployed with trial licenses.
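For the scripting-inclined, those datastores ultimately sit on Element volumes, and volumes can also be created through the same JSON-RPC API the plugins use. Here’s a minimal sketch of a CreateVolume request; the management IP, API version, account ID, and volume name are placeholders, not values from my lab.

```python
import json

MVIP = "mvip.example.com"                  # cluster management virtual IP (placeholder)
ENDPOINT = f"https://{MVIP}/json-rpc/9.0"  # API version may differ on your cluster

def create_volume_payload(name, account_id, size_gib):
    """Build an Element API CreateVolume request for a 512e-emulated volume."""
    return {
        "method": "CreateVolume",
        "params": {
            "name": name,
            "accountID": account_id,
            "totalSize": size_gib * 1024**3,  # Element expects bytes
            "enable512e": True,               # 512-byte sector emulation for ESXi
        },
        "id": 1,
    }

vol_payload = create_volume_payload("lab-ds-03", 1, 2048)
print(ENDPOINT)
print(json.dumps(vol_payload, indent=2))
```

The body would be POSTed to the endpoint with cluster admin credentials; after that you’d still rescan and format the volume from vSphere, which is exactly the busywork the Management plugin wraps up for you.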
Finally, HCI that scales—properly
Until NetApp HCI hit the street, storage for HCI had traditionally lived either inside a VM, with the hypervisor on bare metal, or combined with the hypervisor, also on bare metal. NetApp’s differentiator is the ability to scale compute or storage completely independently of each other while still maintaining the simple installation ethos and a consolidated management plane. This ability to completely isolate compute from storage for scaling or performance considerations is refreshing and should spark some competitive engineering effort to provide something similar.