First, a little history.
I’m old enough to remember when Dave Hitz got up on stage at NetApp Insight and introduced this new term: Data Fabric. It wasn’t a product and there were no deliverables; it was a philosophy that NetApp was going to live by as it developed its new and existing products.
All of us were like, “Cool, but…huh?”
He said that most new workloads were going to be cloud-based (but not all), and that while it’s really easy to deploy and destroy workload instances in the cloud, those workloads are useless unless they have relatively local access to the datasets required to achieve business outcomes.
I believe the year was 2014. Kubernetes had either just come out or was about to be released. The notion of a “service mesh” had not yet been realized. But any doubts about the cloud being production-ready had been clearly vanquished as AWS and Azure had already grown into behemoths, with each introducing new services seemingly every day.
For a few years after this announcement, it seemed that “Data Fabric” was going to be this overall term that fell into the category of “marketecture”—just a cool term with no real meaning or implementation.
This phase ended in 2018 when NetApp reorganized into three business units in order to realize the vision of Data Fabric. The creation of a cloud software unit, headed by Anthony Lye, put a sharp focus on using cloud and DevOps methodologies to augment the tried-and-true technologies that NetApp had perfected over 25 years. NetApp was transformed from a storage company to a data services company.
So what is the Data Fabric now?
NetApp has created a foundational delivery architecture for workloads and their data. This is unique, as everyone else in the industry focuses on one or the other. Customers can provision, manage, and run production, development, or test application instances in the place that makes the most sense at that time. This has a tremendous positive impact on data-driven application development and execution workflows, as organizations look to the cloud for their “use-as-you-need” compute farms. This was never more apparent than last week, when NetApp announced a series of updates across its portfolio. I won’t dive into them all here, but you should read Matt Watts’ recent blog for a full breakdown.
When you consider that, according to IDC, the amount of data stored globally will grow from ~40ZB in 2019 to 175ZB in 2025, with 49% of that data stored in a public cloud, it’s clear that two things are true: 1) there’s going to be a ton of data in the cloud, and 2) there’s going to be a ton of data still resident in data centers. Most of this new data will reside on NFS or S3-compatible object storage, which are most appropriate for multi-node compute farms to utilize. These datasets will consist of millions/billions (or more?) of files (or objects), with capacities already exceeding the petabyte range. Moving datasets of that sort around by scanning filesystems is simply not possible.
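To see why scan-based replication falls over, a quick back-of-envelope sketch helps. The numbers here are my own illustrative assumptions (roughly 1 ms of metadata latency per file over a network filesystem, serial traversal), not figures from NetApp or IDC:

```python
# Back-of-envelope sketch: time just to *walk* a dataset, touching each
# file's metadata once, before a single byte of payload is copied.
# Assumes ~1 ms per file round trip (hypothetical, for illustration).

def scan_time_days(file_count: int, ms_per_file: float = 1.0) -> float:
    """Days needed to stat every file once, serially."""
    return file_count * ms_per_file / 1000 / 86400

for files in (1_000_000, 1_000_000_000):
    print(f"{files:>13,} files -> {scan_time_days(files):,.2f} days per scan")
```

Even under these generous assumptions, a billion-file scan takes on the order of eleven days per pass, and that is before any data moves. Any approach whose cost scales with file count is a dead end at this size.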
At the core of the NetApp Data Fabric lies NetApp SnapMirror technology. SnapMirror allows you to efficiently move data from place to place in a way that makes the number of files irrelevant, without the need for third-party replication software or appliances that introduce high rates of failure and even higher skill requirements for administration.
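The general idea behind this kind of file-count-independent replication can be sketched in a few lines. To be clear, this toy is my illustration of block-level snapshot deltas in general, not NetApp’s actual SnapMirror implementation; the “snapshot” here is just a hypothetical map of block IDs to content hashes:

```python
# Toy sketch of snapshot-delta replication (illustrative only; not
# NetApp's SnapMirror internals). A snapshot is modeled as a mapping of
# block_id -> content hash. The transfer cost depends on how many
# *blocks* changed, no matter how many files those blocks belong to.

def block_delta(old_snap: dict[int, str], new_snap: dict[int, str]) -> dict[int, str]:
    """Blocks that are new or changed in new_snap relative to old_snap."""
    return {blk: h for blk, h in new_snap.items() if old_snap.get(blk) != h}

old = {0: "aaa", 1: "bbb", 2: "ccc"}
new = {0: "aaa", 1: "xyz", 2: "ccc", 3: "ddd"}  # one changed, one added
print(block_delta(old, new))  # only blocks 1 and 3 need to travel
```

Because the delta is computed at the block layer, a billion tiny files and one giant file that dirty the same number of blocks cost the same to replicate.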
NetApp redeveloped SnapMirror at the beginning of the Data Fabric movement, opening it up to other platforms such as S3 and the SolidFire platform in order to extend the Data Fabric to as many use cases as possible.
What is very exciting now is that the cloud software unit has produced genuinely useful, production-ready technology that piggybacks on NetApp’s Data Fabric achievements. One of these is the NetApp Kubernetes Service (NKS), which automates the best-practices deployment of K8s clusters with ready-to-consume apps wherever you’d like them: on-premises, or in the cloud of your choice. You can also tear a cluster down and recreate it in another location, and NKS will automate the movement of the data from the old place to the new one.
I’ve personally been engaged in projects where NetApp Cloud Volumes ONTAP has allowed my customers to achieve much faster analytics results using lots of ephemeral cloud compute, leveraging data that resides primarily on-premises, and employing the Data Fabric to get that data into the cloud. These customers remain at the top of the food chain, while those who cling to the traditional (read: slow and frustrating) 100% on-premises model of application delivery risk being disrupted.
If your organization is looking to achieve new or faster data-driven outcomes, it is imperative that you settle on a foundational architecture that not only gets and keeps that dynamic data in the places where you’ll be achieving those outcomes, but also brings your scaled applications to bear on that data to realize true acceleration. If you do your research, you’ll find that NetApp has led in this space from the outset, and is so far ahead in its capabilities that you’ll want to grab onto the NetApp Data Fabric, hold tight, and get ready for a wild ride.