Here’s the next installment of our series about connecting on-premises infrastructure with public cloud (revisit parts 1 and 2 if you need), moving toward fully integrated hybrid multicloud. And, as we wrote earlier, full integration isn’t a reality today. It’s a place we’re aiming for, and at NetApp, we’re aiming from the perspective of data. Our view is that hybrid multicloud begins with solving the challenges of data—data protection, data locality, data inertia, and all the other obstacles we’ve faced for years. Our vision is a truly integrated world from edge to core to cloud. In this world, data and services move freely between platforms, clouds, and environments—maintaining visibility
But there’s another vision that already is a reality: a data fabric powered by NetApp® solutions. In this post, we’ll take you through this ever-evolving reality. Our approach to integrating storage across edge, core, and cloud gives organizations ways to tackle current problems and position themselves for the future.
Considerations for integrating your infrastructure
We’re shifting to a world where services are increasingly distributed. According to Gartner, by 2025, three-quarters of enterprise-generated data will be created and processed at the edge—outside a traditional centralized data center or cloud. That’s up from just 10% in 2018.
Many workloads function because parts of the workflow can be done at multiple locations on the edge, other parts in core data centers, and still others in public clouds. Examples of these workloads are smart cities, augmented reality, artificial intelligence, deep learning, and even predictive analytics. Today, these services are bumping into challenges. The underlying storage technologies haven’t necessarily been designed to function across many locations as a cohesive whole. They aren’t able to facilitate near-real-time insights and localized actions. To cope with the demands of distributed services, you have to be aware of at least four considerations:
- Data scale. Some of these services create dozens of terabytes a day. It’s not easy to manage this amount of data, and as datasets grow, storage has to grow: at any location, as needed, without impediment. And it has to grow without intolerable sunk costs or pricing models that force you to pay for capacity that you don’t use.
- Data management. As data locations and datasets expand, tedious data management tasks can multiply like never before. Administrative headaches need to be solved with automation.
- Data movement. These distributed solutions can’t exist without high throughput and low latency. They also require ways to mitigate data inertia and data gravity while supporting data locality, if needed. The solutions also demand easy connections across the entire integrated infrastructure. Different APIs, different data formats, different protocols impede data movement unless carefully managed.
- Data protection. Data has to be safe and secure at any site, all the time. Both data at rest and data in motion must be protected from damage, downtime, and disaster, and also from threats, malware, and breaches. Distributed solutions expose a larger attack surface than a workload that resides only in a data center.
To meet all these considerations, you need the right type of technology for the right place, at the right balance of cost and complexity. Putting enterprise storage arrays at the edge often isn’t appropriate—for example, when the form factor doesn’t fit or when arrays include unnecessary features and complexity. Software-defined storage on a distinctive appliance might be what you need.
On premises or cloud?
Most organizations begin conversations about integrated infrastructure with a simple question: “Should we run our applications on premises or in the cloud?” This question has a different answer for every organization, application, and user requirement. But it’s worth reviewing some common considerations of workload placement, including:
- Latency. Whether or not we like it, latency remains a challenge for some workloads, and keeping your workload on premises is a good way to control latency. If your workload is latency sensitive, local storage and switching are likely to offer operational advantages.
- Protection. It might be easier to back up and restore workloads on premises than in the cloud, especially if you need more than one cloud service for a workload or have to span multiple clouds. Your local storage might include integrated backup and recovery, capabilities that would require additional services in the cloud. Having said that, cloud might be the perfect site for archive.
- Performance. For maximum performance, it might be more economical to build your infrastructure locally. You’d need to keep storage-side networking capabilities and compute-side capabilities separate. However, if you’re looking for specialized performance capabilities like GPUs, there might be a cloud offering that would let you add performance nondisruptively.
- Leverage. If possible, use existing on-premises infrastructure instead of buying net new. Or reallocate already purchased cloud resources to a new initiative. But don’t forget other kinds of leverage: you can reuse your existing data protection tools, management tools, or even employee skill sets.
- Scale. It’s probably easier to achieve massive scale in the cloud, or across multiple clouds, assuming you can accept the costs.
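One way to make these trade-offs explicit is a simple weighted scorecard. The sketch below is only an illustration: the weights, the 0 to 10 scores, and the function names are hypothetical placeholders rather than NetApp guidance, and you would substitute values that reflect your own applications.

```python
# Illustrative scorecard for workload placement.
# All weights and scores are hypothetical; replace them with your own assessment.
CRITERIA = ("latency", "protection", "performance", "leverage", "scale")


def placement_score(weights, scores):
    """Return the weighted average of per-criterion scores (0-10) for one location."""
    total_weight = sum(weights[c] for c in CRITERIA)
    return sum(weights[c] * scores[c] for c in CRITERIA) / total_weight


def recommend(weights, candidates):
    """Return the name of the candidate location with the highest weighted score."""
    return max(candidates, key=lambda name: placement_score(weights, candidates[name]))


# Example: a latency-sensitive workload (higher weight means the criterion matters more).
weights = {"latency": 5, "protection": 3, "performance": 3, "leverage": 2, "scale": 1}
candidates = {
    "on_premises": {"latency": 9, "protection": 7, "performance": 8, "leverage": 8, "scale": 4},
    "cloud": {"latency": 5, "protection": 6, "performance": 7, "leverage": 4, "scale": 9},
}

for name, scores in candidates.items():
    print(f"{name}: {placement_score(weights, scores):.2f}")
print("Suggested placement:", recommend(weights, candidates))
```

With these made-up inputs the latency weight dominates, so the on-premises option scores higher. Raising the weight on scale would push the result toward cloud, which is the point of the exercise: the answer depends on what each workload values.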
Cloud environments: What are the differences?
Then there’s a question about which cloud. After all, different clouds offer different approaches and capabilities that factor into your choices for building an integrated infrastructure.
Different pricing models might affect your overall TCO and return on investment over time. Certain clouds do a better job of cost-effective support for long-term, stable workloads. Others offer more cost versatility for rapidly changing requirements.
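A quick break-even calculation can show which pricing model suits a given workload. The following sketch assumes two simplified, hypothetical models (a flat monthly commitment and a pure pay-per-hour rate) with made-up prices. Real cloud pricing has more dimensions, such as storage, egress, and discounts, that this ignores.

```python
# Break-even sketch for two simplified pricing models.
# Prices are hypothetical and chosen only to make the arithmetic easy to follow.
HOURS_PER_MONTH = 730  # approximate average number of hours in a month


def on_demand_cost(hours_used, hourly_rate):
    """Monthly cost when paying only for the hours actually used."""
    return hours_used * hourly_rate


def break_even_hours(committed_monthly_fee, hourly_rate):
    """Hours per month above which the flat commitment becomes cheaper."""
    return committed_monthly_fee / hourly_rate


def cheaper_model(hours_used, hourly_rate, committed_monthly_fee):
    """Return which model costs less for the given monthly usage."""
    if on_demand_cost(hours_used, hourly_rate) > committed_monthly_fee:
        return "committed"
    return "on_demand"


hourly_rate = 0.50            # hypothetical on-demand price per hour
committed_monthly_fee = 219.0  # hypothetical flat monthly commitment

print("Break-even hours:", break_even_hours(committed_monthly_fee, hourly_rate))
print("Always-on workload:", cheaper_model(HOURS_PER_MONTH, hourly_rate, committed_monthly_fee))
print("Bursty workload (200 h):", cheaper_model(200, hourly_rate, committed_monthly_fee))
```

Under these assumptions, a stable workload that runs all month favors the commitment, while a workload that runs only 200 hours favors pay-per-hour pricing.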
Different cloud providers offer different levels of innovation for emerging, leading-edge workloads like AI and data analytics. It’s worth evaluating these offerings to determine the best fit for your distinct requirements.
Performance variations are also an issue, because instances run on different hardware. We’ve seen as much as 15% variation in compute performance from one cloud to another. For demanding workloads, those differences matter.
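To see why such a gap matters, it can help to normalize price by measured performance. The sketch below is a minimal illustration with hypothetical prices and benchmark ratios: `cloud_b` is assumed to be 10% cheaper per hour but 15% slower on the same job. Neither name refers to a real provider.

```python
# Price-performance sketch: compare clouds by cost per unit of work, not list price.
# Prices and throughput ratios are hypothetical inputs from your own benchmarks.


def cost_per_unit_of_work(hourly_price, relative_throughput):
    """Hourly price divided by throughput relative to a baseline (1.0 = baseline)."""
    if relative_throughput <= 0:
        raise ValueError("relative_throughput must be positive")
    return hourly_price / relative_throughput


def best_value(offers):
    """Return the offer name with the lowest cost per unit of work."""
    return min(offers, key=lambda name: cost_per_unit_of_work(*offers[name]))


# (hourly price, throughput relative to cloud_a)
offers = {
    "cloud_a": (1.00, 1.00),
    "cloud_b": (0.90, 0.85),  # 10% cheaper, 15% slower
}

for name, (price, throughput) in offers.items():
    print(f"{name}: {cost_per_unit_of_work(price, throughput):.3f} per unit of work")
print("Best value:", best_value(offers))
```

In this example the cheaper instance ends up costing more per unit of work, because the performance gap outweighs the discount.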
Finally, it’s important to understand that some clouds have better support for a given software solution than others. For example, we just wrote about our support for SAP HANA on Google Cloud and Microsoft Azure, but we aren’t certified for SAP HANA on AWS—yet.
The final chapter
In our fourth and final installment of this series, we’ll look at how NetApp has created products to overcome the challenges of data architecture. Spoiler: It’s packed with tips and tricks (and resources you can send to your boss!). You can also learn more by downloading our reference architecture starter kit.