When NetApp announced ONTAP AI in August, I talked about the critical importance of accelerating AI infrastructure deployments and removing bottlenecks from AI processes. Data and compute bottlenecks can idle expensive infrastructure, increase costs, cause data scientists to waste valuable time, and put AI outcomes at risk.

One of the great strengths of the partnership between NetApp and NVIDIA is that we are both laser-focused on eliminating AI bottlenecks and advancing the realm of the possible at a rapid pace. NetApp’s attention to the data pipeline amplifies NVIDIA’s efforts to accelerate compute. By combining technologies from both companies, ONTAP AI accelerates all facets of AI from edge to core to cloud, to deliver better outcomes more quickly.

During the keynote at GTC Europe 2018, NVIDIA CEO Jensen Huang introduced RAPIDS, NVIDIA’s latest contribution to the advancement of data science and AI. RAPIDS is open-source software that incorporates highly parallel GPU technology across the data science pipeline. It provides GPU acceleration for data preparation as well as machine learning (ML) and deep learning (DL) models, accelerating the full training pipeline. You can learn more about RAPIDS in the NVIDIA press release or Jensen’s keynote.

NetApp Eliminates Data Pipeline Bottlenecks

When a key step in an AI process is accelerated, as NVIDIA has done with RAPIDS, the result is an even greater appetite for data, which creates the potential for bottlenecks in the data pipeline. Real-world data pipelines can be significantly complex, as illustrated below.

Training data can come from a huge number of sources: IoT data from the edge, big data and other applications running in the public cloud, HPC data, time-series data, images, and numerous other unstructured data sources. NetApp all-flash storage systems and NetApp Data Fabric provide an unparalleled ability to manage data from all these sources and more.

NetApp distinguishes itself from the pack in two key ways:

  • The only complete data pipeline with seamless management of all your data
  • Industry-leading all-flash storage for extreme performance

Data Fabric for a Complete Data Pipeline

NetApp is the only AI storage vendor that delivers a complete data pipeline for big data, machine learning, and deep learning—a pipeline that incorporates the widest variety of data sources on premises, at the edge, and in the cloud. With NetApp Data Fabric, data flows quickly and easily between locations and is managed with simple, repeatable, and automatable processes.

One way in which RAPIDS achieves acceleration is through the elimination of data copies between CPU and GPU. NetApp’s single, unified data lake with in-place analytics takes the same approach, eliminating data copies that would otherwise lead to bottlenecks. NetApp customers have been benefiting from in-place data access to accelerate all types of analytics for more than four years.

NetApp Is the All-Flash Leader

Recently, I was pleased to report that NetApp attained the #1 spot for all-flash arrays (AFAs), according to the latest results of IDC’s Worldwide Quarterly Enterprise Storage Systems Tracker for the second quarter of 2018 (CY 2Q18). Even bigger news is the trend, the momentum, and the validation we’ve seen from NetApp customers. NetApp has been working hard to innovate continuously and to deliver meaningful solutions that meet all of our customers’ data requirements. As a direct result, our all-flash business has grown faster than the market as a whole.

For the unique demands of AI workloads, the AFF A800 storage included in ONTAP AI is the clear performance leader. The AFF A800 delivers up to 25GBps of throughput and 1M IOPS at 500µs latency. A full scale-out cluster with 12 A800 controller pairs delivers up to 300GBps, which is 4x to 6x the I/O performance of the nearest competitor.
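
As a back-of-the-envelope check on those figures, here is a minimal sketch. It assumes the 25GBps number applies to each A800 controller pair and that throughput scales linearly as pairs are added. Both are simplifying assumptions, and the constant and function names are illustrative only, not part of any NetApp tool.

```python
# Figures quoted in the text above; treated here as "up to" upper bounds.
PER_PAIR_GBPS = 25       # assumed throughput of one AFF A800 controller pair
PAIRS_IN_CLUSTER = 12    # controller pairs in a full scale-out cluster


def aggregate_throughput_gbps(pairs: int, per_pair_gbps: float = PER_PAIR_GBPS) -> float:
    """Return the upper-bound cluster throughput in GBps.

    Assumes ideal linear scaling across controller pairs (an assumption
    made for illustration; real workloads may scale differently).
    """
    if pairs < 0:
        raise ValueError("pairs must be non-negative")
    return pairs * per_pair_gbps


if __name__ == "__main__":
    print(aggregate_throughput_gbps(PAIRS_IN_CLUSTER))  # 12 * 25 = 300
```

Under those assumptions, 12 pairs at 25GBps each gives the 300GBps cluster figure quoted above.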

With AI workloads, the storage needs for a single namespace can extend into the petabyte range. NetApp FlexGroup delivers scale-out NFS that blends near-infinite capacity with predictable, low-latency performance. With FlexGroup, load is automatically balanced across multiple controllers to give the most demanding workloads optimal performance. I/O is parallelized across storage controllers in much the same way that NVIDIA GPUs parallelize compute.
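
To make the idea of parallelized I/O concrete, here is a small client-side sketch of reading many files concurrently. It is an illustration only and does not show how FlexGroup works internally: the helper names, the shard files, and the worker count are all hypothetical, and it uses Python's standard thread pool against local temporary files rather than an NFS mount.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file(path: str) -> int:
    """Read one file fully and return the number of bytes read."""
    with open(path, "rb") as f:
        return len(f.read())


def parallel_read(paths, workers: int = 4) -> int:
    """Read all files concurrently and return the total bytes read."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(read_file, paths))


def make_demo_files(directory: str, count: int = 8, size: int = 1024):
    """Create `count` hypothetical shard files of `size` bytes each."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"shard_{i}.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(size))
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as demo_dir:
        shard_paths = make_demo_files(demo_dir)
        print(parallel_read(shard_paths))  # total bytes across all shards
```

In a real training pipeline the same pattern would point at files in a shared namespace, so that many readers can be served at once instead of queuing behind a single stream.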

More Information and Resources

NetApp is working to eliminate bottlenecks and accelerate every step of the AI process from the initial idea to results—results that may yield better business decisions, better outcomes in healthcare, smarter consumer products, and fully autonomous vehicles and robots.

ONTAP AI and NetApp Data Fabric technologies and services can jumpstart your company on the path to AI success. Check out these resources to learn more about how NetApp can help.

Octavian Tanase

Octavian Tanase is the senior vice president of NetApp’s Data ONTAP operating system group. Previously, Octavian was the GM of the Cloud Appliance – AltaVault Group, which built the fabric that lets enterprises operating in hybrid cloud environments send their backups and archives to the cloud. Octavian started at NetApp as the vice president of the Data Protection engineering team, which shipped world-class data protection products such as SnapMirror, MetroCluster, and SnapVault.

Before joining NetApp in 2010, Octavian led the Java Platform engineering
group at Sun Microsystems/Oracle. He has also held various engineering
roles in several start-ups in Silicon Valley.