In our common quest to democratize AI, we have partnered with H2O.ai and integrated NetApp Cloud Volumes Service, our cloud-native file storage service, with H2O Driverless AI. The integration was first announced in a joint session at H2O World in San Francisco on Feb 5, 2019. The sections below summarize the two products and the key value propositions of the integrated solution.

Driverless AI

H2O Driverless AI is a machine learning (ML) platform that empowers data teams to scale and deliver trusted, production-ready models. It automates time-consuming data science tasks, including advanced feature engineering, model selection, and model deployment. Model deployment is streamlined with automatic scoring pipelines that include everything needed to run the model in production.

Benefits:

  • Automatically selects data plots based on the most relevant data statistics to help users understand their data before building models.
  • Employs a library of algorithms and feature transformations to automatically engineer new, high-value features for a given dataset.
  • Provides robust interpretability of ML techniques and results, including Python scoring.
  • Supports enterprise data access with connectors to on-premises and cloud data sources including AWS, GCP, Azure, and Snowflake.

Cloud Volumes Service

Cloud Volumes Service is a fully managed, cloud-native file storage service based on proven NetApp ONTAP® data management software. Cloud Volumes Service combines NetApp’s vast file services expertise with the simplicity and flexibility of the biggest clouds (AWS, Azure, GCP). The service supports NFSv3 for Linux/UNIX clients, with NFSv4 support coming soon, and SMB for Windows clients operating in the cloud.
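As a rough illustration of what NFSv3 access looks like from a Linux client, the sketch below assembles a standard `mount -t nfs` command. The server address, export path, and mount point are placeholders invented for this example, and the options shown (`rw,hard,vers=3`) are generic Linux NFS mount options rather than values prescribed by Cloud Volumes Service; the mount instructions shown for your own volume take precedence.

```python
def build_nfs_mount_command(server, export_path, mount_point, nfs_version="3"):
    """Return the argument list for a Linux `mount` call over NFS.

    All arguments are supplied by the caller; nothing here is specific
    to Cloud Volumes Service.
    """
    # Generic NFS client options: read-write, hard mount, explicit protocol version.
    options = f"rw,hard,vers={nfs_version}"
    return [
        "mount", "-t", "nfs",
        "-o", options,
        f"{server}:{export_path}",
        mount_point,
    ]


# Placeholder values for illustration only.
command = build_nfs_mount_command("10.0.0.5", "/example-volume", "/mnt/cloud_volume")
print(" ".join(command))
```

The function only builds the command. Running it requires root privileges and an existing mount point directory, for example through `subprocess.run(command, check=True)`.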

Benefits:

  • Delivered as a simple native cloud service with an easy-to-use interface and a single payment model through the cloud provider.
  • Scales from zero to 100 TB deployments in a matter of seconds.
  • Delivers high storage performance with three service levels that can be changed dynamically.
  • Supports features to migrate, replicate, and synchronize data across on-premises and the cloud. Also supports NetApp Snapshot™ copies for data protection and data restores, and FlexClone™ for data cloning.

Driverless AI + Cloud Volumes Service

Driverless AI integrates seamlessly with Cloud Volumes, giving customers an easy and convenient way to build enterprise-grade models at scale on the top three public clouds.

Once configured and mounted, a Cloud Volume appears as one of the data sources in the Driverless AI user interface, providing convenient access to data through a shared data management model. It facilitates collaboration across multiple instances of Driverless AI that mount the same Cloud Volume and enables data scientists to share experiments, artifacts, and model updates. If a Driverless AI instance goes down, all the data and experiments remain saved and snapshotted.
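Because a mounted Cloud Volume looks like an ordinary directory to the host, any instance that mounts it can see the same files. The sketch below is only an illustration of that idea, not part of Driverless AI: it checks that a path is a mount point and lists the dataset files under it. The mount point `/mnt/cloud_volume`, the function name, and the file suffixes are assumptions made for this example.

```python
import os
from pathlib import Path

# File types treated as datasets in this sketch (an assumption, not a product limit).
DATA_SUFFIXES = {".csv", ".parquet"}


def list_datasets(root):
    """Return sorted relative paths of dataset files found under `root`."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in DATA_SUFFIXES
    )


if __name__ == "__main__":
    mount_point = "/mnt/cloud_volume"  # hypothetical mount point
    if os.path.ismount(mount_point):
        for name in list_datasets(mount_point):
            print(name)
    else:
        print(f"{mount_point} is not mounted on this host")
```

Running the same script on two hosts that mount the same volume should print the same listing, which is the shared view that the collaboration described above relies on.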

If a job is taking too long to complete, the data scientist can mount the Cloud Volume on a new, more powerful compute node provisioned with GPUs and restart from a checkpoint. Alternatively, the user can downgrade to a smaller compute instance to save on OpEx if needed.
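The restart-from-checkpoint pattern can be sketched generically as follows. This is not the Driverless AI checkpoint mechanism, whose internals are not described here; it is a minimal illustration, with hypothetical function names and a toy workload, of how a job that writes its progress to a shared volume can be resumed by a different compute node that mounts the same volume.

```python
import json
import os
import tempfile
from pathlib import Path


def save_checkpoint(path, state):
    """Write `state` as JSON using write-then-rename, so a reader is
    unlikely to see a half-written checkpoint file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_name, str(path))


def load_checkpoint(path, default):
    """Return the saved state, or `default` if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default


def run_job(checkpoint_path, total_steps, stop_after=None):
    """Toy job: sums the step numbers 0..total_steps-1, checkpointing each step.

    `stop_after` simulates an interruption after that many steps in this run.
    """
    state = load_checkpoint(checkpoint_path, {"next_step": 0, "running_total": 0})
    done_this_run = 0
    while state["next_step"] < total_steps:
        if stop_after is not None and done_this_run >= stop_after:
            break
        state["running_total"] += state["next_step"]
        state["next_step"] += 1
        save_checkpoint(checkpoint_path, state)
        done_this_run += 1
    return state
```

In use, the first node would call something like `run_job("/mnt/cloud_volume/job.ckpt", 10)` (the path is a placeholder), and a replacement node that mounts the same volume would make the same call and continue from the last saved step rather than from zero.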

Benefits:

  • Easy and convenient to provision and mount Cloud Volumes on Driverless AI.
  • Deploy and collaborate on multiple instances of Driverless AI with Cloud Volumes.
  • Allows checkpointing, enabling users to restart jobs on alternate compute instances without losing progress.
  • Supports changing storage service levels as a way to enhance performance or reduce OpEx.
  • Instantaneous snapshots and cloning features allow data scientists to experiment with data sets without risking data loss.
  • Saves on API charges: there is no fee per read or write, unlike AWS S3.

Next Steps:

  • Integration of Driverless AI with NetApp ONTAP AI for training workloads on-premises and in hybrid environments.
  • Integration with NetApp Kubernetes Service (NKS) to launch Driverless AI instances using Kubernetes.

In conclusion, NetApp provides a common storage layer for Driverless AI spanning on-premises, hybrid, and cloud environments. The integrated solution provides data science teams a convenient platform to collaborate, scale, and deploy AI solutions faster without compromising on progress.

To learn more about NetApp’s AI offerings, see the following resources:

  • Demo of H2O Driverless AI integration with Cloud Volumes Service on YouTube
  • AI training on AWS with P3 (GPU) instances and Cloud Volumes Service – TR-4718
  • Edge to Core to Cloud Architecture for AI – WP-7271
  • ONTAP AI reference architecture – NVA-1121
  • For more, visit netapp.com/ai

Sundar Ranganathan

Sundar Ranganathan is a Senior Product Manager for NetApp ONTAP based in Sunnyvale, CA. He currently focuses on introducing features and solutions for the ONTAP software suite to target AI/DL and low-latency workloads. His experience includes product management and product development roles at Micron Technology, Qualcomm Inc., and Honeywell. He holds an MBA from the Marshall School of Business at the University of Southern California (USC), an MS degree in Electrical Engineering from USC, and a BS degree in Electronics Engineering from India.