AI open-source software and tools

As organizations increase their use of and spending on artificial intelligence (AI) and machine learning (ML), a projected $110.7 billion by 2024, they face challenges in data management, deployment complexity, and data availability. Many frameworks and toolkits in the industry attempt to make AI workloads more scalable and easier to deploy, but most fail to address the crucial challenges of data management and data availability. Many also feature proprietary data platforms that lack proven, enterprise-class reliability.


The NetApp® AI Control Plane and Data Science Toolkit address these challenges. They simplify data management, streamline AI workflows, and help you get the most out of your data.

AI Control Plane and Data Science Toolkit

The AI Control Plane is a full-stack solution for managing AI data and experimentation; it integrates Kubernetes and Kubeflow with a data fabric enabled by NetApp. Kubernetes, the industry-standard container orchestration platform for cloud-native deployments, makes workloads more scalable and portable. Kubeflow is an open-source ML platform that simplifies management and deployment. And when your data fabric is powered by NetApp, you get uncompromising data availability and portability so that your data is accessible across the pipeline, from edge to core to cloud.


The NetApp Data Science Toolkit is a Python library that makes it easy for data scientists and data engineers to perform numerous data management tasks. These tasks include provisioning a new data volume, cloning a data volume almost instantaneously, and creating a NetApp Snapshot™ copy of a data volume for traceability and baselining. Without such tooling, preserving data for traceability can add hours to AI operations, hours that the data scientist spends waiting instead of experimenting. The Data Science Toolkit reduces those hours to seconds.
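The three tasks named above (provision, snapshot, clone) can be illustrated with a small in-memory toy. This is only a sketch of the workflow, not the Data Science Toolkit itself: the post does not show the toolkit's function names, so `ToyVolumeStore`, `provision`, `snapshot`, and `clone` are hypothetical, and sharing contents by reference is at best a loose analogy for how a storage system might avoid a full copy.

```python
"""Toy in-memory model of a provision / snapshot / clone workflow.

All names here are hypothetical and are NOT the Data Science Toolkit API.
"""
from dataclasses import dataclass, field


@dataclass
class Volume:
    name: str
    size_gb: int
    # Maps a file path to its contents; stands in for stored data.
    files: dict = field(default_factory=dict)
    # Maps a snapshot name to a frozen view of `files` at that moment.
    snapshots: dict = field(default_factory=dict)


class ToyVolumeStore:
    def __init__(self):
        self.volumes = {}

    def provision(self, name, size_gb):
        """Create a new, empty volume."""
        if name in self.volumes:
            raise ValueError(f"volume {name!r} already exists")
        self.volumes[name] = Volume(name, size_gb)
        return self.volumes[name]

    def snapshot(self, volume_name, snapshot_name):
        """Record a point-in-time baseline of a volume."""
        vol = self.volumes[volume_name]
        # A shallow copy duplicates the path table only; contents are shared.
        vol.snapshots[snapshot_name] = dict(vol.files)

    def clone(self, source_name, new_name, snapshot_name):
        """Create a writable volume that starts from a snapshot."""
        src = self.volumes[source_name]
        new_vol = self.provision(new_name, src.size_gb)
        new_vol.files = dict(src.snapshots[snapshot_name])
        return new_vol


# Usage: baseline a dataset, then experiment on a clone.
store = ToyVolumeStore()
train = store.provision("train_data", size_gb=100)
train.files["images/0001.bin"] = b"raw-v1"
store.snapshot("train_data", "baseline")

experiment = store.clone("train_data", "experiment_1", "baseline")
experiment.files["images/0001.bin"] = b"augmented"  # source stays untouched
```

In this toy, changes made in the clone leave both the source volume and its baseline snapshot unchanged, which is the traceability property the paragraph describes.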


The Data Science Toolkit also enhances the NetApp AI Control Plane by making it much easier to manage data. For example, a data scientist working on a Jupyter Notebook that was provisioned using the AI Control Plane can use the toolkit to implement a data management task in one simple line of Python code. The toolkit can also integrate advanced NetApp data management capabilities into other MLOps platforms—including custom and homegrown platforms—or serve as a standalone solution for teams that don’t need the overhead of a full-blown MLOps platform.


Watch these short videos to see how you can provision a new data volume in minutes and almost instantaneously create an exact copy of a data volume—all using the Data Science Toolkit.

Provision a new data volume

Near-instantaneously clone a data volume


The AI Control Plane and Data Science Toolkit are compatible with NetApp Cloud Volumes ONTAP® software, so teams can use on-demand cloud compute resources in AWS, Microsoft Azure, or Google Cloud. To learn more, visit our NetApp AI page.

Mike McNamara

Mike McNamara is a senior leader of product and solution marketing at NetApp with 25 years of data management and data storage marketing experience. Before joining NetApp over 10 years ago, Mike worked at Adaptec, EMC, and HP. Mike was a key team leader driving the launch of the industry’s first cloud-connected AI/ML solution (NetApp), unified scale-out and hybrid cloud storage system and software (NetApp), iSCSI and SAS storage system and software (Adaptec), and Fibre Channel storage system (EMC CLARiiON). In addition to his past role as marketing chairperson for the Fibre Channel Industry Association, he is a member of the Ethernet Technology Summit Conference Advisory Board, a member of the Ethernet Alliance, a regular contributor to industry journals, and a frequent speaker at events. Mike also published a book through FriesenPress titled "Scale-Out Storage - The Next Frontier in Enterprise Data Management" and was listed as a top 50 B2B product marketer to watch by Kapos.
