By Dave Glatfelter, Senior Product Manager, NetApp with assistance from Mike McNamara, Sr. Manager, Product Marketing, NetApp 


Managing storage for enterprise applications is largely about managing performance: planning for new applications, growing or redeploying existing ones, or simply getting the full potential out of the systems you already have. Time spent on performance monitoring, management, and diagnostics is a major part of every IT department’s job, and storage is always in the thick of it.


However, storage performance is complex. With various applications, operating systems, drivers, networks, storage operating systems, and caching at every layer, managing the performance of your storage takes a great deal of knowledge about the system, and a lot of experience to accurately predict the effect of any specific change. This knowledge is often beyond the capabilities of your IT staff (and sometimes beyond the capabilities of the vendors themselves). Thus, performance diagnostics and tuning are something of an art.


One way of making this less art and more science is to make sure that systems are well instrumented. They must collect all the information needed for making informed decisions about performance and measuring the effect of changes. This is what “analytics” is all about: collecting the details of every aspect of our array’s performance, and presenting them so that they can be acted upon and results can be determined.


Storage systems collect and display basic performance information through their management consoles, APIs, or both. IOPS, block sizes, read/write mix, and latencies are pretty common. But these metrics tell only part of the story, and it’s usually up to administrators to try to tie these array-centric metrics back to their applications. It’s even harder to use these metrics to predict the results from configuration changes or tuning; there’s just not enough information!
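To make the basic metrics concrete, here is a minimal sketch of how metrics like these are typically derived from raw counter snapshots taken at two points in time. The counter names and numbers are invented for illustration; they do not reflect any particular array's schema.

```python
# Two hypothetical counter snapshots, 10 seconds apart. Field names are
# illustrative only, not from any specific product's API.
t0 = {"reads": 10_000, "writes": 2_500, "readTimeUs": 4_000_000,
      "writeTimeUs": 2_000_000, "seconds": 0}
t1 = {"reads": 16_000, "writes": 4_500, "readTimeUs": 7_000_000,
      "writeTimeUs": 3_500_000, "seconds": 10}

def derive(a, b):
    """Turn two cumulative-counter snapshots into interval metrics."""
    dr = b["reads"] - a["reads"]    # reads completed in the interval
    dw = b["writes"] - a["writes"]  # writes completed in the interval
    dt = b["seconds"] - a["seconds"]
    return {
        "iops": (dr + dw) / dt,
        "read_pct": 100 * dr / (dr + dw),
        "avg_read_lat_us": (b["readTimeUs"] - a["readTimeUs"]) / dr,
        "avg_write_lat_us": (b["writeTimeUs"] - a["writeTimeUs"]) / dw,
    }

print(derive(t0, t1))
# -> {'iops': 800.0, 'read_pct': 75.0,
#     'avg_read_lat_us': 500.0, 'avg_write_lat_us': 750.0}
```

The point of the sketch is what it leaves out: interval deltas like these say nothing about working set size, cache behavior, or which application generated the I/O, which is exactly the gap the workload metrics discussed below are meant to fill.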


For the last three releases, NetApp® SANtricity® OS, the operating system that runs on NetApp E-Series storage arrays, has been adding to the set of performance metrics that are tracked. We’ve always had the basic performance metrics, but we’ve added more “workload metrics” such as working set size, cache utilization, controller utilization, and device-specific utilization. SANtricity OS writes this data into short-duration logs that can then be accessed through API calls and offloaded to management GUIs, support bundles, and performance tools. Taken together, these workload metrics increase our ability to monitor, manage, diagnose, and characterize workloads on our array.
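As a rough illustration of how offloaded workload metrics might be consumed once retrieved through an API call, the sketch below parses a metrics payload and computes a per-volume cache hit rate. The JSON structure and field names here are hypothetical, chosen only to show the pattern; the actual SANtricity API schema will differ.

```python
import json

# Hypothetical metrics payload; structure and field names are illustrative
# only and are not the actual SANtricity OS API schema.
sample = json.loads("""
{
  "volumeStats": [
    {"name": "vol1", "readIOps": 1200, "writeIOps": 400,
     "cacheHits": 950, "cacheMisses": 250},
    {"name": "vol2", "readIOps": 300, "writeIOps": 900,
     "cacheHits": 100, "cacheMisses": 200}
  ]
}
""")

def cache_hit_rate(stats):
    """Fraction of cacheable requests served from cache."""
    total = stats["cacheHits"] + stats["cacheMisses"]
    return stats["cacheHits"] / total if total else 0.0

for vol in sample["volumeStats"]:
    print(vol["name"], round(cache_hit_rate(vol), 2))
```

A management GUI or support tool consuming the real logs would do essentially this kind of post-processing, just against the array's actual schema and over many more counters.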


One thing we can do with this new data is to automate performance tuning (so you don’t have to do it). With the recently released SANtricity 11.30, we are now using analytics to perform automated tuning with our SSD cache, and in a future release we will be implementing adaptive write caching and automated write consolidation for flash. Many areas of the system can be automatically tuned to adapt to ever-changing workloads by using analytics data. Analytics can also spot problems and autocorrect in many cases, so good analytics can speed up problem diagnostics and resolution.


Analytics data is also useful to our support teams. Workload analytics information can be captured and made available to support staff through the SANtricity Workload Analysis Tool (see figure 1 below). Additionally, some workload summary information is now included in the NetApp ASUP™ support bundles. Once trace information is captured for a specific workload, we can use other tools to “replay” the workload, providing a ready-made set of benchmarks covering a wide variety of applications. This capability is a huge boon to our benchmarking and solutions teams, because they can quickly test different attributes and configuration changes and see the effect immediately.
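To show the idea behind trace replay, here is a toy sketch: a captured sequence of I/O events is reissued, in order, against a target. The trace format is invented for the example; a real capture from the array would carry far more detail (timing, queue depth, alignment) and the replay target would be a device, not an in-memory buffer.

```python
import io

# Invented trace format for illustration: each entry is an I/O event.
trace = [
    ("write", 0,    b"hello"),   # (op, offset, payload)
    ("write", 4096, b"world"),
    ("read",  0,    5),          # (op, offset, length)
]

def replay(trace, target):
    """Reissue each captured I/O event against the target, in order.

    Returns the data returned by the read events, so a caller can
    verify the replayed workload behaved as expected.
    """
    results = []
    for op, offset, arg in trace:
        target.seek(offset)
        if op == "write":
            target.write(arg)
        else:
            results.append(target.read(arg))
    return results

buf = io.BytesIO(bytes(8192))  # stand-in for a real block device
print(replay(trace, buf))      # -> [b'hello']
```

The benchmarking value comes from repeatability: the same captured event stream can be replayed against different configurations, so any change in the measured results is attributable to the configuration rather than to workload drift.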


Figure 1 - SANtricity Workload Analysis Tool


Ultimately, analytics data will be used for a broader range of automation and tools for both NetApp Support and customers, as well as core performance tuning within the OS. Many of the performance optimizations added to SANtricity come from an understanding of application I/O characteristics.


Lastly, more analytics data will be made available to you through the console and array APIs. This allows us to provide performance analysis “wizards” that help you find the root causes of performance problems and give suggestions on what you can do to solve them. Workload analytics data also helps you focus on the right area; it’s a waste of time to focus on the array when the bottleneck is somewhere upstream.

Mike McNamara

Mike McNamara is a senior leader of product and solution marketing at NetApp with 25 years of data management and data storage marketing experience. Before joining NetApp over 10 years ago, Mike worked at Adaptec, EMC and HP. Mike was a key team leader driving the launch of the industry’s first cloud-connected AI/ML solution (NetApp), unified scale-out and hybrid cloud storage system and software (NetApp), iSCSI and SAS storage system and software (Adaptec), and Fibre Channel storage system (EMC CLARiiON). In addition to his past role as marketing chairperson for the Fibre Channel Industry Association, he is a member of the Ethernet Technology Summit Conference Advisory Board, a member of the Ethernet Alliance, a regular contributor to industry journals, and a frequent speaker at events. Mike also published a book through FriesenPress titled "Scale-Out Storage - The Next Frontier in Enterprise Data Management", and was listed as a top 50 B2B product marketer to watch by Kapos.
