This best practice document was created to share my production experience in architecting, implementing, and managing many ONTAP 7-Mode HA pairs and over 30 ONTAP SAN clusters. This knowledge is based on real-world experiences of iSCSI deployments that I’ve encountered over the years. This document was used as reference material during the iSCSI Best Practices whiteboard session at NetApp Insight 2018 in Las Vegas. If you are looking for some tested best practices to help you be successful with your iSCSI deployments, then keep reading.
Before getting into it, I wanted to establish a few key definitions as well as a quick overview of NetApp Unified Storage.
NAS – Network Attached Storage: File-based storage (NFS and SMB, including CIFS and SMB3) in which ONTAP controls the file system.
SAN – Storage Area Network: Block-based storage (FC, FCoE, iSCSI) in which the host controls the file system.
LUN – Logical Unit Number: A logical representation of an attached SCSI disk.
SCSI – Small Computer Systems Interface: A set of standards that define commands, protocols, and interfaces that are used to transmit data. SCSI allows low-level access to data in units of 512-byte blocks. This is highly efficient and has low overhead compared to NAS (file-level) access. SCSI has a high level of resiliency that makes it well suited as an enterprise-level protocol. SAN uses the SCSI-3 protocol.
FC SAN – Fibre Channel SAN: Uses the FC protocol to communicate over FC ports. FC encapsulates SCSI commands in FC frames.
IP SAN – iSCSI SAN: Uses the iSCSI protocol to communicate over Ethernet ports. iSCSI encapsulates SCSI commands in TCP/IP packets. Uses TCP port 3260.
In an FC SAN, a worldwide node name (WWNN) describes a machine, and a worldwide port name (WWPN) describes a physical port that is attached to that machine. In an IP SAN, the node name describes a machine, and the portal describes a physical port. Each iSCSI node must have a node name. There are two supported node name formats: IQN (iSCSI Qualified Name) and EUI (Extended Unique Identifier).
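To make the two node name formats concrete, here is a minimal Python sketch that classifies a name as IQN or EUI. The patterns are a simplified reading of the naming rules in RFC 3720 (lowercase IQNs only), and the function name and sample names are my own illustrations, not values from a real system.

```python
import re

# IQN: "iqn." + yyyy-mm (when the naming authority registered its domain)
# + "." + reversed domain name, optionally followed by ":" and a suffix.
# Simplified: assumes the name has already been normalized to lowercase.
IQN_RE = re.compile(
    r"^iqn\.\d{4}-(0[1-9]|1[0-2])\.[a-z0-9]([a-z0-9.-]*[a-z0-9])?(:[^\s,]+)?$"
)
# EUI: "eui." followed by 16 hexadecimal digits (an EUI-64 identifier).
EUI_RE = re.compile(r"^eui\.[0-9A-Fa-f]{16}$")


def node_name_format(name: str) -> str:
    """Return 'iqn', 'eui', or 'invalid' for an iSCSI node name."""
    if len(name.encode("utf-8")) > 223:  # iSCSI names are capped at 223 bytes
        return "invalid"
    if IQN_RE.match(name):
        return "iqn"
    if EUI_RE.match(name):
        return "eui"
    return "invalid"


if __name__ == "__main__":
    # Illustrative names only; the serial and host parts are made up.
    for candidate in (
        "iqn.1992-08.com.netapp:sn.0123456789abcdef:vs.3",
        "iqn.1991-05.com.microsoft:host01.example.com",
        "eui.02004567A425678D",
        "wwn.5000",
    ):
        print(candidate, "->", node_name_format(candidate))
```

A check like this is handy when building igroups from a spreadsheet of initiator names, where a stray space or a mistyped prefix is easy to miss.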
Set the MTU to 9,000 on data and cluster interconnect ports and disable flow control on all ports (verify with net port show -fields mtu,flowcontrol-admin).
FC and IP SAN ports within the storage controller use application-specific integrated circuit (ASIC) chipsets. It is important to verify that target ports are split across different ASICs within the controller or add-on card so that no single ASIC becomes a single point of failure.
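For example, pairing the following ports would be incorrect, because both ports in each pair share an ASIC:
(IP) e0a and e0b are on the same ASIC
(FC) 0a and 0b are on the same ASIC
The correct usage would be to pair ports that sit on different ASICs: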
(IP) e0a and e0c, e0b and e0d
(FC) 0a and 0c, 0b and 0d
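The pairing rule above can be checked mechanically. The sketch below assumes a hypothetical port-to-ASIC map that mirrors the example (e0a/e0b on one ASIC, e0c/e0d on another); the real layout depends on the controller model or add-on card, so treat the map and the function name as placeholders.

```python
# Hypothetical port-to-ASIC layout mirroring the example above.
# The actual mapping depends on the controller model or add-on card.
PORT_TO_ASIC = {
    "e0a": "asic0", "e0b": "asic0",
    "e0c": "asic1", "e0d": "asic1",
}


def shares_single_asic(ports, port_to_asic=PORT_TO_ASIC):
    """Return True if every port in the group sits on the same ASIC.

    A True result means one ASIC failure would take out the whole group.
    """
    asics = {port_to_asic[port] for port in ports}
    return len(asics) == 1


if __name__ == "__main__":
    for group in (["e0a", "e0b"], ["e0a", "e0c"], ["e0b", "e0d"]):
        verdict = "single point of failure" if shares_single_asic(group) else "ok"
        print(group, "->", verdict)
```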
ALUA (Asymmetric Logical Unit Access) Path Optimization Selection
ALUA is required in both FC and iSCSI implementations.
Active/Optimized and Active/Non-Optimized paths are both configured for access, with a minimum of one path per node. For better performance, it is highly recommended to configure two or more paths per node so that the host can multipath across them.
An Active/Optimized path uses direct or primary paths between the initiator and target on the node that owns the LUN. An Active/Non-Optimized path uses an indirect or secondary path between the initiator and target through the cluster interconnect, with increased latency.
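As a rough illustration of how the two ALUA states relate to LUN ownership, the sketch below labels each path by comparing the node that hosts the target LIF with the node that owns the LUN. The Path class, LIF names, and node names are hypothetical; a real host learns these states from the target through ALUA rather than computing them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    lif: str   # target LIF name (hypothetical)
    node: str  # cluster node that hosts the LIF


def alua_state(path: Path, lun_owner: str) -> str:
    """Direct path to the owning node is optimized; any other is not."""
    if path.node == lun_owner:
        return "active/optimized"
    return "active/non-optimized"


def nodes_without_paths(paths, nodes):
    """Return nodes that violate the minimum of one path per node."""
    covered = {path.node for path in paths}
    return sorted(set(nodes) - covered)


if __name__ == "__main__":
    paths = [
        Path("iscsi_lif_n1a", "node1"), Path("iscsi_lif_n1b", "node1"),
        Path("iscsi_lif_n2a", "node2"), Path("iscsi_lif_n2b", "node2"),
    ]
    for p in paths:
        print(p.lif, "->", alua_state(p, lun_owner="node1"))
    print("nodes missing paths:", nodes_without_paths(paths, ["node1", "node2"]))
```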
Path Selection (Round Robin)
With Round Robin (RR), the host uses an automatic path selection algorithm that rotates through all active paths when connected to active-passive arrays, or through all available paths when connected to active-active arrays. RR is the default for a number of arrays and can be used with both active-active and active-passive arrays to load-balance across paths for different LUNs. NetApp storage is configured as an Active/Active solution by default. It can be configured as Active/Passive depending on customer requirements.
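The rotation itself is simple. Here is a minimal sketch of round-robin selection over whatever set of paths the host considers eligible; the path names are made up, and real multipath drivers typically switch paths after a configurable number of I/Os or bytes rather than on every single I/O.

```python
from itertools import cycle


def assign_ios(paths, io_count):
    """Assign io_count I/Os to paths in strict rotation.

    `paths` is the set of paths the host treats as eligible; which paths
    qualify is decided by the host's multipath policy, not by this sketch.
    """
    if not paths:
        raise ValueError("at least one path is required")
    selector = cycle(paths)
    return [next(selector) for _ in range(io_count)]


if __name__ == "__main__":
    # Hypothetical path names for illustration.
    for io_number, path in enumerate(assign_ios(["path_a", "path_b", "path_c"], 7)):
        print(f"I/O {io_number} -> {path}")
```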
Jumbo frames are larger Ethernet frames that reduce the ratio of frame overhead to payload. The default Ethernet MTU is 1,500 bytes. With jumbo frames, MTU is typically set to 9,000 on end nodes, such as servers and storage, and to a larger value, such as 9,198 or 9,216, on the physical switches. Jumbo frames must be enabled on all physical devices and logical entities from end to end in order to avoid truncation or fragmentation of packets with the maximum size.
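To put a number on the overhead-to-payload ratio, the sketch below compares MTU 1,500 with MTU 9,000. It assumes untagged Ethernet, IPv4 and TCP headers without options, and it ignores iSCSI PDU headers, so the figures are an approximation rather than a measurement.

```python
import math

# Per-frame bytes on the wire that are not part of the MTU:
# Ethernet header (14) + FCS (4) + preamble/SFD (8) + inter-frame gap (12).
ETH_OVERHEAD = 14 + 4 + 8 + 12
# IPv4 header (20) + TCP header (20), assuming no options.
IP_TCP_HEADERS = 20 + 20


def payload_efficiency(mtu: int) -> float:
    """Fraction of wire bytes that carry TCP payload at a given MTU."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETH_OVERHEAD)


def frames_needed(data_bytes: int, mtu: int) -> int:
    """Number of full-size frames required to carry data_bytes of payload."""
    return math.ceil(data_bytes / (mtu - IP_TCP_HEADERS))


if __name__ == "__main__":
    one_mib = 1024 * 1024
    for mtu in (1500, 9000):
        print(
            f"MTU {mtu}: efficiency {payload_efficiency(mtu):.2%}, "
            f"{frames_needed(one_mib, mtu)} frames per MiB"
        )
```

Under these assumptions, efficiency rises from roughly 94.9% to roughly 99.1%, and the frame count per MiB drops from 719 to 118, which also means fewer frames for hosts and switches to process.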
On physical switches, the MTU must be set to the maximum supported value, either as a global setting or policy option or on a port-by-port basis (including all ports used by ESXi and the nodes of the NetApp cluster), depending on the switch implementation. The MTU must also be set, and the same value must be used, on the ESXi vSwitch and VMkernel port and on the physical ports or interface groups of each node.
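The end-to-end rule can be expressed as a small consistency check: every endpoint must use the same MTU, and every switch must be at or above it. The component names and the dictionaries below are hypothetical inventory data, not output from any tool. The helper at the end computes the largest ICMP payload that fits in one unfragmented IPv4 packet, which is the size to use when testing the path with a don't-fragment ping.

```python
def check_jumbo_path(endpoint_mtus, switch_mtus, expected=9000):
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    for name, mtu in endpoint_mtus.items():
        if mtu != expected:
            problems.append(f"{name}: MTU {mtu}, expected {expected}")
    for name, mtu in switch_mtus.items():
        if mtu < expected:
            problems.append(f"{name}: switch MTU {mtu} is below {expected}")
    return problems


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload for one IPv4 packet: MTU - 20 (IP) - 8 (ICMP)."""
    return mtu - 20 - 8


if __name__ == "__main__":
    # Hypothetical inventory: the vSwitch was left at the default MTU.
    endpoints = {
        "esxi vmk1": 9000,
        "esxi vSwitch1": 1500,
        "node1 a0a-100": 9000,
        "node2 a0a-100": 9000,
    }
    switches = {"switch-a": 9216, "switch-b": 9216}
    for problem in check_jumbo_path(endpoints, switches):
        print("PROBLEM:", problem)
    print("don't-fragment ping payload:", max_ping_payload(9000))
```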
When problems occur, it is often because either the VMkernel or the vSwitch was not set for jumbo frames. For VM guests that require direct access to storage through their own NFS or CIFS stack or iSCSI initiator, there is no MTU setting for the VM port group; however, the MTU must be configured in the guest. The image below shows jumbo frame MTU settings for the various networking components.
After creating ifgrps and VLANs, remove the ifgrps and data ports from the default broadcast domain and create a data broadcast domain. Move the ifgrps and data ports into the data broadcast domain and set its MTU to 9,000.
Questions? Find me on Twitter (@cartracr) for more information!
Thanks to Steve Botkin, AKA SANta (@SANTechArch) for his help with SANitizing the content of this document.