In our consulting practice we often encounter absurd situations, designs and solutions. The usual cause is that companies avoid involving external consultants who have the needed know-how and competence, and instead staff projects with internal resources who lack the needed know-how and skills. In our series Oh, really? we will share the most absurd things we encounter in our customers' environments, together with tips on how to do it right.
In our first episode, we will look at teaming of iSCSI Ethernet interfaces: why it is the wrong thing to do, and why iSCSI MultiPath is a better solution to the same problem.
It is a well-known fact that, unlike Fibre Channel, Ethernet was not designed with built-in resilience or for running mission-critical services. Even so, Ethernet has made its way into mission-critical environments, and we now have to deal with that fact. There are certain methods to increase Ethernet resilience, such as link aggregation or Spanning Tree Protocol (STP).
However, as general-purpose L2 solutions to the resilience problem, these might not be ideal for certain applications and higher-layer protocols. Sometimes there is nothing better available to increase the resilience of the protocol in use (as in most NFS deployments). In other situations (like iSCSI), the people who designed the protocol have included a better and more integrated approach to solving the same problem. In the case of iSCSI this is called MultiPath. Just as in Fibre Channel networks, MultiPath in iSCSI provides the initiator (client) with multiple paths to the same storage.
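To make the idea of multiple paths more concrete, here is a minimal sketch of how a multipath layer can fail over between paths. It is a toy model only: the Path and MultipathDevice classes and the path names are our own hypothetical illustration, not the API of any real MPIO or iSCSI initiator implementation.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """One initiator-to-target path (hypothetical model, not a real MPIO API)."""
    name: str
    healthy: bool = True


class MultipathDevice:
    """Toy model of storage reachable over several independent paths."""

    def __init__(self, paths):
        self.paths = list(paths)

    def send_io(self, request):
        # Try the paths in order and use the first one that is healthy.
        for path in self.paths:
            if path.healthy:
                return f"{request} via {path.name}"
        raise IOError("all paths to the storage are down")


device = MultipathDevice([Path("path-A"), Path("path-B")])
first = device.send_io("READ block 0")

# Simulate a failure somewhere on path A (NIC, cable, switch, ...).
device.paths[0].healthy = False
second = device.send_io("READ block 0")
```

The point of the sketch is that the initiator itself knows about both paths and decides which one to use, instead of relying on the Ethernet layer to hide a failure.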
Independent L1 networks
In an ideal iSCSI world, your iSCSI Ethernet/IP network would be built the same way as your Fibre Channel network.
Two or more totally independent sets of switches are connected to multiple Ethernet adapters on the server side and on the storage side. With two different subnets and two different VLANs, any error within one L1/L2 domain is isolated in that domain and cannot affect the other, independent path.
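As a rough illustration, the independence requirement can be written down as a small check. The sketch below assumes a hypothetical description of each path (subnet, VLAN ID and switch names are made up for the example) and reports what two paths have in common.

```python
import ipaddress
from itertools import combinations

# Hypothetical path descriptions; addresses, VLAN IDs and switch names are made up.
paths = [
    {"name": "path-A", "network": "192.168.10.0/24", "vlan": 10, "switches": {"sw-a1"}},
    {"name": "path-B", "network": "192.168.20.0/24", "vlan": 20, "switches": {"sw-b1"}},
]


def shared_elements(a, b):
    """Return what two paths have in common (empty list means independent)."""
    problems = []
    net_a = ipaddress.ip_network(a["network"])
    net_b = ipaddress.ip_network(b["network"])
    if net_a.overlaps(net_b):
        problems.append("subnet")
    if a["vlan"] == b["vlan"]:
        problems.append("vlan")
    if a["switches"] & b["switches"]:
        problems.append("switch")
    return problems


def check_independence(paths):
    """Compare every pair of paths and collect what each pair shares."""
    return {
        (a["name"], b["name"]): shared_elements(a, b)
        for a, b in combinations(paths, 2)
    }


result = check_independence(paths)
```

For the two paths above, every pair should come back with an empty list. Any "subnet", "vlan" or "switch" entry means the paths are not as independent as the design assumes.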
Bonding on storage side
In a less ideal world, two or more Ethernet NICs are aggregated on the storage side using a link aggregation protocol, and two different VLAN IDs are used on logically aggregated switches. This provides resilience against errors in one L2 domain, but it does not protect against logical errors made by the network administrator, and it offers only limited protection against L1 errors.
Bonding on both sides
This is a totally wrong approach to iSCSI implementation, and one we have encountered in a customer environment. Implementing iSCSI this way shows a lack of understanding of SAN infrastructure, and of iSCSI in particular. You can read Microsoft KB note 2535811, Supported and tested Microsoft iSCSI Software Target 3.3 limits, which states:
- You should not use network adapter teaming with Microsoft iSCSI Software Target 3.3 for iSCSI communication.
- If you plan to use multiple network adapters for iSCSI communication, you should separate them into their own subnets, set up virtual IP addresses, and then implement MPIO.
With teaming/bonding on both sides, you would use the same VLAN with the same VLAN ID on one logical switch. Using the same VLAN provides no error isolation, neither in the L1 network nor in the L2 domain. A single switch failure might bring down the entire logical switch, and since the same VLAN is used, L2 errors will also hit what is, from the iSCSI perspective, the only path.
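One way to compare the three designs is to list, for each path that iSCSI can see, the fault domains it depends on, and then look at what all paths have in common. The sketch below is a deliberate simplification with hypothetical component names. A "component" here is a shared fault domain (for example a logical switch or a VLAN), not necessarily a single physical device.

```python
# Hypothetical component names; each set lists what one iSCSI-visible path depends on.
designs = {
    "independent networks": [
        {"nic-1", "switch-A", "vlan-10", "storage-port-1"},
        {"nic-2", "switch-B", "vlan-20", "storage-port-2"},
    ],
    "bonding on storage side": [
        {"nic-1", "logical-switch", "vlan-10", "storage-bond"},
        {"nic-2", "logical-switch", "vlan-20", "storage-bond"},
    ],
    "bonding on both sides": [
        # iSCSI sees a single path; the team hides the physical links.
        {"host-team", "logical-switch", "vlan-10", "storage-bond"},
    ],
}


def shared_fault_domains(paths):
    """Fault domains that every path depends on."""
    return set.intersection(*paths)


shared = {name: sorted(shared_fault_domains(p)) for name, p in designs.items()}
```

In this model the first design shares nothing between paths, the second shares the logical switch and the storage-side bond, and in the third every fault domain is shared, because there is only one path to begin with.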
Conclusion
Some crazy person might come up with many more ways to design an iSCSI Storage Area Network besides the three designs mentioned above, but these three are the typical ones, especially the first and the second. The third, even though not approved by vendors, is sometimes deployed. If you are planning to deploy an iSCSI SAN, preferably stick with design number one. If you want to combine SAN and NAS environments as part of Unified Storage and Unified Network, you might consider design number two.
But using design number three just shows a lack of judgment, experience and know-how. Implementing a SAN the wrong way is much worse than not having a SAN at all, and design number three is the wrong way to implement an IP SAN.


