Network storm observed when an NFS datastore is removed abruptly from an ESX host

Issue Description: Unpresenting an NFS datastore on the array without first unmounting it from the hosts results in a network traffic storm across the infrastructure.

Symptoms:

  • Hosts disconnect from the vCenter Server
  • Physical network switches are utilized near capacity
  • Packet capture reveals a multitude of GETATTR and ARP packets
  • Other systems connected to the same network infrastructure are impacted

Tip #1: Run a packet capture on the NFS array/filer

Tip #2: Use Wireshark to review the packet capture

Packet Capture – Failure pattern:

===

--> SYN
<-- SYN-ACK
--> ACK
--> GETATTR
<-- FIN-ACK (server closes the connection)
--> ARP REQ
<-- ARP RES
--> ACK (for the FIN sent by the server)
--> RST
===
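
To gauge how often this pattern repeats in a capture, the sequence above can be matched programmatically. The following is a minimal sketch, assuming the capture has already been reduced to an ordered list of simplified event labels for a single client; the labels, the `FAILURE_CYCLE` constant, and the function name are illustrative and do not come from any capture tool.

```python
from collections import Counter

# Simplified labels for the TCP/NFS part of the failure pattern (assumed names).
# ARP frames are excluded because they are not part of the TCP conversation.
FAILURE_CYCLE = ["SYN", "SYN-ACK", "ACK", "GETATTR", "FIN-ACK", "ACK", "RST"]


def count_failure_cycles(events):
    """Count complete, non-overlapping occurrences of FAILURE_CYCLE
    in an ordered list of event labels from one client."""
    tcp_events = [e for e in events if not e.startswith("ARP")]
    cycle_len = len(FAILURE_CYCLE)
    cycles = 0
    i = 0
    while i + cycle_len <= len(tcp_events):
        if tcp_events[i:i + cycle_len] == FAILURE_CYCLE:
            cycles += 1
            i += cycle_len  # skip past the matched cycle
        else:
            i += 1
    return cycles


if __name__ == "__main__":
    # Hypothetical trace: the pattern from the capture, repeated three times.
    one_cycle = ["SYN", "SYN-ACK", "ACK", "GETATTR", "FIN-ACK",
                 "ARP-REQ", "ARP-RES", "ACK", "RST"]
    trace = one_cycle * 3
    print("failure cycles:", count_failure_cycles(trace))
    print("GETATTR count:", Counter(trace)["GETATTR"])
```

A high cycle count from one host over a short capture window would be consistent with that host still trying to reach a datastore that no longer exists on the filer.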

 

Root Cause :

NFS filers/arrays typically return a FIN-ACK to close the connection of any NFS client (an ESX host or any other server accessing the filer) that attempts to access a datastore (export) that has been deleted or removed. A host that still has the datastore mounted appears to reconnect and retry, so the cycle shown in the capture repeats.

This can be regarded as a security measure that quells requests for non-existent devices; the NFS server is not obligated to service such requests.

Another significant reason for the array to behave this way is that someone could otherwise build a server in the environment to launch denial-of-service (DoS) attacks against the NFS array.

 

Resolution:

If the hosts are still accessible, unmount the datastores.

Otherwise, immediately power down or reboot the hosts causing the network storm.
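
When several datastores need to be unmounted from a host that is still reachable, the commands can be generated up front. The sketch below only builds `esxcli storage nfs remove` command strings for review; it does not run them. The datastore names are hypothetical, and the exact procedure should be taken from the VMware KB article for the ESXi version in use.

```python
import shlex


def build_unmount_commands(datastore_names):
    """Return one esxcli unmount command string per NFS datastore name.

    The names are quoted so that datastores containing spaces are safe
    to paste into a shell session on the host.
    """
    return [
        "esxcli storage nfs remove -v " + shlex.quote(name)
        for name in datastore_names
    ]


if __name__ == "__main__":
    # Hypothetical datastore names, for illustration only.
    for command in build_unmount_commands(["nfs_ds01", "nfs ds02"]):
        print(command)
```

Printing the commands rather than executing them leaves room to confirm that no virtual machines still reside on each datastore before it is unmounted.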

 

The best practice for datastore removal is documented here:

http://kb.vmware.com/kb/2004605 – Un-mounting or detaching a datastore / storage device from multiple ESXi 5.x hosts.

In conclusion, this behavior is the fault of neither the ESX server nor the array. Both are reacting to an abrupt device removal that goes against standard best practices, although both the server and the client could be designed to behave more gracefully.

 
