Understanding vSphere Storage Latency – Part I

Troubleshooting any performance-related issue can be daunting. A virtualized platform adds layers of indirection, so we need a layered approach to troubleshoot and resolve such issues.

In this blog series, we shall break down the various components and show how to identify whether a specific component is contributing to storage latency.

Types of Storage

Storage can be broadly classified into three types:

  • File-based storage
  • Block storage
  • Object storage

Here is a nice blog post outlining a definition and use cases for each of them: Storage types

In the context of vSphere:

  • File-based storage is provisioned through NFS
  • Block-based storage is provisioned via iSCSI and Fibre Channel
  • VVOLs and vSAN represent object-based storage (although with subtle variations)
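
The mapping above can be sketched as a small lookup. This is only an illustration of the classification in this post; the names `STORAGE_TYPE_BY_OPTION` and `options_for` are hypothetical and not part of any vSphere API.

```python
# Illustrative lookup only: these names are hypothetical, not a vSphere API.
# Keys are the vSphere storage options named in the list above.
STORAGE_TYPE_BY_OPTION = {
    "NFS": "file",
    "iSCSI": "block",
    "FC": "block",
    "VVOLs": "object",
    "vSAN": "object",
}


def options_for(storage_type):
    """Return the vSphere options that map to the given storage type."""
    return [
        option
        for option, kind in STORAGE_TYPE_BY_OPTION.items()
        if kind == storage_type
    ]


print(options_for("block"))  # prints ['iSCSI', 'FC']
```

Since the rest of this post concentrates on block storage, the "block" entries are the ones that matter for the layers described next.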

Now let's take a closer look at the storage architecture and its components, specifically around block storage.

Storage Architecture & Components

In the following diagram, we have broken down the components into 7 layers that can be isolated from a troubleshooting standpoint.

[Diagram: the 7 layers of the storage stack, from hard disk to hypervisor]

  1. In a typical block storage architecture, the most fundamental component is the hard disk. Disks have evolved over time from magnetic disks to flash/solid-state drives to NVMe. This forms our first layer.
  2. LUN/RAID groups – Hard disks are seldom presented in raw form to the servers in a datacenter setup. A set of hard disks is grouped together into logical units to optimize performance, availability and security. These are typically referred to as LUNs (Logical Unit Numbers) in a SAN environment.
  3. The points of entry into a storage array are named differently by different vendors: Controllers, Storage Processors, Directors or simply array front-end ports. Servers can connect directly to the controllers or through fabric switches.
  4. The SAN/fabric or network switch aids in multipathing and eliminates single points of failure by acting as an intermediary between the servers and the storage array. In an FC SAN, these are termed fabric switches; in the context of iSCSI or NFS, the existing network switches play the same role.
  5. The physical connectivity is provided by Fibre Channel cabling for FC and Ethernet cables for iSCSI/NFS.
  6. Host Bus Adapters (HBAs) or Network Interface Cards (NICs) are the exit and entry points for I/O on the host; their drivers are what make these devices usable.
  7. The hypervisor layer introduces a specific path/component within the kernel depending on the type of virtual disk associated with a VM. For instance, a virtual RDM follows a different I/O path within the kernel from a standard vmdk.
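
To make the layered view concrete, here is a minimal sketch that models the seven layers as data. The `Layer` class, the `location` grouping ("array", "fabric", "host") and the helper names are assumptions made for illustration; the post itself only defines the numbered layers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    number: int    # layer number as used in the list above
    name: str
    location: str  # assumed grouping: "array", "fabric" or "host"


# The seven layers from the list above; the location tags are an assumption.
LAYERS = [
    Layer(1, "Hard disks", "array"),
    Layer(2, "LUN/RAID groups", "array"),
    Layer(3, "Array controllers / front-end ports", "array"),
    Layer(4, "SAN fabric or network switches", "fabric"),
    Layer(5, "Cabling", "fabric"),
    Layer(6, "HBA/NIC and drivers", "host"),
    Layer(7, "Hypervisor", "host"),
]


def layers_in(location):
    """Return the layers that belong to the given location."""
    return [layer for layer in LAYERS if layer.location == location]


def walk_from_host():
    """Return layers in the order an I/O leaves a VM: hypervisor first, disk last."""
    return sorted(LAYERS, key=lambda layer: layer.number, reverse=True)


for layer in walk_from_host():
    print(f"{layer.number}. {layer.name} ({layer.location})")
```

Walking the list from layer 7 down to layer 1 mirrors the path an I/O takes from the VM to the disk, which is one natural order in which to rule layers out.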

Now that we have outlined the different components in the architecture, next we shall understand how to isolate a performance bottleneck at the different layers.