Understanding vSphere Storage Latency – Part I

Troubleshooting any performance-related issue can be quite daunting. In any virtualized platform there are additional layers of indirection, so we need a layered approach to troubleshoot and resolve such issues.

In this blog series, we shall break down the various components and see how to identify whether a specific component is contributing to the storage latency.

Types of Storage

Storage can be broadly classified into three types:

  • File-based storage
  • Block storage
  • Object storage

Here is a nice blog outlining a definition and use cases for each of them: Storage types

In the context of vSphere:

  • File-based storage is provisioned through NFS
  • Block-based storage is provisioned via iSCSI and Fibre Channel
  • VVols and vSAN represent object-based storage (although with subtle variations)

Now let's take a closer look at storage architecture and its components, focusing on block storage.

Storage Architecture & Components

In the following diagram, we have broken down the components into 7 layers that can be isolated from a troubleshooting standpoint:

  1. In a typical block storage architecture, the most fundamental component is the hard disk. Hard disks have evolved over time from magnetic disks to flash/solid-state drives to NVMe devices. This forms our first layer.
  2. LUN/RAID groups – Hard disks are seldom presented in raw form to the servers in a datacenter setup. Instead, a set of hard disks is grouped together as logical units to optimize performance, availability and security. These are typically referenced as LUNs (Logical Unit Numbers) in a SAN environment.
  3. The points of entry into a storage array are termed differently by different vendors: Controllers, Storage Processors, Directors, or simply array front-end ports. Servers can connect to the controllers directly or through fabric switches.
  4. SAN/fabric or network switches aid in multipathing and in eliminating single points of failure by acting as an intermediary between the servers and the storage array. In an FC SAN these are termed fabric switches; in the context of iSCSI or NFS, the existing network switches play the same role.
  5. The physical connectivity is enabled through Fibre Channel cabling or Ethernet cables, for FC and iSCSI/NFS respectively.
  6. Host Bus Adapters (HBAs) or Network Interface Cards (NICs) are the exit and entry points for I/O on the host; the corresponding drivers enable these devices.
  7. The hypervisor layer introduces a specific path/component within the kernel depending on the type of virtual disk associated with a VM; for instance, a virtual RDM follows a different I/O path from a standard VMDK within the kernel.
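As a rough mental model (not a measurement tool), the layered view above can be sketched in a few lines of Python: the total latency observed by the guest is approximately the sum of the latency each layer contributes, so isolating a bottleneck amounts to finding the dominant contributor. The layer names and the sample numbers below are illustrative assumptions, not real measurements.

```python
# Simplistic model: total observed latency ~ sum of per-layer contributions.
# Sample values are made up for illustration only.

def dominant_layer(per_layer_latency_ms):
    """Return the (layer, latency_ms) pair contributing the most latency."""
    return max(per_layer_latency_ms.items(), key=lambda kv: kv[1])

sample = {
    "disk": 4.0,        # physical media service time
    "lun/raid": 0.5,    # array back-end grouping/queuing
    "controller": 0.3,  # storage processor / front-end port
    "fabric": 0.1,      # SAN or network switch transit
    "cabling": 0.0,     # usually negligible unless faulty
    "hba/driver": 0.2,  # host adapter and driver queuing
    "hypervisor": 0.4,  # kernel I/O path (vmdk vs. RDM, etc.)
}

total_ms = sum(sample.values())
layer, latency_ms = dominant_layer(sample)
print(f"total: {total_ms:.1f} ms, dominant layer: {layer} ({latency_ms} ms)")
```

In this toy example the physical disk dominates, which matches intuition for spinning media; in later parts we will see how to attribute real latency numbers to these layers instead of assumed ones.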

Now that we have outlined the different components in the architecture, we shall next look at how to isolate a performance bottleneck at each of these layers.


Clouds and beyond

The question on most IT professionals’ minds seems to be: “What’s the next paradigm shift in the datacenter space?”

About a decade ago, most IT companies started heavy adoption of virtualization and incorporated a “virtualization first” approach. In essence, before procuring a hardware resource, an IT manager would evaluate whether the intended workload could be virtualized; if yes, virtualization would be the default choice. The reasons are obvious and plentiful: cost, flexibility, availability and so on.

Net-net, the datacenters consolidated and optimized. To help the cause, there’s Moore’s law.

In the initial days there was skepticism and rebellion of sorts, and certain industries chose to remain physical, citing their own reasons. Eventually, however, we saw almost all verticals, from banks to defense organizations, adopt virtualization.

The important thing to note is that as the density of hypervisor vendors increased, the value proposition was no longer “virtualization” itself but who could serve it best; i.e., the actual competition was over who could provide better-quality features on top of a virtualized platform.

One can also perceive that “virtualization” turned into a commodity, and that how it was delivered, maintained and managed became the deciding factors.

Meanwhile, there were interesting developments above and below the virtualization layer: hyper-convergence, storage and network virtualization, and, in the application stack, the modernization of apps away from legacy models toward cloud-native models.

Putting the pieces together, we have a mixed bag of workloads: some suited to a private cloud, some to the public cloud, and some to a hybrid of the two.

From an organizational standpoint, CIOs would build a cloud strategy with a set of policies that govern the placement of workloads (a feature to consider: policy-based cloud workload management).

The devil is in the details:

Private cloud = increased CapEx; on-prem, but better control and compliance

Public cloud = increased OpEx; off-prem, but predictable expenditure and fewer IT management complexities such as datacenter costs (power, cooling, hardware maintenance, etc.)

Over time, we have witnessed each layer in the datacenter (bottom-up) getting commoditized.

Gartner predicts that by 2020, a “no-cloud” policy will be rare.

It appears that a hybrid state with shifting balances will prevail for a fair amount of time; Nostradamus may have to help us from there!

Repeated Outlook Crash – Office 365 ProPlus Version 1706-Build 8229.2073

Annoyed with the Outlook crash on a Windows 10 Insider build!

Guess that’s what we can expect with preview software builds. Nonetheless, until Microsoft figures out a fix for the crash, be sure to lower the threshold for automatic saving of drafts:


1- Browse to the following setting:

File => Options => Mail => Enable “Automatically save items that have not been saved after this many minutes”

2- Set the interval to 1 (the default is 3)

3- The next time Outlook crashes while you’re in the middle of an e-mail, go straight to your Drafts folder and continue

This only mitigates the data loss, limiting it to what was typed between the last saved checkpoint and the crash.

With bated breath for a fix!