When running any type of PowerCLI script that needs to query storage devices on an ESXi host, you end up using the Get-ScsiLun cmdlet. Unfortunately, if you have a large environment with a lot of storage devices connected to your hosts, you will quickly find out that this single command can be very expensive to run. In my environment it literally takes almost 1-2 minutes to execute. Now you may think that is nothing, but if you are iterating through 50 hosts and running this command for each one, now you're talking about 50-100 minutes for the script to complete. I was working on a script recently that would pull disk information for each virtual machine in our environment. One of the fields requested was the vendor and model info for each storage device that housed the VM's VMDK files. Of course, getting this info required using Get-ScsiLun. Well, you guessed it, that meant running that command for each of the 200+ VMs. The script worked great, except that it took nearly 5 1/2 hours to complete :(. Unacceptable, right!
Thinking outside the box, I decided to look at what my script was doing each time I called that command. Essentially, I was using the Get-Datastore cmdlet to pull the canonical name (the unique identifier of the storage device) and then running Get-ScsiLun with that known canonical name to pull the vendor and model properties off of the storage device. This got me thinking: why not pull all the storage device information for a host once and store it in a hash table, using the canonical name as the key? This works for me because all the VMs are in the same cluster of hosts, so all the hosts should see the same storage devices. If you are working with multiple clusters, or with hosts not in clusters, I would recommend building a separate hash table per host and then referencing the right one based on which host the VM you are querying runs on.
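For that multi-cluster scenario, a rough sketch of the per-host approach might look like the following (this assumes a PowerCLI session connected to vCenter; $cluster is a hypothetical cluster object, not something from my actual script):

#Sketch: one hash table per host, keyed by host name
$lunInfoByHost = @{}
foreach ($esxHost in Get-VMHost -Location $cluster) {
    $hostLuns = @{}
    Get-ScsiLun -VMHost $esxHost | %{
        $hostLuns.Add($_.CanonicalName, $_.ExtensionData)
    }
    $lunInfoByHost.Add($esxHost.Name, $hostLuns)
}
#Later, look up a LUN using the VM's host name and the canonical name:
#$lun = $lunInfoByHost[$vm.VMHost.Name][$canonicalName]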
Anywho... Building this hash table is pretty simple. First, declare the hash table variable using the @{} syntax. Then run Get-ScsiLun targeting a host and use the hash table's ".Add" method to add each device to the table. The .Add method takes exactly two arguments: the key (so make sure it's unique) and the value to store under it. Here is an example of how I built the hash table of the information.
#Dump SCSI LUN info into a hash table, keyed by canonical name
$scsiLunInfo = @{}
Get-ScsiLun -VMHost $vmHost | %{
    #.Add takes only (key, value), so store one object holding both vendor and model
    $scsiLunInfo.Add($_.CanonicalName, $_.ExtensionData)
}
Alrighty. Now that we have a hash table of the SCSI LUN information, we can search that table instead of running Get-ScsiLun. The difference is literally night and day in terms of time to execute. Remember how I said Get-ScsiLun could take 1-2 minutes? Searching the hash table is nearly instant. Ok, now we have a hash table of the storage device info, but how do we use it?
Well, in my scenario I am trying to pull the vendor and model of the storage device that a VM's VMDK file is residing on. So first I use the Get-Datastore cmdlet to target a datastore holding the VM's VMDK files and pull the canonical name of the storage device that the datastore is backed by (remember, the canonical name is the key in the hash table). Once I have this name, I can search the hash table and return the properties stored under that key. This is accomplished through the ".Get_Item" method on the hash table. Here is an example of how I utilize the hash table to return the required information.
#Get canonical name of the storage device backing the datastore
$id = (Get-Datastore $datastore | Get-View).Info.Vmfs.Extent[0].DiskName
#Look up the backend storage device in the hash table
$storageSystem = $scsiLunInfo.Get_Item($id)
#Vendor and model are then $storageSystem.Vendor and $storageSystem.Model
Using this approach, I build the hash table once at the beginning of my script. Then, any time I need to pull storage device info, I query the hash table instead of running the expensive Get-ScsiLun command. Since that command now runs only once at the start of the script, the execution time dropped by over 90%. Remember I stated it took nearly 5 1/2 hours to run my script against 200+ VMs? With this approach, the script now takes roughly 15 minutes.
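Putting it all together, the overall shape of the script looks something like this (a sketch only, not my exact script; it assumes all VMs run on hosts that see the same LUNs as $vmHost, and that each datastore is a single-extent VMFS volume):

#Build the cache once at the beginning of the script
$scsiLunInfo = @{}
Get-ScsiLun -VMHost $vmHost | %{
    $scsiLunInfo.Add($_.CanonicalName, $_.ExtensionData)
}
#Then loop over the VMs, hitting only the hash table each iteration
foreach ($vm in Get-VM) {
    foreach ($ds in ($vm | Get-Datastore)) {
        $id = ($ds | Get-View).Info.Vmfs.Extent[0].DiskName
        $lun = $scsiLunInfo.Get_Item($id)
        #$lun.Vendor and $lun.Model now come straight from the cache
    }
}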
I am sure this concept can be applied to other expensive VMware PowerCLI cmdlets, but I have yet to experiment with any others.