Disk measurements with typeperf (Pandas py)

Azure disk-types

https://docs.microsoft.com/en-us/azure/virtual-machines/disks-types

                 Ultra        Premium SSD   Standard SSD   Standard HDD
Max disk size    65,536 GiB   32,767 GiB    32,767 GiB     32,767 GiB
Max throughput   2,000 MB/s   900 MB/s      750 MB/s       500 MB/s
Max IOPS         160,000      20,000        6,000          2,000

Disk sizes are given in gibibytes (GiB).

Premium SSD

Azure premium SSDs deliver high-performance and low-latency disk support for virtual machines (VMs) with input/output (IO)-intensive workloads. […] Premium SSDs can only be used with VM series that are premium storage-compatible.

Standard SSD

Azure standard SSDs are a cost-effective storage option optimized for workloads that need consistent performance at lower IOPS levels. […] Like standard HDDs, standard SSDs are available on all Azure VMs.

Standard SSD sizes E1 E2 E3 E4 E6 E10 E15 E20 E30 E40 E50.

Performance

https://docs.microsoft.com/en-us/azure/virtual-machines/premium-storage-performance

IOPS, or Input/output Operations Per Second, is the number of requests that your application is sending to the storage disks in one second.

Throughput, or bandwidth, is the amount of data that your application is sending to the storage disks in a specified interval.

Latency is the time it takes an application to receive a single request, send it to the storage disks and send the response to the client.

You must identify which of these performance indicators are critical to your application……

The peak workload is typically experienced for a limited period, but can require your application to scale to two or more times its normal operation. Find out the 50th, 90th, and 99th percentile requirements.
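The percentiles can be read straight from the sampled counter values. Here is a minimal sketch with pandas, assuming the IOPS samples are already in a Series; the numbers are made up for illustration.

```python
import pandas as pd

# Hypothetical IOPS samples, one value per sampling interval (made-up numbers).
iops = pd.Series([120, 135, 150, 160, 180, 210, 260, 400, 950, 1800])

# 50th, 90th and 99th percentile of the sampled values
# (pandas interpolates linearly between neighbouring samples by default).
percentiles = iops.quantile([0.50, 0.90, 0.99])
print(percentiles)
```

A large gap between the 50th and the 99th percentile, as in this made-up series, is a hint that sizing on the average alone would miss the peaks.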

You must capture the values of these counters when your application is running its normal, peak, and off-hours workloads.

You should consider scaling these numbers based on the expected future growth of your application.
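As a rough illustration of scaling for growth, the sketch below multiplies a measured peak by a growth factor and checks it against the maximum IOPS per disk type listed at the top of this post. The measured peak and the growth factor are assumptions, and the limits are the maximum for each disk type; individual disk sizes have lower limits.

```python
# Measured peak IOPS (hypothetical value) and an assumed growth factor.
measured_peak_iops = 4200
growth_factor = 1.5  # assumption: 50 % growth over the planning horizon

# Max IOPS per disk type, taken from the figures listed earlier in this post.
max_iops = {
    "Ultra": 160_000,
    "Premium SSD": 20_000,
    "Standard SSD": 6_000,
    "Standard HDD": 2_000,
}

required_iops = measured_peak_iops * growth_factor

# Keep only the disk types whose maximum covers the scaled requirement.
candidates = [name for name, limit in max_iops.items() if limit >= required_iops]

print(f"Required IOPS after growth: {required_iops:.0f}")
print("Disk types whose maximum covers it:", candidates)
```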

Performance monitor

https://docs.microsoft.com/en-us/windows/win32/perfctrs/performance-counters-portal

IOPS RW

"\\SERVER-NAME\PhysicalDisk(*)\Disk Reads/sec"
"\\SERVER-NAME\PhysicalDisk(*)\Disk Writes/sec"
"\\SERVER-NAME\PhysicalDisk(*)\% Disk Read Time"
"\\SERVER-NAME\PhysicalDisk(*)\% Disk Write Time"

Throughput or bandwidth

"\\SERVER-NAME\PhysicalDisk(*)\Disk Read Bytes/sec"
"\\SERVER-NAME\PhysicalDisk(*)\Disk Write Bytes/sec"

Latency

"\\SERVER-NAME\PhysicalDisk(*)\Avg. Disk sec/Read"
"\\SERVER-NAME\PhysicalDisk(*)\Avg. Disk sec/Write"

IO size

"\\SERVER-NAME\PhysicalDisk(*)\Avg. Disk Bytes/Read"
"\\SERVER-NAME\PhysicalDisk(*)\Avg. Disk Bytes/Write"

Queue Depth

"\\SERVER-NAME\PhysicalDisk(*)\Current Disk Queue Length"

Max. Memory

"\\SERVER-NAME\Memory\% Committed Bytes in Use"

Max CPU

"\\SERVER-NAME\Processor(*)\% Processor Time"

Network, replace Name Ethernet Adapter with your network interface name

"\\SERVER-NAME\Network Interface(Name Ethernet Adapter)\Bytes Total/sec"

Store the performance counters in a file named counter.txt, one counter per line.
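If you would rather generate counter.txt than type it, something like the following sketch could work. It assumes the usual \\Computer\Object(Instance)\Counter path form, writes the paths without surrounding quotes, and uses SERVER-NAME as a placeholder; the file lands in the current directory, so adjust the location to match your bat file.

```python
from pathlib import Path

server = "SERVER-NAME"  # placeholder, replace with your machine name

# Counter paths from this post, without the server prefix.
counters = [
    r"PhysicalDisk(*)\Disk Reads/sec",
    r"PhysicalDisk(*)\Disk Writes/sec",
    r"PhysicalDisk(*)\Disk Read Bytes/sec",
    r"PhysicalDisk(*)\Disk Write Bytes/sec",
    r"PhysicalDisk(*)\Avg. Disk sec/Read",
    r"PhysicalDisk(*)\Avg. Disk sec/Write",
    r"PhysicalDisk(*)\Avg. Disk Bytes/Read",
    r"PhysicalDisk(*)\Avg. Disk Bytes/Write",
    r"PhysicalDisk(*)\Current Disk Queue Length",
    r"Memory\% Committed Bytes in Use",
    r"Processor(*)\% Processor Time",
]

# Prefix every counter with \\server\ and write one path per line.
lines = [rf"\\{server}\{counter}" for counter in counters]
Path("counter.txt").write_text("\n".join(lines) + "\n")
```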

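Store the command below in counters_run.bat. -si 2 sets a two-second sample interval and -sc sets the number of samples to collect: -sc 60 gives 60 samples, i.e. 2 minutes, while the -sc 20 used below stops after 20 samples, i.e. 40 seconds.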


typeperf -cf "C:\Users\admin\Desktop\perfmon\counter.txt" -si 2 -sc 20 -f TSV -o "C:\Users\admin\Desktop\perfmon\counters_result.tsv"
pause

Run the bat file and observe the output.

Copy the result to Excel and change the first column (the time) to format=text so that the date and time are stored correctly.

If you want the total and not the individual instances, change (*) to (_Total) in the counter path.

In Excel, mark the cells you want to know more about, go to Insert and select a graph, and you have a nice representation of the data.

You might also replace " with nothing in Notepad before copying to Excel, and set the delimiter (tab) when importing.
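As an alternative to the Notepad and Excel steps, the TSV file can probably be loaded directly with pandas, which handles the quotes and the tab delimiter itself. The sketch below uses a made-up sample in roughly the shape typeperf writes; the exact header text and the blank first sample are assumptions, so check them against your own counters_result.tsv.

```python
import io
import pandas as pd

# Made-up sample, roughly in the shape of a typeperf TSV result (assumption).
header = [
    "(PDH-TSV 4.0) (UTC)(0)",
    r"\\SERVER-NAME\PhysicalDisk(_Total)\Disk Reads/sec",
    r"\\SERVER-NAME\PhysicalDisk(_Total)\Disk Writes/sec",
]
rows = [
    ["01/01/2024 10:00:00.000", " ", " "],  # first sample of a rate counter may be blank
    ["01/01/2024 10:00:02.000", "12.5", "30.0"],
    ["01/01/2024 10:00:04.000", "20.5", "10.0"],
]
sample_tsv = "\n".join(
    "\t".join(f'"{cell}"' for cell in row) for row in [header] + rows
)

# With a real file, pass its path instead of the StringIO object.
df = pd.read_csv(io.StringIO(sample_tsv), sep="\t")

# Drop the time column and turn everything else into numbers (blanks become NaN).
time_col = df.columns[0]
counters = df.drop(columns=[time_col]).apply(pd.to_numeric, errors="coerce")

# One row per counter with count, max, mean and min.
summary = counters.agg(["count", "max", "mean", "min"]).T
print(summary)
```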

Cool, now you have the data and some graphs. How about the max, mean and min?

import pandas as pd

# test.csv holds the counter data, e.g. the Excel sheet saved as CSV
df = pd.read_csv('test.csv')
header = list(df.columns.values)
print(header)

# Finding count, max, mean and min for every counter column
print("\n")
print("Counter;Amount;Max;Mean;Min")
for x in df:
    # Turn the column into numbers; timestamps and blank cells become NaN and are dropped
    values = pd.to_numeric(df[x], errors='coerce').dropna()
    if values.empty:
        # Nothing numeric here (typically the time column), so skip it
        print(str(x) + ";no numeric values, skipped")
        continue
    total = values.count()
    ma = values.max()
    me = values.mean()
    mi = values.min()
    print(str(x) + ";" + str(total) + ";" + str(ma) + ";" + str(me) + ";" + str(mi))

# Notes on preparing test.csv:
# the time column and empty cells do not need to be removed by hand, they are skipped above
# keep just one tab (sheet) in test.csv; the graph can stay in the Excel file

  

Fear no longer…

Nature of IO requests

IOPS * IO Size = Throughput
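A quick worked example of this relation, with made-up numbers. In terms of the counters above, Disk Reads/sec multiplied by Avg. Disk Bytes/Read should land close to Disk Read Bytes/sec.

```python
# Hypothetical workload: 5,000 IO operations per second at 8 KiB each.
iops = 5000
io_size_bytes = 8 * 1024

# IOPS * IO size = throughput
throughput_bytes_per_sec = iops * io_size_bytes
throughput_mb_per_sec = throughput_bytes_per_sec / 1_000_000

print(f"{throughput_bytes_per_sec} bytes/s = {throughput_mb_per_sec:.2f} MB/s")
```

The same IOPS with a larger IO size gives proportionally more throughput, which is why a workload can hit the throughput limit of a disk long before its IOPS limit, or the other way round.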

……