Tuesday, April 13, 2010

Disk Subsystem Capacity Management - my CMG'03 paper - "Health Index" metric and Dynamic Thresholds

Here is the link to my CMG'03 paper:  http://www.cmg.org/proceedings/2003/3099.pdf
(Free download but registration is required)
Presentation slides are freely available here:
Disk Subsystem Capacity Management, Based on Business ... - CMG

1. The paper showed an interesting way to report disk space usage via BMC Perceive.

2. The paper includes an example of using an interesting "Health Index" metric. I simply took it from the Concord performance data collector (now a CA product, I believe), where it was one of many performance metrics.

Based on the Concord eHealth tool documentation:

“System Health Index” is the sum of five components (variables):
–SYSTEM, which reports a CPU imbalance problem;
–MEMORY, which reports memory utilization above some threshold, or paging and/or swapping problems;
–CPU, which reports CPU utilization above some threshold;
–COMM., which reports network errors or network volume above some threshold;
–And STORAGE, which might be a combination of:
a. User partition utilization above a threshold;
b. System partition utilization above a threshold;
c. File cache miss rate and allocation failures;
d. Disk I/O faults, which can add additional points to this Health Index component.
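
To make the "sum of five components" idea concrete, here is a minimal sketch in Python. It is not the Concord eHealth implementation: the point values, the 90% utilization threshold and the function names are all assumptions made for illustration only.

```python
# Hypothetical sketch: the index is the sum of points from five components.
# Component names follow the list above; all point values are assumed.
COMPONENTS = ("SYSTEM", "MEMORY", "CPU", "COMM", "STORAGE")


def storage_points(user_util_pct, system_util_pct, io_faults,
                   util_threshold_pct=90.0, points_per_problem=1):
    """Assumed scoring: one point for each storage problem detected."""
    points = 0
    if user_util_pct > util_threshold_pct:      # (a) user partition too full
        points += points_per_problem
    if system_util_pct > util_threshold_pct:    # (b) system partition too full
        points += points_per_problem
    if io_faults > 0:                           # (d) disk I/O faults add points
        points += points_per_problem
    return points


def health_index(points):
    """Sum the points of the five components; missing components count as 0."""
    unknown = set(points) - set(COMPONENTS)
    if unknown:
        raise ValueError(f"unknown components: {sorted(unknown)}")
    return sum(points.get(name, 0) for name in COMPONENTS)


server = {
    "CPU": 3,
    "STORAGE": storage_points(user_util_pct=95.0, system_util_pct=50.0, io_faults=2),
}
print(health_index(server))
```

With this scoring, a healthy server has an index of 0, and the index grows as more components report problems.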

I used that long ago; I do not currently have that collector in my environment.
But I have started calculating a "health index" my own way, based on the number and types of exceptions (e.g. hot ones are defects such as runaways; warning ones are just severe deviations from statistical norms; the number of hours/days with exceptions also matters). Filtering that by application (using the CMDB) gives you an idea of how stable each application is. Some elements of that approach appear in my other papers.
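
A rough sketch of such an exception-based index might look like the following. The weights, the tuple layout and the `server_to_app` mapping (standing in for a CMDB lookup) are hypothetical, chosen only to show the idea, and are not my actual calculation.

```python
from collections import defaultdict

# Assumed weights: a hot exception (a defect such as a runaway) counts more
# than a warning (a severe deviation from the statistical norm).
HOT_WEIGHT = 5
WARNING_WEIGHT = 1
HOUR_WEIGHT = 1  # each distinct hour with at least one exception adds a point


def exception_health_index(exceptions, server_to_app):
    """Score each application from its servers' exceptions (higher = less stable).

    exceptions: iterable of (server, severity, hour) tuples,
                where severity is "hot" or "warning".
    server_to_app: dict mapping server name to application (CMDB stand-in).
    """
    weighted = defaultdict(int)
    hours = defaultdict(set)
    for server, severity, hour in exceptions:
        app = server_to_app.get(server, "unknown")
        weighted[app] += HOT_WEIGHT if severity == "hot" else WARNING_WEIGHT
        hours[app].add(hour)
    return {app: weighted[app] + HOUR_WEIGHT * len(hours[app]) for app in weighted}


cmdb = {"srv1": "billing", "srv2": "billing", "srv3": "crm"}
events = [
    ("srv1", "hot", 1),
    ("srv1", "warning", 1),
    ("srv2", "warning", 3),
    ("srv3", "warning", 2),
]
print(exception_health_index(events, cmdb))
```

Comparing the scores across applications over the same period gives a simple ranking of which applications are least stable.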

2011 update: Another important idea in the paper is the suggestion to use Dynamic Thresholds, since high-level I/O-related metrics have no natural thresholds. Dynamic Thresholds have recently become popular, but I introduced them long ago!
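
As an illustration only, one common way to build a dynamic threshold is to derive it from the metric's own history, for example the mean plus a few standard deviations. The sketch below assumes that approach; the multiplier `k=3.0` and the sample I/O rates are made up, and the paper's exact method may differ.

```python
import statistics


def dynamic_threshold(history, k=3.0):
    """Upper threshold computed from history: mean + k standard deviations."""
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history)  # population standard deviation
    return mean + k * stdev


def is_exception(value, history, k=3.0):
    """True if the new value exceeds the threshold learned from history."""
    return value > dynamic_threshold(history, k)


# Made-up I/O rates for the same hour on previous days.
io_rate_history = [100.0, 110.0, 90.0, 105.0, 95.0]
print(is_exception(130.0, io_rate_history))
print(is_exception(115.0, io_rate_history))
```

Because the threshold is recomputed from recent data, it follows the normal behavior of each metric instead of relying on a fixed number that high-level I/O metrics do not naturally have.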