

Showing posts with label IT-Chart. Show all posts

Wednesday, March 6, 2013

IT-Control Charts: How to Read, How to Use

DEMO is below:



YouTube version of the presentation:

Tuesday, March 27, 2012

R-Script to Aggregate (ETL to MySQL) Actual data with Base-line data for IT-Control Charts

In my previous post (R-script to plot IT-Control Chart against MySQL), the task was set to write an R script that pre-processes (ETL) the raw date-hour-stamped data into the data-cube format needed for control charting.

Here is the solution:
I simply transformed the already developed SQL script into the RODBC-based R script, which can be seen below:

The result of the script run is the "ActualVsHistorical" table in the servermetrics database on MySQL, with data identical to that used for plotting the IT-Control Chart published in the previous post. The data itself can be viewed by simply typing the data frame name in the R console window:
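The aggregation the R/SQL script performs can be sketched in plain Python (a hedged sketch, not the actual script: sqlite3 stands in for the MySQL/RODBC connection, the raw data is synthetic, and only the table name ActualVsHistorical is taken from the post):

```python
import sqlite3
from datetime import datetime, timedelta
from statistics import mean, stdev

# Hypothetical raw data: (timestamp, cpu_util) pairs, one per hour.
# In the post this lives in a MySQL table; here it is generated so the
# sketch is self-contained.
raw = []
start = datetime(2011, 1, 1)
for h in range(24 * 7 * 5):                # 5 weeks of hourly data
    t = start + timedelta(hours=h)
    base = 40 + 20 * (9 <= t.hour <= 17)   # busier during working hours
    raw.append((t, base + (h * 7) % 5))    # deterministic "noise"

cutoff = start + timedelta(weeks=4)        # last week = actual, rest = baseline

# Group the baseline by weekhour (0..167) for per-hour statistics.
by_weekhour = {}
for t, v in raw:
    if t < cutoff:
        by_weekhour.setdefault(t.weekday() * 24 + t.hour, []).append(v)

# Write baseline (mean, UCL) joined with the actual week into the table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ActualVsHistorical (weekhour INT, avg REAL, ucl REAL, actual REAL)")
for t, v in raw:
    if t >= cutoff:
        wh = t.weekday() * 24 + t.hour
        hist = by_weekhour[wh]
        m, s = mean(hist), stdev(hist)
        con.execute("INSERT INTO ActualVsHistorical VALUES (?,?,?,?)", (wh, m, m + 3 * s, v))

n, = con.execute("SELECT COUNT(*) FROM ActualVsHistorical").fetchone()
print(n)   # one row per hour of the actual week -> 168
```

The output table has exactly the shape the IT-Control Chart needs: one row per weekhour with baseline average, control limit, and the overlaid actual value.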

So, all the main elements of the SEDS-lite project have been prototyped and published in my posts. Maybe one more task is left: to illustrate in R how an exception list of objects (servers), based on EV meta-metric filtering, can be created as part of anomaly detection. So far that has been done and published in this blog only in "DB2"-like SQL format to run within BIRT. See the post about that here: UCL=LCL: How many standard deviations do we use for Control Charting? Use ZERO!


Tuesday, December 27, 2011

IT/EV-Charts as an Application Signature: CMG'11 Trip Report, Part 1


I have attended the following CMG’11 presentation (see my previous post):

A Way to Identify, Quantify and Report Change
Richard Gimarc (CA Technologies, Inc.) and Kiran Chennuri (Aetna Life Insurance Company)

Identifying change in application performance is a time consuming task. Businesses today have
hundreds of applications and each application has hundreds of metrics. How do you wade
through that mass of data to find an indication of change? This paper describes the use of an
Application Signature to identify, quantify and report change. A Signature is a compact
description of application performance that is used much like a template to judge if a change has
occurred. There are a concise set of visual indicators generated by the Signature that supports
the identification of change in a timely manner.

Here are my comments.

I like the idea of building an application characteristic called an Application Signature. As described in the paper, it is based on typical (standard) deviations of capacity usage during the peak hours of a day.

Looking closely at the approach, I see it is similar to the one I developed for SEDS, but a bit too simplified. Anyway, it is a great attempt to use the SEDS methodology to watch application capacity usage.

I think the weekly IT-CONTROL CHART (see another previous post) is a way to compare the usual weekly profile with the last 168 hours of data (baseline vs. actual), so the baseline in the IT-Control Chart format, without the actual data, IS AN APPLICATION SIGNATURE, but in a much more accurate form. It even looks like somebody's signature:

The actual data could be significantly different, as seen below:

And that difference should be automatically captured by a SEDS-like system as an exception, with the deviation from the "Signature" calculated using the EV meta-metric, either as a weekly sum of hourly EV values or as an EV-Control Chart like the one shown here.

For instance, in this example week the application took a bit more than 23 unusual CPU hours, as calculated below:

So, if the weekly EV number is 0, that means the application (server, LPAR and so on) has recently stayed within its IT-Signature, which is GOOD – no changes happened!

The paper also shows the "calendar view" report, which consists of a set of daily control charts. That is another good idea. I used that approach before I switched to weekly IT-charts that cover 1/4 of a month, or bi-weekly ones that cover 1/2 of a month. So if you have IT-charts, there is no need for the "calendar view", which is sometimes not easy to read.

Another feature could be important for capacity usage estimates: the balance of hourly capacity usage for the day or week vs. the overall average (e.g. weekdays vs. weekends, or the daily "cowboy hat" profile with its lunch-time drop). That could be an additional IT-Signature feature. Another CMG'11 paper presented an interesting approach to analyzing and calculating that. I plan to publish my comments about that paper, so please check my next post soon.

Friday, October 7, 2011

EV-Control Chart

I introduced the EV meta-metric in 2001 as a measure of anomaly severity. EV stands for Exception Value, and more explanation of the idea can be found here: The Exception Value Concept to Measure Magnitude of Systems Behavior Anomalies.
Basically, it is the difference (integral) between the actual data and the control limits. So far I have used EV data mostly to filter out real issues or for automatic hidden-trend recognition. For instance, in my CMG'08 paper "Exception Based Modeling and Forecasting" I plotted that metric using Excel to explain how it could be used to recognize the starting point of a new trend. Here is the picture from that paper, where EV is called "Extra Volume" and, for the particular parent metric (CPU utilization), is named ExtraCPUtime:

The EV meta-metric first chart 

But just plotting that meta-metric and/or its two components (EV+ and EV-) over time gives a valuable picture of system behavior. If the system is stable, the chart should be boring, showing near-zero values all the time. So with this chart it is very easy (I believe even easier than with MASF control charts) to recognize an unusual and statistically significant increase or decrease in the actual data at a very early stage (Early Warning!).
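As a minimal, self-contained sketch of the EV idea (in Python; the constant limits and the data are made up, standing in for the per-weekhour baseline limits used in SEDS):

```python
# EV (Exception Value) = area between the actual data and the control
# limits. Hypothetical constant limits for simplicity.
ucl, lcl = 70.0, 30.0

# One week (168 hours) of hypothetical actual CPU utilization (%):
# mostly inside the limits, with a 10-hour burst above UCL mid-week.
actual = [50.0] * 168
for h in range(80, 90):
    actual[h] = 85.0           # 10 hours at 15 points above UCL

ev_plus  = sum(max(0.0, v - ucl) for v in actual)   # excess above UCL
ev_minus = sum(max(0.0, lcl - v) for v in actual)   # deficit below LCL
ev = ev_plus - ev_minus

# With CPU utilization in percent, ev_plus / 100 is the "extra" CPU time
# in CPU-hours (the ExtraCPUtime of the CMG'08 paper).
print(ev_plus, ev_minus, ev)   # 150.0 0.0 150.0
```

A stable system keeps both components near zero hour after hour, which is exactly why the EV-chart makes a change stand out so early.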

Here is an example of the EV-chart against the same sample data used in a few previous posts:
1. Excel example: 

2.  BIRT/MySQL example as a continuation of the exercise from the previous post:

IT-Control chart vs. EV-Chart
Here are the BIRT screenshots that illustrate how it is built:

A. An additional query to calculate EV, written directly in an additional BIRT Data Set object called "Data set for EV Chart":

SQL query to calculate the EV meta-metric from the data kept in the MySQL table

B. Then an additional bar-chart object is added to the report and bound to the new "Data set for EV Chart":
The resulting report is already shown above.





Tuesday, October 4, 2011

Building IT-Control Chart by BIRT against Data from the MySQL Database

This is just another way to build an IT-Control Chart, assuming the raw data is in a real database such as MySQL. In this case, some SQL scripting is used.

1. The raw data is hourly CPU utilization, actually the same as in the previous posts: BIRT based Control Chart and One Example of BIRT Data Cubes Usage for Performance Data Analysis. (See the raw data picture here.)

2. That raw data needs to be uploaded to a table (CPUutil) in the MySQL schema (ServerMetric) using the following script (sqlScriptToUploadCSVforSEDS.sql):

The uploaded data is seen at the bottom of the picture.

3. Then the output (result) data (the ActualVsHistoric table) is built using the following script (sqlScriptToControlChartforSEDS.sql):
A fragment of the result data is also seen at the bottom of the picture. Everything is ready for building the IT-Control Chart, and the data is actually the same as that used in BIRT based Control Chart, so the result should be the same as well. Below is a more detailed explanation of how that was done.

4. First, using BIRT, the connection to the MySQL database is established (to MySQLti, schema ServerMetrics, table ActualVsHistorical):

5. Then the chart is developed the same way as was done in the BIRT based Control Chart post:


6. A nice thing is that in BIRT you can specify report parameters, which can then be used in any constants, including for filtering (to change the baseline or to provide server or metric names). Finally, the report should be run to get the following result, which is almost identical to the one built for the BIRT based Control Chart post:




Thursday, September 29, 2011

Power of Control Charts and IT-Chart Concept (Part 1)


This is a video presentation about control charts. It is based on my workshop, which I have already run a few times. It shows how to read and use control charts for reporting and analyzing IT systems performance (e.g. servers, applications). My original IT-(Control) Chart concept within SEDS (Statistical Exception Detection System) is also presented.

Part 2 will be about how to build a control chart using R, SAS, BIRT, and just Excel.


If anybody is interested, I would be happy to conduct this workshop again, remotely via the Internet or in person. Just put a request or simply a comment here.



UPDATE: See the version of this presentation with the Russian narration:

Thursday, September 22, 2011

One Example of BIRT Data Cubes Usage for Performance Data Analysis

I got a comment on my previous post "BIRT based Control Chart" with questions about how the data are actually prepared in BIRT for control charting. To address this request, I'd like to share how I use a BIRT Cube to populate data into a CrossTab object, which is then used for building a control chart.


As I explained in my CMG paper (see IT-Control Chart), the data that describes the IT-Control Chart (or MASF control chart) actually has 3 dimensions (2 time dimensions and one measurement, the metric, as seen in the picture at the left). The control chart is just a projection onto a 2D cut, with the actual (current or last) data overlaid. So, naturally, the OLAP cube data model (data cubes) is suitable for grouping and summarizing time-stamped data into a cross-table for further analysis, including building a control chart. In past SEDS implementations I did not use the cube approach and had to transform time-stamped data for control charting using basic SAS steps and procs. Now I find that using data cubes is somewhat simpler and in some cases requires no programming at all if modern BI tools (such as BIRT) are used.
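The cube-then-projection idea can be sketched in plain Python (a sketch with synthetic data; the (weekday, hour) pair stands in for the weekday/weekhour dimensions used later in the BIRT cube):

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical hourly samples: (timestamp, cpu_util).
rows = []
t0 = datetime(2011, 1, 3)                      # a Monday
for h in range(24 * 7 * 4):                    # 4 weeks of history
    t = t0 + timedelta(hours=h)
    rows.append((t, 30 + t.hour + (h % 3)))    # deterministic sample values

# "Cube" step: group the measure by the two time dimensions (weekday, hour).
cube = {}
for t, v in rows:
    cube.setdefault((t.weekday(), t.hour), []).append(v)

# "CrossTab" step: project to 2D -- one row per hour, one column per
# weekday, each cell holding the aggregated (averaged) metric.
crosstab = {hour: [mean(cube[(wd, hour)]) for wd in range(7)]
            for hour in range(24)}

print(len(crosstab), len(crosstab[0]))   # 24 rows x 7 columns
```

The control chart is then just this cross-table unrolled along the weekhour axis, with the actual week's data drawn on top.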

Below are some screenshots with comments that illustrate the process of building the IT-Control Chart using a BIRT Cube.



The data source (input data) is a table with a single date/hour-stamped metric covering at least 4 months of history (in this case, the CPU utilization of some Unix box). It could be in any database format; in this particular example it is the following CSV file:

The result (in the form of a BIRT report designer preview) is shown in the following picture (where UCL is the Upper Control Limit; LCL is not included for simplicity):

Before building the cube, the following three data sets were built using the BIRT "Data Explorer":
(1) The reference set, or baseline (just "Data Set" in the picture), is based on the input raw data with some filtering and computed columns (weekday and weekhour), and
(2) the actual data set, which is the same but has a different filter: (raw["date"] Greater "2011-04-02")


(3) To combine both data sets for comparing baseline vs. actual, "Data Set1" is built as a "Joint Data Set" using the BIRT query builder:
Then the data cube was built in the BIRT Data Cube Builder with the structure shown on the following screen:
Note that only one dimension is used here – weekhour – as that is all that is needed for the cross-table report below.

The next step is building the report, starting with the Cross Table (picked as an object from the BIRT report designer "Palette"):
The picture above also shows which fields are chosen from the cube for the cross table.

The final step is dropping a "Chart" object from the "Palette" and adding the UCL calculation, using the Expression Builder, for an additional Value (Y) Series:

To see the result, one just needs to run the report or use the "Preview" tab in the report designer window:

                FINAL COMMENTS

- The BIRT report package can be exported and submitted for running under any portal (e.g. IBM TCR).
- Additional cube dimensions, such as server name and/or metric name, make sense to specify and use.
- The report can be designed in BIRT with some parameters; for example, a good idea is to use the server name as a report parameter.
- To follow the "SEDS" idea and base the reporting process on exceptions, a preliminary exception detection step is needed; it can again be done within a BIRT report using an SQL script similar to the one published in one of the previous posts:
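That preliminary exception detection step can be sketched as follows (in Python rather than SQL; the server names, EV values, and threshold are all made up for illustration):

```python
# Hypothetical weekly EV totals per server; in a SEDS-like setup this
# would come from a query against the EV data in the database.
weekly_ev = {
    "srv01": 0.0,      # stayed within its IT-Signature
    "srv02": 23.4,     # unusual CPU hours -> exception
    "srv03": 1.1,
    "srv04": 310.0,
}

threshold = 5.0        # hypothetical severity cutoff

# The exception list: servers whose weekly EV exceeds the threshold,
# worst first -- the input for an exception-based report.
exceptions = sorted(
    (name for name, ev in weekly_ev.items() if ev > threshold),
    key=lambda name: -weekly_ev[name],
)
print(exceptions)   # ['srv04', 'srv02']
```

Only the servers on this list need detailed IT-Control Chart reports; everything else is behaving as its signature predicts.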


   

Monday, November 15, 2010

My CMG'10 presentation - "IT-Control Charts"

I will go to the CMG conference this time for only one day, just to present my paper "IT-Control Charts" on Wednesday, December 8th at 10:30 – you are WELCOME!

Check it in the CMG conference agenda  - http://www.cmg.org/cgi-bin/agenda_2010.pl?action=more&token=5030

For Russian readers (information in Russian is here), I made a post about the event in my Russian mirror blog: http://ukor.blogspot.com/2010/11/cmg10_15.html

Tuesday, July 27, 2010

My new CMG'10 paper "IT-Control Charts" was accepted to be published and presented

My new (10th) CMG paper "IT-Control Charts" was accepted and will be presented in Orlando on Wednesday, December 8th at 10:30. It is a paper version of my workshop "Power of Control Chart: How to Read, How to Build, How to Use", which I ran several times last year.

    IT-Control Charts
    The Control Chart, originally used in mechanical engineering, has become one of the main Six Sigma tools to optimize business processes, and after some adjustments it is used in IT Capacity Management, especially in "behavior learning" products. The paper answers the following questions: What is the Control Chart and how to read it? Where is the Control Chart used (with a review of some performance tools that use it)? Control chart types: MASF charts vs. SPC; the IT-Control Chart for IT application performance control. How to build a Control Chart using Excel for interactive analysis and R to do it automatically?