System Management by Exception: Power of Control Charts and IT-Chart Concept (Part 1)

Thursday, September 29, 2011

Power of Control Charts and IT-Chart Concept (Part 1)

This is a video presentation about control charts, based on a workshop I have already run a few times. It shows how to read and use control charts for reporting on and analyzing IT systems performance (e.g. servers, applications). My original IT-(Control) Chart concept within SEDS (Statistical Exception Detection System) is also presented.

Part 2 will be about how to build a control chart using R, SAS, BIRT and just

If anybody is interested, I would be happy to conduct this workshop again, remotely via the Internet or in person. Just post a request or a comment here.

UPDATE: See the version of this presentation with the Russian narration:

Igor Trubin

He started in 1979 as an IBM/370 system engineer. In 1986 he got his PhD in Robotics at St. Petersburg Technical University (Russia) and then worked as a professor teaching CAD/CAM and Robotics for 12 years. He published 30+ papers and made several presentations at conferences in the Robotics and Artificial Intelligence fields. In 1999 he moved to the US and worked at Capital One bank as a Capacity Planner. His first CMG.org paper was written and presented in 2001. The next one, "Exception Detection System Based on MASF Technique," won a Best Paper award at CMG'02 and was presented at UKCMG'03 in Oxford, England. He made other technical presentations at IBM z/Series Expo, SPEC.org, and Southern and Central Europe CMG, and ran several workshops covering his original method of Anomaly and Change Point Detection (Perfomalist.com). He is the author of the “Performance Anomaly Detection” class (at CMG.com). He worked for 2 years as the Capacity team lead for IBM, for SunTrust Bank for 3 years, and then at IBM for 3 years as Sr. IT Architect. Now he works for Capital One bank as an IT Manager in Cloud Engineering, and since 2015 he has been a member of the CMG.org Board of Directors. He runs the UT channel iTrubin.

4 comments:

Igor Trubin, October 29, 2011
Igor, good information, and I appreciate the formal and thorough presentation materials, which give us a chance to think about this in depth. I have used similar charting and methodologies on many levels for mainframe performance and capacity planning. In addition to performance-related analysis, one area where I have found this approach particularly useful and important is developing the baselines needed for each major forecast/capacity review cycle. First, not only is the baselining process key to an effective forecast, it also allows the capacity planners to update their knowledge and expertise on the various LPARs / resources and workload / application usage since the last cycle. Second, this type of statistical data analysis and charting is integral to scrubbing the baseline and removing one-time or invalid anomalies from the "typical peak day" baseline needed to build a forecast. Third, the baselining process should be done at a level of analysis that will identify and address unusual or trending anomalies and unplanned growth before significant business impacts occur. Fourth, it is especially important to assess and validate the previous forecasts using the new baselines, to calculate plus / minus utilization deltas and update "forecast accuracy" tracking data. Lastly, ongoing use of the charts is key to completing the monthly mini-forecasts needed to track the status of workload growth, upgrade schedules, and associated plans. Thanks again for your continued good work and thinking.
Posted by Jack (John R.)
Igor Trubin, October 29, 2011
I think the idea for this type of video is great. You are the zen master of this methodology, and I thank you for sharing your knowledge and wisdom in this area.

While I loved the content of the presentation, I think the format made it hard to follow. I don't know if you intended for us to do this, but I paused on most of the slides to read and absorb the information. I was actually hoping for comments from you about each slide. The hammer-like sound you used for slide transitions should be replaced with something softer.

Again, I really loved the content, but I think it would be a better learning tool if you had an audio track to augment the slides.
Igor Trubin, October 29, 2011
I watched the Video and think that you are on a great path.
I would just make one suggestion though: it appears that in many instances you aren't using control charts in the statistical sense; they are more like line graphs that depict trends.
To take this to the next step and truly use a control chart, I would establish control limits (or let a good program do it for me) and have planned responses to the actions.
Possibly put the data into a program such as JMP, Minitab, or Statgraphics (not endorsing or advertising any of them), apply the "tolerance" bands of what is functional to the data / feature being measured and analyzed, and see if there are any out-of-control points within the data trends. This provides a more methodical response to an "occurrence" as depicted by the data, and not just a subjective response to a perceived problem / difference at a point in time.
Just a thought. Again, nice work. It is this out-of-the-box thinking in applying statistical techniques to processes that are not limited to manufacturing that will create cost savings as a whole for many businesses and in turn consumers, and not just the "factory" guys.
Good job.

Posted by Ray
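
As an illustration of the suggestion in this comment (establish control limits, then look for out-of-control points), here is a minimal Python sketch. It is not the method from the presentation or from any of the tools named above. It assumes the simplest textbook rule, a center line at the baseline mean with limits at plus or minus three sample standard deviations, and the function names and CPU utilization numbers are hypothetical, invented for the example.

```python
from statistics import mean, stdev

def control_limits(baseline, sigmas=3.0):
    """Return (lcl, center, ucl) computed from baseline samples.

    Assumption: limits are center +/- sigmas * sample standard deviation.
    """
    center = mean(baseline)
    spread = stdev(baseline)
    return center - sigmas * spread, center, center + sigmas * spread

def out_of_control(values, lcl, ucl):
    """Return (index, value) pairs that fall outside the control limits."""
    return [(i, v) for i, v in enumerate(values) if v < lcl or v > ucl]

# Hypothetical hourly CPU utilization (%), invented for illustration only.
baseline = [41, 43, 40, 44, 42, 39, 45, 43, 41, 42]
recent = [42, 44, 58, 41, 40]

lcl, center, ucl = control_limits(baseline)
exceptions = out_of_control(recent, lcl, ucl)

print(f"LCL={lcl:.1f} center={center:.1f} UCL={ucl:.1f}")
print("Out-of-control points:", exceptions)
```

Only the third recent value (58) falls outside the limits here. A "planned response" in the sense of the comment could then be attached to each flagged point instead of reacting to the chart by eye.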
Igor Trubin, October 29, 2011
I liked the presentation and would like to see the next part. I do have questions about the general applicability.

Have you read my 2009 CMG paper on applying survival analysis to computer systems? Normality of response times cannot be assumed. What do you think? Part of my goal, not discussed in the paper, was to automate real-time exception reporting.

I have previously read your papers on MASF and thought they were very good.

Thank you
Brian
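
The point in this comment about non-normal response times can be illustrated with a small Python sketch. This is not the survival analysis approach from the paper mentioned above. It is a simpler stand-in that takes an upper limit from an empirical percentile of the baseline, so no distribution shape is assumed. The nearest-rank percentile rule, the 95% level, the function name, and the response time values are all assumptions made for the example.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of the samples are less than or equal to it (p is an int, 1..100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical right-skewed response times (ms), invented for illustration only.
baseline = [110, 120, 115, 130, 125, 118, 122, 140, 135, 128,
            119, 121, 300, 117, 126, 132, 124, 116, 450, 123]
recent = [127, 133, 480, 121]

# Upper limit taken directly from the data instead of mean + 3 * sigma.
upper = percentile(baseline, 95)
flagged = [(i, v) for i, v in enumerate(recent) if v > upper]

print("Upper limit (95th percentile):", upper)
print("Exceptions:", flagged)
```

With a long right tail like this one, a percentile limit follows the observed data, whereas limits built from a mean and standard deviation implicitly treat the distribution as symmetric.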