Monday, August 14, 2017

I will present at the #imPACt2017 conference - "The Model Factory - Correlating Server and Database Utilization with Customer Activity"

The abstract and more info can be found here:

US Patent "SYSTEMS AND METHODS FOR MODELING COMPUTER RESOURCE METRICS", I Trubin et al


The presentation is scheduled in New Orleans, Louisiana, at the Loews New Orleans Hotel:
Session Number: 362
Subject Area: CAP
Session Date and Time: 11/8/2017, 2:20 PM-2:50 PM
Room Assignment: Beauregard

See conference details here: http://cmgimpact.com/ 
You are welcome to attend!

Thursday, July 27, 2017

Igor = I go R. I have redeveloped SETDS on R = SonR

The first attempt to go from SAS to R:

- R script to run in SAS: one more way to build an IT-Control Chart


The first attempt to build SEDS Control Charts using R:


My proposal to build SETDS on any open-source platform (including R):

Using the RODBC package against MySQL data to build SEDS control charts:
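A minimal sketch of what such a script could look like (the ODBC DSN name "mysql_seds", the table cpu_hourly, and its columns are hypothetical placeholders, not the actual setup from the post):

  # Pull hourly CPU utilization from MySQL via RODBC and draw a simple
  # weekly-profile control chart (all DSN/table/column names are made up).
  library(RODBC)

  ch  <- odbcConnect("mysql_seds")                       # ODBC DSN for the MySQL source
  cpu <- sqlQuery(ch, "SELECT ts, cpu_util FROM cpu_hourly")
  odbcClose(ch)

  cpu$ts  <- as.POSIXct(cpu$ts)
  lt      <- as.POSIXlt(cpu$ts)
  cpu$how <- lt$wday * 24 + lt$hour                      # hour of week: 0..167

  m <- tapply(cpu$cpu_util, cpu$how, mean)               # baseline mean per hour of week
  s <- tapply(cpu$cpu_util, cpu$how, sd)                 # baseline sd per hour of week

  plot(m, type = "l", ylim = range(m - 3 * s, m + 3 * s, na.rm = TRUE),
       xlab = "hour of week", ylab = "CPU utilization",
       main = "SEDS-style control chart")
  lines(m + 3 * s, lty = 2)                              # upper control limit
  lines(m - 3 * s, lty = 2)                              # lower control limit

  last <- tail(cpu, 168)                                 # most recent week of data
  o    <- order(last$how)
  lines(last$how[o] + 1, last$cpu_util[o], col = "red")  # actual last week vs. baseline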


SETDS is actually a two-level (1. exception detection and 2. trend detection) machine-learning-based anomaly detection method. It competes with other anomaly detection methods that are increasingly being implemented in R:
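As a rough sketch of the two levels (this is not the actual SETDS/SonR code; the simulated data, the hour-of-week baseline, and the linear trend test on weekly exception counts are my own simplifications):

  # Level 1: flag hourly values that fall outside an hour-of-week baseline.
  # Level 2: test whether the weekly count of such exceptions is trending.
  set.seed(1)
  stamp  <- seq(as.POSIXct("2017-01-01"), by = "hour", length.out = 24 * 7 * 12)
  metric <- 50 + 10 * sin(2 * pi * as.numeric(stamp) / 86400) + rnorm(length(stamp), sd = 3)

  lt   <- as.POSIXlt(stamp)
  how  <- lt$wday * 24 + lt$hour                  # hour of week: 0..167
  week <- as.integer(format(stamp, "%U"))         # week number within the year

  # Level 1: exception = value outside mean +/- 3 sd for its hour of week
  m <- ave(metric, how, FUN = mean)
  s <- ave(metric, how, FUN = sd)
  exception <- metric > m + 3 * s | metric < m - 3 * s

  # Level 2: is the number of exceptions per week trending up?
  ev    <- as.numeric(tapply(exception, week, sum))
  trend <- lm(ev ~ seq_along(ev))
  summary(trend)$coefficients                     # a positive, significant slope suggests a trend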

Finally, SETDS was implemented in R and named SonR.

Friday, July 14, 2017

10-year anniversary of running my tech blog - 212th post. SUBSCRIBE!

Time flies. 10 years ago in June 2007 I wrote: 

"To keep the discussion about how to Manage computer Systems by Exception (e.g. by  using SPC, APC, MASF, 6-SIGMA, SETDS and other techniques), I run this blog and also publish/present white papers at the www.CMG.org. ..."



And now I am posting the 212th post... All in all, that makes my dream of publishing a monographic book come true, and I am happy that my "virtual" on-line "book" has been visited ~135,000 times:



You can see the history has its ups and downs, following my career path. I still plan to keep posting...

Thank you for viewing!


Thursday, June 29, 2017

My CMG'05 paper was cited in the PhD thesis "Finding External Indicators of Load on a Web Server via Analysis of Black-Box Performance Measurements"

Author:
  from  MarkLogic Corporation
Thesis for: PhD, Advisor: Dr. Alva Couch

ABSTRACT:

Traditional methods for system performance analysis have long relied on a mix of queuing theory, detailed system knowledge, intuition, and trial-and-error. These approaches often require construction of incomplete gray-box models that can be costly to build and difficult to scale or generalize. In this thesis, we present a black-box analysis method to discover the amount of load on a web server with minimal knowledge of its internal mechanisms. In contrast to white-box analysis, where a system's internal mechanisms can help to explain its behavior, black-box analysis relies on external measurements of a system's reactions to well-understood inputs. The primary advantages of black-box analysis are its relative independence from specific architectures, its applicability to opaque environments (e.g., closed-source systems), and its scalability. In this thesis, we show that statistical analyses of web server response times can be used to discover which server resources are stressed by particular workloads. We also show that under certain conditions, the settling period of server response times after resource perturbation correlates positively with the degree of perturbation. Finally, we use the two-sample Kolmogorov-Smirnov (KS) test to measure statistical equality of multiple samples drawn from response times of a server under various steady-state load conditions. We show that in specific circumstances, the number of samples that test as statistically equal can serve as an imprecise indicator of the amount of load on a server. All of these contributions will aid performance analysis in new environments such as cloud computing, where internal server mechanisms and configurations change dynamically and structural information is hidden from users.

Finding External Indicators of Load on a Web Server via Analysis of Black-Box Performance Measurements. Available from: https://www.researchgate.net/publication/230707525_Finding_External_Indicators_of_Load_on_a_Web_Server_via_Analysis_of_Black-Box_Performance_Measurements [accessed Jun 29, 2017].
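The two-sample KS test mentioned in the abstract is available in base R as ks.test(); a toy comparison of two simulated response-time samples (the log-normal parameters are made up purely for illustration) would look like this:

  # Compare response-time samples taken under two simulated load conditions
  # with the two-sample Kolmogorov-Smirnov test.
  set.seed(42)
  rt_low  <- rlnorm(500, meanlog = log(0.10), sdlog = 0.3)   # ~100 ms responses
  rt_high <- rlnorm(500, meanlog = log(0.15), sdlog = 0.3)   # ~150 ms responses

  ks.test(rt_low, rt_high)   # a small p-value means the samples differ statistically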

Cited: CMG'2005 paper:


Friday, June 16, 2017

The #DynamicThreshold is common art by now...

I received the following feedback on the previous post about a capacity management tool from one of this blog's post authors:

"As far as any similarity to my own work, I think my methods for dynamic thresholds are common art by now… and most certainly derived from Igor’s own presentations that I attended 😊"

So I am proud of having made some influence!

Tuesday, June 6, 2017

Re-posting #CMGamplify - "#DataScience Tools for Infrastructure Operational Intelligence"

The following CMG Amplify blog post, written by Tim Browning, is interesting as it underlines what this "System Management by Exception" blog has always been about:

"...In order for the performance analyst to attend to troubled systems that may number in the thousands, it is imperative that we filter out of this vast ocean of time series metrics only those events that are anomalous and/or troubling to operational stability. It is too overwhelming to sit and look at thousands of hourly charts and tables. In addition, there is a need for continuous monitoring capability that detects problems immediately or, better yet, predicts them in the near term.  Increasingly, we need self-managing systems that learn and adapt to complex continuous activities and quickly identify the causal reconstruction of threatening conditions as well as recommend solutions (or even automatically deploy remediation events).  Out of necessity, this is where we are heading..."

and in order to achieve that:

"..In data mining, anomaly detection (also known as outlier detection) is the search for data items in a dataset which do not conform to an expected pattern. Anomalies are also referred to as outliers, change, deviation, surprise, aberrant, peculiarity, intrusion, etc. Most performance problems are anomalies. Probably the most successful techniques (so far) would be Multivariate Adaptive Statistical Filtering (MASF) for detecting statistically extreme conditions and Relative Entropy Monitoring for detecting unusual changes in patterns of activity..."

See entire post here: https://www.cmg.org/2017/06/data-science-tools-infrastructure-operational-intelligence/
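As a toy illustration of the relative-entropy idea from the quote (this is not Tim Browning's implementation; the baseline window, bin count, and simulated data are my own assumptions), one can bin a metric's baseline and current distributions and compute the divergence between them:

  # Relative-entropy (KL divergence) check: compare today's distribution of a
  # metric against a baseline distribution over shared histogram bins.
  set.seed(7)
  baseline <- rnorm(24 * 30, mean = 50, sd = 5)   # a month of hourly samples
  today    <- rnorm(24,      mean = 58, sd = 5)   # today's samples, shifted upward

  breaks <- seq(min(baseline, today), max(baseline, today), length.out = 21)
  p <- hist(baseline, breaks = breaks, plot = FALSE)$counts + 1   # +1 avoids zero bins
  q <- hist(today,    breaks = breaks, plot = FALSE)$counts + 1
  p <- p / sum(p); q <- q / sum(q)

  sum(q * log(q / p))   # relative entropy of today vs. baseline; larger = more unusual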



Wednesday, May 17, 2017

Eli Hizkiyev: ConicIT – Ground Breaking Technology Unleashed. Sense and Respond? Why Not Predict and Prevent?

ConicIT Summary

Take your existing performance monitoring environment to the next level. ConicIT, a software solution, reads thousands of performance and stability metrics per minute from your performance monitors such as TMON, Omegamon, Mainview, Sysview and others. ConicIT processes and analyzes these metrics with machine learning technology and automatically generates alerts about problems that even seasoned, professional performance staff might not notice when looking at the same data.

Beyond the automatic analytics and alerts, ConicIT provides an efficient and friendly web interface which allows you to browse through relevant performance data in an aggregated way, including viewing values and graphs from the very moment a problem occurs. With ConicIT in place, you won't need to tediously jump between many different monitors or screens. ConicIT aggregates the data from different sources into a single view, so you can view the data easily and receive either high-level or low-level insights into your application performance.

ConicIT also creates important calculated variables. Examples include ratios, summaries, and critical information such as taking the cumulative CPU time of a job or transaction and calculating its real-time CPU consumption. Much of this information is missing from the monitors themselves. The real-time CPU consumption is calculated from the rate at which the cumulative CPU time rises during each minute.
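The rate calculation described above is essentially the first difference of a cumulative counter over each sampling interval; a minimal sketch with made-up numbers:

  # Derive per-minute CPU consumption from a cumulative CPU-time counter
  # sampled once a minute (the values below are invented for illustration).
  cum_cpu_sec <- c(120.0, 120.9, 122.4, 125.6, 131.2)   # cumulative CPU seconds
  cpu_per_min <- diff(cum_cpu_sec)                      # CPU seconds used each minute
  100 * cpu_per_min / 60                                # utilization of one CPU, in percent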

One of the major advantages of ConicIT is its dynamic alerts, which are based on machine learning and statistical algorithms. Traditional monitors offer simple static alerts based on thresholds. But static alerts always come too late, and most of them are false alerts. ConicIT solves this problem with its advanced algorithms. ConicIT automatically studies the typical behavior of each metric for every day of the week and every hour of the day, so ConicIT knows (and shows you) the expected range for each performance metric. ConicIT also learns how stable each variable is and how often and for how long it may be out of its normal range. Based on this analysis, ConicIT recognizes when there is an anomaly in one or more metrics. In such a case, ConicIT will send you an alert with information and graphs about the problem. These proactive alerts arrive much earlier and are more accurate than those from any static-alert performance system. ConicIT gives you time to solve problems before they affect your end users, clients and customers.

The combination of early proactive alerts when a problem starts, along with supportive information and graphs, allows you and your team to quickly pinpoint where the problem started and which team should work on resolving it. Thus, ConicIT reduces the need for war rooms to fix problems and shortens the mean time to repair.

Figure 1: 30 hours graph


Figure 2: It takes a single click (on the left menu) to switch and view any type of information from any point in time

Wednesday, May 3, 2017

The effect of outliers on statistical properties - Anscombe's quartet

Anscombe's quartet comprises four datasets that have nearly identical simple descriptive statistics, yet appear very different when graphed. Each dataset consists of eleven (x,y) points. They were constructed in 1973 by the statistician Francis Anscombe to demonstrate both the importance of graphing data before analyzing it and the effect of outliers on statistical properties. He described the article as being intended to attack the impression among statisticians that "numerical calculations are exact, but graphs are rough."[1]

Source: https://en.wikipedia.org/wiki/Anscombe%27s_quartet
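As noted below, the quartet ships with base R (data(anscombe)), so the claim about (nearly) identical descriptive statistics is easy to verify:

  # The four Anscombe sets share nearly identical means, variances,
  # correlations and regression slopes, yet look completely different when plotted.
  data(anscombe)
  sapply(1:4, function(i) {
    x <- anscombe[[paste0("x", i)]]
    y <- anscombe[[paste0("y", i)]]
    c(mean_x = mean(x), mean_y = mean(y),
      var_x  = var(x),  var_y  = var(y),
      cor_xy = cor(x, y),
      slope  = coef(lm(y ~ x))[2])
  })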


(You can easily check this in R by loading the data with data(anscombe).) But what you might not realize is that it's possible to generate bivariate data with a given mean, median, and correlation in any shape you like — even a dinosaur:


Source: The Datasaurus Dozen
Posted: 02 May 2017 08:16 AM PDT
(This article was first published on Revolutions, and kindly contributed to R-bloggers)