Popular Post


Monday, October 10, 2011

Is Anomaly Detection Similar to Exception Detection? Apply SEDS for Information Security!

Sometimes I call my "Exception Detection" as "Anomaly Detection".  In some cases the performance degradation could be caused by parasite program (like badly written data collection agent ) or incompetent user (like submitting badly written ad-hock  database query) or even by a cyber attack (denial-of-service attack -DoS definitely  degrades performance to absolutly not performing, doesn't it?)

So it is similar by my opinion and the Exception Detection methodology I am offering to by using MASF technique can be applied to broader filed of Information Security. And vice versa! Some intrusion detection techniques could be useful for automatic performance issues detection!

I have made a litle Google reserch on that and found a few interesting approaches. See one of that:

See the abstract page for dissertation written by Steven Gianvecchio:

Application of information theory and statistical learning to anomaly detection.

So the question is "can that information theory (entropy analysis) could be applied to performance exception detection?"

Friday, October 7, 2011

EV-Control Chart

I have introduced the EV meta-metric in 2001 as a measure of anomaly severity. EV stands for Exception Value and more explanation about that idea could be found here:  The Exception Value Concept to Measure Magnitude of Systems Behavior Anomalies 
Basically it is the difference (integral) between actual data and control limits. So far I have used EV data mostly to filter out real issues or for automatic hidden trend recognition. For instance, in my paper CMG’08 “Exception Based Modeling and Forecasting” I have plotted that metric using Excel to explain how it could be used for a new trend starting point recognition. Here is the picture from that paper where EV called “Extra Volume” and for the particular parent metric (CPU util.) it is named ExtraCPUtime:

The EV meta-metric first chart 

But just plotting that meta-metric and/or two their components (EV+ and EV-) over time gives a valuable picture of system behavior. If system is stable that chart should be boring showing near zero value all the time. So using that chart would be very easy (I believe even easier than in MASF Control Charts) to recognize unusual and statistically significant increase or decrease in actual data in very early stage (Early Warning!).

Here is the example of that EV-chart against the same sample data used in few previous posts:
1. Excel example: 

2.  BIRT/MySQL example as a continuation of the exercise from the previous post:

IT-Control chart vs. EV-Chart
Here is the BIRT screenshots that illustrate how that is built:

a.        A. Addition query to get EV calculated written directly in the additional BIRT Data Set object called “Data set for EV Chart”:
SQL query to calculate EV meta-metric
 SQL query to calculate EV metric from the data kept in MySQL table

B. Then additional bar-chart object is added to the report that is bind to that new “Data set for EV Chart”:
Result report is already shown here.

Tuesday, October 4, 2011

Building IT-Control Chart by BIRT against Data from the MySQL Database

This is just about another way to build an IT-Control chart assuming the raw data are in the real database like MySQL. In this case some SQL scripting is used.

1. The raw data is CPU hourly utilization and actually the same as in the previous posts: BIRT based Control Chart and One Example of BIRT Data Cubes Usage for Performance Data Analysis. (see the raw data picture here)

2. That raw data need to be uploaded to some table (CPUutil) in the MySQL schema (ServerMetric) by using the following script (sqlScriptToUploadCSVforSEDS.sql):

The uploaded data is seen at the bottom of the picture.

3.       Then the output (result) data (ActualVsHistoric table) is built using the following script (sqlScriptToControlChartforSEDS.sql):
The fragment of the result data are seen at the bottom of the picture also. Everything is ready for building IT-Control Chart and the data is actually the same as used in BIRT based Control Chart, so result should be the same also. Below is more detailed explanation how that was done.

4.  First, using BIRT the connection to MySQL database is established (to MySQLti  with schema  ServerMetrics to table ActualVsHistorical):

5. Then, the chart is developed the same way like that was done in BIRT based Control Chart post:

1.      6. Nice thing is in BIRT you can specify report parameters, that could be then a part of any constants including for filtering (to change a baseline or to provide server or metric names). Finally the report should be run to get the following result, which is almost identical with the one built for BIRT based Control Chart post: