Wednesday, October 24, 2012

Non-MASF Statistical Techniques (Entropy-Based) for Anomaly Detection in Data Centers (and Clouds)

The following papers, published on Mendeley, criticize the MASF Gaussian assumption and offer other methods (Tukey and Relative Entropy) for detecting anomalies statistically. (BTW, I tried using entropy analysis to capture performance anomalies; see my other post.)

1. Statistical techniques for online anomaly detection in data centers
by Chengwei Wang, Krishnamurthy Viswanathan, Lakshminarayan Choudur, Vanish Talwar, Wade Satterfield, Karsten Schwan
Online anomaly detection is an important step in data center management, requiring light-weight techniques that provide sufficient accuracy for subsequent diagnosis and management actions. This paper presents statistical techniques based on the Tukey and Relative Entropy statistics, and applies them to data collected from a production environment and to data captured from a testbed for multi-tier web applications running on server class machines. The proposed techniques are lightweight and improve over standard Gaussian assumptions in terms of performance.
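
To make the two statistics named in this abstract more concrete, here is a minimal Python sketch. It is not the paper's procedure: the function names, the fence multiplier k=1.5, the inclusive quartile method, and the smoothing constant are all my own assumptions for illustration. The Tukey part flags points outside the quartile-based fences without assuming a Gaussian distribution; the relative entropy part compares the histogram of a recent window against a baseline histogram.

```python
import math
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return (low, high) Tukey fences; k=1.5 is the conventional multiplier (assumption)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def tukey_outliers(values, k=1.5):
    """Return the values that fall outside the Tukey fences."""
    low, high = tukey_fences(values, k)
    return [v for v in values if v < low or v > high]


def relative_entropy(window_counts, baseline_counts, smoothing=1e-9):
    """Kullback-Leibler divergence D(P || Q) between two histograms.

    Both arguments are per-bin counts over the same bins. The small
    smoothing term (an assumption) avoids division by zero on empty bins.
    """
    if len(window_counts) != len(baseline_counts):
        raise ValueError("histograms must use the same bins")
    n = len(window_counts)
    p_total = sum(window_counts) + smoothing * n
    q_total = sum(baseline_counts) + smoothing * n
    divergence = 0.0
    for pc, qc in zip(window_counts, baseline_counts):
        p = (pc + smoothing) / p_total
        q = (qc + smoothing) / q_total
        divergence += p * math.log(p / q)
    return divergence


# Hypothetical CPU utilization samples with one obvious spike.
samples = [10, 12, 11, 13, 12, 11, 10, 12, 95]
spikes = tukey_outliers(samples)
```

A window would then be reported as anomalous when `tukey_outliers` returns values or when `relative_entropy` exceeds some chosen threshold; picking that threshold is left open here.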

2. Online detection of utility cloud anomalies using metric distributions
by Chengwei Wang, V Talwar, K Schwan, P Ranganathan


The online detection of anomalies is a vital element of operations in data centers and in utility clouds like Amazon EC2. Given ever-increasing data center sizes coupled with the complexities of systems software, applications, and workload patterns, such anomaly detection must operate automatically, at runtime, and without the need for prior knowledge about normal or anomalous behaviors. Further, detection should function for different levels of abstraction like hardware and software, and for the multiple metrics used in cloud computing systems. This paper proposes EbAT - Entropy-based Anomaly Testing - offering novel methods that detect anomalies by analyzing for arbitrary metrics their distributions rather than individual metric thresholds. Entropy is used as a measurement that captures the degree of dispersal or concentration of such distributions, aggregating raw metric data across the cloud stack to form entropy time series. For scalability, such time series can then be combined hierarchically and across multiple cloud subsystems. Experimental results on utility cloud scenarios demonstrate the viability of the approach. EbAT outperforms threshold-based methods with on average 57.4% improvement in accuracy of anomaly detection and also does better by 59.3% on average in false alarm rate with a 'near-optimum' threshold-based method.
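
The core idea in this abstract, turning raw metric samples into an entropy time series, can be sketched roughly as follows. This is my own simplified illustration and not the EbAT implementation: the fixed-width binning, the non-overlapping windows, and all function names and parameters are assumptions. Each window of samples is mapped to bin indices, and the Shannon entropy of the bin distribution measures how dispersed (high entropy) or concentrated (low entropy) the metric was in that window.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of the empirical distribution of symbols."""
    total = len(symbols)
    counts = Counter(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def bin_value(value, low, high, bins):
    """Map a raw metric value to a bin index in [0, bins - 1] (fixed-width bins)."""
    if high <= low:
        raise ValueError("high must be greater than low")
    index = int((value - low) / (high - low) * bins)
    return min(max(index, 0), bins - 1)  # clamp out-of-range values


def entropy_series(samples, window, low, high, bins=5):
    """Entropy of each consecutive, non-overlapping window of samples."""
    series = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        series.append(shannon_entropy([bin_value(v, low, high, bins) for v in chunk]))
    return series


# Hypothetical utilization samples: a steady window, then an erratic one.
metric = [10, 10, 10, 10, 10, 90, 30, 70]
series = entropy_series(metric, window=4, low=0, high=100, bins=5)
```

Here the steady window yields an entropy of 0 bits and the erratic one 2 bits, so a jump in the series marks a change in the metric's distribution. Any detector (for example, a simple threshold on the change between consecutive entropy values) could then run on `series` instead of on the raw samples.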