The following papers, published on
Mendeley, criticize the MASF Gaussian
assumption and offer other methods (Tukey and
Relative Entropy) to detect anomalies statistically.
(BTW, I tried to use entropy analysis to capture performance anomalies; see my other post.)
1. Statistical techniques for online anomaly detection in data centers
by Chengwei Wang, Krishnamurthy Viswanathan, Lakshminarayan Choudur, Vanish Talwar, Wade Satterfield, Karsten Schwan
Abstract
Online anomaly detection is an important step in
data center management, requiring light-weight techniques that provide
sufficient accuracy for subsequent diagnosis and management actions.
This paper presents statistical techniques based on the Tukey and
Relative Entropy statistics, and applies them to data collected from a
production environment and to data captured from a testbed for
multi-tier web applications running on server class machines. The
proposed techniques are lightweight and improve over standard Gaussian
assumptions in terms of performance.
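To make the two statistics named in the abstract a bit more concrete, here is a minimal sketch of how they could be computed. It is my own illustration, not the paper's exact procedure: the function names (`tukey_fences`, `tukey_anomalies`, `binned_distribution`, `relative_entropy`), the fence multiplier `k=1.5`, the 0-100 binning range, the add-one smoothing, and the sample CPU utilization numbers are all assumptions chosen for the example.

```python
import math
from statistics import quantiles


def tukey_fences(baseline, k=1.5):
    """Return (lower, upper) Tukey fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(baseline, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def tukey_anomalies(baseline, samples, k=1.5):
    """Flag samples that fall outside the fences built from the baseline."""
    lower, upper = tukey_fences(baseline, k)
    return [x for x in samples if x < lower or x > upper]


def binned_distribution(values, lo=0.0, hi=100.0, bins=5):
    """Relative frequency per bin, with add-one smoothing (an assumption)
    so that no bin ends up with zero probability."""
    counts = [1] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical CPU utilization readings (percent), made up for illustration.
baseline = [50, 52, 49, 51, 48, 53, 50, 47, 52, 51]

# Tukey: no mean or standard deviation needed, only quartiles.
outliers = tukey_anomalies(baseline, [50, 54, 90, 46, 12])

# Relative entropy: compare a recent window's distribution to the baseline's.
q = binned_distribution(baseline)
normal_score = relative_entropy(binned_distribution([49, 51, 50, 52, 48]), q)
shifted_score = relative_entropy(binned_distribution([85, 90, 88, 92, 87]), q)
```

In this sketch the Tukey check flags 90 and 12 because they fall outside the quartile-based fences, and the window of readings around 90% scores a much larger divergence from the baseline than the window around 50%. Neither check assumes the data is Gaussian, which is the point both papers make against MASF-style limits.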
2. Online detection of utility cloud anomalies using metric distributions
Abstract
The online detection of anomalies is a vital
element of operations in data centers and in utility clouds like Amazon
EC2. Given ever-increasing data center sizes coupled with the
complexities of systems software, applications, and workload patterns,
such anomaly detection must operate automatically, at runtime, and
without the need for prior knowledge about normal or anomalous
behaviors. Further, detection should function for different levels of
abstraction like hardware and software, and for the multiple metrics
used in cloud computing systems. This paper proposes EbAT -
Entropy-based Anomaly Testing - offering novel methods that detect
anomalies by analyzing for arbitrary metrics their distributions rather
than individual metric thresholds. Entropy is used as a measurement that
captures the degree of dispersal or concentration of such
distributions, aggregating raw metric data across the cloud stack to
form entropy time series. For scalability, such time series can then be
combined hierarchically and across multiple cloud subsystems.
Experimental results on utility cloud scenarios demonstrate the
viability of the approach. EbAT outperforms threshold-based methods with
on average 57.4% improvement in accuracy of anomaly detection and also
does better by 59.3% on average in false alarm rate with a
'near-optimum' threshold-based method.
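The core idea in this abstract, turning raw metric samples into an entropy time series, can be sketched roughly as below. This is only my simplified reading of it, not the EbAT implementation: the function names, the fixed-width binning, the non-overlapping windows, and the sample values are all assumptions, and the hierarchical aggregation across cloud subsystems is left out entirely.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of the empirical distribution of symbols."""
    counts = Counter(symbols)
    total = len(symbols)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_series(samples, window, bin_width):
    """One entropy value per non-overlapping window of binned samples.

    Binning by fixed width and the window layout are assumptions made
    for this sketch.
    """
    series = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        bins = [int(v // bin_width) for v in chunk]
        series.append(shannon_entropy(bins))
    return series


# Hypothetical metric readings: two steady windows, then a dispersed one.
samples = [50, 51, 52, 53,
           50, 52, 51, 54,
           10, 35, 70, 95]

series = entropy_series(samples, window=4, bin_width=10)
```

With these made-up numbers the first two windows are fully concentrated in one bin (entropy 0 bits) and the third is spread evenly over four bins (entropy 2 bits), so the jump shows up in the entropy series without any threshold on the metric itself.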
Chengwei Wang, in Middleware (2009)
Anukool Lakhina, Mark Crovella, Christophe Diot, in ACM SIGCOMM Computer Communication Review (2005)