Wednesday, January 16, 2019

My CMG IMPACT conference presentation is scheduled for Wednesday, 2/20/19, at 1:30pm: "Catching Anomaly and Normality in Cloud by Neural Net and Entropy Calculation"

See details here: https://cmgimpact.com/timetable/event/catching-anomalies-in-the-cloud/ 

UPDATE: That was a successful presentation. Slides and video will be available for download later.
By Igor Trubin, Capital One Bank
Part 1. The Neural Network (NN) is not a new machine learning method. About 12 years ago I was involved as a Capacity Planning resource in a project to build the infrastructure (servers) for running NN in a fraud detection application. NN now gets much more attention and popularity as a part of AI, mostly because computing power has increased dramatically and, as a result, more tasks can be done with NN.
The goal of the presentation is to demystify the technique with simple terms and examples, to show what it actually is and how it could be used for Capacity and Demand management. That is done by developing R code to recognize typical workload patterns, such as OLTP, in the daily profiles of time series performance data.
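To make the idea of recognizing a workload pattern from a daily profile concrete, here is a minimal sketch. It is not the R code from the presentation: it is written in Python, it uses a single sigmoid neuron (the smallest possible neural network) rather than whatever network the talk uses, and the data is synthetic. The "OLTP" shape (busy from 9:00 to 18:00), the "batch" shape (busy from 0:00 to 6:00), and all function names are assumptions made for illustration only.

```python
import math
import random

HOURS = 24
OLTP_HOURS = set(range(9, 18))   # assumed business-hours peak
BATCH_HOURS = set(range(0, 6))   # assumed overnight batch window


def make_profile(peak_hours, rng):
    """Synthetic daily profile: ~0.8 utilization in peak hours, ~0.2 elsewhere."""
    return [(0.8 if h in peak_hours else 0.2) + rng.uniform(-0.05, 0.05)
            for h in range(HOURS)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, profile):
    """Output of the single neuron: estimated probability that the profile is OLTP."""
    return sigmoid(sum(w * x for w, x in zip(weights, profile)) + bias)


def train(samples, labels, epochs=200, lr=0.5):
    """Batch gradient descent on the logistic loss for one sigmoid neuron."""
    weights = [0.0] * HOURS
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * HOURS
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = predict(weights, bias, x) - y
            for j in range(HOURS):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Synthetic training set: 20 OLTP-like days (label 1) and 20 batch-like days (label 0).
rng = random.Random(0)
train_x = ([make_profile(OLTP_HOURS, rng) for _ in range(20)]
           + [make_profile(BATCH_HOURS, rng) for _ in range(20)])
train_y = [1] * 20 + [0] * 20
weights, bias = train(train_x, train_y)


def classify(profile):
    """Label a 24-value daily profile as 'OLTP' or 'batch'."""
    return "OLTP" if predict(weights, bias, profile) >= 0.5 else "batch"
```

Calling `classify` on a new 24-hour profile returns the label of the closer pattern. Real profiles are noisier and come in more than two shapes, which is where hidden layers and more output classes would come in.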
Part 2. Detecting anomalies for short-lived objects, or for objects with a very small number of measurements, is a typical concern. Why? There could be thousands and thousands of those objects, so it is important to separate the exceptional ones with anomalies for further investigation. They could be servers or customers that have just started being monitored, or public cloud objects (EC2s, ASGs) that usually have a very short lifespan. The suggested approach to detecting anomalous behavior in this type of object is to estimate the entropy of each object. If the entropy is low, everything is most likely in order. If not, there is possible disorder, and someone needs to check what is going on with the object. The method is implemented in a cloud-based application written in R that scans all cloud Auto Scaling Groups (ASGs) every hour to detect the ones that are imbalanced in terms of the number of EC2 instances in the group. That makes it possible to separate a couple of hundred ASGs out of hundreds of thousands of them.
This entropy-based method is well known and is described in detail in the following www.Trub.in blog post:
“Quantifying Imbalance in Computer Systems”, which is based on the CMG’12 paper.
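The entropy check from Part 2 can be sketched in a few lines. This is a Python illustration, not the R application mentioned above, and it rests on one plausible reading of the method: the entropy of an ASG is taken as the Shannon entropy of the distribution of EC2 instance counts observed over its hourly samples. The ASG names, the sample data, and the 1-bit threshold are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of the observed values."""
    counts = Counter(values)
    total = len(values)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def flag_disordered(asg_counts, threshold_bits=1.0):
    """Return the names of ASGs whose instance-count entropy exceeds the threshold.

    asg_counts maps an ASG name to its list of hourly EC2 instance counts.
    The threshold is an assumed tuning parameter, not a value from the post.
    """
    return sorted(name for name, counts in asg_counts.items()
                  if counts and shannon_entropy(counts) > threshold_bits)


# Hypothetical hourly instance counts for three short-lived ASGs.
sample = {
    "asg-steady": [4, 4, 4, 4, 4, 4, 4, 4],      # never changes: 0 bits
    "asg-two-level": [2, 2, 2, 2, 4, 4, 4, 4],   # two equally likely sizes: 1 bit
    "asg-erratic": [1, 7, 3, 9, 2, 8, 4, 6],     # a different size every hour: 3 bits
}
flagged = flag_disordered(sample)
```

Here only `asg-erratic` is flagged. Because entropy needs just a handful of samples and no history-based baseline, it suits objects that live only a few hours.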