System Management by Exception: December 2010

Tuesday, December 28, 2010

Tim Browning: the review of cloud computing article "Optimal Density of Workload Placement"

Bottom line: a cloud computing resource is really a data center with virtualized components. A GUI-frontend to an outsourcing arrangement.

Maybe the only true “cloud computing” takes place in aircraft. Although, that is debatable.

The Cloud Hype in the paper:

The author proclaims that cloud computing “is not simply the re-branding and re-packaging of virtualization”…then proceeds to show that it is just that. He also states that capacity planning’s use of “trend-and-threshold” analytics is not useful in the cloud infrastructure, yet he defines ‘strategic optimization’ as “proactive, long-term placement of resources based on detailed analysis of supply and demand (compacting)”. I assume he does not understand that ‘supply’ is a threshold – we only have a finite amount of ‘supply’ - and that ‘long-term’ is a trend?

He also states

“Rather than the trend-and-threshold model of planning that is typically employed in legacy physical environments, this new form of planning [my emphasis] is based on discrete growth models (at the VM and/or workload level) and the use of permutations and combinations to determine when to rebalance, when to add or remove capacity, and how the environment will respond to different growth, risk and change scenarios.”

So, I ask myself, what’s new about ‘discrete growth models’? Where does he get the “growth, risk and change scenarios” -- (wait, don’t tell me…from trend-and-threshold thinking)? Maybe he is being discreet about the discrete models (thus avoiding being discreetly indiscrete)?

Permutations and combinations say nothing about end-state solutions relative to (long or short term) time-series load patterns. They are time static, so ‘when to add or remove’ is not part of those computational functions. Perhaps, what arrangement is ‘best’ is what he is meaning? Perhaps he is thinking of ‘on demand’ capacity wherein capacity planning is replaced by ‘instant’ capacity in response to ‘change’? Which is to say, there is no planning…just rapid and efficient deployment of some kind of limitless unseen capacity?
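For contrast, the "when" question is exactly what a trend-and-threshold calculation answers. Here is a minimal sketch, assuming equally spaced utilization samples and a straight-line trend; the function name and the sample numbers are illustrative and do not come from the paper under review.

```python
def periods_until_threshold(history, threshold):
    """Fit a least-squares line to equally spaced samples and return how many
    periods after the last sample the line reaches the threshold.
    Returns None when the trend is flat or falling."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing = (threshold - intercept) / slope  # x position where trend == threshold
    return max(0.0, crossing - (n - 1))


# Hypothetical monthly CPU utilization (%), with 80% as the configured limit.
usage = [40, 42, 44, 46, 48, 50]
print(periods_until_threshold(usage, 80))
```

The "supply" is the threshold and the "long-term" view is the fitted trend, whatever the calculation ends up being called.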

What is ‘new’ about combinations and permutations? The newest development I know of in this area is perhaps combinatorial optimization, which consists of finding the optimal solution to a mathematical problem in which each solution is associated with a numerical “cost”. It operates on the domain of optimization problems, in which the set of feasible solutions is discrete or can be reduced to discrete (in contrast to continuous), and in which the goal is to find the best solution (lowest cost). (Developed in the early 70’s as linear and integer programming in operations research and similar to the root mean square error criteria for evaluating competing forecast models using neural networks or statistical methods).

So, knowing how many ways you can combine 887 disks on the same I/O path (combinatorics) tells me when to add or remove some if referenced to a discrete growth model? Wow.. yes, that is so NEW…well, for 1968, maybe.
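To make the point concrete, here is a small brute-force placement sketch of the kind of combinatorial optimization described above. It assumes a single resource dimension, identical hosts, and "number of hosts used" as the cost; the names and demand figures are hypothetical. Note that nothing in it refers to time: it says which arrangement is cheapest for one snapshot of demand, not when to add or remove capacity.

```python
from itertools import product


def best_placement(demands, host_capacity, n_hosts):
    """Try every assignment of workloads to hosts and return (cost, assignment)
    for the feasible assignment that uses the fewest hosts, or None if no
    assignment fits. Exhaustive search, so only suitable for tiny inputs."""
    best = None
    for assignment in product(range(n_hosts), repeat=len(demands)):
        load = [0] * n_hosts
        for demand, host in zip(demands, assignment):
            load[host] += demand
        if max(load) > host_capacity:
            continue  # infeasible: some host is over capacity
        cost = sum(1 for used in load if used > 0)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


# Hypothetical peak demands for five workloads, hosts of capacity 100.
result = best_placement([50, 30, 20, 40, 60], host_capacity=100, n_hosts=3)
print(result)
```

To get a "when" out of this, the demands have to come from a growth forecast, which brings the trend back in through the front door.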

Subsequently, he states

“the natural changes in utilization over time caused by organic growth will tend to push the limits on the configured capacity. Furthermore, the ability to configure capacity is relatively new to IT, and there are typically no existing processes in place to catch misallocation situations.”

Perhaps the ability to “configure capacity” is new to him, but it is in no way new to enterprise IT. So a trend (a legacy term) is not, per the author, ‘changes in utilization over time’, and ‘configured capacity’ is not a threshold? There are ‘no existing processes in place’ to catch misallocation situations? What? None? I suppose by ‘misallocation situation’ he means that a capacity shortfall isn’t a capacity issue, it’s an “allocation issue”. Somewhere, over the rainbow, there is capacity going to waste, but it’s not available for some reason. It’s just been ‘misallocated’. Sort of…misplaced. We must go find it. Instantly.

OK….So do I like anything about this paper?

Some ideas in the paper I DO like:

Workload density – the degree of consolidation of work into one image (of the OS) - is a cool concept where ‘contention for resources’ is a boundary condition for ‘workload placement’. How is this done? “Contention probability analysis”, which involves analyzing the operational patterns and statistical characteristics of running workloads in order to determine the risk of workloads contending for resources. The author uses the phrase, “Patterns and statistical characteristics”. So, in effect, ‘contention probability analysis’ is a ‘trend-and-threshold’ technique (although he thinks it isn’t). I am surprised he didn’t rebrand ‘statistical cluster analysis’ as also something new and revolutionary just hot from computer science labs - yet another form of blessed combinatorics optimization. Where this idea has been usefully applied at KC: SAP Batch Workload time density – the degree of consolidation of batch work into the same time intervals. In this case a boundary condition for workload ‘time placement’ would examine workload (demand) leveling and distribution to avoid unnecessary spikes for time-movable workloads.
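A rough sketch of what such an analysis might look like, under loud assumptions: two workloads, time-aligned utilization samples, one shared resource, and the contention probability estimated simply as the fraction of intervals in which combined demand exceeds capacity. The paper does not spell out its method, so this is my illustration rather than the author's algorithm.

```python
def contention_probability(series_a, series_b, capacity):
    """Estimate the chance that two co-located workloads contend for a resource
    as the share of time-aligned intervals where combined demand > capacity."""
    if len(series_a) != len(series_b) or not series_a:
        raise ValueError("series must be non-empty and the same length")
    over = sum(1 for a, b in zip(series_a, series_b) if a + b > capacity)
    return over / len(series_a)


# Hypothetical utilization (%) over four intervals.
daytime = [20, 30, 70, 80]
nightly_batch = [70, 60, 20, 10]   # peaks when the daytime workload is quiet
second_daytime = [10, 20, 60, 70]  # peaks at the same time as the first

print(contention_probability(daytime, nightly_batch, 100))   # complementary patterns
print(contention_probability(daytime, second_daytime, 100))  # coinciding peaks
```

The same count, applied to batch jobs per time interval rather than workloads per image, gives the ‘time density’ view: move the time-movable work until the over-capacity intervals disappear.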

Another idea I like:

He suggests that workloads are best characterized by their statistical properties, rather than “up front descriptions of their demand characteristics”. Thus workloads are ‘placed’ using segmentation of the resource demand profiles (to avoid imbalance, etc.). Which is to say, workloads are aggregations of activity with common ‘demand characteristics’. In queuing theory, the classification of incoming transactions into resource-based profiles which are used for priority dispatching protocols against an array of appropriately resource mapped servers will always produce a more optimal process model in terms of throughput and average response times in contrast to a queuing network where transactions are not classified based on resource requirements. This was the basis for batch initiator job class definitions in the mainframe world of the 1970’s. It worked then also. It will work for ‘clouds’ too.

The only ‘up front descriptions of demand characteristics’ that I know of would be the results of demand/performance modeling and/or LoadRunner-type benchmarking. This is still useful for ‘start-state’ sizing of the target landscape.

So…bottom line: interesting concepts or ‘new ways of conceptualizing’ the functional parametric states of virtualized landscapes. Suggestions (but no concrete explanations) that combinatorial optimization techniques can be utilized for capacity planning (implying it is not now being used). Interesting and useful applications for event densities and statistical profiling.

It seems so important, especially to vendor environments, to reinvent the wheel – a legacy object - by their services or products, and suggest that they have superior knowledge of all things new and different and these new and different things are not ‘legacy’. After all, in vendor gadget technology what isn’t ‘new’ is ‘bad’ and ‘if it works, it’s out of date’. Thus, legacy means ‘bad’ because it’s not ‘new’ (even if it uses new components) and, most importantly, it’s not what they are selling.

Just because “2 + 2 = 4” is legacy math, i.e. old, and thus bad, it doesn’t mean that it’s no longer true in cloud math. It is still true, but needs to be repackaged.

So, in the interest of actionable market relevance, here is a new, fresh, cloud-hyped-up version of “2+2=4”:

“It has been newly (re)discovered that ‘2 + 2’ is optimally 4 and exceptionally relevant for business purposes. The scope of this process is enhanced for sufficiently configured integer values of {2,4} in a dynamic web-enabled hi-definition virtual presence wherein it has locality of reference within the set of all integer number segments of the arithmetic cloud infrastructure. This will provide a competitive edge to your business as newly revealed by the appropriate cloud-centric data mining tools (c1, c2, … cn) - with price guarantees, if you act now! - at current release, version and maintenance levels in dynamic optimal adaptive combination. This fabulous offering is expertly administered under the guidance of cloud certified analysts, at an attractive hourly rate, who are not now, nor ever have been, legacy experts and thus ‘new’ and ‘fresh’ with exciting social networking added value potential. (Please join us on the Facebook group “I like integer addition with cloud computing”).”

Of course, I might be preaching to the choir (rather than the clouds) on this one. It seems, nevertheless, that corporate IT vendors demonstrate a kind of ‘math neurosis’:

A math-psychotic does NOT believe that 2+2=4.

A math-neurotic knows that 2+2=4 is true, but hates it. It must be repackaged for resale and aggressively marketed with a customer focused strategy.

If mathematics is the art of giving the same name to different things (J. H. Poincaré), then IT marketing is the art of giving a new name to the same things and using pretty charts.

THE theologically orthodox axiom for information technology services/product vendors:

"Absolutum Obsoletum"

(TimLatin translated: "If it works, it’s out of date").

How to make 3 mice out of 2 mice by making 2 = 1:

How to increase shareholder value by reducing labor costs:

(Posted with Tim Browning's permission)

Igor Trubin

He started in 1979 as an IBM/370 system engineer. In 1986 he received his PhD in Robotics at St. Petersburg Technical University (Russia) and then worked as a professor teaching CAD/CAM and Robotics for 12 years. He published 30+ papers and made several presentations at conferences in the Robotics and Artificial Intelligence fields. In 1999 he moved to the US and worked at Capital One bank as a Capacity Planner. His first CMG.org paper was written and presented in 2001. The next one, "Exception Detection System Based on MASF Technique," won a Best Paper award at CMG'02 and was presented at UKCMG'03 in Oxford, England. He made other technical presentations at IBM z/Series Expo, SPEC.org, and Southern and Central Europe CMG, and ran several workshops covering his original method of Anomaly and Change Point Detection (Perfomalist.com). He is the author of the “Performance Anomaly Detection” class (at CMG.com). He worked for 2 years as the Capacity team lead for IBM, for SunTrust Bank for 3 years, and then at IBM for 3 years as Sr. IT Architect. He now works for Capital One bank as an IT Manager in Cloud Engineering, and since 2015 he has been a member of the CMG.org Board of Directors. He runs the YouTube channel iTrubin.

Monday, December 13, 2010

Video report about my 1-day attending/presenting at CMG'10 Conference in Orlando

My CMG'10 presentation is described here:
http://itrubin.blogspot.com/2010/11/my-cmg10-presentation-it-control-charts.html

Here is a picture of me sitting in another CMG'10 session:
https://www.facebook.com/photo.php?fbid=10150196216458678&set=a.10150196216138678.334041.120810323677&type=1&ref=nf

Igor Trubin

Video report about my 1-day attending/presenting to CMG'10 Conference in Orlando

http://ukor.blogspot.com/2010/12/one-day-of-my-10th-computer-measurement.html

Here is a picture of me sitting in another CMG'10 session:
https://www.facebook.com/photo.php?fbid=10150196216458678&set=a.10150196216138678.334041.120810323677&type=1&ref=nf

Igor Trubin

Friday, December 10, 2010

The Exception Value Concept to Measure Magnitude of Systems Behavior Anomalies

The Exception Value concept was introduced in my first CMG paper in 2001 (see the last link in the first post of this blog). I later found that the EV approach can also be used to recognize trends and separate them in historical data, as described in my 2008 paper: Exception Based Modeling and Forecasting.

Then I noticed that some other vendors had started using a similar concept (see my post from last year, Exception Value (EV) and OPNET Panorama) ...

The latest news about the concept is as follows.

At the CMG'10 conference I met BMC Software specialist Dima Seliverstrov, and he mentioned that he references my first paper, from CMG'01, in his CMG presentation (scheduled to be presented TODAY!). I looked at his paper, "Application of Stock Market Technical Analysis Techniques to Computer System Performance Data" (abstract is linked here), and indeed he shows an interesting way to use my EV technique to evaluate stock market deviations to automate some brokerage processes! Here is the paragraph from his paper about it:

"Buy or sell signals are generated when the daily value moves outside of the error bars. It’s not only important to identify which systems have buy and sell signals, but which systems to look at first. A useful approach to rank the signals from multiple sources is to calculate the area outside the error bars and rank based on the area [4]. For example if one systems disk space exceeded area is 100 Gbytes outside and another system is 1 Kbyte you would look at the system with a larger area first. Another useful technique for CPU Utilization is to normalize the area outside the envelope by converting to SPECint..."

By the way, I remember that my first paper also suggested a similar normalization, based not on the SPECint benchmark (I know that metric is used by BMC as the main sizing factor, and that is fine) but on the TPC benchmark (http://www.tpc.org/), which is more effective but much harder to obtain.
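For readers who have not seen the EV idea, the "area outside the error bars" from the quoted paragraph can be sketched as below. This is a simplified illustration, not the exact formulation from the CMG'01 paper: it assumes equally spaced samples (so the area is a plain sum), reports the areas above and below the limits as two positive numbers, and uses made-up system names and data.

```python
def exception_value(actual, lower, upper):
    """Return (area above the upper limit, area below the lower limit),
    summed over equally spaced samples."""
    ev_up = sum(a - u for a, u in zip(actual, upper) if a > u)
    ev_down = sum(l - a for a, l in zip(actual, lower) if a < l)
    return ev_up, ev_down


def rank_by_ev(systems):
    """systems maps a name to (actual, lower, upper).
    Returns the names ordered by total area outside the limits, largest first."""
    totals = {name: sum(exception_value(*data)) for name, data in systems.items()}
    return sorted(totals, key=totals.get, reverse=True)


# Hypothetical hourly utilization with constant limits of 40 and 60.
systems = {
    "db01": ([55, 61, 58, 59, 41], [40] * 5, [60] * 5),
    "web01": ([50, 62, 70, 45, 30], [40] * 5, [60] * 5),
}
print(exception_value(*systems["web01"]))
print(rank_by_ev(systems))
```

Ranking by the area rather than by the count of exceptions is what tells you which system to look at first.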

Here is the figure from my CMG'01 paper (sorry for the bad quality...)

Anyway I am pleased that my idea is alive!

Below are some of my other posts discussing the EV idea:

Real-Time Statistical Exception Detection

Feb 28, 2009

Exception Value (EV) and OPNET Panorama - System Management by ...

Dec 29, 2009

CMG'08 Trip Report

Jan 24, 2009

Industrial Robot Grasping Processes Research. EV prototype was there!

Jun 21, 2010

Igor Trubin

Wednesday, December 1, 2010

Cloud Computing Capacity Management

Interestingly, a couple of years ago I interviewed with Google for a Program Manager position, and in the last phone interview I was asked how to do capacity management for cloud computing. I did not really know (I had not dealt with that yet and had only CMG-based knowledge; see C. Molloy's presentation "Capacity Management for Cloud Computing"), but I tried to tell them that the generic approach should apply, treating a cloud as just a highly virtualized infrastructure with very high mobility that can satisfy any additional capacity demand almost on the fly. Cloud is just the next level of virtualization. Right?

And my favorite smart alerting approach (based on dynamic thresholds) could automate finding the moment when additional capacity needs to be allocated. (I think I mentioned that in one of my papers.) As far as I know, it is currently done based on strictly static thresholds.
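A minimal sketch of what dynamic thresholds mean here, assuming history is grouped by hour of day and the limits are the group mean plus or minus three standard deviations. The grouping, the multiplier `k`, and the sample data are assumptions made for illustration, not the exact method from my papers.

```python
from collections import defaultdict
from statistics import mean, stdev


def build_limits(history, k=3.0):
    """history: iterable of (hour_of_day, value) pairs.
    Returns {hour: (lower, upper)} using mean +/- k standard deviations."""
    groups = defaultdict(list)
    for hour, value in history:
        groups[hour].append(value)
    limits = {}
    for hour, values in groups.items():
        if len(values) < 2:
            continue  # not enough history to estimate a spread
        m, s = mean(values), stdev(values)
        limits[hour] = (m - k * s, m + k * s)
    return limits


def is_exception(limits, hour, value):
    """True when the value falls outside the limits learned for that hour."""
    if hour not in limits:
        return False
    lower, upper = limits[hour]
    return value < lower or value > upper


# Hypothetical CPU utilization (%): quiet at 03:00, busy at 14:00.
history = [(3, 10), (3, 12), (3, 11), (3, 9),
           (14, 70), (14, 75), (14, 72), (14, 68)]
limits = build_limits(history)
print(is_exception(limits, 3, 70))   # 70% at 03:00
print(is_exception(limits, 14, 70))  # 70% at 14:00
```

A static threshold of, say, 80% would stay silent in both cases; the per-hour limits flag the 03:00 reading because it is far from what that hour normally looks like.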

The figure is a chart capturing a capacity usage change that happened in a VMware environment (the control chart is for the VM, the trend is for the host). It is from my new CMG'10 paper: IT-CONTROL CHART

BTW, I failed and did not get the offer from Google, but my family was not really ready to change coasts anyway, so I just decided that it was a test for Google and they failed, not me!

(See a more recent post about cloud computing here: Tim Browning: the review of cloud computing article "Optimal Density of Workload Placement")

Igor Trubin
