System Management by Exception: SCMG report: Jobs Scheduling in Cloud and Hadoop

Wednesday, May 1, 2013

SCMG report: Jobs Scheduling in Cloud and Hadoop

2015 UPDATE: Now having access to HADOOP I am thinking of how to use Map-Reduce to speed up SETDSing against the big performance data. The flowing thesis could very helpful for that:
Distributed Anomaly Detection andPrevention for Virtual Platforms by Ali Imran Jehangiri

Last week we ran our Richmond SCMG meeting following the agenda published HERE (links to presentations are there too including mine). The 1st presentation was titled: "Some Workload Scheduling Alternatives for High Performance Computing Systems" and presented by Jim McGalliard, the frequent CMG presenter and our friend. He mentioned some old topic he already presented in the past – Supercomputer batch jobs optimization by categorizing and scheduling them. Then after a brief description of MapReduce

( “method for simple implementation of parallelism in a program..”)

he explained how HADOOP

(“Designed for very large (thousands of processors) systems using commodity processors, including grid systems, Hadoop is a specific open source implementation of the MapReduce framework written in Java and licensed by Apache” )

does job scheduling using MapReduce and some other means.

That presentation leads me to another task to consider – job scheduling in the cloud. Ironically just before the meeting I had red interesting article about it (BTW it was recommended reading from my current manager as we are also going to clouds… What about you?). Here is the link to that article from one of the author’s webpage (Asit K Mishra ) and title:

”Towards Characterizing Cloud Backend Workloads: Insights from Google Compute Clusters”

I firmly believed that the workload characterization is going away due to virtualization - each workload/apps can have separate virtual server now. Right? But based on the article looks like the jobs categorization could be useful for optimizing their schedule to run in the cloud and maybe in HADOOP…

Igor Trubin

He started in 1979 as IBM/370 system engineer. In 1986 he got his PhD. in Robotics at St. Petersburg Technical University (Russia) and then worked as a professor teaching CAD/CAM, Robotics for 12 years. He published 30+ papers and made several presentations for conferences related to the Robotics and Artificial Intelligent fields. In 1999 he moved to the US, worked at Capital One bank as a Capacity Planner. His first CMG.org paper was written and presented in 2001. The next one, "Exception Detection System Based on MASF Technique," won a Best Paper award at CMG'02 and was presented at UKCMG'03 in Oxford, England. He made other tech. presentations at IBM z/Series Expo, SPEC.org, Southern and Central Europe CMG and ran several workshops covering his original method of Anomaly and Change Point Detection (Perfomalist.com). Author of “Performance Anomaly Detection” class (at CMG.com). Worked 2 years as the Capacity team lead for IBM, worked for SunTrust Bank for 3 years and then at IBM for 3 years as Sr. IT Architect. Now he works for Capital One bank as IT Manager at the Cloud Engineering and since 2015 he is a member of CMG.org Board of Directors. Runs UT channel iTrubin

System Management by Exception

Popular Post

_

Wednesday, May 1, 2013

SCMG report: Jobs Scheduling in Cloud and Hadoop

No comments:

Post a Comment