Tuesday, March 27, 2012

R-Script to Aggregate (ETL to MySQL) Actual data with Base-line data for IT-Control Charts

In my previous post (R-script to plot IT-Control Chart against MySQL), I set the task of writing an R-script to pre-process (ETL) the raw date-hour stamped data into the DATA-cubical format for Control Charting.

Here is the solution:
I have simply converted the already developed SQL script into an RODBC-based R-script, which can be seen below:

Running the script produces the "ActualVsHistorical" table in the servermentrics database on MySQL. Its data is identical to the data used to plot the IT-Control Chart published in the previous post. The data itself can be seen by simply typing the data frame name in the R-Console window:
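A minimal sketch of how such an RODBC-based ETL step might look is given here. It is not the original script: the DSN name `mysql_dsn`, the raw table `raw_metrics`, its columns `stamp` and `cpu_util`, the last-7-days split between actual and base-line data, and the multiplier `k` are all assumptions made for illustration. Only `odbcConnect`, `sqlQuery`, `sqlDrop`, `sqlSave` and `odbcClose` are real RODBC calls.

```r
# Sketch only: table, column and DSN names below are hypothetical.

# Build the aggregation SQL: base-line mean and limits per weekday/hour,
# joined with the actual (last 7 days) hourly averages.
build_etl_query <- function(k = 3) {
  sprintf("
    SELECT a.wday, a.hr, a.actual, b.hist_mean,
           b.hist_mean + %1$s * b.hist_sd AS ucl,
           b.hist_mean - %1$s * b.hist_sd AS lcl
    FROM (SELECT WEEKDAY(stamp) AS wday, HOUR(stamp) AS hr,
                 AVG(cpu_util) AS actual
          FROM raw_metrics
          WHERE stamp >= CURDATE() - INTERVAL 7 DAY
          GROUP BY WEEKDAY(stamp), HOUR(stamp)) a
    JOIN (SELECT WEEKDAY(stamp) AS wday, HOUR(stamp) AS hr,
                 AVG(cpu_util) AS hist_mean,
                 STDDEV_SAMP(cpu_util) AS hist_sd
          FROM raw_metrics
          WHERE stamp < CURDATE() - INTERVAL 7 DAY
          GROUP BY WEEKDAY(stamp), HOUR(stamp)) b
      ON a.wday = b.wday AND a.hr = b.hr
    ORDER BY a.wday, a.hr", k)
}

# Run the query through ODBC and store the result as a MySQL table.
run_etl <- function(dsn = "mysql_dsn", target = "ActualVsHistorical", k = 3) {
  ch <- RODBC::odbcConnect(dsn)
  if (!inherits(ch, "RODBC")) stop("could not open ODBC connection")
  on.exit(RODBC::odbcClose(ch))

  avh <- RODBC::sqlQuery(ch, build_etl_query(k), stringsAsFactors = FALSE)
  if (!is.data.frame(avh)) stop(paste(avh, collapse = "\n"))  # SQL error text

  RODBC::sqlDrop(ch, target, errors = FALSE)   # ignore "table does not exist"
  RODBC::sqlSave(ch, avh, tablename = target, rownames = FALSE)
  avh
}
```

With a working DSN, `ActualVsHistorical <- run_etl()` would both write the table and return it as a data frame, so typing `ActualVsHistorical` in the R console shows the data.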

So, all the main elements of the SEDS-lite project have now been prototyped and published in my posts. One more task may remain: to illustrate in R how the exception list of objects (servers), based on EV meta-metric filtering, can be created as part of anomaly detection. So far that has been done and published on this blog only in the "DB2"-like SQL format to run within BIRT. See the post about that here: UCL=LCL : How many standard deviations do we use for Control Charting? Use ZERO!
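As a rough idea of what that remaining task could look like in R, here is a minimal sketch. It assumes one particular reading of the EV meta-metric: the summed amount by which actual values exceed the UCL (counted as positive) or fall below the LCL (counted as negative). That definition, the sample data and all column names are assumptions of this sketch, not taken from the SQL version.

```r
# Hypothetical input: one row per server and hour, already carrying limits.
avh <- data.frame(
  server = rep(c("srvA", "srvB", "srvC"), each = 4),
  actual = c(50, 55, 52, 51,   # srvA: always inside the limits
             50, 90, 95, 52,   # srvB: two points above UCL
             50, 10, 52, 51),  # srvC: one point below LCL
  ucl    = rep(60, 12),
  lcl    = rep(40, 12)
)

# Excess outside the limits: positive above UCL, negative below LCL, else 0.
avh$excess <- pmax(avh$actual - avh$ucl, 0) - pmax(avh$lcl - avh$actual, 0)

# EV per server (assumed definition: sum of the excesses).
ev <- aggregate(excess ~ server, data = avh, FUN = sum)
names(ev)[2] <- "EV"

# Exception list: keep only servers with non-zero EV, largest deviation first.
exceptions <- ev[ev$EV != 0, ]
exceptions <- exceptions[order(-abs(exceptions$EV)), ]
exceptions
```

Servers whose actual data never leaves the limits get EV = 0 and drop out of the list, which is the filtering effect described above.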


Wednesday, March 21, 2012

R-script to plot IT-Control Chart against MySQL

Continuing to play with open-source tools to build some SEDS elements, I have developed a simple R-script to plot the IT-Control Chart against data stored in a MySQL database.

I used the same MySQL data that had already been built and used for IT-Control Charting by the BIRT reporting system. See the following post about how that was done: Building IT-Control Chart by BIRT against Data from the MySQL Database. To do that, I used the RODBC package to connect to the MySQL database and query its data through the MySQL ODBC driver.

Actually, I have just slightly modified the R-script which I wrote for my "Power of Control Chart" workshop. That script can be found in the following post: IT-Chart: The Best Way to Visualize IT Systems Performance

Here is my new script:

Here is the result, which is practically identical to what was done by BIRT (see the link to the BIRT-based picture here).
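For readers who want to try this kind of chart, here is a minimal base-R sketch of the plotting step. It is an illustration, not the script from this post: synthetic data stands in for the result of the RODBC query, and the column names (`weekhour`, `actual`, `hist_mean`, `ucl`, `lcl`), the 3-sigma limits and the colors are assumptions.

```r
# Synthetic stand-in for the queried data: 168 hours of one week.
set.seed(1)
weekhour  <- 0:167
hist_mean <- 40 + 20 * sin(2 * pi * (weekhour %% 24) / 24)  # daily cycle
hist_sd   <- rep(5, 168)
cc <- data.frame(
  weekhour  = weekhour,
  actual    = hist_mean + rnorm(168, sd = 5),
  hist_mean = hist_mean,
  ucl       = hist_mean + 3 * hist_sd,   # assumed 3-sigma limits
  lcl       = hist_mean - 3 * hist_sd
)

# Draw actual data against the base-line mean and control limits.
plot_control_chart <- function(d, main = "IT-Control Chart") {
  rng <- range(d$actual, d$ucl, d$lcl)
  plot(d$weekhour, d$actual, type = "l", lwd = 2, col = "black",
       ylim = rng, xaxt = "n",
       xlab = "Hour of week", ylab = "Metric value", main = main)
  axis(1, at = seq(0, 144, by = 24), labels = paste("Day", 1:7))
  lines(d$weekhour, d$hist_mean, col = "darkgreen")
  lines(d$weekhour, d$ucl, col = "red",  lty = 2)
  lines(d$weekhour, d$lcl, col = "blue", lty = 2)
  legend("topright", bty = "n",
         legend = c("Actual", "Mean", "UCL", "LCL"),
         col = c("black", "darkgreen", "red", "blue"),
         lty = c(1, 1, 2, 2), lwd = c(2, 1, 1, 1))
  invisible(rng)
}
```

Calling `plot_control_chart(cc)` draws the chart on the current graphics device. With real data, `cc` would be replaced by the data frame returned from the database query.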

If you are a programmer, you will notice how much easier it is to build charts using R than with BIRT (a menu-based report generator not aimed at programmers).

The data used for this exercise had already been preprocessed into the DATA-cubical format from raw date-hour stamped data (see the SQL script for that here). But what about doing this pre-processing in R as well?

That is the next task ... (it could be your homework ;). The simplest approach is again to use the RODBC package just to run the above-mentioned SQL script within R. Another, better approach is to use R's native data manipulation techniques.
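A minimal sketch of the second approach, using only base R, might look like this. It assumes that the DATA-cubical format means one row per weekday-and-hour cell, that the last 168 hours are the "actual" week and everything earlier is the base-line, and that limits sit at mean ± 3 standard deviations. The raw data is synthetic and the column names are hypothetical.

```r
# Synthetic raw date-hour stamped data: 5 weeks of hourly samples.
set.seed(42)
stamps <- seq(as.POSIXct("2012-02-20 00:00", tz = "UTC"),
              by = "hour", length.out = 5 * 168)
lt  <- as.POSIXlt(stamps)
raw <- data.frame(
  stamp    = stamps,
  wday     = lt$wday,    # 0 = Sunday ... 6 = Saturday
  hour     = lt$hour,
  cpu_util = 40 + 20 * sin(2 * pi * lt$hour / 24) +
             rnorm(length(stamps), sd = 5)
)

# Split: the last 168 hours are "actual", the rest is the base-line.
is_actual <- raw$stamp > max(raw$stamp) - 7 * 24 * 3600
base      <- raw[!is_actual, ]

# Aggregate each part by weekday and hour.
hist_mean <- aggregate(cpu_util ~ wday + hour, data = base, FUN = mean)
hist_sd   <- aggregate(cpu_util ~ wday + hour, data = base, FUN = sd)
actual    <- aggregate(cpu_util ~ wday + hour, data = raw[is_actual, ],
                       FUN = mean)
names(hist_mean)[3] <- "hist_mean"
names(hist_sd)[3]   <- "hist_sd"
names(actual)[3]    <- "actual"

# Join actual with base-line statistics and add control limits.
avh <- Reduce(function(a, b) merge(a, b, by = c("wday", "hour")),
              list(actual, hist_mean, hist_sd))
k <- 3                                   # assumed number of sigmas
avh$ucl <- avh$hist_mean + k * avh$hist_sd
avh$lcl <- avh$hist_mean - k * avh$hist_sd
avh <- avh[order(avh$wday, avh$hour), ]
```

The resulting `avh` data frame has 168 rows, one per hour of the week, and could either be plotted directly or written back to MySQL.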