Thursday, September 22, 2011

One Example of BIRT Data Cubes Usage for Performance Data Analysis

I received a comment on my previous post “BIRT based Control Chart” with questions about how the data are actually prepared in BIRT for control charting. To address this request, I’d like to share how I use a BIRT Cube to populate data into a CrossTab object, which is then used for building a control chart.

As I have already explained in my CMG paper (see IT-Control Chart), the data that describes the IT-Control Chart (or MASF control chart) actually has 3 dimensions: two time dimensions and one measurement (the metric), as seen in the picture at the left. The control chart itself is just a projection onto a 2D cut, with the actual (current or last) data overlaid. So the OLAP cube data model (data cubes) is naturally suitable for grouping and summarizing time-stamped data into a cross table for further analysis, including building a control chart. In past SEDS implementations I did not use the cube approach and had to transform time-stamped data for control charting using basic SAS steps and procedures. I have since found that using data cubes is somewhat simpler and, in some cases, requires no programming at all when modern BI tools (such as BIRT) are used.
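To make the idea concrete, here is a minimal Python sketch (not part of the BIRT solution; the function names are illustrative) of how time-stamped samples collapse into the per-weekhour reference profile that the control chart is drawn from:

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev

def weekhour(ts):
    """Map a timestamp to its position inside the week: 0..167
    (Monday 00:00 is 0)."""
    return ts.weekday() * 24 + ts.hour

def reference_profile(samples):
    """Group (timestamp, value) pairs by weekhour and summarize.

    Returns {weekhour: (mean, stdev)} -- the 2D projection that a
    MASF/IT-Control Chart is drawn from.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[weekhour(ts)].append(value)
    return {wh: (mean(vals), stdev(vals) if len(vals) > 1 else 0.0)
            for wh, vals in buckets.items()}
```

With the reference statistics keyed by weekhour, the chart is just the baseline mean, the control limits derived from it, and the actual week plotted on the same axis.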

Below are some screenshots with comments that illustrate the process of building the IT-Control Chart using a BIRT Cube.

The data source (input data) is a table with a date/hour-stamped single metric and at least 4 months of history (in this case, the CPU utilization of a Unix box). It could be in any database format; in this particular example it is the following CSV file:

The result (in the form of a BIRT report designer preview) is in the following picture (where UCL is the Upper Control Limit; the LCL is not included, for simplicity):

Before building the cube, the three following data sets were built using the BIRT “Data Explorer”:
(1) The reference set, or baseline (just “Data Set” in the picture), is based on the input raw data with some filtering and computed columns (weekday and weekhour);
(2) the actual data set is the same, but with a different filter: (raw[“date”] Greater “2011-04-02”);
(3) to combine both data sets for comparing the baseline vs. the actuals, “Data Set1” is built as a “Joint Data Set” using the following BIRT Query Builder:
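Outside of BIRT, the effect of that joint data set can be sketched in Python as an inner join of baseline statistics and actual observations on the weekhour key (a hypothetical helper, assuming the baseline has already been aggregated to a mean and standard deviation per weekhour):

```python
def join_on_weekhour(baseline, actual):
    """Inner-join baseline stats with actual observations on weekhour.

    baseline: {weekhour: (mean, stdev)}
    actual:   {weekhour: value}
    Returns rows of (weekhour, baseline_mean, baseline_stdev, actual_value),
    one row per weekhour present in both sets.
    """
    return [(wh, *baseline[wh], actual[wh])
            for wh in sorted(set(baseline) & set(actual))]
```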
The data cube was then built in the BIRT Data Cube Builder, with the structure shown on the following screen:
Note that only one dimension is used here, weekhour, as that is all that is needed for the cross table report below.

The next step is building the report, starting with a Cross Table (picked as an object from the BIRT report designer “Palette”):
The picture above also shows which fields are chosen from the cube for the cross table.

The final step is dropping a “Chart” object from the “Palette” and adding the UCL calculation, using the Expression Builder, as an additional Value (Y) Series:
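In BIRT the UCL is just an Expression Builder formula; the same calculation can be sketched in Python, assuming the usual MASF-style limit of mean plus 3 standard deviations (the function name and row layout are illustrative):

```python
def ucl_series(rows, k=3.0):
    """For each (weekhour, mean, stdev, actual) row, compute the
    Upper Control Limit as mean + k * stdev (k = 3 by default).
    Returns (weekhour, baseline, ucl, actual) rows ready for charting."""
    return [(wh, m, m + k * s, a) for wh, m, s, a in rows]
```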

To see the result, one just needs to run the report or use the “Preview” tab in the report designer window:

                FINAL COMMENTS

- The BIRT report package can be exported and run under any portal (e.g. IBM TCR).
- It makes sense to specify and use additional cube dimensions, such as server name and/or metric name.
- The report can be designed in BIRT with parameters; for example, it is a good idea to use the server name as a report parameter.
- To follow the “SEDS” idea and base the reporting process on exceptions, a preliminary exception detection step is needed; it can again be done within a BIRT report, using a SQL script similar to the one published in a previous post:
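The SQL script itself is not reproduced here, but the exception detection logic can be sketched in Python: flag the weekhours whose actual value breaks the upper control limit (an illustrative sketch, not the actual script):

```python
def detect_exceptions(rows, k=3.0):
    """Return the weekhours whose actual value exceeds the upper
    control limit (actual > mean + k * stdev) -- the SEDS-style
    exception filter that decides which charts are worth reporting.

    rows: iterable of (weekhour, mean, stdev, actual) tuples."""
    return [wh for wh, m, s, a in rows if a > m + k * s]
```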



  1. I would like to repeat the same exercise (cube usage for control charting) against the same data, but stored in a MySQL table. Also, as opposed to the non-programming approach, I am interested in developing a SQL script to do the same data transformation and then charting the result using BIRT or R.

    Finally, my plan is to do it all in R, i.e. to develop an R-based, open-source application (SEDS-lite) to:
    - connect to a database (MySQL as the test example);
    - filter out exceptions;
    - for each exception, transform the data for control charting;
    - build control charts;
    - put the list of exceptions and control charts in a web report.

    Any help, comments, or offers of contribution are very welcome.

  2. Hi Igor, I've been trying to replicate this and am having some trouble. Do you actually compute the standard deviation while constructing the data cube, or are you using just the mean to calculate the UCL?

    1. No, I use the STDDEV aggregation function against the expression measure["Data Set::% CPU Used1"] for a calculated field in the cross table, within the report itself. I also calculate the UCL as part of chart building. Nothing like that is needed at the cube-building level, though maybe it is possible...

      BTW, I plan to convert this (and other) posts into video presentations to make them more understandable (I have already published one on YouTube: http://itrubin.blogspot.com/2011/09/power-of-control-charts-and-it-chart.html).

  3. Greetings, Noble Igor - I notice in your SQL code that you are comparing the last 7 days to the last 180 days (where the larger set is the basis for your reference set). I think you should not include the last 7 days in the 180-day set, because: (1) there could be outliers in the recent data that will affect the means and standard deviations, and (2) by including it you are comparing it to itself, as a subset of the larger data (like autocorrelation).
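The windowing this comment suggests, a 180-day reference set that stops short of the most recent 7 days, can be sketched in Python as follows (the function name and window lengths are illustrative, not code from the post):

```python
from datetime import date, timedelta

def baseline_window(samples, today, base_days=180, recent_days=7):
    """Split (date, value) samples into a reference set (the last
    `base_days`, excluding the most recent `recent_days`) and an actual
    set (the last `recent_days`), so the recent week is never compared
    against itself as a subset of the baseline."""
    base_start = today - timedelta(days=base_days)
    recent_start = today - timedelta(days=recent_days)
    baseline = [(d, v) for d, v in samples if base_start <= d < recent_start]
    actual = [(d, v) for d, v in samples if recent_start <= d <= today]
    return baseline, actual
```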

  4. Change the baseline select to something like this:


    If you have identified and kept outliers in a separate table, then:

    (Select DATE from OUTLIER_TABLE

    1. Tim - Both of your comments are valid, and in real implementations I do exactly what you pointed out, plus some other things... In this and other posts I have simplified some steps just so as not to scare off potential readers/users. Thank you for your notes; they are a good complement to my posts.
