Friday, September 3, 2010

Multi-Dimensional Design

Learnt about multi-dimensional database designing this week. Assuming we all know what a Cube...errr a multi-dimensional Structure is, let us know more about their design. A multi-dimension database is a collection of many such related cubes.  The dimensions of the multi-dimensional structure provide slice and dice features for the data. Following points should be kept in mind while designing such database-

1) Identify additive and non-additive items(or facts) - For example price of products are non-additive while the number of units of a product are additive.

2) Sparsity and Density of the dimension -

Sparse dimension means that only a small fraction of the available data positions are filled. For example - if a cube has a dimension "Top 10 customer"then only few data positions will be filled as only few customer would fit in that dimension and rest all customers will have that data position empty.

Dense dimension would have larger percentage of available data positions occupied.

3) Star schema or Snowflake schema for storing the data.
Star schema has one main table(fact table)having composite primary key connected to the dimension tables through foreign key - primary key relationship.

Star schema requires a separate fact table called as Aggregate table for each hierarchy level.





From star schema example 
Snowflake schema is similar to a star schema having normalised dimensions. There is still 1 fact table joined to various dimension tables which are further broken into closely related tables after proper database normalisation.
Snowflake is useful to handle sparse dimensions but star schema is good if the queries are expected to be simple as less number of joins will be  required than a snowflake schema.





From snowflake schema example
4) Calculating measures from other measures - The decision has to be carefully made if a calculated measure has to be stored pre-calculated in the fact table or calculated on the fly as requested. The storage space, computational time and also the cross dimension nature of such measures are to be kept in mind.

5) Keep room for changing dimensions like customer addresses


Various tools used for designing multi-dimensional databases are MS Analysis Services ( which we are using in the tutorials :), Palo, Pentaho etc.


There are various methods to conceptualise a multi-dimensional model like-


 1) Thomsen's diagram -  A few vertical & horizontal lines is all you need to represent the model.
 Its beauty is its Simplicity. Also POD's favourites!

 2) Adapt's diagram - Symbolic representation of the model. Microsoft is one of its patron, so it might gain popularity.

3) Miscrosoft PivotDiagrams - Graphical representation.

Too much for today! Time to say goodbye and work on the assignments which are running faster than me towards the due date.

Cognos 24 ways coming up in next blog. It reminds me to order the free book from Cognos before they figure out from their BI reports that Monash students are among their TOP 10 buyer's and they stop shipping freebies!

Keep Blogging!

No comments:

Post a Comment