Based on the
design principle from Kimball, there are many clear defined guideline for
different facts (transactional, periodic snapshot and accumulative snapshot) and
dimensions (regular, fact, junk, etc.). The key concept is to use surrogate key
to make a clear star and slow flake dimension, regardless of whether it is slow
change dimension type 1 or type 2. However, this approach may not be used
everywhere with data warehouse project. It have been many variations implemented
in real life to reduce cost and complexity. In other word, we can establish
dimension model (Cognos DMR) without ETL and STAR SCHEMA in database. A few
approaches are listed in this document.
1. Standard approach
Description:- Create physical table for each dimension
- Use surrogate key in dimension (SCD1 and SCD2)
- Star schema
- DMR will be established based on star schema
Advantages:
- Conformed dimensions can be easily established.- Superior performance
Disadvantages:
- Big effort, need to develop serious ETL.- Need to handle late arrived dimensions
- Create physical table for each dimension, whether it is static or dynamic.
- The relationship between fact and dimension is established by business key, in many cases, you need to create a business key adaptor to establish relationship.
- Star schema
- DMR will be established based on star schema
Advantages:
- Conformed dimensions can be easily established.- Much less effort to create data mart
Disadvantages:
- You may still need to handle late arrived
dimensions- Cannot or very difficult to support slow change dimension type 2
3. Create dimension based on multiple fact tables
Description:- NO physical table for dimension is created, Instead, we can use the query to define dimension, such as
select distinct Column1, Column2, column3 ... from fact1
UNION
select distinct Column1, Column2, column3 ... from fact2
- The relationship between fact and dimension is established by business key, such as connect Fact 1 to this dimension, and Fact 2 to this dimension
- Star schema
- DMR will be established based on star schema
Advantages:
- Conformed dimensions can be established, as
both Fact 1 and Fact 2 shares the same dimension.- no physical dimension and ETL are needed for this dimension.
- Do not need to handle late arrived dimensions, as the data integrity is automatically enforced.
Disadvantages:
- Cannot support slow change dimension type 2- Could result in a bad performance
- NO physical table for dimension is created
- NO query for dimension is created, and therefore no separated query subject is created. Instead , create additional columns for hierarchy on business layer.
- NO relationship between fact and dimension is needed, as dimension is directly come from fact table
- NO Star schema
- DM layer will be established based on fact table
Advantages:
- no physical dimension and ETL are needed for
this dimension.- Do not need to handle late arrived dimensions, as the data integrity is automatically enforced.
Disadvantages:
- Cannot support slow change dimension type 2- Could be result in a bad performance, as dimension will be extracted from fact on the fly
- Conformed dimensions can NOT established
5.
Create dimension based on single fact table
with NO hierarchy
Description:- NO physical table for dimension is created.
- NO query for dimension is created.
- NO query subject in both data tier and business tier is created.
- NO relationship between fact and dimension is needed, as dimension is directly come from fact table
- NO Star schema
- dimension will be established by simply dragging one attribute to DM layer
Advantages:
- no physical dimension and ETL are needed for
this dimension.- Do not need to handle late arrived dimensions, as the data integrity is automatically enforced.
Disadvantages:
- Cannot support slow change dimension type 2- Could result in a bad performance, as dimension will be extracted from fact on the fly
- Conformed dimensions can NOT established
- No hierarchy can be used.
No comments:
Post a Comment