Problem
Sometimes, we can`t directly use fact table from database, instead, we use query as fact table. The final SQL query for report is then generated with the sub query as follows:
Select
From
(Fact sub query),
Dimension 1,
Dimension 2, ...
Dimension N
As we can see, query will get data from sub query, and then filter data out from all dimensions. If the sub query has small dataset, then this query performance is acceptable. However, when there are millions records retrieved from sub query, and then such a query will result in a very poor performance. The problem here is that we cannot break sub query, as business logic is implemented into sub query.
Solution
The idea is to embed all parameters directly into query with prompt macro. This way provides an opportunity to optimize query and therefore result in a better performance. The sample below uses Cognos sample database, where a simple query subject is created.
--(1) First part is query with business logic. RETURNED_ITEM, ORDER_DETAILS and ORDER_HEADER are joined together. The difference is that all dimensions TIME_DIMENSION, PRODUCT and ORDER_METHOD are built into query as well.
select
TIME_DIMENSION.DAY_KEY AS DAY_KEY,
PRODUCT.PRODUCT_NUMBER AS PRODUCT_NUMBER,
ORDER_METHOD.ORDER_METHOD_CODE AS Order_method_code,
RETURNED_ITEM.RETURN_QUANTITY AS Return_quantity
from GOSALES.RETURNED_ITEM RETURNED_ITEM,
GOSALES.ORDER_DETAILS ORDER_DETAILS,
GOSALES.ORDER_HEADER ORDER_HEADER,
GOSALES.TIME_DIMENSION TIME_DIMENSION
GOSALES.PRODUCT PRODUCT,
GOSALES.ORDER_METHOD ORDER_METHOD,
where ORDER_HEADER.ORDER_NUMBER = ORDER_DETAILS.ORDER_NUMBER
and ORDER_DETAILS.PRODUCT_NUMBER = PRODUCT.PRODUCT_NUMBER
and ORDER_HEADER.ORDER_METHOD_CODE = ORDER_METHOD.ORDER_METHOD_CODE
and RETURNED_ITEM.ORDER_DETAIL_CODE = ORDER_DETAILS.ORDER_DETAIL_CODE
and (datepart(year,RETURNED_ITEM.RETURN_DATE) * 10000 + datepart(month,RETURNED_ITEM.RETURN_DATE) * 100) + datepart(day,RETURNED_ITEM.RETURN_DATE) = TIME_DIMENSION.DAY_KEY
and ORDER_DETAILS.PRODUCT_NUMBER = PRODUCT.PRODUCT_NUMBER
and ORDER_HEADER.ORDER_METHOD_CODE = ORDER_METHOD.ORDER_METHOD_CODE
--(2) Second part is mandatory filter, in this case, 'p_BEGIN_DATE' and 'p_END_DATE' are applied
and TIME_DIMENSION.DAY_KEY #'between '+prompt('p_BEGIN_DATE','string')+' and '+ prompt('p_END_DATE','string')#
--(3)Third part is optional filter, in this case 'p_PRODUCT_NUMBER' and 'p_ORDER_METHOD_CODE' are applied
AND # 'PRODUCT.PRODUCT_NUMBER=' +prompt('p_PRODUCT_NUMBER','string', 'PRODUCT.PRODUCT_NUMBER')#
AND # 'ORDER_METHOD.ORDER_METHOD_CODE=' +prompt('p_ORDER_METHOD_CODE','string', 'ORDER_METHOD.ORDER_METHOD_CODE')#
Assume
'p_BEGIN_DATE' = 20100101
'p_END_DATE' = 20101231
'p_PRODUCT_NUMBER' =142110
'p_ORDER_METHOD_CODE' = 2
Then query will looks like follows
....................
and TIME_DIMENSION.DAY_KEY between 20100101 and 20101231
AND PRODUCT.PRODUCT_NUMBER=142110
AND ORDER_METHOD.ORDER_METHOD_CODE=2
Assume that only mandatory filters are given
'p_BEGIN_DATE' = 20100101
'p_END_DATE' = 20101231
Then query will looks like follows
....................
and TIME_DIMENSION.DAY_KEY between 20100101 and 20101231
AND PRODUCT.PRODUCT_NUMBER= PRODUCT.PRODUCT_NUMBER
AND ORDER_METHOD.ORDER_METHOD_CODE= ORDER_METHOD.ORDER_METHOD_CODE
No comments:
Post a Comment