Saturday, September 8, 2012

How to generate Year to Date (YTD) data for scorecard using SQL statement – an practical approach

Context


In scorecard development, Year to Month data metrics is frequently used. The best way to archive such a metrics is to create a metrics source table to contain Year to Month data. It seems to be an easy task from business point of view; however, it can be a pretty challenge task. The main issue is that month data can be missed.

Given a sample below

Month data

MonthSales
Jan$10
Mar$10


Year to Month data

MonthSales
Jan$10
Feb$10
Mar$20


The key is that you need to provide data for the month, even when there is no sale. In case when data is not populated for metric table, user won’t see metric for that month, it is February in this case.

This document is intended to provide a practical solution using SQL statements. gosalesdw database in SQL server is used. Sure that you can use ETL to handle it as well, but it could be complicated process, especially when data volume is very big.   You can download script for the review in detail, the goal is to generate metric source data as below


Implementation


Step 1: Create a master temp table to hold records. In order words, even there is not data for a specific month, year to month data should be available.  

SELECT F.organization_key AS [Dim1],
      F.employee_key     AS [Dim2],
      DT.[year]          AS [Year],
      DT.[month]         AS [Month]
INTO   #t_master_records
FROM   (SELECT F.organization_key,
              F.employee_key
       FROM   gosalesdw.sls_sales_fact F
              INNER JOIN (SELECT DISTINCT [current_year]  AS [year],
                                          [current_month] AS [Month]
                          FROM   gosalesdw.go_time_dim
                          WHERE  [current_year] = 2007
                                 AND [current_month] <> 0) DT
                      ON ( F.order_day_key / 10000 = DT.[year]
                           AND order_day_key / 100 - ( order_day_key / 10000 )
                                                     * 100 =
                               DT.[month] )
       WHERE  F.organization_key = 11101
              AND F.employee_key = 4001
       GROUP  BY F.organization_key,
                 F.employee_key) F,
      (SELECT DISTINCT [current_year]  AS [year],
                       [current_month] AS [Month]
       FROM   gosalesdw.go_time_dim
       WHERE  [current_year] = 2007
              AND [current_month] <> 0) DT

SELECT *
FROM   #t_master_records
ORDER  BY [month]



Step 2: Create an actual temp table to actual data for each month

SELECT F.organization_key                                      AS [Dim1],
      F.employee_key                                          AS [Dim2],
      F.order_day_key / 10000                                 AS [Year],
      F.order_day_key / 100 - ( order_day_key / 10000 ) * 100 AS [Month],
      Sum(F.sale_total)                                       AS [Actual]
INTO   #t_actual
FROM   gosalesdw.sls_sales_fact F
      INNER JOIN (SELECT DISTINCT [current_year]  AS [year],
                                  [current_month] AS [Month]
                  FROM   gosalesdw.go_time_dim
                  WHERE  [current_year] = 2007
                         AND [current_month] <> 0) DT
              ON ( F.order_day_key / 10000 = DT.[year]
                   AND order_day_key / 100 - ( order_day_key / 10000 ) * 100 =
                       DT.[month] )
WHERE  F.organization_key = 11101
      AND F.employee_key = 4001
GROUP  BY F.organization_key,
         F.employee_key,
         F.order_day_key / 10000,
         F.order_day_key / 100 - ( order_day_key / 10000 ) * 100

SELECT *
FROM   #t_actual
ORDER  BY [month]


Step 3: Create a target temp table to target data for each month

SELECT F.organization_key                      AS [Dim1],
      F.employee_key                          AS [Dim2],
      F.month_key / 100                       AS [Year],
      F.month_key - ( month_key / 100 ) * 100 AS [Month],
      Sum(F.sales_target)                     AS [Target]
INTO   #t_target
FROM   gosalesdw.sls_sales_targ_fact F
      INNER JOIN (SELECT DISTINCT [current_year]  AS [year],
                                  [current_month] AS [Month]
                  FROM   gosalesdw.go_time_dim
                  WHERE  [current_year] = 2007
                         AND [current_month] <> 0) DT
              ON ( F.month_key / 100 = DT.[year]
                   AND month_key - ( month_key / 100 ) * 100 = DT.[month] )
WHERE  F.organization_key = 11101
      AND F.employee_key = 4001
GROUP  BY F.organization_key,
         F.employee_key,
         F.month_key / 100,
         F.month_key - ( month_key / 100 ) * 100

SELECT *
FROM   #t_target
ORDER  BY [month]



Step 4: Get metrics data using mater table to join with both actual and target data, where Year to month actual and target data will be generated. Please note that date information has been change to date format so that metric calendar can be applied.

SELECT #t_master_records.[dim1],
      #t_master_records.[dim2],
      CONVERT(DATETIME, CONVERT(VARCHAR(4), #t_master_records.[year])
                        + '-'
                        + CONVERT(VARCHAR(2), #t_master_records.[month])
                        + '-01') AS [Date],
      Sum(#t_actual.actual)      AS Actual,
      Sum(#t_target.[target])    AS [Target]
FROM   #t_master_records
      LEFT JOIN #t_actual
             ON ( #t_actual.[dim1] = #t_master_records.[dim1]
                  AND #t_actual.[dim2] = #t_master_records.[dim2]
                  AND #t_actual.[year] = #t_master_records.[year]
                  AND #t_actual.[month] <= #t_master_records.[month] )
      LEFT JOIN #t_target
             ON ( #t_target.[dim1] = #t_master_records.[dim1]
                  AND #t_target.[dim2] = #t_master_records.[dim2]
                  AND #t_target.[year] = #t_master_records.[year]
                  AND #t_target.[month] <= #t_master_records.[month] )
GROUP  BY #t_master_records.[dim1],
         #t_master_records.[dim2],
         CONVERT(DATETIME, CONVERT(VARCHAR(4), #t_master_records.[year])
                           + '-'
                           + CONVERT(VARCHAR(2), #t_master_records.[month])
                           + '-01'),
         #t_master_records.[year],
         #t_master_records.[month]
ORDER  BY #t_master_records.[month]

No comments:

Post a Comment