Wednesday, May 30, 2012

How to handle inferred member for “Late Arriving Dimension”

Context

This is very general practice in warehouse project: Sometimes we will get the transactional data without dimensional data. It is called as “early arriving facts” or “Late Arriving Dimension”. There are a few of good solutions though, all of which are handled during ETL:


  1. Hold onto the fact until all dimensions arrive
  2. Create a dimension called ‘unknown’ or ‘not available’ with a primary key of -1
  3. Insert a dummy row in the dimension and populate it with whatever you can

Usually, the first approach is not applied, as business want to get result even some dimensions not available.  The exception mainly is handled during the populate fact table when there is NO match found in dimension table. An inferred member is used in third approach.  This document is intended to provide overview to handle Inferred member, along with ETL process.

Detail



  1. Add new table  as  Admin_Exception to save all exception records
    Column name Data type Description
    Log_IdBigintAuto ID
    SrcTableNamevarchar(80)Staging table name
    SrcColumnNamevarchar(80)Staging column name
    TypeInt1: Null or Blank; 2: Inferred Dimension
    RecordIDIntUnique record ID from each staging table
    Batch_NumberIntReference to Admin_Batch_Status.Batch_Number
    MessageText
    DatetimeDatetimeDefault as getdate()


    1. Add new SP to insert data into Admin_Exception  
    Input:
    • [@RecordID = ] RecordID
    • [@SrcTableName = ]’ SrcTableName’
    • [@SrcColumnName = ]’ SrcColumnName’
    • [@Type = ]’ Type’
    • [@Message = ] ‘Message’
    Output: NONE
    Logic: add Batch_Number
    1. While there will not be  any changing for current existing package at all, a few adjustments will be added, When populating fact tables, all records that are not matched to individual dimension, will be insert into Admin_Exception using usp_InsertExecption. Please see sample below:

    • Use conditional split to split not matched dimension into NULL or Blank and inferred Member
    • Use Ole Command to add Exception into table Admin_Exception using usp_InsertExecption, the difference between two exceptions are the description.
    • Use derived column to set Surrogate key as 0
    • Use Ole Command to insert Inferred member using usp_InsertInferred<Dimtable> to Insert new dimension into correspondent dimension table, with IsInferred = 1. Each dimension has one stored procedure accordingly. In addition, Logic is already built in to avoid duplicated insertion.
    • Use lookup again to get Surrogate key, while the lookup condition is with NO cache, and only apply for inferred member
    • Use Union all to combine all records from three channels

    Note

    Use the same stored procedure to insert the inferred dimension and return the newly inserted dimension’s key. Sample code of member dimension:
    CREATE PROCEDURE [dbo].[usp_TestDimMember]
    ( @MemberAlternateKey Varchar(16),@outval int OUTPUT)
    AS

    BEGIN TRY
    IF not exists (select MemberAlternateKey from DimMember where MemberAlternateKey = @MemberAlternateKey and toDate is NULL)
    BEGIN
    insert into DimMember ( MemberAlternateKey,isInferred ) values  (@MemberAlternateKey,1)
    SET @outval = SCOPE_IDENTITY()
    END
    ELSE
    BEGIN
    SELECT @outval = MAX(MemberKey) from DimMember where MemberAlternateKey = @MemberAlternateKey and toDate is NULL
    END
    END TRY

    BEGIN CATCH
       DECLARE @errorString  nvarchar(4000), @errorSeverity int, @errorState int
       -- Common handler for exception message formatting and transaction rollback.
    EXEC usp_Process_Exception @errorString OUT, @errorSeverity OUT, @errorState OUT
       -- Pass the collected error information up the stack.
       RAISERROR (@errorString,@errorSeverity,@errorState);
    END CATCH
    • Create index for the business key column. For example, the MemberAlterKey column of DimMember table.
    • Use Multicast component and OLEDB component to log the exception instead of using a stored procedure.

No comments:

Post a Comment