Sunday, October 4, 2015

How to quickly validate data and apply case statement by using SSIS expression

  1. How to validate data using SSIS

Whether you load data into data warehouse, or generate IBM OpenPages fastmap, all dimension data must be valid. There are many different way to perform such task. Below is a simplest way to make this validation simple and effective with SSIS expression.
The requirement is to provide business users an overview with all data issues. The proposal is to simply mark record and show what column with problems
Three major dataflow elements is needed
  1. Add validation columns


Sample:  (ISNULL(Event_Basel) || Event_Basel == "" || Event_Basel == "Internal Fraud" || Event_Basel == "External Fraud" || Event_Basel == "Employment Practices and Workplace Safety" || Event_Basel == "Clients, Products and Business Practices" || Event_Basel == "Damage to Physical Assets" || Event_Basel == "Business Disruption and System Failures" || Event_Basel == "Execution, Delivery and Process Management" ? "" : "X")


  1. Derive a column to combine all problems




([V__Causal Category__Risk_Name] == "X" ? "***Causal Category" : "") + ([V__Scotiabank Risk Categories__Scotiabank Risk Category] == "X" ? "***Scotiabank Risk Categories" : "") + ……+ ([V__BE Folder Path Row 2__Combined_Div_SubDiv_BE] == "X" ? "***BE Folder Path Row 2" : "") + ([V__Business Entity__LE Business Entity] == "X" ? "***Business Entity" : "")


  1. Check whether record has problem(s)


  1. Export it into error log in excel


  1. How to apply case statement by using SSIS expression



When selecting data from a database using T-SQL, one can use CASE statements to derive new columns based on data in the database. However, when reading data from files in SQL Server Integration Services (SSIS) there isn't a CASE statement readily available when new columns need to be derived or existing values need to be replaced.
SSIS expressions with case statement:
(DT_I8)([Recovery Category] == "Direct (Incl. Legal)" ? Local_Curr_Recovery :
[Recovery Category] == "Insurance (Indirect)" ? Local_Curr_Insurance :

[Recovery Category] == "Both" ? Local_Curr_Recovery + Local_Curr_Insurance : 0.0)

No comments:

Post a Comment