Sunday, February 21, 2016

How to load large fastmap file into OpenPages

1. Context

Large fast map file is defined as more than 20,000 rows and 100+ columns; file size is about 4-8MB. We have experienced problems with three cases below
1) Large fastmap files cannot load with trigger enabled:
When trigger is NOT disabled, and system is used by a few users, fast map file can NOT be loaded with probability of 90%. Error is happened at any random row, which is {Error: Error Exception Sheet(ORX Losses) Row(13334): Failed to get localized application text for key [bns.custom.quarter.trigger.event.associate.fail] }
2) More than 5 Large fastmap files with 10,000 cannot be pushed into queue when trigger is disabled.
When trigger is disabled, Maximum Workbook Rows: 20,000, Transaction timeout: 2 hours. we cannot  put more than 5 Large fastmap files with 10,000  files into the queue. It hangs there for validation for the 5th one; also all files in queue are failed.
3) Large fastmap files with 40,000 cannot be pushed into queue:
When trigger is disabled, Maximum Workbook Rows: 200,000, Transaction timeout: 4 hours, files with 40,000 could NOT even put into the queue. It hangs there for validation.  Also servers are “crashed” because of this action
Obviously, we can perform some performance tuning on the server. However, given that the server resource is limited, there is always a load limit. We need to figure out a practical solution.
  1. Analysis and approach

As server has only limited power in term of memory and CPU, we need to split a big fastmap file into multiple files. The effort to split file is very big based on two reasons:
  1. The master file is big, it will take time to open it and cut records even with VBA.
  2. The manual process is very error prone.


The practical solution to load large file is listed below
1) Write a program to split file, which will actually save time and eliminate manual errors
2) Cut file with reasonable number of records, with this number, we can put all split files in queue,
3) Disable triggers and put all split files into queue, and let system run at night, or weekend.
A generic ETL to split files is explained.


  1. Fastmap Splitter

The idea is to turn the manual process into program, where SSIS and VBA are used. A small sample will be sued to explain.


    1. Input

Control file: provides a way to split file, see below


Raw file: file to be split based on control
    1. Result



    1. ETL and code











using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using Excel = Microsoft.Office.Interop.Excel;
using Microsoft.SqlServer.Dts.Runtime;


namespace ST_cdf2d5d3e89e44beb96c4c423dc86887.csproj
{
   [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
   public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
   {


       #region VSTA generated code
       enum ScriptResults
       {
           Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
           Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
       };
       #endregion


       
       public void Main()
       {


           try
           {


               String filename = Dts.Variables["User::uTargetFileNameRenamed"].Value.ToString();
               int startRow = Convert.ToInt32(Dts.Variables["User::uStartRow"].Value.ToString());
               int fileNo = Convert.ToInt32(Dts.Variables["User::uFileNo"].Value.ToString());
               int endRow = Convert.ToInt32(Dts.Variables["User::uEndRow"].Value.ToString());


               Excel._Application _app = new Excel.Application();               
               Excel.Workbooks _books;
               Excel.Workbook _book = _app.Workbooks.Open(filename, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
               Excel.Sheets _sheets;
               Excel.Worksheet _sheet = new Excel.Worksheet();
               
               _books = _app.Workbooks;
               _sheets = (Excel.Sheets)_book.Worksheets;


               int SheetCount = 0;
               SheetCount = _sheets.Count;
               int i = 1;
               while (i <= SheetCount)
               {
                   _sheet = (Excel.Worksheet)_sheets[i];
                   if (_sheet.Name != "Definition")
                   {


                       _sheet.Select(Type.Missing);


                       int totalColumn =_sheet.Columns.Count;
                       int totalRows =_sheet.Rows.Count;


                       if (fileNo > 1)
                       {
                           Excel.Range range1 = _sheet.get_Range(_sheet.Cells[2, 1], _sheet.Cells[startRow, totalColumn]);
                           range1.EntireRow.Delete(Excel.XlDeleteShiftDirection.xlShiftUp);
                           System.Runtime.InteropServices.Marshal.ReleaseComObject(range1);
                       }


                       Excel.Range range2 = _sheet.get_Range(_sheet.Cells[endRow - startRow + 1 + 1 + 1, 1], _sheet.Cells[totalRows, totalColumn]);
                       range2.EntireRow.Delete(Excel.XlDeleteShiftDirection.xlShiftUp);
                       System.Runtime.InteropServices.Marshal.ReleaseComObject(range2);
                   }
                   i++;
               }


               _book.Save();
               _book.Close(false, Type.Missing, Type.Missing);


               _app.Quit();


               System.Runtime.InteropServices.Marshal.ReleaseComObject(_app);
               System.Runtime.InteropServices.Marshal.ReleaseComObject(_books);
               System.Runtime.InteropServices.Marshal.ReleaseComObject(_book);
               System.Runtime.InteropServices.Marshal.ReleaseComObject(_sheets);
               System.Runtime.InteropServices.Marshal.ReleaseComObject(_sheet);
           
               GC.Collect();
               GC.WaitForPendingFinalizers();
           }
         
           catch (Exception ex)
           {
               Dts.TaskResult = (int)ScriptResults.Failure;
               MessageBox.Show(ex.Message);
           }
              
           Dts.TaskResult = (int)ScriptResults.Success;
       }
   }
}

No comments:

Post a Comment