1. Context
A large fastmap file is defined as one with more than 20,000 rows and 100+ columns; the file size is about 4-8 MB. We have experienced problems in the three cases below.
1) Large fastmap files cannot load with the trigger enabled:
When the trigger is NOT disabled and the system is in use by even a few users, a large fastmap file fails to load about 90% of the time. The error occurs at a random row, for example: {Error: Error Exception Sheet(ORX Losses) Row(13334): Failed to get localized application text for key [bns.custom.quarter.trigger.event.associate.fail] }
2) More than 5 large fastmap files of 10,000 rows each cannot be pushed into the queue when the trigger is disabled:
With the trigger disabled, Maximum Workbook Rows set to 20,000, and a transaction timeout of 2 hours, we cannot put more than 5 large fastmap files of 10,000 rows each into the queue. The 5th file hangs during validation, and all files in the queue then fail.
3) Large fastmap files with 40,000 rows cannot be pushed into the queue:
With the trigger disabled, Maximum Workbook Rows set to 200,000, and a transaction timeout of 4 hours, files with 40,000 rows could not even be put into the queue. They hang during validation, and the servers "crashed" under this load.
Obviously, we can do some performance tuning on the server. However, given that server resources are limited, there is always a load limit. We need a practical solution.
2. Analysis and approach
As the server has only limited memory and CPU, we need to split a big fastmap file into multiple files. The effort to split the file manually is large, for two reasons:
- The master file is big; it takes time to open it and cut records, even with VBA.
- The manual process is very error prone.
The practical solution to load a large file is:
1) Write a program to split the file, which saves time and eliminates manual errors.
2) Cut the file into a reasonable number of records per chunk, so that all split files fit in the queue.
3) Disable triggers, put all split files into the queue, and let the system run overnight or over the weekend.
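Step 2 above, computing where to cut, can be sketched generically. The following Python snippet is illustrative only (the function name and chunk size are hypothetical); it computes the 1-based data-row range each split file should keep, with the header row excluded from the count:

```python
def split_ranges(total_rows, chunk_size):
    """Compute (file_no, start_row, end_row) triples for splitting
    total_rows data rows (header excluded) into chunks of chunk_size."""
    ranges = []
    file_no = 1
    start = 1
    while start <= total_rows:
        end = min(start + chunk_size - 1, total_rows)
        ranges.append((file_no, start, end))
        file_no += 1
        start = end + 1
    return ranges

# e.g. 40,000 data rows cut into 10,000-row chunks
print(split_ranges(40000, 10000))
# → [(1, 1, 10000), (2, 10001, 20000), (3, 20001, 30000), (4, 30001, 40000)]
```

These (file_no, start_row, end_row) triples are exactly the kind of entries a control file can carry.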
A generic ETL process to split files is explained below.
Fastmap Splitter
The idea is to turn the manual process into a program, using SSIS and VBA. A small sample is used to explain.
Input
Control file: specifies how the raw file is to be split (see below).
Raw file: the file to be split according to the control file.
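As a language-neutral sketch of the control-file idea (the format here, one `file_no,start_row,end_row` entry per line, is an assumption for illustration), the raw file's data rows can be carved into chunks that each keep the header row:

```python
def parse_control(text):
    """Parse control lines of the hypothetical form 'file_no,start_row,end_row';
    rows are 1-based data rows, with the header row excluded."""
    entries = []
    for line in text.strip().splitlines():
        file_no, start, end = (int(x) for x in line.split(","))
        entries.append((file_no, start, end))
    return entries

def split_raw(raw_text, control_text):
    """Return {file_no: chunk_text}; every chunk keeps the header row."""
    header, *data = raw_text.strip().splitlines()
    chunks = {}
    for file_no, start, end in parse_control(control_text):
        chunks[file_no] = "\n".join([header] + data[start - 1:end])
    return chunks

# Tiny demo: 4 data rows split into two 2-row files
control = "1,1,2\n2,3,4"
raw = "id,amount\nr1,10\nr2,20\nr3,30\nr4,40"
for no, chunk in split_raw(raw, control).items():
    print(f"--- file {no} ---")
    print(chunk)
```

The actual splitter below works on Excel workbooks via Interop instead of text, but the bookkeeping is the same.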
Result
ETL and code
using System;
using System.Data;
using System.Windows.Forms;
using Excel = Microsoft.Office.Interop.Excel;
using Microsoft.SqlServer.Dts.Runtime;
namespace ST_cdf2d5d3e89e44beb96c4c423dc86887.csproj
{
[System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
#region VSTA generated code
enum ScriptResults
{
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
};
#endregion
public void Main()
{
try
{
// Read the SSIS package variables: target file, chunk number, and the 1-based data-row range to keep
String filename = Dts.Variables["User::uTargetFileNameRenamed"].Value.ToString();
int startRow = Convert.ToInt32(Dts.Variables["User::uStartRow"].Value.ToString());
int fileNo = Convert.ToInt32(Dts.Variables["User::uFileNo"].Value.ToString());
int endRow = Convert.ToInt32(Dts.Variables["User::uEndRow"].Value.ToString());
Excel._Application _app = new Excel.Application();
Excel.Workbooks _books;
Excel.Workbook _book = _app.Workbooks.Open(filename, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
Excel.Sheets _sheets;
Excel.Worksheet _sheet = null;  // assigned from _sheets inside the loop; Worksheet cannot be instantiated directly
_books = _app.Workbooks;
_sheets = (Excel.Sheets)_book.Worksheets;
int SheetCount = 0;
SheetCount = _sheets.Count;
int i = 1;
while (i <= SheetCount)
{
_sheet = (Excel.Worksheet)_sheets[i];
if (_sheet.Name != "Definition")  // leave the Definition sheet untouched
{
_sheet.Select(Type.Missing);
// UsedRange limits the deletes to populated cells; Rows.Count/Columns.Count
// on the sheet itself would return the full grid maximum
int totalColumn = _sheet.UsedRange.Columns.Count;
int totalRows = _sheet.UsedRange.Rows.Count;
if (fileNo > 1)
{
// Chunks after the first: remove the data rows above this chunk
// (sheet rows 2..startRow hold data rows 1..startRow-1)
Excel.Range range1 = _sheet.get_Range(_sheet.Cells[2, 1], _sheet.Cells[startRow, totalColumn]);
range1.EntireRow.Delete(Excel.XlDeleteShiftDirection.xlShiftUp);
System.Runtime.InteropServices.Marshal.ReleaseComObject(range1);
}
// Kept data rows now occupy sheet rows 2..(endRow - startRow + 2);
// delete everything from the next row down to the end of the used range
Excel.Range range2 = _sheet.get_Range(_sheet.Cells[endRow - startRow + 3, 1], _sheet.Cells[totalRows, totalColumn]);
range2.EntireRow.Delete(Excel.XlDeleteShiftDirection.xlShiftUp);
System.Runtime.InteropServices.Marshal.ReleaseComObject(range2);
}
i++;
}
_book.Save();
_book.Close(false, Type.Missing, Type.Missing);
_app.Quit();
System.Runtime.InteropServices.Marshal.ReleaseComObject(_app);
System.Runtime.InteropServices.Marshal.ReleaseComObject(_books);
System.Runtime.InteropServices.Marshal.ReleaseComObject(_book);
System.Runtime.InteropServices.Marshal.ReleaseComObject(_sheets);
System.Runtime.InteropServices.Marshal.ReleaseComObject(_sheet);
GC.Collect();
GC.WaitForPendingFinalizers();
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
Dts.TaskResult = (int)ScriptResults.Failure;
return;  // do not fall through and overwrite the failure result
}
Dts.TaskResult = (int)ScriptResults.Success;
}
}
}
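The two-delete arithmetic in the script (header in row 1, `startRow`/`endRow` as 1-based data rows) is easy to get wrong, so a quick boundary simulation helps. This Python sketch is purely a sanity check of the index math, not part of the package:

```python
def kept_rows(total_data_rows, start_row, end_row, file_no):
    """Simulate the script's two deletions on 1-based data rows."""
    rows = list(range(1, total_data_rows + 1))
    if file_no > 1:
        # first delete: sheet rows 2..startRow, i.e. data rows 1..startRow-1
        rows = rows[start_row - 1:]
    # second delete: everything below sheet row endRow - startRow + 2,
    # leaving endRow - startRow + 1 data rows
    keep = end_row - start_row + 1
    return rows[:keep]

print(kept_rows(100, 31, 60, 2))  # data rows 31..60 remain
```

With 100 data rows and chunk (fileNo=2, startRow=31, endRow=60), exactly rows 31 through 60 survive, which confirms the `endRow - startRow + 3` starting point of the second delete.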