Sunday, March 1, 2020

Better BI Solution Architect - Avoid idealism, unnecessary advanced technology, over-architecting/over-engineering, and too much user focus.

This title might be misleading. By no means am I against idealism, advanced technology, architecture, or user focus. During my more than 20 years of business intelligence consulting, some projects failed to deliver, or failed initially. The solution architect plays a vital role in business intelligence projects, and the architect's decisions can make or break a project. Based on my experience, I have identified a few problems:

1. Idealism in this context means following the principles from academic books as closely as possible. For example, we apply surrogate keys and slowly changing dimension type 2 for data warehousing regardless of whether a specific project needs them.

2. Advanced technology is useful for resolving tough challenges. The problem is that we use advanced technology regardless of whether we need it. For example, multitier technology with an application server such as WebLogic is the right choice for building scalable applications. However, we should not use it when only a limited number of business users will use the system.

3. Over-architecting/over-engineering in this context means designing the data warehouse/business intelligence solution to be more robust or complicated than necessary. For example, our data volume is not large, but we apply incremental loading instead of simple full loading for the data flow.

4. User-focused design is the right approach; however, we should not focus on one particular user requirement that could dramatically change the overall architecture while satisfying only a tiny percentage of requirements.
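To make the full loading versus incremental loading trade-off from point 3 concrete, here is a minimal sketch in Python using an in-memory SQLite database. The `customer` table, its columns, and both function names are hypothetical and only serve as an illustration; a real ETL tool would look different. The point is that the incremental version needs a reliable `updated_at` column and a stored watermark, and it still misses deleted rows, while the full load needs none of that.

```python
import sqlite3

def full_load(conn, source_rows):
    """Replace the whole target table: no change tracking, safe to rerun."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM customer")
        conn.executemany(
            "INSERT INTO customer (id, name, updated_at) VALUES (?, ?, ?)",
            source_rows,
        )

def incremental_load(conn, source_rows, watermark):
    """Upsert only rows changed after the watermark; return the new watermark.

    Assumes a trustworthy updated_at value on every source row.
    Rows deleted in the source are NOT detected by this approach.
    """
    changed = [row for row in source_rows if row[2] > watermark]
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO customer (id, name, updated_at) "
            "VALUES (?, ?, ?)",
            changed,
        )
    return max((row[2] for row in changed), default=watermark)

# Hypothetical sample data (ISO date strings compare correctly as text).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
)
source = [(1, "Acme", "2020-01-01"), (2, "Globex", "2020-02-01")]
full_load(conn, source)
```

When the data volume is small, rerunning `full_load` on every batch is usually the simpler and less risky choice; the extra moving parts of `incremental_load` only pay off once a full reload becomes too slow.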

Solution architecture is an art, and it isn't straightforward to judge whether it is correct, especially during the initial period of a BI project. In my opinion, we should judge it by whether we can deliver a project that satisfies the business requirements with minimal effort and minimal risk. We should take a very pragmatic approach: design systems to be as simple as possible, and use complicated solutions only when we have to. In reality, many DW/BI solution architects and software developers like to solve problems the hard way, because they follow all the so-called "best practices."

I take the liberty of sharing some projects from my consulting work that failed, or initially failed, due to a wrongly architected approach.

1. Large DW/BI project to deliver risk management reports: This was a project for a big financial institution, where the data is enormous; some of the tables hold about 1 billion records. Besides, the information is very complicated and located in different source systems. They also needed to convert their entire SAS-based platform to Cognos, that is, from SAS code-based reports to Cognos model-based reports. The architecture team decided to apply the very best data warehousing concept, the so-called slowly changing dimension type 5, or slowly changing dimension type 2 with bi-temporal data. At the same time, they used GoldenGate 12c to capture real-time data. They also used composite software to aggregate data for Cognos. Report performance was extremely slow for business users, and the project failed. The issue was that we did not need real-time data, and we did not need such an overly complicated dimension concept, which was simply too complicated to mimic the business logic. In the end, we used Netezza for this project; we may not even have needed surrogate keys throughout the data warehouse, as Netezza is quick enough to handle table joins on strings. Overall, idealism and over-architecting/over-engineering were the problems with this project.

2. Large DW/BI project to deliver trading life cycle reports: This project was supposed to collect data from 12 different trading platforms and then generate reports for all trading-related counts. There were two problems with this project. First, business users asked to be able to select any start date and any end date to generate a report, even though the most common use case was still to select a predefined relative period, such as the current month, quarter, or year. The business logic for this project requires that all these counts be based on the start date and end date. If we allow users to select any date range, then all measures must be generated dynamically. This makes the fact table dynamic, which in turn makes report performance extremely slow, as every action has to be computed on the fly from detail records. The proposed solution was to make the fact tables static, which would satisfy most of the business requirements. Second, SSIS was originally used for ETL to load data into the data warehouse. However, it was replaced with Ab Initio, even though SSIS was good enough and Ab Initio was not necessary for this project. This decision required considerable effort and did not deliver a reasonable result due to resource constraints and complexity. Overall, idealism and over-architecting/over-engineering, together with advanced technology and a too user-focused approach, were the problems with this project.

3. Large DW/BI project to provide investment analytical reports: This project was to establish a reporting platform to generate investment performance analytics for more than 200 custodians. One of the functions was to allow business analysts to reconcile transaction and asset data. The process is batch processing with many records. During the initial design period, there was a considerable dispute between two architectural solutions: multitier vs. only two tiers. Multitier refers to the introduction of an application server, where all objects reside in the application server. Multitier architecture was very popular at the beginning of 2001. The two-tier architecture is very much like client-server: we use stored procedures to collect data and a Windows interface to reconcile data. We applied the two-tier approach and barely managed to deliver the project on time. If we had used multitier, we would have failed this project. Over-architecting could have been an issue for this project.

4. Application to support network capacity planning: This project was to build database-related software for telecommunication companies to do network capacity planning. There were many issues with this initiative; the one I am focusing on is the choice of programming language. At that time, both Visual Basic and Visual C++ could be used to develop an excellent UI. Somehow, C++ was used to create the UI, even though Visual Basic was good enough. This decision resulted in a much bigger effort at a high cost. People chose C++ just because C++ was the better language and could do more than Visual Basic could.

5. Develop a self-configurable ERP system: The philosophy of this initiative was to provide a graphical user interface with which business users could develop their own ERP system without programming skills. The solution resulted in a very complicated visual user interface that might be even more difficult to use than programming itself. At the same time, performance was extremely poor due to the interpreter. This idea has been proven to work when the domain is well defined and the area is fairly narrow. ERP logic is too complicated to implement through a GUI.
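As an illustration of the static fact table idea from the trading life cycle project (item 2), the following Python sketch resolves the predefined periods to fixed date ranges and computes the counts once at load time, instead of on the fly for arbitrary ranges. The function names, the period-to-date convention, and the simple trade count are my assumptions for this sketch; the real counting logic in that project was more involved.

```python
from datetime import date

def period_bounds(period, today):
    """Return (start, end), inclusive, for a predefined period up to today."""
    if period == "month":
        start = today.replace(day=1)
    elif period == "quarter":
        first_month = 3 * ((today.month - 1) // 3) + 1
        start = date(today.year, first_month, 1)
    elif period == "year":
        start = date(today.year, 1, 1)
    else:
        raise ValueError(f"unsupported period: {period}")
    return start, today

def build_summary(trade_dates, today):
    """Count trades per predefined period once, at load time.

    Reports then read these few stored values instead of scanning
    detail records for every user request.
    """
    summary = {}
    for period in ("month", "quarter", "year"):
        start, end = period_bounds(period, today)
        summary[period] = sum(1 for d in trade_dates if start <= d <= end)
    return summary

# Hypothetical detail records: one date per trade.
trades = [date(2020, 3, 1), date(2020, 2, 15), date(2020, 1, 10), date(2019, 12, 31)]
summary = build_summary(trades, today=date(2020, 3, 1))
```

Arbitrary date ranges can still be offered as a slower, secondary path over the detail records, so that the rare request does not dictate the architecture for the common ones.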


