Yin Solutions: 1. Solution Architect

Showing posts with label 1. Solution Architect. Show all posts

Thursday, February 20, 2025

Comprehensive Security Solution for a Sample Power BI App Deployment

Check list of all posts

Assumptions of Deployment Architecture

The Power BI model and reports are located in the same workspace.
Reports utilize a shared semantic model.
All reports are published via a Power BI app for end users.

Solution Summary:

Enforce Robust Security Controls – Implement both Row-Level Security (RLS) and report-level access restrictions.
Streamline Model Management – Maintain a single shared semantic model with built-in security to reduce maintenance overhead while ensuring compliance.
Ensure Strict Access Control – Guarantee that users can only access the data and reports they are authorized to view.

Detailed Requirements:

There are four roles with different access requirements. The diagram below illustrates the access structure.

Challenges & Analysis:

While roles 1, 2, and 4 have straightforward solutions, role 3 presents a unique challenge. Below are three potential approaches and their drawbacks:

Option 1: Assign Role 3 as "Contributors" in the Workspace

Issue: Contributors have access to all reports via the Power BI app, and restricting access at the report level is not possible.
Conclusion: This approach does not meet the requirement for selective report access.

Option 2: Use a Security Dimension to Control Access

Challenges:
1. The security dimension could become excessively large, as it would need to store permissions for all users.
2. Dynamic updates would be required to ensure new users and changing data permissions are accurately reflected.
3. The security dimension might not cover all records due to incomplete mappings, potentially filtering out critical data.
Conclusion: This approach is impractical due to scalability and data completeness issues.

Option 3: Duplicate the Semantic Model

Process:
- Create two versions of the semantic model: one with RLS and one without RLS.
- Maintain separate reports pointing to each model, respectively.
- Configure Power BI app audiences accordingly.
Challenges:
- Requires maintaining duplicate models and reports.
- Increases complexity in Power BI app configuration.
Conclusion: This approach introduces high maintenance overhead and redundancy.

Optimized Solution:

Rather than adopting the flawed approaches above, we propose an innovative RLS setup that does not impose restrictions. The key steps include:

Define an RLS Role with Full Access:
- The security filter is set to TRUE(), ensuring that assigned users can access all data without restriction.
Use a Disconnected Security Dimension:
- Applying TRUE() alone is insufficient because relationships between the security dimension and the main model may still filter data unexpectedly.
- By using a disconnected security dimension, we prevent unintended filtering effects.

This approach achieves:

No need for a massive security dimension.
No need for duplicate models.
No need for duplicate reports.
Simplified Power BI app configuration.

With this methodology, along with the existing roles 1, 2, and 4, the final solution structure is illustrated below

Implementation Steps:

Create an RLS Role (Refer to the diagram below).

Create an RLS_FullData Role (Refer to the diagram below).

Assign SecurityGroups to Appropriate Roles

Define security groups based on role-based access needs
Extend security group assignments as necessary (Refer to the attached diagram).

This solution ensures scalable, maintainable, and efficient security management within Power BI, aligning with the business requirements for secure and streamlined report access.

Note 1: “Apply security filter in both directions” ensures row-level security filters flow both ways across the relationship, which can be powerful but might complicate your security model.Turn on both-direction filters only when you need dimension tables to be filtered based on row-level security rules that originate in the fact table or in other scenarios that require the dimension table itself to be restricted by fact data.

Friday, July 26, 2024

Unlock the Full Potential of Power BI with Row-Level Security (RLS)

Check list of all posts

Objective

Using a single table to control all static and dynamic Row-Level Security (RLS) in Power BI has several advantages:

Centralized control: With a single table, you can centralize all your RLS rules and permissions in one place, which can make it easier to manage and maintain your data model. This can be particularly useful if you have a large number of users, groups, and data sets that need to be managed.

Improved efficiency: A single table can help to streamline the process of managing RLS permissions, as you only need to update a single table to make changes to the permissions. This can save time and reduce the risk of errors, particularly if you have a large number of rules or permissions to manage.

Simplified troubleshooting: With all your RLS rules and permissions in a single table, it can be easier to troubleshoot any issues or problems that may arise. You can use the table to quickly identify and fix any errors or inconsistencies in your RLS rules, which can help to ensure that your data model is accurate and up-to-date.

Overall, using a single table to control all static and dynamic RLS in Power BI can help to improve the efficiency, simplicity, and accuracy of your data model, and make it easier to manage and maintain your RLS rules and permissions.

We want to enhance our use of Power BI by consolidating all our Row-Level Security (RLS) requirements into a single role. Currently, we have to define static RLS and dynamic RLS separately, and we have to set up multiple roles for specific dimensions in static RLS. Additionally, we have different areas for each static RLS. To address this, we aim to combine all data into a single dataset and create a single role for all RLS to achieve the following goals:

1) Combine static and dynamic RLS, allowing users to access only their own data and the data they are allowed to see based on their role or group membership.

2) Create a static RLS for multiple dimensions, such as product and customer dimensions.

3) Create a single role for each dimension, such as multiple products.

This document will provide a solution that guides you through the process, from the database to the final reports.

Sample data setup (SQL server)

DROP TABLE IF EXISTS dbo.CYFact;

CREATE TABLE dbo.CYFact(

UserID varchar(10) ,

Product varchar(10) ,

Customer varchar(10) ,

Sales varchar(10)

)

insert into dbo.CYFact values

('U1','P1','C1',1),

('U1','P1','C2',2),

('U1','P2','C1',3),

('U1','P2','C2',4),

('U2','P1','C1',5),

('U2','P1','C2',6),

('U2','P2','C1',7),

('U2','P2','C2',8)

DROP TABLE IF EXISTS dbo.CYRLS;

CREATE TABLE dbo.CYRLS(

ID varchar(10) ,

Email varchar(100) ,

IsDynamicRLS varchar(10),

ProductStaticRLS varchar(10) ,

CustomerStaticRLS varchar(10)

)

insert into dbo.CYRLS values

('U1','U1@Test.com','Yes','P1','C1'),

('U1','U1@Test.com','Yes','P2','C1'),

('U2','U2@Test.com','No','None','C1')

DROP TABLE IF EXISTS dbo.CYUser;

CREATE TABLE dbo.CYUser(

ID varchar(10),

Email varchar(100) ,

)

insert into dbo.CYUser values

('U1','U1@Test.com'),

('U2','U2@Test.com')

DROP TABLE IF EXISTS dbo.CYProduct;

CREATE TABLE dbo.CYProduct(

ID varchar(10))

insert into dbo.CYProduct values

('P1'),

('P2')

DROP TABLE IF EXISTS dbo.CYCustomer;

CREATE TABLE dbo.CYCustomer(

ID varchar(10))

insert into dbo.CYCustomer values

('C1'),

('C2')

Data Model

Assuming that there are only one fact and three dimensions with a typical star schema. Please note the RLS is a disconnected table with all dimensions and fact tables.

Security Logic
We use 3 sample records below to demonstrate all cases we try to achieve.

U1 supports dynamic RLS, meaning that U1 can only access his data to the Fact sales table.
U1 supports static RLS for the product dimension, with two products P1 and P2
U1 supports static RLS for the customer dimension, with one customer C1 only
U2 doesn't support dynamic RLS, meaning that U2 can access all data. For example, he can access U1 data.
U2 doesn't support static RLS for the product dimension, meaning that U2 can access all products.
U2 supports static RLS for the customer dimension, with one customer C1 only.

Role implementation

CYUser

VAR _DimRLS =
    CALCULATETABLE (
        VALUES ( 'CYRLS'[IsDynamicRLS] ),
        'CYRLS'[Email] = USERPRINCIPALNAME ()
    )
VAR _RLS =
    SWITCH (
        TRUE (),
        _DimRLS = "No", TRUE (),
        [Email] = USERPRINCIPALNAME (), TRUE (),
        FALSE ()
    )
RETURN
    _RLS

CYProduct

VAR _DimRLS =
    CONCATENATEX (
        CALCULATETABLE (
            VALUES ( 'CYRLS'[ProductStaticRLS] ),
            'CYRLS'[Email] = USERPRINCIPALNAME ()
        ),
        'CYRLS'[ProductStaticRLS]
    )
VAR _RLS =
    SWITCH (
        TRUE (),
        CONTAINSSTRING ( _DimRLS, "None" ), TRUE (),
        CONTAINSSTRING ( _DimRLS, [ID] ), TRUE (),
        FALSE ()
    )
RETURN
    _RLS

CYCustomer

VAR _DimRLS =
    CONCATENATEX (
        CALCULATETABLE (
            VALUES ( 'CYRLS'[CustomerStaticRLS] ),
            'CYRLS'[Email] = USERPRINCIPALNAME ()
        ),
        'CYRLS'[CustomerStaticRLS]
    )
VAR _RLS =
    SWITCH (
        TRUE (),
        CONTAINSSTRING ( _DimRLS, "None" ), TRUE (),
        CONTAINSSTRING ( _DimRLS, [ID] ), TRUE (),
        FALSE ()
    )
RETURN
    _RLS

Test result

No Security ( Admin )

U1: User 1 login

U2: User 2 login

Note 1: If a user is assigned to two roles for row-level security (RLS) in Power BI, they will have access to the data that is allowed by the rules associated with both of those roles. This means that the user will be able to see all of the data that is allowed by the rules for either of the two roles. However, we try to assign everyone to a single role, driven by the table defined.

Note 2: If a user has no role assigned with row-level security (RLS) in Power BI, they will not be able to access the data in the dataset. This is because RLS is designed to restrict access to data based on the user's role, and a user who has no role will not have any access to the data.

Note 3: To give people full access to data in Power BI using row-level security (RLS), you can create a role with no rules, or you can create a rule that allows access to all rows of data.

To create a role with no rules, you can simply define the role in the Power BI service and assign users to it. This will allow those users to see all of the data in the dataset, regardless of any other roles they may be assigned to.

Another way to do it is to assign this user to the contributor for the workspace where the dataset resides.

Note 4 The key to understanding RLS is that we implement RLS on the Power dataset. So, first of all, we need to figure out the fundamental permissions to Power BI datasets, which is the write permission. If a user has written permission on a Power BI dataset, then RLS won’t apply for this user. Why? Because the user can edit permission for these roles. If the user doesn’t have permission to write the dataset, RLS is fully enforced, no matter where the user has access to this dataset, in the original workspace, or any apps and shared links.

Who has write permission for a dataset?

The user has access to the same workspace of the Power BI dataset and has permission as

• Admin

• Member

• Contributor

Who has no write permission on the dataset?

There are three cases where people have no write permission:

1) Report is in the same workspace as dataset. The user has permission as Viewer only.

2) Report is in a different workspace than the dataset. Users can be assigned to any security group, however these users should have only read, or build permission to the dataset.

3) Reports are accessed by a shared links

4) Reports are accessed by an apps

The diagram below illustrates all cases below.

Note 5: The "apply security filters to both directions" option in Power BI Row-Level Security (RLS) is used to enforce security filters in both the filters and visuals panes of a report. This can be useful for ensuring that users only see data that they are authorized to see, regardless of how they interact with the report.

For example, consider a report that contains a visual showing sales data for different products and customers. You have set up RLS rules to limit access to certain products based on the user's role, but you want to ensure that users cannot see data for customers that they are not authorized to see, even if they try to manipulate the filters or visuals to see data for other customers. In this case, you could use the "apply security filters to both directions" option to enforce the security filters in both the filters and visuals panes of the report. However, this solution might not be the best, as there are two problems: 1) when the transaction table is huge, then it will result in poor performance; 2) You might want customers, even if you don't have sales with the context of filters.

Note 6: User group to setup permissions:

Note 7: Why LOOKUPVALUE method is much more flexible than USEPRINCIPALNAME:
Dynamic security is a feature that allows you to change the level of access that users have to data in a Power BI report or dashboard at runtime. It allows you to specify which data a user can see based on their identity or role within an organization.

One way to implement dynamic security in Power BI is by using the LOOKUPVALUE function in combination with a security table. The security table is a table in your Power BI dataset that defines the level of access that different users or groups have to different data.

The LOOKUPVALUE function allows you to look up a value in a table based on a matching condition. In the context of dynamic security, you can use the LOOKUPVALUE function to look up the level of access that a user has to a particular data element in the security table, based on their user name or group membership.

For example, suppose you have a Power BI report that displays sales data for different products and customers. You can use the LOOKUPVALUE function to check the security table to see if the current user has access to the data for a particular product or customer. If the user has access, the LOOKUPVALUE function will return the value "Allow" and the user will be able to see the data. If the user does not have access, the LOOKUPVALUE function will return the value "Deny" and the user will not be able to see the data.

Using the LOOKUPVALUE function and a security table allows you to implement dynamic security in a flexible and scalable way. It allows you to control access to data at a granular level and to change the level of access that users have to data on the fly, without the need to update the report or dashboard.

By contrast, the USEPRINCIPALNAME function allows you to filter data based on the current user's login name. While this can be useful in some cases, it is less flexible than using a security table and the LOOKUPVALUE function, as it does not allow you to specify different levels of access or to easily change the level of access that users have to data.

Note 8: Sample Implementation case of Row-Level Security (RLS)

Current Setup and Objective

Given that we have three different layers for Power BI implementation:

1. App Layer – Analytics

2. Dashboards/Reports Layer - SECURE_PROD

3. Semantic Model Layer - SECURE_Data

The Model is shared and located in a separate workspace (SECURE_Data). Dashboards/Reports are in a separate workspace (SECURE_PROD), where the shared Model is used. All dashboards/reports are organized in Power BI apps.

Ensure that when a specific employee logs in, they can only view their assignments for planned and actual data, while all other dashboards should not be impacted.

Analysis

Implementing RLS can effectively resolve this visibility issue. The challenge lies in deciding whether to utilize a consolidated or separate model.

Separated Model Strategy

Advantages:

- Ease of Implementation: Implementing RLS on a separate model does not impact the central Model.

- Standardized Process: The implementation process is standardized, making it straightforward.

- Maintenance Simplicity: The separate model is easier to maintain due to its independence.

Disadvantages:

- Dual Maintenance Required: Any changes requiring updates in both models increase maintenance effort.

- Complex Separation: Defining a clear boundary between models is challenging due to many involved parameters.

- Dashboard Transition: Employees need to switch to the new model, potentially necessitating adjustments to the overview pages, which could become a considerable risk to implement.

Consolidated Model Strategy

Adding security to the data model will impact all dashboards, which is not desirable. The solution is to somehow enhance the current model by leveraging an “Inactive Relationship” for RLS, which should affect only dashboards that need to have RLS, and leave other dashboards unchanged.

Advantages:

- Single Model Management: Maintains a single model, avoiding duplicated efforts and simplifying management.

- Seamless Integration: All existing parameters and functions remain effective without needing alterations.

- Unified Dashboard Access: The employee dashboard points to the same model, ensuring consistency.

Disadvantages:

- Dashboard Adjustments Needed: Modifications to dashboard measures are required to activate the RLS features.

Implementation Details

1. The model includes an inactive relationship, which is currently used only for presentation purposes and is not operational in any DAX formulas. Also, `NetworkLoginName` is used for consistency.

2. The role for a specific AD group has been configured for data viewing permissions.

3. DAX solutions have been implemented. Note that the functions `USERELATIONSHIP` and `CROSSFILTER` are not yielding the expected results.

MeasureX (RLS) =
VAR Email =
    SELECTEDVALUE ( Users[Email] )
RETURN
    IF (
        ISBLANK ( Email ),
        [MeasureX],
        CALCULATE (
            [MeasureX],
            'DimensionX'[Email] = Email
        )
    )

Deployment Details - Adding a Person or Group to RLS

To add a person or a group to the RLS, you need to add this security to four different places:

On the Power BI app side ( at the audience level) .

On the dashboard/report side ( at the report level, not the workspace level, as you don't want people to access the report workspace)

On the semantic model ( at the model level, not the workspace level, as you don't want people to access the report workspace)

On the RLS security setting.

If a user has access to the dataset with Build, Share, or Reshare permissions but does not have Write access, you also need to add the user to the Security settings for the semantic model. However, if the user has Write access to the dataset, then it is not necessary to add them to the Security settings for the semantic model. Typically, we apply the first case.

Monday, May 6, 2024

Achieving Workflow Solutions with Power BI and Power Automate

Check list of all posts

Context

The objective is to create a workflow solution that addresses the following requirements:

• The workflow process should be seamless, guiding end-users through the entire process.

• Data loading should be monitored to provide end-users with an overview of data availability.

Traditionally, solutions to these issues involve the use of scheduler software, workflow software, master data services, and web development. However, these tools present challenges in this context:

• Sophisticated workflow solutions are often expensive and impractical.

• Master data management can address some issues but is complex and does not fully resolve workflow challenges.

This article explores an affordable and innovative solution using Power BI and Power Automate, eliminating the need for custom or standard package software.

Solution Overview

The proposed solution leverages Power BI Dashboards and Power Automate as follows:

• Create highly interactive Power BI dashboards that function as a "web form," allowing users to navigate through various actions.

• Use Power BI dashboards to display overall status and enable users to control the data loading process.

• Utilize Power Automate visuals to automate processes as needed.

• Log all statuses, which can be displayed on the dashboard in real-time.

The following screenshot illustrates the main concept:

Note: This interface is a conceptual demonstration. In practice, it can be customized for any data load management. Key functions include:

1. Tracking the status of all streams.

2. Enforcing workflow logic to ensure users complete all steps.

3. Allowing users to manage situations similar to an operations dashboard.

The architectural diagram is shown below:

Implementation Challenges

Define a Flexible Data Model: Develop a framework or metadata that can handle similar problems. Data flow should be highly customizable but not overly complex, with main tables for year-level and stream-level Log

Write User Input into Database via Power Automate: While Power Automate can typically insert data into a database without issues, challenges may arise with specific databases like Netezza. A workaround involves using a data gateway, saving a control file to the process folder, and then inserting data status into the database.

Transform Power BI Dashboard into "Web Forms": To provide a user interface for the data workflow concept, find a way to convert reports into "Web forms." This can be achieved by leveraging Power BI key concepts, including DAX, conditional visibility, and buttons.

Stream Status =
VAR MaxStatusTimePerSector =
    CALCULATE ( MAX ( 'Stream Log'[Status Time] ) )
VAR StatusCode =
    CALCULATE (
        MAX ( 'Stream Log'[Status Code] ),
        'Stream Log'[Status Time] = MaxStatusTimePerSector
    )
VAR result =
    CALCULATE (
        SELECTEDVALUE ( 'Status'[Status Name] ),
        'Status'[Status Code] = StatusCode
    )
RETURN
    result

Stream Status =
VAR CurrentWorkYear = [Current Work Year]
VAR StreamStatus =
    ADDCOLUMNS (
        VALUES ( 'Stream'[Stream Code] ),
        "Year", CurrentWorkYear,
        "Status", CALCULATE ( [Stream Status], 'Stream Log'[Year] = CurrentWorkYear )
    )
RETURN
    StreamStatus

Stream Actions =
VAR CurrentWorkYear = [Current Work Year]
VAR StreamStatus =
    ADDCOLUMNS (
        VALUES ( 'Stream'[Stream Code] ),
        "Year", CurrentWorkYear,
        "Status", CALCULATE ( [Stream Status], 'Stream Log'[Year] = CurrentWorkYear )
    )
VAR StatusActionMap =
    DATATABLE (
        "Status", STRING,
        "Action", STRING,
        "ActionSortOrder", INTEGER,
        {
            { "Status 1", "Action 1", 1 },
            { "Status 2", "Action 2", 2 },
            { "Status 3", "Action 3", 3 },
            { "Status 4", "Action 3", 3 },
            { "Status 4", "Action 4", 4 },
            { "Status 5", "Action 1", 1 },
            { "Status 6", "Action 1", 1 }
        }
    )
VAR result =
    NATURALINNERJOIN ( StreamStatus, StatusActionMap )
RETURN
    result

Stream Actions - Stream 1 =
VAR YearStatus = [Year Status]
VAR Stream =
    IF ( YearStatus = "Closed", "ZZZ", "Stream 1" )
VAR ActionsFilteredByStatus =
    CALCULATETABLE ( 'Stream Actions', 'Stream Actions'[Stream Code] = Stream )
VAR Actions =
    SELECTCOLUMNS (
        ActionsFilteredByStatus,
        "Year", [Year],
        "Stream Code", [Stream Code],
        "Status", [Status],
        "Action", [Action],
        "ActionSortOrder", [ActionSortOrder]
    )
VAR DefaultActions =
    ROW (
        "Year", [Current Work Year],
        "Stream Code", "Stream 1",
        "Status", "n/a",
        "Action", "Status log",
        "ActionSortOrder", 10
    )
VAR result =
    UNION ( Actions, DefaultActions )
RETURN
    result

Implement Permission Pass-Through: Ensure that permissions are correctly propagated through the system.

Note: Action Choices are NOT refreshed with status accordingly on Cloud if the dashboard work against DirectQuery, on Clould, with calculated table. The solution is to use calculated measure instead of using calculated table

Step 1: Establish a hardcoded map between status and actions, which was defined based on original logic:

StreamStatusActionMap =
DATATABLE (
    "Status", STRING,
    "Action", STRING,
    "ActionSortOrder", INTEGER,
    {
        { "Status 1", "Action 1", 1 },
        { "Status 1", "Action 99", 99 },
        { "Status 2", "Action 2", 2 },
        { "Status 2", "Action 99", 99 },
        { "Status 3", "Action 3", 3 },
        { "Status 4", "Action 99", 99 },
        { "Status 5", "Action 1", 1 },
        { "Status 5", "Action 3", 3 },
        { "Status 5", "Action 4", 4 },
        { "Status 5", "Action 99", 99 },
        { "Status 6", "Action 1", 1 },
        { "Status 6", "Action 99", 99 },
        { "Status 7", "Action 1", 1 },
        { "Status 7", "Action 99", 99 }
    }
)

Step 2: Get the current status:

Stream 1 Status Name =
VAR CurrentWorkYear = [Current Work Year]
VAR MaxStatusTime =
    CALCULATE (
        MAX ( 'Stream Log'[Status Time] ),
        'Stream Log'[Stream Code] = "Stream 1"
    )
VAR StatusCode =
    CALCULATE (
        MAX ( 'Stream Log'[ Status Code] ),
        'Stream Log'[Status Time] = MaxStatusTime
    )
VAR StatusName =
    CALCULATE (
        SELECTEDVALUE ( 'Stream Status'[ Status Name] ),
        'Stream Status'[ Status Code] = StatusCode
    )
RETURN
    StatusName

Step 3: Get the first Action:

Stream 1 Action Text 1 =
VAR StatusName = [Stream 1 Status Name]
VAR Actions =
    CALCULATETABLE (
        'StreamStatusActionMap',
        'StreamStatusActionMap'[Status] = StatusName
    )
VAR Top1 =
    TOPN ( 1, Actions, 'StreamStatusActionMap'[ActionSortOrder], ASC )
VAR ActionSelected =
    MAXX ( Top1, 'StreamStatusActionMap'[Action] )
VAR NumberOfActions =
    COUNTROWS ( Actions )
VAR result =
    IF ( NumberOfActions > 1, ActionSelected )
RETURN
    result

Step 4: Get the Second Action:

Stream 1 Action Text 2 =
VAR StatusName = [Stream 1 Status Name]
VAR Actions =
    CALCULATETABLE (
        'StreamStatusActionMap',
        'StreamStatusActionMap'[Status] = StatusName
    )
VAR Top2 =
    TOPN ( 2, Actions, 'StreamStatusActionMap'[ActionSortOrder], ASC )
VAR Top1 =
    TOPN ( 1, Top2, 'StreamStatusActionMap'[ActionSortOrder], DESC )
VAR ActionSelected =
    MAXX ( Top1, 'StreamStatusActionMap'[Action] )
VAR NumberOfActions =
    COUNTROWS ( Actions )
VAR result =
    IF ( NumberOfActions > 1, ActionSelected )
RETURN
    result