Dynamic Snowflake Connections in Power BI
Author(s): @Braeden Bailey
Creation Date: Jul 15, 2024
Last Edited: Jul 24, 2024
Purpose
This is a guide for setting up Snowflake connections in Power BI so that they can be dynamically altered for the dev/test/prod workspaces in the Power BI service.
Prerequisites
You will need access to the following services/programs:
Power BI
Snowflake
Azure DevOps
1 Setting Up Power BI Git Integration
You will not be able to use standard Power BI deployment pipelines for this process (as far as I know). Power BI does support dynamic semantic model connections using deployment pipelines via “rules”, but only for certain data sources. Snowflake is not one of these data sources.
Therefore, we need an alternative way to version control the Power BI reports while also giving us the ability to define data source connection parameters dynamically between the dev/test/prod environments. The solution is Power BI’s new(ish) preview feature called Power BI Projects. Power BI Project (.pbip) files differ from Power BI Desktop (.pbix) files because they store the report as artifact files that can be version controlled with a Git repo in Azure DevOps.
1.1 Enabling Power BI Project Files
You can turn any existing .pbix file into a .pbip file by opening the file in Power BI Desktop.
1. Navigate to File > Options and Settings > Options. You should see a window open up.
2. Scroll down to Preview Features and click on it.
3. Enable the “Power BI Project (.pbip) save option” checkbox and then press OK.
You should now be able to save any Power BI Desktop file as a .pbip file. When you do this, you will notice it creates additional folders in your directory, called {Report Name}.Report and {Report Name}.SemanticModel. These folders contain a collection of JSON files and other files that contain information about the report. Anytime you make changes in the report and save it, these files will automatically be saved too.
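For illustration, saving a report called MyReport (a hypothetical name) as a .pbip should produce a layout roughly like the one below; the exact file names can vary between Power BI Desktop versions:

MyReport.pbip
MyReport.Report/
    definition.pbir
    report.json
MyReport.SemanticModel/
    definition.pbism
    model.bim
    diagramLayout.json

These artifact files are what you will commit to the Git repo in the next step.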
1.2 Initialize Azure DevOps Repo
You’ll need to create an Azure DevOps project if you don’t already have one, create an Azure Repo, and clone it to your local machine. Add/save your .pbip files to the repo you just cloned.
Proceed to set up dev/test/prod branches as you normally would using whatever Git interface you prefer. It may be a good idea to do an initial push of the .pbip file(s) to dev so you have something to test in the next step.
1.3 Connecting PBI Workspace to Git
Once your Azure repo is set up, you’ll need to connect your Power BI workspace(s) to the repositories you created so Power BI knows where to look for reports and updates.
1. Open your workspace in Power BI. Click on Workspace Settings, then navigate to Git Integration. You should see all of your Azure connections available through dropdown menus. (This assumes you are logging into Azure and Power BI with the same account.)
2. Select the correct branch, then connect it to your workspace. For example, if we are in a workspace called ws_ProjectName [dev], we will want to connect the dev branch we made earlier.
3. After exiting the settings, you will see a red notification on a button that says Source Control at the top of the workspace. Click on it and sync the changes from your branch to your workspace. You should now see any reports you had pushed to the branch in your workspace.
4. Repeat Step 2 with other workspaces. You will also need to sync the branch to the workspace every time you make a new commit to the branch; it will never sync automatically (as far as I know).
Note: It seems that in order to add or change the branch that is connected to the workspace, the user must both be an admin or owner of the workspace and be added to the Git repo in Azure. Needs further testing.
2 Connecting to Snowflake in Power BI
Connecting directly to a Snowflake data source is easy and no different from other data sources in Power BI. Select Get Data from the top ribbon, then go to More > Snowflake and click Connect.
Power BI only requires that you enter a Server and Warehouse to connect. However, you will need to specify some optional fields for the dynamic parameters to work. Be sure to fill out the following:
Server
Warehouse
Role
Database
Click OK, then select all tables you want to load. On the next screen, you will be prompted to choose either Import or DirectQuery mode. This step is important and there are some key differences to be aware of. In short, Import mode data can be transformed within the report and can be updated via scheduled or manual refreshes. DirectQuery creates a live connection to the data that is always up to date but cannot be transformed within the report. Read the full documentation here.
Notably, DirectQuery allows for a setup where users on the Power BI service will only be able to see the data in the report if they also have access to the data on Snowflake. Import mode allows users to see the report simply by being added to the workspace.
2.1 Creating Power BI Parameters
Next, you will want to click on Transform Data in Power BI to open up the Power Query Editor, then go to Manage Parameters. You will need to add a parameter for the warehouse, role, and database. You can name them whatever you want, but I have been using:
Snowflake_WH
Snowflake_ROLE
Snowflake_DB
Make sure that the Required checkbox is ticked and that the data type is set to Text. Set the Default Value to the same value(s) you used when you loaded the data into the report.
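For reference, a parameter in Power BI is just a Power Query (M) query with some metadata attached. If you open one of the parameters (for example Snowflake_WH) in the Advanced Editor, you should see something roughly like the sketch below; the default value WAREHOUSE_DEV is only an example and may differ in your setup:

// Rough sketch of the generated parameter query for Snowflake_WH
"WAREHOUSE_DEV" meta
[
    IsParameterQuery = true,
    Type = "Text",
    IsParameterQueryRequired = true
]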
2.2 Adding Parameters to Connection Strings
Stay in the Power Query Editor. Click on one of the tables you loaded in from Snowflake. In the Applied Steps section on the right, you will see a step titled Source. Click on it, then click Advanced Editor from the top ribbon.
A window will open up with the full Snowflake connection string. You will notice that the values you used to connect to Snowflake are here; you need to replace them with the parameter names. Here is an example of what that might look like:
Before Replacement:
let
    Source = Snowflake.Databases("yourserver.snowflakecomputing.com", "WAREHOUSE_DEV", [Role="MYROLE_DEV"]),
    SOURCEDATABASE_DEV_Database = Source{[Name="SOURCEDATABASE_DEV", Kind="Database"]}[Data],
    MYSCHEMA_Schema = SOURCEDATABASE_DEV_Database{[Name="MYSCHEMA", Kind="Schema"]}[Data],
    VW_TABLEONE_View = MYSCHEMA_Schema{[Name="VW_TABLEONE", Kind="View"]}[Data]
in
    VW_TABLEONE_View
After Parameter Replacement:
let
    Source = Snowflake.Databases("yourserver.snowflakecomputing.com", Snowflake_WH, [Role=Snowflake_ROLE]),
    SOURCEDATABASE_DEV_Database = Source{[Name=Snowflake_DB, Kind="Database"]}[Data],
    MYSCHEMA_Schema = SOURCEDATABASE_DEV_Database{[Name="MYSCHEMA", Kind="Schema"]}[Data],
    VW_TABLEONE_View = MYSCHEMA_Schema{[Name="VW_TABLEONE", Kind="View"]}[Data]
in
    VW_TABLEONE_View
You will have to repeat this replacement process in the Advanced Editor for all tables or views that you loaded into Power BI. You should also be able to create more parameters following the same method; if, for some reason, your schema has a different name in DEV and TEST, you would want to parameterize it here as well (see the sketch below).
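For example, here is a rough sketch of the same Source step with a hypothetical Snowflake_SCHEMA parameter (created the same way as the others in Section 2.1) substituted for the hard-coded schema name:

let
    // Warehouse, role, and database come from the workspace parameters as before
    Source = Snowflake.Databases("yourserver.snowflakecomputing.com", Snowflake_WH, [Role=Snowflake_ROLE]),
    SOURCEDATABASE_DEV_Database = Source{[Name=Snowflake_DB, Kind="Database"]}[Data],
    // The schema name is now driven by the hypothetical Snowflake_SCHEMA parameter
    MYSCHEMA_Schema = SOURCEDATABASE_DEV_Database{[Name=Snowflake_SCHEMA, Kind="Schema"]}[Data],
    VW_TABLEONE_View = MYSCHEMA_Schema{[Name="VW_TABLEONE", Kind="View"]}[Data]
in
    VW_TABLEONE_View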
2.3 Setting Parameters in Power BI Service
Follow the steps from Section 1.3 to push your report to the remote Azure repository. Push and merge as necessary, then sync the workspace with your repository.
Go to the Power BI Service, click the ellipses on your semantic model(s), and go to Settings. Scroll to Parameters and expand the dropdown. You should see the three parameters (Snowflake_WH, Snowflake_ROLE, and Snowflake_DB) that were created earlier.
Enter the parameter values that you want to be used for this workspace. The credential setup differs depending on whether you are using DirectQuery or Import mode. Open up the Data source credentials dropdown menu, then proceed based on your setup:
DirectQuery: Click on Edit Credentials next to the Snowflake option.
Change the Authentication Method to OAuth2.
Set the Privacy level to Organizational.
Check the box that reads “Report viewers can only access this data source with their own Power BI identities using DirectQuery”.
You need to ensure that your users have access to the Snowflake_ROLE that you set in the parameters for the workspace. This role will also need access to the Snowflake_WH and Snowflake_DB. When a user logs in to view the report, Power BI will automatically check whether the account they are logging in with has the proper permissions, and if so they will be allowed to use the Power BI report.
Import: You’ll want to refresh the data with a Snowflake service account. Click on Edit Credentials next to the Snowflake option.
Change the Authentication Method to Basic.
You will be prompted for a username/password. Enter the service account’s username and password.
Set the Privacy level to Organizational.
Do not check the “Report viewers can only access…” box.
Ensure that your service account has access to the Snowflake_ROLE you set in the Parameters section. This role will also need access to the Snowflake_WH and Snowflake_DB that you set in Parameters. Since the data credentials are handled by a service account, any user that is granted access to the PBI workspace will be able to see the data in the report that the service account has access to.
Note: RLS has not been tested with either of these setups yet but should be possible.
Try to refresh your semantic model and access the report after completing either of the paths above. If it works, everything is set up properly.
2.4 Commit Back to Azure Branch
Once you’ve completed the above steps and return to your workspace, you’ll notice that Power BI tells you that you have an uncommitted change to the semantic model. You can go ahead and commit it back to the branch.
What’s interesting here is that future merges to the branch (usually) will not overwrite the parameters that you defined in the Power BI service, even though you’d be merging from a development branch that you were working on locally that had different parameters. This commit from Power BI back to Azure also does not create any merge conflicts the next time you commit and merge a change from your desktop to the same branch in Azure.
I have seen the parameters defined in the semantic models sometimes revert back to the ones being used in Power BI Desktop. I’m not entirely sure what causes this yet; it needs more testing. In the worst case, you would know immediately that the parameters were incorrect because the report wouldn’t be able to authenticate against the data, so you would open up the semantic model settings again and change the parameters to something that the provided data credentials can access.
3 Future Testing
Row-level security needs to be tested as a possibility with either of these setups.
Test whether the parameter setup works with paginated reports, which do not have the .pbip save feature and may require additional workarounds.
4 Paginated Reports and Other Options
This is a WIP section that will be updated as research into the topic continues. Some promising articles from Microsoft that may help set up what we need:
Connect paginated reports to data sources using Power Query in Power BI Report Builder (Preview)
This importantly documents how to link filtering parameters to a Power Query dataset
It also mentions using Shareable Cloud Connections in order to share the report?
Create and share cloud data sources in the Power BI service