Thursday, 19 May 2022

Refresh single table in PBI Dataset from within ETL

Case
I have one big Power BI dataset with multiple source tables and one of those source tables is refreshed once every hour and all the other tables are only refreshed once a day. How can refresh only specific tables within my Power BI dataset via an ADF or Synapse pipeline?
Using XMLA enpoints in Power BI













Solution
To accomplish this you need the XMLA enpoints in Power BI. Those XMLA enpoints provide access to the Analysis Services engine in Power BI giving you extra options for your dataset like only refreshing a single table (or a couple of tables).

However, XMLA enpoints are only availble for premium workspaces. So you either need Power BI premium (capacity) or Power BI premium per user. This doesn't work for Power BI PRO.

1) Enable XMLA endpoints
First make sure the XMLA endpoints are enabled in the Tenant settings. For this you need a Power BI administrator role.
  • Log in to PowerBI.com
  • Click on the gear icon in the top right corner
  • Choose Admin Portal
  • Click on Tenant settings in the Admin portal menu. If you don't see that option then you don't have the PBI admin role
  • Scrole down to Integration settings
  • Expand 'Allow XMLA endpoints and Analyze in Excel with on-premises datasets'
  • Check whether it is enabled for either The entire organization or Specific security groups. If it is set to Specific security groups make sure that your user and the Service Principal or the Managed Identity of ADF/Synapse is in that group.
Allow XMLA endpoints and
Analyze in Excel with on-premises datasets













2) Set to Read-Write
Depending or the premium license type you go to Premium Per User or to Capacity settings in the same Admin portal menu. And if you have multiple Capacities then first go to the specific Capacity.
  • In the Admin portal go to Premium Per User or to Capacity settings (depending on your license)
  • Locate the XMLA Endpoint setting and set it to Read Write
XMLA endpoints for Premium Capacity


XMLA endpoints for Permium per user





















3) Premium workspace
Now go to the Power BI Workspace where your dataset is located and make sure the workspace is a Premium workspace. You can change it in the Settings in the upper right corner.






















4) Create JSON message 
Now go to your ADF or Synapse pipeline that refreshes the Power BI dataset where you either use a Service Principal or a Managed Identity to access Power BI. All those steps are described in details in those two posts.
Web Activity to refresh a Power BI dataset












The body of this Web Activity contains options for refreshing the dataset. Our basic example only contained an option to skip notifications: {"notifyOption":"NoNotification"}. This message can also contain settings to refresh only certain tables. Check DatasetRefreshObjects and DatasetRefreshType in the documentation and create a JSON messages like this where we refresh only the data of two tables: myFact and myDim.

{"notifyOption":"NoNotification"
,"objects":"[{"table":"myFact"},{"table":"myDim"}]"
,"type":"DataOnly"}

Refresh specific tables in a PBI dataset



















5) Create JSON message via SQL
If you want to make that a little more flexible then you can put a Lookup activity in front of it that gets the tables from query like this:
SELECT CONVERT(VARCHAR(MAX),
    (SELECT [table]
    FROM [META].[DatasetContinuousRefresh]
	WHERE Active = 1
    FOR JSON PATH)
) as body

Lookup activity to get table names in a JSON output














Instead of the hardcoded body in the Web activity you can create something more dynamic with an expression like:
@concat('{"notifyOption":"NoNotification"'
,if(pipeline().parameters.ContinuousLoad,concat(',"objects":"',activity('Create body call dataset refresh').output.firstRow.body,'"'),'')
,',"type":"DataOnly"}')

The entire pipeline to refresh a Power BI dataset via its name (instead of the ID) looks something like this:
The complete refresh PBI dataset pipeline










Please check the previous blog posts for all the details

Conclusions
In this post we showed you how to refresh only a single table or just a couple of tables within a Power BI dataset. This could save you a lot of refresh time if you don't want to refresh all tables each and every time. The only downside is that Power BI PRO is not enough and you need access to the Power BI admin portal to set the XMLA settings.

thx to colleague Hans Timmerman for helping.

Friday, 13 May 2022

DevOps: SQL Server - NETFramework v4.5 not found

Case
I have a DevOps pipeline to build and deploy my Azure SQL Server Database, but it is giving an .NET Framework error stating that it can't find the framework version. It did work before, how can I solve it?
error MSB3644: The reference assemblies for
 .NETFramework,Version=v4.5 were not found.












The entire error message in Azure DevOps:
##[section]Starting: 1. Creating Artifact
==============================================================================
Task         : MSBuild
Description  : Build with MSBuild
Version      : 1.199.0
Author       : Microsoft Corporation
Help         : https://docs.microsoft.com/azure/devops/pipelines/tasks/build/msbuild
==============================================================================
##[command]"D:\a\_tasks\MSBuild_c6c4c611-aa2e-4a33-b606-5eaba2196824\1.199.0\ps_modules\MSBuildHelpers\vswhere.exe" -version [17.0,18.0) -latest -format json
##[command]"C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Current\Bin\msbuild.exe" "D:\a\1\s\SQL\myDWH\myDWH.sqlproj" /nologo /nr:false /dl:CentralLogger,"D:\a\_tasks\MSBuild_c6c4c611-aa2e-4a33-b606-5eaba2196824\1.199.0\ps_modules\MSBuildHelpers\Microsoft.TeamFoundation.DistributedTask.MSBuild.Logger.dll";"RootDetailId=|SolutionDir=D:\a\1\s\SQL\myDWH|enableOrphanedProjectsLogs=true"*ForwardingLogger,"D:\a\_tasks\MSBuild_c6c4c611-aa2e-4a33-b606-5eaba2196824\1.199.0\ps_modules\MSBuildHelpers\Microsoft.TeamFoundation.DistributedTask.MSBuild.Logger.dll"  /p:_MSDeployUserAgent="VSTS_3aa34741-51f2-4a22-9768-a5deca3bfa4e_build_17_0"
Build started 5/13/2022 2:36:44 PM.
##[error]C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Current\Bin\Microsoft.Common.CurrentVersion.targets(1221,5): Error MSB3644: The reference assemblies for .NETFramework,Version=v4.5 were not found. To resolve this, install the Developer Pack (SDK/Targeting Pack) for this framework version or retarget your application. You can download .NET Framework Developer Packs at https://aka.ms/msbuild/developerpacks
Project "D:\a\1\s\SQL\myDWH\myDWH.sqlproj" on node 1 (default targets).
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Current\Bin\Microsoft.Common.CurrentVersion.targets(1221,5): error MSB3644: The reference assemblies for .NETFramework,Version=v4.5 were not found. To resolve this, install the Developer Pack (SDK/Targeting Pack) for this framework version or retarget your application. You can download .NET Framework Developer Packs at https://aka.ms/msbuild/developerpacks [D:\a\1\s\SQL\myDWH\myDWH.sqlproj]
_CleanRecordFileWrites:
  Creating directory "obj\Debug\".
Done Building Project "D:\a\1\s\SQL\myDWH\myDWH.sqlproj" (default targets) -- FAILED.

Build FAILED.

"D:\a\1\s\SQL\myDWH\myDWH.sqlproj" (default target) (1) ->
(GetReferenceAssemblyPaths target) -> 
  C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Current\Bin\Microsoft.Common.CurrentVersion.targets(1221,5): error MSB3644: The reference assemblies for .NETFramework,Version=v4.5 were not found. To resolve this, install the Developer Pack (SDK/Targeting Pack) for this framework version or retarget your application. You can download .NET Framework Developer Packs at https://aka.ms/msbuild/developerpacks [D:\a\1\s\SQL\myDWH\myDWH.sqlproj]

    0 Warning(s)
    1 Error(s)

Time Elapsed 00:00:01.97
##[error]Process 'msbuild.exe' exited with code '1'.
##[section]Finishing: 1. Creating Artifact
Solution
This error is probably because you have a DevOps agent of the type windows-latest and that recently changed to the newer Windows-2022 and .Net Framework version 4.5 is out of support since April 26, 2022.

The solution is simple:
  • Go to Visual Studio and open your database project
  • Find your project in the Solution Explorer pane
  • Right click your project and go to the properties
  • Go to the tab SQLCLR and find Target framework
  • Change the .NET Framework to a higher version (4.7.2 or 4.8)
  • Now save your project and make sure the change goes to your repository so that your build pipeline can do its work correctly
Change .NET Framework version in SQLCLR pane



















Another option is to downgrade your DevOps Agent from 'windows-latest' to 'windows-2019'. An even simpler change, but probably only a temporary change to postpone the real change.

Conclusion
In this post you learned how to solve the .NET framework not found error in two ways. A temporary quick-win by changing the agent type and a little more 'permanent' change in the Visual Studio project (until that versio also goes out of support).

Sunday, 8 May 2022

Refresh Power BI Datasets via its name instead of Id

Case
In our last post we refreshed a Power BI dataset from within an ADF or Synapse pipeline. For this we used the IDs of the PBI Workspace and Dataset, but it would be much user friendly to use the name instead of the ID.
Translate Name to Id






Solution
This is a follow up post. Make sure you first create a Pipeline that either uses a Service Principal (SP) or the Managed Identity (MSI) to authorize against Power BI. This example adds three aditional activities before the Web Activity called Call dataset refresh. If you used the Service Principal it is between Get AAD Token and Call dataset refresh


1) Change Parameters
In the previous post we used the parameters named WorkspaceId and DatasetId. Let's change those to WorkspaceName and DatasetName. The type is still a string for both parameters.
Change parameters from Id to Name








2) Get WorkspaceId
To translate the Workspace Name to a Workspace Id we will call a Rest API with a Web Activity. Make sure to use the same authorization as you used in the Call dataset refresh.
  • Add Web Activity to your existing pipeline
  • Give it the name Get WorkspaceId. It will be used later on.
  • Set the URL to the following expression
    @concat(
    'https://api.powerbi.com/v1.0/myorg/groups?$filter=name%20eq%20''',
    pipeline().parameters.WorkspaceName,'''')
    It uses the parameter from step 1 in the filter option from the Rest API. Note that the space is translated to %20
  • Set Method to GET
  • Now either add a authorization header for SP or set the Authenication for the MSI
This will return 1 workspace (if it exists).
Get Workspace Id via its Name













3) Get All Datasets
Next step is to translate the Dataset Name to and ID. This is also done via a Rest API, but unfortunately it doesn't support a filter. So we will get ALL datasets from the Workspace and then filter it afterwards. Again make sure to use the same authorization as you used in the Call dataset refresh.
  • Add Web Activity to your existing pipeline and connect it to Get WorkspaceId
  • Give it the name Get All Datasets. It will be used later on.
  • Set the URL to the following expression
    @concat('https://api.powerbi.com/v1.0/myorg/groups/',
    activity('Get WorkspaceId').output.value[0].id,
    '/datasets')
    It will use the parameter from step to get all datasets from a specific workspace.
  • Set Method to GET
  • Now either add a authorization header for SP or set the Authenication for the MSI
This will return all datasets from the workspace from step 2 (if you have access).
Get all datasets from a workspace













4) Filter DatasetName
The last new activity is a Filter activity which we will use to filter all the datasets from step 3 to only the one we need.
  • Add Filter Activity to your existing pipeline and connect it to Get All Datasets
  • Give it the name Filter DatasetName. It will be used later on.
  • Set Items expression to
    @activity('Get All Datasets').output.value
    This is the output from step 3
  • Set Condition expression to
    @equals(item().name,pipeline().parameters.DatasetName)
    This is the second parameter from step 1
Filter all datasets with a dataset name to get only 1








5) Change expressions
Now go to all the successive (Web) activities and change the expressions where you used a parameter. Now you need the get the WorkspaceId with:
activity('Get WorkspaceId').output.value[0].id
and the DatasetId with:
activity('Filter DatasetName').output.value[0].id


Conclusion
In this follow up post you learned how to translate the Workspace Name and Dataset Name to an ID which is required for the Rest API calls. With the three extra activities you just made your pipeline a little more friendlier to use, but also a little more monkey proof. Because if you delete a dataset and publish it again, it gets a different ID and then you also need to change your pipeline parameter. In a next follow up post we will refresh only certain tables within a Power BI dataset.

thx to colleague Hans Timmerman for helping.