Microsoft BI Tools: Deploy a Power BI dataset via a DevOps

Case
I want to develop my Power BI dataset separately from my Power BI reports, so we started using Tabular Editor. This also allows multiple people to work on the model at the same time, because every object is a separate JSON file. But how can we automatically deploy these files using Azure DevOps?

Deploy your Power BI datasets with DevOps

Solution
In this series of posts we will deploy the Tabular Editor project as a Power BI dataset via DevOps. We will deploy this to the development workspace. For the deployment from Dev to Test and from Test to Prod we will use Power BI Deployment pipelines.

In this third post we will create the YAML files that do the actual publishing of the datasets. The first file uses Tabular Editor to publish to the Development environment and the second file calls the Power BI deployment pipeline to publish to test/acceptance and to production. Both YAML files are stored in the repository under CICD\YAML\*****.yml.

Repository folder structure

1) DeployPBIModelDev.yml

The first YAML file is DeployPBIModelDev.yml. It takes the deployable *.bim file (the dataset) that BuildModel.yml built from the JSON files of the Tabular Editor project and uses Tabular Editor to publish it to the development workspace.

First we add the parameters to this YAML file. The values are passed in by the YAML file from the previous post.

parameters:
  - name: env
    displayName: Environment
    type: string
    values:
    - DEV
  - name: SpApplicationId
    displayName: Service Principal Application Id
    type: string
  - name: SpClientSecret
    displayName: Service Principal Client Secret
    type: string
  - name: SpTenantId
    displayName: Service Principal Tenant Id
    type: string
  - name: DownloadLinkTabularEditor
    displayName: Download Link Tabular Editor
    type: string
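
As a minimal sketch of how these values could be passed in: the calling YAML references this file as a template and hands over the values from a variable group. The stage name and the variable group name below are hypothetical, and we assume the caller sits in the same folder as this file; the real caller is the YAML file from the previous post.

```yaml
# Hypothetical caller (sketch only); stage and variable group names are assumptions
stages:
  - stage: DeployDev
    displayName: Deploy DEV
    variables:
      - group: PBIModelParamsDev   # assumed variable group holding the values below
    jobs:
      - template: DeployPBIModelDev.yml   # path is relative to the calling file
        parameters:
          env: DEV
          SpApplicationId: $(SpApplicationId)
          SpClientSecret: $(SpClientSecret)
          SpTenantId: $(SpTenantId)
          DownloadLinkTabularEditor: $(DownloadLinkTabularEditor)
```

Because env only allows the value DEV, passing anything else makes the pipeline fail at compile time instead of halfway through a deployment.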

This part connects to an environment, to which you can add approval steps in DevOps under Pipelines - Environments.

jobs:
  - deployment: deploymentjob${{ parameters.env }}
    displayName: Deployment Job ${{ parameters.env }} 
    environment: Deploy to ${{ parameters.env }}
    strategy:
      runOnce:
        deploy:
          steps:

The first step is a checkout to get the files from the repository to the agent. For this simple example it is not required, but later on you may want to execute PowerShell scripts that are stored in the repository, for example to connect the dataset to a gateway.

          ###################################
          # 1 Check out repository to agent
          ###################################
          - checkout: self
            displayName: '1 Retrieve Repository'
            clean: true

In the second step we will download and unzip Tabular Editor. If you have a self-hosted agent instead of a Microsoft-hosted agent, you could also install this software on the VM manually. However, this step allows you to easily upgrade to a newer version of the editor by just changing the URL in the variable group.

          ###################################
          # 2 Download Tabular Editor
          ###################################
          - powershell: |
              # Download URL for Tabular Editor portable:
              
              # Create Download folder
              $DownloadFolder = join-path (get-location) "TabularEditor"
              new-item -type directory -path $DownloadFolder -Force
              Write-Host "Create download location: $($DownloadFolder)" 

              # Download destination (root of PowerShell script execution path):
              $DownloadFile = join-path $DownloadFolder "TabularEditor.zip"
              Write-Host "Save download as: $($DownloadFile)"
              
              # Download from GitHub:
              Write-Host "Downloading Tabular Editor from: $(DownloadLinkTabularEditor)" 
              Invoke-WebRequest -Uri "$(DownloadLinkTabularEditor)" -OutFile $DownloadFile
              
              # Unzip Tabular Editor portable, and then delete the zip file:
              Write-host "Extracting downloaded zip file"
              Expand-Archive -Path $DownloadFile -DestinationPath $DownloadFolder
              Write-host "Clean up download file"
              Remove-Item $DownloadFile
            displayName: '2 Download Tabular Editor'
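
If you want the pipeline to fail fast when the download or the extraction went wrong, you could add a small check directly after this step. This is only a sketch: it assumes the zip file contains TabularEditor.exe in its root and that the step runs in the same working directory as the download step.

```yaml
# Optional check (sketch); assumes TabularEditor.exe sits in the root of the zip
- powershell: |
    $Exe = Join-Path (Get-Location) "TabularEditor\TabularEditor.exe"
    if (-not (Test-Path $Exe)) {
      # Log an error in the DevOps run and stop the step
      Write-Host "##vso[task.logissue type=error]TabularEditor.exe not found at $Exe"
      exit 1
    }
    Write-Host "Found Tabular Editor at $Exe"
  displayName: 'Verify Tabular Editor download'
```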

The third step is also an optional debug step. It shows the artifact and repository files on your agent, as well as the exact location of the Tabular Editor executable, which we need in the next step. Once everything is running fine you can remove or comment out this step.

          ###################################
          # 3 Show treeview of agent
          ###################################
          - powershell: |
              tree "$(Pipeline.Workspace)" /F
            displayName: '3 Treeview of Pipeline.Workspace'

The fourth step is the only mandatory step. It executes Tabular Editor for each project on the agent. Check out all the arguments after TabularEditor.exe, where you will see the service principal and a lot of parameters of the tool itself.

          ###################################
          # 4 Publish data models
          ###################################
          - powershell: |
              # Get Artifact folder with model files
              $ModelsFolder = Join-Path -Path "$(Pipeline.Workspace)" -ChildPath "\PBIModelsArtifact\"

              # Loop through local artifact folder to get all datamodels
              $AllDatamodels = Get-ChildItem -Path $ModelsFolder -Recurse -Filter *.bim
              Write-Host "$($AllDatamodels.Count) model$(If ($AllDatamodels.Count -ne 1) {"s"}) found in $($ModelsFolder)"
              Write-Host "================================"
              foreach ($DataModel in $AllDatamodels)
              {
                $Path = "$($DataModel.Fullname)"
                Write-Host "Processing model $($Path)"

                # Extract datasetname and workspacename from folder path of model
                $DatasetName = $Path | Split-Path -Parent | Split-Path -Leaf
                $WorkspaceName = $Path | Split-Path -Parent | Split-Path -Parent | Split-Path -Leaf
                # Concat workspace name with the suffix used for the deployment workspace
                $WorkspaceName = "$($WorkspaceName)$(PowerBiWorkspaceSuffix)"
                  
                Write-Host "Publishing model [$($DatasetName)] in [$($WorkspaceName)]"
                # Publish model.bim to Power BI Workspace
                # Deleted -P argument at the end of the command line, because we don't want to overwrite the table partitions in the model
                $(Build.SourcesDirectory)\TabularEditor\TabularEditor.exe "$($DataModel.Fullname)" -D "Provider=MSOLAP;Data Source=powerbi://api.powerbi.com/v1.0/myorg/$($WorkspaceName);User ID=app:${{ parameters.SpApplicationId }}@${{ parameters.SpTenantId }};Password=${{ parameters.SpClientSecret }};" "$($DatasetName)" -O -C -R -M -E -V
                Wait-Process -Name "TabularEditor"

                Write-Host "================================"
              }
            displayName: '4 Deploy data models'
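
To see how the dataset name and the workspace name are derived from the folder structure, you could run the same Split-Path chain on a sample path. The path below is hypothetical and only for illustration; Split-Path works on the string, so the file does not need to exist.

```yaml
# Dry run (sketch); the path and the folder names are made up for illustration
- powershell: |
    $Path = "C:\agent\_work\1\PBIModelsArtifact\Models\Sales\Revenue\model.bim"

    # The parent folder of the .bim file is the dataset name: Revenue
    $DatasetName = $Path | Split-Path -Parent | Split-Path -Leaf

    # The folder one level higher is the workspace name: Sales
    $WorkspaceName = $Path | Split-Path -Parent | Split-Path -Parent | Split-Path -Leaf

    Write-Host "Dataset [$DatasetName] in workspace [$WorkspaceName]"
  displayName: 'Dry run: derive names from path'
```

In the real step the workspace name is then extended with the PowerBiWorkspaceSuffix variable before it is used in the connection string.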

This is all you need to deploy your model to a Power BI Workspace. You could reuse this for other environments like test/acceptance and production as well. However we will be using the Power BI Deployment Pipelines for the next environments. This also allows you to change parameters and connections without writing code.
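
If you did want to reuse this first YAML file for the other environments, a first change would probably be to allow more values for the env parameter. This is a sketch of that change only; we assume that the workspace suffix and the service principal values would then also have to be supplied per environment by the calling YAML.

```yaml
# Sketch: allow the other environments as well (not used in this series)
parameters:
  - name: env
    displayName: Environment
    type: string
    values:
    - DEV
    - ACC
    - PRD
```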

2) DeployPBIModelAccPrd.yml
By default the Power BI deployment pipeline will deploy everything from the source workspace to the destination workspace. We only want to deploy datasets and leave all other items to other pipelines. To be even more specific, we only want the datasets from the repository and want to leave any manually deployed datasets alone. In the GUI you can make that selection manually. In the YAML code you can provide a semicolon-separated string with the names of the datasets you want to promote. For this we will create a variable, fill it with some PowerShell and then provide it as a parameter for the deployment.

First we add the parameters to this YAML file. The values are passed in by the YAML file from the previous post. Note that the parameters differ from those in the YAML above. This is mainly because we can use a DevOps Service Connection instead of a Service Principal.

parameters:
  - name: env
    displayName: Environment
    type: string
    values:
    - ACC
    - PRD
  - name: DeployStagePowerBI
    displayName: DeployStagePowerBI
    type: string
    values:
    - Test
    - Production
  - name: ServiceConnection
    displayName: Service Connection
    type: string
  - name: ReposWorkspaceFolder
    displayName: ReposWorkspaceFolder
    type: string
  - name: DeploymentPipelinePBI
    displayName: Deployment Pipeline PBI
    type: string
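
A call to this second file could look roughly like the sketch below. The variable names are hypothetical and we assume a service connection named PowerBI; the point is mainly that the DevOps environment (ACC or PRD) and the stage of the Power BI deployment pipeline (Test or Production) are passed as a pair.

```yaml
# Hypothetical caller (sketch only); variable names are assumptions
jobs:
  - template: DeployPBIModelAccPrd.yml   # path is relative to the calling file
    parameters:
      env: ACC
      DeployStagePowerBI: Test           # for PRD this would be Production
      ServiceConnection: PowerBI
      ReposWorkspaceFolder: $(ReposWorkspaceFolder)
      DeploymentPipelinePBI: $(DeploymentPipelinePBI)
```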

This part connects to an environment, to which you can add approval steps in DevOps under Pipelines - Environments. Note that we also create an empty string variable named Datamodels. We will use this for the list of datasets that we want to promote.

jobs:
  - deployment: deploymentjob${{ parameters.env }}
    displayName: Deployment Job ${{ parameters.env }} 
    environment: Deploy to ${{ parameters.env }}
    variables:
    - name: Datamodels
      value: ""

    strategy:
      runOnce:
        deploy:
          steps:

The first step is a checkout to get the files from the repository to the agent. For this simple example it is not required, but it could be handy for a more complex pipeline where you want to use PowerShell scripts from your repository.

          ###################################
          # 1 Check out repository to agent
          ###################################
          - checkout: self
            displayName: '1 Retrieve Repository'
            clean: true

The second step is also optional, but it shows the artifact (and repository) files on your agent, which we will use in the next step. You can remove or comment out the code once everything works.

          ###################################
          # 2 Show environment and treeview
          ###################################
          - powershell: |
              Write-Output "Deploying PBI models in the ${{ parameters.env }} environment"
              tree "$(Pipeline.Workspace)" /F
            displayName: '2 Show environment and treeview Pipeline_Workspace'

In this step we will loop through the artifact folder to get a list of all datasets. The code first creates an array of subfolders (each subfolder has the name of a dataset). Then the array is used to create a semicolon-separated string of dataset names. The last Write-Host line looks a bit strange, but it is a DevOps logging command that populates the YAML variable that we created above.

          ###################################
          # 3 Fill variable with list of models
          ###################################
          - powershell: |
              # Get Artifact folder with model files
              $WorkspacesFolder = Join-Path -Path "$(Pipeline.Workspace)" -ChildPath "\PBIModelsArtifact\Models\${{ parameters.ReposWorkspaceFolder }}"

              # Loop through local artifact folder to get all datamodels
              [array]$ModelsArray = Get-ChildItem -Path $WorkspacesFolder | ? {$_.PSIsContainer} | Select-Object Name
              Write-Host "$($ModelsArray.Count) model$(If ($ModelsArray.Count -ne 1) {"s"}) found in $($WorkspacesFolder)"

              [string]$ModelsList = $null
              $ModelsList = $ModelsArray.name -join ";"
              Write-Host "Value $($ModelsList)"
              Write-Host "##vso[task.setvariable variable=Datamodels;]$ModelsList"
            displayName: '3 Fill variable with list of models'
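
The mechanism behind that last Write-Host line can be shown in isolation with two small steps. The dataset names below are made up; the sketch only shows that a value written with the task.setvariable logging command in one step can be read with the $(Datamodels) syntax in a later step of the same job.

```yaml
# Sketch of the setvariable mechanism; the dataset names are hypothetical
steps:
  - powershell: |
      $Names = @("Sales", "Finance", "HR")
      $List = $Names -join ";"
      # The new value is only available in the steps after this one
      Write-Host "##vso[task.setvariable variable=Datamodels]$List"
    displayName: 'Set variable'
  - powershell: |
      Write-Host "Datamodels = $(Datamodels)"
    displayName: 'Read variable'
```

The second step would print Datamodels = Sales;Finance;HR in the log of the run.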

For debug purposes we show the variable with the semicolon separated string of dataset names. You can remove this once everything works or leave it and use it as log information for your pipeline.

          ###################################
          # 4 Show variable value
          ###################################
          - powershell: |
              Write-Output "Variable value of Datamodels $(Datamodels)"
            displayName: '4 Show variable value'

The last step calls the deployment task from the installed DevOps extension for Power BI. We use Selective for the deployType, which allows us to provide a list of datasets. You can use a similar construction for the reports, which we will show in a later blog post.

          ###################################
          # 5 Deploy models
          ###################################
          - task: ms-pbi-api.pbi-automation-tools.Pipelines_Deploy.DeploymentPipelines-Deploy@1
            displayName: '5 Deploy models to ${{ parameters.DeployStagePowerBI }} stage'
            inputs:
                pbiConnection: PowerBI
                pipeline: ${{ parameters.DeploymentPipelinePBI }}
                stageOrder: ${{ parameters.DeployStagePowerBI }}
                deployType: Selective
                datasets: $(Datamodels)
                createNewWS: false

Running it once for dev and once for test (/acc) will look like this. Note that some of the stages are skipped. This is due to the branch strategy and the conditions we used between the stages.

The result of 2 runs

In the runs overview it will look like this.

Conclusion
In this post we showed two different ways to deploy your Power BI datasets. The first, with Tabular Editor, can be used without a Power BI Premium license. A disadvantage is that it is a lot of work to change parameters and connections.

The second method, with the Power BI Deployment pipelines, requires Power BI Premium, but those pipelines allow you to change parameters and connections without code. A big disadvantage is that your datasets already need to be present in the development workspace of Power BI. This can be done with a manual deployment or by combining both methods to get the best of both worlds and versioning of your work in a repository.

Friday, 27 January 2023

Deploy a Power BI dataset via a DevOps - Publish