
Friday 5 May 2023

Synapse - Automatically check naming conventions

Case
We use naming conventions in Synapse, but sometimes it's just a lot of work to check if everything is correct. Is there a way to automatically check those naming conventions?
Automatically check Naming Conventions












Solution
You can use a PowerShell script to loop through the JSON files from Synapse that are stored in the repository. Prefixes for Linked Services, Datasets, Pipelines, Notebooks, Dataflows and scripts are easy to check, since that is just a matter of checking the filename of the JSON file. If you also want to check activities, or use the Linked Service or Dataset type to check the naming conventions, then you also need to check the contents of those JSON files.
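To illustrate the idea, below is a minimal sketch (not the actual script from the GitHub repository) that only checks filename prefixes per folder. The folder names are the standard Synapse repository folders, but the prefixes LS_, DS_ and PL_ are just example values; configure your own conventions in the JSON config.

# Minimal sketch, not the actual script from the GitHub repository
# The prefixes below are example values
$prefixes = @{
    "linkedService" = "LS_"
    "dataset"       = "DS_"
    "pipeline"      = "PL_"
}

foreach ($folder in $prefixes.Keys)
{
    # Assuming the Synapse JSON files are stored in the SYN folder (see step 1)
    $path = Join-Path "SYN" $folder
    if (Test-Path -Path $path)
    {
        Get-ChildItem -Path $path -Filter "*.json" | ForEach-Object {
            if (-not $_.BaseName.StartsWith($prefixes[$folder]))
            {
                Write-Host "##vso[task.logissue type=error]$($folder): '$($_.BaseName)' does not start with '$($prefixes[$folder])'"
            }
        }
    }
}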

There are a lot of different naming conventions, but as long as you are committed to using one, it will make your Synapse workspace more readable and your logs easier to understand. In one glance you will immediately see which part of Synapse is causing the error. Above all, it looks more professional and it shows you took the extra effort to make it better.

However, having a lot of different naming conventions also makes it hard to create one script to rule them all. We created a script to check the prefixes of all different parts of Synapse, and you can configure those prefixes in a JSON file. You can run this script once in a while to occasionally check your workspace, or run it as a validation step for a pull request in Azure DevOps. Then it acts like a gatekeeper that doesn't allow badly named items. If you make it a required step, you first have to solve the issues; if you make it an optional check, you will get the result but you can choose to ignore it. For new projects you should make it required, and for big existing projects you should probably first make it optional for a while and then change it to required.

The PowerShell script, the YAML file and the JSON config example are stored in a public GitHub repository. This allows us to easily improve the code for you and keep this blog post up-to-date. It also allows you to help out by making suggestions or even writing some better code.


1) Folder structure repository
Just like the Synapse deployment scripts, we store these validation files in the Synapse repository. We have a CICD and a SYN folder in the root. SYN contains the JSON files from the Synapse workspace. The CICD folder has three subfolders: JSON (for the config), PowerShell (for the actual script) and YAML (for the pipeline that is required for validation).

Download the files from the GitHub repository and store them in your own repository according to the structure described above. If you use a different structure you have to change the paths in the YAML file.
Repository structure
























2) Create pipeline
Now create a new pipeline with the existing YAML file called NamingValidation4Synapse.yml. Make sure the paths in the YAML follow the folder structure from step 1, or change them to your own structure.

If you are not sure about the folders, there is a treeview step in the YAML that will show you the folder structure on the agent. Just continue with the next steps, check the result of the treeview step after the first run, change the paths in the YAML and run again. You can remove or comment out the treeview step when everything works.
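The treeview step itself is nothing more than a one-line PowerShell command that prints the folder structure on the agent, similar to the command used elsewhere on this page:

# Prints the folder structure of the agent so you can verify the paths in the YAML
tree "$(Pipeline.Workspace)" /F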

This example uses an Azure DevOps repository with the following steps:
  • Go to pipelines and create a new pipeline
  • Select the Azure Repos Git
  • Select the Synapse repository
  • Choose Existing Azure Pipelines YAML file
  • Choose the right branch and select the YAML file under path
  • Save it and rename it, because the default name equals the repository name
Create new pipeline with existing YAML file












3) Branch validation
Now that we have the new YAML pipeline, we can use it as a Build Validation in the branch policies. The example shows how to add them in an Azure DevOps repository.
  • In DevOps go to Repos in the left menu.
  • Then click branches to get all branches.
  • Now hover your mouse over the first branch and click on the 3 vertical dots.
  • Click Branch policies
  • Click on the + button in the Build Validation section.
  • Select the new pipeline created in step 2 (optionally change the Display name)
  • Choose the Policy requirement (Required or Optional)
  • Click on the Save button
Repeat these steps for all branches where you need the extra check. Don't add them on feature branches, because it will also prevent you from doing manual changes in those branches.
All branches








Required or Optional
















4) Testing
Now it's time to perform a pull request. You will see that the validation will first be queued, so this extra validation will take a little extra time, especially when you have a busy agent. However, you can just continue working and wait for the approval, or even press Set auto-complete to automatically complete the Pull Request when all approvals and validations have passed. Don't use auto-complete if you chose to make it an optional validation, because then it will be cancelled as soon as all other validations are ready.

As soon as the validation is ready it will show you the first couple of errors.
Required Naming Convention Validation failed











You can click on those couple of errors to see the total number of errors and the first ten errors.
Get error count and first 10 errors

















And you can click on one of those errors to see the entire output including all errors and all correct items.

















And at the bottom you will find a summary with the total number of errors and percentages per Synapse part.
Summary example










Summary example











Or download your Synapse JSON files and run the PowerShell script locally with for example Visual Studio Code.
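A local run could look something like the hypothetical example below; the script name and parameter names are placeholders, check the script in the GitHub repository for the actual ones.

# Hypothetical example: script name and parameters are placeholders,
# see the GitHub repository for the real ones
.\CICD\PowerShell\NamingValidation.ps1 `
    -SynapsePath ".\SYN" `
    -ConfigPath ".\CICD\JSON\NamingConvention.json"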
Running Naming Validations in Visual Studio Code














Conclusion
In this post you learned how to automate your naming conventions check. This helps/forces your team to consistently use the naming conventions without you being some kind of nitpicking police officer. Everybody can just blame DevOps/themselves.

You can combine this validation with, for example, the branch validation to prevent accidentally choosing the wrong branch in a Pull Request.

Please submit your suggestions under Issues (and then Bug report or Feature request) in the GitHub repository or drop them in the comments below.




Friday 28 April 2023

DevOps - Build Validation to check branches

Case
We sometimes mess up when creating pull requests in Azure DevOps by accidentally selecting the wrong source or target branch. These mistakes caused unwanted situations where work got promoted to the wrong environment too early. Resolving those mistakes often takes a lot of time. Is there a way to prevent these easily made mistakes?
Build Validation for Pull Requests
























Solution
The first option is to set up a four-eyes principle where someone else has to approve your work, including choosing the right branches for your pull request. This can be done by setting the branch policies. Note that you need to do this for all your branches (excluding feature/personal branches).
Setting minimum number of reviewers for changes




















Another option is to use a build validation where you compare the source branch with the target branch in a script. Even better is to combine these two options, but if you (sometimes) work alone then this could be an alternative.

Note that this solution only works if you have at least two branches and you are also using feature branches.

1) Create extra YAML file
Add an extra YAML file called ValidatePullRequest.yml to your current YAML folder.
Pipeline for validating pull request















The code of the new file can be found below, but you have to change two things in this code. The first is the names of your branches, in order (on line 14). If you only have a main and a development branch then it will be [String[]]$branches = "main", "development". The second is the name/wildcard of hotfix/bugfix branches that need to be ignored by this check (on line 17). So if your bugfix branches always contain the word hotfix then it becomes [String]$fixBranch = "hotfix". The script uses a -like comparison to match the name.

trigger: none
steps:
- task: PowerShell@2
  displayName: 'Validate branches in Pull Request'
  inputs:
    targetType: 'inline'
    script: | 
        #######################################################################
        # PARAMETERS FOR SCRIPT
        #######################################################################
        # All branches in order. You can only do a pull request one up or down.
        # Feature and / or personal branches can only be pulled to or from the
        # latest branch in the row.
        [String[]]$branches = "main", "acceptance", "test", "development", "sprint"

        # Bugfix or hotfix branches can be pulled to and from all other branches
        [String]$fixBranch = "bugfix"


        #######################################################################
        # DO NOT CHANGE CODE BELOW
        #######################################################################
        $SourceBranchName = "$(System.PullRequest.SourceBranch)".toLower().Replace("refs/heads/", "") # sourceBranchName does not exist
        $TargetBranchName = "$(System.PullRequest.targetBranchName)"

        function getBranchNumber
        {
            <#
                .SYNOPSIS
                Get the order number of the branch by looping through all branches and checking the branch name
                .PARAMETER BranchName
                Name of the branch you want to check
                .EXAMPLE
                getBranchNumber -BranchName "myBranch"
            #>
            param (
                [string]$BranchName
            )
            # Loop through branches array to find a specific branchname
            for ($i = 0; $i -lt $branches.count; $i++)
            {
                # Find specific branchname
                if ($branches[$i] -eq $BranchName)
                {
                    # Return branch order number
                    # (one-based instead of zero-based)
                    return $i + 1
                }
            }
            # Unknown branch = feature branch
            return $branches.count + 1
        }


        # Retrieve branch order
        $SourceBranchId = getBranchNumber($SourceBranchName)
        $TargetBranchId = getBranchNumber($TargetBranchName)

        # Show extra information to check the outcome of the check below
        Write-Host "All branches in order: [$($branches -join "] <-> [")] <-> [feature branches]."
        Write-Host "Checking pull request from $($SourceBranchName) [$($SourceBranchId)] to $($TargetBranchName) [$($TargetBranchId)]."

        if ($SourceBranchName -like "*$($fixBranch)*")
        {
            # Pull requests for bugfix branches are unrestricted
            Write-Host "Pull request for Bugfix or hotfix branches are unrestricted."
            exit 0
        }
        elseif ([math]::abs($SourceBranchId-$TargetBranchId) -le 1)
        {
            # Not skipping branches or going from feature branch to feature branch 
            Write-Host "Pull request is valid."
            exit 0
        }
        else
        {
            # Invalid pull request that skips one or more branches
            Write-Host "##vso[task.logissue type=error]Pull request is invalid. Skipping branches is not allowed."
            exit 1
        }

In this GitHub repository you will find the latest version of the code.

Note 1: The PowerShell script within the YAML returns either 0 (success) or 1 (failure), and the Write-Host contains a little ##vso block. This allows you to write errors that will show up in the logging of DevOps.
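The same logging command also supports warnings, which show up in the DevOps run without failing the validation. A small sketch:

# Writes a warning (instead of an error) to the DevOps log and still returns success
Write-Host "##vso[task.logissue type=warning]Pull request is valid, but please double-check the target branch."
exit 0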

Note 2: The script is fairly flexible in the number of branches that you want to use, but it does require a standard order of branches. All unknown branches are considered to be feature branches. If you have a naming convention for feature branches then you could refine the IF statement to also validate that naming convention, to annoy/educate your co-workers even more. A sketch of such an extra check is shown below.
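For example, assuming your feature branches must start with 'feature/', an extra check added to the if/elseif chain in the script above could look like this sketch:

# Sketch: extra check in the if/elseif chain, assuming feature branches
# must start with "feature/" (unknown branches get number $branches.count + 1)
if ($SourceBranchId -eq $branches.count + 1 -and $SourceBranchName -notlike "feature/*")
{
    Write-Host "##vso[task.logissue type=error]Feature branches must start with 'feature/'."
    exit 1
}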

2) Create new pipeline
Now create a new pipeline based on the newly created YAML file from step 1.
  • Go to Pipelines in Azure DevOps.
  • Click on the New Pipeline button to create the new pipeline.
  • Choose the repos type (Azure DevOps Git in our example)
  • Select the right repository if you have multiple repos
  • Select Existing Azure Pipelines YAML file.
  • Select the branch and then the new YAML file under Path.
  • Save it
  • Optionally rename the Pipeline name to give it a more readable name (often the repository name will be used as a default name)














3) Add Build Validation to branch
Now that we have the new YAML pipeline, we can use it as a Build Validation. The example shows how to add them in an Azure DevOps repository.
  • In DevOps go to Repos in the left menu.
  • Then click branches to get all branches.
  • Now hover your mouse over the first branch and click on the 3 vertical dots.
  • Click Branch policies
  • Click on the + button in the Build Validation section.
  • Select the new pipeline created in step 2 (optionally change the Display name)
  • Click on the Save button
Repeat these steps for all branches where you need the extra check. Don't add them on feature branches, because it will also prevent you from doing manual changes in those branches.
Adding the Build Validation as a Branch Policy






















4) Testing
Now it's time to perform a 'valid' and an 'invalid' pull request. You will see that the validation will first be queued, so this extra validation will take a little extra time, especially when you have a busy agent. However, you can just continue working and wait for the approval, or even press Set auto-complete to automatically complete the Pull Request when all approvals and validations have passed.

If the validation fails you will see the error message, and you can click on it to also find the two regular Write-Host lines (60 and 61) with the list of branches and the branch names with their branch order number. This should help you to figure out what mistake you made.
Invalid pull request
























For the 'legal' pull request you will first see the validation being queued and then you will see the green circle with the check mark. You can click on it to see why this was a valid Pull request.
Valid pull request


















Conclusions
In this post you learned how to create Build Validations for your Pull Requests in Azure DevOps. In this case a simple script to compare the Source and Target branch from a Pull Request. In a second example, coming online soon, we will validate the naming conventions within a Synapse Workspace (or ADF) during a Pull Request. This will prevent your colleagues (you are of course always following the rules) from not following the naming conventions because the Pull Request will automatically fail.
Sneak preview














Special thanks to colleague Joan Zandijk for creating the initial version of the YAML and PowerShell script for the branch check.

Saturday 1 April 2023

Synapse - Cleanup workspace after deployment

Case
A while ago we created a script to clean up the Synapse Workspace before deploying new pipelines, datasets, etc. to Synapse. This helps you to remove old parts like for example triggers because the Synapse deployment is incremental and does not do any deletes. That script works fine, however for workspaces with a lot of items it could take up to 30 minutes to clean up everything. In that post we already mentioned some future improvements...
Cleanup Synapse Workspace after deployment



Update: use DeleteArtifactsNotInTemplate: true in the deployment task to avoid PowerShell

Solution

This version of the cleanup script compares the items in the Synapse Workspace against the Artifact that you just deployed. Everything that is in the workspace but not in the artifact will now be deleted (afterwards). This will significantly shorten the cleanup period during deployment.
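The comparison itself is done with the PowerShell Compare-Object cmdlet: items that exist in the workspace but not in the artifact end up in the result and will be removed. A tiny example with made-up pipeline names:

# Example with made-up names: PL_Old exists in the workspace but not in the
# artifact, so it is the item that will be removed
$ArtifactPipelines = "PL_Stage", "PL_Load"
$SynapsePipelines  = "PL_Stage", "PL_Load", "PL_Old"
Compare-Object -ReferenceObject $ArtifactPipelines -DifferenceObject $SynapsePipelines |
    Select-Object -ExpandProperty InputObject    # returns PL_Old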

You need to add this YAML step to your Synapse deployment pipeline (preferably after the deployment step) and change the path of the PowerShell script to match your setup. The first parameter is the Synapse Workspace name of the environment you are deploying to. The second one is the corresponding Resource Group name. The last one is the location of the artifact, in this case the Pipeline.Workspace variable + "s\Synapse" (see the screenshot below of the treeview on the agent).
###################################
# 5 Cleanup Synapse
###################################
- task: AzurePowerShell@5
  displayName: '5 Cleanup Synapse'
  inputs:
    azureSubscription: ${{ parameters.ServiceConnnection }}
    scriptType: filePath
    scriptPath: $(Pipeline.Workspace)\s\CICD\Powershell\ClearSynapse.ps1
    scriptArguments:
      -WorkspaceName ${{ parameters.TargetWorkspaceName }} `
      -ResourceGroupName ${{ parameters.ResourceGroupName }} `
      -ArtifactDirectory $(Pipeline.Workspace)\s\Synapse
    azurePowerShellVersion: latestVersion
    pwsh: true
Showing the path of the Artifact root folder















Save the PowerShell Script below in your repository and change the path in the above YAML script. We like to have a CICD folder in the root to store everything deployment related.
 
Repos folder structure























param (
   [Parameter (Mandatory = $true, HelpMessage = 'Synapse name')]
   [ValidateNotNullOrEmpty()]
   [string] $WorkspaceName,
   
   [Parameter (Mandatory = $true, HelpMessage = 'Resourcegroup name')]
   [ValidateNotNullOrEmpty()]
   [string] $ResourceGroupName,
   
   [Parameter (Mandatory = $true, HelpMessage = 'Artifact Directory')]
   [ValidateNotNullOrEmpty()]
   [string] $ArtifactDirectory
)

# Two default Linked Services we cannot remove
[string] $WorkspaceDefaultSqlServer = "$($WorkspaceName)-WorkspaceDefaultSqlServer"
[string] $WorkspaceDefaultSqlStorage = "$($WorkspaceName)-WorkspaceDefaultStorage"

# A little dummy protection to check whether you have configured the right folder
# If these folders are not available there is probably something wrong and we
# don't want to delete everything in the workspace.
if (!(Test-Path -Path (Join-Path $ArtifactDirectory "integrationRuntime")) -AND
    !(Test-Path -Path (Join-Path $ArtifactDirectory "linkedService")) -AND 
    !(Test-Path -Path (Join-Path $ArtifactDirectory "pipeline")))
    {
        Write-Output "Artifact folder $($ArtifactDirectory) "
        throw "Dummy protection - Probably not the right folder that stores your artifact"
    }


#######################################################
# 1) Checking for resource locks and removing them
#######################################################
Write-Output "==========================================="
Write-Output "1) Removing resource locks"
Write-Output "==========================================="

# Getting all locks on the Azure Synapse Workspace
$lock = Get-AzResourceLock -ResourceGroupName $ResourceGroupName -ResourceName $WorkspaceName -ResourceType "Microsoft.Synapse/workspaces"
Write-Output "Found $($lock.Count) locks"

# Check if the collection of Azure resource locks is not empty
if($null -ne $lock)
{
    # Looping through all resource locks to remove them one by one
    $lock | ForEach-Object -process {
        # Remove lock
        Write-Output "Removing Lock Id: $($_.LockId)"
        Remove-AzResourceLock -LockId $_.LockId -Force
    }
}
Write-Output "Step 'Removing resource locks' completed`r`n"



#######################################################
# 2) Stop and remove Triggers not in Artifact
#######################################################
Write-Output "==========================================="
Write-Output "2) Stop and remove Triggers not in Artifact"
Write-Output "==========================================="

# Check if the artifact contains any triggers
if (Test-Path -Path (Join-Path $ArtifactDirectory "trigger"))
{
    # Getting all Triggers from Artifact
    $ArtifactTriggers = Get-ChildItem -Path (Join-Path $ArtifactDirectory "trigger") -Filter "*.json" | Select-Object -ExpandProperty BaseName

    # Getting all Triggers from Synapse
    $SynapseTriggers = Get-AzSynapseTrigger -WorkspaceName $WorkspaceName | Select-Object -ExpandProperty Name

    # Getting Triggers from Synapse that are not in the Artifact
    $Triggers = Compare-Object -ReferenceObject $ArtifactTriggers -DifferenceObject $SynapseTriggers | Select-Object -ExpandProperty InputObject
} else {
    # Fill collection with all existing triggers to remove them
    # because artifact doesn't contain any triggers anymore
    Write-Output "Path not found in Artifact, removing all existing Triggers in Synapse Workspace"
    $Triggers = Get-AzSynapseTrigger -WorkspaceName $WorkspaceName | Select-Object -ExpandProperty Name
}
Write-Output "Found $($Triggers.Count) Triggers that are not in the Artifact"

# Stopping all Triggers before deleting them
$Triggers | ForEach-Object -process { 
    Write-Output "Stopping Trigger $($_)"
    try {
        # Trying to stop each Trigger
        Stop-AzSynapseTrigger -WorkspaceName $WorkspaceName -Name $($_) -ErrorAction Stop
    }
    catch {
        if ($_.Exception.Message -eq "{}")
        {
            # Ignore failures for triggers that are already stopped
            Write-Output "Trigger stopped"
        }
        else {
            # Unexpected error
            Write-Output "Something went wrong while stopping trigger!"
            Throw $_
        }
    }
    # Remove trigger
    Write-Output "Removing Trigger $($_)"
    Remove-AzSynapseTrigger -Name $_ -WorkspaceName $WorkspaceName -Force
}
Write-Output "Step 'Stop and remove Triggers not in Artifact' completed`r`n"



#######################################################
# 3) Remove Pipelines not in Artifact
#######################################################
Write-Output "==========================================="
Write-Output "3) Remove Pipelines not in Artifact"
Write-Output "==========================================="

# Check if artifact contains any pipelines
if (Test-Path -Path (Join-Path $ArtifactDirectory "pipeline"))
{
    # Getting all Pipelines from Artifact
    $ArtifactPipelines = Get-ChildItem -Path (Join-Path $ArtifactDirectory "pipeline") -Filter "*.json" | Select-Object -ExpandProperty BaseName

    # Getting all Pipelines from Synapse
    $SynapsePipelines = Get-AzSynapsePipeline -WorkspaceName $WorkspaceName | Select-Object -ExpandProperty Name

    # Getting Pipelines from Synapse that are not in the Artifact
    $Pipelines = Compare-Object -ReferenceObject $ArtifactPipelines -DifferenceObject $SynapsePipelines | Select-Object -ExpandProperty InputObject
} else {
    # Fill collection with all existing pipelines to remove them 
    # because artifact doesn't contain any pipelines anymore
    Write-Output "Path not found in Artifact, removing all existing Pipelines in Synapse Workspace"
    $Pipelines = Get-AzSynapsePipeline -WorkspaceName $WorkspaceName | Select-Object -ExpandProperty Name
}
Write-Output "Found $($Pipelines.Count) Synapse Pipelines that are not in the Artifact"

# Trying to delete all pipelines. If a pipeline is still referenced
# by another pipeline it will continue to remove other pipelines 
# before trying to remove it again... max 100 times. So don't create
# chains of pipelines that are too long
[int] $depthCount = 0
while ($Pipelines.Count -gt 0 -and $depthCount -lt 100)
{
    # Loop through collection of pipelines and try to remove them
    $Pipelines | ForEach-Object -process { 
        Write-Output "Trying to delete pipeline $($_)"
        Remove-AzSynapsePipeline -Name $_ -WorkspaceName $WorkspaceName -Force -ErrorAction SilentlyContinue
    }

    # Wait 2 seconds before retry and raise retry counter
    Start-Sleep -Seconds 2
    $depthCount += 1

    # Check if artifact contains any pipelines
    if (Test-Path -Path (Join-Path $ArtifactDirectory "pipeline"))
    {   
        # Getting all Pipelines from Synapse
        $SynapsePipelines = Get-AzSynapsePipeline -WorkspaceName $WorkspaceName | Select-Object -ExpandProperty Name
        
        # Getting Pipelines from Synapse that are not in the Artifact
        $Pipelines = Compare-Object -ReferenceObject $ArtifactPipelines -DifferenceObject $SynapsePipelines | Select-Object -ExpandProperty InputObject
    }  else {
        # Fill collection with all existing pipelines to remove them 
        # because artifact doesn't contain any pipelines anymore
        $Pipelines = Get-AzSynapsePipeline -WorkspaceName $WorkspaceName | Select-Object -ExpandProperty Name
    }
    
    # Check if there are any pipelines left for a retry
    if ($Pipelines.count -gt 0)
    {
        Write-Output "Still found $($Pipelines.Count) Synapse Pipelines that are not in the Artifact. Starting next iteration."
    } else {
        Write-Output "Deletion of Pipelines not in Artifact completed"
    }
}
# Error when you have circular pipeline links or just way too many levels
if ($depthCount -eq 100)
{
    throw "Too many levels of child pipelines or circular relations!"
}
Write-Output "Step 'Remove Pipelines not in Artifact' completed`r`n"



#######################################################
# 4) Remove Notebooks not in Artifact
#######################################################
Write-Output "==========================================="
Write-Output "4) Remove Notebooks not in Artifact"
Write-Output "==========================================="

# Check if artifact contains any notebooks
if (Test-Path -Path (Join-Path $ArtifactDirectory "notebook"))
{
    # Getting all Notebooks from Artifact
    $ArtifactNotebooks = Get-ChildItem -Path (Join-Path $ArtifactDirectory "notebook") -Filter "*.json" | Select-Object -ExpandProperty BaseName

    # Getting all Notebooks from Synapse
    $SynapseNotebooks = Get-AzSynapseNotebook -WorkspaceName $WorkspaceName | Select-Object -ExpandProperty Name

    # Getting Notebooks from Synapse that are not in the Artifact
    $Notebooks = Compare-Object -ReferenceObject $ArtifactNotebooks -DifferenceObject $SynapseNotebooks | Select-Object -ExpandProperty InputObject
} else {
    # Fill collection with all existing notebooks to remove them 
    # because artifact doesn't contain any notebooks anymore
    Write-Output "Path not found in Artifact, removing all existing Notebooks in Synapse Workspace"
    $Notebooks = Get-AzSynapseNotebook -WorkspaceName $WorkspaceName | Select-Object -ExpandProperty Name
}    
Write-Output "Found $($Notebooks.Count) Synapse Notebooks that are not in the Artifact"

# Loop through collection of Notebooks to delete them
$Notebooks | ForEach-Object -process {
    Write-Output "Deleting Notebook $($_)"
    Remove-AzSynapseNotebook -Name $($_) -WorkspaceName $WorkspaceName -Force
}
Write-Output "Step 'Remove Notebooks not in Artifact' completed`r`n"



#######################################################
# 5) Remove SQL-Scripts not in Artifact
#######################################################
Write-Output "==========================================="
Write-Output "5) Remove SQL-Scripts not in Artifact"
Write-Output "==========================================="

# Check if artifact contains any SQL Scripts
if (Test-Path -Path (Join-Path $ArtifactDirectory "sqlscript"))
{
    # Getting all SQL-scripts from Artifact
    $ArtifactSQLScripts = Get-ChildItem -Path (Join-Path $ArtifactDirectory "sqlscript") -Filter "*.json" | Select-Object -ExpandProperty BaseName

    # Getting all SQL-scripts from Synapse
    $SynapseSQLScripts = Get-AzSynapseSqlScript -WorkspaceName $WorkspaceName | Select-Object -ExpandProperty Name

    # Getting SQL-scripts from Synapse that are not in the Artifact
    $SQLScripts = Compare-Object -ReferenceObject $ArtifactSQLScripts -DifferenceObject $SynapseSQLScripts | Select-Object -ExpandProperty InputObject
} else {
    # Fill collection with all existing SQL Scripts to remove them 
    # because artifact doesn't contain any SQL Scripts anymore
    Write-Output "Path not found in Artifact, removing all existing SQL-Scripts in Synapse Workspace"
    $SQLScripts = Get-AzSynapseSqlScript -WorkspaceName $WorkspaceName | Select-Object -ExpandProperty Name
} 
Write-Output "Found $($SQLScripts.Count) Synapse SQL-Scripts that are not in the Artifact"

# Loop through collection of SQL scripts to delete them
$SQLScripts | ForEach-Object -Process {
    Write-Output "Deleting SQL-script $($_)"
    Remove-AzSynapseSqlScript -Name $($_) -WorkspaceName $WorkspaceName -Force
}
Write-Output "Step 'Remove SQL-Scripts not in Artifact' completed`r`n"



#######################################################
# 6) Remove Datasets not in Artifact
#######################################################
Write-Output "==========================================="
Write-Output "6) Remove Datasets not in Artifact"
Write-Output "==========================================="

# Check if artifact contains any datasets
if (Test-Path -Path (Join-Path $ArtifactDirectory "dataset"))
{
    # Getting all Datasets from Artifact
    $ArtifactDatasets = Get-ChildItem -Path (Join-Path $ArtifactDirectory "dataset") -Filter "*.json" | Select-Object -ExpandProperty BaseName

    # Getting all Datasets from Synapse
    $SynapseDatasets = Get-AzSynapseDataset -WorkspaceName $WorkspaceName | Select-Object -ExpandProperty Name

    # Getting Datasets from Synapse that are not in the Artifact
    $Datasets = Compare-Object -ReferenceObject $ArtifactDatasets -DifferenceObject $SynapseDatasets | Select-Object -ExpandProperty InputObject
} else {
    # Fill collection with all existing Datasets to remove them 
    # because artifact doesn't contain any Datasets anymore
    Write-Output "Path not found in Artifact, removing all existing Datasets in Synapse Workspace"
    $Datasets = Get-AzSynapseDataset -WorkspaceName $WorkspaceName | Select-Object -ExpandProperty Name
}
Write-Output "Found $($Datasets.Count) Synapse Datasets that are not in the Artifact"

# Loop through collection of Datasets to delete them
$Datasets | ForEach-Object -process { 
    Write-Output "Deleting Dataset $($_)"
    Remove-AzSynapseDataset -Name $_ -WorkspaceName $WorkspaceName -Force
}
Write-Output "Step 'Remove Datasets not in Artifact' completed`r`n"



#######################################################
# 7) Remove Linked Services not in Artifact
#######################################################
Write-Output "==========================================="
Write-Output "7) Remove Linked Services not in Artifact"
Write-Output "==========================================="

# Check if artifact contains any Linked Services
if (Test-Path -Path (Join-Path $ArtifactDirectory "linkedService"))
{
    # Getting all Linked Services from Artifact
    $ArtifactLinkedServices = Get-ChildItem -Path (Join-Path $ArtifactDirectory "linkedService") -Filter "*.json" | Where-Object {($_ -NotLike "*WorkspaceDefaultSqlServer.json" -and  $_ -NotLike "*WorkspaceDefaultStorage.json") } | Select-Object -ExpandProperty BaseName
    
    # Getting all Linked Services from Synapse
    $SynapseLinkedServices = Get-AzSynapseLinkedService -WorkspaceName $WorkspaceName | Where-Object {($_.Name -ne $WorkspaceDefaultSqlServer -and  $_.Name -ne $WorkspaceDefaultSqlStorage) } | Select-Object -ExpandProperty Name

    # Getting Linked Services from Synapse that are not in the Artifact
    $LinkedServices = Compare-Object -ReferenceObject $ArtifactLinkedServices -DifferenceObject $SynapseLinkedServices | Select-Object -ExpandProperty InputObject
} else {
    # Fill collection with all existing Linked Services to remove them 
    # because artifact doesn't contain any Linked Services anymore
    Write-Output "Path not found in Artifact, removing all existing Linked Services in Synapse Workspace"
    $LinkedServices = Get-AzSynapseLinkedService -WorkspaceName $WorkspaceName | Where-Object {($_.Name -ne $WorkspaceDefaultSqlServer -and  $_.Name -ne $WorkspaceDefaultSqlStorage) } | Select-Object -ExpandProperty Name
}
Write-Output "Found $($LinkedServices.Count) Synapse Linked Services that are not in the Artifact"

# Trying to delete all linked services. If a linked service is still
# referenced by another linked service it will continue to remove 
# other linked services before trying to remove it again... 
# max 100 times. Example: KeyVault linked services
$depthCount = 0
while ($LinkedServices.Count -gt 0 -and $depthCount -lt 100)
{
     # Loop through collection of Linked Services and try to remove them
    $LinkedServices | ForEach-Object -process { 
        Write-Output "Trying to delete Linked Service $($_)"
        Remove-AzSynapseLinkedService -Name $_ -WorkspaceName $WorkspaceName -Force -ErrorAction Continue
    }

    # Wait 2 seconds before retry and raise retry counter
    Start-Sleep 2 
    $depthCount += 1

    # Check if artifact contains any Linked Services
    if (Test-Path -Path (Join-Path $ArtifactDirectory "linkedService"))
    {
        # Getting all Linked Services from Synapse
        $SynapseLinkedServices = Get-AzSynapseLinkedService -WorkspaceName $WorkspaceName | Where-Object {($_.Name -ne $WorkspaceDefaultSqlServer -and  $_.Name -ne $WorkspaceDefaultSqlStorage) } | Select-Object -ExpandProperty Name

        # Getting Linked Services from Synapse that are not in the Artifact
        $LinkedServices = Compare-Object -ReferenceObject $ArtifactLinkedServices -DifferenceObject $SynapseLinkedServices | Select-Object -ExpandProperty InputObject
    } else {
        # Fill collection with all existing Linked Services to remove them 
        # because artifact doesn't contain any Linked Services anymore
        $LinkedServices = Get-AzSynapseLinkedService -WorkspaceName $WorkspaceName | Where-Object {($_.Name -ne $WorkspaceDefaultSqlServer -and  $_.Name -ne $WorkspaceDefaultSqlStorage) } | Select-Object -ExpandProperty Name
    }
    
    # Check if there are any Linked Services left for a retry
    if ($LinkedServices.count -gt 0)
    {
        Write-Output "Still found $($LinkedServices.Count) Synapse Linked Services that are not in the Artifact. Starting next iteration."
    } else {
        Write-Output "Deletion of Linked Services not in Artifact completed"
    }
}

# Error when you have circular Linked Service references or just way too many levels
if ($depthCount -eq 100)
{
    throw "Too many levels of references to other Linked Services!"
}
Write-Output "Step 'Remove Linked Services not in Artifact' completed"
Write-Output "==========================================="


There is a check at the start of the script that verifies whether it can find some expected subfolders in the supplied artifact path. Otherwise it would assume that you don't have any datasets, pipelines, etc. and clean up your entire Synapse Workspace.






If nothing has to be cleaned/removed the script is ready within seconds.
No cleanup necessary



















If items do have to be removed it is still ready in a few minutes instead of half an hour. Notice the iterations for cleaning up the pipelines. It will try to delete a pipeline, but if it is still used by another pipeline then it will continue with the next pipeline and try again afterwards. The same construction is used for Linked Services, where for example a Key Vault Linked Service can still be used by another Linked Service.
Some cleanup was necessary




















Conclusions
In this post you learned how to clean your Synapse Workspace much more efficiently by comparing the Workspace and the Artifact. This way you don't have to delete each item in your workspace, but only the ones you deleted during development.

Note that not all parts of Synapse are covered by this cleanup script. For example, KQL scripts and Dataflows are still missing. They will be added later on. If you want to add those yourself, make sure they are in the right place within the script: Dataflows should probably be added before the pipelines and the KQL scripts before or after the SQL Scripts (see the sketch below). Feel free to let us know if you have any improvements for this script that you would like to share with the community.
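As a starting point, below is a sketch of an extra Dataflow step that mirrors the Notebook step in the script above. It assumes the artifact folder is called "dataflow" and that the Az.Synapse cmdlets Get-AzSynapseDataFlow and Remove-AzSynapseDataFlow are available on your agent.

# Sketch only: assumes the "dataflow" artifact folder and the
# Get-AzSynapseDataFlow / Remove-AzSynapseDataFlow cmdlets
if (Test-Path -Path (Join-Path $ArtifactDirectory "dataflow"))
{
    # Getting all Dataflows from Artifact and from Synapse
    $ArtifactDataflows = Get-ChildItem -Path (Join-Path $ArtifactDirectory "dataflow") -Filter "*.json" | Select-Object -ExpandProperty BaseName
    $SynapseDataflows = Get-AzSynapseDataFlow -WorkspaceName $WorkspaceName | Select-Object -ExpandProperty Name

    # Getting Dataflows from Synapse that are not in the Artifact
    $Dataflows = Compare-Object -ReferenceObject $ArtifactDataflows -DifferenceObject $SynapseDataflows | Select-Object -ExpandProperty InputObject
} else {
    # Artifact doesn't contain any Dataflows anymore, so remove all existing ones
    $Dataflows = Get-AzSynapseDataFlow -WorkspaceName $WorkspaceName | Select-Object -ExpandProperty Name
}

# Loop through collection of Dataflows to delete them
$Dataflows | ForEach-Object -Process {
    Write-Output "Deleting Dataflow $($_)"
    Remove-AzSynapseDataFlow -Name $($_) -WorkspaceName $WorkspaceName -Force
}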

Special thanks to colleague Joan Zandijk for helping out.

Friday 27 January 2023

Deploy a Power BI dataset via a DevOps - Publish

Case
I want to develop my Power BI dataset separately from my Power BI reports. So we started using Tabular Editor. This also allows us to work on the model with multiple people at the same time because every object is a separate JSON file. But how can we automatically deploy these files using Azure DevOps?
Deploy your Power BI datasets with DevOps











Solution
In this series of posts we will deploy the Tabular Editor project as a Power BI dataset via DevOps. We will deploy this to the development workspace. For the deployment from Dev to Test and Test to Prod we will use Power BI Deployment pipelines.


In this third post we will create the YAML files that will do the actual publishing of the data sets. The first file uses Tabular Editor to publish to the Development environment and the second file calls the Power BI deployment pipeline to publish to test/acceptance and to production. Both YAML files are stored in the repository under CICD\YAML\*****.yml. 

Repository folder structure
























1) DeployPBIModelDev.yml
The first YAML file is DeployPBIModelDev.yml. It is called by the build pipeline from the previous post and uses Tabular Editor to publish the *.bim files from the artifact as datasets to the Development workspace.

First we are adding the parameters to this YAML file. The values will be passed through by the YAML file of the previous post.
parameters:
  - name: env
    displayName: Environment
    type: string
    values:
    - DEV
  - name: SpApplicationId
    displayName: Service Principal Application Id
    type: string
  - name: SpClientSecret
    displayName: Service Principal Client Secret
    type: string
  - name: SpTenantId
    displayName: Service Principal Tenant Id
    type: string
  - name: DownloadLinkTabularEditor
    displayName: Download Link Tabular Editor
    type: string

This part connects to an environment to which you can add approval steps in DevOps under Pipelines - Environments.
jobs:
  - deployment: deploymentjob${{ parameters.env }}
    displayName: Deployment Job ${{ parameters.env }} 
    environment: Deploy to ${{ parameters.env }}
    strategy:
      runOnce:
        deploy:
          steps:
The first step is a checkout to get the files from the repository to the agent. For this simple example it is not required, but later on you perhaps want to execute some PowerShell scripts that are stored in the repository, for example to connect the dataset to a gateway.
          ###################################
          # 1 Check out repository to agent
          ###################################
          - checkout: self
            displayName: '1 Retrieve Repository'
            clean: true 
In the second step we will download and unzip Tabular Editor. If you have a self-hosted agent instead of a Microsoft-hosted agent, you could also install this software on the VM manually. However, this step allows you to easily upgrade to a newer version of the editor by just changing the URL in the variable group.
          ###################################
          # 2 Download Tabular Editor
          ###################################
          - powershell: |
              # Download URL for Tabular Editor portable:
              
              # Create Download folder
              $DownloadFolder = join-path (get-location) "TabularEditor"
              new-item -type directory -path $DownloadFolder -Force
              Write-Host "Create download location: $($DownloadFolder)" 

              # Download destination (root of PowerShell script execution path):
              $DownloadFile = join-path $DownloadFolder "TabularEditor.zip"
              Write-Host "Save download as: $($DownloadFile)"
              
              # Download from GitHub:
              Write-Host "Downloading Tabular Editor from: $(DownloadLinkTabularEditor)" 
              Invoke-WebRequest -Uri $(DownloadLinkTabularEditor) -OutFile $DownloadFile
              
              # Unzip Tabular Editor portable, and then delete the zip file:
              Write-host "Extracting downloaded zip file"
              Expand-Archive -Path $DownloadFile -DestinationPath $DownloadFolder
              Write-host "Clean up download file"
              Remove-Item $DownloadFile
            displayName: '2 Download Tabular Editor'
The third step is an optional debug step showing you the artifact and repository files on your agent, as well as the exact location of the Tabular Editor executable which we need in the next step. Once everything is running fine you can remove or comment out this step.
          ###################################
          # 3 Show treeview of agent
          ###################################
          - powershell: |
              tree "$(Pipeline.Workspace)" /F
            displayName: '3 Treeview of Pipeline.Workspace'
The fourth step is the only mandatory step. It executes Tabular Editor for each project on the agent. Check out all the parameters after TabularEditor.exe, where you will see the service principal and a lot of parameters of the tool itself.
          ###################################
          # 4 Publish data models
          ###################################
          - powershell: |
              # Get Artifact folder with model files
              $ModelsFolder = Join-Path -Path $(Pipeline.Workspace) -ChildPath "\PBIModelsArtifact\"

              # Loop through local artifact folder to get all datamodels
              $AllDatamodels = Get-ChildItem -Path $ModelsFolder -Recurse -Filter *.bim
              Write-Host "$($AllDatamodels.Count) model$(If ($AllDatamodels.Count -ne 1) {"s"}) found in $($ModelsFolder)"
              Write-Host "================================"
              foreach ($DataModel in $AllDatamodels)
              {
                $Path = "$($DataModel.Fullname)"
                Write-Host "Processing model $($Path)"

                # Extract datasetname and workspacename from folder path of model
                $DatasetName = $Path | Split-Path -Parent | Split-Path -Leaf
                $WorkspaceName = $Path | Split-Path -Parent | Split-Path -Parent | Split-Path -Leaf
                # Concat workspace name with the suffix used for the deployment workspace
                $WorkspaceName = "$($WorkspaceName)$(PowerBiWorkspaceSuffix)"
                  
                Write-Host "Publishing model [$($DatasetName)] in [$($WorkspaceName)]"
                # Publish model.bim to Power BI Workspace
                # Deleted -P argument at the end of the command line, because we don't want to overwrite the table partitions in the model
                $(Build.SourcesDirectory)\TabularEditor\TabularEditor.exe "$($DataModel.Fullname)" -D "Provider=MSOLAP;Data Source=powerbi://api.powerbi.com/v1.0/myorg/$($WorkspaceName);User ID=app:${{ parameters.SpApplicationId }}@${{ parameters.SpTenantId }};Password=${{ parameters.SpClientSecret }};" "$($DatasetName)" -O -C -R -M -E -V
                Wait-Process -Name "TabularEditor"

                Write-Host "================================"
              }
            displayName: '4 Deploy data models'

This is all you need to deploy your model to a Power BI Workspace. You could reuse this for other environments like test/acceptance and production as well. However we will be using the Power BI Deployment Pipelines for the next environments. This also allows you to change parameters and connections without writing code.

2) DeployPBIModelAccPrd.yml
By default the Power BI deployment pipeline will deploy everything from the source workspace to the destination workspace. We only want to deploy datasets and leave all the other stuff for other pipelines. To be even more specific, we only want datasets from the repository and leave any manually deployed datasets. In the GUI you can make that selection manually. In the YAML code you can provide a semicolon-separated string with the names of the datasets you want to promote. For this we will create a variable, fill it with some PowerShell and then provide it as a parameter for the deployment.

First we are adding the parameters to this YAML file. The values will be passed through by the YAML file of the previous post. Note that the parameters differ from the YAML above. This is mainly because we can use a DevOps Service Connection instead of a Service Principal.
parameters:
  - name: env
    displayName: Environment
    type: string
    values:
    - ACC
    - PRD
  - name: DeployStagePowerBI
    displayName: DeployStagePowerBI
    type: string
    values:
    - Test
    - Production
  - name: ServiceConnection
    displayName: Service Connection
    type: string
  - name: ReposWorkspaceFolder
    displayName: ReposWorkspaceFolder
    type: string
  - name: DeploymentPipelinePBI
    displayName: Deployment Pipeline PBI
    type: string
This part connects to an environment to which you can add approval steps in DevOps under Pipelines - Environments. Note that we created an empty string variable named Datamodels. We will use this for the list of datasets that we want to promote.
jobs:
  - deployment: deploymentjob${{ parameters.env }}
    displayName: Deployment Job ${{ parameters.env }} 
    environment: Deploy to ${{ parameters.env }}
    variables:
    - name: Datamodels
      value: ""

    strategy:
      runOnce:
        deploy:
          steps:
The first step is a checkout to get the files from the repository to the agent. For this simple example it is not required, but it could be handy for a more complex pipeline where you want to use PowerShell scripts from your repository.
          ###################################
          # 1 Check out repository to agent
          ###################################
          - checkout: self
            displayName: '1 Retrieve Repository'
            clean: true 
The second step is also optional, but it shows the artifact (and repository) files on your agent. We need this in the next step. You can remove or comment out the code once everything works.
          ###################################
          # 2 Show environment and treeview
          ###################################
          - powershell: |
              Write-Output "Deploying PBI models in the ${{ parameters.env }} environment"
              tree "$(Pipeline.Workspace)" /F
            displayName: '2 Show environment and treeview Pipeline_Workspace'
In this step we will loop through the artifact folder to get a list of all datasets. The code will first create an array of subfolders (the subfolder has the name of the dataset). Then the array will be used to create a semicolon-separated string of dataset names. The last Write-Host line looks a bit strange, but it will populate the YAML variable that we created above.
          ###################################
          # 3 Fill variable with list of models
          ###################################
          - powershell: |
              # Get Artifact folder with model files
              $WorkspacesFolder = Join-Path -Path $(Pipeline.Workspace) -ChildPath "\PBIModelsArtifact\Models\${{ parameters.ReposWorkspaceFolder }}"

              # Loop through local artifact folder to get all datamodels
              [array]$ModelsArray = Get-ChildItem -Path $WorkspacesFolder | ? {$_.PSIsContainer} | Select-Object Name
              Write-Host "$($ModelsArray.Count) workspace$(If ($ModelsArray.Count -ne 1) {"s"}) found in $($WorkspacesFolder)"

              [string]$ModelsList = $null
              $ModelsList = $ModelsArray.name -join ";"
              Write-Host "Value $($ModelsList)"
              Write-Host "##vso[task.setvariable variable=Datamodels;]$ModelsList"
            displayName: '3 Fill variable with list of Workspaces'
For debug purposes we show the variable with the semicolon separated string of dataset names. You can remove this once everything works or leave it and use it as log information for your pipeline.
          ###################################
          # 4 Show environment and treeview
          ###################################
          - powershell: |
              Write-Output "Variable value of Datamodels $(Datamodels)"
            displayName: '4 Show variable value'
The last step is calling the deployment task from the installed DevOps add-on for Power BI. We use Selective for the deployType, which allows us to provide a list of datasets. You can use a similar construction for the reports, which we will show in a later blog post.
          ###################################
          # 5 Deploy models
          ###################################
          - task: ms-pbi-api.pbi-automation-tools.Pipelines_Deploy.DeploymentPipelines-Deploy@1
            displayName: '5 Deploy models to ${{ parameters.DeployStagePowerBI }} stage'
            inputs:
                pbiConnection: PowerBI
                pipeline: ${{ parameters.DeploymentPipelinePBI }}
                stageOrder: ${{ parameters.DeployStagePowerBI }}
                deployType: Selective
                datasets: $(Datamodels)
                createNewWS: false

Running it once for dev and once for test(/acc) will look like this. Note that some of the stages are skipped. This is due to the branch strategy and the conditions we used between the stages.
The result of 2 runs























In the runs overview it will look like this.














Conclusion
In this post we showed two different ways to deploy your Power BI datasets. The first one, with Tabular Editor, can be used without a Power BI Premium license. A disadvantage is that it is a lot of work to change parameters and connections.

The second method, with the Power BI Deployment pipelines, requires Power BI Premium, but those pipelines allow you to change parameters and connections without code. A big disadvantage is that your datasets already need to be present in the development workspace of Power BI. This can be done with a manual deployment or by combining both methods to get the best of both worlds and versioning of your work in a repository.



Thursday 26 January 2023

Deploy a Power BI dataset via a DevOps - Build

Case
I want to develop my Power BI dataset separately from my Power BI reports. So we started using Tabular Editor. This also allows us to work on the model with multiple people at the same time because every object is a separate JSON file. But how can we automatically deploy these files using Azure DevOps?
Deploy your Power BI datasets with DevOps











Solution
In this series of posts we will deploy the Tabular Editor project as a Power BI dataset via DevOps. We will deploy this to the development workspace. For the deployment from Dev to Test and Test to Prod we will use Power BI Deployment pipelines.


In this second post we will create the YAML file that will build the Tabular Editor project that is stored in the repository. The filename we used is BuildModel.yml and it is stored in the repository under CICD\YAML\BuildModel.yml. There are two additional YAML files in that same folder and those will be explained in the next post. However they will be mentioned in one of the tasks below.

Repository folder structure
























1) BuildModel.yml
The first YAML file is BuildModel.yml that builds the JSON files of the tabular editor project into a deployable *.bim file (the dataset) and then calls the deployment YAML files to do the actual deployment.

First we are adding the general variable group (mentioned in the previous post) to the YAML file. This allows us to use those variables in the YAML and PowerShell code later on.
###################################
# General Variables
###################################
variables:
  - group: PBIModelParamsGen


Secondly, we need to determine when to trigger this pipeline. This part of course heavily depends on your branch strategy. In this example it triggers on changes in multiple branches, but only for models in a certain Power BI workspace folder. Note that the path is hardcoded and that it can't use variables from your variable groups.
###################################
# When to create a pipeline run
###################################
trigger:
  branches:
    include:
    - Development
    - Acceptance
    - main
  paths:
    include:
    - Models/myPowerBIWorkspace/*


The next step is describing the first stage of the pipeline where we build the model and make it ready for publishing. The important property is pool, where you either configure a self-hosted agent (via name) or a Microsoft-hosted agent (via vmImage). If you are using a self-hosted agent then the clean: all setting helps you to clean up your agent before you start.
stages:
###################################
# Create Artifact of datamodels
###################################
- stage: CreateModelArtifact
  displayName: Create PBI Model Artifact

  jobs:
  - job: CreateModel
    displayName: 'Create Model'
    workspace:
      clean: all
    pool:
      name: mySelfHostedAgent
      # vmImage: 'windows-latest' #'ubuntu-latest'
    steps:


The first real task is retrieving all the files from the repository. After this step your datasets/models will be available on the agent itself. This is necessary for the build step.
    ###################################
    # 1 Retrieve Repository
    ###################################
    - checkout: self
      displayName: '1 Retrieve Repository'
      clean: true


The next task is downloading and unzipping the tool Tabular Editor. If you have a self-hosted agent then you could also choose to download/install this on your Virtual Machine. However, installing it via YAML makes it very easy to use a newer version just by changing the 'DownloadLinkTabularEditor' variable in the general variable group (see previous post). That variable should be filled with the download URL of the portable/zip file: https://github.com/TabularEditor/TabularEditor/releases/
    ###################################
    # 2 Download Tabular Editor
    ###################################
    - powershell: |
        # Download URL for Tabular Editor portable:
        
        # Create Download folder
        $DownloadFolder = join-path (get-location) "TabularEditor"
        new-item -type directory -path $DownloadFolder -Force
        Write-Host "Create download location: $($DownloadFolder)" 

        # Download destination (root of PowerShell script execution path):
        $DownloadFile = join-path $DownloadFolder "TabularEditor.zip"
        Write-Host "Save download as: $($DownloadFile)"
        
        # Download from GitHub:
        Write-Host "Downloading Tabular Editor from: $(DownloadLinkTabularEditor)" 
        Invoke-WebRequest -Uri $(DownloadLinkTabularEditor) -OutFile $DownloadFile
        
        # Unzip Tabular Editor portable, and then delete the zip file:
        Write-host "Extracting downloaded zip file"
        Expand-Archive -Path $DownloadFile -DestinationPath $DownloadFolder
        Write-host "Clean up download file"
        Remove-Item $DownloadFile
      displayName: '2 Download Tabular Editor'


To check whether the checkout step and the download/unzip step were successful, we have an optional step that shows a treeview of the agent. Once everything is working you can remove or comment out this step.
    ###################################
    # 3 Show treeview of agent
    ###################################
    - powershell: |
        tree "$(Pipeline.Workspace)" /F
      displayName: '3 Treeview of Pipeline.Workspace'


Now that we have successfully copied the models to the agent and downloaded Tabular Editor, we can build the JSON files into a BIM file which we can later on release as a dataset to a Power BI workspace.

This PowerShell step loops through all database.json files (the main/project file) that are on the agent. It retrieves the subfolder/parent folder name so that it can be used as a dataset name and then uses TabularEditor.exe to build the project. All BIM files are stored in the artifact staging folder.

    ###################################
    # 4 Build data models
    ###################################
    - powershell: |
        # Get repos folder with model files
        $ModelsFolder = Join-Path -Path $(Build.SourcesDirectory) -ChildPath "\Models\$($ReposWorkspaceFolder)\"
      
        # Loop through local repos folder to get all datamodels
        $AllDatamodels = Get-ChildItem -Path $ModelsFolder -Recurse -Filter database.json
        Write-Host "$($AllDatamodels.Count) model$(If ($AllDatamodels.Count -ne 1) {"s"}) found in $($ModelsFolder)`n"
        foreach ($DataModel in $AllDatamodels)
        {
          $Path = "$($DataModel.Fullname)"
          Write-Host "Building model $($Path)"

          # extract datasetname and workspacename from folder path of model
          $DatasetName = $Path | Split-Path -Parent | Split-Path -Leaf
          $WorkspaceName = $Path | Split-Path -Parent | Split-Path -Parent | Split-Path -Leaf

          # Create target path for staging folder
          $TargetFolder = Join-Path -Path $(Build.ArtifactStagingDirectory) -ChildPath "\Models\"
          $TargetFolder = Join-Path -Path $TargetFolder -ChildPath $WorkspaceName
          $TargetFolder = Join-Path -Path $TargetFolder -ChildPath $DatasetName

          Write-Host "Saving build model in $($TargetFolder)"
          # Build model into bim file by executing tabular editor
          $(Build.SourcesDirectory)\TabularEditor\TabularEditor.exe "$($DataModel.Fullname)" -B "$($TargetFolder)\Model.bim"
          Wait-Process -Name "TabularEditor"
        }
      displayName: '4 Build data models'


This is again the same treeview step to check the result of the build. In this case it shows all files and subfolders of the artifact staging folder. You can remove or comment-out this step once everything works. 
    ###################################
    # 5 Show treeview of agent
    ###################################
    - powershell: |
        tree "$(Build.ArtifactStagingDirectory)" /F
      displayName: '5 Treeview of Build.ArtifactStagingDirectory'


The last step of the build stage is creating an artifact of all (BIM) files in the artifact staging folder. This artifact will automatically be downloaded to the agent in the next stage.
    ###################################
    # 6 Publish artifact
    ###################################
    - task: PublishPipelineArtifact@1
      displayName: '6 Publish PBI models as artifact'
      inputs:
        targetPath: $(Build.ArtifactStagingDirectory)
        artifact: 'PBIModelsArtifact'
        publishLocation: 'pipeline'


Now it's time to do the actual deployment. We have 3 extra stages (Deploy to Dev, Acc and Prd), but we used a condition on them. If a change happened in the Acceptance branch then we will only execute the Deploy to Acc stage and skip the rest of the stages. In the pipeline runs overview it will look like this:

Stages of pipeline runs



















In this example the deployment method for Development is a little PowerShell script, and for Acceptance and Production we will use the Power BI Deployment Pipeline. Therefore those stages have different parameters. The details of those YAML files will be explained in the next blog post.
###################################
# Deploy Dev environment
###################################
- stage: DeployDev
  displayName: Deploy DEV
  variables:
    - group: PBIModelParamsDev
  pool:
    name: mySelfHostedAgent
    #vmImage: 'windows-latest'
  condition: eq(variables['Build.SourceBranchName'], 'Development')
  jobs:
    - template: DeployPBIModelDev.yml
      parameters:
        env: DEV
        SpApplicationId: $(SpApplicationId)
        SpClientSecret: $(SpClientSecret)
        SpTenantId: $(SpTenantId)
        DownloadLinkTabularEditor: $(DownloadLinkTabularEditor)
        
###################################
# Deploy Acc environment
###################################
- stage: DeployAcc
  displayName: Deploy ACC
  variables:
    - group: PBIModelParamsAcc
  pool:
    name: mySelfHostedAgent
    #vmImage: 'windows-latest'
  condition: eq(variables['Build.SourceBranchName'], 'Acceptance')
  jobs:
    - template: DeployPBIModelAccPrd.yml
      parameters:
        env: ACC
        DeployStagePowerBI: Test # PBI doesn't use Acceptance
        ServiceConnection: 'PowerBI'
        ReposWorkspaceFolder: $(ReposWorkspaceFolder)
        DeploymentPipelinePBI: $(DeploymentPipelinePBI)


###################################
# Deploy Prd environment
###################################
- stage: DeployPrd
  displayName: Deploy PRD
  variables:
    - group: PBIModelParamsPrd
  pool:
    name: mySelfHostedAgent
    #vmImage: 'windows-latest'
  condition: eq(variables['Build.SourceBranchName'], 'main')
  dependsOn: CreateModelArtifact
  jobs:
    - template: DeployPBIModelAccPrd.yml
      parameters:
        env: PRD
        DeployStagePowerBI: Production
        ServiceConnection: 'PowerBI'
        ReposWorkspaceFolder: $(ReposWorkspaceFolder)
        DeploymentPipelinePBI: $(DeploymentPipelinePBI)
Conclusion
In this post we showed how to build a Tabular Editor project with JSON files into one BIM file which we can deploy to the Development workspace in Power BI. You could do the same for the other environments, but in this example we will use the Power BI Deployment pipeline for Test and Production, which you will see in the next post.