Monday, 20 September 2021

DevOps Snack: Get Treeview of files on agent

Case
I want to see which files and folders are available in my workspace on a DevOps agent. This would make it much easier to determine the paths to, for example, other YAML files or PowerShell scripts. Is there an option to browse the files on the agent?
Treeview after a checkout of the repository

Solution
Unless you are using your own Virtual Machine as a private agent (instead of a Microsoft-hosted agent), where you can log in to the actual VM, the answer is no. However, with a single-line PowerShell script it is still very easy to see them. Don't worry, it's just copy and paste!

The trick is to add a YAML PowerShell task with an inline script that executes the tree command. The first parameter of the tree command is the folder or drive to start from. This is where the predefined DevOps variables come in very handy. For this example we will use Pipeline.Workspace to see its content. The /F parameter will also show the files in each directory.
###################################
# Show treeview of agent
###################################
- powershell: |
    tree "$(Pipeline.Workspace)" /F
  displayName: '2 Treeview of Pipeline.Workspace'

On a Windows agent it looks a bit crappy, but on an Ubuntu agent it is much better (see the first screenshot above).
Treeview of Pipeline.Workspace on Windows agent

A useful place for this tree command is, for example, right after the checkout of the repository. Now you know where, for example, your other YAML files are located so you can call them in a next task (see the sketch below the screenshot).
YAML pipeline

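As a minimal sketch (the step numbering and display names are just illustrative), such a pipeline fragment could look like this:
###################################
# Checkout repository and show treeview
###################################
- checkout: self
  displayName: '1 Retrieve Repository'

- powershell: |
    tree "$(Pipeline.Workspace)" /F
  displayName: '2 Treeview of Pipeline.Workspace'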
Or use it after the build of an artifact to see the result. And now that you know where the files are located, you could even show the content of a (text) file with an additional line of code.
###################################
# Show treeview of agent
###################################
- powershell: |
    tree "$(Pipeline.Workspace)" /F
    Write-Host "--------------------ARMTemplateForFactory--------------------"
    Get-Content -Path "$(Pipeline.Workspace)/s/CICD/packages/ArmTemplateOutput/ARMTemplateForFactory.json"
    Write-Host "-------------------------------------------------------------"
  displayName: '7 Treeview of Pipeline.Workspace and ArmTemplateOutput content'
Note: for large files it is wise to limit the number of rows read with an additional parameter for Get-Content: -TotalCount 25.
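For example, the Get-Content line in the inline script above would then become:
Get-Content -Path "$(Pipeline.Workspace)/s/CICD/packages/ArmTemplateOutput/ARMTemplateForFactory.json" -TotalCount 25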

Conclusion
In this post you learned how a little PowerShell can help you debug your YAML pipelines. Don't forget to comment out or remove the extra code when you take your pipeline to production. Please share your debug tips in the comments below.

Thx to colleague Walter ter Maten for helping!

Sunday, 19 September 2021

ADF Build - missing publish_config.json

Case
I'm using the new and improved ARM export via Npm to generate an ARM template for my Data Factory so I can deploy it to the next environment, but the Validate step and the Validate and Generate ARM template step both throw an error saying that the publish_config.json file can't be found. This file isn't mentioned in the steps from the documentation. How do I add this file and what content should be in it?
Unable to read file: publish_config.json

ERROR === LocalFileClientService: Unable to read file: D:\a\1\publish_config.json, error: {"stack":"Error: ENOENT: no such file or directory, open 'D:\\a\\1\\publish_config.json'","message":"ENOENT: no such file or directory, open 'D:\\a\\1\\publish_config.json'","errno":-4058,"code":"ENOENT","syscall":"open","path":"D:\\a\\1\\publish_config.json"}
ERROR === PublishConfigService: _getLatestPublishConfig - Unable to process publish config file, error: {"stack":"Error: ENOENT: no such file or directory, open 'D:\\a\\1\\publish_config.json'","message":"ENOENT: no such file or directory, open 'D:\\a\\1\\publish_config.json'","errno":-4058,"code":"ENOENT","syscall":"open","path":"D:\\a\\1\\publish_config.json"}
Solution
While it indeed looks like a real error, it doesn't stop the DevOps pipeline. The old method of publishing ADF changes to another ADF created this file automatically in the adf_publish branch when you hit the Publish button in the Data Factory GUI. So it probably isn't used any more, but we still want to get rid of annoying errors!

You can solve this by manually adding the missing file:

1) Add new file to repos
To solve it, go to the Azure DevOps repository and locate the folder where ADF stores the pipeline, dataset and factory files (in subfolders). Click on the + New button and create a file called publish_config.json.
Add new file to repository (in root of ADF)

2) Add JSON content
The content of the new file should be the name of the publishing branch you chose when configuring Git for ADF, in the following JSON format (see also the scripted alternative below the screenshot):
{"publishBranch": "factory/adf_publish"}

Add the publishing branch in the following format

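As an alternative to creating the file in the DevOps web editor (steps 1 and 2), you could also script it. A minimal PowerShell sketch, assuming you have the repository cloned locally and that the ADF files live in a folder called 'datafactory' (a placeholder for your own ADF root folder):
# Create publish_config.json in the root of the (hypothetical) ADF folder
Set-Content -Path ".\datafactory\publish_config.json" -Value '{"publishBranch": "factory/adf_publish"}'

# Commit and push the new file to the repository
git add .\datafactory\publish_config.json
git commit -m "Add publish_config.json for the ADF ARM export"
git push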
3) The result
Now the new file is available for the Npm task in the pipeline. Run the DevOps pipeline again and you will notice that the error message no longer appears in the logs.
publish_config.json is now available for the pipeline

Conclusion
In this post you learned how to avoid the error message about the missing publish_config.json file. It is not very satisfying that it remains unknown why this file was missing and whether it is still used by the process. Please add a comment below if you find more details.

In a next post we will describe the entire Data Factory ARM deployment where you don't need to hit that annoying Publish button within the Data Factory GUI. Everything (CI and CD) will be a YAML pipeline.

thx to colleague Roelof Jonkers for helping

Saturday, 18 September 2021

ADF Build - missing arm-template-parameters-definition.json

Case
I'm using the new and improved ARM export via Npm to generate an ARM template for my Data Factory so I can deploy it to the next environment, but the Validate step and the Validate and Generate ARM template step both throw an error saying that the arm-template-parameters-definition.json file can't be found. This file isn't mentioned in the steps from the documentation. How do I add this file and what content should be in it?
Unable to read file: arm-template-parameters-definition.json

ERROR === LocalFileClientService: Unable to read file: D:\a\1\arm-template-parameters-definition.json, error: {"stack":"Error: ENOENT: no such file or directory, open 'D:\\a\\1\\arm-template-parameters-definition.json'","message":"ENOENT: no such file or directory, open 'D:\\a\\1\\arm-template-parameters-definition.json'","errno":-4058,"code":"ENOENT","syscall":"open","path":"D:\\a\\1\\arm-template-parameters-definition.json"}
  
WARNING === ArmTemplateUtils: _getUserParameterDefinitionJson - Unable to load custom param file from repo, will use default file. Error: {"stack":"Error: ENOENT: no such file or directory, open 'D:\\a\\1\\arm-template-parameters-definition.json'","message":"ENOENT: no such file or directory, open 'D:\\a\\1\\arm-template-parameters-definition.json'","errno":-4058,"code":"ENOENT","syscall":"open","path":"D:\\a\\1\\arm-template-parameters-definition.json"}

Solution
This step is indeed not mentioned within that new documentation, but it can be found if you know where to look. The messages state that it is indeed an error, but the process will continue using a default file. Very annoying, but not blocking for your pipeline. To solve it we need to follow these steps:

1) Edit Parameter configuration
Go to your development ADF and open Azure Data Factory Studio. In the left menu click on the Manage icon (a toolbox) and then click on ARM template under Source Control. Now you will see the option 'Edit parameter configuration'. Click on it.
Edit parameter configuration

2) Save Parameter configuration
Now a new JSON file will be opened (which you can adjust to your needs, but more on that in a later post) and in the Name box above you will see 'arm-template-parameters-definition.json'. Click on the OK button and go to the Azure DevOps repository.
arm-template-parameters-definition.json

3) The result
In the Azure DevOps repository you will now find a new file in the root of the ADF folder, where subfolders like pipeline and dataset are also located. Run the DevOps pipeline again and you will notice that the error and warning are gone.
The new file has been added to the repository by ADF

Note that you only have to do this for the development Data Factory (not for test, acceptance or production) and that the ARM template parameter configuration is only available for Git-enabled data factories.

Conclusion
In this post you learned how to solve the arm-template-parameters-definition.json not found error/warning. The next step is to learn more about this feature and its use case. Most often it will be used to add extra parameters for options that aren't parameterized by default. This will be explained in a next post.

In another upcoming post we will describe the entire Data Factory ARM deployment where you don't need to hit that annoying Publish button within the Data Factory GUI. Everything (CI and CD) will be a YAML pipeline.

thx to colleague Roelof Jonkers for helping

Friday, 17 September 2021

ADF Release - ResourceGroupNotFound

Case
I'm using the Pre and Post deployment PowerShell script from Microsoft within my ADF DevOps deployment, but it gives an error that it cannot find my resource group, although I'm sure I gave the Service Principal enough access to this resource group (and I checked for typos). What is wrong and how can I solve this?
ResourceGroupNotFound

##[error]HTTP Status Code: NotFound
Error Code: ResourceGroupNotFound
Error Message: Resource group 'RG_ADF_PRD' could not be found.

Solution
If the Service Principal indeed has enough permissions on the Azure resource group, then your Service Principal probably has access to more than one Azure subscription. The PowerShell cmdlet Get-AzDataFactoryV2Pipeline within this script can only get Data Factories from the active subscription. If your resource group is not in that active subscription, it will not be found. Only one subscription can be active at a time.
Get-AzDataFactoryV2Pipeline -ResourceGroupName "RG_ADF_PRD" -DataFactoryName "DataFactory2-PRD"
WARNING: TenantId '4e8e12ea-d5b6-40f1-9988-4b09705c2595' contains more than one active subscription. 
First one will be selected for further use. To select another subscription, use Set-AzContext.
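To see which subscription the Az cmdlets are actually using on the agent, you could temporarily add a diagnostic line to the script (a minimal sketch):
# Show the subscription that is currently active on the agent (remove after debugging)
Get-AzContext | Select-Object -ExpandProperty Subscription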

The best solution is to create a Service Principal for each subscription to which you want to deploy ADF and then give each Service Principal access to only one Azure subscription.

The alternative is to add one extra parameter to the PowerShell script for the subscription ID (or name) and then use the PowerShell cmdlet Set-AzContext to activate the correct Azure subscription (see the sketch below).
Set-AzContext -Subscription $Subscription
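A minimal sketch of how that extra parameter could be added to the top of Microsoft's pre- and post-deployment script (the parameter name $Subscription is just an illustration; the existing parameters of the script are not repeated here):
param
(
    # ... existing parameters of the pre- and post-deployment script ...

    # Extra parameter: the ID or name of the Azure subscription that contains the Data Factory
    [parameter(Mandatory = $true)] [String] $Subscription
)

# Activate the correct Azure subscription before any Get-AzDataFactoryV2* calls
Set-AzContext -Subscription $Subscription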

In your YAML script you need to add the extra parameter and then provide the subscription ID (or name) either hardcoded or, much better, as a variable from a DevOps Variable Group (Library). If you are using the classic Release pipelines, hit the three dots behind the Script Arguments textbox to add the extra parameter. A sketch of the YAML variant is shown below the screenshot.
add parameter to YAML

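A minimal sketch of such a YAML task (the service connection name, script path and $(SubscriptionId) variable are hypothetical; the other script arguments follow the example from the Microsoft documentation):
###################################
# Run ADF pre-deployment script
###################################
- task: AzurePowerShell@5
  displayName: '3 ADF pre-deployment script'
  inputs:
    azureSubscription: 'MyServiceConnection'
    azurePowerShellVersion: 'LatestVersion'
    scriptType: 'FilePath'
    scriptPath: '$(Pipeline.Workspace)/s/CICD/PrePostDeploymentScript.ps1'
    scriptArguments: >-
      -armTemplate "$(Pipeline.Workspace)/s/CICD/ArmTemplate/ARMTemplateForFactory.json"
      -ResourceGroupName "RG_ADF_PRD"
      -DataFactoryName "DataFactory2-PRD"
      -predeployment $true
      -deleteDeployment $false
      -Subscription "$(SubscriptionId)"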
Conclusion
In this post you learned how to fix the ResourceGroupNotFound error during the pre- and post-deployment script execution. The best/safest solution is to minimize access for each Service Principal; the workaround is to add two lines of code to Microsoft's example script.

In a next post we will describe the entire Data Factory ARM deployment where you don't need to hit that annoying Publish button within the Data Factory GUI. Everything (CI and CD) will be a YAML pipeline.

thx to colleague Roelof Jonkers for helping

Friday, 3 September 2021

Virtual Network Data Gateway for Power BI

Case
My company doesn't allow public endpoints on my Azure resources like the Azure SQL Database and the Azure Data Lake. Now Power BI cannot use these sources directly and therefore we have to install an On-premises Data Gateway on an Azure Virtual Machine within the same VNET as my sources (or in a peered VNET). Is there an alternative to this VM solution?
Data Gateways and one with a different icon

Solution
Yes, there is a promising new alternative to the silly VM solution, but there are some caveats, which will be shared in the conclusion. A few months ago Microsoft announced the VNET Data Gateway for cloud sources. Now the service is available for testing. The two main benefits are:
  • No need for a Virtual Machine that you need to maintain (but probably forget to)
  • No need for the On-premises Data Gateway that you need to update nearly every month
You will still need a Virtual Network (VNET) to connect this service to your other Azure services and to allow the connection from Power BI or other services of the Power Platform. Within this VNET we need a subnet that is delegated to the Microsoft Power Platform.


1) Resource Provider Microsoft.PowerPlatform
The first step for adding the VNET gateway is to check within your Azure subscription whether the resource provider Microsoft.PowerPlatform is already registered (probably not). By registering this provider you will be able to connect the subnet of step 2 to the gateway of step 3. You can do this via the portal (steps below) or script it (see the sketch below the screenshot).
  • Go to the Azure portal and log in as an owner of the subscription.
  • Go to the Subscription overview page (the same subscription where your VNET is located).
  • In the left menu you will find the option Resource providers.
  • Search for Microsoft.PowerPlatform and check the Status column. If it says NotRegistered, select it and hit the Register button at the top.
Register Resource Provider

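If you prefer scripting over clicking, a minimal Azure PowerShell sketch of this step (run it against the subscription that contains your VNET):
# Check the registration status of the Microsoft.PowerPlatform resource provider
Get-AzResourceProvider -ProviderNamespace Microsoft.PowerPlatform | Select-Object ProviderNamespace, RegistrationState

# Register it when the state is NotRegistered
Register-AzResourceProvider -ProviderNamespace Microsoft.PowerPlatform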
2) Add Subnet to VNet
Now that we have the new resource provider, we can add a subnet to an existing VNET. The VNET in this example is a simple default 'installation' with an address space of 10.240.134.0/23 (512 addresses: 10.240.134.0 to 10.240.135.255).
  • Go to your existing VNET
  • In the left menu click on Subnets to see the available subnets
  • Click on + Subnet (not + Gateway Subnet) to add a new subnet
  • Give it a suitable name (gatewaysubnet is reserved/not allowed)
  • Choose a small subnet address range; /28 will give you 16 IP addresses of which 11 can be used to add gateways. IPv6 is not allowed at the moment.
  • The most important property for the VNET gateway is Subnet delegation. Make sure to set it to Microsoft.PowerPlatform/vnetaccesslinks (a scripted alternative is sketched below the screenshot).
Add Subnet to VNET

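And a minimal Azure PowerShell sketch of this step (the VNET name, resource group, subnet name and address range are hypothetical; adjust them to your own environment):
# Get the existing VNET (hypothetical names)
$vnet = Get-AzVirtualNetwork -Name "MyVnet" -ResourceGroupName "MyResourceGroup"

# Create the delegation to the Power Platform and add a small /28 subnet with it
$delegation = New-AzDelegation -Name "PowerPlatformDelegation" -ServiceName "Microsoft.PowerPlatform/vnetaccesslinks"
Add-AzVirtualNetworkSubnetConfig -Name "powerbi-gateway-subnet" -VirtualNetwork $vnet -AddressPrefix "10.240.135.0/28" -Delegation $delegation

# Save the change to the VNET
$vnet | Set-AzVirtualNetwork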
3) Add VNET Gateway
For the final step, creating the gateway, we need to go to the Power Platform admin center. So this means it is not a service located in Azure. For this step you need to be at least an Azure Network Contributor (but Subscription Owner will work as well).
  • Log in to the Power Platform admin center
  • In the left menu click on Data (preview)
  • Then go to the tab Virtual Network data gateways to see existing gateways or to add a new one
  • Click on + New to add a new VNET Gateway
  • Select the Azure Subscription of your VNET / subnet. Subscriptions where step 1 was not performed will give an error when you select them: Please register "Microsoft.PowerPlatform" as a resource provider for this subscription to continue creating a virtual network data gateway.
  • Select the Resource Group of your VNET
  • Select the VNET
  • Select the Subnet of step 2
  • Give the new gateway a suitable name
Add Virtual Network data gateway

You can also add other users to this new gateway, but at this moment you can only add administrators and you won't see the Can use or the Can use + Share option. This is probably because the service is still in public preview at the moment of writing.
Can use and Can use + Share are missing

Conclusion
In this post you learned about the new Virtual Network Data Gateway for Power BI. The lack of maintenance is the big benefit of this new service. But there are also a few caveats besides it being a preview service. Some of them will probably be solved when the service reaches General Availability.
  1. This new service only works for Power BI Premium workspaces. This is a really big deal since I don't want to make all my workspaces Premium.
  2. The price is still unknown. Maybe it will be a Premium feature, but then they have to solve the first issue.
  3. The performance to Azure Data Lake is very slow compared to an On-premises Data Gateway on an Azure VM (up to 6 times slower!). This is a bug they are working on. However, this brings up another issue: how can you tweak the performance of this new gateway? You cannot create a cluster with multiple gateways and you cannot change the cores/memory.
  4. The number of supported sources is still a bit low. Only Microsoft cloud services are supported (since you cannot install third-party drivers).
  5. On-prem sources are also not supported, which would be very useful to reduce the number of servers (and maintenance) in my on-prem network, since a VNET can also be connected via VPN to the on-prem network. However, Microsoft products like SQL Server should work since they use the same drivers as Azure SQL Database (not tested).
  6. Can use and Can use + Share are still missing, which means only gateway admins can use this service.
So altogether a very promising new service from Microsoft. Still a couple of issues, but hopefully they will all be fixed when it becomes Generally Available. The real deal breaker is issue 1, and solving issue 5 would really boost the success of this new service. For more detailed installation steps read the Docs.