Saturday, 6 March 2021

dataMinds Saturday - REST APIs in Data Factory

On Saturday March 13th I will present a session about using REST APIs in Azure Data Factory instead of scripting. Despite of PowerShell being our friend in Azure for a long time. It's not everybody's cup of thee. Most BI developers and probably also a lot of DWH developers rather do anything else before they start writing PowerShell scripts. 

Luckily Microsoft brought Rest APIs to almost every service in Azure. So instead of writing, and maintaining, PowerShell scripts that are easily over a 100 lines of code, you now have one URL to accomplish the same task. In this session I will show you how to handle those REST APIs in Azure Data Factory Pipelines (and they work in Synapse Studio Pipelines as well). Join me via https://datamindssaturday.be/ Download PowerPoint

PowerPoint will be available after the presentation















Demo material

    Monday, 1 March 2021

    Pausing/resuming Dedicated SQL Pools from Synapse

    Case
    In a previous post we described how we can pause and resume our Azure Dedicated SQL pools via Azure Data Factory. The question we are going to solve now: "How can we pause and resume from within an Azure Synapse workspace"?


    Pause / Resume SQL dedicated Pools

























    Solution
    Pausing and resuming from within an Azure Synapse workspace is very similar, but there are some differences because there are two types of Azure Dedicated SQL Pools. When you create it from within the Azure Synapse Workspace then the provider will be 'Microsoft.Synapse' (hosted in Synapse), but when you create it outside the Azure Synapse Workspace then the provider will be 'Microsoft.Sql' (hosted in SQL Server). Both have different Rest APIs and different security roles. In this example we will demo the Synapse hosted version but we will also mention the differences. For the SQL version you could also take a look at our previous post which handles the SQL version from within Data Factory.


    1) Give Azure Synapse Access to SQL Dedicated Pools
    To call the Rest API we need to grant our Azure Synapse workspace access to the Dedicated SQL Pool or more specific to the Synapse hosting it. Perhaps a bit odd, but we need to grant Synapse access to its own resources (for the SQL version you need grant access to the SQL Server hosting the SQL Pool). 

    For the security role you always need to avoid 'Owner' and use 'Contributor'. This is because the Owner role can also change security related items (for the SQL version you can use 'SQL Server Contributor')
    • In the Azure portal go to the Azure Synapse Workspace hosting the SQL Pool that you want to pause or resume
    • In the left menu click on Access control (IAM)
    • Click on Add, Add role assignment
    • In the 'Role' drop down select 'Contributor'
    • In the 'Assign access to' drop down select user, group or service principal.
    • Search for your Synapse Studio name (in our example‘gansdorp’), select it and click on Save

    Grant Synapse Contributor role to SQL Server

















    If you forget this step then you will receive an authorization error while executing your Synapse pipeline.
    2108 Authorization Failed

     












    {"error":
    {"code":"AuthorizationFailed"
    ,"message":"The client 'xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' with object id 'xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' does not have authorization to perform action 'Microsoft.Synapse/workspace/resume/action' over scope '/subscriptions/xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/RG_bitools/providers/Microsoft.Synapse/workspace/gansdorp/dedsqlpools' or the scope is invalid. If access was recently granted, please refresh your credentials."}
    }
    

    2) Determine URL
    Now it is almost time to edit your Synapse pipeline. The first step will be adding a Web activity to call the Rest API (for the SQL version you must use this Rest API). Find the operation you want to perform and then find the example URL to construct the new URL.

    Pause compute
    https://management.azure.com/subscriptions/{subscription-id}/resourceGroups/{resource-group-name}/providers/Microsoft.Synapse/workspaces/{workspacename}/sqlPools/{DedicatedSQLPoolName}/pause?api-version=2019-06-01-preview

    Resume compute
    https://management.azure.com/subscriptions/{subscription-id}/resourceGroups/{resource-group-name}/providers/Microsoft.Synapse/workspaces/{workspacename}/sqlPools/{DedicatedSQLPoolName}/resume?api-version=2019-06-01-preview

    Within these URL's you need to replace all parts that start and end with a curly bracket: {subscription-id}, {resource-group-name}, {workspacename} and {DedicatedSQLPoolName} (including the brackets themselves). 

    Example URL
    https://management.azure.com/subscriptions/aaaa-bbbb-1234-cccc/resourceGroups/RG-DEV/providers/Microsoft.Synapse/workspaces/gansdorp/dedsqlpool/pause?api-version=2019-06-01-preview
    Example Rest API URL










    3) Add Web Activity
    To call the Rest API we will use the Web Activity in the Synapse pipeline. The actual Rest API call is synchronous, which means it waits for it to finish the pause or resume action and then it will return a message. This also means that you don't have to build any checks to make sure it is already online.
    • Add a Web activity to your pipeline and give it a suitable name
    • Go to the Settings tab and use the URL from the previous step in the URL property
    • Choose POST as method
    • Fill in {} as body (we don't need it, but it is required)
    • Choose MSI as authentication method
    • As the last step enter this URL https://management.azure.com/ as Resource

    web activity






















    To first check the current state of the SQL Pool you can use the Get Rest API and then use an If Condition to check its output with an expression like @Activity('Get Current State').Output.Properties.State

    Summary
    In this post you learned how pause and resume your Dedicated SQL Pool within Azure Synapse Studio, the method is very similar to the ADF version. There are some differences, however that is mostly caused by the Service that is hosting the SQL Pool. This could be either a SQL Server or your own Synapse Workspace. It is expected that one of both will probably disappear, because it isn't likely that Microsoft will be maintaining two different services with both their own set op Rest APIs.