Friday, 2 October 2020

Pause all/an Azure Synapse SQL Pool(s) (Az)

Case
I want to schedule a pause of my Azure Synapse SQL Pools to save some money on my Azure bills. Back in 2017 when Synapse was still called SQL Data Warehouse we wrote a post with the AzureRM modules which are now outdated. How does it work with AZ?
Pause and resume Synapse SQL Pools
























Solution
For this example we will have two scripts. The first to stop all Azure Synapse SQL Pools within an Azure subscription. This is especially handy for Development, Test or Acceptance environments where you only turn on the Synapse SQL Pools when you need to develop or test something. When you forget to pause them afterwards this script will pause them all on a scheduled moment. The second script is for pausing (or resuming) a specific Synapse SQL Pool. Probably more suitable for a production environment where you don't want to ruthlessly pause all Synapse SQL Pools.

Note: All basics to create your first Azure Automation Runbook can be found here. Combine that with the code below.

1) Modules
If you want to use this script in an Azure Automation Runbook you first need to add the module Az.Sql and that module first wants the module Az.Accounts to be installed. If you already added AzureRM modules to your Azure Automation account then that is no problem. As long as you only don't mix them up in your runbooks. On your own PC you cannot have both installed. You first need to remove all AzureRm modules before you can add Az modules. 
  • Go to your Azure Automation account in the Azure Portal
  • Click on Modules in the left menu
  • Click on Browse Gallery (not on Add a module)
  • Search for Az.Sql and click on it
  • Next click on Import and then on the Ok button
Note 1: If you haven't installed Az.Account then it will ask you to do that first.
Note 2: It takes a few minutes to import a module
Adding a module

















2) Login
If you want to run the script in PowerShell ISE then you first need to login and select your subscription if you have multiple subscriptions.
# PowerShell code
# Login to Azure (browser popup will appear)
Connect-AzAccount -Confirm

# Optional: select your subscription
Set-AzContext -SubscriptionName "mysubscription"
Login to Azure with PowerShell ISE





















In your Runbook you need to add the following code instead which uses your Run as Account to login. Please read this blogpost for all details.
# PowerShell code
########################################################
# Log in to Azure with AZ (standard code)
########################################################
Write-Verbose -Message 'Connecting to Azure'

# Name of the Azure Run As connection
$ConnectionName = 'AzureRunAsConnection'
try
{
    # Get the connection properties
    $ServicePrincipalConnection = Get-AutomationConnection -Name $ConnectionName        
 
    'Log in to Azure...'
    $null = Connect-AzAccount `
        -ServicePrincipal `
        -TenantId $ServicePrincipalConnection.TenantId `
        -ApplicationId $ServicePrincipalConnection.ApplicationId `
        -CertificateThumbprint $ServicePrincipalConnection.CertificateThumbprint 
}
catch 
{
    if (!$ServicePrincipalConnection)
    {
        # You forgot to turn on 'Create Azure Run As account' 
        $ErrorMessage = "Connection $ConnectionName not found."
        throw $ErrorMessage
    }
    else
    {
        # Something else went wrong
        Write-Error -Message $_.Exception.Message
        throw $_.Exception
    }
}
########################################################

3) The pause all script
Below the above script we need to add the following script which first gets all Azure SQL Servers and then checks whether they also host a Synapse SQL Pool. When one is found it checks the status. If it is still Online (active) then it will pause it. This will take several minutes. Afterwards it will recheck the status and show how long it took to pause. Feel free to add more checks and let us know in the comments what you added.
# PowerShell code
# Get all SQL Servers to check whether they host a Synapse SQL Pool
$allSqlServers = Get-AzSqlServer

# Loop through all SQL Servers
foreach ($sqlServer in $allSqlServers)
{
    # Log which SQL Servers are checked and in which resource group
    Write-Output "Checking SQL Server [$($sqlServer.ServerName)] in Resource Group [$($sqlServer.ResourceGroupName)] for Synapse SQL Pools"
    
    # Get all databases from a SQL Server, but filter on Edition = "DataWarehouse"
    $allSynapseSqlPools = Get-AzSqlDatabase -ResourceGroupName $sqlServer.ResourceGroupName `
                                           -ServerName $sqlServer.ServerName `
                                           | Where-Object {$_.Edition -eq "DataWarehouse"}
    # Loop through each found Synapse SQL Pool
    foreach ($synapseSqlPool in $allSynapseSqlPools)
    {
        # Show status of found Synapse SQL Pool
        # Available statuses: Online Paused Pausing Resuming
        Write-Output "Synapse SQL Pool [$($synapseSqlPool.DatabaseName)] found with status [$($synapseSqlPool.Status)]"
        
        # If status is online then pause Synapse SQL Pool
        if ($synapseSqlPool.Status -eq "Online")
        {
            # Pause Synapse SQL Pool
            $startTimePause = Get-Date
            Write-Output "Pausing Synapse SQL Pool [$($synapseSqlPool.DatabaseName)]"
            $resultsynapseSqlPool = $synapseSqlPool | Suspend-AzSqlDatabase

            # Show that the Synapse SQL Pool has been pause and how long it took
            $endTimePause = Get-Date
            $durationPause = NEW-TIMESPAN –Start $startTimePause –End $endTimePause
            $synapseSqlPool = Get-AzSqlDatabase -ResourceGroupName $sqlServer.ResourceGroupName `
                                                -ServerName $sqlServer.ServerName `
                                                -DatabaseName $synapseSqlPool.DatabaseName
            Write-Output "Synapse SQL Pool [$($synapseSqlPool.DatabaseName)] paused in $($durationPause.Hours) hours, $($durationPause.Minutes) minutes and  $($durationPause.Seconds) seconds. Current status [$($synapseSqlPool.Status)]"
        }
    }
}
In PowerShell ISE the result will look this. To see the result of your runbook you need to check the jobs and the the output.
Running the script in PowerShell ISE







4) The pause one script - parameters
To only select one Synapse SQL Pool we need to provide three parameters: Resource Group name, SQL Server name and the name of the SQL Pool. For this we will add parameter code at the beginning of the complete script (above login). You can add parameter validations to make it more monkey proof.
# PowerShell code
<#
    .SYNOPSIS
        Pause an Azure Synapse SQL Pool
    .DESCRIPTION
        By providing the following parameters you can pause one
        specific Azure Synapse SQL Pool. It will only pause when
        the status is 'online'

    .PARAMETER resourceGroupName
        This is the Resource group where Azure Synapse Analytics
        SQL Pool is located

    .PARAMETER sqlServerName
        This is the name of the Azure SQL Server hosting the Azure
        Synapse Analytics SQL Pool
    
    .PARAMETER SynapseSqlPoolName
        This is the name of the Azure Synapse Analytics SQL Pool

#>
Param(
    # This is the Resource group where Azure Synapse Analytics SQL Pool is located   
    [Parameter(Mandatory=$True)]  
    [String] $resourceGroupName
    ,
    # This is the name of the Azure SQL Server hosting the Azure Synapse Analytics SQL Pool
    [Parameter(Mandatory=$True)]  
    [String] $sqlServerName
    ,
    # This is the name of the Azure Synapse Analytics SQL Pool
    [Parameter(Mandatory=$True)]  
    [String] $SynapseSqlPoolName
) 

5) The pause one script
Now the actual pause one script to pause only one SQL Pool. This replaces the script of step 3. If you also want a resume script you just have to replace Pause-AzSqlDatabase by Resume-AzSqlDatabase. and of course chancing some of the texts and if statement (Online => Paused). You could also merge the pause and resume script into one script by adding an extra parameter to indicate what you want to do.
# PowerShell code
# Get one specific Synapse SQL Pool
$synapseSqlPool = Get-AzSqlDatabase -ResourceGroupName $resourceGroupName `
                                    -ServerName $sqlServerName `
                                    -DatabaseName $SynapseSqlPoolName `
                                    | Where-Object {$_.Edition -eq "DataWarehouse"}

# Check if the Synapse SQL Pool can be found with the provided parameters
if ($synapseSqlPool)
{
    # Show status of found Synapse SQL Pool
    # Available statuses: Online Paused Pausing Resuming
    Write-Output "Synapse SQL Pool [$($synapseSqlPool.DatabaseName)] found with status [$($synapseSqlPool.Status)]"

    # If status is online then pause Synapse SQL Pool
    if ($synapseSqlPool.Status -eq "Online")
    {
        # Pause Synapse SQL Pool
        $startTimePause = Get-Date
        Write-Output "Pausing Synapse SQL Pool [$($synapseSqlPool.DatabaseName)]"
        $resultsynapseSqlPool = $synapseSqlPool | Suspend-AzSqlDatabase

        # Show that the Synapse SQL Pool has been pause and how long it took
        $endTimePause = Get-Date
        $durationPause = NEW-TIMESPAN –Start $startTimePause –End $endTimePause
        $synapseSqlPool = Get-AzSqlDatabase -ResourceGroupName $resourceGroupName `
                                            -ServerName $sqlServerName `
                                            -DatabaseName $SynapseSqlPoolName
        Write-Output "Synapse SQL Pool [$($synapseSqlPool.DatabaseName)] paused in $($durationPause.Hours) hours, $($durationPause.Minutes) minutes and  $($durationPause.Seconds) seconds. Current status [$($synapseSqlPool.Status)]"
    }
}
else
{
    Throw "Synapse SQL Pool [$($SynapseSqlPoolName)] not found. Check parameter values."
}

Running the script in PowerShell ISE





Summary
In this post you saw how to pause one or more Synapse SQL Pools to save some money on your Azure bill, but note that the storage costs will continue when you pause Synapse. Next step is to either schedule it within your Azure Automation account or to add a Webhook and execute it from an other service like Azure Data Factory

Besides scripting you can also use the Rest API of Synapse which is particularly easy if you want to pause or resume from within an ADF pipeline, but that will be explored in an other blogpost.

Thursday, 24 September 2020

Using global parameters in Azure Data Factory

Case
In a previous blogpost we described how to use the Azure Keyvault to parameterize some properties in Azure Data Factory (ADF) pipelines. Although that worked technically very well, it’s not always preferable to store non-sensitive data in the keyvault.

With the arrival of global parameters in Data Factory it’s now possible to parameterize a property in ADF pipelines without the use of keyvault. You can use this parameter in all the pipelines in the ADF environment. In this weblog we will explain how to create and use those global parameters. 
Data Factory Global parameters










Solution
We are going to create a global parameter to upscale an Azure SQL database with a Web activity via its RestAPI. For more in depth information about up and down scaling read the blogpost: up/down scale Azure DB

1) Creating a global parameter
The following steps describes how to create a global parameter
  • Go to ‘manage’ in the left menu (Suitcase with wrench icon)
  • Then click on ‘Global parameters’
  • Click on ‘New’ 
  • Now you can fill in the name, type and the value.
    Please note that you can not use  ‘-‘ in the parameter name but you can use the ‘_’ you will receive an errorcode "{"code":"BadRequest","message":"ErrorCode=InvalidTemplate, ErrorMessage=The expression 'pipeline().globalParameters.myparam-dbtest-url' is not valid: the string character 'd' at position '35' is not expected}
  • Click on Save
Global parameters











In this example we used a global parameter to store the RestAPI URL of an Azure SQL Database. This URL will be different for each environment of the DTAP-street.

Adding Global parameter














2) Using a global parameter
The next step is to use the global parameter in a pipeline. In this example we used the Web activity and the Global parameter will make the URL property dynamic.
  • Go to the pipeline
  • Go to the Web activity
  • Go to the settings tab
  • Click on the URL textbox and then on ‘Add Dynamic content’
  • On the right side scroll down till you see the ‘Global parameters’, here you will find all your global parameters, in our example you will see the database url parameter.
  • Click on the name, and that’s it 

Using global parameter











Conclusion
In this weblog we showed you how to create and use a global parameter. This parameter can only be used in the pipeline and it’s not possible to use them in e.g. Linked services or Triggers. In an upcoming post we will explain how to use these global parameters when deploying ADF to another environment with Azure DevOps.



Monday, 31 August 2020

Execute pipelines from an other Data Factory Async

Case
Can I use the Execute Pipeline Activity to execute pipelines from an other Data Factory? Or do we need an other activity for this?
Can you use the Execute Pipeline Activity for this?
























Solution
First of all why would you like to have multiple Data Factories other then for a DTAP-street (Development, Testing, Acceptance and Production). There are several reasons, for example:
  • Different departments/divisions each having their own Data Factory
    For example to prevent changes to your pipelines by users of a different department or to make it easier to split the Azure consumption between two divisions.
  • Different regions for international companies
    Either due legal reasons where for example the data may not leave the European Union
    Or to prevent paying unnecessary outbound data costs when your data is spread over different regions
  • Security reasons
    To prevent others to use the access provided via the Managed Identity of ADF. If you give your ADF access via MSI to an Azure Key Vault or an Azure Storage Account then everybody using that ADF can access that service via ADF.
If you have any other good reasons to use multiple Data Factories please let us know in the comments below.

And although you may have multiple Data Factories you could still use one Data Factory to execute pipelines from a different Data Factory. However you cannot use the Execute Pipeline Activity because it can only execute pipelines within the same Data Factory. 

You can either use the Web Activity or the Web Hook Activity for this. The Web Activity always executes the pipeline asynchronous. This means it does not wait for the result. The Web Hook Activity executes the child pipeline synchronous. Which means it waits until the child pipeline is ready and you could also retrieve the execution result via a call back. In this blog post we will show the asynchronous Web Activity and in an other blog post we will show the synchronous Web Hook Activity.


1) Give parent access to child via MSI
We will not use a user to execute the pipeline in the child(/worker) Data Factory, but instead we will give the managed identity (MSI) of the parent(/master) Data Factory access to the child(/worker) Data Factory. The minimum role needed is Data Factory Contributor, but you could also use a regular Contributor or Owner (but less is more).
  • Go to the child(/worker) Data Factory (DivisionX in this example)
  • In the left menu click on Access control (IAM)
  • Click on the +Add button and choose Add role assignment
  • Select Data Factory Contributor as Role
  • Use Data Factory as Assign access to
  • Optionally change the subscription
  • Optionally enter a (partial) name of your parent ADF (if you have a lot of data factories)
  • Select your parent ADF and click on the Save button
Give one ADF access to other ADF















2) Determine URL
This solution will call the Create Run RestAPI of ADF to execute the pipeline. For this you need to replace the marked parts of the URL below by the Subscription ID, Resource Group name, Data Factory name and the Pipeline name of the child(/worker) pipeline. We will use this URL in the next step.

Example URL:
https://management.azure.com/subscriptions/aaaaaa-bbbb-1234-cccc-123456789/resourceGroups/DivisionX/providers/Microsoft.DataFactory/factories/DevisionX-ADF/pipelines/MyChildPipeline/createRun?api-version=2018-06-01


3) Web Activity
So for this first example we will use the Web Activity. This will execute the pipeline of the child(/worker) pipeline, but you will not see the result in the master ADF. However you can see the executions in the monitor of the child(/worker) ADF.
  • Go to you master ADF and click on Author & Monitor
  • Create a new pipeline and add a Web Activity to the canvas of the new pipeline
  • Give it a suitable descriptive name on the General Tab
  • Go to the Settings tab and enter the URL of the previous step
  • Choose POST as Method
  • Add a new header called Content-Type and with value application/json
  • As Body enter a JSON message. This could either be a dummy message or you could supply parameters in this message. The child parameter is called myParam1: {"myParam1":"bla bla"}
  • Use MSI as Authentication method
  • Enter this URL as Resource: https://management.core.windows.net/
Web Activity calling a child pipeline in an other ADF
























Our dummy child pipeline in a different ADF only contains a Wait activity that waits 30 seconds. To force a child pipeline to fail we used a Stored Procedure activity that executes a RAISERROR statement.
Notice the parameter called myParam1















4) Testing
Now trigger the new master pipeline and check the monitor of the master ADF. For this example we executed one child pipeline that fails and one that succeeds, but is will show that both were successful. Also notice the execute duration of both the master and the worker pipelines.
ADF Monitor of both Master and worker












Summary
In this blog post you learned how to give one ADF access to an other ADF and to execute pipelines in an other ADF. The Web activity solution is very basic and will not show you the result, but it does allow you to pass through values via parameters.

In a next post we will show you the Webhook activity solution which does allows you to call back the master pipeline and show the execution result.

Sunday, 14 June 2020

Pause and Resume Analysis Services with ADF only

Case
I want to start and stop my Azure Analysis Services from within Azure Data Factory, but I don't want write code or use other Azure services like Azure Automation or Azure Logic Apps to do this. Is there an Azure Data Factory-only solution where we only use the standard pipeline activities from ADF?
Save some money on your Azure Bill by pausing AAS




















Solution
Yes you can use the Web Activity to call the Rest API of Azure Analysis Services (AAS), but that requires you to give ADF permissions in AAS via its Managed Service Identity (MSI). If you already used our Process Model example, then this is slightly different (and easier).


1) Add ADF as contributer to AAS
Different than for processing one of the AAS models we don't need SSMS to add ADF as an Server Administrator. Instead we will use Access control (IAM) on the Azure portal to make our ADF a contributor for the AAS that we want to pause or resume.
  • Go to your AAS the Azure portal
  • In the left menu click on Access control (IAM)
  • Click on + Add and choose Add role assignment
  • In the new Add role assignment pane select Contributor as Role
  • In the Assign access to dropdown select Data Factory
  • Select the right Subscription
  • Now Select your Data Factory and click on the Save button
Add ADF as Contributor to AAS















2) Add Web Activity
In your ADF pipeline you need to add a Web Activity to call the Rest API of Analysis Services. First step is to determine the Rest API URL. Replace in the string below, the <xxx> values with the subscription id, resource group and servername of your Analysis Services. The Rest API method we will be using is 'Suspend' but you can replace that word by 'Resume' to startup the AAS:
https://management.azure.com/subscriptions/<xxx>/resourceGroups/<xxx>/providers/Microsoft.AnalysisServices/servers/<xxx>/Suspend?api-version=2017-08-01

Example:
https://management.azure.com/subscriptions/a74a173e-4d8a-48d9-9ab7-a0b85abb98fb/resourceGroups/bitools/providers/Microsoft.AnalysisServices/servers/bitools2/Suspend?api-version=2017-08-01

Second step is to create a JSON message for the Rest API. Well the Rest API doesn't use it, but it is required in the Web activity when you use POST as method. So you just need to create a dummy json message:
{
    "Dummy": "Dummy"
}
  • Add the Web activity to your pipeline
  • Give it a descriptive name like Pause AAS (or Resume AAS)
  • Go to the Settings tab
  • Use the Rest API URL from above in the URL property
  • Choose POST as Method
  • Add the dummy JSON message from above in the Body property
  • Under advanced choose MSI as Authentication method
  • Add 'https://management.azure.com/ in the Resource property (different than process example)
Web Activity calling the AAS Rest API



















Then Debug the Pipeline to check the suspend/resume action














3) Retrieve info
By changing the method type from POST to GET (body property will disappear) and removing the method (suspend or pause) from the URL, you can retrieve information about the AAS. Like status and pricing tier. You could for example use that to first check the status before changing it.
Retrieve service info via GET


Summary
In this post you learned how pause and resume your Analysis Services to save some money on your Azure bill. The big advantage of this method is that you don't need other Azure services which makes maintenance a little easier. In a next post we will also show you how to change the pricing tier via the Rest API.