Tuesday, 26 November 2019

Archiving within a Blob Storage container

Case
New files are delivered to my Blob Storage container each day. After processing them I want to archive them into folders with a timestamp in the name and after x days delete them. Which Azure resource can I use for this task?
Container with archive folders















Solution
One of the services you could use is an Azure Automation runbook with some PowerShell code. Note that blob storage containers only have virtual folders which means that the foldername is stored in the filename. Microsoft Azure Storage Explorer will show it as if it are real folders.

1) Create Automation Account
First we need to create an Azure Automation account to host the runbook with PowerShell code.
  • Go to the Azure portal and create a new resource
  • Search for automation
  • Select Automation Account
  • Choose a useful name for the Automation Account
  • Select your Subscription, Resource Group and the Region
  • Decide whether you need an Azure Run As account and click on the Create button.
    Note: We don't need it for this specific Runbook, because we will use the access key to access the Blob Storage Container. If you intend to add more runbooks to manage other Azure Resources you should probably enable it.
Create Azure Automation Account
















2) Add Module Az.Storage
Before we start writing code we need to add a PowerShell module called Az.Storage. This module contains methods we need in our code.

If you forget this step you will get error messages while running your code that state that some of your commands are nog recognized:
New-AzStorageContext : The term 'New-AzStorageContext' is not recognized as the name of a cmdlet, function, script 
file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct 
and try again.
  • Go to the newly created Azure Automation Account
  • Go to Modules in the left menu
  • Click on the Browse Gallery button
  • Search for Az.Storage
  • Click on Az.Storage in the result and import this module
Adding a new module to your Automation Account
















3) Create Runbook
Now we need to add a PowerShell Runbook to this new Automation Account.
  • Go to the overview page of your newly created Azure Automation Account
  • Click on Runbooks in the left menu
  • Click on the + Create a runbook button to create a new Runbook
  • Enter a descriptive name for the Runbook
  • Select PowerShell as Runbook type
  • Optionally add a description and click on the Create button
Create Runbook
















4) Edit Runbook code
Now first look up the Storage Account Name and key (1 or 2) from your storage account that you want to cleanup with this Archive script. In the first example we will hardcode these two parameters in the PowerShell code it self. The second example will show you an alternative.
Lookup Account Name and Key















Next open the new Runbook if it wasn't already opened by the previous step. Copy the code below and paste it in the editor. Then study the code and its comments to understand the code and to make sure we don't steal your data. If you never want to delete files then just remove the cleanup part starting on row 74.

Since the storage account container doesn't have real (sub)folders, we will use virtual folders which means the foldername is actually stored in the filename. Therefore we cannot move files to a subfolder, but we can rename them instead. Unfortunately renaming a blob is not a standard method available in the Azure Storage module. Therefore we will use a custom function from Martin Brandl that makes a copy with the new name and deletes the original.

# PowerShell code

##########################################################################
############################### PARAMETERS ###############################
##########################################################################
$StorageAccountName = "bitools"
$StorageAccountKey = "XTZSqCcF7q43SwX3FAKEgG5ezCC3l5jor5gUajoTWnnk4qFAKEk52UuR3lYqw1eaFAKEbQe3M4CpSbGDVnSCKg=="
$ContainerName = "sensordata"
$DaysHistory = 31


##########################################################################
########################## Rename-AzStorageBlob ##########################
##########################################################################
# Custom function because Rename method does not exist in Storage modules
# Function copied from about-azure.com and adjusted from AzureRM to Az
# https://about-azure.com/2018/02/13/rename-azure-storage-blob-using-powershell/

function Rename-AzStorageBlob
{
    [CmdletBinding()]
    Param
    (
        [Parameter(Mandatory=$true, ValueFromPipeline=$true, Position=0)]
        [Object]$Blob,
        [Parameter(Mandatory=$true, Position=1)]
        [string]$NewName
    )


  Process {
    $blobCopyAction = Start-AzStorageBlobCopy `
        -ICloudBlob $Blob.ICloudBlob `
        -DestBlob $NewName `
        -Context $Blob.Context `
        -DestContainer $Blob.ICloudBlob.Container.Name
 
    $status = $blobCopyAction | Get-AzStorageBlobCopyState
 
    while ($status.Status -ne 'Success')
    {
        $status = $blobCopyAction | Get-AzStorageBlobCopyState
        Start-Sleep -Milliseconds 50
    }
 
    $Blob | Remove-AzStorageBlob -Force
  }
}


##########################################################################
############################### ARCHIVING ################################
##########################################################################
Write-Output "Start archiving $($ContainerName) in $($StorageAccountName)"

# Get 'context' of the source container
$StorageContext = New-AzStorageContext -StorageAccountName $StorageAccountName -StorageAccountKey $StorageAccountKey

# Determine prefix of datetime
$Prefix = (get-date).ToString("yyyyMMdd-HHmmssfff")

# Loop through all blobs that are not in a virtual folder
$blobs = Get-AzStorageBlob -Container $ContainerName -Context $StorageContext | where {$_.Name -notlike "*/*" }
foreach ($blob in $blobs)
{
    $NewName = $Prefix + "\" + $blob.name
    Write-Output "Archive $($blob.name) to $($NewName)"
   
    #Rename blob file.csv => 20190226\file.csv
    Get-AzStorageBlob -Container $ContainerName -Context $StorageContext -Blob $blob.name  | Rename-AzStorageBlob -NewName $NewName
}


##########################################################################
################################ CLEANUP #################################
##########################################################################
Write-Output "Start cleanup $($ContainerName) in $($StorageAccountName)"

# Determine date in history based on given days
$HistoryLimit = (get-date).AddDays(-$DaysHistory).ToString("yyyyMMdd-HHmmssfff")

# Get all blobs where the names are at least 19 chararters long
# and from that set get the files that matches 99999999-999999999/
# example: 20191123-171423559/filename.csv
$blobs = Get-AzStorageBlob -Container $ContainerName -Context $StorageContext | where {$_.Name.Length -gt 19} | where {($_.Name).Substring(0,19) -match "^\d{8}[-]\d{9}/$"}

# Loop through the set with blobs that matches our requirements
# and delete those from the container
foreach ($blob in $blobs)
{
    if ( ($blob.name).Substring(0,8) -lt $HistoryLimit)
    {
        Write-Output "Remove old blob $($blob.name)"
        Remove-AzStorageBlob -Context $StorageContext -Blob $blob.name -Container $ContainerName
    }
}

Not very flexible and secure to store your Storage Accountname and Accesskey in the code it self. So in this second example we will replace it by parameters that can be passed from Azure Data Factory. You can find more details in this blog post. Only the parameter part is different and the rest of the code is unchanged.
# PowerShell code

##########################################################################
############################### PARAMETERS ###############################
##########################################################################
Param
(
    # ContainerName is required
    [Parameter(Mandatory=$False,Position=1)]
    [object] $WebhookData
)

# Get all parameters from body (passed from Data Factory Web Activity)
$Parameters = (ConvertFrom-Json -InputObject $WebhookData.RequestBody)

# Get single parameter from set of parameters
$ContainerName = $Parameters.ContainerName

$StorageAccountName = $Parameters.StorageAccountName
$StorageAccountKey = $Parameters.StorageAccountKey
$ContainerName = $Parameters.ContainerName
$DaysHistory = $Parameters.DaysHistory
The parameters that will be provided in Azure Data Factory (ADF) via a JSON message will look like this:
{
"StorageAccountName":"BiTools",
"StorageAccountKey ":"XTZSqCcF7q43SwX3FAKEgG5ezCC3l5jor5gUajoTWnnk4qFAKEk52UuR3lYqw1eaFAKEbQe3M4CpSbGDVnSCKg==",
"ContainerName":"SensorData",
"DaysHistory":"31"
}

Another alternative, for when you don't want to use ADF, is using the Azure Key Vault to store the key and then use some PowerShell code to retrieve it in your runbook. The code is very simple, but see this post for more details.
# PowerShell code snippet
# Retrieve value from Key Vault
$StorageAccountKey = (Get-AzKeyVaultSecret -VaultName "MyKeyVault" -Name "StorageAccountKey").SecretValueText

5) Testing
Testing the functionality of your code is the easiest if you still have the hardcoded parameters. Then you can just use the Test pane in the Runbook editor like the animation below. If you want to test it with the parameters for ADF then you first need to create a webhook and then create and run an ADF pipeline to test your code.
Testing the code
















Summary
In this post we explained how to archive files in a container from a Storage Account using PowerShell. We also showed how to pass the parameters like the account key from ADF instead of hard coding it. However we also don't want to store secrets in ADF. In a next post we will show you how to store the Storage account key in an Azure Key Vault and show how to read that secret within ADF to use it as a parameter for this Runbook.
Passing secrets parameters using KeyVault, DataFactory and a Runbook









Monday, 18 November 2019

Runbook snack: read parameters from Data Factory

Case
I want to pass parameters from an Azure Data Factory pipeline web(hook) activity to my Azure Automation Runbook. How do I do that?
Supplying parameters via the Body property in Web(hook) activity























Solution
You cannot pass through a single parameters and then read that particularly parameter with PowerShell. Instead you have to read all the data coming from the webhook as one big Object parameter. That object also contains the Body property which on its turn contains a JSON message with a parameter name and value.

1) JSON Message
The parameter 'ContainerName'' will be provided in Azure Data Factory (ADF) via a JSON message that looks like
{"ContainerName":"OfficeData"}
2) Coding
In an Azure Automation Runbook you can start your code with the following code snippit that retrieves the value from the JSON messages from ADF. For this example it gets the name of a Blob Storage Container.
# PowerShell code

# Parameters
Param
(
    # Get webhook data via webhook
    [Parameter(Mandatory=$False,Position=1)]
    [object] $WebhookData
)

# Get all parameters from body (passed from Data Factory Web Activity)
$Parameters = (ConvertFrom-Json -InputObject $WebhookData.RequestBody)

# Get single parameter from the set of parameters
$ContainerName = $Parameters.ContainerName

# Write the value of the parameter
Write-Output "The value of the container parameter is $($ContainerName)"

3) Adding Webhook
Now that the code is ready, you have to create a Webhook for ADF. Make sure to choose a correct expire date. By default it is only valid for one year. Also make sure to copy the URL, because you can only see and copy it once. You need this URL in ADF for the Web(hook) activity. Don't share the URL because it is a URL and Password in one.
Adding a Webhook to a Runbook
















4) Azure Data Factory
In the pipeline of ADF you can use the Web(hook) activity to call the Webhook and use a JSON message in the Body parameter to provide a value for a parameter. The URL is the Webhook URL from the previous step that you copied. For Method you need to select "POST" and in the Body you can enter the JSON message from step 1.
Supplying parameters via the Body property in Web(hook) activity























Conclusion
Perhaps a bit strange, but this seems to be the only way when using webhooks for your runbook and calling them in ADF. As an alternative you could provide the value of the parameters in the Webhook when you create the Webhook instead of supplying them via ADF, but that is not very flexible if you expect a lot of different values. The code for retrieving parameters from the Webhook itself looks like this:
# PowerShell code

# Parameters
Param
(
    # Get parameter via webhook
    [Parameter(Mandatory=$False,Position=1)]
    [string] $ContainerName
)

# Write the value of the parameter
Write-Output "The value of the container parameter is $($ContainerName)"

Sunday, 17 November 2019

Runbook snack: Using the Azure Run As Connection

Case
I want to manage Azure resources via a runbook (for example upscaling, downscaling, pausing and resuming resources), how do I provide authentication to do this?
Automation Account























Solution
For this we can use an Azure Run As account which gives the Automation Account the Contributor role at the subscription level. However this also means that you need to be an Application administrator in Azure Active Directory or an Owner in the subscription to create an Azure Run As account! If you do not have sufficient rights, then the Create Azure Run As account option will be grayed out when you create a new Automation Account.
Insufficient permissions



















Now you can start your PowerShell runbook with the following code which allows your code to manage the various Azure resource. The first example is for those using the PowerShell RM modules and the second example is for the newer PowerShell AZ modules.
# PowerShell code
########################################################
# Log in to Azure with RM (standard code)
########################################################
Write-Verbose -Message 'Connecting to Azure'

# Name of the Azure Run As connection
$ConnectionName = 'AzureRunAsConnection'
try
{
    # Get the connection properties
    $ServicePrincipalConnection = Get-AutomationConnection -Name $ConnectionName        
 
    'Log in to Azure...'
    $null = Add-AzureRmAccount `
        -ServicePrincipal `
        -TenantId $ServicePrincipalConnection.TenantId `
        -ApplicationId $ServicePrincipalConnection.ApplicationId `
        -CertificateThumbprint $ServicePrincipalConnection.CertificateThumbprint 
}
catch 
{
    if (!$ServicePrincipalConnection)
    {
        # You forgot to turn on 'Create Azure Run As account' 
        $ErrorMessage = "Connection $ConnectionName not found."
        throw $ErrorMessage
    }
    else
    {
        # Something else went wrong
        Write-Error -Message $_.Exception.Message
        throw $_.Exception
    }
}
########################################################

# test code getting all resource groups
Get-AzureRmResourceGroup

Nearly the same code, but now with AZ modules.
# PowerShell code
########################################################
# Log in to Azure with AZ (standard code)
########################################################
Write-Verbose -Message 'Connecting to Azure'

# Name of the Azure Run As connection
$ConnectionName = 'AzureRunAsConnection'
try
{
    # Get the connection properties
    $ServicePrincipalConnection = Get-AutomationConnection -Name $ConnectionName        
 
    'Log in to Azure...'
    $null = Connect-AzAccount `
        -ServicePrincipal `
        -TenantId $ServicePrincipalConnection.TenantId `
        -ApplicationId $ServicePrincipalConnection.ApplicationId `
        -CertificateThumbprint $ServicePrincipalConnection.CertificateThumbprint 
}
catch 
{
    if (!$ServicePrincipalConnection)
    {
        # You forgot to turn on 'Create Azure Run As account' 
        $ErrorMessage = "Connection $ConnectionName not found."
        throw $ErrorMessage
    }
    else
    {
        # Something else went wrong
        Write-Error -Message $_.Exception.Message
        throw $_.Exception
    }
}
########################################################

# Test code getting all resource groups
Get-AzResourceGroup

If you used the Create Run As account option while creating the Automation account, then the name will be AzureRunAsConnection. Otherwise you can find the name under connections in the Azure Automation Account.
Connections in Azure Automation













Conclusion
This code, in combination with creating the Azure Run As account, allows you to perform all kinds of operations on Azure resources like up- and downscaling Azure SQL Databases or pausing and resuming Azure Analysis Services. The only problem could be insufficient rights to create this. In that case you could ask someone with sufficient rights to create the Automation Account with the Azure Run As account option on and then you could create the runbooks yourself.

Note that support for RM modules will end in December 2020. So don't wait to long to convert your scripts to AZ.