Friday 1 May 2020

Azure Data Factory - Use Key Vault Secret in pipeline

Case
I want to use secrets from Azure Key Vault in my Azure Data Factory (ADF) pipeline, but only certain properties of the Linked Services can be filled by secrets of a Key Vault. Is it possible to retrieve Key Vault secrets in the ADF pipeline itself?

Using Azure Key Vault for Azure Data Factory













Solution
Yes the easiest way is to use a web activity with a RestAPI call to retrieve the secret from Key Vault. The documentation is a little limited and only shows how to retrieve a specific version of the secret via the secret identifier (a guid). This is not a workable solution for two reasons:
  1. The guid changes each time you change the secret. In probably 99% of the cases you just want to get the latest of version of the secret. This means you need to change that guid in ADF as well when you change a secret.
  2. The guid differs on each environment of your key vault (dev, test, acceptance and production). This makes it hard to use this solution in a multi ADF environment.
For this example we will misuse Key Vault a little bit as a configuration table and retrieve the RestAPI url of a database from the Key Vault. The example assumes you already have a Key Vault filled with secrets. If you don't have that then executes the first two steps of this post.

1) Access policies
First step is to give ADF access to the Key Vault to read its content. You can now find ADF by its name so you don't have to search for its managed identity guid, but using that guid is also still possible.
  • Go to Access policies in the left menu of your Key Vault
  • Click on the blue + Add Access Policy link
  • Leave Configure from template empty
  • Leave Key permissions unselected (we will only use a Secret for this example)
  • Select Get for Secret permissions
  • Leave Certificate permissions unselected (we will only use a Secret for this example)
  • Click on the field of Select principal to find the name of your Azure Data Factory
  • Leave Authorized application unchanged
  • Click on Add and a new Application will appear in the list of Current Access Policies
Add Access policy
















Note: for this specific example we do not need to create a Key Vault Linked Service in ADF.

2) Determine URL of secret
To retrieve the secrets we need the RestAPI URL of that secret. This URL is constructed as
https://{Name Keyvault}.vault.azure.net/secrets/{SecretName}?api-version=7.0

{Name Keyvault} : is the name of the keyvault you are using
{SecretName} : is the secretName

In this example the secretName is "SQLServerURL" and the URL should be looking like this https://{Name Keyvault}.vault.azure.net/secrets/SQLServerURL?api-version=7
Get the SecretName from Key Vault












3) Web activity
Next we have to add the activity ‘web’ into the ADF pipeline. Use the following settings in the settings tab.
  • Set URL to https://{Name Keyvault}.vault.azure.net/secrets/SQLServerURL?api-version=7.0
  • Set Method to Get
  • Under Advanced select MSI
  • And set the resource to https://vault.azure.net

Configuring the Web activity to retrieve the secret


















4) Retrieve value
Now we want to use the secret from the Key Vault in a successive activity, in this case another web activity to upscale a database. In the URL property of this activity we now use the output value from the previous webactivity.
@activity('GetURL').output.value
Retrieve output value via expression















5) The result
To check the result of the changes we need to execute the pipeline.
Execute the pipeline to check the result

















Note: if you are using this to retrieve real secrets like passwords and you don't want them to show up in the logging of Azure Data Factory then check the Secure output property on the general tab of your activity.
Don't show secret in logging




















Conclusion
In this blogpost your learned how to retrieve Key Vault secrets in ADF. The trick is to retrieve them by there name instead of by there version Guid. This will always give you the latest version and allows you to use this construction in multiple environments.

Update: ADF now supports Global parameters to store parameters that can be used by all pipelines