Friday 1 May 2020

Azure Data Factory - Use Key Vault Secret in pipeline

Case
I want to use secrets from Azure Key Vault in my Azure Data Factory (ADF) pipeline, but only certain properties of the Linked Services can be filled by secrets of a Key Vault. Is it possible to retrieve Key Vault secrets in the ADF pipeline itself?

Using Azure Key Vault for Azure Data Factory













Solution
Yes the easiest way is to use a web activity with a RestAPI call to retrieve the secret from Key Vault. The documentation is a little limited and only shows how to retrieve a specific version of the secret via the secret identifier (a guid). This is not a workable solution for two reasons:
  1. The guid changes each time you change the secret. In probably 99% of the cases you just want to get the latest of version of the secret. This means you need to change that guid in ADF as well when you change a secret.
  2. The guid differs on each environment of your key vault (dev, test, acceptance and production). This makes it hard to use this solution in a multi ADF environment.
For this example we will misuse Key Vault a little bit as a configuration table and retrieve the RestAPI url of a database from the Key Vault. The example assumes you already have a Key Vault filled with secrets. If you don't have that then executes the first two steps of this post.

1) Access policies
First step is to give ADF access to the Key Vault to read its content. You can now find ADF by its name so you don't have to search for its managed identity guid, but using that guid is also still possible.
  • Go to Access policies in the left menu of your Key Vault
  • Click on the blue + Add Access Policy link
  • Leave Configure from template empty
  • Leave Key permissions unselected (we will only use a Secret for this example)
  • Select Get for Secret permissions
  • Leave Certificate permissions unselected (we will only use a Secret for this example)
  • Click on the field of Select principal to find the name of your Azure Data Factory
  • Leave Authorized application unchanged
  • Click on Add and a new Application will appear in the list of Current Access Policies
Add Access policy
















Note: for this specific example we do not need to create a Key Vault Linked Service in ADF.

2) Determine URL of secret
To retrieve the secrets we need the RestAPI URL of that secret. This URL is constructed as
https://{Name Keyvault}.vault.azure.net/secrets/{SecretName}?api-version=7.0

{Name Keyvault} : is the name of the keyvault you are using
{SecretName} : is the secretName

In this example the secretName is "SQLServerURL" and the URL should be looking like this https://{Name Keyvault}.vault.azure.net/secrets/SQLServerURL?api-version=7
Get the SecretName from Key Vault












3) Web activity
Next we have to add the activity ‘web’ into the ADF pipeline. Use the following settings in the settings tab.
  • Set URL to https://{Name Keyvault}.vault.azure.net/secrets/SQLServerURL?api-version=7.0
  • Set Method to Get
  • Under Advanced select MSI
  • And set the resource to https://vault.azure.net

Configuring the Web activity to retrieve the secret


















4) Retrieve value
Now we want to use the secret from the Key Vault in a successive activity, in this case another web activity to upscale a database. In the URL property of this activity we now use the output value from the previous webactivity.
@activity('GetURL').output.value
Retrieve output value via expression















5) The result
To check the result of the changes we need to execute the pipeline.
Execute the pipeline to check the result

















Note: if you are using this to retrieve real secrets like passwords and you don't want them to show up in the logging of Azure Data Factory then check the Secure output property on the general tab of your activity.
Don't show secret in logging




















Conclusion
In this blogpost your learned how to retrieve Key Vault secrets in ADF. The trick is to retrieve them by there name instead of by there version Guid. This will always give you the latest version and allows you to use this construction in multiple environments.

Update: ADF now supports Global parameters to store parameters that can be used by all pipelines

7 comments:

  1. This is exactly what I ran into last week. Had a proof of concept with refreshing a Power BI dataset, but left of wondering how to to implement this in the development street...

    ReplyDelete
  2. Thanks. Question: How does this solve the problem of using this one solution in a multi ADF environment (dev, test, acceptance and production)? There is still a problem with the {Name Keyvault} that has to be changed -supposing you have separate keyvaults for each environment. Another option is having different secrets for each environment, but then changing the {SecretName} would be required.
    How does one implement this with one version that will work on all environments?

    ReplyDelete
    Replies
    1. We use naming conventions in the ADF and Key Vault name. By retrieving the ADF name with an expression we can determine the Key Vault name.

      Delete
    2. Interesting article, thanks. But how do you retrieve the ADF-name in an expression?

      Delete
    3. Ah, thanks. Do you prefer to use global parameters for this now (i.e., introduce a global “environment” variable), or do you still extract the environment from the adf name?

      Delete
    4. If you can deduct it from the Data Factory name then it is probably less work, however the Global parameters give you more freedom in case you want to deviate from naming conventions.

      Delete

All comments will be verified first to avoid URL spammers. यूआरएल स्पैमर से बचने के लिए सभी टिप्पणियों को पहले सत्यापित किया जाएगा।