I want to use secrets from Azure Key Vault in my Azure Data Factory (ADF) pipeline, but only certain properties of the Linked Services can be filled by secrets of a Key Vault. Is it possible to retrieve Key Vault secrets in the ADF pipeline itself?
Using Azure Key Vault for Azure Data Factory |
Solution
Yes the easiest way is to use a web activity with a RestAPI call to retrieve the secret from Key Vault. The documentation is a little limited and only shows how to retrieve a specific version of the secret via the secret identifier (a guid). This is not a workable solution for two reasons:
- The guid changes each time you change the secret. In probably 99% of the cases you just want to get the latest of version of the secret. This means you need to change that guid in ADF as well when you change a secret.
- The guid differs on each environment of your key vault (dev, test, acceptance and production). This makes it hard to use this solution in a multi ADF environment.
1) Access policies
First step is to give ADF access to the Key Vault to read its content. You can now find ADF by its name so you don't have to search for its managed identity guid, but using that guid is also still possible.
- Go to Access policies in the left menu of your Key Vault
- Click on the blue + Add Access Policy link
- Leave Configure from template empty
- Leave Key permissions unselected (we will only use a Secret for this example)
- Select Get for Secret permissions
- Leave Certificate permissions unselected (we will only use a Secret for this example)
- Click on the field of Select principal to find the name of your Azure Data Factory
- Leave Authorized application unchanged
- Click on Add and a new Application will appear in the list of Current Access Policies
Add Access policy |
Note: for this specific example we do not need to create a Key Vault Linked Service in ADF.
2) Determine URL of secret
To retrieve the secrets we need the RestAPI URL of that secret. This URL is constructed as
https://{Name Keyvault}.vault.azure.net/secrets/{SecretName}?api-version=7.0
{Name Keyvault} : is the name of the keyvault you are using
{SecretName} : is the secretName
In this example the secretName is "SQLServerURL" and the URL should be looking like this https://{Name Keyvault}.vault.azure.net/secrets/SQLServerURL?api-version=7
Get the SecretName from Key Vault |
3) Web activity
Next we have to add the activity ‘web’ into the ADF pipeline. Use the following settings in the settings tab.
- Set URL to https://{Name Keyvault}.vault.azure.net/secrets/SQLServerURL?api-version=7.0
- Set Method to Get
- Under Advanced select MSI
- And set the resource to https://vault.azure.net
Configuring the Web activity to retrieve the secret |
4) Retrieve value
Now we want to use the secret from the Key Vault in a successive activity, in this case another web activity to upscale a database. In the URL property of this activity we now use the output value from the previous webactivity.
@activity('GetURL').output.value
Retrieve output value via expression |
5) The result
To check the result of the changes we need to execute the pipeline.
Execute the pipeline to check the result |
Note: if you are using this to retrieve real secrets like passwords and you don't want them to show up in the logging of Azure Data Factory then check the Secure output property on the general tab of your activity.
Don't show secret in logging |
Conclusion
In this blogpost your learned how to retrieve Key Vault secrets in ADF. The trick is to retrieve them by there name instead of by there version Guid. This will always give you the latest version and allows you to use this construction in multiple environments.
Update: ADF now supports Global parameters to store parameters that can be used by all pipelines
This is exactly what I ran into last week. Had a proof of concept with refreshing a Power BI dataset, but left of wondering how to to implement this in the development street...
ReplyDeleteThanks. Question: How does this solve the problem of using this one solution in a multi ADF environment (dev, test, acceptance and production)? There is still a problem with the {Name Keyvault} that has to be changed -supposing you have separate keyvaults for each environment. Another option is having different secrets for each environment, but then changing the {SecretName} would be required.
ReplyDeleteHow does one implement this with one version that will work on all environments?
We use naming conventions in the ADF and Key Vault name. By retrieving the ADF name with an expression we can determine the Key Vault name.
DeleteInteresting article, thanks. But how do you retrieve the ADF-name in an expression?
Deletepipeline().DataFactory
DeleteAh, thanks. Do you prefer to use global parameters for this now (i.e., introduce a global “environment” variable), or do you still extract the environment from the adf name?
DeleteIf you can deduct it from the Data Factory name then it is probably less work, however the Global parameters give you more freedom in case you want to deviate from naming conventions.
Delete