Saturday, 13 June 2020

Process Analysis Service with Data Factory only

Case
There are several methods to process Azure Analysis Services models, for example with Logic Apps, Azure Automation Runbooks or even SSIS, but is there an Azure Data Factory-only solution where we use nothing but the pipeline activities of ADF?
Process Azure Analysis Services

Solution
Yes, you can use the Web Activity to call the Rest API of Azure Analysis Services (AAS), but that requires you to give ADF permissions in AAS via its Managed Service Identity (MSI).

1) Create ADF service principal
In the next step we need a user that we can add as a Server Administrator of AAS. Since we will not find the managed identity of ADF when we search for a user account, we have to construct one ourselves. This 'user' is called a service principal.
  • Go to ADF in the Azure portal (not the Author & Monitor environment)
  • In the left menu click on Properties which you can find under General
  • Copy the 'Managed Identity Application ID' and the 'Managed Identity Tenant' properties to a notepad and construct the following string for the next step:
    app:<Application ID>@<Tenant> (and replace the <xxx> values with the properties)
    app:653ca9f9-855c-45df-bfff-3e7718159295@d903b4cb-ac8c-4e31-964c-e630a3a0c05e
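The string above can also be assembled programmatically; a minimal Python sketch using the example values from this post (replace the GUIDs with the ones from your own Data Factory's Properties blade):

```python
# Build the AAS service principal string from the two ADF properties.
# The GUIDs below are the example values from this post.
application_id = "653ca9f9-855c-45df-bfff-3e7718159295"  # Managed Identity Application ID
tenant_id = "d903b4cb-ac8c-4e31-964c-e630a3a0c05e"       # Managed Identity Tenant

service_principal = f"app:{application_id}@{tenant_id}"
print(service_principal)
```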

Create app user from ADF for AAS

2) Add user as Server Administrator
Now we need to connect to Azure Analysis Services via SQL Server Management Studio (SSMS) to add the 'user' from the previous step as a Server Administrator. This cannot be done via the Azure portal.
  • Login to your AAS with SSMS
  • Right click your server and choose Properties
  • Go to the Security pane
  • Click on the Add... button
  • Add the service principal from the previous step via the Manual Entry textbox and click on the Add button
  • Click on OK to close the property window

Add Server Administrator via Manual Entry

After this step the 'user' will appear in the portal as well, but you cannot add it via the portal.
Analysis Services Admins

3) Add Web Activity
In your ADF pipeline you need to add a Web Activity to call the Rest API of Analysis Services. The first step is to determine the Rest API URL. In the string below, replace the <xxx> values with the region, server name and model name of your Analysis Services instance. The Rest API method we will be using is 'refreshes':
https://<region>.asazure.windows.net/servers/<servername>/models/<modelname>/refreshes

Example:
https://westeurope.asazure.windows.net/servers/bitoolsserver/models/bitools/refreshes
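As a sketch, the URL can be composed in code too; a hypothetical Python helper (note that quote() percent-encodes spaces in server or model names as %20, which the API requires):

```python
from urllib.parse import quote

def refresh_url(region: str, server: str, model: str) -> str:
    # quote() percent-encodes spaces and other special characters,
    # so a model name like "my model" becomes "my%20model"
    return (f"https://{region}.asazure.windows.net"
            f"/servers/{quote(server)}/models/{quote(model)}/refreshes")

print(refresh_url("westeurope", "bitoolsserver", "bitools"))
```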

The second step is to create a JSON message for the Rest API that gives the process order to AAS. To fully process the entire model you can use this message:
{
    "Type": "Full",
    "CommitMode": "transactional",
    "MaxParallelism": 2,
    "RetryCount": 2,
    "Objects": []
}
Or you can process particular tables within the model with a message like this:
{
    "Type": "Full",
    "CommitMode": "transactional",
    "MaxParallelism": 2,
    "RetryCount": 2,
    "Objects": [
        {
            "table": "DimProduct",
            "partition": "CurrentYear"
        },
        {
            "table": "DimDepartment"
        }
    ]
}
See the documentation for all the parameters that you can use.
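When the set of tables or partitions varies per run, the body can also be generated instead of hard-coded. A small Python sketch (the helper name and defaults are illustrative, not part of the Rest API):

```python
import json

def refresh_body(objects=None, refresh_type="Full",
                 max_parallelism=2, retry_count=2):
    # objects: list of dicts such as {"table": "...", "partition": "..."};
    # None or an empty list means "process the whole model"
    return {
        "Type": refresh_type,
        "CommitMode": "transactional",
        "MaxParallelism": max_parallelism,
        "RetryCount": retry_count,
        "Objects": objects or [],
    }

body = refresh_body([{"table": "DimProduct", "partition": "CurrentYear"},
                     {"table": "DimDepartment"}])
print(json.dumps(body, indent=4))
```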
  • Add the Web activity to your pipeline
  • Give it a descriptive name like Process Model
  • Go to the Settings tab
  • Use the Rest API URL from above in the URL property
  • Choose POST as Method
  • Add the JSON message from above in the Body property
  • Under Advanced choose MSI as the Authentication method
  • Add 'https://*.asazure.windows.net' in the Resource property (note this URL is different for suspending and resuming AAS)
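Outside ADF the same call can be made directly against the Rest API; a rough Python sketch of the request the Web Activity fires (the bearer token, which MSI supplies automatically inside ADF, is assumed to be obtained separately here):

```python
import json
import urllib.request

def build_refresh_request(url: str, token: str, body: dict) -> urllib.request.Request:
    # A POST to the 'refreshes' URL with a bearer token for the
    # https://*.asazure.windows.net resource starts the processing
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )

# urllib.request.urlopen(build_refresh_request(url, token, body)) would send it
```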
Web Activity calling the AAS Rest API

Then debug the pipeline to check the processing result.

4) Retrieve refreshes
By only changing the method from POST to GET (the Body property will disappear) you can retrieve information about the processing status and use that information in subsequent pipeline activities.
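For example, you can poll the status in an Until loop combined with a Wait activity; a Python sketch of that polling logic (the field name 'status' and the newest-first ordering of the returned array are assumptions about the GET response, so check the async-refresh documentation):

```python
import time

def latest_status(refreshes: list) -> str:
    # 'refreshes' is the parsed JSON array that a GET on the refreshes
    # URL returns; assumed newest-first with a 'status' field per entry
    return refreshes[0]["status"] if refreshes else "unknown"

def wait_until_done(fetch, poll_seconds=60, max_polls=120):
    # fetch() performs the GET and returns the parsed array;
    # this mimics an Until loop combined with a Wait activity in ADF
    for _ in range(max_polls):
        status = latest_status(fetch())
        if status != "inProgress":
            return status
        time.sleep(poll_seconds)
    return "timeout"
```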
Retrieve process status via GET


Summary
In this post you learned how to process your Analysis Services models with only Azure Data Factory. No other services are needed, which makes maintenance a little easier. In a future post we will also show you how to pause or resume your Analysis Services with the Rest API. With a few extra steps you can also use this method to refresh a Power BI dataset, but we will show that in a future post as well.

Update: firewall turned on?

42 comments:

  1. Thanks for this post. I tried but am getting error 2108: "message": "Invoking Web Activity failed with HttpStatusCode - 'NotFound'.",
    "failureType": "UserError".

    1. I got that error when the URL was incorrect (spelling error in model or server)

    2. Do you have a space in the name of your model? We experienced the same issue but just replaced the space with %20 in the url and it was solved.

  2. Thanks, all good. It worked. I missed "servers" before the server name.

  3. Hi Joost, thanks for this post, really good !!

    I have a question: is it possible to create dynamic partition name parameters? For example I have partitions with names DK_202001 and FR_201901, but I want to run all the partitions for DK and FR. And I want to process the tabular model dynamically with parameters; for example, today I want to process the tabular model just for DK, tomorrow for FR.

    Once again thanks and kind regards!

    1. I'm not seeing a list of available partitions in the Rest API. But if you can create a collection of partitions that you want to process then you could use some dynamic expressions to generate the body message

  4. I got an error like this:

    Response status code: ''. More details: Exception message: 'An error occurred while sending the request.'.\r\nNo response from the endpoint. Possible causes: network connectivity, DNS failure, server certificate validation or timeout.",
    "failureType": "UserError",
    "target": "Process Model",
    I followed the same steps.
    Thank you

    1. Hi, I resolved this by manually going to SSMS and refreshing the credentials, and the problem was solved. But when I try to deploy with existing credentials I still get this error. Is there any way we can automatically / permanently store the credentials in the AAS model?

  5. I am getting error while executing the web activity "Client with IP Address '*.*.*.*' is not allowed to access the server. To enable access, use the Firewall settings in Azure Management Portal." This IP address is changing and not fixed . Also I can not see any ADF IP address to add in AAS firewall setting.

    1. You could whitelist ALL Azure Data Factories in your region. Not 100% safe, but better than turning the firewall off.

      Another option could be to use an Azure Automation Runbook (PowerShell or Python code) that first retrieves its own IP address to add it to the firewall and then removes it after processing.

    2. https://techcommunity.microsoft.com/t5/azure-data-factory/azure-data-factory-now-supports-static-ip-address-ranges/ba-p/1117508

    3. See latest blogpost https://microsoft-bitools.blogspot.com/2020/11/process-analysis-service-with-firewall.html

  6. hi , Is it possible to do incremental processing using this approch?

  7. Joost, thanks so much, this post was really clear and useful (unlike some Microsoft documentation)! Within 5 minutes I had a fully working refresh without any problems!

  8. Hello Joost, is there a way to see the current progress of refreshing the model in the pipeline log? I mean, my model usually took about 1 hour to complete the Full process using Automation.

    But now using the ADF Web Activity, it took just 6 seconds for the pipeline to finish with "In Progress" status. Is there any way to see the progress of the refresh?

    1. Yes, you can also use the Rest API to get the current refresh status. You could call that in an Until loop. Within the loop you retrieve the status and combine that with a Wait activity to wait a (couple of) minute(s) before you retrieve the status again.

      https://docs.microsoft.com/en-us/azure/analysis-services/analysis-services-async-refresh

      I will perhaps do a separate post on that if more people like that.

    2. Joost, if you don't mind. I want to clarify on GET and POST method of the Web Activity task.

      I am using a small Tabular model to test whether the model is refreshed or not. I followed this guide, debugging the activity with the POST method, and changed it to GET when I published it.

      However, my model is not refreshed when the Web Activity using the GET method, and as soon as I changed it to the POST, it refreshed. I tested it in PowerBI to see the change of the data.

      Any idea about this, Joost? Thanks for your kind response

    3. Hi Joost, Thanks for the article. It really helped. Did you do a post on getting the refresh status and following through the end of refresh. It would be great to have a similar post on it.

  9. Can we submit DAX queries? I want to query the AAS model from ADF and dump the result into ADL.
    Can anyone please help?

    1. Did you get any solution? I am looking for one. The only option I could see is to run a PowerShell script as a custom activity in ADF. In the PowerShell script we can add a DAX query.

  10. If you are using Azure Synapse instead of Data Factory, how do you create a server admin account for Synapse? In the properties of Synapse there is a Managed Identity Object ID, but there is no Application ID or Tenant ID.

    1. I haven't tried it, but I think you can look up the Managed Identity Tenant in another service like ADF.

    2. I tried with the ManagedIdentityObjectID, but it did not work from Synapse. Any suggestions?

  11. Today I fought with the user permissions in Analysis Services, which were not working when constructed with [object id]@[tenant id] shown in Data Factory's Properties blade. I finally found that the Properties of Data Factory don't show the Application ID, only the Object ID and Tenant ID. To get the Application ID go to Active Directory, Enterprise Applications, filter Application Type = Managed Identities, click on your data factory and use the Application ID...

    1. Please follow the steps given in the URL below to get the Application ID:
      https://docs.microsoft.com/en-us/azure/data-factory/data-factory-service-identity#retrieve-managed-identity-using-powershell

  12. Did it work? I also tried using the ManagedIdentityObjectID in Synapse but it did not work.

    1. I faced the same issue but I found the Application ID by following the below:
      https://docs.microsoft.com/en-us/azure/data-factory/data-factory-service-identity#retrieve-managed-identity-using-powershell

      In your Azure Portal go to Powershell and follow the commands

  13. Hi Joost,

    Thanks for this article, but it does not tell where you are extracting the data from. So, for example, how do I extract data from Synapse and then process my Analysis Services cube?
    Could you please let me know.

    Thanks,
    Shantanu

    1. Not sure I understand the problem. The model points to a datasource... and you are processing the model which knows where to get the data from... or do you want to change the connection?

    2. Ohh... I get it now. I thought that, like a copy activity where we need to specify the linked services and datasets in the ADF pipeline, we would have to do something similar to extract the data from Synapse and process the AAS cube.
      Many thanks!

  14. Many thanks Joost, I was able to process the AAS Model after following your article.

  15. Hi Joost, am I missing anything? I got this error message:
    {"error":{"code":"InvalidRequestContent","message":"The request content was invalid and could not be deserialized: 'Unexpected end when deserializing object. Path 'properties.ipV4FirewallSettings.firewallRules[41].firewallRuleName', line 1, position 4000.'."}}

  16. Operation on target WEB - Get details of a server failed: {"error":{"code":"AuthorizationFailed","message":"The client 'xxxxxx' with object id 'xxxxxxx' does not have authorization to perform action 'Microsoft.AnalysisServices/servers/read' over scope '/subscriptions/xxxxxxx/resourceGroups/xxxxxx/providers/Microsoft.AnalysisServices/servers/xxxxx' or the scope is invalid. If access was recently granted, please refresh your credentials."}}

    While connecting from the ADF Web Activity it gives the above error. I have performed all steps as per this article.


  17. Hi Joost, getting below error: {"code":"Unauthorized","subCode":0,"message":"An internal error occurred.","timeStamp":"2021-05-31T09:50:55.2470385Z","httpStatusCode":400,"details":[{"code":"RootActivityId","message":"a6fc98e3-c08e-403b-9a93-03c0bfbbf85f"},{"code":"Param1","message":"asazure://asazureweu5-westeurope.asazure.windows.net/xxxxxx"}]}

  18. Hi, Can you help to share the code to scale out the Processing server from Query Server using ADF.

  19. Hello Joost van Rossum,

    I am using the Rest API method for processing the data on the Azure Analysis cube and that is working fine.

    I need to create a new partition for each month and process only that particular month's partition.

    i.e. we are currently in AUG 2021, hence we need to create the partition as TableName_Aug2021 and the query will be "select * from tablename where date between '1-aug-2021' and '31-aug-2021'".

    Once this partition is created we need to process only this TableName_Aug2021 partition using the Rest API method.

    Please let me know if you need any information .

    1. Hello Ravi, I would like to know if you found a solution to dynamically create partitions monthly?

  20. Hello Joost,

    Thanks for this post. Up to AAS everything works really fine. But the problem is we have a SQL Server hosting the cube data, so eventually we ran into a credentials error. We don't want to use a SQL DB account due to the security policy of our org. Will refreshing through ADF solve the SQL account issue as well, or do we still need a SQL login from AAS to get the data?

  21. Hi Joost,
    Thank you for the post. I am wondering how we can programmatically check the status of the refresh process started by this approach. The GET request you mentioned in the post returns all recent refresh processes of the last 30 days, rather than only the last one. The runs in the returned array can be identified by the refreshId; the problem is: how do we find out the refreshId? The official documentation says that as a result of the "POST /refreshes" HTTP request we receive the refreshId, which uniquely identifies the run/process, in the response header (https://docs.microsoft.com/en-us/azure/analysis-services/analysis-services-async-refresh#post-refreshes). But in fact I cannot find that information in my pipeline/activity logs.
    Regards
    Ivan

  22. Hello Joost, great article. That Rest API is only for processing, but I would like to know if there is a way to create partitions dynamically?


All comments will be verified first to avoid URL spammers