Friday 29 March 2019

Azure - Reuse Power BI dataflows

Case
Microsoft recently introduced Power BI dataflows for self-service ETL or data preparation. It uses an Azure Data Lake Storage (ADLS), which is only visible to Power BI, to store the metadata and result of the dataflows. However, you can also bring your own storage account. How does that work and why would you want that?

Bring Your Own Storage in preview

Solution
As we mentioned earlier in our previous post, Power BI dataflows can be seen as a self-service data preparation tool. It is easy accessible, so that other people in your organization (besides IT) can get started with transforming and maintaining the data. Nowadays there are often several people / business analysts in a department who are maintaining Excel files. With Power BI dataflows you can centralize this process and contribute to the ideology of "one version of the truth".

When you bring your own ADLS Gen2 storage account (StorageV2) for Power BI dataflows, other services like Azure Data Factory or Azure Databricks could use that same data. This makes it possible to do light weight data preparations with an user-friendly tool for corporate data warehousing or data science. However Power BI dataflows does not (yet?) replace SSIS, Azure Data Factory or any other ETL tool. They could even write back data to the data lake to create 'external' dataflows that are not maintained in Power BI.

Configure and connect to your own storage account
First you have to create an ADLS Gen2 storage account. Make sure the storage account is on the same region as the Power BI tenant. Click here for the documentation of Microsoft that explains step by step how to create such account.

Next, execute the following steps to make a connection with your own storage account. Note that the last step is a one-time action that cannot be changed afterwards.

It requires the following permissions to make a connection:
  • Role "Owner" in the Azure Subscription (service administrator / classic administrator) to add an ADLS Gen2 storage account. This is also required to give the Power BI Service access to the storage account and Blob Container. 
  • Global Administrator in O365 or Azure Active Directory to connect your ADLS Gen2 storage account with dataflows. This has to be done in the Power BI admin portal. The role Power BI Service Administrator is not sufficient to perform those actions.

Power BI Service - Successfully connected to your own ADLS Gen2 storage account













Result
To actual store the dataflow definition and the related data files in your ADLS Gen2 storage account, you must create a new Power BI app workspace or update an existing one. In case of an update, make sure you do not already have dataflows stored in the workspace. Otherwise you cannot change this.

In this case we created a new workspace. Turn on Dataflow storage (preview) under "Advanced" settings. In this case we built the same (simple) dataflow as our previous post. Save your dataflow and click "Refresh".

Azure Storage Explorer
When the dataflow is refreshed, go to the Blob container you created earlier for your ADLS Gen2 storage account. Here you will find the definition (source code) of the dataflow and the output. Note that the content of the ADLS Gen2 storage account is only visible in Azure Storage Explorer. Click here to download.

Azure Data Explorer - Result in own ADLS Gen2 storage account


















And now you can choose per workspace whether you want to use this new Data Lake. This only works for the new workspaces, that are at the moment also still in preview.
Power BI Workspace settings


Common errors
When you try to connect to your own storage account in the admin portal of the Power BI Service, you can get several errors.

You must have global administrator permissions
Only Global Administrators in Office 365 or Azure Active Directory are administrators in Power BI and therefore able to connect to the storage account. Click here for more information about administrator roles in Power BI.














There was a problem accessing your dataflow storage account
After creating your storage account , it can take up to 30 minutes to make a connection. Also make sure you avoid spelling mistakes.

Your storage account must be in the same Azure Active Directory tenant
This occurs for example when you are trying to make a connection with the ADLS storage account in the Power BI Service with an account outside the organization. In this case, the organization (subscription) is where the storage account is created.
















Conclusion
In this post we showed you how to use your own Azure Data Lake Storage account instead of the default provided by Power BI using dataflows. This new feature has several possible use cases. For example:
  1. Data preparation for 'corporate' data warehousing by a business user with a user-friendly tool
  2. Data preparation for 'corporate' data science by a business user with a user-friendly tool
  3. Creating 'external' dataflows for Power BI with Azure Services like Data Factory or Databricks
  4. Using data from other CDM-compliant applications like Dynamics 365 and Office 365

We hope Microsoft will make it a bit easier to bring your own storage, because at the moment there are a lot of steps to take and you need a lot of rights to do it. This discourages to arrange this powerful option.

We also expect some more admin capabilities in Power BI, because at this moment you cannot change your dataflow storage once Power BI is connected to your own ADLS account (so be very careful). And for larger corporations one ADLS account is probably not enough. It is also expected that the integration and collaboration with other Azure services will be improved so that you are even more flexible in choosing the services in your BI landscape.

Saturday 23 March 2019

Power Apps Snack: Refresh Power BI (automatically)

Case
A while ago we blogged about the integration of PowerApps and Power BI. This allows you to change data from within your Power BI report! One downside was that you still had to refresh the Power BI data manually (or by using the API and Flow). How can you automate the refresh?
The new PowerApps Visualization in Power BI





















Solution
Microsoft recently updated PowerApps with the ability to refresh Power BI. Now you can add the refresh command PowerBIIntegration.Refresh() right after your SubmitForm or Patch command. That is all!
PowerBIIntegration.Refresh()













Note: For me it didn't work for existing PowerApps. I had to create a new PowerApp to get the refresh option available. We will follow up on that.
Update 25-03-2019: The Refresh() function gets added to PowerApps from the visual on creation. Therefore it will not be available for existing PowerApps. The workaround for now is creating a new app and then copy and paste the items from the existing app. Please upvote this uservoice request.

Refresh without clicking on the Refresh button in Power BI













Summary
Finally we have an even better integration of PowerApps and Power BI. No more clicking on refresh buttons. Also note the new PowerApps Visualization in Power BI. It got a total makeover with some handy explanations about the integration of both tools. See the top picture. Also see this link for more PowerApps updates