Monday, 18 February 2019

ADF snack: Pipeline templates

Case
I'm regularly creating the same kind of pipelines in Azure Data Factory. Is there a way to copy or reuse them without writing JSON?
ADF Templates

Solution
If you are working within one Data Factory, you can simply clone the pipeline by clicking the three dots behind the pipeline you want to copy. After that, a new copy of the pipeline will be created.
Clone a pipeline

But Microsoft has also recently introduced pipeline templates, which can help you with this. The template gallery contains several templates from Microsoft, but you can also add your own templates. However, a Git repository is required for this option. To add a new pipeline from a template, click on the + icon next to the filter text box and then choose Pipeline from template. Now find the right template and fill in all the required fields for each activity.
Choosing a pipeline template

To add your own pipeline as a template, first open your pipeline. Then press the Save as template option in the top menu. Now you can give your new template a name and press the Save template button in the bottom right.

Creating a new pipeline template

Summary

In this post we showed you how to use the new pipeline template option in Azure Data Factory. This can be handy when you are creating a lot of similar pipelines.

In addition, the templates from Microsoft can be a quick start when you do not have any templates of your own and you are starting from scratch.

Thursday, 31 January 2019

Introducing Azure Function Activity to Data Factory

Case
Last month Microsoft released the new Azure Function Activity for Azure Data Factory (ADF). This should make it easier to integrate Azure Function code in Data Factory. How does it work?
Azure Function activity in Azure Data Factory

Solution
Using Azure Functions (like other APIs) was already possible via the Web Activity, but now ADF has its own activity, which should make the integration even better.
Azure Functions via the Web Activity

For a project at a client we were already using the Web Activity for an Azure Function with an HTTP trigger. To use this code with the new activity, we needed to change the code slightly. Note that I'm not a die-hard .NET developer, but I got some help from my colleague Walter ter Maten.

1) HttpResponseMessage to IActionResult
The standard function returned an HttpResponseMessage, but for the Azure Function Activity in the ADF pipeline it needs to return a JSON object. Below you will find very simple and partial code, purely to show the differences in the function call and the return code. Also notice the extra reference/using.

Partial old code with the HttpResponseMessage (Web Activity only):
//C# Code
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Azure.WebJobs.Host;

namespace Calculate
{   
    public static class Calculate
    {
        [FunctionName("SumNumbers")]
        public static async Task<HttpResponseMessage> SumNumbersFunction([HttpTrigger(AuthorizationLevel.Function, "get", "post", Route = null)]HttpRequestMessage req, TraceWriter log)
        {
            int number1 = 0;
            int number2 = 0;

            // Log start
            log.Info("C# HTTP trigger function processed a request.");

            // Get request body
            dynamic data = await req.Content.ReadAsAsync<object>();
            number1 = data?.number1 ?? 0;
            number2 = data?.number2 ?? 0;

            // Log parameters
            log.Info("Number1:" + number1.ToString());
            log.Info("Number2:" + number2.ToString());

            return req.CreateResponse(HttpStatusCode.OK, "The result is " + (number1 + number2).ToString());
        }
    }
}

Partial new code with the IActionResult (Azure Function Activity and Web Activity):
//C# Code
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Azure.WebJobs.Host;
using Microsoft.AspNetCore.Mvc;  // Added

namespace Calculate
{   
    public static class Calculate
    {
        [FunctionName("SumNumbers")]
        public static async Task<IActionResult> SumNumbersFunction([HttpTrigger(AuthorizationLevel.Function, "get", "post", Route = null)]HttpRequestMessage req, TraceWriter log)
        {
            int number1 = 0;
            int number2 = 0;

            // Log start
            log.Info("C# HTTP trigger function processed a request.");

            // Get request body
            dynamic data = await req.Content.ReadAsAsync<object>();
            number1 = data?.number1 ?? 0;
            number2 = data?.number2 ?? 0;

            // Log parameters
            log.Info("Number1:" + number1.ToString());
            log.Info("Number2:" + number2.ToString());

            return new OkObjectResult(new { Result = "OK" });
        }
    }
}

We also needed to install the Microsoft.AspNetCore.Mvc NuGet package. In the Tools menu of Visual Studio you will find the NuGet Package Manager. Search for this package and click the Install button. After that you can add the new using (Microsoft.AspNetCore.Mvc) to the code.
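Alternatively, assuming you use the .NET CLI (a sketch; run it in your own project folder), the same package can be added from the command line:

```shell
# Add the Microsoft.AspNetCore.Mvc package to the Function project
# (run from the folder that contains the .csproj file)
dotnet add package Microsoft.AspNetCore.Mvc
```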
Installing the Microsoft.AspNetCore.Mvc NuGet package

2) Get Function Key and App URL
In the next step we need to specify the Function URL and the Function Key. You can find both in the Azure portal by locating the specific function.
Function App URL

For the Function Key, go to Manage in your function and then copy the Function Key.
Function Key

3) Azure Function Activity
Now you can replace the Web Activity with an Azure Function Activity. Drag the new activity to the pipeline canvas and give it a suitable name. Next, create a new Azure Function Linked Service. This is where you need the two strings from the previous step. After that, enter the name of the function you want to execute, choose the Method and fill in the Body (copy it from the Web Activity).
Then delete the Web Activity and debug the pipeline to test the function. Also check the Monitor (log) of your Azure Function to see whether it was executed successfully.
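For reference, the pipeline JSON behind such an Azure Function Activity looks roughly like this (a sketch based on our example; the activity name, linked service name and body values are our own):

```json
{
    "name": "Sum this",
    "type": "AzureFunctionActivity",
    "linkedServiceName": {
        "referenceName": "AzureFunctionLinkedService",
        "type": "LinkedServiceReference"
    },
    "typeProperties": {
        "functionName": "SumNumbers",
        "method": "POST",
        "body": {
            "number1": 5,
            "number2": 3
        }
    }
}
```

Note that the Function URL and Function Key themselves live in the linked service, not in the activity.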
Replace the Web Activity with an Azure Function Activity

Summary
In this post you saw how to use the new Azure Function Activity in an Azure Data Factory pipeline. It is fairly easy, but we did have to make a small change to the code. If you don't change the code, you will get this error:
{
    "errorCode": "3600",
    "message": "Error calling the endpoint.",
    "failureType": "UserError",
    "target": "Sum this"
}
For now I did not see any improvements other than that you can now use the Function Key in Azure Functions, which also allows you to revoke access for a specific key.

Monday, 31 December 2018

Power BI - Introducing dataflows

Case
Power BI recently introduced dataflows. What is it and who should use it?
Power BI dataflows

Solution
According to Microsoft, this new addition is for self-service ETL by business analysts and BI professionals. A minor addition from our side: especially business analysts skilled with tools like Excel, and BI professionals already focused on Power BI, will probably be very excited to use this new addition. Another group that could benefit from this new feature is technical application administrators with a great amount of knowledge of the data model of their application. With dataflows they could do some data preparation for the business analysts.

First of all, you must have at least a Power BI Pro license. This new preview feature is a Power BI Service-only feature that is not (yet?) available in Power BI Desktop. However, you can use the result of a dataflow in Power BI Desktop, which we will show you in the last step.

1) Power BI Service
To create a dataflow, sign in to the Power BI Service and go to one of your workspaces (not My Workspace), or create a new workspace. Here you will find a new tab called "Dataflows (preview)", and in the Create menu there is a dataflow option. Click on it to create a new dataflow.

Power BI dataflows - Create new dataflow

2) Add new entities
In this step we will add a source for our dataflow. You can map this data to one of the standard Common Data Model (CDM) entities, but you can also create 'custom' entities that are not mapped. The CDM is a standard model, for example for contacts or accounts, to which you can map sources like CRM or Salesforce. It should make development, but also analytics, easier. In a subsequent post we will explain the CDM.

For this example we will not map to the CDM, but create a 'custom' entity. We will use a text file that contains sensor data. We use Azure Blob Storage to store this data, but of course you can also use an internal file share. In that case you first need to install the On-premises data gateway.


As mentioned, we choose "Azure Blobs" as the data source and fill in the connection settings. If you have used this storage account before, it will remember the account key. Choose the Blob folder that contains the data, and now you can transform your data similar to the Power Query Editor in Power BI Desktop.

Power BI dataflows - Create new Entity

3) Edit Query
Just like in Power BI Desktop, you can do data preparation inside an "Edit Query" mode. This Power Query version does not have the full functionality of Power BI Desktop, but the expectation is that it will be extended in upcoming releases. Some limitations of the current Query Editor in dataflows: you cannot change the data types of the columns or use the 'Group By' function.

Power BI dataflows - Power Query Editor

There are also differences between a Pro license and Premium. If you do some basic transformations, such as combining two queries (using Merge or Append queries), it will cause the following warning when using a Pro license:
This dataflow contains computed entities, which require Premium to refresh. To enable refresh, upgrade this workspace to Premium capacity.

Power BI dataflows - Pro license vs Premium

More information about the differences between a Pro license and Premium can be found in the Power BI documentation.

4) Save and use dataflow 
After finishing the data preparation, you must save the new dataflow. Give it a suitable name; after saving, you will be asked to refresh the dataflow now or schedule it for later. You can set up a "Scheduled refresh" the same way as for a dataset.

Power BI dataflow - Save (and schedule refresh)

Open Power BI Desktop, select "Get Data" and now you can use a Dataflow as a source for your report.

Power BI Desktop - Use a dataflow as source

Conclusion
In this post we created our first dataflow, a new (preview) feature of Power BI. Although we are using the first version, which still lacks some basic features, it already looks very promising. We expect (and hope) that the limited Power Query possibilities will soon be aligned with those in Power BI Desktop, making this a very powerful tool.

The main benefit of this new addition is that you don't have to set up and host a separate ETL tool with possibly complex code. Now everything is integrated in one platform. The disadvantages are, besides the limited Power Query options, the lack of versioning and release management.

Will it replace enterprise ETL with tools like SSIS, Azure Data Factory and Azure Databricks? Probably not in the near future. For now it is still self-service ETL, which you could use as a first step towards enterprise ETL. However, Microsoft will likely extend this tool in the coming years, just like they did with Power BI itself. Back in 2014 most people didn't see that as a serious alternative for reporting.

An alternative road map could be integrating Power BI dataflows within your existing BI platform. You can publish (and refresh) the result of your dataflows to Azure Data Lake and then pick up the data, besides with Power BI, with tools such as Azure Databricks and Azure Data Factory. We will explain this in a subsequent post next year.

Monday, 12 November 2018

Power Apps: Add Microsoft Flow to delete records

Case
In a previous post you wrote that Microsoft Flow could help to delete records within my PowerApps app. How do you add a Flow to PowerApps?
PowerApps ♥ Flow

Solution
To add a new Flow to PowerApps we have two options. First, you could go to Microsoft Flow and create a new Flow that starts with a 'PowerApps button' trigger, and then go to PowerApps and use the new Flow (see step 7). The second option is to start from PowerApps and use the menu to create a new Flow. For this example we will start from PowerApps.

The starting position is a table called Job with a primary key called JobId for which we have generated a PowerApps app. We will replace the code of the delete button to use a Flow instead.

1) Create new Flow
Edit your PowerApps app, go to the Action menu and click on Flows. A new pane will appear where you can edit existing Flows for PowerApps or create a new one from scratch. Click on 'Create a new flow'. This will open a new tab with Microsoft Flow.
Create new Flow

2) Rename Flow
It's important to rename the Flow before you save it, otherwise the old default name will show up in PowerApps. You can do this by clicking on the text 'PowerApps button' in the top left of the Flow editor and entering a new name, for example 'DeleteJob'. A suitable, descriptive name will also make it easier to find your Flow and will make your code more readable. You could also include the name of your PowerApps app; this makes it easier to understand the purpose of your Flow in case you have a whole bunch of them.
Rename Flow

3) New Step - SQL Server - Delete Row
The next step is the actual deleting part. Since our source is an Azure SQL Database, we first search for 'SQL Server' and then choose 'Delete row'. If you already have database connections (like the ones in PowerApps), you can reuse them. Click on the three dots menu to change the connection.
Delete Row

Note: If you want to delete multiple records, you should use 'Execute a SQL query' or 'Execute stored procedure'. For this example we will only delete one row at a time.
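As a sketch of that stored procedure approach (the procedure and the Status column are our own invention, matching the Job table from this post):

```sql
-- Hypothetical stored procedure that deletes all jobs with a given status,
-- to be called from the Flow action 'Execute stored procedure'
CREATE PROCEDURE dbo.DeleteJobsByStatus
    @Status NVARCHAR(50)
AS
BEGIN
    SET NOCOUNT ON;
    DELETE FROM dbo.Job
    WHERE [Status] = @Status;
END
```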

4) Rename Step
Now first rename the SQL action. This will give the parameters in the next step better names. Again, click on the three dots menu to change the name of the action, for example 'SQL Delete Job'.
Rename SQL Action

5) Step Parameters
In the Table name property, select the name of the table from which you want to delete records. In the Row id property we will add a parameter that asks PowerApps for the value of the primary key column. If you click on the textbox, you can add an expression. Go to the Dynamic content tab and click on 'See more' in the PowerApps row. This will show the option 'Ask in PowerApps'. Select it and you will see the new expression in the textbox.
Parameters

6) Save and Close
Now we are done with the Flow. Click on the Save button and then close the tab to return to the PowerApps editor. The list of existing Flows will now automatically show the newly created Flow.
Save and close

7) Use new Flow in PowerApps
Select the icon or button where you want to execute the Flow and then add the code: [NameOfYourFlow].Run([PrimaryKeyValue])

It should look something like this:
DeleteJob.Run(BrowseGallery1.Selected.JobId)
Add flow to icon

You could also add Back() to return to the previous screen:
DeleteJob.Run(BrowseGallery1.Selected.JobId);Back()
8) The result
Now it is time to test the newly added delete icon/button.
The generated app with the new delete icon

Conclusion
This post showed you how to add a Microsoft Flow to delete records within your PowerApps app. This is especially useful when you have more than 500 records in your table. The steps are fairly simple, but please keep in mind to use good, descriptive names for the Flow and the steps within the Flow. This will make the code in your PowerApps app more readable.



Wednesday, 31 October 2018

Power Apps Snack: RemoveIf not deleting all records

Case
In my PowerApps app I want to delete records in a table with RemoveIf, but it doesn't remove all records selected with the condition parameter.
Remove record(s) in PowerApps

Solution
This problem occurs when the source table has more than 500 records. Because RemoveIf is a non-delegable function, it will not delegate the delete instruction to the source. Instead, it first retrieves 500 records (that number can be changed) and then deletes only within these 500 records.

There are two solutions:

1) Change Data row limit for non-delegable queries
One option is to change that magic 500 number to, for example, 1000. You can do this in the App Settings under Advanced settings, but there is one caveat when changing it: it could cause performance issues, because the app will first 'download' the 1000 records before it starts deleting.
And what if you have tens of thousands of records, or even millions? That is not something you want to 'download' to your app! So be careful with this magic number.
Data row limit for non-delegable queries

2) Use Microsoft Flow
An alternative is to use Microsoft Flow to delete records in large tables. You can create a new action for your button that executes a Microsoft Flow, which does the deletion via a stored procedure or query.
Adding Microsoft Flow to your PowerApps

There is of course a little downside. The free edition of Flow only allows 750 runs per month, and the Office 365 plan only 2,000 runs per month. Above these numbers you need Flow Plan 1 or 2, with 4,500 and 15,000 runs per month respectively. Or you can buy an additional 50,000 runs per month for USD $40.00. If you are sure you don't need these high numbers, then you have a good, free alternative.


Conclusion
This post showed you two alternative solutions, both with their own pros and cons. There is a third alternative which I haven't tested yet: creating a custom connector. Or you could just add a PowerApps Idea to suggest that Microsoft solve this, and post the URL in the comments below.