Sunday 22 March 2020

Build dynamic pipelines in Azure Data Factory

Case
I want to create multiple Azure Data Factory (ADF) pipelines that are using the same source and sink location, but with different files and tables. How can we do this in ADF when taking extensibility and maintainability of your pipelines into account?














Solution
Let's say we have a scenario where all the source files are stored into our Azure Storage, for example Blob Storage. We want to load these files into a traditional Data Warehouse (DWH) using an Azure SQL Database, that contains a separate schema for the Staging layer. Earlier we showed you how you can do this using a Foreach Loop in one single pipeline. For this scenario we will not this because:
  • Not loading all Staging tables every day on the same time. 
    • Sometimes it will only load 3 of the total 5 files, because not all the files are coming for a certain day and can be different per day. 
    • Load can be spread during the day, several moments.
  • Be more flexible per Staging load.
    • In most cases you want to have the same Staging load for every table, but sometimes transformations can be different or you need an extra table to prepare the dataset.
  • The use of technical columns in Staging table.
    • In most cases you want to store some metadata, for example "inserted date" and "inserted by". When using a "Copy Data Activity", you have to configure the mapping section when the source and sink fields are not equal. 
One of the solutions is building dynamic pipelines. This will be a combination of parameters, variables and naming convention. The result will be a dynamic pipeline, that we can clone to create multiple pipelines using the same source and sink dataset. In this blog post we will focus on the Staging layer. Of course you can implement this for every layer of your DWH.

Note:
When working in a team, it is important to have a consistent way of development and have a proper naming convention in place. This contributes to the extensibility and maintainability of your application. Click here for an example of naming convention in ADF.

Pipeline

For this solution, we will create a pipeline that contains the following activities:
  • Copy Data activity (load data from source to sink)
  • Stored Procedure activity (update technical columns)
The source will be a CSV file and is stored in a Blob container. The data is based on the customer sales table from WideWorldImporters-Standard. The file will be delivered daily.

Storage account
bitoolssa
Blob container
sourcefiles
Blob folder
WWI
File name*
SalesCustomers20200322.csv
* this is the blog post date, but it should be the date of today

Note:
Blob storage containers only have virtual folders which means that the folder name is stored in the filename. Microsoft Azure Storage Explorer will show it as if it are real folders.

1) Linked Service

Open the ADF portal, go to Connections - Linked services and and click on New. Select Azure Blob Storage and give it a suitable (generic) name. Make connection with your storage account. Create another Linked service for Azure SQL Database, because that will be our destination (sink). Click here how to make connection using Azure Key Vault.
ADF portal - Create Linked Service























2) Dataset (source)

Click New dataset and select Azure Blob Storage. The format is DelimitedText. Give it a suitable (generic) name and select the Linked Service for Blob Storage that we created earlier. 

Once you click 'OK', it will open the dataset automatically. Go to Parameters and add the following:
  • Container -> type "String"
  • Directory -> type "String"
  • File -> type "String"
  • ColumnDelimiter -> type "String"
Go to Connection and now use the applicable parameters to fill File path. You can apply dynamic content for each setting. For now, we also added a parameter for "Column delimiter". At last, we use First row as header.

Click Publishing to save your content locally. For now we did not configure a Git Repository, but of course we recommend that.
ADF portal - Create Source Dataset
























Note:
Use Preview data to verify if the source input is as expected by filling in the correct parameters values.

3) Dataset (sink)

Create another dataset for the sink. Choose Azure SQL Database and give it a suitable (generic) name. Select the Linked Service for Azure SQL Database that we created earlier and click 'OK'.

Add the following Parameters:
  • SchemaName (String)
  • TableName (string)
Go to Connection and click Edit. Fill in both parameters using dynamic content. 
ADF portal - Create Sink Dataset














4) Pipeline - Copy data

Create a new pipeline and include the schema and table name in the name, like "PL_Load_Stg_Customer". Go to Variables and add the following including values:
  • Container -> type "String" and value "sourcefiles"
  • Directory -> type "String" and value "WWI"
  • File -> type "String" and value "SalesCustomers"
  • ColumnDelimiter -> type "String" and value ","
Add the Copy data activity, go to Source and use dynamic content to assign the variables to the input parameters. For the file, we use also expression language to retrieve the correct name of the file dynamically: "@{variables('File')}@{formatDateTime(utcnow(), 'yyyyMMdd')}.csv"

Go to Sink and fill in the schema and table name. We use the SPLIT function to retrieve this from the pipeline name. We use a Pre-copy data script to truncate the table before loading.

At last, go to Mapping and click on Import schemas. This will automatically map the columns with the same names (source and sink). We will remove the columns that do not exists in the source, in this case our technical columns "InsertedDate" and "InsertedBy". We will fill those columns in the next activity.

ADF portal - Create Copy data activity
















5) Pipeline - Stored Procedure

Add the Stored Procedure activity and give it a suitable name. In SQL Account, select the Linked service for Azure SQL Database that we created earlier. We created a SP, that contains dynamic SQL to fill the columns "InsertedDate" and "InsertedBy" for every Staging table. See code below.

/*
==========================================================================================================================
Stored Procedure [dbo].[uspPostLoadStaging]
==========================================================================================================================
Description:

This query will load the following columns:
- InsertedDate
- InsertedBy

In ADF, without using Data Flows (Mapping), you can combine a Copy data activity with a Stored Procedure in order to
fill those (technical) columns during execution of the pipeline. In SSIS this was done by the Derived Column task. 

==========================================================================================================================
Parameter     Parameter description
--------------------------------------------------------------------------------------------------------------------------
@InsertedDate    Date when the data was inserted into the Staging table.
@InsertedBy     Name of a service / account that has inserted the data into the Staging table.
==========================================================================================================================

==========================================================================================================================
Change history

Date  Who      Remark
--------------------------------------------------------------------------------------------------------------------------
2020-03-22 Ricardo Schuurman  Intial creation.
==========================================================================================================================
*/

CREATE PROCEDURE [dbo].[uspPostLoadStaging]
 @SchemaName   AS NVARCHAR(255)
, @TableName   AS NVARCHAR(255)
, @InsertedDate  AS DATETIME
, @InsertedBy   AS NVARCHAR(255)

AS
BEGIN

 SET NOCOUNT ON;
 BEGIN TRY

  DECLARE
   @QueryStep1    AS NVARCHAR(MAX)
  , @QueryStep2    AS NVARCHAR(MAX)

  -- ERROR variables
  , @errorMsg    AS NVARCHAR(MAX) = ''
  , @errorLine    AS NVARCHAR(MAX) = ''

  /* Example values of SP parameters
  , @SchemaName   AS NVARCHAR(255)
  , @TableName   AS NVARCHAR(255)
  , @InsertedDate  AS DATETIME
  , @InsertedBy   AS NVARCHAR(255)

  SET @SchemaName   = N'Stg'
  SET @TableName   = N'Customer'
  SET @InsertedDate  = GETDATE()
  SET @InsertedBy   = N'Test'

  */

  /* ================================================================================================================
   Step 1: Extract schema and table name (based on the ADF pipeline naming convention)
   ================================================================================================================ */

  -- Add LIKE condition for schema
  SET @SchemaName = '%' + @SchemaName + '%'

  SELECT 
   @QueryStep1 = '[' + [TABLE_SCHEMA] + '].[' + [TABLE_NAME] + ']'
  FROM [INFORMATION_SCHEMA].[TABLES]
  WHERE TABLE_TYPE = 'BASE TABLE' 
  AND TABLE_SCHEMA LIKE @SchemaName
  AND TABLE_NAME = @TableName

  /* ================================================================================================================
   Step 2: Execute Update statement using the fixed (technical) columns
   ================================================================================================================ */

  SET @QueryStep2 = N'
   UPDATE ' + @QueryStep1 + '
   SET 
    [InsertedDate]  = ''' + CONVERT(NVARCHAR(30),@InsertedDate, 120) + '''
   , [InsertedBy]  = ''' + @InsertedBy + ''';
  '

  EXEC SP_EXECUTESQL 
   @Query = @QueryStep2

 END TRY

  -------------------- Error Handling  --------------------

  BEGIN CATCH 

    SELECT 
     @errorMsg = ERROR_MESSAGE()
    , @errorLine = ERROR_LINE()

    SET @errorMsg = 'Error Occured with the following Message ' + @errorMsg + 'Error Line Number  '+ @errorLine;

    THROW 50001, @errorMsg, @errorLine;

  END CATCH

END


Create the SP in the database, go to Stored Procedure and select the SP. Click Import parameter and fill the parameters. We use the System variables 'Pipeline Name' and 'Pipeline trigger time' for "InsertedDate" and "InsertedBy". Reuse the values of "SchemaName" and "TableName" from the sink (copy data activity).
ADF portal - Create Stored Procedure activity


















Note:
Instead of using a Stored Procedure with an update query, you can also use the metadata columns of ADF itself. More information here.

Result
Execute the pipeline in ADF (by clicking Debug) and check the result in SQL Server Management Studio. It works!












Note:
In case of a "SqlFailedToConnect" error, make sure the firewall of the Azure SQL Database allows the Integration Runtime of ADF to access. Go to the Azure portal or use sp_set_firewall_rule on the  database to create a firewall rule. 


Summary
In this blog post you saw how you can create a dynamic pipeline using parameters and variables in combination with a proper naming convention in ADF, for example the names of your pipelines. This way of working can contribute to a consistent solution and code. In case of a new Staging load table, clone the existing pipeline and use the schema and table name in the pipeline name. You only have to change the mapping in the copy data activity.

To make your pipelines even more dynamic, you can also use metadata that is stored in a database instead of ADF itself and retrieve it using the Stored Procedure activity. 

Saturday 21 March 2020

Embed a Power BI tile in Power Apps with filters

Case
I want to include a Power BI tile in my Power App. Is it possible to manipulate the Power BI tile with data from the Power App?
Power BI tile in Power Apps with filter

























Solution
Power Apps has its own charts. So why would you want to use a Power BI tile in your Power App in the first place? The charts from Power Apps are pretty fast, but since Power Apps is not a reporting tool the functionality is quite basic. The two major disadvantages are:

  • The charts don't have options in the GUI for sorting, filtering or grouping. So either your Data Source is already prepared or you to do that with functions in Power Apps (Filter, GroupBy, SortByColumns, etc.). Not ideal, especially not for large datasets.
  • The charts are very basic (Column, Line or Pie chart). You could combine a column and line chart by creating two separate charts and then lay the line chart over the column chart. Again workable, but not ideal.


Let's see how you can add a Power BI tile to your Power App and what you have to do to make it more dynamic with filters.

1) Pin visual to dashboard
Power Apps can only add Power BI visuals that are pinned to a dashboard. Go to your Power BI report on PowerBI.com and click on the little pin icon above the Visual that you need in Power Apps. After that specify to which existing or new Power BI Dashboard you want to publish it.
Pin your visual to a dashboard

























2) Add Visual to Power Apps
In the Insert menu of Power Apps you see the Charts submenu in which you will find the option to add a Power BI tile. Adding it is very straight forward. Just select the right Workspace, Dashboard and then the Tile.
Adding a Power BI tile to Power Apps


















3) Adding dynamic filters
If you go to the properties of the new Power BI tile in Power Apps you will find the TileUri property. This is the URL of your Power BI tile which you can extend with filters. The standard URL looks something like:
"https://app.powerbi.com/embed?dashboardId=11111111-aaaa-1234-aaaa-111111111111&tileId=11111111-aaaa-1234-aaaa-111111111111&groupId=11111111-aaaa-1234-aaaa-111111111111&config=abc123abc123abc123abc123abc123abc123abc123abc123abc123abc123abc123abc123abc123abc123abc1230%3d"
The DATA TileUri




















You can simply add a filter to this URL by adding an extra parameter to the querystring. Let's say we have a field called Department in the table FactInternetSales and we want to filter on the value 'Amsterdam'.
The field on which we want to filter

























Before the last double quote of the TileUri you must add &filter=[tablename]/[columnname] eq '[value]'. The result looks like:

"https://app.powerbi.com/embed?dashboardId=11111111-aaaa-1234-aaaa-111111111111&tileId=11111111-aaaa-1234-aaaa-111111111111&groupId=11111111-aaaa-1234-aaaa-111111111111&config=abc123abc123abc123abc123abc123abc123abc123abc123abc123abc123abc123abc123abc123abc123abc1230%3d&filter=FactInternetSales/Department eq 'Amsterdam'"

Note that filter is in lowercase otherwise it won't work. For more OData like filter examples see the documentation of Power BI filters. A great tip is to first test the query string filter in Power BI itself and then copy and paste it to Power Apps. After that you can replace the hard-coded value 'Amsterdam' by a value from for example a drop drown.
Adding a Drop Down to use it in the filter

























With the drop down the URL looks like:

"https://app.powerbi.com/embed?dashboardId=11111111-aaaa-1234-aaaa-111111111111&tileId=11111111-aaaa-1234-aaaa-111111111111&groupId=11111111-aaaa-1234-aaaa-111111111111&config=abc123abc123abc123abc123abc123abc123abc123abc123abc123abc123abc123abc123abc123abc123abc1230%3d&filter=FactInternetSales/Department eq '" & ddDepartment.Selected.Department & "'"
If your table name or field name contains spaces, dashes or non-ansi characters you need to replace those with special characters in your query string.

4) The result
Now you can hit the play button and test the dynamic filter. It works very well, but it isn't super fast. Therefore you shouldn't have more than three Power BI tiles loaded at the same time. With the property LoadPowerBIContent you can control which tile is loading.

When clicking on the Power BI tile in your Power App the user will be redirected to Power BI online to do some more in-depth analysis. If you don't want that you need to set the PowerBIInteractions property to false (default is true).
Testing the dynamic filter

























Conclusion
In this post you learned how to use Power BI tiles, that are published to a dashboard, in Power Apps. Very straight forward to use and they have way more options than the built-in charts. One last point for attention. If you use these Power BI tiles then make sure that your Power Apps users have a Power BI licence and are authorized to the used tiles.

Thursday 19 March 2020

Power Apps Snack: Loop through combobox items

Case
I want get the product ID's of all selected items from my combobox in Power Apps, but the DisplayFields property of the combobox only shows the product name. How do I get the corresponding ID's?
Mulitselect ComboBox in Power Apps






















Solution
You can loop though the collection of selected items of your combobox with the SelectedItems function. Below you will find a couple of code examples. For this example we are using a table called MyProducts with data from Adventure Works.
Table with sample data
















Loop through selection
If multi select is turned off (basicly a DropDown with a search option) then you can use the Selected function, just like for the ComboBox.
ComboBox_singleselect.Selected.ProductKey

When multiselect is turned on, you must use SelectedItems. Getting a comma separated list of ProductKeys (instead of ProductNames) can be done with the following script. To test it you could add a label with this script in the Text property to show the result:
Concat(
    ComboBox_multiselect.SelectedItems.ProductKey,
    Concatenate(
        Text(ProductKey),
        ","
    )
)

This will result in a string with "PrdKey1,PrdKey3,PrdKey3," If you want to get ride of the last comma you can use a left: LEFT(mystring, LEN(mystring) - 1):
Left(
    Concat(
        ComboBox_multiselect.SelectedItems.ProductKey,
        Concatenate(
            Text(ProductKey),
            ","
        )
    ),
    Len(
        Concat(
            ComboBox_multiselect.SelectedItems.ProductKey,
            Concatenate(
                Text(ProductKey),
                ","
            )
        )
    ) - 1
)

Or if you want to keep your code a little cleaner then you could first store the result in a variable and then use the LEFT on the variable. To test this add the UpdateContext to the OnChange of the ComboBox and the Left to the Text property of your label.
// Store concat with trailing comma in variable
UpdateContext(
    {
        SelProds: Concat(
            ComboBox_multiselect.SelectedItems.ProductKey,
            Concatenate(
                Text(ProductKey),
                ","
            )
        )
    }
);

// Remove last comma
Left(SelProds, Len(SelProds) - 1)
Update: better solution to not repeat the code is by using the WITH function

Get specific item from alle selected items
You can also get specific items from the selection.
// First item
First(ComboBox_multiselect.SelectedItems.ProductKey).ProductKey

// Second item
First(
    LastN(
        ComboBox_multiselect.SelectedItems.ProductKey,
        CountRows(ComboBox_multiselect.SelectedItems.ProductKey) - 1
    )
).ProductKey

// Thirth item
First(
    LastN(
        ComboBox_multiselect.SelectedItems.ProductKey,
        CountRows(ComboBox_multiselect.SelectedItems.ProductKey) - 2
    )
).ProductKey

To keep your code a little cleaner you could first store the selection in a collection in the OnChange event of the ComboBox.
// Store selection in collection
ClearCollect(
    ProdSel,
    ComboBox_multiselect.SelectedItems.ProductKey
);

// First item
First(ProdSel).ProductKey

// Second item
First(
    LastN(
        ProdSel,
        CountRows(ProdSel) - 1
    )
).ProductKey

// Thirth item
First(
    LastN(
        ProdSel,
        CountRows(ProdSel) - 2
    )
).ProductKey


Maximize selection
An other need trick is that you could maximize the number of selected items by adding the following script to the OnChange of the ComboBox.
If(
    CountRows(ComboBox_multiselect.SelectedItems) > 4,
    Notify(
        "You reached the maximum of 4 items",
        NotificationType.Error,
        2000
    )
)

Or you could combine that with filling a boolean variable that you can use to disable/hide the save button
If(
    CountRows(ComboBox_multiselect.SelectedItems) > 4,
    UpdateContext({ValidationError: false});
    Notify(
        "You reached the maximum of 4 items",
        NotificationType.Error,
        2000
    ),
    UpdateContext({ValidationError: true})
);

Conclusion
In this code snippet post you learned how to retrieve the selected items from a Power Apps ComboBox. When multi select is turned on you should use SelectedItems instead if Selected. This returns a collection of all selected Items from the ComboBox.

Bug: the formulas only work with string columns. Solution is to add the integerfield to the displayfields of the ComboBox.
Add non-string fields to DisplayFields

Thursday 12 March 2020

Power Apps: Get next item from gallery

Case
I am using a gallery on my overview page to see all records and on my details page I want a button to go to the next item from my gallery without first going back to my overview page.
Generated App with dynamic Next button
















Solution
For this example we will use a generated PowerApp based on a product table that came from the AdventureWorks database.

In this generated app you know which item is selected in the product gallery, but you don't know which product is next because there is no expression for this. To solve this we will:
  1. Create a new collection based on the gallery with only the key column (to keep it small). 
  2. Then create a copy of that collection with an extra column that contains an auto-increment (auto-number).  
A preview of both collections


















This second collection can then be used to query the successive product row for a specific product key. The product key of this successive product row can then be used to retrieve all data from that product to display it in the Display form.

1) Pass through product key via variable
First step is to slightly change the navigation command on the overview page. We will pass on a variable that contains the active product key to the details page. In the OnSelect action of the gallery we will change the existing code with an extra line:
// Old code
Navigate(
    DetailScreen1,
    ScreenTransition.None
)

// New code (with the variable)
Navigate(
    DetailScreen1,
    ScreenTransition.None,
    {MyProductKey: BrowseGallery1.Selected.ProductKey}
)
Change navigate in OnSelect of gallery



















2) Change item code on details page
On the details page where you see all columns/attributes of the product we need to change the Item code of the Display form. The standard code retrieves the selected item from the gallery on the overview page, but we will use the variable from step 1 instead.
// Old code
BrowseGallery1.Selected

// New code
// Get first row from dataset 'MyProducts'
// that is filtered with our new variable
First(
    Filter(
        '[dbo].[MyProducts]',
        ProductKey = MyProductKey
    )
)
Change Item from Display form



















3) Add Next button
The last part of this solution is to add a button and a lot of code for its OnSelect action. Adding the rownumber to the collection can be done in a couple of ways, but this flexible code is very well documented by powerappsguide.com.
// Create new collection with only the key column
// from the gallery (key=ProductKey)
ClearCollect(
    MyProductGallery,
    ShowColumns(
        BrowseGallery1.AllItems,
        "ProductKey"
    )
);


// Create new collection based on first
// collection with one extra column that
// contains an auto increment number "MyId"
Clear(MyNumberedProductGallery);
ForAll(
    MyProductGallery,
    Collect(
        MyNumberedProductGallery,
        Last(
            FirstN(
                AddColumns(
                    MyProductGallery,
                    "MyId",
                    CountRows(MyNumberedProductGallery) + 1
                ),
                CountRows(MyNumberedProductGallery) + 1
            )
        )
    )
);


// Retrieve next item from gallery and
// store its id (ProductKey) in a variable
// called 'MyNextProductKey'
// Steps:
// 1) use LookUp to retrieve the new id (MyId) from the new collection
// 2) use that number to only get records with a higher number by using a Filter
// 3) get First record from the filtered collection
// 4) get ProductKey column and store it in a variable
UpdateContext(
    {
        MyNextProductKey: First(
            Filter(
                MyNumberedProductGallery,
                MyId > LookUp(
                    MyNumberedProductGallery,
                    ProductKey = MyProductKey,
                    MyId
                )
            )
        ).ProductKey
    }
);


// Only when MyNextProductKey is not empty
// change the value of MyProductKey which
// will result in retrieving new data to
// the Display form.
// An empty value only occurs for the last
// record. The If statement could be extended
// with for example navigating back to the
// overview page when it is empty:
// Navigate(BrowseScreen1, ScreenTransition.None)
If (
    !IsBlank(MyNextProductKey),
    UpdateContext({MyProductKey: MyNextProductKey})
)
Adding a next button to the details screen



















For adding a previous button you just need to change the filter to 'smaller then' and then select the last record instead of the first.
UpdateContext(
    {
        MyPrevProductKey: Last(
            Filter(
                MyNumberedProductGallery,
                MyId < LookUp(
                    MyNumberedProductGallery,
                    ProductKey = MyProductKey,
                    MyId
                )
            )
        ).ProductKey
    }
);

4) The Result
Now lets start the app add test the Next button. The nice part is that you can filter or sort the gallery anyway you like and the next button will still go to the next record from that gallery.
Test the new next button


























Summary
In this post you learned how to retrieve the next (and previous) item from a gallery. You could also add this code to a save button to create a save-and-edit-next-record action which saves the user a few extra clicks by not going back to the overview page to find the next record to edit.

For an approve and go to next record button where your gallery is getting smaller and smaller each time you approve a record you could also do something very simple in your form to get next item
// Old code
BrowseGallery1.Selected

// Alternative code
First(BrowseGallery1.AllItems)