Saturday, 31 March 2018

Use Logic Apps for extracting Email Attachments

Case
Sometimes it happens that source files are delivered via email. In that case you could extract these files with for example a Script Task in SQL Server Integration Services (SSIS) or a PowerShell Script, but this requires some serious programming in C#, VB.net or PowerShell . Are there other and easier ways to get email attachments without programming skills?

Azure Logic Apps - Codeless and serverless




















Solution
As we slowly move from on-premises Data Warehouses (DWH) to Azure, we could use other Azure parts to solve this: Azure Logic Apps. With this, you can build automatic workflows without writing code in C# or other programming languages. For example, extract social media data like Twitter tweets or using it for Azure Data Factory V2 notifications. For a lot of BI specialist writing code can be a threshold, so this service offers a way out. We will store these email attachments in Blob Storage. After that, you can load these files into your DWH using SSIS for example.

Starting point of this post is an existing storage account and blob container.

1) Create new Logic App
Go to your Azure portal and type in "Logic Apps" in the search bar Search resources, services and docs. Locate Logic App under Services and click on Add. Give it a suitable name like "ExtractEmailAttachments" and then choose the Subscription, Resource Group and Location. For the Resource Group and this Logic app we use West Europe since we are from the Netherlands.

Azure Portal - Create Logic App














2) Add trigger
When editing the Logic App, we first need to pick a trigger. It is the event that starts this Logic App. In this case, using Azure Data Factory, we pick the HTTP trigger When a HTTP request is received


Logic Apps Designer - Add HTTP trigger














NOTE:
When you open the Logic App for the first time, you can choose several (common) triggers. You can also choose existing Logic Apps templates for known applications or purposes to use it as an example or starting point. 

3) Get emails
Next step is to retrieve the emails. Click on New step and Add an action. Choose the Connector "Office 365 Outlook", search for "Get email" and select Office 365 Outlook - Get emails. The first time that you use this action, you need to login with your Office 365 account. Now you setup this action by choosing the Outlook folder, select only unread messages including the attachments and what subject the email contains (like in the Outlook client). Click on Save in the upper left corner when you are finished.

Logic Apps Designer - Add action Get Emails















4) Filter emails
In this scenario we are receiving various source files per email each day, but we first want to retrieve attachments from emails with a specific subject and store those in a separate blob storage container. For other email subjects we can add more conditions and save all those files in there own blob storage containers. That's why we are looping through the inbox (or another Outlook folder) and filter per subject. Add a new step and choose Add a condition to filter on a specific source file. Now automatically the for each will appear, because we are receiving multiple emails in the previous step. Give the condition a suitable name, because you have to add more conditions to separate the different source files (we only show one condition in this post).

Logic Apps Designer - Add condition














5) Store data in blobs
Now that we have filtered the emails per subject, we must store the attachment (which contains the data) itself. Add an action on the 'if true' side and choose "Azure Blob Storage". Inside this category, choose Azure Blob Storage - Create blob. The first time that you use this action, you need to create a connection. Choose the storage account and give the connection a suitable name. Now you can setup this action by defining the Folder path, Blob name and Blob content. As said earlier, we are retrieving new source files every day. That's why the blob name will contain the day of load. This is the expression:
concat('DWH01_Sales_',
formatDatetime(utcNow(),'yyyy'), '-', formatDatetime(utcNow(),'MM'), '-',
formatDatetime(utcNow(),'dd'),'.csv')

Logic Apps Designer - Create blob














NOTE:
There is no separate action to create a new folder in the Blob Storage container, but it will be created automatically when you save the first file in the container.

6) Send email when succeeded
Every time the data is stored into a new blob, we will confirm this by sending an email. In this case, we will send the email to the same account as step 2 (Get Emails). You can of course send it to developers or administrators in your organization as well. In that case you must create a new connection by clicking change connection, because now this action will automatically use the same Office 365 connection as before. Insert a new step and select "Office 365 Outlook" and choose the action Office 365 Outlook - Send an email. We create the following Body:

Dear User,

The run of DWH01_Sales has completed successfully for March 31, 2018.

Kind regards,
Azure Logic Apps

Therefore we need to use the following expression:
concat('The run of DWH01_Sales has completed successfully for '
, formatDatetime(utcNow(),'MMMMM'), ' ', formatDatetime(utcNow(),'dd'), ', '
, formatDatetime(utcNow(),'yyyy'), '.')

Fill in the Subject and select "To" to send the email To the account used in the Office 365 connection.
Logic Apps Designer - Send email















7) Mark email as read
After we have sent the succeed email, we want to mark the processed emails (Daily Schedule) as read. Insert a new step and select "Office 365 Outlook" and choose the action Office 365 Outlook - Mark as read. Click on Message Id  and choose "Message Id", based on step 2 (Get Emails). You have to click on "See more" to make it appear in the list.

Logic Apps Designer - Mark as read
















8) Move email to archive
Finally, we will move the emails to an archive folder. We created separate mail folders in Outlook for each subject/source file. Insert a new step and select "Office 365 Outlook" and choose the action Office 365 Outlook - Send an email. Choose the Message Id  and select the specific sub Folder in archive.


Logic Apps Designer - Move email















Result
Now let's see if it all works. Make sure you have sent the email "DWH01_Sales DAILY SCHEDULE" and it is unread. Click in the Logic Apps Designer on Run. Now wait for the succeeded email...it works!


Result - Received succeeded email
















The source file is also stored into a new blob. See below the result in Azure Storage Explorer.

Result - File stored as new blob
















Summary
In this post we showed you how to build a Logic App to ingest email attachments as source files. All without programming skills and easily to maintain easily as part from an ETL process.

Click here to see how to execute this Logic App with Azure Data Factory V2 as part of an DWH solution in Azure.

This is why we started the Logic App with an HTTP trigger that can be called from other applications. If you do not want to integrate the Logic App and use it as a separate solution, then you should start with a different trigger. For example a Recurrence trigger or When a new email arrives

Thursday, 1 March 2018

Add email notification in Azure Data Factory V2

Case
I am running SSIS packages in Azure Data Factory (ADF V2), but I want to get an email notification when my package execution fails. It seems that ADF V2 doesn't have a built-in email notification option. How can I be notified without checking the built-in pipeline monitor in ADF?
Email notification on failure

















Solution
For this solution we will be using a Logic App to send an email and trigger it, if an error occurs in ADF. The starting point of this blog post is a working pipeline that executes an SSIS package using a stored procedure.
Data Factory loves Logic App











So, the solution exists of two parts: Logic App (email) and ADF (error handling). The communication between these two Azure parts is done with a JSON message via an HTTP request (post). The JSON message contains the name of the Data Factory and the pipeline that failed, an error message and an email address. You could of course hardcode the email address in Logic Apps, but now you can reuse the Logic App for various pipelines or data factories and notify different people.
{
    "properties": {
        "DataFactoryName": {
            "type": "string"
        },
        "PipelineName": {
            "type": "string"
        },
        "ErrorMessage": {
            "type": "string"
        },
        "EmailTo": {
            "type": "string"
        }
    },
    "type": "object"
}


a) Logic App
We first start with creating the Logic App. We need it before creating the error handler in ADF.

a1) Create new Logic App
Click on Create a resource and locate Logic App under Enterprise Integration. Pick a descriptive name like "ADF-Notifications" and then choose the Subscription, Resource Group and Location. For the Resource Group and this Logic app we use West Europe since we are from the Netherlands.
Create new Logic App




















a2) HTTP Trigger
When editing the Logic App we first need to pick a trigger. It is the event that starts this Logic App. Pick the HTTP trigger When a HTTP request is received and then click on edit to specify the parameters. Paste the JSON message from above in the textbox. In the next step we can use these parameters to setup the email.
Add HTTP trigger














a3) Send an email
Add a new step and choose Add an action. Search for "Send an email" and then scroll down to Office 365 Outlook - Send an email. The first time that you use this action you need to login with your Office 365 account. Now you can setup the email with fixed texts mixed with parameters from the JSON message from the previous step. When you are satisfied with the email setup, click on Save in the upper left corner.
Add action to Send an email














a4) Copy URL from HTTP trigger
The Logic App is ready. Click on the HTTP trigger and copy the URL. We need this in ADF to trigger the Logic App.
Copy the URL for ADF




















b) Data Factory
Next we will add a new activity in the existing ADF pipeline to trigger the new Logic App.

b1) Add Parameter
To specify the email address of the recipient we will use a pipeline parameter. Go to your existing pipeline (do not select any of the activities in it) and go to the Parameters page. Click on New and add a new String parameter called EmailTo. Add the email address of the recipient in the default value field. You can override the default value when editing the trigger (after save).
Add pipeline parameter













b2) Add Web activity
Next collapse the General activities and drag a Web activity to the canvas. Make sure to give it a suitable name like Error Notification. Add an Activity Dependency (Similar to the Precedence Constraints in SSIS) between the Stored Procedure activity and the Web activity. When right clicking it you can change it to Failure.
Add Web activity















b3) Web activity settings
Select the newly added Web activity and go to the Settings page. In the URL field you must paste the URL from step a4 and as method you need to select Post.
Next step is to add an new header with a JSON message. The header is called Content-Type and its expression is application/json. As body you need to add the following JSON message, but make sure to change the name of Stored Procedure activity. Ours is called Execute Package. The first two items are retrieving the Data Factory name and Pipeline name. The last one is getting the value of the parameter created in step b1.
{
    "DataFactoryName":
        "@{pipeline().DataFactory}",
    "PipelineName":
        "@{pipeline().Pipeline}",
    "ErrorMessage":
        "@{activity('Execute Package').error.message}",
    "EmailTo":
        "@{pipeline().parameters.EmailTo}"
}
.
Add URL and json to Web activity













b4) Testing
Now it is time to test the pipeline. Make sure something is failing in the package. For example by changing a servername or password in the SSIS environment. Or you could just pause your Integration Runtime and run the trigger. Now wait for the email to arrive.
Email notification received













The solution has one downside! Because you are handling the error with an Activity Dependency the entire pipeline succeeds despite of the failing SSIS stored procedure. Check the image below. The last 4 jobs did fail, but show the Status 'Succeeded'. Though there is an error message.
Failed or Succeeded?











b5) Add fail
If you want the correct status when the SSIS stored procedure fails then copy and paste the existing Stored Procedure activity, rename it to for example 'Fail' and replace SQL code with the code below. Then connect the Web activity to this new activity.
--T-SQL Code
Declare @err_msg NVARCHAR(150)
SET @err_msg=N'Error occurred, email was sent'
RAISERROR(@err_msg,15,1)
Add fail













Now we have a Failed status for a failing pipeline. Please leave a comment when you have a better or easier solution for this.
Status Failed












Summary
In this post we showed you how to use a Logic App to send you an email notification in case of a failing pipeline in Azure Data Factory. This saves you a daily login to the Azure portal to check the pipelines monitor. Feel free to adjust the JSON message to your own needs. You could also add an additional notification for successful jobs.

Update dec 1: instead of Office 365 you can now also use SendGrid