Thursday, 17 November 2016

Extracting tweets with Azure App Service - Logic App

Case
I want to extract tweets from twitter to my data warehouse without writing code. Is that possible?
Extract #SSIS Tweets













Solution
There are of course several options with traditional ETL tools like SSIS with custom .NET code, a custom twitter tasks or even by calling the Twitter API within R. But Microsoft also recently introduced Microsoft Flow and Microsoft Azure Logic Apps which can connect to various apps and services (like Facebook, Google Drive, GitHub, Twitter, SharePoint, etc.) to collect data, synchronize files or get notifications of certain events, without writing any code! The basics of Flow are free and Logic App is the professional version in Azure with of course some more options.

Solutions:
A) Microsoft Flow
B) Microsoft Azure Logic Apps

1) Microsoft Azure Logic Apps

First go to portal.azure.com, login and create a new Logic App (located under Enterprise Integration). The name of our Logic App is called SSISTweets (you have to choose a different name) and we use West-Europe as the location of the Resource Group since we life in the Netherlands. When you click on the create button the Logic App will be created and you can start editing. You can choose between various default templates or you can create a Blank LogicApp.
Creating a new Logic App

















If you have also tried Microsoft Flow, you will probably notice that there are less templates to choose from. We will choose the Blank LogicApp for this example.

2) Twitter source
In the blank LogicApp you first need to search for a trigger. In this case we want to search for tweets on twitter. The trigger is 'When a new tweet is posted'.
Blank LogicApp searching for Twitter trigger


















When you select the Twitter trigger you need to setup the Twitter connection and the search text. For this example I will login with my Twitter account and search for tweets with #SSIS. You can also change the frequency and interval.
Connecting to Twitter and setting up search


















3) Filtering annoying job tweets
Of course we want to filter those annoying job tweets. Therefore we need to add a condition. In the condition pane you can select fields from the Twitter 'source' to filter on. In this example we are filtering tweets that contain the word 'job' (probably not the perfect filter).
Add 'job' filter


















4) Add SQL Destination
Now we need to store the tweets in a (Azure) SQL Server database. For this example we created a simple table with all nvarchar(255) fields:
CREATE TABLE [dbo].[Tweets](
    [Name] [nvarchar](255) NULL,
    [TweetText] [nvarchar](255) NULL,
    [TweetTime] [nvarchar](255) NULL
)

Add a new step and search for SQL Server - Insert row. Then enter the credentials from your database, select the table (must be an existing table) and map the Twitter fields to the appropriate table columns.
SQL Destination - Insert row


















5) Save LogicApp
After the last step has been added click on Save to save your work and then on Done. Now your newly created flow is ready to run.
Saving the LogicApp

















Now click on the run button and watch the table!
Run LogicApp


















The result
















Conclusion
Microsoft Logic App is very easy to use and you don't need programming skills. There are dozens of apps and services to connect to. If you only need one Logic App and a 15 minute refresh is enough then you could also consider using Microsoft Flow which is free of charge.