Thursday 17 November 2016

Extracting tweets with Microsoft Flow

Case
I want to extract tweets from twitter to my data warehouse without writing code. Is that possible?
Extract #SSIS Tweets













Solution
There are of course several options with traditional ETL tools like SSIS with custom .NET code, a custom twitter tasks or even by calling the Twitter API within R. But Microsoft also recently introduced Microsoft Flow and Microsoft Azure Logic Apps which can connect to various apps and services (like Facebook, Google Drive, GitHub, Twitter, SharePoint, etc.) to collect data, synchronize files or get notifications of certain events, without writing any code! The basics of Flow are free and Logic App is the professional version in Azure with of course some more options.


Solutions:
A) Microsoft Flow
B) Microsoft Azure Logic Apps


1) Microsoft Flow
First go to flow.microsoft.com and login with your Microsoft Account. You can now search for existing templates or create a flow from blank. We will start with a blank flow for this example.
Searching for Twitter templates or use the blank flow


















2) Twitter source
In the blank flow you first need to search for a trigger. In this case we want to search for tweets on twitter. The trigger is 'When a new tweet is posted'.
Blank flow searching for Twitter trigger





















When you select the Twitter trigger you need to setup the Twitter connection and the search text. For this example I will login with my Twitter account and search for tweets with #SSIS.
Connecting to Twitter and setting up search






















3) Filtering annoying job tweets
Of course we want to filter those annoying job tweets. Therefore we need to add a condition. In the condition pane you can select fields from the Twitter 'source' to filter on. In this example we are filtering tweets that contain the word 'job' (probably not the perfect filter).
Add ' job' filter


















4) Add SQL Destination
Now we need to store the tweets in a (Azure) SQL Server database. For this example we created a simple table with all nvarchar(255) fields:
CREATE TABLE [dbo].[Tweets](
    [Name] [nvarchar](255) NULL,
    [TweetText] [nvarchar](255) NULL,
    [TweetTime] [nvarchar](255) NULL
)

Add a new step and search for SQL Server - Insert row. Then enter the credentials from your database, select the table (must be an existing table) and map the Twitter fields to the appropriate table columns.
SQL Destination - Insert row


















5) Create flow
After the last step has been added click on Create flow and then on Done. Now your newly created flow is ready to run. Watch the table!
Create flow


















The result
















Conclusion
Microsoft Flow is very easy to use and you don't need programming skills. There are dozens of apps and services to connect to.
But to keep it for free you can only check once each 15 minutes and only do 750 runs a month. That should be enough for at least on flow, but if that's not enough you can either switch to a premium account which gives you more runs, more checks and even more services to connect to. Or you could switch to professional Microsoft Azure Logic Apps.