Thursday, 29 September 2016

IoT Adventure: 5b - Stream Analytics for Azure SQL Database

Case
Your sensors are connected with an IoT Hub and is generating data. In our previous post we send the real-time data to an Power BI dashboard. What are the other possibilities in Azure with this data?

Solution
In our previous post we distinguish two streams for our data: Cold path and Hot path. In this case we store the data in a Azure SQL Database, which is a form of the Cold path. See here for the full list of Azure SQL Databases (size and prices) where you can choose from. The reason to store the data may be, for example, to analyze the data or to prepare a dataset as input for your Machine Learning experiment/model. Just like the Hot path, we are setting up a (separate) Stream Analytics Job for this. You can have multiple Outputs in one Job, for example real-time Power BI and SQL Database, but when you want to edit the query for the data to the database, you must stop the Job and also your real-time data is not sent. Before we create the job, we first set up the Azure SQL Server and after that the SQL Azure Database. The reason for this is that we want to select the database by the Output of the Stream Analytics Job and therefore we need to create it first (along with the server).

Cold path with Stream Analytics









1) Create the Azure SQL Server
Go to the Azure portal and click on 'More services' on the bottom of the menu (left-hand side of the screen) and search for SQL server. When you have opened it, click on 'Add' to create a new Stream Analytics Job. Perhaps you noticed that in our previous post instead of 'Browse' now 'More service' stands. The portal is still in development so there are regular updates.

Azure Portal - Create SQL Server















Now you can fill in your Server name (the name cannot contain spaces). Next we create a SQL Server login and this is our Server admin. You can also use Azure Active Directory (user or group) for this. Click here for more information. The Subscription is filled automatically. After this we choose a Resource group. For the convenience we choose the Resource group we have created earlier by setting up the IoT Hub, but you can also create a general Resource group so the server can be used for purposes other than IoT. Otherwise the server has the same lifecycles, permissions and policies as the IoT Resource group. Our Location is the Netherlands, so we choose West-Europe.

Azure Portal - Create SQL Server (continuation)















Tip:
When the deployment of the server succeeded, the server must appear in the list of SQL Servers. If not, you must click on the 'Refresh' button at the top under SQL Servers.

2) Create the Azure SQL Database
Next we create the database. In your Azure portal click on 'More services' and search for SQL Database. When you have opened it, click on 'Add' to create a new database.

Azure Portal - Create SQL database














Choose a name for your database. The Subscription is filled automatically and next we choose for the same Resource group as earlier by setting up the SQL Server. Select 'Blank database' (new database) and choose the SQL Server that you have created earlier. If you don't choose a server, Azure creates automatically one. That is the reason why we set up the server first, because maybe you want to create one server (with a general name) and to attach here multiple databases. Otherwise you have a separate server for every database. That can also be a conscious choice off course, but that is not what we want in this case. At last we choose the 'Basic' database, but here you can choose the size that fits your needs. The Collation is default.

Azure Portal - Create SQL database (continuation)














Tip:
If you decide to add in Management Studio (once you have connected) a new database, be aware of the fact that Azure creates default the S3 (Standard) version of a database. Therefore, you should always create a new database in the Azure portal, so that you can choose the right size and price.

3) Connect to SQL Server and create a table
Once the database is created, you can connect to the SQL Server in Management Studio. In this case our Server name is 'bitools.database.windows.net,1433'. As you can see the name includes the default port of 1433 (this is the only port on which the service is available). Next you choose 'SQL Server Authentication' and fill in the login and password that you have created earlier by setting up the SQL Server. The first time you must Sign In with your Azure account. This is also the case when you have not made connection to the server for a while. At last you must add your client IP for access to the server. Your IP is now added to the firewall.

SQL Server Management Studio - Connect to Azure














Before we create the Stream Analytics Job, we must do one last thing and that is create a table where we can store the sensor data. I have created the following table in the database:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
CREATE TABLE [dbo].[sensorData](
 [SensorName] [nvarchar](max) NULL,
 [MeasurementCount] [bigint] NULL,
 [MeasurementTime] [datetime] NULL,
 [Temperature] [float] NULL,
 [Humidity] [float] NULL,
 [Pressure] [float] NULL,
 [Altitude] [float] NULL,
 [Decibel] [float] NULL,
 [DoorOpen] [bigint] NULL,
 [Motion] [bigint] NULL,
 [Vibration] [bigint] NULL,
 [Illumination] [float] NULL
)

As you can see I choose for float as datatype, because the standard data types in Azure are floats. This means that input datatypes such as decimal and numeric are converted to floats.

Tip:
You can manage the firewall in the Azure portal. Go in the portal to your server, click on it and under settings you will find 'Firewall'. Here you can add Client IP's to allow connection to the server. Note that this can only be done by an user who have the role of 'Owner'. In this case I created the server so I'm automatically owner of the server.


4) Create the Stream Analytics Job
In your Azure portal click on 'New'. Type in 'Stream Analytics Job' and click on it. Next you click on the result, in this case only Stream Analytics Job. After that, you can click on the 'Create' button.

Azure Portal - Create Stream Analytics Job (extensive)














Tip:
In our previous post we create a new Stream Analytics Job on a faster and different way. This way is extensive and gives some general information about, in this case, a Stream Analytics Job. So if you want more information about a feature in Azure before you create it, this is a useful way. 

Now you can fill in your Job name (the name cannot contain spaces) and the Subscription is filled automatically. Next choose a Resource group. These groups are made by setting up the IoT Hub. Click here for more information and how you create it. When you have created this, it appears in the list of 'use existing' and you can choose this one. Our Location is the Netherlands, so we choose West-Europe. At last you can pin your Job right away to your dashboard, with the checkbox at the bottom. You may have noticed at the first screenshot that I have already pin the previous Stream Analytics Job to my dashboard.

Azure Portal - Create Stream Analytics Job (continuation)














5) Define the Input
Once the Job is created, you must add a new Input. Because I have pinned the Job (screenshot 6 of 'Create the Stream Analytics Job'), you can select it from the dashboard. The default Source Type is 'Data stream', because  the sensor data is an ongoing stream and is derived from the IoT Hub. Optionally, you can add 'Reference data' as type. This data is like static metadata next to your sensor data, it gives your sensor data more meaning. Here you can find more information about this kind of data. The Source is 'IoT hub' and then the IoT Hub that you have created automatically appears. If you have more then one IoT Hub, you can choose one from the drop-down list. The next thing is to choose the right Consumer group. These groups are made by setting up the IoT Hub. Click here for more information and how you create it. In this case we want to store the sensor data in a Azure database, so you choose 'azuredb'. Finally you choose 'JSON' as Event serialization format. Click here for more information and how you create such a JSON message.
  1. Click on the job
  2. Click on 'Inputs'
  3. Click on 'Add'
  4. Fill in a name, select 'Data stream' as Source Type  and select your IoT Hub as Source
  5. Select 'azuredb' as Consumer group and choose 'JSON' as Event serialization format
Azure Portal - Define Input














6) Define the Output
After the Input, you create the Output. When you have given the Output a suitable name, you choose 'SQL database' in Sink. Then select your Database that you have created, in our case 'IoT_Sensor'. After that the Server name, that you have created too, is filled in automatically. Next you must connect to the database with the SQL Server login you made earlier. Now you can choose a table, in our case 'sensorData'
  1. Click on the job and click on 'Outputs'
  2. Click on 'Add'
  3. Fill in a name and select 'SQL database' as Sink
  4. Select your Azure Database that you have created
  5. Log in with the SQL login Username and Password
  6. Choose the created Table
Azure Portal - Define Output














7) Define the Query
Now the Output is defined, you can build up the query. Compared to our previous post you see that there are already some updates have been made. For example, on the left you see your Input and Output. The query needs always an Input and an Output, so that's why we have created the Output first. It is good to know that the language is SQL, but there are certain differences with a normal SQL query. In addition, the standard data types are floats. 
Besides a FROM clause, there is an INTO clause. For the FROM you will use your defined Input and the Output is used for the INTO. Additionally, there are various new windowing functions available. This will be discussed in another blog. You will find more details about the Stream Analytics Query Language here. For now we use a simple query without those functions. 

The query:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
SELECT   CAST(sensorName as nvarchar(max)) as SensorName
,        CAST(1 as bigint) as MeasurementCount
,        CAST(measurementTime as datetime) as MeasurementTime
,        Temperature
,        Humidity
,        Pressure
,        Altitude
,        Decibel
,        CAST(doorOpen as bigint) as DoorOpen
,        CAST(motion as bigint) as Motion
,        CAST(vibration as bigint) as Vibration
,        Illumination
INTO   [saj-bitools-DB-Output]
FROM   [saj-bitools-DB-Input]

Unfortunately, the testing of the query is not supported in the new Azure Portal. They are working on it.

Azure Portal - Define Query














7) Start the Job
At last you must start the Stream Analytics Job. You can choose between ad-hoc (now) or a scheduled day and time (custom).

Azure Portal - Start the Stream Analytics Job














Result
We have started the Stream Analytics Job and now we want to see the result. Go back to your SQL Server Management Studio and connect to the Azure Database. When you look at the table, you must see the results. In our case we have sent 100 messages to our database. The messages are sent every 10 seconds.

SQL Server Management Studio - Table results











Conclusion
The steps are logical, but the sequence of the execution is very important. Also be careful about which database you buy, because there are big differences between the prices.