
Thursday 15 December 2016

Azure Event Hub vs IoT Hub

Case
During our journey we noticed some confusion in our team about the differences between an Event hub and an IoT hub. After some research we found out that there are a lot of similarities, but also differences. In this blog I will explain the concept of an Event hub and an IoT hub, and give a best practice for when to use which.
The goal of this article is to give a high-level overview of the Event hub and the IoT hub. Please follow the links for more in-depth information.














Solution
Before we can look at the differences and similarities, the first question is: “what is an Event hub, and how do we use it?”

1) Event hub
An Event hub is a gateway to the Azure cloud. Its main purpose is to collect incoming data and pass it on to the Azure cloud, as seen in figure 1. An Event hub does process the incoming data, but only in a lightweight way: it doesn’t offer advanced sequencing or delivery guarantees. In return, Event hubs are a high-scale messaging service with low latency and high reliability. In our case we use an Event hub to collect the data from the Raspberry Pi, but it can also be used in other cases, like collecting data from console games or other telemetry.
Figure 1: Event Hub











Protocol
The connected devices/entities are called Event publishers.
Connecting an Event publisher to the Azure Event hub is easy, because it supports the HTTP/AMQP protocols. AMQP is the most used of the two. See here for more information about this subject.

Partition
The Event hub uses partitions. A partition is an ordered sequence of events that is kept in the Event hub, based on the ‘first in, first out’ principle. The number of partitions is between 2 and 32. Please note that you have to set the number of partitions when you create the Event hub, and that it cannot be changed afterwards. The number of partitions is mostly based on the number of readers you are going to use (meaning the use of partitions further in the process). The default number of partitions is 4.

In short, an Event hub is a high-scale, one-way telemetry service that uses the HTTP/AMQP protocols and is generally available worldwide.

For more information on how to develop with Event hub, see the programming guide.
Setting up an Event hub follows in another blog (coming soon).

2) IoT Hub
But with the growth of IoT came additional needs: control, device authentication and authorization, protocol translation, etc.
Since an Event hub is a one-way point of entry, it is limited when it comes to these additional needs.
Figure 2: IoT hub

And this is where the IoT hub kicks in. The IoT hub can do the same things as an Event hub, but it is capable of much more. Most importantly, it can handle bi-directional traffic, meaning that an IoT hub is capable of sending data back to the connected devices.
This makes it possible to command and control the devices. For example, you can send a disconnect event to a device, or a threshold event that shuts down a machine when it reaches a certain temperature.
Devices can be registered, so you can identify devices and check whether they are allowed to connect. It’s possible to connect more than 10 million devices (where the Event hub can handle up to 1 million devices), and device identities can be imported in bulk (which is convenient when you have 10 million devices).
The IoT hub can handle device error reporting, e.g. you can check the failed connection attempts per device. This can result in disconnecting or disabling the device/sensor in the IoT hub (so the sensor isn’t allowed to connect to the hub anymore).
It also supports the AMQP over WebSockets and MQTT protocols; for the latter no protocol gateway is needed (when using the Azure IoT SDKs).

For more in-depth information about the IoT hub, please see the reference architecture.
For setting up the IoT hub, see our earlier blog post: Setting up IoT hub

Summary
The IoT hub can do the same as an Event hub, but much more, mostly because of the bi-directional communication. Ergo, an IoT hub is 'Event Hub plus'.


So why not always use the IoT hub instead of the Event hub? Well, one thing we didn’t mention yet is the pricing. With all the extra capabilities of the IoT hub the pricing is also a lot higher, sometimes up to 40 times higher. So for simple events, like reading data from a weather station or counting how many times a door is opened, an IoT hub is not necessary.

Thursday 29 September 2016

IoT Adventure: 5b - Stream Analytics for Azure SQL Database

Case
Your sensors are connected to an IoT Hub and are generating data. In our previous post we sent the real-time data to a Power BI dashboard. What are the other possibilities in Azure with this data?

Solution
In our previous post we distinguished two streams for our data: Cold path and Hot path. In this case we store the data in an Azure SQL Database, which is a form of the Cold path. See here for the full list of Azure SQL Databases (sizes and prices) you can choose from. A reason to store the data may be, for example, to analyze it or to prepare a dataset as input for your Machine Learning experiment/model. Just like for the Hot path, we set up a (separate) Stream Analytics Job for this. You can have multiple Outputs in one Job, for example real-time Power BI and SQL Database, but when you want to edit the query that sends data to the database, you must stop the Job, and then your real-time data is not sent either. Before we create the Job, we first set up the Azure SQL Server and after that the Azure SQL Database. The reason is that we want to select the database in the Output of the Stream Analytics Job, so it has to exist first (along with the server).

Cold path with Stream Analytics









1) Create the Azure SQL Server
Go to the Azure portal, click on 'More services' at the bottom of the menu (left-hand side of the screen) and search for SQL server. When you have opened it, click on 'Add' to create a new SQL Server. Perhaps you noticed that the menu now says 'More services' instead of the 'Browse' from our previous post. The portal is still in development, so there are regular updates.

Azure Portal - Create SQL Server















Now you can fill in your Server name (the name cannot contain spaces). Next we create a SQL Server login, which is our Server admin. You can also use Azure Active Directory (user or group) for this. Click here for more information. The Subscription is filled in automatically. After this we choose a Resource group. For convenience we choose the Resource group that we created earlier when setting up the IoT Hub, which means that the server gets the same lifecycle, permissions and policies as the IoT Resource group. You can also create a general Resource group, so the server can be used for purposes other than IoT. Our location is the Netherlands, so we choose West Europe.

Azure Portal - Create SQL Server (continuation)















Tip:
When the deployment of the server has succeeded, the server should appear in the list of SQL Servers. If not, click on the 'Refresh' button at the top under SQL Servers.

2) Create the Azure SQL Database
Next we create the database. In your Azure portal click on 'More services' and search for SQL Database. When you have opened it, click on 'Add' to create a new database.

Azure Portal - Create SQL database














Choose a name for your database. The Subscription is filled in automatically. Next we choose the same Resource group as earlier when setting up the SQL Server. Select 'Blank database' (new database) and choose the SQL Server that you created earlier. If you don't choose a server, Azure automatically creates one. That is the reason why we set up the server first: maybe you want to create one server (with a general name) and attach multiple databases to it. Otherwise you end up with a separate server for every database. That can also be a conscious choice of course, but it is not what we want in this case. Finally we choose the 'Basic' database, but here you can choose the size that fits your needs. The Collation is left at its default.

Azure Portal - Create SQL database (continuation)














Tip:
If you decide to add a new database in Management Studio (once you are connected), be aware that Azure by default creates the S3 (Standard) version of a database. Therefore, you should always create a new database in the Azure portal, so that you can choose the right size and price.
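
If you do prefer Management Studio, the size can also be stated explicitly in the CREATE DATABASE statement. The following is only a sketch: it assumes the database name 'IoT_Sensor' that is used further on in this post and the Basic tier, and it should be run while connected to the master database of your Azure SQL Server. Check the current Azure documentation for the tiers and sizes available to you.

```sql
-- Assumption: executed in the master database of the Azure SQL Server.
-- Creates the database directly in the Basic tier instead of the default size.
CREATE DATABASE [IoT_Sensor]
(
    EDITION = 'Basic',
    SERVICE_OBJECTIVE = 'Basic',
    MAXSIZE = 2 GB
);

-- Check afterwards which edition and size the database actually got.
SELECT DATABASEPROPERTYEX('IoT_Sensor', 'Edition')          AS Edition,
       DATABASEPROPERTYEX('IoT_Sensor', 'ServiceObjective') AS ServiceObjective;
```

The second statement is a quick way to verify the size of a database that was created earlier, whichever way it was created.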

3) Connect to SQL Server and create a table
Once the database is created, you can connect to the SQL Server in Management Studio. In this case our Server name is 'bitools.database.windows.net,1433'. As you can see, the name includes the default port 1433 (this is the only port on which the service is available). Next you choose 'SQL Server Authentication' and fill in the login and password that you created earlier when setting up the SQL Server. The first time you must sign in with your Azure account. This is also the case when you have not connected to the server for a while. Finally you must add your client IP to get access to the server. Your IP is then added to the firewall.

SQL Server Management Studio - Connect to Azure














Before we create the Stream Analytics Job, we must do one last thing and that is create a table where we can store the sensor data. I have created the following table in the database:

CREATE TABLE [dbo].[sensorData](
 [SensorName] [nvarchar](max) NULL,
 [MeasurementCount] [bigint] NULL,
 [MeasurementTime] [datetime] NULL,
 [Temperature] [float] NULL,
 [Humidity] [float] NULL,
 [Pressure] [float] NULL,
 [Altitude] [float] NULL,
 [Decibel] [float] NULL,
 [DoorOpen] [bigint] NULL,
 [Motion] [bigint] NULL,
 [Vibration] [bigint] NULL,
 [Illumination] [float] NULL
)

As you can see I chose float as data type, because Stream Analytics handles non-integer numbers as floats. This means that input data types such as decimal and numeric are converted to floats.

Tip:
You can manage the firewall in the Azure portal. Go to your server in the portal, click on it and under Settings you will find 'Firewall'. Here you can add client IPs to allow connections to the server. Note that this can only be done by a user who has the role of 'Owner'. In this case I created the server, so I'm automatically owner of the server.
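
As an alternative to the portal, a server-level firewall rule can also be added with T-SQL. This is a sketch under assumptions: the rule name and the IP address below are placeholders, and the statements are assumed to be run in the master database with the Server admin login created earlier.

```sql
-- Assumption: executed in the master database by the Server admin login.
-- The rule name and the IP address are placeholders, replace them with your own.
EXECUTE sp_set_firewall_rule
    @name = N'ClientIPExample',
    @start_ip_address = '203.0.113.10',
    @end_ip_address = '203.0.113.10';

-- List the server-level firewall rules that are currently defined.
SELECT name, start_ip_address, end_ip_address
FROM   sys.firewall_rules;
```

Using the same address as start and end allows exactly one client IP, which matches what the 'Add client IP' option does.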


4) Create the Stream Analytics Job
In your Azure portal click on 'New'. Type in 'Stream Analytics Job' and click on it. Next you click on the result, in this case only Stream Analytics Job. After that, you can click on the 'Create' button.

Azure Portal - Create Stream Analytics Job (extensive)














Tip:
In our previous post we created a new Stream Analytics Job in a different, faster way. The way shown here is more extensive and gives some general information about, in this case, a Stream Analytics Job. So if you want more information about a feature in Azure before you create it, this is a useful way.

Now you can fill in your Job name (the name cannot contain spaces); the Subscription is filled in automatically. Next choose a Resource group. These groups were made when setting up the IoT Hub. Click here for more information and how to create one. Once created, it appears in the list under 'use existing' and you can choose it. Our location is the Netherlands, so we choose West Europe. Finally you can pin your Job to your dashboard right away, with the checkbox at the bottom. You may have noticed in the first screenshot that I have already pinned the previous Stream Analytics Job to my dashboard.

Azure Portal - Create Stream Analytics Job (continuation)














5) Define the Input
Once the Job is created, you must add a new Input. Because I have pinned the Job (screenshot 6 of 'Create the Stream Analytics Job'), you can select it from the dashboard. The default Source Type is 'Data stream', because the sensor data is an ongoing stream coming from the IoT Hub. Optionally, you can add 'Reference data' as type. This is static metadata next to your sensor data that gives your sensor data more meaning. Here you can find more information about this kind of data. The Source is 'IoT hub', after which the IoT Hub that you created appears automatically. If you have more than one IoT Hub, you can choose one from the drop-down list. The next thing is to choose the right Consumer group. These groups were made when setting up the IoT Hub. Click here for more information and how to create one. In this case we want to store the sensor data in an Azure database, so you choose 'azuredb'. Finally you choose 'JSON' as Event serialization format. Click here for more information and how to create such a JSON message.
  1. Click on the job
  2. Click on 'Inputs'
  3. Click on 'Add'
  4. Fill in a name, select 'Data stream' as Source Type  and select your IoT Hub as Source
  5. Select 'azuredb' as Consumer group and choose 'JSON' as Event serialization format
Azure Portal - Define Input














6) Define the Output
After the Input, you create the Output. When you have given the Output a suitable name, you choose 'SQL database' as Sink. Then select the Database that you created, in our case 'IoT_Sensor'. After that the Server name, which you also created, is filled in automatically. Next you must connect to the database with the SQL Server login you made earlier. Now you can choose a table, in our case 'sensorData'.
  1. Click on the job and click on 'Outputs'
  2. Click on 'Add'
  3. Fill in a name and select 'SQL database' as Sink
  4. Select your Azure Database that you have created
  5. Log in with the SQL login Username and Password
  6. Choose the created Table
Azure Portal - Define Output














7) Define the Query
Now that the Output is defined, you can build the query. Compared to our previous post you can see that some updates have already been made to the portal. For example, on the left you see your Input and Output. The query always needs an Input and an Output, which is why we created the Output first. It is good to know that the language is SQL, but there are certain differences with a normal SQL query. In addition, the standard data type for numbers is float.
Besides a FROM clause, there is an INTO clause. For the FROM you use your defined Input and for the INTO your defined Output. Additionally, there are various new windowing functions available. These will be discussed in another blog. You will find more details about the Stream Analytics Query Language here. For now we use a simple query without those functions.

The query:

SELECT   CAST(sensorName as nvarchar(max)) as SensorName
,        CAST(1 as bigint) as MeasurementCount
,        CAST(measurementTime as datetime) as MeasurementTime
,        Temperature
,        Humidity
,        Pressure
,        Altitude
,        Decibel
,        CAST(doorOpen as bigint) as DoorOpen
,        CAST(motion as bigint) as Motion
,        CAST(vibration as bigint) as Vibration
,        Illumination
INTO   [saj-bitools-DB-Output] 
FROM   [saj-bitools-DB-Input]

Unfortunately, the testing of the query is not supported in the new Azure Portal. They are working on it.

Azure Portal - Define Query














8) Start the Job
Finally you must start the Stream Analytics Job. You can choose between ad hoc (now) or a scheduled day and time (custom).

Azure Portal - Start the Stream Analytics Job














Result
We have started the Stream Analytics Job and now we want to see the result. Go back to SQL Server Management Studio and connect to the Azure Database. When you look at the table, you should see the results. In our case we have sent 100 messages to our database. The messages are sent every 10 seconds.
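
To check the result you could run something like the queries below in Management Studio. This is just a sketch against the 'sensorData' table created in step 3; the selection of columns is arbitrary.

```sql
-- Show the ten most recent measurements written by the Stream Analytics Job.
SELECT TOP (10)
       SensorName,
       MeasurementTime,
       Temperature,
       Humidity,
       Pressure
FROM   [dbo].[sensorData]
ORDER BY MeasurementTime DESC;

-- Count the stored messages and show the period they cover.
SELECT COUNT(*)             AS MessageCount,
       MIN(MeasurementTime) AS FirstMeasurement,
       MAX(MeasurementTime) AS LastMeasurement
FROM   [dbo].[sensorData];
```

With one message every 10 seconds, the count in the second query should grow by roughly six per minute while the Job is running.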

SQL Server Management Studio - Table results











Conclusion
The steps are logical, but the order in which you execute them is very important. Also be careful about which database size you choose, because there are big differences in price.

Wednesday 3 August 2016

IoT Adventure: 5a - Stream Analytics Job for Power BI

Case
Your sensors are connected to an IoT Hub and are generating data. What are the possibilities with this data and, more importantly, how does this work in Azure?

Solution
In Azure, we can use Stream Analytics to do this. We distinguish two streams for our data:
  • Cold path
  • Hot path
Possibilities with Stream Analytics








There is also the possibility of Machine Learning, which you can use for both the Cold path and the Hot path. We are skipping this for now; it will be discussed in another blog.

Cold path
The Cold path means that the data will be stored for further processing before it is presented to the end users. Examples of a Cold path are loading the data into an Azure Data Lake or an Azure SQL Database. For the latter, you first have to create the Azure SQL Database. You also have to arrange the authentication, create some custom tables in which the data can be stored and finally create the Stream Analytics Job. Setting up all this will be explained in another blog.

Hot path
The other path is the Hot path, which means that the data is real-time and will be displayed in Power BI. First you have to create a Stream Analytics Job and then configure it.


1) Create the Stream Analytics Job
You can create the Job in both the old portal and the new portal. We prefer to do as much as possible in the new portal, because this will be the standard in the future.

Go to 'Browse' on the bottom of the menu (left-hand side of the screen) and search for Stream Analytics jobs. When you have opened it, click on 'Add' to create a new Stream Analytics Job. 

Azure Portal - Create Stream Analytics Job













Tip: 
As you can see, I have some standard resources on the left in the Azure menu. When you click on a blank star (favorite) button, the resource is added to your menu on the left (see screenshot 2). You can also select 'All resources' in the menu. This contains a list of all features and here you can add any feature, including a Stream Analytics Job.

Now you can fill in your Job name (the name cannot contain spaces); the Subscription is filled in automatically. Next choose a Resource group. These groups were made when setting up the IoT Hub. Click here for more information and how to create one. Once created, it appears in the list under 'use existing' and you can choose it. Our location is the Netherlands, so we choose West Europe. Finally you can pin your Job to your dashboard right away, with the checkbox at the bottom. You may have noticed in the first screenshot that I have already pinned some Jobs to my dashboard.

Azure Portal - Create Stream Analytics Job (continuation)














Once the Job is created, it will appear in the list of Stream Analytics Jobs with the status 'Created'.

2) Define the Input
Once the Job is created, you must add a new Input. The default Source Type is 'Data stream', because the sensor data is an ongoing stream coming from the IoT Hub. Optionally, you can add 'Reference data' as type. This is static metadata next to your sensor data that gives your sensor data more meaning. Here you can find more information about this kind of data. The Source is 'IoT hub', after which the IoT Hub that you created appears automatically. If you have more than one IoT Hub, you can choose one from the drop-down list. The next thing is to choose the right Consumer group. These groups were made when setting up the IoT Hub. Click here for more information and how to create one. In our case we want to present the sensor data in a Power BI dashboard, so you choose 'powerbi'. Finally you choose 'JSON' as Event serialization format. Click here for more information and how to create such a JSON message.
  1. Click on the job
  2. Click on 'Inputs'
  3. Click on 'Add'
  4. Fill in a name, select 'Data stream' as Source Type  and select your IoT Hub as Source
  5. Select 'powerbi' as Consumer group and choose 'JSON' as Event serialization format
Azure Portal - Define Input















3) Define the Output
After the Input, you create the Output. When you have given the Output a suitable name, you choose 'Power BI' as Sink. Then you should authorize with your Power BI account. In most cases, this will be the same as your Azure account. Please note that this can only be done with an organization account. When you are successfully logged in, you must choose a Group Workspace. The default is 'My Workspace'. The downside of 'My Workspace' is that it is a local workspace: you cannot share your Power BI dashboard with other people and let them edit it. For that you must choose a workspace that several people are part of, for example a SharePoint group. Everybody in this group can see and edit the Power BI dashboard(s) in this workspace. In our example this is the 'IoT' group.
  1. Click on the job and click on 'Outputs'
  2. Click on 'Add'
  3. Fill in a name and select 'Power BI' as Sink
  4. Authorize with your Azure account
  5. Select your Group Workspace
  6. Fill in a Dataset name and Table name
Azure Portal - Define Output














4) Define the Query
Now that the Output is defined, you can build the query. Because the query always needs an Input and an Output, we created the Output first. It is good to know that the language is SQL, but there are certain differences with a normal SQL query. In addition, the standard data type for numbers is float.
Besides a FROM clause, there is an INTO clause. For the FROM you use your defined Input and for the INTO your defined Output. Additionally, there are various new windowing functions available. We use one of them. You will find more details about the Stream Analytics Query Language here. For now we use the Tumbling Window function and the Timestamp By function.

Tumbling Window function
With the Tumbling Window function you can define your own time intervals that do not overlap each other. Our data is sent to our IoT Hub every second, but here we use an interval of one minute. Based on that we use several aggregate functions that calculate the data per minute. See this post for more details.

Timestamp By function
The other function is Timestamp By. A query in Stream Analytics nearly always contains a datetime field, because the data is sent at a specific time. This field is passed to the 'TIMESTAMP BY' function. This ensures that events are processed based on the time they were created, instead of the time they arrived. After all, the transfer of the data may be delayed, which causes the data to arrive in a different order.

The query:

SELECT   sensorName
,        Max(measurementTime) as measurementTime
,        CAST(1 as bigint) as measurementCount
,        Avg(decibel) as decibel
,        Sum(doorOpen) as doorOpen
,        Avg(humidity) as humidity
,        Avg(illumination) as illumination
,        Sum(motion) as motion
,        Avg(pressure) as pressure
,        Avg(temperature) as temperature
,        Sum(vibration) as vibration
INTO   [saj-bitools-Output] 
FROM   [saj-bitools-Input] TIMESTAMP by measurementTime 
GROUP BY  sensorName
,         TumblingWindow(minute, 1)

Unfortunately, the testing of the query is not supported in the new Azure Portal. They are working on it.
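
Note that CAST(1 as bigint) gives the value 1 on every output row, so one per sensor per minute. If you would rather see how many events each one-minute window contained, a possible variant is sketched below. It uses the same Input and Output names as the query above and, for brevity, only one of the measurements; whether you want this instead of the constant depends on how you count in Power BI.

```sql
-- Variant (sketch): count the events per sensor per one-minute window.
SELECT   sensorName
,        Max(measurementTime) as measurementTime
,        Count(*) as measurementCount
,        Avg(temperature) as temperature
INTO     [saj-bitools-Output]
FROM     [saj-bitools-Input] TIMESTAMP BY measurementTime
GROUP BY sensorName
,        TumblingWindow(minute, 1)
```

With data arriving every second, measurementCount would be around 60 for a complete window.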

Azure Portal - Define Query
















5) Start the Job
Finally you must start the Stream Analytics Job. You can choose between ad hoc (now) or a scheduled day and time (custom).


Azure Portal - Start the Stream Analytics Job
















Power BI
We have started the Stream Analytics Job and now we want to present the data in Power BI. Log in here with your Power BI account. Select your Group Workspace at the top left. This is the same group that you chose earlier when defining the Output. When you have selected the right workspace, the dataset (defined in the Output) appears automatically. Select the dataset and now you can build a report. In 10-15 minutes we created a simple report. It contains the number of input events, averages of the measurements and a timeline with these measurements.

When you click the refresh button, the data will be refreshed. Next you can pin the live page to a dashboard. If pinning to a new dashboard doesn't work the first time, just pin the report again, this time to the existing dashboard (the one you created previously). Now you have your first Power BI dashboard with live sensor data.
  1. Click on 'Power BI'
  2. Click on the menu
  3. Select your Group Workspace
  4. The dataset appears automatically
  5. Click on the dataset
  6. Make the report and click on 'Pin Live Page'
  7. Pin the report to a new dashboard and give it a suitable name
  8. Select the dashboard and you will see your report
Create the Power BI dashboard


Power BI example





















Note
In another blog we will show you how to send your sensor data to an Azure SQL Database. Because you have these different options for your output, we chose to create a separate job for our livestream sensor data to Power BI. You can create multiple Inputs, Queries and Outputs in one job. The disadvantage is that when you want to edit a single query (for example the livestream query), you must stop the entire job. This means that other Outputs (like an Azure database) are stopped too and do not receive any data through this job in the meantime.
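
To give an idea of what such a combined job would look like: the query of one job can contain several SELECT ... INTO statements, one per Output. The sketch below assumes that a Power BI Output named [saj-bitools-Output] and a SQL database Output named [saj-bitools-DB-Output] are both defined in the same job. The column lists are shortened for readability; a real database Output has to match the columns of its table.

```sql
-- Hot path (sketch): one-minute averages to the Power BI Output.
SELECT   sensorName
,        Avg(temperature) as temperature
INTO     [saj-bitools-Output]
FROM     [saj-bitools-Input] TIMESTAMP BY measurementTime
GROUP BY sensorName
,        TumblingWindow(minute, 1)

-- Cold path (sketch): every message to the SQL database Output.
SELECT   sensorName
,        measurementTime
,        temperature
INTO     [saj-bitools-DB-Output]
FROM     [saj-bitools-Input]
```

Because both statements live in one job, changing either of them means stopping both streams, which is exactly why we keep them in separate jobs.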

Conclusion
There are a lot of steps to take before you can make a Power BI dashboard. But once you have built one, you can put together other dashboards pretty quickly.