Does Microsoft Azure IoT Hub store data?

I have just started learning Azure IoT and it's quite interesting. I am confused about whether IoT Hub stores data somewhere.
For example, suppose I am sending room temperature to IoT Hub and want to store it in a database for further use. How is that possible?
I am clear on how device-to-cloud and cloud-to-device messaging works with IoT Hub.

IoT Hub exposes device-to-cloud messages through an Event Hubs-compatible endpoint. Event Hubs has a retention time expressed in days. It's a stream of data that the reading client can re-read multiple times because the cursor is on the client side (not on the server side, as with queues and topics). With IoT Hub the retention time is 1 day by default, but you can change it.
If you want to store the messages received from devices, you need a client reading from the exposed Event Hubs endpoint (for example, an Event Processor Host) that contains the business logic to process the messages and store them into a database.
Of course, you could add another decoupling layer so that one client reads from Event Hubs and stores messages into queues, and another client reads from the queues at its own pace and stores them into the database. This way you keep a fast path reading from Event Hubs.
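A minimal sketch of that reading client, using the Python azure-eventhub SDK against the IoT Hub's Event Hubs-compatible endpoint. The connection string, hub name, and the save_to_database helper are placeholders, not values from the question:

```python
# Sketch: read device-to-cloud messages from the IoT Hub's Event Hubs-compatible
# endpoint and hand each one to your own storage logic.
# Requires: pip install azure-eventhub
from azure.eventhub import EventHubConsumerClient

# Placeholder values - take these from the IoT Hub "Built-in endpoints" blade.
CONNECTION_STR = "<Event Hubs-compatible connection string>"
EVENTHUB_NAME = "<Event Hubs-compatible name>"

def save_to_database(record):
    # Hypothetical helper: replace with your own INSERT into SQL, Cosmos DB, etc.
    print("storing", record)

def on_event(partition_context, event):
    telemetry = event.body_as_json()            # e.g. {"roomTemperature": 22.5}
    save_to_database(telemetry)
    partition_context.update_checkpoint(event)  # remember how far we have read

client = EventHubConsumerClient.from_connection_string(
    CONNECTION_STR,
    consumer_group="$Default",
    eventhub_name=EVENTHUB_NAME,
)

with client:
    client.receive(on_event=on_event, starting_position="-1")  # "-1" = from the start
```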

This is pretty much the use case for all IoT scenarios.
Step 1: High-scale data ingestion via Event Hubs.
Step 2: Create and use a stream processing engine (Stream Analytics or HDInsight/Storm). You can run conditions (SQL-like queries) to filter the data and store the appropriate parts in either a cold or a hot store for further analytics.
Step 3: Storage for cold-path analytics can be Azure Blob storage; Stream Analytics can be configured to write the data into it directly. The cold path can hold all the data that doesn't need regular querying, and it is cheap.
Step 4: Processing for hot-path analytics. This is data that is queried more regularly, or data on which real-time analytics needs to be carried out, like in your case checking for temperature values going beyond a threshold and triggering an urgent alert.
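The hot/cold split those steps describe can also be sketched in plain code rather than an ASA query. A rough illustration, assuming a hypothetical 30 °C threshold and placeholder write_to_cold_store / raise_alert helpers:

```python
# Rough illustration of the hot/cold split that Step 2's SQL-like query performs.
# The 30.0 threshold and the two helpers are assumptions for the example.
TEMPERATURE_THRESHOLD = 30.0

def write_to_cold_store(telemetry):
    # Hypothetical helper: e.g. append the record to blob storage for cold-path analytics.
    ...

def raise_alert(telemetry):
    # Hypothetical helper: e.g. push a notification or write to the hot store.
    ...

def route(telemetry):
    # Everything goes to cheap cold storage; threshold breaches also take the hot path.
    write_to_cold_store(telemetry)
    if telemetry.get("temperature", 0.0) > TEMPERATURE_THRESHOLD:
        raise_alert(telemetry)
```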
Let me know if you face any challenges while configuring the Stream Analytics job! :)

If you take a look at the IoT Suite remote monitoring preconfigured solution (https://azure.microsoft.com/documentation/articles/iot-suite-remote-monitoring-sample-walkthrough/) you'll see that it persists telemetry in blob storage and maintains device status information in DocumentDb. This preconfigured solution gives you a working illustration of the points made in the previous answers.

Related

Azure messaging patterns

Right now, IoT Hub shows a limit of 8,000 messages per day. I would like to ask about the patterns that are commonly used in Azure.
I am curious whether I can send data to Azure through some channel other than IoT Hub messages, in order to prevent IoT Hub from being overloaded by a large amount of data, or to keep some of this service's data confidential.
For example, I would like to send the non-confidential data from a given service as IoT Hub messages, and the other data via a WebSocket or some REST protocol. I think there are patterns that serve these scenarios.
Does anyone have experience with that kind of situation?
Not everything needs to go through IoT Hub. IoT Hub is great for two-way communication to/from IoT devices. You could also look at Event Hubs for ingestion from devices that don't need two-way comms. We have a write-up on the differences here: Connecting IoT Devices to Azure: IoT Hub and Event Hubs.
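As an example of that ingestion-only path, sending telemetry straight to an event hub can be sketched with the Python azure-eventhub SDK; the connection string and hub name below are placeholders:

```python
# Sketch: one-way telemetry ingestion straight into Event Hubs (no IoT Hub needed).
# Requires: pip install azure-eventhub
import json
from azure.eventhub import EventHubProducerClient, EventData

CONNECTION_STR = "<Event Hubs namespace connection string>"   # placeholder
EVENTHUB_NAME = "<event hub name>"                            # placeholder

producer = EventHubProducerClient.from_connection_string(
    CONNECTION_STR, eventhub_name=EVENTHUB_NAME
)

with producer:
    batch = producer.create_batch()
    batch.add(EventData(json.dumps({"roomTemperature": 21.7})))
    producer.send_batch(batch)
```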

EventGrid vs EventHub

I am working on a Service Fabric application and want to publish a few events from this application and subscribe to or process those published events in another application.
I have tried Event Grid and observed that there is a delay between publishing and processing the events. So now I am looking at other alternatives like Event Hubs or queues.
If anyone has already used Event Grid, Event Hubs, queues, etc., please suggest which one will give better performance when we deal with more events.
Design Approach
We have migrated the tables from SQL Server to Service Fabric. There is a view in SQL Server, and we are planning to implement that as a service in Service Fabric.
The implementation logic is as follows.
Table 1 is implemented as a service, and we publish an event for each CRUD operation to Event Grid / Event Hubs.
Table 2 is implemented as a service, and we publish an event for each CRUD operation to Event Grid / Event Hubs.
We have created a view service that listens for these events; when any event is sent to Event Grid / Event Hubs, it performs the required calculations and stores the result in the ViewService (it is a background job).
We are looking for the messaging service that gives the best performance.
Have you seen this comparison and this one?
Anyway, can you clarify your requirements in terms of throughput and performance? It depends on a lot of factors including, but not limited to, the message size and the number of messages.
Having used both Event Grid and Event Hubs, I'd say Event Hubs works very well for many messages per second, say data streams from IoT devices, but the performance of the downstream processing can be a bottleneck. You have to process events very fast in order to keep up with new ones. Partitions and consumer groups can help balance the load and give different processors their own view of the same data stream, for example a fast processor for live display of sensor data and a slower one that stores the data for later analysis (see the sketch below).
If you're talking about a few events generated by an application that trigger other apps to start doing some work based on those events, Event Grid is a good fit. I haven't experienced much delay in receiving those events.
But bottom line, I think all these services (Event Grid, Event Hubs, Service Bus, etc.) support different use cases, and that should be your first decision point.
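To make the consumer-group point concrete, here is a minimal sketch with the Python azure-eventhub SDK of two independent readers of the same event hub, each in its own consumer group. The connection string, hub name, and the "live-display" / "storage" consumer groups are assumptions:

```python
# Sketch: two independent readers of the same event hub, each in its own consumer
# group, so a fast "live display" path and a slower "storage" path never interfere
# with each other's position in the stream.
import threading
from azure.eventhub import EventHubConsumerClient

CONNECTION_STR = "<Event Hubs connection string>"   # placeholder
EVENTHUB_NAME = "<event hub name>"                  # placeholder

def archive(body):
    # Hypothetical helper: write the raw event somewhere cheap for later analysis.
    ...

def show_live(partition_context, event):
    print("live:", event.body_as_str())             # fast path: display immediately

def store_for_later(partition_context, event):
    archive(event.body_as_str())                    # slow path: archive at its own pace
    partition_context.update_checkpoint(event)

def run(consumer_group, handler):
    client = EventHubConsumerClient.from_connection_string(
        CONNECTION_STR, consumer_group=consumer_group, eventhub_name=EVENTHUB_NAME)
    with client:
        client.receive(on_event=handler, starting_position="-1")

# Each reader keeps its own cursor because it uses its own consumer group.
threading.Thread(target=run, args=("live-display", show_live)).start()
threading.Thread(target=run, args=("storage", store_for_later)).start()
```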
Can you describe your publisher, subscriber, etc. and show your metrics of the Azure Event Grid usage?
You can use the portal screen snippets on the topic (publisher) and subscription (subscriber).
The following screen snippets are from my tester after manually firing a few events; they show the publisher side (custom topic), the subscriber side (subscription), and the metrics on the portal (screenshots not reproduced here).
In those metrics, the delivery destination processing time is ~1 ms, and the latency on the publisher side (custom topic) is between 2 and 4 ms.
Note that AEG (Azure Event Grid) is a PUSH->PUSH-ACK or PUSH->PULL-ACK, loosely decoupled pub/sub eventing model, unlike the Event Hubs model, which is based on a PUSH->PULL mechanism; in other words, with Event Hubs you need to host a listener/receiver that pulls events from the partition.
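For reference, firing a few test events at a custom topic like the one above can be sketched with the Python azure-eventgrid SDK; the topic endpoint, key, subject, and event type are placeholders:

```python
# Sketch: fire a few events at an Event Grid custom topic (push model).
# Requires: pip install azure-eventgrid
from azure.core.credentials import AzureKeyCredential
from azure.eventgrid import EventGridPublisherClient, EventGridEvent

TOPIC_ENDPOINT = "https://<your-topic>.<region>.eventgrid.azure.net/api/events"  # placeholder
TOPIC_KEY = "<topic access key>"                                                 # placeholder

client = EventGridPublisherClient(TOPIC_ENDPOINT, AzureKeyCredential(TOPIC_KEY))

events = [
    EventGridEvent(
        subject="tester/manual",
        event_type="Tester.ManualFire",
        data={"sequence": i},
        data_version="1.0",
    )
    for i in range(3)
]
client.send(events)   # Event Grid then pushes these to each subscription's endpoint
```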

Using partitionId or partitionKey with Iot Hub Azure

We are developing an application where IoT devices will publish events to Azure IoT Hub using the MQTT protocol (using one topic to push messages). We want to consume these messages using the Stream Analytics service, and to scale Stream Analytics jobs it is recommended to use the PARTITION BY clause.
Since we are not using the Azure Event Hubs SDK, can we somehow attach a partition ID to the events?
Thanks in advance.
As Rita mentioned in the comments, Event Hubs will automatically associate each device with a particular partition.
You can then use PARTITION BY PartitionId for the steps closer to the input to efficiently parallelize processing of the input and reduce/aggregate the data.
After that, you can have another, non-partitioned step that outputs some aggregated data to SQL.
Doing that, you will be able to assign more than 6 SUs (streaming units), even with an output to SQL.
We will update our documentation to give more info about scaling ASA jobs and describe the different possible scenarios.
Thanks,
JS - Azure Stream Analytics
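The query shape being described might look roughly like the following. It is held in a Python string only to stay consistent with the other sketches on this page; the input/output names, the temperature field, and the one-minute tumbling window are assumptions, not details from the question:

```python
# Illustrative shape of the two-step ASA query described in the answer above.
# Names in [brackets] and the aggregation window are placeholders.
ASA_QUERY = """
-- Step 1: partitioned step close to the input, runs one instance per partition
WITH PerPartition AS (
    SELECT PartitionId, AVG(temperature) AS avgTemp
    FROM [iothub-input] PARTITION BY PartitionId
    GROUP BY PartitionId, TumblingWindow(minute, 1)
)
-- Step 2: non-partitioned step that sends aggregated data to the SQL output
SELECT System.Timestamp() AS windowEnd, AVG(avgTemp) AS overallAvgTemp
INTO [sql-output]
FROM PerPartition
GROUP BY TumblingWindow(minute, 1)
"""
```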

Azure Service Fabric routing

I would like to get some recommendations for designing the routing of IoT messages in Azure.
I have the following scenario:
Sensors send messages to Azure IoT Hub in Google Protobuf format. Depending on the type of message, I want to route it to different applications inside a Service Fabric cluster.
My current approach is to use a Service Fabric application to receive all messages from the IoT Hub, parse the protobuf message, and send the message, depending on its type (an attribute inside the protobuf), to a type-specific Azure event hub. The applications then fetch the messages from their "own" event hub and process them.
I'm not sure if this is the best approach. I don't like having one event hub for each type of message. Service Bus topics are probably not an option, because I have a lot of messages (~30k per second).
Do I really need an event hub to decouple this process, or does it make sense to send the messages from the "routing application" directly to the different "type applications"?
What do you think?
Regards,
Markus
If you really need high performance you should take a look at IoT Hub and Event Hubs. Azure Event Hubs is a highly scalable data streaming platform and event ingestion service capable of receiving and processing millions of events per second. Event Hubs can process and store events, data, or telemetry produced by distributed software and devices. Data sent to an event hub can be transformed and stored using any real-time analytics provider or batching/storage adapters.
On the other hand, if you need only 30k messages per second you can go with Service Bus Premium Messaging.
Comparison of Azure IoT Hub and Azure Event Hubs
Premium Messaging: How fast is it?
What is Event Hubs?
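For what it's worth, the "routing application" described in the question could be sketched along these lines with the Python azure-eventhub SDK; the connection strings, the per-type hub names, and the parse_sensor_message helper are all assumptions standing in for the real protobuf schema:

```python
# Sketch of the routing application: read from the IoT Hub endpoint, look at the
# message type, and forward the payload to a type-specific event hub.
from azure.eventhub import EventHubConsumerClient, EventHubProducerClient, EventData

IOTHUB_EVENTHUB_CONN = "<IoT Hub Event Hubs-compatible connection string>"         # placeholder
IOTHUB_EVENTHUB_NAME = "<Event Hubs-compatible name>"                              # placeholder
EVENTHUB_NAMESPACE_CONN = "<destination Event Hubs namespace connection string>"   # placeholder

# One producer per message type - the "one event hub per type" layout from the question.
producers = {
    "telemetry": EventHubProducerClient.from_connection_string(
        EVENTHUB_NAMESPACE_CONN, eventhub_name="telemetry-events"),
    "alarm": EventHubProducerClient.from_connection_string(
        EVENTHUB_NAMESPACE_CONN, eventhub_name="alarm-events"),
}

def parse_sensor_message(raw_bytes):
    # Hypothetical helper: deserialize the protobuf payload and return
    # (message_type, payload_bytes). Generated protobuf classes go here.
    ...

def on_event(partition_context, event):
    raw = event.body
    if not isinstance(raw, bytes):
        raw = b"".join(raw)                      # body can arrive as byte sections
    message_type, payload = parse_sensor_message(raw)
    producer = producers.get(message_type)
    if producer is not None:
        batch = producer.create_batch()
        batch.add(EventData(payload))
        producer.send_batch(batch)
    partition_context.update_checkpoint(event)

consumer = EventHubConsumerClient.from_connection_string(
    IOTHUB_EVENTHUB_CONN, consumer_group="$Default",
    eventhub_name=IOTHUB_EVENTHUB_NAME)

with consumer:
    consumer.receive(on_event=on_event)
```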

Azure IoT and Stream Analytics job

I am trying to collect data from an IoT device; for now I am using this code to simulate the device: remote_monitoring. It sends data and I can see the data in the dashboard. The next thing is that I want to save the data to a SQL database. I was thinking of using Stream Analytics to do the job. The problem I am having now is that when I select IoT Hub as an input, I get the error:
Please check if the input source is configured correctly and data is in correct format.
I am trying to find documentation on whether there is something special I need to add to my JSON object before I send it.
IoT Hub is a supported input for Azure Stream Analytics, and there is nothing wrong with using ASA as a "pump" to copy data from IoT Hub or Event Hubs to a store like SQL DB. Many ASA use cases combine such "archiving" with other functions. The only thing to be careful with is the limited ingress rate of many ASA outputs: SQL DB may not be able to keep up and may throttle ASA, in which case ASA can fall behind beyond the hub's retention window, causing data loss.
I will try to use Event Hubs and will update my post when I have an answer.
What about this doc https://github.com/Azure/azure-content/blob/master/articles/stream-analytics/stream-analytics-define-inputs.md ? Does it help you?
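One thing worth checking on the device side is that the payload is UTF-8 encoded JSON and declared as such, so that consumers like a Stream Analytics input can parse it. A minimal sketch with the Python azure-iot-device SDK; the device connection string and the field names are placeholders:

```python
# Sketch: send telemetry as UTF-8 encoded JSON and say so in the message
# properties, so downstream readers (e.g. a Stream Analytics input) can parse it.
# Requires: pip install azure-iot-device
import json
from azure.iot.device import IoTHubDeviceClient, Message

DEVICE_CONNECTION_STR = "<device connection string>"   # placeholder

client = IoTHubDeviceClient.create_from_connection_string(DEVICE_CONNECTION_STR)
client.connect()

msg = Message(json.dumps({"deviceId": "sim-01", "temperature": 23.4}))
msg.content_encoding = "utf-8"          # tell consumers how the body is encoded
msg.content_type = "application/json"   # and that it is JSON
client.send_message(msg)

client.disconnect()
```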
