Azure Stream Analytics time window query

I am working on a project where we record the temperature and humidity of multiple devices and, on the cloud side, use Azure Stream Analytics to detect whether any device has breached its specified threshold limit.
We need to monitor each device for 15 minutes; if the device breaches its limits continuously for that period, we need to raise an alert.
The tricky part is that if the device is still breaching its threshold after another 30 minutes, we need to raise another alert, and then keep raising an alert every 30 minutes until the device is back within its normal limits.
I can use a sliding window query in Stream Analytics to find which devices are out of threshold for the first 15 minutes, but how can I detect the subsequent 30-minute threshold breaches and raise alerts?

I'd suggest sending one output from your current ASA job to a new event hub, and having a second ASA job monitor the data from that event hub over the 30-minute threshold.
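To make the required alert schedule concrete, here is a minimal Python sketch of the rule described in the question: a first alert after 15 minutes of continuous breach, then a repeat every 30 minutes until a reading is back within limits. It is not Stream Analytics query code; the function name `alert_times` and the two constants are hypothetical and only illustrate the logic that the ASA job or jobs would need to reproduce.

```python
from datetime import datetime, timedelta

# Assumed timings, taken from the question: first alert after 15 minutes
# of continuous breach, then one more alert every 30 minutes.
FIRST_ALERT_AFTER = timedelta(minutes=15)
REPEAT_EVERY = timedelta(minutes=30)


def alert_times(readings, threshold):
    """Yield the timestamps at which an alert is due for one device.

    readings: iterable of (timestamp, value) pairs in time order.
    """
    next_alert = None  # None means the device is currently within limits
    for ts, value in readings:
        if value <= threshold:
            # Back within limits: forget the breach and start over.
            next_alert = None
            continue
        if next_alert is None:
            # First breaching reading: schedule the 15-minute alert.
            next_alert = ts + FIRST_ALERT_AFTER
        if ts >= next_alert:
            yield ts
            # Schedule the next repeat strictly after this reading.
            while next_alert <= ts:
                next_alert += REPEAT_EVERY
```

With one breaching reading per minute starting at minute 0, this yields alerts at minutes 15, 45 and 75. A single in-range reading resets the schedule, so the next alert is again 15 minutes after the breach resumes.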

Related

Azure Stream Analytics Job not getting input events before its restarted

We have a streaming service for small data files. Files of around 1 MB are uploaded to a storage account every minute. A Stream Analytics job takes the data from these and passes it to an event hub, which in turn triggers a function app.
Lately the input from this storage account suddenly stops, and the stream has to be restarted for it to start processing data again.
Any feedback is welcome, and I will happily provide more information if needed to solve this.
These are the facts:
Blob uploads every minute.
Stream gets input events from blob.
Even though the data is successfully uploaded to the blob, eventually the stream stops processing it, reporting 0 input events received.
Watermark delay increases with 0 events outputted until restarted.
Every time the stream stops receiving inputs the storage account gets a network error followed by a timeout error every 10 min (likely the stream retry policy).
Whenever it is restarted there is a spike in outputs which after a short while normalizes in volume.
At any point there are only 30 files (1 per minute for the last 30 minutes), as older files are deleted.
The storage account and the stream are located in different regions.

Is there any message receiving limit per device on Azure IoTHub?

Is there any message receiving limit per device on Azure IoTHub?
If any, can I remove or raise the upper limit without registering additional devices?
I ran two tests to check whether I can place enough load (ideally 18,000 messages/s) on Azure IoT Hub in future load tests.
① Send a certain number of MQTT messages from one VM.
② Send a certain number of MQTT messages from two VMs.
I expected the traffic in ② to be twice that of ①, but it wasn't. The maximum messages per minute on IoT Hub in ② was not much different from ①; both were around 3.6k messages/min. At that time I had registered only one device on IoT Hub, so I added another device and ran ② again to see whether a second device would increase the traffic. It did: IoT Hub showed a higher messages-per-minute figure.
Judging from this result, I think IoT Hub has some kind of limit on received messages per device, but I am not sure. If anyone knows about this limit, could you tell me what kind of limit it is and how to raise it without registering additional devices? In production we use only one device.
For your information, I know there is a "unit" setting to increase throughput in IoT Hub. To increase the load I changed the number of units from 2 to 20 in both ① and ②. However, this did not increase the messages/min in IoT Hub. I'd also like to know why the units did not work as expected.
Thank you in advance for reading. Any comment would help.
Every unit of a basic (B1, B2, B3) or standard (S1, S2, S3) IoT Hub SKU has a specific daily message quota, as per https://azure.microsoft.com/en-us/pricing/details/iot-hub/. A single IoT Hub can support 1 million devices and there is no per-device cost, only the msg/day quota above.
For example, the S1 SKU has a quota of 400,000 msg/day per unit, and you can add multiple S1 units to increase capacity. S2 has 6,000,000 msg/day and S3 has 300,000,000 msg/day per unit, and more units can be added.
Before this limit is reached, IoT Hub will raise an alert, which can be used to automatically add more units or move to a higher SKU.
Regarding your test, there are specific throttling limits to avoid misuse of the service, documented here:
https://learn.microsoft.com/en-us/azure/iot-hub/iot-hub-devguide-quotas-throttling
As an example, for 18,000 msg/sec you will need 3 units of the S3 SKU (each with a 6,000 msg/sec rate limit). In addition, there are other limits, such as how quickly connections can be attempted. If you use the Azure IoT SDKs, the built-in retry logic helps with this; otherwise you need your own retry policy. Basically, you don't want a million devices trying to connect at the same time, so IoT Hub only accepts connections at a certain rate. This is not a concurrent connection limit but a rate at which new connections are accepted.
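The sizing arithmetic can be sketched in a few lines of Python. The per-unit figures are the S3 numbers quoted in this answer; the function names are hypothetical, and the daily-quota check assumes every message counts as exactly one message against the quota, which may not hold for large payloads.

```python
import math

# Per-unit S3 figures as quoted in the answer above.
S3_SENDS_PER_SEC_PER_UNIT = 6_000
S3_MESSAGES_PER_DAY_PER_UNIT = 300_000_000


def units_for_rate(target_per_sec, per_unit_rate=S3_SENDS_PER_SEC_PER_UNIT):
    """Units needed so the throttling limit covers the target send rate."""
    return math.ceil(target_per_sec / per_unit_rate)


def units_for_daily_quota(target_per_sec, hours_per_day,
                          per_unit_quota=S3_MESSAGES_PER_DAY_PER_UNIT):
    """Units needed so the daily quota covers the messages actually sent.

    Assumption: one sent message consumes one message of quota.
    """
    messages = target_per_sec * 3600 * hours_per_day
    return math.ceil(messages / per_unit_quota)


rate_units = units_for_rate(18_000)                # 3 units for the rate limit
quota_units = units_for_daily_quota(18_000, 24)    # 6 units if sustained all day
print(max(rate_units, quota_units))
```

Under these assumptions, 3 units cover the rate limit, but sustaining 18,000 msg/sec for a full day would send about 1.56 billion messages, so the daily quota rather than the throttle becomes the binding constraint. A one-hour load test at that rate stays within a single unit's daily quota.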

Why are IoT Hub events delayed when stored in Time Series Insights?

I have a Time Series Insights Environment with an IoT Hub data source configured.
What I noticed is that there is a consistent 20-30 second delay between sending an event to IoT Hub and seeing it stored in TSI.
After I found this, I attached a Function trigger directly to the IoT Hub. The trigger received the events immediately, but TSI returned them 20-30 seconds later.
So, I have two questions:
Where does that delay come from?
Is there anything I can do about minimizing the delay?
Thanks!
There is an expected measurable delay of up to 1 minute before you will see it in TSI and you cannot dial that up/down. It's just how the service works.
Just in case you haven't already, also make sure you've configured your SKU and capacity to support your use cases.

How to achieve high speed processing receiving from Azure Event Hub?

I am working on a POC for Azure Event Hubs to implement it in our application.
Quick brief on the flow:
Created a tool to read CSV data from a local folder and send it to the event hub.
We send event data to the event hub in batches.
With 12 parallel instances of the tool, I can send a total of 600,000 lines to the event hub within 1 minute.
But on the receiver side, receiving the 600,000 lines takes more than 10 minutes.
Need to achieve
I would like to match or double my egress speed on the receiver to process the data.
Existing configuration
The configuration I have used is:
TU: 10, one event hub with 32 partitions.
The coding logic is the same as described on MSDN.
The only difference is that I send lines of data in a batch.
EventProcessorHost with options {MaxBatchSize = 1000000, PrefetchCount = 1000000}
To achieve a higher egress rate (i.e. a faster processing pipeline) in Event Hubs:
Create a scaled-out pipeline: each partition in Event Hubs is the unit of scale for processing events out of the event hub. With the scale you described (600,000 events per minute, i.e. 10K events per second, with 32 partitions) you already have this right. Make sure you create as many partitions as you expect your pipeline to need in the near future. Think of analyzing traffic on a highway, where the number of lanes is the only limit on the amount of traffic.
Equal load distribution across partitions: if you are using SendToASpecificPartition or SendUsingPartitionKey, you need to take care of equal load distribution yourself. If you use EventHubClient.Send(EventDataWithOutPartitionKey), the Event Hubs service makes sure all of your partitions are equally loaded. If a single partition is heavily loaded, the time it takes to process all events in the event hub is bound by the number of events on that partition.
Scale out physical resources on the receiver/EventProcessorHost: most importantly network (sockets and bandwidth) and, after a point, CPU and memory. Use PartitionManagerOptions.MaxReceiveClients to increase the maximum number of EventHubClients created per EventProcessorHost instance (each has a dedicated MessagingFactory, which maps to 1 socket). By default it is 16.
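The point about equal load distribution can be illustrated with a small Python calculation. This is only a sketch: it assumes one reader per partition, all readers running in parallel, and a hypothetical processing rate of 1,000 events per second per partition. The function name `drain_seconds` is made up for the example.

```python
def drain_seconds(partition_backlog, events_per_sec_per_partition):
    """Time to empty all partitions when each one is read in parallel.

    The busiest partition finishes last, so it sets the total time.
    """
    return max(partition_backlog) / events_per_sec_per_partition


TOTAL_EVENTS = 600_000
PARTITIONS = 32
RATE = 1_000  # hypothetical events/sec a single partition reader can process

# Evenly loaded: 18,750 events on each of the 32 partitions.
even = [TOTAL_EVENTS // PARTITIONS] * PARTITIONS

# Skewed: half of all events land on one hot partition,
# the rest is spread over the remaining 31 partitions.
hot = TOTAL_EVENTS // 2
skewed = [hot] + [(TOTAL_EVENTS - hot) / (PARTITIONS - 1)] * (PARTITIONS - 1)

print(drain_seconds(even, RATE))    # 18.75
print(drain_seconds(skewed, RATE))  # 300.0
```

The total number of events is the same in both cases, but under these assumptions the skewed layout takes 16 times longer because one reader has to work through 300,000 events on its own.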
Let me know how it went... :)
More on Event Hubs.

How to improve speed for Azure Event Hub listener processing?

I've set up an EventHub for APIM and I've created a listener (IEventProcessor) to capture context policy output. It appears that the listener gets hit on an interval, maybe every 10-15 seconds. Are there any approaches that you use to increase the speed at which events are processed by the IEventProcessor? Are there any types of settings that push out messages more quickly or could this be achieved by scaling out the listeners to improve processing throughput?
We maintain an internal buffer before sending to Event Hubs, and the flush happens every 15 seconds or when the buffer gets full (~256 KB).
This was done to use Event Hubs as efficiently as possible, since essentially you are paying for it: https://azure.microsoft.com/en-us/documentation/articles/event-hubs-programming-guide/#batch-event-send-operations
If you would prefer more control over this via the policy configuration, let us know on User Voice at https://feedback.azure.com/forums/248703-api-management
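A flush policy of this kind (size or time, whichever comes first) can be sketched in Python as follows. This is not the API Management implementation; the class `FlushBuffer`, its parameters, and the choice to measure the 15 seconds from the first buffered item are all assumptions made for illustration.

```python
import time


class FlushBuffer:
    """Collect payloads and hand them to `send` as one batch.

    A batch is flushed when it reaches max_bytes, or when the oldest
    buffered payload is at least max_age_seconds old.
    """

    def __init__(self, send, max_bytes=256 * 1024, max_age_seconds=15.0,
                 clock=time.monotonic):
        self._send = send
        self._max_bytes = max_bytes
        self._max_age = max_age_seconds
        self._clock = clock
        self._items = []
        self._size = 0
        self._first_at = None

    def add(self, payload):
        if not self._items:
            # Remember when the current batch started filling.
            self._first_at = self._clock()
        self._items.append(payload)
        self._size += len(payload)
        if self._size >= self._max_bytes:
            self.flush()
        else:
            self.tick()

    def tick(self):
        """Call periodically (for example from a timer) to flush by age."""
        if self._items and self._clock() - self._first_at >= self._max_age:
            self.flush()

    def flush(self):
        if not self._items:
            return
        batch, self._items, self._size = self._items, [], 0
        self._send(batch)
```

A policy like this would account for the 10-15 second interval seen by the listener: at low volume the buffer rarely fills, so events only leave when the timer expires.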
