When are EventGrid IoT Hub DeviceConnected and DeviceDisconnected Events raised - azure

IoT Hub publishes the events "DeviceConnected" and "DeviceDisconnected" via Event Grid according to the documentation.
My question is, which action from an actual IoT Device triggers these events?
For the "DeviceConnected" event:
Is it triggered when the OpenAsync Method is called on the Client SDK?
Is it triggered implicitly when the SendEvent Method is called?
Is this event also available via direct AMQP/MQTT connections?
For how long will it stay in this state?
For the "DeviceDisconnected" event:
Does the device go to "disconnected" as soon as "Close" is called on the DeviceClient?
What if connectivity is not good? Is there a constant ping along with a timeout mechanism which marks a device as offline after it was idle for a given time?
We have currently implemented the heartbeat pattern as described here, but we are wondering if there is an easier and possibly more cost-efficient way to achieve the same goal.

I found this passage in the documentation
The connection state is updated only for devices using MQTT or AMQP.
Also, it is based on protocol-level pings (MQTT pings, or AMQP pings),
and it can have a maximum delay of only 5 minutes. For these reasons,
there can be false positives, such as devices reported as connected
but that are disconnected.
This covers most of my questions.

Related

IoT EDGE Device Connection state monitoring

We have a business requirement to maintain the connected state of IoT Edge devices in a Digital Twins instance. It should be near real time, but short delays of up to a few minutes are acceptable.
That is, in the Digital Twins instance we have a DT entity for each IoT Edge device, and it has a property Online (true/false).
In production we will have up to a few hundred devices in total.
We are looking for a good method of monitoring the connected state of Edge devices.
Our initial attempt was to subscribe an Azure Function to the Event Grid DeviceConnected/DeviceDisconnected notifications in IoT Hub events.
After initial testing we found that Event Grid apparently cannot be used as a single source. After more research we found the following information:
https://learn.microsoft.com/en-us/azure/iot-hub/iot-hub-event-grid#limitations-for-device-connected-and-device-disconnected-events
IoT Hub does not report each individual device connect and disconnect, but rather publishes the current connection state taken at a periodic 60 second snapshot. Receiving either the same connection state event with different sequence numbers or different connection state events both mean that there was a change in the device connection state during the 60 second window.
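To illustrate how the sequence numbers mentioned above could be used, here is a minimal sketch that keeps only the newest state per device and ignores stale or duplicate deliveries. The class and method names are hypothetical, and it assumes the sequence number arrives as a hexadecimal string and that a larger value means a newer event; verify both against the actual event payload before relying on it.

```python
from dataclasses import dataclass
from typing import Dict, Optional

CONNECTED = "Microsoft.Devices.DeviceConnected"
DISCONNECTED = "Microsoft.Devices.DeviceDisconnected"


@dataclass
class ConnectionState:
    connected: bool
    sequence: int


class ConnectionStateTracker:
    """Hypothetical helper: keeps the newest known connection state per device."""

    def __init__(self) -> None:
        self._states: Dict[str, ConnectionState] = {}

    def apply(self, device_id: str, event_type: str, sequence_hex: str) -> bool:
        """Apply one event; return True if the stored state was updated."""
        if event_type not in (CONNECTED, DISCONNECTED):
            raise ValueError(f"unexpected event type: {event_type}")
        # Assumption: the sequence number is a hexadecimal string.
        sequence = int(sequence_hex, 16)
        current = self._states.get(device_id)
        if current is not None and sequence <= current.sequence:
            return False  # stale or duplicate delivery, ignore it
        self._states[device_id] = ConnectionState(event_type == CONNECTED, sequence)
        return True

    def is_connected(self, device_id: str) -> Optional[bool]:
        """Return the last known state, or None if the device was never seen."""
        state = self._states.get(device_id)
        return None if state is None else state.connected
```

An event with an older sequence number than the stored one is dropped, so out-of-order delivery cannot overwrite a newer state.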
And another one:
https://learn.microsoft.com/en-us/azure/iot-hub/iot-hub-troubleshoot-connectivity#mqtt-device-disconnect-behavior-with-azure-iot-sdks
Azure IoT device SDKs disconnect from IoT Hub and then reconnect when they renew SAS tokens over the MQTT (and MQTT over WebSockets) protocol….
…
If you're monitoring device connections with Event Hub, make sure you build in a way of filtering out the periodic disconnects due to SAS token renewal. For example, do not trigger actions based on disconnects as long as the disconnect event is followed by a connect event within a certain time span.
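The filtering described in that passage could be sketched roughly as follows: a disconnect is only treated as real if no connect event for the same device arrives within a grace period. This is a plain in-memory illustration with hypothetical names; the 90-second default is an arbitrary example value, and timestamps are passed in explicitly as seconds.

```python
from typing import Dict, List


class DisconnectDebouncer:
    """Hypothetical helper: suppresses disconnects that are quickly followed by a connect."""

    def __init__(self, grace_seconds: float = 90.0) -> None:
        self.grace_seconds = grace_seconds
        # device_id -> time (in seconds) at which the disconnect was seen
        self._pending: Dict[str, float] = {}

    def on_disconnected(self, device_id: str, now: float) -> None:
        """Remember the disconnect, but do not act on it yet."""
        self._pending[device_id] = now

    def on_connected(self, device_id: str) -> None:
        """A reconnect cancels any pending disconnect for this device."""
        self._pending.pop(device_id, None)

    def confirmed_offline(self, now: float) -> List[str]:
        """Return devices whose disconnect has outlived the grace period."""
        expired = [
            device_id
            for device_id, seen_at in self._pending.items()
            if now - seen_at >= self.grace_seconds
        ]
        for device_id in expired:
            del self._pending[device_id]
        return sorted(expired)
```

Calling `confirmed_offline` periodically (for example from a timer) yields only the devices that stayed disconnected, so reconnects caused by SAS token renewal never trigger an action.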
Next, after more searching on the topic, we found the following question:
Best way to Fetch connectionState from 1000's of devices - Azure IoTHub
The accepted answer suggests using the heartbeat pattern; however, the official documentation clearly states that it should not be used in a production environment:
https://learn.microsoft.com/en-us/azure/iot-hub/iot-hub-devguide-identity-registry#device-heartbeat
The article describing the heartbeat pattern also mentions a “short expiry time pattern”, but gives little detail about it.
For a complete picture, we also found the following article:
https://learn.microsoft.com/en-us/azure/iot-hub/iot-hub-how-to-order-connection-state-events
But it is based on an Event Grid subscription and therefore will not provide accurate data.
Finally, after reading all of this, we have the following plan to address the problem:
We will have an Azure Function subscribed to Event Grid DeviceConnected/DeviceDisconnected notifications.
If a DeviceConnected event is received, the function will check device connectivity immediately.
If a DeviceDisconnected event is received, the function will wait for 90 seconds, because we found that a DeviceConnected event usually arrives after ~60 seconds for a given device. After the delay it will check the device connectivity.
Device connectivity will be checked by sending a cloud-to-device message with acknowledgment, as described here:
https://learn.microsoft.com/en-us/azure/iot-hub/iot-hub-csharp-csharp-c2d#receive-delivery-feedback
Concerns of the solution:
Complexity.
The Azure Function would need the IoT Hub service connection string.
The DeviceDisconnected event might be delayed by up to a few minutes.
Can anyone suggest a better solution?
Thanks!
EDIT:
In our case, we do not use DeviceClient but ModuleClient on the Edge devices, and modules do not support C2D messages, as stated here:
https://learn.microsoft.com/en-us/azure/iot-edge/module-development?view=iotedge-2018-06&WT.mc_id=IoT-MVP-5004034#iot-hub-primitives
So we would need to use direct methods instead to test whether the device is online.

ServiceBus message delivery time reliable?

I'm working on creating an events system with Azure Service Bus. I find that events generally arrive reliably at the time I scheduled them for, so if event 'pop' is supposed to run at 12:30pm, it is generally delivered to my receiver at that time.
I wanted to know: is there a guarantee that events are always fired at the scheduled time, or is that more of a suggested time, so that the system can get clogged and backlogged, causing longer queues to form?
There are quite a few differences between messages (which are handled with Service Bus) and events, as you can see in the article Choose between Azure messaging services - Event Grid, Event Hubs, and Service Bus.
An event is a lightweight notification of a condition or a state change. The publisher of the event has no expectation about how the event is handled. The consumer of the event decides what to do with the notification. Events can be discrete units or part of a series.
[...]
A message is raw data produced by a service to be consumed or stored elsewhere. The message contains the data that triggered the message pipeline.
It sounds like you need a reliable way to have a timer trigger execute at a specific time. Service Bus is not the correct service for that, since "the message enqueuing time does not mean that the message will be sent at that time. It will get enqueued, but the actual sending time depends on the queue's workload and its state." (see BrokeredMessage.ScheduledEnqueueTimeUtc Property).
For handling the triggering in a reliable way, you could use services like Logic Apps (if you want to create it low-code/no-code) or Azure Functions (for the Serverless solution with code).
If you're actually looking for events, consider Event Grid.

mqtt - Is it a good idea to one-time subscribe to a topic, wait for a message and then immediately unsubscribe?

EDIT:
I found out about the retain message flag on mqtt servers. This might be what I'm looking for. Instead of querying for the current state directly, I can subscribe to a topic and the broker will send the last published state directly.
I'll update this answer after I tried the retain message flag.
Let's say I have a switch that is connected to a mqtt broker. That switch knows whether it's switched ON or OFF.
When I publish to the topic "switch/status", the switch recognises that and in turn publishes a message to "switch" with the content "isTurnedOn: (true|false)".
My question is, is the following workflow bad practice or is there a better way to do this?
A client subscribes to topic "switch"
Same client publishes a message to "switch/status"
The client waits until it receives a response on topic "switch" or times out after n seconds, then processes the result.
Client unsubscribes from that topic.
One switch might not make much of an impact, but what if there were, say, 10 switches? That would mean 10 separate subscription requests, 10 responses or possible timeouts, and then 10 separate unsubscribe requests.
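To make the retained-message idea from the edit note concrete, here is a toy in-memory model of the behaviour. It is not a real MQTT client or broker: `ToyBroker` and its methods are hypothetical, and it only supports exact topic matches. It only illustrates that a new subscriber receives the last retained state immediately, without the request/response round trip.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[str, str], None]


class ToyBroker:
    """Hypothetical stand-in for a broker; models only the retain flag."""

    def __init__(self) -> None:
        self._retained: Dict[str, str] = {}
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def publish(self, topic: str, payload: str, retain: bool = False) -> None:
        if retain:
            # The broker remembers the last retained payload per topic.
            self._retained[topic] = payload
        for handler in list(self._subscribers[topic]):
            handler(topic, payload)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)
        # A new subscriber immediately gets the retained payload, if any.
        if topic in self._retained:
            handler(topic, self._retained[topic])


broker = ToyBroker()
# The switch publishes its state with retain=True whenever it changes.
broker.publish("switch", "isTurnedOn: true", retain=True)

received: List[str] = []
# A client that subscribes later still learns the current state right away.
broker.subscribe("switch", lambda topic, payload: received.append(payload))
```

With this approach the client would not need to publish to "switch/status" at all; it could stay subscribed and simply react to state changes.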

Monitor a specific IoT device in Azure?

I have multiple devices in my IoT Hub and I want to set an alert when one specific device is not sending messages. I know you can set alerts when the IoT Hub in general is not getting any messages but I want to alert when a specific device isn't.
Example: Device1, Device2, Device3, Device4
Alert when Device1 is not sending messages.
I have tried searching all over and all that I found was a question from 2018 saying it was not possible (I am hoping this has changed).
For your specific case, I would use the device heartbeat pattern: https://learn.microsoft.com/en-us/azure/iot-hub/iot-hub-devguide-identity-registry#device-heartbeat
If your IoT solution needs to know if a device is connected, you can implement the heartbeat pattern. In the heartbeat pattern, the device sends device-to-cloud messages at least once every fixed amount of time (for example, at least once every hour). Therefore, even if a device does not have any data to send, it still sends an empty device-to-cloud message (usually with a property that identifies it as a heartbeat). On the service side, the solution maintains a map with the last heartbeat received for each device. If the solution does not receive a heartbeat message within the expected time from the device, it assumes that there is a problem with the device.
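The service-side map described above could look roughly like this. It is a minimal in-memory sketch with hypothetical names; how heartbeat messages reach this code (for example from a function triggered by incoming device-to-cloud messages) is left out, and timestamps are passed in explicitly as seconds.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Hypothetical helper: tracks the last heartbeat received per device."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._last_seen: Dict[str, float] = {}

    def record(self, device_id: str, now: float) -> None:
        """Call this whenever a heartbeat (or any message) arrives from a device."""
        self._last_seen[device_id] = now

    def silent_devices(self, now: float) -> List[str]:
        """Return known devices that have been silent longer than the timeout."""
        return sorted(
            device_id
            for device_id, seen_at in self._last_seen.items()
            if now - seen_at > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=3600)
monitor.record("Device1", now=0)
monitor.record("Device2", now=3000)
overdue = monitor.silent_devices(now=4000)  # only Device1 exceeded the hour
```

Note that a device that has never sent anything is not in the map, so to alert on one specific device you would register it up front (for example by calling `record` once when the device is provisioned).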

Why are IoT Hub events delayed when stored in Time Series Insights?

I have a Time Series Insights Environment with an IoT Hub data source configured.
What I noticed is that there is a consistent 20-30 second delay between sending an event to IoT Hub and seeing it stored in TSI.
After I found this, I attached a Function trigger directly to the IoT Hub. The trigger received events immediately, but TSI returned them 20-30 seconds later.
So, I have two questions:
Where does that delay come from?
Is there anything I can do to minimize the delay?
Thanks!
There is an expected measurable delay of up to 1 minute before you will see it in TSI and you cannot dial that up/down. It's just how the service works.
Just in case you haven't already, also make sure you've configured your SKU and capacity to support your use cases.
