I am using an Azure Function to process messages from IoT Hub and output them to Blob storage.
But the function misses IoT messages when I send them at high frequency.
For example, I sent 30 messages from 20:40:16 to 20:40:23, but only 3 were processed and stored in Blob storage, and I have no idea where the remaining 27 went.
I am using the Consumption plan, and Azure states that it will auto-scale depending on the load.
But from the above activity log, only one thread is running; the input is not even queued, and some messages are lost.
So, what should I do to catch all messages from IoT Hub?
Found the solution myself.
The trigger needs to be changed from the Azure Event Hubs trigger to the Event Grid trigger, as the images below show.
Azure Event Hubs
Azure Event Grid Trigger
Azure Functions on a consumption plan can handle this load, but you might want to make a separate Consumer Group in your IoT Hub that the Function can use. In the Azure Portal, go to Built-in endpoints and add a new Consumer Group.
You then have to specify in your Function which Consumer Group to use:
[FunctionName("Function1")]
public static async Task Run(
    [IoTHubTrigger("messages/events", ConsumerGroup = "functions", Connection = "EventHubConnectionAppSetting")] EventData message,
    ILogger log)
I tested this with a Consumption plan Function listening to the IoT Hub default endpoint and writing to blob storage, with an 8-second delay to make it more like your function. I'm seeing no message loss, whether I send 30 or 100 messages. Make sure that no other applications are using your new Consumer Group!
Related
I am using a Consumption Plan Function App.
I have IoT Devices which are communicating with the IoT Hub. The IoT Hub triggers an Azure Function from my Function App.
The image below was obtained from the Azure Function App settings, and it shows that the IoT Hub-triggered function has an execution count of over 250.
Does this mean that there are 250 instances of the Azure Function App? Is that normal?
If I were to introduce batch processing for IoT Hub messages, what would be classified as a batch of messages? Do the messages need to have timestamps within a certain window?
Your Consumption Plan Function App won't scale beyond 200 instances (100 for Linux); that's a hard limit. You can use the Function's metrics to check the number of instances running: select a metric and split it by 'Instance'.
Edit with info from comments:
It's possible that you're handling all those requests from one instance. The Function will scale out automatically. How that works is described in the docs; however, no exact scaling logic is given. It's not hard-linked to the number of messages available on the Event Hub.
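Regarding the batch-processing part of the question: a batch is simply the set of events the trigger hands to one function execution (bounded by the host's maxBatchSize setting), not a group defined by timestamps. As a rough sketch, reusing the hub path and connection setting from the earlier example as assumptions, receiving a batch just means binding to an array of events:

using System.Text;
using Microsoft.Azure.EventHubs;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class BatchedIoTHubFunction
{
    [FunctionName("BatchedIoTHubFunction")]
    public static void Run(
        [IoTHubTrigger("messages/events", Connection = "EventHubConnectionAppSetting")] EventData[] messages,
        ILogger log)
    {
        // One invocation receives one batch; its size depends on what was
        // available on the Event Hub partition, not on the events' timestamps.
        foreach (var message in messages)
        {
            log.LogInformation($"Processing: {Encoding.UTF8.GetString(message.Body.Array)}");
        }
    }
}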
Here is my architecture:
I have a Blob Storage account with an event that triggers an Azure Function when a blob is created in a container.
The Azure Function reads the file (XML) and sends a message to a Service Bus queue depending on an XML tag in the file, which means I have multiple queues in my Service Bus.
An app running with multiple threads then pulls the messages from the queues and executes an action depending on the message.
My problem is that when a massive number of files is uploaded to the blob container (~25,000), the Azure Function opens a lot of connections (ServiceBusClient) to send the messages to the Service Bus queues, which causes the Service Bus client to throw an error because I reach the connection limit (5,000 max). That means my Azure Function has more than 5,000 parallel connections to the Service Bus.
Is there a way to limit the Azure Function to 2,000 instances, or to slow down the events sent to the Azure Function?
I already tried adding a retry policy for failures inside the Azure Function and in the ServiceBusClient, but I would really like to avoid this.
Thanks!
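For reference, a minimal sketch of the kind of dispatch the question describes, with made-up container, tag, queue, and connection names. The static ServiceBusClient shown here is a common way to share one connection across executions instead of opening a new client per blob; it is an assumption for illustration, not something stated in the question.

using System;
using System.IO;
using System.Threading.Tasks;
using System.Xml.Linq;
using Azure.Messaging.ServiceBus;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class DispatchBlobToQueue
{
    // One shared client for the whole instance; creating a new ServiceBusClient
    // per execution is what drives the connection count up under heavy load.
    private static readonly ServiceBusClient Client =
        new ServiceBusClient(Environment.GetEnvironmentVariable("ServiceBusConnection"));

    [FunctionName("DispatchBlobToQueue")]
    public static async Task Run(
        [BlobTrigger("incoming/{name}")] Stream blob, string name, ILogger log)
    {
        // Pick the target queue from an XML tag in the uploaded file (tag name is hypothetical).
        var doc = XDocument.Load(blob);
        string queueName = doc.Root.Element("target")?.Value ?? "default-queue";

        ServiceBusSender sender = Client.CreateSender(queueName);
        await sender.SendMessageAsync(new ServiceBusMessage(doc.ToString()));
        log.LogInformation($"Dispatched {name} to queue {queueName}");
    }
}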
We designed an Event Hub trigger which reads the messages from Event Hub and inserts them into Cosmos DB. During this process, if there are any unhandled exceptions or Cosmos DB throttling, we move these messages to blob storage.
Is there any way to move the messages back to Event Hub from the blob through the Azure portal? This would help the Azure admin move the messages back to Event Hub in production.
No, I don't think there is a way to move the messages back in the portal UI.
I would use a combination of approaches here. First, I would use autoscaling of Cosmos DB to lower the risk of being throttled and to keep you from overprovisioning, and thus overspending, on Cosmos DB. Second, I would implement retry logic with exponential back-off in your trigger to further minimize the risk of throttling being a problem.
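A minimal sketch of what such retry logic with exponential back-off could look like around a Cosmos DB upsert; the helper name and attempt count are assumptions, and the Cosmos SDK also exposes built-in retry settings on CosmosClientOptions if you prefer not to hand-roll this.

using System;
using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class CosmosRetry
{
    // Hypothetical helper: retries a Cosmos DB upsert when throttled (HTTP 429),
    // honoring the server-suggested delay or backing off exponentially.
    public static async Task UpsertWithBackoffAsync<T>(Container container, T item, int maxAttempts = 5)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                await container.UpsertItemAsync(item);
                return;
            }
            catch (CosmosException ex) when (ex.StatusCode == (HttpStatusCode)429 && attempt < maxAttempts)
            {
                var delay = ex.RetryAfter ?? TimeSpan.FromSeconds(Math.Pow(2, attempt));
                await Task.Delay(delay);
            }
        }
    }
}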
If you still get failed events, you might not have to push them to separate storage after all. All events remain in Event Hubs for seven days by default, so you can simply re-read the entire stream if you want.
If that is not a good approach, I would push the failed messages to a queue (a Storage queue or Service Bus queue) and have an Azure Function on a timer trigger process the queue and send the messages back to the Event Hub. Then it is fully automatic and the admin does not have to do anything.
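A rough sketch of such a timer-triggered function, assuming a Storage queue named failed-events plus placeholder connection settings and hub name:

using System;
using System.Threading.Tasks;
using Azure.Messaging.EventHubs;
using Azure.Messaging.EventHubs.Producer;
using Azure.Storage.Queues;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class RequeueFailedEvents
{
    [FunctionName("RequeueFailedEvents")]
    public static async Task Run([TimerTrigger("0 */5 * * * *")] TimerInfo timer, ILogger log)
    {
        var queue = new QueueClient(Environment.GetEnvironmentVariable("StorageConnection"), "failed-events");
        await using var producer = new EventHubProducerClient(
            Environment.GetEnvironmentVariable("EventHubConnection"), "myeventhub");

        // Drain a chunk of failed messages and push them back onto the Event Hub.
        var messages = await queue.ReceiveMessagesAsync(maxMessages: 32);
        foreach (var msg in messages.Value)
        {
            await producer.SendAsync(new[] { new EventData(msg.Body.ToMemory()) });
            await queue.DeleteMessageAsync(msg.MessageId, msg.PopReceipt);
        }
        log.LogInformation($"Re-sent {messages.Value.Length} failed events to the Event Hub");
    }
}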
Is there any way to move the messages back to Event Hub from the blob through the Azure portal?
One workaround is to use Logic Apps, where you can create a workflow from Azure Blob Storage to Event Hub. Here is a sample flow of my Logic App.
In a recent project, I needed to add messages (>200 KB) to Azure Event Hubs through an endpoint exposed by Azure API Management. A Stream Analytics job then reads the messages from Event Hub and writes them to the respective tables in SQL Server.
I was using the log-to-eventhub policy to log the messages to Event Hub, but it has a size limitation of 200 KB.
What would be the best approach to overcome this size limitation, or should I consider a different way to log the payload to Event Hub?
Any help is much appreciated.
There is a limit for this described in the official docs:
The maximum supported message size that can be sent to an event hub from this API Management policy is 200 kilobytes (KB). If a message that is sent to an event hub is larger than 200 KB, it will be automatically truncated, and the truncated message will be transferred to event hubs.
You could consider using Azure Event Hubs output binding for Azure Functions.
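For illustration, a minimal sketch of a function using the Event Hubs output binding, assuming API Management would forward the payload to the function over HTTP instead of using log-to-eventhub; the hub name, connection setting, and authorization level are placeholders.

using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;

public static class LogPayload
{
    [FunctionName("LogPayload")]
    [return: EventHub("myeventhub", Connection = "EventHubConnectionAppSetting")]
    public static async Task<string> Run(
        [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req, ILogger log)
    {
        // Forward the full request body to the Event Hub; the function itself is
        // not bound by the 200 KB limit of the log-to-eventhub policy.
        string payload = await new StreamReader(req.Body).ReadToEndAsync();
        log.LogInformation($"Forwarding {payload.Length} characters to Event Hubs");
        return payload;
    }
}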
Regarding how Functions consume Event Hubs events, you could try using multiple parallel Function instances under the Consumption plan.
I have an Azure Event Hub over which I would like to send various types of messages. Each message should be handled by a separate Azure Function, based on its message type. What is the best way to accomplish this?
Actually, I could create some JSON envelope with a type and a payload property and let one parent Azure Function dispatch all the message payloads, based on their type, to other functions, but that feels a bit hacky.
This question basically asks the same thing; however, it is answered there using the IoT Hub and message routing. In the Event Hub configuration, I cannot find any setting to configure message routing, though.
Or should I switch to an Azure message queue to get this functionality?
I would use Azure Stream Analytics (ASA) to route the messages to the different Azure Functions. ASA lets you specify Event Hubs as a source and several sinks (which can include multiple Azure Functions). You can read more about setting up an Azure Stream Analytics job through the Azure Portal here. You'll need to set up the Event Hub as your source (docs) and set up your sinks (docs). You then write some SQL-like code to route the messages to the various sinks. However, ASA is costly relative to other services since you're paying for a fixed amount of compute.
I've put some pseudo code below. You'll have to adapt it based on how you configure your ASA job, using the information from the linked MS documentation.
SELECT *
INTO [YourOutputAlias]
FROM [YourInputAlias]
WHERE [CONDITION]

SELECT *
INTO [YourAlternateOutputAlias]
FROM [YourInputAlias]
WHERE [CONDITION]
Based on your additional info about the business requirements, and assuming that the event size is < 64 KB (1 MB in preview), the following screen snippet shows an example of your solution:
The concept of the above solution is based on pushing a batch of events to the Event Domain endpoint of AEG (Azure Event Grid). The Event Hub-triggered function is responsible for mapping each event message type in the batch to a domain topic before publishing it to AEG.
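As a rough illustration of that mapping step (not the exact code behind the screenshot), an Event Hub-triggered function could assign each event to a domain topic based on its type before publishing to the AEG domain endpoint; the endpoint settings, topic naming, and the eventType JSON property are assumptions.

using System;
using System.Text;
using System.Threading.Tasks;
using Azure;
using Azure.Messaging.EventGrid;
using Microsoft.Azure.EventHubs;
using Microsoft.Azure.WebJobs;
using Newtonsoft.Json.Linq;

public static class DispatchToEventGridDomain
{
    private static readonly EventGridPublisherClient Client = new EventGridPublisherClient(
        new Uri(Environment.GetEnvironmentVariable("EventGridDomainEndpoint")),
        new AzureKeyCredential(Environment.GetEnvironmentVariable("EventGridDomainKey")));

    [FunctionName("DispatchToEventGridDomain")]
    public static async Task Run(
        [EventHubTrigger("myeventhub", Connection = "EventHubConnectionAppSetting")] EventData[] events)
    {
        foreach (var e in events)
        {
            var json = JObject.Parse(Encoding.UTF8.GetString(e.Body.Array));
            var gridEvent = new EventGridEvent(
                "telemetry", (string)json["eventType"], "1.0",
                BinaryData.FromString(json.ToString()))
            {
                // For an Event Grid domain, Topic selects the domain topic to publish to.
                Topic = (string)json["eventType"]
            };
            await Client.SendEventAsync(gridEvent);
        }
    }
}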
Note that when using Azure IoT Hub for ingestion of the events, AEG can be integrated directly with the IoT Hub, and each event message can be distributed in a loosely coupled Pub/Sub manner. Besides that, for these business requirements the B1 scale tier for IoT Hub ($10/month) can be used, compared to the Basic Event Hubs tier ($11.16).
The IoT Hub has a built-in message routing mechanism (with some limitations), but a recent feature of the IoT Hub/AEG integration, publishing device telemetry messages to Event Grid, gives good support for a serverless architecture.
I ended up using Azure Durable Functions with the Fan Out/Fan In pattern.
In this approach, all events are handled by a single orchestrator function, which is in fact a Durable Azure Function (F1). It deserializes the incoming JSON to the correct DTO. Based on the content of the DTO, a corresponding activity function (F2) is invoked to process it.
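A minimal sketch of that dispatch step, with made-up message types, the DTO handling simplified to raw JSON, and hypothetical activity names:

using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Newtonsoft.Json.Linq;

public static class MessageOrchestration
{
    [FunctionName("F1_Orchestrator")]
    public static async Task Run([OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        var json = JObject.Parse(context.GetInput<string>());

        // Route to the activity function that matches the message type.
        switch ((string)json["type"])
        {
            case "temperature":
                await context.CallActivityAsync("F2_ProcessTemperature", json.ToString());
                break;
            case "humidity":
                await context.CallActivityAsync("F2_ProcessHumidity", json.ToString());
                break;
        }
    }

    [FunctionName("F2_ProcessTemperature")]
    public static Task ProcessTemperature([ActivityTrigger] string payload) => Task.CompletedTask;

    [FunctionName("F2_ProcessHumidity")]
    public static Task ProcessHumidity([ActivityTrigger] string payload) => Task.CompletedTask;
}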