Azure Functions + EventHub: why does batch latency keep growing? - node.js

I have the following chart:
As you can see, my batch latency keeps increasing and the count of outgoing messages keeps decreasing.
Inside the function I append to blob storage, but the blob metrics say everything is OK.
What could be causing the ever-increasing latency?
Function implementation:
module.exports = async function (context, eventHubMessages) {
  // Parse each raw event, falling back to an empty object when the JSON is invalid
  const parsedEvents = eventHubMessages.map((event) => {
    try {
      return JSON.parse(event);
    } catch (error) {
      context.log(`Error: cannot parse next event: ${event}`);
      return {};
    }
  });

  // Append one block per event to a blob named after the event id
  // (blob is an instance of the AzureStorage class shown below)
  for (const event of parsedEvents) {
    const { id } = event;
    const data = {
      data: 'data',
    };
    const filename = `${id}.log`;
    await blob.append(filename, JSON.stringify(data));
  }
};
The blob.append call is a method on an instance of a class that looks like this:
const { BlobServiceClient } = require('@azure/storage-blob');

class AzureStorage {
  constructor(config) {
    this.config = config;
    this.blobServiceClient = BlobServiceClient.fromConnectionString(this.config.storageConnectionString);
    this.containerClient = this.blobServiceClient.getContainerClient(this.config.containerName);
  }

  async append(filename, data) {
    const client = this.containerClient.getAppendBlobClient(filename);
    await client.createIfNotExists();
    await client.appendBlock(data, data.length);
  }
}
Another chart:
Update:
So, my problem was in blob storage. I called client.createIfNotExists() on every append, and that was the root of the problem. I rewrote my code as follows:
I call client.appendBlock first.
If it throws, I catch the error, call client.create(), and then call client.appendBlock one more time.
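A minimal sketch of what that rewritten append method could look like, assuming the same AzureStorage class and the @azure/storage-blob SDK; treating a 404 as "blob does not exist yet" is my assumption about how the missing-blob error surfaces:

async append(filename, data) {
  const client = this.containerClient.getAppendBlobClient(filename);
  try {
    // Optimistic path: append without an existence check first
    await client.appendBlock(data, Buffer.byteLength(data));
  } catch (error) {
    // Assumption: a missing blob surfaces as a 404 (BlobNotFound) RestError
    if (error.statusCode === 404) {
      await client.create();
      await client.appendBlock(data, Buffer.byteLength(data));
    } else {
      throw error;
    }
  }
}

This keeps the common case to a single round trip per event instead of the two that createIfNotExists plus appendBlock required.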

Thanks @JesseSquire for your helpful suggestion. Here are a few more troubleshooting points that help find the root cause of latency issues in Azure Functions integrated with Event Hubs.
Also check whether versioning is enabled on the storage account, which may slow things down considerably.
Check your function scaling setup to make sure you can scale out to at least the number of partitions your Event Hub has.
Use logging/Application Insights to measure the execution time of the blob append so you can find bottlenecks in your code (see the sketch after this list).
Telemetry logs give you performance data and metrics for your function, which helps you catch issues with integrated services, such as Event Hub batch latency, runtime exceptions, etc.
A dedicated storage account is better because of checkpointing: Event Hub-triggered functions can generate a large volume of storage transactions.
Refer to the MS docs on Azure Functions performance for Event Hubs.
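For the logging point above, a minimal sketch of timing the append from inside the function body, assuming the same blob helper as in the question; context.log output also lands in Application Insights traces, so no extra SDK is needed:

// Hypothetical timing wrapper around the existing append call
const start = Date.now();
await blob.append(filename, JSON.stringify(data));
context.log(`blob.append for ${filename} took ${Date.now() - start} ms`);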

Related

EventHub with NodeJS SDK - All consumers in ConsumerGroup getting the message

I hope someone can clarify this for me:
I have 2 consumers in the same consumer group. It is my understanding that they should coordinate between themselves, but I am seeing both consumers get all the messages. My code is pretty simple:
const { EventHubConsumerClient } = require("@azure/event-hubs");

const connectionString = "...";
const eventHubName = "my-hub-dev";
const consumerGroup = "processor";

async function main() {
  const consumerClient = new EventHubConsumerClient(consumerGroup, connectionString, eventHubName);
  const subscription = consumerClient.subscribe({
    processEvents: async (events, context) => {
      for (const event of events) {
        console.log(`Received event...`, event);
      }
    },
  });
}

main();
If I run two instances of this consumer code and publish an event, both instances will receive the event.
So my questions are:
Am I correct in my understanding that only 1 consumer should receive the message?
Is there anything I am missing here?
The EventHubConsumerClient requires a CheckpointStore that facilitates coordination between multiple clients. You can pass this to the EventHubConsumerClient constructor when you instantiate it.
The @azure/eventhubs-checkpointstore-blob package uses Azure Storage Blob to store the metadata required to coordinate multiple consumers using the same consumer group. It also stores checkpoint data: you can call context.updateCheckpoint with an event, and if you stop and start a new receiver, it will continue from the last checkpointed event in the partition that event was associated with.
There's a full sample using @azure/eventhubs-checkpointstore-blob here: https://github.com/Azure/azure-sdk-for-js/blob/master/sdk/eventhub/eventhubs-checkpointstore-blob/samples/javascript/receiveEventsUsingCheckpointStore.js
Clarification: The Event Hubs service doesn't enforce a single owner for a partition when reading from a consumer group unless the client has specified an ownerLevel. The highest ownerLevel "wins". You can set this in the options bag you pass to subscribe, but if you want the CheckpointStore to handle coordination for you it's best not to set it.
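A minimal sketch of wiring the blob checkpoint store into the consumer, assuming the @azure/eventhubs-checkpointstore-blob and @azure/storage-blob packages; the connection strings and the "checkpoint-container" name below are placeholders:

const { EventHubConsumerClient } = require("@azure/event-hubs");
const { ContainerClient } = require("@azure/storage-blob");
const { BlobCheckpointStore } = require("@azure/eventhubs-checkpointstore-blob");

const connectionString = "<event hubs connection string>";
const eventHubName = "my-hub-dev";
const consumerGroup = "processor";
const storageConnectionString = "<storage connection string>";

async function main() {
  // The checkpoint store lives in a blob container shared by all consumers
  const containerClient = new ContainerClient(storageConnectionString, "checkpoint-container");
  await containerClient.createIfNotExists();
  const checkpointStore = new BlobCheckpointStore(containerClient);

  // Passing the checkpoint store lets the clients coordinate partition ownership
  const consumerClient = new EventHubConsumerClient(consumerGroup, connectionString, eventHubName, checkpointStore);

  consumerClient.subscribe({
    processEvents: async (events, context) => {
      for (const event of events) {
        console.log("Received event", event.body);
      }
      // Checkpoint the last event so a restarted receiver resumes from here
      if (events.length > 0) {
        await context.updateCheckpoint(events[events.length - 1]);
      }
    },
    processError: async (err, context) => {
      console.error("Error:", err);
    },
  });
}

main().catch(console.error);

With two instances of this running, each instance should end up owning a subset of the partitions rather than both receiving every event.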

Batch insert to Table Storage via Azure function

I have the following Azure Storage queue-triggered Azure Function, which is bound to an Azure table for output.
[FunctionName("TestFunction")]
public static async Task<IActionResult> Run(
    [QueueTrigger("myqueue", Connection = "connection")] string myQueueItem,
    [Table("TableXyzObject"), StorageAccount("connection")] IAsyncCollector<TableXyzObject> tableXyzObjectRecords)
{
    var tableAbcObject = new TableXyzObject();
    try
    {
        tableAbcObject.PartitionKey = DateTime.UtcNow.ToString("MMddyyyy");
        tableAbcObject.RowKey = Guid.NewGuid().ToString();
        tableAbcObject.RandomString = myQueueItem;
        await tableXyzObjectRecords.AddAsync(tableAbcObject);
    }
    catch (Exception ex)
    {
    }
    return new OkObjectResult(tableAbcObject);
}

public class TableXyzObject : TableEntity
{
    public string RandomString { get; set; }
}
I am looking for a way to read 15 messages from a poison queue, which is different from myqueue (the queue trigger on the above Azure Function), and batch insert them into a dynamic table (tableXyz, tableAbc, etc.) based on a few conditions in the queue message. Since we have different poison queues, we want to pick up messages from multiple poison queues (the name of the poison queue will be provided in the myqueue message). This is done to avoid spinning up a new Azure Function every time we have a new poison queue.
Following is the approach I have in mind:
--> I might have to get 15 queue messages using a new QueueClient and the ReceiveMessages(15) method of the Azure.Storage.Queues package
--> And do a batch insert using the TableBatchOperation class (cannot use the output binding)
Is there any better approach than this?
Unfortunately, storage queues don't have a great solution for this. If you want it to be dynamic, then the idea of implementing your own clients and table outputs is probably your best option. The one thing I would suggest changing is using a timer trigger instead of a queue trigger. If you are putting a message on your trigger queue every time you add something to the poison queue, it would work as is; if not, a timer trigger ensures that poisoned messages are handled in a timely fashion.
Original Answer (incorrectly relating to Service Bus queues)
Bryan is correct that creating a new queue client inside your function isn't the best way to go about this. Fortunately, the Service Bus extension does allow batching. Unfortunately the docs haven't quite caught up yet.
Just make your trigger receive an array:
[QueueTrigger("myqueue", Connection = "connection")] string[] myQueueItems
You can set your max batch size in the host.json:
"extensions": {
"serviceBus": {
"batchOptions": {
"maxMessageCount": 15
}
}
}

Azure Storage Queue performance

We are migrating a transaction-processing service, which was processing messages from MSMQ and storing transactions in a SQL Server database, to use the Azure Storage Queue (to store the IDs of the messages, with the actual messages placed in Azure Blob Storage).
We should at least be able to process 200,000 messages per hour, but at the moment we barely reach 50,000 messages per hour.
Our application requests batches of 250 messages from the queue (which now takes about 2 seconds to get the IDs from the Azure queue and about 5 seconds to get the actual data from Azure Blob Storage), and we store this data into the database in one go using a stored procedure that accepts a DataTable.
Our service also resides in Azure on a virtual machine, and we use the NuGet libraries Azure.Storage.Queues and Azure.Storage.Blobs suggested by Microsoft to access the Azure Storage queue and blob storage.
Does anyone have suggestions how to improve the speed of reading messages from the Azure Queue and then retrieving the data from the Azure Blob?
var managedIdentity = new ManagedIdentityCredential();
UriBuilder fullUri = new UriBuilder()
{
    Scheme = "https",
    Host = string.Format("{0}.queue.core.windows.net", appSettings.StorageAccount),
    Path = string.Format("{0}", appSettings.QueueName),
};
queue = new QueueClient(fullUri.Uri, managedIdentity);
queue.CreateIfNotExists();
...
var result = await queue.ReceiveMessagesAsync(1);
...
UriBuilder fullUri = new UriBuilder()
{
    Scheme = "https",
    Host = string.Format("{0}.blob.core.windows.net", storageAccount),
    Path = string.Format("{0}", containerName),
};
_blobContainerClient = new BlobContainerClient(fullUri.Uri, managedIdentity);
_blobContainerClient.CreateIfNotExists();
...
public async Task<BlobMessage> GetBlobByNameAsync(string blobName)
{
    Ensure.That(blobName).IsNotNullOrEmpty();
    var blobClient = _blobContainerClient.GetBlobClient(blobName);
    if (!blobClient.Exists())
    {
        _log.Error($"Blob {blobName} not found.");
        throw new InfrastructureException($"Blob {blobName} not found.");
    }
    BlobDownloadInfo download = await blobClient.DownloadAsync();
    return new BlobMessage
    {
        BlobName = blobClient.Name,
        BaseStream = download.Content,
        Content = await GetBlobContentAsync(download)
    };
}
Thanks,
Vincent.
Based on the code you posted, I can suggest two improvements:
Receive 32 messages at a time instead of 1: Currently you're getting just one message at a time (var result = await queue.ReceiveMessagesAsync(1);). You can receive a maximum of 32 messages from the top of the queue. Just change the code to var result = await queue.ReceiveMessagesAsync(32); to get 32 messages. This will save you 31 trips to the storage service and should lead to some performance improvement.
Don't try to create the blob container every time: Currently you're trying to create a blob container every time you process a message (_blobContainerClient.CreateIfNotExists();). It is really unnecessary. Fetching 32 messages at a time already reduces the number of calls; however, you can simply move this code to your application startup so that you only call it once during the application lifecycle.

How do I make sure to receive all of my messages with Azure Service Bus Queue?

I created a Service Bus queue following the tutorial in the Microsoft documentation. I can send and receive messages; however, only half of my messages make it through. Literally half, only the even ones.
I tried changing the message frequency but it doesn't change anything. It doesn't matter if I send a message every 3 seconds or 3 messages per second, I only get half of them on the other end.
I have run the example code in all the possible languages and I have tried using the REST API and batch messaging but no dice.
I also tried using Azure Functions with the specific trigger for Service Bus Queues.
This is the receiving function code:
module.exports = async function (context, mySbMsg) {
    context.log('JavaScript ServiceBus queue trigger function processed message', mySbMsg);
    context.done();
};
And this is the send function code:
module.exports = async function (context, req) {
    context.log('JavaScript HTTP trigger function processed a request.');
    var azure = require('azure-sb');
    var idx = 0;
    function sendMessages(sbService, queueName) {
        var msg = 'Message # ' + (++idx);
        sbService.sendQueueMessage(queueName, msg, function (err) {
            if (err) {
                console.log('Failed Tx: ', err);
            } else {
                console.log('Sent ' + msg);
            }
        });
    }
    var connStr = 'Endpoint=sb://<sbnamespace>.servicebus.windows.net/;SharedAccessKeyName=<keyname>;SharedAccessKey=<key>';
    var queueName = 'MessageQueue';
    context.log('Connecting to ' + connStr + ' queue ' + queueName);
    var sbService = azure.createServiceBusService(connStr);
    sbService.createQueueIfNotExists(queueName, function (err) {
        if (err) {
            console.log('Failed to create queue: ', err);
        } else {
            setInterval(sendMessages.bind(null, sbService, queueName), 2000);
        }
    });
};
I expect to receive most of the sent messages (especially under these conditions of no load at all), but instead I only receive 50%.
My guess is that you are only listening to one of two subscriptions on the topic, and it is set up to split the messages between subscriptions. This functionality is used to split the workload across multiple services. You can read about topics here: https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-messaging-overview
and
https://learn.microsoft.com/en-us/azure/service-bus-messaging/topic-filters
Here is a short description from the links above:
"Partitioning uses filters to distribute messages across several existing topic subscriptions in a predictable and mutually exclusive manner. The partitioning pattern is used when a system is scaled out to handle many different contexts in functionally identical compartments that each hold a subset of the overall data; for example, customer profile information. With partitioning, a publisher submits the message into a topic without requiring any knowledge of the partitioning model. The message then is moved to the correct subscription from which it can then be retrieved by the partition's message handler."
To check this, see whether your Service Bus has partitioning turned on or any other filters. Turning partitioning off should do the trick in your case, I think.

Unable to find entities from table storage after inserting batches of 100

Issue:
We currently have two Azure consumption-plan functions, each receiving Service Bus queue messages as input.
The first function calls SQL Azure with a stored proc, gets 500k+ records back, and saves those records in batches of 100 to Azure Table Storage, with each batch having a unique partition key. After that's done, it creates a new queue message for the next function to read the batch and process it.
Everything works fine when the second function is not yet warm and still needs to warm up. If the second function is already running in memory and it receives the queue message, we do a partition-key lookup against table storage, and sometimes the data coming back seems to be empty.
Code that inserts batches into table storage:
foreach (var entry in partitionKeyGroupinng)
{
    var operation = new TableBatchOperation();
    entry.ToList().ForEach(operation.Insert);
    if (operation.Any())
    {
        await CloudTable.ExecuteBatchAsync(operation);
    }
}
This is within an async task function in a shared assembly referenced by all functions.
Code to read from table storage by partition-key lookup:
TableContinuationToken continuationToken = null;
var query = BuildQuery(partitionKey);
var allItems = new List<T>();
do
{
    var items = await CloudTable.ExecuteQuerySegmentedAsync(query, continuationToken);
    continuationToken = items.ContinuationToken;
    allItems.AddRange(items);
} while (continuationToken != null);
return allItems;
Code that calls this to look up by partition key:
var batchedNotifications = await _tableStorageOperations.GetByPartitionKeyAsync($"{trackingId.ToString()}_{batchNumber}");
I reckon it's to do with the batch still being written and not yet visible to other clients, but I don't know if that's the case. What would be the best way to handle this, given the function processing and eventual consistency?
I have set the following on the table service point:
tableServicePoint.UseNagleAlgorithm = false;
tableServicePoint.Expect100Continue = false;
tableServicePoint.ConnectionLimit = 300;
If I look up that same partition key in Storage Explorer as the event happens, I can see the batch, so the query does return values. I thought using an EGT (entity group transaction) for the batching would ensure the data is written and available as soon as possible, because the async WriteBatch method shouldn't finish before it has finished writing the batch; however, I don't know how long the table storage back end takes to write that to a physical partition and then make it available. I have also batched up all the service bus queue messages before sending them, to add some delay before the second function runs.
Question:
How do we deal with this delay in accessing these records out of table storage between two functions using service bus queues?
