Proper way to handle 412 Precondition Failed from Cosmos DB in an event-driven system with the application implemented in Node.js

I have an event-driven system with my application implemented in Node.js, using Cosmos DB (azure-cosmosdb-sqlapi) to store the events. Planning data arrives via various events from a Kafka broker; to complete a planning document I need to combine data from 5 different events, which I correlate using the planning id. In such a system the upsert operation runs into the 412 Precondition Failed error very often, because we receive many events for the same planning id.
The official Microsoft documentation says to retry, but I am confused about which approach to take:
Option 1: Handle the error code with a try/catch and call the method that handles the event from the catch block up to n times. If the retry still fails after n attempts, throw the exception back to the broker, in which case the broker sends the event again. The issue with this is that I am not able to add a test for it, and I need to manage all the retry logic in my code base. The advantage is that I know an event has failed and can retry directly without sending the event back to the broker. Below is the snippet from the handlePlanningEvents method in planningservice.ts:
try {
  await repository.upsert(planningEntry, etag)
} catch (e: any) {
  if (e.code === 412 && retries > 0) {
    // retry with the remaining attempt count; await so a failing retry still propagates
    await this.handlePlanningEvents(event, retries - 1)
  } else {
    throw e // throws the exception back to the broker
  }
}
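For context, here is roughly what I have in mind for option 1 as a full read-merge-retry loop using @azure/cosmos. This is only a sketch: planningContainer, PlanningEvent, PlanningDoc and mergeEvent are placeholders for my own container, event model and merge logic, and I'm assuming planningId doubles as both the id and the partition key.
import { Container } from "@azure/cosmos";

// Placeholders for my own event model and merge logic
interface PlanningEvent { planningId: string; [key: string]: unknown }
interface PlanningDoc { id: string; planningId: string; _etag?: string; [key: string]: unknown }

function mergeEvent(current: PlanningDoc | undefined, event: PlanningEvent): PlanningDoc {
  // naive merge: start from the existing document (or a fresh one) and copy the event's fields onto it
  return { id: event.planningId, planningId: event.planningId, ...(current ?? {}), ...event };
}

async function upsertWithEtagRetry(planningContainer: Container, event: PlanningEvent, retries = 3): Promise<void> {
  for (let attempt = 0; attempt <= retries; attempt++) {
    // read the latest version; resource is undefined if the document does not exist yet
    const { resource: current } = await planningContainer
      .item(event.planningId, event.planningId)
      .read<PlanningDoc>();

    const merged = mergeEvent(current, event);
    try {
      // the IfMatch condition makes the upsert fail with 412 if another writer got in between
      await planningContainer.items.upsert(merged, current?._etag
        ? { accessCondition: { type: "IfMatch", condition: current._etag } }
        : undefined);
      return; // success
    } catch (e: any) {
      if (e.code !== 412 || attempt === retries) {
        throw e; // give up and let the broker redeliver the event
      }
      // 412: the document changed concurrently, loop to re-read and merge again
    }
  }
}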
Option 2: Don't use try/catch to handle the error in the service code, but propagate the error to the controller, which sends a 500 error response to the broker so the broker sends the event again. The issue with this is that it's a longer path compared to the try/catch, where I can retry directly. The advantage is that I don't have to worry about retry logic anymore; it's handled by the broker, so there is less and cleaner code.
I am not sure which approach to take and am also open to other suggestions.

Related

Taking an action when HTTP response status is not 200

I'm new to Spring Integration. I have a simple flow which sends a request to an external resource with several attempts.
IntegrationFlows.from(MY_CHANNEL)
        .handle(myOutboundGateway, e -> e.advice(myRetryAdvice))
        .wireTap(logResponse())
        .get();
What I need to do is take some action (saving data to a database) when the call to the external resource is not successful after retrying (the HTTP status code is not 200 OK). How can I achieve that in my flow?
When all the retry attempts are exhausted, the RecoveryCallback is called.
See a sample here: How to get HTTP status in advice recovery callback. In that RecoveryCallback you can simply return null and send a message to some channel for the store-to-DB logic.
Another way is to add an extra advice on top of that retry instead of a RecoveryCallback. See the docs: https://docs.spring.io/spring-integration/docs/current/reference/html/messaging-endpoints.html#expression-advice. This way, when all the attempts are done, the exception bubbles up and is caught by that ExpressionEvaluatingRequestHandlerAdvice and sent to its failureChannel. Pay attention to trapException = true, so the error doesn't go back into the flow, only to that failureChannel for your DB logic.

How to persist Saga instances using storage engines and avoid race condition

I tried persisting saga instances using RedisSagaRepository; I wanted to run the saga in a load-balanced setup, so I cannot use InMemorySagaRepository.
However, after I switched, I noticed that some of the events published by Consumers were not getting processed by Saga. I checked the queue and did not see any messages.
What I noticed is that it is most likely to occur when the consumer takes little to no time to process the command and publish the event.
The issue does not occur if I use InMemorySagaRepository or add Task.Delay() in Consumer.Consume().
Am I using it incorrectly?
Also, if I want to run the saga in a load-balanced setup, and the saga needs to send multiple commands of the same type, using a dictionary to track completeness (similar logic as in Handling transition to state for multiple events): when multiple consumers publish events at the same time, would I have a race condition if two saga instances process two different events at the same time? In this case, would the Dictionary in the state object be set correctly?
The code is available here
SagaService.ConfigureSagaEndPoint() is where I switch between InMemorySagaRepository and RedisSagaRepository
private void ConfigureSagaEndPoint(IRabbitMqReceiveEndpointConfigurator endpointConfigurator)
{
    var stateMachine = new MySagaStateMachine();
    try
    {
        var redisConnectionString = "192.168.99.100:6379";
        var redis = ConnectionMultiplexer.Connect(redisConnectionString);

        /// If we switch to RedisSagaRepository and the consumer publishes its response too quickly,
        /// it seems the event published by the consumer reaches the saga instance before the state is updated.
        /// When that happens, the saga will not process the response event because it is not in the "Processing" state.
        //var repository = new RedisSagaRepository<SagaState>(() => redis.GetDatabase());
        var repository = new InMemorySagaRepository<SagaState>();

        endpointConfigurator.StateMachineSaga(stateMachine, repository);
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.ToString());
    }
}
LeafConsumer.Consume is where we add the Task.Delay()
public class LeafConsumer : IConsumer<IConsumerRequest>
{
    public async Task Consume(ConsumeContext<IConsumerRequest> context)
    {
        /// If the MySaga project is using RedisSagaRepository, uncomment await Task.Delay() below;
        /// otherwise it seems that the message published by the consumer will not be processed.
        /// If using InMemorySagaRepository, the code works without needing Task.Delay.
        /// Maybe I am doing something wrong here with these projects,
        /// or in real life we probably have code in the consumer that takes a few milliseconds to complete.
        /// However, we cannot predict the latency between the saga and Redis.
        //await Task.Delay(1000);

        Console.WriteLine($"Consuming CorrelationId = {context.Message.CorrelationId}");
        await context.Publish<IConsumerProcessed>(new
        {
            context.Message.CorrelationId,
        });
    }
}
When you have events published in this manner, and are using multiple service instances with a non-transactional saga repository (such as Redis), you need to design your saga such that a unique identifier is used and enforced by Redis. This prevents multiple instances of the same saga from being created.
You also need to accept the events in more than the "expected" state. For instance, expecting to receive a Start, which puts the saga into a processing state, before receiving another event only in processing, is likely to fail. Allowing the saga to be started (Initially, in Automatonymous) by any of the sequence of events is recommended, to avoid out-of-order message delivery issues. As long as the events all move the dial from the left to the right, the eventual state will be reached. If an earlier event is received after a later event, it shouldn't move the state backwards (or to the left, in this example) but only add information to the saga instance and leave it at the later state.
If two events are processed on separate service instances, they'll both try to insert the saga instance to Redis, which will fail as a duplicate. The message should then retry (add UseMessageRetry() to your receive endpoint), which would then pick up the now existing saga instance and apply the event.

With the retry options in durable functions, what happens after the last attempt?

I'm using a durable function that's triggered off a queue. I'm sending messages off the queue to a service that is pretty flaky, so I set up the RetryPolicy. Even still, I'd like to be able to see the failed messages even if the max retries has been exhausted.
Do I need to manually throw those to a dead-letter queue (and if so, it's not clear to me how I know when a message has been retried any number of times), or will the function naturally throw those to some kind of dead-letter/poison queue?
When an activity fails in Durable Functions, the exception is marshalled back to the orchestration as a FunctionFailedException. It doesn't matter whether you used automatic retry or not: at the very end, the whole activity fails and it is up to you to handle the situation. As per the documentation:
try
{
    await context.CallActivityAsync("CreditAccount",
        new
        {
            Account = transferDetails.DestinationAccount,
            Amount = transferDetails.Amount
        });
}
catch (Exception)
{
    // Refund the source account.
    // Another try/catch could be used here based on the needs of the application.
    await context.CallActivityAsync("CreditAccount",
        new
        {
            Account = transferDetails.SourceAccount,
            Amount = transferDetails.Amount
        });
}
The only thing the retry policy changes is the handling of transient errors (so you do not have to go down the compensation path every time you hit, e.g., a network issue).

How to listen to a queue using azure service-bus with Node.js?

Background
I have several clients sending messages to an azure service bus queue. To match it, I need several machines reading from that queue and consuming the messages as they arrive, using Node.js.
Research
I have read the azure service bus queues tutorial and I am aware I can use receiveQueueMessage to read a message from the queue.
However, the tutorial does not mention how one can listen to a queue and read messages as soon as they arrive.
I know I can simply poll the queue for messages, but this spams the servers with requests for no real benefit.
After searching in SO, I found a discussion where someone had a similar issue:
Listen to Queue (Event Driven no polling) Service-Bus / Storage Queue
And I know they ended up using the C# async method ReceiveAsync, but it is not clear to me if:
That method is available for Node.js
If that method reads messages from the queue as soon as they arrive, like I need.
Problem
The documentation for Node.js is close to non-existent, with that one tutorial being the only major document I found.
Question
How can my workers be notified of an incoming message in Azure Service Bus queues?
Answer
According to Azure support, it is not possible to be notified when a queue receives a message. This is valid for every language.
Work arounds
There are two main workarounds for this issue:
Use Azure topics and subscriptions. This way you can have all clients subscribe to a new-message event and have them check the queue once they receive the notification. This has several problems though: first, you have to pay for yet another Azure service, and second, you can have multiple clients trying to read the same message.
Continuous polling. Have the clients check the queue every X seconds. This solution is horrible, as you end up paying for the network traffic you generate and you spam the service with useless requests. To help minimize this there is a concept called long polling (a rough sketch follows below), which is so poorly documented it might as well not exist. I did find this NPM module though: https://www.npmjs.com/package/azure-awesome-queue
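If you do go the polling route, here is a minimal long-polling sketch using the current @azure/service-bus package (v7); the connection string, queue name, batch size and wait time below are placeholders, not values from your setup:
import { ServiceBusClient } from "@azure/service-bus";

// placeholders for your own namespace and queue
const connectionString = "<CONNECTION STRING TO SERVICE BUS NAMESPACE>";
const queueName = "<QUEUE NAME>";

async function pollQueue(): Promise<void> {
  const sbClient = new ServiceBusClient(connectionString);
  const receiver = sbClient.createReceiver(queueName);
  try {
    while (true) {
      // long poll: the call waits on the server for up to 60 seconds for messages to arrive,
      // instead of hammering the service with one request per second
      const messages = await receiver.receiveMessages(10, { maxWaitTimeInMs: 60000 });
      for (const message of messages) {
        console.log(`Received: ${message.body}`);
        await receiver.completeMessage(message); // remove the message from the queue
      }
    }
  } finally {
    await receiver.close();
    await sbClient.close();
  }
}

pollQueue().catch((err) => {
  console.error(err);
  process.exit(1);
});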
Alternatives
Honestly, at this point, you may be wondering why you should be using this service. I agree...
As an alternative there is RabbitMQ which is free, has a community, good documentation and a ton more features.
The downside here is that maintaining a RabbitMQ fault tolerant cluster is not exactly trivial.
Another alternative is Apache Kafka which is also very reliable.
You can receive messages from the Service Bus queue via the subscribe method, which listens for a stream of messages. Example from the Azure documentation below:
const { delay, ServiceBusClient, ServiceBusMessage } = require("@azure/service-bus");

// connection string to your Service Bus namespace
const connectionString = "<CONNECTION STRING TO SERVICE BUS NAMESPACE>"
// name of the queue
const queueName = "<QUEUE NAME>"

async function main() {
    // create a Service Bus client using the connection string to the Service Bus namespace
    const sbClient = new ServiceBusClient(connectionString);

    // createReceiver() can also be used to create a receiver for a subscription.
    const receiver = sbClient.createReceiver(queueName);

    // function to handle messages
    const myMessageHandler = async (messageReceived) => {
        console.log(`Received message: ${messageReceived.body}`);
    };

    // function to handle any errors
    const myErrorHandler = async (error) => {
        console.log(error);
    };

    // subscribe and specify the message and error handlers
    receiver.subscribe({
        processMessage: myMessageHandler,
        processError: myErrorHandler
    });

    // wait long enough for messages to arrive before closing the receiver
    await delay(20000);

    await receiver.close();
    await sbClient.close();
}

// call the main function
main().catch((err) => {
    console.log("Error occurred: ", err);
    process.exit(1);
});
Source: https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-nodejs-how-to-use-queues
I asked myself the same question; here is what I found.
Use Google PubSub, it does exactly what you are looking for.
If you want to stay with Azure, the following is possible:
cloud functions can be triggered from SBS messages
trigger an event-hub event with that cloud function
receive the event and fetch the message from SBS
You can make use of serverless functions with a "ServiceBusQueueTrigger"; they are invoked as soon as a message arrives in the queue.
It is pretty straightforward to do in Node.js: you need a binding defined in function.json with the type
"type": "serviceBusTrigger",
This article (https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-service-bus#trigger---javascript-example) covers this in more detail.
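As a rough sketch (the queue name "planning-events" and the connection setting "ServiceBusConnection" are placeholders, not values from your setup), the binding and a minimal handler could look like this:
// function.json
{
  "bindings": [
    {
      "name": "mySbMsg",
      "type": "serviceBusTrigger",
      "direction": "in",
      "queueName": "planning-events",
      "connection": "ServiceBusConnection"
    }
  ]
}
// index.js - invoked by the Functions runtime whenever a message lands on the queue
module.exports = async function (context, mySbMsg) {
    context.log("Service Bus queue trigger processed message:", mySbMsg);
};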

Azure Servicebus: Transient Fault Handling

I have a queue receiver which reads messages from the queue and processes them (does some processing and inserts some data into Azure Table storage, or retrieves data).
What I observed is that any exception my processing method (SendResponseAsync()) throws results in a retry, i.e. redelivery of the message, up to the default of 10 times.
Can this behavior be customized, i.e. retry only for certain exceptions and ignore others? For example, if there is some network issue it makes sense to retry, but if it is a BadArgumentException (poison message), then I may not want to retry.
Since the retry is taken care of by the Service Bus client library, can we customize this behavior?
This is the code at the receiver end:
public MessagingServer(QueueConfiguration config)
{
    this.requestQueueClient = QueueClient.CreateFromConnectionString(config.ConnectionString, config.QueueName);
    this.requestQueueClient.OnMessageAsync(this.DispatchReplyAsync);
}

private async Task DispatchReplyAsync(BrokeredMessage message)
{
    await this.SendResponseAsync(message);
}
