I would appreciate some clarification about the meaning of the "send-timeout" configuration parameter for an Aggregator. According to the Spring documentation, this configuration is "The timeout interval for sending the aggregated messages to the output or reply channel. Optional."
Now, based on my understanding, the Aggregator is a passive component and only decides whether to send a message based on the result of the release strategy after a message is received; it won't release messages based on timeout events. For that, a separate Reaper component is needed. Is this correct?
Assuming the send-timeout is the maximum amount of time the Aggregator can spend sending the completed group of messages to the output channel, what would happen if time runs out (because of this parameter) while sending a message? How will the Aggregator handle a message group that was ready to release and started to be sent but never finished? Will it be marked complete?
Thanks
This is a fairly commonly misunderstood attribute. In many places (but not all) we have explained it clearly in the XSD and docs.
The bottom line is that it rarely applies. It only applies when the output channel can block. For example, if the output-channel is a QueueChannel with a bounded capacity and the queue is full, it is the time we will wait to send the message to that channel.
If the output channel is, for example, a DirectChannel, it never applies.
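To make that concrete, here is a minimal sketch in plain Java configuration (channel capacity, timeout and bean names are illustrative, and component scanning of this class is assumed) of the one case where send-timeout matters: the aggregator's output channel is a bounded QueueChannel, so a send can block when the queue is full.

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.aggregator.AggregatingMessageHandler;
import org.springframework.integration.aggregator.DefaultAggregatingMessageGroupProcessor;
import org.springframework.integration.annotation.ServiceActivator;
import org.springframework.integration.channel.QueueChannel;
import org.springframework.integration.config.EnableIntegration;

@Configuration
@EnableIntegration
public class AggregatorConfig {

    @Bean
    public QueueChannel out() {
        // Bounded channel: once it holds 10 messages, send() blocks until space frees up.
        return new QueueChannel(10);
    }

    @Bean
    @ServiceActivator(inputChannel = "in")
    public AggregatingMessageHandler aggregator(QueueChannel out) {
        AggregatingMessageHandler handler =
                new AggregatingMessageHandler(new DefaultAggregatingMessageGroupProcessor());
        handler.setOutputChannel(out);
        // Java equivalent of the XML send-timeout attribute: wait at most 5 seconds
        // for space in the full output channel before an exception is thrown.
        handler.setSendTimeout(5000);
        return handler;
    }
}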
If it does apply, the exception will be thrown back to the caller and the group will remain. Attempts to re-release such groups will occur if you configure a MessageGroupStoreReaper; if the group is still eligible for release, the reaper will again attempt to send the group to the output channel.
The "stuck" group will also be released if new messages arrives for the same group and the release strategy still considers the group eligible (e.g. it uses size >= n rather than size == n).
BTW, while the aggregator is generally a passive component, we did introduce the group-timeout (and group-timeout-expression) in 4.0 which allows partial groups to be released after a timeout, even without a reaper.
However, if such a release fails to happen because of the send-timeout, a new release attempt will only be made if a reaper is configured.
I am new to message queues. I read the RabbitMQ documentation today about the "prefetch_count" setting, which limits the maximum number of unacked messages a consumer can hold. The official documentation says that ordering is guaranteed even when messages are re-queued. Assume a single consumer and a producer that sends messages 1->2->3->4->5->6->7->8->9->10 with prefetch_count set to 30. If message 5 fails and is not manually acked, that failed message is requeued, but 6, 7, 8, 9 and 10 are successfully consumed and acked. So although the messages appear to be pushed to the consumer in order, the order of consumption may end up being 1->2->3->4->6->7->8->9->10->5. I have these questions:
In a scenario where the order of consumption must be guaranteed, how do you handle situations like the one above in practice?
If the order of consumption must be guaranteed, how do you ensure that the next message is not consumed until the previous message has been consumed successfully?
Is it necessary to persist messages to MySQL, for example to record messages that failed to be consumed?
Here is the RabbitMQ documentation about ordering.
Message ordering guarantees
Section 4.7 of the AMQP 0-9-1 core specification explains the conditions under which order is guaranteed: messages published in one channel, passing through one exchange and one queue, and one outgoing channel will be received in the same order that they were sent. RabbitMQ offers stronger guarantees since release 2.7.0.
Messages can be returned to the queue using AMQP methods that feature a requeue parameter (basic.recover, basic.reject, and basic.nack), or due to a channel closing while holding unacknowledged messages. Any of these scenarios caused messages to be queued at the back of the queue for RabbitMQ releases earlier than 2.7.0. From RabbitMQ release 2.7.0, messages are always held in the queue in publication order, even in the presence of requeueing or channel closure.
With release 2.7.0 and later it is still possible for individual consumers to observe messages out of order if the queue has multiple subscribers. This is due to the actions of other subscribers who may requeue messages. From the perspective of the queue the messages are always held in publication order.
I have a peculiar type of problem to solve.
I have configured RabbitMQ as the message broker and it is working, but when processing fails in the consumer I currently acknowledge with a nack. That blindly re-queues the message with whatever payload originally came in, whereas I want to add some more fields to it and re-queue it in a simple way.
For Example:
When the consumer gets payload data from RabbitMQ, it processes it and tries to perform some work based on it on multiple host machines. If one machine is not reachable, I need to process that part separately after some time.
Hence I'm planning to re-queue the failed data back to the queue with one more field containing the machine name, so it will be processed again by the existing logic.
How can I achieve this? Can someone help me?
When a message is requeued, it will be placed in its original position in its queue, if possible. If that is not possible (due to concurrent deliveries and acknowledgements from other consumers when multiple consumers share a queue), the message will be requeued to a position closer to the queue head. This way you will end up in an infinite loop (consuming and requeuing the message). To avoid this, you can positively acknowledge the message and publish it to the queue with the updated fields. Publishing the message puts it at the end of the queue, so you will be able to process it again after some time.
Reference https://www.rabbitmq.com/nack.html
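Here is a rough sketch of that approach with the plain RabbitMQ Java client (the queue name, the "failed-machine" header, and the process()/failedMachineName() helpers are illustrative, not from the original post): on failure, the consumer publishes an enriched copy of the message to the tail of the queue and then acks the original, instead of nack-ing it back to (near) its original position.

import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import java.util.HashMap;
import java.util.Map;

public class AckAndRepublish {

    private static final String QUEUE = "work"; // illustrative queue name

    public static void consume(Channel channel) throws Exception {
        channel.basicConsume(QUEUE, false, (consumerTag, delivery) -> {
            long tag = delivery.getEnvelope().getDeliveryTag();
            boolean ok;
            try {
                process(delivery.getBody()); // your existing processing logic
                ok = true;
            }
            catch (Exception e) {
                ok = false;
            }
            if (ok) {
                channel.basicAck(tag, false);
            }
            else {
                // Copy the original headers and add the extra field.
                Map<String, Object> headers = new HashMap<>();
                if (delivery.getProperties().getHeaders() != null) {
                    headers.putAll(delivery.getProperties().getHeaders());
                }
                headers.put("failed-machine", failedMachineName()); // hypothetical extra field
                AMQP.BasicProperties props = delivery.getProperties().builder()
                        .headers(headers)
                        .build();
                // Publish the enriched copy to the tail of the queue (default exchange),
                // then ack the original so it is not requeued near the head.
                channel.basicPublish("", QUEUE, props, delivery.getBody());
                channel.basicAck(tag, false);
            }
        }, consumerTag -> { });
    }

    private static void process(byte[] body) { /* existing consumer logic */ }

    private static String failedMachineName() { return "host-42"; } // placeholder
}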
I'm wondering how I can handle sending events when the message broker suddenly goes down. Please take a look at this code:
using (var uow = uowProvider.Create())
{
...
...
var policy = offer.Buy(customer);
uow.Policies.Add(policy);
// DB changes are saved here! but what would happen if...
await uow.CommitChanges();
// ...eventPublisher throw an exception?
await eventPublisher.PublishMessage(PolicyCreated(policy));
return true;
}
IMHO if eventPublisher throws an exception, the PolicyCreated event won't be published. I don't know how to deal with this situation. The event must be published in the system. I suppose the only good solution would be to create some kind of retry mechanism, but I'm not sure...
I would like to elaborate a bit on the answers provided by both @Imran Arshad and @VoiceOfUnreason which are, of course, correct.
There are basically 3 patterns when it comes to publishing messages:
exactly once delivery (requires distributed transactions)
at most once delivery (no distributed transaction but may miss messages - like the actor model)
at least once delivery (no distributed transaction but may have duplicate messages)
The following is all in terms of your example.
For exactly once delivery both the database and the queue would need to provide the ability to enlist in distributed transactions. Some queues do not provide this functionality out-of-the-box (like RabbitMQ) and even though it may be possible to roll your own, it may not be the best option. Distributed transactions are typically quite slow.
For at most once delivery we have to accept that we may miss messages and I'm guessing that in most use-cases this is quite troublesome. You would get around this by tracking the progress and picking up the missed messages and resending them if required.
For at least once delivery we would need to ensure that the messages are idempotent. When we get a duplicate message (usually quite an edge case) it should be ignored, or its outcome should be the same as that of the initially processed message.
Now, there are a couple of ways around your issue. You could start a database transaction and make your database changes. Before you commit, you perform the message sending. Should that fail, your transaction would be rolled back. That works fine for sending a single message, but in your case some subscribers may have already received a message. This complicates matters, as all your subscribers need to receive the message or none of them should receive it.
You could have your subscriber check whether the state is indeed true and whether it should continue processing. This places a burden on the subscriber and introduces some coupling. It could either postpone the action should the state not allow processing, or ignore it.
Another option is that instead of publishing the event you send yourself a command that indicates completion of the step. The command handler would perform the publishing and retry until all subscriber queues receive the message. This would require the relevant subscribers to ignore those messages that they had already processed (idempotence).
The outbox is a store-and-forward approach and will eventually send the message to all subscribers. You could perhaps have your outbox included in the database transaction. In my Shuttle.Esb service bus, one of the folks who used it came across a weird side-effect that I had not planned: he used a SQL-based queue as an outbox and the queue connection was to the same database, so it was included in the database transaction and would roll back along with all the other changes if not committed. Apologies for promoting my own product, but I'm sure other service bus offerings may have the same functionality.
There are therefore quite a few things to consider and various techniques to mitigate the risk of a queue outage. I would, however, move the queue interaction to before the database commit.
For a reliable system you need to save events locally. If your broker is down you have to retry and publish the event.
There are many ways to achieve this, but the most common is the outbox pattern. Just like your mailbox, your event/message stays local and you keep retrying until it's sent, at which point you mark the message as published in your local DB.
You can read more about it here: Publish Events
You'll want to review Udi Dahan's discussion of Reliable Messaging without Distributed Transactions.
But very roughly, the PolicyCreated event becomes part of the unit of work; either because it is saved in the Policy representation itself, or because it is saved in an EventRepository that participates in the same transaction as the Policies repository.
Once you've captured the information in your database, retrying the publish is relatively straightforward: read the events from the database, publish, and optionally mark the events in the database as successfully published so that they can be cleaned up.
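As a rough illustration of that last point, here is a sketch in Java/JDBC (the outbox table, its columns, and the EventPublisher interface are assumptions, not an existing API): the event row is inserted in the same transaction as the business data, and a relay later reads unpublished rows, publishes them, and marks them as published.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class OutboxRelay {

    // Assumed contract for whatever broker client you use (RabbitMQ, etc.).
    public interface EventPublisher {
        void publish(String type, String payload) throws Exception;
    }

    // Called inside the same transaction that saves the policy, so the event row
    // commits or rolls back together with the business changes.
    public static void saveEvent(Connection tx, String type, String payload) throws Exception {
        try (PreparedStatement ps = tx.prepareStatement(
                "INSERT INTO outbox (type, payload, published) VALUES (?, ?, FALSE)")) {
            ps.setString(1, type);
            ps.setString(2, payload);
            ps.executeUpdate();
        }
    }

    // Run periodically: publish pending events and mark them as done. If the broker
    // is down, the rows simply stay unpublished and are retried on the next run.
    public static void relay(Connection conn, EventPublisher publisher) throws Exception {
        try (PreparedStatement select = conn.prepareStatement(
                     "SELECT id, type, payload FROM outbox WHERE published = FALSE ORDER BY id");
             ResultSet rs = select.executeQuery()) {
            while (rs.next()) {
                publisher.publish(rs.getString("type"), rs.getString("payload"));
                try (PreparedStatement mark = conn.prepareStatement(
                        "UPDATE outbox SET published = TRUE WHERE id = ?")) {
                    mark.setLong(1, rs.getLong("id"));
                    mark.executeUpdate();
                }
            }
        }
    }
}

Note that this gives at-least-once delivery: if the relay dies between the publish and the UPDATE, the event is sent again on the next run, so subscribers still need to be idempotent.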
All Spring projects have extensive and extremely helpful documentation, and the same is true for the Spring Integration project, but one topic I wasn't able to gather much information on is message group reapers. It seems to be a mechanism that can influence an aggregator externally to release some messages. What is unclear is how it differs from, or resembles, a release strategy. Does it actually release messages from the aggregator so they are then processed by subsequent components in the flow, or does it just empty out the aggregator, kind of like discarding the aggregated messages? Secondly, is there control over which messages are released, so that messages belonging to a particular group are released but not others?
Prior to Spring Integration 4.0, the aggregator was a completely passive component. Group release (or not) is triggered by the arrival of a new message - when the release strategy is consulted to see if the group to which the message belongs can be released. So, how do we discard (or release) a group that is idle and has had no new messages arrive for some time?
The reaper runs on a schedule and can detect "stuck" groups and release them based on time without a new message arriving.
It is also used to remove empty groups - the aggregator can be configured so that empty groups remain present, allowing late-arriving messages to be discarded instead of forming a new group. Empty groups can be reaped on a different schedule from partially complete groups.
The reaper expires all groups that meet the configured criteria; if the group passes the criteria, it is expired; otherwise it is not, and will be considered the next time the reaper runs. The action taken by the reaper (partial release, discard etc) depends on other settings on the aggregator.
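For reference, a minimal sketch of wiring a reaper in Java configuration (the store, timeout values and schedule are illustrative); whether an expired group is partially released or discarded is governed by the aggregator's own settings, such as send-partial-result-on-expiry:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.store.MessageGroupStore;
import org.springframework.integration.store.MessageGroupStoreReaper;
import org.springframework.integration.store.SimpleMessageStore;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;

@Configuration
@EnableScheduling
public class ReaperConfig {

    // The same store the aggregator is configured with; the aggregator registers a
    // callback on it, so expiring a group triggers the aggregator's force-complete logic.
    private final MessageGroupStore groupStore = new SimpleMessageStore();

    private final MessageGroupStoreReaper reaper = new MessageGroupStoreReaper(groupStore);

    @Bean
    public MessageGroupStore groupStore() {
        return groupStore;
    }

    @Bean
    public MessageGroupStoreReaper reaper() {
        reaper.setTimeout(30000); // expire groups that have been idle for 30 seconds
        return reaper;
    }

    // Run the reaper every 10 seconds; groups that do not yet meet the timeout
    // are simply considered again on the next run.
    @Scheduled(fixedRate = 10000)
    public void reap() {
        reaper.run();
    }
}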
Spring Integration 4.0 introduced new attributes (group-timeout and group-timeout-expression), allowing partial groups to be released automatically after the timeout, without a reaper (the timer is armed when a new message arrives for a group). The reaper can still be used to expire (remove) empty groups in this case.
I hope that explains the function of the reaper.
I am using an Azure queue and have several different processes reading from the queue.
My system is built in a way that assumes each message is read only once.
This Microsoft article claims Azure queues have an at least once delivery guarantee which potentially means two processes can read the same message from the queue.
This StackOverflow thread claims that if I use GetMessage then the message becomes invisible to all other processes for the invisibility timeout.
Assuming I use GetMessage() and never exceed the message invisibility time before I DeleteMessage, can I assume I will get each message only once?
There is a property on the queue message named DequeueCount, which is the number of times the message has been dequeued; it is maintained by the queue service. I think you can use this property to identify whether your message has been read before.
https://learn.microsoft.com/en-us/dotnet/api/azure.storage.queues.models.queuemessage.dequeuecount?view=azure-dotnet
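For what it's worth, here is a small sketch with the Azure Storage Queue SDK for Java (azure-storage-queue v12 assumed; the connection string and queue name are placeholders) that inspects DequeueCount before processing; a value greater than 1 means the message has been handed out before.

import com.azure.core.util.Context;
import com.azure.storage.queue.QueueClient;
import com.azure.storage.queue.QueueClientBuilder;
import com.azure.storage.queue.models.QueueMessageItem;
import java.time.Duration;

public class DequeueCountCheck {

    public static void main(String[] args) {
        QueueClient queue = new QueueClientBuilder()
                .connectionString("<storage-connection-string>") // placeholder
                .queueName("work-items")                         // placeholder
                .buildClient();

        // Receive one message and keep it invisible to other readers for 5 minutes.
        for (QueueMessageItem msg : queue.receiveMessages(1, Duration.ofMinutes(5),
                Duration.ofSeconds(30), Context.NONE)) {
            if (msg.getDequeueCount() > 1) {
                // The message has been delivered before (a previous reader crashed
                // or exceeded the visibility timeout), so handle it accordingly.
                System.out.println("Redelivered message: " + msg.getMessageId());
            }
            process(msg.getBody().toString());
            // Delete only after the work is done, and before the visibility timeout expires.
            queue.deleteMessage(msg.getMessageId(), msg.getPopReceipt());
        }
    }

    private static void process(String body) { /* business logic */ }
}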
No. The following can happen:
GetMessage()
Add some records in a database...
Generate some files...
DeleteMessage() -> Unexpected failure (process that crashes, instance that reboots, network connectivity issues, ...)
In this case your logic was executed without calling DeleteMessage. This means, once the invisibility timeout expires, the message will appear in the queue and be processed once again. You will need to make sure that your process is idempotent:
Idempotence is the property of certain operations in mathematics and computer science, that they can be applied multiple times without changing the result beyond the initial application.
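One common way (a sketch under assumptions: the processed_messages table and the way you obtain a Connection are not from the original answer) to get that idempotence is to record each message's MessageId in a table with a unique constraint, ideally in the same transaction as the business changes, and skip the work when the insert reports a duplicate.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;

public class IdempotentHandler {

    // Returns true if this message id is seen for the first time.
    // Assumes: CREATE TABLE processed_messages (message_id VARCHAR(64) PRIMARY KEY)
    static boolean firstTime(Connection conn, String messageId) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO processed_messages (message_id) VALUES (?)")) {
            ps.setString(1, messageId);
            ps.executeUpdate();
            return true;
        }
        catch (SQLIntegrityConstraintViolationException e) {
            return false; // unique-constraint violation: already handled
        }
    }

    static void handle(Connection conn, String messageId, Runnable businessLogic) throws SQLException {
        // Run the insert and the business changes in one transaction so they
        // commit (or roll back) together.
        conn.setAutoCommit(false);
        if (!firstTime(conn, messageId)) {
            conn.rollback();
            return; // duplicate delivery, ignore it
        }
        businessLogic.run(); // add records, generate files, etc.
        conn.commit();
        // Only after this point should DeleteMessage be called on the queue message.
    }
}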
An alternative solution would be to use Service Bus Queues with the ReceiveAndDelete mode (see this page under How to Receive Messages from a Queue). If you receive the message it will be marked as consumed and never appear again. This way you can be sure it is delivered at most once (see the comparison with Storage Queues here). But then again, if something happens while you are processing the message (i.e. the server crashes, ...), you could lose valuable information.
Update:
This will simulate at-most-once delivery in Storage queues. The message can arrive multiple times via GetMessage, but will only be processed once by your business logic (with the risk that some of your business logic will never execute).
GetMessage()
DeleteMessage()
AddRecordsToDatabase()
GenerateFiles()