How does group timeout work? (Spring Integration)

The aggregator is a passive component and its release logic is only triggered when a new message arrives. How, then, does the group timeout work?
Is it a scheduled task, similar to the reaper, that constantly monitors the state of the aggregator? Does that also mean it repeatedly evaluates the group-timeout-expression to determine the value of group-timeout, or is it evaluated once at the start? Since there are examples based on the size of the payload, I am assuming it must evaluate the group-timeout-expression repeatedly, but if that's the case, how often does that happen? Can that frequency of evaluation be controlled or modified? Along the same lines, if the aggregator is a POJO, is the group-timeout functionality flexible enough to let you specify the timeout from a POJO method?
Another interesting thing I noticed: for my group-timeout-expression I was trying a SpEL expression and was referencing payload or headers, but those apparently aren't available in the context. It seems the root object within this group-timeout-expression is a SimpleMessageGroup, which doesn't have payload or headers properties. So how can I access payload or headers within the SpEL expression of group-timeout-expression?
In fact, in my case I want the actual message (the wrapper around the payload), because my method signature expects an actual Spring Integration Message, not the payload.

Prior to Spring Integration 4.0, the aggregator was a passive component and you had to configure a MessageGroupStoreReaper to expire groups.
The group-timeout* attributes were added in 4.0; when a new message arrives, a task is scheduled to time out the group. If the next message arrives before the timeout, the task is cancelled, the new message is added to the group and, if the release doesn't happen, a new task is scheduled. The expression is re-evaluated each time (the example in the documentation looks at the group size).
Yes, the root object for expression evaluation is the message group.
Which "payload and headers" do you need? There are likely multiple messages in the group. You can easily access the first message in the group using one.payload or one.headers['foo'] (these expressions use group.getOne(), which peeks at the first message in the group).
If you need to access a different message, you would need to write some code to iterate over the messages in the group. We don't currently expose the newly arrived message, but it would be possible to make it available as a variable such as #newMessage; feel free to open an 'Improvement' JIRA issue for that.
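For illustration, a minimal sketch of the namespace configuration (the channel names and the correlation header are hypothetical; the size()-based timeout expression is the one from the reference documentation):

<int:aggregator input-channel="in" output-channel="out"
        correlation-strategy-expression="headers['correlationKey']"
        group-timeout-expression="size() ge 2 ? 100 : -1"
        send-partial-result-on-expiry="true"/>

Since the root object is the group, an expression such as group-timeout-expression="one.headers['groupTimeout']" would instead take the timeout from a (hypothetical) header on the first message in the group.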

Related

Message Collapsing

I'm trying to determine if there's a way for Azure Service Bus to provide message collapsing. Specifically I'm after something like:
First event into a queue gets picked up straight away
All other events that are queued within the next N seconds, and match some criteria (e.g. matching message ids), have their scheduled enqueue time set so they fire at the end of the N seconds. If a "waiting" message already exists, it should be deleted.
After the N seconds have expired, the newest scheduled message appears and is picked up.
Basically I need a way to get a good time-to-first-event, but provide protection from over-processing events from chatty sources.
Does anyone have a pattern they've used to get something close to these semantics?
Update 1
The messages involved aren't true duplicates, rather they're the current state of an entity that is used for some processing (e.g. a message that's generated each time a file is updated). The result of the processing of an early message is fully replaced by that of later messages (e.g. the result is the size of the file). So we still need to guarantee we process the most recent message, but it's a waste to process all M within N seconds.
It sounds like you're talking about Duplicate Detection, especially with regard to matching MessageIds. If you want to evaluate some other attribute in the message for duplicate detection, maybe it's worth taking a step back and asking: why are my publishers sending so many duplicate messages? If it's unavoidable, maybe you can segregate your chatty consumers into a separate consumer group and manually handle the duplicate check, then re-enqueue (just thinking out loud).
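Given the update (latest-state-wins rather than true duplicates), one publisher-side way to approximate the "collapse within N seconds" behavior is to debounce with scheduled messages, cancelling the previously scheduled message when a newer state arrives. A minimal sketch with the azure-messaging-servicebus Java client; the queue name, connection setting, key extraction, and 30-second window are all assumptions:

import com.azure.messaging.servicebus.ServiceBusClientBuilder;
import com.azure.messaging.servicebus.ServiceBusMessage;
import com.azure.messaging.servicebus.ServiceBusSenderClient;
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CollapsingPublisher {
    private static final long WINDOW_MILLIS = 30_000; // "N seconds", assumed

    private static class Window {
        long endMillis;
        Long scheduledSeq; // sequence number of the "waiting" scheduled message, if any
    }

    private final Map<String, Window> windows = new ConcurrentHashMap<>();
    private final ServiceBusSenderClient sender = new ServiceBusClientBuilder()
            .connectionString(System.getenv("SB_CONNECTION")) // assumed setting
            .sender()
            .queueName("entity-updates")                      // assumed queue
            .buildClient();

    public synchronized void publish(String entityKey, String latestState) {
        long now = System.currentTimeMillis();
        Window w = windows.get(entityKey);
        if (w == null || now >= w.endMillis) {
            // first event (or window expired): picked up straight away
            sender.sendMessage(new ServiceBusMessage(latestState));
            w = new Window();
            w.endMillis = now + WINDOW_MILLIS;
            windows.put(entityKey, w);
            return;
        }
        // within the window: delete the waiting message, schedule the newer state
        if (w.scheduledSeq != null) {
            sender.cancelScheduledMessage(w.scheduledSeq);
        }
        w.scheduledSeq = sender.scheduleMessage(new ServiceBusMessage(latestState),
                OffsetDateTime.ofInstant(Instant.ofEpochMilli(w.endMillis), ZoneOffset.UTC));
    }
}

Note this only collapses events seen by a single publisher instance; collapsing across multiple publishers would need the window state in a shared store.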

How to tell when the last message from an Azure Queue is removed

I'm exploring azure functions and queue triggers to implement a recursive solver.
The desired behavior is as follows:
1. There exists a queue called "solveme" which, when it receives an item containing state, will create a second queue with a unique name (GUID) and send the initial state to that queue with some correlation information.
2. When a message is received into the newly created queue, it will possibly enqueue more work representing all the next possible states, so that before the message is "completed", the new work already exists in the queue. If there are no valid next states for the given state, it will just complete the message. This means that the original message is not "completed" until all new work is enqueued. During this step, I'm writing off data to Azure Tables representing the valid solutions, using the correlation information from step 1.
3. When all messages have been processed from the newly created queue, delete the queue.
I am not sure how to go about #3 without having a timer or something that checks the queue length until it is zero and then deletes the queue. I'm trying to use only Azure Functions and Azure Storage, nothing else.
It would be nice if there was some way to do this directly from a QueueTrigger, or within a function called by a QueueTrigger.
Further, is there a better / standard way to trigger completion of this kind of work? Effectively, the empty queue is the trigger that all work is done, but if there is a better way, I'd like to know about it!
There's no reliable way to do that out of the box. If your consumer is faster than the producer, the queue could stay at 0 for a long time while the producer keeps sending data.
I would suggest sending a special "End of Sequence" message, so when the consumer receives this message it knows to go and delete the queue.
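A rough sketch of that sentinel check in an Azure Functions Java queue trigger. The "EOS" marker is an assumption, and the sketch binds the trigger to a queue name taken from an app setting; with per-solve GUID queues you would have to resolve the name from the message itself:

import com.azure.storage.queue.QueueClient;
import com.azure.storage.queue.QueueClientBuilder;
import com.microsoft.azure.functions.ExecutionContext;
import com.microsoft.azure.functions.annotation.FunctionName;
import com.microsoft.azure.functions.annotation.QueueTrigger;

public class SolverFunction {
    @FunctionName("solveStep")
    public void run(
            @QueueTrigger(name = "msg", queueName = "%WorkQueueName%",
                          connection = "AzureWebJobsStorage") String msg,
            ExecutionContext context) {
        if ("EOS".equals(msg)) {
            // the special "End of Sequence" message: all work is done, drop the queue
            QueueClient queue = new QueueClientBuilder()
                    .connectionString(System.getenv("AzureWebJobsStorage"))
                    .queueName(System.getenv("WorkQueueName"))
                    .buildClient();
            queue.delete();
            return;
        }
        // otherwise: enqueue all next states *before* this message completes,
        // and write any valid solutions off to table storage
    }
}

Mind that in a recursive fan-out like this, knowing when to send the sentinel is itself nontrivial, which is part of why the Durable Functions suggestion below may be a better fit.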
P.S. Have a look at Durable Functions - it looks like you could use some of the orchestration there instead of implementing workflows yourself (just mind it's in early preview).

How to avoid concurrency on aggregates status using Rebus in a server cluster

I have a web service that uses Rebus as a service bus.
Rebus is configured as explained in this post.
The web service is load-balanced across a two-server cluster.
These services are for a production environment and each production machine sends commands to save the produced quantities and/or to update its state.
In the BL I've modelled an Aggregate Root for each machine and it executes the commands emitted by the real machine. To preserve the correct status, the Aggregate needs to receive the commands in the same sequence as they were emitted, and, since there is no concurrency for that machine, that is the same order they are saved on the bus.
E.g.: machine XX sends the command 'add new piece done' and then the command 'set stop for maintenance'. Executing these commands in sequence, you end with Aggregate XX in state 'Stop'; but with multiple server/worker roles, both commands could be executed at the same time on the same version of the Aggregate. This means that, depending on who saves the aggregate first, Aggregate XX can end up in state 'Stop' or 'Producing pieces', which is not the same thing.
I've introduced a service bus to add scale-out as the number of machines grows, and resilience (if a server fails, I only get a slowdown in processing commands).
Currently I'm using the name of the aggregate like a "topic" or "destinationAddress" with the IAdvancedApi, so the name of the aggregate is saved as the recipient of the transport. Then I've created a custom Transport class that:
1. does not remove in-progress messages but marks them with an InProgress state;
2. when retrieving messages, selects only those in a recipient that has none InProgress.
I'm wondering: is this the best way to guarantee that the bus executes the commands for an aggregate in the same sequence as they arrived?
The solution would be to have some kind of locking on your aggregate root, which needs to happen at the data store level.
E.g. by using optimistic locking (probably implemented with some kind of revision number or something like that), you would be sure that you would never accidentally overwrite another node's edits.
This would allow for your aggregate to either
a) accept the changes in either order (which is generally preferable – makes your system more tolerant), or
b) reject an invalid change
If the aggregate rejects the change, this could be implemented by throwing an exception. And then, in the Rebus handler that catches this exception, you can e.g. await bus.Defer(TimeSpan.FromSeconds(5), theMessage) which will cause it to be delivered again in five seconds.
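For illustration, the optimistic-locking guard could look roughly like this (a sketch in Java; all type and method names are hypothetical, and Rebus itself is .NET, so bus.Defer lives in the surrounding handler):

// aggregate persisted together with a revision number
MachineAggregate aggregate = repository.load(command.machineId);
long expectedRevision = aggregate.revision;

aggregate.apply(command);   // validate and perform the state transition

// atomic compare-and-swap on the revision, enforced at the data store level
boolean saved = repository.saveIfRevisionEquals(aggregate, expectedRevision);
if (!saved) {
    // another node saved a newer revision first: don't overwrite it;
    // throw so the handler can defer/redeliver the message (e.g. bus.Defer)
    throw new java.util.ConcurrentModificationException(
            "stale revision for " + command.machineId);
}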
You should never rely on message order in a service bus / queuing / messaging environment.
When you find yourself in this position you may need to re-think your design. Firstly, a service bus is most certainly not an event store, and attempting to use it like one is going to lead to pain and suffering :) Not that you are attempting this, but I thought I'd throw it in there.
As for your design, in order to manage this kind of state you may want to look at a process manager. If you are not generating those commands then even this will not help.
However, given your scenario it seems as though the calls are sequential but perhaps it is just your example. In any event, as mookid8000 said, you either want to:
discard invalid changes (with the appropriate feedback),
allow any order of messages as long as they are valid,
ignore out-of-sequence messages till later.
Hope that helps...
"exactly the same sequence as they were saved on the bus"
Just... why?
Would you rely on your HTTP server logs to know which command actually reached an aggregate first? No, because it is totally unreliable, just like it is with at-least-once delivery guarantees, and it's also irrelevant.
It is your event store and/or normal persistence state that should be the source of truth when it comes to knowing the sequence of events. The order of commands shouldn't really matter.
Assuming optimistic concurrency, if the aggregate is not allowed to transition from A to C then it should guard this invariant and when a TransitionToStateC command will hit it in the A state it will simply get rejected.
If, on the other hand, A->C->B transitions are valid and that is the order received by your aggregate, well, that is what happened from the domain perspective. It really shouldn't matter which command was published first on the bus, just like it doesn't matter which user executed a command first from the UI.
"In my scenario the calls for a specific aggregate are absolutely sequential and I must guarantee that they are executed in the same order"
Why are you executing them asynchronously and potentially concurrently by publishing on a bus then? What you are basically saying is that calls are sequential and cannot be processed concurrently. That means everything should be synchronous because there is no potential benefit from parallelism.
Why:
executeAsync(command1)
executeAsync(command2)
executeAsync(command3)
When you want:
execute(command1)
execute(command2)
execute(command3)
You should have a single command message and the handler of this message executes multiple commands against the aggregate. Then again, in this case I'd just create a single operation on the aggregate that performs all the transitions.
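As a sketch of that single-message approach (all types and the repository are hypothetical):

// one message carries the whole unit of work for the machine
public class ProcessMachineReport {
    public String machineId;
    public java.util.List<MachineCommand> commands; // in the order the machine emitted them
}

public void handle(ProcessMachineReport msg) {
    MachineAggregate aggregate = repository.load(msg.machineId);
    for (MachineCommand command : msg.commands) {
        aggregate.execute(command); // transitions applied in order, on one thread
    }
    repository.save(msg.machineId, aggregate);
}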

Resequencer with holes in the sequence

We have an ETL scenario where we use the resequencer.
Messages arrive in the flow with a sequence number that the resequencer uses to send messages in order, but sometimes messages are discarded earlier (because of data validation) and never reach the resequencer. This produces holes in the sequence, and the resequencer stops sending messages under the default release strategy. To avoid this, we developed a new SequenceTimeoutReleaseStrategy that is a mix of the default strategy and SI's TimeoutCountSequenceSizeReleaseStrategy. When a message arrives, it checks the timeout and releases the group if necessary.
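Presumably something roughly along these lines (one plausible shape for such a strategy; the timeout value and the hole-detection details are ours):

import org.springframework.integration.IntegrationMessageHeaderAccessor;
import org.springframework.integration.aggregator.ReleaseStrategy;
import org.springframework.integration.store.MessageGroup;

public class SequenceTimeoutReleaseStrategy implements ReleaseStrategy {

    private final long timeoutMillis;

    public SequenceTimeoutReleaseStrategy(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    @Override
    public boolean canRelease(MessageGroup group) {
        // default behavior: release when the next expected sequence number is present
        if (nextInSequencePresent(group)) {
            return true;
        }
        // timeout behavior: release anyway once the group is old enough
        return System.currentTimeMillis() - group.getTimestamp() > timeoutMillis;
    }

    private boolean nextInSequencePresent(MessageGroup group) {
        int expected = group.getLastReleasedMessageSequenceNumber() + 1;
        return group.getMessages().stream()
                .anyMatch(m -> Integer.valueOf(expected).equals(
                        m.getHeaders().get(IntegrationMessageHeaderAccessor.SEQUENCE_NUMBER)));
    }
}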
All this works well except for the last messages, which arrive before the timeout and have holes. These messages aren't released by the strategy. We could use a reaper, but the sequence may have more than one hole, so when the resequencer releases the group it will stop at the first sequence break and remove the group, losing the rest of the messages. So, the question is: is there a way to use the resequencer when there can be holes in the sequence?
One solution we have, and want to avoid, is a scheduled task that removes the messages directly from the message store, but this could be a problem with concurrency and so on, so we'd prefer other solutions.
Any help is appreciated here
Regards
Guzman
There are two components involved; the release strategy says "something" can be released; the actual decision as to what is released is performed by the MessageGroupProcessor. In this case, a ResequencingMessageGroupProcessor.
You would need to customize that class to "skip" the hole(s).
You can't wire in a customized MGP using the <resequencer/> namespace; you would have to wire it up with <bean/>s - a ResequencingMessageHandler and a ConsumerEndpointFactoryBean.
Or use a BeanFactoryPostProcessor to change the constructor argument to your custom class.
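For instance, something like the following (the custom processor class and channel names are placeholders; the handler and factory-bean classes are the 4.x ones, so double-check the constructor against your version):

<bean id="holeSkippingProcessor" class="com.acme.HoleSkippingGroupProcessor"/>

<bean id="resequencerHandler"
      class="org.springframework.integration.aggregator.ResequencingMessageHandler">
    <constructor-arg ref="holeSkippingProcessor"/>
    <property name="outputChannel" ref="out"/>
</bean>

<bean class="org.springframework.integration.config.ConsumerEndpointFactoryBean">
    <property name="inputChannelName" value="resequencerIn"/>
    <property name="handler" ref="resequencerHandler"/>
</bean>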

locking on correlation id in aggregator

Looking at the code for AbstractCorrelatingMessageHandler, it seems a lock is obtained on the correlation id by the aggregator, to enforce that the message group belonging to a given correlation id is processed by a single thread. This makes sense, and in fact I wanted to make sure that once messages are released from an aggregator, as long as they are being worked on by any downstream component within the same thread they were released in, the aggregator shouldn't release any more messages belonging to the same correlation id (if they come in).
So, e.g., if I send message "A" with correlation id "1", immediately send message "B" with the same correlation id, and let the aggregator release both: with some time-consuming components after the aggregator, I want "B" not to be released until the thread processing "A" is all done, keeping in mind that "A" and "B" are processed by different threads. At the same time, if I send "C" with correlation id "2", I do want it to be released while "A" is being processed.
This is the desired behavior, and surprisingly it is exactly how it seems to work, so the aggregator is handling the locking logic as intended. I just want to understand how that's happening. Does that mean the aggregator takes the lock before releasing a message group and only releases the lock after all downstream components have been processed, i.e. after postSend has been called on them? Looking at the code for the class mentioned above, it seems to release the lock right after releasing the message, but that's not what I see when running my example. Instead, the lock is held until all downstream components after the aggregator have completed. I am hoping for some clarification. Thanks.
Well, to simplify it a bit, the logic looks like this:
lock.lockInterruptibly();
try {
    completeGroup(message, correlationKey, messageGroup);
}
finally {
    lock.unlock();
}
And I think you'll agree that unlock() is called only after completeGroup() finishes its work. That call may do hard, long-running work, and the lock will be held the whole time.
So, if your downstream flow (the subscriber on the aggregator's output-channel) is single-threaded (all endpoints are wired with DirectChannels), the lock won't be released until all those endpoints finish their work. MessageHandlers subscribe to channels and send messages to channels; everything is done within the same call stack.
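A minimal sketch of why that is (the handler body is hypothetical): a DirectChannel dispatches on the caller's thread, so the subscriber below runs inside the aggregator's completeGroup(), before lock.unlock():

import org.springframework.integration.channel.DirectChannel;
import org.springframework.messaging.Message;
import org.springframework.messaging.MessageHandler;

DirectChannel output = new DirectChannel();
output.subscribe(new MessageHandler() {
    @Override
    public void handleMessage(Message<?> message) {
        // runs on the same thread that called output.send(...),
        // i.e. on the aggregator's call stack, while the lock is still held
        doLongDownstreamWork(message); // hypothetical downstream work
    }
});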
