Is it possible to make a Poller (or PollableMessageSource) to poll messages as List? - spring-integration

Following the example found in GitHub https://github.com/spring-cloud/spring-cloud-gcp/tree/master/spring-cloud-gcp-samples/spring-cloud-gcp-pubsub-polling-binder-sample regarding polling messages from a PubSub subscription, I was wondering...
Is it possible to make a PollableMessageSource retrieve List<Message<?>> instead of a single message per poll?
I've seen the #Poller notation only being used in Source typed objects, never in Processor or Sink. Is it possible to use in such context when for example using #StreamListener or with a functional approach?

The PollableMessageSource binding and Source stream applications are fully based on the Poller and MessageSource abstraction from Spring Integration where its contract is to produce a single message to the channel configured. The point of the messaging is really to process a single message not affecting others. The failure for one message doesn't mean to fail others in the flow.
On the other hand you probably mean GCP Pub/Sub messages to be produced as a list in the Spring message payload. That is really possible, but via some custom code from Pub/Sub consumer and MessageSource impl. Although I would think twice to expect some batched from the source. Probably you may utilize an aggregator to build some small windows if your further logic is about processing as list. But again: it is going to be a single Spring message.
May be better to start thinking about a reactive function implementation where you indeed can expect a Flux<Message<?>> as an input and Spring Cloud Stream framework will take care for you how to emit the data from Pub/Sub into the reactive stream you expect.
See more info in docs: https://docs.spring.io/spring-cloud-stream/docs/3.1.0/reference/html/spring-cloud-stream.html#_reactive_functions_support

Related

Architecture issue - Azure servicebus and message order guarantee

Ok so i'm relatively new to the servicebus. Working on a project where we use Azure servicebus for queueing messages. Our architecture roughly looks like the following:
So the idea is that in our SourceSystem all kinds of stuff happens, which leads to messages being put on the servicebustopics. Now our responsibility is syncing these events to the external client so they are aware of what we are doing.
Now the issue is that currently we dont use servicebus sessions so message order isnt guaranteed. Also consider the following scenario:
OrderCreated
OrderUpdate 1
OrderUpdate 2
OrderClosed
What happens now is if the externalclients API is down for say OrderUpdate 1 and OrderUpdate 2, we could potentially send the messages in order: OrderCreated, OrderClosed, OrderUpdate 1, OrderUpdate 2.
Currently we just retry a message a few times and then it moves into the deadletter queue for manual reprocessing.
What steps should we take to better guarantee message order? I feel like in the scope of an order, message order needs to be guaranteed.
Should we force the sourcesystem to put all messages for a order in a servicebus session? But how can we handle this with multiple topics? And what do we do if message 1 from a session ends up in the deadletter?
There are a lot of considerations here, should we use a single topic so its easier to manage the sessions? But this opens up other problems with different message structures being in a single topic?
Id love to hear your opinions on this
Have a look at Durable Functions in Azure. You can use the 'Async Http API' or one of the other patterns to achieve the orchestration you need to do.
NServicebus' Sagas might also be a good option, here is an article that does a very good comparison between NServicebus and Durable Functions.
If the external client has to receive all those events and order matters, sending those messages to multiple topics where a topic is per message type will make your mission extremely hard to accomplish. For ordered messaging first you need to use a single entity (queue or topic) with Sessions enabled. That way you can guarantee ordered message processing. In case you have multiple external clients, you'd need to have a session-enabled entity (topic) per external client.
Another option is to implement a pattern known as Process Manager. The process manager would be responsible to make the decisions about the incoming messages and conclude when the work for a given order is completed or not.
There are also libraries (MassTransit, NServiceBus, etc) that can help you. NServiceBus implements Process Manager via a feature called Saga (tutorial) and MassTransit has it as well (documentation).

How Spring Cloud Stream prevents the application’s instances from receiving duplicate messages?

Spring Cloud Stream is based on At least once method,This means that in some rare cases a duplicate message can arrive at an endpoint.
Does Spring Cloud Stream keep a buffer of already received messages?
The IdempotentReceiver in Enterprise Integration Patterns book suggests :
Design a receiver to be an Idempotent Receiver,one that can safely receive the same message multiple times.
Does Spring Cloud Stream control duplicate messages in consumers?
Update:
A paragraph from Spring Cloud Stream says :
4.5.1. Durability
Consistent with the opinionated application model of Spring Cloud Stream, consumer group subscriptions are durable. That is, a binder implementation ensures that group subscriptions are persistent and that, once at least one subscription for a group has been created, the group receives messages, even if they are sent while all applications in the group are stopped.
Anonymous subscriptions are non-durable by nature. For some binder implementations (such as RabbitMQ), it is possible to have non-durable group subscriptions.
In general, it is preferable to always specify a consumer group when binding an application to a given destination. When scaling up a Spring Cloud Stream application, you must specify a consumer group for each of its input bindings. Doing so prevents the application’s instances from receiving duplicate messages (unless that behavior is desired, which is unusual).
I think your assumption on the responsibility of the spring-cloud-stream framework are incorrect.
Spring-cloud-stream in a nutshell is a framework responsible for connecting and adapting producers/consumers provided by the developer to the message broker(s) exposed by the spring-cloud-stream binder (e.g., Kafka, Rabbit, Kinesis etc).
So connecting to a broker, receiving message from the broker, deserialising it, invoking user code, serialising message and sending it back to the broker is in the scope of framework responsibility. So you can look at it as purely infrastructure.
What you're describing is more of an application concern since the actual receiver is something that user would develop as part of the spring-cloud-stream development experience, hence responsibility for idempotence would reside with such user.
Also, on top of that most brokers already handle idempotency (in a way) by ensuring that a particular message has been delivered only once. That said, if someone sends identical message to such broker, it will have no idea that it is duplicate so the requirement for idempotency and/or deduplication is still valid, but as you can see it is not as straight forward given the amount of factor that are in play where your understanding of idempotence could be different from mine, hence our approaches could be different as well.
One last thing (partially to prove my last point): can safely receive the same message multiple times. - That is all it states, but what does safely really mean to you vs. me vs. some other person?
If you are concerned about a case where the application receives and processes message from the broker but crashes before it acknowledges the message, that can happen. Spring cloud stream app starters provides support for auto-configuration of a persistent message metadata store which backs Spring Integration's IdempotentReceiverInterceptor. An example of this is in the SFTP source app starter. By default, the sftp source uses an in-memory metadata store, so it would not survive a restart, but can be customized to use a persistent store.

Spring Cloud Stream #StreamListener and Spring Integration's Resequencer Pattern

AFAIK the Spring Cloud Stream project is based on Spring Integration. Hence I was wondering if there is a nice way to resequence a subset of inbound messages before the StreamListener handler is triggered? Or do I need to assemble the whole IntegrationFlow from scratch using XML or Java DSL config from Spring Integration?
My use case is as follows. Most of the time I process inbound messages on a Kafka topic as they come. However, a few events have to be resequenced based on CORRELATION_ID, SEQUENCE_NUMBER, and SEQUENCE_SIZE headers. In other words I'd like to keep using StreamListener as much as possible and simply plug in resequencing strategy for some events.
Yes, you would need to use Spring Integration for it. In fact Spring Cloud Stream is effectively a binding framework only. It binds message handlers to the message brokers via binders. The message handlers themselves are provided by the users.
The #StreamListener annotation is pretty much an equivalent of Spring Integration's #ServiceActivator with few extra features (e.g., conditional routing), but other then it is just a message handler.
Now, as you eluded to, you are aware that you can use Spring Integration (SI) to implement a message handler or an internal SI flow, and that is normal and recommended for complex cases.
That said, we do provide out of the box apps that implements certain EIP components and we do have, for example, and aggregator app which you can use as a starting point in implementing resequencer. Further more, given that we have an aggregator app and not resequencer, we would be glad to accept a contribution for it if you're interested.
I hope this answers you question.

Message persistence in Spring Integration Aggregator without MessageStore by using AMQP?

I would like to know if I can have persistence in my Spring Integration setup when I use a aggregator, which is not backed by a MessageStore, by leveraging the persistence of AMQP (RabbitMQ) queues before and after the aggregator.
I imagine that this would use ack's: The aggregator won't ack a message before it's collected all the parts and sent out the resulting message.
Additionally I would like to know if this is ever a good idea :)
I am new working with queue's, and am trying to get a good feel for patterns to use.
My business logic for this is as follows:
I receive a messages on one queue.
Each message must result in two unrelated webservice calls (preferably in parallel).
The results of these two calls must be combined with details from the original message.
The combination must then be sent out as a new message on a queue.
Messages are important, so they must not be lost.
I was/am hoping to use only one 'persistent' system, namely RabbitMQ, and not having to add a database as well.
I've tried to keep the question specific, but any other suggestions on how to approach this are greatly appreciated :)
What you would like to do recalls me Scatter-Gather EI Pattern.
So, you get a message from the AMQP send it into the ScatterGather endpoint and wait for the aggregated reply. That's enough for to stick with the default acknowledge.
Right, the scatterChannel can be PublishSubscribeChannel with an executor to call Web Services in parallel. Anyway the gatherer process will wait for replies according the release strategy and will block the original AMQP listener do not ack the message prematurely.

How to process message in order with spring integration

Does the spring integration framework have any features I can use to guarantee message order?
We are needing to run our spring integration flow on two different nodes to keep up with the message volume. I have seen several solutions for solving this issue, but I wanted to see if the framework already has something.
This article might better explain what I'm trying to ask.
http://sleeplessinslc.blogspot.com/2010/02/message-ordering-in-jms-using-coherence.html
Yes, it is here. And its name Resequencer.
From Spring Integration Reference Manual: http://docs.spring.io/spring-integration/docs/latest-ga/reference/html/messaging-routing-chapter.html#resequencer.
Each of message for reordering has to have a correlationId header to compare with other messages and sequenceNumber to determine the real order of that message to emit.
The release-partial-sequences="true" does the stuff to release messages, when the correct order is achieved by default SequenceSizeReleaseStrategy on the next message.

Resources