I have a publish/subscribe channel in Spring Integration. Once a message is put on the channel, I can see that a new message ID is generated, and a different message ID for each of the subscribers. I want to use the initial unique message ID as a single identifier while the message flows through the various microservice subscribers. Can I get the original message ID from each of the subscribers?
Also, if I had multiple Spring Integration instances writing messages to a single Kafka topic, would the message ID be unique?
I think Kafka deserves its own SO question. Regarding the same ID for all the subflows: how about applySequence = true on the PublishSubscribeChannel? Each message copy will then be sent with the sequence-details headers, where IntegrationMessageHeaderAccessor.CORRELATION_ID is an exact copy of the original message's ID.
The point of messaging is that each new message really should be a new, unique object. That way each message is a standalone entity: it doesn't affect the others and may not even know about their existence. Statelessness is one of the consistency goals of messaging per se.
Therefore, if you would like to carry some identifier across all the messages, you should consider using some other header, not the ID. For this purpose the Framework already provides a conventional mechanism called correlation and sequence details: https://docs.spring.io/spring-integration/docs/5.0.4.RELEASE/reference/html/messaging-channels-section.html#channel-configuration-pubsubchannel
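In Java configuration, that looks roughly like this (a minimal sketch; the channel and subscriber names are mine, not from the original flow):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.IntegrationMessageHeaderAccessor;
import org.springframework.integration.annotation.ServiceActivator;
import org.springframework.integration.channel.PublishSubscribeChannel;
import org.springframework.messaging.Message;
import org.springframework.messaging.MessageChannel;

@Configuration
public class PubSubConfig {

    @Bean
    public MessageChannel pubSubChannel() {
        PublishSubscribeChannel channel = new PublishSubscribeChannel();
        // With applySequence = true, every subscriber's copy carries the
        // sequence-details headers; CORRELATION_ID holds the original message's ID.
        channel.setApplySequence(true);
        return channel;
    }

    @ServiceActivator(inputChannel = "pubSubChannel")
    public void subscriber(Message<?> message) {
        Object originalId = message.getHeaders()
                .get(IntegrationMessageHeaderAccessor.CORRELATION_ID);
        // use originalId as the end-to-end identifier across the microservices
    }
}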
I'm creating a consumer of an Azure Service Bus topic (subscription) that does nothing but store some statistics. The messages sent to the topic contain a rather large body, which is handled by another consumer on a second subscription on the same topic.
Since the statistics consumer can handle a large number of messages in one go, I was wondering whether it is possible to receive a batch of messages but leave out the body, to improve performance when communicating with Service Bus and to receive even more messages at a time.
I'm currently doing this:
// Create a receiver for the subscription at the given entity path
this.messageReceiver = new MessageReceiver(conn, path);
...
// Receive up to 10 messages, waiting at most 5 seconds for them
await messageReceiver.ReceiveAsync(10, TimeSpan.FromSeconds(5));
It works pretty well, but it would be nice to be able to receive 100 or more messages without having to worry about moving large messages over the network.
Before anyone suggests it, I already know that I can ask for a count, etc. on a topic subscription. I still need the Message object since that contains an entry in the UserProperties dictionary that is used to calculate the stats.
Not possible. You can peek, but that brings the whole payload and headers without incrementing the DeliveryCount of the message. You could request it as a broker feature here.
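For completeness, here is what peeking looks like, sketched with the Azure Service Bus Java SDK (com.microsoft.azure.servicebus; the connection string and entity path are placeholder assumptions). Note that the full body still travels over the wire:

import java.util.Collection;
import com.microsoft.azure.servicebus.ClientFactory;
import com.microsoft.azure.servicebus.IMessage;
import com.microsoft.azure.servicebus.IMessageReceiver;
import com.microsoft.azure.servicebus.primitives.ConnectionStringBuilder;

public class PeekExample {
    public static void main(String[] args) throws Exception {
        IMessageReceiver receiver = ClientFactory.createMessageReceiverFromConnectionStringBuilder(
                new ConnectionStringBuilder("<connection-string>", "mytopic/subscriptions/stats"));
        // Browses up to 100 messages without locking them or incrementing DeliveryCount.
        Collection<IMessage> peeked = receiver.peekBatch(100);
        for (IMessage m : peeked) {
            System.out.println(m.getProperties()); // the UserProperties equivalent in the Java SDK
        }
        receiver.close();
    }
}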
OK, so I'm relatively new to Service Bus. We're working on a project where we use Azure Service Bus for queueing messages. Our architecture roughly looks like the following:
So the idea is that in our SourceSystem all kinds of stuff happens, which leads to messages being put on the Service Bus topics. Our responsibility is syncing these events to the external client so they are aware of what we are doing.
Now the issue is that we currently don't use Service Bus sessions, so message order isn't guaranteed. Also consider the following scenario:
OrderCreated
OrderUpdate 1
OrderUpdate 2
OrderClosed
What happens now is that if the external client's API is down for, say, OrderUpdate 1 and OrderUpdate 2, we could potentially send the messages in the order: OrderCreated, OrderClosed, OrderUpdate 1, OrderUpdate 2.
Currently we just retry a message a few times, and then it moves into the dead-letter queue for manual reprocessing.
What steps should we take to better guarantee message order? I feel like, within the scope of an order, message order needs to be guaranteed.
Should we force the source system to put all messages for an order in a Service Bus session? But how can we handle this with multiple topics? And what do we do if message 1 from a session ends up in the dead-letter queue?
There are a lot of considerations here. Should we use a single topic so it's easier to manage the sessions? But that opens up other problems, with different message structures being in a single topic.
I'd love to hear your opinions on this.
Have a look at Durable Functions in Azure. You can use the 'Async HTTP API' pattern or one of the other patterns to achieve the orchestration you need.
NServiceBus's sagas might also be a good option; there is an article that does a very good comparison between NServiceBus and Durable Functions.
If the external client has to receive all those events and order matters, sending those messages to multiple topics where a topic is per message type will make your mission extremely hard to accomplish. For ordered messaging first you need to use a single entity (queue or topic) with Sessions enabled. That way you can guarantee ordered message processing. In case you have multiple external clients, you'd need to have a session-enabled entity (topic) per external client.
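On the sending side, that boils down to stamping every message for a given order with the same session ID. A sketch with the Azure Service Bus Java SDK (connection string, entity name, and payloads are assumptions; the entity must have sessions enabled):

import java.nio.charset.StandardCharsets;
import com.microsoft.azure.servicebus.Message;
import com.microsoft.azure.servicebus.TopicClient;
import com.microsoft.azure.servicebus.primitives.ConnectionStringBuilder;

public class OrderEventSender {
    public static void main(String[] args) throws Exception {
        TopicClient sender = new TopicClient(
                new ConnectionStringBuilder("<connection-string>", "orders"));
        String orderId = "order-42";
        for (String event : new String[] {"OrderCreated", "OrderUpdate 1", "OrderUpdate 2", "OrderClosed"}) {
            Message message = new Message(event.getBytes(StandardCharsets.UTF_8));
            // Every event of one order shares a session ID, so a session receiver
            // processes them strictly in FIFO order within that session.
            message.setSessionId(orderId);
            sender.send(message);
        }
        sender.close();
    }
}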
Another option is to implement a pattern known as Process Manager. The process manager would be responsible for making decisions about the incoming messages and concluding whether the work for a given order is complete or not.
There are also libraries (MassTransit, NServiceBus, etc.) that can help you. NServiceBus implements the Process Manager pattern via a feature called Saga (tutorial), and MassTransit has it as well (documentation).
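As a minimal illustration of the process-manager idea (a pure Java sketch; the event shape and the sequence numbers are assumptions, not something your SourceSystem necessarily provides), the manager parks events that arrive ahead of their predecessors and releases them in order:

import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

public class OrderProcessManager {

    // Next sequence number expected per order, and events parked out of order.
    private final Map<String, Integer> nextExpected = new ConcurrentHashMap<>();
    private final Map<String, TreeMap<Integer, String>> parked = new ConcurrentHashMap<>();

    // Assumes one thread handles a given order at a time (e.g., a session receiver),
    // so an order's TreeMap is never mutated concurrently.
    public void onEvent(String orderId, int sequence, String event) {
        TreeMap<Integer, String> buffer = parked.computeIfAbsent(orderId, id -> new TreeMap<>());
        buffer.put(sequence, event);
        int expected = nextExpected.getOrDefault(orderId, 1);
        // Release every consecutive event we now have.
        while (buffer.containsKey(expected)) {
            dispatchToExternalClient(orderId, buffer.remove(expected));
            expected++;
        }
        nextExpected.put(orderId, expected);
    }

    void dispatchToExternalClient(String orderId, String event) {
        // Call the external client's API here; on repeated failure, the order's
        // remaining events stay parked instead of being sent out of order.
    }
}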
Spring Cloud Stream is based on at-least-once delivery. This means that in some rare cases a duplicate message can arrive at an endpoint.
Does Spring Cloud Stream keep a buffer of already received messages?
The Idempotent Receiver pattern in the Enterprise Integration Patterns book suggests:
Design a receiver to be an Idempotent Receiver, one that can safely receive the same message multiple times.
Does Spring Cloud Stream control duplicate messages in consumers?
Update:
A paragraph from the Spring Cloud Stream documentation says:
4.5.1. Durability
Consistent with the opinionated application model of Spring Cloud Stream, consumer group subscriptions are durable. That is, a binder implementation ensures that group subscriptions are persistent and that, once at least one subscription for a group has been created, the group receives messages, even if they are sent while all applications in the group are stopped.
Anonymous subscriptions are non-durable by nature. For some binder implementations (such as RabbitMQ), it is possible to have non-durable group subscriptions.
In general, it is preferable to always specify a consumer group when binding an application to a given destination. When scaling up a Spring Cloud Stream application, you must specify a consumer group for each of its input bindings. Doing so prevents the application’s instances from receiving duplicate messages (unless that behavior is desired, which is unusual).
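For reference, the consumer group from that paragraph is a single binding property, e.g. in application.properties (the binding name input is illustrative):

spring.cloud.stream.bindings.input.group=statisticsGroup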
I think your assumptions about the responsibility of the spring-cloud-stream framework are incorrect.
Spring-cloud-stream, in a nutshell, is a framework responsible for connecting and adapting producers/consumers provided by the developer to the message broker(s) exposed by the spring-cloud-stream binder (e.g., Kafka, Rabbit, Kinesis, etc.).
So connecting to a broker, receiving a message from the broker, deserializing it, invoking user code, serializing the message, and sending it back to the broker is in the scope of the framework's responsibility. You can look at it as purely infrastructure.
What you're describing is more of an application concern, since the actual receiver is something the user develops as part of the spring-cloud-stream development experience; hence, responsibility for idempotence resides with that user.
Also, most brokers already handle idempotency (in a way) by ensuring that a particular message is delivered only once. That said, if someone sends an identical message to such a broker, it will have no idea that it is a duplicate, so the requirement for idempotency and/or deduplication is still valid. But as you can see, it is not as straightforward given the number of factors in play: your understanding of idempotence could be different from mine, hence our approaches could be different as well.
One last thing (partially to prove my last point): the pattern says a receiver 'can safely receive the same message multiple times'. That is all it states, but what does 'safely' really mean to you vs. me vs. some other person?
If you are concerned about the case where the application receives and processes a message from the broker but crashes before it acknowledges the message, that can happen. The Spring Cloud Stream app starters provide support for auto-configuration of a persistent message metadata store which backs Spring Integration's IdempotentReceiverInterceptor. An example of this is in the SFTP source app starter. By default, the SFTP source uses an in-memory metadata store, so it would not survive a restart, but it can be customized to use a persistent store.
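Wiring that interceptor manually looks roughly like this (a sketch; the businessKey header and bean names are my assumptions):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.handler.advice.IdempotentReceiverInterceptor;
import org.springframework.integration.metadata.ConcurrentMetadataStore;
import org.springframework.integration.metadata.SimpleMetadataStore;
import org.springframework.integration.selector.MetadataStoreSelector;

@Configuration
public class DeduplicationConfig {

    @Bean
    public ConcurrentMetadataStore metadataStore() {
        // In-memory, so it does not survive a restart; a persistent MetadataStore
        // implementation (e.g. Redis- or JDBC-backed) would.
        return new SimpleMetadataStore();
    }

    @Bean
    public IdempotentReceiverInterceptor idempotentReceiverInterceptor(ConcurrentMetadataStore store) {
        // Each message is keyed by a header; a key that was seen before marks the
        // message as a duplicate (by default it is flagged with the duplicateMessage
        // header so it can be filtered or discarded downstream).
        return new IdempotentReceiverInterceptor(new MetadataStoreSelector(
                message -> message.getHeaders().get("businessKey", String.class), store));
    }
}

The consuming endpoint is then annotated with @IdempotentReceiver("idempotentReceiverInterceptor") so the check runs before your handler code.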
I would like to know if I can have persistence in my Spring Integration setup when I use an aggregator which is not backed by a MessageStore, by leveraging the persistence of the AMQP (RabbitMQ) queues before and after the aggregator.
I imagine this would use acks: the aggregator won't ack a message before it has collected all the parts and sent out the resulting message.
Additionally, I would like to know if this is ever a good idea :)
I am new to working with queues and am trying to get a good feel for the patterns to use.
My business logic for this is as follows:
I receive messages on one queue.
Each message must result in two unrelated webservice calls (preferably in parallel).
The results of these two calls must be combined with details from the original message.
The combination must then be sent out as a new message on a queue.
Messages are important, so they must not be lost.
I was/am hoping to use only one 'persistent' system, namely RabbitMQ, and not have to add a database as well.
I've tried to keep the question specific, but any other suggestions on how to approach this are greatly appreciated :)
What you would like to do reminds me of the Scatter-Gather EI pattern.
So, you get a message from AMQP, send it into the Scatter-Gather endpoint, and wait for the aggregated reply. That's enough to stick with the default acknowledgment.
Right, the scatterChannel can be a PublishSubscribeChannel with an executor to call the web services in parallel. Either way, the gatherer will wait for replies according to the release strategy and will block the original AMQP listener, so the message is not acked prematurely.
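A rough Java DSL sketch of that arrangement (queue names, the two service calls, and the combining logic are placeholders I made up):

import java.util.concurrent.Executors;
import java.util.stream.Collectors;

import org.springframework.amqp.core.AmqpTemplate;
import org.springframework.amqp.rabbit.connection.ConnectionFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.amqp.dsl.Amqp;
import org.springframework.integration.dsl.IntegrationFlow;
import org.springframework.integration.dsl.IntegrationFlows;
import org.springframework.integration.store.MessageGroup;
import org.springframework.messaging.Message;

@Configuration
public class EnrichmentFlowConfig {

    @Bean
    public IntegrationFlow enrichFlow(ConnectionFactory connectionFactory, AmqpTemplate amqpTemplate) {
        return IntegrationFlows
                // The listener thread blocks in the scatter-gather below, so the
                // broker ack only happens after the combined message has been sent.
                .from(Amqp.inboundAdapter(connectionFactory, "inbound.queue"))
                .scatterGather(
                        scatterer -> scatterer
                                .applySequence(true)
                                .recipientFlow(f -> f
                                        .channel(c -> c.executor(Executors.newCachedThreadPool()))
                                        .handle((payload, headers) -> callServiceA(payload)))
                                .recipientFlow(f -> f
                                        .channel(c -> c.executor(Executors.newCachedThreadPool()))
                                        .handle((payload, headers) -> callServiceB(payload))),
                        gatherer -> gatherer
                                .releaseStrategy(group -> group.size() == 2) // wait for both replies
                                .outputProcessor(this::combine))
                .handle(Amqp.outboundAdapter(amqpTemplate).routingKey("outbound.queue"))
                .get();
    }

    Object callServiceA(Object payload) { return payload; } // placeholder web service call

    Object callServiceB(Object payload) { return payload; } // placeholder web service call

    Object combine(MessageGroup group) {
        // Merge the two replies (plus any original details carried along) into one payload.
        return group.getMessages().stream().map(Message::getPayload).collect(Collectors.toList());
    }
}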
I have no clue if it's better to ask this here, or over on Programmers.SE, so if I have this wrong, please migrate.
First, a bit about what I'm trying to implement. I have a node.js application that takes messages from one source (a socket.io client), and then does processing on the message, which might result in zero or more messages back out, either to the sender, or other clients within that group.
For the processing, I would like to essentially just shove the message into a queue, then it works its way through various message processors that might kick off their own items, and eventually, the bit running socket.io is informed "Hey, send this message back"
As a concrete example, say a user signs into the service. That sign-in message is then placed in the queue, where the authorization processor gets it, does its thing, then places a message back in the queue saying the client has been authorized. This goes back to the socket.io socket that is connected to the client, along with other clients that might be interested. It can also go to other subsystems that might want to do more processing on authorization (looking up user info, sending more info to the client based on their data, etc.).
If I wanted strong coupling, this would be easy, but I tried that before, and it just devolved into a mess of spaghetti code that's very fragile, so I would like to avoid that. Another wrench in the setup is that this should be cluster-able, which is where the real problem comes in. There might be more than one, say, authorization processor running, but each authorization message should be processed only once.
So, in short, I'm looking for a pattern/technique that will allow me to, essentially, have multiple "groups" of subscribers for a message, and the message will be processed only once per group.
I thought about maybe having each instance of a processor generate a unique name that would be used as a list in Redis. This name would then be registered with some sort of dispatch handler and placed into a set for that group of subscribers. Then, when a message arrives, the dispatcher pulls a random member out of that set and places the message into that member's list. While it seems like this would work, it seems somewhat over-complicated and fragile.
The core problem is I've never designed a system like this, so I'm not even sure the proper terms to use or look up. If anyone can point me in the right direction for this, I would be most appreciative.
I think what you're describing is similar to the https://www.getbridge.com/ service. I tried it but ended up writing my own based on ZeroMQ; it allows you to register services (req/rep) and channels, which are pub/sub workers.
As for the design, I used client -> broker -> services & channels, which are all plug-and-play using auto-discovery. The services register their schema with the brokers, which open a TCP connection so that brokers on other servers can communicate with that broker group's services. Internal services and clients then connect via Unix sockets or IPC channels, whichever is preferred.
I ended up wrapping the Redis publish/subscribe functions a bit to do this. Each type of message processor gets a "group name", and there can be multiple instances of a processor within that group (so multiple instances of the program can run for clustering).
When publishing a message, I generate an incremental ID, then store the message in a string key with that ID, then publish the message ID.
On the receiving end, the first thing the subscriber does is attempt to add the message ID it just got from the publisher into a set of received messages for that group, using SADD. If SADD returns 0, the message has already been grabbed by another instance, and the subscriber just returns. If it returns 1, the full message is pulled out of the string key and sent to the listener.
Of course, this relies on Redis being single-threaded (which makes SADD effectively atomic), and I imagine that will continue to be the case.
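Sketched in Java with Jedis (the key names follow the description above; hosts and the group name are placeholders):

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;

public class GroupSubscriber extends JedisPubSub {

    private final Jedis store;   // separate connection: the subscribing connection is blocked
    private final String group;

    public GroupSubscriber(Jedis store, String group) {
        this.store = store;
        this.group = group;
    }

    @Override
    public void onMessage(String channel, String messageId) {
        // SADD is atomic, so exactly one instance in this group gets 1 back
        // and wins the right to process this message ID.
        if (store.sadd("received:" + group, messageId) == 1) {
            String payload = store.get("message:" + messageId);
            process(payload);
        }
    }

    private void process(String payload) {
        System.out.println("processing: " + payload);
    }

    public static void main(String[] args) {
        // Publisher side, for reference:
        //   long id = jedis.incr("message:next-id");
        //   jedis.set("message:" + id, payload);
        //   jedis.publish("events", String.valueOf(id));
        Jedis subscribing = new Jedis("localhost");
        Jedis store = new Jedis("localhost");
        subscribing.subscribe(new GroupSubscriber(store, "authorization"), "events");
    }
}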
What you might be looking for is an AMQP protocol implementation, where you can have queues bound to custom exchanges and implement a pub/sub model.
RabbitMQ is a popular AMQP implementation with lots of client libraries; it also has a node.js library.
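In RabbitMQ terms, "processed once per group" maps to a fanout exchange with one queue per group and competing consumers on each queue. A sketch with the RabbitMQ Java client (exchange, queue, and host names are placeholders):

import java.nio.charset.StandardCharsets;
import com.rabbitmq.client.BuiltinExchangeType;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class GroupWorker {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();

        // Each group declares its own durable queue and binds it to the same
        // fanout exchange, so every message is copied once per group...
        channel.exchangeDeclare("events", BuiltinExchangeType.FANOUT, true);
        String group = "authorization";
        channel.queueDeclare(group, true, false, false, null);
        channel.queueBind(group, "events", "");

        // ...while all instances of one group compete on that single queue,
        // so each message is processed only once within the group.
        channel.basicConsume(group, true, (consumerTag, delivery) -> {
            String body = new String(delivery.getBody(), StandardCharsets.UTF_8);
            System.out.println("processing: " + body);
        }, consumerTag -> { });
        // The client's consumer threads are non-daemon, so the worker keeps running.
    }
}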