My application currently uses RabbitMQ to queue and process messages to initiate data streams and to pass the streamed data to a processing area.
Because we only want one client to consume the data stream and only one client to process the streamed data, I am currently using PUSH messages.
The issue I am finding is that if I acknowledge the PUSH message to initiate the data stream and that process fails, the message will not be requeued. If I do not acknowledge the message, none of my other PUSH messages will be received until after I either acknowledge the data stream message or the process dies.
I have looked at REQUEST/REPLY messages; however, I think the same issue may apply there, since I need the message to be requeued automatically should the process/server die.
Is it possible to use non-blocking PUSH messages?
Perhaps of value is the "qos" / prefetch setting for consumers: https://www.rabbitmq.com/consumer-prefetch.html. It's possible to set the value to greater than one so that a single consumer can receive more than one message at a time. Would this give you the non-blocking PUSH (read-only) behaviour you're looking for?
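For illustration, here's a rough sketch with the RabbitMQ Java client (the queue name, host, and handler are placeholders, not from the question) of a consumer whose prefetch allows several unacknowledged deliveries to be in flight at once:

```java
import com.rabbitmq.client.*;

public class PrefetchConsumer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");                        // assumed broker location
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();
        channel.queueDeclare("stream.commands", true, false, false, null); // hypothetical queue

        // Allow up to 10 unacknowledged deliveries in flight for this consumer,
        // so one slow or unacked message does not block delivery of the rest.
        channel.basicQos(10);

        channel.basicConsume("stream.commands", false, (tag, delivery) -> {
            try {
                handle(new String(delivery.getBody()));      // your processing
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
            } catch (Exception e) {
                // Requeue on failure so the message can be retried.
                channel.basicNack(delivery.getEnvelope().getDeliveryTag(), false, true);
            }
        }, tag -> { });
    }

    private static void handle(String body) { /* application logic */ }
}
```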
I have a somewhat peculiar problem to solve.
I have configured RabbitMQ as the message broker and it is working. However, when the consumer fails to process a message, I currently acknowledge it negatively with a nack, which blindly requeues the message with whatever payload originally came in. I want to add some more fields to the payload and requeue it instead.
For Example:
When the consumer gets payload data from RabbitMQ, it processes it and performs work based on it across multiple host machines. If one of those machines is not reachable, I need to process that part alone after some time.
Hence I'm planning to requeue the failed data back to the queue with one more field (the machine name), so it will be processed again by the existing logic.
How do I achieve this? Can someone help me?
When a message is requeued, it will be placed back in its original position in the queue, if possible. If that is not possible (due to concurrent deliveries and acknowledgements from other consumers when multiple consumers share a queue), the message will be requeued to a position closer to the queue head. This way you end up in an infinite loop (consuming and requeuing the same message). To avoid this, you can positively acknowledge the message and publish it to the queue with the updated fields. Publishing puts the message at the end of the queue, so you will be able to process it again after some time.
Reference https://www.rabbitmq.com/nack.html
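A minimal sketch of that suggestion with the RabbitMQ Java client (the queue name, the JSON enrichment, and the failure handling are assumptions for illustration): ack the original delivery, then publish an enriched copy, which lands at the tail of the same queue.

```java
import com.rabbitmq.client.*;
import java.nio.charset.StandardCharsets;

public class RequeueWithExtraField {
    private static final String QUEUE = "work.queue";      // hypothetical queue name

    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();
        channel.queueDeclare(QUEUE, true, false, false, null);

        DeliverCallback onDelivery = (tag, delivery) -> {
            String payload = new String(delivery.getBody(), StandardCharsets.UTF_8);
            try {
                process(payload);
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
            } catch (Exception e) {
                // Positively ack the original so it is removed from the head of the queue...
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
                // ...then publish an enriched copy to the tail of the same queue.
                // (Appending a "failedMachine" field to a JSON payload is just an illustration.)
                String enriched = payload.replaceFirst("\\}\\s*$",
                        ",\"failedMachine\":\"" + e.getMessage() + "\"}");
                channel.basicPublish("", QUEUE, MessageProperties.PERSISTENT_TEXT_PLAIN,
                        enriched.getBytes(StandardCharsets.UTF_8));
            }
        };
        channel.basicConsume(QUEUE, false, onDelivery, tag -> { });
    }

    private static void process(String payload) { /* existing processing logic */ }
}
```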
I have a spring integration flow where a processor (scheduled) sequentially reads messages from a queue (jms) and attempts processing. If the processor finds that the message can't be processed until another event finishes, it sends the message back to the original queue and attempts processing later.
If it just keeps sending messages that can't be processed back to the queue, it creates an infinite loop.
So I need to hold onto them until I finish reading all the messages that already exist in the queue, and trigger a release when all the existing messages have been read, before sending them back to the queue. How do I go about this?
Note that I don't want to aggregate the messages, just temporarily hold them and somehow release them later. Also note that my processor is scheduled to read messages (not message-driven).
In this case you have to acknowledge those messages in the queue anyway and re-send them back to it using JmsTemplate (or JmsSendingMessageHandler).
The problem with requeuing is that the failed message is returned to the head of the queue. That's why you see it again and again and never reach the other messages (although you can work around that with concurrency).
By re-sending failed messages back to the queue, you place them at the tail of the queue. So the "bad" messages will become available later, after the other, existing messages have been processed.
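A minimal sketch of that approach (the destination name and the "ready" check are assumptions): once the JMS receive is acknowledged, the processor simply re-sends a message it cannot handle yet via JmsTemplate, which places it at the tail of the queue.

```java
import org.springframework.jms.core.JmsTemplate;

public class DeferringProcessor {
    private static final String QUEUE = "incoming.queue";   // hypothetical destination
    private final JmsTemplate jmsTemplate;

    public DeferringProcessor(JmsTemplate jmsTemplate) {
        this.jmsTemplate = jmsTemplate;
    }

    // Called from the scheduled poller for each message read from the queue.
    public void handle(String payload) {
        if (!readyToProcess(payload)) {
            // The receive is already acknowledged; re-sending puts the message
            // at the tail of the queue, behind the messages that already exist.
            jmsTemplate.convertAndSend(QUEUE, payload);
            return;
        }
        process(payload);
    }

    private boolean readyToProcess(String payload) { /* depends on the other event */ return true; }

    private void process(String payload) { /* actual processing */ }
}
```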
I have an SQS queue which has a lot of messages (typically in the thousands). Presently I have multiple listeners (created by threads spawned from the same source), and each listener listens to the queue and receives messages. As soon as a listener receives a message from the queue, that listener deletes the message from the queue. The message is processed only after it has been deleted from the queue. I have a visibility timeout of 30 seconds.
I am not using any locks or anything to handle duplicates, since I delete the message from the queue immediately after receiving it. I haven't seen a duplicate so far, but I am worried it might happen.
Now, the question is, which is a better way, having multiple listeners this way or listening to the queue in a single thread, and then spinning up new threads to process each message you receive?
Firstly, it is worth understanding the concept of the message visibility timeout.
When a message is retrieved from an Amazon SQS queue (eg by your thread), the message is marked as invisible in Amazon SQS. Best-practice is for your thread to then process the message and then delete the message after it has completed processing the message. This way, if the thread fails, the message will automatically become visible on the queue again and another thread can process it.
With your current application design, if a thread fails then the message is lost and will not be retried. You should consider changing your code to delete the message only after it has been processed.
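A sketch of that receive, process, then delete ordering with the AWS SDK for Java v1 (the queue URL and processing logic are placeholders):

```java
import com.amazonaws.services.sqs.AmazonSQS;
import com.amazonaws.services.sqs.AmazonSQSClientBuilder;
import com.amazonaws.services.sqs.model.Message;
import com.amazonaws.services.sqs.model.ReceiveMessageRequest;

public class SqsWorker implements Runnable {
    private static final String QUEUE_URL =
            "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"; // placeholder URL
    private final AmazonSQS sqs = AmazonSQSClientBuilder.defaultClient();

    @Override
    public void run() {
        while (true) {
            ReceiveMessageRequest request = new ReceiveMessageRequest(QUEUE_URL)
                    .withMaxNumberOfMessages(1)
                    .withWaitTimeSeconds(20);               // long polling
            for (Message message : sqs.receiveMessage(request).getMessages()) {
                try {
                    process(message.getBody());             // do the real work first
                    // Delete only after successful processing; if process() throws,
                    // the message becomes visible again after the visibility timeout.
                    sqs.deleteMessage(QUEUE_URL, message.getReceiptHandle());
                } catch (Exception e) {
                    // Log and move on; the message will be redelivered automatically.
                }
            }
        }
    }

    private void process(String body) { /* application logic */ }
}
```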
Using multiple threads to process messages is recommended, because it will allow higher message throughput by processing messages in parallel. It is also a simpler design, and simple is always best. Your alternate idea of having one process retrieve messages and then firing off threads to process the message is more complex and does not provide any benefits.
Amazon SQS queues can occasionally return the same message more than once. It is rare, but it can happen. The multiple-thread design will probably see it happen more often than the single-thread design, because multiple threads might simultaneously retrieve the same message. However, it could still happen in the single-thread model, too.
If processing the same message twice is a concern, then consider using a FIFO queue (not currently available in every AWS Region). This will guarantee that every message is received only once. Alternatively, your code would need to check whether a particular message has already been processed (eg by checking in a database).
The multiple-thread design will also allow you to horizontally scale by having multiple systems (even across multiple Availability Zones) process messages, whereas your single-thread design has a single point of failure and is less scalable.
I'm quite new to RabbitMQ and trying to see if I can achieve what I need with it.
I am looking for the Worker Queues pattern but with one caveat. I want to have only a single worker running concurrently per routing key.
An example for clarification:
If I send the following messages, with these routing keys, in order: a, a, b, c, I want to have only 3 workers running concurrently. When the first a message is received, a worker picks it up and handles it.
When the next a message is received while the previous a message is still being handled (not acknowledged), the new a message should wait in the queue. When the b and c messages are received, they each get a worker handling them. When the first a message is acknowledged, any worker can pick up the next a message.
Would that pattern be possible using RabbitMQ in a natural way (without writing any application code on my side to handle the locking and such)?
Edit:
Another clarification. All workers can and should handle all messages, and I don't want to have a queue per Worker as I want to share the load between them, and the Publisher doesn't know which Worker should process the message. But I do want to make sure that no 2 Workers are working on messages sharing the same key at the same time.
For example, if I have a Publisher publishing messages with a userId field, I want to make sure no 2 Workers are handling messages with the same userId at the same time.
Edit 2
Expanding on the userId example. Let's say I have a single Publisher and 3 Workers. The Publisher publishes messages like { userId: 1, text: 'Hello' }, with varying userIds. My 3 Workers all do the same thing to these messages, so I can have any of them handle the messages coming in. But what I'm trying to achieve is to have only a single Worker processing a message from a certain user at any one time. If a Worker has received a message with userId 1 and is still processing it, and another message with userId 1 is received, I want to make sure no other Worker picks up that message. But other messages coming in with different userIds should be processed by other available Workers.
userIds are not known beforehand, and the Publisher doesn't know how many Workers there are or anything specific about them; it just wants to schedule the messages for processing.
What you're asking is not possible with routing keys, but it is built into queues with a few settings.
If you define "queue_a" for a messages, "queue_b" for b messages, etc., you can then have as many consumers connect to them as you want.
RabbitMQ will only deliver a given message to a single consumer of a given queue.
The way it works with multiple consumers on a single queue is basic round-robin-style dispatch of the messages. That is, the first message will be delivered to one of the consumers, and the next message (assuming the first consumer is still busy) will be delivered to the next consumer.
So, that should satisfy the need to deliver the message to any given consumer of the queue.
To ensure your messages have an equal chance of getting to any of the consumers (and are not all delivered to the same consumer all the time), there are a few other settings you should put in place.
First, make sure to set the consumer's "no ack" setting (sometimes called "auto ack") to false. This will force you to ack the message from your code.
Lastly, set the "consumer prefetch" limit of the consumer to 1.
With this combination of settings, a single consumer will retrieve a single message and begin working on it. While that consumer is working, any message waiting in the queue will be delivered to other consumers if any are available. If there are none available, the message will wait in the queue until a consumer is available.
With this, you should be able to achieve the behavior you are wanting, on a given queue.
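A minimal sketch of those settings with the RabbitMQ Java client (the queue name is just an example):

```java
import com.rabbitmq.client.*;

public class SingleMessageWorker {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();
        channel.queueDeclare("queue_a", true, false, false, null);

        channel.basicQos(1);                 // consumer prefetch limit of 1
        boolean autoAck = false;             // "no ack" / "auto ack" disabled: we ack manually
        channel.basicConsume("queue_a", autoAck, (tag, delivery) -> {
            work(new String(delivery.getBody()));
            // Ack only when the work is done, so the broker will not push
            // another message to this consumer in the meantime.
            channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
        }, tag -> { });
    }

    private static void work(String body) { /* handle the message */ }
}
```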
...
Keep in mind this only applies to queues, though. Routing keys cannot be managed this way. All matched routing keys from an exchange will cause a copy of the message to be sent to the destination queue.
I'm implementing Kafka consumer with custom acknowledgement mechanism using spring-integration-kafka.
The code from this example was used.
What I'm trying to achieve is when an exception is thrown, the acknowledgement should not be sent back to Kafka (i.e. no offset commit should be performed) so the next fromKafka.receive(10000) method call will return the same message as the previous one.
But I am faced with a problem: even if the acknowledgement isn't sent to Kafka, the consumer somehow knows the offset of the next message and continues to read new messages, in spite of the fact that the offset value in the offset topic remains unchanged.
How can I make the consumer re-read a message in case of a failure?
There's not currently any support for re-fetching failed messages.
One thing you can do is add retry (e.g. using a request handler retry advice) downstream of the message-driven adapter.
By not acking, the message(s) will be delivered after a restart but not during the current instantiation.
Since messages are prefetched into the adapter, one thing you could do is detect the failure, stop the adapter, drain the prefetched messages and restart.
You could inject a custom ErrorHandler to stop the adapter and signal to your downstream flow that it should ignore the draining messages.
EDIT
There is now a SeekToCurrentErrorHandler.
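A hedged sketch of wiring it into a listener container factory with Spring Kafka (bean names are illustrative, and the exact error-handler API depends on your Spring Kafka version):

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.listener.SeekToCurrentErrorHandler;

@Configuration
public class KafkaRetryConfig {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
            ConsumerFactory<String, String> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        // On a listener exception, seek back to the failed record's offset so that
        // record (and any later prefetched records) are redelivered instead of skipped.
        factory.setErrorHandler(new SeekToCurrentErrorHandler());
        return factory;
    }
}
```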