I receive messages from a RabbitMQ broker.
How do I issue an ack or nack, inside a Spark action (like foreach/foreachPartition), to retry message processing at a later time or just discard it?
I can't just pass along the deliveryTag, connect to rabbit inside an action and send the ack, since the deliveryTag is bound to a particular channel.
Spark tasks typically run on remote nodes, so any object a task interacts with must be either private to the task or a shared variable. RabbitMQ connection objects (any sort of connection, actually) established on the driver node will not be carried to remote nodes. Therefore, in order to send an ack or nack to RabbitMQ, you need to do it outside of tasks, unless you are running everything on the driver node.
In short, try to find a way to signal message-consumption failures back to the driver node, and have the driver send all acks and nacks.
I am using the stompit package in Node.js to connect to an ActiveMQ queue and subscribe to messages. I used the ConnectFailover class to create the connection and the ChannelPool class to create a pool.
The problem I am facing is that once the connection is made, if there is no message in the queue the client stays connected.
What I need is a way to disconnect when there is no message left to read from the queue. I don't see any option for this in the stompit documentation.
There is no way to do that with STOMP as per this issue. As a general rule, brokers like AMQ rarely allow consumers to inspect queue properties like message count.
Unless you can somehow leverage JMX from your Node.js code, the easiest way would be to create a timer with client.disconnect() as its callback and wait for an amount of time suitable for your system. Whenever a message is consumed, reset the timer.
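A minimal sketch of that timer approach, assuming a plain stompit client and a placeholder /queue/example destination (the 30-second timeout is arbitrary; tune it for your system):

const stompit = require('stompit');

const IDLE_MS = 30000; // disconnect after 30s with no messages

stompit.connect({ host: 'localhost', port: 61613 }, (error, client) => {
  if (error) return console.error('connect failed: ' + error.message);

  let idleTimer;
  // (Re)arm the timer; when it fires, we assume the queue is drained and disconnect
  const resetIdleTimer = () => {
    clearTimeout(idleTimer);
    idleTimer = setTimeout(() => client.disconnect(), IDLE_MS);
  };
  resetIdleTimer();

  client.subscribe({ destination: '/queue/example', ack: 'auto' }, (err, message) => {
    if (err) return console.error('subscribe failed: ' + err.message);
    message.readString('utf-8', (readErr, body) => {
      if (readErr) return console.error('read failed: ' + readErr.message);
      console.log('received: ' + body);
      resetIdleTimer(); // a message arrived, so push the disconnect back
    });
  });
});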
I'm implementing a web server in Node.js which must serve a lot of concurrent requests. As Node.js processes requests one by one, it keeps them in an internal queue (in libuv, I guess).
I also want to run my web server using the cluster module, so there will be one request queue per worker.
Questions:
1. If any worker dies, how can I retrieve its queued requests?
2. How can I put the retrieved requests into other workers' queues?
3. Is there any API to access alive workers' request queues?
By No. 3 I mean I want to keep queued requests somewhere such as Redis (if possible), so that in case of a server crash, failure, or even a hardware restart I can retrieve them.
Since you mentioned in the tags that you are already using (or want to use) Redis, you can use a Redis-based queue manager to do all the work for you.
Check out https://github.com/OptimalBits/bull (or its alternatives).
bull has the concept of a queue: you add jobs to the queue and listen to the same queue from different processes/VMs. bull will send each job to only one listener, and you can control how many jobs each listener processes at the same time (the concurrency level).
In addition, if one of the jobs fails to run (in other words, the queue's listener threw an error), bull will try to give the same job to a different listener.
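A minimal sketch of that flow, assuming a local Redis instance; the 'requests' queue name, the handleRequest function, and the numbers are placeholders:

const Queue = require('bull');

// Producer and workers all open the same Redis-backed queue by name
const requestQueue = new Queue('requests', 'redis://127.0.0.1:6379');

// Producer side: enqueue work and ask bull to retry up to 3 times on failure
requestQueue.add({ url: '/some/path' }, { attempts: 3 });

// Worker side (run in as many processes/VMs as you like):
// process at most 5 jobs concurrently per listener
requestQueue.process(5, async (job) => {
  // throwing here marks the job as failed, and bull will retry it,
  // possibly on a different listener
  return handleRequest(job.data); // handleRequest is hypothetical
});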
After the master has forked the workers and wants to start sending messages to the worker processes, is specifying a particular worker the only way to pass a message? The documentation suggests so.
const worker = cluster.fork();
worker.send('hi there');
If so, what is the scheduling policy all about? Is there a way we could write:
master.sendToWorker('Hi there!');
and have it automatically select a worker according to the default/configured algorithm?
The scheduling policy is for handling incoming connections. If you have 3 workers that are Express applications, when a user connects, only one worker will handle that request. The policy will be either round-robin (the default) or the OS's choice, so it does not give you much flexibility.
Now, that does not help with your request, which is to send messages from the master. The correct solution depends on the nature of the messages you'd like to send.
If you are sending a message to make a worker start a task, messages might not be the best solution; you might prefer a job queue instead. But if you'd like to use messages anyway, your master could simply keep track of available workers and send each message to a free one, removing it from the available set until it reports that it has finished.
You could simply use your own round-robin implementation; in one line of code it would look like this:
workersList[++messageCount%workersList.length].send("message");
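Expanded into a small self-contained sketch (the sendToWorker helper is illustrative, not part of the cluster API):

const cluster = require('cluster');
const os = require('os');

if (cluster.isMaster) {
  const workersList = [];
  let messageCount = 0;

  for (let i = 0; i < os.cpus().length; i++) {
    workersList.push(cluster.fork());
  }

  // Round-robin over the workers on every send
  const sendToWorker = (message) => {
    workersList[messageCount++ % workersList.length].send(message);
  };

  sendToWorker('Hi there!');
} else {
  process.on('message', (message) => {
    console.log('worker ' + process.pid + ' received: ' + message);
  });
}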
If you wanted to use the native policy, you could have your workers listen on a specific port and have your master send a message to that port on localhost. It should work, but you'll have to implement your own messaging system...
IMO, if you want to send a message, you usually know who you want to send it to. If you want to send a message to a "random" recipient, it may be a sign that a message is not the appropriate way to communicate in that scenario.
I am using a Hazelcast local listener for my use case. I have read the documentation and understand that it uses a queue to push events to listeners.
What happens to the events in the queue of a node that goes down? Will they be discarded, or will they stay in the queue and be routed to a new node if a replica is configured? Please clarify.
Is there any way to acknowledge successful receipt of a message with some kind of callback, so that events are never lost?
LocalListener queues are not distributed (that would involve serialization). In any case, listeners are not expected to perform long-running operations, so your queue should always be close to empty. Queues tend to have only one of two states: empty or full (depending on whether the consumer is fast or slow).
And yes, if the node goes down while your local queue still holds events, you'll lose those events.
What is your use case? Do you have slow consumers? Consider offloading their work to a Hazelcast distributed queue and executing it independently of the event threads.
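For illustration only, here is what that offloading pattern could look like from the Hazelcast Node.js client (hazelcast-client); the queue name and payload are placeholders, and the equivalent IQueue API exists on the Java side:

const { Client } = require('hazelcast-client');

async function main() {
  const client = await Client.newHazelcastClient();

  // A distributed queue lives in the cluster, not in one member's memory,
  // so items survive the death of the member that produced them (with backups)
  const slowWork = await client.getQueue('slow-events');

  // Fast path (e.g. inside an event listener): just hand the event off
  await slowWork.offer('event-payload');

  // Separate consumer, running independently of the event threads
  const item = await slowWork.take();
  console.log('processing: ' + item);

  await client.shutdown();
}

main().catch(console.error);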
For example, if there's a network outage and your producer loses its connection to your RabbitMQ broker, how can you prevent messages that need to be queued from being black-holed? I have a few ideas, one of them being to write all messages to a local DB, remove them once they're acked, and periodically resend them after some time period; but that only works if your connection factory is set up for publisher confirms.
I'm just generating messages from my test application to simulate event logging. I'm essentially trying to create a durable producer. Is there also a way to detect when you can reconnect to RabbitMQ? I see there's a ConnectionListener interface, but it seems you cannot send messages from within the ConnectionListener to flush an internal queue.
If you have a SimpleMessageListenerContainer (perhaps listening on a dummy queue), it will keep trying to reconnect (and fire the connection listener when it succeeds). Or you can have a simple loop that calls createConnection() on the connection factory from time to time (it won't create a new connection each time; it just returns the single shared connection, if open); this will also fire the listener when a new connection is made.
You can use transactions instead of publisher confirms, but they're much slower due to the handshake. It depends on what your performance requirements are.
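The question is about Spring AMQP, but the buffer-until-confirmed idea itself is easy to sketch. Here is a rough Node.js version using amqplib (a different client, shown only to illustrate the pattern; the queue name and retry interval are placeholders):

const amqp = require('amqplib');

const pending = new Map(); // locally buffered messages not yet confirmed
let nextId = 0;

async function startPublisher() {
  const conn = await amqp.connect('amqp://localhost');
  const ch = await conn.createConfirmChannel(); // enables publisher confirms
  await ch.assertQueue('events', { durable: true });

  // When the connection drops, keep retrying; the buffer is replayed on success
  conn.on('close', () => setTimeout(reconnect, 5000));

  const publish = (body, id = nextId++) => {
    pending.set(id, body); // keep a copy until the broker confirms it
    ch.sendToQueue('events', Buffer.from(body), { persistent: true }, (err) => {
      if (!err) pending.delete(id); // confirmed, safe to forget
    });
  };

  // Replay anything buffered while we were disconnected
  for (const [id, body] of pending) publish(body, id);
  return publish;
}

function reconnect() {
  startPublisher().catch(() => setTimeout(reconnect, 5000));
}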