Is there a Queue in crystal-lang? - multithreading

How do I implement the producer-consumer pattern in Crystal? I'm looking for something like Ruby's Queue - http://ruby-doc.org/core-2.2.0/Queue.html
I probably need to use a Channel, but I don't understand how, because send blocks until the consumer receives.
I mean:
channel = Channel(Int32).new

spawn do
  15.times do |i|
    # ... do something that takes time
    puts "send #{i}"
    channel.send i # blocks until someone receives, but I want to keep doing the time-consuming work
  end
end

spawn do
  loop do
    i = channel.receive
    puts "receive #{i}"
    sleep 0.5
  end
end

sleep 7.5

You're right, using a Channel is a great way to handle concurrent communication in Crystal.
Note that by default a Channel is unbuffered: send blocks until another fiber receives the value.
But you can use a buffered Channel to send multiple values without each one having to be received immediately. This is essentially a FIFO queue where new items are added at one end and removed from the other.
# Create a channel with a buffer for 32 values
channel = Channel(Int32).new(32)
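Putting that together with the code from the question, the producer can now run ahead of the consumer until the buffer fills up. A minimal sketch (the buffer size of 32 is arbitrary):

channel = Channel(Int32).new(32)

spawn do
  15.times do |i|
    puts "send #{i}"
    channel.send i # returns immediately while the buffer has room
  end
end

spawn do
  loop do
    i = channel.receive
    puts "receive #{i}"
    sleep 0.5
  end
end

sleep 8 # give the consumer time to drain the buffer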

Forcing a message loop to yield

I hope this question isn't too broad.
I'm working with a legacy Ada application. This application is built around a very old piece of middleware that handles, among other things, our IPC. For the sake of this question, I can boil the middleware's provisions down to
1: a message loop that processes messages (from other programs or this program)
2: a function to send messages to this program or others
3: a function to read from a database
The program operates mainly on a message loop - simply something like
loop
   This_Msg := Message_Loop.Wait_For_Message; -- Blocking wait call
   -- Do things based on This_Msg's ID
end loop;
however there are also callbacks that can be triggered by external stimuli. These callbacks run in their own threads. Some of these callbacks call the database-reading function, which has always been fine, EXCEPT, as we recently discovered, in a relatively rare condition. When this condition occurs, it turns out it isn't safe to read from the database when the message loop is executing its blocking Wait_For_Message.
It seemed like a simple solution would be to use a protected object to synchronize the Wait_For_Message and database read: if we try to read the database while Wait_For_Message is blocking, the read will block until Wait_For_Message returns, at which point the Wait_For_Message call will be blocked until the database read is complete. The next problem is that I can't guarantee the message loop will receive a message in a timely fashion, meaning that the database read could be blocked for an arbitrary amount of time. It seems like the solution to this is also simple: send a do-nothing message to the loop before blocking, ensuring that the Wait_For_Message call will yield.
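For reference, the mutual-exclusion part of that idea could be a protected object along these lines (a minimal sketch with illustrative names; note it would still be held across the blocking Wait_For_Message, which is exactly why the do-nothing message comes up):

protected Mutex is
   entry Seize;          -- blocks while another caller holds the lock
   procedure Release;
private
   Busy : Boolean := False;
end Mutex;

protected body Mutex is
   entry Seize when not Busy is
   begin
      Busy := True;
   end Seize;

   procedure Release is
   begin
      Busy := False;
   end Release;
end Mutex;

The message loop would call Mutex.Seize before Wait_For_Message and Mutex.Release after it; the callbacks would do the same around the database read.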
What I'm trying to wrap my head around is:
If I send the do-nothing message and THEN block before the database read, I don't think I can guarantee that Wait_For_Message won't have returned, yielded, processed the do-nothing message, and started blocking again before the pre-database read block. I think I conceptually need to start blocking and THEN push a message, but I'm not sure how to do this. I think I could handle it with a second layer of locks, but I can't think of the most efficient way to do so, and don't know if that's even the right solution. This is really my first foray into concurrency in Ada, so I'm hoping for a pointer in the right direction.
Perhaps you should use a task for this. In the following, the task waits at the select to either process a message or access the DB; another call on an entry made during processing queues on that entry until the loop reiterates the select, which eliminates the problem altogether... unless, somehow, your DB access calls the Message entry, but that shouldn't happen.
With Ada.Containers.Indefinite_Holders,
     Ada.Text_IO;

Package Example is
   -- DB_Rec is assumed to be declared here (or imported from your middleware).
   Task Message_Processor is
      Entry Message( Text : String );
      Entry Read_DB( Data : DB_Rec );
   End Message_Processor;
End Example;

Package Body Example is
   Task Body Message_Processor is
      Package Message_Holder is new Ada.Containers.Indefinite_Holders
        (Element_Type => String);
      Package DB_Rec_Holder is new Ada.Containers.Indefinite_Holders
        (Element_Type => DB_Rec);
      Current_Message : Message_Holder.Holder;
      Current_DB_Rec  : DB_Rec_Holder.Holder;
   Begin
      MESSAGE_LOOP:
      loop
         select
            accept Message (Text : in String) do
               Current_Message := Message_Holder.To_Holder( Text );
            end Message;
            -- Process the message **outside** the rendezvous.
            delay 1.0; -- simulate processing.
            Ada.Text_IO.Put_Line( Current_Message.Element );
         or
            accept Read_DB (Data : in DB_Rec) do
               Current_DB_Rec := DB_Rec_Holder.To_Holder( Data );
            end Read_DB;
            -- Process the DB-record here, **outside** the rendezvous.
         or
            Terminate;
         end select;
      end loop MESSAGE_LOOP;
   End Message_Processor;
End Example;

Azure queues getMessages method in sdk not working as expected

I have created a queue in Azure Queue storage and enqueued two items in it. Using the Node.js SDK, I create a timer that executes every 5 seconds and calls:
azure.createQueueService("precondevqueues", "<key>").getMessages(queueName, {numOfMessages : 1, visibilityTimeout: 1 }, callback)
I expect the same one of the two messages to show up every 5 seconds, but that does not seem to be the case: the output of this call alternates between the two messages.
This should not happen, since visibilityTimeout is set to 1 and hence, after 1 second, the message dequeued in the first call should be visible again before the next getMessages call is made.
As noted here, FIFO ordering is not guaranteed. It may be that most of the time messages are fetched in FIFO order, but that is not guaranteed, and Azure may return them in whatever order is best for its implementation.
Messages are generally added to the end of the queue and retrieved
from the front of the queue, although first in, first out (FIFO)
behavior is not guaranteed.
Aha, my mistake! I re-read the getMessages documentation carefully and realized that getMessages dequeues the message but keeps an invisible copy in the queue. If the message processor does not delete the message before the visibility timeout expires, the copy becomes visible again, so the message effectively goes back to the end of the queue.
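So the usual pattern is to process the message and then explicitly delete it before its timeout runs out. A rough sketch with the azure-storage Node.js SDK (the queue name and the 30-second timeout are illustrative):

var azure = require('azure-storage');
var queueService = azure.createQueueService('precondevqueues', '<key>');

queueService.getMessages(queueName, { numOfMessages: 1, visibilityTimeout: 30 }, function (error, messages) {
  if (error || messages.length === 0) return;
  var message = messages[0];
  // ... process message.messageText ...
  queueService.deleteMessage(queueName, message.messageId, message.popReceipt, function (error) {
    if (error) console.error('delete failed; the message will become visible again', error);
  });
});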

RabbitMQ: how to limit consuming rate

I need to limit the rate of consuming messages from rabbitmq queue.
I have found many suggestions, but most of them recommend the prefetch option. That doesn't do what I need: even if I set prefetch to 1, the rate is about 6000 messages/sec, which is far too many for my consumer.
I need to limit the rate to roughly 70 to 200 messages per second, i.e. one message every 5-14 ms, with no simultaneous messages.
I'm using Node.js with the amqp.node library.
Implementing a token bucket might help:
https://en.wikipedia.org/wiki/Token_bucket
You can write a producer that publishes to the "token bucket queue" at a fixed rate, with a TTL on each message (maybe it expires after a second?), or just set a maximum queue length equal to your rate per second. Consumers that receive a "normal queue" message must also receive a "token bucket queue" message in order to process it, effectively rate limiting the application.
Node.js + amqplib example:
var queueName = 'my_token_bucket';
rabbitChannel.assertQueue(queueName, {durable: true, messageTtl: 1000, maxLength: bucket.ratePerSecond});
writeToken();

function writeToken() {
  // Buffer.from replaces the deprecated new Buffer(...)
  rabbitChannel.sendToQueue(queueName, Buffer.from(new Date().toISOString()), {persistent: true});
  setTimeout(writeToken, 1000 / bucket.ratePerSecond);
}
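The answer leaves out the consuming side; a hedged sketch of how the pairing could look with amqplib's callback API (the 'work_queue' name and the pairing logic are assumptions, not part of the original answer):

rabbitChannel.prefetch(1);
rabbitChannel.consume(queueName, function (token) {
  // One token from the bucket permits processing one work message.
  rabbitChannel.get('work_queue', {}, function (err, msg) {
    if (msg) {
      // ... process msg.content ...
      rabbitChannel.ack(msg);
    }
    rabbitChannel.ack(token);
  });
});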
I've already found a solution.
I use the nanotimer module from npm to calculate delays.
I calculate delay = 1 / [messages_per_second] in nanoseconds.
Then I consume messages with prefetch = 1.
I calculate the real delay as delay - [message_processing_time].
Then I set a timeout equal to that real delay before sending the ack for the message.
It works perfectly. Thanks to all.
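In code, the idea looks roughly like this with amqplib (a sketch using plain setTimeout with millisecond resolution instead of nanotimer; the queue name and processMessage handler are illustrative):

var ratePerSecond = 100;            // target within the 70-200 msg/s range
var delayMs = 1000 / ratePerSecond; // time budget per message

channel.prefetch(1); // at most one unacked message in flight

channel.consume('work_queue', function (msg) {
  var started = Date.now();
  processMessage(msg); // your handler
  var remaining = Math.max(0, delayMs - (Date.now() - started));
  // Delaying the ack delays the dispatch of the next message.
  setTimeout(function () {
    channel.ack(msg);
  }, remaining);
});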
See 'Fair Dispatch' in RabbitMQ Documentation.
For example in a situation with two workers, when all odd messages are heavy and even messages are light, one worker will be constantly busy and the other one will do hardly any work. Well, RabbitMQ doesn't know anything about that and will still dispatch messages evenly.
This happens because RabbitMQ just dispatches a message when the message enters the queue. It doesn't look at the number of unacknowledged messages for a consumer. It just blindly dispatches every n-th message to the n-th consumer.
In order to defeat that we can use the prefetch method with the value of 1. This tells RabbitMQ not to give more than one message to a worker at a time. Or, in other words, don't dispatch a new message to a worker until it has processed and acknowledged the previous one. Instead, it will dispatch it to the next worker that is not still busy.
I don't think RabbitMQ can provide you this feature out of the box.
If you have only one consumer, then the whole thing is pretty easy: you just let it sleep between consuming messages.
If you have multiple consumers, I would recommend using some "shared memory" to keep the rate. For example, you might have 10 consumers consuming messages. To keep a 70-200 messages-per-second rate across all of them, each one makes a call to Redis to see if it is eligible to process a message. If yes, it updates Redis to show the other consumers that one message is currently in process.
If you have no control over the consumer, then implement option 1 or 2 and publish messages back to Rabbit. This way the original consumer will consume messages at the desired pace.
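The Redis eligibility check could be a simple counter per one-second window; a hedged sketch with the node_redis callback API (the key naming and the cap of 200 are assumptions):

function tryAcquire(redisClient, callback) {
  var key = 'rate:' + Math.floor(Date.now() / 1000); // current 1-second window
  redisClient.incr(key, function (err, count) {
    if (err) return callback(err);
    redisClient.expire(key, 2);   // let old windows expire on their own
    callback(null, count <= 200); // eligible while under the cap
  });
}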
This is how I fixed mine with just setTimeout.
I set mine to consume every 200 ms, which processes 5 messages per second; in my case the handler does an update if the record exists.
channel.consume(transactionQueueName, async (data) => {
  let dataNew = JSON.parse(data.content);
  const processedTransaction = await seperateATransaction(dataNew);
  // Delay the ack to avoid duplicate entries - important: don't remove the setTimeout
  setTimeout(function () {
    channel.ack(data);
  }, 200);
});
Done

Message ordinal-number by enqueuing order

My application (.NET-based) gets messages from a queue in a multithreaded fashion, and I'm worried that I may receive messages out of order because one thread can be quicker than another. For instance, given the following queue state:
[Message-5 | Message-4 | Message-3 | Message-2 | Message-1]
In a multithreaded operation, msg #2 may arrive before msg #1, even though msg #1 was first in the queue, due to threading issues (thread time slices, thread scheduling, etc.).
In such a situation, it would be great if each message were stamped with an ordinal/sequence number when it was enqueued. Even if I get the messages out of order, I could still reorder them at some point within my application using that attribute.
Any known mechanism to achieve it in a Websphere MQ environment?
You have 2 choices:
(1) Use Message Grouping in MQ as whitfiea mentioned or
(2) Change your application to be single threaded.
Note: If the sending application does not set the MQMD MsgId field then the queue manager will generate a unique number (based on queue manager name, date & time) and store it in the message's MQMD MsgID field.
You can obtain the MessageSequenceNumber from the MQMessage if the messages are put to the queue in a message group. The MessageSequenceNumber will either be the order in which the messages were put to the queue (the default) or be defined by the application that put them there.
See the MessageSequenceNumber here for more details
Yes, if the originating message has an ordinal, then as you receive your data you could use a dictionary sorted by sequence number (note that SortedDictionary is not thread safe by itself, so guard it with a lock):
SortedDictionary<int,Message>
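A minimal sketch of that reordering step (Message, Deliver and the field names are illustrative):

private readonly object _sync = new object();
private readonly SortedDictionary<int, Message> _pending = new SortedDictionary<int, Message>();
private int _nextSeq = 1;

public void OnReceived(int seq, Message msg)
{
    lock (_sync) // SortedDictionary is not thread safe on its own
    {
        _pending[seq] = msg;
        // Flush every message that is now in order.
        while (_pending.TryGetValue(_nextSeq, out var next))
        {
            _pending.Remove(_nextSeq);
            Deliver(next); // hand off strictly in enqueue order
            _nextSeq++;
        }
    }
}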

Maintaining Order in a Multi-Threaded Pipeline

I'm considering a multi-threaded architecture for a processing pipeline. My main processing module has an input queue, from which it receives data packets. It then performs transformations on these packets (decryption, etc.) and places them into an output queue.
The threading comes in where many input packets can have their contents transformed independently from one another.
However, the punchline is that the output queue must have the same ordering as the input queue (i.e., the first pulled off the input queue must be the first pushed onto the output queue, regardless of whether its transformations finished first.)
Naturally, there will be some kind of synchronisation at the output queue, so my question is: what would be the best way of ensuring that this ordering is maintained?
Have a single thread read the input queue, post a placeholder on the output queue, and then hand the item over to a worker thread to process. When the data is ready the worker thread updates the placeholder. When the thread that needs the value from the output queue reads the placeholder it can then block until the associated data is ready.
Because only a single thread reads the input queue, and this thread immediately puts the placeholder on the output queue, the order in the output queue is the same as that in the input. The worker threads can be numerous, and can do the transformations in any order.
On platforms that support futures, they are ideal as the placeholder. On other systems you can use an event, monitor or condition variable.
With the following assumptions:
there should be one input queue, one output queue and one working queue
there should be only one input queue listener
each output message should contain a wait handle and a pointer to the worker/output data
there may be an arbitrary number of worker threads
I would consider the following flow (sketched in code after the lists):
The input queue listener does these steps:
extracts the input message;
creates the output message (initializes the worker data struct, resets the wait handle);
enqueues a pointer to the output message into the working queue;
enqueues a pointer to the output message into the output queue.
A worker thread does the following:
waits on the working queue to extract a pointer to an output message from it;
processes the message based on the given data and sets the event when done.
The consumer does the following:
waits on the output queue to extract a pointer to an output message from it;
waits on the handle until the output data is ready;
does something with the data.
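A rough C# rendering of that flow, using BlockingCollection for the queues and ManualResetEventSlim as the wait handle (all names and the stand-in transform are illustrative):

using System;
using System.Collections.Concurrent;
using System.Threading;

class OutputSlot
{
    public ManualResetEventSlim Ready = new ManualResetEventSlim(false);
    public string Result; // the transformed packet
}

class Pipeline
{
    BlockingCollection<Tuple<string, OutputSlot>> working = new BlockingCollection<Tuple<string, OutputSlot>>();
    BlockingCollection<OutputSlot> output = new BlockingCollection<OutputSlot>();

    // Single listener: output order is fixed the moment the slot is enqueued.
    public void Listen(BlockingCollection<string> input)
    {
        foreach (var packet in input.GetConsumingEnumerable())
        {
            var slot = new OutputSlot();
            working.Add(Tuple.Create(packet, slot)); // hand to any worker
            output.Add(slot);                        // placeholder preserves order
        }
    }

    // Any number of workers, finishing in any order.
    public void Work()
    {
        foreach (var item in working.GetConsumingEnumerable())
        {
            item.Item2.Result = item.Item1.ToUpperInvariant(); // stand-in transform
            item.Item2.Ready.Set();
        }
    }

    // Consumer sees results strictly in input order.
    public void Consume()
    {
        foreach (var slot in output.GetConsumingEnumerable())
        {
            slot.Ready.Wait(); // block until this slot's data is ready
            Console.WriteLine(slot.Result);
        }
    }
}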
That's going to be implementation-specific. One general solution is to number the input items and preserve the numbering so you can later sort the output items. This could be done once the output queue is filled, or it could be done as part of filling it. In other words, you could insert them into their proper position and only allow the queue to be read when the next available item is sequential.
edit
I'm going to sketch out a basic scheme, trying to keep it simple by using the appropriate primitives:
Instead of queueing a Packet into the input queue, we create a future value around it and enqueue that into both the input and output queues. In C#, you could write it like this:
var future = new Lazy<Packet>(delegate() { return Process(packet); }, LazyThreadSafetyMode.ExecutionAndPublication);
A thread from the pool of workers dequeues a future from the input queue and calls future.Value, which causes the delegate to run just in time and returns once the delegate is done processing the packet.
One or more consumers dequeues a future from the output queue. Whenever they need the value of the packet, they call future.Value, which returns immediately if a worker thread has already called the delegate.
Simple, but works.
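To make the flow concrete, the worker and consumer sides might look like this (assuming both queues are BlockingCollection<Lazy<Packet>>; the names are illustrative):

// Worker thread: force the computation; Lazy guarantees Process runs exactly once.
var future = inputQueue.Take();
var ignored = future.Value;

// Consumer: returns immediately if a worker already computed the value,
// otherwise computes it right here.
var next = outputQueue.Take();
Consume(next.Value);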
If you are using a windowed approach (known number of elements), use an array for the output queue, for example if it is media streaming and you discard packets which haven't been processed quickly enough.
Otherwise, use a priority queue (a special kind of heap, often implemented on top of a fixed-size array) for the output items.
You need to add a sequence number, or any datum on which you can sort the items, to each data packet. A priority queue is a tree-like structure that maintains the order of items on insert/pop.
