MPI_Recv message ordering vs MPI_Send message ordering - openmpi

While trying to simulate the behaviour of a network using OpenMPI, I am experiencing an issue which can be summed up as follows:
Rank 2 sends a message (message1) to rank 0;
Rank 2 sends a message (message2) to rank 1;
Rank 2 sends a message (message3) to rank 0;
In its turn, rank 0 receives both of its messages (message1 and message3) from rank 2 and forwards them to rank 1 in the order received;
Rank 1 receives the messages in the following order: message1, message3 and message2.
This behaviour occurs only once in a while when running the program. Usually (6 times out of 7), following the same pattern, rank 1 receives the messages in the expected order (i.e. message2, message1, message3).
I am only using the basic MPI_Recv and MPI_Send functions.

MPI makes no guarantee about the order in which messages from different senders will be received. In fact, a standard-mode send can complete before the matching receive has even started, if the implementation buffers the outgoing message: http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node40.html#Node40. The only order you can guarantee with standard-mode sends is that message3 will always arrive after message1, because messages between the same pair of processes do not overtake one another. Here is a possible (not unique) sequence that would lead to your anomalous scenario:
Rank 2 sends a message (message1) to rank 0;
Rank 0 receives message1 from rank2;
Rank 0 sends a message (message1) to rank 1;
Rank 1 receives message1 from rank0;
Rank 2 sends a message (message2) to rank 1;
Rank 2 sends a message (message3) to rank 0;
Rank 0 receives message3 from rank2;
Rank 0 sends a message (message3) to rank 1;
Rank 1 receives message3 from rank0;
Rank 1 receives message2 from rank2;
Essentially, a standard-mode MPI_Send is free to behave like either a buffered send (MPI_Bsend) or a synchronous send (MPI_Ssend), and it is not up to you which one is picked. Your anomaly is caused by the buffered behaviour. You can guarantee that the matching receive has started by using synchronous mode (MPI_Ssend) or ready mode (MPI_Rsend). The main difference between the two is that ready mode requires the receiver to already be waiting for the message (otherwise the send is erroneous), while synchronous mode will block until the matching receive has been posted.
If you are on a Linux platform, you can play with standard mode by using the nice command to increase the priority of rank 0 and decrease that of rank 2. The anomaly should happen more consistently the larger you make the priority difference. Here is a brief tutorial on the subject: http://www.nixtutor.com/linux/changing-priority-on-linux-processes/

Related

how to implement string priority queue with checking message priority

Messenger is used to send or receive text
messages. When someone is offline a messenger
maintains a buffer of messages which is delivered
to the receiver when he gets online.
Delivery is based on a simple timestamp rule: the message received
earlier is delivered to the receiver first, and a
message received later is delivered after it.
Sometimes a message in the buffer may have higher
priority, so it should be delivered earlier because of
that priority. Messages that are to be
delivered on a particular day or date are also held in
the same buffer. Your task is to select a suitable
data structure (heap or priority queue) and
implement the requirements mentioned above.
You need to implement a program which
shows a user being offline, displays the buffered messages,
and then, with a click or a keystroke, makes the user online
and delivers/displays the messages according to the
criteria mentioned above.
I don't understand how to check priority.
Sounds to me like the code that orders your heap has to check two things when comparing messages: the priority flag (the one that signals that a message must be delivered sooner than normal) and the timestamp.
So your comparison function looks something like:
// Returns -1, 0, or 1 to indicate whether msg1 sorts before,
// equal to, or after msg2.
int compare(const Message &msg1, const Message &msg2)
{
    if (msg1.priority && !msg2.priority)
        return -1; // msg1 has the priority flag set and msg2 doesn't
    if (msg2.priority && !msg1.priority)
        return 1;  // msg2 has the priority flag set and msg1 doesn't
    // At this point we know the priority flag is the same
    // for both messages, so compare timestamps.
    if (msg1.timestamp < msg2.timestamp)
        return -1;
    if (msg1.timestamp == msg2.timestamp)
        return 0;
    return 1;
}

How can I keep the two receiving processes from running twice in a row in a Promela model?

I am a beginner with Spin. I am trying to make the model run the two receiving processes (the proctype called Consumer in the model) alternately, i.e. consumer 1, consumer 2, consumer 1, consumer 2, ... But when I run this code, the output of the 2 consumer processes appears in random order. Can someone help me?
This is my code I am struggling with.
mtype = {P, C};
mtype turn = P;
chan ch1 = [1] of {bit};
byte current_consumer = 1;
byte previous_consumer;
active [2] proctype Producer()
{
    bit a = 0;
    do
    :: atomic {
           turn == P ->
           ch1 ! a;
           printf("The producer %d --> sent %d!\n", _pid, a);
           a = 1 - a;
           turn = C;
       }
    od
}
active [2] proctype Consumer()
{
    bit b;
    do
    :: atomic {
           turn == C ->
           current_consumer = _pid;
           ch1 ? b;
           printf("The consumer %d --> received %d!\n\n", _pid, b);
           assert(current_consumer == _pid);
           turn = P;
       }
    od
}
Sample output is shown in the attached photo.
First of all, let me draw your attention to this excerpt of atomic's documentation:
If any statement within the atomic sequence blocks, atomicity is lost, and other processes are then allowed to start executing statements. When the blocked statement becomes executable again, the execution of the atomic sequence can be resumed at any time, but not necessarily immediately. Before the process can resume the atomic execution of the remainder of the sequence, the process must first compete with all other active processes in the system to regain control, that is, it must first be scheduled for execution.
In your model, this is currently not causing any problem because ch1 is a buffered channel (i.e. it has size >= 1). However, any small change in the model could break this invariant.
From the comments, I understand that your goal is to alternate consumers, but you don't really care which producer is sending the data.
To be honest, your model already contains two examples of how processes can alternate with one another:
The Producer/Consumers alternate one another via turn, by assigning a different value each time
The Producer/Consumers alternate one another also via ch1, since this has size 1
However, both approaches are alternating Producer/Consumers rather than Consumers themselves.
One approach I like is message filtering with eval (see the docs): each Consumer knows its own id, waits for a token carrying its own id on a separate channel, and only when that token is available does it do some work.
byte current_consumer;

chan prod2cons = [1] of { bit };
chan cons = [1] of { byte };

proctype Producer(byte id; byte total)
{
    bit a = 0;
    do
    :: true ->
       // atomic is only for printing purposes
       atomic {
           prod2cons ! a;
           printf("The producer %d --> sent %d\n", id, a);
       }
       a = 1 - a;
    od
}

proctype Consumer(byte id; byte total)
{
    bit b;
    do
    :: cons ? eval(id) ->
       current_consumer = id;
       atomic {
           prod2cons ? b;
           printf("The consumer %d --> received %d\n\n", id, b);
       }
       assert(current_consumer == id);
       // yield the turn to the next Consumer
       cons ! ((id + 1) % total)
    od
}

init {
    run Producer(0, 2);
    run Producer(1, 2);
    run Consumer(0, 2);
    run Consumer(1, 2);
    // the first consumer is 0
    cons ! 0;
}
This model, briefly:
Producers/Consumers alternate via prod2cons, a channel of size 1. This enforces the following behaviour: after some producer creates a message, some consumer must consume it before anything else can be produced.
Consumers alternate via cons, a channel of size 1 containing a token value indicating which consumer is currently allowed to perform some work. All consumers peek at the contents of cons, but only the one with a matching id is allowed to consume the token and move on. At the end of its turn, that consumer creates a new token with the next id in the chain. Consumers alternate in round-robin fashion.
The output is:
The producer 0 --> sent 0
The consumer 1 --> received 0
The producer 1 --> sent 1
The consumer 0 --> received 1
The producer 1 --> sent 0
The consumer 1 --> received 0
...
The producer 0 --> sent 0
The consumer 1 --> received 0
The producer 0 --> sent 1
The consumer 0 --> received 1
The producer 0 --> sent 0
The consumer 1 --> received 0
The producer 0 --> sent 1
The consumer 0 --> received 1
Notice that producers do not necessarily alternate with one another, whereas consumers do -- as requested.

Several producers, one consumer: Avoid starvation

Given an instance of the producer-consumer problem, where several producers send messages to a single consumer: what techniques do you recommend to avoid starvation of producers when some of the messages arrive "at the same time" at the consumer? Until now I am considering:
Choosing "non-deterministically" by sampling some probability distribution (not sure how, considering that different numbers of messages arrive at different timestamps).
Using some counters and putting a producer to sleep for a while after it has sent n messages.
If you can use a priority queue, I think each producer can keep a sent-message counter. The queue then orders on the messageNumber and the date, such that a message sorts before another message if its sent number is lower than the other's.
In Java
class Message { // or you can implement Comparable<Message>
    final Date created = new Date();
    final int messageNumber;
    public Message(int m) { this.messageNumber = m; }
}

BlockingQueue<Message> queue = new PriorityBlockingQueue<Message>(11,
        new Comparator<Message>() {
            public int compare(Message m1, Message m2) {
                // lower sent-number first; break ties on creation date
                if (m1.messageNumber < m2.messageNumber) return -1;
                if (m2.messageNumber < m1.messageNumber) return 1;
                return m1.created.compareTo(m2.created);
            }
        });

class Provider {
    int currentMessage = 0;
    void send() {
        queue.offer(new Message(currentMessage++));
    }
}
So if Producer 1 adds 5 elements to the queue (first) and Producer 2 then adds 1, the queue will deliver
P1: 1
P2: 1
P1: 2
P1: 3
P1: 4
P1: 5
so Producer 2's message is not starved behind all of Producer 1's backlog.
One of the simplest and best ways is to process the messages in the order of their arrival (a simple FIFO queue would do the trick). It doesn't matter even if multiple messages come at the same time. This way, none of the producers will be starved.
One thing I would make sure of is that the consumer consumes messages more rapidly than the producers produce them. If not, producers will end up waiting on the consumer and there won't be any advantage to having multiple producers for a single consumer.

Seeking help with a MT design pattern

I have a queue of 1000 work items and an n-proc machine (assume n =
4). The main thread spawns n (= 4) worker threads at a time (25 outer
iterations) and waits for all threads to complete before processing
the next n (= 4) items, until the entire queue is processed:
for(i= 0 to queue.Length / numprocs)
for(j= 0 to numprocs)
{
CreateThread(WorkerThread,WorkItem)
}
WaitForMultipleObjects(threadHandle[])
The work done by each (worker) thread is not homogeneous. Therefore, in
one batch (of n), if thread 1 spends 1000 s doing work and the other 3
threads only 1 s, the above design is inefficient, because after 1 s the
other 3 processors are idling. Besides, there is no pooling: 1000
distinct threads are being created.
How do I use the NT thread pool (I am not familiar enough with it, hence the
long-winded question) and QueueUserWorkItem to achieve the above? The
following constraints should hold:
The main thread requires that all work items are processed before
it can proceed, so I would think that a WaitAll-like construct as above
is required.
I want to create only as many threads as there are processors (i.e. not 1000
threads at a time).
Also, I don't want to create 1000 distinct events, pass them to the worker
threads, and wait on all the events, with the QueueUserWorkItem API or
otherwise.
Existing code is in C++. I prefer C++ because I don't know C#.
I suspect that the above is a very common pattern and was looking for
input from you folks.
I'm not a C++ programmer, so I'll give you some halfway pseudocode for it:
tcount = 0
maxproc = 4
while queue_item = queue.get_next()  # depends on the queue implementation;
                                     # may well be: for i = 0; i < queue.length; i++
    while tcount == maxproc
        wait 0.1 seconds  # or some other interval that isn't as CPU-intensive
                          # as continuously running the loop
    tcount += 1  # must be atomic (reading the value and writing the new one
                 # must happen consecutively, without interruption from other
                 # threads); I think ++tcount would handle that in C++
    new thread(worker, queue_item)
while tcount > 0  # all items dispatched; wait for the stragglers to finish
    wait 0.1 seconds

function worker(item)
    # ...do stuff with item here...
    tcount -= 1  # must be atomic

Linux termios VTIME not working?

We've been bashing our heads off of this one all morning. We've got some serial lines set up between an embedded Linux device and an Ubuntu box. Our reads are getting screwed up because our code usually returns two (sometimes more, sometimes exactly one) reads per actual message sent, instead of exactly one.
Here is the code that opens the serial port. InterCharTime is set to 4.
void COMClass::openPort()
{
    struct termios tio;
    this->fd = -1;
    int tempFD;
    tempFD = open(port, O_RDWR | O_NOCTTY);
    if (tempFD < 0)
    {
        cerr << "the port is not opened: " << port << "\n";
        portOpen = 0;
        return;
    }
    tio.c_cflag = BaudRate | CS8 | CLOCAL | CREAD;
    tio.c_oflag = 0;
    tio.c_iflag = IGNPAR;
    newtio.c_cc[VTIME] = InterCharTime;
    newtio.c_cc[VMIN] = readBufferSize;
    newtio.c_lflag = 0;
    tcflush(tempFD, TCIFLUSH);
    tcsetattr(tempFD, TCSANOW, &tio);
    this->fd = tempFD;
    portOpen = true;
}
The other end is configured similarly for communication, and has one small section of particular interest:
while (1)
{
    sprintf(out, "\r\nHello world %lu", ++ulCount);
    puts(out);
    WritePort((BYTE *)out, strlen(out) + 1);
    sleep(2);
} // while
Now, when I run a read thread on the receiving machine, "hello world" is usually broken up over a couple messages. Here is some sample output:
1: Hello
2: world 1
3: Hello
4: world 2
5: Hello
6: world 3
where a number followed by a colon is one received message. Can you see any error we are making?
Thank you.
Edit:
For clarity, please view section 3.2 of the Linux Serial Programming HOWTO. To my understanding, with a VTIME of a couple of seconds (meaning VTIME, which is in tenths of a second, is set anywhere between 10 and 50, by trial and error) and a VMIN of 1, there should be no reason for the message to be broken up over two separate reads.
I don't see why you are surprised.
You are asking for at least one byte. If your read() is asking for more, which seems probable since you are surprised you aren't getting the whole string in a single read, it can return whatever data is available, up to the read() size. But all the data isn't available in a single read, so your string is chopped up between reads.
In this scenario the timer doesn't really matter. The interbyte timer isn't started until at least one byte has been received, and you have set the minimum at 1, so read() just returns whatever number of bytes (>= 1) are available, up to the read() size.
If you are still experiencing this problem (realizing the question is old), and your code is accurate: you are setting your VTIME and VMIN in the newtio struct, but the rest of the parameters in the tio struct, which is the one actually passed to tcsetattr().
