C# - Matrix Printing Incorrectly on a Separate Thread - multithreading

I'm currently making a Console Application game, and am working on the multiplayer version.
When I receive a packet, I create a new thread to handle it so that I can return to the ReceiveFrom call as soon as possible.
The new thread should reprint the whole matrix using the changes it received (it should print the matrix with the updated player positions).
The problem is that when the Print method is called on the new thread, the output is garbled: the matrix is printed inaccurately, and many of its characters end up scattered across the console screen.
Here are the methods:
void Receive()
{
    while (true)
    {
        byte[] msg = new byte[1024];
        client.Receive(msg);
        Thread handle = new Thread(() => HandleInput(msg));
        handle.Start();
    }
}
void HandleInput(byte[] msgX)
{
    string data = Encoding.ASCII.GetString(msgX);
    data = data.Replace("\0", "");
    if (data.Contains('*') || data.Contains('\0') || data.Contains(' ')) // Move Packet (moving objects in a matrice)
    {
        // for example, a move packet can be: '*'!12!10
        char ToMove = char.Parse(data.Split('!')[0]);
        int X = int.Parse(data.Split('!')[1]);
        int Y = int.Parse(data.Split('!')[2]);
        grid[X, Y] = ToMove;
        Print();
    }
}
void Print() // print a 20x50 matrix
{
    Console.Clear();
    for (int y = 0; y < 52; y++)
    {
        Console.Write('-');
    }
    Console.WriteLine();
    for (int i = 0; i < 20; i++)
    {
        Console.Write('|');
        for (int j = 0; j < 50; j++)
        {
            Console.Write(grid[i,j]);
        }
        Console.WriteLine('|');
    }
    for (int x = 0; x < 52; x++)
    {
        Console.Write('-');
    }
}
But, when I tried to treat the input and print the matrix on the same thread of the Receive method, it worked fine. The problem printing it only came when I printed the matrix on a separate thread.
So why does this happen? Why can't a separate thread just print the matrix correctly?

You create a new thread for every received buffer of up to one kilobyte, but those threads are not guaranteed to start in sequence, to wait for one another, or to finish in sequence. Several of them can therefore be inside Print at the same time, each clearing and writing to the same console, which interleaves their output.
There is also no guarantee that the network client will receive all the needed bytes in one call to the Receive method. It may happen in one call, but it may take several. It may also happen that two Send calls from the other side of the connection are merged into one message. All that is guaranteed in network communication is the ordering of the bytes in the stream, and nothing else.
Your next question will probably be "so how do I make it work"? I can't do the job for you, but this is in general how I would do it:
Read a chunk of bytes from the stream and store it in a buffer.
Check in a loop whether the buffer contains a complete message. If it does, remove that message from the buffer and process it. I would process it on the same thread, and only if that approach caused problems would I consider additional threads (or rather just one additional thread to which I would send all the messages).
Repeat the loop until all complete messages are removed from the buffer, then continue reading the stream.
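The steps above can be sketched as a small framing buffer. The sketch is in C++ for illustration (the question itself is C#), and it assumes that every message ends with a '\n' delimiter, which the original packet format does not define. The names MessageBuffer, append and next_message are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: accumulates raw bytes and hands out complete messages.
// Assumption: the sender terminates every message with '\n'.
class MessageBuffer {
public:
    // Store a freshly received chunk. It may hold a partial message,
    // exactly one message, or several messages merged together.
    void append(const char* data, std::size_t size) {
        buffer_.append(data, size);
    }

    // Remove the oldest complete message from the buffer and copy it to `out`.
    // Returns false when no complete message is available yet.
    bool next_message(std::string& out) {
        const std::size_t end = buffer_.find('\n');
        if (end == std::string::npos) {
            return false;  // message still incomplete, keep reading
        }
        out = buffer_.substr(0, end);
        buffer_.erase(0, end + 1);  // drop the message and its delimiter
        return true;
    }

private:
    std::string buffer_;
};
```

The receive loop would call append after every read and then call next_message until it returns false, processing each message on the same thread before reading again.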

Related

Error when trying to access frames from multithreading video capturing

I am using ltbb to stream from two cameras. ltbb creates two threads (one per camera) for simultaneous streaming. It uses concurrent queues for fetching frames. The following code snippet displays the frames:
while (waitKey(20) != 27)
{
    //Retrieve frames from each camera capture thread
    vector<Mat> iMats(capture_source.size());
    for (int i = 0; i < capture_source.size(); i++)
    {
        Mat frame;
        //Pop frame from queue and check if the frame is valid
        if (cam.frame_queue[i]->try_pop(frame))
        {
            //Show frame on Highgui window
            // IMats.push_back(frame);
            iMats[i] = frame;
            imshow(label[i], frame);
        }
    } // end of for - loop
    int x = opencv_tri(iMats);
}
The problem is that when I do iMats[i] = frame and pass iMats to another function, I get an error. It works fine when I comment out iMats and stop calling opencv_tri(iMats).
Error: Segmentation fault (core dumped)
Link to opencv_tri: opencv_tri.cpp
Can anyone please explain and help me to fix this?
Unfortunately I can't comment.
I think some elements of your vector are empty when you use them.
vector<Mat> iMats(capture_source.size());
This creates a vector of capture_source.size() default-constructed (empty) Mat objects. Indexing iMats[i] is valid, but any element whose try_pop failed stays an empty Mat, and opencv_tri receives it as-is.
I did not go through the whole opencv_tri function that you linked, but I spotted another possible error.
You have the following:
for(int i = 0;i<imats.size();i+=2)
{
    ...
    Mat imgATest = imats[i];
    Mat imgBTest = imats[i+1];
When imats.size() is odd, the last iteration runs with i equal to imats.size()-1 and tries to access:
Mat imgBTest = imats[i+1];
This is out of bounds, since i+1 equals imats.size().
You have to keep looping only while i+1 is still a valid index:
for(int i = 0;i+1<imats.size();i+=2)
{
This way it will not read past the end of the vector.
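As a rough illustration of the bounds check, the sketch below lists the index pairs a pairwise loop may visit. It uses the condition i + 1 < count, which also behaves well for an empty container: size() is unsigned, so size() - 1 would wrap around to a huge value when the vector is empty. The helper name pair_indices is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns the index pairs (i, i + 1) that a pairwise
// loop over `count` elements can visit without reading out of bounds.
std::vector<std::pair<std::size_t, std::size_t>> pair_indices(std::size_t count) {
    std::vector<std::pair<std::size_t, std::size_t>> pairs;
    // i + 1 < count guarantees that both i and i + 1 are valid indices.
    for (std::size_t i = 0; i + 1 < count; i += 2) {
        pairs.emplace_back(i, i + 1);
    }
    return pairs;
}
```

With an odd element count the last element simply has no partner and is skipped.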

Is it possible to block on wait on a semaphore without using the data when it's available?

The software I'm working on is a data analyzer with a sliding window. I have 2 threads, one producer and one consumer, that use a circular buffer.
The consumer must process data only if the first element in the buffer is old enough, which means there are at least X elements in the buffer. After the processing, however, only X/4 elements can be deleted, because of the moving window.
My solution below works quite well, except that I have a trade-off between being fast (busy waiting in the check) and being efficient (sleeping for some time). The problem is that the right sleep time varies with load, thread scheduling and processing complexity, so I can potentially slow down performance.
Is there a way to poll a semaphore to check if there are at least X elements, blocking the thread otherwise, but acquiring only X/4 after the processing has been done? The tryAcquire option does not work because when it wakes the thread it consumes all the data, and not just a fraction of it.
I've thought about copying the elements into a second buffer, but there are actually 7 circular buffers of big data, so I'd like to avoid data duplication, or even data moving.
//common structs
QSemaphore written;
QSemaphore free;
int writtenIndex = 0;
int readIndex = 0;
myCircularBuffer buf;
bool scan = true;
//producer
void produceData(data d)
{
    while ( free.tryAcquire(1, 1000) == false && scan == true)
    {
        //avoid deadlock!
        //once per second give up waiting and check if closing
    }
    if (scan == false) return;
    buf.at(writtenIndex) = d;
    writtenIndex = (writtenIndex+1) % bufferSize;
    written.release();
}
//consumer
void consumeData()
{
    while(1)
    {
        //here goes the problem: usleep (slow), sched_yield (B.F.O.W.) or what?
        if (buf.at(writtenIndex).age - buf.at(readIndex).age < X)
        {
            //usleep(100); ? how much time?
            //sched_yield(); ?
            //tryAcquire not an option!
            continue;
        }
        processTheData();
        written.acquire(X/4);
        readIndex = (readIndex + X/4) % bufferSize;
        free.release(X/4);
    }
}
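One possible direction, sketched here with the standard library rather than QSemaphore, is a counter guarded by a mutex and a condition variable: the consumer blocks until the count reaches a threshold without taking anything, and later removes a smaller amount. This is only a sketch under the assumption that the "old enough" test can be expressed as an element count. ItemCounter and its member names are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical counter standing in for the `written` semaphore.
class ItemCounter {
public:
    // Producer side: announce `n` newly written elements.
    void add(std::size_t n) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            count_ += n;
        }
        changed_.notify_all();
    }

    // Consumer side: block until at least `threshold` elements are
    // available, without acquiring any of them.
    void wait_for_at_least(std::size_t threshold) {
        std::unique_lock<std::mutex> lock(mutex_);
        changed_.wait(lock, [&] { return count_ >= threshold; });
    }

    // Consumer side: give back `n` elements after processing.
    void remove(std::size_t n) {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(n <= count_);
        count_ -= n;
    }

    std::size_t count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return count_;
    }

private:
    mutable std::mutex mutex_;
    std::condition_variable changed_;
    std::size_t count_ = 0;
};
```

The consumer loop would then call wait_for_at_least(X), process the window, and call remove(X/4), so no sleep interval has to be tuned.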

Filling QGraphicsScene with items in thread

I have a thread in which a worker object runs an infinite loop. The following code should read coordinates from a list and place a QGraphicsEllipseItem at each of them. The list can be updated by another thread, so I protect it with a mutex. The list may grow, so I create new QGraphicsEllipseItem objects for it when needed.
int meter_to_pixel_ratio = 20;
int x_pixel, y_pixel;
int i;
forever {
    visualizationDataMutex->lock();
    while(ellipseList->count()<visualizationData->count())
    {
        qDebug() << "Creating new visual item...";
        ellipseList->append(new QGraphicsEllipseItem(0.0, 0.0, 10.0, 10.0));
        ellipseList->last()->setVisible(false);
        visualizationScene->addItem(ellipseList->last());
    }
    for(i=0; i<visualizationData->count(); i++)
    {
        x_pixel = meter_to_pixel_ratio*visualizationData->at(i)->x();
        y_pixel = meter_to_pixel_ratio*visualizationData->at(i)->y();
        ellipseList->at(i)->setPos(x_pixel, y_pixel);
        ellipseList->at(i)->setBrush(QBrush(*visualizationColor->at(i)));
        if(!ellipseList->at(i)->isVisible()) ellipseList->at(i)->setVisible(true);
    }
    visualizationDataMutex->unlock();
    // repaint scene
    visualizationScene->update();
    QThread::msleep(100);
}
The problem is that when I run the program I get a runtime error. I printed ellipseList->count() with qDebug() and it has exactly the number of elements needed (the same as visualizationData->count()). When I comment out these three lines:
//ellipseList->at(i)->setPos(x_pixel, y_pixel);
//ellipseList->at(i)->setBrush(QBrush(*visualizationColor->at(i)));
//if(!ellipseList->at(i)->isVisible()) ellipseList->at(i)->setVisible(true);
the program runs without crashing. I do not understand why this is happening, since no other function works with the QGraphicsView/QGraphicsScene. (The QGraphicsView was added to the main window in Qt Designer.)

lock-free bounded MPMC ringbuffer failure

I've been banging my head against (my attempt) at a lock-free multiple producer multiple consumer ring buffer. The basis of the idea is to use the innate overflow of unsigned char and unsigned short types, fix the element buffer to either of those types, and then you have a free loop back to beginning of the ring buffer.
The problem is - my solution doesn't work for multiple producers (it does though work for N consumers, and also single producer single consumer).
#include <atomic>
template<typename Element, typename Index = unsigned char> struct RingBuffer
{
    std::atomic<Index> readIndex;
    std::atomic<Index> writeIndex;
    std::atomic<Index> scratchIndex;
    Element elements[1 << (sizeof(Index) * 8)];
    RingBuffer() :
        readIndex(0),
        writeIndex(0),
        scratchIndex(0)
    {
        ;
    }
    bool push(const Element & element)
    {
        while(true)
        {
            const Index currentReadIndex = readIndex.load();
            Index currentWriteIndex = writeIndex.load();
            const Index nextWriteIndex = currentWriteIndex + 1;
            if(nextWriteIndex == currentReadIndex)
            {
                return false;
            }
            if(scratchIndex.compare_exchange_strong(
                currentWriteIndex, nextWriteIndex))
            {
                elements[currentWriteIndex] = element;
                writeIndex = nextWriteIndex;
                return true;
            }
        }
    }
    bool pop(Element & element)
    {
        Index currentReadIndex = readIndex.load();
        while(true)
        {
            const Index currentWriteIndex = writeIndex.load();
            const Index nextReadIndex = currentReadIndex + 1;
            if(currentReadIndex == currentWriteIndex)
            {
                return false;
            }
            element = elements[currentReadIndex];
            if(readIndex.compare_exchange_strong(
                currentReadIndex, nextReadIndex))
            {
                return true;
            }
        }
    }
};
The main idea for writing was to use a temporary index 'scratchIndex' that acts a pseudo-lock to allow only one producer at any one time to copy-construct into the elements buffer, before updating the writeIndex and allowing any other producer to make progress. Before I am called heathen for implying my approach is 'lock-free' I realise that this approach isn't exactly lock-free, but in practice (if it would work!) it is significantly faster than having a normal mutex!
I am aware of a (more complex) MPMC ringbuffer solution here http://www.1024cores.net/home/lock-free-algorithms/queues/bounded-mpmc-queue, but I am really experimenting with my idea to then compare against that approach and find out where each excels (or indeed whether my approach just flat out fails!).
Things I have tried:
Using compare_exchange_weak
Using more precise std::memory_order's that match the behaviour I want
Adding cacheline pads between the various indices I have
Making elements std::atomic instead of just Element array
I am sure that this boils down to a fundamental segfault in my head as to how to use atomic accesses to get round using mutex's, and I would be entirely grateful to whoever can point out which neurons are drastically misfiring in my head! :)
This is a form of the A-B-A problem. A successful producer looks something like this:
1. load currentReadIndex
2. load currentWriteIndex
3. cmpxchg store scratchIndex = nextWriteIndex
4. store element
5. store writeIndex = nextWriteIndex
If a producer stalls for some reason between steps 2 and 3 for long enough, it is possible for the other producers to produce an entire queue's worth of data and wrap back around to the exact same index so that the compare-exchange in step 3 succeeds (because scratchIndex happens to be equal to currentWriteIndex again).
By itself, that isn't a problem. The stalled producer is perfectly within its rights to increment scratchIndex to lock the queue—even if a magical ABA-detecting cmpxchg rejected the store, the producer would simply try again, reload exactly the same currentWriteIndex, and proceed normally.
The actual problem is the nextWriteIndex == currentReadIndex check between steps 2 and 3. The queue is logically empty if currentReadIndex == currentWriteIndex, so this check exists to make sure that no producer gets so far ahead that it overwrites elements that no consumer has popped yet. It appears to be safe to do this check once at the top, because all the consumers should be "trapped" between the observed currentReadIndex and the observed currentWriteIndex.
Except that another producer can come along and bump up the writeIndex, which frees the consumer from its trap. If a producer stalls between steps 2 and 3, when it wakes up the stored value of readIndex could be absolutely anything.
Here's an example, starting with an empty queue, that shows the problem happening:
Producer A runs steps 1 and 2. Both loaded indices are 0. The queue is empty.
Producer B interrupts and produces an element.
Consumer pops an element. Both indices are 1.
Producer B produces 255 more elements. The write index wraps around to 0, the read index is still 1.
Producer A awakens from its slumber. It had previously loaded both read and write indices as 0 (empty queue!), so it attempts step 3. Because the other producer coincidentally paused on index 0, the compare-exchange succeeds, and the store progresses. At completion the producer lets writeIndex = 1, and now both stored indices are 1, and the queue is logically empty. A full queue's worth of elements will now be completely ignored.
(I should mention that the only reason I can get away with talking about "stalling" and "waking up" is that all the atomics used are sequentially consistent, so I can pretend that we're in a single-threaded environment.)
Note that the way that you are using scratchIndex to guard concurrent writes is essentially a lock; whoever successfully completes the cmpxchg gets total write access to the queue until it releases the lock. The simplest way to fix this failure is to just replace scratchIndex with a spinlock—it won't suffer from A-B-A and it's what's actually happening.
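A minimal sketch of that suggestion follows, assuming the rest of the original design stays the same. The struct name LockedRingBuffer and the member writeLock are hypothetical, and pop is copied from the question unchanged. Because the indices are loaded only after the lock is taken, a producer can no longer act on a stale read index.

```cpp
#include <atomic>
#include <cassert>

// Sketch: the question's ring buffer with scratchIndex replaced by a spinlock.
template <typename Element, typename Index = unsigned char>
struct LockedRingBuffer {
    std::atomic<Index> readIndex{0};
    std::atomic<Index> writeIndex{0};
    std::atomic_flag writeLock = ATOMIC_FLAG_INIT;  // hypothetical name
    Element elements[1 << (sizeof(Index) * 8)];

    bool push(const Element& element) {
        // Spin until this producer owns the write side of the queue.
        while (writeLock.test_and_set(std::memory_order_acquire)) {
        }
        // Both indices are loaded while holding the lock: writeIndex cannot
        // change, and readIndex can only advance, which frees more space.
        const Index currentWriteIndex = writeIndex.load();
        const Index nextWriteIndex = static_cast<Index>(currentWriteIndex + 1);
        const bool full = (nextWriteIndex == readIndex.load());
        if (!full) {
            elements[currentWriteIndex] = element;
            writeIndex.store(nextWriteIndex);
        }
        writeLock.clear(std::memory_order_release);
        return !full;
    }

    // Same consumer logic as in the question.
    bool pop(Element& element) {
        Index currentReadIndex = readIndex.load();
        while (true) {
            const Index currentWriteIndex = writeIndex.load();
            const Index nextReadIndex = static_cast<Index>(currentReadIndex + 1);
            if (currentReadIndex == currentWriteIndex) {
                return false;
            }
            element = elements[currentReadIndex];
            if (readIndex.compare_exchange_strong(currentReadIndex, nextReadIndex)) {
                return true;
            }
        }
    }
};
```

As in the original, one slot is always left unused, so a buffer indexed by unsigned char holds at most 255 elements.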
bool push(const Element & element)
{
    while(true)
    {
        const Index currentReadIndex = readIndex.load();
        Index currentWriteIndex = writeIndex.load();
        const Index nextWriteIndex = currentWriteIndex + 1;
        if(nextWriteIndex == currentReadIndex)
        {
            return false;
        }
        if(scratchIndex.compare_exchange_strong(
            currentWriteIndex, nextWriteIndex))
        {
            elements[currentWriteIndex] = element;
            // Problem here!
            writeIndex = nextWriteIndex;
            return true;
        }
    }
}
I've marked the problematic spot. Multiple threads can get to the writeIndex = nextWriteIndex at the same time. The data will be written in any order, although each write will be atomic.
This is a problem because you're trying to update two values using the same atomic condition, which is generally not possible. Assuming the rest of your method is fine, one way around this would be to combine both scratchIndex and writeIndex into a single value of double-size. For example, treating two uint32_t values as a single uint64_t value and operating atomically on that.
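The packing idea can be sketched as follows. This shows only how two 32-bit values can share one 64-bit atomic and be replaced together with a single compare-exchange; it is not a complete queue. IndexPair and its member names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: two 32-bit indices that are always read and
// updated together through one 64-bit atomic.
class IndexPair {
public:
    static std::uint64_t pack(std::uint32_t scratch, std::uint32_t write) {
        return (static_cast<std::uint64_t>(scratch) << 32) | write;
    }
    static std::uint32_t scratch_of(std::uint64_t packed) {
        return static_cast<std::uint32_t>(packed >> 32);
    }
    static std::uint32_t write_of(std::uint64_t packed) {
        return static_cast<std::uint32_t>(packed);
    }

    std::uint64_t load() const { return state_.load(); }

    // Replaces both halves atomically, but only if neither half has
    // changed since `expected` was loaded.
    bool try_update(std::uint64_t expected, std::uint32_t scratch,
                    std::uint32_t write) {
        return state_.compare_exchange_strong(expected, pack(scratch, write));
    }

private:
    std::atomic<std::uint64_t> state_{0};
};
```

A caller loads the packed value, computes the new pair, and retries try_update with a fresh load whenever it returns false.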

Multithreading

I have just started learning multi-threading. I have written a simple application. The application creates three threads. Two threads write and one thread reads. The writer threads write to separate location in a global array. The writer thread after incrementing the value in the array notifies the reader. The reader thread then decrements that value in the array and waits again for the writer threads to update their corresponding value in the array. The code for the application is pasted below.
What I see is that the writer(Producer) threads get more time slice than the reader(Consumer) thread. I think I am doing something wrong. If the output of the application is redirected to a file, then it can be observed that there are more consecutive messages from the Producers and the messages from the Consumer occur infrequently. What I was expecting was that, when a Producer updates its data, the Consumer immediately processes it i.e. after every Producer message there should be a Consumer message printed.
Thanks and regards,
~Plug
#include <stdio.h>
#include <pthread.h>
const long g_lProducerCount = 2; /*Number of Producers*/
long g_lProducerIds[2]; /*Producer IDs = 0, 1...*/
long g_lDataArray[2]; /*Data[0] for Producer 0, Data[1] for Producer 1...*/
/*Producer ID that updated the Data. -1 = No update*/
long g_lChangedProducerId = -1;
pthread_cond_t g_CondVar = PTHREAD_COND_INITIALIZER;
pthread_mutex_t g_Mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_t g_iThreadIds[3]; /*3 = 2 Producers + 1 Consumer*/
unsigned char g_bExit = 0; /*Exit application? 0 = No*/
void* Producer(void *pvData)
{
    long lProducerId = *(long*)pvData; /*ID of this Producer*/
    while(0 == g_bExit) {
        pthread_mutex_lock(&g_Mutex);
        /*Tell the Consumer who's Data is updated*/
        g_lChangedProducerId = lProducerId;
        /*Update the Data i.e. Increment*/
        ++g_lDataArray[lProducerId];
        printf("Producer: Data[%ld] = %ld\n",
               lProducerId, g_lDataArray[lProducerId]);
        pthread_cond_signal(&g_CondVar);
        pthread_mutex_unlock(&g_Mutex);
    }
    pthread_exit(NULL);
}
void* Consumer(void *pvData)
{
    while(0 == g_bExit) {
        pthread_mutex_lock(&g_Mutex);
        /*Wait until one of the Producers update it's Data*/
        while(-1 == g_lChangedProducerId) {
            pthread_cond_wait(&g_CondVar, &g_Mutex);
        }
        /*Revert the update done by the Producer*/
        --g_lDataArray[g_lChangedProducerId];
        printf("Consumer: Data[%ld] = %ld\n",
               g_lChangedProducerId, g_lDataArray[g_lChangedProducerId]);
        g_lChangedProducerId = -1; /*Reset for next update*/
        pthread_mutex_unlock(&g_Mutex);
    }
    pthread_exit(NULL);
}
void CreateProducers()
{
    long i;
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    for(i = 0; i < g_lProducerCount; ++i) {
        g_lProducerIds[i] = i;
        pthread_create(&g_iThreadIds[i + 1], &attr,
                       Producer, &g_lProducerIds[i]);
    }
    pthread_attr_destroy(&attr);
}
void CreateConsumer()
{
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_create(&g_iThreadIds[0], &attr, Consumer, NULL);
    pthread_attr_destroy(&attr);
}
void WaitCompletion()
{
    long i;
    for(i = 0; i < g_lProducerCount + 1; ++i) {
        pthread_join(g_iThreadIds[i], NULL);
    }
}
int main()
{
    CreateProducers();
    CreateConsumer();
    getchar();
    g_bExit = 1;
    WaitCompletion();
    return 0;
}
You would have to clarify what exactly you want to achieve. For now the producers only increment an integer and the consumer decrements the value. This is not a very useful activity ;) I understand that this is only a test app, but it is still not clear what the purpose of this processing is, what the constraints are, and so on.
The producers produce some 'items'. The outcome of this production is represented as an integer value. 0 means no items, 1 means there is a pending item, that consumer can take. Is that right? Now, is it possible for the producer to produce several items before any of them gets consumed (incrementing the array cell to a value higher than 1)? Or does he have to wait for the last item to be consumed before the next one can be put into the storage? Is the storage limited or unlimited? If it is limited then is the limit shared among all the producers or is it defined per producer?
What I was expecting was that, when a Producer updates its data,
the Consumer immediately processes it i.e. after every Producer
message there should be a Consumer message printed.
Though it's not really clear what you want to achieve, I will hold on to that quote and assume the following: there is a limit of 1 item per producer, and the producer has to wait for the consumer to empty the storage before a new item can be put in the cell, i.e. the only allowed values in g_lDataArray are 0 and 1.
To allow maximum concurrency between threads you will need a condition variable/mutex pair for each cell of g_lDataArray (one per producer). You will also need a queue of updates, that is, a list of producers that have submitted their work, and a condition variable/mutex pair to guard it. This queue replaces g_lChangedProducerId, which can only hold one value at a time.
Every time a producer wants to put an item into the storage, it has to acquire the respective lock and check whether the storage is empty (g_lDataArray[lProducerId] == 0), waiting on the condition variable if it is not. It then increments the cell, releases the held lock, acquires the consumer lock, adds its ID to the update queue, notifies the consumer, and releases the consumer lock. Of course, if the producer performed any real computation to produce a real item, that work should be done outside the scope of any lock, before the attempt to put the item in the storage.
In pseudo code this looks like this:
// do some computations
item = compute();
lock (mutexes[producerId]) {
    while (storage[producerId] != 0)
        wait(condVars[producerId]);
    storage[producerId] = item;
}
lock (consumerMutex) {
    queue.push(producerId);
    signal(consumerCondVar);
}
The consumer should act as follows: acquire its lock, check whether there are any pending updates to process and wait on the condition variable if there are none, take one update out of the queue (that is, the ID of the updating producer), acquire the lock of the producer whose update is going to be processed, decrement the cell, notify the producer, release the producer's lock, release its own lock, and finally process the update.
lock (consumerMutex) {
    while (queue.isEmpty())
        wait(consumerCondVar);
    producerId = queue.pop();
    lock (mutexes[producerId]) {
        item = storage[producerId];
        storage[producerId] = 0;
        signal(condVars[producerId]);
    }
}
//process the update
process(item);
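For reference, the pseudocode above could be rendered in C++ roughly as follows, using std::mutex and std::condition_variable in place of the pthread primitives. This is a sketch under the same assumptions (one slot per producer, 0 meaning "empty", so items must be non-zero). The class name Mailboxes and its members are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <vector>

// Hypothetical rendering of the pseudocode: one slot per producer plus a
// queue of producer IDs that have a pending update.
class Mailboxes {
public:
    explicit Mailboxes(std::size_t producers) : slots_(producers) {}

    // Producer: wait until the own slot is empty, fill it, then announce it.
    void put(std::size_t producerId, long item) {
        Slot& slot = slots_[producerId];
        {
            std::unique_lock<std::mutex> lock(slot.mutex);
            slot.emptied.wait(lock, [&] { return slot.item == 0; });
            slot.item = item;
        }
        {
            std::lock_guard<std::mutex> lock(queueMutex_);
            updates_.push(producerId);
        }
        queueFilled_.notify_one();
    }

    // Consumer: wait for an update, empty that slot, wake its producer.
    // The returned item is meant to be processed outside of any lock.
    long take() {
        std::unique_lock<std::mutex> queueLock(queueMutex_);
        queueFilled_.wait(queueLock, [&] { return !updates_.empty(); });
        const std::size_t producerId = updates_.front();
        updates_.pop();
        Slot& slot = slots_[producerId];
        long item;
        {
            std::lock_guard<std::mutex> lock(slot.mutex);
            item = slot.item;
            slot.item = 0;
        }
        slot.emptied.notify_one();
        return item;
    }

private:
    struct Slot {
        std::mutex mutex;
        std::condition_variable emptied;
        long item = 0;  // 0 means "empty", as in the pseudocode
    };
    std::vector<Slot> slots_;
    std::mutex queueMutex_;
    std::condition_variable queueFilled_;
    std::queue<std::size_t> updates_;
};
```

Each producer thread would call put with its own ID, and the single consumer thread would loop on take. A producer never holds its slot lock and the queue lock at the same time, so the nested locking in take cannot deadlock against put.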
Hope this answer is what you needed.
The problem may be that all producers change g_lChangedProducerId, so the value written by one producer may be overwritten by another producer before the consumer sees it.
This means that the consumer effectively doesn't see that the first producer has produced some output.
When a producer has produced and released the mutex, either a producer thread or the consumer thread may run next.
If a producer thread runs next, it produces again, and the consumer does not consume immediately after the data is produced.
That is what you don't want to see.
All you need to do is make sure that a producer cannot produce again until the consumer has consumed.
Here is one possible solution:
void* Producer(void *pvData)
{
    ........
    //wait until the consumer has consumed the previous number
    while(-1!=g_lChangedProducerId)
        pthread_cond_wait(&g_CondVar,&g_Mutex);
    //inform the consumer that this producer produced the data
    g_lChangedProducerId = lProducerId;
    ........
}
void* Consumer(void *pvData)
{
    g_lChangedProducerId = -1;
    //wake up the producer once its data is consumed
    pthread_cond_signal(&g_CondVar);
    pthread_mutex_unlock(&g_Mutex);
}
