Spawn a thread and worker queue only if resource is busy - multithreading

Hoping someone can help me design this correctly.
In my TCP code, I have a SendMessage() function that tries to write to the wire. I am trying to design the call so that it moves to a producer/consumer model if a lot of concurrent requests happen, but at the same time, stays single-threaded if there are no concurrent requests (for maximum performance).
I'm struggling with how to design this without race conditions, because there is no way to move locks between threads.
What I have so far is something like (pseudo-coded):
SendMessage(msg) {
    // Fast path: try to take the wire lock and send on the calling thread.
    if (Monitor.TryEnter(wirelock, 200)) {
        try {
            sendBytes(msg);
        }
        finally {
            Monitor.Exit(wirelock);
        }
    }
    // Wire is busy: queue the message and spawn a consumer if one isn't running.
    else {
        _SomeThreadSafeQueue.Add(msg);
        if (Monitor.TryEnter(consumerlock))
            Task.Factory.StartNew(ConsumerThreadMethod);
    }
}

ConsumerThreadMethod() {
    lock (wirelock) {
        while (therearemessagesinthequeue)
            sendBytes(nextMessageFromQueue);
    }
}
Any obvious race conditions?
EDIT: Found a flaw in the last one. How about this instead?
SendMessage(msg) {
    if (Monitor.TryEnter(wirelock)) {
        try {
            sendBytes(msg);
        }
        finally {
            Monitor.Exit(wirelock);
        }
    }
    else {
        _SomeThreadSafeQueue.Add(msg);
        // Only the first incrementer spawns the consumer; everyone else backs off.
        if (Interlocked.Increment(ref _threadcounter) == 1)
        {
            Task.Factory.StartNew(() => ConsumerThreadMethod());
        }
        else
        {
            Interlocked.Decrement(ref _threadcounter);
        }
    }
}

ConsumerThreadMethod() {
    while (therearemessagesinthequeue) {
        lock (wirelock) {
            sendBytes(nextMessageFromQueue);
        }
    }
    Interlocked.Decrement(ref _threadcounter);
}
So basically, the interlocked counter is used so that at most one consumer thread is ever spawned (if one is needed at all).

No obvious races, but TryEnter is a cause of some serious idle time. I actually think that using a consumer thread all the time is the best solution: if there is little to do, the overhead will be really small (the consumer thread will be asleep when not working, if designed correctly).
In the revised version you create a new task for each sent message, resulting in huge contention on the lock, since you are using a while loop in the consumer thread.
EDIT: Since you are using non-blocking sockets, a single consumer thread should be enough to handle all send requests. The throughput of a single thread is higher than that of your network. If you have more consumers, it's hard to make sure that no two consumer threads send on the same socket without serializing everything using a mutex. I don't think switching between single-threaded and multi-threaded is a good idea.
Your current "multithreaded" solution does not give you any performance gain, since all work is protected by the same mutex. It will be as slow as, or slower than, a single thread.

what would be the right way to go for my scenario: thread array, thread pool, or tasks?

I am working on a small microfinance application that processes financial transactions. The frequency of these transactions is quite high, which is why I am planning to make it a multi-threaded application that can process multiple transactions in parallel.
I have already designed all the workers to be thread safe.
What I need help with is how to manage these threads. Here are some of my options:
1. Make a specified number of thread pool threads at startup and keep them running in an infinite loop, where they keep looking for new transactions and start processing if any are found.
example code:
void Start_Job() {
    for (int l_ThreadId = 0; l_ThreadId < PaymentNoOfWorkerThread; l_ThreadId++)
    {
        ThreadPool.QueueUserWorkItem(Execute, (object)l_ThreadId);
    }
}
void Execute(object l_TrackingId)
{
    while (true)
    {
        var new_txns = Get_New_Txns(); // get new txns, if any; returns a queue
        while (new_txns.Count > 0) {
            process_txn(new_txns.Dequeue());
        }
        Thread.Sleep(some_time);
    }
}
2. Look for new transactions and assign a thread pool thread to each transaction (my understanding is that these threads would be reused for new txns after their execution completes).
example code:
void Start_Job() {
    while (true) {
        var new_txns = Get_New_Txns(); // get new txns, if any; returns a queue
        for (int l_ThreadId = 0; l_ThreadId < new_txns.Count; l_ThreadId++)
        {
            ThreadPool.QueueUserWorkItem(Execute, (object)new_txns.Dequeue());
        }
        Thread.Sleep(some_time);
    }
}
void Execute(object txn)
{
    process_txn(txn);
}
3. Do the above, but with tasks.
Which option would be the most efficient and well suited for my application?
Thanks in advance :)
ThreadPool.QueueUserWorkItem is an older API and you shouldn't be using it directly
anymore. Tasks are the way to go, and the thread pool is managed automatically for you.
What may suit your application depends on what happens in process_txn and is subjective, so this is a very generic guideline:
If process_txn is a compute-bound operation, for example it performs only CPU-bound calculations, then look at the Task Parallel Library. It will help you use the CPU cores more efficiently.
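For the compute-bound case, a minimal sketch of that TPL idea, where ProcessTxn is a hypothetical stand-in for process_txn and the input data is fabricated purely for illustration:

using System.Linq;
using System.Threading.Tasks;

class CpuBoundSketch
{
    static void Main()
    {
        var txns = Enumerable.Range(0, 1000).ToArray(); // fabricated workload

        // Parallel.ForEach partitions the transactions across the available cores.
        Parallel.ForEach(txns, txn => ProcessTxn(txn));
    }

    static void ProcessTxn(int txn)
    {
        // CPU-bound calculation elided
    }
}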
If process_txn is less CPU-bound and more IO-bound, meaning it may read/write files or a database, or connect to some other remote service, then you should look at asynchronous programming and make sure your IO operations are all asynchronous, which means your threads are never blocked on IO. This will help your service to be more scalable. Also, depending on what your queue is, see if you can await on the queue asynchronously, so that none of your application threads are blocked just waiting on the queue.
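For the IO-bound case, a minimal sketch of awaiting the queue asynchronously, assuming System.Threading.Channels and a hypothetical ProcessTxnAsync; no thread is blocked while the queue is empty or while the IO is in flight:

using System.Threading.Channels;
using System.Threading.Tasks;

record Txn; // placeholder transaction type

class TxnProcessor
{
    private readonly Channel<Txn> _txns = Channel.CreateUnbounded<Txn>();

    // Producers enqueue without blocking.
    public void Enqueue(Txn txn) => _txns.Writer.TryWrite(txn);

    // One async loop; awaiting ReadAllAsync parks the method without holding a thread.
    public async Task RunAsync()
    {
        await foreach (var txn in _txns.Reader.ReadAllAsync())
            await ProcessTxnAsync(txn); // hypothetical async version of process_txn
    }

    private Task ProcessTxnAsync(Txn txn) => Task.CompletedTask; // IO work elided
}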

Is this reader-writer lock implementation correct?

I'm wondering if the following implementation of the reader/writer problem is correct.
We're using only one mutex and a count variable to indicate the number of readers.
read api:
void read() {
    mutex.lock();
    count++;
    mutex.unlock();
    // Do read
    mutex.lock();
    count--;
    mutex.unlock();
}
write api:
void write() {
    while (1) { // busy-wait until no readers are active
        mutex.lock();
        if (count == 0) {
            // Do write
            mutex.unlock();
            return;
        }
        mutex.unlock();
    }
}
It looks like in this code:
Only one lock is used, so there is no deadlock problem;
The writer can only write when count == 0, so there are no race conditions.
As for a readers-preference version of the read/write problem, is there any issue with the above code? It looks like the standard implementations use two locks (e.g. https://en.wikipedia.org/wiki/Readers%E2%80%93writers_problem#First_readers-writers_problem). If the above implementation is correct, why does the wiki version use two locks? Thank you!
It's correct, but it will perform atrociously. Imagine that, while a reader is trying to do work, there are two waiting writers: those two writers will constantly acquire and release the mutex, saturating CPU resources, while the reader is trying to finish its work so that the system as a whole can make forward progress.
The nightmare scenario would be where the reader shares a physical core with one of the waiting writers. Yikes.
Correct, yes. Useful and sensible, definitely not!
One reason to use two locks is to prevent two writers from competing. A more common solution, at least in my experience, is to use a lock with a condition variable, either to release waiting writers or to alternate phases.
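For illustration, here is a sketch of that mutex-plus-condition-variable shape, written in C# where Monitor.Wait and Monitor.PulseAll play the role of the condition variable, so writers sleep instead of spinning:

using System.Threading;

class ReadWriteGate
{
    private readonly object _gate = new object();
    private int _readers;

    public void Read()
    {
        lock (_gate) { _readers++; }
        try
        {
            // Do read (concurrently with other readers)
        }
        finally
        {
            lock (_gate)
            {
                if (--_readers == 0)
                    Monitor.PulseAll(_gate); // wake any waiting writers
            }
        }
    }

    public void Write()
    {
        lock (_gate)
        {
            while (_readers > 0)
                Monitor.Wait(_gate); // sleep instead of spinning on the mutex

            // Do write; holding the lock here also keeps writers from competing
        }
    }
}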

java - avoid unnecessary thread wake-ups

I have a set of 12 threads executing work (Runnable) in parallel. In essence, each thread does the following:
Runnable r;
while (true) {
    synchronized (work) {
        while (work.isEmpty()) {
            work.wait();
        }
        r = work.removeFirst();
    }
    r.execute();
}
Work is added as follows:
Runnable r = ...;
synchronized (work) {
    work.add(r);
    work.notify();
}
When new work is available, it is added to the list and the lock is notified. If there is a thread waiting, it is woken up, so it can execute this work.
Here lies the problem. When a thread is woken up, it is very likely that another thread will execute this work. This happens when the latter thread is done with its previous work and re-enters the while(true)-loop. The smaller/shorter the work actions, the more likely this will happen.
This means I am waking up a thread for nothing. As I need high throughput, I believe this behavior will lower the performance.
How would you solve this? In theory, I need a mechanism which allows me to cancel a pending thread wake-up notification. Of course, this is not possible in Java.
I thought about introducing a work list for each thread. Instead of pushing the work onto one single list, the work is spread over the 12 work lists. But I believe this will introduce other problems. For example, one thread might have a lot of work pending, while another thread might have no work pending. In essence, I believe that a solution which assigns work to a particular thread in advance might become very complex and is sub-optimal.
Thanks!
What you are doing is thread pooling. Take a look at the pre-Java-5 concurrency framework, specifically the PooledExecutor class there:
http://gee.cs.oswego.edu/dl/classes/EDU/oswego/cs/dl/util/concurrent/intro.html
In addition to my previous answer, here is another solution; this question made me curious.
Here, I added a check with a volatile boolean.
It does not completely avoid the situation of uselessly waking up a thread, but it helps to reduce it. Actually, I do not see how this could be completely avoided without additional restrictions like "we know that after 100 ms a job will most likely be done".
volatile boolean free = false;

while (true) {
    synchronized (work) {
        free = false; // new rev.2
        while (work.isEmpty()) {
            work.wait();
        }
        r = work.removeFirst();
    }
    r.execute();
    free = true; // new
}
--
synchronized (work) {
    work.add(r);
    if (!free) { // new
        work.notify();
    } // new
    free = false; // new rev.2
}

Multi-threaded application, parallel reading with write possibility

For example, I have a multi-threaded application which can be presented as:
Data bigData;

void thread1()
{
    workOn(bigData);
}

void thread2()
{
    workOn(bigData);
}

void thread3()
{
    workOn(bigData);
}
There are a few threads working on the data. I could leave it as it is, but the problem is that sometimes (very seldom) the data is modified by thread4.
void thread4()
{
    sometimesModifyData(bigData);
}
Critical sections could be added, but that would defeat the purpose of multi-threading, because only one thread could work on the data at a time.
What is the best method to keep the multi-threading meaningful while making it thread safe?
I am thinking about some kind of state (a semaphore?) that would prevent reading and writing at the same time but would allow parallel reading.
This is called a readers–writer lock. You could implement it with a mutex to make sure no one reads while a write is going on and no one writes while reads are going on. One way to solve the problem is with flags: when the writer has something to modify, it switches on a lock, upon which no more readers get to read; after all the current readers have finished, the writer does its job, and then the readers read again.
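As a sketch of that idea, shown here in C# with ReaderWriterLockSlim (C++17's std::shared_mutex offers the equivalent semantics): readers proceed in parallel, while the rare writer excludes everyone.

using System.Threading;

class BigData { /* ... */ }

class SharedData
{
    private readonly ReaderWriterLockSlim _rw = new ReaderWriterLockSlim();
    private readonly BigData _bigData = new BigData();

    // Many reader threads may hold the read lock at the same time.
    public void WorkOn()
    {
        _rw.EnterReadLock();
        try { /* read-only work on _bigData */ }
        finally { _rw.ExitReadLock(); }
    }

    // The seldom-run writer blocks out all readers and other writers.
    public void SometimesModifyData()
    {
        _rw.EnterWriteLock();
        try { /* mutate _bigData */ }
        finally { _rw.ExitWriteLock(); }
    }
}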

can a threadpool (like boost) be initialized in a class constructor and used when class members are called?

Let's say I have a threadpool (an example might be http://threadpool.sourceforge.net/),
and I have this code:
class Demigod {
public:
    Demigod();
    void AskObedienceFast();
    void AskObedienceSlow();
    void WorkHardGodDamn();
    ~Demigod();
private:
    ThreadPool m_PitySlaves;
    int m_Quota;
};

Demigod::Demigod() : m_PitySlaves(manyPlease) {
}

void Demigod::WorkHardGodDamn() {
    // something irrelevant, just to annoy the slaves
}

void Demigod::AskObedienceFast() {
    // Reuses the pool built once in the constructor.
    for (int q = 0; q < m_Quota; ++q) {
        m_PitySlaves.schedule(boost::bind(&Demigod::WorkHardGodDamn, this));
    }
    m_PitySlaves.wait();
}

void Demigod::AskObedienceSlow() {
    // Builds (and tears down) a brand-new pool on every call.
    ThreadPool poorSouls;
    for (int q = 0; q < m_Quota; ++q) {
        poorSouls.schedule(boost::bind(&Demigod::WorkHardGodDamn, this));
    }
    poorSouls.wait();
}

int main() {
    Demigod someDude;
    for (size_t i = 0; i < dontstop; ++i) {
        someDude.AskObedienceFast();
    }
}
Can AskObedienceFast be faster than AskObedienceSlow, and does it work correctly?
This way I can have some threads (slaves) ready for work any time I ask, without losing time creating the threadpool on every call. I know I can verify the code myself, but my question is broader: does this not fundamentally lose performance somewhere else, for example from the threads in the threadpool doing some kind of waiting process? It comes down to avoiding expensive threadpool initialization (and thread creation).
There is no such thing as a "waiting process". If a thread is waiting (on a condition), the scheduler simply skips it; such a thread does not do anything and is not being switched in and out.

As you very correctly pointed out, the most expensive task in threading is setting up a thread (though all major OSes are taking steps to minimize it as much as possible, to keep pace with the recent multiplication of cores), closely followed by switching thread contexts. So you can see why AskObedienceSlow is horrible. Your temporaries should only be "cheap" structures which take as little time as possible to construct and destroy, and a ThreadPool definitely isn't one of those.

Even AskObedienceFast won't protect you from the context-switching overhead, which is why bigger thread pools aren't always better: the best-performing size is a matter of careful balancing and depends on your actual workload. Some of the best-performing high-load, high-throughput applications are single-threaded, message-passing designs for this very reason. Programming languages used for such applications (like Erlang) are explicitly threadless.
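To get a rough feel for the setup cost described above, here is a small C# sketch; an illustration, not a rigorous benchmark, and the absolute numbers are machine-dependent. It compares creating a fresh thread per work item against reusing pooled threads:

using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

class ThreadCostSketch
{
    const int N = 1000;

    static void Main()
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < N; i++)
        {
            var t = new Thread(() => { }); // pays the thread setup cost every iteration
            t.Start();
            t.Join();
        }
        Console.WriteLine($"new Thread per item: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        var tasks = new Task[N];
        for (int i = 0; i < N; i++)
            tasks[i] = Task.Run(() => { }); // pooled threads are created once and reused
        Task.WaitAll(tasks);
        Console.WriteLine($"pooled tasks: {sw.ElapsedMilliseconds} ms");
    }
}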
