Thread joining problems - Linux

I am using Perl on a Linux box and my memory usage keeps going up and up, I believe because of previously run threads that have never been killed or joined.
I think I need to somehow signal the threads that have finished running to terminate, and then detach
them so that they get cleaned up automatically, giving me back memory...
I have tried return(); with $thr_List->join(); and $thr_List->detach();, but with join my GUI doesn't show for ages, and with detach the memory problem
still seems to be there... Any help...
$mw->repeat(100, sub {   # shared-variable handler/pivoting point !?
    while (defined(my $command = $q->dequeue_nb())) {
        # to update a statusbar's text
        $text->delete('0.0', 'end');
        $text->insert('0.0', $command);
        $indicatorbar->value($val);   # to update a ProgressBar's value to $val
        $mw->update();
        for (@threadids) {   # @threadids is a shared array containing thread ids
            # that is to say, I have got the info I wanted from a thread and pushed its id onto @threadids
            print "I want to now kill or join the thread with id: $_\n";
            #$thrWithId->detach();
            #$thrWithId->join();
            # then delete that id from the array
            # delete $threadids[elWithThatIdInIt];
            # as this is inside a repeat(100, sub ...) too, there are problems??!
            # locks maybe?!?
            # only loop over @threadids if it's not empty?!?
        }
    }   # end of while
});     # end of sub

# Some worker that feeds the handler/pivot above, me thinks #???
async {
    for (;;) {
        sleep(0.1);   # note: core sleep() truncates to 0 here; Time::HiRes's sleep() is needed for sub-second sleeps
        $q->enqueue($StatusLabel);
    }
}->detach();
I have uploaded my full code here (http://cid-99cdb89630050fff.office.live.com/browse.aspx/.Public) if needed; it's in the Boxy.zip...
First, sorry for replying here, but I've lost the cookie that allows editing etc...
Thanks very much gangabass, that looks like great info. I will have to spend some time on it, but at least it looks like others are asking the same questions... I was worried I was making a total mess of things.
Thanks guys...

So it sounds like you got the join working but it was very slow?
Threading in Perl is not lightweight. Creating and joining threads takes significant memory and time.
If your task allows, it is much better to keep threads running and give them additional work than to end them and start new threads later. Thread::Queue can help with this. That said, unless you are on Windows, there is not a lot of point to doing that instead of forking and using Parallel::ForkManager.
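As a rough illustration of that reuse pattern (a minimal sketch only; the worker count and fake job list are made up for the example):

use strict;
use warnings;
use threads;
use Thread::Queue;

my $work_q = Thread::Queue->new();

# Spawn a fixed pool once; each worker blocks in dequeue() until more
# work arrives, so no threads are created or joined per task.
my @workers = map {
    threads->create(sub {
        while (defined(my $job = $work_q->dequeue())) {
            print "worker ", threads->tid(), " handled job $job\n";
        }
    });
} 1 .. 4;

$work_q->enqueue($_) for 1 .. 20;       # hand out work as it turns up
$work_q->enqueue(undef) for @workers;   # one undef per worker means "stop"
$_->join() for @workers;                # joined once, at shutdown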

You only need one worker thread, and the GUI should be updated from it. The other threads just process data from the queue, and there is no need to terminate them.
See this example for more info

Related

How to control multi-thread synchronization in Perl

I've got an array with the ASCII numbers for [A-Z, a-z], like so: my @alphabet = (65..90, 97..122);
So the main thread's functionality is to check each character from the alphabet and return a string if a condition is true.
Simple example:
my @output = ();
for my $ascii (@alphabet) {
    threads->create(sub { return chr($ascii); });
}
I want to run a thread for every ASCII number, then put the letter from each thread's function into the array in the correct order.
So in our case the array @output should be dynamic and contain [A-Z, a-z] after all the threads finish their jobs.
How do I check that all the threads are done, and keep the order?
You're looking for $thread->join, which waits for a thread to finish. It's documented here, and this SO question may also help.
Since in your case the work being done in the threads looks roughly equal in cost (no thread is going to take much longer than any other), you can just join each thread in order, like so, to wait for them all to finish:
# Store all the threads for each letter in an array.
# ($_ is cloned into each child thread at create time.)
my @threads = map { threads->create(sub { return chr($_) }) } @alphabet;
my @results = map { $_->join } @threads;
Since, when the first thread returns from join, the others are likely already done and just waiting for join to grab their return values, or are about to be done, this gets you pretty close to "as fast as possible", parallelism-wise; and since the threads were created in order, @results is already ordered for free.
Now, if your threads can take variable amounts of time to finish, or if you need to do some time-consuming processing in the "main"/spawning thread before plugging child threads' results into the output data structure, joining them in order might not be so good. In that case, you'll need to somehow either: a) detect thread "exit" events as they happen, or b) poll to see which threads have exited.
You can detect thread "exit" events using signals/notifications sent from the child threads to the main/spawning thread. The easiest/most common way to do that is to use the cond_wait and cond_signal functions from threads::shared. Your main thread would wait for signals from child threads, process their output, and store it into the result array. If you take this approach, you should preallocate your result array to the right size, and provide the output index to your threads (e.g. use a C-style for loop when you create your threads and have them return ($result, $index_to_store) or similar) so you can store results in the right place even if they are out of order.
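A rough sketch of that signalling approach (the shared variables and the per-thread index convention here are illustrative, not from the question):

use strict;
use warnings;
use threads;
use threads::shared;

my @alphabet = (65 .. 90, 97 .. 122);
my @results :shared;        # shared arrays extend on store, so each thread
my $done    :shared = 0;    # can fill in its own preassigned slot

my @thr;
for my $i (0 .. $#alphabet) {
    my $ascii = $alphabet[$i];
    push @thr, threads->create(sub {
        my $char = chr($ascii);    # the "work"
        lock($done);
        $results[$i] = $char;      # store at the index given to this thread
        $done++;
        cond_signal($done);        # wake the main thread
    });
}

{
    lock($done);
    cond_wait($done) until $done == @alphabet;
}
# @results is complete and in order here, even before joining
$_->join() for @thr;
print @results, "\n";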
You can poll which threads are done using the is_joinable thread instance method, or by calling threads->list(threads::joinable) and threads->list(threads::running) in a loop (hopefully not a busy-waiting one; adding a sleep call, even a subsecond one from Time::HiRes, will save a lot of performance/battery in this case) to detect when things are done and grab their results.
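And a polling version of the same idea (again a sketch; the 10 ms sleep interval is arbitrary):

use strict;
use warnings;
use threads;
use Time::HiRes qw( sleep );   # sub-second sleep()

my @alphabet = (65 .. 90, 97 .. 122);
my %result_of;                 # keyed by thread id, so order can be rebuilt

my @order = map { threads->create(sub { return chr($_[0]) }, $_)->tid() } @alphabet;

while (threads->list()) {                              # anything still running or unjoined?
    for my $thr (threads->list(threads::joinable)) {
        $result_of{ $thr->tid() } = $thr->join();      # harvest as they finish
    }
    sleep 0.01;                                        # avoid busy-waiting
}
print map { $result_of{$_} } @order;
print "\n";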
Important Caveat: spawning a huge number of threads to perform a lot of work in parallel, especially if that work is small/quick to complete, can cause performance problems, and it might be better to use a smaller number of threads that each do more than one "piece" of work (e.g. spawn a small number of threads, and each thread uses the threads::shared functions to lock and pop the first item off of a shared array of "work to do" and do it rather than map work to threads as 1:1). There are two main performance problems that arise from a 1:1 mapping:
the overhead (in memory and time) of spawning and joining each thread is much higher than you'd think (benchmark it on threads that don't do anything, just return, to see). If the work you need to do is fast, the overhead of thread management for tons of threads can make it much slower than just managing a few re-usable threads.
If you end up with a lot more threads than there are logical CPU cores and each thread is doing CPU-intensive work, or if each thread is accessing the same resource (e.g. reading from the same disks or the same rows in a database), you hit a performance cliff pretty quickly. Tuning the number of threads to the "resources" underneath (whether those are CPUs or hard drives or whatnot) tends to yield much better throughput than trusting the thread scheduler to switch between many more threads than there are available resources to run them on. The reasons this is slow are, very broadly:
The thread scheduler (part of the OS, not the language) can't know enough about what each thread is trying to do, so preemptive scheduling cannot optimize for performance past a certain point, given that limited knowledge.
The OS usually tries to give most threads a reasonably fair shot, so it can't reliably say "let one run to completion and then run the next one" unless you explicitly bake that into the code (since the alternative would be unpredictably starving certain threads for opportunities to run). Basically, switching between "run a slice of thread 1 on resource X" and "run a slice of thread 2 on resource X" doesn't get you anything once you have more threads than resources, and adds some overhead as well.
TL;DR threads don't give you performance increases past a certain point, and after that point they can make performance worse. When you can, reuse a number of threads corresponding to available resources; don't create/destroy individual threads corresponding to tasks that need to be done.
Building on Zac B's answer, you can use the following if you want to reuse threads:
use strict;
use warnings;

use Thread::Pool::Simple qw( );

$| = 1;

my $pool = Thread::Pool::Simple->new(
    do => [ sub {
        # sleep 200-900 ms (four-arg select as a sub-second sleep) to simulate work
        select(undef, undef, undef, (200 + int(rand(8)) * 100) / 1000);
        return chr($_[0]);
    } ],
);

my @alphabet = ( 65..90, 97..122 );
print $pool->remove($_) for map { $pool->add($_) } @alphabet;
print "\n";
The results are returned in order, as soon as they become available.
I'm the author of Parallel::WorkUnit so I'm partial to it. And I thought adding ordered responses was actually a great idea. It does it with forks, not threads, because forks are more widely supported and they often perform better in Perl.
my $wu = Parallel::WorkUnit->new();
for my $ascii (@alphabet) {
    $wu->async(sub { return chr($ascii); });
}
my @output = $wu->waitall();
If you want to limit the number of simultaneous processes:
my $wu = Parallel::WorkUnit->new(max_children => 5);
for my $ascii (@alphabet) {
    $wu->queue(sub { return chr($ascii); });
}
my @output = $wu->waitall();

How to speed up Parallel::ForkManager in Perl

I am using an Amazon EC2 server to perform data processing on 63 files.
The server I am using has 16 cores, but with Perl's Parallel::ForkManager and the number of processes set to the number of cores, it seems like half the cores are sleeping and the working cores are not at 100%, fluctuating around 25%~50%.
I also checked IO and it is mostly idling.
use Sys::Info;
use Sys::Info::Constants qw( :device_cpu );

my $info = Sys::Info->new;
my $cpu  = $info->device( CPU => %options );

use Parallel::ForkManager;

my $manager = new Parallel::ForkManager($cpu->count);
for ($i = 0; $i <= $#files_l; $i++) {
    $manager->start and next;
    do_stuff($files_l[$i]);
    $manager->finish;
}
$manager->wait_all_children;
The short answer is - we can't tell you, because it depends entirely on what 'do_stuff' is doing.
The major reasons why parallel code doesn't create linear speed increases are:
Process creation overhead - some 'work' is done to spawn a process, so if the children are trivially small, that 'wastes' effort.
Contended resources - the most common is disk IO, but things like file locks, database handles, sockets, or interprocess communication can also play a part.
Something else causing a 'back off' that stalls a process.
And without knowing what 'do_stuff' does, we can't second guess what it might be.
However, I'll suggest a couple of steps:
Double the number of processes to twice the CPU count. That's often a 'sweet spot', because it means that any non-CPU delay in one process just lets one of the others run at full speed (sketched after this list).
Try strace -fTt <yourprogram> (if you're on Linux; the commands are slightly different on other Unix variants). Then do it again with strace -fTtc, because the c will summarise syscall run times. Look at which ones take the most 'time'.
Profile your code to see where the hot spots are. Devel::NYTProf is one library you can use for this.
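As a sketch of the first suggestion (assuming the same Sys::Info setup as the question; the rest of the loop stays unchanged):

use Sys::Info;
use Parallel::ForkManager;

my $cpu = Sys::Info->new->device('CPU');

# Twice the core count: while one child is blocked on IO, another can use the CPU.
my $manager = Parallel::ForkManager->new( 2 * $cpu->count );

For the profiling step, Devel::NYTProf is normally run as perl -d:NYTProf yourprogram.pl, after which nytprofhtml turns the collected data into a browsable report.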
And on a couple of minor points:
my $manager = new Parallel::ForkManager($cpu->count);
would be better written as:
my $manager = Parallel::ForkManager->new( $cpu->count );
rather than using indirect object notation.
If you are just iterating @files_l, then it might be better to drop the loop-count variable and write:
foreach my $file ( @files_l ) {
    $manager->start and next;
    do_stuff($file);
    $manager->finish;
}

Multithreading with MQ

I'm having a problem using the MQSeries Perl module in a multi-threading environment. Here is what I have tried:
Create two handles in different threads with $mqMgr = MQSeries::QueueManager->new(). I thought this would give me two different connections to MQ, but instead I got return code 2219 on the second call to MQOPEN(), which probably means I got the same underlying connection to MQ from two separate calls to the new() method.
Declare only one $mqMgr as a global shared variable. But I can't assign a reference to an MQSeries::QueueManager object to $mqMgr; the reason is "Type of arg 1 to threads::shared::share must be one of [$@%] (not subroutine entry)".
Declare only one $mqMgr as a global variable. Got the same 2219 code.
Tried to pass MQCNO_HANDLE_SHARE_NO_BLOCK into MQSeries::QueueManager->new() so that a single connection can be shared across threads, but I cannot find a way to pass it in.
My question is, with the Perl module MQSeries:
How (if at all) can I get a separate connection to the MQ queue manager from each thread?
How (if at all) can I share a connection to the MQ queue manager across different threads?
I have looked around but with little luck. Any info would be appreciated.
Related question:
C++ - MQ RC Code 2219
Update 1: added an example in which two local MQSeries::QueueManager objects in two threads cause MQ error code 2219.
use threads;
use Thread::Queue;
use MQSeries;
use MQSeries::QueueManager;
use MQSeries::Queue;

# globals
our $jobQ    = Thread::Queue->new();
our $resultQ = Thread::Queue->new();

# ----------------------------------------------------------------------------
# sub routines
# ----------------------------------------------------------------------------
sub worker {
    # fetch work from $jobQ and put results to $resultQ
    # ...
}

sub monitor {
    # fetch results from $resultQ and put them onto another MQ queue
    my $mqQMgr = MQSeries::QueueManager->new( ... );
    # a different queue from the one in main;
    # this is what causes the error with MQ code 2219
    my $mqQ = MQSeries::Queue->new( ... );
    while (defined(my $result = $resultQ->dequeue())) {
        # create an MQ message and put it into $mqQ
        my $mqMsg = MQSeries::Message->new();
        $mqQ->Put(Message => $mqMsg);
    }
}

# main
unless (caller()) {
    # create connection to MQ
    my $mqQMgr = MQSeries::QueueManager->new( ... );
    my $mqQ    = MQSeries::Queue->new( ... );

    # create worker and monitor threads
    my @workers;
    for (1 .. $nThreads) {
        push(@workers, threads->create('worker'));
    }
    my $monitor = threads->create('monitor');

    while (1) {
        my $mqMsg = MQSeries::Message->new();
        my $retCode = $mqQ->Get(
            Message       => $mqMsg,
            GetMsgOptions => $someOption,
            Wait          => $sometime,
        );
        die("error") if ($retCode == 0);
        next if ($retCode == -1);    # no message
        # now we have some job to do
        $jobQ->enqueue($mqMsg->Data);
    }
}
There is a very real danger when trying to multithread with modules: the module may not be thread safe. A bunch of things can break messily because of the way threading works: you clone the current process state, and that includes things like file handles, sockets, etc.
But if you try to use them in an asynchronous/threaded way, they'll act really weird, because the operations aren't (necessarily) atomic.
So whilst I can't answer your question directly, because I have no experience of the particular module:
Unless you know otherwise, assume you can't share between threads. It might be thread safe, it might not. If it isn't, it might still look OK, until one day you get a horrifically difficult-to-find bug as a result of a race condition under concurrency.
A shared scalar/list is explicitly described in threads::shared as basically safe (and even then, you can still have problems with non-atomicity if you're not locking).
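For instance, even a simple shared counter needs lock() to make the increment atomic; a minimal sketch:

use strict;
use warnings;
use threads;
use threads::shared;

my $count :shared = 0;

my @workers = map {
    threads->create(sub {
        for (1 .. 1000) {
            lock($count);   # without this, ++ is a non-atomic read-modify-write
            $count++;
        }
    });
} 1 .. 4;

$_->join() for @workers;
print "$count\n";   # reliably 4000 only because of the lock()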
I would suggest therefore that what you need to do is either:
have a 'comms' thread that does all the work related to the module, and make the other threads use IPC to talk to it. Thread::Queue can work nicely for this (see the sketch after this list).
treat each thread as entirely separate for the purposes of the module. That includes loading it (with require and import, not use, because use acts earlier) and instantiating. (You might get away with loading the module before the threads start, but instantiating does things like creating descriptors, sockets, etc.)
lock things whenever there's any danger of an atomic operation being interrupted.
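A minimal shape for the first option (a sketch only; the actual MQ work is left as placeholder comments, and only the comms thread would ever create or touch the MQSeries objects):

use strict;
use warnings;
use threads;
use Thread::Queue;

my $out_q = Thread::Queue->new();

# Only this thread ever touches the non-thread-safe module, so its
# handles are never shared with, or cloned into, other threads.
my $comms = threads->create(sub {
    # connect here, inside the thread (see the require/import note below)
    while (defined(my $msg = $out_q->dequeue())) {
        # ... put $msg using the module's own API ...
    }
});

# Any number of workers can enqueue without touching the module.
my @workers = map {
    threads->create(sub { $out_q->enqueue("result from thread " . threads->tid()) });
} 1 .. 4;

$_->join() for @workers;
$out_q->enqueue(undef);   # tell the comms thread to finish
$comms->join();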
Much of the above also applies to fork parallelism, but not in quite the same way, as fork makes "sharing" things considerably harder, so you're less likely to trip over it.
Edit:
Looking at the code you've posted, and cross-referencing it against the MQSeries source:
There is a BEGIN block that sets up some stuff with MQSeries at the point at which you use it.
Whilst I can't say for sure that this is your problem, it makes me very wary, because when your threads start, they inherit non-shared copies of whatever that BEGIN block did.
So in light of what I suggested earlier on, I would recommend you try (because I can't say for sure, as I don't have a reference implementation):
require MQSeries;
MQSeries->import;
Put this in your code, in lieu of use, after the threads start, e.g. after you do the creates and within the thread subroutine.

What is the smart way to start parallel tasks in a forked process via Perl, Linux, and AnyEvent?

I have an issue with AnyEvent::Util::fork_call. I'm using fork_call, then doing some work, and after that I should end my fork_call and start a new parallel task.
I tried this:
fork_call {
    # very important work
    # start new parallel task
    if (my $pid = fork) {
        # other important and very long work
    }
} sub {
    # never reached while the parallel task is working
};
I suppose that AnyEvent's fork_call waits for all of its children to finish their jobs.
So how can I avoid this problem? Maybe I should get rid of the parent process?
Any suggestions and advice are welcome.
Thank you.
I don't know if it suits your needs, but you can run the next fork_call in the first one's callback. Something like this appears to work:
fork_call {
    # Important work
} sub {
    fork_call {
        # other important work
    } sub {
        warn "done!";
    };
};
Because the second fork_call is issued in the first one's callback, it forks and starts only after the "Important work" is done.
While not directly an answer to your question, doing any nontrivial amounts of processing in a forked child is error-prone and unportable. A generic way to avoid any and all such problems is to use AnyEvent::Fork (and/or AnyEvent::Fork::RPC, AnyEvent::Fork::Pool) to create worker processes. These modules create a new Perl process from scratch and use it to fork further worker processes. This avoids almost all problems with inheriting state from your parent, which might be the problem you are running into - it might especially explain why it works sometimes for you and sometimes not, depending on what other things were going on when the process was forked.
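For what it's worth, a minimal sketch along the lines of the AnyEvent::Fork::RPC synopsis (the do_work function and its doubling body are made up for the example):

use strict;
use warnings;
use AnyEvent;
use AnyEvent::Fork;
use AnyEvent::Fork::RPC;

# The worker process is created from scratch, so it inherits none of the
# parent's event loop, handles, or other state.
my $rpc = AnyEvent::Fork
    ->new
    ->eval(q{ sub do_work { my ($n) = @_; return $n * 2 } })
    ->AnyEvent::Fork::RPC::run('do_work');

my $cv = AE::cv;
$rpc->(21, sub { print "result: $_[0]\n"; $cv->send });
$cv->recv;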

Delphi threads and variables

So this is my question; threads are so confusing for me. Let's say I have 5 threads, and 50 or 100 or more sites. As far as I've learned about threads, I can make a constructor create(link: string) and start new threads with different links, but then I would need as many threads as there are links to parse. So how can I make the link variable shared between threads, so that when thread one downloads link listbox1.Items[0] it tells the others that number 0 is downloaded, and the next free thread asks which link it should download, gets the answer listbox1.Items[1], and so on, until all the links are downloaded and the threads terminate?
Can anyone provide a simple example of how this can be done? Threads are killing me :(
You could have a thread-safe list of URLs to process, and a static-sized pool of worker threads, each taking an unprocessed URL from the list, processing it (downloading and parsing), and adding any newly found URLs to the list, in a loop, as long as there are unprocessed items in the list. Keep finished URLs in the list, only marked as done, to avoid recursion.
Sounds like you simply need to set up a critical section.
This needs to be set up around the code segment that reads the next URL. To do this you would typically place a semaphore at the start of the code, so that only one thread can enter it at any time. The semaphore is reset at the end of the code. As each new thread sees that the URL list is exhausted, it terminates.
Typically semaphores are boolean, but they can be integers for example if you want to allow a specific number of threads to enter the region at any time.
In your case you can simply set up a global boolean variable (visible to all threads), say "fSemaphore".
At the start of the region, the thread checks the flag. If it is false it sets it to true and enters the region (to get the next URL).
If it is true, then it loops, e.g. repeat sleep(0) until (not fSemaphore).
When it exits the region it sets fSemaphore := False;
Obviously you need to guard against a possible infinite loop scenario, and note that the check-and-set itself must be atomic (e.g. an interlocked operation or a real TCriticalSection), otherwise two threads can both see False and enter the region at once.
Define a 'TURI' class for the request URI, result, error message, and anything else needed for the web query, except for the component to be used for the URI access. Descending from TObject should be fine. Create and initialize 100 of these and push them onto a producer-consumer queue (TObjectQueue, TCriticalSection and a semaphore should do fine). Hang a few TThreads off the queue that loop around and process the TURI instances until the queue is empty, whereupon they block.
You do not say what action needs taking with the processed TURIs; they will need freeing somewhere. If you wish to notify the main thread, PostMessage the completed URIs and free them in the message handler.
Terminate the threads? Sure, if you really have to: queue up some object that signals them to commit suicide (a nil, maybe; the threads can check with Assigned just after popping the queue). When doing something like this, I often just leave the threads lying around even if I don't need to process any more URIs during the app run; it's not worth the typing to terminate them.
Sadly, the Delphi examples and, I'm afraid, many textbooks don't get much further than suspend/resume control of threads (don't do it), and TThread.Synchronize, TThread.WaitFor and TThread.OnTerminate. If you get a textbook like that, take it outside and burn it; you will learn next to nothing good.
