Thread::Queue how to spread jobs evenly between the threads - multithreading

Here is the code:
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;
my $queue = Thread::Queue->new();
sub run_queries {
my $n = shift;
print "$n\n-------------\n";
# $queue->dequeue_nb() does the same
while (defined(my $text = $queue->dequeue())) {
print "$text\n";
}
}
my @threads;
push(@threads, threads->create(\&run_queries, 1));
push(@threads, threads->create(\&run_queries, 2));
push(@threads, threads->create(\&run_queries, 3));
push(@threads, threads->create(\&run_queries, 4));
for (my $i = 0; $i < 12; $i++)
{
$queue->enqueue($i);
}
$queue->end();
foreach (@threads) {
$_->join();
}
Output:
1
-------------
2
-------------
3
-------------
4
-------------
0
1
2
3
... all the items here
I expected the output to be spread evenly between the threads: as soon as one thread takes an item from the queue, the next thread would wake up and start processing the next item.
But in reality it looks like one thread processes everything while the others sit idle.
What do I need to do to split the jobs evenly between the threads?

What makes you think it's all one thread?
Change
print "$text\n";
to
print "[$n] $text\n";
or
print "[" . threads->tid . "] $text\n";
Sample output:
1
-------------
2
-------------
3
-------------
4
-------------
[2] 0
[1] 1
[2] 2
[4] 3
[3] 4
[3] 5
[1] 6
[1] 7
[2] 8
[1] 9
[2] 10
[4] 11
You might also want to experiment using
use Time::HiRes qw( sleep );
print "[" . threads->tid . "] START $text\n";
sleep(rand()+1); # [1,2) seconds
print "[" . threads->tid . "] END $text\n";
# After the loop.
print "[" . threads->tid . "] EXIT\n";
Sample output:
[2] START 0 \
[3] START 1 \ All the workers start right off the bat.
[4] START 2 /
[1] START 3 /
[1] END 3 \ As soon as a worker finishes one job,
[1] START 4 / it starts the next available job.
[4] END 2
[4] START 5
[3] END 1
[3] START 6
[2] END 0
[2] START 7
[1] END 4
[1] START 8
[4] END 5
[4] START 9
[1] END 8
[1] START 10
[2] END 7
[2] START 11
[3] END 6 \ There are no jobs left. Because ->end was called, it exits.
[3] EXIT / Otherwise, it would block until ->enqueue or ->end is called.
[4] END 9
[4] EXIT
[2] END 11
[2] EXIT
[1] END 10
[1] EXIT

Related

Groovy while loop not executed correctly

I have the following simple while loop in Groovy:
def count = 1
while(count <= 5) {
println "$count"
sleep(5000)
println "Sleeping for 5 seconds"
count++
}
With this code, the while block is expected to execute 5 times. Instead, the output suggests the loop runs only twice, and the second time "Sleeping for 5 seconds" is never printed. Can someone help me understand this weird behaviour?
When this code is run, the output is the following:
1
Sleeping for 5 seconds
2
This works fine:
~ $ cat doit.groovy
def count = 1
while(count <= 5) {
println "$count"
sleep(5000)
println "Sleeping for 5 seconds"
count++
}
~ $ groovy doit
1
Sleeping for 5 seconds
2
Sleeping for 5 seconds
3
Sleeping for 5 seconds
4
Sleeping for 5 seconds
5
Sleeping for 5 seconds

Run bash scripts from Perl threads

My script should run n subroutines (my_proc) simultaneously, each of which runs a bash script, while one more sub (check_procs) checks whether they have finished.
use strict;
use threads;
use threads::shared;
my %proc_status :shared;
my %thr;
foreach my $i (1,2,3,4) {
$proc_status{$i}=0;
}
sub my_proc {
my $arg=shift(@_);
while (1) {
sleep(2);
print "Proc $arg Started\n";
#exec("/bin/bash","sleep_for_10_sec.bash") or die("Can't exec"); # case 1
#`sleep_for_10_sec.bash &`; # case 2
print "Proc $arg Finished\n";
{
lock(%proc_status);
$proc_status{$arg}=1;
}
}
}
sub check_procs {
my $all_finished;
while (! $all_finished) {
sleep 5;
print "CHECK: \n";
$all_finished=1;
foreach my $num (1,2,3,4) {
if ($proc_status{$num} == 1) {
print "CHECK: procedure $num has finished\n";
} else {
$all_finished=0;
}
}
}
print "All jobs finished\n";
}
foreach my $num (1,2,3,4) {
$thr{"$num"} = new threads \&my_proc,$num;
}
my $thr_check= new threads \&check_procs;
$thr_check->join();
And here is sleep_for_10_sec.bash:
ls
# bunch of other stuff
sleep 10
echo "finished sleep"
I don't want the my_proc subs to wait for sleep_for_10_sec.bash to finish. After browsing, I found that either case 1 or case 2 should work, but they both fail.
The output of case 1:
Proc 1 Started
[ls result]
finished sleep
The output of case 2:
Proc 1 Started
Proc 2 Started
Proc 3 Started
Proc 4 Started
CHECK:
CHECK:
Proc 4 Finished
Proc 2 Finished
Proc 3 Finished
Proc 1 Finished
Proc 3 Started
Proc 1 Started
Proc 2 Started
Proc 4 Started
CHECK:
CHECK: procedure 1 has finished
CHECK: procedure 2 has finished
CHECK: procedure 3 has finished
CHECK: procedure 4 has finished
But I expect something like this :
Proc 1 Started
Proc 2 Started
Proc 3 Started
Proc 4 Started
Proc 1 Finished
Proc 1 Started
Proc 3 Finished
Proc 3 Started
Proc 4 Finished
Proc 4 Started
Proc 2 Finished
Proc 2 Started
CHECK:
CHECK:
CHECK:
CHECK: procedure 1 has finished
CHECK: procedure 2 has finished
CHECK: procedure 3 has finished
CHECK: procedure 4 has finished
Actually, I get the wanted result when redirecting the output to " > log", but even then, after:
Proc 1 Started
Proc 2 Started
Proc 3 Started
Proc 4 Started
it waits for sleep_for_10_sec.bash to finish.
This is my first project using threads and exec. Could someone help me with this?
exec shouldn't be combined with threads. exec launches a new program within the current process, so when you call exec from one thread, the program the threads were executing disappears. Since the threads would have no program to execute, exec kills the threads as well.
It's not clear to me why case 2 doesn't work (edit: see ikegami's comment below). I would think it would launch the process, run it in the background, and allow the Perl thread to immediately continue. It doesn't seem to do that, but this code will:
system("/bin/bash sleep_for_10_sec.bash &"); # case 3
exec("/bin/bash","sleep_for_10_sec.bash") or die("Can't exec"); # case 1
exec replaces the program running in the current process with another program. At the same time, the existing threads are terminated (since the program they want to execute is no longer there), replaced with a single thread executing the new program.
This means that exec never returns (except on error). Threads or no threads, exec is not what you want, because you don't want your program to stop running.
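If the goal is to start another program and keep the Perl program running, the usual pattern in a non-threaded script is to fork first and call exec only in the child. The following is a minimal sketch of that pattern, assuming a Unix-like system with /bin/sh; the helper name spawn and the test command are hypothetical and not part of the question's code.

```perl
use strict;
use warnings;

# Hypothetical helper: start @cmd in a child process and return its PID.
sub spawn {
    my @cmd = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: exec replaces this process and only returns on failure.
        exec(@cmd) or die "Can't exec $cmd[0]: $!";
    }
    return $pid;    # Parent: keeps running its own program.
}

my $pid = spawn('/bin/sh', '-c', 'sleep 1; exit 3');
print "Parent is still running while child $pid works\n";
waitpid($pid, 0);    # Reap the child when convenient.
print "Child exited with status ", $? >> 8, "\n";
```

The parent never calls exec itself, so its own code keeps running; waitpid is only needed when the exit status matters.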
But I expect something like this:
Are you sure you want to launch sleep_for_10_sec.bash 4 times every two seconds (meaning you can have up to 20 of them running at a time) as your desired output indicates?
Are you sure you don't care if sleep_for_10_sec.bash completes or not as your desired output indicates?
If so, why are you using threads at all? You could simply use the following:
use feature qw( say );
sub start {
my $num = shift;
say "Proc $num Started";
system('bash -c sleep_for_10_sec.bash &');
say "Proc $num Finished";
}
for my $pass (1..2) {
start($_) for 1..4;
sleep 2;
start($_) for 1..4;
sleep 2;
start($_) for 1..4;
sleep 1;
if ($pass == 1) {
say "CHECK:";
} else {
say "CHECK: procedure $_ has finished" for 1..4;
}
}
I think you want
use threads;
use feature qw( say );
use Thread::Queue qw( ); # 3.01+
use constant NUM_WORKERS => 4;
sub worker {
my $num = shift;
say "Job $num Started\n";
system("sleep_for_10_sec.bash"); # Make sure starts with #! and is executable.
say "Job $num Finished\n";
}
{
my $q = Thread::Queue->new();
for (1..NUM_WORKERS) {
async { # Each worker loop runs in its own thread.
while (defined( my $job = $q->dequeue() )) {
worker($job);
}
};
}
$q->enqueue(1..4, 1..4);
$q->end();
$_->join() for threads->list;
}

Using threads in loop

I want to use threads in a loop: start the threads, wait for them to finish, sleep for a predefined amount of time once they are all done, and then start them again.
I actually want to run these threads once every hour, which is why I am using sleep. I know an hourly run can be done via cron, but I can't use it here.
I am getting this error when I am trying to run my code:
Thread already joined at ex.pl line 33.
Perl exited with active threads:
5 running and unjoined
0 finished and unjoined
0 running and detached
This is my code:
use strict;
use warnings;
use threads;
use Thread::Queue;
my $queue = Thread::Queue->new();
my @threads_arr;
sub main {
$queue->enqueue($_) for 1 .. 5;
$queue->enqueue(undef) for 1 .. 5;
}
sub thread_body {
while ( my $num = $queue->dequeue() ) {
print "$num is popped by " . threads->tid . " \n";
sleep(5);
}
}
while (1) {
my $main_thread = threads->new( \&main );
push @threads_arr, threads->create( \&thread_body ) for 1 .. 5;
$main_thread->join();
foreach my $x (@threads_arr) {
$x->join();
}
sleep(1);
print "sleep \n";
}
I have seen other questions similar to this one, but I could not follow any of them.
Your @threads_arr array never gets cleared after you join the first 5 threads. The old (already joined) threads still exist in the array the second time around the loop, so Perl throws the "Thread already joined" error when attempting to join them. Clearing or locally initializing @threads_arr every time around the loop will fix the problem.
@threads_arr = (); # Reinitialize the array
push @threads_arr, threads->create( \&thread_body ) for 1 .. 5;
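The "locally initializing" variant can be sketched by keeping the thread list in a lexical variable inside a sub, so every round starts with an empty list. This is only an illustration: the helper run_batch is hypothetical, and the workers here take their number as an argument instead of reading from the question's queue.

```perl
use strict;
use warnings;
use threads;

# Hypothetical helper: run one batch of workers and join all of them.
sub run_batch {
    my @nums = @_;
    # Lexical array: a fresh, empty list of threads on every call.
    my @threads_arr = map { threads->create(\&thread_body, $_) } @nums;
    return map { $_->join() } @threads_arr;
}

sub thread_body {
    my ($num) = @_;
    return "$num is handled by thread " . threads->tid;
}

for my $round (1 .. 3) {
    my @results = run_batch(1 .. 5);
    print "round $round: joined ", scalar(@results), " threads\n";
}
```

Because @threads_arr goes out of scope at the end of each call, no already-joined thread objects survive into the next round.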

How to fork a function with params in bash?

I'm new to bash scripting and I ran into an issue while trying to improve my script. The script splits a text file, and each part is processed in a function. Everything works fine until I fork (with &) the function that does the processing: the args are not what I expect (a line number, and text containing whitespace and backspaces), and I suppose it's because of global variables. I tried to fork, sleep 1 second in the parent thread, and then continue, so that the args could be copied into local variables for the function execution, but that doesn't work either. Can you give me a hint about how to do it? What I want is to pass args to my function and still be allowed to modify them in the parent thread after the fork call. Is that possible?
Thanks in advance :)
Here is my script :
#!/bin/bash
#Parameters#
FILE='tasks'
LINE_BY_THREAD='500'
#Function definition
function checkPart() {
local NUMBER="$1"
local TXT="$2"
echo "$TXT" | { while IFS= read -r line ; do
IFS=' ' read -ra ADDR <<< "$line"
#If the countdown is set to 0, launch the task and set it to init value
if [ ${ADDR[0]} == '0' ]; then
#task launching
#to replace by $()l
echo `./${ADDR[1]}.sh ${ADDR[2]} &`
#countdown set to init value
sed -i "$NUMBER c ${ADDR[3]} ${ADDR[1]} ${ADDR[2]} ${ADDR[3]}" $FILE
else
sed -i "$NUMBER c $((ADDR-1)) ${ADDR[1]} ${ADDR[2]} ${ADDR[3]}" $FILE
fi
((NUMBER++))
done }
}
#Init processes number#
LINE_NUMBER=$(wc -l < $FILE)
NB_PROCESSES=$(($LINE_NUMBER / $LINE_BY_THREAD))
if [ $(($LINE_NUMBER % $LINE_BY_THREAD)) -ne '0' ]; then
((NB_PROCESSES++))
fi
echo "Number of thread to be run : $NB_PROCESSES"
#Start the split sequence#
for (( i = 2; i <= $LINE_NUMBER; i += $LINE_BY_THREAD ))
do
PARAM=$(sed "$i,$(($i + $LINE_BY_THREAD - 1))!d" "$FILE")
(checkPart "$i" "$PARAM") &
sleep 1
done
My job is to create a scheduler for the tasks described in the following file:
#MinutesBeforeLaunch#TypeOfProcess#Argument#Frequency#
2 agr_m 42 5
5 agr_m_s 26 5
0 agr_m 42 5
3 agr_m_s 26 5
0 agr_m 42 5
5 agr_m_s 26 5
4 agr_m 42 5
5 agr_m_s 26 5
4 agr_m 42 5
4 agr_m_s 26 5
2 agr_m 42 5
4 agr_m_s 26 5
When I read a number > 0 in the first column, I just decrement it; when it's 0, I have to launch the task and reset the first number to the frequency (the last column).
My first attempt is the code above, which uses sed for the text replacement, but is it possible to do better?

In Perl, how can I wait for threads to end in parallel?

I have a Perl script that launches 2 threads, one for each processor. I need it to wait for a thread to end and, when one ends, spawn a new one. It seems that the join method blocks the rest of the program, so the second thread can't end until everything the first thread does is done, which sort of defeats its purpose.
I tried the is_joinable method but that doesn't seem to do it either.
Here is some of my code :
use threads;
use threads::shared;
@file_list = @ARGV; #Our file list
$nofiles = $#file_list + 1; #Real number of files
$currfile = 1; #Current number of file to process
my %MSG : shared; #shared hash
$thr0 = threads->new(\&process, shift(@file_list));
$currfile++;
$thr1 = threads->new(\&process, shift(@file_list));
$currfile++;
while(1){
if ($thr0->is_joinable()) {
$thr0->join;
#check if there are files left to process
if($currfile <= $nofiles){
$thr0 = threads->new(\&process, shift(@file_list));
$currfile++;
}
}
if ($thr1->is_joinable()) {
$thr1->join;
#check if there are files left to process
if($currfile <= $nofiles){
$thr1 = threads->new(\&process, shift(@file_list));
$currfile++;
}
}
}
sub process{
print "Opening $currfile of $nofiles\n";
#do some stuff
if(some condition){
lock(%MSG);
#write stuff to hash
}
print "Closing $currfile of $nofiles\n";
}
The output of this is :
Opening 1 of 4
Opening 2 of 4
Closing 1 of 4
Opening 3 of 4
Closing 3 of 4
Opening 4 of 4
Closing 2 of 4
Closing 4 of 4
First off, a few comments on the code itself. You need to make sure you have:
use strict;
use warnings;
at the beginning of every script. Second:
@file_list = @ARGV; #Our file list
$nofiles = $#file_list + 1; #Real number of files
is unnecessary as an array in scalar context evaluates to the number of elements in the array. That is:
$nofiles = @ARGV;
would correctly give you the number of files in @ARGV regardless of the value of $[.
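As a small illustration of the difference, here is a sketch using a made-up list in place of @ARGV (the file names are placeholders):

```perl
use strict;
use warnings;

my @file_list  = qw(a.txt b.txt c.txt);   # stand-in for @ARGV
my $nofiles    = @file_list;              # scalar context: number of elements
my $last_index = $#file_list;             # index of the last element

print "count=$nofiles last_index=$last_index\n";
print "explicit scalar context: ", scalar(@file_list), "\n";
```

With the default array base, the count is always one more than the last index, which is why the `+ 1` in the original code happens to work.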
Finally, the script can be made much simpler by partitioning the list of files before starting the threads:
use strict; use warnings;
use threads;
use threads::shared;
my @threads = (
threads->new(\&process, @ARGV[0 .. @ARGV/2]),
threads->new(\&process, @ARGV[@ARGV/2 + 1 .. @ARGV - 1]),
);
$_->join for @threads;
sub process {
my @files = @_;
warn "called with @files\n";
for my $file ( @files ) {
warn "opening '$file'\n";
sleep rand 3;
warn "closing '$file'\n";
}
}
Output:
C:\Temp> thr 1 2 3 4 5
called with 1 2 3
opening '1'
called with 4 5
opening '4'
closing '4'
opening '5'
closing '1'
opening '2'
closing '5'
closing '2'
opening '3'
closing '3'
Alternatively, you can let the threads move on to the next task as they are done:
use strict; use warnings;
use threads;
use threads::shared;
my $current :shared;
$current = 0;
my @threads = map { threads->new(\&process, $_) } 1 .. 2;
$_->join for @threads;
sub process {
my ($thr) = @_;
warn "thread $thr stared\n";
while ( 1 ) {
my $file;
{
lock $current;
return unless $current < @ARGV;
$file = $ARGV[$current];
++ $current;
}
warn "$thr: opening '$file'\n";
sleep rand 5;
warn "$thr: closing '$file'\n";
}
}
Output:
C:\Temp> thr 1 2 3 4 5
thread 1 stared
1: opening '1'
1: closing '1'
1: opening '2'
thread 2 stared
2: opening '3'
2: closing '3'
2: opening '4'
1: closing '2'
1: opening '5'
1: closing '5'
2: closing '4'
I think you need to move the code that pulls the next file from the list into the threads themselves.
So every thread would not just process one file, but continue to process until the list is empty.
This way, you also save on the overhead of creating new threads all the time.
Your main thread will then join both of them.
Of course, this requires synchronization on the list (so that they do not pull the same data). Alternately, you could split the list into two (one for each thread), but that might result in an unlucky distribution.
(PS: No Perl God, just a humble monk)
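One way to get the synchronization on the list without writing the locking by hand is to put the file names into a Thread::Queue and let each thread pull from it. The sketch below assumes Thread::Queue 3.01 or later (for end); the helper name process_files and the tagging of each file with the thread id are illustrative only.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;    # end() needs version 3.01 or later

# Hypothetical helper: let $workers threads pull file names from one queue.
sub process_files {
    my ($workers, @files) = @_;
    my $queue = Thread::Queue->new(@files);
    $queue->end();    # No more items will be added.
    my @threads = map {
        threads->create(sub {
            my @handled;
            # dequeue() is thread-safe; it returns undef once the
            # ended queue has been drained.
            while (defined(my $file = $queue->dequeue())) {
                push @handled, threads->tid . ":$file";
            }
            return @handled;
        });
    } 1 .. $workers;
    return map { $_->join() } @threads;
}

print "$_\n" for process_files(2, qw(1 2 3 4 5));
```

Each file is handed to exactly one thread, and a thread that finishes early simply takes the next file, which avoids the unlucky distribution a fixed split can produce.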

Resources