I have code that looks like this:
package require Thread
proc p1 {} {
set tid [thread::create {
proc executeCommand {command} {
return $command
}
thread::wait
}]
set result ""
::thread::send -async $tid [list executeCommand {"Hello thread world"}] result
#***Do some additional stuff***
vwait result
::thread::release $tid
puts $result
return $result
}
p1
After sourcing the .tcl file that holds this code, my expectation is for the child thread to return "Hello thread world" after vwait is called and the 'result' variable to be printed out, but neither of these happen. It appears that the 'result' variable is remaining blank.
Whats strange is that when I take the code out of a procedure (proc) block and source the .tcl file, it works perfectly but with the way my system is set up I need to use procedures.
Not sure what I am doing wrong.
The “problem” is that the receiving variable (just as with vwait) is located with respect to the global namespace, not the variables that are in the current scope; the flag TCL_GLOBAL_ONLY is used on the call to Tcl_SetVar2Ex in the callback (and Tcl's underlying variable implementation is quite complicated so one really wants to stick to the API if possible):
/*
* Set the result variable
*/
if (Tcl_SetVar2Ex(interp, var, NULL, valObj,
TCL_GLOBAL_ONLY | TCL_LEAVE_ERR_MSG) == NULL) {
rc = TCL_ERROR;
goto cleanup;
}
That makes sense in general, as you could have returned from the procedure between launching the background thread and receiving the result, and Tcl really tries to avoid doing early binding.
So where did the result go? It's in the global result variable (the :: just means “I really mean to use the global variable named this”):
% puts $::result
"Hello thread world"
The easiest fix for this is to use a variable to do the receiving that is unique to the particular call. Which sounds more complicated than it really is, since we've already got a unique token in the thread ID:
proc p1 {} {
set tid [thread::create {
proc executeCommand {command} {
return $command
}
thread::wait
}]
### Make the name of the global variable (an array element) ###
set var ::p1results($tid)
### This is a simple transformation of what you were already doing ###
set $var ""
::thread::send -async $tid [list executeCommand {"Hello thread world"}] $var
#***Do some additional stuff***
vwait $var
### Transfer the global variable into a local and remove the global ###
set result [set $var]
unset $var
### Back to your code now ###
::thread::release $tid
puts $result
return $result
}
This appears to work as expected when I try it out.
Related
I've been trying to implement some multithreaded problems in Tcl. At some point I wanted to simulate the Fork/Join framework in Java where a thread spawns two "child" threads that each do half of the work. In order to do this in Tcl, you would create some threads with a startup script, put them to work to then wait for their result. My implementation uses joinable threads, asynchronous message sends with optional result variable and vwait on that variable, but produces irregular results. Sometimes the script finishes as expected, sometimes it keeps blocking on the vwait. Why can it sometimes block on the vwait, and otherwise run just fine? I have no idea why that would be; I followed the manual pages to the best of my knowledge.
Some sample code to demonstrate the behaviour:
package require Thread
set code {
proc run {} {
puts "I am running"
return 4
}
thread::wait
}
set t1 [ thread::create -joinable $code ]
set t2 [ thread::create -joinable $code ]
thread::send -async $t1 "run" res1
thread::send -async $t2 "run" res2
puts "Waiting"
vwait res1
vwait res2
puts "Releasing"
thread::release $t1
thread::release $t2
puts "Joining"
thread::join $t1
thread::join $t2
puts [ expr { $res1 + $res2 } ]
This sometimes produces (as expected):
Waiting
I am running
I am running
Releasing
Joining
8
... and sometimes:
Waiting
I am running
I am running
<keeps hanging here>
Spawning only 1 thread never seems to recreate the issue, or it might be that the underlying problem still remains but never manifests itself. Someone can hopefully shed some light on why this happens. Thanks in advance!
The threads can run in either order, yet vwait only waits on a single variable; it's possible for $t2 to finish its work (and set res2) before $t1 finishes (and sets res1). A simple way around this is to put the variables to wait on in an array and then vwait on the whole array:
thread::send -async $t1 "run" waiting(res1)
thread::send -async $t2 "run" waiting(res2)
puts "Waiting"
# Wait for the two sets to happen, in either order
vwait waiting
vwait waiting
puts "Releasing"
More generally, consider using a thread pool instead.
package require Thread
set code {
proc run {} {
puts "I am running"
return 4
}
# No thread::wait here!
}
set pool [tpool::create -maxworkers 2 -initcmd $code]
set task1 [tpool::post $pool run]
set task2 [tpool::post $pool run]
tpool::wait $pool $task1
tpool::wait $pool $task2
Your idea about a race condition (That the threads run in an order where the second one executes first) is correct, yes.
One alternative uses shared variables instead of having the run proc return a value, and condition variables to notify the parent thread that the run procedures are complete before killing the thread:
#!/usr/bin/env tclsh
package require Thread
set code {
proc run {m cond res} {
puts "$res is running"
tsv::incr results $res 4
thread::mutex lock $m
thread::cond notify $cond
thread::mutex unlock $m
}
thread::wait
}
set t1 [thread::create -joinable $code]
set t2 [thread::create -joinable $code]
set c1 [thread::cond create]
set c2 [thread::cond create]
set m1 [thread::mutex create]
set m2 [thread::mutex create]
thread::send -async $t1 [list run $m1 $c1 res1]
thread::send -async $t2 [list run $m2 $c2 res2]
puts "Running threads"
thread::mutex lock $m1
while {![tsv::exists results res1]} {
thread::cond wait $c1 $m1
}
thread::mutex unlock $m1
thread::mutex lock $m2
while {![tsv::exists results res2]} {
thread::cond wait $c2 $m2
}
thread::mutex unlock $m2
puts "Releasing"
thread::release $t1
thread::release $t2
puts "Joining" ;# Not sure if this is needed after the release or even valid
thread::join $t1
thread::join $t2
puts [expr {[tsv::get results res1] + [tsv::get results res2]}]
Thanks so much for the great ideas so far! I was able to solve my own problem, because when you explain the problem in just a few lines of code, it is easier to oversee what is going on. I would like to share my idea too. Look at the accepted answer for a better understanding of the problem.
I came up with an alternative solution that uses synchronous message sending after putting the threads to work with an asynchronous message. The Tcl manual specifies:
Many threads can simultaneously send scripts to the target thread for execution. All of them are entered into the event queue of the target thread and executed on the FIFO basis [...].
... which means that scripts sent to threads are processed in the order they arrive. You can make the thread accept a new script that just returns the resulting value and send that script synchronously to block until you get the result. This way, you don't have to use vwait at all because you rely on the FIFO script sending system.
The solution code looks like this:
package require Thread
set code {
set result 0
proc run {} {
upvar result result
puts "I am running"
set result 4
}
proc getResult {} {
upvar result result
return $result
}
thread::wait
}
set t1 [ thread::create -joinable $code ]
set t2 [ thread::create -joinable $code ]
thread::send -async $t1 "run" res1
thread::send -async $t2 "run" res2
puts "Waiting"
set res1 [ thread::send $t1 "getResult" ]
set res2 [ thread::send $t2 "getResult" ]
puts "Releasing"
thread::release $t1
thread::release $t2
puts "Joining"
thread::join $t1
thread::join $t2
puts [ expr { $res1 + $res2 } ]
This is to add an example of a third option for inter-thread comm using channel pipes (chan pipe) as a communication vehicle (rather than mutexes/ conditions, which are also used internally by tpool, if I am not utterly mistaken):
package require Thread
set code {
proc run {ch} {
puts "[thread::id] is running"
puts $ch 4
close $ch
}
thread::wait
}
set handler [list {pr tid} {
variable busy
variable data
set r [read $pr]
if {[eof $pr]} {
close $pr
unset busy($tid)
thread::release $tid
} else {
lappend data $r
}
} [namespace current]]
set t1 [thread::create $code]
lassign [chan pipe] pr1 pw1
chan configure $pw1 -blocking 0
chan configure $pr1 -blocking 0
thread::transfer $t1 $pw1
chan event $pr1 readable [list apply $handler $pr1 $t1]
thread::send -async $t1 [list run $pw1]
set busy($t1) $pr1
set t2 [thread::create $code]
lassign [chan pipe] pr2 pw2
chan configure $pw2 -blocking 0
chan configure $pr2 -blocking 0
thread::transfer $t2 $pw2
chan event $pr2 readable [list apply $handler $pr2 $t2]
thread::send -async $t2 [list run $pw2]
set busy($t2) $pr2
while {[array size busy]} {
vwait ::busy
}
puts [tcl::mathop::+ {*}$data]
Some remarks:
Pipes are elegant because inter-thread communication follows the model of channel communication (chan event, puts/ read);
They can be used to communicate between worker threads (not just the main one and the workers);
The above implementation is not ideal, as threads should not be created and destroyed on a per-job basis. Hence, tpool is more convenient to use and a better fit in the given situation, clearly;
Pipes can be used as a secondary communication line between threads (in a tpool), also;
You would have to introduce some procedure abstraction to hide the verbosity of the pipe-based solution;
My scenario is: using tcl, I am writing a file. then I am sourcing that file and want to keep a watch on log file that the file will generate during its execution. If the size of log file does not change after 2 hours then I need to stop the execution of file and rerun tcl script which will regenerate the file and then source it(generate-source cycle continue till file execution done completely)
Here is psedo code of my scenario :
set root /home/nikhil/
set group {all}
set TopScript [open $root/TopScript.tcl w]
puts $TopScript "[exec perl $root/extract_excel.pl $group] \n}"
Puts $TopScript "set logfilename $root/logfile"
puts $TopScript "source $root/main_1.tcl"
puts $TopScript "source $root/main_2.tcl"
close $TopScript
#Pseudo code for scenario what I want is:
thread::create {
exec tclsh /home/nikhil/TopScript.tcl
thread::wait
}
thread::create {
set flag_for_interupt 0
while{!flag_for_interupt} {
set old_log_size [file size $root/logfile]
after [expr {int(1000* 60* 60* 2)}]
set new_log_size [file size $root/logfile]
if{$old_log_size == $new_log_size} {
puts "I suspect Test is in hung state.... checking again after 2 hours.....\n"
after [expr {int(1000* 60* 60* 2)}]
set $new_log_size [file size $root/logfile]
if{$old_log_size == $new_log_size} {
puts "\n\n Test is in hung state.... log has not updated since last 4 hours........\n\n"
}
########## enter code to interupt main thread and execute again
set flag_for_inturept 1
}
}
}
Tcl does not share (normal) variables between threads. Instead, you need to work by sending messages between threads. A message is just a (usually short) script that you ask the other thread to run (the result of the script can be handled in a few ways, including a synchronous wait or by running the script disconnected). Most of the time, you set up a procedure in the receiving thread to actually do the work.
Let's restructure your waiting thread to operate that way:
set waiter [thread::create {
proc do {filename targetThread returnMessage} {
set delay [expr {int(1000* 60* 60* 2)}]
while true {
# This would be really a do-while loop, but we don't have those
while true {
set old_log_size [file size $filename]
after $delay
set new_log_size [file size $filename]
if {$old_log_size == $new_log_size} break
}
puts "I suspect Test is in hung state... checking again after 2 hours...\n"
after $delay
set new_log_size [file size $filename]
if {$old_log_size == $new_log_size} break
}
puts "\n\n Test is in hung state... log has not updated since last 4 hours...\n\n"
# Send message to main thread to do something about the hung test
thread::send -async $targetThread $returnMessage
}
thread::wait
}]
We'd set that thread actually working like this:
thread::send -async $waiter [list do $root/logfile [thread::current] {set test_hung 1}]
However, the only long operations in there are the calls to after. (Well, unless you're fantastically unlucky with the OS calls to get the log file size.) That means we can convert to using an asynchronous form in the thread, leaving the thread open to being accessed while it is working.
set waiter [thread::create {
proc do {filename targetThread returnMessage} {
set delay [expr {int(1000* 60* 60* 2)}]
set old_log_size [file size $filename]
# Schedule the run of do2 in two hours
after $delay [list do2 $filename $targetThread $returnMessage $delay $filename $old_log_size]
}
proc do2 {filename targetThread returnMessage delay filename old_log_size} {
set new_log_size [file size $filename]
if {$old_log_size == $new_log_size} {
puts "I suspect Test is in hung state... checking again after 2 hours...\n"
# Schedule the run of do3 in another two hours
after $delay [list do3 $filename $targetThread $returnMessage $delay $filename $old_log_size]
} else {
# An update did happen; run ourselves again in two hours to compare to the new size
after $delay [list do2 $filename $targetThread $returnMessage $delay $filename $new_log_size]
}
}
proc do3 {filename targetThread returnMessage delay filename old_log_size} {
set new_log_size [file size $filename]
if {$old_log_size == $new_log_size} {
puts "\n\n Test is in hung state... log has not updated since last 4 hours...\n\n"
# Send message to main thread to do something about the hung test
thread::send -async $targetThread $returnMessage
} else {
# An update did happen; run ourselves again in two hours to compare to the new size
after $delay [list do2 $filename $targetThread $returnMessage $delay $filename $new_log_size]
}
}
thread::wait
}]
So… we've got manageability but lost readability (the API for use is identical). Not bad but not great! (This sort of restructuring is known as conversion to Continuation-Passing Form, and it tends to destroy code readablity.) In 8.6 we can do better because we have coroutines that can yield to the thread's event loop.
set waiter [thread::create {
proc do {filename targetThread returnMessage} {
coroutine Coro[incr ::Coro] doBody $filename $targetThread $returnMessage
}
proc delayForTwoHours {} {
set delay [expr {int(1000* 60* 60* 2)}]
after $delay [info coroutine]
yield
}
proc doBody {filename targetThread returnMessage} {
while true {
while true {
set old_log_size [file size $filename]
delayForTwoHours
set new_log_size [file size $filename]
if {$old_log_size == $new_log_size} break
}
puts "I suspect Test is in hung state... checking again after 2 hours...\n"
delayForTwoHours
set new_log_size [file size $filename]
if {$old_log_size == $new_log_size} break
}
puts "\n\n Test is in hung state... log has not updated since last 4 hours...\n\n"
# Send message to main thread to do something about the hung test
thread::send -async $targetThread $returnMessage
}
thread::wait
}]
That (which still has the same API calling convention) gives manageability yet keeps virtually all the code (especially apart from the short bits in their own procedures) looking the same as the first version I wrote. Under the covers, the coroutine does the rewriting to continuation-passing form, but that's now handled by the Tcl runtime instead of needing to be done explicitly in your code. (Also, Tcl uses explicit coroutine launching, but that in turn means that it can yield across multiple stack levels without the complex yield chains of some other languages.)
I leave it as an exercise to use the second or third version as the basis for a version of the code which doesn't need extra threads at all. Running processes in the background also doesn't need threads; this whole management process can work with just a single (user-visible) thread.
I want to get 200 files using multithread, so I modify a TCL example as below.
But the result is strange, the total number of output files is random, about 135. I was confused that how the thread started change the value of variable $thread.
package require Thread
puts "*** I'm thread [thread::id]"
for {set thread 1} {$thread <= 200} {incr thread} {
set thread_ida $thread
tsv::set app global_thread_num $thread_ida
set id [thread::create -joinable {
puts [ tsv::get app global_thread_num ]
set thread_id [ tsv::get app global_thread_num ]
puts "${thread_id}thread_id"
set outFile "./test/${thread_id}"
append outFile ".tmd"
puts $outFile
set FileOut [open $outFile w+]
puts $FileOut "${thread_id}thread_id"
}] ;# thread::create
puts "*** Started thread $id"
lappend threadIds $id
} ;# for
puts "*** Existing threads: [thread::names]"
# Wait until all other threads are finished
foreach id $threadIds {
thread::join $id
}
puts "*** That's all, folks!"
The problem you've got is that these two lines:
puts [ tsv::get app global_thread_num ]
set thread_id [ tsv::get app global_thread_num ]
are not guaranteed to get the same value at all, nor are they at all likely to synchronise with the setting of the shared variable in the outer loop. Threads in Tcl have a reasonable amount of overhead during launch.
Instead, what you should do is make threads with the description of work inside a procedure and then send a simple message to them with the ID to start the real processing; that's much easier to make work.
package require Thread
puts "*** I'm thread [thread::id]"
for {set thread 1} {$thread <= 200} {incr thread} {
set id [thread::create -joinable {
proc DoWork {thread_id} {
# Only one puts here
puts "${thread_id}thread_id"
set outFile "./test/${thread_id}"
append outFile ".tmd"
puts $outFile
set FileOut [open $outFile w+]
puts $FileOut "${thread_id}thread_id"
# Close the channel, please...
close $FileOut
# Thread done, and since we're using joinable threads it should die now
thread::release
}
thread::wait
}] ;# thread::create
puts "*** Started thread $id"
lappend threadIds $id
# Start the work going, passing over the numeric ID in the "message"
thread::send -async $id [list DoWork $thread]
} ;# for
puts "*** Existing threads: [thread::names]"
# Wait until all other threads are finished
foreach id $threadIds {
thread::join $id
}
puts "*** That's all, folks!"
The key things here are that we create a procedure in each thread (DoWork) to receive the message, get the thread to wait for messages with thread::wait, and then launch the work by sending a message in with thread::send -async. The work destroys the thread with thread::release; it needs to do so explicitly otherwise it'll end up back in thread::wait waiting for the next message.
I'd probably use a thread pool in production code, as they're easier to scale to the hardware available in a particular deployment. The DoWork procedure — without the thread::release — would be defined in the pool's -initcmd option. The thread::send -async would be replaced by posting work to the pool, and you'd be waiting for the jobs instead of the threads.
package require Thread
puts "*** I'm thread [thread::id]"
set pool [tpool::create -maxworkers 48 -initcmd {
proc DoWork {thread_id} {
# Only one puts here
puts "${thread_id}thread_id"
set outFile "./test/${thread_id}"
append outFile ".tmd"
puts $outFile
set FileOut [open $outFile w+]
puts $FileOut "${thread_id}thread_id"
# Close the channel, please...
close $FileOut
}
}]
for {set thread 1} {$thread <= 200} {incr thread} {
lappend work [tpool::post -nowait $pool [list DoWork $thread]]
}
# Wait until all work is finished
foreach id $work {
tpool::wait $pool $id
}
puts "*** That's all, folks!"
tpool::release $pool
When a CHLD signal handler is used in Perl, even uses of system and backticks will send the CHLD signal. But for the system and backticks sub-processes, neither wait nor waitpid seem to set $? within the signal handler on SuSE 11 linux. Is there any way to determine the return code of a backtick command when a CHLD signal handler is active?
Why do I want this? Because I want to fork(?) and start a medium length command and then call a perl package that takes a long time to produce an answer (and which executes external commands with backticks and checks their return code in $?), and know when my command is finished so I can take action, such as starting a second command. (Suggestions for how to accomplish this without using SIGCHLD are also welcome.) But since the signal handler destroys the backtick $? value, that package fails.
Example:
use warnings;
use strict;
use POSIX ":sys_wait_h";
sub reaper {
my $signame = shift #_;
while (1) {
my $pid = waitpid(-1, WNOHANG);
last if $pid <= 0;
my $rc = $?;
print "wait()=$pid, rc=$rc\n";
}
}
$SIG{CHLD} = \&reaper;
# system can be made to work by not using $?, instead using system return value
my $rc = system("echo hello 1");
print "hello \$?=$?\n";
print "hello rc=$rc\n";
# But backticks, for when you need the output, cannot be made to work??
my #IO = `echo hello 2`;
print "hello \$?=$?\n";
exit 0;
Yields a -1 return code in all places I might try to access it:
hello 1
wait()=-1, rc=-1
hello $?=-1
hello rc=0
wait()=-1, rc=-1
hello $?=-1
So I cannot find anywhere to access the backticks return value.
This same issue has been bugging me for a few days now. I believe there are 2 solutions required depending on where you have your backticks.
If you have your backticks inside the child code:
The solution was to put the line below inside the child fork. I think your statement above "if I completely turn off the CHLD handler around the backticks then I might not get the signal if the child ends" is incorrect. You will still get a callback in the parent when the child exits because the signal is only disabled inside the child. So the parent still gets a signal when the child exits. It's just the child doesn't get a signal when the child's child (the part in backticks) exits.
local $SIG{'CHLD'} = 'DEFAULT'
I'm no Perl expert, I have read that you should set the CHLD signal to the string 'IGNORE' but this did not work in my case. In face I believe it may have been causing the problem. Leaving that out completely appears to also solve the problem which I guess is the same as setting it to DEFAULT.
If you have backticks inside the parent code:
Add this line to your reaper function:
local ($!, $?);
What is happening is the reaper is being called when your code inside the backticks completes and the reaper is setting $?. By making $? local it does not set the global $?.
So, building on MikeKull's answer, here is a working example where the fork'd child uses backticks and still gets the proper return code. This example is a better representation of what I was doing, while the original example did not use forks and could not convey the entire issue.
use warnings;
use strict;
use POSIX ":sys_wait_h";
# simple child which returns code 5
open F, ">", "exit5.sh" or die "$!";
print F<<EOF;
#!/bin/bash
echo exit5 pid=\$\$
exit 5
EOF
close F;
sub reaper
{
my $signame = shift #_;
while (1)
{
my $pid = waitpid(-1, WNOHANG);
print "no child waiting\n" if $pid < 0;
last if $pid <= 0;
my $rc = $? >> 8;
print "wait()=$pid, rc=$rc\n";
}
}
$SIG{CHLD} = \&reaper;
if (!fork)
{
print "child pid=$$\n";
{ local $SIG{CHLD} = 'DEFAULT'; print `./exit5.sh`; }
print "\$?=" . ($? >> 8) . "\n";
exit 3;
}
# sig CHLD will interrupt sleep, so do multiple
sleep 2;sleep 2;sleep 2;
exit 0;
The output is:
child pid=32307
exit5 pid=32308
$?=5
wait()=32307, rc=3
no child waiting
So the expected return code 5 was received in the child when the parent's reaper was disabled before calling the child, but as indicated by ikegami the parent still gets the CHLD signal and a proper return code when the child exits.
The below code does not output to the sample.log file as intended. Is there a reason why?
Might be related to scope, but cant seem to figure it out.
proc log_by_tracing { array element op } {
uplevel {
global logfile
set file $logfile($expect_out(spawn_id))
puts -nonewline $file $expect_out(buffer)
}
}
proc start_telnet {} {
spawn telnet 10.30.16.112
set logfile($spawn_id) [open sample.log w]
trace variable expect_out(buffer) w log_by_tracing
expect "login"
send "anonymous\r"
}
start_telnet
Help
Probably that uplevel block is failing to substitute what you think. Its really awkward code too. You can pass a variable into the trace function you know and avoid the global and the uplevel.
proc log {chan varname elt op} {
upvar #0 $varname v
puts -nonewline $chan "< $v(buffer)"
}
proc start_telnet {} {
set log [open sample.log w]
trace variable expect_out(buffer) w [list log $log]
...
}