Hi, I have been working with Tcl scripting for almost a year and now understand the basics of it fairly well. But today I came across nested procedures, which seemed strange to me, as I did not get the use of them.
Anyway, I read about nested procs here but did not get a clear idea of why we need them.
The article says that since procs are global within a namespace, you create nested procs to get a local proc.
proc parent {} {
    proc child {} {
        puts "Inside child proc"
    }
    ...
}
Now one usage I can think of is something like:
proc parent {} {
    proc child {intVal} {
        puts "intVal is $intVal"
    }
    ::child 10
    ... ;# some processing
    ::child 20
    ... ;# some processing
    ::child 30
    ... ;# some processing
    ::child 40
    ... ;# some processing
    ::child 50
    ... ;# some processing
}
So now the child proc is local to the parent proc and can be used only inside the parent proc. As I understand it, this is useful when you want to do the same processing at multiple places inside that parent proc.
Now my confusion is: is this the only use of nested procs, or is there something else that I did not understand? I mean, a nested proc just seems like a kind of private proc.
So please shed some light on it and help me understand the use of nested procs.
Tcl doesn't have nested procedures. You can call proc inside a procedure definition, but that's just creating a normal procedure (the namespace used for resolution of the name of the procedure to create will be the current namespace of the caller, as reported by namespace current).
Why would you put proc inside proc? Well, the real reason for doing so is when you want to have the outer command act as a factory, to create the command when it is called. Sometimes the name of the command to create will be supplied by the caller, and sometimes it will be internally generated (in the latter case, it is normal to return the name of the created command). The other case that comes up is where the outer command is some sort of proxy for the (real) inner one, allowing the postponing of the creation of the real command because it is expensive in some fashion; if that's the case, the inner procedure will tend to actually be created with the same name as the outer one, and it will replace it (though not the executing stack frame; Tcl's careful about that because that would be crazy otherwise).
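As an illustration of the factory case, here is a minimal sketch (the names makeCounter, counterId and the counts array are invented for this example, not from the question): the outer command generates a new, internally named command on each call and returns its name.

```tcl
# Illustrative factory sketch: every call to makeCounter creates a fresh
# counter command with an internally generated name and returns that name.
set counterId 0

proc makeCounter {} {
    global counterId
    set name counter[incr counterId]
    # Bake the generated name into the body so each counter keeps its
    # own tally in a global array slot keyed by its own name.
    proc $name {} [string map [list %NAME% $name] {
        global counts
        return [incr counts(%NAME%)]
    }]
    return $name
}

set c [makeCounter]
puts [$c]    ;# prints 1
puts [$c]    ;# prints 2
```

Each counter returned is an ordinary global command; nothing about it is "nested" once it exists, which is exactly the point being made above.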
In the case where you really need an “inner procedure” it's actually better to use a lambda term that you can apply instead. That's because it is a genuine value that can be stored in a local variable and which will automatically go away when the outer procedure terminates (or if you explicitly unset the variable or replace its contents). This inner code won't have access to the outer code's variables except via upvar; if you want to return the value while still binding variables, you should use a command prefix and include a bit of extra trickery to bind the variables as pre-supplied arguments:
proc multipliers {from to} {
    set result {}
    for {set i $from} {$i <= $to} {incr i} {
        lappend result [list apply {{i n} {
            return [expr {$i * $n}]
        }} $i]
    }
    return $result
}
set mults [multipliers 1 5]
foreach m $mults {
puts [{*}$m 2.5]
}
# Prints (one per line): 2.5 5.0 7.5 10.0 12.5
Using an inner proc to simulate apply
Note that the apply command can actually be simulated by an inner procedure. This was a technique used in Tcl 8.4 and before:
# Omitting error handling...
proc apply {lambdaTerm args} {
    foreach {arguments body namespace} $lambdaTerm break
    set cmd ${namespace}::___applyLambda
    proc $cmd $arguments $body
    set result [uplevel 1 [linsert $args 0 $cmd]]; # 8.4 syntax for safe expansion!
    rename $cmd ""
    return $result
}
This was somewhat error-prone and very slow as it would recompile on each invocation; we don't do that any more!
Tcl does not have nested procs. From the proc man page:
Normally, name is unqualified (does not include the names of any containing namespaces), and the new procedure is created in the current namespace.
(emphasis mine)
To demonstrate:
% namespace eval foo {
    proc parent {} {
        proc child {} {
            puts "the child"
        }
        puts "the parent, which holds [child]"
    }
}
% foo::parent
the child
the parent, which holds
% foo::child
the child
We can still call the "inner" proc directly -- it's not local to the enclosing proc.
One item you missed in the discussion in that wiki page is that to make a proc truly local only to the enclosing proc, one must delete it at the end of the enclosing proc:
% namespace eval foo {
    proc parent {} {
        proc child {} {
            puts "the child"
        }
        puts "the parent, which holds [child]"
        # now, destroy the inner proc
        rename child ""
    }
}
% foo::parent
the child
the parent, which holds
% foo::child
invalid command name "foo::child"
As to the use of a local proc, I'd agree with you that it's beneficial for encapsulating repetitive tasks that are only useful in the current proc. I wouldn't get too hung up on that, though: clear documentation or code conventions will do just as well.
In the context of Jenkins pipelines, I have some Groovy code that's enumerating a list, creating closures, and then using that value in the closure as a key to lookup another value in a map. This appears to be rife with some sort of anomaly or race condition almost every time.
This is a simplification of the code:
def tasks = [:]
for (platformName in platforms) {
    // ...
    tasks[platformName] = {
        def componentUploadPath = componentUploadPaths[platformName]
        echo "Uploading for platform [${platformName}] to [${componentUploadPath}]."
        // ...
    }
}
tasks.failFast = true
parallel(tasks)
platforms has two values. I will usually see two iterations and two tasks registered and the keys in tasks will be correct, but the echo statement inside the closure indicates that we're just running one of the platforms twice:
14:20:02 [platform2] Uploading for platform [platform1] to [some_path/platform1].
14:20:02 [platform1] Uploading for platform [platform1] to [some_path/platform1].
It's ridiculous.
What do I need to add or do differently?
It's the same issue as you'd see in Javascript.
When you generate the closures in a for loop, they are bound to the loop variable, not to the value of the variable.
When the loop exits and the closures are run, they will all be using the same value: the last value the loop variable held before the loop exited.
For example, you'd expect the following to print 1 2 3 4, but it doesn't:
def closures = []
for (i in 1..4) {
    closures << { -> println i }
}
closures.each { it() }
It prints 4 4 4 4.
To fix this, you need to do one of two things... First, you could capture the value in a locally scoped variable, then close over this variable:
for (i in 1..4) {
    def n = i
    closures << { -> println n }
}
The second thing you could do is use Groovy's each or collect: the closure parameter is a fresh variable on every invocation, so it works again:
(1..4).each { i ->
    closures << { -> println i }
}
For your case, you can loop over platforms and collect into a map at the same time by using collectEntries:
def tasks = platforms.collectEntries { platformName ->
    [
        platformName,
        { ->
            def componentUploadPath = componentUploadPaths[platformName]
            echo "Uploading for platform [${platformName}] to [${componentUploadPath}]."
        }
    ]
}
Hope this helps!
It seems that dynamic variables don't always survive subroutine calls in threads:
sub foo($x, &y = &infix:<+>) {
    my &*z = &y;
    bar($x);
}
sub bar ($x) {
    say &*z($x,$x);
    my $promise = start { bar($x-1) if $x > 0 }
    await $promise;
    # bar($x-1) if $x > 0   # <-- provides the expected result: 6, 4, 2, 0
}
foo(3); # 6, 4, Dynamic variable &*z not found
Using a more globally scoped variable also works, so it's not that all variables are lost; the problem seems specific to dynamic variables:
our &b;
sub foo($a, &c = &infix:<+>) {
    &b = &c;
    bar($a);
}
sub bar ($a) {
    say &b($a,$a);
    my $promise = start { bar($a-1) if $a > 0 }
    await $promise;
}
foo(3); # 6, 4, 2, 0
Once the variable is set in foo(), it is read without problem in bar(). But when bar() is called from inside the promise, the value for &*z disappears not on the first layer of recursion but the second.
I'm sensing a bug, but maybe I'm doing something weird with the interaction between the recursion, dynamic variables, and threading that's messing things up.
Under current semantics, start will capture the context it was invoked in. If dynamic variable lookup fails on the stack of the thread that the start executes on (one of those from the thread pool), then it will fall back to looking at the dynamic scope captured when the start block was scheduled.
When a start block is created during the execution of another start block, the same thing happens. However, there is no relationship between the two, meaning that the context captured by the "outer" start block will not be searched also. While one could argue for that to happen, it seems potentially problematic to do so. Consider this example:
sub tick($n = 1 --> Nil) {
    start {
        await Promise.in(1);
        say $n;
        tick($n + 1);
    }
}
tick();
sleep;
This is a (not entirely idiomatic) way to produce a tick every second. Were the inner start to retain a reference back to the state of the outer one, for the purpose of dynamic variable lookup, then this program would build up a chain of ever increasing length in memory, which seems like an undesirable behavior.
I have below dummy program,
proc main2 {} {
    set mainVar 100
    proc subproc1 {} {
        puts $mainVar
    }
    subproc1
}
main2
It throws the error can't read "mainVar": no such variable. My question is: if I declare a variable (i.e. mainVar) in a proc, shouldn't that variable be accessible everywhere inside that proc? Why can't it be accessed in another proc declared inside the main proc? Please put some light on this.
Tcl's procedures do not nest; there is no shared scope at all. The main reason for declaring a procedure inside another one is if you are doing some kind of code generation in the outer procedure (whether of the name, the variable list or the body).
Now, you can simulate a read-only version like this (simplified version; a full-service variant is a lot more complex):
proc closure {name arguments body} {
    set vars [uplevel 1 {info locals}]
    set prologue {}
    foreach v $vars {
        upvar 1 $v var
        append prologue [list set $v $var] ";"
    }
    uplevel 1 [list proc $name $arguments $prologue$body]
}
proc main2 {} {
    set mainVar 100
    closure subproc1 {} {
        puts $mainVar
    }
    subproc1
}
main2
I'll leave making it work correctly with global and arrays (as well as all the other nuances of doing this job properly) as exercises for the reader.
I have trouble running a Perl script with multiple threads. It continually consumed memory, and finally the system ran out of memory and killed it. It seems that the sub-threads were detached but the system resources were not released when they finished. I am pretty new to Perl and couldn't find which part went wrong. This is the part of the script that may cause the problem. Could anyone help me with this?
use strict;
use warnings;
print "different number:\t";
my $num1=<>;
chomp $num1;
if($num1!~/[1 2 3 4 5]/)
{
    print "invalid input number\n";
    END;
}
my $i=0;
my $no;
my @spacer1;
my $nn;
my @spacer2;
open IN, "file1.txt" or die "$!";
while(<IN>)
{
    chomp;
    if($_=~ /^>((\d)+)\|((\d)+)/)
    {
        $no=$1;
        $spacer1[$no][0]=$3;
    }
    else
    {
        $spacer1[$no][1]=$_;
    }
}
close IN;
open IN, "file2.txt" or die "$!";
while(<IN>)
{
    chomp;
    if($_=~ /^>((\d)+)\|((\d)+)/)
    {
        $nn=$1;
        $spacer2[$nn][0]=$3;
    }
    else
    {
        $spacer2[$nn][1]=$_;
    }
}
close IN;
#-----------------------------------------------------------------#create threads
use subs qw(sg_ana);
use threads;
use Thread::Semaphore;
my $cycl=(int($no/10000))+1;
my $c;
my @thd;
my $thread_limit= Thread::Semaphore -> new (3);
foreach $c(1..$cycl)
{
    $thread_limit->down();
    $thd[$c]=threads->create("sg_ana",$c-1,$c,$num1);
    $thd[$c]->detach();
}
&waitquit;
#-------------------------------------------------------------#limit threads num
sub waitquit
{
    print "waiting\n";
    my $num=0;
    while($num<3)
    {
        $thread_limit->down();
        $num++;
    }
}
#---------------------------------------------------------------#alignment
my $n;
my $n1;
my $j;
my $k;
my $l;
my $m;
my $num; #number of match
my $num2=0; #arrange num
sub sg_ana
{
    my $c1=shift;
    my $c2=shift;
    $num1=shift;
    open OUT,">$num1.$c2.txt" or die "$!";
    if($num1==1)
    {
        foreach $n($c1*10000..$c2*10000-1)
        {
            if($spacer2[$n][1])
            {
                my $presult1;
                my $presult2;
                $num2=-1;
                foreach $i(0..19)
                {
                    $num=0;
                    $num2++;
                    my $tmp1=(substr $spacer2[$n][1],0,$i)."\\"."w".(substr $spacer2[$n][1],$i+1,19-$i);
                    foreach $n1(0..@spacer1-1)
                    {
                        if($spacer1[$n1][1])
                        {
                            my $tmp2=substr $spacer1[$n1][1],0,20;
                            if($tmp2=~/$tmp1/)
                            {
                                $num++;
                                $presult1.=$n1.",";
                            }
                        }
                    }
                    $presult2=$i+1;
                    if($num>=4)
                    {
                        print OUT "\n";
                    }
                }
            }
        }
    }
    close OUT;
    $thread_limit->up();
}
Rule one of debugging Perl is to enable use strict; and use warnings; and then sort out the errors. Actually, you should probably do that first of all, before you even start writing code.
You're creating and limiting threads via a semaphore, but this is really inefficient because of how Perl does threads: they aren't lightweight, so spawning loads of them is a bad idea. A better way of doing this is via Thread::Queue, a bit like this.
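As a rough sketch of that approach (illustrative only; the worker count and queue contents are made up, not taken from the program above), a fixed pool of workers drains a shared queue instead of one thread being spawned per chunk:

```perl
#!/usr/bin/perl
# Minimal Thread::Queue worker-pool sketch.  A small fixed pool of
# threads pulls work units from a shared queue until it is ended.
use strict;
use warnings;
use threads;
use Thread::Queue;

my $work_q = Thread::Queue->new();

sub worker {
    # dequeue() returns undef once the queue has been end()ed and drained
    while ( defined( my $chunk = $work_q->dequeue() ) ) {
        print threads->self->tid, ": processing chunk $chunk\n";
    }
}

my @workers = map { threads->create( \&worker ) } 1 .. 3;

$work_q->enqueue($_) for 1 .. 10;    # queue up the work units
$work_q->end();                      # signal that no more work is coming

$_->join() for @workers;             # wait for the pool to drain
```

The pool size stays fixed at three threads however many work units there are, which is the whole point of the queue over the semaphore-gated spawn loop.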
Please use the three-arg open and lexical filehandles, e.g. open ( my $out, '>', "$num1.$c2.txt" ) or die $!;. You're probably getting away with it here, but you have OUT as a global-namespace filehandle being used by multiple threads. That way lies dragons.
Don't use single-letter variables. Given how you use $c, you'd be far better off with:
foreach my $value ( 1..$cycl ) {
    ## do stuff
}
The same is true of all your other single-letter variables: they're not meaningful.
You pass $num before it's initialised, so it's always going to be undef within your sub. So your actual subroutine is just:
sub sg_ana
{
    my $c1=shift;
    my $c2=shift;
    $num1=shift;
    open OUT,">$num1.$c2.txt" or die "$!";
    close OUT;
    $thread_limit->up();
}
Looking at it - I think you may be trying to do something with a shared variable there, but you're not actually sharing it. I can't decode the logic of your program though (thanks to having a load of single letter variables most likely) so I can't say for sure.
You're calling a subroutine as &waitquit;. That's not good style: prefixing with an ampersand and supplying no arguments does something subtly different from just invoking the sub 'normally', so you should avoid it.
Don't instantiate your semaphore like this:
my $semaphore=new Thread::Semaphore(3);
That's an indirect object call, and bad style. It would be better written as:
my $thread_limit = Thread::Semaphore->new( 3 );
I would suggest that rather than using semaphores like that, you'd be far better off not detaching your threads and just using join. You also don't need an array of threads: threads->list does that for you.
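That is, something along these lines (a sketch only; the stand-in sg_ana and the loop bound are placeholders, not the real workload):

```perl
#!/usr/bin/perl
# Sketch: create threads without detaching, then join everything that is
# still outstanding via threads->list, with no @thd bookkeeping array.
use strict;
use warnings;
use threads;

sub sg_ana {
    my ( $c1, $c2 ) = @_;
    return;    # stand-in for the real per-chunk work
}

my $cycl = 3;
for my $c ( 1 .. $cycl ) {
    threads->create( \&sg_ana, $c - 1, $c );
}
$_->join for threads->list;    # waits for every outstanding thread
```

Because the threads are joined rather than detached, their resources are reclaimed deterministically, which addresses the leak described in the question.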
I can't reproduce your problem, because your sub isn't doing anything. Have you by any chance modified it for posting? A classic reason for Perl memory exhaustion when threading is that each thread clones the parent process, so 100 threads is 100x the memory.
I am trying to optimize code that loads a file and parses the data in Perl.
Background
The data ends up in a complicated object tree.
The top level object is a blessed package.
Some of the nested objects are created with new as instances of another blessed package called Item.
The first pass separates the binary data into units, and segments within a unit, which are all stored in multiple arrays.
There could be 20 or 50 units with 8 segments each.
The second pass performs the decoding of the binary data and is what needs optimized for speed.
Approach with threads
I am trying to use the modules threads and threads::shared.
I want each thread to process a subset of the units and populate the data into a common object tree.
I am looking for some sample code that demonstrates how to share blessed objects that may be allocated and blessed from any thread context and inserted into a shared object tree. And then is accessible from the main thread to walk the data for data lookups. The decoding threads will return once the decoding is completed.
I am having difficulty seeing how to have multiple threads insert objects into a common object tree using the threads::shared module, in particular when objects of Item type are blessed from a thread context. As I understand it, the class (package) methods won't be bound to the object.
I do realize that at certain points in the code the code will need to use the threads::shared::lock() function before adding objects to the object tree.
In particular the nested blessed Item objects would be allocated from each thread context.
The threads::shared documentation says "Note that it is often not wise to share an object unless the class itself has been written to support sharing".
Is there a sample code that demonstrates how to accomplish this?
The documentation also says "object's destructor may get called multiple times, one for each thread's scope exit". How is this handled properly?
Thanks
J.R.
OK, so backtracking a bit: threads::shared really only handles simple data structures and doesn't really support more complicated things. That's because when you 'thread' you actually create separate program instances with (some) shared memory space, but practically speaking each 'thread' is a separate program anyway.
So, supporting sharing in an object gets really quite messy. I've found a better approach is to ... not. Use Thread::Queue to pass data between threads, and have one thread that acts to collate results. If you need to pass more complicated data structures, you can use Storable and freeze/thaw to serialise the object, and enqueue it.
That way you don't have to worry about tripping over shared nested data structures - and it's quite likely that you will, because there's no such thing as a 'deep share' option on an object - you have to explicitly share every internal array/hash(reference).
So I would tackle it like this:
#!/usr/bin/perl
use strict;
use warnings;

package Test_Object;

sub new {
    my ( $class, $id ) = @_;
    my $self = {};
    $self->{id} = $id;
    bless $self, $class;
    return $self;
}

sub set_result {
    my ( $self, $result_code ) = @_;
    $self->{result} = $result_code;
}

sub get_id {
    my ($self) = @_;
    return $self->{id};
}

sub get_result {
    my ($self) = @_;
    return $self->{result};
}

package main;

use strict;
use warnings qw/ all /;

use threads;
use Thread::Queue;
use Storable qw/ freeze thaw /;

my $work_q   = Thread::Queue->new();
my $result_q = Thread::Queue->new();

sub worker {
    my $tid = threads->self->tid;
    print "$tid: starting\n";
    while ( my $item = $work_q->dequeue() ) {
        my $object = thaw($item);
        print "$tid: got object with ID of ", $object->get_id, "\n";
        $object->set_result( $object->get_id . " : $tid" );
        $result_q->enqueue( freeze $object );
    }
}

sub collator {
    while ( my $result = $result_q->dequeue ) {
        my $object = thaw $result;
        print "Collator got object with result code of ", $object->get_result, "\n";
    }
    ## do something with collated wossnames - pass back to main maybe?
}

my @workers;
for ( 1 .. 5 ) {
    my $thr = threads->create( \&worker );
    push @workers, $thr;
}
my $collator = threads->create( \&collator );

for ( 1 .. 200 ) {
    my $work_object = Test_Object->new($_);
    $work_q->enqueue( freeze $work_object );
}
$work_q->end;

foreach my $thr (@workers) {
    $thr->join;
}
$result_q->end;
foreach my $thr ( threads->list ) {
    $thr->join;
}