What I want to do seems simple, but I don't know if the TCL interpreter has this functionality.
I have a Tcl script that will have thousands of variables defined prior to running within its scope -- this is done by a pre-script that simply does a "global" on the thousands of variables to bring them into the current scope.
Is there an easy way to determine which of those thousands of variables were actually used during that script?
For instance, if the script has variables a, b, c, d, e, but only variable e was accessed (whether modified or just used), I would like to know.
You can use Tcl's trace capability to keep track of variable access.
Something like:
# at the end of the pre-script:
array set var_stats {}

proc track_var {varname n1 n2 op} {
    global var_stats
    incr var_stats($varname.$op)
}

foreach var $list_of_varnames {
    foreach op {array read write unset} {
        set var_stats($var.$op) 0
        trace add variable $var $op [list track_var $var]
    }
}
The code above will increment the appropriate stats (array, read, write and unset) for the variables when they are accessed. At the end of the script just dump the array with either an array get or a parray.
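For example, a minimal dump at the end of the run might look like this (var_stats and the per-operation keys match the snippet above):
# Pretty-print the per-variable, per-operation counters.
parray var_stats
# Or walk the stats programmatically:
foreach {key count} [array get var_stats] {
    puts "$key was hit $count times"
}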
Updated answer:
I just reread your question and realized that if you just want to know which variables were accessed, then there is a simpler way to do it:
array set var_stats {}

proc track_var {varname n1 n2 op} {
    global var_stats
    set var_stats($varname) 1
}

foreach var $list_of_varnames {
    trace add variable $var {array read write unset} [list track_var $var]
}
Then at the end of the script just do an array names to get a list of all variables accessed.
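For instance, assuming the tracking array is still called var_stats:
# Every name recorded in var_stats was accessed at least once.
puts "accessed variables: [lsort [array names var_stats]]"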
Related
In my Tcl/Tk project, I need to allow my users to mangle a string in a well-defined way.
The idea is to allow people to declare a "string mangling" proc/expr/function/... in a configuration file, which then gets applied to the strings in question.
I'm a bit worried about how to properly implement that.
Possibilities I have considered so far:
regular expressions
That was my first thought, but there are two caveats:
search/replace with regular expressions in Tcl seems to be awkward. At least with regsub I need to pass the match and replacement parts separately (as opposed to how e.g. sed allows me to pass a single complicated string that does everything for me); there are sed implementations for Tcl, but they look naive and might break sooner rather than later
also, regexes can be awkward by themselves; using them to mangle complicated strings is often more complicated than it should be
procs?
Since the target platform is Tcl anyhow, why not use the power of Tcl to do string mangling?
The "function" should have a single input and produce a single output, and ideally it the user should be nudged into doing it right (e.g. not being able to define a proc that requires two arguments) and it be (nigh) impossible to create side-effects (like changing the state of the application).
A simplistic approach would be to use proc mymangler s $body (with $body being the string defined by the user), but there are so many things that can go wrong:
$body assuming a different arg-name (e.g. $x instead of $s)
$body not returning anything
$body changing variables,... in the environment
Expressions look more like it (always returning things, not easily allowing modification of the environment), but I cannot make them work on strings, and there's no way to pass a variable without agreeing on its name.
So, the best I've come up with so far is:
set userfun {return $s}   ;# user-defined string
proc mymangler s ${userfun}
set output [mymangler $input]
Are there better ways to achieve user-defined string-manglers in Tcl?
You can use apply -- the user provides a 2-element list: the second element is the "proc body", the code that does the mangling; the first element is the name of the variable that holds the string and is used in the body.
For example:
set userfun {{str} {string reverse $str}}
set input "some string"
set result [apply $userfun $input] ;# => "gnirts emos"
Of course, the code you get from the user is arbitrary Tcl code. You can run it in a safe interpreter:
set userfun {{str} {exec some malicious code; return [string reverse $str]}}
try {
    set interp [safe::interpCreate]
    set result [$interp eval [list apply $userfun $input]]
    puts "mangled string is: $result"
    safe::interpDelete $interp
} on error e {
    error "Error: $e"
}
results in
Error: invalid command name "exec"
Notes:
a standard Tcl command is used, apply
the user must specify the variable name used in the body.
this scheme does protect the environment:
set userfun {{str} {set ::env(SOME_VAR) "safe slave"; return $str$str}}
set env(SOME_VAR) "main"
puts $env(SOME_VAR)
try {
    set interp [safe::interpCreate]
    set result [$interp eval [list apply $userfun $input]]
    puts "mangled string is: $result"
    safe::interpDelete $interp
} on error e {
    error "Error: $e"
}
puts $env(SOME_VAR)
outputs
main
mangled string is: some stringsome string
main
if the user does not return a value, then the mangled string is simply the empty string.
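For instance, a sketch of a hypothetical body whose last command returns nothing:
# puts returns the empty string, so the "mangled" result is empty as well.
set userfun {{str} {puts "saw: $str"}}
puts "result: >[apply $userfun "abc"]<"   ;# => result: ><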
The "simplistic" approach is like foreach in that it requires the user to supply a variable name and a script to evaluate that uses that variable, and is a good approach. If you don't want it affecting the rest of the program, run it in a separate interpreter:
set x 0
proc mymangler {name body} {
    set i [interp create -safe]
    set s "some string to change"
    try {
        # Build the lambda used by apply here instead of making
        # the user do it.
        $i eval [list apply [list $name $body] $s]
    } on error e {
        return $e
    } finally {
        interp delete $i
    }
}
puts [mymangler s { set x 1; string toupper $s }]
puts $x
outputs
SOME STRING TO CHANGE
0
If the person calling this says to use s as a variable and then uses something else in the body, it's on them. Same with providing a script that doesn't return anything.
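For example, a hypothetical mismatch where the caller declares s but the body reads x; the error message simply comes back as the result:
# The body reads $x, which is never set inside the lambda, so apply fails
# and mymangler returns the error text instead of a mangled string.
puts [mymangler s { string toupper $x }]   ;# prints something like: can't read "x": no such variable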
I'd generally allow the user to specify a command prefix as a Tcl list (most simple command names are trivially suitable for this), which you would then apply to the argument by doing:
set mangled [{*}$commandPrefix $valueToMangle]
This lets people provide pretty much anything they want, especially as they can use apply and a lambda term to mangle things as required. Of course, if you're in a procedure then you're probably actually better off doing:
set mangled [uplevel 1 [list {*}$commandPrefix $valueToMangle]]
so that you're running in the caller's context (change 1 to #0 to use the global context instead), which can help protect your procedure against accidental changes and make using upvar within the mangler easier.
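For example, two sketches of command prefixes a user might supply (the prefixes themselves are just illustrative assumptions):
# A plain command name works as a prefix:
set commandPrefix {string toupper}
set mangled [{*}$commandPrefix "hello"]    ;# => HELLO
# So does apply with a lambda term for more involved mangling:
set commandPrefix [list apply {{x} {string reverse [string trim $x]}}]
set mangled [{*}$commandPrefix "  hello  "]    ;# => olleh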
If the source of the mangling prefix is untrusted (what that means depends greatly on your application and deployment) then you can run the mangling code in a separate interpreter:
# Make the safe evaluation context; this is *expensive*
set context [interp create -safe]
# You might want to let them define extra procedures too
# interp invokehidden $context source /the/users/file.tcl
# Use the context
try {
    set mangled [interp eval $context [list {*}$commandPrefix $valueToMangle]]
} on error {msg} {
    # User supplied something bad; error message in $msg
}
There's various ways to support users specifying the transformation, but if you can expose the fact that you're working with Tcl to them then that's probably easiest and most flexible.
We need to get the value of dynamically constructed variables.
What I mean is we have a variable loaded from a property file called data8967677878788node. So when we run echo $data8967677878788node we get the output test.
Now in data8967677878788node the number part 8967677878788 needs to be dynamic. That means there could be variables like
data1234node
data346346367node
and such.
The number is an input argument to the script. So we need something like this to work
TESTVAR="data`echo $DATANUMBER`node"
echo $$TESTVAR #This line gives the value "test"
Any idea on how this can be accomplished?
You can use BASH's indirect variable expansion:
data346346367node='test'
myfunc() {
    datanumber="$1"
    var1="data${datanumber}node"
    echo "${!var1}"
}
And call it as:
myfunc 346346367
Output:
test
Your code is actually already pretty close to working; it just needs to be modified slightly:
TESTVAR="data`echo $DATANUMBER`node"
echo ${!TESTVAR}
If $DATANUMBER has the value 12345 and $data12345node has the value test then the above snippet will output test.
Source: http://wiki.bash-hackers.org/syntax/pe#indirection
I am generating strings with the names of existing variables. I want to use the strings to create a variable set to the VALUE of the existing variable, but I can't figure out how to achieve this.
Put another way if this helps:
A calling routine sends strings "abc", "cde", etc. Each string is the first several characters of a path variable I've already set. I then append "path" to the passed string to create the full name of the existing variable (e.g., %abcpath%). Now I want to get the value of %abcpath% and put it into a variable I can use in the current routine.
Thanks for any help.
Here is part of the code I have:
SET abcPath=c:\path_to_abc_dir
SET cdePath=c:\path_to_cde_dir

call :names abc cde ...

:names
For %%G in (%*) do (
    set name=%%G
    :: Append "path" to name from calling routine
    set namepath=!name!path
    echo "!namepath!"
    :: 1st time through namepath is "abcPath"
    :: How to now set a var to the VALUE of %abcPath% set above?
    ::these don't work:
    set dirpath=%%namepath%%
    set dirpath=!%%namepath%%!
    set dirpath=!%namepath%%%amepath%%!
    set dirpath=!!name!path:%dirpath%=%%dirpath%%!
    ::I want to do things with %dirpath% in this routine:
    if not "!dirpath!"=="" (
        cd !dirpath!
        :: call subroutine to get the number of files in the directory
        call :forhere
        do other stuff with var dirpath ...
    )
)
....
::these don't work:
set dirpath=%%namepath%%
^^........^^ Not a valid variable reference
set dirpath=!%%namepath%%!
^^........^^ Not a valid variable reference
set dirpath=!%namepath%%%amepath%%!
^........^ This has been parsed at start and has no value
set dirpath=!!name!path:%dirpath%=%%dirpath%%!
^^ ^..........................^ two "variables" start and end
Delayed expansion over a value obtained with delayed expansion does not have an obvious syntax, because it does not exist. It cannot be done directly, and other commands need to be used.
....
set "name=%%G"
set "namepath=%%Gpath"
call set "dirpath=%%!namepath!%%"
echo !dirpath!
....
Why or how does it work?
When the line is parsed, the only variable referenced is namepath, with delayed expansion. The double percent signs are an escaped percent sign. So the line is translated into
call set "dirpath=%abcpath%"
Now, the call is executed, generating a second parse of the line, obtaining the correct value
This can also be done as
for %%a in ("!namepath!") do set "dirpath=!%%~a!"
In this case, the value inside the namepath variable is stored in the for replaceable parameter and then used to obtain the value to assign to the dirpath variable.
In both cases, two "parse" (in the logic sense) operations are done.
In the first solution the first parse extracts the value of namepath and the second parse (invoked by call command execution) uses this value as a variable name.
In the second solution, we first get the value inside namepath (first "parse") and then this value is used in a new delayed expansion operation to retrieve the value to assign to dirpath
I am currently using TCOM to work on excel using TCL. I have 2 excel sheet files. What i need to do is compare the two files for differences and list them out in txt file/excel.
I would like to know whether this comparison between two excel files can be done using tcl/tcom.
If you are working in a Linux environment you could use some shell commands to help you out.
I think the best way to quickly process their data would be through CSV files. You could use:
exec sort -o sheet1.csv sheet1.csv
exec sort -o sheet2.csv sheet2.csv
# diff exits non-zero when the files differ, so run it under catch
catch {exec diff sheet1.csv sheet2.csv} diff
Edit, since the above isn't pure Tcl:
Let's say both of your CSV files look like this:
sheet1.csv -> a,b,c,d
sheet2.csv -> a,d,c,e
You can load these files by passing them as arguments to your Tcl script:
myTclFile sheet1.csv sheet2.csv
Inside your Tcl script you can take the file names from argv and read their contents:
set list1 [read [open [lindex $argv 0] r]]
set list2 [read [open [lindex $argv 1] r]]
It is good practice to check that the input files exist before opening them.
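A minimal sketch of such a check, reusing the argv handling above (it would go before the open calls):
# Bail out early if the two CSV file names are missing or do not exist.
if {$argc != 2 || ![file exists [lindex $argv 0]] || ![file exists [lindex $argv 1]]} {
    puts stderr "usage: myTclFile sheet1.csv sheet2.csv"
    exit 1
}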
If the order of the elements isn't important, only the fact that the files contain the same data, you could use lsort. Either way, to turn each file's contents into an actual list of elements instead of one big string, use split (a string trim first drops the trailing newline):
set list1 [split [string trim $list1] ","]
set list2 [split [string trim $list2] ","]
Then you can iterate over these lists the way you want. My suggestion would be to use foreach. This would go more or less like this (for example, if you wanted to compare the whole lists):
set diff {}
foreach element $list1 {
    set hasMatch 0
    foreach element2 $list2 {
        if {$element == $element2} {
            set hasMatch 1
            break
        }
    }
    if {!$hasMatch} {
        lappend diff $element
    }
}
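For what it's worth, the inner loop can also be replaced with lsearch -exact, which is the more common Tcl idiom for a membership test:
# Collect every element of list1 that has no exact match in list2.
set diff {}
foreach element $list1 {
    if {[lsearch -exact $list2 $element] == -1} {
        lappend diff $element
    }
}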
I have a manual list I created in a macro in Stata, something like
global list1 "a b c d"
which I later iterate through with something like
foreach name in $list1 {
    action
}
I am trying to change this to a DB-driven list because the list is getting big and changing quickly. I create a new $list1 with the following commands:
odbc load listitems=items, exec("SELECT items from my_table")
levelsof listitems
global list1=r(levels)
The items in each are the same, but this list seems to be different, and when I have too many items it breaks in the foreach loop with the error
{ required
r(100);
Also, when I run only levelsof listitems I get the output
`"a"' `"b"' `"c"' `"d"'
which looks a little bit different from the other macros.
I've been stuck on this for a while. Again, it only fails when the number of items becomes large (over 15); any help would be much appreciated.
Solution 1:
levelsof listitems, clean local(list1)
foreach name of local list1 {
    ...action with `name'...
}
Solution 2:
levelsof listitems, clean
global list1 `r(levels)'
foreach name of global list1 {
    ...action with `name'...
}
Explanation:
When you type
foreach name in $list1 {
then whatever is in $list1 gets substituted inline before Stata ever sees it. If global macro list1 contains a very long list of things, then Stata will see
foreach name in a b c d e .... very long list of things here ... {
It is more efficient to tell Stata that you have a list of things in a global or local macro, and that you want to loop over those things. You don't have to expand them out on the command line. That is what
foreach name of local list1 {
and
foreach name of global list1 {
are for. You can read about other capabilities of foreach in -help foreach-.
Also, you originally coded
levelsof listitems
global list1=r(levels)
and you noted that you saw
`"a"' `"b"' `"c"' ...
as a result. Those are what Stata calls "compound quoted" strings. A compound quoted string lets you effectively nest quoted things. So, you can have something like
`"This is a string with `"another quoted string"' inside it"'
You said you don't need that, so you can use the "clean" option of levelsof to not quote up the results. (See -help levelsof- for more info on this option.) Also, you were assigning the returned result of levelsof (which is in r(levels)) to a global macro afterward. It turns out -levelsof- actually has an option named -local()- where you can specify the name of a local (not global) macro to directly put the results in. Thus, you can just type
levelsof listitems, clean local(list1)
to both omit the compound quotes and to directly put the results in a local macro named list1.
Finally, if you for some reason don't want to use that local() option and want to stick with putting your list in a global macro, you should code
global list1 `r(levels)'
rather than
global list1=r(levels)
The distinction is that the latter treats r(levels) as a function and runs it through Stata's string expression parser. In Stata, strings (strings, not macros containing strings) have a limit of 244 characters. Macros containing strings on the other hand can have thousands of characters in them. So, if r(levels) had more than 244 characters in it, then
global list1=r(levels)
would end up truncating the result stored in list1 at 244 characters.
When you instead code
global list1 `r(levels)'
then the contents of r(levels) are expanded in-line before the command is executed. So, Stata sees
global list1 a b c d e ... very long list ... x y z
and everything after the macro name (list1) is copied into that macro name, no matter how long it is.