Expect/Tcl: 'string map' removing content in string?

SETUP: Expect/Tcl Script in Linux
USE CASE:
Using expect to wait for the report of some status to be used in a $user_command.
expect -re "notify (.+)\n"
set status $expect_out(1,string)
send [string map [list STATUS "$status"] "$user_command"]
So when the application sends "notify running", then status is set to running.
For that a keyword STATUS in $user_command needs to be replaced with $status, such that, for example
"log STATUS to file"
becomes
"log running to file"
To see what is happening, I wrote
expect_tty -re "(.+)\n"
set status $expect_out(1,string)
send_user [string map [list STATUS "$status"] "$user_command"]
which works fine when run in isolation. The output is
log someUserInput to file
when typing someUserInput in response to expect_tty. However, as part of a larger script, the string map command removes everything before the replaced string, so that the output becomes
" to file"
(without a newline). I checked that the variable names in the script are unique, so that is not the issue.
QUESTION:
What is going on here? How can I make the script robust?

The string map command is exactly deterministic. At each character of its input string, in order, it considers whether any of the from strings in the mapping list match, in order, and if so it performs the replacement (with the paired to string) and goes on to consider the character immediately after the replaced substring. (The empty string is a special case: it's never matched.) The code to implement it is really quite stupid, but happens to be very cache friendly on modern computers so it's still very fast; more sophisticated and supposedly “faster” implementations have been tried, but found to be slower in practice with the kinds of maps usually encountered in the wild.
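As a rough illustration of that scan, here is a small Python model of the behaviour described above. Python is used only because it is easy to run anywhere; string_map is a hypothetical helper written for this sketch, not Tcl's actual implementation.

```python
def string_map(mapping, text):
    """Model of Tcl's `string map`; mapping is a flat [from, to, from, to, ...] list."""
    pairs = list(zip(mapping[0::2], mapping[1::2]))
    out = []
    i = 0
    while i < len(text):
        for src, dst in pairs:
            # Empty `from` strings never match; the first matching pair in list order wins.
            if src and text.startswith(src, i):
                out.append(dst)
                i += len(src)  # continue right after the replaced substring
                break
        else:
            # No pair matched at this position: copy one character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(string_map(["STATUS", "running"], "log STATUS to file"))
print(string_map(["abc", "1", "ab", "2", "a", "3", "1", "0"], "1abcaababcabababc"))
```

The second call mirrors the example in the Tcl manual page: the earlier pair abc is preferred over ab and a because it comes first in the list, and the 1 produced by a replacement is never mapped again.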
If the replacement is failing to apply, it's usually because the input string is not quite what you expect. In most programs this is rare, but it's more common with expect programs because the output of the terminal emulation engine inside them can include metacharacters for things like moving the cursor around and changing the color. (Often the easiest fix for that is to tell the spawned program that its terminal type is one that doesn't support such complex features, perhaps by setting the TERM environment variable to dumb.)
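If changing TERM is not an option, a different technique (not the one suggested above) is to look at the captured text in escaped form and clean it yourself. A minimal Python sketch, assuming the noise consists of ANSI CSI sequences such as colour codes; the sample string and the helper name strip_csi are made up for illustration:

```python
import re

# CSI sequences: ESC [ , parameter bytes, intermediate bytes, one final byte.
CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_csi(text):
    """Remove ANSI CSI escape sequences (colours, cursor movement) from text."""
    return CSI.sub("", text)


raw = "\x1b[32mrunning\x1b[0m"  # hypothetical captured status with colour codes
print(repr(raw))                # the escaped form reveals the hidden characters
print(repr(strip_csi(raw)))
```

Printing the value in escaped form is usually the quickest way to find out that the input is not what it looks like on screen.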

Thanks to @glennjackman:
The problem is that applications report a newline as \r\n, so that (.+)\n in
expect_tty -re "(.+)\n"
captures in subexpression 1 (the part inside the parentheses) something that includes \r at the end. Because \r moves the cursor back to the start of the line, the text printed after it overwrites what came before, so string map only seems to cut everything before the replaced string. The solution is to expect something that excludes \r, i.e.
expect -re "(\[^\r\n\]+)"
which collects everything up to the end of the line, whatever the line-ending format may be.
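The effect is easy to reproduce outside Expect. The sketch below uses Python's re module rather than Tcl's regular expressions (the two engines differ in details, but both let . match \r), and the application output is a made-up sample:

```python
import re

app_output = "notify running\r\n"  # hypothetical output with a CRLF line ending
user_command = "log STATUS to file"

# Pattern from the question: the group swallows the trailing carriage return.
loose = re.search(r"notify (.+)\n", app_output).group(1)

# Pattern from the answer: stop at the first \r or \n.
strict = re.search(r"notify ([^\r\n]+)", app_output).group(1)

broken = user_command.replace("STATUS", loose)
fixed = user_command.replace("STATUS", strict)

print(repr(broken))  # the text before the replacement is still there, followed by \r
print(repr(fixed))
```

Nothing is removed from the string in the broken case; the embedded \r only changes how the terminal displays it.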

Related

using special characters in parameters and variables in batch without external file use

Before you go marking this as a duplicate hear me out
My question;
A: has different requirements than all the others (which are basically "what's an escape character?"), including not having to use an external file to enter parameters to functions
B: questions the existence of this mess rather than accepting 'no' or 'it's complicated' as the answer
C: understands that there are escape characters that already exist and ways to work around that
D: comes from a different skill level and isn't a 2-7 years old question
E: requires the use of quotes rather than something like [ because quotes are the only thing that works with spaced strings
Also before y'all say I didn't try stuff
I read these (all of it including comments and such):
Batch character escaping
http://www.robvanderwoude.com/escapechars.php
https://blogs.msdn.microsoft.com/oldnewthing/20091029-00/?p=16213
using batch echo with special characters
Escape angle brackets in a Windows command prompt
Pass, escape and recognize Special Character in Windows Batch File
I didn't understand that all fully, because to fully understand all that I'd have to be much better at batch but here is what I gleaned:
So I understand there's a whole table of escape sequences, that ^ is the most used one, and that you can use delayed expansion for this task so that the characters don't get parsed immediately, just 'expanded' at runtime. But enabling delayed expansion doesn't always work, because with pipe characters the other files/things being piped to/from don't inherit the expansion status, so
A: you have to enable it there too
B: that forces you to use that expansion
C: it requires multiple escape characters for each parsing pass of the CLI which apparently is hard to determine and ugly to look at.
This all seems rather ridiculous, why wasn't there some sort of creation to set a string of odd inputs to literal rather than process the characters. Why wasn't it just a simple flag upon some super duper special character (think alt character) that would almost never appear unless you set the font to wingdings. Why does each parsing pass of pipe characters remove the escape characters? That just makes everything insane because the user now has to know how many times that string is used. Why hasn't a tool been developed to auto scan through odd inputs and auto escape them? We have a table of the rules, is it really that hard? Has it been done already? What would it require that's 'hard'?
Down the rabbit hole we go
How did I get here you ask? Well it all started when I made a simple trimming function and happened upon one of the biggest problems in batch, escaping characters when receiving inputs. The problem is a lot of my inputs to my trimming function had quotes. Quotes are escaped by using "" in place of " so something like
::SETUP
::the parentheses just delineate where the value goes; they aren't
::actually in the code
set "var=(stuff goes here)"
call :TrimFunc "%var%",var
:TrimFunc
::the + are just to display the spacing otherwise I can't tell
echo beginning param1 is +%~1+
::code goes here for func
goto :EOF
::END SETUP
::NOTE the + characters aren't part of the actual value, just the
::display when I run this function
set "var=""a"""
::got +"a"+
will work but
set "var="a "
::got +"a+
::expected +"a +
set "var="a ""
::got +"a+
::expected +"a "+
set "var="a " "
::got +"a+
::expected +"a " +
set "var="a"
::got +"a",var+
::expected +"a+
will not work as expected. oddly,
set "var="a""
::got +"a"+
seems to work despite not being escaped fully. Adding any spaces seems to disrupt this edge case.
Oddly enough I've tried doing:
set 'var="a"'
::got ++
::expected +"a"+
But I have no idea what changing ' to " actually does when it's the pair that contains the argument (not the ones that are supposed to be literal). I only tried it to see what would happen.
What I want:
Surely there must be some sort of universal escape character thing such that I can do this (assume the special character was *)
set *var=""something " "" " """*
call :TrimFunc "%var%",var
echo +%~1+
would net me
+""something " "" " """+
with no problems. In fact, why can't I have some universal escape character that can just be used to take in all the other characters inside it literally instead of the command line trying to process them? Perhaps I'm thinking about this wrong but this seems to be a recurring problem with weird inputs all over. I just want my variables, pipes and strings and all that to just STAY LITERAL WHEN THEY'RE SUPPOSED TO. I just want a way to have any input and not have any weird output; it should just treat everything literally until I want it not to because it was enclosed by the mystical super special character that I just invented in my mind.
::noob rant
I don't see why this was never a thing. What's preventing the makers of this highly useful language from simply creating a flag and some character that is never used to be the supremo escape character? Why do we need like 10 different ways of escaping characters? I should be able to programmatically escape inputs if necessary; it should NEVER be the user's job to escape their inputs, that's absolutely ridiculous and probably a violation of every good coding standard in existence.
::END noob rant
Anyways. I'd be happy to be enlightened as to why the above is or isn't a thing. I just want to be able to use quotes IN A STRING (kinda important) and I can't comprehend why it's not as simple as having one giant "treat these things as literals" flag that ALWAYS JUST WORKS(tm).
By the way, in case you're wondering why my function takes in the name of the variable it's writing to, I couldn't figure out how to get labels inside labels working without using delayed expansion. Using that means the variable I'm making is local not global so I use the name of the global variable to basically set it (using its name) to the local value on return like this:
endlocal & set "%~2=%String%"
Feel free to yell at me on various things because I am 99% certain I'm doing something horribly syntactically wrong, have some bad misunderstandings or am simply way too naive to truly understand the complexity of this problem but to me it seems amazingly superfluous.
Why can't the last quote be used like the special character it is while any preceding ones are taken literally (maybe depending upon a flag)?
for example
set "var="a ""
why doesn't the ending two quotes act specially and the ones in between act literally? Can the CLI not tell where the line ends? Can it not tell the difference between the first and last quotes and the ones in between? This seems simple to implement to me.
As long as I can echo things properly and save their literal value from parameter to variable I'm happy.
Firstly, I don't really understand why you would want to use set "var=something" instead of set var=something; it doesn't seem to make a difference.
Secondly, I have (recently? IDK.) come up with a method to deal with the annoying quotes. I hope this batch file inspires or helps you to do something similar.
@echo off
title check for length of string
color de
mode con: cols=90 lines=25
goto t
:t
set str=error
set /p str=Please enter four characters:
set len=0
goto sl
:sl
call set this=%%str:~%len%%%
if not "%this%" == "" (set /a len+=1 & rem debug" == "" (set /a len+=1
goto sl)
if not "%len%" == "4" (set n= not) else (set n=)
echo This is%n% a four character string.
pause >nul
exit
Which in your case:
if not "%var%" == "" (call :TrimFunc "%var%",var & rem debug" == "" (call :TrimFunc "%var%,var)
)
Hope that helps. Add oil~ (My computer doesn't support delayed expansion for some unknown reason. Therefore, most of the code is a bit clumsy.)
P.S.: If you are simply removing text and not replacing text, why not use set var=%var:string=%? If the string required is a variable too, then you can try this: call set var=%%var:%string%=%%

A way to prevent bash from parsing command line w/out using escape symbols

I'm looking for a way (other than ".", '.', \.) to use bash (or any other linux shell) while preventing it from parsing parts of command line. The problem seems to be unsolvable
How to interpret special characters in command line argument in C?
In theory, a simple switch would suffice (e.g. -x ... telling that the string ... won't be interpreted) but it apparently doesn't exist. I wonder whether there is a workaround, hack or idea for solving this problem. The original problem is a script|alias for a program taking youtube URLs (which may contain special characters (&, etc.)) as arguments. This problem is even more difficult: expanding "$1" while preventing shell from interpreting the expanded string -- essentially, expanding "$1" without interpreting its result.
Use a here-document:
myprogramm <<'EOF'
https://www.youtube.com/watch?v=oT3mCybbhf0
EOF
If you wrap the starting EOF in single quotes, bash won't interpret any special chars in the here-doc.
Short answer: you can't do it, because the shell parses the command line (and interprets things like "&") before it even gets to the point of deciding your script/alias/whatever is what will be run, let alone the point where your script has any control at all. By the time your script has any influence in the process, it's far too late.
Within a script, though, it's easy to avoid most problems: wrap all variable references in double-quotes. For example, rather than curl -o $outputfile $url you should use curl -o "$outputfile" "$url". This will prevent the shell from applying any parsing to the contents of the variable(s) before they're passed to the command (/other script/whatever).
But when you run the script, you'll always have to quote or escape anything passed on the command line.
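What the double quotes achieve can be seen from the receiving side: the program gets the value as one untouched argument. The Python sketch below skips the shell entirely by passing an argument list, and uses shlex.quote to show the quoting you would otherwise have to type. The "&t=10" suffix on the URL and the tiny child program are made up for illustration.

```python
import shlex
import subprocess
import sys

# Hypothetical URL containing a character the shell would normally interpret.
url = "https://www.youtube.com/watch?v=oT3mCybbhf0&t=10"

# The child just prints its first argument; no shell is involved in starting it.
child_code = "import sys; print(sys.argv[1])"
result = subprocess.run(
    [sys.executable, "-c", child_code, url],
    capture_output=True, text=True, check=True,
)
received = result.stdout.strip()

# The form a shell would need on its command line to deliver the same argument.
quoted = shlex.quote(url)

print(received)
print(quoted)
```

This does not remove the need to quote when typing the command at an interactive prompt; it only shows that once the shell is out of the picture, no escaping is needed.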
Your spec still isn't very clear. As far as I know the problem is you want to completely reinvent how the shell handles arguments. So… you'll have to write your own shell. The basics aren't even that difficult. Here's pseudo-code:
while true:
    print prompt
    read input
    command = (first input)
    args = (argparse (rest input))
    child_pid = fork()
    if child_pid == 0: // We are inside child process
        exec(command, args) // See variety of `exec` family functions in posix
    else: // We are inside parent process and child_pid is actual child pid
        wait(child_pid) // See variety of `wait` family functions in posix
Your question basically boils down to how that "argparse" function is implemented. If it's just an identity function, then you get no expansion at all. Is that what you want?
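To make the question concrete, here is a hedged Python sketch of one pass through that loop with two interchangeable "argparse" choices. All function names are invented for this sketch; subprocess.run stands in for the fork/exec/wait steps, and the example assumes a POSIX system with an echo executable on PATH.

```python
import shlex
import subprocess


def identity_argparse(rest):
    """Split on whitespace only; nothing is expanded, quoted or interpreted."""
    return rest.split()


def shell_like_argparse(rest):
    """Split roughly the way a POSIX shell tokenizes words, honouring quotes."""
    return shlex.split(rest)


def run_line(line, argparse=identity_argparse):
    """One iteration of the pseudo-code loop: parse the line, run the command, wait."""
    command, _, rest = line.strip().partition(" ")
    args = argparse(rest)
    # subprocess.run with a list starts the program directly, without a shell.
    return subprocess.run([command, *args], capture_output=True, text=True)


result = run_line("echo https://example.invalid/watch?v=abc&t=1 $HOME *")
print(result.stdout, end="")
print(identity_argparse("'a b' c"), shell_like_argparse("'a b' c"))
```

With the identity-style parser, &, $HOME and * reach the program literally, but there is also no way left to pass an argument that contains a space, which is the trade-off the question has to settle.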

Decrypt obfuscated perl script

Had some spam issues on my server and, after finding out and removing some Perl and PHP scripts I'm down to checking what they really do, although I'm a senior PHP programmer I have little experience with Perl, can anyone give me a hand with the script here:
http://pastebin.com/MKiN8ifp
(It was one long line of code, script was called list.pl)
The start of the script is:
$??s:;s:s;;$?::s;(.*); ]="&\%[=.*.,-))'-,-#-*.).<.'.+-<-~-#,~-.-,.+,~-{-,.<'`.{'`'<-<--):)++,+#,-.{).+,,~+{+,,<)..})<.{.)-,.+.,.)-#):)++,+#,-.{).+,,~+{+,,<)..})<*{.}'`'<-<--):)++,+#,-.{).+:,+,+,',~+*+~+~+{+<+,)..})<'`'<.{'`'<'<-}.<)'+'.:*}.*.'-|-<.+):)~*{)~)|)++,+#,-.{).+:,+,+,',~+*+~+~+{+<+,)..})
It continues with precious few non-punctuation characters until the very end:
0-9\;\\_rs}&a-h;;s;(.*);$_;see;
Replace the s;(.*);$_;see; with print to get this. Replace s;(.*);$_;see; again with print in the first half of the payload to get this, which is the decryption code. The second half of the payload is the code to decrypt, but I can't go any further with it, because as you see, the decryption code is looking for a key in an envvar or a cookie (so that only the script's creator can control it or decode it, presumably), and I don't have that key. This is actually reasonably cleverly done.
For those interested in the nitty gritty... The first part, when de-tangled looks like this:
$? ? s/;s/s;;$?/ :
s/(.*)/...lots of punctuation.../;
The $? at the beginning of the line is the pre-defined variable containing the child error, which no doubt serves only as obfuscation. It will be undefined, as there can be no child error at this point.
The questionmark following it is the start of a ternary operator
CONDITION ? IF_TRUE : IF_FALSE
Which is also added simply to obfuscate. The expression returned for true is a substitution regex, where the / slash delimiter has been replaced with colon s:pattern:replacement:. Above, I have put back slashes. The other expression, which is the one that will be executed is also a substitution regex, albeit an incredibly long one. The delimiter is semi-colon.
This substitution replaces .* in $_ - the default input and pattern-searching space - with a rather large amount of punctuation characters, which represents the bulk of the code. Since .* matches any string, even the empty string, it will simply get inserted into $_, and is for all intents and purposes identical to simply assigning the string to $_, which is what I did:
$_ = q;]="&\%[=.*.,-))'-,-# .......;;
The following lines are a transliteration and another substitution. (I inserted comments to point out the delimiters)
y; -"[%-.:<-#]-`{-}#~\$\\;{\$()*.0-9\;\\_rs}&a-h;;
#^ ^ ^ ^
#1 2 3
(1,2,3 are delimiters, the semi-colon between 2 and 3 is escaped)
The basic gist of it is that various characters and ranges -" (space to double quote), and something that looks like character classes (with ranges) [%-.:<-#], but isn't, get transliterated into more legible characters e.g. curly braces, dollar sign, parentheses,0-9, etc.
s;(.*);$_;see;
The next substitution is where the magic happens. It is also a substitution with obfuscated delimiters, but with three modifiers: see. s does nothing in this case, as it only allows the wildcard character . to match newline. ee means to evaluate the expression twice, however.
In order to see what I was evaluating, I performed the transliteration and printed the result. I suspect that I somewhere along the line got some characters corrupted, because there were subtle errors, but here's the short (cleaned up) version:
s;(.*);73756220656e6372797074696f6e5f6 .....;; # very long line of alphanumerics
s;(..);chr(hex($1));eg;
s;(.*);$_;see;
s;(.*);704b652318371910023c761a3618265 .....;; # another long line
s;(..);chr(hex($1));eg;
&e_echr(\$_);
s;(.*);$_;see;
The long regexes are once again the data containers, and insert data into $_ to be evaluated as code.
The s/(..)/chr(hex($1))/eg; is starting to look rather legible. It basically reads two characters at a time from $_ and converts them from hex to the corresponding character.
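The same pair-by-pair decoding can be tried in Python on the visible start of the first data line. Only the first 15 complete hex pairs shown above are used; the rest of the line is elided in the post and is not guessed at here.

```python
import re

# First 15 complete hex pairs of the first data line; the remainder is elided above.
hex_prefix = "73756220656e6372797074696f6e5f"

# Equivalent of s/(..)/chr(hex($1))/eg: replace every two hex digits by their character.
decoded = re.sub(r"(..)", lambda m: chr(int(m.group(1), 16)), hex_prefix)

print(decoded)  # sub encryption_
```

The decoded text starts a Perl subroutine definition, which fits the observation that the hex data is code to be evaluated.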
The next to last line &e_echr(\$_); stumped me for a while, but it is a subroutine that is defined somewhere in this evaluated code, as hobbs so aptly was able to decode. The dollar sign is prefixed by backslash, meaning it is a reference to $_: I.e. that the subroutine can change the global variable.
After quite a few evaluations, $_ is run through this subroutine, after which whatever is contained in $_ is evaluated a last time. Presumably this time executing the code. As hobbs said, a key is required, which is taken from the environment %ENV of the machine where the script runs. Which we do not have.
Ask the B::Deparse module to make it (a little more) readable.

Is this batch file injection?

C:\>batinjection OFF ^& DEL c.c
batinjection.bat has contents of ECHO %*
I've heard of SQL injection, though i've never actually done it, but is this injection? Are there different types of injection and this is one of them?
Or is there another technical term for this? or a more specific term?
Note- a prior edit had C:\>batinjection OFF & DEL c.c(i.e. without ^%) and ECHO %1(i.e. without %*) which wasn't quite right. I have corrected it. It doesn't affect the answers.
Your example presents three interesting issues that are easier to understand
when separated.
First, Windows allows multiple statements to be executed on one line by
separating with "&". This could potentially be used in an injection attack.
Second, ECHO parses and interprets messages passed to it. If the message is
"OFF" or "/?" or even blank, then ECHO will provide a different expected
behavior than just copying the message to stdout.
Third, you know that it's possible to inject code into a number of
scriptable languages, including batch files, and want to explore ways
to recognize it so you can better defend against it in your code.
It would be easier to recognize the order in which things are happening
in your script if you add an echo statement before and after the one
you're trying to inject. Call it foo.bat.
@echo off
echo before
echo %1
echo after
Now, you can more easily tell whether your injection attempt executed at
the command line (not injection) or was executed as a result of parameter
expansion that broke out of the echo statement and executed a new statement
(injection).
foo dir
Results in:
before
dir
after
Pretty normal so far. Try a parameter that echo interprets.
foo /?
Results in:
before
Displays messages, or turns command-echoing on or off.
ECHO [ON | OFF]
ECHO [message]
Type ECHO without parameters to display the current echo setting.
after
Hmm. Help for the echo command. It's probably not the desired use of
echo in that batch file, but it's not injection. The parameters were
not used to "escape out" of the limits of either the echo statement or
the syntax of the batch file.
foo dog & dir
Results in:
before
dog
after
[A spill of my current directory]
Okay, the dir happened outside of the script. Not injection.
foo ^&dir/w
Results in:
before
ECHO is off.
[A spill of my current directory in wide format]
after
Now, we've gotten somewhere. The dir is not a function of ECHO, and is
running between the before and after statements. Let's try something
more dramatic but still mostly harmless.
foo ^&dir\/s
Yikes! You can pass an arbitrary command that can potentially impact
your system's performance all inside an innocuous-looking "echo %1".
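What happened in the last runs can be pictured as plain text substitution followed by command splitting. The Python sketch below is a toy model only: expand_then_split is a hypothetical helper, and it ignores quoting, carets and every other cmd.exe parsing rule.

```python
def expand_then_split(template, arg1):
    """Toy model: paste the argument into the line as text, then split on '&'."""
    line = template.replace("%1", arg1)  # %1 is replaced before the line is parsed
    return [part.strip() for part in line.split("&")]


for arg in ("dir", "&dir/w"):
    print(arg, "->", expand_then_split("echo %1", arg))
```

In this model the harmless argument yields one echo command, while the argument that arrives as &dir/w yields an empty echo plus a separate dir/w command, matching the "ECHO is off." line followed by the directory listing above.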
Yes, it's a type of injection, and it's one of the big problems with batch files. Mostly it isn't a purposeful attack; most of the time you simply get into trouble with some characters or words like OFF.
Therefore you should use techniques that avoid these problems/vulnerabilities.
In your case you could change your batch file to
set "param1=%*"
setlocal EnableDelayedExpansion
echo(!param1!
I use echo( here instead of echo. or something else, as it is the only known echo variant that is safe for all appended content.
I use delayed expansion (!) instead of percent expansion, as delayed expansion is always safe against any special characters.
To use delayed expansion you need to transfer the parameter into a variable, and a good way is to put quotes around the set argument; this avoids many problems with special characters (but not all).
But building an absolutely secure way to access batch parameters is quite a bit harder.
Trying to make this safe is tricky:
myBatch.bat ^&"&"
You could read SO: How to receive even the strangest command line parameters?
The main idea is to use the output of a REM statement while ECHO ON.
This is safe in the sense that you can't inject code (or rather: only with really advanced knowledge), but the original content can be changed if your content is something like
myBatch.bat myContent^&"&"%a
Will be changed to myContent&"&"4
AFAIK, this is known as command injection (which is one type of code injection attack).
The latter link lists various injection attacks. The site (www.owasp.org) is an excellent resource regarding web security.
There are multiple applications of injection one can generalize as "language injection". SQL Injection and Cross Site Scripting are the most popular, but others are possible.
In your example, the ECHO statement isn't actually performing the delete, so I wouldn't call that injection. Instead, the delete happens outside of the invocation of the batinjection script itself.

Why do my keystrokes turn into crazy characters after I dump a bunch of binary data into my terminal?

If I do something like:
$ cat /bin/ls
into my terminal, I understand why I see a bunch of binary data, representing the ls executable. But afterwards, when I get my prompt back, my own keystrokes look crazy. I type "a" and I get a weird diagonal line. I type "b" and I get a degree symbol.
Why does this happen?
Because somewhere in your binary data were some control sequences that your terminal interpreted as requests to, for example, change the character set used to draw. You can restore everything to normal like so:
reset
Just do a copy-paste:
echo -e '\017'
to your bash and characters will return to normal. If you don't run bash, try the following keystrokes:
<Ctrl-V><Ctrl-O><Enter>
and hopefully your terminal's status will return to normal when it complains that it can't find either a <Ctrl-V><Ctrl-O> or a <Ctrl-O> command to run.
<Ctrl-N>, or character 14, when sent to your terminal, tells it to switch to a special graphics mode, where letters and numbers are replaced with symbols. <Ctrl-O>, or character 15, restores things back to normal.
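The relation between the keystrokes, the character numbers and the '\017' in the echo command can be checked with a few lines of Python. The helper names are made up for this sketch, and a StringIO buffer stands in for the real terminal so that running it changes nothing.

```python
import io

SHIFT_OUT = "\x0e"  # Ctrl-N, character 14: switch to the alternate character set
SHIFT_IN = "\x0f"   # Ctrl-O, character 15 (octal 017): switch back


def ctrl(letter):
    """Character sent by Ctrl+<letter>: the letter's ASCII code with the top bits cleared."""
    return chr(ord(letter.upper()) & 0x1F)


def restore_charset(stream):
    """Write the single byte that echo -e '\\017' would send."""
    stream.write(SHIFT_IN)
    stream.flush()


buffer = io.StringIO()  # stand-in for sys.stdout
restore_charset(buffer)
print(repr(ctrl("N")), repr(ctrl("O")), repr(buffer.getvalue()))
```

Passing sys.stdout instead of the buffer would send the byte to the terminal for real.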
The terminal will try to interpret the binary data thrown at it as control codes, and garble itself up in the process, so you need to sanitize your tty.
Run:
stty sane
And things should be back to normal. Even if the command looks garbled as you type it, the actual characters are being stored correctly, and when you press return the command will be invoked.
You can find more information about the stty command here.
You're getting some control characters sent to the terminal that are telling it to alter its behavior and print things differently.
VT100 is pretty much the standard command set used for terminal windows, but there are a lot of extensions. Some sequences control the character set used, the keyboard mapping, etc.
When you send a lot of binary characters to such a terminal, a lot of settings change. Some terminals have options to 'clear' the settings back to default, but in general they simply weren't made for binary data.
VT100 and its successors are what allow Linux to print in color text (such as colored ls listings) in a simple terminal program.
-Adam
If you really must dump binary data to your terminal, you'd have much better luck if you pipe it to a pager like less, which will display it in a slightly more readable format. (You may also be interested in strings and od, both can be useful if you're fiddling around with binary files.)
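For a feel of what strings does, here is a rough Python imitation that pulls out runs of printable ASCII and drops everything else, including the control characters that upset the terminal. printable_runs is a hypothetical helper and the sample bytes are invented; the real tool has more options.

```python
import re


def printable_runs(data, min_len=4):
    """Return runs of at least min_len printable ASCII bytes, similar in spirit to strings."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [run.decode("ascii") for run in re.findall(pattern, data)]


# Invented sample: a few control bytes mixed with readable text.
sample = b"\x7fELF\x0e\x0fhello world\x00ab\x00/bin/ls\x00"
print(printable_runs(sample))
```

Short fragments such as ELF and ab fall below the minimum length and are skipped, and the Ctrl-N and Ctrl-O bytes never reach the screen.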

Resources