how to replace "/" in a POSIX sh string - linux

To replace a substring in the bash string str I use:
str=${str/$pattern/$new}
However, I'm presently writing a script which will be executed with ash.
I have a string containing '/' and I want to use the above syntax in order to replace the '/' in my string, but it does not work.
I tried:
str=${str///a}
str=${str/\//a}
str=${str/'/'/a}
But they do not work.
How can I fix that?

This parameter expansion is a bash extension to POSIX sh. If you review the relevant section of IEEE standard 1003.1, you'll see that it isn't a required feature, so shells which promise only POSIX compliance, such as ash, have no obligation to implement it, and no obligation for their implementations to hew to any particular standard of correctness should they do so anyhow.
If you want bash extensions, you need to use bash (or other ksh derivatives which are extended similarly).
In the interim, you can use other tools. For instance:
str=$(printf '%s' "$str" | tr '/' 'a')
or
str=$(printf '%s' "$str" | sed -e 's#/#a#g')

POSIX parameter expansions (prefix and suffix removal) can be used to create a 100% POSIX-compatible function that does the replacement. For short strings, this is considerably faster than command substitution, especially under Cygwin, where fork(2) copies the parent process's address space and process creation is generally slow on Windows.
replace_all() {
    # $1 = input string, $2 = pattern, $3 = replacement; the result is left in $R.
    RIGHT=$1
    R=
    while [ -n "$RIGHT" ]; do
        LEFT=${RIGHT%%$2*}          # everything before the first occurrence of the pattern
        if [ "$LEFT" = "$RIGHT" ]; then
            R=$R$RIGHT              # no occurrence left: append the remainder and stop
            return
        fi
        R=$R$LEFT$3                 # append the prefix plus the replacement
        RIGHT=${RIGHT#*$2}          # continue with the text after the occurrence
    done
}
It works like this:
$ replace_all ' foo bar baz ' ' ' .
$ echo $R
.foo.bar.baz.
Regarding performance, replacing 25% of the characters in a 512-byte string runs roughly 50 times faster with replace_all() than with command substitution under the Cygwin dash(1). However, the execution times even out at around 4 KiB.

how to use variables with brace expansion [duplicate]

I have four files:
1.txt 2.txt 3.txt 4.txt
In a Linux shell, I can use:
ls {1..4}.txt to list all four files
but if I set two variables, var1=1 and var2=4, how do I list the four files?
that is:
var1=1
var2=4
ls {$var1..$var2}.txt # error
what is the correct code?
Using variables with the sequence-expression form ({<numFrom>..<numTo>}) of brace expansion only works in ksh and zsh, but, unfortunately, not in bash (and (mostly) strictly POSIX-features-only shells such as dash do not support brace expansion at all, so brace expansion should be avoided with /bin/sh altogether).
Given your symptoms, I assume you're using bash, where you can only use literals in sequence expressions (e.g., {1..3}); from the manual (emphasis mine):
Brace expansion is performed before any other expansions, and any characters special to other expansions are preserved in the result.
In other words: at the time a brace expression is evaluated, variable references have not been expanded (resolved) yet; interpreting literals such as $var1 and $var2 as numbers in the context of a sequence expression therefore fails, so the brace expression is considered invalid and is left unexpanded.
Note, however, that the variable references are expanded, just at a later stage of the overall expansion; in the case at hand the literal result is the single word '{1..4}' - an unexpanded brace expression with the variable values substituted.
While the list form of brace expansion (e.g., {foo,bar}) is expanded the same way, later variable expansion is not an issue there, because no up-front interpretation of the list elements is needed; e.g., {$var1,$var2} correctly results in the 2 words 1 and 4.
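To see both behaviors side by side, an illustrative bash transcript (var1 and var2 as in the question):
$ var1=1 var2=4
$ echo {$var1..$var2}    # sequence form: left untouched by brace expansion, variables substituted later
{1..4}
$ echo {$var1,$var2}     # list form: no up-front interpretation of the elements needed, so this works
1 4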
As for why variables cannot be used in sequence expressions: historically, the list form of brace expansion came first, and when the sequence-expression form was later introduced, the order of expansions was already fixed.
For a general overview of brace expansion, see this answer.
Workarounds
Note: The workarounds focus on numerical sequence expressions, as in the question; the eval-based workaround also demonstrates use of variables with the less common character sequence expressions, which produce ranges of English letters (e.g., {a..c} to produce a b c).
A seq-based workaround is possible, as demonstrated in Jameson's answer.
A small caveat is that seq is not a POSIX utility, but most modern Unix-like platforms have it.
To refine it a little, using seq's -f option to supply a printf-style format string, and demonstrating two-digit zero-padding:
seq -f '%02.f.txt' $var1 $var2 | xargs ls # '%02.f'==zero-pad to 2 digits, no decimal places
Note that to make it fully robust - in case the resulting words contain spaces or tabs - you'd need to employ embedded quoting:
seq -f '"%02.f a.txt"' $var1 $var2 | xargs ls
ls then sees 01 a.txt, 02 a.txt, ... with the argument boundaries correctly preserved.
If you want to robustly collect the resulting words in a Bash array first, e.g., ${words[@]}:
IFS=$'\n' read -d '' -ra words < <(seq -f '%02.f.txt' $var1 $var2)
ls "${words[#]}"
The following are pure Bash workarounds:
A limited workaround using Bash features only is to use eval:
var1=1 var2=4
# Safety check
(( 10#$var1 + 10#$var2 || 1 )) 2>/dev/null || { echo "Need decimal integers." >&2; exit 1; }
ls $(eval printf '%s ' "{$var1..$var2}.txt") # -> ls 1.txt 2.txt 3.txt 4.txt
You can apply a similar technique to a character sequence expression:
var1=a var2=c
# Safety check
[[ $var1 == [a-zA-Z] && $var2 == [a-zA-Z] ]] || { echo "Need single letters."; exit 1; }
ls $(eval printf '%s ' "{$var1..$var2}.txt") # -> ls a.txt b.txt c.txt
Note:
A check is performed up front to ensure that $var1 and $var2 contain decimal integers or single English letters, which then makes it safe to use eval. Generally, using eval with unchecked input is a security risk and use of eval is therefore best avoided.
Given that the output from eval must be passed unquoted to ls here, so that the shell splits it into individual arguments through word splitting, this only works if the resulting filenames contain no embedded spaces or other shell metacharacters.
A more robust, but more cumbersome pure Bash workaround is to use an array to create the equivalent words:
var1=1 var2=4
# Emulate brace sequence expression using an array.
args=()
for (( i = var1; i <= var2; i++ )); do
  args+=( "$i.txt" )
done
ls "${args[@]}"
This approach bears no security risk and also works with resulting filenames with embedded shell metacharacters, such as spaces.
Custom increments can be implemented by replacing i++ with, e.g., i+=2 to step in increments of 2.
Implementing zero-padding would require use of printf; e.g., as follows:
args+=( "$(printf '%02d.txt' "$i")" ) # -> '01.txt', '02.txt', ...
For that particular piece of syntax (a "sequence expression") you're out of luck, see Bash man page:
A sequence expression takes the form {x..y[..incr]}, where x and y are
either integers or single characters, and incr, an optional increment,
is an integer.
However, you could instead use the seq utility, which would have a similar effect -- and the approach would allow for the use of variables:
var1=1
var2=4
for i in `seq $var1 $var2`; do
ls ${i}.txt
done
Or, if calling ls four times instead of once bothers you, and/or you want it all on one line, something like:
for i in `seq $var1 $var2`; do echo ${i}.txt; done | xargs ls
From seq(1) man page:
seq [OPTION]... LAST
seq [OPTION]... FIRST LAST
seq [OPTION]... FIRST INCREMENT LAST
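For instance (GNU coreutils seq; output is one value per line):
$ seq 4
1
2
3
4
$ seq 2 2 8    # FIRST INCREMENT LAST
2
4
6
8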

Get first character of a string SHELL

I want to get the first character of a string, for example:
$>./first $foreignKey
And I want to get "$"
I googled it and found some solutions, but they concern only bash and not sh!
This should work on any POSIX-compatible shell (including sh). printf is not required to be a builtin, but it often is, so this may save a fork or two:
first_letter=$(printf %.1s "$1")
Note: (Possibly I should have explained this six years ago when I wrote this brief answer.) It might be tempting to write %c instead of %.1s; that produces exactly the same result except in the case where the argument "$1" is empty. printf %c "" actually produces a NUL byte, which is not a valid character in a POSIX shell; different shells might treat this case differently. Some allow NULs as an extension; others, like bash, ignore the NUL but generate an error message to tell you it has happened. The precise semantics of %.1s is "at most 1 character at the start of the argument", which means that first_letter is guaranteed to be set to the empty string if the argument is the empty string, without raising any error indication.
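A small illustration of the %.1s behavior (the %c edge case is deliberately left out, since it varies between shells):
$ printf '%.1s\n' "hello"
h
$ first_letter=$(printf %.1s "")
$ printf '[%s]\n' "$first_letter"   # empty argument: empty result, no error
[]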
Well, you'll probably need to escape that particular value to prevent it being interpreted as a shell variable but, if you don't have access to the nifty bash substring facility, you can still use something like:
name=paxdiablo
firstchar=`echo $name | cut -c1-1`
If you do have bash (it's available on most Linux distros and, even if your login shell is not bash, you should be able to run scripts with it), it's the much easier:
firstchar=${name:0:1}
For escaping the value so that it's not interpreted by the shell, you need to use:
./first \$foreignKey
and the following first script shows how to get it:
letter=`echo $1 | cut -c1-1`
echo ".$letter."
Maybe it is an old question.
Recently I ran into the same problem; according to the POSIX shell manual's section on parameter expansion (substring processing), this is my solution, without involving any subshell/fork:
a="some string here"
printf 'first char is "%s"\n' "${a%"${a#?}"}"
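Broken into steps, with intermediate variables added purely for illustration, the expansion works like this:
a="some string here"
rest=${a#?}           # strip the first character: "ome string here"
first=${a%"$rest"}    # strip that remainder from the end, leaving: "s"
printf 'first char is "%s"\n' "$first"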
For the sh shell:
echo "hello" | cut -b 1 # -b 1 extracts the 1st byte
h
echo "hello" |grep -o "." | head -n 1
h
echo "hello" | awk -F "" '{print $1}'
h
you can try this for bash:
s='hello'; echo ${s:0:1}
h
printf -v first_character "%c" "${variable}" # bash only: -v stores the result in the named variable

How does one ‘contract’ strings to escape special characters in Bash?

There are many ways to expand an escaped string, but how can a shell command be made to take a string as an argument and escape it?
Here are some examples of different ways of expansion:
$ echo -e '\x27\\012\b34\n56\\\aa7\t8\r 9\0\0134\047'
'\0134
9\'7 8
$ echo $'\x27\\012\b34\n56\\\aa7\t8\r 9\0\0134\047'
'\0134
9\a7 8
$ PS1='(5)$ ' # At least tab-width - 3 long; 5 columns given typical tab-width.
(5)$ printf %b '\x27\\012\b34\n56\\\aa7\t8\r 9\0\0134\047'
'\0134
9\'(5)$
Note: there's actually a tab character between the 7 and 8 above, but the markup rendering seems to break it.
Yes, all sorts of craziness in there. ;-)
Anyway, I'm looking for the reverse of such escape expansion commands. If the command was called escape, it would satisfy these properties:
$ echo -ne "$(escape "$originalString")"
Should output the verbatim value of originalString as would ‘echo -n "$originalString"’. I.e. it should be an identity.
Likewise:
$ escape "$(echo -ne "$escapedString")"
Should output the string escaped again, though not necessarily in the same way as before. E.g. \0134 may become \\ or vice versa.
Don't use echo -e -- it's very poorly specified in POSIX, and considered deprecated for all but the simplest uses. Bash has extensions to its printf that provide a better-supported approach:
printf -v escaped_string %q "$raw_string"
...gives you a shell-escaped string from a raw one (storing it in a variable named escaped_string), and
printf -v raw_string %b "$escaped_string"
...gives you a raw string from a backslash-escaped one, storing it in raw_string.
Note that the two escape syntaxes are not equivalent -- strings escaped with printf %q are ready for eval, rather than for printf %b.
That is, you can safely run:
eval "myvar=$escaped_string"
...when escaped_string has been created with printf %q as above.
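A quick round-trip check in bash (raw_string, escaped_string and copy are throwaway names for the demo):
$ raw_string=$'spaces, "quotes", a backslash \\ and a tab \t'
$ printf -v escaped_string %q "$raw_string"
$ eval "copy=$escaped_string"       # safe here, because escaped_string was produced by %q
$ [ "$copy" = "$raw_string" ] && echo "round trip OK"
round trip OK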
That said: What's the use case? It's strongly preferred to handle raw strings as raw strings (using NUL delimiters when delimiting is necessary), rather than converting them to and from an escaped form.

sed returning different result on different platforms

Hi, using the following command on an x86 machine (using /bin/sh) returns <port>3</port>:
test="port 3"
echo $test | sed -r 's/\s*port\s*([0-9]+)\s*/<port>\1<\/port>/'
but running the same command in the sh shell of an ARM-based network switch returns the string port 3.
How can I get the same result on the switch as I got on my x86 machine? To me it seems like the digit is not being captured by [0-9].
\s is a GNU sed extension to the standard sed behavior. GNU sed is the implementation on desktop/server Linux systems. Most embedded Linux systems run BusyBox, a suite of utilities with a markedly smaller footprint and fewer features.
A standard way of specifying “any space character” is the [:space:] character class. It is supported by BusyBox (at least, by most BusyBox installations; most BusyBox features can be stripped off for an even lower footprint).
BusyBox also doesn't support the -r option, you need to use a basic regular expression. In a BRE, \(…\) marks groups, and there is no + operator, only *.
echo "$test" | sed 's/[[:space:]]*port[[:space:]]*\([0-9][0-9]*\)[[:space:]]*/<port>\1<\/port>/'
Note that since you didn't put any quotes around $test, the shell performed word splitting and wildcard expansion on the value of the variable. That is, the value of the variable was treated as a whitespace-separated list of file names which were then joined by a single space. So if you leave out the quotes, you don't have to worry about different kinds of whitespace, you can write echo $test | sed 's/ *port *([0-9][0-9]*) */<port>\1<\/port>/'. However, if $test had been port *, the result would have depended on what files exist in the current directory.
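To see that word-splitting effect for yourself (a transcript with an actual tab in the variable; sed -n l prints the line with non-printables made visible):
$ test="$(printf 'port\t3')"
$ echo $test | sed -n l     # unquoted: the tab has been replaced by a single space
port 3$
$ echo "$test" | sed -n l   # quoted: the tab survives
port\t3$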
Not all seds support regular-expression shorthand like \s. A more portable version is:
test="port 3"
echo "$test" | sed -r 's/[ ]*port[ ]*([0-9]+)[ ]*/<port>\1<\/port>/'
If you really need to match tab characters as well, just add them to the character class (in all 3 places) which, in my example, contains only a space character, i.e. the [ ] bit.
output
<port>3</port>
I hope this helps.

How to pass the value of a variable to the standard input of a command?

I'm writing a shell script that should be somewhat secure, i.e., does not pass secure data through parameters of commands and preferably does not use temporary files. How can I pass a variable to the standard input of a command?
Or, if it's not possible, how can I correctly use temporary files for such a task?
Passing a value to standard input in Bash is as simple as:
your-command <<< "$your_variable"
Always make sure you put quotes around variable expressions!
Be aware that this will probably work only in bash and will not work in sh.
Simple, but error-prone: using echo
Something as simple as this will do the trick:
echo "$blah" | my_cmd
Do note that this may not work correctly if $blah contains -n, -e, -E etc; or if it contains backslashes (bash's copy of echo preserves literal backslashes in absence of -e by default, but will treat them as escape sequences and replace them with corresponding characters even without -e if optional XSI extensions are enabled).
More sophisticated approach: using printf
printf '%s\n' "$blah" | my_cmd
This does not have the disadvantages listed above: all possible C strings (strings not containing NULs) are printed unchanged.
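To see the echo pitfall concretely (bash's builtin echo; the value happens to look like an option):
$ blah="-n"
$ echo "$blah"              # prints nothing: -n is taken as an option
$ printf '%s\n' "$blah"     # printf passes the value through untouched
-n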
(cat <<END
$passwd
END
) | command
The cat is not really needed, but it helps to structure the code better and allows you to use more commands in parentheses as input to your command.
Note that the echo "$var" | command operations mean that standard input is limited to the line(s) echoed.
{ echo "$var"; cat - ; } | command
( echo "$var"; cat - ) | command
This means that the first line(s) will be the contents of $var but the rest will come from cat reading its standard input. If the command does not do anything too fancy (try to turn on command line editing, or run like vim does) then it will be fine. Otherwise, you need to get really fancy - I think expect or one of its derivatives is likely to be appropriate.
The command line notations are practically identical - but the second semi-colon is necessary with the braces whereas it is not with parentheses.
This robust and portable way has already appeared in comments. It should be a standalone answer.
printf '%s' "$var" | my_cmd
or
printf '%s\n' "$var" | my_cmd
Notes:
It's better than echo, reasons are here: Why is printf better than echo?
printf "$var" is wrong. The first argument is format where various sequences like %s or \n are interpreted. To pass the variable right, it must not be interpreted as format.
Usually variables don't contain trailing newlines. The former command (with %s) passes the variable as it is. However tools that work with text may ignore or complain about an incomplete line (see Why should text files end with a newline?). So you may want the latter command (with %s\n) which appends a newline character to the content of the variable. Non-obvious facts:
Here string in Bash (<<<"$var" my_cmd) does append a newline.
Any method that appends a newline results in non-empty stdin of my_cmd, even if the variable is empty or undefined.
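These points are easy to verify with wc -c (byte counts as printed by GNU wc; BSD wc pads them with spaces):
$ var=hello
$ printf '%s' "$var" | wc -c     # no trailing newline
5
$ printf '%s\n' "$var" | wc -c   # newline appended
6
$ wc -c <<< "$var"               # the here string appends a newline, too
6
$ empty=
$ wc -c <<< "$empty"             # even an empty variable yields one byte (the newline)
1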
I liked Martin's answer, but it has some problems depending on what is in the variable. This
your-command <<< """$your_variable"""
is better if your variable contains " or !.
As per Martin's answer, there is a Bash feature called Here Strings (which itself is a variant of the more widely supported Here Documents feature):
3.6.7 Here Strings
A variant of here documents, the format is:
<<< word
The word is expanded and supplied to the command on its standard
input.
Note that Here Strings would appear to be Bash-only, so, for improved portability, you'd probably be better off with the original Here Documents feature, as per PoltoS's answer:
( cat <<EOF
$variable
EOF
) | cmd
Or, a simpler variant of the above:
(cmd <<EOF
$variable
EOF
)
You can omit ( and ), unless you want to have this redirected further into other commands.
Try this:
echo "$variable" | command
If you came here from a duplicate, you are probably a beginner who tried to do something like
"$variable" >file
or
"$variable" | wc -l
where you obviously meant something like
echo "$variable" >file
echo "$variable" | wc -l
(Real beginners also forget the quotes; usually use quotes unless you have a specific reason to omit them, at least until you understand quoting.)
