Bash doesn't parse quotes when converting a string to arguments

This is my problem. In bash 3:
$ test='One "This is two" Three'
$ set -- $test
$ echo $2
"This
How can I get bash to understand the quotes and return $2 as This is two and not "This? Unfortunately, I cannot alter the construction of the variable called test in this example.

The reason this happens is because of the order in which the shell parses the command line: it parses (and removes) quotes and escapes, then replaces variable values. By the time $test gets replaced with One "This is two" Three, it's too late for the quotes to have their intended effect.
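To make the parsing order concrete, here is a minimal sketch. The function name show_words is made up for illustration, and the default IFS (space, tab, newline) is assumed. The unquoted expansion is split on whitespace only, and the quote characters survive as ordinary data:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every word produced by the unquoted expansion of $1.
show_words() {
    # $1 is deliberately unquoted so that word splitting takes place.
    printf '<%s>\n' $1
}

show_words 'One "This is two" Three'
# prints:
# <One>
# <"This>
# <is>
# <two">
# <Three>
```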
The simple (but dangerous) way to do this is by adding another level of parsing with eval:
$ test='One "This is two" Three'
$ eval "set -- $test"
$ echo "$2"
This is two
(Note that the quotes in the echo command are not necessary, but are a good general practice.)
The reason I say this is dangerous is that it doesn't just go back and reparse for quoted strings, it goes back and reparses everything, maybe including things you didn't want interpreted like command substitutions. Suppose you had set
$ test='One `rm /some/important/file` Three'
...eval will actually run the rm command. So if you can't count on the contents of $test to be "safe", do not use this construct.
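A harmless way to see this for yourself, as a sketch only: the function name unsafe_split is hypothetical, and a plain echo stands in for the rm above.

```shell
#!/usr/bin/env bash
# Hypothetical demo: eval reparses the whole string as shell code,
# so quotes are honored, but so are command substitutions.
unsafe_split() {
    eval "set -- $1"
    printf '%s\n' "$2"
}

unsafe_split 'One "This is two" Three'      # prints: This is two
unsafe_split 'One $(echo INJECTED) Three'   # prints: INJECTED (the echo really ran)
```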
BTW, the right way to do this sort of thing is with an array:
$ test=(One "This is two" Three)
$ set -- "${test[@]}"
$ echo "$2"
This is two
Unfortunately, this requires control of how the variable is created.

With bash 4, it is possible to do something like this:
#!/bin/bash
function qs_parse() {
readarray -t "$1" < <( printf "%s" "$2"|xargs -n 1 printf "%s\n" )
}
tab=$'\t' # a literal tab character
qs_parse test "One 'This is two' Three -n 'foo${tab}bar'"
printf "%s\n" "${test[0]}"
printf "%s\n" "${test[1]}"
printf "%s\n" "${test[2]}"
printf "%s\n" "${test[3]}"
printf "%s\n" "${test[4]}"
Outputs, as expected:
One
This is two
Three
-n
foo bar # tabulation saved
I am not sure, but it is probably possible to do the same in older bash like this:
function qs_parse() {
local i=0
while IFS='' read -r line || [[ -n "$line" ]]; do
parsed_str[i]="${line}"
let i++
done < <( printf "%s\n" "$1"|xargs -n 1 printf "%s\n" )
}
tab=$'\t' # a literal tab character
qs_parse "One 'This is two' Three -n 'foo${tab}bar'"
printf "%s\n" "${parsed_str[0]}"
printf "%s\n" "${parsed_str[1]}"
printf "%s\n" "${parsed_str[2]}"
printf "%s\n" "${parsed_str[3]}"
printf "%s\n" "${parsed_str[4]}"

The solution to this problem is to use xargs (eval free).
It retains double quoted strings together:
$ test='One "This is two" Three'
$ IFS=$'\n' arr=( $(xargs -n1 <<<"$test") )
$ printf '<%s>\n' "${arr[@]}"
<One>
<This is two>
<Three>
Of course, you can set the positional arguments with that array:
$ set -- "${arr[@]}"
$ echo "$2"
This is two
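One caveat with xargs -n1: with no command given, xargs runs echo, so a field such as -n may be swallowed as an echo option. A possible workaround, sketched here under assumptions, is to have xargs run printf with a NUL after each field and read the fields back with read -d ''. The names split_quoted and fields are made up for this example, and it assumes an external printf that understands \0 (GNU coreutils does). Note that xargs still rejects input with unbalanced quotes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split $1 using xargs quoting rules and store the
# result in the global array "fields".
split_quoted() {
    local item
    fields=()
    # xargs hands the parsed fields to printf, which emits each one
    # followed by a NUL byte; read -d '' consumes them one at a time.
    while IFS= read -r -d '' item; do
        fields+=("$item")
    done < <(printf '%s\n' "$1" | xargs printf '%s\0')
}

split_quoted 'One "This is two" Three -n'
printf '<%s>\n' "${fields[@]}"
# prints:
# <One>
# <This is two>
# <Three>
# <-n>
```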

I wrote a couple native bash functions to do this: https://github.com/mblais/bash_ParseFields
You can use the ParseFields function like this:
$ str='field1 field\ 2 "field 3"'
$ ParseFields -d "$str" a b c d
$ printf "|%s|\n|%s|\n|%s|\n|%s|\n" "$a" "$b" "$c" "$d"
|field1|
|field 2|
|field 3|
||
The -d option to ParseFields removes any surrounding quotes and interprets backslashes from the parsed fields.
There is also a simpler ParseField function (used by ParseFields) that parses a single field at a specific offset within a string.
Note that these functions cannot parse a stream, only a string. The IFS variable can also be used to specify field delimiters besides whitespace.
If you require that unescaped apostrophes may appear in unquoted fields, that would require a minor change - let me know.

test='One "This is two" Three'
mapfile -t some_args < <(xargs -n1 <<<"$test")
echo "'${some_args[0]}'" "'${some_args[1]}'" "'${some_args[2]}'"
output:
'One' 'This is two' 'Three'

Related

How to convert a string to lower case in Bash, when the string is potentially -e, -E or -n? [duplicate]

In this question: How to convert a string to lower case in Bash?
The accepted answer is:
tr:
echo "$a" | tr '[:upper:]' '[:lower:]'
awk:
echo "$a" | awk '{print tolower($0)}'
Neither of these solutions works if $a is -e, -E, or -n, because echo treats those values as options.
Would this be a more appropriate solution?
echo "#$a" | sed 's/^#//' | tr '[:upper:]' '[:lower:]'

Use
printf '%s\n' "$a" | tr '[:upper:]' '[:lower:]'
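This works because printf takes its format as the first argument, so the value of $a is only ever data and never an option. Wrapped in a small function (to_lower is a name made up for this sketch):

```shell
#!/usr/bin/env bash
# Hypothetical helper: lowercase $1, even when it looks like an echo option.
to_lower() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

to_lower '-E'       # prints: -e
to_lower '-n FOO'   # prints: -n foo
```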

Don't bother with tr. Since you're using bash (4.0 or later), just use the ,, operator in parameter expansion:
$ a='-e BAR'
$ printf "%s\n" "${a,,?}"
-e bar

Using typeset (or declare) you can define a variable to automatically convert data to lower case when assigning to the variable, eg:
$ a='-E'
$ printf "%s\n" "${a}"
-E
$ typeset -l a # change attribute of variable 'a' to automatically convert assigned data to lowercase
$ printf "%s\n" "${a}" # lowercase doesn't apply to already assigned data
-E
$ a='-E' # but for new assignments to variable 'a'
$ printf "%s\n" "${a}" # we can see that the data is
-e # converted to lowercase
If you need to maintain the case of the current variable, you can always define a new variable to hold the lowercase value, e.g.:
$ typeset -l lower_a
$ lower_a="${a}" # convert data to lowercase upon assignment to variable 'lower_a'
$ printf "%s\n" "${lower_a}"
-e

Shell script to print a variable which has delimiter

I have a shell script that gets a bunch of values. I am splitting them on a delimiter (,). Now I want to print them one by one inside a for loop.
For example:
var=/a/b/c,d/e/f/,x/y/z
for i in $(echo $var | sed "s/,/ /g")
do
echo $i
done
The output comes out empty. The expected output is:
/a/b/c
d/e/f/
x/y/z

You don't need a loop.
sed 's/,/\n/g' <<< "$var"
sed 'y/,/\n/' <<< "$var"
tr ',' '\n' <<< "$var"
echo "${var//,/$'\n'}"
they all yield the desired output.

You could read it into an array. This is quite readable:
var=/a/b/c,d/e/f/,x/y/z
IFS=, read -a paths <<<"$var"
for p in "${paths[@]}"; do echo "$p"; done
/a/b/c
d/e/f/
x/y/z

Just to add another option -
var="/a/b/c,d/e/f/,x/y/z"
while IFS=, read a b c
do printf 'a=%s b=%s c=%s\n' "$a" "$b" "$c"
done <<< "$var"
a=/a/b/c b=d/e/f/ c=x/y/z
Doesn't need a loop - works fine as
$: IFS=, read a b c <<< "$var"
$: printf 'a=%s b=%s c=%s\n' "$a" "$b" "$c"
a=/a/b/c b=d/e/f/ c=x/y/z
... just wanted to show the loop structure in case it helped.

Bash- scramble characters contained in a string

So I have this function with the following output:
AGsg4SKKs74s62#
I need to find a way to scramble the characters without deleting anything, i.e. all characters must still be present after I scramble them.
I can only use bash utilities, including awk and sed.

echo 'AGsg4SKKs74s62#' | sed 's/./&\n/g' | shuf | tr -d "\n"
Output (e.g.):
S7s64#2gKAGsKs4

Here's a pure Bash function that does the job:
scramble() {
# $1: string to scramble
# return in variable scramble_ret
local a=$1 i
scramble_ret=
while((${#a})); do
((i=RANDOM%${#a}))
scramble_ret+=${a:i:1}
a=${a::i}${a:i+1}
done
}
See if it works:
$ scramble 'AGsg4SKKs74s62#'
$ echo "$scramble_ret"
G4s6s#2As74SgKK
Looks all right.
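If "looks all right" is not enough, the result can be checked mechanically. The following is only a sketch with made-up names (shuffle_chars, sorted_chars): it shuffles with a Fisher-Yates pass over an array instead of the string slicing above, then compares the sorted characters of input and output. RANDOM % n has a slight modulo bias, which is ignored here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the characters of $1 in random order (Fisher-Yates).
shuffle_chars() {
    local s=$1 n=${#1} i j tmp
    local -a chars=()
    for ((i = 0; i < n; i++)); do
        chars[i]=${s:i:1}
    done
    for ((i = n - 1; i > 0; i--)); do
        j=$((RANDOM % (i + 1)))          # pick a position in 0..i
        tmp=${chars[i]}; chars[i]=${chars[j]}; chars[j]=$tmp
    done
    local IFS=                           # join the array with no separator
    printf '%s\n' "${chars[*]}"
}

# Hypothetical helper: print the characters of $1 in sorted order.
sorted_chars() {
    printf '%s' "$1" | fold -w1 | LC_ALL=C sort | tr -d '\n'
}

in='AGsg4SKKs74s62#'
out=$(shuffle_chars "$in")
printf '%s\n' "$out"
# Same multiset of characters means nothing was lost or duplicated.
[ "$(sorted_chars "$in")" = "$(sorted_chars "$out")" ] && echo "same characters"
```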

I know that you haven't mentioned Perl but it could be done like this:
perl -MList::Util=shuffle -F'' -lane 'print shuffle @F' <<<"AGsg4SKKs74s62#"
-a enables auto-split mode and -F'' sets the field separator to an empty string, so each character goes into a separate array element. The array is shuffled using the function provided by the core module List::Util.

Here is my solution, usage: shuffleString "any-string". Performance is not in my consideration when using bash.
function shuffleString() {
local line="$1"
for i in $(seq 1 ${#line}); do
local p=$(expr $RANDOM % ${#line})
if [[ $p -lt $i ]]; then
local line="${line:0:$p}${line:$i:1}${line:$p+1:$i-$p-1}${line:$p:1}${line:$i+1}"
elif [[ $p -gt $i ]]; then
local line="${line:0:$i}${line:$p:1}${line:$i+1:$p-$i-1}${line:$i:1}${line:$p+1}"
fi
done
echo "$line"
}

How to replace the first k characters of a string?

I know how to replace a certain substring of a given string:
foo=abcABC
echo ${foo/abc/xyz} # xyzABC
Is it also possible to replace the first k characters by k times a given character?
Update: Example:
foobar, replace first k = 3 characters by Z yields ZZZbar.

Based on Change string char at index X. Given the string $foo, to change the first k characters by a string $pattern, this can make it:
for ((i=0; i < $k; i++))
do
foo="${foo:0:$i}$pattern${foo:$((i+1))}"
done
Test
$ a="hellomynameisyou"
$ k=5
$ pattern="x"
$ for ((i=0; i < $k; i++)); do a="${a:0:$i}$pattern${a:$((i+1))}"; echo $a; done
xellomynameisyou
xxllomynameisyou
xxxlomynameisyou
xxxxomynameisyou
xxxxxmynameisyou
For your specific example
$ pattern="Z"
$ k=3
$ a="foobar"
$ for ((i=0; i < $k; i++)); do a="${a:0:$i}$pattern${a:$((i+1))}"; echo $a; done
Zoobar
ZZobar
ZZZbar
$ echo $a
ZZZbar

You can also try:
matStr=abc
repChar=y
echo "${foo/$matStr/$(seq -s $repChar $((${#matStr}+1)) | tr -d '[0-9]')}"
This is not applicable when repChar is a digit.

This would be fairly simple in Perl. I was looking for something similar in pure Unix utilities and Bash, but couldn't think of anything. The closest I found is tr.
This was written on Linux, so I use sed -r. On a Mac, it should be sed -E. In fact, you might even get away without either the -E or -r flag if you put backslashes before the parentheses and braces.
What I do is produce two strings with sed. The first finds the first length characters and tosses out the rest of the string. The second sed tosses out the first length characters and keeps the string. I can then use tr to replace all the characters with my replacement character, then concatenate the two strings together.
string="1234567890"
length="4"
replace="z"
prefix=$(sed -r -e "s/^(.{1,$length}).*/\1/" <<<"$string" | tr "[:alnum:]" "$replace")
postfix=$(sed -r -e "s/^.{1,$length}//" <<<"$string")
string="${prefix}${postfix}"
echo "$string" #Will echo "zzzz567890"

This is very easy!
Variables:
str='helloworld'
k=3
char='.'
And the most important part:
Using Perl:
echo "$(perl -E "say '$char' x $k")${str:$k}"
Using Python:
echo "$(python3 -c "print('$char' * $k)")${str:$k}"
Using printf and tr:
echo "$(printf "%${k}s" | tr ' ' "$char")${str:$k}"
Pure Bash:
for ((i = 0; i < $k; i++)); do echo -n "$char"; done
echo "${str:$k}"
Choose your weapon! I'd choose the pure bash solution.
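For what it's worth, the pure bash variant can also be written without a loop or a subshell. This is a sketch, not a drop-in: replace_prefix is a made-up name, it relies on the bash builtin printf -v, and it assumes char is a single character (a longer string would make the result grow).

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace the first k characters of str with char.
replace_prefix() {
    local str=$1 k=$2 char=$3 pad out
    (( k > ${#str} )) && k=${#str}     # never replace more than the string has
    printf -v pad '%*s' "$k" ''        # pad = k spaces
    out=${pad// /"$char"}              # turn every space into char
    printf '%s\n' "$out${str:k}"
}

replace_prefix foobar 3 Z        # prints: ZZZbar
replace_prefix helloworld 3 .    # prints: ...loworld
```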

Iterate over lines instead of words in a for loop of shell script

The following shell script reads all the DSFs present on the box. But since each line contains spaces, the words are displayed on separate lines.
For those of you who don't know ioscan -m dsf, replace it with ls -ltr: the permissions and names are then displayed on different lines, but I want them on the same line.
#!/usr/bin/ksh
for a in `ioscan -m dsf`
do
echo $a
done

The for loop is not designed to loop over "lines". Instead it loops over "words".
Short, simplified terminology: "lines" are things separated by newlines; "words" are things separated by whitespace. In bash lingo, "words" are called "fields".
The idiomatic way to loop over lines is to use a while loop in combination with read.
ioscan -m dsf | while read -r line
do
printf '%s\n' "$line"
done
Note that the while loop is in a subshell because of the pipe. This can cause some confusion with variable scope. In bash you can work around this by using process substitution.
while read -r line
do
printf '%s\n' "$line"
done < <(ioscan -m dsf)
But now the "generator" (ioscan in this example) is in a subshell.
For more information about the subshell problem in loops, see http://mywiki.wooledge.org/BashFAQ/024
If you insist on using a for loop to loop over lines you have to change the value of $IFS to only newline. IFS is short for Internal Field Separator. Usually $IFS contains a space, a tab, and a newline.
Here is the typical way to do so:
OLDIFS="$IFS"
IFS=$'\n' # bash specific
for line in $(ioscan -m dsf)
do
printf '%s\n' "$line"
done
IFS="$OLDIFS"
(the bash specific part ($'\n') is called ANSI-C Quoting)
But beware: many commands depend on a sane setting for $IFS. I do not recommend changing $IFS globally. Too often it will cause an endless nightmare of obscure bug hunting.
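If a for loop over lines is really wanted, one way to limit the damage is to confine the change to a function. This is a sketch under assumptions: each_line is a hypothetical name, local is bash-specific, and the input is passed as a single string. It also turns off pathname expansion while splitting, so a line consisting of * is not replaced by file names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each line of $1 wrapped in <>, without
# leaking the IFS or noglob changes to the caller.
each_line() {
    local IFS=$'\n'          # local: the caller's IFS is untouched
    local had_noglob=0 line
    case $- in *f*) had_noglob=1 ;; esac
    set -f                   # disable pathname expansion while splitting
    for line in $1; do       # unquoted on purpose: split on newlines only
        printf '<%s>\n' "$line"
    done
    if (( had_noglob == 0 )); then set +f; fi
}

each_line "$(printf 'a b\n*\nc')"
# prints:
# <a b>
# <*>
# <c>
```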
See also:
http://wiki.bash-hackers.org/syntax/ccmd/classic_for
http://wiki.bash-hackers.org/commands/builtin/read
http://mywiki.wooledge.org/IFS
http://mywiki.wooledge.org/SubShell
http://mywiki.wooledge.org/ProcessSubstitution

Using for
for l in $() performs word splitting based on IFS:
$ for l in $(printf %b 'a b\nc'); do echo "$l"; done
a
b
c
$ IFS=$'\n'; for l in $(printf %b 'a b\nc'); do echo "$l"; done
a b
c
IFS doesn't have to be set back if it is not used later.
for l in $() also performs pathname expansion:
$ printf %b 'a\n*\n' > file.txt
$ IFS=$'\n'
$ for l in $(<file.txt); do echo "$l"; done
a
file.txt
$ set -f; for l in $(<file.txt); do echo "$l"; done; set +f
a
*
If IFS=$'\n', linefeeds are stripped and collapsed:
$ printf %b '\n\na\n\nb\n\n' > file.txt
$ IFS=$'\n'; for l in $(<file.txt); do echo "$l"; done
a
b
$(cat file.txt) (or $(<file.txt)) also reads the whole file to memory.
Using read
Without -r backslashes are used for line continuation and removed before other characters:
$ cat file.txt
\1\\2\
3
$ cat file.txt | while read l; do echo "$l"; done
1\23
$ cat file.txt | while read -r l; do echo "$l"; done
\1\\2\
3
Characters in IFS are stripped from the start and end of lines but not collapsed:
$ printf %b '1 2 \n\t3\n' | while read -r l; do echo "$l"; done
1 2
3
$ printf %b ' 1 2 \n\t3\n' | while IFS= read -r l; do echo "$l"; done
1 2
3
If the last line doesn't end with a newline, read still assigns it to l but returns a non-zero status, so the loop exits before running the body:
$ printf 'x\ny' | while read l; do echo $l; done
x
$ printf 'x\ny' | while read l || [[ $l ]]; do echo $l; done
x
y
If a while loop is in a pipeline, it is also in a subshell, so variables are not visible outside it:
$ x=0; seq 3 | while read l; do let x+=l; done; echo $x
0
$ x=0; while read l; do let x+=l; done < <(seq 3); echo $x
6
$ x=0; x=8 | x=9; echo $x
0
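Putting the read caveats above together, a loop that keeps whitespace and backslashes and does not drop an unterminated last line might look like the sketch below. The function name read_lines is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy stdin to stdout line by line, bracketing each
# line so that leading and trailing whitespace stays visible.
read_lines() {
    local line
    # IFS=               keeps leading and trailing whitespace
    # -r                 keeps backslashes literal
    # || [ -n "$line" ]  also handles a final line with no trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        printf '[%s]\n' "$line"
    done
}

printf '  a\\b \nlast' | read_lines
# prints:
# [  a\b ]
# [last]
```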

Basically, you need to set IFS=$'\n' and use grep -x instead of plain grep: -x matches whole lines only, so it behaves like an "equal to" operator rather than a "like" operator.
