Bash: using parameter expansion to add variables at front and end simultaneously [duplicate] - linux

How to add suffix and prefix to $@?
If I do $PREFIX/$@/$SUFFIX, I get the prefix only on the first parameter and the suffix only on the last one.

I would use shell parameter expansion for this:
$ set -- one two three
$ echo "$@"
one two three
$ set -- "${@/#/pre}" && set -- "${@/%/post}"
$ echo "$@"
preonepost pretwopost prethreepost
Notes
The # matches the beginning
The % matches the end
Using double quotes around ${@} makes each positional parameter expand to a separate word, so the replacement is applied to every positional parameter.
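To see that the substitution is applied per parameter, even when a parameter contains whitespace, here is a minimal sketch. The function name add_affixes and the sample values are assumptions made for illustration; they are not part of the answer above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints each argument wrapped in a prefix and a suffix,
# one result per line.
add_affixes() {
  local prefix=$1 suffix=$2
  shift 2
  # ${@/#/...} prepends to every positional parameter;
  # ${@/%/...} appends to every positional parameter.
  set -- "${@/#/$prefix}"
  set -- "${@/%/$suffix}"
  printf '%s\n' "$@"
}

result=$(add_affixes 'pre-' '-post' 'one' 'two words' 'three')
printf '%s\n' "$result"
```

The second argument, 'two words', should come out as a single line, pre-two words-post, because the quoted "$@" keeps it as one word.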

Let's create some positional parameters for test purposes:
$ set -- one two three
$ echo "$@"
one two three
Now, let's use bash to add prefixes and suffixes:
$ IFS=$'\n' a=($(printf "pre/%s/post\n" "$@"))
$ set -- "${a[@]}"
$ echo "$@"
pre/one/post pre/two/post pre/three/post
Limitations: (a) since this uses newline-separated strings, it won't work if your $@ contains newlines itself. In that case, there may be another choice for IFS that would suffice. (b) This is subject to globbing. If either of these is an issue, see the more general solution below.
On the other hand, if the positional parameters do not contain whitespace, then no change to IFS is needed.
Also, if IFS is changed, then one may want to save IFS beforehand and restore afterward.
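A minimal sketch of saving and restoring IFS, and of switching globbing off around the unquoted command substitution, could look like this. The variable name old_ifs is an assumption, and the sketch also assumes that IFS was set and that globbing was enabled beforehand.

```bash
#!/usr/bin/env bash
set -- 'one' 'two words' '*'   # sample parameters, including a glob character

old_ifs=$IFS        # remember the current IFS (assumes IFS is set)
IFS=$'\n'           # split the printf output on newlines only
set -f              # turn globbing off so that '*' is not expanded
a=( $(printf 'pre/%s/post\n' "$@") )
set +f              # turn globbing back on (assumes it was on before)
IFS=$old_ifs        # restore IFS

set -- "${a[@]}"
printf '<%s>\n' "$@"
```

This still cannot handle parameters that contain newlines, so the loop-based solution remains the more general one.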
More general solution
If we don't want to make any assumptions about whitespace, we can modify "$@" with a loop:
$ a=(); for p in "$@"; do a+=("pre/$p/post"); done
$ set -- "${a[@]}"
$ echo "$@"
pre/one/post pre/two/post pre/three/post

Note: This is essentially a slightly more detailed version of sjam's answer.
John1024's answer is helpful, but:
requires a subshell (which involves a child process)
can result in unwanted globbing applied to the array elements.
Fortunately, Bash parameter expansion can be applied to arrays too, which avoids these issues:
set -- 'one' 'two' # sample input array, which will be reflected in $@
# Copy $@ to new array ${a[@]}, adding a prefix to each element.
# `/#` replaces the string that follows, up to the next `/`,
# at the *start* of each element.
# In the absence of a string, the replacement string following
# the second `/` is unconditionally placed *before* each element.
a=( "${@/#/PREFIX}" )
# Add a suffix to each element of the resulting array ${a[@]}.
# `/%` replaces the string that follows, up to the next `/`,
# at the *end* of each element.
# In the absence of a string, the replacement string following
# the second `/` is unconditionally placed *after* each element.
a=( "${a[@]/%/SUFFIX}" )
# Print the resulting array.
declare -p a
This yields:
declare -a a='([0]="PREFIXoneSUFFIX" [1]="PREFIXtwoSUFFIX")'
Note that double-quoting the array references is crucial to protect their elements from potential word-splitting and globbing (filename expansion) - both of which are instances of shell expansions.
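The effect of the quoting can be checked by counting elements. This is only an illustrative sketch; the array names quoted and unquoted and the sample values are assumptions.

```bash
#!/usr/bin/env bash
a=( 'two words' 'x y z' )       # sample elements containing spaces

quoted=( "${a[@]/#/PREFIX}" )   # each element stays one word
unquoted=( ${a[@]/#/PREFIX} )   # results are split again on whitespace

echo "quoted:   ${#quoted[@]} elements"
echo "unquoted: ${#unquoted[@]} elements"
```

With the quoted form the array keeps its 2 elements; without quotes the same data falls apart into 5 words.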

Related

Avoid using an array for wildcard expansion in bash

I wrote the following code:
join(){
IFS="$1"
shift
echo "$*"
}
FILES=(/tmp/*)
SEPARATED_FILES=$(join , ${FILES[*]})
echo $SEPARATED_FILES
And it prints the comma-separated list of files in /tmp just fine. But I would like to refactor it and eliminate the temporary global variable FILES, which is an array. I tried the following:
SEPARATED_FILES=$(join , ${(/tmp/*)[*]})
echo $SEPARATED_FILES
But it prints the following error:
line 8: ${(/tmp/*)[*]}: bad substitution
Yes! You can avoid the array by passing the glob directly as an argument to the function. Note that the shell expands the glob before the function is called. So pass the separator you want to use as IFS as the first argument, followed by the glob expression.
join , /tmp/*
The glob is expanded to file names before the function is called, so the call is equivalent to:
join , /tmp/file1 /tmp/file2 /tmp/file3
A noteworthy addition is to enable the nullglob option before calling the function. Without it, a glob that matches nothing is passed to the function as the literal, unexpanded string; with nullglob, it expands to nothing.
shopt -s nullglob
join , /tmp/*
or, inside a command substitution:
fileList=$(shopt -s nullglob; join , /tmp/*)
A couple of takeaways from your good effort:
Always quote variable and array expansions unless you have a reason not to. Quoting preserves the literal value of the contents and prevents word splitting.
Prefer lower-case names for user-defined variables, functions, and arrays; upper-case names are conventionally used by environment and shell variables.
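Putting these points together, a sketch of the refactored function might look as follows. The name join_by (chosen so the function does not shadow the join utility), the use of local for IFS, and the scratch directory are assumptions added for the demonstration.

```bash
#!/usr/bin/env bash
# join_by SEP ARG...: print the arguments joined by the first character of SEP.
join_by() {
  local IFS=$1          # 'local' keeps the IFS change from leaking to the caller
  shift
  printf '%s\n' "$*"    # "$*" joins the parameters with the first character of IFS
}

tmpdir=$(mktemp -d)                # scratch directory for the demonstration
touch "$tmpdir/a" "$tmpdir/b c"    # one file name contains a space

shopt -s nullglob                  # an unmatched glob expands to nothing
separated_files=$(join_by , "$tmpdir"/*)
echo "$separated_files"

rm -r "$tmpdir"
```

Because the glob is passed unquoted while the directory part is quoted, each matching file arrives as exactly one argument, spaces included.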

how to use variables with brace expansion [duplicate]

I have four files:
1.txt 2.txt 3.txt 4.txt
In a Linux shell, I can use:
ls {1..4}.txt to list all four files.
But if I set two variables, var1=1 and var2=4, how do I list the four files?
That is:
var1=1
var2=4
ls {$var1..$var2}.txt # error
what is the correct code?
Using variables with the sequence-expression form ({<numFrom>..<numTo>}) of brace expansion only works in ksh and zsh but, unfortunately, not in bash. Shells that stick (mostly) to POSIX features, such as dash, do not support brace expansion at all, so brace expansion should be avoided with /bin/sh altogether.
Given your symptoms, I assume you're using bash, where you can only use literals in sequence expressions (e.g., {1..3}); from the manual (emphasis mine):
Brace expansion is performed before any other expansions, and any characters special to other expansions are preserved in the result.
In other words: at the time a brace expression is evaluated, variable references have not been expanded (resolved) yet; interpreting literals such as $var1 and $var2 as numbers in the context of a sequence expression therefore fails, so the brace expression is considered invalid and is not expanded.
Note, however, that the variable references are expanded, namely at a later stage of overall expansion; in the case at hand the literal result is the single word '{1..4}' - an unexpanded brace expression with variable values expanded.
While the list form of brace expansion (e.g., {foo,bar}) is expanded the same way, later variable expansion is not an issue there, because no interpretation of the list elements is needed up front; e.g. {$var1,$var2} correctly results in the 2 words 1 and 4.
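The order of expansions can be observed directly in bash by collecting the resulting words in arrays. This is a small sketch; the array names literal, sequence and list are chosen here for illustration.

```bash
#!/usr/bin/env bash
var1=1 var2=4

literal=( {1..4} )            # literal endpoints: a valid sequence expression
sequence=( {$var1..$var2} )   # not a valid sequence at brace-expansion time
list=( {$var1,$var2} )        # list form: split first, variables expanded later

echo "${#literal[@]}: ${literal[*]}"
echo "${#sequence[@]}: ${sequence[*]}"
echo "${#list[@]}: ${list[*]}"
```

In bash this prints 4 words for the literal form, the single word {1..4} for the variable-based sequence, and the 2 words 1 and 4 for the list form.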
As for why variables cannot be used in sequence expressions: historically, the list form of brace expansion came first, and when the sequence-expression form was later introduced, the order of expansions was already fixed.
For a general overview of brace expansion, see this answer.
Workarounds
Note: The workarounds focus on numerical sequence expressions, as in the question; the eval-based workaround also demonstrates use of variables with the less common character sequence expressions, which produce ranges of English letters (e.g., {a..c} to produce a b c).
A seq-based workaround is possible, as demonstrated in Jameson's answer.
A small caveat is that seq is not a POSIX utility, but most modern Unix-like platforms have it.
To refine it a little, using seq's -f option to supply a printf-style format string, and demonstrating two-digit zero-padding:
seq -f '%02.f.txt' $var1 $var2 | xargs ls # '%02.f'==zero-pad to 2 digits, no decimal places
Note that to make it fully robust - in case the resulting words contain spaces or tabs - you'd need to employ embedded quoting:
seq -f '"%02.f a.txt"' $var1 $var2 | xargs ls
ls then sees 01 a.txt, 02 a.txt, ... with the argument boundaries correctly preserved.
If you want to robustly collect the resulting words in a Bash array first, e.g., ${words[@]}:
IFS=$'\n' read -d '' -ra words < <(seq -f '%02.f.txt' $var1 $var2)
ls "${words[@]}"
The following are pure Bash workarounds:
A limited workaround using Bash features only is to use eval:
var1=1 var2=4
# Safety check
(( 10#$var1 + 10#$var2 || 1 )) 2>/dev/null || { echo "Need decimal integers." >&2; exit 1; }
ls $(eval printf '%s\ ' "{$var1..$var2}.txt") # -> ls 1.txt 2.txt 3.txt 4.txt
You can apply a similar technique to a character sequence expression;
var1=a var2=c
# Safety check
[[ $var1 == [a-zA-Z] && $var2 == [a-zA-Z] ]] || { echo "Need single letters."; exit 1; }
ls $(eval printf '%s\ ' "{$var1..$var2}.txt") # -> ls a.txt b.txt c.txt
Note:
A check is performed up front to ensure that $var1 and $var2 contain decimal integers or single English letters, which then makes it safe to use eval. Generally, using eval with unchecked input is a security risk and use of eval is therefore best avoided.
Given that output from eval must be passed unquoted to ls here, so that the shell splits the output into individual arguments through word-splitting, this only works if the resulting filenames contain no embedded spaces or other shell metacharacters.
A more robust, but more cumbersome, pure Bash workaround is to use an array to create the equivalent words:
var1=1 var2=4
# Emulate brace sequence expression using an array.
args=()
for (( i = var1; i <= var2; i++ )); do
args+=( "$i.txt" )
done
ls "${args[@]}"
This approach bears no security risk and also works with resulting filenames with embedded shell metacharacters, such as spaces.
Custom increments can be implemented by replacing i++ with, e.g., i+=2 to step in increments of 2.
Implementing zero-padding would require use of printf; e.g., as follows:
args+=( "$(printf '%02d.txt' "$i")" ) # -> '01.txt', '02.txt', ...
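Both refinements, a custom increment and zero-padding, can be combined in one loop. The following is a sketch under assumptions: the variable names step and name are introduced here, and printf -v is used instead of a command substitution so that no subshell is needed.

```bash
#!/usr/bin/env bash
var1=1 var2=9 step=2   # 'step' is a name introduced for this sketch

args=()
for (( i = var1; i <= var2; i += step )); do
  printf -v name '%02d.txt' "$i"   # -v stores the formatted result in 'name'
  args+=( "$name" )
done

printf '%s\n' "${args[@]}"
```

The array can then be passed on as before, e.g. with ls "${args[@]}".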
For that particular piece of syntax (a "sequence expression") you're out of luck, see Bash man page:
A sequence expression takes the form {x..y[..incr]}, where x and y are
either integers or single characters, and incr, an optional increment,
is an integer.
However, you could instead use the seq utility, which would have a similar effect -- and the approach would allow for the use of variables:
var1=1
var2=4
for i in $(seq "$var1" "$var2"); do
ls "${i}.txt"
done
Or, if calling ls four times instead of once bothers you, and/or you want it all on one line, something like:
for i in $(seq "$var1" "$var2"); do echo "${i}.txt"; done | xargs ls
From seq(1) man page:
seq [OPTION]... LAST
seq [OPTION]... FIRST LAST
seq [OPTION]... FIRST INCREMENT LAST

Is there an option to "ls" that limits filename characters?

Syntax question: if I have a number of subdirectories within a target directory, and I want to write the names of the subdirectories to a text file, I can easily run:
ls > filelist.txt
on the target. But say all of my subs are named with a 7 character prefix like:
JR-5426_mydir
JR-5487_mydir2
JR-5517_mydir3
...
and I just want the prefixes. Is there an option to "ls" that will only output n characters per line?
Don't use ls in any programmatic context; it should be used strictly for presentation to humans -- ParsingLs gives details on why.
On bash 4.0 or later, the below will provide a deduplicated list of filename prefixes:
declare -A prefixes_seen=( ) # create an associative array -- aka "hash" or "map"
for file in *; do # iterate over all non-hidden directory entries
prefixes_seen[${file:0:2}]=1 # add the first two chars of each as a key in the map
done
printf '%s\n' "${!prefixes_seen[@]}" # print all keys in the map separated by newlines
That said, if instead of wanting a 2-character prefix you want everything before the first -, you can write something cleaner:
declare -A prefixes_seen=( )
for file in *-*; do
prefixes_seen[${file%%-*}]=1 # "${file%%-*}" cuts off "$file" at the first dash
done
printf '%s\n' "${!prefixes_seen[@]}"
...and if you don't care about deduplication:
for file in *-*; do
printf '%s\n' "${file%%-*}"
done
...or, sticking with the two-character rule:
for file in *; do
printf '%s\n' "${file:0:2}"
done
That said -- if you're trying to Do It Right, you shouldn't be using newlines to separate lists of filename characters either, because newlines are valid inside filenames on POSIX filesystems. Think about a file named f$'\n'oobar -- that is, with a literal newline in the second character; code written carelessly would see f as one prefix and oo as a second one, from this single name. Iterating over associative-array prefixes, as done for the deduplicating answers, is safer in this case, because it doesn't rely on any delimiter character.
To demonstrate the difference -- if instead of writing
printf '%s\n' "${!prefixes_seen[@]}"
you wrote
printf '%q\n' "${!prefixes_seen[@]}"
it would emit the prefix of the hypothetical file f$'\n'oobar as
$'f\n'
instead of
f
...with an extra newline below it.
If you want to pass lists of filenames (or, as here, filename prefixes) between programs, the safe way to do it is to NUL-delimit the elements -- as NULs are the single character which can't possibly exist in a valid UNIX path. (A filename also can't contain /, but a path obviously can).
A NUL-delimited list can be written like so:
printf '%s\0' "${!prefixes_seen[@]}"
...and read back into an identical data structure on the receiving end (should the receiving code be written in bash) like so:
declare -A prefixes_seen=( )
while IFS= read -r -d '' prefix; do
prefixes_seen[$prefix]=1
done
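The reading loop needs its input from somewhere, so here is a self-contained round trip as a sketch. The array names sent and received, the sample prefixes, and the use of process substitution to connect writer and reader are assumptions; in practice the writer would usually be a separate program.

```bash
#!/usr/bin/env bash
# Requires bash 4.0 or later for associative arrays.
declare -A sent=( )
key=$'f\n'            # a prefix that contains a newline
sent[$key]=1
sent[JR]=1

declare -A received=( )
while IFS= read -r -d '' prefix; do
  received[$prefix]=1
done < <(printf '%s\0' "${!sent[@]}")   # writer side: NUL-delimited keys

echo "received ${#received[@]} prefixes"
printf '%q\n' "${!received[@]}"         # %q makes the embedded newline visible
```

Both prefixes should arrive intact, including the one with the newline, which a newline-delimited format would have broken in two.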
No, you use the cut command:
ls | cut -c1-7

decrypting a variable in a scripting environment of Linux

What does $@ in a Unix shell script signify? For example:
A__JOB="$CLASS $@"
where $CLASS holds my Java class file name. So what might be the meaning of
$@?
What did I do?
I Googled :) but $@ seems to be a complex query for it, or maybe I do not know how to search Google for special characters.
$@ is the value of all arguments passed.
For example, if you pass:
./script A B C D
then "$@" will be equal to "A" "B" "C" "D"
So it looks like the purpose is to pass all the arguments given to the script directly on to the java program.
From bash manual:
@ Expands to the positional parameters, starting from one. When the expansion occurs within double quotes, each parameter expands to a separate word. That is, "$@" is equivalent to "$1" "$2" ... If the double-quoted expansion occurs within a word, the expansion of the first parameter is joined with the beginning part of the original word, and the expansion of the last parameter is joined with the last part of the original word. When there are no positional parameters, "$@" and $@ expand to nothing (i.e., they are removed).
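One detail is worth checking for the assignment in the question: an assignment does not perform word splitting, so bash joins the parameters of "$@" into a single space-separated string there. The sketch below illustrates this; the class name MyClass, the sample arguments and the array name job are assumptions.

```bash
#!/usr/bin/env bash
CLASS=MyClass            # hypothetical class name
set -- A 'B C' D         # stand-in for the script's arguments

A__JOB="$CLASS $@"       # in an assignment, "$@" is joined into one string
echo "$A__JOB"           # the boundary between 'B C' and D is no longer visible

# To keep the argument boundaries, hold the words in an array instead.
job=( "$CLASS" "$@" )
echo "${#job[@]} words"
```

The string form is fine for logging, but if the arguments may contain spaces, the array form is the one that can be passed on safely as "${job[@]}".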

What does $@ mean in a shell script?

What does a dollar sign followed by an at-sign (@) mean in a shell script?
For example:
umbrella_corp_options $@
$@ is all of the parameters passed to the script.
For instance, if you call ./someScript.sh foo bar then $@ will be equal to foo bar.
If you do:
./someScript.sh foo bar
and then inside someScript.sh reference:
umbrella_corp_options "$@"
this will be passed to umbrella_corp_options with each individual parameter enclosed in double quotes, so that parameters containing blank space are taken from the caller and passed on intact.
$@ is nearly the same as $*, both meaning "all command line arguments". They are often used to simply pass all arguments to another program (thus forming a wrapper around that other program).
The difference between the two syntaxes shows up when you have an argument with spaces in it and put $@ in double quotes:
wrappedProgram "$@"
# ^^^ this is correct and will hand over all arguments in the way
# we received them, i. e. as several arguments, each of them
# containing all the spaces and other uglinesses they have.
wrappedProgram "$*"
# ^^^ this will hand over exactly one argument, containing all
# original arguments, separated by single spaces.
wrappedProgram $*
# ^^^ this will join all arguments by single spaces as well and
# will then split the string as the shell does on the command
# line, thus it will split an argument containing spaces into
# several arguments.
Example: Calling
wrapper "one two three" four five "six seven"
will result in:
"$@": wrappedProgram "one two three" four five "six seven"
"$*": wrappedProgram "one two three four five six seven"
^^^^ These spaces are part of the first
argument and are not changed.
$*: wrappedProgram one two three four five six seven
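The three cases can be verified by counting how many arguments actually arrive. In this sketch, count_args is a hypothetical stand-in for wrappedProgram, and the variable names are chosen for illustration.

```bash
#!/usr/bin/env bash
count_args() { echo "$#"; }   # hypothetical stand-in for wrappedProgram

wrapper() {
  at_quoted=$(count_args "$@")     # arguments handed over unchanged
  star_quoted=$(count_args "$*")   # everything joined into one argument
  star_unquoted=$(count_args $*)   # joined, then split again on whitespace
}

wrapper "one two three" four five "six seven"
printf '%s\n' "quoted @: $at_quoted" "quoted *: $star_quoted" "unquoted *: $star_unquoted"
```

With the default IFS, the counts are 4, 1 and 7 respectively, matching the three lines of the example above.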
These are the command line arguments where:
$@ = stores all the arguments in a list of strings
$* = stores all the arguments as a single string
$# = stores the number of arguments
The use of an unquoted $@ means in most cases "hurt the programmer as hard as you can", because in most cases it leads to problems with word separation and with spaces and other characters in arguments.
In (guessed) 99% of all cases, it must be enclosed in double quotes: "$@" is what can be used to reliably iterate over the arguments.
for a in "$@"; do something_with "$a"; done
Meaning.
In brief, $@ expands to the arguments passed from the caller to a function or a script. Its meaning is context-dependent: inside a function, it expands to the arguments passed to that function. If used in a script (outside a function), it expands to the arguments passed to that script.
$ cat my-script
#! /bin/sh
echo "$@"
$ ./my-script "Hi!"
Hi!
$ put () { echo "$@"; }
$ put "Hi!"
Hi!
* Note: Word splitting.
The shell splits tokens based on the contents of the IFS variable. Its default value is " \t\n"; i.e., space, tab, and newline. Expanding "$@" gives you a pristine copy of the arguments passed. Expanding $@ may not. More specifically, any arguments containing characters present in IFS might split into two or more arguments or get truncated.
Thus, most of the time what you will want to use is "$@", not $@.
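The dependence on IFS can be made visible with a non-default value. This is a sketch only; the function names count and demo and the sample arguments are assumptions.

```bash
#!/usr/bin/env bash
count() { echo "$#"; }   # prints the number of arguments it receives

demo() {
  local IFS=:            # split on colons only, for this function
  quoted=$(count "$@")   # arguments arrive unchanged
  unquoted=$(count $@)   # 'a:b' is split at the colon
}

demo "a:b" "c"
echo "quoted: $quoted, unquoted: $unquoted"
```

The quoted expansion reports 2 arguments, while the unquoted one reports 3, because the first argument contains a character that is present in IFS.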
From the manual:
@
Expands to the positional parameters, starting from one. When the expansion occurs within double quotes, each parameter expands to a separate word. That is, "$@" is equivalent to "$1" "$2" .... If the double-quoted expansion occurs within a word, the expansion of the first parameter is joined with the beginning part of the original word, and the expansion of the last parameter is joined with the last part of the original word. When there are no positional parameters, "$@" and $@ expand to nothing (i.e., they are removed).
$@ refers to all of the command-line arguments of a shell script.
$1, $2, $3 refer to the first, second, and third command-line arguments.
They are often used to simply pass all arguments to another program:
[root@node1 shell]# ./my-script hi 11 33
hi 11 33
[root@node1

Resources