Don't understand bash parameter declaration - linux

I have a problem with reading this parameter below:
I don't understand the purpose of using this $(basename "$0") where is it come from.
${BINARY%/*} seems it try to get the path of the directory but what exactly why just need to like this.
DIR_NAME=$(dirname "$0")
FILE_NAME=$(basename "$0")
BINARY=`readlink ${ROOT_DIR}/${DIR_NAME}/${FILE_NAME} -f`
BIN_PATH=${BINARY%/*}

$0 is the pathname of the script being run. So $(dirname "$0") returns the directory of the script, and $(basename "$0") is the filename.
${BINARY%/*} is explained in Shell Parameter Expansion
${parameter%word}
${parameter%%word}
The word is expanded to produce a pattern just as in filename expansion. If the pattern matches a trailing portion of the expanded value of parameter, then the result of the expansion is the value of parameter with the shortest matching pattern (the ‘%’ case) or the longest matching pattern (the ‘%%’ case) deleted.
So this finds the trailing portion of $BINARY that matches /* and removes it, which returns the directory portion. It's equivalent to $(dirname "$BINARY")

Related

Rename filenames in shellscript substring

I have files which are named as "images123.jpg", "images456.jpg" etc
I would want to mv these files into testfolder folder and rename them accordingly to "123.jpg", "456.jpg" etc.
This is what i tried
for file in *.jpg; d
mv $file testfolder/($file | cut -c7-)
done
The below script should do what you need.
#!/bin/sh
for file in images*.jpg; do
mv ${file} testfolder/${file#images}
done
The key part is ${file#images}. That is a bash shell parameter expansion:
${parameter#word}
${parameter##word}
The word is expanded to produce a pattern and matched according to the rules described below (see Pattern Matching). If the pattern matches the beginning of the expanded value of parameter, then the result of the expansion is the expanded value of parameter with the shortest matching pattern (the ‘#’ case) or the longest matching pattern (the ‘##’ case) deleted.
In this particular case it matches and strips images from the start of each file name.

Why does echo command interpret variable for base directory?

I would like to find some file types in pictures folder and I have created the following bash-script in /home/user/pictures folder:
for i in *.pdf *.sh *.txt;
do
echo 'all file types with extension' $i;
find /home/user/pictures -type f -iname $i;
done
But when I execute the bash-script, it does not work as expected for files that are located on the base directory /home/user/pictures. Instead of echo 'All File types with Extension *.sh' the command interprets the variable for base directory:
all file types with extension file1.sh
/home/user/pictures/file1.sh
all file types with extension file2.sh
/home/user/pictures/file2.sh
all file types with extension file3.sh
/home/user/pictures/file3.sh
I would like to know why echo - command does not print "All File types with Extension *.sh".
Revised code:
for i in '*.pdf' '*.sh' '*.txt'
do
echo "all file types with extension $i"
find /home/user/pictures -type f -iname "$i"
done
Explanation:
In bash, a string containing *, or a variable which expands to such a string, may be expanded as a glob pattern unless that string is protected from glob expansion by putting it inside quotes (although if the glob pattern does not match any files, then the original glob pattern will remain after attempted expansion).
In this case, it is not wanted for the glob expansion to happen - the string containing the * needs to be passed as a literal to each of the echo and the find commands. So the $i should be enclosed in double quotes - these will allow the variable expansion from $i, but the subsequent wildcard expansion will not occur. (If single quotes, i.e. '$i' were used instead, then a literal $i would be passed to echo and to find, which is not wanted either.)
In addition to this, the initial for line needs to use quotes to protect against wildcard expansion in the event that any files matching any of the glob patterns exist in the current directory. Here, it does not matter whether single or double quotes are used.
Separately, the revised code here also removes some unnecessary semicolons. Semicolons in bash are a command separator and are not needed merely to terminate a statement (as in C etc).
Observed behaviour with original code
What seems to be happening here is that one of the patterns used in the initial for statement is matching files in the current directory (specifically the *.sh is matching file1.sh file2.sh, and file3.sh). It is therefore being replaced by a list of these filenames (file1.sh file2.sh file3.sh) in the expression, and the for statement will iterate over these values. (Note that the current directory might not be the same as either where the script is located or the top level directory used for the find.)
It would also still be expected that the *.pdf and *.txt would be used in the expression -- either substituted or not, depending on whether any matches are found. Therefore the output shown in the question is probably not the whole output of the script.
Such expressions (*.blabla) changes the value of $i in the loop. Here is the trick i would do :
for i in pdf sh txt;
do
echo 'all file types with extension *.'$i;
find /home/user/pictures -type f -iname '*.'$i;
done

Using a glob expression passed as a bash script argument

TL;DR:
Why isn't invoking ./myscript foo* when myscript has var=$1 the same as invoking ./myscript with var=foo* hardcoded?
Longer form
I've come across a weird issue in a bash script I'm writing. I am sure there is a simple explanation, but I can't figure it out.
I am trying to pass a command line argument to be assigned as a variable in the script.
I want the script to allow 2 command line arguments as follows:
$ bash my_bash_script.bash args1 args2
In my script, I assigned variables like this:
ARGS1=$1
ARGS2=$2
Args 1 is a string descriptor to add to the output file.
Args 2 is a group of directories: "dir1, dir2, dir3", which I am passing as dir*
When I assign dir* to ARGS2 in the script it works fine, but when I pass dir* as the second command line argument, it only includes dir1 in the wildcard expansion of dir*.
I assume this has something to do with how the shell handles wildcards (even when passed as args), but I don't really understand it.
Any help would be appreciated.
Environment / Usage
I have a group of directories:
dir_1_y_map, dir_1_x_map, dir_2_y_map, dir_2_x_map,
... dir_10_y_map, dir_10_x_map...
Inside these directories I am trying to access a file with extension ".status" via *.status, and ".report.txt" via *report.txt.
I want to pass dir_*_map as the second argument to the script and store it in the variable ARGS2, then use it to search within each of the directories for the ".status" and ".report" files.
The issue is that passing dir_*_map from the command line doesn't give the list of directories, but rather just the first item in the list. If I assign the variable ARGS2=dir_*_map within the script, it works as I intend.
Workaround: Quoting
It turns out that passing the second argument in quotes allowed the wildcard expansion to work appropriately for "dir_*_map"
#!/usr/bin/env bash
ARGS1=$1
ARGS2=$2
touch $ARGS1".extension"
for i in /$ARGS2/*.status
do
grep -e "string" $i >> $ARGS1".extension"
done
Here is an example invocation of the script:
sh ~/path/to/script descriptor "dir_*_map"
I don't fully understand when/why some arguments must be passed in quotes, but I assume it has to do with the wildcard expansion in the for loop.
Addressing the "why"
Assignments, as in var=foo*, don't expand globs -- that is, when you run var=foo*, the literal string foo* is put into the variable foo, not the list of files matching foo*.
By contrast, unquoted use of foo* on a command line expands the glob, replacing it with a list of individual names, each of which is passed as a separate argument.
Thus, running ./yourscript foo* doesn't pass foo* as $1 unless no files matching that glob expression exist; instead, it becomes something like ./yourscript foo01 foo02 foo03, with each argument in a different spot on the command line.
The reason running ./yourscript "foo*" functions as a workaround is the unquoted expansion inside the script allowing the glob to be expanded at that later time. However, this is bad practice: glob expansion happens concurrent with string-splitting (meaning that relying on this behavior removes your ability to pass filenames containing characters found in IFS, typically whitespace), and also means that you can't pass literal filenames when they could also be interpreted as globs (if you have a file named [1] and a file named 1, passing [1] would always be replaced with 1).
Idiomatic Usage
The idiomatic way to build this would be to shift away the first argument, and then iterate over subsequent ones, like so:
#!/bin/bash
out_base=$1; shift
shopt -s nullglob # avoid generating an error if a directory has no .status
for dir; do # iterate over directories passed in $2, $3, etc
for file in "$dir"/*.status; do # iterate over files ending in .status within those
grep -e "string" "$file" # match a single file
done
done >"${out_base}.extension"
If you have many .status files in a single directory, all this can be made more efficient by using find to invoke grep with as many arguments as possible, rather than calling grep individually on a per-file basis:
#!/bin/bash
out_base=$1; shift
find "$#" -maxdepth 1 -type f -name '*.status' \
-exec grep -h -- /dev/null '{}' + \
>"${out_base}.extension"
Both scripts above expect the globs passed not to be quoted on the invoking shell. Thus, usage is of the form:
# being unquoted, this expands the glob into a series of separate arguments
your_script descriptor dir_*_map
This is considerably better practice than passing globs to your script (which then is required to expand them to retrieve the actual files to use); it works correctly with filenames containing whitespace (which the other practice doesn't), and files whose names are themselves glob expressions.
Some other points of note:
Always put double quotes around expansions! Failing to do so results in the additional steps of string-splitting and glob expansion (in that order) being applied. If you want globbing, as in the case of "$dir"/*.status, then end the quotes before the glob expression starts.
for dir; do is precisely equivalent to for dir in "$#"; do, which iterates over arguments. Don't make the mistake of using for dir in $*; do or for dir in $#; do instead! These latter invocations combine each element of the list with the first character of IFS (which, by default, contains the space, the tab and the newline in that order), then splits the resulting string on any IFS characters found within, then expands each component of the resulting list as a glob.
Passing /dev/null as an argument to grep is a safety measure: It ensures that you don't have different behavior between the single-argument and multi-argument cases (as an example, grep defaults to printing filenames within output only when passed multiple arguments), and ensures that you can't have grep hang trying to read from stdin if it's passed no additional filenames at all (which find won't do here, but xargs can).
Using lower-case names for your own variables (as opposed to system- and shell-provided variables, which have all-uppercase names) is in accordance with POSIX-specified convention; see fourth paragraph of the POSIX specification regarding environment variables, keeping in mind that environment variables and shell variables share a namespace.

Check file extension BASH

I'm trying to compare 2 files that have the same name and the same contents but I only want to keep one of them.
Lets say I have the following
/root/dir1/file1.doc
/root/dir1/file1.pdf
I only want to keep the .doc file what would be the best way to go about this?
From the dash manpage (and I believe all bash, ksh, and zsh "inherit" these features):
${parameter%word} Remove Smallest Suffix Pattern. The word is
expanded to produce a pattern. The parameter expansion then results
in parameter, with the smallest portion of the suffix matched by the
pattern
deleted.
${parameter%%word} Remove Largest Suffix Pattern. The word is expanded to produce a pattern. The parameter expansion then results
in parameter, with the largest portion of the suffix matched by the
pattern
deleted.
${parameter#word} Remove Smallest Prefix Pattern. The word is expanded to produce a pattern. The parameter expansion then
results in parameter, with the smallest portion of the prefix matched
by the pattern
deleted.
${parameter##word} Remove Largest Prefix Pattern. The word is expanded to produce a pattern. The parameter expansion then results
in parameter, with the largest portion of the prefix matched by the
pattern
In practice:
shortSuff(){ printf '%s\n' "${1#*.}"; }
#^applies the string op to the $1 positional parameter
#`printf '%s\n'` is like `echo` but doesn't break on e.g., hyphen-prefix arguments
file=/root/dir1/file1.doc
[ shortSuff "$file" = doc ] && echo "Yes, the short suffix is doc"
You can write a short script to do this for you:
#!/bin/bash
EXT="${1}"
if -z "${EXT}"; then EXT=".doc"; fi
TD="$(mktemp -d)"
mv ./*"${EXT}" "${TD}"
rm ./*
mv "${TD}"/* .
rmdir "${TD}"
Assuming your script is called keep_copy.sh, you would call it as follows: keep_copy.sh .doc.

Extract file basename without path and extension in bash [duplicate]

This question already has answers here:
Extract filename and extension in Bash
(38 answers)
Closed 6 years ago.
Given file names like these:
/the/path/foo.txt
bar.txt
I hope to get:
foo
bar
Why this doesn't work?
#!/bin/bash
fullfile=$1
fname=$(basename $fullfile)
fbname=${fname%.*}
echo $fbname
What's the right way to do it?
You don't have to call the external basename command. Instead, you could use the following commands:
$ s=/the/path/foo.txt
$ echo "${s##*/}"
foo.txt
$ s=${s##*/}
$ echo "${s%.txt}"
foo
$ echo "${s%.*}"
foo
Note that this solution should work in all recent (post 2004) POSIX compliant shells, (e.g. bash, dash, ksh, etc.).
Source: Shell Command Language 2.6.2 Parameter Expansion
More on bash String Manipulations: http://tldp.org/LDP/LG/issue18/bash.html
The basename command has two different invocations; in one, you specify just the path, in which case it gives you the last component, while in the other you also give a suffix that it will remove. So, you can simplify your example code by using the second invocation of basename. Also, be careful to correctly quote things:
fbname=$(basename "$1" .txt)
echo "$fbname"
A combination of basename and cut works fine, even in case of double ending like .tar.gz:
fbname=$(basename "$fullfile" | cut -d. -f1)
Would be interesting if this solution needs less arithmetic power than Bash Parameter Expansion.
Here are oneliners:
$(basename "${s%.*}")
$(basename "${s}" ".${s##*.}")
I needed this, the same as asked by bongbang and w4etwetewtwet.
Pure bash, no basename, no variable juggling. Set a string and echo:
p=/the/path/foo.txt
echo "${p//+(*\/|.*)}"
Output:
foo
Note: the bash extglob option must be "on", (Ubuntu sets extglob "on" by default), if it's not, do:
shopt -s extglob
Walking through the ${p//+(*\/|.*)}:
${p -- start with $p.
// substitute every instance of the pattern that follows.
+( match one or more of the pattern list in parenthesis, (i.e. until item #7 below).
1st pattern: *\/ matches anything before a literal "/" char.
pattern separator | which in this instance acts like a logical OR.
2nd pattern: .* matches anything after a literal "." -- that is, in bash the "." is just a period char, and not a regex dot.
) end pattern list.
} end parameter expansion. With a string substitution, there's usually another / there, followed by a replacement string. But since there's no / there, the matched patterns are substituted with nothing; this deletes the matches.
Relevant man bash background:
pattern substitution:
${parameter/pattern/string}
Pattern substitution. The pattern is expanded to produce a pat
tern just as in pathname expansion. Parameter is expanded and
the longest match of pattern against its value is replaced with
string. If pattern begins with /, all matches of pattern are
replaced with string. Normally only the first match is
replaced. If pattern begins with #, it must match at the begin‐
ning of the expanded value of parameter. If pattern begins with
%, it must match at the end of the expanded value of parameter.
If string is null, matches of pattern are deleted and the / fol
lowing pattern may be omitted. If parameter is # or *, the sub
stitution operation is applied to each positional parameter in
turn, and the expansion is the resultant list. If parameter is
an array variable subscripted with # or *, the substitution
operation is applied to each member of the array in turn, and
the expansion is the resultant list.
extended pattern matching:
If the extglob shell option is enabled using the shopt builtin, several
extended pattern matching operators are recognized. In the following
description, a pattern-list is a list of one or more patterns separated
by a |. Composite patterns may be formed using one or more of the fol
lowing sub-patterns:
?(pattern-list)
Matches zero or one occurrence of the given patterns
*(pattern-list)
Matches zero or more occurrences of the given patterns
+(pattern-list)
Matches one or more occurrences of the given patterns
#(pattern-list)
Matches one of the given patterns
!(pattern-list)
Matches anything except one of the given patterns
Here is another (more complex) way of getting either the filename or extension, first use the rev command to invert the file path, cut from the first . and then invert the file path again, like this:
filename=`rev <<< "$1" | cut -d"." -f2- | rev`
fileext=`rev <<< "$1" | cut -d"." -f1 | rev`
If you want to play nice with Windows file paths (under Cygwin) you can also try this:
fname=${fullfile##*[/|\\]}
This will account for backslash separators when using BaSH on Windows.
Just an alternative that I came up with to extract an extension, using the posts in this thread with my own small knowledge base that was more familiar to me.
ext="$(rev <<< "$(cut -f "1" -d "." <<< "$(rev <<< "file.docx")")")"
Note: Please advise on my use of quotes; it worked for me but I might be missing something on their proper use (I probably use too many).
Use the basename command. Its manpage is here: http://unixhelp.ed.ac.uk/CGI/man-cgi?basename

Resources