Bash and Variable Substitution for file with space in their name: application for gpsbabel

I am trying to write a script to run gpsbabel, but I am stuck on handling files whose names contain spaces.
My problem is with the bash syntax. Any help or insight from bash programmers will be much appreciated.
gpsbabel is software that can merge tracks recorded by GPS devices.
The syntax that works for my purpose is:
gpsbabel -i gpx -f "file 1.gpx" -f "file 2.gpx" -o gpx -F output.gpx -x track,merge
The input format of the GPS data is given by -i , the output format by -o.
The input data files are listed after -f, and the resulting file after -F
(ref. gpsbabel manual, see example 4.9)
I am trying to write a script that runs this command with a number of input files that is not known in advance. This means that the sequence -f "name_of_the_input_file" has to be repeated for each input file passed as a parameter to the script.
Here is a script that works for files with no spaces in their names:
#!/bin/bash
# Append multiple gpx files easily
# batch name merge_gpx.sh
# Usage:
# merge_gpx.sh track_*.gpx
gpsbabel -i gpx $(echo $* | for GPX; do echo -n " -f $GPX "; done) \
-o gpx -F appended.gpx
So I tried to modify this script to also handle filenames containing spaces.
I got lost in the bash substitution and wrote a more step-by-step version for debugging purposes, with no success.
Here is one of my attempts.
I get the error "Extra arguments on command line" from gpsbabel, suggesting that I made a mistake in the variable usage.
#!/bin/bash
# Merging all tracks in a single one
old_IFS=$IFS # Backup internal separator
IFS=$'\n' # New IFS
let i=0
echo " Merging GPX files"
for file in $(ls -1 "$@")
do
let i++
echo "i=" $i "," "$file"
tGPX[$i]=$file
done
IFS=$old_IFS #
#
echo "Number of files:" ${#tGPX[@]}
echo
# List of the datafile to treat (each name protected with a ')
LISTE=$(for (( ifile=1; ifile<=${#tGPX[@]} ; ifile++)) ;do echo -ne " -f '""${tGPX[$ifile]}""'"; done)
echo "LISTE: " $(echo -n $LISTE)
echo "++Merging .."
if (( $i>=1 )); then
gpsbabel -t \
-i gpx $(echo -n $LISTE) \
-x track,merge,title="TEST COMPIL" \
-o gpx -F track_compil.gpx
else
echo "Wrong selection of input file"
fi
#end

You are making things way more complicated for yourself than they need to be.
Any reasonably POSIX/GNU-compatible utility which takes an option in the form of two command-line arguments (-f STRING, or equivalently -f FILENAME) should also accept a single command-line argument -fSTRING. If the utility uses either getopt or getopt_long, this is automatic. gpsbabel appears not to use the standard POSIX or GNU libraries for argument parsing, but I believe it still gets this right.
Apparently, your script expects its arguments to be a list of filenames; presumably, if the filenames include whitespace, you will quote the names which include whitespace:
./myscript "file 1.gpx" "file 2.gpx"
In that case, you only need to change the list of arguments by prepending -f to each one, so that the argument list becomes, in effect:
"-ffile 1.gpx" "-ffile 2.gpx"
That's extremely straightforward. We'll use the bash-specific find-and-replace syntax, described in the bash manual. The two features this solution uses are a pattern beginning with # and a parameter of @:
${parameter/pattern/string}
Pattern substitution. The pattern is expanded to produce a pattern just as in pathname expansion. Parameter is expanded and the longest match of pattern against its value is replaced with string. If pattern begins with /, all matches of pattern are replaced with string. Normally only the first match is replaced. If pattern begins with #, it must match at the beginning of the expanded value of parameter. If pattern begins with %, it must match at the end of the expanded value of parameter. If string is null, matches of pattern are deleted and the / following pattern may be omitted. If parameter is @ or *, the substitution operation is applied to each positional parameter in turn, and the expansion is the resultant list. If parameter is an array variable subscripted with @ or *, the substitution operation is applied to each member of the array in turn, and the expansion is the resultant list.
So, "${@/#/-f}" is the list of arguments (@), with the empty pattern at the beginning (#) replaced with -f:
#!/bin/bash
# Merging all tracks in a single one
# $# is the number of arguments to the script.
if (( $# > 0 )); then
    gpsbabel -t \
        -i gpx "${@/#/-f}" \
        -x track,merge,title="TEST COMPIL" \
        -o gpx -F track_compil.gpx
else
    # I changed the error message to make it more clear, sent it to stderr
    # and cause the script to fail.
    echo "No input files specified" >&2
    exit 1
fi
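As a quick check of what that expansion produces, here is a minimal sketch that does not involve gpsbabel at all. The function name show_prefixed is made up for illustration; it prints each resulting word in angle brackets so that the word boundaries are visible:

```bash
#!/usr/bin/env bash
# show_prefixed ARGS...: hypothetical helper that prints each word of
# "${@/#/-f}" on its own line, wrapped in <> to expose word boundaries.
show_prefixed() {
    printf '<%s>\n' "${@/#/-f}"
}

show_prefixed "file 1.gpx" "file 2.gpx"
```

This prints <-ffile 1.gpx> and <-ffile 2.gpx> on separate lines: each name stays a single word, spaces included, with -f glued to its front.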

Use an array:
files=()
for f; do
files+=(-f "$f")
done
gpsbabel -i gpx "${files[@]}" -o gpx -F appended.gpx
for f; do is short for for f in "$@"; do. Most often you want to use $@ rather than $* to access the command-line arguments. Quoting "${files[@]}" produces a list of words, one per element, that are treated as if they were quoted, so array elements containing whitespace are treated as a single word.
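To see the words that the array approach hands to the command, here is a small sketch with gpsbabel replaced by printf. The function name build_input_args is an assumption made for this illustration; the array name files follows the answer:

```bash
#!/usr/bin/env bash
# build_input_args FILES...: hypothetical helper that fills the global
# array "files" with a -f option followed by each file name.
build_input_args() {
    files=()
    local f
    for f; do            # iterates over the function's own arguments
        files+=(-f "$f")
    done
}

build_input_args "file 1.gpx" "file 2.gpx"
# One bracketed line per word that the real command would receive.
printf '[%s]\n' "${files[@]}"
```

The array ends up with four elements (-f, file 1.gpx, -f, file 2.gpx), so each file name reaches the command as one argument regardless of the spaces in it.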

Related

How to replace date part in filename with current date

How can I replace only the date part with the current date in the names of all files present in a directory in Unix?
Folder path: C:/shan
Sample files:
CN_Apria_837p_20180924.txt
DN_Apria_837p_20150502.txt
GN_Apria_837p_20160502.txt
CH_Apria_837p_20170502.txt
CU_Apria_837p_20180502.txt
PN_Apria_837p_20140502.txt
CN_Apria_837p_20101502.txt
Desired result should be:
CN_Apria_837p_20190502.txt
DN_Apria_837p_20190502.txt
GN_Apria_837p_20190502.txt
CH_Apria_837p_20190502.txt
CU_Apria_837p_20190502.txt
PN_Apria_837p_20190502.txt
CN_Apria_837p_20190502.txt
Edit:
I'm completely new to Unix shell scripting. I tried the script below, but it's not working.
#!/bin/bash
for i in `ls $1 | grep -E '[0-9]{4}-[0-9]{2}-[0-9]{2}'`
do
x=`echo $i | grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2}'`
y=`echo $i | sed "s/$x/$(date +%F)/g"`
mv $1/$i $1/$y 2>/dev/null #incase if old date is same as current date
done

I would use regular expressions here. From the bash man-page:
An additional binary operator, =~, is available, with the same
precedence as == and !=. When it is used, the string to the right
of the operator is considered an extended regular expression and
matched accordingly (as in regex(3)). The return value is 0 if the
string matches the pattern, and 1 otherwise. .... Substrings
matched by parenthesized subexpressions within the regular
expression are saved in the array variable BASH_REMATCH. ...
The element of BASH_REMATCH with index n is the portion of the
string matching the nth parenthesized sub-expression.
Hence, assuming that the variable x holds the name of one of the files
in question, the code
if [[ $x =~ ^(.*_)[0-9]+([.]txt$) ]]
then
    # Braces are required: $BASH_REMATCH[1] would expand element 0 followed by a literal [1]
    mv "$x" "${BASH_REMATCH[1]}$(date +%Y%m%d)${BASH_REMATCH[2]}"
fi
first tests roughly whether the file indeed follows the required naming scheme, and then modifies the name accordingly.
Of course in practice, you will tailor the regexp to match your application better. Only you can know what variations in the file name are permitted.
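To try the regular expression without renaming anything, the matching can be wrapped in a small function that only prints the new name. This is a sketch under assumptions: the function name redated_name and the idea of passing the date as a second argument are made up for illustration.

```bash
#!/usr/bin/env bash
# redated_name NAME DATE: hypothetical helper; prints NAME with the run of
# digits just before .txt replaced by DATE, or returns 1 if NAME does not
# follow the PREFIX_DIGITS.txt naming scheme.
redated_name() {
    local re='^(.*_)[0-9]+([.]txt)$'
    if [[ $1 =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}$2${BASH_REMATCH[2]}"
    else
        return 1
    fi
}

redated_name CN_Apria_837p_20180924.txt 20190502
```

With these arguments the function prints CN_Apria_837p_20190502.txt. Passing "$(date +%Y%m%d)" as the second argument gives today's date instead of the fixed one.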

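The following should do this:
find /path/to/files -name "*_*_*_*.txt" -print0 |
while IFS= read -r -d '' f
do
    # -print0 with read -d '' keeps paths containing whitespace intact
    newname=$(printf '%s\n' "$f" | sed -E "s/[12][0-9]{3}[01][0-9][0-3][0-9]/$(date '+%Y%m%d')/g")
    mv "$f" "$newname"
done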

Try this Shellcheck-clean code:
#! /bin/bash -p
readonly dir=$1
shopt -s nullglob # Make glob patterns that match nothing expand to nothing
readonly dateglob='20[0-9][0-9][0-9][0-9][0-9][0-9]'
currdate=$(date '+%Y%m%d')
# shellcheck disable=SC2231
for path in "$dir"/*_${dateglob}.* ; do
    name=${path##*/}
    newname=${name/_${dateglob}./_${currdate}.}
    if [[ $newname != "$name" ]] ; then
        newpath="$dir/$newname"
        printf "%q -> %q\\n" "$path" "$newpath"
        mv -i -- "$path" "$newpath"
    fi
done
shopt -s nullglob stops the code trying to process a garbage path if nothing matches the glob pattern in for path in ....
The pattern assigned to dateglob assumes that you will not have to process dates before 2000 (or after 2099!). Change it if that assumption is not valid.
The # shellcheck ... line is to prevent Shellcheck warning about the use of ${dateglob} without quotes. The quotes would be wrong in this case because they would prevent the glob pattern being expanded.
The pattern used to match filenames (*_${dateglob}.*) will match many more forms of filename than the examples given (e.g. A_20180313.tar.gz). You might want to change it.
See Removing part of a string (BashFAQ/100 (How do I do string manipulation in bash?)) for information about the Bash string manipulation mechanisms used (${path##...}, ${name/...}).
I've added a printf to output details of what is being moved.
The -i option to mv prompts for confirmation if a file would be overwritten. This turns out to be an issue for the example files because both CN_Apria_837p_20180924.txt and CN_Apria_837p_20101502.txt are identical except for the date, so the code tries to rename them to the same thing.
If any of the files with dates in their names have names beginning with '.', the code will not process them. Add line shopt -s dotglob somewhere before the loop if that is an issue.
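The ${name/...} substitution can be tried on its own, without touching any files. The following is a minimal sketch; the function name swap_date is hypothetical, and the date is passed in as an argument rather than taken from date:

```bash
#!/usr/bin/env bash
# swap_date NAME NEWDATE: hypothetical helper built on the same
# ${name/pattern/string} expansion. The date glob is left unquoted inside
# the pattern so that it is treated as a glob, not as literal text.
swap_date() {
    local dateglob='20[0-9][0-9][0-9][0-9][0-9][0-9]'
    printf '%s\n' "${1/_${dateglob}./_$2.}"
}

swap_date CN_Apria_837p_20180924.txt 20190502
swap_date A_20180313.tar.gz 20190502
swap_date old_19991231.txt 20190502
```

The first two calls print CN_Apria_837p_20190502.txt and A_20190502.tar.gz. The third prints old_19991231.txt unchanged, because the glob only matches dates starting with 20.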

basename command confusion

Given the following command:
$(basename "/this-directory-does-not-exist/*.txt" ".txt")
it outputs not only txt files but other files as well. On the other hand if I change ".txt" to something like "gobble de gook" it returns:
*.txt
I'm confused with regard to why it returns the other extension types.

Your problem doesn't stem from basename, but from inadvertent use of the shell's pathname expansion (globbing) feature due to lack of quoting:
If you use the result of your command substitution ($(...)) unquoted:
$ echo $(basename "/this-directory-does-not-exist/*.txt" ".txt")
you effectively execute the following:
$ echo * # unquoted '*' expands to all files and folders in the current dir
because basename "/this-directory-does-not-exist/*.txt" ".txt" returns literal * (it strips the extension from filename *.txt;
the reason that the filename pattern *.txt didn't expand to an actual filename is that the shell leaves globbing patterns that don't match anything unmodified (by default).)
If you double-quote the command substitution, the problem goes away:
$ echo "$(basename "/this-directory-does-not-exist/*.txt" ".txt")" # -> *
However, even with this problem resolved, your basename command will only work correctly if the glob expands to one matching file, because the syntax form you're using only supports one filename argument.
GNU basename and BSD basename support the non-POSIX -s option, which allows for multiple file operands from which to strip the extension:
basename -s .txt "/some-dir/"*.txt
Assuming you use bash, you can put it all together robustly as follows:
#!/usr/bin/env bash
names=() # initialize result array
files=( *.txt ) # perform globbing and capture matching paths in an array
# Since the shell by default returns a pattern as-is if there are no matches,
# we test the first array item for existence; if it refers to an existing
# file or dir., we know that at least 1 match was found.
if [[ -e ${files[0]} ]]; then
    # Apply the `basename` command with suffix-stripping to all matches
    # and read the results robustly into an array.
    # Note that just `names=( $(basename ...) )` would NOT work robustly.
    readarray -t names < <(basename -s '.txt' "${files[@]}")
    # Note: `readarray` requires Bash 4; in Bash 3.x, use the following:
    # IFS=$'\n' read -r -d '' -a names < <(basename -s '.txt' "${files[@]}")
fi
# "${names[@]}" now contains an array of suffix-stripped basenames,
# or is empty, if no files matched.
printf '%s\n' "${names[@]}" # print names line by line
Note: The -e test comes with a tiny caveat: if there are matches and the first match is a broken symlink, the test will mistakenly conclude that there are no matches.
A more robust option is to use shopt -s nullglob to make the shell expand non-matching globs to the empty string, but note that this is a shell-global option, and it is good practice to return it to its previous value afterward, which makes that approach more cumbersome.
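The nullglob alternative could look roughly like the sketch below. It is an illustration under assumptions, not part of the solution above: the function name list_stems is made up, and the suffix is stripped with parameter expansion instead of basename so that the example has no external dependencies.

```bash
#!/usr/bin/env bash
# list_stems DIR: hypothetical helper that prints the names of DIR/*.txt
# without directory and suffix, one per line, and prints nothing at all
# when no file matches.
list_stems() {
    local dir=$1 restore f
    local -a files
    # "shopt -p nullglob" prints the command that recreates the current
    # setting; it exits non-zero when the option is off, hence "|| true".
    restore=$(shopt -p nullglob || true)
    shopt -s nullglob
    files=( "$dir"/*.txt )   # expands to zero words if nothing matches
    $restore                 # put nullglob back the way it was
    for f in "${files[@]}"; do
        f=${f##*/}
        printf '%s\n' "${f%.txt}"
    done
}

list_stems .
```

Saving and restoring the option inside the function keeps the change from leaking into the rest of the script, which is the cumbersome part mentioned above.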

Try putting quotes around the whole thing. What you are seeing is globbing: the output of your command is *, which is then expanded to all files in the current directory. This does not happen inside single or double quotes.

Using a variable to replace lines in a file with backslashes

I want to add the string %%% to the beginning of some specific lines in a text file.
This is my script:
#!/bin/bash
a="c:\Temp"
sed "s/$a/%%%$a/g" <File.txt
And this is my File.txt content:
d:\Temp
c:\Temp
e:\Temp
But nothing changes when I execute it.
I think the 'sed' command is not finding the pattern, possibly due to the \ backslashes in the variable a.
I can find the c:\Temp line if I use grep with -F option (to not interpret strings):
cat File.txt | grep -F "$a"
But sed does not seem to implement such a '-F' option.
These do not work either:
sed 's/$a/%%%$a/g' <File.txt
sed 's/"$a"/%%%"$a"/g' <File.txt
I have found similar threads about replacing with sed, but they don't refer to variables.
How can I use a variable to select the desired lines and prepend the string %%% to them?
EDIT: It would be fine that the $a variable could be entered via parameter when calling the script, so it will be assigned like:
a=$1

Try it like this:
#!/bin/sh
a='c:\\Temp' # single quotes
sed "s/$a/%%%$a/g" <File.txt # double quotes
Output:
Johns-MacBook-Pro:sed jcreasey$ sh x.sh
d:\Temp
e:\Temp
%%%c:\Temp
You need the double backslash '\\' to escape the '\'.
The single quotes won't expand the variables.
So you escape the slash in single quotes and pass it into the double quotes.
Of course you could also just do this:
#!/bin/sh
sed 's/\(.*Temp\)/%%%&/' <File.txt
If you want to get input from the command line you have to allow for the fact that \ is an escape character there too. So the user needs to type 'c:\\' or the interpreter will just wait for another character. Then once you get it, you will need to escape it again. (printf %q).
#!/bin/bash
# printf %q is a bash builtin feature, so run this with bash rather than plain sh
b=$(printf '%q' "$1")
sed "s/\($b\)/%%%&/" < File.txt

The issue you are having is that, after substitution of your variable, sed receives a regular expression that looks for a literal c:Temp, because the \ is interpreted as an escape character (by sed inside the regular expression, and by the shell as well when the term is typed unquoted on the command line). There are a number of workarounds. Seeing the comments and having worked through the possibilities, the following will allow the unquoted entry of the search term:
#!/bin/bash
## validate that needed input is given on the command line
[ -n "$1" -a "$2" ] || {
printf "Error: insufficient input. Usage: %s <term> <file>\n" "${0//*\//}" >&2
exit 1
}
## validate that the filename given is readable
[ -r "$2" ] || {
printf "Error: file not readable '%s'\n" "$2" >&2
exit 1
}
a="$1" # assign a
filenm="$2" # assign filename
## test and fix the search term entered
[[ $a == *\\* ]] || a="${a/:/:\\}" # test if \ removed by shell, if so replace
a="${a/\\/\\\\}" # add second \
sed -e "s/$a/%%%$a/g" "$filenm" # call sed with output to stdout
Usage:
$ bash sedwinpath.sh c:\Temp dat/winpath.txt
d:\Temp
%%%c:\Temp
e:\Temp
Note: This allows either single-quoted or unquoted entry of the DOS path search term. To edit in place use sed -i. Additionally, the [[ operator and the =~ operator are limited to bash.
I could have sworn the original question said to replace rather than to prepend, but as you suggest in the comments, I have updated the code with:
sed -e "s/$a/%%%$a/g" "$filenm"
Which provides the new output:
$ bash sedwinpath.sh c:\Temp dat/winpath.txt
d:\Temp
%%%c:\Temp
e:\Temp
Remember: If you want to edit the file in place use sed -i or sed -i.bak which will edit the actual file (and if -i.bak is given create a backup of the original in originalname.bak). Let me know if that is not what you intended and I'm happy to edit again.

Create your script using the positional parameter $1:
#!/bin/bash
a="$1"
cat <file path> | sed "s/$a/%%%$a/g" > "temporary file"
Now whenever you want sed to find "c:\Temp", you need to call your script as follows:
bash <my executing script> c:\\\\Temp
Bash treats each unquoted backslash as an escape for the character that follows it, so the four backslashes are reduced to two and the variable a in your script holds c:\\Temp. When this value is substituted into the sed expression, sed in turn reads \\ as a single literal backslash, so the pattern matches c:\Temp.
When you open your temporary file you will see:
d:\Temp
%%%c:\Temp
e:\Temp
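As an alternative to doubling the backslashes by hand, the search term can be escaped programmatically before it is handed to sed. The following is a minimal sketch and is not taken from any of the answers above: the function names escape_bre, escape_repl and prefix_lines are made up for illustration, and the bracket expressions assume sed's basic regular expression syntax with / as the delimiter.

```bash
#!/usr/bin/env bash
# escape_bre STRING: backslash-escape the characters that are special in a
# basic regular expression, plus the / delimiter.
escape_bre() {
    printf '%s\n' "$1" | sed 's/[]\/$*.^[]/\\&/g'
}

# escape_repl STRING: backslash-escape \, / and &, which are special in the
# replacement part of an s command.
escape_repl() {
    printf '%s\n' "$1" | sed 's/[\/&]/\\&/g'
}

# prefix_lines TERM: read stdin and put %%% in front of every literal
# occurrence of TERM.
prefix_lines() {
    local pat rep
    pat=$(escape_bre "$1")
    rep=$(escape_repl "$1")
    sed "s/$pat/%%%$rep/g"
}

printf '%s\n' 'd:\Temp' 'c:\Temp' 'e:\Temp' | prefix_lines 'c:\Temp'
```

Here the term is passed single-quoted, so the script receives c:\Temp exactly as typed, and only the c:\Temp line comes out with the %%% prefix.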

Get and print directories from $PATH in bash

The script that I have to write must find the directories in the $PATH variable and print only the ones that end with an i.
How I am thinking of doing it:
Get each directory from the variable with a for loop.
Find the length of each directory name and get its last character using a substring.
Use an if condition to print the directories that end with an i.
Problems:
The directories are not separated by newlines, so I can't read them using a for loop.
Any ideas on how to get around this problem, or can you think of something more appropriate?

You can use this BASH one-liner for that job:
(IFS=':'; for i in $PATH; do [[ -d "$i" && $i =~ i$ ]] && echo "$i"; done)
IFS=':' sets the internal field separator to :
$PATH is iterated in a for loop
Each path element is tested to check that it is a directory and that it ends with i, using a BASH regex
If the test passes, the element is printed
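The same splitting can be done without changing IFS for the rest of the shell by reading the list into an array. This is a sketch under assumptions: the function name dirs_ending_in_i is made up, and the -d test is left out so that the function can be tried on any colon-separated string.

```bash
#!/usr/bin/env bash
# dirs_ending_in_i LIST: hypothetical helper that prints the entries of a
# colon-separated LIST whose names end in "i", one per line.
dirs_ending_in_i() {
    local -a dirs
    local d
    # IFS is changed only for this one read command.
    IFS=: read -r -a dirs <<<"$1"
    for d in "${dirs[@]}"; do
        [[ $d == *i ]] && printf '%s\n' "$d"
    done
    return 0
}

dirs_ending_in_i '/ends/in_i:/usr/bin:/dir with space/ends_in_i'
```

This prints /ends/in_i and /dir with space/ends_in_i. To restrict the output to existing directories, add [[ -d $d ]] to the test and call the function with "$PATH".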

Use bash's parameter expansion to replace all delimiters.
${parameter//pat/string}
For example,
mypaths="${PATH//:/ }"
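replaces each colon with a space, so word splitting on the unquoted variable then yields one directory per word (provided no directory name contains whitespace), and you can run: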
for directory in $mypaths
do
...
done

You can change the Internal Field Separator (IFS) to a colon; the path is then split automagically. ;-)
IFS=:
for i in $PATH
do
    printf '%s\n' "$i" | grep 'i$'
done

grep 'i$' <<<"${PATH//:/$'\n'}"
The $PATH entries are split into individual lines by replacing : instances with newlines ($'\n') in a parameter expansion; $'\n' is an ANSI C-quoted string.
The resulting string is passed to the stdin of grep as a here-string
(<<<...).
grep is then used to match only those lines that end in ($) the letter i.
To match case-insensitively, use grep -i 'i$'.
A demonstration:
$ (PATH='/ends/in_i:/usr/bin:/also/ends_in_i'; grep 'i$' <<<"${PATH//:/$'\n'}")
/ends/in_i
/also/ends_in_i

Shell Script: Truncating String

I have two folders full of training files and corresponding test files, and I'd like to run the matching pairs against each other using a shell script.
This is what I have so far:
for x in SpanishLS.train/*.train
do
timbl -f $x -t SpanishLS.test/$x.test
done
This is supposed to take file1(-n).train in one directory, look for file1(-n).test in the other, and run them through a tool called timbl.
What it does instead is look for a file called SpanishLS.train/file1(-n).train.test which of course doesn't exist.
What I tried to do, to no avail, is truncate $x in a way that lets the script find the correct file, but whenever I do this, $x is truncated way too early, resulting in the script not even finding the .train file.
How should I code this?

If I got you right, this will do the job:
for x in SpanishLS.train/*.train
do
    y=${x##*/} # strip directory path
    y=${y%.*} # strip extension
    timbl -f "$x" -t "SpanishLS.test/$y.test"
done

Use basename:
for x in SpanishLS.train/*.train
do
    timbl -f "$x" -t "SpanishLS.test/$(basename "$x" .train).test"
done
That removes the directory prefix and the .train suffix from $x, and builds up the name you want.
In bash (and other POSIX-compliant shells), you can do the basename operation with two shell parameter expansions without invoking an external program. (I don't think there's a way to combine the two expansions into one.)
for x in SpanishLS.train/*.train
do
    y=${x##*/} # Remove path prefix
    timbl -f "$x" -t "SpanishLS.test/${y%.train}.test" # Remove .train suffix
done
Beware: bash supports quite a number of (useful) expansions that are not defined by POSIX. For example, ${y//.train/.test} is a bash-only notation (or bash and compatible shells notation).
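The two expansions can be checked in isolation, without running timbl. This is a minimal sketch; the function name test_file_for is made up, and the directory name SpanishLS.test is taken from the question:

```bash
#!/usr/bin/env bash
# test_file_for TRAIN_PATH: hypothetical helper that maps
# SpanishLS.train/NAME.train to SpanishLS.test/NAME.test.
test_file_for() {
    local name=${1##*/}                                  # drop the directory part
    printf '%s\n' "SpanishLS.test/${name%.train}.test"   # swap the suffix
}

test_file_for "SpanishLS.train/file1.train"
```

For the path shown, the function prints SpanishLS.test/file1.test.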

Replace all occurrences of .train in the path with .test (this also turns the directory name SpanishLS.train into SpanishLS.test):
timbl -f "$x" -t "$(echo "$x" | sed 's/\.train/.test/g')"
