Shell Script: Truncating String - string

I have two folders full of training files and corresponding test files, and I'd like to run the matching pairs against each other using a shell script.
This is what I have so far:
for x in SpanishLS.train/*.train
do
timbl -f $x -t SpanishLS.test/$x.test
done
This is supposed to take file1(-n).train in one directory, look for file1(-n).test in the other, and run them through a tool called timbl.
What it does instead is look for a file called SpanishLS.train/file1(-n).train.test which of course doesn't exist.
What I tried to do, to no avail, is truncate $x in a way that lets the script find the correct file, but whenever I do this, $x is truncated way too early, resulting in the script not even finding the .train file.
How should I code this?

If I got you right, this will do the job:
for x in SpanishLS.train/*.train
do
y=${x##*/} # strip basepath
y=${y%.*} # strip extension
timbl -f $x -t SpanishLS.test/$y.test
done

Use basename:
for x in SpanishLS.train/*.train
do
timbl -f $x -t SpanishLS.test/$(basename "$x" .train).test
done
That removes the directory prefix and the .train suffix from $x, and builds up the name you want.
In bash (and other POSIX-compliant shells), you can do the basename operation with two shell parameter expansions without invoking an external program. (I don't think there's a way to combine the two expansions into one.)
for x in SpanishLS.train/*.train
do
y=${x##*/} # Remove path prefix
timbl -f $x -t SpanishLS.test/${y%.train}.test # Remove .train suffix
done
Beware: bash supports quite a number of (useful) expansions that are not defined by POSIX. For example, ${y//.train/.test} is a bash-only notation (or bash and compatible shells notation).
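The two expansions can be checked in isolation before wiring them into the loop. The following is a minimal sketch, not part of the original answer: the function name test_path_for is hypothetical, and the directory name SpanishLS.test is taken from the question.

```shell
# Hypothetical helper: map a training file path to the matching test file path.
# Uses only POSIX parameter expansion, so it should work in sh, dash and bash.
test_path_for() {
    name=${1##*/}          # remove the longest prefix ending in '/', i.e. the directory part
    name=${name%.train}    # remove the trailing '.train' suffix
    printf '%s\n' "SpanishLS.test/$name.test"
}

test_path_for "SpanishLS.train/file1.train"   # prints SpanishLS.test/file1.test
```

Because the argument and the result are quoted, file names containing spaces or extra dots pass through unchanged.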

Replace all occurrences of .train in the path with .test (this rewrites both the directory name SpanishLS.train and the file extension):
timbl -f "$x" -t "$(echo "$x" | sed 's/\.train/.test/g')"

Related

Multiple files rename using linux shell script

I have following images.
10.jpg
11.jpg
12.jpg
I want to rename the above images. I used the following shell script:
for file in /home/scrapping/imgs/*
do
COUNT=$(expr $COUNT + 1)
STRING="/home/scrapping/imgs/""Img_"$COUNT".jpg"
echo $STRING
mv "$file" "$STRING"
done
This renamed the files to:
Img_1.jpg
Img_2.jpg
Img_3.jpg
But, I want to replace the file name like this:
Img_10.jpg
Img_11.jpg
Img_12.jpg
So how do I start COUNT at 10 to get this output?
The expr syntax is pretty outdated; the POSIX shell lets you do arithmetic evaluation with the $(( )) syntax. You can just do
#!/usr/bin/env bash
count=10
for file in /home/scrapping/imgs/*; do
[ -f "$file" ] || continue
mv "$file" "/home/scrapping/imgs/Img_$((count++)).jpg"
done
Also, judging from the errors reported in the comments, you seem to be running the script with dash. dash is POSIX-compliant, but POSIX does not require the ++ and -- operators in arithmetic expansion, and dash does not implement them. Run the script with bash, or write count=$((count + 1)) instead.
And always use lowercase letters for user defined variables in your shell script. Upper case letters are primarily for the environment variables managed by the shell itself.
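If the script has to run under a shell without the ++ operator, the counter can be advanced with plain POSIX arithmetic. This is a sketch under assumptions: the function name number_images and its two parameters (directory and start value) are hypothetical and do not come from the answer above.

```shell
# Hypothetical helper: rename every regular file in a directory to Img_<n>.jpg,
# numbering from a caller-supplied start value. POSIX sh only, no '++'.
number_images() {
    dir=$1
    count=$2
    for file in "$dir"/*; do
        [ -f "$file" ] || continue            # skip directories and an unmatched glob
        mv -- "$file" "$dir/Img_$count.jpg"
        count=$((count + 1))                  # portable replacement for count++
    done
}
```

Calling number_images /home/scrapping/imgs 10 would number the files from 10 upward in glob order. The sketch does not guard against a target name that already exists in the directory, so it is best tried on a copy first.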
With the (Perl-based) rename command you can prefix your files with Img_:
rename 's/^/Img_/' *
The ^ anchors the match at the start of the filename, so the substitution inserts Img_ there, i.e. it adds a prefix.
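Where the Perl-based rename is not installed, the same prefixing can be sketched with a plain loop and parameter expansion. The function name prefix_images is hypothetical; the Img_ prefix is the one from the question.

```shell
# Hypothetical helper: add an "Img_" prefix to every regular file in a directory,
# keeping the rest of the name (10.jpg becomes Img_10.jpg).
prefix_images() {
    dir=$1
    for file in "$dir"/*; do
        [ -f "$file" ] || continue                 # leave subdirectories alone
        mv -- "$file" "$dir/Img_${file##*/}"       # ${file##*/} is the bare file name
    done
}
```

Running it twice on the same directory would add the prefix twice, so a real script might first skip names that already start with Img_.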

basename command confusion

Given the following command:
$(basename "/this-directory-does-not-exist/*.txt" ".txt")
it outputs not only txt files but other files as well. On the other hand if I change ".txt" to something like "gobble de gook" it returns:
*.txt
I'm confused with regard to why it returns the other extension types.
Your problem doesn't stem from basename, but from inadvertent use of the shell's pathname expansion (globbing) feature due to lack of quoting:
If you use the result of your command substitution ($(...)) unquoted:
$ echo $(basename "/this-directory-does-not-exist/*.txt" ".txt")
you effectively execute the following:
$ echo * # unquoted '*' expands to all files and folders in the current dir
because basename "/this-directory-does-not-exist/*.txt" ".txt" returns literal * (it strips the extension from filename *.txt;
the reason that the filename pattern *.txt didn't expand to an actual filename is that the shell leaves globbing patterns that don't match anything unmodified (by default).)
If you double-quote the command substitution, the problem goes away:
$ echo "$(basename "/this-directory-does-not-exist/*.txt" ".txt")" # -> *
However, even with this problem resolved, your basename command will only work correctly if the glob expands to one matching file, because the syntax form you're using only supports one filename argument.
GNU basename and BSD basename support the non-POSIX -s option, which allows for multiple file operands from which to strip the extension:
basename -s .txt /some-dir/*.txt
Assuming you use bash, you can put it all together robustly as follows:
#!/usr/bin/env bash
names=() # initialize result array
files=( *.txt ) # perform globbing and capture matching paths in an array
# Since the shell by default returns a pattern as-is if there are no matches,
# we test the first array item for existence; if it refers to an existing
# file or dir., we know that at least 1 match was found.
if [[ -e ${files[0]} ]]; then
# Apply the `basename` command with suffix-stripping to all matches
# and read the results robustly into an array.
# Note that just `names=( $(basename ...) )` would NOT work robustly.
readarray -t names < <(basename -s '.txt' "${files[@]}")
# Note: `readarray` requires Bash 4; in Bash 3.x, use the following:
# IFS=$'\n' read -r -d '' -a names < <(basename -s '.txt' "${files[@]}")
fi
# "${names[@]}" now contains an array of suffix-stripped basenames,
# or is empty, if no files matched.
printf '%s\n' "${names[@]}" # print names line by line
Note: The -e test comes with a tiny caveat: if there are matches and the first match is a broken symlink, the test will mistakenly conclude that there are no matches.
A more robust option is to use shopt -s nullglob to make the shell expand non-matching globs to the empty string, but note that this is a shell-global option, and it is good practice to return it to its previous value afterward, which makes that approach more cumbersome.
Try putting quotes around the whole thing. What you are seeing is globbing: the output of your command is *, which the shell then expands to all files in the current directory. This does not happen inside single or double quotes.
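For shells without arrays, or a basename without -s, the same result can be approximated with a loop that calls the POSIX form of basename once per file. This is a sketch: the function name list_txt_stems is hypothetical, and the -e/-L test is one possible way to skip a glob that matched nothing.

```shell
# Hypothetical helper: print the name of every *.txt file in a directory
# without the directory part and without the .txt suffix, one per line.
list_txt_stems() {
    for f in "$1"/*.txt; do
        # An unmatched glob stays as the literal pattern; skip it.
        # -L also accepts broken symlinks, which -e alone would reject.
        [ -e "$f" ] || [ -L "$f" ] || continue
        basename "$f" .txt
    done
}
```

Every expansion of $f is double-quoted, so neither spaces in file names nor a literal * in the output gets re-expanded by the shell.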

Linux bash shell scripts - spaces in file names

It has been a long time since I did much bash script writing.
This is a bash script to copy and rename files by deleting all before the first period delimiter:
#!/bin/bash
mkdir fullname
mv *.audio fullname
cd fullname
for x in * ;
do
cp $x ../`echo $x | cut -d "." -f 2-`
done
cd ..
ls
It works well for file names with no embedded spaces but not for those with spaces.
How can I change the code to fix this simple Linux bash script? Any suggestions for improving the code for other reasons would also be welcome.
Example filenames, some with embedded spaces and some not (from link)
http://www.homenetvideo.com/demo/index.php?/Radio%20%28VLC%29
Ambient.A6.SOMA Space Station.audio
Blues.B9.Blues Radio U.K.audio
Classical.K3.Radio Stephansdom - Vienna.audio
College.CI.KDVS U of California, Davis.audio
Country.Q1.K-FROG.audio
Easy.G4.WNYU.audio
Eclectic.M2.XPN.audio
Electronica.E2.Rinse.audio
Folk.F1.Radionomy.audio
Hiphop.H1.NPR.audio
Indie.I4.WAUG.audio
Jazz.J6.KCSM.audio
Latin.L3.Mega.audio
Misc.X7.Gaydio.audio
News.N9.KQED.audio
Oldies.O1.Lonestar.audio
OldTime.Y1.Roswell.audio
Progressive.P1.Aural Moon.audio
Rock.R8.WXRT.audio
Scanner.Z3.Montreal.audio
Soul.S1.181.FM.audio
Talk.T2.TWiT.audio
World.W3.Persian.audio
Whenever you deal with file names that might have spaces in them, you must reference them as "$x" rather than just $x. That's what's causing your cp command to fail.
Your echo command is also problematic. Although echo does the right thing for simple spaces - it echoes a file named A B C as A B C - it will still fail if you have more than one consecutive space in the name, or whitespace that isn't a simple space character.
Instead of passing the file names to external programs for processing, which always requires getting them through the whitespace-hostile command line, you should use bash built-in functions for string manipulations wherever possible, e.g. ${x%%foo}, ${x#bar} and similar functions. The man page describes them under "Parameter expansion".
Here's my suggestion:
#!/bin/bash
shopt -s nullglob
mkdir fullname
mv *.audio fullname
(
cd fullname || exit
for x in *; do
cp "$x" "../${x#*.}"
done
)
ls
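nullglob makes a glob that matches nothing expand to nothing instead of staying as the literal pattern. It is optional here.
() starts a subshell, so the cd does not affect the rest of the script and there is no need to change back.
|| exit terminates the subshell if cd fails to change directory.
${x#*.} expands to $x with the shortest leading match of *. removed, i.e. everything up to and including the first period.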
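The difference between the four prefix and suffix operators is easiest to see on one of the file names from the question. A short illustration, using only POSIX parameter expansion:

```shell
# One of the example file names from the question (note the embedded spaces).
x='Blues.B9.Blues Radio U.K.audio'

printf '%s\n' "${x#*.}"    # shortest prefix removed: B9.Blues Radio U.K.audio
printf '%s\n' "${x##*.}"   # longest prefix removed:  audio
printf '%s\n' "${x%.*}"    # shortest suffix removed: Blues.B9.Blues Radio U.K
printf '%s\n' "${x%%.*}"   # longest suffix removed:  Blues
```

The double quotes around each expansion are what keep the embedded spaces intact.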

Make multiple copies of files with a shell script

I am trying to write a small shell script that makes multiple copies of a file. I am able to take the file name as input, but not the number of copies. Here is what I wrote, but I am unable to pass the NUMBER variable to the for loop.
echo -n "Enter filename: "
read FILENAME
echo -n "Number of copies to be made: "
read NUMBER
for i in {2..$NUMBER}
do
cp -f $FILENAME ${FILENAME%%.*}"_"$i.csv
done
Unfortunately it doesn't work like that. Bash performs brace expansion before parameter expansion, so your brace will be expanded before $NUMBER is evaluated. See also the Bash Pitfall #33, which explains the issue.
One way to do this, using your code, would be:
for i in $(eval echo {2..$NUMBER})
do
# ...
done
Or, even shorter:
for i in $(seq 2 $NUMBER)
# ...
(thanks, Glenn Jackman!)
Note that typically, variables should be quoted. This is especially important for file names. What if your file is called foo bar? Then your cp -f would copy foo and bar since the arguments are split by whitespace.
So, do something like this:
cp -f "$FILENAME" "${FILENAME%%.*}_${i}.csv"
While it might not matter if your files don't contain whitespace, quoting variables is something you should do automatically to prevent any surprises in the future.
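A counting loop that needs neither eval nor seq can be sketched with a POSIX while loop. The function name make_copies and its parameters are hypothetical. The sketch uses ${src%.*} (strip only the last extension) rather than the question's ${FILENAME%%.*}, on the assumption that the file name has an extension and that directory names in the path may contain dots.

```shell
# Hypothetical helper: create copies <name>_2.csv ... <name>_<last>.csv of a file.
make_copies() {
    src=$1
    last=$2
    i=2
    while [ "$i" -le "$last" ]; do
        cp -f -- "$src" "${src%.*}_$i.csv"    # ${src%.*} drops only the final extension
        i=$((i + 1))
    done
}
```

For example, make_copies "my data.csv" 4 would create "my data_2.csv", "my data_3.csv" and "my data_4.csv" next to the original.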

Looping through the elements of a path variable in Bash

I want to loop through a path list that I have gotten from an echo $VARIABLE command.
For example:
echo $MANPATH will return
/usr/lib:/usr/sfw/lib:/usr/info
So that is three different paths, each separated by a colon. I want to loop through each of those paths. Is there a way to do that? Thanks.
Thanks for all the replies so far, it looks like I actually don't need a loop after all. I just need a way to take out the colon so I can run one ls command on those three paths.
You can set the Internal Field Separator:
( IFS=:
for p in $MANPATH; do
echo "$p"
done
)
I used a subshell so the change in IFS is not reflected in my current shell.
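One caveat with the unquoted $MANPATH is that each field is also subject to pathname expansion. A possible refinement, sketched here with the hypothetical function name print_path_entries, is to turn globbing off with set -f inside the same subshell:

```shell
# Hypothetical helper: print each colon-separated entry of its argument on its own line.
# The function body is a subshell, so IFS and 'set -f' do not leak to the caller.
print_path_entries() (
    IFS=:
    set -f                      # disable pathname expansion for the unquoted $1
    for p in $1; do
        printf '%s\n' "$p"
    done
)

print_path_entries '/usr/lib:/usr/sfw/lib:/usr/info'
```

Entries containing spaces or glob characters are printed as-is, and an empty entry between two colons is kept as an empty line. A single trailing colon does not produce an extra empty entry.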
The canonical way to do this, in Bash, is to use the read builtin appropriately:
IFS=: read -r -d '' -a path_array < <(printf '%s:\0' "$MANPATH")
This is a robust solution: it does exactly what you want, splitting the string on the delimiter : while staying safe with respect to spaces, newlines, and glob characters like *, [ ], etc. (unlike the other answers, which break on such input).
After this command, you'll have an array path_array, and you can loop on it:
for p in "${path_array[@]}"; do
printf '%s\n' "$p"
done
You can use Bash's pattern substitution parameter expansion to populate your loop variable. For example:
MANPATH=/usr/lib:/usr/sfw/lib:/usr/info
# Replace colons with spaces to create list.
for path in ${MANPATH//:/ }; do
echo "$path"
done
Note: Don't enclose the substitution expansion in quotes. You want the expanded values from MANPATH to be interpreted by the for-loop as separate words, rather than as a single string.
In this way you can safely go through the $PATH with a single loop, while $IFS will remain the same inside or outside the loop.
while IFS=: read -d: -r path; do # `$IFS` is only set for the `read` command
echo $path
done <<< "${PATH:+"${PATH}:"}" # append an extra ':' if `$PATH` is set
You can check the value of $IFS,
IFS='xxxxxxxx'
while IFS=: read -d: -r path; do
echo "${IFS}${path}"
done <<< "${PATH:+"${PATH}:"}"
and the output will be something like this.
xxxxxxxx/usr/local/bin
xxxxxxxx/usr/bin
xxxxxxxx/bin
Reference to another question on StackExchange.
for p in $(echo $MANPATH | tr ":" " ") ;do
echo $p
done
IFS=:
arr=(${MANPATH})
for path in "${arr[@]}" ; do # <- quotes required
echo $path
done
... it does take care of spaces :o) but also adds empty elements if you have something like:
:/usr/bin::/usr/lib:
... then indexes 0 and 2 will be empty (''). Index 4 isn't set at all, because a trailing delimiter ends the last field instead of starting a new empty one.
This can also be solved with Python, on the command line:
python -c "import os,sys;[os.system(' '.join(sys.argv[1:]).format(p)) for p in os.getenv('PATH').split(':')]" echo {}
Or as an alias:
alias foreachpath="python -c \"import os,sys;[os.system(' '.join(sys.argv[1:]).format(p)) for p in os.getenv('PATH').split(':')]\""
With example usage:
foreachpath echo {}
The advantage of this approach is that {} will be replaced by each path in succession. This can be used to construct all sorts of commands, for instance to list the size of all files and directories in the directories in $PATH, including directories with spaces in the name:
foreachpath 'for e in "{}"/*; do du -h "$e"; done'
Here is an example that shortens the length of the $PATH variable by creating symlinks to every file and directory in the $PATH in $HOME/.allbin. This is not useful for everyday usage, but may be useful if you get the too many arguments error message in a docker container, because bitbake uses the full $PATH as part of the command line...
mkdir -p "$HOME/.allbin"
python -c "import os,sys;[os.system(' '.join(sys.argv[1:]).format(p)) for p in os.getenv('PATH').split(':')]" 'for e in "{}"/*; do ln -sf "$e" "$HOME/.allbin/$(basename $e)"; done'
export PATH="$HOME/.allbin"
This should also, in theory, speed up regular shell usage and shell scripts, since there are fewer paths to search for every command that is executed. It is pretty hacky, though, so I don't recommend that anyone shorten their $PATH this way.
The foreachpath alias might come in handy, though.
Combining ideas from:
https://stackoverflow.com/a/29949759 - gniourf_gniourf
https://stackoverflow.com/a/31017384 - Yi H.
code:
PATHVAR='foo:bar baz:spam:eggs:' # demo path with space and empty
printf '%s:\0' "$PATHVAR" | while IFS=: read -d: -r p; do
echo $p
done | cat -n
output:
1 foo
2 bar baz
3 spam
4 eggs
5
You can use Bash's for X in ${} notation to accomplish this:
for p in ${PATH//:/$'\n'} ; do
echo $p;
done
OP's update wants to ls the resulting folders, and has pointed out that ls only requires a space-separated list.
ls $(echo $PATH | tr ':' ' ') is nice and simple and should fit the bill nicely.
