Linux bash shell scripts - spaces in file names


It has been a long time since I did much bash script writing.
This is a bash script that copies and renames files by deleting everything up to and including the first period:
#!/bin/bash
mkdir fullname
mv *.audio fullname
cd fullname
for x in * ;
do
cp $x ../`echo $x | cut -d "." -f 2-`
done
cd ..
ls
It works well for file names with no embedded spaces but not for those with spaces.
How can I change the code to fix this simple Linux bash script? Any suggestions for improving the code for other reasons would also be welcome.
Example filenames, some with embedded spaces and some not (from link)
http://www.homenetvideo.com/demo/index.php?/Radio%20%28VLC%29
Ambient.A6.SOMA Space Station.audio
Blues.B9.Blues Radio U.K.audio
Classical.K3.Radio Stephansdom - Vienna.audio
College.CI.KDVS U of California, Davis.audio
Country.Q1.K-FROG.audio
Easy.G4.WNYU.audio
Eclectic.M2.XPN.audio
Electronica.E2.Rinse.audio
Folk.F1.Radionomy.audio
Hiphop.H1.NPR.audio
Indie.I4.WAUG.audio
Jazz.J6.KCSM.audio
Latin.L3.Mega.audio
Misc.X7.Gaydio.audio
News.N9.KQED.audio
Oldies.O1.Lonestar.audio
OldTime.Y1.Roswell.audio
Progressive.P1.Aural Moon.audio
Rock.R8.WXRT.audio
Scanner.Z3.Montreal.audio
Soul.S1.181.FM.audio
Talk.T2.TWiT.audio
World.W3.Persian.audio
http://lh5.googleusercontent.com/-QjLEiAtT4cw/U98_UFcWvvI/AAAAAAAABv8/gyPhbg8s7Bw/w681-h373-no/homenet-radio.png

Whenever you deal with file names that might have spaces in them, you must reference them as "$x" rather than just $x. That's what's causing your cp command to fail.
Your echo command is also problematic. Although echo does the right thing for simple spaces - it echoes a file named A B C as A B C - it will still fail if you have more than one consecutive space in the name, or whitespace that isn't a simple space character.
Instead of passing the file names to external programs for processing, which always requires getting them through the whitespace-hostile command line, you should use bash built-in functions for string manipulations wherever possible, e.g. ${x%%foo}, ${x#bar} and similar functions. The man page describes them under "Parameter expansion".
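As a minimal sketch of the parameter expansions mentioned above, the following applies the four prefix and suffix forms to one of the sample file names from the question. The variable name x follows the question's script; nothing here calls an external program, so embedded spaces are never re-split.

```bash
#!/bin/bash
# Sample name from the question; it contains spaces and several periods.
x='Ambient.A6.SOMA Space Station.audio'

printf '%s\n' "${x#*.}"    # shortest prefix matching *. removed: A6.SOMA Space Station.audio
printf '%s\n' "${x##*.}"   # longest prefix matching *. removed:  audio
printf '%s\n' "${x%.*}"    # shortest suffix matching .* removed: Ambient.A6.SOMA Space Station
printf '%s\n' "${x%%.*}"   # longest suffix matching .* removed:  Ambient
```

The first form is the one the rename in the question needs, since only the leading category field should be dropped.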

Here's my suggestion:
#!/bin/bash
shopt -s nullglob
mkdir fullname
mv *.audio fullname
(
cd fullname || exit
for x in *; do
cp "$x" "../${x#*.}"
done
)
ls
nullglob makes a pattern that matches no files expand to nothing instead of staying a literal *. It is optional.
( ) runs the enclosed commands in a subshell, which saves you from changing back to the original directory afterwards.
|| exit terminates the subshell if cd fails to change directory.
${x#*.} expands to the value of $x with everything up to and including the first period removed.

Related

how to pass asterisk into ls command inside bash script

Hi… Need a little help here…
I tried to emulate the DOS' dir command in Linux using bash script. Basically it's just a wrapped ls command with some parameters plus summary info. Here's the script:
#!/bin/bash
# default to current folder
if [ -z "$1" ]; then var=.;
else var="$1"; fi
# check file existence
if [ -a "$var" ]; then
# list contents with color, folder first
CMD="ls -lgG $var --color --group-directories-first"; $CMD;
# sum all files size
size=$(ls -lgGp "$var" | grep -v / | awk '{ sum += $3 }; END { print sum }')
if [ "$size" == "" ]; then size="0"; fi
# create summary
if [ -d "$var" ]; then
folder=$(find $var/* -maxdepth 0 -type d | wc -l)
file=$(find $var/* -maxdepth 0 -type f | wc -l)
echo "Found: $folder folders "
echo " $file files $size bytes"
fi
# error message
else
echo "dir: Error \"$var\": No such file or directory"
fi
The problem is that when the argument contains an asterisk (*), the ls within the script acts differently compared to the same ls command given directly at the prompt. Instead of returning the whole file list, the script only returns the first file. See the video below for a comparison in action. I don't know why it behaves like that.
Does anyone know how to fix it? Thank you.
Video: problem in action
UPDATE:
The problem has been solved. Thank you all for the answers. Now my script works as expected. See the video here: http://i.giphy.com/3o8dp1YLz4fIyCbOAU.gif

The asterisk * is expanded by the shell when it parses the command line. In other words, your script doesn't get a parameter containing an asterisk, it gets a list of files as arguments. Your script only works with $1, the first argument. It should work with "$@" instead.
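A minimal sketch of handling every argument the shell passes in, rather than only the first one. The function name list_targets and the sample file names are hypothetical; the fallback to the current directory mirrors the default in the question's script.

```bash
#!/bin/bash
# list_targets is a hypothetical helper: it prints one line per argument received.
list_targets() {
    # With no arguments, fall back to the current directory.
    [ "$#" -eq 0 ] && set -- .
    # Quoted "$@" keeps each argument as one word, even if it contains spaces.
    for target in "$@"; do
        printf 'target: %s\n' "$target"
    done
}

list_targets "test one.pas" test.swift
```

Called as list_targets test*, the function would receive whatever files the glob matched, one argument per file.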

This is because when you retrieve $1 you assume the shell does NOT expand *.
In fact, when * (or another glob) matches, the shell expands it into one argument per matching file, and these are passed as $1, $2, etc.
You're lucky if you simply retrieved the first file. When your first file's path contains spaces, you'll get an error, because the unquoted $var inside CMD is split at the space and ls only sees the segments.
Seriously, read this and especially this. Really.
And please don't do things like
CMD=whatever you get from user input; $CMD;
You are begging for trouble. Don't execute an arbitrary string from the user.
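One common alternative to storing a command in a string is to wrap it in a function and pass the arguments through as "$@". The sketch below assumes a hypothetical wrapper named show_listing, and the ls options are only an example, not the ones from the question.

```bash
#!/bin/bash
# show_listing is a hypothetical wrapper; the ls options are only an example.
show_listing() {
    # "$@" hands every argument to ls unchanged, one word per argument.
    ls -d -- "$@"
}

# A name with a space stays a single argument all the way to ls.
tmp=$(mktemp -d)
mkdir "$tmp/my folder"
show_listing "$tmp/my folder"
rm -r "$tmp"
```

Because no string is re-parsed, nothing inside an argument is split or run as a command.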

Both answers above already answered your question, so I'll be a bit more verbose.
Your terminal is (probably) running the bash interpreter. This is the program that parses your input lines and does "things" based on your input.
When you enter a line, bash goes through the following workflow:
parsing and lexical analysis
expansion
brace expansion
tilde expansion
variable expansion
arithmetic and other substitutions
command substitution
word splitting
filename generation (globbing)
removing quotes
Only after all of the above will bash
execute external commands, like ls or dir.sh, etc.,
or perform "internal" actions for known keywords and builtins like echo, for, if, etc.
As you can see, the second-to-last step is filename generation (globbing). So, in your case, if test* matches some files, bash expands the wildcard characters (that is, it does the globbing).
So,
when you enter dir.sh test*,
and test* matches some files,
bash does the expansion first
and only afterwards executes the command dir.sh with the already expanded filenames,
e.g. the script gets executed (in your case) as: dir.sh test.pas test.swift
BTW, it acts in exactly the same way for your ls example:
bash expands ls test* to ls test.pas test.swift,
then executes ls with those two arguments,
and ls prints the result for the two arguments it got.
In other words, ls never even sees the test* argument: whenever possible, bash expands the wildcard characters (* and ?).
Now back to your script: add the following line after the shebang:
echo "$0 got these arguments: $@"
and you will immediately see the real arguments your script was executed with.
Also, in such cases it is good practice to run the script in debug mode, e.g.
bash -x dir.sh test*
and you will see exactly what the script does.
You can do the same for your current interpreter: just enter in the terminal
set -x
and run dir.sh test*, and you will see how bash executes the dir.sh command. (To stop the debug mode, just enter set +x.)

Everybody is giving you valuable advice, which you should definitely follow!
But here is the real answer to your question.
To pass unexpanded arguments to any executable you need to single quote them:
./your_script '*'
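To make the difference between a quoted and an unquoted pattern visible, here is a small sketch that counts the arguments a command receives. The helper count_args and the temporary directory are hypothetical; the two file names are the ones used earlier in this thread.

```bash
#!/bin/bash
# count_args is a hypothetical helper that reports how many arguments it received.
count_args() {
    printf '%s\n' "$#"
}

tmp=$(mktemp -d)
touch "$tmp/test.pas" "$tmp/test.swift"
(
    cd "$tmp" || exit 1
    count_args test*      # unquoted: the shell expands the glob first, prints 2
    count_args 'test*'    # quoted: the literal pattern is passed, prints 1
)
rm -r "$tmp"
```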

The best solution I have is to use the eval command, in this way:
#!/bin/bash
cmd="some command \"with_quetes_and_asterisk_in_it*\""
echo "$cmd"
eval "$cmd"
The eval command takes its arguments and evaluates them as a command, just as the shell would.
This solves my problem when I need to call a command with an asterisk '*' in it from a script.

Listing directories with spaces using Bash in linux

I would like to create a bash script to list all the directories in a directory provided by the user via input, or all the directories in the current directory (given no input).
Here's what I have thus far, but when I execute it I encounter two problems.
1) The script completely ignores my input. The file is located on my desktop but when I type in "home" as the input, the script simply prints the directories of the Desktop (current directory).
2) The directories are printed on their own lines (intended) but it treats each word in a folder name as its own folder, i.e. "this folder" is printed as:
this
folder
Here's the code I have so far:
#!/bin/bash
echo -n "Enter a directory to load files: "
read d
if [ $d="" ]; #if input is blank, assume d = current directory
then d=${PWD##*/}
for i in $(ls -d */);
do echo ${i%%/};
done
else #otherwise, print sub-directories of given directory
for i in $(ls -d */);
do echo ${i%%/};
done
fi
Also in your response please explain your answer as I'm very new to bash.
Thanks for looking, I appreciate your time.
EDIT: Thanks to John1024's answer, I came up with the following:
#!/bin/bash
echo -n "Enter a directory to load files: "
IFS= read d
ls -1 -d "${d:-.}"/*/
And it does everything I need. Much appreciated!

I believe that this script accomplishes what you want:
#!/bin/sh
ls -1 -d "${1:-.}"/*/
Usage example:
$ bash ./script.sh /usr/X11R6
/usr/X11R6/bin
/usr/X11R6/man
Explanation:
-1 tells ls to print each file/directory on a separate line
-d tells ls to list directories by name instead of their contents
The shell will expand ${1:-.} to the first argument to the script if there is one, or to . (which means the current directory) if there isn't.
Enhancement
The above script displays a / at the end of each directory name. If you don't want that, we can use sed to remove trailing slashes from the output:
#!/bin/sh
ls -1d "${1:-.}"/*/ | sed 's|/$||'
Revised Version of Your Script
Starting with your script, some simplifications can be made:
#!/bin/bash
echo -n "Enter a directory to load files: "
IFS= read d
d=${d:-$PWD}
for i in "$d"/*/
do
echo "${i%%/}"
done
Notes:
IFS= read d
Normally leading and trailing white space is stripped before the input is assigned to d. By setting IFS to an empty value, however, leading and trailing white space is preserved. Thus this will work even in the pathologically strange case where the user specifies a directory whose name begins or ends with white space.
If the user enters a backslash, the shell will try to process it as an escape. If you don't like that, use IFS= read -r d and backslashes will be treated as normal characters, not escapes.
d=${d:-$PWD}
If the user supplied a value for d, this leaves it unchanged. If not, this sets it to $PWD.
for i in "$d"/*/
This will loop over every subdirectory of $d and will correctly handle subdirectory names with spaces, tabs, or any other odd character.
By contrast, consider:
for i in $(ls -d */)
After ls executes here, the shell will split up the output into individual words. This is called "word splitting" and is why this form of the for loop should be avoided.
Notice the double-quotes in for i in "$d"/*/. They are there to prevent word splitting on $d.
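The difference between the two loop forms can be checked with a small experiment. This sketch assumes a scratch directory from mktemp whose own path contains no whitespace; the directory names are made up for illustration.

```bash
#!/bin/bash
tmp=$(mktemp -d)
mkdir "$tmp/this folder" "$tmp/plain"

# Glob: each directory is one word, whatever characters its name contains.
glob_count=0
for i in "$tmp"/*/; do
    glob_count=$((glob_count + 1))
done

# Command substitution: the output of ls is split on whitespace.
ls_count=0
for i in $(ls -d "$tmp"/*/); do
    ls_count=$((ls_count + 1))
done

# Under the assumption above, this prints: glob: 2, ls: 3
printf 'glob: %s, ls: %s\n' "$glob_count" "$ls_count"
rm -r "$tmp"
```

The extra iteration in the second loop comes from "this folder" being split into two words.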

How to execute Linux shell variables within double quotes?

I have the following hacking challenge, and we don't know whether there is a valid solution.
We have the following server script:
read s # read user input into var s
echo "$s"
# tests if it starts with 'a-f'
echo "$s" > "/home/user/${s}.txt"
We only control the input "$s". Is there a possibility to send OS-commands like uname or do you think "no way"?

I don't see any avenue for executing arbitrary commands. The script quotes $s every time it is referenced, so that limits what you can do.
The only serious attack vector I see is that the echo statement writes to a file name based on $s. Since you control $s, you can cause the script to write to some unexpected locations.
$s could contain a string like bob/important.txt. This script would then overwrite /home/user/bob/important.txt if executed with sufficient permissions. Sorry, Bob!
Or, worse, $s could be bob/../../../etc/passwd. The script would try to write to /home/user/bob/../../../etc/passwd. If the script is running as root... uh oh!
It's important to note that the script can only write to these places if it has the right permissions.
You could embed unusual characters in $s that would cause irregular file names to be created. Careless scripts could then be taken advantage of. For example, if $s were foo -rf . bar, then the file /home/user/foo -rf . bar.txt would be created.
If someone ran for file in /home/user/*; do rm $file; done they'd have a surprise on their hands. They would end up running rm /home/user/foo -rf . bar.txt, which is a disaster. If you take out /home/user/foo and bar.txt you're left with rm -rf . and everything in the current directory is deleted. Oops!
(They should have quoted "$file"!)
And there are two other minor things which, while I don't know how to take advantage of them maliciously, do cause the script to behave slightly differently than intended.
read allows backslashes to escape characters like space and newline. You can enter \space to embed spaces and \enter to have read parse multiple lines of input.
echo accepts a couple of flags. If $s is -n or -e then it won't actually echo $s; rather, it will interpret $s as a command-line flag.
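Both quirks can be avoided with standard tools: read -r leaves backslashes alone, and printf with a %s format prints its argument as data rather than as options. The sketch below uses a hypothetical helper print_line; it is an illustration of the two points above, not a fix for the challenge script.

```bash
#!/bin/bash
# print_line is a hypothetical helper: it prints its first argument verbatim.
print_line() {
    printf '%s\n' "$1"
}

# A value that echo would treat as a flag is printed literally.
print_line '-n'

# With -r, read keeps the backslash instead of treating it as an escape.
printf 'a\\b\n' | {
    IFS= read -r s
    print_line "$s"
}
```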

Use read -r s, or any \ will be lost or misinterpreted by your command.
read -r s?"Your input: "
if [ -n "${s}" ]
then
# "filter" file name from command
echo "${s##*/}" | sed 's|^ *\([[:alnum:]_]\{1,\}\)[[:blank:]].*|/home/user/\1.txt|' | read Output
(
# put any limitation on user here
ulimit -t 5 1>/dev/null 2>&1
`${read}`
) > ${OutPut}
else
echo "Bad command" > /home/user/Error.txt
fi

Sure:
read s
$s > /home/user/"$s".txt
If I enter uname, this prints Linux. But beware: this is a security nightmare. What if someone enters rm -rf $HOME? You'd also have issues with commands containing a slash.

"For" loop in bash script only run once

The script goal is simple.
I have many directories that contain captured traffic files.
I want to run a command for each directory, so I came up with a script. But I don't know why the script runs only for the first match.
#!/bin/bash
# Collect throughput from a group of directory containing capture files
# Group of directory can be specify by pattern
# Usage: ./collectThroughputList [regex]
# [regex] is the name pattern of the group of directory
for DIR in $( ls -d $1 ); do
if test -d "$DIR"; then
echo Collecting throughputs from directory: "$DIR"
( sh collectThroughput.sh $DIR > $DIR.txt )
fi
done
echo Done\!
I tried it with:
for DIR in $1; do
or
for DIR in `ls -d $1`; do
or
for DIR in $( ls -d "$1" ); do
or
for DIR in $( ls -d $1 ); do
But the result is the same. The for loop runs only one time.
Finally I found this one and did some tricks for it to work. However, I would like to know why my first script doesn't work.
find *Delay50ms* -type d -exec bash -c "cd '{}' && echo enter '{}' && ../collectThroughput.sh ../'{}' > ../'{}'.txt" \;
"*Delay*" is the directory pattern name that I want to run the command with.
Thanks for pointing out the issues.

Since you want to find all sub-directories under $1, use it like this:
for DIR in $(find $1 -type d)

Problem
Most probably the problem you are encountering is that you are passing a pattern like * as the argument to your script.
Running it with something like:
my_script *
What happens here is that the shell expands * before calling your script.
Thus $1 in your script references only the first entry the pattern matched.
Example
Given the following directory layout:
directory_a
directory_b
directory_c
Calling my_script * will result in:
my_script directory_a directory_b directory_c
being called, so your loop iterates only over $(ls -d directory_a), which is nothing but directory_a alone.
Solution
To have the program run with $1=* you would have to escape the * prior to calling your script.
Try running:
my_script \*
You will see that it then does what it is intended to do. This way $1 in your script will contain * instead of directory_a, which is most probably how you wanted your script to work.

As mikyra has pointed out, the shell expands your argument * to all entries in your directory before passing it to your script.
If you want shell expansion of your wildcards (e.g. * matches all but hidden files), you can simply leave the expansion to the shell and use the result by iterating over all arguments, rather than just the first one:
for DIR in "$@"; do
# ...
done
If you want to do the expansion yourself (e.g. because the pattern should be applied only to a pre-filtered list or to files in a different directory, or because you want regex expansion rather than shell globbing), you have to protect the argument from being expanded by the shell, either with backslash notation (like mikyra's \*) or with quotes (which are often easier to use):
my_script "*"
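For the first variant, where the shell does the expansion, the loop from the question could be restructured roughly as follows. The function name collect_dirs is hypothetical, and the printf line stands in for the call to collectThroughput.sh, which is not reproduced here.

```bash
#!/bin/bash
# collect_dirs is a hypothetical stand-in for the loop in the question:
# it reports each argument that is a directory and skips everything else.
collect_dirs() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf 'Collecting throughputs from directory: %s\n' "$dir"
        fi
    done
}

# Only "." is a directory here, so only one line is printed.
collect_dirs . no-such-entry
```

Invoked as collect_dirs *Delay50ms*, the shell would pass every matching entry as its own argument, and the loop would visit each of them.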

Shell Script: Truncating String

I have two folders full of trainings and corresponding testfiles and I'd like to run the fitting pairs against each other using a shell script.
This is what I have so far:
for x in SpanishLS.train/*.train
do
timbl -f $x -t SpanishLS.test/$x.test
done
This is supposed to take file1(-n).train in one directory, look for file1(-n).test in the other, and run them through a tool called timbl.
What it does instead is look for a file called SpanishLS.train/file1(-n).train.test which of course doesn't exist.
What I tried to do, to no avail, is truncate $x in a way that lets the script find the correct file, but whenever I do this, $x is truncated way too early, resulting in the script not even finding the .train file.
How should I code this?

If I got you right, this will do the job:
for x in SpanishLS.train/*.train
do
y=${x##*/} # strip the directory part
y=${y%.*} # strip the extension
timbl -f "$x" -t "SpanishLS.test/$y.test"
done

Use basename:
for x in SpanishLS.train/*.train
do
timbl -f $x -t SpanishLS.test/$(basename "$x" .train).test
done
That removes the directory prefix and the .train suffix from $x, and builds up the name you want.
In bash (and other POSIX-compliant shells), you can do the basename operation with two shell parameter expansions without invoking an external program. (I don't think there's a way to combine the two expansions into one.)
for x in SpanishLS.train/*.train
do
y=${x##*/} # Remove path prefix
timbl -f $x -t SpanishLS.test/${y%.train}.test # Remove .train suffix
done
Beware: bash supports quite a number of (useful) expansions that are not defined by POSIX. For example, ${y//.train/.test} is a bash-only notation (or bash and compatible shells notation).
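The two POSIX expansions can be checked in isolation before wiring them into the timbl call. This sketch assumes a hypothetical helper test_path_for and the directory names from the question; it only builds the path and does not run timbl.

```bash
#!/bin/bash
# test_path_for is a hypothetical helper built from the two expansions above.
test_path_for() {
    name=${1##*/}    # drop the directory part: file1.train
    printf '%s\n' "SpanishLS.test/${name%.train}.test"
}

test_path_for "SpanishLS.train/file1.train"   # prints SpanishLS.test/file1.test
```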

Replace all occurrences of .train in the path with .test:
timbl -f "$x" -t "$(echo "$x" | sed 's/\.train/.test/g')"
