deal with filename with space in shell [duplicate] - linux

This question already has answers here:
Iterate over a list of files with spaces
I've read the answer here, but it still goes wrong.
In my folder, I only want to process the *.gz files, and Windows 10.tar.gz has a space in its filename.
Assume the folder contains:
Windows 10.tar.gz Windows7.tar.gz otherfile
Here is my shell script. I tried every way of quoting with "", but I still can't get what I want.
crypt_import_xml.sh
#!/bin/bash
rule_dir=/root/demo/rule
function crypt_import_xml()
{
rule=$1
# list the files by absolute path
for file in `ls ${rule}/*.gz`; do
echo "${file}"
#tar -xf *.gz
#mv a b.xml to ab.xml
done
}
crypt_import_xml ${rule_dir}
Here is what I got:
root@localhost.localdomain:[/root/demo]./crypt_import_xml.sh
/root/demo/rule/Windows
10.tar.gz
/root/demo/rule/Windows7.tar.gz
After tar xf extracts the *.gz file, the XML filename still contains a space. Dealing with filenames that contain spaces is a nightmare for me.

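You shouldn't use ls in a for loop. The output of the backtick substitution is split on whitespace, so a filename containing spaces is broken into several words.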
$ ls directory
file.txt 'file with more spaces.txt' 'file with spaces.txt'
Using ls:
$ for file in `ls ./directory`; do echo "$file"; done
file.txt
file
with
more
spaces.txt
file
with
spaces.txt
Using file globbing:
$ for file in ./directory/*; do echo "$file"; done
./directory/file.txt
./directory/file with more spaces.txt
./directory/file with spaces.txt
So:
for file in "$rule"/*.gz; do
echo "$file"
#tar -xf *.gz
#mv a b.xml to ab.xml
done
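One detail the glob loop leaves open is the case where no *.gz file exists: the pattern then stays as literal text and the loop body runs once with it. A minimal sketch of a guard follows; list_gz is a hypothetical helper name, not part of the original script.

```bash
#!/usr/bin/env bash
# list_gz (hypothetical helper): print every .gz file in a directory,
# one per line, keeping spaces in the names intact.
list_gz() {
    local dir=$1 file
    for file in "$dir"/*.gz; do
        # An unmatched glob is left as the literal pattern; skip it.
        [ -e "$file" ] || continue
        printf '%s\n' "$file"
    done
}
```

Calling list_gz "$rule_dir" would print /root/demo/rule/Windows 10.tar.gz as a single line, and nothing at all for a directory without .gz files.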

You do not need to call ls in the for loop. The file globbing takes place in your shell, without running this additional command:
XXX-macbookpro:testDir XXX$ ls -ltra
total 0
drwx------+ 123 XXX XXX 3936 Feb 22 17:15 ..
-rw-r--r-- 1 XXX XXX 0 Feb 22 17:15 abc 123
drwxr-xr-x 3 XXX XXX 96 Feb 22 17:15 .
XXX-macbookpro:testDir XXX$ rule=.
XXX-macbookpro:testDir XXX$ for f in "${rule}"/*; do echo "$f"; done
./abc 123
In your case you can change the "${rule}"/* into:
"${rule}"/*.gz;

Related

Testing a file in a tar archive

I've been manipulating a tar file and I would like to test if a file exists before extracting it.
Let's say I have a tar file called Archive.Tar and after entering
tar -tvf Archive.Tar
I get:
-rwxrwxrwx guy/root 1502 2013-10-02 20:43 Directory/File
-rwxrwxrwx guy/root 494 2013-10-02 20:43 Dir/SubDir/Text
drwxrwxrwx guy/root 0 2013-10-02 20:43 Directory
I want to extract Text into my Working directory, but I want to be sure that it's actually a file by doing this:
if [ -f Dir/SubDir/Text ]
then
echo "OK"
else
echo "KO"
fi
The result of this test is always KO and I really don't understand why, any suggestions?
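The [ -f ... ] test in the question looks at the filesystem, not inside the archive, so it prints KO until the file has actually been extracted.
Tested with BSD and GNU versions of tar:
in the output of tar tf,
entries that are directories end with /.
So to test whether Dir/SubDir/Text is a file or a directory in the archive,
you can simply grep with full-line matching:
if tar tf Archive.Tar | grep -x Dir/SubDir/Text >/dev/null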
then
echo "OK"
else
echo "KO"
fi
If the archive contains Dir/SubDir/Text/, then Dir/SubDir/Text is a directory, and the grep will not match, so KO will be printed.
If the archive contains Dir/SubDir/Text without a trailing /,
then Dir/SubDir/Text is a file and the grep will match,
so OK will be printed.
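The same check can be wrapped in a small function. This is only a sketch: tar_has_file is a hypothetical name, and -F is added so that the member path is matched as a fixed string rather than as a regular expression.

```bash
#!/usr/bin/env bash
# tar_has_file (hypothetical helper): succeed when ARCHIVE lists MEMBER
# exactly, i.e. without the trailing / that marks a directory entry.
tar_has_file() {
    local archive=$1 member=$2
    tar -tf "$archive" | grep -Fx -- "$member" >/dev/null
}
```

It could then be used as: if tar_has_file Archive.Tar Dir/SubDir/Text; then echo "OK"; else echo "KO"; fi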
if [ ! -d Dir/SubDir/Text ]
then
echo "OK"
else
echo "KO"
fi
This prints KO only if a directory named Text exists at that path. It prints OK if the path is a regular file or does not exist (and, to be precise, also if it is a symlink that does not point to a directory).
This might be a solution:
tar -tvf Archive.Tar | grep Dir/SubDir/Text
It prints the matching entry if the archive contains the file.

How do I find the latest date folder in a directory and then construct the command in a shell script?

I have a directory in which I will have some folders with date format (YYYYMMDD) as shown below -
david@machineX:/database/batch/snapshot$ ls -lt
drwxr-xr-x 2 app kyte 86016 Oct 25 05:19 20141023
drwxr-xr-x 2 app kyte 73728 Oct 18 00:21 20141016
drwxr-xr-x 2 app kyte 73728 Oct 9 22:23 20141009
drwxr-xr-x 2 app kyte 81920 Oct 4 03:11 20141002
Now I need to extract latest date folder from the /database/batch/snapshot directory and then construct the command in my shell script like this -
./file_checker --directory /database/batch/snapshot/20141023/ --regex ".*.data" > shardfile_20141023.log
Below is my shell script -
#!/bin/bash
./file_checker --directory /database/batch/snapshot/20141023/ --regex ".*.data" > shardfile_20141023.log
# now I need to grep shardfile_20141023.log after above command is executed
How do I find the latest date folder and construct above command in a shell script?
One approach is to list the entries newest first and keep only the names that consist of exactly 8 digits:
ls -t1 | grep -P -e '^\d{8}$' | head -1
Or
ls -t1 | grep -E -e '^[0-9]{8}$' | head -1
You could try the following in your script:
pushd /database/batch/snapshot
LATESTDATE=`ls -d * | sort -n | tail -1`
popd
./file_checker --directory /database/batch/snapshot/${LATESTDATE}/ --regex ".*.data" > shardfile_${LATESTDATE}.log
See BashFAQ#099 aka "How can I get the newest (or oldest) file from a directory?".
That being said, if you don't care for actual modification time and just want to find the most recent directory based on name you can use an array and globbing (note: the sort order with globbing is subject to LC_COLLATE):
$ find
.
./20141002
./20141009
./20141016
./20141023
$ foo=( * )
$ echo "${foo[${#foo[@]}-1]}"
20141023
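Putting the name-based approach together, a sketch might look like the following. latest_snapshot is a hypothetical helper name. It assumes the folder names are always 8 digits, so the sorted glob order is also date order, and it ignores anything that is not a directory.

```bash
#!/usr/bin/env bash
# latest_snapshot (hypothetical helper): print the highest YYYYMMDD
# directory name found directly under the given base directory.
latest_snapshot() {
    local base=$1 dir latest=
    # The trailing / restricts the glob to directories.
    for dir in "$base"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/; do
        [ -d "$dir" ] || continue      # unmatched glob stays literal
        dir=${dir%/}
        latest=${dir##*/}              # matches are sorted, so the last one wins
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}

# Possible use in the script from the question:
#   latest=$(latest_snapshot /database/batch/snapshot) || exit 1
#   ./file_checker --directory "/database/batch/snapshot/$latest/" \
#       --regex ".*.data" > "shardfile_$latest.log"
```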

bash tar error doesn't create tar.gz

I have the following bash script:
#DIR is something like: /home/foo/foobar/test/ without any whitespace but can also include whitespace
DIR="$( cd "$( dirname "$0" )" && pwd )"
#backup_name is read from a file
backup_name=FOOBAR
date=`date +%Y%m%d_%H%M_%S`
#subdirs is also read from the same file
subdirs="etc/ sbin/ bin/"
filename="$DIR/Backup_$backup_name"_"$date.tar.gz"
cd /
echo "filename: $filename"
echo "subdirs $subdirs"
cmd='tar czvf "'$filename'" '$subdirs
echo "cmd tar: $cmd"
$cmd
But I get following output:
filename: /home/foo/foobar/test/Backup_FOOBAR_20120322_1529_35.tar.gz
subdirs: etc/ sbin/ bin/
cmd tar: tar cfvz "/home/foo/foobar/test/Backup_FOOBAR_20120322_1529_35.tar.gz" etc/ sbin/ bin/
etc/
# ... list of files in etc
# but no files from sbin or bin directory
tar: "/home/foo/foobar/test/Backup_FOOBAR_20120322_1529_35.tar.gz": Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
However, when I copy the echoed tar command, cd to / and paste it into the bash shell, it works:
tar cfvz "/home/foo/foobar/test/Backup_FOOBAR_20120322_1529_35.tar.gz" etc/ sbin/ bin/
etc/
Every variable is defined and there is no trailing newline.
I also tried $cmd with backticks.
The two variables backup_name and subdirs are read from a file (I did not include the reading process in the code).
edit: I just copied my script to a directory with no whitespace in its path and changed the line:
cmd='tar czvf "'$filename'" '$subdirs
#to
cmd="tar czvf $filename $subdirs"
and it works now, but when I do the same in a directory whose path contains whitespace I still get the same error.
edit2: reading from file (the file is read before anything else happens)
config="config.txt"
local line
while read line
do
#points to next free element and declares it
config_lines[${#config_lines[@]}]=$line
done <$config
backup_name=${config_lines[0]}
subdirs=${config_lines[1]}
What is wrong with my bash script?
Short answer: see BashFAQ #050: I'm trying to put a command in a variable, but the complex cases always fail!.
Long answer: embedding quotes in a variable doesn't do anything useful, because when you use it (i.e. $cmd), bash parses quotes before replacing variables; by the time the quotes are there, it's too late for them to do any good. You do, however, have several options:
Don't bother with putting the command in a variable in the first place, just use it directly:
echo "filename: $filename"
echo "subdirs $subdirs"
tar czvf "$filename" $subdirs
If you really need to put it in a variable first, use an array rather than a plain text variable (and ideally, do the same with the subdirs list):
subdirs=(etc/ sbin/ bin/)
...
echo "filename: $filename"
echo "subdirs ${subdirs[*]}"
cmd=(tar czvf "$filename" "${subdirs[@]}")
printf "cmd tar:"
printf " %q" "${cmd[@]}" # Have to do some trickery to get it printed right
printf "\n"
"${cmd[@]}"
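To see the difference between the two forms without running tar, the following sketch counts the arguments a command actually receives. count_args, string_form and array_form are hypothetical names used only for this illustration.

```bash
#!/usr/bin/env bash
# count_args (hypothetical helper): print how many arguments it was given.
count_args() { echo "$#"; }

# Command stored in a plain string with embedded quotes.
string_form() {
    local cmd
    cmd="count_args \"$1\""
    $cmd            # split on whitespace; the quotes stay literal characters
}

# Command stored in an array, one element per argument.
array_form() {
    local cmd=(count_args "$1")
    "${cmd[@]}"     # each element is passed through as exactly one argument
}

string_form '/tmp/my dir/Backup.tar.gz'   # prints 2
array_form  '/tmp/my dir/Backup.tar.gz'   # prints 1
```

With the string form the filename arrives as two words, the first starting with a literal double quote, which is the same kind of name tar complained about.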
Instead of mucking about with messy quoting issues you could get the results you want a different way and, perhaps, save some time. How about something like this?
#!/usr/bin/env bash
# abusing set -v for fun and profit
tar_output=/tmp/$$.tarout
tar_command=/tmp/$$.tarcmd
tmp_script=/tmp/$$.script
dir="$(cd "$(dirname "$0")"; pwd)"
cat>"${tmp_script}"<<-'END'
datestamp=$(date +%Y%m%d_%H%M_%S)
subdirs=(etc sbin bin)
backup_name=FOOBAR
filename="$1/Backup_${backup_name}_${datestamp}.tar.gz"
printf 'tar cmd: '
set -v
tar czvf "$filename" "${subdirs[@]}" 2>"$2"
set +v
END
bash "${tmp_script}" "$dir" "${tar_output}" 2>"${tar_command}"
cat "${tar_command}" | head -n 1 | sed -e 's/2>"\$2"$//'
cat "${tar_output}"
rm -f "${tmp_script}" "${tar_command}" "${tar_output}"
I apologize for nothing, but in the real world note that you'd want to make proper temp files.
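For the "proper temp files" part, one common pattern is a private directory from mktemp plus a cleanup trap. This is a sketch under the assumption that mktemp -d is available (it is on GNU and BSD systems). The variable names mirror the script above, and workdir is a name introduced here.

```bash
#!/usr/bin/env bash
# Private scratch directory with an unpredictable name, instead of /tmp/$$.*
workdir=$(mktemp -d) || exit 1
# Remove the directory and everything in it when the script exits.
trap 'rm -rf "$workdir"' EXIT

tar_output=$workdir/tarout
tar_command=$workdir/tarcmd
tmp_script=$workdir/script
```

With the trap in place, the explicit rm -f at the end of the script is no longer needed.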
If you execute the string $cmd, it won't work when the filename contains spaces.
You have to let bash build the arguments,
like this:
tar czvf "${filename}" $subdirs
You don't even need to put '\' in the filename.
OK, your original script did not work because the shell processes quotes before it expands variables, so the double quotes stored in $cmd reach tar as literal characters and the filename is wrong: tar thinks that it's supposed to write to a file in the current directory named "/home/foo/foobar/test/Backup_FOOBAR_20120322_1529_35.tar.gz", i.e. a file name that contains slashes and double quotes! Compare this with the error for a path that simply does not exist:
tar cfz /this/file/does/nopt/exist .
tar: /this/file/does/nopt/exist: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
See the difference? There are no double quotes around the file name/path in tar's error message.
It worked when you copied and pasted the line because then the double quotes are interpreted by the shell.
Witness:
ls -l /tmp/screen-exchange
-rw-rw-rw- 1 aqn users 0 Mar 21 07:29 /tmp/screen-exchange
cmd='ls -l "'/tmp/screen-exchange'"'
$cmd
/bin/ls: "/tmp/screen-exchange": No such file or directory
eval $cmd
-rw-rw-rw- 1 aqn users 0 Mar 21 07:29 /tmp/screen-exchange
Of course, eval alone won't guard against filenames with whitespace in them. To guard against that, keep the variable reference quoted inside the command string, like so:
date>'file name with spaces'
file='file name with spaces' # this is the equivalent of your $filename
cmd='ls -l "$file"'
$cmd
ls: "$file": No such file or directory
eval $cmd
-rw-r--r-- 1 andyn SPICE\Domain Users 1083 Mar 22 15:28 a b
I would suggest you keep $cmd separate from $filename and $subdirs. I think the error is introduced when you join these strings. Also, packing multiple variables into one variable without proper quoting will cause errors.
This should work for you:
cmd="tar -zcvf"
subdirs="etc/ sbin/ bin/"
filename="${DIR}/Backup_${backup_name}_${date}.tar.gz"
$cmd "$filename" $subdirs
#DIR is something like: /home/foo/foobar/test/ without any whitespace but can also include whitespace
DIR="$( cd "$( dirname "$0" )" && pwd )"
backup_name=FOOBAR
date=`date +%Y%m%d_%H%M_%S`
subdirs="etc/ sbin/ bin/"
filename="$DIR/Backup_$backup_name"_"$date.tar.gz"
cd /
echo "filename: $filename"
echo "subdirs $subdirs"
cmd="tar zcvf $filename $subdirs"
echo "cmd tar: $cmd"
$cmd

rsync prints "skipping non-regular file" for what appears to be a regular directory

I back up my files using rsync. Right after a sync, I ran it expecting to see nothing, but instead it looked like it was skipping directories. I've (obviously) changed names, but I believe I've still captured all the information I could. What's happening here?
$ ls -l /source/backup/myfiles
drwxr-xr-x 2 me me 4096 2010-10-03 14:00 foo
drwxr-xr-x 2 me me 4096 2011-08-03 23:49 bar
drwxr-xr-x 2 me me 4096 2011-08-18 18:58 baz
$ ls -l /destination/backup/myfiles
drwxr-xr-x 2 me me 4096 2010-10-03 14:00 foo
drwxr-xr-x 2 me me 4096 2011-08-03 23:49 bar
drwxr-xr-x 2 me me 4096 2011-08-18 18:58 baz
$ file /source/backup/myfiles/foo
/source/backup/myfiles/foo/: directory
Then I sync (expecting no changes):
$ rsync -rtvp /source/backup /destination
sending incremental file list
backup/myfiles
skipping non-regular file "backup/myfiles/foo"
skipping non-regular file "backup/myfiles/bar"
And here's the weird part:
$ echo 'hi' > /source/backup/myfiles/foo/test
$ rsync -rtvp /source/backup /destination
sending incremental file list
backup/myfiles
backup/myfiles/foo
backup/myfiles/foo/test
skipping non-regular file "backup/myfiles/foo"
skipping non-regular file "backup/myfiles/bar"
So it worked:
$ ls -l /source/backup/myfiles/foo
-rw-r--r-- 1 me me 3126091 2010-06-15 22:22 IMGP1856.JPG
-rw-r--r-- 1 me me 3473038 2010-06-15 22:30 P1010615.JPG
-rw-r--r-- 1 me me 3 2011-08-24 13:53 test
$ ls -l /destination/backup/myfiles/foo
-rw-r--r-- 1 me me 3126091 2010-06-15 22:22 IMGP1856.JPG
-rw-r--r-- 1 me me 3473038 2010-06-15 22:30 P1010615.JPG
-rw-r--r-- 1 me me 3 2011-08-24 13:53 test
but still:
$ rsync -rtvp /source/backup /destination
sending incremental file list
backup/myfiles
skipping non-regular file "backup/myfiles/foo"
skipping non-regular file "backup/myfiles/bar"
Other notes:
My actual directories "foo" and "bar" do have spaces, but no other strange characters. Other directories have spaces and have no problem. I 'stat'-ed and saw no differences between the directories that don't rsync and the ones that do.
If you need more information, just ask.
Are you absolutely sure those individual files are not symbolic links?
Rsync has a few useful flags such as -l which will "copy symlinks as symlinks". Adding -l to your command:
rsync -rtvpl /source/backup /destination
I believe symlinks are skipped by default because they can be a security risk. Check the man page or --help for more info on this:
rsync --help | grep link
To verify these are symbolic links or pro-actively to find symbolic links you can use file or find:
$ file /path/to/file
/path/to/file: symbolic link to `/path/file`
$ find /path -type l
/path/to/file
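When checking by hand, the order of the tests matters, because -d and -f follow symlinks while -L does not. A minimal sketch follows; classify_path is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# classify_path (hypothetical helper): report what a path really is.
# -L is tested first because -d and -f follow symlinks.
classify_path() {
    local path=$1
    if [ -L "$path" ]; then
        echo symlink
    elif [ -d "$path" ]; then
        echo directory
    elif [ -f "$path" ]; then
        echo file
    else
        echo other
    fi
}
```

Running it over the entries that rsync skips, for example classify_path "/source/backup/myfiles/foo", would show whether they are symlinks to directories rather than real directories.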
Are you absolutely sure that it's not a symbolic link to a directory?
Try:
file /source/backup/myfiles/foo
to make sure it's a directory.
Also, it could very well be a loopback mount.
Try
mount
and make sure that /source/backup/myfiles/foo is not listed.
You should try the command below; most probably it will work for you (-a implies -rlptgoD, so symlinks are copied as symlinks):
rsync -ravz /source/backup /destination
You can try the following, it will work
rsync -rtvp /source/backup /destination
I personally always use this syntax in my scripts, and it works a treat for backing up the entire system (skipping sys/*, proc/* and nfs4/*):
sudo rsync --delete --stats --exclude-from $EXCLUDE -rlptgoDv / $TARGET/ | tee -a $LOG
Here is my script run by root's cron daily:
#!/bin/bash
#
NFS="/nfs4"
HOSTNAME=`hostname`
TIMESTAMP=`date "+%Y%m%d_%H%M%S"`
EXCLUDE="/home/gcclinux/Backups/root-rsync.excludes"
TARGET="${NFS}/${HOSTNAME}/SYS"
LOGDIR="${NFS}/${HOSTNAME}/SYS-LOG"
CMD=`/usr/bin/stat -f -L -c %T ${NFS}`
## CHECK IF NFS IS MOUNTED...
if [[ ! $CMD == "nfs" ]];then
echo "NFS NOT MOUNTED"
exit 1
fi
## CHECK IF LOG DIRECTORY EXIST
if [ ! -d "$LOGDIR" ]; then
/bin/mkdir -p "$LOGDIR"
fi
## CREATE LOG HEADER
LOG="$LOGDIR/rsync_result.$TIMESTAMP.txt"
echo "-------------------------------------------------------" | tee -a "$LOG"
date | tee -a "$LOG"
echo "" | tee -a "$LOG"
## START RUNNING BACKUP
/usr/bin/rsync --delete --stats --exclude-from "$EXCLUDE" -rlptgoDv / "$TARGET/" | tee -a "$LOG"
In some cases, just copy the file to another location (like your home directory) and then try again.

bash script to rename all files in a directory?

I have a bunch of files that need to be renamed.
file1.txt needs to be renamed to file1_file1.txt
file2.avi needs to be renamed to file2_file2.avi
As you can see, I need the base name, then _ followed by the original file name.
There are a lot of these files.
So far all the answers given either:
Require some non-portable tool
Break horribly with filenames containing spaces or newlines
Are not recursive, i.e. do not descend into sub-directories
These two scripts solve all of those problems.
Bash 2.X/3.X
#!/bin/bash
while IFS= read -r -d $'\0' file; do
dirname="${file%/*}/"
basename="${file:${#dirname}}"
echo mv "$file" "$dirname${basename%.*}_$basename"
done < <(find . -type f -print0)
Bash 4.X
#!/bin/bash
shopt -s globstar
for file in ./**; do
if [[ -f "$file" ]]; then
dirname="${file%/*}/"
basename="${file:${#dirname}}"
echo mv "$file" "$dirname${basename%.*}_$basename"
fi
done
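Both scripts build the new name with the same parameter expansions. The sketch below isolates that step so it can be tried on sample paths without touching any files. new_name is a hypothetical helper name, and it assumes the path contains at least one / (as the output of find . and ./** always does).

```bash
#!/usr/bin/env bash
# new_name (hypothetical helper): compute the target name for one path.
new_name() {
    local file=$1 dir base
    dir=${file%/*}/       # everything up to and including the last /
    base=${file##*/}      # file name without the directory part
    # ${base%.*} strips only the last extension
    printf '%s\n' "$dir${base%.*}_$base"
}

new_name ./file1.txt          # prints ./file1_file1.txt
new_name './a b/a.b.txt'      # prints ./a b/a.b_a.b.txt
```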
Be sure to remove the echo from whichever script you choose once you are satisfied with its output, and then run it again.
Edit
Fixed problem in previous version that did not properly handle path names.
For your specific case, you want to use mmv as follows:
pax> ll
total 0
drwxr-xr-x+ 2 allachan None 0 Dec 24 09:47 .
drwxrwxrwx+ 5 allachan None 0 Dec 24 09:39 ..
-rw-r--r-- 1 allachan None 0 Dec 24 09:39 file1.txt
-rw-r--r-- 1 allachan None 0 Dec 24 09:39 file2.avi
pax> mmv '*.*' '#1_#1.#2'
pax> ll
total 0
drwxr-xr-x+ 2 allachan None 0 Dec 24 09:47 .
drwxrwxrwx+ 5 allachan None 0 Dec 24 09:39 ..
-rw-r--r-- 1 allachan None 0 Dec 24 09:39 file1_file1.txt
-rw-r--r-- 1 allachan None 0 Dec 24 09:39 file2_file2.avi
You need to be aware that the wildcard matching is not greedy. That means that the file a.b.txt will be turned into a_a.b.txt, not a.b_a.b.txt.
The mmv program was installed as part of my CygWin but I had to
sudo apt-get install mmv
on my Ubuntu box to get it. If it's not in your standard distribution, whatever package manager you're using will hopefully have it available.
If, for some reason, you're not permitted to install it, you'll have to use one of the other bash for-loop-type solutions shown in the other answers. I prefer the terseness of mmv myself but you may not have the option.
for file in file*.*
do
[ -f "$file" ] && echo mv "$file" "${file%%.*}_$file"
done
Idea for recursion
recurse() {
for file in "$1"/*;do
if [ -d "$file" ];then
recurse "$file"
else
# check for relevant files
# echo mv "$file" "${file%%.*}_$file"
: # no-op placeholder; bash does not allow an empty else branch
fi
done
}
recurse /path/to/files
find . -type f | while IFS= read -r FN; do
DIR=$(dirname "$FN")
BFN=$(basename "$FN")
NFN=${BFN%.*}_${BFN}
echo "$BFN -> $NFN"
# keep the renamed file in its original directory
mv "$FN" "$DIR/$NFN"
done
I like the Perl Cookbook's rename script for this. It may not be /bin/sh, but you can do regular-expression renames.
The /bin/sh method would be to use sed/cut/awk to alter each filename inside a for loop. If the directory is large you'd need to rely on xargs.
One should mention the mmv tool, which is especially made for this.
It's described here: http://tldp.org/LDP/GNU-Linux-Tools-Summary/html/mass-rename.html
...along with alternatives.
I use prename (Perl based), which is included in various Linux distributions. It works with regular expressions, so to change, say, all img_x.jpg to IMAGE_x.jpg you'd do
prename 's/img_/IMAGE_/' img*jpg
You can use the -n flag to preview changes without making any actual changes.
#!/bin/bash
# Don't do this like I did:
# files=`ls ${1}`
for file in *.*
do
if [ -f "$file" ];
then
newname=${file%%.*}_${file}
mv "$file" "$newname"
fi
done
This one won't rename sub directories, only regular files.

Resources