Testing a file in a tar archive - linux

I've been manipulating a tar file and I would like to test whether a file exists in it before extracting it.
Let's say I have a tar file called Archive.Tar and after entering
tar -tvf Archive.Tar
I get:
-rwxrwxrwx guy/root 1502 2013-10-02 20:43 Directory/File
-rwxrwxrwx guy/root 494 2013-10-02 20:43 Dir/SubDir/Text
drwxrwxrwx guy/root 0 2013-10-02 20:43 Directory
I want to extract Text into my working directory, but I want to be sure that it's actually a file by doing this:
if [ -f Dir/SubDir/Text ]
then
echo "OK"
else
echo "KO"
fi
The result of this test is always KO and I really don't understand why. Any suggestions?

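The [ -f ... ] test looks at the filesystem, not inside the archive, so it reports KO until the file has actually been extracted.
Tested with BSD and GNU versions of tar:
in the output of tar tf,
entries that are directories end with /.
So to test whether Dir/SubDir/Text is a file or a directory in the archive,
you can simply grep with full-line matching:
if tar tf Archive.Tar | grep -x Dir/SubDir/Text >/dev/null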
then
echo "OK"
else
echo "KO"
fi
If the archive contains Dir/SubDir/Text/, then Dir/SubDir/Text is a directory, and the grep will not match, so KO will be printed.
If the archive contains Dir/SubDir/Text without a trailing /,
then Dir/SubDir/Text is a file and the grep will match,
so OK will be printed.
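The check above can be wrapped in a small function. This is only a minimal sketch, not part of the original answer: the name `tar_has_file` is made up for illustration, and it relies on the observation above that directory entries in the `tar tf` listing end with `/`. Other non-directory entries (symlinks, for example) are also listed without a trailing `/`, so they would pass this check too. `grep -F` treats the member name as a fixed string, so dots in a file name are not read as regex wildcards.

```shell
#!/usr/bin/env bash
# tar_has_file ARCHIVE MEMBER   (hypothetical helper name)
# Succeeds if MEMBER appears in the listing of ARCHIVE exactly,
# i.e. without a trailing slash.
tar_has_file() {
    local archive=$1 member=$2
    # -F: fixed string, -x: whole-line match, -q: no output
    tar -tf "$archive" | grep -Fxq -- "$member"
}

# Demo in a throwaway directory that mimics the layout from the question
demo=$(mktemp -d)
mkdir -p "$demo/Dir/SubDir"
echo hello > "$demo/Dir/SubDir/Text"
tar -cf "$demo/Archive.Tar" -C "$demo" Dir

if tar_has_file "$demo/Archive.Tar" Dir/SubDir/Text; then
    echo "OK"
else
    echo "KO"
fi
rm -rf "$demo"
```

Because the test reads the archive listing instead of the filesystem, it gives the same answer before and after extraction.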

if [ ! -d Dir/SubDir/Text ]
then
echo "OK"
else
echo "KO"
fi
will print KO only if a directory named Text exists at that path on the filesystem; it prints OK if the path is a regular file or does not exist (and, to be precise, also if it is a symlink that does not point to a directory).

This might be a solution:
tar -tvf Archive.Tar | grep Dir/SubDir/Text
This prints the matching entry if the file is in the archive, and nothing otherwise.

Related

deal with filename with space in shell [duplicate]

I've read the answer here, but I still get the wrong result.
In my folder, I only want to deal with the *.gz files; Windows 10.tar.gz has a space in its filename.
Assume the folder contains:
Windows 10.tar.gz Windows7.tar.gz otherfile
Here is my shell script. I tried quoting everything with "", but I still can't get what I want.
crypt_import_xml.sh
#!/bin/bash
rule_dir=/root/demo/rule
function crypt_import_xml()
{
rule=$1
# list the files by absolute path
for file in `ls ${rule}/*.gz`; do
echo "${file}"
#tar -xf *.gz
#mv a b.xml to ab.xml
done
}
crypt_import_xml ${rule_dir}
Here is what I got:
root@localhost.localdomain:[/root/demo]./crypt_import_xml.sh
/root/demo/rule/Windows
10.tar.gz
/root/demo/rule/Windows7.tar.gz
After running tar xf on the *.gz files, the xml filenames still contain spaces. It is a nightmare for me to deal with filenames that contain spaces.
You shouldn't use ls in a for loop.
$ ls directory
file.txt 'file with more spaces.txt' 'file with spaces.txt'
Using ls:
$ for file in `ls ./directory`; do echo "$file"; done
file.txt
file
with
more
spaces.txt
file
with
spaces.txt
Using file globbing:
$ for file in ./directory/*; do echo "$file"; done
./directory/file.txt
./directory/file with more spaces.txt
./directory/file with spaces.txt
So:
for file in "$rule"/*.gz; do
echo "$file"
#tar -xf *.gz
#mv a b.xml to ab.xml
done
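One detail worth guarding against with the glob version: if no file matches, bash leaves the pattern unexpanded and the loop body runs once with the literal string. The following is a minimal sketch of the loop with that guard added. The function name `list_gz` is hypothetical, and the demo directory only mimics the file names from the question.

```shell
#!/usr/bin/env bash
# list_gz DIR   (hypothetical helper)
# Prints each *.gz file in DIR, one per line, spaces intact.
list_gz() {
    local rule=$1 file
    for file in "$rule"/*.gz; do
        # If nothing matched, $file is the unexpanded pattern; skip it.
        [ -e "$file" ] || continue
        printf '%s\n' "$file"
    done
}

# Demo with the file names from the question
demo=$(mktemp -d)
touch "$demo/Windows 10.tar.gz" "$demo/Windows7.tar.gz" "$demo/otherfile"
list_gz "$demo"
rm -rf "$demo"
```

Each matching file name arrives in the loop as a single word, so quoting "$file" inside the body is all that is needed for later commands such as tar or mv.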
You do not need to call that ls command in the for loop, the file globbing will take place in your shell, without running this additional command:
XXX-macbookpro:testDir XXX$ ls -ltra
total 0
drwx------+ 123 XXX XXX 3936 Feb 22 17:15 ..
-rw-r--r-- 1 XXX XXX 0 Feb 22 17:15 abc 123
drwxr-xr-x 3 XXX XXX 96 Feb 22 17:15 .
XXX-macbookpro:testDir XXX$ rule=.
XXX-macbookpro:testDir XXX$ for f in "${rule}"/*; do echo "$f"; done
./abc 123
In your case you can change the "${rule}"/* into:
"${rule}"/*.gz;

tar command does not produce the .tar.gz file

I am trying to iterate in a loop, tar a couple of directories with each iteration and then compare the md5 sums of both of them. I notice that my first tar statement produces the tar files one level above the actual path of the directory. i.e. the statement:
tar -czvf ${folder_name}.tar.gz /tmp/psk1/hadoop_validation$ENV/${folder_name}
produces the ${folder_name}.tar.gz in /tmp/psk1/ rather than /tmp/psk1/hadoop_validation$ENV/
and the second tar statement:
tar -czvf ${folder_name}.tar.gz ${edge_base_dir}/wlossf$ENV/app/${folder_name}
doesn't produce the tar file at all. I can't find it even on one level above the actual path.
hdfs dfs -ls /haas/wlf/wlossf$ENV/app | while read rec; do
echo $rec
folder_path=`echo ${rec} | awk -F ' ' '{print $8}'`
folder_name=`echo ${folder_path} | awk -F '/' '{print $6}'`
if [ ! -z ${folder_name} ] && [ ! -z ${folder_path} ]; then
hdfs dfs -get ${folder_path} /tmp/psk1/hadoop_validation$ENV/
if [ $? -eq 0 ]; then
echo "Hadoop to local copy job Successful"
else
echo "Hadoop to local copy job Failed"
fi
tar -czvf ${folder_name}.tar.gz /tmp/psk1/hadoop_validation$ENV/${folder_name}
hadoop_md5=$(md5sum /tmp/psk1/hadoop_validation$ENV/${folder_name}.tar.gz)
tar -czvf ${folder_name}.tar.gz ${edge_base_dir}/wlossf$ENV/app/${folder_name}
edge_md5=$(md5sum ${edge_base_dir}/wlossf$ENV/app/${folder_name}.tar.gz)
if [ ${hadoop_md5} == ${edge_md5} ]; then
echo "${folder_name} is good"
else
echo "${folder_name} is bad"
fi
fi
echo ${folder_name}
echo ${folder_path}
done
What am I missing here? Any help would be appreciated.
Thank you.
As mouviciel said in the comments, tar by default creates the file in the current working directory.
Simply prefix the tar.gz file name with the target folder and tar will create it where you want it:
tar -czvf /tmp/psk1/hadoop_validation$ENV/${folder_name}.tar.gz /tmp/psk1/hadoop_validation$ENV/${folder_name}
Note that if the archive were created inside the folder being archived, tar would print a "file changed as we read it" warning. Here the archive sits next to ${folder_name} rather than inside it, so that warning should not appear.
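As a minimal sketch of the same fix in reusable form (the function name `make_archive` and its two parameters are made up for illustration, and `base` stands in for a directory such as /tmp/psk1/hadoop_validation$ENV): the archive path is given in full to `-f`, and `-C` makes tar change into the base directory first, so member names are stored relative to it instead of as absolute paths.

```shell
#!/usr/bin/env bash
# make_archive BASE NAME   (hypothetical helper)
# Creates BASE/NAME.tar.gz from the directory BASE/NAME.
make_archive() {
    local base=$1 name=$2
    # -f gets the full output path, so the result does not depend on the
    # current working directory. -C switches to BASE before reading, so
    # members are stored as NAME/... rather than as absolute paths.
    tar -czf "$base/$name.tar.gz" -C "$base" "$name"
}

# Demo in a throwaway directory
demo=$(mktemp -d)
mkdir -p "$demo/app1"
echo data > "$demo/app1/file.txt"
make_archive "$demo" app1
ls "$demo"
rm -rf "$demo"
```

With both archives built this way, the md5sum calls in the loop can point at the same paths that were passed to `-f`.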

Extracting zip file and then cd into it with different filename

I am creating a bash script that extracts a tar file, cds into the extracted directory, and then runs another script. So far this has worked pretty well with my code below; however, I ran into a case where the extracted folder has a different name than the .tar file, which causes an issue. So my question is: how should I handle cases where the extracted directory name differs from the .tar filename?
e.g. my_file.tar ---> after extraction ----> my_different_file_name
#!/bin/bash
fname=$1
echo the file you are about to extract is $fname
if [ -f $fname ]; then #if the file exists
tar -xvzf $fname #untar it
cd ${fname%.*} #`%.*` strips the extension from filename.tgz; cd into the resulting name
echo ${fname%.*}
echo $PWD
loadIt #another script to load
fi
You could do:
topDir=$(tar -xvzf "$fname" | sed "s|/.*$||" | uniq)
[ "$(wc -w <<< "$topDir")" -eq 1 ] || exit 1
echo "topDir=$topDir"
Explanation: the first command untars verbosely (printing every file it extracts), strips everything after the first path component, and pipes the result into uniq, so it basically returns a list of all the top-level directories in the tar file. The next line checks that there is exactly one entry in topDir and exits otherwise.
At this point $topDir is the directory you want to cd into.
Maybe you could do something like this:
cd "$(tar -tf "$fname" | head -1)"
If you don't mind moving the directory around after you extract it you can do something like this
# Create a temporary directory
$ tmpd=$(mktemp -d)
# Change to the temporary directory
$ pushd "$tmpd"
# Extract the tarball (use an absolute path in $fname, since the working directory has changed)
$ tar -xf "$fname"
# Glob the directory name
$ d=(*)
# Error if we have more (or fewer) than one entry
$ [ "${#d[@]}" -eq 1 ] || exit 1
# Explicitly use just the first entry (optional since `$d` does the same thing)
$ d=${d[0]}
# Move the extracted directory to the previous directory
$ mv "$d" "$OLDPWD"
# Change back to the starting directory
$ popd
# Remove the (now empty) temporary directory
$ rmdir "$tmpd"
# Change into the extracted directory
$ cd "$d"
# Run 'loadIt'
$ loadIt
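The approaches above can also be combined so that the directory name is known before anything is extracted. The following is only a sketch under assumptions: the function name `tar_top_dir` is made up, the archive is assumed to be one that your tar can list with `tar -tf` (GNU tar and bsdtar detect gzip compression on their own when reading), and file names are assumed not to contain newlines.

```shell
#!/usr/bin/env bash
# tar_top_dir ARCHIVE   (hypothetical helper)
# Prints the single top-level entry of ARCHIVE.
# Fails if the archive has zero or more than one top-level entry.
tar_top_dir() {
    local archive=$1 tops
    # Drop a leading "./", keep only the first path component,
    # de-duplicate, and discard empty lines.
    tops=$(tar -tf "$archive" | sed -e 's|^\./||' -e 's|/.*$||' | sort -u | sed '/^$/d')
    [ -n "$tops" ] || return 1
    # More than one line means more than one top-level entry.
    [ "$(printf '%s\n' "$tops" | wc -l)" -eq 1 ] || return 1
    printf '%s\n' "$tops"
}

# Demo: the archive name and the directory inside it differ
demo=$(mktemp -d)
mkdir -p "$demo/my_different_file_name/sub"
echo x > "$demo/my_different_file_name/sub/f"
tar -czf "$demo/my_file.tar" -C "$demo" my_different_file_name

if top=$(tar_top_dir "$demo/my_file.tar"); then
    echo "top-level directory: $top"
else
    echo "archive does not have exactly one top-level entry" >&2
fi
rm -rf "$demo"
```

In the original script this would replace the `${fname%.*}` guess: extract as before, then `cd "$top"` and run loadIt.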

bash tar error doesn't create tar.gz

I have the following bash script:
#DIR is something like: /home/foo/foobar/test/ without any whitespace but can also include whitespace
DIR="$( cd "$( dirname "$0" )" && pwd )"
#backup_name is read from a file
backup_name=FOOBAR
date=`date +%Y%m%d_%H%M_%S`
#subdirs is also read from the same file
subdirs="etc/ sbin/ bin/"
filename="$DIR/Backup_$backup_name"_"$date.tar.gz"
cd /
echo "filename: $filename"
echo "subdirs $subdirs"
cmd='tar czvf "'$filename'" '$subdirs
echo "cmd tar: $cmd"
$cmd
But I get following output:
filename: /home/foo/foobar/test/Backup_FOOBAR_20120322_1529_35.tar.gz
subdirs: etc/ sbin/ bin/
cmd tar: tar cfvz "/home/foo/foobar/test/Backup_FOOBAR_20120322_1529_35.tar.gz" etc/ sbin/ bin/
etc/
# ... list of files in etc
# but no files from sbin or bin directory
tar: "/home/foo/foobar/test/Backup_FOOBAR_20120322_1529_35.tar.gz": Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
However, when I copy the echoed tar command, cd to /, and paste it into the bash shell, it works:
tar cfvz "/home/foo/foobar/test/Backup_FOOBAR_20120322_1529_35.tar.gz" etc/ sbin/ bin/
etc/
Every variable is defined and there is no trailing newline.
I also tried $cmd with backticks.
The two variables backup_name and subdirs are read from a file (I did not include the reading process in the code).
edit: I just copied my script to a dir with no whitespace and changed the line:
cmd='tar czvf "'$filename'" '$subdirs
#to
cmd="tar czvf $filename $subdirs"
and it's working now, but when I do the same in a dir whose path contains whitespace I still get the same error.
edit2: reading from file (the file is read before anything else happens)
config="config.txt"
local line
while read line
do
#points to next free element and declares it
config_lines[${#config_lines[@]}]=$line
done <$config
backup_name=${config_lines[0]}
subdirs=${config_lines[1]}
What is wrong with my bash script?
Short answer: see BashFAQ #050: I'm trying to put a command in a variable, but the complex cases always fail!.
Long answer: embedding quotes in a variable doesn't do anything useful, because when you use it (i.e. $cmd), bash parses quotes before expanding variables; by the time the quotes from the variable's value appear, it's too late for them to do any good. You do, however, have several options:
Don't bother with putting the command in a variable in the first place, just use it directly:
echo "filename: $filename"
echo "subdirs $subdirs"
tar czvf "$filename" $subdirs
If you really need to put it in a variable first, use an array rather than a plain text variable (and ideally, do the same with the subdirs list):
subdirs=(etc/ sbin/ bin/)
...
echo "filename: $filename"
echo "subdirs ${subdirs[*]}"
cmd=(tar czvf "$filename" "${subdirs[@]}")
printf "cmd tar:"
printf " %q" "${cmd[@]}" # Have to do some trickery to get it printed right
printf "\n"
"${cmd[@]}"
Instead of mucking about with messy quoting issues you could get the results you want a different way and, perhaps, save some time. How about something like this?
#!/usr/bin/env bash
# abusing set -v for fun and profit
tar_output=/tmp/$$.tarout
tar_command=/tmp/$$.tarcmd
tmp_script=/tmp/$$.script
dir="$(cd "$(dirname "$0")"; pwd)"
cat>"${tmp_script}"<<-'END'
datestamp=$(date +%Y%m%d_%H%M_%S)
subdirs=(etc sbin bin)
backup_name=FOOBAR
filename="$1/Backup_${backup_name}_${datestamp}.tar.gz"
printf 'tar cmd: '
set -v
tar czvf "$filename" "${subdirs[@]}" 2>"$2"
set +v
END
bash "${tmp_script}" "$dir" "${tar_output}" 2>"${tar_command}"
cat "${tar_command}" | head -n 1 | sed -e 's/2>"\$2"$//'
cat "${tar_output}"
rm -f "${tmp_script}" "${tar_command}" "${tar_output}"
I apologize for nothing, but note that in the real world you'd want to create proper temp files (for example with mktemp).
If you execute the string $cmd, it won't work if the filename contains spaces.
You have to let bash build the arguments,
like this:
tar czvf "${filename}" $subdirs
You don't even need to escape anything with '\' in the filename.
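The difference is easy to see with a small experiment. This is only an illustrative sketch: `count_args` and the path stored in `filename` are made up for the demo, and the default IFS is assumed. It compares a command stored in a string, as in the question, with the same command stored in a bash array.

```shell
#!/usr/bin/env bash
# count_args prints how many arguments it received (helper for this demo only).
count_args() { echo "$#"; }

filename="/tmp/dir with spaces/Backup.tar.gz"   # hypothetical path

# String version: the embedded double quotes are ordinary characters, and
# the unquoted expansion is split on every space.
cmd_string='count_args "'$filename'"'
$cmd_string            # prints 3

# Array version: each element stays one argument, spaces included.
cmd_array=(count_args "$filename")
"${cmd_array[@]}"      # prints 1
```

In the string version tar would receive three separate words, the first of which starts with a literal double quote, which matches the quoted name shown in tar's error message.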
OK, your original script did not work because the shell parses quotes before it expands variables, so the quotes stored in $cmd are passed to tar literally and the filename is wrong: tar thinks that it's supposed to write to a path relative to the current directory named "/home/foo/foobar/test/Backup_FOOBAR_20120322_1529_35.tar.gz", i.e. the double quotes are part of the name! Compare that with the error for an ordinary nonexistent path:
tar cfz /this/file/does/nopt/exist .
tar: /this/file/does/nopt/exist: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
See the difference? There are no double quotes around the file name/path in tar's error message.
It worked when you copied and pasted the line because then the double quotes are interpreted by the shell.
Witness:
ls -l /tmp/screen-exchange
-rw-rw-rw- 1 aqn users 0 Mar 21 07:29 /tmp/screen-exchange
cmd='ls -l "'/tmp/screen-exchange'"'
$cmd
/bin/ls: "/tmp/screen-exchange": No such file or directory
eval $cmd
-rw-rw-rw- 1 aqn users 0 Mar 21 07:29 /tmp/screen-exchange
Of course, using eval by itself won't guard against filenames with whitespace in them. To guard against that, the command string needs to contain the quoted variable reference, like so:
date>'file name with spaces'
file='file name with spaces' # this is the equivalent of your $filename
cmd='ls -l "$file"'
$cmd
ls: "$file": No such file or directory
eval $cmd
-rw-r--r-- 1 andyn SPICE\Domain Users 1083 Mar 22 15:28 file name with spaces
I would suggest you separate $cmd from $filename and $subdirs. I think the error is introduced when you join these strings. Also, packing multiple variables into one variable without proper quoting will cause errors.
This should work for you:
cmd="tar -zcvf"
subdirs="etc/ sbin/ bin/"
filename="${DIR}/Backup_${backup_name}_${date}.tar.gz"
$cmd "$filename" $subdirs
#DIR is something like: /home/foo/foobar/test/ without any whitespace but can also include whitespace
DIR="$( cd "$( dirname "$0" )" && pwd )"
backup_name=FOOBAR
date=`date +%Y%m%d_%H%M_%S`
subdirs="etc/ sbin/ bin/"
filename="$DIR/Backup_$backup_name"_"$date.tar.gz"
cd /
echo "filename: $filename"
echo "subdirs $subdirs"
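# build the command as an array so a $filename containing whitespace stays one argument
cmd=(tar zcvf "$filename" $subdirs)
echo "cmd tar: ${cmd[*]}"
"${cmd[@]}"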

rsync prints "skipping non-regular file" for what appears to be a regular directory

I back up my files using rsync. Right after a sync, I ran it expecting to see nothing, but instead it looked like it was skipping directories. I've (obviously) changed names, but I believe I've still captured all the information I could. What's happening here?
$ ls -l /source/backup/myfiles
drwxr-xr-x 2 me me 4096 2010-10-03 14:00 foo
drwxr-xr-x 2 me me 4096 2011-08-03 23:49 bar
drwxr-xr-x 2 me me 4096 2011-08-18 18:58 baz
$ ls -l /destination/backup/myfiles
drwxr-xr-x 2 me me 4096 2010-10-03 14:00 foo
drwxr-xr-x 2 me me 4096 2011-08-03 23:49 bar
drwxr-xr-x 2 me me 4096 2011-08-18 18:58 baz
$ file /source/backup/myfiles/foo
/source/backup/myfiles/foo/: directory
Then I sync (expecting no changes):
$ rsync -rtvp /source/backup /destination
sending incremental file list
backup/myfiles
skipping non-regular file "backup/myfiles/foo"
skipping non-regular file "backup/myfiles/bar"
And here's the weird part:
$ echo 'hi' > /source/backup/myfiles/foo/test
$ rsync -rtvp /source/backup /destination
sending incremental file list
backup/myfiles
backup/myfiles/foo
backup/myfiles/foo/test
skipping non-regular file "backup/myfiles/foo"
skipping non-regular file "backup/myfiles/bar"
So it worked:
$ ls -l /source/backup/myfiles/foo
-rw-r--r-- 1 me me 3126091 2010-06-15 22:22 IMGP1856.JPG
-rw-r--r-- 1 me me 3473038 2010-06-15 22:30 P1010615.JPG
-rw-r--r-- 1 me me 3 2011-08-24 13:53 test
$ ls -l /destination/backup/myfiles/foo
-rw-r--r-- 1 me me 3126091 2010-06-15 22:22 IMGP1856.JPG
-rw-r--r-- 1 me me 3473038 2010-06-15 22:30 P1010615.JPG
-rw-r--r-- 1 me me 3 2011-08-24 13:53 test
but still:
$ rsync -rtvp /source/backup /destination
sending incremental file list
backup/myfiles
skipping non-regular file "backup/myfiles/foo"
skipping non-regular file "backup/myfiles/bar"
Other notes:
My actual directories "foo" and "bar" do have spaces, but no other strange characters. Other directories have spaces and have no problem. I 'stat'-ed and saw no differences between the directories that don't rsync and the ones that do.
If you need more information, just ask.
Are you absolutely sure those individual files are not symbolic links?
Rsync has a few useful flags such as -l, which will "copy symlinks as symlinks". Add -l to your command:
rsync -rtvpl /source/backup /destination
I believe symlinks are skipped by default because they can be a security risk. Check the man page or --help for more info on this:
rsync --help | grep link
To verify that these are symbolic links, or to find symbolic links proactively, you can use file or find:
$ file /path/to/file
/path/to/file: symbolic link to `/path/file`
$ find /path -type l
/path/to/file
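As a small sketch of that check applied to a whole tree (the function name `list_symlinks` and the demo layout are made up for illustration; the real source path would be something like /source/backup):

```shell
#!/usr/bin/env bash
# list_symlinks DIR   (hypothetical helper)
# Prints every symbolic link found under DIR.
list_symlinks() {
    find "$1" -type l
}

# Demo: "foo" is a link to a directory that lives elsewhere
demo=$(mktemp -d)
mkdir -p "$demo/real" "$demo/myfiles"
ln -s "$demo/real" "$demo/myfiles/foo"
list_symlinks "$demo/myfiles"
# If links show up, adding -l makes rsync copy them as symlinks, e.g.:
#   rsync -rtvpl /source/backup /destination
rm -rf "$demo"
```

If the function prints nothing for the source tree, symlinks are not the cause and the other suggestions in this thread are worth checking.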
Are you absolutely sure that it's not a symbolic link to a directory?
Try:
file /source/backup/myfiles/foo
to make sure it's a directory.
Also, it could very well be a loopback mount.
Try
mount
and make sure that /source/backup/myfiles/foo is not listed.
You should try the command below; most probably it will work for you (-a implies -rlptgoD, so symlinks are copied as symlinks):
rsync -ravz /source/backup /destination
You can try the following, it will work
rsync -rtvp /source/backup /destination
I personally always use this syntax in my script, and it works a treat for backing up the entire system (skipping sys/*, proc/* and nfs4/*):
sudo rsync --delete --stats --exclude-from $EXCLUDE -rlptgoDv / $TARGET/ | tee -a $LOG
Here is my script run by root's cron daily:
#!/bin/bash
#
NFS="/nfs4"
HOSTNAME=`hostname`
TIMESTAMP=`date "+%Y%m%d_%H%M%S"`
EXCLUDE="/home/gcclinux/Backups/root-rsync.excludes"
TARGET="${NFS}/${HOSTNAME}/SYS"
LOGDIR="${NFS}/${HOSTNAME}/SYS-LOG"
CMD=`/usr/bin/stat -f -L -c %T ${NFS}`
## CHECK IF NFS IS MOUNTED...
if [[ ! $CMD == "nfs" ]];then
echo "NFS NOT MOUNTED"
exit 1
fi
## CHECK IF LOG DIRECTORY EXIST
if [ ! -d "$LOGDIR" ]; then
/bin/mkdir -p $LOGDIR
fi
## CREATE LOG HEADER
LOG=$LOGDIR/"rsync_result."$TIMESTAMP".txt"
echo "-------------------------------------------------------" | tee -a $LOG
echo `date` | tee -a $LOG
echo "" | tee -a $LOG
## START RUNNING BACKUP
/usr/bin/rsync --delete --stats --exclude-from $EXCLUDE -rlptgoDv / $TARGET/ | tee -a $LOG
In some cases it helps to just copy the file to another location (like your home directory) and then try again.

Resources