create a tar.gz of a directory and clean up the directory except a file? - linux

I would like to know if tar (1.15.1, Linux system) already has something like this before creating a script for it.
I would like to create a dir.tar.gz of a given dir/ that contains a specific dir/myfile.gz. I would like the command to do 2 specific things:
Preserve dir/myfile.gz
Delete everything else in dir/ after the dir.tar.gz is complete.
For example:
tree -if .
.
./dir
./dir/dir2
./dir/dir2/file3
./dir/file1
./dir/file2
./dir/myfile.gz
2 directories, 4 files
I would like the result of the tar command to be:
tree -if
.
./dir
./dir/myfile.gz
./dir.tar.gz
1 directory, 2 files
I have read about the --delete and --extract options, but I am not sure how to apply them in one single command. Any ideas?

This should work:
tar -zcvf dir.tar.gz --exclude="*.gz" ./dir/. && find ./dir/. ! -name "*.gz" -exec rm -rf {} +
Test:
$ tree -if .
.
./dir
./dir/dir2
./dir/dir2/file3
./dir/file1
./dir/file2
./dir/myfile.gz
2 directories, 4 files
$ tar -zcvf dir.tar.gz --exclude="*.gz" ./dir/. && find ./dir/. ! -name "*.gz" -exec rm -rf {} +
./dir/./
./dir/./file1
./dir/./dir2/
./dir/./dir2/file3
./dir/./file2
rm: cannot remove `.' or `..'
$ tree -if .
.
./dir
./dir/myfile.gz
./dir.tar.gz
1 directory, 2 files
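A variant sketch that avoids the rm: cannot remove `.' message: start find at dir with -mindepth 1, so the directory itself is never handed to rm. It assumes, like the command above, that myfile.gz should stay out of the archive, and that your find supports -mindepth/-maxdepth (GNU and BSD find do). The scratch directory and sample files are made up so the demo is safe to run.

```shell
#!/bin/sh
# Build the sample tree in a scratch directory (hypothetical names).
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p dir/dir2
touch dir/file1 dir/file2 dir/dir2/file3 dir/myfile.gz

# Archive dir without dir/myfile.gz. The cleanup after && runs only if
# tar exited successfully, so a failed archive never triggers deletion.
tar -zcf dir.tar.gz --exclude='dir/myfile.gz' dir &&
  find dir -mindepth 1 -maxdepth 1 ! -name 'myfile.gz' -exec rm -rf {} +

# Show what is left.
find . -print
```

Because -maxdepth 1 only looks at the top level of dir, subdirectories such as dir2 are removed whole by rm -rf, and dir itself is never a candidate.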

You may also be interested in tardy which is a tar post-processor (filter), so you can operate (even on the fly) on your tar archive and remove the offending file.

Related

how to tell find command to only remove contents of a directory

I am using find to get both files and dirs inside $dest_dir and remove them:
dest_dir="$HOME/pics"
# dest_dir content:
# dir1
# dir2
# pic1
# pic2
find $dest_dir -maxdepth 1 -exec rm -rf {} \;
Expectation: remove only the contents of dest_dir (i.e. dir1, dir2, pic1, pic2), not dest_dir itself
Actual result: the command removes the dest_dir too
I also tried -delete instead of the -exec rm -rf {} \; section, but it can't remove non-empty directories.
If you pass a directory name to rm -rf it will delete it, by definition. If you don't want to recurse into subdirectories, why are you using find at all?
rm "$dest_dir"/*
On the other hand, if you want to rm -rf everything inside the directory ... Do that instead.
rm -rf "$dest_dir"/*
On the third hand, if you do want to remove files, but not directories, from an arbitrarily deep directory tree, try
find "$dest_dir" -type f -delete
or, somewhat more obscurely, use -execdir: find just the directories and pass in a command like sh -c 'rm {}/*'. In this scenario, though, that is just clumsy and complex.
You can use this find command:
find "$dest_dir" -maxdepth 1 -mindepth 1 -exec rm -rf {} +
Option -mindepth 1 will find all entries inside "$dest_dir" at least one level down and will skip "$dest_dir" itself.
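To see the effect of -mindepth 1 without risking real data, here is a small sketch that runs against a scratch directory. The file names (including .hidden) are invented for the demo. The hidden file is there because a shell glob such as "$dest_dir"/* skips dotfiles by default, while find does not.

```shell
#!/bin/sh
# Scratch directory standing in for $HOME/pics (hypothetical contents).
dest_dir=$(mktemp -d)
mkdir "$dest_dir/dir1" "$dest_dir/dir2"
touch "$dest_dir/pic1" "$dest_dir/pic2" "$dest_dir/.hidden" "$dest_dir/dir1/nested"

# Dry run: list what would be removed. $dest_dir itself is not printed,
# because it sits at depth 0 and -mindepth 1 skips it.
find "$dest_dir" -mindepth 1 -maxdepth 1 -print

# Real run: remove every entry directly inside $dest_dir.
find "$dest_dir" -mindepth 1 -maxdepth 1 -exec rm -rf {} +

# The directory survives and is now empty.
ls -la "$dest_dir"
```

Running the -print form first is a cheap way to check the selection before swapping in -exec rm -rf.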

Linux: copy ".svn" directories recursively

I know there are dozens of questions about similar topics, but I still can't figure this out.
I need to copy all .svn directories recursively from /var/foo to /var/foo2 on a Debian machine:
/var/www/foo/.svn
/var/www/foo/bar/.svn
...
I tried these two commands without success:
find /var/foo -name ".svn" -type f -exec cp {} ./var/foo2 \;
find /var/foo -name ".svn" -type d -exec cp {} /var/foo2 \;
In one case only the .svn directory directly inside foo is copied; in the other, nothing is copied.
Given the following file structure:
./
./a/
./a/test/
./a/test/2
./b/
./b/3
./test/
./test/1
Running the following command in the directory to be copied:
find -type d -iname test -exec sh -c 'mkdir -p "$(dirname ~/tmp2/{})"; cp -r {}/ ~/tmp2/{}' \;
Should copy all test directories to ~/tmp2/.
Points of interest:
Directories are copied to the destination on a one-by-one basis
Parent directories are created in advance so that cp doesn't complain about target not existing
Rather than just cp, cp -r is used
The whole command is wrapped with sh -c so that operations on {} such as dirname can be performed (so that the shell expands it for each directory separately, rather than expanding it once during calling the find)
Resulting structure in ~/tmp2:
./
./a/
./a/test/
./a/test/2
./test/
./test/1
So all you should need to do is replace test with .svn and ~/tmp2 with the directory of your choice. Just remember to run it in the source directory rather than using absolute paths.
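A slightly more defensive sketch of the same idea, for .svn directories: the path is passed to sh as a positional argument instead of being spliced into the script text via {}, so names with spaces or quotes do not break the command. The src and dst scratch directories and their contents are hypothetical. -prune is added so find does not descend into a .svn directory it has already matched.

```shell
#!/bin/sh
# Hypothetical source and destination trees for the demo.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/.svn" "$src/bar/.svn" "$src/bar/baz"
touch "$src/.svn/entries" "$src/bar/.svn/entries" "$src/bar/baz/code.c"

# Run from the source directory so find prints relative paths.
cd "$src" || exit 1

# $1 is the matched directory, $2 is the destination root.
find . -type d -name .svn -prune -exec sh -c '
  parent=$2/$(dirname "$1")
  mkdir -p "$parent" && cp -R "$1" "$parent/"
' sh {} "$dst" \;

find "$dst" -print
```

As in the answer above, the parent directory is created first and each match is copied one at a time.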
I find that using tar for such operations makes the code often much more readable:
$ mkdir /var/www/foo2
$ cd /var/www/foo2
$ find ../foo/ -type d -name .svn -exec tar c \{\} \+ | \
tar x --strip-components=1
find will list all directories named .svn, and call tar to create (c) an archive (sent to stdout) containing all of them. The archive on stdout is then extracted (x) by another tar instance in the target directory. The relative path portion (../) is automatically removed by the archiving tar, but since we also want to remove the first path component (foo/), we need to add --strip-components.
Note: This only works if all the .svn paths fit on a single tar command line. The limit is on the total size of the arguments in bytes ($(getconf ARG_MAX), which on my system is more than 200000), not on the number of directories. Beyond that, find runs tar more than once, and the extracting tar typically stops after the first archive in the stream.
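One way to sidestep the command-line limit is to let tar read the names from stdin instead of receiving them as arguments. The sketch below assumes a tar that supports --null together with -T - (GNU tar and bsdtar do), and the scratch trees are invented for the demo. It also runs find from inside the source directory, so no --strip-components is needed.

```shell
#!/bin/sh
# Hypothetical source and destination trees for the demo.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/foo/.svn" "$src/foo/bar/.svn"
touch "$src/foo/.svn/entries" "$src/foo/bar/.svn/entries" "$src/foo/bar/code.c"

cd "$src/foo" || exit 1

# find emits NUL-terminated names; a single tar reads them all from stdin,
# recursing into each listed directory, and a second tar unpacks the stream.
find . -type d -name .svn -prune -print0 |
  tar -cf - --null -T - |
  (cd "$dst" && tar -xf -)

find "$dst" -print
```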

Linux Glob Pattern to remove all files containing a '?'?

I would like to descend into a directory and recursively remove all filenames that contain a ?.
I used wget on a website, and files of the form index.html?p=46 were downloaded. Extra marks for explaining why this happened.
I tried:
rm -R *?*
that failed: removed all regular files
rm -R *\?*
also failed: No such file or directory
Try this: find . -iname '*\?*' -exec rm {} \;
$ ls
xxy x?y
$ find . -iname '*\?*'
./x?y
$ find . -iname '*\?*' -exec rm {} \;
$ ls
xxy
As for why it happened, the website you wgetted had links to index.html passing those parameters and you (presumably) told wget to mirror it.
? matches any single character. You need to escape it:
$ touch a a?a
$ ls *?*
a a?a
$ ls *\?*
a?a
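Note that an escaped glob such as *\?* is expanded by the shell in the current directory only, so it does not reach files in subdirectories. A sketch of a recursive cleanup with find, on a made-up mirror tree, could look like this. It uses [?], a bracket expression containing only a question mark, as another way to match the character literally.

```shell
#!/bin/sh
# Hypothetical mirrored site with some query-string file names.
site=$(mktemp -d)
mkdir -p "$site/blog"
touch "$site/index.html" "$site/index.html?p=46" \
      "$site/blog/post.html" "$site/blog/index.html?p=7"

# Dry run: list every regular file whose name contains a literal "?".
find "$site" -type f -name '*[?]*' -print

# Remove them. The quotes keep the shell from expanding the pattern,
# and -- guards against names that start with a dash.
find "$site" -type f -name '*[?]*' -exec rm -- {} +

find "$site" -print
```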

mv files with certain extension

I'm using cygwin on my PC and I'm looking to move all .nef files in /pictures/ (and sub directories) into a new directory, BUT I want to keep the directory name that the nef files came from.
ie /pictures/vacation2012/image01.nef
and
/pictures/vac09/image01.nef
should go into
/pictures/nef/vac09/
and
/pictures/nef/vacation2012/
You can do it like this:
$ cd from_directory
$ find . -name "*.nef" -print0 | tar cf - --null -T - | (cd to_directory && tar xf -)
$ find . -name "*.nef" -exec rm -rf {} \; # be careful with this...
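If you would rather move the files in one step instead of copying and then deleting, a sketch along these lines may work. It assumes the layout from the question, with a scratch directory standing in for /pictures and invented sample files. The nef directory is pruned so that files already moved are not picked up again.

```shell
#!/bin/sh
# Scratch directory standing in for /pictures (hypothetical contents).
pictures=$(mktemp -d)
mkdir -p "$pictures/vacation2012" "$pictures/vac09"
touch "$pictures/vacation2012/image01.nef" "$pictures/vacation2012/image01.jpg" \
      "$pictures/vac09/image01.nef"

cd "$pictures" || exit 1

# For each .nef file outside ./nef, recreate its directory under nef/
# and move the file there.
find . -path ./nef -prune -o -type f -name '*.nef' -exec sh -c '
  for f in "$@"; do
    target=nef/$(dirname "$f")
    mkdir -p "$target" && mv "$f" "$target/"
  done
' sh {} +

find . -print
```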

How can I copy only header files in an entire nested directory to another directory, keeping the same hierarchy in the new folder?

I have a directory which has a lot of header files (.h), along with .o files, .c files and other files.
This directory has many nested directories inside. I want to copy only header files to a separate directory preserving the same structure in the new directory.
cp -rf oldDirectory newDirectory will copy all files.
I want to copy only header files.
(cd src && find . -name '*.h' -print | tar --create --files-from -) | (cd dst && tar xvfp -)
You can do something similar with cpio if you just want to hard link the files instead of copying them, but it is likely to require a little mv'ing afterward. If you have lots of data and don't mind (or need!) sharing, this can be much faster. It gets confused if dst needs to have a src in it, that is, if the dst/src it creates isn't just a side effect:
find src -name '*.h' -print | cpio -pdlv dst
mv dst/src/* dst/.
rmdir dst/src
this one worked for me:
rsync -avm --include='*.h' -f 'hide,! */' . /destination_dir
cp -rf oldDirectory/*.h newDirectory
or something like (depending on the actual paths)
find oldDirectory -type f -name "*.h" -print0 | xargs -0 -I file cp file newDirectory
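Be aware that the two cp commands above put every header directly into newDirectory, so the nested layout is lost. If your cp is the GNU coreutils one, its --parents option recreates each source path under the target. The sketch below assumes GNU cp and uses invented scratch directories and file names.

```shell
#!/bin/sh
# Hypothetical source tree and empty destination for the demo.
old=$(mktemp -d)
new=$(mktemp -d)
mkdir -p "$old/include/net" "$old/src"
touch "$old/include/api.h" "$old/include/net/socket.h" \
      "$old/src/main.c" "$old/src/main.o"

# Run from the source directory so the recreated paths are relative.
cd "$old" || exit 1

# --parents (GNU cp) copies ./include/net/socket.h to
# $new/include/net/socket.h, creating directories as needed.
find . -type f -name '*.h' -exec cp --parents {} "$new" \;

find "$new" -print
```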

Resources