How do I exclude absolute paths for tar? - linux

I am running a PHP script that gets me the absolute paths of files I want to tar up. This is the syntax I have:
tar -cf tarname.tar -C /www/path/path/file1.txt /www/path/path2/path3/file2.xls
When I untar it, it creates the absolute path to the files. How do I get just /path with everything under it to show?

If you want to remove the first n leading components of the file name, you need --strip-components. So in your case, on extraction, do
tar xvf tarname.tar --strip-components=2
The man page has a list of tar's many options, including this one. Some earlier versions of tar use --strip-path for this operation instead.
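A minimal sketch of the effect, assuming GNU tar and a throwaway scratch directory (the names work, src, out and demo.tar are made up for the illustration):

```shell
# Build a tiny archive whose member is www/path/sub/file1.txt,
# then extract it with the first two path components removed.
work=$(mktemp -d)
mkdir -p "$work/src/www/path/sub" "$work/out"
echo hello > "$work/src/www/path/sub/file1.txt"

tar -cf "$work/demo.tar" -C "$work/src" www/path/sub/file1.txt
tar -xf "$work/demo.tar" -C "$work/out" --strip-components=2

# Only sub/file1.txt is left below the extraction directory.
ls -R "$work/out"
```

The number passed to --strip-components has to match the depth you want to drop; members with fewer components than that are skipped.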

You are incorrectly using the -C switch, which is used for changing directories. So what you need to do is:
tar -cf tarname.tar -C /www/path path/file1.txt path2/path3/file2.xls
or if you want to package everything under /www/path do:
tar -cf tarname.tar -C /www/path .
You can use the -C switch multiple times.
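As a rough sketch of repeated -C (scratch directories a and b are hypothetical), each -C applies to the file names that follow it:

```shell
# Two files living in different directories end up side by side
# at the top level of the archive.
work=$(mktemp -d)
mkdir -p "$work/a" "$work/b"
echo one > "$work/a/file1.txt"
echo two > "$work/b/file2.xls"

tar -cf "$work/demo.tar" -C "$work/a" file1.txt -C "$work/b" file2.xls

# List the stored member names.
members=$(tar -tf "$work/demo.tar")
printf '%s\n' "$members"
```

Absolute paths are used for both -C arguments here; a relative second -C would be interpreted relative to the first one.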

For me the following works the best:
tar xvf some.tar --transform 's?.*/??g'
The --transform argument is a sed-style replacement expression that is applied to every extracted file path. Unlike --strip-components, it removes all path information, not just a fixed number of components.
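A small sketch of that flattening, assuming GNU tar (which provides --transform) and made-up scratch paths:

```shell
# Archive two files at different depths, then extract them flat.
work=$(mktemp -d)
mkdir -p "$work/src/www/path/path2" "$work/flat"
echo a > "$work/src/www/path/file1.txt"
echo b > "$work/src/www/path/path2/file2.xls"

tar -cf "$work/some.tar" -C "$work/src" www/path/file1.txt www/path/path2/file2.xls

# s?.*/?? deletes everything up to and including the last slash.
tar -xf "$work/some.tar" -C "$work/flat" --transform 's?.*/??g'

ls "$work/flat"
```

Keep in mind that two members with the same base name would end up in the same place when flattened this way.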

If you don't know how many components are in the path, you could try this:
DIR_TO_PACK=/www/path/
cd "$DIR_TO_PACK/.."
tar -cf tarname.tar "$(basename "$DIR_TO_PACK")"

Related

how to rename files you put into a tar archive using linux 'tar'

I'm trying to create a tar archive with a couple files, but rename those files in the archive. Right now I have something like this:
tar -czvf file1 /some/path/to/file2 file3 etc
But I'd like to do something like:
tar -czvf file1=file1 /some/path/to/file2=file2 file3=path/to/renamedFile3 etc=etc
Where, when extracted into directory testDir, you would see the files:
testDir/file1
testDir/file2
testDir/path/to/renamedFile3
testDir/etc
How can I do this?
You can modify filenames (among other things) with --transform. For example, to create a tape archive file.tar from file1, file2, bar, fubar and the contents of /dir, while also renaming bar to foo, you can do the following:
tar --transform='flags=r;s|bar|foo|' -cf file.tar file1 file2 bar fubar /dir/*
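The result of the above is that bar is added to file.tar as foo. Because the expression is not anchored, any other member name containing bar is rewritten as well (fubar would be stored as fufoo); use s|^bar$|foo| to rename only bar.
The r flag means the transformation is applied to regular archive member names, but not to symlink or hard link targets. For more information see the GNU tar documentation.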
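To see how far an unanchored expression reaches, here is a hedged sketch (GNU tar assumed; loose.tar and strict.tar are hypothetical names) that stores the same two files with and without anchors:

```shell
work=$(mktemp -d)
touch "$work/bar" "$work/fubar"

# Unanchored: every member name containing "bar" is rewritten.
tar -C "$work" --transform='flags=r;s|bar|foo|' -cf "$work/loose.tar" bar fubar

# Anchored with ^ and $: only the member named exactly "bar" is renamed.
tar -C "$work" --transform='flags=r;s|^bar$|foo|' -cf "$work/strict.tar" bar fubar

loose=$(tar -tf "$work/loose.tar" | paste -sd' ' -)
strict=$(tar -tf "$work/strict.tar" | paste -sd' ' -)
echo "unanchored: $loose"
echo "anchored:   $strict"
```

Listing the archive right after creating it is a cheap way to check a transform before relying on it.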
You can use --transform multiple times, for example:
tar --transform='flags=r;s|foo|bar|' --transform='flags=r;s|baz|woz|' -cf file.tar /some/dir/where/foo/is /some/dir/where/baz/is /other/stuff/* /dir/too
With --transform, there's no need to make a temporary testDir first. To prepend testDir/ to everything in the archive, match the beginning anchor ^:
tar --transform "s|file3|path/to/renamedFile3|" \
--transform "flags=r;s|^|testDir/|" \
-czvf my_archive.tgz file1 /some/path/to/file2 file3 etc
The r flag is critical to keep the transform from breaking any symlink targets in the archive (which also match ^).
According to man tar, the -O option is the best choice here, since it writes the files to standard output:
-O (x, t modes only) In extract (-x) mode, files will be written to
standard out rather than being extracted to disk. In list (-t)
mode, the file listing will be written to stderr rather than the
usual stdout.
Here are the examples:
# 1. without -O
tar xzf 20170511162930.db.tar.gz
# result: 20170511162930.db
# 2. with -O
tar xzf 20170511162930.db.tar.gz -O > latest.db
# result: latest.db
After not liking any solution that I've found, I've just written tarlogs.py, which lets you specify arbitrary names for tar entries. Each tar entry is constructed from one (or several) regular (or gzipped) inputs. You can also add directories, which will be recursed into as with regular tar. So in your case,
tarlogs.py -o file1 -i /some/path/to/file2 -o file2 -i file3 -o path/to/renamedFile3 -o /etc >output.tar
(-o with no -i inputs simply uses the output path as input, with no renaming)
This question has been up for a while, but for anyone who's looking for another suitable solution:
I've created a fork of the original GNU tar utility with additional support for file name mapping.
Usage example:
> touch myfile.txt
> tar cf file.tar ':myfile.txt:dir/inside/tar/newname.txt'
> tar tvf file.tar
-rw-rw-r-- user/user 0 2022-02-12 14:27 dir/inside/tar/newname.txt
The feature is triggered by prefixing file names with a colon (:) as shown above. A second colon functions as a separator between the source file location and the desired file name inside the archive.
:[source file]:[desired name inside the tar]
This feature is compatible with the -T (input list from file) flag.
How to compile it
> git clone https://github.com/leso-kn/tar
> cd tar
> ./bootstrap
> ./configure
> make -j4
# Run it
> src/tar --version

How do I tar a directory without retaining the directory structure?

I'm working on a backup script and want to tar up a file directory:
tar czf ~/backup.tgz /home/username/drupal/sites/default/files
This tars it up, but when I untar the resulting file, it includes the full file structure: the files are in home/username/drupal/sites/default/files.
Is there a way to exclude the parent directories, so that the resulting tar just knows about the last directory (files)?
Use the --directory option:
tar czf ~/backup.tgz --directory=/home/username/drupal/sites/default files
Here is a better solution for when changing into the specified directory is not practical (Makefiles, etc.):
tar -cjvf files.tar.bz2 -C directory/contents/to/be/compressed .
Do not forget the dot (.) at the end !!
cd /home/username/drupal/sites/default/files
tar czf ~/backup.tgz *
Create a tar archive
tar czf $sourcedir/$backup_dir.tar --directory=$sourcedir WEB-INF en
Un-tar files on a local machine
tar -xvf $deploydir/med365/$backup_dir.tar -C $deploydir/med365/
Upload to a server
scp -r -i $privatekey $sourcedir/$backup_dir.tar $server:$deploydir/med365/
echo "File uploaded.. deployment folders"
Un-tar on server
ssh -i $privatekey $server tar -xvf $deploydir/med365/$backup_dir.tar -C $deploydir/med365/
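To tar and gzip all txt (*.txt) files from /home/myuser/workspace/zip_from/
to /home/myuser/workspace/zip_to/ without the directory structure of the source files, use the following command. Note that the shell expands *.txt in the current directory before tar runs, so run it from zip_from: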
tar -P -cvzf /home/myuser/workspace/zip_to/mydoc.tar.gz --directory="/home/myuser/workspace/zip_from/" *.txt
If you want to tar files while keeping the structure but ignore it partially or completely when extracting, use the --strip-components argument when extracting.
In this case, where the full path is /home/username/drupal/sites/default/files, the following command would extract the tar.gz content without the full parent directory structure, keeping only the last directory of the path (e.g. files/file1).
tar -xzv --strip-components=5 -f backup.tgz
I've found this tip on https://www.baeldung.com/linux/tar-archive-without-directory-structure#5-using-the---strip-components-option.
To build on nbt's and MaikoID's solutions:
tar -czf destination.tar.gz -C source/directory $(ls source/directory)
This solution:
Includes all files and folders in the directory
Does not include any of the directory structure (or .) in the final product
Does not require you to change directories.
However, it requires the directory to be given twice, so it may be most useful in another script. It may also be less efficient if there are a lot of files/folders in source/directory. Adjust the subcommand as necessary.
So for instance for the following structure:
|- source
| |- one
| `- two
`- working
the following command:
working$ tar -czf destination.tar.gz -C ../source $(ls ../source)
will produce destination.tar.gz where both one and two (and sub-files/-folders) are the first items.
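The $(ls ...) substitution skips hidden entries and breaks on names containing whitespace. One possible variant, assuming GNU find and GNU tar (scratch paths are made up), feeds a NUL-separated list to tar instead:

```shell
work=$(mktemp -d)
mkdir -p "$work/source"
touch "$work/source/one" "$work/source/two words" "$work/source/.hidden"

# %P prints each entry relative to the starting point; \0 terminates it with NUL.
find "$work/source" -mindepth 1 -maxdepth 1 -printf '%P\0' |
  tar -czf "$work/destination.tar.gz" -C "$work/source" --null -T -

members=$(tar -tzf "$work/destination.tar.gz")
printf '%s\n' "$members"
```

The directory still has to be given twice, but the list no longer passes through word splitting.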
This worked for me:
gzip -dc "<your_file>.tgz" | tar x -C <location>
For me -C or --directory did not work, I use this
cd source/directory/or/file
tar -cvzf destination/packaged-app.tgz *.jar
# this will put your current directory to what it previously was
cd -
Kindly use the below command to generate tar file without directory structure
tar -C <directoryPath> -cvzf <Path of the tar.gz file> filename1 filename2... filename N
eg:
tar -C /home/project/files -cvzf /home/project/files/test.tar.gz text1.txt text2.txt
tar -czf ~/backup.tgz -C /home/username/drupal/sites/default files
-C does the cd for you

tar: add all files and directories in current directory INCLUDING .svn and so on

I try to tar.gz a directory and use
tar -czf workspace.tar.gz *
The resulting tar includes .svn directories in subdirs but NOT in the current directory (as * gets expanded to only 'visible' files before it is passed to tar).
I tried
tar -czf workspace.tar.gz .
instead, but then I get an error because '.' has changed while reading:
tar: ./workspace.tar.gz: file changed as we read it
Is there a trick so that * matches all files (including dot-prefixed) in a directory?
(using bash on Linux SLES-11 (2.6.27.19))
Don't create the tar file in the directory you are packing up:
tar -czf /tmp/workspace.tar.gz .
does the trick, except it will extract the files all over the current directory when you unpack. Better to do:
cd ..
tar -czf workspace.tar.gz workspace
or, if you don't know the name of the directory you were in:
base=$(basename $PWD)
cd ..
tar -czf $base.tar.gz $base
(This assumes that you didn't follow symlinks to get to where you are and that the shell doesn't try to second guess you by jumping backwards through a symlink - bash is not trustworthy in this respect. If you have to worry about that, use cd -P .. to do a physical change directory. Stupid that it is not the default behaviour in my view - confusing, at least, for those for whom cd .. never had any alternative meaning.)
One comment in the discussion says:
I [...] need to exclude the top directory and I [...] need to place the tar in the base directory.
The first part of the comment does not make much sense - if the tar file contains the current directory, it won't be created when you extract file from that archive because, by definition, the current directory already exists (except in very weird circumstances).
The second part of the comment can be dealt with in one of two ways:
Either: create the file somewhere else - /tmp is one possible location - and then move it back to the original location after it is complete.
Or: if you are using GNU Tar, use the --exclude=workspace.tar.gz option. The string after the = is a pattern - the example is the simplest pattern - an exact match. You might need to specify --exclude=./workspace.tar.gz if you are working in the current directory contrary to recommendations; you might need to specify --exclude=workspace/workspace.tar.gz if you are working up one level as suggested. If you have multiple tar files to exclude, use '*', as in --exclude=./*.gz.
There are a couple of steps to take:
Replace * by . to include hidden files as well.
To create the archive in the same directory a --exclude=workspace.tar.gz can be used to exclude the archive itself.
To prevent the tar: .: file changed as we read it error when the archive is not yet created, make sure it exists (e.g. using touch), so the --exclude matches the archive filename. (It does not match if the file does not exist.)
Combined this results in the following script:
touch workspace.tar.gz
tar -czf workspace.tar.gz --exclude=workspace.tar.gz .
If you really don't want to include the top directory in the tarball (and that's generally a bad idea):
tar czf workspace.tar.gz -C /path/to/workspace .
In the directory you want to compress (the current directory), try this:
tar -czf workspace.tar.gz . --exclude=./*.gz
You can include the hidden directories by going back a directory and doing:
cd ..
tar czf workspace.tar.gz workspace
Assuming the directory you wanted to gzip was called workspace.
You can fix the . form by using --exclude:
tar -czf workspace.tar.gz --exclude=workspace.tar.gz .
Update: I added a fix for the OP's comment.
tar -czf workspace.tar.gz .
will indeed change the current directory, but why not place the file somewhere else?
tar -czf somewhereelse/workspace.tar.gz .
mv somewhereelse/workspace.tar.gz . # Update
Actually the problem is with the compression options. The trick is to pipe the tar result to a compressor instead of using the built-in options. Incidentally, that can also give you better compression, since you can set extra compression options.
Minimal tar:
tar --exclude=*.tar* -cf workspace.tar .
Pipe to a compressor of your choice. This example is verbose and uses xz with maximum compression:
tar --exclude='*.tar*' -cvf - . | xz -9v >workspace.tar.xz
Solution was tested on Ubuntu 14.04 and Cygwin on Windows 7.
It's a community wiki answer, so feel free to edit if you spot a mistake.
Had a similar situation myself. I think it is best to create the tar elsewhere and then use -C to tell tar the base directory for the compressed files. Example:
tar -czf workspace.tar.gz -C <path_to_workspace> $(ls -A <path_to_workspace>)
This way there is no need to exclude your own tarfile. As noted in other comments, -A will list hidden files.
Yet another solution, assuming the number of items in the folder is not huge and assuming all names do not contain characters the shell interprets as delimiters (whitespace):
tar -czf workspace.tar.gz `ls -A`
(ls -A prints normal and hidden files but not "." and ".." as ls -a does.)
A good question. In ZSH you can use the globbing modifier (D), which stands for "dotfiles". Compare:
ls $HOME/*
and
ls $HOME/*(D)
This correctly excludes the special directory entries . and .. from the listing. In Bash you can use .* to include the dotfiles explicitly:
ls $HOME/* $HOME/.*
But that includes . and .. as well, so it's not what you were looking for. I'm sure there's some way to make * match dotfiles in bash, too.
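One such way in bash is the dotglob shell option. The sketch below assumes bash is installed and uses a made-up scratch workspace; the glob runs in an explicit bash -c because dotglob is bash-specific:

```shell
work=$(mktemp -d)
mkdir -p "$work/workspace/.svn"
touch "$work/workspace/visible.txt" "$work/workspace/.hidden" "$work/workspace/.svn/entries"

# With dotglob set, * also matches names starting with a dot,
# but still never matches . or .. themselves.
(cd "$work/workspace" && bash -c 'shopt -s dotglob; tar -czf ../workspace.tar.gz *')

members=$(tar -tzf "$work/workspace.tar.gz")
printf '%s\n' "$members"
```

Writing the archive to the parent directory also sidesteps the "file changed as we read it" problem.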
The problem with most solutions provided here is that tar puts ./ at the beginning of every entry. This results in a . directory when opening the archive through a GUI compressor. So what I ended up doing is:
ls -1A | xargs -d "\n" tar cfz my.tar.gz
If you already have my.tar.gz in current directory you may want to grep this out:
ls -1A | grep -v my.tar.gz | xargs -d "\n" tar cfz my.tar.gz
Be aware that xargs has a certain limit (see xargs --show-limits). So this solution would not work if the directory you are trying to tar has a very large number of entries (directories and files).
10 years later, you have an alternative to tar, illustrated with Git 2.30 (Q1 2021), which uses "git archive" to produce the release tarball instead of tar.
(You don't need Git 2.30 to apply that alternative)
See commit 4813277 (11 Oct 2020), and commit 93e7031 (10 Oct 2020) by René Scharfe (rscharfe).
(Merged by Junio C Hamano -- gitster -- in commit 63e5273, 27 Oct 2020)
Makefile: use git init/add/commit/archive for dist-doc
Signed-off-by: René Scharfe
Reduce the dependency on external tools by generating the distribution archives for HTML documentation and manpages using git commands instead of tar.
This gives the archive entries the same meta data as those in the dist archive for binaries.
So instead of:
tar cf ../archive.tar .
You can do using Git only:
git -C workspace init
git -C workspace add .
git -C workspace commit -m workspace
git -C workspace archive --format=tar --prefix=./ HEAD^{tree} > workspace.tar
rm -Rf workspace/.git
That was initially proposed because, as explained here, some exotic platform might have an old tar distribution with lacking options.
tar -czf workspace.tar.gz .??* *
Specifying .??* will include "dot" files and directories that have at least 2 characters after the dot. The down side is it will not include files/directories with a single character after the dot, such as .a, if there are any.
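If disk space is not an issue, this could also be a very easy thing to do (the copy goes to a sibling directory, and copying . picks up the hidden files that ./* would miss):
mkdir ../backup
cp -a . ../backup
tar -zcvf ../backup.tar.gz -C .. backup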
Using find is probably the easiest way:
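find . -mindepth 1 -maxdepth 1 ! -name workspace.tar.gz -exec tar zcvf workspace.tar.gz {} +
find . -mindepth 1 -maxdepth 1 finds all files/directories/symlinks/etc. in the current directory (but not . itself, which would pull everything in a second time), ! -name workspace.tar.gz skips the archive being written, and -exec runs the specified command. The {} in the command means "the file list goes here", and + means that the command will be run as: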
tar zcvf workspace.tar.gz .file1 .file2 .dir3
instead of
tar zcvf workspace.tar.gz .file1
tar zcvf workspace.tar.gz .file2
tar zcvf workspace.tar.gz .dir3

Make Tar + gzip ignore directory paths

Is it possible, when making a tar + gzip through the 'tar c ...' command, to have the relative paths ignored upon expanding?
For example,
tar cvf test.tgz foo ../../files/bar
And then expanding the test.tgz with
tar xvf test.tgz
gives a directory containing:
foo files/bar
I want the directory to contain the files:
foo bar
Is this possible?
If all the paths begin with the same initial list of directories then you can use e.g. tar cvf test.tgz -C ../.. other/dir. Beware that the shell won't expand wildcards in pathnames "properly" however because -C asks tar to change directory.
Otherwise, the only way I've ever come up with is to make a temporary directory filled with appropriate symlinks and use the -h option to dereference through symlinks. Of course that won't work if some of the files you want to store are actually symlinks themselves.

Shell command to tar directory excluding certain files/folders

Is there a simple shell command/script that supports excluding certain files/folders from being archived?
I have a directory that needs to be archived, with a subdirectory that has a number of very large files I do not need to back up.
Not quite solutions:
The tar --exclude=PATTERN command matches the given pattern and excludes those files, but I need specific files & folders to be ignored (full file path), otherwise valid files might be excluded.
I could also use the find command to create a list of files, exclude the ones I don't want to archive, and pass the list to tar, but that only works for a small number of files. I have tens of thousands.
I'm beginning to think the only solution is to create a file with a list of files/folders to be excluded, then use rsync with --exclude-from=file to copy all the files to a tmp directory, and then use tar to archive that directory.
Can anybody think of a better/more efficient solution?
EDIT: Charles Ma's solution works well. The big gotcha is that the --exclude='./folder' MUST be at the beginning of the tar command. Full command (cd first, so backup is relative to that directory):
cd /folder_to_backup
tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz .
You can have multiple exclude options for tar so
$ tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz .
etc will work. Make sure to put --exclude before the source and destination items.
You can exclude directories with --exclude for tar.
If you want to archive everything except /usr you can use:
tar -zcvf /all.tgz / --exclude=/usr
In your case perhaps something like
tar -zcvf archive.tgz arc_dir --exclude=arc_dir/ignore_this_dir
Possible options to exclude files/directories from backup using tar:
Exclude files using multiple patterns
tar -czf backup.tar.gz --exclude=PATTERN1 --exclude=PATTERN2 ... /path/to/backup
Exclude files using an exclude file filled with a list of patterns
tar -czf backup.tar.gz -X /path/to/exclude.txt /path/to/backup
Exclude files using tags by placing a tag file in any directory that should be skipped
tar -czf backup.tar.gz --exclude-tag-all=exclude.tag /path/to/backup
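As an illustration of the exclude-file variant, here is a minimal sketch with invented names (site, cache, exclude.txt), assuming GNU tar's default pattern matching:

```shell
work=$(mktemp -d)
mkdir -p "$work/site/cache" "$work/site/docs"
touch "$work/site/docs/a.txt" "$work/site/cache/big.bin" "$work/site/notes.txt~"

# One pattern per line: a specific directory and an editor-backup glob.
printf '%s\n' 'site/cache' '*~' > "$work/exclude.txt"

tar -czf "$work/backup.tar.gz" -X "$work/exclude.txt" -C "$work" site

members=$(tar -tzf "$work/backup.tar.gz")
printf '%s\n' "$members"
```

Keeping the patterns in a file avoids a long command line when there are many things to skip.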
Old question with many answers, but I found that none were quite clear enough for me, so I would like to add my try.
If you have the following structure
/home/ftp/mysite/
with following file/folders
/home/ftp/mysite/file1
/home/ftp/mysite/file2
/home/ftp/mysite/file3
/home/ftp/mysite/folder1
/home/ftp/mysite/folder2
/home/ftp/mysite/folder3
So, you want to make a tar file that contains everything inside /home/ftp/mysite (to move the site to a new server), but file3 is just junk, and everything in folder3 is also not needed, so we will skip those two.
We use the format
tar -czvf <name of tar file> <what to tar> <any excludes>
where c = create, z = gzip, v = verbose (you can see the files as they are added, useful to make sure none of the files you exclude are being added) and f = file.
So, my command would look like this
cd /home/ftp/
tar -czvf mysite.tar.gz mysite --exclude='file3' --exclude='folder3'
Note that the excluded files/folders are relative to the root of your tar (I tried full paths relative to / here, but I could not make that work).
Hope this will help someone (and me next time I google it).
You can use standard "ant notation" to exclude directories relative.
This works for me and excludes any .git or node_module directories:
tar -cvf myFile.tar --exclude=**/.git/* --exclude=**/node_modules/* -T /data/txt/myInputFile.txt 2> /data/txt/myTarLogFile.txt
myInputFile.txt contains:
/dev2/java
/dev2/javascript
This exclude pattern handles filename suffix like png or mp3 as well as directory names like .git and node_modules
tar --exclude={*.png,*.mp3,*.wav,.git,node_modules} -Jcf ${target_tarball} ${source_dirname}
I've experienced that, at least with the Cygwin version of tar I'm using ("CYGWIN_NT-5.1 1.7.17(0.262/5/3) 2012-10-19 14:39 i686 Cygwin" on a Windows XP Home Edition SP3 machine), the order of options is important.
While this construction worked for me:
tar cfvz target.tgz --exclude='<dir1>' --exclude='<dir2>' target_dir
that one didn't work:
tar cfvz --exclude='<dir1>' --exclude='<dir2>' target.tgz target_dir
This, while tar --help reveals the following:
tar [OPTION...] [FILE]
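So, the second command should also work, but apparently it doesn't. The likely reason is that with old-style bundled options (cfvz), the f option takes the very next argument as the archive name, so in the second command the first --exclude is consumed as the archive file name.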
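A hedged sketch of an ordering that avoids the ambiguity: keep f as the last letter of the option cluster so the archive name follows it directly, and put the excludes ahead of it (directory names below are placeholders):

```shell
work=$(mktemp -d)
mkdir -p "$work/target_dir/dir1" "$work/target_dir/dir2" "$work/target_dir/keep"
touch "$work/target_dir/dir1/x" "$work/target_dir/dir2/y" "$work/target_dir/keep/z"

# Excludes first, then -czf immediately followed by the archive name.
tar --exclude='target_dir/dir1' --exclude='target_dir/dir2' \
    -czf "$work/target.tgz" -C "$work" target_dir

members=$(tar -tzf "$work/target.tgz")
printf '%s\n' "$members"
```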
I found this somewhere else so I won't take credit, but it worked better than any of the solutions above for my mac specific issues (even though this is closed):
tar zc --exclude __MACOSX --exclude .DS_Store -f <archive> <source(s)>
After reading all these good answers for different versions and having solved the problem for myself, I think there are some very small details that are very important, and unusual in general GNU/Linux use, that aren't stressed enough and deserve more than comments.
So I'm not going to try to answer the question for every case, but instead try to record where to look when things don't work.
IT IS VERY IMPORTANT TO NOTICE:
THE ORDER OF THE OPTIONS MATTERS: putting the --exclude before the archive file and the directories to back up is not the same as putting it after them. This was unexpected, at least to me, because in my experience the order of options usually doesn't matter in GNU/Linux commands.
Different tar versions expect these options in a different order: for instance, @Andrew's answer indicates that in GNU tar v1.26 and 1.28 the excludes come last, whereas in my case, with GNU tar 1.29, it's the other way round.
THE TRAILING SLASHES MATTER: at least in GNU tar 1.29, there shouldn't be any.
In my case, for GNU tar 1.29 on Debian stretch, the command that worked was
tar --exclude="/home/user/.config/chromium" --exclude="/home/user/.cache" -cf file.tar /dir1/ /home/ /dir3/
The quotes didn't matter, it worked with or without them.
I hope this will be useful to someone.
If you are trying to exclude Version Control System (VCS) files, tar already supports two interesting options about it! :)
Option : --exclude-vcs
This option excludes files and directories used by following version control systems: CVS, RCS, SCCS, SVN, Arch, Bazaar, Mercurial, and Darcs.
As of version 1.32, the following files are excluded:
CVS/, and everything under it
RCS/, and everything under it
SCCS/, and everything under it
.git/, and everything under it
.gitignore
.gitmodules
.gitattributes
.cvsignore
.svn/, and everything under it
.arch-ids/, and everything under it
{arch}/, and everything under it
=RELEASE-ID
=meta-update
=update
.bzr
.bzrignore
.bzrtags
.hg
.hgignore
.hgtags
_darcs
Option : --exclude-vcs-ignores
When archiving directories that are under some version control system (VCS), it is often convenient to read exclusion patterns from this VCS' ignore files (e.g. .cvsignore, .gitignore, etc.) This option provides such a possibility.
Before archiving a directory, see if it contains any of the following files: .cvsignore, .gitignore, .bzrignore, or .hgignore. If so, read ignore patterns from these files.
The patterns are treated much as the corresponding VCS would treat them, i.e.:
.cvsignore
Contains shell-style globbing patterns that apply only to the directory where this file resides. No comments are allowed in the file. Empty lines are ignored.
.gitignore
Contains shell-style globbing patterns. Applies to the directory where .gitignore is located and all its subdirectories.
Any line beginning with a # is a comment. Backslash escapes the comment character.
.bzrignore
Contains shell globbing patterns and regular expressions (if prefixed with RE:). Patterns affect the directory and all its subdirectories.
Any line beginning with a # is a comment.
.hgignore
Contains POSIX regular expressions. The line syntax: glob switches to shell globbing patterns. The line syntax: regexp switches back. Comments begin with a #. Patterns affect the directory and all its subdirectories.
Example
tar -czv --exclude-vcs --exclude-vcs-ignores -f path/to/my-tar-file.tar.gz path/to/my/project/
I'd like to show another option I used to get the same result as the previous answers. I had a similar case where I wanted to back up Android Studio projects all together in a tar file to upload to MediaFire. Using the du command to find the large files, I found that I didn't need some directories, like:
build, linux and .dart_tool
Starting from the first answer, by Charles_ma, I modified it a little to be able to run the command from the parent directory of my Android directory.
tar --exclude='*/build' --exclude='*/linux' --exclude='*/.dart_tool' -zcvf androidProjects.tar Android/
It worked like a charm.
Ps. Sorry if this kind of answer is not allowed, if this is the case I will remove.
For Mac OSX I had to do
tar -zcv --exclude='folder' -f theOutputTarFile.tar folderToTar
Note the -f after the --exclude=
For those who have issues with it, some versions of tar would only work properly without the './' in the exclude value.
tar --version
tar (GNU tar) 1.27.1
Command syntax that works:
tar -czvf ../allfiles-butsome.tar.gz * --exclude=acme/foo
These will not work:
$ tar -czvf ../allfiles-butsome.tar.gz * --exclude=./acme/foo
$ tar -czvf ../allfiles-butsome.tar.gz * --exclude='./acme/foo'
$ tar --exclude=./acme/foo -czvf ../allfiles-butsome.tar.gz *
$ tar --exclude='./acme/foo' -czvf ../allfiles-butsome.tar.gz *
$ tar -czvf ../allfiles-butsome.tar.gz * --exclude=/full/path/acme/foo
$ tar -czvf ../allfiles-butsome.tar.gz * --exclude='/full/path/acme/foo'
$ tar --exclude=/full/path/acme/foo -czvf ../allfiles-butsome.tar.gz *
$ tar --exclude='/full/path/acme/foo' -czvf ../allfiles-butsome.tar.gz *
I agree the --exclude flag is the right approach.
$ tar --exclude='./folder_or_file' --exclude='file_pattern' --exclude='fileA'
A word of warning for a side effect that I did not find immediately obvious:
The exclusion of 'fileA' in this example will search for 'fileA' RECURSIVELY!
Example: a directory with a single subdirectory containing a file of the same name (data.txt)
data.txt
config.txt
--+dirA
| data.txt
| config.docx
If using --exclude='data.txt' the archive will not contain EITHER data.txt file. This can cause unexpected results if archiving third party libraries, such as a node_modules directory.
To avoid this issue make sure to give the entire path, like --exclude='./dirA/data.txt'
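The difference can be checked quickly with a sketch like the following (GNU tar assumed; the proj directory mirrors the example layout above and is otherwise made up):

```shell
work=$(mktemp -d)
mkdir -p "$work/proj/dirA"
touch "$work/proj/data.txt" "$work/proj/config.txt" "$work/proj/dirA/data.txt"

# Bare file name: matches data.txt at every level.
broad=$(tar -cf - -C "$work/proj" --exclude='data.txt' . | tar -tf -)

# Full member path: matches only the copy inside dirA.
narrow=$(tar -cf - -C "$work/proj" --exclude='./dirA/data.txt' . | tar -tf -)

echo "--exclude='data.txt':";        printf '  %s\n' $broad
echo "--exclude='./dirA/data.txt':"; printf '  %s\n' $narrow
```

Note that the pattern has to be written the way member names appear in the archive, here with the leading ./ that comes from archiving the directory as a dot.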
After reading this thread, I did a little testing on RHEL 5 and here are my results for tarring up the abc directory:
This will exclude the directories error and logs and all files under the directories:
tar cvpzf abc.tgz abc/ --exclude='abc/error' --exclude='abc/logs'
Adding a wildcard after the excluded directory will exclude the files but preserve the directories:
tar cvpzf abc.tgz abc/ --exclude='abc/error/*' --exclude='abc/logs/*'
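A compact way to compare the two forms, sketched with GNU tar and a made-up abc tree containing only a logs directory:

```shell
work=$(mktemp -d)
mkdir -p "$work/abc/logs"
touch "$work/abc/app.conf" "$work/abc/logs/today.log"

# Excluding the directory name drops the directory and its files.
whole=$(tar -cf - -C "$work" --exclude='abc/logs' abc | tar -tf -)

# Excluding dir/* drops the files but keeps the (now empty) directory entry.
contents=$(tar -cf - -C "$work" --exclude='abc/logs/*' abc | tar -tf -)

echo "exclude abc/logs:";   printf '  %s\n' $whole
echo "exclude abc/logs/*:"; printf '  %s\n' $contents
```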
To avoid possible 'xargs: Argument list too long' errors due to the use of find ... | xargs ... when processing tens of thousands of files, you can pipe the output of find directly to tar using find ... -print0 | tar --null ....
# archive a given directory, but exclude various files & directories
# specified by their full file paths
find "$(pwd -P)" -type d \( -path '/path/to/dir1' -or -path '/path/to/dir2' \) -prune \
-or -not \( -path '/path/to/file1' -or -path '/path/to/file2' \) -print0 |
gnutar --null --no-recursion -czf archive.tar.gz --files-from -
#bsdtar --null -n -czf archive.tar.gz -T -
You can also use one of the "--exclude-tag" options depending on your needs:
--exclude-tag=FILE
--exclude-tag-all=FILE
--exclude-tag-under=FILE
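The folder hosting the specified FILE will be excluded, to a different degree for each option: --exclude-tag keeps the directory and the tag file but drops the rest of its contents, --exclude-tag-under drops everything under the directory, and --exclude-tag-all drops the directory itself.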
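For example, a sketch of --exclude-tag-all with a hypothetical tag file named exclude.tag (GNU tar assumed):

```shell
work=$(mktemp -d)
mkdir -p "$work/data/keep" "$work/data/skip"
touch "$work/data/keep/a.txt" "$work/data/skip/b.txt"

# Dropping a tag file into a directory marks it for exclusion.
touch "$work/data/skip/exclude.tag"

members=$(tar -cf - -C "$work" --exclude-tag-all=exclude.tag data | tar -tf -)
printf '%s\n' "$members"
```

This moves the decision about what to skip out of the backup command and into the directory tree itself.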
Use the find command in conjunction with the tar append (-r) option. This way you can add files to an existing tar in a single step, instead of a two pass solution (create list of files, create tar).
find /dir/dir -prune ... -o etc etc.... -exec tar rvf ~/tarfile.tar {} \;
You can use cpio(1) to create tar files. cpio takes the files to archive on stdin, so if you've already figured out the find command you want to use to select the files to archive, pipe it into cpio to create the tar file:
find ... | cpio -o -H ustar | gzip -c > archive.tar.gz
With GNU tar v1.26, the --exclude needs to come after the archive file and backup directory arguments, should have no leading or trailing slashes, and prefers no quotes (single or double). So relative to the PARENT directory to be backed up, it's:
tar cvfz /path_to/mytar.tgz ./dir_to_backup --exclude=some_path/to_exclude
tar -cvzf destination_folder source_folder -X /home/folder/excludes.txt
-X indicates a file which contains a list of filenames which must be excluded from the backup. For Instance, you can specify *~ in this file to not include any filenames ending with ~ in the backup.
Success Case:
1) If you give the full path of the directory to back up, the excludes should also use full paths.
tar -zcvf /opt/ABC/BKP_27032020/backup_27032020.tar.gz --exclude='/opt/ABC/csv/' --exclude='/opt/ABC/log/' /opt/ABC
2) If you give a path relative to the current directory, the excludes should also be relative to the current directory.
tar -zcvf backup_27032020.tar.gz --exclude='ABC/csv/' --exclude='ABC/log/' ABC
Failure Case:
If you give a relative path for the directory to back up but full paths for the excludes, it won't work.
tar -zcvf /opt/ABC/BKP_27032020/backup_27032020.tar.gz --exclude='/opt/ABC/csv/' --exclude='/opt/ABC/log/' ABC
Note: mentioning exclude before/after backup directory is fine.
It seems to be impossible to exclude directories with absolute paths.
As soon as ANY of the paths are absolute (source or/and exclude) the exclude command will not work. That's my experience after trying all possible combinations.
Check it out
tar cvpzf zip_folder.tgz . --exclude=./public --exclude=./tmp --exclude=./log --exclude=fileName
I want to have fresh front-end version (angular folder) on localhost.
Also, git folder is huge in my case, and I want to exclude it.
I need to download it from server, and unpack it in order to run application.
Compress angular folder from /var/lib/tomcat7/webapps, move it to /tmp folder with name angular.23.12.19.tar.gz
Command :
tar --exclude='.git' -zcvf /tmp/angular.23.12.19.tar.gz /var/lib/tomcat7/webapps/angular/
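Your best bet is to use find with tar, passing the file list on standard input rather than through xargs (with a large number of arguments, xargs may run tar several times, and each run would overwrite the archive). For example, with GNU tar:
find / -print0 | tar cjf tarfile.tar.bz2 --null --no-recursion -T -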
Possible redundant answer but since I found it useful, here it is:
While a FreeBSD root (i.e. using csh) I wanted to copy my whole root filesystem to /mnt but without /usr and (obviously) /mnt. This is what worked (I am at /):
tar --exclude ./usr --exclude ./mnt --create --file - . | (cd /mnt && tar xvf -)
My whole point is that it was necessary (by putting the ./) to specify to tar that the excluded directories were part of the greater directory being copied.
My €0.02
I had no luck getting tar to exclude a 5 Gigabyte subdirectory a few levels deep. In the end, I just used the unix Zip command. It worked a lot easier for me.
So for this particular example from the original post
(tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz . )
The equivalent would be:
zip -r /backup/filename.zip . -x folder/**\* upload/folder2/**\*
(NOTE: Here is the post I originally used that helped me https://superuser.com/questions/312301/unix-zip-directory-but-excluded-specific-subdirectories-and-everything-within-t)
The following bash script should do the trick. It uses the answer given here by Marcus Sundman.
#!/bin/bash
echo -n "Please enter the name of the tar file you wish to create, without the extension "
read nam
echo -n "Please enter the path to the directories to tar "
read pathin
echo tar -czvf $nam.tar.gz
excludes=`find $pathin -iname "*.CC" -exec echo "--exclude \'{}\'" \;|xargs`
echo $pathin
echo tar -czvf $nam.tar.gz $excludes $pathin
This will print out the command you need and you can just copy and paste it back in. There is probably a more elegant way to provide it directly to the command line.
Just change *.CC for any other common extension, file name or regex you want to exclude and this should still work.
EDIT
Just to add a little explanation: find generates a list of files matching the chosen pattern (in this case *.CC) and runs echo for each one, which prints --exclude 'one entry from the list'. xargs then joins those pieces into a single line. The backslashes (\) are escape characters for the ' marks.

Resources