How to exclude specific files with the tar command? - linux

Let's assume I have the following directory structure
dir/
├── subdir
│   ├── dir
│   │   └── TODO.txt
│   └── TODO.txt
└── TODO.txt
I wish to bundle dir/ recursively into a tarball with the tar command on Linux, but I want to exclude only the root TODO.txt. How can I specify this with a relative path?
Attempt #1
tar -czf dir.tar.gz dir/ --exclude='TODO.txt'
Doesn't work: it removes every TODO.txt from the resulting tarball:
dir/
└── subdir
    └── dir
Attempt #2
tar -czf dir.tar.gz dir/ --exclude='dir/TODO.txt'
This also fails, because the TODO.txt inside the nested dir subdirectory (dir/subdir/dir/TODO.txt) is also matched by this pattern. The resulting tarball hence contains
dir/
└── subdir
    ├── dir
    └── TODO.txt
Is there any way I can specify exactly that I want to exclude the root TODO.txt with a relative path?

Instead of naming dir/ as the thing to archive, have tar change into dir and archive . (the current directory) instead. The name . never appears deeper in any file's path, so it serves as a robust anchor for the exclude pattern. Then use --transform= to make the paths in the archive begin with dir/ again.
Demonstration
without filter:
$ tar -czf dir.tar.gz -v dir
dir/
dir/TODO.txt
dir/subdir/
dir/subdir/TODO.txt
dir/subdir/dir/
dir/subdir/dir/TODO.txt
cd into dir, name it as .:
$ tar -czf dir.tar.gz -v -C dir .
./
./TODO.txt
./subdir/
./subdir/TODO.txt
./subdir/dir/
./subdir/dir/TODO.txt
exclude, anchoring on .:
$ tar -czf dir.tar.gz -v -C dir --exclude='./TODO.txt' .
./
./subdir/
./subdir/TODO.txt
./subdir/dir/
./subdir/dir/TODO.txt
change . back to dir inside the archive (--show-transformed-names makes tar show the names as they go into the archive):
$ tar -czf dir.tar.gz -v -C dir --exclude='./TODO.txt' --transform='s/^\./dir/g' --show-transformed-names .
dir/
dir/subdir/
dir/subdir/TODO.txt
dir/subdir/dir/
dir/subdir/dir/TODO.txt
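To try the whole sequence end to end, here is a minimal sketch that rebuilds the example tree in a scratch directory and then lists what ends up in the archive. It assumes GNU tar (needed for --transform); the scratch location from mktemp and the variable names work and listing are only for illustration.

```shell
# Assumes GNU tar; the scratch directory exists only for this demo.
work=$(mktemp -d)
mkdir -p "$work/dir/subdir/dir"
touch "$work/dir/TODO.txt" "$work/dir/subdir/TODO.txt" "$work/dir/subdir/dir/TODO.txt"

# Archive "." from inside dir, exclude only ./TODO.txt,
# then rewrite the leading "." of each member name to "dir".
tar -czf "$work/dir.tar.gz" -C "$work/dir" \
    --exclude='./TODO.txt' --transform='s/^\./dir/' .

# Show the member names as stored in the archive.
listing=$(tar -tzf "$work/dir.tar.gz")
printf '%s\n' "$listing"
```

The nested TODO.txt files should be listed while the top-level one is absent, because ./TODO.txt can only match at the root of what tar was asked to archive.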

From @arkascha's answer:
find dir/ -type f | grep -v '^dir/TODO\.txt$' > files.txt
then
tar -czf dir.tar.gz -T files.txt
In the first line, there are 2 tricks to pay attention to:
The -type f option. Without it, find also lists directories, and tar archives every listed directory recursively, so each file would be added once for itself and once more for every ancestor directory in the list.
The ^ in grep's regex: it anchors the match at the beginning of the path, so only the TODO.txt at the top of the hierarchy is excluded. The escaped dot and the trailing $ keep the pattern from matching other names that merely start the same way.
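The same idea can be sketched without the intermediate files.txt and in a way that tolerates spaces or newlines in file names. This is only a sketch, assuming GNU find and GNU tar: find filters with ! -path, and the names are passed NUL-separated straight to tar. The scratch directory and variable names are made up for the demo.

```shell
work=$(mktemp -d)
mkdir -p "$work/dir/subdir/dir"
touch "$work/dir/TODO.txt" "$work/dir/subdir/TODO.txt" "$work/dir/subdir/dir/TODO.txt"
cd "$work" || exit 1

# ! -path is compared against the whole path find prints,
# so only the top-level dir/TODO.txt is skipped.
find dir -type f ! -path 'dir/TODO.txt' -print0 |
    tar -czf dir.tar.gz --null -T -

listing=$(tar -tzf dir.tar.gz)
printf '%s\n' "$listing"
```

As with the files.txt version, only regular files are listed, so empty directories are not stored in the archive.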

Related

How to copy folders and subfolders which have selected files

I have a directory oridir with structure as follows:
.
├── DIRA
│   ├── DIRA1
│   │   └── file2.txt
│   └── DIRA2
│       ├── file1.xls
│       └── file1.txt
└── DIRB
    ├── DIRB1
    │   └── file1.txt
    └── DIRB2
        └── file2.xls
I have to copy files which have extension .xls while maintaining the directory structure. So I need to have following directory and files in a newdir folder:
.
├── DIRA
│   └── DIRA2
│       └── file1.xls
└── DIRB
    └── DIRB2
        └── file2.xls
I tried following command but it copies all files and folders:
cp -r oridir newdir
Finding required files can be done as follows:
$ find oridir | grep xls$
oridir/DIRB/DIRB2/file2.xls
oridir/DIRA/DIRA2/file1.xls
Also as follows:
$ find oridir -type f -iname '*.xls'
./oridir/DIRB/DIRB2/file2.xls
./oridir/DIRA/DIRA2/file1.xls
But how do I create these folders and copy the files into them? How can I achieve this selective creation of directories and copying of files with bash in Linux?
Edit: Some file and directory names also contain spaces.
cp's --parents flag recreates the full source path under the destination directory (the man page describes it as "use full source file name under DIRECTORY").
For example, if recursive glob ** is enabled (shopt -s globstar):
cp --parents origin/**/*.xls target
If recursive glob is not enabled, you have to add a wildcard for each level of the directory hierarchy:
cp --parents origin/*/*/*.xls target
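If the depth is not known in advance and globstar is not available, find can supply the paths instead of a glob. The following is a minimal sketch, assuming GNU cp (for --parents); it builds a small stand-in for oridir in a scratch directory, with a space in one directory name and one file name to reflect the edit in the question.

```shell
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p "oridir/DIR A/DIRA2" oridir/DIRB/DIRB2 newdir
touch "oridir/DIR A/DIRA2/file 1.xls" "oridir/DIR A/DIRA2/file1.txt" oridir/DIRB/DIRB2/file2.xls

# Run from inside oridir so the recreated paths start at "DIR A/" and DIRB/
# instead of oridir/.
cd oridir || exit 1
find . -type f -name '*.xls' -exec cp --parents {} ../newdir \;
cd "$work" || exit 1

find newdir -type f | sort
```

Because find hands each path to cp as a single argument, names with spaces need no extra quoting.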
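Assuming the destination directory is dest:
foo.sh
#!/bin/bash
dest=./dest
# IFS= and -r keep leading spaces and backslashes in file names intact
find . -type f -name "*.xls" | while IFS= read -r f
do
d=$(dirname "${f}")
d="${dest}/${d}"
# recreate the source directory under dest, then copy the file into it
mkdir -p "${d}"
cp "${f}" "${d}"
done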
Create the test directories and files:
$ mkdir -p DIRA/DIRA1
$ mkdir -p DIRA/DIRA2
$ mkdir -p DIRB/DIRB1
$ mkdir -p DIRB/DIRB2
$ touch DIRA/DIRA1/file2.txt
$ touch DIRA/DIRA2/file1.xls
$ touch DIRA/DIRA2/file1.txt
$ touch DIRB/DIRB1/file1.txt
$ touch DIRB/DIRB1/file2.xls
After running foo.sh, the result is
$ find dest
dest
dest/DIRB
dest/DIRB/DIRB1
dest/DIRB/DIRB1/file2.xls
dest/DIRA
dest/DIRA/DIRA2
dest/DIRA/DIRA2/file1.xls
See Yuji's excellent answer first, but I think tar is also a good option here:
cd oridir; find . -name "*.xls" | xargs tar c | (cd ../newdir; tar x)
You may need to adjust oridir and/or ../newdir depending on the precise paths of your directories.
Possible improvement: Here is a version that may be better in that it will handle files (and paths) with spaces (or other strange characters) in their names, and that uses tar's own options instead of xargs and cd:
cd oridir; find . -name "*.xls" -print0 | tar -c --null -T- | tar -C ../newdir -x
Explanation:
The -print0 and the --null cause the respective commands to separate filenames by the null (ASCII 0) character only.
-T- causes tar to read filenames from standard input.
-C causes tar to cd before extracting.
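Here is a self-contained sketch of that pipeline for experimenting, assuming GNU find and GNU tar. It spells out -f - on both tar commands so the archive stream goes through the pipe regardless of tar's compiled-in default device; the scratch directory and the sample names (with spaces) are made up for the demo.

```shell
work=$(mktemp -d)
mkdir -p "$work/oridir/DIR A/DIRA2" "$work/oridir/DIRB/DIRB1" "$work/newdir"
touch "$work/oridir/DIR A/DIRA2/file 1.xls" "$work/oridir/DIRB/DIRB1/file1.txt"

cd "$work/oridir" || exit 1
# -name comes before -print0, so only matching paths are printed.
find . -name '*.xls' -print0 |
    tar -cf - --null -T - |
    tar -xf - -C ../newdir
cd "$work" || exit 1

find newdir | sort
```

Only the directories that lead to a matching file are created in newdir, since tar is given file paths and not directories.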

Linux command to move all the files from var/A to var/

I guess this question has already been asked here, but I couldn't find it.
I am in the directory var/ and I have a folder var/A with some files inside. I want to move the files inside A up into var/, like this:
from:
var
├── A
│   ├── file1.txt
│   └── file2.txt
└── file3.txt
to:
var
├── file1.txt
├── file2.txt
└── file3.txt
From var/ I have tried the following commands:
sudo mv A /
sudo mv A/ ./
sudo mv A/ .
None of them worked. Thank you in advance.
I have found the answer:
mv A/* .
I hope this helps someone else.
You have to specify the right path or a matching pattern.
To move all the files or directories inside A, use the * wildcard, like this:
user@akuma:~/var$ mv ./A/* ./
You can run mv --help in your terminal to see the other options:
user@akuma:~/WORKSPACE/Trash$ mv --help
Usage: mv [OPTION]... [-T] SOURCE DEST
  or:  mv [OPTION]... SOURCE... DIRECTORY
  or:  mv [OPTION]... -t DIRECTORY SOURCE...
Rename SOURCE to DEST, or move SOURCE(s) to DIRECTORY.
Mandatory arguments to long options are mandatory for short options too.
      --backup[=CONTROL]       make a backup of each existing destination file
  -b                           like --backup but does not accept an argument
  -f, --force                  do not prompt before overwriting
  -i, --interactive            prompt before overwrite
  -n, --no-clobber             do not overwrite an existing file
If you specify more than one of -i, -f, -n, only the final one takes effect.
      --strip-trailing-slashes  remove any trailing slashes from each SOURCE
                                 argument
  -S, --suffix=SUFFIX          override the usual backup suffix
  -t, --target-directory=DIRECTORY  move all SOURCE arguments into DIRECTORY
  -T, --no-target-directory    treat DEST as a normal file
  -u, --update                 move only when the SOURCE file is newer
                                 than the destination file or when the
                                 destination file is missing
  -v, --verbose                explain what is being done
  -Z, --context                set SELinux security context of destination
                                 file to default type
      --help     display this help and exit
      --version  output version information and exit
The backup suffix is '~', unless set with --suffix or SIMPLE_BACKUP_SUFFIX.
The version control method may be selected via the --backup option or through
the VERSION_CONTROL environment variable.  Here are the values:
  none, off       never make backups (even if --backup is given)
  numbered, t     make numbered backups
  existing, nil   numbered if numbered backups exist, simple otherwise
  simple, never   always make simple backups
GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
Full documentation at: <http://www.gnu.org/software/coreutils/mv>
or available locally via: info '(coreutils) mv invocation'
I managed to replicate your situation.
.
└── var
    ├── A
    │   ├── file1.txt
    │   └── file2.txt
    └── file3.txt
2 directories, 3 files
user@host:~/soQ$ mv var/A/*.txt var/ && rm -rf var/A/
user@host:~/soQ$ tree
.
└── var
    ├── file1.txt
    ├── file2.txt
    └── file3.txt
Hope this helps!
If you want to do this from var just do mv A/*.txt . && rm -rf A/
Also, let's assume you have the following structure in your directory.
.
├── A
│   ├── file3.txt
│   └── folder1
│       ├── file1.txt
│       └── file2.txt
└── file4.txt
If you want to move the files from A and its subfolders (here they are all .txt) into var, you need to use find like this:
find ./A -type f -exec mv {} "${PWD}" \; && rm -r A
This walks A and all its subfolders, produces the path of each regular file (each .txt, for example), runs mv once per file to move it into the current directory (${PWD}), and finally removes what is left of A. Note that rm needs -r here because A is a directory that still contains folder1.
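One thing the A/* glob does not pick up is hidden files (names starting with a dot). A minimal sketch that moves every direct child of A, dotfiles included, is shown below. It assumes GNU find and GNU mv (for -mindepth, -maxdepth and mv -t); the .hidden file and the scratch directory are made up for the demo.

```shell
work=$(mktemp -d)
mkdir -p "$work/var/A"
touch "$work/var/A/file1.txt" "$work/var/A/file2.txt" "$work/var/A/.hidden" "$work/var/file3.txt"
cd "$work/var" || exit 1

# -mindepth 1 -maxdepth 1 selects the direct children of A, dotfiles included.
find A -mindepth 1 -maxdepth 1 -exec mv -t . {} +

# rmdir only succeeds when A is empty, so it doubles as a check
# that nothing was left behind.
rmdir A

ls -A
```

Using rmdir rather than rm -rf means a forgotten file makes the command fail instead of being silently deleted.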

Copy directory structure ignoring some files

I have a directory structure
./
└── file1
    ├── config.xml
    ├── config.yml
    └── file2
        ├── config.xml
        ├── config.yml
        └── file3
            ├── config.xml
            └── config.yml
What I want is to copy the same directory structure and everything in it to a new location,
but ignoring the config.yml files. Any Linux command or
script would do. Thanks in advance.
As I understand it, you just want to replicate the folder structure.
You could do something like:
find . -type d >dirs.txt
to create the list of directories, then
xargs mkdir -p <dirs.txt
to create the directories on the destination.
Take a look at this https://stackoverflow.com/a/4073992/6916391
You can do it in two steps: copy the whole structure, then remove the config.yml files from the copy:
cp -R old_structure_parent_dir new_structure_parent_dir
find new_structure_parent_dir -name config.yml -exec rm -rf {} \;
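A one-pass alternative, different from the copy-then-delete approach above, is to stream the tree through tar and leave the unwanted files out with --exclude. This is a sketch assuming GNU tar; the directory names old and new and the scratch location are hypothetical stand-ins for the real source and destination.

```shell
work=$(mktemp -d)
mkdir -p "$work/old/file1/file2/file3" "$work/new"
for d in file1 file1/file2 file1/file2/file3; do
    touch "$work/old/$d/config.xml" "$work/old/$d/config.yml"
done

# --exclude comes before the "." operand; an unanchored pattern
# matches config.yml at any depth.
tar -cf - -C "$work/old" --exclude='config.yml' . | tar -xf - -C "$work/new"

find "$work/new" -type f | sort
```

The config.yml files are never written to the destination, so there is no window in which they exist in the copy.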

Pinpoint archive file from a list of archive files where the target file is zipped

I have a directory structure like this -
./Archive1
./Archive1/Archive2
./Archive1/Archive3
In each directory there are many tar files. For example (the names are only placeholders):
Archive1
├── Archive2
│   ├── tenth.tar.gz
│   └── third.tar.gz
├── Archive3
│   ├── fourth.tar.gz
│   └── sixth.tar.gz
├── fifth.tar.gz
├── first.tar.gz
└── second.tar.gz
Now I have a file, file.txt, that could reside in any of the tar files. I need a command that tells me which tar file contains my input file (file.txt), along with the absolute path of that tar file.
So for example, if file.txt is in sixth.tar.gz, the output will be sixth.tar.gz and ./Archive1/Archive3/
Currently I have this command, but its drawback is that it lists all the tar files:
find . -maxdepth "3" -type f -name "*.tar.gz" -printf [%f]\\n -exec tar -tf {} \; | grep -iE "[\[]|file.txt"
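For each tar file, you can run tar | grep, and if there is a match, print the tar file's name. One way to do this is by running a shell command for each tar file. For a small number of files, and if performance is not too important, this might be good enough and it's fairly straightforward. The archive name is passed to the inner shell as "$1" rather than pasted into the script, so names with spaces or quotes are handled safely, and the "[" alternative from your grep pattern is dropped because no bracketed headers are printed anymore.
find . -maxdepth 3 -type f -name "*.tar.gz" -exec sh -c 'tar tf "$1" | grep -iq "file\.txt" && echo "$1"' sh {} \;
So for example, if file.txt is in sixth.tar.gz, the output will be ./Archive1/Archive3/sixth.tar.gz.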
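To see the approach in action, here is a small sketch that creates two throwaway archives and searches them. It assumes GNU find and GNU tar; the archive contents (other.txt, the src directory) and the stricter pattern, which matches file.txt only as a complete member name, are choices made for this demo and not part of the question.

```shell
work=$(mktemp -d)
mkdir -p "$work/Archive1/Archive2" "$work/Archive1/Archive3" "$work/src"
: > "$work/src/file.txt"
: > "$work/src/other.txt"
tar -czf "$work/Archive1/first.tar.gz" -C "$work/src" other.txt
tar -czf "$work/Archive1/Archive3/sixth.tar.gz" -C "$work/src" file.txt
cd "$work" || exit 1

# For each archive, list its members and print the archive path when a member
# is named file.txt (at the archive root or inside any directory).
found=$(find . -maxdepth 3 -type f -name '*.tar.gz' -exec sh -c '
    tar -tzf "$1" | grep -Eq "(^|/)file\.txt\$" && printf "%s\n" "$1"
' sh {} \;)

printf '%s\n' "$found"
```

If the directory and the file name are wanted separately, dirname and basename can be applied to each printed path.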

How I can pack into tar-file only last directory from path?

How can I pack only the last directory of a path into a tar file?
For example, I have several paths:
/usr/local/files/dir1/
    file1.txt
    file2.txt
    file3.txt
/usr/local/files/dir2/
    file3.txt
    file4.txt
    file5.txt
If I run the command:
tar czf my_arch.tar.gz -C /usr/local/files/dir1 .
I get only the contents of dir1, without the directory itself.
So the archive contains my_arch.tar.gz/file1.txt, file2.txt, file3.txt, but I need this structure inside my archive:
my_arch.tar.gz/dir1/file1.txt, file2.txt, file3.txt
How can I do this?
Thank you.
Try:
cd /usr/local/files
tar -cvzf my_arch.tar.gz dir1
The -C option makes tar change into dir1, so it archives the directory's contents rather than the directory itself:
-C, --directory DIR
change to directory DIR
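You cannot do this with a single tar option, but you can split the path into its parent and its last component and pass the parent to -C.
Here's my suggestion:
#!/bin/bash
mydir=/my_dir/whit/long_and/complicated_path/the_stuff_is_here
# parent directory and last path component, quoted so spaces survive
dirname=$(dirname "$mydir")
basename=$(basename "$mydir")
tar cvf "/tmp/$basename.tar" -C "$dirname" "$basename"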
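The same split can be tried safely in a scratch directory. The sketch below mirrors the question's layout under a temporary root instead of the real /usr/local/files; the variable names target, parent and last are only for illustration.

```shell
work=$(mktemp -d)
mkdir -p "$work/usr/local/files/dir1"
touch "$work/usr/local/files/dir1/file1.txt" "$work/usr/local/files/dir1/file2.txt"

# Hypothetical path standing in for /usr/local/files/dir1.
target="$work/usr/local/files/dir1"
parent=$(dirname "$target")
last=$(basename "$target")

# Change into the parent and archive only the last component by name.
tar -czf "$work/my_arch.tar.gz" -C "$parent" "$last"

listing=$(tar -tzf "$work/my_arch.tar.gz")
printf '%s\n' "$listing"
```

Every member name in the listing should begin with dir1/, which is the structure the question asks for.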
$ tar vczf tmp/export/files.tar.gz -C tmp/export src
Structure of files.tar.gz:
src
src/app
src/app/main.js
src/app/util
src/app/util/runtime.js
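If I understand your question correctly, you want your tar file to contain the directory itself.
Try it without the -C flag, as in:
tar -czf my_arch.tar.gz /usr/local/files/dir1
Note that GNU tar then stores the members under usr/local/files/dir1/ (it strips only the leading /), not under dir1/ alone.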
If you specify -C, then your directory path inside the archive is ./. Probably the following works the way you want:
$ touch asdf/foo/bar/{1,2,3}
$ tree asdf/
asdf/
└── foo
    └── bar
        ├── 1
        ├── 2
        └── 3
2 directories, 3 files
$ tar -cv -C asdf/foo/bar/ -f asdf.tar ./
./
./3
./2
./1
$ tar tf asdf.tar
./
./3
./2
./1

Resources