Pinpoint archive file from a list of archive files where the target file is zipped - linux

I have a directory structure like this -
./Archive1
./Archive1/Archive2
./Archive1/Archive3
In each directory there are many tar files. Say for example (don't go with the name, they are just for example) -
Archive1
├── Archive2
│ ├── tenth.tar.gz
│ └── third.tar.gz
├── Archive3
│ ├── fourth.tar.gz
│ └── sixth.tar.gz
├── fifth.tar.gz
├── first.tar.gz
└── second.tar.gz
Now I have a file, file.txt, that could reside in any of the tar files. I need a command that tells me which tar file contains my input file (file.txt), along with the absolute path of that tar file.
For example, if file.txt is in sixth.tar.gz, the output should be sixth.tar.gz and ./Archive1/Archive3/
Currently I have this command, but its drawback is that it lists all the tar files -
find . -maxdepth "3" -type f -name "*.tar.gz" -printf [%f]\\n -exec tar -tf {} \; | grep -iE "[\[]|file.txt"

For each tar file, you can run tar | grep, and if there is a match, print the tar file's name. One way to do this is by running a shell command for each tar file. For a small number of files, and if performance is not too important, this might be good enough and it's fairly straightforward.
find . -maxdepth 3 -type f -name "*.tar.gz" -exec sh -c 'tar tf "$1" | grep -qi "file\.txt" && echo "$1"' _ {} \;
So for example if file.txt is in sixth.tar.gz, the output will be ./Archive1/Archive3/sixth.tar.gz.
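If many archives have to be scanned, the same idea can be wrapped in a small function that handles several archives per sh invocation and matches the member's base name exactly. This is only a sketch, assuming GNU find and tar; the function name find_in_tars and the demo paths are made up for illustration:

```shell
#!/bin/sh
# find_in_tars ROOT NAME: print every *.tar.gz under ROOT that has a
# member whose base name is exactly NAME. (Hypothetical helper.)
find_in_tars() {
    find "$1" -type f -name '*.tar.gz' -exec sh -c '
        name=$1; shift
        for archive do
            # Strip leading directories from each member, then match whole lines.
            if tar -tzf "$archive" | sed "s#.*/##" | grep -xF -e "$name" >/dev/null
            then
                printf "%s\n" "$archive"
            fi
        done' _ "$2" {} +
}

# Demo in a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/Archive1/Archive3" "$demo/src"
touch "$demo/src/file.txt" "$demo/src/other.txt"
tar -czf "$demo/Archive1/first.tar.gz" -C "$demo/src" other.txt
tar -czf "$demo/Archive1/Archive3/sixth.tar.gz" -C "$demo/src" file.txt
find_in_tars "$demo" file.txt    # prints the path of sixth.tar.gz
rm -rf "$demo"
```

The archive paths reach sh as positional parameters rather than being pasted into the command string, so names with spaces or quotes should not break the command.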

Related

Change permission of a directory and subdirectories with xargs and chmod commands in linux

I have a list of directories in the current directory named after their permission codes (for example: 552, 700, 777). I want to take the permission code from each directory's name and apply it to that directory and all the files it contains.
I tried with the xargs command:
find . -name "[0-9][0-9][0-9]" -type d | xargs chmod -R [0-9][0-9][0-9]
The problem with this command is that it takes the first directory and applies its permission code to all the directories.
├── 555
│   └── logs
│       ├── 01.log
│       ├── 02.log
│       ├── 03.log
│       ├── 04.log
│       ├── 05.log
│       ├── 06.log
│       └── 07.log
├── 700
│   └── data
│       └── data1.data
What I want: for the 555 directory, all of its files and subdirectories should get permission code 555, and for the 700 directory, all of its files and subdirectories should get permission code 700.
What my command does: it changes all the other files and subdirectories to the permission code of the first directory, 555.
Try
find . -name "[0-9][0-9][0-9]" -type d | sed 's#\./\(.*\)#\1#' | xargs -I{} chmod -R {} {}
The find is the same as yours.
The sed is added to remove the ./ from the directory name, since find returns ./700, ./555, ...
xargs with -I replaces every {} in the command with the line it received, so it runs "chmod -R DIRNAME DIRNAME": chmod -R 700 700 and so on.
In your attempt, xargs chmod -R [0-9][0-9][0-9], there is nothing to link the [0-9] in the find to the [0-9] in xargs.
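To preview what xargs will actually run, echo can be put in front of chmod. The sketch below feeds two sample names (standing in for find's output) through the same sed and xargs stages, so nothing on disk is changed:

```shell
#!/bin/sh
# Feed two sample names through the same sed and xargs stages, with echo
# in front of chmod so the commands are printed instead of executed.
printf '%s\n' ./700 ./555 |
    sed 's#\./\(.*\)#\1#' |
    xargs -I{} echo chmod -R {} {}
# prints:
#   chmod -R 700 700
#   chmod -R 555 555
```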
Without xargs
find . -type d -regextype sed -regex ".*/[0-9]\{3\}$"| awk -F"/" '{print "chmod -R " $NF,$0}'|sh
find:
find . -type d -name '[0-7][0-7][0-7]' \
-exec sh -c 'for i do chmod -R "${i##*/}" "$i"; done' _ {} +
or a bash loop:
shopt -s globstar
for i in **/[0-7][0-7][0-7]/; do
    i=${i%/}
    chmod -R "${i##*/}" "$i"
done

Give out parent folder name if not containing a certain file

I am looking for a Linux terminal command that prints the name of each parent folder that does not contain a certain file.
So far I use the following command:
find . -type d -exec test -e '{}'/recon-all.done \; -print| wc -l
which gives me the number of folders that contain the file.
The file recon-all.done would be in /subject/../../recon-all.done and I need the name of every "subject" that does not contain the recon-all.done file.
Loop through the directories, test for the existence of the file, and print the directory if the test fails.
for subject in */; do
if ! [ -e "${subject}scripts/recon-all.done" ]
then echo "$subject"
fi
done
Your command:
find . -type d -exec test -e '{}'/recon-all.done \; -print| wc -l
almost does the job. We just need to:
Remove | wc -l, so that the directory paths are shown instead of a count.
Negate the -exec test by adding a ! like so:
find . -type d \! -exec test -e '{}'/recon-all.done \; -print
This way find will show each folder name if it does not contain the recon-all.done file.
Note: based on your comment on Barmar's answer, I've added -maxdepth 1 in the example below to prevent deeper directories from being checked.
Small example from my local machine:
$ /tmp/test$ tree
.
├── a
│   └── test.xt
├── b
├── c
│   └── test.xt
└── x
    ├── a
    │   └── test.xt
    └── b
6 directories, 3 files
$ /tmp/test$ find . -maxdepth 1 -type d \! -exec test -e '{}/test.xt' \; -print
.
./b
./x
$ /tmp/test$
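In the output above, . itself is listed because the starting directory also lacks the file. If that is unwanted, -mindepth 1 skips it. A small sketch, assuming GNU find; the helper name dirs_missing_file is hypothetical, and the existence test is run through sh so that {} is passed as its own argument:

```shell
#!/bin/sh
# dirs_missing_file ROOT NAME: list the immediate subdirectories of ROOT
# that do not contain an entry called NAME. (Hypothetical helper.)
dirs_missing_file() {
    find "$1" -mindepth 1 -maxdepth 1 -type d \
        ! -exec sh -c 'test -e "$1/$2"' _ {} "$2" \; -print
}

# Demo in a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/a" "$demo/b" "$demo/c"
touch "$demo/a/test.xt" "$demo/c/test.xt"
dirs_missing_file "$demo" test.xt    # prints only the path of b
rm -rf "$demo"
```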

How to copy folders and subfolders which have selected files

I have a directory oridir with structure as follows:
.
├── DIRA
│   ├── DIRA1
│   │   └── file2.txt
│   └── DIRA2
│       ├── file1.xls
│       └── file1.txt
└── DIRB
    ├── DIRB1
    │   └── file1.txt
    └── DIRB2
        └── file2.xls
I have to copy files which have extension .xls while maintaining the directory structure. So I need to have following directory and files in a newdir folder:
.
├── DIRA
│   └── DIRA2
│       └── file1.xls
└── DIRB
    └── DIRB2
        └── file2.xls
I tried following command but it copies all files and folders:
cp -r oridir newdir
Finding required files can be done as follows:
$ find oridir | grep xls$
oridir/DIRB/DIRB2/file2.xls
oridir/DIRA/DIRA2/file1.xls
Also as follows:
$ find oridir -type f -iname "*.xls"
oridir/DIRB/DIRB2/file2.xls
oridir/DIRA/DIRA2/file1.xls
But how do I create these folders and copy the files? How can I achieve this selective creation of directories and copying of files with bash in Linux?
Edit: There are also spaces in some file and directory names.
cp's --parents flag makes cp use the full source file name under DIRECTORY, that is, it recreates each source file's path beneath the destination directory.
For example, if recursive glob ** is enabled (shopt -s globstar):
cp --parents origin/**/*.xls target
If recursive glob is not enabled, you have to add a wildcard for each level on directory hierarchy:
cp --parents origin/*/*/*.xls target
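If the destination directory is "dest", save the following as foo.sh:
#!/bin/bash
dest=./dest
# IFS= and -r keep leading spaces and backslashes in file names intact;
# the -path test keeps find from re-copying files already under dest.
find . -type f -name "*.xls" ! -path "${dest}/*" | while IFS= read -r f
do
    d=$(dirname "${f}")
    d="${dest}/${d}"
    mkdir -p "${d}"
    cp "${f}" "${d}"
done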
Make dirs and files.
$ mkdir -p DIRA/DIRA1
$ mkdir -p DIRA/DIRA2
$ mkdir -p DIRB/DIRB1
$ mkdir -p DIRB/DIRB2
$ touch DIRA/DIRA1/file2.txt
$ touch DIRA/DIRA2/file1.xls
$ touch DIRA/DIRA2/file1.txt
$ touch DIRB/DIRB1/file1.txt
$ touch DIRB/DIRB1/file2.xls
A result is
$ find dest
dest
dest/DIRB
dest/DIRB/DIRB1
dest/DIRB/DIRB1/file2.xls
dest/DIRA
dest/DIRA/DIRA2
dest/DIRA/DIRA2/file1.xls
See Yuji's excellent answer first, but I think tar is also a good option here:
cd oridir; find . -name "*.xls" | xargs tar c | (cd ../newdir; tar x)
You may need to adjust oridir and/or ../newdir depending on the precise paths of your directories.
Possible improvement: here is a version that handles files and paths with spaces (or other unusual characters) in their names, and that uses tar's own options instead of xargs and cd:
cd oridir; find . -name "*.xls" -print0 | tar -c --null -T- | tar -C ../newdir -x
Explanation:
The -print0 and the --null cause the respective commands to separate filenames by the null (ASCII 0) character only.
-T- causes tar to read filenames from standard input.
-C causes tar to cd before extracting.
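The null-delimited tar pipeline can be tried out safely in a scratch directory. The following is a sketch, assuming GNU find and GNU tar; the function name copy_matching and the sample names with spaces are invented for the demonstration:

```shell
#!/bin/sh
# copy_matching SRC DEST PATTERN: copy files under SRC whose names match
# PATTERN into DEST, keeping their relative paths. (Hypothetical helper.)
copy_matching() {
    mkdir -p "$2"
    dest_abs=$(cd "$2" && pwd)    # absolute, because we cd into SRC next
    (
        cd "$1" &&
        find . -type f -name "$3" -print0 |
            tar -cf - --null -T - |
            tar -xf - -C "$dest_abs"
    )
}

# Demo with spaces in directory and file names.
demo=$(mktemp -d)
mkdir -p "$demo/oridir/DIR A/DIR A2"
touch "$demo/oridir/DIR A/DIR A2/file 1.xls" "$demo/oridir/DIR A/DIR A2/file 1.txt"
copy_matching "$demo/oridir" "$demo/newdir" '*.xls'
find "$demo/newdir" -type f    # lists only "file 1.xls" under newdir/DIR A/DIR A2
rm -rf "$demo"
```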

Copy files and preserving directory structure

Here's what I have to do: Find all files which are in the directory src (or in its subdirectories) and have str in their name, and copy them to dest, preserving the subdirectory structure. For example, I have the directory dir1 which contains foo.txt and the subdirectory subdir which also contains foo.txt. After running my script (with str=txt and dest=dir2), dir2 should contain foo.txt and subdir/foo.txt. So far I have come up with this code:
while read -r line; do
cp --parents $line $dest
done <<< "$(find $src -name "*$str*")"
which almost does the job except that it creates dir1 inside of dir2 and the desired files are inside dir2/dir1. I also tried doing it with the -exec option of find but didn't get better results.
IIUC, this can be done with find ... -exec. Let's say we have the following directory:
$ tree
.
└── src
    ├── dir1
    │   └── yet_another_file_src
    └── file_src
2 directories, 2 files
We can copy all files that contain *src* to /tmp/copy-here like this:
$ find . -type f -name "*src*" -exec sh -c 'echo mkdir -p /tmp/copy-here/$(dirname {})' \; -exec sh -c 'echo cp {} /tmp/copy-here/$(dirname {})' \;
mkdir -p /tmp/copy-here/./src
cp ./src/file_src /tmp/copy-here/./src
mkdir -p /tmp/copy-here/./src/dir1
cp ./src/dir1/yet_another_file_src /tmp/copy-here/./src/dir1
Notice that I used echo instead of really running this command -
read the output and make sure that this is what you want to
achieve. If you're sure that this would be what you want just remove
echo like this:
$ find . -type f -name "*src*" -exec sh -c 'mkdir -p /tmp/copy-here/$(dirname {})' \; -exec sh -c 'cp {} /tmp/copy-here/$(dirname {})' \;
$ tree /tmp/copy-here
/tmp/copy-here
└── src
    ├── dir1
    │   └── yet_another_file_src
    └── file_src
2 directories, 2 files
EDIT:
And of course, you can always use rsync:
$ rsync -avz --include "*/" --include="*src*" --exclude="*" "$PWD" /tmp/copy-here
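The extra dir1 level described in the question appears because cp --parents reproduces the path exactly as find prints it, including the src prefix. One way around that is to run find from inside src, so the printed paths are relative to it. A sketch under that assumption, relying on GNU cp's --parents and -t options; copy_by_name is a made-up name:

```shell
#!/bin/sh
# copy_by_name SRC DEST STR: copy files under SRC whose names contain STR
# into DEST, keeping paths relative to SRC. (Hypothetical helper.)
copy_by_name() {
    mkdir -p "$2"
    dest_abs=$(cd "$2" && pwd)    # absolute, because we cd into SRC next
    (
        cd "$1" &&
        find . -type f -name "*$3*" -exec cp --parents -t "$dest_abs" {} +
    )
}

# Demo mirroring the question: dir1/foo.txt and dir1/subdir/foo.txt.
demo=$(mktemp -d)
mkdir -p "$demo/dir1/subdir"
touch "$demo/dir1/foo.txt" "$demo/dir1/subdir/foo.txt"
copy_by_name "$demo/dir1" "$demo/dir2" txt
find "$demo/dir2" -type f    # lists dir2/foo.txt and dir2/subdir/foo.txt
rm -rf "$demo"
```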

How to exclude specific files with the tar command?

Let's assume I have the following directory structure
dir/
├── subdir
│   ├── dir
│   │   └── TODO.txt
│   └── TODO.txt
└── TODO.txt
I wish to bundle dir/ recursively into a tarball with the command tar on Linux, but I want to exclude the root TODO.txt. How can I specify this with a relative path ?
Attempt #1
tar -czf dir.tar.gz dir/ --exclude='TODO.txt'
Doesn't work : it gets rid of every TODO.txt in the resulting tarball :
dir/
└── subdir
    └── dir
Attempt #2
tar -czf dir.tar.gz dir/ --exclude='dir/TODO.txt'
This also fails, because the dir subdirectory is also matched by this pattern. The resulting tarball hence contains
dir/
└── subdir
    ├── dir
    └── TODO.txt
Is there any way I can specify exactly that I want to exclude the root TODO.txt with a relative path ?
Instead of using dir/ to name your transfer, cd into dir, then name it as . (a single dot). The folder name . will never appear later in any file's path, so it serves as a robust anchor. Then use --transform= to have the paths in the archive begin with dir/.
Demonstration
without filter:
$ tar -czf dir.tar.gz -v dir
dir/
dir/TODO.txt
dir/subdir/
dir/subdir/TODO.txt
dir/subdir/dir/
dir/subdir/dir/TODO.txt
cd into dir, name it as .:
$ tar -czf dir.tar.gz -v -C dir .
./
./TODO.txt
./subdir/
./subdir/TODO.txt
./subdir/dir/
./subdir/dir/TODO.txt
exclude, anchoring on .:
$ tar -czf dir.tar.gz -v -C dir --exclude='./TODO.txt' .
./
./subdir/
./subdir/TODO.txt
./subdir/dir/
./subdir/dir/TODO.txt
change . back to dir inside the archive (--show-transformed-names makes tar show the names as they go into the archive):
$ tar -czf dir.tar.gz -v -C dir --exclude='./TODO.txt' --transform='s/^\./dir/g' --show-transformed-names .
dir/
dir/subdir/
dir/subdir/TODO.txt
dir/subdir/dir/
dir/subdir/dir/TODO.txt
From @arkascha's answer:
find dir/ -type f | grep -v '^dir/TODO\.txt$' > files.txt
then
tar -czf dir.tar.gz -T files.txt
In the first line, there are two tricks to pay attention to:
The -type f option. Without it, directories are included in find's result. This is bad, because tar archives each listed directory recursively, so each file would be included as many times as its depth in the file hierarchy.
The ^ in grep's regex: it anchors the pattern at the beginning of the path, so only the root TODO.txt is excluded.
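GNU tar also has an --anchored option that makes exclude patterns match only from the start of a member name, which may be enough here without changing directory or building a file list. This is an alternative not taken from the answers above, sketched on the assumption that GNU tar is in use; the scratch directory is only for the demonstration:

```shell
#!/bin/sh
# Rebuild the example tree in a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/dir/subdir/dir"
touch "$demo/dir/TODO.txt" "$demo/dir/subdir/TODO.txt" "$demo/dir/subdir/dir/TODO.txt"

# --anchored is given before --exclude so that it applies to that pattern;
# the pattern then only matches names that start with dir/TODO.txt.
tar -czf "$demo/dir.tar.gz" -C "$demo" --anchored --exclude='dir/TODO.txt' dir

tar -tzf "$demo/dir.tar.gz"    # member listing, for checking the result
rm -rf "$demo"
```

Listing the archive with tar -tzf before relying on it is a cheap way to confirm that only the root TODO.txt was left out.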

Resources