How to copy folders and subfolders which have selected files - linux

I have a directory oridir with structure as follows:
.
├── DIRA
│   ├── DIRA1
│   │   └── file2.txt
│   └── DIRA2
│       ├── file1.xls
│       └── file1.txt
└── DIRB
    ├── DIRB1
    │   └── file1.txt
    └── DIRB2
        └── file2.xls
I have to copy the files with the .xls extension while maintaining the directory structure, so I need the following directories and files in a newdir folder:
.
├── DIRA
│   └── DIRA2
│       └── file1.xls
└── DIRB
    └── DIRB2
        └── file2.xls
I tried the following command, but it copies all files and folders:
cp -r oridir newdir
Finding required files can be done as follows:
$ find oridir | grep xls$
oridir/DIRB/DIRB2/file2.xls
oridir/DIRA/DIRA2/file1.xls
Also as follows:
$ find oridir -type f -iname '*.xls'
./oridir/DIRB/DIRB2/file2.xls
./oridir/DIRA/DIRA2/file1.xls
But how do I create these folders and copy the files? How can I achieve this selective creation of directories and copying of files with bash in Linux?
Edit: There are also spaces in some file and directory names.

cp's --parents flag (a GNU cp option) makes it use the full source file name under DIRECTORY, so each source path is recreated inside the target.
For example, if recursive glob ** is enabled (shopt -s globstar):
cp --parents origin/**/*.xls target
If recursive glob is not enabled, you have to add a wildcard for each level on directory hierarchy:
cp --parents origin/*/*/*.xls target
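If the depth of the hierarchy is not known in advance, or names contain spaces, find can hand each match to cp --parents instead of relying on globs. This is only a sketch, assuming GNU cp and find; copy_xls_tree is a made-up helper name, and the function body runs in a subshell so the cd does not leak into your session.

```shell
# Hypothetical helper: copy only *.xls files from $1 into $2,
# recreating each file's relative path under the destination.
copy_xls_tree() (
    src=$1
    mkdir -p "$2" || exit 1
    # Resolve the destination to an absolute path, because we cd into src below.
    dest=$(cd "$2" && pwd) || exit 1
    cd "$src" || exit 1
    # find passes each path as a single argument, so spaces in names are safe.
    find . -type f -iname "*.xls" -exec cp --parents {} "$dest" \;
)
```

Usage would be something like copy_xls_tree oridir newdir.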

Assuming the destination directory is "dest", save the following as foo.sh:
#!/bin/bash
dest=./dest
# IFS= and -r keep leading/trailing spaces and backslashes in file names intact
find . -type f -name "*.xls" | while IFS= read -r f
do
d=$(dirname "${f}")
d="${dest}/${d}"
mkdir -p "${d}"
cp "${f}" "${d}"
done
Create the test directories and files:
$ mkdir -p DIRA/DIRA1
$ mkdir -p DIRA/DIRA2
$ mkdir -p DIRB/DIRB1
$ mkdir -p DIRB/DIRB2
$ touch DIRA/DIRA1/file2.txt
$ touch DIRA/DIRA2/file1.xls
$ touch DIRA/DIRA2/file1.txt
$ touch DIRB/DIRB1/file1.txt
$ touch DIRB/DIRB1/file2.xls
The result is:
$ find dest
dest
dest/DIRB
dest/DIRB/DIRB1
dest/DIRB/DIRB1/file2.xls
dest/DIRA
dest/DIRA/DIRA2
dest/DIRA/DIRA2/file1.xls
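One thing to watch with the script above: because dest lives under the directory being searched, a second run would also find the .xls files already copied into dest. A possible variation, as a sketch only, prunes the destination from the search and lets find pass the names directly so no read loop is needed. copy_matching is a hypothetical name, and it assumes dest is given as a plain relative name such as "dest".

```shell
# Hypothetical helper: run from the source directory, e.g. copy_matching dest
copy_matching() (
    dest=$1
    # -prune keeps find out of the destination, so re-runs do not copy
    # dest into itself.
    find . -path "./$dest" -prune -o -type f -name "*.xls" -exec sh -c '
        dest=$1; shift
        for f do
            d=$dest/$(dirname "$f")
            mkdir -p "$d" && cp "$f" "$d"
        done
    ' _ "$dest" {} +
)
```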

See Yuji's excellent answer first, but I think tar is also a good option here:
cd oridir; find . -name "*.xls" | xargs tar c | (cd ../newdir; tar x)
You may need to adjust oridir and/or ../newdir depending on the precise paths of your directories.
Possible improvement: Here is a version that may be better in that it will handle files (and paths) with spaces (or other strange characters) in their names, and that uses tar's own options instead of xargs and cd:
cd oridir; find . -name "*.xls" -print0 | tar -c --null -T- | tar -C ../newdir -x
Explanation:
The -print0 and the --null cause the respective commands to separate filenames by the null (ASCII 0) character only.
-T- causes tar to read filenames from standard input.
-C causes tar to cd before extracting.
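For what it's worth, here is the same idea wrapped in a function, as a sketch assuming GNU tar and find. It spells out -f - on both sides so the archive goes through the pipe even if tar's default archive device is not standard input/output on your system. copy_with_tar is a made-up name.

```shell
# Hypothetical helper: copy only *.xls files from $1 into $2 via a tar pipe.
copy_with_tar() (
    src=$1
    mkdir -p "$2" || exit 1
    # Absolute destination path, because we cd into src below.
    dest=$(cd "$2" && pwd) || exit 1
    cd "$src" || exit 1
    # -print0 comes after the -name test so only matching files are printed;
    # --null must precede -T so the list is read as NUL-separated.
    find . -type f -name "*.xls" -print0 |
        tar -c --null -T - -f - |
        tar -x -C "$dest" -f -
)
```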

Related

Change permission of a directory and subdirectories with xargs and chmod commands in linux

I have a list of directories in the current directory named after their permission codes (for example: 552, 700, 777). I want to take the permission code from each directory's name and apply it to that directory and all the files it contains.
I tried with the xargs command :
find . -name "[0-9][0-9][0-9]" -type d | xargs chmod -R [0-9][0-9][0-9]
The problem with this command is that it takes the first directory and applies its permission code to all the directories.
├── 555
│   └── logs
│       ├── 01.log
│       ├── 02.log
│       ├── 03.log
│       ├── 04.log
│       ├── 05.log
│       ├── 06.log
│       └── 07.log
├── 700
│   └── data
│       └── data1.data
What I want: for the 555 directory, I want to set permission code 555 on all of its files and subdirectories, and for the second directory, I want to set permission code 700 on all of its files and subdirectories.
What my command does: it changes all the other files and subdirectories to the permission code of the first file, 500.
Try
find . -name "[0-9][0-9][0-9]" -type d | sed 's#\./\(.*\)#\1#' | xargs -I{} chmod -R {} {}
the find is the same as yours.
the sed is added to remove the ./ from the directory name. Find returns ./700, ./555, ...
xargs with -I uses {} to reuse what it received into the command. So it says "chmod -R DIRNAME DIRNAME". So chmod -R 700 700 and so on.
In your attempt, xargs chmod -R [0-9][0-9][0-9], there is nothing to link the [0-9] in the find to the [0-9] in xargs.
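The sed step can probably be dropped if find prints just the directory name. The following is a sketch under a few assumptions: GNU find and xargs, mode-named directories sitting directly under the base directory, and octal digits only (0-7). apply_named_modes is a hypothetical name.

```shell
# Hypothetical helper: for each directory under $1 (default .) whose name
# is a three-digit octal mode, apply that mode to it recursively.
apply_named_modes() (
    cd "${1:-.}" || exit 1
    # %f prints the name without the leading ./ and \0 terminates it with NUL,
    # so xargs -0 receives exactly one name per item.
    find . -maxdepth 1 -type d -name "[0-7][0-7][0-7]" -printf "%f\0" |
        xargs -0 -I{} chmod -R {} ./{}
)
```

With -I, xargs runs one chmod per directory, which is what ties each mode to its own directory.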
Without xargs
find . -type d -regextype sed -regex ".*/[0-9]\{3\}$"| awk -F"/" '{print "chmod -R " $NF,$0}'|sh
With find -exec:
find . -type d -name '[0-7][0-7][0-7]' \
-exec sh -c 'for i do chmod -R "${i##*/}" "$i"; done' _ {} +
or a bash loop:
shopt -s globstar
for i in **/[0-7][0-7][0-7]/; do
    i=${i%/}
    chmod -R "${i##*/}" "$i"
done

Linux command to move all the files from var/A to var/

I guess this question has already been asked here, but I couldn't find it.
I am in the directory var/ and I have a folder var/A with some files inside. I want to move the files inside A to var/, so what I want to do is the following:
from:
var
├── A
| ├── file1.txt
| └── file2.txt
└── file3.txt
to:
var
├── file1.txt
├── file2.txt
└── file3.txt
In /var I have tried the following commands:
sudo mv A /
sudo mv A/ ./
sudo mv A/ .
None of them worked. Thank you in advance.
I have found the answer:
mv A/* .
I hope it helps someone else.
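One caveat with mv A/* . is that * does not match hidden files (names starting with a dot) by default, so those would stay behind in A. A possible alternative, sketched here assuming GNU find and mv, moves every top-level entry and then removes the emptied directory. flatten_dir is a made-up name.

```shell
# Hypothetical helper: move everything directly inside $1 into its parent
# directory, then remove the (now empty) directory.
flatten_dir() (
    dir=${1%/}
    parent=$(dirname "$dir")
    # -mindepth 1 skips the directory itself; -maxdepth 1 stays at its top level.
    # mv -n refuses to overwrite existing files; rmdir then fails if anything
    # was left behind, so nothing is deleted by accident.
    find "$dir" -mindepth 1 -maxdepth 1 -exec mv -n -t "$parent" {} + &&
        rmdir "$dir"
)
```

From inside var this would be called as flatten_dir A.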
You have to specify the right path or a matching pattern.
If you want to move all the files or directories inside A, use a wildcard (*), like this:
user@akuma:~/var$ mv ./A/* ./
You can run mv --help in your terminal to see more options:
user@akuma:~/WORKSPACE/Trash$ mv --help
Usage: mv [OPTION]... [-T] SOURCE DEST
or: mv [OPTION]... SOURCE... DIRECTORY
or: mv [OPTION]... -t DIRECTORY SOURCE...
Rename SOURCE to DEST, or move SOURCE(s) to DIRECTORY.
Mandatory arguments to long options are mandatory for short options too.
--backup[=CONTROL] make a backup of each existing destination file
-b like --backup but does not accept an argument
-f, --force do not prompt before overwriting
-i, --interactive prompt before overwrite
-n, --no-clobber do not overwrite an existing file
If you specify more than one of -i, -f, -n, only the final one takes effect.
--strip-trailing-slashes remove any trailing slashes from each SOURCE
argument
-S, --suffix=SUFFIX override the usual backup suffix
-t, --target-directory=DIRECTORY move all SOURCE arguments into DIRECTORY
-T, --no-target-directory treat DEST as a normal file
-u, --update move only when the SOURCE file is newer
than the destination file or when the
destination file is missing
-v, --verbose explain what is being done
-Z, --context set SELinux security context of destination
file to default type
--help display this help and exit
--version output version information and exit
The backup suffix is '~', unless set with --suffix or SIMPLE_BACKUP_SUFFIX.
The version control method may be selected via the --backup option or through
the VERSION_CONTROL environment variable. Here are the values:
none, off never make backups (even if --backup is given)
numbered, t make numbered backups
existing, nil numbered if numbered backups exist, simple otherwise
simple, never always make simple backups
GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
Full documentation at: <http://www.gnu.org/software/coreutils/mv>
or available locally via: info '(coreutils) mv invocation'
I managed to replicate your situation.
.
└── var
    ├── A
    │   ├── file1.txt
    │   └── file2.txt
    └── file3.txt
2 directories, 3 files
user@host:~/soQ$ mv var/A/*.txt var/ && rm -rf var/A/
user@host:~/soQ$ tree
.
└── var
    ├── file1.txt
    ├── file2.txt
    └── file3.txt
Hope this helps!
If you want to do this from var just do mv A/*.txt . && rm -rf A/
Also, let's assume you have the following structure in your directory.
.
├── A
│   ├── file3.txt
│   └── folder1
│       ├── file1.txt
│       └── file2.txt
└── file4.txt
If you want to move all the files under A, including those in subfolders, into var, you need to use find like this:
find ./A -type f -exec mv {} "${PWD}" \; && rm -r A
This looks at all the files and folders under A, returns the path of each regular file (each .txt, for example), and executes mv on each one to move it into the current directory (${PWD}). Since A is a directory, it has to be removed with rm -r (plain rm would fail).

Pinpoint archive file from a list of archive files where the target file is zipped

I have a directory structure like this -
./Archive1
./Archive1/Archive2
./Archive1/Archive3
In each directory there are many tar files. Say for example (don't go with the name, they are just for example) -
Archive1
├── Archive2
│ ├── tenth.tar.gz
│ └── third.tar.gz
├── Archive3
│ ├── fourth.tar.gz
│ └── sixth.tar.gz
├── fifth.tar.gz
├── first.tar.gz
└── second.tar.gz
Now I have a file, file.txt, that could reside in any of the tar files. I need a command that tells me which tar file contains my input file (file.txt), along with the absolute path of that tar file.
So for example, if file.txt is in sixth.tar.gz, the output will be sixth.tar.gz and ./Archive1/Archive3/
Currently I have this command, but the drawback of the command is, it is listing all the tar files -
find . -maxdepth "3" -type f -name "*.tar.gz" -printf [%f]\\n -exec tar -tf {} \; | grep -iE "[\[]|file.txt"
For each tar file, you can run tar | grep, and if there is a match, print the tar file's name. One way to do this is by running a shell command for each tar file. For a small number of files, and if performance is not too important, this might be good enough and it's fairly straightforward.
find . -maxdepth 3 -type f -name "*.tar.gz" -exec sh -c 'tar tf "$1" | grep -iq "file\.txt" && echo "$1"' _ {} \;
So for example, if file.txt is in sixth.tar.gz, the output will be ./Archive1/Archive3/sixth.tar.gz.
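If you want an exact match on the member's file name rather than a substring match, something like the following might do. It is only a sketch: find_in_archives is a hypothetical name, and it compares the last path component of every archive member against the name you pass in.

```shell
# Hypothetical helper: print every *.tar.gz under $2 (default .) that
# contains a member whose file name is exactly $1.
find_in_archives() (
    name=$1
    root=${2:-.}
    find "$root" -type f -name "*.tar.gz" -exec sh -c '
        name=$1; shift
        for archive do
            # Strip leading directories from each member, then compare
            # the remaining file name literally (-F) and in full (-x).
            if tar -tf "$archive" | sed "s|.*/||" | grep -qxF -- "$name"; then
                printf "%s\n" "$archive"
            fi
        done
    ' _ "$name" {} +
)
```

For the layout in the question this would be run as find_in_archives file.txt ./Archive1.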

How to exclude specific files with the tar command?

Let's assume I have the following directory structure
dir/
├── subdir
│   ├── dir
│   │   └── TODO.txt
│   └── TODO.txt
└── TODO.txt
I wish to bundle dir/ recursively into a tarball with the command tar on Linux, but I want to exclude the root TODO.txt. How can I specify this with a relative path ?
Attempt #1
tar -czf dir.tar.gz dir/ --exclude='TODO.txt'
Doesn't work : it gets rid of every TODO.txt in the resulting tarball :
dir/
└── subdir
    └── dir
Attempt #2
tar -czf dir.tar.gz dir/ --exclude='dir/TODO.txt'
This also fails, because the dir subdirectory is also matched by this pattern. The resulting tarball hence contains
dir/
└── subdir
    ├── dir
    └── TODO.txt
Is there any way I can specify exactly that I want to exclude the root TODO.txt with a relative path ?
Instead of using dir/ to name what gets archived, cd into dir and refer to it as "." instead. The name . will never appear later in any file's path, so it serves as a robust anchor. Then use --transform= to make the paths in the archive begin with dir/ again.
Demonstration
without filter:
$ tar -czf dir.tar.gz -v dir
dir/
dir/TODO.txt
dir/subdir/
dir/subdir/TODO.txt
dir/subdir/dir/
dir/subdir/dir/TODO.txt
cd into dir, name it as .:
$ tar -czf dir.tar.gz -v -C dir .
./
./TODO.txt
./subdir/
./subdir/TODO.txt
./subdir/dir/
./subdir/dir/TODO.txt
exclude, anchoring on .:
$ tar -czf dir.tar.gz -v -C dir --exclude='./TODO.txt' .
./
./subdir/
./subdir/TODO.txt
./subdir/dir/
./subdir/dir/TODO.txt
change . back to dir inside the archive (--show-transformed-names makes tar show the names as they go into the archive):
$ tar -czf dir.tar.gz -v -C dir --exclude='./TODO.txt' --transform='s/^\./dir/g' --show-transformed-names .
dir/
dir/subdir/
dir/subdir/TODO.txt
dir/subdir/dir/
dir/subdir/dir/TODO.txt
From @arkascha's answer:
find dir/ -type f | grep -v '^dir/TODO\.txt$' > files.txt
then
tar -czf dir.tar.gz -T files.txt
In the first line, there are two tricks to pay attention to:
The -type f option. Without it, directories are included in find's output, and since tar archives a listed directory recursively, each file would be added as many times as its depth in the file hierarchy.
The ^ in grep's regex: it ensures that the pattern is only excluded when it matches from the beginning of the file hierarchy.
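The intermediate files.txt and the grep can probably be avoided by letting find do the exclusion and piping a NUL-separated list straight into tar. A sketch, assuming GNU tar and find; archive_without_root_file is a made-up name, and the directory name is assumed to contain no glob characters since -path treats its argument as a pattern.

```shell
# Hypothetical helper: archive directory $1 into $3, leaving out only the
# file named $2 that sits directly inside $1.
archive_without_root_file() (
    dir=${1%/}
    skip=$2
    out=$3
    # ! -path drops exactly dir/skip; deeper files with the same name have a
    # longer path and are kept. -type f avoids listing directories.
    find "$dir" -type f ! -path "$dir/$skip" -print0 |
        tar -czf "$out" --null -T -
)
```

For the example above: archive_without_root_file dir TODO.txt dir.tar.gz. As with the files.txt approach, empty directories are not archived.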

Recursively scan directories in bash

I need to write a bash script that scans the directories under the current directory and generates an MD5 checksum for each file in the directory tree. It should also keep the relative path to each file and write the checksums to a file.
For example if directory tree looks like this:
.
├── d
│   ├── file1.c
│   └── file2.c
├── e
│   └── file3.c
└── f
    └── file4.cpp
The output should be something like this:
d8e8fca2dc0f896fd7cb4cb0031ba249 d/file1.c
d8e8fca2dc0f896fd7cb4cb0031ba249 d/file2.c
d8e8fca2dc0f896fd7cb4cb0031ba249 e/file3.c
d8e8fca2dc0f896fd7cb4cb0031ba249 f/file4.cpp
But I can't find a way to keep the path to each file when I cd into the directories...
find . -type f -exec md5sum {} \;
or...
find . -type f | xargs -n 1 -d "\n" md5sum
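To also get the checksums into a file with paths like d/file1.c (without the leading ./), something along these lines might work. It is a sketch assuming GNU md5sum; write_checksums is a hypothetical name, and the output file should live outside the scanned tree so it does not end up in its own listing.

```shell
# Hypothetical helper: write "checksum  relative/path" lines for every file
# under directory $1 into the file $2.
write_checksums() (
    dir=$1
    out=$2
    # The redirection is opened before the cd, so a relative $out is
    # resolved against the caller's directory.
    ( cd "$dir" && find . -type f -exec md5sum {} + | sed "s|  \./|  |" ) > "$out"
)
```

The resulting file can later be checked from inside the scanned directory with md5sum -c.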

Resources