Rsync: Exclude directory contents, but include directory - linux

I know that a directory can be excluded with --exclude like this:
rsync -avz --exclude=dir/to/skip /my/source/path /my/backup/path
This will omit the directory dir/to/skip
However, I want to copy the directory itself but not its contents. Is there a one-liner with rsync to accomplish this?
Essentially, include dir/to/skip but exclude dir/to/skip/*
NOTE: I did search for this question. I found a lot of similar posts but not exactly this. Apologies if there is a dupe already.

The --exclude option takes a PATTERN, which means you should just be able to do this:
rsync -avz --exclude='dir/to/skip/*' /my/source/path /my/backup/path
Note that the PATTERN is quoted to prevent the shell from doing glob expansion on it.
Since dir/to/skip doesn't match the pattern dir/to/skip/*, it will be included.
Here's an example to show that it works:
> mkdir -p a/{1,2,3}
> find a -type d -exec touch {}/file \;
> tree --charset ascii a
a
|-- 1
| `-- file
|-- 2
| `-- file
|-- 3
| `-- file
`-- file
3 directories, 4 files
> rsync -r --exclude='/2/*' a/ b/
> tree --charset ascii b
b
|-- 1
| `-- file
|-- 2
|-- 3
| `-- file
`-- file
3 directories, 3 files
It is important to note that the leading / in the above PATTERN represents the root of the source directory, not the filesystem root. This is explained in the rsync man page. If you omit the leading slash, rsync will attempt to match the PATTERN from the end of each path. This could lead to excluding files unexpectedly. For example, suppose I have a directory a/3/2/ which contains a bunch of files that I do want to transfer. If I omit the leading / and do:
rsync -r --exclude='2/*' a/ b/
then the PATTERN will match both a/2/* and a/3/2/*, which is not what I wanted.
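The difference between the anchored and the unanchored pattern can be checked on a throwaway tree. This is only a minimal sketch: it assumes rsync is installed, and the scratch directory and the names b_anchored and b_unanchored are made up for the illustration.

```shell
# Skip the demo quietly if rsync is not available.
command -v rsync >/dev/null 2>&1 || { echo "rsync not installed" >&2; exit 0; }

# Scratch area; every name below is illustrative.
work=$(mktemp -d)
cd "$work" || exit 1

# Source tree: a/2/file should be skipped, a/3/2/file should be kept.
mkdir -p a/1 a/2 a/3/2
touch a/1/file a/2/file a/3/2/file

# Anchored pattern: only matches directory "2" at the root of the transfer.
rsync -r --exclude='/2/*' a/ b_anchored/

# Unanchored pattern: also matches the contents of a/3/2.
rsync -r --exclude='2/*' a/ b_unanchored/

# Compare the two results side by side.
find b_anchored b_unanchored | sort
```

In the anchored copy 3/2/file should still be present, while in the unanchored copy both 2/ and 3/2/ should arrive empty.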

Try:
rsync -avz --include='src/dir/to/skip' --exclude='src/dir/to/skip/*' src_dir dest_dir
--include=src/dir/to/skip includes the directory. --exclude=src/dir/to/skip/* excludes everything under the directory.

Related

Creating empty file in all subfolders with unique name - bash

Similar to this question
but I want to have file name be the same name as the directory with "_info.txt" appended to it.
I have tried this
#!/bin/bash
find . -mindepth 1 -maxdepth 1 -type d | while read line; do
touch /Users/xxx/git/xxx/$line/$line_info.txt
done
It is creating ".txt" in each subdirectory.
What am I missing ?
The crucial mistake is that the underscore character is a valid character for a bash variable name:
$ a=1
$ a_b=2
$ echo $a_b
2
There are several ways around this:
$ echo "$a"_b
1_b
$ echo $a'_b'
1_b
$ echo ${a}_b
1_b
As for your task, here's a fast way:
find . -mindepth 1 -maxdepth 1 -type d -printf "%p_info.txt\0" | xargs -0 touch
The find -printf prints the path of each directory, and printf's %p is unaffected by an underscore after it. Then, xargs passes the filenames as many arguments to few runs of touch, making the creation of the files much faster.
Figured it out:
#!/bin/bash
find . -mindepth 1 -maxdepth 1 -type d | while read line; do
touch /Users/xxxx/git/xxxx/$line/"$line"_info.txt
done
I think I can still do the "find ." better as I have to run in directory I want the files added to. Ideally it should be able to be run anywhere.
If you are not recursing at all, you don't need find. To loop over subdirectories, all you need is
for dir in */; do
touch /Users/xxxx/git/xxxx/"$dir/${dir%/}_info.txt"
done
Incidentally, notice that the variable needs to be in double quotes always.
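To make the loop runnable from anywhere, the base directory can be passed in instead of hard-coded. A minimal sketch, assuming a hypothetical helper named make_info_files (not from the original posts); note that a plain */ glob does not match hidden directories.

```shell
# make_info_files BASE: create BASE/<sub>/<sub>_info.txt for every
# immediate subdirectory of BASE (hypothetical helper name).
make_info_files() {
    base=${1:-.}
    for dir in "$base"/*/; do
        [ -d "$dir" ] || continue      # the glob matched nothing
        dir=${dir%/}                   # strip the trailing slash
        name=${dir##*/}                # last path component
        touch -- "$dir/${name}_info.txt"
    done
}

# Demo in a scratch directory, including a name with a space in it.
demo=$(mktemp -d)
mkdir -p "$demo/alpha" "$demo/beta dir"
make_info_files "$demo"
ls "$demo/alpha" "$demo/beta dir"
```

Writing ${name}_info.txt with braces avoids the original problem, where $line_info was read as one variable name.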
Try this script :
#!/bin/bash
find "$1" -type d | while IFS= read -r line
do
touch "$line/$(basename "$(readlink -m "$line")")_info.txt"
done
Save it as, say, appendinfo and run it as
./appendinfo directory_name_which_include_symlinks
Before
ssam@udistro:~/Documents/so$ tree 36973628
36973628
`-- 36973628p1
    `-- 36973628p2
        `-- 36973628p3
3 directories, 0 files
After
ssam@udistro:~/Documents/so$ tree 36973628
36973628
|-- 36973628_info.txt
`-- 36973628p1
    |-- 36973628p1_info.txt
    `-- 36973628p2
        |-- 36973628p2_info.txt
        `-- 36973628p3
            `-- 36973628p3_info.txt
3 directories, 4 files
Note : readlink canonicalizes the path by following the symlinks. It is not required if, say, you're not going to give arguments like ..
Try this :
find . /Users/xxx/git/xxx/ -mindepth 1 -maxdepth 1 -type d -exec bash -c 'touch "$0/${0##*/}_info.txt"' {} \;
You're missing the path to the folder. I think you meant to write
touch "./$folder/*/*_info.txt"
Although this still only works if you start in the right directory. You should really have written something like
touch "/foo/bar/user/$folder/*/*_info.txt"
However, your use of "*/*_info.txt" is very... questionable... to say the least. I'm no bash expert, but I don't expect it to work. Reading everything in a directory (using "find" for example, and only accepting directories) and piping that output to "touch" would be a better approach.

How does Linux tar exclusion work?

Let's say I have a directory structure like this:
# tree original_directory/
|-- sub-1
|-- sub-2
|-- ignore_this_dir
|-- sub-3
Then the tar command to exclude the directory called ignore_this_dir is actually:
# tar -cf new_archived.tar original_directory/ --exclude=ignore_this_dir
OR
# tar -cf new_archived.tar original_directory/ --exclude=original_directory/ignore_this_dir
The man page states:
--exclude=PATTERN
exclude files, given as a PATTERN
Meaning
tar -cf new_archived.tar original_directory/ --exclude=ignore_this_dir
will be ok in your situation as the pattern ignore_this_dir will match original_directory/ignore_this_dir.
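One way to confirm what the pattern actually excluded is to list the archive afterwards. A minimal sketch in a scratch directory; the files keep.txt and skip.txt are invented for the test, and --exclude is placed before the directory argument simply as a cautious habit.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Recreate the layout from the question with two marker files.
mkdir -p original_directory/sub-1 original_directory/sub-2 \
         original_directory/ignore_this_dir original_directory/sub-3
touch original_directory/sub-1/keep.txt \
      original_directory/ignore_this_dir/skip.txt

# Create the archive, leaving out anything matching ignore_this_dir.
tar -cf new_archived.tar --exclude=ignore_this_dir original_directory/

# List the members to confirm what was stored.
tar -tf new_archived.tar
```

The listing should show sub-1, sub-2 and sub-3 but no path containing ignore_this_dir.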

Linux cmd line, how do I move all found files to upper directory

What I want to do is to find certain files recursively and move them to the upper level of their current directories.
For example,
foo1/bar1/abc
foo2/bar2/bar3/abc
will be updated to
foo1/abc
foo2/bar2/abc
I tried
find -name \*abc -exec /bin/mv '{}' .. \;
But this is wrong since it moved everything to upper directory of $PWD.
Is there a similar cmd line way to move things dynamically up? Or do I have to use more complex scripts?
Here you go ..
This is what the structure looks like to begin with...
$ tree
[.]
+-- foo1
!   `-- bar1
!       `-- abc
!
`-- foo2
    `-- bar2
        `-- bar3
            `-- abc
Then we run the command to move files
$ find . -type f -name "*abc" -exec bash -c ' mv -v {} `dirname {}`/.. ' \;
This produces this output...
`./foo1/bar1/abc' -> `./foo1/bar1/../abc'
`./foo2/bar2/bar3/abc' -> `./foo2/bar2/bar3/../abc'
Now the directory structure looks like ..
$ tree
[.]
+-- foo1
!   +-- abc
!   `-- bar1
!
`-- foo2
    `-- bar2
        +-- abc
        `-- bar3
the important bits ...
-type f <-- Make sure that you pick files not folders, if you need folders too omit this.
-name "*abc" <-- Your file-match pattern
-exec bash -c ' mv -v {} `dirname {}`/.. ' \;
Here ..
Execute bash, pass it a string that will find the directory of the file and add a '/..' at the end to make the mv take the file to parent directory.
IMPORTANT WARNING:
If you expect to have the file in the root of your search then
eventually the file will move to parent directory of your searching
root thereby taking them out of your match pattern forever.
OR
If you are at the root of the filesystem the files will always match
and mv will become a no-op.
Anyway it will be good to test that you don't end up with files at the root, unless that is what the intention is. :-)
Hope this helps.
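A variant of the same idea that copes with spaces and other unusual characters in file names passes the paths to the shell as arguments instead of pasting {} into the command string. This is a sketch under assumptions, not the answer's original command: -mindepth 2 is added so that files sitting directly in the search root are left alone, which sidesteps the warning above.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Rebuild the example layout from the question.
mkdir -p foo1/bar1 foo2/bar2/bar3
touch foo1/bar1/abc foo2/bar2/bar3/abc

# Each matching file is handed to sh as a positional parameter and
# moved into the parent of the directory it currently lives in.
find . -mindepth 2 -type f -name '*abc' -exec sh -c '
    for f do
        mv -- "$f" "$(dirname "$f")/.."
    done
' sh {} +

# Show where the files ended up.
find . -name abc | sort
```

With the sample tree this should leave the files at ./foo1/abc and ./foo2/bar2/abc.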

diff to output only the file names

I'm looking to run a Linux command that will recursively compare two directories and output only the file names of what is different. This includes anything that is present in one directory and not the other or vice versa, and text differences.
From the diff man page:
-q Report only whether the files differ, not the details of the differences.
-r When comparing directories, recursively compare any subdirectories found.
Example command:
diff -qr dir1 dir2
Example output (depends on locale):
$ ls dir1 dir2
dir1:
same-file different only-1
dir2:
same-file different only-2
$ diff -qr dir1 dir2
Files dir1/different and dir2/different differ
Only in dir1: only-1
Only in dir2: only-2
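If only the paths are wanted, the two message shapes shown above can be reduced with sed. A rough sketch, assuming the C-locale wording of diff's messages and file names without newlines or colons in directory names; the variable name names is arbitrary.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Same layout as the example: one identical file, one changed, two unique.
mkdir dir1 dir2
echo same > dir1/same-file;  echo same > dir2/same-file
echo one  > dir1/different;  echo two  > dir2/different
touch dir1/only-1 dir2/only-2

# LC_ALL=C pins the message wording that the sed expressions rely on.
# diff exits non-zero when differences exist, which is expected here.
names=$(LC_ALL=C diff -qr dir1 dir2 |
    sed -E -e 's,^Files (.*) and .* differ$,\1,' \
           -e 's,^Only in ([^:]*): (.*)$,\1/\2,')
printf '%s\n' "$names"
```

For files that differ, this keeps the path from the first directory only.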
You can also use rsync
rsync -rv --size-only --dry-run /my/source/ /my/dest/ > diff.out
If you want to get a list of files that are only in one directory and not their sub directories and only their file names:
diff -q /dir1 /dir2 | grep /dir1 | grep -E "^Only in*" | sed -n 's/[^:]*: //p'
If you want to recursively list all the files and directories that are different with their full paths:
diff -rq /dir1 /dir2 | grep -E "^Only in /dir1*" | sed -n 's/://p' | awk '{print $3"/"$4}'
This way you can apply different commands to all the files.
For example I could remove all the files and directories that are in dir1 but not dir2:
diff -rq /dir1 /dir2 | grep -E "^Only in /dir1*" | sed -n 's/://p' | awk '{print $3"/"$4}' | xargs -I {} rm -r {}
The approach of running diff -qr old/ new/ has one major drawback: it may miss files in newly created directories. E.g. in the example below the file data/pages/playground/playground.txt is not in the output of diff -qr old/ new/ whereas the directory data/pages/playground/ is (search for playground.txt in your browser to quickly compare). I also posted the following solution on Unix & Linux Stack Exchange, but I'll copy it here as well:
To create a list of new or modified files programmatically the best solution I could come up with is using rsync, sort, and uniq:
(rsync -rcn --out-format="%n" old/ new/ && rsync -rcn --out-format="%n" new/ old/) | sort | uniq
Let me explain with this example: we want to compare two dokuwiki releases to see which files were changed and which ones were newly created.
We fetch the tars with wget and extract them into the directories old/ and new/:
wget http://download.dokuwiki.org/src/dokuwiki/dokuwiki-2014-09-29d.tgz
wget http://download.dokuwiki.org/src/dokuwiki/dokuwiki-2014-09-29.tgz
mkdir old && tar xzf dokuwiki-2014-09-29.tgz -C old --strip-components=1
mkdir new && tar xzf dokuwiki-2014-09-29d.tgz -C new --strip-components=1
Running rsync one way might miss newly created files as the comparison of rsync and diff shows here:
rsync -rcn --out-format="%n" old/ new/
yields the following output:
VERSION
doku.php
conf/mime.conf
inc/auth.php
inc/lang/no/lang.php
lib/plugins/acl/remote.php
lib/plugins/authplain/auth.php
lib/plugins/usermanager/admin.php
Running rsync only in one direction misses the newly created files and the other way round would miss deleted files, compare the output of diff:
diff -qr old/ new/
yields the following output:
Files old/VERSION and new/VERSION differ
Files old/conf/mime.conf and new/conf/mime.conf differ
Only in new/data/pages: playground
Files old/doku.php and new/doku.php differ
Files old/inc/auth.php and new/inc/auth.php differ
Files old/inc/lang/no/lang.php and new/inc/lang/no/lang.php differ
Files old/lib/plugins/acl/remote.php and new/lib/plugins/acl/remote.php differ
Files old/lib/plugins/authplain/auth.php and new/lib/plugins/authplain/auth.php differ
Files old/lib/plugins/usermanager/admin.php and new/lib/plugins/usermanager/admin.php differ
Running rsync both ways and sorting the output to remove duplicates reveals that the directory data/pages/playground/ and the file data/pages/playground/playground.txt were missed initially:
(rsync -rcn --out-format="%n" old/ new/ && rsync -rcn --out-format="%n" new/ old/) | sort | uniq
yields the following output:
VERSION
conf/mime.conf
data/pages/playground/
data/pages/playground/playground.txt
doku.php
inc/auth.php
inc/lang/no/lang.php
lib/plugins/acl/remote.php
lib/plugins/authplain/auth.php
lib/plugins/usermanager/admin.php
rsync is run with these arguments:
-r to "recurse into directories",
-c to also compare files of identical size and only "skip based on checksum, not mod-time & size",
-n to "perform a trial run with no changes made", and
--out-format="%n" to "output updates using the specified FORMAT", which is "%n" here for the file name only
The output (list of files) of rsync in both directions is combined and sorted using sort, and this sorted list is then condensed by removing all duplicates with uniq
On my linux system to get just the filenames
diff -q /dir1 /dir2|cut -f2 -d' '
I have a directory.
$ tree dir1
dir1
├── a
│   └── 1.txt
├── b
│   └── 2.txt
└── c
    ├── 3.txt
    ├── 4.txt
    └── d
        └── 5.txt
4 directories, 5 files
I have another directory.
$ tree dir2
dir2
├── a
│   └── 1.txt
├── b
└── c
    ├── 3.txt
    ├── 5.txt
    └── d
        └── 5.txt
4 directories, 4 files
I can diff two directories.
$ diff -u <(cd dir1; find . -type f | sort) <(cd dir2; find . -type f | sort)
--- /dev/fd/11 2022-01-21 20:27:15.000000000 +0900
+++ /dev/fd/12 2022-01-21 20:27:15.000000000 +0900
@@ -1,5 +1,4 @@
 ./a/1.txt
-./b/2.txt
 ./c/3.txt
-./c/4.txt
+./c/5.txt
 ./c/d/5.txt
rsync -rvc --delete --size-only --dry-run source_dir target_dir

How to use 'mv' command to move files except those in a specific directory?

I am wondering - how can I move all the files in a directory except those files in a specific directory (as 'mv' does not have a '--exclude' option)?
Let's assume the directory structure looks like this:
|parent
|--child1
|--child2
|--grandChild1
|--grandChild2
|--grandChild3
|--grandChild4
|--grandChild5
|--grandChild6
And we need to move files so that it would appear like,
|parent
|--child1
| |--grandChild1
| |--grandChild2
| |--grandChild3
| |--grandChild4
| |--grandChild5
| |--grandChild6
|--child2
In this case, you need to exclude two directories child1 and child2, and move rest of the directories in to child1 directory.
use,
mv !(child1|child2) child1
This will move all of rest of the directories into child1 directory.
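In bash the !(...) pattern is an extended glob and is only recognised when extglob is enabled (shopt -s extglob). Where that is not an option, a plain loop gives the same result. A minimal sketch on a scratch copy of the example layout, shortened to three grandChild directories:

```shell
work=$(mktemp -d)
mkdir -p "$work/parent" && cd "$work/parent" || exit 1
mkdir child1 child2 grandChild1 grandChild2 grandChild3

# Move every entry except child1 and child2 into child1.
for entry in *; do
    case $entry in
        child1|child2) continue ;;
    esac
    mv -- "$entry" child1/
done

ls child1
```

The case statement plays the role of the !(child1|child2) pattern, so more names can be excluded by extending the list.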
Since find does have an exclude option, use find + xargs + mv:
find /source/directory -mindepth 1 -maxdepth 1 ! -name ignore-directory-name -print0 | xargs -0 mv --target-directory=/target/directory
Note that this is almost copied from the find man page (I think using mv --target-directory is better than cpio).
First get the names of files and folders and exclude whichever you want:
ls --ignore=file1 --ignore=folder1 --ignore=regular-expression1 ...
Then pass filtered names to mv as the first parameter and the second parameter will be the destination:
mv $(ls --ignore=file1 --ignore=folder1 --ignore=regular-expression1 ...) destination/
This isn't exactly what you asked for, but it might do the job:
mv the-folder-you-want-to-exclude somewhere-outside-of-the-main-tree
mv the-tree where-you-want-it
mv the-excluded-folder original-location
(Essentially, move the excluded folder out of the larger tree to be moved.)
So, if I have a/ and I want to exclude a/b/c/*:
mv a/b/c ../c
mv a final_destination
mkdir -p a/b
mv ../c a/b/c
Or something like that. Otherwise, you might be able to get find to help you.
This will move all files at or below the current directory not in the ./exclude/ directory to /wherever...
find -E . -not -type d -and -not -regex '\./exclude/.*' -exec echo mv {} /wherever \;
ls | grep -v exclude-dir | xargs -t -I '{}' mv {} exclude-dir
rename your directory to make it hidden so the wildcard does not see it:
mv specific_dir .specific_dir
mv * ../other_dir
#!/bin/bash
touch apple banana carrot dog cherry
mkdir fruit
F="apple banana carrot dog cherry"
mv ${F/dog/} fruit
# this removes 'dog' from the list F, so it remains in the
# current directory and is not moved to 'fruit'
Inspired by @user13747357's answer.
First you can ls the file and filter them by:
ls | egrep -v '(dir_name|file_name.ext)'
Then you can run the following command to move the files except the specific ones:
mv $(ls | egrep -v '(dir_name|file_name.ext)') target_dir
* Note that I tested this inside a specific directory. Cross-directory operation should be more carefully executed :)
Suppose your directory is
.
├── dir1
│ └── a.txt
├── dir2
│ ├── b.txt
│ └── hello.c
├── file1.txt
├── file2.txt
└── file3.txt
and you want to move file1.txt, file2.txt and file3.txt into dir2.
You can use
mv $(ls -p | grep -v /) dir2
because ls -p | grep -v / prints everything in the current directory except directories.
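Parsing the output of ls breaks as soon as a file name contains whitespace. A loop with a file-type test avoids that. This is only a sketch, run in a scratch directory; the name "file 3.txt" was chosen to include a space.

```shell
work=$(mktemp -d)
cd "$work" || exit 1
mkdir dir1 dir2
touch file1.txt file2.txt "file 3.txt" dir1/a.txt

# Move regular files only; directories are left where they are.
for f in *; do
    if [ -f "$f" ]; then
        mv -- "$f" dir2/
    fi
done

ls dir2
```

dir1 and its contents stay in place because the -f test is false for directories.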
For example, if I want to move all files/directories - except a specified file or directory - inside "var/www/html" to a sub-folder named "my_sub_domain", then I use "mv" with the command "!(what_to_exclude)":
$ cd /var/www/html
$ mv !(my_sub_domain) my_sub_domain
To exclude more I use "|" to separate file/directory names:
$ mv !(my_sub_domain|test1.html) my_sub_domain
mv * exclude-dir
was the perfect solution for me

Resources