Compare directories with ssh - linux

I have a directory dirA on my laptop, and a directory dirB on a remote host to which I can ssh. Both directories include subdirectories.
I would like to compare the full content of the two directories over ssh. In particular, I want to know which files or subdirectories in dirA are not in dirB, and the other way around. The two directories are large enough that I do not want to transfer the full files through ssh, just compare their names and locations.
Do you know how to do this?
Thank you.

Use rsync with the --dry-run option, something like:
rsync -av --dry-run local-dir/ user@remote:remote-dir
The -v is needed so that the dry run actually prints the list of files that would have been synchronized (-a already implies -r). If no files are listed, nothing in local-dir is missing from or different on the remote side. Add --delete to the dry run to also list, as "deleting" lines, the files that exist only on the remote side.
Edit: three options you might consider:
--ignore-times : do not skip files that match in size and modification time, so every file is checked.
--size-only : speeds up the rsync command by comparing only the size of the files. Note that this is error prone (files with the same size might differ).
--itemize-changes : show what would change for each file.
You may also take a look at this answer.
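If only names and locations matter, another option is to build a sorted path listing on each side and compare the two listings, so no file contents cross the network. This is a minimal sketch, assuming find, sort, and comm are available on both machines; the function names list_tree and compare_listings are made up for illustration, and user@remote and dirB are placeholders.

```shell
# Print every path under directory $1, relative to it, in a stable sort order.
list_tree() {
    (cd "$1" && find . | LC_ALL=C sort)
}

# Given two sorted listing files, report paths present on one side only.
compare_listings() {
    LC_ALL=C comm -23 "$1" "$2" | sed 's/^/only in A: /'
    LC_ALL=C comm -13 "$1" "$2" | sed 's/^/only in B: /'
}

# Usage sketch (not run here, since it needs a reachable remote host):
#   list_tree dirA > /tmp/a.list
#   ssh user@remote 'cd dirB && find . | LC_ALL=C sort' > /tmp/b.list
#   compare_listings /tmp/a.list /tmp/b.list
```

Both listings must be sorted with the same locale, which is why LC_ALL=C is forced on each side.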

Related

copy directory from another computer on Linux

On a computer with IP address like 10.11.12.123, I have a folder document. I want to copy that folder to my local folder /home/my-pc/doc/ using the shell.
I tried like this:
scp -r smb:10.11.12.123/other-pc/document /home/my-pc/doc/
but it's not working.
You can use the command below to copy your files:
scp -r <source> <destination>
(-r: recursively copy entire directories)
e.g.:
scp -r user@10.11.12.123:/other-pc/document /home/my-pc/doc
To identify the location you can use the pwd command, e.g.:
kasun@kasunr:~/Downloads$ pwd
/home/kasun/Downloads
If you want to copy from B to A while logged into B:
scp /source username@a:/destination
If you want to copy from B to A while logged into A:
scp username@b:/source /destination
In addition to the comment: when you look at your host-to-host copy options on Linux today, rsync is by far the most robust solution around. It is brought to you by the SAMBA team[1] and continues to enjoy active development. Most distributions include the rsync package by default (if not, you should find an easily installable package for your distro, or you can download it from rsync.samba.org).
The basic use of rsync for host-to-host directory copy is:
$ rsync -uav srchost:/path/to/dir /path/to/dest
Here -a (archive) copies recursively while preserving file and directory times and permissions, -u skips any file that is newer on the destination, so only new or changed files are copied, and -v gives verbose output. You will be prompted for the username/password on 10.11.12.123 unless you have set up ssh keys to allow public/private key authentication (see ssh-keygen for key generation).
As you may notice, the syntax is basically the same as that for scp, with a slight difference in the options (e.g. scp -rv srchost:/path/to/dir /path/to/dest). rsync will use ssh for secure transport by default, so you will want to ensure sshd is running on your srchost (10.11.12.123 in your case). If you have name resolution working (or a simple entry in /etc/hosts for 10.11.12.123) you can use the hostname for the remote host instead of the remote IP. Regardless, you can always transfer the files you are interested in with:
$ rsync -uav 10.11.12.123:/other-pc/document /home/my-pc/doc/
Note: do NOT include a trailing / after document if you want to copy the directory itself. If you do include a trailing / after document (i.e. 10.11.12.123:/other-pc/document/) you are telling rsync to copy the contents (i.e. the files and directories under document) into /home/my-pc/doc/ without also creating the document directory there.
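The trailing-slash rule can be tried safely on local scratch directories before pointing rsync at a remote host. A small sketch, assuming rsync is installed; the function name trailing_slash_demo and the no_slash and slash directory names are made up for illustration.

```shell
# Copy directory $1 into $2 twice, once per trailing-slash form,
# so the two resulting layouts can be compared side by side.
trailing_slash_demo() {
    src=$1 dest=$2
    mkdir -p "$dest/no_slash" "$dest/slash"
    rsync -a "$src"  "$dest/no_slash/"   # creates no_slash/<name of src>/...
    rsync -a "$src/" "$dest/slash/"      # copies the contents straight into slash/
}

# Usage sketch:
#   trailing_slash_demo /some/path/document /tmp/demo
#   find /tmp/demo -type f
```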
The reason rsync is far superior to other copy apps is it provides options to truly synchronize filesystems and directory trees both locally and between your local machine and remote host. Meaning, in your case, if you have used rsync to transfer files to /home/my-pc/doc/ and then make changes to the files or delete files on 10.11.12.123, you can simply call rsync again and have the changes/deletions reflected in /home/my-pc/doc/. (look at the several flavors of the --delete option for details in rsync --help or in man 1 rsync)
For these, and many more reasons, it is well worth the time to familiarize yourself with rsync. It is an invaluable tool in any Linux user's hip pocket. Hopefully this will solve your problem and get you started.
Footnotes
[1] the same folks that "Opened Windows to a Wider World", allowing seamless connection between Windows/Linux hosts via the native Windows server message block (SMB) protocol. samba.org
If the two directories you mentioned (document and /home/my-pc/doc/) are on the same machine, then:
cp -ai document /home/my-pc/doc/
Otherwise, to push document from the machine that holds it to the machine at 10.11.12.123:
scp -r document/ root@10.11.12.123:/home/my-pc/doc/

Compare two folders containing source files & hardlinks, remove orphaned files

I am looking for a way to compare two folders containing source files and hard links (let's use /media/store/download and /media/store/complete as an example) and then remove orphaned files that don't exist in both folders. These files may have been renamed and may be stored in subdirectories.
I'd like to set this up as a cron script to run regularly. I just can't figure out the logic of the script myself. Could anyone be so kind as to help?
Many thanks
rsync can do what you want, using the --existing, --ignore-existing, and --delete options. You'll have to run it twice, once in each "direction" to clean orphans from both source and target directories.
rsync -avn --existing --ignore-existing --delete /media/store/download/ /media/store/complete
rsync -avn --existing --ignore-existing --delete /media/store/complete/ /media/store/download
--existing says don't copy orphan files
--ignore-existing says don't update existing files
--delete says delete orphans on target dir
The trailing slash on the source dir, and no trailing slash on the target dir, are mandatory for your task.
The 'n' in -avn means not to really do anything, and I always do a "dry run" with the -n option to make sure the command is going to do what I want, ESPECIALLY when using --delete. Once you're confident your command is correct, run it with just -av to actually do the work.
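Because the files may have been renamed, a comparison by name can miss matches that a comparison by hard-link count would catch. The following is a sketch of that alternative, not the rsync method above. It assumes both folders live on the same filesystem and that every file worth keeping is hard-linked into the other folder, so a link count of 1 marks an orphan. The function name list_orphans is made up for illustration.

```shell
# Print regular files under the given directories whose hard-link count is 1,
# i.e. files that have no second name anywhere on the filesystem.
list_orphans() {
    find "$@" -type f -links 1 -print
}

# Usage sketch: review the list before deleting anything.
#   list_orphans /media/store/download /media/store/complete
# Once the list looks right, the same find expression with
# "-exec rm -- {} +" in place of "-print" would remove those files.
```

Note that a file with extra links somewhere other than the two folders would not be reported, so the assumption above matters.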
Perhaps rsync is of use ?
Rsync is a fast and extraordinarily versatile file copying tool. It
can copy locally, to/from another host over any remote shell, or
to/from a remote rsync daemon. It offers a large number of options
that control every aspect of its behavior and permit very flexible
specification of the set of files to be copied. It is famous for its
delta-transfer algorithm, which reduces the amount of data sent over
the network by sending only the differences between the source files
and the existing files in the destination. Rsync is widely used for
backups and mirroring and as an improved copy command for everyday
use.
Note it has a --delete option
--delete delete extraneous files from dest dirs
which could help with your specific use case above.
You can also use the diff command (for example diff -rq) to list all the files that differ between the two folders.
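A minimal sketch of the diff approach, assuming both folders are locally accessible; the function name compare_trees is made up for illustration. The -q flag reports only which files differ, and -r descends into subdirectories.

```shell
# Summarise how two directory trees differ without printing file contents.
# diff exits with status 1 when differences exist, which is not an error here,
# so only a status of 2 or more (real trouble) is passed on as a failure.
compare_trees() {
    diff -rq "$1" "$2" || [ $? -eq 1 ]
}

# Usage sketch:
#   compare_trees /media/store/download /media/store/complete
```

Unlike the hard-link check, this compares by name and content, so a renamed file shows up as present in one folder only.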

how to scp multiple files from multiple directories, while different files in different directories may have the same name

I want to scp several files from remote to local. The files on the remote host look like this:
/data/1792348/a.stat
/data/1792348/b.stat
/data/187657/a.stat
/data/187657/b.stat
... ...
1792348, 187657, etc.: the middle directory name is random.
How can I scp all the files ending in .stat from remote to local?
If I try scp -P36000 user@host:/data//*.stat .*, I only get 2 files, a.stat and b.stat.
I really don't know how to solve this, and I couldn't find an answer on Google.
I would use rsync (which runs over ssh by default, but is far more elaborate than scp: for example, it only transmits minimal changesets of data, so if you run it several times you will get an impressive speedup). The remote path is the source and the -e option passes your non-standard port to ssh:
rsync -avz -e "ssh -p 36000" \
--include "*/" --include "*.stat" --exclude "*" \
user@host:/data/ ./data/
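The include/exclude filter can be checked on a local copy of the layout before running it against the remote host. A sketch assuming rsync is installed; the function name copy_stat_files is made up for illustration, and the -m flag is an addition not present in the command above.

```shell
# Copy only *.stat files from tree $1 into $2, keeping the directory layout.
# --include='*/' lets rsync descend into every directory, --exclude='*' drops
# everything not matched earlier, and -m (--prune-empty-dirs) skips directories
# that would end up empty.
copy_stat_files() {
    rsync -am --include='*/' --include='*.stat' --exclude='*' "$1/" "$2/"
}

# Usage sketch:
#   copy_stat_files /data /tmp/stat-copy
```

Because each file keeps its parent directory, the two a.stat files no longer overwrite each other.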

Copy files excluding some folder in linux

I want to create a script that copies my project and makes a zip archive of it. I want to exclude all folders named .svn in all subdirectories. Any suggestions?
I'd use rsync's FILTER RULES for this:
Create an .rsync-filter file (in the origin directory) containing, e.g.
- .svn/
Now run rsync like an exalted copy:
rsync -aFF origin/ destination/
You can do this using rsync. Although this is designed to synchronise directories across servers, it can also be used to copy directories on a single machine.
rsync has a --exclude option to exclude files and directories by pattern. See http://www.samba.org/ftp/rsync/rsync.html for help and examples.
Just call the zip utility on your project’s folder and use the -r option for recursive plus the -x option to exclude files / folders by pattern.
zip -r target-filename.zip source-folder -x \*exclude-pattern\*
exclude-pattern in your case would be .svn
See also man zip
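Before archiving, it can help to preview which files survive the exclusion. This sketch uses find with -prune, a different tool from the answers above, purely as a check; the function name list_without_svn is made up for illustration.

```shell
# List the files under $1 that are not inside any .svn directory.
# -prune stops find from descending into a matching directory at all.
list_without_svn() {
    (cd "$1" && find . -name .svn -prune -o -type f -print | LC_ALL=C sort)
}

# Usage sketch:
#   list_without_svn source-folder
```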

How can I recursively copy a directory into another and replace only the files that have not changed?

I am looking to do a specific copy in Fedora.
I have two folders:
'webroot': holding ALL web files/images etc
'export': folder containing thousands of PHP, CSS, JS documents that are exported from my SVN repo.
The export directory contains many of the same files/folders that the root does, however the root contains additional ones not found in export.
I'd like to merge all of the contents of export with my webroot with the following options:
Overwriting the file in webroot if export's version contains different code than what is inside of webroot's version (live)
Preserve the permissions/users/groups of the file if it is overwritten (the export version replacing the live version). NOTE: I would like the webroot's permissions/ownership maintained, but with export's contents
No prompting/stopping of the copy of any kind (ie not verbose)
Recursive copy - obviously I would like to copy all files, folders and subfolders found in export
I've done a bit of research into cp - would this do the job?:
cp -pruf ./export /path/to/webroot
It might, but any time the corresponding files in export and webroot have the same content but different modification times, you'd wind up performing an unnecessary copy operation. You'd probably get slightly smarter behavior from rsync:
rsync -pr ./export /path/to/webroot
Besides, rsync can copy files from one host to another over an SSH connection, if you ever have a need to do that. Plus, it has a zillion options you can specify to tweak its behavior - look in the man page for details.
EDIT: with respect to your clarification about what you mean by preserving permissions: you'd probably want to leave off the -p option.
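The requirement of replacing only files whose content differs, while keeping webroot's permissions, can also be sketched with cmp and plain cp. This is an illustration rather than the rsync approach above: the function name merge_changed is made up, and it relies on cp writing into an existing destination file in place, which leaves that file's mode and owner as they were.

```shell
# Copy each file from $1 into $2 only when its content differs or it is missing.
# Plain cp (no -p) overwrites an existing destination file in place, so the
# destination keeps its current mode and owner.
# Filenames containing newlines are not handled by this loop.
merge_changed() {
    src=$1 dest=$2
    (cd "$src" && find . -type f) | while IFS= read -r rel; do
        if ! cmp -s "$src/$rel" "$dest/$rel" 2>/dev/null; then
            mkdir -p "$dest/$(dirname "$rel")"
            cp "$src/$rel" "$dest/$rel"
        fi
    done
}

# Usage sketch:
#   merge_changed ./export /path/to/webroot
```

Files that exist only in webroot are left alone, and files with identical content are never rewritten.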
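-u overwrites an existing file only if the destination is older than the source (or is missing)
-p preserves the permissions, ownership and dates
-f forces the copy: if a destination file cannot be opened, it is removed and the copy is retried (it does not control verbosity; cp is quiet by default)
-r makes the copy recursive
So it looks like you have the right arguments for cp.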
Sounds like a job for cpio (and hence, probably, GNU tar can do it too):
cd export
find . -print | cpio -pvdm /path/to/webroot
If you need owners preserved, you have to do it as root, of course. The -p option is 'pass mode', meaning copy between locations; -v is verbose (but not interactive; there's a difference); -d means create directories as necessary; -m means preserve modification time. By default, without the -u option, cpio won't overwrite files in the target area that are newer than the one from the source area.

Resources