How to scp multiple files from multiple directories when files in different directories may have the same name (Linux)

I want to scp several files from a remote host to my local machine. The files on the remote look like this:
/data/1792348/a.stat
/data/1792348/b.stat
/data/187657/a.stat
/data/187657/b.stat
... ...
The middle directory names (1792348, 187657, and so on) are random.
How can I scp all the files ending in .stat from remote to local?
When I try scp -P36000 user@host:/data/*/*.stat . I only get 2 files, a.stat and b.stat.

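I would use rsync (which by default runs over ssh, like scp, but is far more capable: it transmits only the parts of files that changed, so if you run it several times you will get an impressive speedup). The include/exclude filters select only the .stat files and keep the directory structure, so files with the same name no longer overwrite each other. To pull from remote to local:
rsync -avz -e 'ssh -p 36000' \
--include '*/' --include '*.stat' --exclude '*' \
user@host:/data/ /path/to/dest/data/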
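To see why a wildcard copy into a single directory leaves only two files, and why keeping the directory structure avoids that, here is a purely local sketch. It contacts no host: the temporary tree only imitates the /data layout from the question, and the variable names (work, flat_count, tree_count) are made up for illustration. The second copy keeps each file's parent directory, which is also what a structure-preserving transfer does.

```shell
#!/usr/bin/env bash
# Local simulation only; the layout imitates /data/<random-id>/*.stat.
set -eu

work=$(mktemp -d)
mkdir -p "$work/data/1792348" "$work/data/187657" "$work/flat" "$work/tree"
for d in 1792348 187657; do
  echo "a from $d" > "$work/data/$d/a.stat"
  echo "b from $d" > "$work/data/$d/b.stat"
  echo "not wanted" > "$work/data/$d/notes.txt"
done

# Flat copy: every file lands in one directory, so equal names overwrite each other.
for f in "$work"/data/*/*.stat; do
  cp "$f" "$work/flat/"
done

# Structure-preserving copy: recreate the parent directory before copying.
(cd "$work/data" && find . -type f -name '*.stat') | while IFS= read -r f; do
  mkdir -p "$work/tree/$(dirname "$f")"
  cp "$work/data/$f" "$work/tree/$f"
done

flat_count=$(find "$work/flat" -type f | wc -l | tr -d ' ')
tree_count=$(find "$work/tree" -type f | wc -l | tr -d ' ')
echo "flat copy kept $flat_count files, tree copy kept $tree_count files"

rm -rf "$work"
```

The flat copy ends up with 2 files and the tree copy with 4, matching the symptom described in the question.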

Related

rsync files (name starting with string* of a list) from multiple directories in server to local client

I have a list.txt of names/ids
13089
13090
13091...
for which I want to find and rsync/copy over the files starting with these names from several remote directories. The directories may or may not contain these files.
Right now I am using the following command, but it does not find the files even when they exist. Also, is it correct to repeat the rsync command for each possible directory?
for ID_NAME in $(cat ./list_ids.csv)
do
rsync -PEav --rsh=ssh admin@123:'/folder1/subfolder1/${ID_NAME}**' /destinationfolder1
rsync -PEav --rsh=ssh admin@123:'/folder2/subfolder2/${ID_NAME}**' /destinationfolder2
rsync -PEav --rsh=ssh admin@123:'/folder3/subfolder3/${ID_NAME}**' /destinationfolder3
done
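One thing that can be checked without any server is the quoting in the loop above: inside single quotes the local shell does not expand ${ID_NAME}, so the remote side receives that literal text rather than an ID. The sketch below only builds and prints the source arguments and runs no rsync. The temporary list file is a stand-in for the real list of IDs, and reading it with while read instead of $(cat ...) is a stylistic choice, not something the question requires.

```shell
#!/usr/bin/env bash
# Builds the rsync source arguments only; nothing is transferred.
set -eu

list=$(mktemp)                      # stand-in for the real list of IDs
printf '%s\n' 13089 13090 13091 > "$list"

ID_NAME=13089
# Single quotes: the variable is NOT expanded, the text stays literal.
single='admin@123:/folder1/subfolder1/${ID_NAME}*'
# Double quotes: the variable is expanded, the * is left for the remote side.
double="admin@123:/folder1/subfolder1/${ID_NAME}*"
echo "single-quoted: $single"
echo "double-quoted: $double"

# Collect one source argument per ID, one per line.
sources=""
while IFS= read -r id; do
  [ -n "$id" ] || continue          # skip blank lines
  sources="${sources}admin@123:/folder1/subfolder1/${id}*
"
done < "$list"
printf '%s' "$sources"

rm -f "$list"
```

Each printed line could then be passed, double-quoted, as the source of one rsync call.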

Is there a way to download files matching a pattern through SFTP in a shell script?

I'm trying to download multiple files through SFTP on a Linux server using
sftp -o IdentityFile=key <user>@<server> <<END
get -r folder
exit
END
which downloads the entire contents of a folder. It appears that find and grep are not valid sftp commands, and neither are for loops.
I need to download only the files whose names contain a given string, e.g.
test_0.txt
test_1.txt
but not file.txt
Do you really need the -r switch? Are there really any subdirectories in the folder? You do not mention that.
If there are no subdirectories, you can use a simple get with a file mask:
cd folder
get *test*
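The mask in get *test* is an ordinary glob pattern. As a rough local check of which names such a mask selects, the sketch below uses the shell's own pattern matching as a stand-in for sftp's. The three file names are the ones from the question, and nothing is downloaded.

```shell
#!/usr/bin/env bash
# Uses shell pattern matching as a stand-in for the sftp file mask.
set -eu

matched=""
for name in test_0.txt test_1.txt file.txt; do
  case $name in
    *test*) matched="$matched $name" ;;   # name contains "test"
    *)      echo "skipped: $name" ;;
  esac
done
matched=${matched# }                       # drop the leading space
echo "matched: $matched"
```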
Are you required to use sftp? A tool like rsync that operates over ssh has flexible include/exclude options. For example:
rsync -a <user>@<server>:folder/ folder/ \
--include='test_*.txt' --exclude='*.txt'
This requires rsync to be installed on the remote system, but that's very common these days. If rsync isn't available, you could do something similar using tar:
ssh <user>@<server> tar -cf- folder/ | tar -xvf- --wildcards '*/test_*.txt'
This tars up all the files remotely, but then only extracts files matching your target pattern on the receiving side.
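The tar pipeline can be tried locally before pointing it at a server. The sketch below replaces the ssh hop with a plain local tar, so only the extraction filter is exercised. It assumes GNU tar (for --wildcards), and the temporary directories and variable names are invented for the test.

```shell
#!/usr/bin/env bash
# Local dry run of the tar filter; assumes GNU tar for --wildcards.
set -eu

src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/folder"
touch "$src/folder/test_0.txt" "$src/folder/test_1.txt" "$src/folder/file.txt"

# Left side: the part that would run under ssh on the server.
# Right side: extracts only the members matching the pattern.
tar -C "$src" -cf - folder/ | tar -C "$dst" -xf - --wildcards '*/test_*.txt'

extracted_count=$(find "$dst" -type f | wc -l | tr -d ' ')
if [ -e "$dst/folder/file.txt" ]; then unwanted=yes; else unwanted=no; fi
echo "extracted $extracted_count files, file.txt present: $unwanted"

rm -rf "$src" "$dst"
```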

Compare directories with ssh

I have a directory dirA on my laptop, and a directory dirB on a remote host to which I can ssh. Both directories include subdirectories.
I would like to compare the full content of the two directories over ssh. In particular, I want to know which files or subdirectories in dirA are not in dirB, and the other way around. The two directories are large enough that I do not want to transfer the full files through ssh, just compare their names and locations.
Do you know how to do this?
Thank you.
Use rsync with the --dry-run option; something like:
rsync -av --dry-run local-dir/ user@remote:remote-dir
This prints the list of files that would have been synchronized (the -v is needed, since a dry run without it prints nothing). If no files are listed, nothing in local-dir is missing or changed on the remote. Adding --delete also lists, as "deleting" lines, the files that exist only on the remote.
Edit: some options you might consider:
--ignore-times : do not skip files whose size and timestamp already match, so every file is treated as a candidate for transfer.
--size-only : speeds up the rsync command by comparing only the sizes of the files. Note that this is error prone (files with the same size might differ).
--itemize-changes : show a per-file summary of the changes.
You may also take a look at this answer.
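Because the question asks for differences in both directions by name only, another option (a different technique from the rsync dry run) is to compare sorted path listings. In real use the second listing would come from something like ssh user@remote 'cd dirB && find . | LC_ALL=C sort'. In this sketch both directories are local temporary ones with invented contents, so it can run anywhere.

```shell
#!/usr/bin/env bash
# Compares two trees by path names only; file contents are never read.
set -eu

work=$(mktemp -d)
mkdir -p "$work/dirA/sub" "$work/dirB/sub"
touch "$work/dirA/common.txt" "$work/dirB/common.txt"
touch "$work/dirA/sub/only_in_a.txt" "$work/dirB/only_in_b.txt"

# One sorted listing per side (the dirB one would be produced over ssh).
(cd "$work/dirA" && find . | LC_ALL=C sort) > "$work/a.list"
(cd "$work/dirB" && find . | LC_ALL=C sort) > "$work/b.list"

# comm needs sorted input: -23 keeps lines unique to the first file,
# -13 keeps lines unique to the second.
only_a=$(LC_ALL=C comm -23 "$work/a.list" "$work/b.list")
only_b=$(LC_ALL=C comm -13 "$work/a.list" "$work/b.list")
echo "only in dirA: $only_a"
echo "only in dirB: $only_b"

rm -rf "$work"
```

Only the two listings cross the network, which keeps the transfer small even for large directories.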

Compare two folders containing source files & hardlinks, remove orphaned files

I am looking for a way to compare two folders containing source files and hard links (let's use /media/store/download and /media/store/complete as an example) and then remove orphaned files that don't exist in both folders. These files may have been renamed and may be stored in subdirectories.
I'd like to set this up as a cron job that runs regularly. I just can't work out the logic of the script myself. Could anyone be so kind as to help?
Many thanks
rsync can do what you want, using the --existing, --ignore-existing, and --delete options. You'll have to run it twice, once in each "direction" to clean orphans from both source and target directories.
rsync -avn --existing --ignore-existing --delete /media/store/download/ /media/store/complete
rsync -avn --existing --ignore-existing --delete /media/store/complete/ /media/store/download
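--existing skips files that do not already exist on the target, so orphans are not copied
--ignore-existing skips files that already exist on the target, so existing files are not updated
--delete deletes files on the target that do not exist in the source (the orphans)
The trailing slash on the source dir is essential for your task: it makes rsync compare the contents of the two directories instead of copying the source directory into the target. A trailing slash on the target dir makes no difference.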
The 'n' in -avn means not to really do anything, and I always do a "dry run" with the -n option to make sure the command is going to do what I want, ESPECIALLY when using --delete. Once you're confident your command is correct, run it with just -av to actually do the work.
Perhaps rsync is of use? From its man page:
Rsync is a fast and extraordinarily versatile file copying tool. It
can copy locally, to/from another host over any remote shell, or
to/from a remote rsync daemon. It offers a large number of options
that control every aspect of its behavior and permit very flexible
specification of the set of files to be copied. It is famous for its
delta-transfer algorithm, which reduces the amount of data sent over
the network by sending only the differences between the source files
and the existing files in the destination. Rsync is widely used for
backups and mirroring and as an improved copy command for everyday
use.
Note it has a --delete option
--delete delete extraneous files from dest dirs
which could help with your specific use case above.
You can also use the diff command to list the files that differ between the two folders, for example diff -rq followed by the two folder paths.
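Since the question involves hard links and files that may have been renamed, a name-based comparison can miss matches. A different technique from the rsync commands above is to look at link counts: a file that is hard-linked into the other folder has at least two links, while an orphan has exactly one. This is only a sketch, assuming both folders sit on the same filesystem and nothing else links to these files. The directory and file names are invented, and it only lists candidates rather than deleting anything.

```shell
#!/usr/bin/env bash
# Lists files with a link count of 1, i.e. not hard-linked anywhere else.
set -eu

store=$(mktemp -d)                       # stand-in for /media/store
mkdir -p "$store/download" "$store/complete/sorted"
echo "kept"   > "$store/download/kept.bin"
echo "orphan" > "$store/download/orphan.bin"

# Hard link under a different name and in a subdirectory.
ln "$store/download/kept.bin" "$store/complete/sorted/renamed.bin"

# -links 1 matches regular files that have no other hard link.
orphans=$(cd "$store" && find download complete -type f -links 1 | LC_ALL=C sort)
echo "orphans: $orphans"

rm -rf "$store"
```

The renamed link does not confuse the check, because link counts do not depend on file names.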

rsync not synchronizing .htaccess file

I am trying to rsync directory A of server1 with directory B of server2.
From directory A on server1, I ran the following command:
rsync -av * server2::sharename/B
but the interesting thing is, it synchronizes all files and directories except .htaccess or any hidden file in the directory A. Any hidden files within subdirectories get synchronized.
I also tried the following command:
rsync -av --include=".htaccess" * server2::sharename/B
but the results are the same.
Any ideas why the hidden files of directory A are not getting synchronized, and how to fix it? I am running as the root user.
thanks
This is due to the fact that * is by default expanded to all files in the current working directory except the files whose name starts with a dot. Thus, rsync never receives these files as arguments.
You can pass . denoting current working directory to rsync:
rsync -av . server2::sharename/B
This way rsync will look for files to transfer in the current working directory as opposed to looking for them in what * expands to.
Alternatively, you can use the following command to make * expand to all files including those which start with a dot:
shopt -s dotglob
See also the shopt section of the bash manpage (shopt is a bash builtin).
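The effect of dotglob can be checked without rsync. This sketch assumes bash and uses a throwaway directory holding one hidden and two regular files; the array names are made up for illustration.

```shell
#!/usr/bin/env bash
# Shows what * expands to with and without dotglob (a bash option).
set -eu

dir=$(mktemp -d)
touch "$dir/.htaccess" "$dir/a.html" "$dir/z.js"

plain=("$dir"/*)            # default: names starting with a dot are skipped
shopt -s dotglob
with_dots=("$dir"/*)        # dotglob: hidden names are matched too
shopt -u dotglob            # restore the default

echo "without dotglob: ${#plain[@]} entries"
echo "with dotglob:    ${#with_dots[@]} entries"

rm -rf "$dir"
```

Without dotglob the expansion has 2 entries; with it there are 3, because .htaccess is now matched.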
For anyone who's just trying to sync directories between servers (including all hidden files) -- e.g., syncing somedirA on source-server to somedirB on a destination server -- try this:
rsync -avz -e ssh --progress user@source-server:/somedirA/ somedirB/
Note the slashes at the end of both paths. Any other syntax may lead to unexpected results!
Also, for me it's easiest to run rsync commands from the destination server, because it's easier to make sure I've got proper write access (i.e., I might need to add sudo to the command above).
Probably goes without saying, but obviously your remote user also needs read access to somedirA on your source server. :)
I had the same issue.
For me when I did the following command the hidden files did not get rsync'ed
rsync -av /home/user1 server02:/home/user1
But when I added the slashes at the end of the paths, the hidden files were rsync'ed.
rsync -av /home/user1/ server02:/home/user1/
Note the slashes at the end of the paths, as Brian Lacy said the slashes are the key. I don't have the reputation to comment on his post or I would have done that.
I think the problem is due to shell wildcard expansion. Use . instead of star.
Consider the following example directory content
$ ls -a .
. .. .htaccess a.html z.js
The shell's wildcard expansion translates the argument list that the rsync program gets from
-av * server2::sharename/B
into
-av a.html z.js server2::sharename/B
before the command starts getting executed.
The * is expanded by the shell, which skips hidden files, so rsync never sees them. You should omit it and pass . instead.
On a related note, in case anyone is coming in from Google trying to find out why rsync is not copying hidden subfolders, I found one additional reason why this can happen and figured I'd pay it forward for the next person running into the same thing: you may be using the -C option (obviously --exclude would do it too, but I figure that one's a bit easier to spot).
In my case, I had a script that was copying several folders across computers, including a directory with several git projects, and I noticed that I couldn't run any of the normal git commands in the copied repos (yes, normally one should use git clone, but this was part of a larger backup that included other things). After looking at the script, I found that it was calling rsync with 7 or 8 options.
After googling didn't turn up any obvious answers, I started going through the switches one by one. After dropping the -C option, it worked correctly. In the case of the script, the -C flag appears to have been added as a mistake, likely because sftp was originally used and -C is a compression-related option under that tool.
per man rsync, the option is described as
--cvs-exclude, -C auto-ignore files in the same way CVS does
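Since CVS is an older version control system, this behavior makes sense given the man page description: the built-in ignore list that -C applies also covers version control directories such as .git/, which is why the copied repositories were broken.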

Resources