Verify copied data between Windows & Linux shares? - linux

I just copied a ton of data from a machine running Windows 7 Ultimate to a server running Ubuntu Server LTS 10.04. I used the robocopy utility via PowerShell to accomplish this task, but I couldn't find any information online regarding whether Robocopy verifies the copied file's integrity once it is copied to the server.
First of all, does anyone know if this is done inherently? There is no switch that explicitly allows you to add verification to a file transfer.
Second, if it doesn't or there is uncertainty about whether or not it does, what would be the simplest method to accomplish this for multiple directories with several files/sub-directories?
Thanks!

The easiest mechanism I know of relies on md5sum and a Unix-like find utility on your Windows machine.
You can generate a manifest file of filenames / md5sums:
find /source -type f -exec md5sum {} \; > MD5SUM
Copy the MD5SUM file to your Linux machine, and then run:
cd /path/to/destination
md5sum --quiet -c MD5SUM
You'll get a list of files that fail:
$ md5sum --quiet -c /tmp/MD5SUM
/etc/aliases: FAILED
md5sum: WARNING: 1 of 341 computed checksums did NOT match
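One gotcha: md5sum -c looks up each file by the path recorded in the manifest, so the paths must resolve from the directory where you run the check. A minimal sketch that records relative paths instead of absolute ones (reusing the /source path from above):
cd /source
find . -type f -exec md5sum {} \; > /tmp/MD5SUM
The manifest then lists files as ./sub/dir/file, which lines up cleanly after you cd /path/to/destination on the Linux side.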

A much easier method is to use the Unix utilities diff and rsync.
With diff you can compare two files, but you can also compare two directories. With diff I would recommend this command:
diff -r source/directory/ destination/directory
-r forces diff to analyse directories recursively
The second option is to use rsync, which is meant to sync files or directories, but with the -n option you can also use it to analyse differences between directories. rsync also works when the files are not on the same host: one side can be a remote host, which you can access over ssh. Plus, rsync is really flexible, with many options available.
rsync -avn --itemize-changes --progress --stats source/directory/ destination/directory/
-n makes rsync do a "dry run", meaning it makes no changes on the filesystem
-a enables archive mode, which implies recursive mode and preserves symbolic links, file permissions, file timestamps, file owner, and file group
-v increases verbosity
--itemize-changes outputs a change summary for all updates
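One caveat: by default rsync decides that files match based on size and modification time alone, which a copy tool will normally preserve even if the contents were corrupted in transit. To make rsync actually read and compare file contents, add the -c (--checksum) option. A minimal sketch, reusing the paths from above:
rsync -avcn --itemize-changes source/directory/ destination/directory/
This is slower, since every file on both sides is read in full, but it is a genuine content comparison.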
Here you can find even more ways to compare directories:
https://askubuntu.com/questions/12473/file-and-directory-comparison-tool
On the rsync Wikipedia page you can find Windows alternatives to rsync.

Related

Bash Scripting with xargs to BACK UP files

I need to copy a file from multiple locations to a backup directory while retaining its directory structure. For example, I have a file "a.txt" at the following locations: /a/b/a.txt /a/c/a.txt /a/d/a.txt /a/e/a.txt. I now need to copy this file from multiple locations to the backup directory /tmp/backup. The end result should be:
when I list /tmp/backup/a --> it should contain /b/a.txt /c/a.txt /d/a.txt & /e/a.txt.
For this, I had used the command: echo /a/*/a.txt | xargs -I {} -n 1 sudo cp --parent -vp {} /tmp/backup. This is throwing the error "cp: cannot stat '/a/b/a.txt /a/c/a.txt a/d/a.txt a/e/a.txt': No such file or directory"
The -I option is taking the complete output from echo as a single value instead of individual values (like -n 1 does). If someone can help debug this issue that would be very helpful, rather than providing an alternative command.
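For what it's worth, the root cause is that echo prints all the expanded paths on a single line, and xargs -I replaces {} with one whole input line at a time. A minimal sketch of a fix that keeps the xargs approach by putting each path on its own line (using cp's canonical --parents spelling):
printf '%s\n' /a/*/a.txt | xargs -I {} sudo cp --parents -vp {} /tmp/backup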
Use rsync with the --relative (-R) option to keep (parts of) the source paths.
I've used a wildcard for the source to match your example command rather than the explicit list of directories mentioned in your question.
rsync -avR /a/*/a.txt /tmp/backup/
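Assuming the four source files from the question exist, the result should look something like:
/tmp/backup/a/b/a.txt
/tmp/backup/a/c/a.txt
/tmp/backup/a/d/a.txt
/tmp/backup/a/e/a.txt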
Do the backups need to be exactly the same as the originals? In most cases, I'd prefer a little compression. [tar](https://man7.org/linux/man-pages/man1/tar.1.html) does a great job of bundling things including the directory structure.
tar cvzf /path/to/backup/tarball.tgz /source/path/
tar can't update compressed archives, so to get an updatable archive you have to skip the compression:
tar uf /path/to/backup/tarball.tar /source/path/
This gives you versioning of a sort, as it only updates changed files but keeps both the before and after versions.
If you have time and cycles and still want the compression, you can decompress before and recompress after.
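A minimal sketch of that cycle, assuming the tarball paths used above (gunzip knows to rename a .tgz to .tar):
gunzip /path/to/backup/tarball.tgz
tar uf /path/to/backup/tarball.tar /source/path/
gzip /path/to/backup/tarball.tar
Note that gzip leaves you with tarball.tar.gz rather than the original .tgz name.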

Linux - moving files from source to destination with overwrite

Firstly I'd like to apologize if I duplicated a question which already exists, but this is too important to screw up.
I have two directories on my linux server:
- /tmp/tmp as source
- /var as destination
There are already around 500 txt files in the /var directory, and I'd like to move all my files from the /tmp directory (about 200 files) to the /var directory, replacing the ones that already exist with the same name while not touching the ones that are not in /tmp.
Practical example:
/var files: a.txt , b.txt , c.txt , d.txt
/tmp files: a.txt , b.txt
Result: /var files: a.txt(from /tmp) , b.txt(from /tmp) , c.txt , d.txt
Not sure if mv is a proper method to do that, so thank you in advance guys! :)
This can be implemented with rsync.
Refer to the rsync manual for more information.
Local: rsync [OPTION...] SRC... [DEST]
Access via remote shell:
Pull: rsync [OPTION...] [USER@]HOST:SRC... [DEST]
Push: rsync [OPTION...] SRC... [USER@]HOST:DEST
Access via rsync daemon:
Pull: rsync [OPTION...] [USER@]HOST::SRC... [DEST]
rsync [OPTION...] rsync://[USER@]HOST[:PORT]/SRC... [DEST]
Push: rsync [OPTION...] SRC... [USER@]HOST::DEST
rsync [OPTION...] SRC... rsync://[USER@]HOST[:PORT]/DEST
For your case (the question names /tmp/tmp as the source):
rsync -avzh /tmp/tmp/ /var/
If you want to test it first:
rsync -avzh --dry-run /tmp/tmp/ /var/
-a, --archive archive mode;
-v, --verbose increase verbosity
-z, --compress compress file data during the transfer
-h, --human-readable output numbers in a human-readable format
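Since the question asks to move rather than copy, rsync's --remove-source-files option may also be worth a look: it deletes each source file once it has been successfully transferred (it leaves the now-empty source directories behind). A hedged sketch building on the command above:
rsync -avzh --remove-source-files /tmp/tmp/ /var/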
Detailed commentary
-v, --verbose This option increases the amount of information you are given during the transfer. By default, rsync works silently. A single -v will give you information about what files are being transferred and a brief summary at the end. Two -v options will give you information on what files are being skipped and slightly more information at the end. More than two -v options should only be used if you are debugging rsync. Note that the names of the transferred files that are output are done using a default --out-format of "%n%L", which tells you just the name of the file and, if the item is a link, where it points. At the single -v level of verbosity, this does not mention when a file gets its attributes changed. If you ask for an itemized list of changed attributes (either --itemize-changes or adding "%i" to the --out-format setting), the output (on the client) increases to mention all items that are changed in any way. See the --out-format option for more details.
-a, --archive This is equivalent to -rlptgoD. It is a quick way of saying you want recursion and want to preserve almost everything (with -H being a notable omission). The only exception to the above equivalence is when --files-from is specified, in which case -r is not implied. Note that -a does not preserve hardlinks, because finding multiply-linked files is expensive. You must separately specify -H.
--no-OPTION You may turn off one or more implied options by prefixing the option name with "no-". Not all options may be prefixed with a "no-": only options that are implied by other options (e.g. --no-D, --no-perms) or have different defaults in various circumstances (e.g. --no-whole-file, --no-blocking-io, --no-dirs). You may specify either the short or the long option name after the "no-" prefix (e.g. --no-R is the same as --no-relative). For example: if you want to use -a (--archive) but don't want -o (--owner), instead of converting -a into -rlptgD, you could specify -a --no-o (or -a --no-owner). The order of the options is important: if you specify --no-r -a, the -r option would end up being turned on, the opposite of -a --no-r. Note also that the side-effects of the --files-from option are NOT positional, as it affects the default state of several options and slightly changes the meaning of -a (see the --files-from option for more details).
-z, --compress With this option, rsync compresses the file data as it is sent to the destination machine, which reduces the amount of data being transmitted -- something that is useful over a slow connection. Note that this option typically achieves better compression ratios than can be achieved by using a compressing remote shell or a compressing transport because it takes advantage of the implicit information in the matching data blocks that are not explicitly sent over the connection.
-h, --human-readable Output numbers in a more human-readable format. This makes big numbers output using larger units, with a K, M, or G suffix. If this option was specified once, these units are K (1000), M (1000*1000), and G (1000*1000*1000); if the option is repeated, the units are powers of 1024 instead of 1000.
You cannot simply merge two directories using the mv command. You can use either of the following approaches:
1) You can use the cp and rm commands to achieve your purpose (copy from the /tmp/tmp source into /var, then remove the source):
$ cp -r /tmp/tmp/. /var/
$ rm -rf /tmp/tmp
2) Using rsync, you can achieve that too, but rsync doesn't actually "move" as per your question. It rather copies from one location to another (usually remote):
$ rsync -av /tmp/tmp/ /var/
$ rm -r /tmp/tmp
I think this should serve the purpose. Cheers!

Is there a way to download files matching a pattern through SFTP in a shell script?

I'm trying to download multiple files through SFTP on a Linux server using
sftp -o IdentityFile=key <user>@<server> <<END
get -r folder
exit
END
which will download all contents of a folder. It appears that find and grep are invalid commands, as are for loops.
I need to download files having a name containing a string, e.g.
test_0.txt
test_1.txt
but not file.txt
Do you really need the -r switch? Are there really any subdirectories in the folder? You do not mention that.
If there are no subdirectories, you can use a simple get with a file mask:
cd folder
get *test*
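Putting that together with the heredoc invocation from the question (key, user, and server are the question's placeholders):
sftp -o IdentityFile=key <user>@<server> <<END
cd folder
get *test*
exit
END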
Are you required to use sftp? A tool like rsync that operates over ssh has flexible include/exclude options. For example:
rsync -a <user>@<server>:folder/ folder/ \
--include='test_*.txt' --exclude='*.txt'
This requires rsync to be installed on the remote system, but that's very common these days. If rsync isn't available, you could do something similar using tar:
ssh <user>@<server> tar -cf- folder/ | tar -xvf- --wildcards '*/test_*.txt'
This tars up all the files remotely, but then only extracts files matching your target pattern on the receiving side.

copy directory from another computer on Linux

On a computer with IP address like 10.11.12.123, I have a folder document. I want to copy that folder to my local folder /home/my-pc/doc/ using the shell.
I tried like this:
scp -r smb:10.11.12.123/other-pc/document /home/my-pc/doc/
but it's not working.
You can use the command below to copy your files.
scp -r <source> <destination>
(-r: Recursively copy entire directories)
eg:
scp -r user@10.11.12.123:/other-pc/document /home/my-pc/doc
To identify the location you can use the pwd command, eg:
kasun#kasunr:~/Downloads$ pwd
/home/kasun/Downloads
To copy from B to A while logged into B:
scp /source username@A:/destination
To copy from B to A while logged into A:
scp username@B:/source /destination
In addition to the comment, when you look at your host-to-host copy options on Linux today, rsync is by far the most robust solution around. It is brought to you by the SAMBA team[1] and continues to enjoy active development. Most distributions include the rsync package by default (if not, you should find an easily installable package for your distro, or you can download it from rsync.samba.org).
The basic use of rsync for host-to-host directory copy is:
$ rsync -uav srchost:/path/to/dir /path/to/dest
-uav recursively copies only new or changed files (-ua), preserving file and directory times and permissions, while -v provides verbose output. You will be prompted for the username/password on 10.11.12.123 unless you have set up ssh keys to allow public/private key authentication (see: ssh-keygen for key generation).
If you notice, the syntax is basically the same as that for scp, with a slight difference in the options (e.g. scp -rv srchost:/path/to/dir /path/to/dest). rsync will use ssh for secure transport by default, so you will want to ensure sshd is running on your srchost (10.11.12.123 in your case). If you have name resolution working (or a simple entry in /etc/hosts for 10.11.12.123) you can use the hostname for the remote host instead of the remote IP. Regardless, you can always transfer the files you are interested in with:
$ rsync -uav 10.11.12.123:/other-pc/document /home/my-pc/doc/
Note: do NOT include a trailing / after document if you want to copy the directory itself. If you do include a trailing / after document (i.e. 10.11.12.123:/other-pc/document/) you are telling rsync to copy the contents (i.e. the files and directories under document) into /home/my-pc/doc/ without also creating the document directory there.
The reason rsync is far superior to other copy apps is it provides options to truly synchronize filesystems and directory trees both locally and between your local machine and remote host. Meaning, in your case, if you have used rsync to transfer files to /home/my-pc/doc/ and then make changes to the files or delete files on 10.11.12.123, you can simply call rsync again and have the changes/deletions reflected in /home/my-pc/doc/. (look at the several flavors of the --delete option for details in rsync --help or in man 1 rsync)
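As a hedged illustration of that re-sync workflow, using the paths from the question: --delete removes files from the destination that no longer exist on the source, so it is worth testing with a dry run (-n) first.
rsync -uavn --delete 10.11.12.123:/other-pc/document/ /home/my-pc/doc/
rsync -uav --delete 10.11.12.123:/other-pc/document/ /home/my-pc/doc/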
For these, and many more reasons, it is well worth the time to familiarize yourself with rsync. It is an invaluable tool in any Linux user's hip pocket. Hopefully this will solve your problem and get you started.
Footnotes
[1] the same folks that "Opened Windows to a Wider World", allowing seamless connection between Windows/Linux hosts via the native Windows server message block (smb) protocol. samba.org
If the two directories (document and /home/my-pc/doc/) you mentioned are on the same 10.11.12.123 machine,
then:
cp -ai document /home/my-pc/doc/
else (pulling document from the remote 10.11.12.123):
scp -r root@10.11.12.123:document /home/my-pc/doc/

Elegantly send local tarball and untar on remote end

All,
This might be a FAQ, but I can't get my search-fu to find it. Namely, I kind of want to do the "reverse" tar pipe. Usually a tar pipe is used to send a local folder to a remote location as a tarball in a single nice command:
tar zcvf - ~/MyFolder | ssh user@remote "cat > ~/backup/MyFolder.tar.gz"
(I hope I got that right. I typed it from memory.)
I'm wondering about the reverse situation. Let's say I locally have a tarball of a large directory and what I want to do is copy it (rsync? scp?) to a remote machine where it will live as the expanded file, i.e.,:
Local: sourcecode.tar.gz ==> send to Remote and untar ==>
Remote: sourcecode/
I want to do this because the "local" disk has inode pressure so keeping a single bigger file is better than many smaller files. But the remote system is one with negligible inode pressure, and it would be preferable to keep it as an expanded directory.
Now, I can think of various ways to do this with &&-command chaining and the like, but I figure there must be a way to do this with tar-pipes and rsync or ssh/scp that I am just not seeing.
You're most of the way there:
ssh user@remote "tar -C /parent/directory -xz -f-" < sourcecode.tar.gz
Where -f- tells tar to extract from stdin, and the -C flag changes directory before untarring.
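If the tarball is large and you want a progress bar, one variation (assuming pv is installed on the local machine) is to feed the file through pv instead of a plain stdin redirection:
pv sourcecode.tar.gz | ssh user@remote "tar -C /parent/directory -xz -f-"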
