Linux - moving files from source to destination with overwrite

Firstly I'd like to apologize if I duplicated a question which already exists, but it's too important for me to screw this up.
I have two directories on my linux server:
- /tmp/tmp as source
- /var as destination
There are already about 500 txt files in the /var directory, and I'd like to move all my files from the /tmp/tmp directory (about 200 files) to the /var directory, replacing the ones that already exist under the same name while not touching the ones that are not in /tmp/tmp.
Practical example:
/var files: a.txt , b.txt , c.txt , d.txt
/tmp files: a.txt , b.txt
Result: /var files: a.txt(from /tmp) , b.txt(from /tmp) , c.txt , d.txt
Not sure if mv is a proper method to do that, so thank you in advance guys! :)

This can be implemented through rsync.
Refer to the rsync manual for more information.
Local: rsync [OPTION...] SRC... [DEST]
Access via remote shell:
Pull: rsync [OPTION...] [USER@]HOST:SRC... [DEST]
Push: rsync [OPTION...] SRC... [USER@]HOST:DEST
Access via rsync daemon:
Pull: rsync [OPTION...] [USER@]HOST::SRC... [DEST]
      rsync [OPTION...] rsync://[USER@]HOST[:PORT]/SRC... [DEST]
Push: rsync [OPTION...] SRC... [USER@]HOST::DEST
      rsync [OPTION...] SRC... rsync://[USER@]HOST[:PORT]/DEST
For your case (source /tmp/tmp, destination /var):
rsync -avzh /tmp/tmp/ /var/
If you want to test it first:
rsync -avzh --dry-run /tmp/tmp/ /var/
-a, --archive archive mode;
-v, --verbose increase verbosity
-z, --compress compress file data during the transfer
-h, --human-readable output numbers in a human-readable format
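Note that rsync copies rather than moves; adding the standard --remove-source-files option makes it delete each source file after a successful transfer (empty source directories are left behind). For a flat directory of files, plain mv -f also gives exactly the merge-with-overwrite the question asks for. A sandboxed sketch, with temporary directories standing in for /tmp/tmp and /var:

```shell
#!/bin/sh
# Temporary directories stand in for /tmp/tmp (source) and /var (destination).
src=$(mktemp -d); dst=$(mktemp -d)

printf 'new a\n' > "$src/a.txt"; printf 'new b\n' > "$src/b.txt"
printf 'old a\n' > "$dst/a.txt"; printf 'old b\n' > "$dst/b.txt"
printf 'old c\n' > "$dst/c.txt"; printf 'old d\n' > "$dst/d.txt"

# mv -f overwrites same-named files in the destination without prompting
# and leaves everything else in the destination untouched.
mv -f "$src"/* "$dst"/

cat "$dst/a.txt"   # "new a" - replaced from the source
cat "$dst/c.txt"   # "old c" - untouched
```

The rsync equivalent of a true move would be rsync -avzh --remove-source-files /tmp/tmp/ /var/.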
Detailed commentary
-v, --verbose This option increases the amount of information you are given during the transfer. By default, rsync works silently. A single
-v will give you information about what files are being transferred and a brief summary at the end. Two -v options will give you
information on what files are being skipped and slightly more
information at the end. More than two -v options should only be used
if you are debugging rsync. Note that the names of the transferred
files that are output are done using a default --out-format of
"%n%L", which tells you just the name of the file and, if the item
is a link, where it points. At the single -v level of verbosity, this
does not mention when a file gets its attributes changed. If you ask
for an itemized list of changed attributes (either --itemize-changes
or adding "%i" to the --out-format setting), the output (on the
client) increases to mention all items that are changed in any way.
See the --out-format option for more details.
-a, --archive This is equivalent to -rlptgoD. It is a quick way of saying you want recursion and want to preserve almost everything (with
-H being a notable omission). The only exception to the above equivalence is when --files-from is specified, in which case -r is not
implied. Note that -a does not preserve hardlinks, because finding
multiply-linked files is expensive. You must separately specify -H.
--no-OPTION You may turn off one or more implied options by prefixing the option name with "no-". Not all options may be prefixed with a
"no-": only options that are implied by other options (e.g. --no-D,
--no-perms) or have different defaults in various circumstances (e.g. --no-whole-file, --no-blocking-io, --no-dirs). You may specify either the short or the long option name after the "no-" prefix (e.g.
--no-R is the same as --no-relative). For example: if you want to use -a (--archive) but don't want
-o (--owner), instead of converting -a into -rlptgD, you could specify -a --no-o (or -a --no-owner). The order of the options is important: if you specify --no-r -a, the
-r option would end up being turned on, the opposite of -a --no-r. Note also that the side-effects of the --files-from option are NOT
positional, as it affects the default state of several options and
slightly changes the meaning of -a (see the --files-from option for
more details).
-z, --compress With this option, rsync compresses the file data as it is sent to the destination machine, which reduces the amount of data
being transmitted -- something that is useful over a slow connection.
Note that this option typically achieves better compression ratios
than can be achieved by using a compressing remote shell or a
compressing transport because it takes advantage of the implicit
information in the matching data blocks that are not explicitly sent
over the connection.
-h, --human-readable Output numbers in a more human-readable format. This makes big numbers output using larger units, with a K, M, or G
suffix. If this option was specified once, these units are K (1000), M
(1000*1000), and G (1000*1000*1000); if the option is repeated, the
units are powers of 1024 instead of 1000.

You cannot simply merge two directories using the mv command. You can use either of the following ways:
1) You can use the cp and rm commands to achieve your purpose:
$ cp -r /tmp/tmp/. /var
$ rm -r /tmp/tmp
2) Using rsync you can achieve the same, but rsync doesn't actually "move" as per your question. It rather copies from one location to another (often remote):
$ rsync -av /tmp/tmp/ /var/
$ rm -r /tmp/tmp
I think this should serve the purpose. Cheers!

Related

Bash Scripting with xargs to BACK UP files

I need to copy a file from multiple locations to the BACK UP directory while retaining its directory structure. For example, I have a file "a.txt" at the following locations: /a/b/a.txt /a/c/a.txt a/d/a.txt a/e/a.txt. I now need to copy this file from those locations to the backup directory /tmp/backup. The end result should be:
When I list /tmp/backup/a, it should contain /b/a.txt /c/a.txt /d/a.txt & /e/a.txt.
For this, I had used the command: echo /a/*/a.txt | xargs -I {} -n 1 sudo cp --parent -vp {} /tmp/backup. This throws the error "cp: cannot stat '/a/b/a.txt /a/c/a.txt a/d/a.txt a/e/a.txt': No such file or directory".
The -I option is taking the complete input from echo instead of individual values (like -n 1 does). If someone can help debug this issue, that would be very helpful, instead of providing an alternative command.
Use rsync with the --relative (-R) option to keep (parts of) the source paths.
I've used a wildcard for the source to match your example command rather than the explicit list of directories mentioned in your question.
rsync -avR /a/*/a.txt /tmp/backup/
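To address the debugging request directly: echo prints all the matched paths on a single line, and -I {} makes xargs substitute one whole line at a time, so cp received the entire string as one filename. Emitting one path per line fixes it. A sandboxed sketch, with a temporary tree standing in for /a/*/a.txt and /tmp/backup (note the full GNU spelling --parents, and no sudo needed in the sandbox):

```shell
#!/bin/sh
# Temporary tree stands in for /a/*/a.txt and /tmp/backup.
base=$(mktemp -d)
mkdir -p "$base/a/b" "$base/a/c" "$base/backup"
echo hi > "$base/a/b/a.txt"
echo hi > "$base/a/c/a.txt"

# printf '%s\n' emits ONE path per line, which is what -I {} consumes;
# echo had emitted every path on a single line, so cp saw one giant filename.
printf '%s\n' "$base"/a/*/a.txt |
    xargs -I {} cp --parents -p {} "$base/backup"

find "$base/backup" -name a.txt   # one copy per source directory, paths preserved
```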
Do the backups need to be exactly the same as the originals? In most cases, I'd prefer a little compression. [tar](https://man7.org/linux/man-pages/man1/tar.1.html) does a great job of bundling things including the directory structure.
tar cvzf /path/to/backup/tarball.tgz /source/path/
tar can't update compressed archives, so you can skip the compression
tar uf /path/to/backup/tarball.tar /source/path/
This gives you versioning of a sort, since it only updates changed files but keeps both the before and after versions.
If you have time and cycles and still want the compression, you can decompress before and recompress after.
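That decompress/update/recompress cycle might look like this (a sandboxed sketch; the directory names are illustrative, and the archive is named .tar.gz so gzip round-trips the suffix cleanly):

```shell
#!/bin/sh
# Decompress, append changed files with tar's update mode, recompress.
# Sandboxed stand-ins for /path/to/backup and /source/path.
base=$(mktemp -d); cd "$base" || exit 1
mkdir source
echo v1 > source/report.txt
touch -d '2020-01-01' source/report.txt   # pretend it was written last week
tar czf backup.tar.gz source              # initial compressed backup

gunzip backup.tar.gz                      # yields backup.tar
echo v2 > source/report.txt               # the file changes; mtime is now current
tar uf backup.tar source                  # -u appends files newer than the archive copy
gzip backup.tar                           # back to backup.tar.gz

tar tzf backup.tar.gz                     # report.txt is listed twice: old and new copies
```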

How can I download all the files from a remote directory to my local directory?

I want to download all the files in a specific directory of my site.
Let's say I have 3 files in my remote SFTP directory
www.site.com/files/phone/2017-09-19-20-39-15
a.txt
b.txt
c.txt
My goal is to create a local folder on my desktop with ONLY those downloaded files. No parent files or parent directories needed. I am trying to get a clean report.
I've tried
wget -m --no-parent -l1 -nH -P ~/Desktop/phone/ www.site.com/files/phone/2017-09-19-20-39-15 --reject=index.html* -e robots=off
I got (screenshot omitted) the whole parent directory structure mirrored under ~/Desktop/phone/.
I want to get (screenshot omitted) just the files, with no parent directories.
How do I tweak my wget command to get something like that?
Should I use anything else other than wget ?
Ihue,
Taking a shell-programmatic perspective, I would recommend you try the following command-line script; note I also added the citation so you can see the original thread.
wget -r -P ~/Desktop/phone/ -A txt www.site.com/files/phone/2017-09-19-20-39-15 --reject=index.html* -e robots=off
-r enables recursive retrieval. See Recursive Download for more information.
-P sets the directory prefix where all files and directories are saved to.
-A sets a whitelist for retrieving only certain file types. Strings and patterns are accepted, and both can be used in a comma separated list. See Types of Files for more information.
Ref: #don-joey
https://askubuntu.com/questions/373047/i-used-wget-to-download-html-files-where-are-the-images-in-the-file-stored

Verify copied data between Windows & Linux shares?

I just copied a ton of data from a machine running Windows 7 Ultimate to a server running Ubuntu Server LTS 10.04. I used the robocopy utility via PowerShell to accomplish this task, but I couldn't find any information online regarding whether Robocopy verifies the copied file's integrity once it is copied to the server.
First of all, does anyone know if this is done inherently? There is no switch that explicitly allows you to add verification to a file transfer.
Second, if it doesn't or there is uncertainty about whether or not it does, what would be the simplest method to accomplish this for multiple directories with several files/sub-directories?
Thanks!
The easiest mechanism I know would rely on an md5sum and Unix-like find utility on your Windows machine.
You can generate a manifest file of filenames / md5sums:
find /source -type f -exec md5sum {} \; > MD5SUM
Copy the MD5SUM file to your Linux machine, and then run:
cd /path/to/destination
md5sum --quiet -c MD5SUM
You'll get a list of files that fail:
$ md5sum --quiet -c /tmp/MD5SUM
/etc/aliases: FAILED
md5sum: WARNING: 1 of 341 computed checksums did NOT match
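Put together, the round trip looks like this (a sandboxed sketch; temporary directories stand in for the Windows source and the Linux destination, and the manifest uses relative paths so it can be checked from any directory):

```shell
#!/bin/sh
# $src stands in for the Windows source share, $dst for the Linux destination.
src=$(mktemp -d); dst=$(mktemp -d)
echo data > "$src/a.txt"; echo more > "$src/b.txt"
cp -p "$src"/* "$dst"/                            # the "copy" being verified

# Build the manifest with relative paths, writing it outside the tree
# so find doesn't checksum the half-written manifest itself.
(cd "$src" && find . -type f -exec md5sum {} \;) > "$dst/MD5SUM"

# Verify on the destination side; --quiet prints only failures.
(cd "$dst" && md5sum --quiet -c MD5SUM && echo 'all checksums match')
```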
A much easier method is to use the Unix utilities diff and rsync.
With diff you can compare two files, but you can also compare two directories. With diff I would recommend this command:
diff -r source/directory/ destination/directory
-r forces diff to analyse directories recursively.
The second option is to use rsync, which is meant to sync files or directories, but with the -n option you can also use it to analyse differences between directories. rsync also works even when the files are not on the same host: one side can be a remote host, which you can access over ssh. Plus, rsync is really flexible, with its many options available.
rsync -avn --itemize-changes --progress --stats source/directory/ destination/directory/
-n makes rsync do a "dry run", meaning it makes no changes on the destination
-a archive mode: includes recursion and preserves symbolic links, file permissions, file timestamps, file owner and file group
-v increase verbosity
--itemize-changes output a change-summary for all updates
Here you can find even more ways how to compear directories:
https://askubuntu.com/questions/12473/file-and-directory-comparison-tool
On rsync wikipedia page you can find windows alternative programs for rsync

Can rsync verify contents before syncing

Can rsync be configured to verify the contents of the files before they are synced? I have heard about checksums, but I came to know that the checksum only does a sampling. I want to transfer a file only if its contents have changed and not its timestamp; is there a way to do it with any of the rsync modes?
In my scenario, say a file sample.text will be created every week and I want to sync it with a remote server only if the contents of sample.text have changed. Since it is created every week, the timestamp would obviously change, but I want the transfer only on a content change.
Yes:
$ man rsync | grep "\--checksum"
-c, --checksum skip based on checksum, not mod-time & size
Rsync is pretty complicated. I recommend cuddle time with the man page and experimentation with test data before using it for anything remotely important.
Most of the time, people use rsync -ac source dest.
$ man rsync | grep "\--archive"
-a, --archive archive mode; same as -rlptgoD (no -H)
And that -rlptgoD garbage means: recursive (r), copy symlinks as symlinks (l), preserve permissions (p), preserve times (t), preserve group (g), preserve owner (o), preserve device files (D, super-user only), preserve special files (also part of D).
The -c or --checksum is really what you are looking for (skip based on checksum, not mod-time & size). Your supposition that rsync only samples mtime and size is wrong.
See the --checksum option on the rsync man page.
Also, the --size-only option will be a faster choice if you know for sure that a change of contents also means a change of size.
Example:
#!/bin/bash
# The redirection below truncates diff.txt, so no separate initialization is needed.
rsync -rvcn --delete /source/ /destination/ > diff.txt
second_row=$(sed -n '2p' diff.txt)
if [ -z "$second_row" ]; then
    echo "there was no change"
else
    echo "there was a change"
fi

How can I recursively copy a directory into another and replace only the files that have not changed?

I am looking to do a specific copy in Fedora.
I have two folders:
'webroot': holding ALL web files/images etc
'export': folder containing thousands of PHP, CSS, JS documents that are exported from my SVN repo.
The export directory contains many of the same files/folders that the root does, however the root contains additional ones not found in export.
I'd like to merge all of the contents of export with my webroot with the following options:
- Overwriting the file in webroot if export's version contains different code than what is inside of webroot's version (live)
- Preserve the permissions/users/groups of the file if it is overwritten (the export version replacing the live version). NOTE: I would like the webroot's permissions/ownership maintained, but with export's contents
- No prompting/stopping of the copy of any kind (i.e. not verbose)
- Recursive copy - obviously I would like to copy all files, folders and subfolders found in export
I've done a bit of research into cp - would this do the job?:
cp -pruf ./export /path/to/webroot
It might, but any time the corresponding files in export and webroot have the same content but different modification times, you'd wind up performing an unnecessary copy operation. You'd probably get slightly smarter behavior from rsync:
rsync -pr ./export/ /path/to/webroot
(the trailing slash on export/ makes rsync copy the directory's contents into webroot rather than creating webroot/export)
Besides, rsync can copy files from one host to another over an SSH connection, if you ever have a need to do that. Plus, it has a zillion options you can specify to tweak its behavior - look in the man page for details.
EDIT: with respect to your clarification about what you mean by preserving permissions: you'd probably want to leave off the -p option.
-u overwrites an existing file only if the destination copy is older than the source
-p preserves the permissions and dates
-f forces the copy by removing destination files that cannot be opened
-r makes the copy recursive
So it looks like you have all the correct args to cp.
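A sandboxed sketch of what those flags do in practice, with temporary directories standing in for ./export and /path/to/webroot (file names are illustrative):

```shell
#!/bin/sh
# Demonstrate cp -pruf update semantics: -u only overwrites when the source
# copy is newer; files that exist only in the destination are left alone.
src=$(mktemp -d); dst=$(mktemp -d)

echo 'new code'   > "$src/changed.php"                 # newer in the export
echo 'old export' > "$src/stale.php"
touch -d '2020-01-01' "$src/stale.php"                 # older in the export

echo 'live code'  > "$dst/changed.php"
touch -d '2020-01-01' "$dst/changed.php"
echo 'live newer' > "$dst/stale.php"
echo 'extra'      > "$dst/only-live.php"

cp -pruf "$src"/. "$dst"/

cat "$dst/changed.php"    # "new code"   - overwritten, export copy was newer
cat "$dst/stale.php"      # "live newer" - kept, live copy was newer
cat "$dst/only-live.php"  # "extra"      - untouched, only exists live
```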
Sounds like a job for cpio (and GNU tar can probably do it too):
cd export
find . -print | cpio -pvdm /path/to/webroot
If you need owners preserved, you have to do it as root, of course. The -p option is 'pass mode', meaning copy between locations; -v is verbose (but not interactive; there's a difference); -d means create directories as necessary; -m means preserve modification time. By default, without the -u option, cpio won't overwrite files in the target area that are newer than the one from the source area.
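The GNU tar equivalent hinted at above might look like the following sketch, using --keep-newer-files to mirror cpio's default of not overwriting targets that are newer than the source copy (sandboxed, with temporary directories standing in for export and the webroot):

```shell
#!/bin/sh
# Pipe a tar stream from the export into the webroot; --keep-newer-files
# skips extracting over destination files newer than their archive copies.
base=$(mktemp -d)
mkdir "$base/export" "$base/webroot"

echo 'from export' > "$base/export/index.php"
echo 'stale live'  > "$base/webroot/index.php"
touch -d '2020-01-01' "$base/webroot/index.php"   # live copy is older: replaced

echo 'old export'  > "$base/export/keep.php"
touch -d '2020-01-01' "$base/export/keep.php"
echo 'newer live'  > "$base/webroot/keep.php"     # live copy is newer: kept

(cd "$base/export" && tar cf - .) |
    (cd "$base/webroot" && tar xpf - --keep-newer-files)

cat "$base/webroot/index.php"   # "from export"
cat "$base/webroot/keep.php"    # "newer live"
```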
