Elegantly send local tarball and untar on remote end - linux

All,
This might be a FAQ, but I can't get my search-fu to find it. Namely, I kind of want to do the "reverse" tar pipe. Usually a tar pipe used to send a local folder to a remote location as a tar ball in a single nice command:
tar zcvf - ~/MyFolder | ssh user#remote "cat > ~/backup/MyFolder.tar.gz"
(I hope I got that right. I typed it from memory.)
I'm wondering about the reverse situation. Let's say I locally have a tarball of a large directory and what I want to do is copy it (rsync? scp?) to a remote machine where it will live as the expanded file, i.e.,:
Local: sourcecode.tar.gz ==> send to Remote and untar ==>
Remote: sourcecode/
I want to do this because the "local" disk has inode pressure so keeping a single bigger file is better than many smaller files. But the remote system is one with negligible inode pressure, and it would be preferable to keep it as an expanded directory.
Now, I can think of various ways to do this with &&-command chaining and the like, but I figure there must be a way to do this with tar-pipes and rsync or ssh/scp that I am just not seeing.

You're most of the way there:
ssh user#remote "tar -C /parent/directory -xz -f-" < sourcecode.tar.gz
Where -f- tells tar to extract from stdin, and the -C flag changes directory before untarring.

Related

Bash Scripting with xargs to BACK UP files

I need to copy a file from multiple locations to the BACK UP directory by retaining its directory structure. For example, I have a file "a.txt" at the following locations /a/b/a.txt /a/c/a.txt a/d/a.txt a/e/a.txt, I now need to copy this file from multiple locations to the backup directory /tmp/backup. The end result should be:
when i list /tmp/backup/a --> it should contain /b/a.txt /c/a.txt /d/a.txt & /e/a.txt.
For this, I had used the command: echo /a/*/a.txt | xargs -I {} -n 1 sudo cp --parent -vp {} /tmp/backup. This is throwing the error "cp: cannot stat '/a/b/a.txt /a/c/a.txt a/d/a.txt a/e/a.txt': No such file or directory"
-I option is taking the complete input from echo instead of individual values (like -n 1 does). If someone can help debug this issue that would be very helpful instead of providing an alternative command.
Use rsync with the --relative (-R) option to keep (parts of) the source paths.
I've used a wildcard for the source to match your example command rather than the explicit list of directories mentioned in your question.
rsync -avR /a/*/a.txt /tmp/backup/
Do the backups need to be exactly the same as the originals? In most cases, I'd prefer a little compression. [tar](https://man7.org/linux/man-pages/man1/tar.1.html) does a great job of bundling things including the directory structure.
tar cvzf /path/to/backup/tarball.tgz /source/path/
tar can't update compressed archives, so you can skip the compression
tar uf /path/to/backup/tarball.tar /source/path/
This gives you versioning of a sort, as if only updates changed files, but keeps the before and after versions, both.
If you have time and cycles and still want the compression, you can decompress before and recompress after.

Is there a way to download files matching a pattern trough SFTP on shell script?

I'm trying to download multiple files trough SFTP on a linux server using
sftp -o IdentityFile=key <user>#<server><<END
get -r folder
exit
END
which will download all contents on a folder. It appears that find and grep are invalid commands, so are for loops.
I need to download files having a name containing a string e.g.
test_0.txt
test_1.txt
but no file.txt
Do you really need the -r switch? Are there really any subdirectories in the folder? You do not mention that.
If there are no subdirectories, you can use a simple get with a file mask:
cd folder
get *test*
Are you required to use sftp? A tool like rsync that operates over ssh has flexible include/exclude options. For example:
rsync -a <user>#<server>:folder/ folder/ \
--include='test_*.txt' --exclude='*.txt'
This requires rsync to be installed on the remote system, but that's very common these days. If rsync isn't available, you could do something similar using tar:
ssh <user>#<server> tar -cf- folder/ | tar -xvf- --wildcards '*/test_*.txt'
This tars up all the files remotely, but then only extracts files matching your target pattern on the receiving side.

bash: sending compressed files while compressing others

i have a simple bash script to download a lot of logs files over pretty slow network. i can compress logs on the remote side. basically it's:
ssh: compress whole directory
scp: download archive
ssh: rm archive
using lzma gives great compression but compressing the whole directory is slow. is there any tool or easy way to write a script that allows me to compress a single files (or a bunch of files) and start downloading them while other files/chunks are still being compressed? i was thinking about launching compressing for every single file in the background and in the loop downloading/rsync files with correct extension. but then i don't know how to check if compressing process finished its work
The easiest way would be to compress them in transit using ssh -C. However, if you have a large number of small files, you are better off tarring and gzip/bzipping the whole directory at once using tar zcf or tar jcf. You may be able to start downloading the file while it's still being written, though I haven't tried it.
best solution i found here. in my case it was:
ssh -T user#example.com 'tar ... | lzma -5 -' > big.compressed
Try sshing into your server and going to the log directory and using GNU Parallel to compress all the logs in parallel and as each one is compressed, change its name to add the .done suffix so you can do rsync. So, on the server you would run:
cd <LOG DIRECTORY>
rm ALL_COMPRESSED.marker
parallel 'lzma {}; mv {}.lzma {}.lzma.done' ::: *.log
touch ALL_COMPRESSED.marker

How to create a Linux compatible zip archive of a directory on a Mac

I've tried multiple ways of creating a zip or a tar.gz on the mac using GUI or command lines, and I have tried decompressing on the Linux side and gotten various errors, from things like "File.XML" and "File.xml" both appearing in a directory, to all sorts of others about something being truncated, etc.
Without listing all my experiments on the command line on the Mac and Linux (using tcsh), what should 2 bullet proof commands be to:
1) make a zip file of a directory (with no __MACOSX folders)
2) unzip / untar (whatever) the Mac zip on Linux with no errors (and no __MACOSX folders)
IT staff on the Linux side said they "usually use .gz and use gzip and gunzip commands".
Thanks!
After much research and experimentation, I found this works every time:
1) Create a zipped tar file with this command on the Mac in Terminal:
tar -cvzf your_archive_name.tar.gz your_folder_name/
2) When you FTP the file from one server to another, make sure you do so with binary mode turned on
3) Unzip and untar in two steps in your shell on the Linux box (in this case, tcsh):
gunzip your_archive_name.tar.gz
tar -xvf your_archive_name.tar
On my Mac and in ssh bash I use the following simple commands:
Create Zip File (-czf)
tar -czf NAME.tgz FOLDER
Extract Zip File (-xzf)
tar -xzf NAME.tgz
Best, Mike
First off, the File.XML and File.xml cannot both appear in an HFS+ file system. It is possible, but very unusual, for someone to format a case-sensitive HFSX file system that would permit that. Can you really create two such files and see them listed separately?
You can use the -X option with zip to prevent resource forks and extended attributes from being saved. You can also throw in a -x .DS_Store to get rid of those files as well.
For tar, precede it with COPYFILE_DISABLE=true or setenv COPYFILE_DISABLE true, depending on your shell. You can also throw in an --exclude=.DS_Store.
Your "IT Staff" gave you a pretty useless answer, since gzip can only compress one file. gzip has to be used in combination with tar to archive a directory.

Verify copied data between Windows & Linux shares?

I just copied a ton of data from a machine running Windows 7 Ultimate to a server running Ubuntu Server LTS 10.04. I used the robocopy utility via PowerShell to accommplish this task, but I couldn't find any informaiton online regarding whether Robocopy verifies the copied file's integrity once it is copied to the server.
First of all, does anyone know if this is done inherently? There is no switch that explicitly allows you to add verification to a file transfer.
Second, if it doesn't or there is uncertainty about whether or not it does, what would be the simplest method to accomplish this for multiple directories with several files/sub-directories?
Thanks!
The easiest mechanism I know would rely on an md5sum and Unix-like find utility on your Windows machine.
You can generate a manifest file of filenames / md5sums:
find /source -type f -exec md5sum {} \; > MD5SUM
Copy the MD5SUM file to your Linux machine, and then run:
cd /path/to/destination
md5sum --quiet -c MD5SUM
You'll get a list of files that fail:
$ md5sum --quiet -c /tmp/MD5SUM
/etc/aliases: FAILED
md5sum: WARNING: 1 of 341 computed checksums did NOT match
Much easier method is to use unix uilities diff and rsync
With diff you can compare two file but you can compare also two directories. With diff I would recomend this command:
diff -r source/directory/ destination/directory
-r forces diff to analyse directorysor recursively
The second option is to use rsync which is ment to sync files or directories but with -n option you can use it also to analyes differencies between directories. rsync also works even when the files are not on the same host it means on one there can be remote host and you can acess it even with ssh. Pluss rsync is really flexible with its meny options avaialbe
rsync -avn --itemize-changes --progress --stats source/directory/ destination/directory/
-n option makes rsync do a "dry-run", meaning it makes no changes on the
-a otion includes in it "Recursive mode,symbolic links,file permissions, file timestamps, file owner parameter, file group paratmeter
-v increase verbosity
--itemize-changes output a change-summary for all updates
Here you can find even more ways how to compear directories:
https://askubuntu.com/questions/12473/file-and-directory-comparison-tool
On rsync wikipedia page you can find windows alternative programs for rsync

Resources