Alternative to scp, transferring files between linux machines by opening parallel connections - linux

Is there an alternative to scp for transferring a large file from one machine to another that opens parallel connections and can also pause and resume the transfer?
Please don't transfer this to serverfault.com. I am not a system administrator. I am a developer trying to transfer past database dumps between backup hosts and servers.
Thank you

You could try using split(1) to break the file apart and then scp the pieces in parallel. The file could then be combined into a single file on the destination machine with 'cat'.
# on local host
split -b 1M large.file large.file. # split into 1MiB chunks
for f in large.file.*; do scp "$f" remote_host: & done
# on remote host
cat large.file.* > large.file
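Since the scp jobs above run in the background, it may also be worth waiting for them to finish and verifying the reassembled file; a minimal sketch using a checksum (same file names as above):
# on local host: wait for the background scp jobs, then ship a checksum of the original
wait
sha256sum large.file > large.file.sha256
scp large.file.sha256 remote_host:
# on remote host: after reassembling with cat, verify against the checksum
sha256sum -c large.file.sha256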

Take a look at rsync to see if it will meet your needs.
The correct placement of questions is not based on your role, but on the type of question. Since this one is not strictly programming related it is likely that it will be migrated.

Similar to Mike K's answer, check out https://code.google.com/p/scp-tsunami/ - it handles splitting the file, starts several scp processes to copy the parts, and then joins them again. It can also copy to multiple hosts.
./scpTsunami.py -v -s -t 9 -b 10m -u dan bigfile.tar.gz /tmp -l remote.host
That splits the file into 10MB chunks and copies them using 9 scp processes...

The program you are after is lftp. It supports sftp and parallel transfers using its pget command. It is available under Ubuntu (sudo apt-get install lftp) and you can read a review of it here:
http://www.cyberciti.biz/tips/linux-unix-download-accelerator.html
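For example, pget can open several connections and resume an interrupted transfer; a sketch (user, host, and paths are placeholders):
# 5 parallel connections (-n 5); -c resumes a partial download
lftp -e 'pget -n 5 -c /remote/path/large.file -o large.file; quit' sftp://user@remote_host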

Related

How to sync two folders between two VMs Linux?

I am a newbie on Linux. I have folder1 on VM1 and folder2 on VM2. How can I configure things so that as soon as I edit folder1, folder2 changes too?
Thanks.
You could consider rsync (https://linux.die.net/man/1/rsync).
However I am using scp, which is "secure copy", with my private key like so:
scp -r -i /home/private_key_file what_to_copy.txt /var/projects/some_folder root@123.123.123.123:/var/projects/where_to_copy_to >> log_file_with_results.log 2>&1
So my VM2 is protected with a private key (/home/private_key_file) and I'm using the user "root" to log in. Hope this helps; that would be the most secure way in my opinion.
I would then run that command from a crontab every minute. I don't know how to do instant sync (yet); I hope one-minute increments are enough for you.
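If you do try rsync as mentioned above, a roughly equivalent command is sketched below (same key, paths, and log file as in the scp example; only the flags differ):
rsync -az -e "ssh -i /home/private_key_file" /var/projects/some_folder/ root@123.123.123.123:/var/projects/where_to_copy_to/ >> log_file_with_results.log 2>&1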
You can mount the same folder on the host machine as drives/folders on the VM guests. This means that a write in VM1 writes into the folder on the host and will also show up on VM2.
How to do this depends on the tool that you use for virtualisation.
You can try rsync + crontab
rsync /path/to/folder1 username@host:/path/to/folder2
You can set this task up in crontab with a short interval, as sketched below.
Don't forget to put the respective SSH public keys into your VMs (~/.ssh/authorized_keys).
It works easily for me.
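A sketch of the crontab entry (edit it with crontab -e; the log path is just an example):
* * * * * rsync -az /path/to/folder1/ username@host:/path/to/folder2/ >> /tmp/folder-sync.log 2>&1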
Beyond rsync as the other answers have mentioned, you can consider mounting the directory on other VMs using the Network File System (NFS). The following article documents the steps for installing and configuring NFS quite well.
If you are new to Linux, I would advise for you to snapshot the VMs before making any changes so that you will be able to revert to an earlier snapshot in the event that something breaks.
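A rough sketch of the NFS approach (host names and paths are placeholders, and the NFS server/client packages are assumed to be installed):
# on VM1: export folder1 in /etc/exports, then re-export
echo '/home/user/folder1 vm2(rw,sync,no_subtree_check)' >> /etc/exports
exportfs -ra
# on VM2: mount the export where folder2 should live
mount -t nfs vm1:/home/user/folder1 /home/user/folder2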

How to connect to multiple servers to run the same query?

I have 4 servers where we have log files in the same pattern. For every search/query I need to log in to all the servers one by one and execute the command.
Is it possible to provide some command so that it will log in to all those servers one by one automatically and fetch the output from each server?
What configuration, settings, etc. do I have to do to make this work?
I am new to the Linux domain.
As suggested in your question comments, there are a number of tools to help you in performing a task on multiple machines. I will add to this list and suggest Ansible. It is designed to perform all of the interactions over ssh, in quite a simple manner, and with very little configuration.
https://github.com/ansible/ansible
If you were to have server-1 and server-2 defined in your ~/.ssh/config file, then the ansible configuration would be as simple as
[myservers]
server-1
server-2
Then to run a command on the group
$ ansible myservers -a uptime
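For the log-search use case in the question, an ad-hoc command could look like this (the log path is just an example); ansible prefixes each host's output with its name:
$ ansible myservers -a "grep 'intrusion attempt' /var/log/firewall.log"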
If your servers are called eenie, meanie, minie, and moe, you simply do
for server in eenie meanie minie moe; do
ssh "$server" grep 'intrusion attempt' /var/log/firewall.log
done
The grep command won't reveal from which server it is reporting a result; maybe replace it with ssh "$server" sed -n "/intrusion attempt/s/^/$server: /p" /var/log/firewall.log
Use https://sealion.com. You just have to execute one script and it will install the agent in your servers and start collecting output. It has a convenient web interface to see the output across all your servers.

Shell: Get The Data From Remote Host And Execute Some Other Commands

I need to create a shell script to do this:
ssh to another remote host
use sqlplus on that host and the spool command to get the data from the Oracle db into a file
transfer the file from that host to my host
execute another shell script to process the data file
I have finished the shell script for the 4th step. For now I have to do these 4 steps one by one. I want to create a script that does them all. Is that possible? How do I transfer the data from one host to my host?
I think maybe the db file is not necessary.
Note: I have to ssh to another host to use sqlplus. It is the only host which has permission to access the database.
# steps 1 and 2
ssh remote_user@remote_host 'sqlplus db_user/db_pass@db @sql_script_that_spools'
# step 3
scp remote_user@remote_host:/path/to/spool_file local_file
# step 4
process local_file
Or
# steps 1, 2 and 3
ssh remote_user@remote_host 'sqlplus db_user/db_pass@db @sql_script_no_spool' > local_file
# step 4
process local_file
Or, all in one:
ssh remote_user@remote_host 'sqlplus db_user/db_pass@db @sql_script_no_spool' |
process_stdin
Well Glenn pretty much summed it all up.
In order to make your life easier, you may want to also consider setting up passwordless ssh. There is a slightly higher security risk associated with this, but in many cases the risk is negligible.
Here is a link to a good tutorial. It is a Debian-based tutorial, but the commands given should work the same on most major Linux distros.
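The setup is essentially two commands; a sketch (key type and remote user/host are up to you):
# generate a key pair; leave the passphrase empty for fully passwordless logins
ssh-keygen -t ed25519
# append the public key to ~/.ssh/authorized_keys on the remote host
ssh-copy-id remote_user@remote_host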

multiple wget -r a site simultaneously?

Is there any command, or wget with the right options, for a multithreaded download of a site, recursively and simultaneously?
I found a decent solution.
Read original at http://www.linuxquestions.org/questions/linux-networking-3/wget-multi-threaded-downloading-457375/
wget -r -np -N [url] &
wget -r -np -N [url] &
wget -r -np -N [url] &
wget -r -np -N [url] &
copied as many times as you deem fitting to have as many processes
downloading. This isn't as elegant as a properly multithreaded app,
but it will get the job done with only a slight amount of overhead.
the key here being the "-N" switch. This means transfer the file only
if it is newer than what's on the disk. This will (mostly) prevent
each process from downloading the same file a different process
already downloaded, but skip the file and download what some other
process hasn't downloaded. It uses the time stamp as a means of doing
this, hence the slight overhead.
It works great for me and saves a lot of time. Don't have too many
processes as this may saturate the web site's connection and tick off
the owner. Keep it around a max of 4 or so. However, the number is
only limited by CPU and network bandwidth on both ends.
Using wget in parallel via xargs, this solution seems much better:
https://stackoverflow.com/a/11850469/1647809
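The idea there, roughly, is to let xargs fan out the wget processes over a list of URLs; a sketch assuming a urls.txt file with one start URL per line:
# run up to 4 wget processes in parallel, one URL each
xargs -n 1 -P 4 wget -r -np -N < urls.txt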
Use axel to download with multiple connections:
apt-get install axel
axel http://example.com/file.zip
Well, you can always run multiple instances of wget, no?
Example:
wget -r http://somesite.example.org/ &
wget -r http://othersite.example.net/ &
etc. This syntax will work in any Unix-like environment (e.g. Linux or MacOS); not sure how to do this in Windows.
Wget itself does not support multithreaded operations - at least, neither the manpage nor its website has any mention of this. Anyway, since wget supports HTTP keepalive, the bottleneck is usually the bandwidth of the connection, not the number of simultaneous downloads.

Remote linux server to remote linux server large sparse files copy - How To?

I have two twin CentOS 5.4 servers with VMware Server installed on each.
What is the most reliable and fast method for copying virtual machine files from one server to the other, assuming that I always use sparse files for my VMware virtual machines?
The VM files are a pain to copy since they are very large (50 GB), but since they are sparse files I think something can be done to improve the speed of the copy.
If you want to copy large data quickly, rsync over SSH is not for you. As running an rsync daemon for quick one-shot copying is also overkill, yer olde tar and nc do the trick as follows.
Create the process that will serve the files over network:
tar cSf - /path/to/files | nc -l 5000
Note that it may take tar a long time to examine sparse files, so it's normal to see no progress for a while.
And receive the files with the following at the other end:
nc hostname_or_ip 5000 | tar xSf -
Alternatively, if you want to get all fancy, use pv to display progress:
tar cSf - /path/to/files \
| pv -s `du -sb /path/to/files | awk '{ print $1 }'` \
| nc -l 5000
Wait a little until you see that pv reports that some bytes have passed by, then start the receiver at the other end:
nc hostname_or_ip 5000 | pv -btr | tar xSf -
Have you tried rsync with the option --sparse (possibly over ssh)?
From man rsync:
Try to handle sparse files efficiently so they take up less
space on the destination. Conflicts with --inplace because it’s
not possible to overwrite data in a sparse fashion.
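A sketch of what that could look like for the VM directory (paths and host are placeholders):
rsync -av --sparse /path/to/vms/ user@other-server:/path/to/vms/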
Since rsync is terribly slow at copying sparse files, I usually resort to using tar over ssh:
tar Scjf - my-src-files | ssh sylvain@my.dest.host tar Sxjf - -C /the/target/directory
You could have a look at http://www.barricane.com/virtsync
(Disclaimer: I am the author.)
