scp: how to find out that copying was finished - linux

I'm using scp command to copy file from one Linux host to another.
I run scp commend on host1 and copy file from host1 to host2. File is quite big and it takes for some time to copy it.
On host2 file appears immediately as soon as copying was started. I can do everything with this file even if copying is still in progress.
Is there any reliable way to find out if copying was finished or not on host2?

Off the top of my head, you could do something like:
touch tinyfile
scp bigfile tinyfile user#host:
Then when tinyfile appears you know that the transfer of bigfile is complete.
As pointed out in the comments, this assumes that scp will copy the files one by one, in the order specified. If you don't trust it, you could do them one by one explicitly:
scp bigfile user#host:
scp tinyfile user#host:
The disadvantage of this approach is that you would potentially have to authenticate twice. If this were an issue you could use something like ssh-agent.

On sending side (host1) use script like this:
#!/bin/bash
echo 'starting transfer'
scp FILE USER#DST_SERVER:DST_PATH
OUT=$?
if [ $OUT = 0 ]; then
echo 'transfer successful'
touch successful
scp successful USER#DST_SERVER:DST_PATH
else
echo 'transfer faild'
fi
On receiving side (host2) make script like this:
#!/bin/bash
SLEEP_TIME=30
MAX_CNT=10
CNT=0
while [[ ! -e successful && $CNT < $MAX_CNT ]]; do
((CNT++))
sleep($SLEEP_TIME);
done;
if [[ -e successful ]]; then
echo 'successful'
rm successful
# do somethning with FILE
fi
With CNT and MAX_CNT you disable endless loop (in case file successful isn't transferred).
Product MAX_CNT and SLEEP_TIME should be equal or greater expected transfer time. In my example expected transfer time is less than 300 seconds.

A checksum (md5sum, sha256sum ,sha512sum) of the local and remote files would tell you if they're identical.
For the situation where you don't have SSH access to the remote system - like an FTP server - you can download the file after it's uploaded and compare the checksums. I do this for files I send from production scripts at work. Below is a snippet from the script in which I do this.
MD5SRC=$(md5sum $LOCALFILE | cut -c 1-32)
MD5TESTFILE=$(mktemp -p /ramdisk)
curl \
-o $MD5TESTFILE \
-sS \
-u $FTPUSER:$FTPPASS \
ftp://$FTPHOST/$REMOTEFILE
MD5DST=$(md5sum $MD5TESTFILE | cut -c 1-32)
if [ "$MD5SRC" == "$MD5DST" ]
then
echo "+Local and Remote files match!"
else
echo "-Local and Remote files don't match"
fi

if you use inotify-tools,
then the solution will looks like this:
while ! inotifywait -e close $(dirname ${bigfile_fullname}) 2>/dev/null | \
grep -Eo "CLOSE $(basename ${bigfile_fullname})$">/dev/null
do true
done
echo "File ${bigfile_fullname} closed"

After some investigation, and discussion of the problem on other forums I have found one more solution. Maybe it can help somebody.
There is a command "lsof". It lists open files. During copying the file will be opened, so the command
lsof | grep filename
will return non empty result.
So you might want to make a while loop to wait until lsof returns nothing and proceed with your task.
Example:
# provide your file name here
f=<nameOfYourFile>
lsofresult=`lsof | grep $f | wc -l`
while [ $lsofresult != 0 ]; do
echo still copying file $f...
sleep 5
lsofresult=`lsof | grep $f | wc -l`
done; echo copying file $f is finished: `ls $f`

For the duplicate question, How to check if file has been scp 100% to the remote location , which was for an expect script, to know if a file is transferred completely, we can add expect 100% .. .. i.e something like this ...
expect -c "
set timeout 1
spawn scp user#$REMOTE_IP:/tmp/my.file user#$HOST_IP:/home/.
expect yes/no { send yes\r ; exp_continue }
expect password: { send $SCP_PASSWORD\r }
expect 100%
sleep 1
exit
"
if [ -f "/home/my.file" ]; then
echo "Success"
fi

If avoiding a second SSH handshake is important, you can use something like the following:
ssh host cat \> bigfile \&\& touch complete < bigfile
Then wait for the "complete" file to get created on the remote end.

Related

/proc/meminfo not updating when reading from ssh script

I have the following script in bash:
ssh user#1.1.1.1 "echo 'start'
mkdir -p /home/user/out
cp /tmp/big_file /home/user/out
echo 'syncing flash'
sync
while [[ $(cat /proc/meminfo | grep Dirty | awk '{print $2}') -ne 0 ]] ; do
echo \"$(cat /proc/meminfo)\"
sleep 1
sync
done
echo 'done'"
I have my host PC and a target PC which I am copying to. Before I run this script I have already scp'd a big file into /tmp on the target.
When I run this script it copies the file /tmp/big ok, but when it enters the loop to sync the flash and I wait for meminfo Dirty to get to zero what I see is always Dirty: 74224 kB repeated in the loop.
However in a different ssh session logged in to the target I have it running:
watch -n1 "cat /proc/meminfo | grep Drity"
And I see this count down from ~74000kb to 0kB.
The difference is that the ssh session doing the watch is logged in as root and the ssh is logged in a user.
So I did the same test with the ssh shell logged in as user and I saw always 0kb in Drity...
Does this imply that the user can't read meminfo relating to the whole system? - how can I tell when the flash has sync'd as a non-root user?
Since the argument to ssh is in double quotes, variables and command substitutions are expanded locally on the client before sending the command, they're not done on the remote machine. Since they're substituted on the client, you'll obviously get the same result each time through the loop (because the client isn't looping).
You should either escape the $ characters so they're sent to the server, or put the command inside single quotes (but the latter makes it difficult to include single quotes in the command).
ssh user#1.1.1.1 "echo 'start'
mkdir -p /home/user/out
cp /tmp/big_file /home/user/out
echo 'syncing flash'
sync
while [[ \$(awk '/Dirty/ {print \$2}' /proc/meminfo) -ne 0 ]] ; do
cat /proc/meminfo
sleep 1
sync
done
echo 'done'"
There's also no need for cat /proc/meminfo and grep Dirty in the command substitution. awk can do pattern matching and take a filename argument.

bash - wget -N if else value check

I'm working on a bash script that pulls a file from an FTP site only if the timestamp on remote is different than local. After it puts the file, it copies the file over to 3 other computers via samba (smbclient).
Everything works, but the file copies even if the wget -N ftp://insertsitehere.com returns a value that the file on the remote was not newer. What would be the best way to check the output of the script so that the copy only happens if a new version was pulled from FTP?
Ideally, I'd like the copy to the computers to preserve the timestamp just like the wget -N command does, too.
Here is an example of what I have:
#!/bin/bash
OUTDIR=/cats/dogs
cd $OUTDIR
wget -N ftp://user:password#sitegoeshere.com/filename
if [ $? -eq 0 ]; then
HOSTS="server1 server2 server3"
for i in $HOSTS; do
echo "Uploading to $i..."
smbclient -A /root/.smbclient.authfile //$i/path -c "lcd /cats/dogs; put fiilename.txt"
if [ $? -eq 0 ]; then
echo "Upload to $i successful..."
else
echo "There was an issue uploading to host $i..."
fi
done
else
echo "There was an issue with the FTP Download...."
exit 1
fi
The return value of wget is different than 0 only if there is an error. If -N is in use and the remote file is older than the local file, it will still have a return value of 0, so you cannot use that to check if the file has been modified.
You could check the mtime of the file to see if it changed, or the content. For example, you could use something like:
md5_old=$( md5sum filename.txt 2>/dev/null )
wget -N ftp://user:password#sitegoeshere.com/filename.txt
md5_new=$( md5sum filename.txt )
if [ "$md5_old" != "$md5_new" ]; then
# Copy filename.txt to SMB servers
fi
Regarding smbclient, unfortunately there is no way to preserve timestamps in either get or put commands. If you need it, you must use some different tool (scp -p, rsync -t...)
touch -r foo.txt foo.old
wget -N example.com/foo.txt
if [ foo.txt -nt foo.old ]
then
echo 'Uploading to server1...'
fi
"Save" the current timestamp into a new empty file
Use wget --timestamping to only download the file if it is newer
If file is newer than the "save" file, do stuff

Bash Script if a file exists and larger than loop

*Note i edited this so my final functioning code is below
Ok so I'm writing a bash script to backup our mysql database to a directory, delete the oldest backup if 10 exist, and output the results of the backup to a log so I can further create alerts if it fails. Everything works great except the if loop to output the results, thanks again for the help guys code is below!
#! /bin/bash
#THis creates a variable with the date stamp to add to the filename
now=$(date +"%m_%d_%y")
#This moves the bash shell to the directory of the backups
cd /dbbkp/backups/
#Counts the number of files in the direstory with the *.sql extension and deletes the oldest once 10 is reached.
[[ $(ls -ltr *.sql | wc -l) -gt 10 ]] && rm $(ls -ltr *.sql | awk 'NR==1{print $NF}')
#Moves the bash shell to the mysql bin directory to run the backup script
cd /opt/GroupLink/everything_HelpDesk/mysql/bin/
#command to run and dump the mysql db to the directory
./mysqldump -u root -p dbname > /dbbkp/backups/ehdbkp_$now.sql --protocol=socket --socket=/tmp/GLmysql.sock --password=password
#Echo the results to the log file
#Change back to the directory you created the backup in
cd /dbbkp/backups/
#If loop to check if the backup is proper size and if it exists
if find ehdbkp_$now.sql -type f -size +51200c 2>/dev/null | grep -q .; then
echo "The backup has run successfully" >> /var/log/backups
else
echo "The backup was unsuccessful" >> /var/log/backups
fi
Alternatively, you could use stat instead of find.
if [ $(stat -c %s ehdbkp_$now 2>/dev/null || echo 0) -gt 51200 ]; then
echo "The backup has run successfully"
else
echo "The backup was unsuccessful"
fi >> /var/log/backups
Option -c %s tells stat to return the size of file in bytes. This will take care of both the presence of file and size greater than 51200. When the file is missing, stat will err out, thus we redirect error message to /dev/null. The logical or condition || will get executed only when the file is missing thus the comparison will make [ 0 -gt 100 ] false.
To check if the file exists and larger than 51200 bytes you could rewrite your if like this:
if find ehdbkp_$now -type f -size +51200c 2>/dev/null | grep -q .; then
echo "The backup has run successfully"
else
echo "The backup has was unsuccessful"
fi >> /var/log/backups
Other notes:
The find takes care two things at once: checks if file exists and size is greater than 51200.
We redirect stderr to /dev/null to hide the error message if the file doesn't exist.
If there was a file matching both conditions, then grep will match and exit with success, otherwise it will exit with failure
The final outcome of the grep is what decides the if condition
I moved the >> /var/log/backups after the closing fi, as it's equivalent this way and less duplication.
Btw if is NOT a loop, it's a conditional.
UPDATE
As #glennjackman pointed out, a better way to write the if, without grep:
if [[ $(find ehdbkp_$now -type f -size +51200c 2>/dev/null) ]]; then
...

launch process in background and modify it from bash script

I'm creating a bash script that will run a process in the background, which creates a socket file. The socket file then needs to be chmod'd. The problem I'm having is that the socket file isn't being created before trying to chmod the file.
Example source:
#!/bin/bash
# first create folder that will hold socket file
mkdir /tmp/myproc
# now run process in background that generates the socket file
node ../main.js &
# finally chmod the thing
chmod /tmp/myproc/*.sock
How do I delay the execution of the chmod until after the socket file has been created?
The easiest way I know to do this is to busywait for the file to appear. Conveniently, ls returns non-zero when the file it is asked to list doesn't exist; so just loop on ls until it returns 0, and when it does you know you have at least one *.sock file to chmod.
#!/bin/sh
echo -n "Waiting for socket to open.."
( while [ ! $(ls /tmp/myproc/*.sock) ]; do
echo -n "."
sleep 2
done ) 2> /dev/null
echo ". Found"
If this is something you need to do more than once wrap it in a function, but otherwise as is should do what you need.
EDIT:
As pointed out in the comments, using ls like this is inferior to -e in the test, so the rewritten script below is to be preferred. (I have also corrected the shell invocation, as -n is not supported on all platforms in sh emulation mode.)
#!/bin/bash
echo -n "Waiting for socket to open.."
while [ ! -e /tmp/myproc/*.sock ]; do
echo -n "."
sleep 2
done
echo ". Found"
Test to see if the file exists before proceeding:
while [[ ! -e filename ]]
do
sleep 1
done
If you set your umask (try umask 0) you may not have to chmod at all. If you still don't get the right permissions check if node has options to change that.

Getting an empty file for grep output

I am running this command in a script
while [ 1 ]
do
if [ -e $LOG ]
then
grep -A 5 -B 5 -f $PATTERNS $LOG >> $FOREMAIL
break
fi
done
$LOG file is scp'ed from another machine. So as soon as it appears in the current directory, while loop detects it and does the grep. The problem is, the $FOREMAIL file turns up to be empty. But if I run this grep outside of the script as a standalone command with same files and params, I can see that it generates an output.
I am baffled as to why this command is generating no o/p in the script?
The -e is triggering as soon as scp creates the file, while it still has no data in it, and grep is operating on an empty file. You need to wait until the file has finished transferring.
You could accomplish this by transferring to a temporary filename, than running mv over ssh from the machine which is pushing the file up.
Edit: the code for the machine copying to log file up...
scp $log 192.168.0.1:/logfiles/${log}.tmp
ssh 192.168.0.1 mv /logfiles/${log}.tmp /logfiles/${log}
Before you can grep, you need to wait for two things: 1) the download started (file comes into existence) and 2) download finished (nobody is opening the file anymore). I have a script call waitfor.sh, which does this:
#!/bin/bash
# waitfor.sh - wait for a file fully downloaded (via Firefox, scp, ...)
# Syntax:
# waitfor.sh filename
FILENAME=$1 # Name of file to wait for
INTERVAL=10 # Wait interval of N seconds
# Wait for download started
while [ ! -f $FILENAME ]
do
sleep $INTERVAL
done
# Wait for download finished
while lsof $FILENAME
do
sleep $INTERVAL
done
To use it:
waitfor.sh $LOG
grep ...
Could it be that the while [1] is very fast, so when the file starts copying, it shows up as an empty file first before copying is complete? Depending on the size of the file, try a sleep delay inside the then loop. Figuring out when a file finishes copying when done by an external process is probably a separate question - e.g. googling for something like "how to tell when scp has finshed copying a file" turns up a bunch of links like: https://superuser.com/questions/45224/is-there-a-way-to-tell-if-a-file-is-done-copying
Better to use:
if [ -f $LOG ]
instead of:
if [ -e $LOG ]
-f checks for a regular type
-e checks for any file
Here's what I ended up doing:
scp $LOGFILE
then
scp $SCPDONE # empty file
And modified the if clause like this:
while [ 1 ]
do
if [ -e $SCPDONE ]
then
grep -A 5 -B 5 -f $PATTERNS $LOG >> $FOREMAIL
break
fi
done

Resources