Is there a way to perform a "tail -f" from an url? - linux

I currently use tail -f to monitor a log file: this way I get an autorefreshing console monitoring a web server.
Now, said webserver was moved to another host and I have no shell privileges for that.
Nevertheless I have a .txt network path, which in the end is a log file which is constantly updated.
So, I'd like to do something like tail -f, but on that url.
Would it be possible? In the end, "in Linux everything is a file", so...

You can get auto-refresh by combining watch with wget.
It won't show history like tail -f does; instead it redraws the screen, like top.
Example command that shows the content of file.txt on the screen and updates the output every five seconds:
watch -n 5 wget -qO- http://fake.link/file.txt
Also, you can output the last n lines instead of the whole file:
watch -n 5 "wget -qO- http://fake.link/file.txt | tail"
If you still need behaviour like "tail -f" (keeping history), I think you need to write a script that downloads the log file at a fixed interval, compares it to the previously downloaded version, and then prints the new lines. It should be quite easy.
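A minimal sketch of such a script, assuming the log only ever grows by appending whole lines. The function names print_new_lines and follow_url are made up for illustration; the comparison is done by line count, which is one simple way to find the new lines, not the only one.

```shell
#!/bin/bash
# print_new_lines OLD NEW
# Print the lines of NEW that come after the first N lines, where N is the
# number of lines in OLD. Assumes NEW is OLD plus appended lines.
print_new_lines() {
    local old=$1 new=$2 seen
    seen=$(wc -l < "$old")
    tail -n "+$((seen + 1))" "$new"
}

# follow_url URL [INTERVAL]
# Download URL every INTERVAL seconds (default 5) and print only the lines
# added since the previous successful download.
follow_url() {
    local url=$1 interval=${2:-5} prev cur
    prev=$(mktemp)
    cur=$(mktemp)
    while true; do
        if wget -qO "$cur" "$url"; then
            print_new_lines "$prev" "$cur"
            cp "$cur" "$prev"
        fi
        sleep "$interval"
    done
}
```

Usage would be something like follow_url http://fake.link/file.txt 5, stopped with ctrl+c. The first pass prints the whole file, and the full file is still downloaded on every poll; if the remote log is rotated or truncated, nothing is printed until it grows past its previous length.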

I wrote a simple bash script that fetches the URL content every 2 seconds, compares it with the local file output.txt, then appends the difference to the same file.
I wanted to stream AWS Amplify logs in my Jenkins pipeline:
while true; do comm -13 --output-delimiter="" <(cat output.txt) <(curl -s "$URL") >> output.txt; sleep 2; done
Don't forget to create the empty file output.txt first:
: > output.txt
View the stream:
tail -f output.txt
Original comment: https://stackoverflow.com/a/62347827/2073339
UPDATE:
I found a better solution using wget here:
while true; do wget -ca -o /dev/null -O output.txt "$URL"; sleep 2; done
https://superuser.com/a/514078/603774

I've made this small function and added it to the .*rc of my shell. This uses wget -c, so it does not re-download the whole page:
# Poll logs continuously over HTTP
logpoll() {
FILE=$(mktemp)
echo "———————— LOGPOLLING TO $FILE ————————"
tail -f "$FILE" &
tail_pid=$!
stop=0
trap "stop=1" SIGINT SIGTERM
while [ $stop -ne 1 ]; do wget -co /dev/null -O $FILE "$1"; sleep 2; done
echo "——————————— LOGPOLL DONE ————————————"
kill $tail_pid
rm $FILE
trap - SIGINT SIGTERM
}
Explanation:
Create a temporary logfile using mktemp and save its path to $FILE
Make tail -f output the logfile continuously in the background
Make ctrl+c set stop to 1 instead of exiting the function
Loop until stop bit is set, i.e. until the user presses ctrl+c
wget given URL in a loop every two seconds:
-c - "continue getting partially downloaded file", so that wget continues instead of truncating the file and downloading again
-o /dev/null - wget's log messages shall be thrown into the void
-O $FILE - output the contents to the temp logfile we've created
Clean up after yourself: kill the tail -f, delete the temporary logfile, unset the signal handlers.

The proposed solutions periodically download the full file.
To avoid that, I've created a package and published it on NPM. It makes a HEAD request (to get the size of the file) and then requests only the last bytes.
Check it out and let me know if you need any help.
https://www.npmjs.com/package/@imdt-os/url-tail
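The same idea can be sketched in shell with curl. This is not the package's code, only an illustration of the HEAD-then-range technique. It assumes the server reports Content-Length and honours HTTP Range requests; the function names remote_size, range_for and url_tail are hypothetical.

```shell
#!/bin/bash
# remote_size URL: print the Content-Length reported by a HEAD request
# (0 if the header is missing or the request fails).
remote_size() {
    curl -sIL "$1" | tr -d '\r' |
        awk 'tolower($1) == "content-length:" { size = $2 } END { print size + 0 }'
}

# range_for OFFSET SIZE: print the byte range that is still to be fetched,
# or nothing when the file has not grown past OFFSET.
range_for() {
    local offset=$1 size=$2
    if [ "$size" -gt "$offset" ]; then
        echo "${offset}-$((size - 1))"
    fi
}

# url_tail URL [INTERVAL]: every INTERVAL seconds (default 2), print only
# the bytes appended since the last successful fetch.
url_tail() {
    local url=$1 interval=${2:-2} offset=0 size range
    while true; do
        size=$(remote_size "$url")
        range=$(range_for "$offset" "$size")
        if [ -n "$range" ]; then
            # -r asks the server for just this byte range.
            curl -sf -r "$range" "$url" && offset=$size
        fi
        sleep "$interval"
    done
}
```

Called as url_tail "$URL", this prints the whole file once and then only the new bytes. Log rotation is not handled: if the remote file shrinks, nothing is printed until it grows past the old offset again.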


inotify seems to add a 6 letter code to filenames in its output, before the extension

inotify seems to add a 6 letter code to filenames in its output, before the extension.
For example:
"/path/to/directory/ CLOSE_WRITE,CLOSE filename-HzdVai.lyx"
or with --format "%w%f":
/path/to/directory/filename-HzdVai.lyx
This didn't happen with other scripts and I couldn't find any example of this or why this would happen with googling.
code:
inotifywait -m -r -e close_write --exclude '[^l][^y][^x]$' ~/Routines/* ~/Projects/* | while read path msg name
do
echo "$path $msg $name"
lyx -e pdf "$path$name.lyx"
done
If it's relevant, I am using Ubuntu 20.04.
The intention of the script was to keep the PDF files matching my LyX documents up to date (LyX is a LaTeX-based document processor), so that whenever I saved a document it would be compiled automatically.
@larks had guessed correctly: tracking move events as well showed that LyX writes to a temporary file whose name contains the ID, then renames it.
The final, working, script:
#!/usr/bin/env sh
inotifywait -m -r -e moved_to --exclude '[^l][^y][^x]$' --format "%w%f" ~/Routines/* ~/Projects/* | while read -r file_path
do
    echo "$file_path"
    lyx -e pdf "$file_path"
done

Not every command is being run in a while loop

I am trying to make a script that watches a folder and automatically encodes files that go into that folder using HandBrake. I want to do this by monitoring the folder with inotify, putting new additions to the folder into a list, then using a cron job to encode them overnight. However, when using a while loop to loop over the list, HandBrake only encodes the first file, exits, and then the script carries on after the loop without processing every file in the list. Here is the script that calls HandBrake:
#!/bin/bash
while IFS= read -r line
do
    echo "$(basename "$line")"
    HandBrakeCLI -Z "Very Fast 1080p30" -i "$line" -o "$line.m4v"
    rm "$line"
done < list.txt
> list.txt
When testing the loop with a simple echo instead of HandBrakeCLI it works fine and prints out every file, so I have no idea what is wrong.
Here is the script that monitors the folder, in case that is the problem:
#!/bin/bash
if ! [ -f list.txt ]
then
touch list.txt
fi
inotifywait -m -e create --format "%w%f" tv-shows | while read FILE
do
echo "$FILE" >> list.txt
done
Any help would be great, thanks
EDIT:
Just to be more specific, the script works fine for the first file in the list.txt, it encodes it no problem and removes the old version, but then it doesn't do any of the others in the list
Taken from here
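HandBrakeCLI reads from standard input, so it consumes the rest of list.txt and the loop has nothing left to read after the first file. To solve the problem, give it a different standard input: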
echo "" | HandBrakeCLI ......
or
HandBrakeCLI ...... < /dev/null
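The effect can be reproduced without HandBrake installed. In the sketch below, encode is a hypothetical stand-in that drains standard input the way the real encoder is assumed to, and process_list mirrors the loop from the question with the redirect applied.

```shell
#!/bin/bash
# Hypothetical stand-in for HandBrakeCLI: it reads everything available on
# standard input, which inside a "while read" loop is the rest of the list.
encode() {
    cat > /dev/null
    echo "encoded $1"
}

# process_list LIST: run encode once for every line of LIST.
process_list() {
    while IFS= read -r line; do
        # Redirecting stdin from /dev/null keeps encode away from LIST.
        encode "$line" < /dev/null
    done < "$1"
}
```

Removing the < /dev/null redirect from the encode call makes the loop stop after the first line, which matches the behaviour described in the question.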

Multiple scripts making rest calls interfering

So I am running into a problem with unix scripts that use curl to make rest calls. I have one script, that runs two other scripts inside of it.
cat example.sh
FILE="file1.txt"
RECIP="wilfred@blamagam.com"
rm -f $FILE
./script1.sh > $FILE
mail -s "subject" $RECIP < $FILE
RECIP="bob@blamagam.com"
rm -f $FILE
./script2.sh > $FILE
mail -s "subject" $RECIP < $FILE
exit 0
Each script makes REST calls to the same service. It is my understanding that script1.sh should completely finish before script2.sh is run; however, that is not the case. In the logs for the REST service I see a call from the second script while the first one is still executing. The second script then fails because of this (it does not get any data returned).
I am modifying this process so I am not the one who originally wrote it. I am not seeing any forked processes, or background processes at all and I have been banging my head against the wall.
I do know that script2.sh works. Whenever script1.sh takes under a minute script2.sh works just fine, but more often than not script1.sh takes over a min, causing the second script to fail.
This is run by cron, and the contents of the files are mailed out, so I can't just default to running them manually. Any suggestions for what to look into would be much appreciated!
EDIT: Here is a high-level pseudocode example
script1.sh
ITEMS=`/usr/bin/curl -m 10 -k -u userName:passWord -L https://server/rest-service/rest?where=clause=value;clause2=value2&sel=field 2>/dev/null | sed s/<\/\?Attribute[^>]*>/\n/g | grep -v '^<' | grep -v '^$' | sed 's/ //g'`
echo "\n Subject for these metrics"
echo "$ITEMS"
Both scripts have lots of entries like this. There are 2 or 3 for loops, but they are simple and I do not see any background processes being called. It's a large script, so I could only provide a snippet. Could piping the output of the REST call be causing an issue?
Edit:
Just tested this on my system and it seems to work.
cat example.sh
FILE="file1.txt"
RECIP="wilfred@blamagam.com"
rm -f "$FILE"
(./script1.sh > "$FILE") &
procscript1=$!
wait "$procscript1"
mail -s "subject" "$RECIP" < "$FILE"
RECIP="bob@blamagam.com"
rm -f "$FILE"
(./script2.sh > "$FILE") &
procscript2=$!
wait "$procscript2"
mail -s "subject" "$RECIP" < "$FILE"
exit 0
Put the script executions in the background with the &.
Get the process id's for each script execution.
Use the wait command to block until the execution is done.

scp: how to find out that copying was finished

I'm using the scp command to copy a file from one Linux host to another.
I run the scp command on host1 and copy the file from host1 to host2. The file is quite big and it takes some time to copy.
On host2 the file appears as soon as copying has started. I can do anything with this file even while copying is still in progress.
Is there any reliable way to find out on host2 whether copying has finished?
Off the top of my head, you could do something like:
touch tinyfile
scp bigfile tinyfile user@host:
Then when tinyfile appears you know that the transfer of bigfile is complete.
As pointed out in the comments, this assumes that scp will copy the files one by one, in the order specified. If you don't trust it, you could do them one by one explicitly:
scp bigfile user@host:
scp tinyfile user@host:
The disadvantage of this approach is that you would potentially have to authenticate twice. If this were an issue you could use something like ssh-agent.
On the sending side (host1), use a script like this:
#!/bin/bash
echo 'starting transfer'
scp FILE USER@DST_SERVER:DST_PATH
OUT=$?
if [ $OUT = 0 ]; then
    echo 'transfer successful'
    touch successful
    scp successful USER@DST_SERVER:DST_PATH
else
    echo 'transfer failed'
fi
On the receiving side (host2), use a script like this:
#!/bin/bash
SLEEP_TIME=30
MAX_CNT=10
CNT=0
# Poll for the marker file, giving up after MAX_CNT attempts.
while [[ ! -e successful && $CNT -lt $MAX_CNT ]]; do
    ((CNT++))
    sleep "$SLEEP_TIME"
done
if [[ -e successful ]]; then
    echo 'successful'
    rm successful
    # do something with FILE
fi
With CNT and MAX_CNT you avoid an endless loop (in case the file successful is never transferred).
The product of MAX_CNT and SLEEP_TIME should be equal to or greater than the expected transfer time. In my example the expected transfer time is less than 300 seconds.
A checksum (md5sum, sha256sum, sha512sum) of the local and remote files would tell you if they're identical.
For the situation where you don't have SSH access to the remote system - like an FTP server - you can download the file after it's uploaded and compare the checksums. I do this for files I send from production scripts at work. Below is a snippet from the script in which I do this.
MD5SRC=$(md5sum $LOCALFILE | cut -c 1-32)
MD5TESTFILE=$(mktemp -p /ramdisk)
curl \
-o $MD5TESTFILE \
-sS \
-u $FTPUSER:$FTPPASS \
ftp://$FTPHOST/$REMOTEFILE
MD5DST=$(md5sum $MD5TESTFILE | cut -c 1-32)
if [ "$MD5SRC" == "$MD5DST" ]
then
echo "+Local and Remote files match!"
else
echo "-Local and Remote files don't match"
fi
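Where SSH access to the remote system is available, the download step can be skipped by computing the remote checksum in place. The following is only a sketch under that assumption: file_sha256 and verify_remote_copy are hypothetical helper names, sha256sum is assumed to be installed on both hosts, and the remote path is assumed to contain no spaces.

```shell
#!/bin/bash
# file_sha256 FILE: print only the hex digest of FILE.
file_sha256() {
    sha256sum "$1" | cut -d ' ' -f 1
}

# verify_remote_copy LOCALFILE HOST REMOTEFILE
# Succeed only if the local digest is non-empty and equals the digest
# computed on HOST for REMOTEFILE.
verify_remote_copy() {
    local src dst
    src=$(file_sha256 "$1")
    dst=$(ssh "$2" sha256sum "$3" | cut -d ' ' -f 1)
    [ -n "$src" ] && [ "$src" = "$dst" ]
}
```

A call such as verify_remote_copy bigfile user@host bigfile && echo "files match" would be run from the sending host after scp returns.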
If you use inotify-tools, the solution looks like this:
while ! inotifywait -e close $(dirname ${bigfile_fullname}) 2>/dev/null | \
grep -Eo "CLOSE $(basename ${bigfile_fullname})$">/dev/null
do true
done
echo "File ${bigfile_fullname} closed"
After some investigation, and discussion of the problem on other forums I have found one more solution. Maybe it can help somebody.
There is a command "lsof". It lists open files. During copying the file will be opened, so the command
lsof | grep filename
will return non empty result.
So you might want to make a while loop to wait until lsof returns nothing and proceed with your task.
Example:
# provide your file name here
f=<nameOfYourFile>
lsofresult=`lsof | grep $f | wc -l`
while [ $lsofresult != 0 ]; do
echo still copying file $f...
sleep 5
lsofresult=`lsof | grep $f | wc -l`
done; echo copying file $f is finished: `ls $f`
For the duplicate question, "How to check if file has been scp 100% to the remote location", which concerned an expect script: to know whether a file has been transferred completely, we can add expect 100%, i.e. something like this:
expect -c "
set timeout 1
spawn scp user@$REMOTE_IP:/tmp/my.file user@$HOST_IP:/home/.
expect yes/no { send yes\r ; exp_continue }
expect password: { send $SCP_PASSWORD\r }
expect 100%
sleep 1
exit
"
if [ -f "/home/my.file" ]; then
echo "Success"
fi
If avoiding a second SSH handshake is important, you can use something like the following:
ssh host cat \> bigfile \&\& touch complete < bigfile
Then wait for the "complete" file to get created on the remote end.

Getting an empty file for grep output

I am running this command in a script
while [ 1 ]
do
if [ -e $LOG ]
then
grep -A 5 -B 5 -f $PATTERNS $LOG >> $FOREMAIL
break
fi
done
The $LOG file is scp'ed from another machine, so as soon as it appears in the current directory, the while loop detects it and runs the grep. The problem is that the $FOREMAIL file turns out to be empty. But if I run this grep outside of the script as a standalone command with the same files and parameters, it generates output.
I am baffled as to why this command generates no output in the script.
The -e is triggering as soon as scp creates the file, while it still has no data in it, and grep is operating on an empty file. You need to wait until the file has finished transferring.
You could accomplish this by transferring to a temporary filename, then running mv over ssh from the machine that is pushing the file up.
Edit: the code for the machine copying the log file up:
scp $log 192.168.0.1:/logfiles/${log}.tmp
ssh 192.168.0.1 mv /logfiles/${log}.tmp /logfiles/${log}
Before you can grep, you need to wait for two things: 1) the download has started (the file comes into existence) and 2) the download has finished (nobody has the file open anymore). I have a script called waitfor.sh, which does this:
#!/bin/bash
# waitfor.sh - wait for a file fully downloaded (via Firefox, scp, ...)
# Syntax:
# waitfor.sh filename
FILENAME=$1 # Name of file to wait for
INTERVAL=10 # Wait interval of N seconds
# Wait for download started
while [ ! -f $FILENAME ]
do
sleep $INTERVAL
done
# Wait for download finished
while lsof $FILENAME
do
sleep $INTERVAL
done
To use it:
waitfor.sh $LOG
grep ...
Could it be that the while [ 1 ] loop is very fast, so when the file starts copying, it shows up as an empty file before copying is complete? Depending on the size of the file, try a sleep delay inside the then branch. Figuring out when a file finishes copying when done by an external process is probably a separate question, e.g. googling for something like "how to tell when scp has finished copying a file" turns up a bunch of links like: https://superuser.com/questions/45224/is-there-a-way-to-tell-if-a-file-is-done-copying
Better to use:
if [ -f $LOG ]
instead of:
if [ -e $LOG ]
-f checks for a regular file
-e checks for any kind of file (including directories)
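The difference between the two tests can be checked with a small helper; describe_path is a made-up name used only for this illustration.

```shell
#!/bin/bash
# describe_path PATH: report which of the two tests succeed for PATH.
# Prints "exists regular" for a regular file, "exists" for anything else
# that exists (a directory, for instance), and "missing" otherwise.
describe_path() {
    local result=""
    [ -e "$1" ] && result="exists"
    [ -f "$1" ] && result="$result regular"
    echo "${result:-missing}"
}
```

Both tests succeed as soon as the file has been created, so neither one says whether the transfer has finished.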
Here's what I ended up doing:
scp $LOGFILE
then
scp $SCPDONE # empty file
And modified the if clause like this:
while [ 1 ]
do
if [ -e $SCPDONE ]
then
grep -A 5 -B 5 -f $PATTERNS $LOG >> $FOREMAIL
break
fi
done
