tail-like continuous ls (file list) - linux

I am monitoring the new files created in a folder in linux. Every now and then I issue an "ls -ltr" in it. But I wish there was a program/script that would automatically print it, and only the latest entries. I did a short while loop to list it, but it would repeat the entries that were not new and it would keep my screen rolling up when there were no new files. I've learned about "watch", which does show what I want and refreshes every N seconds, but I don't want an ncurses interface, I'm looking for something like tail:
continuous
shows only the new stuff
prints in my terminal, so I can run it in the background and do other things and see the output every now and then getting mixed with whatever I'm doing :D
Summarizing: get the input, compare to a previous input, output only what is new.
Something that does that doesn't sound like such an odd tool; I can see it being used in other situations too, so I would expect it to already exist, but I couldn't find anything. Suggestions?
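For illustration, here is a minimal sketch of that compare-and-print idea (the directory argument and the 5-second interval are just examples):
#!/bin/bash
# poll a directory and print only the entries that were not there on the previous pass
DIR=${1:-.}
prev=$(mktemp)
curr=$(mktemp)
trap 'rm -f "$prev" "$curr"' EXIT
ls -1 "$DIR" | sort > "$prev"
while sleep 5; do
    ls -1 "$DIR" | sort > "$curr"
    comm -13 "$prev" "$curr"    # lines only in the new listing = new files
    mv "$curr" "$prev"
done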

You can use the very handy command watch
watch -n 10 "ls -ltr"
And you will get an ls every 10 seconds.
And if you add tail -10 you will only get the 10 newest entries.
watch -n 10 "ls -ltr|tail -10"

If you have access to inotifywait (available from the inotify-tools package if you are on Debian/Ubuntu) you could write a script like this:
#!/bin/bash
WATCH=/tmp
# print a long listing for each file created in $WATCH
inotifywait -q -m -e create --format %f "$WATCH" | while read -r event
do
    ls -ltr "$WATCH/$event"
done
This is a one-liner that won't give you the same information that ls does, but it will print out the filename:
inotifywait -q -m -e create --format %w%f /some/directory

This works in Cygwin and Linux. Some of the previous solutions that write a file will cause the disk to thrash; this script does not have that problem:
SIG=1
SIG0=SIG
while [ $SIG != 0 ] ; do
    # poll until the checksum of the directory listing changes
    while [ $SIG = $SIG0 ] ; do
        SIG=`ls -1 | md5sum | cut -c1-32`
        sleep 10
    done
    SIG0=$SIG
    # print the newest entry
    ls -lrt | tail -n 1
done

Related

How to view syslog entries since last time I looked

I want to view the entries in Linux /var/log/syslog, but I only want to see the entries since last time I looked (preferably create a bash script to do this). The solution I thought of was to take a copy of syslog and diff it against the last time I took a copy, but this seems unclean because syslog can be big and diff adds artifacts in its output. I'm thinking maybe I could somehow use tail directly on syslog, but I don't know how to do this when I don't know how many lines have been added since the last time I tried. Any better thoughts? I would like to be able to redirect the result to a file so I can later interactively grep for specific parts of interest.
Linux has a wc command which can count the number of lines within a file, for example
wc -l /var/log/syslog. The bash script below stores the output of the wc -l command in a file called ./prevlinecount. Whenever you want just the new lines in a file, it reads the value in ./prevlinecount and subtracts it from a fresh wc -l /var/log/syslog count called newlinecount, then tails the last (newlinecount - prevlinecount) lines.
#!/bin/bash
prevlinecount=`cat ./prevlinecount 2>/dev/null`
if [ -z "$prevlinecount" ]; then
    # first run: record the current line count and print the whole file
    wc -l "$1" | awk '{ print $1 }' > ./prevlinecount
    tail -n +1 "$1"
else
    # print only the lines added since the previous run
    newlinecount=`wc -l "$1" | awk '{print $1}'`
    tail -n `expr $newlinecount - $prevlinecount` "$1"
    echo $newlinecount > ./prevlinecount
fi
beware
this is a very rudimentary script which can only keep track of one file. If you would like to extend it to multiple files, look into associative arrays: you could track multiple files by using the filename as the key and the previous line count as the value (see the sketch after these notes).
beware too that over time syslog files are typically rotated once they reach a predetermined size (maybe 10 MB), and this script does not account for that rotation.
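A minimal sketch of that multi-file idea, assuming bash 4+ for associative arrays and a made-up ./prevlinecounts state file with "filename count" lines:
#!/bin/bash
# track previous line counts for several files at once (sketch)
declare -A prev
[ -f ./prevlinecounts ] && while read -r name count; do
    prev[$name]=$count
done < ./prevlinecounts

for f in "$@"; do
    new=$(wc -l < "$f")
    old=${prev[$f]:-0}
    [ "$new" -lt "$old" ] && old=0     # file was rotated/truncated: show it all
    tail -n "$((new - old))" "$f"      # only the lines added since the last run
    prev[$f]=$new
done

# persist the counts for the next run
for f in "${!prev[@]}"; do
    printf '%s %s\n' "$f" "${prev[$f]}"
done > ./prevlinecounts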

Pass each file obtained from a command to another command as a parameter

I am using the following line to take a pdf and split it:
pdfseparate -f 14 -l 23 ALF.SS.0.pdf "${FILE}"-%d.pdf
Now I want for each file produced, to run several commands like this:
pdfcrop --margins '-30 0 -385 0' outputOfpdfSeparate outputOfpdfSeparate-1stCol.pdf
I am trying to figure out the best way to do this:
With a single loop, for each file created by pdfseparate, if I manage to "know" the name of the file, I could pass it to pdfcrop and be done. But since it uses %d, I do not know how to handle this "new name" in which each file gets a new number. I know how to do this in Java, but here I do not see it so clearly.
Using pipes. I think I have the same issue since if I do
pdfseparate [options] | pdfcrop inputfile outputfile,
I do not know how to "use" the name of inputfile. I am sure it is easy but I don't see it.
Using xargs. I am studying this command since it is new for me.
Using exec. I am under the impression this is not necessary but maybe I am wrong since it's been a long while since I last used exec.
Thanks in advance.
You can use xargs. It is the best way in terms of speed.
I usually use it for converting a lot of .mp4 files to .mp3.
Doing this conversion one by one is not only tedious but also time-consuming, so you can use xargs' automatic parallelism via the -P 0 option.
For example, if I had 10 .mp4 files I would do this:
ls *.mp4 | xargs -I xxx -P 0 ffmpeg -i xxx xxx.mp3
After running this line, 10 ffmpeg commands are running simultaneously.
The other way to do this is to store a list of .mp4 files in a text file like this:
ls *.mp4 > list-mp4
then:
xargs -I xxx -P 0 ffmpeg -i xxx xxx.mp3 < list-mp4
Or maybe you have access to GNU parallel, in which case you can:
parallel ffmpeg -i {} {}.mp3 ::: *.mp4
Now for your case: if you want to use these (xargs or parallel) or your own command, note that your first command must send its output to stdout, because the second command is going to read its stdin from the stdout of the first command; bash wires this up for you.
So you can only use a pipe (|) with pdfseparate if it sends its output to stdout. If it does not, then the right-hand side of the pipe (the second command) gets nothing; likewise, the second command must be able to read its stdin from that incoming output.
For example
ls *.txt | echo {}
here echo does not read any of the incoming output from the ls command and just prints a literal {}.
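Putting xargs in between makes the same pattern work (a trivial illustration):
ls *.txt | xargs -I {} echo {}    # now each .txt filename is echoed in turn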
In the end, your pdfseparate would have to write to stdout; xargs then stores each line in the placeholder you name with -I and calls your second command with it.
Therefore:
pdfseparate options... | xargs -I ABC -P 0 your-second-command+its-options ABC
NOTE 1: xargs stores the incoming stdout line by line in ABC, and you pass this to your second command as its argument.
NOTE 2: you do not have to use -P 0 at all; it only speeds up execution. If you omit it, your second command runs once per incoming line, sequentially.
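Applied to the question, pdfseparate itself prints nothing on stdout, so one sketch is to list the generated files yourself and feed them to xargs (the ALF.SS.0-*.pdf pattern and the -1stCol output suffix are assumptions):
pdfseparate -f 14 -l 23 ALF.SS.0.pdf ALF.SS.0-%d.pdf
ls ALF.SS.0-[0-9]*.pdf | xargs -I ABC pdfcrop --margins '-30 0 -385 0' ABC ABC-1stCol.pdf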
pdfseparate does not output the names of the files it created, so you have to use the "ls" command (or a glob) to get the list of files you want to operate on.
# separate the pdfs
pdfseparate -f 14 -l 23 ALF.SS.0.pdf "${FILE}"-%d.pdf
# operate on the just created files, assumes that a "FILE" variable is set, which might not be the case
for i in $(ls "${FILE}"-*.pdf); do pdfcrop --margins '-30 0 -385 0' "$i"; done
# assuming that the FILE variable in your case would match ALF.SS.0-[0-9]*.pdf, you'd use this:
for i in $(ls ALF.SS.0-[0-9]*.pdf); do pdfcrop --margins '-30 0 -385 0' "$i"; done

Bash standard output display and redirection at the same time

In the terminal, sometimes I would like to display the standard output and also save it as a backup, but if I use redirection (>, &>, etc.), it does not display the output in the terminal anymore.
I think I can do for example ls > localbackup.txt | cat localbackup.txt. But it just doesn't feel right. Is there any shortcut to achieve this?
Thank you!
tee is the command you are looking for:
ls | tee localbackup.txt
In addition to using tee to duplicate the output (it's worth mentioning that tee can append to the file instead of overwriting it with tee -a, so you can run several commands in sequence and retain all of the output; a sketch of that follows below), you can also use tail -f to "follow" the output file from a parallel process (e.g. a separate terminal):
command1 >localbackup.txt # create output file
command2 >>localbackup.txt # append to output
and from a separate terminal, at the same time:
tail -f localbackup.txt # this will keep outputting as text is appended to the file
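And for completeness, a quick sketch of the append variant of tee mentioned above (command1/command2 are placeholders):
command1 | tee localbackup.txt       # see the output and (over)write the file
command2 | tee -a localbackup.txt    # see the output and append to the same file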

shell script to download latest file from FTP

I am writing a shell script for the first time. I want to download the latest created file from FTP.
I want to download the latest file from a specific folder. Below is my code for that, but it downloads all the files in the folder, not just the latest one.
ftp -in ftp.abc.com << SCRIPTEND
user xyz xyz
binary
cd Rpts/
mget ls -t -r | tail -n 1
quit
SCRIPTEND
help me with this, please?
Try using the wget or lftp utility instead; lftp compares file time/date and, AFAIR, FTP scripting is exactly what it is for. Switch to ssh/rsync if possible; you can read a bit about lftp versus rsync here:
https://serverfault.com/questions/24622/how-to-use-rsync-over-ftp
Probably the easiest way is to link the latest version on the server side to "current" and always fetch the file that link points to. If you're not the admin of the server, you need to list all files with date/time, grab that information, parse it, and decide which one is newest; in the meantime the state on the server can change, and you end up with a solution more complicated than it's worth.
The point is that "ls" sorts its output in some way, and time may not be the default. There are switches to sort it, e.g. by modification time, but even when the server responds OK to ls -t, you can't be sure it really supports sorting; it may just ignore all switches and always return the same list. That's why admins usually use a "current" link (ln -s). If there's no "current" link, then to make sure you have the right file you need to parse the listing anyway (ls -al).
http://www.catb.org/esr/writings/unix-koans/shell-tools.html
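A sketch of that "current"-link approach, assuming you control the server and the FTP server follows symlinks when serving files (the report name, path, and host are made-up examples):
# on the server, after each new report is generated:
ln -sfn Report_2015_10_13.pdf /srv/ftp/Rpts/current
# on the client, always fetch whatever the link points to:
curl -o latest.pdf 'ftp://xyz:xyz@ftp.abc.com/Rpts/current'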
Looking at the code, the line
mget ls -t -r | tail -n 1
doesn't do what you think. It actually grabs all of the output of ls -t and then tail processes the output of mget. You could replace this line with
mget $(ls -t -r | tail -n 1)
but I am not sure if ftp will support such a call...
Try using an FTP client other than ftp. For example, curlftpfs (available at curlftpfs.sourceforge.net) is a good candidate, as it allows you to mount an FTP site onto a directory as if it were a local folder and then run different commands on the files there (including find, grep, etc.). Take a look at this article.
This way, since the output comes from a local command, you'd be more certain that ls -t returns a properly sorted list.
Btw, it's a bit less convoluted to use ls -t | head -1 than ls -t -r | tail -1. They produce the same result but why reverse and grab from the tail when you can just grab the head :)
If you use curlftpfs then your script would be something like this (assuming server ftp.abc.com and user xyz with password xyz).
mkdir /tmp/ftpsession
curlftpfs ftp://xyz:xyz@ftp.abc.com /tmp/ftpsession
cd /tmp/ftpsession/Rpts
cp -Rpf $(ls -t | head -1) /your/destination/folder/or/file
cd -
umount /tmp/ftpsession
My Solution is this:
curl 'ftp://server.de/dir/'$(curl 'ftp://server.de/dir/' 2>/dev/null | tail -1 | awk '{print $(NF)}')

How to capture the output of a top command in a file in linux?

I want to write the output of a specific 'top' command to a file. I did some googling and found out that it can be done using the following command.
top -n 10 -b > top-output.txt
where -n is to specify the number of iterations and -b is for batch mode. This works very well if let top for the 10 iterations. But if i break the running of the command with a Ctrl-C, the output file seems to be empty.
I won't be knowing the number of iterations beforehand, so i need to break it manually. How can i capture the output of top in a file without specifying iterations?
The command which I am trying to use precisely is
top -b | grep init > top-output.txt
and break it whenever I want. But it doesn't work.
EDIT: To give more context to the question: I have Java code which invokes a tool with an input file. The tool takes a file as input, runs for some time, then takes the next file, and so on. I have a set of 100,000 files which need to be fed to the tool, so now I am trying to monitor that specific tool (it runs as a process in Linux). I cannot capture the whole of top's data, as the file would be too huge and full of unwanted data. How do I capture the system stats of just that process and write them to a file using top?
For me, top -b > test.txt stores all output from top fine, even if I break it with Ctrl-C. I suggest you dump first and then grep the resulting file.
How about using a while loop and -n 1:
while sleep 3; do
    top -b -n1 | grep init >> top-output.txt
done
It looks like the output is not written to the file until all iterations are finished. You could solve this by wrapping it in an external loop like this:
touch top-output.txt
while true; do
    top -b | grep init >> top-output.txt
done
Here is the one-liner I like to use on my Mac:
top -o -pid -l 1 | grep "some regexp"
Cheers.
As pointed out by @Thor in a comment, you just need to ensure that grep is not buffering arbitrarily but per line, with the --line-buffered option:
top -bn 10 | grep 'init' --line-buffered | tee top-output.txt
Without grep-ing, redirecting the output of top to a file works just fine, interrupt included.
Solved this issue. This works even if you press Ctrl+C. I was facing the same issue when I wanted to log CPU%.
Execute this shell script:
#!/bin/sh
while true; do
    echo "$(top -b -n 1 | grep init)" | tee -a top-output.log
    sleep 1
done
You can grep anything you want to extract from the top output; use this script to store it to a file.
-b : Batch mode operation. Starts top in batch mode, which could be useful for sending output from top to other programs or to a file. In this mode, top will not accept input and runs until the iterations limit you've set with the -n command-line option or until killed.
-n number : Specifies the maximum number of iterations, or frames, top should produce before ending. Here I've used -n 1.
Do man top for more details
tee makes the output visible on the console and also stores it in the file; the -a option appends to the file instead of overwriting it.
Here I have used an interval of 1 second; you can use any other interval.
Source for explanations of -b and -n: manpages
man top
Ctrl+C is not an ideal solution because control stays in the CLI. You can use the command below, which dumps the top output to a file:
top -n 1 -b > top-output.txt
I had the exact same problem...
here was my line:
top -b -u myUser | grep -v Prog.sh | grep Prog > myFile.txt
It would create myFile.txt but it would be empty when I Ctrl+C'd it. So after I kicked off my top command, I started a SECOND top process. When I found the first top's PID (it took some trial and error) and killed it through the second top, the first top wrote to the file as expected.
Hope that helps!
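A sketch of that workaround without a second top, assuming pkill is available (the user and process names follow the example above):
top -b -u myUser | grep -v Prog.sh | grep Prog > myFile.txt &
# later: terminate only the top process; grep then sees EOF on its stdin,
# flushes its buffered matches to myFile.txt, and exits
pkill -x -u myUser top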
If you wish to run the top command in the background (so you don't have to worry about logout, sleep, etc.), you can make use of nohup, a batch job, cron, or screen.
Using nohup (which stands for "no hang up"):
Suppose you save the top command in a file called top-exec.sh with the following content:
top -p <PID> -b > /tmp/top.log
You can adjust the top command for whatever process you are interested in.
Then you can execute top-exec.sh using nohup as follows:
$ nohup ./top-exec.sh &
This will redirect all the output of the top command to a file named /tmp/top.log.
Set the -n argument to 1; it tells top how many frames it will produce before it exits.
top -b -n 1 > ~/mytopview.txt
or even
myvar=`top -b -n 1`
echo "$myvar"
From the top command, we can see all the processes with their PID (Process ID).
To print top output for only one process, use the following command:
$ top -p PID
To save top command of any process to a file, use the following command:
top -p $PROCESS_ID -b > top.log
where > redirects standard output to a file.
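And if you want more than one sample of that process over time, you can combine -p with -d (the delay between iterations) and -n (the number of iterations); a sketch with example values:
# sample the process every 5 seconds, 100 times, appending to the log
top -p "$PROCESS_ID" -b -d 5 -n 100 >> top.log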
