Pipeline of greps on a running process outputs nothing - Linux

I'm using a MacBook (Darwin OS).
I have a process ./r that generates log output while it runs. It is a long-running process.
When I filter it once like this, it works and I see all lines except those with 'A' in them:
./r | grep -v A
But when I do this, there is no output, even though not all lines contain 'B':
./r | grep -v A | grep -v B
To prove this, I can do the following, which does indeed show output:
./r > tmp
# wait 30 seconds, ctrl-c
cat tmp | grep -v A | grep -v B

Buffering.
When the data is in the tmp file and you pipe that in, grep sees an EOF, flushes whatever it has buffered, and exits, so the whole pipeline drains. But with your running process, that EOF never comes (because the process is still running). The first grep's stdout is a pipe rather than a terminal, so stdio block-buffers its output instead of writing each line as it is produced; nothing reaches the second grep until the buffer fills.
There may be an environment or OS-level setting for this, I'm not sure... but you can force grep to flush after every line.
Try this.
grep --line-buffered -v A
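Applied to your pipeline, only the grep whose output feeds another pipe needs the flag; the last grep writes to the terminal, which stdio line-buffers by default:
./r | grep --line-buffered -v A | grep -v B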

Related

Loop to filter out lines from apache log files

I have several apache access files that I would like to clean up a bit before I analyze them. I am trying to use grep in the following way:
grep -v term_to_grep apache_access_log
I have several terms that I want to grep out, so I am chaining the grep actions as follows:
grep -v term_to_grep_1 apache_access_log | grep -v term_to_grep_2 | grep -v term_to_grep_3 | grep -v term_to_grep_n > apache_access_log_cleaned
Up to here, my rudimentary command works as expected! But I have many apache access logs, and I don't want to do that for every file by hand. I have started to write a bash script, but so far I couldn't make it work. This is my try:
for logs in ./access_logs/*;
do
cat $logs | grep -v term_to_grep | grep -v term_to_grep_2 | grep -v term_to_grep_3 | grep -v term_to_grep_n > $logs_clean
done;
Could anyone point out what I am doing wrong?
If you have a variable and you append _clean to its name, that's a new variable, and not the value of the old one with _clean appended. To fix that, use curly braces:
$ var=file.log
$ echo "<$var>"
<file.log>
$ echo "<$var_clean>"
<>
$ echo "<${var}_clean>"
<file.log_clean>
Without it, your pipeline tries to redirect to the empty string, which results in an error. Note that "$logs"_clean would also work.
As for your pipeline, you could combine that into a single grep command:
grep -Ev 'term_to_grep|term_to_grep_2|term_to_grep_3|term_to_grep_n' "$logs" > "${logs}_clean"
No cat needed, only a single invocation of grep.
Or you could stick all your terms into a file:
$ cat excludes
term_to_grep_1
term_to_grep_2
term_to_grep_3
term_to_grep_n
and then use the -f option:
grep -vf excludes "$logs" > "${logs}_clean"
If your terms are strings and not regular expressions, you might be able to speed this up by using -F ("fixed strings"):
grep -vFf excludes "$logs" > "${logs}_clean"
I think GNU grep checks that for you on its own, though.
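Putting it together with your loop (a sketch, assuming the excludes file sits in the directory you run the script from):
for logs in ./access_logs/*; do
    grep -vFf excludes "$logs" > "${logs}_clean"
done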
You are looping over several files, but in your loop you constantly overwrite your result file, so it will only contain the last result from the last file.
You don't need a loop, use this instead:
egrep -v 'term_to_grep|term_to_grep_2|term_to_grep_3' ./access_logs/* > all_access_logs_clean
Note that when grep is given multiple files, it prefixes each output line with the file name; add -h to suppress that.
Note, it is always helpful to start a Bash script with set -eEuCo pipefail. This catches most common errors -- with set -u in effect, it would have stopped with an "unbound variable" error the moment you referenced the undefined $logs_clean.
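A minimal illustration of that failure mode (a hypothetical script, just to show the error):
#!/bin/bash
set -eEuCo pipefail
# logs_clean was never assigned, so bash aborts here with
# "logs_clean: unbound variable" instead of silently misbehaving
echo test > "$logs_clean"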

Cleanly kill all vim processes from other tabs

I am using the following Linux command to kill all vim processes:
ps -ef | grep "gvim" | grep -v grep | awk '{print $2}' | xargs kill
The above command works, except it outputs:
Vim: Caught deadly signal TERM
Is there another clean way to kill it?
Well, as long as you're fine with losing all unsaved changes and halting any background jobs mid-go, you can send a different signal from the kill command with the -s option. Signal 15, SIGTERM, is the default; you can do kill -s 9 to murder the processes outright. See signal(7) for a full list of signals.
So your final command would look like:
ps -ef | grep "gvim" | grep -v grep | awk '{print $2}' | xargs kill -s 9
This does not generate any warnings about deadly signals. It's also a pretty nasty way of doing things, and could lead to some gnarly behavior if you've got a non-trivial vim setup.
As @moopet said in the comments, you can still use SIGTERM if you redirect the command's output to /dev/null. This won't be any nicer than SIGKILL in that Vim won't automatically save your files, but it may give any background tasks it was doing some time to clean up.
Alternatively, if you're willing to change your setup, you can always run a vim as a server and issue save-and-close commands to it from the terminal. There is a great explanation over on the Vim SE site.
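As an aside, if pkill is available (it is on most Linux distributions and on macOS), the whole ps/grep/awk/xargs chain collapses into one command; pkill matches process names directly, so the grep -v grep trick isn't needed:
pkill gvim      # SIGTERM, same as plain kill
pkill -9 gvim   # SIGKILL, same as kill -s 9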

Show processes of not-logged-in users - Linux bash script

I am writing a bash script and I am trying to show the processes of users who are not logged in, which are typically daemon processes. For this, the exercise recommends:
To process the command line, we will use the cut command, which allows
selecting the different columns of the list through a filter.
I used:
ps -A | grep -v w
ps -A | grep -v who
ps -A | grep -v $USER
but with all of these options, the processes of all users are printed in the output file, and I only want the processes of users who are not logged in.
I appreciate your help
Thank you.
grep -v w will remove lines matching the regular expression w (which simply means any line containing the letter w). To run the command w, you have to say so explicitly; and as hinted in the instructions, you will also need to use cut to post-process its output.
So as not to give the answer away completely, here's rough pseudocode.
w | cut something >tempfile
ps -A | grep -Fvf tempfile
It would be nice if you could pass the post-processed results of w in a pipe, but standard input is already tied to ps -A. If you have a shell which supports process substitution, you can use that.
ps -A | grep -Fvf <(w | cut something)
Unfortunately, the output from w is not properly machine-readable -- you will probably want to cut away the header line(s), too. (On my machine, there are two header lines. Yours might differ.) You'll probably learn a bit of Awk later on in the course, but until then, maybe something like
ps -A | grep -Fvf <(w | tail -n +3 | cut something)
This still doesn't completely handle all possible situations. What if someone's account name is grep?
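For reference, one concrete way to fill in the blanks above (a sketch, not the only solution; it assumes the user name is the first space-separated field of w's output and that there are two header lines, as above):
ps -A | grep -Fvf <(w | tail -n +3 | cut -d ' ' -f 1)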

Why is `read -t` not timing out in bash on RHEL?

Why doesn't read -t time out when reading from a pipe on RHEL5 or RHEL6?
Here is my example, which doesn't time out on my RHEL boxes while reading from the pipe:
tail -f logfile.log | grep 'something' | read -t 3 variable
If I'm correct, read -t 3 should time out after 3 seconds?
Many thanks in advance.
Chris
GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)
The solution given by chepner should work.
The explanation of why your version doesn't is simple: when you construct a pipeline like yours, the data flows through the pipe from left to right. When your read times out, however, the programs on the left side keep running until they notice that the pipe is broken, and that happens only when they try to write to it.
A simple example is this:
cat | sleep 5
After five seconds the pipe will be broken because sleep will exit, but cat will nevertheless keep running until you press return.
In your case that means that, until grep produces output, your command will keep running despite the timeout.
While not a direct answer to your specific question, you will need to run something like
read -t 3 variable < <( tail -f logfile.log | grep "something" )
in order for the newly set value of variable to be visible after the pipeline completes. See if this times out as expected.
Since you are simply using read as a way of exiting the pipeline after a fixed amount of time, you don't have to worry about the scope of variable. However, grep may find a match without printing it within your timeout due to its own internal buffering. You can disable that (with GNU grep, at least), using the --line-buffered option:
tail -f logfile.log | grep --line-buffered "something" | read -t 3
Another option, if available, is the timeout command as a replacement for the read:
timeout 3 tail -f logfile.log | grep -q --line-buffered "something"
Here, we kill tail after 3 seconds, and use the exit status of grep in the usual way.
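For instance, in a script you could branch on the pipeline's exit status, which is grep's (a sketch; it is 0 exactly when a match appeared within the window):
if timeout 3 tail -f logfile.log | grep -q --line-buffered "something"; then
    echo "found a match"
else
    echo "no match within 3 seconds"
fi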
I don't have a RHEL server to test your script right now, but I would bet that read is exiting on timeout and working as it should. Try running:
grep 'something' | strace bash -c "read -t 3 variable"
and you can confirm that.

Multiple greps in pipeline not terminating after completion

I seem to be having a problem with a simple grep statement not finishing/terminating after it has completed.
For example:
grep -v -E 'syslogd [0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}: restart' |
grep -v 'printStats: total reads from cache:' /var/log/customlog.log >\
/tmp/filtered_log.tmp
The above statement strips out the unwanted lines and saves the rest into a temp file; however, after grep finishes processing the entire file, the shell script hangs and cannot proceed. The same behavior occurs when running the command manually on the command line. Essentially, combining multiple grep statements acts like a pager (more/less) waiting for input.
Does anyone have any suggestions to overcome this limitation? Ideally I wouldn't want to do the following, given that the customlog.log file might get huge at times.
cat /var/log/customlog.log |
grep -v -E 'syslogd [0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}: restart' |
grep -v 'printStats: total reads from cache:' > /tmp/filtered_log.tmp
Thanks,
Tony
As explained above, you need to move your file name to the first grep:
grep -v -E 'syslogd [0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}: restart' /var/log/customlog.log |
    grep -v 'printStats: total reads from cache:' > /tmp/filtered_log.tmp
But you can also combine the two greps:
grep -v -E \
-e 'syslogd [0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}: restart' \
-e 'printStats: total reads from cache:' /var/log/customlog.log > \
/tmp/filtered_log.tmp
Saves a bit of CPU and will fix your error at the same time.
BTW, another possible issue: What if two instances of this script are run at the same time? Both will be using the same temp file. This probably isn't an issue in this particular case, but you might as well get used to developing scripts for that situation. I recommend that you use $$ to put the process ID in your temporary file:
tempFileName="/tmp/filtered_log.$$.tmp"
grep -v -E -e [blah... blah.. blah] /var/log/customlog.log > "$tempFileName"
Now, if two different people are running this process, you won't get them using the same temp file.
Appended
As pointed out by uwe-kleine-konig, you're actually better off using mktemp:
tempFileName=$(mktemp /tmp/filtered_log.XXXXX)
grep -v -E -e [blah... blah.. blah] /var/log/customlog.log > "$tempFileName"
Thanks for the suggestion.
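One more refinement: registering a trap removes the temp file even if the script exits early (a sketch using standard bash trap semantics):
tempFileName=$(mktemp /tmp/filtered_log.XXXXX)
trap 'rm -f "$tempFileName"' EXIT
grep -v -E -e [blah... blah.. blah] /var/log/customlog.log > "$tempFileName"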
