Grepping Apache access.log data - Linux

I want to have a real-time copy of /var/log/apache2/access.log so I can grep, do hostname resolution, etc.
What's the best way to do this?
I am curious to see what kind of traffic is passing by.

You could:
configure Apache to send logs via syslog, then configure syslog to write separate log files (with a specific owner). Take a look at: O'Reilly: Sending Apache httpd Logs to Syslog
use tail -f, but ensure that the commands following it in the pipeline are unbuffered, in order to read events immediately:
tail -f /var/log/apache2/access.log | grep --line-buffered "something" or
tail -f /var/log/apache2/access.log | sed -une "/something/p"
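As a quick sanity check, the same grep filter can be run over a static sample; --line-buffered only matters when the input is a live stream, but the command shape is identical (throwaway file and made-up log lines, just for illustration):

```shell
# Build a tiny sample "access log" (hypothetical entries).
log=$(mktemp)
printf '%s\n' 'GET /index.php 200' 'GET /style.css 200' 'GET /admin.php 404' > "$log"

# Same grep as in the tail pipeline; harmless on a static file.
matches=$(grep --line-buffered 'php' "$log")
echo "$matches"

rm -f "$log"
```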
Write the tail -f | grep combination in Perl or Python (Perl is a good choice for grepping log files).
(This sample is copied from man perlfaq5):
for (;;) {
    for ($curpos = tell(GWFILE); <GWFILE>; $curpos = tell(GWFILE)) {
        # search for some stuff and put it into files
    }
    sleep($for_a_while);       # sleep for a while
    seek(GWFILE, $curpos, 0);  # seek back to where we were
}
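The same remember-the-offset idea can be sketched in plain shell: record how many bytes you have already consumed, then read only what was appended since (throwaway file; a real monitor would loop and sleep, like the Perl version):

```shell
log=$(mktemp)
offset=0

printf 'first line\n' >> "$log"
# Read everything after the remembered offset (tail -c +N is 1-based).
new=$(tail -c +"$((offset + 1))" "$log")
offset=$(wc -c < "$log")   # remember where we stopped, like tell()

printf 'second line\n' >> "$log"
new2=$(tail -c +"$((offset + 1))" "$log")

rm -f "$log"
```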

Do this:
#Customize as appropriate:
tail -f /var/log/apache2/access.log | cut -f 1 -d ' ' &
tail -f /var/log/apache2/access.log | grep foo &

Related

Stop Tail when a certain string is found in log file

Need to monitor a log file for a particular string, "Server running at http". Once that string is found in the log file, I need to stop checking and continue with the rest of the code.
Currently I am using tail -f my-file.log | grep -q "Server running at http", but this doesn't seem to work: the tail command is still running.
tail -f my-file.log | grep -q "Server running at http"
You can try something like :
tail -f my-file.log | awk '/Server running at http/ { print | "bash file_with_code" }'
P.S. Instead of a separate file it can be a function within the same script; needless to say, in that case you will not need bash before the function name.
Another possible solution can be:
tail -f my-file.log | egrep -m 1 "Server running at http"; echo "found the pattern"
You have to specify how many matches you need. Since you only need the first hit:
tail -f server.log | grep -m 1 "mystring"
Once "mystring" is found for the first time, the program will exit automatically.
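The -m 1 behavior is easy to verify on a static file (hypothetical contents). Note that with a live tail -f, tail itself only notices the broken pipe on its next write, so it may linger until one more line is appended:

```shell
log=$(mktemp)
printf '%s\n' 'starting up' 'Server running at http://localhost' 'more output' > "$log"

# grep stops reading after the first match, even though more lines follow.
first=$(grep -m 1 'Server running at http' "$log")

rm -f "$log"
```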

Problems with tail -f and awk? [duplicate]

Is that possible to use grep on a continuous stream?
What I mean is sort of a tail -f <file> command, but with grep on the output in order to keep only the lines that interest me.
I've tried tail -f <file> | grep pattern but it seems that grep can only be executed once tail finishes, that is to say never.
Turn on grep's line buffering mode when using BSD grep (FreeBSD, Mac OS X etc.)
tail -f file | grep --line-buffered my_pattern
It looks like a while ago --line-buffered didn't matter for GNU grep (used on pretty much any Linux) as it flushed by default (YMMV for other Unix-likes such as SmartOS, AIX or QNX). However, as of November 2020, --line-buffered is needed (at least with GNU grep 3.5 in openSUSE, but it seems generally needed based on comments below).
I use the tail -f <file> | grep <pattern> all the time.
It will wait till grep flushes, not till it finishes (I'm using Ubuntu).
I think that your problem is that grep uses some output buffering. Try
tail -f file | stdbuf -o0 grep my_pattern
it will set output buffering mode of grep to unbuffered.
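A quick way to convince yourself the stdbuf wrapper is wired up correctly; the buffering difference itself only shows on a live stream, but the filtering is unchanged:

```shell
# stdbuf -o0 disables grep's stdout buffering; matches pass through as usual.
out=$(printf 'foo\nbar\nfood\n' | stdbuf -o0 grep foo)
echo "$out"
```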
If you want to find matches in the entire file (not just the tail), and you want it to sit and wait for any new matches, this works nicely:
tail -c +0 -f <file> | grep --line-buffered <pattern>
The -c +0 flag says that the output should start 0 bytes (-c) from the beginning (+) of the file.
In most cases, you can tail -f /var/log/some.log |grep foo and it will work just fine.
If you need to use multiple greps on a running log file and you find that you get no output, you may need to stick the --line-buffered switch into your middle grep(s), like so:
tail -f /var/log/some.log | grep --line-buffered foo | grep bar
You may consider this answer an enhancement. Usually I am using
tail -F <fileName> | grep --line-buffered <pattern> -A 3 -B 5
-F is better in case of file rotation (-f will not work properly if the file is rotated).
-A and -B are useful for getting the lines just before and after the pattern occurrence; these blocks will appear between dashed-line separators.
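A quick check of the -A/-B context behavior on throwaway input:

```shell
# One line of context before (-B 1) and after (-A 1) each match.
out=$(printf 'a\nb\nMATCH\nc\nd\n' | grep -B 1 -A 1 'MATCH')
echo "$out"
```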
But for me, I prefer doing the following:
tail -F <file> | less
This is very useful if you want to search inside streamed logs, I mean go back and forth and look deeply.
Didn't see anyone offer my usual go-to for this:
less +F <file>
ctrl + c
/<search term>
<enter>
shift + f
I prefer this, because you can use ctrl + c to stop and navigate through the file whenever, and then just hit shift + f to return to the live, streaming search.
sed would be a better choice (stream editor)
tail -n0 -f <file> | sed -n '/search string/p'
and then if you wanted the tail command to exit once you found a particular string:
tail --pid=$(($BASHPID+1)) -n0 -f <file> | sed -n '/search string/{p; q}'
Obviously a bashism: $BASHPID will be the process id of the tail command. The sed command is next after tail in the pipe, so the sed process id will be $BASHPID+1.
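If guessing $BASHPID+1 feels fragile, GNU tail's --pid can watch any process you name explicitly. A sketch using a short-lived sleep as the watched process (throwaway file; the timeout wrapper is only a safety net in case --pid polling is delayed):

```shell
f=$(mktemp)
printf 'hello\n' > "$f"

sleep 0.2 & watcher=$!

# tail follows the file but exits once $watcher dies;
# -s 0.1 shortens the polling interval so it notices quickly.
out=$(timeout 5 tail -s 0.1 --pid="$watcher" -f "$f")

rm -f "$f"
```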
Yes, this will actually work just fine. Grep and most Unix commands operate on streams one line at a time. Each line that comes out of tail will be analyzed and passed on if it matches.
This one command works for me (SUSE):
mail-srv:/var/log # tail -f /var/log/mail.info |grep --line-buffered LOGIN >> logins_to_mail
collecting logins to the mail service
Coming somewhat late to this question, and considering this kind of work an important part of the monitoring job, here is my (not so short) answer...
Following logs using bash
1. Command tail
This command is a little more powerful than the already published answers suggest.
Difference between follow option tail -f and tail -F, from manpage:
-f, --follow[={name|descriptor}]
output appended data as the file grows;
...
-F same as --follow=name --retry
...
--retry
keep trying to open a file if it is inaccessible
This means: by using -F instead of -f, tail will re-open the file(s) when they are removed (on log rotation, for example).
This is useful for watching log files over many days.
Ability to follow more than one file simultaneously
I've already used:
tail -F /var/www/clients/client*/web*/log/{error,access}.log /var/log/{mail,auth}.log \
/var/log/apache2/{,ssl_,other_vhosts_}access.log \
/var/log/pure-ftpd/transfer.log
For following events through hundreds of files... (consider the rest of this answer to understand how to make it readable... ;)
Using the -n switch (don't use -c, which counts bytes, for this!). By default tail will show the last 10 lines. This can be tuned:
tail -n 0 -F file
will follow the file, but only new lines will be printed.
tail -n +0 -F file
will print the whole file before following its progression.
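Without -F, the same -n switches show the difference on a static file (throwaway file):

```shell
f=$(mktemp)
printf 'one\ntwo\nthree\n' > "$f"

last=$(tail -n 1 "$f")    # only the final line
whole=$(tail -n +1 "$f")  # the entire file, from line 1

rm -f "$f"
```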
2. Buffer issues when piping:
If you plan to filter the output, consider buffering! See the -u option for sed, --line-buffered for grep, or the stdbuf command:
tail -F /some/files | sed -une '/Regular Expression/p'
is (a lot more efficient than using grep, and) a lot more reactive than if you don't use the -u switch in the sed command.
tail -F /some/files |
sed -une '/Regular Expression/p' |
stdbuf -i0 -o0 tee /some/resultfile
3. Recent journaling system
On recent systems, instead of tail -f /var/log/syslog you have to run journalctl -xf, in nearly the same way...
journalctl -axf | sed -une '/Regular Expression/p'
But read the man page; this tool was built for log analysis!
4. Integrating this in a bash script
Colored output of two files (or more)
Here is a sample script watching many files, coloring the output of the first file differently from the others:
#!/bin/bash
tail -F "$@" |
sed -une "
/^==> /{h;};
//!{
G;
s/^\\(.*\\)\\n==>.*${1//\//\\\/}.*<==/\\o33[47m\\1\\o33[0m/;
s/^\\(.*\\)\\n==> .* <==/\\o33[47;31m\\1\\o33[0m/;
p;}"
It works fine on my host, running:
sudo ./myColoredTail /var/log/{kern.,sys}log
Interactive script
Maybe you are watching logs in order to react to events?
Here is a little script that plays a sound when some USB device appears or disappears; the same script could send mail, or do any other interaction, like powering on the coffee machine...
#!/bin/bash
exec {tailF}< <(tail -F /var/log/kern.log)
tailPid=$!
while :;do
read -rsn 1 -t .3 keyboard
[ "${keyboard,}" = "q" ] && break
if read -ru $tailF -t 0 _ ;then
read -ru $tailF line
case $line in
*New\ USB\ device\ found* ) play /some/sound.ogg ;;
*USB\ disconnect* ) play /some/othersound.ogg ;;
esac
printf "\r%s\e[K" "$line"
fi
done
echo
exec {tailF}<&-
kill $tailPid
You can quit by pressing the Q key.
You certainly won't succeed with
tail -f /var/log/foo.log | grep --line-buffered string2search
when you use "colortail" as an alias for tail, e.g. in bash:
alias tail='colortail -n 30'
You can check with
type tail
If this outputs something like
tail is aliased to `colortail -n 30'
then you have your culprit :)
Solution:
remove the alias with
unalias tail
ensure that you're using the 'real' tail binary by this command
type tail
which should output something like:
tail is /usr/bin/tail
and then you can run your command
tail -f foo.log |grep --line-buffered something
Good luck.
Use awk (another great Unix utility) instead of grep when you don't have the line-buffered option! It will continuously stream your data from tail.
This is how you would use grep:
tail -f <file> | grep pattern
This is how you would use awk:
tail -f <file> | awk '/pattern/{print $0}'
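On a static sample the two filters produce the same lines (made-up input):

```shell
# awk prints every line matching /err/, just like grep would.
out=$(printf 'err: boom\nok line\nerr: again\n' | awk '/err/ { print $0 }')
echo "$out"
```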

Linux tail + grep + less

I want to see live output from my website's access log. I want to see only certain types of entries, in this case, entries that match ".php".
This works fine, but lines wrap to the next line and I don't want that:
tail -f access-log | fgrep ".php" --line-buffered
This works fine for avoiding the line wrapping, but it is not filtered:
less +F -S access-log
I prefer looking at the file without lines wrapping to the next line because it is easier to see the structure in the output, this is what I want less -S for.
This kind of works, but the "cursor" doesn't stay at the bottom of the file, and any command I enter makes less hang (press Shift+F to stay at the bottom as the stream comes in):
tail -f access-log | fgrep ".php" --line-buffered | less -S
But this doesn't work at all:
tail -f access-log | fgrep ".php" --line-buffered | less +F -S
So, is there a way to achieve the what I want?
I also take outside-the-box solutions, maybe cutting with sed so that each line is never longer than my screen?
With bash I suggest:
tail -f access-log | fgrep ".php" --line-buffered | cut -c 1-$COLUMNS
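cut -c keeps only the first N characters of each line, which is what prevents wrapping. Here with a fixed width instead of $COLUMNS (which may not be exported to non-interactive shells):

```shell
line='GET /very/long/path/to/some/resource.php HTTP/1.1 200'
# Truncate to 20 characters; the live pipeline would use 1-$COLUMNS.
short=$(printf '%s\n' "$line" | cut -c 1-20)
echo "$short"
```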
Consider using the watch command (quote the pipeline so the grep runs inside watch, rather than filtering watch's screen output):
watch -n1 'tail access-log | fgrep ".php"'
I already used the accepted answer for my case, but I guess if I really wanted to use less because I like other features it also offers, I could do this:
tail -f access-log | fgrep ".php" --line-buffered >> tmp.access-log
And then
less -S +F tmp.access-log
Then when I am done, if I don't need that tmp file I just delete it.

Why does 'top | grep > file' not work?

I tested the following command, but it doesn't work.
$> top -b -d 1 | grep java > top.log
It doesn't use standard error. I checked that it uses standard output, but top.log is always empty. Why is this?
By default, grep buffers output which implies that nothing would be written to top.log until the grep output exceeds the size of the buffer (which might vary across systems).
Tell grep to use line buffering on output. Try:
top -b -d 1 | grep --line-buffered java > top.log
On my embedded machine, grep didn't have the --line-buffered option, so I used this workaround instead:
while :; do top -b -n 1 | grep java >> top.log; done &
This way I could have a running monitor in the background for a program like "java" and keep all results in the file top.log.
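A bounded sketch of the same loop (three iterations with fabricated input instead of an endless top poll):

```shell
log=$(mktemp)
# Each pass filters one snapshot and appends the matches, just as the
# infinite `while :` loop does with real top output.
for i in 1 2 3; do
    printf 'java pid %s\nother pid %s\n' "$i" "$i" | grep java >> "$log"
done
count=$(wc -l < "$log")
rm -f "$log"
```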

Buffering problem when piping output between CLI programs

I'm trying to tail apache error logs through a few filters.
This works perfectly:
tail -fn0 /var/log/apache2/error.log | egrep -v "PHP Notice|File does not exist"
but there are some literal "\n" in the output which I want to replace with an actual new line so I pipe into perl:
tail -fn0 /var/log/apache2/error.log | egrep -v "PHP Notice|File does not exist" | perl -ne 's/\\n/\n/g; print"$_"'
This seems to have some caching issue (the first page hit produces nothing; on the second page hit, two loads of debugging info come out). It also seems a bit temperamental.
So I tried sed:
tail -fn0 /var/log/apache2/error.log | egrep -v "PHP Notice|File does not exist" | sed 's/\\n/\n/g'
which seems to suffer the same problem.
Correct: when most programs write to a file or pipe they buffer their output. You can control this in some cases: the GNU grep family accepts the --line-buffered option, specifically for use in pipelines like this. Also, in Perl you can use $| = 1; for the same effect. (GNU sed has -u/--unbuffered for the same purpose.)
It's the stuff at the beginning or middle of the pipeline that will be buffering, not the end (which is talking to your terminal so it will be line buffered) so you want to use egrep --line-buffered.
Looks like you can use -u for sed as in:
tail -f myLog | sed -u "s/\(joelog\)/^[[46;1m\1^[[0m/g" | sed -u 's/\\n/\n/g'
which tails the log, highlights 'joelog', and then adds line breaks where there are literal '\n' sequences
source:
http://www-01.ibm.com/support/docview.wss?uid=isg1IZ42070
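The literal-\n replacement from the pipeline above is easy to verify in isolation (GNU sed; -u is irrelevant on static input but kept to match the pipeline):

```shell
# The input contains a literal backslash-n, not a newline.
out=$(printf 'line1\\nline2\n' | sed -u 's/\\n/\n/g')
echo "$out"
```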
