piped sed does not output to file using ngrep - linux

I am using ngrep to filter some TCP packets to STDOUT.
It has now become important to log the output (after changing the result a bit using sed) into a file.
Piping it through sed looks OK on stdout, but no content is written to dump.log.
Below is the command:
ngrep -l -q -W none -i "^POST /somefile.php" tcp and port 80 | sed -e 's/^T/IP/g' >> dump.log
I have the impression that either sed or ngrep is blocking the content from being pushed through.

Add -u to GNU sed to load minimal amounts of data from the input and flush the output buffers more often.
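For example, the pipeline from the question could become something like this (a sketch; the filter and file name are taken from the question, and -u assumes GNU sed):
ngrep -l -q -W none -i "^POST /somefile.php" tcp and port 80 | sed -u -e 's/^T/IP/g' >> dump.log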

Related

Looking for a specific message in zipped pcap files

I have around 7200 compressed .pcap files. Each is compressed into a separate .gz file. I need to look for a specific string in packet data details. I would like to write a command to do that. At the moment all I have is:
zcat 20230212*.pcap.gz | tcpdump -qns 0 -X | grep "specyfic string"
where 20230212*.pcap.gz is the pattern for these 7200 files.
I know the problem is somewhere in the tcpdump part. Sorry for my English.
Update
I tried
tcpdump -qns 0 -A -r filename.pcap | grep "string"
where filename is the name of a specific file that contains the string. It works, but I had to unzip the file first, and I cannot do that for all files. Also tried:
tcpdump -qns 0 -X -r filename.pcap | grep "string"
but this command cannot find the string.
xargs zcat filename.pcap.gz | tcpdump -qns 0 -A -r | grep "string"
gives me: tcpdump: option requires an argument -- 'r'
tcpdump: option requires an argument -- 'r'
The -r flag needs to be given an argument to indicate what to read.
An argument of - means "read the standard input", which is what you want here, as you're piping the result of zcat to it.
So you want
zcat filename.pcap.gz | tcpdump -qns 0 -A -r - | grep "string"
You don't want xargs, because, with
xargs zcat filename.pcap.gz | tcpdump -qns 0 -A -r - | grep "string"
it will:
read file names from the standard input of the first command - meaning that, if you run that exact command from the command line, it will read file names from the terminal, so you would have to type a bunch of file names, followed by control-D to mark the end of the list of file names;
collect the file names into bunches;
run zcat filename.pcap.gz {bunch of file names} - meaning that it will decompress first filename.pcap.gz, followed by all of the files in that bunch, and write the decompressed contents of all those files as a single stream of raw bytes;
read more file names and do that again until it runs out of file names;
which means that what tcpdump will see will look like a bunch of pcap-format files stuck together ("concatenated") into one. That will NOT look like a single pcap-format file to tcpdump; instead, it will look like the first pcap file, followed by a lot of stuff that will not look like valid pcap file contents, so tcpdump will probably print an error and give up.
(And other programs that read pcap-format files, such as tshark, will do the exact same thing. There's no magic flag or tool to fix that.)
What you should do, instead, is have a small shell script, such as
#! /bin/sh
echo "Processing $1:"
zcat "$1" | tcpdump -qns 0 -A -r - | grep "$2"
and, to look for a given string in one .pcap.gz file, do
{path to script} {file name} "string"
where {path to script} is the path name of the script and {file name} is the pathname of the file.
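For example, if the script were saved as search_pcap.sh (a hypothetical name; the file name below is also made up for illustration), you would do
chmod +x search_pcap.sh
./search_pcap.sh 20230212-0000.pcap.gz "string"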
To scan all the files, do
for file in 20230212*.pcap.gz
do
{path to script} "$file" "string"
done >/tmp/output
That is a loop that loops over all files that match 20230212*.pcap.gz and, for each of them, runs the script on the file, looking for the string, and sends the output of that entire loop to the file /tmp/output.
Note that /tmp/output will contain one line for every file, giving the name of the file. If you don't care which capture files contain the string, you can remove the
echo "Processing $1:"
line from the script. If you do care which capture files contain the string, but you don't care what the exact text that matches is, you can have the script be
#! /bin/sh
echo "Processing $1:"
if zcat "$1" | tcpdump -qns 0 -A -r - | grep -q "$2"
then
echo "$1 contains \"$2\""
fi
which tests whether the grep command found the string and, if it did, prints a message. The -q flag causes grep not to write the matching text out, so the file doesn't have that extra information in it.
After using xargs zcat "filename" | tcpdump -qns 0 -X | grep "string", I receive:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond0, link-type EN10MB (Ethernet), capture size 262144 bytes
That's because you didn't provide a -r argument to tcpdump, which means that it will capture network traffic from a network interface; because you also didn't specify a -i argument, which would specify the interface from which to capture, it picks the first interface that shows up in the list it gets from the system, which happened to be bond0 on your system.
You need to specify -r to get tcpdump to read from a capture file.
but this command cannot find the string.
That command uses -X, not -A, so it dumped out packet data in a format like this:
0x0020: 5010 1920 a97a 0000 4854 5450 2f31 2e31 P....z..HTTP/1.1
0x0030: 2032 3030 204f 4b0d 0a44 6174 653a 2046 .200.OK..Date:.F
0x0040: 7269 2c20 3236 2041 7567 2032 3030 3520 ri,.26.Aug.2005.
There's no guarantee that the string will all fit on one line.

Problems with tail -f and awk? [duplicate]

Is that possible to use grep on a continuous stream?
What I mean is sort of a tail -f <file> command, but with grep on the output in order to keep only the lines that interest me.
I've tried tail -f <file> | grep pattern but it seems that grep can only be executed once tail finishes, that is to say never.
Turn on grep's line buffering mode when using BSD grep (FreeBSD, Mac OS X etc.)
tail -f file | grep --line-buffered my_pattern
It looks like a while ago --line-buffered didn't matter for GNU grep (used on pretty much any Linux) as it flushed by default (YMMV for other Unix-likes such as SmartOS, AIX or QNX). However, as of November 2020, --line-buffered is needed (at least with GNU grep 3.5 in openSUSE, but it seems generally needed based on comments below).
I use the tail -f <file> | grep <pattern> all the time.
It will wait till grep flushes, not till it finishes (I'm using Ubuntu).
I think that your problem is that grep uses some output buffering. Try
tail -f file | stdbuf -o0 grep my_pattern
It will set grep's output buffering mode to unbuffered.
If you want to find matches in the entire file (not just the tail), and you want it to sit and wait for any new matches, this works nicely:
tail -c +0 -f <file> | grep --line-buffered <pattern>
The -c +0 flag says that the output should start 0 bytes (-c) from the beginning (+) of the file.
In most cases, you can tail -f /var/log/some.log |grep foo and it will work just fine.
If you need to use multiple greps on a running log file and you find that you get no output, you may need to stick the --line-buffered switch into your middle grep(s), like so:
tail -f /var/log/some.log | grep --line-buffered foo | grep bar
You may consider this answer an enhancement. Usually I use
tail -F <fileName> | grep --line-buffered <pattern> -A 3 -B 5
-F is better in case of file rotation (-f will not work properly if the file is rotated)
-A and -B are useful to get lines just before and after the pattern occurrence; these blocks will appear between dashed-line separators
But I prefer doing the following:
tail -F <file> | less
This is very useful if you want to search inside streamed logs, i.e. go back and forward and look at them in depth.
Didn't see anyone offer my usual go-to for this:
less +F <file>
ctrl + c
/<search term>
<enter>
shift + f
I prefer this, because you can use ctrl + c to stop and navigate through the file whenever, and then just hit shift + f to return to the live, streaming search.
sed would be a better choice (stream editor)
tail -n0 -f <file> | sed -n '/search string/p'
and then if you wanted the tail command to exit once you found a particular string:
tail --pid=$(($BASHPID+1)) -n0 -f <file> | sed -n '/search string/{p; q}'
Obviously a bashism: $BASHPID will be the process id of the tail command. The sed command is next after tail in the pipe, so the sed process id will be $BASHPID+1.
Yes, this will actually work just fine. Grep and most Unix commands operate on streams one line at a time. Each line that comes out of tail will be analyzed and passed on if it matches.
This one command works for me (SUSE):
mail-srv:/var/log # tail -f /var/log/mail.info |grep --line-buffered LOGIN >> logins_to_mail
collecting logins to mail service
Coming somewhat late to this question, and considering this kind of work an important part of the monitoring job, here is my (not so short) answer...
Following logs using bash
1. Command tail
This command is a little more powerful than the already published answers suggest.
Difference between follow option tail -f and tail -F, from manpage:
-f, --follow[={name|descriptor}]
output appended data as the file grows;
...
-F same as --follow=name --retry
...
--retry
keep trying to open a file if it is inaccessible
This means: by using -F instead of -f, tail will re-open the file(s) when they are removed (on log rotation, for example).
This is useful for watching log files over many days.
Ability to follow more than one file simultaneously
I've already used:
tail -F /var/www/clients/client*/web*/log/{error,access}.log /var/log/{mail,auth}.log \
/var/log/apache2/{,ssl_,other_vhosts_}access.log \
/var/log/pure-ftpd/transfer.log
For following events through hundreds of files... (consider the rest of this answer to understand how to make it readable... ;)
Using the -n switch (don't use -c, which counts bytes, not lines!). By default, tail will show the last 10 lines. This can be tuned:
tail -n 0 -F file
Will follow the file, but only new lines will be printed.
tail -n +0 -F file
Will print the whole file before following its progression.
2. Buffer issues when piping:
If you plan to filter outputs, consider buffering! See the -u option for sed, --line-buffered for grep, or the stdbuf command:
tail -F /some/files | sed -une '/Regular Expression/p'
Is (besides being a lot more efficient than using grep) a lot more reactive than if you don't use the -u switch in the sed command.
tail -F /some/files |
sed -une '/Regular Expression/p' |
stdbuf -i0 -o0 tee /some/resultfile
3. Recent journaling system
On recent systems, instead of tail -f /var/log/syslog you have to run journalctl -xf, in much the same way...
journalctl -axf | sed -une '/Regular Expression/p'
But read the man page; this tool was built for log analysis!
4. Integrating this in a bash script
Colored output of two files (or more)
Here is a sample script watching many files, coloring the output of the 1st file differently from the others:
#!/bin/bash
tail -F "$#" |
sed -une "
/^==> /{h;};
//!{
G;
s/^\\(.*\\)\\n==>.*${1//\//\\\/}.*<==/\\o33[47m\\1\\o33[0m/;
s/^\\(.*\\)\\n==> .* <==/\\o33[47;31m\\1\\o33[0m/;
p;}"
It works fine on my host, running:
sudo ./myColoredTail /var/log/{kern.,sys}log
Interactive script
You may be watching logs in order to react to events?
Here is a little script that plays a sound when a USB device appears or disappears, but the same script could send mail, or perform any other interaction, like powering on the coffee machine...
#!/bin/bash
exec {tailF}< <(tail -F /var/log/kern.log)
tailPid=$!
while :;do
read -rsn 1 -t .3 keyboard
[ "${keyboard,}" = "q" ] && break
if read -ru $tailF -t 0 _ ;then
read -ru $tailF line
case $line in
*New\ USB\ device\ found* ) play /some/sound.ogg ;;
*USB\ disconnect* ) play /some/othersound.ogg ;;
esac
printf "\r%s\e[K" "$line"
fi
done
echo
exec {tailF}<&-
kill $tailPid
You can quit by pressing the Q key.
you certainly won't succeed with
tail -f /var/log/foo.log |grep --line-buffered string2search
when you use "colortail" as an alias for tail, e.g. in bash
alias tail='colortail -n 30'
you can check with
type tail
if this outputs something like
tail is aliased to `colortail -n 30'
then you have your culprit :)
Solution:
remove the alias with
unalias tail
ensure that you're using the 'real' tail binary by this command
type tail
which should output something like:
tail is /usr/bin/tail
and then you can run your command
tail -f foo.log |grep --line-buffered something
Good luck.
Use awk (another great utility) instead of grep where you don't have the line-buffered option! It will continuously stream your data from tail.
This is how you use grep:
tail -f <file> | grep pattern
This is how you would use awk
tail -f <file> | awk '/pattern/{print $0}'
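Note that awk also buffers its output when it is piped further; if you send the awk output into another command or a file, a hedged variant using awk's fflush() forces a flush after each matching line:
tail -f <file> | awk '/pattern/{print $0; fflush()}'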

How can I cat some continuous logs and grep a word in real time?

In Linux, I want to monitor the output of some tool, e.g. dbus-monitor's output. I want to grep a specific keyword from its output and then use that keyword as an input argument to another program. Like below, but it is not good:
dbus-monitor --system > d.log &
var=`cat d.log | grep some-key-word`
my_script.sh $var
I want to monitor the output flow in real time, not cat the whole log from the beginning, just its latest changes. E.g. dmesg provides an option, dmesg -w, which does what I want:
-w, --follow wait for new messages
So how do I make such a script that catches the latest new output and uses it continuously?
Instead of cat, use tail -F <file> | grep <something>. The -F option makes tail wait for and output all incoming data. Most likely, you will also need to modify the buffering mode of the standard streams with stdbuf -oL (by default, stdout going to a pipe or file is fully buffered, meaning that data is written out every couple of kilobytes and not after each line).
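Applied to the dbus-monitor example from the question, a minimal sketch could look like this (my_script.sh and some-key-word are the placeholders from the question):
stdbuf -oL dbus-monitor --system |
  grep --line-buffered "some-key-word" |
  while IFS= read -r line; do
    my_script.sh "$line"
  done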

bash standard output cannot be redirected into a file

I am reading 'Advanced Bash Scripting'; in Chapter 31 there is a problem I cannot figure out.
tail -f /var/log/msg | grep 'error' >> logfile
Why is nothing output into logfile?
Can you offer me an explanation?
Thank you in advance.
As @chepner comments, grep is using a larger buffer (perhaps 4k or more) to buffer its stdout. Most of the standard utilities do this when piping or redirecting to a file. They typically only switch to line-buffered mode when outputting directly to the terminal.
You can use the stdbuf utility to force grep to do line buffering of its output:
tail -f /var/log/msg | stdbuf -oL grep 'error' >> logfile
As an easily observable demonstration of this effect, you can try the following two commands:
for ((i=0;;i++)); do echo $i; sleep 0.001; done | grep . | cat
and
for ((i=0;;i++)); do echo $i; sleep 0.001; done | stdbuf -oL grep . | cat
In the first command, the output from grep . (i.e. match all lines) will be buffered going into the pipe to cat. On mine the buffer appears to be about 4k. You will see the ascending numbers output in chunks as the buffer gets filled and then flushed.
In the second command, grep's output into the pipe to cat is line-buffered, so you should see terminal output for every line, i.e. more-or-less continuous output.
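For comparison, assuming GNU grep, the --line-buffered switch gives more-or-less the same continuous output as the stdbuf variant:
for ((i=0;;i++)); do echo $i; sleep 0.001; done | grep --line-buffered . | cat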

Sed operation on streaming data

I need to perform a sed operation on streaming data:
tail -f sourcefile | sed -n 's/text1/text2/p' >~/destinationfile
I tried the above command but could not get it to work.
Both programs are linked against libc, and libc performs internal buffering on input/output operations. Buffering will be line-based when stdout(!) is a terminal, but block-based when stdout is a pipe. The block-based buffering uses larger buffers, and the consuming application has to wait until the buffer is filled, the end of the stream is reached, or the program calls flush() on the file descriptor. However, neither tail nor sed calls flush() (with default command-line options).
In your case you can see this block-based buffering in effect, because the data is going through a pipe and then into a file rather than straight to a terminal.
Solution: You can issue the stdbuf command to disable the input buffering of tail:
if you only want to see sed's output in terminal:
stdbuf -i0 tail -f /var/log/syslog | sed -n 's/CRON/cron/p'
if you are piping to a file sed's output buffer now needs to be disabled as well!
touch output.txt
tail -f output.txt & # tail output in background in order to see
# file changes immediately
stdbuf -i0 tail -f /var/log/syslog | stdbuf -o0 sed -n 's/CRON/cron/p' > output.txt
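Alternatively, assuming GNU sed, the -u (unbuffered) switch mentioned in the earlier answers can replace the stdbuf -o0 wrapper on sed's side of the pipe:
stdbuf -i0 tail -f /var/log/syslog | sed -un 's/CRON/cron/p' > output.txt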
