Looking for a specific message in zipped pcap files - linux

I have around 7200 compressed .pcap files, each compressed into a separate .gz file. I need to look for a specific string in the packet data. I would like to write a command to do that. At the moment all I have is:
zcat 20230212*.pcap.gz | tcpdump -qns 0 -X | grep "specific string"
where 20230212*.pcap.gz is the pattern that matches these files.
I know that the problem is somewhere in the tcpdump part. Sorry for my English.
Update
I tried
tcpdump -qns 0 -A -r filename.pcap | grep "string"
where filename is the name of a specific file that contains the string. It works, but I had to unzip this file, and I cannot do that for all the files. I also tried:
tcpdump -qns 0 -X -r filename.pcap | grep "string"
but this command cannot find string.
xargs zcat filename.pcap.gz | tcpdump -qns 0 -A -r | grep "string"
gives me: tcpdump: option requires an argument -- 'r'

tcpdump: option requires an argument -- 'r'
The -r flag needs to be given an argument to indicate what to read.
An argument of - means "read the standard input", which is what you want here, as you're piping the result of zcat to it.
So you want
zcat filename.pcap.gz | tcpdump -qns 0 -A -r - | grep "string"
You don't want xargs, because, with
xargs zcat filename.pcap.gz | tcpdump -qns 0 -A -r - | grep "string"
it will:
read file names from the standard input of the first command - meaning that, if you run that exact command from the command line, it will read file names from the terminal, so you would have to type a bunch of file names, followed by control-D to mark the end of the list of file names;
collect the file names into bunches;
run zcat filename.pcap.gz {bunch of file names} - meaning that it will decompress first filename.pcap.gz, followed by all of the files in that bunch, and write the decompressed contents of all those files as a single stream of raw bytes;
read more file names and do that again until it runs out of file names;
which means that what tcpdump will see will look like a bunch of pcap-format files stuck together ("concatenated") into one. That will NOT look like a single pcap-format file to tcpdump; instead, it will look like the first pcap file, followed by a lot of stuff that will not look like valid pcap file contents, so tcpdump will probably print an error and give up.
(And other programs that read pcap-format files, such as tshark, will do the exact same thing. There's no magic flag or tool to fix that.)
What you should do, instead, is have a small shell script, such as
#! /bin/sh
echo "Processing $1:"
zcat "$1" | tcpdump -qns 0 -A -r - | grep "$2"
and, to look for a given string in one .pcap.gz file, do
{path to script} {file name} "string"
where {path to script} is the path name of the script and {file name} is the pathname of the file.
To scan all the files, do
for file in 20230212*.pcap.gz
do
{path to script} "$file" "string"
done >/tmp/output
That is a loop that loops over all files that match 20230212*.pcap.gz and, for each of them, runs the script on the file, looking for the string, and sends the output of that entire loop to the file /tmp/output.
Note that /tmp/output will contain one line for every file, giving the name of the file. If you don't care which capture files contain the string, you can remove the
echo "Processing $1:"
line from the script. If you do care which capture files contain the string, but you don't care what the exact text that matches is, you can have the script be
#! /bin/sh
echo "Processing $1:"
if zcat "$1" | tcpdump -qns 0 -A -r - | grep -q "$2"
then
echo "$1 contains \"$2\""
fi
which tests whether the grep command found the string and, if it did, prints a message. The -q flag causes grep not to write the matching text out, so the file doesn't have that extra information in it.
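The script and the loop can also be folded into a single shell function. This is only a sketch: scan_pcap_gz is a made-up name, gzip -dc is used as an equivalent of zcat, grep -F treats the pattern as a fixed string rather than a regular expression, and tcpdump's messages on stderr are discarded (remove the 2>/dev/null if you want to see read errors).

```shell
#!/bin/sh
# Sketch: report which gzip-compressed pcap files contain a fixed string.
# scan_pcap_gz is a hypothetical helper name, not part of tcpdump.
# Usage: scan_pcap_gz PATTERN FILE...
scan_pcap_gz() {
    pattern=$1
    shift
    for file in "$@"; do
        # Decompress one capture at a time, so tcpdump sees exactly
        # one pcap file on its standard input per run.
        if gzip -dc -- "$file" | tcpdump -qns 0 -A -r - 2>/dev/null |
            grep -qF -- "$pattern"; then
            printf '%s\n' "$file"
        fi
    done
}
# Example call (pattern first, then the files):
#   scan_pcap_gz "string" 20230212*.pcap.gz > /tmp/output
```

Because grep -q stops at the first match, files that contain the string early on are finished quickly.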
After using: xargs zcat "filename" | tcpdump -qns 0 -X | grep "string", I receive tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on bond0, link-type EN10MB (Ethernet), capture size 262144 bytes
That's because you didn't provide a -r argument to tcpdump, which means that it will capture network traffic from a network interface; because you also didn't specify a -i argument, which would specify an interface from which to capture, it will pick the first interface that shows up in the list it gets from the system, which happened to be bond0 on your system.
You need to specify -r to get tcpdump to read from a capture file.
but this command cannot find string.
That command uses -X, not -A, so it dumped out packet data in a format like this:
0x0020: 5010 1920 a97a 0000 4854 5450 2f31 2e31 P....z..HTTP/1.1
0x0030: 2032 3030 204f 4b0d 0a44 6174 653a 2046 .200.OK..Date:.F
0x0040: 7269 2c20 3236 2041 7567 2032 3030 3520 ri,.26.Aug.2005.
There's no guarantee that the string will all fit on one line.
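To see the effect without a capture file, you can grep the three sample lines above directly. This is just an illustration; count_matching_lines is a made-up helper name.

```shell
#!/bin/sh
# The three sample lines of `tcpdump -X` output quoted above.
dump='0x0020: 5010 1920 a97a 0000 4854 5450 2f31 2e31 P....z..HTTP/1.1
0x0030: 2032 3030 204f 4b0d 0a44 6174 653a 2046 .200.OK..Date:.F
0x0040: 7269 2c20 3236 2041 7567 2032 3030 3520 ri,.26.Aug.2005.'

# count_matching_lines is a hypothetical helper: it prints how many
# lines of the sample contain the given fixed string.
count_matching_lines() {
    printf '%s\n' "$dump" | grep -cF -- "$1"
}

# "HTTP/1.1" happens to sit entirely on the first line.
count_matching_lines 'HTTP/1.1' || true    # prints 1
# "Date: Fri" is split across two lines, and the space is shown as ".".
count_matching_lines 'Date: Fri' || true   # prints 0
```

With -A the payload is printed as running text instead of 16-byte rows, which is why the same grep works there.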

Related

grep show output, then do something with it

I am trying to use a script to find messages on a mail server. I need to see both the file location and the matching output. So I run
grep -r KTsiPtCf0YDQvtC00LDQstC10YYt0LrQvtC90YHRg9C70YzRgtCw0L3RgiDRgdCw0LvQ ./aaa/Maildir/cur/
output is
./aaa/Maildir/cur/1506410272.M30769P16754.ml.exmp.com,S=134882,W=136851:2,S:KTsiPtCf0YDQvtC00LDQstC10YYt0LrQvtC90YHRg9C70YzRgtCw0L3RgiDRgdCw0LvQ
Then I need to cut everything before the ":" and make the search result readable.
grep -r KTsiPtCf0YDQvtC00LDQstC10YYt0LrQvtC90YHRg9C70YzRgtCw0L3RgiDRgdCw0LvQ ./aaa/Maildir/cur/ | sed 's/.*://' |base64 -d| enca -L ru -x utf-8
But if I do it with a pipe, I lose the file location. How can I output the location of the file and then run the pipe?
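One possible approach, sketched under a few assumptions: list the matching files first with grep -l, print each name, and only then decode the matching lines of that file. show_matches is a made-up name, file names are assumed not to contain newlines, the pattern is treated as a fixed string (-F), and base64 -d is the GNU coreutils spelling.

```shell
#!/bin/sh
# show_matches is a hypothetical helper name.
# Usage: show_matches PATTERN DIRECTORY
show_matches() {
    pattern=$1
    dir=$2
    # -l prints only the names of the files that contain the pattern.
    grep -rlF -- "$pattern" "$dir" | while IFS= read -r file; do
        printf '%s\n' "$file"
        # -h suppresses the file name, so only the matching text is decoded.
        grep -hF -- "$pattern" "$file" | while IFS= read -r line; do
            printf '%s\n' "$line" | base64 -d
            printf '\n'
        done
    done
}
```

To get the same conversion as in the question, append | enca -L ru -x utf-8 after base64 -d. Splitting the grep output on ":" is avoided on purpose, since the Maildir file names themselves contain colons.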

piped sed does not output to file using ngrep

I am using ngrep to filter some TCP packets to STDOUT.
It has now become more important to log the output (after changing the result a bit using sed) to a file.
Piping it through sed looks OK on stdout, but no content is written when redirecting to dump.log.
Below is the command:
ngrep -l -q -W none -i "^POST /somefile.php" tcp and port 80 | sed -e 's/^T/IP/g' >> dump.log
I have the impression that either sed or ngrep is holding the content back instead of pushing it through.
Add -u to GNU sed to load minimal amounts of data from the input and flush the output buffers more often.
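In GNU sed the unbuffered mode is spelled -u (long form --unbuffered). When sed writes to a file instead of a terminal, its output is normally held in a buffer until the buffer fills or sed exits, which is why dump.log stays empty for a while. A minimal sketch, assuming GNU sed; relabel is a made-up helper name:

```shell
#!/bin/sh
# relabel is a hypothetical helper name. It assumes GNU sed, whose -u
# (--unbuffered) option flushes the output after every line.
relabel() {
    sed -u -e 's/^T/IP/'
}
# Intended use with the capture from the question (needs ngrep and
# usually root privileges, so it is left commented out here):
#   ngrep -l -q -W none -i '^POST /somefile.php' tcp and port 80 | relabel >> dump.log
```

The -l option already makes ngrep's own output line-buffered, so only the sed side needs changing.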

Hexdump reverse command

The hexdump command converts any file to hex values.
But what if I have hex values and I want to reverse the process, is this possible?
There is a similar tool called xxd. If you run xxd with just a file name it dumps the data in a fairly standard hex dump format:
# xxd bdata
0000000: 0001 0203 0405                           ......
Now if you pipe the output back to xxd with the -r option and redirect that to a new file, you can convert the hex dump back to binary:
# xxd bdata | xxd -r >bdata2
# cmp bdata bdata2
# xxd bdata2
0000000: 0001 0203 0405
I've written a short AWK script which reverses hexdump -C output back to the
original data. Use like this:
reverse-hexdump.sh hex.txt > data
Handles '*' repeat markers and generating original data even if binary.
hexdump -C and reverse-hexdump.sh make a data round-trip pair. It is
available here:
GitHub reverse-hexdump repo
Direct to reverse-hexdump.sh
Restore file, given only the output of hexdump file
If you only have the output of hexdump file and want to restore the original file, first note that hexdump's default output depends on the endianness of the system you ran hexdump on!
If you have access to the system that created the dump, you can determine its endianness using the command below:
[[ "$(printf '\01\03' | hexdump)" == *0103* ]] && echo big || echo little
Reversing little-endian hexdump
This is the most common case. All x86/x64 systems are little-endian. If you don't know the endianness of the system that ran hexdump file, try this.
sed 's/ \(..\)\(..\)/ \2\1/g;$d' dump | xxd -r
The sed part converts hexdump's format into xxd's format, at least so far that xxd -r works.
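To see what the byte swap does, here it is applied to a single line of hexdump output. swap_words is a made-up name for just the substitution part of the sed command (without the $d that drops the final offset line).

```shell
#!/bin/sh
# swap_words is a hypothetical name for the byte-swapping part of the
# sed command above.
swap_words() {
    sed 's/ \(..\)\(..\)/ \2\1/g'
}
# "hello\n" is the bytes 68 65 6c 6c 6f 0a; hexdump on a little-endian
# machine prints them as the 16-bit words below.
printf '0000000 6568 6c6c 0a6f\n' | swap_words
# prints: 0000000 6865 6c6c 6f0a
```

The offset at the start of the line is left alone because the pattern only matches four hex digits that follow a space.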
Reversing big-endian hexdump
sed '$d' dump | xxd -r
Known Bugs (see comment section)
A trailing null byte is added if the original file was of odd length (e.g. 1, 3, 5, 7, ... bytes long).
Repeating sections of the original file are not restored correctly if they were hexdumped using a *.
You can check your dump for the problematic cases above by running the command below:
grep -qE '^\*|^[0-9a-f]*[13579bdf] *$' dump && echo bug || echo ok
Better alternative to create hexdumps in the first place
Besides the non-posix (and therefore not so portable) xxd there is od (octal dump) which should be available on all unix-like systems as it is specified by posix:
od -tx1 -An -v
Will print a hexadecimal dump, grouping digits as single bytes (-tx1), with no Address prefixes (-An, similar to xxd -p) and without abbreviating repeated sections as * (-v). You can reverse such a dump using xxd -r -p.
As someone who sucks at bash, I could not understand the examples already posted.
Here is what would have helped me when I was originally searching:
Take your text file "AYE.TXT" and convert it into a hex dump called "BEE.TXT"
xxd -p "AYE.TXT" > "BEE.TXT"
Take your hex dump file ("BEE.TXT") and convert it back to ascii file "CEE.TXT"
xxd -r -p "BEE.TXT" > "CEE.TXT"
Now that you have some simple working code, feel free to check out
"xxd -help" on the command line for an explanation of what all those flags do.
(That part is the easy part, the hard part is the specifics of the bash syntax)
There is a tonne of more elegant ways to get this done, but I've quickly hacked something together that Works for Me (tm) when regenerating a binary file from a hex dump generated by hexdump -C some_file.bin:
sed 's/\(.\{8\}\) \(..\) \(..\) \(..\) \(..\) \(..\) \(..\) \(..\) \(..\)/\1: \2\3 \4\5 \6\7 \8\9/g' some_file.hexdump | sed 's/\(.*\) \(..\) \(..\) \(..\) \(..\) \(..\) \(..\) \(..\) \(..\) |/\1 \2\3 \4\5 \6\7 \8\9 /g' | sed 's/.$//g' | xxd -r > some_file.restored
Basically, it uses 2 sed processes, each handling its part of each line. Ugly, but someone might find it useful.
If you don't have xxd, use hexdump, od, perl or python:
The following all give the same output:
# If you only have hexdump
hexdump -ve '1/1 "%.2x"' mybinaryfile > mydump
# This gives exactly the same output as:
xxd -p mybinaryfile > mydump
# Or, much slower:
od -v -t x1 -An < mybinaryfile | tr -d "\n " > mydump
# Or, the fastest:
perl -pe 'BEGIN{$/=\1e6} $_=unpack "H*"' < mybinaryfile > mydump
# Or, if you somehow have Python, and not Perl:
python -c "print(open('mybinaryfile','rb').read().hex())" > mydump
Then you can copy and paste, or pipe the output, and convert back with:
xxd -r -p mydump mybinaryfileagain
# Or
xxd -r -p < mydump > mybinaryfileagain
The hexdump command is available almost everywhere, and is usually part of the default busybox - if it's not linked, you can try running busybox hexdump or busybox xxd.
If xxd is not available to reverse the data, then you can try awk
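The awk route is not spelled out here. As a different fallback (a swapped-in technique, not the awk approach), the following sketch reverses a plain hex dump such as the mydump file above using only POSIX shell utilities. hex_to_bin is a made-up name, the input is assumed to contain nothing but hex digits, spaces and newlines, and the loop is slow on large files.

```shell
#!/bin/sh
# hex_to_bin is a hypothetical helper: it reads plain hex digits on
# standard input and writes the corresponding raw bytes.
hex_to_bin() {
    # Drop whitespace, then put one two-digit hex pair on each line.
    tr -d ' \n' | fold -w 2 |
        while IFS= read -r byte || [ -n "$byte" ]; do
            # Convert the pair to octal and let printf's %b emit that byte.
            printf '%b' "\\0$(printf '%o' "0x$byte")"
        done
}
# Example call:
#   hex_to_bin < mydump > mybinaryfileagain
```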
The old days: Zmodem
In the old days we used to use X/Y/Zmodem which is available in the package lrzsz which can tolerate lossy comms - but it's a bidirectional protocol so the binaries need to be running at the same time and there needs to be bidirectional comms:
# Demo on local machine, using FIFOs
mkfifo /tmp/fifo-in
mkfifo /tmp/fifo-out
sz -b mybinaryfile > /tmp/fifo-out < /tmp/fifo-in
mkdir out; cd out
rz -b < /tmp/fifo-out > /tmp/fifo-in
Luckily, screen supports receiving Zmodem, so if you're in a screen session:
screen
telnet somehost
Then type Ctrl+A and : and then zmodem catch and Enter. Then inside the screen on the remote host, issue:
# sz -b mybinaryfile
Press Enter when you see the string starting with "!!!".
When you see "Transfer Complete", you may want to run reset if you want to continue the terminal session normally.
This program reverses hexdump -C output back to the original data.
Usage:
make
make test
./unhexdump -i inputfile -o outputfile
see https://github.com/zhouzq-thu/unhexdump!
I found a simpler solution:
bin2hex
echo -n "abc" | hexdump -ve '1/1 "%02x"'
hex2bin
echo -n "616263" | grep -Eo ".{2}" | sed 's/\(.*\)/\\x\1/' | tr -d '\n' | xargs -0 echo -ne

How to get lines that don't contain certain patterns

I have a file that contain many lines :
ABRD0455252003666
JLKS8568875002886
KLJD2557852003625
.
.
.
AION9656532007525
BJRE8242248007866
I want to extract the lines that do not start with ABRD or AION and that have the numbers 003 or 007 in columns 12 to 14.
The output should be
KLJD2557852003625
BJRE8242248007866
I have tried this and it works but it s too long command and I want to optimise it for performance concerns:
egrep -a --text '^.{12}(?:003|007)' file.txt > result.txt |touch results.txt && chmod 777 results.txt |egrep -v -a --text "ABRD|AION" result.txt > result2.text
The -a option is a non-standard extension for dealing with binary files; it is not needed for text files.
grep -E '^.{11}(003|007)' file.txt | grep -Ev '^(ABRD|AION)'
The first stage matches any line with either 003 or 007 in the twelfth through fourteenth column.
The second stage filters out any line starting with either ABRD or AION.
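If a single process is preferred over two greps, both conditions can also be checked in one awk pass. This is only a sketch of an alternative to the grep pipeline above; filter_lines is a made-up name, and the sample input is the five lines shown in the question.

```shell
#!/bin/sh
# filter_lines is a hypothetical helper name. It keeps lines that do
# not start with ABRD or AION and have 003 or 007 in columns 12 to 14.
filter_lines() {
    awk '!/^(ABRD|AION)/ && substr($0, 12, 3) ~ /^(003|007)$/' "$@"
}
printf '%s\n' ABRD0455252003666 JLKS8568875002886 KLJD2557852003625 \
    AION9656532007525 BJRE8242248007866 | filter_lines
# prints:
# KLJD2557852003625
# BJRE8242248007866
```

For a real file, call it as filter_lines file.txt > result.txt.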
You really need to just read a regexp tutorial, but in the meantime, to match lines that do start with ABRD or AION and have 003 or 007 in columns 12 to 14, try this:
grep -E "^(ABRD|AION).{7}00[37]"

Ambiguous Redirection on shell script

I was trying to create a little shell script that allowed me to check the transfer progress when copying large files from my laptop's hdd to an external drive.
From the command line this is a simple feat using pv and simple redirection, although the line is rather long and you must know the file size (which is why I wanted the script):
console: du filename (to get the exact file size)
console: cat filename | pv -s FILE_SIZE -e -r -p > dest_path/filename
On my shell script I added egrep "[0-9]{1,}" -o to strip the filename and keep just the size numbers from the return value of du, and the rest should be straightforward.
#!/bin/bash
du $1 | egrep "[0-9]{1,}" -o
sudo cat $1 | pv -s $? -e -r -p > $2/$1
The problem is when I try to copy file12345.mp3 using this I get an ambiguous redirection error because egrep is getting the 12345 from the filename, but I just want the size.
Which means the return value from the first line is actually:
FILE_SIZE
12345
which bugs it.
How should I modify this script to parse just the first numbers until the first " " (space)?
Thanks in advance.
If I understand you correctly:
To retain only the filesize from the du command output:
du $1 | awk '{print $1}'
(assuming the 1st field is the size of the file)
Add double quotes to your redirection to avoid the error:
sudo cat $1 | pv -s $? -e -r -p > "$2/$1"
This quoting is done since your $2 contains spaces.
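Note that $? holds the exit status of the previous command, not its output, so the size has to be captured in a variable. A possible version of the whole script, as a sketch: copy_with_progress is a made-up name, wc -c is used instead of du because pv -s expects a size in bytes while du reports disk usage in blocks, and sudo is left out (add it back if the source file needs it).

```shell
#!/bin/sh
# copy_with_progress is a hypothetical helper name.
# Usage: copy_with_progress SOURCE_FILE DEST_DIR
copy_with_progress() {
    src=$1
    dest_dir=$2
    # wc -c counts bytes; tr strips the padding some wc versions add.
    size=$(wc -c < "$src" | tr -d ' ')
    # Quote every expansion so that spaces in names cannot split the
    # redirection target into several words.
    pv -s "$size" -e -r -p "$src" > "$dest_dir/$(basename "$src")"
}
# Example call:
#   copy_with_progress file12345.mp3 "/media/external drive"
```

Because the size no longer comes from parsing text that includes the file name, digits in names like file12345.mp3 cannot leak into it.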
