Parsing nmap -oG output using sed - linux

I have a logfile
...
Host: 111.222.121.123 (111.222.121.123.deploy.static.akamaitechnologies.com) Ports: 80/open/tcp//http//AkamaiGHost (Akamai's HTTP Acceleration|Mirror service)/, 443/open/tcp//ssl|http//AkamaiGHost (Akamai's HTTP Acceleration|Mirror service)/
Host: 1.2.3.4 () Ports: 80/open/tcp//http//cloudflare/, 443/open/tcp//ssl|https//cloudflare/, 2052/open/tcp//clearvisn?///, 2053/open/tcp//ssl|http//nginx/, 2082/open/tcp//infowave?///, 2083/open/tcp//ssl|http//nginx/, 2086/open/tcp//gnunet?///, 2087/open/tcp//ssl|http//nginx/, 2095/open/tcp//nbx-ser?///, 2096/open/tcp//ssl|http//nginx/, 8080/open/tcp//http-proxy//cloudflare/, 8443/open/tcp//ssl|https-alt//cloudflare/, 8880/open/tcp//cddbp-alt?///
Host: 2.3.4.5 (a104-96-1-61.deploy.static.akamaitechnologies.com) Ports: 53/open/tcp//domain//(unknown banner: 29571.61)/
...
I need to extract and convert IPs and http ports to the following format
1.2.3.4:80,443,2083
There are just two types of port fields in the logfile
80/open/tcp//http
2083/open/tcp//ssl|http
Tried to use sed but without success. I ended up with this dysfunctional command
cat ../host_ports.txt | sed -rn 's/Host: ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*?([0-9]{1,5}\/open\/tcp\/\/http|[0-9]{1,5}\/open\/tcp\/\/ssl\|http).*/\1 \2/p'

First handle the repeating ports, and next replace Host/Port to the desired format.
sed -r 's/(Ports:|,) ([0-9]*)[^,]*/\1\2/g;s/Host: ([^ ]*).*Ports:/\1:/' ../host_ports.txt
EDIT:
My first version printed every port on any line that mentioned http somewhere. This version keeps only the ports that have http in their own description.
sed -nr 's/Ports: /, /;
s/, ([0-9]*)[^,]*http[^,]*/,\1/g;
s/,[^,]*\/[^,]*//g;
s/Host: ([^ ]*)[^,]*,/\1:/p' ../host_ports.txt

This script will do it for you, and you don't need sed:
#!/bin/bash
while read -r line; do
  # quote "$line" so the shell does not split or glob the log line
  if echo "$line" | grep -q "http"; then
    host=$(echo "$line" | grep -Po '(?<=^Host: )[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+')
    ports=$(echo "$line" | grep -Po '[0-9]+((?=\/open\/tcp\/\/http)|(?=\/open\/tcp\/\/ssl\|http))' | tr '\n' ',')
    echo "$host:${ports:0:-1}"
  fi
done < ../host_ports.txt
The first grep extracts the IP address with the help of a lookbehind. The -P option enables Perl-compatible regular expressions, and -o prints only the matched text.
The second regex is much like the first, but uses lookaheads instead of a lookbehind. It captures only port numbers that are followed by /open/tcp//http or /open/tcp//ssl|http. The tr right after it replaces newlines with commas.
The ${ports:0:-1} just removes the trailing comma.
Hope this helps!
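For comparison, here is a minimal awk sketch of the same extraction, assuming the log lines look like the samples in the question. The function name http_ports is made up for illustration. Unlike the regex approaches above, it compares the service field exactly, so https and http-proxy entries are skipped; loosen the comparison if you want those too.

```shell
#!/bin/bash
# http_ports (hypothetical helper): read nmap -oG "Host:" lines on stdin and
# print ip:port,port,... for open ports whose service is http or ssl|http.
http_ports() {
  awk '
    /^Host: / && /Ports: / {
      ip = $2                           # second whitespace-separated field
      sub(/^.*Ports: /, "")             # keep only the port list
      n = split($0, entries, /, /)
      out = ""
      for (i = 1; i <= n; i++) {
        split(entries[i], f, "/")       # f[1]=port f[2]=state f[5]=service
        if (f[2] == "open" && (f[5] == "http" || f[5] == "ssl|http"))
          out = out (out == "" ? "" : ",") f[1]
      }
      if (out != "") print ip ":" out
    }'
}

# Usage (path taken from the question):
#   http_ports < ../host_ports.txt
```

Splitting each entry on "/" avoids the lookarounds entirely, so this should also work where grep -P is not available.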

Related

Shell execute grep command within sed replacement not as expected

I am working on a deployment script for Kubernetes and want to run a single command that replaces some "template variables" in a YAML file using sed.
I have the following (example, shortened YAML) input file in which I want to do the replacements:
input.txt
CONTAINER_ADDITIONAL
spec:
selector:
app: CONTAINER_NAME
type: NodePort
ports:
- protocol: TCP
port: EXPOSED_PORT
I also have the following Dockerfile, from which I want to take the EXPOSE port number and insert it in place of EXPOSED_PORT:
Dockerfile
FROM node:latest
EXPOSE 3000
WORKDIR /app
I have now tried different approaches to get the port number 3000 (from the Dockerfile) inserted in place of EXPOSED_PORT
(CONTAINER_NAME and CONTAINER_ADDITIONAL are working; they are in the example file for completeness, together with the command below).
The following commands can be executed directly in the shell and give the wanted result (3000):
cat Dockerfile | grep EXPOSE | cut -d" " -f2 -> the trailing \n may still be there
cat Dockerfile | grep EXPOSE | cut -d" " -f2 | tr -d "\n" -> previously mentioned \n removed
grep EXPOSE Dockerfile | cut -d" " -f2 -> with \n
grep EXPOSE Dockerfile | cut -d" " -f2 | tr -d "\n" -> without \n
grep EXPOSE Dockerfile | awk '{print $2}' -> with \n, uses single quotes - not an option?
grep EXPOSE Dockerfile|tr -d -c 0-9 -> without \n, not preferred (breaks when there are multiple port numbers separated by spaces)
grep -Po "(?<=^EXPOSE )\w*$" Dockerfile -> with \n, not preferred (multiple port numbers)
grep -Po "(?<=^EXPOSE )\w*$" Dockerfile | tr -d "\n" -> without \n
HOWEVER:
sed -e 's#CONTAINER_NAME#some-container-name#g;s#CONTAINER_ADDITIONAL#cat some_config.txt#e;s#EXPOSED_PORT#grep EXPOSE Dockerfile | cut -d" " -f2 | tr -d "\n"#e' input.txt
Does not work for the EXPOSED_PORT. The other two variables CONTAINER_NAME and CONTAINER_ADDITIONAL work (the cat gets executed, content of some_config.txt is being put in there)
None of the commands above, all of which give the correct result when run directly in the shell, work when executed from sed (the awk one certainly not, because of its single quotes).
The output I get is:
inserted_content_of: some_config.txt
some_more_inserted_content_from: some_config.txt
spec:
selector:
app: some-container-name
type: NodePort
ports:
- protocol: TCP
sh: 1: port:: not found
The expected output that i want to have:
inserted_content_of: some_config.txt
some_more_inserted_content_from: some_config.txt
spec:
selector:
app: some-container-name
type: NodePort
ports:
- protocol: TCP
port: 3000
Is there anything that I am doing wrong with the sed command?
Is there some explanation what goes wrong?
How can i solve this issue?
I would think that
grep EXPOSE Dockerfile | cut -d" " -f2
is a good way to get the port number (the trailing \n is stripped when it is used in a command substitution). It may be easier to save it in a variable before your next step.
portnumber=$(grep EXPOSE Dockerfile | cut -d" " -f2)
echo "The portnumber will be [${portnumber}]."
I will replace your grep command with echo "4444" to show the problem with your nested command.
The e flag asks sed to execute the entire resulting line as a command after processing s#EXPOSED_PORT#echo "4444"#. The line with EXPOSED_PORT is
port: EXPOSED_PORT
So sed tries to execute port: echo "4444", and the shell complains that the command port: is not found.
When you want to use the #e, you should use something like
sed -r 's#(.*)(EXPOSED_PORT)(.*)#echo "\1$(echo "4444")\3"#e' input.txt
A simpler approach is to let the shell expand the variable inside the sed expression:
sed "s#EXPOSED_PORT#${portnumber}#" input.txt
# or, if you really want to squeeze your command into this line
sed "s#EXPOSED_PORT#$(echo "4444")#" input.txt
Or look at awk:
# sub() keeps the line's indentation; assigning to $2 would rebuild the line without it
awk -v port="${portnumber}" '{sub(/EXPOSED_PORT/, port)} 1' input.txt
# or nested
awk -v port="$(echo "4444")" '{sub(/EXPOSED_PORT/, port)} 1' input.txt
The issue is that the substitution changes only part of the line, not the whole line. Because the leading "port:" is left in place, sed tries to execute it along with the grep command, hence the error. To overcome this, search for and replace the whole line:
sed -e 's#CONTAINER_NAME#some-container-name#g;s#CONTAINER_ADDITIONAL#cat some_config.txt#e;s#^.*EXPOSED_PORT#echo " port: $(grep EXPOSE Dockerfile | cut -d" " -f2 | tr -d "\n")"#e' input.txt
This echoes the leading port: together with the output of the grep command.
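The variable-based approach from the answers above can be wrapped in a small function. This is only a sketch: the name render_port and its two arguments are made up for illustration, and it assumes the Dockerfile has a plain "EXPOSE <port>" line and that the first one is the one you want.

```shell
#!/bin/bash
# render_port DOCKERFILE TEMPLATE (hypothetical helper)
# Prints TEMPLATE with every EXPOSED_PORT replaced by the first EXPOSE value.
render_port() {
  # Take the second field of the first line whose first field is EXPOSE.
  exposed_port=$(awk '$1 == "EXPOSE" { print $2; exit }' "$1")
  if [ -z "$exposed_port" ]; then
    echo "no EXPOSE line found in $1" >&2
    return 1
  fi
  # Double quotes let the shell expand the variable before sed sees it,
  # so the e flag is not needed and the rest of the line stays untouched.
  sed "s#EXPOSED_PORT#${exposed_port}#g" "$2"
}

# Usage (file names from the question):
#   render_port Dockerfile input.txt
```

Because the substitution text is already expanded by the shell, sed never executes anything, which sidesteps the "port:: not found" error.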

Regex to match IP addresses but ignore localhost

I have a script that does something with the IPs allocated to my OS (GNU/Linux), which I get by running ifconfig. It works fine, but I was wondering whether I could filter out the loopback/localhost IP (127.0.0.1) in the same regular expression. [I assume every server in my cluster has that IP, and I don't need to do anything with it in my script.]
What my script uses is:
ifconfig | awk '/(([0-9]{1,3}\.){3})/ {print}' |sed -e "s/.*addr\://g" -e "s/\s.*//g"
I get results like:
> ifconfig | awk '/(([0-9]{1,3}\.){3})/ {print}' |sed -e "s/.*addr\://g" -e "s/\s.*//g"
172.16.0.1
127.0.0.1
I know it might be a stupid question, but could I filter out any IP that starts with 127 in my first regex?
I could try swapping awk for grep, something like:
> ifconfig |egrep -o "addr\:(([0-9]{1,3}\.){3}[0-9]{1,3})" |sed -e "s/.*addr\://g"
but if I add a negative lookahead (?!127) at the beginning, bash interprets !127 as a history expansion and just substitutes something from my history.
I could simply run another grep at the end of the one-liner, like grep -v "127.0.0.1", but I wanted to avoid grepping something that has already been grepped. Not that anything is wrong with that; I am just trying to learn a little more and be more efficient.
With only one grep without sed or awk:
# ip a|grep -oP "inet \K[0-9.]*(?=.*[^ ][^l][^o]$)"
192.168.1.31
172.16.5.31
You can add a rule that matches 127.0.0.1 and skips the line with next, as below. Awk then takes no further action on lines containing this pattern.
.. | awk '/127\.0\.0\.1/{next} /(([0-9]{1,3}\.){3})/{print}' | ..
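To drop everything in 127.0.0.0/8 rather than just 127.0.0.1, one option is to filter on the address field itself. The sketch below assumes the one-line-per-address output of `ip -o -4 addr show`, where the third field is "inet" and the fourth is address/prefix; the function name is made up for illustration.

```shell
#!/bin/bash
# non_loopback_ipv4 (hypothetical helper): read `ip -o -4 addr show` style
# lines on stdin and print each IPv4 address that does not start with 127.
non_loopback_ipv4() {
  awk '$3 == "inet" {
         split($4, a, "/")              # a[1] is the address without /prefix
         if (a[1] !~ /^127\./) print a[1]
       }'
}

# Usage:
#   ip -o -4 addr show | non_loopback_ipv4
```

This keeps the match and the exclusion in a single awk program, so no second grep is needed.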

Grep search for a string by pattern and then find part of this string inside of another file

I have log file where I look for strings like:
tail -n 1000 -f logfile.log | grep -i "host"
and then I receive strings like these:
host2 %host-DEREG: host c459.cf00.1105 is deregistered on E0/1:60.
Can I extract the MAC addresses from these lines and then search another file for lines that contain those addresses?
There is no macaddress in your example
# tail runs without -f here: a following tail never exits, so the search would never start
tail -n 1000 logfile.log | grep -i "host" | grep -oE "([a-f0-9]{4}\.){2}[a-f0-9]{4}" | grep -F -f - anotherfile

how to match hostname machine

Please advise how to match the following hostnames.
The machine hostname should follow this rule:
<a-z word><number><a-z character/s>
Real examples:
star1a
linux25as
machine2b
linux5a
solaris300C
unix9c
Please advise how to match these hostnames with grep.
For now I have this syntax:
hostname | grep -c '[a-z][1-2][a-z]'
but this syntax does not work on all my examples.
On Solaris, grep -E does not work:
hostname | grep -E '\b[a-z]+[0-9][a-z]+'
grep: illegal option -- E
Usage: grep -hblcnsviw pattern file . . .
Broken Pipe
Trying the second option (on the Solaris machine):
hostname
swu2a
hostname | grep "^[a-z]\+[0-9][a-z]\+$"
not matched!!!
I also tried this:
hostname
swu2a
hostname | grep '[a-z]\+[0-9]\+[a-zA-Z]\+'
NOT MATCHED!!!
Here is an awk using same regex as the grep posted here uses.
awk '/[a-z]+[0-9]+[a-zA-Z]+/'
star1a
linux25as
machine2b
linux5a
solaris300C
unix9c
If you need to make sure there is nothing else in the line, only the words above, use:
awk '/^[a-z]+[0-9]+[a-zA-Z]+$/'
^ marks start of line.
$ marks end of line.
You can use the following pattern:
grep '^[a-z]\+[0-9]\+[a-zA-Z]\+$'
Note that you can use the return value of grep to decide whether the pattern matches; you don't need the -c option. Like this:
if hostname | grep '^[a-z]\+[0-9]\+[a-zA-Z]\+$' >/dev/null 2>&1; then
echo "host name OK"
fi
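The \+ operator is a GNU extension to basic regular expressions, which may be why the patterns above did not match on Solaris. A portable sketch spells out "one or more" as xx*, which only needs plain POSIX basic regex syntax. The function name is made up for illustration.

```shell
#!/bin/sh
# is_valid_hostname NAME (hypothetical helper)
# Succeeds when NAME is <lowercase letters><digits><letters>, using only
# basic regular expression syntax (no \+ and no grep -E).
is_valid_hostname() {
  printf '%s\n' "$1" |
    grep '^[a-z][a-z]*[0-9][0-9]*[a-zA-Z][a-zA-Z]*$' >/dev/null 2>&1
}

# Usage:
#   if is_valid_hostname "$(hostname)"; then echo "host name OK"; fi
```

Redirecting the output instead of using -q keeps the sketch usable with older grep implementations that lack that option.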

Separate IPs From Ports using Shell Script?

I was just wondering how I would go about writing a shell script to separate proxy IPs from their ports.
The proxies are stored in this format
ip:port
ip:port
ip:port
How can I use a shell script to separate the IP on the left side of the colon from the Port on the right side, and put the IP and Port lists in separate .txt files with the same order? Is this even possible?
If the proxies are listed that way in a file, say proxy.txt, then all you need is cut:
cut -f1 -d: proxy.txt > proxy_ip.txt
cut -f2 -d: proxy.txt > proxy_port.txt
Try something like this:
#!/bin/bash
ips="1.2.3.4:123 2.3.4.5:356 4.5.6.7:576"
# or get IPs from stdin
# split them
ips_array=($ips)
for w in "${ips_array[@]}"
do
echo "$w" | sed -e 's/:.*$//' >> ips.txt
echo "$w" | sed -e 's/^.*://' >> ports.txt
done
Key is using the ($ips) to split the list up.
EDIT:
I just realized that the formatting of your question was off: the input is not a single line of IP:PORT pairs separated by spaces, but one pair per line. In that case you just need this:
#!/bin/bash
while IFS= read -r w
do
echo "$w" | sed -e 's/:.*$//' >> ips.txt
echo "$w" | sed -e 's/^.*://' >> ports.txt
done
And you read from stdin.
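The same loop can also be written without starting sed for every line, using shell parameter expansion. This is a minimal sketch; the function name and its two output-file arguments are made up for illustration.

```shell
#!/bin/sh
# split_proxies IP_FILE PORT_FILE (hypothetical helper)
# Reads ip:port lines on stdin and writes the two halves to separate files,
# preserving the input order.
split_proxies() {
  : > "$1"                              # start with empty output files
  : > "$2"
  while IFS= read -r entry; do
    [ -n "$entry" ] || continue         # skip blank lines
    printf '%s\n' "${entry%%:*}" >> "$1"   # text before the first colon
    printf '%s\n' "${entry##*:}" >> "$2"   # text after the last colon
  done
}

# Usage:
#   split_proxies proxy_ip.txt proxy_port.txt < proxy.txt
```

For large lists the cut commands shown earlier are likely still the simplest choice; the loop is mainly useful when each pair needs extra processing.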
