Find webserver listening on port with Scapy port scanner - python-3.x

I am trying to write a port scanner in Python with Scapy to find out on which port a webserver is listening. The server does not use port 80 or port 443. The range to be scanned is from 5000 to 10000 (this is an assignment for university). I need to use Scapy for this, so nmap and similar tools are not allowed.
The code I have written so far (it is an adaptation of this original work https://is.muni.cz/th/n9spk/dp.pdf):
target = "172.16.51.142"
ports = range(5000, 10000)
ip = IP(dst=target)
tcp = TCP(dport=ports , flags="S") # SYN flag
ans, unans = sr(ip/tcp) # send packets
for sent, rcvd in ans:
    if rcvd.haslayer(TCP): # TCP packet
        if rcvd.haslayer(TCP).flags & 2: # SYN/ACK flag
            print (sent.dport) # open ports
The first part until the for-loop works as intended:
But when the for-loop starts, I get the following error:
I don't know how to fix this problem.
I have used the online documentation https://scapy.readthedocs.io/en/latest/usage.html#send-and-receive-packets-sr and https://scapy.readthedocs.io/en/latest/usage.html#tcp-port-scanning but could not find a solution.

rcvd.haslayer(TCP).flags isn't possible as haslayer returns a Boolean.
You're looking for
rcvd.getlayer(TCP).flags
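A side note on the flag test itself: `flags & 2` only checks the SYN bit, while a SYN/ACK reply has both SYN (0x02) and ACK (0x10) set. A minimal sketch of the stricter check in plain Python, without Scapy; the helper name is_syn_ack is hypothetical and the integer passed in is assumed to be the flags value taken from the TCP layer:

```python
# TCP flag bit values as laid out in the TCP header.
SYN = 0x02
RST = 0x04
ACK = 0x10


def is_syn_ack(flags: int) -> bool:
    """Return True only when both SYN and ACK are set (reply from an open port)."""
    return (flags & (SYN | ACK)) == (SYN | ACK)


print(is_syn_ack(SYN | ACK))  # reply from an open port
print(is_syn_ack(RST | ACK))  # reply from a closed port
```

A closed port normally answers with RST/ACK, which this check rejects because the SYN bit is missing.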

Related

How can you ensure a viable endpoint for a stanza CoreNLPClient?

I would like to use the stanza CoreNLPClient to extract noun phrases, similar to this method.
However, I cannot seem to find a good port to start the server on. The default is 9000, but this is often occupied, as indicated by the error message:
PermanentlyFailedException: Error: unable to start the CoreNLP server
on port 9000 (possibly something is already running there)
EDIT: Port 9000 is in use by python.exe, which is why I can't just shut the process down to make space for the CoreNLPClient.
Then, when I select other ports such as 7999, 8000, or 8080, the server keeps listening indefinitely and never executes the subsequent lines of code, showing only the following:
2021-07-19 12:05:55 INFO: Starting server with command: java -Xmx8G -cp C:\Users\timjo\stanza_corenlp* edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 7998 -timeout 60000 -threads 5 -maxCharLength 100000 -quiet True -serverProperties corenlp_server-2e15724b8064491b.props -preload -outputFormat serialized
I have the latest version of stanza installed, and am running the following code from an .ipynb file in VS Code:
# sample sentence
sentence = "Albert Einstein was a German-born theoretical physicist."
# start the client as indicated in the docs
with CoreNLPClient(properties='corenlp_server-2e15724b8064491b.props', endpoint='https://localhost:7998', memory='8G', be_quiet=True) as client:
    matches = client.tregex(text=sentence, pattern = 'NP')
# extract the noun phrases and their indices
noun_phrases = [[text, begin, end] for text, begin, end in
                zip([sentence[match_id]['spanString'] for sentence in matches['sentences'] for match_id in sentence],
                    [sentence[match_id]['characterOffsetBegin'] for sentence in matches['sentences'] for match_id in sentence],
                    [sentence[match_id]['characterOffsetEnd'] for sentence in matches['sentences'] for match_id in sentence])]
Main question: How can I ensure that the server starts on an open port, and closes afterwards? I would prefer having a semi-automatic way to finding open / shutting down occupied ports for the client to run on.
In general it is sufficient to choose another number that nothing else is using, maybe 9017? There are lots of numbers to choose from! But the more careful choice would be to create the CoreNLPClient in a while loop with a try/except and to increment the port number until you find one that is open.
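A related way to pick the port up front, sketched here with only the standard library: probe candidate ports by trying to bind to them and take the first one that succeeds. This is a different technique from wrapping CoreNLPClient in try/except; the function name find_free_port and the 9000 to 9099 range are assumptions for illustration, and another process could still grab the port between the probe and the server start.

```python
import socket


def find_free_port(start: int = 9000, end: int = 9100, host: str = "127.0.0.1") -> int:
    """Return the first port in [start, end) that can currently be bound on host."""
    for port in range(start, end):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is in use, try the next one
            return port
    raise RuntimeError(f"no free port between {start} and {end - 1}")


port = find_free_port()
endpoint = f"http://localhost:{port}"
print(endpoint)
```

The resulting string could then be passed as the endpoint argument when creating the client.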
After 2 hours of working on this, I now know the following:
Taking port 9000 is not an option, given that it is used by Python. Informal evidence suggests this has something to do with using a Jupyter notebook as opposed to a 'regular' Python .py file.
Regarding the client not closing when using other endpoints: I should simply have used 'http://localhost:port' instead of 'https://...'.
Hopefully this can help someone else struggling with this problem. I guess this was my non-computer science background seeping through.
(edited to resolve typos)

AWS changes the port number to name

AWS automatically changes well-known port numbers to names.
For example, 554 becomes rtsp.
When I install iptables rules with the port number 554, it gets changed to rtsp. This causes a problem when searching, because my program passes 554 as a parameter.
How can I make sure that AWS doesn't change the number to a name?
In the picture we can see the dpt:rtsp, which actually should be dpt:554.
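Perhaps you're looking for iptables --list -n? The -n prints "numeric output of addresses and ports." The rule itself still stores 554; only the listing shows the service name, and that translation is done by iptables, not by AWS.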

Determine the source port of an IPv4 packet with perl

I have a perl script that reads and processes IPv4 packets from a TunTap interface. Stripped down a bit, it looks like this:
#!/usr/bin/perl
use warnings;
use strict;
use Common;
use Linux::TunTap;
use NetPacket::IP;
use IO::Socket;
$|++;
###### Predecs #####
my $tun;
my %config = Loadconfig();
$tun = Linux::TunTap->new(NAME => $config{'localtun_name'})
    or die "Couldn't connect to Interface $config{localtun_name}\n";
print "Interface up: " . $tun->{interface} . "\n";
while (my $rawdata = $tun->get_raw()) {
    $rawdata =~ s/^....//; # Strip the TunTap header
    my $packet = NetPacket::IP->decode($rawdata);
    print "$packet->{id} $packet->{src_ip} -> $packet->{dest_ip} $packet->{proto} $packet->{len}\n";
    # Do some processing here
}
For routing reasons, I need to know the source port of the data. I have not found a way of doing this with NetPacket::IP, so is there a different way of determining this? I am currently only using NetPacket::IP for debugging reasons, so I am not really set on that module in particular if a different module will allow me to extract the source port in addition to sequence number, size, source IP, and destination IP.
NetPacket::IP only deals with IP packets which have no concept of ports. Ports only come into play on the TCP/UDP (or whatever you have layered on top of IP) layer, so you need e.g. NetPacket::TCP to get this information. You'll probably have to look at $packet->{proto} to decide which module (TCP or UDP) you want to use for the layer4 parsing.
If you're sure you won't need additional header fields for which the higher-level NetPacket modules would make sense, you could exploit the fact that the source port is in the first 16 bits of the header both for TCP and UDP, so you could say
# Untested. NetPacket::IP reports proto as the numeric IP protocol
# number: 6 is TCP, 17 is UDP.
if ($packet->{proto} == 6 or $packet->{proto} == 17) {
    my $port = unpack('n', $packet->{data});
    ...
}
Edit: BTW, using substr() instead of a regexp substitution should be faster if that's a concern.
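For comparison, the same "first 16 bits of the layer-4 header" idea sketched in Python on a raw IPv4 packet. The function name source_port and the hand-built sample packet are hypothetical; the sketch assumes the TunTap header has already been stripped and skips non-first fragments, which carry no TCP/UDP header.

```python
import struct


def source_port(ip_packet: bytes):
    """Return the TCP/UDP source port of a raw IPv4 packet, or None."""
    if len(ip_packet) < 20:
        return None
    ihl = (ip_packet[0] & 0x0F) * 4            # IPv4 header length in bytes
    (frag,) = struct.unpack_from("!H", ip_packet, 6)
    if frag & 0x1FFF:                          # non-first fragment: no L4 header
        return None
    proto = ip_packet[9]                       # 6 = TCP, 17 = UDP
    if proto not in (6, 17) or len(ip_packet) < ihl + 2:
        return None
    (port,) = struct.unpack_from("!H", ip_packet, ihl)  # big-endian 16-bit value
    return port


# Hand-built sample: 20-byte IPv4 header (protocol 17 = UDP) plus a UDP header.
ip_header = bytes([0x45, 0, 0, 28, 0, 0, 0, 0, 64, 17, 0, 0,
                   10, 0, 0, 1, 10, 0, 0, 2])
udp_header = struct.pack("!HHHH", 57512, 53, 8, 0)
print(source_port(ip_header + udp_header))
```

The "!H" format plays the same role as Perl's unpack('n', ...): an unsigned 16-bit integer in network byte order.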

Understanding DNS in wireshark output

I used wireshark to collect data from some sites, and then used tcpdump to get it as a text file. For the project I'm working on, I want to count how many DNS resolutions are involved in accessing a particular website, and what the nature of the DNS responses was. The problem is I don't really understand the output from wireshark or how to interpret it to find what I'm looking for. For instance, here is a line:
21:08:05.454852 IP 10.0.0.2.57512 > ord08s09-in-f21.1e100.net.https:
Flags [.], seq 1:1419, ack 55, win 65535, options [nop,nop,TS val
1348891674 ecr 2473250009], length 1418
What do the different parts of this mean, and what will the data I'm looking for look like? I'm worried I might be using Wireshark incorrectly without knowing it.
I used wireshark to collect data from some sites, and then used tcpdump to get it as a text file.
Most people who use both tools use them for the opposite purposes. :-) I.e., they use tcpdump to capture traffic into a file and then read the file with Wireshark. If you're only using Wireshark to capture traffic, that's probably overkill - you can do the same thing with dumpcap or possibly even tcpdump.
The output you're showing is text output, so, if you "used tcpdump to get it as a text file", it's output from tcpdump, not from Wireshark; text output from Wireshark would look different. If you "used wireshark to collect data from some sites, and then used tcpdump to get it as a text file", the output from Wireshark is either a pcap file or a pcap-ng file, which is a binary file, and is completely uninterpreted raw data. The interpretation of the data in your example is being done by tcpdump, not Wireshark.
What the output is saying is:
"21:08:05.454852": the packet arrived at 21:08:05 and a fraction of a second, local time.
"IP": the packet is an IPv4 packet.
"10.0.0.2.57512 > ord08s09-in-f21.1e100.net.https": the packet is from IP address 10.0.0.2, port 57512, to the IP address for which the host name is "ord08s09-in-f21.1e100.net", and the port for "https", which is port 443.
See the tcpdump man page, and a description of TCP, for details on the rest of the line.
The key point here is that this is NOT DNS traffic! It's probably "HTTP-over-SSL", or "https", traffic.
In tcpdump, DNS traffic would look like
11:06:25.247272 IP 10.0.1.3.50953 > 10.0.1.1.domain: 7088+ A? www.kernel.org. (32)
11:06:25.282723 IP 10.0.1.1.domain > 10.0.1.3.50953: 7088 3/0/0 CNAME pub.us.kernel.org., A 149.20.4.69, A 198.145.20.140 (85)
or
11:06:30.622744 IP 10.0.1.3.62767 > 10.0.1.1.domain: 2439+ A? e3191.c.akamaiedge.net.0.1.cn.akamaiedge.net. (62)
11:06:30.639279 IP 10.0.1.1.domain > 10.0.1.3.62767: 2439 1/0/0 A 184.85.109.15 (78)
"A?" means that a query is being done for an A record; "CNAME" means that a CNAME record is being returned (i.e., "www.kernel.org" is an alias for "pub.us.kernel.org"), and "A" means that an A record is being returned, giving an IPv4 address.
In Wireshark or TShark, it would look like:
12.316361 10.0.1.3 -> 10.0.1.1 DNS Standard query 0xc2fa A 1.courier-sandbox-push-apple.com.akadns.net
12.332894 10.0.1.1 -> 10.0.1.3 DNS Standard query response 0xc2fa A 17.149.34.59 A 17.149.34.61 A 17.149.34.62 A 17.149.34.63 A 17.149.34.57
or
15.163941 10.0.1.3 -> 10.0.1.1 DNS Standard query 0x168c A www.gnu.org
15.176266 10.0.1.1 -> 10.0.1.3 DNS Standard query response 0x168c CNAME wildebeest.gnu.org A 208.118.235.148
If you're only trying to capture DNS packets, you should use a capture filter such as "port 53" or "port domain", so that non-DNS traffic will be discarded. That filter will work with Wireshark, TShark, or tcpdump (as they use the same libpcap code for packet capture).
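To count DNS resolutions from tcpdump's text output, one rough approach is to match the query and response line shapes shown above. The sketch below is an assumption-laden illustration, not a full parser: the regular expressions only cover lines of the form seen in this answer (port printed as "domain", or "53" when name resolution is turned off), and count_dns is a hypothetical helper name.

```python
import re

# Query lines end in "<id>[flags] <type>? <name>"; responses carry "a/n/x" counts.
QUERY = re.compile(r"> \S+\.(?:domain|53): (\d+)\S* (\w+)\? (\S+)")
RESPONSE = re.compile(r"IP \S+\.(?:domain|53) > \S+: (\d+)\S* (\d+)/(\d+)/(\d+)")


def count_dns(lines):
    """Return ([(record_type, name), ...], [answer_count, ...]) for DNS lines."""
    queries, responses = [], []
    for line in lines:
        match = QUERY.search(line)
        if match:
            queries.append((match.group(2), match.group(3)))
            continue
        match = RESPONSE.search(line)
        if match:
            responses.append(int(match.group(2)))  # number of answer records
    return queries, responses


sample = [
    "11:06:25.247272 IP 10.0.1.3.50953 > 10.0.1.1.domain: 7088+ A? www.kernel.org. (32)",
    "11:06:25.282723 IP 10.0.1.1.domain > 10.0.1.3.50953: 7088 3/0/0 CNAME pub.us.kernel.org., A 149.20.4.69, A 198.145.20.140 (85)",
    "21:08:05.454852 IP 10.0.0.2.57512 > ord08s09-in-f21.1e100.net.https: Flags [.], seq 1:1419, ack 55, win 65535, length 1418",
]
queries, responses = count_dns(sample)
print(len(queries), "queries,", len(responses), "responses")
```

The https line is ignored because it matches neither pattern, which is the behaviour wanted when the capture was not filtered to port 53.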

TCP connection, bash only

I found this line in a script. While I globally understand what it does--opening a bidirectional TCP connection--, I need some explanations on the syntax. Here's the line:
exec 5<>"/dev/tcp/${SERVER}/${PORT}"
And my questions:
< and > are usually used to redirect IOs. What does it mean there? Is it usable in another context? How?
Why does it work, while /dev/tcp doesn't exist?
Why 5? Can it be another number? What are the values allowed?
Why is exec necessary? (given nothing is actually executed)
Thanks.
< and > are usually used to redirect IOs. What does it mean there? Is it usable in another context? How?
It's the same: <> opens the file for both reading and writing, here on fd 5.
Why does it work, while /dev/tcp doesn't exist?
It's a special file: If host is a valid hostname or Internet address, and port is an integer port number or service name, bash attempts to open a TCP connection to the corresponding socket.
Why 5? Can it be another number? What are the values allowed?
Yes, it can be any value, but you need to ensure you don't use an fd already in use.
Why is exec necessary? (given nothing is actually executed)
exec means the redirection happens in the current shell, not within a subshell.
I can only answer for the exec part:
exec without a command given may be used to change I/O redirections. <> in this case means open for read+write. 5 is the channel number (or file descriptor). This makes sense if other commands send their output / read their input from channel 5.
For "/dev/tcp/${SERVER}/${PORT}" I don't know if it's a feature of a specific Linux version or if it's a bash feature (I assume the latter).
-- EDIT: from the bash manual page: --
Bash handles several filenames specially when they are used in
redirections, as described in the following table:

    /dev/fd/fd
        If fd is a valid integer, file descriptor fd is duplicated.
    /dev/stdin
        File descriptor 0 is duplicated.
    /dev/stdout
        File descriptor 1 is duplicated.
    /dev/stderr
        File descriptor 2 is duplicated.
    /dev/tcp/host/port
        If host is a valid hostname or Internet address, and port is
        an integer port number or service name, bash attempts to open
        a TCP connection to the corresponding socket.
    /dev/udp/host/port
        If host is a valid hostname or Internet address, and port is
        an integer port number or service name, bash attempts to open
        a UDP connection to the corresponding socket.
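For readers more used to sockets than to shell redirections, here is a rough Python analogue of what the exec 5<> line achieves: one TCP connection whose file descriptor is used for both reading and writing. It is only a sketch; open_tcp is a hypothetical helper, and the throwaway local echo server exists solely so the example runs without network access.

```python
import socket
import threading


def open_tcp(host: str, port: int) -> socket.socket:
    """Rough analogue of `exec 5<>/dev/tcp/host/port`: one bidirectional socket."""
    return socket.create_connection((host, port), timeout=5)


# Throwaway local server so the sketch is self-contained.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]


def echo_once():
    conn, _ = server.accept()
    with conn:
        conn.sendall(conn.recv(1024))  # send back whatever arrived


threading.Thread(target=echo_once, daemon=True).start()

conn = open_tcp("127.0.0.1", port)
fd = conn.fileno()                     # plays the role of "5" in the bash line
conn.sendall(b"hello\n")               # like: echo hello >&5
reply = conn.recv(1024)                # like: read line <&5
conn.close()                           # like: exec 5>&-
server.close()
print(fd, reply)
```

In bash the descriptor number is chosen by the script, whereas here the operating system assigns it when the socket is created.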

Resources