Minimum requirements for custom networking stack to send UDP packets? - linux

(edit: solved -- see below)
This is my situation:
TL-MR3020 -----ethernet----- mbed
OpenWRT                      C/C++ custom networking stack
192.168.2.1                  192.168.2.16
TL-MR3020 is a Linux embedded router
mbed is an ARM microcontroller.
On the network I just want them to exchange messages using UDP packets on port 2225. In particular, TL-MR3020 has to periodically send packets every second to 192.168.2.16:2225, while mbed has to periodically send packets every 50ms to 192.168.2.1:2225.
Everything worked until I removed the network stack library from the mbed (lwIP, not so lightweight for me) and wrote a new minimal stack.
My new stack sends 5 gratuitous ARP replies just after the Ethernet link comes up, then starts sending and receiving UDP packets.
Now the TL-MR3020 doesn't receive any UDP packets. In particular, with ifconfig I can see packets arriving, but my application can't get them.
Also, if I connect my laptop instead of the TL-MR3020, I can see the UDP packets arriving in Wireshark. Nothing looks wrong there; the only thing that fails is my application.
I have a node.js script that is supposed to receive the packets, but it receives nothing. If I send UDP packets locally (local to local), however, the script receives them.
I think my application is OK, because socat cannot receive the UDP packets either, using socat - UDP-LISTEN:2225.
I've already checked on the TL-MR3020:
the ARP table has the correct IP-MAC association
the destination MAC address matches the incoming interface
the destination IP address matches the incoming interface
IP checksum: Wireshark says good=false, bad=false
UDP checksum: Wireshark says good=false, bad=false
So, I'm asking... what are the minimum requirements for a custom networking stack to send UDP packets?
SOLVED:
You need a valid checksum in the IP header.
The UDP checksum, in my case, can be set to zero (over IPv4, zero means no checksum was computed).
tcpdump is very helpful (thanks to AndrewMcDonnell).
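For anyone writing a similar minimal stack: the IP header checksum is the standard Internet checksum (RFC 1071), i.e. the one's complement of the one's-complement sum of the header taken as 16-bit words, with the checksum field set to zero while computing. Below is a minimal sketch in C, not the code from my stack; the function name inet_checksum is just an illustrative choice.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071) over `len` bytes.
 * Hypothetical helper name. The checksum field inside `data`
 * must be zero while the checksum is being computed. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    /* Add the data as big-endian (network order) 16-bit words. */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];

    /* An odd trailing byte is padded with a zero byte on the right. */
    if (i < len)
        sum += (uint32_t)data[i] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    /* The checksum is the one's complement of the folded sum. */
    return (uint16_t)~sum;
}
```

For a 20-byte IPv4 header the result goes into bytes 10 and 11, high byte first (hdr[10] = csum >> 8; hdr[11] = csum & 0xFF). Running the same function over a header that already contains a correct checksum returns 0, which is a handy self-check before sending.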

Related

Raw ICMP socket interaction with internal stack windows/linux

Regarding the use of an ICMP raw socket, as in this example:
sd = socket(AF_INET, SOCK_RAW, IPPROTO_ICMP);
There are some important questions I couldn't find answered anywhere in the documentation. As far as I understand, the ICMP protocol is implemented at kernel level (on Windows, by the %SystemRoot%\System32\Drivers\Tcpip.sys driver).
So how does this kernel logic interact with a raw user-space socket that wants to send and receive ICMP packets, defined as in the example above?
Is the kernel's ICMP logic disabled once a raw socket is open, so that the OS gives the application full control of ICMP? Or do they work in parallel (inevitably creating a mess on the network)? Can I tell the OS exactly which ICMP packets I would like to handle?
Answers for both linux and windows are welcome.
By using the raw socket with IPPROTO_ICMP you only get copies of the ICMP packets which arrive at your host (see How to receive ICMP request in C with raw sockets). The ICMP-logic in the network stack is still alive and will handle ICMP-messages.
So you just need to pick out the ICMP packets you are interested in after you have received them (e.g. by the corresponding ID in the ICMP header). The receive buffer filled by recv() also contains the complete IP header.
Under Linux there is even a socket option (ICMP_FILTER) with which you can set a receive-filter for different ICMP packets.
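A minimal sketch of how such a filter might be set on Linux. The helper names icmp_pass_only and open_icmp_socket are made up for illustration, and the fallback definition of ICMP_FILTER assumes a libc whose <netinet/ip_icmp.h> does not provide it (the value mirrors <linux/icmp.h>). In the filter word, a set bit N means "drop ICMP type N", so only types below 32 can be selected this way.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/ip_icmp.h>
#include <unistd.h>

/* Fallback for libcs that do not expose the Linux definitions;
 * the value mirrors <linux/icmp.h>. */
#ifndef ICMP_FILTER
#define ICMP_FILTER 1
struct icmp_filter {
    uint32_t data;
};
#endif

/* Build a filter word that lets only one ICMP type through.
 * Bit N set = drop type N, so clear just the wanted bit.
 * `type` must be below 32. Hypothetical helper. */
uint32_t icmp_pass_only(unsigned type)
{
    return ~(UINT32_C(1) << type);
}

/* Open a raw ICMP socket that only delivers the given type.
 * Returns the descriptor, or -1 on error. Creating the socket
 * normally needs root or CAP_NET_RAW. Hypothetical helper. */
int open_icmp_socket(unsigned type)
{
    struct icmp_filter filt;
    int sd = socket(AF_INET, SOCK_RAW, IPPROTO_ICMP);

    if (sd < 0)
        return -1;

    filt.data = icmp_pass_only(type);
    if (setsockopt(sd, SOL_RAW, ICMP_FILTER, &filt, sizeof(filt)) < 0) {
        close(sd);
        return -1;
    }
    return sd;
}
```

Calling open_icmp_socket(ICMP_ECHOREPLY) would give a socket that only sees echo replies. The kernel's own ICMP handling keeps running either way; the filter only limits which copies reach the socket.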

how to send/inject packet into local network interface (linux)

I am working on a C program on Linux (kernel 2.6.18). I need to send/inject IP packets (e.g., over a socket) into my Linux system, but make that same system "think" these packets are arriving from another host. I create a datalink socket and use a faked source MAC/IP for the packets sent over this socket. The destination MAC/IP are set to those of my local Linux machine. However, whether I send these packets from a user-space program or from a kernel module, my local Linux doesn't treat them as coming from outside. For example, if I create a datalink socket to send an ICMP request destined for my local Linux, I expect it to treat the request as coming from outside and respond with an ICMP reply, but it does not. (However, with the same program I can send a faked ICMP request to another host, and that host does respond with an ICMP reply.)
I did some research on this topic online, and it seems all related solutions suggest using TAP. But as this VirtualBox article says:
... TAP is no longer necessary on Linux with bridged networking, ...
I am very interested to know how this is possible. Thanks.

How do I prevent Linux kernel from responding to incoming TCP packets?

For my application, I need to intercept certain TCP/IP packets and route them to a different device over a custom communications link (not Ethernet). I need all the TCP control packets and the full headers. I have figured out how to obtain these using a raw socket via socket(PF_PACKET, SOCK_RAW, htons(ETH_P_IP)). This works well and allows me to attach filters so that I see only the TCP port I'm interested in.
However, Linux also sees these packets. By default, it sends a RST when it receives a packet to a TCP port number it doesn't know about. That's no good as I plan to send back a response myself later. If I open up a second "normal" socket on that same port using socket(PF_INET, SOCK_STREAM, 0); and listen() on it, Linux then sends ACK to incoming TCP packets. Neither of these options is what I want. I want it to do nothing with these packets so I can handle everything myself. How can I accomplish this?
I would like to do the same thing. My reason is security: I want to build a tarpit application. I intend to forward TCP traffic from certain source IPs to the tarpit. The tarpit must receive the ACK. It will reply with a SYN/ACK of its own. I do not want the kernel to respond. Hence, a raw socket will not work (because the supplied TCP packets are teed); I also need to implement a divert socket. That's about all I know so far; I have not implemented it yet.

how is TCP's checksum calculated when we use tcpdump to capture packets which we send out

I am trying to generate a series of packets to simulate the TCP 3-way handshake. My first step was to capture the packets of a real connection and re-send the same packets from the same machine, but it didn't work at first.
Eventually I found that the packet I captured with tcpdump is not exactly what my computer sent out: the TCP checksum field is different. This led me to think that I can establish a TCP connection even if the TCP checksum is incorrect.
So my question is: how is the checksum field calculated? Is it modified by tcpdump or by the hardware? Why is it changed? Is it a bug in tcpdump, or is the calculation simply omitted?
The following is the screenshot I captured from my host machine and a virtual machine; you can see that the same packet captured on the different machines is identical except for the TCP checksum.
The small window is my virtual machine. I used the command "ssh 10.82.25.138" from the host to generate these packets.
What you are seeing may be the result of checksum offloading. To quote from the Wireshark wiki (http://wiki.wireshark.org/CaptureSetup/Offloading):
Most modern operating systems support some form of network offloading,
where some network processing happens on the NIC instead of the CPU.
Normally this is a great thing. It can free up resources on the rest
of the system and let it handle more connections. If you're trying to
capture traffic it can result in false errors and strange or even
missing traffic.
On systems that support checksum offloading, IP, TCP, and UDP
checksums are calculated on the NIC just before they're transmitted on
the wire. In Wireshark these show up as outgoing packets marked black
with red Text and the note [incorrect, should be xxxx (maybe caused by
"TCP checksum offload"?)].
Wireshark captures packets before they are sent to the network
adapter. It won't see the correct checksum because it has not been
calculated yet. Even worse, most OSes don't bother initialize this
data so you're probably seeing little chunks of memory that you
shouldn't.
Although this was written about Wireshark, the same principle applies to tcpdump. On your host machine, you see the wrong checksum because it just hasn't been filled in yet. It looks right on the guest because the checksum is filled in before the packet goes out on the "wire". Try disabling checksum offloading on the interface that handles this traffic, e.g.:
ethtool -K eth0 rx off tx off
if it's eth0.
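As for how the field is calculated: the TCP checksum is the same one's-complement Internet checksum used by IP, but it covers a pseudo-header (source address, destination address, a zero byte, the protocol number 6 and the TCP length) followed by the TCP header and payload, with the checksum field zeroed during the computation. A rough sketch in C for IPv4; the function names are illustrative and this is not what tcpdump or the kernel literally runs.

```c
#include <stddef.h>
#include <stdint.h>

/* Add `len` bytes to a running sum as big-endian 16-bit words;
 * an odd trailing byte is padded with zero. */
static uint32_t sum_words(uint32_t sum, const uint8_t *p, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (i < len)
        sum += (uint32_t)p[i] << 8;
    return sum;
}

/* TCP checksum over the IPv4 pseudo-header plus the TCP segment
 * (header and payload). Hypothetical helper: the checksum field
 * (bytes 16 and 17 of the segment) must be zero when computing. */
uint16_t tcp_checksum_ipv4(const uint8_t src[4], const uint8_t dst[4],
                           const uint8_t *seg, size_t seg_len)
{
    uint32_t sum = 0;

    sum = sum_words(sum, src, 4);        /* pseudo-header: source IP */
    sum = sum_words(sum, dst, 4);        /* pseudo-header: destination IP */
    sum += 6;                            /* zero byte + protocol (TCP = 6) */
    sum += (uint32_t)seg_len;            /* TCP length, header + payload */
    sum = sum_words(sum, seg, seg_len);  /* the segment itself */

    /* Fold the carries and take the one's complement. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}
```

Running this over a captured segment with its checksum field left in place should return 0 if the checksum on the wire was valid. On a sending host with offloading enabled it typically will not, which matches what the capture shows.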

howto make locally terminated tcp connections go through prerouting and postrouting?

I am developing an application that filters and mangles packets using netfilter queues. It's rather complicated and needs to perform well, so I would like to automate some rigorous testing. To do this I need to be able to route some TCP connections through my system; however, I don't want to rely on two other machines to act as client and server. I would prefer to run a local client that sends data and a local server that checks the mangled result.
The problem is that my application needs to intercept packets at the PREROUTING stage and so packets generated by the local client can't just be routed to the loopback interface.
So I need some way to inject packets before the prerouting stage and intercept them back after postrouting. If I could somehow use stream sockets to send and receive the data that would be great!
The most straightforward way I can think of doing this is to use a tun device. The tun device allows you to inject packets from userspace that appear to arrive through the tun interface. You could either write code to create and manipulate the tun interface yourself, or you can make use of an application like OpenVPN that already does this. With OpenVPN it would be easy: no special raw sockets or anything: you just send it IP packets encapsulated in UDP and it will make them arrive through a tun interface.
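If you go the route of creating the tun interface yourself, the usual pattern on Linux is to open /dev/net/tun and request an interface with the TUNSETIFF ioctl. A minimal sketch, assuming a Linux system with the tun driver available and enough privileges (typically root or CAP_NET_ADMIN); the helper names tun_fill_ifreq and tun_open are made up for this example.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Fill an ifreq asking for a TUN (layer 3, raw IP) device without
 * the extra packet-information header. Hypothetical helper. */
void tun_fill_ifreq(struct ifreq *ifr, const char *name)
{
    memset(ifr, 0, sizeof(*ifr));
    ifr->ifr_flags = IFF_TUN | IFF_NO_PI;
    /* The struct was zeroed, so the name stays NUL-terminated. */
    strncpy(ifr->ifr_name, name, IFNAMSIZ - 1);
}

/* Create or attach to the tun interface `name`.
 * Returns a file descriptor, or -1 on error. Hypothetical helper. */
int tun_open(const char *name)
{
    struct ifreq ifr;
    int fd = open("/dev/net/tun", O_RDWR);

    if (fd < 0)
        return -1;

    tun_fill_ifreq(&ifr, name);
    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Each write() of a complete IP packet to the returned descriptor makes that packet appear to arrive on the tun interface, and each read() returns one packet that the kernel routed out through it. The interface still has to be brought up and given an address separately.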
I've been thinking a bit about this, and with tun devices my client and server test applications should be able to use plain Linux sockets. I will explain how this can work by describing the path of a packet sent by the test client.
Prerequisites:
a) Two tun devices, each providing access to a distinct subnetwork
b) The routing table is set up to route traffic to the correct tun device
1) the client sends a packet to an address in the tun1 subnetwork
2) the app attached to tun1 (tun1app) will translate the dst address of the packet to an address in tun2 subnetwork and the source address to an address in the tun1 subnetwork different from the address of the tun1 interface
3) tun1app will send the modified packet back out
4) after routing tun2app will receive the packet and translate the destination address to the tun2 interface and the source address to an address in the tun2 network different from the interface address
5) tun2app will send it back out and the server will receive the packet assuming the destination port is the one the server is listening on
Packets from the server will follow the inverse path.
This seems like the core idea of a very useful tool. Does anyone know of a tool that is able to do this?
All connections from and to localhost itself do go through PREROUTING and POSTROUTING. Anyone who says otherwise is mistaken. (You can verify that with ip6tables -t raw -I OUTPUT -j TRACE; you will see that a packet passes through OUTPUT-POSTROUTING-PREROUTING-INPUT when, for example, you ping6 ::1.)
