Linux raw socket (layer 2) - best strategy

The context:
I am thinking about the best way to process packets from the NIC up to my apps.
I have 4 processes running and receiving packets from an Ethernet NIC.
They use PF_PACKET sockets, so they receive layer 2 packets.
The problem is that each of them has to filter all the packets it sees.
There are no race conditions, since the filtering is done by port: each app is interested in one unique port.
The question:
Is there a way to avoid making each app filter every packet? Having one core do the filtering and hand each packet to the right app incurs context-switch costs.
Is it possible for a NIC to put the packets matching a custom port into a dedicated RX queue? That way my app would be sure that those packets are exclusively for it.
What is the best way?

If you do not want to use BPF and libpcap, perhaps you can use Linux Socket Filters: https://www.kernel.org/doc/Documentation/networking/filter.txt
This will filter the packets in kernel space, before they are handed to your packet sockets.
For some syntax examples, see the BSD BPF man page (https://www.freebsd.org/cgi/man.cgi?query=bpf&sektion=4) or search with Google/DuckDuckGo.
But I also suggest that, if your application is performance-critical, you prototype and measure the different alternatives before discarding any one of them in particular (like libpcap).
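As a concrete illustration, here is a minimal sketch of attaching a classic BPF program to a PF_PACKET socket with SO_ATTACH_FILTER, so the kernel delivers only IPv4/UDP packets for one destination port. Port 1234 is an arbitrary example, the filter is IPv4-only and drops fragments, and in practice tcpdump -dd 'udp dst port 1234' will generate equivalent (and more complete) bytecode for you. Needs root or CAP_NET_RAW.

    /* Sketch: kernel-side filtering on a packet socket via SO_ATTACH_FILTER.
     * Each app attaches its own filter for its own port (1234 here is an
     * assumed example), so it never sees the other apps' packets. */
    #include <stdio.h>
    #include <sys/socket.h>
    #include <arpa/inet.h>
    #include <linux/if_ether.h>
    #include <linux/filter.h>

    int main(void)
    {
        struct sock_filter code[] = {
            { 0x28, 0, 0, 12 },         /* ldh [12]          ; EtherType       */
            { 0x15, 0, 8, 0x0800 },     /* jeq #0x0800       ; IPv4? else drop */
            { 0x30, 0, 0, 23 },         /* ldb [23]          ; IP protocol     */
            { 0x15, 0, 6, 17 },         /* jeq #17           ; UDP? else drop  */
            { 0x28, 0, 0, 20 },         /* ldh [20]          ; flags+frag off  */
            { 0x45, 4, 0, 0x1fff },     /* jset #0x1fff      ; drop fragments  */
            { 0xb1, 0, 0, 14 },         /* ldxb 4*([14]&0xf) ; X = IP hdr len  */
            { 0x48, 0, 0, 16 },         /* ldh [x+16]        ; UDP dst port    */
            { 0x15, 0, 1, 1234 },       /* jeq #1234         ; our port?       */
            { 0x06, 0, 0, 0x00040000 }, /* ret #262144       ; accept          */
            { 0x06, 0, 0, 0 },          /* ret #0            ; drop            */
        };
        struct sock_fprog prog = {
            .len    = sizeof(code) / sizeof(code[0]),
            .filter = code,
        };

        int sock = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
        if (sock < 0) { perror("socket"); return 1; }

        /* From here on, the kernel filters before copying to this socket. */
        if (setsockopt(sock, SOL_SOCKET, SO_ATTACH_FILTER,
                       &prog, sizeof(prog)) < 0) {
            perror("setsockopt(SO_ATTACH_FILTER)");
            return 1;
        }

        char buf[2048];
        ssize_t n = recv(sock, buf, sizeof(buf), 0);
        printf("got %zd bytes (already filtered in kernel)\n", n);
        return 0;
    }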

Related

Altering packet content of forwarded packets with nft or iptables using queues

I need to create a moderately large application that changes the content of forwarded packets quite drastically. I was wondering whether I could alter the content of a packet that is intended for routing (performing a kind of man-in-the-middle) using a userspace application based around something like the queues from nft or iptables.
All that I've seen in the documentation revolves around accepting or dropping the packet, not altering its content. I've also read somewhere that the library in charge of the queues only copies the packets from kernel space and thus leaves me unable to alter them, but I was wondering whether I was missing something, or whether there was a known hack for doing something of the sort.
I'd really appreciate your input. Thanks a bunch.
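For what it's worth, libnetfilter_queue does let the verdict carry a replacement buffer: passing modified bytes and their length to nfq_set_verdict() reinjects that copy instead of the original packet. A minimal sketch, assuming queue number 0 and an iptables rule such as iptables -A FORWARD -j NFQUEUE --queue-num 0 (error handling and checksum fixing omitted; link with -lnetfilter_queue):

    /* Sketch: mangle queued packets in userspace with libnetfilter_queue.
     * After editing IP/UDP/TCP payload, real code must recompute checksums. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <arpa/inet.h>
    #include <libnetfilter_queue/libnetfilter_queue.h>

    static int cb(struct nfq_q_handle *qh, struct nfgenmsg *nfmsg,
                  struct nfq_data *nfa, void *data)
    {
        struct nfqnl_msg_packet_hdr *ph = nfq_get_msg_packet_hdr(nfa);
        uint32_t id = ntohl(ph->packet_id);

        unsigned char *payload;
        int len = nfq_get_payload(nfa, &payload);
        if (len < 0)
            return nfq_set_verdict(qh, id, NF_ACCEPT, 0, NULL);

        /* Mutate a private copy of the packet bytes... */
        unsigned char copy[0xffff];
        memcpy(copy, payload, len);
        /* ... edit copy[] here, then fix the checksums ... */

        /* ...and reinject the modified bytes instead of the original. */
        return nfq_set_verdict(qh, id, NF_ACCEPT, len, copy);
    }

    int main(void)
    {
        struct nfq_handle *h = nfq_open();
        struct nfq_q_handle *qh = nfq_create_queue(h, 0, &cb, NULL);
        nfq_set_mode(qh, NFQNL_COPY_PACKET, 0xffff);

        char buf[0x10000];
        int fd = nfq_fd(h);
        for (;;) {
            int n = recv(fd, buf, sizeof(buf), 0);
            if (n <= 0)
                break;
            nfq_handle_packet(h, buf, n);
        }
        nfq_destroy_queue(qh);
        nfq_close(h);
        return 0;
    }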

Advantages of multiple udp sockets

Hi, this question is from a test I had recently:
(the code is of a server using one thread for read actions and N threads for writes, where N is the number of write actions that need to be done right now)
Will using multiple UDP sockets (one for each client) over a single one (one for all of them) have any advantages?
The official answer:
No, because the server is using one thread for read/write per client, which won't make it more efficient (students who addressed buffer overflow got full points).
My question is: under any circumstances, will changing a single UDP socket to many have an efficiency impact?
Thanks
Maybe yes, if you are using BSD systems (OS X, FreeBSD, etc.). BSD systems do not behave well when sending a lot of UDP packets on the same socket: if the send queue of the socket is full, the packets get dropped, so writes on UDP sockets never block on BSD systems. When you use 10,000 UDP sockets sending to the same address, no packets will be dropped on the sending host. So there is definitely an advantage, since some OSes do crazy queue management.
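A quick way to see the drop behavior described above: on a BSD-family system, the send fails (typically with ENOBUFS) instead of blocking, and checking errno makes the drop visible. A small sketch, with a placeholder destination address and port:

    /* Sketch: hammer one UDP socket and count sends the kernel refuses.
     * 192.0.2.1:9999 is a placeholder (TEST-NET-1), not a real server. */
    #include <stdio.h>
    #include <string.h>
    #include <errno.h>
    #include <sys/socket.h>
    #include <arpa/inet.h>

    int main(void)
    {
        int s = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in dst = { .sin_family = AF_INET,
                                   .sin_port   = htons(9999) };
        inet_pton(AF_INET, "192.0.2.1", &dst.sin_addr);

        char payload[1024] = {0};
        for (int i = 0; i < 100000; i++) {
            if (sendto(s, payload, sizeof(payload), 0,
                       (struct sockaddr *)&dst, sizeof(dst)) < 0) {
                if (errno == ENOBUFS) {
                    /* Send queue full: the datagram is dropped, we never block. */
                    fprintf(stderr, "drop at %d: %s\n", i, strerror(errno));
                    continue;
                }
                perror("sendto");
                break;
            }
        }
        return 0;
    }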

Process distinction from packets

I captured all packets from a PC with an NDIS driver and the Pcap library.
Can I distinguish which process each packet belongs to, and group the packets by process?
Or should I hook the recv/send functions of every process?
By the time the packets have hit the NDIS layer, the higher-layer metadata about who sent the packets is gone. (If you try to get the current process anyway, you'll find the current process ID is often wrong. NDIS sends traffic in arbitrary process context, not the sender's original context.)
The preferred way to do this in Windows is to develop a WFP callout. WFP callouts are given the packet, sending process, user identity, and other metadata.
Microsoft discourages you from hooking functions. Even LSPs are discouraged, and the OS will not run your LSP in all cases (e.g., store applications).
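For a rough idea of what that metadata looks like, here is a hypothetical fragment of a WFP callout's classify routine (kernel-mode). Callout registration, DriverEntry, and the WFP filter that invokes it are omitted, so this is an illustration of the API shape, not a complete driver:

    /* Fragment: a WFP callout classify routine receives per-packet metadata,
     * including the sending process ID, valid even though the routine does
     * not run in the sender's process context. */
    #include <ntddk.h>
    #include <fwpsk.h>

    VOID NTAPI
    MyClassifyFn(
        const FWPS_INCOMING_VALUES0 *inFixedValues,
        const FWPS_INCOMING_METADATA_VALUES0 *inMetaValues,
        VOID *layerData,
        const FWPS_FILTER0 *filter,
        UINT64 flowContext,
        FWPS_CLASSIFY_OUT0 *classifyOut)
    {
        UNREFERENCED_PARAMETER(inFixedValues);
        UNREFERENCED_PARAMETER(layerData);
        UNREFERENCED_PARAMETER(filter);
        UNREFERENCED_PARAMETER(flowContext);

        /* The sender's PID is delivered as metadata when available. */
        if (FWPS_IS_METADATA_FIELD_PRESENT(inMetaValues,
                                           FWPS_METADATA_FIELD_PROCESS_ID)) {
            UINT64 pid = inMetaValues->processId;
            DbgPrint("packet from PID %llu\n", pid);
        }

        classifyOut->actionType = FWP_ACTION_PERMIT; /* let the packet through */
    }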

Sending Data over network inside kernel

I'm writing a driver in the Linux kernel that sends data over the network. Now suppose that the data to be sent (the buffer) is in kernel space. How do I send the data without creating a socket (and, first of all, is that a good idea at all)? I'm looking for performance in the code rather than easy coding. And how do I design the receiver end? Without a socket connection, can I get and view the data on the receiver end (and how)? And will all of this change (including the performance) if the buffer is in user space (I'll do a copy_from_user if it does)?
If you are looking to send data on the network without sockets, you'd need to hook into the network drivers, send raw packets through them, and filter their incoming packets for those you want to hijack. I don't think the performance benefit will be large enough to warrant this.
I don't even think there are normal hooks for this in the network drivers. I did something related in the past to implement a firewall; you could conceivably use the netfilter hooks to do something similar in order to attach to the receive side of the network drivers.
You should probably use netlink, and if you want to really communicate with a distant host (e.g., through TCP/IPv6), use a user-level proxy application for that (so the kernel module uses netlink to talk to your proxy application, which could then use TCP, or even go through ssh or HTTP, to send the data remotely, or store it on disk).
I don't think that having a kernel module directly talk to a distant host makes sense otherwise (e.g., security issues, filtering, routing, iptables, ...).
And the real bottleneck is almost always the (physical) network itself: a 1 Gbit Ethernet link is almost always much slower than what a kernel module, or an application, can sustainably produce (and there are also latency issues).
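To make the netlink suggestion concrete, here is a minimal sketch of the kernel side: a module that replies over netlink to whichever user-space process messages it first. The protocol number 31 is an arbitrary assumption (any unused number works; a real design would likely use generic netlink), and error handling is pared down:

    /* Sketch: kernel module handing a kernel-space buffer to a user-space
     * proxy over netlink, as suggested above. */
    #include <linux/module.h>
    #include <linux/string.h>
    #include <linux/netlink.h>
    #include <net/netlink.h>
    #include <net/sock.h>

    #define NETLINK_TEST 31   /* assumed free protocol number */

    static struct sock *nl_sk;

    static void nl_recv_msg(struct sk_buff *skb)
    {
        struct nlmsghdr *nlh = nlmsg_hdr(skb);
        u32 pid = nlh->nlmsg_pid;            /* portid of the user-space sender */
        const char *reply = "data from kernel space";
        int len = strlen(reply) + 1;
        struct sk_buff *out;

        out = nlmsg_new(len, GFP_KERNEL);
        if (!out)
            return;
        nlh = nlmsg_put(out, 0, 0, NLMSG_DONE, len, 0);
        memcpy(nlmsg_data(nlh), reply, len);

        nlmsg_unicast(nl_sk, out, pid);      /* deliver the buffer upward */
    }

    static int __init nl_init(void)
    {
        struct netlink_kernel_cfg cfg = { .input = nl_recv_msg };

        nl_sk = netlink_kernel_create(&init_net, NETLINK_TEST, &cfg);
        return nl_sk ? 0 : -ENOMEM;
    }

    static void __exit nl_exit(void)
    {
        netlink_kernel_release(nl_sk);
    }

    module_init(nl_init);
    module_exit(nl_exit);
    MODULE_LICENSE("GPL");

The user-space proxy then opens a socket(AF_NETLINK, SOCK_RAW, 31), sends one message to register its portid, and forwards whatever it receives over an ordinary TCP connection.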

Epoll for many short-lived UDP requests

I want to build an application which handles many UDP connections to different servers simultaneously and without blocking. Each connection should just send one short message (around 80 bytes including the UDP header) and receive one packet; it is not intended to do more than these two operations. So my first goal is to achieve as many requests per second as possible. I thought of using epoll() to realize my idea, but when I read that epoll is primarily meant for servers with long-lived client-server sessions, I was unsure whether it would make sense in my case. What is the best technique? (I use Linux.)
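For reference, epoll itself has no preference for long-lived sessions; it just reports readiness on whatever descriptors you register. A minimal sketch of the send-one/receive-one pattern described above, assuming a single placeholder server at 192.0.2.1:9999 and 64 sockets (both arbitrary):

    /* Sketch: fire one datagram per non-blocking socket, then let epoll
     * collect the replies as they become readable. */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <sys/epoll.h>
    #include <arpa/inet.h>

    #define NSOCKS 64

    int main(void)
    {
        int ep = epoll_create1(0);
        struct sockaddr_in dst = { .sin_family = AF_INET,
                                   .sin_port   = htons(9999) };
        inet_pton(AF_INET, "192.0.2.1", &dst.sin_addr); /* placeholder server */

        for (int i = 0; i < NSOCKS; i++) {
            int s = socket(AF_INET, SOCK_DGRAM | SOCK_NONBLOCK, 0);
            /* Send the single ~80-byte request up front... */
            char req[72] = "request";
            sendto(s, req, sizeof(req), 0,
                   (struct sockaddr *)&dst, sizeof(dst));
            /* ...then ask epoll to tell us when the reply arrives. */
            struct epoll_event ev = { .events = EPOLLIN, .data.fd = s };
            epoll_ctl(ep, EPOLL_CTL_ADD, s, &ev);
        }

        int pending = NSOCKS;
        while (pending > 0) {
            struct epoll_event evs[NSOCKS];
            int n = epoll_wait(ep, evs, NSOCKS, 1000 /* ms timeout */);
            if (n <= 0)
                break;              /* timeout or error: give up on the rest */
            for (int i = 0; i < n; i++) {
                char buf[1500];
                recv(evs[i].data.fd, buf, sizeof(buf), 0);
                epoll_ctl(ep, EPOLL_CTL_DEL, evs[i].data.fd, NULL);
                close(evs[i].data.fd);
                pending--;
            }
        }
        close(ep);
        return 0;
    }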
