What is the best way to process DNS packets with DPDK?

Edit: I am using DPDK 21.11.2 (LTS).
I'm working on implementing a DPDK-based DNS server. I have researched and studied both concepts. My understanding is that by bypassing the kernel with DPDK we lose all of the kernel's protocol processing, and DPDK itself stops at Layer 2, since the NIC we bind to it loses its IP address. To implement a Layer 3 protocol, is l3fwd the best option?
I read up on KNI as well, since it offers a simple command to assign an IP address, but it is no longer available in the latest versions.
Basically, what I want to do is send a DNS packet to the DPDK NIC (from another VM). The DPDK VM then processes it and sends an answer packet back through the same NIC. What is the least complicated way to do this?
I have only tried basic DPDK examples like basicfwd and l2fwd where simple packets are sent and received by the same VM.

Related

Ethernet frames from NIC

I'm looking for help and advice on a network project I've been working on lately. It requires a Linux machine to act as a passive network appliance.
Network packets come in through one network interface and go out through another (net --eth0--> Linux PC --eth1--> net) without any modification of the data.
The application that will run on the Linux system only changes the order of the packets. It is going to be a "silly" network emulator application.
The first implementation used RAW sockets: read() is called every time a packet arrives in user space, and write() is called when an Ethernet frame should be sent down to the NIC.
I would like to know if there is a more practical and direct way than RAW sockets, bypassing Linux's network stack.
If what you want is to bypass the kernel, DPDK in Linux and NetMap in FreeBSD are options to do just that.
Indeed this can be done with DPDK on Linux. There are l3fwd and l2fwd sample applications in the examples folder of the DPDK tree, which may inspire you. Also consider using VPP, an fd.io project hosted by the Linux Foundation, which can use DPDK.
Rami Rosen
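The l2fwd/basicfwd samples boil down to a tight receive/transmit loop. Below is a minimal sketch of that loop in C, assuming two ports have already been configured and started; EAL initialization, mempool creation and queue setup are omitted.

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define BURST_SIZE 32

    /* Forward every frame received on port 0 out of port 1 and vice versa.
     * Assumes both ports were already set up with rte_eth_dev_configure(),
     * rte_eth_rx_queue_setup(), rte_eth_tx_queue_setup() and
     * rte_eth_dev_start(). */
    static void forward_loop(void)
    {
        struct rte_mbuf *bufs[BURST_SIZE];

        for (;;) {
            for (uint16_t port = 0; port < 2; port++) {
                const uint16_t dst = port ^ 1;   /* the "other" port */

                uint16_t nb_rx = rte_eth_rx_burst(port, 0, bufs, BURST_SIZE);
                if (nb_rx == 0)
                    continue;

                uint16_t nb_tx = rte_eth_tx_burst(dst, 0, bufs, nb_rx);

                /* Free anything the TX queue could not take. */
                for (uint16_t i = nb_tx; i < nb_rx; i++)
                    rte_pktmbuf_free(bufs[i]);
            }
        }
    }

Reordering packets, as the question asks, would simply mean buffering the mbufs between the rx and tx calls instead of transmitting them immediately.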

Embedded Linux on Zynq 7000, dropping almost all UDP packets

I am working with the Xilinx distribution of Linux on a Zynq 7000 board. This has two ARM processors, some L2 cache, a DRAM interface, and a large amount of FPGA fabric. Our appliance collects data being processed by the FPGA and then sends it over the gigabit network to other systems.
One of the services we need to support on this appliance is SNMP, which relies on UDP datagrams, and although SNMP does have TCP support, we cannot force the clients to use that.
What I am finding is that this system is losing almost all SNMP requests.
It is important to note that neither the network nor the CPUs are overloaded. The data rate isn't particularly high, and the CPUs usually sit at around 30% load. We're also using the SNMP++ and Agent++ libraries for SNMP, so we have control over that code; it's not a case of a system daemon misbehaving. If we stop the processing and network activity, SNMP requests are not lost. SNMP is handled in its own thread, and we've kept requests rare and spread out, so there really should be no more than one request buffered at any one time. With the low CPU load, there should be no problem context-switching to the receiving process to service the request.
Since it's not a CPU or ethernet bandwidth problem, my best guess is that the problem lies in the Linux kernel. Despite the low network load, I'm guessing that there are limited network stack buffers being overfilled, and this is why it's dropping UDP datagrams.
When googling this, I find examples of how to use netstat to report lost packets, but that doesn't seem to work on this system, because there is no "-s" option. How can I monitor these packet drops? How can I diagnose the cause? How can I tune kernel parameters to minimize this loss?
Thanks!
Wireshark or tcpdump is a good approach.
You may want to take a look at the settings in /proc/sys/net/ipv4/ or try an older kernel (3.x instead of 4.x). We had an issue with TCP connections on the Zynq with the 4.4 kernel, but that one showed up in the system logs (a warning about SYN cookies and possible flooding).
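Since this netstat apparently lacks the -s option, the same counters it would print can be read straight from /proc/net/snmp; the Udp: lines there include InErrors and RcvbufErrors, the latter counting datagrams dropped because a socket receive buffer was full. A minimal reader, as a sketch:

    #include <stdio.h>
    #include <string.h>

    /* Print the UDP counter lines from /proc/net/snmp.  The second "Udp:"
     * line carries the values; RcvbufErrors counts datagrams dropped
     * because the socket receive buffer was full. */
    int main(void)
    {
        char line[512];
        FILE *f = fopen("/proc/net/snmp", "r");
        if (!f) {
            perror("fopen /proc/net/snmp");
            return 1;
        }
        while (fgets(line, sizeof(line), f)) {
            if (strncmp(line, "Udp:", 4) == 0)
                fputs(line, stdout);
        }
        fclose(f);
        return 0;
    }

If RcvbufErrors is the counter that climbs, a common first mitigation is enlarging the receiving socket's buffer (setsockopt with SO_RCVBUF, together with raising net.core.rmem_max).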

is there a way to run network(socket) program in RISC-V?

I'm trying to run some network program (like nginx) on RISC-V.
I'm not aware of any network devices available on RISC-V, and I don't want to implement one myself, so I'm looking for another way.
First I tried running a simple program that sends an HTTP request to a given address. It resolves the IP address using gethostbyname and then sends the HTTP request.
I successfully ran riscv-linux on the spike simulator, compiled the program, and ran it. gethostbyname returns an "Unknown server error", and when I use the IP address directly, connect returns a "Network is unreachable" error.
I found that the front-end server fesvr can handle system calls forwarded from the RISC-V processor, and thought it might handle network-related system calls. I also found fesvr-eth and thought it might be related to handling network services, but according to this link it is just for connecting fesvr running on a PC to the development board over Ethernet. Also, fesvr-eth has been removed from the latest git repo.
So now I want to look at riscv-linux and see how it actually handles network system calls like connect. I'm thinking that if the network operations stay inside localhost, they might be handled properly without a network device. Or there might be an easy way to extend fesvr to handle network services.
Summarizing the questions:
Is there a way to run network applications on RISC-V that I missed?
Can network requests actually be handled when they come from within the same host, even if there are no network devices?
Any other comments or references that could be helpful?
At the moment, neither spike nor riscv-qemu provides models for network interfaces. With both of these tools you will be able to use the loopback device (127.0.0.1) for networking among processes. Loopback networking is implemented directly by the OS kernel.
For other architectures, qemu provides a large number of networking options. It should not be too difficult to modify riscv-qemu/hw/riscv/riscv_board.c to instantiate one of these virtual network interfaces into an emulated risc-v system.
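To confirm that loopback really works inside the simulated system without any NIC model, a self-contained test along these lines (plain POSIX sockets, nothing RISC-V specific; the port number is arbitrary) can be cross-compiled and run under riscv-linux:

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Minimal loopback self-test: listen on 127.0.0.1, connect to ourselves,
     * and echo one message.  Needs no network device, only the kernel's
     * loopback interface. */
    int main(void)
    {
        struct sockaddr_in addr = { 0 };
        addr.sin_family = AF_INET;
        addr.sin_port = htons(5555);               /* arbitrary test port */
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

        int lsock = socket(AF_INET, SOCK_STREAM, 0);
        bind(lsock, (struct sockaddr *)&addr, sizeof(addr));
        listen(lsock, 1);

        int csock = socket(AF_INET, SOCK_STREAM, 0);
        if (connect(csock, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
            perror("connect");                     /* loopback not usable */
            return 1;
        }

        int ssock = accept(lsock, NULL, NULL);
        send(csock, "ping", 4, 0);

        char buf[8] = { 0 };
        recv(ssock, buf, sizeof(buf), 0);
        printf("received: %s\n", buf);

        close(csock); close(ssock); close(lsock);
        return 0;
    }

If connect() fails with "Network is unreachable" even for 127.0.0.1, check that the loopback interface is actually up in the guest (for example, ifconfig lo up).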

IP reassembly at intermediate node

I have the following requirement:
I have a Linux PC connected directly to an embedded board.
The Linux PC receives IP traffic from the Internet and needs to forward it to the embedded board. However, the embedded board does not have the ability to reassemble IP fragments. Currently we receive the reassembled packet on the Linux PC and then sendto() it to the embedded board. Given the high traffic load, this consumes too many CPU cycles on the Linux PC, since it involves a copy from kernel space to user space and then the same packet is copied from user space back to kernel space.
Is there a way for the kernel to reassemble the fragments and IP-forward them to the embedded board without the packets having to come up to user space? Note: I have the flexibility to make the destination IP of the packets either the Linux PC or the embedded board.
Thanks
Broadly speaking, no, this is not built into the kernel, particularly if your reassembled packet exceeds the MTU size and therefore cannot be transmitted to your embedded board. If you wanted to do it, I'd suggest routing via a tun device and reassembling in user space (a sketch of opening such a device follows this answer), or (if you are just using TCP) using any old TCP proxy. If written efficiently, it's hard to see why a Linux PC would not be able to keep up, provided the embedded board can process the output. If you insist on using the kernel, I think there is a TCP splice technique (see kernel-based (Linux) data relay between two TCP sockets), though whether that works at the segment level and thus does not reassemble, I don't know.
However, do you really need it? See:
http://en.wikipedia.org/wiki/Path_MTU_Discovery
Here TCP sessions are sent with the DF bit set precisely so that no fragmentation occurs. This means that most such TCP sessions won't actually need fragmentation support.
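For reference, opening the tun device suggested above might look roughly like the sketch below; the interface name tun0 is illustrative, and the ifconfig/routing setup that directs traffic into the device is omitted. read() returns one IP packet at a time (possibly still a fragment), so the application does the reassembly in user space before forwarding the full datagram to the board.

    #include <fcntl.h>
    #include <linux/if.h>
    #include <linux/if_tun.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    /* Open a tun device: read() yields raw IP packets, write() injects them. */
    static int tun_open(const char *name)
    {
        struct ifreq ifr;
        int fd = open("/dev/net/tun", O_RDWR);
        if (fd < 0)
            return -1;

        memset(&ifr, 0, sizeof(ifr));
        ifr.ifr_flags = IFF_TUN | IFF_NO_PI;   /* raw IP, no extra header */
        strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

        if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
            close(fd);
            return -1;
        }
        return fd;
    }

    int main(void)
    {
        unsigned char pkt[65536];
        int fd = tun_open("tun0");             /* illustrative interface name */
        if (fd < 0) {
            perror("tun_open");
            return 1;
        }
        for (;;) {
            ssize_t n = read(fd, pkt, sizeof(pkt));   /* one IP packet */
            if (n <= 0)
                break;
            /* ... reassemble here and forward the datagram to the board ... */
        }
        return 0;
    }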
Based on the title of the question, it appears you need to perform reassembly on the intermediate node (the Linux device). This doesn't mean you have to do it in kernel space.
Take a look at DPDK. It is an open-source data plane development kit. It may sound complicated, but all it does is use poll-mode drivers to get packets up to user space without the copying and interrupt overhead.
Please note: it uses poll-mode drivers and will take up CPU cycles. You can use DPDK on x86_64 hardware if you are ready to give up a couple of cores, assuming you also want to fragment the packets on the reverse path.
Take a look at the DPDK sample application guide for packet fragmentation and reassembly.
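For orientation, the per-packet reassembly step in that sample revolves around the librte_ip_frag API. A rough sketch, assuming the fragment table (created with rte_ip_frag_table_create) and the death row have been set up the way the ip_reassembly example does:

    #include <rte_cycles.h>
    #include <rte_ether.h>
    #include <rte_ip.h>
    #include <rte_ip_frag.h>
    #include <rte_mbuf.h>

    /* Run one received IPv4 mbuf through the DPDK reassembly table.
     * Returns the mbuf itself if it was never fragmented, the reassembled
     * packet once the last fragment arrives, or NULL while pieces are
     * still missing. */
    static struct rte_mbuf *
    reassemble_ipv4(struct rte_ip_frag_tbl *tbl,
                    struct rte_ip_frag_death_row *dr,
                    struct rte_mbuf *m)
    {
        struct rte_ipv4_hdr *ip = rte_pktmbuf_mtod_offset(m,
                struct rte_ipv4_hdr *, sizeof(struct rte_ether_hdr));

        if (!rte_ipv4_frag_pkt_is_fragmented(ip))
            return m;                          /* nothing to do */

        /* librte_ip_frag needs the header lengths filled in. */
        m->l2_len = sizeof(struct rte_ether_hdr);
        m->l3_len = (ip->version_ihl & RTE_IPV4_HDR_IHL_MASK) *
                    RTE_IPV4_IHL_MULTIPLIER;

        struct rte_mbuf *out = rte_ipv4_frag_reassemble_packet(tbl, dr, m,
                rte_rdtsc(), ip);

        /* Fragments that timed out or were superseded land on the death row. */
        rte_ip_frag_free_death_row(dr, 3);
        return out;
    }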

How can I use DPDK to write a DNS server?

I want to write a high-performance DNS server using Intel DPDK. How can I use Intel DPDK to process TCP packets effectively?
Sure, implementing a network stack on top of DPDK is one solution, but it's too complicated.
As a DNS server handles far more UDP queries than TCP queries, I intend to use DPDK to handle the UDP queries and the Linux network stack to handle the TCP queries.
How can I do this on a single machine?
What Gabe has suggested is correct. However, there is a better way to achieve what you really want: use a bifurcated driver.
The problem with using a KNI as suggested by Gabe is this:
It's your software in user space that decides what it needs to retain (UDP) and what needs to be routed via the kernel network stack (TCP). You then pass those packets to the kernel via DPDK software rings, which consumes CPU and memory cycles.
There is a memory copy between your mbuf and the kernel socket buffer, which hurts KNI performance.
Also note that if you handle UDP in user space, you need to construct the L2 header yourself before pushing the packet out. That means you probably also need to trap ARP packets and build an ARP cache, because you will need it to construct that L2 header.
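For the reply path in the original question, one way to sidestep a full ARP implementation is to reuse the query's own headers and swap the address fields. A rough sketch, assuming a single-segment IPv4/UDP mbuf with no VLAN tag and no IP options, and that the DNS answer has already been written over the UDP payload (payload_len bytes):

    #include <rte_byteorder.h>
    #include <rte_ether.h>
    #include <rte_ip.h>
    #include <rte_mbuf.h>
    #include <rte_udp.h>

    /* Turn a received DNS query mbuf into the reply by swapping the
     * Ethernet, IPv4 and UDP address/port fields and refreshing the
     * lengths and checksums.  Assumes a 20-byte IPv4 header. */
    static void build_dns_reply(struct rte_mbuf *m, uint16_t payload_len)
    {
        struct rte_ether_hdr *eth = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
        struct rte_ipv4_hdr *ip = (struct rte_ipv4_hdr *)(eth + 1);
        struct rte_udp_hdr *udp = (struct rte_udp_hdr *)(ip + 1);

        /* L2: send the frame back to whoever sent the query. */
        struct rte_ether_addr tmp_mac = eth->src_addr;
        eth->src_addr = eth->dst_addr;
        eth->dst_addr = tmp_mac;

        /* L3: swap source and destination IPs, refresh length and checksum. */
        uint32_t tmp_ip = ip->src_addr;
        ip->src_addr = ip->dst_addr;
        ip->dst_addr = tmp_ip;
        ip->total_length = rte_cpu_to_be_16(sizeof(*ip) + sizeof(*udp) + payload_len);
        ip->hdr_checksum = 0;
        ip->hdr_checksum = rte_ipv4_cksum(ip);

        /* L4: swap ports (the query came to port 53, the reply leaves from it). */
        uint16_t tmp_port = udp->src_port;
        udp->src_port = udp->dst_port;
        udp->dst_port = tmp_port;
        udp->dgram_len = rte_cpu_to_be_16(sizeof(*udp) + payload_len);
        udp->dgram_cksum = 0;              /* UDP checksum is optional over IPv4 */

        /* Fix the mbuf lengths to match the new packet size. */
        m->data_len = sizeof(*eth) + sizeof(*ip) + sizeof(*udp) + payload_len;
        m->pkt_len = m->data_len;
    }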
Look at the KNI documentation and the Kernel NIC Interface example in DPDK.
After allocating your KNI device, your main DPDK polling loop pulls packets off the NIC. You'll have to parse the headers yourself, or you could get fancy and set up multiple RX queues and then use RSS 5-tuple filters to send UDP packets to your processing queue and the rest to the default queue.
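If the NIC's PMD supports rte_flow, that filtering idea could be expressed roughly as below: steer UDP destination port 53 to a dedicated RX queue and leave everything else on the default queue. Whether the rule is actually accepted depends on the NIC and its driver.

    #include <rte_byteorder.h>
    #include <rte_flow.h>

    /* Steer UDP packets with destination port 53 to RX queue 1 on port_id;
     * everything else stays on the default queue/RSS setup. */
    static struct rte_flow *install_dns_flow(uint16_t port_id)
    {
        struct rte_flow_attr attr = { .ingress = 1 };

        struct rte_flow_item_udp udp_spec = { .hdr.dst_port = rte_cpu_to_be_16(53) };
        struct rte_flow_item_udp udp_mask = { .hdr.dst_port = rte_cpu_to_be_16(0xffff) };

        struct rte_flow_item pattern[] = {
            { .type = RTE_FLOW_ITEM_TYPE_ETH },
            { .type = RTE_FLOW_ITEM_TYPE_IPV4 },
            { .type = RTE_FLOW_ITEM_TYPE_UDP, .spec = &udp_spec, .mask = &udp_mask },
            { .type = RTE_FLOW_ITEM_TYPE_END },
        };

        struct rte_flow_action_queue queue = { .index = 1 };
        struct rte_flow_action actions[] = {
            { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
            { .type = RTE_FLOW_ACTION_TYPE_END },
        };

        struct rte_flow_error err;
        if (rte_flow_validate(port_id, &attr, pattern, actions, &err) != 0)
            return NULL;                   /* PMD cannot offload this rule */

        return rte_flow_create(port_id, &attr, pattern, actions, &err);
    }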
Anyway, regardless of the method chosen, if it is a UDP packet you can handle it yourself (as you requested); otherwise, you queue the packet onto the KNI thread. You will also need the other half of the KNI interface, where you poll packets off the KNI thread and send them out the interface.
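Putting the two halves together, one iteration of that polling loop might look like the sketch below. KNI allocation (rte_kni_alloc) and port setup follow the kni example and are omitted; the DNS processing itself is left as a placeholder.

    #include <netinet/in.h>
    #include <rte_ethdev.h>
    #include <rte_ether.h>
    #include <rte_ip.h>
    #include <rte_kni.h>
    #include <rte_mbuf.h>
    #include <rte_udp.h>

    #define BURST 32

    /* True if the frame is IPv4/UDP with destination port 53 (a DNS query). */
    static int is_dns_query(struct rte_mbuf *m)
    {
        struct rte_ether_hdr *eth = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
        if (eth->ether_type != rte_cpu_to_be_16(RTE_ETHER_TYPE_IPV4))
            return 0;
        struct rte_ipv4_hdr *ip = (struct rte_ipv4_hdr *)(eth + 1);
        if (ip->next_proto_id != IPPROTO_UDP)
            return 0;
        struct rte_udp_hdr *udp = (struct rte_udp_hdr *)((uint8_t *)ip +
            (ip->version_ihl & RTE_IPV4_HDR_IHL_MASK) * RTE_IPV4_IHL_MULTIPLIER);
        return udp->dst_port == rte_cpu_to_be_16(53);
    }

    /* One iteration of the NIC <-> KNI dispatch described above. */
    static void poll_once(uint16_t port, struct rte_kni *kni)
    {
        struct rte_mbuf *pkts[BURST];

        /* NIC -> us / kernel: keep DNS queries, push everything else to KNI. */
        uint16_t n = rte_eth_rx_burst(port, 0, pkts, BURST);
        for (uint16_t i = 0; i < n; i++) {
            if (is_dns_query(pkts[i])) {
                /* ...build the reply and rte_eth_tx_burst() it back out... */
                rte_pktmbuf_free(pkts[i]);   /* placeholder in this sketch */
            } else if (rte_kni_tx_burst(kni, &pkts[i], 1) == 0) {
                rte_pktmbuf_free(pkts[i]);   /* KNI ring full, drop */
            }
        }

        /* Kernel -> NIC: whatever the Linux stack sends (TCP, ARP, ...). */
        n = rte_kni_rx_burst(kni, pkts, BURST);
        uint16_t sent = rte_eth_tx_burst(port, 0, pkts, n);
        for (uint16_t i = sent; i < n; i++)
            rte_pktmbuf_free(pkts[i]);

        /* Service ifconfig/MTU/link requests arriving from the kernel side. */
        rte_kni_handle_request(kni);
    }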
This is exactly the way we do it in our application, where we still want a Linux networking stack for all operations other than our specific traffic.
