What do I need to build to directly access the InfiniBand HCA ports and inject IPoIB frame bits?

I would like to retrieve the IPoIB frame bits for all IPoIB frames on the fabric, whether or not they are destined (at the LID + QPN level) for my machine.
I should also be able to re-inject the modified IPoIB frames directly into the InfiniBand HCA ports from the Linux kernel.
The logic for that has to be at the kernel level.
To achieve this, do I need to build a separate kernel module, an IPoIB driver, or an IPoIB network interface?
Note: I have just started learning Linux kernel module development for my project. I'm sorry if it is not the appropriate place to post this question.

You are going to have a big problem receiving IPoIB packets not destined for your machine. The fabric forwards packets based on the destination LID, and if the LID is not associated with your local port, you won't receive the packet.
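To illustrate why, here is a toy model (not real switch firmware or verbs code) of how a switch picks a single output port for a unicast packet from a forwarding table indexed by destination LID. The LID values, port numbers, and function names are made up for the example.

```python
# Toy model of unicast forwarding in an InfiniBand switch.
# All LIDs and port numbers below are hypothetical.

# Forwarding table: destination LID -> switch output port.
FORWARDING_TABLE = {
    0x0001: 1,  # host A
    0x0002: 2,  # host B
    0x0003: 3,  # our machine
}

OUR_SWITCH_PORT = 3


def output_port(dlid):
    """Return the single output port for a unicast destination LID, or None."""
    return FORWARDING_TABLE.get(dlid)


def reaches_us(dlid):
    """A unicast packet reaches our HCA only if its DLID maps to our port."""
    return output_port(dlid) == OUR_SWITCH_PORT


print(reaches_us(0x0003))  # packet addressed to our LID
print(reaches_us(0x0002))  # packet addressed to host B
```

In this model a packet for host B leaves only through port 2, so nothing attached to port 3 ever sees it, regardless of what driver or module runs on that machine.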

Related

What is the best way to process DNS packets with DPDK?

Edit: I am using DPDK 21.11.2 (LTS).
I'm working on implementing a DPDK-based DNS server. I have researched and studied both concepts. In my understanding, we are bypassing the kernel with DPDK and thus losing all the processing ability of the kernel. DPDK is limited to Layer 2, as the NIC we bind to it loses its IP. To implement a Layer 3 protocol, is l3fwd the best option?
I read up on KNI as well, as it shows a simple command to assign an IP. But that is no longer available in the latest versions.
Basically, what I want to do is send a DNS packet to the DPDK NIC (from another VM). The DPDK VM then processes it and sends an answer packet back through the same NIC. What is the least complicated way to do this?
I have only tried basic DPDK examples like basicfwd and l2fwd where simple packets are sent and received by the same VM.
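Whichever receive path is used, the application ends up handling the DNS message itself once the Ethernet, IP, and UDP headers have been stripped. The sketch below shows only that part: parsing the first question of a query and building a reply with one A record. It is written in Python for brevity and is not DPDK code; in a DPDK application the same steps would run in C on the packet payload. The function names and the fixed answer address are assumptions for illustration, and name compression in the question is not handled.

```python
import socket
import struct


def parse_question(payload):
    """Return (name, qtype, qclass, end_offset) for the first question."""
    offset = 12  # the DNS header is always 12 bytes
    labels = []
    while True:
        length = payload[offset]
        offset += 1
        if length == 0:  # zero-length label terminates the name
            break
        labels.append(payload[offset:offset + length].decode("ascii"))
        offset += length
    qtype, qclass = struct.unpack_from("!HH", payload, offset)
    return ".".join(labels), qtype, qclass, offset + 4


def build_a_response(query, ipv4, ttl=60):
    """Build a reply carrying one A record for the query's first question."""
    txid, flags = struct.unpack_from("!HH", query, 0)
    _, _, _, qend = parse_question(query)
    # QR=1 (response), AA=1 (authoritative), RD copied from the query.
    rflags = 0x8400 | (flags & 0x0100)
    header = struct.pack("!HHHHHH", txid, rflags, 1, 1, 0, 0)
    question = query[12:qend]
    # 0xC00C is a compression pointer to the name at offset 12.
    answer = struct.pack("!HHHIH", 0xC00C, 1, 1, ttl, 4) + socket.inet_aton(ipv4)
    return header + question + answer


# A hand-built query for example.com, type A, class IN.
query = (struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
         + b"\x07example\x03com\x00" + struct.pack("!HH", 1, 1))
response = build_a_response(query, "192.0.2.1")
print(parse_question(query)[0], len(response))
```

The reply reuses the transaction ID and question section from the query; the surrounding UDP, IP, and Ethernet headers would be rebuilt with source and destination fields swapped before the frame is transmitted.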

Ethernet frames from NIC

I'm looking for help and advice on a network project that I have been working on lately. It requires a Linux machine to act as a passive network appliance.
Network packets come in from one network interface and go out from another interface (net--eth0-->Linux PC--eth1-->net) without any modification to the data.
The application, which is going to run on the Linux system, will change only the order of the packets. It is going to be a "silly" network emulator application.
The first implementation was made with RAW sockets, where read() is called every time a packet arrives in user space and write() is called when an Ethernet packet should be sent down to the NIC.
I would like to know if there is a more practical and direct way than RAW sockets, bypassing Linux's network stack.
If what you want is to bypass the kernel, DPDK on Linux and netmap on FreeBSD are options to do just that.
Indeed, this can be done with DPDK on Linux. There are l3fwd and l2fwd sample applications in the examples folder of the DPDK tree, which may inspire you. Also consider using VPP, an FD.io project hosted by the Linux Foundation, which can use DPDK.
Rami Rosen
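Independently of how frames are read and written (raw sockets, or DPDK's rte_eth_rx_burst() and rte_eth_tx_burst()), the reordering step itself can be kept separate from the I/O. The following is a minimal sketch of one possible policy: hold frames until a window is full, then release them in shuffled order. The class name, the window policy, and the seed parameter are assumptions for illustration, not something the question specifies.

```python
import random


class ReorderBuffer:
    """Collects frames and releases each full window in a shuffled order."""

    def __init__(self, window, seed=None):
        self.window = window
        self.pending = []
        self.rng = random.Random(seed)  # seedable for repeatable runs

    def push(self, frame):
        """Add a frame; return the frames to transmit now (possibly none)."""
        self.pending.append(frame)
        if len(self.pending) < self.window:
            return []
        return self.flush()

    def flush(self):
        """Release everything still held, in a shuffled order."""
        out = self.pending
        self.pending = []
        self.rng.shuffle(out)
        return out


buf = ReorderBuffer(window=4, seed=1)
sent = []
for i in range(10):
    sent.extend(buf.push(b"frame%d" % i))
sent.extend(buf.flush())  # drain the partial last window
print(len(sent))
```

Frame contents are never touched, only their order, which matches the requirement that the appliance make no modifications to the data.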

IP reassembly at intermediate node

I have the following requirement:
I have a Linux PC connected directly to an embedded board.
The Linux PC receives IP traffic from the Internet and needs to forward it to the embedded board. However, the embedded board is not able to reassemble IP fragments. Currently we receive the reassembled packet on the Linux PC and then sendto() it to the embedded board. Given the high traffic load, this consumes too many CPU cycles on the Linux PC, since it involves a copy from kernel space to user space, after which the same packet is copied from user space back to kernel space.
Is there a way for the kernel to reassemble the fragments and IP-forward the result to the embedded board without the packet having to come to user space? Note: I have the flexibility to make the destination IP of the IP packets either the Linux PC or the embedded board.
Thanks
Broadly speaking, no, this is not built into the kernel, particularly if your reassembled packet exceeds the MTU and therefore cannot be transmitted to your embedded board. If you want to do it, I'd suggest routing via a tun device and reassembling in user space, or (if you are only using TCP) using any old TCP proxy. If written efficiently, it's hard to see why a Linux PC would not be able to keep up if the embedded board can manage to process the output. If you insist on using the kernel, I think there is a TCP splice technique (see "kernel-based (Linux) data relay between two TCP sockets"), though I don't know whether that works at the segment level and thus does not reassemble.
However, do you really need it? See:
http://en.wikipedia.org/wiki/Path_MTU_Discovery
With Path MTU Discovery, TCP sessions are sent with the DF bit set precisely so that no fragmentation occurs. This means that most such TCP sessions won't actually need fragmentation support.
Based on the title of the question, it appears you need to perform reassembly on the intermediate node (the Linux device). This doesn't mean you have to do it in kernel space.
Take a look at DPDK, the open-source Data Plane Development Kit. It may sound complicated, but all it does is use Poll Mode Drivers to get packets up to user space without the copying and interrupt overhead.
Please note that it uses poll mode drivers and will take up CPU cycles. You can use DPDK on x86_64 hardware if you are ready to give up a couple of cores, assuming you also want to fragment the packets in the reverse path.
Take a look at the sample application guide of DPDK for packet fragmentation and reassembly.
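For orientation, the core of IPv4 reassembly is small, wherever it runs. The sketch below is plain Python, not the DPDK library, and assumes the fragments of a single datagram have already been grouped together and reduced to (fragment offset, more-fragments flag, payload) tuples; that tuple layout and the function name are hypothetical.

```python
def reassemble(fragments):
    """Reassemble one datagram's payload from its fragments.

    Each fragment is (offset_units, more_fragments, data), where
    offset_units is the IPv4 fragment offset field in 8-byte units.
    Returns the payload bytes, or None if fragments are still missing.
    """
    total_len = None
    pieces = {}
    for offset_units, more_fragments, data in fragments:
        start = offset_units * 8
        pieces[start] = data
        if not more_fragments:
            total_len = start + len(data)  # the last fragment fixes the size
    if total_len is None:
        return None  # last fragment not seen yet
    payload = bytearray()
    for start in sorted(pieces):
        if start != len(payload):
            return None  # hole or overlap in the data
        payload += pieces[start]
    if len(payload) != total_len:
        return None
    return bytes(payload)


# Fragments may arrive in any order.
frags = [(3, False, b"C" * 5), (0, True, b"A" * 16), (2, True, b"B" * 8)]
print(len(reassemble(frags)))
```

A real implementation would also group fragments by source address, destination address, protocol, and IP identification, and would discard incomplete datagrams after a timeout; both are left out here to keep the sketch short.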

Implementing a kernel debugging module for a Linux guest OS inside a VMware VM

Sorry for the rather long post.
I need some input regarding a project that I am going to undertake.
I am trying to make an application that collects kernel debugging information from a guest Linux OS, located inside a VMware virtual machine, and sends it to the host OS efficiently.
So far, I have found a similar project, but written for Windows[1].
The author of the project wrote a DLL that is loaded into memory and replaces the implementation of the KdSendPacket and KdReceivePacket functions so that they use the VMware GuestRpc[2] mechanism instead of the slow serial port.
The data is then sent to a debugging application on the host (Kd or WinDbg) through a named pipe.
The author claims a speed-up of up to 45% from avoiding the serial port transmission.
I am trying to achieve something similar, but for Linux, and to make the debugging process a little faster than using the serial port.
My concrete questions are:
Do any similar applications exist?
I didn't manage to find any.
Would such an application be worth it, comparing its functionality to netconsole[3], for example?
What method of intercepting printk messages would you suggest?
Is there an equivalent of KdSendPacket/KdReceivePacket on Linux?
[1]. http://virtualkd.sysprogs.org/dox/operation.html
[2]. http://articles.sysprogs.org/kdvmware/guestrpc.shtml
[3]. http://www.kernel.org/doc/Documentation/networking/netconsole.txt
Using the serial port is really suboptimal. Even the (virtual) network would be preferable to that, but getting back to host-guest IPC channels, VMware's VMCI comes to mind.
Many approaches can achieve your goal. If the network is connected, the following methods can be applied:
Use a syslog service and transfer the logs over the network to your server: syslogd and syslog-ng appear to support sending logs to a log server with some filter criteria.
Directly call TCP/UDP socket functions in your kernel module to send the collected data back to the server.
As another approach, you may write an application on the host machine that calls the hypervisor's shared-memory access functions to read the memory buffer of your kernel module. Xen and KVM both support such APIs, but I am not sure whether VMware has this kind of library.
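As a rough illustration of the network-based options, the sketch below formats a minimal syslog-style datagram and sends it over UDP. It is user-space Python meant to show what goes on the wire, not kernel code; a kernel module would do the equivalent with the in-kernel socket API. The function names, the tag, and the omission of the timestamp and hostname fields are simplifying assumptions.

```python
import socket


def syslog_datagram(facility, severity, tag, message):
    """Format a minimal syslog-style datagram: <PRI>tag: message."""
    pri = facility * 8 + severity  # syslog priority value
    return "<{}>{}: {}".format(pri, tag, message).encode("utf-8")


def send_log(server, port, datagram):
    """Send one datagram to the log server over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, (server, port))


# Facility 0 is kernel messages, severity 4 is warning.
print(syslog_datagram(0, 4, "kdbg", "buffer nearly full"))
```

Calling send_log("192.0.2.10", 514, ...) would deliver the datagram to a collector listening on the standard syslog UDP port; the address here is a documentation placeholder.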

How to check the registers value of the network card on ARM Linux?

On our device, we observed that IPv6 NS packets with a Layer 2 multicast address were dropped. tcpdump cannot capture these packets, so I guess they were dropped by the network card driver (correct me if I am wrong).
To verify this, I want to write a module to check the value of some registers in the network card. Since it is not possible for me to recompile the original driver I need a separate module to finish this job.
Is it possible to do that? How?
You can recompile the driver, adding printk with whatever it is you want to see.
If you are developing for an ARM target, it is possible you are using the Embedded Linux Dev Kit (ELDK), so you could look in the kernel source tree for the driver, modify it, and rebuild the kernel. Or you could remove the resident driver and compile it as a loadable module—which is a lot easier for tinkering with a driver.
