Query related to AF_XDP for transmission of data - linux

I'm developing a user-space application where my end goal is to send data from a Linux machine A (an embedded device) and receive it on another Linux machine B (also an embedded device) over AF_XDP. I want to use AF_XDP to achieve low latency.
I understand that for my user-space application to receive data over AF_XDP I should use XDP_REDIRECT (please correct me if I'm wrong). What I can't understand is which XDP option (XDP_REDIRECT, XDP_TX, etc.) I should use to transmit data from my user-space application over AF_XDP.

What I can't understand is which XDP option (XDP_REDIRECT, XDP_TX, etc.) I should use to transmit data from my user-space application over AF_XDP.
That is not how AF_XDP works. Once all of the setup work is done, by you or by a library you use, there will be a shared memory region called UMEM and four ring buffers which are used in a sort of dance to receive and transmit packets. The rings are called UMEM_Fill, UMEM_Completion, RX, and TX.
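For orientation, here is a rough sketch of that setup using the xsk_* helpers from libxdp (older libbpf versions shipped the same API as <bpf/xsk.h>). The interface name, queue ID, and frame count are placeholders, and error handling is mostly omitted:

#include <stdlib.h>
#include <unistd.h>
#include <xdp/xsk.h>

#define NUM_FRAMES 4096
#define FRAME_SIZE XSK_UMEM__DEFAULT_FRAME_SIZE

/* The four rings named above: Fill and TX are producer rings (we
 * write to them), Completion and RX are consumer rings (we read). */
static struct xsk_ring_prod fill, tx;
static struct xsk_ring_cons comp, rx;

static int setup(struct xsk_umem **umem, struct xsk_socket **xsk,
                 void **umem_area)
{
    size_t size = (size_t)NUM_FRAMES * FRAME_SIZE;

    /* UMEM: one big page-aligned region shared with the kernel. */
    if (posix_memalign(umem_area, getpagesize(), size))
        return -1;
    if (xsk_umem__create(umem, *umem_area, size, &fill, &comp, NULL))
        return -1;
    /* Binds an AF_XDP socket to queue 0 of "eth0" (both placeholders)
     * and maps the RX/TX rings. */
    return xsk_socket__create(xsk, "eth0", 0, *umem, &rx, &tx, NULL);
}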
On the ingress side, your XDP program is triggered so you can decide whether or not to send traffic to your AF_XDP socket. There you use the bpf_redirect_map helper to redirect the traffic into the socket/map; it returns XDP_REDIRECT if successful (it may fail if the socket isn't set up correctly or its buffers are full).
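A minimal XDP program along the lines of the sample in the kernel documentation might look like this (xsks_map is an XSKMAP you create and populate from user space; its size here is a placeholder):

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* XSKMAP populated from user space with the AF_XDP socket fd,
 * keyed by RX queue index. */
struct {
    __uint(type, BPF_MAP_TYPE_XSKMAP);
    __uint(max_entries, 64);
    __type(key, __u32);
    __type(value, __u32);
} xsks_map SEC(".maps");

SEC("xdp")
int redirect_to_xsk(struct xdp_md *ctx)
{
    __u32 index = ctx->rx_queue_index;

    /* Redirect to the socket bound to this queue, if any; otherwise
     * let the packet continue up the regular network stack. */
    if (bpf_map_lookup_elem(&xsks_map, &index))
        return bpf_redirect_map(&xsks_map, index, XDP_PASS);

    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";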
To read a redirected packet you dequeue the RX queue/ring; the frame descriptor you get points to an address in UMEM, which is where your packet data is located. As soon as you are done with that memory, you send the frame descriptor back over the UMEM_Fill queue/ring so the NIC/driver can reuse it.
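In user space that RX/Fill dance looks roughly like this with the same xsk_* helpers (batch size and names are placeholders; real code would handle errors and back off instead of spinning):

#include <xdp/xsk.h>

/* Drain up to 64 packets from the RX ring, then hand the frames back
 * on the Fill ring. Assumes the rings and umem_area came from the
 * setup sketch above. */
static void rx_poll(struct xsk_ring_cons *rx, struct xsk_ring_prod *fill,
                    void *umem_area)
{
    __u32 idx_rx = 0, idx_fill = 0;
    unsigned int rcvd = xsk_ring_cons__peek(rx, 64, &idx_rx);
    if (!rcvd)
        return;

    /* Reserve matching slots on the Fill ring for recycling. */
    while (xsk_ring_prod__reserve(fill, rcvd, &idx_fill) != rcvd)
        ;   /* placeholder: poll/back off here in real code */

    for (unsigned int i = 0; i < rcvd; i++) {
        const struct xdp_desc *desc = xsk_ring_cons__rx_desc(rx, idx_rx + i);
        void *pkt = xsk_umem__get_data(umem_area, desc->addr);

        /* ... process desc->len bytes at pkt ... */

        /* Give the frame back to the kernel for future RX. */
        *xsk_ring_prod__fill_addr(fill, idx_fill + i) = desc->addr;
    }

    xsk_ring_prod__submit(fill, rcvd);
    xsk_ring_cons__release(rx, rcvd);
}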
To write a packet, you dequeue a frame descriptor from the UMEM_Completion queue/ring, fill the region of UMEM you temporarily control with your packet, and hand the frame descriptor to the TX queue/ring. The NIC/driver consumes the TX queue/ring and sends the packet out. No XDP program is triggered on egress.
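And the TX/Completion side, again as a hedged sketch with the same helpers; frame_addr is assumed to be a UMEM offset your own bookkeeping says is free (e.g. reclaimed earlier from the Completion ring):

#include <string.h>
#include <sys/socket.h>
#include <xdp/xsk.h>

/* Transmit one frame and reclaim any completed frames. */
static int tx_one(struct xsk_socket *xsk, struct xsk_ring_prod *tx,
                  struct xsk_ring_cons *comp, void *umem_area,
                  __u64 frame_addr, const void *pkt, __u32 len)
{
    __u32 idx, idx_cq;

    if (xsk_ring_prod__reserve(tx, 1, &idx) != 1)
        return -1;                      /* TX ring full */

    /* Copy the packet into our UMEM frame and describe it. */
    memcpy(xsk_umem__get_data(umem_area, frame_addr), pkt, len);
    struct xdp_desc *desc = xsk_ring_prod__tx_desc(tx, idx);
    desc->addr = frame_addr;
    desc->len  = len;

    xsk_ring_prod__submit(tx, 1);

    /* Kick the kernel so the driver actually transmits; with
     * need-wakeup mode you would check xsk_ring_prod__needs_wakeup()
     * first. */
    sendto(xsk_socket__fd(xsk), NULL, 0, MSG_DONTWAIT, NULL, 0);

    /* Reclaim completed frames so their UMEM can be reused;
     * xsk_ring_cons__comp_addr() yields the reusable addresses. */
    unsigned int done = xsk_ring_cons__peek(comp, 64, &idx_cq);
    if (done)
        xsk_ring_cons__release(comp, done);

    return 0;
}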
I would highly recommend checking out https://www.kernel.org/doc/html/latest/networking/af_xdp.html, which has more details on using AF_XDP.

Related

Linux can-bus excessive retransmit

I'm working on a project involving a Linux embedded device with CAN bus support.
I've noticed that if I try to send a CAN packet without anything attached to the CAN bus, the transmission is automatically reattempted by the kernel an unlimited number of times. I can verify this using a scope: the same message is transmitted over and over. This retransmission persists even if I shut down the process which created the message, and even if that process only ever attempts to transmit one single message.
My question is: is this normal behaviour for a Linux CAN bus stack? My worry is that if something ever goes wrong in the device and it erroneously concludes that it is alone on the bus, it might swamp the bus and make it unusable for the other bus participants. I would have expected there to be some sort of retry limit.
The device is running Linux 4.14.48, and the CAN chip is a Philips SJA1000.
What you are seeing are likely error frames. Compliant behavior is this:
The node is active. It attempts to send a data frame but gets no ACK bit set, since nobody is listening to it.
It sends out an error frame, which consists pretty much only of 6 dominant bits, to purposely break bit stuffing.
The controller re-attempts to send the message. If another attempt completes without receiving an ACK, another error frame is sent. This keeps repeating automatically.
After 128 errors, the node goes error passive, where it still sends error frames, but now at recessive level so it doesn't disrupt other traffic.
After a total of 256 errors, the node goes bus-off and shuts up completely.
This should all be handled by the CAN controller hardware, not by the OS. You might need to reset or power cycle the SJA1000 once it goes bus-off. If it never goes bus-off, then something in the driver code might be continuously resetting the CAN controller after a certain number of errors.
Mind that microcontroller implementations might act the same way and reset upon errors too, since that's typically the only way to re-establish communication after a bus-off. This depends on the nature of the CAN application.
Short answer is yes: if missing ACKs are the only error, the transmit error counter will stop at 128 and the node will not go into bus-off. It will retry forever. This happened to me as well, and I just turned off the retransmit function on the processor side. I'm not sure whether that is a CAN standard function or not.
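For what it's worth, on Linux/SocketCAN the "transmit only once" knob is usually the one-shot control mode (driver support varies). A sketch using libsocketcan, with can0 as a placeholder interface name; changing the mode requires CAP_NET_ADMIN and the interface being down:

#include <linux/can/netlink.h>
#include <libsocketcan.h>

int main(void)
{
    /* One-shot mode: the controller attempts each frame once instead
     * of retransmitting until it sees an ACK. */
    struct can_ctrlmode cm = {
        .mask  = CAN_CTRLMODE_ONE_SHOT,
        .flags = CAN_CTRLMODE_ONE_SHOT,
    };

    if (can_set_ctrlmode("can0", &cm) < 0)
        return 1;

    /* Optional: have the kernel restart the controller automatically
     * 100 ms after it goes bus-off. */
    if (can_set_restart_ms("can0", 100) < 0)
        return 1;

    return 0;
}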

Whose time slice is used when a packet is carried through the Linux network stack?

From the man page, when calling sendto() to send a packet in blocking mode, sendto() returns after the packet has been copied from user space to kernel space (the socket's send buffer). Then the user-space app continues to run. Every thread has its own time slice, so if the user-space app continues to run, there must be another kernel thread that carries the packet from the socket's send buffer to the network adapter, right? But if there are many user apps sending packets, how many kthreads are used to carry them? It looks unreasonable to use kthreads to carry packets, but if that's not how it works, I can't understand what happens after sendto() returns.

Can I use SO_REUSEPORT to distribute a single UDP flow to multiple receiver threads?

My Linux application needs to receive a single UDP flow with modestly-sized packets (~1 KB) at a rate on the order of ~600,000 packets per second. My current implementation is naive: it has a single thread that simply calls recv() repeatedly, placing the received data in a queue to be processed by another thread. Therefore, the receiver thread is only in charge of pulling in the packets.
In some initial testing that I've done, I'm only able to receive between 200,000-300,000 packets per second before the thread reaches full utilization of its CPU core. This obviously isn't good enough to meet the goal of ~600,000 packets per second.
Ideally, I would find some way of distributing the packet reception load across multiple threads. In looking for a solution to the problem, I came across the SO_REUSEPORT socket option, which allows multiple TCP/UDP sockets to be bound to the same IP/port combination. At first, this seemed to be exactly what I wanted.
However, the article also points out this detail:
Incoming connections and datagrams are distributed to the server sockets using a hash based on the 4-tuple of the connection—that is, the peer IP address and port plus the local IP address and port. This means, for example, that if a client uses the same socket to send a series of datagrams to the server port, then those datagrams will all be directed to the same receiving server (as long as it continues to exist). This eases the task of conducting stateful conversations between the client and server.
Therefore, if I only have a single UDP flow, the above hashing scheme would direct all of the packets to the same receiver thread, thwarting my attempt at parallelizing the work. So the question is: is there a way to receive a single flow of UDP packets from multiple threads, using SO_REUSEPORT or some other mechanism?
Note that my application can handle reordering of packets; the protocol that the datagrams are formatted with contains sequencing information that I can use to reorder them properly afterward.
If you haven't found the solution in the last 3 years, take a look at SO_ATTACH_REUSEPORT_CBPF. We had exactly the same issue and solved it by attaching a simple BPF program which distributes datagrams randomly, modulo the number of sockets.
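A sketch of that approach (not our exact code): a three-instruction classic BPF program that loads the kernel's per-packet random value and reduces it modulo the group size, so each datagram lands on a pseudo-random socket in the group. NSOCKS is a placeholder; all NSOCKS sockets must be created with SO_REUSEPORT and bound to the same address/port, and I believe SKF_AD_RANDOM needs Linux 3.14 or later:

#include <linux/filter.h>
#include <sys/socket.h>

#define NSOCKS 4    /* placeholder: size of the SO_REUSEPORT group */

/* Attach to any one socket of the group; the program then picks the
 * destination socket (by index) for every incoming datagram. */
static int attach_random_balancer(int fd)
{
    struct sock_filter code[] = {
        /* A = prandom_u32() (kernel-provided ancillary value) */
        { BPF_LD | BPF_W | BPF_ABS, 0, 0, SKF_AD_OFF + SKF_AD_RANDOM },
        /* A = A % NSOCKS -> index of the socket to deliver to */
        { BPF_ALU | BPF_MOD | BPF_K, 0, 0, NSOCKS },
        /* return A */
        { BPF_RET | BPF_A, 0, 0, 0 },
    };
    struct sock_fprog prog = {
        .len    = sizeof(code) / sizeof(code[0]),
        .filter = code,
    };

    return setsockopt(fd, SOL_SOCKET, SO_ATTACH_REUSEPORT_CBPF,
                      &prog, sizeof(prog));
}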

Isochronous USB transfers confusion

Isochronous endpoints are one way only. But a single isochronous IN transaction is described in various sources (e.g. here: http://www.beyondlogic.org/usbnutshell/usb4.shtml#Isochronous) as one IN token packet (from the host to the device) followed by one DATA packet (from the device to the host). So I see communication in both directions here. Is the token packet from the host received by the same IN isochronous endpoint which then sends the data?
What is synchronization for? Here: http://wiki.osdev.org/Universal_Serial_Bus#Supporting_Isochronous_Transfers we read: "Due to application-specific sampling rates, different hardware clock designs, scheduling policies in the operating system, or even physical anomalies, the host and isochronous device could fall out of synchronization." But how? I understand the sequence of events like this: the device fills its outgoing buffer with data and waits for the token (some interrupt, probably). The host sends the token packet and waits for the data packet, which (I think) should arrive instantly. The sequence is repeated every frame (at full speed), and everybody is happy. Isn't the token packet synchronizing the reply from the device?
Here: http://wiki.osdev.org/Universal_Serial_Bus#SYNC_Field we read: "All USB packets start with a SYNC field which serves, unsurprisingly, as a synchronization mechanism between the receiver and the sender." So once again I ask: why synchronize isochronous transfers in another manner than this?
All USB transactions are always initiated by the host. E.g. for an isochronous IN transaction the host will first ask the device for the next piece of data. This is of course a data flow to the device, but on a lower protocol level (token packets). So a kind of control data is sent TO the device, but the meaningful data (data packets) is only sent FROM the device (IN direction). When you develop software for a device you can often abstract away the bus protocol details, because they are handled in hardware (the USB device peripheral). The low-level messages do not enter an endpoint; endpoints are on a higher layer.
Consider a USB microphone: it records audio data at a very specific sample rate which is based on the local oscillator of the device. It's only a matter of time before the clocks of the host and the microphone drift apart. After a few minutes a gap in the data would appear (or a buffer overflow would occur), because the microphone is recording data at a slightly different speed than the host is expecting (based on the device's configuration descriptor). So they need some kind of synchronization.
The SYNC field is on the lowest layer. It is for bit synchronization only and should not be confused with the synchronization needed for isochronous endpoints (question 2).
You might want to take a look at the official USB 2.0 specification (usb_20.pdf) instead of all the third-party wikis, which seem to have confused you.

Multicasting + Linux Kernel

I have one doubt regarding multicasting in the Linux kernel. When multicast data arrives, the kernel checks the MFC (multicast forwarding cache); if no matching entry is found, the kernel sends a cache-miss control message and the packet header to user space. My question is: what happens to the data packet? Suppose I deliberately do not want to keep the entry in the MFC, but I have some other table which holds the forwarding information and I want to use that one instead; what should I do?
Regards,
Bhavin.
If a data packet arrives for which there is no matching MFC entry, then the data packet gets put into a queue. It will stay in that queue until either an MFC entry gets added that matches that packet or a timeout expires (10 seconds), whichever happens first. The queue itself has a limit of 10 entries, and once that limit is reached no more packets will get put onto the queue. In that case, unresolved packets will get dropped.
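To make the mechanics concrete, here is a hedged sketch of the resolution loop a user-space routing daemon (in the style of mrouted/pimd) runs: open the multicast routing socket, watch for the kernel's IGMPMSG_NOCACHE upcall, consult your own table, and install the result with MRT_ADD_MFC so the queued packets get released. VIF numbers and TTLs are placeholders, and the VIFs must already have been registered with MRT_ADD_VIF:

#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/mroute.h>

int main(void)
{
    int s = socket(AF_INET, SOCK_RAW, IPPROTO_IGMP);
    int on = 1;
    unsigned char buf[2048];

    /* Become the kernel's multicast routing daemon
     * (needs CAP_NET_ADMIN; only one such socket may exist). */
    if (setsockopt(s, IPPROTO_IP, MRT_INIT, &on, sizeof(on)) < 0)
        return 1;

    for (;;) {
        ssize_t n = recv(s, buf, sizeof(buf), 0);
        if (n <= 0)
            continue;

        /* Upcalls overlay struct igmpmsg on the IP header;
         * im_mbz == 0 distinguishes them from real IGMP packets. */
        struct igmpmsg *msg = (struct igmpmsg *)buf;
        if (msg->im_mbz != 0 || msg->im_msgtype != IGMPMSG_NOCACHE)
            continue;

        /* Consult your own forwarding table here, then install the
         * result as an MFC entry; the packets queued for this
         * (source, group) are then forwarded instead of timing out. */
        struct mfcctl mc;
        memset(&mc, 0, sizeof(mc));
        mc.mfcc_origin   = msg->im_src;
        mc.mfcc_mcastgrp = msg->im_dst;
        mc.mfcc_parent   = msg->im_vif; /* incoming VIF from the upcall */
        mc.mfcc_ttls[1]  = 1;           /* forward out VIF 1 (placeholder) */

        setsockopt(s, IPPROTO_IP, MRT_ADD_MFC, &mc, sizeof(mc));
    }
}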
I don't think Linux supports having multiple MFC tables (but I could be wrong). As an alternative, you could route these multicast packets in user space by receiving them on a raw socket and then forwarding them out whatever interface you like. In fact, many of the IPv6 multicast routing daemons used a method like this before IPv6 multicast support on Linux matured.
You can check whether the kernel was compiled with multicast support using the command below (adjust the config file name to your kernel version):
grep -i "multicast" /boot/config-2.6.32-358.6.1.el6.x86_64
