Is ethernet checksum exposed via AF_PACKET? - linux

As it is implied by this question, it seems that checksum is calculated and verified by ethernet hardware, so it seems highly unlikely that it must be generated by software when sending frames using an AF_PACKET socket, as seem here and here. Also, I don't think it can be received from the socket nor by any simple mean, since even Wireshark doesn't display it.
So, can anyone confirm this? Do I really need to send the checksum myself as shown in the last two links? Will checksum be created and checked automatically by the ethernet adaptor?

No, you do not need to include the CRC.
When using a packet socket in Linux using socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL) ), you must provide the layer 2 header when sending. This is defined by struct ether_header in netinet/if_ether.h and includes the destination host, source host, and type. The frame check sequence is not included, nor is the preamble, start of frame delimiter, or trailer. These are added by the hardware.

On Linux, if you mention socket(AF_PACKET, SOCK_RAW, htobe16(ETH_P_ALL)) similar case, you don't need to calculate ethernet checksum, NIC hardware/driver will do it for you. That means you need to offer whole data link layer frame except checksum before send it to raw socket.

Related

When working with raw sockets (Layer 2), does the kernel generate the frame check sequence (FCS), or do I need to generate it and append it myself? [duplicate]

As it is implied by this question, it seems that checksum is calculated and verified by ethernet hardware, so it seems highly unlikely that it must be generated by software when sending frames using an AF_PACKET socket, as seem here and here. Also, I don't think it can be received from the socket nor by any simple mean, since even Wireshark doesn't display it.
So, can anyone confirm this? Do I really need to send the checksum myself as shown in the last two links? Will checksum be created and checked automatically by the ethernet adaptor?
No, you do not need to include the CRC.
When using a packet socket in Linux using socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL) ), you must provide the layer 2 header when sending. This is defined by struct ether_header in netinet/if_ether.h and includes the destination host, source host, and type. The frame check sequence is not included, nor is the preamble, start of frame delimiter, or trailer. These are added by the hardware.
On Linux, if you mention socket(AF_PACKET, SOCK_RAW, htobe16(ETH_P_ALL)) similar case, you don't need to calculate ethernet checksum, NIC hardware/driver will do it for you. That means you need to offer whole data link layer frame except checksum before send it to raw socket.

Where did Wireshark/tcpdump/libpcap intercept packet inside Linux kernel?

According to this, wireshark is able to get the packet before it is dropped (therefore I cannot get such packets by myself). And I'm still wondering the exact location in linux kernel for wireshark to fetch the packets.
The answer goes as "On UN*Xes, it uses libpcap, which, on Linux, uses AF_PACKET sockets." Does anyone have more concrete example to use "AF_PACKET sockets"? If I understand wireshark correctly, the network interface card (NIC) will make a copy of all incoming packets and send it to a filter (berkeley packet filter) defined by the user. But where does this happen? Or am I wrong with that understanding and do I miss anything here?
Thanks in advance!
But where does this happen?
If I understood you correctly - you want to know, where is initialized such socket.
There is pcap_create function, that tries to determine type of source interface, creates duplicate of it and activates it.
For network see pcap_create_interface function => pcap_create_common function => pcap_activate_linux function.
All initialization happens in pcap_activate_linux => activate_new function => iface_bind function
( copy descriptor of device with handlep->device = strdup(device);,
create socket with socket(PF_PACKET, SOCK_DGRAM, htons(ETH_P_ALL)),
bind socket to device with bind(fd, (struct sockaddr *) &sll, sizeof(sll)) ).
For more detailed information read comments in source files of mentioned functions - they are very detailed.
After initialization all work happens in a group of functions such as pcap_read_linux, etc.
On Linux, you should be able to simply use tcpdump (which leverages the libpcap library) to do this. This can be done with a file or to STDOUT and you specify the filter at the end of the tcpdump command..

Finding out the number of dropped packets in raw sockets

I am developing a program that sniffs network packets using a raw socket (AF_PACKET, SOCK_RAW) and processes them in some way.
I am not sure whether my program runs fast enough and succeeds to capture all packets on the socket. I am worried that the recieve buffer for this socket occainally gets full (due to traffic bursts) and some packets are dropped.
How do I know if packets were dropped due to lack of space in the
socket's receive buffer?
I have tried running ss -f link -nlp.
This outputs the number of bytes that are currently stored in the revice buffer for that socket, but I can not tell if any packets were dropped.
I am using Ubuntu 14.04.2 LTS (GNU/Linux 3.13.0-52-generic x86_64).
Thanks.
I was having a similar problem as you. I knew that tcpdump was able to to generate statistics about packet drops, so I tried to figure out how it did that. By looking at the code of tcpdump, I noticed that it is not generating those statistic by itself, but that it is using the libpcap library to get those statistics. The libpcap is on the other hand getting those statistics by accessing the if_packet.h header and calling the PACKET_STATISTICS socket option (at least I think so, but I'm no C expert).
Therefore, I saw only two solutions to the problem:
I had to interact somehow with the linux header files from my Pyhton script to get the packet statistics, which seemed a bit complicated.
Use the Python version of libpcap which is pypcap to get those information.
Since I had no clue how to do the first thing, I implemented the second option. Here is an example how to get packet statistics using pypcap and how to get the packet data using dpkg:
import pcap
import dpkt
import socket
pc=pcap.pcap(name="eth0", timeout_ms=10000, immediate=True)
def packet_handler(ts,pkt):
#printing packet statistic (packets received, packets dropped, packets dropped by interface
print pc.stats()
#example packet parsing using dpkt
eth=dpkt.ethernet.Ethernet(pkt)
if eth.type != dpkt.ethernet.ETH_TYPE_IP:
return
ip =eth.data
layer4=ip.data
ipsrc=socket.inet_ntoa(ip.src)
ipdst=socket.inet_ntoa(ip.dst)
pc.loop(0,packet_handler)
tpacket_stats structure is defined in linux/packet.h header file
Create variable using the tpacket_stats structre and pass it to getSockOpt with PACKET_STATISTICS SOL_SOCKET options will give packets received and dropped count.
-- some times drop can be due to buffer size
-- so if you want to decrease the drop count check increasing the buffersize using setsockopt function
First off, switch your operating system.
You need a reliable, network oriented operating system. Not some pink fluffy "ease of use" with "security" functionality enabled. NetBSD or Gentoo/ArchLinux (the bare installations, not the GUI kitted ones).
Start a simultaneous tcpdump on a network tap and capture the traffic you're supposed to receive along side of your program and compare the results.
There's no efficient way to check if you've received all the packets you intended to on the receiving end since the packets might be dropped on a lower level than you anticipate.
Also this is a question for Unix # StackOverflow, there's no programming here what I can see, at least there's no code.
The only certain way to verify packet drops is to have a much more beefy sender (perhaps a farm of machines that send packets) to a single client, record every packet sent to your reciever. Have the statistical data analyzed and compared against your senders and see how much you dropped.
The cheaper way is to buy a network tap or even more ad-hoc enable port mirroring in your switch if possible. This enables you to dump as much traffic as possible into a second machine.
This will give you a more accurate result because your application machine will be busy as it is taking care of incoming traffic and processing it.
Further more, this is why network taps are effective because they split the communication up into two channels, the receiving and sending directions of your traffic if you will. This enables you to capture traffic on two separate machines (also using tcpdump, but instead of a mirrored port, you get a more accurate traffic mirroring).
So either use port mirroring
Or you buy one of these:

Update TCP/IP checksum in headers

I am trying to modify the source and destination address in the IP header manually in kernel when sending the packet. After that, I need to recalculate both IP checksum and TCP checksum.
I am doing it in the following way.
iph = ip_hdr(skb);
iph->saddr = mysaddr;
iph->daddr = mydaddr;
tcph= tcp_hdr(skb);
__tcp_v4_send_check(skb, iph->saddr, iph->daddr);
iph->tot_len = htons(skb->len);
ip_send_check(iph);
But at the receiver, the checksum always fails at TCP layer while it can pass IP layer.
I did much debugs, and found that during normal process, when the packet arrives, the skb->ip_summed is generally CHECKSUM_UNNECESSARY, but if I do the modification at the sender, then this value will be CHECKSUM_NONE when arriving at the receiver.
Can anybody give me some suggestion? Thanks.
As for CHECKSUM_UNNECESSARY and CHECKSUM_NONE, too much details about them, can not describe them in simple word, better read:
<<Understanding.Linux.Network.Internals >> / 19.1.1.2. sk_buff structure.
Post my guess here (hope it helpful):
Since at sender side, you see generally CHECKSUM_UNNECESSARY, this might means your network card is probably able to do checksum verification and computing checksum in hardware.
By default, if TCP stack seeing your network card has this hardware capability, it would not bother computing checksum itself in software. But set CHECKSUM_PARTIAL in egress packet, so that network interface driver would instruct its hardware to computing checksum.
In your case, you compute the checksum by yourself, and should set CHECKSUM_NONE, so network interface would not compute the checksum again for you ( if doing so, break your checksum of modified packet ).
you can just use any packet capture tool (like wireshark) with a hub connecting between sender and receiver, to see if the checksum in captured packets are broken.

tcpdump: capture outgoing packets on virtual interfaces that has an unknown link type to libpcap?

In the system I am testing right now, it has a couple of virtual L2 devices chained together to add our own L2.5 headers between Eth headers and IP headers. Now when I use
tcpdump -xx -i vir_device_1
, it actually shows the SLL header with IP header. How do I capture the full packet that is actually going out of the vir_device_1, i.e. after the ndo_start_xmit() device call?
How do I capture the full packet that is actually going out of the vir_device_1, i.e. after the ndo_start_xmit() device call?
Either by writing your own code to directly use a PF_PACKET/SOCK_RAW socket (you say "SLL header", so this is presumably Linux), or by:
making sure you've assigned a special ARPHRD_ value for your virtual interface;
using one of the DLT_USERn values for your special set of headers, or asking tcpdump-workers#lists.tcpdump.org for an official DLT_ value to be assigned for them;
modifying libpcap to map that ARPHRD_ value to the DLT_ value you're using;
modifying tcpdump to handle that DLT_ value;
if necessary, modifying other programs that would capture on that interface or read capture files as written by tcpdump on that interface to handle that value as well.
Note that the DLT_USERn values are specifically reserved for private use, and no official versions of libpcap, tcpdump, or Wireshark will ever assign them for their own use (i.e., if you use a DLT_USERn value, don't bother contributing patches to assign that value to your type of headers, as they won't be accepted; other people may already be using it for their own special headers, and that must continue to be supported), so you'll have to maintain the modified versions of libpcap, tcpdump, etc. yourself if you use one of those values rather than getting an official value assigned.
Thanks Guy Harris for providing very helpful answers to my original question!
I am adding this as an answer/note to a follow up question I asked in the comments.
Basically my question was what is the status of the packet received by PF_PACKET/SOCK_RAW.
For an software device(no queue), dev_queue_xmit() will call dev_hard_start_xmit(skb, dev) to start transmitting skb buffer. This function calls dev_queue_xmit_nit() before it calls dev->ops->ndo_start_xmit(skb,dev), which means the packet PF_PACKET sees is at the state before any changes made in ndo_start_xmit().

Resources