Why might Wireshark and NodeJS disagree about a packet's contents? - node.js

I'm working with raw-socket (a node module for sending raw data out on the network) and playing with their Ping example.
I have Wireshark set up to monitor traffic. I can see my ICMP packet go out, and a response comes back.
Here's where things get strange.
Wireshark shows the following packet:
IP: 4500003c69ea00004001e2fec0a85647c0a85640
ICMP: 00004b5200010a096162636465666768696a6b6c6d6e6f7071727374757677616263646566676869
However, the node event handler that fires when data comes in is showing:
IP: 4500280069ea00004001e2fec0a85647c0a85640
ICMP: 00004b5200010a096162636465666768696a6b6c6d6e6f7071727374757677616263646566676869
The ICMP components match. However, bytes 0x02 and 0x03 (the Length bytes) differ.
Wireshark shows 0x003c or 60 bytes (as expected).
Node shows 0x2800 or 10kB... which is not what is expected.
Notably, the checksum (bytes 0x0a and 0x0b) is the same in each case, although it's only valid for the Wireshark packet.
So, here's the question: what might lead to this discrepancy? I'm inclined to believe Wireshark is correct since 60 bytes is the right size for an ICMP reply, but why is Node wrong here?
OSX note
The docs for this module point out that, on OSX, it will try to use SOCK_DGRAM if SOCK_RAW is not permitted. I have tried this with that function disabled and using sudo and got the same responses as before.
Github issue
It looks like https://github.com/nospaceships/node-raw-socket/issues/60 is open for this very issue, but it remains unclear if this is a code bug or a usage problem...

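This is due to a FreeBSD bug (feature?), which macOS appears to share: before a received packet is handed to a raw socket, the kernel subtracts the length of the IP header from the IP total-length field and also flips the field to host byte order. Here the wire value 0x003c (60) minus the 20-byte header gives 40 (0x0028), which is stored as the bytes 28 00 on a little-endian host. Read as network byte order, that looks like 0x2800. The checksum is left untouched, which is why it still matches only the Wireshark version.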
https://cseweb.ucsd.edu//~braghava/notes/freebsd-sockets.txt
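To check that this explanation fits the two headers from the question, here is a minimal sketch that undoes the adjustment. The function name is my own, and it assumes a little-endian host and that the kernel applied exactly the two changes described above.

```python
import struct

def bsd_raw_ip_total_length(header: bytes) -> int:
    """Recover the on-wire total length from an IP header as delivered
    by a BSD-style raw socket (assumption: little-endian host)."""
    ihl_bytes = (header[0] & 0x0F) * 4               # IP header length in bytes
    (host_order_len,) = struct.unpack("<H", header[2:4])
    return host_order_len + ihl_bytes                # add the header length back

# The two IP headers quoted in the question
wire = bytes.fromhex("4500003c69ea00004001e2fec0a85647c0a85640")
node = bytes.fromhex("4500280069ea00004001e2fec0a85647c0a85640")

wire_len = struct.unpack("!H", wire[2:4])[0]         # network byte order
node_len = bsd_raw_ip_total_length(node)
print(wire_len, node_len)
```

Both values come out as 60, so the Node buffer is the same packet with the length field rewritten by the kernel.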

Related

ICMP timestamps added to ping echo requests in linux. How are they represented to bytes? Or how to convert UNIX Epoch time to that timestamp format?

I ran into this while writing a ping program in C and used Wireshark to dig into the problem. On Linux, ping adds an 8-byte timestamp field (a TOD timestamp) after the ICMP header of each echo request. Ping on Windows doesn't add that timestamp; I think it does the time calculations locally instead. My question is: how do you convert time from Unix epoch format (the number of seconds since 1970, which you get from the 'time' function in C) to that 8-byte TOD format? I got to this question because, after quite some research, my ping.c program finally sends the ICMP echo request to the destination. A test with 2 hosts showed that the request arrives but no echo reply comes back, while the native Linux ping works properly. I can only imagine 2 possible causes:
I didn't fill in the ICMP and IP header fields correctly. To be honest, I doubt this possibility, because Wireshark shows that the message arrives at the destination and is recognized as an echo request; it just doesn't trigger an echo reply. If this is the cause, though, the only thing I can think of is that timestamp, which I don't know how to convert to TOD form so that it occupies at most 8 bytes.
There might be a firewall at the destination, or some other system-dependent factor.
https://www.ibm.com/docs/en/zos/2.2.0?topic=problems-using-ping-command
The Ping command does not use the ICMP/ICMPv6 header sequence number field (icmp_seq or icmp6_seq) to correlate requests with ICMP/ICMPv6 Echo Replies. Instead, it uses the ICMP/ICMPv6 header identifier field (icmp_id or icmp6_id) plus an 8-byte TOD time stamp field to correlate requests with replies. The TOD time stamp is the first 8-bytes of data after the ICMP/ICMPv6 header.
Finally, to repeat the initial question:
How do you convert the UNIX Epoch time to the TOD timestamp form which linux ping adds at the end of the ICMP header/beginning of data field?
I found a useful explanation here, though I don't think it is sufficient:
https://towardsdatascience.com/3-tips-to-handle-timestamps-in-c-ad5b36892294
I should probably mention I'm working on Ubuntu 20.04 (Focal Fossa).
I found a related post here. The book "Principles of Operation" is mentioned in the comments. I skimmed through it, but it seems to be generally lower level than C, so if anyone knows another place or way to answer the question, that would be better.
Background
Including the UNIX timestamp of the time of transmission in the first data bytes of the ICMP Echo message is a trick/optimization the original ping by Mike Muuss used to avoid keeping track of it locally. It exploits the following guarantee made by RFC 792's Echo or Echo Reply Message description:
The data received in the echo message must be returned in the echo reply message.
Many (if not all) BSD ping implementations are based on Mike Muuss' original implementation and therefore kept this behavior. On Linux systems, ping is typically provided by iputils, GNU inetutils, or Busybox. All exhibit the same behavior. fping is a notable exception, which stores a mapping from target host and sequence number to timestamp locally.
Implementations typically store the timestamp in the sender's native precision and byte order, rather than in a well-defined precision and the conventional network byte order (big endian) normally used for data transmitted over the network. This works because the timestamp is only meant to be interpreted by the original sender; everyone else should just treat it as an opaque stream of bytes.
Because this is so common however, the Wireshark ICMP dissector (as of v3.6.2) tries to be clever and heuristically decode it nonetheless, if the first 8 data bytes look like a struct timeval timestamp in 32-bit precision for seconds and microseconds in either byte order. Please note that if the sender was actually using big endian 64-bit precision, this will fail and if it was using little endian 64-bit precision, it will truncate the microseconds before the Epochalypse and fail after that.
Obtaining and serializing epoch time
To answer your actual question:
How do you convert the UNIX Epoch time to the TOD timestamp form which linux ping adds at the end of the ICMP header/beginning of data field?
Most implementations use the obsolescent gettimeofday(2) instead of the newer clock_gettime(2). The following snippet is taken from iputils:
struct icmphdr *icp /* = pointer to ICMP packet */;
struct timeval tmp_tv;
gettimeofday(&tmp_tv, NULL);
memcpy(icp + 1, &tmp_tv, sizeof(tmp_tv));
memcpy from a temporary variable instead of directly passing the icp + 1 as target to gettimeofday is used to avoid potential improper alignment, effective type and strict aliasing violation issues.
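For illustration only, here is the same idea sketched in Python rather than C. The function names are hypothetical, and the default format "@ll" assumes that struct timeval is two native C longs, as on 64-bit Linux; a sender with a different layout would pass a different struct format.

```python
import struct
import time

def pack_timeval(now: float, fmt: str = "@ll") -> bytes:
    """Serialize an epoch time as (seconds, microseconds) in the given layout."""
    sec = int(now)
    usec = int((now - sec) * 1_000_000)
    return struct.pack(fmt, sec, usec)

def rtt_ms(reply_payload: bytes, now: float, fmt: str = "@ll") -> float:
    """Compute the round-trip time from the timestamp echoed back in a reply."""
    sec, usec = struct.unpack_from(fmt, reply_payload)
    return (now - (sec + usec / 1_000_000)) * 1000.0

# Payload of an echo request: timestamp first, then arbitrary padding bytes
payload = pack_timeval(time.time()) + bytes(range(16))
```

Because RFC 792 requires the data to be returned unchanged, the sender can call rtt_ms on the reply payload without having stored anything locally.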
I appreciate the clear answers. I actually managed to solve the problem and make the ping function work, and I have to say the problem was certainly not the timestamp because, yes indeed, I was talking about the "Echo or Echo Reply Message". One way of implementing ping is by using the ICMP Echo and Echo Reply messages. When I asked this question I was stuck, probably because I wasn't clearly separating the main aspects of the problem. So I started to examine the packet sent by the native ping on my Ubuntu 20.04 (Focal Fossa) with Wireshark, hoping to get a better grasp of how to fill the IP and ICMP headers of the packet sent by my program. This question simply arose from the fact that, in this version of Ubuntu, ping adds a timestamp (indeed of 32 bits) after the end of the ICMP header, basically in the data/payload section. As a matter of fact, I also used Wireshark on Windows 10 and saw that no timestamp is added after the header. So yes, it might be about different versions of the program being used.
The main point I want to emphasize is that my final version of ping has nothing to do with any timestamps. So yes, they are not a crucial aspect for ping to work.

Linux UDP short broadcast with longer response expected

The receiving platform is Linux-2.6.37_DM8127_IPNC_3.80.00 (TI's DaVinci processor).
In the course of a discovery process, I make a short broadcast (Win7) - the expected response to which is several bytes longer than the length of the broadcast.
I always get a partial response - having exactly the length of the broadcast. This happens regardless of the computer, firewall being on/off, antivirus etc.
If I pad the broadcast to the length of the expected response, I get the desired response.
Is this some LINUX feature - or should I keep on searching for some bug in my code?

Finding out the number of dropped packets in raw sockets

I am developing a program that sniffs network packets using a raw socket (AF_PACKET, SOCK_RAW) and processes them in some way.
I am not sure whether my program runs fast enough and manages to capture all packets on the socket. I am worried that the receive buffer for this socket occasionally gets full (due to traffic bursts) and some packets are dropped.
How do I know if packets were dropped due to lack of space in the socket's receive buffer?
I have tried running ss -f link -nlp.
This outputs the number of bytes that are currently stored in the receive buffer for that socket, but I cannot tell if any packets were dropped.
I am using Ubuntu 14.04.2 LTS (GNU/Linux 3.13.0-52-generic x86_64).
Thanks.
I was having a similar problem. I knew that tcpdump was able to generate statistics about packet drops, so I tried to figure out how it did that. By looking at the code of tcpdump, I noticed that it does not generate those statistics by itself; it uses the libpcap library to get them. libpcap, in turn, gets those statistics through the PACKET_STATISTICS socket option defined in the if_packet.h header (at least I think so, but I'm no C expert).
Therefore, I saw only two solutions to the problem:
Interact somehow with the Linux header files from my Python script to get the packet statistics, which seemed a bit complicated.
Use pypcap, the Python version of libpcap, to get that information.
Since I had no clue how to do the first thing, I implemented the second option. Here is an example of how to get packet statistics using pypcap and how to get the packet data using dpkt:
import pcap
import dpkt
import socket

# Open eth0 for live capture; immediate mode delivers packets as they arrive
pc = pcap.pcap(name="eth0", timeout_ms=10000, immediate=True)

def packet_handler(ts, pkt):
    # Print packet statistics:
    # (packets received, packets dropped, packets dropped by interface)
    print(pc.stats())
    # Example packet parsing using dpkt
    eth = dpkt.ethernet.Ethernet(pkt)
    if eth.type != dpkt.ethernet.ETH_TYPE_IP:
        return  # skip non-IP frames
    ip = eth.data
    layer4 = ip.data
    ipsrc = socket.inet_ntoa(ip.src)
    ipdst = socket.inet_ntoa(ip.dst)

# Capture until interrupted, calling packet_handler for every packet
pc.loop(0, packet_handler)
The tpacket_stats structure is defined in the linux/if_packet.h header file.
Create a variable of type struct tpacket_stats and pass it to getsockopt with level SOL_PACKET and option PACKET_STATISTICS; this gives the counts of packets received and dropped.
-- sometimes drops are due to the buffer size
-- so if you want to decrease the drop count, try increasing the buffer size using the setsockopt function
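A rough sketch of the same getsockopt call from Python, for those who don't want to go through libpcap. The numeric constants are the Linux values from linux/socket.h and linux/if_packet.h, hard-coded here because I'm not sure the socket module exports them; the function names are my own.

```python
import socket
import struct

SOL_PACKET = 263         # from <linux/socket.h>
PACKET_STATISTICS = 6    # from <linux/if_packet.h>

# struct tpacket_stats { unsigned int tp_packets; unsigned int tp_drops; };
TPACKET_STATS = struct.Struct("II")

def parse_tpacket_stats(raw: bytes) -> dict:
    """Decode the raw bytes returned by getsockopt(PACKET_STATISTICS)."""
    packets, drops = TPACKET_STATS.unpack(raw[:TPACKET_STATS.size])
    return {"packets": packets, "drops": drops}

def read_packet_stats(sock: socket.socket) -> dict:
    """Query an AF_PACKET socket; per packet(7), reading resets the counters."""
    raw = sock.getsockopt(SOL_PACKET, PACKET_STATISTICS, TPACKET_STATS.size)
    return parse_tpacket_stats(raw)

def grow_receive_buffer(sock: socket.socket, size_bytes: int) -> int:
    """Request a larger receive buffer and return the size the kernel granted."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size_bytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
```

Usage would be something like read_packet_stats(sock) on a socket created with socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(0x0003)), which needs root. Since the counters reset on every read, keep a running total if you poll periodically.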
First off, switch your operating system.
You need a reliable, network oriented operating system. Not some pink fluffy "ease of use" with "security" functionality enabled. NetBSD or Gentoo/ArchLinux (the bare installations, not the GUI kitted ones).
Start a simultaneous tcpdump on a network tap and capture the traffic you're supposed to receive along side of your program and compare the results.
There's no efficient way to check if you've received all the packets you intended to on the receiving end since the packets might be dropped on a lower level than you anticipate.
Also, this is a question for Unix # StackOverflow; there's no programming here that I can see, or at least no code.
The only certain way to verify packet drops is to have a much beefier sender (perhaps a farm of machines that send packets) target a single client, and to record every packet sent to your receiver. Then have the statistical data analyzed and compared against your senders to see how much you dropped.
The cheaper way is to buy a network tap or even more ad-hoc enable port mirroring in your switch if possible. This enables you to dump as much traffic as possible into a second machine.
This will give you a more accurate result because your application machine will be busy as it is taking care of incoming traffic and processing it.
Furthermore, this is why network taps are effective: they split the communication into two channels, the receiving and sending directions of your traffic, if you will. This enables you to capture traffic on two separate machines (also using tcpdump, but instead of a mirrored port, you get more accurate traffic mirroring).
So either use port mirroring or buy a network tap.

socket UDP recvfrom not working for large packets even if buffer given is enough on LINUX

I am calling the recvfrom API on a valid address, trying to read 9600 bytes of data. The buffer I have provided is 12 KB in size, but I am not even getting select read events.
Even though the recommended MTU size is 1.5 KB, I am able to send and receive packets of 4 KB.
I am using the Android NDK (Linux) for development.
Please help. Is there a socket option I have to set to read large buffers?
If you send a packet larger than the MTU, it will be fragmented. That is, it'll be broken up into smaller pieces, each which fits within the MTU. The problem with this is that if even one of those pieces is lost (quite likely on a cellular connection...), the entire packet will effectively disappear.
To determine whether this is the case you'll need to use a packet sniffer on one (or both) ends of the connection. Wireshark is a good choice on a PC end, or tcpdump on the android side (you'll need root). Keep in mind that home routers may reassemble fragmented packets - this means that if you're sniffing packets from inside a home router/firewall, you might not see any fragments arrive until all of them arrive at the router (and obviously if some are getting lost this won't happen).
A better option would be to simply ensure that you're always sending packets smaller than the MTU, of course. Fragmentation is almost never the right thing to be doing. Keep in mind that the MTU may vary at various hops along the path between server and client - you can either use the common choice of a bit less than 1500 (1400 ought to be safe), or try to probe for it by setting the MTU discovery flag on your UDP packets (via IP_MTU_DISCOVER) and always sending less than the value returned by getsockopt's IP_MTU option (including on retransmits!)
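A minimal sketch of that probing approach, in Python for brevity (the same options exist for C in the NDK). The numeric option values are the Linux ones from linux/in.h, hard-coded as an assumption; the function names are made up.

```python
import socket

# Linux values from <linux/in.h> (assumption: not valid on other systems)
IP_MTU_DISCOVER = 10
IP_PMTUDISC_DO = 2       # always set DF, never fragment locally
IP_MTU = 14

IPV4_HEADER = 20         # bytes, assuming no IP options
UDP_HEADER = 8

def max_payload_for_mtu(mtu: int) -> int:
    """Largest UDP payload that fits in a single unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - UDP_HEADER

def probe_path_mtu(host: str, port: int) -> int:
    """Return the kernel's currently known path MTU towards host."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
        sock.connect((host, port))   # IP_MTU is only valid on a connected socket
        return sock.getsockopt(socket.IPPROTO_IP, IP_MTU)
```

The value from probe_path_mtu starts out as the local interface MTU and only shrinks after the kernel has seen an ICMP "fragmentation needed" error, so it has to be re-read after a send fails with EMSGSIZE.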

UDP IP Fragmentation and MTU

I'm trying to understand some behavior I'm seeing in the context of sending UDP packets.
I have two little Java programs: one that transmits UDP packets, and the other that receives them. I'm running them locally on my network between two computers that are connected via a single switch.
The MTU setting (reported by /sbin/ifconfig) is 1500 on both network adapters.
If I send packets with a size < 1500, I receive them. Expected.
If I send packets with 1500 < size < 24258 I receive them. Expected. I have confirmed via wireshark that the IP layer is fragmenting them.
If I send packets with size > 24258, they are lost. Not Expected. When I run wireshark on the receiving side, I don't see any of these packets.
I was able to see similar behavior with ping -s.
ping -s 24258 hostA works but
ping -s 24259 hostA fails.
Does anyone understand what may be happening, or have ideas of what I should be looking for?
Both computers are running CentOS 5 64-bit. I'm using a 1.6 JDK, but I don't really think it's a programming problem, it's a networking or maybe OS problem.
Implementations of the IP protocol are not required to be capable of handling arbitrarily large packets. In theory, the maximum possible IP packet size is 65,535 octets, but the standard only requires that implementations support at least 576 octets.
It would appear that your host's implementation supports a maximum size much greater than 576, but still significantly smaller than the maximum theoretical size of 65,535. (I don't think the switch should be a problem, because it shouldn't need to do any defragmentation -- it's not even operating at the IP layer).
The IP standard further recommends that hosts not send packets larger than 576 bytes, unless they are certain that the receiving host can handle the larger packet size. You should maybe consider whether or not it would be better for your program to send a smaller packet size. 24,258 seems awfully large to me. I think there may be a possibility that a lot of hosts won't handle packets that large.
Note that these packet size limits are entirely separate from MTU (the maximum frame size supported by the data link layer protocol).
I found the following which may be of interest:
Determine the maximum size of a UDP datagram packet on Linux
Set the DF bit in the IP header and send continually larger packets to determine at what point a packet is fragmented, as per Path MTU Discovery. A packet that would need fragmentation should then result in an ICMP type 3 packet with code 4, indicating that the packet was too large to be sent without being fragmented.
Dan's answer is useful but note that after headers you're really limited to 65507 bytes.
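To spell out where 65507 comes from, here is the arithmetic as a small sketch. It assumes a 20-byte IPv4 header without options; fragment_count is a hypothetical helper that estimates how many fragments a datagram needs on a link with the given MTU.

```python
import math

IP_MAX_TOTAL_LENGTH = 65535   # the total-length field is 16 bits
IPV4_HEADER = 20              # bytes, assuming no IP options
UDP_HEADER = 8

# Largest UDP payload that fits in one IPv4 datagram
MAX_UDP_PAYLOAD = IP_MAX_TOTAL_LENGTH - IPV4_HEADER - UDP_HEADER

def fragment_count(udp_payload: int, mtu: int = 1500) -> int:
    """Number of IP fragments needed for a UDP datagram of the given payload size."""
    # Every fragment except the last carries a multiple of 8 bytes of IP payload
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil((UDP_HEADER + udp_payload) / per_fragment)

print(MAX_UDP_PAYLOAD)
```

With a 1500-byte MTU, a 1472-byte payload is the largest that still fits in a single fragment; every fragment beyond that is another chance to lose the whole datagram.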
