EADDRNOTAVAIL even after using IP_FREEBIND? - linux

I was under the impression that under Linux you could bind to a non-local address as long as you set the IP_FREEBIND socket option, but that's not the behavior I'm seeing:
$ sudo strace -e 'trace=%network' ...
...
socket(AF_INET, SOCK_RAW, IPPROTO_UDP) = 5
setsockopt(5, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
setsockopt(5, SOL_SOCKET, SO_NO_CHECK, [1], 4) = 0
setsockopt(5, SOL_IP, IP_HDRINCL, [1], 4) = 0
setsockopt(5, SOL_IP, IP_FREEBIND, [1], 4) = 0
bind(5, {sa_family=AF_INET, sin_port=htons(abcd), sin_addr=inet_addr("w.x.y.z")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
...
I also set the ip_nonlocal_bind setting, just to be certain, and I get the same results.
$ sysctl net.ipv4.ip_nonlocal_bind
net.ipv4.ip_nonlocal_bind = 1

Unfortunately, it seems that it is not possible to bind a raw IP socket to a non-local, non-broadcast and non-multicast address, regardless of IP_FREEBIND. Since I see inet_addr("w.x.y.z") in your strace output, I assume that this is exactly what you're trying to do and w.x.y.z is a non-local unicast address, thus your bind syscall fails.
This seems in accordance with man 7 raw:
A raw socket can be bound to a specific local address using the
bind(2) call. If it isn't bound, all packets with the specified
IP protocol are received. In addition, a raw socket can be bound
to a specific network device using SO_BINDTODEVICE; see socket(7).
Indeed, looking at the kernel source code, in raw_bind() we can see the following check:
ret = -EADDRNOTAVAIL;
if (addr->sin_addr.s_addr && chk_addr_ret != RTN_LOCAL &&
chk_addr_ret != RTN_MULTICAST && chk_addr_ret != RTN_BROADCAST)
goto out;
Also, note that .sin_port must be 0. The .sin_port field for raw sockets is used to select a sending/receiving IP protocol (not a port, since we are at level 3 and ports do not exist). As the manual states, from Linux 2.2 onwards you cannot select a sending protocol through .sin_port anymore, the sending protocol is the one set when creating the socket.

Related

Linux netlink mutlicast routing updates

My application should get netlink multicast route updates from kernel.
I did some research and found mutlicast uses different family:RTNL_FAMILY_IPMR
and group is RTMGRP_IPV4_MROUTE.
However if I use:
sfd = socket (AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
snl.nl_groups |= RTMGRP_IPV4_MROUTE
I dont get any updates.
But
sfd = socket (AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
snl.nl_family = RTNL_FAMILY_IPMR;
snl.nl_groups |= RTMGRP_IPV4_MROUTE;
This give bind error", bind: Invalid argument
sfd = socket (RTNL_FAMILY_IPMR, SOCK_RAW, NETLINK_ROUTE);
This give "Address family not supported by protocol" error
I'm not sure how to get updates from kernel for mutlicast routes.
copy-paste from earlier project I have:
struct sockaddr_nl naddr;
netlinkfd = socket (AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
naddr.nl_family = AF_NETLINK;
naddr.nl_groups = (1 << (RTNLGRP_LINK - 1)) |
(1 << (RTNLGRP_IPV4_ROUTE - 1)) |
(1 << (RTNLGRP_IPV6_ROUTE - 1)) |
(1 << (RTNLGRP_IPV4_IFADDR - 1)) |
(1 << (RTNLGRP_IPV6_IFADDR -1 ));
if (bind (netlinkfd, (struct sockaddr *)&naddr, sizeof (naddr)))
{
error_foo();
return;
}
This one works for me receiving link, ip and routing table in general. (pushing me all the changes from this point of - if I want to receive the current status, I need to request them aswell). Try to have both ROUTE and MROUTE since you want multicast routing tables, but they might be merged into the normal routing table

netfilter-like kernel module to get source and destination address

I read this guide to write a kernel module to do simple network filtering.
First, I have no idea of what below text this means, and what's the difference between inbound and outbound data packet(by transportation layer)?
When a packet goes in from wire, it travels from physical layer, data
link layer, network layer upwards, therefore it might not go through
the functions defined in netfilter for skb_transport_header to work.
Second, I hate magic numbers, and I want to replace the 20 (the length of typical IP header) with any function from the linux kernel's utilities(source file).
Any help will be appreciated.
This article is a little outdated now. Text that you don't understand is only applicable to kernel versions below 3.11.
For new kernels (>= 3.11)
If you are sure that your code will only be used with kernels >= 3.11, you can use next code for both input and output packets:
udp_header = (struct udphdr *)skb_transport_header(skb);
Or more elegant:
udp_header = udp_hdr(skb);
It's because transport header is already set up for you in ip_rcv():
skb->transport_header = skb->network_header + iph->ihl*4;
This change was brought by this commit.
For old kernels (< 3.11)
Outgoing packets (NF_INET_POST_ROUTING)
In this case .transport_header field set up correctly in sk_buffer, so it points to actual transport layer header (UDP/TCP). So you can use code like this:
udp_header = (struct udphdr *)skb_transport_header(skb);
or better looking (but actually the same):
udp_header = udp_hdr(skb);
Incoming packets (NF_INET_PRE_ROUTING)
This is the tricky part.
In this case the .transport_header field is not set to the actual transport layer header (UDP or TCP) in sk_buffer structure (that you get in your netfilter hook function). Instead, .transport_header points to IP header (which is network layer header).
So you need to calculate address of transport header by your own. To do so you need to skip IP header (i.e. add IP header length to your .transport_header address). That's why you can see next code in the article:
udp_header = (struct udphdr *)(skb_transport_header(skb) + 20);
So 20 here is just the length of IP header.
It can be done more elegant in this way:
struct iphdr *iph;
struct udphdr *udph;
iph = ip_hdr(skb);
/* If transport header is not set for this kernel version */
if (skb_transport_header(skb) == (unsigned char *)iph)
udph = (unsigned char *)iph + (iph->ihl * 4); /* skip IP header */
else
udph = udp_hdr(skb);
In this code we use an actual IP header size (which is iph->ihl * 4, in bytes) instead of magic number 20.
Another magic number in the article is 17 in next code:
if (ip_header->protocol == 17) {
In this code you should use IPPROTO_UDP instead of 17:
#include <linux/udp.h>
if (ip_header->protocol == IPPROTO_UDP) {
Netfilter input/output packets explanation
If you need some reference about difference between incoming and outgoing packets in netfilter, see the picture below.
Details:
[1]: Some useful code from GitHub
[2]: "Linux Kernel Networking: Implementation and Theory" by Rami Rosen
[3]: This answer may be also useful

tcp keepalive - Protocol not available?

I'm trying to set tcp keepalive but in doing so I'm seeing the error
"Protocol not available"
int rc = setsockopt(s, SOL_SOCKET, TCP_KEEPIDLE, &keepalive_idle, sizeof(keepalive_idle));
if (rc < 0)
printf("error setting keepalive_idle: %s\n", strerror(errno));
I'm able to turn on keepalive, set keepalive interval and count but keepalive idle which is keepalive time is throwing that error and I never see any keepalive packets being transmitted/received either with wireshark and the filter tcp.analysis.keep_alive or with tcpdump
sudo tcpdump -vv "tcp[tcpflags] == tcp-ack and less 1"
Is there a kernel module that needs to be loaded or something? Or are you no longer able to override the global KEEPIDLE time.
By the way the output of
matt#devpc:~/ sysctl net.ipv4.tcp_keepalive_time net.ipv4.tcp_keepalive_probes net.ipv4.tcp_keepalive_intvl
net.ipv4.tcp_keepalive_time = 7200
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_intvl = 75
In an application that I coded, the following works:
setsockopt(*sfd, SOL_SOCKET, SO_KEEPALIVE,(char *)&enable_keepalive, sizeof(enable_keepalive));
setsockopt(*sfd, IPPROTO_TCP, TCP_KEEPCNT, (char *)&num_keepalive_strobes, sizeof(num_keepalive_strobes));
setsockopt(*sfd, IPPROTO_TCP, TCP_KEEPIDLE, (char *)&keepalive_idle_time_secs, sizeof(keepalive_idle_time_secs));
setsockopt(*sfd, IPPROTO_TCP, TCP_KEEPINTVL, (char *)&keepalive_strobe_interval_secs, sizeof(keepalive_strobe_interval_secs));
Try changing SOL_SOCKET to IPPROTO_TCP for TCPKEEPIDLE.
There a very handy lib that can help you, it's called libkeepalive : http://libkeepalive.sourceforge.net/
It can be used with LD_PRELOAD in order to enable and control keep-alive on all TCP sockets. You can also override keep-alive settings with environment variables.
I tried to run a tcp server with it:
KEEPIDLE=5 KEEPINTVL=5 KEEPCNT=100 LD_PRELOAD=/usr/lib/libkeepalive.so nc -l -p 4242
Then I connected a client:
nc 127.0.0.1 4242
And I visualized the traffic with Wireshark: the keep-alive packets began exactly after 5 seconds of inactivity (my system wide setting is 75). Therefore it means that it's possible to override the system settings.
Here is how libkeepalive sets TCP_KEEPIDLE:
if((env = getenv("KEEPIDLE")) && ((optval = atoi(env)) >= 0)) {
setsockopt(s, SOL_TCP, TCP_KEEPIDLE, &optval, sizeof(optval));
}
Looks like they use SOL_TCP instead of SOL_SOCKET.

Reduce TCP maximum segment size (MSS) in Linux on a socket

In a special application in which our server needs to update firmware of low-on-resource sensor/tracking devices we encountered a problem in which sometimes data is lost in the
remote devices (clients) receiving packets of the new firmware. The connection is TCP/IP over
GPRS network. The devices use SIM900 GSM chip as a network interface.
The problems possibly come because of the device receiving too much data. We tried reducing the
traffic by sending packages more rarely but sometimes the error still occured.
We contacted the local retailer of the SIM900 chip who is also responsible for giving technical support and possibly contacting the chinese manufacturer (simcom) of the chip. They said that at first we should try to reduce the TCP MSS (Maximum Segment Size) of our connection.
In our server I did the following:
static int
create_master_socket(unsigned short master_port) {
static struct sockaddr_in master_address;
int master_socket = socket(AF_INET,SOCK_STREAM,0);
if(!master_socket) {
perror("socket");
throw runtime_error("Failed to create master socket.");
}
int tr=1;
if(setsockopt(master_socket,SOL_SOCKET,SO_REUSEADDR,&tr,sizeof(int))==-1) {
perror("setsockopt");
throw runtime_error("Failed to set SO_REUSEADDR on master socket");
}
master_address.sin_family = AF_INET;
master_address.sin_addr.s_addr = INADDR_ANY;
master_address.sin_port = htons(master_port);
uint16_t tcp_maxseg;
socklen_t tcp_maxseg_len = sizeof(tcp_maxseg);
if(getsockopt(master_socket, IPPROTO_TCP, TCP_MAXSEG, &tcp_maxseg, &tcp_maxseg_len)) {
log_error << "Failed to get TCP_MAXSEG for master socket. Reason: " << errno;
perror("getsockopt");
} else {
log_info << "TCP_MAXSEG: " << tcp_maxseg;
}
tcp_maxseg = 256;
if(setsockopt(master_socket, IPPROTO_TCP, TCP_MAXSEG, &tcp_maxseg, tcp_maxseg_len)) {
log_error << "Failed to set TCP_MAXSEG for master socket. Reason: " << errno;
perror("setsockopt");
} else {
log_info << "TCP_MAXSEG: " << tcp_maxseg;
}
if(getsockopt(master_socket, IPPROTO_TCP, TCP_MAXSEG, &tcp_maxseg, &tcp_maxseg_len)) {
log_error << "Failed to get TCP_MAXSEG for master socket. Reason: " << errno;
perror("getsockopt");
} else {
log_info << "TCP_MAXSEG: " << tcp_maxseg;
}
if(bind(master_socket, (struct sockaddr*)&master_address,
sizeof(master_address))) {
perror("bind");
close(master_socket);
throw runtime_error("Failed to bind master_socket to port");
}
return master_socket;
}
Running the above code results in:
I0807 ... main.cpp:267] TCP_MAXSEG: 536
E0807 ... main.cpp:271] Failed to set TCP_MAXSEG for master socket. Reason: 22 setsockopt: Invalid argument
I0807 ... main.cpp:280] TCP_MAXSEG: 536
As you may see, the problem in the second line of the output: setsockopt returns "Invalid argument".
Why does this happen? I read about some constraints in setting TCP_MAXSEG but I did not encounter any report on such a behaviour as this.
Thanks,
Dennis
In addition to xaxxon's answer, just wanted to note my experience with trying to force my Linux to send only maximum TCP segments of a certain size (lower than what they normally are):
The easiest way I found to do so, was to use iptables:
sudo iptables -A INPUT -p tcp --tcp-flags SYN,RST SYN --destination 1.1.1.1 -j TCPMSS --set-mss 200
This overwrites the remote incoming SYN/ACK packet on an outbound connection, and forces the MSS to a specific value.
Note1: You do not see this in wireshark, since wireshark capture before this happens.
Note 2: Iptables does not allow you to -increase- the MSS, just lower it
Alternatively, I also tried setting the socket option TCP_MAXSEG, like dennis had done. After taking the fix from xaxxon, this also worked.
Note: You should read the MSS value after the connection has been set up. Otherwise it returns the default value, which put me (and dennis) on the wrong track.
Now finally, I also ran into a number of other things:
I ran into TCP-offloading issues, where despite my MSS being set correctly, the frames being sent were still shown by wireshark as too big. You can disable this feature by : sudo ethtool -K eth0 tx off sg off tso off. This took me a long time to figure out.
TCP has lots of fancy things like MTU path discovery, which actually try to dynamically increase the MSS. Fun and cool, but confusing obviously. I did not have issues with it though in my tests
Hope this helps someone trying to do the same thing one day.
Unless otherwise noted, optval is a pointer to an int.
but you're using a u_int16. I don't see anything saying that this parameter isn't an int.
edit: Yeah, here is the source code and you can see:
637 if (optlen < sizeof(int))
638 return -EINVAL;

Crafting an ICMP packet inside a Linux kernel Module

I'm tring to experiment with the ICMP protocol and have created a kernel-module for linux that analyses ICMP packet ( Processes the packet only if if the ICMP code field is a magic number ) . Now to test this module , i have to create a an ICMP packet and send it to the host where this analysing module is running . In fact it would be nice if i could implement it the kernel itself (as a module ) . I am looking for something like a packetcrafter in kernel , I googled it found a lot of articles explaining the lifetime of a packet , rather than tutorials of creating it . User space packetcrafters would be my last resort, that too those which are highly flexible like where i'll be able to set ICMP code etc . And I'm not wary of kernel panics :-) !!!!! Any packet crafting ideas are welcome .
Sir, I strongly advice you against using the kernel module to build ICMP packets.
You can use user-space raw-sockets to craft ICMP packets, even build the IP-header itself byte by byte.
So you can get as flexible as it can get using that.
Please, take a look at this
ip = (struct iphdr*) packet;
icmp = (struct icmphdr*) (packet + sizeof(struct iphdr));
/*
* here the ip packet is set up except checksum
*/
ip->ihl = 5;
ip->version = 4;
ip->tos = 0;
ip->tot_len = sizeof(struct iphdr) + sizeof(struct icmphdr);
ip->id = htons(random());
ip->ttl = 255;
ip->protocol = IPPROTO_ICMP;
ip->saddr = inet_addr(src_addr);
ip->daddr = inet_addr(dst_addr);
if ((sockfd = socket(AF_INET, SOCK_RAW, IPPROTO_ICMP)) == -1)
{
perror("socket");
exit(EXIT_FAILURE);
}
/*
* IP_HDRINCL must be set on the socket so that
* the kernel does not attempt to automatically add
* a default ip header to the packet
*/
setsockopt(sockfd, IPPROTO_IP, IP_HDRINCL, &optval, sizeof(int));
/*
* here the icmp packet is created
* also the ip checksum is generated
*/
icmp->type = ICMP_ECHO;
icmp->code = 0;
icmp->un.echo.id = 0;
icmp->un.echo.sequence = 0;
icmp->checksum = 0;
icmp-> checksum = in_cksum((unsigned short *)icmp, sizeof(struct icmphdr));
ip->check = in_cksum((unsigned short *)ip, sizeof(struct iphdr));
If this part of code looks flexible enough, then read about raw sockets :D maybe they're the easiest and safest answer to your need.
Please check the following links for further info
http://courses.cs.vt.edu/~cs4254/fall04/slides/raw_6.pdf
http://www.cs.binghamton.edu/~steflik/cs455/rawip.txt
http://cboard.cprogramming.com/networking-device-communication/107801-linux-raw-socket-programming.html a very nice topic, pretty useful imo
You can try libcrafter for packet crafting on user space. Is very easy to use! The library is able to craft or decode packets of most common networks protocols, send them on the wire, capture them and match requests and replies.
For example, the next code craft and send an ICMP packet:
string MyIP = GetMyIP("eth0");
/* Create an IP header */
IP ip_header;
/* Set the Source and Destination IP address */
ip_header.SetSourceIP(MyIP);
ip_header.SetDestinationIP("1.2.3.4");
/* Create an ICMP header */
ICMP icmp_header;
icmp_header.SetType(ICMP::EchoRequest);
icmp_header.SetIdentifier(RNG16());
/* Create a packet... */
Packet packet = ip_header / icmp_header;
packet.Send();
Why you want to craft an ICMP packet on kernel-space? Just for fun? :-p
Linux kernel includes a packet generator tool pktgen for testing the network with pre-configured packets. Source code for this module resides in net/core/pktgen.c

Resources