Specify source IP address for TCP socket when using Linux network device aliases - linux

For some specific networking tests, I've created a VLAN device, eth1.900, and a couple of aliases, eth1.900:1 and eth1.900:2.
eth1.900    Link encap:Ethernet  HWaddr 00:18:E7:17:2F:13
            inet addr:1.0.1.120  Bcast:1.0.1.255  Mask:255.255.255.0
eth1.900:1  Link encap:Ethernet  HWaddr 00:18:E7:17:2F:13
            inet addr:1.0.1.200  Bcast:1.0.1.255  Mask:255.255.255.0
eth1.900:2  Link encap:Ethernet  HWaddr 00:18:E7:17:2F:13
            inet addr:1.0.1.201  Bcast:1.0.1.255  Mask:255.255.255.0
When connecting to a server, is there a way to specify which of these aliases will be used? I can ping using the -I <ip> address option to select which alias to use, but I can't see how to do it with a TCP socket in code without using raw sockets, since I would also like to run without extra socket privileges, i.e. not running as root, if possible.
Unfortunately, even with root, SO_BINDTODEVICE doesn't work because the alias device name is not recognized:
printf("Bind to %s\n", devname);
if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE, (char*)devname, sizeof(devname)) != 0)
{
perror("SO_BINDTODEVICE");
return 1;
}
Output:
Bind to eth1.900:1
SO_BINDTODEVICE: No such device

Use getifaddrs() to enumerate all the interfaces and find the IP address for the interface you want to bind to. Then use bind() to bind to that IP address, before you call connect().
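The two steps above could look roughly like the following. This is only a sketch: the helper names ipv4_of_interface() and connect_from() are made up for illustration, and it assumes that getifaddrs() reports the alias label (such as eth1.900:1) in ifa_name on your system, which is worth checking before relying on it. If it does not, the alias address can be passed to connect_from() directly.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: find the first IPv4 address whose getifaddrs()
 * name equals ifname. Returns 0 and fills *out, or -1 if nothing matched. */
int ipv4_of_interface(const char *ifname, struct in_addr *out)
{
    struct ifaddrs *list, *ifa;
    int rc = -1;

    assert(ifname != NULL && out != NULL);
    if (getifaddrs(&list) != 0)
        return -1;
    for (ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
        if (ifa->ifa_addr == NULL || ifa->ifa_addr->sa_family != AF_INET)
            continue;
        if (strcmp(ifa->ifa_name, ifname) != 0)
            continue;
        *out = ((struct sockaddr_in *)ifa->ifa_addr)->sin_addr;
        rc = 0;
        break;
    }
    freeifaddrs(list);
    return rc;
}

/* Hypothetical helper: create a TCP socket, bind it to the source address
 * src (port 0 lets the kernel pick one), then connect to dst:port.
 * Returns the connected socket, or -1 on error. */
int connect_from(struct in_addr src, struct in_addr dst, unsigned short port)
{
    struct sockaddr_in local, remote;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0)
        return -1;

    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_addr = src;
    local.sin_port = 0;

    memset(&remote, 0, sizeof(remote));
    remote.sin_family = AF_INET;
    remote.sin_addr = dst;
    remote.sin_port = htons(port);

    /* bind() before connect() is what fixes the source address */
    if (bind(s, (struct sockaddr *)&local, sizeof(local)) != 0 ||
        connect(s, (struct sockaddr *)&remote, sizeof(remote)) != 0) {
        close(s);
        return -1;
    }
    return s;
}
```

Binding to an ordinary local address with port 0 needs no special privileges, so this should run as a normal user.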

Since a packet can't be sent out on an aliased interface anyway, it would make no sense to use SO_BINDTODEVICE on one. SO_BINDTODEVICE controls which device a packet is sent out from if routing cannot be used for this purpose (for example, if it's a raw Ethernet frame).

You don't show the definition of devname, but if it's a string pointer, e.g.:
char *devname = "eth1.900:1";
Then perhaps it's failing since you specify the argument size using sizeof devname, which would in this case be the same as sizeof (char *), i.e. typically 4 on a 32-bit system.
If setsockopt() expects to see the actual size of the argument, i.e. the length of the string, this could explain the issue since it's then perhaps just inspecting the first four characters and failing since the result is an invalid interface name.

Related

Kernel API to know IP address of interface

Is there any kernel side/space API to know the IP address of an interface, given its name?
I think you're looking for rtnetlink (man page)
Rtnetlink allows the kernel's routing tables to be read and altered.
It is used within the kernel to communicate between various
subsystems, though this usage is not documented here, and for
communication with user-space programs. Network routes, IP addresses,
link parameters, neighbor setups, queueing disciplines, traffic
classes and packet classifiers may all be controlled through
NETLINK_ROUTE sockets.
According to strace, it's the API that ip addr show dev XXX uses:
strace ip addr sh dev lo 2>&1 | grep sendmsg
sendmsg(4, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base={{len=48, type=RTM_GETLINK, flags=NLM_F_REQUEST, seq=1596838225, pid=0}, {ifi_family=AF_UNSPEC, ifi_type=ARPHRD_NETROM, ifi_index=0, ifi_flags=0, ifi_change=0}, [{{nla_len=8, nla_type=IFLA_EXT_MASK}, 9}, {{nla_len=7, nla_type=IFLA_IFNAME}, "lo"}]}, iov_len=48}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 48
sendmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base={{len=40, type=RTM_GETLINK, flags=NLM_F_REQUEST, seq=1596838225, pid=0}, {ifi_family=AF_UNSPEC, ifi_type=ARPHRD_NETROM, ifi_index=if_nametoindex("lo"), ifi_flags=0, ifi_change=0}, {{nla_len=8, nla_type=IFLA_EXT_MASK}, 9}}, iov_len=40}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 40
However, it looks like a non-trivial API, so if you don't need it often, it might be easier to just run ip addr sh dev XXX and parse the response.
Edit:
Looks like it's also possible using netdevice (man page), specifically, the SIOCGIFADDR ioctl:
SIOCGIFADDR, SIOCSIFADDR
Get or set the address of the device using ifr_addr. Setting the interface address is a privileged operation. For compatibility, only
AF_INET addresses are accepted or returned.
There's example code here

How to create a kernel module that can intercept all packets coming to/from a network interface

I have 2 port NIC on my system - eth0 and eth1 as seen by Linux.
I want to intercept all packets coming in/to eth0, send them out through eth1 to an external device connected to the same switch as eth1 is. So I need to slap on an additional header to make it reach the correct external device.
I know that there is a concept of network taps that both the transmit and receive code in the kernel send to, but how do I create one? Also, I want to capture not just IP but all Ethernet packets; I know NETFILTER_HOOK would have helped me get IPv4 packets.
This can be readily implemented with an rx_handler:
static rx_handler_result_t handle_frame(struct sk_buff **pskb)
{
    struct sk_buff *skb = *pskb;
    struct net_device *whereto_dev;

    skb = skb_share_check(skb, GFP_ATOMIC);
    if (unlikely(!skb))
        return RX_HANDLER_CONSUMED;
    *pskb = skb;

    whereto_dev = rcu_dereference(skb->dev->rx_handler_data);
    skb->dev = whereto_dev;
    return RX_HANDLER_ANOTHER; /* Do another round in receive path */
}
They are registered via netdev_rx_handler_register(slave_dev, handle_frame, whereto). See the bonding or my uman driver for example usage.
dev_add_pack would work too, but it seems that, apart from af_packet.c, all all-packet-catching users of dev_add_pack have been migrated to use rx_handlers, e.g. https://patchwork.ozlabs.org/patch/367236/. The patch's discussion suggests this might be more efficient.

Identifying the preferred IPv6 source address for an adapter

If you have an IPv6-enabled host that has more than one global-scope address, how can you programmatically identify the preferred address for bind()?
Example address list:
eth0 Link encap:Ethernet HWaddr 00:14:5e:bd:6d:da
inet addr:10.6.28.31 Bcast:10.6.28.255 Mask:255.255.255.0
inet6 addr: 2002:dce8:d28e:0:214:5eff:febd:6dda/64 Scope:Global
inet6 addr: fe80::214:5eff:febd:6dda/64 Scope:Link
inet6 addr: 2002:dce8:d28e::31/64 Scope:Global
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
On Solaris you can indicate a preferred address with an interface flag and it is available programmatically via SIOCGLIFCONF:
/usr/include/net/if.h:
#define IFF_PREFERRED 0x0400000000 /* Prefer as source address */
As listed in the interface list:
eri0: flags=2104841<UP,RUNNING,MULTICAST,DHCP,ROUTER,IPv6> mtu 1500 index 2
inet6 fe80::203:baff:fe4e:6cc8/10
eri0:1: flags=402100841<UP,RUNNING,MULTICAST,ROUTER,IPv6,PREFERRED> mtu 1500 index 2
inet6 2002:dce8:d28e::36/64
This is not portable to OSX, Linux, FreeBSD, or Windows though. Windows is let off easy, as it has completely useless, from an administrator's perspective, UUID-based adapter names (depending upon the Windows version).
For Linux this article details how the parameter preferred_lft, where lft is short for "lifetime", can be altered to weight the selection process by the kernel. This setting doesn't appear conveniently available in the results of SIOCGIFCONF or getifaddrs() though.
So I want to bind to eth0, eri0, or whatever available interface name. The choices are a bit stark:
1. Fail on adapter names resolving to multiple interfaces. I take this approach for handling multicast transports (OpenPGM) as the protocol MUST have one-only sending address.
2. Bind to everything. This is a cop out and would be unexpected to users.
3. Bind to the adapter with SO_BINDTODEVICE. This requires CAP_NET_RAW system capability on Linux which can be quite a cumbersome overhead for administrators.
4. Bind to the first IPv6 interface on the adapter. The ordering tends to be completely bogus.
5. Bind to the last interface. David Croft's article implies Linux does this, but is also a bit bogus.
6. Enumerate over every interface and create a new socket explicitly for each.
With option #6 I would expect you could usually be smarter and take the approach that if only a link-local scope address is available bind to that, otherwise bind to just the available global-link scope addresses.
When connecting to another host then RFC 3484 can be used, but as you can see all the choices are dependent upon matching the destination address:
1. Prefer same address. (i.e. destination is local machine)
2. Prefer appropriate scope. (i.e. smallest scope shared with the destination)
3. Avoid deprecated addresses.
4. Prefer home addresses.
5. Prefer outgoing interface. (i.e. prefer an address on the interface we're sending out of)
6. Prefer matching label.
7. Prefer public addresses.
8. Use longest matching prefix.
In some circumstances we can use #8 here, but in the interface example above both global-scope interfaces have a 64-bit prefix length.
RFC 3484 has the following pertinent lines:
The IPv6 addressing architecture [5] allows multiple unicast
addresses to be assigned to interfaces. These addresses may have
different reachability scopes (link-local, site-local, or global).
These addresses may also be "preferred" or "deprecated" [6].
The link being to RFC 2462, similarly expanded:
preferred address - an address assigned to an interface whose use by
upper layer protocols is unrestricted. Preferred addresses may
be used as the source (or destination) address of packets sent
from (or to) the interface.
But no methods to programmatically acquire this detail.
Props to Win32 API that exposes an ioctl SIO_ADDRESS_LIST_SORT that allows a developer to use not only RFC 3484 sorting but to take into consideration any system administrator overrides. Linux has /etc/gai.conf as used for RFC 3484 sorting in getaddrinfo() but no API for directly accessing the sorting. Solaris has the ipaddrsel command. OSX is following FreeBSD by adding ip6addrctl in 10.7.
edit: Some concerns with RFC 3484 sorting are listed and referred to in this additional IETF draft document:
https://datatracker.ietf.org/doc/html/draft-axu-addr-sel-01
Solaris, for example, creates new alias-interfaces for each new
address assigned to a physical interface. So if_index could also be
used to uniquely identify a source address specific routing table on
that platform. Other operating systems do not work the same way.
The author likes Solaris's approach of giving each additional IPv6 interface a new alias, so that eri0 would become the link-local scope address, and eri0:1 or eri0:2, etc, must be specified to use a global-scope address.
Clearly whilst a nice idea one couldn't expect to see other OS change for quite some time.
I'm not sure this is in the direction you're seeking, but...
Poking around in the iproute bundle's ip code (ip/ipaddress.c) under Linux shows that the ip command digs up interface flags like primary and secondary from a struct ifaddrmsg, member ifa_flags. The ifaddrmsg seems to be acquired through a struct nlmsghdr, which is documented in man 7 netlink, and used via sendmsg and recvmsg interaction with the kernel, which overall sounds like a royal pain but it's at least programmatic. Whether primary and secondary would be enough to be useful is a separate question.
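To make that less abstract, here is a rough sketch of an RTM_GETADDR dump over a NETLINK_ROUTE socket that prints ifa_flags for every IPv6 address. The function name dump_ipv6_addr_flags() and the output format are made up for illustration; it uses plain send() and recv() rather than sendmsg() and recvmsg(), and it only looks at the 8-bit ifa_flags field, where IFA_F_DEPRECATED marks an address that is no longer preferred.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: dump all IPv6 addresses with their ifa_flags.
 * Returns the number of addresses printed, or -1 on error. */
int dump_ipv6_addr_flags(FILE *out)
{
    struct { struct nlmsghdr nh; struct ifaddrmsg ifa; } req;
    struct nlmsghdr buf[1024];          /* 16 KiB, aligned for nlmsghdr */
    int fd, count = 0, done = 0;

    assert(out != NULL);
    fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0)
        return -1;

    /* Ask the kernel for a dump of all IPv6 addresses */
    memset(&req, 0, sizeof(req));
    req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifaddrmsg));
    req.nh.nlmsg_type = RTM_GETADDR;
    req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req.ifa.ifa_family = AF_INET6;
    if (send(fd, &req, req.nh.nlmsg_len, 0) < 0) {
        close(fd);
        return -1;
    }

    while (!done) {
        struct nlmsghdr *nh;
        int len = (int)recv(fd, buf, sizeof(buf), 0);

        if (len <= 0) {
            close(fd);
            return -1;
        }
        for (nh = buf; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
            struct ifaddrmsg *ifa;
            struct rtattr *rta;
            int rtl;
            char text[INET6_ADDRSTRLEN] = "?";

            if (nh->nlmsg_type == NLMSG_DONE) { done = 1; break; }
            if (nh->nlmsg_type == NLMSG_ERROR) { close(fd); return -1; }
            if (nh->nlmsg_type != RTM_NEWADDR)
                continue;

            ifa = NLMSG_DATA(nh);
            rtl = (int)IFA_PAYLOAD(nh);
            /* Walk the attributes to find the address itself */
            for (rta = IFA_RTA(ifa); RTA_OK(rta, rtl); rta = RTA_NEXT(rta, rtl))
                if (rta->rta_type == IFA_ADDRESS)
                    inet_ntop(AF_INET6, RTA_DATA(rta), text, sizeof(text));

            fprintf(out, "ifindex %u %s/%u flags 0x%02x%s%s\n",
                    (unsigned)ifa->ifa_index, text, (unsigned)ifa->ifa_prefixlen,
                    (unsigned)ifa->ifa_flags,
                    (ifa->ifa_flags & IFA_F_DEPRECATED) ? " deprecated" : "",
                    (ifa->ifa_flags & IFA_F_TEMPORARY) ? " temporary" : "");
            count++;
        }
    }
    close(fd);
    return count;
}
```

Calling dump_ipv6_addr_flags(stdout) should need no special privileges. Whether filtering out deprecated addresses is enough to pick a bind() address is, as noted, a separate question.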

Sniffing 802.3 eth packets with socket raw

I need to sniff BPDU (bridge protocol data unit) packets on an interface; they are encapsulated in Ethernet frames of type 802.3 with an LLC header. I tried to open a raw socket:
skd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_802_3))
but when sniffing I can't catch any packets. Looking at include/linux/if_ether.h, it seems that ETH_P_802_3 is a dummy type. Is there a solution, or should I use ETH_P_ALL and analyze the EtherType field of the Ethernet header?
Thank you all!
Sorry, I'm not sure if your question is regarding the ETH_P_ALL flag or if your sniffer simply doesn't work.
I would recommend using ETH_P_ALL and decoding the headers yourself.
If your sniffer's not working, make sure that you have promiscuous mode on. From the command line, you can use ifconfig eth0 promisc, assuming your Ethernet device is eth0. Or you can set the IFF_PROMISC flag on your device using ioctl.
All that said, unless you have a good reason not to, it's probably strongly worth your while to not reinvent the wheel and simply use libpcap.
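If you do go the ETH_P_ALL route and decode the headers yourself, the check for a BPDU could be sketched as below. The function name is_llc_bpdu() is made up, and the sketch assumes untagged frames that start at the destination MAC, with the LLC header DSAP 0x42, SSAP 0x42, control 0x03 that spanning tree BPDUs normally carry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the frame is an 802.3 frame whose LLC
 * header has DSAP = SSAP = 0x42 and control = 0x03, otherwise 0.
 * frame points at the destination MAC; len is the captured length. */
int is_llc_bpdu(const uint8_t *frame, size_t len)
{
    unsigned type_or_len;

    assert(frame != NULL);
    if (len < 17)             /* 14-byte Ethernet header + 3-byte LLC header */
        return 0;

    /* Bytes 12..13 hold either an 802.3 length or an EtherType */
    type_or_len = ((unsigned)frame[12] << 8) | frame[13];
    if (type_or_len > 1500)   /* not a valid 802.3 length, so not LLC */
        return 0;

    return frame[14] == 0x42 && frame[15] == 0x42 && frame[16] == 0x03;
}
```

Each buffer read from the AF_PACKET socket would be passed to this function with the number of bytes received.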

Utility for parsing /proc/net/route

Is there an available utility to parse the content of /proc/net/route into more human readable format (i.e. dotted decimal for addresses)?
ROUTE(8) does exactly that if you invoke it with the -n flag. Moreover, it can be used on systems without procfs support. For example:
$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.1.2     0.0.0.0         UG    100    0        0 eth0
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
Nowadays you should be using the ip route list command instead of route.
Check out the /sbin/route command.
libnetlink may be the right way to go for this. I know it is the "proper" interface for a lot of low-level and physical network stuff, I'm not 100% sure about route tables, though.
See: libnetlink netlink rtnetlink
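If you would still rather parse /proc/net/route yourself, the conversion to dotted decimal is short. The sketch below uses made-up function names and assumes the usual Linux layout: one header line, then whitespace-separated columns with Iface first, Destination second, Gateway third and Mask eighth, where each address is the raw 32-bit word printed as eight hex digits.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: convert one hex address field from /proc/net/route
 * to dotted-decimal text. Returns dst, or NULL on a malformed field. */
const char *route_hex_to_dotted(const char *hex, char *dst, socklen_t size)
{
    struct in_addr addr;
    char *end;
    unsigned long v;

    assert(hex != NULL && dst != NULL);
    v = strtoul(hex, &end, 16);
    if (end == hex || *end != '\0' || v > 0xFFFFFFFFUL)
        return NULL;
    /* The file shows the address word as this host reads it, so storing
     * the value back unchanged gives network byte order again. */
    addr.s_addr = (in_addr_t)v;
    return inet_ntop(AF_INET, &addr, dst, size);
}

/* Hypothetical helper: print destination, gateway, mask and interface
 * for each route read from in. Returns the number of routes printed. */
int print_routes(FILE *in, FILE *out)
{
    char line[256], ifname[16], dst[9], gw[9], mask[9];
    char d[INET_ADDRSTRLEN], g[INET_ADDRSTRLEN], m[INET_ADDRSTRLEN];
    int n = 0;

    assert(in != NULL && out != NULL);
    if (fgets(line, sizeof(line), in) == NULL)    /* skip the header line */
        return 0;
    while (fgets(line, sizeof(line), in) != NULL) {
        if (sscanf(line, "%15s %8s %8s %*s %*s %*s %*s %8s",
                   ifname, dst, gw, mask) != 4)
            continue;
        if (!route_hex_to_dotted(dst, d, sizeof(d)) ||
            !route_hex_to_dotted(gw, g, sizeof(g)) ||
            !route_hex_to_dotted(mask, m, sizeof(m)))
            continue;
        fprintf(out, "%-15s %-15s %-15s %s\n", d, g, m, ifname);
        n++;
    }
    return n;
}
```

A caller would fopen("/proc/net/route", "r") and pass the stream to print_routes() along with stdout.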

Resources