Kernel API to know IP address of interface - linux

Is there any kernel-side/kernel-space API to find the IP address of an interface, given its name?

I think you're looking for rtnetlink (man page)
Rtnetlink allows the kernel's routing tables to be read and altered.
It is used within the kernel to communicate between various
subsystems, though this usage is not documented here, and for
communication with user-space programs. Network routes, IP addresses,
link parameters, neighbor setups, queueing disciplines, traffic
classes and packet classifiers may all be controlled through
NETLINK_ROUTE sockets.
According to strace, it's the API that ip addr show dev XXX uses:
strace ip addr sh dev lo 2>&1 | grep sendmsg
sendmsg(4, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base={{len=48, type=RTM_GETLINK, flags=NLM_F_REQUEST, seq=1596838225, pid=0}, {ifi_family=AF_UNSPEC, ifi_type=ARPHRD_NETROM, ifi_index=0, ifi_flags=0, ifi_change=0}, [{{nla_len=8, nla_type=IFLA_EXT_MASK}, 9}, {{nla_len=7, nla_type=IFLA_IFNAME}, "lo"}]}, iov_len=48}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 48
sendmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base={{len=40, type=RTM_GETLINK, flags=NLM_F_REQUEST, seq=1596838225, pid=0}, {ifi_family=AF_UNSPEC, ifi_type=ARPHRD_NETROM, ifi_index=if_nametoindex("lo"), ifi_flags=0, ifi_change=0}, {{nla_len=8, nla_type=IFLA_EXT_MASK}, 9}}, iov_len=40}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 40
However, it looks like a non-trivial API, so if you don't need it often, it might be easier to just run ip addr sh dev XXX and parse the output.
Edit:
Looks like it's also possible using netdevice (man page), specifically, the SIOCGIFADDR ioctl:
SIOCGIFADDR, SIOCSIFADDR
Get or set the address of the device using ifr_addr. Setting the interface address is a privileged operation. For compatibility, only
AF_INET addresses are accepted or returned.
There's example code here

Related

Live with Predictable Network Interface Name

I'm dealing for the first time with the new naming scheme for network interfaces: Predictable Network Interface Names.
My question is NOT about whether this scheme is better or worse... I'm just trying to understand how to use it correctly.
Here I read:
When changing the interface naming scheme, do not forget to update all network-related configuration files and custom systemd unit files to reflect the change.
So I have to write the actual interface name in all the configuration files. In the previous scheme it was e.g. eth0, which just meant the first Ethernet card, with the known caveats if there are multiple interfaces.
Now, instead, I have to write the predictable name, which is composed of some easy parts (e.g. the type of the interface) and other unpredictable ones like the MAC address. As far as I understand, each card will have a different name.
I admit my question might seem foolish, but I don't understand how to prepare a configuration file. Let's see an example, /etc/dhcpcd.conf:
profile static_eth0
static ip_address=192.168.1.23/24
static routers=192.168.1.1
static domain_name_servers=192.168.1.1
interface eth0
fallback static_eth0
What should I put instead of eth0 in the OS image?
I can only retrieve the actual name of the Ethernet interface once I run the target machine.
100% of my systems are headless, and I never connect a keyboard and display to them. Furthermore, if I have to ship a replacement SBC as a spare part, do I need to reconfigure everything?
Would you please help me to understand the correct usage?
P.S. I know I can revert to the old naming scheme... but that's not the point of my question.
See https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/
It explains how the names are assigned:
1. Names incorporating Firmware/BIOS provided index numbers for on-board devices (example: eno1)
2. Names incorporating Firmware/BIOS provided PCI Express hotplug slot index numbers (example: ens1)
3. Names incorporating physical/geographical location of the connector of the hardware (example: enp2s0)
4. Names incorporating the interfaces's MAC address (example: enx78e7d1ea46da)
5. Classic, unpredictable kernel-native ethX naming (example: eth0)
By default, systemd v197 will now name interfaces following policy 1) if that
information from the firmware is applicable and available, falling back to 2) if
that information from the firmware is applicable and available, falling back to 3)
if applicable, falling back to 5) in all other cases. Policy 4) is not used by
default, but is available if the user chooses so.
So you could opt for a different approach. In your setup it is likely easiest to use the MAC-based name: boot each board once with an image that sends PXE/DHCP requests and note down the MAC address it sends.
Another way, that may work, depending on your setup, would be interface groupings.
From "man interfaces":
auto /eth*
If the kernel knows about the interfaces with names lo, eth0 and eth1, then the above line is then interpreted as:
auto eth0 eth1
Note that there must still be valid "iface" stanzas for each matching interface. However, it is possible to combine a pattern with a mapping to a logical interface, like so:
auto /eth*=eth
iface eth inet dhcp
So maybe if you only have one interface, but can't tell where it will be assigned, you could write "auto /e*=eth" to catch all interfaces starting with e and address them inside the configuration file as "eth".

How to make sure packets from the same flow land on the same queue on two NICs when bridging

I'm writing a network bridge that reassembles and analyzes TCP flows on the fly. I have a pair of multi-queue NICs and I use netmap to capture packets from each RX queue on a different thread and then pass them on to the other NIC for transmission. The problem is that packets from the same flow do not land on the same queue on the two NICs, because the source and destination addresses and ports are reversed.
I tried ethtool to change the tuple the hash of which is used for distributing packets. Running this:
# ethtool --show-nfc p1p1 rx-flow-hash tcp4
results in:
TCP over IPV4 flows use these fields for computing Hash flow key:
IP SA
IP DA
L4 bytes 0 & 1 [TCP/UDP src port]
L4 bytes 2 & 3 [TCP/UDP dst port]
Repeating the above command for the other NIC gives the same result. I thought I could change the order for one of the rings by running ethtool --config-nfc p1p1 rx-flow-hash tcp4 dsfn, but that doesn't change the order of the fields used in the hash. It seems that both sdfn and dsfn result in the same tuple. The same is true for ds and sd.
Is there a way to make the packets in one flow always land on the same queue (and the same thread) on both NICs whether using ethtool to configure the NICs or otherwise another method or tool?
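To make the reversed-tuple problem concrete, here is a small software-side sketch, not a NIC configuration: if the capture threads re-dispatch packets themselves, ordering the two (address, port) endpoints canonically before hashing gives the same value for both directions of a flow. The struct flow layout, the FNV-1a mixing step and the name symmetric_queue are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical flow key for illustration: IPv4 endpoints, host byte order. */
struct flow {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
};

/* FNV-1a over the four bytes of v; a simple stand-in for a real hash. */
static uint32_t mix32(uint32_t h, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;
    }
    return h;
}

/*
 * Direction-independent queue selection: pack each endpoint into one
 * integer, sort the two endpoints, then hash them in that fixed order.
 * A->B and B->A therefore map to the same queue index.
 */
unsigned symmetric_queue(const struct flow *f, unsigned nqueues)
{
    uint64_t a = ((uint64_t)f->src_ip << 16) | f->src_port;
    uint64_t b = ((uint64_t)f->dst_ip << 16) | f->dst_port;
    uint64_t lo = a < b ? a : b;
    uint64_t hi = a < b ? b : a;
    uint32_t h = 2166136261u;           /* FNV-1a offset basis */

    h = mix32(h, (uint32_t)(lo >> 32));
    h = mix32(h, (uint32_t)lo);
    h = mix32(h, (uint32_t)(hi >> 32));
    h = mix32(h, (uint32_t)hi);
    return h % nqueues;
}
```

This only helps when the dispatch decision is made in software; it does not change which hardware RX queue the NIC picks.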

How to create a kernel module that can intercept all packets coming to/from a network interface

I have a 2-port NIC on my system, seen by Linux as eth0 and eth1.
I want to intercept all packets coming in/to eth0 and send them out through eth1 to an external device connected to the same switch as eth1. So I need to slap on an additional header to make them reach the correct external device.
I know that there is a concept of network taps that both the transmit and receive code in the kernel send to, but how do I create one? Also, I want to capture not just IP but all Ethernet packets; I know NETFILTER_HOOK would have helped me get IPv4 packets.
This can be readily implemented with an rx_handler:
static rx_handler_result_t handle_frame(struct sk_buff **pskb)
{
    struct sk_buff *skb = *pskb;
    struct net_device *whereto_dev;

    skb = skb_share_check(skb, GFP_ATOMIC);
    if (unlikely(!skb))
        return RX_HANDLER_CONSUMED;
    *pskb = skb;

    whereto_dev = rcu_dereference(skb->dev->rx_handler_data);
    skb->dev = whereto_dev;
    return RX_HANDLER_ANOTHER; /* Do another round in receive path */
}
They are registered via netdev_rx_handler_register(slave_dev, handle_frame, whereto). See the bonding or my uman driver for example usage.
dev_add_pack would work too, but it seems that, apart from af_packet.c, all all-packet-catching users of dev_add_pack have been migrated to use rx_handlers, e.g. https://patchwork.ozlabs.org/patch/367236/. The patch's discussion suggests this might be more efficient.

How do I discover a process that has made an interface promiscuous?

On Linux, I'd like to quickly tie a running process to an interface it has put into promiscuous mode. For instance, tcpdump will change an interface when it starts and ends, and I'd like to efficiently associate that process with the promiscuous interface while it's running.
For instance, I would want this method to detect rogue malware that is sniffing.
Normally I could ps -ef | grep tcpdump, but in the malware case I may not know the process doing the work.
Also, for bonus points... if the process is no longer running, how would you determine how an interface was made promiscuous? (assuming it's not in .history)
The kernel will printk() a message when an interface is put into promiscuous mode. That message should end up in the system logs (usually in /var/log), though most likely your intruder will be smart enough to censor logs and hide his/her/its trail. The only correct answer to this challenge, in my humble opinion, is to have a remote logging server where at least some of the system messages are redirected in addition to storing them to a local disk.
To get more information into logs you could turn on kernel auditing by adding audit=1 to kernel command line.
An interface can be in promiscuous mode without any process actively "keeping" it as such. Actually, you can just turn on promiscuous mode for an interface with ip link set <interface> promisc on. Try it on your loopback interface with ip link set lo promisc on, see what netstat -i produces on your terminal, then turn promiscuous mode off again with ip link set lo promisc off and check once more with netstat -i how the flags for the loopback interface have changed.
To answer your first question: there is no way to know which process keeps an interface in promiscuous mode, as there might not be such a process in the first place. The kernel doesn't have detailed process information at the point of __dev_set_promiscuity():
if (dev->flags != old_flags) {
    pr_info("device %s %s promiscuous mode\n",
            dev->name,
            dev->flags & IFF_PROMISC ? "entered" : "left");
    if (audit_enabled) {
        current_uid_gid(&uid, &gid);
        audit_log(current->audit_context, GFP_ATOMIC,
                  AUDIT_ANOM_PROMISCUOUS,
                  "dev=%s prom=%d old_prom=%d auid=%u uid=%u gid=%u ses=%u",
                  dev->name, (dev->flags & IFF_PROMISC),
                  (old_flags & IFF_PROMISC),
                  from_kuid(&init_user_ns, audit_get_loginuid(current)),
                  from_kuid(&init_user_ns, uid),
                  from_kgid(&init_user_ns, gid),
                  audit_get_sessionid(current));
    }
    /* ... */
}
For details, see file net/core/dev.c in the Linux kernel source tree.

Identifying the preferred IPv6 source address for an adapter

If you have an IPv6-enabled host that has more than one global-scope address, how can you programmatically identify the preferred address for bind()?
Example address list:
eth0      Link encap:Ethernet  HWaddr 00:14:5e:bd:6d:da
          inet addr:10.6.28.31  Bcast:10.6.28.255  Mask:255.255.255.0
          inet6 addr: 2002:dce8:d28e:0:214:5eff:febd:6dda/64 Scope:Global
          inet6 addr: fe80::214:5eff:febd:6dda/64 Scope:Link
          inet6 addr: 2002:dce8:d28e::31/64 Scope:Global
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
On Solaris you can indicate a preferred address with an interface flag and it is available programmatically via SIOCGLIFCONF:
/usr/include/net/if.h:
#define IFF_PREFERRED 0x0400000000 /* Prefer as source address */
As listed in the interface list:
eri0: flags=2104841<UP,RUNNING,MULTICAST,DHCP,ROUTER,IPv6> mtu 1500 index 2
        inet6 fe80::203:baff:fe4e:6cc8/10
eri0:1: flags=402100841<UP,RUNNING,MULTICAST,ROUTER,IPv6,PREFERRED> mtu 1500 index 2
        inet6 2002:dce8:d28e::36/64
This is not portable to OSX, Linux, FreeBSD, or Windows, though. Windows gets off easy, as it has UUID-based adapter names that are completely useless from an administrator's perspective (depending upon the Windows version).
For Linux this article details how the parameter preferred_lft, where lft is short for "lifetime", can be altered to weight the selection process by the kernel. This setting doesn't appear conveniently available in the results of SIOCGIFCONF or getifaddrs() though.
So I want to bind to eth0, eri0, or whatever available interface name. The choices are a bit stark:
1. Fail on adapter names resolving to multiple interfaces. I take this approach for handling multicast transports (OpenPGM) as the protocol MUST have one-only sending address.
2. Bind to everything. This is a cop out and would be unexpected to users.
3. Bind to the adapter with SO_BINDTODEVICE. This requires CAP_NET_RAW system capability on Linux which can be quite a cumbersome overhead for administrators.
4. Bind to the first IPv6 interface on the adapter. The ordering tends to be completely bogus.
5. Bind to the last interface. David Croft's article implies Linux does this, but is also a bit bogus.
6. Enumerate over every interface and create a new socket explicitly for each.
With option #6 (enumerating every interface and creating a socket for each), I would expect you could usually be smarter: if only a link-local scope address is available, bind to that; otherwise bind to just the available global-scope addresses.
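A rough sketch of that selection rule using getifaddrs() follows. It treats every address that is not link-local as "global", which is a simplification (unique-local and loopback addresses would also pass), and collect_bind_candidates is a made-up name.

```c
#include <ifaddrs.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Collect IPv6 bind candidates for adapter `ifname`: the non-link-local
 * addresses if there are any, otherwise the link-local ones.
 * Stores up to `max` entries in `out`; returns the number stored,
 * or -1 if getifaddrs() fails. Hypothetical helper for illustration.
 */
int collect_bind_candidates(const char *ifname,
                            struct sockaddr_in6 *out, int max)
{
    struct ifaddrs *list, *ifa;
    int pass, n = 0;

    if (getifaddrs(&list) != 0)
        return -1;

    /* Pass 0 gathers non-link-local addresses; pass 1 (link-local)
     * runs only when pass 0 found nothing. */
    for (pass = 0; pass < 2 && n == 0; pass++) {
        for (ifa = list; ifa != NULL && n < max; ifa = ifa->ifa_next) {
            const struct sockaddr_in6 *sin6;
            int link_local;

            if (ifa->ifa_addr == NULL ||
                ifa->ifa_addr->sa_family != AF_INET6)
                continue;
            if (strcmp(ifa->ifa_name, ifname) != 0)
                continue;

            sin6 = (const struct sockaddr_in6 *)ifa->ifa_addr;
            link_local = IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr) != 0;
            if (link_local == (pass == 1))
                out[n++] = *sin6;   /* copies sin6_scope_id as returned */
        }
    }
    freeifaddrs(list);
    return n;
}
```

Each returned sockaddr_in6 could then be given its own socket and bind() call. This does not consult preferred/deprecated state, which getifaddrs() does not expose.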
When connecting to another host then RFC 3484 can be used, but as you can see all the choices are dependent upon matching the destination address:
1. Prefer same address. (i.e. destination is local machine)
2. Prefer appropriate scope. (i.e. smallest scope shared with the destination)
3. Avoid deprecated addresses.
4. Prefer home addresses.
5. Prefer outgoing interface. (i.e. prefer an address on the interface we're sending out of)
6. Prefer matching label.
7. Prefer public addresses.
8. Use longest matching prefix.
In some circumstances we can use #8 here, but in the interface example above both global-scope addresses have a 64-bit prefix length.
RFC 3484 has the following pertinent lines:
The IPv6 addressing architecture [5] allows multiple unicast
addresses to be assigned to interfaces. These addresses may have
different reachability scopes (link-local, site-local, or global).
These addresses may also be "preferred" or "deprecated" [6].
The link being to RFC 2462, similarly expanded:
preferred address - an address assigned to an interface whose use by
upper layer protocols is unrestricted. Preferred addresses may
be used as the source (or destination) address of packets sent
from (or to) the interface.
But no methods to programmatically acquire this detail.
Props to Win32 API that exposes an ioctl SIO_ADDRESS_LIST_SORT that allows a developer to use not only RFC 3484 sorting but to take into consideration any system administrator overrides. Linux has /etc/gai.conf as used for RFC 3484 sorting in getaddrinfo() but no API for directly accessing the sorting. Solaris has the ipaddrsel command. OSX is following FreeBSD by adding ip6addrctl in 10.7.
edit: Some concerns with RFC 3484 sorting are listed and referred to in this additional IETF draft document:
https://datatracker.ietf.org/doc/html/draft-axu-addr-sel-01
Solaris, for example, creates new alias-interfaces for each new
address assigned to a physical interface. So if_index could also be
used to uniquely identify a source address specific routing table on
that platform. Other operating systems do not work the same way.
The author likes Solaris's approach of giving each additional IPv6 interface a new alias, so that eri0 would become the link-local scope address, and eri0:1 or eri0:2, etc, must be specified to use a global-scope address.
Clearly, whilst it is a nice idea, one couldn't expect to see other OSes change for quite some time.
I'm not sure this is in the direction you're seeking, but...
Poking around in the iproute bundle's ip code (ip/ipaddress.c) under Linux shows that the ip command digs up interface flags like primary and secondary from the ifa_flags member of a struct ifaddrmsg. The ifaddrmsg seems to be acquired through a struct nlmsghdr, which is documented in man 7 netlink, and used via sendmsg and recvmsg interaction with the kernel. Overall that sounds like a royal pain, but it's at least programmatic. Whether primary and secondary would be enough to be useful is a separate question.
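For what it's worth, the netlink exchange is shorter than it sounds. The sketch below is Linux-only and uses only the headers named in the netlink and rtnetlink man pages. It dumps IPv6 addresses with RTM_GETADDR and hands each struct ifaddrmsg to a callback, where ifa_flags can be tested for IFA_F_DEPRECATED or IFA_F_TENTATIVE. The names dump_inet6_addrs, addr_is_preferred and count_preferred are invented for this example.

```c
#include <linux/if_addr.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical callback type for this sketch. */
typedef void (*addr_cb)(const struct ifaddrmsg *ifa, void *arg);

/* 1 if the address is neither deprecated nor still tentative. */
int addr_is_preferred(unsigned flags)
{
    return (flags & (IFA_F_DEPRECATED | IFA_F_TENTATIVE)) == 0;
}

/* Example callback: counts preferred addresses into the int at arg. */
void count_preferred(const struct ifaddrmsg *ifa, void *arg)
{
    if (addr_is_preferred(ifa->ifa_flags))
        ++*(int *)arg;
}

/*
 * Send an RTM_GETADDR dump request for AF_INET6 and call cb for every
 * RTM_NEWADDR reply. Returns the number of addresses seen, -1 on error.
 */
int dump_inet6_addrs(addr_cb cb, void *arg)
{
    struct {
        struct nlmsghdr nlh;
        struct ifaddrmsg ifa;
    } req;
    struct nlmsghdr buf[512];           /* 8 KiB, suitably aligned */
    int fd, count = 0, done = 0;

    fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0)
        return -1;

    memset(&req, 0, sizeof(req));
    req.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifaddrmsg));
    req.nlh.nlmsg_type = RTM_GETADDR;
    req.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req.nlh.nlmsg_seq = 1;
    req.ifa.ifa_family = AF_INET6;

    if (send(fd, &req, req.nlh.nlmsg_len, 0) < 0) {
        close(fd);
        return -1;
    }

    while (!done) {
        int len = (int)recv(fd, buf, sizeof(buf), 0);
        struct nlmsghdr *nlh;

        if (len <= 0) {
            count = -1;
            break;
        }
        for (nlh = buf; NLMSG_OK(nlh, len); nlh = NLMSG_NEXT(nlh, len)) {
            if (nlh->nlmsg_type == NLMSG_DONE) {
                done = 1;               /* end of the multipart dump */
                break;
            }
            if (nlh->nlmsg_type == NLMSG_ERROR) {
                count = -1;
                done = 1;
                break;
            }
            if (nlh->nlmsg_type == RTM_NEWADDR) {
                cb((const struct ifaddrmsg *)NLMSG_DATA(nlh), arg);
                count++;
            }
        }
    }
    close(fd);
    return count;
}
```

The address bytes themselves sit in the IFA_ADDRESS attribute following each ifaddrmsg, and ifa_index identifies the interface; both are left out here to keep the sketch short.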
