How to create a kernel module that can intercept all packets coming to/from a network interface - linux

I have a 2-port NIC on my system - eth0 and eth1 as seen by Linux.
I want to intercept all packets coming in on eth0 and send them out through eth1 to an external device connected to the same switch as eth1. So I need to slap on an additional header to make them reach the correct external device.
I know that there is a concept of network taps that both the transmit and receive code in the kernel send to, but how do I create one? Also, I want to capture not just IP, but all Ethernet packets; I know NETFILTER_HOOK would only have gotten me the IPv4 packets.

This can be readily implemented with an rx_handler:
static rx_handler_result_t handle_frame(struct sk_buff **pskb)
{
    struct sk_buff *skb = *pskb;
    struct net_device *whereto_dev;

    /* Get a private copy if the skb is shared */
    skb = skb_share_check(skb, GFP_ATOMIC);
    if (unlikely(!skb))
        return RX_HANDLER_CONSUMED;
    *pskb = skb;

    /* rx_handler_data holds the device we want to redirect to */
    whereto_dev = rcu_dereference(skb->dev->rx_handler_data);
    skb->dev = whereto_dev;

    return RX_HANDLER_ANOTHER; /* Do another round in receive path */
}
They are registered via netdev_rx_handler_register(slave_dev, handle_frame, whereto). See the bonding or my uman driver for example usage.
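For completeness, the registration itself might look roughly like this (a sketch: attach_handler()/detach_handler() are placeholder names, and device lookup, error handling, and module init/exit plumbing are omitted):
/* Attach handle_frame() to slave_dev (e.g. eth0) and stash the outgoing
   device (e.g. eth1) as rx_handler_data for use inside the handler. */
static int attach_handler(struct net_device *slave_dev, struct net_device *whereto)
{
    int err;

    rtnl_lock();    /* netdev_rx_handler_register() must be called under RTNL */
    err = netdev_rx_handler_register(slave_dev, handle_frame, whereto);
    rtnl_unlock();
    return err;
}

static void detach_handler(struct net_device *slave_dev)
{
    rtnl_lock();
    netdev_rx_handler_unregister(slave_dev);
    rtnl_unlock();
}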
dev_add_pack would work too, but it seems that, apart from af_packet.c, all users of dev_add_pack that catch every packet have been migrated to rx_handlers, e.g. https://patchwork.ozlabs.org/patch/367236/. The patch's discussion suggests the rx_handler approach may be more efficient.
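For comparison, a dev_add_pack() tap would look roughly like this (a sketch; my_tap_rcv()/my_tap are placeholder names and the handler body is left trivial):
/* An ETH_P_ALL packet_type receives a reference to every frame (both the
   rx and tx paths); drop that reference with kfree_skb() when done. */
static int my_tap_rcv(struct sk_buff *skb, struct net_device *dev,
                      struct packet_type *pt, struct net_device *orig_dev)
{
    /* inspect skb here (read-only: it may be shared with other taps) */
    kfree_skb(skb);
    return 0;
}

static struct packet_type my_tap __read_mostly = {
    .type = cpu_to_be16(ETH_P_ALL),
    .func = my_tap_rcv,
};

/* module init: dev_add_pack(&my_tap);   module exit: dev_remove_pack(&my_tap); */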

Related

Where did Wireshark/tcpdump/libpcap intercept packet inside Linux kernel?

According to this, Wireshark is able to get packets before they are dropped (which is why I cannot capture such packets myself). I'm still wondering about the exact location in the Linux kernel where Wireshark fetches the packets.
The answer goes: "On UN*Xes, it uses libpcap, which, on Linux, uses AF_PACKET sockets." Does anyone have a more concrete example of using AF_PACKET sockets? If I understand Wireshark correctly, the network interface card (NIC) will make a copy of all incoming packets and send it to a filter (Berkeley Packet Filter) defined by the user. But where does this happen? Or is my understanding wrong, and am I missing something here?
Thanks in advance!
But where does this happen?
If I understood you correctly, you want to know where such a socket is initialized.
There is a pcap_create function that tries to determine the type of the source interface, creates a duplicate of it, and activates it.
For a network interface, see pcap_create_interface => pcap_create_common => pcap_activate_linux.
All initialization happens in pcap_activate_linux => activate_new => iface_bind:
- the device name is copied with handlep->device = strdup(device);
- the socket is created with socket(PF_PACKET, SOCK_DGRAM, htons(ETH_P_ALL));
- the socket is bound to the device with bind(fd, (struct sockaddr *) &sll, sizeof(sll)).
For more detailed information, read the comments in the source files of the mentioned functions; they are very detailed.
After initialization, all the work happens in a group of functions such as pcap_read_linux.
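To make that concrete, here is a minimal standalone AF_PACKET capture (a sketch: the interface name is a placeholder and it needs root or CAP_NET_RAW; SOCK_RAW is used so the Ethernet header is included, whereas the SOCK_DGRAM variant above strips the link-layer header):
#include <arpa/inet.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <net/if.h>
#include <stdio.h>
#include <sys/socket.h>

int main(void)
{
    /* Catch every protocol, like libpcap does */
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_ll sll = {0};
    sll.sll_family   = AF_PACKET;
    sll.sll_protocol = htons(ETH_P_ALL);
    sll.sll_ifindex  = if_nametoindex("eth0");   /* interface name is an assumption */
    if (bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) { perror("bind"); return 1; }

    unsigned char buf[2048];
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof(buf), 0);  /* one full frame per read */
        if (n > 0)
            printf("frame of %zd bytes\n", n);
    }
}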
On Linux, you should be able to simply use tcpdump (which leverages the libpcap library) to do this. It can write to a file or to stdout, and you specify the filter at the end of the tcpdump command.

join igmp_group not working in lightweight IP (lwIP)

I'm new to lwIP, and I want to create a multicast receiver with lwIP. My steps are as follows:
1. Enable LWIP_IGMP;
2. Set NETIF_FLAG_IGMP in low_level_init();
3. Join multicast group, create and bind pcb;
4. udp_connect to remote_ip (or the multicast IP address? I tried both, but both failed).
Joining the group returns success, and everything looks fine while the program runs this code. However, the multicast receiver doesn't work; no multicast data comes into the network interface. It seems I don't actually join my receiver to the IGMP group, even though the joining process looks fine. Does anyone know what I'm missing? (A minimal sketch of these steps appears below.)
I found "netif->igmp_mac_filter != NULL" in igmp_joingroup(), but this callback is set to NULL and not implemented. Do I need to implement it myself to set the MAC filter, or is it OK to just leave it as NULL?
Thanks a lot for your help!
Ryan
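For reference, a minimal sketch of the steps described above (lwIP 2.x signatures assumed; the interface address, group address, and port are placeholders):
#include "lwip/igmp.h"
#include "lwip/ip_addr.h"
#include "lwip/udp.h"

static void mcast_recv(void *arg, struct udp_pcb *pcb, struct pbuf *p,
                       const ip_addr_t *addr, u16_t port)
{
    if (p != NULL) {
        /* process p->payload / p->len here */
        pbuf_free(p);
    }
}

void multicast_rx_start(void)
{
    ip4_addr_t iface_addr, group_addr;
    IP4_ADDR(&iface_addr, 192, 168, 1, 10);   /* local netif address (placeholder) */
    IP4_ADDR(&group_addr, 239, 0, 0, 1);      /* multicast group (placeholder) */

    /* Step 3: join the group (requires LWIP_IGMP and NETIF_FLAG_IGMP) */
    igmp_joingroup(&iface_addr, &group_addr);

    /* Create and bind the receiving PCB; udp_connect() is not needed just to
       receive, and connecting would filter out other source addresses. */
    struct udp_pcb *pcb = udp_new();
    udp_bind(pcb, IP_ADDR_ANY, 5000);
    udp_recv(pcb, mcast_recv, NULL);
}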
When you join a multicast group, the netif->igmp_mac_filter callback is typically called to configure a MAC filter in your Ethernet controller to accept packets with the multicast MAC address corresponding to the group. So, depending on the Ethernet hardware you are using, you may need to implement the callback.
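As a rough sketch of such a callback (lwIP 2.x signature assumed; hw_add_mac_filter() and hw_del_mac_filter() are hypothetical stand-ins for whatever your Ethernet driver provides):
#include "lwip/ip4_addr.h"
#include "lwip/netif.h"

extern void hw_add_mac_filter(const u8_t mac[6]); /* hypothetical driver hook */
extern void hw_del_mac_filter(const u8_t mac[6]); /* hypothetical driver hook */

static err_t my_igmp_mac_filter(struct netif *netif, const ip4_addr_t *group,
                                enum netif_mac_filter_action action)
{
    /* Multicast MAC = 01:00:5e + low 23 bits of the group IP address */
    u8_t mac[6] = { 0x01, 0x00, 0x5e,
                    (u8_t)(ip4_addr2(group) & 0x7f),
                    (u8_t)ip4_addr3(group),
                    (u8_t)ip4_addr4(group) };

    if (action == NETIF_ADD_MAC_FILTER)
        hw_add_mac_filter(mac);
    else
        hw_del_mac_filter(mac);
    return ERR_OK;
}

/* In your netif init: netif->igmp_mac_filter = my_igmp_mac_filter; */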
The hardware needs to be configured to receive multicast MAC frames; otherwise it will simply discard all frames with a multicast destination address. There is probably an option to accept all incoming multicast frames. Enable that in low_level_init() and you should be able to see the incoming multicast frames. You shouldn't need to implement any filter.
I had the same problem. I solved it by removing the Ethernet multicast frame filter in the init of the MAC interface.
To test, you can also set the interface to promiscuous mode, check whether the multicast packets are received, and then remove promiscuous mode and set an appropriate multicast frame filtering mode according to your needs.
I set the code for the Multicast Frame Filter as follows:
/* USER CODE BEGIN PHY_PRE_CONFIG */
ETH_MACFilterConfigTypeDef FilterConfig;

/* Read the current filter settings first so the remaining fields are not
   left uninitialized, then enable the multicast pass-through. */
HAL_ETH_GetMACFilterConfig(&heth, &FilterConfig);
FilterConfig.PromiscuousMode = 1;   /* for testing; disable once filtering works */
FilterConfig.PassAllMulticast = 1;  /* accept all incoming multicast frames */
HAL_ETH_SetMACFilterConfig(&heth, &FilterConfig);
/* USER CODE END PHY_PRE_CONFIG */

Can I access the 4 addresses available in the MAC header of wireless 802.11 packets?

I am writing a module that hooks a function onto the netfilter OUTPUT hook in the kernel. I want to access the complete information inside the MAC header of the packet, especially all 4 addresses (or, in some cases, 3 addresses) of the MAC header. I am using struct sk_buff to get at the packet's fields, but it does not give me any fields other than
ethhdr = (struct ethhdr *)skb_mac_header(skb);
eth_hdr(skb)->h_dest;
eth_hdr(skb)->h_source;
eth_hdr(skb)->h_proto;
these fields.
Can anyone give me some code that can extract them at the netfilter OUTPUT hook in Linux, where this would be added as a hook function? Or is this possible at all?
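For illustration only: if the skb's MAC header really did point at an 802.11 header (e.g. a frame seen on a monitor-mode mac80211 device), the addresses could be read roughly as below. Note that at the netfilter OUTPUT hook the link-layer header is typically not built yet, so treat this as a sketch rather than a drop-in answer.
#include <linux/ieee80211.h>
#include <linux/kernel.h>
#include <linux/skbuff.h>

/* dump_80211_addrs() is a placeholder name; it assumes the MAC header is set
   and really is an 802.11 header, which is not the case for ordinary Ethernet
   skbs or for packets at the OUTPUT hook before the L2 header is added. */
static void dump_80211_addrs(struct sk_buff *skb)
{
    struct ieee80211_hdr *hdr;

    if (!skb_mac_header_was_set(skb))
        return;

    hdr = (struct ieee80211_hdr *)skb_mac_header(skb);

    pr_info("addr1 %pM addr2 %pM addr3 %pM\n",
            hdr->addr1, hdr->addr2, hdr->addr3);

    /* addr4 exists on the wire only when both ToDS and FromDS are set */
    if (ieee80211_has_a4(hdr->frame_control))
        pr_info("addr4 %pM\n", hdr->addr4);
}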

How to set the Linux kernel not to send RST_ACK, so that I can send SYN_ACK from a raw socket

I want to ask a classic question about raw socket programming and Linux kernel TCP handling. I've researched similar threads such as linux raw socket programming question, How to reproduce TCP protocol 3-way handshake with raw sockets correctly?, and TCP ACK spoofing, but still can't find a solution.
I am trying to make a server which doesn't listen on any port, but sniffs SYN packets from remote hosts. After the server does some calculation, it sends back a SYN_ACK packet for the corresponding SYN packet, so that I can create the TCP connection manually, without involving the kernel's handling. I've created a raw socket and sent the SYN_ACK over it, but the packet cannot get through to the remote host. When I run tcpdump on the server (Ubuntu Server 10.04) and Wireshark on the client (Windows 7), the server returns RST_ACK instead of my SYN_ACK packet. After doing some research, I got the information that we cannot preempt the kernel's TCP handling.
Are there any other ways to hack or configure the kernel not to respond with RST_ACK to those packets?
I've added a firewall rule for the server's local IP to tell the kernel that maybe there's something behind the firewall waiting for the packet, but still no luck.
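For context, the raw socket described above can be created roughly like this (a sketch: IPv4 assumed, error handling trimmed, requires root or CAP_NET_RAW; open_raw_tcp_socket() is a placeholder name):
#include <netinet/in.h>
#include <sys/socket.h>

/* A raw socket on which the IP and TCP headers of the SYN_ACK are built by
   hand for each sendto(). */
int open_raw_tcp_socket(void)
{
    int fd = socket(AF_INET, SOCK_RAW, IPPROTO_TCP);
    if (fd < 0)
        return -1;

    int one = 1;
    /* IP_HDRINCL: we supply the IP header ourselves */
    if (setsockopt(fd, IPPROTO_IP, IP_HDRINCL, &one, sizeof(one)) < 0)
        return -1;
    return fd;
}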
Did you try to drop RST using iptables?
iptables -A OUTPUT -p tcp --tcp-flags RST RST -j DROP
should do the job for you.
I recommend using iptables, but since you ask about hacking the kernel as well, here is an explanation of how you could do that (I'm using kernel 4.1.20 as reference):
When a packet (a sk_buff) is received, the IP layer will hand it to the registered transport protocol handler:
static int ip_local_deliver_finish(struct sock *sk, struct sk_buff *skb)
{
    ...
    ipprot = rcu_dereference(inet_protos[protocol]);
    if (ipprot) {
        ...
        ret = ipprot->handler(skb);
Assuming the protocol is TCP, the handler is tcp_v4_rcv:
static const struct net_protocol tcp_protocol = {
    .early_demux             = tcp_v4_early_demux,
    .handler                 = tcp_v4_rcv,
    .err_handler             = tcp_v4_err,
    .no_policy               = 1,
    .netns_ok                = 1,
    .icmp_strict_tag_validation = 1,
};
So tcp_v4_rcv is called. It will try to find the socket for the received skb, and if it can't find one, it will send a reset:
int tcp_v4_rcv(struct sk_buff *skb)
{
    ...
    sk = __inet_lookup_skb(&tcp_hashinfo, skb, th->source, th->dest);
    if (!sk)
        goto no_tcp_socket;
    ...
no_tcp_socket:
    if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
        goto discard_it;

    tcp_v4_send_reset(NULL, skb);
    ...
There are many different ways you can hack this. You could go into the xfrm4_policy_check function and hack/change the policy for AF_INET. You could simply comment out the line that calls xfrm4_policy_check, so that the code always goes to discard_it, or comment out the line that calls tcp_v4_send_reset (which will have more consequences, though).
Hope this helps.

Specify source IP address for TCP socket when using Linux network device aliases

For some specific networking tests, I've created a VLAN device, eth1.900, and a couple of aliases, eth1.900:1 and eth1.900:2.
eth1.900   Link encap:Ethernet  HWaddr 00:18:E7:17:2F:13
           inet addr:1.0.1.120  Bcast:1.0.1.255  Mask:255.255.255.0
eth1.900:1 Link encap:Ethernet  HWaddr 00:18:E7:17:2F:13
           inet addr:1.0.1.200  Bcast:1.0.1.255  Mask:255.255.255.0
eth1.900:2 Link encap:Ethernet  HWaddr 00:18:E7:17:2F:13
           inet addr:1.0.1.201  Bcast:1.0.1.255  Mask:255.255.255.0
When connecting to a server, is there a way to specify which of these aliases will be used? With ping I can use the -I <ip> option to select an alias, but I can't see how to do it with a TCP socket in code without using raw sockets; I would also like to run without extra socket privileges, i.e. not as root, if possible.
Unfortunately, even with root, SO_BINDTODEVICE doesn't work because the alias device name is not recognized:
printf("Bind to %s\n", devname);
if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE, (char*)devname, sizeof(devname)) != 0)
{
perror("SO_BINDTODEVICE");
return 1;
}
Output:
Bind to eth1.900:1
SO_BINDTODEVICE: No such device
Use getifaddrs() to enumerate all the interfaces and find the IP address for the interface you want to bind to. Then use bind() to bind to that IP address, before you call connect().
Since a packet can't be sent out on an aliased interface anyway, it would make no sense to use SO_BINDTODEVICE on one. SO_BINDTODEVICE controls which device a packet is sent out from when routing cannot be used for this purpose (for example, if it's a raw Ethernet frame).
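A sketch of that bind-before-connect approach (addresses, port, and the function name connect_from() are placeholders):
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind the socket to the alias's IP first, then connect; the kernel will
   use that address as the source of the TCP connection. */
int connect_from(const char *local_ip, const char *peer_ip, unsigned short peer_port)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0)
        return -1;

    struct sockaddr_in local = {0};
    local.sin_family = AF_INET;
    local.sin_port = 0;                              /* any source port */
    inet_pton(AF_INET, local_ip, &local.sin_addr);   /* e.g. "1.0.1.200" */
    if (bind(s, (struct sockaddr *)&local, sizeof(local)) < 0) {
        close(s);
        return -1;
    }

    struct sockaddr_in peer = {0};
    peer.sin_family = AF_INET;
    peer.sin_port = htons(peer_port);
    inet_pton(AF_INET, peer_ip, &peer.sin_addr);
    if (connect(s, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        close(s);
        return -1;
    }
    return s;
}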
You don't show the definition of devname, but if it's a string pointer, e.g.:
char *devname = "eth1.900:1";
Then perhaps it's failing since you specify the argument size using sizeof devname, which would in this case be the same as sizeof (char *), i.e. typically 4 on a 32-bit system.
If setsockopt() expects to see the actual size of the argument, i.e. the length of the string, this could explain the issue: it would then inspect only the first four characters and fail because the result is an invalid interface name.
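If that is the case, passing the real string length at least makes the call well-formed (a sketch), although, as noted above, the alias name itself will still not be accepted:
/* pass the actual length of the interface name, not sizeof(char *) */
if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE, devname, strlen(devname) + 1) != 0)
    perror("SO_BINDTODEVICE");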

Resources