How is the order of tables/chains in nftables arranged?

For example, I have two tables:
table inet filter2 {
    chain forward {
        type filter hook forward priority 0; policy accept;
        ct state established,related accept
        iifname $wireif oifname $wirelessif accept
    }
}
table inet filter {
    chain forward {
        type filter hook forward priority 0; policy accept;
        drop
    }
}
The filter table is evaluated first, so all my forwarded packets are dropped, which is not what I want. How can I make a table/chain be evaluated last, so that it acts as the default?

nftables uses the underlying netfilter framework, which has six hook points located at different places in the Linux kernel network stack.
When more than one chain is registered at the same hook point, netfilter orders them by their priority value.
In your case, you can give each chain a different priority when you create it. Let's say you want the forward chain of the filter table to be evaluated before the forward chain of the filter2 table.
nft add table inet filter2
nft add chain inet filter2 forward '{ type filter hook forward priority 0; }'
Now, to have the forward chain of the filter table evaluated first, assign it a priority lower than 0:
nft add table inet filter
nft add chain inet filter forward '{ type filter hook forward priority -1; }'
A lower priority value means earlier evaluation, and vice versa.
But be careful when spreading chains across different priorities, because it can lead to unintended behaviour: an accept verdict only ends evaluation in the current chain, and the packet still traverses the other chains registered at the same hook, whereas a drop in any chain is final. For more information, see the nftables documentation on chain priorities.
PS: Note the slightly different, quoted syntax required for a negative priority.
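Putting it together, here is a minimal sketch of the two tables from the question with distinct priorities (the priority values are arbitrary examples; the $wireif/$wirelessif variables are the question's own). Because of the accept/drop semantics above, the unconditional drop in filter would still discard forwarded packets, so its rules would also need to be qualified:
table inet filter2 {
    chain forward {
        type filter hook forward priority -1; policy accept;
        ct state established,related accept
        iifname $wireif oifname $wirelessif accept
    }
}
table inet filter {
    chain forward {
        type filter hook forward priority 10; policy accept;
        drop
    }
}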

Related

tc filter drop matched packets

I'm looking to add a set of filters that would drop packets matching given parameters. It seems tc filters do not support a drop action based on a match, only on QoS parameters. Has anyone been able to set up tc drop filters?
The most common method I've found thus far is to mark the packet using tc and then use iptables to drop the marked packet, but that is not as efficient in my opinion.
tc filter does support a drop action based on a match. This is actually more straightforward than I anticipated.
An example below would drop all IP GRE traffic on interface eth3
# add an ingress qdisc
tc qdisc add dev eth3 ingress
# filter on ip GRE traffic (protocol 47)
tc filter add dev eth3 parent ffff: protocol ip prio 6 u32 match ip protocol 47 0xff flowid 1:16 action drop
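To verify that the filter is matching and dropping, the hit counters can be inspected (a small check, not part of the original answer):
# show filter statistics on the ingress qdisc
tc -s filter show dev eth3 parent ffff: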

How to make sure packets from the same flow land on the same queue on two NICs when bridging

I'm writing a network bridge that reassembles and analyzes TCP flows on the fly. I have a pair of multi-queue NICs, and I use netmap to capture packets from each rx queue on a different thread and then pass them on to the other NIC for transmission. The problem is that packets from the same flow do not land on the same queue on the two NICs, because the source and destination addresses and ports are reversed in the two directions.
I tried ethtool to change the tuple whose hash is used for distributing packets. Running this:
# ethtool --show-nfc p1p1 rx-flow-hash tcp4
results in:
TCP over IPV4 flows use these fields for computing Hash flow key:
IP SA
IP DA
L4 bytes 0 & 1 [TCP/UDP src port]
L4 bytes 2 & 3 [TCP/UDP dst port]
Repeating the above command for the other NIC gives the same output. I thought I could change the order for one of the rings by running ethtool --config-nfc p1p1 rx-flow-hash tcp4 dsfn, but that doesn't change the order of the fields used in the hash. It seems that both sdfn and dsfn result in the same tuple. The same is true for ds and sd.
Is there a way to make the packets in one flow always land on the same queue (and the same thread) on both NICs whether using ethtool to configure the NICs or otherwise another method or tool?
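One technique worth trying (not from the original thread, and assuming both NIC drivers support setting the RSS hash key via ethtool -X): program a symmetric hash key, such as the 0x6d5a byte pair repeated, so that hash(src, dst) equals hash(dst, src) and both directions of a flow hash to the same queue index on both NICs. The key length (40 bytes below) must match what ethtool -x reports for the device:
# set a symmetric 40-byte RSS key (repeat for the second NIC)
ethtool -X p1p1 hkey 6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a:6d:5a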

Jumps in firewall rule sets

I have a general question regarding software-based firewalls.
Specifically, I would like to know whether there are firewalls other than iptables that allow the specification of jumps inside the rule set.
In iptables, users can specify "jumps" inside the rule set by targeting a specific chain when a rule matches a packet.
For example, in the following rule set
(1) iptables -A INPUT --src 1.2.3.4 -j ACCEPT
(2) iptables -A INPUT --src 1.2.3.5 -j ACCEPT
(3) iptables -A INPUT --src 1.2.3.6 -j ACCEPT
(4) iptables -A INPUT --src 8.8.8.8 -j NEXT_CHAIN
(5) iptables -A INPUT --src 2.2.2.2 -j ACCEPT
(6) iptables -A INPUT --src 2.2.2.3 -j ACCEPT
<NEXT_CHAIN starts here ...>
rule (4) redirects packet processing to another rule set named "NEXT_CHAIN".
In other words, rules (5) and (6) are skipped (at least when something in NEXT_CHAIN matches and issues a final verdict).
I think this was also possible in iptables' predecessor, ipchains.
Do you know whether there are any other firewalls that provide a similar feature?
The other main competitor to iptables is pf, which has similar capabilities to iptables.
Linux firewalls are built around Netfilter, the kernel's network packet processing framework, which is made of several kernel modules performing specific tasks, such as:
The FILTER module (always loaded by default) mainly allows us to ACCEPT or DROP IP packets based on certain matching criteria.
The NAT module set allows us to perform Network Address Translation (SNAT, DNAT, MASQUERADE).
The MANGLE module allows us to alter certain IP packet fields (TOS, TTL).
Users configure the Netfilter framework ("kernel mode") to suit their firewall needs using iptables, a "userland" application run from the command line. With iptables we define rules that instruct the Linux kernel what to do with IP packets when they arrive into, pass through, or leave our Linux box.
All Linux-based firewalls are based on Netfilter, and most of them use iptables as a way to control it.
Different technologies use a similar strategy; e.g. in BSD (OpenBSD) the kernel module is called PF (Packet Filter) and the "userland" application for controlling PF is called pfctl.
Depending on which technology you use, you have one or the other; both systems do basically the same thing, and of course both can perform the jumps you mention. Remember that a firewall on Linux or BSD is just a set of rules, loaded by the corresponding userland application, that sets the behavior of the corresponding network traffic control engine in the kernel.
Also consider that when you jump into a user-defined chain, you can also RETURN to the calling chain, as sketched below.
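A hedged illustration of that behaviour (the chain name and match are made up; RETURN resumes processing at the rule right after the one that jumped):
iptables -N NEXT_CHAIN
iptables -A NEXT_CHAIN -p udp --dport 53 -j ACCEPT
# anything not accepted above falls back to the calling chain,
# i.e. processing resumes at rule (5) of INPUT in the example above
iptables -A NEXT_CHAIN -j RETURN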
I did some research on other packet filtering systems, and I found out the following:
OpenBSD's pf can implement some sort of control using conditional anchors:
EXAMPLE: anchor udp-only in on fxp0 inet proto udp
The OpenFlow switch specification provides direct jumps via its goto-table instruction.
FreeBSD's ipfw provides the skipto action (example below).
Each of these features makes it possible to modify the control flow during packet classification, and each can be used to implement jump semantics.
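For illustration, a skipto rule in ipfw syntax (the rule numbers and addresses are arbitrary):
# rule 100 jumps over everything up to rule 1000 for this source
ipfw add 100 skipto 1000 ip from 8.8.8.8 to any
ipfw add 200 allow ip from 2.2.2.2 to any
ipfw add 1000 count ip from any to any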

join igmp_group not working in lightweight IP (lwIP)

I'm new to lwIP, and I want to create a multicast receiver with lwIP. My steps are as follows:
1. Enable LWIP_IGMP;
2. Set NETIF_FLAG_IGMP in low_level_init();
3. Join multicast group, create and bind pcb;
4. udp_connect to the remote IP (or the multicast IP address? I tried both, but both failed)
Joining the group returns success, and everything looks fine while the program executes this. However, the multicast receiver doesn't work; no multicast data comes into the network interface. It seems I don't actually join my receiver to the IGMP group, although the joining process looks fine. Does anyone know what I'm missing?
I found the check "netif->igmp_mac_filter != NULL" in igmp_joingroup(), but this callback is set to NULL and not implemented. Do I need to implement it myself to set the MAC filter, or is it OK to just leave it as NULL?
Thanks a lot for your help!
Ryan
When you join a multicast group, the netif->igmp_mac_filter callback is typically called to configure a MAC filter in your Ethernet controller so that it accepts packets with the multicast MAC address corresponding to the group. So, depending on the Ethernet hardware you are using, you may need to implement the callback.
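A minimal sketch of such a callback (assuming the lwIP 2.x signature; eth_hw_add_mac_filter()/eth_hw_del_mac_filter() are hypothetical placeholders for your controller's filter-programming routines):
static err_t my_igmp_mac_filter(struct netif *netif, const ip4_addr_t *group,
                                enum netif_mac_filter_action action)
{
    u8_t mac[6];
    u32_t g = lwip_ntohl(ip4_addr_get_u32(group));
    LWIP_UNUSED_ARG(netif);

    /* Map the IPv4 group address to its multicast MAC:
       01:00:5e followed by the low 23 bits of the group address. */
    mac[0] = 0x01; mac[1] = 0x00; mac[2] = 0x5e;
    mac[3] = (g >> 16) & 0x7f;
    mac[4] = (g >> 8) & 0xff;
    mac[5] = g & 0xff;

    if (action == NETIF_ADD_MAC_FILTER) {
        eth_hw_add_mac_filter(mac);   /* hypothetical driver call */
    } else {
        eth_hw_del_mac_filter(mac);   /* hypothetical driver call */
    }
    return ERR_OK;
}

/* Register it during interface setup, e.g. in low_level_init():
   netif_set_igmp_mac_filter(netif, my_igmp_mac_filter); */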
The hardware needs to be configured to receive multicast MAC frames; otherwise it will simply discard all frames with a multicast destination address. There is probably an option to accept all incoming multicast frames. Enable that in low_level_init() and you should be able to see the incoming multicast frames. You shouldn't need to implement any filter.
I had the same problem. I solved it by removing the ETH multicast frame filter in the init of the MAC interface.
To test, you can also put the interface in promiscuous mode, check whether the multicast packets are received, and then remove promiscuous mode and set an appropriate multicast frame filtering mode according to your needs.
I set the code for the multicast frame filter as follows (reading back the current configuration first, so that the remaining filter fields are not left uninitialized):
/* USER CODE BEGIN PHY_PRE_CONFIG */
ETH_MACFilterConfigTypeDef FilterConfig;
HAL_ETH_GetMACFilterConfig(&heth, &FilterConfig); /* start from the current settings */
FilterConfig.PromiscuousMode = 1;   /* for testing only; disable once filtering works */
FilterConfig.PassAllMulticast = 1;  /* accept all incoming multicast frames */
HAL_ETH_SetMACFilterConfig(&heth, &FilterConfig);
/* USER CODE END PHY_PRE_CONFIG */

libpcap setfilter() function and packet loss

This is my first question here on Stack Overflow.
I'm writing a monitoring tool for some VoIP production servers, in particular a sniffing tool that captures all traffic (VoIP calls) matching a given pattern, using the pcap library in Perl.
I cannot use poorly selective filters like e.g. "udp" and then do all the filtering in my app's code, because that would involve too much traffic and the kernel wouldn't cope, reporting packet loss.
What I do instead is iteratively build the most selective filter possible during the capture. At the beginning I capture only (all) SIP signalling traffic and IP fragments (the pattern matching has to be done at the application level in any case); then, when I find information about RTP in the SIP packets, I add 'or' clauses to the current filter string with the specific IP and PORT and re-set the filter with setfilter().
So basically something like this:
Initial filter : "(udp and port 5060) or (udp and ip[6:2] & 0x1fff != 0)" -> captures all SIP traffic and IP fragments
Updated filter : "(udp and port 5060) or (udp and ip[6:2] & 0x1fff != 0) or (host IP and port PORT)" -> Captures also the RTP on specific IP,PORT
Updated filter : "(udp and port 5060) or (udp and ip[6:2] & 0x1fff != 0) or (host IP and port PORT) or (host IP2 and port PORT2)" -> Captures a second RTP stream as well
And so on.
This works quite well, as I'm able to get the "real" packet loss of the RTP streams for monitoring purposes, whereas with a poorly selective version of my tool the RTP packet-loss percentage wasn't reliable, because some packets were missing due to packet drops in the kernel.
But let's get to the drawback of this approach.
Calling setfilter() while capturing means that libpcap drops the packets received "while changing the filter", as stated in the code comments for the function set_kernel_filter() in pcap-linux.c (checked in libpcap versions 0.9 and 1.1).
So what happens is that when I call setfilter() and some packets arrive IP-fragmented, I lose some fragments, and this is not reported in the libpcap statistics at the end: I spotted it by digging into traces.
Now, I understand why libpcap does this, but in my case I definitely cannot afford any packet drop (I don't care about getting some unrelated traffic).
Do you have any idea how to solve this problem without modifying libpcap's code?
What about starting up a new capture with the more specific filter? You could have two parallel pcap captures going at once. After some time (or after checking that both are receiving the same packets) you could stop the original.
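A rough C sketch of that idea against the libpcap API (the device name and filter strings are placeholders; the Perl pcap binding wraps these same calls). The new handle is opened and filtered before the old one is closed, so the two capture windows overlap and duplicates can be discarded by the application:
#include <pcap.h>
#include <stdio.h>

/* Open a capture handle with the given filter already applied. */
static pcap_t *open_with_filter(const char *dev, const char *filter)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    struct bpf_program prog;
    pcap_t *p = pcap_open_live(dev, 65535, 1, 100, errbuf);

    if (p == NULL) {
        fprintf(stderr, "pcap_open_live: %s\n", errbuf);
        return NULL;
    }
    if (pcap_compile(p, &prog, filter, 1, PCAP_NETMASK_UNKNOWN) == -1) {
        fprintf(stderr, "pcap_compile: %s\n", pcap_geterr(p));
        pcap_close(p);
        return NULL;
    }
    if (pcap_setfilter(p, &prog) == -1) {
        fprintf(stderr, "pcap_setfilter: %s\n", pcap_geterr(p));
        pcap_freecode(&prog);
        pcap_close(p);
        return NULL;
    }
    pcap_freecode(&prog);
    return p;
}

/* Usage sketch:
   pcap_t *old = open_with_filter("eth0", "udp and port 5060");
   pcap_t *new = open_with_filter("eth0",
       "udp and port 5060 or (host IP and port PORT)");
   ... read from both, drop duplicates in the overlap, then pcap_close(old); */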
Can you just capture all RTP traffic?
From the capture filters page, the suggested filter for RTP traffic is:
udp[1] & 1 != 1 && udp[3] & 1 != 1 && udp[8] & 0x80 == 0x80 && length < 250
As the link points out, you will get a few false positives where DNS and possibly other UDP packets occasionally contain the header byte, 0x80, used by RTP packets; however, the number should be negligible and not enough to cause kernel drops.
Round hole, square peg.
You have a tool that doesn't quite fit your need.
Another option is to do a first-level filter (as above, one that captures a lot more than wanted) and pipe it into another tool that implements the finer filtering you want (down to the per-call case). If that first-level filter is still too much for the kernel due to heavy RTP traffic, then you may need to do something else, like keeping a stable of processes to capture individual calls (so you're not changing the filter on the "main" process; it simply instructs the others how to set their filters).
Yes, this may mean merging captures, either on the fly (passing them all to a "save the capture" process) or after the fact.
You do realize that you may well miss RTP packets anyway if you don't install your filters fast. Don't forget that RTP packets can come in for the originator before the 200 OK arrives (or right together with it), and they may go back to the answerer before the ACK (or on top of it). Also don't forget INVITE with no SDP (offer in the 200 OK, answer in the ACK). Etc, etc. :-)
