I have four tap interfaces. tap0 and tap1 are connected to each other, and so are tap2 and tap3:
vde_switch -d -tap tap0 -tap tap1 click
vde_switch -d -tap tap2 -tap tap3 --sock /run/vde.ctl/ctl2
I then assigned IP addresses to tap1 and tap2:
ip addr add 1.1.1.1/24 dev tap1
ip addr add 1.2.1.1/24 dev tap2
From a raw-socket application, I sent a UDP packet out of tap0 with source IP 1.1.1.3 and destination IP 1.2.1.3, and it arrived at tap3 (according to Wireshark).
The problem is that if I send a fragmented IP/UDP packet, Linux doesn't forward it to tap3.
I checked the first fragment of the IP packet: its checksum and destination MAC address are both correct. The odd thing is that if I clear the "more fragments" bit in the IP header (which also changes the IP checksum), the packet gets forwarded.
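For anyone who wants to double-check a captured fragment by hand, here is a minimal sketch that decodes the flags/offset word of an IPv4 header and verifies the header checksum. The helper names (ipv4_checksum, describe_fragment, build_header) are made up for illustration, and build_header only produces a plain 20-byte header without options; it is not the code from the question.

```python
import socket
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over the header (RFC 1071 style).

    Returns 0 when the header already carries a valid checksum.
    """
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def describe_fragment(header: bytes) -> dict:
    """Decode the fragmentation-related fields of an IPv4 header."""
    ihl = (header[0] & 0x0F) * 4
    total_len, ident, flags_frag = struct.unpack("!HHH", header[2:8])
    return {
        "header_len": ihl,
        "total_len": total_len,
        "id": ident,
        "dont_fragment": bool(flags_frag & 0x4000),
        "more_fragments": bool(flags_frag & 0x2000),
        "offset_bytes": (flags_frag & 0x1FFF) * 8,
        "checksum_ok": ipv4_checksum(header[:ihl]) == 0,
    }


def build_header(src, dst, payload_len, ident, more_fragments, offset_bytes):
    """Build a 20-byte IPv4/UDP header (hypothetical helper, no options)."""
    flags_frag = (0x2000 if more_fragments else 0) | (offset_bytes // 8)
    fields = [0x45, 0, 20 + payload_len, ident, flags_frag, 64, 17, 0,
              socket.inet_aton(src), socket.inet_aton(dst)]
    header = struct.pack("!BBHHHBBH4s4s", *fields)
    fields[7] = ipv4_checksum(header)  # fill in the checksum field
    return struct.pack("!BBHHHBBH4s4s", *fields)
```

Feeding the first 20 bytes of the IP header from the pcap into describe_fragment shows at a glance whether the MF bit, the offset, and the checksum are what you expect.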
By the way, I am using Linux 3.19.0-65 on a 64-bit laptop.
Any idea why? Thanks a lot!
EDIT1
Here is the output of ip route list
default via 10.0.0.1 dev wlan0 proto static
1.1.1.0/24 dev tap1 proto kernel scope link src 1.1.1.1
1.2.1.0/24 dev tap2 proto kernel scope link src 1.2.1.1
10.0.0.0/24 dev wlan0 proto kernel scope link src 10.0.0.3 metric 9
172.16.83.0/24 dev vmnet1 proto kernel scope link src 172.16.83.1
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.181.0/24 dev vmnet8 proto kernel scope link src 192.168.181.1
EDIT2
Here is the link to the pcap of the IP fragment packet, captured on tap0 interface.
I am attempting to validate ECMP functionality on a Linux host with unnumbered interfaces and network namespaces.
The following example can be used to demonstrate:
# add address to loopback for unnumbered veth interfaces
ip addr add 198.51.100.0/32 dev lo
# namespace 1
ip netns add ns1
ip link add veth100 type veth peer name veth101
ip link set veth100 up
ip link set veth101 netns ns1
ip netns exec ns1 ip link set veth101 name eth0
ip netns exec ns1 ip addr add 192.0.2.1/32 dev eth0
ip netns exec ns1 ip link set eth0 up
ip netns exec ns1 ip route add 198.51.100.0/32 dev eth0
ip netns exec ns1 ip route add 0.0.0.0/0 via 198.51.100.0
ip route add 192.0.2.1/32 dev veth100
# namespace 2
ip netns add ns2
ip link add veth200 type veth peer name veth201
ip link set veth200 up
ip link set veth201 netns ns2
ip netns exec ns2 ip link set veth201 name eth0
ip netns exec ns2 ip addr add 192.0.2.2/32 dev eth0
ip netns exec ns2 ip link set eth0 up
ip netns exec ns2 ip route add 198.51.100.0/32 dev eth0
ip netns exec ns2 ip route add 203.0.113.0/32 dev eth0
ip netns exec ns2 ip route add 0.0.0.0/0 via 198.51.100.0
ip route add 192.0.2.2/32 dev veth200
# anycast / ecmp setup
ip netns exec ns1 ip addr add 203.0.113.0/32 dev lo
ip netns exec ns1 ip link set dev lo up
ip netns exec ns2 ip addr add 203.0.113.0/32 dev lo
ip netns exec ns2 ip link set dev lo up
ip route append 203.0.113.0/32 nexthop via 192.0.2.1 weight 100
ip route append 203.0.113.0/32 nexthop via 192.0.2.2 weight 100
I can see that I have two routes in my routing table:
$ ip route show
...
203.0.113.0 via 192.0.2.1 dev veth100 onlink
203.0.113.0 via 192.0.2.2 dev veth200 onlink
...
Ping to 203.0.113.0 works (as expected):
$ ping 203.0.113.0 -c 2
PING 203.0.113.0 (203.0.113.0) 56(84) bytes of data.
64 bytes from 203.0.113.0: icmp_seq=1 ttl=64 time=0.096 ms
64 bytes from 203.0.113.0: icmp_seq=2 ttl=64 time=0.079 ms
--- 203.0.113.0 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1024ms
rtt min/avg/max/mdev = 0.079/0.087/0.096/0.008 ms
I can set either veth100 or veth200 down and achieve failover. However, the load does not appear to be shared across veth100 and veth200 at the same time. I verified this by running tcpdump on both veth100 and veth200 simultaneously.
Experimenting, I've tried adding the ECMP route this way:
ip route add 203.0.113.0/32 nexthop via 192.0.2.2 weight 10 nexthop via 192.0.2.1 weight 10
The route appears to be installed differently. I'm not sure what the difference is in reality, but it looks different.
$ ip route show
...
203.0.113.0
nexthop via 192.0.2.2 dev veth200 weight 10
nexthop via 192.0.2.1 dev veth100 weight 10
...
But, this still has the same problem as mentioned above.
I'm not sure what next steps to take. What am I doing wrong? Is there any way to achieve ECMP load sharing in this scenario?
If you're only testing with ICMP pings, this behaviour is expected. ECMP's 5-tuple hash (source IP + source port + destination IP + destination port + protocol) has nothing to vary for ICMP, since ICMP doesn't use port numbers, so you'll always hit the same host.
Experiment with multiple UDP and TCP flows and you should see the load-balancing effect, since at least the source ports should be ephemeral (unlike the well-known destination service ports).
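To make the point concrete, here is a toy model of flow hashing. It is only a sketch: pick_nexthop and spread are made-up names, SHA-256 stands in for whatever hash the kernel really uses, and whether a given kernel includes ports in its multipath hash at all depends on its version and configuration. The addresses are the ones from the question.

```python
import hashlib
from collections import Counter

# The two nexthops from the question.
NEXTHOPS = ["192.0.2.1", "192.0.2.2"]


def pick_nexthop(src_ip, dst_ip, proto, src_port=0, dst_port=0,
                 nexthops=NEXTHOPS):
    """Toy flow hash: NOT the kernel's real algorithm, just the idea.

    The same 5-tuple always maps to the same nexthop.
    """
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return nexthops[int.from_bytes(digest[:4], "big") % len(nexthops)]


def spread(src_ports, src_ip="198.51.100.0", dst_ip="203.0.113.0",
           dst_port=5555):
    """Count how many UDP flows (protocol 17) land on each nexthop."""
    return Counter(pick_nexthop(src_ip, dst_ip, 17, port, dst_port)
                   for port in src_ports)
```

With ICMP (protocol 1, no ports) every ping between the same two addresses produces the same key and therefore the same nexthop, while spread(range(40000, 40200)) distributes the flows over both nexthops.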
BTW, thanks for spelling out the steps you took. I'm currently experimenting with the same concepts in order to replace the K8S network mess with simple load-balanced, routed, IPv6-only networking.
I have the following socket programming code. My client program runs on a VM on a desktop machine using VirtualBox, and my server program runs on a university cluster VM. The client is unable to send data to the server. Both the client and the server program run inside Docker containers.
Running the client and server containers:
docker run --rm -it -p 192.168.56.110:5555:5555 client bash
docker run --rm -it -p 192.168.101.238:5555:5555 server bash
client.py
import zmq

context = zmq.Context()
print("Connecting")
socket = context.socket(zmq.REQ)
socket.connect("tcp://192.168.101.238:5555")
name = "Max"
while True:
    message = input("Message: ")
    # Send {1: [name, message]} and block until the server replies.
    socket.send_pyobj({1: [name, message]})
    message2 = socket.recv_pyobj()
    print("%s:%s" % (message2.get(1)[0], message2.get(1)[1]))
server.py
import zmq

context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://0.0.0.0:5555")
while True:
    # Receive a request, print it, and echo it back to the client.
    message = socket.recv_pyobj()
    print("%s:%s" % (message.get(1)[0], message.get(1)[1]))
    socket.send_pyobj({1: [message.get(1)[0], message.get(1)[1]]})
ip route on the client VM:
default via 172.27.248.1 dev tun0 proto static metric 50
default via 10.0.2.2 dev enp0s3 proto dhcp metric 100
10.0.2.0/24 dev enp0s3 proto kernel scope link src 10.0.2.15 metric 100
10.0.2.2 dev enp0s3 proto static scope link metric 100
143.117.101.145 via 10.0.2.2 dev enp0s3 proto static metric 100
169.254.0.0/16 dev enp0s8 scope link metric 1000
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
172.18.0.0/16 dev br-1f684a10d7c8 proto kernel scope link src 172.18.0.1 linkdown
172.27.248.0/22 dev tun0 proto kernel scope link src 172.27.250.80 metric 50
192.168.56.0/24 dev enp0s8 proto kernel scope link src 192.168.56.110
ip route on the server:
default via 192.168.101.254 dev ens160 proto dhcp src 192.168.101.238 metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.101.0/24 dev ens160 proto kernel scope link src 192.168.101.238
192.168.101.254 dev ens160 proto dhcp scope link src 192.168.101.238 metric 100
The client side is stuck: it does not send the data to the server, and the server does not receive anything.
Any help is highly appreciated, thanks.
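One way to narrow this down is to check plain TCP reachability before involving ZeroMQ at all. A REQ socket connects in the background, so an unreachable endpoint just looks like a hang. The sketch below uses only the standard library; can_connect is a made-up helper name, and the address and port are the ones from the question.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


# Example, run from inside the client container:
#   print(can_connect("192.168.101.238", 5555))
```

If this returns False from inside the client container, the problem is in routing, firewalling, or port publishing rather than in the ZeroMQ code.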
My server has 5 different external IPs (all working).
I added them using:
ip addr add xx.xx.xx.xx/32 dev eth0
ip addr add yy.yy.yy.yy/32 dev eth0
ip addr add zz.zz.zz.zz/32 dev eth0
How can I tell curl to use a specific one of these IP addresses, for example zz.zz.zz.zz?
You should be able to use
curl --interface zz.zz.zz.zz http://example.com/
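If you need the same thing from a script rather than from curl, the equivalent idea is to bind the socket to the desired local address before connecting. This is a minimal sketch using the standard library; connect_from is a made-up helper name, and zz.zz.zz.zz stays a placeholder exactly as in the question.

```python
import socket


def connect_from(source_ip: str, host: str, port: int, timeout: float = 5.0):
    """Open a TCP connection whose local (source) address is source_ip."""
    # Port 0 lets the kernel pick an ephemeral source port.
    return socket.create_connection(
        (host, port), timeout=timeout, source_address=(source_ip, 0)
    )


# Example (placeholder address as in the question):
#   conn = connect_from("zz.zz.zz.zz", "example.com", 80)
```

The source address must already be configured on the machine, as done with the ip addr add commands above; otherwise the bind fails.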
I am trying to change the TCP initial congestion window (initcwnd).
First, when I run ip route show, it prints:
10.61.0.0/24 dev eth0 proto kernel scope link src 10.61.0.241
169.254.0.0/16 dev eth0 scope link metric 1002
default via 10.61.0.254 dev eth0 proto static
So I run
sudo ip route change default via 10.61.0.254 dev eth0 proto static initcwnd 10
to change initcwnd to 10.
After that, I run ip route show again:
10.61.0.0/24 dev eth0 proto kernel scope link src 10.61.0.241
169.254.0.0/16 dev eth0 scope link metric 1002
default via 10.61.0.254 dev eth0 proto static initcwnd 10
It seems to work, but after a reboot the value is not preserved:
10.61.0.0/24 dev eth0 proto kernel scope link src 10.61.0.241
169.254.0.0/16 dev eth0 scope link metric 1002
default via 10.61.0.254 dev eth0 proto static
What should I do?
My OS version info:
Linux version 2.6.32-358.18.1.el6.x86_64 (mockbuild@c6b10.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1
You can add the ip route commands to /etc/rc.d/rc.local so that they are applied at boot time.
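If the gateway or interface can change between boots, hard-coding the command may not be ideal. As a rough sketch, a small script called from rc.local could derive the command from the current default route instead. initcwnd_command is a made-up helper name, and the script assumes the ip route show output format shown in the question.

```python
def initcwnd_command(route_output: str, initcwnd: int = 10) -> list:
    """Build an 'ip route change' argv from 'ip route show' output."""
    for line in route_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "default":
            # Drop any existing initcwnd setting before appending the new one.
            if "initcwnd" in fields:
                i = fields.index("initcwnd")
                del fields[i:i + 2]
            return ["ip", "route", "change", *fields,
                    "initcwnd", str(initcwnd)]
    raise ValueError("no default route found")


# Example, to be run as root at boot:
#   import subprocess
#   out = subprocess.run(["ip", "route", "show"], capture_output=True,
#                        text=True, check=True).stdout
#   subprocess.run(initcwnd_command(out), check=True)
```

For the routing table in the question this produces the same ip route change default via 10.61.0.254 dev eth0 proto static initcwnd 10 command that was run by hand.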
Kernel 2.6.32 does not have Miller's patch (https://lwn.net/Articles/426883/) for an initcwnd of 10.
You can also use ip tcp_metrics or ss to see more information on a per-socket/stream basis.
I am using Ubuntu 12.04. I assigned two IP addresses to the Ethernet card by editing /etc/network/interfaces. It now looks like this (skipping lines not related to the question):
auto eth0
iface eth0 inet static
address 192.168.60.23
netmask 255.255.255.0
gateway 192.168.60.1
up route add 192.168.60.1 dev eth0
up route add 10.0.1.1 dev eth0
up route add 192.168.60.151 gw 10.0.1.1
auto eth0:1
iface eth0:1 inet static
address 192.168.60.101
netmask 255.255.255.0
Now, however, I would like packets going to 192.168.60.151 to leave my machine with the second IP address (192.168.60.101) as the source address.
I tried adding src 192.168.60.101 to the corresponding up route line, but it didn't work. I also tried moving this line to the eth0:1 block, but that didn't work either. When I execute ip route get 192.168.60.151, I always get 192.168.60.151 via 10.0.1.1 dev eth0 src 192.168.60.21.
I searched online but couldn't find out how to modify the source address of outgoing packets.