Could we use DNS round robin with nscd's DNS cache?

I am trying to use DNS round robin together with nscd's DNS cache, but I am not sure about the following:
whether nscd respects the DNS record TTL in its replies
whether the traffic from clients running nscd is distributed equally across the servers behind the domain name
Is it possible to use DNS round robin with nscd?

Summary
Yes, we can. But the traffic can be distributed unequally, which puts a somewhat higher load on some of the servers behind the domain name and makes server resource usage less efficient.
nscd respects the TTL of the DNS record, but a TTL shorter than 15s effectively behaves like 15s, because nscd prunes its cache at intervals of at least 15s, defined as CACHE_PRUNE_INTERVAL in nscd/nscd.h.
Because of this CACHE_PRUNE_INTERVAL, DNS round robin can distribute traffic unequally across the servers behind the domain.
The imbalance is made worse by clients using keep-alive.
The imbalance is reduced by a larger number of clients. (A quick way to observe the pruning interval is sketched right after this list.)
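A minimal way to observe the pruning behaviour (a sketch; test-nscd.apps.com is the test record used below, so adjust the name and interface to your own setup):
sudo tcpdump -nn -vvv src port 53 &                                # watch replies from the upstream resolver
while true; do getent hosts test-nscd.apps.com > /dev/null; sleep 1; done
# Even with a record TTL of 1~5s, a new upstream reply should show up only about
# every 15s, which matches nscd's CACHE_PRUNE_INTERVAL.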
In detail
Environment
Network topology
Centos 7.9
nscd (GNU libc) 2.17
locust 2.8.6 in master-worker mode across several servers: 1~60 workers, a single master, each worker on its own server
An A record test-nscd.apps.com pointing to two servers (PM1, PM2), with its TTL varied from 1s to 60s
/etc/nscd.conf
#
# /etc/nscd.conf
#
# An example Name Service Cache config file. This file is needed by nscd.
#
# Legal entries are:
#
# logfile <file>
# debug-level <level>
# threads <initial #threads to use>
# max-threads <maximum #threads to use>
# server-user <user to run server as instead of root>
# server-user is ignored if nscd is started with -S parameters
# stat-user <user who is allowed to request statistics>
# reload-count unlimited|<number>
# paranoia <yes|no>
# restart-interval <time in seconds>
#
# enable-cache <service> <yes|no>
# positive-time-to-live <service> <time in seconds>
# negative-time-to-live <service> <time in seconds>
# suggested-size <service> <prime number>
# check-files <service> <yes|no>
# persistent <service> <yes|no>
# shared <service> <yes|no>
# max-db-size <service> <number bytes>
# auto-propagate <service> <yes|no>
#
# Currently supported cache names (services): passwd, group, hosts
#
# logfile /var/log/nscd.log
# threads 6
# max-threads 128
server-user nscd
# stat-user nocpulse
debug-level 0
# reload-count 5
paranoia no
# restart-interval 3600
enable-cache passwd yes
positive-time-to-live passwd 600
negative-time-to-live passwd 20
suggested-size passwd 211
check-files passwd yes
persistent passwd yes
shared passwd yes
max-db-size passwd 33554432
auto-propagate passwd yes
enable-cache group yes
positive-time-to-live group 3600
negative-time-to-live group 60
suggested-size group 211
check-files group yes
persistent group yes
shared group yes
max-db-size group 33554432
auto-propagate group yes
enable-cache hosts yes
positive-time-to-live hosts 300
negative-time-to-live hosts 20
suggested-size hosts 211
check-files hosts yes
persistent hosts yes
shared hosts yes
max-db-size hosts 33554432
What experiments I did
Sending traffic to test-nscd.apps.com with TTL 1~60s from 1 locust worker, and checking how the traffic is distributed between PM1 and PM2
Sending traffic to test-nscd.apps.com with TTL 1s from 1~60 locust workers, and checking how the traffic is distributed between PM1 and PM2
Sending traffic to test-nscd.apps.com with TTL 1s from 1~60 locust workers using keep-alive, and checking how the traffic is distributed between PM1 and PM2
The test results
1. Sending traffic to test-nscd.apps.com with TTL 1~60s from 1 locust worker, and checking how traffic is distributed between PM1 and PM2
TTL 60s
Traffic is distributed, but not equally.
Using tcpdump -vvv src port 53 you can see the client (worker) gets a DNS reply from the DNS server every 60~75s.
14:37:55.116675 IP (tos 0x80, ttl 49, id 41538, offset 0, flags [none], proto UDP (17), length 111)
10.230.167.65.domain > test-client.39956: [udp sum ok] 9453 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [1m] A 10.130.248.64, test-nscd.apps.com. [1m] A 10.130.248.63 (83)
--
14:39:10.121451 IP (tos 0x80, ttl 49, id 20047, offset 0, flags [none], proto UDP (17), length 111)
10.230.167.65.domain > test-client.55173: [udp sum ok] 6722 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [1m] A 10.130.248.63, test-nscd.apps.com. [1m] A 10.130.248.64 (83)
--
14:40:25.120127 IP (tos 0x80, ttl 49, id 28851, offset 0, flags [none], proto UDP (17), length 111)
10.230.167.65.domain > test-client.39461: [udp sum ok] 40481 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [1m] A 10.130.248.63, test-nscd.apps.com. [1m] A 10.130.248.64 (83)
--
TTL 30s
Traffic is distributed, but not equally, because the TTL is too large.
You can see the clients get a DNS reply from the DNS server every 30~45s.
16:14:04.359901 IP (tos 0x80, ttl 49, id 39510, offset 0, flags [none], proto UDP (17), length 111)
10.230.167.65.domain > test-client.51466: [udp sum ok] 43607 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [5s] A 10.130.248.63, test-nscd.apps.com. [5s] A 10.130.248.64 (83)
--
16:14:19.361964 IP (tos 0x80, ttl 49, id 3196, offset 0, flags [none], proto UDP (17), length 111)
10.230.167.65.domain > test-client.39370: [udp sum ok] 62519 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [5s] A 10.130.248.63, test-nscd.apps.com. [5s] A 10.130.248.64 (83)
--
16:14:34.364359 IP (tos 0x80, ttl 49, id 27647, offset 0, flags [none], proto UDP (17), length 111)
10.230.167.65.domain > test-client.49659: [udp sum ok] 51890 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [5s] A 10.130.248.64, test-nscd.apps.com. [5s] A 10.130.248.63 (83)
--
--
TTL 15s
Traffic is distributed, but not equally.
However, the traffic is distributed more equally than in the TTL 45s case.
You can see the clients get a DNS reply from the DNS server every 15~30s.
15:45:04.141762 IP (tos 0x80, ttl 49, id 30678, offset 0, flags [none], proto UDP (17), length 111)
10.230.167.65.domain > test-client.35411: [udp sum ok] 63073 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [15s] A 10.130.248.63, test-nscd.apps.com. [15s] A 10.130.248.64 (83)
--
15:45:34.191159 IP (tos 0x80, ttl 49, id 48496, offset 0, flags [none], proto UDP (17), length 111)
10.230.167.65.domain > test-client.52441: [udp sum ok] 24183 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [15s] A 10.130.248.63, test-nscd.apps.com. [15s] A 10.130.248.64 (83)
--
15:46:04.192905 IP (tos 0x80, ttl 49, id 32793, offset 0, flags [none], proto UDP (17), length 111)
10.230.167.65.domain > test-client.49875: [udp sum ok] 59065 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [15s] A 10.130.248.63, test-nscd.apps.com. [15s] A 10.130.248.64 (83)
--
--
TTL 5s
Traffic is distributed, but not equally.
However, the traffic is distributed more equally than in the TTL 30s case.
You can see the clients get a DNS reply from the DNS server every 15s, even though the TTL is 5s.
16:14:04.359901 IP (tos 0x80, ttl 49, id 39510, offset 0, flags [none], proto UDP (17), length 111)
10.230.167.65.domain > test-client.51466: [udp sum ok] 43607 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [5s] A 10.130.248.63, test-nscd.apps.com. [5s] A 10.130.248.64 (83)
--
16:14:19.361964 IP (tos 0x80, ttl 49, id 3196, offset 0, flags [none], proto UDP (17), length 111)
10.230.167.65.domain > test-client.com.39370: [udp sum ok] 62519 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [5s] A 10.130.248.63, test-nscd.apps.com. [5s] A 10.130.248.64 (83)
--
16:14:34.364359 IP (tos 0x80, ttl 49, id 27647, offset 0, flags [none], proto UDP (17), length 111)
10.230.167.65.domain > test-client.com.49659: [udp sum ok] 51890 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [5s] A 10.130.248.64, test-nscd.apps.com. [5s] A 10.130.248.63 (83)
--
--
TTL 1s
Traffic is distributed, but not equally.
The result is similar to the TTL 5s case.
You can see the clients get a DNS reply from the DNS server every 15s, even though the TTL is 1s, just as in the TTL 5s case.
16:43:27.814701 IP (tos 0x80, ttl 49, id 28956, offset 0, flags [none], proto UDP (17), length 111)
10.230.167.65.domain > test-client.49891: [udp sum ok] 22634 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [1s] A 10.130.248.63, test-nscd.apps.com. [1s] A 10.130.248.64 (83)
--
16:43:42.816721 IP (tos 0x80, ttl 49, id 27128, offset 0, flags [none], proto UDP (17), length 111)
10.230.167.65.domain > test-client.34490: [udp sum ok] 37589 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [1s] A 10.130.248.63, test-nscd.apps.com. [1s] A 10.130.248.64 (83)
--
16:43:57.842106 IP (tos 0x80, ttl 49, id 60723, offset 0, flags [none], proto UDP (17), length 111)
10.230.167.65.domain > test-client.55185: [udp sum ok] 1139 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [1s] A 10.130.248.63, test-nscd.apps.com. [1s] A 10.130.248.64 (83)
2. Sending traffic to test-nscd.apps.com with TTL 1s from 1~60 locust workers, and checking how traffic is distributed between PM1 and PM2
Increasing the locust workers through 1, 10, 20, 40, 60
I increased the number of workers every 30 minutes.
I found the traffic became more equally distributed as the number of workers (clients) increased.
At 60 workers, the time-averaged RPS of the two servers differed by only about 3 percent.
3. Sending traffic to test-nscd.apps.com with TTL 1s from 1~60 locust workers using keep-alive, and checking how traffic is distributed between PM1 and PM2
Increasing the locust workers through 1, 10, 20, 40, 60
I increased the number of workers every 30 minutes.
I found the traffic became more equally distributed as the number of workers (clients) increased.
At 60 workers, the time-averaged RPS of the two servers differed by only about 6 percent.
The result is not as good as in experiment 2, because keep-alive caches connections.
4. (Comparison experiment) Sending traffic to test-nscd.apps.com from a JVM-based client (the JVM has its own DNS cache), and checking how traffic is distributed between PM1 and PM2. I tested the following JVM DNS cache TTLs:
JVM TTL 30s
JVM TTL 10s
JVM TTL 5s
JVM TTL 1s
We found that the TTL needs to be at most about 10s to distribute traffic equally. (A sketch of how the JVM-side TTL can be set follows.)
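For reference, the JVM-side DNS cache TTL used in this comparison can be controlled roughly like this (a sketch using the standard JVM settings; the jar name and the value 5 are placeholders, not the exact test configuration):
java -Dsun.net.inetaddr.ttl=5 -jar load-client.jar
# or globally, in $JAVA_HOME/conf/security/java.security (lib/security on older JDKs):
# networkaddress.cache.ttl=5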
Conclusion
nscd respects the TTL of the DNS record, but a TTL shorter than 15s effectively behaves like 15s, because nscd prunes its cache at intervals of at least 15s, defined as CACHE_PRUNE_INTERVAL in nscd/nscd.h. You can confirm this from the sources below.
getaddrinfo() uses nscd: https://elixir.bootlin.com/glibc/glibc-2.35/source/sysdeps/posix/getaddrinfo.c#L610, https://udrepper.livejournal.com/16362.html, https://serverfault.com/questions/729738/nscd-ttl-and-dns-ttl-which-one-is-stronger
nscd_run_prune: https://elixir.bootlin.com/glibc/glibc-2.35/source/nscd/connections.c#L1556
CACHE_PRUNE_INTERVAL: https://elixir.bootlin.com/glibc/glibc-2.35/source/nscd/nscd.h#L189
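To confirm the 15s value yourself, a quick grep over a glibc source checkout is enough (a sketch; the clone URL is the upstream glibc repository):
git clone --depth 1 https://sourceware.org/git/glibc.git && cd glibc
grep -n "CACHE_PRUNE_INTERVAL" nscd/nscd.h nscd/connections.c
# nscd/nscd.h defines CACHE_PRUNE_INTERVAL as 15 (seconds); nscd_run_prune in
# nscd/connections.c wakes up on that interval to expire cache entries.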
Because of this CACHE_PRUNE_INTERVAL, DNS round robin can distribute traffic unequally across the servers behind the domain. Compared with the JVM's DNS caching, nscd is harder to use with DNS round robin.
The imbalance is made worse by clients using keep-alive:
keep-alive appears to cache connections, which results in less frequent DNS queries and a more uneven traffic distribution.
The imbalance is reduced by a large number of clients:
a large number of clients appears to produce more frequent DNS queries and a more even traffic distribution.

Related

PXEBOOT, TFTPD-HPA and Firewall

I have set up a PXE boot server which basically works fine. I can boot any configured Linux image.
Then I enabled the firewall and opened UDP port 69 for TFTP:
~# iptables -L |grep tftp
ACCEPT udp -- anywhere anywhere udp dpt:tftp
ACCEPT udp -- anywhere anywhere udp dpt:tftp
~# netstat -tulp|grep tftp
udp 0 0 0.0.0.0:tftp 0.0.0.0:* 15869/in.tftpd
udp6 0 0 [::]:tftp [::]:* 15869/in.tftpd
~# cat /etc/services|grep tftp
tftp 69/udp
and now I get a timeout when PXE boot pulls tftp://192.168.0.220/images/pxelinux.0 (rc = 4c126035).
("anywhere" is OK here for now, as there is another firewall between the PXE server and the router which blocks everything unwanted from/to the WAN.)
The funny part is that tcpdump shows that the request is incoming on the pxeboot server:
~# tcpdump port 69
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp5s0, link-type EN10MB (Ethernet), capture size 262144 bytes
14:00:47.062723 IP 192.168.0.136.1024 > mittelerde.tftp: 47 RRQ "images/pxelinux.0" octet blksize 1432 tsize 0
14:00:47.415412 IP 192.168.0.136.1024 > mittelerde.tftp: 47 RRQ "images/pxelinux.0" octet blksize 1432 tsize 0
14:00:48.184506 IP 192.168.0.136.1024 > mittelerde.tftp: 47 RRQ "images/pxelinux.0" octet blksize 1432 tsize 0
14:00:49.722630 IP 192.168.0.136.1024 > mittelerde.tftp: 47 RRQ "images/pxelinux.0" octet blksize 1432 tsize 0
14:00:52.798136 IP 192.168.0.136.1024 > mittelerde.tftp: 47 RRQ "images/pxelinux.0" octet blksize 1432 tsize 0
Once I stop the firewall service pxeboot works fine again. Of course the conntrack module is loaded:
~# lsmod|grep conntrack
nf_conntrack_tftp 16384 0
nf_conntrack_ftp 20480 0
xt_conntrack 16384 4
nf_conntrack_ipv4 16384 20
nf_defrag_ipv4 16384 1 nf_conntrack_ipv4
nf_conntrack 131072 9 xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_ipv4,nf_nat,nf_conntrack_tftp,ipt_MASQUERADE,nf_nat_ipv4,xt_nat,nf_conntrack_ftp
libcrc32c 16384 2 nf_conntrack,nf_nat
x_tables 40960 8 xt_conntrack,iptable_filter,xt_multiport,xt_tcpudp,ipt_MASQUERADE,xt_nat,xt_comment,ip_tables
What am I missing here?
Problem solved. For tftpd-hpa the following UDP ports must be open as well (example iptables rules are sketched below):
1024
49152:49182
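In iptables terms that could look roughly like this (a sketch consistent with the rules shown above; your chain names and default policy may differ):
# UDP ports needed for tftpd-hpa, per the fix above
iptables -A INPUT -p udp --dport 69 -j ACCEPT
iptables -A INPUT -p udp --dport 1024 -j ACCEPT
iptables -A INPUT -p udp --dport 49152:49182 -j ACCEPT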

Configure QEMU (Guest Debian-9.0 Sparc64 - Host MacOS High Sierra) to do ssh from guest to host

Firstly, with a QEMU virtual machine (Debian Sparc64 Etch 4.0), I have successfully been able to use ssh and scp from the guest to the host (macOS High Sierra 10.13.3).
I only wanted to transfer files between guest and host.
To get there, I followed this tutorial:
1) I installed the TUN/TAP drivers
2) I launch QEMU like this:
qemu-system-sparc -boot c -hda debian_etch.img -m 512M -net nic -net tap,script=no,downscript=no
3) Once the VM has booted, on the macOS host: ifconfig tap0 192.168.10.1
4) On the Debian Etch guest, in /etc/network/interfaces:
auto eth0
iface eth0 inet static
address 192.168.10.2
netmask 255.255.255.0
gateway 192.168.10.1
and then run: /etc/init.d/networking restart
5) Finally, on the guest: $ scp -r dir user_host@192.168.10.1:~/
Now, I would like to get the same thing with a "Debian Sparc64 Stretch 9.0" guest.
It seems that ifconfig is deprecated with recent versions of Debian.
Anyway, I tried to launch the Sparc64 image with :
qemu-system-sparc64 \
-drive file=debian-9.0-sparc64.qcow2,if=none,id=drive-ide0-0-1,format=qcow2,cache=none \
-m 1024 \
-boot c \
-net nic \
-net tap,ifname=tap0,script=no,downscript=no \
-nographic
and repeated steps 1), 3), 4), but unfortunately ssh and scp from the guest do not work.
I should note that with this Debian Sparc64 9.0 guest, the logical network interface name changes (maybe at each boot). For example, /etc/network/interfaces contains:
auto enp0s5
allow-hotplug enp0s5
iface enp0s5 inet static
address 192.168.10.2
netmask 255.255.255.0
gateway 192.168.10.1
Finally, from the guest I get the following result:
# ssh user_host@192.168.10.1
ssh: connect to host 192.168.10.1 port 22: No route to host
ip a gives :
# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
inet 192.168.10.2/24 brd 192.168.10.255 scope global enp0s5
valid_lft forever preferred_lft forever
inet6 fec0::5054:ff:fe12:3456/64 scope site mngtmpaddr dynamic
valid_lft 86207sec preferred_lft 14207sec
inet6 fe80::5054:ff:fe12:3456/64 scope link
valid_lft forever preferred_lft forever
Could someone give me some clues to fix this and get ssh/scp working from guest to host? (I have no network on the guest and no sshd server there, so I only need the guest-->host direction for ssh/scp.)
UPDATE 1:
I keep debugging this issue.
1) First, following this link, I rename the network interface of the "Debian 9.0 Sparc64" guest to eth0 at each boot:
vi /etc/udev/rules.d/10-network.rules
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="52:54:00:12:34:56", NAME="eth0"
with the MAC address given by:
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
inet 192.168.10.2/24 brd 192.168.10.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe12:3456/64 scope link
valid_lft forever preferred_lft forever
2) I used tcpdump on the TAP interface of the macOS High Sierra host:
# tcpdump -vv -i tap0
tcpdump: listening on tap0, link-type EN10MB (Ethernet), capture size 262144 bytes
00:23:06.112155 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.10.1 tell 192.168.10.2, length 46
00:23:06.112228 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.10.1 is-at fe:22:e7:8c:7f:fa (oui Unknown), length 28
00:23:07.128440 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.10.1 tell 192.168.10.2, length 46
00:23:07.128499 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.10.1 is-at fe:22:e7:8c:7f:fa (oui Unknown), length 28
00:23:08.152323 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.10.1 tell 192.168.10.2, length 46
00:23:08.152381 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.10.1 is-at fe:22:e7:8c:7f:fa (oui Unknown), length 28
00:23:11.119346 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.10.1 tell 192.168.10.2, length 46
00:23:11.119396 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.10.1 is-at fe:22:e7:8c:7f:fa (oui Unknown), length 28
00:23:12.120190 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.10.1 tell 192.168.10.2, length 46
00:23:12.120250 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.10.1 is-at fe:22:e7:8c:7f:fa (oui Unknown), length 28
00:23:13.145028 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.10.1 tell 192.168.10.2, length 46
00:23:13.145075 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.10.1 is-at fe:22:e7:8c:7f:fa (oui Unknown), length 28
00:23:16.127525 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.10.1 tell 192.168.10.2, length 46
00:23:16.127575 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.10.1 is-at fe:22:e7:8c:7f:fa (oui Unknown), length 28
00:23:17.145202 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.10.1 tell 192.168.10.2, length 46
00:23:17.145272 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.10.1 is-at fe:22:e7:8c:7f:fa (oui Unknown), length 28
Should I conclude that the guest (192.168.10.2 in the guest's /etc/network/interfaces) and the host (192.168.10.1, set by ifconfig tap0 192.168.10.1) are communicating, since I see both addresses in the tcpdump output above?
If I run tcpdump -vv -i tap0 on the host while I restart networking on the guest, I get:
00:27:07.648620 IP6 (hlim 1, next-header Options (0) payload length: 36) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff02::1:ff12:3456 to_ex { }]
00:27:07.804644 IP6 (hlim 1, next-header Options (0) payload length: 36) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff02::1:ff12:3456 to_ex { }]
00:27:08.569140 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) :: > ff02::1:ff12:3456: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::5054:ff:fe12:3456
unknown option (14), length 8 (1):
0x0000: 3bd4 4c86 3dd6
00:27:08.612632 IP (tos 0x0, ttl 255, id 37381, offset 0, flags [none], proto UDP (17), length 118)
192.168.10.1.mdns > 224.0.0.251.mdns: [udp sum ok] 0 PTR (QU)? 6.5.4.3.2.1.e.f.f.f.0.0.4.5.0.5.0.0.0.0.0.0.0.0.0.0.0.0.0.8.e.f.ip6.arpa. (90)
00:27:09.592322 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::5054:ff:fe12:3456 > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff02::1:ff12:3456 to_ex { }]
00:27:09.592483 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::5054:ff:fe12:3456 > ip6-allrouters: [icmp6 sum ok] ICMP6, router solicitation, length 16
source link-address option (1), length 8 (1): 52:54:00:12:34:56
0x0000: 5254 0012 3456
00:27:09.616466 IP (tos 0x0, ttl 255, id 18614, offset 0, flags [none], proto UDP (17), length 118)
192.168.10.1.mdns > 224.0.0.251.mdns: [udp sum ok] 0 PTR (QM)? 6.5.4.3.2.1.e.f.f.f.0.0.4.5.0.5.0.0.0.0.0.0.0.0.0.0.0.0.0.8.e.f.ip6.arpa. (90)
00:27:09.976787 IP6 (hlim 1, next-header Options (0) payload length: 36) fe80::5054:ff:fe12:3456 > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff02::1:ff12:3456 to_ex { }]
Is there useful information in these messages that would help get ssh/scp from guest to host working?
Finally, is it normal for the guest's eth0 to be in the following state (UNKNOWN)?
eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN
UPDATE 2: I also tried launching with the guestfwd option together with the "-net tap" option, like this:
qemu-system-sparc64 \
-boot c \
-hda debian-9.0-sparc64.qcow2 \
-net nic \
-net tap,ifname=tap0,script=no,downscript=no \
-net 'user,guestfwd=tcp::22-tcp::22' \
-m 1024 \
-nographic
But there is still no ssh access from guest to host.
In -net 'user,guestfwd=tcp::22-tcp::22', I don't know in which order to put the guest and host IPs, nor which ports to use for each (I used 22 for both).
Could someone give me some details about the "guestfwd" option?
UPDATE 3:
Finally, the issue is fixed by doing the following on the macOS host (as root):
1) Set IP 192.168.10.1 on bridge0 with "ifconfig bridge0 192.168.10.1"
2) Launch QEMU with the following command:
qemu-system-sparc64 \
-boot c \
-hda debian-9.0-sparc64.qcow2 \
-device virtio-balloon \
-net nic,model=virtio,macaddr=52:54:00:12:34:56 \
-vga none \
-net tap,ifname=tap0,script=no,downscript=no \
-m 1024 \
-nographic
The MAC address 52:54:00:12:34:56 is important.
3) Once QEMU has booted, add the tap0 interface to bridge0: ifconfig bridge0 addm tap0
4) Finally, from the Debian Sparc64 guest, I can connect to the macOS host (as a normal user or as root) with:
ssh user_host@192.168.10.1
Some remarks:
Yes, ifconfig is deprecated, but to the best of my knowledge it has been for at least six years or so, and it is still here ... which has its reasons. I think you can use it with a clear conscience.
Regarding your tcpdump excerpt: your feeling that it contains useful information is right. It does not show real communication between guest and host, though; it shows ARP queries. ARP is the Address Resolution Protocol and is needed for the following reason:
Basically, as long as TCP/IP is stacked on top of Ethernet, computers (whether or not they are virtual) need to know the Ethernet hardware address (MAC (Media Access Control) address) of their communication partners.
So if a computer A with IP address a.a.a.a wants to talk to a computer B with IP address b.b.b.b, A needs to know B's MAC address first. Therefore, A sends an Ethernet broadcast frame into the local Ethernet segment (basically, a broadcast frame is a frame which goes to all machines connected to the respective segment), asking: "I need to talk to the guy which has IP address b.b.b.b. If any of you guys out there has this IP address, what is your Ethernet MAC address?".
Your tcpdump excerpt shows that this ARP resolution fails. Your guest asks over and over again without ever getting an answer. As long as this is the case, the guest is not able to do any TCP/IP communication with the host.
So your problem was not related to SCP/SSH only, but to the IP protocol in general. For example, the guest would not have been able to show a web site which is located on the host.
Furthermore, since the host did not feel like sending an answer, any other guest would have had the same problem. I strongly assume that your old Debian Etch would have failed the same way if you had started its VM exactly the same way, on exactly the same host, with a configuration absolutely identical to that of the VM with the new Debian Stretch.
Of course, ARP requests from the guest first get to the bridge where the guest VM's TAP is connected. As long as that bridge does not have the IP address assigned the guest asks for, the guest's ARP requests won't be answered. This problem is usually solved in the following way:
On the host, take the IP address away from the physical network interface and assign it to the bridge. Then the bridge answers the guest's ARP queries. But this doesn't get you anywhere yet, because now you can't use the host's physical network interface any more (you have taken away its IP address).
Therefore, you connect the host's physical network interface to the bridge as well. Usually, this is a static configuration, i.e. it is not done dynamically when starting VMs. This means that the OS on the host is configured to create a bridge and to add its physical network interfaces to the bridge upon startup. In contrast, the guests' TAPs are added dynamically to the bridge when the guests are started, and are removed from the bridge when the guests are shut down.
Dynamically adding and removing the guests' TAPs on the host's bridge is often done via the script and downscript parameters which you pass to the -net tap,... configuration option. Obviously, you are doing this manually (item 3 of your UPDATE 3); a sketch of such a script is shown below.
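For example, on a macOS host the dynamic part could be handled by a small ifup-style script passed via script= (a hypothetical sketch; bridge0 and the address are the ones from UPDATE 3, and QEMU invokes the script with the TAP interface name as its first argument):
#!/bin/sh
# hypothetical qemu-ifup script for this setup
ifconfig bridge0 192.168.10.1 up    # ensure the bridge owns the address the guest ARPs for
ifconfig bridge0 addm "$1"          # attach the newly created TAP (e.g. tap0) to the bridge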
To summarize, your problem was that the host did not answer the guest's ARP queries. This happened because those queries get to the bridge where the VM's TAP is attached to, but you had not assigned an IP address to that bridge (to be more precise, at least not the IP address the guest's ARP queries were asking for).

Node.js server listening for UDP. Tcpdump says packets are coming through, but node doesn't get them

Integrating with a partner. Our server has a restful interface, but their product emits a stream of UDP packets. Since we're still prototyping, I didn't want to make any commits to our API server repo to accommodate the change. Instead, I wrote a little node.js server to listen for their UDP packets, do a bit of conversion, and then PUT to our restful server.
I'm stuck, because the node.js process is listening on port 17270, but isn't getting any of my sample UDP packets.
Node server
const dgram = require('dgram');
const server = dgram.createSocket('udp4');

server.on('error', function(err) {
  console.log('server error:\n' + err.stack);
});

server.on('message', function(msg, rinfo) {
  console.log('Server got UDP packet: ' + msg + ' from ' + rinfo.address + ':' + rinfo.port);
  doBusinessLogic(msg);
});

server.on('listening', function() {
  var address = server.address();
  console.log('Server listening for UDP ' + address.address + ':' + address.port);
});

function main() {
  server.bind(17270);
}

main();
When I send a UDP packet from my local machine using netcat,
echo -n "udp content" | nc -vv4u -w1 ec2.instance 17270
I see nothing happening with the server.
I can run my node.js server locally, and it responds to UDP packets sent to 127.0.0.1.
I can also ssh into the ec2 instance, and netcat UDP packets to 127.0.0.1. That creates the expected response, as well.
So the problem must be networking, I thought.
When I run netstat on the ec2 instance, I can see that node is listening on port 17270.
# netstat -plun
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
udp 0 0 0.0.0.0:17270 0.0.0.0:* 21308/node
I thought it might be AWS security settings, but when I run tcpdump on the ec2 instance and then trigger netcat from my local machine, I can see that there's traffic being received on the ec2 instance.
# tcpdump -vv -i any udp
15:09:52.276786 IP (tos 0x8, ttl 38, id 1756, offset 0, flags [none], proto UDP (17), length 29)
my.local.machine.60924 > ec2.instance.17270: [udp sum ok] UDP, length 1
15:09:52.276852 IP (tos 0x8, ttl 38, id 48463, offset 0, flags [none], proto UDP (17), length 29)
my.local.machine.60924 > ec2.instance.17270: [udp sum ok] UDP, length 1
15:09:52.276863 IP (tos 0x8, ttl 38, id 31296, offset 0, flags [none], proto UDP (17), length 29)
my.local.machine.60924 > ec2.instance.17270: [udp sum ok] UDP, length 1
15:09:52.278461 IP (tos 0x8, ttl 38, id 50202, offset 0, flags [none], proto UDP (17), length 29)
my.local.machine.60924 > ec2.instance.17270: [udp sum ok] UDP, length 1
15:09:52.289575 IP (tos 0x8, ttl 38, id 49316, offset 0, flags [none], proto UDP (17), length 149)
my.local.machine.60924 > ec2.instance.17270: [udp sum ok] UDP, length 121
Just to be sure, I tried temporarily closing port 17270 in the AWS console. If I do that, those packets will be discarded and I won't see any info from tcpdump. So I reopened the port.
I have a process that is listening on a port. I'm clearly sending UDP packets to that port. But the process isn't getting the messages.
I just can't figure out where the disconnect is. What am I missing?
Thanks in advance for any insight!
Probably iptables at a guess. Packets hit the BPF (which tcpdump uses to look at incoming traffic) separately from iptables, so it's possible to see them via tcpdump only to have iptables drop them before they get out of the kernel and up to your application. Look in the output of 'iptables -nvL' for either a default drop/reject policy on the relevant input chain or a rule specifically dropping UDP traffic.
As for fixing it if this is the case, it depends on which distro you're using. In the old days you'd just use the iptables command, something like this (but with the relevant chain name instead of INPUT):
iptables -A INPUT -p udp --dport 17270 -j ACCEPT
...but if you're using Ubuntu they want you to use their ufw tool for this, and if it's CentOS/RHEL 7 you're probably going to be dealing with firewalld and its firewall-cmd frontend; rough equivalents are sketched below.
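Rough equivalents with those tools, using the port from the question, would be:
ufw allow 17270/udp                              # Ubuntu (ufw)
firewall-cmd --permanent --add-port=17270/udp    # CentOS/RHEL 7 (firewalld)
firewall-cmd --reload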

No traffic when listening with netcat

I have a server with multiple network interfaces.
I'm trying to run a network monitoring tools in order to verify network traffic statistics by using the sFlow standard on a router.
I get my sFlow datagram on port 5600 of eth1 interface. I'm able to see the generated traffic thanks to tcpdump:
user#lnssrv:~$ sudo tcpdump -i eth1
14:09:01.856499 IP 10.10.10.10.60147 > 198.51.100.232.5600: UDP, length 1456
14:09:02.047778 IP 10.10.10.10.60147 > 198.51.100.232.5600: UDP, length 1432
14:09:02.230895 IP 10.10.10.10.60147 > 198.51.100.232.5600: UDP, length 1300
14:09:02.340114 IP 198.51.100.253.5678 > 255.255.255.255.5678: UDP, length 111
14:09:02.385036 STP 802.1d, Config, Flags [none], bridge-id c01e.b4:a4:e3:0b:a6:00.8018, length 43
14:09:02.434658 IP 10.10.10.10.60147 > 198.51.100.232.5600: UDP, length 1392
14:09:02.634447 IP 10.10.10.10.60147 > 198.51.100.232.5600: UDP, length 1440
14:09:02.836015 IP 10.10.10.10.60147 > 198.51.100.232.5600: UDP, length 1364
14:09:03.059851 IP 10.10.10.10.60147 > 198.51.100.232.5600: UDP, length 1372
14:09:03.279067 IP 10.10.10.10.60147 > 198.51.100.232.5600: UDP, length 1356
14:09:03.518385 IP 10.10.10.10.60147 > 198.51.100.232.5600: UDP, length 1440
It all seems OK, but when I try to read the packets with netcat, it seems there are no packets arriving:
nc -lu 5600
Indeed, neither sflowtool nor nprobe reads anything from port 5600.
Where am I going wrong?
nc -lu 5600 opens a socket on port 5600, meaning it will only dump packets that are received on that socket, i.e. packets addressed to that specific address and port.
tcpdump, on the other hand, captures all the traffic flowing on the interface, even traffic that is not addressed to this server.
There are two possible causes of your problem here:
a) Your host's IP is not 198.51.100.232.
With a host filter you can see exactly the traffic addressed to your server,
for example: tcpdump -i eth1 host 198.51.100.232 and port 80
b) Another process is already listening on UDP port 5600 and grabbing all the data, so nothing is left over for the nc socket.
Note: tcpdump only shows you the traffic on the wire; it does not tell you whether anything is actually listening on a UDP port.
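A quick way to check both possibilities (a sketch; the interface, address and port are the ones from the question):
ip addr show eth1               # is 198.51.100.232 actually configured on this host?
ss -ulpn | grep 5600            # is some other process already bound to UDP port 5600?
nc -lu 198.51.100.232 5600      # listen explicitly on that address and port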
Not sure whether it helps here, but in my case (a similar problem) it did, so I just stopped iptables with: service iptables stop.
It seems that tcpdump hooks in at a lower level than iptables, and iptables can stop datagrams from being passed up to the higher levels even though tcpdump sees them. There is a good article on this topic with a nice picture.
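If iptables turns out to be the cause, a narrower fix than stopping it altogether would be to allow just the sFlow port (a sketch; 5600 is the port from the question):
iptables -I INPUT -p udp --dport 5600 -j ACCEPT
service iptables save    # persist the rule on distros that still use the iptables service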

ICMP time exceeded in-transit [closed]

In the last few days my server suffers an attack of this kind:
(bandwidth > 60 MBit/s; XXX.XXX.XXX.XXX are multiple IPs)
tcpdump -n proto ICMP
17:15:19.267464 IP XXX.XXX.XXX.XXX > my_ip: ICMP time exceeded in-transit, length 36
17:15:19.325217 IP XXX.XXX.XXX.XXX > my_ip: ICMP time exceeded in-transit, length 36
17:15:19.345561 IP XXX.XXX.XXX.XXX > my_ip: ICMP time exceeded in-transit, length 56
17:15:19.484865 IP XXX.XXX.XXX.XXX > my_ip: ICMP time exceeded in-transit, length 36
17:15:19.529616 IP XXX.XXX.XXX.XXX > my_ip: ICMP time exceeded in-transit, length 36
17:15:19.957058 IP XXX.XXX.XXX.XXX > my_ip: ICMP YYY.YYY.YYY.YYY tcp port 39692 unreachable, length 36
17:15:19.968957 IP XXX.XXX.XXX.XXX > my_ip: ICMP host YYY.YYY.YYY.YYY unreachable, length 56
17:15:20.112520 IP XXX.XXX.XXX.XXX > my_ip: ICMP host YYY.YYY.YYY.YYY unreachable, length 56
17:15:20.203199 IP XXX.XXX.XXX.XXX > my_ip: ICMP host YYY.YYY.YYY.YYY unreachable, length 36
17:15:20.204803 IP XXX.XXX.XXX.XXX > my_ip: ICMP host YYY.YYY.YYY.YYY unreachable, length 36
I've FreeBSD 9.1 and my pf.conf is
ext_if="em0"
table <blockedips> persist file "/etc/pf-blocked-ips.conf"
set skip on lo0
block drop in log (all) quick on $ext_if from <blockedips> to any
block in
pass out flags S/SA keep state
pass in on $ext_if proto tcp to port 80 flags S/SA keep state
pass in on $ext_if proto tcp to port ssh flags S/SA synproxy state
Is there anything I can do with pf?
It looks like you might be receiving some backscatter from a DDoS attack (http://blog.usu.edu/security/2010/08/24/backscatters-the-name-dos-the-game/).
There's not much you can do about this unless you can filter them in a switch before they hit your server; they're already getting dropped in the kernel as a network anomaly.

Resources