Consul dns round robin and ping - dns

I set up a test cluster of 3 servers. Consul, dnsmasq and NetworkManager are installed on all machines, which run CentOS 7.
I'd like to test a simple round-robin procedure:
Expected: ping consul.service.consul sends ICMP requests to any one of the three servers.
Actual: ping always sends requests to the same IP address (10.82.5.6).
However, the IP order does change in the answer section of the dig output:
[vagrant@localhost ~]$ dig consul.service.consul
; <<>> DiG 9.9.4-RedHat-9.9.4-51.el7_4.1 <<>> consul.service.consul
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23466
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;consul.service.consul. IN A
;; ANSWER SECTION:
consul.service.consul. 0 IN A 10.82.5.5
consul.service.consul. 0 IN A 10.82.5.4
consul.service.consul. 0 IN A 10.82.5.6
;; Query time: 2 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Dec 13 13:40:20 UTC 2017
;; MSG SIZE rcvd: 98
While the 10.82.5.6 node is rebooting, dig returns 2 nodes and ping works properly, with round robin. But once 10.82.5.6 is back up, it is again the only node that receives the ping requests.

According to https://www.consul.io/docs/agent/dns.html, the DNS interface randomizes the order of the returned nodes, so it will never be strict round robin.
There's also DNS caching (https://www.consul.io/docs/guides/dns-cache.html): the default TTL is 0, but you may have configured something different, and/or results may be cached somewhere else.
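
To see the difference between "randomized" and "round robin" in practice, one option is to tally which address comes back first over many lookups. The sketch below is only an illustration: the helper names (first_address_counts, system_resolve, make_shuffling_resolver) are made up for this example, and the shuffling resolver is a stand-in that imitates a server returning answers in random order. On a real node, system_resolve goes through the OS resolver, which may reorder or cache what the DNS server returned.

```python
import random
import socket
from collections import Counter


def first_address_counts(resolve, name, attempts=100):
    """Count how often each address appears first in the resolver's answer."""
    counts = Counter()
    for _ in range(attempts):
        addresses = resolve(name)
        if addresses:
            counts[addresses[0]] += 1
    return counts


def system_resolve(name):
    """Resolve IPv4 addresses through the OS resolver (getaddrinfo)."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return [info[4][0] for info in infos]


def make_shuffling_resolver(addresses, seed=0):
    """Hypothetical stand-in for a server that shuffles its answers."""
    rng = random.Random(seed)

    def resolve(_name):
        shuffled = list(addresses)
        rng.shuffle(shuffled)
        return shuffled

    return resolve


if __name__ == "__main__":
    fake = make_shuffling_resolver(["10.82.5.4", "10.82.5.5", "10.82.5.6"])
    print(first_address_counts(fake, "consul.service.consul", attempts=300))
    # On a cluster node, compare with the OS resolver's view:
    # print(first_address_counts(system_resolve, "consul.service.consul"))
```

If the tally from system_resolve is dominated by a single address while dig shows the order changing, the reordering or caching is happening on the client side rather than in Consul.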

Related

PowerDNS not sync zones from master to slave

I have installed PowerDNS on 2 VPS servers:
ns1 - 10.0.0.1
ns2 - 10.0.0.2
The problem is that records/zones are not getting synced from master to slave. Here are the configurations:
Master Server:
allow-axfr-ips=10.0.0.2/32
daemon=yes
disable-axfr=no
include-dir=/etc/powerdns/pdns.d
master=yes
setgid=pdns
setuid=pdns
Slave Server:
daemon=yes
disable-axfr=yes
include-dir=/etc/powerdns/pdns.d
setgid=pdns
setuid=pdns
slave=yes
slave-cycle-interval=60
Database on Slave Server
MariaDB [powerdns]> select * from supermasters;
+-------------+------------------+---------+
| ip | nameserver | account |
+-------------+------------------+---------+
| 10.0.0.1 | ns2.example.com | admin |
+-------------+------------------+---------+
1 row in set (0.000 sec)
Both servers use the MySQL database backend. The master serves all records as expected, but the slave server returns this:
root@vps10:~# dig example.com @localhost
; <<>> DiG 9.16.1-Ubuntu <<>> example.com @localhost
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 22750
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;example.com. IN A
;; Query time: 4 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Feb 04 22:11:39 UTC 2022
;; MSG SIZE rcvd: 45
I have also checked the slave server, and it does not have any zones from the master. I also tried this on the master server:
root@vps06:~# pdns_control notify example.com
Added to queue
I searched the internet for solutions but found nothing. Can anyone guide me or point out what is wrong with my configuration?
You'll need to enable superslave and make sure your primary sends the correct notifications (NS records, ALSO-NOTIFY metadata (https://doc.powerdns.com/authoritative/domainmetadata.html?#also-notify), etc.).

DNS resolution failures for www.docusign.net (US West area)

We are making API calls to DocuSign, which occasionally fail with "getaddrinfo: Name or service not known" errors. Investigating further, we see that name resolution sometimes fails when we connect, but only from our West datacenter location. It seems the GLB DNS for US West can take a very long time to resolve, causing DNS client timeouts when a lookup takes more than 10 s.
$ dig @1.1.1.1 www.docusign.net
; <<>> DiG 9.9.5-3ubuntu0.8-Ubuntu <<>> @1.1.1.1 www.docusign.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49468
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1452
;; QUESTION SECTION:
;www.docusign.net. IN A
;; ANSWER SECTION:
www.docusign.net. 22 IN CNAME www-geo.docusign.net.akadns.net.
www-geo.docusign.net.akadns.net. 22 IN CNAME www-west.docusign.net.akadns.net.
www-west.docusign.net.akadns.net. 22 IN A 162.248.184.27
;; Query time: 1 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Thu Oct 04 13:16:43 EDT 2018
;; MSG SIZE rcvd: 126
The result above is good: it took 1 msec (cached).
$ dig @1.1.1.1 www.docusign.net
; <<>> DiG 9.9.5-3ubuntu0.8-Ubuntu <<>> @1.1.1.1 www.docusign.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21193
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1452
;; QUESTION SECTION:
;www.docusign.net. IN A
;; ANSWER SECTION:
www.docusign.net. 6 IN CNAME www-geo.docusign.net.akadns.net.
www-geo.docusign.net.akadns.net. 6 IN CNAME www-west.docusign.net.akadns.net.
www-west.docusign.net.akadns.net. 6 IN A 162.248.184.27
;; Query time: 2725 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Thu Oct 04 13:21:29 EDT 2018
;; MSG SIZE rcvd: 126
This one is worse, as it took nearly 3 s. During testing we've seen this go over 12 s, which will time out a lot of DNS clients and requesting apps.
Since the TTL is set to 30 s, every 30 seconds we risk a timeout: our app generates errors until a DNS lookup succeeds and service resumes. Unfortunately, this shows up as an error to our customers in our app.
We're able to work around this using hacks, but I am curious whether anyone else is seeing this and how you've worked around it. Also, it might be good for people at DocuSign/Akamai to look into why the performance of the www-west.docusign.net.akadns.net record is so bad.
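
The post does not say which hacks were used. One generic possibility, sketched here purely as an assumption and not as the author's method, is to keep the last successful lookup and serve it when a refresh fails. The class name StaleOnErrorResolver and its parameters are hypothetical; the 30-second default mirrors the TTL mentioned above.

```python
import socket
import time


def system_resolve(name):
    """Resolve IPv4 addresses through the OS resolver (getaddrinfo)."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return [info[4][0] for info in infos]


class StaleOnErrorResolver:
    """Cache lookups and fall back to the last good answer on failure."""

    def __init__(self, resolve=system_resolve, ttl=30.0, clock=time.monotonic):
        self._resolve = resolve
        self._ttl = ttl
        self._clock = clock
        self._cache = {}  # name -> (addresses, time of last successful fetch)

    def lookup(self, name):
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0]  # still fresh, no lookup needed
        try:
            addresses = self._resolve(name)
        except OSError:  # socket.gaierror and timeouts are OSError subclasses
            if cached is not None:
                return cached[0]  # refresh failed: serve the stale answer
            raise
        self._cache[name] = (addresses, now)
        return addresses
```

This only masks lookups that fail outright. A lookup that blocks for several seconds still blocks the caller, so a fuller version would probably refresh in the background before the cached entry expires.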

Are SRV records being stripped by DNS resolvers?

I'm building a custom DNS Server that, among other things, serves SRV records and associated A and AAAA records. I can verify that querying the server directly returns the expected answer:
$ dig lseed.bitcoinstats.com SRV @139.59.143.87 +short
10 10 9735 2c932136c294204bc65c73266300b30fe8ccb99c24fb2261d2e9980a7e8ffe9.80.lseed.bitcoinstats.com.
10 10 6331 31ce6a2b947fdbc97f10405c4062848393cf8140f33cc492aa044fe47d18f59.c6.lseed.bitcoinstats.com.
10 10 8334 283a918ae4609473c01f2e19491e9202788150dbe8d4361a3a04f3a879e9f0a.45.lseed.bitcoinstats.com.
10 10 53258 2673073e3751681b0c55aa88e5af17522c6d6b32d7d210bf4d65439d063c1ba.91.lseed.bitcoinstats.com.
However, when querying through my ISP's resolver (or any of the public resolvers, like Google's 8.8.8.8), I get an empty answer back:
$ dig lseed.bitcoinstats.com SRV @8.8.8.8
; <<>> DiG 9.9.5-3ubuntu0.10-Ubuntu <<>> lseed.bitcoinstats.com SRV @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 10994
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;lseed.bitcoinstats.com. IN SRV
;; Query time: 86 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Tue Nov 29 12:32:15 CET 2016
;; MSG SIZE rcvd: 51
The query returns immediately, and I can see that my server receives an incoming query, but the answer is empty. Is it known behavior that the resolver strips SRV and additional answers? If this were the case, why is the query being forwarded to my server at all? Or is the error on my side, and the server simply replies with an incorrect answer?
It turns out that the answers were stripped because the names in the answer did not match the question. The query asked for lseed.bitcoinstats.com while the answers were for another name (_lightning._tcp.lseed.bitcoinstats.com), so the resolvers stripped the non-matching answers, leaving an empty reply. Setting the name in the answers equal to the name in the question results in resolvers passing the answers through.
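
The check described above can be pictured roughly as follows. This is a deliberately simplified sketch, not the code of any real resolver: the Record type and the matching_answers function are invented for illustration, and CNAME chains (which legitimately change the owner name) are ignored.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str   # owner name of the record
    rtype: str  # record type, e.g. "SRV"
    rdata: str  # record data in presentation format


def normalize(name):
    """Compare names case-insensitively and without the trailing dot."""
    return name.rstrip(".").lower()


def matching_answers(qname, qtype, answers):
    """Keep only answers whose owner name and type match the question."""
    wanted = normalize(qname)
    return [rr for rr in answers
            if normalize(rr.name) == wanted and rr.rtype == qtype]


if __name__ == "__main__":
    mismatched = [Record("_lightning._tcp.lseed.bitcoinstats.com.", "SRV",
                         "10 10 9735 node.lseed.bitcoinstats.com.")]
    # The owner name differs from the question, so nothing survives.
    print(matching_answers("lseed.bitcoinstats.com", "SRV", mismatched))
```

With the owner name set to lseed.bitcoinstats.com. instead, the same record passes the filter, which matches the fix described in the answer.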

Why are multiple queries being made to my DNS Server?

As part of a project I've written a very simplistic DNS server whose only purpose is to resolve queries for the zone it serves and to store the IP address of the server that made each query.
I've noticed that if I use dig, my DNS server gets queried multiple times, sometimes from the same IP address. Why does this happen? Is it due to the unreliable nature of UDP?
For example, here's a dig query I made:
C:\Data>dig xyz.dns.example.com
; <<>> DiG 9.10.4-P2 <<>> xyz.dns.example.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2539
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;xyz.dns.example.com. IN A
;; ANSWER SECTION:
xyz.dns.example.com. 12321 IN A 50.16.166.175
;; Query time: 224 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Thu Aug 11 15:07:42 Eastern Daylight Time 2016
;; MSG SIZE rcvd: 77
In this example, the zone file for example.com has an NS record for dns.example.com, which is where my simplistic DNS server runs. For this one query, my server was called 4 times from 2 different IP addresses.
I also noticed that I'm supposedly returning an "Additional" record, but the data I return in bytes 10 and 11 is clearly 0. Could this be causing a problem?
Try dig's +trace option:
dig example.com +trace
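
On the question about bytes 10 and 11: in the 12-byte DNS header those two bytes hold ARCOUNT, the number of records in the additional section. The sketch below decodes the header with Python's struct module; the function name parse_header and the sample values are illustrative. Note that the dig output above was received from 192.168.1.1 (see its SERVER line), so the ADDITIONAL: 1 it shows most likely counts the OPT pseudo-record in that resolver's reply rather than anything the custom server sent.

```python
import struct

# Six unsigned 16-bit fields in network byte order:
# ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
HEADER = struct.Struct("!HHHHHH")


def parse_header(message):
    """Decode the fixed 12-byte header at the start of a DNS message."""
    ident, flags, qdcount, ancount, nscount, arcount = HEADER.unpack_from(message)
    return {
        "id": ident,
        "flags": flags,
        "qdcount": qdcount,   # bytes 4-5
        "ancount": ancount,   # bytes 6-7
        "nscount": nscount,   # bytes 8-9
        "arcount": arcount,   # bytes 10-11
    }


if __name__ == "__main__":
    # Sample header: id 2539, flags qr+rd+ra, 1 question, 1 answer,
    # 0 authority records, 1 additional record.
    sample = HEADER.pack(2539, 0x8180, 1, 1, 0, 1)
    print(parse_header(sample))
```

Running the custom server's raw reply bytes through parse_header shows what it actually put in ARCOUNT, independent of what a recursive resolver later reports to dig.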

What does it mean when a "dig" command with "+nssearch" option returns nothing?

When I run the following dig command on www.google.com with the +nssearch option I get no results:
mac$ dig www.google.com +nssearch
mac$
Can someone explain why no data is returned here? The +nssearch option reads the SOA of all the authoritative name servers I believe. Does this mean there are no authoritative name servers? How is that possible? The domain www.google.com obviously works so I was expecting some sort of result.
; <<>> DiG 9.9.5-3ubuntu0.2-Ubuntu <<>> www.google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40522
;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.google.com. IN A
;; ANSWER SECTION:
www.google.com. 20 IN A 74.125.196.106
www.google.com. 20 IN A 74.125.196.104
www.google.com. 20 IN A 74.125.196.99
www.google.com. 20 IN A 74.125.196.147
www.google.com. 20 IN A 74.125.196.105
www.google.com. 20 IN A 74.125.196.103
;; Query time: 2 msec
;; SERVER: 192.168.186.1#53(192.168.186.1)
;; WHEN: Wed Jun 17 17:17:37 CDT 2015
;; MSG SIZE rcvd: 139
From "man dig":
+[no]nssearch
When this option is set, dig attempts to find the authoritative name servers for the zone containing the name being looked up and display the SOA record that each name server has for the zone.
Since there's no authority section in the response, +nssearch is going to return nothing.
www.google.com is not a zone, but a name in a zone. Therefore it doesn't have any NS records (or SOA records) for dig to display. Try dropping the www. bit and you'll get more output.
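
"Dropping the www. bit" generalizes to walking up the name one label at a time until a name that is a zone apex (one with its own SOA and NS records) is reached. The helper below only generates the candidate names; it performs no DNS queries, and the name parent_names is made up for this sketch.

```python
def parent_names(name):
    """List a name and each of its parents, from most to least specific."""
    labels = name.rstrip(".").split(".")
    return [".".join(labels[i:]) + "." for i in range(len(labels))]


if __name__ == "__main__":
    for candidate in parent_names("www.google.com"):
        # Each candidate could be tried in turn, e.g. with dig <candidate> +nssearch
        print(candidate)
```

For www.google.com the candidates are www.google.com., google.com. and com., and the answer above suggests that google.com is the first one that is actually a zone.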
