Forwarding DNS to Cloudflare's DNS-over-TLS via CoreDNS - dns

I'm using this Docker image as an example to try to set up secure DNS forwarding over TLS to Cloudflare's resolvers. I'm using CoreDNS 1.5.0 (latest) and my config is this:
# CoreDNS Configuration
.:53 {
    forward . tls://1.1.1.1 tls://1.0.0.1 {
        tls_servername tls.cloudflare-dns.com
        policy sequential
        health_check 5s
    }
    log
}
I'm making requests like so:
root@8ef125545369:/# dig @127.0.0.1 google.com
; <<>> DiG 9.11.3-1ubuntu1.5-Ubuntu <<>> @127.0.0.1 google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 49802
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 090b0d7fadcdd8bb (echoed)
;; QUESTION SECTION:
;google.com. IN A
;; Query time: 24 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Apr 08 19:29:30 UTC 2019
;; MSG SIZE rcvd: 51
I'm not getting answers. The CoreDNS logs look like this:
missioncontrol | 2019-04-08T19:29:30.778Z [INFO] 127.0.0.1:39615 - 49802 "A IN google.com. udp 51 false 4096" NOERROR - 0 5.02365452s
missioncontrol | 2019-04-08T19:29:35.759Z [INFO] 127.0.0.1:39615 - 49802 "A IN google.com. udp 51 false 4096" NOERROR - 0 5.00549558s
It's clear that CoreDNS is getting the requests, but I can't determine why this is failing. My image is ubuntu:bionic and ca-certificates is installed. I can also use openssl s_client to connect to 1.1.1.1:443 without issues.
Is there something I'm missing to set up DNS-over-TLS forwarding from CoreDNS to Cloudflare's resolvers?
EDIT
I've tested this on my host operating system outside of a Docker container and I'm seeing the same behavior, namely that it's not working.

I tested this again by running it in Travis CI, and it worked; apparently my corporate firewall does not like DNS-over-TLS.
I was able to validate this by installing knot-dnsutils (on Ubuntu 18.04) and trying to query Cloudflare directly:
$ kdig -d @1.0.0.1 +tls-ca +tls-host=cloudflare-dns.com google.com
;; DEBUG: Querying for owner(google.com.), class(1), type(1), server(1.0.0.1), port(853), protocol(TCP)
;; DEBUG: TLS, imported 133 system certificates
;; DEBUG: TLS, received certificate hierarchy:
;; DEBUG: #1, C=US,ST=California,L=San Francisco,O=Cloudflare\, Inc.,CN=cloudflare-dns.com
;; DEBUG: SHA-256 PIN: V6zes8hHBVwUECsHf7uV5xGM7dj3uMXIS9//7qC8+jU=
;; DEBUG: #2, C=US,O=DigiCert Inc,CN=DigiCert ECC Secure Server CA
;; DEBUG: SHA-256 PIN: PZXN3lRAy+8tBKk2Ox6F7jIlnzr2Yzmwqc3JnyfXoCw=
;; DEBUG: TLS, skipping certificate PIN check
;; DEBUG: TLS, The certificate is trusted.
;; WARNING: TLS, handshake failed (Error in the pull function.)
This is what happened when querying within the corporate network. From Travis CI, I saw:
;; DEBUG: Querying for owner(google.com.), class(1), type(1), server(1.1.1.1), port(853), protocol(TCP)
;; DEBUG: TLS, imported 133 system certificates
;; DEBUG: TLS, received certificate hierarchy:
;; DEBUG: #1, C=US,ST=California,L=San Francisco,O=Cloudflare\, Inc.,CN=cloudflare-dns.com
;; DEBUG: SHA-256 PIN: V6zes8hHBVwUECsHf7uV5xGM7dj3uMXIS9//7qC8+jU=
;; DEBUG: #2, C=US,O=DigiCert Inc,CN=DigiCert ECC Secure Server CA
;; DEBUG: SHA-256 PIN: PZXN3lRAy+8tBKk2Ox6F7jIlnzr2Yzmwqc3JnyfXoCw=
;; DEBUG: TLS, skipping certificate PIN check
;; DEBUG: TLS, The certificate is trusted.
;; TLS session (TLS1.2)-(ECDHE-ECDSA-SECP256R1)-(AES-256-GCM)
;; ->>HEADER<<- opcode: QUERY; status: NOERROR; id: 59442
;; Flags: qr rd ra; QUERY: 1; ANSWER: 1; AUTHORITY: 0; ADDITIONAL: 1
;; EDNS PSEUDOSECTION:
;; Version: 0; flags: ; UDP size: 1452 B; ext-rcode: NOERROR
;; PADDING: 69 B
;; QUESTION SECTION:
;; google.com. IN A
;; ANSWER SECTION:
google.com. 156 IN A 172.217.5.14
;; Received 128 B
;; Time 2019-04-09 22:03:18 UTC
;; From 1.1.1.1#853(TCP) in 12.8 ms
Clearly, the corporate firewall is blocking DNS-over-TLS traffic (TCP port 853), unfortunately.
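For anyone who wants to run the same check without installing kdig, here is a minimal Python sketch of the probe. It rests on a few assumptions: it uses only the standard library, the function names build_dot_query and probe_dot are my own (hypothetical, not part of any tool mentioned above), and the server 1.1.1.1, port 853 and host name cloudflare-dns.com are taken from the kdig output. It opens a TCP connection, performs a verified TLS handshake, and sends one length-prefixed A query for google.com. A firewall that blocks DoT should show up as a network or TLS failure rather than a reply.

```python
import socket
import ssl
import struct


def build_dot_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build an A-record query with the 2-byte length prefix used over TCP/TLS."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!6H", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: every label is preceded by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!2H", 1, 1)  # QTYPE=A, QCLASS=IN
    message = header + question
    return struct.pack("!H", len(message)) + message


def probe_dot(server: str = "1.1.1.1",
              servername: str = "cloudflare-dns.com",
              port: int = 853,
              timeout: float = 5.0) -> str:
    """Describe how far a DNS-over-TLS exchange with the server got."""
    context = ssl.create_default_context()  # verifies the certificate chain and host name
    try:
        with socket.create_connection((server, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=servername) as tls:
                tls.sendall(build_dot_query("google.com"))
                prefix = tls.recv(2)
                if len(prefix) < 2:
                    return "TLS handshake succeeded, but no complete length prefix arrived"
                (length,) = struct.unpack("!H", prefix)
                return f"ok: server announced a {length}-byte DNS reply"
    except ssl.SSLError as exc:  # must come first: SSLError is a subclass of OSError
        return f"TLS failed: {exc}"
    except OSError as exc:  # refused, reset, unreachable, or timed out
        return f"network error: {exc}"


if __name__ == "__main__":
    print(probe_dot())
```

Running it once inside the corporate network and once from an outside host should give the same contrast as the two kdig runs above.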

Related

PowerDNS not syncing zones from master to slave

I have installed PowerDNS on 2 VPS servers:
ns1 - 10.0.0.1
ns2 - 10.0.0.2
The problem is that the records/zones are not getting synced from master to slave. Here are the configurations:
Master Server:
allow-axfr-ips=10.0.0.2/32
daemon=yes
disable-axfr=no
include-dir=/etc/powerdns/pdns.d
master=yes
setgid=pdns
setuid=pdns
Slave Server:
daemon=yes
disable-axfr=yes
include-dir=/etc/powerdns/pdns.d
setgid=pdns
setuid=pdns
slave=yes
slave-cycle-interval=60
Database on Slave Server
MariaDB [powerdns]> select * from supermasters;
+-------------+------------------+---------+
| ip | nameserver | account |
+-------------+------------------+---------+
| 10.0.0.1 | ns2.example.com | admin |
+-------------+------------------+---------+
1 row in set (0.000 sec)
Both servers use the MySQL database backend. The master is serving all records as expected, but the slave server is giving this:
root@vps10:~# dig example.com @localhost
; <<>> DiG 9.16.1-Ubuntu <<>> example.com @localhost
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 22750
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;example.com. IN A
;; Query time: 4 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Feb 04 22:11:39 UTC 2022
;; MSG SIZE rcvd: 45
I have also checked the slave server and it does not have any zones from the master. I also tried this on the master server:
root@vps06:~# pdns_control notify example.com
Added to queue
I searched online for a solution but found nothing. Can anyone guide me or point out what is wrong with my configuration?
You'll need to enable superslave on the slave and make sure your primary sends the correct notifications (NS records, ALSO-NOTIFY metadata (https://doc.powerdns.com/authoritative/domainmetadata.html?#also-notify), etc.).
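To make the "NS records" part more concrete, here is a small Python model of the check, as I understand it, that a slave applies when it receives a NOTIFY for a zone it does not know yet. This is a simplified sketch, not PowerDNS code: the function name accepts_autoprovision and its arguments are hypothetical, and it only models two conditions, namely that the NOTIFY source IP appears in the supermasters table and that the nameserver name stored in that row is one of the zone's NS records.

```python
def accepts_autoprovision(notify_source_ip, zone_ns_names, supermasters):
    """Model of the supermaster check for a NOTIFY about an unknown zone.

    notify_source_ip: IP address the NOTIFY came from.
    zone_ns_names:    NS names published in the zone on the primary.
    supermasters:     (ip, nameserver) pairs, as in the supermasters table.
    """
    def norm(name):
        # Compare names case-insensitively and ignore a trailing dot.
        return name.rstrip(".").lower()

    ns_set = {norm(name) for name in zone_ns_names}
    return any(
        ip == notify_source_ip and norm(nameserver) in ns_set
        for ip, nameserver in supermasters
    )


# Row taken from the question: 10.0.0.1 | ns2.example.com
table = [("10.0.0.1", "ns2.example.com")]
print(accepts_autoprovision("10.0.0.1", ["ns1.example.com.", "ns2.example.com."], table))
print(accepts_autoprovision("10.0.0.1", ["ns1.example.com."], table))
```

Under this model the first call passes and the second does not, which is why it is worth confirming that example.com on the master really has an NS record for ns2.example.com and that the NOTIFY leaves the master from 10.0.0.1.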

Can't Verify Mailgun Receiving and Tracking Records

I'm trying to set up DNS receiving and tracking records for Mailgun. The DNS sending records were verified and work fine, but for some reason the receiving records and tracking records don't get verified.
The domain I'm using is mg.optimizeprice.com and the DNS provider I'm using is Dynadot. There are two MX records that Mailgun wants me to set up. I took screenshots of the Mailgun DNS records page as well as the Dynadot DNS records page as I have them set up right now. What do I need to change to get this to work?
Also, here is the output of dig optimizeprice.com mx:
; <<>> DiG 9.11.3-1ubuntu1.8-Ubuntu <<>> optimizeprice.com mx
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8708
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;optimizeprice.com. IN MX
;; ANSWER SECTION:
optimizeprice.com. 10800 IN CNAME curved-aardvark-xbc5jxmn5ja3pdaq403yoxab.herokudns.com.
;; AUTHORITY SECTION:
herokudns.com. 10 IN SOA dns1.p05.nsone.net. hostmaster.nsone.net. 1563353642 600 900 1209600 10
;; Query time: 41 msec
;; SERVER: 75.75.75.75#53(75.75.75.75)
;; WHEN: Wed Jul 17 01:55:08 PDT 2019
;; MSG SIZE rcvd: 176

DNS resolution failures for www.docusign.net (US West area)

We are doing API calls to Docusign, which fail occasionally with "getaddrinfo: Name or service not known" errors. Investigating further, we see that when we connect, name resolution fails sometimes, but only from our West datacenter location. Seems the GLB DNS for the US West can take a very long time to resolve, causing DNS client timeouts when it takes >10s to look up the address.
$ dig @1.1.1.1 www.docusign.net
; <<>> DiG 9.9.5-3ubuntu0.8-Ubuntu <<>> @1.1.1.1 www.docusign.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49468
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1452
;; QUESTION SECTION:
;www.docusign.net. IN A
;; ANSWER SECTION:
www.docusign.net. 22 IN CNAME www-geo.docusign.net.akadns.net.
www-geo.docusign.net.akadns.net. 22 IN CNAME www-west.docusign.net.akadns.net.
www-west.docusign.net.akadns.net. 22 IN A 162.248.184.27
;; Query time: 1 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Thu Oct 04 13:16:43 EDT 2018
;; MSG SIZE rcvd: 126
Above is a good result, which took 1 msec (cached).
$ dig @1.1.1.1 www.docusign.net
; <<>> DiG 9.9.5-3ubuntu0.8-Ubuntu <<>> @1.1.1.1 www.docusign.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21193
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1452
;; QUESTION SECTION:
;www.docusign.net. IN A
;; ANSWER SECTION:
www.docusign.net. 6 IN CNAME www-geo.docusign.net.akadns.net.
www-geo.docusign.net.akadns.net. 6 IN CNAME www-west.docusign.net.akadns.net.
www-west.docusign.net.akadns.net. 6 IN A 162.248.184.27
;; Query time: 2725 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Thu Oct 04 13:21:29 EDT 2018
;; MSG SIZE rcvd: 126
This one is worse, as it took nearly 3s. During testing we've seen this go over 12s, which will time out a lot of DNS clients and requesting apps.
Since the TTL is set to 30s, every 30 seconds we have a chance of hitting a timeout: our app generates errors, and then a successful DNS lookup restores service. Unfortunately, this shows up as an error to our customers in our app.
We're able to work around this using hacks, but we're curious whether anyone else is seeing this, and how you've worked around it. Also, it might be good for people at DocuSign/Akamai to look into why the performance of the www-west.docusign.net.akadns.net record is so bad.
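For discussion, here is one shape such a workaround could take. It is a sketch only, and not necessarily what we or anyone else actually runs: the class name StaleFallbackResolver, the max_stale default of 300 seconds, and the injectable lookup and clock parameters are all illustrative assumptions. The idea is to remember the last successful answer per host name and reuse it when a fresh lookup fails with the kind of error getaddrinfo raises.

```python
import socket
import time


class StaleFallbackResolver:
    """Resolve host names, falling back to the last good answer on failure."""

    def __init__(self, lookup=None, max_stale=300.0, clock=time.monotonic):
        self._lookup = lookup or self._system_lookup
        self._max_stale = max_stale  # seconds a remembered answer may be reused
        self._clock = clock
        self._last_good = {}  # hostname -> (addresses, time stored)

    @staticmethod
    def _system_lookup(hostname):
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
        # Element 4 is the sockaddr; its first item is the address string.
        return sorted({info[4][0] for info in infos})

    def resolve(self, hostname):
        try:
            addresses = self._lookup(hostname)
        except OSError:  # socket.gaierror is a subclass of OSError
            cached = self._last_good.get(hostname)
            if cached is not None and self._clock() - cached[1] <= self._max_stale:
                return cached[0]
            raise
        self._last_good[hostname] = (addresses, self._clock())
        return addresses


if __name__ == "__main__":
    resolver = StaleFallbackResolver()
    print(resolver.resolve("www.docusign.net"))
```

Note that this only covers lookups that fail outright. A lookup that eventually succeeds after several seconds still blocks the caller, so it would need a separate timeout around the lookup itself.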

Consul dns round robin and ping

I set up a test cluster of 3 servers. Consul, dnsmasq and NetworkManager are installed on all machines, which run CentOS 7.
I'd like to test a simple round-robin procedure:
Expected: ping consul.service.consul should send ICMP requests to one of the three servers.
Actual: ping always sends requests to one IP address (10.82.5.6)
However, the IP order does change in the answer section of the dig command:
[vagrant@localhost ~]$ dig consul.service.consul
; <<>> DiG 9.9.4-RedHat-9.9.4-51.el7_4.1 <<>> consul.service.consul
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23466
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;consul.service.consul. IN A
;; ANSWER SECTION:
consul.service.consul. 0 IN A 10.82.5.5
consul.service.consul. 0 IN A 10.82.5.4
consul.service.consul. 0 IN A 10.82.5.6
;; Query time: 2 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Dec 13 13:40:20 UTC 2017
;; MSG SIZE rcvd: 98
If I reboot the 10.82.5.6 node, dig returns 2 nodes and ping starts working properly, with round robin. But once 10.82.5.6 has rebooted, ping again goes only to that node.
According to https://www.consul.io/docs/agent/dns.html, the DNS interface randomizes the returned nodes, so it will never be strict round robin.
There's also DNS caching (https://www.consul.io/docs/guides/dns-cache.html): the default TTL is 0, but you may have something different configured, and/or results may be cached somewhere else.
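To illustrate the difference between randomized ordering and strict round robin, here is a small Python simulation. It is purely illustrative and does not talk to Consul: the function names are made up for this example, and the three addresses are the ones from the dig output in the question.

```python
import random
from collections import Counter

ADDRESSES = ["10.82.5.4", "10.82.5.5", "10.82.5.6"]


def rotated_answers(addresses, n):
    """Strict round robin: response k starts with addresses[k % len(addresses)]."""
    size = len(addresses)
    return [addresses[k % size:] + addresses[:k % size] for k in range(n)]


def shuffled_answers(addresses, n, rng=None):
    """Randomized ordering: every response is an independent random permutation."""
    rng = rng or random.Random()
    return [rng.sample(addresses, len(addresses)) for _ in range(n)]


def first_address_counts(answers):
    """Count how often each address appears first in a response."""
    return Counter(answer[0] for answer in answers)


if __name__ == "__main__":
    print("rotated :", first_address_counts(rotated_answers(ADDRESSES, 300)))
    print("shuffled:", first_address_counts(shuffled_answers(ADDRESSES, 300)))
```

With rotation, each address comes first exactly the same number of times and in a fixed cycle. With shuffling, the counts are only roughly equal and the same address can come first several times in a row.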

Why are multiple queries being made to my DNS Server?

As part of a project I've written a very simplistic DNS server whose only purpose is to resolve queries for the zone it serves, and to store the IP addresses of the server that made the query.
I've noticed that if I use dig, my DNS server gets queried multiple times - sometimes from the same IP address. Why does this happen? Is it due to the unreliable nature of UDP?
For example, here's a dig reply I made:
C:\Data>dig xyz.dns.example.com
; <<>> DiG 9.10.4-P2 <<>> xyz.dns.example.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2539
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;xyz.dns.example.com. IN A
;; ANSWER SECTION:
xyz.dns.example.com. 12321 IN A 50.16.166.175
;; Query time: 224 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Thu Aug 11 15:07:42 Eastern Daylight Time 2016
;; MSG SIZE rcvd: 77
In this example, the zone file for example.com has an NS record for dns.example.com, which is where my simplistic DNS server runs. For this one query, my server was called 4 times from 2 different IP addresses.
I also noticed that I'm supposedly returning an "Additional" record, but the data I return in bytes 10 and 11 are clearly 0. Could this be causing a problem?
Try dig's +trace option:
dig example.com +trace
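On the "Additional" question: bytes 10 and 11 of a DNS message (counting from 0) hold ARCOUNT, the last of the six 16-bit header fields. The header dig printed above comes from 192.168.1.1, the resolver dig asked, not from your server, so the ADDITIONAL: 1 is most likely that resolver's own EDNS OPT record, the one dig shows under OPT PSEUDOSECTION. Here is a minimal Python sketch for decoding the header of whatever bytes your server actually sends. The names DnsHeader and parse_header are my own, not from any library.

```python
import struct
from typing import NamedTuple


class DnsHeader(NamedTuple):
    id: int
    flags: int
    qdcount: int  # bytes 4-5: questions
    ancount: int  # bytes 6-7: answer records
    nscount: int  # bytes 8-9: authority records
    arcount: int  # bytes 10-11: additional records


def parse_header(message: bytes) -> DnsHeader:
    """Decode the fixed 12-byte header at the start of a DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS message needs at least a 12-byte header")
    # Six unsigned 16-bit integers in network (big-endian) byte order.
    return DnsHeader(*struct.unpack("!6H", message[:12]))


if __name__ == "__main__":
    # Header rebuilt from the dig output above: id 2539, flags qr rd ra (0x8180),
    # 1 question, 1 answer, 0 authority, 1 additional.
    sample = struct.pack("!6H", 2539, 0x8180, 1, 1, 0, 1)
    print(parse_header(sample))
```

If parse_header on your server's raw reply reports arcount 0, the reply matches what you intended to send, and the count dig shows was added further along the path.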
