How can I catch a SERVFAIL exception using Python's dns resolver? - python-3.x

I'm looking to query a domain like this:
dns.resolver.resolve("dnssec-failed.org","A")
Which returns an error like this:
raise NoNameservers(request=self.request, errors=self.errors)
dns.resolver.NoNameservers: All nameservers failed to answer the query dnssec-failed.org. IN A: Server 127.0.0.1 UDP port 53 answered SERVFAIL
I want to be able to catch that exception in my function like so:
def get_a_record(url):
    try:
        answers = dns.resolver.resolve(url,"A")
    except dns.resolver.SERVFAIL:
        print("SERVFAIL error for %s" % url)
    except dns.resolver.NXDOMAIN:
        print("No such domain %s" % url)
    except dns.resolver.Timeout:
        print("Timed out while resolving %s" % url)
    except dns.exception.DNSException:
        print("Unhandled exception")
Now I know that in the above snippet dns.resolver doesn't have a SERVFAIL exception, but what I'd like to do is catch the error, log it, and continue my script. Is there a proper way to do this using the dns resolver package, or would I need to call the dig command and parse that result?
EDIT
For clarification, I only used dnssec-failed.org as an example because it results in what I thought would be the same response as something I am specifically looking for but don't actually have any active examples of. That "something" being domains which point to IP addresses that are no longer in use. Dangling NS records, in other words.
For example I use an IP address that is loaned to me by AWS for use in some XYZ cloud-based application, and I create the name-to-address mapping records in my DNS zone. If I decide to deprecate this service and return the ip back to the cloud provider's pool of ips but forget to remove the DNS record from the zone, it is left "dangling".
That is what I am looking for and I mistakenly assumed that a SERVFAIL is the type of response I get from a query like dig domain-with-no-ip.com
Apologies for the confusion.
EDIT 2
I went and tested this by taking a domain I'd already registered. Configured an A record for it and pointed it to an Ubuntu EC2 listening on port 7272 (python3 -m http.server 7272). Waited 5 minutes for the zone to propagate and then I was able to reach my domain, publicly. All fine and good.
Then I stopped the instance, waited a bit, and then restarted it. Upon coming back up it had a new public ip. Great. So at this point there is a dangling A record for me to test.
So I do dig and nslookup on the domain. Both come back with perfectly fine answers. They just simply point to the now old/original public ip. And that makes sense since the DNS record hasn't changed. The only observable thing that really changes is something like curl, which times out.
So unless my understanding is still wrong, there really isn't an all-too-reliable way to hunt down dangling A records, because an HTTP timeout doesn't necessarily imply a dangling record. The server could just be off/down while the IP is still attached to the resource. Am I correct in my understanding or am I missing something still?
EDIT 3
Accepting the answer because even though my question mildly evolved into something else, the answer did technically address the original question of my post and I think that warrants accepting it.

First, dnssec-failed.org has nameservers but is, by design, failing DNSSEC.
Hence a simple query towards any recursive nameserver that does DNSSEC validation will fail with SERVFAIL as expected:
$ dig dnssec-failed.org NS +short
(no output)
$ dig dnssec-failed.org NS | grep status: | tail -1
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 46517
but as soon as you disable DNSSEC validation you get the nameservers as it should be:
$ dig dnssec-failed.org NS +cdflag +noall +ans +nottlunits
dnssec-failed.org. 7200 IN NS dns105.comcast.net.
dnssec-failed.org. 7200 IN NS dns101.comcast.net.
dnssec-failed.org. 7200 IN NS dns102.comcast.net.
dnssec-failed.org. 7200 IN NS dns103.comcast.net.
dnssec-failed.org. 7200 IN NS dns104.comcast.net.
Now back to the Python part.
resolve() from dnspython is a high-level API call: it does everything a resolver does, which potentially means recursing from the root until an answer can be given. Hence this simple call hides possibly multiple questions and responses and as such may not expose you to the real underlying problem; its output is high-level too, in the form of an exception.
As you can see in your own example, you have the SERVFAIL right in the error message, but it is a NoNameservers exception because the resolution asked the registry nameservers for the list of nameservers (which works; there is a DS for this name in the parent nameservers), then asked any of those nameservers for further data, and that data then failed DNSSEC validation, hence the final exception.
It is not clear to me what your position on DNSSEC errors is in your case: whether you do not care about them, or really want to study them and do something particular. Hence the solutions below may need to be adapted. If you do not care, just log the NoNameservers exception and go on; everything will work as expected, since a DNSSEC validation error looks exactly like a broken domain, which is by design.
Hence, do you really need to handle DNSSEC errors any differently from other errors? Why can't you catch the NoNameservers exception, log it, and go further?
Otherwise, the quick (and dirty) way: just parse the error message attached to the NoNameservers exception, and if you see SERVFAIL you can suppose (but not be 100% sure) that it is a DNSSEC problem, and at least go further as you need.
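As a minimal sketch of that quick-and-dirty approach, the function from the question could be reshaped as below. The helper name is_servfail and the log messages are only illustrative, and the dnspython imports are placed inside the function purely so the string check can be used on its own; the check itself relies on the wording of the error message, so treat it as a heuristic.

```python
def is_servfail(exc: Exception) -> bool:
    """Heuristic: does the exception text report a SERVFAIL answer?

    The NoNameservers message lists what each server answered, e.g.
    "... Server 127.0.0.1 UDP port 53 answered SERVFAIL".
    """
    return "SERVFAIL" in str(exc)


def get_a_record(name: str) -> list:
    """Resolve A records; log failures and return [] instead of raising."""
    # Imported here (an illustrative choice) so is_servfail() has no dependency.
    import dns.exception
    import dns.resolver

    try:
        answers = dns.resolver.resolve(name, "A")
        return [rdata.address for rdata in answers]
    except dns.resolver.NXDOMAIN:
        print("No such domain %s" % name)
    except dns.resolver.NoNameservers as exc:
        if is_servfail(exc):
            print("SERVFAIL error for %s" % name)
        else:
            print("All nameservers failed for %s: %s" % (name, exc))
    except dns.exception.Timeout:
        print("Timed out while resolving %s" % name)
    except dns.exception.DNSException as exc:
        print("Unhandled DNS exception for %s: %s" % (name, exc))
    return []
```

Because every branch only logs, a loop over many names keeps going after a failure.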
If you really need further details and want to be sure it is a DNSSEC problem, you need to do the equivalent of what is shown above for dig, that is, do 2 queries that differ only in the CD DNS flag, and compare the results. This means going "lower" than the resolve() API and using dns.query directly, like this:
>>> import dns.flags, dns.message, dns.query, dns.rcode
>>> resolver_ip = '8.8.8.8' # Use any recursive **validating** nameserver that you trust
>>> query=dns.message.make_query('dnssec-failed.org', 'A')
>>> response = dns.query.udp_with_fallback(query, resolver_ip)[0]
>>> response.rcode() == dns.rcode.SERVFAIL
True
# Now checking if disabling DNSSEC resolves the problem and gets us a reply
# If so, it really means there is a DNSSEC problem
>>> print(str(query))
id 65008
opcode QUERY
rcode NOERROR
flags RD
;QUESTION
dnssec-failed.org. IN A
;ANSWER
;AUTHORITY
;ADDITIONAL
>>> query.flags
<Flag.RD: 256>
>>> query.flags = query.flags | dns.flags.CD
>>> query.flags
<Flag.RD|CD: 272>
>>> print(str(query))
id 65008
opcode QUERY
rcode NOERROR
flags RD CD
;QUESTION
dnssec-failed.org. IN A
;ANSWER
;AUTHORITY
;ADDITIONAL
# We enabled flag "CD" aka checking disabled aka please do not do any DNSSEC validation, and now doing the same query as above again:
>>> response = dns.query.udp_with_fallback(query, resolver_ip)[0]
>>> response.rcode() == dns.rcode.SERVFAIL
False
>>> response.rcode() == dns.rcode.NOERROR
True
>>> response.answer[0][0]
<DNS IN A rdata: 69.252.80.75>
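The session above could be wrapped into a reusable function along these lines. This is only a sketch: the names classify and probe, the 5 second timeout, and the wording of the verdicts are assumptions of mine, and the resolver at 8.8.8.8 is just the one used above. The comparison logic is kept separate from the network calls so it can be checked without sending any query.

```python
SERVFAIL = 2  # RCODE values as defined in RFC 1035
NOERROR = 0


def classify(rcode_validated: int, rcode_cd: int) -> str:
    """Compare the RCODE of a normal query with the RCODE of the same query sent with CD set."""
    if rcode_validated != SERVFAIL:
        return "no SERVFAIL"
    if rcode_cd == NOERROR:
        # Failing only while validation is enabled points at DNSSEC.
        return "likely DNSSEC validation failure"
    return "SERVFAIL not explained by DNSSEC validation"


def probe(name: str, resolver_ip: str = "8.8.8.8", rdtype: str = "A") -> str:
    """Send the query twice (without and with the CD flag) and classify the pair."""
    import dns.flags
    import dns.message
    import dns.query

    query = dns.message.make_query(name, rdtype)
    validated, _ = dns.query.udp_with_fallback(query, resolver_ip, timeout=5)

    query_cd = dns.message.make_query(name, rdtype)
    query_cd.flags |= dns.flags.CD  # "checking disabled": skip DNSSEC validation
    unchecked, _ = dns.query.udp_with_fallback(query_cd, resolver_ip, timeout=5)

    return classify(int(validated.rcode()), int(unchecked.rcode()))
```

Calling probe("dnssec-failed.org") against a validating resolver should then land in the DNSSEC branch, matching the two responses shown in the session.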

Related

Does DNSpython have a method that automatically performs a forward or reverse lookup depending on the value that it's passed?

I am wondering if there is a way to pass either a host, fqdn, or IP address to DNSPython and have it perform the proper lookup (forward for hosts and fqdns and reverse for ips). Also, I want to know what kind of address (host, fqdn, or ip) was sent to the method originally.
Thanks in advance
To my knowledge, there's not a single function that will do what you're looking for. However, it wouldn't be too hard to implement. Here's how I'd probably do it.
First, I'd check if the input was an IP address
import ipaddress
def is_ipaddress(string):
    try:
        ipaddress.ip_address(string)
        return True
    except ValueError:
        return False
If it is an IP address, then I'd call dns.query.reverse_query(). This is assuming I have installed the latest development version of dnspython from Github because reverse_query() was only recently added (see https://github.com/rthalley/dnspython/commit/7c105cce64699e1a221176f98f7cb9e682aba1e0).
If it's not an IP address, then I'd prepare my query with dns.message.make_query(name, rdtype) and then send it with dns.query.udp().
If you wanted to use the value of search in /etc/resolv.conf, you might consider creating a dns.resolver.Resolver, which currently does search processing by default.
import dns.resolver
import dns.rdatatype
resolver = dns.resolver.Resolver()
response = resolver.query('my-computer', dns.rdatatype.A)
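To also report what kind of value was passed in, the dispatch could be sketched with the standard library alone, as below. The function names are hypothetical, and the host/fqdn split is only a heuristic I am assuming (a name containing a dot, or ending in one, is treated as an fqdn). The function returns the name and record type to ask for; sending the query is left to whichever dnspython call you prefer.

```python
import ipaddress


def classify_target(value: str) -> str:
    """Return "ip", "fqdn" or "host" for the given string."""
    try:
        ipaddress.ip_address(value)
        return "ip"
    except ValueError:
        pass
    # Heuristic (an assumption): any dot means the name is qualified.
    return "fqdn" if "." in value else "host"


def build_question(value: str) -> tuple:
    """Return (kind, name to query, record type) for a forward or reverse lookup."""
    kind = classify_target(value)
    if kind == "ip":
        # reverse_pointer yields the in-addr.arpa / ip6.arpa name for a PTR query.
        return kind, ipaddress.ip_address(value).reverse_pointer, "PTR"
    return kind, value, "A"


print(build_question("192.0.2.10"))
print(build_question("my-computer"))
```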

Understanding DNS response header information

I'm in the middle of learning about DNS, and I'm trying to understand how a non-recursive resolver/server would respond to an empty response.
My understanding of DNS is basically that:
If the server returns a non-authoritative response, it will usually provide a list of nameservers (the NSCOUNT) which you can consult to find the authoritative response.
But, what happens if a DNS server returns nothing? As in - just the response header with ANCOUNT = 0, NSCOUNT = 0 and ARCOUNT = 0?
For example, if I query Google's free DNS server (8.8.8.8), and I ask it to resolve "google.com", and the recursion bit is NOT set, this is the response I get:
+---------------------------------------------------------------------------+
| 25550 | QR: 1 | OP: 00 | AA: 0 | TC: 0 | RD: 0 | RA: 1 | Z: 0 | RCODE: 00 |
+---------------------------------------------------------------------------+
| QDCOUNT: 1, ANCOUNT: 0, NSCOUNT: 0, ARCOUNT: 0 |
+---------------------------------------------------------------------------+
So basically, it returned nothing to me except my original query, and it informed me that recursion is available.
In this case, how should the query proceed (assuming we don't just ask the server to use recursion)? Is the only recourse here to contact one of the top-level servers? Or, to put my question another way, how come Google's DNS server didn't return me a list of nameservers (why is NSCOUNT 0?) that I can consult?
When you said "no recursion", Google's nameserver did not recurse. Since it is not an authoritative nameserver for google.com, it didn't provide any answer. This is normal and acceptable behaviour.
To get the A record for google.com from it, you have to send the request with the "recurse" bit set. The other way is:
Find the NS for com., from the root servers.
Find the NS for google.com from one of the com NS.
Find the A record for google.com from the google.com NS.
Basically, you do what the recursive nameserver was supposed to do for you.
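The order of those steps can be sketched as a tiny helper that lists which zone to ask about next, starting from the root. The function name is hypothetical, and it assumes there is a delegation at every label, which holds for google.com but not for every name.

```python
def delegation_chain(name: str) -> list:
    """List the zones to walk through, from the root down to the name itself."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    chain = ["."]
    # Add one label at a time, from the rightmost (the TLD) to the leftmost.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


print(delegation_chain("google.com"))
```

Each entry is a zone whose nameservers are obtained from the entry before it.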
Note: a recursive NS can use its cache to get you a response without actual queries, based on the TTL of the record (and, of course, only if you set the recursion bit).
Only an authoritative server is supposed to include the NS records in the authority section of the response.
The Google 8.8.8.8 servers are not authoritative for google.com, and you asked them not to recurse, so they didn't.
This is an abnormal query that a real DNS client wouldn't send to them, so their response of "NO DATA / NO ERROR" (RCODE == 0, ANCOUNT == 0) is acceptable.
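For anyone decoding such a response by hand, here is a small sketch that unpacks the 12-byte header laid out in RFC 1035 and recognises the "NO DATA / NO ERROR" case discussed here. The function names are mine, and the sample bytes are rebuilt from the values in the question rather than captured from the wire.

```python
import struct


def parse_dns_header(packet: bytes) -> dict:
    """Decode the fixed 12-byte DNS header into its fields."""
    if len(packet) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    ident, flags, qdcount, ancount, nscount, arcount = struct.unpack("!6H", packet[:12])
    return {
        "ID": ident,
        "QR": (flags >> 15) & 0x1,
        "OPCODE": (flags >> 11) & 0xF,
        "AA": (flags >> 10) & 0x1,
        "TC": (flags >> 9) & 0x1,
        "RD": (flags >> 8) & 0x1,
        "RA": (flags >> 7) & 0x1,
        "Z": (flags >> 4) & 0x7,
        "RCODE": flags & 0xF,
        "QDCOUNT": qdcount,
        "ANCOUNT": ancount,
        "NSCOUNT": nscount,
        "ARCOUNT": arcount,
    }


def is_empty_noerror(header: dict) -> bool:
    """True when RCODE is 0 and the answer, authority and additional sections are all empty."""
    return header["RCODE"] == 0 and (
        header["ANCOUNT"] == header["NSCOUNT"] == header["ARCOUNT"] == 0
    )


# Header from the question: ID 25550, QR=1, RA=1, every other flag 0, QDCOUNT=1.
sample = struct.pack("!6H", 25550, 0x8080, 1, 0, 0, 0)
header = parse_dns_header(sample)
print(header, is_empty_noerror(header))
```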

dig "hostname_1" #"IP" command

I was told to execute the command: dig "hostname_1" #"IP".
I don't know what it is for. Any idea? And what is the meaning of "#IP"?
Another question, the response has the field:
;;AUTHORITY SECTION:
"hostname_1" 1200 IN NS "hostname_2"
"hostname_1" 1200 IN NS "hostname_3"
Is it correct that hostname_2 and hostname_3 are other names for hostname_1? Or are they nameservers of the hostname_1 host?
dig is a tool for performing DNS lookups.
Normally dig asks your locally configured nameserver; however, you can make dig ask the nameserver running on a specific IP instead. Note that dig's own syntax for this is @IP, so the #IP in the command you were given presumably stands for @IP.
The output of dig can be read as follows: KEY, TTL (time to live in seconds), CLASS (normally "IN" for Internet), TYPE, RDATA (resource data); see https://en.wikipedia.org/wiki/Resource_record for a longer description.
There are a number of types (see https://en.wikipedia.org/wiki/List_of_DNS_record_types), NS means "nameserver". In your case hostname_2 and hostname_3 are the responsible nameservers for hostname_1.
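If you ever need to read such lines in a script, the five fields can be split apart as sketched below. The names ResourceRecord and parse_dig_line are hypothetical, and the sketch assumes a line with all five fields present, as in the AUTHORITY section shown in the question.

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    name: str      # the KEY (owner name)
    ttl: int       # time to live in seconds
    rdclass: str   # normally "IN"
    rdtype: str    # e.g. "NS", "A"
    rdata: str     # resource data; may itself contain spaces


def parse_dig_line(line: str) -> ResourceRecord:
    """Split one record line of dig output into its five fields."""
    # maxsplit=4 keeps any spaces inside the RDATA field intact.
    name, ttl, rdclass, rdtype, rdata = line.split(None, 4)
    return ResourceRecord(name, int(ttl), rdclass, rdtype, rdata.strip())


record = parse_dig_line("hostname_1. 1200 IN NS hostname_2.")
print(record.rdtype, record.rdata)
```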

AAAA DNS query on ipv4 interface

We use RH5.8 with ipv6 disabled.
named(bind) service is in forward mode (cache enabled)
options {
directory "/var/named";
listen-on { 127.0.0.1; };
forwarders {10.10.12.1;};
forward only;
};
It appears that some commands (like telnet) always query the AAAA record first, and by the time they fall back to querying the A record, the answer (No such name) is already in named's cache.
As a result, clients are always getting an error.
In the example below, 10.10.10.1 is a local IP:
127.0.0.1 -> 127.0.0.1 DNS Standard query AAAA testapp.test.com
10.10.10.1 -> 10.10.12.1 DNS Standard query AAAA testapp.test.com
10.10.10.1 -> 10.10.12.1 DNS Standard query AAAA testapp.test.com
10.10.12.1 -> 10.10.10.1 DNS Standard query response, No such name
127.0.0.1 -> 127.0.0.1 DNS Standard query response, No such name
127.0.0.1 -> 127.0.0.1 DNS Standard query A testapp.test.com
127.0.0.1 -> 127.0.0.1 DNS Standard query response, No such name
I searched the net and discovered that I am not the only one who has encountered this problem:
http://www.linuxforums.org/forum/red-hat-fedora-linux/136217-disabling-ipv6-dns-queries.html
less /etc/modprobe.conf
alias net-pf-10 off
alias ipv6 off
options ipv6 disable=1
less /etc/sysconfig/network
NETWORKING_IPV6=no
less /etc/sysconfig/named
OPTIONS="-4"
named -v
BIND 9.3.6-P1-RedHat-9.3.6-20.P1.el5
but unfortunately did not find any solution so far...
As requested in the comments: some explanation on negative caching.
The difference between NXDOMAIN and NODATA is described in section 5 of RFC 2308:
A negative answer that resulted from a name error (NXDOMAIN) should
be cached such that it can be retrieved and returned in response to
another query for the same <QNAME, QCLASS> that resulted in the
cached negative response.
So an NXDOMAIN can be cached based on the QNAME (i.e. "blabla.example.com.") and the QCLASS (usually "IN"). So it means that blabla.example.com does not exist at all. The negative cache entry is independent of the QTYPE. A NODATA answer is different:
A negative answer that resulted from a no data error (NODATA) should
be cached such that it can be retrieved and returned in response to
another query for the same <QNAME, QTYPE, QCLASS> that resulted in
the cached negative response.
Here the QTYPE (i.e. "AAAA") is included. A NODATA negative cache entry only means that this specific record type does not exist for this name.
So: If you receive an NXDOMAIN response then you know that the name doesn't exist at all for any record type. If you receive a NODATA response then you know that the requested record type does not exist, but other record types may exist.
This also means that when sending responses you should never send an NXDOMAIN response if there may be a valid record of a different record type for the same name. The non-existence of the domain name will be cached and the cache will start telling its clients that the name doesn't exist at all.
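The difference between the two cache keys can be illustrated with a toy negative cache. The class below is a hypothetical sketch that ignores TTLs and everything else a real cache does; it only shows why a cached NXDOMAIN for the AAAA query also answers the later A query, as in the trace from the question.

```python
class NegativeCache:
    """Toy negative cache keyed the way RFC 2308 section 5 describes."""

    def __init__(self):
        self._nxdomain = set()  # keys: (QNAME, QCLASS)
        self._nodata = set()    # keys: (QNAME, QTYPE, QCLASS)

    def store_nxdomain(self, qname, qclass="IN"):
        self._nxdomain.add((qname.lower(), qclass))

    def store_nodata(self, qname, qtype, qclass="IN"):
        self._nodata.add((qname.lower(), qtype, qclass))

    def lookup(self, qname, qtype, qclass="IN"):
        """Return "NXDOMAIN", "NODATA" or None (nothing cached)."""
        qname = qname.lower()
        if (qname, qclass) in self._nxdomain:
            return "NXDOMAIN"  # applies to every QTYPE
        if (qname, qtype, qclass) in self._nodata:
            return "NODATA"    # applies to this QTYPE only
        return None


cache = NegativeCache()
cache.store_nxdomain("testapp.test.com.")        # cached after the AAAA query
print(cache.lookup("testapp.test.com.", "A"))    # the A query hits the same entry

cache.store_nodata("blabla.example.com.", "AAAA")
print(cache.lookup("blabla.example.com.", "A"))  # other types are unaffected
```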

Dns Server (edns)opt type of resource record

I want to send a request containing an OPT resource record to a BIND 9 DNS server,
but I don't know the format or the server configuration.
http://www.ietf.org/rfc/rfc2671.txt is the EDNS document.
I created the message following that document, but it doesn't work: the server reports a format error.
The request message:
Question Record:
QName:a6.debian.com
QType:0x41(OPT type)
QClass:0x01(Internet)
Additional Record:
Resource Name:0xc0,0x0c( pointer to QName)
Resource Type:0x41
ResourceClass:512(udp payload size)
TimeToLive:0x1EF0000(split to extent-code version and Z)
ResourceDataLength:0x08
Rdata:(OPTRdata):
OptCode:0x4000
OptLength:0x04
OptData:0x0A,0x0A,0x0A,0x0A
What's wrong? Could you help me?
There's no OPT record type that you can query for. OPT is a pseudo-record type: you use it to pass specific parameters, such as EDNS options, to the DNS server. Besides, it can appear in the "ADDITIONAL" section only.
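For reference, here is a minimal sketch of how the OPT pseudo-record for the additional section could be packed, following the layout in RFC 2671. Two details differ from the message in the question: the OPT type code is 41 in decimal (0x29, not 0x41), and the owner name is the root (a single zero byte), not a pointer to the QName. The function name and defaults are my own.

```python
import struct

OPT_TYPE = 41  # decimal; 0x29 in hex


def build_opt_rr(udp_payload_size: int = 512, ext_rcode: int = 0,
                 version: int = 0, z: int = 0, options: bytes = b"") -> bytes:
    """Pack an OPT pseudo-record for the additional section of a query."""
    # The TTL field is reused: extended RCODE (8 bits), version (8 bits), Z (16 bits).
    ttl = (ext_rcode << 24) | (version << 16) | z
    # Root owner name, then TYPE, CLASS (reused as UDP payload size), TTL, RDLENGTH.
    return b"\x00" + struct.pack("!HHIH", OPT_TYPE, udp_payload_size, ttl, len(options)) + options


print(build_opt_rr(512).hex())
```

The question section of the same message still carries an ordinary query (for example type A for the name), and ARCOUNT in the header has to be 1 to account for this record.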

Resources