DNS: Switch an A Record to a CNAME Without Impacting Consumers

Say we have an A record that points to IP x of our load balancer for one of our services. It has a TTL of 3600s. What it should have been is a CNAME that points to an A record for a VIP. It's already in production, and about 10 services, comprising ~100 machines, call it. If the A record is deleted and a new CNAME is created with the same name, pointing to a new A record, will the consumers notice this change? Is there a chance that the callers would time out?
I'd assume that with that many machines, some are bound to be impacted. If I set the TTL to 5 hours, would there be a better chance of no one noticing?
So my question is: how do I swap an A record for a CNAME without consumers of our service noticing?
Would it matter if the record is for use inside the network only vs available to the public?
I ask because we will need to load balance across data centers soon, and we have some records that are stuck pointing to an IP.
It would be nice to have an explanation of how the DNS system would behave in this scenario. Thanks.

Let's assume that you have a name foo.example.org that has nothing except an A record with the IPv4 address 192.0.2.1 and a 3600 second TTL. Anyone who looks up foo.example.org will get that A record, and remember it for an hour before they go and ask your name server for fresher information.
Then assume you change things so that foo.example.org has a CNAME record pointing at bar.example.net, which in turn has an A record holding the address 192.0.2.1. Anyone who looks up the name foo.example.org for the first time will get the CNAME, proceed to look up bar.example.net, and get the A record from there.
The only complication is that anyone who looked up foo.example.org during the 3600 seconds immediately before the change to the CNAME chain took effect will remember the direct lookup, and thus not see the new information until the TTL expires. So for up to an hour after you make the change, some people may still see the old information. To keep the change transparent to users, make sure that the old information (the old IP address) still works for at least one full TTL period after you make the change.
This is not in any way special for changing from A to CNAME. No matter what you change, there will be a full TTL period during which clients can legitimately get the old info. That's just how DNS works.
On top of that, of course, there are clients and caching servers that don't pay as much attention to the TTL value as they should, but that's a whole different thing.
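To make the caching behaviour concrete, here is a minimal sketch of a toy caching resolver. It is not a real DNS implementation; the class and field names are made up for illustration, and time is passed in explicitly so the example is deterministic.

```python
from dataclasses import dataclass


@dataclass
class CachedAnswer:
    records: list      # e.g. [("A", "192.0.2.1")]
    expires_at: float  # absolute time after which the entry is stale


class ToyResolverCache:
    """Toy model of a caching resolver (hypothetical, for illustration only)."""

    def __init__(self, authoritative):
        # authoritative: dict mapping name -> (ttl_seconds, records)
        self.authoritative = authoritative
        self.cache = {}

    def lookup(self, name, now):
        entry = self.cache.get(name)
        if entry is not None and now < entry.expires_at:
            return entry.records  # still within the TTL: answer from cache
        ttl, records = self.authoritative[name]  # ask the "name server"
        self.cache[name] = CachedAnswer(records, now + ttl)
        return records


zone = {"foo.example.org": (3600, [("A", "192.0.2.1")])}
resolver = ToyResolverCache(zone)

before = resolver.lookup("foo.example.org", now=0)

# The operator swaps the A record for a CNAME at some point after t=0.
zone["foo.example.org"] = (3600, [("CNAME", "bar.example.net")])

during = resolver.lookup("foo.example.org", now=1800)  # cached A record
after = resolver.lookup("foo.example.org", now=3600)   # TTL expired, sees CNAME
```

A client that cached the A record at t=0 keeps using 192.0.2.1 until t=3600, which is why the old address has to keep working for one full TTL after the switch.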

Related

What are Time To Refresh (TTR), Time To Live (TTL), Time To Birth (TTB), and Cascade Delete (CCD) in an AtKey's metadata for?

These are a few "Time To" mechanisms that deserve to have some light shed on them, as they can be quite useful to a developer. I will answer below in an attempt to explain what they can be used for, and why, on the #platform.
Time To Mechanisms (Attributes of Metadata)
Any data that is shared between #signs can go through several mechanisms. Some of these mechanisms include TTR (Time To Refresh), TTL (Time To Live), and TTB (Time To Birth).
Time To Refresh
TTR, which is an attribute of the metadata of a shared key, accepts an integer value which represents seconds. The subsequent refresh happens based on the given value: for example, if the set TTR value is 86400, then the refresh happens once in a day (there are 86,400 seconds in a day). Another very important attribute of the metadata is CCD (Cascade Delete), which is a boolean variable (a variable that accepts true or false values). For those who are well versed in SQL and database management, you will already have some understanding of what CCD does and how it functions.
If the CCD value is set as true when the sender deletes their original key, the cached key gets deleted on both the sender’s server and the recipient’s server. Correspondingly, if the CCD value is false when the sender deletes their original key, the cached key gets deleted on only the sender’s server and remains cached on the recipient’s server. But why is this useful? CCD is used to avoid unnecessary network calls. As an example: if #alice is in need of #bob’s phone number, she does not need to make a request from her server to #bob’s server to find it, but rather needs only to search locally on her device to find the phone number.
Let’s consider a similar example: #alice shares her phone number with her friends #bob and #john. A few months later, however, #alice purchases a new phone plan, resulting in a new phone number. If #alice has set a TTR value on the shared key, then once she updates her old phone number to her new one, the updated value will also be reflected on #bob and #john’s devices at the next refresh. #alice also has the ability to set a specific time, in seconds, for when the new phone number will be cascaded on shared servers (this is TTB, which is described later). This can be 10 minutes, a day, or whatever specific amount of time she defines.
This function can be quite handy, especially if someone is constantly updating values on their server. This prevents a high density of calls and requests whenever someone wishes to see what new values exist on a shared server.
Time To Live
TTL (Time To Live) is quite self-explanatory: it defines how long data will live on a server. Anyone with an #sign has the ability to upload information on their server and define how long it stays on the server before it is automatically deleted. If #alice wishes to share her summer vacation getaway location as her current location, she has the option to share that summer vacation location for as long as she plans on being there!
To really take advantage of a mechanism like this, developers can combine it with other Time To commands to make life for themselves and those they share their information with easier. Say for instance Alice lives in sunny San Francisco, and owns a vacation home in Spain. With mechanisms such as Time To Refresh and Time To Live, Alice has the ability of travelling to her vacation home for several weeks, uploading her current location as Spain, and setting that information to live on her server for the several weeks that she will be staying at that location.
Time To Birth
Another Time To mechanism that is utilized within the #protocol is the Time To Birth mechanism. This mechanism allows individuals to upload information to their secondary server and have it become activated after a specified amount of time, in seconds. During the time that the data is not ‘active’, any recipients of this information will see the ‘null’ value in place until the activation has occurred.
For example, if #alice wishes to upload the URL of her personal website once she has completed it, she can simply specify that the URL value becomes active on her secondary in exactly 1 day’s time. Until the value is activated a day later, #bob can only see that her website URL is ‘null’.
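The interaction between Time To Birth and Time To Live can be sketched with a small model. This is not the platform's client API; `SharedValue` and `visible_value` are hypothetical names, and the sketch assumes that the TTL starts counting once the value has become active, which the description above does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SharedValue:
    value: str
    created_at: int            # seconds on an arbitrary clock
    ttb: int = 0               # Time To Birth: seconds until the value is active
    ttl: Optional[int] = None  # Time To Live: seconds it stays active (None = forever)


def visible_value(item: SharedValue, now: int) -> Optional[str]:
    """Return what a recipient would see at time `now` (None stands for 'null')."""
    born_at = item.created_at + item.ttb
    if now < born_at:
        return None  # not yet "born"
    # Assumption: TTL is measured from the moment of activation.
    if item.ttl is not None and now >= born_at + item.ttl:
        return None  # expired and deleted
    return item.value


# The website URL becomes visible one day (86400 s) after upload.
website = SharedValue("https://alice.example", created_at=0, ttb=86400)
early = visible_value(website, now=3600)
later = visible_value(website, now=86400)

# A vacation location that lives for two weeks and then disappears.
location = SharedValue("Spain", created_at=0, ttl=14 * 86400)
on_holiday = visible_value(location, now=7 * 86400)
back_home = visible_value(location, now=15 * 86400)
```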

Knot Resolver: How to observe and modify a resolved answer at the right time

Goal
I would like to stitch up a GNU GPL licensed Knot Resolver module either in C or in CGO that would examine the client's query and the corresponding resolved answer with the goal of querying an external API offering a knowledge base of malware infected hostnames and ip addresses (e.g. GNU AGPL v3 IntelMQ).
If there is a match with the resolved A's (AAAA's) IP address it is to be logged, likewise a match with the queried hostname should be logged or (optionally) it could result in sending the client an IP address of a sinkhole instead of the resolved one.
Means
I studied the layers and I came to the conclusion that the phase I'm interested in is consume. I don't want to affect the resolution process, I just want to step in at the last moment and check the results and possibly modify them.
I ventured to register a consume function
with
static knot_layer_api_t _layer = {
    .consume = &consume,
};
but I'm not sure it is the right place to do the deed.
Furthermore, I also looked into module hints.c, especially its query method
and module stats.c for its _to_wire function usage.
Question(s)
Phase (Layer?)
When is the right time to step in and read/write the answer to the query before it's sent to the client? Am I at the right spot in the consume layer?
Answer sections
If the following attempt at getting the resolved IP address gives me the Name Server's address:
char addr_str[INET6_ADDRSTRLEN];
memset(addr_str, 0, sizeof(addr_str));
const struct sockaddr *src = &(req->answer->sections);
inet_ntop(qry->ns.addr[0].ip.sa_family, kr_inaddr(src), addr_str, sizeof(addr_str));
DEBUG_MSG(NULL, "ADDR: %s\n", addr_str);
how do I get the resolved (A, AAAA) IP address for the query's hostname? I would like to iterate over A/AAAA IP addresses and CNAMEs in the answer and look at the IP addresses they were resolved to.
Modifying the answer
If the module setting demands it, I would like to be able to "ditch" the resolved answer and provide a new one comprising an A record pointed at a sinkhole.
How do I prepare the record so that it can be translated from char* to Knot's wire format and the proper structure, in the right context and at the right phase?
I guess it might go along functions such as knot_rrset_init and knot_rrset_add_rdata, but I wasn't able to arrive at any successful result.
THX for pointers and suggestions.
If you want to step in at the last moment, when the response is finalised but not yet sent to the requestor, the right place is finish. You can do it in consume as well, but there you'll be overwriting responses from authoritative servers, not the assembled response to the requestor (which means the DNSSEC validator is likely to stop your rewritten answers).
Disclaimer: the Go interface is rough and requires a lot of CGO code to access internal structures. You'd probably be better served by a LuaJIT module; there is another module doing something similar that you can take as an example, and it also has wrappers for creating records from text etc. If you still want to do it in Go, that's awesome, and improvements to the Go interface are welcome. Read on.
What you need to do is roughly this (as CGO).
That will walk you through RR sets in the packet (C.knot_rrset_t),
where you can match type (rr.type) and contents (rr.rdata).
Contents is stored in DNS wire format, for address records it is the address in network byte order, e.g. {0x7f, 0, 0, 1}.
You will have to compare that to address/subnet you're looking for - example in C code.
When you find a match, you want to clear the whole packet and insert sinkhole record (you cannot selectively remove records, because the packet is append-only for performance reasons). This is relatively easy as there is a helper for that. Here's code in LuaJIT from policy module, you'd have to rewrite it in Go, using all functions mentioned above and using A/AAAA sinkhole record instead of SOA. Good luck!
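The address comparison itself is independent of Knot's API, so here is a minimal sketch of just that step, written in Python for brevity rather than Go or C. It assumes you have already extracted the raw rdata bytes of an A or AAAA record; the blocklist networks and the function name are made up for illustration.

```python
import ipaddress

# Hypothetical blocklist; in practice this would come from the external API.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def rdata_matches(rdata: bytes) -> bool:
    """Check wire-format A/AAAA rdata (network byte order) against the blocklist."""
    if len(rdata) not in (4, 16):
        return False  # not an IPv4 or IPv6 address
    addr = ipaddress.ip_address(rdata)  # accepts packed 4- or 16-byte addresses
    return any(
        net.version == addr.version and addr in net
        for net in BLOCKED_NETWORKS
    )


hit = rdata_matches(bytes([192, 0, 2, 55]))
miss = rdata_matches(bytes([0x7F, 0, 0, 1]))
hit_v6 = rdata_matches(ipaddress.ip_address("2001:db8::1").packed)
```

In a real module the same check would run once per address record while walking the RR sets of the packet.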

Subdomain DNS seems to only be partially propagating

I own a domain, and its DNS resolution is clearly fine; everywhere seems to point to the right server: https://dnschecker.org/#A/e-bis.fr
I created a wildcard for subdomains, and it seems to point to the right server only in some random places in the world. The results also change randomly every once in a while (as in, sometimes a server will say it resolves, and one hour later it won't anymore): https://dnschecker.org/#A/whatever.e-bis.fr
At first I thought it was a propagation issue, but it's been a week now so clearly it's me messing up the config at some point.
Here's the zone file used by bind9 for this domain :
@ IN SOA ns3032550.ip-91-121-79.eu. postmaster.e-bis.fr. (
    2014070501 ; Serial
    8H ; Refresh
    30M ; Retry
    4W ; Expire
    8H ; Minimum TTL
    )
    IN NS ns3032550.ip-91-121-79.eu.
    IN NS ns.kimsufi.com.
e-bis.fr. IN A 91.121.79.161
*.e-bis.fr. IN A 91.121.79.161
ownercheck IN TXT "28834a04"
I do a service bind9 reload every time I update it, so the only thing I can see is the issue being in the zone file. I'm terrible with them, so it wouldn't surprise me if it was a beginner mistake.
Thanks in advance to anyone who can help,
Éric B.
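Turns out I had just forgotten to update the serial (I think?).
For anyone running into the same problem, it was this line, 2014070501 ; Serial, which I had not updated. Incrementing it and then restarting the service is enough. The likely reason for the patchy results: a secondary name server (here ns.kimsufi.com) only fetches a new copy of the zone when the SOA serial has increased, so until then it keeps answering from the old copy without the wildcard.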
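If you tend to forget the bump, a small helper can compute the next serial. This is a minimal sketch that assumes the common YYYYMMDDnn convention the zone above appears to use; the function name is made up, and it does not edit the zone file for you.

```python
from datetime import date


def next_serial(current: int, today: date) -> int:
    """Return a serial strictly greater than `current`, in YYYYMMDDnn style."""
    # First revision of today, e.g. 2014071201 for 12 July 2014.
    todays_first = int(today.strftime("%Y%m%d")) * 100 + 1
    # Same-day edits just count up; otherwise jump to today's date.
    return max(current + 1, todays_first)


same_day = next_serial(2014070501, date(2014, 7, 5))
week_later = next_serial(2014070501, date(2014, 7, 12))
```

The only hard requirement is that the new serial is larger than the old one; the date-based layout is just a habit that makes the value readable.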

DNS Response Packets

I'm trying to code my own DNS server. I'm reading through RFC 1035 on DNS, but I have a few queries:
1) I want my server to respond with a CNAME for a particular request, but no A records - can I do this? for example, receive request for 'server1.com', response 'CNAME server2.com', and then the client queries another DNS server to get the A record for 'server2.com'.
I've currently set the header to '\x84\x00' to say that this is the authoritative server, but recursion is not possible. Is this right?
2) I want my server to respond with no records for any other request, so that the client then queries a different DNS server for the records. I've currently set the header to '\x83\x03' to signal a NAME ERROR reply code. Is this right? Then what do I follow this with, zeros in all the other fields, or just end the packet there? I don't want to respond with 'this name doesn't exist', rather 'I don't know this name, try someone else' - how do I do this?
Many Thanks :)
Sounds about right - in fact, CNAME with A records is incorrect (RFC1034 section 3.6.2: "If a CNAME RR is present at a node, no other data should be present").
This would be very unusual behaviour from an authoritative nameserver - I'd suggest rethinking it or at least testing with some real-life resolvers to ensure they do what you want. RCODE #3 ("name error" or NXDOMAIN) is positive confirmation that the name doesn't exist. This would cause resolvers to terminate resolution and possibly cache the nonexistence of the name, which doesn't sound like what you're after. If you want the resolver to query one of the other nameservers that was delegated to for that zone, I guess SERVFAIL (RCODE #2) is the most appropriate/likely to have the desired effect.
By the way, for debugging the exact format of your DNS packets I can highly recommend Wireshark for its decoding accuracy compared with pasting hex codes into Stack Overflow ;)
In the CNAME case, your (authoritative) server should just return the CNAME in the answer section unless it is also authoritative for the domain that the CNAME points to, in which case it should also include the result of following the CNAME.
For your second case you should return RCODE 5 ("REFUSED") - this is the preferred error that an authoritative server should give when asked a question for a domain for which it is not configured.
Following that, you still need to send the four 16-bit count fields and a copy of the question from the original request. In this case the four counts would be (1, 0, 0, 0) - one question, no answer, no ns records, no additional records.
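As a rough illustration of that layout, here is a minimal sketch that builds a REFUSED reply from a raw query. It assumes the query carries exactly one question with an uncompressed name (which is what ordinary clients send); `build_refused` is a made-up name, not part of any library.

```python
import struct

RCODE_REFUSED = 5


def build_refused(query: bytes) -> bytes:
    """Build a REFUSED reply for a query that carries exactly one question."""
    qid, qflags, qdcount = struct.unpack("!HHH", query[:6])
    if qdcount != 1:
        raise ValueError("this sketch handles single-question queries only")

    # Walk the QNAME labels (length byte + label) until the zero-length root label.
    pos = 12
    while query[pos] != 0:
        pos += 1 + query[pos]
    question = query[12:pos + 1 + 4]  # root label, then QTYPE and QCLASS

    # QR=1, Opcode and RD copied from the query, AA=0, RCODE=5.
    flags = 0x8000 | (qflags & 0x7800) | (qflags & 0x0100) | RCODE_REFUSED
    # Counts: one question, no answer, no authority, no additional records.
    header = struct.pack("!HHHHHH", qid, flags, 1, 0, 0, 0)
    return header + question


# Example query for server1.com, type A, class IN, with RD set.
query = (
    struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
    + b"\x07server1\x03com\x00"
    + struct.pack("!HH", 1, 1)
)
reply = build_refused(query)
```

For this query the two flag bytes of the reply come out as '\x81\x05', and everything after the 12-byte header is the question copied back unchanged.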

Ideal timeout period for DNS lookup

In my Rails app I do a DNS lookup using the Ruby library resolv. If a site like dgdfgdfgdfg.com is entered, it takes too long to resolve, in some instances around 20 seconds (mostly for non-existent sites), and this causes the application to slow down.
So I thought of introducing a timeout period for the DNS lookup.
What would be the ideal timeout period for the DNS lookup, so that resolution of actual sites doesn't fail? Would something like 10 seconds be fine?
There's no IETF mandated value, although §6.1.3.3 of RFC 1123 suggests a value not less than 5 seconds.
Perl's Net::DNS and the command line dig utility do default to 5 seconds between retries. Some versions of the Microsoft resolver appear to default to 3 seconds.
You can run some tests with your users to find the number that gives a good compromise between responsiveness and performance.
You can also adjust that timeout dynamically depending on the network traffic.
For example, for every successful resolution, you save how much time it took. Every hour (for example) you calculate the average and set the timeout to double that value (remember that the "average" is, roughly speaking, "the middle"). This way, if your latency is high at some point, the timeout adjusts itself upward.
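The averaging idea could look roughly like this. It is a minimal sketch in Python rather than Ruby; the class name and the floor and ceiling values are assumptions for illustration, and the initial 5 seconds simply reuses the figure mentioned earlier in the thread.

```python
from statistics import mean


class AdaptiveTimeout:
    """Keeps a DNS timeout at twice the average of recent successful lookups."""

    def __init__(self, initial=5.0, floor=1.0, ceiling=10.0):
        self.timeout = initial
        self.floor = floor      # assumed lower bound so the timeout never gets too tight
        self.ceiling = ceiling  # assumed upper bound so users never wait too long
        self.samples = []

    def record_success(self, seconds):
        # Call this after every successful resolution with the time it took.
        self.samples.append(seconds)

    def recompute(self):
        # Call this periodically, e.g. once an hour.
        if self.samples:
            doubled = 2 * mean(self.samples)
            self.timeout = min(self.ceiling, max(self.floor, doubled))
            self.samples.clear()
        return self.timeout


tuner = AdaptiveTimeout()
for took in (1.0, 2.0, 3.0):
    tuner.record_success(took)
normal_hour = tuner.recompute()   # average 2.0 s, doubled

for took in (6.0, 8.0):
    tuner.record_success(took)
slow_hour = tuner.recompute()     # average 7.0 s, doubled, then capped
```

Only successful lookups are recorded, so a burst of non-existent names does not drag the timeout up.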
