Do new DNS records propagate?

As simple as that. I went through quite a lot of articles on the internet, and all of them just go on about how updated/modified DNS records take time to propagate and so on. I may be stupid (most likely I am), but the whole situation is not very clear to me. Especially the following:
Do new (absolutely new) records propagate?
Example: we have an old domain with propagated nameservers, IP, etc., and we add a TXT record to it. No TXT records existed previously. Is it applied immediately, after some time, or after the TTL?
Is there any influence on this from local DNS, caches, the ISP, or anything else?
Thank you.

There are at least two things being mixed under the term "propagation" here.
One is the various caches of local resolvers and recursive name servers, which remember information for a set amount of time before they go out and ask an authoritative server again. This has no relevance to your question, but it is what many of those articles you read were talking about.
The other is data moving from a master name server to its secondary name servers. This is relevant to your question. A master name server is where data gets injected into DNS from outside, so that's where your new records begin their lives. Secondary servers check with the master server for new data when they think enough time has passed, or when they get prodded to do so (usually, the master server is set to prod them when its information is updated). The way they tell whether they need to re-fetch a zone from the master is by comparing the serial number in the zone's SOA record between what they have stored locally and what the master has. If the number at the master is higher, the secondary will fetch the zone again (usually the whole zone; incremental transfers also exist). If the number at the master is not higher, the secondary will assume the information it has is up to date, and do nothing.
The most common reason, by far, for new records not propagating to secondaries is that whoever added the new records forgot to increase the serial number in the SOA record.
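As a minimal sketch of that serial comparison (Python with the dnspython package; the zone name and server IPs are placeholders), you can query the master and a secondary directly and compare what each one serves:

import dns.message
import dns.query

def soa_serial(zone: str, server: str) -> int:
    # Ask `server` directly for the SOA record of `zone` and return its serial.
    query = dns.message.make_query(zone, "SOA")
    response = dns.query.udp(query, server, timeout=5)
    return response.answer[0][0].serial

master = soa_serial("example.com.", "192.0.2.1")     # master (placeholder IP)
secondary = soa_serial("example.com.", "192.0.2.2")  # secondary (placeholder IP)

# A secondary only transfers the zone again when the master's serial is ahead.
# (Real implementations use RFC 1982 serial arithmetic, not a plain comparison.)
if master > secondary:
    print(f"secondary is stale ({secondary} < {master}); transfer pending")
else:
    print("secondary is up to date")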

Related

How to deal with millions of queries to a DNS server?

I'm wondering how modern DNS servers deal with millions of queries per second, given that the txnid field is a uint16.
Let me explain. There is an intermediate server; on one side, clients send DNS requests to it, and on the other side, the server itself sends requests to an upstream DNS server (8.8.8.8, for example). According to the DNS protocol, there is a txnid field in the DNS header which must remain unchanged between request and response. Obviously, an intermediate DNS server with multiple clients replaces this value with its own txnid value (a counter), sends the request to the external DNS server, and after resolution replaces the value back with the client's one. All of this works fine for up to 65535 simultaneous requests because of the uint16 field type. But what if we have hundreds of millions of them, like Google's DNS servers?
Going from your Google DNS server example:
In mid-2018 their servers were handling 1.2 trillion queries per day; extrapolating that growth says their service is currently handling ~20 million queries per second.
They say that successful resolution of a cache miss takes ~130ms, but taking timeouts into account pushes the average time up to ~400ms.
I can't find any numbers on what their cache-hit rates are like, but I'd assume more than 90%, and presumably it increases with the popularity of their service.
Putting the above together (2e7 * 0.4 * (1 - 0.9)), we get ~1M transactions active at any one time, so you have to find at least 20 bits of state somewhere. 16 bits come for free because of the txnid field. As Steffen points out, you can also use port numbers, which might give you another ~15 bits of state. Just these two sources give you more than enough state to run something orders of magnitude bigger than Google's DNS system.
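To make that estimate concrete, here is the same arithmetic as a runnable Python sketch (the inputs are the estimates above, not measured values):

import math

qps = 2e7              # ~20 million queries per second (extrapolated)
avg_latency_s = 0.4    # ~400 ms average resolution time, timeouts included
cache_miss_rate = 0.1  # assuming a cache-hit rate of at least 90%

in_flight = qps * avg_latency_s * cache_miss_rate  # ~800,000 concurrent lookups
bits_needed = math.ceil(math.log2(in_flight))      # ~20 bits of state

bits_available = 16 + 15  # txnid field plus ~15 bits of source-port entropy
print(f"in flight: {in_flight:,.0f}; need {bits_needed} bits, have {bits_available}")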
That said, you could also just relegate transaction IDs to preventing cache-poisoning attacks, i.e. reject any answer where the txnid doesn't match the in-flight query for that question. If this check passes, then add the answer to the cache and resume any waiting clients.

What happens after the DNS TTL expires in an intermediate name server?

I have some questions to better understand the DNS mechanism:
1) I know that between clients and the authoritative DNS server there are some intermediate DNS servers, like the ISP's. What and where are the other types?
2) After the TTL of an NS record expires in an intermediate DNS server, when does it refresh the addresses? On a client's request, or right after expiration?
Thanks.
Your question is off topic here, as it is not related to programming.
But:
I know that between clients and the authoritative DNS server there are some intermediate DNS servers, like the ISP's. What and where are the other types?
There are only two types of DNS servers (we will put aside the stub case for now): either an authoritative nameserver (holding information about some domains and being the source of truth for them) or a recursive one, attached to a cache, that basically starts with no data and then progressively, based on the queries it gets, does various queries to gather information.
Technically, a single server could do both, but it is a bad idea, for at least two reasons: the cache, and the different populations of clients. An authoritative nameserver is normally open to any client, as it needs to "broadcast" its data everywhere, while a recursive nameserver is normally only for a selected list of clients (like an ISP's clients).
There exist open public recursive nameservers today, run by big organizations: Cloudflare, Google, Quad9, etc. However, they have the hardware, links, and manpower to handle all the issues that come with public recursive nameservers, like DDoS with amplification.
Technically you can have a farm of recursive nameservers, as big ISPs (or the big public services above) need to do, because no single instance could sustain all client queries; they can either share a single cache or work in a hierarchy, the bottom ones sending their queries to another upstream recursive nameserver, and so on.
After the TTL of an NS record expires in an intermediate DNS server, when does it refresh the addresses? On a client's request, or right after expiration?
The historic, naïve way could be summarized as: a request arrives; do I have it in my cache? If not, query outside for it and cache the answer. If yes, has it expired in my cache? If not, ship it to the client; if it has, remove it from the cache and proceed as if it had never been cached.
You then have various variations:
some caches do not honor TTLs exactly: some clamp values that are too low or too high, based on their own local policies. The most agreed reading of the specification is that the TTL is the maximum amount of time to keep the record in cache, which means a resolver is free to ditch it earlier; however, it should not rewrite it to a higher value if it finds it too low
caches can be kept across reboots/restarts, and records can be prefetched, especially "popular" ones; in a similar way, the list of root NS is fetched at boot and compared to the internal hardcoded list, in order to update it
caches, especially in RAM, may need to be trimmed, typically on an "oldest removed first" basis, to make room for new records coming along the way
so, depending on how the cache is managed and which features it is required to have, there may be a background task that monitors expirations and refreshes records (see the sketch below).
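Here is a rough Python sketch of the naive flow, with comments marking where the variations above hook in (the upstream lookup is a hypothetical placeholder):

import time

cache = {}  # name -> (record, expires_at)
TTL_CAP = 86400  # clamp TTLs that exceed local policy (one day here)

def query_authoritative(name: str):
    # Placeholder for the real recursive-resolution work.
    return f"192.0.2.1 ({name})", 300  # fake record with a 300-second TTL

def resolve(name: str):
    entry = cache.get(name)
    if entry is not None:
        record, expires_at = entry
        if time.time() < expires_at:
            return record       # fresh: ship it to the client
        del cache[name]         # expired: act as if it was never cached
    record, ttl = query_authoritative(name)
    ttl = min(ttl, TTL_CAP)     # honor the TTL as a maximum; never raise it
    cache[name] = (record, time.time() + ttl)
    return record

# A prefetching variant would add a background task that walks `cache` and
# re-resolves entries whose remaining TTL drops below some trigger; a
# memory-bounded variant would evict the oldest entries when space runs out.
print(resolve("example.com"))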
I recommend having a look at unbound as a recursive nameserver: it has various settings around TTL handling, so you could learn things there, and then read the code itself (which brings us back on topic, kind of).
You can also read this document: https://www.ietf.org/archive/id/draft-wkumari-dnsop-hammer-03.txt, an IETF Internet-Draft about prefetching (HAMMER):
The principle is that popular RRsets in the cache are fetched, that is to say resolved, before their TTL expires and they are flushed. By fetching RRsets before they are queried by an end user, that is to say prefetched, HAMMER is expected to improve the quality of experience of the end users as well as to optimize the resources involved in large DNSSEC resolving platforms.
Make sure to read Appendix A, which has a lot of useful examples. It opens with:
A number of recursive resolvers implement techniques similar to the techniques described in this document. This section documents some of these and tradeoffs they make in picking their techniques.
Among the examples:
Unbound already does this (using a percentage of the TTL instead of a number of seconds).
OpenDNS also implements something similar.
BIND, as of 9.10 (around Feb 2014), now implements something like this (https://deepthought.isc.org/article/AA-01122/0/Early-refresh-of-cache-records-cache-prefetch-in-BIND-9.10.html), and enables it by default.
And to take one example, BIND, you can read:
BIND 9.10 prefetch works as follows. There are two numbers that control it. The first number is the "eligibility". Only records that arrive with TTL values bigger than the configured eligibility will be considered for prefetch. The second number is the "trigger". If a query arrives asking for data that is cached with fewer than "trigger" seconds left before it expires, then in addition to returning that data as the reply to the query, BIND will also ask the authoritative server for a fresh copy. The intention is that the fresh copy would arrive before the existing copy expires, which ensures a uniform response time.
BIND 9.10 prefetch values are global options. You cannot ask for different prefetch behavior in different domains. Prefetch is enabled by default. To turn it off, specify a trigger value of 0. The following statement (in the options block of named.conf) specifies a trigger value of 2 seconds and an eligibility value of 9 seconds, which are the defaults:
prefetch 2 9;

Preventing DoS and flooding

This requires a little context, so bear with me.
Suppose you're building a chat app atop CouchDB that functions like IRC (or Slack). There's a server and some clients. But in this case, the server and the clients each have a CouchDB database, and they're all bidirectionally replicating to each other -- the clients to the server, and the server to the other clients (hub-and-spoke style). Clients send messages by writing to their local instance, which then replicates to the server and out to the other clients.
Is there any way (validation function?) to prevent hostile clients from inserting a billion records and replicating those changes up to the server and other clients? Or is it a rule that you just can't give untrusted clients write access to a CouchDB instance that replicates anywhere else?
Related:
couchdb validation based on content from existing documents
Can I query views from a couchdb update or validate_doc_update function?
Can local documents be disabled in CouchDB?
For a rather simple defense against flooding, I am using the following workflow:
All public write access is only allowed through update functions.
For every document insert/update, a unique key is generated, consisting of the req.peer field (the client's IP address) and an ISO timestamp with the final part cut off. For example, I may have 2017-11-24T14:14 as the unique key string, which ensures a new key is generated every minute.
Calculate this key for every write request and ensure it is unique, and you can be certain a given IP is only allowed to write once every minute.
This technique works ok for small floods, coming from a given set of IPs. For a more coordinated attack a variation (or even something else completely) might be needed.
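A minimal sketch of that key scheme (in Python for illustration; in CouchDB itself this logic would live in a JavaScript update function, with req.peer supplying the IP):

import hashlib

def rate_limit_key(peer_ip: str, iso_timestamp: str) -> str:
    # Truncate to minute precision, e.g. "2017-11-24T14:14".
    minute = iso_timestamp[:16]
    return hashlib.sha256(f"{peer_ip}|{minute}".encode()).hexdigest()

# Using the key as the document _id makes the uniqueness check free: a second
# write from the same IP within the same minute collides and is rejected.
print(rate_limit_key("203.0.113.7", "2017-11-24T14:14:59.123Z"))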

Cause of DNS A record delays

I have some hosts that come up on demand in EC2 and when they do the service that starts them creates an A record in Route53 under an existing zone.
The A records are of the form: randomid.example.com.
So it's not an update or change of an existing name/IP pair; it's a completely new entry. There shouldn't be any propagation delay.
What I'm seeing is that after the entry has been added and is available for lookup via DNS on any of the Amazon servers, my own client PC can't resolve the name for what seems like 5-10 minutes. When I ping it, I'd expect to see an IP for it, but I simply get "no such host".
If I change my /etc/resolv.conf nameserver entry from my local nameserver to 8.8.8.8 (Google DNS), it resolves. I switch back and it doesn't resolve. This doesn't seem to have anything to do with Route53, given that Google answers.
What would cause this? Shouldn't my local resolver be querying the relevant nameservers and eventually the nameserver for example.com which should get an answer for randomid.example.com?
There shouldn't be any propagation delay.
Yes, there should be.
All DNS configuration has a "propagation delay."¹
In the case of new records, a lookup of a hostname before the record is actually available from the authoritative name servers results in negative caching: when a resolver looks up a non-existent record, the NXDOMAIN response is cached by the resolver for a period of time, and this response is returned for subsequent requests until the negative-caching TTL elapses and the response is evicted from the resolver's cache.
Negative caching is useful as it reduces the response time for negative answers. It also reduces the number of messages that have to be sent between resolvers and name servers hence overall network traffic.
https://www.rfc-editor.org/rfc/rfc2308
When you use dig to query the new record, you'll see the TTL counting down to 0. Once that happens, you start seeing the expected answer. On Linux the watch utility is handy for this, as in watch -n 1 'dig example.com'.
The timer should be set from the minimum TTL, which is found in your hosted zone's SOA record:
The minimum time to live (TTL). This value helps define the length of time that an NXDOMAIN result, which indicates that a domain does not exist, should be cached by a DNS resolver. Caching this negative result is referred to as negative caching. The duration of negative caching is the lesser of the SOA record's TTL or the value of the minimum TTL field. The default minimum TTL on Amazon Route 53 SOA records is 900 seconds.
http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/SOA-NSrecords.html
There's the source of your 5-10 minutes. It's actually a worst case of 15 minutes (900 seconds).
Reducing this timer will reduce the amount of time that well-behaved resolvers will cache the fact that the record does not (yet) exist.
"Great," you object, "but I didn't query the hostname before it existed. What now?"
You probably did, because Route 53 does not immediately make records visible. There's a brief lag between the time a change is made to a hosted zone and the time Route 53 begins returning the records.
The Route 53 API supports the GetChange action, which should not return INSYNC until the authoritative servers for your hosted zone are returning the expected answer for the change (where "change" covers both inserts and updates).
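A hedged sketch of polling GetChange with boto3 (assumes AWS credentials are configured and change_id came from a prior change_resource_record_sets call):

import time
import boto3

route53 = boto3.client("route53")

def wait_until_insync(change_id: str, poll_seconds: int = 10) -> None:
    # Poll GetChange until Route 53 reports the change as INSYNC.
    while True:
        status = route53.get_change(Id=change_id)["ChangeInfo"]["Status"]
        if status == "INSYNC":
            return
        time.sleep(poll_seconds)  # still PENDING

boto3 also ships a resource_record_sets_changed waiter that does equivalent polling for you.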
You can also determine this by directly querying one of the servers specifically assigned to your hosted zone (as seen in the console, among other places).
$ dig @ns-xxxx.awsdns-yy.com example.com
Because you are querying an authoritative server directly, you'll see the result of the change as soon as the server has it available, because there is no resolver in the path that will cache responses.
¹For the purposes of this answer, I'm glossing over the fact that what is commonly referred to as "propagation delay" in DNS is actually nothing of the sort -- it's a TTL-based cache-eviction delay for existing records.

JmDNS constants

I have been using JmDNS for a while now, and I could use it for the purposes of my application. Everything works fine for me (I have "announcer" machines and a "listening" one, and this latter machine can see the other devices and discover their information).
It is true that I've managed to work with the JmDNS jar file, but I did so without totally understanding what is going on inside it. Now I want to know about the effect of using JmDNS on network traffic. I have consulted the documentation but couldn't manage to discover the significance of the constants, like QUERY_WAIT_INTERVAL, PROBE_THROTTLE_COUNT, etc.
I want to know the default frequency with which the announcer machine sends service announcements.
I also noticed DNS_TTL, which was described as follows: "The default TTL is set to 1 hour by the standard, so a record is going to stay in the cache of any listening machine for an hour without need to ping the server again".
I understand that it is the time to live of the service in the DNS cache, but I couldn't understand what is intended by "ping the server". Does it mean that the listener has to ask the announcer about a service when the DNS_TTL expires? If so, why do we need to have the announcer announce its service every second (ANNOUNCE_WAIT_INTERVAL = 1000 milliseconds)?
I am so confused.
The way the Domain Name System works is basically very simple. Fundamentally it's a tree-like system which starts with the root nameservers. These then delegate name space out to the next level. That level in turn delegates out the next level, and so on. For example, . is the root, which delegates to .com., which can then delegate out example.com.. (Yes, that trailing . is actually part of the domain name, though you almost never have to use it or see it.)
When you load a web page, there are usually hundreds of elements to load: every image, every JS file, every CSS file, etc. Having your computer request the same domain-to-IP resolution that many times for one page would make load times unbearable and would also create massive unnecessary traffic on the nameserver. Therefore DNS caches. The TTL is how long it caches for: if it's set to 24 hours, then when you get an answer for that resolution, that's how long you can hold on to it before you make another request.
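You can watch this in action with a quick illustration (Python with the dnspython package; not JmDNS itself): repeat the query against a caching resolver and the returned TTL counts down.

import dns.resolver

answer = dns.resolver.resolve("example.com", "A")
for rr in answer:
    print(rr.address)
print("seconds this answer may still be cached:", answer.rrset.ttl)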
The announcing that you're talking about is the nameserver basically announcing that it's responsible for those domains. You want it constantly stating that so other nameservers know where to go to get the correct (authoritative) data.
Throttling is a term used in many fields and applications and means you're limiting your traffic flow so it doesn't get overloaded.
DNS is actually quite simple to understand once you get the basics down.
Here are a few links that could help you get a better grip of it all:
Few paragraphs of basic DNS info
About.com guide
A few definitions
Relatively simple and informative PDF from IETF
