DNS Server Load Balancing

I have two problems.
First problem:
I want to use multiple DNS load balancers (I don't know what load-balancing solutions are out there, so please suggest some) together with master/slave (PowerDNS) replication on my DNS servers.
My approach is roughly this: an A record round-robins across two or more NS records, those NS records resolve to the nearest POP in a distributed network of our DNS load balancers, and behind that the PowerDNS master/slave replication kicks in.
This round-robin approach is purely a DDoS mitigation measure, and we want it. I know people don't usually load balance DNS servers, but we have to.
Second problem:
After deploying this system, we want to use DNS to resolve to our distributed web servers around the globe, like an anycast system.
We have two different servers in different countries:
one server = cdn.xyz.in (static content)
second server = xyz.in (dynamic website)
We have those two servers deployed in multiple locations, like Singapore, NYC, etc., which we call POPs.
So what I want is to use DNS to resolve the user's request to the closest POP, a kind of geo routing. I forget the name of the feature in Amazon Route 53, but it routes traffic this way; how do they achieve it?
Also, it would be great if we could add a monitoring system or DNS analytics system in front of the load balancer to monitor and track our traffic, like Cloudflare does. List some tools and I'll figure it out myself.

Related

How does load balancing work for very high traffic domains?

Take google.com for example. If it ultimately resolves to a single IP at any point in time, the packets will land on a single server. Even if all it does is send a redirect response (to transfer load to other servers), it still has to be capable of handling hundreds of thousands of requests per second.
I can think of a number of non-standard ways to handle this. For example, the router may be programmed to load balance the packets across multiple servers. But that still means google.com would depend on a single physical facility, as IP addresses are not portable to another location.
I was hoping the internet fabric itself has some mechanism to handle such things. Multiple A records per domain is one such mechanism. But while researching this, I found that google.com's DNS entry has only one A record, and the IP value is different depending on which site you query it from.
How is it done? In what ways is it better and why has Google chosen to do it this way instead of having multiple A records?
Trying to look up the A record of google.com yields different results from different sites:
https://www.misk.com/tools/#dns/google.com Resolves to 216.58.217.142
https://www.ultratools.com/tools/dnsLookupResult resolves to 172.217.9.206
This is generally done using dynamic DNS / round-robin DNS / DNS load balancing.
Say you have 3 web servers at 3 different locations. When a lookup is done, the DNS server responds with a different IP for each request. Some DNS servers also allow a policy-based config, where they return a certain IP 70% of the time and another IP 30% of the time.
This document provides a reference for how to do this with Windows Server 2016.
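If you want to watch the rotation happen, a few repeated lookups make it visible. Here is a minimal sketch using Python with the dnspython package; the domain name is a placeholder, and a caching resolver in between may hide the rotation:

    # round_robin_check.py -- repeat a lookup and print the A records in the order they come back.
    # Requires dnspython 2.x (pip install dnspython); "example.com" is a placeholder name.
    import dns.resolver

    NAME = "example.com"  # replace with a name that actually has several A records

    for i in range(5):
        answer = dns.resolver.resolve(NAME, "A")
        ips = [rr.address for rr in answer]
        # Many authoritative servers rotate the record order on each query; since most
        # clients try the first address, that rotation is what spreads the load.
        print(f"query {i + 1}: {ips}")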

Domain Name to Multiple IP Address Conversion

Google has multiple servers at multiple locations. When I search Google in my web browser, how does the DNS map this name to the corresponding IP address? Google has multiple servers in multiple locations with separate IPs. Is a load balancer used first?
A couple of different approaches are used:
Geographic DNS
When a request comes in for a domain name, the DNS server looks at the IP address making the request and returns an IP address of a nearby server.
Some complicated extensions (such as EDNS Client Subnet) are required to deal with large shared caching DNS servers (like ISP nameservers), but that's the general idea.
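As a rough illustration of the decision a geographic DNS backend makes (the prefixes, regions, and POP addresses below are invented; real deployments look the client up in a GeoIP database instead of a hand-written table):

    # geodns_sketch.py -- toy version of the GeoDNS idea: choose the answer from the client's IP.
    # All networks and addresses here are documentation ranges used purely for illustration.
    import ipaddress

    REGION_OF_PREFIX = {
        ipaddress.ip_network("203.0.113.0/24"): "singapore",
        ipaddress.ip_network("198.51.100.0/24"): "nyc",
    }

    POP_ADDRESS = {
        "singapore": "192.0.2.10",  # hypothetical Singapore POP
        "nyc": "192.0.2.20",        # hypothetical NYC POP
    }
    DEFAULT_POP = "nyc"

    def answer_for(client_ip: str) -> str:
        """Return the A record a geographic DNS server might hand to this client."""
        ip = ipaddress.ip_address(client_ip)
        for prefix, region in REGION_OF_PREFIX.items():
            if ip in prefix:
                return POP_ADDRESS[region]
        return POP_ADDRESS[DEFAULT_POP]

    print(answer_for("203.0.113.7"))  # -> 192.0.2.10, the Singapore POP
    print(answer_for("8.8.8.8"))      # -> 192.0.2.20, the default

This is also roughly what Route 53's geolocation routing and PowerDNS's geoip backend do for you, just backed by a real GeoIP database.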
Anycast DNS
Anycast is a weird routing trick where a single IP range can be advertised by multiple ASes. This causes requests to an IP address in that range to be routed to whichever advertising location is closest in network terms.
If a DNS server is hosted on an anycast IP, different instances of that server can be configured to return different IPs. This can be used as a computationally easier alternative to geographic DNS.
Anycast HTTP
If anycast can be used to route DNS to the closest server, why not just go to the next step and use it to route HTTP as well?
(It turns out there's a reason why you usually don't want to do this: Routing changes can break a HTTP connection. This doesn't affect DNS as it's usually used over UDP. Cloudflare does it anyway, though, and it usually works fine… YMMV.)
At large scale, a reverse proxy server is usually used for this purpose, and it can perform various tasks, including load balancing. To the client it appears that you connect to only one server, while the reverse proxy hides the servers behind it.
At small scale you can do something similar with just DNS settings, mapping different domain names to different IP addresses. See this article.
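To make the reverse-proxy idea concrete, here is a deliberately tiny sketch using only Python's standard library; the backend addresses and port are placeholders, and a real deployment would use something like nginx or HAProxy rather than this:

    # tiny_reverse_proxy.py -- minimal round-robin reverse proxy, standard library only.
    # Backend URLs are placeholders; only GET is handled, and error handling is omitted for brevity.
    import itertools
    import urllib.request
    from http.server import BaseHTTPRequestHandler, HTTPServer

    BACKENDS = itertools.cycle(["http://10.0.0.11:8080", "http://10.0.0.12:8080"])

    class ProxyHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            backend = next(BACKENDS)  # pick the next backend in round-robin order
            with urllib.request.urlopen(backend + self.path) as upstream:
                body = upstream.read()
            self.send_response(upstream.status)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        # Clients only ever talk to this one address; the backends stay hidden behind it.
        HTTPServer(("0.0.0.0", 8000), ProxyHandler).serve_forever()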

Single domain on multiple servers

I have a domain that needs to be spread across several servers for load-balancing purposes.
I also have my application decide which server is supposed to handle certain requests.
Right now I have it set up to use sub-domains like www1 and www2 and just redirect to each server, but that is ugly.
I need a way to proxy the requests so users see only www all the time, regardless of which IP is actually serving the request...
I read a bit about Apache's proxying, but I am still confused about how such a setup would deliver the page and resources like videos without changing the www.
You can enter multiple IP addresses per subdomain in your DNS table. If your DNS server supports it, you can rotate these entries on each request to get a simple round-robin load balancer (see http://en.wikipedia.org/wiki/Round-robin_DNS).
However, a much better solution is to have a load-balancing server that handles all requests to your web site. This way you can add and remove web servers to/from the load balancer instantly, so when you need to do maintenance on one server you just take it out of the rotation.
Many load balancers also check whether the web servers are still alive and remove dead servers automatically. This will increase your uptime significantly.
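A minimal sketch of that kind of liveness check, in Python (the backend addresses are placeholders; real load balancers usually probe an HTTP health endpoint on a timer rather than doing a one-off TCP connect):

    # health_check_sketch.py -- drop dead backends from the rotation.
    # Backend addresses are placeholders for your real web servers.
    import socket

    BACKENDS = [("10.0.0.11", 80), ("10.0.0.12", 80), ("10.0.0.13", 80)]

    def alive(host: str, port: int, timeout: float = 1.0) -> bool:
        """Treat a backend as alive if a TCP connection to it succeeds within the timeout."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    # Only healthy backends stay in the pool that requests are handed to.
    pool = [backend for backend in BACKENDS if alive(*backend)]
    print("in rotation:", pool)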

Do browsers re-try DNS when a page load fails?

After Amazon's failure and reading many articles about what redundant/distributed means in practice, DNS seems to be the weak point. For example, if DNS is set to round-robin among data centers, and one of the data centers fails, it seems that many browsers will have cached that DNS and continue to hit a failed node.
I understand time-to-live (TTL), but of course this may be set to a long time.
So my question is, if a browser does not get a response from an IP, is it smart enough to refresh the DNS in the hope of being routed to another node?
Round-robin DNS handling is a per-browser thing. This is how Mozilla does it:
A single host name may resolve to multiple IP addresses, each of which is stored in the host entity returned after a successful lookup. Netlib preserves the order in which the DNS server returns the IP addresses. If at any point during a connection the IP address currently in use for a host name fails, netlib will use the next IP address stored in the host entity. If that one fails, the next is queried, and so on. This progression through available IP addresses is accomplished in the NET_FinishConnect() function. Before a URL load is considered complete because its connection went foul, its host entity is consulted to determine whether or not another IP address should be tried for the given host. Once an IP address fails, it's out, removed from the host entity in the cache. If all IP addresses in the host entity fail, netlib propagates the "server not responding" error back up the call chain.
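That excerpt boils down to: resolve every address for the host up front, then walk the list until one connection succeeds. A minimal Python sketch of the same behaviour (host and port are placeholders):

    # connect_with_failover.py -- try each resolved address in order, like the netlib excerpt describes.
    import socket

    def connect_any(host: str, port: int, timeout: float = 3.0) -> socket.socket:
        last_error = None
        # getaddrinfo returns every address the name resolves to, in resolver order.
        for *_ignored, sockaddr in socket.getaddrinfo(host, port, type=socket.SOCK_STREAM):
            try:
                return socket.create_connection(sockaddr[:2], timeout=timeout)
            except OSError as exc:
                last_error = exc  # this address failed; fall through to the next one
        raise last_error or OSError(f"no addresses for {host}")

    if __name__ == "__main__":
        with connect_any("example.com", 80) as sock:
            print("connected to", sock.getpeername())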
As for Amazon's failure, there was NOTHING wrong with DNS during Amazon's downtime. The DNS servers correctly reported the IP addresses, and the browsers used those IP addresses. The screw-up was on Amazon's side. They re-routed traffic to an overwhelmed cluster. The DNS was dead-on, but the clusters themselves couldn't handle the huge load of traffic.
Amazon says it best themselves:
EC2 provides two very important availability building blocks: Regions and Availability Zones. By design, Regions are completely separate deployments of our infrastructure. Regions are completely isolated from each other and provide the highest degree of independence. Many users utilize multiple EC2 Regions to achieve extremely-high levels of fault tolerance. However, if you want to move data between Regions, you need to do it via your applications as we don’t replicate any data between Regions on our users’ behalf.
In other words, "remember all of that high-availability we told you we have? Yeah it's really still up to you." Due to their own bumbling, they took out both the primary AND secondary nodes in the cluster, and there was nothing left to fail over to. And then when they brought it all back, there was a sudden "re-mirroring storm" as the nodes tried to synchronize simultaneously, causing more denial of service. DNS had nothing to do with any of it.

Using DNS for failover using multiple A records

It has recently come to my attention that setting up multiple A records for a hostname can be used not only for round-robin load-balancing but also for automatic failover.
So I tried testing it:
I loaded a page from our domain
Noted which of our servers had served the page
Turned off the web server on that host
Reloaded the page
And indeed the browser automatically tried a different server to load the page. This worked in Opera, Safari, IE, and Firefox. Only Chrome failed to try a different server.
But after leaving that server offline for a few minutes and looking at the access logs, I found that the number of requests to the other servers had not significantly increased. With 1 out of 3 servers offline, I had expected accesses to each of the remaining 2 servers to roughly increase by 50%, but instead I only saw 7-10%. That can only mean DNS-based failover does not work for the majority of browsers/visitors, which directly contradicts what I had just tested.
Does anyone have an idea what is up with DNS-based web browser failover? What possible reason could there be why automatic failover works for me but not the majority of our visitors?
What's happening is that the browsers are not doing automatic DNS failover.
If you have multiple A records on a domain, then when your nameserver looks up the IP for the domain you typed into your browser, it asks the domain's authoritative nameservers and may hand back any of those A records.
Some resolvers are 'smart' enough to try another A record if the one they get doesn't work, and some aren't. So if you set multiple A records you will have set up a pseudo-redundant failover, but only for people behind 'smart' resolvers. The rest get a toss of the dice on which IP they get: if it works, good; if not, the page fails to load, as it did for you in Chrome.
If you want to test this specifically, you can use your hosts file (C:\Windows\system32\drivers\etc\hosts on Windows, /etc/hosts on Linux) to pin the domain to whichever IP you want and see whether you get true failover. What you'll run into in practice is that DNS servers across the net cache your domain's resolution according to its TTL, so if/when you get a real failure, clients may keep getting the cached IP until the TTL expires and the name is resolved again.
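As an alternative to editing the hosts file, you can also probe one specific backend directly by connecting to its IP while sending the site's Host header; the IP and domain below are placeholders:

    # probe_backend.py -- hit one backend IP directly, bypassing DNS, while keeping the real Host header.
    # This is an alternative to the hosts-file trick; the IP and domain are placeholders.
    import urllib.request

    BACKEND_IP = "192.0.2.10"     # the specific server you want to test
    HOST = "www.example.com"      # the domain name the site expects

    request = urllib.request.Request(f"http://{BACKEND_IP}/", headers={"Host": HOST})
    with urllib.request.urlopen(request, timeout=5) as response:
        print(response.status, "from", BACKEND_IP, "-", len(response.read()), "bytes")

(For HTTPS this simple approach runs into certificate/SNI issues, so the hosts-file method is still the easier test there.)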
Another possible explanation is that, for most public websites, the bulk of traffic comes from bots not from browsers. Depending on the bot it is possible that they aren't quite as smart as the browsers when it comes to handling multiple A records for a domain.
Also, some bots use keep-alives to keep the TCP connections open & make multiple HTTP requests over the same connection. Given that the DNS lookup is only done when a connection is made, they will continue to make requests to the old IP address at least as long as the connection is kept open.
If the above explanation has any weight you should be able to see it in your logs by examining the user agent strings.
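If you want to check that theory, a quick tally of user agents from a combined-format access log is enough to see how much of the traffic comes from bots; the log path below is a placeholder:

    # agent_breakdown.py -- rough count of user agents in a combined-format access log.
    # Adjust the path to wherever your web server writes its access log.
    from collections import Counter

    counts = Counter()
    with open("/var/log/nginx/access.log") as log:
        for line in log:
            # In the combined log format the user agent is the last double-quoted field.
            parts = line.rsplit('"', 2)
            if len(parts) == 3:
                counts[parts[1]] += 1

    for agent, hits in counts.most_common(15):
        print(f"{hits:8d}  {agent}")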
