How to connect to a geographically close datacenter - DNS

I am reading a book about distributed systems. One of the data replication options it mentions is a multi-leader approach, with each leader placed in a different datacenter. The main point of having different datacenters is to be geographically close to the user.
The author then discusses all the write conflicts that emerge from having multiple write leaders, but he doesn't say much about how to direct users to a geographically close datacenter.
For example, a user in Austria makes an HTTP request to https://stackoverflow.com. Stack Overflow has datacenters in Germany and North America. The DNS record points to the datacenter in the US.
Is the initial request always going to go to the datacenter in the US? I know that once a user is identified, I can direct all AJAX and img requests to Germany (by changing the HTML response I send back), but initial requests, such as a page reload, will always go to the US.
This kind of defeats the purpose of being geographically close to users if they always have to connect to the distant server first, and only after that are the inline resources fetched from a nearby server. Am I missing some essential principle here?

It is very much possible to connect users to the geographically nearest datacenter. Many companies provide this as a service, e.g. Akamai, AWS, Google Cloud, and Cloudflare.
This is generally done at the DNS level. When someone makes a request to your domain, the first request goes to a DNS server to resolve the domain name to an IP address. This is where the appropriate nearest server location gets chosen.
The same mechanism is commonly used for load balancing as well, where it is called DNS load balancing.

This is done through Geo DNS; most cloud service providers offer it.
This article has a good explanation of how Geo DNS works.
For example, in AWS you can set up a geolocation routing policy, which lets you choose the resources that serve your traffic based on the geographic location of your users, meaning the location that the DNS queries originate from.
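To illustrate the idea behind a location-aware DNS answer, here is a minimal Python sketch. It is not how AWS or any other provider implements it: the datacenter names, coordinates, and IP addresses (taken from ranges reserved for documentation) are hypothetical, and it assumes the location of the querying resolver is already known from some GeoIP lookup. Real services usually map countries or continents to records, or use latency measurements, rather than computing raw distance.

```python
import math

# Hypothetical datacenters: name -> (latitude, longitude, IP address).
# The IPs come from address ranges reserved for documentation.
DATACENTERS = {
    "eu-frankfurt": (50.11, 8.68, "192.0.2.10"),
    "us-virginia": (38.95, -77.45, "198.51.100.10"),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def resolve(client_lat, client_lon):
    """Return (datacenter name, IP) of the datacenter closest to the client."""
    def distance(name):
        lat, lon, _ip = DATACENTERS[name]
        return haversine_km(client_lat, client_lon, lat, lon)

    nearest = min(DATACENTERS, key=distance)
    return nearest, DATACENTERS[nearest][2]


if __name__ == "__main__":
    # Approximate coordinates of Vienna, Austria.
    print(resolve(48.21, 16.37))
```

In this sketch the user in Austria from the question would be handed the German address on the very first lookup, so no request has to go to the US first.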

Related

Slow communications between Azure regions

So just a quick summary of what we are doing to put everything into context. We have a socket server running as an Azure Cloud Service (worker role) in the South Central US region. All of our other components (queue, DBs, web app, API, etc.) are located in East US. Sadly, the reason is that we cannot change the static IP address that was created in South Central US a few years ago, and the devices in the field cannot change the IP they connect to either :/ So we are stuck communicating across regions.
So what I'm asking is: is there a way to improve latency? Can we "port forward"? What other options do we have? I'm assuming latency is our biggest enemy here as we pipe data back and forth.
We are looking at load balancing at the moment - https://learn.microsoft.com/en-us/azure/load-balancer/load-balancer-overview
Thoughts?
Load Balancer is a regional service and cannot direct traffic across regions.
There are a couple of options:
1) Build your own VMs with a TCP proxy to achieve your scenario. You could use Load Balancer to scale and protect your TCP proxy instances if you want to pursue that path.
2) Explore using Application Gateway for this scenario, since it is a proxy and can direct traffic to IP address destinations. This is essentially a managed service for option 1, although limited to HTTP and HTTPS.
3) Migrate to a DNS approach for locating your service and orchestrate a migration across regions over time.
Either way, traffic would remain on Microsoft's own backbone between regions.
Best regards,
Christian

Clients pointing their domains to our IP - Concerns & System Longevity

For our SaaS app, we're allowing customers to point their domain name to our server.
The plan right now is to simply hand out one of our AWS Elastic IP addresses for them to point their domain to. The Elastic IP address would essentially point to an EC2 instance running a web server... and maybe a load balancer in time (if traffic demands it!).
The user would specify what their domain is in our app, and we'd be able to resolve the host name coming in as their app.
My concern is the longevity of this solution. This IP cannot change. And we'll certainly be tied to AWS if we go this route.
(Note: Being a 1-2 person startup, standing up a data-center is more than likely no-go, and we hope to use AWS or Azure).
What solutions would make this IP address -> SaaS Web Server concept last in the long run, with flexibility, and as minor of a tie as possible to a cloud provider?
At the risk of asking "what is the best way to do this"... what's the best way to do this, keeping in mind longevity and minimal lock-in to a cloud provider?
You can't point an IP address to a load balancer, so this seems like a very bad idea. You need your own domain/subdomain that clients can point their domains/subdomains to via a CNAME record on their end. Then if the location of your service ever changes you just have to update your domain record and their DNS records will continue to be correct.
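On the application side, the CNAME approach still works with the plan of resolving the incoming host name to a customer's app, because the web server sees the customer's own domain in the HTTP Host header regardless of which IP the request arrived on. A minimal sketch of that lookup, assuming a hypothetical in-memory table (`TENANTS_BY_DOMAIN`) with made-up domains and tenant IDs; a real app would load this from its database:

```python
from typing import Optional

# Hypothetical mapping of customer domains to tenant IDs.
TENANTS_BY_DOMAIN = {
    "shop.customer-a.example": "tenant-a",
    "www.customer-b.example": "tenant-b",
}


def tenant_for_host(host_header: str) -> Optional[str]:
    """Map the value of an HTTP Host header to a tenant ID, or None if unknown."""
    host = host_header.strip().lower()
    # Drop an optional ":port" suffix. Customer domains are host names,
    # so IPv6 literals are deliberately not handled in this sketch.
    if ":" in host:
        host = host.rsplit(":", 1)[0]
    # A fully qualified name may carry a trailing dot.
    host = host.rstrip(".")
    return TENANTS_BY_DOMAIN.get(host)


if __name__ == "__main__":
    print(tenant_for_host("Shop.Customer-A.example:443"))
```

Because the lookup depends only on the host name, the IP address or cloud provider behind your own domain can change without customers touching their DNS records.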

Azure Traffic manager - Route by User IP Address

I have a web application in multiple regions in the Azure cloud, and I'm using Traffic Manager in Performance mode to redirect each user to the closest region.
What's concerning me is the following:
Using https://www.whatsmydns.net, I checked my web application to see which datacenter is selected.
The funny thing is that people from California get redirected to the server in West Europe, even though there is a server in US Central too.
So from Traffic Manager's point of view, the ping to the Europe server is faster than to US Central.
But I believe the difference between the two cannot be large...
Now I fear that a user could jump between US Central and Europe all the time because he is in a zone where the latencies to the available servers are nearly identical.
I also store files in an Azure Storage account in each region. If the user jumps, I would have to transfer these files between the regions all the time...
So I was wondering whether it is possible to route the user to a specific region by GeoIP rather than by latency?
One of the benefits of Traffic Manager, in my eyes, is that I can use one domain for all regions...
The only solution to my problem I can think of is my own cloud service that replaces Traffic Manager and redirects users to the different regions by their IP, e.g. us-center.DOMAIN.com, we-eu.DOMAIN.com, etc...
Are there any other solutions?
Thanks for your help!
Br,
metabolic
If you believe Traffic Manager is routing queries incorrectly, that should be raised with Azure Support.
Traffic Manager 'Performance' mode routing is based on an internal 'IP address to Azure data center latency' map. The source IP of the DNS query (which is typically the IP of your DNS server) is looked up in the map to determine which Azure location will offer the best performance. There is an implicit assumption that the IP address of the DNS server is a good proxy for the location of the end user.
The 'Performance' mode in Azure Traffic Manager is deterministic. Identical queries from the same address will be routed consistently. The only exception is that routing may change during occasional map updates, which affect only a small percentage of the IP address space.
A more common cause of routing changes is customers moving from place to place, for example during travel, or simply by picking up a Wi-Fi network that uses a DNS service in a different location, with a different IP address.
Geo-IP based routing is not currently supported by Traffic Manager. However, please note that it would work in the same way as 'Performance' routing, just with a different map. Users could still be routed to different locations as a result of map updates or changing DNS servers.
As you describe, if your application requires a strong, inviolable association between a user and a region, one option is to redirect users at the application level (e.g. via HTTP 302).
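A minimal sketch of such an application-level redirect, in Python. The region keys and the `redirect_for` helper are hypothetical; the host names reuse the placeholders from the question. It assumes the application stores a "home region" per user (the region where their files live) and knows which region is currently serving the request:

```python
from typing import Optional, Tuple

# Hypothetical per-region host names, using the placeholders from the question.
REGION_HOSTS = {
    "us-central": "us-center.DOMAIN.com",
    "west-europe": "we-eu.DOMAIN.com",
}


def redirect_for(home_region: str, serving_region: str,
                 path: str) -> Optional[Tuple[int, str]]:
    """Return (status, Location) if the user must be sent to their home region.

    Returns None when the request is already in the right place, or when the
    home region is unknown, in which case it is served where it landed.
    """
    if home_region == serving_region:
        return None
    host = REGION_HOSTS.get(home_region)
    if host is None:
        return None
    return 302, f"https://{host}{path}"


if __name__ == "__main__":
    # A user whose files live in West Europe lands on the US Central deployment.
    print(redirect_for("west-europe", "us-central", "/files/report.pdf"))
```

With this in place Traffic Manager only decides where a new user lands first; once a home region is recorded, the user stays pinned to it even if DNS routing later changes.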

Azure Traffic Manager not routing to correct data center

We have deployed our Azure web application to two separate data centers (one located in West Europe and the other in Southeast Asia) purely for performance reasons. We have configured a Traffic Manager to route requests between these two DCs based on performance. However, when users in Shanghai try to access the site, they are routed to the DC in West Europe when ideally they should be routed to the Southeast Asia DC. Because of this, users in Shanghai are not seeing the intended performance benefit. However, it works well from India and Europe, i.e. users there are routed to the closest DC. What could be the problem? Also, is there a way to test Traffic Manager's performance-based routing to see if it is working as expected?
UPDATE:
I requested the users in Shanghai to use azure speed test to know the closest DC. They see "It looks like your nearest Data Center is West Europe. There appears to be a CDN Node nearer your location". When we use the above site from India, it shows "It looks like your nearest Data Center is Southeast Asia". My questions are:
Based on the above lookups, TM seems to be routing correctly even though India is closer to West Europe than Shanghai?
What is the additional information "There appears to be a CDN Node nearer your location" that is displayed after the data center when looked up from Shanghai, but not shown from India? Could this "CDN Node" be causing the Shanghai users to detect West Europe as the closest DC?
I think you are confusing two different concepts:
Physical proximity vs. network proximity (ie. latency)
CDN vs. WATM (Windows Azure Traffic Manager)
A user's physical geographic proximity to a datacenter is only marginally related to which datacenter will be fastest. Around the world there are different peering agreements, interconnects, etc that can cause a geographically closer datacenter to have worse latency/bandwidth to a user than a geographically farther datacenter. This is why it is always important to test the latency from your users' location rather than just assuming by looking at a map. The azurespeedtest site you used is a great way to check the real-world performance of a user to an Azure datacenter, and the fact that it shows the same results as WATM means that WATM is performing correctly and your users are getting the fastest speeds possible.
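If you want to check network proximity yourself from a user's machine, a rough approach is to time a TCP connection to each deployment and compare. The sketch below is only an approximation of what a speed-test site does: the endpoint host names in the usage example are hypothetical, and connect time (which here also includes the DNS lookup for the name) is only one component of real-world performance.

```python
import socket
import time
from typing import Dict, Optional


def connect_time_ms(host: str, port: int = 443,
                    timeout: float = 3.0) -> Optional[float]:
    """Time a single TCP connect to host:port in milliseconds.

    Returns None if the endpoint cannot be resolved or reached in time.
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        return None


def fastest(latencies: Dict[str, Optional[float]]) -> Optional[str]:
    """Return the name with the lowest measured latency, ignoring failures."""
    reachable = {name: ms for name, ms in latencies.items() if ms is not None}
    if not reachable:
        return None
    return min(reachable, key=reachable.get)


if __name__ == "__main__":
    # Hypothetical per-region host names for the two deployments.
    endpoints = {
        "west-europe": "myapp-westeurope.example.com",
        "southeast-asia": "myapp-seasia.example.com",
    }
    results = {name: connect_time_ms(host) for name, host in endpoints.items()}
    print(results, "->", fastest(results))
```

Running this from Shanghai and from India, several times and at different times of day, shows whether the data center that looks closer on a map is actually the faster one over the network.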
CDN is a cache layer for static content and has lots of nodes throughout the world (see http://msdn.microsoft.com/library/azure/gg680302.aspx), and these nodes are in no way related to Azure datacenters. CDN also has nothing to do with WATM or which datacenter WATM would point a specific user to. If you have a lot of static content then you may want to consider adding a CDN endpoint in front of your site in order to cache the content closer to your users. See http://msdn.microsoft.com/en-us/library/azure/ee795176.aspx for more info.

automatic failover if webserver is down (SRV / additional A-record / ?)

I am starting to develop a webservice that will be hosted in the cloud but needs higher availability than typical cloud SLAs provide.
Typical SLAs, e.g. Windows Azure, promise an availability of 99.9%, i.e. up to 43min downtime per month. I am looking for an order of magnitude better availability (<5min down time per month). While I can configure several load balanced database back-ends to resolve that part of the issue I see a bottleneck at the webserver. If the webserver fails, the whole service is unavailable to the customer. What are the options of reducing that risk without introducing another possible single point of failure? I see the following solutions and drawbacks to each:
SRV-record:
I duplicate the whole infrastructure (and take care that the databases are in sync) and add SRV records for the domain, so that a user trying to access www.example.com automatically gets forwarded to example.cloud1.com or, if that one is offline, to example.cloud2.com. Googling around, it seems that SRV records are not supported by any major browser. Is that true?
second A-record:
Add an additional A-record as an alternative. Drawbacks:
a) At my hosting provider I do not see any way to add a second A-record, just one... is that normal?
b) If one of the two servers is down, I am not sure whether the user automatically gets redirected to the other one, or whether 50% of all users get a 404 or some other error.
Any clues for a best-practice would be appreciated
Cheers,
Sebastian
The availability SLA specified by a cloud provider covers the health of the instance as seen by the hypervisor or fabric controller, i.e. whether the instance is running. It does not cover failures caused by your app, the OS, or pretty much anything else running inside the instance, so you need to make the effort to prevent those yourself. There are a few things that devops teams tend to miss and that hit back hard, for instance forgetting to configure OS updates and patches.
The fundamental axiom of availability is redundancy. The more redundant your application and infrastructure are, the more available your app is.
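The effect of redundancy on the numbers in the question can be worked out with a little arithmetic. The sketch below assumes a 30-day month, fully independent failures between deployments, and instantaneous failover, none of which holds exactly in practice, so treat the result as an upper bound rather than a promise:

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200, assuming a 30-day month


def downtime_minutes_per_month(availability: float) -> float:
    """Expected downtime per month for a given availability (e.g. 0.999)."""
    return (1.0 - availability) * MINUTES_PER_MONTH


def combined_availability(availabilities) -> float:
    """Availability of redundant deployments where any single one suffices.

    Assumes failures are independent: the service is down only when
    every deployment is down at the same time.
    """
    all_down = 1.0
    for availability in availabilities:
        all_down *= (1.0 - availability)
    return 1.0 - all_down


if __name__ == "__main__":
    single = 0.999
    pair = combined_availability([single, single])
    print(f"single: {downtime_minutes_per_month(single):.1f} min/month")
    print(f"pair:   {downtime_minutes_per_month(pair):.3f} min/month")
```

A single 99.9% deployment matches the roughly 43 minutes per month quoted in the question, while staying under 5 minutes per month requires roughly 99.988%. Under the independence assumption, two 99.9% deployments would get there with room to spare; correlated failures and failover delay eat into that margin.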
I recommend you look into Azure Traffic Manager and then rework your architecture. You need not worry about SRV records or additional A-records. Just a CNAME pointing to Traffic Manager would do the trick.
The idea of Traffic Manager is simple: you tell Traffic Manager to
sit behind the domain name (the domain name resolution of the app),
and Traffic Manager then decides where to send each request based on
a policy such as Round-Robin or failover for disaster management.
With the combination of Traffic Manager and a multi-region infrastructure setup, you will move towards the high availability goal.
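Conceptually, a failover policy boils down to "answer with the first healthy endpoint in priority order". The Python sketch below illustrates only that decision; it is not Traffic Manager's implementation. The endpoint names reuse the placeholders from the question, and `is_healthy` is a stand-in for whatever health probe the routing service runs.

```python
from typing import Callable, Iterable, Optional


def pick_endpoint(endpoints_by_priority: Iterable[str],
                  is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first healthy endpoint in priority order, or None."""
    for endpoint in endpoints_by_priority:
        if is_healthy(endpoint):
            return endpoint
    return None


if __name__ == "__main__":
    endpoints = ["example.cloud1.com", "example.cloud2.com"]
    # Pretend the primary deployment is currently failing its health probe.
    down = {"example.cloud1.com"}
    print(pick_endpoint(endpoints, lambda e: e not in down))
```

Because this decision happens at DNS resolution time, clients that have cached the old answer keep using it until the record's TTL expires, so the failover time seen by users also depends on the TTL you configure.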
Links
Azure Traffic Manager Overview
Cloud Power: How to scale Azure Websites globally with Traffic Manager
Maybe you should configure a corosync cluster with DRBD?
DRBD ensures that the data on both nodes is replicated (for example website files and db files).
Apache, as the web server, will be available under a virtual IP to which the domain points. If one server goes down, corosync will move all services to the second server within a few seconds.

Resources