I have a domain that needs to be spread across several servers for load balancing purposes.
I also have my application decide which server is supposed to handle certain requests.
Right now I have it set up to use sub-domains like www1, www2 and simply redirect to each server, but that is ugly.
I need a way to proxy the requests so that users see only www all the time, regardless of which IP is actually serving the request...
I have read a bit about Apache's proxy support, but I am still confused about how such a setup would deliver the page and resources like videos without changing the www.
You can enter multiple IP addresses per subdomain in your DNS table. If your DNS server supports it, you can rotate these entries on each request to get a simple round-robin load balancer (see http://en.wikipedia.org/wiki/Round-robin_DNS).
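As a rough illustration of what rotating the entries means, the sketch below simulates a name server that hands out the same set of A records in a different order on each query. The class name RoundRobinZone and the 192.0.2.x addresses are made up for the example; a real DNS server does this internally and may order records differently.

```python
from collections import deque


class RoundRobinZone:
    """Toy model of a DNS zone that rotates its A records per query."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def query(self):
        # Return the current ordering, then rotate left so the next
        # client sees a different address first.
        answer = list(self._records)
        self._records.rotate(-1)
        return answer


# Hypothetical addresses from the documentation range.
zone = RoundRobinZone(["192.0.2.10", "192.0.2.11", "192.0.2.12"])
first_choices = [zone.query()[0] for _ in range(6)]
print(first_choices)
```

Clients that simply take the first address in the answer end up spread evenly across the three servers, which is all the "load balancing" this approach provides.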
However, a much better solution is to have a load balancing server that handles all requests to your web site. This way you can add web servers to the rotation and remove them from it instantaneously. So when you need to do some maintenance on one server, you just take it out of the rotation.
Many load balancers also check if the web servers are still alive and remove dead servers automatically. This will increase your uptime significantly.
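A minimal sketch of such a liveness check, assuming that a successful TCP connection is a good enough sign of life (real load balancers usually offer richer HTTP-level checks). The function names is_alive and live_backends are invented for this example.

```python
import socket


def is_alive(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the server as dead.
        return False


def live_backends(backends, check=is_alive):
    """Keep only the backends that currently pass the check."""
    return [(host, port) for host, port in backends if check(host, port)]
```

A balancer would run something like live_backends on a timer and only send traffic to the servers it returns. The check parameter is there so the probe can be swapped for something stricter.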
I have successfully set haproxy on my server cluster. I have run into one snag that I can't find a solution for...
TESTING INDIVIDUAL CLUSTER COMPUTERS
It can happen that for one reason or another, one computer in the cluster gets a configuration variation. I can't find a way to tell haproxy that I want to use a specific computer out of a cluster.
Basically, mysite.com (and several other domains) are served up by boxes web1, web2 and web3. And they round-robin perfectly.
I want to add something to the URL to tell haproxy that I specifically want to talk to web2 only because in a specific case, only that server is throwing an error on one web page.
Anyone know how to do that without building a new cluster with a URI filter and only have one computer in that cluster? I am hoping to use the cluster as-is but add something to the URI that will tell haproxy which server to use out of the cluster.
Thanks!
Have you thought about using a different port for this? You could define a new listen section on a different port, since, as I understand it, you are free to modify the URL in any way.
Basically, haproxy cannot do what I was hoping. There is no way to add a param to the URL to suggest which host in the cluster to use.
I solved my testing issue by setting up unique ports for each server in the cluster at the firewall. This could also be done at the haproxy level.
To secure this path from the outside world, I told the firewall to only accept traffic from inside our own network.
This lets us test specific servers within the cluster. We did have to add a trap in our PHP app to deal with a session cookie that is too large because we have haproxy manipulating this cookie to keep users on the server they first hit. So when the invalid session cookie is detected, we have the page simply drop the session and reload the page.
This is working well for our testing purposes.
It has recently come to my attention that setting up multiple A records for a hostname can be used not only for round-robin load-balancing but also for automatic failover.
So I tried testing it:
I loaded a page from our domain
Noted which of our servers had served the page
Turned off the web server on that host
Reloaded the page
And indeed the browser automatically tried a different server to load the page. This worked in Opera, Safari, IE, and Firefox. Only Chrome failed to try a different server.
But after leaving that server offline for a few minutes and looking at the access logs, I found that the number of requests to the other servers had not significantly increased. With 1 out of 3 servers offline, I had expected accesses to each of the remaining 2 servers to roughly increase by 50%, but instead I only saw 7-10%. That can only mean DNS-based failover does not work for the majority of browsers/visitors, which directly contradicts what I had just tested.
Does anyone have an idea what is up with DNS-based web browser failover? What possible reason could there be why automatic failover works for me but not the majority of our visitors?
What's happening is that the browsers are not doing automatic DNS failover.
If you have multiple A records on a domain then when your nameserver requests the IP for the domain you typed into your browser, it'll request one from the SOA. It could be any of those A records. Then it passes it along.
Some nameservers are 'smart' enough to request a new A record if the one it gets doesn't work and some aren't. So if you set multiple A records then you will have set up a pseudo redundancy failover, but only for those people with 'smart' nameservers. The rest get a toss of the dice on which IP they get and if it works then good, and if not then it will fail to load as it did for you in Chrome.
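The retry behaviour being discussed, whichever layer ends up doing it, amounts to walking the list of addresses until one accepts a connection. Here is a small sketch of that idea; the function name and the injectable connect parameter are assumptions made for the example, not how any particular browser or resolver is implemented.

```python
import socket


def connect_first_available(endpoints, timeout=2.0,
                            connect=socket.create_connection):
    """Try each (host, port) in order; return (endpoint, connection)
    for the first one that accepts."""
    last_error = None
    for endpoint in endpoints:
        try:
            return endpoint, connect(endpoint, timeout=timeout)
        except OSError as exc:
            last_error = exc  # remember why this one failed, then move on
    raise ConnectionError(f"no endpoint accepted a connection: {last_error}")
```

A client without a loop like this stops at the first address it was given, which matches the "toss of the dice" case described above.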
If you want to test this specifically, you can use your hosts file (C:\Windows\system32\drivers\etc\hosts on Windows, /etc/hosts on Linux) to pin a domain to a particular IP and see whether you get a true failover. What you will run into in practice is that DNS servers across the net cache your domain's resolution according to its TTL. So if/when you get a real failure, that IP will still need to be resolved again and otherwise be farmed out to another nameserver.
Another possible explanation is that, for most public websites, the bulk of traffic comes from bots not from browsers. Depending on the bot it is possible that they aren't quite as smart as the browsers when it comes to handling multiple A records for a domain.
Also, some bots use keep-alives to keep the TCP connections open & make multiple HTTP requests over the same connection. Given that the DNS lookup is only done when a connection is made, they will continue to make requests to the old IP address at least as long as the connection is kept open.
If the above explanation has any weight you should be able to see it in your logs by examining the user agent strings.
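One way to do that examination is to tally requests per user agent. The sketch below assumes the access log is in the common "combined" format, where the user agent is the last double-quoted field on each line; the sample lines are invented.

```python
import re
from collections import Counter

# Assumes the "combined" log format: the user agent is the last
# double-quoted field on the line.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')


def count_user_agents(lines):
    """Count requests per user agent string."""
    counts = Counter()
    for line in lines:
        match = USER_AGENT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample lines for illustration.
sample = [
    '198.51.100.7 - - [01/Jan/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '198.51.100.8 - - [01/Jan/2024:10:00:01 +0000] "GET /a HTTP/1.1" 200 128 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [01/Jan/2024:10:00:02 +0000] "GET /b HTTP/1.1" 200 256 "-" "ExampleBot/1.0"',
]
for agent, hits in count_user_agents(sample).most_common():
    print(hits, agent)
```

Running this against the log of the server that was taken offline and against the logs of the remaining servers would show whether the traffic that never moved over is dominated by bot user agents.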
I will be running a dynamic web site and if the server ever is to stop responding, I'd like to failover to a static website that displays a "We are down for maintenance" page. I have been reading and I found that switching the DNS dynamically may be an option, but how quick will that change take place? And will everyone see the change immediately? Are there any better ways to failover to another server?
DNS has a TTL (time to live) and gets cached until the TTL expires. So a DNS cutover does not happen immediately. Everyone with a cached DNS lookup of your site still uses the old value. You could set an insanely short TTL but this is crappy for performance. DNS is almost certainly not the right way to accomplish what you are doing.
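To make the caching effect concrete, here is a toy resolver that keeps an answer until its TTL runs out. The class, the host name, and the addresses are all invented for the example, and time is passed in explicitly so the behaviour is easy to follow.

```python
class CachingResolver:
    """Toy resolver that caches an answer until its TTL expires."""

    def __init__(self, authoritative, ttl_seconds):
        # authoritative: dict of name -> IP, standing in for the real DNS.
        self._authoritative = authoritative
        self._ttl = ttl_seconds
        self._cache = {}  # name -> (ip, expires_at)

    def resolve(self, name, now):
        cached = self._cache.get(name)
        if cached and now < cached[1]:
            return cached[0]  # still within the TTL, possibly stale
        ip = self._authoritative[name]
        self._cache[name] = (ip, now + self._ttl)
        return ip


records = {"www.example.com": "192.0.2.10"}
resolver = CachingResolver(records, ttl_seconds=300)

before = resolver.resolve("www.example.com", now=0)
records["www.example.com"] = "192.0.2.99"  # the DNS cutover happens here
during = resolver.resolve("www.example.com", now=120)
after = resolver.resolve("www.example.com", now=301)
print(before, during, after)
```

Two minutes after the cutover the resolver still returns the old address; only once the 300-second TTL has passed does it pick up the new one.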
A load balancer can do this kind of immediate switchover. All traffic always hits the load balancer first which under normal circumstances proxies requests along to your main web server(s). In the event of web server crash, you can just have the load balancer direct all web traffic to your failover web server.
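The decision the load balancer makes here is small. A sketch, with invented names and with the health check passed in as a function rather than tied to any particular product:

```python
def choose_backend(primaries, fallback, is_up):
    """Return the first healthy primary, or the fallback if none is healthy."""
    for backend in primaries:
        if is_up(backend):
            return backend
    return fallback


# Hypothetical server names; the set stands in for real health checks.
healthy = {"dynamic1"}
print(choose_backend(["dynamic1"], "static-maintenance", healthy.__contains__))

healthy.clear()  # the dynamic server has crashed
print(choose_backend(["dynamic1"], "static-maintenance", healthy.__contains__))
```

Because this choice is made per request on the balancer, the switch to the maintenance server takes effect as soon as the health check notices the crash, with no DNS caching involved.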
pound, perlbal or another software load balancer could do that, I believe, yes.
perhaps even Apache rewrite rules could allow this? I'm not sure if there's a way to branch when the dynamic server is not available, though. Customize Apache 404 response to your liking?
First of all, it is important to understand which kind of failure you want to fail over from. If it is an app/DB error and the server remains up, you can create a script that performs some checks and fails your website over to a temporary page (by changing the Apache config or .htaccess).
If it is a hardware failure, the DNS solution is OK, but it is not immediate, so you will lose some user traffic.
The ideal solution is to use a proxy (like HAProxy) that forwards HTTP requests to at least two web servers, automatically detects if one of them fails, and switches over to the working one.
If you're using Amazon AWS you can use ELB - Elastic Load Balancer
Currently we use DNS polling for four web servers.
The problem we have run into is this: when a user refreshes, they might land on a different web server. This is very bad when the user has already logged in, because we use a session to remember the login status, and when the refresh goes to another web server the session is lost.
So the best solution would be to keep the user on the same web server when they refresh. Is there a way to do that?
Ok, I believe you mean "Round Robin DNS". Well, what you describe is a very common problem and there is no "right" solution for it, since the possible answers depend on many variables: are you trying to provide automatic failover or just load balancing? Are you willing to spend time and/or money in a load balancer? What technologies are you using? Java EE? PHP? Apache? IIS?
Having said that, if you're just after load balancing and failover is not much of an issue you may want to use different names for each server (www1,www2,www3 and so on) and redirect to them from your "main" web server (www) upon first access. It's simple (and simplistic) but practical in a few settings.
Can the web servers use a common database server to store the session information?
I know that certain hardware based load balancers will create a "sticky" relationship between a user and a server to avoid this type of problem.
You have quite a few options.
You can store sessions in a key:value store, e.g. memcached (my personal favorite).
You can store sessions in a database.
You can put reverse-proxy load balancers in DNS and your servers behind them. Then configure them so that all requests from the same IP go to the same server, regardless of which load balancer they pass through. In HAProxy this option is called balance source. Beware: if the number of nodes changes, sessions can be lost. You can use the cookie or url_param features to avoid this.
See the HAProxy documentation. It's worth reading, really.
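To see why source-based balancing keeps a user on one server, and why changing the number of nodes can lose sessions, here is a small sketch of the general idea. It is not HAProxy's actual hash function; the use of SHA-256, the function name, and the server names are all assumptions for illustration.

```python
import hashlib


def pick_server(client_ip, servers):
    """Map a client IP to a server by hashing the address."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


four = ["web1", "web2", "web3", "web4"]
three = ["web1", "web2", "web3"]  # web4 removed from the pool

# Hypothetical client addresses.
clients = [f"198.51.100.{n}" for n in range(1, 201)]
moved = sum(
    1 for ip in clients
    if pick_server(ip, four) != pick_server(ip, three)
)
print(f"{moved} of {len(clients)} clients changed server")
```

The same IP always lands on the same server as long as the list is unchanged, so every load balancer running this logic agrees. Shrinking the list remaps many clients, including ones that were never on web4, which is the session loss the warning above refers to.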
Are the four web servers all on the same site and network, or are they distributed?
If the former, you can include a server ID somewhere in the HTTP response, such that a reverse proxy in front of the real servers can identify which server is responsible for the session.
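One common place to carry such a server ID is a cookie. The sketch below shows how a proxy could read it back on the next request; the cookie name SERVERID, the server table, and the function name are assumptions made for the example.

```python
from http.cookies import SimpleCookie

# Hypothetical mapping of server IDs to backend addresses.
SERVERS = {
    "web1": "10.0.0.11",
    "web2": "10.0.0.12",
    "web3": "10.0.0.13",
}


def route_by_cookie(cookie_header, servers, default):
    """Pick the backend named in the SERVERID cookie, else the default."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get("SERVERID")
    if morsel is not None and morsel.value in servers:
        return servers[morsel.value]
    # No cookie, or an unknown ID: let the caller's default decide.
    return default


print(route_by_cookie("PHPSESSID=abc123; SERVERID=web2", SERVERS, "10.0.0.11"))
```

On the first request there is no cookie, so the proxy picks any server and sets the ID in the response; later requests carry the ID and go back to the server that holds the session.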
A DNS server that can respond based on the location of client could solve this problem. PowerDNS with the geoip module or GeoIPdns are some examples. You would need to make sure that the IP address sets were non-overlapping so a client always got the same response.
This would not provide any sort of fail over on its own.
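The non-overlapping requirement can be checked mechanically. The sketch below is not PowerDNS or GeoIPdns configuration; it is a stand-alone illustration with an invented table of client networks and the server each should be answered with.

```python
import ipaddress
from itertools import combinations

# Hypothetical mapping of client networks to the server returned for them.
REGIONS = {
    "198.51.100.0/25": "192.0.2.10",
    "198.51.100.128/25": "192.0.2.11",
    "203.0.113.0/24": "192.0.2.12",
}


def find_overlaps(regions):
    """Return pairs of networks that overlap, which would make answers ambiguous."""
    networks = [ipaddress.ip_network(cidr) for cidr in regions]
    return [(a, b) for a, b in combinations(networks, 2) if a.overlaps(b)]


def answer_for(client_ip, regions):
    """Return the server for the network containing client_ip, or None."""
    address = ipaddress.ip_address(client_ip)
    for cidr, server in regions.items():
        if address in ipaddress.ip_network(cidr):
            return server
    return None


print(find_overlaps(REGIONS))
print(answer_for("198.51.100.200", REGIONS))
```

With no overlaps, a given client address can only ever match one entry, so it always receives the same server and its session stays put.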
How can I make a site automagically show a nice "Currently Offline" page when the server is down? (I mean the whole server is down and the request can't reach IIS.)
Changing the DNS manually is not an option.
Edit: I'm looking for some kind of DNS trick to redirect to another server in case the main server is down. I can make permanent changes to the DNS, but not manually at the moment the server goes down.
I have used the uptime services at DNSMadeEasy to great success. In effect, they set the DNS TTL to a very low number (5 minutes). They take care of pinging your server.
In the event of outage, DNS queries get directed to the secondary IP. An excellent option for a "warm spare" in small shops with limited DNS requirements. I've used them for 3 years with not a single minute of downtime.
EDIT:
This allows for geographically redundant failover, which the NLB solution proposed does not address. If the network connection is down, both servers in a standard NLB configuration will be unreachable.
Some server needs to dish out the "currently offline" page, so if your server is completely down, some other server has to serve the file(s). You can set up a cluster of servers (even if just 2) where the 2nd is configured only to return the "currently offline" page while the first one is down. Once the 1st server is back up, you can safely take down the 2nd (as server 1 will take all the load).
You probably need a second server with 100% uptime, with some kind of failover load balancer on it: if the main server is online it redirects there, and if it isn't it serves a page itself saying the server is down.
I believe that if the server is down, there is nothing you can do.
The request will send up a 404 network error because when the web address is resolved to an IP, the IP that is being requested does not exist (because the server is down). If you can't change the DNS entry, then the client browser will continue to hit xxx.xxx.xxx.xxx and will never get a response.
If the server is up, but the website is down, you have options.
EDIT
Your edit mentions that you can make a permanent change to the IP. But you would still need a two-server setup in order to achieve what you are talking about. You can point the DNS at a load balancer, which would be able to direct the request to a server that is currently active. However, this still requires 100% uptime for the server that the DNS points to.
No matter what, if the server that the DNS is pointing to (which you must control, in order to redirect the traffic) is down, then all requests will receive a 404 network error.
EDIT Thanks to brian for pointing out my 404 error error.
Seriously, DNS is not the right answer to server load-balancing or fail-over. Too many systems (including stub clients and ISP recursive resolvers) will cache records for much longer than the specified TTL.
If both servers are on the same network, use routing protocols to achieve fail-over by having both servers present the same IP address to the network, but where the fail-over server only takes over if it detects that the (supposedly) live server is offline.
If the servers are Unix, this is easily done by running Quagga on each server, and then using OSPF as the local routing protocol. I've personally used this for warm standby servers where the redundant system was actually in another data center, albeit one that was connected via a direct link to the main data center.
Certain DNS providers, such as AWS's Route 53, have a health-check option, which can be used to re-route to a static page. AWS has a how-to guide on setting this up.
I'm thinking that if the site is load balanced, the load balancer itself would detect that the web servers it is trying to send clients to are down, and would therefore send the user to a backup server with a message indicating technical problems.
Other than that.....
The only thing I can think of is to control the calling page. Obviously that won't work in all circumstances... but if you know that most of your hits to this server will come from a particular source, then you could add a JavaScript test to the source, and redirect to a "server down" page that is generated on a different server.
But if you are trying to handle all hits, from all sources (some of which you can't control), then I think you are out of luck. As other folks are saying - when a server is down, the browser gets a 404 error when it attempts a connection.
... perhaps there would be a way at a point in between to detect 404 errors being returned by servers and replacing them with a "server is down" web page. You'd need something like an HTML firewall or some other intermediate network gear between the server and the web client.