What are the benefits of using Fastly versus simply having my own self-hosted Varnish? Are there additional benefits and features that Fastly provides that regular Varnish does not, or is it simply that Fastly is managed Varnish in the same way that CloudAMQP is hosted and managed RabbitMQ?
I just stumbled across this question. I know you asked it a while ago, but I'm going to try and answer it for you regardless.
You are correct in assuming that Fastly manages the Varnish instances for you, so you don't have to deal with manually managing your servers. It is a slightly different concept from CloudAMQP, however: CloudAMQP is a managed RabbitMQ system that lives in a specific datacenter, perhaps with Multi-AZ enabled for failover purposes.
Fastly is a full-blown content delivery network, which means it has machines running Varnish all over the world. That can significantly improve your users' experience because of lower latency. For example, if an Australian user visits your website, they retrieve the cached content from Fastly's Australian machines, whereas with your own Varnish instance they would probably have to connect to an instance in the U.S., which adds a lot more latency. It also improves reliability, not just speed: a single Varnish instance failing at some point is quite likely, while Fastly's global network of 1000s of machines running Varnish collapsing all at once is very unlikely.
So to sum it up for you:
Speed
Reliability
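To put a rough number on the reliability point, here is a minimal sketch. It assumes every node fails independently with the same probability, which real networks only approximate, and the 1% figure and node counts are made up for illustration (they are not Fastly's numbers).

```python
def all_down_probability(p_node_down: float, nodes: int) -> float:
    """Probability that every node is down at the same time,
    assuming each node fails independently of the others."""
    if not 0.0 <= p_node_down <= 1.0:
        raise ValueError("p_node_down must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return p_node_down ** nodes


# Hypothetical figure: each node is unavailable 1% of the time.
single = all_down_probability(0.01, 1)  # one self-hosted Varnish instance
trio = all_down_probability(0.01, 3)    # three independent cache nodes

print(f"single instance down: {single:.6%}")
print(f"all three nodes down: {trio:.6%}")
```

Correlated failures (a bad config push, a shared upstream) make the real-world gain smaller than this, but the direction of the argument holds.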
Regards,
Rene.
Is there a recommended way to configure an IdentityServer for high availability? What are the pros and cons of one solution over another?
Currently I use ARR for it, but I have some issues with it and I'm not sure it is the best solution anyway.
This isn't really a question specific to ASP.NET Core or IDS4, and doing it to a degree where it is truly HA is quite hard.
That said, if not talking about Amazon/Azure stuff... I'd do something like this:
Two sets of servers, each in geographically separate sites, combined with a multi-site SQL Server AlwaysOn High Availability Group using synchronous commits and automatic fail-over and the app configured to suit (i.e. MultiSubnetFailover=true).
In front of each set of web servers, put a traffic manager (with its own HA) with health checking enabled, and in front of that use active DNS failover via a service like Dyn (https://dyn.com/active-failover/).
This would allow you to suffer an entire data centre going down or any individual server and carry on like nothing happened.
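The health-check-then-fail-over idea can be sketched roughly as below. This is only an illustration of the decision logic, not how Dyn or any particular traffic manager implements it; the function names and the health URLs in the usage note are hypothetical.

```python
import urllib.request
from typing import Callable, Optional, Sequence


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200 in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def pick_active_site(
    sites: Sequence[str],
    probe: Callable[[str], bool] = http_probe,
) -> Optional[str]:
    """Return the first site, in priority order, whose health check passes."""
    for site in sites:
        if probe(site):
            return site
    return None  # every site failed its health check
```

Called as `pick_active_site(["https://site-a.example/health", "https://site-b.example/health"])`, it returns the primary while it is healthy and the secondary otherwise. A real failover service also adds probe intervals, retry thresholds and probes from several vantage points before switching.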
From answers to other questions (such as this question), it sounds like different instance sizes offer different network throughput. My processing is I/O bound, and I'm trying to use web jobs to do it on a web site instance. Do web sites offer the same bandwidth as VMs with the same size/price point? Or if I need bandwidth higher than 100 Mb/sec, would I need to choose a solution other than web sites to do this processing?
Thanks,
David
Unfortunately, the bandwidth limits are not currently exposed.
At the end of the day, Azure App Service runs on Cloud Services machines, so the bandwidth should be quite similar to that of Web/Worker roles.
However, requests go through additional mechanisms (IIS ARR, for example), although that might not add much overhead.
That being said, the best way would be to try and scale out (using multiple instances) if you need more.
I hope this helps!
Adding a small detail to @dmatson's answer: right now there is no SLA except for high availability, which means the numbers you see can vary. You will need to wait for an official SLA to be released, or scale by instance size or instance count. A very good FAQ I have found on the topic is here; it covers many networking-related questions.
https://blogs.msdn.microsoft.com/igorpag/2014/09/28/my-personal-azure-faq-on-azure-networking-slas-bandwidth-latency-performance-slb-dns-dmz-vnet-ipv6-and-much-more/
I have an Azure Traffic Manager configured to route traffic over two data centres based on performance (latency). The two DCs are replicas of each other, and the setup is engineered this way so that our global customers are given good performance no matter where they connect from.
The application tiers do not hold state, and the data tiers are set up using SQL merge replication on a 1 minute timer to keep the databases in sync, so as to provide service continuity in the event of a datacenter failover.
The issue I have found is that Traffic Manager's routing is slightly erratic. I have observed a user registering under one datacenter only to find that the login has been routed to the other one. The SQL replication hasn't synced at that point, so the second DC isn't aware that the user exists, even though the user both registered and logged in from the same location! The DCs are in West US and Southeast Asia.
I'm looking at a few options to fix this. Solution A is to silo each user's data to a specific data center, so that whichever DC the user registers with is used thereafter. I wouldn't have syncing issues, but I would lose the continuity advantage that the SQL replication provides.
Solution B is to use a different, more predictable global load balancer. But first I want some opinions, and to see whether I am doing something wrong or whether my architecture is flawed.
Thanks for advice.
My solution also had challenges with Traffic Manager, although slightly different from yours. Traffic Manager is a great-value solution if it can work for you. As far as I am aware, no configuration in Traffic Manager makes it aware of sessions, so it simply follows its configured routing method, which is performance in your case. It only looks erratic because you expect it to keep a session pinned to one endpoint for as long as that endpoint is available.
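For clarity, the session-persistent behaviour being expected looks roughly like the sketch below: a session stays on its endpoint while that endpoint is healthy and only moves when it is not. This is not something Traffic Manager does; the class, method and endpoint names are hypothetical and purely illustrative.

```python
from typing import Dict, Sequence, Set


class StickyRouter:
    """Keeps each session on one endpoint for as long as it stays healthy."""

    def __init__(self, endpoints: Sequence[str]) -> None:
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._endpoints = list(endpoints)
        self._assigned: Dict[str, str] = {}  # session id -> endpoint

    def route(self, session_id: str, preferred: str, healthy: Set[str]) -> str:
        current = self._assigned.get(session_id)
        if current in healthy:
            return current  # persistence wins over the latency preference
        # First request, or the pinned endpoint is down: take the preferred
        # (e.g. lowest-latency) endpoint if healthy, else any healthy one.
        for endpoint in [preferred] + self._endpoints:
            if endpoint in healthy:
                self._assigned[session_id] = endpoint
                return endpoint
        raise RuntimeError("no healthy endpoint available")
```

With this logic, a user who registered via "west-us" keeps landing there even if a later latency measurement favours "southeast-asia", and only fails over when "west-us" drops out of the healthy set.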
In terms of your solution, it is very much Enterprise. To move backwards with solution A probably doesn't fit the requirement given what you went to the effort of building. Solution B brings many more features that Traffic Manager lacks and one of them will resolve your issue. For other reasons I am looking at
http://kemptechnologies.com/uk/server-load-balancing-appliances/virtual-loadbalancer/loadmaster-azure
It is designed for Azure and is available as a pre-installed VM. There are others available but this has been my choice and what I would use if I were in your position and wanted to keep the level of resilience you currently have.
Hope this helps.
What are the best methods for protecting a site from DoS attacks? Any idea how popular sites/services handle this issue?
What are the tools/services at the application, operating system, networking, and hosting levels?
It would be nice if someone could share real experience of dealing with this.
Thanks
Are you sure you mean DoS and not injections? There's not much you can do on the web programming end to prevent DoS attacks, because they are more about tying up connections and ports, and they get blocked at the network layer rather than at the application layer (web programming).
As for how most companies deal with them: a lot of companies use load balancing and server farms to spread out the incoming traffic. Also, a lot of smart routers monitor activity from IPs and IP ranges to make sure there aren't too many requests coming in, and if there are, they block the source before the traffic hits the server.
Biggest intentional DoS I can think of is woot.com during a woot-off though. I suggest trying wikipedia ( http://en.wikipedia.org/wiki/Denial-of-service_attack#Prevention_and_response ) and see what they have to say about prevention methods.
I've never had to deal with this myself, but a common method involves writing a small piece of code that tracks IP addresses making a large number of requests in a short amount of time and denies them before any real processing happens.
Many hosting services provide this along with hosting; check with yours to see if they do.
I implemented this once in the application layer. We recorded all requests served by our server farms through a service to which each machine in the farm could send request information. We then processed these requests, aggregated them by IP address, and automatically flagged any IP address exceeding a threshold number of requests per time interval. Any request coming from a flagged IP got a standard CAPTCHA response. If the client failed it too many times, the IP was banned forever (dangerous if you get a DoS from behind a proxy). If the client proved to be human, the statistics for that IP were "zeroed."
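A minimal single-process sketch of that flag / CAPTCHA / ban flow is below. It is not the original implementation: the class name, the return values and the thresholds are assumptions for illustration, and a real farm would keep these counters in a shared store rather than in memory.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional, Set


class IpRateFlagger:
    """Flags IPs that exceed max_requests within a sliding time window."""

    def __init__(self, max_requests: int, window_seconds: float,
                 max_captcha_failures: int = 3) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.max_captcha_failures = max_captcha_failures
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)
        self._failures: Dict[str, int] = defaultdict(int)
        self._flagged: Set[str] = set()
        self._banned: Set[str] = set()

    def record_request(self, ip: str, now: Optional[float] = None) -> str:
        """Record one request and return 'ban', 'captcha' or 'allow'."""
        if ip in self._banned:
            return "ban"
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        hits.append(now)
        # Drop timestamps that have fallen out of the window.
        while hits and hits[0] <= now - self.window_seconds:
            hits.popleft()
        if len(hits) > self.max_requests:
            self._flagged.add(ip)
        return "captcha" if ip in self._flagged else "allow"

    def captcha_result(self, ip: str, passed: bool) -> None:
        """Zero the IP's statistics on success; count failures towards a ban."""
        if passed:
            self._flagged.discard(ip)
            self._hits.pop(ip, None)
            self._failures.pop(ip, None)
            return
        self._failures[ip] += 1
        if self._failures[ip] >= self.max_captcha_failures:
            self._banned.add(ip)
```

The `now` parameter exists only so the window logic can be exercised deterministically; in normal use it is omitted. Given the proxy caveat above, a time-limited ban is probably safer than a permanent one.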
Well, this is an old one, but people looking to do this might want to look at fail2ban.
http://go2linux.garron.me/linux/2011/05/fail2ban-protect-web-server-http-dos-attack-1084.html
That's more of a serverfault sort of answer, as opposed to building this into your application, but I think it's the sort of problem which is most likely better tackled that way. If the logic for what you want to block is complex, consider having your application just log enough info to base the banning policy action on, rather than trying to put the policy into effect.
Consider also that, depending on the web server you use, you might be vulnerable to things like a Slowloris attack, and there's nothing you can do about that at the web application level.
I'm interested in cross-colo fail-over strategies for web applications, such that if the main site fails users seamlessly land at the fail-over site in another colo.
The application side of things looks to be mostly figured out with a master-slave database setup between the colos and services designed to recover and be able to pick up mid-stream. I'm trying to figure out the strategy for moving traffic from the main site to the fail-over site. DNS failover, even with low TTLs, seems to carry a fair bit of latency.
What strategies would you recommend for quickly moving traffic between colos, assuming the servers at the main colo are unreachable?
If you have other interesting experience / words of wisdom about cross-colo failover I'd love to hear those as well.
DNS based mechanisms are troublesome, even if you put low TTLs in your zone files.
The reason for this is that many applications (e.g. MSIE) maintain their own caches which ignore the TTL. Other software will do a single gethostbyname() or equivalent call and store the result until the program is restarted.
Worse still, many ISPs' recursive DNS servers are known to ignore TTLs below their own preferred minimum and impose their own higher TTLs.
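To make the resolve-once problem concrete, here is a sketch of what a well-behaved long-running client would do instead: re-resolve after a fixed interval. Note that `socket.gethostbyname()` does not expose the record's TTL, so the refresh interval here is chosen by the application; the class name is hypothetical.

```python
import socket
import time
from typing import Callable, Dict, Tuple


class TtlResolver:
    """Caches lookups, but re-resolves once ttl_seconds have elapsed."""

    def __init__(self, ttl_seconds: float,
                 resolve: Callable[[str], str] = socket.gethostbyname,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._resolve = resolve
        self._clock = clock
        self._cache: Dict[str, Tuple[str, float]] = {}  # host -> (address, expiry)

    def lookup(self, host: str) -> str:
        now = self._clock()
        cached = self._cache.get(host)
        if cached is not None and now < cached[1]:
            return cached[0]  # still fresh, reuse the cached address
        address = self._resolve(host)
        self._cache[host] = (address, now + self._ttl)
        return address
```

A program that calls `gethostbyname()` once at startup behaves like this class with an infinite interval, which is why lowering the TTL in your zone file cannot move those clients. You control the zone, not the clients, so this only helps for software you ship yourself.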
Ultimately if the site is to run from both data centers without changing its IP address then you need to look at arrangements for "Multihoming" via global BGP4 route announcements.
With multihoming you need to get at least a /24 netblock of "provider independent" (aka "PI") IP address space, and then have that only be announced to the global routing table from the backup site if the main site goes offline.
As for DNS, I like to reference, "Why DNS Based Global Server Load Balancing Doesn't Work". For everything else -- use BGP.
Designing networks in order to load balance using BGP is still not an easy task and I myself certainly am not an expert on this. It's also more complex than Wikipedia can tell you but there are a couple interesting articles on the web that detail how it can be done:
Load Balancing In BGP Networks
Load Sharing in Single and Multi homed environments
There is always more if you search for BGP and load balancing. There are also a couple of whitepapers on the net describing how Akamai does its global load balancing (I believe it's BGP too), which are always interesting to read and learn from.
Beyond the obvious concepts you can use software and hardware to achieve, you might also want to check with your ISP/provider/colo if they can set you up.
Also, no offense regarding your choice of colo (who's the provider?), but most places should be set up to deal with downtime and so on; they should not require you to take action. Of course floods or aliens can always strike, but in that case I guess there are more important issues. :-)
If you can, Multicast - http://en.wikipedia.org/wiki/Multicast or AnyCast - http://en.wikipedia.org/wiki/Anycast