AWS Route 53 weighted routing to two CloudFront distributions - amazon-cloudfront

Background:
I have a JavaScript file hosted in an S3 bucket with static website hosting, a CloudFront distribution, and Route 53. I need CloudFront because I need a custom domain with SSL support. This works fine. As an example, the script can be accessed at https://app.example.org/myscript.js. This URL is given to clients to embed on their webpages and I cannot change it.
The setup in Route 53 is like this:
app.example.org => CloudFront Distribution (s3://app.example.org)
What I want:
I want to set up a staging environment for new features. I would like to direct 10% of production requests to another version of the script.
What I tried
I've tried setting up another S3 bucket with static website hosting and a CloudFront distribution with a different alternative domain name (e.g. app-beta.example.org).
I need to use a different subdomain because CloudFront does not allow the same alternative domain name on multiple distributions.
In Route 53, I set up alias A records like this:
app-beta.example.org => Alias CloudFront Distribution (s3://app-beta.example.org)
app.example.org => Alias app-beta.example.org (weighted 10)
app.example.org => Alias CloudFront Distribution (s3://app.example.org) (weighted 100)
Why didn't it work
It turned out that this won't work.
Because the requested domain is still app.example.org, the CloudFront distribution (s3://app.example.org) picks up the request regardless of the record app.example.org => Alias app-beta.example.org.
I think Route 53 does try to route the traffic to CloudFront according to the weight rules, but when a request eventually reaches CloudFront, CloudFront respects the 'alternative domain name' and uses the distribution configured with app.example.org regardless.
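To make the failure mode concrete, here is a toy model of my understanding, not of CloudFront's real internals. The dictionary and function names are made up for illustration. It assumes the distribution is chosen only from the host name the browser sends, so the weighted DNS answer never changes which distribution serves the request.

```python
import random

# Hypothetical distributions keyed by their alternative domain name.
# CloudFront allows each alternative domain name on only one distribution.
ALTERNATE_DOMAIN_NAMES = {
    "app.example.org": "production distribution",
    "app-beta.example.org": "beta distribution",
}

# The weighted records from the question: (alias target, weight).
WEIGHTED_RECORDS = [
    ("app-beta.example.org", 10),
    ("app.example.org", 100),
]


def resolve_dns(rng):
    """Pick an alias target by weight, as weighted routing would."""
    targets = [target for target, _ in WEIGHTED_RECORDS]
    weights = [weight for _, weight in WEIGHTED_RECORDS]
    return rng.choices(targets, weights=weights, k=1)[0]


def serve(host_header):
    """Assumed behaviour: the distribution is looked up by requested host."""
    return ALTERNATE_DOMAIN_NAMES[host_header]


rng = random.Random(0)
served = set()
for _ in range(1000):
    resolve_dns(rng)  # only decides which edge addresses the client gets
    served.add(serve("app.example.org"))  # the browser still sends this host

print(served)
```

In this model every one of the 1000 requests ends up at the production distribution, which matches what I observed.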
How about not using cloudfront?
I've thought of using EC2 or EBS to run an nginx server that serves the static JavaScript files. This could work, but:
The JavaScript file will be requested worldwide. Using EC2 or EBS means higher latency for some users, which is not ideal.
It means extra resources to manage and extra cost.
How about Lambda@Edge?
Technically Lambda@Edge would be a perfect solution if it were cheaper. Yes, it is already cheap ($0.0000006 per request), but in our case we would be paying more than USD 10k for it (the script is requested > 10 billion times per month, and 2 Lambda functions are required for the setup).
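For reference, the estimate works out as follows. This uses only the figures quoted above and counts request charges alone, so any duration charges would come on top.

```python
# Figures taken from the question.
PRICE_PER_REQUEST_USD = 0.0000006      # Lambda@Edge request charge
REQUESTS_PER_MONTH = 10_000_000_000    # "> 10 billion", so a lower bound
FUNCTIONS_PER_REQUEST = 2              # two functions run for each request

monthly_request_cost = (
    PRICE_PER_REQUEST_USD * REQUESTS_PER_MONTH * FUNCTIONS_PER_REQUEST
)
print(f"at least ${monthly_request_cost:,.0f} per month")
```

That is about USD 6,000 per function, or roughly USD 12,000 per month for the two together.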
Question
Is there any other way I can achieve what I want (i.e. gradually roll out a new version without changing the URL)?

Related

CloudFront Distribution to handle many API Stages with Custom Subdomains

Currently my Lambda is served by an API Gateway (HTTPApi) behind a CloudFront distribution. The API's only stage is $default. I have a custom domain example.com which directs traffic to the CloudFront distro and to the API. I am using CloudFormation for my whole infrastructure.
I want to improve my staging by introducing dev and prod stages. I want example.com to point to the prod stage, and dev.example.com to point to the dev stage.
In my api_stack I have created the two stages:
    self.http_api.add_stage(
        "dev",
        stage_name="dev",
        auto_deploy=True,
    )
    self.http_api.add_stage(
        "prod",
        stage_name="prod",
        auto_deploy=False,
    )
I only have one CloudFront distribution and it points to the $default stage (the invoke URL of which is stored in origin_url in the following code). So I extend my cloudfront_stack to point to the prod stage (having manually deployed at least once).
    self.distribution = cloudfront.Distribution(
        self,
        construct_id,
        default_behavior=cloudfront.BehaviorOptions(
            origin=cloudfront_origins.HttpOrigin(
                origin_url,
                origin_path="/prod",
            ),
        ...
        certificate=certificate,
        domain_names=domain_names,
certificate is a DnsValidatedCertificate for example.com. domain_names=["example.com"].
At this point, my API (website, Flask app) breaks. static/ filepaths fail, as per this issue.
Ignoring that issue... I don't see how I can extend my CloudFront distribution to serve <invokeURL>/dev at dev.example.com. I expect I need:
A new certificate for dev.example.com
???
It almost feels like I need a second CloudFront distribution, which feels extraneous. I would like to solve this problem while keeping the invoke URLs disabled (as per HttpApi: disable_execute_api_endpoint).
Have I got it the wrong way round; should I have multiple CloudFront distributions, each with their own singular API deployment (Stage)? Does Stage describe something far more encompassing than a Stack? I would love for cloudfront.BehaviorOptions param origin to be origins and for some mapping to be available elsewhere: dev.example.com/: example.com/dev/. Please educate me! :)
(Might imply I make 2 distros.)
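The mapping I have in mind (dev.example.com/ to <invokeURL>/dev/) could perhaps be expressed as a Lambda@Edge viewer-request function on a single distribution. This is only a rough sketch under several assumptions: both host names are alternate domain names on the distribution, the certificate covers both, and origin_path is left unset so the function supplies the stage prefix. STAGE_BY_HOST is a made-up name.

```python
# Hypothetical mapping from viewer host name to API stage path.
STAGE_BY_HOST = {
    "example.com": "/prod",
    "dev.example.com": "/dev",
}


def handler(event, context):
    """Viewer-request handler: prefix the URI with the stage for this host."""
    request = event["Records"][0]["cf"]["request"]
    host = request["headers"]["host"][0]["value"].lower()
    stage = STAGE_BY_HOST.get(host)
    if stage is not None:
        # e.g. /static/app.css on dev.example.com becomes /dev/static/app.css
        request["uri"] = stage + request["uri"]
    return request
```

Unknown hosts pass through unchanged here; whether that is the right default is a separate decision.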

How do I serve thousands of websites with Amazon S3 or CloudFront?

We're building a website editor - like Wix, Webflow etc. Users can create their websites, and choose to deploy them - and add their own custom domain to it.
For example -
A user created a website and wants it to be deployed to https://client1.com
The static files for that entire website are stored in a subfolder of a bucket called all-sites. This bucket will have subfolders, each one corresponding to a different website.
For example, the bucket all-sites will have these sub-folders -
/client1-site
/client2-site
/client3-site
and each one of these folders will have their static website content/resources -
/client1.com
..index.html
..js/
....script.js
..img/
....img.png
How do I ask users to add A/CNAME records to their domain, that points exactly to these subfolders? Or what should my server's endpoint do to serve all these websites, while loading content for each website independently, also making sure the websites are being served over HTTPS?
Currently I have come up with a couple of approaches, but none of them are good -
Instead of saving files to S3, save them to my server's storage and return files from the server, which is very easy but will have problems with SSL certificates.
Add a CloudFront distribution for that bucket and serve sites at https://xyz.cloudfront.net/site-id/index.html etc. But how do I link that to a client's URL?
Your CloudFront idea sounds best to me, as you would gain some extra advantages, key among them caching in different regions (which might not be advisable if you are serving APIs, though).
I guess at this point you would have to ask the customer to add a CNAME record pointing to their corresponding CloudFront distribution domain name (typically something like XXXXXXXX.cloudfront.net). Hope that helps.
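Whichever component ends up in front of the bucket, it has to turn the requested host name into a folder in all-sites. A minimal sketch of that lookup follows, assuming a hypothetical table SITE_FOLDER_BY_DOMAIN (in practice probably a database) and the folder names from the question.

```python
import posixpath

# Hypothetical lookup table from custom domain to folder in all-sites.
SITE_FOLDER_BY_DOMAIN = {
    "client1.com": "client1-site",
    "client2.com": "client2-site",
}


def object_key(host, path):
    """Return the key in the all-sites bucket for a request, or None."""
    folder = SITE_FOLDER_BY_DOMAIN.get(host.lower().split(":")[0])
    if folder is None:
        return None
    # Normalise the path so "../" cannot escape the site's own folder.
    clean = posixpath.normpath("/" + path.lstrip("/"))
    # Directory-style requests are served from index.html.
    if path.endswith("/") or clean == "/":
        clean = posixpath.join(clean, "index.html")
    return folder + clean
```

For example, object_key("client1.com", "/js/script.js") gives "client1-site/js/script.js", and a request for "/" maps to that site's index.html.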

Understanding complex website architecture (reactjs,node, CDN, AWS S3, nginx)

Can somebody explain to me the architecture of this website (link to a picture) ? I am struggling to understand the different elements in the front-end section as well as the fields on top, which seem to be related to AWS S3 and CDNs. The backend-section seems clear enough, although I don't understand the memcache. I also don't get why in the front end section an nginx proxy is needed or why it is there.
I am an absolute beginner, so it would be really helpful if somebody could just once talk me through how these things are connected.
Source
Memcache is probably used to cache the results of frequent database queries. It can also be used as a session database so that authenticated users' sessions work consistently across multiple servers, eliminating the need for server affinity (memcache is one of several ways of doing this).
The CDN on the left caches images in its edge locations as they are fetched from S3, which is where they are pushed by the WordPress part of the application. The CDN isn't strictly necessary but improves download performance by caching frequently-requested objects closer to where the viewers are, and lowers transport costs somewhat.
The nginx proxy is an HTTP router that selectively routes certain path patterns to one group of servers and other paths to other groups of servers -- it appears that part of the site is powered by WordPress, and part of it node.js, and part of it is static react code that the browsers need to fetch, and this is one way of separating the paths behind a single hostname and routing them to different server clusters. Other ways to do this (in AWS) are Application Load Balancer and CloudFront, either of which can route to a specific server based on the request path, e.g. /assets/* or /css/*.
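As a rough illustration of what path-based routing means, here is a small sketch. The route table is made up, since the actual patterns are not visible in the diagram, and the backend names are placeholders.

```python
from fnmatch import fnmatchcase

# Hypothetical route table: the first matching pattern wins.
ROUTES = [
    ("/assets/*", "static-react"),
    ("/css/*", "static-react"),
    ("/api/*", "node"),
    ("*", "wordpress"),  # default for everything else
]


def pick_backend(path):
    """Return the server group that should handle this request path."""
    for pattern, backend in ROUTES:
        if fnmatchcase(path, pattern):
            return backend
    return None
```

A request for /assets/app.js would go to the static group and /2019/05/hello-world would fall through to WordPress, all behind one hostname.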

Optimising page loads using a single storage bucket

I have a Google Cloud Storage bucket mywebsite-static. Due to browser restrictions on max parallel HTTP connections, I would like to create multiple DNS records in such a way that I can access files within this bucket using static.mywebsite.com, static2.mywebsite.com, etc.
The docs recommend adding CNAME records, but the bucket name must match the CNAME. Keeping content in the one bucket saves synchronising/updating multiple buckets when the static content changes, and is also much cleaner than storing multiple copies of the same static content.
Is there any way to create multiple DNS records in order to reach a single storage bucket?
Not with GCS alone. However, using Google Cloud Load Balancing, you could set up global forwarding rules which all map to a single backend GCS bucket. This will give you an IP address that you can map to as many DNS names as you like. https://cloud.google.com/compute/docs/load-balancing/http/global-forwarding-rules
The Load Balancer is a powerful tool and can also be used to swap out which GCS bucket you're serving from or let you serve some directories dynamically from GCE or other services.
The downsides are that it may be overkill for your use case, configuring it is somewhat complicated, and global forwarding rules aren't cheap, but it will get the job done. It might be easier to look into other options to improve your site, such as CSS sprite sheets.
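If you do go with several hostnames in front of one bucket, it helps to assign each file to a hostname deterministically so browser caches stay warm. A minimal sketch, assuming both hostnames from the question reach the same content (for example via the load balancer); asset_url is a hypothetical helper.

```python
import zlib

# Hostnames from the question; all assumed to serve the same bucket.
SHARD_HOSTS = ["static.mywebsite.com", "static2.mywebsite.com"]


def asset_url(path):
    """Map an asset path to one shard host, always the same one."""
    index = zlib.crc32(path.encode("utf-8")) % len(SHARD_HOSTS)
    return f"https://{SHARD_HOSTS[index]}/{path.lstrip('/')}"
```

Using a checksum of the path rather than a random choice means a given file is always requested from the same hostname.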
As some of the answers in that thread hint, if you are using HTTP 2.0 the parallel connection issue is not a problem. You can use HTTP 2.0 with GCS, so long as you are accessing it via HTTPS. This means either use https://storage.cloud.google.com/bucket/object, https://bucket.storage.cloud.google.com/object or GCLB+Backend bucket. HTTPS doesn't work with the CNAME as GCS won't have a certificate for that domain.

Honey tokens using AWS API Gateway and Lambda functions

I want to set up fake URLs, or honeytokens, to trick an attacker into following those links, and have a script that automatically blocks the attacker's IPs using AWS WAF.
Security is a big thing these days, and our web infrastructure has already been a target of massive brute-force and DDoS attempts. I want to set up traps so that attackers who use directory traversal attacks can be found. E.g. a common directory listing attack traverses URLs like ../admin, ../wp-admin etc. while scanning a target website. I want to set up a mechanism to get alerted when any of these non-existent URLs is browsed.
Question:
1. Is it possible to redirect part of the web traffic, e.g. www.abc.com/admin, to API Gateway and the remaining www.abc.com traffic to my existing servers?
2. How would I set up DNS entries for this, if it is possible?
3. Is there a different/easier way to achieve this?
Any suggestion is welcome, as I am open to ideas. Thanks
You can first set up a CloudFront distribution with WAF IP blacklisting. Then set up a honeypot using API Gateway and Lambda, which becomes an origin for URLs like /admin and /wp-admin.
The following example of bad bot blocking using honeypots can give you a head start.
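The honeypot function itself needs little logic. Here is a minimal sketch of two pieces it would likely contain: deciding whether a path is a decoy, and formatting the caller's address as a CIDR string for a WAF IP set. The path list and function names are assumptions for illustration, and the actual WAF update call is deliberately left out.

```python
import ipaddress

# Hypothetical decoy paths that no legitimate page links to.
HONEYTOKEN_PATHS = {"/admin", "/wp-admin"}


def is_honeytoken(path):
    """True if the request path is a decoy or sits under one."""
    normalized = "/" + path.strip("/").lower()
    return any(
        normalized == decoy or normalized.startswith(decoy + "/")
        for decoy in HONEYTOKEN_PATHS
    )


def to_cidr(ip):
    """Format a single address in CIDR notation (/32 or /128)."""
    address = ipaddress.ip_address(ip)
    prefix = 32 if address.version == 4 else 128
    return f"{address}/{prefix}"
```

Note that /administrator does not match /admin here; only the decoy itself and paths beneath it count.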
