CloudFront: path routing without trailing slash - amazon-cloudfront

There are two CloudFront behaviors.
If the URL path is /articles/ or /articles/123, the request is routed to the server behind the ELB, but if it is /articles, the request is routed to S3.
I want requests to be routed to the ELB even without the trailing slash.
Can anyone give me some advice?
Should I write a CloudFront Function to redirect, or modify the path pattern, to solve this problem?
Priority  Path pattern  Origin
0         /articles/*   ELB
1         Default (*)   S3

I solved it in the following way: change the CloudFront path pattern to /articles* instead of /articles/*.

You can add a path pattern /articles, pointing to the same ELB origin, with a higher priority than the /articles/* pattern.
Another option is to configure a redirect in S3 from articles to its trailing-slash equivalent.
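To compare the options above, here is a minimal Python sketch that simulates first-match routing. It assumes only the documented wildcard rules for path patterns (* matches zero or more characters, ? matches exactly one); the helper names pattern_to_regex and route are hypothetical, and this is a model, not CloudFront's actual implementation.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a CloudFront-style path pattern: '*' = any run of characters, '?' = exactly one."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

def route(path: str, behaviors: list[tuple[str, str]]) -> str:
    """Return the origin of the first behavior (in priority order) whose pattern matches."""
    for pattern, origin in behaviors:
        if pattern_to_regex(pattern).fullmatch(path):
            return origin
    raise LookupError(f"no behavior matches {path!r}")

# Behaviors listed in priority order; "*" stands for the default behavior.
ORIGINAL = [("/articles/*", "ELB"), ("*", "S3")]
WIDENED = [("/articles*", "ELB"), ("*", "S3")]
EXPLICIT = [("/articles", "ELB"), ("/articles/*", "ELB"), ("*", "S3")]

for path in ("/articles", "/articles/", "/articles/123", "/articles-old"):
    print(path, route(path, ORIGINAL), route(path, WIDENED), route(path, EXPLICIT))
```

One design point the simulation makes visible: /articles* also captures paths such as /articles-old, whereas adding a separate /articles behavior does not. Which one is preferable depends on what else lives under that prefix.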

Related

Is it possible to use AWS Cloudfront to proxy Google Fonts?

I have created an AWS CloudFront distribution in an attempt to proxy requests to fonts.googleapis.com through CloudFront. So for example, I'd like to use something like this:
https://xxxxxx.cloudfront.net/css2?family=Noto+Sans+HK:wght@400;500;700;900&display=swap
To fetch the actual content from the origin at:
https://fonts.googleapis.com/css2?family=Noto+Sans+HK:wght@400;500;700;900&display=swap
I have configured Cloudfront with an origin of "fonts.googleapis.com" and set it so that it passes through all URL parameters, but still the origin responds with:
404. That’s an error.
The requested URL /css2 was not found on this server.
Does anyone know what could be causing this? Afaik, the way I've configured Cloudfront should act like a transparent pass-through.
I can't share all of the Cloudfront config settings here (there are too many), but perhaps someone can point me in the right direction?
Or is this impossible?
This did in fact work fine. I had just set up the CloudFront distribution incorrectly.
I suspect the change the OP referred to was updating the behavior so that the viewer's Host header is no longer sent in the origin request. Create a cache policy (or origin request policy) that does not forward the Host header to the origin; CloudFront then sends the origin's own domain name as Host, and everything should work.
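To check the Host header explanation outside CloudFront, a small Python sketch like the one below can send the same request to the origin with two different Host values. The helper names build_origin_request and main are hypothetical, the font URL is a shortened form of the one in the question, and the status codes you see depend on the origin, so none are asserted here.

```python
import urllib.error
import urllib.request

def build_origin_request(url: str, host_header: str) -> urllib.request.Request:
    """Build a GET request to `url` that sends an explicit Host header."""
    return urllib.request.Request(url, headers={"Host": host_header})

def main() -> None:
    # Shortened version of the URL from the question.
    url = "https://fonts.googleapis.com/css2?family=Noto+Sans+HK&display=swap"
    # First the origin's own name, then the distribution name a forwarded Host would carry.
    for host in ("fonts.googleapis.com", "xxxxxx.cloudfront.net"):
        request = build_origin_request(url, host)
        try:
            with urllib.request.urlopen(request, timeout=10) as response:
                print(host, response.status)
        except urllib.error.HTTPError as err:
            print(host, err.code)
```

Calling main() performs two real network requests; comparing the two printed status codes shows whether the origin rejects the forwarded Host value.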

Google indexing Cloudfront distribution

I have a static site through Cloudfront with an S3 origin & custom domain via Route 53. All works well, except that Google has also indexed the Cloudfront distribution url (d123etc.cloudfront.net) as well as my custom domain, leading to duplicate content issues.
I've tried canonical URLs, but the distribution remains indexed. It has been suggested to serve a different robots.txt depending on which domain is being used, which sounds fine, but there is no .htaccess or web server, leaving it to a Lambda@Edge function to send the different robots.txt.
The problem is that I can't find out how, inside the function, to determine whether a request came through my custom domain or through the distribution URL directly. I've tried whitelisting the Origin header, but it is not sent through when using an S3 origin. I've also tried whitelisting the Referer header, but no referrer is sent when robots.txt is requested, as it's a direct request.
For the time being, I'm adding a meta noindex client-side using JavaScript on page load (which I realise is too late), and also redirecting client-side to my actual domain in case someone follows a Google-indexed cloudfront.net link.
Does anyone know how to detect in Lambda@Edge which domain is being used to make the request? Or some other way of blocking Google from indexing the CloudFront URL, leaving it to index only the custom domain?
So I think the way to do this would be to set up a redirect on your hosted web server. Check the Host header of the request, and if it contains cloudfront.net, send a 301 response pointing to your custom domain name.
S3 has a UI way to do this:
https://medium.com/tensult/how-to-do-site-redirection-using-aws-522a4002c645
It seems you'll need a second bucket behind the same CloudFront URL but without the custom domain. Then you can set it to redirect all requests to your custom domain.
Browsers and bots would then stop using cloudfront.net because it no longer returns any content; they would automatically be redirected (without the user really noticing) to your custom domain, and all the links would point to your own domain.
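For the Lambda@Edge route asked about in the question, a minimal sketch of a viewer-request handler in Python might look like this. It assumes the function is attached to the viewer-request event, where the Host header still carries the name the client used; CANONICAL_HOST is a hypothetical placeholder for the custom domain.

```python
CANONICAL_HOST = "www.example.com"  # hypothetical custom domain, replace with your own

def lambda_handler(event, context):
    """Redirect requests made to the *.cloudfront.net name to the custom domain."""
    request = event["Records"][0]["cf"]["request"]
    # In a viewer-request event, the Host header is the one sent by the client.
    host = request["headers"]["host"][0]["value"].lower()
    if host.endswith(".cloudfront.net"):
        location = f"https://{CANONICAL_HOST}{request['uri']}"
        if request.get("querystring"):
            location += "?" + request["querystring"]
        return {
            "status": "301",
            "statusDescription": "Moved Permanently",
            "headers": {
                "location": [{"key": "Location", "value": location}],
            },
        }
    # Any other host: let the request continue unchanged.
    return request
```

The same host check could instead return a different robots.txt body for the cloudfront.net name, but a 301 also takes care of visitors who follow an already indexed link.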

Redirect urls with trailing slash

My website currently online is completely static and all the URLs have a trailing slash at the end : https://www.website.com/blog/article-1/
I'm working on my new website which is using Prestashop. On Prestashop, URLs don't have a trailing slash : https://www.website.com/blog/article-1
Problem: I have an excellent SEO on my current website and I need to keep the actual URLs (with trailing slash) available. For user experience, I'd like URLs to work with or without trailing slash.
How can I redirect my new URLs to the same URL + trailing slash? If possible, I'd like to rewrite URLs so that users always see the URL with a trailing slash.
Example :
https://www.website.com/blog/article-1/ is redirected to https://www.website.com/blog/article-1 and the URL visible in the address bar is https://www.website.com/blog/article-1/.
Well, ask "How can I redirect my new URLs to the same URL + trailing slash"...
The answer obviously is: by implementing exactly that rule. There are thousands of examples for this alone here on SO. None of those helped? Why not?
Anyway, here is another one:
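RewriteEngine on
# Externally redirect /blog/<slug> to /blog/<slug>/ so the address bar shows the trailing slash
RewriteRule ^/?blog/([^/]+)$ /blog/$1/ [R=301,L]
# Internally rewrite /blog/<slug>/ to /blog/<slug>, which is what the application serves
RewriteRule ^/?blog/([^/]+)/$ /blog/$1 [END]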
With this setup you need to take care to send out references (links) with trailing slashes. Otherwise your site will be dead slow, because clients will have to request every single page twice due to the redirection that is then required for every page.
It is a good idea to start out with a 302 temporary redirection and only change that to a 301 permanent redirection later, once you are certain everything is correctly set up. That prevents caching issues while trying things out...
In case you receive an internal server error (HTTP status 500) using the rule above, chances are that you operate a very old version of the Apache HTTP server. In that case you will see a definite hint about an unsupported [END] flag in your HTTP server's error log file. You can either try to upgrade or use the older [L] flag; it will probably work the same in this situation, though that depends a bit on your setup.
This rule will work likewise in the HTTP server's host configuration or inside a dynamic configuration file (".htaccess" file). Obviously the rewriting module needs to be loaded inside the HTTP server and enabled in the HTTP host. In case you use a dynamic configuration file, you need to take care that its interpretation is enabled at all in the host configuration and that it is located in the host's DOCUMENT_ROOT folder.
And a general remark: you should always prefer to place such rules in the http servers host configuration instead of using dynamic configuration files (".htaccess"). Those dynamic configuration files add complexity, are often a cause of unexpected behavior, hard to debug and they really slow down the http server. They are only provided as a last option for situations where you do not have access to the real http servers host configuration (read: really cheap service providers) or for applications insisting on writing their own rules (which is an obvious security nightmare).
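To see what the two rewrite rules above are meant to do for a given request path, here is a small Python model of the same decision. It is only an illustration of the logic, not a replacement for mod_rewrite; the function name handle and the action labels are hypothetical.

```python
import re

NO_SLASH = re.compile(r"/blog/([^/]+)")     # /blog/<slug>
WITH_SLASH = re.compile(r"/blog/([^/]+)/")  # /blog/<slug>/

def handle(path: str) -> tuple[str, str]:
    """Return (action, target) for a request path, mirroring the two rules."""
    match = NO_SLASH.fullmatch(path)
    if match:
        # External 301: the browser re-requests the URL with the trailing slash.
        return ("redirect", f"/blog/{match.group(1)}/")
    match = WITH_SLASH.fullmatch(path)
    if match:
        # Internal rewrite: served from the slash-less URL, address bar unchanged.
        return ("rewrite", f"/blog/{match.group(1)}")
    return ("pass", path)

for path in ("/blog/article-1", "/blog/article-1/", "/contact"):
    print(path, handle(path))
```

A request without the trailing slash therefore costs two round trips (redirect, then rewrite), which is why links should already include the slash.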
If you mean default PrestaShop links such as products, categories, etc., you can simply change the way they are built. PrestaShop lets you do this in the admin panel under Configure > Shop Parameters > Traffic & SEO > SEO and URLs > Schema of URLs (for PS 1.7).
There, change the route you are interested in, for example "Route to category",
from {id}-{rewrite} to {id}-{rewrite}/. Then you won't need to redirect anything.

DNS redirect of a url to another url

We are trying to identify the best approach for redirecting a URL folder to another URL folder on a separate domain. We have tried a few options but have been unable to make this work. Other redirection options such as Apache, HTML, etc. are not possible. This URL is only accessed by an application, through the browser, to download some files. The application cannot be changed, but it needs to download these files from another location.
Hence, we need to redirect the following:
https://sub1.domain1.com/xyz
to
https://sub2.domain2.com/abc/xyz
Any ideas how we can achieve this?
Note: we have full control of DNS of the domain1 and there are no plans to use this domain.
You can't do that with DNS alone. DNS never sees the "path" part of the URL. You need a web server, aware of the situation, that can respond with a 302 redirect.
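As an illustration of the web server this answer calls for, here is a minimal sketch using Python's standard http.server. It assumes DNS for sub1.domain1.com points at the machine running it; the port and the names redirect_target, RedirectHandler and main are hypothetical. Because the original URL is HTTPS, a real deployment would also need TLS with a certificate for sub1.domain1.com, which this sketch leaves out.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional

SOURCE_PREFIX = "/xyz"                        # folder on sub1.domain1.com
TARGET_BASE = "https://sub2.domain2.com/abc"  # new location, from the question

def redirect_target(path: str) -> Optional[str]:
    """Map /xyz and anything below it to the same path under TARGET_BASE."""
    route = path.partition("?")[0]  # ignore the query string when matching
    if route == SOURCE_PREFIX or route.startswith(SOURCE_PREFIX + "/"):
        return TARGET_BASE + path
    return None

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        target = redirect_target(self.path)
        if target is None:
            self.send_error(404)
            return
        self.send_response(302)
        self.send_header("Location", target)
        self.end_headers()

def main() -> None:
    # Port 8080 is an arbitrary choice for local testing.
    HTTPServer(("", 8080), RedirectHandler).serve_forever()
```

Calling main() starts the server; the downloading application then follows the Location header to the new domain, provided it honours HTTP redirects.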

Cloudfront origin pull from a folder

I have setup a CloudFront origin pull server. It allows me to set a domain name, which I have. This works.
But I don't want the whole domain to be the origin. I want
mydomain.com/folder/subfolder
to be the origin. Also, the cloudfront distribution is CNAMEd to a cdn, which is setup via DNS to cloudfront. This seems to work.
So, basically, instead of this URL:
xyz.cloudfront.net/folder/subfolder/1.jpg
I want this instead:
cdn.mydomain.com/1.jpg
Currently I have achieved, via CNAME and origin pull:
cdn.mydomain.com/folder/subfolder/1.jpg
The question is: on CloudFront how do I setup an origin pull from a folder, not from the main domain name?
The accepted answer is out of date. This is possible using the "Origin Path" setting in AWS which will rewrite the request to a sub-folder on the origin:
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/distribution-web-values-specify.html#DownloadDistValuesOriginPath
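The effect of the Origin Path setting can be pictured with a short Python sketch: the origin path is inserted between the origin domain and the path the viewer requested. The function name origin_url and the https scheme are assumptions made for illustration.

```python
def origin_url(origin_domain: str, origin_path: str, request_path: str) -> str:
    """Compose the URL requested from the origin for a given viewer path."""
    # The origin path starts with "/" and has no trailing "/".
    return f"https://{origin_domain}{origin_path.rstrip('/')}{request_path}"

# Viewer requests cdn.mydomain.com/1.jpg; the origin is asked for the file in the sub-folder.
print(origin_url("mydomain.com", "/folder/subfolder", "/1.jpg"))
```

With the origin path left empty, the viewer path is passed through unchanged, which matches the behavior described in the question.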
I am not aware of a way to do this in CloudFront. However, you can create a virtual host at your origin for subfolder.example.com and have its root directory be the directory you mentioned. Then you could set subfolder.example.com as the origin for the default cache behavior.
