CloudFront multiple origins configuration - amazon-cloudfront

I would like to start using CF on my website hosted here. I have a few questions that maybe some of you AWS folks can answer; thanks for your help.
My website has multiple origins located on different servers and different ELBs. I would like to configure CF to pull content from these origins and serve it under different policies, which would let me remove the reverse-proxy layer I have in place right now and also cache some content.
Below my idea about the CF configuration:
Origins:
www.pippo.com > XXX.cloudfront.net
Origin 1: pluto.pippo.com = xxx.elb1.aws.amazon.com > WC
Origin 2: paperino.pippo.com = xxx.elb2.aws.amazon.com > WC
Origin 3: minnie.pippo.com = Apache\Nginx\Tomcat
Behaviors:
Origin 1: pluto.pippo.com/*.jpg cache
Origin 1: pluto.pippo.com/*.png cache
Origin 1: pluto.pippo.com/*.* cache
Origin 1: pluto.pippo.com Default(*) NON cache
Origin 2: paperino.pippo.com/paywall/* NON cache
Origin 2: paperino.pippo.com/*.png cache
Origin 2: paperino.pippo.com/*.jpg cache
Origin 2: paperino.pippo.com Default(*) NON cache
Origin 3: minnie.pippo.com/*.jpg cache
Origin 3: minnie.pippo.com/*.png cache
Origin 3: minnie.pippo.com/*.* cache
Origin 3: minnie.pippo.com Default(*) NON cache
Questions:
When my users open www.pippo.com, CF will serve the cached content (*.jpg and *.png in the example), and everything not specified in the behaviors will be requested directly (using the Default(*) policy) from the ELB in front of the webcaches. Correct? Will that request come from CF or from the user?
How can I prevent users from going directly to pluto.pippo.com? Just a 301 with an exception for the CF subnet?
Will sticky sessions be maintained with this configuration?
Sorry for the newbie question.
Thanks for any help.

This may not completely answer your questions, however:
Sticky sessions: You'll need to forward all cookies (or whitelist the AWSELB cookie) for your default rule (assuming this captures your page requests; if your pages have file extensions, they'd be caught by the *.* rule instead).
Presumably you don't want your cached content (*.jpg, *.png, *.*) to obey sticky sessions.
Preventing users from accessing the origin: There isn't a really effective way to do this for a custom origin. You can try security through obscurity (create an obscure origin domain) and / or only allow direct requests with an 'Amazon CloudFront' User-Agent.
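To make the behavior rules above more concrete, here is a minimal Python sketch of first-match path-pattern selection. It assumes what the CloudFront documentation describes: behaviors are checked in the order they are listed, `*` matches zero or more characters, `?` matches exactly one, and the leading slash is optional. Patterns match the request path inside one distribution, not the origin hostname. The behavior names and the list below are made up for illustration.

```python
import re

def pattern_to_regex(path_pattern: str) -> "re.Pattern[str]":
    """Translate a path pattern ('*' and '?' wildcards only) into a regex."""
    parts = []
    for ch in path_pattern:
        if ch == "*":
            parts.append(".*")   # zero or more characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def pick_behavior(request_path: str, behaviors: list) -> str:
    """Return the name of the first behavior whose pattern matches the path."""
    for pattern, name in behaviors:
        # A leading slash in the pattern is optional, so normalize it.
        normalized = pattern if pattern.startswith("/") else "/" + pattern
        if pattern_to_regex(normalized).fullmatch(request_path):
            return name
    return "default (*)"

# Hypothetical ordered behavior list (names are illustrative only).
BEHAVIORS = [
    ("/paywall/*", "origin2-no-cache"),
    ("*.jpg", "origin1-cache"),
    ("*.png", "origin1-cache"),
]
```

With this list, `/paywall/photo.jpg` goes to the paywall behavior because it is listed first, and anything that matches no pattern falls through to the default behavior.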

Related

Is it possible to use AWS Cloudfront to proxy Google Fonts?

I have created an AWS CloudFront distribution in an attempt to proxy requests to fonts.googleapis.com through CloudFront. So for example, I'd like to use something like this:
https://xxxxxx.cloudfront.net/css2?family=Noto+Sans+HK:wght@400;500;700;900&display=swap
To fetch the actual content from the origin at:
https://fonts.googleapis.com/css2?family=Noto+Sans+HK:wght@400;500;700;900&display=swap
I have configured Cloudfront with an origin of "fonts.googleapis.com" and set it so that it passes through all URL parameters, but still the origin responds with:
404. That’s an error.
The requested URL /css2 was not found on this server.
Does anyone know what could be causing this? Afaik, the way I've configured Cloudfront should act like a transparent pass-through.
I can't share all of the Cloudfront config settings here (there are too many), but perhaps someone can point me in the right direction?
Or is this impossible?
This did in fact work fine. I had just set up the CloudFront distribution incorrectly.
I suspect the change the OP referred to was updating the behavior so that the viewer's Host header is not forwarded to the origin. Create a cache policy that does not include the Host header (and make sure no origin request policy forwards it); CloudFront then sends the origin's own domain name as Host, and the request should work.
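For reference, here is a rough Python sketch of what such a cache policy could look like when built for boto3's `create_cache_policy` call. The policy name, comment, and TTL values are assumptions chosen for illustration; the important parts are that no headers (so no Host header) are included and that all query strings are passed through, as described in the question.

```python
def build_cache_policy_config(name: str) -> dict:
    """Build a CachePolicyConfig that forwards query strings but no headers."""
    return {
        "Name": name,
        "Comment": "Proxy policy: no viewer headers, all query strings",
        # TTL values below are illustrative assumptions, not recommendations.
        "MinTTL": 0,
        "DefaultTTL": 86400,
        "MaxTTL": 31536000,
        "ParametersInCacheKeyAndForwardedToOrigin": {
            "EnableAcceptEncodingGzip": True,
            "EnableAcceptEncodingBrotli": True,
            # "none" means the viewer's Host header is not sent to the origin.
            "HeadersConfig": {"HeaderBehavior": "none"},
            "CookiesConfig": {"CookieBehavior": "none"},
            # Pass ?family=...&display=swap through unchanged.
            "QueryStringsConfig": {"QueryStringBehavior": "all"},
        },
    }

# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# boto3.client("cloudfront").create_cache_policy(
#     CachePolicyConfig=build_cache_policy_config("google-fonts-proxy"))
```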

How to invalidate with custom cache policy

Context
I have a distribution where I added the Host header to the cache policy.
These 2 domains point to the same distribution:
www.site1.com/pageA
www.site2.com/pageA
These 2 hosts each have their own cache entry. In this setup, I have a custom origin-response Lambda@Edge function that returns different content based on the host.
The question:
I'm used to invalidating based on the path, e.g. /pageA.
How should I format my invalidation if I want to invalidate pageA only for site1?
It is currently not possible to invalidate by domain.
Cloudfront invalidation is by path only.
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Invalidation.html
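As a small illustration, this is roughly what an invalidation request looks like when built for boto3's `create_invalidation`. The batch contains only paths and a caller reference, with no field for a host name, so invalidating `/pageA` would be expected to clear the cached copies for both www.site1.com and www.site2.com. The helper name and the distribution ID in the usage comment are placeholders.

```python
import time

def build_invalidation_batch(paths, caller_reference=None) -> dict:
    """Build an InvalidationBatch; note there is no host or domain field."""
    if caller_reference is None:
        # Any string that is unique per request works as a caller reference.
        caller_reference = f"invalidate-{time.time_ns()}"
    items = list(paths)
    return {
        "Paths": {"Quantity": len(items), "Items": items},
        "CallerReference": caller_reference,
    }

# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# boto3.client("cloudfront").create_invalidation(
#     DistributionId="EXAMPLEDISTID",  # placeholder
#     InvalidationBatch=build_invalidation_batch(["/pageA"]))
```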

Why wordpress images only work with only one domain

I have created many WordPress sites and there is something I was never able to fix.
When you have two domains for the same website (such as www.example.com and www.example.fr), only one displays correctly and the alternative doesn't show its images.
I guess this is a common problem that might happen to a lot of you. Any ideas to help me fix it?
First, check that both WordPress Address (URL) and Site Address (URL) are set properly in
wp-admin/ >> Settings >> General
If that is not the case, see the error messages in the console:
(index):1 Font from origin 'http://draidel.com' has been blocked from
loading by Cross-Origin Resource Sharing policy: No
'Access-Control-Allow-Origin' header is present on the requested
resource. Origin 'http://draidel.com.ar' is therefore not allowed
access.
You can resolve this by adding the following to your .htaccess (the value must be the full origin of the requesting site, including the scheme):
Header set Access-Control-Allow-Origin "http://draidel.com.ar"
You may need to change the permissions of .htaccess, as WordPress tends to rewrite it.
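If both domains need to load each other's assets, a single fixed header value is not enough, because Access-Control-Allow-Origin accepts only one origin (or `*`). A common approach is to echo the request's Origin header back when it is on an allowlist. Here is a minimal Python sketch of that logic; the allowlist and function name are assumptions for illustration, and in practice this check would live in the web server or application configuration.

```python
# Hypothetical allowlist: full origins, scheme included.
ALLOWED_ORIGINS = {"http://draidel.com", "http://draidel.com.ar"}

def cors_headers(request_origin):
    """Return CORS response headers for an allowed origin, else nothing."""
    if request_origin in ALLOWED_ORIGINS:
        return {
            "Access-Control-Allow-Origin": request_origin,
            # Tell caches that the response differs per requesting origin.
            "Vary": "Origin",
        }
    return {}
```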

Cloudfront origin pull from a folder

I have set up a CloudFront origin pull server. It allows me to set a domain name, which I have done. This works.
But I don't want the whole domain to be the origin. I want
mydomain.com/folder/subfolder
to be the origin. Also, the CloudFront distribution is CNAMEd to a cdn subdomain, which is set up via DNS to point to CloudFront. This seems to work.
So, basically, instead of this URL:
xyz.cloudfront.net/folder/subfolder/1.jpg
I want this instead:
cdn.mydomain.com/1.jpg
Currently I have achieved, via CNAME and origin pull:
cdn.mydomain.com/folder/subfolder/1.jpg
The question is: on CloudFront, how do I set up an origin pull from a folder, not from the main domain name?
The accepted answer is out of date. This is possible using the "Origin Path" setting in AWS which will rewrite the request to a sub-folder on the origin:
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/distribution-web-values-specify.html#DownloadDistValuesOriginPath
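To show the effect of the setting, here is a tiny Python sketch of the rewrite: CloudFront appends the Origin Path to the origin domain before the request path. The function name and the `http` default are mine, purely for illustration.

```python
def origin_url(origin_domain: str, origin_path: str, request_path: str,
               scheme: str = "http") -> str:
    """Compose the URL fetched from the origin for a given viewer path."""
    # Origin Path starts with '/' and has no trailing '/'; strip one defensively.
    return f"{scheme}://{origin_domain}{origin_path.rstrip('/')}{request_path}"
```

So a viewer request for cdn.mydomain.com/1.jpg with Origin Path `/folder/subfolder` is fetched from mydomain.com/folder/subfolder/1.jpg.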
I am not aware of a way to do this in CloudFront. However, you can create a virtual host at your origin for subfolder.example.com and have its root directory be the directory you mentioned. Then you could set subfolder.example.com as your origin for the default cache behavior.

Is there any way to identify requests coming to custom origin server from CloudFront?

I'm using CloudFront with custom origin and want to redirect certain requests coming to a web app to CloudFront (clients use direct URLs, which cannot be changed to CloudFront-based URLs). In order to ensure that cache on CloudFront is updated properly, I must not redirect requests coming from CloudFront itself. Is there any way to identify such requests on origin server?
Does CloudFront add any custom headers to requests sent to the origin server? Or is there any other reliable way to determine that a request is coming from CloudFront?
Yes, you can identify requests coming to your origin server from CloudFront by checking the User-Agent. The User-Agent will be 'Amazon CloudFront'.
Update
It's an old question, but this update may be useful for anyone researching this or looking for a newer solution.
AWS has since added a new feature, Origin Custom Headers. You can set a header with a secret value and check it on your origin server, either in the web server or in your application.
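Here is a minimal Python sketch of the check on the origin side. The header name `X-Origin-Verify` and the function name are assumptions; use whatever name and secret you configured as the origin custom header. The comparison uses `hmac.compare_digest` so that it runs in constant time.

```python
import hmac

def is_from_cloudfront(headers: dict, expected_secret: str,
                       header_name: str = "X-Origin-Verify") -> bool:
    """Check that the request carries the secret custom header value."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {key.lower(): value for key, value in headers.items()}
    supplied = lowered.get(header_name.lower(), "")
    return hmac.compare_digest(supplied.encode(), expected_secret.encode())
```

Requests that fail the check can then be redirected to the CloudFront URL, while requests that pass are served normally so CloudFront can refresh its cache.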
Update
Avinash Bijja correctly pointed out (+1) that the HTTP User-agent header would be 'Amazon CloudFront' for requests coming from Amazon CloudFront servers. Unfortunately this doesn't seem to be explicitly documented indeed, but is implicitly acknowledged by various posts in the respective forum, see e.g. the AWS Team response to User Agent String - does CF overwrite the user agent string?:
You are correct. The User-Agent field is always populated as "Amazon CloudFront".
However, it turns out this is not currently entirely reliable, insofar as CloudFront sends an empty User-Agent to the origin if the originating client request did not include one:
I can confirm that CloudFront is not sending a User-Agent to the
origin when the original client does not send a User-Agent. We have
enhancements & fixes to User-Agent handling on our backlog, but no
release dates at this time. I've sent you a PM with further details.
These enhancements & fixes had apparently still not been rolled out as of February 07 2013, but they have been rolled out as of August 05 2013 (thanks webbiedave for the update!).
Initial Answer
Does CloudFront add any custom headers to requests sent to origin
server?
One would think so indeed, but at least they don't appear to be documented where I would have expected it, namely in How CloudFront Processes and Forwards Requests to Your Custom Origin Server. Given you are in control of the origin server, you might just check its HTTP access logs though?
Or is there any other reliable way to determine that requests is
coming from CloudFront?
You'll need to judge the reliability yourself, but The IP address that CloudFront forwards to the origin server is the IP addresses of a CloudFront server, not the IP address of the end user's computer. - consequently you could restrict access to the published Amazon CloudFront Public IP Ranges; however, be aware of the respective disclaimer:
The CloudFront IP addresses change frequently and we cannot guarantee
advance notice of changes. On a best-effort basis, we will provide the
list of current addresses. Customers should not use these addresses
for mission critical applications and must never hard code them in DNS
names. [emphasis mine]
Consequently you'll need to monitor this forum/post to take notice of respective changes as early as possible (if this constraint is acceptable for your use case in the first place of course).
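If you do go down the IP route, the check itself is simple. Below is a Python sketch that assumes the range list has the shape of AWS's published ip-ranges.json file (a `prefixes` array whose entries carry `ip_prefix` and `service`). The sample data uses documentation-only address blocks and is made up; real ranges must be fetched and refreshed regularly, given the disclaimer above.

```python
import ipaddress

def cloudfront_networks(ip_ranges: dict) -> list:
    """Extract the IPv4 networks tagged with the CLOUDFRONT service."""
    return [
        ipaddress.ip_network(prefix["ip_prefix"])
        for prefix in ip_ranges.get("prefixes", [])
        if prefix.get("service") == "CLOUDFRONT"
    ]

def is_cloudfront_ip(address: str, networks: list) -> bool:
    """Return True if the address falls inside any of the given networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in networks)

# Made-up sample in the ip-ranges.json shape (documentation address blocks).
SAMPLE_RANGES = {
    "prefixes": [
        {"ip_prefix": "203.0.113.0/24", "service": "CLOUDFRONT"},
        {"ip_prefix": "198.51.100.0/24", "service": "EC2"},
    ]
}
```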
CloudFront appears to add an X-Amz-Cf-Id header to every request before forwarding it to the origin. At least, it currently is doing that for me.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/RequestAndResponseBehaviorCustomOrigin.html#request-custom-headers-behavior
This should probably be a comment on Reza's answer, but I can't do that :).
For completeness, here's the link to the official documentation regarding Forwarding Custom Headers, which currently claims the following.
You can configure CloudFront to include custom headers whenever it forwards a request to your origin. You can specify the names and values of custom headers for each origin, both for custom origins and for Amazon S3 buckets. Custom headers have a variety of uses, such as the following:
You can identify the requests that are forwarded to your custom origin by CloudFront. This is useful if you want to know whether users are bypassing CloudFront or if you're using more than one CDN and you want information about which requests are coming from each CDN. (If you're using an Amazon S3 origin and you enable Amazon S3 server access logging, the logs don't include header information.)

Resources