CloudFront - How to forward all request headers to the origin

CloudFront - How to forward all request headers to the origin - amazon-cloudfront

At CloudFront behaviour setting, is "All" the one to forward all request headers to the origin?
Values That You Specify When You Create or Update a Distribution
If you configure CloudFront to forward all headers to your origin for a cache behavior, CloudFront never caches the associated objects. Instead, CloudFront forwards all requests for those objects to the origin. In that configuration, the value of Minimum TTL must be 0.

Yes, it is.
The documentation seems to focus more on caching based on headers and less on what's forwarded, but caching on headers and forwarding headers to the origin go hand-in-hand.
As I was looking for clear citations from the documentation, one reference I found in the Amazon CloudFront Developer Guide is the one shown below. It's a link to a section titled "Cache Based on Selected Request Headers" but its anchor tag is DownloadDistValuesForwardHeaders.
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/distribution-web-values-specify.html#DownloadDistValuesForwardHeaders
This suggests that someone has tried to clarify or simplify the documemtation... with apparently limited success.
Note that this forwards almost all headers to the origin, except for some that are still stripped for security and/or operational reasons, like X-Forwarded-Proto, X-Real-IP, and X-Edge-*.
Note also that if your origin protocol is HTTPS and you were not already whitelisting the Host header at CloudFront, then whitelisting all headers will potentially change the requirements for the origin's TLS certificate. Failure to handle this correctly is one of several reasons why CloudFront might return a 502 error to the viewer.

The layout has changed a bit since this question was asked and answered.
In the "behavior" settings, it is now necessary to select "Legacy cache settings" for these options to be visible. You can select "all" or a specific set of headers to forward. Below are a set of headers that allow websocket connections to work:

Related

Is it possible to use AWS Cloudfront to proxy Google Fonts?

I have created an AWS Cloudfront distribution in an attempt to proxy requests to fonts.googleapis.com through Cloudfront. So for example, I'd like to use something list this:
https://xxxxxx.cloudfront.net/css2?family=Noto+Sans+HK:wght#400;500;700;900&display=swap
To fetch the actual content from the origin at:
https://fonts.googleapis.com/css2?family=Noto+Sans+HK:wght#400;500;700;900&display=swap
I have configured Cloudfront with an origin of "fonts.googleapis.com" and set it so that it passes through all URL parameters, but still the origin responds with:
404. That’s an error.
The requested URL /css2 was not found on this server.
Does anyone know what could be causing this? Afaik, the way I've configured Cloudfront should act like a transparent pass-through.
I can't share all of the Cloudfront config settings here (there are too many), but perhaps someone can point me in the right direction?
Or is this impossible?

This in fact did work fine. I had just setup the CloudFront distribution incorrectly.

I suspect the change OP referred to was to update the behavior to prevent the origin request from sending the host header. Create a cache policy, remove the host header from the origin request and everything will work magically for you.

cloudfront fail to request objects in behavior

I have setup cloudfront, elb and my ec2 web server for default behavior (no caching), everything is working fine. There is only 1 origin (the elb) and the origin path is empty.
Now I want to cache static stuff with cloudfront from the web server (wildfly) like js/css, they're all served in /my-context/assets folder
So i add a new behavior with path pattern '/my-context/assets/*' and default cache settings using the same origin.
This is not working, my request login page return the page html itself, but all css/js are failed. Request to /my-context/assets/a/b/some.css return 502 with "CloudFront wasn't able to connect to the origin."
I also tried to setup a new origin (with the same elb) with path "/my-context/assets" for the new behavior, it also fail.
Can I have instruction on how to make this work? or is this actually not do-able?
Thank you!

The solution is to configure the cache behavior to forward (whitelist) the Host: header to the origin, from the incoming request.
This is not to imply that it's the "correct" configuration in every case, but many times it is desirable, or even required.
When CloudFront makes a back-end https connection to your origin server, the certificate offered by the server has to not only be valid (not expired, not self-signed, issued by a trusted CA, and with an intact intermediate chain) but also has to be valid for the request CloudFront will be sending.
For CloudFront to use HTTPS when communicating with your origin, one of the domain names in the certificate must match one or both of the following values:
• The value that you specified for Origin Domain Name for the applicable origin in your distribution.
• If you configured CloudFront to forward the Host header to your origin, the value of the Host header.
The SSL/TLS certificate on your origin includes a domain name in the Common Name field and possibly several more in the Subject Alternative Names field. (CloudFront supports wildcard characters in certificate domain names.) If your certificate doesn't contain any domain names that match either Origin Domain Name or the domain name in the Host header, CloudFront returns an HTTP status code 502 (Bad Gateway) to the viewer.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/SecureConnections.html#SecureConnectionsHowToRequireCustomProcedure
In your case, you originally were running requests through CloudFront with caching disabled, which is typically done by configuring CloudFront to forward all request headers to the origin, as this automatically disables caching of responses.
Later, when you attempted configure a second cache behavior so that objects matching certain path patterns could be cached, you naturally did not forward all headers to the origin -- but in this case, forwarding the Host: header (which CloudFront refers to as "whitelisting" the header for forwarding) was necessary, because CloudFront appeared to have needed that information in order to validate the certificate that the origin server was presenting.
If you don't forward the Host: header, the the certificate must match the Origin Domain Name, as noted above, and in your case, this us apparently not the case. If the Host: header is not whitelisted for forwarding, then CloudFront still sends a host header in the back-end request, but this header is set to the same value as Origin Domain Name, hence the reason the certificate must match that value.
If matching one way or the other were not required (along with all the other conditions CloudFront imposes on HTTPS connections to the origin), this would prevent CloudFront from determining with reasonable certainty that the back end connection was being handled by the intended server, and that the origin server is genuinely the server it claims to be, which is one of two protections provided by TLS/SSL (the other protection, of course, is the actual encryption of traffic).

Is there a security vulnerability to permit all CORS origins if the request origin matches a pattern?

We had a requirement to permit cross-origin requests provided that the origin was part of the corporate domain. While this question is not C# or Web API specific, the snippets below demonstrate our strategy.
We have a collection of regular expressions defining permitted origins:
private static readonly Regex[] AllowedOriginPatterns = new[]
{
new Regex(#"https?://\w[\w\-\.]*\.company.com",
RegexOptions.Compiled | RegexOptions.IgnoreCase)
};
Later, in our ICorsPolicyProvider attribute, we test the origin. If the origin matches any pattern, we add the origin to the set of allowed origins:
var requestOrigin = request.GetCorsRequestContext().Origin;
if (!string.IsNullOrWhiteSpace(requestOrigin) &&
AllowedOriginPatterns.Any(pattern => pattern.IsMatch(requestOrigin)))
{
policy.Origins.Add(requestOrigin);
}
The perk of this approach is that we can support a large and flexible set of authorized origins, and restrict unauthorized origins, all without placing hundreds of origins in our headers.
We have an issue with HTTP header rewriting on the part of our reverse proxy. Under some circumstances, the reverse proxy replaces the Access-Control-Allow-Origin value with the host. Seems fishy to me, but that is not my question.
It was proposed that we change our code so that if the origin meets our precondition that we return * instead of echoing the origin in the response.
My question is whether we open a potential security vulnerability by returning Access-Control-Allow-Origin: * to origins that match our preconditions. I would hope that browsers check the access control headers with every request. If so, I am not overly concerned. If not, I can imagine an attacker running an untrusted web site taking advantage of browsers that have previously received a * for a trusted origin.
I have read the OWASP article on scrutinizing CORS headers. The article does not reflect any of my concerns, which is reassuring; however, I wanted to inquire of the community before pressing forward.
Update
As it would turn out, this question is not valid. We discovered, after exploring this implementation, that allowing all origins for credentialed requests is disallowed. After troubleshooting our failed implementation, I found two small statements that make this quite clear.
From the W3C Recommendation, Section 6.1:
Note: The string "*" cannot be used for a resource that supports credentials.
From the Mozilla Developer Network:
Important note: when responding to a credentialed request, server must specify a domain, and cannot use wild carding.
This proved ultimately disappointing in our implementation; however, I imagine the complexity of adequately securing credentialed CORS requests would explain why this is a prohibited scenario.
I appreciate all assistance and consideration rendered.

My question is whether we open a potential security vulnerability by returning Access-Control-Allow-Origin: * to origins that match our preconditions.
Potentially, if the browser (or proxy server) caches the response. However, if you are not specifying Access-Control-Allow-Credentials then the risk is minimal.
To mitigate the risk of sending *, make sure you set the appropriate HTTP response headers to prevent caching:
Cache-Control: private, no-cache, no-store, max-age=0, no-transform
Pragma: no-cache
Expires: 0
Note that specifying private is not enough as an attacker could then target someone who has already visited your page and has it in the browser cache.
As an alternative, you might be able to use the Vary: Origin header. However, if some clients don't send the Origin header you might end up with * cached for all requests.
The best approach would be to fix the reverse proxy issue so that your page returns the appropriate Origin as per your code. However, if this is not feasible then * is fine, only if used with an aggressive anti-caching policy. Also, * for all requests may be fine if Access-Control-Allow-Credentials is false or missing - this depends whether your application holds sensitive user based state.

Amazon Cloudfront removes Referer header

I am using Amazon CloudFront to deliver some HDS files. I have an origin server which check the HTTP HEADER REFERER and in case is no allowed it block it.
The problem is that cloud front is removing the referer header, so it is not forwarded to the origin.
Is it possible to tell Amazon not to do it?

Within days of writing the answer below, changes have been announced to Cloudfront. Cloudfront will now pass through headers you select and can add some headers of its own.
However, much of what I stated below remains true. Note that in the announcement, an option is offered to forward all headers which, as I suggested, would effectively disable caching. There's also an option to forward specific headers, which will cause Cloudfront to cache the object against the complete set of forwarded headers -- not just the uri -- meaning that the effectiveness of the cache is somewhat reduced, since Cloudfront has no option but to assume that the inclusion of the header might modify the response the server will generate for that request.
Each of your CloudFront distributions now contains a list of headers that are to be forwarded to the origin server. You have three options:
None - This option requests the original behavior.
All - This option forwards all headers and effectively disables all caching at the edge.
Whitelist - This option give you full control of the headers that are to be forwarded. The list starts out empty, and grows as you add more headers. You can add common HTTP headers by choosing them from a list. You can also add "custom" headers by simply entering the name.
If you choose the Whitelist option, each header that you add to the list becomes part of the cache key for the URLs associated with the distribution. Adding a header to the list simply tells CloudFront that the value of the header can affect the content returned by the origin server.
http://aws.amazon.com/blogs/aws/enhanced-cloudfront-customization/
Cloudfront does remove the Referer header along with several others that are not particularly meaningful -- or whose presence would cause illogical consequences -- in the world of cached content.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/RequestAndResponseBehaviorCustomOrigin.html
Just like cookies, if the Referer: header were allowed to remain, such that the origin could see it and react to it, that would imply that the object should be cached based on the request plus the referring page, which would seem to largely defeat the cachability of objects. Otherwise, if the origin did react to an undesired referer and send no-cache responses, that would be all well and good until the first legitimate request came in, the response to which would be served to subsequent requesters regardless of their referer, also largely defeating the purpose.
RFC-2616 Section 13 requires that a cache return a response that has been "checked for equivalence with what the origin server would have returned," and this implies that the response be valid based on all headers in the request.
The same thing goes for User-agent and other headers an origin server might use to modify its response... if you need to react to these values at the origin, there's little obvious purpose for serving them with a CDN.
Referring page-based tests are quite a primitive measure, the way many people use them, since headers are so trivial to forge.
If you are dealing with a platform that you don't control, and this is something you need to override (with a dummy value, just to keep the existing system "happy,") then a reverse proxy in front of the origin server could serve such a purpose, with Cloudfront using the reverse proxy as its origin.

In today's newsletter amazon announced that it is now possible to forward request headers with cloudfront. See: http://aws.amazon.com/de/about-aws/whats-new/2014/06/26/amazon-cloudfront-device-detection-geo-targeting-host-header-cors/

Is there any way to identify requests coming to custom origin server from CloudFront?

I'm using CloudFront with custom origin and want to redirect certain requests coming to a web app to CloudFront (clients use direct URLs, which cannot be changed to CloudFront-based URLs). In order to ensure that cache on CloudFront is updated properly, I must not redirect requests coming from CloudFront itself. Is there any way to identify such requests on origin server?
Does CloudFront add any custom headers to requests sent to origin server? Or is there any other reliable way to determine that requests is coming from CloudFront?

yes you can identify requests coming to your origin server from cloudfront by checking the useragent. the user agent would be 'Amazon CloudFront'

Update
It's an old question, but my update useful for someone research or looking for the new solution.
Recently AWS added new feature Origin Custom Headers.You can set a header with a secret value and check it on your origin server by the web server or your applications.

Update
Avinash Bijja correctly pointed out (+1) that the HTTP User-agent header would be 'Amazon CloudFront' for requests coming from Amazon CloudFront servers. Unfortunately this doesn't seem to be explicitly documented indeed, but is implicitly acknowledged by various posts in the respective forum, see e.g. the AWS Team response to User Agent String - does CF overwrite the user agent string?:
You are correct. The User-Agent field is always populated as "Amazon CloudFront".
However, it turns out this is not currently entirely reliable, insofar CloudFront sends an empty User-Agent to the origin if one is missing in the originating client request already:
I can confirm that CloudFront is not sending a User-Agent to the
origin when the original client does not send a User-Agent. We have
enhancements & fixes to User-Agent handling on our backlog, but no
release dates at this time. I've sent you a PM with further details.
These enhancements & fixes are apparently not rolled out still as of February 07 2013 at least.
These enhancements & fixes have been rolled out as of August 05 2013 (thanks webbiedave for the update!).
Initial Answer
Does CloudFront add any custom headers to requests sent to origin
server?
One would think so indeed, but at least they don't appear to be documented where I would have expected it, namely in How CloudFront Processes and Forwards Requests to Your Custom Origin Server. Given you are in control of the origin server, you might just check its HTTP access logs though?
Or is there any other reliable way to determine that requests is
coming from CloudFront?
You'll need to judge the reliability yourself, but The IP address that CloudFront forwards to the origin server is the IP addresses of a CloudFront server, not the IP address of the end user's computer. - consequently you could restrict access to the published Amazon CloudFront Public IP Ranges; however, be aware of the respective disclaimer:
The CloudFront IP addresses change frequently and we cannot guarantee
advance notice of changes. On a best-effort basis, we will provide the
list of current addresses. Customers should not use these addresses
for mission critical applications and must never hard code them in DNS
names. [emphasis mine]
Consequently you'll need to monitor this forum/post to take notice of respective changes as early as possible (if this constraint is acceptable for your use case in the first place of course).

CloudFront appears to add a X-Amz-Cf-Id header to every request before forwarding it to the origin. At least, it currently is doing that for me.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/RequestAndResponseBehaviorCustomOrigin.html#request-custom-headers-behavior

This should probably be a comment on Reza's answer, but I can't do that :).
For completeness, here's the link to the official documentation regarding Forwarding Custom Headers, which currently claims the following.
You can configure CloudFront to include custom headers whenever it forwards a request to your origin. You can specify the names and values of custom headers for each origin, both for custom origins and for Amazon S3 buckets. Custom headers have a variety of uses, such as the following:
You can identify the requests that are forwarded to your custom origin by CloudFront. This is useful if you want to know whether users are bypassing CloudFront or if you're using more than one CDN and you want information about which requests are coming from each CDN. (If you're using an Amazon S3 origin and you enable Amazon S3 server access logging, the logs don't include header information.)

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string