CloudFront Modify JS / CSS Content - amazon-cloudfront

My website's theme is broken when I serve JS and CSS via CloudFront. Further troubleshooting shows that some JS and CSS contents differ from the origin, and I suspect this is the reason. Is it possible that CF has some kind of optimization feature that modifies our JS/CSS content? If yes, how can we disable it or fix this problem?
I don't believe it is a caching problem, because the origin's files haven't changed since CF was enabled. I've also tried invalidating /wp-content/uploads/sites/2386/bb-plugin/cache/*, but I still get the same behavior. As shown in the screenshot below, I've also set query string forwarding to "Forward all, cache based on all".
Below are the JS and CSS files that differ between the origin and CF, followed by a screenshot of my CF settings:
JS
(Origin) https://www.seeustosee.com/wp-content/uploads/sites/2386/bb-plugin/cache/2650-layout.js?ver=774d199e19697e00bc26b83ff78afa2c
(CF) https://da4e1j5r7gw87.cloudfront.net/wp-content/uploads/sites/2386/bb-plugin/cache/2650-layout.js?ver=774d199e19697e00bc26b83ff78afa2c
CSS
(Origin) https://www.seeustosee.com/wp-content/uploads/sites/2386/bb-plugin/cache/2650-layout.css?ver=774d199e19697e00bc26b83ff78afa2c
(CF) https://da4e1j5r7gw87.cloudfront.net/wp-content/uploads/sites/2386/bb-plugin/cache/2650-layout.css?ver=774d199e19697e00bc26b83ff78afa2c
CF Behavior Settings
https://imgur.com/XiPDq0X

CloudFront does not modify payload. Even when Compress Objects Automatically is enabled (which it isn't), the compression is transparent gzip that results in a response body identical to the original, after decompression.
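To illustrate the point about transparent compression, here is a minimal Python sketch showing that gzip is lossless: the bytes a client sees after decompression are the bytes the origin sent. The sample payload is made up for illustration and is not taken from the site in question.

```python
import gzip

# Hypothetical CSS payload standing in for a file served by the origin.
original_body = b".fl-row { margin: 0 auto; }\n" * 50

# What a compressing proxy would put on the wire.
compressed_body = gzip.compress(original_body)

# What the browser hands to the page after undoing Content-Encoding: gzip.
decoded_body = gzip.decompress(compressed_body)

# True when compression changed nothing about the delivered content.
round_trip_is_lossless = decoded_body == original_body
```

So a body that differs after decompression points to a different source file, not to compression.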
But take a look at your response headers, and you'll see the problem. Your origin server is Nginx, but you don't have CloudFront configured to use that server as the origin for these requests. You have CloudFront sending the requests to an Amazon S3 bucket. The JS file there is from August 28, 2019.
Content-Type: application/javascript
Content-Length: 18371
Date: Fri, 31 Jan 2020 02:21:42 GMT
Last-Modified: Wed, 28 Aug 2019 06:53:02 GMT
Server: AmazonS3
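As a rough sketch of the check described above, the following Python snippet parses the response headers quoted in this answer and pulls out the two fields that give the origin away. The helper names `parse_headers` and `describe_origin` are invented for this example.

```python
from email.utils import parsedate_to_datetime

# Response headers quoted in the answer above.
RAW_HEADERS = """\
Content-Type: application/javascript
Content-Length: 18371
Date: Fri, 31 Jan 2020 02:21:42 GMT
Last-Modified: Wed, 28 Aug 2019 06:53:02 GMT
Server: AmazonS3
"""


def parse_headers(raw):
    """Turn 'Name: value' lines into a dict with lower-cased names."""
    headers = {}
    for line in raw.splitlines():
        if ":" not in line:
            continue
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def describe_origin(headers):
    """Return the Server header and the parsed Last-Modified timestamp."""
    server = headers.get("server", "unknown")
    last_modified = headers.get("last-modified")
    modified_at = parsedate_to_datetime(last_modified) if last_modified else None
    return server, modified_at


server, modified_at = describe_origin(parse_headers(RAW_HEADERS))
```

Running the same check against the origin URL and the CloudFront URL should show whether both are answered by the same server software.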

Related

Implement full HTML page caching on CDN

We are trying to implement full-page HTML caching using a CDN on our Kentico portal engine site. To do this, we need to set the Cache-Control of documents, not only assets, to "public". I tried adding the code below to the BeginRequest event in my global.asax to test it, but for some reason the document's Cache-Control response header is always set to no-cache. Did Kentico set it intentionally? I would think so, because it has its own built-in caching mechanism, but if we want to use a CDN we need to set the cache to public. Is there a way to override this?
Response.Cache.SetCacheability(HttpCacheability.Public);
Response.Cache.SetMaxAge(new TimeSpan(1, 0, 0));
I also tried modifying the PortalTemplate.aspx.cs to add cache-control meta tag but it also did not work.
tags.Text += "<meta http-equiv=\"cache-control\" content=\"public\" />";
The response header is always
cache-control:no-cache, must-revalidate
content-encoding:deflate
content-type:text/html; charset=utf-8
date:Fri, 02 Mar 2018 18:38:03 GMT
expires:-1
pragma:no-cache
server:Microsoft-IIS/10.0
status:200
vary:Accept-Encoding
x-aspnet-version:4.0.30319
x-frame-options:SAMEORIGIN
x-powered-by:ASP.NET
I was able to override it in PreSendRequestHeaders event in global.asax.
protected void Application_PreSendRequestHeaders(Object source, EventArgs e)
{
    //removed some code for brevity
    var headers = Response.Headers;
    headers.Remove("cache-control");
    headers.Remove("pragma");
    headers.Remove("expires");
    headers.Remove("set-cookie");
    headers.Add("cache-control", "public, max-age=" + TimeSpan.FromHours(1).TotalSeconds.ToString());
}
Adding in a great article for static sites by one of the MVPs
https://www.kenticotricks.com/blog/static-sites-with-kentico-cloud

Caching effect on CORS: No 'Access-Control-Allow-Origin' header is present on the requested resource

The short version of this issue is we are seeing the typical CORS error (x has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.) however we are absolutely sending the specified headers. The requests are fine to begin with however after n (pattern undetermined) amount of time SOME (no real pattern to this other than it's a random 1 or 2 assets referenced in the html file) requests will suddenly start failing. On a hard refresh or with disabling cache, the issue is resolved.
We're wondering how caching may affect CORS in this case? Or if the issue lies elsewhere?
What we see is the asset is loaded fine in the first instance.
Here's a cURL representation of what the browser (chrome, not tested elsewhere) sends to the server (cloudfront in front of s3):
curl -I 'https://assets-frontend.kalohq.ink/style.allapps.add899080acbbeed5bb6a7301d234b65.css' -H 'Referer: https://lystable.kalohq.ink/projects/2180?edit=true' -H 'Origin: https://lystable.kalohq.ink' -H 'DPR: 2' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gec
And the headers in response to this look like:
HTTP/1.1 200 OK
Content-Type: text/css
Content-Length: 5632
Connection: keep-alive
Date: Wed, 28 Jun 2017 09:23:04 GMT
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET
Access-Control-Max-Age: 3000
Last-Modified: Wed, 28 Jun 2017 09:16:15 GMT
ETag: "ece4babc2509d989254638493ff4c742"
Cache-Control: max-age=31556926
Content-Encoding: gzip
Accept-Ranges: bytes
Server: AmazonS3
Vary: Origin,Access-Control-Request-Headers,Access-Control-Request-Method
Age: 3384
X-Cache: Hit from cloudfront
Via: 1.1 adc13b6f5827d04caa2efba65479257c.cloudfront.net (CloudFront)
X-Amz-Cf-Id: PcC2qL04aC4DPtNuwCudckVNM3QGhz4jiDL10IDkjIBnCOK3hxoMoQ==
After this you can browse the site for a while, refresh a few times and everything is fine and dandy.
But then you might refresh and suddenly you see the error in console:
Access to CSS stylesheet at 'https://assets-frontend.kalohq.ink/style.allapps.add899080acbbeed5bb6a7301d234b65.css' from origin 'https://kalohq.ink' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'https://kalohq.ink' is therefore not allowed access.
At this point if you hard-refresh or disable the cache and reload the page everything goes back to working. This is why we're pointing at browser caching behaviour playing with CORS at this point.
The HTML file loading these assets is as follows:
<!doctype html><html lang="en"><head><meta charset="utf-8"><meta http-equiv="X-UA-Compatible" content="IE=edge"><title>Kalo</title><meta name="description" content="Kalo is used by the best teams on the planet to onboard, manage, and pay their freelancers. "><meta name="viewport" content="width=device-width,initial-scale=1"><meta http-equiv="Accept-CH" content="Width,DPR,Save-Data"><script>window.performance&&"function"==typeof window.performance.mark&&window.performance.mark("start load bootstrap"),console.log("Kalo v0.214.1 🎉")</script><script type="text/javascript" crossorigin="anonymous">window.webpackManifest={0:"moment-timezone-data.8189aab661847dea1b73.chunk.js",1:"1.7645e36f0742ed31139b.chunk.js",2:"2.bf0a1c9b400d715e3138.chunk.js",3:"3.d077b7a1cede6f6960e6.chunk.js",4:"4.0bbd51f182d8fa3f4951.chunk.js",5:"5.1dcf124ea7874546fc7a.chunk.js",6:"6.85ee04326ef5cfe2c084.chunk.js",7:"7.cf718eabaa3814fcb47c.chunk.js",8:"8.4c4c5b070e09afe037a1.chunk.js",9:"9.ba3b9a5f540f057fca46.chunk.js",10:"10.3c850061770df8801575.chunk.js",11:"11.df971dd9c4ab435fd421.chunk.js",12:"12.81905afa591a4796dcfc.chunk.js",13:"13.0f78c0c77d45cd79ac26.chunk.js",14:"14.f8f9f24d15e1cc4372a1.chunk.js",15:"15.6badd92530b5da668e98.chunk.js",16:"16.ef87b8dc2f87ca2d40a1.chunk.js",17:"17.bf842b852470057c4f0b.chunk.js",18:"18.f091321e6a0bbf16bf1f.chunk.js",19:"19.0297861a162b49308887.chunk.js",20:"20.7281da4b01eb4eb4bf1f.chunk.js",21:"21.781ca5137a9c76031df2.chunk.js",22:"22.c7dfd45fc0bd41c7618d.chunk.js",23:"23.8c4885794fd57453884a.chunk.js",24:"24.1447090b6f41a311414e.chunk.js",25:"25.021a38e680888fe2ac7e.chunk.js",26:"26.1afe06be0d6164d3409a.chunk.js",27:"27.dc70b696039ad4762a3b.chunk.js",28:"28.8c383709ce92ecae6b0c.chunk.js",29:"29.f594eb538f606ae17c50.chunk.js",30:"30.a2c1dfc70e0fac57b2a4.chunk.js",31:"31.2eaee95b85227b23ccd8.chunk.js",32:"32.528e99c8151fef966483.chunk.js",33:"33.c3b7530ab92bc1280136.chunk.js",34:"34.1eb5635dc498ad450839.chunk.js",35:"35.e71c1e7bc6092ff2a35f.chunk.js",36:"36.0d1
74c67ddb177944140.chunk.js",37:"37.af1c6ed4cde9120da636.chunk.js",38:"38.fb0dd22a16e7b597ef93.chunk.js",39:"39.c17f705a3438de3dc997.chunk.js",40:"40.d509fa240e2adf2888aa.chunk.js",41:"41.37d2f0e0e06a3c7d816b.chunk.js",42:"42.4febbf78adc3084afec3.chunk.js",43:"43.7aa48b320fcf69adb0a3.chunk.js",44:"44.5e6da9391c7412910447.chunk.js",45:"45.a17d5b7c5e534f260841.chunk.js",46:"46.a1d3a7790959ac892ed0.chunk.js",47:"47.241627b0e5da4ce35606.chunk.js",48:"48.84f9532a64f5a3beb20c.chunk.js",49:"49.f8527afe7cade8fc293a.chunk.js",50:"50.776b466f9019479de8fc.chunk.js",51:"51.ca34827c84d4bcc82079.chunk.js",52:"52.517f4f6c63395646cdd7.chunk.js",53:"53.e3a2103e4151cd13300f.chunk.js",54:"athena.5e6c5b01662cea2c8b1a.chunk.js",55:"hera.b69b80db056ad9c9389f.chunk.js",56:"hermes.29bb236b97c128e8b6ee.chunk.js",57:"iris.834233a6fb064bf576a9.chunk.js",58:"hephaestus.7ac71b3274dda739ba1f.chunk.js",59:"59.ce1aefa687f2ef9c9908.chunk.js",60:"60.5070b818882287dfc402.chunk.js",61:"61.19d5149d0a2bd9ef3c1e.chunk.js",62:"62.d7831f900b939591822e.chunk.js"}</script><link rel="shortcut icon" href="https://assets-frontend.kalohq.ink/favicon.ico" crossorigin="anonymous"><link href="https://assets-frontend.kalohq.ink/style.allapps.add899080acbbeed5bb6a7301d234b65.css" rel="stylesheet" crossorigin="anonymous"><link href="https://assets-frontend.kalohq.ink/style.hermes.689f9795642815d4b8afd20e446a174d.css" rel="stylesheet" crossorigin="anonymous"><link rel="preload" href="https://assets-frontend.kalohq.ink/hermes.29bb236b97c128e8b6ee.js" as="script" crossorigin="anonymous"><link rel="preload" href="https://assets-frontend.kalohq.ink/style.hermes.689f9795642815d4b8afd20e446a174d.css" as="style" crossorigin="anonymous"><link rel="preload" href="https://assets-frontend.kalohq.ink/allapps.commons.8395b1aa9666e3271c40.js" as="script" crossorigin="anonymous"><link rel="preload" href="https://assets-frontend.kalohq.ink/style.allapps.add899080acbbeed5bb6a7301d234b65.css" as="style" crossorigin="anonymous"><link 
rel="preload" href="https://assets-frontend.kalohq.ink/vendor.83e606c69fc5ae7aeb9b.js" as="script" crossorigin="anonymous"><link rel="preload" href="https://assets-frontend.kalohq.ink/core/styles/fonts/Fakt-Soft-Pro-SemiBold/FaktSoftPro-SemiBold.1901bce5eea18c64a60693e961585ba1.woff" as="font" crossorigin="anonymous"><link rel="preload" href="https://assets-frontend.kalohq.ink/core/styles/fonts/Fakt-Soft-Pro-Blond/FaktSoftPro-Blond.4ab21e2be2f31a0ab8d798a9c65f99c1.woff" as="font" crossorigin="anonymous"><link rel="prefetch" href="https://assets-frontend.kalohq.ink/hera.b69b80db056ad9c9389f.js" crossorigin="anonymous"><link rel="prefetch" href="https://assets-frontend.kalohq.ink/iris.834233a6fb064bf576a9.js" crossorigin="anonymous"><link rel="prefetch" href="https://assets-frontend.kalohq.ink/athena.5e6c5b01662cea2c8b1a.js" crossorigin="anonymous"><link rel="prefetch" href="https://assets-frontend.kalohq.ink/moment-timezone-data.8189aab661847dea1b73.chunk.js" crossorigin="anonymous"><link rel="prefetch" href="https://assets-frontend.kalohq.ink/style.hera.f00a272db8e5756775fb2632e67c1056.css" crossorigin="anonymous"><link rel="prefetch" href="https://assets-frontend.kalohq.ink/style.iris.1465dc22f4279c748a04c66f3b4494de.css" crossorigin="anonymous"><link rel="prefetch" href="https://assets-frontend.kalohq.ink/style.athena.6acb14c0d060121364c9a0cf3e6fa0ad.css" crossorigin="anonymous"><link rel="prefetch" href="https://assets-frontend.kalohq.ink/_/node_modules/#kalo/ui/icon/fonts/MaterialIcons/MaterialIcons-Regular.012cf6a10129e2275d79d6adac7f3b02.woff" crossorigin="anonymous"><link rel="prefetch" href="https://assets-frontend.kalohq.ink/core/assets/fonts/MaterialIcons-Regular.012cf6a10129e2275d79d6adac7f3b02.woff" crossorigin="anonymous"><link rel="prefetch" href="https://assets-frontend.kalohq.ink/_/node_modules/#kalo/ui/icon/fonts/MaterialIcons/MaterialIcons-Regular.570eb83859dc23dd0eec423a49e147fe.woff2" crossorigin="anonymous"><link rel="prefetch" 
href="https://assets-frontend.kalohq.ink/core/assets/fonts/MaterialIcons-Regular.570eb83859dc23dd0eec423a49e147fe.woff2" crossorigin="anonymous"></head><body><main id="app"><!--[if lt IE 8]>
<p class="browserupgrade">You are using an outdated browser. Please upgrade your browser to improve your experience.</p>
<![endif]--><noscript>Kalo - Work without boundaries Please wait a moment as we load Kalo. Please make sure you have Javascript enabled to continue. Kalo’s aim is to give companies complete visibility over their external network.</noscript><noscript><iframe src="https://www.googletagmanager.com/ns.html?id=GTM-5XLW75" height="0" width="0" style="display:none;visibility:hidden"></iframe></noscript></main><div class="root __splash"><style>html{position:static!important;overflow-y:auto}.root{transition:opacity .35s linear;color:#234957;background-color:#f9fafc;position:absolute;top:0;right:0;bottom:0;left:0;opacity:1}.root.exit{opacity:0!important}.navigation{height:60px;background:#fff;border-bottom:1px solid #eceff1}.login{background:#ea5f6e;position:absolute;top:0;left:0;bottom:0;width:50%;display:flex;justify-content:center;align-items:center}#media screen and (max-width:767px){.login{width:100%;right:0}}.hide{display:none!important}.logo{height:107px}</style><div id="navbar" class="navigation hide"></div><div id="login" class="login hide"><div class="logo"><svg width="160" height="70" viewBox="0 0 206 90" xmlns="http://www.w3.org/2000/svg"><title>Kalo</title><path fill-rule="evenodd" fill="#fff" d="M17.629 47.172c2.31 0 4.254-.986 6.078-2.833l18.845-19.706c1.824-1.971 3.89-2.957 6.323-2.957 7.294 0 10.212 9.114 5.835 13.55L35.378 54.562l18.724 19.706c3.283 3.571 3.526 8.498.244 12.07-1.46 1.601-3.406 2.464-5.837 2.464-2.552 0-4.62-.986-6.2-2.834L23.707 65.646c-1.7-1.847-3.647-2.832-5.835-2.832h-1.58v17.612c0 4.804-3.405 8.5-8.147 8.5-4.376 0-8.145-3.942-8.145-8.5V8.498C0 3.695 3.647 0 8.145 0c4.5 0 8.147 3.695 8.147 8.498v38.674h1.337zm97.134 29.56c0 2.586-.972 4.433-2.916 5.789-6.566 4.557-15.077 6.773-25.654 6.773-16.656 0-25.653-9.236-25.653-21.676 0-11.455 8.146-20.076 25.045-20.076 3.891 0 8.39.616 13.496 1.848v-3.326c0-6.528-3.283-9.608-11.55-9.608-3.525 0-7.417.74-11.672 2.095-6.686 2.094-11.185-1.11-11.185-6.405 0-3.572 1.823-6.035 5.35-7.513 4.742-2.094 
10.698-3.08 17.871-3.08 17.872 0 26.868 8.376 26.868 25.003v30.176zm-15.682-4.68V60.965c-4.378-1.354-8.39-1.97-12.159-1.97-6.443 0-10.577 3.202-10.577 8.006 0 5.296 4.134 8.252 10.942 8.252 4.5 0 8.51-1.11 11.794-3.203zm39.845 8.904c0 4.803-3.405 8.498-8.147 8.498-4.376 0-8.145-3.941-8.145-8.498V9.15c0-4.803 3.647-8.62 8.145-8.62 4.5 0 8.147 3.817 8.147 8.62v71.806zm57.513 1.359c-5.348 4.681-12.035 7.02-20.06 7.02-7.903 0-14.589-2.339-20.06-7.02-5.471-4.68-8.511-10.715-9.118-17.982-.365-5.788-.365-11.7 0-17.612.607-7.391 3.525-13.426 8.996-18.106 5.472-4.68 12.28-7.02 20.183-7.02 8.024 0 14.71 2.34 20.06 7.02 5.349 4.68 8.389 10.715 8.997 18.106.365 5.789.365 11.7 0 17.488-.608 7.391-3.648 13.427-8.998 18.106zm-7.172-33.009c-.363-7.02-5.229-11.946-12.887-11.946-7.417 0-12.402 4.68-13.01 11.946a69.483 69.483 0 0 0 0 12.318c.608 7.266 5.593 11.946 13.01 11.946 7.416 0 12.4-4.68 12.887-11.946a69.326 69.326 0 0 0 0-12.318z"/></svg></div></div><script>"/login"===window.location.pathname&&-1===document.cookie.indexOf("VIEW=")?document.getElementById("login").classList.remove("hide"):document.getElementById("navbar").classList.remove("hide"),document.querySelector(".__splash.root").id="splash"</script></div><script src="https://cdn.polyfill.io/v2/polyfill.min.js?features=Symbol,fetch,Intl.~locale.en&unknown=polyfill"></script><script src="https://apis.google.com/js/client.js" async></script><script src="https://maps.googleapis.com/maps/api/js?key=AIzaSyDteWPK1-k97egIjYcX8-Btt8SpRsHit50&libraries=places" async></script><script>!function(e,t,a,n,c,o,s){e.GoogleAnalyticsObject=c,e[c]=e[c]||function(){(e[c].q=e[c].q||[]).push(arguments)},e[c].l=1*new Date,o=t.createElement(a),s=t.getElementsByTagName(a)[0],o.async=1,o.src="https://www.google-analytics.com/analytics.js",s.parentNode.insertBefore(o,s)}(window,document,"script",0,"ga"),ga("create","","auto")</script><script>!function(e,t,a,n,g){e[n]=e[n]||[],e[n].push({"gtm.start":(new Date).getTime(),event:"gtm.js"});var 
m=t.getElementsByTagName(a)[0],r=t.createElement(a);r.async=!0,r.src="https://www.googletagmanager.com/gtm.js?id=GTM-5XLW75",m.parentNode.insertBefore(r,m)}(window,document,"script","dataLayer")</script><script>!function(){function t(){var t=a.createElement("script");t.type="text/javascript",t.async=!0,t.src="https://widget.intercom.io/widget/s21m3m5m";var e=a.getElementsByTagName("script")[0];e.parentNode.insertBefore(t,e)}var e=window,n=e.Intercom;if("function"==typeof n)n("reattach_activator"),n("update",intercomSettings);else{var a=document,c=function(){c.c(arguments)};c.q=[],c.c=function(t){c.q.push(t)},e.Intercom=c,e.attachEvent?e.attachEvent("onload",t):e.addEventListener("load",t,!1)}}()</script><script type="text/javascript" src="https://assets-frontend.kalohq.ink/vendor.83e606c69fc5ae7aeb9b.js" crossorigin="anonymous"></script><script type="text/javascript" src="https://assets-frontend.kalohq.ink/allapps.commons.8395b1aa9666e3271c40.js" crossorigin="anonymous"></script><script type="text/javascript" src="https://assets-frontend.kalohq.ink/hermes.29bb236b97c128e8b6ee.js" crossorigin="anonymous"></script></body></html>
Something to note here is that all script and link tags have crossorigin="anonymous". Also note the preload and prefetch tags.
The issue is mostly affecting stylesheets it seems BUT scripts have also been affected in the same manner. Again it's really odd that it seems to pick randomly which assets will break and when. Considering these two facts perhaps it is even based on the reference ordering in the document/load order.
A few final clarifications hopefully to help:
Assets served from cloudfront in front of s3 (see response headers)
Not had reports/testing in browsers other than chrome at this point though can hopefully update on that shortly
All the script and stylesheet assets are preloaded using
Any help or guidance with this issue is going to be hugely appreciated. It's pretty blocking at the moment!
Update:
So we have managed to get what appears to be a continuously working build out without any apparent issues. Hard to know for 100% without time due to seemingly sporadic/random nature of the issue. What we changed was the following:
Bypass Cloudfront to directly reference assets in S3. What could be different?
Set access-control-max-age to -1 which disables this. We wouldn't expect this to have any effect because this should only (reading spec) affect preflight requests which don't occur for GET requests.
Remove the preload/prefetch link tags.
We are now doing further testing to try and isolate one or a combo of these as the culprits. We can then dig further into what is happening there.
Note this solving the issue has now been proved incorrect. See Update 2.
Update 2:
We have had further reports and occurrences in-house of the issue after the previous rollout, which we thought bypassed the issue. One effect the previous rollout did have is that the issue is now seen much less frequently. Again a hard refresh fixes everything.
The issue is identical to previously described still and so far we have not seen first-hand a failure to load JS since the first occurrence - always seems to be a CSS file failing now.
Update 3:
Some pretty important information I didn't mention originally is the change which happened around the time this issue started presenting itself.
Last Monday we released a bundle refactor, powered by webpack which meant assets became shared between deployments. For example if an output file allapps.commons.HASH123.css didn't change between release v1 and v2 then the idea is we could leverage browser caching.
What still happens however is that the script uploading these assets to S3 IS currently dumbly uploading and overwriting the original file. We were under the assumption this change would be pretty harmless since the file has the same name and contents, but perhaps this has some adverse effect?
Another effect of this release is that there are now a lot more assets due to aggressive code splitting. One thing to note here though is that none of the async chunks seem to suffer from the same problem (they're using JSONP after all) and the issue is only with those assets referenced via <script> and <link> tags.
You can find the build artifacts of the release PRIOR to the breaking release here. And find the NEW build artifacts of the current active release showing infrequent issues here. You can also find our deploy scripts here
All resources can be found on google drive here.
Update 4:
This issue is still occurring and has now been reported on an async chunk which is loaded on-demand. Looking at the webpack runtime these scripts are loaded by adding a new script tag to the page, again with crossorigin="anonymous".
Update 5:
On each build we now use a unique salt (the release version) when hashing the file names. This means no assets are shared between builds. The issue has continued to persist after this release.
Update 6:
I've uploaded a .har file showing this issue occurring over a user session.
Search for the following string "url": "https://assets-frontend.kalohq.ink/style.allapps.add899080acbbeed5bb6a7301d234b65.css", and see the various requests made for this asset. You will see the first few are fine and have the headers you'd expect. The last occurrence (line 32624) is the one which failed.
{
"startedDateTime": "2017-06-28T09:40:15.534Z",
"time": 0,
"request": {
"method": "GET",
"url": "https://assets-frontend.kalohq.ink/style.allapps.add899080acbbeed5bb6a7301d234b65.css",
"httpVersion": "unknown",
"headers": [
{
"name": "Referer",
"value": "https://kalohq.ink/account"
},
{
"name": "Origin",
"value": "https://kalohq.ink"
},
{
"name": "DPR",
"value": "2"
},
{
"name": "User-Agent",
"value": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
}
],
"queryString": [],
"cookies": [],
"headersSize": -1,
"bodySize": 0
},
"response": {
"status": 0,
"statusText": "",
"httpVersion": "unknown",
"headers": [],
"cookies": [],
"content": {
"size": 0,
"mimeType": "x-unknown"
},
"redirectURL": "",
"headersSize": -1,
"bodySize": -1,
"_transferSize": 0,
"_error": ""
},
"cache": {},
"timings": {
"blocked": -1,
"dns": -1,
"connect": -1,
"send": 0,
"wait": 0,
"receive": 0,
"ssl": -1
},
"serverIPAddress": "",
"pageref": "page_10"
},
Update 7:
So last night we pushed a change which removed the usage of the crossorigin="anonymous" attribute everywhere. So far we have not seen the issue occur (still waiting given the nature of the issue) but are seeing some interesting and unexpected responses from the requests being made now. Would be great if we could get some clarification on what exactly is happening here. I don't believe we expected removing crossorigin="anonymous" to have such an effect or even understand why it was so broken before since our server is setup to send the correct headers AND the Vary header.
Request from cli to s3, with an Origin header, no cors response headers
curl -I 'https://s3.amazonaws.com/olympus.lystable.com/style.allapps.5ebcc4d28ec238a53f46d6c8e12900d1.css' -H 'Pragma: no-cache' -H 'Accept-Encoding: gzip, deflate, br' -H 'Accept-Language: en-GB,en-US;q=0.8,en;q=0.6' -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/59.0.3071.109 Chrome/59.0.3071.109 Safari/537.36' -H 'Accept: text/css,*/*;q=0.1' -H 'Referer: https://asos.kalohq.com/categories' -H 'Connection: keep-alive' -H 'DPR: 1' -H 'Cache-Control: no-cache' -H "Origin: https://kalohq.com" --compressed
HTTP/1.1 200 OK
x-amz-id-2: kxOvBrYsKyZ42wGgJu8iyRZ8q6j5DHDC6QoK1xn2e8FO1wIEEVkxQ0JvGQTmwrN/Njf8EOlmLrE=
x-amz-request-id: DA8B5488D3A7EF73
Date: Thu, 13 Jul 2017 13:27:47 GMT
Last-Modified: Thu, 13 Jul 2017 11:30:50 GMT
ETag: "c765a0a215cb4c9a074f22c3863c1223"
Cache-Control: max-age=31556926
Content-Encoding: gzip
Accept-Ranges: bytes
Content-Type: text/css
Content-Length: 5887
Server: AmazonS3
Request a moment later from cli again to s3 with just origin header. Now suddenly gives all expected cors headers back...
curl -H "Origin: https://kalohq.com" -I https://assets-frontend.kalohq.com/style.allapps.5ebcc4d28ec238a53f46d6c8e12900d1.css
HTTP/1.1 200 OK
Content-Type: text/css
Content-Length: 5887
Connection: keep-alive
Date: Thu, 13 Jul 2017 13:33:09 GMT
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET
Access-Control-Max-Age: -1
Last-Modified: Thu, 13 Jul 2017 11:30:50 GMT
ETag: "c765a0a215cb4c9a074f22c3863c1223"
Cache-Control: max-age=31556926
Content-Encoding: gzip
Accept-Ranges: bytes
Server: AmazonS3
Vary: Origin,Access-Control-Request-Headers,Access-Control-Request-Method
Age: 69
X-Cache: Hit from cloudfront
Via: 1.1 a19c66da9b402e0bee3fd29619661850.cloudfront.net (CloudFront)
X-Amz-Cf-Id: 3wQ7Z6EaAcMscGirwsYVi1M_rvoc1fbI034QY4QZd6IqmlRzLRllEg==
Update 8:
The removal of the crossorigin="anonymous" attribute has resolved the issue. Investigation into why this suddenly started being an issue with this release is ongoing, since we had this attribute on script tags before.
All resources useful in this investigation can be found on google drive here.
https://assets-frontend.kalohq.ink/style.allapps.add899080acbbeed5bb6a7301d234b65.css only returns CORS headers when an "Origin" header is present (which is sent with a CORS request, but not regular requests).
Here's what happens:
User fetches CSS as part of a no-CORS request (eg, <link rel="stylesheet">). This caches due to the Cache-Control header.
User fetches CSS as part of a CORS request. The response comes from the cache.
CORS check fails, no Access-Control-Allow-Origin header.
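The sequence above can be reproduced with a toy model. This is a sketch only, not Chrome's actual cache: `toy_origin`, `UrlOnlyCache` and `cors_check_passes` are hypothetical names, the URLs are placeholders, and the cache deliberately keys on the URL alone.

```python
def toy_origin(request_headers):
    """Hypothetical origin: adds CORS headers only when Origin is sent."""
    response = {"Content-Type": "text/css", "Cache-Control": "max-age=31556926"}
    if "Origin" in request_headers:
        response["Access-Control-Allow-Origin"] = "*"
    return response


class UrlOnlyCache:
    """Toy cache that ignores request headers when looking up entries."""

    def __init__(self):
        self._entries = {}

    def fetch(self, url, request_headers):
        # Only the first request for a URL ever reaches the origin.
        if url not in self._entries:
            self._entries[url] = toy_origin(request_headers)
        return self._entries[url]


def cors_check_passes(response_headers):
    return "Access-Control-Allow-Origin" in response_headers


cache = UrlOnlyCache()
url = "https://assets.example.com/style.css"  # placeholder URL

# Step 1: no-CORS request (no Origin header) populates the cache.
no_cors_response = cache.fetch(url, {})
# Step 2: CORS request for the same URL is answered from the cache.
cors_response = cache.fetch(url, {"Origin": "https://app.example.com"})
# Step 3: the cached response lacks Access-Control-Allow-Origin.
cors_blocked = not cors_check_passes(cors_response)
```

In this model the second request never reaches the origin, so the Origin header it carries makes no difference.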
The server is at fault here: it should use the Vary header to indicate that its response changes depending on the Origin header (and others). It sends this header in response to CORS requests, but it should send it in response to non-CORS requests too.
Chrome is somewhat at fault here, as it should use the credentials mode of the request as part of the caching key, so a non-credentialed request (such as those sent by fetch()) shouldn't match items in the cache that were requested with credentials. I think there are other browsers that behave like Chrome here, but Firefox doesn't.
However, since you're using a CDN, you can't rely on browsers to get this right, as the caching may still happen at the CDN. Adding the correct Vary header is the right fix.
tl;dr: Add the following header to all of your responses that support CORS:
Vary: Origin, Access-Control-Request-Headers, Access-Control-Request-Method
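To show why this header helps, here is a small sketch of how a Vary-aware cache might build its lookup key. It is a simplification of what real caches do, and `vary_cache_key` and the example URLs are assumptions made for illustration.

```python
def vary_cache_key(url, request_headers, vary_value):
    """Build a cache key from the URL plus every request header named in Vary."""
    vary_names = sorted(
        name.strip().lower() for name in vary_value.split(",") if name.strip()
    )
    lowered = {name.lower(): value for name, value in request_headers.items()}
    # A header that is absent contributes None, which differs from any real value.
    return (url, tuple((name, lowered.get(name)) for name in vary_names))


VARY = "Origin, Access-Control-Request-Headers, Access-Control-Request-Method"
url = "https://assets.example.com/style.css"  # placeholder URL

plain_key = vary_cache_key(url, {}, VARY)
cors_key = vary_cache_key(url, {"Origin": "https://app.example.com"}, VARY)
```

Because the two keys differ, a response stored for the request without Origin is not reused for the request that carries one.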
We experienced the same problem when migrating our JS to Webpack.
Our setup is similar:
assets are uploaded to an S3 bucket during deployment
the bucket is set as the Cloudfront origin
When migrating to Webpack, we wanted to take advantage of JS sourcemaps for better error reporting to Airbrake.
To allow errors to be caught properly, the crossorigin="anonymous" attribute had to be set on the script tags.
The reason why is explained here:
https://blog.sentry.io/2016/05/17/what-is-script-error.html
Part of the problem was that CORS response headers were sometimes returned and sometimes not, triggering a CORS error in the browser.
CloudFront servers were caching the response with OR without the CORS headers, depending on how the first client request (the one that missed the cache) was made.
So two possible outcomes:
First client does not send the Origin request header => CloudFront server caches the response without CORS headers.
First client sends the Origin request header => CloudFront server caches the response with CORS headers.
This made the problem look random, but it was just a race condition (how the first client made the request) combined with different headers cached on different CloudFront servers: timing and location dependent.
Add to that the fact that browsers might cache these wrong headers...
So you need to properly configure CloudFront's distribution behavior to:
allow and cache preflight (OPTIONS) requests
base the cache on CORS request headers (Origin, Access-Control-Request-Headers, Access-Control-Request-Method)
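As a rough sketch, the two settings above could be expressed as a fragment of a cache behavior, written here as a Python dict. This assumes the legacy "forwarded values" style of CloudFront configuration (newer distributions express the same idea through cache policies). The `QueryString` and `Cookies` values are placeholders chosen for the example, not something stated in this answer, and the helper names are invented.

```python
# Request headers the cache should take into account (from the list above).
CORS_REQUEST_HEADERS = [
    "Origin",
    "Access-Control-Request-Headers",
    "Access-Control-Request-Method",
]
# OPTIONS is included so preflight requests are allowed and cached.
METHODS = ["GET", "HEAD", "OPTIONS"]


def counted(items):
    """CloudFront list fields carry an explicit Quantity next to Items."""
    return {"Quantity": len(items), "Items": list(items)}


def cors_cache_behavior_fragment():
    """Build the part of a cache behavior that deals with CORS."""
    allowed = counted(METHODS)
    allowed["CachedMethods"] = counted(METHODS)
    return {
        "AllowedMethods": allowed,
        "ForwardedValues": {
            "QueryString": False,           # assumption for this sketch
            "Cookies": {"Forward": "none"},  # assumption for this sketch
            "Headers": counted(CORS_REQUEST_HEADERS),
        },
    }


fragment = cors_cache_behavior_fragment()
```

The fragment is not a complete distribution config; it only shows where the header list and the OPTIONS method would go.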
Here is the configuration that solved our problem.
S3 bucket / Permissions / CORS configuration:
<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <CORSRule>
    <AllowedOrigin>*</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedMethod>HEAD</AllowedMethod>
    <MaxAgeSeconds>300</MaxAgeSeconds>
    <AllowedHeader>*</AllowedHeader>
  </CORSRule>
</CORSConfiguration>
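If you manage the bucket from a script instead of the console, the same rules can be read out of the XML and turned into a list of dicts. The sketch below uses only the standard library; `cors_xml_to_rules` is a hypothetical helper. The key names follow the shape that boto3's `put_bucket_cors` accepts under `CORSConfiguration["CORSRules"]`, but no AWS call is made here.

```python
import xml.etree.ElementTree as ET

S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"

CORS_XML = """<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <CORSRule>
    <AllowedOrigin>*</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedMethod>HEAD</AllowedMethod>
    <MaxAgeSeconds>300</MaxAgeSeconds>
    <AllowedHeader>*</AllowedHeader>
  </CORSRule>
</CORSConfiguration>
"""


def cors_xml_to_rules(xml_text):
    """Convert an S3 CORS XML document into a list of rule dicts."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    rules = []
    for rule in root.findall(f"{S3_NS}CORSRule"):
        parsed = {
            "AllowedOrigins": [e.text for e in rule.findall(f"{S3_NS}AllowedOrigin")],
            "AllowedMethods": [e.text for e in rule.findall(f"{S3_NS}AllowedMethod")],
            "AllowedHeaders": [e.text for e in rule.findall(f"{S3_NS}AllowedHeader")],
        }
        max_age = rule.find(f"{S3_NS}MaxAgeSeconds")
        if max_age is not None:
            parsed["MaxAgeSeconds"] = int(max_age.text)
        rules.append(parsed)
    return rules


rules = cors_xml_to_rules(CORS_XML)
```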
Cloudfront distribution / Edit Behavior:
We now experience a problem similar to yours, as we just migrated our CSS to Webpack. We are experiencing even more sporadic CORS errors for CSS files.
We are trying to remove the crossorigin="anonymous" attribute on <link rel="stylesheet" /> tags since we don't need error tracking for CSS files.
I can shed a little light on how it happened with us. Azure CDN (which we use) does not support Vary headers right now. So far so bad. But we also use the script crossorigin attribute, which (and that's the interesting thing) is not supported by some browsers.
If such a browser comes to our site, it does not send an Origin header because it does not understand the crossorigin attribute. If another browser that does understand it comes along later, it sends Origin and gets a CORS error, because the first response is cached.
Ugly.
I want to share that we were having the same issue, but in this case, specifically preloading some fonts.
We noticed that the combination of S3, CloudFront, and Safari was killing us, so we decided to remove preload and crossorigin="anonymous".
We were trying to do this:
<link rel="preload" href="<zzz.cloudfrontUrl.com>" as="font" crossOrigin="anonymous" />
But Safari somehow corrupted the cache, and it reported that access was not allowed by Access-Control-Allow-Origin, but only sometimes.
Our assumption is that there may be a problem between Safari, preloading fonts from a CDN (maybe any CDN), and crossOrigin="anonymous" (which is required).
Regards.
Assuming that the CORS configuration has already been set on the S3 bucket, the two points below make sure that the video always loads in the browser.
Add crossorigin="anonymous" on the tag
Add "?q=#{Time.now.to_i}" at the end of the S3 URLs
HAML + Ruby code will look something like this.
%video{controls: "", controlslist: "nodownload", crossorigin: 'anonymous'}
  %source{src: "#{s3_url}?q=#{Time.now.to_i}", type: "video/mp4"}
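For readers not using Ruby, the cache-busting step can be sketched in Python as follows. `add_cache_buster` is a hypothetical helper and the bucket URL is a placeholder. The `now` parameter exists only so the result is reproducible in the example.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url, now=None):
    """Append q=<unix timestamp> to the URL, keeping any existing query."""
    timestamp = int(time.time() if now is None else now)
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("q", str(timestamp)))
    return urlunsplit(parts._replace(query=urlencode(query)))


busted_url = add_cache_buster(
    "https://example-bucket.s3.amazonaws.com/video.mp4",  # placeholder URL
    now=1500000000,
)
```

Note the trade-off: because the URL changes on every render, neither the browser nor the CDN can reuse a previously cached copy, so this sidesteps the stale-header problem at the cost of caching.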
I was having same problem here is how I solved it.
Added a CORSRule for a wildcard domain (you can choose your origin domain)
<CORSRule>
  <AllowedOrigin>*</AllowedOrigin>
  <AllowedMethod>GET</AllowedMethod>
  <AllowedMethod>HEAD</AllowedMethod>
  <MaxAgeSeconds>3000</MaxAgeSeconds>
  <AllowedHeader>*</AllowedHeader>
</CORSRule>
Then go to
CloudFront Distribution > Origin and Origin Groups > Edit Origin
In "Origin Custom Headers", set "Header Name" to "origin" and "Value" to "https://www.yourorigindomain.com"
When you click the info icon to the right of Origin Custom Headers, you will see the message:
All custom header keys and values you specify here will be included in every request to this origin. If a header was already supplied in the client request, it is overridden.
So CloudFront adds the Origin header to every request it makes to the S3 bucket, whether or not you pass it from the client, and caches the response headers.
You can check for access-control-allow-origin: * in the response headers using
`curl -i https://cloufrontdistributiondomain.com/example.png`
i.e. without passing an Origin header in the request.
When using the Azure CDN, Vary headers are ignored for the cached content, which means Azure only keeps one copy per value of the Accept-Encoding header. This of course introduces issues with CORS.
In order to avoid such issues, the workaround can be to rely on the rules engine and to force the headers:
Global rule:
Overwrite Vary response header with value: Origin,Access-Control-Request-Method,Access-Control-Request-Headers,Accept-Encoding
Overwrite Access-Control-Allow-Origin response header with value: https://<your domain>
Delete Server response header
Create additional rules for each potential Origin:
If Request Header Origin Equals ...
The response headers should be defined in a similar way to the global rule
I ran into a similar situation. And I want to document my debugging process to help more people.
We are building an online image editor. Because we need to export the canvas as an image, we need to handle CORS for images properly (with crossOrigin: "anonymous").
Similar to the OP, our images are hosted on S3 with CloudFront as the CDN. I noticed the same CORS error as in the question.
Related details: both the S3 bucket and the CloudFront CDN had been running for a while without the headers:
access-control-allow-origin: *
access-control-allow-methods: PUT, POST, DELETE, GET
The headers were only added recently by following:
https://docs.aws.amazon.com/AmazonS3/latest/dev/cors.html
https://aws.amazon.com/premiumsupport/knowledge-center/no-access-control-allow-origin-error/
The issue was very strange to me, because I was pretty sure I set s3 and cloudfront properly to return the headers. I could confirm it with curl, by passing an "Origin".
So my initial guess was that this was a chrome bug.
My friends and I tried different environments, but the issue couldn't be reproduced consistently. Here is the summary:
Ubuntu 1 + chrome : repro
Ubuntu 1 + firefox : no repro
Mac 1 + chrome : repro
Mac 1 + safari : repro
Mac 2 + chrome : no repro
Mac 3 + chrome : no repro
Our Chrome version is the same: 87.0.4280.88 (Official Build) (64-bit)
Another strange thing was that the issue would randomly disappear, for no apparent reason, while I was debugging it. However, it would come back the next day.
I suspected browser cache at first. I tried clearing cache / hard reload, incognito mode, and a completely fresh chrome session, using:
google-chrome --user-data-dir=~/chromeTemp
None of them resolved the issue.
So the key question was whether this is an AWS issue or a Chrome issue.
Although Chrome's devtools did show that the headers weren't present,
access-control-allow-origin: *
access-control-allow-methods: PUT, POST, DELETE, GET
given the strangeness of this issue and the contrary evidence from the curl command, I couldn't trust Chrome's devtools.
Plus, I also tried right-clicking the failed image in the network panel of Chrome's devtools, choosing "copy as curl", and pasting that curl command directly into a terminal; the issue wasn't there either. The headers were simply returned.
Because the image is served by CloudFront over HTTPS, I didn't know how to intercept the HTTPS traffic using Wireshark. But I found Chrome's network dump function:
chrome://net-export/
With it, I was able to capture the raw traffic, and it became very clear that AWS didn't return the right headers. (Later, I noticed that CloudFront has an option to serve over HTTP, which could be an easier way to debug this issue.)
I tried immediately with curl with the "Origin" header:
curl -v -H "origin: http://localhost:9003" https://xxxxx.cloudfront.net/xxxxx.jpg
The headers were returned.
Notice that in addition to the "Origin" header, Chrome sends a lot more headers to the CloudFront endpoint. So I asked whether I could reproduce the issue with curl by sending exactly the same headers:
curl -v -H "pragma: no-cache" -H "cache-control: no-cache" -H 'sec-ch-ua: "Google Chrome";v="87", " Not;A Brand";v="99", "Chromium";v="87"' -H "origin: http://localhost:9003" -H "sec-ch-ua-mobile: ?0" -H "user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36" -H "accept: image/avif,image/webp,image/apng,image/*,*/*;q=0.8" -H "sec-fetch-site: cross-site" -H "sec-fetch-mode: cors" -H "sec-fetch-dest: image" -H "referer: http://localhost:9003/" -H "accept-encoding: gzip, deflate, br" -H "accept-language: en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7,zh-TW;q=0.6" https://xxx.cloudfront.net/xxx/xxx.jpg
And indeed, if I send exactly the same headers to CloudFront, I can reproduce the issue with curl too. Most importantly, notice that when the issue reproduces, the x-cache header reports a cache hit, whereas when it doesn't reproduce, the same header reports a cache miss.
So, what if I clear the CloudFront cache? I actually tried this the first time I saw the bug, by invalidating only the path of the problematic image. But this time, I wanted to invalidate the entire CloudFront cache for that S3 bucket.
After that, I tried again with the same curl command line, and it worked!
Notice that not only were the headers returned, but x-cache also reported a cache miss.
So here is my conclusion / learning:
It looks like CloudFront caches based not only on the content itself but also on the request headers. Different headers result in different cache entries and thus different returned data.
The browsers that reproduced the issue might send different headers than those that didn't. Essentially, they were reading from different cache entries.
The request headers as shown and copied by Chrome's devtools are not trustworthy.
chrome://net-export/ is an awesome tool.
At this point, I'm 95% sure this is due to cloudfront caching behavior / stale data / headers. But given the intermittency of the issue, I will need to keep observing for a while to be sure.
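The caching behaviour described above can be illustrated with a toy model. This is only a sketch, not CloudFront's actual implementation: the `EdgeCache` class, its `key_headers` parameter and the `origin_server` function are hypothetical names, and the stand-in origin assumes (as the thread suggests for S3) that CORS headers are returned only when the request carries an Origin header.

```python
from typing import Dict, Tuple

Headers = Dict[str, str]


def origin_server(path: str, request_headers: Headers) -> Headers:
    """Stand-in for the bucket: CORS headers are sent only if Origin is present."""
    response = {"content-type": "image/jpeg"}
    if "origin" in request_headers:
        response["access-control-allow-origin"] = "*"
    return response


class EdgeCache:
    """Toy CDN cache; `key_headers` lists request headers that join the cache key."""

    def __init__(self, key_headers: Tuple[str, ...] = ()) -> None:
        self.key_headers = key_headers
        self.store: Dict[tuple, Headers] = {}

    def get(self, path: str, request_headers: Headers) -> Headers:
        key = (path,) + tuple(request_headers.get(h) for h in self.key_headers)
        if key in self.store:
            return {**self.store[key], "x-cache": "hit"}
        response = origin_server(path, request_headers)
        self.store[key] = response
        return {**response, "x-cache": "miss"}


# Cache key is the path only: the first (no-Origin) response is reused for everyone.
path_only = EdgeCache()
path_only.get("/a.jpg", {})
poisoned = path_only.get("/a.jpg", {"origin": "http://localhost:9003"})

# Cache key includes Origin: the CORS request gets its own cache entry.
with_origin = EdgeCache(key_headers=("origin",))
with_origin.get("/a.jpg", {})
healthy = with_origin.get("/a.jpg", {"origin": "http://localhost:9003"})

print(poisoned)
print(healthy)
```

In this model the "poisoned" response is a cache hit without access-control-allow-origin, which matches the pattern seen with curl above: the failure coincides with a cache hit, and the headers come back on a miss.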
Update 1:
Unfortunately, when I checked the next day, the issue had come back.
I feel that this is a CloudFront bug.
Five years later, I stumbled over this post while having a JavaScript image loading issue with CORS on S3 in Safari. The solution, in my case, was unbelievable:
This code works:
image.crossOrigin = 'anonymous';
image.src = resultSrc;
This code does not:
image.src = resultSrc;
image.crossOrigin = 'anonymous';
In fact, Safari loads the picture on the first line, without CORS headers, and puts it into the cache.
Then it loads the image AGAIN on the second line, with CORS headers, and uses it correctly (on the first request). On the second request, the result comes from the cache and fails.
I managed to solve the issue with the same setup, using CloudFront + S3 to serve files from another domain (CORS required).
For me, the CORS issue occurred for the most recently added SVG, for unknown reasons (we have multiple SVGs used by CSS, which work fine). Our app is built using Webpack + Module Federation, which means the app shell handles the injection of CSS.
I used the chrome://net-export/ as mentioned above (GREAT tool, thanks!) to investigate the headers returned by CF/S3 and the access-control-allow-origin header was not included in the response for this specific file. The CORS issue now makes sense but why is CF/S3 not returning the CORS headers in the response?
Solution
Besides adding CORS settings to S3, we added the Origin header in CloudFront under Behaviors / Cache key and origin requests / Legacy cache settings / Headers, where you can include custom headers.
Suppose the server sends a response with an Access-Control-Allow-Origin value with an explicit origin (rather than the "*" wildcard). In that case, the response should also include a Vary response header with the value Origin — to indicate to browsers that server responses can differ based on the value of the Origin request header. https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Access-Control-Allow-Origin#cors_and_caching
As I understand from the above, the CF setting should not be required when using the Access-Control-Allow-Origin: * header. Anyway, now the CORS headers are included in the response together with the Vary: Origin header, and it works.
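The rule quoted from MDN can be sketched as a small server-side helper. This is an illustration under assumptions, not what S3 or CloudFront run internally: `cors_headers` and the `ALLOWED` list are hypothetical names, and the example domains are placeholders.

```python
from typing import Dict, Iterable, Optional


def cors_headers(request_origin: Optional[str], allowed: Iterable[str]) -> Dict[str, str]:
    """Build CORS response headers for an explicit allow-list (hypothetical helper)."""
    # Vary: Origin is always sent so that caches keep separate copies per Origin,
    # including a separate copy for requests that carry no Origin at all.
    headers = {"Vary": "Origin"}
    if request_origin is not None and request_origin in set(allowed):
        # Echo the explicit origin instead of using the "*" wildcard.
        headers["Access-Control-Allow-Origin"] = request_origin
    return headers


# Placeholder origins, assumed for the example only.
ALLOWED = ["https://app.example.com", "https://admin.example.com"]

print(cors_headers("https://app.example.com", ALLOWED))
print(cors_headers("https://evil.example.net", ALLOWED))
print(cors_headers(None, ALLOWED))
```

Because the response body of the header set differs per Origin, any cache in front of such a server has to honour Vary: Origin (or include Origin in its cache key) to avoid serving one origin's headers to another.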
The question remains why this issue appeared for this specific file...

How does a web browser determine what to do with a resource?

In the browser's address bar, I can specify a resource using any extension or none, e.g., http://www.something.com/someResource.someExtension. How does the browser determine what to do with this resource? e.g., should the browser parse it as an HTML document, or treat it as some script? Is there a notion of a resource type? Thank you.
P.S. I could not believe what I was thinking! :( (see my flaw in the comment to Luka's answer). How could the browser look at a resource locally! The browser is a client, and the resource resides on the server side. Duh! (I've found myself on this "mental" drug occasionally)
The HTTP response returned by the server typically contains a "Content-Type: text/html" or similar header line (application/octet-stream, etc.).
Here's an example (the easiest way to view similar results is to open firebug's Net tab):
Cache-Control public, max-age=60
Content-Encoding gzip
Content-Length 9334
Content-Type text/html; charset=utf-8   <---- here it is
Date Sat, 05 May 2012 20:34:36 GMT
Expires Sat, 05 May 2012 20:35:36 GMT
Last-Modified Sat, 05 May 2012 20:34:36 GMT
Vary *
It looks at the MIME type of the document.
HTML pages have the MIME type text/html; JPEG images have image/jpeg.
More information: http://en.wikipedia.org/wiki/Internet_media_type
It does so using MIME types: http://en.wikipedia.org/wiki/Internet_media_type
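As a rough sketch of the idea in these answers, the snippet below parses a Content-Type header value and picks a handling strategy from the media type. The function names and the handler labels are hypothetical, and the mapping is illustrative only: real browsers also consider the context in which the resource was requested and may sniff the content.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_type(header_value: str) -> Tuple[str, Optional[str]]:
    """Split a Content-Type value into (media type, charset)."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type(), msg.get_content_charset()


def choose_handler(header_value: str) -> str:
    """Pick an illustrative handling strategy based on the media type."""
    media_type, _charset = parse_content_type(header_value)
    if media_type == "text/html":
        return "render as HTML"
    if media_type in ("application/javascript", "text/javascript"):
        return "treat as script"
    if media_type.startswith("image/"):
        return "decode as image"
    return "offer as download"


print(parse_content_type("text/html; charset=utf-8"))
print(choose_handler("image/jpeg"))
print(choose_handler("application/octet-stream"))
```

Note that the URL's file extension plays no part here; only the header sent by the server is inspected.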

Trying to pass PCI compliance but have a cross-site scripting issue

I'm currently trying to pass PCI compliance for one of my client's sites but the testing company are flagging up a vulnerability that I don't understand!
The (site removed) details from the testing company are as follows:
The issue here is a cross-site scripting vulnerability that is commonly associated with e-commerce applications. One of the tests appended a harmless script in a GET request on the end of your site URL. It flagged as a cross-site scripting vulnerability because this same script that was entered by the user (our scanner) was returned by the server unsanitized in the header. In this case, the script was returned in the header so our scanner flagged the vulnerability.
Here is the test I ran from my terminal to duplicate this:
GET /?osCsid=%22%3E%3Ciframe%20src=foo%3E%3C/iframe%3E HTTP/1.0
Host: (removed)
HTTP/1.1 302 Found
Connection: close
Date: Tue, 11 Jan 2011 23:33:19 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Location: http://www.(removed).co.uk/index.aspx?osCsid="><iframe src=foo></iframe>
Set-Cookie: ASP.NET_SessionId=bc3wq445qgovuk45ox5qdh55; path=/; HttpOnly
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 203
<html><head><title>Object moved</title></head><body>
<h2>Object moved to here.</h2>
</body></html>
The solution to this issue is to sanitize user input on these types of requests, making sure characters that could trigger executable scripts are not returned on the header or page.
Firstly, I can't get the result that the tester did; it only ever returns a 200 response, which doesn't include the Location header, nor will it return the "Object moved" page. Secondly, I'm not sure how (on IIS 6) to stop it returning a header with the query string in it! Lastly, why does code in the header matter? Surely browsers wouldn't actually execute code from the HTTP header?
Request: GET /?osCsid=%22%3E%3Ciframe%20src=foo%3E%3C/iframe%3E HTTP/1.0 Host:(removed)
The <iframe src=foo></iframe> is the issue here.
Response text:
<html><head><title>Object moved</title></head><body>
<h2>Object moved to here.</h2>
</body></html>
The response link is:
http://www.(removed).co.uk/index.aspx?osCsid="><iframe src=foo></iframe>
Which contains the contents from the request string.
Basically, someone can send someone else a link where your osCsid contains text that allows the page to be rendered in a different way. You need to make sure that the osCsid input is sanitized or filtered against things like this. For example, I could provide a string that lets me load in whatever JavaScript I want, or make the page render entirely differently.
As a side note, it tries to forward your browser to that non-existent page.
It turned out that I have a Response.Redirect for any pages accessed over https that don't need to be secure, and this was returning the location as part of the redirect. Changing this to:
Response.Status = "301 Moved Permanently";
Response.AddHeader("Location", Request.Url.AbsoluteUri.Replace("https:", "http:"));
Response.End();
fixed the issue.
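The general principle behind the scanner's complaint, that user-supplied values should be percent-encoded before they are echoed into a Location header, can be sketched in a few lines. This is not the ASP.NET fix above but an illustration of the same idea; `safe_redirect_location` is a hypothetical helper and the example.co.uk host is a placeholder.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def safe_redirect_location(base_url: str, params: dict) -> str:
    """Return a Location value with every query value percent-encoded."""
    scheme, netloc, path, _query, _fragment = urlsplit(base_url)
    # urlencode() escapes characters such as ", < and > in each value.
    return urlunsplit((scheme, netloc, path, urlencode(params), ""))


# The payload used by the scanner in the question.
payload = '"><iframe src=foo></iframe>'
location = safe_redirect_location(
    "http://www.example.co.uk/index.aspx", {"osCsid": payload}
)
print(location)

# The original value is still recoverable by the application after decoding.
decoded = dict(parse_qsl(urlsplit(location).query))
print(decoded["osCsid"] == payload)
```

With the value encoded, the header and the "Object moved" body no longer contain raw markup, while the application can still read the original parameter after decoding.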

How does GMail implement Comet?

With the help of HttpWatch, I tried to figure out how GMail implements Comet.
I logged in to GMail with two accounts, one in IE and the other in Firefox, and chatted in GTalk inside GMail using a magic word like "WASSUP". Then I logged off both GMail accounts and filtered out any HTTP content without the "WASSUP" string. The result shows which HTTP request is the streaming channel. (Note: I had to log off; otherwise, the never-ending HTTP request would not show its content in HttpWatch.)
The result is interesting. The URL for the stream channel looks like:
https://mail/channel/bind?VER=8&at=xn3j33vcvk39lkfq.....
It is no surprise that GMail does Comet in IE with an IFRAME. The HTTP content starts with "<html><body>".
Originally, I guessed that GMail does Comet in Firefox with a multipart XMLHttpRequest. To my surprise, the response doesn't have a "multipart/x-mixed-replace" content type. The response headers are as below:
HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Pragma: no-cache
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Date: Sat, 20 Mar 2010 01:52:39 GMT
X-Frame-Options: ALLOWALL
Transfer-Encoding: chunked
X-Content-Type-Options: nosniff
Server: GSE
X-XSS-Protection: 0
Unfortunately, HttpWatch doesn't tell whether an HTTP request comes from XMLHttpRequest or not. The content is not HTML but JSON. It looks like a response for XHR, but that would not work for Comet without multipart/x-mixed-replace, right?
Is there any other way to figure out how GMail implements Comet?
Update:
After further investigation, I believe GMail implements Comet this way:
1) in IE, it uses a forever-hidden iframe;
2) in Firefox, it uses a forever-XHR without the multipart/x-mixed-replace header. The client responds when (readyState == 3) OR (readyState == 4), that is, in both the interactive state and the complete state.
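The "forever-XHR" idea can be sketched as follows: each time more of the response arrives, the client looks only at the part of the accumulated text it has not processed yet. This is a simulation, not GMail's actual code or wire format; the `StreamReader` class is a hypothetical name, and newline-delimited JSON is an assumption made for the example.

```python
import json
from typing import Any, List


class StreamReader:
    """Tracks how much of a growing response body has already been processed."""

    def __init__(self) -> None:
        self.offset = 0

    def on_progress(self, response_text: str) -> List[Any]:
        """Call on every readyState 3 or 4 event with the full text received so far."""
        messages = []
        while True:
            end = response_text.find("\n", self.offset)
            if end == -1:
                break  # the last message is still incomplete; wait for more data
            line = response_text[self.offset:end]
            self.offset = end + 1
            if line:
                messages.append(json.loads(line))
        return messages


reader = StreamReader()
# First event: one complete message plus the start of a second one.
first = reader.on_progress('{"msg": "WASSUP"}\n{"msg": "he')
# Second event: the accumulated text now contains both complete messages.
second = reader.on_progress('{"msg": "WASSUP"}\n{"msg": "hello"}\n')
print(first)
print(second)
```

Keeping an offset is what lets a single never-ending response behave like a stream of separate messages without relying on multipart/x-mixed-replace.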
Per this article,
So what is the solution used by Google Gmail? The solution is really simple, straight forward and very portable! What Gmail did is requesting an endless html page that contains streams of Javascript portions. Give it a try, It’s very powerful. So, we will have on the client side a js file that processes the responses, and another endless html that contains the Javascript Streams.
The rest of the article goes into much more detail, including an exploration of alternatives as well as the specific one picked by GMail.

Resources