How to stop index.html being cached by Azure CDN

I am using Azure CDN to host a static website I am building.
It's great, except that when I update my web app the old page is cached and so is still shown.
I have added the following cache rule in the rules engine to make it refresh every 60 seconds. However, this does nothing and I still get the old content. The only way to get the new content is to open an incognito browser window.
Does anyone have any ideas? It's driving me crazy!
Here is a screenshot of the browser dev window when I hit the index.html page. I can't see any cache control headers here. I would think that Azure CDN would/should be adding these; is that incorrect?

The rule you are modifying controls the "internal max age". If a file shows up correctly in incognito mode, this rule is working fine. You have to set the "external max age" to control the Cache-Control header sent to the browser.
https://learn.microsoft.com/en-us/azure/cdn/cdn-verizon-premium-rules-engine-reference-features
Looks like it is not Azure CDN which is caching index.html, it is your browser. Ensure that the Cache-Control header is returned correctly by using the developer tools.
https://learn.microsoft.com/en-us/azure/cdn/cdn-manage-expiration-of-cloud-service-content
https://learn.microsoft.com/en-us/azure/cdn/cdn-manage-expiration-of-blob-content
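As a rough illustration of the check suggested above, the following Python sketch parses a Cache-Control value and reports the max-age a browser would see. The function names `parse_cache_control` and `browser_max_age` are made up for this example, and the headers are passed in as a plain dict rather than fetched from a real endpoint.

```python
def parse_cache_control(header_value):
    """Split a Cache-Control header value into a dict of directives."""
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directives without a value (e.g. no-store) are stored as None.
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def browser_max_age(headers):
    """Return max-age in seconds, or None if the response carries no max-age."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("cache-control")
    if value is None:
        return None
    max_age = parse_cache_control(value).get("max-age")
    return int(max_age) if max_age is not None else None
```

If `browser_max_age` returns None for the headers of index.html, the browser is free to apply its own heuristics, which would be consistent with the stale page described in the question.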

Related

How to Use eTag on IIS for text/html Pages

I have a website which sits on a non-public domain and is delivered through a proxy on a different domain. We're having some trouble with caching of content: this is an Umbraco site, and changes show up on the pages if you hit the domain directly, but not through the proxy.
I've been informed that the proxy honours response headers and setting an eTag would fix the issue. Having looked into this I can see that IIS sets the eTag by default, and I can see this is working on static content i.e. .js, .css files like so:
However, if I visit a page on the site, for example /uk/products/product I don't see the eTag header.
Is this expected behaviour, should it only be working with those static content files or can I set this on the page to tell the proxy that it should recache?
The ETag HTTP response header is an identifier for a specific version of a resource. It lets caches be more efficient and save bandwidth, as a web server does not need to resend a full response if the content has not changed. Additionally, ETags help prevent simultaneous updates of a resource from overwriting each other ("mid-air collisions").
If the resource at a given URL changes, a new ETag value must be generated.
Static content does not change from request to request. The content that gets returned to the Web browser is always the same. Examples of static content include HTML, JPG, or GIF files.
IIS automatically caches static content (such as HTML pages, images, and style sheets), since these types of content do not change from request to request. IIS also detects changes to the files when you make updates, and IIS flushes the cache as needed.
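To make the ETag mechanism concrete, here is a minimal Python sketch of how a server could validate a conditional request. It is not how IIS or Umbraco implement it: the helper names `make_etag` and `respond` are hypothetical, the tag is simply a truncated SHA-256 of the body, and only basic If-None-Match handling is shown.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body (assumed scheme: SHA-256 prefix)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]):
    """Return (status, headers, payload) for a GET with an optional If-None-Match."""
    etag = make_etag(body)
    if if_none_match is not None:
        candidates = []
        for tag in if_none_match.split(","):
            tag = tag.strip()
            # If-None-Match uses weak comparison, so ignore a W/ prefix.
            if tag.startswith("W/"):
                tag = tag[2:]
            candidates.append(tag)
        if "*" in candidates or etag in candidates:
            # The cached copy is still current: send no body.
            return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body
```

A cache that stored the first response would send the stored ETag back in If-None-Match; once the page content changes, the tag changes too and the full 200 response is returned again.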
To enable caching in IIS, you can use the IIS output caching feature:
1) Open IIS Manager and select the site.
2) Select the Output Caching feature from the middle pane.
3) Select Edit Feature Settings from the middle pane.
4) Check the Enable cache and Enable kernel cache boxes and click OK.
If you want to send a blank ETag, you can also add the following to the web.config file:
<configuration>
  <system.webServer>
    <httpProtocol>
      <customHeaders>
        <add name="ETag" value="" />
      </customHeaders>
    </httpProtocol>
  </system.webServer>
</configuration>
Refer to the articles below for more detail:
Caching
To use or not to use ETag, that is the question.
Configure IIS Output Caching
I've read that IIS versions after 7 enable ETags automatically. However, I ran a Pingdom speed test and the report advised me to enable ETags. Either that report is inaccurate, or the information I read about IIS 7 and newer is not correct.

Azure: WebContentNotFound on refreshing page of SPA deployed as Azure Blob Static Website with CDN

I have a SPA (built with Angular) deployed to Azure Blob Storage. Everything works fine when you start from the default domain, but the moment I refresh any of the pages/routes, index.html no longer gets loaded and I get the error "the requested content does not exist" instead.
Googling that term results in 3 results total so I'm at a loss trying to diagnose & fix this.
You can simply configure the error page to index.html in your static website:
Actually, the issue was that I didn't have 404.html defined. Blob storage doesn't know which file to serve for any route other than the root one, so every other route goes to the 404 file. But in a SPA even the 404 goes through the index file. So all I did was set index.html as my 404 file, and all is well.
For me, adding the index.html page as the error document did not help when navigating by URL, as it would still reload the app. I posted an answer elsewhere about using the Angular HashLocationStrategy instead, which does not cause a page reload when the URL is changed manually.
Answer on other SO question
There is a new Static Web Apps solution by Microsoft. It is currently in preview, but I think it is the most convenient way to deploy a SPA in the Azure infrastructure. You can use your custom domain with free SSL, version control, and set up a route to redirect everything to index.html (fallback routes: https://learn.microsoft.com/en-us/azure/static-web-apps/routes), for example. See more details here: https://learn.microsoft.com/en-us/azure/static-web-apps/
Generally, this means you've created a CDN profile and an endpoint, but your content doesn't seem to be available on the CDN: users who attempt to access it via the CDN URL receive an HTTP 404 status code. You can follow the methods in the guide on troubleshooting Azure CDN endpoints that return a 404 status code.
There are several possible causes, including:
The file's origin isn't visible to the CDN.
The endpoint is misconfigured, causing the CDN to look in the wrong place.
The host is rejecting the host header from the CDN.
The endpoint hasn't had time to propagate throughout the CDN.
With a CDN, the edge server fetches a file from the origin server the first time it is requested. On subsequent requests, such as when you refresh the page, the file is served from the CDN cache until its time-to-live (TTL) elapses. See "Manage expiration of Azure Blob storage in Azure CDN" and "Control Azure CDN caching behavior with caching rules".
In this case, first ensure that the website's blob content is publicly available on the Internet. Then verify that your origin settings are properly configured: check that the values of the Origin type and Origin hostname are correct, and that the HTTP and HTTPS ports match the ones your static website is listening on. You can find more details in that troubleshooting link.
TL;DR
You could set the error document (404) to also be your index.html.
This is a quick fix: the response will still return 404, but the app will load and follow your deep link.
This isn't a 'fix'. It's more of a hack. The real fix is to add a CDN with some URL rewrite rules in front of your hosting server. Here is a great guide: https://antbutcher.medium.com/hosting-a-react-js-app-on-azure-blob-storage-azure-cdn-for-ssl-and-routing-8fdf4a48feeb
Rule itself
But to save you the click, the CDN rule using the Standard Microsoft CDN (the cheaper one) is something like this:
(add the condition with the '+ condition' button)
If URL file extension > less than 1 extension > no case transform
(add the action with '+ Add action' button)
source pattern: '/' > Destination: '/index.html' > preserve unmatched path: no
Explanation
I'll attempt to add an explanation, which I don't think anybody else has done nicely.
This rule matches any URL request that isn't for a specific file, e.g.
example.com/xyz
example.com/user/xyz
example.com/tabs/post/12345
or anything else without a file extension (like '.png', '.pdf' or '.html').
For those requests the URL is rewritten to '/index.html', the host file where the SPA's JavaScript handles deep links such as the paths in the example. You therefore do not get a 404, and the app code handles the route gracefully.
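The effect of the rule can be sketched in a few lines of Python. This only models the decision described above; it is not how the Azure rules engine is implemented, `rewrite_spa_path` is a made-up name, and the input is assumed to be a URL path without a query string.

```python
import posixpath


def rewrite_spa_path(url_path: str) -> str:
    """Send extension-less paths to /index.html; leave real file requests alone."""
    last_segment = url_path.rsplit("/", 1)[-1]
    _, extension = posixpath.splitext(last_segment)
    # A non-empty extension means the request targets a concrete file.
    return url_path if extension else "/index.html"
```

With this logic, '/tabs/post/12345' is served by index.html, while '/assets/logo.png' is still fetched from storage as a normal file.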

Cache control header not working

I have set Cache-Control in my response header to Cache-Control: public, max-age=86400. But when I refresh the page or open a new tab, the request always hits my server. The response status is 200, the server logs the request, and when I check chrome://cache/ the request is not in the list. I have already looked at some similar SO questions ("cache-control not working without etag" and "why cache-control:max-age don't work?"), but still no luck. Tested on Chrome 56.
Chrome disables the cache when DevTools is open, or at least Chrome 59 does. Open DevTools, go to Network, and uncheck "Disable cache" at the top. Now you should be able to refresh the page and see it in chrome://cache.
Cache control tells your browser (and proxy servers like Squid) what resources it cannot cache. But it does not force your browser to cache a resource.
I recommend checking the error_log to see whether the request really reaches the backend or is answered by the browser.
In my case, the browser shows 200 OK in the console, but according to the error_log the request never reaches the backend.
Cache-Control response header will not work for page refresh. Try making that request twice without refreshing the page, then you will see it being cached (the request won't reach your server internally).
To achieve what you want you might have to cache your request by accessing localStorage, or just cache it through a back-end caching library.
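The behaviour described in these answers can be modelled with a small Python sketch. It is a deliberate simplification of a browser cache, not Chrome's actual logic: `TinyCache` and its `reload` flag are invented for illustration, and time is passed in as plain seconds.

```python
class TinyCache:
    """Single-entry cache that honours a max-age, loosely like a browser would."""

    def __init__(self, origin, max_age):
        self.origin = origin      # callable that returns the response body
        self.max_age = max_age    # seconds, as in Cache-Control: max-age
        self.entry = None         # (body, stored_at) once something is cached

    def get(self, now, reload=False):
        """Return (body, source) where source is 'cache' or 'server'."""
        if not reload and self.entry is not None:
            body, stored_at = self.entry
            if now - stored_at < self.max_age:
                return body, "cache"   # still fresh, origin is not contacted
        # First request, explicit reload, or expired entry: go to the origin.
        body = self.origin()
        self.entry = (body, now)
        return body, "server"
```

Under these assumptions a second plain request within 86400 seconds is answered from the cache, while a request flagged as a reload always reaches the server, which matches what the question observed.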

Azure CDN update for WebApp

I have set up an Azure CDN endpoint that points to my web app. When I change my style sheet and deploy the web app, the styles update immediately. So is there no requirement to purge in this case? Does the CDN automatically update the styles from the web app?
I am working according to this article
https://azure.microsoft.com/en-in/documentation/articles/cdn-websites-with-cdn/
If the URL of the resource remains the same, the CDN servers (and the browsers) are free to cache them. So, if you are using CDN, you need to force a URL change every time the file content changes (commonly done by adding a version string).
Since it is working for you, either your files are not being served from the CDN at all, or the URL is somehow being updated.
Look at the URL your style sheet is fetched from (network tab in the browser's debugger). Make sure the URL actually points to the CDN and not to your website directly.
If you have an ASP.NET MVC app and use System.Web.Optimization.BundleCollection for the style bundle, it adds a query parameter to the URL embedded in the HTML and changes it when the file contents change. This ensures that stale cached copies of the resources are not used.
See CDN and bundle caching sections at http://www.asp.net/mvc/overview/performance/bundling-and-minification
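The version-string idea can be sketched as follows in Python. This is not the algorithm used by System.Web.Optimization; the helper name `versioned_url`, the `v` parameter and the use of a truncated SHA-256 digest are all assumptions made for illustration.

```python
import hashlib


def versioned_url(url: str, content: bytes) -> str:
    """Append a short content hash so the URL changes whenever the file changes."""
    version = hashlib.sha256(content).hexdigest()[:8]
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}v={version}"
```

Because the CDN and the browser key their caches on the full URL, a stylesheet referenced as, say, https://cdn.example.com/site.css?v=... (a hypothetical address) is fetched again as soon as its content, and therefore its hash, changes.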
No, the CDN does not automatically update the CSS for the web app.
To be safe, you should always purge.
The CDN is a global service: the fact that you saw the CSS update doesn't mean everyone else will see it. Users coming from another IP address might still get the old cached CSS.
Besides, the Cache-Control header also plays a role here.

After renaming xhtml pages, old file getting loaded in jsf

I am having a strange problem which I am not able to debug.
Our application servers need downtime, so we have built a temporary xhtml page for our website which shows a message that the servers are temporarily down. Our plan is that during downtime we will rename our original index.xhtml page to something like index-original.xhtml, and downtime.xhtml to index.xhtml, so that the website shows the "temporarily unavailable" page. We will revert these changes when the downtime is over.
When we tested this renaming, we found that even after renaming the downtime page to index.xhtml and preserving the original index page, the browser was still loading the original index page. We have disabled caching with the following code in the login filter.
res.setHeader("Cache-Control", "no-cache, no-store, must-revalidate"); // HTTP 1.1.
res.setHeader("Pragma", "no-cache"); // HTTP 1.0.
res.setDateHeader("Expires", 0); // Proxies.
res.setHeader("Access-Control-Allow-Origin", "*");
What I found from the server logs is that the request is hitting the server, but somehow the browser is still showing the old page. When I checked in the browser's developer tools, I found that the page is not loaded from cache and is coming from the server. But somehow the server returns the old page after the renaming.
After a server restart, the renamed downtime page gets displayed.
The same server restart is required when we want our original home page back after the downtime is over.
My question is: why is a server restart required for this renaming to work? Shouldn't the server load the renamed file directly, since the request is hitting the server?
I occasionally get the following messages in the Chrome developer tools:
Document was loaded from Application Cache with manifest https://www.google.co.in/_/chrome/newtab/manifest?espv=2&ie=UTF-8
Application Cache Checking event
Application Cache NoUpdate event
But the network section of the developer tools does not show that the page is loaded from cache.
I am thoroughly confused here, as I am a junior developer.
Most likely you have Facelets caching turned on. See the Stack Overflow answer "JSF and automatic reload of xhtml files".
