I have an Azure CDN (Verizon, premium) connected to blob storage. I have 2 rules in place based on step 6 in this tutorial. The rules are designed to force the CDN to serve "index.html" when the root of the CDN is called. They may or may not be relevant to the issue, but they are described in Step 6 as follows:
Make sure the dropdown says “IF” and “Always”.
Click the “+” button next to “Features” twice.
Set the two newly created dropdowns to “URL Rewrite”.
Set all the source and destination dropdowns to the endpoint you created (the value with the endpoint name).
For the first source pattern, set to ((?:[^\?]*/)?)($|\?.*)
For the first destination pattern, set to $1index.html$2
For the second source pattern, set to ((?:[^\?]*/)?[^\?/.]+)($|\?.*)
For the second destination pattern, set to $1/index.html$2
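For readers curious what those two patterns actually do, here is a minimal Python sketch that simulates them locally. This is only an illustration under a simplifying assumption: the Verizon rules engine matches against the full request URL (including the endpoint prefix), while this sketch applies the patterns to a bare path.

import re

# Rule 1: the bare root or any path ending in "/" gets "index.html" appended.
rule1 = re.compile(r'((?:[^\?]*/)?)($|\?.*)')
# Rule 2: extensionless paths such as "about" get "/index.html" appended.
rule2 = re.compile(r'((?:[^\?]*/)?[^\?/.]+)($|\?.*)')

def rewrite(path):
    for pattern, insert in ((rule1, 'index.html'), (rule2, '/index.html')):
        m = pattern.fullmatch(path)
        if m:
            return m.group(1) + insert + m.group(2)
    return path  # direct file requests pass through unchanged

for p in ['', 'blog/', 'about', 'css/main.css']:
    print(repr(p), '->', repr(rewrite(p)))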
I initially uploaded files to blob storage, was able to hit them via the CDN (demonstrating that the above rules worked correctly), and then made changes to the local files (debugging) for uploading to blob storage. After updating all files on blob storage and manually purging the CDN endpoint with the "purge all" option checked, I am served the old files by the CDN and the new files when hitting blob storage directly. This seems to occur for every file (even when hitting a file directly, not just index.html). It still occurs after waiting ~10 hours, clearing the browser cache, and trying browsers never before used to access the CDN.
Does anyone know what may be occurring? Is it cached somewhere between my network and the CDN endpoint? I feel like I'm probably missing something very simple...
edit 1: I have another Verizon (non-premium) CDN connected to the same storage container, and it picks up the correct files after a purge; however, even now (24 hrs later) the premium CDN is not serving the updated files.
edit 2: Called Microsoft support for Azure, and they spent about 6 hours investigating to no avail. We eventually tried purging again, and now the updated files are being sent. Still not sure what the issue was.
After I worked with Microsoft and Verizon Digital Media support for a number of weeks, they finally figured out the solution.
In order to avoid interfering with the purge process, the easiest method is to implement the following "IF" statement before the "Features" portion of your rule:
IF | Request Header Wildcard | Name | User-Agent | Does Not Match | Values | ECPurge/* | Ignore Case (checked)
This IF statement skips the rule altogether for requests made with the purge user agent, allowing those requests to hit the CDN as normal.
I also found that purging each file separately works: for example, put /css/main.css in 'Content path' when purging, instead of using 'purge all'.
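For anyone scripting per-path purges instead of clicking through the portal, a sketch with the azure-mgmt-cdn Python SDK might look like the following. The resource names are placeholders, and the exact method name varies by SDK version (older releases expose purge_content rather than begin_purge_content):

from azure.identity import DefaultAzureCredential
from azure.mgmt.cdn import CdnManagementClient

client = CdnManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Purge only the files that changed rather than using "purge all".
poller = client.endpoints.begin_purge_content(
    resource_group_name="my-rg",       # placeholder
    profile_name="my-cdn-profile",     # placeholder
    endpoint_name="my-endpoint",       # placeholder
    content_file_paths={"content_paths": ["/css/main.css", "/index.html"]},
)
poller.result()  # block until the purge completes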
10/14/2019 - With a similar setup (Azure CDN / blob static site / Verizon), I was having a cache issue too. I happened to have only one URL rewrite rule. What solved it for me was to add a subsequent rule for caching/Max-Age.
Here's how I did that:
Azure Portal -> CDN profiles -> Click your profile (Overview section)
Click the manage icon in the Overview (in the details section/blade). This should open the craptastic Verizon page.
Hover over "Http Large" and then click Rules Engine
Enter a Name/Description in the textbox. For example: "Force Cache Refresh"
Leave If/Always selected
Click the "+" button next to Features
In the new dropdown that appears choose "Force Internal Max-Age" (leave 200 as the response)
Enter a reasonable* value in the field for seconds. (300 for example)
Click the black Add button
It says it could take 4 hours for this rule to work, for me it was about 2.
Lastly, for this method the order of the rules should not matter, but for the rewrite rules above, keep in mind that the rules engine does allow you to move rules up and down in priority. For me it's: HTTPS redirect first, URL rewrite (for React routing) second, and Force Cache Refresh last.
*Note: this is a balance, since purge is not working for me. You want the static app/site to update after you publish, but you don't want tons of needless traffic to refresh the cache either. For my dev server I settled on 5 minutes. I think production will be 1 hour... haven't decided.
Is everything working now?
Your rules are probably more complicated than you need. If the goal is just to have the root "/" rewrite to /index.html, this is the only rule you would need:
If always
URL Rewrite - Source: "/"; Destination "/index.html"
I am using Azure CDN to host a static website I am building.
It's great, other than the fact that when I update my web app the old page is cached and so still shown.
I have added the following cache rule in the rules engine to make it refresh every 60 seconds, but this does nothing and I still get the old content; the only way to get the new content is to go to an incognito browser.
Anyone have any ideas? It's driving me crazy!
Here is a screenshot of the browser dev window when I hit the index.html page. I can't see any cache-control headers here; I would think that the Azure CDN would/should be putting these on. Is that incorrect?
The rule you are modifying controls the "internal max-age". If a file shows up correctly in incognito mode, this rule is working fine. You have to set "external max-age" to control the Cache-Control header.
https://learn.microsoft.com/en-us/azure/cdn/cdn-verizon-premium-rules-engine-reference-features
Looks like it is not Azure CDN which is caching index.html, it is your browser. Ensure that the Cache-Control header is returned correctly by using the developer tools.
https://learn.microsoft.com/en-us/azure/cdn/cdn-manage-expiration-of-cloud-service-content
https://learn.microsoft.com/en-us/azure/cdn/cdn-manage-expiration-of-blob-content
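If the Cache-Control header is missing because it was never set on the blobs, you can set it yourself. A minimal sketch with the azure-storage-blob (v12) Python SDK, assuming a placeholder connection string and container/blob names:

from azure.storage.blob import BlobClient, ContentSettings

blob = BlobClient.from_connection_string(
    conn_str="<storage-connection-string>",  # placeholder
    container_name="$web",
    blob_name="index.html",
)

# set_http_headers replaces all ContentSettings, so restate content_type too.
blob.set_http_headers(content_settings=ContentSettings(
    content_type="text/html",
    cache_control="public, max-age=60",
))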
I have an SPA (built with Angular) deployed to Azure Blob Storage. Everything works fine and well as you navigate from the default domain, but the moment I refresh any of the pages/routes, index.html no longer gets loaded and I instead get the error "the requested content does not exist".
Googling that term yields three results in total, so I'm at a loss trying to diagnose and fix this.
You can simply configure the error page to be index.html in your static website.
Actually, the issue was that I didn't have 404.html defined: blob storage for an SPA doesn't know what file to serve for any route other than the root, so every other route goes to the 404 file. But in an SPA, even the 404 goes through the index file. So all I did was set index.html as my 404 file, and all is well.
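If you would rather script that setting than use the portal, a sketch with the azure-storage-blob (v12) Python SDK might look like this; the connection string is a placeholder:

from azure.storage.blob import BlobServiceClient, StaticWebsite

service = BlobServiceClient.from_connection_string("<storage-connection-string>")
service.set_service_properties(
    static_website=StaticWebsite(
        enabled=True,
        index_document="index.html",
        error_document404_path="index.html",  # SPA fallback: 404s load the app
    )
)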
For me, adding index.html as the error document did not help when navigating by URL, as it would still reload the app. I posted an answer elsewhere about using Angular's HashLocationStrategy instead, which does not cause a page reload when the URL is changed manually.
Answer on other SO question
There is a new Static Web Apps solution from Microsoft. It is currently in preview, but I think it is the most convenient way to use/deploy an SPA in the Azure infrastructure. You can use your custom domain with free SSL, version control, and set up a route to redirect everything to index.html (fallback routes: https://learn.microsoft.com/en-us/azure/static-web-apps/routes), for example. See more details here: https://learn.microsoft.com/en-us/azure/static-web-apps/
Generally, you've created a CDN profile and an endpoint, but your content doesn't seem to be available on the CDN. Users who attempt to access your content via the CDN URL receive an HTTP 404 status code. You can follow these methods for troubleshooting Azure CDN endpoints that return a 404 status code.
There are several possible causes, including:
The file's origin isn't visible to the CDN.
The endpoint is misconfigured, causing the CDN to look in the wrong place.
The host is rejecting the host header from the CDN.
The endpoint hasn't had time to propagate throughout the CDN.
With a CDN, the initial request goes directly to the origin server; afterward, subsequent requests (for example, when you refresh the page) are served from the CDN cache server until the content's time-to-live (TTL) elapses. See Manage expiration of Azure Blob storage in Azure CDN and Control Azure CDN caching behavior with caching rules.
In this case, ensure the website's blob content is publicly available on the Internet. After that, verify that your origin settings are properly configured: check that the values of Origin type and Origin hostname are correct, and that the HTTP and HTTPS ports match what your static website is listening on. You can get more details from the troubleshooting link above.
TL;DR
You could set the error document (404) to also be your index.html.
This is a quick fix that will still return a 404 status code, but it will also actually follow your deep link.
This isn't a 'fix'; it's more of a hack. The real fix is to add a CDN with some URL redirect rules on your hosting server. Here is a great guide: https://antbutcher.medium.com/hosting-a-react-js-app-on-azure-blob-storage-azure-cdn-for-ssl-and-routing-8fdf4a48feeb
Rule itself
But to save you the click, the CDN rule using the standard Microsoft CDN (the cheaper one) is something like this:
(Add the condition with the '+ Condition' button.)
If URL file extension > less than 1 extension > no case transform
(Add the action with the '+ Add action' button.)
Source pattern: '/' > Destination: '/index.html' > Preserve unmatched path: no
Explanation
I'll attempt to add an explanation that I think nobody else has given nicely.
What this rule does is take any URL request that isn't for a direct file, e.g.
example.com/xyz
example.com/user/xyz
example.com/tabs/post/12345
Or ANYTHING without a direct file extension (like '.png' or '.pdf' or '.html')
Then we rewrite the URL to 'index.html'. This is the host file where the SPA's JavaScript handles deep links for paths like those in the example; therefore you will not get a 404, and the code will handle the route gracefully.
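To make the matching behavior concrete, here is a small Python sketch of the same logic; it is only an illustration of the rule's effect, not how the CDN implements it:

import os

def rewrite(path):
    # "URL file extension less than 1" roughly means: no extension present.
    has_extension = os.path.splitext(path.split('?')[0])[1] != ''
    return path if has_extension else '/index.html'

for p in ['/xyz', '/user/xyz', '/tabs/post/12345', '/logo.png']:
    print(p, '->', rewrite(p))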
I have set up an Azure CDN that points to my web app. When I change my style sheet and deploy the web app, the styles update immediately. So is there no requirement to purge in this case? Does the CDN automatically update styles from the web app?
I am working according to this article
https://azure.microsoft.com/en-in/documentation/articles/cdn-websites-with-cdn/
If the URL of the resource remains the same, the CDN servers (and the browsers) are free to cache it. So, if you are using a CDN, you need to force a URL change every time the file content changes (commonly done by adding a version string).
Since it is working for you, either your files are not being served from the CDN at all, or the URL is somehow being updated.
Look at the URL from where your style sheet is getting fetched (network tab in the browser's debugger). Make sure the URL path is actually from the CDN and not your website directly.
If you have an ASP.NET MVC app and you are using System.Web.Optimization.BundleCollection for style bundles, it adds a query parameter to the URL embedded in the HTML and changes it if the file contents change. This ensures that stale cached copies of the resources are not used.
See CDN and bundle caching sections at http://www.asp.net/mvc/overview/performance/bundling-and-minification
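Outside of ASP.NET bundling, the same version-string idea can be applied by hand. A minimal sketch, assuming you can hash each local file at deploy or render time (names and paths are placeholders):

import hashlib

def versioned_url(cdn_base, path, local_file):
    # Derive the version from the content hash, so the URL changes
    # only when the file contents change.
    with open(local_file, 'rb') as f:
        digest = hashlib.md5(f.read()).hexdigest()[:8]
    return f"{cdn_base}{path}?v={digest}"

print(versioned_url("https://mycdn.azureedge.net", "/css/site.css", "css/site.css"))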
No, the CDN does not automatically update the CSS for your web app.
To be safe, you should always purge.
The CDN is a global service; the fact that you saw the CSS update doesn't mean everyone else sees it. Another edge node (IP address) might still have the old CSS cached.
Besides, the Cache-Control header also plays a role here.
I'm trying to set up forwarding in Amazon Route 53. My previous DNS service (Nettica) allowed me to route requests to "aws.example.com" to "https://myaccount.signin.aws.amazon.com/console/".
Is this functionality supported by Route53?
How does Nettica achieve this? Does it insert a special A, CNAME, PTR, or TXT record(s)?
I was running into the exact same problem that Saurav described, but I really needed to find a solution that did not require anything other than Route 53 and S3. I created a how-to guide for my blog detailing what I did.
Here is what I came up with.
Objective
Using only the tools available in Amazon S3 and Amazon Route 53, create a URL Redirect that automatically forwards http://url-redirect-example.vivekmchawla.com to the AWS Console sign-in page aliased to "MyAccount", located at https://myaccount.signin.aws.amazon.com/console/ .
This guide will teach you to set up URL forwarding to any URL, not just ones from Amazon. You will learn how to set up forwarding to specific folders (like "/console" in my example), and how to change the protocol of the redirect from HTTP to HTTPS (or vice versa).
Step 1: Create Your S3 Bucket
Open the S3 management console and click "Create Bucket".
Step 2: Name Your S3 Bucket
Choose a Bucket Name. This step is really important! You must name the bucket EXACTLY the same as the URL you want to set up for forwarding. For this guide, I'll use the name "url-redirect-example.vivekmchawla.com".
Select whatever region works best for you. If you don't know, keep the default.
Don't worry about setting up logging. Just click the "Create" button when you're ready.
Step 3: Enable Static Website Hosting and Specify Routing Rules
In the properties window, open the settings for "Static Website Hosting".
Select the option to "Enable website hosting".
Enter a value for the "Index Document". This object (document) will never be served by S3, and you never have to upload it. Just use any name you want.
Open the settings for "Edit Redirection Rules".
Paste the following XML snippet in its entirety.
<RoutingRules>
  <RoutingRule>
    <Redirect>
      <Protocol>https</Protocol>
      <HostName>myaccount.signin.aws.amazon.com</HostName>
      <ReplaceKeyPrefixWith>console/</ReplaceKeyPrefixWith>
      <HttpRedirectCode>301</HttpRedirectCode>
    </Redirect>
  </RoutingRule>
</RoutingRules>
If you're curious about what the above XML is doing, visit the AWS Documentation for "Syntax for Specifying Routing Rules". A bonus technique (not covered here) is forwarding to specific pages at the destination host, for example http://redirect-destination.com/console/special-page.html. Read about the <ReplaceKeyWith> element if you need this functionality.
Step 4: Make Note of Your Redirect Bucket's "Endpoint"
Make note of the Static Website Hosting "endpoint" that Amazon automatically created for this bucket. You'll need this for later, so highlight the entire URL, then copy and paste it to notepad.
CAUTION! At this point you can actually click this link to check to see if your Redirection Rules were entered correctly, but be careful! Here's why...
Let's say you entered the wrong value inside the <Hostname> tags in your Redirection Rules. Maybe you accidentally typed myaccount.amazon.com, instead of myaccount.signin.aws.amazon.com. If you click the link to test the Endpoint URL, AWS will happily redirect your browser to the wrong address!
After noticing your mistake, you will probably edit the <Hostname> in your Redirection Rules to fix the error. Unfortunately, when you try to click the link again, you'll most likely end up being redirected back to the wrong address! Even though you fixed the <Hostname> entry, your browser is caching the previous (incorrect!) entry. This happens because we're using an HTTP 301 (permanent) redirect, which browsers like Chrome and Firefox will cache by default.
If you copy and paste the Endpoint URL to a different browser (or clear the cache in your current one), you'll get another chance to see if your updated <Hostname> entry is finally the correct one.
To be safe, if you want to test your Endpoint URL and Redirect Rules, you should open a private browsing session, like "Incognito Mode" in Chrome. Copy, paste, and test the Endpoint URL in Incognito Mode and anything cached will go away once you close the session.
Step 5: Open the Route53 Management Console and Go To the Record Sets for Your Hosted Zone (Domain Name)
Select the Hosted Zone (domain name) that you used when you created your bucket. Since I named my bucket "url-redirect-example.vivekmchawla.com", I'm going to select the vivekmchawla.com Hosted Zone.
Click on the "Go to Record Sets" button.
Step 6: Click the "Create Record Set" Button
Clicking "Create Record Set" will open up the Create Record Set window on the right side of the Route53 Management Console.
Step 7: Create a CNAME Record Set
In the Name field, enter the hostname portion of the URL that you used when naming your S3 bucket. The "hostname portion" of the URL is everything to the LEFT of your Hosted Zone's name. I named my S3 bucket "url-redirect-example.vivekmchawla.com", and my Hosted Zone is "vivekmchawla.com", so the hostname portion I need to enter is "url-redirect-example".
Select "CNAME - Canonical name" for the Type of this Record Set.
For the Value, paste in the Endpoint URL of the S3 bucket that we noted back in Step 4.
Click the "Create Record Set" button. Assuming there are no errors, you'll now be able to see a new CNAME record in your Hosted Zone's list of Record Sets.
Step 8: Test Your New URL Redirect
Open up a new browser tab and type in the URL that we just set up. For me, that's http://url-redirect-example.vivekmchawla.com. If everything worked right, you should be sent directly to an AWS sign-in page.
Because we used the myaccount.signin.aws.amazon.com alias as our redirect's destination URL, Amazon knows exactly which account we're trying to access, and takes us directly there. This can be very handy if you want to give a short, clean, branded AWS login link to employees or contractors.
Conclusions
I personally love the various AWS services, but if you've decided to migrate DNS management to Amazon Route 53, the lack of easy URL forwarding can be frustrating. I hope this guide helped make setting up URL forwarding for your Hosted Zones a bit easier.
If you'd like to learn more, please take a look at the following pages from the AWS Documentation site.
Example: Setting Up a Static Website Using a Custom Domain
Configure a Bucket for Website Hosting
Creating a Domain that Uses Route 53
Creating, Changing, and Deleting Resource Records
AWS support pointed out a simpler solution. It's basically the same idea proposed by Vivek M. Chawla, with a simpler implementation.
AWS S3:
Create a Bucket named with your full domain, like aws.example.com
On the bucket properties, select Redirect all requests to another host name and enter your URL:
https://myaccount.signin.aws.amazon.com/console/
AWS Route53:
Create a record set of type A. Change Alias to Yes. Click on the Alias Target field and select the S3 bucket you created in the previous step.
Reference: How to redirect domains using Amazon Web Services
AWS official documentation: Is there a way to redirect a domain to another domain using Amazon Route 53?
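If you prefer to script the S3 side, a hedged boto3 sketch might look like this (bucket and host names are placeholders). Note that RedirectAllRequestsTo accepts only a host name and protocol; to append a path such as /console/, you would fall back to routing rules like the XML shown earlier:

import boto3

s3 = boto3.client("s3", region_name="us-east-1")
s3.create_bucket(Bucket="aws.example.com")  # must exactly match the domain

# Equivalent of "Redirect all requests to another host name" in the console.
s3.put_bucket_website(
    Bucket="aws.example.com",
    WebsiteConfiguration={
        "RedirectAllRequestsTo": {
            "HostName": "myaccount.signin.aws.amazon.com",
            "Protocol": "https",
        }
    },
)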
I was able to use nginx to handle the 301 redirect to the aws signin page.
Go to your nginx conf folder (in my case it's /etc/nginx/sites-available in which I create a symlink to /etc/nginx/sites-enabled for the enabled conf files).
Then add a redirect server block:
server {
    listen 80;
    server_name aws.example.com;
    return 301 https://myaccount.signin.aws.amazon.com/console;
}
If you are using nginx, you will most likely have additional server blocks (virtual hosts in Apache terminology) to handle your zone apex (example.com) or however you have it set up. Make sure that you have one of them set to be your default server.
server {
    listen 80 default_server;
    server_name example.com;
    # rest of config ...
}
In Route 53, add an A record for aws.example.com and set the value to the same IP used for your zone apex.
Update
While my original answer below is still valid and might help explain why DNS-based URL forwarding is not available via Amazon Route 53 out of the box, I highly recommend checking out Vivek M. Chawla's utterly smart indirect solution via the since-introduced Amazon S3 support for website redirects, which achieves a self-contained, serverless, and thus free solution within AWS alone.
Implementing an automated solution to generate such redirects is left as an exercise for the reader, but please pay tribute to Vivek's epic answer by publishing your solution ;)
Original Answer
Nettica must be running a custom redirection solution for this, here is the problem:
You could create a CNAME alias like aws.example.com for myaccount.signin.aws.amazon.com, however, DNS provides no official support for aliasing a subdirectory like console in this example.
It's a pity that AWS doesn't appear to simply do this by default when hitting https://myaccount.signin.aws.amazon.com/ (I just tried), because it would solve your problem right away and make a lot of sense in the first place; besides, it should be pretty easy to configure on their end.
For that reason a few DNS providers have apparently implemented a custom solution to allow redirects to subdirectories; I venture the guess that they are basically facilitating a CNAME alias for a domain of their own and are redirecting again from there to the final destination via an immediate HTTP 3xx Redirection.
So to achieve the same result, you'd need to have an HTTP service running that performs these redirects, which is of course not the simple solution one would hope for. Maybe/hopefully someone can come up with a smarter approach still though.
If you're still having issues with the simple approach, create an empty bucket, then select Redirect all requests to another host name under Static website hosting in the bucket properties via the console. Ensure that you have set two A records in Route 53, one for final-destination.com and one for redirect-to.final-destination.com. The settings for each of these will be identical, but the name will be different, so that it matches the names that you set for your buckets/URLs.
I use Amazon CloudFront to host all my site's images and videos, to serve them faster to my users, who are pretty scattered across the globe. I also apply pretty aggressive forward caching to the elements hosted on CloudFront, setting Cache-Control to public, max-age=7776000.
I've recently discovered to my annoyance that third party sites are hotlinking to my Cloudfront server to display images on their own pages, without authorization.
I've configured .htaccess to prevent hotlinking on my own server, but haven't found a way of doing this on CloudFront, which doesn't seem to support the feature natively. And, annoyingly, Amazon's bucket policies, which could be used to prevent hotlinking, have effect only on S3; they have no effect on CloudFront distributions [link]. If you want to take advantage of the policies, you have to serve your content from S3 directly.
Scouring my server logs for hotlinkers and manually changing the file names isn't really a realistic option, although I've been doing this to end the most blatant offenses.
You can forward the Referer header to your origin:
Go to CloudFront settings
Edit Distributions settings for a distribution
Go to the Behaviors tab and edit or create a behavior
Set Forward Headers to Whitelist
Add Referer as a whitelisted header
Save the settings in the bottom right corner
Make sure to handle the Referer header on your origin as well.
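What "handle the Referer header on your origin" could look like, as a minimal self-contained sketch (the domains are placeholders; a real deployment would do this in middleware or web-server config, and note that rejecting empty referers breaks the HTML5 media cases described further down):

from wsgiref.simple_server import make_server

ALLOWED_REFERERS = ("https://example.com/", "https://www.example.com/")

def app(environ, start_response):
    referer = environ.get("HTTP_REFERER", "")
    if not referer.startswith(ALLOWED_REFERERS):
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"Hotlinking not permitted\n"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"...media bytes would be served here...\n"]

if __name__ == "__main__":
    make_server("", 8000, app).serve_forever()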
We had numerous hotlinking issues. In the end, we created CSS sprites for many of our images, either adding white space to the bottom/sides or combining images together.
We displayed them correctly on our pages using CSS, but any hotlinks would show the images incorrectly unless they copied the CSS/HTML as well.
We've found that they don't bother (or don't know how).
The official approach is to use signed URLs for your media. For each media file that you want to distribute, you can generate a specially crafted URL that works within a given constraint of time and source IPs.
One approach for static pages is to generate temporary URLs for the media included in that page, valid for twice the page's caching time. Let's say your page's caching time is one day; every two days, the links would be invalidated, which obligates the hotlinkers to update their URLs. It's not foolproof, as they can build tools to get the new URLs automatically, but it should prevent most people.
If your page is dynamic, you don't need to worry about trashing your page's cache, so you can simply generate URLs that only work for the requester's IP.
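A sketch of generating such a URL with botocore's CloudFrontSigner (the key-pair ID, key file, and distribution URL are placeholders; requires the cryptography package):

from datetime import datetime, timedelta

from botocore.signers import CloudFrontSigner
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

def rsa_signer(message):
    # Sign with the private key of a CloudFront trusted key pair.
    with open("cloudfront_private_key.pem", "rb") as f:
        key = serialization.load_pem_private_key(f.read(), password=None)
    return key.sign(message, padding.PKCS1v15(), hashes.SHA1())

signer = CloudFrontSigner("KXXXXXXXXXXXXX", rsa_signer)  # key-pair ID placeholder

# Canned policy: the URL stops working after the expiry date
# (twice the page's caching time, per the suggestion above).
url = signer.generate_presigned_url(
    "https://d111111abcdef8.cloudfront.net/video.mp4",
    date_less_than=datetime.utcnow() + timedelta(days=2),
)
print(url)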
As of Oct. 2015, you can use AWS WAF to restrict access to Cloudfront files. Here's an article from AWS that announces WAF and explains what you can do with it. Here's an article that helped me setup my first ACL to restrict access based on the referrer.
Basically, I created a new ACL with a default action of DENY. I added a rule that checks the end of the referer header string for my domain name (lowercase). If it passes that rule, it ALLOWS access.
After assigning my ACL to my Cloudfront distribution, I tried to load one of my data files directly in Chrome and I got this error:
As far as I know, there is currently no solution, but I have a few possibly relevant, possibly irrelevant suggestions...
First: Numerous people have asked this on the Cloudfront support forums. See here and here, for example.
Clearly AWS benefits from hotlinking: the more hits, the more they charge us for! I think we (Cloudfront users) need to start some sort of heavily orchestrated campaign to get them to offer referer checking as a feature.
Another temporary solution I've thought of is changing the CNAME I use to send traffic to cloudfront/s3. So let's say you currently send all your images to:
cdn.blahblahblah.com (which redirects to some cloudfront/s3 bucket)
You could change it to cdn2.blahblahblah.com and delete the DNS entry for cdn.blahblahblah.com
As a DNS change, that would knock out all the people currently hotlinking before their traffic got anywhere near your server: the DNS entry would simply fail to look up. You'd have to keep changing the cdn CNAME to make this effective (say once a month?), but it would work.
It's actually a bigger problem than it seems because it means people can scrape entire copies of your website's pages (including the images) much more easily - so it's not just the images you lose and not just that you're paying to serve those images. Search engines sometimes conclude your pages are the copies and the copies are the originals... and bang goes your traffic.
I am thinking of abandoning Cloudfront in favor of a strategically positioned, super-fast dedicated server (serving all content to the entire world from one place) to give me much more control over such things.
Anyway, I hope someone else has a better answer!
This question mentioned image and video files.
Referer checking cannot be used to protect multimedia resources from hotlinking, because some mobile browsers do not send the Referer header when requesting an audio or video file played using HTML5.
I am sure of this for Safari and Chrome on iPhone, and Safari on Android.
Too bad! Thank you, Apple and Google.
How about using signed cookies? Create a signed cookie using a custom policy, which also supports the various kinds of restrictions you want to set, and it supports wildcards.
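A sketch of the signed-cookie variant with a wildcard custom policy, again via botocore (key-pair ID, key file, and domain are placeholders; the cookie encoding follows CloudFront's URL-safe base64 convention):

import base64
from datetime import datetime, timedelta

from botocore.signers import CloudFrontSigner
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

def rsa_signer(message):
    with open("cloudfront_private_key.pem", "rb") as f:
        key = serialization.load_pem_private_key(f.read(), password=None)
    return key.sign(message, padding.PKCS1v15(), hashes.SHA1())

def cf_b64(data):
    # CloudFront's URL-safe base64: '+' -> '-', '=' -> '_', '/' -> '~'
    return base64.b64encode(data).decode().translate(str.maketrans("+=/", "-_~"))

signer = CloudFrontSigner("KXXXXXXXXXXXXX", rsa_signer)
policy = signer.build_policy(
    "https://d111111abcdef8.cloudfront.net/videos/*",  # wildcard resource
    date_less_than=datetime.utcnow() + timedelta(hours=12),
)

# Set these three cookies on your own domain's responses; CloudFront
# then validates them on each request to the distribution.
cookies = {
    "CloudFront-Policy": cf_b64(policy.encode("utf8")),
    "CloudFront-Signature": cf_b64(rsa_signer(policy.encode("utf8"))),
    "CloudFront-Key-Pair-Id": "KXXXXXXXXXXXXX",
}
print(cookies)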