Azure Verizon CDN - 100% Cache CONFIG_NOCACHE

I set up an Azure Verizon Premium CDN a few days ago as follows:
Origin: An Azure web app (.NET MVC 5 website)
Settings: Custom Domain, no geo-filtering
Caching Rules: standard-cache (doesn't care about parameters)
Compression: Enabled
Optimized for: Dynamic site acceleration
Protocols: HTTP, HTTPS, custom domain HTTPS
Rules: Force HTTPS via Rules Engine (if request scheme = http, 301 redirect to https://{customdomain}/$1)
So - this CDN has been running for a few days now, but the ADN reports are saying that nearly 100% (99.36%) of the cache status is "CONFIG_NOCACHE" (Description: "The object was configured to never be cached in accordance with customer-specific configurations residing on the edge servers, so the response was served via the origin server.") A few (0.64%) of them are "NONE" (Description: "The cache was bypassed entirely for this request. For instance, the request was immediately rejected by the token auth module, or the client request method used an uncacheable request method such as "PUT".") Also, in the "Cache Hit" report, it says "0 hits, 0 misses" for every day. Nothing is coming through the "HTTP Large" side, only "ADN".
I couldn't find these exact messages while searching around, but I've tried:
Updating the cache-control header to max-age, public (i.e. cache-control: public,max-age=1209600) - see the sketch after this list for how this kind of header can be set on the origin
Updating the cache-control header to max-age (cache-control: max-age=1209600)
Updating the expires header to a date way in the future (expires: Tue, 19 Jan 2038 03:14:07 GMT)
Using different browsers so the request cache info differs. In Chrome, the request header is "cache-control: no-cache"; in Firefox, it's "Cache-Control: max-age=0". In any case, I'd assume most users on the website wouldn't send these same headers, right?
Refreshing the page a bunch of times, and looking at the real time report to see hits/misses/cache statuses, and it shows the same thing - CONFIG_NOCACHE for almost everything.
Tried running a "worldwide" speed test on https://www.dotcom-tools.com/website-speed-test.aspx, but that had the same result - a bunch of "NOCACHE" hits.
Tried adding ADN rules to set the internal and external max age to 864000 sec (10 days).
Tried adding an ADN rule to ignore "no-cache" requests and just return the cached result.
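(For reference, here is a minimal sketch of one way headers like those in the first two bullets can be emitted from an ASP.NET MVC 5 origin; the attribute name is made up for illustration and this is not necessarily how the site above is actually wired up.)

// Illustrative only: emits "Cache-Control: public, max-age=1209600" plus a matching Expires
// header from an MVC 5 action. Apply as [LongCache] on a controller or action method.
using System;
using System.Web;
using System.Web.Mvc;

public class LongCacheAttribute : ActionFilterAttribute
{
    public override void OnResultExecuting(ResultExecutingContext filterContext)
    {
        HttpCachePolicyBase cache = filterContext.HttpContext.Response.Cache;
        cache.SetCacheability(HttpCacheability.Public);          // Cache-Control: public
        cache.SetMaxAge(TimeSpan.FromSeconds(1209600));          // max-age=1209600 (14 days)
        cache.SetExpires(DateTime.UtcNow.AddSeconds(1209600));   // far-future Expires header
        base.OnResultExecuting(filterContext);
    }
}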
So, the message for "NOCACHE" says it's a node configuration issue... but I haven't really even configured it! I'm so confused. It could also be an application issue, but I feel like I've tried all the different permutations of "cache-control" that I can. Here's an example of one file that I'd expect to be cached:
Ultimately, I would hope that most of the requests are being cached, so I'd see most of the requests be "TCP Hit". Maybe that's incorrect? Thanks in advance for your help!

So, I eventually figured out this issue. Apparently the Azure Verizon Premium CDN ADN platform has "bypass cache" enabled by default.
To disable this behavior, you need to add additional features to your caching rules in the Rules Engine.
Example:
IF Always
Features:
Bypass Cache Disabled
Force Internal Max-Age Response 200 864000 Seconds
Ignore Origin No-Cache 200

Related

Cached content in Azure CDN from Microsoft expires too early

We are using a Standard Microsoft Azure CDN to serve images for a web application. These images are requested as /api/img?param1=aaa&param2=bbb, so we cache every unique URL. The cache duration is 7 days. We also override the "Cache-Control" header so that the image is only cached for 1 hour by the client browser.
The problem is that the images do not stay in cache for 7 days. The first day after the images have been requested, they seem to be in the CDN (I check the X-Cache header and it returns "TCP_HIT"); however, if I make the same requests 2-3 days later, around 25% of the images are not cached anymore (the X-Cache header is "TCP_MISS"). The origin server receives and logs these requests, so I am sure they bypass the CDN.
Is there any explanation for this? Do I have to set additional parameters for images to be cached correctly?
We use the following settings:
Caching rules "Cache every unique URL"
Rules Engine:
if URL path begins with /api/img
then Cache expiration: [cache behaviour] Override, [duration] 7 days
and then Modify response header: Overwrite, "Cache-Control", "public, max-age=3600"
From some folks on the CDN Product Group:
For all but the Verizon Premium SKU, the max-age and the cache expiration are one and the same thing, so the Cache-Control override in the last rule above (max-age=3600) overrides the 7-day cache expiration.
The CDN reserves the right to flush entries from the CDN if they are not used - cache items are evicted using an LRU algorithm.
The Verizon Premium SKU offers the ability to have two different age values, one for browser-to-edge (the "External Max-Age") and one for edge-to-source (the original expiration time, or the forced override time - see the docs).
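To make that split concrete, a hypothetical Premium Rules Engine rule using both knobs (the durations here are made up, in the same style as the example further up) could look like:
IF: Always
Features:
External Max-Age: 3600 Seconds (browser-to-edge: what clients are told via Cache-Control)
Force Internal Max-Age: Response 200, 604800 Seconds (how long the edge keeps the object before revalidating against the origin)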

Cache in Azure Front Door service

I have set up Azure Front Door service for three different geographies. Users are routed to the nearest data centre, which is working as expected. Currently, I am setting up caching under the routing rules. I need to exclude some files that should not be cached, but I do not see any configuration that allows excluding certain files from caching.
Below is the screenshot of the configuration setting.
https://imgur.com/biy9tjj
Azure Front Door matches a request to a routing rule and then takes the actions defined in that rule. So if you need to exclude some files from being cached, you could create a separate routing rule with PATTERNS TO MATCH set to the paths of the specific files that should not be cached, and then disable caching under ROUTE DETAILS in that separate routing rule.
Ref: How Front Door matches requests to a routing rule
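As a concrete sketch of that suggestion (the paths are made up), the setup would be two routing rules along these lines:
Routing rule A: PATTERNS TO MATCH /files/uncached/* -> ROUTE DETAILS: Caching disabled
Routing rule B: PATTERNS TO MATCH /* -> ROUTE DETAILS: Caching enabled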
While I think Nancy Xiong's answer would work, I don't think it is the correct approach.
Azure Front Door respects Cache-Control headers, so make sure the web server serving the files you don't want cached returns a proper value. A decent starting point might be Cache-Control: no-cache, but check out the docs here for the details and options.
And talking about Azure Front Door - it claims that it respects these values (docs here):
Cache-Control response headers that indicate that the response won’t be cached such as Cache-Control: private, Cache-Control: no-cache, and Cache-Control: no-store are honored. However, if there are multiple requests in-flight at a POP for the same URL, they may share the response. If no Cache-Control is present the default behavior is that AFD will cache the resource for X amount of time where X is randomly picked between 1 to 3 days.
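So, for the files that should not be cached, an origin response along these lines (status line and headers abbreviated, content type made up) would keep Front Door from caching them:
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Cache-Control: no-store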

Handling of requests with the Accept-Encoding header in CloudFront vs requests without the Accept-Encoding header

I have a CloudFront distribution to cache my images. My origin server is NOT S3; it's some server I run.
I use these images on my website (taking advantage of CF caching). Now to explain the problem, let's assume that on my home page I am using an image called banner.png.
I visit my home page, let's say from Chrome, for the first time - for banner.png it's a cache miss, so it gets fetched from the origin and cached in CF.
After this I visit my page from Firefox, Opera, Chromium, and GET "banner.png" using Postman - all of these get me the file from the CF cache.
Now I GET "banner.png" using Insomnia (another REST client) - this time CF doesn't serve it from cache; it goes back to the origin to get the image and replies with "x-cache: RefreshHit from cloudfront".
The difference between these two sets of clients is that the first set sends the "Accept-Encoding: gzip" header in the request and the second does not.
In my CF behaviour:
"Cache Based on Selected Request Headers" = None
"Compress Objects Automatically" = No
Any pointers?
CloudFront keeps two different cached copies based on Accept-Encoding:
One if the request header contains Accept-Encoding: gzip
One if Accept-Encoding has any other value, or the header is absent.
You can test this using curl: make one request without Accept-Encoding and a second request with Accept-Encoding: gzip, and you'll see a Miss from CloudFront on the second even though the first was just cached - this is expected with CloudFront.
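For example (the distribution hostname is a placeholder), something like the following shows the two variants being cached independently:
curl -s -D - -o /dev/null https://dxxxxxxxx.cloudfront.net/banner.png | grep -i x-cache
curl -s -D - -o /dev/null -H "Accept-Encoding: gzip" https://dxxxxxxxx.cloudfront.net/banner.png | grep -i x-cache
The second request can come back as a Miss even if the first was just a Hit, because each Accept-Encoding variant has its own cache entry.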
The reason is that CloudFront supports only gzip compression, and it takes this header into consideration to know whether it needs to compress the response or not.
However, your problem seems different: you're seeing a RefreshHit from CloudFront, which happens when CloudFront's TTL/max-age expires and CloudFront makes a conditional GET to the origin to check whether the content has been modified.
Ideally, it should be a Miss from CloudFront if no Accept-Encoding header is present.

Enabling HSTS without forcibly redirecting to HTTPS?

This seems like a silly question but I am wondering:
How is HSTS deployed without forcibly redirecting users to HTTPS?
How is HTTP content still served from the same domain as one using HSTS? (Either an entire site or mixed content)
Why would anyone do this?
I'm reading from the EFF site and it appears that that was done:
We recently enabled HSTS for eff.org. It took less than an hour to set up, and we found a way to do it without forcibly redirecting users to HTTPS, so we can state an unequivocal preference for HTTPS access while still making the site available in HTTP. It worked like a charm and a significant fraction of our users are now automatically accessing our site in HTTPS, perhaps without even knowing it.
As I'm aware, HSTS works by sending an HTTP header:
Strict-Transport-Security: max-age=31536000
So if I access a page on https://example.net/ that sends that header, all future requests to the domain example.net for the next 31536000 seconds will use HTTPS, and if the (response?) is HTTP then the browser will show giant red warnings.
Can someone please clarify this for me? Is my understanding of HSTS accurate or am I missing something?
HSTS headers should only be issued over HTTPS and only enforced by a User Agent if they are received over HTTPS. A User Agent should disregard the HSTS header sent over HTTP as an attacker could have maliciously injected it.
This means the site can continue to serve over HTTP and the user can continue browsing over HTTP at their choice. However, if they manually insert https:// into the address bar, they will receive the HSTS header and the User Agent will then treat it as an HSTS host.
This is a good way of progressively introducing enforced HTTPS traffic without forcing it all there on day one. As users become aware of the secure option the HTTPS traffic will increase and remain thanks to HSTS. Perhaps the EFF is introducing it gradually and may just flick the big 'HTTPS Switch' one day once they are satisfied they can accommodate it.
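To make the mechanics concrete, a rough sketch of the two cases described above (responses abbreviated, using the domain from the question):
GET http://example.net/ -> 200 OK over plain HTTP, with no effective Strict-Transport-Security header (it would be ignored here anyway)
GET https://example.net/ -> 200 OK with Strict-Transport-Security: max-age=31536000
After the HTTPS response is seen, the browser rewrites later http://example.net/ requests to https:// for the next max-age seconds, while users who only ever browse over HTTP keep getting plain HTTP.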

Varnish caching - age gets reset

I have a very simple site and am setting up varnish cache on it. The server is nginx.
The cache seems to get automatically purged after 120 seconds: when I visit the site, I see the Age header being reset.
Can anyone point me towards how to change this so pages stay cached indefinitely, or until I manually purge Varnish?
You did not mention your OS or distribution, but for example on CentOS /etc/sysconfig/varnish sets the defaults for Varnish. Amongst those defaults is VARNISH_TTL=120, which sets the default TTL to 120 seconds.
If you only wish to set a high TTL for all objects, you can just edit the default one in /etc/sysconfig/varnish.
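For example (the new value is arbitrary; pick whatever suits your content), the relevant line in /etc/sysconfig/varnish would change from:
VARNISH_TTL=120
to something like:
VARNISH_TTL=604800
and the varnish service has to be restarted for the new default TTL to take effect, since the value is passed to varnishd at startup.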
If the backend sends age/expiration headers to Varnish, Varnish will treat them as the real expiration date, just like a web browser would, and will purge its content when the header expires.
You should make sure that the backend doesn't send Cache-Control headers to Varnish, and that only Varnish adds Cache-Control headers when sending data to the browsers.
