Why do we get the two headers below when reading files stored in Azure Data Lake using the REST API?
Cache-Control → no-cache, no-cache, no-store, max-age=0 (why does no-cache appear twice?)
Pragma → no-cache
Why are these headers being set, and how can we override them so that we can cache the responses?
Below is my curl request
curl -v -X GET -H "Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIsIng1dCI6IlJyUXF1OXJ5ZEJWUldtY29jdVhVYjIwSEdSTSIsImtpZCI6IlJyUXF1OXJ5ZEJWUldtY29jdVhVYjIwSEdSTSJ9.eyJhdWQiOiJodHRwczovL21hbmFnZW1lbnQuY29yZS53aW5kb3dzLm5ldC8iLCJpc3MiOiJodHRwczovL3N0cy53aW5kb3dzLm5ldC8wNmZhNTQ1Yy01MmQ3LTRiOGMtYjBmNy03MzQ5MTFiNDA0MmEvIiwiaWF0IjoxNDgxMTc3MTE3LCJuYmYiOjE0ODExNzcxMTcsImV4cCI6MTQ4MTE4MTAxNywiYXBwaWQiOiI3MzU0NDBhNy1kM2RmLTQ0YjEtYTk2Yy0wMTlhMzE5NmEwYmYiLCJhcHBpZGFjciI6IjEiLCJpZHAiOiJodHRwczovL3N0cy53aW5kb3dzLm5ldC8wNmZhNTQ1Yy01MmQ3LTRiOGMtYjBmNy03MzQ5MTFiNDA0MmEvIiwib2lkIjoiNTM4OTQyMjQtNDIyNC00MTllLTgxNzAtMzQ3NTQwNGI2NGFlIiwic3ViIjoiNTM4OTQyMjQtNDIyNC00MTllLTgxNzAtMzQ3NTQwNGI2NGFlIiwidGlkIjoiMDZmYTU0NWMtNTJkNy00YjhjLWIwZjctNzM0OTExYjQwNDJhIiwidmVyIjoiMS4wIn0.TrkCayxF0MJbXe7SPc8ZtMx8Aw07Plv0PE1KDAUw1hjHBgmTE95y0ivA2qKpmkvbLkreaGICmzc-4DPNcPBgQFHaiHzS9MoiC6c0mOO_0oOw7FRsbDYnL-P03_MEoHYDas7o2BC88ruZlHHePmoOHqwwXwBOgr6si5RwRmFz7InJpfILqENKD-fk2uWBWfQ1JU3xvmVLUgeoToFK-q7Xsg6eHgW84S4gGF7xuvjz2ogduxmhaV18A80rFFRFk70uHXllFcDylHKXPqgRJ9dfHswZEczxQSQCI2hH5XTn72xMUI0ygIFX4mPjwPQhxPAaygMLxYBOhG5gNm1vBAsJww" "https://signstorage.azuredatalakestore.net/webhdfs/v1/signsdata/test.txt?op=OPEN&api-version=2016-11-01&read=true"
Response
The file contents are returned, and the response headers are:
HTTP/1.1 200 OK
Cache-Control: no-cache, no-cache, no-store, max-age=0
Pragma: no-cache
Transfer-Encoding: chunked
Content-Type: application/octet-stream
Expires: -1
x-ms-request-id: 302fd601-0eca-4db0-a2de-cc2ee5d951d8
x-ms-webhdfs-version: 16.07.18.01
Status: 0x0
X-Content-Type-Options: nosniff
Strict-Transport-Security: max-age=15724800; includeSubDomains
Date: Thu, 08 Dec 2016 06:27:31 GMT
For security reasons, ADLS doesn't support caching, so the service sends the headers you see to prevent the responses from being cached.
Thank you,
Guy
I'm noticing behavior on Varnish 6.5 where it doesn't make backend calls according to the max-age Cache-Control value in the origin response if the object is not requested frequently by clients.
Below is the expected behavior I see for a cached object requested every second; the origin sends a Cache-Control header with a max-age of 20 seconds:
Request 1:
HTTP/2 200
date: Tue, 20 Jul 2021 02:02:02 GMT
content-type: application/json
content-length: 33692
server: Apache/2.4.25 (Debian)
x-ua-compatible: IE=edge;chrome=1
pragma:
cache-control: public, max-age=20
x-varnish: 1183681 1512819
age: 17
via: 1.1 varnish (Varnish/6.5)
vary: Accept-Encoding
x-cache: HIT
accept-ranges: bytes
Request 2:
HTTP/2 200
date: Tue, 20 Jul 2021 02:02:04 GMT
content-type: application/json
content-length: 33692
server: Apache/2.4.25 (Debian)
x-ua-compatible: IE=edge;chrome=1
pragma:
cache-control: public, max-age=20
x-varnish: 891620 1512819
age: 19
via: 1.1 varnish (Varnish/6.5)
vary: Accept-Encoding
x-cache: HIT
accept-ranges: bytes
Request 3:
HTTP/2 200
date: Tue, 20 Jul 2021 02:02:05 GMT
content-type: application/json
content-length: 33692
server: Apache/2.4.25 (Debian)
x-ua-compatible: IE=edge;chrome=1
pragma:
cache-control: public, max-age=20
x-varnish: 1183687 1512819
age: 20
via: 1.1 varnish (Varnish/6.5)
vary: Accept-Encoding
x-cache: HIT
accept-ranges: bytes
Request 4:
HTTP/2 200
date: Tue, 20 Jul 2021 02:02:06 GMT
content-type: application/json
content-length: 33692
server: Apache/2.4.25 (Debian)
x-ua-compatible: IE=edge;chrome=1
pragma:
cache-control: public, max-age=20
x-varnish: 854039 1183688
age: 1
via: 1.1 varnish (Varnish/6.5)
vary: Accept-Encoding
x-cache: HIT
accept-ranges: bytes
You can see that Request #4 above results in a new origin request, with the cached object's backend request ID changing to 1183688.
Now if I wait a long while and make that same request, the cached object's age is quite old and Varnish does not make an origin request to fetch a fresh object:
Request 5 after a while:
HTTP/2 200
date: Tue, 20 Jul 2021 02:10:08 GMT
content-type: application/json
content-length: 33692
server: Apache/2.4.25 (Debian)
x-ua-compatible: IE=edge;chrome=1
pragma:
cache-control: public, max-age=20
x-varnish: 1512998 1183688
age: 482
via: 1.1 varnish (Varnish/6.5)
vary: Accept-Encoding
x-cache: HIT
accept-ranges: bytes
I suppose I could start adding an Expires header from the origin, but I'm looking for an explanation of why Varnish behaves this way when the request is infrequent. Thanks.
TTL header precedence in Varnish
Varnish does check the max-age directive, but other factors can cause the TTL to end up with an unexpected value.
Here's the TTL precedence:
The Cache-Control header's s-maxage directive is checked.
When there's no s-maxage, Varnish will look for max-age to set its TTL.
When there's no Cache-Control header being returned, Varnish will use the Expires header to set its TTL.
When none of the above apply, Varnish will use the default_ttl runtime parameter as the TTL value. Its default value is 120 seconds.
Only then will Varnish enter vcl_backend_response, letting you change the TTL.
Any TTL set in VCL using set beresp.ttl will get the upper hand, regardless of any other value set via the response headers.
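For illustration, here's a minimal VCL sketch of such an override; it pins the TTL to 20 seconds no matter what Cache-Control or Expires say (the backend address is just a placeholder, not part of the original setup):
vcl 4.1;

backend default {
    .host = "127.0.0.1";   # placeholder: your origin's address
    .port = "8080";        # placeholder: your origin's port
}

sub vcl_backend_response {
    # At this point beresp.ttl has already been derived from
    # s-maxage / max-age / Expires / default_ttl (in that order).
    # Setting it explicitly here overrides that header-derived value.
    set beresp.ttl = 20s;
}
With a block like this in place, the origin's caching headers no longer influence how long Varnish keeps the object.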
Your specific situation
The best way to figure out what's going on is by running varnishlog and adding a filter for the URL you want to track.
Here's an example for the homepage:
varnishlog -g request -q "ReqUrl eq '/'"
The output will be extremely verbose, but will contain all the info you need.
Tags that are of particular interest are:
TTL (see https://varnish-cache.org/docs/6.5/reference/vsl.html#varnish-shared-memory-logging)
BerespHeader (specifically the Cache-Control backend response header)
RespHeader (specifically the Cache-Control response header)
Please also have a look at your VCL and check whether or not the TTL is changed by set beresp.ttl =.
What do I need to help you?
In summary, if you want further assistance, please provide your full VCL, as well as a varnishlog extract for the transactions that are giving you the unexpected behavior.
Based on that information, we'll have a pretty good idea what's going on.
I'm trying to integrate the Gmail API into my app in order to send emails to users.
I've created a project in the Google Developers Console, enabled the Gmail API in it, downloaded the credentials as JSON, and followed the instructions provided at https://developers.google.com/gmail/api/quickstart/nodejs
I've also added https://developers.google.com/oauthplayground as a redirect URI for the project.
When I run the code, I get redirected to a consent screen. I choose the account that owns the project, then get redirected to the OAuth playground.
However, when I try to exchange the authorization code for tokens, I receive a 401 Unauthorized.
The full response is:
HTTP/1.1 401 Unauthorized
Content-length: 75
X-xss-protection: 0
X-content-type-options: nosniff
Transfer-encoding: chunked
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Vary: Origin, X-Origin, Referer
Server: scaffolding on HTTPServer2
-content-encoding: gzip
Pragma: no-cache
Cache-control: no-cache, no-store, max-age=0, must-revalidate
Date: Wed, 21 Oct 2020 07:53:18 GMT
X-frame-options: SAMEORIGIN
Alt-svc: h3-Q050=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-T050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
Content-type: application/json; charset=utf-8
{
"error_description": "Unauthorized",
"error": "unauthorized_client"
}
Any help is appreciated
Thanks
I have a server which returns a different response based on the Accept header, e.g. if the Accept header includes "image/webp", a WebP image is served; otherwise a JPEG is served.
We run Varnish at the server level and it handles this correctly, as per the example below:
Request (with image/webp in Accept header):
curl -s -D - -o /dev/null "https://REDACTED/media/tokinoha_bowl-4.jpg?sh=2&fmt=webp,jpg" -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8"
Response (webp image served):
HTTP/2 200
date: Wed, 06 Feb 2019 08:25:05 GMT
content-type: image/webp
access-control-allow-origin: *
cache-control: public, s-maxage=31536000, max-age=31536000
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
strict-transport-security: max-age=31536000; includeSubDomains
vary: Accept-Encoding, Accept-Encoding,Origin
referrer-policy: strict-origin-when-cross-origin
accept-ranges: bytes
content-length: 60028
Request (no webp in Accept header, jpg served):
curl -s -D - -o /dev/null "https://REDACTED/media/tokinoha_bowl-4.jpg?sh=2&fmt=webp,jpg" -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/apng,*/*;q=0.8"
Response:
HTTP/2 200
date: Wed, 06 Feb 2019 08:25:18 GMT
content-type: image/jpeg
access-control-allow-origin: *
cache-control: public, s-maxage=31536000, max-age=31536000
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
strict-transport-security: max-age=31536000; includeSubDomains
vary: Accept-Encoding, Accept-Encoding,Origin
referrer-policy: strict-origin-when-cross-origin
accept-ranges: bytes
content-length: 166991
We have the options shown below set up in the Rules Engine; however, whichever content type is cached first is served for all subsequent requests, irrespective of the request's Accept header.
[Image: Rules Engine settings]
Does anyone know of a way to achieve this?
Thanks in advance!
We had the same problem with Verizon/Edgecast: one URL delivered two different image types (JPEG and WebP) depending on the Accept header. The origin (imgix) correctly sent Vary: Accept, but Edgecast ignored that and cached whatever it got, so browsers without WebP support sometimes received the wrong format.
We solved it with a rule in Edgecast:
[Image: WebP rule]
The query parameter auto is always part of the URL, so it can always be removed from the cache key. The second query parameter, varyWebP, lets us identify these URLs reliably and prevents a collision with URLs that don't have the auto query parameter.
In this case the URL
https://[HOST]/[PATH]?a=1&b=2&c=3&auto=compress,format
creates the same cache key as:
https://[HOST]/[PATH]?a=1&b=2&c=3
That's why the query parameter varyWebP protects us.
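For comparison, the same idea (folding WebP capability into the cache key rather than relying on Vary: Accept) could also be sketched in Varnish. This is a rough VCL sketch, not the Edgecast rule itself; the X-WebP header name and the backend address are just placeholders:
vcl 4.1;

backend default {
    .host = "127.0.0.1";   # placeholder: origin address
    .port = "8080";        # placeholder: origin port
}

sub vcl_recv {
    # Normalize WebP capability into a single flag so the cache stores
    # at most two variants per URL instead of one per distinct Accept value.
    if (req.http.Accept ~ "image/webp") {
        set req.http.X-WebP = "true";
    } else {
        set req.http.X-WebP = "false";
    }
}

sub vcl_hash {
    # Fold the flag into the cache key; the built-in vcl_hash still adds
    # req.url and the Host header afterwards.
    hash_data(req.http.X-WebP);
}
The practical effect is the same as the varyWebP query parameter: two clearly separated cache entries per image URL.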
I'm seeing a situation where requests through CloudFront have a different Cache-Control than my origin. I have Object Caching set to "Use Origin Cache Headers" and (I don't think this is relevant) Compress Objects Automatically set to "No".
I've found that if I change Object Caching to "Customize" and adjust the value, that does in fact change the headers returned from the CDN. That's fine, but I'm curious why, with my existing settings, this header isn't being passed through.
Thanks!
Compressed Request from Origin - shows Cache-Control max-age of 31536000
(05:34 PM) jsharpe#mbp:~ curl -I https://staging.testing.com/assets/application-0d5691ba401c3f5a305fda52745a831376545a605a6c16e50fc838fdaa567e57.css --compressed
HTTP/1.1 200 OK
Server: Cowboy
Date: Wed, 16 Aug 2017 21:34:22 GMT
Connection: keep-alive
Last-Modified: Wed, 16 Aug 2017 05:05:25 GMT
Content-Type: text/css
Cache-Control: public, max-age=31536000
Content-Encoding: gzip
Vary: Accept-Encoding, Origin
Content-Length: 33563
Via: 1.1 vegur
Compressed Request from CDN - shows Cache-Control max-age of 86400
(05:34 PM) jsharpe#mbp:~ curl -I https://staging-cdn.testing.com/assets/application-0d5691ba401c3f5a305fda52745a831376545a605a6c16e50fc838fdaa567e57.css --compressed
HTTP/1.1 200 OK
Content-Type: text/css
Content-Length: 33563
Connection: keep-alive
Server: Cowboy
Date: Wed, 16 Aug 2017 05:07:12 GMT
Last-Modified: Wed, 16 Aug 2017 05:05:25 GMT
Cache-Control: public, max-age=86400
Content-Encoding: gzip
Via: 1.1 vegur, 1.1 7d327ef7e21429ba6a44eb6374c976f3.cloudfront.net (CloudFront)
Vary: Accept-Encoding
Age: 59233
X-Cache: Hit from cloudfront
X-Amz-Cf-Id: TEqKbQ5ZYySY7m8rDft_MAlygEiam6gYvzrXBpS7D2DrBNbVUZ1y3Q==
I have a website which is online. When I use it via a browser everything is OK and the page displays in the browser. When I fetch it as Googlebot (via Webmaster Tools) I get this error:
HTTP/1.1 404 Not Found
Date: Mon, 19 Nov 2012 09:57:37 GMT
Server: Apache
X-Powered-By: PHP/5.2.17
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: symfony=55240a0a341202d07fc96cbc1c1bcca5; path=/
Keep-Alive: timeout=2, max=200
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8
followed by the rest of the HTML code...
The same thing happens when I try to validate it via the W3C validator.
Please help :( I've tried everything :(
The website address is mojaczestochowa.pl
If more info is needed, please let me know.
Try checking the page with web-sniffer and setting the user agent to Googlebot.
Here is the exact query, which will simulate the server's response to the Googlebot crawler:
https://websniffer.cc/?url=http://mojaczestochowa.pl/&uak=9