Azure CDN - Images respond 404 to curl

We have a vendor who sends us photos hosted on Azure CDN (azureedge.net). The photos are available and I can download them, but a curl request returns a 404 roughly 4 out of 5 times, and a HEAD request to get the file size returns a 404 about 7 out of 10 times. On our production server we get a 404 100% of the time. Any idea how we might work around this, or another way to check these files, without the vendor having to fix their issue?
Sample file:
curl -I http://tdrvehicles2.azureedge.net/photos/202008/1419/1850/f253435f-86b1-4cc4-b95c-7756addddad4.jpg
HTTP/1.1 404 Not Found
Pragma: no-cache
Content-Length: 0
Server: Microsoft-IIS/10.0
X-Powered-By: ASP.NET
Cache-Control: max-age=31536000
Expires: Thu, 19 Aug 2021 14:12:54 GMT
Date: Wed, 19 Aug 2020 14:12:54 GMT
Connection: keep-alive
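
A couple of curl variations can help narrow down where the 404 is coming from (this is guesswork, not a confirmed diagnosis; the edge may be varying its response on request headers, or caching an earlier error):

# Send browser-like headers in case the edge/origin treats non-browser requests differently
curl -s -o /dev/null -w "%{http_code}\n" \
     -H "User-Agent: Mozilla/5.0" -H "Accept: image/*,*/*" \
     "http://tdrvehicles2.azureedge.net/photos/202008/1419/1850/f253435f-86b1-4cc4-b95c-7756addddad4.jpg"

# Use a ranged GET instead of HEAD; if it succeeds, the Content-Range header reports the full file size
curl -s -D - -o /dev/null -r 0-0 \
     "http://tdrvehicles2.azureedge.net/photos/202008/1419/1850/f253435f-86b1-4cc4-b95c-7756addddad4.jpg"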

Related

Head Request Returns a 403

When I make a HEAD request for a URL with curl -I I get a 403 error:
HTTP/2 403
content-type: application/xml
date: Wed, 22 Aug 2018 15:50:29 GMT
server: AmazonS3
x-cache: Error from cloudfront
via: 1.1 bfda628fa09b80e042e2c85c85acf1c7.cloudfront.net (CloudFront)
x-amz-cf-id: ryZHGZJ1cKbIa-n1Oxznge9MYl7C2-btqva9W0wQTvooVd6vzAaktw==
When I make the full request with plain curl it works fine. I've verified that the CloudFront behavior is configured with Allowed HTTP Methods: GET, HEAD. What am I missing?
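
Two checks that may narrow this down (both are assumptions, not a confirmed cause, and the URL below is a placeholder for the failing one): CloudFront could be serving a cached error, and a ranged GET can stand in for HEAD so the two methods can be compared:

# Ranged GET as a stand-in for HEAD: same response headers, only one byte of body
curl -s -D - -o /dev/null -r 0-0 "https://dxxxxxxxxxxxx.cloudfront.net/path/to/object"

# Add a throwaway query string to dodge a cached error (only helps if query strings are part of the cache key)
curl -I "https://dxxxxxxxxxxxx.cloudfront.net/path/to/object?nocache=$(date +%s)"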

Why is CloudFront not forwarding my Content-Type header for an svg served from S3?

I'm trying to load a page from CloudFront, and the svg is showing up as a missing image.
When I look into the response headers, I see that when I load the S3 bucket directly, the response contains the proper content type: image/svg+xml
$ curl -I https://s3-eu-west-1.amazonaws.com/pages.ivizone.com/1/19/1509969889/images/kenzo-logo-v2.svg
HTTP/1.1 200 OK
x-amz-id-2: k3+bRpJLp+avBaUWO4VSgB+Djxb+nebnGJs3u6kQ0rMeX95h3XeLHA03XYaWioat+JqNG6x61x8=
x-amz-request-id: 43D8ED0E9EB4490C
Date: Mon, 06 Nov 2017 15:06:13 GMT
Last-Modified: Mon, 06 Nov 2017 14:08:00 GMT
ETag: "4b8f9e399ec9bc166040a2641cf33fb3"
Accept-Ranges: bytes
Content-Type: image/svg+xml
Content-Length: 9484
Server: AmazonS3
However, when I request it through CloudFront, the header is missing:
$ curl -I https://pages.ivizone.com/1/19/1509969889/images/kenzo-logo-v2.svg
HTTP/1.1 200 OK
Content-Length: 9484
Connection: keep-alive
Date: Mon, 06 Nov 2017 14:01:01 GMT
Last-Modified: Mon, 06 Nov 2017 12:04:52 GMT
ETag: "4b8f9e399ec9bc166040a2641cf33fb3"
Server: AmazonS3
X-Cache: RefreshHit from cloudfront
Via: 1.1 ed9babcd75a95b818a6df1694ba95225.cloudfront.net (CloudFront)
X-Amz-Cf-Id: va4AIkAzw7-tNZ-qQo4KA_czM29tFQAzmNH_P0wjYd_TiboSBAyohA==
As a result, this is causing problems rendering my images.
Would anyone know why CloudFront strips the header, and how to fix it?
Thanks!
OK, it looks like I screwed up somewhere. When uploading the SVG image to S3, I had to add the content type string to the S3 object metadata:
"image/svg+xml"
(no spaces)
Once I added this on upload, the image was served properly.
Without that metadata, S3 doesn't send a useful content-type header, so my browser probably interpreted the SVG in an incorrect format. By specifying the header, it knew how to handle it.
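
For reference, the content type can also be set from the command line at upload time; a minimal sketch using the AWS CLI with the bucket and key from the question:

# Upload with an explicit Content-Type
aws s3 cp kenzo-logo-v2.svg s3://pages.ivizone.com/1/19/1509969889/images/kenzo-logo-v2.svg \
    --content-type "image/svg+xml"

# Or fix an object already in the bucket by copying it onto itself with new metadata
aws s3 cp s3://pages.ivizone.com/1/19/1509969889/images/kenzo-logo-v2.svg \
          s3://pages.ivizone.com/1/19/1509969889/images/kenzo-logo-v2.svg \
    --content-type "image/svg+xml" --metadata-directive REPLACE

Since CloudFront had already cached the header-less response (X-Cache: RefreshHit from cloudfront), an invalidation or waiting out the TTL is also needed before the fix shows up at the edge.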

302 redirect loop... with a twist

I have this situation that utterly baffles me:
The setup
a newly set up server (CentOS 7, Apache, MySQL, nothing fancy) that hosts a simple PHP app that I need to interact with from my main app on another server. This service is set up to run on service-name.domain.tld while the main app is on domain.tld (just mentioning it in case it makes any difference).
The problem
For some reason when I try to access the service app from the main server I get an endless loop of 302 redirects.
If I do a curl -D - http://service-name.domain.tld from the main server I get:
HTTP/1.1 302 Found
Date: Wed, 03 Aug 2016 12:30:26 GMT
Server: Apache
Location: http://service-name.domain.tld
Content-Length: 218
Connection: close
Content-Type: text/html; charset=iso-8859-1
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved here.</p>
</body></html>
If I run the same command from my computer, the result is:
HTTP/1.1 200 OK
Date: Wed, 03 Aug 2016 12:31:39 GMT
Server: Apache/2.4.6 (CentOS) PHP/5.6.24
X-Powered-By: PHP/5.6.24
Cache-Control: no-cache
Content-Length: 30
Content-Type: text/html; charset=UTF-8
Welcome.
As it should be. The same happens for any request I make (even static files). I should also mention that the same thing happens with wget and with file_get_contents() in PHP.
I'm really lost right now and don't know where to go from here, so any kind of direction is greatly appreciated.
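
A few things worth checking from the main server (assuming, as a guess, that the difference between the two machines is name resolution or an outgoing proxy rather than the PHP app itself):

# How does the service hostname resolve on the main server? A stale /etc/hosts entry would show up here
getent hosts service-name.domain.tld

# Follow a few redirects and dump headers, to see whether the Location target ever changes
curl -s -L --max-redirs 5 -D - -o /dev/null http://service-name.domain.tld

# Hit the service's IP directly with an explicit Host header (replace <service-ip> with the real address)
curl -s -D - -o /dev/null -H "Host: service-name.domain.tld" http://<service-ip>/

# Is an HTTP proxy configured in the environment that curl, wget and PHP would all pick up?
env | grep -i _proxy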

Caching local files

I use Privoxy or Proxomitron to inject custom JavaScript tags into websites, which then load scripts from a local Python server (on localhost:8888):
... <script type="text/javascript"
src="http://localhost:8888/tweakscript.js"
></script></body>
Some of these script tags load huge third-party JavaScript libraries that are also stored on my computer. They never change but are reloaded every time. I want to cache them.
I tried these headers, without success:
HTTP/1.1 200 OK
Content-type: text/javascript; charset=UTF-8
Vary: Accept-Encoding
Date: Sun, 04 Oct 2015 03:24:14 GMT # the current date
Last-Modified: Fri, 02 Oct 2015 06:34:40 GMT # never changes
Expires: Fri, 01 Apr 2016 03:24:14 GMT # about six months in the future
Cache-Control: public, max-age=15552000 # 15552000 s = 180 days
Access-Control-Allow-Origin: * # CORS
Access-Control-Allow-Methods: GET, POST, OPTIONS
Connection: close
(javascript code here)
How can I make the web browser cache these files?
OK, I found the solution:
When requesting files, the browser adds an If-Modified-Since request header, e.g.:
If-Modified-Since: Tue, 28 Jul 2015 22:48:42 GMT
If the server then responds with ...
HTTP/1.1 304 Not Modified
... then the browser will load the file from cache.
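
To confirm that the server side actually honours the conditional request, you can replay what the browser does with curl (the Last-Modified value is the one from the question):

# First request: note the Last-Modified header in the response
curl -s -D - -o /dev/null http://localhost:8888/tweakscript.js

# Replay with If-Modified-Since set to that value; a correct implementation answers 304 with no body
curl -s -D - -o /dev/null \
     -H "If-Modified-Since: Fri, 02 Oct 2015 06:34:40 GMT" \
     http://localhost:8888/tweakscript.js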

Foursquare venue photos API only occasionally working with client_id/client_secret?

I've found that some venues will only return photos if I use a signed-in user instead of a client_id / client_secret. Is this intentional?
curl -i https://api.foursquare.com/v2/venues/4c36476d93db0f47f6cc1d92/photos?client_id=xxx\&client_secret=xxx\&group=venue\&v=20120304
HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
Cache-Control: no-cache, private, no-store
Content-Type: application/json; charset=utf-8
Date: Mon, 05 Mar 2012 00:28:34 GMT
Expires: Mon, 5 Mar 2012 00:28:34 GMT
Pragma: no-cache
Server: nginx/0.8.52
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4999
Content-Length: 66
Connection: keep-alive
{"meta":{"code":200},"response":{"photos":{"count":0,"items":[]}}}
curl -i https://api.foursquare.com/v2/venues/4c36476d93db0f47f6cc1d92/photos?group=venue\&v=20120304\&oauth_token=xxx\&v=20120304
HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
Cache-Control: no-cache, private, no-store
Content-Type: application/json; charset=utf-8
Date: Mon, 05 Mar 2012 00:29:19 GMT
Expires: Mon, 5 Mar 2012 00:29:19 GMT
Pragma: no-cache
Server: nginx/0.8.52
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 1000
Content-Length: 15311
Connection: keep-alive
{"meta":{"code":200},"notifications":[{"type":"notificationTray","item":{"unreadCount":0}}],"response":{"photos":{"count":14,"items":[lots of images here]}}}
I want to fetch a photo to associate with a given place as a background process, not tied to a specific user. Is it intended that this API only functions correctly for signed-in users?
Looks like there's a bug in userless access to /venues/photos. The team is investigating. The intended behavior is that userless access of that endpoint returns all public photos attached to that venue.
