I have a classic ASP website that posts a form to a page that then generates and streams an Excel file out to users. Actually, it's a raw HTML table which I send with:
Response.AddHeader "Content-Disposition", "inline; filename=file.xls"
Response.AddHeader "Content-Type", "application/vnd.ms-excel"
The intranet website is secured via Integrated Windows Authentication; no other access mode is checked. The user logs in with their network password and all is well.
Now, when the user submits the form, this action results in two more login dialogs. You can actually cancel out of both and still open the file. In fact, if you put in your credentials, it requires you to enter them four times! Checking "Remember password" doesn't remove the need to log in. Also, this happens even if the URLs are listed in IE's Trusted Sites zone.
Any ideas on what I can do to minimize this?
PS: Not sure, but this seems to be a relatively recent issue, meaning a more recent version of IE (7/8), Office (2007+) and/or Windows (Vista/7).
UPDATE: Using Fiddler, I can see that something called "User-Agent: Microsoft-WebDAV-MiniRedir/6.1.7600" is attempting to connect, and getting 401.2'ed. Is IE offloading the download to something else that isn't authenticating properly?
UPDATE2: Doubly-interestingly, Firefox does none of this. It receives and interprets things properly:
HTTP/1.1 200 OK
Date: Mon, 21 Feb 2011 19:25:26 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Content-Disposition: inline; filename="SavingsReport_4Q2010.xls"
Content-Type: application/vnd.ms-excel
Content-Length: 111851
Cache-control: private
Old question but here goes in case someone else stumbles on it.
Office tries to authenticate with the server, through an OPTIONS request, in order to access the file, as explained in this article.
I don't entirely understand why, but changing your Content-Disposition from inline to attachment avoids the authentication prompts in most environments.
Be careful, as this seems to have an effect on the file's name (on Windows XP with IE7): e.g. a file named file name.xls would be opened as file_name.xls.
Here is example classic ASP code (setting the content type as well, for completeness):
Response.ContentType = "application/vnd.ms-excel"
Response.AddHeader "Content-Disposition", "attachment; filename=MyReport.xls"
Related
My well-answered question here on SO has led to another question.
The Azure account I mention in that original question is not managed by us. Here is an example of the headers received when requesting its blob files:
HTTP/1.1 200 OK
Content-MD5: R57initOyxxq6dVKtoAx3w==
Content-Type: application/octet-stream
Date: Wed, 02 Mar 2016 14:32:35 GMT
Etag: 0x8D3180DA8EBF063
Last-Modified: Fri, 08 Jan 2016 09:25:33 GMT
Server: Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
x-ms-blob-type: BlockBlob
x-ms-lease-status: unlocked
x-ms-request-id: 19d0a689-0001-0039-2990-74a33e000000
x-ms-version: 2009-09-19
Content-Length: 263748
So the files are being returned as application/octet-stream which I understand effectively means unknown file type. When I hit the URL in a browser I'm prompted to download, even when the file is an image.
Ultimately the files in this blob storage will be used in 2 ways. Some are images which will be used for website imagery. Others are 'assets' (mainly PDFs) which need to be downloaded as opposed to opened in the browser.
So my question is, if I leave the blob storage as is, with all assets being returned as application/octet-stream, are there any negative implications when using its images as web content and linking to its PDFs for download? e.g. are there browsers which will behave differently?
In other words, what advantage would there be if I insisted the headers were changed to...
Content-Type: image/png
Content-Disposition: inline; filename="picture.png"
...and...
Content-Type: application/pdf
Content-Disposition: attachment; filename="file.pdf"
So my question is, if I leave the blob storage as is, with all assets being returned as application/octet-stream, are there any negative implications when using its images as web content and linking to its PDFs for download? e.g. are there browsers which will behave differently?
Yes, browsers behave differently. For example, if you have a PNG file stored in Azure Blob Storage with content type application/octet-stream, in Chrome it will always be downloaded but in Internet Explorer it will be displayed. So in essence you're left at the browser's mercy as to how it deals with the content based on its type.
Ultimately the files in this blob storage will be used in 2 ways. Some are images which will be used for website imagery. Others are 'assets' (mainly PDFs) which need to be downloaded as opposed to opened in the browser.
It is recommended that you set the content type properly for any content you want to display in the browser, and even for content that will be downloaded. For downloadable content, there are 2 scenarios you may want to consider:
Content will always be downloaded: If the content will always be downloaded, you could set the content-disposition property of the blob to attachment; filename="file.pdf".
Content sometimes will be downloaded but sometimes it will be presented inline: In this scenario, don't set the content-disposition property of the blob, and set its content type properly. When the content is accessed, by default it will be displayed inline. When you need to force a download, create a Shared Access Signature (SAS) on the blob with at least Read permission and override the content-disposition property in the SAS. When someone accesses the blob using the SAS URL, the content will always be downloaded (see the sketch below).
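Here is a minimal sketch of both steps using the azure-storage-blob Python SDK (v12, which is newer than the API version shown in the question); the account name, key, container and blob names are all placeholders:
from datetime import datetime, timedelta
from azure.storage.blob import (
    BlobServiceClient, ContentSettings, BlobSasPermissions, generate_blob_sas
)

account_name = "myaccount"      # placeholder
account_key = "<account-key>"   # placeholder
service = BlobServiceClient(
    account_url=f"https://{account_name}.blob.core.windows.net",
    credential=account_key,
)
blob = service.get_blob_client(container="assets", blob="file.pdf")

# 1. Fix the stored content type so inline display works where you want it.
blob.set_http_headers(
    content_settings=ContentSettings(content_type="application/pdf")
)

# 2. For forced downloads, override Content-Disposition per request via a SAS.
sas = generate_blob_sas(
    account_name=account_name,
    container_name="assets",
    blob_name="file.pdf",
    account_key=account_key,
    permission=BlobSasPermissions(read=True),
    expiry=datetime.utcnow() + timedelta(hours=1),
    content_disposition='attachment; filename="file.pdf"',
)
download_url = f"{blob.url}?{sas}"  # always downloads, whatever the content type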
Background:
IIS 7
ASP.NET 3.5 web app
Chrome dev tools lists 98 requests for the home page of the web app (aspx + js + css + images). On subsequent requests, the status code is 200 for CSS/image files. With no cache info, the browser asks the server each time whether a file has to be updated. OK.
In IIS 7 I set the cache-control HTTP header to 6 hours for the "ressources" folder. In Chrome dev tools, I can see that the header is correctly set in the response:
Cache-Control: max-age=21600
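For reference, IIS 7 stores this folder-level setting in a web.config inside the "ressources" folder, roughly like this (reconstructed from the 6-hour setting, not copied from my server):
<configuration>
  <system.webServer>
    <staticContent>
      <clientCache cacheControlMode="UseMaxAge" cacheControlMaxAge="06:00:00" />
    </staticContent>
  </system.webServer>
</configuration>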
But I still get 98 requests... I thought the browser should not request a resource before its expiration date is reached, and I was expecting the number of requests to drop...
I got it. Google Chrome ignores the Cache-Control or Expires header if you make a request immediately after another request to the same URI in the same tab (by clicking the refresh button, pressing the F5 key or pressing Command + R). It probably has an algorithm to guess what the user really wants to do.
A way to test the Cache-Control header is to return an HTML document with a link to itself. When clicking the link, Chrome serves the document from the cache. E.g., name the following document self.html:
<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Test Page</title>
  </head>
  <body>
    <p>
      <a href="self.html">Link</a> to the same page.
      If correctly cached, a request should not be made
      when clicking the link.
    </p>
  </body>
</html>
Another option is to copy the URL and paste it in the same tab or another tab.
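If you want to reproduce the test locally, a minimal server like this one (a sketch using only the Python 3 standard library; the port is arbitrary) can serve self.html with the Cache-Control value from the question:
#!/usr/bin/env python3
# Sketch: serve self.html with a Cache-Control header so the
# link-click test above can be reproduced locally.
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        with open("self.html", "rb") as f:
            body = f.read()
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Cache-Control", "max-age=21600")  # 6 hours
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

HTTPServer(("localhost", 8000), Handler).serve_forever()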
UPDATE: A Chrome blog post published on January 26, 2017 describes the previous behavior and how it changed: reload now revalidates only the main resource, not the sub-resources:
Users typically reload either because a page is broken or the content seems stale. The existing reload behavior usually solves broken pages, but stale content is inefficiently addressed by a regular reload, especially on mobile. This feature was originally designed in times when broken pages were quite common, so it was reasonable to address both use cases at once. However, this original concern has now become far less relevant as the quality of web pages has increased. To improve the stale content use case, Chrome now has a simplified reload behavior to only validate the main resource and continue with a regular page load. This new behavior maximizes the reuse of cached resources and results in lower latency, power consumption, and data usage.
In a Facebook post also published on January 26, 2017, it is mentioned that they found Chrome revalidating all cached resources after a POST request:
we found that Chrome would revalidate all resources on pages that were loaded from making a POST request. The Chrome team told us the rationale for this was that POST requests tend to be pages that make a change — like making a purchase or sending an email — and that the user would want to have the most up-to-date page.
It seems this is not the case anymore.
Finally, it is described that Firefox is introducing Cache-Control: immutable to completely stop revalidation of resources:
Firefox implemented a proposal from one of our engineers to add a new cache-control header for some resources in order to tell the browser that this resource should never be revalidated. The idea behind this header is that it's an extra promise from the developer to the browser that this resource will never change during its max-age lifetime. Firefox chose to implement this directive in the form of a cache-control: immutable header.
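For reference, such a response header looks like this (the max-age value is only an example):
Cache-Control: max-age=31536000, immutable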
Chrome appears to be ignoring your Cache-Control settings if you're reloading in the same tab. If you copy the URL to a new tab and load it there, Chrome will respect the cache control tags and reuse the contents from the cache.
As an example I had this Ruby Sinatra app:
#!/usr/bin/env ruby
require 'sinatra'

before do
  content_type :txt
end

get '/' do
  headers "Cache-Control" => "public, must-revalidate, max-age=3600",
          "Expires" => Time.at(Time.now.to_i + (60 * 60)).to_s
  "This page rendered at #{Time.now}."
end
When I continuously reloaded it in the same Chrome tab it would display the new time.
This page rendered at 2014-10-08 13:36:46 -0400.
This page rendered at 2014-10-08 13:36:48 -0400.
The headers looked like this:
< HTTP/1.1 200 OK
< Content-Type: text/plain;charset=utf-8
< Cache-Control: public, must-revalidate, max-age=3600
< Expires: 2014-10-08 13:36:46 -0400
< Content-Length: 48
< X-Content-Type-Options: nosniff
< Connection: keep-alive
* Server thin is not blacklisted
< Server: thin
However accessing the same URL, http://localhost:4567/ from multiple new tabs would recycle the previous result from the cache.
After doing some tests with Cache-Control: max-age=xxx:
Pressing the reload button: header ignored
Entering the same URL in any tab (current or not): honored
Using JS (window.location.reload()): ignored
Using Developer Tools (with "Disable cache" unselected) or incognito mode: no difference
So, the best option while developing is to put the cursor in the omnibox and press Enter instead of clicking the refresh button.
Note: a right click on the refresh icon will show refresh options (Normal, Hard, Empty Cache). Incredibly, none of these affects these headers.
If Chrome Developer Tools are open (F12), Chrome usually disables caching.
It is controllable in the Developer Tools settings - the Gear icon to the right of the dev-tools top bar.
While this question is old, I wanted to add that if you are developing using a self-signed certificate over HTTPS and there is an issue with the certificate, then Chrome will not cache the response no matter what cache headers you use.
This is noted in this bug report:
https://bugs.chromium.org/p/chromium/issues/detail?id=110649
This is an addition to kievic's answer.
To force the browser NOT to send a Cache-Control header in the request, open the Chrome console and type:
location = "https://your.page.com"
To force the browser to add this header, click the "reload" button.
Quite an old question, but I noticed just recently (2020) that Chrome sometimes ignores the Cache-Control headers for my image resources when browsing in an Incognito window.
"Sometimes" because in my case the Cache-Control directive was honored for small images (~60-200 KB), but not for larger ones (10 MB).
Not using an Incognito window resulted in Chrome using the disk-cached version even for the large images.
Another tip:
Do not forget to verify the "Date" header: if the server has an incorrect date/time (or is located in another time zone), Chrome will keep requesting the resource again and again.
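A quick way to check for clock skew (a sketch in Python, assuming the requests package is installed; the URL is a placeholder):
import requests
from email.utils import parsedate_to_datetime
from datetime import datetime, timezone

# Compare the server's Date header against local UTC time;
# a large skew can make cached responses look stale immediately.
r = requests.head("https://your.page.com/resource.png")  # placeholder URL
server_time = parsedate_to_datetime(r.headers["Date"])
print("clock skew vs server:", datetime.now(timezone.utc) - server_time)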
I'm trying to implement a page that allows Excel users to use the data it provides via the Web Query feature provided by Excel.
It's all working out pretty nicely, as long as I use HTTP (even BASIC user authentication works).
As soon as I switch over to HTTPS, Excel won't download the data anymore (and it's a fully official SSL certificate, so it's not a problem with a self-signed one).
This Microsoft knowledge base article pretty much describes the problem.
Now the part that makes me wonder is this:
This issue occurs when Excel cannot initiate a connection because of the settings on the secure Web server.
This seems to imply that there is some way to get this working, but there's not even a hint at the direction I need to look at.
Should the "because of the settings on the secure Web server" be taken at face value, or is it just a Microsoft way of saying "this won't work unless you buy the right software from us"?
It seems I've found the problem:
MS Excel seems to be unable to use the data on the page if the HTTP headers specify that it should not be cached and it is transferred via HTTPS (the same headers sent via HTTP seem to get ignored).
So by not sending these headers, Excel was suddenly able to access the data:
Pragma: no-cache
Cache-Control: no-cache
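To confirm that a server has stopped sending these headers, a quick check (a sketch in Python, assuming the requests package; the URL is a placeholder):
import requests

# Flag any no-cache directives that would break Excel web queries over HTTPS.
r = requests.get("https://example.com/webquery.cfm")  # placeholder URL
offending = [h for h in ("Pragma", "Cache-Control")
             if "no-cache" in r.headers.get(h, "")]
print("offending headers:", offending or "none")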
Joachim's answer solved the problem for me. The server-side web framework (PHP5 / Expression Engine 1.6.7) was sending Pragma: no-cache on every request (even though my web-query results page set Pragma: public; I guess the framework overrode it). Once I removed it, everything started working.
IE and Office behavior for Pragma: no-cache is similar to that described in MS KB Article: Internet Explorer is unable to open Office documents from an SSL Web site
See also this caching tutorial's warning, "Pragma: no-cache Deprecated". With this in mind I set Expression Engine's Output and Debugging > Generate HTTP Page Headers? option to No. (Other frameworks have similar config options.) But some of the other automatically sent headers were needed for successfully caching the rest of the site, so I opted for commenting out the Pragma: no-cache lines in the framework source code.
If you do not have the option of modifying the HTTP headers sent by your web server / framework, the only client-side option is to use VBA macros to automate an Internet Explorer component to get around Office's caching behavior. See Different Ways of Using Web Queries in Microsoft Office Excel 2003 as a starting point.
We have a problem with an IIS5 server.
When certain users/browsers click to download .zip files, binary gibberish text sometimes renders in the browser window. The desired behavior is for the file to either download or open with the associated zip application.
Initially, we suspected that the wrong content-type header was set on the file. The IIS tech confirmed that .zip files were being served by IIS with the mime-type "application/x-zip-compressed".
However, an inspection of the HTTP packets using Wireshark reveals that requests for zip files return two Content-Type headers.
Content-Type: text/html; charset=UTF-8
Content-Type: application/x-zip-compressed
Any idea why IIS is sending two Content-Type headers? This doesn't happen for regular HTML or image files. It does happen with ZIP and PDF.
Is there a particular place we can ask the IIS tech to look? Or is there a configuration file we can examine?
I believe (and I may be wrong) that HTTP 1.1 allows multiple header definitions and that the most specific one takes precedence. So in your example it is sending text/html first and then application/x-zip-compressed, and the second one would be the most specific; if that can't be handled on the client, the more general one (the first, in this case) is used.
I have read through http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html and it sort of points to this, though I'm not sure that is what is actually happening.
Of course I may be totally wrong here.
Make sure that you don't have any ISAPI filters or ASP.net HTTP modules set up to rewrite the headers. If they don't check to see if the header already exists, it will be appended rather than replaced. We had issues a while ago with an in-house authentication module not correctly updating the headers so we were getting two Authorization headers, one from IIS and one from our module.
What software has been installed on the server to work with .zip files?
It looks like IIS picks up MIME translations from the registry, and perhaps zip software you use has registered the MIME type. This doesn't explain why IIS would respond with two Content-Type headers, so any ISAPI filter and other MIME table is suspect.
This may be related to this knowledge base article. It suggests that IIS may be gzipping the already-zipped file, but some browsers just pass the buck straight to a secondary application, giving you bad data (as it has been zipped twice). If you change the MIME type of the zip extension to application/octet-stream this may not happen.
It sounds like there may be an issue with your IIS configuration; however, it is not possible to tell from your post whether this is the case.
You can have MIME types configured on several levels in IIS. My IIS 5 knowledge is a bit rusty; as far as I can remember this behavior is the same for IIS 6. I tried to simulate this in an IIS 6 environment, but only ever received one MIME type, depending on the Accept header.
I have set the header for zip files on the site to application/x-zip-compressed, and for the file itself I have explicitly set a different one (text/html, as the trace below shows):
tinyget -srv:dev.24.com -uri:/helloworld.zip -tbLoadSecurity
WWWConnect::Connect("server.domain.com","80")
IP = "127.0.0.1:80"
source port: 1581
REQUEST: **************
GET /helloworld.zip HTTP/1.1
Host: server.domain.com
Accept: */*
RESPONSE: **************
HTTP/1.1 200 OK
Content-Length: 155
Content-Type: text/html
Last-Modified: Wed, 29 Apr 2009 08:43:10 GMT
Accept-Ranges: bytes
ETag: "747da786a6c8c91:0"
Server: Microsoft-IIS/6.0
Date: Wed, 29 Apr 2009 10:47:10 GMT
PK??
? ? ? helloworld.txthello worldPK??¶
? ? ? ? helloworld.txtPK?? ? ? < 7 ? hello world sample
WWWConnect::Close("server.domain.com","80")
closed source port: 1581
However, I don't feel this proves much. It does raise a few questions:
What are all the MIME maps that have been set up on the server? (Ask the server admin for the metabase.xml file, so you can make sure he has not missed some setting.)
Are those clients on a network that is under your control? Probably not; I wonder what proxy server might be sitting in between your server and the clients.
What do the IIS logs look like for that request? I am specifically interested in the Accept header.
I wonder what Fiddler will show?
I've encountered a similar problem. I was testing downloads on IIS 6 and couldn't figure out why a zipped file called test.zip was displaying as text in IE8 (it was fine in other browsers, where it would download).
Then I realised that for the test I'd compressed a very small text file. My guess is that IE sniffed the file, saw the text (which was pretty much uncompressed because of the small size) and decided it was plain text.
I tried again with a larger file and the download prompt appeared OK in IE8.
May not be relevant to your case, but thought I'd mention it.
Tim
I can't seem to get the magic combination of enabling NTLM authentication and still having RDS work. If I leave just anonymous authentication on, RDS works fine; as soon as I enable it site-wide, RDS fails (which is to be expected). Here is what I have done:
This is Windows XP SP2 and ColdFusion 8, Eclipse + Adobe plugins
In the IIS Manager, Right click on default web site and choose Properties
Directory Security tab, click the Edit button for anonymous access and authentication control
In the Authentication Methods popup window, uncheck anonymous access and check Integrated Windows authentication (leave all other checkboxes blank as well).
Click OK, OK, and override the settings for all child sites as well such that the entire site is "secured" using NTLM authentication.
Back in the IIS manager, right click on the CFIDE virtual directory, choose Properties
Directory security tab, edit the authentication methods. Uncheck Integrated Windows authentication and check anonymous access. Hit OK, OK and test:
C:\>wget -S -O - http://localhost/CFIDE/administrator/
--2009-01-21 10:11:59-- http://localhost/CFIDE/administrator/
Resolving localhost... 127.0.0.1
Connecting to localhost|127.0.0.1|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Server: Microsoft-IIS/5.1
Date: Wed, 21 Jan 2009 17:12:00 GMT
X-Powered-By: ASP.NET
Set-Cookie: CFID=712;expires=Fri, 14-Jan-2039 17:12:00 GMT;path=/
Set-Cookie: CFTOKEN=17139032;expires=Fri, 14-Jan-2039 17:12:00 GMT;path=/
Set-Cookie: CFAUTHORIZATION_cfadmin=;expires=Mon, 21-Jan-2008 17:12:00 GMT;path=/
Cache-Control: no-cache
Content-Type: text/html; charset=UTF-8
Length: unspecified [text/html]
Saving to: `STDOUT'
... html output follows ...
And so far so good, the CFIDE directory and at least one child directory appear to be working without NTLM authentication. So I fire up Eclipse and try to establish an RDS connection. Unfortunately I just get an Access Denied message. Investigating a bit further it appears that Eclipse is trying to communicate with /CFIDE/main/ide.cfm - fair enough, pull out trusty wget once again see what IIS is doing:
C:\>wget -S -O - http://localhost/CFIDE/main/ide.cfm
--2009-01-21 10:16:56-- http://localhost/CFIDE/main/ide.cfm
Resolving localhost... 127.0.0.1
Connecting to localhost|127.0.0.1|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 401 Access Denied
Server: Microsoft-IIS/5.1
Date: Wed, 21 Jan 2009 17:16:56 GMT
WWW-Authenticate: Negotiate
WWW-Authenticate: NTLM
Content-Length: 4431
Content-Type: text/html
Authorization failed.
One potential hang up that has been documented elsewhere is that the main directory and ide.cfm page don't actually exist on disk. IIS is configured to hand off all .cfm files to JRun and JRun is configured to map ide.cfm to the RDS servlet. In an attempt to force IIS to be a bit more sensible, I dropped a main directory and empty ide.cfm file on disk hoping it would solve the authentication issue but it didn't make any difference.
What I can do as a work around is leave the entire site as anonymous access and then just enable the specific application folders to use NTLM integrated authentication, but there are quite literally hundreds of possible web applications I would have to do that for. Yuck.
Please Help!!!
There is something strange about answering your own question, but I did finally get it resolved:
NTLM integrated authentication can be enabled for the entire web site
Anonymous access must be enabled for the CFIDE virtual directory
Anonymous access must be enabled for the JRunScripts virtual directory
Once both CFIDE and JRunScripts had anonymous access enabled, RDS and debugging through Eclipse worked like a charm.
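For what it's worth, the same two changes can be scripted with adsutil.vbs instead of clicking through the IIS manager (assuming the default web site is instance 1 of W3SVC; the AdminScripts path may differ on your machine):
cscript %SystemDrive%\Inetpub\AdminScripts\adsutil.vbs SET W3SVC/1/ROOT/CFIDE/AuthAnonymous True
cscript %SystemDrive%\Inetpub\AdminScripts\adsutil.vbs SET W3SVC/1/ROOT/JRunScripts/AuthAnonymous True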