How does a web browser determine what to do with a resource?

In the browser's address bar, I can specify a resource using any extension or none, e.g., http://www.something.com/someResource.someExtension. How does the browser determine what to do with this resource? e.g., should the browser parse it as an HTML document, or treat it as some script? Is there a notion of a resource type? Thank you.
P.S. I could not believe what I was thinking! :( (see my flaw in the comment to Luka's answer). How could the browser look at a resource locally! The browser is a client, and the resource resides on the server side. Duh! (I've found myself on this "mental" drug occasionally)

The HTTP response returned by the server typically contains a "Content-Type: text/html" line or similar (application/octet-stream, etc.).
Here's an example (the easiest way to view similar results is to open Firebug's Net tab):
Cache-Control public, max-age=60
Content-Encoding gzip
Content-Length 9334
Content-Type text/html; charset=utf-8 <---------------- here it is
Date Sat, 05 May 2012 20:34:36 GMT
Expires Sat, 05 May 2012 20:35:36 GMT
Last-Modified Sat, 05 May 2012 20:34:36 GMT
Vary *

It looks at the MIME type of the document.
HTML pages have the MIME type text/html, and JPEG images have image/jpeg.
More information: http://en.wikipedia.org/wiki/Internet_media_type
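As a rough illustration of this lookup, here is a minimal Python sketch that splits a Content-Type header value into media type and charset and then picks an action. The HANDLERS table and both function names are hypothetical, made up for this example; real browsers apply far more elaborate rules.

```python
from email.message import Message

# Hypothetical mapping from media type to what a client would do with the body.
HANDLERS = {
    "text/html": "parse as an HTML document",
    "image/jpeg": "decode as a JPEG image",
    "application/javascript": "treat as a script",
}


def parse_content_type(header_value):
    """Split a Content-Type header value into (media type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type(), msg.get_content_charset()


def choose_action(header_value):
    """Pick an action from the media type; unknown types fall back to download."""
    media_type, _charset = parse_content_type(header_value)
    return HANDLERS.get(media_type, "offer to download")


print(parse_content_type("text/html; charset=utf-8"))
print(choose_action("application/octet-stream"))
```

Note that the URL's extension plays no part here: only the header sent by the server is consulted.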

It does so using MIME types: http://en.wikipedia.org/wiki/Internet_media_type.

Related

CloudFront Modify JS / CSS Content

My website's theme breaks when I serve JS and CSS via CloudFront. Further troubleshooting shows that some of the JS and CSS content differs from the origin, and I suspect this is the reason. Is it possible that CF has some kind of optimization feature that modifies our JS/CSS content? If so, how can we disable it or fix the problem?
I don't believe it is a caching problem, because the origin's files have not changed since CF was enabled. I have also tried invalidating /wp-content/uploads/sites/2386/bb-plugin/cache/*, but the behavior is the same. As shown in the screenshot below, I've also set the query string option to "Forward all, cache based on all".
Below are the JS and CSS files that differ between the origin and CF, followed by a screenshot of my CF settings:
JS
(Origin) https://www.seeustosee.com/wp-content/uploads/sites/2386/bb-plugin/cache/2650-layout.js?ver=774d199e19697e00bc26b83ff78afa2c
(CF) https://da4e1j5r7gw87.cloudfront.net/wp-content/uploads/sites/2386/bb-plugin/cache/2650-layout.js?ver=774d199e19697e00bc26b83ff78afa2c
CSS
(Origin) https://www.seeustosee.com/wp-content/uploads/sites/2386/bb-plugin/cache/2650-layout.css?ver=774d199e19697e00bc26b83ff78afa2c
(CF) https://da4e1j5r7gw87.cloudfront.net/wp-content/uploads/sites/2386/bb-plugin/cache/2650-layout.css?ver=774d199e19697e00bc26b83ff78afa2c
CF Behavior Settings
https://imgur.com/XiPDq0X

CloudFront does not modify the payload. Even when Compress Objects Automatically is enabled (which it isn't), the compression is transparent gzip, so the response body is identical to the original after decompression.
But take a look at your response headers, and you'll see the problem. Your origin server is Nginx, but you don't have CloudFront configured to use that server as the origin for these requests. You have CloudFront sending the requests to an Amazon S3 bucket. The JS file there is from August 28, 2019.
Content-Type: application/javascript
Content-Length: 18371
Date: Fri, 31 Jan 2020 02:21:42 GMT
Last-Modified: Wed, 28 Aug 2019 06:53:02 GMT
Server: AmazonS3
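One way to spot this kind of mismatch is to compare a few response headers from both URLs. The Python sketch below diffs two header dictionaries. The function name diff_headers is hypothetical, and the origin dictionary is an illustrative assumption based only on the statement that the origin is Nginx; the CloudFront values are the ones shown above.

```python
def diff_headers(origin, cdn, keys=("Server", "Last-Modified", "Content-Length")):
    """Return {header: (origin value, CDN value)} for the listed headers that differ."""
    origin_l = {k.lower(): v for k, v in origin.items()}
    cdn_l = {k.lower(): v for k, v in cdn.items()}
    return {
        key: (origin_l.get(key.lower()), cdn_l.get(key.lower()))
        for key in keys
        if origin_l.get(key.lower()) != cdn_l.get(key.lower())
    }


# Values seen via CloudFront (taken from the response above).
cdn_headers = {
    "Content-Type": "application/javascript",
    "Content-Length": "18371",
    "Last-Modified": "Wed, 28 Aug 2019 06:53:02 GMT",
    "Server": "AmazonS3",
}
# Assumed origin response; only the server name is filled in, for illustration.
origin_headers = {"Server": "nginx"}

print(diff_headers(origin_headers, cdn_headers, keys=("Server",)))
```

In practice the dictionaries could be filled from real responses, for example from the headers of urllib.request.urlopen(url) for each URL.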

DocuSign - Unable to convert document

We’re getting an error – “unable to convert document” for one of our clients on our multi-tenant server. I’ve had a rummage and it looks like that error is generated when you’re sending a file with an unexpected extension meaning that DocuSign doesn’t know how to convert it to a PDF (https://stackoverflow.com/questions/53771197/docusign-random-unable-to-convert-document-error). What I’m failing to understand is how it can be working for some – it works for me on our multi-tenant server – but not others. Is there more to this than meets the eye or am I missing something?
Headers : X-RateLimit-Reset: 1573833600
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 991
X-DocuSign-TraceToken: #####
Strict-Transport-Security: max-age=31536000; includeSubDomains
Cache-Control: no-cache
Date: Fri, 15 Nov 2019 15:20:40 GMT
Response stream : {
"errorCode": "UNABLE_TO_CONVERT_DOCUMENT",
"message": "System was unable to convert this document to a PDF. Unable to convert Document(2019.11.15_NDA - MyDocument) to a PDF. Error: UserId:##### IPAddress:##### Source:ApiRESTv2:FileType 15_nda - my document is ineligible for conversion."
}

Check that you are setting the fileExtension attribute to pdf in the document object in your Envelopes::create call.
If you don't set it, DocuSign does some guessing, but setting the attribute explicitly is the way to go.
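A minimal Python sketch of building such a document entry follows. The field names (documentId, name, fileExtension, documentBase64) are the ones I understand the REST API's document object to use, so check them against the API reference. build_document is a hypothetical helper. The error text ("FileType 15_nda - my document") is consistent with an extension being guessed from whatever follows the last dot in the name, but that DocuSign guesses exactly this way is an assumption.

```python
import base64
import os.path


def build_document(name, content, document_id="1", file_extension="pdf"):
    """Build a document entry that states its file extension explicitly."""
    return {
        "documentId": document_id,
        "name": name,
        "fileExtension": file_extension,
        "documentBase64": base64.b64encode(content).decode("ascii"),
    }


name = "2019.11.15_NDA - MyDocument"
# Splitting at the last dot shows how a dotted name can appear to carry an extension.
print(os.path.splitext(name))

# Placeholder bytes stand in for the real PDF content.
doc = build_document(name, b"%PDF-1.4 placeholder bytes")
print(doc["fileExtension"])
```

This would also explain why the same code works for some users and not others: it depends on whether the document name happens to contain dots.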

Opening file content resource (Excel) of JasperReports Server / Tomcat with Internet Explorer displays binary data inline

Content-Type/Accept/MIME HTTP headers issue?
JasperReports Server (5.2.0) (update 2014-08-20/21: 5.5.0 & 5.6.0 alike)
running on Tomcat 7
clients tried
Internet Explorer
5.2.0 tests (default below)
9.0.8112.16421 64bit (default below)
11.0.9600.17105 64bit
5.5.0 tests (update 2014-08-20)
8.0.7601.17514
9.0.8112.16421
10.0.9200.16384
Firefox 28.0
Chrome (34.0.1847.131 m)
If I navigate in the JasperReports Server web GUI to my previously uploaded content resource ("Inhaltsresource" in the German UI), a *.xlsx Excel document, it works well in Firefox and Chrome, which offer to save or open the file, but it fails in Internet Explorer, which displays the file's binary content in the tab :-(
I did quite some research, but could not find a definitive cause, although some points may point at the cause:
(more general observation:)
the HTTP request header sent by IE / the Jasper GUI (the Accept string) seems to be wrong/incomplete/IE-incompatible
(thus) the Jasper servlet's HTTP response header (the Content-Type string) seems to be wrong/incomplete/IE-incompatible
(when thinking about this a little deeper:)
shouldn't the JasperServer itself (or the Tomcat as the container to a certain degree on delivery) try to determine the to-be-delivered content type?
either by letting the user-set it manually or better by determining it via heuristics (file extension, content parsing, ...)
this way it could also be stored along with the file (I would only do it if the user wants to override the heuristically determined type)
since the filename or the URL already easily indicate that it is a *.xlsx file and the content starts with PK... it already strongly indicates that it really is a (ZIP-packed) Excel file
so I would see two basic ways this should work in general...
the request header (Jasper-delivered GUI page) should define the content type explicitly (maybe only if it can't be easily determined by the response functionality itself)
(generally maybe more appropriate:) the response header (Jasper/Tomcat server logic) should set the requested, correct or estimated content type explicitly
looking at the header responses of IE or FF one can clearly see that no Content-Type is set here, although the REST-API call further down has it set (and it works there) to application/octet-stream;charset=UTF-8
Here are details that I checked already:
ok: the HTTP response headers for FF and IE do not significantly differ to me (although the request headers are quite different) (see below), thus indicating some issue with the magic of result content detection (where FF and Chrome seem to be better in this case)
the HTTP Headers of IE and FF request/response cycles:
IE 9 (captured with onboard dev tools):
request header
Request GET http://...:8080/jasperserver/fileview/fileview/....xlsx? HTTP/1.1
Accept application/x-ms-application, image/jpeg, application/xaml+xml, image/gif, image/pjpeg, application/x-ms-xbap, */*
Accept-Language de-DE
User-Agent Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Win64; x64; Trident/5.0; .NET CLR 2.0.50727; SLCC2; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E)
UA-CPU AMD64
Accept-Encoding gzip, deflate
Host ...:8080
Proxy-Connection Keep-Alive
Cookie userTimezone=Europe/Berlin; JSESSIONID=0FEF6E9F46EB2202A041A0A6F37B249A; userLocale=de_DE; treefoldersTree=1%7Copen%3B4%7Copen%3B5%7Copen%3B8%7Copen%3B; lastFolderUri=/...
response header
Response HTTP/1.0 200 OK
Server Apache-Coyote/1.1
Cache-Control no-store
Expires Thu, 01 Jan 1970 01:00:00 CET
P3P CP="ALL"
Pragma
Content-Language de-DE
Content-Length 453242
Date Thu, 08 May 2014 10:54:46 GMT
X-Cache MISS from ..some-proxy-host..
X-Cache-Lookup MISS from ..some-proxy-host..:8080
Via 1.1 ..some-proxy-host..:8080 (squid/2.7.STABLE8)
Connection keep-alive
Proxy-Connection keep-alive
FF (captured with HttpFox addon)
request header
(Request line) GET /jasperserver/fileview/fileview/....xlsx? HTTP/1.1
Host viasaxinfo.list.smwa.sachsen.de:8080
User-Agent Mozilla/5.0 (Windows NT 6.1; WOW64; rv:28.0) Gecko/20100101 Firefox/28.0
Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language de,en-US;q=0.7,en;q=0.3
Accept-Encoding gzip, deflate
Referer http://...:8080/jasperserver/flow.html?_flowId=searchFlow
Cookie userLocale=de; userTimezone=Europe/Berlin; JSESSIONID=E3989F65A4198047DA87FBB7BB73ABBA; treefoldersTree=1%7Copen%3B4%7Copen%3B5%7Copen%3B8%7Copen%3B; lastFolderUri=/...
Connection keep-alive
response header
(Status line) HTTP/1.0 200 OK
Server Apache-Coyote/1.1
Cache-Control no-store
Expires Thu, 01 Jan 1970 01:00:00 CET
P3P CP="ALL"
Content-Language de
Content-Length 453242
Date Thu, 08 May 2014 11:00:48 GMT
X-Cache MISS from ..some-proxy-host..
X-Cache-Lookup MISS from ..some-proxy-host..:8080
Via 1.1 ..some-proxy-host..:8080 (squid/2.7.STABLE8)
Connection keep-alive
Proxy-Connection keep-alive
ok: the compatibility view in IE does not help
checking potential HTTP response problems (which differ)
Pragma: should have the same meaning as Cache-Control: Public
What does the HTTP header Pragma: Public mean?
Content-Language: shouldn't matter here I guess
checking potential HTTP request problems
order of request header rows shouldn't matter
Accept: problematic?
seems ok looking at the specs http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
Accept-Language: shouldn't matter
Cookie: content shouldn't matter
Proxy-Connection: disabling/enabling proxy settings did not change anything
ok: MIME type setup in tomcat7/conf/web.xml
<mime-mapping>
<extension>xlsx</extension>
<mime-type>application/vnd.openxmlformats-officedocument.spreadsheetml.sheet</mime-type>
</mime-mapping>
putting it as well under jasperserver/WEB-INF/web.xml does not help either
some more details about this can be also found here:
http://blogs.adobe.com/techcomm/2012/11/handling-xlsx-docx-and-pptx-baggage-files-when-publishing-to-robohelp-server.html
http://filext.com/faq/office_mime_types.php
using the Rest API (.../jasperserver/rest/resource/...) works in both FF and IE
IE 9:
with fileData=true (brings up a dialog whether to open or save the file where opening works as expected)
HTTP request header
Request GET http://...:8080/jasperserver/rest/resource/....xlsx?fileData=true HTTP/1.1
Accept text/html, application/xhtml+xml, */*
Accept-Language de-DE
User-Agent Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0)
UA-CPU AMD64
Accept-Encoding gzip, deflate
Host ...:8080
Proxy-Connection Keep-Alive
Cookie userTimezone=Europe/Berlin; userLocale=de_DE; JSESSIONID=1B91EC2172C438C51A551CB967A3148D; treefoldersTree=1%7Copen%3B4%7Copen%3B5%7Copen%3B7%7Copen%3B10%7Copen%3B; lastFolderUri=...; foldersPanelWidth=239
HTTP response header
Response HTTP/1.0 200 OK
Server Apache-Coyote/1.1
Cache-Control private
Expires Thu, 01 Jan 1970 01:00:00 CET
P3P CP="ALL"
Content-Disposition attachment; filename=....xlsx
Content-Type application/octet-stream;charset=UTF-8
Date Fri, 09 May 2014 12:44:05 GMT
X-Cache MISS from LIST-SRV-PROXY03
X-Cache-Lookup MISS from LIST-SRV-PROXY03:8080
Via 1.1 ...some-proxy-host...:8080 (squid/2.7.STABLE8)
Connection close
without fileData=true returning the expected resource meta data XML (displayed inline)
<resourceDescriptor name="....xlsx" wsType="contentResource" uriString="/....xlsx" isNew="false">
<label><![CDATA[....xlsx]]></label>
<creationDate>1399636098445</creationDate>
<resourceProperty name="PROP_RESOURCE_TYPE">
<value><![CDATA[com.jaspersoft.jasperserver.api.metadata.common.domain.ContentResource]]></value>
</resourceProperty>
<resourceProperty name="PROP_PARENT_FOLDER">
<value><![CDATA[/...]]></value>
</resourceProperty>
<resourceProperty name="PROP_VERSION">
<value><![CDATA[0]]></value>
</resourceProperty>
<resourceProperty name="PROP_SECURITY_PERMISSION_MASK">
<value><![CDATA[1]]></value>
</resourceProperty>
<resourceProperty name="CONTENT_TYPE">
<value><![CDATA[contentResource]]></value>
</resourceProperty>
<resourceProperty name="DATA_ATTACHMENT_ID">
<value><![CDATA[/....xlsx]]></value>
</resourceProperty>
</resourceDescriptor>
I spent quite some time on this, but neither googling (I wonder why nobody else seems to have this issue, although it looks very common to me) nor various debugging attempts helped. Maybe I would have to dig into the related Jasper classes to debug further, but maybe somebody else has had this issue as well or knows a solution?

It seems a manual workaround is possible: http://community.jaspersoft.com/jasperreportsr-server/issues/3716#comment-808481
we implemented a servlet filter class to try to set the Content-Disposition header of the response in the cases when we knew that the MIME type was wrongly set. As we knew that the response was flushed after being processed by the web service end point, we set the header BEFORE being processed as Content-Disposition: attachment; filename='filename.extension'. This turned out to work, and we were able to download the file with an appropriate file extension.
They also mention that it works with v5.6.0, although it did not in our tests (see the update in the question above: Opening file content resource (Excel) of JasperReports Server / Tomcat with Internet Explorer displays binary data inline):
v5.6.0, and apparently on this release the MIME type of the response was correctly set, so we finally get to a proper solution for our problem.
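The quoted workaround is a Java servlet filter. The header logic it describes can be sketched in Python as a plain function. This is an illustrative analog and not the filter from the linked comment: fix_download_headers, MIME_BY_EXTENSION and the file name report.xlsx are hypothetical, and the xlsx media type is the one from the Tomcat mime-mapping in the question.

```python
# Explicit extension-to-type table; the xlsx entry matches the mime-mapping in the question.
MIME_BY_EXTENSION = {
    "xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
}


def fix_download_headers(filename, headers):
    """Return a copy of headers with Content-Type and Content-Disposition added if missing."""
    fixed = dict(headers)
    present = {key.lower() for key in fixed}
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if "content-type" not in present:
        # Fall back to a generic binary type for unknown extensions.
        fixed["Content-Type"] = MIME_BY_EXTENSION.get(extension, "application/octet-stream")
    if "content-disposition" not in present:
        # Assumes a simple file name without quotes or non-ASCII characters.
        fixed["Content-Disposition"] = 'attachment; filename="%s"' % filename
    return fixed


response_headers = {"Server": "Apache-Coyote/1.1", "Content-Length": "453242"}
print(fix_download_headers("report.xlsx", response_headers))
```

As in the quoted workaround, the headers have to be set before the response is flushed, which is why the filter sets them before the request is processed.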

Trying to pass PCI compliance but have a cross-site scripting issue

I'm currently trying to pass PCI compliance for one of my client's sites but the testing company are flagging up a vulnerability that I don't understand!
The (site removed) details from the testing company are as follows:
The issue here is a cross-site
scripting vulnerability that is
commonly associated with e-commerce
applications. One of the tests
appended a harmless script in a GET
request on the end of the your site
url. It flagged as a cross-site
scripting vulnerability because this
same script that was entered by the
user (our scanner) was returned by the
server unsanitized in the header. In
this case, the script was returned in
the header so our scanner flagged the
vulnerability.
Here is the test I ran from my
terminal to duplicate this:
GET /?osCsid=%22%3E%3Ciframe%20src=foo%3E%3C/iframe%3E HTTP/1.0
Host: (removed)
HTTP/1.1 302 Found
Connection: close
Date: Tue, 11 Jan 2011 23:33:19 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Location: http://www.(removed).co.uk/index.aspx?osCsid="><iframe src=foo></iframe>
Set-Cookie: ASP.NET_SessionId=bc3wq445qgovuk45ox5qdh55; path=/; HttpOnly
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 203
<html><head><title>Object moved</title></head><body>
<h2>Object moved to here.</h2>
</body></html>
The solution to this issue is to
sanitize user input on these types of
requests, making sure characters that
could trigger executable scripts are
not returned on the header or page.
Firstly, I can't reproduce the tester's result: I only ever get a 200 response, which doesn't include the Location header, and I never get the "Object moved" page. Secondly, I'm not sure how (on IIS 6) to stop it returning a header with the query string in it! Lastly, why does code in the header matter? Surely browsers wouldn't actually execute code from an HTTP header?

Request: GET /?osCsid=%22%3E%3Ciframe%20src=foo%3E%3C/iframe%3E HTTP/1.0 Host:(removed)
The <iframe src=foo></iframe> is the issue here.
Response text:
<html><head><title>Object moved</title></head><body>
<h2>Object moved to here.</h2>
</body></html>
The response link is:
http://www.(removed).co.uk/index.aspx?osCsid="><iframe src=foo></iframe>
Which contains the contents from the request string.
Basically, someone can send someone else a link where osCsid contains text that changes how the page is rendered. You need to make sure that the osCsid value is sanitized or filtered against input like this. For example, I could provide a string that loads whatever JavaScript I want, or makes the page render entirely differently.
As a side note, it tries to forward your browser to that non-existent page.

It turned out that I had a Response.Redirect for any pages accessed over HTTPS that don't need to be secure, and this was returning the location as part of the redirect. Changing this to:
Response.Status = "301 Moved Permanently";
Response.AddHeader("Location", Request.Url.AbsoluteUri.Replace("https:", "http:"));
Response.End();
fixed the issue.
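The scanner's general advice, never echoing raw markup characters in the Location header, can be sketched in Python. This is an illustrative analog, not the ASP.NET code from this thread: safe_location is a hypothetical helper, and www.example.co.uk stands in for the removed host name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def safe_location(url):
    """Re-encode the query string so markup characters never appear raw in the URL."""
    parts = urlsplit(url)
    # Decode the pairs, then percent-encode every name and value again.
    query = urlencode(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


raw = 'http://www.example.co.uk/index.aspx?osCsid="><iframe src=foo></iframe>'
print(safe_location(raw))
```

After re-encoding, the quote and angle brackets appear only in percent-encoded form, so the value can no longer break out of an attribute or tag when it is written into a header or page.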

How does GMail implement Comet?

With the help of HttpWatch, I tried to figure out how GMail implements Comet.
I logged in to GMail with two accounts, one in IE and the other in Firefox, and chatted in GTalk within GMail using a magic word like "WASSUP". Then I logged off both GMail accounts and filtered out any HTTP content without the "WASSUP" string. The result shows which HTTP request is the streaming channel. (Note: I have to log off. Otherwise, the never-ending HTTP request would not show its content in HttpWatch.)
The result is interesting. The URL for stream channel is like:
https://mail/channel/bind?VER=8&at=xn3j33vcvk39lkfq.....
It is no surprise that GMail does Comet in IE with an IFRAME. The HTTP content starts with "<html><body>".
Originally, I guessed that GMail does Comet in Firefox with a multipart XMLHttpRequest. To my surprise, the response does not have a "multipart/x-mixed-replace" content type. The response headers are as below:
HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Pragma: no-cache
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Date: Sat, 20 Mar 2010 01:52:39 GMT
X-Frame-Options: ALLOWALL
Transfer-Encoding: chunked
X-Content-Type-Options: nosniff
Server: GSE
X-XSS-Protection: 0
Unfortunately, HttpWatch doesn't tell whether an HTTP request comes from XMLHttpRequest or not. The content is not HTML but JSON. It looks like a response for XHR, but that would not work for Comet without multipart/x-mixed-replace, right?
Is there any way else to figure out how GMail implements Comet?
Update:
After further investigation, I believe GMail implements Comet this way:
1) in IE, it uses a forever-hidden iframe;
2) in Firefox, it uses a forever-XHR without the multipart/x-mixed-replace header. The client responds when (readyState == 3) OR (readyState == 4), that is, in both the interactive state and the complete state.
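The client-side bookkeeping behind such a forever-XHR can be sketched in Python: each progress event exposes the whole response text received so far, so the client has to remember how much it has already processed. StreamReader is a hypothetical name, and the newline-delimited framing and JSON message shape are assumptions for illustration; the actual framing GMail uses is not shown here.

```python
class StreamReader:
    """Tracks how much of a growing response body has already been processed."""

    def __init__(self):
        self.offset = 0  # number of characters already consumed

    def feed(self, response_text):
        """Return complete newline-terminated messages that arrived since the last call."""
        new_data = response_text[self.offset:]
        end = new_data.rfind("\n")
        if end == -1:
            return []  # no complete message yet
        self.offset += end + 1
        return new_data[:end].split("\n")


reader = StreamReader()
# Each call passes the full text so far, as responseText would hold it.
print(reader.feed('{"m":"WASS'))                      # partial data (readyState == 3)
print(reader.feed('{"m":"WASSUP"}\n{"m":'))           # first message is now complete
print(reader.feed('{"m":"WASSUP"}\n{"m":"bye"}\n'))   # final text (readyState == 4)
```

Handling both states with the same function is what makes the final, complete response safe to process without delivering earlier messages twice.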

Per this article,
So what is the solution used by Google
Gmail?
The solution is really simple,
straight forward and very portable!
What Gmail did is requesting an
endless html page that contains
streams of Javascript portions. Give
it a try, It’s very powerful. So, we
will have on the client side a js file
that processes the responses, and
another endless html that contains the
Javascript Streams.
The rest of the article goes into much more detail, including an exploration of alternatives as well as the specific one picked by GMail.
