I have an old-style ISAPI filter which intercepts SF_NOTIFY_SEND_RESPONSE, changes the Content-Type to */* and sets Content-Encoding to an empty string. This happens when the response body is smaller than some threshold, and it's done to cancel compression. So far it works, but I have two concerns.
Is this the right approach from a technical point of view?
Could altering the Content-Type be potentially dangerous?
It looks like setting Content-Encoding to an empty string is enough; that way you don't have to deal with MIME-type changes, which can be potentially dangerous.
I am setting up a CloudFront distribution for my company's website.
We would like to set the caching time by using the Cache-Control headers on the server-side (Node.Js with Express), like this:
if (req.url.startsWith('/static')) {
  res.setHeader('Cache-Control', 'public,max-age=500');
}
At first this seems to work well, but one of our caching requirements is not being met: ignoring query string parameters.
For example, the requests "domain.com/static/logo" and "domain.com/static/logo?foo=bar" should be interpreted as the same resource and cached as one.
I wonder if it is possible to cache a resource while ignoring its query string parameters, using only the Cache-Control headers.
Thank you.
By default, CloudFront removes the query string and does not include it in the cache key; this is CloudFront's default behaviour, so that there are not multiple cached copies based on different query string parameters.
If you are not seeing this behaviour, you may have "Query string" set to "Forward all, cache based on all" in CloudFront's cache behaviour settings.
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/QueryStringParameters.html
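For reference, here is a minimal sketch of the relevant fragment of a distribution config using the legacy ForwardedValues settings (all other required fields omitted); with QueryString set to false, CloudFront strips the query string before caching:

"DefaultCacheBehavior": {
  "ForwardedValues": {
    "QueryString": false,
    "Cookies": { "Forward": "none" }
  }
}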
I have a REST-like API served through Node/Express.
The ETag setting is at its default, not explicitly turned on or off. However, whenever I hit the server, it always gives me a new ETag, even if the returned JSON/HTML is exactly the same. I also checked the returned headers and they look the same. I tested this with two types of content: an API response and static HTML content such as a privacy page.
Any idea how to check what's making it different each time?
Express's default behavior is to provide a strongly validated ETag, which will only be the same as a previous response's if the current response is precisely the same, byte for byte.
You could try setting Express's etag option to weakly validate the response, which indicates to the browser that the current response is semantically equivalent to a previous one with the same value; that is, while they might not be byte-for-byte the same, they represent the same meaning. To do this, use app.set('etag', 'weak').
Finally, if this doesn't work for you, you can provide your own ETag generation function using app.set('etag', function (body, encoding) {...}), where you return a hash generated from your content; this lets you control what Express (and thus the browser) considers "different" in the context of your response.
More than you ever wanted to know about ETags can be found in Wikipedia's HTTP ETag article.
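As an illustration, here is a minimal sketch of the custom-function approach, using Node's built-in crypto module to derive the tag from the response body (the /privacy route is hypothetical):

const express = require('express');
const crypto = require('crypto');

const app = express();

// Derive the ETag from a hash of the body, so identical payloads always
// produce identical tags regardless of how they were generated.
app.set('etag', function (body, encoding) {
  const hash = crypto.createHash('md5').update(body, encoding).digest('base64');
  return '"' + hash + '"'; // quoted, per the ETag syntax
});

app.get('/privacy', function (req, res) {
  res.send('<h1>Privacy policy</h1>');
});

app.listen(3000);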
The third-party libraries "node-formidable" and "express" come with the ability to handle multipart POST requests (e.g. from a file upload form), but I don't want to use any third-party code. How do I implement file uploads in pure JavaScript on Node.js?
There are very few resources on this. How can it be done? Thank you.
Just to clarify, because it seems some people are angry that the other answer didn't help much: there is no simple way of doing this without relying on a library that does it for you.
First, here's an answer to another question trying to clarify what happens on a POST file upload: https://stackoverflow.com/a/8660740/2071242
To summarize: to parse such an upload, you'll first need to check for a Content-Type header containing "multipart/form-data" and, if one exists, read the boundary attribute from it.
After this, the content comes in multiple parts, each starting with the boundary string and including some additional headers, followed by the data itself after a blank line. The browser can select the boundary string pretty freely, as long as that byte sequence doesn't occur in the uploaded data (see the spec at https://www.rfc-editor.org/rfc/rfc1867 for details). You can read the data in by registering a callback function for the request object's data event: request.on('data', callback);
For example, with boundary "QweRTy", an upload might look something like this:
POST /upload HTTP/1.1
(some standard HTTP headers)
Content-Type: multipart/form-data; boundary=QweRTy

--QweRTy
Content-Disposition: form-data; name="upload"; filename="my_file.txt"
Content-Type: text/plain

(The contents of the file)
--QweRTy--
Note how, after the initial headers, two dashes are prepended to each boundary string, and two dashes are appended to the end of the last one.
Now, what makes this challenging is that you might need to read the incoming data (within the callback function mentioned above) in several chunks, and there is no guarantee that a boundary will be contained within a single chunk. So you'll either need to buffer all the data (not necessarily a good idea) or implement a state-machine parser that goes through the data byte by byte. This is actually exactly what the formidable library does.
So, after similar considerations, what I personally decided to do was to use the library. Re-implementing such a parser is pretty error-prone and, in my opinion, not worth the effort. But if you really want to avoid all libraries, checking the code of formidable might be a good start; a sketch of the naive buffering approach follows below.
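For illustration, here is a minimal sketch of the naive buffer-everything approach described above, with no libraries (the port and logging are arbitrary):

const http = require('http');

// Illustration only: buffers the entire request in memory (the naive
// approach mentioned above), then splits on the boundary. A production
// parser should process chunks incrementally with a state machine.
http.createServer(function (req, res) {
  const match = /boundary=(.+)$/.exec(req.headers['content-type'] || '');
  if (!match) {
    res.writeHead(400);
    return res.end('Expected multipart/form-data');
  }
  const boundary = '--' + match[1];

  const chunks = [];
  req.on('data', function (chunk) { chunks.push(chunk); });
  req.on('end', function () {
    const body = Buffer.concat(chunks).toString('binary');
    // Each part sits between boundary markers; part headers and part data
    // are separated by a blank line (\r\n\r\n).
    const parts = body.split(boundary).slice(1, -1);
    for (const part of parts) {
      const headerEnd = part.indexOf('\r\n\r\n');
      const headers = part.slice(0, headerEnd).trim();
      const data = part.slice(headerEnd + 4, -2); // drop the trailing \r\n
      console.log(headers, '->', data.length, 'bytes');
    }
    res.end('ok');
  });
}).listen(8080);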
This is a fairly old question, but it is still quite relevant.
I was looking for a similar solution with no luck, so I decided to write my own, which might come in handy for other users.
GIST: https://gist.github.com/patrikbego/6b80c6cfaf4f4e6c119560e919409bb2
Node.js itself recommends formidable (as seen here), but I think such basic functionality should be provided by Node.js out of the box.
I think you need to parse the form yourself if you really don't want to use any modules. When uploading a file, the form will be in multipart/form-data format, which means the request body will be divided by a string that is generated randomly by your browser. You need to read this string at the beginning of the form data, then scan the incoming data for it and parse the parts one by one.
For more information about multipart/form-data you can refer to http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.2
I think the best solution is to use formidable. It handles various scenarios and, in my experience, works perfectly.
I'd like to exclude certain pages from the Varnish cache based on the content of the page (for instance, if a form uses a particular hidden field that is a security feature and needs to be unique on every page refresh).
I have dozens of forms, so I don't want to have to exclude each unique page individually from the cache.
Is this possible within the VCL?
No, normally not. The proper way to do it would be to set cache headers (for instance, Cache-Control: no-cache, must-revalidate) on the pages with non-cacheable forms, which Varnish in turn will read.
As a nice side effect, that will also disable most client-side caching, which often causes trouble with CAPTCHAs and the like.
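For example, in Express (the framework used elsewhere on this page), a minimal sketch might look like this; the route and the token-rendering helper are hypothetical:

// Mark pages that render one-time form tokens as uncacheable, so Varnish
// (and browsers) will not serve a stale copy.
app.get('/contact', function (req, res) {
  res.setHeader('Cache-Control', 'no-cache, must-revalidate');
  res.send(renderFormWithFreshToken()); // hypothetical helper
});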
We'd like to double-check our HTTP headers for security before we send them out. Obviously we can't allow '\r' or '\n' to appear, as that would allow content injection.
I see just two options here:
Truncate the value at the newline character.
Strip the invalid character from the header value.
Also, from reading RFC 2616, it seems that only ASCII-printable characters are valid in HTTP header values. Should I also follow the same policy for the other 154 possible invalid bytes?
Or, is there any authoritative prior art on this subject?
This attack is called "header splitting" or "response splitting".
OWASP points out that removing CRLF is not sufficient; \n can be just as dangerous.
To mount a successful exploit, the application must allow input that contains CR (carriage return, also given by 0x0D or \r) and LF (line feed, also given by 0x0A or \n) characters into the header.
(I do not know why OWASP (and other pages) list \n as a vulnerability or whether that only applies to query fragments pre-decode.)
Serving a 500 on any attempt to set a header that contains a character not allowed by the spec in a header key or value is perfectly reasonable, and will allow you to identify offensive requests in your logs. Failing fast when you know your filters are failing is a fine policy.
If the language you're working in allows it, you could wrap your HTTP response object in one that raises an exception when a bad header is seen, or you could change the response object to enter an invalid state, set the response code to 500, and close the response body stream.
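In Node.js, for instance, such a wrapper might look like the following sketch; the whitelist (printable ASCII plus horizontal tab) is an assumption based on the policy discussed above:

// Reject header names or values containing characters outside printable
// ASCII plus horizontal tab (an assumed whitelist, per the policy above).
const BAD_HEADER_CHARS = /[^\t\x20-\x7e]/;

function setHeaderStrict(res, name, value) {
  if (BAD_HEADER_CHARS.test(name) || BAD_HEADER_CHARS.test(String(value))) {
    // Fail fast: respond 500 and close the stream, so the offending input
    // shows up in the logs instead of reaching a client.
    res.statusCode = 500;
    res.end();
    throw new Error('Invalid character in header: ' + name);
  }
  res.setHeader(name, value);
}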
EDIT:
Should I strip non-ASCII inputs?
I prefer to do that kind of normalization in the layer that receives trusted input, unless, as in the case of entity-escaping to convert plain text to HTML, there is a clear type conversion. If it's a type conversion, I do it when the output type is required; if it is not, I do it as early as possible so that all consumers of data of that type see a consistent value. I find this approach makes debugging and documentation easier, since layers below input handling never have to worry about unnormalized inputs.
When implementing the HTTP response wrapper, I would make it fail on all non-ASCII characters (including non-ASCII newlines like U+85, U+2028, U+2029) and then make sure my application tests include a test for each third-party URL input to make sure that any Location headers are properly %-encoded before the Location reaches setHeader, and similarly for other inputs that might reach the response headers.
If your cookies include things like a user-id or email address, I would make sure the dummy accounts for tests include a dummy account with a user-id or email address containing a non-ASCII letter.
The simple removal of newlines (\n) will prevent HTTP response splitting. Even though a CRLF is used as the delimiter in the RFC, a bare newline is recognized by all browsers.
You still have to worry about user content within a Set-Cookie or Content-Type header. Attributes within these headers are delimited using a ';', so it may be possible for an attacker to change the content type to UTF-7 and bypass your XSS protection for IE users (and only IE users). It may also be possible for an attacker to create a new cookie, which introduces the possibility of session fixation.
Non-ASCII characters are allowed in header fields, although the spec doesn't really say clearly what they mean; so it's up to the sender and recipient to agree on their semantics.
What made you think otherwise?