What should I do if I get an empty CSP violation?

I use Content Security Policy. I get genuinely useful warnings like this:
CSP violation!
{ 'csp-report':
{ 'document-uri': 'about:blank',
referrer: '',
'violated-directive': 'img-src \'self\' data: pbs.twimg.com syndication.twitter.com p.typekit.net',
'original-policy': 'longPolicyGoesHere',
'blocked-uri': 'https://platform.twitter.com',
'source-file': 'https://platform.twitter.com',
'line-number': 2 } }
Cool, I need to add 'platform.twitter.com' as an img-src
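In other words, the directive becomes (reconstructed from the violated-directive in the report above):
img-src 'self' data: pbs.twimg.com syndication.twitter.com p.typekit.net platform.twitter.com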
But sometimes I get blank CSP warnings like this:
CSP violation!
{}
I.e., there's been a POST, but the JSON body is empty. What do I do?

I found the problem in my case; it might not be the problem for you.
Since the CSP reporter calls the report-uri file with the POST method, I assumed that the $_POST variable would contain the posted data. This turned out to be false, because the data was not sent from a form or file upload (see PHP "php://input" vs $_POST).
The following code works for me perfectly (thanks to inspiration by the slightly buggy code in https://mathiasbynens.be/notes/csp-reports):
<?php
// Receive and log a Content-Security-Policy violation report.
// (WriteLog function omitted here: it just writes text into a log file.)
$data = file_get_contents('php://input');
if (!$data) { // Ignore the occasional empty POST.
    exit(0);
}
// Prettify the JSON-formatted data.
$val = json_decode($data);
if ($val === null) { // Not valid JSON; log the raw payload instead.
    WriteLog($data);
    exit(0);
}
$data = json_encode($val, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
WriteLog($data);
?>
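For completeness, the policy then points at this script via report-uri (the path here is hypothetical; use wherever you host the script):
Content-Security-Policy: default-src 'self'; report-uri /csp-report.php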

Related

Adding message to Gmail: error "Payload parts count different from expected"

I am adding a message to a gmail folder using this (example) URL:
https://www.googleapis.com/gmail/v1/users/user@domain.com/messages/import?uploadType=multipart
The body of the request looks like this:
--test_abc123
Content-Type: application/json; charset=UTF-8
{
"labelIds": [ "Label_525" ],
"raw": "RnJvbTogIlNlY3RpZ28gQ2VydGlmaWNh..."
}
--test_abc123--
The raw data is a base64 encoded standard MIME message that looks normal to me. The result of this POST is http error 400 with the error response "Payload parts count different from expected 2. Request payload parts count: 1".
I can supply the original MIME text if that is helpful, but let me emphasize that I have been running this code for several years without problem. I've tried different messages to test this out, but it appears that Google has changed something to break my software.
Is Google objecting to my raw data, or something about the MIME encoding? Any ideas what the problem could be?
---- Addendum ----
I have gotten a few messages to work, they seem to all have image or data attachments. However I really don't see any problem with the messages that are failing - I can import them into Office 365 or Thunderbird or anything else and they render just fine. As a test, I tried importing the message below, which was taken from the MIME RFC. It fails with the same error. I think that Google has changed something to make their MIME parser very fussy, but I don't see how I can fix my input data.
From: Nathaniel Borenstein <nsb@bellcore.com>
To: Ned Freed <ned@innosoft.com>
Subject: Sample message
MIME-Version: 1.0
Content-type: multipart/mixed; boundary="simple boundary"

This is the preamble. It is to be ignored, though it
is a handy place for mail composers to include an
explanatory note to non-MIME compliant readers.
--simple boundary

This is implicitly typed plain ASCII text.
It does NOT end with a linebreak.
--simple boundary
Content-type: text/plain; charset=us-ascii

This is explicitly typed plain ASCII text.
It DOES end with a linebreak.

--simple boundary--
This is the epilogue. It is also to be ignored.
Addendum 2: I tried a simple upload (using the content-type header message/rfc822) and it worked, except the message was unlabeled. How would I specify what label I want applied to a message? I was originally trying to follow the documentation here
link
which tells me to create the JSON body that I gave above. This allows me to specify the label, but I cannot seem to use this body in a simple upload: either the content type is invalid, or what Gmail imports is literally the JSON body itself; it does not parse out the raw data. If you could point me to a specific example showing the URI, message body, and HTTP headers (not Java code), that would be very useful to me.
OK never mind, I got it working by adding an empty message/rfc822 part to the body of the multipart upload. That satisfies Google, and the empty part is ignored in favor of the raw data.
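For reference, the body that satisfies Google then looks something like this (the boundary and JSON part are from the question above; the empty extra part is the workaround, and its exact placement is an assumption):
--test_abc123
Content-Type: application/json; charset=UTF-8

{
  "labelIds": [ "Label_525" ],
  "raw": "RnJvbTogIlNlY3RpZ28gQ2VydGlmaWNh..."
}
--test_abc123
Content-Type: message/rfc822

--test_abc123--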
You are doing a multipart upload; see here:
The body of the request is formatted as a multipart/related content
type [RFC2387] and contains exactly two parts. The parts are
identified by a boundary string, and the final boundary string is
followed by two hyphens.
This is why it works only for your messages with images or attachments: your message (the --test_abc123 body above) contains only one part.
In the past there apparently was no check that this condition was fulfilled, so you may have gotten away with using multipart for single-part messages. But now that is no longer possible, so if you have a single-part message, you should use a simple upload.
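A simple upload request would look roughly like this (the /upload path and uploadType=media parameter follow the Gmail API upload docs; treat the exact shape as an assumption):
POST https://www.googleapis.com/upload/gmail/v1/users/user@domain.com/messages/import?uploadType=media
Content-Type: message/rfc822

From: sender@example.com
To: recipient@example.com
Subject: Sample message
...rest of the raw RFC 822 message...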
If you do not know in advance how many parts your message has, you can always try the multipart upload first inside a try...catch statement, and fall back to a simple upload request within the catch in case of failure, as sketched below.
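Sketched in Node-style JavaScript (multipartImport and simpleImport are hypothetical wrappers around the two request shapes above; error handling is reduced to the essentials):
async function importMessage(rawMime, labelIds) {
    try {
        // Try the two-part multipart upload first.
        return await multipartImport(rawMime, labelIds); // hypothetical helper
    } catch (err) {
        // Google rejected the payload (e.g. HTTP 400 "Payload parts count...");
        // fall back to a simple upload. Labels must then be applied separately.
        return await simpleImport(rawMime); // hypothetical helper
    }
}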

NodeJs web crawler file extension handling

I'm developing a web crawler in Node.js. I've created a unique list of the URLs found in the crawled page body, but some of them have extensions like jpg, mp3, mpeg, etc., and I want to avoid crawling those that have extensions. Is there any simple way to do that?
Two options stick out.
1) Use path to check every URL
As stated in the comments, you can use path.extname to check for a file extension. For example:
var path = require('path');

var test = "http://example.com/images/banner.jpg";
path.extname(test); // '.jpg'
This would work, but it feels like you'll wind up having to maintain a list of file types to crawl or to avoid. That's work.
Side note -- be careful using path. Typically, url is your best tool for parsing links, because path is aimed at files/directories, not URLs. On some systems (Windows), using path to manipulate a URL can result in drama because of the slashes involved. Fair warning!
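For example, parsing the link first and applying path.extname only to the pathname sidesteps query strings (a small sketch using the legacy url.parse API):
var url = require('url');
var path = require('path');

var parsed = url.parse("http://example.com/images/banner.jpg?size=large");
path.extname(parsed.pathname); // '.jpg' -- the query string is ignored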
2) Get the HEAD for each link & see if content-type is set to text/html
You may have reasons to avoid making more network calls. If so, this isn't an option. But if it is OK to make additional calls, you could grab the HEAD for each link and check the MIME type stored in content-type.
Something like this:
var http = require('http');

var headersOptions = {
    method: "HEAD",
    host: "example.com", // host name only -- no protocol prefix here
    path: "/articles/content.html"
};

var req = http.request(headersOptions, function (res) {
    // You will probably also need to check HTTP status codes
    // so you handle 404s, 301s, and so on.
    if (res.headers['content-type'].indexOf("text/html") > -1) {
        // Do something like queue the link up to be crawled,
        // or parse the link, or put it in a database, or whatever.
    }
});
req.end();
One benefit is that you only grab the HEAD, so even if the file is a gigantic video or something, it won't clog things up. You get the HEAD, see the content-type is a video or whatever, then move along because you aren't interested in that type.
Second, you don't have to keep track of file names because you're using a standard MIME type to differentiate html from other data formats.

Why would I get a CSP warning where blocked-uri is an empty string?

I've been using CSP on my localhost server and, alongside normal CSP reports, have seen this:
{
"csp-report": {
"document-uri": "https://localhost:3000/",
"referrer": "",
"violated-directive": "script-src 'self' 'unsafe-eval' cdn.mxpnl.com js.stripe.com platform.twitter.com syndication.twitter.com use.typekit.net",
"effective-directive": "script-src",
"original-policy": veryLongPOlicyGoesHere,
"blocked-uri": "",
"source-file": "https://platform.twitter.com",
"line-number": 2,
"column-number": 28911,
"status-code": 0
}
}
Why is "blocked-uri" ""? What's causing this CSP warning?
While it may not be easy to parse, you can find information about that type of report in a CSP "fingerprint" project I ran for a while: https://gist.github.com/oreoshake/29edbf9aae8125f05b66
Empty blocked-uri values indicate an inline script/style violation, an eval call, or an inline event handler or javascript: href. Your violated-directive already allows 'unsafe-eval', however, so eval is an unlikely culprit.
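For illustration, under a script-src without 'unsafe-inline', each of these hypothetical snippets would generate a report with an empty blocked-uri:
<script>console.log('inline script block');</script>
<button onclick="doSomething()">Click me</button>  <!-- inline event handler -->
<a href="javascript:void(0)">link</a>              <!-- javascript: URL -->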
If you can trigger the same error in Firefox, you can inspect the script-sample field. It may contain the content of the inline script, it may mention the event handler that was triggered, or it may include "eval" in the message.
A very large number of unexpected reports in this format come from browser extensions, notably LastPass.

Login via POST does not yield valid session

I am currently trying to convert a smallish app from nodejs to golang (hence the two tags) but I'm running into a bit of trouble in doing so.
Essentially it is a very simple http POST login which I can't seem to realise. The background is that my university provides a calendar export function and I would like to provide this calendar as a feed that could be added to Google Cal.
Now the thing is that I have the whole thing working in Node, but I would really like to be able to realise it in Go as well.
The important bit of node code would be
var query = url.parse(req.url, true).query;
var data = {
u: query.user, // Username
p: query.password, // Password
};
needle.post(LOGIN_URL, data, {}, function (error, response) {
//extract cookies etc.
});
which is working like a charm but if I try to do the same in go
import "github.com/parnurzeal/gorequest"
//...
resp, body, err := gorequest.New().Post(LOGIN_URL).Send("u=user&p=pass").End()
//extract cookies etc.
I end up with an invalid (timed-out) session. I already tried using just net/http in Go, which doesn't seem to change anything.
The result the POST request yields is a 302 redirect to an overview page (Btw: it is ASP based). Could it be that this is what's causing the problem, since gorequest then fetches that overview page without the cookies returned in resp, effectively creating a new session that isn't authorized, or am I overlooking something terribly simple?
So it seems that I found the answer myself by following your advice and using "net/http" and digging a little deeper into what the http.Client actually does. To anyone who might encounter similar problems, here is my solution:
http.Client automatically redirects if it receives a 30x response from the server (see the documentation). Although one can override the redirect policy, I was unable to prevent redirection entirely.
Additionally, it seems as if the client has a bug (what I would call it, at least), as it drops all headers upon redirect (see the issue, or the source where new headers are created).
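(As an aside: Go 1.7 and later let you keep http.Client and simply stop it from following the redirect via http.ErrUseLastResponse. A minimal sketch, assuming a newer toolchain than this answer used, with LOGIN_URL, USER and PASS as in the snippets above:)
client := &http.Client{
    CheckRedirect: func(req *http.Request, via []*http.Request) error {
        // Return the 302 response untouched instead of following it,
        // so its Set-Cookie headers stay visible to the caller.
        return http.ErrUseLastResponse
    },
}
resp, err := client.PostForm(LOGIN_URL, url.Values{"u": {USER}, "p": {PASS}})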
While searching around in net/http I found http.DefaultTransport, which is used by http.Client and does not redirect. It is somewhat lower level and exactly what I was after. The following piece of code demonstrates how I replaced the gorequest line from above:
import (
    "bytes"
    "net/http"
    "net/url"
)

//...
data := url.Values{"u": {USER}, "p": {PASS}}
req, err := http.NewRequest("POST", LOGIN_URL, bytes.NewBufferString(data.Encode()))
// I needed quite some time to figure out that I needed to set the content type accordingly.
req.Header.Add("Content-Type", "application/x-www-form-urlencoded")
//...
resp, err := http.DefaultTransport.RoundTrip(req)
//...
// resp.Header["Set-Cookie"] now contains the login/session cookies.
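(For the extraction step itself, the standard library can parse those headers: resp.Cookies() returns the cookies from the response's Set-Cookie headers. A minimal sketch, assuming "fmt" is also imported:)
for _, c := range resp.Cookies() {
    // Each c is an *http.Cookie parsed from a Set-Cookie header.
    fmt.Printf("%s=%s\n", c.Name, c.Value)
}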
Although I need to extract cookies myself and set a few header values, the solution works perfectly and I am quite happy with it. If anybody has some improvements to my solution or any other advice I am glad to hear it. And thanks to JimB and Volker :).

Multiple cookies issue in OWIN security AuthenticationHandler

I am using Facebook OWIN authentication, more or less following the Microsoft sample. The first time a user logs in, everything is OK. But if they sign out and try again, it seems like the previous .AspNet.Correlation.Facebook cookie is not removed but is set to an empty string, so my next call to api/getexternallogin shows multiple correlation cookies in Fiddler.
This is when we are generating a correlationId, and having multiple cookies at this point is not a show-stopper; in the response, I set the cookie to the new CorrelationId.
Later, when Facebook calls back to "/signin-facebook", we try to validate the correlationId in the ValidateCorrelationId method, and the request again carries both cookies. So the new CorrelationId has been set, but the extra cookie with no value means that when I read Request.Cookies["ValidateCorrelationId"], it returns an empty string.
I have checked the code, and it seems like the only methods modifying this cookie are GenerateCorrelationId and ValidateCorrelationId. The implementation of these methods can be found here:
http://katanaproject.codeplex.com/SourceControl/latest#src/Microsoft.Owin.Security/Infrastructure/AuthenticationHandler.cs
Curiously enough, my browser does not seem to see the issue.
Any ideas will be much appreciated.
OK, this has taken me a fair bit of frustration, but here it is: when Response.Cookies.Delete(".AspNet.Correlation.Facebook") is called in the ValidateCorrelationId method, the response's Set-Cookie header carries an "expires" date such as "Thu, 01-Jan-1970 00:00:00 GMT", and that header gets split at the comma and treated as two separate Set-Cookie headers. Hence the cookie is not expired, but its value is set to an empty string. It seems like the comma after "Thu" is what causes it.
The fix I have come up with was to comment out Response.Cookies.Delete(".AspNet.Correlation.Facebook") and do the following instead:
Response.Headers.Add("Set-Cookie", new[] { CorrelationKey + "=; path=/; expires=Fri 02-Jan-1970 00:00:00 GMT" });
No commas there and it is working now.
This does seem like a genuine bug in OWIN.
