How is this code protecting against XSS attacks? - node.js

I'm using a validation library that removes some common XSS attacks from the input to my web application. It works fine, and I'm also escaping everything I render to protect against XSS attacks.
The library contains this line in part of the XSS filtering process:
// Protect query string variables in URLs => 901119URL5918AMP18930PROTECT8198
str = str.replace(/\&([a-z\_0-9]+)\=([a-z\_0-9]+)/i, xss_hash() + '$1=$2');
xss_hash returns a string of random alpha-numeric characters. Basically it takes a URL with a query string, and mangles it a bit:
> xss('http://example.com?something=123&somethingElse=456&foo=bar')
'http://example.com?something=123eujdfnjsdhsomethingElse=456&foo=bar'
Besides having a bug (it only "protects" one parameter, not all of them), it seems to me the whole thing is itself a bug.
So my question is: what kind of attack vector is this kind of replacement protecting against?
If it's not really doing anything, I would like to submit a patch to the project removing it completely. And if it is legitimately protecting users of the library, I'd like to submit a patch to fix the existing bug.

xss_hash returns a string of random alpha-numeric characters.
Are they definitely random, or is it generated based on computable data?
It appears to be Security through obscurity: it's trying to replace all the &'s with xss_hash()'s so that the query is less readable. I'm guessing there is a part of the library which undoes this (i.e. treats all the xss_hash()'s in the string as &s for parsing purposes).

The code in question "protected query string variables" by replacing the & separating URL parameters with a random string, before doing some other processing that would remove or otherwise mangle ampersands. As Jay Shah pointed out, there was code just below that was meant to replace the query string ampersands, but another bug was preventing it from working as intended.

Related

Acceptable encoding for Cosmos DB IDs to replace illegal characters?

I'm trying to store data in Cosmos DB where the IDs use a slash (/). However slash is an illegal character in Cosmos IDs. I initially tried to resolve this by URL encoding slashes (%2F) as that's the form I'd generally receive them in through API requests. However, though percent (%) is not an illegal character for IDs, Cosmos still chokes on them being unable to retrieve many documents with a percent in the ID (it works for some, but it appears if the % is followed by certain characters it fails).
Is there a encoding that is suitable for Cosmos DB IDs that will replace illegal characters in the original ID text without introducing illegal or unhandled characters (like %) in the encoded ID text? I'd prefer to stay away from things like Base64 which makes the IDs hard to decipher for people. And I'd also like to avoid simple character replacement (/ becomes -) in case an ID uses the replacement character.
I ended up doing simple character replacement, swapping out slashes (/) with pipes (|).
The key thing to make this livable is adding a value converter with EntityFramework.
Expression<Func<string?, string>> toDB = v => v!.Replace("/", "|");
Expression<Func<string, string?>> fromDB = v => v!.Replace("|", "/");
builder.Property(p => p.Id).HasConversion(toDB, fromDB);
This allows the character replacement to happen automatically when reading & writing to the database. The only time you need to worry about the difference is if you're accessing the database directly or from other code without the converter. Or possibly doing custom searches. I manually do the translation for a filtering framework we use, and I suspect that other id search solutions would need the same manual translation.
Ultimately I decided this was acceptable as we are unlikely to have other characters that need translation for our case, the translation is easy to do visually, and it's transparent in most cases with ValueConverters. But it isn't a general solution that would work for any possible string id.
Edit:
On second thought, this solution is deficient. Cosmos does actually allow creating documents with illegal characters in the ID, it just doesn't allow accessing or deleting them easily. An ideal solution would prevent all illegal characters across the board, whether expected or not.

How to verify security of meekrodb?

Meekrodb is a simple php-->mysql library. How do I test/verify that it is secure, such as against sql injection attacks?
The first option is to read the FAQ:
Are there any extra precautions I should take to prevent SQL injection?
MeekroDB makes SQL injection 100% impossible if you follow 2 simple rules. First, never use the %l (literal) placeholder with user-supplied data. This placeholder doesn't escape your data the way all of the others do. Second, never change the character set at runtime using MySQL commands SET NAMES or SET CHARACTER SET. If you need to change the character set, only use DB::$encoding at the same place where you set your MySQL username/password.
The second option, assuming you have a license:
Use a query/input field by filling in:
'\"
Which could potentially cause the weirdest errors you have ever seen. It might just be converted, in which case you prove it's secure.
Update
For example, going from there first claim (in combination with security):
"MeekroDB takes care of quotes and escaping for you."
Now for testing this specific claim they have provided there way of handling this situation:
DB::query("SELECT * FROM login WHERE username=%s AND password=%s",
$username,
$password);
To prove that this claim is actually true, you could write a small application which would (for example) input:
$username = "''""\";

Node.js URL-encoding for pre-RFC3986 urls (using + vs %20)

Within Node.js, I am using querystring.stringify() to encode an object into a query string for usage in a URL. Values that have spaces are encoded as %20.
I'm working with a particularly finicky web service that will only accept spaces encoded as +, as used to be commonly done prior to RFC3986.
Is there a way to set an option for querystring so that it encodes spaces as +?
Currently I am simply doing a .replace() to replace all instances of %20 with +, but this is a bit tedious if there is an option I can set ahead of time.
If anyone still facing this issue, "qs" npm package has feature to encode spaces as +
qs.stringify({ a: 'b c' }, { format : 'RFC1738' })
I can't think of any library doing that by default, and unfortunately, I'd say your implementation may be the more efficient way to do this, since any other option would probably either do what you're already doing, or will use slower non-compiled pure JavaScript code.
What about asking the web service provider to follow the RFC?
https://github.com/kvz/phpjs is a node.js package that provides all the php functions. The http_build_query implementation at the time of writing this only supports urlencode (the query string includes + instead of spaces), but hopefully soon will include the enc_type parameter / rawurlencode (%20's for spaces).
See http://php.net/http_build_query.
RFC1738 (+'s) will be the default enc_type either way, so you can use it immediately for your purposes.

Problem using unicode in URLs with cgi.PATH_INFO in ColdFusion

My ColdFusion (MX7 on IIS 6) site has search functionality which appends the search term to the URL e.g. http://www.example.com/search.cfm/searchterm.
The problem I'm running into is this is a multilingual site, so the search term may be in another language e.g. القاهرة leading to a search URL such as http://www.example.com/search.cfm/القاهرة
The problem is when I come to retrieve the search term from the URL. I'm using cgi.PATH_INFO to retrieve the path of the search page and the search term and extracting the search term from this e.g. /search.cfm/searchterm however, when unicode characters are used in the search they are converted to question marks e.g. /search.cfm/??????.
These appear actual question marks, rather than the browser not being able to format unicode characters, or them being mangled on output.
I can't find any information about whether ColdFusion supports unicode in the URL, or how I can go about resolving this and getting hold of the complete URL in some way - does anyone have any ideas?
Cheers,
Tom
Edit: Further research has lead me to believe the issue may related to IIS rather than ColdFusion, but my original query still stands.
Further edit
The result of GetPageContext().GetRequest().GetRequestUrl().ToString() is http://www.example.com/search.cfm/searchterm/????? so it appears the issue goes fairly deep.
Yeah, it's not really ColdFusion's fault. It's a common problem.
It's mostly the fault of the original CGI specification, which specifies that PATH_INFO has to be %-decoded, thus losing the original %xx byte sequences that would have allowed you to work out which real characters were meant.
And it's partly IIS's fault, because it always tries to read submitted %xx bytes in the path part as UTF-8-encoded Unicode (unless the path isn't a valid UTF-8 byte sequence in which case it plumps for the Windows default code page, but gives you no way to find out this has happened). Having done so, it puts it in environment variables as a Unicode string (as envvars are Unicode under Windows).
However most byte-based tools using the C stdio (and I'm assuming this applies to ColdFusion, as it does under Perl, Python 2, PHP etc.) then try to read the environment variables as bytes, and the MS C runtime encodes the Unicode contents again using the Windows default code page. So any characters that don't fit in the default code page are lost for good. This would include your Arabic characters when running on a Western Windows install.
A clever script that has direct access to the Win32 GetEnvironmentVariableW API could call that to retrieve a native-Unicode environment variable which they could then encode to UTF-8 or whatever else they wanted, assuming that the input was also UTF-8 (which is what you'd generally want today). However, I don't think CodeFusion gives you this access, and in any case it only works from IIS6 onwards; IIS5.x will throw away any non-default-codepage characters before they even reach the environment variables.
Otherwise, your best bet is URL-rewriting. If a layer above CF can convert that search.cfm/القاهرة to search.cfm/?q=القاهرة then you don't face the same problem, as the QUERY_STRING variable, unlike PATH_INFO, is not specified to be %-decoded, so the %xx bytes remain where a tool at CF's level can see them.
Here's what you could do:
<cfset url.searchTerm = URLEncodedFormat("القاهر", "utf-8") >
<cfset myVar = URLDecode(url.searchTerm , "utf-8") >
Ofcourse, I'd recommend that you work with something like this in that case:
yourtemplate.cfm?searchTerm=%C3%98%C2%A7%C3%99%E2%80%9E
And then you do URL rewriting in IIS (if not already done by framework/rest of the app) http://learn.iis.net/page.aspx/461/creating-rewrite-rules-for-the-url-rewrite-module/ to match your pattern.
You can set the character encoding of the URL and FORM scope using the setEncoding() function:
http://www.adobe.com/livedocs/coldfusion/7/htmldocs/wwhelp/wwhimpl/common/html/wwhelp.htm?context=ColdFusion_Documentation&file=00000623.htm
You need to do this before you access any of the variables in this scope.
But, the default encoding of those scopes is already UTF-8, so this may not help. Also, this would probably not affect the CGI scope.
Is the IIS Server logging the correct characters into the request log?

Can I be vulnerable to SQL injection by appending input with no whitespace to my query?

I am taking in a string from user input, and splitting it on whitespace (using \w) into an array of strings. I then loop through the array, and append a part of the where clause like this:
query += " AND ( "
+ "field1 LIKE '%" + searchStrings[i] +"%' "
+ " OR field2 LIKE '%" + searchStrings[i] +"%' "
+ " OR field3 LIKE '%" + searchStrings[i] +"%' "
+ ") ";
I feel like this is dangerous, since I am appending user input to my query. However, I know that there isn't any whitespace in any of the search strings, since I split the initial input on whitespace.
Is it possible to attack this via a SQL injection? Giving Robert');DROP TABLE students;-- wouldn't actually drop anything, since there needs to be whitespace in there. In that example, it would not behave properly, but no damage would be done.
Can anyone with more experience fighting SQL injections help me either fix this, or put my mind at ease?
Thanks!
EDIT:
Wow, that is a lot of great input. Thank you everyone who responded. I will investigate full-text search and, at a minimum, parameterize my query.
Just so I can better understand the problem, would it be possible to inject if all whitespace AND single quotes were escaped?
Any time you allow a user to enter data into a query string like this you are vulnerable to SQL injection and it should be avoided like the plague!
You should be very careful how you allow your searchStrings[] array to be populated. You should always append variable data to your query using parameter objects:
+ field1 like #PropertyVal Or field2 like #PropertyVal Or field3 like #PropertyVal etc...
And if you're using SQL Server for example
Query.Parameters.Add(new SqlParameter("PropertyVal", '%' + searchStrings[i] + '%'));
Be very wary how you build a query string that you're going to run against a production server, especially if it has any data of consequence in it!
In your example you mentioned Little Bobby Tables
Robert');DROP TABLE students;--
And cited that because it needs white space, you couldn't do it - but if the malicious user encoded it using something like this:
Robert');Exec(Replace('Drop_Table_students','_',Char(32)));--
I would say - better to be safe and do it the right way. There's no simple way to make sure you catch every scenario otherwise...
Yes, this script did not contain any whitespaces, just encoded characters that SQL decoded back and executed:
http://www.f-secure.com/weblog/archives/00001427.html
The injected script was something like this:
DECLARE%20#S%20NVARCHAR(4000);SET%20#S=CAST(0x440045004300
4C00410052004500200040005400200076006100720063006800610072
00280032003500350029002C0040004300200076006100720063006800
610072002800320035003500290020004400450043004C004100520045
0020005400610062006C0065005F0043007500720073006F0072002000
43005500520053004F005200200046004F0052002000730065006C0065
0063007400200061002E006E0061006D0065002C0062002E006E006100
6D0065002000660072006F006D0020007300790073006F0062006A0065
00630074007300200061002C0073007900730063006F006C0075006D00
6E00730020006200200077006800650072006500200061002E00690064
003D0062002E0069006400200061006E006400200061002E0078007400
7900700065003D00270075002700200061006E0064002000280062002E
00780074007900700065003D003900390020006F007200200062002E00
780074007900700065003D003300350020006…
Wich SQL decoded to :
DECLARE #T varchar(255)'#C
varchar(255) DECLARE Table_Cursor
CURSOR FOR select a.name'b.name from
sysobjects a'syscolumns b where
a.id=b.id and a.xtype='u' and
(b.xtype=99 or b.xtype=35 or b…
and so on, so its possible to do whatever you want with every table in the database without using any whitespaces.
There are just too many ways to get this wrong, that I wouldn't rely if anyone told me "no, this will be safe, because ..."
What about escaping the whitespace in some form (URL-Encode or somethign). What about using non-obvious Unicode whitespace characters that your simple tests don't check for. What if your DB supports some malicious operations that don't require a white space?
Do the correct way: Use a PreparedStatement (or whatever your platform uses for injection-safe parameterization), append and prepend the "%" to the user input and use that as a parameter.
The rule of thumb is: If the string you're appending isn't SQL, it must be escaped using prepared statements or the correct 'escape' function from your DB client library.
I totally agree that query parameters are the absolutely safest way to go. With them you have no risk of SQL injection whatsover (unless you do something stupid with the parameters) and there is no overhead of escaping.
If your DBMS does not support query parameters, then it MUST support string escaping. In the worst case you can try to escape single-quotes yourself, although there still is a Unicode exploit that can circumvent this. However, if your DBMS does not support query parameters, it probably doesn't support Unicode either. :)
Added: Also. queries like you wrote up there are killers for performance - no indexes can be used. I'd advise to look up your DBMS' full-text-indexing capabilities. They are meant exactly for cases such as this.
Yes, they could still inject items, without spaces it might not do much, but it is still a vulnerability.
In general blindly adding user input to a query is not a good idea.
Here's a trivial injection, if I set field1 to this I've listed all rows in your database. This may be bad for security...
'+field1+'
You should use parameters (these are valid in inline SQL too), e.g.
AND Field1 = #Field1

Resources