I have to update old projects at work. I do not have any experience with classic ASP, although I'm familiar with PHP scripting.
Are there any functions I should use?
Can you provide me with a good function for some basic protection?
Is there something like a parameterized query in ASP?
Thanks!
Yes you can use parametrized queries in classic ASP (more accurately, classic ADO).
Here is a link.
As for encoding output, I might be tempted to create a wrapper for the latest Microsoft Anti-XSS library and call it with Server.CreateObject. I am far from an expert on this kind of thing as I spend much more time in .Net, so I only think this would work.
Server.HTMLEncode is really not good enough, as it only blacklists a few encoding characters. The Anti-XSS library is much better as it whitelists what is acceptable.
Always use Server.HTMLEncode to sanitize user input.
For example, if you're setting a variable from a form text box:
firstName = Server.HTMLEncode(trim(request.form("firstname")))
Watch out for SQL injection. Do not concatenate user input to a SQL string and then execute it. Instead, always use parameterized queries.
There are a number of functions starting with Is, such as IsNumeric, IsArray, etc., that might be of interest. Also, if you're expecting an integer, you could use CLng(Request("blabla")) to get it; if the value is not numeric, CLng will raise an error.
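Here is a minimal sketch of the same fail-fast idea in JavaScript rather than VBScript. The name parseStrictInt is made up for illustration and is not a built-in. It accepts only an optionally signed run of digits and throws on anything else, which is stricter than CLng.

```javascript
// Hypothetical helper (not a built-in): accept only an optionally signed run of digits.
function parseStrictInt(raw) {
  const text = String(raw).trim();
  if (!/^[+-]?\d+$/.test(text)) {
    throw new TypeError('Expected an integer, got: ' + JSON.stringify(text));
  }
  const value = Number(text);
  // Reject values too large to be represented exactly.
  if (!Number.isSafeInteger(value)) {
    throw new RangeError('Integer out of range: ' + text);
  }
  return value;
}
```

Calling parseStrictInt on something like "1 OR 1=1" throws before the value can get anywhere near a query.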
One way to do it might be to add a check in a header.asp file that iterates through the Request object looking for inappropriate characters. For example:
<%
' Request.Form enumerates field names, so look up each value before testing it.
For Each x In Request.Form ' Do this for Request.QueryString also
    If InStr(Request.Form(x), "<") <> 0 Then
        ' encode the value or redirect to error page?
    End If
Next
%>
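For comparison, the same kind of scan can be sketched in JavaScript. Everything here is an assumption for illustration: fields is a plain object standing in for Request.Form or Request.QueryString, and findSuspiciousFields and its default pattern are invented names, not part of ASP.

```javascript
// `fields` stands in for Request.Form / Request.QueryString as a plain object.
// The default pattern (angle brackets and quotes) is only an illustrative choice.
function findSuspiciousFields(fields, pattern = /[<>"']/) {
  return Object.keys(fields).filter((name) => pattern.test(String(fields[name])));
}

const flagged = findSuspiciousFields({
  firstname: 'Alice',
  comment: '<script>alert(1)</script>',
});
// `flagged` now lists the names of the fields whose values matched the pattern.
```

What to do with a flagged field (encode it, reject the request, log it) is still up to the page.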
Related
I've got some source code that has some cross site scripting vulnerabilities in it. There is no input validation that happens when the browser sends data over to the server which is executing server-side Javascript and classic ASP (IIS 7.0).
My question is, is there a way to override the Request.Form("foo") object/method so that I can call a sanitization function too and get rid of prohibited JS/HTML? I don't want to do a find and replace on every single file everywhere Request.Form is called. I was hoping for something more elegant.
Any suggestions are appreciated.
I don't think you can change Request.Form members.
What you can do, as a partial solution, is to write code that runs first on every page (for example, using an include directive), loops over Request.Form, Request.QueryString, etc., and terminates execution (Response.End) if it finds suspicious input. This solution is partial because it doesn't really sanitize input; it just stops execution when it finds suspicious text.
Another option: Create an array, parallel to Request.Form. Populate this array with the same members as in Request.Form, but this time sanitized. Then, quickly do a Find-and-Replace over your whole code base, and change Request.Form to your custom array variable.
There is a way to replace the whole Request object with another COM object, but it's an insane solution, and it would still require that all ASP files that use Form contain a common top include file. It's not possible to replace the Request object or one of its members globally at the application level.
The correct solution to the problem, your statement "don't want to do a find and replace on every single file everywhere" notwithstanding, is to perform such global replace.
Despite the number of .asp files that exist the cost is no more than knocking up a simple program to open each ASP file in a folder tree, adding an include line and replacing Request.Form.
I'm wondering what the bare minimum to make a site safe from XSS is.
If I simply replace < with &lt; in all user submitted content, will my site be safe from XSS?
Depends hugely on context.
Also, encoding less than only isn't that flash of an idea. You should just encode all characters which have special meaning and could be used for XSS...
<
>
"
'
&
A trivial example of where encoding only the less-than sign won't help is something like this...
Welcome to Dodgy Site. Please link to your homepage.
Malicious user enters...
http://www.example.com" onclick="window.location = 'http://nasty.com'; return false;
Which obviously becomes...
<a href="http://www.example.com" onclick="window.location = 'http://nasty.com'; return false;">View user's website</a>
Had you encoded double quotes, that attack would not be valid.
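To make that concrete, here is a rough sketch comparing an encoder that handles only the less-than sign with one that handles all five characters listed above. The helper names (escapeHtml, escapeLessThanOnly) and the way the link is assembled are assumptions for illustration, not code from any particular site.

```javascript
const HTML_ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

// Encode all five characters that have special meaning in HTML.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// The "bare minimum" approach from the question, for comparison.
function escapeLessThanOnly(value) {
  return String(value).replace(/</g, '&lt;');
}

const homepage =
  'http://www.example.com" onclick="window.location = \'http://nasty.com\'; return false;';

// The double quote survives, so the attacker's onclick becomes a real attribute.
const weakLink = '<a href="' + escapeLessThanOnly(homepage) + '">View user\'s website</a>';

// The double quote is encoded, so the whole input stays inside the href value.
const safeLink = '<a href="' + escapeHtml(homepage) + '">View user\'s website</a>';
```

Note that this only covers breaking out of the attribute; it says nothing about whether the URL itself is one you want to link to.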
There are also cases where the encoding of the page counts, i.e. if your page character set is not correct or does not match in all applicable spots, then there are potential vulnerabilities. See http://openmya.hacker.jp/hasegawa/security/utf7cs.html for details.
No. You have to escape all user input, regardless of what it contains.
Depending on the framework you are using, many now have an input validation module. A key piece I tell software students when I do lectures is USE THE INPUT VALIDATION MODULES WHICH ALREADY EXIST!
Reinventing the wheel is often less effective than using the tried and tested modules which already exist. .NET has most of what you might need built in: it is really easy to whitelist (where you know the only input allowed) or blacklist (a bit less effective, as known 'bad' things always change, but still valuable).
If you escape all user input you should be safe.
That means EVERYTHING, EVERYWHERE it shows up. Even a username on a profile.
The Zend Framework Manual says the following:
60.3.1. Escaping Output
One of the most important tasks to perform in a view script is to make sure that output is escaped properly; among other things, this helps to avoid cross-site scripting attacks. Unless you are using a function, method, or helper that does escaping on its own, you should always escape variables when you output them.
Why 'always'? Why do I have to escape variables that have not been created or altered by user input?
Users aren't the only source of dodgy strings in output. Consider, for example, the apparently safe string "Romeo & Juliet" coming out of a database. No cross-site scripting there, you say? True enough. Stick it in a web page, however, and the raw ampersand could cause some interesting problems with validation, parsing, etc.
Output escaping isn't just to guard against malicious or accidentally borked input, it ensures that the output is thoroughly sanitised and treated as having no special meaning in the surrounding output format, whether that's HTML, XML, JSON or whatever.
As a rule, I would escape anything coming from user input, a data source or even calculations. You want the output to be predictable, escaping ensures that it is. If the value when converted to a string contains characters that break your desired markup, things would get messy.
If you're using a view, $this->escape($variableToEscape) should suffice.
Another thing is that things which are hard-coded one day often become user-generated, or at least database-generated, another day. It's just better practice to manage the output of variables in your code.
You could look at it this way: you should always HTML-encode variables, unless you know that they've already been encoded.
Say you have a variable that contains:
foo <b>bar</b>
If you know that it contains HTML tags, and you're okay with that, then you can say that this variable has already been properly HTML-encoded. You could even assign it to a different variable type to make the compiler aware of the distinction (Joel's idea), and have your output functions handle these types without escaping them.
Of course, this means that
foo & <b>bar</b>
is an incorrect value; you would need to ensure that it's:
foo &amp; <b>bar</b>
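The "different variable type" idea could look roughly like the sketch below. SafeHtml, render and escapeHtml are invented names for illustration, and this is only one way to draw the distinction: plain strings are always encoded on output, while values explicitly wrapped as markup are passed through untouched (so they are never double-encoded either).

```javascript
const ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => ENTITIES[ch]);
}

// Marker type for strings that are already valid, trusted HTML.
class SafeHtml {
  constructor(markup) {
    this.markup = markup;
  }
}

// Output function: encode plain text, pass already-encoded markup through as-is.
function render(value) {
  return value instanceof SafeHtml ? value.markup : escapeHtml(value);
}

const fromText = render('foo & <b>bar</b>');
const fromMarkup = render(new SafeHtml('foo &amp; <b>bar</b>'));
```

The wrapper does not make anything safe by itself; it only records that someone decided the value is markup, so that decision is visible in the code.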
I think the best practice here is always escape output unless you intend to output a raw HTML fragment. Even "safe" data can contain characters which need to be escaped. For example, consider the e-mail address '"Bob" <bob@bob.com>'. If you don't escape it, the browser will think <bob@bob.com> is a tag.
Obviously you want to escape things that are the result of user data to prevent XSS attacks. Since you're often changing what you're republishing and what you're not, you probably can't remember all the places that need to be changed. So even if you get all the nuances correct now and your site is secure from XSS today, you may at some point add user input to some variable you're not escaping (or, more likely, to some variable that feeds another variable that feeds one you're not escaping), which would open you up to XSS attacks.
Escaping by default would prevent that attack.
The other reason is more conceptual: with MVC, all of your markup--which is, by definition, the "view"--should be in your view templates. So if your controller is determining the view, and the view contains all the markup, why not escape your variables?
Well, if you have hard-coded values (let's say language translations, which you read from a database or XML file), you don't have to escape them.
But if there is a value that has been created or modified by a user, even, say, in an admin panel, you have to escape it, because you don't know what kind of data a user (or, to be more radical, even an administrator) will send.
When do you call Microsoft.Security.Application.AntiXss.HtmlEncode? Do you do it when the user submits the information or do you do when you're displaying the information?
How about for basic stuff like First Name, Last Name, City, State, Zip?
You do it when you are displaying the information. Preserve the original as it was entered, convert it for display on a web page. Let's say you were displaying it in some other way, like exporting it into Excel. In that case, you'd want to export the preserved original.
Encode every single string.
You should only encode or escape your data at the last possible moment, whether that's directly before you put it in the database or directly before you display it on the screen. If you encode too soon, you run the risk of accidentally double encoding (you'll often see a literal &amp; on newbies' websites, my own included).
If you do want to encode sooner than that, then take measures to avoid the double encoding. Joel wrote an article about good uses for Hungarian notation, where he advocated the use of prefixes to indicate what is stored in the variable, e.g. "us" for unsafe string, "ss" for safe string.
usFirstName = getUserInput('firstName');
ssFirstName = cleanString(usFirstName);
Also note that it doesn't matter what the type of information is (city, zip code, etc) - leaving any of these unchecked is asking for trouble.
It depends on your situation. Where I work, for years the company did no HTML encoding, so when we started doing it, it would have been almost impossible to find every location within the system that user input could be displayed on the page.
Instead we chose to sanitize input on its way into the system since there were fewer input points than output points. We sanitize immediately before inputting data into the DB, although we don't use Microsoft's AntiXss library, we use a set of homebrew methods that whitelist ranges of HTML tags and characters depending on the type of input.
If you're designing the system from scratch, or you have a system that is small (or managed well) enough to encode output, follow Corey's suggestion. It's definitely the better way to do it.
Encoding is not a property of the data, it is a property of the transport mechanism. Therefore you should unencode data when you receive it, and encode it appropriately before transmission. The transport mechanism determines what sort of encoding is necessary.
This principle holds true whether your transport mechanism is HTML, HTTP, smoke signals, etc. The trick is knowing how to do the types of encoding manually, and when various frameworks do the steps for you automagically. For instance, ASP.NET will encode data assigned to a System.Web.UI.WebControls.Button's Text, but not text assigned to a System.Web.UI.WebControls.Literal's Text. jQuery will encode content you set with .text(), but not content you set with .html().
I am not concerned about other kinds of attacks. Just want to know whether HTML Encode can prevent all kinds of XSS attacks.
Is there some way to do an XSS attack even if HTML Encode is used?
No.
Putting aside the subject of allowing some tags (not really the point of the question), HtmlEncode simply does NOT cover all XSS attacks.
For instance, consider server-generated client-side JavaScript: if the server dynamically outputs HTML-encoded values directly into client-side JavaScript, HtmlEncode will not stop injected script from executing.
Next, consider the following pseudocode:
<input value=<%= HtmlEncode(somevar) %> id=textbox>
Now, in case it's not immediately obvious, if somevar (sent by the user, of course) is set for example to
a onclick=alert(document.cookie)
the resulting output is
<input value=a onclick=alert(document.cookie) id=textbox>
which would clearly work. Obviously, this can be (almost) any other script... and HtmlEncode would not help much.
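The same thing can be reproduced in a few lines of JavaScript. The escapeHtml helper below is a stand-in for HtmlEncode (an assumption; the real implementations differ in detail), but the point holds for any encoder that only touches HTML special characters: the payload contains none, so it comes out unchanged.

```javascript
const ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

// Stand-in for HtmlEncode: encodes only the HTML special characters.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => ENTITIES[ch]);
}

const somevar = 'a onclick=alert(document.cookie)';

// Unquoted attribute: the space ends the value and onclick becomes its own attribute.
const unquoted = '<input value=' + escapeHtml(somevar) + ' id=textbox>';

// Quoted attribute: the whole payload stays inside the value.
const quoted = '<input value="' + escapeHtml(somevar) + '" id=textbox>';
```

So quoting attribute values is part of the fix here, not just encoding them.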
There are a few additional vectors to be considered... including the third flavor of XSS, called DOM-based XSS (wherein the malicious script is generated dynamically on the client, e.g. based on # values).
Also don't forget about UTF-7 type attacks - where the attack looks like
+ADw-script+AD4-alert(document.cookie)+ADw-/script+AD4-
Nothing much to encode there...
The solution, of course (in addition to proper and restrictive white-list input validation), is to perform context-sensitive encoding: HtmlEncoding is great IF your output context IS HTML, or maybe you need JavaScriptEncoding, or VBScriptEncoding, or AttributeValueEncoding, or... etc.
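As a rough illustration of what a JavaScript-context encoder does differently from an HTML one, here is a sketch. encodeForJsString is an invented name and this is not the Anti-XSS library's implementation. It keeps ASCII letters and digits and turns every other UTF-16 code unit into a \uXXXX escape, so nothing in the value can close the surrounding string literal or the script block.

```javascript
// Sketch of a JavaScript-string-context encoder (hypothetical helper).
// Everything except ASCII letters and digits becomes a \uXXXX escape.
function encodeForJsString(value) {
  return String(value).replace(/[^A-Za-z0-9]/g, (ch) =>
    '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0'));
}

// A value that would break out of: var name = '<value here>';
const payload = "';alert(1)//";
const encoded = encodeForJsString(payload);

// The server would emit something like this into the page.
const scriptLine = "var name = '" + encoded + "';";
```

The encoded value still evaluates to the original string when the browser parses the literal; it just cannot terminate it early.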
If you're using MS ASP.NET, you can use their Anti-XSS Library, which provides all of the necessary context-encoding methods.
Note that all encoding should not be restricted to user input, but also stored values from the database, text files, etc.
Oh, and don't forget to explicitly set the charset, both in the HTTP header AND the META tag, otherwise you'll still have UTF-7 vulnerabilities...
For more information, and a pretty definitive list (constantly updated), check out RSnake's Cheat Sheet: http://ha.ckers.org/xss.html
If you systematically encode all user input before displaying it, you are still not 100% safe.
(See @Avid's post for more details.)
In addition, problems arise when you need to let some tags go unencoded so that users can post images or bold text, or use any feature that requires the user's input to be processed as (or converted to) unencoded markup.
You will have to set up a decision making system to decide which tags are allowed and which are not, and it is always possible that someone will figure out a way to let a non allowed tag to pass through.
It helps if you follow Joel's advice of Making Wrong Code Look Wrong or if your language helps you by warning/not compiling when you are outputting unprocessed user data (static-typing).
If you encode everything, it will (depending on your platform and the implementation of HtmlEncode). But any useful web application is so complex that it's easy to forget to check every part of it. Or maybe a third-party component isn't safe. Or maybe some code path that you thought did the encoding didn't, so you forgot it somewhere else.
So you might want to check things on the input side too. And you might want to check stuff you read from the database.
As mentioned by everyone else, you're safe as long as you encode all user input before displaying it. This includes all request parameters and data retrieved from the database that can be changed by user input.
As mentioned by Pat, you'll sometimes want to display some tags, just not all tags. One common way to do this is to use a markup language like Textile, Markdown, or BBCode. However, be aware that even markup languages can be vulnerable to XSS.
# Markup example
[foo](javascript:alert\('bar'\);)
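One way to guard against that particular vector is to check link targets against an allow-list of URL schemes before emitting them. The sketch below is only that, a sketch: isSafeLinkTarget, the list of schemes and the placeholder base URL are all assumptions. It relies on the standard URL class to do the parsing, so that mixed-case or padded variants of "javascript:" are normalised before the check.

```javascript
// Schemes we are willing to link to (an illustrative choice, adjust as needed).
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:', 'mailto:']);

// `base` is a placeholder used only to resolve relative links such as "/about".
function isSafeLinkTarget(href, base = 'https://example.invalid/') {
  let url;
  try {
    url = new URL(href, base);
  } catch (err) {
    return false; // Unparseable targets are rejected outright.
  }
  return ALLOWED_PROTOCOLS.has(url.protocol);
}
```

A markup renderer would call this on every link target and drop (or neutralise) the link when it returns false. This is no substitute for a proper sanitizing library, but it shows why checking the scheme matters.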
If you do decide to let "safe" tags through I would recommend finding some existing library to parse & sanitize your code before output. There are a lot of XSS vectors out there that you would have to detect before your sanitizer is fairly safe.
I second metavida's advice to find a third-party library to handle output filtering. Neutralizing HTML characters is a good approach to stopping XSS attacks. However, the code you use to transform metacharacters can be vulnerable to evasion attacks; for instance, if it doesn't properly handle Unicode and internationalization.
A classic simple mistake homebrew output filters make is to catch only < and >, but miss things like ", which can break user-controlled output out into the attribute space of an HTML tag, where Javascript can be attached to the DOM.
No, just encoding common HTML tokens DOES NOT completely protect your site from XSS attacks. See, for example, this XSS vulnerability found in google.com:
http://www.securiteam.com/securitynews/6Z00L0AEUE.html
The important thing about this type of vulnerability is that the attacker is able to encode his XSS payload using UTF-7, and if you haven't specified a different character encoding on your page, a user's browser could interpret the UTF-7 payload and execute the attack script.
One other thing you need to check is where your input comes from. You can use the referrer string (most of the time) to check that it's from your own page, but putting in a hidden random number or something in your form and then checking it (with a session set variable maybe) also helps knowing that the input is coming from your own site and not some phishing site.
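The hidden random value idea can be sketched like this in Node.js. The function names and the way the token is stored are assumptions; the crypto calls themselves (randomBytes, timingSafeEqual) are standard Node.js APIs. You would generate a token when rendering the form, keep a copy in the session, and compare the two when the form comes back.

```javascript
const crypto = require('crypto');

// Generate an unpredictable token to store in the session and in a hidden form field.
function newFormToken() {
  return crypto.randomBytes(32).toString('hex');
}

// Compare the session copy with the submitted copy without leaking timing information.
function tokensMatch(sessionToken, submittedToken) {
  if (typeof sessionToken !== 'string' || typeof submittedToken !== 'string') {
    return false;
  }
  const a = Buffer.from(sessionToken);
  const b = Buffer.from(submittedToken);
  // timingSafeEqual throws on different lengths, so check the length first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

If tokensMatch returns false, treat the submission as not coming from your own form and reject it.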
I'd like to suggest HTML Purifier (http://htmlpurifier.org/) It doesn't just filter the html, it basically tokenizes and re-compiles it. It is truly industrial-strength.
It has the additional benefit of allowing you to ensure valid html/xhtml output.
Also seconding Textile: it's a great tool and I use it all the time, but I'd run its output through HTML Purifier too.
I don't think you understood what I meant re tokens. HTML Purifier doesn't just 'filter', it actually reconstructs the html. http://htmlpurifier.org/comparison.html
I don't believe so. Html Encode converts all functional characters (characters which could be interpreted by the browser as code) into entity references, which cannot be parsed by the browser as markup and thus cannot be executed.
&lt;script/&gt;
There is no way that the above can be executed by the browser.
Unless there is a bug in the browser, of course.
myString.replace(/<[^>]*>?/gm, '');
I have used it successfully.
Strip HTML from Text JavaScript
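For anyone trying that one-liner, here is a small sketch wrapping it in a function (stripTags is an invented name) together with two sample calls. Keep in mind that it removes tag-like text rather than encoding it, so it also eats legitimate text that merely contains a less-than sign.

```javascript
// Wraps the regex from the answer above: removes anything that looks like a tag.
function stripTags(text) {
  return String(text).replace(/<[^>]*>?/gm, '');
}

// Ordinary markup is removed and the text content is kept.
const stripped = stripTags('<b>bold</b> text');

// Side effect: a bare "<" swallows everything after it up to the next ">" or the end.
const mangled = stripTags('a < b');
```

So this is a tool for producing plain text, not a replacement for encoding output.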