I have a rich-text editor on my site that I'm trying to protect against XSS attacks. I think I have pretty much everything handled, but I'm still unsure about what to do with images. Right now I'm using the following regex to validate image URLs, which I'm assuming will block inline javascript XSS attacks:
"https?://[-A-Za-z0-9+&##/%?=~_|!:,.;]+"
What I'm not sure of is how open this leaves me to XSS attacks from the remote image. Is linking to an external image a serious security threat?
The only thing I can think of is that the URL entered references a resource that returns "text/javascript" as its MIME type instead of some sort of image, and that javascript is then executed.
Is that possible? Is there any other security threat I should consider?
Another thing to worry about is that you can easily embed PHP code inside an image and upload that most of the time. The only thing an attack would then have to be able to do is find a way to include the image. (Only the PHP code will get executed, the rest is just echoed). Check the MIME-type won't help you with this because the attacker can easily just upload an image with the correct first few bytes, followed by arbitrary PHP code. (The same is somewhat true for HTML and Javascript code).
If the end-viewer is in a password protected area and your app contains Urls that initiate actions based on GET requests, you can make a request on the user's behalf.
Examples:
src="http://yoursite.com/deleteuser.xxx?userid=1234"
src="http://yoursite.com/user/delete/1234"
src="http://yoursite.com/dosomethingdangerous"
In that case, look at the context around it: do users only supply a URL? In that case it's fine to just validate the URLs semantics and MIME-type. If the user also gets to input tags of some sort you'll have to make sure that they are not manipulatable to do anything other then display images.
Related
On this site (archived snapshot) under “The Theory of XSS’, it says:
the hacker infects a legitimate web page with his malicious client-side script
My first question on reading this is: if the application is deployed on a server that is secure (as is the case with a bank for example), how can the hacker ever get access to the source code of the web page? Or can he/she inject the malicious script without accessing the source code?
With cross-site scripting, it's possible to infect the HTML document produced without causing the web server itself to be infected. An XSS attack uses the server as a vector to present malicious content back to a client, either instantly from the request (a reflected attack), or delayed though storage and retrieval (a stored attack).
An XSS attack exploits a weakness in the server's production of a page that allows request data to show up in raw form in the response. The page is only reflecting back what was submitted in a request... but the content of that request might hold characters that break out of ordinary text content and introduce HTML or JavaScript content that the developer did not intend.
Here's a quick example. Let's say you have some sort of templating language made to produce an HTML page (like PHP, ASP, CGI, or a Velocity or Freemarker script). It takes the following page and substitutes "<?=$name?>" with the unescaped value of the "name" query parameter.
<html>
<head><title>Example</title></head>
<body>Hi, <?=$name?></body>
</html>
Someone calling that page with the following URL:
http://example.com/unsafepage?name=Rumplestiltskin
Should expect to see this message:
Hi, Rumplestiltskin
Calling the same page with something more malicious can be used to alter the page or user experience substantially.
http://example.com/unsafepage?name=Rumplestiltskin<script>alert('Boo!')</script>
Instead of just saying, "Hi, Rumplestiltskin", this URL would also cause the page to pop up an alert message that says, "Boo!". That is, of course, a simplistic example. One could provide a sophisticated script that captures keystrokes or asks for a name and password to be verified, or clears the screen and entirely rewrites the page with shock content. It would still look like it came from example.com, because the page itself did, but the content is being provided somewhere in the request and just reflected back as part of the page.
So, if the page is just spitting back content provided by the person requesting it, and you're requesting that page, then how does a hacker infect your request? Usually, this is accomplished by providing a link, either on a web page or sent to you by e-mail, or in a URL-shortened request, so it's difficult to see the mess in the URL.
<a href="http://example.com?name=<script>alert('Malicious content')</script>">
Click Me!
</a>
A server with an exploitable XSS vulnerability does not run any malicious code itself-- its programming remains unaltered-- but it can be made to serve malicious content to clients.
That attacker doesn't need access to the source code.
A simple example would be a URL parameter that is written to the page. You could change the URL parameter to contain script tags.
Another example is a comment system. If the website doesn't properly sanitize the input/output, an attacker could add script to a comment, which would then be displayed and executed on the computers of anyone who viewed the comment.
These are simple examples. There's a lot more to it and a lot of different types of XSS attacks.
It's better to think of the script as being injected into the middle of the conversation between the badly coded web page and the client's web browser. It's not actually injected into the web page's code; but rather into the stream of data going to the client's web browser.
There are two types of XSS attacks:
Non-persistent: This would be a specially crafted URL that embeds a script as one of the parameters to the target page. The nasty URL can be sent out in an email with the intent of tricking the recipient into clicking it. The target page mishandles the parameter and unintentionally sends code to the client's machine that was passed in originally through the URL string.
Persistent: This attack uses a page on a site that saves form data to the database without handling the input data properly. A malicious user can embed a nasty script as part of a typical data field (like Last Name) that is run on the client's web browser unknowingly. Normally the nasty script would be stored to the database and re-run on every client's visit to the infected page.
See the following for a trivial example: What Is Cross-Site Scripting (XSS)?
My web application displays some sensitive information to a logged in user. The user visits another site without explicitly logging out of my site first. How do I ensure that the other site can not access the sensitive information without accept from me or the user?
If for example my sensitive data is in JavaScript format, the other site can include it in a script tag and read the side effects. I could continue on building a blacklist, but I do not want to enumerate what is unsafe. I want to know what is safe, but I can not find any documentation of this.
UPDATE: In my example JavaScript from the victim site was executed on the attacker's site, not the other way around, which would have been Cross Site Scripting.
Another example is images, where any other site can read the width and height, but I don't think they can read the content, but they can display it.
A third example is that everything without an X-Frame-Options header can be loaded into an iframe, and from there it is possible to steal the data by tricking the user into doing drag-and-drop or copy-and-paste.
The key point of Cross Site Attack is to ensure that your input from user which is going to be displayed, is legal, not containing some scripts. You may stop it at the beginning.
If for example my sensitive data is in JavaScript format, the other site can include it in a script tag
Yep! So don't put it in JavaScript/JSONP format.
The usual fix for passing back JSON or JS code is to put something unexecutable at the front to cause a syntax error or a hang (for(;;); is popular). So including the resource as a <script> doesn't get the attacker anywhere. When you access it from your own site you can fetch it with an XMLHttpRequest and chop off the prefix before evaluating it.
(A workaround that doesn't work is checking window.location in the returned script: when you're being included in an attacker's page they have control of the JavaScript environment and could sabotage the built-in objects to do unexpected things.)
Since I did not get the answer I was looking for here, I asked in another forum an got the answer. It is here:
https://groups.google.com/forum/?fromgroups=#!topic/mozilla.dev.security/9U6HTOh-p4g
I also found this page which answers my question:
http://code.google.com/p/browsersec/wiki/Part2#Life_outside_same-origin_rules
First of all like superpdm states, design your app from the ground up to ensure that either the sensitive information is not stored on the client side in the first place or that it is unintelligible to a malicious users.
Additionally, for items of data you don't have much control over, you can take advantage of inbuilt HTTP controls like HttpOnly that tries to ensure that client-side scripts will not have access to cookies like your session token and so forth. Setting httpOnly on your cookies will go a long way to ensure malicious vbscripts, javascripts etc will not read or modify your client-side tokens.
I think some confusion is still in our web-security knowledge world. You are afraid of Cross Site Request Forgery, and yet describing and looking for solution to Cross Site Scripting.
Cross Site Scripting is a vulnerability that allows malicious person to inject some unwanted content into your site. It may be some text, but it also may be some JS code or VB or Java Applet (I mentioned applets because they can be used to circumvent protection provided by the httpOnly flag). And thus if your aware user clicks on the malicious link he may get his data stolen. It depends on amount of sensitive data presented to the user. Clicking on a link is not only attack vector for XSS attack, If you present to users unfiltered contents provided by other users, someone may also inject some evil code and do some damage. He does not need to steal someone's cookie to get what he wants. And it has notnig to do with visiting other site while still being logged to your app. I recommend:XSS
Cross Site Request Forgery is a vulnerability that allows someone to construct specially crafted form and present it to Logged in user, user after submitting this form may execute operation in your app that he didin't intended. Operation may be transfer, password change, or user add. And this is the threat you are worried about, if user holds session with your app and visits site with such form which gets auto-submited with JS such request gets authenticated, and operation executed. And httpOnly will not protect from it because attacker does not need to access sessionId stored in cookies. I recommend: CSRF
A user recently reported to me that they could exploit the BBCode tag [img] that was available to them through the forums.
[img=http://url.to.external.file.ext][img]
Of course, it would show up as a broken image, however the browser would retrieve the file over there. I tested it myself and sure enough it was legit.
I'm not sure how to prevent this type of XSS injection other than downloading the image and checking if it is a legitimate image through PHP. This easily could be abused with a insanely huge file.
Are there any other solutions to this?
You could request the headers and check if the file is actually an image.
Edit:
Sorry that I couldn't answer in more depth; I was enjoying dinner.
There are two ways I see it:
You check to see if the supplied address is actually a image when the post is submitted or viewed, you could accomplish this by checking the headers (making sure it's actually an image) or by using file extension. This isn't fool-proof and has some obvious issues (changing the image on the fly, etc.).
Secure your site that even if there is a compromise with the [img] tag there is no real problem, for example: the malicious code can't use stolen cookies.
Use a script that requests an external image and modifies the headers.
A basic way to check the remote files content type:
$Headers = get_headers('http://url.to.external.file.ext');
if($Headers[8] == 'text/html') {
echo 'Wrong content type.';
exit;
}
There's only two solutions to this problem. Either download the image and serve from your webserver, or only allow a white-list of url patterns for the images.
Some gotchas if you decide to download the images -
Make sure you have a validation for the maximum file size. There are ways to stop the download if the file exceeds a certain size, but these are language specific.
Check that the file is actually an image.
If you store it on the hard-disk, be sure to rename it. You shouldn't allow the user to control the file name on the system.
When you serve the images, use a throw-away domain, or use naked ip address to serve the images. If the browser is ever tricked in thinking the image is executable code, the same-origin policy will prevent further damage.
Like a lot of developers, I want to make JavaScript served up by Server "A" talk to a web service on Server "B" but am stymied by the current incarnation of same origin policy. The most secure means of overcoming this (that I can find) is a server script that sits on Server "A" and acts as a proxy between it and "B". But if I want to deploy this JavaScript in a variety of customer environments (RoR, PHP, Python, .NET, etc. etc.) and can't write proxy scripts for all of them, what do I do?
Use JSONP, some people say. Well, Doug Crockford pointed out on his website and in interviews that the script tag hack (used by JSONP) is an unsafe way to get around the same origin policy. There's no way for the script being served by "A" to verify that "B" is who they say they are and that the data it returns isn't malicious or will capture sensitive user data on that page (e.g. credit card numbers) and transmit it to dastardly people. That seems like a reasonable concern, but what if I just use the script tag hack by itself and communicate strictly in JSON? Is that safe? If not, why not? Would it be any more safe with HTTPS? Example scenarios would be appreciated.
Addendum: Support for IE6 is required. Third-party browser extensions are not an option. Let's stick with addressing the merits and risks of the script tag hack, please.
Currently browser venders are split on how cross domain javascript should work. A secure and easy to use optoin is Flash's Crossdomain.xml file. Most languages have a Cross Domain Proxies written for them, and they are open source.
A more nefarious solution would be to use xss how the Sammy Worm used to spread. XSS can be used to "read" a remote domain using xmlhttprequest. XSS isn't required if the other domains have added a <script src="https://YOUR_DOMAIN"></script>. A script tag like this allows you to evaluate your own JavaScript in the context of another domain, which is identical to XSS.
It is also important to note that even with the restrictions on the same origin policy you can get the browser to transmit requests to any domain, you just can't read the response. This is the basis of CSRF. You could write invisible image tags to the page dynamically to get the browser to fire off an unlimited number of GET requests. This use of image tags is how an attacker obtains documnet.cookie using XSS on another domain. CSRF POST exploits work by building a form and then calling .submit() on the form object.
To understand the Same Orgin Policy, CSRF and XSS better you must read the Google Browser Security Handbook.
Take a look at easyXDM, it's a clean javascript library that allows you to communicate across the domain boundary without any server side interaction. It even supports RPC out of the box.
It supports all 'modern' browser, as well as IE6 with transit times < 15ms.
A common usecase is to use it to expose an ajax endpoint, allowing you to do cross-domain ajax with little effort (check out the small sample on the front page).
What if I just use the script tag hack by itself and communicate strictly in JSON? Is that safe? If not, why not?
Lets say you have two servers - frontend.com and backend.com. frontend.com includes a <script> tag like this - <script src="http://backend.com/code.js"></script>.
when the browser evaluates code.js is considered a part of frontend.com and NOT a part of backend.com. So, if code.js contained XHR code to communicate with backend.com, it would fail.
Would it be any more safe with HTTPS? Example scenarios would be appreciated.
If you just converted your <script src="https://backend.com/code.js> to https, it would NOT be any secure. If the rest of your page is http, then an attacker could easily man-in-the-middle the page and change that https to http - or worse, include his own javascript file.
If you convert the entire page and all its components to https, it would be more secure. But if you are paranoid enough to do that, you should also be paranoid NOT to depend on an external server for you data. If an attacker compromises backend.com, he has effectively got enough leverage on frontend.com, frontend2.com and all of your websites.
In short, https is helpful, but it won't help you one bit if your backend server gets compromised.
So, what are my options?
Add a proxy server on each of your client applications. You don't need to write any code, your webserver can automatically do that for you. If you are using Apache, look up mod_rewrite
If your users are using the latest browsers, you could consider using Cross Origin Resource Sharing.
As The Rook pointed out, you could also use Flash + Crossdomain. Or you could use Silverlight and its equivalent of Crossdomain. Both technologies allow you to communicate with javascript - so you just need to write a utility function and then normal js code would work. I believe YUI already provides a flash wrapper for this - check YUI3 IO
What do you recommend?
My recommendation is to create a proxy server, and use https throughout your website.
Apologies to all who attempted to answer my question. It proceeded under a false assumption about how the script tag hack works. The assumption was that one could simply append a script tag to the DOM and that the contents of that appended script tag would not be restricted by the same origin policy.
If I'd bothered to test my assumption before posting the question, I would've known that it's the source attribute of the appended tag that's unrestricted. JSONP takes this a step further by establishing a protocol that wraps traditional JSON web service responses in a callback function.
Regardless of how the script tag hack is used, however, there is no way to screen the response for malicious code since browsers execute whatever JavaScript is returned. And neither IE, Firefox nor Webkit browsers check SSL certificates in this scenario. Doug Crockford is, so far as I can tell, correct. There is no safe way to do cross domain scripting as of JavaScript 1.8.5.
I read that spammers may be downloading a specific registration page on my site using curl. Is there any way to block that specific page from being CURLed, either through htaccess or other means?
I don't think this is possible to block curl, as curl has the ability to send user agents, cookies, etc. As far as I understand, it can completely emulate a normal user.
If you are worried about protecting a form, you can generate a random token which is submitted automatically when the form is submitted. That way, anyone who tries to make a script to automate registration will have to worry about scraping it first.
There is one weakness in CURL, which you can exploit, it can not run javascript like a browser. So you can take advantage of this fact, one first landing on the reg page, have your server side code check for a cookie, if it isnt there, send some javascript code to the browser, this code will set the cookie and do a redirect/reload ... after reload the server side again checks for the cookie, incase of browsers it will find it.. incase of curl the cookie generation and reload/redirect wont happen in the first place.
I hope i made some sense, bottom line .. utilize javascript to differentiate between curl and browser.
As Oren says, spammers can forge user-agents, so you can't just block the curl user-agent string. The typical solution here is some kind of CATPCHA. These are often jumbled images (though non-visual forms exist) sites (including StackOverflow) have you transcribe to prove you're human.