I need a output text which works like h:outputText with escape="false" attribute, but doesn't let scripts to run. After a little search I found tr:outputFormatted makes that, but in our project we doesn't use trinidad. Is there something like outputFormatted in tomahawk, or in another taglib?
for example,
<h:outputText id="id" value="<b>test text</b><script type="text/javascipt">alert('I dont want these alert to show');</script>" escape="false"/>
that shows 'test text' bold but it popups the alert dialog too, I don't want the script to run. it can write script code or delete it but shouldn't run.
Use a HTML parser to get rid of those malicious things.
Among others, Jsoup is capable of this. Here's an extract of relevance from its site.
Sanitize untrusted HTML
Problem
You want to allow untrusted users to supply HTML for output on your website (e.g. as comment submission). You need to clean this HTML to avoid cross-site scripting (XSS) attacks.
Solution
Use the jsoup HTML Cleaner with a configuration specified by a Whitelist.
String unsafe =
"<p><a href='http://example.com/' onclick='stealCookies()'>Link</a></p>";
String safe = Jsoup.clean(unsafe, Whitelist.basic());
// now: <p>Link</p>
So, all you basically need to do is the the following during preparing the text:
String sanitizedText = Jsoup.clean(rawText, Whitelist.basic());
(you can do it before or after saving the text in DB, but keep in mind that when doing it before without saving the original text, you can't detect malicious users and do social actions anymore)
and then display it as follows:
<h:outputText value="#{bean.sanitizedText}" escape="false" />
Related
Is there any HTML sanitizer or cleanup methods available in any JSF utilities kit or libraries like PrimeFaces/OmniFaces?
I need to sanitize HTML input by user via p:editor and display safe HTML output using escape="true", following the stackexchange style. Before displaying the HTML I'm thinking to store sanitized input data to the database, so that it is ready to safe use with escape="true" and XSS is not a danger.
In order to achieve that, you basically need a standalone HTML parser. HTML parsing is rather complex and the task and responsibility of that is beyond the scope of JSF, PrimeFaces and OmniFaces. You're supposed to just grab one of the many existing HTML parsing libraries.
An example is Jsoup, it has even a separate method for the particular purpose of sanitizing HTML against a Safelist: Jsoup#clean(). For example, if you want to allow some basic HTML without images, use Safelist.basic():
String sanitizedHtml = Jsoup.clean(rawHtml, Safelist.basic());
A completely different alternative is to use a specific text formatting syntax, such as Markdown (which is also used here). Basically all of those parsers also sanitize HTML under the covers. An example is CommonMark. Perhaps this is what you actually meant when you said "stackexchange style".
As to saving in DB, you'd better save both the raw and parsed forms in 2 separate text columns. The raw form should be redisplayed during editing. The parsed form should be updated in background when the raw form has been edited. During display, obviously only show the parsed form with escape="false".
See also:
Markdown or HTML
Is there any HTML sanitizer or cleanup methods available in any JSF utilities kit or libraries like PrimeFaces/OmniFaces?
I need to sanitize HTML input by user via p:editor and display safe HTML output using escape="true", following the stackexchange style. Before displaying the HTML I'm thinking to store sanitized input data to the database, so that it is ready to safe use with escape="true" and XSS is not a danger.
In order to achieve that, you basically need a standalone HTML parser. HTML parsing is rather complex and the task and responsibility of that is beyond the scope of JSF, PrimeFaces and OmniFaces. You're supposed to just grab one of the many existing HTML parsing libraries.
An example is Jsoup, it has even a separate method for the particular purpose of sanitizing HTML against a Safelist: Jsoup#clean(). For example, if you want to allow some basic HTML without images, use Safelist.basic():
String sanitizedHtml = Jsoup.clean(rawHtml, Safelist.basic());
A completely different alternative is to use a specific text formatting syntax, such as Markdown (which is also used here). Basically all of those parsers also sanitize HTML under the covers. An example is CommonMark. Perhaps this is what you actually meant when you said "stackexchange style".
As to saving in DB, you'd better save both the raw and parsed forms in 2 separate text columns. The raw form should be redisplayed during editing. The parsed form should be updated in background when the raw form has been edited. During display, obviously only show the parsed form with escape="false".
See also:
Markdown or HTML
I am using urlrewriting and my pattern matches ../products/product_id and maps to /product.jsf?product_id=$1.
The rewriting performs well, but I am having trouble generating dynamic links.
In a context of iteration:
...
<h:link value="view product" outcome="products/#{item.id}"/>
...
The case is very simple. I just want the url generated has the form "products/123", but the page doesn't render, I guess because the outcome cannot be resolved at generation time.
I could just generate a link with "/product.jsf" and add a view parameter. But I prefer the other way. How can I have this behavior?
If you don't have a valid navigation case outcome, just use plain HTML <a>.
view product
Depending on the current URI, you may only need to prepend the context path yourself.
view product
in my application i have some tinymce editors and the userinput is shown with
<h:outputText escape="false"/>
but how can i prevent malicious input, like javascript or iframes? Is there any lib which can filter the input strings?
UPDATE:
i found "htmlpurifier" but it is for php, is there anyting like this for java?
You'd need to use a HTML parser which supports cleaning/whitelisting tags/attributes. Among them there's Jsoup, it has a clean() method for exactly this purpose. Here's an extract of relevance from its site.
Sanitize untrusted HTML
Problem
You want to allow untrusted users to supply HTML for output on your website (e.g. as comment submission). You need to clean this HTML to avoid cross-site scripting (XSS) attacks.
Solution
Use the jsoup HTML Cleaner with a configuration specified by a Whitelist.
String unsafe =
"<p><a href='http://example.com/' onclick='stealCookies()'>Link</a></p>";
String safe = Jsoup.clean(unsafe, Whitelist.basic());
// now: <p>Link</p>
Using Microsoft's AntiXssLibrary, how do you handle input that needs to be edited later?
For example:
User enters:
<i>title</i>
Saved to the database as:
<i>title</i>
On an edit page, in a text box it displays something like:
<i>title</i> because I've encoded it before displaying in the text box.
User doesn't like that.
Is it ok not to encode when writing to an input control?
Update:
I'm still trying to figure this out. The answers below seem to say to decode the string before displaying, but wouldn't that allow for XSS attacks?
The one user who said that decoding the string in an input field value is ok was downvoted.
Looks like you're encoding it more than once. In ASP.NET, using Microsoft's AntiXss Library you can use the HtmlAttributeEncode method to encode untrusted input:
<input type="text" value="<%= AntiXss.HtmlAttributeEncode("<i>title</i>") %>" />
This results in
<input type="text" value="<i>title</i>" /> in the rendered page's markup and is correctly displayed as <i>title</i> in the input box.
Your problem appears to be double-encoding; the HTML needs to be escaped once (so it can be inserted into the HTML on the page without issue), but twice leads to the encoded version appearing literally.
You can call HTTPUtility.HTMLDecode(MyString) to get the text back to the unencoded form.
If you are allowing users to enter HTML that will then be rendered on the site, you need to do more than just Encode and Decode it.
Using AntiXss prevents attacks by converting script and markup to text. It does not do anything to "clean" markup that will be rendered directly. You're going to have to manually remove script tags, etc. from the user's input to be fully protected in that scenario.
You'll need to strip out script tags as well as JavaScript attributes on legal elements. For example, an attacker could inject malicious code into the onclick or onmouseover attributes.
Yes, the code inside input boxes is safe from scripting attacks and does not need to be encoded.