Capybara can not match xml page - cucumber

I have a problem matching response text on an XML page with Capybara.
When I use page.should(have_content(arg1)), Capybara raises an error that there is no /html element (there shouldn't be, as it's XML).
When I use page.should(have_xpath(arg1)), it raises: Element at 40 no longer present in the DOM (Capybara::Webkit::NodeNotAttachedError)
What is the correct way to test XML?

When using capybara-webkit, the driver will try to use a browser's HTML DOM to look for elements. That doesn't work, because you don't have an HTML document.
One workaround is to fall back to Capybara's string implementation:
xml = Capybara.string(page.body)
expect(xml).to have_xpath(arg1)
expect(xml).to have_content(arg1)
Assuming your page returns a content type of text/xml, capybara-webkit won't mess with the response body at all, so you can pass it through to Capybara.string (or directly to Nokogiri if you like).

Related

Geb Test Framework -- get raw page content

Is there a way to get the raw page content using Geb ?
For example, the following test should work (but PhantomJS seems to pad the JSON response with HTML code):
def "Get page content example -- health check"() {
    given:
    go "https://status.github.com/api/status.json"
    expect:
    assert driver.pageSource.startsWith('{"status":"(good)"')
}
Note that, YES I understand that I could just NOT use Geb, and simply just make a URL call in Groovy, but for a number of reasons I want to explicitly use Geb (one of the reasons is dealing with redirects).
What a web browser renders when it loads a URL depends on the browser itself; there is nothing you can do about it. PhantomJS uses the same engine as Chrome, thus the two of them render some HTML around the JSON. IE, Edge and Firefox do the same, by the way. HtmlUnit, for a change, renders the pure JSON. But why bother with exact matches like startsWith if you can just use a regular expression? It is much more flexible:
expect:
driver.pageSource =~ /"status":"good"/
This should work in all browser engines.
P.S.: You do not need assert in then: or expect: blocks, that is the beauty of Spock/Geb.
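The difference between the two checks can be sketched outside Geb as well. The following is a minimal JavaScript illustration, not Geb code; the HTML wrapper in wrappedSource is an assumption about what a browser might put around the JSON, and the helper names are hypothetical.

```javascript
// Bare JSON body, as a driver like HtmlUnit would return it.
const rawJson = '{"status":"good"}';

// Hypothetical page source with a browser-style wrapper around the JSON.
// The exact markup is an assumption and differs between engines.
const wrappedSource = `<html><head></head><body><pre>${rawJson}</pre></body></html>`;

// Exact prefix check: only succeeds when the source is the bare JSON.
function startsWithStatus(source) {
  return source.startsWith('{"status":"good"');
}

// Pattern check: succeeds wherever the fragment appears in the source.
function containsStatus(source) {
  return /"status":"good"/.test(source);
}

console.log(startsWithStatus(rawJson), containsStatus(rawJson));
console.log(startsWithStatus(wrappedSource), containsStatus(wrappedSource));
```

The prefix check fails as soon as any markup precedes the JSON, while the pattern check does not care where the fragment sits.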

verify heading text in css selector

This is my XPath: "//div[@class='city']/h4[text()='Newyork']"
I can use XPath in Geb, but I want to write a similar expression in CSS or, better, as a Groovy-ish, Geb-ish locator.
I have tried
.city>h4:'Newyork'
.city>h4:contains('Newyork')
but none worked.
I have referred to https://sqa.stackexchange.com/questions/362/a-way-to-match-on-text-using-css-locators
Geb can further filter the elements matched by a CSS selector if you pass a map of element attributes to its methods together with that selector. It also introduces a special attribute for matching the text of a node. Your selector would look like this:
$('div.city > h4', text: 'Newyork')
Please note that this works by fetching the text of every matched element and checking it against the provided value. In other words, the work is done on the JVM side and not in the browser, so you want your CSS selector to be as specific as possible and to match as few elements as possible; otherwise the selector will be very slow.

Embed HTML in JSF [duplicate]

Is there any HTML sanitizer or cleanup methods available in any JSF utilities kit or libraries like PrimeFaces/OmniFaces?
I need to sanitize HTML input by the user via p:editor and display safe HTML output using escape="false", following the stackexchange style. Before displaying the HTML, I'm thinking of storing the sanitized input data in the database, so that it is safe to use with escape="false" and XSS is not a danger.
In order to achieve that, you basically need a standalone HTML parser. HTML parsing is rather complex and the task and responsibility of that is beyond the scope of JSF, PrimeFaces and OmniFaces. You're supposed to just grab one of the many existing HTML parsing libraries.
An example is Jsoup. It even has a dedicated method for sanitizing HTML against a Safelist: Jsoup#clean(). For example, if you want to allow some basic HTML without images, use Safelist.basic():
String sanitizedHtml = Jsoup.clean(rawHtml, Safelist.basic());
A completely different alternative is to use a specific text formatting syntax, such as Markdown (which is also used here). Basically all of those parsers also sanitize HTML under the covers. An example is CommonMark. Perhaps this is what you actually meant when you said "stackexchange style".
As to saving in DB, you'd better save both the raw and parsed forms in 2 separate text columns. The raw form should be redisplayed during editing. The parsed form should be updated in background when the raw form has been edited. During display, obviously only show the parsed form with escape="false".
See also:
Markdown or HTML

Reusing Yesod widgets in AJAX results

I'm writing a very simple Yesod message list that uses AJAX to add new list items without reloading the page (both in the case of other users modifying the database, or the client themselves adding an item). This means I have to encode the HTML structure of the message items in both the Hamlet template (when the page loads initially) and the Julius template (for when the dynamic addition happens). They look something like this:
In homepage.hamlet:
$if not $ null messages
    <ul id=#{listId}>
        $forall Entity mid message <- messages
            <li id=#{toPathPiece mid}>
                <p>#{showMarkdown $ messageText message}
                <abbr .timeago title=#{showUTCTime $ messagePosted message}>
And in homepage.julius:
function(message) {
    $('##{rawJS listId}').prepend(
        $('<li>')
            .attr('id', message.id)
            .append('<p>' + message.text + '</p>')
            .append($('<abbr class=timeago />')
                .attr('title', message.posted).timeago())
            .slideDown('slow')
    );
}
I'd love to be able to unify these two representations somehow. Am I out of luck, or could I somehow abuse widgets into both generating an HTML response, and filling in code in a JavaScript file?
Note: Of course, I understand that the templates would have to work very differently, since the AJAX call is getting its values from a JS object, not from the server. It's a long shot, but I thought I'd see if anyone's thought about this before.
I think it's something of an AJAX best practice to pick one place to do your template rendering, either on the server or on the client. Yesod is (currently) oriented toward doing the rendering on the server.
This can still work with AJAX replacement of contents, though. Instead of getting a JSON response from the POST, you should get a text/html response that contains the result of rendering the template on the server with the values that would have been returned via JSON, and then replace the innerHTML of the DOM node that's being updated.
If you want to support both JSON and HTML responses (to support 3rd party applications via an API or something), you would have to make the format of the response a function of the request: either append ".json" or ".html" to the URL, or include an HTTP header that lists the specific document type required by the client.
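As a rough sketch of making the response format a function of the request, here is a small JavaScript helper. It is not Yesod code: pickFormat is a hypothetical name, the use of the Accept header is an assumption about which header the client would send, the Accept parsing is deliberately simplified (q-values are ignored), and defaulting to HTML is likewise an assumption.

```javascript
// Hypothetical helper: decide the response format from the request.
// Checks the URL suffix first, then falls back to the Accept header.
function pickFormat(path, acceptHeader = '') {
  if (path.endsWith('.json')) return 'json';
  if (path.endsWith('.html')) return 'html';

  // Simplified Accept handling: drop parameters such as ";q=0.9"
  // and take the first media type we know how to serve.
  const types = acceptHeader
    .split(',')
    .map((t) => t.split(';')[0].trim().toLowerCase());
  for (const type of types) {
    if (type === 'application/json') return 'json';
    if (type === 'text/html') return 'html';
  }

  // Assumed default: server-rendered HTML.
  return 'html';
}

console.log(pickFormat('/messages.json'));
console.log(pickFormat('/messages', 'application/json'));
console.log(pickFormat('/messages', 'text/html,application/xhtml+xml'));
```

The first two calls select JSON and the third selects HTML; a request with neither a suffix nor a recognized header falls through to the HTML default.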
It would be nice if Yesod provided a 'jwhamlet' template or something that would render the HTML via javascript in order to support client rendering, but I'm not aware of one. That's not to say there isn't one I'm not aware of, though, so keep an eye open for other answers.
If you wanted to make such a thing, you might try tweaking the hamlet quasi-quote code. Instead of expanding the quasi-quotes to an HTML-generating function, it would expand them to two things: a JSON-generating function and a pre-rendered chunk of text that is a mustache-style template. The JSON returned by the function would then provide the correct context for rendering that template the way you want.

Encode HTML attributes when inserting HTML in the DOM

I have questions on preventing XSS attacks.
1) Question:
I have an HTML template as a JavaScript string (trusted) and insert content coming from a server request (untrusted). I replace placeholders within that HTML template string with the untrusted content and output it to the DOM using innerHTML/Text.
In particular I insert texts that I output in <div> and <p> tags that are already present in the template HTML string and form element values, i.e. texts in input tag's value attribute, select option and textarea tags.
Do I understand correctly that I can treat every inserted text mentioned above as being in the HTML subcontext, so that I only need to encode like so: encodeForJavascript( encodeForHTML( inserted_text ) )? Or do I have to encode the texts that I insert into value attributes of the input fields for the HTML attribute subcontext?
After reading up on this issue on OWASP, I am inclined to think that the latter is only necessary when I set the attribute with untrusted content via JavaScript like so: document.forms[ 0 ].elements[ 0 ].value = encodeForHTMLAttribute. Is that correct?
2) Question:
What is the added value of server-side encoding of server responses that enter the client side via Ajax and get handled anyway (like in question 1)? In addition, don't we risk problems when double encoding the content?
Thanks
You need to encode for the context in question: data inserted into an HTML context needs to be encoded for HTML, and data inserted into HTML attributes should be HTML attribute encoded. This is in addition to the JavaScript encoding you mentioned.
I would JavaScript encode for transfer and then encode for the correct context on the client side, where I know which context is the right one.
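To make the per-context idea concrete, here is a minimal JavaScript sketch. The function names encodeForHTML and encodeForHTMLAttribute are taken from the question, but the implementations below are illustrative assumptions, not the code of any particular encoding library, and renderField is a hypothetical template fill. A vetted encoder is preferable in production.

```javascript
// Encode text for an HTML element body (e.g. inside <div> or <p>).
function encodeForHTML(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#x27;');
}

// Encode text for an HTML attribute value: every character other than
// ASCII letters and digits becomes a hexadecimal character reference.
function encodeForHTMLAttribute(text) {
  return Array.from(String(text), (ch) =>
    /[A-Za-z0-9]/.test(ch) ? ch : `&#x${ch.codePointAt(0).toString(16)};`
  ).join('');
}

// Hypothetical template fill: each placeholder gets the encoder
// that matches the context it lands in.
function renderField(label, value) {
  return `<p>${encodeForHTML(label)}</p>` +
    `<input value="${encodeForHTMLAttribute(value)}">`;
}

console.log(renderField('<b>Name</b>', '" onfocus="alert(1)'));
```

The label goes through the element-body encoder and the value through the stricter attribute encoder, so a quote in the value cannot close the attribute and start a new one.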
