Assume the following:
Your model has: Products.slug
urls.py: path('<int:id>/<slug:slug>/', views.product_detail, name='product_detail'),
views.py: product = Products.objects.get(id=id, slug=slug)
Someone goes to /products/1/brazil-nuts/, and it goes to the right page.
But then someone copy/pastes the URL wrong: /products/1/brazil-nu/
Then you get a DoesNotExist error. Now you could "fix" this by not enforcing the slug=slug argument in your query, but then the URL is incorrect, and people could link from /products/1/brazil-nu-sjfhkjfg-dfh, which is sloppy.
So how would I redirect an incorrect URL to the correct one, with the same URL structure as the correct one (only id and slug arguments), without relying on JavaScript's window.history.pushState(null, null, proper_url); every time the page loads?
I've already tried adding an identical URL pattern that redirects, but since Django uses the first pattern that matches, it won't work no matter where you put it in your list.
Just update your view so that a partial slug match redirects to the canonical URL:
from django.shortcuts import redirect

products_q = Products.objects.filter(id=id, slug__startswith=slug)
if products_q.count() == 1:
    product = products_q.first()
    # reverse_lazy() would only build the URL string; redirect() returns an actual response
    return redirect('product_detail', product.id, product.slug)
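For context, a complete view might look something like this (a sketch; the Products model and URL name come from the question, while the template path, the 404 fallback, and the choice of a permanent redirect are my assumptions):
from django.http import Http404
from django.shortcuts import redirect, render

def product_detail(request, id, slug):
    try:
        # Exact match: serve the page as usual.
        product = Products.objects.get(id=id, slug=slug)
        return render(request, 'products/detail.html', {'product': product})
    except Products.DoesNotExist:
        pass
    # Partial slug: redirect to the canonical URL (301 so clients update their links).
    products_q = Products.objects.filter(id=id, slug__startswith=slug)
    if products_q.count() == 1:
        product = products_q.first()
        return redirect('product_detail', product.id, product.slug, permanent=True)
    raise Http404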
I have a random bug in my JSF app that strips all my HTTP parameters. It happens randomly and I can't get any error messages (even when following BalusC's advice here and here).
I can't pinpoint the cause and fix it, so I'm wondering if another solution is possible: forcing the request to be resent if all my parameters are empty. Is there a way to make the browser resend its request, for example through a JSF or HTTP error code?
EDIT: Cleaned up unnecessary code.
In the end, I followed @Xtreme Biker's advice and built a filter instead. It checks whether the parameters are present; if not, it sends a redirect response (HTTP status code 307, Temporary Redirect). The browser then resends the same request, which goes through.
Something like:
String formWebContainerWidth = httpRequest.getParameter("myParam");
if (formWebContainerWidth == null) {
    // SC_TEMPORARY_REDIRECT = 307
    httpResponse.setStatus(HttpServletResponse.SC_TEMPORARY_REDIRECT);
    httpResponse.setHeader("Location", httpRequest.getRequestURI());
} else {
    chain.doFilter(request, response);
}
EDIT: Added the location, otherwise the browser sometimes displays a "page cannot be reached" error message.
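For reference, a fuller sketch of such a filter (the class name and the checked parameter are illustrative; a real check should cover whichever parameters your app relies on):
import java.io.IOException;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class ResendRequestFilter implements Filter {

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest httpRequest = (HttpServletRequest) request;
        HttpServletResponse httpResponse = (HttpServletResponse) response;

        if (httpRequest.getParameter("myParam") == null) {
            // 307 preserves the request method and body, so the browser
            // resends the exact same request instead of downgrading to GET.
            httpResponse.setStatus(HttpServletResponse.SC_TEMPORARY_REDIRECT);
            httpResponse.setHeader("Location", httpRequest.getRequestURI());
        } else {
            chain.doFilter(request, response);
        }
    }

    @Override
    public void init(FilterConfig config) { }

    @Override
    public void destroy() { }
}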
Is there a way to get the FULL URL loaded by a WKWebView for every request?
webView:didFinishNavigation:
Works only for mainFrame navigations and does not provide a URL request parameter.
How do I get the FULL URL just like in UIWebViewDelegate's
webViewDidFinishLoad:webView
...which is invoked after any load finishes, and where you can get the full request URL from the webView parameter.
It's nice that WKWebView's URL property saves the work that needs to be done to extract a UI-friendly base URL, but it's a huge loss we can't get the full one!
I have tried using
webView:decidePolicyForNavigationAction:decisionHandler:
...but it produces different URL results compared to what a UIWebView's request property holds after a page finishes loading.
You can get the URL of each newly requested page from navigationAction.request.URL in the decidePolicyForNavigationAction delegate method (Swift 2 syntax here; a Swift 4 version appears in another answer below):
func webView(webView: WKWebView, decidePolicyForNavigationAction navigationAction: WKNavigationAction, decisionHandler: (WKNavigationActionPolicy) -> Void) {
    if let urlStr = navigationAction.request.URL?.absoluteString {
        // urlStr is what you want, I guess.
    }
    decisionHandler(.Allow)
}
First, I think you are confusing NSURL and NSURLRequest. The former is readily accessible via webView.URL, and it does give you the full URL of whatever was loaded, assuming that by "URL" you mean NSURL.
If that is not what you meant, for example if you wanted to see the redirect chain or the response headers, then I'm afraid the answer is that you cannot get at that specific information via WKWebView.
You will have to fall back to UIWebView where you can intercept requests relatively easily and see the full request/response.
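For reference, intercepting in UIWebView looks roughly like this (a sketch; note that UIWebView has since been deprecated):
// UIWebViewDelegate hands you the full URLRequest for every load, main frame or not.
func webView(_ webView: UIWebView, shouldStartLoadWith request: URLRequest,
             navigationType: UIWebViewNavigationType) -> Bool {
    print(request.url?.absoluteString ?? "nil",
          request.allHTTPHeaderFields ?? [:])
    return true
}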
This is Yuichi Kato's answer for Swift 4. It retrieves the full URL from the request property of the navigation action in the webView(_:decidePolicyFor:decisionHandler:) method of WKNavigationDelegate.
func webView(_ webView: WKWebView, decidePolicyFor navigationAction: WKNavigationAction, decisionHandler: @escaping (WKNavigationActionPolicy) -> Void) {
    if let urlStr = navigationAction.request.url?.absoluteString {
        // urlStr is what you want
    }
    decisionHandler(.allow)
}
Don't forget to conform your class to WKNavigationDelegate and set your web view's delegate accordingly:
class WebViewController: UIViewController, WKNavigationDelegate
[...]
webView.navigationDelegate = self
I know writing business logic in getters and setters is a very bad programming practice, but is there any way to handle exceptions if the response is already committed?
What exactly is the meaning of "Response already committed" and "Headers are already sent to the client"?
There's no nice way to handle exceptions once the response is already committed. An HTTP response basically consists of headers and a body. The headers instruct the client (the web browser) how exactly it should deal with the response, e.g. the content type, the content length, the character encoding, the body encoding, the cache instructions, etcetera.
You can see the headers in the HTTP traffic monitor of the web browser's developer toolset. Press F12 in Chrome/IE9+/Firefox23+ and check the "Network" tab: the "Headers" pane shows the response headers, and the "Response" pane shows the response body.
The response body is the actual content, usually in the flavor of a bunch of HTML code. The server usually has a fixed-size buffer to write the response to. The buffer size depends on the server make/version and configuration, and is usually in the range of 2KB~10KB. If this buffer overflows, it is flushed to the other end of the connection, the client. This is the commit of the response: the client has already obtained the first part of it, usually the whole bunch of headers and maybe a part of the body.
The commit of a response is a point of no return. The server cannot take the already-sent bytes back. It's too late to change the response headers (for example, a redirect is basically instructed by a Location header containing the new URL), let alone the response body. The best you can do is append the error information to the already-written response body. But this may end up as some weird-looking HTML, as it's not known which HTML tags need to be closed at that point, and the browser may fail to present it in a proper manner.
Apart from avoiding business logic in getters so that exceptions are not thrown while rendering the response, another way to avoid a committed response is to configure the response buffer size to be as large as the largest page your webapp can serve. How to do that depends on the server make/version; in Tomcat, for example, you can configure it via the bufferSize attribute of the <Connector> element. Note that this won't prevent flushing if your own code (implicitly) calls flush() on the response output stream.
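The Servlet API also lets you set the buffer size per response, as long as nothing has been written yet (a minimal sketch; the 1 MB figure is an arbitrary assumption, size it to your largest page):
// E.g. in a servlet Filter, before the first byte of the body is written;
// setBufferSize() throws IllegalStateException once content has gone out.
response.setBufferSize(1024 * 1024); // 1 MB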
Good explanation, BalusC. I would add that PrimeFaces has an issue in their exception handler: they try to redirect to the error page after the request has already been committed. And as you said, the only solution I found is to add some extra content to the response body. I override the handler and add this code:
if (extContext.isResponseCommitted()) {
    // Too late to redirect: emit a script into the partial response instead,
    // closing the CDATA/update elements the partial writer opened earlier.
    PartialResponseWriter writer = context.getPartialViewContext().getPartialResponseWriter();
    writer.startElement("script", null);
    writer.write("window.location.href = '" + errorPageUrl + "';");
    writer.endElement("script");
    writer.getWrapped().endCDATA();
    writer.endElement("update");
    writer.getWrapped().endDocument();
}
else {
    extContext.redirect(errorPageUrl);
    context.responseComplete();
}
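For completeness, a custom exception handler like this is typically plugged in via a factory registered in faces-config.xml (the package and class names here are illustrative):
<factory>
    <exception-handler-factory>com.example.CustomExceptionHandlerFactory</exception-handler-factory>
</factory>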
We have a high security application and we want to allow users to enter URLs that other users will see.
This introduces a high risk of XSS hacks - a user could potentially enter javascript that another user ends up executing. Since we hold sensitive data it's essential that this never happens.
What are the best practices for dealing with this? Is a security whitelist or escape pattern alone good enough?
Is there any advice on dealing with redirections (a "this link goes outside our site" message on a warning page before following the link, for instance)?
Is there an argument for not supporting user-entered links at all?
Clarification:
Basically our users want to input:
stackoverflow.com
And have it output to another user as a clickable link:
<a href="http://stackoverflow.com">stackoverflow.com</a>
What I really worry about is them using this in an XSS hack. I.e. they input:
javascript:alert('hacked!');
So other users get this link:
<a href="javascript:alert('hacked!');">stackoverflow.com</a>
My example is just to explain the risk - I'm well aware that JavaScript and URLs are different things, but by letting them input the latter they may be able to execute the former.
You'd be amazed how many sites you can break with this trick - HTML is even worse. If they know to deal with links do they also know to sanitise <iframe>, <img> and clever CSS references?
I'm working in a high security environment - a single XSS hack could result in very high losses for us. I'm happy that I could produce a Regex (or use one of the excellent suggestions so far) that could exclude everything that I could think of, but would that be enough?
If you think URLs can't contain code, think again!
https://owasp.org/www-community/xss-filter-evasion-cheatsheet
Read that, and weep.
Here's how we do it on Stack Overflow:
using System.Text.RegularExpressions;

/// <summary>
/// Returns a "safe" URL, stripping anything outside the normal charset for a URL
/// </summary>
public static string SanitizeUrl(string url)
{
    return Regex.Replace(url, @"[^-A-Za-z0-9+&@#/%?=~_|!:,.;\(\)]", "");
}
The process of rendering a link "safe" should go through three or four steps:
Unescape/re-encode the string you've been given (RSnake has documented a number of tricks at http://ha.ckers.org/xss.html that use escaping and UTF encodings).
Clean the link up. Regexes are a good start, but make sure to truncate the string or throw it away if it contains a " (or whatever you use to close the attributes in your output). If you're rendering the links only as references to other information, you can also force the protocol at the end of this process: if the portion before the first colon is not 'http' or 'https', append 'http://' to the start. This lets you create usable links from incomplete input (as a user would type it into a browser) and gives you a last shot at tripping up whatever mischief someone has tried to sneak in. (A sketch of these two steps follows after this list.)
Check that the result is a well formed URL (protocol://host.domain[:port][/path][/[file]][?queryField=queryValue][#anchor]).
Possibly check the result against a site blacklist or try to fetch it through some sort of malware checker.
If security is a priority I would hope that the users would forgive a bit of paranoia in this process, even if it does end up throwing away some safe links.
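Here is a rough sketch of steps 2 and 3 in C# (the helper name is mine, and it implements only the rules described above, not a complete sanitizer):
using System;

public static class LinkCleaner
{
    // Step 2: truncate at a double quote, then force http(s) if the scheme is anything else.
    // Step 3: verify the result is a well-formed absolute URL.
    public static string CleanLink(string input)
    {
        int quote = input.IndexOf('"');
        if (quote >= 0)
            input = input.Substring(0, quote);

        int colon = input.IndexOf(':');
        string scheme = colon > 0 ? input.Substring(0, colon).ToLowerInvariant() : "";
        if (scheme != "http" && scheme != "https")
            input = "http://" + input;

        Uri parsed;
        return Uri.TryCreate(input, UriKind.Absolute, out parsed) ? input : null;
    }
}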
Use a library, such as OWASP-ESAPI API:
PHP - http://code.google.com/p/owasp-esapi-php/
Java - http://code.google.com/p/owasp-esapi-java/
.NET - http://code.google.com/p/owasp-esapi-dotnet/
Python - http://code.google.com/p/owasp-esapi-python/
Read the following:
https://www.golemtechnologies.com/articles/prevent-xss#how-to-prevent-cross-site-scripting
https://www.owasp.org/
http://www.secbytes.com/blog/?p=253
For example:
$url = "http://stackoverflow.com"; // e.g., $_GET["user-homepage"];
$esapi = new ESAPI( "/etc/php5/esapi/ESAPI.xml" ); // Modified copy of ESAPI.xml
$sanitizer = ESAPI::getSanitizer();
$sanitized_url = $sanitizer->getSanitizedURL( "user-homepage", $url );
Another example is to use a built-in function. PHP's filter_var function is an example:
$url = "http://stackoverflow.com"; // e.g., $_GET["user-homepage"];
$sanitized_url = filter_var($url, FILTER_SANITIZE_URL);
Note that FILTER_SANITIZE_URL only strips characters that are not allowed in a URL; it does not block javascript: URLs, so you still need to check the scheme yourself. Using the OWASP ESAPI Sanitizer is probably the best option.
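A minimal scheme check to pair with filter_var might look like this (a sketch):
$sanitized_url = filter_var($url, FILTER_SANITIZE_URL);
// filter_var() leaves the scheme intact, so reject anything that is not http(s).
$scheme = parse_url($sanitized_url, PHP_URL_SCHEME);
if (!in_array($scheme, array('http', 'https'), true)) {
    $sanitized_url = null; // or prepend 'http://' for scheme-less input
}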
Still another example is the code from WordPress:
http://core.trac.wordpress.org/browser/tags/3.5.1/wp-includes/formatting.php#L2561
Additionally, since there is no way of knowing where the URL links (i.e., it might be a valid URL, but the contents of the URL could be mischievous), Google has a safe browsing API you can call:
https://developers.google.com/safe-browsing/lookup_guide
Rolling your own regex for sanitization is problematic for several reasons:
Unless you are Jon Skeet, the code will have errors.
Existing APIs have many hours of review and testing behind them.
Existing URL-validation APIs consider internationalization.
Existing APIs will be kept up-to-date with emerging standards.
Other issues to consider:
What schemes do you permit (are file:/// and telnet:// acceptable)?
What restrictions do you want to place on the content of the URL (are malware URLs acceptable)?
Just HTMLEncode the links when you output them. Make sure you don't allow javascript: links. (It's best to have a whitelist of protocols that are accepted, e.g., http, https, and mailto.)
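For instance, a sketch of that approach in C# (the class and method names are illustrative):
using System;
using System.Net;

public static class LinkRenderer
{
    static readonly string[] AllowedSchemes = { "http", "https", "mailto" };

    // Whitelist the scheme, then HTML-encode everything that goes into the markup.
    public static string RenderLink(string url)
    {
        Uri uri = new Uri(url, UriKind.Absolute); // throws on malformed input
        if (Array.IndexOf(AllowedSchemes, uri.Scheme) < 0)
            throw new ArgumentException("Disallowed scheme: " + uri.Scheme);

        string encoded = WebUtility.HtmlEncode(uri.ToString());
        return "<a href=\"" + encoded + "\">" + encoded + "</a>";
    }
}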
You don't specify the language of your application, so I will presume ASP.NET, for which you can use the Microsoft Anti-Cross Site Scripting Library.
It is very easy to use; all you need is an include and that is it :)
While you're on the topic, why not give Design Guidelines for Secure Web Applications a read?
If you're using any other language: where there is a library for ASP.NET, there are likely equivalents for other languages as well (PHP, Python, RoR, etc.).
For Pythonistas, try Scrapy's w3lib.
OWASP ESAPI pre-dates Python 2.7 and is archived on the now-defunct Google Code.
How about not displaying them as a link? Just use the text.
Combined with a warning to proceed at your own risk, that may be enough.
Addendum: see also Should I sanitize HTML markup for a hosted CMS? for a discussion on sanitizing user input.
There is a JavaScript library that solves this problem:
https://github.com/braintree/sanitize-url
Try it =)
In my project, written in JavaScript, I use this regex as a whitelist:
url.match(/^((https?|ftp):\/\/|\.{0,2}\/)/)
The only limitation is that you need to put ./ in front of files in the same directory, but I think I can live with that.
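For illustration, how that whitelist behaves on a few inputs:
// Accepted: absolute http(s)/ftp URLs and relative paths.
"https://example.com/page".match(/^((https?|ftp):\/\/|\.{0,2}\/)/); // match
"./local-file.html".match(/^((https?|ftp):\/\/|\.{0,2}\/)/);        // match
// Rejected: javascript: and any other scheme.
"javascript:alert(1)".match(/^((https?|ftp):\/\/|\.{0,2}\/)/);      // null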
Using regular expressions to prevent XSS vulnerabilities becomes complicated, and thus hard to maintain over time, and it can still leave some vulnerabilities behind. URL validation via regular expression is helpful in some scenarios, but it is better not mixed with vulnerability checks.
The solution is probably to use a combination of an encoder like AntiXssEncoder.UrlEncode for encoding the query portion of the URL and UriBuilder for the rest:
public sealed class AntiXssUrlEncoder
{
    public string EncodeUri(Uri uri, bool isEncoded = false)
    {
        // Encode the query portion of the URL to prevent XSS attacks,
        // unless it is already encoded; UriBuilder takes care of the rest.
        var encodedQuery = isEncoded ? uri.Query.TrimStart('?') : AntiXssEncoder.UrlEncode(uri.Query.TrimStart('?'));
        var encodedUri = new UriBuilder
        {
            Scheme = uri.Scheme,
            Host = uri.Host,
            Path = uri.AbsolutePath,
            Query = encodedQuery.Trim(),
            Fragment = uri.Fragment
        };
        if (uri.Port != 80 && uri.Port != 443)
        {
            encodedUri.Port = uri.Port;
        }
        return encodedUri.ToString();
    }

    public static string Encode(string uri)
    {
        var baseUri = new Uri(uri);
        var antiXssUrlEncoder = new AntiXssUrlEncoder();
        return antiXssUrlEncoder.EncodeUri(baseUri);
    }
}
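Example usage of the helper above (the input URL is illustrative):
var safe = AntiXssUrlEncoder.Encode("https://example.com/search?q=<script>alert(1)</script>");
// The query string comes back with unsafe characters percent-encoded.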
You may need to add whitelisting to exclude some characters from encoding; that can be helpful for particular sites.
HTML-encoding the page that renders the URL is another thing you may need to consider, too.
BTW, please note that encoding the URL may break web parameter tampering checks, so the encoded link may appear not to work as expected.
Also, you need to be careful about double encoding.
P.S. AntiXssEncoder.UrlEncode would have been better named AntiXssEncoder.EncodeForUrl, to be more descriptive: it encodes a string for use in a URL; it does not encode a given URL and return a usable URL.
You could use hex encoding to convert the entire URL and send it to your server. That way the client would not understand the content at first glance. After reading the content, you could decode it and send it to the browser.
Allowing a URL and allowing JavaScript are 2 different things.