Our site just rolled out a new version, and now pages have Unicode in the URL. I see that Rails has properly URL-escaped these UTF-8 characters when rendering the anchor tags:
/regions/%E4%B8%AD%E5%BD%B0%E6%8A%95/
However, I still see a lot of traffic with incorrectly encoded URLs:
/regions/%A4%A4%B9%FC%A7%EB/
Apparently this is the same address, but encoded in something other than UTF-8 and then URL-escaped.
Question
I am wondering: is there any old browser that will take a correctly escaped URL, unescape it to get UTF-8, re-encode it in some other character encoding, and then URL-escape that when requesting the page from the server?
Otherwise I don't know how to explain this traffic.
I have tested in Internet Explorer 6 and 7. I also tested the "Always send URLs as UTF-8" option. None of the combinations caused an incorrectly encoded request.
I am guessing this traffic comes from some web crawler that handles the decoding but not the re-encoding.
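For what it's worth, the stray bytes look like they could be Big5 (a guess that would fit a Traditional Chinese region name). Here is a minimal Node.js sketch to check that both URLs decode to the same path; it assumes a Node build with full ICU, so that TextDecoder recognizes the 'big5' label:

    // A minimal sketch, assuming the legacy bytes are Big5 (a guess) and a
    // Node.js build with full ICU so TextDecoder knows the 'big5' label.
    function unescapeToBytes(path) {
      const bytes = [];
      for (let i = 0; i < path.length; i++) {
        if (path[i] === '%') {
          bytes.push(parseInt(path.slice(i + 1, i + 3), 16)); // %XX -> raw byte
          i += 2;
        } else {
          bytes.push(path.charCodeAt(i)); // plain ASCII passes through
        }
      }
      return new Uint8Array(bytes);
    }

    const utf8 = unescapeToBytes('/regions/%E4%B8%AD%E5%BD%B0%E6%8A%95/');
    const legacy = unescapeToBytes('/regions/%A4%A4%B9%FC%A7%EB/');
    console.log(new TextDecoder('utf-8').decode(utf8));
    console.log(new TextDecoder('big5').decode(legacy)); // should print the same path

If both lines print the same path, the unexplained traffic is just the same address escaped from legacy-encoded bytes rather than UTF-8 bytes.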
Related
I am writing a Node.js application in which I am trying to encode the URL that will show in the browser's address bar.
For example: http://www.abcdefghjs.com?q=This%20is%20good
Chrome shows the URL in the address bar with the encoding intact, as above:
http://www.abcdefghjs.com?q=This%20is%20good
IE 11 and Firefox do not show the %20 for the space; the URL appears in the address bar as:
http://www.abcdefghjs.com?q=This is good
Any help fixing this for IE 11 and Firefox is appreciated.
NOTE: I have tried both encodeURIComponent() and encodeURI() in Node.js, but neither works in IE 11 and Firefox.
Is it actually affecting your application? Could be that Firefox is just showing it to you un-encoded but actually sends the request encoded.
Also, how are you navigating to the URL? Are you setting it with JS or are you just clicking a link?
Edit: From a quick Google search, it looks like Firefox displays an already-encoded URL as decoded in the address bar; if it is not already encoded, it will be shown encoded.
Just as chrispytoes says, it's browser behavior by design and there's nothing we can do about it. Besides, we can use Fiddler to track the URL request. As we can see in the Inspectors, IE actually sends the encoded request even though it shows un-encoded in the address bar.
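If you don't have Fiddler handy, a minimal Node.js sketch can log the raw request line server-side and show that the wire format is encoded regardless of what the address bar displays (the port is an arbitrary choice for this example):

    // Minimal sketch: log what the browser actually sends on the wire. If
    // req.url shows %20, the address-bar display is purely cosmetic.
    const http = require('http');

    http.createServer((req, res) => {
      console.log('raw URL as received:', req.url); // e.g. /?q=This%20is%20good
      res.end('ok');
    }).listen(3000); // arbitrary port for this example

    // Building the link itself: encodeURIComponent() escapes only the value.
    const q = 'This is good';
    console.log('?q=' + encodeURIComponent(q)); // ?q=This%20is%20good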
Why would #1 work, but not #2 or #3, when used in a $$Return field and the database is being accessed using IE 11? The field is hidden.
[db_path/db_filename/Page?OpenPage]
http://server_dns/db_path/db_filename/Page?OpenPage
server_dns/db_path/db_filename/Page?OpenPage
A URL in brackets (e.g., [db_path/db_filename/Page?OpenPage]) is interpreted by the Domino server as a command to send an HTTP 30x REDIRECT response (probably a 303, but I'm not sure) to the browser. Upon receipt of this response, the browser interprets it as an instruction to retrieve the specified URL. That's simply a matter of compliance with standards, so all browsers will do it.
The other choices you list are not treated as anything special by the Domino server. They are simply sent as ordinary content in a 200 OK response to the browser's POST request. No standards apply to this, so a browser may or may not choose to recognize that the response text looks like a URL and may or may not choose to do something with it - e.g., follow the link. Based on your question, it appears that IE 11 does not do anything with it; it doesn't follow the URL. Frankly, I had no idea that any browser would actually follow a URL received as the sole content of a 200 OK response.
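To make the distinction concrete, here is a hypothetical Node.js sketch (not Domino itself) of the two response shapes; browsers follow the first because the Location header is part of the HTTP standard, while the second is just text:

    // Sketch of the two response shapes. The 303 mirrors what Domino does for
    // the bracketed form; the 200 mirrors the other two, where the URL is
    // ordinary body content the browser is free to ignore.
    const http = require('http');

    http.createServer((req, res) => {
      if (req.url === '/bracketed') {
        // like [db_path/db_filename/Page?OpenPage]: a standard redirect
        res.writeHead(303, { Location: '/db_path/db_filename/Page?OpenPage' });
        res.end();
      } else {
        // like the other two forms: a URL sent as plain content in a 200 OK
        res.writeHead(200, { 'Content-Type': 'text/plain' });
        res.end('http://server_dns/db_path/db_filename/Page?OpenPage');
      }
    }).listen(8080); // arbitrary port for this example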
I was reading about the IDN homograph attack and couldn't find it stated clearly whether browsers encode only the domain in punycode or whether the rest of the URL (path and query) is included. So my question is: do any of the popular browsers (Firefox, IE, Chrome, Safari, Opera) encode the rest of the URL (the IRI, to be exact) with punycode?
Only the domain name part is encoded with punycode. This is due to the restrictions imposed on the allowable characters in a (traditional) domain name. The path part of the URL has no such restrictions, so it is usually sent as percent-escaped UTF-8 instead.
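You can see the split directly with Node.js's WHATWG URL parser, which applies the same rules browsers do (the domain and path here are made-up examples):

    // Sketch: only the host becomes punycode; the path and query stay UTF-8
    // and are merely percent-escaped.
    const u = new URL('http://bücher.example/straße?q=grüß');
    console.log(u.hostname); // xn--bcher-kva.example  (punycode)
    console.log(u.pathname); // /stra%C3%9Fe           (percent-escaped UTF-8)
    console.log(u.search);   // ?q=gr%C3%BC%C3%9F      (percent-escaped UTF-8)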
Is there any length limitation on the query string in iOS PhoneGap?
Thanks!!!
Although the specification of the HTTP protocol does not specify any maximum length, practical limits are imposed by web browser and server software.
Microsoft Internet Explorer (Browser)
Microsoft states that the maximum length of a URL in Internet Explorer is 2,083 characters, with no more than 2,048 characters in the path portion of the URL. In my tests, attempts to use URLs longer than this produced a clear error message in Internet Explorer.
Firefox (Browser)
After 65,536 characters, the location bar no longer displays the URL in Windows Firefox 1.5.x. However, longer URLs will work. I stopped testing after 100,000 characters.
Safari (Browser)
At least 80,000 characters will work. I stopped testing after 80,000 characters.
Opera (Browser)
At least 190,000 characters will work. I stopped testing after 190,000 characters. Opera 9 for Windows continued to display a fully editable, copyable and pasteable URL in the location bar even at 190,000 characters.
Apache (Server)
My early attempts to measure the maximum URL length in web browsers bumped into a server URL length limit of approximately 4,000 characters, after which Apache produces a "413 Entity Too Large" error. I used the current, up-to-date Apache build found in Red Hat Enterprise Linux 4. The official Apache documentation only mentions an 8,192-byte limit on an individual field in a request.
Microsoft Internet Information Server
The default limit is 16,384 characters (yes, Microsoft's web server accepts longer URLs than Microsoft's web browser). This is configurable.
Perl HTTP::Daemon (Server)
Up to 8,000 bytes will work. Those constructing web application servers with Perl's HTTP::Daemon module will encounter a 16,384 byte limit on the combined size of all HTTP request headers. This does not include POST-method form data, file uploads, etc., but it does include the URL. In practice this resulted in a 413 error when a URL was significantly longer than 8,000 characters. This limitation can be easily removed. Look for all occurrences of 16x1024 in Daemon.pm and replace them with a larger value. Of course, this does increase your exposure to denial of service attacks.
Recommendations
Extremely long URLs are usually a mistake. URLs over 2,000 characters will not work in the most popular web browser. Don't use them if you intend your site to work for the majority of Internet users.
Reference: http://www.boutell.com/newfaq/misc/urllength.html
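If you want to reproduce these measurements against your own stack, a rough Node.js probe like the following will do (the host, port, and starting length are placeholders; point it at your own server):

    // Sketch: send ever-longer URLs until the server rejects them, in the
    // spirit of the tests above. Expect a 414 or 413 past the server's limit.
    const http = require('http');

    function probe(len) {
      const path = '/?q=' + 'a'.repeat(len);
      http.get({ host: 'localhost', port: 8080, path }, (res) => {
        console.log(path.length + ' chars -> HTTP ' + res.statusCode);
        res.resume(); // drain the body
        if (res.statusCode === 200 && len < 200000) probe(len * 2); // double until it fails
      }).on('error', (err) => console.log(len + ' chars -> ' + err.message));
    }

    probe(1000);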
I've added an SSL certificate to an existing site, and now in IE I get a mixed-content warning. Problem is, I don't know what non-secure content IE is warning me about. It's a simple HTML page with a few Flash movies, a few images, a loaded CSS file, and a JS file.
How can I find out what the non-secure content is?
Edit:
I found the culprit: it's the JS file AC_RunActiveContent.js, used to display the Flash movie. So, does anyone have an idea how to prevent the SSL mixed-content warning when using AC_RunActiveContent.js?
This means that something is requesting content using the http protocol specifically, or you have an absolute path to an image or other content that begins with http instead of https.
A few tips: Use relative paths everywhere you can. If you must use an absolute path, and it's to a server you own, use https. If you're loading stuff from off your site, you're probably stuck with the mixed-content warning.
This also goes for your scripts: check the JS and the CSS template and make sure they're not the guilty parties - if they are, change them to use relative paths, or to request items via https instead of http (assuming you're positive that the server they're referencing supports https; if it doesn't, you're stuck).
There are a few other details; this article might be helpful.
OK, so here is the solution to my particular problem. It was the codebase value in my code that needed to be https as well (I didn't think it would trigger the warning, as my Flash movies were displaying correctly, oh well)...
AC_FL_RunContent( 'codebase','https://download.macromedia.com/pub/shoc...
Link to Adobe info on this: Security Information error in Internet Explorer
I use the Firefox console -- it reports the http resources it blocks from fetching on a mixed content page.
Search your source for http: specifically (without the s). Another great tool to help you out is Fiddler, with which you can see what's getting downloaded when your page is requested.
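If grepping by hand is tedious, a small Node.js script can walk the document root and flag hard-coded http:// references (the directory and file extensions below are placeholders):

    // Sketch: recursively scan site files for http:// references that would
    // trigger the mixed-content warning over SSL.
    const fs = require('fs');
    const path = require('path');

    function scan(dir) {
      for (const name of fs.readdirSync(dir)) {
        const full = path.join(dir, name);
        if (fs.statSync(full).isDirectory()) {
          scan(full);
        } else if (/\.(html?|css|js)$/i.test(name)) {
          fs.readFileSync(full, 'utf8').split('\n').forEach((line, i) => {
            if (line.includes('http://')) {
              console.log(full + ':' + (i + 1) + ': ' + line.trim());
            }
          });
        }
      }
    }

    scan('./public'); // placeholder: point at your site's document root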