How many HTML error pages should I make? - .htaccess

I am currently developing my website and have added some custom error pages, namely for HTTP 404 and 500. Obviously there are many more error codes than that, but these two are the most common.
What I am wondering is whether there is any rule of thumb for how many you create. Most websites have custom error pages (especially for these two errors), but some have it for others as well. Just how many error pages should I make?

There is no rule of thumb, and there are so many different error codes that creating a custom page for each of them would be overkill.
Check the different error codes here:
https://support.microsoft.com/en-us/kb/318380
If you want completeness, I recommend opening the Error Pages feature in IIS, for example, and reviewing the status codes listed there. That makes it easy to decide, code by code, whether you want a custom message or not:
401
403
404
405
406
412
500
501
502
My suggestion, again assuming you want complete customization, is to go with those. Otherwise, just stick with the defaults and the two you have already customized.
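Since the question is tagged .htaccess, here is a rough sketch of how the list above could be turned into Apache ErrorDocument directives. The /errors/<code>.html layout and the function name are assumptions made up for illustration; only the status codes come from the list, and the reason phrases come from Python's standard http.HTTPStatus.

```python
from http import HTTPStatus

# Status codes suggested in the list above.
SUGGESTED_CODES = [401, 403, 404, 405, 406, 412, 500, 501, 502]


def error_document_lines(codes, directory="/errors"):
    """Return .htaccess lines with one ErrorDocument directive per status code.

    The "<directory>/<code>.html" naming is a hypothetical convention,
    not something the original answer prescribes.
    """
    lines = []
    for code in codes:
        phrase = HTTPStatus(code).phrase  # standard reason phrase for the code
        lines.append(f"# {code} {phrase}")
        lines.append(f"ErrorDocument {code} {directory}/{code}.html")
    return lines


if __name__ == "__main__":
    print("\n".join(error_document_lines(SUGGESTED_CODES)))
```

Each generated page still has to exist on disk; the directive only tells Apache where to look.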
Hope it helps.

Related

How to check a site if it is having a custom 404 page or a default one?

I am creating an SEO audit tool using NodeJS. I want to check whether a site has set up a custom 404 page or not. How can I check?
I have analysed the responses for both a custom 404 page and a default one. Both return the same content type and response headers, and both return only HTML content, so how can I decide whether it is a custom 404 page or not?
If this is very important for you to know (maybe you are selling custom 404 pages), you'll need to examine the HTML returned by the request.
Many popular servers, such as Tomcat, IIS, and Apache, return a standard 404 page that you should be able to recognize. The same goes for frameworks such as Django or Rails. You could build some logic that compares 404 results with the "fingerprints" of a known population of default 404 pages.
For example, certain versions of Tomcat have a title on their error pages that looks like this:
<title>Apache Tomcat/7.0.50 - Error report</title>
If you see something that looks like that, you can be pretty sure that you are dealing with the default Tomcat error page.
There are machine learning techniques that can probably do this for you without needing to compile a library of 404 page fingerprints (similar to filters that distinguish spam messages from legit ones).
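The fingerprint idea could be sketched roughly like this. The dictionary and function names are hypothetical, and the only fingerprint included is the Tomcat title quoted above; a real tool would need to collect patterns for the other servers and frameworks itself. The sketch is in Python rather than NodeJS, but the logic ports directly.

```python
import re

# Hypothetical fingerprint library: server name -> pattern found in its default 404 page.
# Only the Tomcat title is taken from the answer above; other entries would have
# to be gathered from real default pages.
DEFAULT_404_FINGERPRINTS = {
    "Apache Tomcat": re.compile(
        r"<title>\s*Apache Tomcat/[\d.]+ - Error report\s*</title>",
        re.IGNORECASE,
    ),
}


def match_default_404(html):
    """Return the server whose default 404 fingerprint matches, or None."""
    for server, pattern in DEFAULT_404_FINGERPRINTS.items():
        if pattern.search(html):
            return server
    return None


if __name__ == "__main__":
    page = "<html><head><title>Apache Tomcat/7.0.50 - Error report</title></head></html>"
    print(match_default_404(page))
```

A result of None only means that no known fingerprint matched, which suggests a custom page but does not prove it.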

Directory listing protection, blank index vs 404 vs 401

In your opinion, what is the best way to prevent external users from seeing directory listings?
Option 1: Blank index. This is the standard way that I have seen on several sites. It has the advantage of not showing anything, but the disadvantage of implying that there is something there.
Option 2: 404. Send a fake 404 page and redirect. Can this cause problems with web crawlers?
Option 3: 401 error and redirection. This is similar to the blank index, except that it will send an "unauthorized" header. I think this would be a very bad option (because I am implicitly saying that there is something important inside), but I would like to hear your thoughts on this too.
Thanks for your help. If you know of any other option that I might use, please tell me as well.
The 'best' way is to disable directory listing on the server (this will normally cause a 403 error; see the note on 404 in the list below for a discussion of information leakage).
The easiest way is a blank page (normally index.html or index.htm).
Other options that return error codes:
403 (forbidden) is the default in Apache httpd, and I think this is better than a blank page.
404 is for 'not found', which is not the case here. It could be used to prevent disclosure if nobody knows that the directory exists, but if people already know it exists, hiding it doesn't make any sense.
401 (authentication required) doesn't make sense in any case.
Other considerations
Some browsers do not display custom error pages. If you want to provide a link to the main page (or somewhere else), a 'blank' page containing a link or a direct 301/302 redirect could be used.
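To see from the outside which of these behaviours a directory actually shows, a small classifier over the status code and body can help. This is only a sketch: the function name and labels are made up, and the listing check relies on Apache's generated listings being titled "Index of /path", so other servers would need their own markers.

```python
import re

# Apache's mod_autoindex titles generated listings "Index of /path".
# Treating this as the only listing marker is an assumption of this sketch.
LISTING_MARKER = re.compile(r"<title>\s*Index of /", re.IGNORECASE)


def classify_directory_response(status, body):
    """Describe what a request for a directory URL revealed (hypothetical helper)."""
    if status == 200 and LISTING_MARKER.search(body):
        return "listing exposed"
    # Strip tags; if no text is left, treat the page as a blank index.
    if status == 200 and not re.sub(r"<[^>]*>", "", body).strip():
        return "blank index"
    if status in (301, 302):
        return "redirect"
    if status == 403:
        return "listing disabled (forbidden)"
    if status == 404:
        return "hidden (not found)"
    if status == 401:
        return "authentication required"
    return "other"


if __name__ == "__main__":
    print(classify_directory_response(403, "Forbidden"))
```

Feeding it the status and body of a request to each directory URL shows at a glance which option a server is currently using.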

How to display an error page for static files when using existingResponse="PassThrough" in IIS 7

To be able to get my application (Umbraco CMS) to handle 404 errors, I need to have following setting in my web.config:
<httpErrors existingResponse="PassThrough" />
It works well for displaying a custom 404 error page from Umbraco, but it doesn't work for displaying a 404 error page when a static file cannot be found.
For example, http://www.example.com/non-existing-file.png returns a 404 status code, but the response is blank. That fits with this description of the PassThrough mode:
PassThrough – If existingResponse is seen as “PassThrough”, custom
error module will always pass through responses from modules. This
setting will make custom error module return blank response if modules
don’t set any text.
(http://blogs.iis.net/ksingla/archive/2008/02/18/what-to-expect-from-iis7-custom-error-module.aspx)
What is the "custom error module"? And how do I get it to return a non blank response?
Update
After stepping through the request handling routines in Umbraco (with a debugger), I have a better understanding of the problem space. As written in the citation above, when existingResponse="PassThrough" then all handling of http errors is done in the custom modules. So to answer my own question, a "custom error module" is in this instance the "UmbracoModule".
One way to solve this problem would be to create a new custom module that checks for the existence of the static files on disk. It can be done quite nicely: read the list of static file types from IIS, then read the configuration of the customErrors or httpErrors elements from the web.config to get a custom error page. But I think this is a crude solution. I would much rather pass the responsibility back to IIS.
Any ideas?
You're correct. For the benefit of anyone else looking for help with this, here's some extra information on 404 handling.
Old Umbraco wiki page - describes your solution using the file /config/umbracoSettings.config
404 for files - works for Umbraco 7 and describes the use of the tag <httpErrors/>
IIS version 6 fix - fixing on IIS6 and how it compares with IIS7

Can't get IIS 7/Coldfusion to deliver 404

So... we have a custom CMS. We have a rewrite rule so that any page request (when a file doesn't exist) goes to the root/index.cfm file. There we search our DB for the page in question. If the page exists, we serve up the correct template, etc. If the page doesn't exist, I want to serve up a 404 page. Now I "think" I cannot do this in IIS, since I need to handle the request in CF, so it has to get through. The file will always exist.
When the page doesn't exist, I've tried using <cfheader statusCode="404" > and then including some HTML, but IIS puts "The resource you are looking for has been removed, had its name changed, or is temporarily unavailable." at the top of the page before my HTML. In order to get it to display the page, I had to remove the 404 status code handler from IIS.
In addition, when I fetch as Google, it gets a 301. However, when I view the response headers in Firefox I get:
Transfer-Encoding: chunked
Content-Type: text/html
Server: Microsoft-IIS/7.5
X-Powered-By: ASP.NET
Date: Wed, 16 Jan 2013 21:31:42 GMT
404 Not Found
I've tried a combination of redirecting and all sorts of things. I am open to letting IIS handle the 404, if there is a way, but I cannot figure out how to get ColdFusion to correctly deliver a 404 so Google gets it right. Webmaster Tools gets mad at me because I was delivering "Soft 404s" before this point, so I am trying to fix that.
I've also tried setting <httpErrors existingResponse="PassThrough" /> whatever the hell that does, but didn't work either. I've been looking up other threads trying to figure this out and just can't.
EDIT: Looking further into this, viewing the header info in both Firebug & Chrome I clearly see the headers say 404. Why would Fetch as Bing and Fetch as Google say differently?
I tested this: if I add .cfm to the URLs, Fetch as Google sees a 404. Without the .cfm, however, it reports a 301. Firebug sees both as 404. This seems like a Google issue.
ANSWER Kind of:
So I was doing more testing this morning (right after I added a bounty, actually), and I noticed in Webmaster Tools that Google correctly noted one of my pages as a 404. So I started looking into it. I have an "Add Trailing Slash" rule. Google sees domain.com/page as a 301 (correct, I guess) to domain.com/page/, and it does see domain.com/page/ as a 404. I think the trailing slash rule as I have it is right. Should I be doing something different, or is redirecting to the slash version the "correct" way of doing things, even though Google wants to ding me for it sometimes?
I'm not entirely sure I follow the specifics of your approach, so I will give you a few things that you need to look at in order to get this approach working well (or at least what has worked best for me).
Under "Error Pages", make sure that your 404 error page is set to "Execute a URL on this site" (I generally set mine to something like "/404.cfm"). This will make sure that your ColdFusion page is called correctly for 404 pages (it sounds like you have this working correctly).
Under "Handler Mappings", double-click on the handler for ".cfm". Then click the "Request Restrictions..." button. It should open to the "mappings" tab. The "Invoke handler only if request is mapped to:" checkbox should NOT be checked.
This can really trip up this sort of operation because it means that IIS won't invoke ColdFusion if the file doesn't exist. This shouldn't be an issue if your 404 is set up correctly, but still something to look into.
While you are in the "Handler Mapping" section, look for the IsapiModule with a path of "*". Mine is always set to ColdFusion - not sure if that makes a difference or not.
The other thing to look at is the "Default Document" setting. Keep in mind that this could impact you when forwarding to a folder.
You might also look at your rewrite rule again and make sure it isn't adding slashes where one already exists.
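To see each hop the way a crawler does (the 301 for the missing slash, then the 404), you can follow the Location headers by hand instead of letting the client follow them silently. The sketch below is only an illustration: trace_redirects and FAKE_SITE are made-up names, and the canned responses stand in for a real HTTP client that does not auto-follow redirects.

```python
from urllib.parse import urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirects(url, fetch, max_hops=5):
    """Follow Location headers manually and record the status of every hop.

    `fetch` is a caller-supplied function that returns (status, location_or_None)
    for a URL WITHOUT following redirects itself.
    """
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # Location may be relative
    return hops


# Canned responses mimicking the situation described in the question
# (hypothetical data, not captured from a real server).
FAKE_SITE = {
    "http://domain.com/page": (301, "/page/"),
    "http://domain.com/page/": (404, None),
}

if __name__ == "__main__":
    for hop_url, hop_status in trace_redirects("http://domain.com/page", FAKE_SITE.__getitem__):
        print(hop_status, hop_url)
```

A tool that reports only the first hop shows the 301, while one that follows the redirect ends on the 404, which would explain why the two views appear to disagree.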

Tracking where a custom 404 is handled - code, IIS, etc?

Is there any means to track where within code a 404 error is handled?
We have a site that we have taken over that doesn't seem to be reacting as expected.
We have changed the IIS custom errors page to point to the new page we would like but something seems to still be redirecting it to the old 404.htm file in the root of this site. We have not been able to track down where or why this happens.
Any suggestions on how we might find the referring logic?
