I've recently restructured my site and in the process have removed many pages.
I've added .htaccess 301 redirects to reflect the changed structure where new pages directly replace old ones, created a custom 404 page, and included the following .htaccess directive: ErrorDocument 404 /page-not-found.html.
For pages removed altogether that have no new equivalent page, I've used .htaccess to redirect to the 'page not found' custom page.
In webmaster tools, though, I'm seeing soft 404s reported - pages not found that are not returning 404 response codes.
Can anybody please advise a way forward on this?
Should I not redirect references to the deleted pages and let the 'pages not found' stack up in the webmaster tools report, or is there a way of returning a 404 code from the server for my deleted pages even though a redirect exists?
Any advice would be much appreciated.
Thanks,
Andrew
I think if the page doesn't have a new equivalent, you should let the request fall through as not found and return a 404 (which should show your custom page).
Google is showing a soft 404 because of the redirect (which is a 301, not a 404), as you say.
If you did redirect to the custom page, and somehow returned a 404, that's essentially the same thing as just letting them not be found.
The pages not found will stack up, but if they really aren't found anymore, then that's the correct response.
The alternative would be a 301 to a different page that is a real "hey, this moved, try this instead" page, but it would have to look like something other than "not found" to avoid Google treating it as a soft 404.
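A minimal .htaccess sketch of that advice, assuming Apache with mod_alias enabled. The paths /old-page.html, /new-page.html and /discontinued-page.html are hypothetical placeholders, not taken from the question:

```apache
# Serve the custom page internally; the response status stays 404.
ErrorDocument 404 /page-not-found.html

# Old page with a direct replacement: permanent redirect (hypothetical paths).
Redirect 301 /old-page.html /new-page.html

# Deleted pages with no replacement get no rule at all, so Apache
# answers 404 and shows the ErrorDocument configured above.

# Optional: mark a page as deliberately removed with 410 Gone (hypothetical path).
Redirect gone /discontinued-page.html
```

With something like this in place, a request for a deleted URL should show the custom page with a 404 status, rather than a 301 followed by a 200.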
I have correctly configured the resource ID in the "site" section of the system settings. But when I enter mysite/non-existentaddress in the browser, I am redirected to the main page instead of the 404 page.
I tried adding a redirect to the 404 page in my .htaccess file, but it didn't help.
So your error_page system setting holds the ID of a published resource that you want as the 404 page, right?
What do you see when you check the yoursite/non-existentaddress URL at https://www.redirect-checker.org/index.php?
This setting should work without any .htaccess additions, so please comment them out if there are any.
BTW, what about friendly URLs: are they active now?
I have the following code in my htaccess file to handle 404 codes:
ErrorDocument 404 https://www.mywebsite.co.uk/lost.php
RewriteEngine On
And of course a custom 404 page is returned (works fine).
I ran a Nibbler check and it said the following:
It is a common mistake to setup missing page handling by using a
redirect. The missing page should directly return a 404 error and not
redirect to another page.
This website does not return a 404 error HTTP status code for missing
pages. This is bad because search engines like Google might mistake
this for a real page of content.
What is the best practice to handle 404 error codes if the above is no good? I am just wanting something generic that is a catch all. No need to use a 301 because the point of my 404 is just to let visitors know that what they are looking for is not there or they have made a typo.
If it is a "common mistake" to use a redirect, how should it be done?
Thank you for any help.
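You need to remove the https:// or http:// prefix (and the host name) from the 404 handler. A full URL makes Apache send an external redirect to the client instead of serving the error page internally, so the client never receives the original 404 status.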
So you can use:
ErrorDocument 404 /lost.php
It depends. If you have an application running on the web server that takes in every incoming URL and decides for itself whether the URL exists, then the application should return a 404 error page directly under the requested URL and not redirect to another page such as 404.html. But if your web server handles URL requests through the .htaccess file, then it is OK to set up a document that is returned for an invalid URL.
So I overhauled a complete website the other day and found snippets of some of the old pages in the Google search results. The old site had an ugly link structure such as domain.com/index.php?article_id=123. The new site uses pretty permalinks such as domain.com/pagetitle.
Is there a piece of code I could put into the .htaccess file in order to redirect all ugly permalinks to the new site?
Edit
Additional info: The old links don't exist anymore. The structure of the old site and the new one differs a lot, and not all content from the old site was carried over. The main problem is that I don't want the old links in the Google search results to always throw a 404 at the user.
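Maybe something like this:
RedirectMatch ^/index\.php$ http://www.example.com/somepage
This will redirect every request for index.php, whatever its query string, to another location. (RedirectMatch only matches the URL path, so the pattern cannot test the query string.)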
I don't have the rep to comment on the other answer, but that is a very improper solution if you value your SEO at all. A redirect is your way of telling Google "I've got the same page, I just moved it". There's a much better way to do this that won't negatively affect your SEO at all.
You should create some logic to redirect those old links to your new links.
Here's an example of how you could do it:
1. Go to the beginning of your program, before any logic takes place.
2. Use code to retrieve the requested page. In this case, you might be able to get away with simply checking for GET variables that match article_id.
3. If the requested page is a match for your GET variable, run a query to see if the article exists. (Obviously, you'll still want to 404 articles that don't exist.)
4. Retrieve the content used to generate the new, more SEO-friendly URLs. This is probably the article title or something similar.
5. Write some code to generate the new URL from it. At this point, if this is working properly, you should be able to print that new URL to make sure it's correct.
6. Issue a 301 redirect to the new URL. Don't use a 302 or any other status code. A 301 lets search engines know it's the same page and content, but that it has permanently moved.
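If only a handful of old articles need mapping, the same idea can also be sketched directly in .htaccess instead of application code. This assumes Apache with mod_rewrite, an .htaccess file in the document root, and a purely hypothetical pairing of article_id=123 with /pagetitle. The real pairings have to come from your own content:

```apache
RewriteEngine On

# article_id=123 -> /pagetitle (hypothetical pairing; add one block per article).
# The condition inspects the query string, which RewriteRule patterns never see.
RewriteCond %{QUERY_STRING} (^|&)article_id=123(&|$)
# Permanent redirect; the trailing "?" drops the old query string.
RewriteRule ^index\.php$ /pagetitle? [R=301,L]

# Old IDs without a rule here are left to the site's normal handling.
```

For more than a few articles this gets unwieldy, which is where the application-level lookup described in the steps above becomes the more practical route.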
I have a site submitted to Google Webmaster Tools that I helped a family member redevelop. The site is great and working, but Google says 4 old URLs show crawl errors and return 404 errors.
From what I've read, this is common when using a CMS and page aliases get changed. However, while I have access to the PHP and .htaccess files and I can see that mod_rewrite is on, I don't know how to implement a 301 redirect. I'm not familiar with PHP code.
I can't even find the 4 URLs Google reports as errors, so I don't know where they are being linked from. For example, this is one of the errors:
mywebsite.co.uk/index.php?page=kitchens (which produces a 404 error page)
should link, or redirect to,
mywebsite.co.uk/kitchens-northampton/
I noticed this code in the htaccess file if it's of any relevance:
RewriteRule ^(.+)$ index.php?page=$1
I just don't know what it means. Can anyone offer some advice?
Thank you.
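For reference, the quoted RewriteRule internally maps any requested path to index.php, passing the path as the page parameter, so /kitchens-northampton/ is served by index.php?page=kitchens-northampton/. A hedged sketch of a 301 for the one old URL given above, assuming Apache with mod_rewrite and an .htaccess file in the document root, placed above the existing rule:

```apache
RewriteEngine On

# Match only the URL the client actually asked for, so the internal
# rewrite to index.php?page=... further down cannot trigger this rule.
RewriteCond %{THE_REQUEST} \s/index\.php\?page=kitchens[\s&]
# 301 to the new address; the trailing "?" drops the old query string.
RewriteRule ^index\.php$ /kitchens-northampton/? [R=301,L]
```

The other three URLs from the crawl report would each need their own condition and rule pair with the matching old and new addresses.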
OK...I am going crazy...
I have a Magento store in a /store/ directory. The homepage displays fine and users are able to make purchases, but to bots it's showing a 404 status code.
So /store/ and /store/index.php are both 404 header responses but I can navigate to those pages and browse the website from there. I've never seen this before.....blah
When I crawl the site with Xenu from the http://www.mywebsite.com/store/ URL it says it's 404 and the title shows, "redir" so I am assuming there is some redirect somewhere that I am missing(?)....
Any help is appreciated...I may need to explain this better so if so, I will gladly :)
Thanks in advance!
I appreciate all the help!
Somehow, changing the CMS page assigned as the homepage fixed the problem. The original homepage was calling some entity that returned a 404 error in turn giving a 404 error for the entire page.
Really not sure what happened...but like I said, changing the CMS homepage solved the problem!
Have you tried browsing your site with a browser and a user agent switcher? There are examples of Magento being unable to build collections when the user agent is one of the bots that is explicitly excluded from logging.