SEO - Getting a 301 page indexed by search engines

I have a site (say site1.com) which 301-redirects to a page on a different site (say http://site2.com/some/dirty/url).
Typical code at site1.com:
<?php
header("HTTP/1.1 301 Moved Permanently");
header("Refresh: 0; url=http://site2.com/some/dirty/url");
?>
<html>
<head>
<title>Site 1 - heading</title>
<meta name="description" content="some description" />
</head>
<body></body>
</html>
Typically, search engines never index site1.com, even when there are external links pointing to it, such as:
Click Here
But such a link is treated as an external link to http://site2.com/some/dirty/url, so it is http://site2.com/some/dirty/url that gets the SEO benefit.
I somehow want to get site1.com indexed (just the title, meta description, and URL); http://site2.com/some/dirty/url getting indexed too is not a problem. Is this really possible, or is it something I just have to forget about?

The 301 redirect tells search engines, and any other user agent that respects HTTP status codes, that http://site1.com no longer exists and has moved to a new location. They now consider the new location of http://site1.com to be http://site2.com/some/dirty/url, and they associate everything, including all links to http://site1.com, with http://site2.com/some/dirty/url. So effectively http://site1.com no longer exists, and no matter how many links you point at it, nothing will change, since those links are credited to http://site2.com/some/dirty/url. That makes sense, since a 301 HTTP status indicates that a page has moved permanently. If the page hasn't moved permanently, you are using the wrong HTTP status code.
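Incidentally, a well-formed 301 response should carry a Location header; the Refresh header in the question's code is redundant. A minimal sketch in PHP:
<?php
// Send a proper 301 with a Location header and stop executing.
// If the move were temporary, the code would be 302 instead.
header("Location: http://site2.com/some/dirty/url", true, 301);
exit;
?>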

Yes, it can be indexed, but it requires better on-page work on both of your sites
(http://site1.com and http://site2.com/some/dirty/url).
For example, I recently worked under the same conditions: the website http://www.top-alliance.de redirects to http://www.top-alliance.com, and both sites were indexed by search engines as of 04 June 2012. This happened because I did better on-page work for both pages.
So the conclusion is that both of your sites will require better on-page work, and then they will definitely be indexed by search engines.

To easily create redirects in WordPress, an alternative is the Simple 301 Redirects plugin. Once you've installed and activated the plugin, a new menu item appears in the Settings area of your dashboard.
There is really nothing complicated about this plugin. The 301 Redirects configuration screen shows you two simple fields, one labeled Request and the other Destination. This is where the old permalink structure and the new permalink structure go; you only need to enter the part after your domain name in these fields.
For example, if the Request field holds a URL in the month-and-name permalink structure and the Destination field holds the same post under the post-name permalink structure, then after you save your changes, any search-engine traffic arriving at the old link will be redirected to the new one.
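For instance, the two fields might hold values like these (hypothetical values, assuming a move from the month-and-name structure to the post-name structure):
Request:     /2012/06/sample-post/
Destination: /sample-post/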

Related

Duplicate URLs in my page, best solution?

I have a website that writes URLs like this:
mypage.com/post/3453/post-title-name-person
In fact, what matters is the post ID part (3453); the title I just add for SEO.
I changed some titles recently, but people can still use the old URLs, because I only read the ID to open the page. So:
mypage.com/post/3453/post-title-name-person
mypage.com/post/3453/name-person
...
will open the same page.
Is that wrong? Google Webmaster Tools tells me that I have 8765 duplicate pages. To try to solve this, I am redirecting the old-title URLs to post/id/current-title, but it seems that Google doesn't understand the redirect and still reports duplicates.
Should I return a 404 when the title doesn't match the database? (That could be a problem, because links that people have shared would no longer open.) Or what?
Maybe Google has not processed your redirects yet. It may take several weeks, and sometimes several months, to process all pages, especially if they are not revisited often. Make sure your redirects are 301s and not 302s (temporary).
That being said, there is a better method than redirects for duplicate pages: the canonical tag. If you can, implement it. There is less risk of mixing up redirects.
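For example, every URL variant of the post would carry the same canonical tag in its <head> (a sketch using the URLs from the question):
<link rel="canonical" href="http://mypage.com/post/3453/post-title-name-person" />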
Google can pick up your new URLs only after you implement 301 redirects, for example through the .htaccess file. Remember that the 301 redirects should be proper one-to-one mappings to the new URLs. After this implementation, fetch the new URLs via Google Search Console so that Google indexes them faster.
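A minimal sketch of such a one-to-one rule in .htaccess (mod_alias, using the URLs from the question):
Redirect 301 /post/3453/name-person http://mypage.com/post/3453/post-title-name-person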

URL Rewrite IIS and search engine

I've configured my IIS (ASP.NET site) to use URL Rewrite.
In particular, this is my rule (a dynamic one): any URL in the format number/string is rewritten to a special .aspx page.
So any URL of the form mysite/id/Name is rewritten to showprof.aspx?id=id&title=Name. This works perfectly.
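The rule looks roughly like this in web.config (a sketch; the pattern and rule name are illustrative):
<rewrite>
  <rules>
    <rule name="PrettyProfileUrl" stopProcessing="true">
      <!-- match URLs of the form id/Name -->
      <match url="^([0-9]+)/([^/]+)$" />
      <action type="Rewrite" url="showprof.aspx?id={R:1}&amp;title={R:2}" />
    </rule>
  </rules>
</rewrite>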
My question is about search engines. I don't have any "fixed" page containing links like mysite/id/Name that a spider can crawl, so I'm trying to figure out how search engines could index my dynamic pages. Should I create a sitemap.xml? If so, in which way? Or should I create a "hidden" page that contains links to all my dynamic content, like mysite/id1/Name1, mysite/id2/Name2, and so on?
Thank you
A starting point is definitely a sitemap.xml. You could try, for example, the IIS SEO Toolkit and see if it is able to index any of your pages: http://www.iis.net/downloads/microsoft/search-engine-optimization-toolkit
It also has functionality to generate a sitemap.xml, although I'm guessing that in your case you have dynamic content, so a better approach would be a "handler" that generates it dynamically on demand (and perhaps caches it for performance reasons).
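A minimal sitemap.xml sketch (the URLs are illustrative, following the mysite/id/Name pattern from the question):
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://mysite/1/Name1</loc>
  </url>
  <url>
    <loc>http://mysite/2/Name2</loc>
  </url>
</urlset>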
I would also recommend having some pages that are reachable through normal links. For example, your home page could link to a "site map" page (not sitemap.xml) that renders the set of links you want indexed (at least the ones most important to you); that will make them easy to discover.

301 redirect all ugly permalinks from old site to new site

So I overhauled a complete website the other day and found snippets of some of the old pages in the Google search results. The old site had an ugly link structure such as domain.com/index.php?article_id=123. The new site uses pretty permalinks such as domain.com/pagetitle.
Is there a piece of code I could put into the .htaccess file to redirect all the ugly permalinks to the new site?
Edit
Additional info: the old links don't exist anymore. The old site's structure and the new one's differ a lot, and not all content from the old site was carried over. The main problem is that I don't want the old links in the Google search results to keep throwing a 404 at the user.
Maybe something like:
RedirectMatch 301 ^/index\.php$ http://www.example.com/somepage
This will redirect all requests for index.php to another location. (Note that mod_alias cannot match against the query string, so every article_id ends up at the same target.)
I don't have the rep to comment on the other answer, but that is a very improper solution if you value your SEO at all. A redirect is your way of telling Google "I've got the same page, I just moved it." There's a much better way to do this that won't negatively affect your SEO at all.
You should create some logic to redirect each old link to its corresponding new link.
Here's an example of how you could do it (a sketch follows the steps below):
Go to the beginning of your program, before any other logic takes place.
Retrieve the requested page. In this case, you might be able to get away with simply checking for a GET variable named article_id.
If the request carries that GET variable, run a query to see if the article exists. (Obviously, you'll still want to return a 404 for articles that don't exist.)
Retrieve the content used to generate the new, more SEO-friendly URL; this is probably the article title or something similar.
Write some code to generate the new URL from that title. At this point, if everything is working properly, you should be able to print the new URL to verify it's correct.
301 redirect to the new URL. Don't 302 or use any other number; 301 redirect it. This lets search engines know it's the same page and content, but that it has permanently moved.
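A minimal PHP sketch of those steps (the $db connection and the articles table with id and title columns are assumptions for illustration, not from the question):
<?php
// Run before any other logic takes place.
if (isset($_GET['article_id']) && ctype_digit($_GET['article_id'])) {
    // Check whether the article exists.
    $stmt = $db->prepare("SELECT title FROM articles WHERE id = ?");
    $stmt->execute([$_GET['article_id']]);
    $title = $stmt->fetchColumn();

    if ($title === false) {
        http_response_code(404); // the article no longer exists
        exit;
    }

    // Generate the new SEO-friendly slug from the title.
    $slug = strtolower(trim(preg_replace('/[^A-Za-z0-9]+/', '-', $title), '-'));

    // Permanent redirect: tells search engines the page moved for good.
    header("Location: /" . $slug, true, 301);
    exit;
}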

Old website (2002) to new website (2014)

One of my main concerns is about SEO, as I intend to completely redesign a website and make it work on mobile devices.
Following that idea, I have been researching on Google Developers and have decided to choose the first option, "Responsive Design".
CURRENT SITUATION
Built with TABLEs
SEO based only on KEYWORDS and DESCRIPTION meta tags
Unfriendly URLs
Uses an early version of HTML
Old-fashioned layout
Excellent position in the Google rankings
Excellent visitor traffic
TARGET
Create a mobile version to serve the target group using mobile devices
Full redesign, including best-practice organic SEO and friendly URLs
HTML5
CSS3
Responsive Design
Migrate technology from JSP to PHP (Laravel 4)
OBSERVATION
Because this site has been online since 2002, it has developed an excellent position on Google. The biggest concern of all is losing that position because of the migration to the new version. Seeking an alternative or more efficient solution, I've identified the use of 301 redirects to the new URLs.
My Questions are as follows:
If the domain of the website is maintained, will only the URLs change?
example:
From
www.website.com/cs/detail.jsp?id=123456
www.website.com/aboutus.html
To
www.website.com/product/detail/123456/lorem-ipsum-dolem-sit-amet
www.website.com/about-us
Following that thought, I've found some solutions, like the 301 redirect.
DOUBT 01
I will use a 301 redirect for each page; will I have to set up the 301 redirects one-to-one?
aboutus.html
response.setStatus(301);
response.setHeader("Location", "http://www.website.com/about-us");
response.setHeader("Connection", "close");
cs/detail.jsp
response.setStatus(301);
response.setHeader("Location", "http://www.website.com/product/detail/123456/lorem-ipsum-dolem-sit-amet");
response.setHeader("Connection", "close");
DOUBT 02
Following on from the doubt above, will I have to put the new website in a subfolder?
Example:
|public_html
|-index.html
|-quemsomos.html
|-cs
|--detalhe.jsp
|-novo-site
|--index.php
And will the URLs then be like these:
www.website.com/new-website/quem-somos
www.website.com/new-website/product/detail/123456/lorem-ipsum-dolem-sit-amet
DOUBT 03
Is there anything else I need to worry about?
Answer 1:
It's better to redirect the entire site using an .htaccess file rather than writing a redirect into every single page. You can refer to the link below to gain an understanding of this redirection.
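A sketch of what that could look like for the URLs in the question (the title slug cannot be derived inside .htaccess, so the dynamic rule below redirects to an ID-only URL and assumes the application issues a final redirect to the full slug):
# Static page: old file to new pretty URL
Redirect 301 /aboutus.html http://www.website.com/about-us
# cs/detail.jsp?id=123456 -> /product/detail/123456/
RewriteEngine On
RewriteCond %{QUERY_STRING} ^id=([0-9]+)$
RewriteRule ^cs/detail\.jsp$ http://www.website.com/product/detail/%1/? [R=301,L]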
Answer 2:
The URL strictly follows your directory structure.
Answer 3:
The search engines will take some time to discover the 301, recognize it, and credit the new page with the rankings and trust of its predecessor. This process can be lengthier if search engine spiders rarely visit the given web page, or if the new URL doesn't properly resolve.
You may refer to the link below to gain a better understanding of 301 redirection.
how to 301 redirect

How to stop search engines from crawling the whole website?

I want to stop search engines from crawling my whole website.
I have a web application for the members of a company to use. It is hosted on a web server so that the employees of the company can access it. No one else (the public) needs it or would find it useful.
So I want to add another layer of security (in theory) to try to prevent unauthorized access, by removing the site entirely from the view of all search engine bots/crawlers. Having Google index our site to make it searchable is pointless from a business perspective, and just adds another way for a hacker to find the website in the first place and try to hack it.
I know in the robots.txt you can tell search engines not to crawl certain directories.
Is it possible to tell bots not to crawl the whole site without having to list all the directories not to crawl?
Is this best done with robots.txt or is it better done by .htaccess or other?
Using robots.txt to keep a site out of search engine indexes has one minor and little-known problem: if anyone ever links to your site from any page indexed by Google (which would have to happen for Google to find your site anyway, robots.txt or not), Google may still index the link and show it as part of their search results, even if you don't allow them to fetch the page the link points to.
If this might be a problem for you, the solution is to not use robots.txt, but instead to include a robots meta tag with the value noindex,nofollow on every page on your site. You can even do this in a .htaccess file using mod_headers and the X-Robots-Tag HTTP header:
Header set X-Robots-Tag "noindex,nofollow"
This directive will add the header X-Robots-Tag: noindex,nofollow to every page it applies to, including non-HTML pages like images. Of course, you may want to include the corresponding HTML meta tag too, just in case (it's an older standard, and so presumably more widely supported):
<meta name="robots" content="noindex,nofollow" />
Note that if you do this, Googlebot will still try to crawl any links it finds to your site, since it needs to fetch the page before it sees the header / meta tag. Of course, some might well consider this a feature instead of a bug, since it lets you look in your access logs to see if Google has found any links to your site.
In any case, whatever you do, keep in mind that it's hard to keep a "secret" site secret very long. As time passes, the probability that one of your users will accidentally leak a link to the site approaches 100%, and if there's any reason to assume that someone would be interested in finding the site, you should assume that they will. Thus, make sure you also put proper access controls on your site, keep the software up to date and run regular security checks on it.
This is best handled with a robots.txt file, though it only works for bots that respect the file.
To block the whole site add this to robots.txt in the root directory of your site:
User-agent: *
Disallow: /
To limit access to your site for everyone else, .htaccess is better, but you would need to define access rules, by IP address for example.
Below are .htaccess rules that restrict everyone except people coming from your company IP:
Order deny,allow
Deny from all
# Enter your company's IP address here
Allow from 255.1.1.1
If security is your concern, and locking down to IP addresses isn't viable, you should look into requiring your users to authenticate in some way to access your site.
That would mean that anyone who isn't authenticated (Google, a bot, or a person who stumbled upon a link) wouldn't be able to access your pages.
You could bake it into your website itself, or use HTTP Basic Authentication.
https://www.httpwatch.com/httpgallery/authentication/
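A minimal Basic Authentication sketch in .htaccess (the realm name and .htpasswd path are illustrative):
AuthType Basic
AuthName "Employees only"
AuthUserFile /path/to/.htpasswd
Require valid-user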
In addition to the answers already provided, you can stop search engines from crawling/indexing a specific page of your website via robots.txt. Below is an example:
User-agent: *
Disallow: /example-page/
The above example is especially handy when you have dynamic pages; otherwise, you may want to add the following HTML meta tag to the specific pages you want disallowed from search engines:
<meta name="robots" content="noindex, nofollow" />
