How to stop Google from indexing a domain used for redirection - google-index

I have two domains: a site is deployed on one of them, while the other has no content and simply redirects to the first. Google is indexing both of them and shows the same content from the first domain in the search results for each.
Q: How can I prevent the one that redirects from showing up in the search results?
Is it just a matter of deploying a robots.txt on the domain that redirects?

If all you want is to stop Google from indexing your site you should use the following robots.txt file:
User-agent: *
Disallow: /
However, if you want to make sure the correct domain shows up in Google's results you should:
a) Use HTTP 301 Redirects
b) Specify your canonical URL

According to Google...
Q: I have the same content available on two domains (example.com and example2.org). How do I let Google know that the two domains are the same site?
A: Use a 301 redirect to direct traffic from the alternative domain (example2.org) to your preferred domain (example.com). This tells Google to always look for your content in one location, and is the best way to ensure that Google (and other search engines!) can crawl and index your site correctly. Ranking signals (such as PageRank or incoming links) will be passed appropriately across 301 redirects. If you're changing domains, read about some best practices for making the move.
Source
So my guess is that either your redirect is not a 301 or Google's behavior has changed.
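One way to test that guess is to look at the status code the redirecting domain actually returns (for example with `curl -I`). The snippet below is only a sketch: `classify_redirect` is a hypothetical helper, and the hosts are the placeholders from the quoted answer. It takes the status code and Location header you observed and says whether the redirect is permanent and points at the preferred domain.

```python
from urllib.parse import urlsplit

# 301 is the code the quoted answer recommends; 308 is the other
# permanent redirect status defined by HTTP.
PERMANENT = {301, 308}
TEMPORARY = {302, 303, 307}


def classify_redirect(status, location, preferred_host):
    """Describe a response received from the alias domain."""
    if status not in PERMANENT | TEMPORARY:
        return "not a redirect"
    target_host = urlsplit(location or "").hostname
    if target_host != preferred_host:
        return "redirects elsewhere"
    return "permanent" if status in PERMANENT else "temporary"


print(classify_redirect(301, "http://example.com/", "example.com"))  # permanent
print(classify_redirect(302, "http://example.com/", "example.com"))  # temporary
```

If the alias domain comes back as "temporary" or "not a redirect", that would be consistent with both domains staying in the index.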

Related

Combine variations of the same domain in Google Analytics

Is there a way of forcing Google Analytics to combine variations of the same domain so it treats all subsequent visits from one user to any of these URLs as the same user? Here's an example:
http://www.example.com/mypage.php
https://www.example.com/mypage.php
http://example.com/mypage.php
https://example.com/mypage.php
What I hope to achieve is a setup where if a single user clicked each of these links, the results would appear in Analytics as:
http://www.example.com/mypage.php - Pageviews: 4 Users: 1
This question expands on this one from earlier
As I understand it, if I were to add a global 301 redirect to my .htaccess file, the user would be automatically redirected to whatever domain I specify. Is this the best solution?
The answer you linked to is outdated (if it was ever right). If you use Universal Analytics and set the cookie domain parameter to auto (which is the default), the cookie will be set for the domain and all subdomains:
Automatic Cookie Domain Configuration simplifies cross domain tracking
implementations by automatically writing cookies to the highest level
domain possible when the auto parameter is used. When used on the
domain www.example.co.uk, it will try to write cookies in the
following order:
co.uk
example.co.uk
www.example.co.uk
(see documentation). So you will have a cookie for example.co.uk on both domain and www subdomain which is valid for both.
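To make the order in the quoted documentation concrete, here is a small sketch that lists the candidate cookie domains for a hostname, from broadest to most specific. `cookie_domain_candidates` is a made-up name and this is not the analytics.js implementation; it only reproduces the ordering described above. As far as I know, browsers refuse cookies for public suffixes such as co.uk, which is why the cookie ends up on example.co.uk.

```python
def cookie_domain_candidates(hostname):
    """Return candidate cookie domains, broadest first, skipping the bare TLD."""
    labels = hostname.lower().strip(".").split(".")
    # Start with the last two labels and add one label at a time.
    start = max(len(labels) - 2, 0)
    return [".".join(labels[i:]) for i in range(start, -1, -1)]


print(cookie_domain_candidates("www.example.co.uk"))
# ['co.uk', 'example.co.uk', 'www.example.co.uk']
```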
Having said that, you should still use the 301 redirect for SEO reasons: it avoids duplicate content, and since Google announced last year that HTTPS is now a ranking signal, you might want to serve your https pages only.
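As a rough illustration of what such a global redirect has to compute, the sketch below maps the four variants from the question onto a single target. The choice of https://www.example.com as the preferred form is my assumption, and `canonical_url` is a hypothetical helper, not something Analytics or Apache provides.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed preferred form; pick whichever scheme and host you standardise on.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.example.com"
ALIASES = {"example.com", "www.example.com"}


def canonical_url(url):
    """Return the URL a 301 redirect should point to for this request."""
    parts = urlsplit(url)
    if parts.hostname not in ALIASES:
        return url  # not one of our hosts, leave it untouched
    return urlunsplit(
        (PREFERRED_SCHEME, PREFERRED_HOST, parts.path or "/", parts.query, "")
    )


variants = [
    "http://www.example.com/mypage.php",
    "https://www.example.com/mypage.php",
    "http://example.com/mypage.php",
    "https://example.com/mypage.php",
]
print({canonical_url(u) for u in variants})
# {'https://www.example.com/mypage.php'}
```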

Block Bots from crawling one of my sites on a multistore multidomain prestashop

Hello, I have a multistore, multidomain PrestaShop installation with the main domain example.com. I want to block all bots from crawling a subdomain site, subdomain.example.com, which is made for resellers so they can buy at lower prices, because its content duplicates the original site. I am not exactly sure how to do it. Usually, if I want to block bots for a site, I would use
User-agent: *
Disallow: /
But how do I use it without hurting the whole store? And is it possible to block the bots from .htaccess too?
Regarding your first question:
If you don't want search engines to crawl the subdomain, putting a robots.txt file ON the subdomain (sub.example.com/robots.txt) is the way to go. Don't put it on your regular domain (example.com/robots.txt) - see Robots.txt reference guide.
Additionally, I would verify both domains in Google Search Console. There you can monitor and control the indexation of the subdomain and main domain.
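Since crawlers fetch robots.txt separately for each host, the two hosts can carry different rules. The sketch below uses Python's standard `urllib.robotparser` to show how a crawler would read each file; the rule texts and URLs are examples, and it says nothing about how PrestaShop itself serves the file.

```python
from urllib.robotparser import RobotFileParser


def parse_robots(text):
    """Build a parser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# Served at subdomain.example.com/robots.txt: block everything.
subdomain_rules = parse_robots("User-agent: *\nDisallow: /\n")
# Served at example.com/robots.txt: an empty Disallow blocks nothing.
main_rules = parse_robots("User-agent: *\nDisallow:\n")

print(subdomain_rules.can_fetch("Googlebot", "http://subdomain.example.com/product/1"))  # False
print(main_rules.can_fetch("Googlebot", "http://example.com/product/1"))  # True
```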
Regarding your second question:
I've found a SO thread here which explains what you want to know: Block all bots/crawlers/spiders for a special directory with htaccess.
We use a canonical URL to tell the search engines where to find the original content.
https://yoast.com/rel-canonical/
A canonical URL allows you to tell search engines that certain similar
URLs are actually one and the same. Sometimes you have products or
content that is accessible under multiple URLs, or even on multiple
websites. Using a canonical URL (an HTML link tag with attribute
rel=canonical) these can exist without harming your rankings.
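If you want to check what a page on the reseller subdomain currently declares, something like the following could pull the canonical URL out of its HTML. It is a minimal sketch using only the standard library; `find_canonical` and the sample markup are invented for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html):
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


page = '<html><head><link rel="canonical" href="https://example.com/product/1"></head></html>'
print(find_canonical(page))  # https://example.com/product/1
```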

Will making my site secure (https) affect my google ranking?

I am managing a website (www.faa.net.au) which is currently running as a standard http:// website.
I am now looking at capturing some information that needs to be confidential. In order to do this, I am looking at purchasing an SSL Certificate for this particular domain.
I have 2 questions really:
Will my rankings be affected at all?
Will I need to set up 301 redirects if there are links that are referring to http:// instead of https://?
Going HTTPS will not negatively affect your page rankings. And yes, you should set up a 301 redirect unless this is a temporary change.
In a nutshell, the search engine bots connect to the pages as normal, so it doesn't matter if it's using SSL/TLS or not. The 301 will pretty much update the bots with the current information.
Going https will actually positively affect your site's SEO, as Google has now introduced it as a ranking signal: http://googlewebmastercentral.blogspot.co.uk/2014/08/https-as-ranking-signal.html
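For illustration, here is roughly what the http to https 301 looks like if it is done in application code rather than in the web server config. This is a sketch of a WSGI wrapper under my own assumptions: `force_https` is a made-up name, www.example.com is a placeholder host, and IPv6 host literals are not handled.

```python
def force_https(app):
    """Wrap a WSGI app so plain-HTTP requests get a 301 to the HTTPS URL."""

    def wrapper(environ, start_response):
        if environ.get("wsgi.url_scheme") == "https":
            return app(environ, start_response)
        # Drop any port from the Host header (IPv6 literals not handled).
        host = environ.get("HTTP_HOST", "www.example.com").partition(":")[0]
        path = environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")
        query = environ.get("QUERY_STRING", "")
        location = "https://" + host + (path or "/") + ("?" + query if query else "")
        start_response(
            "301 Moved Permanently",
            [("Location", location), ("Content-Length", "0")],
        )
        return [b""]

    return wrapper


def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"secure content"]


secured = force_https(demo_app)
```

Old links that still point at http:// then land on the https:// page, and crawlers pick up the new address from the 301.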

Blocking Google (and other search engines) from crawling domain

We want to open a new domain for certain purposes (call them PR). The thing is we want the domain to point to the same website we currently have.
We do not want this new domain to appear on search engines (specifically Google) at all.
Options we've ruled out:
Robots.txt can't be used - it will work the same on both domains, which isn't what we want.
The rel=canonical doesn't block - only suggests to index a similar page instead. The original page might end up being indexed.
Is there a way to handle this?
EDIT
Regarding .htaccess suggestions: we're on IIS7.
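rel=canonical is more than a loose suggestion: it is a strong signal telling Google which page you want it to use, although Google treats it as a hint rather than a binding directive.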
Having said that, when serving pages on the domain you do not want indexed, you can use the `X-Robots-Tag` HTTP header to keep those pages out of the index:
Simply add any supported META tag to a new X-Robots-Tag directive in
the HTTP Header used to serve the file.
Don't include this document in the Google search results:
X-Robots-Tag: noindex
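Because both domains point at the same site, the header has to be added only when the request arrives on the new domain. You are on IIS7, so treat the following purely as a sketch of the logic, written as a Python WSGI wrapper: `noindex_for_hosts` and pr.example.com are hypothetical names, and on IIS you would express the same host condition in the server's own configuration.

```python
def noindex_for_hosts(app, blocked_hosts):
    """Add 'X-Robots-Tag: noindex' to responses served on the given hosts."""
    blocked = {h.lower() for h in blocked_hosts}

    def wrapper(environ, start_response):
        # Compare on the host name only, ignoring any port.
        host = environ.get("HTTP_HOST", "").split(":")[0].lower()

        def patched_start_response(status, headers, exc_info=None):
            if host in blocked:
                headers = list(headers) + [("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)

    return wrapper


def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


app = noindex_for_hosts(demo_app, ["pr.example.com"])
```

Requests for the main domain go through unchanged, so its pages stay indexable.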
Have you tried setting your preferred domain in Google Webmaster Tools?
The drawback to this approach is that it doesn't work for other search engines.
I would block them with, say, an .htaccess file at the root of the domain in question:
BrowserMatchNoCase SpammerRobot bad_bot
Order Deny,Allow
Deny from env=bad_bot
In place of SpammerRobot, you would have to specify the user agents of the different bots used by the major search engines.
Alternatively, you could whitelist all known web browsers and deny everything else.
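The same user-agent test can be sketched outside Apache. The pattern below lists a few crawler tokens I believe the major engines send, but treat the list as illustrative and incomplete; `is_search_crawler` is a made-up helper, and a User-Agent header can be spoofed, so this only stops well-behaved bots.

```python
import re

# Illustrative, incomplete list of crawler tokens.
CRAWLER_PATTERN = re.compile(
    r"googlebot|bingbot|slurp|duckduckbot|baiduspider|yandex",
    re.IGNORECASE,
)


def is_search_crawler(user_agent):
    """Return True if the User-Agent string looks like a known crawler."""
    return bool(CRAWLER_PATTERN.search(user_agent or ""))


print(is_search_crawler("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))  # True
print(is_search_crawler("Mozilla/5.0 (Windows NT 10.0; rv:109.0) Gecko/20100101 Firefox/115.0"))  # False
```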

How to get a domain un-indexed by search engines

I have a domain with a lot of indexed pages; I use it as an online test domain. I understand that I should be testing on an intranet or something similar, but over time Google has indexed a few websites which are not relevant anymore.
Does anyone know how to get a domain totally un-indexed from most search engines?
There are a couple of things you can do.
Set up a restrictive robots.txt file
Password protect the domain root
Request removal directly from SEs
If you have a static ip and you are the only one accessing the site, you can simply deny access to any ips other than yours.
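The IP restriction would normally live in the web server or firewall config, but the check itself is simple. Here is a sketch with Python's `ipaddress` module; 203.0.113.7 is a documentation-range address standing in for your static IP, and `is_allowed` is a hypothetical helper.

```python
import ipaddress

# Placeholder for your own static IP (203.0.113.0/24 is reserved for examples).
ALLOWED_NETWORKS = [ipaddress.ip_network("203.0.113.7/32")]


def is_allowed(remote_addr, allowed=ALLOWED_NETWORKS):
    """Return True only if the client address is inside an allowed network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # malformed address: deny
    return any(addr in network for network in allowed)


print(is_allowed("203.0.113.7"))   # True
print(is_allowed("198.51.100.4"))  # False
```

Every request that fails the check, crawlers included, would get a 403, so nothing new gets indexed.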
Place a robots.txt file in the root directory of your webpage. It can be used to control how much access search engine spiders have to your content. You can specify certain areas of your site off limits to indexing, on a directory-by-directory basis.
Remove the alias domain, if you have one
Remove the URL redirect from the old domain to the new one
so that search engines can slowly de-index your old domain.
