I want to hide one individual page from Google. How can I do it?
I have a UserControl for this page.
Thanks in advance.
Try the robots.txt approach first; it is described at http://www.robotstxt.org/robotstxt.html.
Create a robots.txt file in the root of your site, make it readable by everyone, and put the following in it:
User-agent: *
Disallow: /<your_page_url>
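If you want to sanity-check a rule like this before deploying it, here is a minimal sketch using Python's standard urllib.robotparser. The page path /hidden-page.aspx and the host www.example.com are placeholders I made up, not anything from the question, and Python's parser is not Google's own implementation, so treat the result as an approximation of how a compliant crawler would read the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical path; replace it with the real URL path of the page to hide.
ROBOTS_TXT = """\
User-agent: *
Disallow: /hidden-page.aspx
"""


def can_crawl(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    base = "http://www.example.com"  # placeholder host
    for path in ("/hidden-page.aspx", "/index.aspx"):
        print(path, can_crawl(ROBOTS_TXT, "Googlebot", base + path))
```

Only the disallowed path should come back as not crawlable; every other page on the site stays open to crawlers.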
Related
I am looking for some clarity on how to block Googlebot from specific pages on my site while still allowing those pages to be indexed by my Google Site Search (GSA). I cannot find a clear answer on this. This is my best guess:
User-agent: *
Disallow: /wp-admin/
Disallow: /example/custom/
User-Agent: gsa-crawler
Allow: /example/custom/
I would like to block Googlebot from indexing any pages under www.example.com/example/custom/ but still have them indexed by GSA. Would this be the correct implementation in my robots.txt file? Or would the gsa-crawler group need to go above User-agent: *? Any insight is much appreciated.
Not sure if this helps:
https://www.google.com/support/enterprise/static/gsa/docs/admin/72/gsa_doc_set/admin_crawl/preparing.html
Security tip: remember that attackers read robots.txt to see which directories you want to "guard".
Cheers!
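On the ordering question, here is a rough sketch with Python's standard urllib.robotparser that runs the robots.txt from the question in both orders. It assumes the crawler identifies itself as gsa-crawler, as in the question, and the page URL is invented for illustration. Python's parser is only a stand-in for how Googlebot or the GSA actually behave, so verify against the real crawlers as well.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the question, with the wildcard group first.
WILDCARD_FIRST = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /example/custom/

User-agent: gsa-crawler
Allow: /example/custom/
"""

# The same rules with the gsa-crawler group moved to the top.
GSA_FIRST = """\
User-agent: gsa-crawler
Allow: /example/custom/

User-agent: *
Disallow: /wp-admin/
Disallow: /example/custom/
"""


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/example/custom/page.html"  # invented URL
    for name, text in (("wildcard first", WILDCARD_FIRST), ("gsa first", GSA_FIRST)):
        print(name, "Googlebot:", allowed(text, "Googlebot", page))
        print(name, "gsa-crawler:", allowed(text, "gsa-crawler", page))
```

In this parser the group order makes no difference: a crawler with its own named group uses that group, and everyone else falls back to the * group. One side effect worth knowing is that gsa-crawler then ignores the * group entirely, so under these rules it would not be blocked from /wp-admin/ unless you repeat that Disallow line in its group.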
I created a website, www.example.com, and a mobile version of it on the subdomain www.m.example.com. I use an .htaccess file to redirect smartphones to the mobile version. The mobile site's files are in a folder named "mobile". I put a robots.txt file in the main root folder to prevent the mobile URLs from being indexed in search engine results.
My robots.txt file looks like this:
User-agent: *
Disallow: /mobile/
I also put a robots.txt file in the folder named "mobile":
User-agent: *
Disallow: /
My problem is this:
On desktop, all results and snippets are correct,
but when I search on mobile, the snippet for the result looks like this:
A description for this result is not available because of this site's robots.txt – learn more
How can I solve this?
By using this robots.txt on www.m.example.com
User-agent: *
Disallow: /
you are forbidding bots to crawl any resource on www.m.example.com.
If bots are not allowed to crawl, they can’t access your meta-description.
So everything is working as intended.
If you want your pages to get crawled (and indexed), you have to allow it in your robots.txt (or remove the robots.txt file altogether).
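To make the difference concrete, here is a small sketch with Python's standard urllib.robotparser comparing the file you have now against two ways of opening the site up again. The page URL is made up, and Python's parser is only an approximation of what a search engine crawler does.

```python
from urllib.robotparser import RobotFileParser

BLOCK_ALL = "User-agent: *\nDisallow: /\n"  # what www.m.example.com serves now
ALLOW_ALL = "User-agent: *\nDisallow:\n"    # an empty Disallow blocks nothing
NO_RULES = ""                               # same effect as having no rules at all


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "http://www.m.example.com/products.html"  # invented page
    print("Disallow: /  ->", is_crawlable(BLOCK_ALL, url))
    print("Disallow:    ->", is_crawlable(ALLOW_ALL, url))
    print("no rules     ->", is_crawlable(NO_RULES, url))
```

With "Disallow: /" no page on the mobile host can be fetched, which is why the crawler has no description to show. Either of the other two variants lets it read the pages again.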
By using the canonical link type, you can denote that two (or more) pages are the same, or that they only have trivial differences (e.g., different HTML structure, table sorted differently etc.), or that one is the superset of the other.
By using the alternate link type, you can denote that it’s an alternate representation of essentially the same content.
(You can see examples in my answer on Webmasters SE.)
I'm looking for advice and a way to do the following.
I have a folder on my domain where I am testing a certain landing page.
If it goes well, I might build a new website and domain around this landing page,
and that's the main reason I don't want it to get crawled: I don't want to be punished by Google for duplicate content. I also don't want unwanted bots to scrape this landing page, as no good can come of it. Does that make sense to you?
If so, how can I do this? I don't think robots.txt is the best method, as I understand that not all crawlers respect it, and even Google may not fully respect it. I can't put a password on it, since the landing page should be open to all humans (so the solution must not cause any problems for human visitors). Does that leave the .htaccess file? If so, what code should I add there? Are there any downsides I've missed?
Thanks!
Use a robots.txt file with the following content:
User-agent: *
Disallow: /some-folder/
I have a website, and I use Drupal as the CMS. That is, I write articles in mysite.com/drupal/ but I display them at mysite.com/show_article.php?article_id=x&article_name=y (actually mysite.com/article/x/y, via an .htaccess RewriteRule) by loading the content from the Drupal database.
However, when I search for an article in Google, mysite.com/drupal/node/x/y appears as the result, but mysite.com/article/x/y doesn't.
So I guess I need to add some <google nofollow> tags to Drupal's PHP pages, but which ones? Or is there an easier configuration setting for this?
Thanks !
robots.txt:
User-agent: *
Disallow: /drupal/
Then you'll want to submit an index for mysite.com/article to Google. It sounds like you won't be able to use an XML sitemap for this, however.
That should be all you need. Of course, the obvious question (if I understand what you're doing) is why you're using custom PHP to serve up Drupal content from the same domain instead of just doing this in Drupal.
My domain is www.yellowandred.in.
If a user goes to yellowandred.in/mainWork/, it should redirect to yellowandred.in/mainWork.php,
and if a user goes to yellowandred.in/mainWork/index or yellowandred.in/mainWork.php/index, it should show a 404 error. Please help.
This is the first link I found on Google:
http://kb.mediatemple.net/questions/85/Using+.htaccess+rewrite+rules#gs
You may want to try out this tool to help you create an htaccess file:
http://htaccessredirect.net
You will want to do a URL "rewrite".
There are other tools as well. Just search for "htaccess generator".