Bot function after NoFollow rule - bots

I was just wondering what Googlebot or any other search engine spider/bot does after it encounters the nofollow rule in a meta tag. Presumably the bot is on your site and reaches a page through a link, a redirect, etc., but if the linked page includes the code <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">, where does the bot go after that? Does it go back to the previous page, or does it do something else? I hope this doesn't sound like a stupid question; I was just curious.

Usually, a web crawler does not visit the links it finds on a page as soon as it encounters them. Instead, it adds them to a waiting list. When the spider finishes loading the current page, it pops another URL from that list. The new URL does not necessarily come from the last page fetched; it can come from an earlier page or even from another website, depending on how the list is organized. A page marked NOFOLLOW therefore just contributes no new links, and the crawler carries on with the next URL already in the list.
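The waiting-list behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Googlebot is actually implemented: the PAGES dictionary is a made-up in-memory "web", the example.com URLs are hypothetical, and a real crawler would fetch pages over HTTP and parse the robots meta tag out of the HTML.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (robots meta directives, outgoing links).
PAGES = {
    "http://example.com/": (set(), ["http://example.com/a", "http://example.com/b"]),
    "http://example.com/a": ({"noindex", "nofollow"}, ["http://example.com/hidden"]),
    "http://example.com/b": (set(), ["http://example.com/c"]),
    "http://example.com/c": (set(), []),
}

def crawl(seed):
    """Visit pages breadth-first, honouring a nofollow directive."""
    frontier = deque([seed])  # the waiting list of URLs still to fetch
    seen = {seed}             # URLs already queued, to avoid duplicates
    visited = []
    while frontier:
        url = frontier.popleft()
        directives, links = PAGES.get(url, (set(), []))
        visited.append(url)
        if "nofollow" in directives:
            # Do not queue this page's links; just take the next URL from the list.
            continue
        for link in links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited

order = crawl("http://example.com/")
print(order)
```

After the nofollow page /a, the sketch moves on to /b, which was queued earlier from the home page; /hidden is never fetched because its only link came from the nofollow page.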

Related

How to tell search engines like Google to display its tabs/url?

When we search in Google, the result for a site also shows the site's top tabs or links, for example when we search for "bing" or "net beans".
Q: How does it display those links? Do we have to tell it to display them?
Q: Does it have something to do with sitemap.xml/robots.txt, or does it display the links present in the website's index.php?
Robots.txt: allows/disallows bots from crawling particular pages.
sitemap.xml: lists the locations (loc) of your website's pages and also states how frequently they change.
Q: How does it display the description of a website?
I searched about the description, and it has to do with the meta tag named description. But when I opened the NetBeans page source, I found:
<META NAME="description" CONTENT="Welcome to NetBeans">
But the description Google shows is:
Fully-featured Java IDE written completely in Java, with many modules available, such as: debugger, form editor, object browser, CVS, emacs integration, ...
For your first question: the links you mention are generated automatically for the most-visited websites and portals. If you have set up sitemap.xml and robots.txt correctly in your website's root folder and your site gets a lot of visitor traffic, then after a while Google detects the links that users visit most and shows them in its results.
For the second question: meta tags are not the only thing search engines use for the text shown in their results. They also read the page content, extract context from its text, and show a description based on the keyword you searched for. However, your meta description will be shown when the keyword is the website's name or its domain.
Take a look at the Open Graph protocol to learn more about meta tags and what you need for SEO.
Regards

How can I show a picture before a link in google sites

I have a website that I've set up through Google Sites. I have a link to an external webpage. What I'd really like is this: if someone clicks the link, it shows a JPG picture for about 5 seconds and then forwards them to the linked website. Is there a way to do that?
Thanks,
Rich
Adding the following tag:
<META http-equiv="refresh" content="5;URL=http://example.com">
to the <head> section of a webpage will redirect the user to example.com, or whatever the URL value is. You can display an image in the <body> section of this page. This seems like the simplest way to accomplish what you want.
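As a rough sketch of the suggestion above, the snippet below builds such an interstitial page as a string. The function name interstitial_page, the file name picture.jpg, and the empty alt text are assumptions made for illustration; whether your hosting lets you publish a page with a custom <head> is a separate question that this sketch does not address.

```python
from html import escape

def interstitial_page(image_url, target_url, delay_seconds=5):
    """Return HTML for a page that shows an image, then redirects."""
    # Escape the values because they end up inside HTML attributes.
    target = escape(target_url, quote=True)
    image = escape(image_url, quote=True)
    return (
        "<html>\n<head>\n"
        f'<meta http-equiv="refresh" content="{delay_seconds};URL={target}">\n'
        "</head>\n<body>\n"
        f'<img src="{image}" alt="">\n'
        "</body>\n</html>\n"
    )

# picture.jpg is a hypothetical image file; example.com stands in for the real target.
page = interstitial_page("picture.jpg", "http://example.com")
print(page)
```

The link on the original site would then point at this intermediate page rather than directly at the external website.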

Disallow In-Page Url Crawls

I want to disallow all bots from crawling a specific type of page. I know this can be done via robots.txt as well as .htaccess. However, these pages are generated from the database at the user's request. I have searched the internet and could not find a good answer for doing so.
My link looks like:
http://www.my_website/some_controller/some_action/download?id=<encrypted_id>
There is a view page for the users wherein all the data that is displayed comes from the database including the kind of links that I have mentioned before. I want to hide those links from the bots and not the entire page. How can I do that?
Could the page not be generated with a
<meta name="robots" content="noindex">
in the head?
You cannot hide content from bots while keeping it available to other traffic. After all, how do you distinguish between a bot and regular traffic? You can't without some sort of verification, such as a CAPTCHA (a picture of a word that you type into a box).
Robots.txt does not stop bots. Most bots read it and stay away of their own choice, but only because they are programmed to do so. They do not have to, and a bot can ignore robots.txt completely if it wishes.
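To illustrate that the check is voluntary, here is a minimal sketch of what a well-behaved bot does before fetching a URL, using Python's standard urllib.robotparser module. The robots.txt content, the www.example.com host, and the "ExampleBot" user agent are assumptions made up for this example; the path mirrors the download link from the question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the generated download links.
ROBOTS_TXT = """\
User-agent: *
Disallow: /some_controller/some_action/download
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

download_url = "http://www.example.com/some_controller/some_action/download?id=abc"
view_url = "http://www.example.com/some_controller/some_action/view"

# A polite bot asks before fetching; nothing forces it to make this call.
download_allowed = parser.can_fetch("ExampleBot", download_url)
view_allowed = parser.can_fetch("ExampleBot", view_url)
print(download_allowed, view_allowed)
```

The rule only has an effect because the bot's own code calls can_fetch and respects the answer. A bot that skips this step can still request the download URL, so anything that must stay private needs server-side access control.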

SEO - Getting a 301 page indexed by search engines

I have a site (say site1.com) which 301-redirects to another page on a different site (say http://site2.com/some/dirty/url).
Typical code at site1.com:
<?php
header("HTTP/1.1 301"); header("refresh:0;url=http://site2.com/some/dirty/url");
?><html>
<head>
<title>
Site 1 - heading.
</title>
<meta name="description" content="some description" />
</head>
<body />
</html>
Typically, Search Engines never index site1.com, even when there are external links like:
Click Here
But this is considered as an external link to http://site2.com/some/dirty/url and thus http://site2.com/some/dirty/url is seo'd.
I somehow want to get site1.com indexed (just the title, meta description and URL); it is not a problem if http://site2.com/some/dirty/url gets indexed too. Is this really possible, or is it something I have to forget about?
The 301 redirect tells search engines, and any other user agent that respects HTTP status codes, that http://site1.com no longer exists and has moved to a new location. They now consider the new location of http://site1.com to be http://site2.com/some/dirty/url and associate everything, including all links to http://site1.com, with http://site2.com/some/dirty/url. So basically http://site1.com does not exist anymore, and no matter how many links you point at it, nothing will change, since those links will now be associated with http://site2.com/some/dirty/url. That makes sense, since a 301 HTTP status indicates that a page has moved permanently. If the page hasn't moved permanently, then you are using the wrong HTTP status code.
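The consolidation described in this answer can be pictured with a toy model. This is a simplified sketch, not a description of how any search engine really scores links: the REDIRECTS table, the helper names canonical and credit_links, and the sample list of inbound links are all hypothetical.

```python
# Hypothetical permanent-redirect table: old URL -> new URL (HTTP 301).
REDIRECTS = {
    "http://site1.com/": "http://site2.com/some/dirty/url",
}

def canonical(url, redirects, max_hops=10):
    """Follow 301 entries until reaching a URL that does not redirect."""
    for _ in range(max_hops):
        if url not in redirects:
            return url
        url = redirects[url]
    raise ValueError("too many redirects")

def credit_links(inbound_links, redirects):
    """Count inbound links per final destination after resolving redirects."""
    counts = {}
    for target in inbound_links:
        final = canonical(target, redirects)
        counts[final] = counts.get(final, 0) + 1
    return counts

# Two links point at the redirecting site, one at the destination directly.
links = ["http://site1.com/", "http://site1.com/", "http://site2.com/some/dirty/url"]
credited = credit_links(links, REDIRECTS)
print(credited)
```

In this model every link ends up credited to the destination URL, and the redirecting URL never appears as an entry of its own, which matches the point that adding more links to the old address changes nothing.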
Yes, it can be indexed, but it requires better on-page work on both of your sites
(http://site1.com) and (http://site2.com/some/dirty/url).
For example, I recently worked under the same conditions: the website URL "http://www.top-alliance.de" redirects to "http://www.top-alliance.com", and both sites had been indexed by the search engine as of 04 June 2012. This happened because I did better on-page work for both pages.
So the conclusion is that both of your sites will require better on-page work, after which they will definitely be indexed by the search engine.
Thanks & Regards
Nitin Bhatnagar
To easily create redirects in WordPress, an alternative is a simple 301 redirect plugin. Once you've installed and activated the plugin, it adds a new menu item in the Settings area of your dashboard.
There is really nothing complicated about this plugin. The 301 redirect configuration window shows two simple fields, one labeled Request and the other Destination. This is where the old permalink structure and the new permalink structure go. You only need to enter the part that comes after your domain name in these fields.
For example, the Request field can hold the WordPress Month and Name permalink structure, while the Destination field holds the Post Name permalink structure. After you fill in these two fields, save your changes. The plugin will then redirect any search engine traffic that arrives at the old links.

If a page is not linked to the main website, can search engines find it?

I want to put a secret page in my website (www.mywebsite.com). The page URL is "www.mywebsite.com/mysecretpage".
If there is no clickable link to this secret page in the home page (www.mywebsite.com), can search engines still find it?
If you want to hide from a web crawler: http://www.robotstxt.org/robotstxt.html
A web crawler collects links and follows them. So if you're not linking to the page, and no one else is, the page won't be found by any search engine.
But you can't be sure that someone looking for your page won't find it. If you want the data to be secret, you should use a script of some kind to grant access only to those who should have it.
Here is a more useful link: http://www.seomoz.org/blog/12-ways-to-keep-your-content-hidden-from-the-search-engines
No. A web spider crawls by following links from pages it has already visited. If no page links to it, a search engine won't be able to find it.

Resources