How to make our website appear among the top ten Google results? [closed]

How can we make our site appear in the top ten of Google search results?
I want my website to be found by users who search Google for "social networking" or a similar term. How do I do that?

Get to know SEO (Search Engine Optimization).
1) Use relevant keywords and a description in your meta tags
2) Include a title tag and put your most important keywords in heading tags
3) Use descriptive title and alt attributes for images (see the audit sketch after this list)
4) Provide a sitemap page for your website
5) Build backlinks to your site by submitting articles, press releases, and news
6) Cross-link between pages of the same website; giving more internal links to your most important pages can improve their visibility
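For points 1-3, a quick way to check a page is to script a small audit. Here's a rough sketch in Python (it assumes the third-party requests and BeautifulSoup libraries are installed; the URL is just a placeholder):

# Minimal on-page SEO audit sketch (assumes `pip install requests beautifulsoup4`).
import requests
from bs4 import BeautifulSoup

def audit_page(url):
    """Report a missing title, meta description, heading, or image alt text."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    title = soup.find("title")
    print("title:", title.get_text(strip=True) if title else "MISSING")

    description = soup.find("meta", attrs={"name": "description"})
    print("meta description:", description["content"] if description and description.get("content") else "MISSING")

    h1 = soup.find("h1")
    print("h1:", h1.get_text(strip=True) if h1 else "MISSING")

    missing_alt = [img.get("src") for img in soup.find_all("img") if not img.get("alt")]
    print("images missing alt text:", missing_alt or "none")

audit_page("https://example.com/")  # placeholder URL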
But before anything else, you should know what type of users you want for your site and research the relevant keywords for it; Google Analytics definitely helps for this purpose.
And most important, don't expect your site to be at the top soon; it will take some time, at least six months or so, to reach the top of the search results. As your site's users increase, its rank will increase. Best of luck!

The algorithm used by Google ranks your site based on the number of other sites linking to it in association with specific keywords.
This has been exploited for so-called "Google bombing": if enough people spread links to a specific site using a specific word, Google connects that word to that site and gives it the top rank (for example, it has been used to associate insults with politicians). The same technique has been used by spammers to raise the rank of garbage sites: they flood forums and blog comments with links to their sites. Although the algorithm has been improved to counter this, it is still a viable way to raise a site's rank.
Clearly, using such methods to improve your site's visibility will give it a very bad reputation.
I suggest instead paying Google for advertising (so you get to the top in a legitimate way).
And of course, you should reach the top ten anyway if your site really is the best one for the topic in question.

Slip Google a fiver (cough), I mean, uh, get other well-known websites to link to you. Google's PageRank works out a page's final rank or priority by determining how many other sites link to it, with links from already high-ranking sites weighted more than links from an 'unknown' site.

Related

Why would you stop Google from indexing pages in your website? [closed]

I've read some articles on how to stop the indexing, but I'm not clear WHY you would actually want to do that.
1) The explanation I found for why was:
"For marketers, one common reason is to prevent duplicate content (when there is more than one version of a page indexed by the search engines, as in a printer-friendly version of your content) from being indexed.
Another good example? A thank-you page (i.e., the page a visitor lands on after converting on one of your landing pages). This is usually where the visitor gets access to whatever offer that landing page promised, such as a link to an ebook PDF." [Basically you don't want the user to find your Thank You page with freebies through search without signing up]
However, in both of these cases it actually seems like a bad idea to prevent indexing. Wouldn't you rather just redirect to the sign-in page (in the second example) after the user finds you? At least the user would be able to reach your website.
2) It's also mentioned that indexing is not the same as appearing in Google search results, but it's not really clear what the difference is. Could someone enlighten me?
TIA.
Let me offer a few good reasons from my experience, though I believe many more exist.
The traditionally known primary reason is to save computing resources. Imagine a search engine: it probably would not want another search engine to index all of its result pages, and a search engine indexing its own output could go on for a very long time. The same applies to binary data that contains no text.
Your first example somewhat falls into this category:
"For marketers, one common reason is to prevent duplicate content (when there is more than one version of a page indexed by the search engines, as in a printer-friendly version of your content) from being indexed.
But this is no longer considered a valid reason, as the resource consumption is generally low, and proper disambiguation should be done with HTML metadata such as:
<link rel='canonical' href='<permanent link>' ...>
<link rel='alternate' media='print' ...>
Another big reason to prevent indexing is privacy. For example, Facebook profiles are not indexed if the owner chooses so.
Another good example? A thank-you page (i.e., the page a visitor lands on after converting on one of your landing pages). This is usually where the visitor gets access to whatever offer that landing page promised, such as a link to an ebook PDF." [Basically you don't want the user to find your Thank You page with freebies through search without signing up]
This falls into the privacy category. Worse still, a search engine once indexed a set of these "thank you" pages from a mobile operator's website, and they also included the messages that had been sent.
One reason I have observed is general newbie paranoia. It is a bad reason, because that kind of concern would be much better addressed with HTTP authentication.
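To make the privacy case concrete, here is a minimal sketch of how a site can mark a private page as non-indexable with an X-Robots-Tag response header, which search engines treat like a robots meta tag. The Flask route and page content are hypothetical placeholders:

# Sketch: mark a private page as non-indexable (hypothetical Flask route).
from flask import Flask, make_response

app = Flask(__name__)

@app.route("/profile/<username>")
def profile(username):
    # Render the page as usual; a plain string stands in for the real template.
    response = make_response(f"<h1>Profile of {username}</h1>")
    # Equivalent to <meta name="robots" content="noindex, nofollow"> in the page head.
    response.headers["X-Robots-Tag"] = "noindex, nofollow"
    return response

if __name__ == "__main__":
    app.run()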

Redirects vs. "true" page hits: a crawler's perspective [closed]

Background:
Domains such as bit.ly, ow.ly, instagr.am, and gowal.la are URL shorteners which forward elsewhere. Since most of these URLs actually forward to other, third-party sites, I'm assuming they can handle a pretty heavy load.
Question:
Is there a different politeness metric when crawling 301 redirects from a single domain (e.g. ow.ly), compared with crawling "real" content pages (e.g. blogger.com)?
More concretely: how many times a day could we expect to be able to hit a site that issues 301 redirects, compared with a normal site that serves real content?
Some initial thoughts:
My initial guess would be on the order of 10^6 (1,000,000) requests per day, given that what I see online suggests that hitting a mature site on the order of 10^3-10^5 times a day is not a huge issue, considering that a large site like Tumblr receives around 10^7 (10,000,000+) views per day, while sites like Google are on the order of 10^8 to 10^9 (hundreds of millions to billions) of views per day.
In any case, I hope this very raw bit of fact-finding that I've done will spur some thoughts on defining the difference in "politeness" metrics when we are discussing 301 redirects versus "true" page crawls (which are bandwidth intensive).
When in doubt, check robots.txt. There's a non-standard extension called Crawl-delay which, as you might imagine, specifies how many seconds to wait between requests.
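Python's standard library can read that value for you; here is a small sketch that honours Crawl-delay between requests (the host and user-agent name are just examples):

# Sketch: honour robots.txt Crawl-delay while fetching a handful of URLs.
import time
import urllib.request
import urllib.robotparser

USER_AGENT = "ExampleCrawler"  # example crawler name
parser = urllib.robotparser.RobotFileParser("https://example.com/robots.txt")
parser.read()

# crawl_delay() returns None when the directive is absent; fall back to a default.
delay = parser.crawl_delay(USER_AGENT) or 1.0

for url in ["https://example.com/a", "https://example.com/b"]:
    if parser.can_fetch(USER_AGENT, url):
        request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        with urllib.request.urlopen(request) as response:
            print(url, response.status)
        time.sleep(delay)  # be polite between requests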
You mentioned bit.ly; their robots.txt has no such restriction, and it contains a human-friendly comment saying "robots welcome". As long as you are not abusive, you probably won't have a problem with them. There are also comments in there stating that they have an API; using that API may be more useful than crawling.
As for defining abusive... well, unfortunately that's a very subjective thing, and there's not going to be any one right answer. You'd probably need to ask each specific vendor what their recommendations and limits are, if they don't provide this information through documentation on their site, robots.txt, or through an actual API, which itself may have well-defined access limits.

Cursors + Pagination & SEO

I would like to know if it's possible to paginate using cursors and keep those pages optimized for SEO at the same time.
/page/1
/page/2
Using offsets gives Googlebot some information about depth; that's not the case with cursors:
/page/4wd3TsiqEIbc4QTcu9TIDQ
/page/5Qd3TvSUF6Xf4QSX14mdCQ
Should I instead just use the cursor as a parameter?
/page?c=5Qd3TvSUF6Xf4QSX14mdCQ
Well, this question is really interesting, and I'll try to answer it thoroughly.
Introduction
A general (easy to solve) con
If you are using a pagination system, you're probably showing, for each page, a snippet of your items (news, articles, pages and so on). Thus, you're dealing with the famous duplicate content issue. On the page I've linked you'll find the solution to this problem too. In my opinion, this is one of the best things you can do:
Use 301s: If you've restructured your site, use 301 redirects ("RedirectPermanent") in your .htaccess file to smartly redirect users, Googlebot, and other spiders. (In Apache, you can do this with an .htaccess file; in IIS, you can do this through the administrative console.)
A little note on the general discussion: a few weeks ago Google introduced a "system" to help it recognise the relationship between paginated pages, as you can see here: Pagination with rel="next" and rel="prev".
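To tie that back to cursor-based pagination: you can still expose the sequence to crawlers by emitting those link elements on every page. Here is a minimal sketch, assuming a Flask-style handler and a made-up get_page helper, not a drop-in implementation:

# Sketch: expose rel="next"/rel="prev" on cursor-based pages (hypothetical helper).
from flask import Flask

app = Flask(__name__)

def get_page(cursor):
    """Hypothetical data-access helper returning items plus neighbouring cursors."""
    return {"items": ["first item", "second item"],
            "prev_cursor": None,
            "next_cursor": "5Qd3TvSUF6Xf4QSX14mdCQ"}

@app.route("/page/<cursor>")
def page(cursor):
    data = get_page(cursor)
    links = []
    if data["prev_cursor"]:
        links.append(f'<link rel="prev" href="/page/{data["prev_cursor"]}">')
    if data["next_cursor"]:
        links.append(f'<link rel="next" href="/page/{data["next_cursor"]}">')
    items = "".join(f"<li>{item}</li>" for item in data["items"])
    return f"<html><head>{''.join(links)}</head><body><ul>{items}</ul></body></html>"

if __name__ == "__main__":
    app.run()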
That said, I can now get to the core of the question. Each of the two solutions has its pros and cons.
As subfolder (page/1)
Cons: You lose link juice on the "page" page, because every page of your pagination system will be seen as an independent resource, since each has a different URL (in fact, you are not using parameters).
Pros: If your whole system uses '/' as the separator between parameters (which in a lot of cases is a good thing), this solution will give continuity to your system.
As parameter (page?param=1)
Cons: Though Google and the other search engines handle parameters without problems, you're letting them decide for you whether a parameter is important or not, and whether they should manage it or ignore it. Obviously this is true unless you decide how to handle the parameters yourself in each engine's webmaster tools panel.
Pros: You keep all the link juice on the "page" page, but this is actually not so important, because you want to pass the link juice to the pages that show the detailed items.
An "alternative" to pagination
As you can see, I posted a question on this website that is related to yours. To sum up, I wanted to know about alternatives to pagination. Here is the question (read the accepted answer): How to avoid pagination in a website to have a flat architecture?
Well, I really hope I've answered your question thoroughly.

Does PageRank mean anything?

Is it a measure of anything that a developer or even manager can look at and get meaning from? I know at one time, it was all about the 7, 8, 9, and 10 PageRank. But is it still a valid measure of anything? If so, what can you learn from a PageRank?
Note that I'm assuming that you have other measurements that you can analyze.
PageRank is specific to Google and is a trademarked proprietary algorithm.
There are many variables in the formulas used by Google, but PageRank is primarily affected by the number of links pointing to the page, the number of internal links pointing to the page within the site and the number of pages in the site.
One thing you must consider is that it's specific to a web page, not to a web site, so you need to optimize every page.
Google sends Googlebot, its indexing robot, to spider your website; the bot is instructed not to crawl your site too deeply unless the site has a reasonable amount of PR (PageRank).
In my experience, PageRank is an indicator of how many sites have recently linked to your site, but it is not necessarily connected to your position in Google's results, for example.
There were times when we increased our marketing and other sites linked to us, and the PageRank rose a bit.
I think the factors behind any SERP position change too much to put all your faith in just one of them. PageRank was very important, and still is to some degree, but how much is a question I can't answer.
Every link you put on a page passes some of that page's PageRank to wherever the link points; the more links, the less PageRank is passed to each one. Use rel="nofollow" on links to direct the PageRank flow in a more controlled manner.
The PageRank algorithm computes a probability distribution representing the likelihood that a person randomly clicking on links will arrive at any particular page. It is a relatively good approximation of the importance of a web page.
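To make the random-surfer description concrete, here is a small sketch of the basic power-iteration computation over a toy link graph (the graph is made up purely for illustration):

# Sketch: basic PageRank power iteration over a tiny, made-up link graph.
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if not outgoing:  # a dangling page spreads its rank evenly
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
        rank = new_rank
    return rank

graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
for page, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))

Pages with many incoming links from already well-ranked pages end up with the highest scores, which matches the intuition in the answers above.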

When the bots attack! [closed]

What are some popular spam prevention methods besides CAPTCHA?
I have tried 'honeypots', where you add a field and then hide it with CSS (marking it as 'leave blank' for anyone with stylesheets disabled), but I have found that a lot of bots get past it very quickly. There are also techniques like setting fields to a certain value and changing them with JS, measuring the time between page load and submit, checking the referrer URL, and a million other things. They all have their pitfalls, and pretty much all you can hope for is to filter as much as you can with them while not alienating the people you're here for: the users.
At the end of the day, though, if you really, really don't want bots to be sending things through your form, you're going to want to put a CAPTCHA on it. The best one I've seen that takes care of almost everything is reCAPTCHA, but thanks to the CAPTCHA-solving industry and the ingenuity of spammers everywhere, even that isn't successful all of the time. I would be wary of using something that is 'ingenious' but kind of 'out there', as it would be more of a 'WTF' for users who are at least somewhat used to the usual CAPTCHAs.
Shockingly, almost every response here includes some form of CAPTCHA. The OP wanted something different; I guess maybe he wanted something that actually works, and maybe even solves the real problem.
CAPTCHA doesn't work, and even if it did, it addresses the wrong problem: humans can still flood your system, and by definition CAPTCHA won't stop that (because it's only designed to tell whether you're human or not, and it doesn't even do that well...).
So, what other solutions are there? Well, it depends... on your system and your needs.
For instance, if all you're trying to do is limit how many times a user can fill out a "Contact Me" form, you can simply throttle how many requests each user can submit per hour/day/whatever. If your users are anonymous, maybe you need to throttle according to IP addresses, and occasionally blacklist an IP (though this too can be circumvented, and causes other problems).
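A sketch of that kind of per-IP throttle can be as simple as the following (the window and limit are arbitrary):

# Sketch: naive in-memory per-IP throttle for a contact form (limits are arbitrary).
import time
from collections import defaultdict

WINDOW_SECONDS = 3600    # look at the last hour
MAX_SUBMISSIONS = 5      # allow at most 5 submissions per IP per window

_submissions = defaultdict(list)  # ip -> timestamps of recent submissions

def allow_submission(ip):
    now = time.time()
    recent = [t for t in _submissions[ip] if now - t < WINDOW_SECONDS]
    if len(recent) >= MAX_SUBMISSIONS:
        _submissions[ip] = recent
        return False
    recent.append(now)
    _submissions[ip] = recent
    return True

# In the form handler:
#     if not allow_submission(client_ip):
#         return "Too many requests, try again later"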
If you're referring to a forum or blog comments (such as this one), well, the more I use this site the more I like its solution: a mix of authenticated users, authorization (based on reputation, which is not likely to be accumulated through flooding), throttling (how many posts you can make per day), the occasional CAPTCHA, and finally community moderation to clean up the few that get through. All of these combine to provide a decent solution. (I wonder if Jeff can provide some info on how much spam and how many other mal-posts actually get through...?)
Another control to consider (I don't know if they have it here) is some form of IDS/IPS: if you can detect and recognize spam, you can block that pattern. Moderation fills that need manually here...
Note that no single one of these prevents the spam; each incrementally lowers the probability, and thus the profitability. This changes the economic equation and lets a CAPTCHA actually provide enough value to be worthwhile, since it's no longer worth it for the spammers to bother breaking it or working around it (thanks to the other controls).
Give the user a simple calculation to perform:
What is the sum of 3 and 8?
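A tiny sketch of how such a question can be generated and checked server-side (keep the expected answer in the session, never in the form):

# Sketch: generate a simple arithmetic challenge and verify the answer server-side.
import random

def make_challenge():
    """Return the question text and the expected answer (store the answer in the session)."""
    a, b = random.randint(1, 9), random.randint(1, 9)
    return f"What is the sum of {a} and {b}?", a + b

def check_answer(submitted, expected):
    try:
        return int(submitted) == expected
    except (TypeError, ValueError):
        return False

question, answer = make_challenge()
print(question)                              # rendered in the form
print(check_answer(str(answer), answer))     # True when the user answers correctly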
By the way, I just came across an interesting approach from Microsoft Research: Asirra.
http://research.microsoft.com/asirra/
It shows you several pictures and you have to identify the pictures with a given motif.
Try Akismet
Captchas or any form of human-only questions are horrible from a usability perspective. Sometimes they're necessary, but I prefer to kill spam using filters like Akismet.
Akismet was originally built to thwart spam comments on WordPress blogs, but the API is capable of being adapted for other uses.
Update: We've started using the Ruby library Rakismet on our Rails app, Yarp.com. So far, it's been working great to thwart the spam bots.
A very simple method which puts no load on the user is just to disable the submit button for a second after the page has loaded. I used it on a public forum which had continuous spam posts, and it has stopped them ever since.
Ned Batchelder wrote up a technique that combines hashes with honeypots for some wickedly effective bot-prevention. No captchas, just code.
It's up at Stopping spambots with hashes and honeypots:
Rather than stopping bots by having people identify themselves, we can stop the bots by making it difficult for them to make a successful post, or by having them inadvertently identify themselves as bots. This removes the burden from people, and leaves the comment form free of visible anti-spam measures.
This technique is how I prevent spambots on this site. It works. The method described here doesn't look at the content at all. It can be augmented with content-based prevention such as Akismet, but I find it works very well all by itself.
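The gist of the approach, as a rough sketch rather than Ned's exact code: give each form a timestamp and a keyed hash of it, then reject posts whose hash doesn't verify, that arrive implausibly fast, or that fill in the honeypot field.

# Rough sketch of the hash + honeypot idea (not Ned Batchelder's exact implementation).
import hashlib
import hmac
import time

SECRET = b"server-side secret"  # placeholder; keep this out of the page source

def form_token():
    """Return (timestamp, signature) to embed in the form as hidden fields."""
    timestamp = str(int(time.time()))
    signature = hmac.new(SECRET, timestamp.encode(), hashlib.sha256).hexdigest()
    return timestamp, signature

def looks_like_bot(timestamp, signature, honeypot_value, min_seconds=5, max_seconds=3600):
    expected = hmac.new(SECRET, str(timestamp).encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return True   # token was forged or tampered with
    age = time.time() - int(timestamp)
    if age < min_seconds or age > max_seconds:
        return True   # submitted implausibly fast, or a stale replay
    if honeypot_value:
        return True   # the invisible honeypot field was filled in
    return False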
http://chongqed.org/ maintains blacklists of active spam sources and the URLs being advertised in the spams. I have found filtering posts for the latter to be very effective in forums.
The most common ones I've observed are oriented around having the user solve simple puzzles, e.g. "which of the following is a picture of a cat?" (displaying thumbnails of dogs surrounding a cat), or simple math problems.
While interesting, I'm sure the arms race will eventually overwhelm those systems too.
You can use reCAPTCHA to at least make a CAPTCHA useful. You can also ask questions with simple verbal math problems or similar. Microsoft's Asirra makes you find pictures of cats and dogs. Requiring a valid email address to activate an account stops spammers when they wouldn't get enough benefit from the service, but it might deter normal users as well.
The following is unfeasible with today's technology, but I don't think it's too far off. It's also probably overkill for dealing with forum spam, but could be useful for account sign-ups, or any situation where you wanted to be really sure you were dealing with humans and they would be prepared for it to take a few minutes to complete the process.
Have 2 users who are trying to prove themselves human connect to each other via their webcams and ask them if the person they are seeing is human and live (i.e. not a recording), by getting them to, for example, mirror each other's movements, or write something on a piece of paper. Get everyone to do this a few times with different users, and throw a few recordings into the mix which they also have to identify correctly as such.
A popular method on forums is to simply hold the threads of members with fewer than 10 posts in a moderation queue. Of course, this doesn't help if you don't have moderators, or if it's not a forum. A more general method is calculating hyperlink-to-text ratios: spam posts often contain a ton of hyperlinks, and you can catch a lot of them this way. In the same vein is comparing the content of consecutive posts; simply do not allow consecutive posts that are extremely similar.
Of course, anyone with knowledge of the measures you take is going to be able to get around them. To be honest, there is little you can do if you are the target of a specific attack. Rather, you should focus on preventing more general, unskilled attacks.
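As one concrete take on the hyperlink-to-text idea mentioned above, a simple check might look like this (the threshold is arbitrary):

# Sketch: flag posts whose hyperlink-to-text ratio is suspiciously high.
import re

LINK_PATTERN = re.compile(r"https?://\S+|<a\s", re.IGNORECASE)

def looks_spammy(post_text, max_links_per_100_chars=1.0):
    if not post_text:
        return False
    links = len(LINK_PATTERN.findall(post_text))
    return links / (len(post_text) / 100.0) > max_links_per_100_chars

print(looks_spammy("Buy now http://a.example http://b.example http://c.example !!!"))  # True
print(looks_spammy("Here is a long, genuine reply that mentions http://example.com once "
                   "in passing, along with plenty of ordinary prose around it."))       # False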
For human moderators it surely helps to be able to easily find and delete all posts from some IP, or all posts from some user if the bot is smart enough to use a registered account. Likewise the option to easily block IP addresses or accounts for some time, without further administration, will lessen the administrative burden for human moderators.
Using cookies to make bots and human spammers believe that their post is actually visible (while only they themselves see it) prevents them (or trolls) from changing techniques. Let the spammers and trolls see the other spam and troll messages.
Javascript evaluation techniques like this Invisible Captcha system require the browser to evaluate Javascript before the page submission will be accepted. It falls back nicely when the user doesn't have Javascript enabled by just displaying a conventional CAPTCHA test.
Animated CAPTCHAs: scrolling text, which is still easy for humans to recognize as long as you make sure that no single frame offers something complete to recognize.
Multiple-choice question: "All it takes is a ______ and a smile." The idea here is that the user has to choose or understand the answer.
Session variable: check that a variable you put into the session is present in the request. This will foil dumb bots that simply generate requests, but probably not bots that are modeled on a browser.
Math question: "2 + 5 =". Again, the point is to ask a question that is easy for a human to solve but prevents a bot from generating a response.
Image grid: create a grid of images, say a 3x3 grid of animal pictures, and have the user pick out all the birds in the grid.
Hope this gives you some ideas for your new solution.
A friend has the simplest anti-spam method, and it works.
He has a custom text box which says "please type in the number 4".
His blog is rather popular, but still not popular enough for bots to figure it out (yet).
Please remember to make your solution accessible to those not using conventional browsers. The iPhone crowd are not to be ignored, and those with vision and cognitive problems should not be excluded either.
Honeypots are one effective method. Phil Haack gives one good honeypot method that could in principle be used for any forum/blog/etc.
You could also write a crawler that follows spam links and analyzes the linked page to see whether it's a genuine link or not. The most obvious sign would be a page with an exact copy of your content, but you could pick out other indicators too.
Moderation and blacklisting, especially with plugins like these ones for WordPress (or whatever you're using, similar software is available for most platforms), will work in a low-volume environment. If your environment is a low volume one, don't underestimate the advantage this gives you. Personally deciding what is reasonable content and what isn't gives you ultimate flexibility in spam control, if you have the time.
Don't forget, as others have pointed out, that CAPTCHAs are not limited to text recognition from an image. Visual association, math problems, and other non-subjective questions relayed through an image also qualify.
Sblam is an interesting project.
Invisible form fields: make a form field that doesn't appear on screen to the user, using display: none as a CSS style so that it doesn't show up. For accessibility's sake, you could even add hidden text so that people using screen readers know not to fill it in. Bots almost always fill in all fields, so you can block any post that fills in the invisible field.
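Server-side, the check is then trivial. Here is a sketch with a Flask-style handler; the field name and markup are only illustrative:

# Sketch: render an invisible honeypot field and reject posts that fill it in.
from flask import Flask, request

app = Flask(__name__)

FORM_HTML = """
<form method="post" action="/comment">
  <textarea name="comment"></textarea>
  <!-- Hidden from sighted users; the label tells anyone who still sees it to skip it. -->
  <div style="display: none">
    <label for="website_url">Leave this field blank</label>
    <input type="text" id="website_url" name="website_url">
  </div>
  <button type="submit">Post</button>
</form>
"""

@app.route("/comment", methods=["GET", "POST"])
def comment():
    if request.method == "POST":
        if request.form.get("website_url"):  # bots tend to fill in every field
            return "Rejected", 400
        return "Thanks for your comment!"
    return FORM_HTML

if __name__ == "__main__":
    app.run()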
Block access based on a blacklist of spammers' IP addresses.
Honeypot techniques put an invisible decoy form at the top of the page. Users don't see it and submit the correct form; bots submit the wrong form, which does nothing or gets their IP banned.
I've seen a few neat ideas along the lines of Asirra which ask you to identify which pictures are cats. I believe the idea originated with KittenAuth a while ago.
Use something like the Google Image Labeler with appropriately chosen images, such that a computer wouldn't be able to recognise the dominant features that a human could.
The user would be shown an image and would have to type words associated with it. They would keep being shown images until they have typed enough words that agreed with what previous users had typed for the same image. Some images would be new ones that they weren't being tested against, but were included to record what words are associated with them. Depending on your audience you could also possibly choose images that only they would recognise.
Mollom is supposedly good at stopping spam. Both personal (free) and professional versions are available.
I know some people mentioned Asirra, but if you follow the "adopt me" links for the images, the linked page says whether it's a cat or a dog, so it should be relatively easy for a bot to just follow all of those links. It's only a matter of time for that project.
Just verify the email address and let Google/Yahoo etc. worry about it.
You could get some device-ID software; the41 has fraud-prevention software that can identify the hardware being used to access your site. I believe they use it to catch fraudsters, but it could also be used to stop bots: once you have identified a device being used by a bot, you can just block that device. Last time I checked, it could even trace your route through the phone network (not your Geo-IP!), so you can even block a postcode if you want.
It's expensive, though, so probably a better option is a cheaper solution that is a little less Big Brother.
