Online test security measures - security

I'm developing a feature for a client in which users voluntarily take an important test online. The test is difficult and the users will be highly motivated to do well (think SATs or GRE, etc)... so there's also a high incentive to cheat. Apparently there are 3rd party services in which a human virtually monitors the test taker via a webcam, but they're really expensive and we don't quite have the budget. We still need to make it as hard as possible for a user to game the system. Some of the things we suspect they might try are:
Getting someone else to take the test for them (a pinch hitter).
Taking the test multiple times with different profiles to practice
and gain an unfair advantage.
Taking the test alongside friends or while in contact with a friends
to tell them the answers.
The question order will change, as well as the order of the answers. The test will be timed, and an "open book" format, so we're not really worried about the user looking things up online, but we can't have them sharing their screen and having others assist them. So the main concern at this point is ensuring that the user is, in fact, who they say they are (and not someone else).
Here are a few of the security measures we're considering:
Requiring the user's device to have a webcam, which we'll activate and either record/photograph the user during the test (with the user's consent of course).
Asking users to verify an arbitrary bank deposit amount (presumably via PayPal). There's nothing to stop them from opening up multiple bank accounts, but at least it's a big hassle.
Really scary terms of use that threaten legal action if the user is caught cheating.
QUESTION: Are there any other measure we can/should take to make sure our test is secure and the results are reliable?
CLARIFICATION: We realize that with enough resources and determination, any security system can eventually be beaten. The goal of this question is not to find a magically unbeatable solution, but to find ways to raise the stakes enough so that it won't be worth it for most users to cheat. In this spirit, I'd much prefer answers that focus on what can be done as opposed to what can't.

As you know there are many ways of cheating. Your goal is limit the possibility of cheating as much as possible. Cheating in online courses has been a hot topic.
A pinch hitter:
This type of attack can be conducted a number of ways. Even if you have a cam looking at the person, the video that the test taker is seeing could be mirrored on another screen. A pinch hitter could see the question and just read him the answers or otherwise feed answers the test taker in a covert channel.
Possible counters to this attack is to also enable the mic to see if they are talking to anyone. You can also record the screen while they take the test. This could prevent them from opening a chat window or viewing other unauthorized content. (Kind of like the Elance tracker)
user verification:
In order to register the person should attach a scanned copy of their photo-id. This way you are linking a photo of the person to a unique identifier, such as a drivers license number. Before the person starts taking the test, ask the user to look directly at the camera and make sure you get a good image of them that can be verified against their photo id.
A simple attack against this system is to use photoshop to modify the id. To make this attack more difficult you could verify their name against a credit/debit card transaction. The names should match on both cards.
An evercookie could be used to track machines to see if the same computer is being used. This could happen though legitimate reasons, but it could also be used to flag tests for further review. A variant on the evercookie is to drop a file with a random value or set a registry key with a random value to "mark" that machine.

Related

securing http transaction in browser games

I want to build a system where many people can play browser based simple (swf or html5) games (like snakes, car race, ludo etc). These games will be dead simple and does not require logic to be written in server.
The person who makes the highest score will win some prize.
Now the issue is securing the workflow. My approach may my incorrect, please feel free to suggest any alternative.
1. When game begins, a game id is generated.
2. When game ends, score and game id is sent back.
Problem is, Step 2 can be spoofed. Any one can send similar http request, with generated game id and any score. How do I know that score is coming from my game and its not being sent directly.
I thought of encrypting the score, but again the encryption mechanism will be there in javascript, which can easily be replicated.
Please help, thank in advance.
EDIT 1: I am not worried about sniffers, session hijacking, man in the middle attack. HTTPS will take care of all that. I am worried about user itself. He can just right click > inspect in a browser and check the request header being sent. He can easily replay same request header (just by right clicking and open in a new tab)
You can add layers of submission signing/encoding to obscure the data, and JS code obfuscation to make it harder for a user to undo that level of protection.
But ultimately you cannot solve the problem. You are placing the trust to run your code in the expected way to an untrusted client, and that naturally gives untrustable output. Spoofing the score is only one point of entry - an attacker could just as easily tamper with other parts of the client-side game logic to cheat.
Trusting the client is a perennial security problem to which there is little possible solution. For a trivial high score table that no-one cares about, you can potentially get away with a quantity of obfuscation to keep off the casual attacker, coupled with manual monitoring and pruning for obviously fraudulent submissions.
If you are considering offering actual desirable prizes, you may need to move some essential part of the game onto servers which you control. What pieces you need to protect and how is highly game-dependent and there are likely to be residual attacks that are difficult to defend against. (Consider for example the lengths that commercial multiplayer server-based games go to try to defend against client hacks like aimbots.)
I guess it depends upon the value of the data, but it sounds like the value is very low.
Pick a functionally impossible to guess ID. For example, a GUID (128 bit randomly generated value). If someone were to guess it, they get the game. But it's statistically impossible that they will ever get it right. And if you send it over HTTPS then they can't sniff it out (I'd send it over HTTPS, just because).
If you want to do better you can pick a random password to go with it, but I question if the math supports this as being necessary. It sounds like your data is crazy low value. Does it really matter?
FWIW, the bigger thing I'd be worried about is the person who decides to DoS you by asking you to store tons and tons of data for games that didn't exist...

How to deal with clients and iterations in Agile team? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
This thread is a follow up to my previous one. It's in fact 2 questions, so I hope no one minds, as they are dependent on each other.
We are starting a new project at work and we consider it as a great opportunity to try Agile techniques in action. We had a brainstorming about ideas we read in several books and articles, and came up with concept that would suit us the best: 2 weeks iteration, followed by call with clients who would choose what stuff they want to have in next iteration. I just have few more questions, which we couldn't figure out ourselves.
What to do in the first iteration?
What to, generally, do in the first few iterations if we start from the scratch? Just give it a month of development to code core of the application or start with simple wire-frames with limited pre-coded functionality? What usually clients want to see? Shiny stuff that doesn't work or ugly stuff that does work?
How to communicate with clients?
Our initial thought it to set the process to something like this:
alt text http://img690.imageshack.us/img690/2553/communication.png
Is it a good idea to have a Focal Point on client side or is it better to communicate straight with all the clients to prevent miscommunication?
Any thoughts are welcome! Thanks in advance.
In my opinion, a key success factor for agile development is to focus on delivering value for the customer in each iteration. I would definitely pick "ugly stuff that does work" over "shiny stuff that doesn't work". Doing shiny UIs and trying to get the client to understand hat business logic takes a lot of time to implement is always risky which Joel Spolsky has written a good article about.
If the client wants enhancements to the UI, they can always put that as a requirement for the next iteration.
Regarding communication with clients I think that your scetch should be slightly adjusted. Talking in scrum terms your "focal point" is called "product owner". Having one person coordinating with the clients is good, as it can take quite a lot of time to get the different stakeholders agree on the needs. However the product owner (or focal point) should be in direct contact with the developer, without going through the project manager. In fact, the product owner and the project manager has quite distinct roles that gain a lot by being split on two people.
The product owner is the stakeholders' voice to the development team. The project manager on the other hand is responsible for the wellbeing of the project team and often keeps track of budget etc. These roles sometimes has opposing agendas, and having them split on two people gives a healthy opportunity for negotiation between conflicting interests. If one person has both roles, that person often tend to favour one of them, automatically reducing the other one. You don't want to work on a team where the project manager always puts the client before the team's needs. On the other hand no customer wants a product owner that always puts the team's needs first, neglegting the customer. Splitting the responsibilities on two people helps to remedy that situation.
I'd agree with Anders answer. My one extra observation is that many clients find it impossible to ignoire the Ugly. They get concerned about presentation rather than function. Hence you may need to bite the bullet and do at least one "Nice" screen to show that you will pay attention to presentation details.
What to, generally, do in the first few iterations if we start from the scratch?
Many teams use an Iteration Zero to:
setup the development infrastructure (source control, development machines, the automated build, a continuous integration process, a testing environment, etc),
educated the customer and agree with him on the methodology,
create an initial list of features, identify the most important and do an initial estimation,
define time of meetings (planning meeting, demo, retrospective), choose the the iteration length.
Iteration Zero is very special because it doesn't deliver any functionality to the customer but focus on what is necessary to run the next iterations in an agile way. But subsequent iterations should start to deliver value to the customer.
Just give it a month of development to code core of the application or start with simple wire-frames with limited pre-coded functionality?
No, don't develop the core of your application during one month. Instead, start delivering vertical slice of the application (from the UI to the database) immediately, not horizontal slices. This doesn't mean that a screen has to be complete (e.g. implement only one search field in a search screen) but it should ideally be representative of the final look & feel (unless you agreed with the customer on an intermediate step). The important part is to build things that provide immediate value to the customer incrementally.
What usually clients want to see? Shiny stuff that doesn't work or ugly stuff that does work?
To my experience, they want to see demonstrable progresses and you want to get feedback as soon as possible.
Is it a good idea to have a Focal Point on client side or is it better to communicate straight with all the clients to prevent miscommunication?
You need one person to represent the clients (who is called the Product Owner in Scrum):
he provides a single authoritative voice
he has a perfect knowledge of the business (i.e. he can answer questions)
he knows how to maximize the ROI (i.e. how to prioritize functionalities)
Agile generally wants to provide the client something valuable, quickly.
So I certainly would not spend "month of development to code core of the application". To me, that smells of the "big up front design" anti-pattern. Also, see YAGNI.
Get as much information from the clients about what they need soonest, and implement that in your first iteration. "Valuable" is in the eye of the client. Thet will know if they want to see slick UI (maybe they want to give a slide show about the product at a trade show, so functionality can be fake) or simple working features (maybe you're developing something that they need to start using ASAP). Business Value is what they say will help them do their job.
I'd make my iterations as short as I can (your 2 weeks could work, I suggest considering 1 week) If you absolutely can't have your dev team and your clients co-located, instead of having a call with the clients, I suggest a meeting. Demo what you've done over the previous iteration and solicit feedback about what should stay, what should change, and what should be added.
As others have said, your "Focal point" sounds like a Product Owner. What worries me about your drawing is if it is meant to imply that devs don't interact with the PO or the clients. One thing that makes Agile work is when there is lots of communication. Having communication to/from the dev team always filtered through the Project Manager is almost certainly bound to result in miscommunication, unnecessary work, and missed details.
I agree with the two answers given but I would just add one thing from personal experience. Are your customers bought in to the change towards quick iterations? As well as providing feedback after each iteration which is going to require the customer performing usability tests on each feature.
Now I don't know what your groups relationship is with your customer but its not unusual for customers to take a "Put request in - get working system out" attitude in that they are enthusiastic when giving requirements but not so forthoming with time when it comes to testing the feature.
Now this may be totally inappropriate to your situation but its always worth considering how your customer workflow will have to change as well as your groups.
Cheers

How can you prevent bogus high scores from appearing on a global high score list?

Suppose you are designing a PC game that keeps track of high scores. In addition to keeping local scores, a global high score server is set up that is accessed by the game over the internet. Players should be able to submit their high scores to the global high score list right after they have completed a game, or later, from their local high score list. This must be a common problem; arcade games on modern game consoles often feature a global high score list that works like this.
My question boils down to: how can you prevent someone from submitting bogus high scores? Or, stated another way, how can the global high score server be sure that a submitted score was really produced by a run through the game?
The more I thought about this, the more I think it may be an unsolvable problem.
What you'd commonly do to verify that a message originated from a certain source is have the source digitally sign the message. You could certainly do that in this case, but the real problem is that the player, by having the software, also has the software's private key. No matter how obfuscated it might be, it can be reverse engineered, or even just plucked from memory.
Another option would be to send along a replay of the player's game to the high score server, which would quickly run the replay and verify that the submitted score matches the outcome of the replay. This doesn't solve the problem, but it certainly makes it more difficult to forge a bogus high score if you also have to produce a very complex replay that "proves" it.
Is this a problem that has a solution, or is it really unsolvable? Are there techniques used by the home game console developers to prevent this sort of exploit, or do they simply rely on the console preventing unauthorized code from running?
For a PC game where the score is generated by the client, this is not a solvable problem since the client is inherently untrustworthy.
The only thing you can try to do is make it harder for someone to submit a fake score.
Some thoughts:
Protect the in-memory score.
You can use API's like CryptProtectMemory to hide the score in memory - a simple memory write will not work. However, with an attached debugger or via injecting code into your process, they could still modify the score. You can look into various schemes for trying to defeat debuggers.
Protect the score en-route to the server.
You can encrypt the data being sent to the service, as you suggest, but since the untrusted party has control over the key, this is merely obfuscation and offers no solid protection.
Validate the score at the service.
I'd be loathe to do this, beyond very simple checks. A bug here will result in you rejecting valid scores. It'll be impossible to distinguish between cheaters and strong players.
At this point, you really have to ask your self if the engineering effort is really worth it. What if someone posts an invalid score? What do you actually lose? I would do the bare minimum to protect against a kid with a simple script. I.e., don't have your score submission be just:
http://myservice.com/submitscore.aspx?PlayerName=Michael&Score=999999999
I would just use simple protection in memory against casual snoops, some simple obfuscation on the wire (hash the score with a server cookie) and be done.
To my knowledge, this is unsolvable.
I have seen many people try to obfuscate the encryption key. I have seen a few people include other sanity checks like time elapsed, or enemies remaining. I have never seen one that sends a replay, though of course it is possible.
In a website that will remain unamed, they setup a fake send high score routine that is easily found. If a perpetrator uses it, their IP address will be automatically banned from future updates.
"Another option would be to send along a replay of the player's game ... This doesn't solve the problem, "
Really? Why not?
"you also have to produce a very complex replay that "proves" [the score]."
Are you suggesting someone could fake the replay? Reason out a super-high-score solution without actually playing the game? Fake the timestamps and everything? Pull a Donald Crowhurst?
Why not just play the game rather than attempt to fake a log of playing the game?
Or... if it's that easy to fake a history of game play that leads to a super high score, perhaps the game is misdesigned. If the game is "hard", the person must make all the right choices. If the game is easy, then the score doesn't reflect the player's choices and is fakable.
Think of it this way. Pick any game or sport. Someone says that -- say -- Switzerland beat New Zealand in a yacht race. You would challenge them by seeking substantiating details on the venue, the boats, the teams and the individual races to convince yourself it was true. Certainly, they could fake it, but if they've got a rich set of details covering the race, then... how's that not "proof"?
Forget the cryptogaphic issue; if your code can be hacked on the local machine, how do you keep someone from setting a crazy high score artificially and then transmitting it using the existing trusted mechanism?
Until you can establish trust in the code itself, a crypto mechanism for the communications isn't going to be the real problem.
Send the hash of the (software + a random salt) along with the score. Check this hash with the server. If it matches (meaning that the software is unaltered) accept it. Otherwise, the score comes from a "modded" version of the game. Reject it. The random salt should change every time the hash is generated (current sys time or something like that)
Check Quake III Arena source code. Their version was quite fool proof. Unfortunately, I don't remember the link now.

What should I do when my boss tells me to make passwords the same as usernames by default in our software?

My boss is against requiring our users to have secure passwords, even going so far to request they be setup by default to have passwords the same as their username. What should I do in this situation? What would you do?
Update - Some users have brought up the question of whether the application needs high security. This isn't credit card information for example but does include sensitive information and a mailing list management and sending functionality.
Make the best case you can for strong passwords and then, unfortunately, if they do not see your point of view either do what they asked or find a better job.
What you're told.
...
Then respecfully let the superior know in writing what problems that will cause.
Do not CC anyone. This is my opinion, of course. If you CC it will look obvious. You really just want security but you have to cover yourself. You don't have to be a horse's behind about it though.
Keep it in your sent box, print it, whatever, if you are truly concerned.
edit - You do what you're told unless it is some sort of question of moral turpitude. Then you simply document what you did and why you did it. Just remember that if you do not document it - it did not happen. Documenting is something you should always be doing.
As a compromise there are way better defaults, like using the user's serial number, year of birth, initials, some combination, depending what you have on hand. Not the most secure but not the least either.
Does your application require high security? If the data controlled by your software is not sensitive and the risk to the user is low, perhaps you really don't need strong passwords.
If your app does pose a significant risk to the user if passwords are allowed to be weak, you should make that case as best you can, in writing. If you can quantify risk and liability, do so, but ultimately you will have to leave the decision up to your superiors.
There's nothing wrong with a default password the same as the username provided that the system requests that the user creates a new password the first time the user logs in. You then allow anything as a password if there is low security requirement. If you're handling sensitive data then password strength needs to be of an appropriate level. You haven't said what data you're hiding. There's no point in having super strong passwords (12 chars, lower case, upper case, digit and symbol and no words from dictionary) if it's an intranet based time tracking system. If you're accessing something like a tax record database then you'd need at least two level authentication - string password and one time key generation.
You should hit him hard. Explain him/her what sort of bad publicity might happen because of this, also depends on the data, data protection act and similar stuff can actually cause serious liability. Basically doing it such can be considered as a software defect therefore company can be responsible for the results.
Basically you need to give him a reason which will bite him, scare him. That's how you sell security and insurance :)
If you boss can't figure out such a simple thing and can't trust guys like you at the end, maybe you should start looking for a new place which you can actually use your own potential instead of dealing with these sort of issues.
This is poor security.
If it can result in, for example, identify theft for your users, then you have a very serious social responsibility to improve the security. You are essentially dealing with people's lives. Go to your boss, go to his or her boss. Print out these comments and bring them along. Go to your legal department and tell them how much exposure this causes. If your company was dumping toxic waste whistle blower laws would apply. Personal information and identify theft is no less serious. Do everything in writing to cover yourself and to provide a paper trail of evidence for the lawsuits that will surely follow. Don't allow your company to deny any knowledge of the risk after the fact. Companies that knowingly implement horrible security that results in identify theft should fail in the market place and deserve nothing but shame, ridicule and failure.
If on the other hand this poor security can result in comparatively minor things then your your effort to improve the security can also be scaled back from what I describe above.
Email him your concern (in a non-aggressive way). Give the logical attack vector, reveal what will be exposed. Close by asking for his confirmation taht this is his instruction. Then send to him (only him, as previously suggested)
Email archive both your original email and his confirmation. This will cover you if something happens.
Argue the case for having stronger passwords but also make a compromise. Have the passwords as defaulting to the username with certain letters replaced with numbers perhaps? This all depends on the system as well. If this is an internal system, it could be quite hard for somebody to gain access to the system & do any harm.
Do what your boss says, but make the passwords expire within a relatively short time period.
I would put together a summary document on password policies, benefits of strong passwords, etc and submit it to him for review and try and make it part of company policy. If they still don't like it then do what they ask, as they are the end client and you have done your part to educate them of the pitfalls.
why using user/pasword in the first?
to log user activity?
the operating system asks for it?
if you want to connect an action (whatever) with an user, I as an user would require that my password be safe!
if your boss is afraid, that he may loose "knowledge", if a user is away, and he needs to get access to that uesers data, require everyone to write down his password in a sealed envelope.
if your boss does not trust you, kündige!
Peter
I would consider what is behind the request to have it that way first.
Is it really an active user with username+password what should be being set up in the first place? i.e. perhaps the user should be receiving an email with a link to activate :)
When does the sensitive information comes into the system? Assuming it is input by the user, just have an activation step where the user changes the password (or is the first time (s)he has a password for that matter).
Notice that if you are working with sensitive information, it is likely there is a law relating to it. I would also look into that, if it is illegal it makes for a strong case, and in that case you should Really consider saying plain no (explaining the reason first of course).
Did he say they had to be all lowercase...
Did he explicitly say they had to not include numbers...
You should hack into his account. Then he will know why username=password dont work.
I've run into this before, where they didn't want to use a secure password &/or lock down their computers.
Then it happened our website had been hacked into (not b/c of a password breach, but b/c of flawed component/module for the CMS that we used - but that's a different story) and in a few different occasions, people have logged into the exec's computer to view a few inappropriate things.
The reason for this explanation is to say that it wasn't until this and a few other case studies that I brought to their attention that they understood just how important it is for secure passwords.
As a solution, you may try to do some research on case studies where a breach has occurred on systems or sites where the information stored or protected wasn't terribly important, but the damaged cause and money it took to recover was substantial - such as someone setting up a phishing scam on your site, the holding hostage a server or site & having to wipe clean the whole box to start over, or some other type of breach.
Anyways, take it for what it's worth.
A few things come to mind that you might want to share with your boss -
The biggest security threat isn't outsiders, it's the folks you work with. If there has been someone fired with cause since you've been there, bring that up with your boss - "What if XXXX had access to other people's accounts?" That person many not steal data, but they may try to vandalize the system or mess with data out of spite. Or could they even share that data with a competitor?
Propose a somewhat stronger default as a compromise - username and 4 digits of home phone number. It's not much stronger but it does make guessing a little harder.
People can make fairly secure passwords by using mnemonics. However, you need to train people in how to do that. Offer to hold a session with your users on how to create secure passwords. Honestly, it's not just good for where they work but for anyone who shops or banks online. Something that's easy for IT people who have to juggle multiple passwords may be harder for others.
BTW, I found a nice javascript generator of mnemonic passwords.
http://digitarald.de/playground/mnemonic-password-generator/
I've found situations where a password is shared by several people, because sometimes security is less important than other stuff. Specially in intranets.
A solution can be to store the IP address of each user. It's a security measure closer to security cameras than to locks, but it might be enough for what your boss has in mind.
Slough may be onto something - but it might be too harsh.
Maybe take a combination approach.
Do what`s asked - but when presenting it, make sure it breaks, or you have some mechanism that will show how or why it is not a secure approach. (This will go through a review process before being implemented right?)
Also find any documentation that describes "best coding practices" from respected industry peers either in books or online or even office colleagues that may be able to back up your point of view. Present your sources, and if their ignored, you've done your duty and due diligence, and the final outcome will rest on the superiors shoulders.

When the bots attack! [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
What are some popular spam prevention methods besides CAPTCHA?
I have tried doing 'honeypots' where you put a field and then hide it with CSS (marking it as 'leave blank' for anyone with stylesheets disabled) but I have found that a lot of bots are able to get past it very quickly. There are also techniques like setting fields to a certain value and changing them with JS, calculating times between load time and submit time, checking the referer URL, and a million other things. They all have their pitfalls and pretty much all you can hope for is to filter as much as you can with them while not alienating who you're here for: the users.
At the end of the day, though, if you really, really, don't want bots to be sending things through your form you're going to want to put a CAPTCHA on it - best one I've seen that takes care of mostly everything is reCAPTCHA - but thanks to India's CAPTCHA solving market and the ingenuity of spammers everywhere that's not even successful all of the time. I would beware using something that is 'ingenious' but kind of 'out there' as it would be more of a 'wtf' for users that are at least somewhat used to your usual CAPTCHAs.
Shocking, but almost every response here included some form of CAPTCHA. The OP wanted something different, I guess maybe he wanted something that actually works, and maybe even solves the real problem.
CAPTCHA doesn't work, and even if it did - its the wrong problem - humans can still flood your system, and by definition CAPTCHA wont stop that (cuz its designed only to tell if you're a human or not - not that it does that well...)
So, what other solutions are there? Well, it depends... on your system and your needs.
For instance, if all you're trying to do is limit how many times a user can fill out a "Contact Me" form, you can simply throttle how many requests each user can submit per hour/day/whatever. If your users are anonymous, maybe you need to throttle according to IP addresses, and occasionally blacklist an IP (though this too can be circumvented, and causes other problems).
If you're referring to a forum or blog comments (such as this one), well the more I use it the more I like the solution. A mix between authenticated users, authorization (based on reputation, not likely to be accumulated through flooding), throttling (how many you can do a day), the occasional CAPTCHA, and finally community moderation to cleanup the few that get through - all combine to provide a decent solution. (I wonder if Jeff can provide some info on how much spam and other malposts actually get through...?)
Another control to consider (dont know if they have it here), is some form of IDS/IPS - if you can detect and recognize spam, you can block THAT pattern. Moderation fills that need manually, here...
Note that any one of these does not prevent the spam, but incrementally lowers the probability, and thus the profitability. This changes the economic equation, and leaves CAPTCHA to actually provide enough value to be worth it - since its no longer worth it for the spammers to bother breaking it or going around it (thanks to the other controls).
Give the user the possibility to calculate:
What is the sum of 3 and 8?
By the way: Just surfed by an interesting approach of Microsoft Research: Asirra.
http://research.microsoft.com/asirra/
It shows you several pictures and you have to identify the pictures with a given motif.
Try Akismet
Captchas or any form of human-only questions are horrible from a usability perspective. Sometimes they're necessary, but I prefer to kill spam using filters like Akismet.
Akismet was originally built to thwart spam comments on WordPress blogs, but the API is capabable of being adapted for other uses.
Update: We've started using the ruby library Rakismet on our Rails app, Yarp.com. So far, it's been working great to thwart the spam bots.
A very simple method which puts no load on the user is just to disable the submit button for a second after the page has been loaded. I used it on a public forum which had continuous spam posts, and it stopped them since.
Ned Batchelder wrote up a technique that combines hashes with honeypots for some wickedly effective bot-prevention. No captchas, just code.
It's up at Stopping spambots with hashes and honeypots:
Rather than stopping bots by having people identify themselves, we can stop the bots by making it difficult for them to make a successful post, or by having them inadvertently identify themselves as bots. This removes the burden from people, and leaves the comment form free of visible anti-spam measures.
This technique is how I prevent spambots on this site. It works. The method described here doesn't look at the content at all. It can be augmented with content-based prevention such as Akismet, but I find it works very well all by itself.
http://chongqed.org/ maintains blacklists of active spam sources and the URLs being advertised in the spams. I have found filtering posts for the latter to be very effective in forums.
The most common ones I've observed orient around user input to solve simple puzzles e.g. of the following is a picture of a cat. (displaying pictures of thumbnails of dogs surrounding a cat). Or simple math problems.
While interesting I'm sure the arms race will also overwhelm those systems too.
You can use Recaptcha to at least make a captcha useful. Then you can make questions with simple verbal math problems or similar. Microsoft's Asirra makes you find pics of cats and dogs. Requiring a valid email address to activate an account stops spammers when they wouldn't get enough benefit from the service, but might deter normal users as well.
The following is unfeasible with today's technology, but I don't think it's too far off. It's also probably overkill for dealing with forum spam, but could be useful for account sign-ups, or any situation where you wanted to be really sure you were dealing with humans and they would be prepared for it to take a few minutes to complete the process.
Have 2 users who are trying to prove themselves human connect to each other via their webcams and ask them if the person they are seeing is human and live (i.e. not a recording), by getting them to, for example, mirror each other's movements, or write something on a piece of paper. Get everyone to do this a few times with different users, and throw a few recordings into the mix which they also have to identify correctly as such.
A popular method on forums is to simply queue the threads of members with less than 10 posts in a moderation queue. Of course, this doesn't help if you don't have moderators, or it's not a forum. A more general method is the calculation of hyperlink to text ratios. Often, spam posts contain a ton of hyperlinks, and you can catch a lot this way. In the same vein is comparing the content of consecutive posts. Simply do not allow consecutive posts that are extremely similar.
Of course, anyone with knowledge of the measures you take is going to be able to get around them. To be honest, there is little you can do if you are the target of a specific attack. Rather, you should focus on preventing more general, unskilled attacks.
For human moderators it surely helps to be able to easily find and delete all posts from some IP, or all posts from some user if the bot is smart enough to use a registered account. Likewise the option to easily block IP addresses or accounts for some time, without further administration, will lessen the administrative burden for human moderators.
Using cookies to make bots and human spammers believe that their post is actually visible (while only they themselves see it) prevents them (or trolls) from changing techniques. Let the spammers and trolls see the other spam and troll messages.
Javascript evaluation techniques like this Invisible Captcha system require the browser to evaluate Javascript before the page submission will be accepted. It falls back nicely when the user doesn't have Javascript enabled by just displaying a conventional CAPTCHA test.
Animated captchas' - scrolling text - still easy to recognize by humans but if you make sure that none of the frames offer something complete to recognize.
multiple choice question - All it takes is a ______ and a smile. idea here is that the user will have to choose/understand.
session variable - checking that a variable you put into a session is part of the request. will foil the dumb bots that simply generate requests but probably not the bots that are modeled like a browser.
math question - 2 + 5 = - this again is to ask a question that is easy to solve but prevents the bots ability to generate a response.
image grid - you create grid of images - select 1 or 2 of a particular type such as 3x3 grid picture of animals and you have to pick out all the birds on the grid.
Hope this gives you some ideas for your new solution.
A friend has the simplest anti-spam method, and it works.
He has a custom text box which says "please type in the number 4".
His blog is rather popular, but still not popular enough for bots to figure it out (yet).
Please remember to make your solution accessible to those not using conventional browsers. The iPhone crowd are not to be ignored, and those with vision and cognitive problems should not be excluded either.
Honeypots are one effective method. Phil Haack gives one good honeypot method, that could be used in principle for any forum/blog/etc.
You could also write a crawler that follows spam links and analyzes their page to see if it's a genuine link or not. The most obvious would be pages with an exact copy of your content, but you could pick out other indicators.
Moderation and blacklisting, especially with plugins like these ones for WordPress (or whatever you're using, similar software is available for most platforms), will work in a low-volume environment. If your environment is a low volume one, don't underestimate the advantage this gives you. Personally deciding what is reasonable content and what isn't gives you ultimate flexibility in spam control, if you have the time.
Don't forget, as others have pointed out, that CAPTCHAs are not limited to text recognition from an image. Visual association, math problems, and other non-subjective questions relayed through an image also qualify.
Sblam is an interesting project.
Invisble form fields. Make a form field that doesn't appear on the screen to the user. using display: none as a css style so that it doesn't show up. For accessibility's sake, you could even put hidden text so that people using screen readers would know not to fill it in. Bots almost always fill in all fields, so you could block any post that filled in the invisible field.
Block access based on a blacklist of spammers IP addresses.
Honeypot techniques put an invisible decoy form at the top of the page. Users don't see it and submit the correct form, bots submit the wrong form which does nothing or bans their IP.
I've seen a few neat ideas along the lines of Asira which ask you to identify which pictures are cats. I believe the idea originated from KittenAuth a while ago..
Use something like the google image labeler with appropriately chosen images such that a computer wouldn't be able to recognise the dominant features of it that a human could.
The user would be shown an image and would have to type words associated with it. They would keep being shown images until they have typed enough words that agreed with what previous users had typed for the same image. Some images would be new ones that they weren't being tested against, but were included to record what words are associated with them. Depending on your audience you could also possibly choose images that only they would recognise.
Mollom is supposedly good at stopping spam. Both personal (free) and professional versions are available.
I know some people mentioned ASIRRA, but if you go to all the adopt me links for the images, it will say on that linked page if its a cat or dog. So it should be relatively easy for a bot to just go to all the adoptme links. So its just a matter of time for that project.
just verify the email address and let google/yahoo etc worry about it
You could get some device ID software the41 has some fraud prevention software that can detect the hardware being used to access your site. I belive they use it to catch fraudsters but could be used to stop bots. Once you have identified an device being used by a bot you can just block that device. Last time a checked it can even trace your route throught he phone network ( Not your Geo-IP !! ) so can even block a post code if you want.
Its expensive through so prop. a better cheaper solution that is a little less big brother.

Resources