Pitfalls of cryptographic code

Pitfalls of cryptographic code - security

I'm modifying existing security code. The specifications are pretty clear, there is example code, but I'm no cryptographic expert. In fact, the example code has a disclaimer saying, in effect, "Don't use this code verbatim."
While auditing the code I'm to modify (which is supposedly feature complete) I ran across this little gem which is used in generating the challenge:
static uint16 randomSeed;
...
uint16 GetRandomValue(void)
{
return randomSeed++;/* This is not a good example of very random generation :o) */
}
Of course, the first thing I immediately did was pass it around the office so we could all get a laugh.
The programmer who produced this code knew it wasn't a good algorithm (as indicated by the comment), but I don't think they understood the security implications. They didn't even bother to call it in the main loop so it would at least turn into a free running counter - still not ideal, but worlds beyond this.
However, I know that the code I produce is going to similarly cause a real security guru to chuckle or quake.
What are the most common security problems, specific to cryptography, that I need to understand?
What are some good resources that will give me suitable knowledge about what I should know beyond common mistakes?
-Adam

Don't try to roll your own - use a standard library if at all possible. Subtle changes to security code can have a huge impact that aren't easy to spot, but can open security holes. For example, two modified lines to one library opened a hole that wasn't readily apparent for quite some time.

Applied Cryptography is an excellent book to help you understand crypto and code. It goes over a lot of fundamentals, like how block ciphers work, and why choosing a poor cipher mode will make your code useless even if you're using a perfectly implemented version of AES.
Some things to watch out for:
Poor Sources of Randomness
Trying to design your own algorithm or protocol - don't do it, ever.
Not getting it code reviewed. Preferably by publishing it online.
Not using a well established library and trying to write it yourself.
Crypto as a panacea - encrypting data does not magically make it safe
Key Management. These days it's often easier to steal the key with a side-channel attack than to attack the crypto.

Your question shows one of the more common ones: poor sources of randomness. It doesn't matter if you use a 256 bit key if they bits aren't random enough.
Number 2 is probably assuming that you can design a system better than the experts. This is an area where a quality implementation of a standard is almost certainly going to be better than innovation. Remember, it took 3 major versions before SSL was really secure. We think.

IMHO, there are four levels of attacks you should be aware of:
social engineering attacks. You should train your users not to do stupid things and write your software such that it is difficult for users to do stupid things. I don't know of any good reference about this stuff.
don't execute arbitrary code (buffer overflows, xss exploits, sql injection are all grouped here). The minimal thing to do in order to learn about this is to read Writing Secure Code from someone at MS and watching the How to Break Web Software google tech talk. This should also teach you a bit about defense in depth.
logical attacks. If your code is manipulating plain-text, certificates, signatures, cipher-texts, public keys or any other cryptographic objects, you should be aware that handling them in bad ways can lead to bad things. Minimal things you should be aware about include offline&online dictionary attacks, replay attacks, man-in-the-middle attacks. The starting point to learning about this and generally a very good reference for you is http://www.soe.ucsc.edu/~abadi/Papers/gep-ieee.ps
cryptographic attacks. Cryptographic vulnerabilities include:
stuff you can avoid: bad random number generation, usage of a broken hash function, broken implementation of security primitive (e.g. engineer forgets a -1 somewhere in the code, which renders the encryption function reversible)
stuff you cannot avoid except by being as up-to-date as possible: new attack against a hash function or an encryption function (see e.g. recent MD5 talk), new attack technique (see e.g. recent attacks against protocols that send encrypted voice over the network)
A good reference in general should be Applied Cryptography.
Also, it is very worrying to me that stuff that goes on a mobile device which is probably locked and difficult to update is written by someone who is asking about security on stackoverflow. I believe your case would one of the few cases where you need an external (good) consultant that helps you get the details right. Even if you hire a security consultant, which I recommend you to do, please also read the above (minimalistic) references.

What are the most common security problems, specific to cryptography, that I need to understand?
Easy - you(1) are not smart enough to come up with your own algorithm.
(1) And by you, I mean you, me and everyone else reading this site...except for possibly Alan Kay and Jon Skeet.

I'm not a crypto guy either, but S-boxes can be troublesome when messed with (and they do make a difference). You also need a real source of entropy, not just a PRNG (no matter how random it looks). PRNGs are useless. Next, you should ensure the entropy source isn't deterministic and that it can't be tampered with.
My humble advice is: stick with known crypto algorithms, unless you're an expert and understand the risks. You could be better off using some tested, publicly-available open source / public domain code.

Related

Using built in functions

I am developing a Windows Form Application in C#.I have heard that one should not use built in methods and functions in code since hackers have deep understanding of such built in methods and know how to fail them Instead one should always use his/her own functions and methods and if not then call built in functions intelligently from those newly made functions.How much is that true?
A supporting example in favour of my argument is that I have seen developer always develope there own made encryption algorithm like AES,DES,RC4 and Hash functions since they believe that built in encryption algorithm have many times backdoor in them.

What?! No, no, no! Whoever told you this is just wrong.
There is a common fallacy that published source code is more vulnerable to "h4ckerz" because it is available for anyone to spot the flaws in. However, I'm glad you mentioned crypto, because this is an area where this line of reasoning really stands out as the fallacy it is.
One of the most popular questions of all time on https://security.stackexchange.com/ is about a developer (in the OP he was given the pseudonym "Dave") who shared this fear of published code. Dave, like the developer you saw, was trying to homebrew his own encryption algorithm. Here's one of the most popular comments in that thread:
Dave has a fundamentally false premise, that the security of an algorithm relies on (even partially) its obscurity - that's not the case. The security of a hashing algorithm relies on the limits of our understanding of mathematics, and, to a lesser extent, the hardware ability to brute-force it. Once Dave accepts this reality (and it really is reality, read the Wikipedia article on hashing), it's a question of who is smarter - Dave by himself, or a large group of specialists devoted to this very particular problem. (emphasis added)
As a matter of fact, as it stands now, the top two memes on Security.SE are "Don't roll your own" and "Don't be a Dave".
While this has all been about crypto, this applies in general to most open-source software. The chance that a backdoor will get found and fixed goes up with each new set of eyes laid on the code. This should be a simple and uncontroversial premise: the more people are looking for something, the higher the chance it will be found. Yes, this applies to malicious users looking for exploits. However, it also applies to power users, white hat hackers, security researchers, cryptographers, professional developers, and others working for "good", which generally (hopefully) outnumber those working for "evil". This also implicitly relies on the false premise that hackers need to see the source code to do bad things. This should be obviously false based on the sheer number of proprietary systems whose source code has never been published (various Microsoft and Adobe programs come to mind) which have been inundated with vulnerabilities for years. Maybe having source code to read makes the hacker's job easier, but maybe not -- is it easier to pore over source code looking for an attack vector or to just use scanning tools and scripts against a compiled binary?
tl;dr Don't be a Dave. Rolling your own means you have to be the best at everything to succeed, instead of taking a sampling of the best the community has to offer.
Heartbleed
In your comment, you rebut:
Then why was the Heartbleed bug in openSSL not found and corrected [earlier]?
Because no one was looking at it. That's the sad truth. Here's the difference -- what happened once someone did find it? Now tens of thousands of security researchers, crypto experts, and others are looking at it. Suppose the same kind of vulnerability existed in one of the proprietary products I mentioned earlier, which it very well could. Once it's caught (if it's caught), ask yourself:
Could the team of programmers at the company responsible benefit from the help of the entire worldwide community of security experts, cryptographers, and other analysts right now?
If a bug this critical were discovered (and that's a big if!) in your software, would you be prepared to deal with the fallout caused by your custom implementation?

Unless you know of specific failure modes or weaknesses of the built-in methods your application would use and know how to minimize or eliminate them, it is probably better to use the methods provided by the language or library designers, which will often be both more efficient and more secure than what an average programmer would come up with on the fly for a particular project.
Your example absolutely does not support your view: developing your own encryption algorithm without some serious background in the domain and review by cryptanalysts, and then employing it in security-critical code, is a recipe for disaster. Even developing your own custom implementation of an industry standard encryption algorithm can present problems, and almost certainly will if you are inexperienced at it.

When can you trust yourself to implement cryptography based solutions?

I've read quite a few times how I shouldn't use cryptography if I'm not an expert. Basically both Jeff and Eric tell you the same:
Cryptography is difficult, better buy the security solution from experts than doing it yourself.
I completely agree, for a start it's incredibly difficult to perceive all possible paths an scenario might take, all the possible attacks against it and against your solution... but then When should we use it?
I will face in a few months with the task of providing a security solution to a preexisting solution we have. That is, we exchange data between servers, second phase of the project is providing good security to it. Buying a third party solution will eat up the budget anyway so ... When is it good to use cryptography for a security solution? Even if you are not a TOP expert.
Edit: To clarify due to some comments.
The project is based on data transport across network locations, the current implementation allows for a security layer to be placed before transport and we can make any changes in implementation we like (assuming reasonable changes, the architecture is well design so changes should have an acceptable impact). The question revolves around this phrase from Eric Lippert:
I don’t know nearly enough about cryptography to safely design or implement a crypto-based security system.
We're not talking about reinventing the wheel, I had in mind a certain schema when I designed the system that implied secure key exchange, encryption and decryption and some other "counter measures" (man in the middle, etc) using C# .NET and the included cryptography primitives, but I'm by no means an expert in the field so when I read that, I of course start doubting myself. Am I even capable of implementing a secure system? Would it always be parts of the system that will be insecure unless I subcontract that part?

I think this blog posting (not mine!) gives some good guidelines.
Other than that there are some things you should never do unless you're an expert. This is stuff like implementing your own crypto algorithm (or your own version of a published algorithm). It's just crazy to do that yourself! (When there's CAPI, JCE, OpenSSL, ....)
Beyond that though if you're 'inventing' anything it's almost certainly wrong. In the Coding Horror post you linked to - the main mistake to my mind is that he's doing it a very low level and you just don't need to. If you were encrypting things in Java (I'm not so familiar with .NET) you could use Jasypt which uses strong default algorithms and parameters and doesn't require you to know about ECB and CBC (though, arguably, you should anyway just because...).
There is going to be a prebuilt system for just about anything you're going to want to do with crypto. If you're storing keys then theres KeyCzar, in other cases theres Jasypt. The point is if you're doing anything 'unusual' with crypto - you shouldn't be; if you're doing something not 'unusual' then you don't need to do the crypto yourself. Don't invent a new way to store keys, generate keys from passwords, verify signatures etc - it's not necessary, it's complicated and you'll almost certainly make a mistake unless you're very very careful...
So... I don't think you necessarily need to be afraid of encrypting things but be aware that if you're specifying algorithms and parameters to those algorithms directly in your code it is probably not good. There are exceptions to any rule but as in the blog post I linked above - if you type AES into your code you're doing it wrong!
The key "take-away" from the Matasano blog post is right at the end (note that TLS is a more precise name for SSL):
THOMAS PTACEK
GPG for data at rest. TLS for data in
motion.
NATE LAWSON
You can also use Guttman's cryptlib,
which has a sane API. Or Google
Keyczar. They both have really simple
interfaces, and they try to make it
hard to do the wrong thing. What we
need are fewer libraries with higher
level interfaces. But we also need
more testing for those libraries.

The rule of thumb with cryptography isn't that you shouldn't use it if you're not an expert; rather, it's that you shouldn't re-invent the wheel unless you're an expert. In other words, use existing implementations / libraries / algorithms as much as possible. For example, don't write your own cryptographic authentication algorithm, or come up with yet another way to store keys.
As for when to use it: whenever you have data that needs to be protected from having others see it. Beyond that, it comes down to which algorithms / approaches are best: SSL vs. IPsec vs. symmetric vs. PKI, etc.
Also, a word of advice: key management is often the most challenging part of any comprehensive cryptographic solution.

You have things backwards: first you must specify your actual requirements in detail ("provide a security solution" is meaningless marketing drivel). Then you look for ways to satisfy those specific requirements; croptography will satisfy some of them.
Example of requirements that cryptography can satisfy:
Protect data sent over publich channels from spying
Protect data against tampering (or rather, detect manipulated data)
Allow servers and clients as well as users to prove their identity to each other

You need to go through the same process as for any other requirement. What is the problem being solved, what is the outcome the users are looking for, how is the solution proposed going to be supported going forward, what are the timescales involved. Sometimes there is an off the shelf solution that does the job, sometimes what you want needs to be developed as a custom solution, and sometimes you'll choose a custom solution as it will work out more cost effective than an off the shelf one.
The same is true with security requirements, but the added complexity is that to do any sort of custom solution requires additional expertise in the technical teams (development, support etc). There is also the issue that the solution may need to be not only secure but recognised as secure. This may be far easier to achieve with an off the shelf solution.
And RickNZ is absolutely right - don't forget key management. Consider this right at the outset as part of the decision making process.

The question I would start by asking, is what are you trying to achieve.
If you are trying to just secure the transmission of the data from server a to server b, then there are a number of mechanisms you could use, which would require little work, such as SSL.
However if you are trying to secure all of the data stored in the application that is a far more difficult, although if it is a requirement, then I would suggest that any cryptography, regardless of how easy to break, is better than none.

As someone who has been asked to do similar things, you face a daunting number of questions in implementing your system. There are major difference between securing a system and implementing cryptography systems.
Implementing a cryptography system is very difficult and experts routinely get it wrong, both in theory and practice. A famous theoretical failure was the knapsack cryptosystem which has been largely abandoned due to the Lenstra–Lenstra–Lovász lattice basis reduction algorithm. On the other side, we saw in the last year how an incorrect seed in Debian's random number generator opened up any key generated by the OS. You want to use a prepackaged cryptosystem, not because its an "experts-only" field, but because you want a community tested and supported system. Almost every cryptographic algorithm I know of has bounds that assume certain tasks to be hard, and if those tasks turn out to be computable (as in the LLL algorithm) the whole system becomes useless over night.
But, I believe, the real heart of the question is how to use things in order to make a secure system. While there are many libraries out there to generate keys, cipher the text, and so on, there are very few systems that implement the entire package. But as always security boils down to two concepts: worth of protection and circle of trust.
If you are guarding the Hope diamond, you spend a lot of money designing a system to protect it, employ a constant force to watch it, and hire crackers to continually try to break in. If you are just discouraging bored teenagers from reading your email, you hack something up in an hour and you don't use that address for secret company documents.
Additionally managing the circle of trust is just as difficult of a task. If your circle includes tech savvy, like-minded friends, you make a system and give them a large amount of trust with the system. If it includes many levels of trust, such as users, admins, and so on, you have a tiered system. Since you have to manage more and more interactions with a larger circle, the bugs in the larger system become more weaknesses to hack and thus you must be extremely careful in designing this system.
Now to answer your question. You hire a security expert the moment the item you're protecting is valuable enough and your circle of trust includes those you cannot trust. You don't design cryptography systems unless you do it for a living and have a community to break them, it is a full time academic discipline. If you want to hack for fun, remember that it is only for fun and don't let the value of what you are protecting get too high.

Pay for security (of which cryptography is a part but only a part) what it is worth but no more. So your first task is to decide what your security is worth, or or how much various states of security are worth. Then invite whoever holds the budget to select which state to aim for and therefore how much to spend.
No absolutes here, it's all relative.

Why buy cryptography? It's one of the most developed area in open source software of great quality :) See for example TrueCrypt or OpenSSL
There is a good chance that whatever you need cryptography for there is already a good quality, reputable open source project for it! (And if you can see the source you can see what they did; I once saw an article about a commercial software supposed to "encrypt" a file that simply xorred every byte with a fixed value!)
And, also, why would you want to re-invent the wheel? It's unlikely that with no cryptography background you will do better or even come close to the current algorithms such as AES.

I think it totally depends on what you are trying to achieve.
Does the data need to be stored encrypted at either end or does it just need to be encrypted whilst in transit?
How are you transferring the data? FTP, HTTP etc?
Probably not a good idea to have security as a second phase as by that point presumably you've been moving data around insecurely for a period of time?

Why is security through obscurity a bad idea? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
Improve this question
I recently came across a system where all of the DB connections were managed by routines obscured in various ways, including base 64 encoding, md5sums and various other techniques.
Why is security through obscurity a bad idea?

Security through obscurity would be burying your money under a tree. The only thing that makes it safe is no one knows it's there. Real security is putting it behind a lock or combination, say in a safe. You can put the safe on the street corner because what makes it secure is that no one can get inside it but you.
As mentioned by #ThomasPadron-McCarty below in a comment below:
If someone discovers the password, you can just change the password, which is easy. If someone finds the location, you need to dig up the money and move it somewhere else, which is much more work. And if you use security by obscurity in a program, you would have to rewrite the program.

Security through obscurity can be said to be bad because it often implies that the obscurity is being used as the principal means of security. Obscurity is fine until it is discovered, but once someone has worked out your particular obscurity, then your system is vulnerable again. Given the persistence of attackers, this equates to no security at all.
Obscurity should never be used as an alternative to proper security techniques.
Obscurity as a means of hiding your source code to prevent copying is another subject. I'm rather split on that topic; I can understand why you might wish to do that, personally I've never been in a situation where it would be wanted.

Security through obscurity is an interesting topic. It is (rightly) maligned as a substitute for effective security. A typical principle in cryptography is that a message is unknown but the contents are not. Algorithms for encyrption are typically widely published, analyzed by mathematicians and, after a time, some confidence is built up in their effectivness but there is never a guarantee that they're effective.
Some people hide their cryptographic algorithms but this is considered a dangerous practice because then such algorithms haven't gone through the same scrutiny. Only organisations like the NSA, which has a significant budget and staff of mathematicians, can get away with this kind of approach.
One of the more interesting developments in recent years has been the risk of steganography, which is the practice is hiding message in images, sound files or some other medium. The biggest problem in steganalysis is identifying whether or not a message is there or not, making this security through obscurity.
Last year I came across a story that Researchers Calculate Capacity of a Steganographic Channel but the really interesting thing about this is:
Studying a stego-channel in this way
leads to some counter-intuitive
results: for example, in certain
circumstances, doubling the number of
algorithms looking for hidden data can
increase the capacity of the
steganographic channel.
In other words, the more algorithms you use to identify messages the less effective it becomes, which goes against the normal criticism of security through obscurity.
Interesting stuff.

The main reason it is a bad idea is that it does not FIX the underlying problems, just attempts to hide them. Sooner or later, the problems will be discovered.
Also, extra encryption will incur additional overhead.
Finally excessive obscurity (like using checksums) makes maintenance a nightmare.
Better security alternatives is to eliminate potential weaknesses in your code such as enforced inputs to prevent injection attacks.

One factor the ability to recover from a security breach. If someone discovers your password, just reset it. But if someone uncovers your obscure scheme, you're hosed.

Using obscurity as all these people agree is not security, its buying yourself time. That said having a decent security system implemented then adding an extra layer of obscurity is still useful. Lets say tomorrow someone finds an unbeatable crack/hole in the ssh service that can't be patched immediately.
As a rule I've implemented in house... all public facing servers expose only the ports needed ( http/https ) and nothing more. One public facing server then will have ssh exposed to the internet at some obscure high numbered port and a port scanning trigger setup to block any IP's that try to find it.
Obscurity has its place in the world of security, but not as the first and last line of defense. In the example above, I don't get any script/bot attacks on ssh because they don't want to spend the time searching for a non-standard ssh service port and if they do, their unlikely to find it before another layer of security steps in and cuts them off.

All of the forms of security available are actually forms of security through obscurity. Each method increases in complexity and provides better security but they all rely on some algorithm and one or more keys to restore the encrypted data. "Security through obscurity" as most call it is when someone chooses one of the simplest and easiest to crack algorithms.
Algorithms such as character shifting are easy to implement and easy to crack, that's why they are a bad idea. It's probably better than nothing, but it will, at most, only stop a casual glance at the data from being easily read.
There are excellent resources on the Internet you can use to educate yourself about all of the available encryption methods and their strengths and weaknesses.

Security is about letting people in or keeping them out depending on what they know, who they are, or what they have. Currently, biometrics aren't good at finding who you are, and there's always going to be problems with it (fingerprint readers for somebody who's been in a bad accident, forged fingerprints, etc.). So, actually, much of security is about obfuscating something.
Good security is about keeping the stuff you have to keep secret to a minimum. If you've got a properly encrypted AES channel, you can let the bad guys see everything about it except the password, and you're safe. This means you have a much smaller area open to attack, and can concentrate on securing the passwords. (Not that that's trivial.)
In order to do that, you have to have confidence in everything but the password. This normally means using industry-standard crypto that numerous experts have looked at. Anybody can create a cipher they can't break, but not everybody can make a cipher Bruce Schneier can't break. Since there's a thorough lack of theoretical foundations for cipher security, the security of a cipher is determined by having a lot of very smart and knowledgeable people try to come up with attacks, even if they're not practical (attacks on ciphers always get better, never worse). This means the crypto algorithm needs to be widely known. I have very strong confidence in the Advanced Encryption Standard, and almost none in a proprietary algorithm Joe wrote and obfuscated.
However, there's been problems with implementations of crypto algorithms. It's easy to inadvertantly leave holes whereby the key can be found, or other mischief done. It happened with an alternate signature field for PGP, and weaknesses with SSL implemented on Debian Linux. It's even happened to OpenBSD, which is probably the most secure operating system readily available (I think it's up to two exploits in ten years). Therefore, these should be done by a reputable company, and I'd feel better if the implementations were open source. (Closed source won't stop a determined attacker, but it'll make it harder for random good guys to find holes to be closed.)
Therefore, if I wanted security, I'd try to have my system as reliable as possible, which means as open as possible except for the password.
Layering security by obscurity on top of an already secure system might help some, but if the system's secure it won't be necessary, and if it's insecure the best thing is to make it secure. Think of obscurity like the less reputable forms of "alternative medicine" - it is very unlikely to help much, and while it's unlikely to hurt much by itself it may make the patient less likely to see a competent doctor or computer security specialist, whichever.
Lastly, I'd like to make a completely unsolicited and disinterested plug for Bruce Schneier's blog, as nothing more than an interested reader. I've learned a lot about security from it.

One of the best ways of evaluating, testing or improving a security product is to have it banged on by a large, clever peer group.
Products that rely for their security on being a "black box" can't have the benefit of this kind of test. Of course, being a "black box" always invites the suspicion (often justified) that they wouldn't stand up to that kind of scrutiny anyway.

I argued in one case that password protection is really security through obscurity. The only security I can think of that wouldn't be STO is some sort of biometric security.
Besides that bit of semantics and nit picking, STO (Security through obscurity) is obviously bad in any case where you need real security. However, there might be cases where it doesn't matter. I'll often XOR pad a text file i don't want anyone reading. But I don't really care if they do, i'd just prefer that it not be read. In that case, it doesn't matter, and an XOR pad is a perfect example of an easy to find out STO.

It is almost never a good idea. It is the same to say, is it a good idea to drive without seatbelt? Of course you can find some cases where it fits, but the anwser due to experience seems obvious.

Weak encryption will only deter the least motivated hackers, so it isn't valueless, it just isn't very valuable, especially when strong encryption, like AES, is available.
Security through obscurity is based on the assumption that you are smart and your users are stupid. If that assumption is based on arrogance, and not empirical data, then your users- and hackers-- will determine how to invoke the hidden method, bring up the unlinked page, decompile and extract the plain text password from the .dll, etc.
That said, providing comprehensive meta-data to users is not a good idea, and obscuring is perfectly valid technique as long as you back it up with encryption, authorization, authentication and all those other principles of security.

If the OS is Windows, look at using the Data Protection API (DPAPI). It is not security by obscurity, and is a good way to store login credentials for an unattended process. As pretty much everyone is saying here, security through obscurity doesn't give you much protection.
http://msdn.microsoft.com/en-us/library/ms995355.aspx
http://msdn.microsoft.com/en-us/library/ms998280.aspx

The one point I have to add which hasn't been touched on yet is the incredible ability of the internet to smash security through obscurity.
As has been shown time and time again, if your only defense is that "nobody knows the back door/bug/exploit is there", then all it takes is for one person to stumble across it and, within minutes, hundreds of people will know. The next day, pretty much everyone who wants to know, will. Ouch.

Preventing copy protection circumvention

Anyone visiting a torrent tracker is sure to find droves of "cracked" programs ranging from simple shareware to software suites costing thousands of dollars. It seems that as long as the program does not rely on a remote service (e.g. an MMORPG) that any built-in copy protection or user authentication is useless.
Is it effectively not possible to prevent a cracker from circumventing the copy protection? Why?

No, it's not really possible to prevent it. You can make it extremely difficult - some Starforce versions apparently accomplished that, at the expense of seriously pissing off a number of "users" (victims might be more accurate).
Your code is running on their system and they can do whatever they want with it. Attach a debugger, modify memory, whatever. That's just how it is.
Spore appears to be an elegant example of where draconian efforts in this direction have not only totally failed to prevent it from being shared around P2P networks etc, but has significantly harmed the image of the product and almost certainly the sales.
Also worth noting that users may need to crack copy protection for their own use; I recall playing Diablo on my laptop some years back, which had no internal optical drive. So I dropped in a no-cd crack, and was then entertained for several hours on a long plane flight. Forcing that kind of check, and hence users to work around it is a misfeature of the stupidest kind.

It is impossible to stop it without breaking your product. The proof:
Given: The people you are trying to prevent from hacking/stealing will inevitably be much more technically sophisticated than a large portion of your market.
Given: Your product will be used by some members of the public.
Given: Using your product requires access to it's data on some level.
Therefore, You have to released you encrypt-key/copy protection method/program data to the public in enough of a fashion that the data has been seen in it's useable/unencrypted form.
Therefore, you have in some fashion made your data accessible to pirates.
Therefore, your data will be more easily accessible to the hackers than your legitimate audience.
Therefore, ANYTHING past the most simplistic protection method will end up treating your legitimate audience like pirates and alienating them
Or in short, the way the end user sees it:

Because it's a fixed defense against a thinking opponent.
The military theorists beat this one to death how many millennia ago ?

Copy-protection is like security -- it's impossible to achieve 100% perfection but you can add layers that make it successively more difficult to crack.
Most applications have some point where they ask (themselves), "Is the license valid?" The hacker just needs to find that point and alter the compiled code to return "yes." Alternatively, crackers can use brute-force to try different license keys until one works. There's also social factors -- once one person buys the tool they might post a valid license code on the Internet.
So, code obfuscation makes it more difficult (but not impossible) to find the code to alter. Digital signing of the binaries makes it more difficult to change the code, but still not impossible. Brute-force methods can be combated with long license codes with lots of error-correction bits. Social attacks can be mitigated by requiring a name, email, and phone number that is part of the license code itself. I've used that method to great effect.
Good luck!

Sorry to bust in on an ancient thread, but this is what we do for a living and we're really really good at it. It's all we do. So some of the information here is wrong and I want to set the record straight.
Theoretically uncrackable protection is not only possible it's what we sell. The basic model the major copy protection vendors (including us) follow is to use encryption of the exe and dlls and a secret key to decrypt at runtime.
There are three components:
Very strong encryption: we use AES 128-bit encryption which is effectively immune to a brute force attack. Some day when quantum computers are common it might be possible to break it but it's unreasonable to assume you will crack this strength encryption to copy software as opposed to national secrets.
Secure key storage: if a cracker can get the key to the encryption, you're hosed. The only way to GUARANTEE a key can't be stolen is to store it on a secure device. We use a dongle (it comes in many flavors but the OS always just sees it as a removable flash drive). The dongle stores the key on a smart card chip which is hardened against side channel attacks like DPA. The key generation is tied to multiple factors which are non-deterministic and dynamic so no single key/master crack is possible. The communication between the key storage and the runtime on the computer is also encrypted so a man-in-the-middle attack is thwarted.
Debugger detection: Basically you want to stop a cracker from taking a snapshot of memory (after decryption) and making an executable out of that. Some of the stuff we do to prevent this is secret, but in general we allow for debugger detection and lock the license when a debugger is present (this is an optional setting). We also never completely decrypt the entire program in memory so you can never get all the code by "stealing" memory.
We have a full time cryptologist who can crack just about anybody's protection system. He spends all his time studying how to crack software so we can prevent it. So you don't think this is just a cheap shill for what we do, we're not unique: other companies such as SafeNet and Arxan Technologies can do some very strong protection as well.
A lot of software-only or obfuscation schemes are easy to crack since the cracker can just identify the program entry point and branch around any any license checking or other stuff the ISV has put in to try to prevent piracy. Some people even with dongles will throw up a dialog when the license isn't found--setting a breakpoint on that error will give the cracker a nice place in the assembly code to do a patch. Again, this requires unencrypted machine code to be available--something you don't get if you do strong encryption of the .exe.
One last thing: I think we're unique in that we've had several open contests where we provided a system to people and invited them to crack it. We've had some pretty hefty cash prizes but no one has yet cracked our system. If an ISV takes our system and implements it incorrectly it's no different from putting a great padlock on your front door attached to a cheap hasp with wood screws--easy to circumvent. But if you use our tools as we suggest we believe your software cannot be cracked.
HTH.

The difference between security and copy-protection is that with security, you are protecting an asset from an attacker while allowing access by an authorized user. With copy protection, the attacker and the authorized user are the same person. That makes perfect copy protection impossible.

I think given enough time a would-be cracker can circumvent any copy-protection, even ones using callbacks to remote servers. All it takes is redirecting all outgoing traffic through a box that will filter those requests, and respond with the appropriate messages.
On a long enough timeline, the survival rate of copy protection systems is 0. Everything is reverse-engineerable with enough time and knowledge.
Perhaps you should focus on ways of making your software be more attractive with real, registered, uncracked versions. Superior customer service, perks for registration, etc. reward legitimate users.

Basically history has shown us the most you can buy with copy protection is a little time. Fundamentally since there is data you want someone to see one way, there is a way to get to that data. Since there is a way someone can exploit that way to get to the data.
The only thing that any copy protection or encryption for that matter can do is make it very hard to get at something. If someone is motivated enough there is always the brute force way of getting around things.
But more importantly, in the computer software space we have tons of tools that let us see how things are working, and once you get the method of how the copy protection works then its a very simple matter to get what you want.
The other issue is that copy protection for the most part just frustrates your users who are paying for your software. Take a look at the open source model they don't bother and some folks are making a ton of money encouraging people to copy their software.

"Trying to make bits uncopyable is like trying to make water not wet." -- Bruce Schneier
Copy protection and other forms of digital restrictions management are inherently breakable, because it is not possible to make a stream of bits visible to a computer while simultaneously preventing that computer from copying them. It just can't be done.
As others have pointed out, copy protection only serves to punish legitimate customers. I have no desire to play Spore, but if I did, I'd likely buy it but then install the cracked version because it's actually a better product for its lack of the system-damaging SecuROM or property-depriving activation scheme.

}} Why?
You can buy the most expensive safe in the world, and use it to to protect something. Once you give away the combination to open the safe, you have lost your security.
The same is true for software, if you want people to use your product you must given them the ability to open the proverbial safe and access the contents, obfuscating the method to open the lock doesn't help. You have granted them the ability to open it.

You can either trust your customers/users, or you can waste inordinate amounts of time and resource trying to defeat them instead of providing the features they want to pay for.
It just doesn't pay to bother. Really. If you don't protect your software, and it's good, undoubtedly someone will pirate it. The barrier will be low, of course. But the time you save from not bothering will be time you can invest in your product, marketing, customer relationships, etc., building your customer base for the long term.
If you do spend the time on protecting your product instead of developing it, you'll definitely reduce piracy. But now your competitors may be able to develop features that you didn't have time for, and you may very well end up selling less, even in the short term.

As others point out, you can easily end up frustrating real and legitimate users more than you frustrate the crooks. Always keep your paying users in mind when you develop a circumvention technique.
If your software is wanted, you have no hope against the army of bored 17 year old's. :)

In the case of personal copying/non-commercial copyright infringement, the key factor would appear to be the relationship between the price of the item and the ease of copying it. You can increase the difficulty to copy it, but with diminishing returns as highlighted by some of the previous answers. The other tack to take would be to lower the price until even the effort to download it via bittorrent is more cumbersome than simply buying it.
There are actually many successful examples where an author has found a sweet spot of pricing that has certainly resulted in a large profit for themselves. Trying to chase a 100% unauthorized copy prevention is a lost cause, you only need to get a large group of customers willing to pay instead of downloading illegaly. The very thing that makes pirating softweare inexpensive is also what makes it inexpensive to publish software.

There's an easy way, I'm amazed you haven't said so in the answers above.
Move the copy protection to a secured area (understand your server in your secure lab).
Your server will receive random number from clients (check that the number wasn't used before), encrypt some ever evolving binary code / computation results with clients' number and your private key and send it back.
No hacker can circumvent this since they don't have access to your server code.
What I'm describing is basically webservice other SSL, that's where most company goes nowadays.
Cons: A competitor will develop an offline version of the same featured product during the time you finish your crypto code.

On protections that don't require network:
According to notes floated around it took two years to crack a popular application which used similar scheme as described in John's answer. (custom hardware dongle protection)
Another scheme which doesn't involve a dongle is "expansive protection". I coined this just now, but it works like this: There's an application which saves user data and for which the users can buy expansions and such from 3rd parties. When user loads the data or uses new expansion, the expansions and the saved data contains also code which performs checks. And of course these checks are also protected by checksum checks. It's not as secure on paper as the other scheme but in practise this application has been half-cracked all the time, so that it mostly functions as a trial despite being cracked as the cracks will always miss some checks and have to patch these expansions as well.
The key point is, while these can be cracked, if enough software vendors used such schemes, this would overwork the few people in the warescene who are willing to dedicate themselves to those. If you do the maths, the protections don't have to be even that great, as long as enough vendors used these custom protections that changed constantly, it would simply overwhelm the crackers and the warez scene would end then and there. *
The only reason this hasn't happened is because publishers buy a single protection that they use all over, making it a huge target just like Windows is target for malware, any protection used in more than single app is a bigger target. So everyone needs to be doing their own custom, unique multi-layered expansive protection. The amount of warez releases would drop to maybe dozen releases per year if it takes months to crack a single release by the very best crackers.
Now for some theorycrafting in marketing software:
If you believe that warez provides worthwhile marketing value, then that should be factored in the business plan. This could entail a very very (too) basic lite version that still cost few dollars to ensure it was cracked. Then you'd hook in the users with "limited time upgrade cheaply from the lite version" offers regularly and other upselling tactics. The lite version should really have at most one buy-worthy feature and otherwise be very crippled. The price should probably be <10 $. The full version should probably be twice as much as the upgrade price from the $10 lite pay-demo version. eg. If the full-version is $80, You'd offer upgrades from the lite version to full version for $40 or something that really seems like killer bargain. Of course you'd avoid revealing these bargains to purchasers who went direct for the $80 edition.
It would be critical that the full version shared no similarity in code to the lite version. You'd intend that the lite-version gets warezed and the full-version will either be time intensive to crack or have network dependency in functionality that will be hard to mimic locally. Crackers are probably more specialized in cracking than trying to code up/replicate parts of functionality that the application has on the web server.
* addendum: for apps/games the scene might end in such unlikely and theoretical circumstance, for other things like music/movies and in practise, I'd look at making it cheap for digital dl buyers to get additional collectible physical items or online-only value - many people are collectors of stuff (especially the pirates) and they could be enticed into buying if it gains something desirable enough over just a digital copy.
Beware though - There's something called "the law of rising expectations". Example from games: Ultima 4-6 standard box included a map made of cloth, and Skyrim Collectors edition has a map made of paper. Expectations had risen and some people aren't going to be happy with a paper map. You want to either keep quality of produce or service constant or manage expectations ahead of time. I believe this is critical when considering these value-add things as you want them to be desirably but not increasingly expensive to make and not turn into something that seems so worthless that it defeats the purpose.

This is one occasion where quality software is a bad thing, because if no one whats your software then they will not spend time trying to crack it, on the other hand things like Adobe's Master Collection CS3, were available just days after release.
So the moral of this story is if you don't want someone to steal your software there is one option: don't write anything worth stealing.

I think someone will come up with a dynamic AI way of defeating all the currently standard methods of copy protection; heck, I'd sure love to get paid to work on that problem. Once they get there then new methods will be developed, but it'll slow things down.
The second best way for society to stop theft of software, is to penalize it heavily, and enforce the penalties.
The best way is to reverse the moral decline, and thereby increase the level of integrity in society.

A lost cause if ever I heard one... of course that doesn't mean you shouldn't try.
Personally, I like Penny Arcade's take on it: "A Cyclical Argument With A Literal Strawman"alt text http://sonicloft.net/im/52

What techniques do you use when writing your own cryptography methods? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
For years, maybe 10, I've been fascinated with cryptography. I read a book about XOR bit-based encryption, and have been hooked ever since thing.
I guess it's more fair to say that I'm fascinated by those who can break various encryption methods, but I digress.
To the point -- what methods do you use when writing cryptography? Is obfuscation good in cryptography?
I use two key-based XOR encryption, various hashing techniques (SHA1) on the keys, and simple things such as reversing strings here and there, etc.
I'm interested to see what others think of and try when writing a not-so-out-of-the-box encryption method. Also -- any info on how the pros go about "breaking" various cryptography techniques would be interesting as well.
To clarify -- I have no desire to use this in any production code, or any code of mine for that matter. I'm interesting in learning how it works through toying around, not reinventing the wheel. :)
Ian

The best advice I can give you is: resist the temptation to reinvent the wheel. Cryptography is harder than you think.
Get Bruce Schneier's book Applied Cryptography and read it carefully.

To contradict what everyone else has said so far, go for it! Yeah, your code might have buffer overflow vulnerabilities in it, and may be slow, buggy, etc, but you're doing this for FUN! I completely understand the recreational enjoyment found in playing with crypto.
That being said, cryptography isn't based on obfuscation at all (or at least shouldn't be). Good crypto will continue to work, even once Eve has slogged through your obfuscated code and completely understands what is going on. IE: Many newspapers have substitution code puzzles that readers try and break over breakfast. If they started doing things like reversing the whole string, yes, it'd be harder, but Joe Reader would still be able to break it, neve tuohtiw gnieb dlot.
Good crypto is based on problems that are assumed to be (none proven yet, AFAIK) really difficult. Examples of this include factoring primes, finding the log, or really any other NP-complete problem.
[Edit: snap, neither of those are proven NP-complete. They're all unproven, yet different. Hopefully you still see my point: crypto is based on one-way functions. Those are operations that are easy to do, but hard to undo. ie multiply two numbers vs find the prime factors of the product. Good catch tduehr]
More power to you for playing around with a really cool branch of mathematics, just remember that crypto is based on things that are hard, not complicated. Many crypto algorithms, once you really understand them, are mindbogglingly simple, but still work because they're based on something that is hard, not just switching letters around.
Note: With this being said, some algorithms do add in extra quirks (like string seversal) to make brute forcing them that much more difficult. A part of me feels like I read this somewhere referencing DES, but I don't believe it... [EDIT: I was right, see 5th paragraph of this article for a reference to the permutations as useless.]
BTW: If you haven't found it before, I'd guess the TEA/XTEA/XXTEA series of algorithms would be of interest.

The correct answer is to not do something like this. The best method is to pick one of the many cryptography libraries out there for this purpose and use them in your application. Security through obscurity never works.
Pick the current top standards for cryptography algorithms as well. AES for encryption, SHA256 for hashing. Elgamal for public key.
Reading Applied Cryptography is a good idea as well. But a vast majority of the book is details of implementations that you won't need for most applications.
Edit: To expand upon the new information given in the edit. The vast majority of current cryptography involves lots of complicated mathematics. Even the block ciphers which just seem like all sorts of munging around of bits are the same.
In this case then read Applied Cryptography and then get the book Handbook of Applied Cryptography which you can download for free.
Both of these have lots of information on what goes into a cryptography algorithm. Some explanation of things like differential and linear cryptanalysis. Another resource is Citeseer which has a number of the academic papers referenced by both of those books for download.
Cryptography is a difficult field with a huge academic history to it for going anywhere. But if you have the skills it is quite rewarding as I have found it to be.

Do the exercises here:
http://www.schneier.com/crypto-gram-9910.html#SoYouWanttobeaCryptographer
For starters, look at the cube attack paper (http://eprint.iacr.org/2008/385) and try breaking some algorithms with it. After you are familiar with breaking cryptographic schemes, you'll become better at creating them.
As far as production code goes, I'll repeat what has already been said: just use what's available in the market, since all the mainstream schemes have already gone through multiple rounds of cryptanalysis.

All the above advice is sound. Obfuscation bad. Don't put your own crypto into production without first letting the public beat on it for a while.
a couple things to add:
Encoding is not encryption. I recently bypassed a website's authentication system due to the developers misunderstanding here.
Learn how to break even the most basic systems. You'd be surprised how often knowledge of simple rotation ciphers is actually useful.
A^B = C. You stated you've been working with two key XOR encryption. When building a cryptosystem always check that your steps are actually accomplishing something. in the two key XOR case you're really just using a different key.
A^A = 0. XOR enryption is very weak against known or chosen plaintext attacks. If you know all or part of the plaintext, you can get all or part of the key. Plaintext ^ Cyphertext = Key
Another good book to read is The Code Book by Simon Singh. It goes over some of the history of cryptography and methods for breaking most of the cryptosystems he covers.
Two algorithms to learn (learn them and the history behind them):
3DES: yes it's obsolete but it's a good starting point for learning fiestel and block cyphers and there are some good lessons in it's creation from DES. Also, the reasoning for the encrypt, decrypt, encrypt methodology used is a good thing to learn.
RSA: I'm going to display my inner math geek here. Probably the simplest encryption algorithm in use today. Methods of breaking it are known (just factor the key) but computationally extremely difficult. m^d mod n where n = p*q (p and q prime) and gcd(d,n)=1. A little bit of group/number theory explains why this isn't easily reversed without knowing p and q. In my number theory course we proved the theory behind this at least half a dozen ways.
A note for PhirePhly:
prime factorization and discrete log are not NP-Complete, or NP-Hard for that matter. They are both unknown in complexity. I imagine you'd get a decent amount of fame from just figuring that part out. That said, the rest of your assertion is correct. Good crypto is based on things that are easy to do but hard to undo without the key.

Unless you're (becoming) an expert in the field, do not use home-made crypto in production products. Enough said.

DON'T!
Even the experts have a very hard time knowing if they got it right. Outside of a crypto CS class, just use other people's code. Port code only if you absolutely must and then test the snot out of it with known good code.

Most experts agree that openness is more valuable than obfuscation in developing cryptographic methods and algorithms.
In other words, everyone seems to be able to design a new code that everyone can break except them. The best crypto survives the test of having the algorithm and some encrypted messages put out there and having the best crypto hackers try to break it.
In general, most obfuscation methods and simple hashing (and I've done quite a few of them myself) are very easily broken. That doesn't mean they aren't fun to experiment with and learn about.
List of Cryptography Books (from Wikipedia)
This question caught my eye because I'm currently re-reading Cryptonomicon by Neal Stephenson, which isn't a bad overview itself, even though it's a novel...

To echo everyone else (for posterity), never ever implement your own crypto. Use a library.
That said, here is an article on how to implement DES:
http://scienceblogs.com/goodmath/2008/09/des_encryption_part_1_encrypti.php
Permutation and noise are crucial to many encryption algorithms. The point isn't so much to obscure things, but to add steps to the process that make brute force attacks impractical.
Also, get and read Applied Cryptography. It's a great book.

Have to agree with other posters. Don't unless you are writing a paper on it and need to do some research or something.
If you think you know a lot about it go and read the Applied Cryptography book. I know a lot of math and that book still kicked my butt. You can read and analyze from his pseudo-code. The book also has a ton of references in the back to dig deeper if you want.
Crypto is one of those things that a lot of people think is very cool, but the actual math behind the concepts is beyond their grasp. I decided a long time ago that it was not worth the mental effort for me to get to that level.
If you just want to see HOW it is done (study existing implementations in code) I would suggest taking a peek at the Crypto++ library even if you don't normally code in C++ it is a good view of the topics and parts of implementing encryption.
Bruce also has a good list of resources you can get from his site.

I attended a code security session at this years Aus TechEd. When talking about the AES algorithm in .Net and how it was selected, the presenter (Rocky Heckman) told us one of the techniques that had been used to break the previous encryption. Someone had managed to use a thermal imaging camera to record a cpu's heat signature whilst it was encrypting data. They were able to use this recording to ascertain what types of calculations the chip was doing and then reverse engineer the algoritm. They had way too much time on their hands, and I am fairly confident I will never be smart enough to beat people like that! :(
Note: I sincerely hope I have relayed the story correctly, if not - the mistake is likely mine, not that of the presenter mentioned.

It's already been beaten to death that you shouldn't use home grown crypto in a product. But I've read your question and you clearly state that you're just doing it for fun. Sounds like the true geek/hacker/academic spirit to me. You know it works, you want to know why it works and try to see if you can make it work.
I completely encourage that and do the same with many programs I've written just for fun. I suggest reading this post (http://rdist.root.org/2008/09/18/dangers-of-amateur-cryptography/) over at a blog called "rootlabs". In the post are a series of links that you should find very interesting. A guy interested in math/crypto with a PhD in Computer Science and who works for Google decided to write a series of articles on programming crypto. He made several non-obvious mistakes that were pointed out by industry expert Nate Lawson.
I suggest you read it. If it doesn't encourage you to keep trying, it will no doubt still teach you something.
Best of luck!

I agree with not re-inventing the wheel.
And remember, security through obscurity is no security at all. If any part of your security mechanisms use the phrase "nobody will ever figure this out!", it's not secure. Think about AES -- the algorithm is publicly available, so everybody knows exactly how it works, and yet nobody can break it.

Per other answers - inventing an encryption scheme is definitely a thing for the experts and any new proposed crypto scheme really does need to be put to public scrutiny for any reasonable hope of validation and confidence in its robustness. However, implementing existing algorithms and systems is a much more practical endeavor "for fun" and all the major standards have good test vectors to help prove the correctness of your implementation.
With that said, for production solutions, existing implementations are plentiful and there should typically be no reason you would need to implement a system yourself.

I agree with all the answers, both "don't write your own crypto algorithm for production use" and "hell yeah, go for it for your own edification", but I am reminded of something that I believe the venerable Bruce Schneier often writes: "it's easy for someone to create something that they themselves cannot break."

The only cryptography that an non experts should be able to expect to get right is bone simple One Time Pad ciphers.
CipherTextArray = PlainTextArray ^ KeyArray;
Aside from that, anything even worth looking at (even for recreation) will need a high level degree in math.

I dont want to go into depth on correct answers that have already been given (don't do it for production; simple reversal not enough; obfuscation bad; etc).
I just want to add Kerckoff's principle, "A cryptosystem should be secure even if everything about the system, except the key, is public knowledge".
While I'm at it, I'll also mention Bergofsky's Principle (quoted by Dan Brown in Digital Fortress): "If a computer tried enough keys, it was mathematically guaranteed to find the right one. A code’s security was not that its pass-key was unfindable but rather that most people didn’t have the time or equipment to try."
Only that's inherently not true; Dan Brown made it up.

Responding to PhirePhly and tduehr, on the complexity of factoring:
It can readily be seen that factoring is in NP and coNP. What we need to see is that the problems "given n and k, find a prime factor p of n with 1 < p <= k" and "show that no such p exists" are both in NP (the first being the decision variant of the factoring problem, the second being the decision variant of the complement).
First problem: given a candidate solution p, we can easily (i.e. in polynomial time) check whether 1 < p <= k and whether p divides n. A solution p is always shorter (in the number of bits used to represent it) than n, so factoring is in NP.
Second problem: given a complete prime factorization (p_1, ..., p_m), we can quickly check that their product is n, and that none are between 1 and k. We know that PRIMES is in P, so we can check the primality of each p_i in polynomial time. Since the smallest prime is 2, there is at most log_2(n) prime factors in any valid factorization. Each factor is smaller than n, so they use at most O(n log(n)) bits. So if n doesn't have a prime factor between 1 and k, there is a short (polynomial-size) proof which can be verified quickly (in polynomial time).
So factoring is in NP and coNP. If it was NP-complete, then NP would equal coNP, something which is often assumed to be false. One can take this as evidence that factoring is indeed not NP-complete; I'd rather just wait for a proof ;-)

Usually, I start by getting a Ph.D in number theory. Then I do a decade or so of research and follow that up with lots of publishing and peer review. As far as the techniques I use, they are various ones from my research and that of my peers. Occasionally, when I wake up in the middle of the night, I'll develop a new technique, implement it, find holes in it (with the help of my number theory and computer science peers) and then refine from there.
If you give a mouse an algorithm...

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string