As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
For years, maybe 10, I've been fascinated with cryptography. I read a book about XOR bit-based encryption, and have been hooked ever since thing.
I guess it's more fair to say that I'm fascinated by those who can break various encryption methods, but I digress.
To the point -- what methods do you use when writing cryptography? Is obfuscation good in cryptography?
I use two key-based XOR encryption, various hashing techniques (SHA1) on the keys, and simple things such as reversing strings here and there, etc.
I'm interested to see what others think of and try when writing a not-so-out-of-the-box encryption method. Also -- any info on how the pros go about "breaking" various cryptography techniques would be interesting as well.
To clarify -- I have no desire to use this in any production code, or any code of mine for that matter. I'm interesting in learning how it works through toying around, not reinventing the wheel. :)
Ian
The best advice I can give you is: resist the temptation to reinvent the wheel. Cryptography is harder than you think.
Get Bruce Schneier's book Applied Cryptography and read it carefully.
To contradict what everyone else has said so far, go for it! Yeah, your code might have buffer overflow vulnerabilities in it, and may be slow, buggy, etc, but you're doing this for FUN! I completely understand the recreational enjoyment found in playing with crypto.
That being said, cryptography isn't based on obfuscation at all (or at least shouldn't be). Good crypto will continue to work, even once Eve has slogged through your obfuscated code and completely understands what is going on. IE: Many newspapers have substitution code puzzles that readers try and break over breakfast. If they started doing things like reversing the whole string, yes, it'd be harder, but Joe Reader would still be able to break it, neve tuohtiw gnieb dlot.
Good crypto is based on problems that are assumed to be (none proven yet, AFAIK) really difficult. Examples of this include factoring primes, finding the log, or really any other NP-complete problem.
[Edit: snap, neither of those are proven NP-complete. They're all unproven, yet different. Hopefully you still see my point: crypto is based on one-way functions. Those are operations that are easy to do, but hard to undo. ie multiply two numbers vs find the prime factors of the product. Good catch tduehr]
More power to you for playing around with a really cool branch of mathematics, just remember that crypto is based on things that are hard, not complicated. Many crypto algorithms, once you really understand them, are mindbogglingly simple, but still work because they're based on something that is hard, not just switching letters around.
Note: With this being said, some algorithms do add in extra quirks (like string seversal) to make brute forcing them that much more difficult. A part of me feels like I read this somewhere referencing DES, but I don't believe it... [EDIT: I was right, see 5th paragraph of this article for a reference to the permutations as useless.]
BTW: If you haven't found it before, I'd guess the TEA/XTEA/XXTEA series of algorithms would be of interest.
The correct answer is to not do something like this. The best method is to pick one of the many cryptography libraries out there for this purpose and use them in your application. Security through obscurity never works.
Pick the current top standards for cryptography algorithms as well. AES for encryption, SHA256 for hashing. Elgamal for public key.
Reading Applied Cryptography is a good idea as well. But a vast majority of the book is details of implementations that you won't need for most applications.
Edit: To expand upon the new information given in the edit. The vast majority of current cryptography involves lots of complicated mathematics. Even the block ciphers which just seem like all sorts of munging around of bits are the same.
In this case then read Applied Cryptography and then get the book Handbook of Applied Cryptography which you can download for free.
Both of these have lots of information on what goes into a cryptography algorithm. Some explanation of things like differential and linear cryptanalysis. Another resource is Citeseer which has a number of the academic papers referenced by both of those books for download.
Cryptography is a difficult field with a huge academic history to it for going anywhere. But if you have the skills it is quite rewarding as I have found it to be.
Do the exercises here:
http://www.schneier.com/crypto-gram-9910.html#SoYouWanttobeaCryptographer
For starters, look at the cube attack paper (http://eprint.iacr.org/2008/385) and try breaking some algorithms with it. After you are familiar with breaking cryptographic schemes, you'll become better at creating them.
As far as production code goes, I'll repeat what has already been said: just use what's available in the market, since all the mainstream schemes have already gone through multiple rounds of cryptanalysis.
All the above advice is sound. Obfuscation bad. Don't put your own crypto into production without first letting the public beat on it for a while.
a couple things to add:
Encoding is not encryption. I recently bypassed a website's authentication system due to the developers misunderstanding here.
Learn how to break even the most basic systems. You'd be surprised how often knowledge of simple rotation ciphers is actually useful.
A^B = C. You stated you've been working with two key XOR encryption. When building a cryptosystem always check that your steps are actually accomplishing something. in the two key XOR case you're really just using a different key.
A^A = 0. XOR enryption is very weak against known or chosen plaintext attacks. If you know all or part of the plaintext, you can get all or part of the key. Plaintext ^ Cyphertext = Key
Another good book to read is The Code Book by Simon Singh. It goes over some of the history of cryptography and methods for breaking most of the cryptosystems he covers.
Two algorithms to learn (learn them and the history behind them):
3DES: yes it's obsolete but it's a good starting point for learning fiestel and block cyphers and there are some good lessons in it's creation from DES. Also, the reasoning for the encrypt, decrypt, encrypt methodology used is a good thing to learn.
RSA: I'm going to display my inner math geek here. Probably the simplest encryption algorithm in use today. Methods of breaking it are known (just factor the key) but computationally extremely difficult. m^d mod n where n = p*q (p and q prime) and gcd(d,n)=1. A little bit of group/number theory explains why this isn't easily reversed without knowing p and q. In my number theory course we proved the theory behind this at least half a dozen ways.
A note for PhirePhly:
prime factorization and discrete log are not NP-Complete, or NP-Hard for that matter. They are both unknown in complexity. I imagine you'd get a decent amount of fame from just figuring that part out. That said, the rest of your assertion is correct. Good crypto is based on things that are easy to do but hard to undo without the key.
Unless you're (becoming) an expert in the field, do not use home-made crypto in production products. Enough said.
DON'T!
Even the experts have a very hard time knowing if they got it right. Outside of a crypto CS class, just use other people's code. Port code only if you absolutely must and then test the snot out of it with known good code.
Most experts agree that openness is more valuable than obfuscation in developing cryptographic methods and algorithms.
In other words, everyone seems to be able to design a new code that everyone can break except them. The best crypto survives the test of having the algorithm and some encrypted messages put out there and having the best crypto hackers try to break it.
In general, most obfuscation methods and simple hashing (and I've done quite a few of them myself) are very easily broken. That doesn't mean they aren't fun to experiment with and learn about.
List of Cryptography Books (from Wikipedia)
This question caught my eye because I'm currently re-reading Cryptonomicon by Neal Stephenson, which isn't a bad overview itself, even though it's a novel...
To echo everyone else (for posterity), never ever implement your own crypto. Use a library.
That said, here is an article on how to implement DES:
http://scienceblogs.com/goodmath/2008/09/des_encryption_part_1_encrypti.php
Permutation and noise are crucial to many encryption algorithms. The point isn't so much to obscure things, but to add steps to the process that make brute force attacks impractical.
Also, get and read Applied Cryptography. It's a great book.
Have to agree with other posters. Don't unless you are writing a paper on it and need to do some research or something.
If you think you know a lot about it go and read the Applied Cryptography book. I know a lot of math and that book still kicked my butt. You can read and analyze from his pseudo-code. The book also has a ton of references in the back to dig deeper if you want.
Crypto is one of those things that a lot of people think is very cool, but the actual math behind the concepts is beyond their grasp. I decided a long time ago that it was not worth the mental effort for me to get to that level.
If you just want to see HOW it is done (study existing implementations in code) I would suggest taking a peek at the Crypto++ library even if you don't normally code in C++ it is a good view of the topics and parts of implementing encryption.
Bruce also has a good list of resources you can get from his site.
I attended a code security session at this years Aus TechEd. When talking about the AES algorithm in .Net and how it was selected, the presenter (Rocky Heckman) told us one of the techniques that had been used to break the previous encryption. Someone had managed to use a thermal imaging camera to record a cpu's heat signature whilst it was encrypting data. They were able to use this recording to ascertain what types of calculations the chip was doing and then reverse engineer the algoritm. They had way too much time on their hands, and I am fairly confident I will never be smart enough to beat people like that! :(
Note: I sincerely hope I have relayed the story correctly, if not - the mistake is likely mine, not that of the presenter mentioned.
It's already been beaten to death that you shouldn't use home grown crypto in a product. But I've read your question and you clearly state that you're just doing it for fun. Sounds like the true geek/hacker/academic spirit to me. You know it works, you want to know why it works and try to see if you can make it work.
I completely encourage that and do the same with many programs I've written just for fun. I suggest reading this post (http://rdist.root.org/2008/09/18/dangers-of-amateur-cryptography/) over at a blog called "rootlabs". In the post are a series of links that you should find very interesting. A guy interested in math/crypto with a PhD in Computer Science and who works for Google decided to write a series of articles on programming crypto. He made several non-obvious mistakes that were pointed out by industry expert Nate Lawson.
I suggest you read it. If it doesn't encourage you to keep trying, it will no doubt still teach you something.
Best of luck!
I agree with not re-inventing the wheel.
And remember, security through obscurity is no security at all. If any part of your security mechanisms use the phrase "nobody will ever figure this out!", it's not secure. Think about AES -- the algorithm is publicly available, so everybody knows exactly how it works, and yet nobody can break it.
Per other answers - inventing an encryption scheme is definitely a thing for the experts and any new proposed crypto scheme really does need to be put to public scrutiny for any reasonable hope of validation and confidence in its robustness. However, implementing existing algorithms and systems is a much more practical endeavor "for fun" and all the major standards have good test vectors to help prove the correctness of your implementation.
With that said, for production solutions, existing implementations are plentiful and there should typically be no reason you would need to implement a system yourself.
I agree with all the answers, both "don't write your own crypto algorithm for production use" and "hell yeah, go for it for your own edification", but I am reminded of something that I believe the venerable Bruce Schneier often writes: "it's easy for someone to create something that they themselves cannot break."
The only cryptography that an non experts should be able to expect to get right is bone simple One Time Pad ciphers.
CipherTextArray = PlainTextArray ^ KeyArray;
Aside from that, anything even worth looking at (even for recreation) will need a high level degree in math.
I dont want to go into depth on correct answers that have already been given (don't do it for production; simple reversal not enough; obfuscation bad; etc).
I just want to add Kerckoff's principle, "A cryptosystem should be secure even if everything about the system, except the key, is public knowledge".
While I'm at it, I'll also mention Bergofsky's Principle (quoted by Dan Brown in Digital Fortress): "If a computer tried enough keys, it was mathematically guaranteed to find the right one. A code’s security was not that its pass-key was unfindable but rather that most people didn’t have the time or equipment to try."
Only that's inherently not true; Dan Brown made it up.
Responding to PhirePhly and tduehr, on the complexity of factoring:
It can readily be seen that factoring is in NP and coNP. What we need to see is that the problems "given n and k, find a prime factor p of n with 1 < p <= k" and "show that no such p exists" are both in NP (the first being the decision variant of the factoring problem, the second being the decision variant of the complement).
First problem: given a candidate solution p, we can easily (i.e. in polynomial time) check whether 1 < p <= k and whether p divides n. A solution p is always shorter (in the number of bits used to represent it) than n, so factoring is in NP.
Second problem: given a complete prime factorization (p_1, ..., p_m), we can quickly check that their product is n, and that none are between 1 and k. We know that PRIMES is in P, so we can check the primality of each p_i in polynomial time. Since the smallest prime is 2, there is at most log_2(n) prime factors in any valid factorization. Each factor is smaller than n, so they use at most O(n log(n)) bits. So if n doesn't have a prime factor between 1 and k, there is a short (polynomial-size) proof which can be verified quickly (in polynomial time).
So factoring is in NP and coNP. If it was NP-complete, then NP would equal coNP, something which is often assumed to be false. One can take this as evidence that factoring is indeed not NP-complete; I'd rather just wait for a proof ;-)
Usually, I start by getting a Ph.D in number theory. Then I do a decade or so of research and follow that up with lots of publishing and peer review. As far as the techniques I use, they are various ones from my research and that of my peers. Occasionally, when I wake up in the middle of the night, I'll develop a new technique, implement it, find holes in it (with the help of my number theory and computer science peers) and then refine from there.
If you give a mouse an algorithm...
Related
I am developing a Windows Form Application in C#.I have heard that one should not use built in methods and functions in code since hackers have deep understanding of such built in methods and know how to fail them Instead one should always use his/her own functions and methods and if not then call built in functions intelligently from those newly made functions.How much is that true?
A supporting example in favour of my argument is that I have seen developer always develope there own made encryption algorithm like AES,DES,RC4 and Hash functions since they believe that built in encryption algorithm have many times backdoor in them.
What?! No, no, no! Whoever told you this is just wrong.
There is a common fallacy that published source code is more vulnerable to "h4ckerz" because it is available for anyone to spot the flaws in. However, I'm glad you mentioned crypto, because this is an area where this line of reasoning really stands out as the fallacy it is.
One of the most popular questions of all time on https://security.stackexchange.com/ is about a developer (in the OP he was given the pseudonym "Dave") who shared this fear of published code. Dave, like the developer you saw, was trying to homebrew his own encryption algorithm. Here's one of the most popular comments in that thread:
Dave has a fundamentally false premise, that the security of an algorithm relies on (even partially) its obscurity - that's not the case. The security of a hashing algorithm relies on the limits of our understanding of mathematics, and, to a lesser extent, the hardware ability to brute-force it. Once Dave accepts this reality (and it really is reality, read the Wikipedia article on hashing), it's a question of who is smarter - Dave by himself, or a large group of specialists devoted to this very particular problem. (emphasis added)
As a matter of fact, as it stands now, the top two memes on Security.SE are "Don't roll your own" and "Don't be a Dave".
While this has all been about crypto, this applies in general to most open-source software. The chance that a backdoor will get found and fixed goes up with each new set of eyes laid on the code. This should be a simple and uncontroversial premise: the more people are looking for something, the higher the chance it will be found. Yes, this applies to malicious users looking for exploits. However, it also applies to power users, white hat hackers, security researchers, cryptographers, professional developers, and others working for "good", which generally (hopefully) outnumber those working for "evil". This also implicitly relies on the false premise that hackers need to see the source code to do bad things. This should be obviously false based on the sheer number of proprietary systems whose source code has never been published (various Microsoft and Adobe programs come to mind) which have been inundated with vulnerabilities for years. Maybe having source code to read makes the hacker's job easier, but maybe not -- is it easier to pore over source code looking for an attack vector or to just use scanning tools and scripts against a compiled binary?
tl;dr Don't be a Dave. Rolling your own means you have to be the best at everything to succeed, instead of taking a sampling of the best the community has to offer.
Heartbleed
In your comment, you rebut:
Then why was the Heartbleed bug in openSSL not found and corrected [earlier]?
Because no one was looking at it. That's the sad truth. Here's the difference -- what happened once someone did find it? Now tens of thousands of security researchers, crypto experts, and others are looking at it. Suppose the same kind of vulnerability existed in one of the proprietary products I mentioned earlier, which it very well could. Once it's caught (if it's caught), ask yourself:
Could the team of programmers at the company responsible benefit from the help of the entire worldwide community of security experts, cryptographers, and other analysts right now?
If a bug this critical were discovered (and that's a big if!) in your software, would you be prepared to deal with the fallout caused by your custom implementation?
Unless you know of specific failure modes or weaknesses of the built-in methods your application would use and know how to minimize or eliminate them, it is probably better to use the methods provided by the language or library designers, which will often be both more efficient and more secure than what an average programmer would come up with on the fly for a particular project.
Your example absolutely does not support your view: developing your own encryption algorithm without some serious background in the domain and review by cryptanalysts, and then employing it in security-critical code, is a recipe for disaster. Even developing your own custom implementation of an industry standard encryption algorithm can present problems, and almost certainly will if you are inexperienced at it.
I've heard a bit about using automated theorem provers in attempts to show that security vulnerabilities don't exist in a software system. In general this is fiendishly hard to do.
My question is has anyone done work on using similar tools to find vulnerabilities in existing or proposed systems?
Eidt: I'm NOT asking about proving that a software system is secure. I'm asking about finding (ideally previously unknown) vulnerabilities (or even classes of them). I'm thinking like (but an not) a black hat here: describe the formal semantics of the system, describe what I want to attack and then let the computer figure out what chain of actions I need to use to take over your system.
Yes, a lot of work has been done in this area. Satisfiability (SAT and SMT) solvers are regularly used to find security vulnerabilities.
For example, in Microsoft, a tool called SAGE is used to eradicate buffer overruns bugs from windows.
SAGE uses the Z3 theorem prover as its satisfiability checker.
If you search the internet using keywords such as “smart fuzzing” or “white-box fuzzing”, you will find several other projects using satisfiability checkers for finding security vulnerabilities.
The high-level idea is the following: collect execution paths in your program (that you didn't manage to exercise, that is, you didn't find an input that made the program execute it), convert these paths into mathematical formulas, and feed these formulas to a satisfiability solver.
The idea is to create a formula that is satisfiable/feasible only if there is an input that will make the program execute the given path.
If the produced formula is satisfiable (i.e., feasible), then the satisfiability solver will produce an assignment and the desired input values. White-box fuzzers use different strategies for selecting execution paths.
The main goal is to find an input that will make the program execute a path that leads to a crash.
So, at least in some meaningful sense, the opposite of proving something is secure is finding code paths for which it isn't.
Try Byron Cook's TERMINATOR project.
And at least two videos on Channel9. Here's one of them
His research is likely to be a good starting point for you to learn about this extremely interesting area of research.
Projects such as Spec# and Typed-Assembly-Language are related too. In their quest to move the possibility of safety checks from runtime back to compile-time, they allow the compiler to detect many bad code paths as compilation errors. Strictly, they don't help your stated intent, but the theory they exploit might be useful to you.
I'm currently writing a PDF parser in Coq together with someone else. While the goal in this case is to produce a secure piece of code, doing something like this can definitely help with finding fatal logic bugs.
Once you've familiarized yourself with the tool, most proof become easy. The harder proofs yield interesting test cases, that can sometimes trigger bugs in real, existing programs. (And for finding bugs, you can simply assume theorems as axioms once you're sure that there's no bug to find, no serious proving necessary.)
About a moth ago, we hit a problem parsing PDFs with multiple / older XREF tables. We could not prove that the parsing terminates. Thinking about this, I constructed a PDF with looping /Prev Pointers in the Trailer (who'd think of that? :-P), which naturally made some viewers loop forever. (Most notably, pretty much any poppler-based viewer on Ubuntu. Made me laugh and curse Gnome/evince-thumbnailer for eating all my CPU. I think they fixed it now, tho.)
Using Coq to find lower-level bugs will be difficult. In order to prove anything, you need a model of the program's behavior. For stack / heap problems, you'll probably have to model the CPU-level or at least C-level execution. While technically possible, I'd say this is not worth the effort.
Using SPLint for C or writing a custom checker in your language of choice should be more efficient.
STACK and KINT used constraint solvers to find vulnerabilities in many OSS projects, like the linux kernel and ffmpeg. The project pages point to papers and code.
It's not really related to theorem-proving, but fuzz testing is a common technique for finding vulnerabilities in an automated way.
There is the L4 verified kernel which is trying to do just that. However, if you look at the history of exploitation, completely new attack patterns are found and then a lot of software written up to that point is very vulnerable to attacks. For instance, format string vulnerabilities weren't discovered until 1999. About a month ago H.D. Moore released DLL Hijacking and literally everything under windows is vulnerable.
I don't think its possible to prove that a piece of software is secure against an unknown attack. At least not until a theorem is able to discover such an attack, and as far as I know this hasn't happened.
Disclaimer: I have little to no experience with automated theorem provers
A few observations
Things like cryptography are rarely ever "proven", just believed to be secure. If your program uses anything like that, it will only be as strong as the crypto.
Theorem provers can't analyze everything (or they would be able to solve the halting problem)
You would have to define very clearly what insecure means for the prover. This in itself is a huge challenge
Yes. Many theorem proving projects show the quality of their software by demonstrating holes or defects in software. To make it security related, just imagine finding a hole in a security protocol. Carlos Olarte's Ph.D. thesis under Ugo Montanari has one such example.
It is in the application. Not really the theorem prover itself that has anything to do with security or special knowledge thereof.
This question already has answers here:
Closed 12 years ago.
Possible Duplicates:
How do you protect your software from illegal distribution?
Best practice to prevent software copy
Hypothetical situation:
Lets say I have built a software product from the scratch and it does wonderful things. The only problem is that, once someone takes a look at the code, they will understand it very easily and they can easily build it up themselves.
Now, the thing is that I built the code from the scratch 100% and uses a mixture of API calls.
Nobody else is involved in the development of the code.
If I want to sell this product, what is the guarantee that someone much smarter than me will reverse engineer the whole thing and come up with better product?
Right now I am thinking of fragmenting the whole code. Adding lots of redundant code and tonnes of comments.
Is there any software which encrypts the software code, that will make debugging, troubleshooting, and understanding how the code works virtually impossible? and yet runs as usual? so that the developer can have peace of mind?
Very few things in a program are truly novel. Almost everything that you are likely to put into your code, someone else could invent on their own. Generally more easily than they could learn it by reading your code. Reading code is harder than writing it, and most programmers don't really like doing it anyway.
So it's much more likely that they will look at your app and think "I could do that", then "That's cool, I'm gonna read that code and then copy it!". Even if they understand it, you will still own the copyright, you still get to market first.
I recommend that you just forget about it.
once someone takes a look at the
code, they will understand it very
easily and they can easily build it up
themselves.
So don't give anybody the source code.
If I want to sell this product, what
is the guarantee that someone much
smarter than me will reverse engineer
the whole thing and come up with
better product?
(a) So start selling it now and capture the market. Reverse engineering takes time, during which you are capturing market and 'mind-share'. (b) Put a provision in your licence agreement that prohibits reverse-engineering. (c) Make sure everybody who gets the product signs the agreement.
Right now I am thinking of fragmenting
the whole code. Adding lots of
redundant code and tonnes of comments.
That only has a point if you're going to distribute the source code. In which case nobody even needs to reverse-engineer. They have your source code. Don't give it to them.
Is there any software ...
There's lots of software that purports to do this job. However it is a technical solution to a business problem. All software can be reverse-engineered, because at some point or other it all has to be decrypted and de-obfuscated to the point where the CPU will understand it. At that point it is essentially plaintext. So no technical solution is formally speaking possible (short of something like code that executes in a tamper-proof HSM).
I will add that there is another business mechanism you can use to defend against business loss, which is what this is all about: price. Make the price so high that the licensees will value their copy and not permit it to be inspected, or make it so low that reverse-engineering is cost-infeasible; or make it free and make your money on the support contract.
Once you actually have the knowledge and experience to write such a codebase, it will be clear to you that obfuscation is meant to deter casual IP infringement.
Someone who wants to know your code is going to know your code.
If it becomes an issue of monetary loss, the courts are your protection.
That's how it works.
Someone will always be able to understand and work out your code. Heck, if you had 0 way getting to the code, even just using the system is enough for someone to be able to replicate the process.
Example: I take a jug of water and pour it into the cup, while my back is facing to another person. This other person knows that water and gravity are awesome at making things fall into other containers, so they can then work out a process of lifting a jug to let gravity (API call) work in their favour. They mightn't know exact what angle you used in your forearm and any super-sneaky cup-holding techniques you used, but they can replicate the same process and improve on it over time.
tl;dr: You can't protect code.
The thing to do is invent even more wonderful things while the competition is reverse-engineering your current stuff. It's called competing through innovation.
I am not a lawyer
if you are really worried about it, to the point you are willing to invest money in it, dont protect your code (beyond something reasonable like obfuscation or encryption) but rather patent your idea and your art. Then if someone does take it, reverse engineer it and make a better process based of yours, you have legal grounds to get your money.
There are tons of things you will have to do, include proving they took your idea (which isnt easy), but if this is the solution to world hunger and all of humanities problems its the thing to do.
Now for the downside, I will guess, and probably be 90% right that your method is:
Not patentable, for various reasons (I was amazed at the number of already patented ideas, and how difficult it was to identify original art)
Not new, or unique (i.e. there is already established art for it)
Not worth patenting because the expense far outways the benefits
An IP lawyer can tell you for sure, and the expense of a consult is not that much. Overall it will be cheaper to consult with them then to invest a lot of time in hiding code.
Good luck.
Don't even bother. If your code really "does wonderful things" be assured that it'll get hacked. And be it just for curiosity.
There is no 100% way to protect your code from reverse engineering. What language are we talking about? If this is C/C++ then it is pretty hard to reverse engineer, more you could strip it from debugging information etc. But if this is for example Java then even if you obfuscate the code, there are some pretty cool tools (like JAD) that will reveal much of your work anyway.
Despite all of this I think you should try to change your attitude. Big companies pay a lot of money for simple solutions and it seems that nowadays service is the most important thing, not the software (hence the success of open-software based companies). So, if you have a great software don't be scared that someone might steal it, rather think how to sell it good.
Is there any software which encrypts the software code, that will make debugging, troubleshooting, and understanding how the code works virtually impossible? and yet runs as usual? so that the developer can have peace of mind?
This is the totally wrong mindset IMO. What happens if you get hit by a bus? Your company goes bankrupt? All your data gets destroyed in a fire? For every single one of your customers, the value of their investment in your software will drop, and eventually reach zero, because the software can't be developed, or troubleshot, any further without you. I have seen so much money wasted that way, I think it's a horrible business model.
I earn my bread with making software myself so I know the hardships of making a living with it. Still, obfuscation can't be the way to go nowadays. Impose strict license agreements on your customers, scare the hell out of them so they don't even think about redistributing the software, but leave it open.
This is futile. There is always someone smarter than you and therefore they will be able to reverse engineer your obfuscation.
Usually someone smart enough to hack your code and use it in a meaningful way is smart enough to do it on their own, and probably thinks they can do it better than you did, so they won't bother stealing your stuff.
Don't worry about the people who can hack your code but not make meaningful use of it. If you've done a good job, this can only reinforce the quality of the job you've done (think of all the crappy touchscreen phone imitators).
They are going to reverse-engineer your code. Nothing can stop them.. The only thing you can do is make it harder. This ranges from obfuscating code that is inheritely "open" such as PHP and Javascript, all the way down to littering your code with a crap load of self-modification.
In a lot of ways, I think, the thing that makes a piece of software valuable, is not the crazy technological advancement that it provides, but rather the things that we think might think of as being tertiary to the piece of software itself. Like the fact that you'll be there to support it. Or that it's provided as a web service and you'll be there to make sure the server is running. Or that it's a community, and you'll be there to moderate and build the community.
While you may be actually selling code, the value you that your code has isn't intrinsic to the code itself, but rather derives from the features and ecosystem that surrounds your code.
I've read quite a few times how I shouldn't use cryptography if I'm not an expert. Basically both Jeff and Eric tell you the same:
Cryptography is difficult, better buy the security solution from experts than doing it yourself.
I completely agree, for a start it's incredibly difficult to perceive all possible paths an scenario might take, all the possible attacks against it and against your solution... but then When should we use it?
I will face in a few months with the task of providing a security solution to a preexisting solution we have. That is, we exchange data between servers, second phase of the project is providing good security to it. Buying a third party solution will eat up the budget anyway so ... When is it good to use cryptography for a security solution? Even if you are not a TOP expert.
Edit: To clarify due to some comments.
The project is based on data transport across network locations, the current implementation allows for a security layer to be placed before transport and we can make any changes in implementation we like (assuming reasonable changes, the architecture is well design so changes should have an acceptable impact). The question revolves around this phrase from Eric Lippert:
I don’t know nearly enough about cryptography to safely design or implement a crypto-based security system.
We're not talking about reinventing the wheel, I had in mind a certain schema when I designed the system that implied secure key exchange, encryption and decryption and some other "counter measures" (man in the middle, etc) using C# .NET and the included cryptography primitives, but I'm by no means an expert in the field so when I read that, I of course start doubting myself. Am I even capable of implementing a secure system? Would it always be parts of the system that will be insecure unless I subcontract that part?
I think this blog posting (not mine!) gives some good guidelines.
Other than that there are some things you should never do unless you're an expert. This is stuff like implementing your own crypto algorithm (or your own version of a published algorithm). It's just crazy to do that yourself! (When there's CAPI, JCE, OpenSSL, ....)
Beyond that though if you're 'inventing' anything it's almost certainly wrong. In the Coding Horror post you linked to - the main mistake to my mind is that he's doing it a very low level and you just don't need to. If you were encrypting things in Java (I'm not so familiar with .NET) you could use Jasypt which uses strong default algorithms and parameters and doesn't require you to know about ECB and CBC (though, arguably, you should anyway just because...).
There is going to be a prebuilt system for just about anything you're going to want to do with crypto. If you're storing keys then theres KeyCzar, in other cases theres Jasypt. The point is if you're doing anything 'unusual' with crypto - you shouldn't be; if you're doing something not 'unusual' then you don't need to do the crypto yourself. Don't invent a new way to store keys, generate keys from passwords, verify signatures etc - it's not necessary, it's complicated and you'll almost certainly make a mistake unless you're very very careful...
So... I don't think you necessarily need to be afraid of encrypting things but be aware that if you're specifying algorithms and parameters to those algorithms directly in your code it is probably not good. There are exceptions to any rule but as in the blog post I linked above - if you type AES into your code you're doing it wrong!
The key "take-away" from the Matasano blog post is right at the end (note that TLS is a more precise name for SSL):
THOMAS PTACEK
GPG for data at rest. TLS for data in
motion.
NATE LAWSON
You can also use Guttman's cryptlib,
which has a sane API. Or Google
Keyczar. They both have really simple
interfaces, and they try to make it
hard to do the wrong thing. What we
need are fewer libraries with higher
level interfaces. But we also need
more testing for those libraries.
The rule of thumb with cryptography isn't that you shouldn't use it if you're not an expert; rather, it's that you shouldn't re-invent the wheel unless you're an expert. In other words, use existing implementations / libraries / algorithms as much as possible. For example, don't write your own cryptographic authentication algorithm, or come up with yet another way to store keys.
As for when to use it: whenever you have data that needs to be protected from having others see it. Beyond that, it comes down to which algorithms / approaches are best: SSL vs. IPsec vs. symmetric vs. PKI, etc.
Also, a word of advice: key management is often the most challenging part of any comprehensive cryptographic solution.
You have things backwards: first you must specify your actual requirements in detail ("provide a security solution" is meaningless marketing drivel). Then you look for ways to satisfy those specific requirements; croptography will satisfy some of them.
Example of requirements that cryptography can satisfy:
Protect data sent over publich channels from spying
Protect data against tampering (or rather, detect manipulated data)
Allow servers and clients as well as users to prove their identity to each other
You need to go through the same process as for any other requirement. What is the problem being solved, what is the outcome the users are looking for, how is the solution proposed going to be supported going forward, what are the timescales involved. Sometimes there is an off the shelf solution that does the job, sometimes what you want needs to be developed as a custom solution, and sometimes you'll choose a custom solution as it will work out more cost effective than an off the shelf one.
The same is true with security requirements, but the added complexity is that to do any sort of custom solution requires additional expertise in the technical teams (development, support etc). There is also the issue that the solution may need to be not only secure but recognised as secure. This may be far easier to achieve with an off the shelf solution.
And RickNZ is absolutely right - don't forget key management. Consider this right at the outset as part of the decision making process.
The question I would start by asking, is what are you trying to achieve.
If you are trying to just secure the transmission of the data from server a to server b, then there are a number of mechanisms you could use, which would require little work, such as SSL.
However if you are trying to secure all of the data stored in the application that is a far more difficult, although if it is a requirement, then I would suggest that any cryptography, regardless of how easy to break, is better than none.
As someone who has been asked to do similar things, you face a daunting number of questions in implementing your system. There are major difference between securing a system and implementing cryptography systems.
Implementing a cryptography system is very difficult and experts routinely get it wrong, both in theory and practice. A famous theoretical failure was the knapsack cryptosystem which has been largely abandoned due to the Lenstra–Lenstra–Lovász lattice basis reduction algorithm. On the other side, we saw in the last year how an incorrect seed in Debian's random number generator opened up any key generated by the OS. You want to use a prepackaged cryptosystem, not because its an "experts-only" field, but because you want a community tested and supported system. Almost every cryptographic algorithm I know of has bounds that assume certain tasks to be hard, and if those tasks turn out to be computable (as in the LLL algorithm) the whole system becomes useless over night.
But, I believe, the real heart of the question is how to use things in order to make a secure system. While there are many libraries out there to generate keys, cipher the text, and so on, there are very few systems that implement the entire package. But as always security boils down to two concepts: worth of protection and circle of trust.
If you are guarding the Hope diamond, you spend a lot of money designing a system to protect it, employ a constant force to watch it, and hire crackers to continually try to break in. If you are just discouraging bored teenagers from reading your email, you hack something up in an hour and you don't use that address for secret company documents.
Additionally managing the circle of trust is just as difficult of a task. If your circle includes tech savvy, like-minded friends, you make a system and give them a large amount of trust with the system. If it includes many levels of trust, such as users, admins, and so on, you have a tiered system. Since you have to manage more and more interactions with a larger circle, the bugs in the larger system become more weaknesses to hack and thus you must be extremely careful in designing this system.
Now to answer your question. You hire a security expert the moment the item you're protecting is valuable enough and your circle of trust includes those you cannot trust. You don't design cryptography systems unless you do it for a living and have a community to break them, it is a full time academic discipline. If you want to hack for fun, remember that it is only for fun and don't let the value of what you are protecting get too high.
Pay for security (of which cryptography is a part but only a part) what it is worth but no more. So your first task is to decide what your security is worth, or or how much various states of security are worth. Then invite whoever holds the budget to select which state to aim for and therefore how much to spend.
No absolutes here, it's all relative.
Why buy cryptography? It's one of the most developed area in open source software of great quality :) See for example TrueCrypt or OpenSSL
There is a good chance that whatever you need cryptography for there is already a good quality, reputable open source project for it! (And if you can see the source you can see what they did; I once saw an article about a commercial software supposed to "encrypt" a file that simply xorred every byte with a fixed value!)
And, also, why would you want to re-invent the wheel? It's unlikely that with no cryptography background you will do better or even come close to the current algorithms such as AES.
I think it totally depends on what you are trying to achieve.
Does the data need to be stored encrypted at either end or does it just need to be encrypted whilst in transit?
How are you transferring the data? FTP, HTTP etc?
Probably not a good idea to have security as a second phase as by that point presumably you've been moving data around insecurely for a period of time?
I'm modifying existing security code. The specifications are pretty clear, there is example code, but I'm no cryptographic expert. In fact, the example code has a disclaimer saying, in effect, "Don't use this code verbatim."
While auditing the code I'm to modify (which is supposedly feature complete) I ran across this little gem which is used in generating the challenge:
static uint16 randomSeed;
...
uint16 GetRandomValue(void)
{
return randomSeed++;/* This is not a good example of very random generation :o) */
}
Of course, the first thing I immediately did was pass it around the office so we could all get a laugh.
The programmer who produced this code knew it wasn't a good algorithm (as indicated by the comment), but I don't think they understood the security implications. They didn't even bother to call it in the main loop so it would at least turn into a free running counter - still not ideal, but worlds beyond this.
However, I know that the code I produce is going to similarly cause a real security guru to chuckle or quake.
What are the most common security problems, specific to cryptography, that I need to understand?
What are some good resources that will give me suitable knowledge about what I should know beyond common mistakes?
-Adam
Don't try to roll your own - use a standard library if at all possible. Subtle changes to security code can have a huge impact that aren't easy to spot, but can open security holes. For example, two modified lines to one library opened a hole that wasn't readily apparent for quite some time.
Applied Cryptography is an excellent book to help you understand crypto and code. It goes over a lot of fundamentals, like how block ciphers work, and why choosing a poor cipher mode will make your code useless even if you're using a perfectly implemented version of AES.
Some things to watch out for:
Poor Sources of Randomness
Trying to design your own algorithm or protocol - don't do it, ever.
Not getting it code reviewed. Preferably by publishing it online.
Not using a well established library and trying to write it yourself.
Crypto as a panacea - encrypting data does not magically make it safe
Key Management. These days it's often easier to steal the key with a side-channel attack than to attack the crypto.
Your question shows one of the more common ones: poor sources of randomness. It doesn't matter if you use a 256 bit key if they bits aren't random enough.
Number 2 is probably assuming that you can design a system better than the experts. This is an area where a quality implementation of a standard is almost certainly going to be better than innovation. Remember, it took 3 major versions before SSL was really secure. We think.
IMHO, there are four levels of attacks you should be aware of:
social engineering attacks. You should train your users not to do stupid things and write your software such that it is difficult for users to do stupid things. I don't know of any good reference about this stuff.
don't execute arbitrary code (buffer overflows, xss exploits, sql injection are all grouped here). The minimal thing to do in order to learn about this is to read Writing Secure Code from someone at MS and watching the How to Break Web Software google tech talk. This should also teach you a bit about defense in depth.
logical attacks. If your code is manipulating plain-text, certificates, signatures, cipher-texts, public keys or any other cryptographic objects, you should be aware that handling them in bad ways can lead to bad things. Minimal things you should be aware about include offline&online dictionary attacks, replay attacks, man-in-the-middle attacks. The starting point to learning about this and generally a very good reference for you is http://www.soe.ucsc.edu/~abadi/Papers/gep-ieee.ps
cryptographic attacks. Cryptographic vulnerabilities include:
stuff you can avoid: bad random number generation, usage of a broken hash function, broken implementation of security primitive (e.g. engineer forgets a -1 somewhere in the code, which renders the encryption function reversible)
stuff you cannot avoid except by being as up-to-date as possible: new attack against a hash function or an encryption function (see e.g. recent MD5 talk), new attack technique (see e.g. recent attacks against protocols that send encrypted voice over the network)
A good reference in general should be Applied Cryptography.
Also, it is very worrying to me that stuff that goes on a mobile device which is probably locked and difficult to update is written by someone who is asking about security on stackoverflow. I believe your case would one of the few cases where you need an external (good) consultant that helps you get the details right. Even if you hire a security consultant, which I recommend you to do, please also read the above (minimalistic) references.
What are the most common security problems, specific to cryptography, that I need to understand?
Easy - you(1) are not smart enough to come up with your own algorithm.
(1) And by you, I mean you, me and everyone else reading this site...except for possibly Alan Kay and Jon Skeet.
I'm not a crypto guy either, but S-boxes can be troublesome when messed with (and they do make a difference). You also need a real source of entropy, not just a PRNG (no matter how random it looks). PRNGs are useless. Next, you should ensure the entropy source isn't deterministic and that it can't be tampered with.
My humble advice is: stick with known crypto algorithms, unless you're an expert and understand the risks. You could be better off using some tested, publicly-available open source / public domain code.