Latest stream cipher considered reasonably secure & easy to implement? - security

(A)RC4 used to fit the bill, since it was so simple to write. But it's also less-than-secure these days.
I'm wondering if there's a successor that's:
Code is small enough to write & debug within an hour or so, using pseudo code as a template.
Still considered secure, as of 2010.
Optimized for software.
Not encumbered by licensing issues.
I can't use crypto libraries, otherwise all of this would be moot. Also, I'll consider block algorithms though I think most are pretty hefty.
Thanks.

Honestly your best bet is to go use a crypto library. Its an already tested platform and when even the crypto libraries can/do have trouble with implementing the algorithms... Its better to use the pre-existing crypto libraries, its already tough enough to do encryption/decryption correctly using the API as it is as in this post on Coding Horror: Why Isn't My Encryption.. Encrypting?
Now I've gone to the Wikipedia article on Stream ciphers it might be worth going through the list of ciphers on the article, there has been several ciphers developed since RC4 in 1987, and to my very limited cryptography knowledge some of them seems like they might be more secure than RC4. You may also want to consider checking out the Wikipedia article on eSTREAM. There are several ciphers which are in the portfolio: HC-128, Rabbit, Salsa20/12, SOSEMANUK.

No cipher is easy to implement, especially symmetric ciphers and they never will be. There is a lot that can go wrong, and most programmers don't realize this. You need to do a lot more reading into this topic.
With block ciphers you must be concerned with the mode you use, and different modes fill different needs(But ECB is always the wrong choice). You must also be very careful about maintaining a unique IV for each message. If you are using a "password" as your key then you have to use a string2key function.
Stream ciphers don't have IV's or modes, and this actually makes things more difficult. A stream cipher function accepts only a key and the output is a "PRNG stream" that is infinity large. This stream of random data is then XOR'ed with your message. So if you use the same key, you will get the same PRNG stream. If an attacker knows the plain text of 1 message (or a part of a message) then he can XOR out the PRNG from the cipher text and then decrypt all other messages using that key in constant time O(1). For a stream cipher to be practically secure you can never reuse the same key.
I highly recommend that you pick up a copy of Practical Cryptography, which has a few chapters dedicated to Symmetric Cipher attacks. This book is straight to the point and doesn't require a lot of math. If don't really care about implementing your own then you can use a proven cipher implementation such as Jasypt which "just works" in a very secure way.

Related

How to protect the encryption key from reverse engineering?

My software is using AES Rijndael.
I am using a SHA-256 hash to generate a key from a string with an arbitrary length, and then passing this as both the private and public key since in this instance I do not need to differentiate between the two.
How do I protect my key from being hacked out of the executable?
I know not to use a literal but instead generate the key at runtime with some predetermined steps, but all the same the key will still be in memory right before its sent on to the AES initialization function and so can quite easily be retrieved then.
AES is obviously very secure, but what good does that do me if someone breaks the executable instead?
Is there some common practise when solving this problem?
This can't be done. This is the basic problem with e.g. DRM scheme's on PC's: they need to have the key in memory, so it can be extracted. You can maybe obscure it while it is not in use, but that's about it. And if your application is popular and distributed, then somebody will crack you delicious scheme. That's why some companies use dongles or TPM chips for high value applications.
There is something - very complex in mathematical theory - called "whitebox cryptography". In this case, the AES algorithm is modified in a way, that it builds up the secret during encryption. I do not know exactly, how this is achieved, but this one does not need to have a initialized secret, but the secret is part of the algorithm.
An attacker might see, that your AES implementation is a bit "different" but at no time in execution the key is visible in memory. The only chance an attacker will have, is to copy the whole whitebox code but it is really hard to reverse engineer this - he would just be able to use it. Anyway depending on the way you use the AES, this might be enough to break in.

AES vs Blowfish for file encryption

I want to encrypt a binary file. My goal is that to prevent anyone to read the file who doesn't have the password.
Which is the better solution, AES or Blowfish with the same key length? We can assume that the attacker has great resources (softwares, knowledge, money) for cracking the file.
Probably AES. Blowfish was the direct predecessor to Twofish. Twofish was Bruce Schneier's entry into the competition that produced AES. It was judged as inferior to an entry named Rijndael, which was what became AES.
Interesting aside: at one point in the competition, all the entrants were asked to give their opinion of how the ciphers ranked. It's probably no surprise that each team picked its own entry as the best -- but every other team picked Rijndael as the second best.
That said, there are some basic differences in the basic goals of Blowfish vs. AES that can (arguably) favor Blowfish in terms of absolute security. In particular, Blowfish attempts to make a brute-force (key-exhaustion) attack difficult by making the initial key setup a fairly slow operation. For a normal user, this is of little consequence (it's still less than a millisecond) but if you're trying out millions of keys per second to break it, the difference is quite substantial.
In the end, I don't see that as a major advantage, however. I'd generally recommend AES. My next choices would probably be Serpent, MARS and Twofish in that order. Blowfish would come somewhere after those (though there are a couple of others that I'd probably recommend ahead of Blowfish).
It is a not-often-acknowledged fact that the block size of a block cipher is also an important security consideration (though nowhere near as important as the key size).
Blowfish (and most other block ciphers of the same era, like 3DES and IDEA) have a 64 bit block size, which is considered insufficient for the large file sizes which are common these days (the larger the file, and the smaller the block size, the higher the probability of a repeated block in the ciphertext - and such repeated blocks are extremely useful in cryptanalysis).
AES, on the other hand, has a 128 bit block size. This consideration alone is justification to use AES instead of Blowfish.
In terms of the algorithms themselves I would go with AES, for the simple reason is that it's been accepted by NIST and will be peer reviewed and cryptanalyzed for years. However I would suggest that in practical applications, unless you're storing some file that the government wants to keep secret (in which case the NSA would probably supply you with a better algorithm than both AES and Blowfish), using either of these algorithms won't make too much of a difference. All the security should be in the key, and both of these algorithms are resistant to brute force attacks. Blowfish has only shown to be weak on implementations that don't make use of the full 16 rounds. And while AES is newer, that fact should make you lean more towards BlowFish (if you were only taking age into consideration). Think of it this way, BlowFish has been around since the 90's and nobody (that we know of) has broken it yet....
Here is what I would pose to you... instead of looking at these two algorithms and trying to choose between the algorithm, why don't you look at your key generation scheme. A potential attacker who wants to decrypt your file is not going to sit there and come up with a theoretical set of keys that can be used and then do a brute force attack that can take months. Instead he is going to exploit something else, such as attacking your server hardware, reverse engineering your assembly to see the key, trying to find some config file that has the key in it, or maybe blackmailing your friend to copy a file from your computer. Those are going to be where you are most vulnerable, not the algorithm.
AES.
(I also am assuming you mean twofish not the much older and weaker blowfish)
Both (AES & twofish) are good algorithms. However even if they were equal or twofish was slightly ahead on technical merit I would STILL chose AES.
Why? Publicity. AES is THE standard for government encryption and thus millions of other entities also use it. A talented cryptanalyst simply gets more "bang for the buck" finding a flaw in AES then it does for the much less know and used twofish.
Obscurity provides no protection in encryption. More bodies looking, studying, probing, attacking an algorithm is always better. You want the most "vetted" algorithm possible and right now that is AES. If an algorithm isn't subject to intense and continual scrutiny you should place a lower confidence of it's strength. Sure twofish hasn't been compromised. Is that because of the strength of the cipher or simply because not enough people have taken a close look ..... YET
The algorithm choice probably doesn't matter that much. I'd use AES since it's been better researched. What's much more important is choosing the right operation mode and key derivation function.
You might want to take a look at the TrueCrypt format specification for inspiration if you want fast random access. If you don't need random access than XTS isn't the optimal mode, since it has weaknesses other modes don't. And you might want to add some kind of integrity check(or message authentication code) too.
I know this answer violates the terms of your question, but I think the correct answer to your intent is simply this: use whichever algorithm allows you the longest key length, then make sure you choose a really good key. Minor differences in the performance of most well regarded algorithms (cryptographically and chronologically) are overwhelmed by a few extra bits of a key.
Both algorithms (AES and twofish) are considered very secure. This has been widely covered in other answers.
However, since AES is much widely used now in 2016, it has been specifically hardware-accelerated in several platforms such as ARM and x86. While not significantly faster than twofish before hardware acceleration, AES is now much faster thanks to the dedicated CPU instructions.

new encryption algorithm for ssh

I am asked to add a new algorithm to ssh so data is ciphered in new algorithm, any idea how to add new algorithm to ssh ?
thanks
It is possible to add some new algorithm to SSH communication, and this is done from time to time (eg. AES was added later). But the question is that you need to modify both client and server so that they both support this algorithm, otherwise it makes no sense.
I assume that you were asked to add some custom, either home-made or non-standard algorithm. So first thing I'd like to do is to warn you that the added algorithm can be weak. You need to perform at least basic search for information about this algorithm, as if it's broken, you will do completely useless and even dangerous work.
As for software modification themselves - it's a rare job to do so most likely you won't find anybody with this experience there. However the code that handles various algorithms is typical and adding new algorithm is trivial - you add one source file with algorithm implementation and then modify a bunch of places by adding one more case to switch statement.
In my career I've worked on a private fork of ssh that was sold as closed-source commercial software. Even they in all their crazy stupidity (private fork? who in their right mind uses non-Open Source encryption software? I thought our customers were completely off their rockers.) didn't add a new encryption algorithm.
It can be done though. Adding the hooks to the ssh protocol to support it isn't hard. The protocol is designed to be extensible in that way. At the beginning the client and server exchange lists of encryption algorithms they're willing to use.
This means, of course, that only a modified client and modified server will talk to eachother.
The real difficulty is OpenSSL. ssh does not use TLS/SSL, but it does use the OpenSSL encryption library. You would have to add the new algorithm to that library, and that library is a terrible beast.
Though, I suppose you could add the algorithm without adding it to OpenSSL. That might be tricky though since I think openssh may rely heavily on the way the OpenSSL APIs work. And part of how they work allows you to pass around a constant representing which algorithm you want to use and then a standard set of calls for encryption and decryption that use the constant to decide on the algorithm.
Again though, if I recall correctly, OpenSSL has an API specifically for adding new algorithms to its suite. So that may not be so hard. You will have to make sure this happens when the OpenSSL library is being initialized.
Anyway, this is a fairly vague answer, but maybe it will point you in the right direction. You should make whoever is doing this pay enormous sums of money. Stupidity that requires this level of knowledge to pull off should never come cheaply.

Pitfalls of cryptographic code

I'm modifying existing security code. The specifications are pretty clear, there is example code, but I'm no cryptographic expert. In fact, the example code has a disclaimer saying, in effect, "Don't use this code verbatim."
While auditing the code I'm to modify (which is supposedly feature complete) I ran across this little gem which is used in generating the challenge:
static uint16 randomSeed;
...
uint16 GetRandomValue(void)
{
return randomSeed++;/* This is not a good example of very random generation :o) */
}
Of course, the first thing I immediately did was pass it around the office so we could all get a laugh.
The programmer who produced this code knew it wasn't a good algorithm (as indicated by the comment), but I don't think they understood the security implications. They didn't even bother to call it in the main loop so it would at least turn into a free running counter - still not ideal, but worlds beyond this.
However, I know that the code I produce is going to similarly cause a real security guru to chuckle or quake.
What are the most common security problems, specific to cryptography, that I need to understand?
What are some good resources that will give me suitable knowledge about what I should know beyond common mistakes?
-Adam
Don't try to roll your own - use a standard library if at all possible. Subtle changes to security code can have a huge impact that aren't easy to spot, but can open security holes. For example, two modified lines to one library opened a hole that wasn't readily apparent for quite some time.
Applied Cryptography is an excellent book to help you understand crypto and code. It goes over a lot of fundamentals, like how block ciphers work, and why choosing a poor cipher mode will make your code useless even if you're using a perfectly implemented version of AES.
Some things to watch out for:
Poor Sources of Randomness
Trying to design your own algorithm or protocol - don't do it, ever.
Not getting it code reviewed. Preferably by publishing it online.
Not using a well established library and trying to write it yourself.
Crypto as a panacea - encrypting data does not magically make it safe
Key Management. These days it's often easier to steal the key with a side-channel attack than to attack the crypto.
Your question shows one of the more common ones: poor sources of randomness. It doesn't matter if you use a 256 bit key if they bits aren't random enough.
Number 2 is probably assuming that you can design a system better than the experts. This is an area where a quality implementation of a standard is almost certainly going to be better than innovation. Remember, it took 3 major versions before SSL was really secure. We think.
IMHO, there are four levels of attacks you should be aware of:
social engineering attacks. You should train your users not to do stupid things and write your software such that it is difficult for users to do stupid things. I don't know of any good reference about this stuff.
don't execute arbitrary code (buffer overflows, xss exploits, sql injection are all grouped here). The minimal thing to do in order to learn about this is to read Writing Secure Code from someone at MS and watching the How to Break Web Software google tech talk. This should also teach you a bit about defense in depth.
logical attacks. If your code is manipulating plain-text, certificates, signatures, cipher-texts, public keys or any other cryptographic objects, you should be aware that handling them in bad ways can lead to bad things. Minimal things you should be aware about include offline&online dictionary attacks, replay attacks, man-in-the-middle attacks. The starting point to learning about this and generally a very good reference for you is http://www.soe.ucsc.edu/~abadi/Papers/gep-ieee.ps
cryptographic attacks. Cryptographic vulnerabilities include:
stuff you can avoid: bad random number generation, usage of a broken hash function, broken implementation of security primitive (e.g. engineer forgets a -1 somewhere in the code, which renders the encryption function reversible)
stuff you cannot avoid except by being as up-to-date as possible: new attack against a hash function or an encryption function (see e.g. recent MD5 talk), new attack technique (see e.g. recent attacks against protocols that send encrypted voice over the network)
A good reference in general should be Applied Cryptography.
Also, it is very worrying to me that stuff that goes on a mobile device which is probably locked and difficult to update is written by someone who is asking about security on stackoverflow. I believe your case would one of the few cases where you need an external (good) consultant that helps you get the details right. Even if you hire a security consultant, which I recommend you to do, please also read the above (minimalistic) references.
What are the most common security problems, specific to cryptography, that I need to understand?
Easy - you(1) are not smart enough to come up with your own algorithm.
(1) And by you, I mean you, me and everyone else reading this site...except for possibly Alan Kay and Jon Skeet.
I'm not a crypto guy either, but S-boxes can be troublesome when messed with (and they do make a difference). You also need a real source of entropy, not just a PRNG (no matter how random it looks). PRNGs are useless. Next, you should ensure the entropy source isn't deterministic and that it can't be tampered with.
My humble advice is: stick with known crypto algorithms, unless you're an expert and understand the risks. You could be better off using some tested, publicly-available open source / public domain code.

Best general-purpose digest function?

When writing an average new app in 2009, what's the most reasonable digest function to use, in terms of security and performance? (And how can I determine this in the future, as conditions change?)
When similar questions were asked previously, answers have included SHA1, SHA2, SHA-256, SHA-512, MD5, bCrypt, and Blowfish.
I realize that to a great extent, any one of these could work, if used intelligently, but I'd rather not roll a dice and pick one randomly. Thanks.
I'd follow NIST/FIPS guidelines:
March 15, 2006: The SHA-2 family of
hash functions (i.e., SHA-224,
SHA-256, SHA-384 and SHA-512) may be
used by Federal agencies for all
applications using secure hash
algorithms. Federal agencies should
stop using SHA-1 for digital
signatures, digital time stamping and
other applications that require
collision resistance as soon as
practical, and must use the SHA-2
family of hash functions for these
applications after 2010. After 2010,
Federal agencies may use SHA-1 only
for the following applications:
hash-based message authentication
codes (HMACs); key derivation
functions (KDFs); and random number
generators (RNGs). Regardless of use,
NIST encourages application and
protocol designers to use the SHA-2
family of hash functions for all new
applications and protocols.
You say "digest function"; presumably that means you want to use it to compute digests of "long" messages (not just hashing "short" "messages" like passwords). That means bCrypt and similar choices are out; they're designed to be slow to inhibit brute-force attacks on password databases. MD5 is completely broken, and SHA-0 and SHA-1 are too weakened to be good choices. Blowfish is a stream cipher (though you can run it in a mode that produces digests), so it's not such a good choice either.
That leaves several families of hash functions, including SHA-2, HAVAL, RIPEMD, WHIRLPOOL, and others. Of these, the SHA-2 family is the most thoroughly cryptanalyzed, and so it would be my recommendation for general use. I would recommend either SHA2-256 or SHA2-512 for typical applications, since those two sizes are the most common and likely to be supported in the future by SHA-3.
It really depends on what you need it for.
If you are in need of actual security, where the ability to find a collision easily would compromise your system, I would use something like SHA-256 or SHA-512 as they come heavily recommended by various agencies.
If you are in need of something that is fast, and can be used to uniquely identify something, but there are no actual security requirements (ie, an attacker wouldn't be able to do anything nasty if they found a collision) then I would use something like MD5.
MD4, MD5, and SHA-1 have been shown to be more easily breakable, in the sense of finding a collision via a birthday attack method, than expected. RIPEMD-160 is well regarded, but at only 160 bits a birthday attack needs only 2^80 operations, so it won't last forever. Whirlpool has excellent characteristics and appears the strongest of the lot, though it doesn't have the same backing as SHA-256 or SHA-512 does - in the sense that if there was a problem with SHA-256 or SHA-512 you'd be more likely to find out about it via proper channels.

Resources