finding a generator for elgamal - security

how are generators found for the elgamal signature scheme? are there values that are used by most programs that are good generators? or is there a method to find a generator for a prime value? if so, how? Would it be true to say that a prime number has at least 1 generator?

Use DSA instead of the ElGamal signature scheme.
There are just too many mistakes that can be made implementing ElGamal. One of those mistakes is what GregS proposed: to use the IKE parameters. These parameters were generated for the ElGamal encryption and not for the signature scheme. The two schemes have distinct requirements. In particular using g=2 as a generator is a good choice for the encryption, but a very bad choice for the signature scheme. (See e.g. the "Handbook of Applied Cryptography" http://www.cacr.math.uwaterloo.ca/hac/ note 11.67 in chapter 11 for some details). Correct would be to select the generator randomly. But once again, if you just use DSA then you can simply avoid these pitfalls by following the standard.
Just to add a little more: OpenPGP https://www.rfc-editor.org/rfc/rfc4880 used to allow ElGamal signatures, but has deprecated them some time ago. This deprecation was quite reasonable, since DSA has only advantages: it is more efficient, more secure and standardized. Of course, you could look at old PGP implementations, but it wouldn't tell you if these implementations give you reasonable choices without reading the literature first.

how are generators found for the elgamal signature scheme?
are there values that are used by most programs that are good generators?
or is there a method to find a generator for a prime value? if so, how?
You can use the generic, probabilistic algorithm 4.86 in Handbook of Applied Cryptography. You still need to weed out from the output of such algorithm the values known to be insecure for Elgamal signature though. At the very least any value that divides p-1 (for instance 2) and any value whose inverse divides p-1. Note that those are the conditions I am aware of today. Some in-depth research over papers published on the topic may be needed.
Personally, I would not trust domain parameters already used in existing programs. The authors may not have considered all the conditions above, plus research may have highlighted new conditions since they were chosen.
Would it be true to say that a prime number has at least 1 generator?
Absolutely true: there always is at least one generator for the multiplicative group over the integers modulo p (with p being a prime number). It has actually many more: phi(phi(p)), with phi being the totient function. Not all of them will be safe for the Elgamal signature scheme though.

El Gamal can be seen as a variant of the Diffie Hellman algorithm, and parameters for the latter can be used for the former. So for example you can use IKE groups 1 and 2 from RFC 2409, and larger IKE groups sprinkled in other RFCs. You can also follow the discussion in FIPS 186 for generating DSA parameters. Also, see this discussion of primitive roots.
EDIT:
As noted by #abc, this is wrong for el gamal signatures. Follow the DSA link (FIPS 186).

Related

CLPFD for real numbers

CLP(FD) allows user to set the domain for every wannabe-integer variable, so it's able to solve equations.
So far so good.
However you can't do the same in CLP(R) or similar languages (where you can do only simple inferences). And it's not hard to understand why: the fractional part of a number may have an almost infinite region, putted down by an implementation limit. This mean the search space will be too large to make any practical use for a solver which deals with floats like with integers. So it's the user task to write generator in CLP(R) and set constraint guards where needed to get variables instantiated with numbers (if simple inference is not possible).
So my question here: is there any CLP(FD)-like language over reals? I think it could be implemented by use of number rounding, searching and following incremental approximation.
There are at least some major CLP(FD) solvers that support real (decision) variables:
Gecode
JaCoP
ECLiPSe CLP (ic library)
Choco (using Ibex)
(The first three also support var float in MiniZinc.)
The answser to your question is yes. There is Constraint-based Solvers dedicated for floating numbers. I do not have a list of solvers but I know that that ibex http://www.ibex-lib.org is a library allowing the use of floats. You should also have a look at SMT-Solvers implementing the Real-Theory (http://smtlib.cs.uiowa.edu/solvers.shtml).

Combined digital signature for checking if majority of signed elements are not changed

I'm not familiar with the field, so please excuse me if the question seems trivial or dull.
I have N strings which I need to sign in such a way, that the signature remains valid if
up to M strings are changed, removed, or added. N > M, N may vary. The signature should not allow for deducing N from the signature itself.
All I can imagine so far, is a strightforward approach with building hashes for each string separately, and storing them all as signature, but it does not comply with the latest requirement.
If there exist some language specific examples, please mention them as well, - Java, C++, PHP, etc. are OK.
After some investigation of the matter I found the following information which might be helpful.
There is a so called "Rolling hashes". And another closely related technology has the name "Context Triggered Piecewise Hashes" (CTPH). There is a comprehensible article on CTPH:
Identifying almost identical files using context triggered piecewise hashing.
I assume it can be used for signing N elements, concatenated in a single input file.
The algorithm supposes that block size (used for periodical splicing of signatures built from traditional hashes for pieces) is publicly known as it's embedded into the final signature. This could possibly allow deducing approximate size of a signed content, but apparently keeps N in secret.
For a given unknown content the algoritm provides a measure of similarity or homology, in terms of CTPH, as a value between 0 (difference) and 100 (identity) between a known signed content and the unknown content using their signatures.

How (if at all) does a predictable random number generator get more secure after SHA-1ing its output?

This article states that
Despite the fact that the Mersenne Twister is an extremely good pseudo-random number generator, it is not cryptographically secure by itself for a very simple reason. It is possible to determine all future states of the generator from the state the generator has at any given time, and either 624 32-bit outputs, or 19,937 one-bit outputs are sufficient to provide that state. Using a cryptographically-secure hash function, such as SHA-1, on the output of the Mersenne Twister has been recommended as one way of obtaining a keystream useful in cryptography.
But there are no references on why digesting the output would make it any more secure. And honestly, I don't see why this should be the case. The Mersenne Twister has a period of 2^19937-1, but I think my reasoning would also apply to any periodic PRNG, e.g. Linear Congruential Generators as well. Due to the properties of a secure one-way function h, one could think of h as an injective function (otherwise we could produce collisions), thus simply mapping the values from its domain into its range in a one-to-one manner.
With this thought in mind I would argue that the hashed values will produce exactly the same periodical behaviour as the original Mersenne Twister did. This means if you observe all values of one period and the values start to recur, then you are perfectly able to predict all future values.
I assume this to be related to the same principle that is applied in password-based encryption (PKCS#5) - because the domain of passwords does not provide enough entropy, simply hashing passwords doesn't add any additional entropy - that's why you need to salt passwords before you hash them. I think that exactly the same principle applies here.
One simple example that finally convinced me: Suppose you have a very bad PRNG that will always produce a "random number" of 1. Then even if SHA-1 would be a perfect one-way function, applying SHA-1 to the output will always yield the same value, thus making the output no less predictable than previously.
Still, I'd like to believe there is some truth to that article, so surely I must have overlooked something. Can you help me out? To a large part, I have left out the seed value from my arguments - maybe this is where the magic happens?
The state of the mersenne twister is defined by the previous n outputs, where n is the degree of recurrence (a constant). As such, if you give the attacker n outputs straight from a mersenne twister, they will immediately be able to predict all future values.
Passing the values through SHA-1 makes it more difficult, as now the attacker must try to reverse the RNG. However, for a 32-bit word size, this is unlikely to be a severe impediment to a determined attacker; they can build a rainbow table or use some other standard approach for reversing SHA-1s, and in the event of collisions, filter candidates by whether they produce the RNG stream observed. As such, the mersenne twister should not be used for cryptographically sensitive applications, SHA-1 masking or no. There are a number of standard CSPRNGs that may be used instead.
An attacker is able to predict the output of MT based on relatively few outputs not because it repeats over such a short period (it doesn't), but because the output leaks information about the internal state of the PRNG. Hashing the output obscures that leaked information. As #bdonlan points out, though, if the output size is small (32 bits, for instance), this doesn't help, as the attacker can easily enumerate all valid plaintexts and precalculate their hashes.
Using more than 32 bits of PRNG output as an input to the hash would make this impractical, but a cryptographically secure PRNG is still a much better choice if you need this property.

Security of Exclusive-OR (XOR) encryption

XOR encryption is known to be quite weak. But how weak is it if I have a key that is made up of multiple keys of different (ideally prime) lengths which are combined to make a longer key. eg I have a text keys of length 5, 9 and 11. If I just apply the first key using XOR encryption then it should be easy to break as the encryption byte will repeat every 5 bytes. However if I 'overlay' the 3 of these keys I get an effective non-repeating length of 5*9*11 = 495. This sounds to me pretty strong. If I use a couple of verses of a poem using each line as a key then my non-repeating length will be way bigger than most files. How strong would this be (providing the key remains secret! :) )
XOR encryption is exactly as strong as the key stream. If you XOR with a "One time pad" - a sequence of physically generated random numbers that you only use once, then your encryption is theoretically unbreakable. You do have the problem however of hiding and distributing the key.
So your question comes down to - "how secure/random is a keystream made of three text strings?" The answer is "not very secure at all". Probably good enough to keep out your little sister, but not necessarily if you've got a smart little sister like I have.
What about the 'known plaintext' attack? If you know the encrypted and the cleartext versions of the same string, you can retrieve the key.
http://en.wikipedia.org/wiki/XOR_cipher
http://en.wikipedia.org/wiki/Known-plaintext_attack
http://en.wikipedia.org/wiki/Stream_cipher_attack
If P and Q are two independent cryptographic methods, the composite cryptographic function P(Q(x)) won't be any weaker than the stronger of P(x) or Q(x), but it won't necessarily be meaningfully stronger either. In order for a composite cryptographic function to gain any strength, the operations comprising it have to meet certain criteria. Combining weak ciphers arbitrarily, no matter how many one uses, is unlikely to yield a strong cipher.

Random access encryption with AES In Counter mode using Fortuna PRNG:

I'm building file-encryption based on AES that have to be able to work in random-access mode (accesing any part of the file). AES in Counter for example can be used, but it is well known that we need an unique sequence never used twice.
Is it ok to use a simplified Fortuna PRNG in this case (encrypting a counter with a randomly chosen unique key specific to the particular file)? Are there weak points in this approach?
So encryption/decryption can look like this
Encryption of a block at Offset:
rndsubseq = AESEnc(Offset, FileUniqueKey)
xoredplaintext = plaintext xor rndsubseq
ciphertext = AESEnc(xoredplaintext, PasswordBasedKey)
Decryption of a block at Offset:
rndsubseq = AESEnc(Offset, FileUniqueKey)
xoredplaintext = AESDec(ciphertext, PasswordBasedKey)
plaintext = xoredplaintext xor rndsubseq
One observation. I came to the idea used in Fortuna by myself and surely discovered later that it is already invented. But as I read everywhere the key point about it is security, but there's another good point: it is a great random-access pseudo random numbers generator so to speak (in simplified form). So the PRNG that not only produces very good sequence (I tested it with Ent and Die Hard) but also allow to access any sub-sequence if you know the step number. So is it generally ok to use Fortuna as a "Random-access" PRNG in security applications?
EDIT:
In other words, what I suggest is to use Fortuna PRNG as a tweak to form a tweakable AES Cipher with random-access ability. I read the work of Liskov, Rivest and Wagner, but could not understand what was the main difference between a cipher in a mode of operation and a tweakable cipher. They said they suggested to bring this approach from high level inside the cipher itself, but for example in my case xoring the plain text with the tweak, is this a tweak or not?
I think you may want to look up how "tweakable block ciphers" work and have a look at how the problem of disc encryption is solved: Disk encryption theory. Encrypting the whole disk is similar to your problem: encryption of each sector must be done independently (you want independent encryption of data at different offsets) and yet the whole thing must be secure. There is a lot of work done on that. Wikipedia seems to give a good overview.
EDITED to add:
Re your edit: Yes, you are trying to make a tweakable block cipher out of AES by XORing the tweak with the plaintext. More concretely, you have Enc(T,K,M) = AES (K, f(T) xor M) where AES(K,...) means AES encryption with the key K and f(T) is some function of the tweak (in your case I guess it's Fortuna). I had a brief look at the paper you mentioned and as far as I can see it's possible to show that this method does not produce a secure tweakable block cipher.
The idea (based on definitions from section 2 of the Liskov, Rivest, Wagner paper) is as follows. We have access to either the encryption oracle or a random permutation and we want to tell which one we are interacting with. We can set the tweak T and the plaintext M and we get back the corresponding ciphertext but we don't know the key which is used. Here is how to figure out if we use the construction AES(K, f(T) xor M).
Pick any two different values T, T', compute f(T), f(T'). Pick any message M and then compute the second message as M' = M xor f(T) xor f(T'). Now ask the encrypting oracle to encrypt M using tweak T and M' using tweak T'. If we deal with the considered construction, the outputs will be identical. If we deal with random permutations, the outputs will be almost certainly (with probability 1-2^-128) different. That is because both inputs to the AES encryptions will be the same, so the ciphertexts will be also identical. This would not be the case when we use random permutations, because the probability that the two outputs are identical is 2^-128. The bottom line is that xoring tweak to the input is probably not a secure method.
The paper gives some examples of what they can prove to be a secure construction. The simplest one seems to be Enc(T,K,M) = AES(K, T xor AES(K, M)). You need two encryptions per block, but they prove the security of this construction. They also mention faster variants, but they require additional primitive (almost-xor-universal function families).
Even though I think your approach is secure enough, I don't see any benefits over CTR. You have the exact same problem, which is you don't inject true randomness to the ciphertext. The offset is a known systematic input. Even though it's encrypted with a key, it's still not random.
Another issue is how do you keep the FileUniqueKey secure? Encrypted with password? A whole bunch issues are introduced when you use multiple keys.
Counter mode is accepted practice to encrypt random access files. Even though it has all kinds of vulnerabilities, it's all well studied so the risk is measurable.

Resources