Is /dev/random considered truly random?

For instance, could it be used to generate a one-time pad key?
Also, what are its sources and how could it be used to generate a random number between x and y?

Strictly speaking, /dev/random is not completely random. /dev/random feeds on hardware sources which are assumed to be unpredictable in some way; it then mixes that data using functions (hash functions, mostly) which are also assumed to be one-way. The "true randomness" of /dev/random is thus relative to the inherent security of the mixing functions, a security which is no more guaranteed than that of any other cryptographic primitive, in particular the PRNG hidden in /dev/urandom.
The difference between /dev/random and /dev/urandom is that the former will try to maintain an estimate (which means "a wild guess") of how much entropy it has gathered, and will refuse to output more bits than that. On the other hand, /dev/urandom will happily produce megabytes of data from the entropy it has.
The security difference between the two approaches is meaningless unless you assume that "classical" cryptographic algorithms can be broken and you use one of the very few information-theoretic algorithms (e.g. OTP or Shamir's secret sharing); and, even then, /dev/random may be considered more secure than /dev/urandom only if the mixing functions are still considered to be one-way, which is not compatible with the idea that a classical cryptographic algorithm can be broken. So, in practice and even in theory, there is no difference whatsoever. You can use the output of /dev/urandom for an OTP and it will not be broken because of any structure internal to /dev/urandom; actual management of the obtained stream will be the weak point (especially long-term storage). On the other hand, /dev/random has very real practical issues, namely that it can block at untimely instants. It is really irksome when an automated OS install blocks (for hours!) because SSH server key generation insists on using /dev/random and needlessly stalls for entropy.
There are many applications which read /dev/random as a kind of ritual, as if it were "better" than /dev/urandom, probably on a karmic level. This is plain wrong, especially when the randomness is to be used with classical cryptographic algorithms (e.g. to generate an SSH server key). Do not do that. Instead, use /dev/urandom and you will live longer and happier. Even for one-time pads.
(Just for completeness, there is a quirk with /dev/urandom as implemented on Linux: it will never block, even if it has not gathered any entropy at all since the previous boot. Distributions avoid this problem by creating a "random seed" at installation time, with /dev/random, and using that seed at each boot to initialize the PRNG used by /dev/urandom; a new random seed is regenerated immediately, for the next boot. This ensures that /dev/urandom always operates from a sufficiently large internal seed. The FreeBSD implementation of /dev/urandom will block until a given entropy threshold is reached, which is safer.)

The only thing in this universe that can be considered truly random is something based on quantum effects. A common example is radioactive decay: for a given isotope you can be sure only about the half-life, but you cannot be sure which nucleus will decay next.
As for /dev/random, it depends on the implementation. On Linux it uses the following entropy sources:
"The Linux kernel generates entropy from keyboard timings, mouse movements, and IDE timings and makes the random character data available to other operating system processes through the special files /dev/random and /dev/urandom." (Wikipedia)
This means it is better than purely algorithmic random generators, but it is not perfect either: the gathered entropy may not be uniformly distributed and can be biased.
That was the philosophy; in practice, on Linux /dev/random is random enough for the vast majority of tasks.
There are implementations of random generators that draw on more entropy sources, including noise on audio inputs, CPU temperature sensors, etc. Even so, they are not truly random.
There is an interesting site where you can get genuine random numbers generated by radioactive decay.

/dev/random will block if there's not enough random data in the entropy pool whereas /dev/urandom will not. Instead, /dev/urandom will fall back to a PRNG (kernel docs). From the same docs:
The random number generator [entropy pool] gathers environmental noise from device drivers and other sources into an entropy pool.
So /dev/random is not algorithmic, like a PRNG, but it may not be "truly random" either. Mouse movements and keystroke timings tend to follow patterns and can be used for exploits but you'll have to weigh the risk against your use case.
To get a random number between x and y using /dev/random, assuming you're happy with a 32-bit integer, you could have a look at the way the Java java.util.Random class does it (nextInt()), substituting in appropriate code to read from /dev/random for the nextBytes() method.
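For example, here is a minimal sketch of that idea in Haskell, reading from /dev/urandom (swap in /dev/random if you insist) and using rejection sampling to avoid modulo bias; the helper names are made up purely for illustration:

    import qualified Data.ByteString as BS
    import           Data.Bits       (shiftL, (.|.))
    import           Data.Word       (Word32)
    import           System.IO       (IOMode (ReadMode), withFile)

    -- Read four bytes from /dev/urandom and pack them into a Word32.
    randomWord32 :: IO Word32
    randomWord32 = withFile "/dev/urandom" ReadMode $ \h -> do
      bytes <- BS.unpack <$> BS.hGet h 4
      return (foldl (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0 bytes)

    -- Uniform integer in [lo, hi] (assumes lo <= hi and hi - lo < 2^32),
    -- rejecting out-of-range draws the way java.util.Random.nextInt does.
    randomInRange :: Integer -> Integer -> IO Integer
    randomInRange lo hi = do
      w <- fromIntegral <$> randomWord32            -- 0 .. 2^32 - 1
      let range = hi - lo + 1
          limit = (2 ^ 32 `div` range) * range      -- largest multiple of range <= 2^32
      if w >= limit
        then randomInRange lo hi                    -- reject and retry to avoid bias
        else return (lo + w `mod` range)

For ranges wider than 32 bits you would read correspondingly more bytes before reducing into the range.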

Related

How to produce a "securely random" string token in Haskell?

I want to produce string tokens to implement the password reset functionality of a web app. They can be v4 UUIDs but it is not required.
I want each token to be "securely random" in the sense of this SO question. The system's entropy pool should be sampled for each generated token.
I found the uuid package, which generates v4 UUIDs. The docs mention that
we use the System.Random StdGen as our random source.
but it is unclear to me if this is enough.
Is there another library that I should use instead?
The requirement that the system entropy pool should be sampled on each generated token sounds suspicious. Entropy is usually a scarce resource.
Let's agree that /dev/random is definitely out. Other processes may need it, it could lead to a potential denial of service, etc. First and foremost, though, /dev/random blocks on read when it deems the entropy pool insufficient, so it's a definite no. /dev/urandom, on the other hand, should be perfectly fine. Some say that it has issues in some odd cases (for example right after boot on diskless machines), but let's not go there.
Note that on most systems /dev/random and /dev/urandom both use the same algorithm. The difference is that /dev/urandom never blocks, but it also does not necessarily use any new entropy on each read. So if using /dev/urandom counts as "sampling the system entropy pool", then whatever solution you have that properly uses it is probably fine.
However, it comes at the cost of impurity: reading from /dev/urandom forces us into IO. Being Haskellers, we have to at least consider alternatives, and we are in luck if we drop the entropy-sampling requirement, which seems weird to begin with.
Instead we can use a crypto-secure deterministic RNG and only seed it from /dev/random. ChaChaDRG from cryptonite, which is an implementation of DJB's ChaCha, would be a good choice. Since it is deterministic, we only need IO to get the initial seed. Everything after that is pure.
Cryptonite offers getRandomBytes so you can tweak the length of the token to suit your needs.
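A minimal sketch of that approach, assuming the cryptonite package (note that drgNew seeds from the system entropy source rather than /dev/random specifically, and the 32-byte token length is an arbitrary choice):

    import           Crypto.Random   (ChaChaDRG, drgNew, randomBytesGenerate)
    import qualified Data.ByteString as BS

    -- Produce one 32-byte token plus the advanced generator state.
    -- randomBytesGenerate is pure; the only IO is the initial seeding in main.
    makeToken :: ChaChaDRG -> (BS.ByteString, ChaChaDRG)
    makeToken = randomBytesGenerate 32

    main :: IO ()
    main = do
      drg <- drgNew                       -- seeded once from system entropy
      let (tok1, drg') = makeToken drg
          (tok2, _)    = makeToken drg'
      print tok1
      print tok2

Threading the generator state explicitly (or through a State monad) keeps everything after the initial seeding pure.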

Genuine Random Numbers with Multithreading

A program can only produce pseudo-random numbers, because it is always deterministic. But with multithreading you get non-determinism, because of all the effects of scheduling, caches, swapping, etc.
Could you use this effect to produce real random numbers, since it depends not only on deterministic code, but also on physical phenomena such as latency?
This is used all the time. If you're reading from /dev/random on many Unix-like systems, you're probably getting "environmental" effects mixed into the entropy. But don't over-read this as "real random numbers." The effects you're describing still only vary over a limited range, and in some cases may vary over a very limited range, such that they are very close to deterministic (and "close to deterministic" is often close enough to hurt you if you build your security around your RNG).
The classic version of this problem is a router booting up and seeding its RNG with latency information from the network. In a very noisy network, this may be pretty random. In a fairly quiet network, this may be very predictable. This is a very real-world problem and difficult to solve in embedded systems.
A lesson from this is to avoid inventing your own RNGs, particularly if you are building security systems on top of them. Research the RNGs your system provides, and use them (or research cryptographic random number generation before embarking on a new solution).
If this subject interests you, random.org has some good introductory materials, along with an implementation based on atmospheric data. (Whether even this is "true random" or just "deterministic based on a state we don't know" is an argument for physicists, but it's about as close as we've got.)

Is XXTEA a good encryption algorithm for a PIC microcontroller?

I need a good encryption algorithm for a PIC microcontroller. After some googling, it seems XXTEA is the only option; however, "XXTEA is vulnerable to a chosen-plaintext attack requiring 2^59 queries and negligible work".
I am not good at cryptography, so I would like to ask: how accurate is the above statement? Could I use XXTEA in a commercial security application? If not, is there any available algorithm I could use for my embedded system?
You cannot know what makes an encryption algorithm secure. Nobody knows what makes an encryption algorithm secure. The best we have are "algorithms which have sustained heavy scrutiny from hundreds of cryptographers during many years, and are still relatively unscathed". This is the case for AES, not for XXTEA. We may note that the attack on XXTEA is still very expensive, on the verge of the feasible and probably not applicable to most "commercial" situations; but still, this algorithm has been shown to be flaky. As such, if you value your security, don't get creative with your crypto; use well-vetted standards.
Why would you want to use XXTEA? What does it do for you that AES does not? You may want to have a look at this question for some pointers to implementations of AES for some PIC microcontrollers.
(The main design criterion of TEA and its derivatives like XXTEA was to have compact source code, so that it could be learned by heart and typed again on a computer. This does not immediately translate to compactness of compiled code. (X*)TEA algorithms tend to be slow and to rely on 32-bit operations which are a poor fit for small microcontrollers.)
One would look at other encryption methods like XXTEA when a block size smaller than 128 bits is needed, for example on a communication medium with very low bandwidth such as powerline communication in a noisy environment. There, a useful transmission may carry only a few bytes, say 4 or 5 bytes of payload. In that situation, if AES is used, the block size is 16 bytes and creates a lot of overhead on the available bandwidth.
With XXTEA the block size is only 64 bits (8 bytes), so it creates less overhead.

How secure is 64-bit RC2?

In encryption, would two symmetric algorithms be considered to be equal in terms of security if their key sizes are equivalent? (i.e. does a 64-bit RC2 algorithm provide the same exact security that a 64-bit AES algorithm would?)
How secure (or insecure) would it be to use a 64-bit RC2 algorithm?
How long could I expect it to take for a brute force attack to crack this kind of encryption?
What kind of data would it be okay to secure with this algorithm? (e.g. I'm guessing that credit card info would not be okay to encrypt with this algorithm since the algorithm is not secure enough).
In general, equivalent key sizes do not imply equivalent security, for a variety of reasons:
First, it's simply the case that some algorithms have known attacks while others do not. The size of the key is just an upper bound on the effort it would take to break the cipher; in the worst case, you can always try every possible key and succeed (on average) after checking half the key space. That doesn't mean this is the best possible attack. Here's an example: AES with 128-bit keys uses 10 rounds. If you used AES with a 128-bit key but only one round, it would be trivially breakable even though the key is the same size. For many algorithms, there are known attacks which can break the algorithm much faster than searching the entire key space.
In the case of block ciphers, there are other considerations as well. That is because block ciphers process data in fixed-size blocks of bits. There are various combinatorial properties which come into play after you've started encrypting large amounts of data. For instance, using the common CBC mode, you start running into problems after encrypting about 2^(n/2) blocks (this problem is intrinsic to CBC). For a 64-bit cipher like RC2, that means 2^32 64-bit blocks, or 32 GiB, which while large is quite easy to imagine (e.g. you encrypt a disk image with it). Whereas for a 128-bit cipher like AES, the problem only starts to crop up after about 2^64 128-bit blocks, or roughly 295 exabytes. In a scenario like this, AES with a 64-bit key would in fact be much more secure than RC2 with a 64-bit key.
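As a quick sanity check of those figures, here is the arithmetic spelled out (nothing cipher-specific, just the 2^(n/2)-blocks rule of thumb applied to 8-byte and 16-byte blocks):

    -- Rough CBC "birthday bound" arithmetic: trouble after about 2^(n/2) blocks
    -- for an n-bit block cipher.
    main :: IO ()
    main = do
      let bytes64  = 2 ^ (64 `div` 2) * 8   :: Integer  -- 2^32 blocks of 8 bytes
          bytes128 = 2 ^ (128 `div` 2) * 16 :: Integer  -- 2^64 blocks of 16 bytes
      putStrLn ("64-bit blocks (RC2):  " ++ show (bytes64 `div` 2 ^ 30) ++ " GiB")
      putStrLn ("128-bit blocks (AES): about " ++ show (bytes128 `div` 10 ^ 18) ++ " exabytes")

This prints 32 GiB for the 64-bit case and about 295 exabytes for the 128-bit case, matching the numbers above.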
Here we get to the epistemology portion of the answer: even if there are no known attacks, it doesn't mean that there are no attacks possible. RC2 is quite old and is rarely used; even when it was a fairly current cipher there was rather less analysis of it than, say, DES. It's quite likely that nobody in the last 5 years has bothered to go back and look at how to break RC2 using the latest attack techniques, simply because in the relatively academic publish-or-perish model that modern public cryptography research operates under, there is less gain to be had; it's much much better if you're seeking tenure (or looking to beef up your reputation to get more consulting work) to publish even a very marginal improvement on attacking AES than it would be to utterly demolish RC2, because nobody uses it anymore.
And with a 64-bit key, you've immediately constrained yourself to that upper bound, and 2^64 effort is really quite low; possibly within reach not just of intelligence agencies but even of reasonably sized corporations (or botnet herders).
Finally, I'll point out that RC2 was designed to be fast on 286/386-era processors. On modern machines it is substantially (roughly 4-6x) slower than AES or similar ciphers designed in the last 10 years.
I really can't see any upside to using RC2 for anything; the only use I can imagine that would make sense would be compatibility with some ancient (in computer time) system. Use AES (or one of the 4 other AES finalists if you must).
Here is my personal explanation of the expression "attack on n out of p rounds" that you can find on the page http://en.wikipedia.org/wiki/Block_cipher_security_summary . But beware: I am actually posting this as an answer so that people can tell me if I'm wrong. No one ever explained this to me, and I am not a specialist; this is just the only explanation I could come up with that makes sense.
Cryptographers consider any method that requires fewer operations than brute force to be a successful attack. When a cipher is said to have an attack on "n out of p rounds", I take it to mean that if the cipher were defined as n rounds of the basic function it is actually defined as p rounds of, there would be an attack on it. Perhaps the attack actually keeps working for more than n rounds, but the cut-off point where it becomes more expensive than brute force is n. In other words, this is a very fine distinction for an algorithm that is not broken, and it tells us how close we are to understanding abstractly the mathematical function it implements. This explains the seemingly arbitrary numbers that occur as values of "n" when this expression is employed.
To reiterate, a cipher that has an attack on n out of p rounds is a cipher that is not broken.
Also, an algorithm that is "broken" because it has an attack in 2^100 operations for a 128-bit key can still be useful. The worry in this case is that further mathematical discoveries can continue to eat away at the number of operations it takes to crack it. But 2^100 is just as impractical as 2^128.

Why do you use a random number generator/extractor?

I am dealing with some computer security issues at school at the moment and I am interested in the general programming public's preferences, customs, ideas, etc. If you have to use a random number generator or extractor, which one do you choose, and why? Is it the mathematical properties, the fact that it is already implemented as a package, or some other reason? Do you write your own or use some package?
If computational time is no object, then you can't go wrong with Blum Blum Shub (http://en.wikipedia.org/wiki/Blum_blum_shub). Informally speaking, predicting its output is at least as hard as integer factorization.
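For illustration only, here is a toy sketch of the Blum Blum Shub recurrence x_{n+1} = x_n^2 mod M, using the tiny textbook parameters p = 11, q = 23 and seed 3; a real deployment would use large secret Blum primes:

    -- Toy Blum Blum Shub: square the state modulo m and emit its low bit.
    -- The modulus 11 * 23 is hopelessly small; it only shows the recurrence.
    bbsBits :: Integer -> Integer -> [Integer]
    bbsBits m seed = map (`mod` 2) (tail (iterate step seed))
      where
        step x = x * x `mod` m

    main :: IO ()
    main = do
      let p = 11      -- both primes are congruent to 3 mod 4 (Blum primes)
          q = 23
          seed = 3    -- must be co-prime to p * q
      print (take 16 (bbsBits (p * q) seed))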
/dev/random, or the equivalent on your platform.
It returns bits from an entropy pool fed by device drivers. No need to worry about mathematical properties.
If you're after a cryptographically secure PRNG, then repeated application of a secure hash to a large seed array is generally the way to go. Don't invent your own algorithm, though, go for a version of Fortuna or something else reasonably well reviewed.
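Purely to illustrate the "repeated hashing of a seed" idea (and not as a substitute for a vetted design like Fortuna), a naive counter-based sketch using the cryptonite hash API might look like this:

    import           Crypto.Hash           (Digest, SHA256, hash)
    import qualified Data.ByteArray        as BA
    import qualified Data.ByteString       as BS
    import qualified Data.ByteString.Char8 as C8

    -- Stretch one seed into a stream of 32-byte outputs by hashing
    -- seed || counter with SHA-256. A reviewed design (Fortuna, a standard
    -- DRBG) additionally handles reseeding and backtracking resistance.
    hashStream :: BS.ByteString -> [BS.ByteString]
    hashStream seed =
      [ BA.convert (hash (seed <> C8.pack (show i)) :: Digest SHA256)
      | i <- [0 :: Integer ..] ]

    main :: IO ()
    main = mapM_ print (take 3 (hashStream (C8.pack "large random seed goes here")))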
The keys for encryption of phone calls between the presidents of the USA and the USSR were said to be generated from cosmic rays. We checked it in the physics lab at our university -- their energies yield a true Gaussian distribution. ;-) So for the best encryption you should use these, because such a random sequence cannot be replayed. Unless, of course, your adversary covertly builds a particle accelerator near your random number generator.
Ah... about computers... Well, acquire a stream that comes from something physical, not computed. /dev/random is the easiest solution, but your hand-made Geiger counter attached to USB would give the best randomness ever.
For a little school project, I'd use whatever the OS provides for random number generation.
For a serious security application (e.g. COMSEC-level encryption), I use a hardware random number generator. Pure algorithms with no hardware access by definition don't produce random numbers.
HotBits.
