What does secrets module do to make perfect random sequences in Python - python-3.x

I have a decent knowledge of math, and I know it's possible to create pseudo-random sequences using a specific mathematical algorithm. I also know that in Python there is a secrets module that apparently can produce random numbers. I tried playing around with it a little, but I still don't understand how it's supposed to work. Let's say I have this piece of code:
import secrets
secret_num = secrets.choice([1, 2, 3])
print(secret_num)
It's supposed to output a perfectly random number. But how is that possible using computers?

The documentation for the secrets module says it produces "cryptographically strong random numbers suitable for managing data such as passwords, account authentication, security tokens, and related secrets". The documentation doesn't specify how it does so exactly.
However, a usual requirement for "cryptographically strong random numbers" is that they should be hard to guess by outside attackers. To this end, the secrets module may rely on the random number generator provided by the operating system (as secrets.SystemRandom does, for example), and how that generator works depends in turn on the operating system. But in general, a random number generator designed for information security ultimately relies on gathering hard-to-guess bits from nondeterministic sources, as further explained in the following question:
How to get truly random data, not random data fed into a PRNG seed like CSRNG's do?
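As a small illustration of the answer above, the sketch below draws values through the public secrets API and directly from os.urandom, the standard-library call that asks the operating system's generator for bytes. It only shows where the randomness is requested from; how the operating system produces it is outside Python's control and varies by platform.

```python
import os
import secrets

# secrets.SystemRandom is the OS-backed generator class mentioned in the answer.
rng = secrets.SystemRandom()

# Pick one element using OS-provided randomness (the same idea as secrets.choice).
pick = rng.choice([1, 2, 3])

# Raw bytes straight from the operating system's generator.
raw = os.urandom(16)

# Convenience helper: 16 random bytes rendered as 32 hex characters.
token = secrets.token_hex(16)

print(pick, raw.hex(), token)
```

None of these calls take a seed, which is the practical difference from the random module's default generator: there is no state in your program that an attacker could learn and replay.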

Related

How to produce a "securely random" string token in Haskell?

I want to produce string tokens to implement the password reset functionality of a web app. They can be v4 UUIDs but it is not required.
I want each token to be "securely random" in the sense of this SO question. The system's entropy pool should be sampled for each generated token.
I found the uuid package, that generates v4 UUIDs. The docs mention that
we use the System.Random StdGen as our random source.
but it is unclear to me if this is enough.
Is there another library that I should use instead?
The requirement that the system entropy pool should be sampled on each generated token sounds suspicious. Entropy is usually a scarce resource.
Let's agree that /dev/random is definitely out. Other processes may need it, it could lead to potential denial of service, etc. First and foremost, though, /dev/random blocks on read when it deems the entropy pool insufficient, so it's a definite no. /dev/urandom, on the other hand, should be perfectly fine. Some say that it has issues in some odd cases (for example, right after boot on diskless machines), but let's not go there.
Note that on most systems /dev/random and /dev/urandom use the same algorithm. The difference is that /dev/urandom never blocks, but it also does not necessarily use any new entropy on each read. So if using /dev/urandom counts as "sampling the system entropy pool", then whatever solution you have that properly uses it is probably fine.
However, it comes at the cost of impurity. Reading from /dev/urandom forces us into IO. Being Haskellers, we have to at least consider alternatives, and we are in luck if we drop the entropy-sample requirement, which seems weird to begin with.
Instead we can use a crypto-secure deterministic RNG and only seed it from /dev/random. ChaChaDRG from cryptonite which is an implementation of DJB's ChaCha would be a good choice. Since it is deterministic we only need IO to get the initial seed. Everything after that is pure.
Cryptonite offers getRandomBytes so you can tweak the length of the token to suit your needs.
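For comparison with the Python thread above, here is a minimal sketch of a reset token with an adjustable length, written in Python rather than Haskell. It is not the cryptonite API; make_reset_token is a hypothetical helper name, and the 32-byte default is an assumption chosen for illustration.

```python
import secrets


def make_reset_token(nbytes: int = 32) -> str:
    """Return a URL-safe text token built from nbytes of OS-provided randomness.

    Hypothetical helper; the default length is an illustrative assumption.
    """
    return secrets.token_urlsafe(nbytes)


token = make_reset_token()
print(token)
```

The nbytes parameter plays the same role as the byte count passed to getRandomBytes: it sets how much randomness goes into each token, and the text encoding makes the resulting string somewhat longer than that.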

Haskell: Where is (System.)Crypto?

I have written a one-time pad encryption module that can also generate pads. I have read that Haskell comes with some kind of cryptographically secure random number generator, whose module name contains "Crypto". So I open GHCi and type "import " and tab to bring up all of the possible imports. There is no Crypto module, only the plain old Random. I explicitly try to import "System.Crypto" and then just "Crypto"; no luck. I perform a text search with PowerShell on the results of the autocomplete, but it finds no applicable entry. Where is the cryptographic random number generator? If I don't have it, where can I get it? Am I imagining things?
Details:
GHCi
version 7.6.3
Windows 7
Haskell does not "come with" a cryptographically secure random number generator, if by that you mean it should be included in the Haskell Platform.
Searching for "crypto" on Hackage throws up a number of packages you can install with Cabal, though. I'm not well versed in those but the top crypto-random one looks promising. This doesn't necessarily mean much, though. Although Hackage is the place to find most things Haskell, unfortunately it doesn't yet have good features to find out which of its packages are actually high quality. (This is expected to improve as the new Hackage 2 implementation is much more flexible than the old one.)
If you installed the Haskell Platform, you can find all the libraries that come out of the box here.
I find there is a System.Random module, which has a global random number generator. The global random number generator is initialized automatically in some system-dependent fashion, for example by using the time of day or Linux's kernel random number generator.
Or you can install other packages, such as crypto-api, with cabal install.

True random number generator (TRNG), Haskell and an empirical / formal method

I want to verify the numbers produced by a true random number generator (TRNG) implemented in specific hardware, but I have no experience with this.
First, I want to test the consistency of the TRNG with empirical methods (that is, I want to check whether its output really consists of true random numbers); I don't know if I can check this with formal methods.
Are there specific readings on this topic? Any tips? Are there tools for this kind of empirical testing?
I'd suggest that you not try to duplicate existing tools, since it would be a lot of work. Marsaglia's Diehard tests should work, or you can use dieharder, which is a GPL reimplementation. From the webpage:
The primary point of dieharder (like diehard before it) is to make it easy to time and test (pseudo)random number generators, both software and hardware, for a variety of purposes in research and cryptography. The tool is built entirely on top of the GSL's random number generator interface and uses a variety of other GSL tools (e.g. sort, erfc, incomplete gamma, distribution generators) in its operation.
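To give a feel for what one such empirical test computes, here is a minimal sketch of the frequency (monobit) test, one of the simplest checks in batteries of this kind. It is not dieharder's code; monobit_p_value is a hypothetical name, and treating a p-value below 0.01 as a failure is a common convention assumed here.

```python
import math


def monobit_p_value(data: bytes) -> float:
    """Frequency (monobit) test: p-value for 'ones and zeros are equally likely'."""
    n = len(data) * 8
    if n == 0:
        raise ValueError("need at least one byte of data")
    ones = sum(bin(byte).count("1") for byte in data)
    s_n = 2 * ones - n  # +1 for every 1 bit, -1 for every 0 bit
    s_obs = abs(s_n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


# An all-zero stream is maximally biased and gets a p-value of essentially 0.
print(monobit_p_value(b"\x00" * 1000))
# A strictly alternating stream (0xAA = 10101010) is perfectly balanced, so it
# passes this particular test even though it is obviously not random.
print(monobit_p_value(b"\xaa" * 1000))
```

The second case shows why a single test proves little and why suites such as Diehard and dieharder run many different tests over large samples. Passing them is evidence of good statistical behaviour, not a proof of true randomness.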

Why do you use a random number generator/extractor?

I am dealing with some computer security issues at school at the moment, and I am interested in the programming community's general preferences, customs, ideas, etc. If you have to use a random number generator or extractor, which one do you choose? Why do you choose it: for its mathematical properties, because it is already implemented as a package, or for some other reason? Do you write your own or use a package?
If computational time is no object, then you can't go wrong with Blum Blum Shub (http://en.wikipedia.org/wiki/Blum_blum_shub). Informally speaking, it's at least as secure (hard to predict) as integer factorization.
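A toy sketch of Blum Blum Shub, to show how little machinery the generator itself needs. The parameters p = 11, q = 23 and seed 3 are tiny illustrative values chosen here and offer no security at all; real use requires large secret primes. The function name is hypothetical.

```python
import math


def blum_blum_shub_bits(p: int, q: int, seed: int, count: int) -> list:
    """Return `count` bits from x -> x^2 mod (p*q), taking the low bit of each state."""
    if p % 4 != 3 or q % 4 != 3:
        raise ValueError("p and q must both be congruent to 3 mod 4")
    m = p * q
    if math.gcd(seed, m) != 1:
        raise ValueError("seed must be coprime to p*q")
    x = seed
    bits = []
    for _ in range(count):
        x = (x * x) % m      # one modular squaring per output bit
        bits.append(x & 1)   # least significant bit of the new state
    return bits


# States visited: 9, 81, 236, 36, 31, 202
print(blum_blum_shub_bits(11, 23, 3, 6))  # [1, 1, 0, 0, 1, 0]
```

One modular squaring of a large number per output bit is the reason for the "if computational time is no object" caveat.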
/dev/random, or the equivalent on your platform.
It returns bits from an entropy pool fed by device drivers. No need to worry about mathematical properties.
If you're after a cryptographically secure PRNG, then repeated application of a secure hash to a large seed array is generally the way to go. Don't invent your own algorithm, though, go for a version of Fortuna or something else reasonably well reviewed.
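To make the hash-based idea concrete, here is a toy sketch that hashes a secret key together with a running counter. It is a simplification for illustration only and is not Fortuna, which adds reseeding and entropy pooling on top of its generator. The class name HashCounterGenerator and the block layout are assumptions made for this example.

```python
import hashlib
import os


class HashCounterGenerator:
    """Toy deterministic generator: output blocks are SHA-256(key || counter).

    Illustration only; use a reviewed design for anything real.
    """

    def __init__(self, seed: bytes):
        # Compress the seed into a fixed-size secret key.
        self._key = hashlib.sha256(seed).digest()
        self._counter = 0

    def read(self, n: int) -> bytes:
        """Return n bytes; unused bytes of the last hash block are discarded."""
        out = bytearray()
        while len(out) < n:
            counter_bytes = self._counter.to_bytes(16, "big")
            out.extend(hashlib.sha256(self._key + counter_bytes).digest())
            self._counter += 1
        return bytes(out[:n])


# Seed once from the operating system; everything after that is deterministic.
gen = HashCounterGenerator(os.urandom(32))
print(gen.read(24).hex())
```

Because the output depends only on the seed and the counter, the same seed always reproduces the same stream. All of the unpredictability comes from the seed, so it must come from a good source and stay secret.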
The keys for encryption of phone calls between presidents of the USA and USSR were said to be generated from cosmic rays. We checked it in the physics lab at our university: their energies yield a true Gaussian distribution. ;-) So for the best encryption you should use these, because such a random sequence cannot be replayed. Unless, of course, your adversary covertly builds a particle accelerator near your random number generator.
Ah... about computers... Well, acquire a stream that comes from something physical, not computed. /dev/random is the easiest solution, but your hand-made Geiger counter attached to USB would give the best randomness ever.
For a little school project, I'd use whatever the OS provides for random number generation.
For a serious security application (e.g. COMSEC-level encryption), I use a hardware random number generator. Pure algorithms with no hardware access by definition don't produce random numbers.
HotBits.

Self validating binaries?

My question is pretty straightforward: You are an executable file that outputs "Access granted" or "Access denied" and evil persons try to understand your algorithm or patch your innards in order to make you say "Access granted" all the time.
After this introduction, you might be wondering what I am doing. Is he going to crack Diablo3 once it is out? I can pacify your worries: I am not one of those crackers. My goal is crackmes.
Crackmes can be found on, for example, www.crackmes.de. A crackme is a little executable that (most of the time) contains a little algorithm to verify a serial and output "Access granted" or "Access denied" depending on the serial. The goal is to make this executable output "Access granted" all the time. The methods you are allowed to use might be restricted by the author (no patching, no disassembling) or involve anything you can do with a binary, objdump and a hex editor. Cracking crackmes is definitely one part of the fun; however, as a programmer, I am wondering how you can create crackmes that are difficult.
Basically, I think the crackme consists of two major parts: a certain serial verification and the surrounding code.
Making the serial verification hard to track using just assembly is quite possible. For example, I have the idea of taking the serial as input for a simulated microprocessor that must end up in a certain state for the serial to be accepted. On the other hand, one might grow cheap and learn more about cryptographically strong ways to secure this part. Thus, making this hard enough to make the attacker try to patch the executable should not be that hard.
However, the more difficult part is securing the binary. Let us assume a perfectly secure serial verification that cannot be reversed somehow (of course I know it can be reversed, in doubt, you rip parts out of the binary you try to crack and throw random serials at it until it accepts). How can we prevent an attacker from just overriding jumps in the binary in order to make our binary accept anything?
I have been searching on this topic a bit, but most results on binary security, self-verifying binaries and such things end up in articles that try to prevent attacks on an operating system using compromised binaries, by signing certain binaries and validating those signatures with the kernel.
My thoughts currently consist of:
checking explicit locations in the binary to be jumps.
checksumming parts of the binary and comparing checksums computed at runtime with those.
have positive and negative runtime-checks for your functions in the code. With side-effects on the serial verification. :)
Are you able to think of more ways to annoy a possible attacker for longer? (Of course, you cannot keep him away forever; at some point all checks will be broken, unless you managed to break a checksum generator by being able to embed the correct checksum for a program in the program itself, hehe.)
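The checksumming idea from the list above could be sketched as follows. This is a Python stand-in for what would be native code in a real crackme; the names region_digest and is_untampered, and the fake image bytes, are all hypothetical.

```python
import hashlib
import hmac


def region_digest(image: bytes, offset: int, length: int) -> str:
    """SHA-256 hex digest of image[offset:offset + length]."""
    return hashlib.sha256(image[offset:offset + length]).hexdigest()


def is_untampered(image: bytes, offset: int, length: int, expected_hex: str) -> bool:
    """Compare the digest computed at runtime with the one recorded at build time."""
    return hmac.compare_digest(region_digest(image, offset, length), expected_hex)


# Pretend this is the program image; bytes 4..7 stand in for a protected jump.
original = bytes(range(16))
expected = region_digest(original, 4, 4)      # recorded at build time

patched = bytearray(original)
patched[5] ^= 0xFF                            # the attacker flips a byte

print(is_untampered(original, 4, 4, expected))        # True
print(is_untampered(bytes(patched), 4, 4, expected))  # False
```

The weakness is the one the question already hints at: the comparison is itself code in the binary, so an attacker can patch out the check as easily as the jump it protects. It raises the effort required rather than preventing patching.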
You're getting into "anti-reversing techniques", and it's basically an art. Worse, even if you stump newbies, there are "anti-anti-reversing plugins" for Olly and IDA Pro that they can download to bypass many of your countermeasures.
Countermeasures include debugger detection by trapping debugger APIs, or detecting single-stepping. You can insert code that, after detecting a debugger break-in, continues to function but starts acting up at random times much later in the program. It's really a cat-and-mouse game, and the crackers have a significant upper hand.
Check out...
http://www.openrce.org/reference_library/anti_reversing - Some of what's out there.
http://www.amazon.com/Reversing-Secrets-Engineering-Eldad-Eilam/dp/0764574817/ - This book has really good anti-reversing info and steps through the techniques. Great place to start if you're getting into reversing in general.
I believe these things are generally more trouble than they're worth.
You spend a lot of effort writing code to protect your binary. The bad guys spend less effort cracking it (they're generally more experienced than you) and then release the crack so everyone can bypass your protection. The only people you'll annoy are those honest ones who are inconvenienced by your protection.
Just view piracy as a cost of business - the incremental cost of pirated software is zero if you ensure all support is done only for paying customers.
There's TPM technology: TPM on Wikipedia
It allows you to store the cryptographic checksums of a binary on a special chip, which could act as one-way verification.
Note: TPM has sort of a bad rap because it could be used for DRM. But to experts in the field, that's sort of unfair, and there's even an open-TPM group allowing Linux users to control exactly how their TPM chip is used.
One of the strongest solutions to this problem is Trusted Computing. Basically, you would encrypt the application and transmit the decryption key to a special chip (the Trusted Platform Module). The chip would only decrypt the application once it has verified that the computer is in a "trusted" state: no memory viewers/editors, no debuggers, etc. Basically, you would need special hardware just to be able to view the decrypted program code.
So, you want to write a program that accepts a key at the beginning and stores it in memory, subsequently retrieving it from disc. If it's the correct key, the software works. If it's the wrong key, the software crashes. The goal is that it's hard for pirates to generate a working key, and it's hard to patch the program to work with an unlicensed key.
This can actually be achieved without special hardware. Consider our genetic code. It works based on the physics of this universe. We try to hack it, create drugs, etc., and we fail miserably, usually creating tons of undesirable side effects, because we haven't yet fully reverse engineered the complex "world" in which the genetic "code" evolved to operate. Basically, if you're running everything on a common processor (a common "world"), which everyone has access to, then it's virtually impossible to write such secure code, as demonstrated by current software being so easily cracked.
To achieve security in software, you essentially would have to write your own sufficiently complex platform, which others would have to completely and thoroughly reverse engineer in order to modify the behavior of your code without unpredictable side effects. Once your platform is reverse engineered, however, you'd be back to square one.
The catch is, your platform is probably going to run on common hardware, which makes your platform easier to reverse engineer, which in turn makes your code a bit easier to reverse engineer. Of course, that may just mean the bar is raised a bit for the level of complexity required of your platform to be sufficiently difficult to reverse engineer.
What would a sufficiently complex software platform look like? For example, perhaps after every 6 addition operations, the 7th addition returns the result multiplied by PI divided by the square root of the log of the modulus 5 of the difference of the total number of subtract and multiply operations performed since system initialization. The platform would have to keep track of those numbers independently, as would the code itself, in order to decode correct results. So, your code would be written based on knowledge of the complex underlying behavior of a platform you engineered. Yes, it would eat processor cycles, but someone would have to reverse engineer that little surprise behavior and re-engineer it into any new code to have it behave properly. Furthermore, your own code would be difficult to change once written, because it would collapse into irreducible complexity, with each line depending on everything that happened prior. Of course, there would be much more complexity in a sufficiently secure platform, but the point is that someone would have to reverse engineer your platform before they could reverse engineer and modify your code without debilitating side effects.
A great article on copy protection and protecting the protection: "Keeping the Pirates at Bay: Implementing Crack Protection for Spyro: Year of the Dragon".
The most interesting idea mentioned in there that hasn't yet been mentioned is cascading failures - you have checksums that modify a single byte that causes another checksum to fail. Eventually one of the checksums causes the system to crash or do something strange. This makes attempts to pirate your program seem unstable and makes the cause occur a long way from the crash.

Resources