Dealing with big numbers in Haskell

I'm trying to implement the Miller test in Haskell (not Miller-Rabin). I'm dealing with big numbers, and in particular I need to exponentiate big numbers and reduce one large number modulo another large number.
Are there any standard functions for doing this? The normal expt function ^ tells me I run out of memory before it computes a result. For example, I'd like to do:
(mod (8888^38071670985) 9746347772161)
I could implement my own algorithms, but it'd be nice if these already exist.

There is modular exponentiation (and much more) in the arithmoi package.
Since I wrote it, I'd be interested to hear if you find it useful and what could be improved.
If you try to compute
(mod (8888^38071670985) 9746347772161)
as it stands, the intermediate result 8888^38071670985 would occupy roughly 5*10^11 bits, about 60GB. Even if you have that much RAM, that is close to (maybe slightly above) the limits of GMP (the size field in the GMP integers is four bytes).
So you also have to reduce the intermediate results during the calculation. Not only does that let the computation fit into memory without problems, but it's also faster since the involved numbers remain fairly small.
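A minimal sketch of that idea, assuming plain square-and-multiply with a reduction after every multiplication. The name modPow is made up for this illustration; it is hand-rolled and is not the arithmoi API.

```haskell
-- Hypothetical helper (not from arithmoi): computes base^expo `mod` m
-- while keeping every intermediate value below m^2.
modPow :: Integer -> Integer -> Integer -> Integer
modPow _ _ 1 = 0                      -- everything is 0 modulo 1
modPow base expo m
  | expo < 0  = error "modPow: negative exponent"
  | otherwise = go (base `mod` m) expo 1
  where
    -- go b e acc: b is the current square, e the remaining exponent,
    -- acc the product of the squares selected by the bits of e so far.
    go _ 0 acc = acc
    go b e acc
      | odd e     = go b' e' (acc * b `mod` m)
      | otherwise = go b' e' acc
      where
        b' = b * b `mod` m            -- reduce right after squaring
        e' = e `div` 2
```

With this, modPow 8888 38071670985 9746347772161 needs only a few dozen multiplications of numbers smaller than the modulus squared, instead of materialising the full power.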

An approximation to your number before taking modulo is
10^log(8888^38071670985)
= 10^(38071670985 * log(8888))
= 10^(1.5 * 10^11)
In other words it has around 1.5 * 10^11 digits. It would need around
1.5 * 10^11 / log(2) / 8 / (2^30) = 58GB
of memory just to represent.
So starting with this may not be the best idea. Does the library have to support calculation with numbers this large?
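The estimate above (with log taken as base 10) can be reproduced with a few lines of floating-point arithmetic. This is only a sketch; the function names are invented for the illustration, and the result is an approximation, not an exact bit count.

```haskell
-- Approximate number of decimal digits of base^expo: expo * log10(base).
decimalDigits :: Integer -> Integer -> Double
decimalDigits base expo = fromIntegral expo * logBase 10 (fromIntegral base)

-- Approximate number of bits of base^expo: expo * log2(base).
bitsNeeded :: Integer -> Integer -> Double
bitsNeeded base expo = fromIntegral expo * logBase 2 (fromIntegral base)

-- Convert a bit count to units of 2^30 bytes.
bitsToGiB :: Double -> Double
bitsToGiB bits = bits / 8 / 2 ^ (30 :: Int)
```

Evaluating bitsToGiB (bitsNeeded 8888 38071670985) in GHCi gives a value of about 58, in line with the figure above.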

Related

What are the advantages of using smaller integer types in Rust?

I am learning Rust, and in the official tutorial the author assigned the value 5 to a variable like so:
let x: i32 = 5;
I thought this was weird as one could use u8 as the type and the program would run fine. This got me thinking, are there any advantages to using a lower bit number? Is it faster?
The main advantage is that they use less memory. A Vec<i32> with 1 billion elements will use 4GB, while a Vec<u8> will use 1GB. This can be a significant advantage regardless of speed.
Arithmetic on smaller integer types on modern CPUs is not faster in general. There are some issues with using only part of a register but optimizers will almost certainly resolve these performance problems for you.
When you have a lot of integers and the optimizer can make use of vectorization (for example adding your 1 billion integers in the vector) then smaller types will typically yield better performance, because more of them fit in a SIMD register.
If you use them just as one scalar stack variable like in your example, I highly doubt there will be a difference in 99% of cases. Here other considerations are more important:
A bigger type makes overflows less likely, in case you miscalculated your maximum possible value.
For public interfaces, bigger types are more future-proof.
It's better to cast from i8 to i32 than the other way round.

How long does it take to crack a hash?

I want to calculate the time it will take to break a SHA-256 hash. I did some research and found the following calculation. If I have a password in lowercase letters with a length of 6 chars, I would have 26^6 passwords, right?
To calculate the time I have to divide this number by a hashrate, I guess. So if I had one RTX 3090, the hashrate would be 120 MH/s (1.2*10^8 H/s), and then I need to calculate 26^6/(1.2*10^8) to get the time in seconds, right?
Is this idea right or wrong?
Yes, but a lowercase-latin 6 character string is also short enough that you would expect to compute this one time and put it into a database so that you could look it up in O(1). It's only a bit over 300M entries. That said, given you're 50% likely to find the answer in the first half of your search, it's so fast to crack that you might not even bother unless you were doing this often. You don't even need a particularly fancy GPU for something on this scale.
Note that in many cases a 6 character string can also be a 5 character string, so you need to add 26^6 + 26^5 + 26^4 + ..., but all of these together only raise this to around 320M hashes. It's a tiny space.
Adding uppercase, numbers and the easily typed symbols gets you up to 96^6 ~ 780B. On the other hand, adding just 3 more lowercase-letters (9 total) gets you to 26^9 ~ 5.4T. For brute force on random strings, longer is much more powerful than complicated.
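The numbers in this answer can be checked with a short sketch. The function names are invented for the illustration, and the 1.2*10^8 H/s rate is simply the figure assumed in the question, not a measured value.

```haskell
-- Number of strings of exactly the given length over the alphabet.
keyspace :: Integer -> Integer -> Integer
keyspace alphabet len = alphabet ^ len

-- Number of strings of length 1 up to maxLen over the alphabet.
keyspaceUpTo :: Integer -> Integer -> Integer
keyspaceUpTo alphabet maxLen = sum [alphabet ^ k | k <- [1 .. maxLen]]

-- Worst-case seconds to try every candidate at a given hash rate.
-- On average a match turns up after about half of this time.
secondsToExhaust :: Integer -> Double -> Double
secondsToExhaust candidates hashesPerSecond =
  fromIntegral candidates / hashesPerSecond
```

For example, keyspace 26 6 is 308915776, keyspaceUpTo 26 6 is 321272406, and secondsToExhaust (keyspace 26 6) 1.2e8 comes to roughly 2.6 seconds.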
To your specific question, note that it does matter how you implement this. You won't get these kinds of hash rates if you don't write your code in a way to maximize the GPU. For example, writing simple code that sends one value to the GPU to hash at a time, and then compares the result on the CPU could be incredibly slow (in some cases slower than just doing all the work on a CPU). Setting up your memory efficiently and maximizing things the GPU can do in parallel are very important. If you're not familiar with this kind of programming, I recommend using or studying a tool like John the Ripper.

Is the time complexity of `isInfixOf` in Data.ByteString.Char8 O(m * n)?

From reading the relevant source code,
https://hackage.haskell.org/package/bytestring-0.9.2.1/docs/src/Data-ByteString.html#isInfixOf ,
it seems that the isInfixOf algorithm is actually O(m * n). But in fact, it runs much faster than the KMP code I wrote myself?
So is this algorithm actually O(m * n), and how does Haskell make this function so efficient?
Asymptotic complexity does not consider any constants, which may be more important in real-life scenarios than the complexity itself. This may be the case here: they have very small constants, you have big ones.
By definition, one function has higher complexity than another if, from some point onward all the way to infinity, it takes larger values even when the other function is multiplied by a constant of any size.
However, that "one point" may be huge.
For example, if you have two algorithms with run times 1000000*n*sqrt(n) and n^2, the complexity is higher for n^2, but for the first one to be faster, n must be higher than 1 000 000 000 000. For smaller n, the n^2 algorithm is faster.
Therefore, judging complexity by "in fact, it runs much faster than KMP code by myself" is not a good approach.
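The crossover in that example can be checked numerically. This sketch just encodes the two made-up cost functions from the answer; the names costA and costB are illustrative only and say nothing about the real isInfixOf or KMP timings.

```haskell
-- Cost model with the better growth rate but a huge constant factor.
costA :: Double -> Double
costA n = 1000000 * n * sqrt n

-- Cost model with the worse growth rate but constant factor 1.
costB :: Double -> Double
costB n = n * n

-- True when the asymptotically worse n^2 model is still cheaper at size n.
quadraticWins :: Double -> Bool
quadraticWins n = costB n < costA n
```

quadraticWins holds for n = 1e6 and fails for n = 1e13; the two models meet where sqrt(n) = 1000000, that is at n = 10^12.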

How can natural numbers be represented to offer constant time addition?

Cirdec's answer to a largely unrelated question made me wonder how best to represent natural numbers with constant-time addition, subtraction by one, and testing for zero.
Why Peano arithmetic isn't good enough:
Suppose we use
data Nat = Z | S Nat
Then we can write
Z + n = n
S m + n = S(m+n)
We can calculate m+n in O(1) time by placing m-r debits (for some constant r), one on each S constructor added onto n. To get O(1) isZero, we need to be sure to have at most p debits per S constructor, for some constant p. This works great if we calculate a + (b + (c+...)), but it falls apart if we calculate ((...+b)+c)+d. The trouble is that the debits stack up on the front end.
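For reference, here is a self-contained sketch of the Peano representation with the three operations from the question. The names plus, isZero, predNat, fromInt and toInt are chosen for this illustration. Under lazy evaluation, plus returns its outermost S constructor after one step and suspends the rest of the work, which is the cost the debit argument above has to account for.

```haskell
-- Peano naturals, as in the definition above.
data Nat = Z | S Nat

-- Addition by recursion on the first argument.
plus :: Nat -> Nat -> Nat
plus Z     n = n
plus (S m) n = S (plus m n)

-- Test for zero: inspects only the outermost constructor.
isZero :: Nat -> Bool
isZero Z = True
isZero _ = False

-- Subtract one; Nothing for zero.
predNat :: Nat -> Maybe Nat
predNat Z     = Nothing
predNat (S n) = Just n

-- Conversion helpers for experimenting; non-positive Ints map to Z.
fromInt :: Int -> Nat
fromInt k
  | k <= 0    = Z
  | otherwise = S (fromInt (k - 1))

toInt :: Nat -> Int
toInt Z     = 0
toInt (S n) = 1 + toInt n
```

Nesting plus to the left, as in plus (plus (plus a b) c) d, is the case where isZero has to force a chain of suspended additions before it sees a constructor.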
One option
The easy way out is to just use catenable lists, such as the ones Okasaki describes, directly. There are two problems:
O(n) space is not really ideal.
It's not entirely clear (at least to me) that the complexity of bootstrapped queues is necessary when we don't care about order the way we would for lists.
As far as I know, Idris (a dependently-typed purely functional language which is very close to Haskell) deals with this in a quite straightforward way. The compiler is aware of Nats and Fins (upper-bounded Nats) and replaces them with machine integer types and operations whenever possible, so the resulting code is pretty efficient. However, that's not true for custom types (even isomorphic ones), nor for the compilation stage (there were some code samples using Nats for type checking which resulted in exponential growth in compile time; I can provide them if needed).
In the case of Haskell, I think a similar compiler extension could be implemented. Another possibility is to write TH macros which would transform the code. Of course, neither option is easy.
My understanding is that in basic computer programming terminology the underlying problem is you want to concatenate lists in constant time. The lists don't have cheats like forward references, so you can't jump to the end in O(1) time, for example.
You can use rings instead, which you can merge in O(1) time, regardless of whether a+(b+(c+...)) or ((...+c)+b)+a logic is used. The nodes in the rings don't need to be doubly linked; a link to the next node is enough.
Subtraction is the removal of any node, O(1), and testing for zero (or one) is trivial. Testing for n > 1 is O(n), however.
If you want to reduce space, then at each operation you can merge the nodes at the insertion or deletion points and weight the remaining ones higher. The more operations you do, the more compact the representation becomes! I think the worst case will still be O(n), however.
We know that there are two "extremal" solutions for efficient addition of natural numbers:
Memory efficient: the standard binary representation of natural numbers, which uses O(log n) memory and requires O(log n) time for addition. (See also the chapter "Binary Representations" in Okasaki's book.)
CPU efficient: a representation which uses just O(1) time. (See the chapter "Structural Abstraction" in the book.) However, this solution uses O(n) memory, as we'd represent the natural number n as a list of n copies of ().
I haven't done the actual calculations, but I believe for the O(1) numerical addition we won't need the full power of O(1) FIFO queues, it'd be enough to bootstrap standard list [] (LIFO) in the same way. If you're interested, I could try to elaborate on that.
The problem with the CPU efficient solution is that we need to add some redundancy to the memory representation so that we can spare enough CPU time. In some cases, adding such a redundancy can be accomplished without compromising the memory size (like for O(1) increment/decrement operation). And if we allow arbitrary tree shapes, like in the CPU efficient solution with bootstrapped lists, there are simply too many tree shapes to distinguish them in O(log n) memory.
So the question is: Can we find just the right amount of redundancy so that sub-linear amount of memory is enough and with which we could achieve O(1) addition? I believe the answer is no:
Let's have a representation+algorithm that has O(1) time addition. Let's then have a number of magnitude m bits, which we compute as a sum of 2^k numbers, each of magnitude (m-k) bits. To represent each of those summands we need (regardless of the representation) a minimum of (m-k) bits of memory, so at the beginning we start with (at least) (m-k) 2^k bits of memory. Now at each of those 2^k additions we are allowed to perform a constant number of operations, so we are able to process (and ideally remove) a total of C 2^k bits. Therefore at the end, the lower bound for the number of bits we need to represent the outcome is (m-k-C) 2^k bits. Since k can be chosen arbitrarily, our adversary can set k=m-C-1, which means the total sum will be represented with at least 2^(m-C-1) = 2^m/2^(C+1) ∈ Ω(2^m) bits. So a natural number n will always need Ω(n) bits of memory!

Why does System.Numerics.Complex use doubles instead of decimals?

I've been working with System.Numerics.Complex recently, and I've started to notice the typical floating-point "drift" where the value stored gets calculated a tenth of a millionth off or something like that, which is well-known and common with the float type and even the double type. I looked into the Complex struct, and sure enough, it used double variables. Why does it use double values to store its data and not decimal values, which are designed to prevent this? How do I work around this?
To answer your question:
doubles are several orders of magnitude faster, as operations are done at the hardware level
base-2 floats can actually be more accurate for large computations, as there is less "wobble" when shifting up and down exponents: 1 bit of precision is less than 1 decimal digit. Moreover, base-2 can use an implicit leading bit, which means they can represent more numbers than other bases.
complex numbers are typically used for scientific/engineering applications, where small relative errors of approx. 10^-16 are outweighed by other sources of error (e.g. due to measurement or the model).
decimals on the other hand are typically used for "accounting" type operations, where round-off error is typically negligible (i.e. addition of small numbers, multiplication by integers, etc.)
