How to break out of recursion at a base case in Haskell? - haskell

I'm trying to write a function that computes all the weird numbers up to a given integer. To determine if a number n is pseudoperfect, I was trying to come up with a better solution than naively computing the sum of every combination of divisors until I either exhaust all the combinations or find a solution. I ended up with the idea that you can recursively reduce the problem by definitely proving inclusion of certain divisors in the sum. Take 12 for example: you know that 6 must be included in the final sum because the total of all other divisors is less than 12. Then the problem is reduced to summing to the rest of 12 (ie 12 - 6, ie 6) from the remaining divisors, and so forth. If you reach 0, then the number is pseudoperfect. If you exhaust all the divisors then it's not pseudoperfect.
Here's my code so far:
(The functions powerset and divisors do the expected: return a list of all sublists, and a list of divisors respectively. Obviously RETURN is just a placeholder for what I want to do.)
pseudoperfect _ 0 = RETURN True
pseudoperfect d n =
  let g = zipWith (\x y -> x) d (filter (< 0) (map (subtract n) (scanl1 (+) d))) in
  map (\x -> pseudoperfect g (n - (sum x))) (filter (not . null) (powerset (d \\ g)))
-- Determine if n is weird, ie, abundant and not pseudoperfect.
weird n = ((sum . divisors) (n) > n) && not (pseudoperfect (divisors n) n)
-- Find weird numbers up to given n.
weirds n = filter (weird) [1..n]
I tried working through a couple of examples on paper and I got the correct result. My problem is that I want to just break out of the recursion and return True when I reach the base case of n = 0. Is there a way to do this in Haskell, or do I need to restructure the function somehow so that the value returned at the base case is passed back up through the stack?
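One way to get that behaviour without an explicit break, as a minimal sketch under assumptions: this is a plain subset-sum search over the proper divisors, not the pruning scheme described above, and properDivisors and subsetSums are hypothetical helpers (the question's divisors and powerset are not shown). Because (||) only evaluates its right-hand side when the left-hand side is False, the search stops as soon as one branch reaches 0, and that True is simply the result of the whole expression.

```haskell
-- Proper divisors of n (hypothetical helper; the question's `divisors` is not shown).
properDivisors :: Int -> [Int]
properDivisors n = [d | d <- [1 .. n `div` 2], n `mod` d == 0]

-- Can some subset of the list sum to the target?
-- (||) is lazy in its second argument, so the first success ends the search.
subsetSums :: [Int] -> Int -> Bool
subsetSums _ 0 = True
subsetSums [] _ = False
subsetSums (d:ds) t
  | d > t     = subsetSums ds t                         -- d cannot be part of the sum
  | otherwise = subsetSums ds (t - d) || subsetSums ds t

pseudoperfect :: Int -> Bool
pseudoperfect n = subsetSums (reverse (properDivisors n)) n

-- Weird: abundant and not pseudoperfect.
weird :: Int -> Bool
weird n = sum (properDivisors n) > n && not (pseudoperfect n)

weirds :: Int -> [Int]
weirds n = filter weird [1 .. n]
```

With these definitions, weirds 100 evaluates to [70]. The same short-circuiting applies to any and or, so a version built on the original powerset idea could wrap the recursive calls in any instead of map.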

Related

Primes in Haskell

I'm learning Haskell, and I've tried to generate an infinite list of primes, but I can't understand what my function is doing wrong.
The function:
prime = 2:3:filter (\x -> all (\y -> (mod x y) > 0) (init prime)) [5..]
I think it's the init prime, but the strange thing is that even if I set an upper bound to the range (5..10 for example), the function loops forever and never gets any result for prime !! 2
Can you please tell me what I'm doing wrong?
Well, for one let's look at what init does for a finite list:
init [1] == []
init [1,2] == [1]
init [1,2,3] == [1,2]
ok, so it gives us all but the last element of the list.
So what's init prime? Well, prime without the last element. Hopefully, if we implemented prime correctly, it shouldn't have a last element (because there are infinitely many primes!), but more importantly we don't quite need to care yet because we don't have the full list for now anyway - we only care about the first couple of elements after all, so for us it's pretty much the same as just prime itself.
Now, looking at all: What does this do? Well, it takes a list and a predicate and tells us if all the elements of the list satisfy the predicate:
all (<5) [1..4] == True
all even [1..4] == False
it even works with infinite lists!
all (<5) [1..] == False
so what's going on here? Well, here's the thing: It does work with infinite lists... but only if we can actually evaluate the list up to the first element of the list that violates the predicate! Let's see if this holds true here:
all (\y -> (mod 5 y) > 0) (init prime)
so to find out if 5 is a prime number, we'd have to check if there's a number in prime minus the last element of prime that divides it. Let's see if we can do that.
Now let's look at the definition of prime, we get
all (\y -> (mod 5 y) > 0) (2:3:filter (\x -> all (\y -> (mod x y) > 0) (init prime)) [5..])
So to determine whether 5 is a prime number, we only have to check if it's:
divisible by 2 - it's not, let's continue
divisible by 3 - still no
divisible by ...? Well, we're in the process of checking what the 3rd prime is so we don't know yet...
and there's the crux of the problem. With this logic, to determine the third prime number you need to know the third prime number! Of course logically, we actually don't want to check this at all, rather we only need to check if any of the smaller prime numbers are divisors of the current candidate.
So how do we go about doing that? Well, we'll have to change our logic unfortunately. One thing we can do is try to remember how many primes we already have, and only take as many as we need for our comparison:
prime = 2 : 3 : morePrimes 2 [5..]
morePrimes n (x:xs)
  | all (\y -> mod x y > 0) (take n prime) = x : morePrimes (n+1) xs
  | otherwise = morePrimes n xs
so how does this work? Well, it basically does what we were just talking about: we remember how many primes we already have in n (starting at 2, because we know we have at least [2,3]). We then check whether our candidate is divisible by any of the n primes we already know by using take n. If it is not, we know it's our next prime and we need to increment n - otherwise we just carry on.
There's also the more well known form inspired by (although not quite the same as) the Sieve of Eratosthenes:
prime = sieve [2..] where
  sieve (p:xs) = p : sieve (filter (\x -> mod x p > 0) xs)
so how does this work? Well, again with a similar idea: We know that the next prime number needs to be non-divisible by any previous prime number. So what do we do? Well, starting at 2 we know that the first element in the list is a prime number. We then throw away every number divisible by that prime number using filter. And afterwards, the next item in the list is going to be a prime number again (because we didn't throw it away), so we can repeat the process.
Neither of these is a one-liner like the one you were hoping for, though.
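For comparison, here is a sketch of another common self-referential formulation. It is not taken from the answer above; the names primes and isPrime are chosen for illustration. It only tests candidates against the primes whose square does not exceed the candidate. The self-reference is safe here because takeWhile stops at the first prime p with p*p > x, and that prime has always been produced already by the time x is tested.

```haskell
-- Infinite list of primes; each candidate is checked only against
-- earlier primes up to its square root.
primes :: [Integer]
primes = 2 : filter isPrime [3, 5 ..]

isPrime :: Integer -> Bool
isPrime x = all (\p -> x `mod` p /= 0) (takeWhile (\p -> p * p <= x) primes)
```

For example, take 10 primes gives [2,3,5,7,11,13,17,19,23,29]. Note that isPrime as written is only meant for the odd candidates fed to it by primes (and for 2 itself); it is not a general-purpose test for arguments below 2.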
If the code in the other answer is restructured under the identity
[take n primes | n <- [0..]] == inits primes
eventually we get
import Data.List
-- [ ([], 2), ([2], 3), ([2,3], 5), ... ]
primes = 2 : [ c | (ps, p) <- zip (inits primes) primes,
                   c <- take 1 [c | c <- [p+1..],
                                    and [mod c p > 0 | p <- ps]]]
Further improving it algorithmically, it becomes
primes = 2 : [ c | (ps, r:q:_) <- zip (inits primes)                  -- []     [3,4,...]
                                      (tails $ 3 : map (^2) primes),  -- [2]    [4,9,...]
                   c <- [r..q-1], and [mod c p > 0 | p <- ps]]        -- [2,3]  [9,25,...]

Haskell Does Not Evaluate Lazily takeWhile

isqrt :: Integer -> Integer
isqrt = floor . sqrt . fromIntegral
primes :: [Integer]
primes = sieve [2..] where
  sieve (p:ps) = p : sieve [x | x <- ps, x `mod` p > 0]
primeFactors :: Integer -> [Integer]
primeFactors n = takeWhile (< n) [x | x <- primes, n `mod` x == 0]
Here is my code. I think you guessed what I am trying to do: A list of prime factors of a given number using infinite list of prime numbers. But this code does not evaluate lazily.
When I use ghci and :l mycode.hs and enter primeFactors 24, the result is [2, 3 ( and the cursor constantly flashing there) there isn't a further Prelude> prompt. I think there is a problem there. What am I doing wrong?
Thanks.
takeWhile never terminates for composite arguments. If n is composite, it has no prime factors >= n, so takeWhile will just sit there.
Apply takeWhile to the primes list and then filter the result with n mod x, like this:
primeFactors n = [x | x <- takeWhile (<= n) primes, n `mod` x == 0]
(<= is used instead of < for maximum correctness, so that prime factors of a prime number would consist of that number).
Have an illustration of what happens:
http://sketchtoy.com/67338195
Your problem isn't directly takeWhile, but rather the list comprehension.
[x | x <- primes, n `mod` x == 0]
For n = 24, we get 24 `mod` 2 == 0 and 24 `mod` 3 == 0, so the value of this list comprehension starts with 2 : 3 : .... But consider the ... part.
The list comprehension has to keep pulling values from primes and checking 24 `mod` x == 0. Since there are no more prime factors of 24 nothing will ever pass that test and get emitted as the third value of the list comprehension. But since there's always another prime to test, it will never stop and conclude that the remaining tail of the list is empty.
Because this is lazily evaluated, if you only ever ask for the first two elements of this list then you're fine. But if your program ever needs the third one (or even just to know whether or not there is a third element), then the list comprehension will just spin forever trying to come up with one.
takeWhile (< 24) keeps pulling elements from its argument until it finds one that is not < 24. 2 and 3 both pass that test, so takeWhile (< 24) does need to know what the third element of the list comprehension is.
But it's not really a problem with takeWhile; the problem is that you've written a list comprehension to find all of the prime factors (and nothing else), and then tried to use a filter on the results of that to cut off the infinite exploration of all the higher primes that can't possibly be factors. That doesn't really make sense if you stop to think about it; by definition anything that isn't a prime factor can't be an element of that list, so you can't filter out the non-factors larger than n from that list. Instead you need to filter the input to that list comprehension so that it doesn't try to explore an infinite space, as n.m.'s answer shows.
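To make the difference concrete, here is a small self-contained sketch. unboundedFactors is a hypothetical name for the list comprehension from the question; primeFactors is the bounded version from the other answer. Only the type signatures and the extra sieve [] case are additions.

```haskell
primes :: [Integer]
primes = sieve [2 ..]
  where
    sieve (p:ps) = p : sieve [x | x <- ps, x `mod` p > 0]
    sieve []     = []

-- The comprehension from the question: productive only for as many
-- elements as n actually has prime factors, then it searches forever.
unboundedFactors :: Integer -> [Integer]
unboundedFactors n = [x | x <- primes, n `mod` x == 0]

-- Bound the input first, then filter: this list is finite.
primeFactors :: Integer -> [Integer]
primeFactors n = [x | x <- takeWhile (<= n) primes, n `mod` x == 0]
```

Here take 2 (unboundedFactors 24) returns [2,3] because only two elements are demanded, whereas asking for a third element (or for the end of the list) never returns. primeFactors 24 gives [2,3] and terminates.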

Generating Cartesian products in Haskell

I am trying to generate all possible combinations of n numbers. For example if n = 3 I would want the following combinations:
(0,0,0), (0,0,1), (0,0,2)... (0,0,9), (0,1,0)... (9,9,9).
This post describes how to do so for n = 3:
[(a,b,c) | m <- [0..9], a <- [0..m], b <- [0..m], c <- [0..m] ]
Or to avoid duplicates (i.e. multiple copies of the same n-uple):
let l = 9; in [(a,b,c) | m <- [0..3*l],
                         a <- [0..l], b <- [0..l], c <- [0..l],
                         a + b + c == m ]
However following the same pattern would become very silly very quickly for n > 3. Say I wanted to find all of the combinations: (a, b, c, d, e, f, g, h, i, j), etc.
Can anyone point me in the right direction here? Ideally I'd rather not use a built-in function, as I am trying to learn Haskell and I would rather take the time to understand a piece of code than just use a package written by someone else. A tuple is not required; a list would also work.
My other answer gave an arithmetic algorithm to enumerate all the combinations of digits. Here's an alternative solution which arises by generalising your example. It works for non-numbers, too, because it only uses the structure of lists.
First off, let's remind ourselves of how you might use a list comprehension for three-digit combinations.
threeDigitCombinations = [[x, y, z] | x <- [0..9], y <- [0..9], z <- [0..9]]
What's going on here? The list comprehension corresponds to nested loops. z counts from 0 to 9, then y goes up to 1 and z starts counting from 0 again. x ticks the slowest. As you note, the shape of the list comprehension changes (albeit in a uniform way) when you want a different number of digits. We're going to exploit that uniformity.
twoDigitCombinations = [[x, y] | x <- [0..9], y <- [0..9]]
We want to abstract over the number of variables in the list comprehension (equivalently, the nested-ness of the loop). Let's start playing around with it. First, I'm going to rewrite these list comprehensions as their equivalent monad comprehensions.
threeDigitCombinations = do
  x <- [0..9]
  y <- [0..9]
  z <- [0..9]
  return [x, y, z]
twoDigitCombinations = do
  x <- [0..9]
  y <- [0..9]
  return [x, y]
Interesting. It looks like threeDigitCombinations is roughly the same monadic action as twoDigitCombinations, but with an extra statement. Rewriting again...
zeroDigitCombinations = [[]] -- equivalently, `return []`
oneDigitCombinations = do
  z <- [0..9]
  empty <- zeroDigitCombinations
  return (z : empty)
twoDigitCombinations = do
  y <- [0..9]
  z <- oneDigitCombinations
  return (y : z)
threeDigitCombinations = do
  x <- [0..9]
  yz <- twoDigitCombinations
  return (x : yz)
It should be clear now what we need to parameterise:
combinationsOfDigits 0 = return []
combinationsOfDigits n = do
  x <- [0..9]
  xs <- combinationsOfDigits (n - 1)
  return (x : xs)
ghci> combinationsOfDigits 2
[[0,0],[0,1],[0,2],[0,3],[0,4],[0,5],[0,6],[0,7],[0,8],[0,9],[1,0],[1,1] ... [9,8],[9,9]]
It works, but we're not done yet. I want to show you that this is an instance of a more general monadic pattern. First I'm going to change the implementation of combinationsOfDigits so that it folds up a list of constants.
combinationsOfDigits n = foldUpList $ replicate n [0..9]
  where foldUpList [] = return []
        foldUpList (xs : xss) = do
          x <- xs
          ys <- foldUpList xss
          return (x : ys)
Looking at the definition of foldUpList :: [[a]] -> [[a]], we can see that it doesn't actually require the use of lists per se: it only uses the monad-y parts of lists. It could work on any monad, and indeed it does! It's in the standard library, and it's called sequence :: Monad m => [m a] -> m [a]. If you're confused by that, replace m with [] and you should see that those types mean the same thing.
combinationsOfDigits n = sequence $ replicate n [0..9]
Finally, noting that sequence . replicate n is the definition of replicateM, we get it down to a very snappy one-liner.
combinationsOfDigits n = replicateM n [0..9]
To summarise, replicateM n gives the n-ary combinations of an input list. This works for any list, not just a list of numbers. Indeed, it works for any monad - though the "combinations" interpretation only makes sense when your monad represents choice.
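As a quick self-contained check of that summary, the sketch below adds the import that the one-liner needs (replicateM lives in Control.Monad) and a second, non-numeric example. combinationsOf is a hypothetical name introduced here for the general form.

```haskell
import Control.Monad (replicateM)

-- All n-digit combinations, in counting order.
combinationsOfDigits :: Int -> [[Int]]
combinationsOfDigits n = replicateM n [0 .. 9]

-- The same idea for an arbitrary list of choices.
combinationsOf :: Int -> [a] -> [[a]]
combinationsOf = replicateM
```

For instance, combinationsOf 2 "ab" yields ["aa","ab","ba","bb"], and length (combinationsOfDigits 3) is 1000.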
This code is very terse indeed! So much so that I think it's not entirely obvious how it works, unlike the arithmetic version I showed you in my other answer. The list monad has always been one of the monads I find less intuitive, at least when you're using higher-order monad combinators and not do-notation.
On the other hand, it runs quite a lot faster than the number-crunching version. On my (high-spec) MacBook Pro, compiled with -O2, this version calculates the 5-digit combinations about 4 times faster than the version which crunches numbers. (If anyone can explain the reason for this I'm listening!)
What are all the combinations of three digits? Let's write a few out manually.
000, 001, 002 ... 009, 010, 011 ... 099, 100, 101 ... 998, 999
We ended up simply counting! We enumerated all the numbers between 0 and 999. For an arbitrary number of digits this generalises straightforwardly: the upper limit is 10^n (exclusive), where n is the number of digits.
Numbers are designed this way on purpose. It would be jolly strange if there was a possible combination of three digits which wasn't a valid number, or if there was a number below 1000 which couldn't be expressed by combining three digits!
This suggests a simple plan to me, which just involves arithmetic and doesn't require a deep understanding of Haskell*:
1. Generate a list of numbers between 0 and 10^n.
2. Turn each number into a list of digits.
Step 2 is the fun part. To extract the digits (in base 10) of a three-digit number, you do this:
1. Take the quotient and remainder of your number with respect to 100. The quotient is the first digit of the number.
2. Take the remainder from step 1 and take its quotient and remainder with respect to 10. The quotient is the second digit.
3. The remainder from step 2 was the third digit. This is the same as taking the quotient with respect to 1.
For an n-digit number, we take the quotient n times, starting with 10^(n-1) and ending with 1. Each time, we use the remainder from the last step as the input to the next step. This suggests that our function to turn a number into a list of digits should be implemented as a fold: we'll thread the remainder through the operation and build a list as we go. (I'll leave it to you to figure out how this algorithm changes if you're not in base 10!)
Now let's implement that idea. We want to calculate a specified number of digits, zero-padding when necessary, of a given number. What should the type of digits be?
digits :: Int -> Int -> [Int]
Hmm, it takes in a number of digits and an integer, and produces a list of integers representing the digits of the input integer. The list will contain single-digit integers, each one of which will be one digit of the input number.
digits numberOfDigits theNumber = reverse $ fst $ foldr step ([], theNumber) powersOfTen
  where step exponent (digits, remainder) =
          let (digit, newRemainder) = remainder `divMod` exponent
          in (digit : digits, newRemainder)
        powersOfTen = [10^n | n <- [0..(numberOfDigits-1)]]
What's striking to me is that this code looks quite similar to my English description of the arithmetic we wanted to perform. We generate a powers-of-ten table by exponentiating numbers from 0 upwards. Then we fold that table back up; at each step we put the quotient on the list of digits and send the remainder to the next step. We have to reverse the output list at the end because of the right-to-left way it got built.
By the way, the pattern of generating a list, transforming it, and then folding it back up is an idiomatic thing to do in Haskell. It's even got its own high-falutin' mathsy name, hylomorphism. GHC knows about this pattern too and can compile it into a tight loop, optimising away the very existence of the list you're working with.
Let's test it!
ghci> digits 3 123
[1, 2, 3]
ghci> digits 5 10101
[1, 0, 1, 0, 1]
ghci> digits 6 99
[0, 0, 0, 0, 9, 9]
It works like a charm! (Well, it misbehaves when numberOfDigits is too small for theNumber, but never mind about that.) Now we just have to generate a counting list of numbers on which to use digits.
combinationsOfDigits :: Int -> [[Int]]
combinationsOfDigits numberOfDigits = map (digits numberOfDigits) [0..(10^numberOfDigits)-1]
... and we've finished!
ghci> combinationsOfDigits 2
[[0,0],[0,1],[0,2],[0,3],[0,4],[0,5],[0,6],[0,7],[0,8],[0,9],[1,0],[1,1] ... [9,7],[9,8],[9,9]]
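For the base question left open earlier, one possible sketch follows. It is an assumption-laden variant rather than the fold used above: digitsInBase and combinationsInBase are hypothetical names, and the digits are peeled off from the least significant end with repeated div and mod, then reversed.

```haskell
-- The last `count` digits of n in the given base, most significant first,
-- zero-padded on the left. Assumes base >= 2 and n >= 0.
digitsInBase :: Int -> Int -> Int -> [Int]
digitsInBase base count n =
  reverse (take count (map (`mod` base) (iterate (`div` base) n)))

-- Every combination of `count` digits in the given base, in counting order.
combinationsInBase :: Int -> Int -> [[Int]]
combinationsInBase base count =
  map (digitsInBase base count) [0 .. base ^ count - 1]
```

With base 10 this agrees with the examples above (digitsInBase 10 6 99 is [0,0,0,0,9,9]), and combinationsInBase 2 2 gives [[0,0],[0,1],[1,0],[1,1]].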
* For a version which does require a deep understanding of Haskell, see my other answer.
combos 1 list = map (\x -> [x]) list
combos n list = foldl (++) [] $ map (\x -> map (\y -> x:y) nxt) list
  where nxt = combos (n-1) list
In your case
combos 3 [0..9]

My solution for Euler Project #3 is too slow

I'm new to Haskell and tinkering around with the Euler Project problems. My solution for problem #3 is far too slow. At first I tried this:
-- Problem 3
-- The prime factors of 13195 are 5, 7, 13 and 29.
-- What is the largest prime factor of the number 600851475143 ?
problem3 = maximum [ x | x <- [1..n], (mod n x) == 0, n /= x]
  where n = 600851475143
Then I changed it to return all x and not just the largest one.
problem3 = [ x | x <- [1..n], (mod n x) == 0, n /= x]
  where n = 600851475143
After 30 minutes, the list is still being processed and the output looks like this
[1,71,839,1471,6857,59569,104441,486847,1234169,5753023,10086647,87625999,408464633,716151937
Why is it so slow? Am I doing something terribly wrong or is it normal for this sort of task?
With your solution, there are about 600 billion possible numbers. As noted by delnan, making every check of the number quicker is not going to make much difference, we must limit the number of candidates.
Your solution does not seem to be correct either. 59569 = 71 * 839, isn't it? The question only asks for prime factors. Notice that 71 and 839 are in your list, so you are doing something right. In fact, you are trying to find all factors.
I think you get the most dramatic effect simply by dividing away each factor before continuing.
euler3 = go 2 600851475143
  where
    go cand num
      | cand == num           = [num]
      | cand `isFactorOf` num = cand : go cand (num `div` cand)
      | otherwise             = go (cand + 1) num
    isFactorOf a b = b `mod` a == 0
This may seem like an obvious optimization, but it relies on the fact that if both a and b divide c and a is coprime to b, then a divides c/b.
If you want to do more, the common "Only check until the square root" trick has been
mentioned here. The same trick can be applied to this problem, but the performance gain does not show, unfortunately, on this instance:
euler3 = go 2 600851475143
  where
    go cand num
      | cand*cand > num       = [num]
      | cand `isFactorOf` num = cand : go cand (num `div` cand)
      | otherwise             = go (cand + 1) num
    isFactorOf a b = b `mod` a == 0
Here, when a candidate is larger than the square root of the remaining number (num), we know that num must be a prime and therefore a prime factor of the original
number (600851475143).
It is possible to remove even more candidates by only considering prime numbers,
but this is slightly more advanced because you need to make a reasonably performant
way of generating primes. See this page for ways of doing that.
It's doing a lot of work! (It's also going to give you the wrong answer, but that's a separate issue!)
There are a few very quick ways you could speed it up by thinking about the problem a little first:
You are applying your function over all numbers 1..n, and checking each one of them to ensure it isn't n. Instead, you could just go over all numbers 1..n-1 and skip out n different checks (small though they are).
The answer is odd, so you can very quickly filter out any even numbers by going from 1..(n-1)/2 and checking for 2x+1 instead of x.
If you think about it, all factors occur in pairs, so you can in fact just search from 1..sqrt(n) (or 1..sqrt(n)/2 if you ignore even numbers) and output pairs of numbers in each step.
Not related to the performance of this function, but it's worth noting that what you've implemented here will find all of the factors of a number, whereas what you want is only the largest prime factor. So either you have to test each of your divisors for primality (which is going to be slow, again) or you can implement the two in one step. You probably want to look at 'sieves', the most simple being the Sieve of Eratosthenes, and how you can implement them.
A complete factorization of a number can take a long time for big numbers. For Project Euler problems, a brute force solution (which this is) is usually not enough to find the answer in your lifetime.
Hint: you do not need to find all prime factors, just the biggest one.
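One way to act on that hint, as a sketch under stated assumptions: largestPrimeFactor is a hypothetical name, the argument is assumed to be at least 2, and the method is plain trial division that divides each factor out as soon as it is found and stops once the candidate exceeds the square root of what remains. Whatever is left at that point is the largest prime factor.

```haskell
-- Largest prime factor by trial division (assumes n >= 2).
largestPrimeFactor :: Integer -> Integer
largestPrimeFactor = go 2
  where
    go cand num
      | cand * cand > num   = num                       -- what remains is prime
      | num `mod` cand == 0 = go cand (num `div` cand)  -- divide the factor out
      | otherwise           = go (cand + 1) num
```

For the example in the problem statement, largestPrimeFactor 13195 is 29, and largestPrimeFactor 600851475143 is 6857.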
TL;DR: The two things you were doing non-optimally, are: not stopping at the square root, and not dividing out each smallest factor, as they are found.
Here's a little derivation of the (2nd) factorization code shown in the answer by HaskellElephant. We start with your code:
f1 n = [ x | x <- [2..n], rem n x == 0]
n3 = 600851475143
Prelude> f1 n3
[71,839,1471,6857,59569,104441,486847Interrupted.
So it doesn't finish in any reasonable amount of time, and some of the numbers it produces are not prime... But instead of adding primality check to the list comprehension, let's notice that 71 is prime. The first number produced by f1 n is the smallest divisor of n, and thus it is prime. If it weren't, we'd find its smallest divisor first - a contradiction.
So, we can divide it out, and continue searching for the prime factors of newly reduced number:
f2 n = tail $ iterate (\(_,m)-> (\f->(f, quot m f)) . head $ f1 m) (1,n)
Prelude> f2 n3
[(71,8462696833),(839,10086647),(1471,6857),(6857,1),(*** Exception: Prelude.head: empty list
(the error, because f1 1 == []). We're done! (6857 is the answer, here...). Let's wrap it up:
takeUntil p xs = foldr (\x r -> if p x then [x] else x:r) [] xs
pfactors1 n = map fst . takeUntil ((==1).snd) . f2 $ n -- prime factors of n
Trying out our newly minted solution,
Prelude> map pfactors1 [n3..]
[[71,839,1471,6857],[2,2,2,3,3,1259Interrupted.
suddenly we hit a new inefficiency wall, on numbers without small divisors. But if n = a*b and 1 < a <= b, then a*a <= a*b == n and so it is enough to test only until the square root of a number, to find its smallest divisor.
f12 n = [ x | x <- takeWhile ((<= n).(^2)) [2..n], rem n x == 0] ++ [n]
f22 n = tail $ iterate (\(_,m)-> (\f->(f, quot m f)) . head $ f12 m) (1,n)
pfactors2 n = map fst . takeUntil ((==1).snd) . f22 $ n
What couldn't finish in half an hour now finishes in under one second (on a typical performant box):
Prelude> f12 n3
[71,839,1471,6857,59569,104441,486847,600851475143]
All the divisors above sqrt n3 were not needed at all. We unconditionally add n itself as the last divisor in f12 so it is able to handle prime numbers:
Prelude> f12 (n3+6)
[600851475149]
Since n3 / sqrt n3 = sqrt n3 ~= 775146, your original attempt at f1 n3 should have taken about a week to finish. That's how important this optimization is, of stopping at the square root.
Prelude> f22 n3
[(71,8462696833),(839,10086647),(1471,6857),(6857,1),(1,1),(1,1),(1,1),(1,1),(1,1),
(1,1),(1,1),(1,1),(1,1),(1,1),(1,1),(1,1),(1,1),(1,1)Interrupted
We've apparently traded the "Prelude.head: empty list" error for a non-terminating - but productive - behavior.
Lastly, we break f22 up in two parts and fuse them each into the other functions, for a somewhat simplified code. Also, we won't start over anew, as f12 does, searching for the smallest divisor from 2 all the time, anymore:
-- smallest factor of n, starting from d. directly jump from sqrt n to n.
smf (d,n) = head $ [ (x, quot n x) | x <- takeWhile ((<=n).(^2)) [d..]
                                   , rem n x == 0] ++ [(n,1)]
pfactors n = map fst . takeUntil ((==1).snd) . tail . iterate smf $ (2,n)
This expresses guarded (co)recursion through a higher-order function iterate, and is functionally equivalent to that code mentioned above. The following now runs smoothly, and we're even able to find a pair of twin primes as a bonus there:
Prelude Saga> map pfactors [n3..]
[[71,839,1471,6857],[2,2,2,3,3,1259,6628403],[5,120170295029],[2,13,37,227,2751479],
[3,7,7,11,163,2279657],[2,2,41,3663728507],[600851475149],[2,3,5,5,19,31,6800809],
[600851475151],[2,2,2,2,37553217197],[3,3,3,211,105468049],[2,7,11161,3845351],
[5,67,881,2035853],[2,2,3Interrupted.
Here is my solution for Euler Project #3. It takes only 1.22 sec on my Macbook Air.
First we should find all factors of the given number. But we know, that even numbers can't be prime numbers (except number 2). So, to solve Euler Project #3 we need not all, but only odd factors:
getOddFactors num = [ x | x <- [3,5..num], num `rem` x == 0 ]
But we can optimize this function. If we plan to find a factor of num greater than sqrt num, we should have another factor which is less than sqrt num - and these possible factors we have found already. Hence, we can limit our list of possible factors by sqrt num:
getOddFactors num = [ x | x <- [3, 5..(floor.sqrt.fromIntegral) num],
                          num `rem` x == 0 ]
Next we want to know which of our odd factors of num are prime numbers:
isPrime number = [ x | x <- [3..(floor.sqrt.fromIntegral) number],
                       number `rem` x == 0] == []
Next we can filter odd factors of num with the function isPrime to find all prime factors of num. But to use laziness of Haskell to optimize our solution, we apply function filter isPrime to the reversed list of odd factors of the num. As soon as our function finds the first value which is prime number, Haskell stops computations and returns solution:
largestPrimeFactor = head . filter isPrime . reverse . getOddFactors
Hence, the solution is:
ghci> largestPrimeFactor 600851475143
6857
(1.22 secs, 110646064 bytes)

Iterating a function and analysing the result in haskell

Ok, referring back to my previous question, I am still working on learning haskell and solving the current problem of finding the longest chain from the following iteration:
chain n | n == 0       = error "What are you on about?"
        | n == 1       = [1]
        | rem n 2 == 0 = n : chain (n `div` 2)
        | otherwise    = n : chain (3 * n + 1)
I have this bit sorted, but I need to find the longest chain from a starting number below 1,000,000. So how do I make it do each starting number up to 1,000,000 and then print the one with the longest chain length?
I can do it for one example with:
Main> length (chain n)
I assume I need the output as an array and then use the maximum function to find the largest chain length and then see how far along it is in the array of answers.
Is this a good way to go about finding a solution or is there a better way (perhaps with better efficiency)?
You are right about the maximum part. To get the list (that's what Haskell's []s are, arrays are different structures) you need to use the map higher-order function, like this:
chainLength n = length (chain n)
lengths = map chainLength [1..1000000]
Essentially, map takes as arguments a function and a list. It applies the function to each element in the list and returns the list of the results.
Since you will be needing the number whose chain has that length, you may want to change the chainLength function to return the number as well, like this:
chainLength n = (n, length (chain n))
That way you will have a list of pairs, with each number and its chain length.
Now you need to get the pair with the largest second component. That's where the maximumBy function comes in. It works just like maximum but takes a function as a parameter to select how to compare the values. In this case, the second component of the pair. This comparison function takes two numbers and returns a value of type Ordering. This type has only three possible values: LT, EQ, GT, for less than, equal, and greater than, respectively.
So, we need a function that given two pairs tells us how the second components compare to each other:
compareSnd (_, y1) (_, y2) = compare y1 y2
-- Or, if you import Data.Function, you can write it like this (thanks alexey_r):
compareSnd = compare `on` snd -- reads nicely
I used the default compare function that compares numbers (well, not just numbers).
Now we only need to get the maximum using this function:
longestChain = maximumBy compareSnd lengths
That gets you a pair of the number with the longest chain and the corresponding length. Feel free to apply fst and snd as you please.
Note that this could be written much more concisely using zip and composition, but since you tagged the question as newbie, I thought it better to break it down like this.
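For reference, a more compact version might look like the sketch below. It is not the code from this answer: chainLength here counts the terms with an accumulator instead of building the list, longestChain takes the upper limit as a parameter, and comparing comes from Data.Ord. It assumes starting values of at least 1 and a 64-bit Int, since intermediate values grow well beyond the starting number.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Number of terms in the chain starting at n (assumes n >= 1).
chainLength :: Int -> Int
chainLength = go 1
  where
    go acc 1 = acc
    go acc n
      | even n    = go (acc + 1) (n `div` 2)
      | otherwise = go (acc + 1) (3 * n + 1)

-- (starting number, chain length) with the longest chain among [1 .. limit].
longestChain :: Int -> (Int, Int)
longestChain limit =
  maximumBy (comparing snd) [(n, chainLength n) | n <- [1 .. limit]]
```

For example, chainLength 13 is 10, and longestChain 100 is (97,119).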
SPOILER (solving the problem for positive integers under 100):
module Test where
import Data.List -- this contains maximumBy
chain n
  | n == 0       = error "What are you on about?"
  | n == 1       = [1]
  | rem n 2 == 0 = n : chain (n `div` 2)
  | otherwise    = n : chain (3 * n + 1)
chains = map (\x -> (x,chain x)) [1..100]
cmpSnd (a,b) (c,d)
  | length b > length d  = GT
  | length b == length d = EQ
  | otherwise            = LT
solve = (fst . maximumBy cmpSnd) chains
The chains function makes use of map. It applies a function to every element of a list of a values, so
map succ [1,2]
is the same as
[succ 1,succ 2]
The cmpSnd function is a comparison function that probably exists somewhere deep in the Hierarchical Libraries, but I could not find it, so I created it. GT means "the first value is greater than the second", the rest is trivial.
Solve takes the maximum (by utilizing the comparison function we defined earlier) of the list. This will be a pair of an integer and a list. It will return the integer only (because of the fst).
A comment: Your chain function is not tail-recursive. This means that large chains will inevitably result in a stack overflow. You should add an explicit accumulator variable and make it tail-recursive.
Something like
fst $ maximumBy (comparing (length . snd)) $ zip [1..1000000] $ map chain [1..1000000]
(untested; comparing comes from Data.Ord and maximumBy from Data.List)
i.e. don't work out how far along the longest chain is in the list of longest chains, but carry around the seed values with the chains instead.
I studied Haskell years ago, so I don't remember it that well. On the other hand I've tested this code and it works. You will get the max chain and the number that generates it. But as fiships has stated before, it will overflow for big values.
chain :: Int -> [Int]
chain n
  | n == 0       = []
  | n == 1       = [1]
  | rem n 2 == 0 = n : chain (n `div` 2)
  | otherwise    = n : chain (3 * n + 1)
length_chain :: Int -> Int
length_chain n = length (chain n)
max_pos :: (Int,Int) -> Int -> [Int] -> (Int,Int)
max_pos (m,p) _ [] = (m,p)
max_pos (m,p) a (x:xs)
  | x > m     = max_pos (x,a) (a+1) xs
  | otherwise = max_pos (m,p) (a+1) xs
The instruction will be
Main> max_pos (0,0) 1 (map length_chain [1..10000])
(262,6171)

Resources