Generating Cartesian products in Haskell - haskell

I am trying to generate all possible combinations of n numbers. For example if n = 3 I would want the following combinations:
(0,0,0), (0,0,1), (0,0,2)... (0,0,9), (0,1,0)... (9,9,9).
This post describes how to do so for n = 3:
[(a,b,c) | m <- [0..9], a <- [0..m], b <- [0..m], c <- [0..m] ]
Or to avoid duplicates (i.e. multiple copies of the same n-uple):
let l = 9; in [(a,b,c) | m <- [0..3*l],
a <- [0..l], b <- [0..l], c <- [0..l],
a + b + c == m ]
However following the same pattern would become very silly very quickly for n > 3. Say I wanted to find all of the combinations: (a, b, c, d, e, f, g, h, i, j), etc.
Can anyone point me in the right direction here? Ideally I'd rather not use a built-in function, as I am trying to learn Haskell and I would rather take the time to understand a piece of code than just use a package written by someone else. A tuple is not required; a list would also work.

My other answer gave an arithmetic algorithm to enumerate all the combinations of digits. Here's an alternative solution which arises by generalising your example. It works for non-numbers, too, because it only uses the structure of lists.
First off, let's remind ourselves of how you might use a list comprehension for three-digit combinations.
threeDigitCombinations = [[x, y, z] | x <- [0..9], y <- [0..9], z <- [0..9]]
What's going on here? The list comprehension corresponds to nested loops. z counts from 0 to 9, then y goes up to 1 and z starts counting from 0 again. x ticks the slowest. As you note, the shape of the list comprehension changes (albeit in a uniform way) when you want a different number of digits. We're going to exploit that uniformity.
twoDigitCombinations = [[x, y] | x <- [0..9], y <- [0..9]]
We want to abstract over the number of variables in the list comprehension (equivalently, the nested-ness of the loop). Let's start playing around with it. First, I'm going to rewrite these list comprehensions as the equivalent do-blocks in the list monad.
threeDigitCombinations = do
    x <- [0..9]
    y <- [0..9]
    z <- [0..9]
    return [x, y, z]
twoDigitCombinations = do
    x <- [0..9]
    y <- [0..9]
    return [x, y]
Interesting. It looks like threeDigitCombinations is roughly the same monadic action as twoDigitCombinations, but with an extra statement. Rewriting again...
zeroDigitCombinations = [[]] -- equivalently, `return []`
oneDigitCombinations = do
    z <- [0..9]
    empty <- zeroDigitCombinations
    return (z : empty)
twoDigitCombinations = do
    y <- [0..9]
    z <- oneDigitCombinations
    return (y : z)
threeDigitCombinations = do
    x <- [0..9]
    yz <- twoDigitCombinations
    return (x : yz)
It should be clear now what we need to parameterise:
combinationsOfDigits 0 = return []
combinationsOfDigits n = do
    x <- [0..9]
    xs <- combinationsOfDigits (n - 1)
    return (x : xs)
ghci> combinationsOfDigits 2
[[0,0],[0,1],[0,2],[0,3],[0,4],[0,5],[0,6],[0,7],[0,8],[0,9],[1,0],[1,1] ... [9,8],[9,9]]
It works, but we're not done yet. I want to show you that this is an instance of a more general monadic pattern. First I'm going to change the implementation of combinationsOfDigits so that it folds up a list of constants.
combinationsOfDigits n = foldUpList $ replicate n [0..9]
    where foldUpList [] = return []
          foldUpList (xs : xss) = do
              x <- xs
              ys <- foldUpList xss
              return (x : ys)
Looking at the definition of foldUpList :: [[a]] -> [[a]], we can see that it doesn't actually require the use of lists per se: it only uses the monad-y parts of lists. It could work on any monad, and indeed it does! It's in the standard library, and it's called sequence :: Monad m => [m a] -> m [a]. If you're confused by that, replace m with [] and you should see that those types mean the same thing.
combinationsOfDigits n = sequence $ replicate n [0..9]
Finally, noting that sequence . replicate n is the definition of replicateM (exported by Control.Monad), we get it down to a very snappy one-liner.
combinationsOfDigits n = replicateM n [0..9]
To summarise, replicateM n gives the n-ary combinations of an input list. This works for any list, not just a list of numbers. Indeed, it works for any monad - though the "combinations" interpretation only makes sense when your monad represents choice.
This code is very terse indeed! So much so that I think it's not entirely obvious how it works, unlike the arithmetic version I showed you in my other answer. The list monad has always been one of the monads I find less intuitive, at least when you're using higher-order monad combinators and not do-notation.
On the other hand, it runs quite a lot faster than the number-crunching version. On my (high-spec) MacBook Pro, compiled with -O2, this version calculates the 5-digit combinations about 4 times faster than the version which crunches numbers. (If anyone can explain the reason for this I'm listening!)
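To illustrate the claim that this works for any list and not just digits, here is a minimal sketch. The name combinationsOf is introduced here purely for illustration; only replicateM comes from the standard library.

```haskell
import Control.Monad (replicateM)

-- Hypothetical helper: every length-n sequence drawn from the given alphabet.
combinationsOf :: Int -> [a] -> [[a]]
combinationsOf n alphabet = replicateM n alphabet

-- In GHCi:
--   combinationsOf 2 "ab"               evaluates to ["aa","ab","ba","bb"]
--   length (combinationsOf 3 [0..9])    evaluates to 1000
```

Note that the result has length (length alphabet)^n, so it grows quickly; laziness helps only if you consume a prefix.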

What are all the combinations of three digits? Let's write a few out manually.
000, 001, 002 ... 009, 010, 011 ... 099, 100, 101 ... 998, 999
We ended up simply counting! We enumerated all the numbers between 0 and 999. For an arbitrary number of digits this generalises straightforwardly: the upper limit is 10^n (exclusive), where n is the number of digits.
Numbers are designed this way on purpose. It would be jolly strange if there was a possible combination of three digits which wasn't a valid number, or if there was a number below 1000 which couldn't be expressed by combining three digits!
This suggests a simple plan to me, which just involves arithmetic and doesn't require a deep understanding of Haskell*:
1. Generate a list of numbers between 0 and 10^n
2. Turn each number into a list of digits.
Step 2 is the fun part. To extract the digits (in base 10) of a three-digit number, you do this:
1. Take the quotient and remainder of your number with respect to 100. The quotient is the first digit of the number.
2. Take the remainder from step 1 and take its quotient and remainder with respect to 10. The quotient is the second digit.
3. The remainder from step 2 was the third digit. This is the same as taking the quotient with respect to 1.
For an n-digit number, we take the quotient n times, starting with 10^(n-1) and ending with 1. Each time, we use the remainder from the last step as the input to the next step. This suggests that our function to turn a number into a list of digits should be implemented as a fold: we'll thread the remainder through the operation and build a list as we go. (I'll leave it to you to figure out how this algorithm changes if you're not in base 10!)
Now let's implement that idea. We want to calculate a specified number of digits, zero-padding when necessary, of a given number. What should the type of digits be?
digits :: Int -> Int -> [Int]
Hmm, it takes in a number of digits and an integer, and produces a list of integers representing the digits of the input integer. The list will contain single-digit integers, each one of which will be one digit of the input number.
digits numberOfDigits theNumber = reverse $ fst $ foldr step ([], theNumber) powersOfTen
    where step exponent (digits, remainder) =
              let (digit, newRemainder) = remainder `divMod` exponent
              in (digit : digits, newRemainder)
          powersOfTen = [10^n | n <- [0..(numberOfDigits-1)]]
What's striking to me is that this code looks quite similar to my English description of the arithmetic we wanted to perform. We generate a powers-of-ten table by exponentiating numbers from 0 upwards. Then we fold that table back up; at each step we put the quotient on the list of digits and send the remainder to the next step. We have to reverse the output list at the end because of the right-to-left way it got built.
By the way, the pattern of generating a list, transforming it, and then folding it back up is an idiomatic thing to do in Haskell. It's even got its own high-falutin' mathsy name, hylomorphism. GHC knows about this pattern too and can compile it into a tight loop, optimising away the very existence of the list you're working with.
Let's test it!
ghci> digits 3 123
[1,2,3]
ghci> digits 5 10101
[1,0,1,0,1]
ghci> digits 6 99
[0,0,0,0,9,9]
It works like a charm! (Well, it misbehaves when numberOfDigits is too small for theNumber, but never mind about that.) Now we just have to generate a counting list of numbers on which to use digits.
combinationsOfDigits :: Int -> [[Int]]
combinationsOfDigits numberOfDigits = map (digits numberOfDigits) [0..(10^numberOfDigits)-1]
... and we've finished!
ghci> combinationsOfDigits 2
[[0,0],[0,1],[0,2],[0,3],[0,4],[0,5],[0,6],[0,7],[0,8],[0,9],[1,0],[1,1] ... [9,7],[9,8],[9,9]]
* For a version which does require a deep understanding of Haskell, see my other answer.
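For readers who want to check their answer to the "not in base 10" exercise, here is one possible sketch. It is not the fold over a powers table used above: it peels digits off the least significant end with divMod and conses them onto an accumulator, which avoids the final reverse. The names digitsInBase and combinationsInBase are made up for this example.

```haskell
-- Hypothetical generalisation: fixed-width digits of a number in a given base.
digitsInBase :: Int -> Int -> Int -> [Int]
digitsInBase base width number = go width number []
  where
    -- Take the least significant digit `width` times; consing onto the
    -- accumulator leaves the most significant digit at the front.
    go 0 _ acc = acc
    go k m acc =
      let (q, r) = m `divMod` base
      in go (k - 1) q (r : acc)

-- All width-digit combinations in the given base, by counting from 0.
combinationsInBase :: Int -> Int -> [[Int]]
combinationsInBase base width =
  map (digitsInBase base width) [0 .. base ^ width - 1]

-- In GHCi:
--   digitsInBase 10 6 99      evaluates to [0,0,0,0,9,9]
--   combinationsInBase 2 2    evaluates to [[0,0],[0,1],[1,0],[1,1]]
```

Like digits above, this silently drops high-order digits when width is too small for the number.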

combos 1 list = map (\x -> [x]) list
combos n list = foldl (++) [] $ map (\x -> map (\y -> x:y) nxt) list
    where nxt = combos (n-1) list
In your case
combos 3 [0..9]
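One caveat: as written, combos has no case for n = 0, so combos 0 [0..9] recurses through negative n and never terminates. A possible variant (the name combos' is just for this sketch) puts the base case at 0 and uses concatMap in place of foldl (++) []:

```haskell
-- Sketch of a variant with a base case at 0, so that n <= 0 terminates.
combos' :: Int -> [a] -> [[a]]
combos' n list
  | n <= 0    = [[]]                                    -- exactly one empty combination
  | otherwise = concatMap (\x -> map (x :) rest) list   -- prepend each choice of head
  where
    rest = combos' (n - 1) list

-- In GHCi:
--   combos' 2 [0,1]    evaluates to [[0,0],[0,1],[1,0],[1,1]]
```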

Related

Haskell Listing the first 10 numbers starting from 1 which are divisible by all the numbers from 2 to 15

--for number divisible by 15 we can get it easily
take 10 [x | x <- [1..] , x `mod` 15 == 0 ]
--but for all how do I use the all option
take 10 [x | x <- [1..] , x `mod` [2..15] == 0 ]
take 10 [x | x <- [1..] , all x `mod` [2..15] == 0 ]
I want to understand how to use all in this particular case.
I have read Haskell documentation but I am new to this language coming from Python so I am unable to figure the logic.
First you can write a function that checks whether a number is divisible by all of [2..15].
modByNumbers x ns = all (\n -> x `mod` n == 0) ns
Then you can use it like the mod function:
take 10 [x | x <- [1..] , x `modByNumbers` [2..15] ]
Alternatively, using math, we know that the smallest number divisible by all numbers up to n is the product of all of the prime numbers x up to n, each raised to the floor of the result of logBase x n.
A basic isPrime function:
isPrime n = length [ x | x <- [2..n], n `mod` x == 0] == 1
Using that to get all of the primes less than 15:
p = [fromIntegral x :: Float | x <- [2..15], isPrime x]
-- [2.0,3.0,5.0,7.0,11.0,13.0]
Now we can get the exponents:
e = [fromIntegral (floor $ logBase x 15) :: Float | x <- p]
-- [3.0,2.0,1.0,1.0,1.0,1.0]
If we zip these together.
z = zipWith (**) p e
-- [8.0,9.0,5.0,7.0,11.0,13.0]
And then find the product of these we get the smallest number divisible by all numbers between 2 and 15.
smallest = product z
-- 360360.0
And now to get the rest we just need to multiply that by the numbers from 1 to 15.
map round $ take 10 [smallest * x | x <- [1..15]]
-- [360360,720720,1081080,1441440,1801800,2162160,2522520,2882880,3243240,3603600]
This has the advantage of running substantially faster.
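If you would rather avoid floating-point arithmetic altogether, the same smallest number is the least common multiple of 2..15, which can be computed by folding the Prelude's lcm over the list. This is a different technique from the prime-power calculation above, sketched here with made-up names:

```haskell
-- Least common multiple of 2..15, using exact Integer arithmetic.
smallestMultiple :: Integer
smallestMultiple = foldr lcm 1 [2 .. 15]

-- Every number divisible by all of 2..15 is a multiple of that value.
firstTen :: [Integer]
firstTen = take 10 [smallestMultiple * k | k <- [1 ..]]

-- In GHCi:
--   smallestMultiple    evaluates to 360360
```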
Decompose the problem.
You already know how to take the first 10 elements of a list, so set that aside and forget about it. There are infinitely many numbers divisible by all of [2,15]; your remaining task is to list them all.
There are infinitely many natural numbers (unconstrained), and you already know how to list them all ([1..]), so your remaining task is to transform that list into the "sub-list" whose elements are divisible by all of [2,15].
You already know how to transform a list into the "sub-list" satisfying some constraint (predicate :: X -> Bool). You're using a list comprehension in your posted code, but I think the rest of this is going to be easier if you use filter instead. Either way, your remaining task is to represent "is divisible by all of [2,15]" as a predicate.
You already know how to check if a number x is divisible by another number y. Now for something new: you want to abstract that as a predicate on x, and you want to parameterize that predicate by y. I'm sure you could get this part on your own if asked:
divisibleBy :: Int -> (Int -> Bool)
divisibleBy y x = 0 == (x `mod` y)
You already know how to represent [2,15] as [2..15]; we can turn that into a list of predicates using fmap divisibleBy. (Or map, worry about that difference tomorrow.) Your remaining task is to turn a list of predicates into a predicate.
You have a couple of options, but you already found all :: (a -> Bool) -> [a] -> Bool, so I'll suggest all ($ x).
Once you've put all these pieces together into something that works, you'll probably be able to boil it back down into something that looks a little bit like what you first wrote.
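For reference, one way the pieces above might fit together is sketched below. The names divisibleByAll and firstTenDivisible are invented for this example; divisibleBy is the function from the earlier step.

```haskell
divisibleBy :: Int -> (Int -> Bool)
divisibleBy y x = 0 == (x `mod` y)

-- Turn a list of divisors into a single predicate: build one predicate per
-- divisor with map, then apply each of them to x with ($ x).
divisibleByAll :: [Int] -> Int -> Bool
divisibleByAll ys x = all ($ x) (map divisibleBy ys)

-- Filter the naturals with that predicate and keep the first ten.
firstTenDivisible :: [Int]
firstTenDivisible = take 10 (filter (divisibleByAll [2 .. 15]) [1 ..])

-- In GHCi:
--   head firstTenDivisible    evaluates to 360360
```

This tests every candidate in turn, so it is simple rather than fast.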

Exponentiation using list comprehension

I'm trying to solve the following exercise (I'm learning Haskell):
Define x^n using a list comprehension.
And I'm struggling to find a solution.
Using recursion or fold, the solution is not complicated (for instance, foldr (*) 1 [x | c <- [1..n]]). However, using only list comprehension it gets difficult (at least for me).
In order to solve the problem, I'm trying to create a list of x^n elements and then get the length. Generating a list of x*n elements is easy, but I fail to generate a list of x^n elements.
ppower x n = length [1 | p <- [1..x], c <- [1..n]]
returns a list of x*n elements giving a wrong result. Any ideas on this will be appreciated.
A naturally-occurring exponential comes from sequence:
length (sequence [[1..x] | _ <- [1..n]])
If you haven't seen sequence yet, it's quite a general function but
when used with lists it works like:
sequence [xs1, ... , xsk] = [[x1, ... xk] | x1 <- xs1, ... , xk <- xsk]
But this is really cheating since sequence is defined recursively.
If you want to use nothing but length and list comprehensions I think
it might be impossible. The rest of this answer will be sketchy and I half
expect someone to prove me wrong. However:
We'll try to prove that such an expression can only compute values up
to some finite power of x or n, and therefore can't compute values
as big as x^n for arbitrary x and n.
Specifically we show by induction on the structure of expressions that
any expression expr has an upper bound ub(expr, m) = m^k where m
is the maximum of the free variables it uses, and k is a known finite
power which we could calculate from the structure of the expression expr.
(When we look at the whole expression, m will be max x n.)
Our upper bounds on list expressions will be bounds on both the length of the list and also bounds on any of
its elements (and lengths of its elements, etc.).
For example if we have [x..y] and we know that x <= m and y <= m, we
know that all the elements are <= m and the length is also <= m.
So we have ub([x..y], m) = m^1.
The tricky case is the list comprehension:
[eleft | x1 <- e1, ... , xk <- ek]
The result will have length equal to length e1 * ... * length ek, so
an upper bound for it would be the product of the upper bounds for
e1 to ek, or if m^i is the maximum of these then an upper bound
would be (m^i)^k = m^(i*k).
To get a bound on the elements, suppose expression eleft has ub(eleft, m') = m'^j. It can use x1
... xk. If m^i is an upper bound for these, as above, we need to
take m' = m^i and so ub(eleft, m) = (m^i)^j = m^(i*j)
As a conservative upper bound for the whole list comprehension e we
could take ub(e, m) = m^(i*j*k).
I should really also work through cases for pattern matching
(shouldn't be a problem because the parts matched are smaller than
what we already had), let definitions and functions (but we banned
recursion, so we can just fully expand these before we start), and
list literals like [x,37,x,x,n] (we can throw their lengths
into m as initially-available values).
If infinite lists like [x..] or [x,y..] are allowed they would need some
thinking about. We can construct head and filter, which means we can get
from an infinite list to its first element matching a predicate, and that looks suspiciously like a way to get recursive functions. I don't
think it's a problem since 1. they are only arithmetic sequences and
2. we'll have to construct any numbers we want to use in the
predicate. But I'm not certain here.
As @n.m suggested, I asked Richard Bird (author of the book "Introduction to functional programming", first edition, the book where I got the exercise) for an answer/guidance in solving this exercise. He kindly replied and here I post the answer he gave me:
Since a list comprehension returns a list not a number, x^n cannot be
defined as an instance of a list comprehension. Your solution x^n =
product [x | c <- [1..n]] is the correct one.
So, I guess I'll stick to the solution I posted (and discarded for using recursion):
foldr (*) 1 [x | c <- [1..n]]
He didn't say anything about creating a list of x^n elements with list comprehensions (no recursion), though as @David Fletcher and @n.m point out in their comments, it might be impossible.
Maybe you can do it as follows:
pow :: Int -> Int -> Int
pow 0 _ = 1
pow 1 x = x
pow n x = length [1 | y <- [1..x], z <- [1..pow (n-1) x]]
so pow 3 2 would return 8

Enumerate All Finite Sequences of Integers?

I want to write a Haskell list comprehension to enumerate all finite sequences of integers.
I'm pretty sure that this set is countable.
This is what I have so far:
enumIntSeqs = [ (x, [ ( x, [0..x] ) | x <- [ x | x <- [0..x] ] ] ) | x <- [0..] ]
Another idea I have is to somehow list every finite path in the infinite array
Z* X Z* where Z* = {0, 1, -1, 2, -2,...}
This is, indeed, possible. But it is not easy. Imagine you have an enumeration of all integers, an enumeration of all pairs of integers, an enumeration of all triples of integers, etc. Then you need to choose "fairly" from those enumerations to be sure to hit each element of each. A similar problem will arise when you try even to enumerate all pairs of integers. I suggest you start with that problem, and then look into something like Control.Monad.Omega, or perhaps even Control.Monad.Logic.
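As a starting point for the suggested sub-problem, here is a minimal sketch of a "fair" enumeration of pairs using only list comprehensions. It walks the diagonals of the grid, so every pair shows up after finitely many steps. All names here are made up for the example, and the mapping from naturals to integers is just one possible choice.

```haskell
-- All pairs of naturals, diagonal by diagonal: every pair with x + y == s
-- is listed before any pair with a larger sum.
pairsOfNaturals :: [(Integer, Integer)]
pairsOfNaturals = [(x, s - x) | s <- [0 ..], x <- [0 .. s]]

-- Assumed bijection from naturals to integers: 0, 1, -1, 2, -2, ...
fromNatural :: Integer -> Integer
fromNatural k
  | odd k     = (k + 1) `div` 2
  | otherwise = negate (k `div` 2)

-- All pairs of integers, obtained by relabelling the pairs of naturals.
pairsOfIntegers :: [(Integer, Integer)]
pairsOfIntegers = [(fromNatural x, fromNatural y) | (x, y) <- pairsOfNaturals]

-- In GHCi:
--   take 6 pairsOfNaturals    evaluates to [(0,0),(0,1),(1,0),(0,2),(1,1),(2,0)]
```

Compare this with [(x, y) | x <- [0..], y <- [0..]], which never gets past x = 0.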
I am not going to spoil your fun by attempting a full answer, so let me just demonstrate a handful of things through the simplified problem of enumerating all finite, non-empty, sequences of contiguous naturals starting from zero -- something that you seem close to achieving on your own already. The key steps are already amidst your enumIntSeqs, but you don't have to nest list comprehensions like that. If you begin with...
[ {- etc. -} | x <- [0..] ]
... you can generate a new list for each x simply by doing...
[ {- etc. -} | x <- [0..], let ys = [0..x] ]
... and then returning those lists:
[ ys | x <- [0..], let ys = [0..x] ]
(Note that I didn't write ys <- [0..x]. Try to predict what would happen in that case, and then check it in GHCi.)
The separate let definition isn't necessary, nor does it add anything in terms of clarity in this simple comprehension, so we can just write:
[ [0..x] | x <- [0..] ]
And that's it.
Prelude> take 4 $ [ [0..x] | x <- [0..] ]
[[0],[0,1],[0,1,2],[0,1,2,3]]
P.S.: Two other ways of writing the enumeration. Using do-notation...
someIntSeqs = do
    x <- [0..]
    return [0..x]
... and with a humble fmap (which in this case is the same as map):
Prelude> take 4 $ fmap (\x -> [0..x]) [0..]
[[0],[0,1],[0,1,2],[0,1,2,3]]
Prelude> -- Or, equivalently...
Prelude> take 4 $ (\x -> [0..x]) <$> [0..]
[[0],[0,1],[0,1,2],[0,1,2,3]]
I don't know if this is what you want, because your question wasn't very clear, but given the example that you provided [0,1,-1,-2,2..], I'm going to give a few examples of how to produce a sorted list of integers, and change it according to your needs.
numberline :: Integer -> [Integer]
numberline n = line (-abs n) (abs n) []
  where
    line 0 0 l = (0:l)
    line r s t = r : line (r+1) (s-1) (s:t)
What it does is to ensure that there is a positive and negative number, with the use of -abs and abs as inputs, so that the line function returns them in that order. Then, you only need to map this function to a list of an arbitrary size to get a list of lists with integers ranging from a negative number to its symmetric end.
map numberline [1..2] -- [[-1,0,1],[-2,-1,0,1,2]]
map numberline [1..] -- [[-1,0,1],[-2,-1,0,1,2],[-3,-2,-1,0,1,2,3]..
So how do you change the above piece of code to generate an alternating path between integers as pairs of natural numbers in some array without repeating them?
countable :: Integer -> [Integer]
countable n = count (-abs n) (abs n) []
  where
    count 0 0 l = (0:l)
    count r s t = count (r+1) (s-1) (s:r:t)
The end result being [0,1,-1,2,-2,3,-3] for countable 3. It's not very efficient for big numbers, so you may want to consider a list starting from the argument down to 0 instead of the other way around.
I hope that I helped.

Dovetail iteration over infinite lists in Haskell

I want to iterate 2 (or 3) infinite lists and find the "smallest" pair that satisfies a condition, like so:
until pred [(a,b,c) | a<-as, b<-bs, c<-cs]
where pred (a,b,c) = a*a + b*b == c*c
as = [1..]
bs = [1..]
cs = [1..]
The above wouldn't get very far, as a == b == 1 throughout the run of the program.
Is there a nice way to dovetail the problem, e.g. build the infinite sequence [(1,1,1),(1,2,1),(2,1,1),(2,1,2),(2,2,1),(2,2,2),(2,2,3),(2,3,2),..] ?
Bonus: is it possible to generalize to n-tuples?
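There's a monad for that, Omega (from the control-monad-omega package). The session below assumes that Control.Monad.Omega, Control.Applicative and Control.Monad have been imported.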
Prelude> let as = each [1..]
Prelude> let x = liftA3 (,,) as as as
Prelude> let x' = mfilter (\(a,b,c) -> a*a + b*b == c*c) x
Prelude> take 10 $ runOmega x'
[(3,4,5),(4,3,5),(6,8,10),(8,6,10),(5,12,13),(12,5,13),(9,12,15),(12,9,15),(8,15,17),(15,8,17)]
Using its applicative features, you can generalize to arbitrary tuples:
quadrupels = (,,,) <$> as <*> as <*> as <*> as -- or call it liftA4
But: this alone does not eliminate duplication, of course. It only gives you proper diagonalization. Maybe you could use monad comprehensions together with an approach like Thomas's, or just another mfilter pass (restricting to b /= c, in this case).
List comprehensions are great (and concise) ways to solve such problems. First, you know you want all combinations of (a,b,c) that might satisfy a^2 + b^2 = c^2 - a helpful observation is that (considering only positive numbers) it will always be the case that a <= c && b <= c.
To generate our list of candidates we can thus say c ranges from 1 to infinity while a and b range from one to c.
[(a,b,c) | c <- [1..], a <- [1..c], b <- [1..c]]
To get to the solution we just need to add your desired equation as a guard:
[(a,b,c) | c <- [1..], a <- [1..c], b <- [1..c], a*a+b*b == c*c]
This is inefficient, but the output is correct:
[(3,4,5),(4,3,5),(6,8,10),(8,6,10),(5,12,13),(12,5,13),(9,12,15)...
There are more principled methods than blind testing that can solve this problem.
{- It depends on what is "smallest". But here is a solution for a concept of "smallest" if tuples were compared first by their max. number and then by their total sum. (You can just copy and paste my whole answer into a file as I write the text in comments.)
We will need nub later. -}
import Data.List (nub)
{- Just for illustration: the easy case with 2-tuples. -}
-- all the two-tuples where 'snd' is 'n'
tuples n = [(i, n) | i <- [1..n]]
-- all the two-tuples where 'snd' is in '1..n'
tuplesUpTo n = concat [tuples i | i <- [1..n]]
{-
To get all results, you will need to insert the flip of each tuple into the stream. But let's do that later and generalize first.
Building tuples of arbitrary length is somewhat difficult, so we will work on lists. I call them 'kList's, if they have a length 'k'.
-}
-- just copied from the tuples case, only we need a base case for k=1 and
-- we can combine all results utilizing the list monad.
kLists 1 n = [[n]]
kLists k n = do
    rest <- kLists (k-1) n
    add <- [1..head rest]
    return (add:rest)
-- same as above. all the klists with length k and max number of n
kListsUpTo k n = concat [kLists k i | i <- [1..n]]
-- we can do that unbounded as well, creating an infinite list.
kListsInf k = concat [kLists k i | i <- [1..]]
{-
The next step is rotating these lists around, because until now the largest number is always in the last place. So we just look at all rotations to get all the results. Using nub here is admittedly awkward, you can improve that. But without it, lists where all elements are the same are repeated k times.
-}
rotate n l = let (init, end) = splitAt n l
             in end ++ init
rotations k l = nub [rotate i l | i <- [0..k-1]]
rotatedKListsInf k = concatMap (rotations k) $ kListsInf k
{- What remains is to convert these lists into tuples. This is a bit awkward, because every n-tuple is a separate type. But it's straightforward, of course. -}
kListToTuple2 [x,y] = (x,y)
kListToTuple3 [x,y,z] = (x,y,z)
kListToTuple4 [x,y,z,t] = (x,y,z,t)
kListToTuple5 [x,y,z,t,u] = (x,y,z,t,u)
kListToTuple6 [x,y,z,t,u,v] = (x,y,z,t,u,v)
{- Some tests:
*Main> take 30 . map kListToTuple2 $ rotatedKListsInf 2
[(1,1),(1,2),(2,1),(2,2),(1,3),(3,1),(2,3),(3,2),(3,3),(1,4),(4,1),(2,4),(4,2),(3,4),
(4,3),(4,4),(1,5),(5,1),(2,5),(5,2),(3,5),(5,3),(4,5),(5,4),(5,5),(1,6),(6,1),
(2,6), (6,2), (3,6)]
*Main> take 30 . map kListToTuple3 $ rotatedKListsInf 3
[(1,1,1),(1,1,2),(1,2,1),(2,1,1),(1,2,2),(2,2,1),(2,1,2),(2,2,2),(1,1,3),(1,3,1),
(3,1,1),(1,2,3),(2,3,1),(3,1,2),(2,2,3),(2,3,2),(3,2,2),(1,3,3),(3,3,1),(3,1,3),
(2,3,3),(3,3,2),(3,2,3),(3,3,3),(1,1,4),(1,4,1),(4,1,1),(1,2,4),(2,4,1),(4,1,2)]
Edit:
I realized there is a bug: Just rotating the ordered lists isn't enough of course. The solution must be somewhere along the lines of having
rest <- concat . map (rotations (k-1)) $ kLists (k-1) n
in kLists, but then some issues with repeated outputs arise. You can figure that out, I guess. ;-)
-}
It really depends on what you mean by "smallest", but I assume you want to find a tuple of numbers with respect to its maximal element - so (2,2) is less than (1,3) (while standard Haskell ordering is lexicographic).
There is a package, data-ordlist, which is aimed precisely at working with ordered lists. Its function mergeAll (and mergeAllBy) allows you to combine a 2-dimensional matrix ordered in each direction into an ordered list.
First let's create a desired comparing function on tuples:
import Data.List (find)
import Data.List.Ordered
compare2 :: (Ord a) => (a, a) -> (a, a) -> Ordering
compare2 x y = compare (max2 x, x) (max2 y, y)
  where
    max2 :: Ord a => (a, a) -> a
    max2 (x, y) = max x y
Then using mergeAll we create a function that takes a comparator, a combining function (which must be monotonic in both arguments) and two sorted lists. It combines all possible elements from the two lists using the function and produces a result sorted list:
mergeWith :: (b -> b -> Ordering) -> (a -> a -> b) -> [a] -> [a] -> [b]
mergeWith cmp f xs ys = mergeAllBy cmp $ map (\x -> map (f x) xs) ys
With this function, it's very simple to produce tuples ordered according to their maximum:
incPairs :: [(Int,Int)]
incPairs = mergeWith compare2 (,) [1..] [1..]
Its first 10 elements are:
> take 10 incPairs
[(1,1),(1,2),(2,1),(2,2),(1,3),(2,3),(3,1),(3,2),(3,3),(1,4)]
and when we (for example) look for the first pair whose sum of squares is equal to 65:
find (\(x,y) -> x^2+y^2 == 65) incPairs
we get the correct result (4,7) (as opposed to (1,8) if lexicographic ordering were used).
This answer is for the more general problem of an unknown predicate. If the predicate is known, more efficient solutions are possible; others have listed solutions based on the knowledge that you don't need to iterate over all Ints for a given c.
When dealing with infinite lists, you need to perform a breadth-first search for the solution. A list comprehension only affords depth-first search, which is why you never arrive at a solution in your original code.
counters 0 xs = [[]]
counters n xs = concat $ foldr f [] gens where
  gens = [[x:t | t <- counters (n-1) xs] | x <- xs]
  f ys n = cat ys ([]:n)
  cat (y:ys) (x:xs) = (y:x): cat ys xs
  cat [] xs = xs
  cat xs [] = [xs]
main = print $ take 10 $ filter p $ counters 3 [1..] where
  p [a,b,c] = a*a + b*b == c*c
counters generates all possible counters for values from the specified range of digits, including an infinite range.
First, we obtain a list of generators of valid combinations of counters - for each permitted digit, combine it with all permitted combinations for counters of smaller size. This may result in a generator that produces an infinite number of combinations. So, we need to borrow from each generator evenly.
So gens is a list of generators. Think of this as a list of all counters starting with one digit: gens !! 0 is a list of all counters starting with 1, gens !! 1 is a list of all counters starting with 2, etc.
In order to borrow from each generator evenly, we could transpose the list of generators - that way we would get a list of first elements of the generators, followed by a list of second elements of the generators, etc.
Since the list of generators may be infinite, we cannot afford to transpose the list of generators, because we may never get to look at the second element of any generator (for an infinite number of digits we'd have an infinite number of generators). So, we enumerate the elements from the generators "diagonally": take the first element from the first generator; then take the second element from the first generator and the first from the second generator; then take the third element from the first generator, the second from the second, and the first element from the third generator, etc. This can be done by folding the list of generators with a function f, which zips together two lists (one list is the generator, the other is the already-zipped generators), the beginning of one of them being offset by one step by adding []: to the head. This is almost zipWith (:) ys ([]:n); the difference is that if n or ys is shorter than the other one, we don't drop the remainder of the other list. Note that folding with zipWith (:) ys n would be a transpose.
For this answer I will take "smallest" to refer to the sum of the numbers in the tuple.
To list all possible pairs in order, you can first list all of the pairs with a sum of 2, then all pairs with a sum of 3 and so on. In code
pairsWithSum n = [(i, n-i) | i <- [1..n-1]]
xs = concatMap pairsWithSum [2..]
Haskell doesn't have facilities for dealing with n-tuples without using Template Haskell, so to generalize this you will have to switch to lists.
ntuplesWithSum 1 s = [[s]]
ntuplesWithSum n s = concatMap (\i -> map (i:) (ntuplesWithSum (n-1) (s-i))) [1..s-n+1]
nums n = concatMap (ntuplesWithSum n) [n..]
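As a usage sketch, the definitions above can be fed straight into a filter for the question's Pythagorean condition. The functions ntuplesWithSum and nums are repeated from above with type signatures added; isPythagorean and firstTriples are names invented for this example.

```haskell
ntuplesWithSum :: Int -> Int -> [[Int]]
ntuplesWithSum 1 s = [[s]]
ntuplesWithSum n s =
  concatMap (\i -> map (i :) (ntuplesWithSum (n - 1) (s - i))) [1 .. s - n + 1]

-- All n-element lists of positive integers, in order of increasing sum.
nums :: Int -> [[Int]]
nums n = concatMap (ntuplesWithSum n) [n ..]

-- The predicate from the question, on three-element lists.
isPythagorean :: [Int] -> Bool
isPythagorean [a, b, c] = a * a + b * b == c * c
isPythagorean _         = False

firstTriples :: [[Int]]
firstTriples = take 2 (filter isPythagorean (nums 3))

-- In GHCi:
--   firstTriples    evaluates to [[3,4,5],[4,3,5]]
```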
Here's another solution, with probably another slightly different idea of "smallest". My order is just "all tuples with max element N come before all tuples with max element N+1". I wrote the versions for pairs and triples:
gen2_step :: Int -> [(Int, Int)]
gen2_step s = [(x, y) | x <- [1..s], y <- [1..s], (x == s || y == s)]
gen2 :: Int -> [(Int, Int)]
gen2 n = concatMap gen2_step [1..n]
gen2inf :: [(Int, Int)]
gen2inf = concatMap gen2_step [1..]
gen3_step :: Int -> [(Int, Int, Int)]
gen3_step s = [(x, y, z) | x <- [1..s], y <- [1..s], z <- [1..s], (x == s || y == s || z == s)]
gen3 :: Int -> [(Int, Int, Int)]
gen3 n = concatMap gen3_step [1..n]
gen3inf :: [(Int, Int, Int)]
gen3inf = concatMap gen3_step [1..]
You can't really generalize it to N-tuples, though as long as you stay homogeneous, you may be able to generalize it if you use arrays. But I don't want to tie my brain into that knot.
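If lists are acceptable in place of tuples, the same ordering can be sketched for any length k with replicateM. The names genStep and genInf are invented here, and this produces lists rather than the arrays mentioned above.

```haskell
import Control.Monad (replicateM)

-- All length-k lists over [1..s] whose maximum element is exactly s.
genStep :: Int -> Int -> [[Int]]
genStep k s = filter (elem s) (replicateM k [1 .. s])

-- All length-k lists of positive integers, grouped by maximum element.
genInf :: Int -> [[Int]]
genInf k = concatMap (genStep k) [1 ..]

-- In GHCi:
--   take 4 (genInf 2)    evaluates to [[1,1],[1,2],[2,1],[2,2]]
```

Each step builds all s^k candidates and discards most of them, so this is a sketch of the ordering rather than an efficient generator.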
I think this is the simplest solution if "smallest" is defined as x+y+z because after you find your first solution in the space of Integral valued pythagorean triangles, your next solutions from the infinite list are bigger.
take 1 [(x,y,z) | y <- [1..], x <- [1..y], z <- [1..x], z*z + x*x == y*y]
-> [(4,5,3)]
It has the nice property that it returns each symmetrically unique solution only once. x and z are also unbounded, because their ranges grow with y.
By contrast, the following does not work, because the innermost sequence (for x) never finishes, so y never advances past its first value, not to mention z. The rightmost generator is the innermost loop.
take 1 [(z,y,x)|z <- [1..],y <- [1..],x <- [1..],x*x + y*y == z*z]
Sorry, it's been quite a while since I did Haskell, so I'm going to describe it in words.
As I pointed out in my comment, it is not possible to find the smallest anything in an infinite list, since there could always be a smaller one.
What you can do is have a stream-based approach that takes the lists and returns a list with only 'valid' elements, i.e. those where the condition is met. Let's call this function triangle.
You can then compute the triangle list to some extent with take n (triangle ...), and from these n elements you can find the minimum.
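A minimal sketch of that idea, under some assumptions: here triangle is a plain infinite list rather than a function taking lists as arguments, "smallest" is taken to mean smallest perimeter, and smallestOf is a hypothetical helper name.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Lazy, infinite stream of Pythagorean triples (x, y, z) with x <= y < z.
-- Only the outermost generator is infinite, so every triple is eventually reached.
triangle :: [(Int, Int, Int)]
triangle = [ (x, y, z) | z <- [1..], y <- [1..z], x <- [1..y], x*x + y*y == z*z ]

-- Smallest perimeter among the first n triples (n must be at least 1).
smallestOf :: Int -> (Int, Int, Int)
smallestOf n = minimumBy (comparing perimeter) (take n triangle)
  where perimeter (x, y, z) = x + y + z
```

Here take 3 triangle yields [(3,4,5),(6,8,10),(5,12,13)], and smallestOf 3 picks (3,4,5).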

My solution for Euler Project #3 is too slow

I'm new to Haskell and tinkering around with the Euler Project problems. My solution for problem #3 is far too slow. At first I tried this:
-- Problem 3
-- The prime factors of 13195 are 5, 7, 13 and 29.
-- What is the largest prime factor of the number 600851475143 ?
problem3 = max [ x | x <- [1..n], (mod n x) == 0, n /= x]
  where n = 600851475143
Then I changed it to return all x and not just the largest one.
problem3 = [ x | x <- [1..n], (mod n x) == 0, n /= x]
  where n = 600851475143
After 30 minutes, the list is still being processed and the output looks like this
[1,71,839,1471,6857,59569,104441,486847,1234169,5753023,10086647,87625999,408464633,716151937
Why is it so slow? Am I doing something terribly wrong or is it normal for this sort of task?
With your solution, there are about 600 billion candidate numbers. As noted by delnan, making each individual check quicker is not going to make much difference; we must limit the number of candidates.
Your solution does not seem to be correct either. 59569 = 71 * 839, isn't it? The question only asks for prime factors. Notice that 71 and 839 are in your list, so you are doing something right. In fact, you are finding all factors.
I think you get the most dramatic effect simply by dividing away each factor before continuing:
euler3 = go 2 600851475143
  where
    go cand num
      | cand == num           = [num]
      | cand `isFactorOf` num = cand : go cand (num `div` cand)
      | otherwise             = go (cand + 1) num
    isFactorOf a b = b `mod` a == 0
This may seem like an obvious optimization, but it relies on the fact that if both a and b divide c and a is coprime to b, then a divides c/b.
If you want to do more, the common "Only check until the square root" trick has been
mentioned here. The same trick can be applied to this problem, but the performance gain does not show, unfortunately, on this instance:
euler3 = go 2 600851475143
  where
    go cand num
      | cand*cand > num       = [num]
      | cand `isFactorOf` num = cand : go cand (num `div` cand)
      | otherwise             = go (cand + 1) num
    isFactorOf a b = b `mod` a == 0
Here, when a candidate is larger than the square root of the remaining number (num), we know that num must be a prime and therefore a prime factor of the original
number (600851475143).
It is possible to remove even more candidates by only considering prime numbers,
but this is slightly more advanced because you need to make a reasonably performant
way of generating primes. See this page for ways of doing that.
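As a rough illustration of that last idea, here is a sketch that draws candidates from a simple trial-division prime list. The names primes and primeFactors are hypothetical, and this naive generator is an assumption made for brevity, not the technique from the linked page.

```haskell
-- Naive infinite list of primes: each odd candidate is tested only
-- against the primes up to its square root.
primes :: [Integer]
primes = 2 : filter isPrime [3,5..]
  where
    isPrime n = all (\p -> n `mod` p /= 0) (takeWhile (\p -> p*p <= n) primes)

-- Prime factorization (with multiplicity) using only prime candidates.
primeFactors :: Integer -> [Integer]
primeFactors = go primes
  where
    go _ 1 = []
    go [] num = [num]               -- unreachable: primes is infinite
    go ps@(p:rest) num
      | p*p > num        = [num]    -- what is left must itself be prime
      | num `mod` p == 0 = p : go ps (num `div` p)
      | otherwise        = go rest num
```

With these definitions, primeFactors 13195 gives [5,7,13,29] and primeFactors 600851475143 gives [71,839,1471,6857].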
It's doing a lot of work! (It's also going to give you the wrong answer, but that's a separate issue!)
There are a few very quick ways you could speed it up by thinking about the problem a little first:
You are applying your function over all numbers 1..n, and checking each one of them to ensure it isn't n. Instead, you could just go over all numbers 1..n-1 and skip out n different checks (small though they are).
The answer is odd, so you can very quickly filter out any even numbers by going over x in 1..(n-1)/2 and checking 2x-1 instead of x.
If you think about it, all factors occur in pairs, so you can in fact just search from 1..sqrt(n) (or 1..sqrt(n)/2 if you ignore even numbers) and output pairs of numbers in each step.
Not related to the performance of this function, but it's worth noting that what you've implemented here will find all of the factors of a number, whereas what you want is only the largest prime factor. So either you have to test each of your divisors for primality (which is going to be slow, again) or you can implement the two in one step. You probably want to look at 'sieves', the most simple being the Sieve of Eratosthenes, and how you can implement them.
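The observation that factors occur in pairs could be sketched roughly as follows. The function name divisors is hypothetical; it returns all divisors, not only the prime ones, so it illustrates the square-root bound rather than solving the problem.

```haskell
-- All divisors of n, found by scanning only up to sqrt n and emitting
-- each small divisor d together with its partner n `div` d.
divisors :: Integer -> [Integer]
divisors n =
  concat [ pair d | d <- takeWhile (\d -> d*d <= n) [1..], n `mod` d == 0 ]
  where
    pair d
      | d * d == n = [d]              -- perfect square: do not list d twice
      | otherwise  = [d, n `div` d]
```

For instance, divisors 36 produces [1,36,2,18,3,12,4,9,6] after testing only the candidates 1 to 6.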
A complete factorization of a number can take a long time for big numbers. For Project Euler problems, a brute force solution (which this is) is usually not enough to find the answer in your lifetime.
Hint: you do not need to find all prime factors, just the biggest one.
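One possible reading of that hint, as a sketch: divide out each factor as soon as it is found and keep only whatever remains at the end. The name biggestPrimeFactor is hypothetical, and the function assumes its argument is at least 2.

```haskell
-- Largest prime factor of n (for n >= 2): strip factors from the bottom up;
-- once d*d exceeds the remainder, the remainder is the largest prime factor.
biggestPrimeFactor :: Integer -> Integer
biggestPrimeFactor = go 2
  where
    go d n
      | d * d > n      = n
      | n `mod` d == 0 = go d (n `div` d)
      | otherwise      = go (d + 1) n
```

On the example from the problem statement, biggestPrimeFactor 13195 evaluates to 29.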
TL;DR: The two things you were doing non-optimally are not stopping at the square root, and not dividing out each smallest factor as it is found.
Here's a little derivation of the (2nd) factorization code shown in the answer by HaskellElephant. We start with your code:
f1 n = [ x | x <- [2..n], rem n x == 0]
n3 = 600851475143
Prelude> f1 n3
[71,839,1471,6857,59569,104441,486847Interrupted.
So it doesn't finish in any reasonable amount of time, and some of the numbers it produces are not prime... But instead of adding primality check to the list comprehension, let's notice that 71 is prime. The first number produced by f1 n is the smallest divisor of n, and thus it is prime. If it weren't, we'd find its smallest divisor first - a contradiction.
So, we can divide it out, and continue searching for the prime factors of newly reduced number:
f2 n = tail $ iterate (\(_,m)-> (\f->(f, quot m f)) . head $ f1 m) (1,n)
Prelude> f2 n3
[(71,8462696833),(839,10086647),(1471,6857),(6857,1),(*** Exception: Prelude.head: empty list
(the error, because f1 1 == []). We're done! (6857 is the answer, here...). Let's wrap it up:
takeUntil p xs = foldr (\x r -> if p x then [x] else x:r) [] xs
pfactors1 n = map fst . takeUntil ((==1).snd) . f2 $ n -- prime factors of n
Trying out our newly minted solution,
Prelude> map pfactors1 [n3..]
[[71,839,1471,6857],[2,2,2,3,3,1259Interrupted.
suddenly we hit a new inefficiency wall, on numbers without small divisors. But if n = a*b and 1 < a <= b, then a*a <= a*b == n and so it is enough to test only until the square root of a number, to find its smallest divisor.
f12 n = [ x | x <- takeWhile ((<= n).(^2)) [2..n], rem n x == 0] ++ [n]
f22 n = tail $ iterate (\(_,m)-> (\f->(f, quot m f)) . head $ f12 m) (1,n)
pfactors2 n = map fst . takeUntil ((==1).snd) . f22 $ n
What couldn't finish in half an hour now finishes in under one second (on a typical performant box):
Prelude> f12 n3
[71,839,1471,6857,59569,104441,486847,600851475143]
All the divisors above sqrt n3 were not needed at all. We unconditionally add n itself as the last divisor in f12 so it is able to handle prime numbers:
Prelude> f12 (n3+6)
[600851475149]
Since n3 / sqrt n3 = sqrt n3 ~= 775146, your original attempt at f1 n3 should have taken about a week to finish. That's how important this optimization is, of stopping at the square root.
Prelude> f22 n3
[(71,8462696833),(839,10086647),(1471,6857),(6857,1),(1,1),(1,1),(1,1),(1,1),(1,
1),(1,1),(1,1),(1,1),(1,1),(1,1),(1,1),(1,1),(1,1),(1,1)Interrupted
We've apparently traded the "Prelude.head: empty list" error for a non-terminating - but productive - behavior.
Lastly, we break f22 up into two parts and fuse each of them into the other functions, for somewhat simpler code. Also, we no longer start over from 2 every time we search for the smallest divisor, as f12 does:
-- smallest factor of n, starting from d. directly jump from sqrt n to n.
smf (d,n) = head $ [ (x, quot n x) | x <- takeWhile ((<=n).(^2)) [d..]
                                      , rem n x == 0] ++ [(n,1)]
pfactors n = map fst . takeUntil ((==1).snd) . tail . iterate smf $ (2,n)
This expresses guarded (co)recursion through a higher-order function iterate, and is functionally equivalent to that code mentioned above. The following now runs smoothly, and we're even able to find a pair of twin primes as a bonus there:
Prelude Saga> map pfactors [n3..]
[[71,839,1471,6857],[2,2,2,3,3,1259,6628403],[5,120170295029],[2,13,37,227,2751479],[3,7,7,11,163,2279657],[2,2,41,3663728507],[600851475149],[2,3,5,5,19,31,6800809],[600851475151],[2,2,2,2,37553217197],[3,3,3,211,105468049],[2,7,11161,3845351],[5,67,881,2035853],[2,2,3Interrupted.
Here is my solution for Project Euler #3. It takes only 1.22 sec on my MacBook Air.
First we should find all factors of the given number. But we know that even numbers can't be prime (except 2). So, to solve Project Euler #3 we need not all factors, but only the odd ones:
getOddFactors num = [ x | x <- [3,5..num], num `rem` x == 0 ]
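But we can optimize this function. If num has a factor greater than sqrt num, it must also have a matching factor smaller than sqrt num, and those candidates we have already found. Hence, we can limit our list of possible factors by sqrt num. Note that this bound drops any prime factor larger than sqrt num (for 15 it finds only 3, not 5); it works here because every prime factor of 600851475143 lies below its square root: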
getOddFactors num = [ x | x <- [3, 5..(floor.sqrt.fromIntegral) num],
                          num `rem` x == 0 ]
Next we want to know which of our odd factors of num are prime numbers:
isPrime number = [ x | x <- [3..(floor.sqrt.fromIntegral) number],
                       number `rem` x == 0] == []
Next we can filter the odd factors of num with the function isPrime to find all prime factors of num. To let Haskell's laziness optimize our solution, we apply filter isPrime to the reversed list of odd factors of num. As soon as the first prime value is found, Haskell stops computing and returns the solution:
largestPrimeFactor = head . filter isPrime . reverse . getOddFactors
Hence, the solution is:
ghci> largestPrimeFactor 600851475143
6857
(1.22 secs, 110646064 bytes)

Resources