The example I want to generalize
The following function generates every length-2 list [a,b] of integers from 1 to top with a < b.
ascendingPairs top = [ [x,y]
| x <- [1..top],
y <- [x+1..top] ]
Here it is in action:
> mapM_ (putStrLn . show) $ ascendingPairs 4
[1,2]
[1,3]
[1,4]
[2,3]
[2,4]
[3,4]
The generalization I want
ascendingPairs is defined only for the special case that the lists produced should have length 2. What if I don't know how long the output lists should be, and want that length to be an argument?
Some bad strategies
Here's a dumb way to provide for an additional length argument:
partial :: Int -> Int -> [[Int]]
partial 1 top = [ [x]
| x <- [1..top] ]
partial 2 top = [ [x,y]
| x <- [1..top],
y <- [x+1 .. top] ]
partial 3 top = [ [x,y,z]
| x <- [1..top],
y <- [x+1 .. top],
z <- [y+1 .. top] ]
partial 4 top = ...
partial 5 top = ...
...
Writing partial that way, it would take literally forever to cover every case.
Here's a way to cover all cases. (This way actually does need the list monad.)
slow :: Int -> Int -> [[Int]]
slow top size =
filter monotonicAscending $
mapM (\x -> [1..top]) [1..size]
monotonicAscending :: [Int] -> Bool
monotonicAscending (a:b:rest) = a < b
&& monotonicAscending (b:rest)
monotonicAscending _ = True
slow saves human time, but at the expense of a ton of machine time, because it generates so many lists that filter monotonicAscending immediately drops. Computing length $ slow 41 7 takes at least a minute, maybe hours (I cut it off). For my purposes I'd actually prefer partial to slow.
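Incidentally, the mapM trick in slow is the list monad's standard way of enumerating every length-size combination; it can be written more directly with replicateM from Control.Monad, which makes the combinatorial blow-up explicit (same behavior, shown here only as an aside):
import Control.Monad (replicateM)
-- Same enumeration as the mapM call in `slow`: all top^size lists of
-- length size, which is exactly why `slow` is slow.
slow' :: Int -> Int -> [[Int]]
slow' top size = filter monotonicAscending $ replicateM size [1..top]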
A solution that I hope is more complex than necessary
Eventually I did find a way, but I feel like I'm brute forcing it.
monoAscending :: Int -> Int -> [[Int]]
monoAscending top size =
map reverse $
incrementNTimes top (size-1) $ [ [a]
| a <- [1..top] ]
incrementNTimes :: Int -> Int -> [[Int]] -> [[Int]]
incrementNTimes top 0 lists = lists
incrementNTimes top n lists = let
x :: [[Int]]
x = concatMap (increments top) lists
in incrementNTimes top (n-1) x
-- | All the ways of prepending a bigger element to the input list.
-- This assumes the input list is in descending order.
increments :: Int -> [Int] -> [[Int]]
increments top (a:as) = [ b:a:as
| b <- [a+1 .. top]]
It works:
> mapM_ (putStrLn . show) $ monoAscending 4 3
[1,2,3]
[1,2,4]
[1,3,4]
[2,3,4]
But is there a better way?
I recommend directly extending your very first idea, for partial. Just... use recursion!
notPartial 0 bot top = [[]]
notPartial n bot top = [ x:xs
| x <- [bot..top]
, xs <- notPartial (n-1) (x+1) top
]
Then you can make an alias that fixes bot if you like.
monoAscending n top = notPartial n 1 top
Try it out:
> monoAscending 3 5
[[1,2,3],[1,2,4],[1,2,5],[1,3,4],[1,3,5],[1,4,5],[2,3,4],[2,3,5],[2,4,5],[3,4,5]]
Your solution suffers from three resolvable complexities.
1. You have two base cases (the last argument to incrementNTimes in monoAscending is a base case, as well as the base case in incrementNTimes).
2. You're passing the result as an argument rather than simply using it as a result (almost like you've prematurely tail-call optimized).
3. You're messing around with reversing.
Fortunately, all three of these are simplifiable!
First, let's start by having just one base case:
mkLists :: Int -> Int -> [[Int]]
mkLists _top 0 = [[]]
mkLists top size = ...
There is a single possible list of zero length, which is the empty list, so we return the list containing the empty list.
Now, let's use the result of the recursive call without tail-call optimizing:
mkLists top size = concatMap increments $ mkLists top (size - 1)
where
increments ...
There's no need to pass the recursive call as an argument anywhere if we can manipulate it directly with concatMap increments.
Lastly, let's write a version of increments that adds small elements to the beginning of the list rather than big ones so that we don't need to reverse the lists at the end.
where
increments [] = pure <$> [1..top]
increments xs@(x:_) = (:xs) <$> [1..(x-1)]
When increments gets the empty list, it presumably means the recursive call was for size=0, so we produce all of the singleton lists from 1..top. For all other cases, we want to cons a smaller element on the front of the given list.
In total, the code is:
mkLists :: Int -> Int -> [[Int]]
mkLists _top 0 = [[]]
mkLists top size = concatMap increments $ mkLists top (size-1)
where
increments [] = pure <$> [1..top]
increments xs@(x:_) = (:xs) <$> [1..(x-1)]
From here, if you want, you can code golf this down even more. The helper function increments can be written as a one-liner (increments xs = (:xs) <$> [1..(maybe top (subtract 1) $ headMay xs)], where headMay comes from the safe package), and the recursion itself can be rewritten as a fold.
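For instance, the fold version might look like this (a sketch of one possible spelling, not the only one):
-- The fold just applies `concatMap increments` size times, starting from
-- the single empty list; equivalently: iterate (concatMap increments) [[]] !! size.
mkLists' :: Int -> Int -> [[Int]]
mkLists' top size = foldl (\acc _ -> concatMap increments acc) [[]] [1..size]
  where
    increments [] = pure <$> [1..top]
    increments xs@(x:_) = (:xs) <$> [1..(x-1)]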
I'm using Project Euler to learn Haskell. I'm new at Haskell and am having a lot of trouble coming up with an algorithm that doesn't take an absurd amount of time. I'm estimating that the program here would take 14 gigayears to arrive at the solution.
The problem:
Which prime, below one-million, can be written as the sum of the most
consecutive primes?
Here's my source. I've left out isPrime. I've posted it because it's far too inefficient to solve the problem. I think the issue lies with the slicedChains and primeChain calls, but I'm not sure what it is. I've solved this before in C++, but for whatever reason the efficient solution seems beyond me in Haskell.
Edit: I've included isPrime.
import System.Environment (getArgs)
import Data.List (nub,maximumBy)
import Data.Ord (comparing)
isPrime :: Integer -> Bool
isPrime 1 = False
isPrime 2 = True
isPrime x
| any (== 0) (fmap (x `mod`) [2..x-1]) = False
| otherwise = True
primeChain :: Integer -> [Integer]
primeChain x = [ n | n <- 1 : 2 : [3,5..x-1], isPrime n ]
slice :: [a] -> [Int] -> [a]
slice xs args = take (to - from + 1) (drop from xs)
where from = head args
to = last args
subsequencesOfSize :: Int -> [a] -> [[a]]
subsequencesOfSize n xs = let l = length xs
in if n>l then [] else subsequencesBySize xs !! (l-n)
where
subsequencesBySize [] = [[[]]]
subsequencesBySize (x:xs) = let next = subsequencesBySize xs
in zipWith (++) ([]:next) (map (map (x:)) next ++ [[]])
slicedChains :: Int -> [Integer] -> [[Integer]]
slicedChains len xs = nub [x | x <- fmap (xs `slice`) subseqs, length x > 1]
where subseqs = [x | x <- (subsequencesOfSize 2 [1..len]), (last x) > (head x)]
primeSums :: Integer -> [[Integer]]
primeSums x = filter (\ns -> sum ns == x) chain
where xs = primeChain x
len = length xs
chain = slicedChains len xs
compLength :: [[a]] -> [a]
compLength xs = maximumBy (comparing length) xs
cleanSums :: [Integer] -> [[Integer]]
cleanSums xs = fmap (compLength) filtered
where filtered = filter (not . null) (fmap primeSums xs)
main :: IO()
main = do
args <- getArgs
let arg = read (head args) :: Integer
let xs = primeChain arg
print $ maximumBy (comparing length) $ cleanSums xs
Your basic problem is that you are not pruning your search space based on the best solution you have found so far.
I can tell this just from the fact that you are using maximumBy to find the longest sequence.
For instance, if during your search you find a consecutive sequence of 4 primes whose sum is a prime < 10^6, you don't have to examine any sequence which begins with a prime greater than 250000.
To do this kind of pruning you have to keep track of the solution found so far and interleave the testing of candidate sequences with their generation so that the best solution found so far can stop the search early.
Update
There are several inefficiencies in slicedChains. Haskell lists are implemented as linked lists. This video is a pretty good overview of linked lists and how they differ from arrays: (link)
The following expressions in your code are going to be problematic w.r.t. efficiency:
* nub has quadratic running time
* length x > 1 - the complexity of length is O(n) where n is the length of the list. A better way to write this is:
lengthGreaterThan1 :: [a] -> Bool
lengthGreaterThan1 (_:_:_) = True
lengthGreaterThan1 _ = False
* subsequencesOfSize 2 [1..len] may be more succinctly written:
[ [a,b] | a <- [1..len], b <- [a+1..len] ]
and this will also ensure that a < b.
* The take and drop calls in slice are also O(n)
* In primeSums the call to primeChain will regenerate essentially the same list over and over again resulting in a lot of multiple calls to isPrime. A better approach is to define primeChain like this:
allPrimes = filter isPrime [1..]
primeChain x = takeWhile (<= x) allPrimes
The list allPrimes will be generated once, and primeChain simply takes prefixes of that list.
* primeSums x is charged with finding sequences whose sum is exactly x, but it looks at a lot of sequences that can't possibly work. For instance, primeSums 31 will examine:
11 + 13 + 17, 11 + 13 + 17 + 23, 11 + 13 + 17 + 23 + 29,
17 + 19, 17 + 19 + 23, 17 + 19 + 23 + 29,
19 + 23, 19 + 23 + 29
23 + 29
even though it's pretty obvious that none of these sums could equal 31.
So the first thing you need is a good data structure. Once you find a sequence of length n you don't care about sequences of shorter length, so your primary needs are: (1) tracking the sum, (2) tracking the primes in the set, (3) removing the least element, (4) adding a new greatest element. The key is amortization, where a big cost is paid infrequently enough that you can pretend it is a small cost per operation. The data structure looks like this:
data Queue x = Q [x] [x]
q_empty (Q [] []) = True
q_empty _ = False
q_headtails (Q (x:xs) rest) = (x, Q xs rest)
q_headtails (Q [] xs) = case reverse xs of
                          y:ys -> (y, Q ys [])
                          []   -> error "End of queue."
q_append el (Q beg end) = Q beg (el:end)
So deconstructing the list is possible, but it sometimes triggers an O(n) reverse. That's OK, because when it does, we won't have to do it again for another n steps, so the cost averages out to one operation per step. (You might also want to do it with a spine-strict list.)
To save on length operations and summing the items of the list you probably want to cache those, too:
type Length = Int
type Sum = Int
type Prime = Int
data PrimeSeq = PS Length Sum (Queue Prime)
headTails (PS len sum q) = (x, PS (len - 1) (sum - x) xs)
where (x, xs) = q_headtails q
append x (PS len sum xs) = PS (len + 1) (sum + x) (q_append x xs)
The algorithm for these looks like the following (a sketch follows the list):
1. Cache a copy of the PrimeSeq you're starting with.
2. Keep adding primes to it and testing primality until you get to 10^6.
3. If you find a new prime with a longer sequence, replace the cache.
4. Whenever you run into 10^6, revert to the cache, pull a prime off the front of the queue, then repeat as needed.
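Putting the pieces together, a sliding-window sketch of those steps might look like this. This is my own assembly, for illustration only: it assumes an isPrime :: Int -> Bool is in scope, that limit would be 10^6, and that the input list contains the primes below the limit.
-- Grow the window with `append`, checking each running sum; shrink it
-- from the front with `headTails` whenever the sum reaches the limit.
bestSeq :: Int -> [Prime] -> PrimeSeq
bestSeq limit = go empty empty
  where
    empty = PS 0 0 (Q [] [])
    lenOf (PS len _ _) = len
    sumOf (PS _ s _)   = s
    -- keep a candidate window if its sum is a prime below the limit
    -- and it is longer than the best seen so far
    better best cand
      | sumOf cand < limit && lenOf cand > lenOf best && isPrime (sumOf cand) = cand
      | otherwise = best
    go best cur ps
      | sumOf cur >= limit =               -- over the limit: drop the oldest prime
          let (_, cur') = headTails cur
          in  go (better best cur') cur' ps
    go best _ [] = best
    go best cur (p:ps) =                   -- otherwise grow the window
          let cur' = append p cur
          in  go (better best cur') cur' ps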
Your prime generation is quadratic (isPrime 101 tests rem 101 100 == 0 even though 10 is the biggest number by which 101 needs to be tested -- and actually 7 is enough).
Yet even with it, a simple enough list-based code finds the answer in under 2 seconds (on an Intel Core i7 2.5 GHz, interpreted in GHCi). And with the code corrected to take advantage of the above mentioned optimization (and additionally, testing by primes only), it takes 0.1s.
Also, f x | t = False | otherwise = True is the same as f x = not t.
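A sketch of isPrime repaired along those lines (trial division only up to the square root; this is my version, not code from the thread):
isPrime :: Integer -> Bool
isPrime n
  | n < 2     = False
  | otherwise = all (\d -> n `mod` d /= 0) [2 .. isqrt n]
  where
    -- Double-precision sqrt is accurate enough for the small n in this problem.
    isqrt = floor . (sqrt :: Double -> Double) . fromIntegral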
We are asked by the PE site not to give you even a hint.
But in general, the key to efficiency in Haskell, thanks to its laziness, is being generative, with as little duplication of effort as possible. As one example, instead of calculating each slice of a list in isolation, starting anew each time, we can produce the whole bunch of them together as part of one process:
slices :: Int -> [a] -> [[a]]
slices n = map (take n) . iterate tail -- sequence of list's slices of length n each
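For instance (on an infinite list; iterate tail would eventually error out on a finite one):
> take 4 $ slices 3 [1..]
[[1,2,3],[2,3,4],[3,4,5],[4,5,6]]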
Another principle is, try to solve a more general problem, of which yours is an instance.
Having written such a function, we can play with it by trying out different values for its parameters, from smaller to the bigger ones, for an exploratory style of problem solving. We're told about 21 consecutive primes. What about 22 of them? 27? 1127 of them? ... and I've said enough about this already.
If it starts taking too much time, we can assess the full solution's needed run time by empirical orders of growth analysis.
Though the solution is found quickly enough with your unoptimized isPrime code, the exploratory process can be prohibitively slow with it, but it is fast enough with the optimized code:
primes :: [Int]
primes = 2 : filter isPrime [3,5..]
isPrime n = and [rem n p > 0 | p <- takeWhile ((<= n).(^2)) primes]
I want to iterate 2 (or 3) infinite lists and find the "smallest" pair that satisfies a condition, like so:
head $ filter pred [(a,b,c) | a<-as, b<-bs, c<-cs]
where pred (a,b,c) = a*a + b*b == c*c
as = [1..]
bs = [1..]
cs = [1..]
The above wouldn't get very far, as a == b == 1 throughout the run of the program.
Is there a nice way to dovetail the problem, e.g. build the infinite sequence [(1,1,1),(1,2,1),(2,1,1),(2,1,2),(2,2,1),(2,2,2),(2,2,3),(2,3,2),..] ?
Bonus: is it possible to generalize to n-tuples?
There's a monad for that, Omega.
Prelude> let as = each [1..]
Prelude> let x = liftA3 (,,) as as as
Prelude> let x' = mfilter (\(a,b,c) -> a*a + b*b == c*c) x
Prelude> take 10 $ runOmega x'
[(3,4,5),(4,3,5),(6,8,10),(8,6,10),(5,12,13),(12,5,13),(9,12,15),(12,9,15),(8,15,17),(15,8,17)]
Using its applicative features, you can generalize to arbitrary tuples:
quadruples = (,,,) <$> as <*> as <*> as <*> as -- or call it liftA4
But: this alone does not eliminate duplication, of course. It only gives you proper diagonalization. Maybe you could use monad comprehensions together with an approach like Thomas's, or just another mfilter pass (restricting to b /= c, in this case).
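For the bonus question, since Omega is a monad you can sidestep tuples entirely and work with length-n lists via replicateM. A sketch (assumes the omega package; the predicate is just an illustration):
import Control.Monad (replicateM)
import Control.Monad.Omega (each, runOmega)

-- All length-n lists of positive integers, in dovetailed (diagonal) order.
diagN :: Int -> [[Integer]]
diagN n = runOmega $ replicateM n (each [1..])

-- e.g. "Pythagorean" n-tuples: the squares of all but the last element
-- sum to the square of the last (assumes n >= 2).
pythN :: Int -> [[Integer]]
pythN n = filter ok (diagN n)
  where ok xs = sum (map (^ 2) (init xs)) == last xs ^ 2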
List comprehensions are great (and concise) ways to solve such problems. First, you know you want all combinations of (a,b,c) that might satisfy a^2 + b^2 = c^2. A helpful observation is that (considering only positive numbers) it will always be the case that a <= c && b <= c.
To generate our list of candidates we can thus say c ranges from 1 to infinity while a and b range from one to c.
[(a,b,c) | c <- [1..], a <- [1..c], b <- [1..c]]
To get to the solution we just need to add your desired equation as a guard:
[(a,b,c) | c <- [1..], a <- [1..c], b <- [1..c], a*a+b*b == c*c]
This is inefficient, but the output is correct:
[(3,4,5),(4,3,5),(6,8,10),(8,6,10),(5,12,13),(12,5,13),(9,12,15)...
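If the mirrored duplicates such as (3,4,5) and (4,3,5) are unwanted, the comprehension can be tightened slightly by ordering the two legs (a small variation on the above):
-- Require a <= b so each triple appears exactly once.
[(a,b,c) | c <- [1..], b <- [1..c], a <- [1..b], a*a+b*b == c*c]
and the output then begins [(3,4,5),(6,8,10),(5,12,13),(9,12,15)...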
There are more principled methods than blind testing that can solve this problem.
{- It depends on what is "smallest". But here is a solution for a concept of "smallest" if tuples were compared first by their max. number and then by their total sum. (You can just copy and paste my whole answer into a file as I write the text in comments.)
We will need nub later. -}
import Data.List (nub)
{- Just for illustration: the easy case with 2-tuples. -}
-- all the two-tuples where 'snd' is 'n'
tuples n = [(i, n) | i <- [1..n]]
-- all the two-tuples where 'snd' is in '1..n'
tuplesUpTo n = concat [tuples i | i <- [1..n]]
{-
To get all results, you will need to insert the flip of each tuple into the stream. But let's do that later and generalize first.
Building tuples of arbitrary length is somewhat difficult, so we will work on lists. I call them 'kList's, if they have a length 'k'.
-}
-- just copied from the tuples case, only we need a base case for k=1 and
-- we can combine all results utilizing the list monad.
kLists 1 n = [[n]]
kLists k n = do
rest <- kLists (k-1) n
add <- [1..head rest]
return (add:rest)
-- same as above. all the klists with length k and max number of n
kListsUpTo k n = concat [kLists k i | i <- [1..n]]
-- we can do that unbounded as well, creating an infinite list.
kListsInf k = concat [kLists k i | i <- [1..]]
{-
The next step is rotating these lists around, because until now the largest number is always in the last place. So we just look at all rotations to get all the results. Using nub here is admittedly awkward, you can improve that. But without it, lists where all elements are the same are repeated k times.
-}
rotate n l = let (init, end) = splitAt n l
in end ++ init
rotations k l = nub [rotate i l | i <- [0..k-1]]
rotatedKListsInf k = concatMap (rotations k) $ kListsInf k
{- What remains is to convert these lists into tuples. This is a bit awkward, because every n-tuple is a separate type. But it's straightforward, of course. -}
kListToTuple2 [x,y] = (x,y)
kListToTuple3 [x,y,z] = (x,y,z)
kListToTuple4 [x,y,z,t] = (x,y,z,t)
kListToTuple5 [x,y,z,t,u] = (x,y,z,t,u)
kListToTuple6 [x,y,z,t,u,v] = (x,y,z,t,u,v)
{- Some tests:
*Main> take 30 . map kListToTuple2 $ rotatedKListsInf 2
[(1,1),(1,2),(2,1),(2,2),(1,3),(3,1),(2,3),(3,2),(3,3),(1,4),(4,1),(2,4),(4,2),(3,4),
(4,3),(4,4),(1,5),(5,1),(2,5),(5,2),(3,5),(5,3),(4,5),(5,4),(5,5),(1,6),(6,1),
(2,6), (6,2), (3,6)]
*Main> take 30 . map kListToTuple3 $ rotatedKListsInf 3
[(1,1,1),(1,1,2),(1,2,1),(2,1,1),(1,2,2),(2,2,1),(2,1,2),(2,2,2),(1,1,3),(1,3,1),
(3,1,1),(1,2,3),(2,3,1),(3,1,2),(2,2,3),(2,3,2),(3,2,2),(1,3,3),(3,3,1),(3,1,3),
(2,3,3),(3,3,2),(3,2,3),(3,3,3),(1,1,4),(1,4,1),(4,1,1),(1,2,4),(2,4,1),(4,1,2)]
Edit:
I realized there is a bug: Just rotating the ordered lists isn't enough of course. The solution must be somewhere along the lines of having
rest <- concat . map (rotations (k-1)) $ kLists (k-1) n
in kLists, but then some issues with repeated outputs arise. You can figure that out, I guess. ;-)
-}
It really depends on what you mean by "smallest", but I assume you want to find a tuple of numbers with respect to its maximal element - so (2,2) is less than (1,3) (while standard Haskell ordering is lexicographic).
There is a package, data-ordlist, which is aimed precisely at working with ordered lists. Its function mergeAll (and mergeAllBy) allows you to combine a 2-dimensional matrix, ordered in each direction, into an ordered list.
First let's create a desired comparing function on tuples:
import Data.List (find)
import Data.List.Ordered
compare2 :: (Ord a) => (a, a) -> (a, a) -> Ordering
compare2 x y = compare (max2 x, x) (max2 y, y)
where
max2 :: Ord a => (a, a) -> a
max2 (x, y) = max x y
Then using mergeAll we create a function that takes a comparator, a combining function (which must be monotonic in both arguments) and two sorted lists. It combines all possible elements from the two lists using the function and produces a result sorted list:
mergeWith :: (b -> b -> Ordering) -> (a -> a -> b) -> [a] -> [a] -> [b]
mergeWith cmp f xs ys = mergeAllBy cmp $ map (\x -> map (f x) xs) ys
With this function, it's very simple to produce tuples ordered according to their maximum:
incPairs :: [(Int,Int)]
incPairs = mergeWith compare2 (,) [1..] [1..]
Its first 10 elements are:
> take 10 incPairs
[(1,1),(1,2),(2,1),(2,2),(1,3),(2,3),(3,1),(3,2),(3,3),(1,4)]
and when we (for example) look for the first pair whose sum of squares is equal to 65:
find (\(x,y) -> x^2+y^2 == 65) incPairs
we get the correct result (4,7) (as opposed to (1,8) if lexicographic ordering were used).
This answer is for the more general problem of an unknown predicate. If the predicate is known, more efficient solutions are possible; others have listed solutions based on the knowledge that, for a given c, you don't need to iterate over all Ints.
When dealing with infinite lists, you need to perform a breadth-first search for the solution. A list comprehension only affords depth-first search, which is why you never arrive at a solution with your original code.
counters 0 xs = [[]]
counters n xs = concat $ foldr f [] gens where
gens = [[x:t | t <- counters (n-1) xs] | x <- xs]
f ys n = cat ys ([]:n)
cat (y:ys) (x:xs) = (y:x): cat ys xs
cat [] xs = xs
cat xs [] = [xs]
main = print $ take 10 $ filter p $ counters 3 [1..] where
p [a,b,c] = a*a + b*b == c*c
counters generates all possible counters for values from the specified range of digits, including an infinite range.
First, we obtain a list of generators of valid combinations of counters: for each permitted digit, combine it with all permitted combinations for counters of smaller size. This may result in a generator that produces an infinite number of combinations. So, we need to borrow from each generator evenly.
So gens is a list of generators. Think of this as a list of all counters starting with one digit: gens !! 0 is a list of all counters starting with 1, gens !! 1 is a list of all counters starting with 2, etc.
In order to borrow from each generator evenly, we could transpose the list of generators - that way we would get a list of first elements of the generators, followed by a list of second elements of the generators, etc.
Since the list of generators may be infinite, we cannot afford to transpose it, because we might never get to look at the second element of any generator (for an infinite number of digits we'd have an infinite number of generators). So we enumerate the elements from the generators "diagonally": take the first element from the first generator; then the second element from the first generator and the first from the second; then the third element from the first generator, the second from the second, and the first from the third; and so on.
This can be done by folding the list of generators with a function f, which zips together two lists (one list is the generator, the other is the already-zipped generators), the beginning of one of them being offset by one step by adding []: to the head. This is almost zipWith (:) ys ([]:n); the difference is that if n or ys is shorter than the other one, we don't drop the remainder of the other list. Note that folding with zipWith (:) ys n would be a transpose.
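For intuition, here is the diagonal order this fold produces in the two-digit case (a run I'd expect, given the definitions above):
> take 10 $ counters 2 [1..]
[[1,1],[1,2],[2,1],[1,3],[2,2],[3,1],[1,4],[2,3],[3,2],[4,1]]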
For this answer I will take "smallest" to refer to the sum of the numbers in the tuple.
To list all possible pairs in order, you can first list all of the pairs with a sum of 2, then all pairs with a sum of 3 and so on. In code
pairsWithSum n = [(i, n-i) | i <- [1..n-1]]
xs = concatMap pairsWithSum [2..]
Haskell doesn't have facilities for dealing with n-tuples without using Template Haskell, so to generalize this you will have to switch to lists.
ntuplesWithSum 1 s = [[s]]
ntuplesWithSum n s = concatMap (\i -> map (i:) (ntuplesWithSum (n-1) (s-i))) [1..s-n+1]
nums n = concatMap (ntuplesWithSum n) [n..]
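For example, the pairs enumerate in order of increasing sum:
> take 6 $ nums 2
[[1,1],[1,2],[2,1],[1,3],[2,2],[3,1]]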
Here's another solution, with probably another slightly different idea of "smallest". My order is just "all tuples with max element N come before all tuples with max element N+1". I wrote the versions for pairs and triples:
gen2_step :: Int -> [(Int, Int)]
gen2_step s = [(x, y) | x <- [1..s], y <- [1..s], (x == s || y == s)]
gen2 :: Int -> [(Int, Int)]
gen2 n = concatMap gen2_step [1..n]
gen2inf :: [(Int, Int)]
gen2inf = concatMap gen2_step [1..]
gen3_step :: Int -> [(Int, Int, Int)]
gen3_step s = [(x, y, z) | x <- [1..s], y <- [1..s], z <- [1..s], (x == s || y == s || z == s)]
gen3 :: Int -> [(Int, Int, Int)]
gen3 n = concatMap gen3_step [1..n]
gen3inf :: [(Int, Int, Int)]
gen3inf = concatMap gen3_step [1..]
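For example:
> take 9 gen2inf
[(1,1),(1,2),(2,1),(2,2),(1,3),(2,3),(3,1),(3,2),(3,3)]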
You can't really generalize it to N-tuples, though as long as you stay homogeneous, you may be able to generalize it if you use arrays. But I don't want to tie my brain into that knot.
I think this is the simplest solution if "smallest" is defined as x+y+z, because after you find your first solution in the space of integral-valued Pythagorean triangles, the subsequent solutions from the infinite list are bigger.
take 1 [(x,y,z) | y <- [1..], x <- [1..y], z <- [1..x], z*z + x*x == y*y]
-> [(4,5,3)]
It has the nice property that it returns each symmetrically unique solution only once. x and z effectively range over all positive integers too, because y does.
This, by contrast, does not work: the sequence for x never finishes, so y and z never advance past their first value. The rightmost generator is the innermost loop.
take 1 [(z,y,x)|z <- [1..],y <- [1..],x <- [1..],x*x + y*y == z*z]
Sorry, it's been quite a while since I did Haskell, so I'm going to describe this in words.
As I pointed out in my comment, it is not possible to find the smallest anything in an infinite list, since there could always be a smaller one.
What you can do is take a stream-based approach: from the input lists, produce a list containing only the 'valid' elements, i.e. those where the condition is met. Let's call this function triangle.
You can then compute the triangle list to some extent with take n (triangle ...), and among those n elements find the minimum.
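In code, that sketch might look like this (the names are mine, not from the question):
-- Keep only the 'valid' tuples, i.e. those meeting the condition.
triangle :: [(Int, Int, Int)]
triangle = [ (a, b, c) | c <- [1..], b <- [1..c], a <- [1..b], a*a + b*b == c*c ]

-- Compute the stream to some extent, then take the minimum of that prefix.
-- Swap in minimumBy for whatever ordering "smallest" means to you;
-- plain minimum uses the lexicographic Ord instance on tuples.
smallestOfFirst :: Int -> (Int, Int, Int)
smallestOfFirst n = minimum (take n triangle)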