Prevent list comprehension from accumulating in memory [duplicate] - haskell

I run out of memory trying to run moderate inputs such as this:
variation_models 15 25
also running higher numbers for ncars seems to make a huge difference in speed and memory usage.
The slowdown is expected (there are more things to compare), but the exponential increase of memory usage doesn't make sense to me
import Control.Monad
orderedq f [] = True
orderedq f (x:[]) = True
orderedq f (x:y:zs) = f x y && orderedq f (y:zs)
num_orderedq = orderedq (<=)
adds_up_to n xs = n == sum xs
both_conditions f g xs = f xs && g xs
variation_models ncars nlocations =
filter (both_conditions (adds_up_to nlocations) num_orderedq) $ replicateM ncars [1..nlocations-ncars+1]
What is causing the large difference in memory usage? replicateM?

I think you've seen elsewhere that your specific problem (creating ordered lists of integers that sum to a given number) is better solved using an alternative algorithm, rather than filtering a huge list of lists of integers.
However, getting back to your original issue, it is possible to construct an equivalent of:
replicateM p [1..n]
that runs in exponential time (of course) but constant space.
The problem is that this expression is more or less equivalent to the recursion:
badPower 0 _ = pure []
badPower p n = [x:xs | x <- [1..n], xs <- badPower (p-1) n]
So, in the list comprehension, for each selected x, the whole list badPower (p-1) n needs to be re-generated from the start. GHC, sensibly enough, decides to keep badPower (p-1) n around so it doesn't need to be recomputed each time. So, the badPower p n call needs the entire badPower (p-1) n list kept in memory, which already accounts for n^(p-1) elements and exponential memory use, even without considering badPower (p-2) n, etc.
If you just flip the order of the implicit loops around:
goodPower 0 _ = pure []
goodPower p n = [x:xs | xs <- goodPower (p-1) n, x <- [1..n]]
That fixes the problem. Even though the list goodPower (p-1) n is "big", we take it's first element, use it n times for each value of x and then can discard it and move to the next element. So, goodPower (p-1) n can be garbage collected as it's used.
Note that goodPower generates the elements in a different order than badPower, with the first coordinate of the lists varying fastest, instead of the last. (If this matters, you can map reverse $ goodPower .... While reverse is "slow", it's only being applied to short lists here.)
Anyway, the following program runs (practically) forever, but does so in constant space:
power :: Int -> [a] -> [[a]]
power 0 _ = [[]]
power p lst = [x:xs | xs <- power (p-1) lst, x <- lst ]
main = do
print $ length (power 15 [1..11])

replicateM :: Applicative m => Int -> m a -> m [a]
When 'm' is [], monad join implementation will make replicateM build all permutations of n elements from the list elements. The number of such permutations is written P(n,k), and is equal to n!/(n-k)!. This is where the exponential growth come from.

Related

Are recursive calls in my "permutations with repetition" code accumulated to clog the RAM?

A bit of background:
I am an amateur programmer, having picked up Haskell a few months ago, on my spare time, after a period of Mathematica programmning (my first language). I am currently going through my second Haskell book, by Will Kurt, but I still have miles to go to call myself comfortable around Haskell code. Codeabbey has been my platform for experimentation and learning so far.
I have written a piece of code to generate permutations of a given number, that deals with possible duplicate numbers, so for 588 it will internally generate 588, 858 and 885.
However, because I want to scale to pretty big input numbers (think perhaps even a hundred digits long), I don't want to output the whole list and then perform calculations on it, instead every number that is generated is checked on the spot for a certain property and if it has it, well, we have a winner, the number is returned as output and there's no need to go through the rest of the humongous list. If sadly no desired number is found and we unsuccessfully go through all possible permutations, it outputs a "0".
I have also opted to make it a command line program to feed values to it via gnu parallel for faster work.
So here is the code
import System.Environment
import Data.List
toDigits :: Integer -> [Integer]
toDigits n = map (\n -> read [n]) (show n)
fromDigits :: Integral a => [a] -> Integer
fromDigits list = fromDigitsHelperFunction list 0
fromDigitsHelperFunction :: Integral a => [a] -> Integer -> Integer
fromDigitsHelperFunction [] acc = acc
fromDigitsHelperFunction (x:[]) acc = (fromIntegral x) + acc
fromDigitsHelperFunction digits#(x:xs) acc = fromDigitsHelperFunction xs (acc + ((fromIntegral x) * 10 ^((length digits) - 1 )))
testPermutationsWithRepetition :: ([Integer],Int,[Int],[(Int,Integer)]) -> [Integer]
testPermutationsWithRepetition (digits, index, rotationMap, registeredPositions)
| index == 0 && rotationMap !! index == 0 = [0,0,0] --finish state (no more recursion). Nothing more to do
| index == digitsLength - 1 && beautyCheck (fromDigits digits) = digits
| index == digitsLength - 1 = testPermutationsWithRepetition (digits, index-1, rotationMap, registeredPositions)
| not ((index,digits!!index) `elem` registeredPositions) = testPermutationsWithRepetition (digits, index+1, rotationMap, (index,digits!!index):registeredPositions)
| rotationMap!!index == 0 = testPermutationsWithRepetition (digits, index-1, restoredRotMap, restoredRegPositions)
| rotationMap!!index > 0 && (index,digits!!index) `elem` registeredPositions = testPermutationsWithRepetition (shiftLDigits, index, subtractRot, registeredPositions)
where digitsLength = length digits
shiftLDigits = (fst splitDigits) ++ (tail $ snd splitDigits) ++ [head $ snd splitDigits]
splitDigits = splitAt index digits
restoredRotMap = (fst splitRotMap) ++ [digitsLength - index] ++ (tail $ snd splitRotMap)
splitRotMap = splitAt index rotationMap
restoredRegPositions = filter (\pos -> fst pos < index) registeredPositions --clear everything below the parent index
subtractRot = (fst splitRotMap) ++ [(head $ snd splitRotMap) - 1] ++ (tail $ snd splitRotMap)
--Frontend function for testing permutations by inputting a single parameter (the number in digit form)
testPermsWithRep :: [Integer] -> [Integer]
testPermsWithRep digits = testPermutationsWithRepetition (digits, 0, [length $ digits, (length $ digits) -1 .. 1], [])
main :: IO ()
main = do
args <- getArgs
let number = read (head args) :: Integer
let checkResult = fromDigits $ testPermsWithRep $ toDigits number
print checkResult
It's really a sequential process with an index variable that points to a certain number on the digit list and performs a recursive call on that list based on my rules. The functions tracks its progress through the digit list for visited numbers in certain positions so far (to avoid repetition following already visited paths until it gets to the last digit (index == length -1). If the number that we get there passes the beauty check, it exits with the number produced.
Now, in a Mathematica (or I guess any imperative language) I would probably implement this with a While loop and Cases for its checks, and by the logic of the program, however long it took to compute (generate the permutations and check them for validity) it would take a moderate amount of memory, just enough to hold the list of "registeredPositions" really (you could call it the record of visited digits in specific positions, so it's a variable list as we go deeper in index but gets cleaned up as we move back up). However in this case, the recursive calls stack up as it seems and the whole thing acts as a fork bomb for sufficiently large numbers (e.g 27777772222222222222222223333) and eventually crashes. Is this behaviour something that can be handled differently in Haskell or is there no way to avoid the recursion and memory hogging?
I really like Haskell because the programs make logical sense, but I would like to use it also for cases like this where performance (and resources) matters.
As a side note, my brother pointed to this Algorithm to print all permutations with repetition of numbers in C that is reasonably fast (only generates a list though) and most importantly has minimal memory footprint, although I can tell there's also recursion used in it. Other that that I'm clueless when it comes to C and I would like to stick to Haskell, if it can do what I want at the end of the day, that is.
Any help is welcome. Have a good day!
Edit:
Per Soleil's suggestion I update my post with additional info provided in the comments. Specifically:
After compiling with "ghc checking_program.hs" I run the program with "./checking program 27777772222222222222222223333". On an i5 3470 with 4GB RAM it runs for about 10 minutes and exits with a segmentation fault. On my brothers 32GB machine he let it run until it took up 20GB of RAM. No need to go further I guess. My tests were on Ubuntu via Win10 WSL. His is bare Linux
testPermsWithRep is just a front end for testPermutationsWithRepetition, so that I can only provide the number and testPermsWithRep creates the initial parameters and calls testPermutationsWithRepetition with those. It outputs exactly what testPermutationsWithRepetition outputs, either a number (in digit form) that passes the test, or [0,0,0]. Now the test, the beautyCheck function is simply a test for single digit divisors of that number, that returns True or False. I didn't include it because it really is inconsequential. It could even be just a "bigger than x number" test.
An an example, calling "testPermsWithRep [2,6,7,3]" will call "testPermutationsWithRepetition ([2,6,7,3], 0, [4,3,2,1],[])" and whatever comes out of that function, testPermsWithRep will return that as well.
The performance issue with your program doesn't have anything to do with recursion. Rather, you seem to be running up against an accumulation of a partially evaluated, lazy data structure in your rotation map. Your program will run in constant memory if you use the deepseq package to fully force evaluation of the restoredRotMap:
-- Install the `deepseq` package and add this import
import Control.DeepSeq
-- And then change this one case
... | rotationMap!!index == 0 = restoredRotMap `deepseq`
testPermutationsWithRepetition (digits, index-1, restoredRotMap, restoredRegPositions)
Compiled with ghc -O2 and using beautyMap _ = False, this runs with a fixed resident memory usage of about 6 megs.
Some other performance targets:
You might want to replace most of your Integer types with Int, as this will be faster. I think you only need Integer for the input to toDigits and the output of fromDigits, and everything else can be Int, since it's all indexes and digits.
An even bigger win will be to replace your rotation map and registered positions with better data structures. If you find yourself splicing up lists with lots of listpart1 ++ [x] ++ listpart2 calls, there are going to be enormous performance costs to that, and the linear lookups with (!!) aren't helping either.
So I am not 100% sure of this and I am also not 100% sure I understand your code.
But as far as I understand you are generating permutations without duplicates and then you are checking for some predicate wanting whatever single number that fulfils it.
I think it should help to use as many of the prelude functions as possible because afaik then the compiler understands it can optimize recursion into a loop. As a rule of thumb I was taught to avoid explicit recursion as much as possible and instead use prelude functions like map, filter and fold. Mainly you avoid reinventing the wheel this way but there also should be a higher chance of the compiler optimizing things.
So to solve your problem try generating a list of all permutations, then filter it using filter and then just do take 1 if you want the result that is found first. Because of Haskell's lazy evaluation take 1 makes it so that we are interested only in the first x in (x:xs) that a filter would return. Therefore filter will keep dropping elements from the, again lazily evaluated, list of permutations and when it finds one it stops.
I found a permutation implementation on https://rosettacode.org/wiki/Permutations#Haskell
and used it to try this call:
take 1 $ filter ((> 67890123456789012345) . fromDigits) $ permutations' $ toDigits 12345678901234567890
it has been running for like 20 minutes now and RAM usage has stayed around 230 MB.
I hope that has answered/helped you at least in some way.
+ a bonus tip: you can simplify your fromDigits to this beautiful thing:
fromDigits :: Integral a => [a] -> Integer
fromDigits = foldl shiftAndAdd 0
where shiftAndAdd acc d = 10 * acc + fromIntegral d
EDIT:
I read some more of the comments and I see you care about ignoring duplicates but I am afraid you'll have to go smarter about that, since if I understand correctly your implementation still generates all the duplicates it only throws them away after checking if they are in a list (which has O(n) complexity). And when you only care about finding one permutation that fits your predicate you drop the not fitting ones anyway.
And people have already correctly pointed out that !! is generally also very bad.
Thanks to everyone for your helpful answers and comments.
#lordQuick permuations used with filter is still terrible but that fromDigits code is a beauty, so I used it.
#k-a-buhr That's exactly what I did yesterday, also per others suggestion, I replaced all use of !! and ++. When I did the latter all memory problems disappeared. Wow! I mean I knew ++ is bad I just didn't realise how bad! We're talking orders of magnitude bad! 3M of RAM vs several GB. Also, valid point about integers. I will try that.
Oh, also a very important thing. I replaced recursive calls with until. This is the approach I would have followed in Mathematica (a NestWhile function to be exact), and I'm glad I found it in Haskell. It seemed to make things a bit faster too.
Anyway, the revised code, that solves my memory issues is here for anyone if interested.
{-compiled with "ghc -Rghc-timing -O2 checking_program_v3.hs"-}
import System.Environment
import Data.List
--A little help with triples
fstOfThree (a, _, _) = a
sndOfThree (_, b, _) = b
thrOfThree (_, _, c) = c
--And then some with quads
fstOfFour (a, _, _, _) = a
sndOfFour (_, b, _, _) = b
thrOfFour (_, _, c, _) = c
--This function is a single pass test for single digit factors
--It will be called as many times as needed by pryForSDFactors
trySingleDigitsFactors :: (Bool, Integer, [Integer]) -> (Bool, Integer, [Integer])
trySingleDigitsFactors (True, n, f) = (True, n, f)
trySingleDigitsFactors (b, n, []) = (b, n, [])
trySingleDigitsFactors (b, n, (f:fs))
| mod n f == 0 = (True, div n f, fs)
| otherwise = trySingleDigitsFactors (False, n, fs)
--This function will take a number and repeatedly divide by single digits till it gets to a single digit if possible
--Then it will return True
pryForSDFactors :: Integer -> Bool
pryForSDFactors n
| sndOfThree sdfTry < 10 = True
| fstOfThree sdfTry == True = pryForSDFactors $ sndOfThree sdfTry
| otherwise = False
where sdfTry = trySingleDigitsFactors (False, n, [7,5,3,2])
toDigits :: Integer -> [Integer]
toDigits n = map (\n -> read [n]) (show n)
fromDigits :: Integral a => [a] -> Integer
fromDigits = foldl shiftAndAdd 0
where shiftAndAdd acc d = 10 * acc + fromIntegral d
replaceElementAtPos :: a -> Int -> [a] -> [a]
replaceElementAtPos newElement pos [] = []
replaceElementAtPos newElement 0 (x:xs) = newElement:xs
replaceElementAtPos newElement pos (x:xs) = x : replaceElementAtPos newElement (pos-1) xs
checkPermutationsStep :: ([Integer],Int,[Int],[(Int,Integer)]) -> ([Integer],Int,[Int],[(Int,Integer)])
checkPermutationsStep (digits, index, rotationMap, registeredPositions)
| index == digitsLength - 1 = (digits, index-1, rotationMap, registeredPositions)
| not ((index, digitAtIndex) `elem` registeredPositions) = (digits, index+1, rotationMap, (index,digitAtIndex):registeredPositions)
| rotationAtIndex == 0 = (digits, index-1, restoredRotMap, restoredRegPositions)
| rotationAtIndex > 0 && (index, digitAtIndex) `elem` registeredPositions = (shiftLDigits, index, subtractRot, registeredPositions)
where digitsLength = length digits
digitAtIndex = head $ drop index digits
rotationAtIndex = head $ drop index rotationMap
--restoredRotMap = (fst splitRotMap) ++ [digitsLength - index] ++ (tail $ snd splitRotMap)
restoredRotMap = replaceElementAtPos (digitsLength - index) index rotationMap
--splitRotMap = splitAt index rotationMap
restoredRegPositions = filter (\pos -> fst pos < index) registeredPositions --clear everything below the parent index
shiftLDigits = (fst splitDigits) ++ (tail $ snd splitDigits) ++ [head $ snd splitDigits]
splitDigits = splitAt index digits
--subtractRot = (fst splitRotMap) ++ [(head $ snd splitRotMap) - 1] ++ (tail $ snd splitRotMap)
subtractRot = replaceElementAtPos (rotationDigitAtIndex - 1) index rotationMap
rotationDigitAtIndex = head $ drop index rotationMap
checkConditions :: ([Integer],Int,[Int],[(Int,Integer)]) -> Bool
checkConditions (digits, index, rotationMap, registeredPositions)
| (index == 0 && rotationAtIndex == 0) || ((index == (length digits) - 1) && pryForSDFactors (fromDigits digits)) = True
| otherwise = False
where rotationAtIndex = head $ drop index rotationMap
testPermsWithRep :: Integer -> Integer
testPermsWithRep n
| sndOfFour computationResult == 0 && (head . thrOfFour) computationResult == 0 = 0
| otherwise = (fromDigits . fstOfFour) computationResult
where computationResult = until checkConditions checkPermutationsStep (digitsOfn, 0 , [digitsLength, digitsLength -1 .. 1], [])
digitsOfn = toDigits n
digitsLength = length digitsOfn
main :: IO ()
main = do
args <- getArgs
let inputNumber = read (head args) :: Integer
let checkResult = testPermsWithRep inputNumber
print checkResult
Now, bear in mind that this code, as I've mentioned, checks for a condition of each generated permutation (single digit factors) on the spot, and moves on if False, but it's pretty easy to repurpose it for output list generation.
Sure it's now just inefficient in terms of big O complexity (scales terribly), and I was at first thinking of replacing lists with Data.Map because that's what I've learned so far (though not so comfortable with maps yet).
I've also read that there's a more efficient replacement for read since that's also called a lot for numbers-to-digits conversions.
# lordQuick I don't know about HashMaps or vectors yet but I'm still learning. Every little optimization will pay off in computation time because this is my first piece of "practical" code, not just Codeabbey credit
Cheers!
Here is a solution using a more efficient, insertion-based algorithm to compute unique permutations:
import Data.List
permutationsNub :: Eq a => [a] -> [[a]]
permutationsNub = foldr (concatMap . insert) [[]]
where insert y = foldr combine [[y]] . (zip <*> tail . tails)
where combine (x, xs) xss = (y : x : xs) :
if y == x then [] else map (x :) xss
headDef :: a -> [a] -> a
headDef x [] = x
headDef x (h : t) = h
fromDigits :: Integral a => [a] -> Integer
fromDigits = foldl1' ((+) . (10 *)) . map fromIntegral
toDigits :: Integer -> [Int]
toDigits = map (read . pure) . show
firstValidPermutation :: (Integer -> Bool) -> Integer -> Integer
firstValidPermutation p =
headDef 0 .
filter p .
map fromDigits .
permutationsNub .
toDigits
The basic idea is that, given the unique permutations of a list's tail, we can compute the unique permutations of the whole list by inserting its head into all of the tail's permutations, in every position that doesn't follow an occurrence of the head (to avoid creating duplicates). From my tests, permutationsNub seems to be faster than permutations from Data.List even when the input contains no repetitions. However, unlike that function, it consumes its input eagerly and thus cannot handle an infinite input. Exercise: Prove this algorithm's correctness.
to be continued

Why does my function not work with an infinite list?

I'm trying to learn haskell and implemented a function conseq that would return a list of consecutive elements of size n.
conseq :: Int -> [Int] -> [[Int]]
conseq n x
| n == length(x) = [x]
| n > length(x) = [x]
| otherwise = [take n x] ++ (conseq n (drop 1 x))
This works correctly.
> take 5 $ conseq 2 [1..10]
[[1,2],[2,3],[3,4],[4,5],[5,6]]
However, if I pass [1..] instead of [1..10], the program gets stuck in an infinite loop.
As I understood it, haskell has lazy evaluation so I should still be able to get the same result right? Is it length? Shouldn't the first two conditions evaluate to false as soon as the length becomes greater than n?
What did I misunderstand?
One of the main reasons why using length is not a good idea is because when it has to be evaluated on an infinite list, it will get stuck in an infinite loop.
The good news is however, we don't need length. It would also make the time complexity worse. We can work with two enumerators, one is n-1 places ahead of the other. If this enumerator reaches the end of the list, then we know that the first enumerator still has n-1 elements, and thus we can stop yielding values:
conseq :: Int -> [a] -> [[a]]
conseq n ys = go (drop (n-1) ys) ys
where go [] _ = []
go (_:as) ba#(~(_:bs)) = take n ba : go as bs
This gives us thus:
Prelude> conseq 3 [1 ..]
[[1,2,3],[2,3,4],[3,4,5],[4,5,6],[5,6,7],[6,7,8],[7,8,9],[8,9,10],[9,10,11],[10,11,12],[11,12,13],[12,13,14],[13,14,15],[14,15,16],[15,16,17],[16,17,18],[17,18,19],[18,19,20],[19,20,21],[20,21,22],[21,22,23],[22,23,24],[23,24,25],[24,25,26],[25,26,27],…
Prelude> conseq 3 [1 .. 4]
[[1,2,3],[2,3,4]]
The first thing your function does is calculate length(x), so it knows whether it should return [x], [x], or [take n x] ++ (conseq n (drop 1 x))
length counts the number of elements in the list - all the elements. If you ask for the length of an infinite list, it never finishes counting.

haskell running out of memory with finite lists

I run out of memory trying to run moderate inputs such as this:
variation_models 15 25
also running higher numbers for ncars seems to make a huge difference in speed and memory usage.
The slowdown is expected (there are more things to compare), but the exponential increase of memory usage doesn't make sense to me
import Control.Monad
orderedq f [] = True
orderedq f (x:[]) = True
orderedq f (x:y:zs) = f x y && orderedq f (y:zs)
num_orderedq = orderedq (<=)
adds_up_to n xs = n == sum xs
both_conditions f g xs = f xs && g xs
variation_models ncars nlocations =
filter (both_conditions (adds_up_to nlocations) num_orderedq) $ replicateM ncars [1..nlocations-ncars+1]
What is causing the large difference in memory usage? replicateM?
I think you've seen elsewhere that your specific problem (creating ordered lists of integers that sum to a given number) is better solved using an alternative algorithm, rather than filtering a huge list of lists of integers.
However, getting back to your original issue, it is possible to construct an equivalent of:
replicateM p [1..n]
that runs in exponential time (of course) but constant space.
The problem is that this expression is more or less equivalent to the recursion:
badPower 0 _ = pure []
badPower p n = [x:xs | x <- [1..n], xs <- badPower (p-1) n]
So, in the list comprehension, for each selected x, the whole list badPower (p-1) n needs to be re-generated from the start. GHC, sensibly enough, decides to keep badPower (p-1) n around so it doesn't need to be recomputed each time. So, the badPower p n call needs the entire badPower (p-1) n list kept in memory, which already accounts for n^(p-1) elements and exponential memory use, even without considering badPower (p-2) n, etc.
If you just flip the order of the implicit loops around:
goodPower 0 _ = pure []
goodPower p n = [x:xs | xs <- goodPower (p-1) n, x <- [1..n]]
That fixes the problem. Even though the list goodPower (p-1) n is "big", we take it's first element, use it n times for each value of x and then can discard it and move to the next element. So, goodPower (p-1) n can be garbage collected as it's used.
Note that goodPower generates the elements in a different order than badPower, with the first coordinate of the lists varying fastest, instead of the last. (If this matters, you can map reverse $ goodPower .... While reverse is "slow", it's only being applied to short lists here.)
Anyway, the following program runs (practically) forever, but does so in constant space:
power :: Int -> [a] -> [[a]]
power 0 _ = [[]]
power p lst = [x:xs | xs <- power (p-1) lst, x <- lst ]
main = do
print $ length (power 15 [1..11])
replicateM :: Applicative m => Int -> m a -> m [a]
When 'm' is [], monad join implementation will make replicateM build all permutations of n elements from the list elements. The number of such permutations is written P(n,k), and is equal to n!/(n-k)!. This is where the exponential growth come from.

Dovetail iteration over infinite lists in Haskell

I want to iterate 2 (or 3) infinite lists and find the "smallest" pair that satisfies a condition, like so:
until pred [(a,b,c) | a<-as, b<-bs, c<-cs]
where pred (a,b,c) = a*a + b*b == c*c
as = [1..]
bs = [1..]
cs = [1..]
The above wouldn't get very far, as a == b == 1 throughout the run of the program.
Is there a nice way to dovetail the problem, e.g. build the infinite sequence [(1,1,1),(1,2,1),(2,1,1),(2,1,2),(2,2,1),(2,2,2),(2,2,3),(2,3,2),..] ?
Bonus: is it possible to generalize to n-tuples?
There's a monad for that, Omega.
Prelude> let as = each [1..]
Prelude> let x = liftA3 (,,) as as as
Prelude> let x' = mfilter (\(a,b,c) -> a*a + b*b == c*c) x
Prelude> take 10 $ runOmega x'
[(3,4,5),(4,3,5),(6,8,10),(8,6,10),(5,12,13),(12,5,13),(9,12,15),(12,9,15),(8,15,17),(15,8,17)]
Using it's applicative features, you can generalize to arbitrary tuples:
quadrupels = (,,,) <$> as <*> as <*> as <*> as -- or call it liftA4
But: this alone does not eliminate duplication, of course. It only gives you proper diagonalization. Maybe you could use monad comprehensions together with an approach like Thomas's, or just another mfilter pass (restricting to b /= c, in this case).
List comprehensions are great (and concise) ways to solve such problems. First, you know you want all combinations of (a,b,c) that might satisfy a^2 + b^2 = c^2 - a helpful observation is that (considering only positive numbers) it will always be the case that a <= c && b <= c.
To generate our list of candidates we can thus say c ranges from 1 to infinity while a and b range from one to c.
[(a,b,c) | c <- [1..], a <- [1..c], b <- [1..c]]
To get to the solution we just need to add your desired equation as a guard:
[(a,b,c) | c <- [1..], a <- [1..c], b <- [1..c], a*a+b*b == c*c]
This is inefficient, but the output is correct:
[(3,4,5),(4,3,5),(6,8,10),(8,6,10),(5,12,13),(12,5,13),(9,12,15)...
There are more principled methods than blind testing that can solve this problem.
{- It depends on what is "smallest". But here is a solution for a concept of "smallest" if tuples were compared first by their max. number and then by their total sum. (You can just copy and paste my whole answer into a file as I write the text in comments.)
We will need nub later. -}
import Data.List (nub)
{- Just for illustration: the easy case with 2-tuples. -}
-- all the two-tuples where 'snd' is 'n'
tuples n = [(i, n) | i <- [1..n]]
-- all the two-tuples where 'snd' is in '1..n'
tuplesUpTo n = concat [tuples i | i <- [1..n]]
{-
To get all results, you will need to insert the flip of each tuple into the stream. But let's do that later and generalize first.
Building tuples of arbitrary length is somewhat difficult, so we will work on lists. I call them 'kList's, if they have a length 'k'.
-}
-- just copied from the tuples case, only we need a base case for k=1 and
-- we can combine all results utilizing the list monad.
kLists 1 n = [[n]]
kLists k n = do
rest <- kLists (k-1) n
add <- [1..head rest]
return (add:rest)
-- same as above. all the klists with length k and max number of n
kListsUpTo k n = concat [kLists k i | i <- [1..n]]
-- we can do that unbounded as well, creating an infinite list.
kListsInf k = concat [kLists k i | i <- [1..]]
{-
The next step is rotating these lists around, because until now the largest number is always in the last place. So we just look at all rotations to get all the results. Using nub here is admittedly awkward, you can improve that. But without it, lists where all elements are the same are repeated k times.
-}
rotate n l = let (init, end) = splitAt n l
in end ++ init
rotations k l = nub [rotate i l | i <- [0..k-1]]
rotatedKListsInf k = concatMap (rotations k) $ kListsInf k
{- What remains is to convert these lists into tuples. This is a bit awkward, because every n-tuple is a separate type. But it's straightforward, of course. -}
kListToTuple2 [x,y] = (x,y)
kListToTuple3 [x,y,z] = (x,y,z)
kListToTuple4 [x,y,z,t] = (x,y,z,t)
kListToTuple5 [x,y,z,t,u] = (x,y,z,t,u)
kListToTuple6 [x,y,z,t,u,v] = (x,y,z,t,u,v)
{- Some tests:
*Main> take 30 . map kListToTuple2 $ rotatedKListsInf 2
[(1,1),(1,2),(2,1),(2,2),(1,3),(3,1),(2,3),(3,2),(3,3),(1,4),(4,1),(2,4),(4,2),(3,4),
(4,3),(4,4),(1,5),(5,1),(2,5),(5,2),(3,5),(5,3),(4,5),(5,4),(5,5),(1,6),(6,1),
(2,6), (6,2), (3,6)]
*Main> take 30 . map kListToTuple3 $ rotatedKListsInf 3
[(1,1,1),(1,1,2),(1,2,1),(2,1,1),(1,2,2),(2,2,1),(2,1,2),(2,2,2),(1,1,3),(1,3,1),
(3,1,1),(1,2,3),(2,3,1),(3,1,2),(2,2,3),(2,3,2),(3,2,2),(1,3,3),(3,3,1),(3,1,3),
(2,3,3),(3,3,2),(3,2,3),(3,3,3),(1,1,4),(1,4,1),(4,1,1),(1,2,4),(2,4,1),(4,1,2)]
Edit:
I realized there is a bug: Just rotating the ordered lists isn't enough of course. The solution must be somewhere along the lines of having
rest <- concat . map (rotations (k-1)) $ kLists (k-1) n
in kLists, but then some issues with repeated outputs arise. You can figure that out, I guess. ;-)
-}
It really depends on what you mean by "smallest", but I assume you want to find a tuple of numbers with respect to its maximal element - so (2,2) is less than (1,3) (while standard Haskell ordering is lexicographic).
There is package data-ordlist, which is aimed precisely at working with ordered lists. It's function mergeAll (and mergeAllBy) allows you to combine a 2-dimensional matrix ordered in each direction into an ordered list.
First let's create a desired comparing function on tuples:
import Data.List (find)
import Data.List.Ordered
compare2 :: (Ord a) => (a, a) -> (a, a) -> Ordering
compare2 x y = compare (max2 x, x) (max2 y, y)
where
max2 :: Ord a => (a, a) -> a
max2 (x, y) = max x y
Then using mergeAll we create a function that takes a comparator, a combining function (which must be monotonic in both arguments) and two sorted lists. It combines all possible elements from the two lists using the function and produces a result sorted list:
mergeWith :: (b -> b -> Ordering) -> (a -> a -> b) -> [a] -> [a] -> [b]
mergeWith cmp f xs ys = mergeAllBy cmp $ map (\x -> map (f x) xs) ys
With this function, it's very simple to produce tuples ordered according to their maximum:
incPairs :: [(Int,Int)]
incPairs = mergeWith compare2 (,) [1..] [1..]
Its first 10 elements are:
> take 10 incPairs
[(1,1),(1,2),(2,1),(2,2),(1,3),(2,3),(3,1),(3,2),(3,3),(1,4)]
and when we (for example) look for the first pair whose sum of squares is equal to 65:
find (\(x,y) -> x^2+y^2 == 65) incPairs
we get the correct result (4,7) (as opposed to (1,8) if lexicographic ordering were used).
This answer is for a more general problem for a unknown predicate. If the predicate is known, more efficient solutions are possible, like others have listed solutions based on knowledge that you don't need to iterate for all Ints for a given c.
When dealing with infinite lists, you need to perform breadth-first search for solution. The list comprehension only affords depth-first search, that is why you never arrive at a solution in your original code.
counters 0 xs = [[]]
counters n xs = concat $ foldr f [] gens where
gens = [[x:t | t <- counters (n-1) xs] | x <- xs]
f ys n = cat ys ([]:n)
cat (y:ys) (x:xs) = (y:x): cat ys xs
cat [] xs = xs
cat xs [] = [xs]
main = print $ take 10 $ filter p $ counters 3 [1..] where
p [a,b,c] = a*a + b*b == c*c
counters generates all possible counters for values from the specified range of digits, including a infinite range.
First, we obtain a list of generators of valid combinations of counters - for each permitted digit, combine it with all permitted combinations for counters of smaller size. This may result in a generator that produces a infinite number of combinations. So, we need to borrow from each generator evenly.
So gens is a list of generators. Think of this as a list of all counters starting with one digit: gens !! 0 is a list of all counters starting with 1, gens !! 1 is a list of all counters starting with 2, etc.
In order to borrow from each generator evenly, we could transpose the list of generators - that way we would get a list of first elements of the generators, followed by a list of second elements of the generators, etc.
Since the list of generators may be infinite, we cannot afford to transpose the list of generators, because we may never get to look at the second element of any generator (for a infinite number of digits we'd have a infinite number of generators). So, we enumerate the elements from the generators "diagonally" - take first element from the first generator; then take the second element from the first generator and the first from the second generator; then take the third element from the first generator, the second from the second, and the first element from the third generator, etc. This can be done by folding the list of generators with a function f, which zips together two lists - one list is the generator, the other is the already-zipped generators -, the beginning of one of them being offset by one step by adding []: to the head. This is almost zipWith (:) ys ([]:n) - the difference is that if n or ys is shorter than the other one, we don't drop the remainder of the other list. Note that folding with zipWith (:) ys n would be a transpose.
For this answer I will take "smallest" to refer to the sum of the numbers in the tuple.
To list all possible pairs in order, you can first list all of the pairs with a sum of 2, then all pairs with a sum of 3 and so on. In code
pairsWithSum n = [(i, n-i) | i <- [1..n-1]]
xs = concatMap pairsWithSum [2..]
Haskell doesn't have facilities for dealing with n-tuples without using Template Haskell, so to generalize this you will have to switch to lists.
ntuplesWithSum 1 s = [[s]]
ntuplesWithSum n s = concatMap (\i -> map (i:) (ntuplesWithSum (n-1) (s-i))) [1..s-n+1]
nums n = concatMap (ntuplesWithSum n) [n..]
Here's another solution, with probably another slightly different idea of "smallest". My order is just "all tuples with max element N come before all tuples with max element N+1". I wrote the versions for pairs and triples:
gen2_step :: Int -> [(Int, Int)]
gen2_step s = [(x, y) | x <- [1..s], y <- [1..s], (x == s || y == s)]
gen2 :: Int -> [(Int, Int)]
gen2 n = concatMap gen2_step [1..n]
gen2inf :: [(Int, Int)]
gen2inf = concatMap gen2_step [1..]
gen3_step :: Int -> [(Int, Int, Int)]
gen3_step s = [(x, y, z) | x <- [1..s], y <- [1..s], z <- [1..s], (x == s || y == s || z == s)]
gen3 :: Int -> [(Int, Int, Int)]
gen3 n = concatMap gen3_step [1..n]
gen3inf :: [(Int, Int, Int)]
gen3inf = concatMap gen3_step [1..]
You can't really generalize it to N-tuples, though as long as you stay homogeneous, you may be able to generalize it if you use arrays. But I don't want to tie my brain into that knot.
I think this is the simplest solution if "smallest" is defined as x+y+z because after you find your first solution in the space of Integral valued pythagorean triangles, your next solutions from the infinite list are bigger.
take 1 [(x,y,z) | y <- [1..], x <- [1..y], z <- [1..x], z*z + x*x == y*y]
-> [(4,5,3)]
It has the nice property that it returns each symmetrically unique solution only once. x and z are also infinite, because y is infinite.
This does not work, because the sequence for x never finishes, and thus you never get a value for y, not to mention z. The rightmost generator is the innermost loop.
take 1 [(z,y,x)|z <- [1..],y <- [1..],x <- [1..],x*x + y*y == z*z]
Sry, it's quite a while since I did haskell, so I'm going to describe it with words.
As I pointed out in my comment. It is not possible to find the smallest anything in an infinite list, since there could always be a smaller one.
What you can do is, have a stream based approach that takes the lists and returns a list with only 'valid' elements, i. e. where the condition is met. Lets call this function triangle
You can then compute the triangle list to some extent with take n (triangle ...) and from this n elements you can find the minium.

Haskell - Prime Powers Excercise - Infinite merges

At university my task is the following :
define the following function:
primepowers :: Integer -> [Integer]
that calculates the infinite list of the first n powers of the prime numbers for a given parameter n, sorted asc.
That is,
primepowers n contains in ascending order the elements of
{p^i | p is prime, 1≤i≤n}.
After working on this task I came to a dead end. I have the following four functions:
merge :: Ord t => [t] -> [t] -> [t]
merge [] b = b
merge a [] = a
merge (a:ax) (b:bx)
| a <= b = a : merge ax (b:bx)
| otherwise = b : merge (a:ax) bx
primes :: [Integer]
primes = sieve [2..]
where sieve [] = []
sieve (p:xs) = p : sieve (filter (not . multipleOf p) xs)
where multipleOf p x = x `mod` p == 0
powers :: Integer -> Integer -> [Integer]
powers n num = map (\a -> num ^ a) [1..n]
primepowers :: Integer -> [Integer]
primepowers n = foldr merge [] (map (powers n) primes)
I think that they work independently, as I have tested with some sample inputs.
merge merges two ordered lists to one ordered list
primes returns infinite list of prime numbers
powers calculates n powers of num (that is num^1 , num^2 ... num^n)
I try to merge everything in primepowers, but functions are not evaluated nothing happens respectively theres some kind of infinite loop.
I am not interested in optimization of primes or powers. Just I don't understand why that does not work. Or is my approach not good, not functional, not haskell?
I suspect the problem is: primes is an infinite list. Therefore, map (powers n) primes is an infinite list of (finite) lists. When you try to foldr merge [] them all together, merge must evaluate the head of each list...
Since there are an infinite number of lists, this is an infinite loop.
I would suggest transposing the structure, something like:
primepowers n = foldr merge [] [map (^i) primes | i <- [1..n]]
While you can probably not use this for your assignment, this can be solved quite elegantly using the primes and data-ordlist packages from Hackage.
import Data.List.Ordered
import Data.Numbers.Primes
primePowers n = mergeAll [[p^k | k <- [1..n]] | p <- primes]
Note that mergeAll is able to merge an infinite number of lists because it assumes that the heads of the lists are ordered in addition to the lists themselves being ordered. Thus, we can easily make this work for infinite powers as well:
allPrimePowers = mergeAll [[p^k | k <- [1..]] | p <- primes]
The reason why your program runs into an infinite loop is that you are trying to merge infinitely many lists only by using the invariant that each list is sorted in the ascending order. Before the program can output “2,” it has to know that none of the lists contains anything smaller than 2. This is impossible because there are infinitely many lists.
You need the following function:
mergePrio (h : l) r = h : merge l r

Resources