Using fold* to grow a list in Haskell - haskell

I'm trying to solve the following problem in Haskell: given an integer return the list of its digits. The constraint is I have to only use one of the fold* functions (* = {r,l,1,l1}).
Without such constraint, the code is simple:
list_digits :: Int -> [Int]
list_digits 0 = []
list_digits n = list_digits r ++ [n-10*r]
  where
    r = div n 10
But how do I use fold* to, essentially grow a list of digits from an empty list?
Thanks in advance.

Is this a homework assignment? It's pretty strange for the assignment to require you to use foldr, because this is a natural use for unfoldr, not foldr. unfoldr :: (b -> Maybe (a, b)) -> b -> [a] builds a list, whereas foldr :: (a -> b -> b) -> b -> [a] -> b consumes a list. An implementation of this function using foldr would be horribly contorted.
import Data.List (unfoldr)

listDigits :: Int -> [Int]
listDigits = unfoldr digRem
    where digRem x
              | x <= 0 = Nothing
              | otherwise = Just (x `mod` 10, x `div` 10)
In the language of imperative programming, this is basically a while loop. Each iteration of the loop appends x `mod` 10 to the output list and passes x `div` 10 to the next iteration. In, say, Python, this'd be written as
def list_digits(x):
    output = []
    while x > 0:
        output.append(x % 10)
        x = x // 10
    return output
But unfoldr allows us to express the loop at a much higher level. unfoldr captures the pattern of "building a list one item at a time" and makes it explicit. You don't have to think through the sequential behaviour of the loop and realise that the list is being built one element at a time, as you do with the Python code; you just have to know what unfoldr does. Granted, programming with folds and unfolds takes a little getting used to, but it's worth it for the greater expressiveness.
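As a small illustration of the unfold described above, here is a minimal sketch, assuming non-negative input. The names digitsLsf and digitsMsf are hypothetical and not part of the original answer. Note that an unfold over mod/div produces the least significant digit first, so a reverse is needed to match the order returned by list_digits in the question.

```haskell
import Data.List (unfoldr)

-- Least significant digit first: each step emits x `mod` 10
-- and continues with x `div` 10 until nothing is left.
digitsLsf :: Int -> [Int]
digitsLsf = unfoldr step
  where
    step x
      | x <= 0    = Nothing
      | otherwise = Just (x `mod` 10, x `div` 10)

-- Most significant digit first, matching list_digits from the question.
digitsMsf :: Int -> [Int]
digitsMsf = reverse . digitsLsf
```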
If your assignment is marked by machine and it really does require you to type the word foldr into your program text, (you should ask your teacher why they did that and) you can play a sneaky trick with the following "id[]-as-foldr" function:
obfuscatedId = foldr (:) []
listDigits = obfuscatedId . unfoldr digRem

Though unfoldr is probably what the assignment meant, you can write this using foldr if you use foldr as a hylomorphism, that is, building up one list while it tears another down.
digits :: Int -> [Int]
digits n = snd $ foldr go (n, []) places where
    places = replicate num_digits ()
    num_digits | n > 0     = 1 + floor (logBase 10 $ fromIntegral n)
               | otherwise = 0
    go () (n, ds) = let (q,r) = n `quotRem` 10 in (q, r : ds)
Effectively, what we're doing here is using foldr as "map-with-state". We know ahead of time
how many digits we need to output (using log10) just not what those digits are, so we use
unit (()) values as stand-ins for those digits.
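One caveat with counting digits via logBase is floating-point rounding: for exact powers of ten the result can land just below the integer (in GHC, logBase 10 1000 is commonly reported as 2.9999999999999996), which would drop a digit. The following is a sketch of the same "map-with-state" fold that counts the places by repeated division instead. The name digitsFold is hypothetical, and the quotients themselves stand in for the unit placeholders.

```haskell
-- Same foldr-with-state idea, but the number of places comes from
-- repeated integer division rather than from logBase.
digitsFold :: Int -> [Int]
digitsFold n = snd (foldr go (n, []) places)
  where
    -- One placeholder per decimal digit; empty for n <= 0.
    places = takeWhile (> 0) (iterate (`div` 10) n)
    -- Ignore the placeholder, peel one digit off the state.
    go _ (m, ds) = let (q, r) = m `quotRem` 10 in (q, r : ds)
```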
If your teacher's a stickler for just having a foldr at the top-level, you can get
away with making go partial:
digits' :: Int -> [Int]
digits' n = foldr go [n] places where
    places = replicate num_digits ()
    num_digits | n > 0     = floor (logBase 10 $ fromIntegral n)
               | otherwise = 0
    go () (n:ds) = let (q,r) = n `quotRem` 10 in (q:r:ds)
This has slightly different behaviour on non-positive numbers:
>>> digits 1234567890
[1,2,3,4,5,6,7,8,9,0]
>>> digits' 1234567890
[1,2,3,4,5,6,7,8,9,0]
>>> digits 0
[]
>>> digits' 0
[0]
>>> digits (negate 1234567890)
[]
>>> digits' (negate 1234567890)
[-1234567890]

Related

Are recursive calls in my "permutations with repetition" code accumulated to clog the RAM?

A bit of background:
I am an amateur programmer, having picked up Haskell a few months ago, in my spare time, after a period of Mathematica programming (my first language). I am currently going through my second Haskell book, by Will Kurt, but I still have miles to go to call myself comfortable around Haskell code. Codeabbey has been my platform for experimentation and learning so far.
I have written a piece of code to generate permutations of a given number, that deals with possible duplicate numbers, so for 588 it will internally generate 588, 858 and 885.
However, because I want to scale to pretty big input numbers (think perhaps even a hundred digits long), I don't want to output the whole list and then perform calculations on it, instead every number that is generated is checked on the spot for a certain property and if it has it, well, we have a winner, the number is returned as output and there's no need to go through the rest of the humongous list. If sadly no desired number is found and we unsuccessfully go through all possible permutations, it outputs a "0".
I have also opted to make it a command line program to feed values to it via gnu parallel for faster work.
So here is the code
import System.Environment
import Data.List
toDigits :: Integer -> [Integer]
toDigits n = map (\n -> read [n]) (show n)
fromDigits :: Integral a => [a] -> Integer
fromDigits list = fromDigitsHelperFunction list 0
fromDigitsHelperFunction :: Integral a => [a] -> Integer -> Integer
fromDigitsHelperFunction [] acc = acc
fromDigitsHelperFunction (x:[]) acc = (fromIntegral x) + acc
fromDigitsHelperFunction digits@(x:xs) acc = fromDigitsHelperFunction xs (acc + ((fromIntegral x) * 10 ^((length digits) - 1 )))
testPermutationsWithRepetition :: ([Integer],Int,[Int],[(Int,Integer)]) -> [Integer]
testPermutationsWithRepetition (digits, index, rotationMap, registeredPositions)
  | index == 0 && rotationMap !! index == 0 = [0,0,0] --finish state (no more recursion). Nothing more to do
  | index == digitsLength - 1 && beautyCheck (fromDigits digits) = digits
  | index == digitsLength - 1 = testPermutationsWithRepetition (digits, index-1, rotationMap, registeredPositions)
  | not ((index,digits!!index) `elem` registeredPositions) = testPermutationsWithRepetition (digits, index+1, rotationMap, (index,digits!!index):registeredPositions)
  | rotationMap!!index == 0 = testPermutationsWithRepetition (digits, index-1, restoredRotMap, restoredRegPositions)
  | rotationMap!!index > 0 && (index,digits!!index) `elem` registeredPositions = testPermutationsWithRepetition (shiftLDigits, index, subtractRot, registeredPositions)
  where digitsLength = length digits
        shiftLDigits = (fst splitDigits) ++ (tail $ snd splitDigits) ++ [head $ snd splitDigits]
        splitDigits = splitAt index digits
        restoredRotMap = (fst splitRotMap) ++ [digitsLength - index] ++ (tail $ snd splitRotMap)
        splitRotMap = splitAt index rotationMap
        restoredRegPositions = filter (\pos -> fst pos < index) registeredPositions --clear everything below the parent index
        subtractRot = (fst splitRotMap) ++ [(head $ snd splitRotMap) - 1] ++ (tail $ snd splitRotMap)
--Frontend function for testing permutations by inputting a single parameter (the number in digit form)
testPermsWithRep :: [Integer] -> [Integer]
testPermsWithRep digits = testPermutationsWithRepetition (digits, 0, [length $ digits, (length $ digits) -1 .. 1], [])
main :: IO ()
main = do
  args <- getArgs
  let number = read (head args) :: Integer
  let checkResult = fromDigits $ testPermsWithRep $ toDigits number
  print checkResult
It's really a sequential process with an index variable that points to a certain number on the digit list and performs a recursive call on that list based on my rules. The function tracks its progress through the digit list by recording the digits visited in certain positions so far (to avoid following already visited paths again) until it gets to the last digit (index == length - 1). If the number that we get there passes the beauty check, it exits with the number produced.
Now, in a Mathematica (or I guess any imperative language) I would probably implement this with a While loop and Cases for its checks, and by the logic of the program, however long it took to compute (generate the permutations and check them for validity) it would take a moderate amount of memory, just enough to hold the list of "registeredPositions" really (you could call it the record of visited digits in specific positions, so it's a variable list as we go deeper in index but gets cleaned up as we move back up). However in this case, the recursive calls stack up as it seems and the whole thing acts as a fork bomb for sufficiently large numbers (e.g 27777772222222222222222223333) and eventually crashes. Is this behaviour something that can be handled differently in Haskell or is there no way to avoid the recursion and memory hogging?
I really like Haskell because the programs make logical sense, but I would like to use it also for cases like this where performance (and resources) matters.
As a side note, my brother pointed to this Algorithm to print all permutations with repetition of numbers in C that is reasonably fast (only generates a list though) and most importantly has minimal memory footprint, although I can tell there's also recursion used in it. Other than that I'm clueless when it comes to C and I would like to stick to Haskell, if it can do what I want at the end of the day, that is.
Any help is welcome. Have a good day!
Edit:
Per Soleil's suggestion I update my post with additional info provided in the comments. Specifically:
After compiling with "ghc checking_program.hs" I run the program with "./checking_program 27777772222222222222222223333". On an i5 3470 with 4GB RAM it runs for about 10 minutes and exits with a segmentation fault. On my brother's 32GB machine he let it run until it took up 20GB of RAM. No need to go further I guess. My tests were on Ubuntu via Win10 WSL. His is bare Linux.
testPermsWithRep is just a front end for testPermutationsWithRepetition, so that I can only provide the number and testPermsWithRep creates the initial parameters and calls testPermutationsWithRepetition with those. It outputs exactly what testPermutationsWithRepetition outputs, either a number (in digit form) that passes the test, or [0,0,0]. Now the test, the beautyCheck function is simply a test for single digit divisors of that number, that returns True or False. I didn't include it because it really is inconsequential. It could even be just a "bigger than x number" test.
As an example, calling "testPermsWithRep [2,6,7,3]" will call "testPermutationsWithRepetition ([2,6,7,3], 0, [4,3,2,1],[])" and whatever comes out of that function, testPermsWithRep will return that as well.
The performance issue with your program doesn't have anything to do with recursion. Rather, you seem to be running up against an accumulation of a partially evaluated, lazy data structure in your rotation map. Your program will run in constant memory if you use the deepseq package to fully force evaluation of the restoredRotMap:
-- Install the `deepseq` package and add this import
import Control.DeepSeq
-- And then change this one case
... | rotationMap!!index == 0 = restoredRotMap `deepseq`
          testPermutationsWithRepetition (digits, index-1, restoredRotMap, restoredRegPositions)
Compiled with ghc -O2 and using beautyCheck _ = False, this runs with a fixed resident memory usage of about 6 megs.
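If adding a dependency is not an option, a list of plain numbers can be forced with seq alone. This is a minimal sketch, not a drop-in fix for the program above: forceList is a hypothetical helper, and it assumes the elements are fully evaluated once they reach weak head normal form (true for Int and Integer).

```haskell
-- Walk the whole list, forcing the spine and every element to weak head
-- normal form, then return the list itself.
forceList :: [a] -> [a]
forceList xs = foldr seq () xs `seq` xs
```

As with deepseq, this only helps if the forced value is itself demanded before the recursive call, for example by placing it on the left of a seq.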
Some other performance targets:
You might want to replace most of your Integer types with Int, as this will be faster. I think you only need Integer for the input to toDigits and the output of fromDigits, and everything else can be Int, since it's all indexes and digits.
An even bigger win will be to replace your rotation map and registered positions with better data structures. If you find yourself splicing up lists with lots of listpart1 ++ [x] ++ listpart2 calls, there are going to be enormous performance costs to that, and the linear lookups with (!!) aren't helping either.
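To make the data structure suggestion concrete, here is a sketch using Data.Sequence, assuming the containers package (which ships with GHC) is available. The helper names are hypothetical. Lookup and single-element update on a Seq take logarithmic time, instead of the linear splitting and re-appending done with lists.

```haskell
import qualified Data.Sequence as Seq
import Data.Sequence (Seq)

-- Decrement the rotation counter at one index without splitting anything.
decrementAt :: Int -> Seq Int -> Seq Int
decrementAt = Seq.adjust' (subtract 1)

-- Overwrite the counter at one index.
resetAt :: Int -> Int -> Seq Int -> Seq Int
resetAt = Seq.update

-- Indexed lookup, replacing (!!). Like (!!), it fails on a bad index.
counterAt :: Int -> Seq Int -> Int
counterAt i s = Seq.index s i
```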
So I am not 100% sure of this and I am also not 100% sure I understand your code.
But as far as I understand you are generating permutations without duplicates and then you are checking for some predicate wanting whatever single number that fulfils it.
I think it should help to use as many of the prelude functions as possible because afaik then the compiler understands it can optimize recursion into a loop. As a rule of thumb I was taught to avoid explicit recursion as much as possible and instead use prelude functions like map, filter and fold. Mainly you avoid reinventing the wheel this way but there also should be a higher chance of the compiler optimizing things.
So to solve your problem try generating a list of all permutations, then filter it using filter and then just do take 1 if you want the result that is found first. Because of Haskell's lazy evaluation take 1 makes it so that we are interested only in the first x in (x:xs) that a filter would return. Therefore filter will keep dropping elements from the, again lazily evaluated, list of permutations and when it finds one it stops.
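The "filter, then take the first hit" idea can be sketched as follows. The names firstMatch and firstStartingWith3 are hypothetical, and Data.List.find offers the same behaviour as firstMatch out of the box.

```haskell
import Data.List (permutations)
import Data.Maybe (listToMaybe)

-- First element satisfying the predicate, or Nothing. The list is only
-- evaluated as far as needed to find that element.
firstMatch :: (a -> Bool) -> [a] -> Maybe a
firstMatch p = listToMaybe . filter p

-- Hypothetical example: the first permutation whose leading digit is 3.
firstStartingWith3 :: [Int] -> Maybe [Int]
firstStartingWith3 = firstMatch ((== [3]) . take 1) . permutations
```

Because evaluation stops at the first hit, firstMatch even terminates on an infinite list as long as a match exists.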
I found a permutation implementation on https://rosettacode.org/wiki/Permutations#Haskell
and used it to try this call:
take 1 $ filter ((> 67890123456789012345) . fromDigits) $ permutations' $ toDigits 12345678901234567890
it has been running for like 20 minutes now and RAM usage has stayed around 230 MB.
I hope that has answered/helped you at least in some way.
+ a bonus tip: you can simplify your fromDigits to this beautiful thing:
fromDigits :: Integral a => [a] -> Integer
fromDigits = foldl shiftAndAdd 0
where shiftAndAdd acc d = 10 * acc + fromIntegral d
EDIT:
I read some more of the comments and I see you care about ignoring duplicates but I am afraid you'll have to go smarter about that, since if I understand correctly your implementation still generates all the duplicates it only throws them away after checking if they are in a list (which has O(n) complexity). And when you only care about finding one permutation that fits your predicate you drop the not fitting ones anyway.
And people have already correctly pointed out that !! is generally also very bad.
Thanks to everyone for your helpful answers and comments.
@lordQuick permutations used with filter is still terrible but that fromDigits code is a beauty, so I used it.
@k-a-buhr That's exactly what I did yesterday, also per others' suggestions: I replaced all use of !! and ++. When I did the latter all memory problems disappeared. Wow! I mean I knew ++ is bad I just didn't realise how bad! We're talking orders of magnitude bad! 3M of RAM vs several GB. Also, valid point about integers. I will try that.
Oh, also a very important thing. I replaced recursive calls with until. This is the approach I would have followed in Mathematica (a NestWhile function to be exact), and I'm glad I found it in Haskell. It seemed to make things a bit faster too.
Anyway, the revised code, that solves my memory issues is here for anyone if interested.
{-compiled with "ghc -Rghc-timing -O2 checking_program_v3.hs"-}
import System.Environment
import Data.List
--A little help with triples
fstOfThree (a, _, _) = a
sndOfThree (_, b, _) = b
thrOfThree (_, _, c) = c
--And then some with quads
fstOfFour (a, _, _, _) = a
sndOfFour (_, b, _, _) = b
thrOfFour (_, _, c, _) = c
--This function is a single pass test for single digit factors
--It will be called as many times as needed by pryForSDFactors
trySingleDigitsFactors :: (Bool, Integer, [Integer]) -> (Bool, Integer, [Integer])
trySingleDigitsFactors (True, n, f) = (True, n, f)
trySingleDigitsFactors (b, n, []) = (b, n, [])
trySingleDigitsFactors (b, n, (f:fs))
  | mod n f == 0 = (True, div n f, fs)
  | otherwise = trySingleDigitsFactors (False, n, fs)
--This function will take a number and repeatedly divide by single digits till it gets to a single digit if possible
--Then it will return True
pryForSDFactors :: Integer -> Bool
pryForSDFactors n
  | sndOfThree sdfTry < 10 = True
  | fstOfThree sdfTry == True = pryForSDFactors $ sndOfThree sdfTry
  | otherwise = False
  where sdfTry = trySingleDigitsFactors (False, n, [7,5,3,2])
toDigits :: Integer -> [Integer]
toDigits n = map (\n -> read [n]) (show n)
fromDigits :: Integral a => [a] -> Integer
fromDigits = foldl shiftAndAdd 0
where shiftAndAdd acc d = 10 * acc + fromIntegral d
replaceElementAtPos :: a -> Int -> [a] -> [a]
replaceElementAtPos newElement pos [] = []
replaceElementAtPos newElement 0 (x:xs) = newElement:xs
replaceElementAtPos newElement pos (x:xs) = x : replaceElementAtPos newElement (pos-1) xs
checkPermutationsStep :: ([Integer],Int,[Int],[(Int,Integer)]) -> ([Integer],Int,[Int],[(Int,Integer)])
checkPermutationsStep (digits, index, rotationMap, registeredPositions)
  | index == digitsLength - 1 = (digits, index-1, rotationMap, registeredPositions)
  | not ((index, digitAtIndex) `elem` registeredPositions) = (digits, index+1, rotationMap, (index,digitAtIndex):registeredPositions)
  | rotationAtIndex == 0 = (digits, index-1, restoredRotMap, restoredRegPositions)
  | rotationAtIndex > 0 && (index, digitAtIndex) `elem` registeredPositions = (shiftLDigits, index, subtractRot, registeredPositions)
  where digitsLength = length digits
        digitAtIndex = head $ drop index digits
        rotationAtIndex = head $ drop index rotationMap
        --restoredRotMap = (fst splitRotMap) ++ [digitsLength - index] ++ (tail $ snd splitRotMap)
        restoredRotMap = replaceElementAtPos (digitsLength - index) index rotationMap
        --splitRotMap = splitAt index rotationMap
        restoredRegPositions = filter (\pos -> fst pos < index) registeredPositions --clear everything below the parent index
        shiftLDigits = (fst splitDigits) ++ (tail $ snd splitDigits) ++ [head $ snd splitDigits]
        splitDigits = splitAt index digits
        --subtractRot = (fst splitRotMap) ++ [(head $ snd splitRotMap) - 1] ++ (tail $ snd splitRotMap)
        subtractRot = replaceElementAtPos (rotationDigitAtIndex - 1) index rotationMap
        rotationDigitAtIndex = head $ drop index rotationMap
checkConditions :: ([Integer],Int,[Int],[(Int,Integer)]) -> Bool
checkConditions (digits, index, rotationMap, registeredPositions)
  | (index == 0 && rotationAtIndex == 0) || ((index == (length digits) - 1) && pryForSDFactors (fromDigits digits)) = True
  | otherwise = False
  where rotationAtIndex = head $ drop index rotationMap
testPermsWithRep :: Integer -> Integer
testPermsWithRep n
  | sndOfFour computationResult == 0 && (head . thrOfFour) computationResult == 0 = 0
  | otherwise = (fromDigits . fstOfFour) computationResult
  where computationResult = until checkConditions checkPermutationsStep (digitsOfn, 0 , [digitsLength, digitsLength -1 .. 1], [])
        digitsOfn = toDigits n
        digitsLength = length digitsOfn
main :: IO ()
main = do
  args <- getArgs
  let inputNumber = read (head args) :: Integer
  let checkResult = testPermsWithRep inputNumber
  print checkResult
Now, bear in mind that this code, as I've mentioned, checks for a condition of each generated permutation (single digit factors) on the spot, and moves on if False, but it's pretty easy to repurpose it for output list generation.
Sure it's now just inefficient in terms of big O complexity (scales terribly), and I was at first thinking of replacing lists with Data.Map because that's what I've learned so far (though not so comfortable with maps yet).
I've also read that there's a more efficient replacement for read since that's also called a lot for numbers-to-digits conversions.
@lordQuick I don't know about HashMaps or vectors yet but I'm still learning. Every little optimization will pay off in computation time because this is my first piece of "practical" code, not just Codeabbey credit.
Cheers!
Here is a solution using a more efficient, insertion-based algorithm to compute unique permutations:
import Data.List
permutationsNub :: Eq a => [a] -> [[a]]
permutationsNub = foldr (concatMap . insert) [[]]
  where insert y = foldr combine [[y]] . (zip <*> tail . tails)
          where combine (x, xs) xss = (y : x : xs) :
                  if y == x then [] else map (x :) xss
headDef :: a -> [a] -> a
headDef x [] = x
headDef x (h : t) = h
fromDigits :: Integral a => [a] -> Integer
fromDigits = foldl1' ((+) . (10 *)) . map fromIntegral
toDigits :: Integer -> [Int]
toDigits = map (read . pure) . show
firstValidPermutation :: (Integer -> Bool) -> Integer -> Integer
firstValidPermutation p =
  headDef 0 .
  filter p .
  map fromDigits .
  permutationsNub .
  toDigits
The basic idea is that, given the unique permutations of a list's tail, we can compute the unique permutations of the whole list by inserting its head into all of the tail's permutations, in every position that doesn't follow an occurrence of the head (to avoid creating duplicates). From my tests, permutationsNub seems to be faster than permutations from Data.List even when the input contains no repetitions. However, unlike that function, it consumes its input eagerly and thus cannot handle an infinite input. Exercise: Prove this algorithm's correctness.
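Short of a proof, a quick sanity check is possible. The sketch below reproduces the definition above (with the where clauses laid out on separate lines) and compares it against nub . permutations on small inputs. agreesWithNub is a hypothetical name, and agreement on a few inputs is evidence, not a correctness argument.

```haskell
import Data.List (nub, permutations, sort, tails)

-- Definition as given in the answer above.
permutationsNub :: Eq a => [a] -> [[a]]
permutationsNub = foldr (concatMap . insert) [[]]
  where
    insert y = foldr combine [[y]] . (zip <*> tail . tails)
      where
        combine (x, xs) xss =
          (y : x : xs) : if y == x then [] else map (x :) xss

-- Compare against the slow reference: all permutations, duplicates removed.
agreesWithNub :: Ord a => [a] -> Bool
agreesWithNub xs = sort (permutationsNub xs) == sort (nub (permutations xs))
```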
to be continued

How to break a number into a list of digits? [duplicate]

Given an arbitrary number, how can I process each digit of the number individually?
Edit
I've added a basic example of the kind of thing Foo might do.
For example, in C# I might do something like this:
static void Main(string[] args)
{
    int number = 1234567890;
    string numberAsString = number.ToString();
    foreach(char x in numberAsString)
    {
        string y = x.ToString();
        int z = int.Parse(y);
        Foo(z);
    }
}
static void Foo(int n)
{
    Console.WriteLine(n*n);
}
Have you heard of div and mod?
You'll probably want to reverse the list of numbers if you want to treat the most significant digit first. Converting the number into a string is an impaired way of doing things.
135 `div` 10 = 13
135 `mod` 10 = 5
Generalize into a function:
digs :: Integral x => x -> [x]
digs 0 = []
digs x = digs (x `div` 10) ++ [x `mod` 10]
Or in reverse:
digs :: Integral x => x -> [x]
digs 0 = []
digs x = x `mod` 10 : digs (x `div` 10)
This treats 0 as having no digits. A simple wrapper function can deal with that special case if you want to.
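Note that this solution does not work for negative numbers: div rounds toward negative infinity, so (-1) `div` 10 is -1 and the recursion never reaches 0. Take the absolute value first if negative input is possible.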
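The wrapper mentioned above might look like the following sketch. digsWithZero is a hypothetical name, and treating a negative number as its absolute value is an assumption about the desired behaviour, not something the answer specifies.

```haskell
-- Digits, most significant first; 0 yields the empty list.
digs :: Integral x => x -> [x]
digs 0 = []
digs x = digs (x `div` 10) ++ [x `mod` 10]

-- Wrapper: 0 becomes [0], and the sign of a negative input is dropped.
digsWithZero :: Integral x => x -> [x]
digsWithZero 0 = [0]
digsWithZero x = digs (abs x)
```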
digits :: Integer -> [Int]
digits = map (read . (:[])) . show
or you can return it into []:
digits :: Integer -> [Int]
digits = map (read . return) . show
or, with Data.Char.digitToInt:
digits :: Integer -> [Int]
digits = map digitToInt . show
the same as Daniel's really, but point free and uses Int, because a digit shouldn't really exceed maxBound :: Int.
Using the same technique used in your post, you can do:
digits :: Integer -> [Int]
digits n = map (\x -> read [x] :: Int) (show n)
See it in action:
Prelude> digits 123
[1,2,3]
Does that help?
You could also just reuse digits from Hackage.
Textbook unfold
import qualified Data.List as L
digits = reverse . L.unfoldr (\x -> if x == 0 then Nothing else Just (mod x 10, div x 10))
You can use
digits = map (`mod` 10) . reverse . takeWhile (> 0) . iterate (`div` 10)
or for reverse order
rev_digits = map (`mod` 10) . takeWhile (> 0) . iterate (`div` 10)
The iterate part generates an infinite list dividing the argument in every step by 10, so 12345 becomes [12345,1234,123,12,1,0,0..]. The takeWhile part takes only the interesting non-null part of the list. Then we reverse (if we want to) and take the last digit of each number of the list.
I used point-free style here, so you can imagine an invisible argument n on both sides of the "equation". However, if you want to write it that way, you have to substitute the top level . by $:
digits n = map(`mod` 10) $ reverse $ takeWhile (> 0) $ iterate (`div`10) n
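To see the intermediate stages of this pipeline, the sketch below names the iterate part separately (quotients is a hypothetical name) and fixes the type to Int for simplicity.

```haskell
-- Successive quotients: 12345, 1234, 123, 12, 1, 0, 0, ...
quotients :: Int -> [Int]
quotients = iterate (`div` 10)

-- Keep the non-zero prefix, reverse it, then take the last digit of each.
digits :: Int -> [Int]
digits = map (`mod` 10) . reverse . takeWhile (> 0) . quotients
```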
Via list comprehension:
import Data.Char
digits :: Integer -> [Integer]
digits n = [toInteger (digitToInt x) | x <- show n]
output:
> digits 1234567890
[1,2,3,4,5,6,7,8,9,0]
I was lazy to write my custom function so I googled it and tbh I was surprised that none of the answers on this website provided a really good solution – high performance and type safe. So here it is, maybe somebody would like to use it. Basically:
It is type safe - it returns a type checked non-empty list of Word8 digits (all the above solutions return a list of numbers, but it cannot happen that we get [] right?)
This one is performance optimized with tail call optimization, fast concatenation and no need to do any reversing of the final values.
It uses special assignment syntax which in connection to -XStrict allows Haskell to fully do strictness analysis and optimize the inner loop.
Enjoy:
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE Strict #-}
import Data.List.NonEmpty (NonEmpty (..))
import Data.Word (Word8)

digits :: Integral a => a -> NonEmpty Word8
digits = go [] where
    go s x = loop (head :| s) tail where
        head = fromIntegral (x `mod` 10)
        tail = x `div` 10
    loop s@(r :| rs) = \case
        0 -> s
        x -> go (r : rs) x
Here's an improvement on an answer above. This avoids the extra 0 at the beginning ( Examples: [0,1,0] for 10, [0,1] for 1 ). Use pattern matching to handle cases where x < 10 differently:
toDigits :: Integer -> [Integer] -- 12 -> [1,2], 0 -> [0], 10 -> [1,0]
toDigits x
    | x < 10 = [x]
    | otherwise = toDigits (div x 10) ++ [mod x 10]
I would have put this in a reply to that answer, but I don't have the needed reputation points :(
Applicative. Pointfree. Origami. Neat.
Enjoy:
import Data.List
import Data.Tuple
import Data.Bool
import Control.Applicative
digits = unfoldr $ liftA2 (bool Nothing) (Just . swap . (`divMod` 10)) (> 0)
I've been following these steps (based on this comment):
Convert the integer to a string.
Iterate over the string character-by-character.
Convert each character back to an integer, while appending it to the end of a list.
toDigits :: Integer -> [Integer]
toDigits a = [(read([m])::Integer) | m<-show(a)]
main = print(toDigits(1234))
For returning a list of [Integer]
import Data.Char
toDigits :: Integer -> [Integer]
toDigits n = map (\x -> toInteger (digitToInt x)) (show n)
The accepted answer is great but fails in cases of negative numbers since mod (-1) 10 evaluates to 9. If you would like this to handle negative numbers properly (which may not be the case), the following code will allow for it.
digs :: Int -> [Int]
digs 0 = []
digs x
    | x < 0 = digs ((-1) * x)
    | x > 0 = digs (div x 10) ++ [mod x 10]
The accepted answer is correct except that it will output an empty list when input is 0, however I believe the output should be [0] when input is zero.
And I don't think it deals with the case when the input is negative. Below is my implementation, which solves the above two problems.
toDigits :: Integer -> [Integer]
toDigits n
    | n >= 0 && n < 10 = [n]
    | n >= 10 = toDigits (n`div`10) ++ [n`mod`10]
    | otherwise = error "make sure your input is greater than 0"
I would like to improve upon the answer of Dave Clarke in this page. It boils down to using div and mod on a number and adding their results to a list, only this time it won't appear reversed, nor resort to ++ (which is slower concatenation).
toDigits :: Integer -> [Integer]
toDigits n
    | n <= 0 = []
    | otherwise = numToDigits (n `mod` 10) (n `div` 10) []
    where
        numToDigits a 0 l = (a:l)
        numToDigits a b l = numToDigits (b `mod` 10) (b `div` 10) (a:l)
This program was a solution to a problem in the CIS 194 course at UPenn that is available right here. You divide the number to find its result as an integer and the remainder as another. You pass them to a function whose third argument is an empty list. The remainder will be added to the list in case the result of division is 0. The function will be called again in case it's another number. The remainders will add in order until the end.
Note: this is for numbers, which means that zeros to the left won't count, and it will allow you to have their digits for further manipulation.
digits = reverse . unfoldr go
    where go = uncurry (*>) . (&&&) (guard . (>0)) (Just . swap . (`quotRem` 10))
I tried to keep using tail recursion
toDigits :: Integer -> [Integer]
toDigits x = reverse $ toDigitsRev x
toDigitsRev :: Integer -> [Integer]
toDigitsRev x
    | x <= 0 = []
    | otherwise = x `rem` 10 : toDigitsRev (x `quot` 10)

Project Euler 50: Algorithm is incredibly slow, failing to understand why

I'm using Project Euler to learn Haskell. I'm new at Haskell and am having a lot of trouble coming up with an algorithm that doesn't take an absurd amount of time. I'm estimating that the program here would take 14 gigayears to arrive at the solution.
The problem:
Which prime, below one-million, can be written as the sum of the most
consecutive primes?
Here's my source. I've left out isPrime. I've posted it because it's far too inefficient to solve the problem. I think the issue lies with the slicedChains and primeChains calls, but I'm not sure what it is. I've resolved this before with C++. But for whatever reason, the efficient solution seems beyond me in Haskell.
Edit: I've included isPrime.
import System.Environment (getArgs)
import Data.List (nub,maximumBy)
import Data.Ord (comparing)
isPrime :: Integer -> Bool
isPrime 1 = False
isPrime 2 = True
isPrime x
    | any (== 0) (fmap (x `mod`) [2..x-1]) = False
    | otherwise = True
primeChain :: Integer -> [Integer]
primeChain x = [ n | n <- 1 : 2 : [3,5..x-1], isPrime n ]
slice :: [a] -> [Int] -> [a]
slice xs args = take (to - from + 1) (drop from xs)
    where from = head args
          to = last args
subsequencesOfSize :: Int -> [a] -> [[a]]
subsequencesOfSize n xs = let l = length xs
                          in if n>l then [] else subsequencesBySize xs !! (l-n)
    where
        subsequencesBySize [] = [[[]]]
        subsequencesBySize (x:xs) = let next = subsequencesBySize xs
                                    in zipWith (++) ([]:next) (map (map (x:)) next ++ [[]])
slicedChains :: Int -> [Integer] -> [[Integer]]
slicedChains len xs = nub [x | x <- fmap (xs `slice`) subseqs, length x > 1]
    where subseqs = [x | x <- (subsequencesOfSize 2 [1..len]), (last x) > (head x)]
primeSums :: Integer -> [[Integer]]
primeSums x = filter (\ns -> sum ns == x) chain
    where xs = primeChain x
          len = length xs
          chain = slicedChains len xs
compLength :: [[a]] -> [a]
compLength xs = maximumBy (comparing length) xs
cleanSums :: [Integer] -> [[Integer]]
cleanSums xs = fmap (compLength) filtered
    where filtered = filter (not . null) (fmap primeSums xs)
main :: IO()
main = do
    args <- getArgs
    let arg = read (head args) :: Integer
    let xs = primeChain arg
    print $ maximumBy (comparing length) $ cleanSums xs
Your basic problem is that you are not pruning your search space based on the best solution you have found so far.
I can tell this just from the fact that you are using maximumBy to find the longest sequence.
For instance, if during your search you find a consecutive sequence of 4 primes whose sum is a prime < 10^6, you don't have to examine any sequence which begins with a prime greater than 250000.
To do this kind of pruning you have to keep track of the solution found so far and interleave the testing of candidate sequences with their generation so that the best solution found so far can stop the search early.
Update
There are several inefficiencies in slicedChains. Haskell lists are implemented as linked lists. This video is a pretty good overview of linked lists and how they differ from arrays: (link)
The following expressions in your code are going to be problematic w.r.t. efficiency:
* nub has quadratic running time
* length x > 1 - the complexity of length is O(n) where n is the length of the list. A better way to write this is:
lengthGreaterThan1 :: [a] -> Bool
lengthGreaterThan1 (_:_:_) = True
lengthGreaterThan1 _ = False
* subsequencesOfSize 2 [1..len] may be more succinctly written:
[ [a,b] | a <- [1..len], b <- [a+1..len] ]
and this will also ensure that a < b.
* The take and drop calls in slice are also O(n)
* In primeSums the call to primeChain will regenerate essentially the same list over and over again resulting in a lot of multiple calls to isPrime. A better approach is to define primeChain like this:
allPrimes = filter isPrime [1..]
primeChain x = takeWhile (<= x) allPrimes
The list allPrimes will be generated once, and primeChain simply takes prefixes of that list.
* primeSums x is charged with finding sequences whose sum is exactly x, but it looks at a lot of sequences that can't possibly work. For instance, primeSums 31 will examine:
11 + 13 + 17, 11 + 13 + 17 + 23, 11 + 13 + 17 + 23 + 29,
17 + 19, 17 + 19 + 23, 17 + 19 + 23 + 29,
19 + 23, 19 + 23 + 29
23 + 29
even though it's pretty obvious that none of these sums could equal 31.
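One way to avoid examining hopeless sequences is to grow each chain's running sum and stop as soon as it exceeds the target. This is a minimal sketch of that pruning, not the original poster's code: chainsSummingTo is a hypothetical name, and it assumes the primes are supplied as a finite ascending list.

```haskell
import Data.List (tails)

-- All runs of consecutive list elements whose sum is exactly the target.
-- For each starting point the running sums are cut off once they pass
-- the target, so longer runs from that start are never built.
chainsSummingTo :: Int -> [Int] -> [[Int]]
chainsSummingTo target primes =
  [ take k chain
  | chain <- tails primes
  , (k, total) <- zip [1 ..] (takeWhile (<= target) (scanl1 (+) chain))
  , total == target
  ]
```

With the primes up to 41 as input, a target of 41 yields the six-term chain starting at 2, the chain 11 + 13 + 17, and 41 on its own.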
So the first thing you need is a good data structure: Once you find a sequence of length n you don't care about sequences of shorter length, so your primary needs are: (1) tracking the sum, (2) tracking the primes in the set, (3) removing the least element, (4) adding a new greatest element. The key is amortization, where a big cost is paid infrequently enough that you can pretend it is a small cost per procedure. The data structure looks like this:
data Queue x = Q [x] [x]
q_empty (Q [] []) = True
q_empty _ = False
q_headtails (Q (x:xs) rest) = (x, Q xs rest)
q_headtails (Q [] xs) = case reverse xs of
  y:ys -> (y, Q ys [])
  [] -> error "End of queue."
q_append el (Q beg end) = Q beg (el:end)
So deconstructing the queue is always possible, but it sometimes triggers an O(n) reverse. That's OK: when it does, we won't have to do it again for another n steps, so it averages out to about one operation per step. (You might also want to do it with a spine-strict list.)
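A minimal, self-contained sketch of the two-list queue described above, to show where the occasional O(n) reverse happens. The names emptyQueue, push, pop and toList are illustrative and not the answer's own (it uses q_append and q_headtails), and pop returns a Maybe instead of calling error.

```haskell
-- Two-list FIFO queue: elements are popped from the front list
-- and pushed onto the back list (stored newest first).
data Queue a = Queue [a] [a]

emptyQueue :: Queue a
emptyQueue = Queue [] []

-- O(1): cons onto the back list.
push :: a -> Queue a -> Queue a
push x (Queue front back) = Queue front (x : back)

-- Amortised O(1): the reverse only runs when the front list is empty,
-- and each element takes part in at most one reverse.
pop :: Queue a -> Maybe (a, Queue a)
pop (Queue (x:front) back) = Just (x, Queue front back)
pop (Queue [] back) = case reverse back of
  []        -> Nothing
  (x:front) -> Just (x, Queue front [])

-- Drain a queue into a list, oldest element first.
toList :: Queue a -> [a]
toList q = case pop q of
  Nothing      -> []
  Just (x, q') -> x : toList q'
```

For example, toList (foldl (flip push) emptyQueue [2,3,5,7]) gives the elements back in insertion order, with a single reverse on the first pop.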
To save on length operations and summing the items of the list you probably want to cache those, too:
type Length = Int
type Sum = Int
type Prime = Int
data PrimeSeq = PS Length Sum (Queue Prime)
headTails (PS len sum q) = (x, PS (len - 1) (sum - x) xs)
  where (x, xs) = q_headtails q
append x (PS len sum xs) = PS (len + 1) (sum + x) (q_append x xs)
The algorithm for these looks like:
* Cache a copy of the PrimeSeq you're starting with.
* Keep appending primes to it, testing the running sum for primality, until the sum reaches 10^6.
* If you find a prime sum with a longer sequence than the cached one, replace the cache.
* Whenever you run into 10^6, revert to the cache, pull a prime off the front of the queue, then repeat as needed.
Your prime generation is quadratic (isPrime 101 tests rem 101 100 == 0 even though 10 is the biggest number by which 101 needs to be tested -- and actually 7 is enough).
Yet even with it, simple list-based code finds the answer in under 2 seconds (on an Intel Core i7 2.5 GHz, interpreted in GHCi). With the code corrected to take advantage of the above-mentioned optimization (and, additionally, testing by primes only), it takes 0.1s.
Also, f x | t = False | otherwise = True is the same as f x = not t.
We are asked by the PE site not to give you even a hint.
But in general, the key to efficiency in Haskell, thanks to its laziness, is being generative with as small a duplication of effort as possible. As one example, instead of calculating each slice of a list in isolation starting anew, we can produce the bunch of them together as part of one process,
slices :: Int -> [a] -> [[a]]
slices n = map (take n) . iterate tail -- sequence of list's slices of length n each
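One caveat with slices as written: it is meant for infinite lists such as a list of primes. On a finite list, iterate tail eventually reaches tail [], which raises an error when forced, and the last few slices are shorter than n. A possible variant for finite input, sketched under the assumption that only full-length slices are wanted (the name slicesFinite is made up for this example):

```haskell
import Data.List (tails)

-- Like slices, but stops at the last full-length window, so it is
-- safe on finite lists and still lazy enough for infinite ones.
slicesFinite :: Int -> [a] -> [[a]]
slicesFinite n = takeWhile ((== n) . length) . map (take n) . tails
```

Here slicesFinite 2 [1,2,3,4] yields [[1,2],[2,3],[3,4]], and take 3 (slicesFinite 2 [1..]) yields the same three slices.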
Another principle is, try to solve a more general problem, of which yours is an instance.
Having written such a function, we can play with it by trying out different values for its parameters, from smaller to the bigger ones, for an exploratory style of problem solving. We're told about 21 consecutive primes. What about 22 of them? 27? 1127 of them? ... and I've said enough about this already.
If it starts taking too much time, we can assess the full solution's needed run time by empirical orders of growth analysis.
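As a rough sketch of what that analysis amounts to: if the run time behaves like k * n^a, two measurements at different problem sizes are enough to estimate the exponent a. The function name orderOfGrowth and the timings in the usage note are hypothetical, not measurements from this thread.

```haskell
-- Estimate the exponent a in  time ~ k * n^a  from two measurements,
-- each given as (problem size, run time). All four numbers must be
-- positive and the two sizes must differ.
orderOfGrowth :: (Double, Double) -> (Double, Double) -> Double
orderOfGrowth (n1, t1) (n2, t2) = logBase (n2 / n1) (t2 / t1)
```

With made-up timings of 0.5s at size 1000 and 2.0s at size 2000, orderOfGrowth (1000, 0.5) (2000, 2.0) comes out at about 2, i.e. roughly quadratic behaviour, which can then be extrapolated to the full problem size.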
Though the solution is found quickly enough with your unoptimized isPrime code, the exploratory process can be prohibitively slow with it, but it is fast enough with the optimized code:
primes :: [Int]
primes = 2 : filter isPrime [3,5..]
isPrime n = and [rem n p > 0 | p <- takeWhile ((<= n).(^2)) primes]

Haskell tail recursion for multi call function

Here is a non-tail-recursive function:
alg :: Int -> Int
alg n = if n<7 then n else alg(n-1) * alg(n-2) * alg(n-4) * alg(n-6)
I've been stuck on this for a while. I get the basic idea of tail recursion and how to apply it to a function with a single recursive call, but I have no clue how to do it for one with multiple calls.
I even came up with this abomination:
algT :: Int -> Int
algT n = tail1 n 0 where tail1 i r = tail1(i-1) r *
tail2 n 0 where tail2 i r = tail2(i-2) r *
tail3 n 0 where tail3 i r = tail3(i-4) r *
tail4 n 0 where tail4 i r = tail4(i-6) r
It doesn't work, and it's obviously not how a recursive function should look. I made a few other attempts, but all of them ended in an infinite loop at 100% CPU load...
Have you looked into Fibonacci in Haskell? It is a similar type of function. BTW, tail recursion isn't quite the right term in Haskell: functions with multiple recursive calls can't really be written tail-recursively, but Haskell's lazy nature makes a similar and more powerful trick possible. Here is the standard one given:
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
Using the same trick on yours gives (EDIT: as a function):
import Data.List (zipWith4)
alg :: Int -> Int
alg n = alg' !! (n - 1)
  where alg' = 1 : 2 : 3 : 4 : 5 : 6 : zipWith4 (\a b c d -> a * b * c * d) (drop 5 alg') (drop 4 alg') (drop 2 alg') alg'
Note that you shouldn't use Int here: it is fixed-width rather than open-ended, and the 11th term will already overflow (wrap around) in an Int.
EDIT: Actually Int is even worse than I thought. Once the result contains 32 factors of 2 you will start getting 0 back, since every answer from then on is 0 mod 2^32.
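A sketch of the same lazy-list idea over Integer, so that nothing wraps around, together with a direct transcription of the question's recurrence to check it against. The names algNaive, algs and algMemo are illustrative.

```haskell
import Data.List (zipWith4)

-- Direct transcription of the question's recurrence, on Integer.
algNaive :: Integer -> Integer
algNaive n
  | n < 7     = n
  | otherwise = algNaive (n-1) * algNaive (n-2) * algNaive (n-4) * algNaive (n-6)

-- Lazy list of all terms: element i (0-based) holds alg (i+1).
-- Each new term multiplies the terms 1, 2, 4 and 6 places behind it.
algs :: [Integer]
algs = 1 : 2 : 3 : 4 : 5 : 6 :
       zipWith4 (\a b c d -> a * b * c * d)
                (drop 5 algs) (drop 4 algs) (drop 2 algs) algs

-- 1-based lookup into the shared list; only meaningful for n >= 1.
algMemo :: Int -> Integer
algMemo n = algs !! (n - 1)
```

Because algs is a top-level value, each term is computed once and then shared, so algMemo n needs only a linear number of multiplications, whereas algNaive recomputes the same subterms exponentially often.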
From your question it's not entirely clear what the purpose of making your function tail-recursive is. If you are trying to reduce CPU/memory usage, then you should use memoization (mentioned in Guvante's answer).
Meanwhile, there is a way to make almost any function tail-recursive, known as continuation-passing style. Your example written in the CPS looks like this:
alg_cps :: Integer -> (Integer -> a) -> a
alg_cps n cont =
  if n < 7
    then cont n
    else alg_cps (n - 1)
      (\x1 -> alg_cps (n - 2)
        (\x2 -> alg_cps (n - 4)
          (\x3 -> alg_cps (n - 6)
            (\x4 -> cont (x1*x2*x3*x4)))))
And to directly get the result you can call it with id as continuation:
alg_cps 20 id
Notice that this does not reduce algorithmic complexity or memory usage compared to the naive non-tail-recursive implementation.
I think I have a solution, but it's not very elegant or pretty.
alg :: Int -> Int
alg n | n < 7 = n
      -- start with exponent 1 for alg n itself and 0 for everything below it
      | otherwise = alg' n (1 : repeat 0)
alg' :: Int -> [Int] -> Int
alg' n [] = error "something has gone horribly wrong"
alg' n l@(x:y)
  | n < 6 = error "something else has gone horribly wrong"
  | n == 6 = product $ zipWith (^) [6,5..1] l
  | otherwise = alg' (n-1) $ zipWith (+) [x,x,0,x,0,x] (y ++ [0])
The idea is that you can keep track of how many times you're supposed to be multiplying each thing without actually doing any of the calculations until the very end. At any given time, you have information about how many times you've needed any of the next 6 values, and once you're below 7, you just raise 1-6 to the proper powers and take their product.
(I haven't actually tested this, but it seems right. And even if it's not I'm pretty sure the idea behind it is sound)
P.S. As @Guvante says, Int isn't a good choice here as it will quickly overflow. As a general rule I use Integer by default and only switch if I have a good reason.
Here is a possible solution.
let f = [1..6] ++ foldr1 (zipWith (*)) [f, drop 2 f, drop 4 f, drop 5 f]
or even:
let f = [1..6] ++ foldr1 (zipWith (*)) (map (flip drop $ f) [0,2,4,5])

How do I get the sums of the digits of a large number in Haskell?

I'm a C++ Programmer trying to teach myself Haskell and it's proving to be challenging grasping the basics of using functions as a type of loop. I have a large number, 50!, and I need to add the sum of its digits. It's a relatively easy loop in C++ but I want to learn how to do it in Haskell.
I've read some introductory guides and am able to get 50! with
sum50fac.hs::
fac 0 = 1
fac n = n * fac (n-1)
x = fac 50
main = print x
Unfortunately at this point I'm not entirely sure how to approach the problem.
Is it possible to write a function that adds (mod) x 10 to a value and then calls the same function again on x / 10 until x / 10 is less than 10? If that's not possible how should I approach this problem?
Thanks!
sumd 0 = 0
sumd x = (x `mod` 10) + sumd (x `div` 10)
Then run it:
ghci> sumd 2345
14
UPDATE 1:
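This one is tail-recursive and threads the running total through an accumulator (when compiling with optimisation, GHC's strictness analysis normally keeps the accumulator from piling up thunks):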
sumd2 0 acc = acc
sumd2 x acc = sumd2 (x `div` 10) (acc + (x `mod` 10))
Test:
ghci> sumd2 2345 0
14
UPDATE 2:
Partially applied version in pointfree style:
sumd2w = (flip sumd2) 0
Test:
ghci> sumd2w 2345
14
I used flip here because partial application supplies arguments from left to right, and sumd2 takes the number first and the accumulator second; flip swaps the two so that the accumulator can be fixed to 0 first.
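To tie this back to the question's 50!, here is a minimal sketch that combines a factorial on Integer with an accumulator-based digit sum. The names factorial and digitSum are illustrative, and seq is used to force the accumulator explicitly instead of relying on the optimiser.

```haskell
-- Factorial on Integer, so 50! does not overflow.
factorial :: Integer -> Integer
factorial n = product [1 .. n]

-- Sum of the decimal digits, tail-recursive with a strict accumulator.
-- abs makes negative inputs behave like their absolute value.
digitSum :: Integer -> Integer
digitSum = go 0 . abs
  where
    go acc 0 = acc
    go acc x =
      let acc' = acc + x `mod` 10   -- add the last digit
      in  acc' `seq` go acc' (x `div` 10)
```

The digit sum asked for in the question is then digitSum (factorial 50), and digitSum 2345 gives 14 as in the tests above.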
Why not just
import Data.Char (digitToInt)
sumd = sum . map digitToInt . show
This is just a variant of @ony's, but how I'd write it:
import Data.List (unfoldr)
digits :: (Integral a) => a -> [a]
digits = unfoldr step . abs
  where step n = if n==0 then Nothing else let (q,r)=n`divMod`10 in Just (r,q)
This will produce the digits from low to high, which, while unnatural for reading, is generally what you want for mathematical problems involving the digits of a number. (Project Euler anyone?) Also note that 0 produces [], and negative numbers are accepted, but produce the digits of the absolute value. (I don't want partial functions!)
If, on the other hand, I need the digits of a number as they are commonly written, then I would use @newacct's method, since the problem is one of essentially orthography, not math:
import Data.Char (digitToInt)
-- The Show constraint is needed on current GHC, where Num no longer implies Show.
writtenDigits :: (Integral a, Show a) => a -> [a]
writtenDigits = map (fromIntegral.digitToInt) . show . abs
Compare output:
> digits 123
[3,2,1]
> writtenDigits 123
[1,2,3]
> digits 12300
[0,0,3,2,1]
> writtenDigits 12300
[1,2,3,0,0]
> digits 0
[]
> writtenDigits 0
[0]
In doing Project Euler, I've actually found that some problems call for one, and some call for the other.
About . and "point-free" style
To make this clear for those not familiar with Haskell's . operator, and "point-free" style, these could be rewritten as:
import Data.Char (digitToInt)
import Data.List (unfoldr)
digits :: (Integral a) => a -> [a]
digits i = unfoldr step (abs i)
  where step n = if n==0 then Nothing else let (q,r)=n`divMod`10 in Just (r,q)
writtenDigits :: (Integral a, Show a) => a -> [a]
writtenDigits i = map (fromIntegral.digitToInt) (show (abs i))
These are exactly the same as the above. You should learn that these are the same:
f . g
(\a -> f (g a))
And "point-free" means that these are the same:
foo a = bar a
foo = bar
Combining these ideas, these are the same:
foo a = bar (baz a)
foo a = (bar . baz) a
foo = bar . baz
The last is idiomatic Haskell: once you get used to reading it, you can see that it is very concise.
To sum up all digits of a number:
digitSum = sum . map (read . return) . show
show transforms a number to a string. map iterates over the single elements of the string (i.e. the digits); return turns each character into a string (e.g. the character '1' becomes the string "1") and read turns that back into an integer. sum finally calculates the sum.
Just to enlarge the pool of solutions:
miterate :: (a -> Maybe (a, b)) -> a -> [b]
miterate f = go . f where
  go Nothing = []
  go (Just (x, y)) = y : (go (f x))
sumd = sum . miterate f where
  f 0 = Nothing
  f x = Just (x `divMod` 10)
Well, one, your Haskell function misses brackets, you need fac (n - 1). (oh, I see you fixed that now)
Two, the real answer, what you want is first make a list:
listdigits n = if n < 10 then [n] else (listdigits (n `div` 10)) ++ (listdigits (n `mod` 10))
This should just compose a list of all the digits (type: Int -> [Int]).
Then we just make a sum as in sum (listdigits n). And we should be done.
Naturally, you can generalize the example above for the list for many different radices, also, you can easily translate this to products too.
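One way that generalisation might look, as a sketch: it uses unfoldr rather than the recursive listdigits above, so the digits come out least significant first, which makes no difference for sums and products. The names digitsInBase, digitSumInBase and digitProductInBase are made up for this example, and the base is assumed to be at least 2.

```haskell
import Data.List (unfoldr)

-- Digits of a number in the given base, least significant first.
-- Assumes base >= 2; negative inputs are treated as their absolute
-- value, and 0 yields the empty list.
digitsInBase :: Integer -> Integer -> [Integer]
digitsInBase base = unfoldr step . abs
  where
    step 0 = Nothing
    step n = let (q, r) = n `divMod` base in Just (r, q)

-- Sum and product of the digits in the given base.
digitSumInBase, digitProductInBase :: Integer -> Integer -> Integer
digitSumInBase base = sum . digitsInBase base
digitProductInBase base = product . digitsInBase base
```

For instance, digitsInBase 2 10 is [0,1,0,1] and digitSumInBase 16 255 is 30 (two hexadecimal digits of value 15).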
Although maybe not as efficient as the other examples, here is a different way of approaching it:
import Data.Char
sumDigits :: Integer -> Int
sumDigits = foldr ((+) . digitToInt) 0 . show
Edit: newacct's method is very similar, and I like it a bit better :-)

Resources