Haskell function parameter force evaluation - haskell

I need to take the last n elements from a list using O(n) memory, so I wrote this code:
take' :: Int -> [Int] -> [Int]
take' n xs = (helper $! (length $! xs) - n + 1) xs
  where helper skip [] = []
        helper skip (x : xs) = if skip == 0 then xs else (helper $! skip - 1) xs
main = print (take' 10 [1 .. 100000])
This code takes O(|L|) memory, where |L| is the length of the given list.
But when I write this code
take' :: Int -> [Int] -> [Int]
take' n xs = helper (100000 - n + 1) xs
  where helper skip [] = []
        helper skip (x : xs) = if skip == 0 then xs else (helper $! skip - 1) xs
main = print (take' 10 [1 .. 100000])
This code now takes only O(n) memory (the only change is (helper $! (length $! xs) - n + 1) -> helper (100000 - n + 1)).
So, as I understand it, Haskell for some reason doesn't evaluate length xs before the first call of helper, so it leaves a thunk in skip and has to keep this value in every stack frame instead of making a tail call. But in the second piece of code it evaluates (100000 - n + 1) and passes the plain value to helper.
So the question is how to evaluate the length of the list before the first call of helper and still use only O(n) memory.

The other answer referred to what it means to be a good consumer. You have posted two versions of your function, one which works for arbitrary-length lists but is not a good consumer, and one which is a good consumer but assumes a particular list length. For completeness, here is a function that is a good consumer and works for arbitrary list lengths:
takeLast n xs = go (drop n xs) xs where
  go (_:xs) (_:ys) = go xs ys
  go _ ys = ys

The second version does not really take only O(n) memory. Regardless of what take' does: you start off with a list of length L, and that has to be stored somewhere.
The reason it effectively takes O(n) memory is that the list is only used by one “good consumer” here, namely helper. Such a consumer deconstructs the list from head to tail; because no reference to the head is needed anywhere else, the garbage collector can immediately start cleaning up those first elements, before the enumeration [1 .. 100000] has even built the rest of the list!
That changes, however, if you compute the length of that list before using helper. This already forces the entire list to be NF'd†, and as I said this inevitably takes O(L) memory. Because you're still holding a reference to be used with helper, in this case the garbage collector cannot take any action before the whole list is in memory.
So, it really has nothing to do with strict evaluation. In fact the only way you could achieve your goal is by making it less strict (require only a sublist of length n to be evaluated at any given time).
†More precisely: it forces the list's spine to normal form. The elements aren't evaluated, but it's still O(L).
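The idea of keeping only a window of n elements alive at any time can also be sketched with an explicit buffer. The following is a minimal illustration, not the answerer's code: the name lastN is hypothetical, and it assumes the containers package (Data.Sequence) is available, as it is with a standard GHC installation.

```haskell
import Data.Foldable (foldl', toList)
import qualified Data.Sequence as Seq
import Data.Sequence ((|>))

-- Walk the list once, keeping a sliding window of at most n elements.
-- 'lastN' is a hypothetical name used only for this sketch.
lastN :: Int -> [a] -> [a]
lastN n xs
  | n <= 0    = []
  | otherwise = toList (foldl' push Seq.empty xs)
  where
    -- Append the new element; drop the oldest one once the window is full.
    push window x
      | Seq.length window < n = window |> x
      | otherwise             = Seq.drop 1 window |> x
```

Because the input list is consumed by a single left fold and nothing else refers to its head, the already-visited cells can be collected while the fold proceeds, and only the window itself stays reachable.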

Related

Why does my function not work with an infinite list?

I'm trying to learn haskell and implemented a function conseq that would return a list of consecutive elements of size n.
conseq :: Int -> [Int] -> [[Int]]
conseq n x
  | n == length(x) = [x]
  | n > length(x) = [x]
  | otherwise = [take n x] ++ (conseq n (drop 1 x))
This works correctly.
> take 5 $ conseq 2 [1..10]
[[1,2],[2,3],[3,4],[4,5],[5,6]]
However, if I pass [1..] instead of [1..10], the program gets stuck in an infinite loop.
As I understood it, Haskell has lazy evaluation, so I should still be able to get the same result, right? Is it length? Shouldn't the first two conditions evaluate to false as soon as the length becomes greater than n?
What did I misunderstand?
One of the main reasons not to use length is that it gets stuck in an infinite loop when it is evaluated on an infinite list.
The good news, however, is that we don't need length, which would also make the time complexity worse. We can work with two cursors over the list, one n-1 places ahead of the other. When the leading cursor reaches the end of the list, we know that the trailing cursor has only n-1 elements left, and thus we can stop yielding values:
conseq :: Int -> [a] -> [[a]]
conseq n ys = go (drop (n-1) ys) ys
  where go [] _ = []
        go (_:as) ba@(~(_:bs)) = take n ba : go as bs
This gives us:
Prelude> conseq 3 [1 ..]
[[1,2,3],[2,3,4],[3,4,5],[4,5,6],[5,6,7],[6,7,8],[7,8,9],[8,9,10],[9,10,11],[10,11,12],[11,12,13],[12,13,14],[13,14,15],[14,15,16],[15,16,17],[16,17,18],[17,18,19],[18,19,20],[19,20,21],[20,21,22],[21,22,23],[22,23,24],[23,24,25],[24,25,26],[25,26,27],…
Prelude> conseq 3 [1 .. 4]
[[1,2,3],[2,3,4]]
The first thing your function does is calculate length(x), so it knows whether it should return [x], [x], or [take n x] ++ (conseq n (drop 1 x))
length counts the number of elements in the list - all the elements. If you ask for the length of an infinite list, it never finishes counting.
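If the original guard-based shape is preferred, the length comparison can be replaced by a check that inspects at most n cells. This is a rough sketch with hypothetical names (hasAtLeast, conseqLazy); unlike the code in the question, it returns no window at all when fewer than n elements remain.

```haskell
-- True when the list has at least n elements. It looks at no more than
-- n cells, so it also terminates on infinite lists.
hasAtLeast :: Int -> [a] -> Bool
hasAtLeast n xs
  | n <= 0    = True
  | otherwise = not (null (drop (n - 1) xs))

-- Sliding windows of size n, stopping once fewer than n elements remain.
conseqLazy :: Int -> [a] -> [[a]]
conseqLazy n xs
  | n <= 0          = []
  | hasAtLeast n xs = take n xs : conseqLazy n (drop 1 xs)
  | otherwise       = []
```

With this version, take 5 (conseqLazy 2 [1..]) only ever forces the first six elements of the infinite list.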

How to create an Infinite List in Haskell where the new value consumes all the previous values

If I create an infinite list like this:
let t xs = xs ++ [sum(xs)]
let xs = [1,2] : map (t) xs
take 10 xs
I will get this result:
[
[1,2],
[1,2,3],
[1,2,3,6],
[1,2,3,6,12],
[1,2,3,6,12,24],
[1,2,3,6,12,24,48],
[1,2,3,6,12,24,48,96],
[1,2,3,6,12,24,48,96,192],
[1,2,3,6,12,24,48,96,192,384],
[1,2,3,6,12,24,48,96,192,384,768]
]
This is pretty close to what I am trying to do.
This current code uses the last value to define the next. But, instead of a list of lists, I would like to know some way to make an infinite list that uses all the previous values to define the new one.
So the output would be only
[1,2,3,6,12,24,48,96,192,384,768,1536,...]
I have the definition of the first element [1].
I have the rule of getting a new element, sum all the previous elements.
But, I could not put this in the Haskell grammar to create the infinite list.
Using my current code, I could take the list that I need, using the command:
xs !! 10
> [1,2,3,6,12,24,48,96,192,384,768,1536]
But, it seems to me, that it is possible doing this in some more efficient way.
Some Notes
I understand that, for this particular example, that was intentionally oversimplified, we could create a function that uses only the last value to define the next.
But, I am searching if it is possible to read all the previous values into an infinite list definition.
I am sorry if the example that I used created some confusion.
Here is another example, which cannot be solved by reading only the last value:
isMultipleByList :: Integer -> [Integer] -> Bool
isMultipleByList _ [] = False
isMultipleByList v (x:xs) = if (mod v x == 0)
                            then True
                            else (isMultipleByList v xs)
nextNotMultipleLoop :: Integer -> Integer -> [Integer] -> Integer
nextNotMultipleLoop step v xs = if not (isMultipleByList v xs)
                                then v
                                else nextNotMultipleLoop step (v + step) xs
nextNotMultiple :: [Integer] -> Integer
nextNotMultiple xs = if xs == [2]
                     then nextNotMultipleLoop 1 (maximum xs) xs
                     else nextNotMultipleLoop 2 (maximum xs) xs
addNextNotMultiple xs = xs ++ [nextNotMultiple xs]
infinitePrimeList = [2] : map (addNextNotMultiple) infinitePrimeList
take 10 infinitePrimeList
[
[2,3],
[2,3,5],
[2,3,5,7],
[2,3,5,7,11],
[2,3,5,7,11,13],
[2,3,5,7,11,13,17],
[2,3,5,7,11,13,17,19],
[2,3,5,7,11,13,17,19,23],
[2,3,5,7,11,13,17,19,23,29],
[2,3,5,7,11,13,17,19,23,29,31]
]
infinitePrimeList !! 10
[2,3,5,7,11,13,17,19,23,29,31,37]
You can think of it this way:
You want to create a list (call it a) which starts with [1,2]:
a = [1,2] ++ ???
... and has this property: each subsequent element of a is the sum of all previous elements of a. So you can write
scanl1 (+) a
and get a new list in which the n-th element is the sum of the first n elements of a. So it is [1, 3, 6, ...]. All you need is to take all elements except the first:
tail (scanl1 (+) a)
So you can define a as:
a = [1,2] ++ tail (scanl1 (+) a)
The same way of thinking applies to other similar problems of defining a list through its own elements.
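As a self-contained sketch of this definition, the running sums can be given a name so that the two lists are visible side by side. The name runningSums is only for illustration, and drop 1 stands in for tail (it behaves the same here but is total).

```haskell
-- Each element after the first two is the sum of all elements before it.
a :: [Integer]
a = [1, 2] ++ drop 1 runningSums

-- Running sums of a: 1, 1+2, 1+2+3, ...
-- Defined in terms of a itself; laziness makes the knot productive,
-- because each element of a only depends on earlier elements.
runningSums :: [Integer]
runningSums = scanl1 (+) a
```

Evaluating take 8 a in ghci should give [1,2,3,6,12,24,48,96], and take 4 runningSums gives [1,3,6,12].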
If we already had the final result, calculating the list of previous elements for a given element would be easy, a simple application of the inits function.
Let's assume we already have the final result xs, and use it to compute xs itself:
import Data.List (inits)
main :: IO ()
main = do
  let is = drop 2 $ inits xs
      xs = 1 : 2 : map sum is
  print $ take 10 xs
This produces the list
[1,2,3,6,12,24,48,96,192,384]
(Note: this is less efficient than SergeyKuz1001's solution, because the sum is re-calculated each time.)
unfoldr has a quite nice flexibility to adapt to various "create-a-list-from-initial-conditions"-problems so I think it is worth mentioning.
A little less elegant for this specific case, but shows how unfoldr can be used.
import Data.List
nextVal as = Just (s,as++[s])
  where s = sum as
initList = [1,2]
myList = initList ++ unfoldr nextVal initList
main = putStrLn . show . (take 12) $ myList
Yielding
[1,2,3,6,12,24,48,96,192,384,768,1536]
in the end.
As pointed out in the comment, one should think a little when using unfoldr. The way I've written it above, the code mimics the code in the original question. However, this means that the accumulator is updated with as++[s], thus constructing a new list at every iteration. A quick run at https://repl.it/languages/haskell suggests it becomes quite memory intensive and slow (4.5 seconds to access the 2000th element in myList).
Simply swapping the accumulator update to s:as produced a 7-fold speed increase. Since the same list can be reused as the accumulator in every step, it goes faster. However, the accumulator list is now in reverse, so one needs to think a little bit. In the case of the function sum this makes no difference, but if the order of the list matters, one must take extra care.
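The prepending variant described above might look like the following sketch. The accumulator holds the elements produced so far in reverse order, which is harmless here only because sum does not care about order.

```haskell
import Data.List (unfoldr)

-- The accumulator holds all elements produced so far, newest first.
-- Prepending with (:) reuses the old list instead of copying it.
nextVal :: [Integer] -> Maybe (Integer, [Integer])
nextVal acc = Just (s, s : acc)
  where
    s = sum acc

myList :: [Integer]
myList = initList ++ unfoldr nextVal (reverse initList)
  where
    initList = [1, 2]
```

Each step still pays for a full sum over the accumulator, so this removes the list copying but not the repeated summation.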
You could define it like this:
xs = 1:2:iterate (*2) 3
For example:
Prelude> take 12 xs
[1,2,3,6,12,24,48,96,192,384,768,1536]
So here's my take. I tried not to create O(n) extra lists.
import Data.List (genericLength)
explode :: Integral i => (i -> [a] -> a) -> [a] -> [a]
explode fn init = as where
  as = init ++ [fn i as | i <- [l, l+1..]]
  l = genericLength init
This convenience function does create additional lists (by take). Hopefully they can be optimised away by the compiler.
explode' f = explode (\x as -> f $ take x as)
Usage examples:
myList = explode' sum [1,2]
sum' 0 xs = 0
sum' n (x:xs) = x + sum' (n-1) xs
myList2 = explode sum' [1,2]
In my tests there's little performance difference between the two functions. explode' is often slightly better.
The solution from @LudvigH is very nice and clear. But, it was not faster.
I am still working on the benchmark to compare the other options.
For now, this is the best solution that I could find:
-------------------------------------------------------------------------------------
-- # infinite sum of the previous using fuse
-------------------------------------------------------------------------------------
recursiveSum xs = [nextValue] ++ (recursiveSum (nextList)) where
  nextValue = sum(xs)
  nextList = xs ++ [nextValue]
initialSumValues = [1]
infiniteSumFuse = initialSumValues ++ recursiveSum initialSumValues
-------------------------------------------------------------------------------------
-- # infinite prime list using fuse
-------------------------------------------------------------------------------------
-- calculate the current value based in the current list
-- call the same function with the new combined value
recursivePrimeList xs = [nextValue] ++ (recursivePrimeList (nextList)) where
  nextValue = nextNotMultiple(xs)
  nextList = xs ++ [nextValue]
initialPrimes = [2]
infiniteFusePrimeList = initialPrimes ++ recursivePrimeList initialPrimes
This approach is fast and makes good use of many cores.
Maybe there is some faster solution, but I decided to post this to share my current progress on this subject so far.
In general, define
xs = x1 : zipWith f xs (inits xs)
Then it's xs == x1 : f x1 [] : f x2 [x1] : f x3 [x1, x2] : ...., and so on.
Here's one example of using inits in the context of computing the infinite list of primes, which pairs them up as
ps = 2 : f p1 [p1] : f p2 [p1,p2] : f p3 [p1,p2,p3] : ...
(in the definition of primes5 there).
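A small concrete instance of the general scheme may help. In this sketch the names definedByPrefix and doubling are hypothetical, and f is chosen to add the current element to the sum of everything before it.

```haskell
import Data.List (inits)

-- Build a list from a first element x1 and a rule f that receives the
-- current element together with the list of all elements before it.
definedByPrefix :: (a -> [a] -> a) -> a -> [a]
definedByPrefix f x1 = xs
  where
    xs = x1 : zipWith f xs (inits xs)

-- Next element = current element + sum of all earlier elements.
doubling :: [Integer]
doubling = definedByPrefix (\x before -> x + sum before) 1
```

Unfolding by hand: x2 = f 1 [] = 1, x3 = f 1 [1] = 2, x4 = f 2 [1,1] = 4, and so on, so take 6 doubling is [1,1,2,4,8,16].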

Haskell Cycle function

The code for cycle is as follows
cycle :: [a] -> [a]
cycle [] = errorEmptyList "cycle"
cycle xs = xs' where xs' = xs ++ xs'
I would appreciate an explanation on how the last line works. I feel that it would go off into an infinite recursion without returning or shall I say without printing anything on the screen. I guess my intuition is wrong.
A list, like basically everything else in Haskell, is lazily evaluated.
Roughly, to make an OOP analogy, you can think of a list as a sort of "iterator object". When queried, it reports on whether there is a next element, and if so, what is such element and what is the tail of the list (this being another "iterator object").
A list defined as
xs = 1 : xs
does not cause non termination. It corresponds to an "iterator object" o that, when queried, answers: "the next element is 1, and the rest of the list can be queried using o". Basically, it returns itself.
This is no different than a list having as a tail a "pointer" to the list itself: a circular list. This takes a constant amount of space.
Appending with ++ works the same:
xs = [1] ++ xs
is identical to the previous list.
In your code, the part
where xs' = xs ++ xs'
crafts a list that starts with xs and then continues with the list itself xs'. Operationally, it is an "iterator object" o that returns, one by one, the elements of xs, and when the last element of xs is returned, it is paired with "you can query the rest of the list at o". Again, a back-pointer, which builds a sort of circular list.
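Both self-referential definitions can be tried directly. This sketch uses the hypothetical names ones and myCycle to avoid clashing with the Prelude's cycle, and a plain error call in place of errorEmptyList, which is internal to base.

```haskell
-- A list whose tail is the list itself: a one-cell circular structure.
ones :: [Int]
ones = 1 : ones

-- Same shape as the library definition: xs followed by the result itself.
myCycle :: [a] -> [a]
myCycle [] = error "myCycle: empty list"
myCycle xs = xs'
  where
    xs' = xs ++ xs'
```

Only as many cells as the consumer asks for are ever inspected, so take 3 ones gives [1,1,1] and take 7 (myCycle [1,2,3]) gives [1,2,3,1,2,3,1].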
Let's take out the last line separately:
cycle xs = xs' where xs' = xs ++ xs'
Now, let's try to reduce it:
cycle xs = xs ++ (xs ++ (xs ++ (xs ++ ...)))
You can see that it expands infinitely. But note that this is not how expressions get reduced in Haskell. Expressions will be reduced to WHNF when it's demanded. So, let's demand some values from cycle function:
ghci > take 1 $ cycle [1..]
[1]
This is how take function is implemented:
take n _ | n <= 0 = []
take _ [] = []
take n (x:xs) = x : take (n-1) xs
Now, because of pattern matching, the value n will be evaluated first. Since it is already in normal form, no further reduction is needed, and it is checked whether it is less than or equal to zero. Since that condition fails, evaluation moves on to the second equation. Here, its second argument is checked to see whether it is equal to []. As usual, Haskell evaluates it to WHNF, which is 1:_. Here _ represents a thunk. Now the whole expression is reduced to 1:take 0 _. Since this value has to be printed in ghci, the whole 1:take 0 _ is reduced again. Following the same steps as above, we get 1:[], which is [1].
Hence cycle [1..] is reduced to WHNF in the form (1:_), and take 1 of it is ultimately reduced to [1]. But if the cycle function itself is strict in its implementation, then it will just go into an infinite loop:
import Control.DeepSeq (NFData, deepseq)
import Prelude hiding (cycle)
cycle :: NFData a => [a] -> [a]
cycle [] = []
cycle xs = let xs' = xs ++ xs'
           in deepseq xs xs'
You can test that in ghci:
ghci > take 1 $ cycle [1..]
^CInterrupted.
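The "only as much as is demanded" behaviour can be observed without an infinite list by putting undefined where the rest of the list would be. This is a small sketch with hypothetical names, not part of the original answer.

```haskell
-- take returns after two cons cells, so the undefined tail is never forced.
firstTwo :: [Int]
firstTwo = take 2 (1 : 2 : undefined)

-- The same holds for the lazy cycle on an infinite argument.
firstOfCycle :: [Integer]
firstOfCycle = take 1 (cycle [1 ..])

-- By contrast, length has to walk the whole spine and would hit undefined:
-- broken = length (1 : 2 : undefined)   -- throws if evaluated
```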

Pairs of elements from list

I want to convert [1,2,3,4] to [[1 2] [2 3] [3 4]] or [(1 2) (2 3) (3 4)]. In Clojure I have (partition 2 1 [1,2,3,4]). How can I do it in Haskell? I suspect there is such a function in the standard API but I can't find it.
The standard trick for this is to zip the list with its own tail:
> let xs = [1,2,3,4] in zip xs (tail xs)
[(1,2),(2,3),(3,4)]
To see why this works, line up the list and its tail visually.
xs = 1 : 2 : 3 : 4 : []
tail xs = 2 : 3 : 4 : []
and note that zip is making a tuple out of each column.
There are two more subtle reasons why this always does the right thing:
zip stops when either list runs out of elements. That makes sense here since we can't have an "incomplete pair" at the end and it also ensures that we get no pairs from a single element list.
When xs is empty, one might expect tail xs to throw an exception. However, because zip checks its first argument first, when it sees that it's the empty list, the second argument is never evaluated.
Everything above also holds true for zipWith, so you can use the same method whenever you need to apply a function pairwise to adjacent elements.
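As a brief sketch of both forms, here are the pairing function and a zipWith variant that computes differences between neighbours. The names pairs and diffs are hypothetical, and drop 1 is used in place of tail; it gives the same result here and is also defined on the empty list.

```haskell
-- Adjacent pairs: zip the list with itself shifted by one.
pairs :: [a] -> [(a, a)]
pairs xs = zip xs (drop 1 xs)

-- The same trick with zipWith: difference between each element and
-- the one before it.
diffs :: Num a => [a] -> [a]
diffs xs = zipWith (-) (drop 1 xs) xs
```

For example, pairs [1,2,3,4] is [(1,2),(2,3),(3,4)] and diffs [1,4,9,16] is [3,5,7]; both return [] for empty and single-element lists.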
For a generic solution like Clojure's partition, there is nothing in the standard libraries. However, you can try something like this:
partition' :: Int -> Int -> [a] -> [[a]]
partition' size offset
  | size <= 0 = error "partition': size must be positive"
  | offset <= 0 = error "partition': offset must be positive"
  | otherwise = loop
  where
    loop :: [a] -> [[a]]
    loop xs = case splitAt size xs of
      -- If the second part is empty, we're at the end. But we might
      -- have gotten less than we asked for, hence the check.
      (ys, []) -> if length ys == size then [ys] else []
      (ys, _ ) -> ys : loop (drop offset xs)
Just to throw another answer out there using a different approach:
For n=2 you want to simply zip the list with its tail. For n=3 you want to zip the list with its tail and with the tail of its tail. This pattern continues further, so all we have to do is generalise it:
partition n = sequence . take n . iterate tail
But this only works for an offset of 1. To generalise the offsets we just have to look at the generated list. It will always have the form:
[[1..something],[2..something+1],..]
So all that is left to do is select every offsetth element and we should be fine. I shamelessly stole this version from @ertes from this question:
everyNth :: Int -> [a] -> [a]
everyNth n = map head . takeWhile (not . null) . iterate (drop n)
The entire function now becomes:
partition size offset = everyNth offset . sequence . take size . iterate tail
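One caveat: on plain lists, sequence uses the list monad and produces the Cartesian product of the tails rather than zipping them column by column. A zipping variant of the same idea can be sketched with ZipList from Control.Applicative. The names windows and partitionZ are hypothetical, and drop 1 replaces tail so that short inputs do not raise an error.

```haskell
import Control.Applicative (ZipList (..))

-- Column-wise zip of the first `size` tails; each column is one window.
-- ZipList's Applicative zips and stops at the shortest list.
windows :: Int -> [a] -> [[a]]
windows size xs
  | size <= 0 = []
  | otherwise = getZipList (traverse ZipList (take size (iterate (drop 1) xs)))

-- Keep every n-th element, starting with the first (n must be positive).
everyNth :: Int -> [a] -> [a]
everyNth n xs = [y | (y : _) <- takeWhile (not . null) (iterate (drop n) xs)]

-- Windows of the given size, taken every `offset` positions.
partitionZ :: Int -> Int -> [a] -> [[a]]
partitionZ size offset = everyNth offset . windows size
```

With these definitions, partitionZ 2 1 [1,2,3,4] gives [[1,2],[2,3],[3,4]] and partitionZ 2 2 [1..6] gives [[1,2],[3,4],[5,6]].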
Sometimes it is best to roll your own. Recursive functions are what give Lisp its power and appeal. Haskell tries to discourage them, but too often a solution is best achieved with a recursive function. They are often quite simple, as is this one to produce pairs.
Haskell pattern matching reduces code. This could easily be changed, by changing only the pattern to (x:y:yys), to produce (a,b), (c,d), (e,f).
> prs (x:yys@(y:_)) = (x,y):prs yys; prs _ = []
> prs "abcdefg"
[('a','b'),('b','c'),('c','d'),('d','e'),('e','f'),('f','g')]

Haskell - get nth element without "!!"

I need to get the nth element of a list but without using the !! operator. I am extremely new to Haskell, so I'd appreciate it if you could answer in more detail and not just with one line of code. This is what I'm trying at the moment:
nthel:: Int -> [Int] -> Int
nthel n xs = 0
let xsxs = take n xs
nthel n xs = last xsxs
But I get: parse error (possibly incorrect indentation)
There's a lot that's a bit off here.
nthel :: Int -> [Int] -> Int
is technically correct, but really we want
nthel :: Int -> [a] -> a
so that we can use this on lists of anything (optional).
nthel n xs = 0
What you just said is "no matter what you give to nthel, return 0", which is clearly wrong.
let xsxs = ...
This is just not legal Haskell. let ... in ... is an expression; it can't be used at the top level.
From there I'm not really sure what that's supposed to do.
Maybe this will help put you on the right track
nthelem n [] = <???> -- error case, empty list
nthelem 0 xs = head xs
nthelem n xs = <???> -- recursive case
Try filling in the <???> with your best guess and I'm happy to help from there.
Alternatively you can use Haskell's "pattern matching" syntax. I explain how you can do this with lists here.
That changes our above to
nthelem n [] = <???> -- error case, empty list
nthelem 0 (x:xs) = x --bind x to the first element, xs to the rest of the list
nthelem n (x:xs) = <???> -- recursive case
Doing this is handy since it removes the need for explicit calls to head and tail.
I think you meant this:
nthel n xs = last xsxs
where xsxs = take n xs
... which you can simplify as:
nthel n xs = last (take n xs)
I think you should avoid using last whenever possible - lists are made to be used from the "front end", not from the back. What you want is to get rid of the first n elements, and then get the head of the remaining list (of course you get an error if the rest is empty). You can express this quite directly as:
nthel n xs = head (drop n xs)
Or shorter:
nthel n = head . drop n
Or slightly crazy:
nthel = (head .) . drop
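Since head fails on an empty list, a variant that reports failure through Maybe can be sketched along the same lines. The name nthMaybe is hypothetical; indexing is zero-based, as with head . drop n.

```haskell
-- Zero-based lookup that returns Nothing instead of crashing when the
-- index is negative or past the end of the list.
nthMaybe :: Int -> [a] -> Maybe a
nthMaybe n xs
  | n < 0     = Nothing
  | otherwise =
      case drop n xs of
        (x : _) -> Just x
        []      -> Nothing
```

For example, nthMaybe 1 "hello" is Just 'e', while nthMaybe 9 "hello" is Nothing.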
As you know, lists aren't naturally indexed, but this can be overcome with a common trick.
Try zip [0..] "hello" in ghci. What about zip [0,1,2] "hello" or zip [0..10] "hello"?
Starting from this observation, we can easily obtain a way to index our list.
It is also a good illustration of the use of laziness, a useful hint for your learning process.
Based on this, and using pattern matching, we can provide an efficient algorithm:
Handle the boundary cases (empty list, negative index).
Replace the list by an indexed version using zip.
Call a helper function designed to process our indexed list recursively.
Now for the helper function, the list can't be empty, so we can pattern match naively, and,
if our index is equal to n, we have a winner
else, if the rest of the list is empty, it's over
else, call the helper function with the rest of the list.
As an additional note, since our function can fail (empty list, ...), it is a good idea to wrap the result in the Maybe type.
Putting this all together, we end up with:
nth :: Int -> [a] -> Maybe a
nth n xs
  | null xs || n < 0 = Nothing
  | otherwise = helper n zs
  where
    zs = zip [0..] xs
    helper n ((i,c):zs)
      | i == n = Just c
      | null zs = Nothing
      | otherwise = helper n zs
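The indexed list produced by zip is an association list, so the search done by the helper can also be delegated to the Prelude's lookup. This is an alternative sketch under that observation, with the hypothetical name nthLookup.

```haskell
-- Pair every element with its index, then search for index n.
-- zip [0 ..] xs is built lazily, so only the first n+1 pairs are created.
nthLookup :: Int -> [a] -> Maybe a
nthLookup n xs
  | n < 0     = Nothing
  | otherwise = lookup n (zip [0 ..] xs)
```

The explicit check for negative n matters for infinite lists, where lookup would otherwise search forever for an index that never appears.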
