Haskell Cycle function


The code for cycle is as follows:
cycle :: [a] -> [a]
cycle [] = errorEmptyList "cycle"
cycle xs = xs' where xs' = xs ++ xs'
I would appreciate an explanation on how the last line works. I feel that it would go off into an infinite recursion without returning or shall I say without printing anything on the screen. I guess my intuition is wrong.

A list, like basically everything else in Haskell, is lazily evaluated.
Roughly, to make an OOP analogy, you can think of a list as a sort of "iterator object". When queried, it reports whether there is a next element and, if so, what that element is and what the tail of the list is (the tail being another "iterator object").
A list defined as
xs = 1 : xs
does not cause non-termination. It corresponds to an "iterator object" o that, when queried, answers: "the next element is 1, and the rest of the list can be queried using o". Basically, it returns itself.
This is no different than a list having as a tail a "pointer" to the list itself: a circular list. This takes a constant amount of space.
Appending with ++ works the same:
xs = [1] ++ xs
is identical to the previous list.
In your code, the part
where xs' = xs ++ xs'
crafts a list that starts with xs and then continues with the list itself xs'. Operationally, it is an "iterator object" o that returns, one by one, the elements of xs, and when the last element of xs is returned, it is paired with "you can query the rest of the list at o". Again, a back-pointer, which builds a sort of circular list.
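The back-pointer idea can be tried out directly. The following is a minimal sketch, not the library source: ones and myCycle are hypothetical names (chosen so that nothing clashes with Prelude's cycle), and the empty-list case calls error directly because errorEmptyList is internal to base.

```haskell
-- Sketch only: ones and myCycle are illustrative names, not library definitions.

-- A list whose tail is the list itself: one cons cell pointing back at itself.
ones :: [Int]
ones = 1 : ones

-- Same knot-tying trick as in the question: xs' is defined in terms of itself.
myCycle :: [a] -> [a]
myCycle [] = error "myCycle: empty list"
myCycle xs = xs'
  where
    xs' = xs ++ xs'
```

In ghci, take 7 (myCycle [1,2,3]) gives [1,2,3,1,2,3,1], because take only ever demands the first seven cells of the circular structure.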

Let's take out the last line separately:
cycle xs = xs' where xs' = xs ++ xs'
Now, let's try to reduce it:
cycle xs = xs ++ (xs ++ (xs ++ (xs ++ ...)))
You can see that it expands infinitely. But note that this is not how expressions get reduced in Haskell. An expression is reduced to weak head normal form (WHNF) only when its value is demanded. So, let's demand some values from the cycle function:
ghci > take 1 $ cycle [1..]
[1]
This is how the take function is implemented:
take n _ | n <= 0 = []
take _ [] = []
take n (x:xs) = x : take (n-1) xs
Now, because of pattern matching, the value n is evaluated first. Since it is already in normal form, no further reduction is needed, and it is checked against the guard n <= 0. Since that guard fails, evaluation moves on to the second equation. Here, the second argument is checked to see whether it is []. As usual, Haskell evaluates it only to WHNF, which is 1:_, where _ represents a thunk. The second equation therefore does not match, the third one does, and the whole expression reduces to 1 : take 0 _. Since this value has to be printed in ghci, 1 : take 0 _ is reduced further. Following the same steps again, take 0 _ matches the first equation and gives [], so we get 1:[], which is [1].
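This walk-through can be checked with a small experiment. The sketch below assumes that myTake (a hypothetical name) mirrors the three equations quoted above; the tail of the input list is undefined, so the program would crash if the tail were ever forced.

```haskell
-- Sketch: myTake is an illustrative copy of the equations quoted above.
myTake :: Int -> [a] -> [a]
myTake n _ | n <= 0 = []
myTake _ [] = []
myTake n (x:xs) = x : myTake (n - 1) xs

-- The tail is a thunk that would throw if it were ever evaluated.
onlyHeadDefined :: [Int]
onlyHeadDefined = 1 : undefined

-- Only the first cons cell of the input is inspected.
firstElement :: [Int]
firstElement = myTake 1 onlyHeadDefined
```

Evaluating firstElement yields [1]: the call myTake 0 _ matches the first equation without ever inspecting its list argument.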
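Hence cycle [1..] is reduced only to WHNF, in the form (1:_), and take 1 $ cycle [1..] is ultimately reduced to [1]. But if the cycle function itself is strict in its implementation, here by forcing its whole argument with deepseq before returning anything, then on this infinite input it will just go into an infinite loop: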
import Prelude hiding (cycle)
import Control.DeepSeq (NFData, deepseq)

cycle :: NFData a => [a] -> [a]
cycle [] = []
cycle xs = let xs' = xs ++ xs'
           in deepseq xs xs'
You can test that in ghci:
ghci > take 1 $ cycle [1..]
^CInterrupted.

Related

How to create an Infinite List in Haskell where the new value consumes all the previous values

If I create an infinite list like this:
let t xs = xs ++ [sum(xs)]
let xs = [1,2] : map (t) xs
take 10 xs
I will get this result:
[
[1,2],
[1,2,3],
[1,2,3,6],
[1,2,3,6,12],
[1,2,3,6,12,24],
[1,2,3,6,12,24,48],
[1,2,3,6,12,24,48,96],
[1,2,3,6,12,24,48,96,192],
[1,2,3,6,12,24,48,96,192,384],
[1,2,3,6,12,24,48,96,192,384,768]
]
This is pretty close to what I am trying to do.
This current code uses the last value to define the next. But, instead of a list of lists, I would like to know some way to make an infinite list that uses all the previous values to define the new one.
So the output would be only
[1,2,3,6,12,24,48,96,192,384,768,1536,...]
I have the definition of the first element [1].
I have the rule of getting a new element, sum all the previous elements.
But, I could not put this in the Haskell grammar to create the infinite list.
Using my current code, I could take the list that I need, using the command:
xs !! 10
> [1,2,3,6,12,24,48,96,192,384,768,1536]
But, it seems to me, that it is possible doing this in some more efficient way.
Some Notes
I understand that, for this particular example, that was intentionally oversimplified, we could create a function that uses only the last value to define the next.
But, I am searching if it is possible to read all the previous values into an infinite list definition.
I am sorry if the example that I used created some confusion.
Here is another example, which cannot be solved by reading only the last value:
isMultipleByList :: Integer -> [Integer] -> Bool
isMultipleByList _ [] = False
isMultipleByList v (x:xs) = if (mod v x == 0)
                            then True
                            else (isMultipleByList v xs)
nextNotMultipleLoop :: Integer -> Integer -> [Integer] -> Integer
nextNotMultipleLoop step v xs = if not (isMultipleByList v xs)
                                then v
                                else nextNotMultipleLoop step (v + step) xs
nextNotMultiple :: [Integer] -> Integer
nextNotMultiple xs = if xs == [2]
                     then nextNotMultipleLoop 1 (maximum xs) xs
                     else nextNotMultipleLoop 2 (maximum xs) xs
addNextNotMultiple xs = xs ++ [nextNotMultiple xs]
infinitePrimeList = [2] : map (addNextNotMultiple) infinitePrimeList
take 10 infinitePrimeList
[
[2],
[2,3],
[2,3,5],
[2,3,5,7],
[2,3,5,7,11],
[2,3,5,7,11,13],
[2,3,5,7,11,13,17],
[2,3,5,7,11,13,17,19],
[2,3,5,7,11,13,17,19,23],
[2,3,5,7,11,13,17,19,23,29]
]
infinitePrimeList !! 10
[2,3,5,7,11,13,17,19,23,29,31]

You can think of it this way:
You want to create a list (call it a) which starts with [1,2]:
a = [1,2] ++ ???
... and has this property: each subsequent element of a is the sum of all previous elements of a. So you can write
scanl1 (+) a
and get a new list in which the element at index n is the sum of the first n+1 elements of a. So it is [1, 3, 6, ...]. All you need is to take all of its elements except the first:
tail (scanl1 (+) a)
So, you can define a as:
a = [1,2] ++ tail (scanl1 (+) a)
You can apply the same way of thinking to similar problems where a list is defined through its own elements.
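To see the definition at work, here is a minimal self-contained sketch. The name sumsOfPrevious is only an illustrative stand-in for the list called a, and the element type Integer is an assumption made so that the block compiles on its own.

```haskell
-- Sketch: sumsOfPrevious is an illustrative name for the list called a above.
-- scanl1 keeps a running total, so each new element costs a single addition.
sumsOfPrevious :: [Integer]
sumsOfPrevious = [1, 2] ++ tail (scanl1 (+) sumsOfPrevious)
```

In ghci, take 12 sumsOfPrevious gives [1,2,3,6,12,24,48,96,192,384,768,1536].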

If we already had the final result, calculating the list of previous elements for a given element would be easy, a simple application of the inits function.
Let's assume we already have the final result xs, and use it to compute xs itself:
import Data.List (inits)

main :: IO ()
main = do
  let is = drop 2 $ inits xs
      xs = 1 : 2 : map sum is
  print $ take 10 xs
This produces the list
[1,2,3,6,12,24,48,96,192,384]
(Note: this is less efficient than SergeyKuz1001's solution, because the sum is re-calculated each time.)

unfoldr has a quite nice flexibility to adapt to various "create-a-list-from-initial-conditions"-problems so I think it is worth mentioning.
A little less elegant for this specific case, but shows how unfoldr can be used.
import Data.List
nextVal as = Just (s,as++[s])
  where s = sum as
initList = [1,2]
myList = initList ++ unfoldr nextVal initList
main = putStrLn . show . take 12 $ myList
Yielding
[1,2,3,6,12,24,48,96,192,384,768,1536]
in the end.
As pointed out in the comments, one should think a little when using unfoldr. The way I've written it above, the code mimics the code in the original question. However, this means that the accumulator is updated with as++[s], thus constructing a new list at every iteration. A quick run at https://repl.it/languages/haskell suggests it becomes quite memory intensive and slow (4.5 seconds to access the 2000th element of myList).
Simply swapping the accumulator update to s:as produced a 7-fold speed increase. Since the existing list can be reused as the tail of the accumulator in every step, it goes faster. However, the accumulator list is now in reverse, so one needs to think a little bit. For the function sum this makes no difference, but if the order of the list matters, one must think a little bit extra.
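The reversed-accumulator variant described above might look as follows. This is a sketch under the assumption that the seed is simply the initial list reversed; the type signatures are added here for clarity, and no timing claims are made for this exact code.

```haskell
import Data.List (unfoldr)

-- Sketch: the accumulator holds all previous elements in reverse order,
-- so extending it is a single cons instead of an append.
nextVal :: [Integer] -> Maybe (Integer, [Integer])
nextVal acc = Just (s, s : acc)
  where
    s = sum acc

initList :: [Integer]
initList = [1, 2]

-- The seed is reversed once up front; sum does not care about the order.
myList :: [Integer]
myList = initList ++ unfoldr nextVal (reverse initList)
```

If the rule for the next element depended on the order of the previous elements, the function passed to unfoldr would have to account for the reversal itself.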

You could define it like this:
xs = 1:2:iterate (*2) 3
For example:
Prelude> take 12 xs
[1,2,3,6,12,24,48,96,192,384,768,1536]

So here's my take. I tried not to create O(n) extra lists.
import Data.List (genericLength)

explode :: Integral i => (i -> [a] -> a) -> [a] -> [a]
explode fn init = as where
  as = init ++ [fn i as | i <- [l, l+1..]]
  l = genericLength init
This convenience function does create additional lists (by take). Hopefully they can be optimised away by the compiler.
explode' f = explode (\x as -> f $ take x as)
Usage examples:
myList = explode' sum [1,2]
sum' 0 xs = 0
sum' n (x:xs) = x + sum' (n-1) xs
myList2 = explode sum' [1,2]
In my tests there's little performance difference between the two functions. explode' is often slightly better.

The solution from @LudvigH is very nice and clear. But it was not faster.
I am still working on the benchmark to compare the other options.
For now, this is the best solution that I could find:
-------------------------------------------------------------------------------------
-- # infinite sum of the previous using fuse
-------------------------------------------------------------------------------------
recursiveSum xs = [nextValue] ++ (recursiveSum (nextList)) where
  nextValue = sum(xs)
  nextList = xs ++ [nextValue]
initialSumValues = [1]
infiniteSumFuse = initialSumValues ++ recursiveSum initialSumValues
-------------------------------------------------------------------------------------
-- # infinite prime list using fuse
-------------------------------------------------------------------------------------
-- calculate the current value based in the current list
-- call the same function with the new combined value
recursivePrimeList xs = [nextValue] ++ (recursivePrimeList (nextList)) where
  nextValue = nextNotMultiple(xs)
  nextList = xs ++ [nextValue]
initialPrimes = [2]
infiniteFusePrimeList = initialPrimes ++ recursivePrimeList initialPrimes
This approach is fast and makes good use of many cores.
Maybe there is some faster solution, but I decided to post this to share my current progress on this subject so far.

In general, define
xs = x1 : zipWith f xs (inits xs)
Then xs == x1 : f x1 [] : f x2 [x1] : f x3 [x1, x2] : ..., where x2, x3, ... are the successive elements of xs.
Here's one example of using inits in the context of computing the infinite list of primes, which pairs them up as
ps = 2 : f p1 [p1] : f p2 [p1,p2] : f p3 [p1,p2,p3] : ...
(in the definition of primes5 there).
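As a concrete instance of the general scheme, here is a small sketch. The helper fromHistory and the example rule are hypothetical names and choices introduced only for illustration; they are not taken from the linked primes code.

```haskell
import Data.List (inits)

-- Sketch: fromHistory is a hypothetical helper wrapping the scheme above.
-- f receives the current element and the list of all elements before it.
fromHistory :: (a -> [a] -> a) -> a -> [a]
fromHistory f x1 = xs
  where
    xs = x1 : zipWith f xs (inits xs)

-- Example rule: the next element is the current one plus everything before it.
runningTotals :: [Integer]
runningTotals = fromHistory (\x prev -> x + sum prev) 1
```

Here take 6 runningTotals evaluates to [1,1,2,4,8,16]. The definition is productive because inits yields its k-th prefix after looking at only the first k elements of xs.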

Haskell function parameter force evaluation

I need to take the last n elements from a list, using O(n) memory, so I wrote this code:
take' :: Int -> [Int] -> [Int]
take' n xs = (helper $! (length $! xs) - n + 1) xs
  where helper skip [] = []
        helper skip (x : xs) = if skip == 0 then xs else (helper $! skip - 1) xs
main = print (take' 10 [1 .. 100000])
This code takes O(|L|) memory, where |L| is the length of the given list.
But when I write this code
take' :: Int -> [Int] -> [Int]
take' n xs = helper (100000 - n + 1) xs
  where helper skip [] = []
        helper skip (x : xs) = if skip == 0 then xs else (helper $! skip - 1) xs
main = print (take' 10 [1 .. 100000])
This code now takes only O(n) memory (the only change is (helper $! (length $! xs) - n + 1) -> helper (100000 - n + 1)).
So, as I understand it, Haskell for some reason doesn't evaluate length xs before the first call of helper, so it leaves a thunk in skip and has to keep this value in every stack frame instead of making a tail call. But in the second piece of code it evaluates (100000 - n + 1) and passes the plain value to helper.
So the question is how to evaluate the length of the list before the first call of helper and use only O(n) memory.

The other answer referred to what it means to be a good consumer. You have posted two versions of your function, one which works for arbitrary-length lists but is not a good consumer, and one which is a good consumer but assumes a particular list length. For completeness, here is a function that is a good consumer and works for arbitrary list lengths:
takeLast n xs = go (drop n xs) xs where
  go (_:xs) (_:ys) = go xs ys
  go _ ys = ys

The second version does not really take only O(n) memory. Regardless of what take' does: you start off with a list of length L, and that has to be stored somewhere.
The reason it effectively takes O(n) memory is that the list is only used by one “good consumer” here, namely helper. Such a consumer deconstructs the list from head to tail; because no reference to the head is needed anywhere else, the garbage collector can immediately start cleaning up those first elements, before the enumeration [1 .. 100000] has even built up the rest of the list!
That changes however if before using helper you compute the length of that list. This already forces the entire list to be NF'd†, and as I said this inevitably takes O(L) memory. Because you're still holding a reference to be used with helper, in this case the garbage collector can not take any action before the whole list is in memory.
So, it really has nothing to do with strict evaluation. In fact the only way you could achieve your goal is by making it less strict (require only a sublist of length n to be evaluated at any given time).
†More precisely: it forces the list's spine to normal form. The elements aren't evaluated, but it's still O(L).

Haskell - get nth element without "!!"

I need to get the nth element of a list but without using the !! operator. I am extremely new to haskell so I'd appreciate if you can answer in more detail and not just one line of code. This is what I'm trying at the moment:
nthel:: Int -> [Int] -> Int
nthel n xs = 0
let xsxs = take n xs
nthel n xs = last xsxs
But I get: parse error (possibly incorrect indentation)

There's a lot that's a bit off here.
nthel :: Int -> [Int] -> Int
is technically correct, but really we want
nthel :: Int -> [a] -> a
so that we can use this on lists of anything (this change is optional).
nthel n xs = 0
What you just said is "No matter what you give to nthel, return 0", which is clearly wrong.
let xsxs = ...
This is just not legal Haskell. let ... in ... is an expression; it can't be used at the top level.
From there I'm not really sure what that's supposed to do.
Maybe this will help put you on the right track
nthelem n [] = <???> -- error case, empty list
nthelem 0 xs = head xs
nthelem n xs = <???> -- recursive case
Try filling in the <???> with your best guess and I'm happy to help from there.
Alternatively you can use Haskell's "pattern matching" syntax. I explain how you can do this with lists here.
That changes our above to
nthelem n [] = <???> -- error case, empty list
nthelem 0 (x:xs) = x --bind x to the first element, xs to the rest of the list
nthelem n (x:xs) = <???> -- recursive case
Doing this is handy since it removes the need to use head and tail explicitly.
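For readers who want to compare their own attempt, here is one possible way to fill in the blanks. It is only a sketch: returning Maybe instead of raising an error, and rejecting negative indices, are choices made here and not part of the skeleton above.

```haskell
-- Sketch of one possible completion; returning Maybe is a choice made here.
nthelem :: Int -> [a] -> Maybe a
nthelem _ [] = Nothing                 -- ran out of list: no such element
nthelem n (x:xs)
  | n < 0 = Nothing                    -- negative indices are rejected
  | n == 0 = Just x                    -- found it
  | otherwise = nthelem (n - 1) xs     -- skip x and look in the rest
```

With this definition, nthelem 2 "hello" is Just 'l' and nthelem 5 "hello" is Nothing.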

I think you meant this:
nthel n xs = last xsxs
  where xsxs = take n xs
... which you can simplify as:
nthel n xs = last (take n xs)

I think you should avoid using last whenever possible - lists are made to be used from the "front end", not from the back. What you want is to get rid of the first n elements, and then get the head of the remaining list (of course you get an error if the rest is empty). You can express this quite directly as:
nthel n xs = head (drop n xs)
Or shorter:
nthel n = head . drop n
Or slightly crazy:
nthel = (head .) . drop

As you know, lists aren't naturally indexed, but this can be overcome with a common trick.
Try zip [0..] "hello" in ghci. What about zip [0,1,2] "hello" or zip [0..10] "hello"?
Starting from this observation, we can now easily obtain a way to index our list.
Moreover, it is a good illustration of the use of laziness, a useful hint for your learning process.
Then, based on this and using pattern matching, we can provide an efficient algorithm:
Handle the boundary cases (empty list, negative index).
Replace the list by an indexed version using zip.
Call a helper function designed to process our indexed list recursively.
Now for the helper function: the list can't be empty, so we can pattern match naively, and
if our index is equal to n, we have a winner;
else, if there is no next element, it's over;
else, call the helper function with the rest of the list.
As an additional note, since our function can fail (empty list, ...), it is a good idea to wrap our result in the Maybe type.
Putting this all together, we end up with:
nth :: Int -> [a] -> Maybe a
nth n xs
  | null xs || n < 0 = Nothing
  | otherwise = helper n zs
  where
    zs = zip [0..] xs
    helper n ((i,c):zs)
      | i == n = Just c
      | null zs = Nothing
      | otherwise = helper n zs

How to define a rotates function

How to define a rotates function that generates all rotations of the given list?
For example: rotates [1,2,3,4] =[[1,2,3,4],[2,3,4,1],[3,4,1,2],[4,1,2,3]]
I wrote a shift function that can rearrange the order:
shift :: [Int] -> [Int]
shift x = tail x ++ take 1 x
but I don't know how to generate these new lists and append them together.

Another way to calculate all rotations of a list is to use the predefined functions tails and inits. The function tails yields a list of all final segments of a list while inits yields a list of all initial segments. For example,
tails [1,2,3] = [[1,2,3], [2,3], [3],   []]
inits [1,2,3] = [[],      [1],   [1,2], [1,2,3]]
That is, if we concatenate these lists pointwise as indicated by the indentation, we get all rotations. We only get the original list twice, namely, once by appending the empty initial segment at the end of the original list and once by appending the empty final segment to the front of the original list. Therefore, we use the function init to drop the last element of the result of applying zipWith to the tails and inits of a list. The function zipWith applies its first argument pointwise to the provided lists.
import Data.List (inits, tails)

allRotations :: [a] -> [[a]]
allRotations l = init (zipWith (++) (tails l) (inits l))
This solution has an advantage over the other solutions as it does not use length. The function length is quite strict in the sense that it does not yield a result before it has evaluated the list structure of its argument completely. For example, if we evaluate the application
allRotations [1..]
that is, we calculate all rotations of the infinite list of natural numbers, ghci happily starts printing the infinite list as the first result. In contrast, an implementation that is based on length, as suggested here, does not terminate, as it tries to calculate the length of the infinite list.
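The laziness claim can be checked without printing an infinite list. The sketch below repeats the definition so that it is self-contained; peek is a hypothetical name that looks only at a finite corner of the infinite result.

```haskell
import Data.List (inits, tails)

-- Same definition as in the answer, repeated so the block is self-contained.
allRotations :: [a] -> [[a]]
allRotations l = init (zipWith (++) (tails l) (inits l))

-- Peek at an infinite result: first three elements of the first two rotations.
peek :: [[Int]]
peek = take 2 (map (take 3) (allRotations [1 ..]))
```

Evaluating peek gives [[1,2,3],[2,3,4]], and allRotations [1,2,3,4] gives [[1,2,3,4],[2,3,4,1],[3,4,1,2],[4,1,2,3]].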

shift (x:xs) = xs ++ [x]
rotates xs = take (length xs) $ iterate shift xs
iterate f x returns the stream ("infinite list") [x, f x, f (f x), ...]. There are n rotations of an n-element list, so we take the first n of them.

The following
shift :: [a] -> Int -> [a]
shift l n = drop n l ++ take n l
allRotations :: [a] -> [[a]]
allRotations l = [ shift l i | i <- [0 .. (length l) -1]]
yields
> ghci
Prelude> :l test.hs
[1 of 1] Compiling Main ( test.hs, interpreted )
Ok, modules loaded: Main.
*Main> allRotations [1,2,3,4]
[[1,2,3,4],[2,3,4,1],[3,4,1,2],[4,1,2,3]]
which is as you expect.
I think this is fairly readable, although not particularly efficient (no memoisation of previous shifts occurs).
If you care about efficiency, then
shift :: [a] -> [a]
shift [] = []
shift (x:xs) = xs ++ [x]
allRotations :: [a] -> [[a]]
allRotations l = take (length l) (iterate shift l)
will allow you to reuse the results of previous shifts, and avoid recomputing them.
Note that iterate returns an infinite list, and due to lazy evaluation, we only ever evaluate it up to length l into the list.
Note that in the first part, I've extended your shift function to take the amount to shift by, and I've then used a list comprehension for allRotations.

The answers given so far work fine for finite lists, but will eventually error out when given an infinite list. (They all call length on the list.)
shift :: [a] -> [a]
shift xs = drop 1 xs ++ take 1 xs
rotations :: [a] -> [[a]]
rotations xs = zipWith const (iterate shift xs) xs
My solution uses zipWith const instead. zipWith const foos bars might appear at first glance to be identical to foos (recall that const x y = x). But the list returned from zipWith terminates when either of the input lists terminates.
So when xs is finite, the returned list is the same length as xs, as we want; and when xs is infinite, the returned list will not be truncated, so will be infinite, again as we want.
(In your particular application it may not make sense to try to rotate an infinite list. On the other hand, it might. I submit this answer for completeness only.)

I would prefer the following solution, using the built-in functions cycle and tails:
rotations xs = take len $ map (take len) $ tails $ cycle xs where
  len = length xs
For your example [1,2,3,4], the function cycle produces an infinite list [1,2,3,4,1,2,3,4,1,2...]. The function tails generates all possible tails of a given list, here [[1,2,3,4,1,2...],[2,3,4,1,2,3...],[3,4,1,2,3,4...],...]. Now all we need to do is cut down the "tails" lists to length 4, and cut the overall list to length 4, which is done using take. The alias len was introduced to avoid recalculating length xs several times.

I think it will be something like this (I don't have ghc right now, so I couldn't try it)
shift (x:xs) = xs ++ [x]
rotateHelper xs 0 = []
rotateHelper xs n = xs : (rotateHelper (shift xs) (n - 1))
rotate xs = rotateHelper xs (length xs)

myRotate lst = lst : myRotateiter lst lst
  where myRotateiter (x:xs) orig
          | temp == orig = []
          | otherwise = temp : myRotateiter temp orig
          where temp = xs ++ [x]

I suggest:
rotate l = l : rotate (drop 1 l ++ take 1 l)
distinctRotations l = take (length l) (rotate l)

Why does this first Haskell function FAIL to handle infinite lists, while this second snippet SUCCEEDS with infinite lists?

I have two Haskell functions, both of which seem very similar to me. But the first one FAILS against infinite lists, and the second one SUCCEEDS against infinite lists. I have been trying for hours to nail down exactly why that is, but to no avail.
Both snippets are a re-implementation of the "words" function in Prelude. Both work fine against finite lists.
Here's the version that does NOT handle infinite lists:
myWords_FailsOnInfiniteList :: String -> [String]
myWords_FailsOnInfiniteList string = foldr step [] (dropWhile charIsSpace string)
  where
    step space ([]:xs) | charIsSpace space = []:xs
    step space (x:xs) | charIsSpace space = []:x:xs
    step space [] | charIsSpace space = []
    step char (x:xs) = (char : x) : xs
    step char [] = [[char]]
Here's the version that DOES handle infinite lists:
myWords_anotherReader :: String -> [String]
myWords_anotherReader xs = foldr step [""] xs
  where
    step x result | not . charIsSpace $ x = [x:(head result)]++tail result
                  | otherwise = []:result
Note: "charIsSpace" is merely a renaming of Char.isSpace.
The following interpreter session illustrates that the first one fails against an infinite list while the second one succeeds.
*Main> take 5 (myWords_FailsOnInfiniteList (cycle "why "))
*** Exception: stack overflow
*Main> take 5 (myWords_anotherReader (cycle "why "))
["why","why","why","why","why"]
EDIT: Thanks to the responses below, I believe I understand now. Here are my conclusions and the revised code:
Conclusions:
The biggest culprit in my first attempt were the 2 equations that started with "step space []" and "step char []". Matching the second parameter of the step function against [] is a no-no, because it forces the whole 2nd arg to be evaluated (but with a caveat to be explained below).
At one point, I had thought (++) might evaluate its right-hand argument later than cons would, somehow. So, I thought I might fix the problem by changing " = (char:x):xs" to "= [char : x] ++ xs". But that was incorrect.
At one point, I thought that pattern matching the second arg against (x:xs) would cause the function to fail against infinite lists. I was almost right about this, but not quite! Evaluating the second arg against (x:xs), as I do in a pattern match above, WILL cause some recursion. It will "turn the crank" until it hits a ":" (aka, "cons"). If that never happened, then my function would not succeed against an infinite list. However, in this particular case, everything is OK because my function will eventually encounter a space, at which point a "cons" will occur. And the evaluation triggered by matching against (x:xs) will stop right there, avoiding the infinite recursion. At that point, the "x" will be matched, but the xs will remain a thunk, so there's no problem. (Thanks to Ganesh for really helping me grasp that).
In general, you can mention the second arg all you want, as long as you don't force evaluation of it. If you've matched against x:xs, then you can mention xs all you want, as long as you don't force evaluation of it.
So, here's the revised code. I usually try to avoid head and tail, merely because they are partial functions, and also because I need practice writing the pattern matching equivalent.
myWords :: String -> [String]
myWords string = foldr step [""] (dropWhile charIsSpace string)
  where
    step space acc | charIsSpace space = "":acc
    step char (x:xs) = (char:x):xs
    step _ [] = error "this should be impossible"
This correctly works against infinite lists. Note there's no head, tail or (++) operator in sight.
Now, for an important caveat:
When I first wrote the corrected code, I did not have the 3rd equation, which matches against "step _ []". As a result, I received the warning about non-exhaustive pattern matches. Obviously, it is a good idea to avoid that warning.
But I thought I was going to have a problem. I already mentioned above that it is not OK to pattern match the second arg against []. But I would have to do so in order to get rid of the warning.
However, when I added the "step _ []" equation, everything was fine! There was still no problem with infinite lists! Why?
Because the 3rd equation in the corrected code IS NEVER REACHED!
In fact, consider the following BROKEN version. It is EXACTLY the SAME as the correct code, except that I have moved the pattern for empty list up above the other patterns:
myWords_brokenAgain :: String -> [String]
myWords_brokenAgain string = foldr step [""] (dropWhile charIsSpace string)
  where
    step _ [] = error "this should be impossible"
    step space acc | charIsSpace space = "":acc
    step char (x:xs) = (char:x):xs
We're back to stack overflow, because the first thing that happens when step is called is that the interpreter checks to see if equation number one is a match. To do so, it must see if the second arg is []. To do that, it must evaluate the second arg.
Moving the equation down BELOW the other equations ensures that the 3rd equation is never attempted, because either the first or the second pattern always matches. The 3rd equation is merely there to dispense with the non-exhaustive pattern warning.
This has been a great learning experience. Thanks to everyone for your help.

Others have pointed out the problem, which is that step always evaluates its second argument before producing any output at all, yet its second argument will ultimately depend on the result of another invocation of step when the foldr is applied to an infinite list.
It doesn't have to be written this way, but your second version is kind of ugly because it relies on the initial argument to step having a particular format and it's quite hard to see that the head/tail will never go wrong. (I'm not even 100% certain that they won't!)
What you should do is restructure the first version so it produces output without depending on the input list in at least some situations. In particular we can see that when the character is not a space, there's always at least one element in the output list. So delay the pattern-matching on the second argument until after producing that first element. The case where the character is a space will still be dependent on the list, but that's fine because the only way that case can infinitely recurse is if you pass in an infinite list of spaces, in which case not producing any output and going into a loop is the expected behaviour for words (what else could it do?)
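One way to carry out this restructuring is sketched below. It is an illustration of the idea and not the code the asker ended up with: lazyWords is a hypothetical name, and the lazy where-binding is one of several ways to delay the match on the second argument until after the first cons cell has been produced.

```haskell
import Data.Char (isSpace)

-- Sketch: a foldr-based words that yields output before inspecting 'rest'
-- whenever the current character is not a space.
lazyWords :: String -> [String]
lazyWords = foldr step [] . dropWhile isSpace
  where
    step c rest
      | isSpace c = case rest of
          ([] : _) -> rest        -- collapse runs of spaces
          _        -> [] : rest   -- start a new (so far empty) word
      | otherwise = (c : w) : ws  -- cons cell produced before 'rest' is forced
      where
        -- Lazy binding: 'rest' is only examined when w or ws is demanded.
        (w, ws) = case rest of
          []       -> ([], [])
          (x : xs) -> (x, xs)
```

With this definition, take 5 (lazyWords (cycle "why ")) returns five copies of "why". Only the space case inspects rest right away, and that inspection stops as soon as the next non-space character has produced its cons cell.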

Try expanding the expression by hand:
take 5 (myWords_FailsOnInfiniteList (cycle "why "))
take 5 (foldr step [] (dropWhile charIsSpace (cycle "why ")))
take 5 (foldr step [] (dropWhile charIsSpace ("why " ++ cycle "why ")))
take 5 (foldr step [] ("why " ++ cycle "why "))
take 5 (step 'w' (foldr step [] ("hy " ++ cycle "why ")))
take 5 (step 'w' (step 'h' (foldr step [] ("y " ++ cycle "why "))))
What's the next expansion? You should see that in order to pattern match for step, you need to know whether it's the empty list or not. In order to find that out, you have to evaluate it, at least a little bit. But that second term happens to be a foldr reduction by the very function you're pattern matching for. In other words, the step function cannot look at its arguments without calling itself, and so you have an infinite recursion.
Contrast that with an expansion of your second function:
myWords_anotherReader (cycle "why ")
foldr step [""] (cycle "why ")
foldr step [""] ("why " ++ cycle "why ")
step 'w' (foldr step [""] ("hy " ++ cycle "why "))
let result = foldr step [""] ("hy " ++ cycle "why ") in
    ['w':(head result)] ++ tail result
let result = step 'h' (foldr step [""] ("y " ++ cycle "why ")) in
    ['w':(head result)] ++ tail result
You can probably see that this expansion will continue until a space is reached. Once a space is reached, "head result" will obtain a value, and you will have produced the first element of the answer.
I suspect that this second function will overflow for infinite strings that don't contain any spaces. Can you see why?

The second version does not actually evaluate result until after it has started producing part of its own answer. The first version evaluates result immediately by pattern matching on it.
The key with these infinite lists is that you have to produce something before you start demanding list elements so that the output can always "stay ahead" of the input.
(I feel like this explanation is not very clear, but it's the best I can do.)
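A much smaller example may make the "stay ahead of the input" point more concrete. This is only a sketch with made-up names (doubleAll, doubleAllEager); it is unrelated to the words functions except that it uses foldr in the same two styles.

```haskell
-- Sketch: two folds that agree on finite lists but differ on infinite ones.

-- Produces the cons cell first and leaves 'rest' untouched, so the output
-- stays ahead of the input and infinite lists are fine.
doubleAll :: [Int] -> [Int]
doubleAll = foldr (\x rest -> 2 * x : rest) []

-- Inspects 'rest' before producing anything; on an infinite list this
-- recursion never reaches a base case.
doubleAllEager :: [Int] -> [Int]
doubleAllEager = foldr step []
  where
    step x [] = [2 * x]
    step x rest@(_ : _) = 2 * x : rest
```

Here take 3 (doubleAll [1..]) gives [2,4,6], whereas take 3 (doubleAllEager [1..]) never returns, even though both functions give the same result on every finite list.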

The library function foldr has this implementation (or similar):
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f k (x:xs) = f x (foldr f k xs)
foldr _ k _ = k
The result of myWords_FailsOnInfiniteList depends on the result of foldr, which depends on the result of step, which depends on the result of the inner foldr, which depends on ... and so on. On an infinite list, myWords_FailsOnInfiniteList will use up an infinite amount of space and time before producing its first word.
The step function in myWords_anotherReader does not require the result of the inner foldr until after it has produced the first letter of the first word. Unfortunately, as Apocalisp says, it uses O(length of first word) space before it produces the next word, because as the first word is being produced, the tail thunk keeps growing: tail ([...] ++ tail ([...] ++ tail (...))).
In contrast, compare to
myWords :: String -> [String]
myWords = myWords' . dropWhile isSpace where
  myWords' [] = []
  myWords' string =
    let (part1, part2) = break isSpace string
    in part1 : myWords part2
using library functions which may be defined as
break :: (a -> Bool) -> [a] -> ([a], [a])
break p = span $ not . p
span :: (a -> Bool) -> [a] -> ([a], [a])
span p xs = (takeWhile p xs, dropWhile p xs)
takeWhile :: (a -> Bool) -> [a] -> [a]
takeWhile p (x:xs) | p x = x : takeWhile p xs
takeWhile _ _ = []
dropWhile :: (a -> Bool) -> [a] -> [a]
dropWhile p (x:xs) | p x = dropWhile p xs
dropWhile _ xs = xs
Notice that producing the intermediate results is never held up by future computation, and only O(1) space is needed as each element of the result is made available for consumption.
Addendum
So, here's the revised code. I usually try to avoid head and tail, merely because they are partial functions, and also because I need practice writing the pattern matching equivalent.
myWords :: String -> [String]
myWords string = foldr step [""] (dropWhile charIsSpace string)
  where
    step space acc | charIsSpace space = "":acc
    step char (x:xs) = (char:x):xs
    step _ [] = error "this should be impossible"
(Aside: You may not care, but words "" == [] in the library, whereas your myWords "" == [""]. There is a similar issue with trailing spaces.)
This looks much improved over myWords_anotherReader, and is pretty good for a foldr-based solution. Consider
\n -> tail $ myWords $ replicate n 'a' ++ " b"
It's not possible to do better than O(n) time, but both myWords_anotherReader and myWords take O(n) space here. This may be inevitable given the use of foldr.
Worse, for
\n -> head $ head $ myWords $ replicate n 'a' ++ " b"
myWords_anotherReader was O(1) but the new myWords is O(n), because pattern matching on (x:xs) requires the further result.
You can work around this with
myWords :: String -> [String]
myWords = foldr step [""] . dropWhile isSpace
  where
    step space acc | isSpace space = "":acc
    step char ~(x:xs) = (char:x):xs
The ~ introduces an "irrefutable pattern". Irrefutable patterns never fail and do not force immediate evaluation.
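The effect of ~ can be seen in isolation with a tiny experiment. This is a sketch with hypothetical names (prependStrict, prependLazy, firstChar); both functions mimic the non-space case of step.

```haskell
-- Sketch: prependStrict and prependLazy are illustrative names.

-- Ordinary pattern: the second argument is forced before anything is returned.
prependStrict :: Char -> [String] -> [String]
prependStrict c (x : xs) = (c : x) : xs
prependStrict c [] = [[c]]

-- Irrefutable pattern: the match is deferred until x or xs is actually used.
prependLazy :: Char -> [String] -> [String]
prependLazy c ~(x : xs) = (c : x) : xs

-- The first character is available even though the list is undefined.
firstChar :: Char
firstChar = head (head (prependLazy 'a' undefined))
```

Evaluating firstChar gives 'a', whereas the same expression with prependStrict would throw, because its pattern match has to evaluate undefined first. The price of ~ is that a failed match (for example on []) is only reported later, when x or xs is demanded.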
