Haskell laziness in list comprehensions - haskell

I wanted to take a sum of all even numbers <= 1000.
The following code:
sum [x | x <- [1..1000], even x] (I know it can be done with [2,4..1000], this is for practice)
reports that the sum is 250500.
However:
sum [x | x <- [1..], even x && x <= 1000]
never finishes and has to be interrupted!
I thought that I could safely write [1..], which is an infinite list, because Haskell would not try to evaluate it.
Furthermore, I thought that it would simply start going x by x, checking them and adding them.
So why does the above fail to produce a result?

sum [x | x <- [1..], even x && x <= 1000]
translates to something like
sum (filter (\x -> even x && x <= 1000) [1..])
The range expression (or, desugared, enumFrom) continues to generate values and filter keeps discarding them (it doesn't know there'll never be another element that satisfies the predicate), but sum never gets to the end of the list, so it can't return a result. You want to stop evaluating the list as soon as you see the first value greater than 1000:
sum (takeWhile (<= 1000) $ filter even [1..])
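To make the difference concrete, here is a minimal sketch comparing the bounded comprehension with the takeWhile version. The names sumEvensBounded and sumEvensLazy are made up for this illustration.

```haskell
-- Bounded generator: the comprehension itself is finite.
sumEvensBounded :: Integer
sumEvensBounded = sum [x | x <- [1 .. 1000], even x]

-- Infinite generator, cut off explicitly: takeWhile ends the list at the
-- first even number greater than 1000, so sum sees a finite list.
sumEvensLazy :: Integer
sumEvensLazy = sum (takeWhile (<= 1000) (filter even [1 ..]))
```

Both values should come out as 250500, the result reported in the question.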

Related

Infinite loop on a simple list for two predicates

When I try to compile this line:
mult y = [x*2 | x <- [1..], x <= y]
and run it, I get an infinite loop that I have to cancel with Ctrl+C:
*Main> mult 10
[2,4,6,8,10,12,14,16,18,20
Do you know why these predicates are not interpreted correctly?
Thank you
You're looking for
mult y = [x * 2 | x <- [1..y]]
In this version, the [1..y] gets compiled to a finite list from 1 up to y. In your original code
mult y = [x * 2 | x <- [1..], x <= y]
Haskell doesn't understand complicated concepts like the nature of <= as an ordering or that [1..] is a monotonic list. So Haskell is determined to come up with every natural number, just to make sure some really big number out there doesn't happen to be less than y, by some fluke. You and I can look at that code and see that it obviously won't find any, but Haskell doesn't understand that, so it goes looking anyway.
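If you do want to keep the infinite [1..], one possible sketch is to cut it off with takeWhile inside the generator. The name multLazy is hypothetical and only used here to avoid clashing with mult.

```haskell
-- takeWhile stops at the first x greater than y, so the infinite [1 ..]
-- is only consumed as far as needed and the result is a finite list.
multLazy :: Integer -> [Integer]
multLazy y = [x * 2 | x <- takeWhile (<= y) [1 ..]]
```

With this definition, multLazy 10 should produce the same ten elements as the [1..y] version and then end the list instead of hanging.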

Haskell: Filtering a list based on a predicate for all other elements in the list

I have a list of natural numbers [1..n] (this list is never empty) and I would like to filter each element by testing a predicate with all other elements in the list. I would like to return a list of those numbers who never fulfilled the predicate. My idea is this:
filter (\x -> 1 == length [y| y <- [1..n], pred y x]) [1..n]
I am testing if the length is equal to 1 since for x==y the predicate returns true.
This does work as intended; however, I was wondering if there is a cleaner way to do this. I'm not really looking for more performance, but rather a simpler solution.
As far as complexity goes, I don't think you can do better than quadratic, since, after all, the very definition of the problem is to test each element with each other. So unless there is more to be known about the structure of the problem, you're stuck there.
But you can perhaps cut down on the running time somewhat by stopping early. Calculating length every time means enumerating all elements from 1 to n, but you don't actually need that, right? You can stop enumerating once pred returns True for the first time. To do that you can use and:
filter (\x -> and [not (pred y x) | y <- [1..n], y /= x]) [1..n]
Or, alternatively, you can move the predicate to the condition part and then test the resulting list for emptiness:
filter (\x -> null [y | y <- [1..n], y /= x && pred y x]) [1..n]
But I like the former variant better, because it better describes the intent.
Finally, I think this would look cleaner as a list comprehension:
[ x
| x <- [1..n]
, and [not (pred y x) | y <- [1..n], y /= x]
]
But that's a matter of personal taste, of course.
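For a runnable illustration, here is a sketch with a concrete predicate plugged in. The predicate isMultipleOf and the function name unmatched are assumptions made up for this example (the question does not say what pred is), and any is used as an equivalent way to stop at the first match.

```haskell
-- Hypothetical predicate: y is a multiple of x. Like the predicate in the
-- question, it holds when x == y.
isMultipleOf :: Int -> Int -> Bool
isMultipleOf y x = y `mod` x == 0

-- Keep every x for which the predicate holds for no other element of
-- [1 .. n]. any stops at the first y that satisfies the predicate.
unmatched :: (Int -> Int -> Bool) -> Int -> [Int]
unmatched p n =
  [ x
  | x <- [1 .. n]
  , not (any (\y -> y /= x && p y x) [1 .. n])
  ]
```

With this predicate, unmatched isMultipleOf 10 should keep exactly the numbers that have no other multiple in the range, that is 6 through 10.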

Is it possible to exit a generator?

Consider the following:
list = [1,3..]
generate n = [compute y | y <- list , (compute y) < n ]
compute a = ... whatever ...
Is it possible to exit the generator before getting to the last element of my list (e.g. if compute y > 20)?
I want to save computing power. I only need the elements smaller than n.
I'm new to Haskell. A simple answer might be the best answer.
The wonderful thing about Haskell is that it's lazy. If you said
> let x = generate 100000
then Haskell doesn't immediately calculate generate 100000, it just creates a promise to start calculating it (we normally call this a thunk).
If you only want elements until compute y > 20, then you can do
> takeWhile (<= 20) (generate 100000)
This is the same semantics that let you do something like
> let nums = [1..] :: [Integer]
This makes a lazy reference to all Integer values from 1 to infinity. You can then do things like
> take 10 $ map (* 10) $ drop 12345 $ map (\x -> x ^ 2 + x ^ 3 + x ^ 4) $ filter even nums
[3717428823832552480,3718633373599415160,3719838216073150080,3721043351301172120,3722248779330900000,3723454500209756280,3724660513985167360,3725866820704563480,3727073420415378720,3728280313165051000]
And while this seems like a lot of work, it only calculates the bare minimum necessary to return the 10 elements you requested. The argument to take 10 in this example is still an infinite list, where we first grabbed all the evens, then mapped an algebraic expression to it, then dropped the first 12345 elements, then multiplied all remaining (infinite) elements by 10. Working with infinite structures in Haskell is very common and often advantageous.
As a side note, your current definition of generate will do extra work. You'd want something more like
generate n = [compute_y | y <- list, let compute_y = compute y, compute_y < n]
This way compute y is only calculated once and the value is shared between your filter compute_y < n and the left hand side of the | in the comprehension. Also be aware that when you have a condition in a comprehension, this gets translated to a filter, not a takeWhile:
> -- filter applies the predicate to all elements in the list
> filter (\x -> x `mod` 5 == 0) [5,10,15,21,25]
[5,10,15,25]
> -- takeWhile pulls values until the predicate returns False
> takeWhile (\x -> x `mod` 5 == 0) [5,10,15,21,25]
[5,10,15]
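Putting this together, a version of generate that really does stop early could look like the sketch below. The body of compute is a made-up stand-in (the question elides it), and the approach assumes compute is increasing over the list; if it is not, takeWhile and a filter give different results.

```haskell
-- Hypothetical stand-in for the elided compute function.
compute :: Integer -> Integer
compute a = a * a

-- The infinite list from the question.
list :: [Integer]
list = [1, 3 ..]

-- takeWhile ends the list at the first value that is not below n,
-- whereas a guard in a comprehension would keep searching forever.
generate :: Integer -> [Integer]
generate n = takeWhile (< n) (map compute list)
```

With this stand-in, generate 50 should yield [1,9,25,49] and then end, because 9 * 9 = 81 is the first value that is not below 50.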

Keep getting stack overflow

I am repeatedly getting a stack overflow on my solution to Project Euler #7 and I have no idea why.
Here is my code:
import System.Environment
checkPrime :: Int -> Bool
checkPrime n = not $ testList n [2..n `div` 2]
--testList :: Int -> [Int] -> Bool
testList _ [] = False
testList n xs
| (n `rem` (head xs) == 0) = True
| otherwise = testList n (tail xs)
primesTill n = sum [1 | x <- [2..n], checkPrime x]
nthPrime n = nthPrime' n 2
nthPrime' n x
| (primesTill x == n) = x
| otherwise = nthPrime' n x+1
main = print (nthPrime 10001)
Resolving the stack overflow
As @bheklilr mentioned in his comment, the stack overflow is caused by the way the expression in the otherwise branch of the nthPrime' function is parsed:
nthPrime' n x+1
will be parsed as
(nthPrime' n x)+1
Because this expression is called recursively, your call of nthPrime' n 2 will expand into
(nthPrime' n 2)+1+1+1+1+1+1+1+1 ...
but the second parameter is never incremented, and your program piles up a mass of pending additions. An addition can only be performed once the recursive call on its left has returned an Int, but your function recurses endlessly, so this never happens. All the plus ones are stored on the stack, and when there is no more space left you get a stack overflow error.
To solve this problem you need to put parentheses around the x+1 so your recursive call looks like this
nthPrime' n (x+1)
Now the parameter gets incremented before it is passed to the recursive call.
This should solve your stack overflow problem. You can try it out with a smaller number, e.g. 101, and you'll get the desired result.
Runtime optimization
If you test your program with the original value 10001 you may realize that it still won't finish in a reasonable amount of time.
I won't go into the details of fancy algorithms to solve this problem; if you're interested in them you can easily find them online.
Instead I'll show you where the problem in your code is and show you a simple solution.
The bottleneck is your nthPrime function:
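The parsing rule is easy to check in isolation. In the sketch below, scale is a hypothetical two-argument function introduced only for this demonstration; it is not part of the original program.

```haskell
-- Hypothetical two-argument function, used only to show the parsing.
scale :: Int -> Int -> Int
scale n x = n * x

-- Function application binds tighter than (+), so this is (scale 2 3) + 1.
withoutParens :: Int
withoutParens = scale 2 3+1

-- Here the second argument really is 3 + 1.
withParens :: Int
withParens = scale 2 (3 + 1)
```

The first value is 2 * 3 + 1 = 7 and the second is 2 * 4 = 8, which shows that the spacing around + makes no difference to how the expression is grouped.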
primesTill n = sum [1 | x <- [2..n], checkPrime x]
nthPrime n = nthPrime' n 2
nthPrime' n x
| (primesTill x == n) = x
| otherwise = nthPrime' n (x+1)
This function checks whether the number of primes between 2 and x is equal to n. The idea is correct, but it makes the runtime far worse than necessary. The problem is that you recalculate primesTill x for every iteration. To count the primes up to x you find them all and then count them. In the next step, for x+1, you forget everything you know about the numbers between 2 and x and test them all again for primality; only as a last step do you test whether x+1 is prime. Then you repeat this (forget everything and test all numbers again) until you are finished.
Wouldn't it be great if the computer could remember the primes it has already found?
There are many ways to do this. I'll use a simple infinite list; if you are interested in other approaches you can search for the terms memoization or dynamic programming.
We start with the list comprehension you used in primesTill:
[1 | x <- [2..n], checkPrime x]
This calculates all primes between 2 and n, but immediately forgets the prime number and replaces it with 1, so the first step will be to keep the actual numbers.
[x | x <- [2..n], checkPrime x]
This gives us a list of all prime numbers between 2 and n. If we had a sufficiently large list of prime numbers we could use the index function !! to get the 10001st prime number. So we need to set n to a really really big number, to be sure that the filtered list is long enough?
Lazy evaluation to the rescue!
Lazy evaluation in Haskell allows us to build an infinite list that is only evaluated as much as needed. If we don't supply an upper bound to a list generator it will build such an infinite list for us.
[x | x <- [2..], checkPrime x]
Now we have an infinite list of all prime numbers.
We can bind it to a name, e.g. primes, and use it to define nthPrime
primes = [x | x <- [2..], checkPrime x]
-- (!!) is zero-based, so the nth prime (counting from 1) is at index n - 1
nthPrime n = primes !! (n - 1)
Now you can compile it with ghc -O2, run it and the result will be promptly delivered to you.
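For reference, the pieces can be assembled into one small module, sketched below. Two things here are my own choices rather than part of the original code: checkPrime is written with any instead of the hand-rolled testList, and nthPrime counts from 1, so it indexes with n - 1 because (!!) is zero-based.

```haskell
-- Trial division as in the question: n is prime if no d in
-- [2 .. n `div` 2] divides it.
checkPrime :: Int -> Bool
checkPrime n = not (any (\d -> n `rem` d == 0) [2 .. n `div` 2])

-- Infinite, lazily evaluated list of primes. As a top-level value it is
-- evaluated only as far as demanded, and the evaluated prefix is reused.
primes :: [Int]
primes = [x | x <- [2 ..], checkPrime x]

-- (!!) is zero-based, so the nth prime (counting from 1) is at index n - 1.
nthPrime :: Int -> Int
nthPrime n = primes !! (n - 1)
```

A quick sanity check is take 6 primes, which should start 2, 3, 5, 7, 11, 13, so nthPrime 6 is 13.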

using takeWhile AND filters with list comprehension issue

I am confused why GHCi is calculating this list infinitely:
takeWhile (>0) [x^2 | x <- [100, 99..], odd x]
This list, however, stops and is calculated as expected:
takeWhile (>0) [x | x <- [100, 99..], odd x]
What am I missing here? Why is it that squaring the input causes takeWhile(>0) to have no effect?
Also, this list ends as expected...why does this terminate but not the other?
takeWhile (<1000) [x^2 | x <- [1..], odd x]
Also, if I remove the odd x filter from the first list, it terminates properly:
takeWhile (>0) [x^2 | x <- [100, 99..]]
What the heck is going on?
Because x^2 >= 0 is always true, you do actually have an infinite list. For example:
-1 * -1 = 1 > 0
This is just math.
So there's only one case where x^2 > 0 fails: when x = 0. With the odd x condition in there, 0 is never considered, so the list never terminates. When you remove it, the list stops when x = 0.
Finally, the third list terminates because the squares keep growing and eventually reach 1000 (33^2 = 1089).
x^2 is always > 0 for x /= 0.
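You could bound the generator itself, so that the list is finite (an extra guard such as x > 0 is only a filter and would not stop the enumeration):
[x^2 | x <- takeWhile (> 0) [100, 99..], odd x]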
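The terminating cases discussed above can be written down and checked as in the sketch below. The three names are hypothetical labels for this illustration, and the first definition bounds the generator with takeWhile rather than relying on a guard.

```haskell
-- Bounding the generator makes the comprehension itself finite.
oddSquaresDown :: [Integer]
oddSquaresDown = [x ^ 2 | x <- takeWhile (> 0) [100, 99 ..], odd x]

-- Without the odd filter, x = 0 is reached and 0 ^ 2 > 0 fails, so
-- takeWhile ends the list there.
squaresDown :: [Integer]
squaresDown = takeWhile (> 0) [x ^ 2 | x <- [100, 99 ..]]

-- Increasing squares eventually reach 1000, so this stops as well.
oddSquaresBelow1000 :: [Integer]
oddSquaresBelow1000 = takeWhile (< 1000) [x ^ 2 | x <- [1 ..], odd x]
```

The first list runs from 99^2 = 9801 down to 1, the second from 100^2 = 10000 down to 1, and the third ends at 31^2 = 961.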
