When I try to compile this line:
mult y = [x*2 | x <- [1..], x <= y]
and run it, I get an infinite loop that I have to cancel with Ctrl+C:
*Main> mult 10
[2,4,6,8,10,12,14,16,18,20
Do you know why the predicate is not interpreted correctly?
Thank you
You're looking for
mult y = [x * 2 | x <- [1..y]]
In this version, the [1..y] gets compiled to a finite list from 1 up to y. In your original code
mult y = [x * 2 | x <- [1..], x <= y]
Haskell doesn't understand complicated concepts like the nature of <= as an ordering or that [1..] is a monotonic list. So Haskell is determined to come up with every natural number, just to make sure some really big number out there doesn't happen to be less than y, by some fluke. You and I can look at that code and see that it obviously won't find any, but Haskell doesn't understand that, so it goes looking anyway.
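If you want to keep the infinite generator, one option is to cut it off explicitly with takeWhile, which stops at the first element that fails the test instead of filtering forever. A minimal sketch (the type signature is my assumption; the original code has none):

```haskell
-- takeWhile ends the list at the first x that is greater than y,
-- so the comprehension only ever sees a finite list.
mult :: Int -> [Int]
mult y = [x * 2 | x <- takeWhile (<= y) [1 ..]]

main :: IO ()
main = print (mult 10) -- [2,4,6,8,10,12,14,16,18,20]
```

This relies on [1..] being increasing, which is exactly the knowledge the guard version cannot use.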
After reading about the Haskell syntax for list comprehensions online, I got the feeling that predicates always come last. E.g.:
[(x,y) | x <- [1..10000], y <- [1..100], x==2000, odd y]
But the following line accomplishes the same result:
[(x,y) | x <- [1..10000], x==2000, y <- [1..100], odd y]
Normally I would just take this as a hint that the order doesn't matter and be done with it. However this is a problem that comes from an old exam, and the answer to the problem says that while the results may be the same, the way in which they are computed may differ.
I'm assuming this is true but I can't find any information about it on the web. So my question is: How could the computations differ between the two list comprehensions and why? Are list comprehensions some form of syntactic sugar that I don't know about?
You can think of a list comprehension like
[(x,y) | x <- [1..10000], y <- [1..100], x==2000, odd y]
as corresponding to the imperative pseudo-code
for x in [1..10000]:
    for y in [1..100]:
        if x == 2000:
            if odd y:
                yield (x,y)
and
[(x,y) | x <- [1..10000], x==2000, y <- [1..100], odd y]
as corresponding to
for x in [1..10000]:
    if x == 2000:
        for y in [1..100]:
            if odd y:
                yield (x,y)
Specifically, passing the list comprehension to something like mapM_ print is the same operationally as replacing yield by print in the imperative version.
Obviously, it's almost always better to "float" a guard/if out of a generator/for when possible. (The rare exception is when the generator is actually an empty list, and the guard condition is expensive to compute.)
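To make the nesting visible in Haskell itself, here is a hand-written translation of both comprehensions into concatMap. This is only a sketch in the spirit of the usual desugaring, not what GHC literally generates, and the names guardsLast and guardFloated are made up for illustration:

```haskell
-- Guards last: the inner generator over y runs for every single x.
guardsLast :: [(Int, Int)]
guardsLast =
  concatMap
    (\x -> concatMap (\y -> if x == 2000 && odd y then [(x, y)] else []) [1 .. 100])
    [1 .. 10000]

-- Guard on x floated out: the inner generator runs only when x == 2000.
guardFloated :: [(Int, Int)]
guardFloated =
  concatMap
    (\x -> if x == 2000
             then concatMap (\y -> if odd y then [(x, y)] else []) [1 .. 100]
             else [])
    [1 .. 10000]

main :: IO ()
main = do
  print (guardsLast == guardFloated) -- True
  print (length guardFloated)        -- 50
```

Both produce the same list, but the first version walks 10000 * 100 pairs while the second walks 10000 values of x plus 100 values of y.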
They differ in how many intermediate results/lists are generated.
You can visualize this with some trace - note that I modified this a bit to give reasonable results - also I replaced the return values by () to make it clearer:
import Debug.Trace (trace)
comprehension1 = [ () | x <- [1..3], trace' 'x' x, y <- [1..3], trace' 'y' y, x==2, odd y]
comprehension2 = [ () | x <- [1..3], trace' 'x' x, x==2, y <- [1..3], trace' 'y' y, odd y]
-- Always succeeds as a guard; prints e.g. "x=1" as a side effect when evaluated.
trace' :: Show a => Char -> a -> Bool
trace' c x = trace (c : '=' : show x) True
Here is the evaluation:
λ> comprehension1
x=1
y=1
y=2
y=3
x=2
y=1
[()y=2
y=3
,()x=3
y=1
y=2
y=3
]
λ> comprehension2
x=1
x=2
y=1
[()y=2
y=3
,()x=3
]
Now do you notice something?
In the first example, every (x,y) pair for x=1,2,3 and y=1,2,3 is generated before the filters are applied.
But in the second example, the ys are only generated when x=2, so you could say it performs better.
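If you prefer a number to a trace, you can count how many y values each shape of comprehension visits by dropping the remaining guards and taking the length. A small sketch (the names ysVisited1 and ysVisited2 are mine):

```haskell
-- Guards after both generators: every x is paired with every y.
ysVisited1 :: Int
ysVisited1 = length [y | x <- [1 .. 3 :: Int], y <- [1 .. 3 :: Int]]

-- Guard on x before the y generator: y is only enumerated when x == 2.
ysVisited2 :: Int
ysVisited2 = length [y | x <- [1 .. 3 :: Int], x == 2, y <- [1 .. 3 :: Int]]

main :: IO ()
main = do
  print ysVisited1 -- 9
  print ysVisited2 -- 3
```

The counts match the number of y= lines in the two traces above.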
f x y z = [n | n <- z, n > x + y]
f 1 2 [3,4]
Would x + y be evaluated only once, so that later uses are replaced by the value 3? Is GHC optimized to do this, given that FP gives us referential transparency?
How can I trace it to prove this?
I don't think the computed value will be reused.
The general problem with this kind of thing is, x + y is cheap, but you could instead have some operation there that produces an utterly vast result, which you probably don't want to keep in memory. Which is a wordy way of saying "this is a time/space tradeoff".
Because of this, it seems GHC tends to not reuse work, in case the lost space doesn't make up for the gained time.
The way to find out for sure is to ask GHC to dump Core when it compiles your code. You can then see precisely what's going to get executed. (Be prepared for it to be very verbose though!) Oh, and make sure you turn on optimisations! (I.e., the -O2 flag.)
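As a sketch of how that could look: the pragma below asks GHC to optimise and to print the simplified Core while compiling this one module. -ddump-simpl is the dump flag, and -dsuppress-all hides most of the type and annotation noise; passing the same flags on the ghc command line should be equivalent. The module itself is just the function from the question with a type signature I added:

```haskell
{-# OPTIONS_GHC -O2 -ddump-simpl -dsuppress-all #-}
module Main (main) where

-- Same function as in the question; the signature is an assumption.
f :: Int -> Int -> [Int] -> [Int]
f x y z = [n | n <- z, n > x + y]

main :: IO ()
main = print (f 1 2 [3, 4]) -- [4]
```

In the dumped Core, look at whether the addition sits inside or outside the local loop over z.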
If you rephrase your function as
f x y z = let s = x + y in [ n | n <- z, n > s ]
Now s will be evaluated at most once per call to f, because the let-bound value is shared by every test in the comprehension. (Each time you call f, it will still recompute s.)
Incidentally, if you're interested in saving already-computed results for the whole function, the search term you're looking for is "memoisation".
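You can check the sharing of the let-bound value with Debug.Trace. In this sketch the trace message is attached to s itself, so it should show up once per call to f no matter how many elements z has (the message text and the signature are mine):

```haskell
import Debug.Trace (trace)

-- s is a single shared thunk: forcing it the first time prints the
-- message on stderr, and later comparisons reuse the stored result.
f :: Int -> Int -> [Int] -> [Int]
f x y z = let s = trace "computing s" (x + y) in [n | n <- z, n > s]

main :: IO ()
main = print (f 1 2 [3, 4, 5]) -- [4,5]
```

Because trace writes to stderr, the message may be interleaved with the printed list in a different order than you expect.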
What will happen can depend on whether you are using ghci vs. ghc and then, if you are compiling the code, what optimization level is being used.
Here is one way to test the evaluations:
import Debug.Trace
f x y z = [n | n <- z, n > tx x + ty y]
  where tx = trace "x"
        ty = trace "y"
main = print $ f 1 2 [3,4]
With GHC 7.8.3 I get the following results:
ghci: x y x y [4]
ghc (no optimization): x y x y [4]
ghc -O2: x y [4]
It is possible that the addition of the trace calls affects CSE optimization. But this does show that -O2 will hoist x+y out of the loop.
I am repeatedly getting a stack overflow on my solution to Project Euler #7 and I have no idea why.
Here is my code:
import System.Environment
checkPrime :: Int -> Bool
checkPrime n = not $ testList n [2..n `div` 2]
--testList :: Int -> [Int] -> Bool
testList _ [] = False
testList n xs
  | (n `rem` (head xs) == 0) = True
  | otherwise = testList n (tail xs)
primesTill n = sum [1 | x <- [2..n], checkPrime x]
nthPrime n = nthPrime' n 2
nthPrime' n x
  | (primesTill x == n) = x
  | otherwise = nthPrime' n x+1
main = print (nthPrime 10001)
Resolving the stack overflow
As @bheklilr mentioned in his comment, the stack overflow is caused by operator precedence in the otherwise branch of the nthPrime' function. Function application binds tighter than +, so
nthPrime' n x+1
is interpreted as
(nthPrime' n x)+1
Because this expression is called recursively, your call of nthPrime' n 2 will expand into
(nthPrime' n 2)+1+1+1+1+1+1+1+1 ...
but the second parameter never gets incremented, and your program piles up a mass of pending additions. An addition can only be evaluated once its left operand, the recursive call, has been reduced to an Int, but your function recurses endlessly, so this never happens. All the pending plus-ones are stored on the stack, and when there is no more space left you get a stack overflow error.
To solve this problem you need to put parentheses around the x+1, so your recursive call will look like this:
nthPrime' n (x+1)
Now the parameter gets incremented before it is passed to the recursive call.
This should solve your stack overflow problem. You can try it out with a smaller number, e.g. 101, and you'll get the desired result.
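The precedence rule is easy to see on a tiny example. In this sketch, double is a made-up helper that has nothing to do with primes; it only shows how the two spellings parse:

```haskell
-- Hypothetical helper, used only to show how application and + interact.
double :: Int -> Int
double x = x * 2

main :: IO ()
main = do
  print (double 3 + 1)   -- parsed as (double 3) + 1, prints 7
  print (double (3 + 1)) -- the argument is incremented first, prints 8
```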
Runtime optimization
If you test your program with the original value 10001, you may realize that it still won't finish in a reasonable amount of time.
I won't go into the details of fancy algorithms to solve this problem; if you're interested in them, you can easily find them online.
Instead I'll show you where the problem in your code is and show you a simple solution.
The bottleneck is your nthPrime function:
primesTill n = sum [1 | x <- [2..n], checkPrime x]
nthPrime n = nthPrime' n 2
nthPrime' n x
| (primesTill x == n) = x
| otherwise = nthPrime' n (x+1)
This function checks whether the number of primes between 2 and x is equal to n. The idea is correct, but it leads to a very slow runtime, roughly cubic in the size of the answer with this trial-division test. The problem is that you recalculate primesTill x on every iteration. To count the primes up to x, you test every number from 2 to x and then sum up the ones. In the next step, for x+1, you forget everything you know about the numbers between 2 and x and test them all again for primality; only as the last step do you test whether x+1 is prime. Then you repeat this (forget everything and test all numbers again) until you are finished.
Wouldn't it be great if the computer could remember the primes it has already found?
There are many ways to do this. I'll use a simple infinite list; if you are interested in other approaches, you can search for the terms memoization or dynamic programming.
We start with the list comprehension you used in primesTill:
[1 | x <- [2..n], checkPrime x]
This calculates all primes between 2 and n, but immediately forgets the prime number and replaces it with 1, so the first step will be to keep the actual numbers.
[x | x <- [2..n], checkPrime x]
This gives us a list of all prime numbers between 2 and n. If we had a sufficiently large list of prime numbers we could use the index function !! to get the 10001st prime number. So we need to set n to a really really big number, to be sure that the filtered list is long enough?
Lazy evaluation to the rescue!
Lazy evaluation in Haskell allows us to build an infinite list that is only evaluated as far as needed. If we don't supply an upper bound to a list generator, it will build such an infinite list for us.
[x | x <- [2..], checkPrime x]
Now we have an infinite list of all prime numbers.
We can bind it to a name, e.g. primes, and use it to define nthPrime:
primes = [x | x <- [2..], checkPrime x]
nthPrime n = primes !! (n - 1)  -- (!!) is zero-based, so the nth prime sits at index n - 1
Now you can compile it with ghc -O2, run it and the result will be promptly delivered to you.
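Put together, the whole program could look roughly like this. It is a sketch under a few assumptions of mine: checkPrime is rewritten with all instead of the testList helper (same trial division up to n `div` 2), the type signatures are added, and the index is shifted by one because !! counts from zero:

```haskell
-- A number is prime if no candidate divisor in [2 .. n `div` 2] divides it.
checkPrime :: Int -> Bool
checkPrime n = all (\d -> n `rem` d /= 0) [2 .. n `div` 2]

-- Lazy, infinite list of primes; only the demanded prefix is ever computed.
primes :: [Int]
primes = [x | x <- [2 ..], checkPrime x]

-- (!!) is zero-based, so the nth prime is at index n - 1.
nthPrime :: Int -> Int
nthPrime n = primes !! (n - 1)

main :: IO ()
main = print (nthPrime 6) -- 13
```

Swap the 6 for 10001 to get the Project Euler answer; the small value just makes the indexing easy to check by hand.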
I've just started learning a bit of Haskell and functional programming, but I find it very difficult to get the hang of it :)
I am trying to translate a small piece of Ruby code to Haskell (because I like the concepts that functional programming and Haskell propose, and even more because I come from a mathematics field and Haskell seems very mathematical):
class Integer
  def factorial
    f = 1; for i in 1..self; f *= i; end; f
  end
end
boundary = 1000
m = 0
# Brown numbers: pairs of integers (m, n) where n factorial plus 1 equals m squared
while m <= boundary
  n = 0
  while n <= boundary
    puts "(#{m},#{n})" if ((n.factorial + 1) == (m ** 2))
    n += 1
  end
  m += 1
end
I could only figure out how to do factorials:
let factorial n = product [1..n]
I cannot figure out how to do the while loops or an equivalent in Haskell, even though I found some examples that were far too confusing for me.
The idea is that the loops start from 0 (or 1) and continue (with an increment of 1) until they reach a boundary (in my code it is 1000). The reason there is a boundary is that I was thinking of starting parallel tasks that do the same operation on different intervals, so that the results I expect are returned faster (one operation would be done on 1 to 10000, another on 10000 to 100000, etc.).
I would really appreciate it if anyone could help out with this :)
Try this:
let results = [(x,y) | x <- [1..1000], y <- [1..1000], 1 + fac x == y*y]
      where fac n = product [1..n]
This is a list comprehension.
To map it to your Ruby code,
The nested loops over m and n are replaced with generators for y and x: x plays the role of n and y the role of m. Basically there is iteration over the values of x and y in the specified ranges (1 to 1000 inclusive in this case).
The check at the end is your filter condition for getting Brown numbers.
where allows us to create a helper function to calculate the factorial.
Note that instead of a separate function, we could have computed the factorial in place, like so:
(1 + product[1..x]) == y * y
Ultimately, the (x,y) on the left side means that it returns a list of tuples (x,y) which are your Brown numbers.
OK, this should work in your .hs file:
results :: [(Integer, Integer)]  -- use Integer instead of Int to avoid overflow
results = [(x,y) | x <- [1..1000], y <- [1..1000], 1 + fac x == y*y]
  where fac n = product [1..n]
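Since you mentioned wanting to run the search on different intervals, here is a sketch that takes the two ranges as arguments. The name brownPairs and its signature are my own invention. It also binds the factorial with let inside the comprehension, so it is computed once per n rather than once per (n, m) pair:

```haskell
-- Pairs (n, m) with n! + 1 == m * m, searched over the given candidates.
brownPairs :: [Integer] -> [Integer] -> [(Integer, Integer)]
brownPairs ns ms =
  [ (n, m)
  | n <- ns
  , let target = product [1 .. n] + 1 -- n! + 1, computed once per n
  , m <- ms
  , target == m * m
  ]

main :: IO ()
main = print (brownPairs [1 .. 10] [1 .. 100]) -- [(4,5),(5,11),(7,71)]
```

Each interval can then be handed to a separate call, e.g. brownPairs [1 .. 500] ms and brownPairs [501 .. 1000] ms.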
To add to shree.pat18's answer, maybe an exercise you could try is to translate the Haskell solution back into Ruby. It should be possible, because Ruby has ranges, Enumerator::Lazy and Enumerable#flat_map. The following rewritten Haskell solution should perhaps help:
import Data.List (concatMap)
results :: [(Integer, Integer)]
results = concatMap (\x -> concatMap (\y -> test x y) [1..1000]) [1..1000]
  where test x y = if 1 + fac x == y*y then [(x,y)] else []
        fac n = product [1..n]
Note that Haskell concatMap is more or less the same as Ruby Enumerable#flat_map.
I wanted to take a sum of all even numbers <= 1000.
The following code:
sum [x | x <- [1..1000], even x]
(I know it can be done with [2,4..1000]; this is for practice.)
reports that the sum is 250500.
However:
sum [x | x <- [1..], even x && x <= 1000]
never finishes and has to be interrupted!
I thought that I could safely write [1..], which is an infinite list, because Haskell would not try to evaluate it.
Furthermore, I thought that it would simply start going x by x, checking them and adding them.
So why does the above fail to produce a result?
sum [x | x <- [1..], even x && x <= 1000]
translates to something like
sum (filter (\x -> even x && x <= 1000) [1..])
The range expression (or, desugared, enumFrom) continues to generate values and filter keeps discarding them (it doesn't know there'll never be another element that satisfies the predicate), but sum never gets to the end of the list, so it can't return a result. You want to stop evaluating the list as soon as you see the first value greater than 1000:
sum (takeWhile (<= 1000) $ filter even [1..])
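If you would rather keep the list comprehension, the same cut-off can be applied to the generator instead. A small sketch comparing both spellings (the names viaComprehension and viaFilter are mine):

```haskell
-- takeWhile makes the generator finite, so the comprehension terminates.
viaComprehension :: Integer
viaComprehension = sum [x | x <- takeWhile (<= 1000) [1 ..], even x]

-- The filter-then-takeWhile form from the answer above.
viaFilter :: Integer
viaFilter = sum (takeWhile (<= 1000) (filter even [1 ..]))

main :: IO ()
main = do
  print viaComprehension -- 250500
  print viaFilter        -- 250500
```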