Generating primes in Haskell - haskell

I have been learning Haskell over the last few days, through Learn You A Haskell. I've been attempting to complete some Project Euler problems, some of which require primes. However the function I have written to try to generate some (in this case primes below 20000) isn't outputting correctly. When I run it, GHCi returns '[1, ' and seemingly doesn't terminate. The code I am using is:
sieve :: (Integral a) => a -> [a] -> [a]
sieve 20000 list = list
sieve n (x:xs) = sieve (n+1) $ x:(filter (\q -> q `mod` n /= 0) xs)
primesto20000 = sieve 2 [1..20000]
And then I am calling primesto20000. I understand that the function may be inefficient; I am mainly asking for help with the syntactic or logical errors I must have made. Thank you.

You're filtering out multiples of every number, not just prime numbers. You want to check divisibility by x, not by n. (In fact, I'm not sure you need n in the sieve function at all; just make your primesto20000 function generate the appropriate input list, and pass that.)
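A minimal sketch of that suggestion, assuming the counter n is dropped entirely and the input list starts at 2 rather than 1 (the name primesTo is made up for this illustration):

```haskell
-- Trial-division sieve: keep the head x, then drop the multiples of x
-- (not of a separate counter) from the rest of the list.
sieve :: Integral a => [a] -> [a]
sieve []       = []
sieve (x : xs) = x : sieve (filter (\q -> q `mod` x /= 0) xs)

-- Hypothetical wrapper that builds the appropriate input list.
primesTo :: Integral a => a -> [a]
primesTo limit = sieve [2 .. limit]
```

With this definition, primesTo 20000 plays the role of primesto20000. Starting the list at 2 matters: with 1 as the head, every other number would be filtered out as a multiple of 1.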

There are two problems in your code:
Because of its time complexity (quadratic, I guess), it doesn't finish in a reasonable time, so it appears to hang. If you replace 20000 with 200, it'll finish, but the result will be [1].
The other problem is that for each n you want to filter out only the numbers divisible by n that are larger than n. Without this condition you also filter out n itself, so every number except the leading 1 ends up removed.
A corrected version could look like (with a parametrized limit):
limit :: Integral a => a
limit = 20000
sieve :: (Integral a) => a -> [a] -> [a]
sieve n list | n == limit
  = list
sieve n (x:xs)
  = sieve (n+1) $ x : (filter filt xs)
  where
    -- filter everything divisible by `n`, but not `n` itself.
    filt q = (q <= n) || (q `mod` n /= 0)
primesto20000 = sieve 2 [1..limit]

Related

Sieve gets stuck at the beginning

I wrote the following sieve:
isPrime :: Integer -> [Integer] -> Bool
isPrime n = all (\i -> n `mod` i /= 0)
sieve :: [Integer]
sieve = 2 : [i | i <- [3,5..], isPrime i sieve]
but I don't understand why it gets stuck after the first value. Running take 10 sieve results in [2, and nothing happens. It probably has something to do with infinite recursion. May the problem be that sieve is growing and at the same time it's used inside isPrime? For that reason I also tried modifying isPrime as follows, but without success:
isPrime :: Integer -> [Integer] -> Bool
isPrime n = all (\i -> n `mod` i /= 0) . takeWhile (<n)
EDIT: Surprisingly, @Jubobs's modification works:
isPrime :: Integer -> [Integer] -> Bool
isPrime n = all (\i -> n `mod` i /= 0) . takeWhile (\p -> p^2 <= n)
I cannot understand why this version of takeWhile works while the other does not. I see that my previous version tested many unnecessary divisors, but there were nonetheless only finitely many of them.
The code should basically be equivalent to the following Python code:
def is_prime(n, ps):
    for i in ps:
        if n % i == 0: return False
    return True
def sieve():
    yield 2
    n = 3
    ps = [2]
    while True:
        if is_prime(n, ps):
            yield n
            ps.append(n)
        n += 2
You cause an infinite recursion by applying all to the entire sieve and not just to the values you have sieved so far. That is, for the second element, isPrime tests all values in sieve instead of just 2.
In your Python version, you wrote
is_prime(n, ps)
which only tests n against all numbers sieved so far. The Python equivalent of what you did in Haskell is basically
is_prime(n, sieve())
Now, using takeWhile (<n) won't help because that also requires calculating the sieve elements. Imagine what happens for the second element of sieve (which should be 3): it tests all values of the sieve for which < 3 holds, but in order to test that you actually need to evaluate the sieve elements. So you still have an infinite recursion.
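One way to mirror the Python generator more literally is to pass the primes found so far as an explicit argument, so the test never looks at parts of the list that do not exist yet. This is only a sketch; the names primesSoFar and go are invented for the illustration:

```haskell
-- Sketch mirroring the Python generator: ps plays the role of the
-- Python list `ps` and only ever holds primes already produced.
primesSoFar :: [Integer]
primesSoFar = 2 : go [2] [3, 5 ..]
  where
    go ps (n : ns)
      | all (\p -> n `mod` p /= 0) ps = n : go (ps ++ [n]) ns
      | otherwise                     = go ps ns
    go _ [] = []
```

Here take 10 primesSoFar terminates, because each candidate is tested against a finite list. Appending with ++ is not efficient; it is kept only to stay close to ps.append(n).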
I'd say the comprehension list in sieve won't end until it completes, never. To use take like that, sieve would have to return the elements one by one. For instance, [1..] shows 1, 2 ,3 ,4... to infinity but your sieve only shows [2.
Or clearly not what I said

Expressing recursion in Haskell - Prime numbers sequence

I need to express the sequence of prime numbers. (struggling with ex 3 in project Euler).
I have arrived at this recursive definition:
is_not_dividable_by :: (Integral a) => a -> a -> Bool
is_not_dividable_by x y = x `rem` y /= 0
accumulate_and :: (Integral a) => [a] -> (a -> Bool) -> Bool
accumulate_and (x:xs) (f) = (accumulate_and xs (f)) && f(x)
accumulate_and [] f = True
integers = [2,3..]
prime_sequence = [n | n <- integers, is_prime n]
  where is_prime n = accumulate_and
                       (takeWhile (<n) (prime_sequence))
                       ( n `is_not_dividable_by`)
result = take 20 prime_sequence
str_result = show result
main = putStrLn str_result
It compiles fine, but when executed it falls into a loop and just prints <<loop>>.
My problem is that I thought I could freely express recursive definitions in Haskell.
But obviously this definition does not fit the language at all.
However, when I mentally work through prime_sequence, I seem to succeed in growing the sequence, though admittedly with an imperative mindset.
What is plain wrong in my recursive definition that makes this code not work in Haskell?
The culprit is this definition:
prime_sequence = [n | n <- [2,3..], is_prime n] where
  is_prime n = accumulate_and
                 (takeWhile (< n) (prime_sequence))
                 ( n `is_not_dividable_by`)
Trying to find the head element of prime_sequence (the first of the 20 to be printed by your main) leads to takeWhile needing to examine prime_sequence's head element. Which leads to a takeWhile call needing to examine prime_sequence's head element. And so it goes, again and again.
That's the black hole, right away. takeWhile can't even start walking along its input, because nothing's there yet.
This is fixed easily enough by priming the sequence:
prime_sequence = 2 : [n | n <- [3,4..], is_prime n] where
  is_prime n = accumulate_and
                 (takeWhile (< n) (prime_sequence))
                 ( n `is_not_dividable_by`)
Now it gets to work, and hits the second problem, described in Rufflewind's answer: takeWhile can't stop walking along its input. The simplest fix is to stop at n/2. But it is much better to stop at the sqrt:
prime_sequence = 2 : [n | n <- [3,4..], is_prime n] where
  is_prime n = accumulate_and
                 (takeWhile ((<= n).(^ 2)) (prime_sequence))
                 ( n `is_not_dividable_by`)
Now it should work.
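For reference, here is a self-contained sketch of the primed, square-root-bounded definition. It swaps the question's accumulate_and for the Prelude's all and uses the made-up name primeSequence, so treat it as an illustration rather than the exact code above:

```haskell
-- Seeded with 2 so takeWhile always has a head to inspect, and bounded
-- by p*p <= n so it never asks for a prime that is not yet computed.
primeSequence :: [Integer]
primeSequence = 2 : [n | n <- [3 ..], isPrime n]
  where
    isPrime n =
      all (\p -> n `rem` p /= 0)
          (takeWhile (\p -> p * p <= n) primeSequence)
```

Evaluating take 20 primeSequence yields the first twenty primes, ending in 71.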
The reason it's an infinite loop is because of this line:
prime_sequence =
  [n | n <- integers, is_prime n]
  where is_prime n = accumulate_and (takeWhile (< n) prime_sequence)
                                    (n `is_not_dividable_by`)
In order to compute is_prime n, it needs to take all the prime numbers less than n. However, for takeWhile to know when to stop taking, it also needs to check n itself, which hasn't been computed yet.
(In a hand-wavy manner, it means your prime_sequence is too lazy so it ends up biting its own tail and becoming an infinite loop.)
Here's how you can generate an infinite list of prime numbers without running into an infinite loop:
-- | An infinite list of prime numbers in ascending order.
prime_sequence :: [Integer]
prime_sequence = find [] integers
  where find :: [Integer] -> [Integer] -> [Integer]
        find primes [] = []
        find primes (n : remaining)
          | is_prime  = n : find (n : primes) remaining
          | otherwise = find primes remaining
          where is_prime = accumulate_and primes (n `is_not_dividable_by`)
The important function here is find, which takes an existing list of primes and a list of remaining integers and produces the next remaining integer that is prime, then delays the remaining computation until later by capturing it with (:).

How to write this as a single function?

I'm having a go at project Euler Q3 and need to get the largest prime factor of a number. So far I've gotten a pair of functions to return a list of all the factors of the given number but it seems like a really bad way to do it (partly because I only need the largest).
get_factors :: (Integral a) => a -> [a] -> [a]
get_factors _ [] = []
get_factors t (x:xs)
  | t `mod` x == 0 = x:get_factors t xs
  | otherwise = get_factors t xs
factors :: (Integral a) => a -> [a]
factors x = get_factors x [x,x-1..1]
> factors 1000
> [1000,500,250,200,125,100,50,40,25,20,10,8,5,4,2,1]
It seems weird to me that I would need to have a "launch" function if you will to start the recursive function off (or have a function where I have to pass it the same value twice, again, seems silly to me).
Can you point me in the right direction of how I should be going about doing this please?
You should try to recognize that what you're doing here, namely picking elements from a list which satisfy some condition, is a very common pattern. This pattern is implemented by the filter function in the Prelude.
Using filter, you can write your function as:
factors n = filter (\d -> n `mod` d == 0) [n, n-1 .. 1]
or, equivalently, you can use a list comprehension:
factors n = [d | d <- [n, n-1 .. 1], n `mod` d == 0]
Using a "launch" function for calling a recursive function is very common in Haskell, so don't be afraid of that. Most often it'd be written as
f = g someArgument
  where
    g = ...
in your case
factors :: (Integral a) => a -> [a]
factors x = get_factors [x,x-1..1]
  where
    get_factors [] = []
    get_factors (y:ys)
      | x `mod` y == 0 = y : get_factors ys
      | otherwise = get_factors ys
This signals readers of your code that get_factors is used only here and nowhere else, and helps you to keep the code clean. Also get_factors has access to x, which simplifies the design.
Some other ideas:
It's inefficient to try dividing by all numbers. In problems like this it's much better to pre-compute a list of primes and factor using that list. There are many ways to compute such a list, but for educational purposes I'd suggest writing your own (it will come in handy for other Project Euler problems). Then you can take the primes less than or equal to x and try dividing by them.
When searching just for the largest factor, you would naively have to search through all primes between 1 and x. But if x is composite, one of its factors must be <= sqrt(x). You can use this to construct a significantly better algorithm.
I do not think it is a very good idea to go through every number like [n, n-1..] since the problem says 600851475143.
largest_factors :: Integer -> Integer
largest_factors n = helper n 2
  where
    helper m p
      | m < p^2 = m
      | m == p = m
      | m `mod` p == 0 = helper (m `div` p) p
      | otherwise = helper m (p+1)
Once the helper finds that some number p divides the remaining value, it divides p out and continues with the quotient. This works fine on my computer and gave me the solution within a second.
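The same divide-out idea can also return every prime factor, not only the largest. The following is a sketch under that assumption; primeFactors is a name invented here and does not appear in the answers above:

```haskell
-- Trial division with the factor divided out as soon as it is found.
-- Once p*p exceeds the remaining value m, m itself must be prime.
primeFactors :: Integer -> [Integer]
primeFactors n = go n 2
  where
    go m p
      | m < 2          = []
      | p * p > m      = [m]
      | m `mod` p == 0 = p : go (m `div` p) p
      | otherwise      = go m (p + 1)
```

For example, primeFactors 1000 gives [2,2,2,5,5,5], and last (primeFactors n) is the largest prime factor for any n >= 2.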

Haskell - Prime Powers Exercise - Infinite merges

At university my task is the following :
define the following function:
primepowers :: Integer -> [Integer]
that calculates the infinite list of the first n powers of the prime numbers for a given parameter n, sorted asc.
That is,
primepowers n contains in ascending order the elements of
{p^i | p is prime, 1≤i≤n}.
After working on this task I came to a dead end. I have the following four functions:
merge :: Ord t => [t] -> [t] -> [t]
merge [] b = b
merge a [] = a
merge (a:ax) (b:bx)
  | a <= b = a : merge ax (b:bx)
  | otherwise = b : merge (a:ax) bx
primes :: [Integer]
primes = sieve [2..]
  where sieve [] = []
        sieve (p:xs) = p : sieve (filter (not . multipleOf p) xs)
          where multipleOf p x = x `mod` p == 0
powers :: Integer -> Integer -> [Integer]
powers n num = map (\a -> num ^ a) [1..n]
primepowers :: Integer -> [Integer]
primepowers n = foldr merge [] (map (powers n) primes)
I think that they work independently, as I have tested with some sample inputs.
merge merges two ordered lists to one ordered list
primes returns infinite list of prime numbers
powers calculates n powers of num (that is num^1 , num^2 ... num^n)
I try to combine everything in primepowers, but nothing is ever output; there seems to be some kind of infinite loop.
I am not interested in optimizing primes or powers. I just don't understand why it does not work. Or is my approach not good, not functional, not idiomatic Haskell?
I suspect the problem is: primes is an infinite list. Therefore, map (powers n) primes is an infinite list of (finite) lists. When you try to foldr merge [] them all together, merge must evaluate the head of each list...
Since there are an infinite number of lists, this is an infinite loop.
I would suggest transposing the structure, something like:
primepowers n = foldr merge [] [map (^i) primes | i <- [1..n]]
While you can probably not use this for your assignment, this can be solved quite elegantly using the primes and data-ordlist packages from Hackage.
import Data.List.Ordered
import Data.Numbers.Primes
primePowers n = mergeAll [[p^k | k <- [1..n]] | p <- primes]
Note that mergeAll is able to merge an infinite number of lists because it assumes that the heads of the lists are ordered in addition to the lists themselves being ordered. Thus, we can easily make this work for infinite powers as well:
allPrimePowers = mergeAll [[p^k | k <- [1..]] | p <- primes]
The reason why your program runs into an infinite loop is that you are trying to merge infinitely many lists only by using the invariant that each list is sorted in the ascending order. Before the program can output “2,” it has to know that none of the lists contains anything smaller than 2. This is impossible because there are infinitely many lists.
You need the following function:
mergePrio (h : l) r = h : merge l r
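To show how mergePrio might be plugged in, here is a self-contained sketch that folds with it instead of merge. The empty-list case of mergePrio and the inlined power lists are additions made for this illustration; merge and primes follow the question:

```haskell
merge :: Ord t => [t] -> [t] -> [t]
merge [] b = b
merge a [] = a
merge (a : ax) (b : bx)
  | a <= b    = a : merge ax (b : bx)
  | otherwise = b : merge (a : ax) bx

-- Emit the head of the left list before inspecting the right one at all.
-- This is safe here because every later list starts with a larger prime.
mergePrio :: Ord t => [t] -> [t] -> [t]
mergePrio []      r = r
mergePrio (h : l) r = h : merge l r

primes :: [Integer]
primes = sieve [2 ..]
  where
    sieve []       = []
    sieve (p : xs) = p : sieve (filter (\x -> x `mod` p /= 0) xs)

primepowers :: Integer -> [Integer]
primepowers n = foldr mergePrio [] [[p ^ k | k <- [1 .. n]] | p <- primes]
```

Because the first element comes out without forcing the rest of the fold, take 10 (primepowers 3) terminates and gives [2,3,4,5,7,8,9,11,13,17].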

Explain this chunk of haskell code that outputs a stream of primes

I have trouble understanding this chunk of code:
let
sieve (p:xs) = p : sieve (filter (\ x -> x `mod` p /= 0) xs)
in sieve [2 .. ]
Can someone break it down for me? I understand there is recursion in it, but thats the problem I can't understand how the recursion in this example works.
Contrary to what others have stated here, this function does not implement the true sieve of Eratosthenes. It does return an initial sequence of the prime numbers though, and in a similar manner, so it's okay to think of it as the sieve of Eratosthenes.
I was about done explaining the code when mipadi posted his answer; I could delete it, but since I spent some time on it, and because our answers are not completely identical, I'll leave it here.
First of all, note that there are some syntax errors in the code you posted. The correct code is,
let sieve (p:xs) = p : sieve (filter (\x -> x `mod` p /= 0) xs) in sieve [2..]
let x in y defines x and allows its definition to be used in y. The result of this expression is the result of y. So in this case we define a function sieve and return the result of applying sieve to [2..].
Now let us have a closer look at the let part of this expression:
sieve (p:xs) = p : sieve (filter (\x -> x `mod` p /= 0) xs)
This defines a function sieve which takes a list as its first argument.
(p:xs) is a pattern which matches p with the head of said list and xs with the tail (everything but the head).
In general, p : xs is a list whose first element is p and whose remaining elements are in xs. Thus, the first element of the list sieve returns is the first element of the list it receives.
Now look at the remainder of the list:
sieve (filter (\x -> x `mod` p /= 0) xs)
We can see that sieve is being called recursively. Thus, the filter expression will return a list.
filter takes a filter function and a list. It returns only those elements in the list for which the filter function returns true.
In this case xs is the list being filtered and
(\x -> x `mod` p /= 0)
is the filter function.
The filter function takes a single argument, x and returns true iff it is not a multiple of p.
Now that sieve is defined, we pass it [2 .. ], the list of all natural numbers starting at 2. Thus,
The number 2 will be returned. All other natural numbers which are multiples of 2 will be discarded.
The second number is thus 3. It will be returned. All other multiples of 3 will be discarded.
Thus the next number will be 5. Etc.
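The successive filtered lists in this walk-through can be made visible with a small sketch. The name sieveSteps is hypothetical and exists only for this illustration:

```haskell
-- Each step drops the head p and removes the multiples of p from the rest,
-- which is the list that the next recursive call of sieve receives.
sieveSteps :: [[Integer]]
sieveSteps = iterate step [2 ..]
  where
    step (p : xs) = filter (\x -> x `mod` p /= 0) xs
    step []       = []
```

Then take 3 (map (take 5) sieveSteps) is [[2,3,4,5,6],[3,5,7,9,11],[5,7,11,13,17]], and the heads of these lists, map head sieveSteps, are exactly the numbers sieve returns.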
It's actually pretty elegant.
First, we define a function sieve that takes a list of elements:
sieve (p:xs) =
In the body of sieve, we take the head of the list (because we're passing the infinite list [2..], and 2 is defined to be prime) and prepend it (lazily!) to the result of applying sieve to the rest of the list:
p : sieve (filter (\ x -> x `mod` p /= 0) xs)
So let's look at the code that does the work on the rest of the list:
sieve (filter (\ x -> x `mod` p /= 0) xs)
We're applying sieve to the filtered list. Let's break down what the filter part does:
filter (\ x -> x `mod` p /= 0) xs
filter takes a function and a list on which we apply that function, and retains elements that meet the criteria given by the function. In this case, filter takes an anonymous function:
\ x -> x `mod` p /= 0
This anonymous function takes one argument, x. It checks the modulus of x against p (the head of the list, every time sieve is called):
x `mod` p
If the modulus is not equal to 0:
x `mod` p /= 0
Then the element x is kept in the list. If it is equal to 0, it's filtered out. This makes sense: if x is divisible by p, then x is divisible by more than just 1 and itself, and thus it is not prime.
It defines a generator - a stream transformer called "sieve",
Sieve s =
  while( True ):
    p := s.head
    s := s.tail
    yield p                       -- produce this
    s := Filter (nomultsof p) s   -- go next
primes := Sieve (Nums 2)
which uses a curried form of an anonymous function equivalent to
nomultsof p x = (mod x p) /= 0
Both Sieve and Filter are data-constructing operations with internal state and by-value argument passing semantics.
Here we can see that the most glaring problem of this code is not, repeat not that it uses trial division to filter out the multiples from the working sequence, whereas it could find them out directly, by counting up in increments of p. If we were to replace the former with the latter, the resulting code would still have abysmal run-time complexity.
No, its most glaring problem is that it puts a Filter on top of its working sequence too soon, when it should really do that only after the prime's square is seen in the input. As a result it creates a quadratic number of Filters compared to what's really needed. The chain of Filters it creates is too long, and most of them aren't even needed at all.
The corrected version, with the filter creation postponed until the proper moment, is
Sieve ps s =
  while( True ):
    x := s.head
    s := s.tail
    yield x                       -- produce this
    p := ps.head
    q := p*p
    while( (s.head) < q ):
      yield (s.head)              -- and these
      s := s.tail
    ps := ps.tail                 -- go next
    s := Filter (nomultsof p) s
primes := Sieve primes (Nums 2)
or in Haskell,
primes = sieve primes [2..]
sieve ps (x:xs) = x : h ++ sieve pt [x | x <- t, rem x p /= 0]
  where (p:pt) = ps
        (h,t) = span (< p*p) xs
rem is used here instead of mod as it can be much faster in some interpreters, and the numbers are all positive here anyway.
Measuring the local orders of growth of an algorithm by taking its run times t1,t2 at problem-size points n1,n2, as logBase (n2/n1) (t2/t1), we get O(n^2) for the first one, and just above O(n^1.4) for the second (in n primes produced).
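As a small illustration of that formula, the helper below (localOrder, a name invented here) computes the exponent from two measurements. Any timings fed to it in an example are hypothetical, not results from the code above:

```haskell
-- Empirical local order of growth between two measurements:
-- sizes n1, n2 and run times t1, t2 give logBase (n2/n1) (t2/t1).
localOrder :: Double -> Double -> Double -> Double -> Double
localOrder n1 t1 n2 t2 = logBase (n2 / n1) (t2 / t1)
```

For instance, if doubling the problem size quadrupled the run time, localOrder 1000 1 2000 4 would come out at about 2, i.e. quadratic behaviour over that range.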
Just to clarify it, the missing parts could be defined in this (imaginary) language simply as
Nums x =                  -- numbers from x
  while( True ):
    yield x
    x := x+1
Filter pred s =           -- filter a stream by a predicate
  while( True ):
    if pred (s.head) then yield (s.head)
    s := s.tail
update: Curiously, the first instance of this code in David Turner's 1976 SASL manual according to A.J.T. Davie's 1992 Haskell book,
primes = sieve [2..]
-- [Int] -> [Int]
sieve (p:nos) = p : sieve (remove (multsof p) nos)
actually admits two pairs of implementations for remove and multsof going together -- one pair for the trial division sieve as in this question, and the other for the ordered removal of each prime's multiples directly generated by counting, aka the genuine sieve of Eratosthenes (both would be non-postponed, of course). In Haskell,
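-- pair 1, trial division:
-- multsof :: Int -> (Int -> Bool)
multsof p n = (rem n p)==0
-- remove :: (Int -> Bool) -> ([Int] -> [Int])
remove m xs = filter (not.m) xs
-- pair 2, genuine sieve of Eratosthenes:
-- multsof :: Int -> [Int]
multsof p = [p*p, p*p+p..]
-- remove :: [Int] -> ([Int] -> [Int])
remove m xs = minus xs m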
(If only he would've postponed picking the actual implementation here...)
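To make the two pairs concrete, here is a runnable sketch with both variants side by side. The minus below is a local stand-in for an ordered-list difference written for this example, and the names primesTD and primesSoE are invented:

```haskell
-- Ordered-list difference; assumes both lists are ascending.
minus :: Ord a => [a] -> [a] -> [a]
minus (x : xs) (y : ys) = case compare x y of
  LT -> x : minus xs (y : ys)
  EQ -> minus xs ys
  GT -> minus (x : xs) ys
minus xs _ = xs

-- Pair 1: trial division, multsof is a predicate.
primesTD :: [Int]
primesTD = sieve [2 ..]
  where
    sieve (p : nos) = p : sieve (remove (multsof p) nos)
    sieve []        = []
    multsof p n     = rem n p == 0
    remove m xs     = filter (not . m) xs

-- Pair 2: multsof generates the multiples by counting, remove subtracts them.
primesSoE :: [Int]
primesSoE = sieve [2 ..]
  where
    sieve (p : nos) = p : sieve (remove (multsof p) nos)
    sieve []        = []
    multsof p       = [p * p, p * p + p ..]
    remove m xs     = minus xs m
```

The sieve clause is identical in both; only the multsof and remove pair changes. Both are still the non-postponed form, so neither is meant to be fast.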
As for the postponed code, in a pseudocode with "list patterns" it could've been
primes = [2, ...sieve primes [3..]]
sieve [p, ...ps] [...h, p*p, ...nos] =
    [...h, ...sieve ps (remove (multsof p) nos)]
which in modern Haskell can be written with ViewPatterns as
{-# LANGUAGE ViewPatterns #-}
primes = 2 : sieve primes [3..]
sieve (p:ps) (span (< p*p) -> (h, _p2 : nos))
    = h ++ sieve ps (remove (multsof p) nos)
It's implementing the Sieve of Eratosthenes
Basically, start with a prime (2), and filter out from the rest of the integers, all multiples of two. The next number in that filtered list must also be a prime, and therefore filter all of its multiples from the remaining, and so on.
It says "the sieve of some list is the first element of the list (which we'll call p) and the sieve of the rest of the list, filtered such that only elements not divisible by p are allowed through". It then gets things started by returning the sieve of all integers from 2 to infinity (which is 2 followed by the sieve of all integers not divisible by 2, etc.).
I recommend The Little Schemer before you attack Haskell.

Resources