Why does LiquidHaskell fail to take guard into account? - liquid-haskell

I am following the Liquid Haskell tutorial, and this example fails:
module Test2 where

import Data.Vector
import Prelude hiding (length)

vectorSum :: Vector Int -> Int
vectorSum vec = go 0 0
  where
    go acc i
      | i < length vec = go (acc + (vec ! i)) (i + 1)
      | otherwise = acc
with the following error:
Error: Liquid Type Mismatch
10 | | i < length vec = go (acc + (vec ! i)) (i + 1)
Inferred type
VV : {v : GHC.Types.Int | v == acc + ?b}
not a subtype of Required type
VV : {VV : GHC.Types.Int | VV < acc
&& VV >= 0}
In Context
?b : GHC.Types.Int
acc : GHC.Types.Int
The question is why? The guard (i < length vec) should ensure that (vec ! i) is safe.

This looks like a termination error. Liquid Haskell here seems to assume that acc is the termination metric of go (probably because it's the first Int argument). As such it should always be non-negative and decreasing in each iteration (hence the error message you get).
The way to fix this is providing a correct termination metric, which fulfills the above criteria. Here this would be length vec - i and the corresponding signature for go is:
{-@ go :: _ -> i:_ -> _ / [length vec - i] @-}
or something along those lines.
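As a rough sketch of where such an annotation could sit, assuming LiquidHaskell's `{-@ ... @-}` specification comments and that your version accepts a specification on a local binding. The metric expression is the one suggested above and has not been checked against LH here. To GHC the annotation is an ordinary comment, so the module compiles and runs without LH (the `main` is only there for illustration):

```haskell
module Main where

import Data.Vector (Vector, fromList, length, (!))
import Prelude hiding (length)

vectorSum :: Vector Int -> Int
vectorSum vec = go 0 0
  where
    -- Assumed LH termination metric: the distance to the end of the
    -- vector is non-negative and shrinks on every recursive call.
    {-@ go :: _ -> i:_ -> _ / [length vec - i] @-}
    go :: Int -> Int -> Int
    go acc i
      | i < length vec = go (acc + (vec ! i)) (i + 1)
      | otherwise = acc

main :: IO ()
main = print (vectorSum (fromList [1, 2, 3, 4]))
```

The point of the metric is that it names the quantity that actually decreases (`length vec - i`), rather than letting LH guess `acc`.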

First of all, I don't know which version of LH you were using but I just had the exact same problem. The tutorial states that
LiquidHaskell verifies vectorSum – or, to be precise, the safety of
the vector accesses vec ! i. The verification works out because
LiquidHaskell is able to automatically infer
go :: Int -> {v:Int | 0 <= v && v <= sz} -> Int
which states that the second parameter i is between 0 and the length
of vec (inclusive). LiquidHaskell uses this and the test that i < sz
to establish that i is between 0 and (vlen vec) to prove safety.
They also state that the tutorial is not in accordance with the current version of LiquidHaskell.
It seems that the (refinement-)type inference of LH has changed a bit since the tutorial was written, probably generalizing types more than before, which results in this problem.
The problem is NOT that LH doesn't figure out the guard properly. The problem is that it fails to verify the property 0 <= v.
The following checks fine with version of LH:
{-@ LIQUID "--short-names" @-}
{-@ LIQUID "--no-termination" @-}
{-@ LIQUID "--reflection" @-}

import Prelude hiding (length)
import Data.Vector

{-@ go' :: v:Vector Int -> Int -> {i:Int | i >= 0} -> Int @-}
go' :: Vector Int -> Int -> Int -> Int
go' vec acc i
  | i < sz = go' vec (acc + (vec ! i)) (i + 1)
  | otherwise = acc
  where
    sz = length vec

vecSum :: Vector Int -> Int
vecSum vec = go' vec 0 0
It seems that LH infers that go is a function in its own right and might be called with an integer smaller than 0 (which it obviously isn't).
I am still playing with that example to convince LH of this fact. If anybody had more success on this please leave a comment.
I found the following paragraph in the same tutorial; it seems that this might have changed:
At the call loop 0 n 0 body the parameters lo and hi are instantiated
with 0 and n respectively, which, by the way is where the inference
engine deduces non-negativity.


How to solve Alphametics puzzle using the State Monad and mutable Vector?

I'm working on the Alphametics puzzle
A set of words is written down in the form of an ordinary "long-hand" addition sum, and it is required that the letters of the alphabet be replaced with decimal digits so that the result is a valid arithmetic sum. Example:
This equation has a unique solution:
A non brute force solution is to use backtracking with memoization. My choice is to use the State Monad along with mutable Vectors.
The algorithm goes as follows:
If we are beyond the leftmost digit of the sum:
    Return true if no carry, false otherwise.
    Also check that there is no leading zero in the sum.
Else if addend and current column index is beyond the current row:
    Recur on row beneath this one.
If we are currently trying to assign a char in one of the addends:
    If char already assigned, recur on row beneath this one.
    If not assigned, then:
        For every possible choice among the digits not in use:
            Make that choice and recur on row beneath this one.
            If successful, return true.
            Else, unmake assignment and try another digit.
        Return false if no assignment worked to trigger backtracking.
Else if trying to assign a char in the sum:
    If char already assigned:
        If matches the sum digit, recur on next column to the left with carry.
        Else, return false to trigger backtracking.
    If char unassigned:
        If correct digit already used, return false.
        Assign it and recur on next column to the left with carry:
            If successful return true.
            Else, unmake assignment, and return false to trigger backtracking.
I'm having trouble with writing the part where a number is assigned to an addend.
Rust code for reference that needs to be translated to Haskell.
let used: HashSet<&u8> = HashSet::from_iter(solution.values());
let unused: Vec<u8> = (0..=9).filter(|x| !used.contains(x)).collect();
for i in unused {
if i == 0 && non_zero_letters.contains(&letter) {
solution.insert(letter, i);
if can_solve(
row + 1,
carry + (i as u32),
) {
return true;
My code, that I've yet to compile, and without the above case implemented, is shown below:
equation contains the addend rows.
result is the sum row.
solution is the assignments.
nonZeroLetters is an optimization that checks there are no leading zeros in any of the rows.
solve :: String -> Maybe [(Char, Int)]
solve puzzle = error "You need to implement this function."

type Solution = Vector Int

type Row = Vector Char

data PuzzleState = PuzzleState
  { equation :: Vector Row,
    result :: Row,
    nonZeroLetters :: Set Char,
    solution :: MVector Row
  }

canSolve :: Int -> Int -> Int -> State PuzzleState Bool
canSolve row col carry = do
  PuzzleState {equation, result, nonZeroLetters, solution} <- get
  let addend = row < length equation
  let word = if addend then (equation ! row) else result
  let n = length word
  let letter = word ! col
  let ord x = C.ord x - C.ord 'A'
  let readC = UM.read (solution . ord)
  i <- readC letter
  let assigned = i >= 0
  let isNonZero = flip S.member nonZeroLetters
  case () of
    _
      | col >= n && addend -> canSolve (row + 1) col carry
      | col == n && (not . addend) -> carry == 0
      | addend && assigned -> canSolve (row + 1) col (carry + i)

ord :: Char -> Int
ord x = C.ord x - C.ord 'A'

readC ::
  (PrimMonad m, UM.Unbox a) =>
  MV.MVector (PrimState m) a ->
  Char ->
  m a
readC solution c = UM.read solution $ ord c

writeC ::
  (PrimMonad m, UM.Unbox a) =>
  UM.MVector (PrimState m) a ->
  Char ->
  a ->
  m ()
writeC solution c x = UM.write solution $ ord c $ x
Here's the (invalid and incomplete) draft that I need help with. This is the part for which I showed Rust code above.
| addend -> let used <- M.mapM (0 <= UM.read solution) [0..length solution - 1]
unused = filter (\x -> x == 0 && isNonZero x) [0..9] \\ used
in do
i <- unused
writeC letter
Edit Jan 7, 2023:
Here's the cleaned up code that produces the compilation error shown at the end.
{-# LANGUAGE NamedFieldPuns #-}

module Alphametics (solve) where

import Control.Monad as M
import Control.Monad.Reader (ReaderT)
import qualified Control.Monad.Reader as R
import Control.Monad.ST (ST)
import qualified Control.Monad.ST as ST
import qualified Data.Char as C
import Data.List ((\\))
import Data.Set (Set)
import qualified Data.Set as S
import qualified Data.Vector as V
import qualified Data.Vector.Unboxed as U
import Data.Vector.Unboxed.Mutable (MVector)
import qualified Data.Vector.Unboxed.Mutable as UM

solve :: String -> Maybe [(Char, Int)]
solve puzzle = error "You need to implement this function."

data PuzzleState s = PuzzleState
  { equation :: V.Vector (U.Vector Char),
    result :: U.Vector Char,
    nonZeroLetters :: Set Char,
    solution :: MVector s Int
  }

type M s = ReaderT (PuzzleState s) (ST s)

canSolve :: Int -> Int -> Int -> M s Bool
canSolve row col carry = do
  PuzzleState {equation, result, nonZeroLetters, solution} <- R.ask
  let addend = row < length equation
  let word = if addend then ((V.!) equation row) else result
  let n = length word
  let letter = (U.!) word col
  let x = ord letter
  y <- R.lift $ UM.read solution x
  let assigned = y >= 0
  let isNonZero = flip S.member nonZeroLetters
  let sumDigit = carry `mod` 10
  let used = filter (\i -> 0 <= UM.read solution i) [0 .. length solution - 1]
  case () of
    _
      | col >= n && addend -> canSolve (row + 1) col carry
      | col == n && (not addend) -> return $ carry == 0
      | addend && assigned -> canSolve (row + 1) col (carry + y)
      | addend ->
          let unused = filter (\i -> i == 0 && isNonZero letter) [0 .. 9] \\ used
           in assignAny unused y solution
      | assigned && sumDigit == y -> canSolve 0 (col + 1) (carry `mod` 10)
      | sumDigit `elem` used -> return $ False
      | sumDigit == 0 && isNonZero letter -> return $ False
      | otherwise -> assign 0 (col + 1) (carry `mod` 10) y sumDigit solution
  where
    ord x = C.ord x - C.ord 'A'
    assignAny [] _ _ = return (False)
    assignAny (i : xs) y solution = do
      success <- assign (row + 1) col (carry + i) y i solution
      if success then return (success) else assignAny xs y solution
    assign r c cr y i solution = do
      UM.write solution y i
      success <- canSolve r c cr
      M.when (not success) (UM.write solution y (-1))
      return (success)
• Couldn't match type ‘s’
with ‘primitive- m0’
Expected: MVector
(primitive- (ST s)) Int
Actual: MVector
(primitive- m0) Int
‘s’ is a rigid type variable bound by
the type signature for:
canSolve :: forall s. Int -> Int -> Int -> M s Bool
at src/Alphametics.hs:31:1-41
There's a larger conceptual problem in your code that's been pointed out in the comments.
The State monad simulates a mutable state by passing invisible state values into and out of a sequence of monadic actions. The state values themselves are immutable, but since they are passed into and out of each action, an action can simulate mutation by returning a state value that's different than the one it was given.
On the other hand, the Data.Vector package provides two types of mutable vectors, IOVector and STVector. These are vectors with truly mutable elements, not simulated. They can be mutated within the IO and ST monads respectively, because these monads provide the capability for true mutation. Neither type can be mutated in a State monad, because the State mutation is only simulated. It would be too inefficient to implement a third type of mutable vector using State's simulated mutation.
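To make the "simulated mutation" concrete, here is a minimal sketch using `State` over an immutable `Data.Map`. The names `Assignments`, `assign` and `unassign` are hypothetical and not taken from the question; each action just returns a new map, and nothing is overwritten in place:

```haskell
module Main (main) where

import Control.Monad.Trans.State.Strict (State, execState, modify')
import qualified Data.Map.Strict as Map

-- Hypothetical representation: letter -> digit, held in an immutable map.
type Assignments = Map.Map Char Int

-- "Mutation" here means replacing the state with an updated copy.
assign :: Char -> Int -> State Assignments ()
assign c d = modify' (Map.insert c d)

-- Backtracking is just another state replacement.
unassign :: Char -> State Assignments ()
unassign c = modify' (Map.delete c)

main :: IO ()
main =
  print . Map.toList $
    execState (assign 'M' 6 >> assign 'S' 9 >> unassign 'M') Map.empty
```

This prints `[('S',9)]`: the assignment to M was made and then undone purely by threading different map values through the actions.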
In your program, you are using a State that includes a field solution with a (true) mutable vector for its value. This points to a probable design problem. Either you want to simulate mutation with a State or you want to use a true mutable vector that's made available to your program in some convenient manner, maybe via a ReaderT, and manipulated within an IO or ST base monad.
I actually can't quite understand your PuzzleState representation from the code you've provided. For example, solution has type MVector Row, which isn't a valid type, but maybe it's supposed to have the unused type Solution instead, except that Solution isn't mutable. And I'm not sure what nonZeroLetters represents.
However, if your intention is to access fixed character vectors of the addends and result that don't change over the course of solving the puzzle, and mutate an array mapping letters to their digit assignments, then you probably want a "context" that's something like:
import qualified Data.Vector as V
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as UM
data PuzzleContext = PuzzleContext
  { equation :: V.Vector (U.Vector Char) -- fixed matrix of addends, as Chars
  , result :: U.Vector Char -- fixed result as Chars
  , solution :: UM.IOVector Int -- mutable array of digit assignments
  }
made available using a monad M like:
import Control.Monad.Reader
type M = ReaderT PuzzleContext IO
This allows you to access the puzzle details:
canSolve :: ... -> M Bool
canSolve ... = do
  PuzzleContext{equation, result, solution} <- ask
  let c = equation V.! 0 U.! 2 -- access third letter of first addend
and mutate the solution:
let ord x = C.ord x - C.ord 'A'
UM.write solution (ord 'M') 6 -- try M=6
Note that solution is an immutable reference that comes out of the reader, but what it references is a mutable vector, which is why you can UM.write to its elements within the base IO monad.
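Putting those fragments together, a minimal self-contained sketch might look like the following. It keeps only the mutable field of the context, and the helpers `letterIndex`, `assignLetter` and `lookupLetter` are hypothetical names introduced for illustration:

```haskell
module Main (main) where

import Control.Monad.Reader (ReaderT, ask, runReaderT)
import Control.Monad.Trans.Class (lift)
import qualified Data.Char as C
import qualified Data.Vector.Unboxed.Mutable as UM

-- Cut-down context: just the mutable letter -> digit table.
newtype PuzzleContext = PuzzleContext {solution :: UM.IOVector Int}

type M = ReaderT PuzzleContext IO

-- Map 'A'..'Z' to 0..25.
letterIndex :: Char -> Int
letterIndex c = C.ord c - C.ord 'A'

-- The reference from the reader is immutable; the vector behind it is not.
assignLetter :: Char -> Int -> M ()
assignLetter c d = do
  PuzzleContext v <- ask
  lift (UM.write v (letterIndex c) d)

lookupLetter :: Char -> M Int
lookupLetter c = do
  PuzzleContext v <- ask
  lift (UM.read v (letterIndex c))

main :: IO ()
main = do
  v <- UM.replicate 26 (-1) -- -1 marks "unassigned"
  d <- runReaderT (assignLetter 'M' 6 >> lookupLetter 'M') (PuzzleContext v)
  print d
```

Running it prints `6`. Swapping `IO` for `ST s` (and `IOVector` for `MVector s`) gives the same shape without leaving pure code.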
As for helping you with your specific draft code, as I say I can't really understand your representation from the code you've written so far, so it's not clear to me how to help with that.
OP here, figured it out myself. This code, and an alternative implementation using State monad, are available here. I’ve done some benchmarking, and surprisingly, the immutable version using State appears to be faster than the mutable code below.
{-# LANGUAGE NamedFieldPuns #-}
{-# LANGUAGE RecordWildCards #-}

module Alphametics (solve) where

import Control.Monad as M
import Control.Monad.Reader (ReaderT)
import qualified Control.Monad.Reader as R
import Control.Monad.ST (ST)
import qualified Data.Char as C
import Data.List ((\\))
import qualified Data.List as L
import Data.Set (Set)
import qualified Data.Set as S
import qualified Data.Vector as V
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed as VU
import Data.Vector.Unboxed.Mutable (MVector)
import qualified Data.Vector.Unboxed.Mutable as UM

solve :: String -> Maybe [(Char, Int)]
solve puzzle
  -- validate equation, "ABC + DEF == GH" is invalid,
  -- sum isn't wide enough
  | any (\x -> length x > (length . head) res) eqn = Nothing
  | otherwise = findSoln $ VU.create $ do
      let nonZeroLetters = S.fromList nz
      -- process in reverse
      let equation = (V.fromList . map (U.fromList . reverse)) eqn
      let result = (U.fromList . reverse . head) res
      solution <- UM.replicate 26 (-1)
      _ <- R.runReaderT (canSolve 0 0 0) PuzzleState {..}
      return solution
  where
    xs = filter (all C.isAsciiUpper) $ words puzzle
    (eqn, res) = L.splitAt (length xs - 1) xs
    -- leading letters can't be zero
    nz = [head x | x <- xs, length x > 1]
    chr x = C.chr (C.ord 'A' + x)
    findSoln v = case [ (chr x, y)
                        | x <- [0 .. 25],
                          let y = v VU.! x,
                          y >= 0
                      ] of
      [] -> Nothing
      x -> Just x

data PuzzleState s = PuzzleState
  { equation :: V.Vector (U.Vector Char),
    result :: U.Vector Char,
    nonZeroLetters :: Set Char,
    solution :: MVector s Int
  }

type M s = ReaderT (PuzzleState s) (ST s)

canSolve :: Int -> Int -> Int -> M s Bool
canSolve row col carry = do
  PuzzleState {equation, result, nonZeroLetters, solution} <- R.ask
  let addend = row < V.length equation
  let word = if addend then equation V.! row else result
  let n = U.length word
  case () of
    _
      | col >= n && addend -> canSolve (row + 1) col carry
      | col == n && not addend -> return $ carry == 0
      | otherwise -> do
          let letter = word U.! col
          let x = ord letter
          i <- readM solution x
          let assigned = i >= 0
          let canBeZero = flip S.notMember nonZeroLetters
          let sumDigit = carry `mod` 10
          used <- M.mapM (readM solution) [0 .. 25]
          let unused =
                filter
                  (\y -> y > 0 || canBeZero letter)
                  [0 .. 9]
                  \\ used
          case () of
            _
              | addend && assigned -> canSolve (row + 1) col (carry + i)
              | addend -> assignAny solution x unused
              | assigned ->
                  if sumDigit == i
                    then canSolve 0 (col + 1) (carry `div` 10)
                    else return False
              | sumDigit `elem` used -> return False
              | sumDigit == 0 && (not . canBeZero) letter -> return False
              | otherwise ->
                  assign
                    0
                    (col + 1)
                    (carry `div` 10)
                    solution
                    x
                    sumDigit
  where
    -- lift is needed because we're working in a ReaderT monad,
    -- whereas UM.read and UM.write work in the ST monad
    readM solution = R.lift . UM.read solution
    ord c = C.ord c - C.ord 'A'
    assignAny _ _ [] = return False
    assignAny solution ix (i : xs) = do
      success <- assign (row + 1) col (carry + i) solution ix i
      if success then return success else assignAny solution ix xs
    assign r c cr solution ix i = do
      UM.write solution ix i
      success <- canSolve r c cr
      M.unless success (UM.write solution ix (-1))
      return success

Define Function with Constraint on List's Elements?

How can I define a function with the following signature,
f :: [Int???] -> [Int]
f xs = _ -- what I do with xs doesn't matter for the question
where a is a List of Int's
such that the first argument's inputs, i.e. list elements, must be >= 0, but <= 5 at compile-time?
In other words,
f [6] would fail to compile.
How about:
f :: [Int] -> [Int]
f = filter (\x -> x >= 0 && x <= 5)
Or do you want to enforce the bounds on the type (dependent types)?
If you want to restrict the range of the Int that is allowed, you are probably better off using a smart constructor. Have a look here. The idea is that you create your own datatype and your own custom constructor:
newtype Range0_5 = Range0_5 { unRange :: Int }

makeRange0_5 :: Int -> Maybe Range0_5
makeRange0_5 x
  | x >= 0 && x <= 5 = Just $ Range0_5 x
  | otherwise = Nothing
If you use a smart constructor, it is important not to expose the underlying data constructor to the user of the module. This can be done by simply not exporting the Range0_5 constructor.
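A sketch of how the pieces could fit together. It is written as a single `Main` module so it can be run directly; in a library the type would live in its own module, and the export list shown in the comment is an assumption about how you might lay that out. `unRange` is written as a plain function here rather than a record field, because an exported field can also be used in record-update syntax, which would bypass the check:

```haskell
module Main (main) where

import Data.Maybe (mapMaybe)

-- In a library, this would be a separate module with an export list like
--   module Range0_5 (Range0_5, makeRange0_5, unRange) where
-- i.e. the type is exported but its data constructor is not.
newtype Range0_5 = Range0_5 Int
  deriving (Show, Eq)

-- The only way for client code to obtain a Range0_5.
makeRange0_5 :: Int -> Maybe Range0_5
makeRange0_5 x
  | x >= 0 && x <= 5 = Just (Range0_5 x)
  | otherwise = Nothing

unRange :: Range0_5 -> Int
unRange (Range0_5 x) = x

-- f can now rely on every element being within 0..5.
f :: [Range0_5] -> [Int]
f = map unRange

main :: IO ()
main = print (f (mapMaybe makeRange0_5 [0, 3, 6, -1, 5]))
```

This prints `[0,3,5]`: the out-of-range values are rejected when the values are constructed, at run time, before `f` ever sees them.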
However this is not a compile time check. Other languages than Haskell might be more appropriate if you really need such a feature.
Since the range is fairly small, you could also make a sum type to represent it:
data Range0_5 = Int0 | Int1 | Int2 | Int3 | Int4 | Int5
If the signature is
f :: [Int] -> [Int]
(which was the original form of the question), then it is impossible to enforce your constraint at compile time. This follows from the standard diagonalization argument of the Halting problem.
Suppose the compiler could detect that
f[g x]
should not compile. By incorporating the source code of the compiler into g, it could choose the opposite of the compiler's decision.
Following your comment on Liquid Haskell (which seems like a very interesting project), note the following:
{-@ type Even = {v:Int | v mod 2 = 0} @-}
{-@ foo :: n:Even -> {v:Bool | (v <=> (n mod 2 == 0))} @-}
foo :: Int -> Bool
foo n = if n^2 - 1 == (n + 1) * (n - 1) then True else foo (n - 1)
LiquidHaskell claims this function is unsafe, because, potentially, foo n calls foo (n - 1). Note, however, that this will never happen: it will only be called if n^2 - 1 ≠ (n + 1)(n - 1), which can never happen.
Again, this is not a criticism of the quality of LiquidHaskell, but rather just pointing out that it, too, cannot solve Halting Problem like issues.

Summing a large list of numbers is too slow

Task: "Sum the first 15,000,000 even numbers."
nats = [1..] :: [Int]
evens = filter even nats :: [Int]
mySum :: Int
mySum = sum $ take 15000000 evens
...but mySum takes ages. More precisely, it is about 10-20 times slower than C/C++.
Many times I've found that a Haskell solution coded naturally is something like 10 times slower than C. I expected GHC to be a very neatly optimizing compiler, and a task such as this doesn't seem that tough.
So, one would expect something like 1.5-2x slower than C. Where is the problem?
Can this be solved better?
This is the C code I'm comparing it with:
long long sum = 0;
int n = 0, i = 1;
for (;;) {
    if (i % 2 == 0) {
        sum += i;
        n++;
    }
    if (n == 15000000)
        break;
    i++;
}
Edit 1: I really know, that it can be computed in O(1). Please, resist.
Edit 2: I really know, that evens are [2,4..] but the function even could be something else O(1) and need to be implemented as a function.
Lists are not loops
So don't be surprised if using lists as a loop replacement, you get slower code if the loop body is small.
nats = [1..] :: [Int]
evens = filter even nats :: [Int]
dumbSum :: Int
dumbSum = sum $ take 15000000 evens
sum is not a "good consumer", so GHC is not (yet) able to eliminate the intermediate lists completely.
If you compile with optimisations (and don't export nats), GHC is smart enough to fuse the filter with the enumeration,
Rec {
Main.main_go [Occ=LoopBreaker]
:: GHC.Prim.Int# -> GHC.Prim.Int# -> [GHC.Types.Int]
[GblId, Arity=1, Caf=NoCafRefs, Str=DmdType L]
Main.main_go =
\ (x_aV2 :: GHC.Prim.Int#) ->
let {
r_au7 :: GHC.Prim.Int# -> [GHC.Types.Int]
[LclId, Str=DmdType]
r_au7 =
case x_aV2 of wild_Xl {
__DEFAULT -> Main.main_go (GHC.Prim.+# wild_Xl 1);
9223372036854775807 -> n_r1RR
} } in
case GHC.Prim.remInt# x_aV2 2 of _ {
__DEFAULT -> r_au7;
0 ->
let {
wild_atm :: GHC.Types.Int
[LclId, Str=DmdType m]
wild_atm = GHC.Types.I# x_aV2 } in
let {
lvl_s1Rp :: [GHC.Types.Int]
lvl_s1Rp =
GHC.Types.: @ GHC.Types.Int wild_atm (GHC.Types.[] @ GHC.Types.Int) } in
\ (m_aUL :: GHC.Prim.Int#) ->
case GHC.Prim.<=# m_aUL 1 of _ {
GHC.Types.False ->
GHC.Types.: @ GHC.Types.Int wild_atm (r_au7 (GHC.Prim.-# m_aUL 1));
GHC.Types.True -> lvl_s1Rp
end Rec }
but that's as far as GHC's fusion takes it. You are left with boxing Ints and constructing list cells. If you give it a loop, like you give it to the C compiler,
module Main where

import Data.Bits

main :: IO ()
main = print dumbSum

dumbSum :: Int
dumbSum = go 0 0 1

go :: Int -> Int -> Int -> Int
go sm ct n
    | ct >= 15000000 = sm
    | n .&. 1 == 0   = go (sm + n) (ct+1) (n+1)
    | otherwise      = go sm ct (n+1)
you get the approximate relation of running times between the C and the Haskell version you expected.
This sort of algorithm is not what GHC has been taught to optimise well, there are bigger fish to fry elsewhere before the limited manpower is put into these optimisations.
The reason why list fusion can't work here is actually rather subtle. Say we define the right RULE to fuse the list away:
import GHC.Base

sum2 :: Num a => [a] -> a
sum2 = sum
{-# NOINLINE [1] sum2 #-}
{-# RULES "sum" forall (f :: forall b. (a->b->b)->b->b).
      sum2 (build f) = f (+) 0 #-}
(The short explanation is that we define sum2 as an alias of sum, which we forbid GHC to inline early, so the RULE has a chance to fire before sum2 gets eliminated. Then we look for sum2 directly next to the list-builder build (see definition) and replace it by direct arithmetic.)
This has mixed success, as it yields the following Core:
Main.$wgo =
\ (w_s1T4 :: GHC.Prim.Int#) ->
case GHC.Prim.remInt# w_s1T4 2 of _ {
case w_s1T4 of wild_Xg {
__DEFAULT -> Main.$wgo (GHC.Prim.+# wild_Xg 1);
15000000 -> 0
0 ->
case w_s1T4 of wild_Xg {
case Main.$wgo (GHC.Prim.+# wild_Xg 1) of ww_s1T7 { __DEFAULT ->
GHC.Prim.+# wild_Xg ww_s1T7
15000000 -> 15000000
Which is nice, completely fused code - with the sole problem that we have a call to $wgo in a non-tail-call position. This means that we aren't looking at a loop, but actually at a deeply recursive function, with predictable program results:
Stack space overflow: current size 8388608 bytes.
The root problem here is that the Prelude's list fusion can only fuse right folds, and computing the sum as a right fold directly causes the excessive stack consumption.
The obvious fix would be to use a fusion framework that can actually deal with left folds, such as Duncan's stream-fusion package, which actually implements sum fusion.
Another solution would be to hack around it - and implement the left fold using a right fold:
main = print $ foldr (\x c -> c . (+x)) id [2,4..15000000] 0
This actually produces close-to-perfect code with current versions of GHC. On the other hand, this is generally not a good idea as it relies on GHC being smart enough to eliminate the partially applied functions. Already adding a filter into the chain will break that particular optimization.
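To see why that one-liner is a left fold in disguise, here is the same trick written out with named pieces. `sumViaFoldr` and `step` are hypothetical names; the bang pattern is an addition to keep the accumulator evaluated and is not part of the one-liner above. The `foldr` builds a chain of functions, and applying that chain to `0` threads the accumulator from left to right:

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

-- foldr produces a function Int -> Int; feeding it 0 runs the "loop".
sumViaFoldr :: [Int] -> Int
sumViaFoldr xs = foldr step id xs 0
  where
    -- x is the current element, k is "the rest of the loop",
    -- acc is the running total passed along to k.
    step x k !acc = k (acc + x)

main :: IO ()
main = print (sumViaFoldr [2, 4 .. 20], foldl (+) 0 [2, 4 .. 20])
```

Both components of the printed pair are `110`. Whether GHC turns this into a tight loop still depends on the optimiser eliminating the intermediate closures, as noted above.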
Sum first 15,000,000 even numbers:
{-# LANGUAGE BangPatterns #-}
g :: Integer -- 15000000*15000001 = 225000015000000
g = go 1 0 0
go i !a c | c == 15000000 = a
go i !a c | even i = go (i+1) (a+i) (c+1)
go i !a c = go (i+1) a c
ought to be the fastest.
If you want to be sure to traverse the list only once, you can write the traversal explicitly:
nats = [1..] :: [Int]

requiredOfX :: Int -> Bool -- this way you can write a different requirement
requiredOfX x = even x

dumbSum :: Int
dumbSum = dumbSum' 0 0 nats
  where
    dumbSum' acc 15000000 _ = acc
    dumbSum' acc count (x:xs)
      | requiredOfX x = dumbSum' (acc + x) (count + 1) xs
      -- only count the elements that were actually added to the sum
      | otherwise = dumbSum' acc count xs
First, you can be clever as young Gauss was and compute the sum in O(1).
Fun stuff aside, your Haskell solution uses lists. I'm quite sure your C/C++ solution doesn't. (Haskell lists are very easy to use so one is tempted to use them even in cases where it might not be appropriate.) Try benchmarking this:
sumBy2 :: Integer -> Integer
sumBy2 = f 0
  where
    f result n
      | n <= 1 = result
      | otherwise = f (n + result) (n - 2)
Compile it with GHC using the -O2 flag. This function is tail-recursive, so the compiler can implement it very efficiently.
Update: If you want it using even function, it's possible:
sumBy2 :: Integer -> Integer
sumBy2 = f 0
  where
    f result n
      | n <= 0 = result
      | even n = f (n + result) (n - 1)
      | otherwise = f result (n - 1)
You can also easily make the filtering function a parameter:
sumFilter :: (Integral a) => (a -> Bool) -> a -> a
sumFilter filtfn = f 0
  where
    f result n
      | n <= 0 = result
      | filtfn n = f (n + result) (n - 1)
      | otherwise = f result (n - 1)
Strict version works much faster:
foldl' (+) 0 $ take 15000000 [2, 4..]
Another thing to note is that nats and evens are so-called Constant Applicative Forms, or CAFs for short. Basically, those correspond to top-level definitions without any arguments. CAFs are a bit of an odd duck, for instance being the reason for the Dreaded Monomorphism Restriction; I'm not sure the language definition even allows CAFs to be inlined.
In my mental model of how Haskell executes, by the time dumbSum returns a value, evens will be evaluated to look something like 2:4: ... : 30000000 : <thunk> and nats to 1:2: ... : 30000000 : <thunk>, where the <thunk>s represent something that's not been looked at yet. If my understanding is correct, these allocations of : do have to happen and can't be optimized away.
So one way of speeding things up without altering your code too much would be to simply write:
dumbSum :: Int
dumbSum = sum . take 15000000 . filter even $ [1..]
or
dumbSum = sum $ take 15000000 evens
  where
    nats = [1..]
    evens = filter even nats
On my machine, compiled with -O2, that alone seems to result in a roughly 30% speedup.
I'm no GHC connoisseur (I've never even profiled a Haskell program!), so I could be wildly off the mark, though.

Project Euler 14: performance compared to C and memoization

I'm currently working on project euler problem 14.
I solved it using a poorly coded program, without memoization, that took 386 seconds to run, or 5 seconds once compiled (see edit).
Here it is:
step :: (Integer, Int) -> Integer -> (Integer, Int)
step (i, m) n | nextValue > m = (n, nextValue)
              | otherwise     = (i, m)
  where nextValue = syr n 1

syr :: Integer -> Int -> Int
syr 1 acc = acc
syr x acc | even x    = syr (x `div` 2) (acc + 1)
          | otherwise = syr (3 * x + 1) (acc + 1)

p14 = foldl step (0, 0) [500000..999999]
My question is about several comments in the thread related to this problem, which mention execution times of <1 s for programs such as the following (C code, credit to the Project Euler forum user ix; note: I didn't check that the execution time is in fact as mentioned):
#include <stdio.h>

int main(int argc, char **argv) {
    int longest = 0;
    int terms = 0;
    int i;
    unsigned long j;
    for (i = 1; i <= 1000000; i++) {
        j = i;
        int this_terms = 1;
        while (j != 1) {
            this_terms++;
            if (this_terms > terms) {
                terms = this_terms;
                longest = i;
            }
            if (j % 2 == 0) {
                j = j / 2;
            } else {
                j = 3 * j + 1;
            }
        }
    }
    printf("longest: %d (%d)\n", longest, terms);
    return 0;
}
To me, those programs are kind of the same, when talking about the algorithm.
So I wonder why there is such a big difference? Or is there any fundamental difference between our two algorithms that can justify a 6x factor in performance?
BTW, I'm currently trying to implement this algorithm with memoization, but I am kind of lost: to me, it's way easier to implement in an imperative language (and I don't work with monads yet, so I can't use that paradigm). So if you have any good tutorial that would help a beginner learn memoization, I'll be glad (the ones I encountered were not detailed enough or out of my league).
Note: I came to declarative paradigm through Prolog and am still in the very early process of discovering Haskell, so I might miss important things.
Note2: any general advice about my code is welcome.
EDIT: thanks to delnan's help, I compiled the program and it now runs in 5 seconds, so I mainly look for hints on memoization now (even if ideas about the existing x6 gap are still welcome).
After having compiled it with optimisations, there are still several differences to the C programme:
you use div, while the C programme uses machine division (which truncates) [but any self-respecting C compiler transforms that into a shift, so that makes it yet faster], that would be quot in Haskell; that reduced the run time by some 15% here.
the C programme uses fixed-width 64-bit (or even 32-bit, but then it's just luck that it gets the correct answer, since some intermediate values exceed 32-bit range) integers, the Haskell programme uses arbitrary precision Integers. If you have 64-bit Ints in your GHC (64-bit OS other than Windows), replace Integer with Int. That reduced the run time by a factor of about 3 here. If you're on a 32-bit system, you're out of luck, GHC doesn't use native 64-bit instructions there, these operations are implemented as C calls, that's still rather slow.
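A minimal sketch of both changes applied to the chain-length function, assuming a GHC with 64-bit `Int`. The name `chainLength` and the bang patterns are additions for illustration and are not from the question. The second line of output shows where `div` and `quot` differ: only on negative operands, which never occur here, so the cheaper `quot` is safe:

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

-- Number of terms in the Collatz chain starting at n (counting n and 1).
-- Uses machine Ints and truncating division instead of Integer and div.
chainLength :: Int -> Int
chainLength = go 1
  where
    go :: Int -> Int -> Int
    go !acc 1 = acc
    go !acc n
      | even n = go (acc + 1) (n `quot` 2)
      | otherwise = go (acc + 1) (3 * n + 1)

main :: IO ()
main = do
  print (chainLength 6) -- chain: 6 3 10 5 16 8 4 2 1
  print ((-7) `div` 2, (-7) `quot` 2) -- div rounds down, quot truncates
```

This prints `9` and then `(-4,-3)`.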
For the memoisation, you can outsource it to one of the memoisation packages on hackage, the only one that I remember is data-memocombinators, but there are others. Or you can do it yourself, for example keeping a map of previously calculated values - that would work best in the State monad,
import Control.Monad.State.Strict
import qualified Data.Map as Map
import Data.Map (Map, singleton)

type Memo = Map Integer Int

syr :: Integer -> State Memo Int
syr n = do
    mb <- gets (Map.lookup n)
    case mb of
      Just l -> return l
      Nothing -> do
          let m = if even n then n `quot` 2 else 3*n+1
          l <- syr m
          let l' = l+1
          modify (Map.insert n l')
          return l'

solve :: Integer -> Int -> Integer -> State Memo (Integer,Int)
solve maxi len start
    -- stop once the starting value passes the limit
    | start > 1000000 = return (maxi,len)
    | otherwise = do
        l <- syr start
        if len < l
          then solve start l (start+1)
          else solve maxi len (start+1)

p14 :: (Integer,Int)
p14 = evalState (solve 0 0 500000) (singleton 1 1)
but that will probably not gain too much (not even when you've added the necessary strictness). The trouble is that a lookup in a Map is not too cheap and an insertion is relatively expensive.
Another method is to keep a mutable array for the lookup. The code becomes more complicated, since you have to choose a reasonable upper bound for the values to cache (should be not much larger than the bound for the starting values) and deal with the parts of the sequences falling outside the memoised range. But an array lookup and write are fast. If you have 64-bit Ints, the below code runs pretty fast, here it takes 0.03s for a limit of 1 million, and 0.33s for a limit of 10 million, the corresponding (as closely as I reasonably could) C code runs in 0.018 resp. 0.2s.
module Main (main) where

import System.Environment (getArgs)
import Data.Array.ST
import Data.Array.Base
import Control.Monad.ST
import Data.Bits
import Data.Int

main :: IO ()
main = do
    args <- getArgs
    let bd = case args of
               a:_ -> read a
               _   -> 100000
    print $ collMax bd

next :: Int -> Int
next n
    | n .&. 1 == 0 = n `unsafeShiftR` 1
    | otherwise    = 3*n + 1

collMax :: Int -> (Int,Int16)
collMax upper = runST $ do
    arr <- newArray (0,upper) 0 :: ST s (STUArray s Int Int16)
    let go l m
          | upper < m = go (l+1) $ next m
          | otherwise = do
              l' <- unsafeRead arr m
              case l' of
                0 -> do
                    l'' <- go 1 $ next m
                    unsafeWrite arr m (l'' + 1)
                    return (l+l'')
                _ -> return (l+l'-1)
        collect mi ml i
          | upper < i = return (mi, ml)
          | otherwise = do
              l <- go 1 i
              if l > ml
                then collect i l (i+1)
                else collect mi ml (i+1)
    unsafeWrite arr 1 1
    collect 1 1 2
Well, the C program uses unsigned long, but Integer can store arbitrarily large integers (it's a bignum). If you import Data.Word, then you can use Word, which is a machine-word-sized unsigned integer.
After replacing Integer with Word, and using ghc -O2 and gcc -O3, the C program runs in 0.72 seconds, while the Haskell program runs in 1.92 seconds. 2.6x isn't bad. However, ghc -O2 doesn't always help, and this is one of the programs on which it doesn't! Using just -O, as you did, brings the runtime down to 1.90 seconds.
I tried replacing div with quot (which uses the same type of division as C; they only differ on negative inputs), but strangely it actually made the Haskell program run slightly slower for me.
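To make the difference concrete, here is a small sketch (the helper name halves is hypothetical). div rounds toward negative infinity and pairs with mod, while quot truncates toward zero and pairs with rem, so the two only disagree when the operands have different signs:

```haskell
-- (n `div` 2, n `quot` 2, n `mod` 2, n `rem` 2)
halves :: Int -> (Int, Int, Int, Int)
halves n = (n `div` 2, n `quot` 2, n `mod` 2, n `rem` 2)

-- halves 7    gives (3, 3, 1, 1)
-- halves (-7) gives (-4, -3, 1, -1)
```

For an unsigned type such as Word there are no negative inputs, so div and quot always return the same result.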
You should be able to speed up the syr function with the help of this previous Stack Overflow question I answered about the same Project Euler problem.
On my current system (32-bit Core2Duo) your Haskell code, including all the optimizations given in the answers, takes 0.8s to compile and 1.2s to run.
You could transfer the run-time to compile-time, and reduce the run-time to nearly zero.
module Euler14 where

import Data.Word
import Language.Haskell.TH

terms :: Word -> Word
terms n = countTerms n 0
  where
    countTerms 1 acc = acc + 1
    countTerms n acc | even n    = countTerms (n `div` 2) (acc + 1)
                     | otherwise = countTerms (3 * n + 1) (acc + 1)

longestT :: Word -> Word -> (Word, Word)
longestT mi mx = find mi mx (0, 0)
  where
    find mi mx (ct,cn) | mi == mx  = if ct > terms mi then (ct,cn) else (terms mi, mi)
                       | otherwise = find (mi + 1) mx
                                       (if ct > terms mi then (ct,cn) else (terms mi, mi))

-- With template-haskell 2.16 or later, TupE takes [Maybe Exp], so each element needs a Just.
longest :: Word -> Word -> ExpQ
longest mi mx = return $ TupE [LitE (IntegerL (fromIntegral a)),
                               LitE (IntegerL (fromIntegral b))]
  where
    (a,b) = longestT mi mx

-- Main.hs, a separate file: the splice has to live in a different module from longest.
{-# LANGUAGE TemplateHaskell #-}
import Euler14

main = print $(longest 500000 999999)
On my system it takes 2.3s to compile this but the run-time goes down to 0.003s. Compile Time Function Execution (CTFE) is something you can't do in C/C++. The only other programming language that I know of that supports CTFE is the D programming language. And just to be complete, the C code takes 0.1s to compile and 0.7s to run.
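For a single-file illustration of the same idea, here is a minimal sketch (the name factorial10 is made up for this example). Because the splice only uses imported functions, it does not need a second module: the product is computed while compiling, and the compiled program only contains the resulting literal.

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH (integerL, litE)

-- The expression inside $( ... ) runs at compile time and is replaced
-- by an integer literal in the generated code.
factorial10 :: Integer
factorial10 = $(litE (integerL (product [1 .. 10])))
```

The trade-off is the same as in the Euler14 example: whatever the splice computes is paid for on every compile instead of on every run.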

How to write parallel code with Haskell vectors?

On one hand, in Haskell, Vector a seems to be the preferred type to use as an array of numbers. There is even an (incomplete) Vector Tutorial.
On the other hand, Control.Parallel.Strategies is defined mostly in terms of Traversable. The vector library doesn't provide these instances.
A minimal complete definition of Traversable t also requires a Foldable instance and one of
    traverse :: Applicative f => (a -> f b) -> t a -> f (t b)
    sequenceA :: Applicative f => t (f a) -> f (t a)
I don't see how sequenceA can be defined for Data.Vector.Unboxed.Vector. So, what is the best approach to writing parallel code with unboxed vectors? Defining some new ad hoc strategies like evalVector or using par and pseq explicitly or using plain Data.Array instead of vectors?
P.S. Plain Arrays are parallelizable without problems: https://gist.github.com/701888
It's a hack job for parVector but this worked for me:
import qualified Data.Vector as V
import Control.Parallel.Strategies
import Control.Parallel
import Control.DeepSeq

ack :: Int -> Int -> Int
ack 0 n = n+1
ack m 0 = ack (m-1) 1
ack m n = ack (m-1) (ack m (n-1))

main = do
  let vec = V.enumFromN 1 1000
  let res = (V.map (ack 2) vec) `using` parVector
  print res

-- Done is the Eval constructor exported by older versions of parallel;
-- with newer versions, return vec should do the same job.
parVector :: NFData a => Strategy (V.Vector a)
parVector vec = eval vec `seq` Done vec
  where
  chunkSize = 1
  eval v
    | vLen == 0 = ()
    | vLen <= chunkSize = rnf (v V.! 0) -- FIX this to handle chunks > 1
    | otherwise = eval (V.take half v) `par` eval (V.drop half v)
    where vLen = V.length v
          half = vLen `div` 2
And running this code:
[tommd@Mavlo Test]$ ghc --make -O2 -threaded t.hs
... dumb warning ...
[tommd@Mavlo Test]$ time ./t +RTS -N1 >/dev/null
real 0m1.962s
user 0m1.951s
sys 0m0.009s
[tommd@Mavlo Test]$ time ./t +RTS -N2 >/dev/null
real 0m1.119s
user 0m2.221s
sys 0m0.005s
When I run the code with Integer instead of Int in the type signature:
[tommd@Mavlo Test]$ time ./t +RTS -N2 >/dev/null
real 0m4.754s
user 0m9.435s
sys 0m0.028s
[tommd@Mavlo Test]$ time ./t +RTS -N1 >/dev/null
real 0m9.008s
user 0m8.952s
sys 0m0.029s
EDIT: And a solution that is a bit closer to your earlier attempt is cleaner (it doesn't use functions from three separate modules) and works great:
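parVector :: NFData a => Strategy (V.Vector a)
parVector vec =
  let vLen = V.length vec
      half = vLen `div` 2
      minChunk = 10
  in  if vLen > minChunk
      then do
        let v1 = V.unsafeSlice 0 half vec
            v2 = V.unsafeSlice half (vLen - half) vec
        parVector v1
        parVector v2
        return vec
      else
        evalChunk (vLen-1) >>
        return vec
  where
  -- Note: evalChunk assumes a non-empty vector. With newer versions of
  -- parallel, rparWith rdeepseq x is the usual way to spark a full evaluation;
  -- rpar (rdeepseq x) may only spark the rdeepseq action itself.
  evalChunk 0 = rpar (rdeepseq (vec V.! 0)) >> return vec
  evalChunk i = rpar (rdeepseq (vec V.! i)) >> evalChunk (i-1)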
Things to learn from this solution:
It uses the Eval monad, which is strict so we're sure to spark everything (compared to wrapping things in let and remembering to use bang patterns).
Contrary to your proposed implementation, it (a) doesn't construct a new vector, which is costly, and (b) forces evaluation of each element in evalChunk using rpar and rdeepseq (I don't believe rpar vec forces any of the vector's elements).
Contrary to my belief, slice takes a start index and length, not a start and end index. Oops!
We still need to import Control.DeepSeq (NFData), but I've e-mailed the libraries list to try and fix that issue.
Performance seems similar to the first parVector solution in this answer, so I won't post numbers.
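For readers on a recent version of the parallel package, the points above can be sketched as follows. This is an illustration rather than the code that was benchmarked: it assumes boxed vectors and that rparWith is available, and the names middle and squares are made up for the example.

```haskell
import Control.DeepSeq (NFData)
import Control.Parallel.Strategies (Strategy, rdeepseq, rparWith, using)
import qualified Data.Vector as V

-- Spark the full evaluation of every element. The input vector is
-- returned as it is, so no new vector gets constructed.
parVector :: NFData a => Strategy (V.Vector a)
parVector vec = do
  V.mapM_ (rparWith rdeepseq) vec
  return vec

-- slice takes a start index and a length, not a start and an end index:
-- this picks the elements at indices 2, 3 and 4.
middle :: V.Vector Int
middle = V.slice 2 3 (V.fromList [0 .. 9])

-- Usage: the strategy only changes how the elements are evaluated,
-- never what they evaluate to.
squares :: V.Vector Int
squares = V.map (\x -> x * x) (V.enumFromN 1 5) `using` parVector
```

One spark per element is only worthwhile when each element is expensive to compute; for cheap elements a chunked variant is likely the better fit.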
1) As you probably know, vector is a product of the DPH work that has proven harder than the researchers initially expected.
2) Unboxed vectors can't divide up the work for individual elements across multiple CPUs.
3) I'd be a lot more hopeful for boxed vectors. Something like:
using (map (rnf . (vec !)) [0..V.length vec - 1]) (parList rdeepseq)
Or maybe you can avoid constructing the list and using parList. I think just assigning parts of the array is sufficient. The code below is likely broken, but the concept of making your own parVector using rnf and dividing the vector in half until it is a single element (or some tunable chunk size of elements) should work.
parVector :: NFData a => Strategy (V.Vector a)
parVector vec = let !_ = eval vec in Done vec -- the bang pattern needs BangPatterns
  where
    chunkSize = 1
    eval v
      | vLen == 0 = ()
      | vLen <= chunkSize = rnf (v V.! 0) -- FIX this to handle chunks > 1
      | otherwise = eval (V.take half v) `par` eval (V.drop half v)
      where vLen = V.length v
            half = vLen `div` 2
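One way the "handle chunks > 1" case could be filled in is sketched below, using par and pseq directly instead of a Strategy. The names forceChunks and parVectorChunked are hypothetical, and the sketch assumes boxed vectors, where V.take and V.drop are cheap slices:

```haskell
import Control.DeepSeq (NFData, rnf)
import Control.Parallel (par, pseq)
import qualified Data.Vector as V

-- Force every element of v. The vector is split in half until a piece
-- has at most chunkSize elements; each small piece is forced sequentially.
forceChunks :: NFData a => Int -> V.Vector a -> ()
forceChunks chunkSize v
  | n <= max 1 chunkSize = V.foldl' (\u x -> u `seq` rnf x) () v
  | otherwise            = left `par` (right `pseq` (left `pseq` ()))
  where
    n     = V.length v
    half  = n `div` 2
    left  = forceChunks chunkSize (V.take half v)  -- sparked
    right = forceChunks chunkSize (V.drop half v)  -- done by this thread

-- Evaluate all elements in parallel chunks, then hand back the same vector.
parVectorChunked :: NFData a => Int -> V.Vector a -> V.Vector a
parVectorChunked chunkSize v = forceChunks chunkSize v `pseq` v
```

The chunk size is the tuning knob: larger chunks mean fewer sparks and less overhead, smaller chunks give the runtime more freedom to balance the load.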
