Good day. Given the following code:
import Control.DeepSeq
import Control.Exception
import Control.Parallel
import Control.Parallel.Strategies
import System.Environment
import Text.Printf
l = [34,56,43,1234,456,765,345,4574,58,878,978,456,34,234,1234123,1234,12341234]
f x = Just (sum [1..x])
fun1 :: [Maybe Integer]
fun1 = map f l `using` parList rdeepseq
fun2 :: [Maybe Integer]
fun2 = map f l `using` evalList (rparWith rdeepseq)
fun3 :: [Maybe Integer]
fun3 = map f l `using` evalList (rpar . force)
main :: IO ()
main = print fun1
Why do fun1 and fun2 run sequentially?
From what I understood, rparWith should spark its argument. An answer here states the same. But for fun1 and fun2 I'm getting output like "SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)". So sparks were not even created.
fun3 works as expected with sparks being created.
Thanks for the help.
UPD: I also found that rdeepseq makes an example from the book (Parallel and Concurrent Programming in Haskell) run sequentially. The book says:
And we can use parPair to write a Strategy that fully evaluates both components of a pair in parallel:
parPair rdeepseq rdeepseq :: (NFData a, NFData b) => Strategy (a,b)
To break down what happens when this Strategy is applied to a pair: parPair calls evalPair, and evalPair calls rparWith rdeepseq on each component of the pair. So the effect is that each component will be fully evaluated to normal form in parallel.
But if I run
(Just (fib 35), Just (fib 36)) `using` parPair rdeepseq rdeepseq
or even
(fib 35, fib 36) `using` parPair rdeepseq rdeepseq
ThreadScope shows only one core running and 0 sparks created.
fib is implemented like this (also from the book):
fib :: Integer -> Integer
fib 0 = 1
fib 1 = 1
fib n = fib (n-1) + fib (n-2)
rparWith was defined using realWorld#, a deeply magical GHC internal value. The way it was used is essentially the same as applying a "function" sometimes called accursedUnutterablePerformIO (more officially, unsafeInlinePerformIO). Using it is only legitimate when the IO in question is actually exceptionally pure. The thinking was that since Eval is just for calculation, that should be fine. But in fact, sparking threads is an IO effect, and one we care about! The optimizer was rearranging those effects in an unfortunate way, causing them ultimately to be dropped. The fix was to use unsafeDupablePerformIO instead. That's a much better-behaved "function", and seems to do the trick. See the ticket for details.
Note: my initial fix turned out to be a bit wrong; it's now been modified once again.
The original paper describes rdeepseq as
rdeepseq :: NFData a => Strategy a
rdeepseq x = rnf x `pseq` return x
And indeed, if you use this definition, it will create sparks, as you'd expect. It looks like the semantics of rdeepseq were changed (probably here), intentionally or accidentally. I don't see any note in either the documentation or the changelog, so it is probably a bug. Please create an issue on their bug tracker and ask the maintainers for clarification.
I was wondering if it is possible in Haskell to define a function which upon calling gives the next element of an (infinite) list, so for example:
Prelude> func
1
Prelude> func
2
Is it possible to have such a function in Haskell, and if there is, can you give me an example?
You could do a Stateful thing like this:
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.State
import Data.List
import Data.Maybe
-- This is not a function! The misleading name func comes from the question text.
func :: MonadState [a] m => m a
func = state (fromJust . uncons)
exampleUsage :: State [Int] (Int, Int)
exampleUsage = do
    x <- func
    y <- func
    return (x, y)
You can try it in ghci:
> evalState exampleUsage [1..]
(1, 2)
However, at a high level, I would suggest rethinking your requirements. func is not very idiomatic at all; simply working with the infinite list directly is generally going to be much clearer, have lower (syntactic) overhead, and lead to better generated code. For example:
exampleUsage' :: [a] -> (a, a)
exampleUsage' (x:y:_) = (x,y)
N.B. this is two lines of code with no extensions or imports, compared to the previous 11 lines of code including a language extension and three imports. Usage is also simplified; you can drop the call to evalState entirely and be done.
> exampleUsage' [1..]
(1, 2)
You can use mutable references and the IO monad (or other stateful monad). This can be made rather pretty via partial application:
Prelude> import Data.IORef
Prelude Data.IORef> ref <- newIORef 0
Prelude Data.IORef> let func = readIORef ref >>= \r -> writeIORef ref (r+1) >> return r
Prelude Data.IORef> func
0
Prelude Data.IORef> func
1
Or closer to what you requested:
Prelude Data.IORef> ref2 <- newIORef [0..]
Prelude Data.IORef> let func2 = readIORef ref2 >>= \(x:xs) -> writeIORef ref2 xs >> return x
Prelude Data.IORef> func2
0
Prelude Data.IORef> func2
1
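The two GHCi sessions above can be folded into a small compiled sketch. The helper name mkNext is made up for this illustration and is not part of any library. It hides the IORef inside a closure, so callers only see the resulting IO action, and it uses atomicModifyIORef' from Data.IORef so that reading and advancing happen in one step.

```haskell
import Data.IORef (atomicModifyIORef', newIORef)

-- | Hypothetical helper: turn a (possibly infinite) list into an IO action
-- that yields the next element on every call, or Nothing once exhausted.
mkNext :: [a] -> IO (IO (Maybe a))
mkNext xs0 = do
  ref <- newIORef xs0
  pure (atomicModifyIORef' ref step)
  where
    -- New state on the left, value handed back to the caller on the right.
    step [] = ([], Nothing)
    step (x : xs) = (xs, Just x)

demo :: IO ()
demo = do
  next <- mkNext [1 :: Int ..]
  a <- next
  b <- next
  print (a, b) -- prints (Just 1,Just 2)
```

Returning Maybe avoids the partial \(x:xs) pattern used in func2, at the cost of a Just wrapper when the list is known to be infinite.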
It sounds like you are looking for something like other languages' Iterator or Generator constructs. If so, this seems like a good use case for the conduit library. Note that there are options (e.g. pipes); however, conduit may be a good starting point for you.
If you are trying to operate only over lists, using the State Monad may be a simpler answer (as Daniel suggests); however, if you are looking for a more general solution, conduit (or the like) may indeed be the answer.
The func you are searching for is therefore most likely the await function.
Here's a simple example -
import Prelude
import Conduit
import Data.MonoTraversable
main :: IO ()
main = runConduit (source .| consume) >>= print
source :: Monad m => Producer m (Element [Integer])
source = yieldMany [0..]
consume :: Monad m => ConduitM i o m (Maybe (i, i))
consume = do
    mx <- await
    my <- await
    return $ (,) <$> mx <*> my
And its output -
λ main
Just (0,1)
I have some specs (written with HSpec) and would like to have a test that checks whether the re-exporting of some functions takes place as intended.
Code:
https://github.com/Wizek/compose-ltr/blob/ab954f00beb56c6c1a595261381d40e7e824e3bc/spec/Spec.hs#L4
If I go into this file, I can run all tests with either import if I manually switch whether line 4 or 5 is commented out. Is there a simple way to have an automated specification that ensures that both modules export the same functions?
The first thing I thought of is to import one of the modules qualified, and check for equality:
(($>) == (ComposeLTR.$>)) `shouldBe` True
-- Or more succinctly
($>) `shouldBe` (ComposeLTR.$>)
But that won't work, since functions are not directly comparable: they have no instance of the Eq type class.
The only thing I can think of that would work automatically is to import qualified and to define QuickCheck properties for all 4 functions like so:
import qualified ComposeLTR
it "should re-export the same function" $ do
  let
    prop :: (Fun Int Int) -> Int -> Bool
    prop (Fun _ f) g = (g $> f) == (g ComposeLTR.$> f)
  property prop
-- ... Essentially repeated 3 more times
But that seems awfully long-handed and redundant. Is there an elegant way to check this?
You can use StableNames in IO:
Prelude Data.List System.Mem.StableName> v <- makeStableName Prelude.takeWhile
Prelude Data.List System.Mem.StableName> v' <- makeStableName Data.List.takeWhile
Prelude Data.List System.Mem.StableName> v == v'
True
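For use inside a test, the GHCi session above might be wrapped in a helper along these lines. The name sameClosure is hypothetical. It forces both arguments to weak head normal form first, because the System.Mem.StableName documentation notes that a stable name taken before evaluation may differ from one taken after. Treat this as a sketch: a True result means both names refer to the same heap object, while False does not prove that two functions behave differently.

```haskell
import Control.Exception (evaluate)
import System.Mem.StableName (makeStableName)

-- | Hypothetical helper: True when both arguments are the very same heap
-- object. Both are evaluated to WHNF before their stable names are taken.
sameClosure :: a -> a -> IO Bool
sameClosure x y = do
  x' <- evaluate x
  y' <- evaluate y
  sx <- makeStableName x'
  sy <- makeStableName y'
  pure (sx == sy)
```

In a spec this could be called as sameClosure ($>) (ComposeLTR.$>) and the result checked with shouldBe True, where ($>) and ComposeLTR are the names from the question.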
I have a simple script written in both Python and Haskell. It reads a file with 1,000,000 newline separated integers, parses that file into a list of integers, quick sorts it and then writes it to a different file sorted. This file has the same format as the unsorted one. Simple.
Here is Haskell:
quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
    where
        lesser = filter (< p) xs
        greater = filter (>= p) xs
main = do
    file <- readFile "data"
    let un = lines file
    let f = map (\x -> read x::Int ) un
    let done = quicksort f
    writeFile "sorted" (unlines (map show done))
And here is Python:
def qs(ar):
    if len(ar) == 0:
        return ar
    p = ar[0]
    return qs([i for i in ar if i < p]) + [p] + qs([i for i in ar if i > p])
def read_file(fn):
    f = open(fn)
    data = f.read()
    f.close()
    return data
def write_file(fn, data):
    f = open('sorted', 'w')
    f.write(data)
    f.close()
def main():
    data = read_file('data')
    lines = data.split('\n')
    lines = [int(l) for l in lines]
    done = qs(lines)
    done = [str(l) for l in done]
    write_file('sorted', "\n".join(done))
if __name__ == '__main__':
    main()
Very simple. Now I compile the Haskell code with
$ ghc -O2 --make quick.hs
And I time those two with:
$ time ./quick
$ time python qs.py
Results:
Haskell:
real 0m10.820s
user 0m10.656s
sys 0m0.154s
Python:
real 0m9.888s
user 0m9.669s
sys 0m0.203s
How can Python possibly be faster than native code Haskell?
Thanks
EDIT:
Python version: 2.7.1
GHC version: 7.0.4
Mac OSX, 10.7.3
2.4GHz Intel Core i5
List generated by
from random import shuffle
a = [str(a) for a in xrange(0, 1000*1000)]
shuffle(a)
s = "\n".join(a)
f = open('data', 'w')
f.write(s)
f.close()
So all numbers are unique.
The Original Haskell Code
There are two issues with the Haskell version:
You're using string IO, which builds linked lists of characters
You're using a non-quicksort that looks like quicksort.
This program takes 18.7 seconds to run on my Intel Core2 2.5 GHz laptop. (GHC 7.4 using -O2)
Daniel's ByteString Version
This is much improved, but notice it still uses the inefficient built-in merge sort.
His version takes 8.1 seconds (and doesn't handle negative numbers, but that's more of a non-issue for this exploration).
Note
From here on this answer uses the following packages: vector, attoparsec, text and vector-algorithms. Also notice that kindall's version using timsort takes 2.8 seconds on my machine (edit: and 2 seconds using pypy).
A Text Version
I ripped off Daniel's version, translated it to Text (so it handles various encodings) and added better sorting using a mutable Vector in an ST monad:
import Data.Attoparsec.Text.Lazy
import qualified Data.Text.Lazy as T
import qualified Data.Text.Lazy.IO as TIO
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Algorithms.Intro as I
import Control.Applicative
import Control.Monad.ST
import System.Environment (getArgs)
parser = many (decimal <* char '\n')
main = do
    numbers <- TIO.readFile =<< fmap head getArgs
    case parse parser numbers of
        Done t r | T.null t -> writeFile "sorted" . unlines
                                    . map show . vsort $ r
        x -> error $ Prelude.take 40 (show x)
vsort :: [Int] -> [Int]
vsort l = runST $ do
    let v = V.fromList l
    m <- V.unsafeThaw v
    I.sort m
    v' <- V.unsafeFreeze m
    return (V.toList v')
This runs in 4 seconds (and also doesn't handle negatives)
Return to the Bytestring
So now that we know we can make a more general program that's faster, what about making the ASCII-only version fast? No problem!
import qualified Data.ByteString.Lazy.Char8 as BS
import Data.Attoparsec.ByteString.Lazy (parse, Result(..))
import Data.Attoparsec.ByteString.Char8 (decimal, char)
import Control.Applicative ((<*), many)
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Algorithms.Intro as I
import Control.Monad.ST
parser = many (decimal <* char '\n')
main = do
    numbers <- BS.readFile "rands"
    case parse parser numbers of
        Done t r | BS.null t -> writeFile "sorted" . unlines
                                    . map show . vsort $ r
vsort :: [Int] -> [Int]
vsort l = runST $ do
    let v = V.fromList l
    m <- V.unsafeThaw v
    I.sort m
    v' <- V.unsafeFreeze m
    return (V.toList v')
This runs in 2.3 seconds.
Producing a Test File
Just in case anyone's curious, my test file was produced by:
import Control.Monad.CryptoRandom
import Crypto.Random
main = do
    g <- newGenIO :: IO SystemRandom
    let rs = Prelude.take (2^20) (map abs (crandoms g) :: [Int])
    writeFile "rands" (unlines $ map show rs)
If you're wondering why vsort isn't packaged in some easier form on Hackage... so am I.
In short, don't use read. Replace read with a function like this:
import Numeric
fastRead :: String -> Int
fastRead s = case readDec s of [(n, "")] -> n
I get a pretty fair speedup:
~/programming% time ./test.slow
./test.slow 9.82s user 0.06s system 99% cpu 9.901 total
~/programming% time ./test.fast
./test.fast 6.99s user 0.05s system 99% cpu 7.064 total
~/programming% time ./test.bytestring
./test.bytestring 4.94s user 0.06s system 99% cpu 5.026 total
Just for fun, the above results include a version that uses ByteString (and hence fails the "ready for the 21st century" test by totally ignoring the problem of file encodings) for ULTIMATE BARE-METAL SPEED. It also has a few other differences; for example, it ships out to the standard library's sort function. The full code is below.
import qualified Data.ByteString as BS
import Data.Attoparsec.ByteString.Char8
import Control.Applicative
import Data.List
parser = many (decimal <* char '\n')
reallyParse p bs = case parse p bs of
    Partial f -> f BS.empty
    v -> v
main = do
    numbers <- BS.readFile "data"
    case reallyParse parser numbers of
        Done t r | BS.null t -> writeFile "sorted" . unlines . map show . sort $ r
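The fastRead shown earlier in this answer has a single case alternative, so it fails with a pattern-match error on any line that is not a plain non-negative decimal. A total variant might look like the sketch below. The name fastReadMaybe is hypothetical; readDec is the same function from Numeric used above, and it does not accept a leading minus sign.

```haskell
import Numeric (readDec)

-- | Hypothetical total variant of fastRead: Nothing instead of a
-- pattern-match failure when the whole string is not a decimal number.
fastReadMaybe :: String -> Maybe Int
fastReadMaybe s = case readDec s of
  [(n, "")] -> Just n   -- the entire input was consumed
  _         -> Nothing  -- empty input, trailing characters, or a sign
```

With this, mapM fastReadMaybe (lines file) yields Nothing if any line is malformed, which may be preferable to a crash partway through a large file.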
More a Pythonista than a Haskellite, but I'll take a stab:
There's a fair bit of overhead in your measured runtime just reading and writing the files, which is probably pretty similar between the two programs. Also, be careful that you've warmed up the cache for both programs.
Most of your time is spent making copies of lists and fragments of lists. Python list operations are heavily optimized, being one of the most-frequently used parts of the language, and list comprehensions are usually pretty performant too, spending much of their time in C-land inside the Python interpreter. There is not a lot of the stuff that is slowish in Python but wicked fast in static languages, such as attribute lookups on object instances.
Your Python implementation throws away numbers that are equal to the pivot, so by the end it may be sorting fewer items, giving it an obvious advantage. (If there are no duplicates in the data set you're sorting, this isn't an issue.) Fixing this bug probably requires making another copy of most of the list in each call to qs(), which would slow Python down a little more.
You don't mention what version of Python you're using. If you're using 2.x, you could probably get Haskell to beat Python just by switching to Python 3.x. :-)
I'm not too surprised the two languages are basically neck-and-neck here (a 10% difference is not noteworthy). Using C as a performance benchmark, Haskell loses some performance for its lazy functional nature, while Python loses some performance due to being an interpreted language. A decent match.
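The point above about the pivot can be made concrete with a Haskell transliteration of the Python qs (my own rendering, not code from the question) next to a variant that keeps duplicates. Both function names are hypothetical, and the extra list of equal elements is the additional copy mentioned above.

```haskell
-- Transliteration of the Python qs: both comprehensions scan the whole
-- list with strict comparisons, so elements equal to the pivot are dropped.
qsDropping :: Ord a => [a] -> [a]
qsDropping [] = []
qsDropping ar@(p : _) =
  qsDropping [i | i <- ar, i < p] ++ [p] ++ qsDropping [i | i <- ar, i > p]

-- Hypothetical fix: collect the elements equal to the pivot in a third list.
qsKeeping :: Ord a => [a] -> [a]
qsKeeping [] = []
qsKeeping (p : xs) =
  qsKeeping [i | i <- xs, i < p]
    ++ (p : [i | i <- xs, i == p])
    ++ qsKeeping [i | i <- xs, i > p]
```

For the input [3,1,3,2], qsDropping returns [1,2,3] while qsKeeping returns [1,2,3,3].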
Since Daniel Wagner posted an optimized Haskell version using the built-in sort, here's a similarly optimized Python version using list.sort():
mylist = [int(x.strip()) for x in open("data")]
mylist.sort()
open("sorted", "w").write("\n".join(str(x) for x in mylist))
3.5 seconds on my machine, vs. about 9 for the original code. Pretty much still neck-and-neck with the optimized Haskell. Reason: it's spending most of its time in C-programmed libraries. Also, TimSort (the sort used in Python) is a beast.
This is after the fact, but I think most of the trouble is in the Haskell writing. The following module is pretty primitive -- one should use builders probably and certainly avoid the ridiculous roundtrip via String for showing -- but it is simple and did distinctly better than pypy with kindall's improved python and better than the 2 and 4 sec Haskell modules elsewhere on this page (it surprised me how much they were using lists, so I made a couple more turns of the crank.)
$ time aa.hs real 0m0.709s
$ time pypy aa.py real 0m1.818s
$ time python aa.py real 0m3.103s
I'm using the sort recommended for unboxed vectors from vector-algorithms. The use of Data.Vector.Unboxed in some form is clearly now the standard, naive way of doing this sort of thing -- it's the new Data.List (for Int, Double, etc.) Everything but the sort is irritating IO management, which could I think still be massively improved, on the write end in particular. The reading and sorting together take about 0.2 sec as you can see from asking it to print what's at a bunch of indexes instead of writing to file, so twice as much time is spent writing as in anything else. If the pypy is spending most of its time using timsort or whatever, then it looks like the sorting itself is surely massively better in Haskell, and just as simple -- if you can just get your hands on the darned vector...
I'm not sure why there aren't convenient functions around for reading and writing vectors of unboxed things from natural formats -- if there were, this would be three lines long and would avoid String and be much faster, but maybe I just haven't seen them.
import qualified Data.ByteString.Lazy.Char8 as BL
import qualified Data.ByteString.Char8 as B
import qualified Data.Vector.Unboxed.Mutable as M
import qualified Data.Vector.Unboxed as V
import Data.Vector.Algorithms.Radix
import System.IO
main = do unsorted <- fmap toInts (BL.readFile "data")
          vec <- V.thaw unsorted
          sorted <- sort vec >> V.freeze vec
          withFile "sorted" WriteMode $ \handle ->
             V.mapM_ (writeLine handle) sorted
writeLine :: Handle -> Int -> IO ()
writeLine h int = B.hPut h $ B.pack (show int ++ "\n")
toInts :: BL.ByteString -> V.Vector Int
toInts bs = V.unfoldr oneInt (BL.cons ' ' bs)
oneInt :: BL.ByteString -> Maybe (Int, BL.ByteString)
oneInt bs = if BL.null bs then Nothing else
               let bstail = BL.tail bs
               in if BL.null bstail then Nothing else BL.readInt bstail
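The remark above that one should probably use builders, and avoid the round trip through String, could be sketched as follows. This is not the author's code: renderLines, renderToLazy and writeSorted are hypothetical names, and the sketch assumes a bytestring version that provides Data.ByteString.Builder with intDec, char7 and hPutBuilder. A plain list is used for brevity, and no timing is claimed for it.

```haskell
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Lazy.Char8 as BL
import System.IO (IOMode (WriteMode), withBinaryFile)

-- | One decimal number per line, built without going through String.
renderLines :: [Int] -> BB.Builder
renderLines = foldMap (\n -> BB.intDec n <> BB.char7 '\n')

-- | The same output as a lazy ByteString, handy for inspection.
renderToLazy :: [Int] -> BL.ByteString
renderToLazy = BB.toLazyByteString . renderLines

-- | Write the numbers to a file in a single builder run.
writeSorted :: FilePath -> [Int] -> IO ()
writeSorted path xs =
  withBinaryFile path WriteMode $ \h -> BB.hPutBuilder h (renderLines xs)
```

In the module above, writeSorted "sorted" (V.toList sorted) would take the place of the withFile call and the writeLine helper.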
To follow up on @kindall's interesting answer: those timings depend on the Python/Haskell implementation you use, the hardware configuration on which you run the tests, and the algorithm implementation you write in both languages.
Nevertheless, we can try to get some good hints about the relative performance of one language implementation compared to another, or of one language compared to another. A well-known algorithm like quicksort is a good place to start.
To illustrate a python/python comparison, I just tested your script on CPython 2.7.3 and PyPy 1.8 on the same machine:
CPython: ~8s
PyPy: ~2.5s
This shows there can be room for improvement in the language implementation; maybe compiled Haskell is not making the best of your corresponding code. If you are looking for speed in Python, also consider switching to PyPy, if the rest of your code permits it.
I noticed a problem that nobody else seems to have mentioned; both your Haskell and Python code have it. (Please tell me if the optimizer already fixes this; I know nothing about optimizations.) I will demonstrate it in Haskell.
In your code you define the lesser and greater lists like this:
    where lesser = filter (<p) xs
          greater = filter (>=p) xs
This is bad, because you compare each element of xs with p twice: once to decide whether it goes into the lesser list, and again for the greater list. In theory (I haven't checked the timing) this makes your sort use twice as many comparisons; this is a disaster. Instead, you should write a function that splits a list into two lists using a predicate, in such a way that
split f xs
is equivalent to
(filter f xs, filter (not.f) xs)
using this kind of function you will only need to compare each element in the list once to know in which side of the tuple to put it.
Okay, let's do it:
    where
        split :: (a -> Bool) -> [a] -> ([a], [a])
        split _ [] = ([],[])
        split f (x:xs)
            |f x       = let (a,b) = split f xs in (x:a,b)
            |otherwise = let (a,b) = split f xs in (a,x:b)
Now let's replace the lesser/greater definitions with
let (lesser, greater) = split (p>) xs in (insert function here)
Full code:
quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (p:xs) =
    let (lesser, greater) = splitf (p>) xs
    in (quicksort lesser) ++ [p] ++ (quicksort greater)
    where
        splitf :: (a -> Bool) -> [a] -> ([a], [a])
        splitf _ [] = ([],[])
        splitf f (x:xs)
            |f x       = let (a,b) = splitf f xs in (x:a,b)
            |otherwise = let (a,b) = splitf f xs in (a,x:b)
For some reason I couldn't write the greater/lesser part in the where clause, so I had to write it in a let clause.
Also, if it is not tail-recursive, let me know and fix it for me (I don't yet fully understand how tail recursion works).
Now you should do the same for the Python code. I don't know Python, so I can't do it for you.
EDIT:
There actually happens to already be such a function in Data.List, called partition. Note that this shows the need for this kind of function; otherwise it wouldn't be defined.
This shrinks the code to:
import Data.List (partition)
quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (p:xs) =
    let (lesser, greater) = partition (p>) xs
    in (quicksort lesser) ++ [p] ++ (quicksort greater)
Python is really optimized for this sort of thing. I suspect that Haskell isn't. Here's a similar question that provides some very good answers.
The source code for Control.Parallel.Strategies ( http://hackage.haskell.org/packages/archive/parallel/3.1.0.1/doc/html/src/Control-Parallel-Strategies.html#Eval ) contains a type Eval defined as:
data Eval a = Done a
which has the following Monad instance:
instance Monad Eval where
  return x = Done x
  Done x >>= k = k x   -- Note: pattern 'Done x' makes '>>=' strict
Note the comment in the definition of bind. Why is this comment true? My understanding of strictness is that a function is only strict if it must "know something" about its argument. Here, bind just applies k to x, thus it doesn't appear (to me) to need to know anything about x. Further, the comment suggests that strictness is "induced" in the pattern match, before the function is even defined. Can someone help me understand why bind is strict?
Also, it looks like Eval is just the identity Monad and that, given the comment in the definition of bind, bind would be strict for almost any Monad. Is this the case?
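One way to probe what the comment means is to rebuild the two shapes side by side. The sketch below is not the library source: it redefines a local Eval with the same data declaration as quoted above, plus a hypothetical newtype Ident for contrast, and uses plain functions (bindEval, bindIdent) instead of Monad instances to stay short. The helper throwsWhenForced is also made up for this illustration.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Same shape as the quoted type: a data declaration, not a newtype.
data Eval a = Done a

runEval :: Eval a -> a
runEval (Done a) = a

-- Mirrors the quoted (>>=): matching on Done forces the first argument
-- to weak head normal form, but not the value stored inside it.
bindEval :: Eval a -> (a -> Eval b) -> Eval b
bindEval (Done x) k = k x

-- Newtype-based identity for contrast: there is no constructor to force.
newtype Ident a = Ident { runIdent :: a }

bindIdent :: Ident a -> (a -> Ident b) -> Ident b
bindIdent m k = k (runIdent m)

-- | True when evaluating the argument to WHNF throws an exception.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = do
  r <- try (evaluate x)
  case r of
    Left e -> const (pure True) (e :: SomeException)
    Right _ -> pure False

demo :: IO ()
demo = do
  a <- throwsWhenForced (runEval (undefined `bindEval` \_ -> Done "end"))
  b <- throwsWhenForced (runEval (Done undefined `bindEval` \_ -> Done "end"))
  c <- throwsWhenForced (runIdent (undefined `bindIdent` \_ -> Ident "end"))
  print (a, b, c) -- prints (True,False,False)
```

Only the first case throws: the pattern match needs to see the Done constructor, so an undefined computation is forced, while an undefined value inside Done is never looked at.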
It is strict in that m >> n evaluates m, unlike the Identity Monad:
Prelude Control.Parallel.Strategies Control.Monad.Identity> runIdentity (undefined >> return "end")
"end"
Prelude Control.Parallel.Strategies Control.Monad.Identity> runEval (undefined >> return "end")
"*** Exception: Prelude.undefined
It is not strict in the value that m produces, which is what you are pointing out:
Prelude Control.Parallel.Strategies Control.Monad.Identity> runEval (return undefined >> return "end")
"end"