Convert IO Int to Int with System.Random.MWC

I would like to convert an IO Int from System.Random.MWC to an Int using unsafePerformIO. It works in ghci:
Prelude System.Random.MWC System.IO.Unsafe> let p = unsafePerformIO(uniformR (0, 30) gen :: IO Int)
Prelude System.Random.MWC System.IO.Unsafe> p
11
Prelude System.Random.MWC System.IO.Unsafe> :t p
p :: Int
However, when compiled with GHC:

import System.Random.MWC
import System.IO.Unsafe

main :: IO ()
main = do
    gen <- createSystemRandom
    print $! s 30 gen

s :: Int -> GenIO -> Int
s !k g = unsafePerformIO (uniformR (0, k - 1) g)
the compiler panics with
ghc: panic! (the 'impossible' happened)
(GHC version 7.6.3 for i386-unknown-linux):
make_exp (App _ (Coercion _))
Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug

There's really no need for unsafePerformIO here. Just change the type of s to return IO Int and use do-notation or the bind operator to feed the result to print.
s :: Int -> GenIO -> IO Int
s k g = uniformR (0, k - 1) g
main :: IO ()
main = do
    gen <- createSystemRandom
    x <- s 30 gen
    print x
or
main = do
    gen <- createSystemRandom
    print =<< s 30 gen
or
main = print =<< s 30 =<< createSystemRandom
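As an aside, if all you need is a one-off random number, mwc-random also lets you create a system-seeded generator and run an action with it in one step via withSystemRandom and asGenIO. A minimal sketch (assuming those helpers from the same package; adjust to your mwc-random version):

import System.Random.MWC (asGenIO, uniformR, withSystemRandom)

main :: IO ()
main = do
    -- create a system-seeded generator and draw one Int in [0, 29]
    x <- withSystemRandom . asGenIO $ \gen -> uniformR (0, 29 :: Int) gen
    print x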

Related

Difference in GHC versions - compile error

I was practicing my Haskell and came across a weird problem that I was unable to find a solution to on the Internet. I decided to solve this problem:
https://www.hackerrank.com/challenges/fibonacci-fp
in as many ways as I can think of. One way is recursion with memoization, using the State monad as a cache. I have GHC 7.10.2 on my Windows 10 machine and GHC 7.6.2 on my Ubuntu 14.04. The code below compiles (and runs very well) on 7.6.2 but doesn't compile on 7.10.2, giving errors wherever I use 'Map', for example:
Not in scope: type constructor or class: 'Map.Map'
Not in scope: 'Map.lookup'
module Main (
    main
  ) where

import qualified Data.Map as Map
import Control.Monad.State

type CacheState = Map.Map Int Int
type IOState a = StateT CacheState IO a

modNum :: Int
modNum = 100000007

fibsMod :: [Int]
fibsMod = 0 : 1 : zipWith (\x y -> (x + y) mod modNum) fibsMod (tail fibsMod)

-- | calculate Fibs with memoization in map
memoizedFib :: Int -> IOState Int
memoizedFib n = do
    state <- get
    let x = Map.lookup n state
    case x of
        Just y ->
            return y
        Nothing -> do
            n1 <- memoizedFib (n - 1)
            n2 <- memoizedFib (n - 2)
            let n3 = mod (n1 + n2) modNum
            put (Map.insert n n3 state)
            return n3

query :: [Int] -> IOState ()
query [] = return ()
query (n:ns) = do
    fibNum <- memoizedFib n
    liftIO $ print fibNum
    query ns

main :: IO ()
main = do
    inputdata <- getContents
    let intList = (map (read :: String -> Int) . tail . words) inputdata
    evalIOState $ query intList
  where
    initState :: Int -> Map.Map Int Int
    initState upTo = Map.fromList $ zip [0 .. upTo] $ take upTo fibsMod
    --initState upTo = Map.fromList $ [(0, 0), (1, 1)]

    evalIOState :: IOState a -> IO a
    evalIOState m = evalStateT m (initState 10001)
Does anybody know why I am facing this problem? It's very frustrating.
Additional question
As you can see, I didn't implement recursion with memoization exactly. However, leaving one or the other of these lines uncommented changes the approach:
initState upTo = Map.fromList $ zip [0 .. upTo] $ take upTo fibsMod
--initState upTo = Map.fromList $ [(0, 0), (1, 1)]
The problem is that the second line performs terribly. I don't know where I made a mistake, but with memoization it should run in linear time. With this line, however, my algorithm is clearly exponential (I couldn't even get the answer for the 50th Fibonacci number, it took that long). What did I do wrong in this case?
UPDATE
Thanks to your comments I fixed my code. Obviously there was a problem with the mod function (I have no idea how this compiled on GHC 7.6.2). I also changed:
import qualified Data.Map as Map
to:
import qualified Data.Map.Strict as Map
and now the code below works as intended:
module Main (
    main
  ) where

import qualified Data.Map.Strict as Map
import Control.Monad.State

type CacheState = Map.Map Int Int
type IOState a = StateT CacheState IO a

modNum :: Int
modNum = 100000007

fibsMod :: [Int]
fibsMod = 0 : 1 : zipWith (\x y -> (x + y) `mod` modNum) fibsMod (tail fibsMod)

-- | calculate Fibs with memoization in map
memoizedFib :: Int -> IOState Int
memoizedFib n = do
    state <- get
    let x = Map.lookup n state
    case x of
        Just y ->
            return y
        Nothing -> do
            n1 <- memoizedFib (n - 1)
            n2 <- memoizedFib (n - 2)
            state <- get
            let n3 = mod (n1 + n2) modNum
            put (Map.insert n n3 state)
            return n3

query :: [Int] -> IOState ()
query [] = return ()
query (n:ns) = do
    fibNum <- memoizedFib n
    liftIO $ print fibNum
    query ns

main :: IO ()
main = do
    inputdata <- getContents
    let intList = (map (read :: String -> Int) . tail . words) inputdata
    evalIOState $ query intList
  where
    initState :: Int -> Map.Map Int Int
    --initState upTo = Map.fromList $ zip [0 .. upTo] $ take upTo fibsMod
    initState upTo = Map.fromList [(0, 0), (1, 1)]

    evalIOState :: IOState a -> IO a
    evalIOState m = evalStateT m (initState 10001)
So now the question comes down to: why did I need to use Data.Map.Strict, how is it different, and why didn't GHC 7.6.2 need it?
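For reference (this is not from the original thread), the two modules share the same Map type and the same API; the difference is only that Data.Map.Strict evaluates each value to weak head normal form before storing it, while Data.Map stores values as unevaluated thunks. A minimal sketch that makes the difference observable:

import Control.Exception (SomeException, evaluate, try)
import qualified Data.Map as Lazy
import qualified Data.Map.Strict as Strict

main :: IO ()
main = do
    -- Data.Map stores the value lazily: forcing the map to WHNF does not
    -- force the value, so the error below is never triggered.
    _ <- evaluate (Lazy.insert (0 :: Int) (error "lazy value" :: Int) Lazy.empty)
    putStrLn "Data.Map: value stays an unevaluated thunk"
    -- Data.Map.Strict evaluates the value before storing it, so forcing
    -- the map raises the error, which we catch here.
    r <- try (evaluate (Strict.insert (0 :: Int) (error "strict value" :: Int) Strict.empty))
             :: IO (Either SomeException (Strict.Map Int Int))
    putStrLn (either (const "Data.Map.Strict: value was forced") (const "unexpected") r)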

How does GHC force evaluation in multithreaded applications?

For example: I have a fairly simple memoised implementation of the Fibonacci sequence, which I request from multiple threads:
{-# LANGUAGE BangPatterns #-}
module Main where

import Control.Concurrent
import Control.DeepSeq
import System.Environment (getArgs)
import System.IO.Unsafe (unsafePerformIO)

data Tree a = Tree (Tree a) a (Tree a)

instance Functor Tree where
    fmap f (Tree l m r) = Tree (fmap f l) (f m) (fmap f r)

index :: Tree a -> Int -> a
index (Tree _ m _) 0 = m
index (Tree l _ r) n = case (n - 1) `divMod` 2 of
    (q,0) -> index l q
    (q,_) -> index r q

nats :: Tree Int
nats = go 0 1
  where go !n !s = Tree (go l s') n (go r s')
          where l  = n + s
                r  = l + s
                s' = s * 2

fib :: (Int -> Integer) -> Int -> Integer
fib _ 0 = 0
fib _ 1 = 1
fib f n = f (n - 1) + f (n - 2)

fib_tree :: Tree Integer
fib_tree = fmap (fib fastfib) nats

fastfib :: Int -> Integer
fastfib = index fib_tree

writeMutex :: MVar ()
writeMutex = unsafePerformIO (newMVar ())

fibIO :: Int -> IO ()
fibIO n = let fibn = fastfib n
          in deepseq fibn $ do takeMVar writeMutex
                               putStrLn (show n ++ " " ++ show fibn)
                               putMVar writeMutex ()

children :: MVar [MVar ()]
children = unsafePerformIO (newMVar [])

waitForChildren :: IO ()
waitForChildren = do
    cs <- takeMVar children
    case cs of
        []   -> return ()
        m:ms -> do
            putMVar children ms
            takeMVar m
            waitForChildren

forkChild :: IO () -> IO ThreadId
forkChild io = do
    mvar <- newEmptyMVar
    childs <- takeMVar children
    putMVar children (mvar:childs)
    forkFinally io (\_ -> putMVar mvar ())

main' :: [Int] -> IO ()
main' = mapM_ (forkChild . fibIO)

main :: IO ()
main = do
    nargs <- fmap read `fmap` getArgs :: IO [Int]
    main' nargs
    waitForChildren
And when compiled with -threaded I can run it:
% time ./concur 10 10 10 10 10 10 10 +RTS -N4
10 55
10 55
10 55
10 55
10 55
10 55
10 55
./concur 10 10 10 10 10 10 10 +RTS -N4 0.00s user 0.00s system 82% cpu 0.007 total
And, as expected, whether I pass a single large argument or many of them, the execution time is about the same.
I'm interested in how the evaluation of thunks in the memoised tree is performed at a low level.
When one thread evaluates a thunk, the thunk is locked and other threads that demand it block on it (this is known as a black hole). See the paper "Haskell on a Shared-Memory Multiprocessor" for details.
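If you want to see that sharing from the outside, here is a toy sketch of my own (not from the answer; compile with -threaded and run with +RTS -N2): two threads force the same top-level thunk, which has a Debug.Trace message attached. The message normally appears only once, because the runtime black-holes the thunk while the first thread evaluates it and the other thread reuses (or waits for) the result.

import Control.Concurrent (forkIO, newEmptyMVar, putMVar, takeMVar)
import Debug.Trace (trace)

-- a shared top-level thunk with a trace message attached to its evaluation
shared :: Integer
shared = trace "evaluating shared thunk" (sum [1 .. 10000000])

main :: IO ()
main = do
    done <- newEmptyMVar
    -- both threads demand the same thunk; the trace normally fires once
    _ <- forkIO (shared `seq` putMVar done ())
    _ <- forkIO (shared `seq` putMVar done ())
    takeMVar done
    takeMVar done
    print shared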

getLine x times in Haskell

Based on a HackerRank question.
My problem is: how can I do getLine t times on stdin?
main = do
    t <- getInt
    let x = [divisorsInNumber unsafeGetInt | a <- [1..t]]
    print x

getInt :: IO Int
getInt = fmap read getLine

unsafeGetInt :: Int
unsafeGetInt = unsafePerformIO getInt

divisorsInNumber n = length $ filter (== True) $ map (isDivisor n) (integralToListOfInts n)
Just replicate the getLine operation t times with replicateM:
import Control.Monad (replicateM)
getLines :: Int -> IO [String]
getLines t = replicateM t getLine
Thus getInts, that is, getInt repeated t times, can be expressed as:
getInts :: Int -> IO [Int]
getInts n = fmap read <$> getLines n
The full code rewritten to use getInts could be:
import Control.Applicative ((<$>))
import Control.Monad (replicateM)
getLines :: Int -> IO [String]
getLines n = replicateM n getLine
getInts :: Int -> IO [Int]
getInts n = fmap read <$> getLines n
getInt :: IO Int
getInt = fmap read getLine
divisorsInNumber :: Int -> Int
divisorsInNumber n = length $ filter (isDivisor n) (integralToListOfInts n)
main :: IO ()
main = do
    t <- getInt
    nums <- getInts t
    let x = [divisorsInNumber num | num <- nums]
    print x
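Equivalently, since getInt is already defined, you can replicate it directly and skip getLines; this is just a small stylistic variation on the same replicateM idea:

import Control.Monad (replicateM)

getInt :: IO Int
getInt = fmap read getLine

-- read t integers, one per line
getInts :: Int -> IO [Int]
getInts t = replicateM t getInt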

Frequency of characters

I am trying to find the frequency of characters in a file using Haskell. I want to be able to handle files around 500MB in size.
What I've tried so far
It does the job, but it is a bit slow as it traverses the file 256 times:
calculateFrequency :: L.ByteString -> [(Word8, Int64)]
calculateFrequency f = foldl (\acc x -> (x, L.count x f):acc) [] [255, 254.. 0]
I have also tried using Data.Map, but the program runs out of memory (in the GHC interpreter).
import qualified Data.ByteString.Lazy as L
import qualified Data.Map as M
calculateFrequency' :: L.ByteString -> [(Word8, Int64)]
calculateFrequency' xs = M.toList $ L.foldl' (\m word -> M.insertWith (+) word 1 m) (M.empty) xs
Here's an implementation using mutable, unboxed vectors instead of higher level constructs. It also uses conduit for reading the file to avoid lazy I/O.
import Control.Monad.IO.Class
import qualified Data.ByteString as S
import Data.Conduit
import Data.Conduit.Binary as CB
import qualified Data.Conduit.List as CL
import qualified Data.Vector.Unboxed.Mutable as VM
import Data.Word (Word8)

type Freq = VM.IOVector Int

newFreq :: MonadIO m => m Freq
newFreq = liftIO $ VM.replicate 256 0

printFreq :: MonadIO m => Freq -> m ()
printFreq freq =
    liftIO $ mapM_ go [0..255]
  where
    go i = do
        x <- VM.read freq i
        putStrLn $ show i ++ ": " ++ show x

addFreqWord8 :: MonadIO m => Freq -> Word8 -> m ()
addFreqWord8 f w = liftIO $ do
    let index = fromIntegral w
    oldCount <- VM.read f index
    VM.write f index (oldCount + 1)

addFreqBS :: MonadIO m => Freq -> S.ByteString -> m ()
addFreqBS f bs =
    loop (S.length bs - 1)
  where
    loop (-1) = return ()
    loop i = do
        addFreqWord8 f (S.index bs i)
        loop (i - 1)

-- | The main entry point.
main :: IO ()
main = do
    freq <- newFreq
    runResourceT
        $ sourceFile "random"
        $$ CL.mapM_ (addFreqBS freq)
    printFreq freq
I ran this on 500MB of random data and compared it with josejuan's UArray-based answer:
conduit-based/mutable vectors: 1.006s
UArray: 17.962s
I think it should be possible to keep much of the elegance of josejuan's high-level approach yet keep the speed of the mutable vector implementation, but I haven't had a chance to try implementing something like that yet. Also, note that with some general purpose helper functions (like Data.ByteString.mapM or Data.Conduit.Binary.mapM) the implementation could be significantly simpler without affecting performance.
You can play with this implementation on FP Haskell Center as well.
EDIT: I added one of those missing functions to conduit and cleaned up the code a bit; it now looks like the following:
import Control.Monad.Trans.Class (lift)
import Data.ByteString (ByteString)
import Data.Conduit (Consumer, ($$))
import qualified Data.Conduit.Binary as CB
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Unboxed.Mutable as VM
import System.IO (stdin)
freqSink :: Consumer ByteString IO (V.Vector Int)
freqSink = do
    freq <- lift $ VM.replicate 256 0
    CB.mapM_ $ \w -> do
        let index = fromIntegral w
        oldCount <- VM.read freq index
        VM.write freq index (oldCount + 1)
    lift $ V.freeze freq

main :: IO ()
main = (CB.sourceHandle stdin $$ freqSink) >>= print
The only difference in functionality is how the frequency is printed.
Alex's answer is good but, with only 256 values (indexes), an array should be better:
import qualified Data.ByteString.Lazy as L
import qualified Data.Array.Unboxed as A
import qualified Data.ByteString as B
import Data.Int
import Data.Word
fq :: L.ByteString -> A.UArray Word8 Int64
fq = A.accumArray (+) 0 (0, 255) . map (\c -> (c, 1)) . concat . map B.unpack . L.toChunks
main = L.getContents >>= print . fq
Alex's code takes (for my sample file) 24.81 seconds; using an array takes 7.77 seconds.
UPDATED:
Although Snoyman's solution is better, one possible improvement is to avoid unpack:
fq :: L.ByteString -> A.UArray Word8 Int64
fq = A.accumArray (+) 0 (0, 255) . toCounterC . L.toChunks
  where toCounterC [] = []
        toCounterC (x:xs) = toCounter x (B.length x) xs
        toCounter _ 0 xs = toCounterC xs
        toCounter x i xs = (B.index x i', 1) : toCounter x i' xs
          where i' = i - 1
That gives roughly a 50% speedup.
UPDATED:
Using an IOVector as Snoyman does is about as fast as the conduit version (a bit faster, really, but this is raw code; better to use conduit):
import Data.Int
import Data.Word
import Control.Monad.IO.Class
import qualified Data.ByteString.Lazy as L
import qualified Data.Array.Unboxed as A
import qualified Data.ByteString as B
import qualified Data.Vector.Unboxed.Mutable as V

fq :: L.ByteString -> IO (V.IOVector Int64)
fq xs = do
    v <- V.replicate 256 0 :: IO (V.IOVector Int64)
    g v $ L.toChunks xs
    return v
  where
    g v = toCounterC
      where
        toCounterC [] = return ()
        toCounterC (x:xs) = toCounter x (B.length x) xs
        toCounter _ 0 xs = toCounterC xs
        toCounter x i xs = do
            let i' = i - 1
                w  = fromIntegral $ B.index x i'
            c <- V.read v w
            V.write v w (c + 1)
            toCounter x i' xs

main = do
    v <- L.getContents >>= fq
    mapM_ (\i -> V.read v i >>= liftIO . putStr . (++", ") . show) [0..255]
This works for me on my computer:
module Main where
import qualified Data.HashMap.Strict as M
import qualified Data.ByteString.Lazy as L
import Data.Word
import Data.Int
calculateFrequency :: L.ByteString -> [(Word8, Int64)]
calculateFrequency xs = M.toList $ L.foldl' (\m word -> M.insertWith (+) word 1 m) M.empty xs
main = do
    bs <- L.readFile "E:\\Steam\\SteamApps\\common\\Sid Meier's Civilization V\\Assets\\DLC\\DLC_Deluxe\\Behind the Scenes\\Behind the Scenes.wmv"
    print (calculateFrequency bs)
It doesn't run out of memory, or even load the whole file in, but takes forever (about a minute) on 600MB+ files! I compiled this using GHC 7.6.3.
I should point out that the code is basically identical save for the strict HashMap instead of the lazy Map.
Note that insertWith is twice as fast with HashMap as with Map in this case. On my machine, the code as written executes in 54 seconds, while the version using Map takes 107.
My two cents (using an STUArray). I can't compare it to the other solutions here; someone might be willing to try it...
module Main where
import Data.Array.ST (runSTUArray, newArray, readArray, writeArray)
import Data.Array.Unboxed (UArray)
import qualified Data.ByteString.Lazy as L (ByteString, unpack, getContents)
import Data.Word
import Data.Int
import Control.Monad (forM_)
calculateFrequency :: L.ByteString -> UArray Word8 Int64
calculateFrequency bs = runSTUArray $ do
    a <- newArray (0, 255) 0
    forM_ (L.unpack bs) $ \i -> readArray a i >>= writeArray a i . succ
    return a
main = L.getContents >>= print . calculateFrequency
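As one more variation (my own sketch, not benchmarked against the answers above), the mutable-vector counting can also be wrapped in a pure interface using Data.Vector.Unboxed's create, which runs the mutation inside ST and freezes the result:

import qualified Data.ByteString.Lazy as L
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Unboxed.Mutable as VM

-- count byte frequencies with a mutable unboxed vector, exposed through
-- a pure interface: V.create runs the mutation in ST and freezes it
calculateFrequency :: L.ByteString -> V.Vector Int
calculateFrequency bs = V.create $ do
    v <- VM.replicate 256 0
    let bump w = do
            let i = fromIntegral w
            c <- VM.read v i
            VM.write v i (c + 1)
    mapM_ bump (L.unpack bs)
    return v

main :: IO ()
main = L.getContents >>= print . calculateFrequency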

How to convert an STArray to a list in Haskell?

Is there a function that can do what the function arrayToList does below?
import Data.Array.ST
import Control.Monad.ST

genArray :: ST s [Int]
genArray = do
    a <- newArray (0, 99) 0 :: ST s (STArray s Int Int)
    writeArray a 0 1
    {- ... write something to the array ... -}
    return (arrayToList a)
If not, how do I write one?
You don't need IO for this; constructing a list is a pure operation:
genArray :: [Int]
genArray = runST $ do
    a <- newArray (0, 99) 0 :: ST s (STArray s Int Int)
    writeArray a 0 1
    {- ... write something to the array ... -}
    getElems a
Use stToIO and getElems:
genArray :: IO [Int]
genArray = stToIO $ do
    a <- newArray (0,99) 0 :: ST s (STArray s Int Int)
    writeArray a 0 1
    getElems a
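Another option is to let runSTArray freeze the mutable array into an immutable Array and flatten that with elems; a small sketch assuming runSTArray from Data.Array.ST and elems from Data.Array:

import Control.Monad.ST (ST)
import Data.Array (elems)
import Data.Array.ST (STArray, newArray, runSTArray, writeArray)

genArray :: [Int]
genArray = elems $ runSTArray $ do
    a <- newArray (0, 99) 0 :: ST s (STArray s Int Int)
    writeArray a 0 1
    -- runSTArray freezes the STArray into an immutable Array Int Int;
    -- elems then flattens it into a list
    return a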
