How to work with mutable structures in the IO monad - haskell

TL;DR:
How do I ensure persistence of values generated by randomRIO (from System.Random) within a given do statement?
How do I work with mutable structures in the IO Monad?
My initial question was (so very) wrong - I'm updating the title so future readers who want to understand use mutable structures in the IO monad can find this post.
Longer version:
A heads up:
This looks long but a lot of it is just me giving an overview of how exercism.io works. (UPDATE: the last two code-blocks are older versions of my code which are included as reference, in case future readers would like to follow along with the iterations in the code based on the comments/answers.)
Overview of Exercise:
I'm working on the Robot Name exercise from (the extremely instructive) exercism.io. The exercise involves creating a Robot data type which is capable of storing a name, which is randomly generated (exercise Readme is included below).
For those who aren't familiar with it, the exercism.io learning model is based on automated testing of student-generated code. Each exercise consists of a series of tests (written by the test author) and the solution code must be able to pass all of them. Our code must pass all tests in a given exercise's test file, before we can move to the next exercise - an effective model, imo. (Robot Name is exercise #20 or so.)
In this particular exercise, we're asked to create a Robot data-type and three accompanying functions: mkRobot, robotName and resetName.
mkRobot generates an instance of a Robot
robotName generates and "returns" a unique name for a unnamed Robot (i.e., robotName does not overwrite a pre-existing name); if a Robot already has a name, it simply "returns" the existing name
resetName overwrites a pre-existing name with a new one.
In this particular exercise, there are 7 tests. The tests checks that:
0) robotName generates names that conforms to the specified pattern (a
name is 5 characters long and is made up of two letters followed by
three digits, e.g., AB123, XQ915, etc.)
1) a name assigned by robotName is persistent (i.e., let's say we create robot A and assign him (or her) a name using robotName; calling robotName a second time (on robot A) shouldn't overwrite his name)
2) robotName generates unique names for different robots (i.e., it tests that we're actually randomizing the process)
3) resetName generates names that conform to the specified pattern (similar to test #0)
4) a name assigned by resetName is persistent
5) resetName assigns a different name (i.e., resetName gives a robot a name that's different form it's current name)
6) resetName affects only one robot at a time (i.e., let's say we have robot A and robot B; resetting robot A's name shouldn't affect robot B's name) AND (ii) names that are generated by resetName are persistent
As reference, here's the test itself: https://github.com/dchaudh/exercism-haskell-solutions/blob/master/robot-name/robot-name_test.hs
Where I'm stuck:
Version 1 (original post): At the moment, my code fails on three tests (#1, #4 and #6) all of which have to do with persistence of a robot's name..
Version 2: (interim) Now my code fails on one test (#5) only - test 5 has to do with changing the name of a robot that we've already created
(thanks to bheklikr for his helpful comments which helped me clean up version 1)
Version 3 (final): The code is now fixed (and passes all tests) thanks to Cirdec's thorough post below. For future reader's benefit, I'm including the final version of the code along with the two earlier versions (so they can follow along with the various comments/answers).
Version 3 (Final):
Here's the final version based on Cirdec's answer below (which I'd highly recommend reading). It turns out that my original question (which asked how to create persistent variables using System.Random) was just totally wrong because my initial implementation was unsound. My question should instead have asked how to work with mutable structures in the IO monad (which Cirdec explains below).
{-# LANGUAGE NoMonomorphismRestriction #-}
module Robot (robotName, mkRobot, resetName) where
import Data.Map (fromList, findWithDefault)
import System.Random (Random, randomRIO)
import Control.Monad (replicateM)
import Data.IORef (IORef, newIORef, modifyIORef, readIORef)
newtype Robot = Robot { name :: String }
mkRobot :: IO (IORef Robot)
mkRobot = mkRobotName >>= return . Robot >>= newIORef
robotName :: IORef Robot -> IO String
robotName rr = readIORef rr >>= return . name
resetName :: IORef Robot -> IO ()
resetName rr = mkRobotName >>=
\newName -> modifyIORef rr (\r -> r {name = newName})
mkRobotName :: IO String
mkRobotName = replicateM 2 getRandLetter >>=
\l -> replicateM 3 getRandNumber >>=
\n -> return $ l ++ n
getRandNumber :: IO Char
getRandNumber = fmap getNumber $ randomRIO (1, 10)
getRandLetter :: IO Char
getRandLetter = fmap getLetter $ randomRIO (1, 26)
getNumber :: Int -> Char
getNumber i = findWithDefault ' ' i alphabet
where alphabet = fromList $ zip [1..] ['0'..'9']
getLetter :: Int -> Char
getLetter i = findWithDefault ' ' i alphabet
where alphabet = fromList $ zip [1..] ['A'..'Z']
Version 2 (Interim):
Based on bheklikr's comments which clean up the mkRobotName function and which help start fixing the mkRobot function. This version of the code yielded an error on test #5 only - test #5 has to do with changing a robot's name, which motivates the need for mutable structures...
{-# LANGUAGE NoMonomorphismRestriction #-}
module Robot (robotName, mkRobot, resetName) where
import Data.Map (fromList, findWithDefault)
import System.Random (Random, randomRIO)
import Control.Monad (replicateM)
data Robot = Robot (IO String)
resetName :: Robot -> IO String
resetName (Robot _) = mkRobotName >>= \name -> return name
mkRobot :: IO Robot
mkRobot = mkRobotName >>= \name -> return (Robot (return name))
robotName :: Robot -> IO String
robotName (Robot name) = name
-------------------------------------------------------------------------
--Supporting functions:
mkRobotName :: IO String
mkRobotName = replicateM 2 getRandLetter >>=
\l -> replicateM 3 getRandNumber >>=
\n -> return $ l ++ n
getRandNumber :: IO Char
getRandNumber = fmap getNumber $ randomRIO (1, 10)
getRandLetter :: IO Char
getRandLetter = fmap getLetter $ randomRIO (1, 26)
getNumber :: Int -> Char
getNumber i = findWithDefault ' ' i alphabet
where alphabet = fromList $ zip [1..] ['0'..'9']
getLetter :: Int -> Char
getLetter i = findWithDefault ' ' i alphabet
where alphabet = fromList $ zip [1..] ['A'..'Z']
Version 1 (Original):
In retrospect, this is laughably bad. This version failed on tests #1, #4 and #6 all of which are related to persistence of a robot's name.
{-# LANGUAGE NoMonomorphismRestriction #-}
module Robot (robotName, mkRobot, resetName) where
import Data.Map (fromList, findWithDefault)
import System.Random (Random, randomRIO)
data Robot = Robot (IO String)
resetName :: Robot -> IO Robot
resetName (Robot _) = return $ (Robot mkRobotName)
mkRobot :: IO Robot
mkRobot = return (Robot mkRobotName)
robotName :: Robot -> IO String
robotName (Robot name) = name
--the mass of code below is used to randomly generate names; it's probably
--possible to do it in way fewer lines. but the crux of the main problem lies
--with the three functions above
mkRobotName :: IO String
mkRobotName = getRandLetter >>=
\l1 -> getRandLetter >>=
\l2 -> getRandNumber >>=
\n1 -> getRandNumber >>=
\n2 -> getRandNumber >>=
\n3 -> return (l1:l2:n1:n2:n3:[])
getRandNumber :: IO Char
getRandNumber = randomRIO (1,10) >>= \i -> return $ getNumber i
getNumber :: Int -> Char
getNumber i = findWithDefault ' ' i alphabet
where alphabet = fromList $ zip [1..] ['0'..'9']
getRandLetter :: IO Char
getRandLetter = randomRIO (1,26) >>= \i -> return $ getLetter i
getLetter :: Int -> Char
getLetter i = findWithDefault ' ' i alphabet
where alphabet = fromList $ zip [1..] ['A'..'Z']

Let's start with the types, based on what is required by the tests. mkRobot returns something in IO
mkRobot :: IO r
robotName takes what is returned from mkRobot and returns an IO String.
robotName :: r -> IO String
Finally, resetName takes what is returned from mkRobot and produces an IO action. The return of this action is never used, so we'll use the unit type () for it which is normal for IO actions with no result in Hasekll.
resetName :: r -> IO ()
Based on the tests, whatever r is needs to be able to behave like it is mutated by resetName. We have a number of options for things that behave like they are mutable in IO: IORefs, STRefs, MVarss, and software transactional memory. My go-to preference for simple problems is the IORef. I'm going to take a slightly different tack than you, and separate the IORef from what a Robot is.
newtype Robot = Robot {name :: String}
This leaves Robot a very pure data type. Then I'll use IORef Robot for what r is in the interface to the tests.
IORefs provide five extremely useful functions for working with them, which we will use three of. newIORef :: a -> IO (IORef a) makes a new IORef holding the provided value. readIORef :: IORef a -> IO a reads the value stored in the IORef. modifyIORef :: IORef a -> (a -> a) -> IO () applies the function to the value stored in the IORef. There are two other extremely useful functions we won't use, writeIORef which sets the value without looking at what's there, and atomicModifyIORef which solves about half of the shared memory problems in writing multi-threaded programs. We'll import the three that we will use
import Data.IORef (IORef, newIORef, modifyIORef, readIORef)
When we make a new Robot we'll be making a new IORef Robot with newIORef.
mkRobot :: IO (IORef Robot)
mkRobot = mkRobotName >>= return . Robot >>= newIORef
When we read the name, we'll read the Robot with readIORef, then return the Robot's name
robotName :: IORef Robot -> IO String
robotName rr = readIORef rr >>= return . name
Finally, resetName will mutate the IORef. We'll make a new name for the robot with mkRobotName, then call modifyIORef with a function that sets the robot's name to the new name`.
resetName :: IORef Robot -> IO ()
resetName rr = mkRobotName >>=
\newName -> modifyIORef rr (\r -> r {name = newName})
The function \r -> r {name = newName} is the same as const (Robot newName), except that it will only change the name if we later decide to add some other field to the Robot data type.

Related

What are good Haskell conventions for managing deeply nested bracket patterns?

I am currently working with Haskell bindings to a HDF5 C library. Like many C libraries, this one uses many pointers in its functions calls.
The usual "best practice" Haskell functions for allocating and releasing C resources follow the bracket pattern, like alloca, withArray, etc. In using them, I often enter several nested brackets. For instance, here is a small excerpt for HDF5 bindings:
selectHyperslab rID dName = withDataset rID dName $ \dID -> do
v <- withDataspace 10 $ \dstDS -> do
srcDS <- c'H5Dget_space dID
dat <- alloca3 (0, 1, 10) $ \(start, stride, count) -> do
err <- c'H5Sselect_hyperslab srcDS c'H5S_SELECT_SET start stride count nullPtr
-- do some work ...
return value
alloca3 (a, b, c) action =
alloca $ \aP -> do
poke aP a
alloca $ \bP -> do
poke bP b
alloca $ \cP -> do
poke cP c
action (aP, bP, cP)
In the code above, the nested brackets are bracket functions I wrote withDataset, withDataspace, and alloca3, which I wrote to prevent the bracket nesting from going another 3 levels deep in the code. For C libraries with lots of resource acquisition calls and pointer arguments, coding with the standard bracket primitives can get unmanageable (which is why I wrote alloca3 to reduce the nesting.)
So generally, are there any best practices or coding techniques to help reduce the nesting of brackets when needing to allocate and deallocate many resources (such as with C calls)? The only alternative I have found is the ResourceT transformer, which from the tutorial looks like it is designed to make interleaving resource acquire/release possible, and not to simplify the bracket pattern.
Recently I was investigating this problem in Scala. The recurring pattern is (a -> IO r) -> IO r, where a given function is executed within some resource allocation context given a value of type a. And this is just ContT r IO a, which is readily available in Haskell. So we can write:
import Control.Monad
import Control.Monad.Cont
import Control.Monad.IO.Class
import Control.Exception (bracket)
import Foreign.Ptr (Ptr)
import Foreign.Storable (Storable)
import Foreign.Marshal.Alloc (alloca)
allocaC :: Storable a => ContT r IO (Ptr a)
allocaC = ContT alloca
bracketC :: IO a -> (a -> IO b) -> ContT r IO a
bracketC start end = ContT (bracket start end)
bracketC_ :: IO a -> IO b -> ContT r IO a
bracketC_ start end = ContT (bracket start (const end))
-- ...etc...
-- | Example:
main :: IO ()
main = flip runContT return $ do
bracketC_ (putStrLn "begin1") (putStrLn "end1")
bracketC_ (putStrLn "begin2") (putStrLn "end2")
liftIO $ putStrLn "..."
The standard monad/applicative functions allow you to simplify a lot of your code, for example:
allocAndPoke :: (Storable a) => a -> ContT r IO (Ptr a)
allocAndPoke x = allocaC >>= \ptr -> liftIO (poke ptr x) >> return ptr
-- With the monad alloca3 won't be probably needed, just as an example:
alloca3C (a, b, c) =
(,,) <$> allocAndPoke a <*> allocAndPoke b <*> allocAndPoke c
allocaManyC :: (Storable a) => [a] -> ContT r IO [Ptr a]
allocaManyC = mapM allocAndPoke

Using a random number to encrypt a message

I'm currently trying to encrypt a message (String) with the help of a random generated number in Haskell. The idea is to get the message, generate a random String of numbers with the same length (or more and then to take the length I need).
Then i want to perform some actions based on the ASCII representation and then return the encrypted String.
Unfortunately I'm not very versed with monads in Haskell, so it might be a very simple problem to solve, which I can't comprehend yet.
generateMyKey string = newStdGen >>= \x -> print $ concatMap show $ map abs $ rs x
where rs x = randomlist (length string) x
randomlist :: Int -> StdGen -> [Int]
randomlist n = take n . unfoldr (Just . random)
So the problem is I get an IO() out of getMyKey, but I want to have a String, or atleast a IO(String) to perform the encrypting mechanism.
Right now I'm getting a big list of positive (hence the abs + map) random numbers, but I can't access them.
There are two basic ways to go about this (and one more complicated but easier). If you're just using System.Random, you can generate random numbers in two ways, either by accepting a StdGen and staying pure, or using the OS's random generator and staying in IO. At some point, you'll have to make a call to the OS's random functionality to get a seed or value, but this can happen in main far away from your actual code.
To keep your functions pure, you'll need to pass around a StdGen and use the functions
random :: Random a => StdGen -> (a, StdGen)
randoms :: Random a => StdGen -> [a]
(Note: I've substituted RandomGen g => g for StdGen, there's no need to write a custom RandomGen instance for your case)
You can then write your function generateMyKey as
randomList :: Int -> StdGen -> [Int]
randomList n = take n . randoms
generateMyKey :: String -> StdGen -> String
generateMyKey text g
= concatMap show
$ map abs
$ randomList (length text) g
And this entirely avoids having to live in IO. Be wary, though, if you re-use the same g, you'll generate the same random list each time. We can avoid this by using IO and its related functions
randomList :: Int -> IO [Int]
randomList 0 = return []
randomList n = do
first <- randomIO
rest <- randomList (n - 1) -- Recursively generate the rest
return $ first : rest
generateMyKey :: String -> IO String
generateMyKey text = do
key <- randomList (length text)
return $ concatMap show $ map abs $ key
This will come with a performance hit, and now we've lost the ability to generate the same key repeatedly, making it difficult to test our functions reliably! How can we reconcile these two approaches?
Enter the package MonadRandom. This package provides a monad (and monad transformer, but you don't need to worry about that right now) that lets you abstract away how you generate random numbers so that you can choose how you want to run your code in different circumstances. If you want IO, you can use IO. If you want to supply a seed, you can supply a seed. It's very handy. You can install it with cabal install MonadRandom and use it as
import Control.Monad.Random
randomList :: Int -> Rand StdGen [Int]
randomList n = fmap (take n) getRandoms
generateMyKey :: String -> Rand StdGen String
generateMyKey text = do
key <- randomList (length text)
return $ concatMap show $ map abs $ key
Our generateMyKey code is even the same as the IO version other than the type signature!
Now to run it.
main :: IO ()
main = do
-- Entirely impure, have it automatically grab a StdGen from IO for us
ioVersion <- evalRandIO $ generateMyKey "password"
-- Make a StdGen that stays the same every time we run the program, useful for testing
let pureStdGen = mkStdGen 12345
pureVersion = evalRand (generateMyKey "password") pureStdGen
-- Get a StdGen from the system, but still evaluate it purely
ioStdGen <- getStdGen
let pureVersion2 = evalRand (generateMyKey "password") ioStdGen
-- Print out all three versions
putStrLn ioVersion
putStrLn pureVersion
putStrLn pureVersion2
There are a number of solutions to this problem, but at first glance it might seem that you need to have your entire program operate in the IO monad, but you don't! The entry (/exit) point of your program is the only place that needs to see IO -- you can factor out any transformations on your random list into pure functions, i.e:
import Data.List
import System.Random
generateMyKey :: String -> IO String
generateMyKey string = do
x <- newStdGen
let rs = randomlist (length string)
return $ concatMap show $ map abs $ rs x
randomlist :: Int -> StdGen -> [Int]
randomlist n = take n . unfoldr (Just . random)
change :: String -> String
change = reverse -- for example
main :: IO ()
main = do
key <- generateMyKey "what"
putStrLn $ change key
generateMyKey is identical to what you had before, except that it's written in do notation now and is returning the string instead of just printing it. This allows us to "pull out" a random key from inside the IO monad and transform it with regular pure functions, like change, for example. This allows you to reason about the pure functions as normal, while still pulling in your values from IO.

Get value from IO rather than the computation itself

Being quite new to Haskell, I'm currently trying to improve my skills by writing an interpreter for a simple imperative toy language.
One of the expressions in this language is input, which reads a single integer from standard input. However, when I assign the value of this expression to a variable and then use this variable later, it seems ot me that I actually stored the computation of reading a value rather the read value itself. This means that e.g. the statements
x = input;
y = x + x;
will cause the interpreter to invoke the input procedure three times rather than one.
Internally in the evaluator module, I use a Map to store the values of variables. Because I need to deal with IO, this gets wrapped in an IO monad, as immortalized in the following minimal example:
import qualified Data.Map as Map
type State = Map.Map String Int
type Op = Int -> Int -> Int
input :: String -> IO State -> IO State
input x state = do line <- getLine
st <- state
return $ Map.insert x (read line) st
get :: String -> IO State -> IO Int
get x state = do st <- state
return $ case Map.lookup x st of
Just i -> i
eval :: String -> Op -> String -> IO State -> IO Int
eval l op r state = do i <- get l state
j <- get r state
return $ op i j
main :: IO ()
main = do let state = return Map.empty
let state' = input "x" state
val <- eval "x" (+) "x" state'
putStrLn . show $ val
The second line in the main function simulates the assignment of x, while the third line simulates the evaluation of the binary + operator.
My question is: How do I get around this, such that the code above only inputs once? I suspect that it is the IO-wrapping that causes the problem, but as we're dealing with IO I see no way out of that..?
Remember that IO State is not an actual state, but instead the specification for an IO machine which eventually produces a State. Let's consider input as an IO-machine transformer
input :: String -> IO State -> IO State
input x state = do line <- getLine
st <- state
return $ Map.insert x (read line) st
Here, provided a machine for producing a state, we create a bigger machine which takes that passed state and adding a read from an input line. Again, to be clear, input name st is an IO-machine which is a slight modification of the IO-machine st.
Let's now examine get
get :: String -> IO State -> IO Int
get x state = do st <- state
return $ case Map.lookup x st of
Just i -> i
Here we have another IO-machine transformer. Given a name and an IO-machine which produces a State, get will produce an IO-machine which returns a number. Note again that get name st is fixed to always use the state produced by the (fixed, input) IO-machine st.
Let's combine these pieces in eval
eval :: String -> Op -> String -> IO State -> IO Int
eval l op r state = do i <- get l state
j <- get r state
return $ op i j
Here we call get l and get r each on the same IO-machine state and thus produce two (completely independent) IO-machines get l state and get r state. We then evaluate their IO effects one after another and return the op-combination of their results.
Let's examine the kinds of IO-machines built in main. In the first line we produce a trivial IO-machine, called state, written return Map.empty. This IO-machine, each time it's run, performs no side effects in order to return a fresh, blank Map.Map.
In the second line, we produce a new kind of IO-machine called state'. This IO-machine is based off of the state IO-machine, but it also requests an input line. Thus, to be clear, each time state' runs, a fresh Map.Map is generated and then an input line is read to read some Int, stored at "x".
It should be clear where this is going, but now when we examine the third line we see that we pass state', the IO-machine, into eval. Previously we stated that eval runs its input IO-machine twice, once for each name, and then combines the results. By this point it should be clear what's happening.
All together, we build a certain kind of machine which draws input and reads it as an integer, assigning it to a name in a blank Map.Map. We then build this IO-machine into a larger one which uses the first IO-machine twice, in two separate invocations, in order to collect data and combine it with an Op.
Finally, we run this eval machine using do notation (the (<-) arrow indicates running the machine). Clearly it should collect two separate lines.
So what do we really want to do? Well, we need to simulate ambient state in the IO monad, not just pass around Map.Maps. This is easy to do by using an IORef.
import Data.IORef
input :: IORef State -> String -> IO ()
input ref name = do
line <- getLine
modifyIORef ref (Map.insert name (read line))
eval :: IORef State -> Op -> String -> String -> IO Int
eval ref op l r = do
stateSnapshot <- readIORef ref
let Just i = Map.lookup l stateSnapshot
Just j = Map.lookup l stateSnapshot
return (op i j)
main = do
st <- newIORef Map.empty -- create a blank state, embedded into IO, not a value
input st "x" -- request input *once*
val <- eval st (+) "x" "x" -- compute the op
putStrLn . show $ val
It's fine to wrap your actions such as getLine in IO, but to me it looks like your problem is that you're trying to pass your state in the IO monad. Instead, I think this is probably time you get introduced to monad transformers and how they'll let you layer the IO and State monads to get the functionality of both in one.
Monad transformers are a pretty complex topic and it'll take a while to get to where you're comfortable with them (I'm still learning new things all the time about them), but they're a very useful tool when you need to layer multiple monads. You'll need the mtl library to follow this example.
First, imports
import qualified Data.Map as Map
import Control.Monad.State
Then types
type Op = Int -> Int -> Int
-- Renamed to not conflict with Control.Monad.State.State
type AppState = Map.Map String Int
type Interpreter a = StateT AppState IO a
Here Interpreter is the Monad in which we'll build our interpreter. We also need a way to run the interpreter
-- A utility function for kicking off an interpreter
runInterpreter :: Interpreter a -> IO a
runInterpreter interp = evalStateT interp Map.empty
I figured defaulting to Map.empty was sufficient.
Now, we can build our interpreter actions in our new monad. First we start with input. Instead of returning our new state, we just modify what is current in our map:
input :: String -> Interpreter ()
input x = do
-- IO actions have to be passed to liftIO
line <- liftIO getLine
-- modify is a member of the MonadState typeclass, which StateT implements
modify (Map.insert x (read line))
I had to rename get so that it didn't conflict with get from Control.Monad.State, but it does basically the same thing as before, it just takes our map and looks up that variable in it.
-- Had to rename to not conflict with Control.Monad.State.get
-- Also returns Maybe Int because it's safer
getVar :: String -> Interpreter (Maybe Int)
getVar x = do
-- get is a member of MonadState
vars <- get
return $ Map.lookup x vars
-- or
-- get x = fmap (Map.lookup x) get
Next, eval now just looks up each variable in our map, then uses liftM2 to keep the return value as Maybe Int. I prefer the safety of Maybe, but you can rewrite it if you prefer
eval :: String -> Op -> String -> Interpreter (Maybe Int)
eval l op r = do
i <- getVar l
j <- getVar r
-- liftM2 op :: Maybe Int -> Maybe Int -> Maybe Int
return $ liftM2 op i j
Finally, we write our sample program. It stores user input to the variable "x", adds it to itself, and prints out the result.
-- Now we can write our actions in our own monad
program :: Interpreter ()
program = do
input "x"
y <- eval "x" (+) "x"
case y of
Just y' -> liftIO $ putStrLn $ "y = " ++ show y'
Nothing -> liftIO $ putStrLn "Error!"
-- main is kept very simple
main :: IO ()
main = runInterpreter program
The basic idea is that there is a "base" monad, here IO, and these actions are "lifted" up to the "parent" monad, here StateT AppState. There is a typeclass implementation for the different state operations get, put, and modify in the MonadState typeclass, which StateT implements, and in order to lift IO actions there's a pre-made liftIO function that "lifts" IO actions to the parent monad. Now we don't have to worry about passing around our state explicitly, we can still perform IO, and it has even simplified the code!
I would recommend reading the Real World Haskell chapter on monad transformers to get a better feel for them. There are other useful ones as well, such as ErrorT for handling errors, ReaderT for static configuration, WriterT for aggregating results (usually used for logging), and many others. These can be layered into what is called a transformer stack, and it's not too difficult to make your own either.
Instead of passing an IO State, you can pass State and then use higher-level functions to deal with IO. You can go further and make get and eval free from side-effects:
input :: String -> State -> IO State
input x state = do
line <- getLine
return $ Map.insert x (read line) state
get :: String -> State -> Int
get x state = case Map.lookup x state of
Just i -> i
eval :: String -> Op -> String -> State -> Int
eval l op r state = let i = get l state
j = get r state
in op i j
main :: IO ()
main = do
let state = Map.empty
state' <- input "x" state
let val = eval "x" (+) "x" state'
putStrLn . show $ val
If you're actually building an interpreter, you'll presumably have a list of instructions to execute at some point.
This is my rough translation of your code (although I'm only a beginner myself)
import Data.Map (Map, empty, insert, (!))
import Control.Monad (foldM)
type ValMap = Map String Int
instrRead :: String -> ValMap -> IO ValMap
instrRead varname mem = do
putStr "Enter an int: "
line <- getLine
let intval = (read line)::Int
return $ insert varname intval mem
instrAdd :: String -> String -> String -> ValMap -> IO ValMap
instrAdd varname l r mem = do
return $ insert varname result mem
where result = (mem ! l) + (mem ! r)
apply :: ValMap -> (ValMap -> IO ValMap) -> IO ValMap
apply mem instr = instr mem
main = do
let mem0 = empty
let instructions = [ instrRead "x", instrAdd "y" "x" "x" ]
final <- foldM apply mem0 instructions
print (final ! "y")
putStrLn "done"
The foldM applies a function (apply) to a start value (mem0) and a list (instructions) but does so within a monad.

After reading a file I have IO [Char], but I need [IO Char]

I have a file number.txt which contains a large number and I read it into an IO String like this:
readNumber = readFile "number.txt" >>= return
In another function I want to create a list of Ints, one Int for each digit…
Lets assume the content of number.txt is:
1234567890
Then I want my function to return [1,2,3,4,5,6,7,8,9,0].
I tried severall versions with map, mapM(_), liftM, and, and, and, but I got several error messages everytime, which I was able to reduce to
Couldn't match expected type `[m0 Char]'
with actual type `IO String'
The last version I have on disk is the following:
module Main where
import Control.Monad
import Data.Char (digitToInt)
main = intify >>= putStrLn . show
readNumber = readFile "number.txt" >>= return
intify = mapM (liftM digitToInt) readNumber
So, as far as I understand the error, I need some function that takes IO [a] and returns [IO a], but I was not able to find such thing with hoogle… Only the other way round seemes to exist
In addition to the other great answers here, it's nice to talk about how to read [IO Char] versus IO [Char]. In particular, you'd call [IO Char] "an (immediate) list of (deferred) IO actions which produce Chars" and IO [Char] "a (deferred) IO action producing a list of Chars".
The important part is the location of "deferred" above---the major difference between a type IO a and a type a is that the former is best thought of as a set of instructions to be executed at runtime which eventually produce an a... while the latter is just that very a.
This phase distinction is key to understanding how IO values work. It's also worth noting that it can be very fluid within a program---functions like fmap or (>>=) allow us to peek behind the phase distinction. As an example, consider the following function
foo :: IO Int -- <-- our final result is an `IO` action
foo = fmap f getChar where -- <-- up here getChar is an `IO Char`, not a real one
f :: Char -> Int
f = Data.Char.ord -- <-- inside here we have a "real" `Char`
Here we build a deferred action (foo) by modifying a deferred action (getChar) by using a function which views a world that only comes into existence after our deferred IO action has run.
So let's tie this knot and get back to the question at hand. Why can't you turn an IO [Char] into an [IO Char] (in any meaningful way)? Well, if you're looking at a piece of code which has access to IO [Char] then the first thing you're going to want to do is sneak inside of that IO action
floob = do chars <- (getChars :: IO [Char])
...
where in the part left as ... we have access to chars :: [Char] because we've "stepped into" the IO action getChars. This means that by this point we've must have already run whatever runtime actions are required to generate that list of characters. We've let the cat out of the monad and we can't get it back in (in any meaningful way) since we can't go back and "unread" each individual character.
(Note: I keep saying "in any meaningful way" because we absolutely can put cats back into monads using return, but this won't let us go back in time and have never let them out in the first place. That ship has sailed.)
So how do we get a type [IO Char]? Well, we have to know (without running any IO) what kind of IO operations we'd like to do. For instance, we could write the following
replicate 10 getChar :: [IO Char]
and immediately do something like
take 5 (replicate 10 getChar)
without ever running an IO action---our list structure is immediately available and not deferred until the runtime has a chance to get to it. But note that we must know exactly the structure of the IO actions we'd like to perform in order to create a type [IO Char]. That said, we could use yet another level of IO to peek at the real world in order to determine the parameters of our action
do len <- (figureOutLengthOfReadWithoutActuallyReading :: IO Int)
return $ replicate len getChar
and this fragment has type IO [IO Char]. To run it we have to step through IO twice, we have to let the runtime perform two IO actions, first to determine the length and then second to actually act on our list of IO Char actions.
sequence :: [IO a] -> IO [a]
The above function, sequence, is a common way to execute some structure containing a, well, sequence of IO actions. We can use that to do our two-phase read
twoPhase :: IO [Char]
twoPhase = do len <- (figureOutLengthOfReadWithoutActuallyReading :: IO Int)
putStrLn ("About to read " ++ show len ++ " characters")
sequence (replicate len getChar)
>>> twoPhase
Determining length of read
About to read 22 characters
let me write 22 charac"let me write 22 charac"
You got some things mixed up:
readNumber = readFile "number.txt" >>= return
the return is unecessary, just leave it out.
Here is a working version:
module Main where
import Data.Char (digitToInt)
main :: IO ()
main = intify >>= print
readNumber :: IO String
readNumber = readFile "number.txt"
intify :: IO [Int]
intify = fmap (map digitToInt) readNumber
Such a function can't exists, because you would be able to evaluate the length of the list without ever invoking any IO.
What is possible is this:
imbue' :: IO [a] -> IO [IO a]
imbue' = fmap $ map return
Which of course generalises to
imbue :: (Functor f, Monad m) => m (f a) -> m (f (m a))
imbue = liftM $ fmap return
You can then do, say,
quun :: IO [Char]
bar :: [IO Char] -> IO Y
main = do
actsList <- imbue quun
y <- bar actsLists
...
Only, the whole thing about using [IO Char] is pointless: it's completely equivalent to the much more straightforward way of working only with lists of "pure values", only using the IO monad "outside"; how to do that is shown in Markus's answer.
Do you really need many different helper functions? Because you may write just
main = do
file <- readFile "number.txt"
let digits = map digitToInt file
print digits
or, if you really need to separate them, try to minimize the amount of IO signatures:
readNumber = readFile "number.txt" --Will be IO String
intify = map digitToInt --Will be String -> [Int], not IO
main = readNumber >>= print . intify

How do I avoid memory problems when writing to file using the Writer monad?

I am building some moderately large DIMACS files, however with the method used below the memory usage is rather large compared to the size of the files generated, and on some of the larger files I need to generate I run in to out of memory problems.
import Control.Monad.State.Strict
import Control.Monad.Writer.Strict
import qualified Data.ByteString.Lazy.Char8 as B
import Control.Monad
import qualified Text.Show.ByteString as BS
import Data.List
main = printDIMACS "test.cnf" test
test = do
xs <- freshs 100000
forM_ (zip xs (tail xs))
(\(x,y) -> addAll [[negate x, negate y],[x,y]])
type Var = Int
type Clause = [Var]
data DIMACSS = DS{
nextFresh :: Int,
numClauses :: Int
} deriving (Show)
type DIMACSM a = StateT DIMACSS (Writer B.ByteString) a
freshs :: Int -> DIMACSM [Var]
freshs i = do
next <- gets nextFresh
let toRet = [next..next+i-1]
modify (\s -> s{nextFresh = next+i})
return toRet
fresh :: DIMACSM Int
fresh = do
i <- gets nextFresh
modify (\s -> s{nextFresh = i+1})
return i
addAll :: [Clause] -> DIMACSM ()
addAll c = do
tell
(B.concat .
intersperse (B.pack " 0\n") .
map (B.unwords . map BS.show) $ c)
tell (B.pack " 0\n")
modify (\s -> s{numClauses = numClauses s + length c})
add h = addAll [h]
printDIMACS :: FilePath -> DIMACSM a -> IO ()
printDIMACS file f = do
writeFile file ""
appendFile file (concat ["p cnf ", show i, " ", show j, "\n"])
B.appendFile file b
where
(s,b) = runWriter (execStateT f (DS 1 0))
i = nextFresh s - 1
j = numClauses s
I would like to keep the monadic building of clauses since it is very handy, but I need to overcome the memory problem. How do I optimize the above program so that it doesn't use too much memory?
If you want good memory behavior, you need to make sure that you write out the clauses as you generate them, instead of collecting them in memory and dumping them as such, either using lazyness or a more explicit approach such as conduits, enumerators, pipes or the like.
The main obstacle to that approach is that the DIMACS format expects the number of clauses and variables in the header. This prevents the naive implementation from being sufficiently lazy. There are two possibilities:
The pragmatic one is to write the clauses first to a temporary location. After that the numbers are known, so you write them to the real file and append the contents of the temporary file.
The prettier approach is possible if the generation of clauses has no side effects (besides the effects offered by your DIMACSM monad) and is sufficiently fast: Run it twice, first throwing away the clauses and just calculating the numbers, print the header line, run the generator again; now printing the clauses.
(This is from my experience with implementing SAT-Britney, where I took the second approach, because it fitted better with other requirements in that context.)
Also, in your code, addAll is not lazy enough: The list c needs to be retained even after writing (in the MonadWriter sense) the clauses. This is another space leak. I suggest you implement add as the primitive operation and then addAll = mapM_ add.
As explained in Joachim Breitner's answer the problem was that DIMACSM was not lazy enough, both because the strict versions of the monads was used and because the number of variables and clauses are needed before the ByteString can be written to the file. The solution is to use the lazy versions of the Monads and execute them twice. It turns out that it is also necessary to have WriterT be the outer monad:
import Control.Monad.State
import Control.Monad.Writer
...
type DIMACSM a = WriterT B.ByteString (State DIMACSS) a
...
printDIMACS :: FilePath -> DIMACSM a -> IO ()
printDIMACS file f = do
writeFile file ""
appendFile file (concat ["p cnf ", show i, " ", show j, "\n"])
B.appendFile file b
where
s = execState (execWriterT f) (DS 1 0)
b = evalState (execWriterT f) (DS 1 0)
i = nextFresh s - 1
j = numClauses s

Resources