Monad Transformers vs Passing parameters to functions - haskell

I am new to Haskell but understand how Monad Transformers can be used.
Yet, I still have difficulties grabbing their claimed advantage over passing parameters to function calls.
Based on the wiki Monad Transformers Explained, we basically have a Config Object defined as
data Config = Config Foo Bar Baz
and to pass it around, instead of writing functions with this signature
client_func :: Config -> IO ()
we use a ReaderT Monad Transformer and change the signature to
client_func :: ReaderT Config IO ()
pulling the Config is then just a call to ask.
The function call changes from client_func c to runReaderT client_func c
Fine.
But why does this make my application simpler ?
1- I suspect Monad Transformers have an interest when you stitch a lot of functions/modules together to form an application. But this is where is my understanding stops. Could someone please shed some light?
2- I could not find any documentation on how you write a large modular application in Haskell, where modules expose some form of API and hide their implementations, as well as (partly) hide their own States and Environments from the other modules. Any pointers please ?
(Edit: Real World Haskell states that ".. this approach [Monad Transformers] ... scales to bigger programs.", but there is no clear example demonstrating that claim)
EDIT Following Chris Taylor Answer Below
Chris perfectly explains why encapsulating Config, State,etc... in a Transformer Monad provides two benefits:
It prevents a higher level function from having to maintain in its type signature all the parameters required by the (sub)functions it calls but not required for its own use (see the getUserInput function)
and as a consequence makes higher level functions more resilient to a change of the content of the Transformer Monad (say you want to add a Writer to it to provide Logging in a lower level function)
This comes at the cost of changing the signature of all functions so that they run "in" the Transformer Monad.
So question 1 is fully covered. Thank you Chris.
Question 2 is now answered in this SO post

Let's say that we're writing a program that needs some configuration information in the following form:
data Config = C { logFile :: FileName }
One way to write the program is to explicitly pass the configuration around between functions. It would be nice if we only had to pass it to the functions that use it explicitly, but sadly we're not sure if a function might need to call another function that uses the configuration, so we're forced to pass it as a parameter everywhere (indeed, it tends to be the low-level functions that need to use the configuration, which forces us to pass it to all the high-level functions as well).
Let's write the program like that, and then we'll re-write it using the Reader monad and see what benefit we get.
Option 1. Explicit configuration passing
We end up with something like this:
readLog :: Config -> IO String
readLog (C logFile) = readFile logFile
writeLog :: Config -> String -> IO ()
writeLog (C logFile) message = do x <- readFile logFile
writeFile logFile $ x ++ message
getUserInput :: Config -> IO String
getUserInput config = do input <- getLine
writeLog config $ "Input: " ++ input
return input
runProgram :: Config -> IO ()
runProgram config = do input <- getUserInput config
putStrLn $ "You wrote: " ++ input
Notice that in the high level functions we have to pass config around all the time.
Option 2. Reader monad
An alternative is to rewrite using the Reader monad. This complicates the low level functions a bit:
type Program = ReaderT Config IO
readLog :: Program String
readLog = do C logFile <- ask
readFile logFile
writeLog :: String -> Program ()
writeLog message = do C logFile <- ask
x <- readFile logFile
writeFile logFile $ x ++ message
But as our reward, the high level functions are simpler, because we never need to refer to the configuration file.
getUserInput :: Program String
getUserInput = do input <- getLine
writeLog $ "Input: " ++ input
return input
runProgram :: Program ()
runProgram = do input <- getUserInput
putStrLn $ "You wrote: " ++ input
Taking it further
We could re-write the type signatures of getUserInput and runProgram to be
getUserInput :: (MonadReader Config m, MonadIO m) => m String
runProgram :: (MonadReader Config m, MonadIO m) => m ()
which gives us a lot of flexibility for later, if we decide that we want to change the underlying Program type for any reason. For example, if we want to add modifiable state to our program we could redefine
data ProgramState = PS Int Int Int
type Program a = StateT ProgramState (ReaderT Config IO) a
and we don't have to modify getUserInput or runProgram at all - they'll continue to work fine.
N.B. I haven't type checked this post, let alone tried to run it. There may be errors!

Related

Monad and MonadIO for custom type

I have a Logger type of kind * -> * which can take any type and log the value in a file. I am trying to implement this in a monadic way so that I log and keep working the same. My code looks like
import Control.Applicative
import Control.Monad
import System.IO
import Control.Monad.IO.Class
instance Functor Logger where
fmap = liftM
instance Applicative Logger where
pure = return
(<*>) = ap
newtype Logger a = Logger a deriving (Show)
instance Monad (Logger) where
return = Logger
Logger logStr >>= f = f logStr
instance MonadIO (Logger) where
liftIO a = do
b <- liftIO a
return b
logContent :: (Show a) => a -> Logger a
logContent a = do
b <- liftIO $ logContent2 a
return b
logContent2 :: (Show a) => a -> IO a
logContent2 a = do
fHandle <- openFile "test.log" AppendMode
hPrint fHandle a
hClose fHandle
return (a)
The liftIO function goes on endless loop as it calls itself. I am not able to do b <- a either. Can someone help on getting MonadIO implementation right ?
As noted in the comments, I think you've misunderstood what MonadIO and liftIO do.
These typeclasses and functions come from mtl library. Rather unfortunately, mtl stands for "monad transformer library", but mtl is not a monad transformer library. Rather, mtl is a set of typeclasses that allow you to take a monad that --- and this is important --- already has a particular type of functionality and provide that monad with a consistent interface around that functionality. This ends up being really useful for working with actual monad transformers. That's because mtl allows you to use tell and ask and put to access the Writer, Reader, and State functionality of your monad transformer stack in a consistent way.
Separately from this transformer business, if you already have a custom monad, say that supports arbitrary IO and has State functionality, then you can define a MonadState instance to make the standard state operations (state, get, gets, put, modify) available for your custom monad, and you can define a MonadIO instance to allow an arbitrary IO action to be executed in your custom monad using liftIO. However, none of these typeclasses are capable of adding functionality to a monad that it doesn't already have. In particular, you can't transform an arbitrary monadic action m a into an IO a using a MonadIO instance.
Note that the transformers package contains types that are capable of adding functionality to a monad that it doesn't already have (e.g., adding reader or writer functionality), but there is no transformer to add IO to an arbitrary monad. Such a transformer would be impossible (without unsafe or nonterminating operations).
Also note that the signature for liftIO :: MonadIO m => IO a -> m a puts a MonadIO constraint on m, and this isn't just a trivial constraint. It actually indicates that liftIO only works for monads m that already have IO functionality, so either m is the IO monad, or it's a monad stack with IO at its base. Your Logger example doesn't have IO functionality and so can't have a (sensible) MonadIO instance.
Getting back to your specific problem, it's actually a little bit hard to steer you right here without knowing exactly what you're trying to do. If you just want to add file-based logging to an existing IO computation, then defining a new transformer stack will probably do the trick:
type LogIO = ReaderT Handle IO
logger :: (Show a) => a -> LogIO ()
logger a = do
h <- ask
liftIO $ hPrint h a
runLogIO :: LogIO a -> FilePath -> IO a
runLogIO act fp = withFile fp AppendMode $ \h -> runReaderT act h
and you can write things like:
main :: IO ()
main = runLogIO start "test.log"
start :: LogIO ()
start = do
logger "Starting program"
liftIO . putStrLn $ "Please enter your name:"
n <- liftIO $ getLine
logger n
liftIO . putStrLn $ "Hello, " ++ n
logger "Ending program"
The need to add liftIO calls when using IO actions within the LogIO monad is ugly but largely unavoidable.
This solution would also work for adding file-based logging to pure computations, with the understanding that you have to convert them to IO computations anyway if you want to safely log to a file.
The more general solution is to define your own monad transformer (not merely your own monad), like LoggerT m, together with an associated MonadLogger type class that will add file-based logging to to any IO-capable monad stack. The idea would be that you could then create arbitrary custom monad stacks:
type MyMonad = StateT Int (LoggerT IO)
and then write code that mixes monadic computations from different layers (like mixing state computations and file-based logging):
newSym :: String -> MyMonad String
newSym pfx = do
n <- get
logger (pfx, n)
put (n+1)
return $ pfx ++ show n
Is this what you what you're trying to do? If not, maybe you could describe, either here or in a new question, how you're trying to add logging to some example code.

Reader Monad - explanation of trivial case

I have been trying to get to grips with the reader monad and came across this tutorial. In it, the author presents this example:
example2 :: String -> String
example2 context = runReader (greet "James" >>= end) context
where
greet :: String -> Reader String String
greet name = do
greeting <- ask
return $ greeting ++ ", " ++ name
end :: String -> Reader String String
end input = do
isHello <- asks (== "Hello")
return $ input ++ if isHello then "!" else "."
I know that this is a trivial example that shows the mechanics, but I am trying to figure out why it would be better than doing something like:
example3 :: String -> String
example3 = end <*> (greet "James")
where
greet name input = input ++ ", " ++ name
end input = if input == "Hello" then (++ "!") else (++ ".")
Reader isn't often used by itself in real code. As you have observed, it's not really better than just passing an extra argument to your functions. However, as part of a monad transformer it is an excellent way to pass configuration parameters through your application. Usually this is done by adding a MonadReader constraint to any function that needs access to configuration.
Here's an attempt at a more real-world example:
data Config = Config
{ databaseConnection :: Connection
, ... other configuration stuff
}
getUser :: (MonadReader Config m, MonadIO m) => UserKey -> m User
getUser x = do
db <- asks databaseConnection
.... fetch user from database using the connection
then your main would look something like:
main :: IO ()
main = do
config <- .... create the configuration
user <- runReaderT (getUser (UserKey 42)) config
print user
dfeuer, chi and user2297560 are right in that "Reader isn't often used by itself in real code". It is worth noting, though, that there is next to no essential difference between what you do in the second snippet in the question and actually using Reader as a monad: the function functor is just Reader without the wrappers, and the Monad and Applicative instances for both of them are equivalent. By the way, outside of highly polymorphic code1, the typical motivation for using the function Applicative is making code more pointfree. In that case, moderation is highly advisable. For instance, as far as my own taste goes, this...
(&&) <$> isFoo <*> isBar
... is fine (and sometimes it might even read nicer than the pointful spelling), while this...
end <*> greet "James"
... is just confusing.
Footnotes
For instance, as Carl points out in a comment, it and the related instances can be useful in...
[...] places where you have code that's polymorphic in a type constructor and your use case is passing an argument in. This can come up when using the polymorphic types offered by lenses, for instance.

How to limit code changes when introducing state?

I am a senior C/C++/Java/Assembler programmer and I have been always fascinated by the pure functional programming paradigm. From time to time, I try to implement something useful with it, e.g., a small tool, but often I quickly reach a point where I realize that I (and my tool, too) would be much faster in a non-pure language. It's probably because I have much more experience with imperative programming languages with thousands of idoms, patterns and typical solution approaches in my head.
Here is one of those situations. I have encountered it several times and I hope you guys can help me.
Let's assume I write a tool to simulate communication networks. One important task is the generation of network packets. The generation is quite complex, consisting of dozens of functions and configuration parameters, but at the end there is one master function and because I find it useful I always write down the signature:
generatePackets :: Configuration -> [Packet]
However, after a while I notice that it would be great if the packet generation would have some kind of random behavior deep down in one of the many sub-functions of the generation process. Since I need a random number generator for that (and I also need it at some other places in the code), this means to manually change dozens of signatures to something like
f :: Configuration -> RNGState [Packet]
with
type RNGState = State StdGen
I understand the "mathematical" necessity (no states) behind this. My question is on a higher (?) level: How would an experienced Haskell programmer have approached this situation? What kind of design pattern or work flow would have avoided the extra work later?
I have never worked with an experienced Haskell programmer. Maybe you will tell me that you never write signatures because you have to change them too often afterwards, or that you give all your functions a state monad, "just in case" :)
One approach that I've been fairly successful with is using a monad transformer stack. This lets you both add new effects when needed and also track the effects required by particular functions.
Here's a really simple example.
import Control.Monad.State
import Control.Monad.Reader
data Config = Config { v1 :: Int, v2 :: Int }
-- the type of the entire program describes all the effects that it can do
type Program = StateT Int (ReaderT Config IO) ()
runProgram program config startState =
runReaderT (runStateT program startState) config
-- doesn't use configuration values. doesn't do IO
step1 :: MonadState Int m => m ()
step1 = get >>= \x -> put (x+1)
-- can use configuration and change state, but can't do IO
step2 :: (MonadReader Config m, MonadState Int m) => m ()
step2 = do
x <- asks v1
y <- get
put (x+y)
-- can use configuration and do IO, but won't touch our internal state
step3 :: (MonadReader Config m, MonadIO m) => m ()
step3 = do
x <- asks v2
liftIO $ putStrLn ("the value of v2 is " ++ show x)
program :: Program
program = step1 >> step2 >> step3
main :: IO ()
main = do
let config = Config { v1 = 42, v2 = 123 }
startState = 17
result <- runProgram program config startState
return ()
Now if we want to add another effect:
step4 :: MonadWriter String m => m()
step4 = tell "done!"
program :: Program
program = step1 >> step2 >> step3 >> step4
Just adjust Program and runProgram
type Program = StateT Int (ReaderT Config (WriterT String IO)) ()
runProgram program config startState =
runWriterT $ runReaderT (runStateT program startState) config
To summarize, this approach lets us decompose a program in a way that tracks effects but also allows adding new effects as needed without a huge amount of refactoring.
edit:
It's come to my attention that I didn't answer the question about what to do for code that's already written. In many cases, it's not too difficult to change pure code into this style:
computation :: Double -> Double -> Double
computation x y = x + y
becomes
computation :: Monad m => Double -> Double -> m Double
computation x y = return (x + y)
This function will now work for any monad, but doesn't have access to any extra effects. Specifically, if we add another monad transformer to Program, then computation will still work.

Can I create a function in Haskell that will encapsulate reading data from file and returning me a simple list of data?

Consider the code below taken from a working example I've built to help me learn Haskell. This code parses a CSV file containing stock quotes downloaded from Yahoo into a nice simple list of bars with which I can then work.
My question: how can I write a function that will take a file name as its parameter and return an OHLCBarList so that the first four lines inside main can be properly encapsulated?
In other words, how can I implement (without getting all sorts of errors about IO stuff) the function whose type would be
getBarsFromFile :: Filename -> OHLCBarList
so that the grunt work that was being done in the first four lines of main can be properly encapsulated?
I've tried to do this myself but with my limited Haskell knowledge, I'm failing miserably.
import qualified Data.ByteString as BS
type Filename = String
getContentsOfFile :: Filename -> IO BS.ByteString
barParser :: Parser Bar
barParser = do
time <- timeParser
char ','
open <- double
char ','
high <- double
char ','
low <- double
char ','
close <- double
char ','
volume <- decimal
char ','
return $ Bar Bar1Day time open high low close volume
type OHLCBar = (UTCTime, Double, Double, Double, Double)
type OHLCBarList = [OHLCBar]
barsToBarList :: [Either String Bar] -> OHLCBarList
main :: IO ()
main = do
contents :: C.ByteString <- getContentsOfFile "PriceData/Daily/yhoo1.csv" --PriceData/Daily/Yhoo.csv"
let lineList :: [C.ByteString] = C.lines contents -- Break the contents into a list of lines
let bars :: [Either String Bar] = map (parseOnly barParser) lineList -- Using the attoparsec
let ohlcBarList :: OHLCBarList = barsToBarList bars -- Now I have a nice simple list of tuples with which to work
--- Now I can do simple operations like
print $ ohlcBarList !! 0
If you really want your function to have type Filename -> OHLCBarList, it can't be done.* Reading the contents of a file is an IO operation, and Haskell's IO monad is specifically designed so that values in the IO monad can never leave. If this restriction were broken, it would (in general) mess with a lot of things. Instead of doing this, you have two options: make the type of getBarsFromFile be Filename -> IO OHLCBarList — thus essentially copying the first four lines of main — or write a function with type C.ByteString -> OHLCBarList that the output of getContentsOfFile can be piped through to encapsulate lines 2 through 4 of main.
* Technically, it can be done, but you really, really, really shouldn't even try, especially if you're new to Haskell.
Others have explained that the correct type of your function has to be Filename -> IO OHLCBarList, I'd like to try and give you some insight as to why the compiler imposes this draconian measure on you.
Imperative programming is all about managing state: "do certain operations to certain bits of memory in sequence". When they grow large, procedural programs become brittle; we need a way of limiting the scope of state changes. OO programs encapsulate state in classes but the paradigm is not fundamentally different: you can call the same method twice and get different results. The output of the method depends on the (hidden) state of the object.
Functional programming goes all the way and bans mutable state entirely. A Haskell function, when called with certain inputs, will always produce the same output. Simple examples of
pure functions are mathematical operators like + and *, or most of the list-processing functions like map. Pure functions are all about the inputs and outputs, not managing internal state.
This allows the compiler to be very smart in optimising your program (for example, it can safely collapse duplicated code for you), and helps the programmer not to make mistakes: you can't put the system in an invalid state if there is none! We like pure functions.
The exception to the rule is IO. Code that performs IO is impure by definition: you could call getLine a hundred times and never get the same result, because it depends on what the user typed. Haskell handles this using the type system: all impure functions are marred with the IO type. IO can be thought of as a dependency on the state of the real world, sort of like World -> (NewWorld, a)
To summarise: pure functions are good because they are easy to reason about; this is why Haskell makes functions pure by default. Any impure code has to be labelled as such with an IO type signature; this tells the compiler and the reader to be careful with this function. So your function which reads from a file (a fundamentally impure action) but returns a pure value can't exist.
Addendum in response to your comment
You can still write pure functions to operate on data that was obtained impurely. Consider the following straw-man:
main :: IO ()
main = do
putStrLn "Enter the numbers you want me to process, separated by spaces"
line <- getLine
let numberStrings = words line
let numbers = map read numberStrings
putStrLn $ "The result of the calculation is " ++ (show $ foldr1 (*) numbers + 10)
Lots of code inside IO here. Let's extract some functions:
main :: IO ()
main = do
putStrLn "Enter the numbers you want me to process, separated by spaces"
result <- fmap processLine getLine -- fmap :: (a -> b) -> IO a -> IO b
-- runs an impure result through a pure function
-- without leaving IO
putStrLn $ "The result of the calculation is " ++ result
processLine :: String -> String -- look ma, no IO!
processLine = show . calculate . readNumbers
readNumbers :: String -> [Int]
readNumbers = map read . words
calculate :: [Int] -> Int
calculate numbers = product numbers + 10
product :: [Int] -> Int
product = foldr1 (*)
I've pulled logic out of main into pure functions which are easier to read, easier for the compiler to optimise, and more reusable (and so more testable). The program as a whole still lives inside IO because the data is obtained impurely (see the last part of this answer for a more thorough treatment of this argument). Impure data can be piped through pure functions using fmap and other combinators; you should try to put as little logic in main as possible.
Your code does seem to be most of the way there; as others have suggested you could extract lines 2-4 of your main into another function.
In other words, how can I implement (without getting all sorts of errors about IO stuff) the function whose type would be
getBarsFromFile :: Filename -> OHLCBarList
so that the grunt work that was being done in the first four lines of main can be properly encapsulated?
You cannot do this without getting all sorts of errors about IO stuff because this type for getBarsFromFile misses an IO. Probably that's what the errors about IO stuff are trying to tell you. Did you try understanding and fixing the errors?
In your situation, I would start by abstracting over the second to fourth line of your main in a function:
parseBars :: ByteString -> OHLCBarList
And then I would combine this function with getContentsOfFile to get:
getBarsFromFile :: FilePath -> IO OHLCBarList
This I would call in main.

Is there something better than unsafePerformIO for this....?

I've so far avoided ever needing unsafePerformIO, but this might have to change today.... I would like to see if the community agrees, or if someone has a better solution.
I have a library which needs to use some config data stored in a bunch of files. This data is guaranteed static (during the run), but needs to be in files that can (on very rare occasions) be edited by an end user who can not compile Haskell programs. (The details are uninportant, but think of "/etc/mime.types" as a pretty good approximation. It is a large almost static data file used throughout many programs).
If this weren't a library I would just use the IO monad.... But because it is a library which is called throughout my code, it literally forces a bubbling up of the IO monad through pretty much everything I have written in multiple modules! Although I need to do a one time read of the data files, this low level call is effetively pure, so this is a pretty unacceptable outcome.
FYI, I plan to also wrap the call in unsafeInterleaveIO, so that only files that are needed will be loaded. My code will look something like this....
dataDir="<path to files>"
datafiles::[FilePath]
datafiles =
unsafePerformIO $
unsafeInterleaveIO $
map (dataDir </>)
<$> filter (not . ("." `isPrefixOf`))
<$> getDirectoryContents dataDir
fileData::[String]
fileData = unsafePerformIO $ unsafeInterleaveIO $ sequence $ readFile <$> datafiles
Given that the data read is referentially transparent, I am pretty sure that unsafePerformIO is safe (this has been discussed in many place, such as "Use of unsafePerformIO appropriate?"). Still, though, if there is a better way, I would love to hear about it.
UPDATE-
In response to Anupam's comment....
There are two reasons why I can't break up the lib into IO and non IO parts.
First, the amount of data is large, and I don't want to read it all into memory at once. Remember that IO is always read strictly.... This is the reason that I need to put in the unsafeInterleaveIO call, to make it lazy. IMHO, once you use unsafeInterleaveIO, you might as well use unsafePerformIO, as the risk is already there.
Second, breaking out the IO specific parts just substitutes the bubbling up of the IO monad with the bubbling up of the IO read code, as well as the passing around of the data (I might actually choose to pass around the data using the state monad anyway, so it really isn't an improvement to substitute the IO monad for the state monad everywhere). This wouldn't be so bad if the low level function itself wasn't effectively pure (ie- think of my /etc/mime.types example above, and imagine a Haskell extensionToMimeType function, which is basically pure, but needs to get the database data from the file.... Suddenly everything from low to high in the stack needs to call or pass through a readMimeData::IO String. Why should each main even need to care about the library choice of a submodule many levels deep?).
I agree with Anupam Jain, you would be better off reading these data files at a somewhat higher level, in IO, and then passing the data in them through the rest of your program purely.
You could, for example, put the functions that need the results of fileData into Reader [String], so that they can just ask for the results as needed (or some Reader Config, where Config holds these strings and whatever else you need).
A sketch of what I'm suggesting follows:
type AppResult = String
fileData :: IO [String]
fileData = undefined -- read the files
myApp :: String -> Reader [String] AppResult
myApp s = do
files <- ask
return undefined -- do whatever with s and config
main = do
config <- fileData
return $ runReader (myApp "test") config
I gather that you don't want to read all the data at once, because that would be costly. And maybe you don't really know up-front what files you will need to load, so loading all of them at the start would be wasteful.
Here's an attempt at a solution. It requires you to work inside a free monad and relegate the side-effecting operations to an interpreter. Some preliminary imports:
{-# LANGUAGE OverloadedStrings #-}
module Main where
import qualified Data.ByteString as B
import Data.Monoid
import Data.List
import Data.Functor.Compose
import Control.Applicative
import Control.Monad
import Control.Monad.Free
import System.IO
We define a functor for the free monad. It will offer a value p do the interpreter and continue the computation after receiving a value b:
type LazyLoad p b = Compose ((,) p) ((->) b)
A convenience function to request the loading of a file:
lazyLoad :: FilePath -> Free (LazyLoad FilePath B.ByteString) B.ByteString
lazyLoad path = liftF $ Compose (path,id)
A dummy interpreter function that reads "file contents" from stdin:
interpret :: Free (LazyLoad FilePath B.ByteString) a -> IO a
interpret = iterM $ \(Compose (path,next)) -> do
putStrLn $ "Enter the contents for file " <> path <> ":"
B.hGetLine stdin >>= next
Some silly example functions:
someComp :: B.ByteString -> B.ByteString
someComp b = "[" <> b <> "]"
takesAwhile :: Int
takesAwhile = foldl' (+) 0 $ take 400000000 $ intersperse (negate 1) $ repeat 1
An example program:
main :: IO ()
main = do
r <- interpret $ do
r1 <- someComp <$> lazyLoad "file1"
r2 <- return takesAwhile
if (r2 == 1)
then return r1
else someComp <$> lazyLoad "file2"
putStrLn . show $ r
When executed, this program will request a line, spend some time computing takesAwhile and only then request another line.
If want to allow different kinds of "requests", this solution could be extended with something like Data types à la carte so that each function only needs to know about about the precise effects it requires.
If you are content with allowing only one type of request, you could also use Clients and Servers from Pipes.Core instead of the free monad.

Resources