I’m writing a program that validates a complex data structure according to a number of complex rules. It inputs the data and outputs a list of messages indicating problems with the data.
Think along these lines:
import Control.Monad (when)
import Control.Monad.Writer (Writer, tell)
data Name = FullName String String | NickName String
data Person = Person { name :: Name, age :: Maybe Int }
data Severity = E | W | C -- error/warning/comment
data Message = Message { severity :: Severity, code :: Int, title :: String }
type Validator = Writer [Message]
report :: Severity -> Int -> String -> Validator ()
report s c d = tell [Message s c d]
checkPerson :: Person -> Validator ()
checkPerson person = do
case age person of
Nothing -> return ()
Just years -> do
when (years < 0) $ report E 1001 "negative age"
when (years > 200) $ report W 1002 "age too large"
case name person of
FullName firstName lastName -> do
when (null firstName) $ report E 1003 "empty first name"
NickName nick -> do
when (null nick) $ report E 1004 "empty nickname"
For documentation, I also want to compile a list of all messages this program can output. That is, I want to obtain the value:
[ Message E 1001 "negative age"
, Message W 1002 "age too large"
, Message E 1003 "empty first name"
, Message E 1004 "empty nickname"
]
I could move the messages out of checkPerson into some external data structure, but I like it when the messages are defined right at the spot where they are used.
I could (and probably should) extract the messages from the AST at compile time.
But the touted flexibility of Haskell made me thinking: can I achieve that at runtime? That is, can I write a function
allMessages :: (Person -> Validator ()) -> [Message]
such that allMessages checkPerson would give me the above list?
Of course, checkPerson and Validator need not stay the same.
I can almost (not quite) see how I could make a custom Validator monad with a “backdoor” that would run checkPerson in a sort of “reflection mode,” traversing all paths and returning all Messages encountered. I would have to write a custom when function that would know to ignore its first argument under some circumstances (which ones?). So, a kind of a DSL. Perhaps I could even emulate pattern matching?
So: can I do something like this, how, and what would I have to sacrifice?
Please feel free to suggest any solutions even if they do not exactly fit the above description.
This kind of half-static analysis is basically exactly what arrows were invented for. So let's make an arrow! Our arrow will basically be just a Writer action, but one that remembers what messages it might have spit out at any given moment. First, some boilerplate:
{-# LANGUAGE Arrows #-}
import Control.Arrow
import Control.Category
import Control.Monad.Writer
import Prelude hiding (id, (.))
Now, the type described above:
data Validator m a b = Validator
{ possibleMessages :: [m]
, action :: Kleisli (Writer m) a b
}
runValidator :: Validator m a b -> a -> Writer m b
runValidator = runKleisli . action
There are some straightforward instances to put in place. Of particular interest: the composition of two validators remembers messages from both the first action and the second action.
instance Monoid m => Category (Validator m) where
id = Validator [] id
Validator ms act . Validator ms' act' = Validator (ms ++ ms') (act . act')
instance Monoid m => Arrow (Validator m) where
arr f = Validator [] (arr f)
first (Validator ms act) = Validator ms (first act)
instance Monoid m => ArrowChoice (Validator m) where
left (Validator ms act) = Validator ms (left act)
All the magic is in the operation that actually lets you report something:
reportWhen :: Monoid m => m -> (a -> Bool) -> Validator m a ()
reportWhen m f = Validator [m] (Kleisli $ \a -> when (f a) (tell m))
This is the operation that notices when you're about to output a possible message, and makes a note of it. Let's copy your types and show how to code up checkPerson as an arrow. I've simplified your messages a little bit, but nothing important is different there -- just less syntactic overhead in the example.
type Message = String
data Name = FullName String String | NickName String -- http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/
data Person = Person { name :: Name, age :: Maybe Int }
checkPerson :: Validator Message Person ()
checkPerson = proc person -> do
case age person of
Nothing -> returnA -< ()
Just years -> do
"negative age" `reportWhen` (< 0) -< years
"age too large" `reportWhen` (>200) -< years
case name person of
FullName firstName lastName -> do
"empty first name" `reportWhen` null -< firstName
NickName nick -> do
"empty nickname" `reportWhen` null -< nick
I hope you'll agree that this syntax is not too far removed from what you originally wrote. Let's see it in action in ghci:
> runWriter (runValidator checkPerson (Person (NickName "") Nothing))
((),"empty nickname")
> possibleMessages checkPerson
["empty nickname","empty first name","age too large","negative age"]
Related
I am new to Haskell, so maybe I am missing some fundamental concepts here (or maybe failed to find the appropriate extension). I was wondering if there was a way to optimize or further abstract the following scenario. This code seems very redundant.
Let's say I have the following data classes:
data Person = Person
{ personName :: !String
, personAge :: !Int
} deriving Show
data Dog = Dog
{ dogName :: !String
, dogAge :: !Int
} deriving Show
Let's say I have a service and I'm only concerned with outputing records as strings. In reality, the strings will probably be JSON and the records fetched from the DB, but let's take a simpler case. I basically need a URL token to fetch an appropriate object (say, the string "dog" will get me a Dog, or even just the Haskell "show" string, without expressly declaring it as (value)::Dog).
I have attempted to implement this in several ways...the only thing that seems to work is the following:
data Creature = DogC Dog
| PersonC Person
deriving Show
fromString :: String -> Maybe Creature
fromString "dog" = Just $ DogC $ Dog "muffin" 8
fromString "person" = Just $ PersonC $ Person "John" 22
fromString _ = Nothing
main :: IO ()
main = do
putStrLn $ show $ fromString "dog"
I'm not entirely fond of the new type, nor the list of fromString declarations. And to benefit from the original data declarations, I would probably need to write a similarly tedious expression (eg, "fromCreature") to revert Creature back into my original types. This information might change, so I would probably need TH for a few of the declarations...
Is there a way around some of this? I fiddled with GADTs and classes, but both seemed to be dependent on type- rather than value- based polymorphism (A string identifier tends to cause issues with ambiguous instances). It would be nice to map the constructor to a string (Say, with Data.Map), but constructors often have different kinds.
Update
So, I went with an approach that isn't exactly relevant to the question I had asked, but it may be useful to someone. I did want to maintain some record types, but most didn't add much value and were getting in my way. The steps I had followed went something like:
Use a different/lower-level DB driver, that returns workable types (eg, [ColumnDef] and [[SQLValue]] instead of tuples and records...).
Create ToJSON instances for SQLValue -- most of the types were covered, except a few ByteString types, and I had to handle the conversion of SQLNull to Null. To maintain compatibility with some record types, my default handler looked like: toJSON = genericToJSON defaultOptions { sumEncoding = UnTaggedValue} The untagged value should allow one to read the JSON into defined data types (eg, Dog / Person ) if desired....
Given that column name is accessible from ColumnDef, I wrote an expression that zips [ColumnDef] and [SqlValue] to a list of Aeson-compatible key-value pairs, eg: toJsPairs :: [ColumnDef] -> [SqlValue] -> [(Text,Value)]
Then, I wrote an expression to fetch the JSON from a table name, which more or less serves as my "universal dispatcher." It references a list of authorized tables, so it's less crazy than it might sound.
The code looked a bit like this (using mysql-haskell).
{-# LANGUAGE OverloadedStrings #-}
import qualified Control.Applicative as App
import Database.MySQL.Base
import qualified System.IO.Streams as Streams
import Data.Aeson (FromJSON, ToJSON)
import Data.Aeson.Encode.Pretty (encodePretty)
import Data.Aeson.Types
import Data.Text.Encoding
import Data.String (fromString)
import Data.ByteString.Internal
import qualified Data.ByteString.Lazy.Internal as BLI
import Data.HashMap.Strict (fromList)
appConnectInfo = defaultConnectInfo {
ciUser = "some_user"
, ciPassword = "some_password"
, ciDatabase = "some_db"
}
instance FromJSON ByteString where
parseJSON (String s) = pure $ encodeUtf8 s
parseJSON _ = App.empty
instance ToJSON ByteString where
toJSON = String . decodeUtf8
instance ToJSON MySQLValue where
toJSON (MySQLNull) = Null
toJSON x = genericToJSON defaultOptions
{ sumEncoding = UntaggedValue } x
-- This expression should fail on dimensional mismatch.
-- It's stupidly lenient, but really dimensional mismatch should
-- never occur...
toJsPairs :: [ColumnDef] -> [MySQLValue] -> [(Text,Value)]
toJsPairs [] _ = []
toJsPairs _ [] = []
toJsPairs (x:xs) (y:ys) = (txt x, toJSON y):toJsPairs xs ys
where
-- Implement any modifications to the key names here
txt = decodeUtf8.columnName
listRecords :: String -> IO BLI.ByteString
listRecords tbl = do
conn <- connect appConnectInfo
-- This is clearly an injection vulnerability.
-- Implemented, however, the values for 'tbl' are intensely
-- vetted. This is just an example.
(defs, is) <- query_ conn $ fromString ( "SELECT * FROM `" ++ tbl ++ "` LIMIT 100")
rcrds <- Streams.toList is
return $ encodePretty $ map (jsnobj defs) rcrds
where
jsnobj :: [ColumnDef] -> [MySQLValue] -> Value
jsnobj defs x = Object $ fromList $ toJsPairs defs x
If what you want to consume at the end is json value - it might make sense to
represent result as json value using aeson library:
{-# LANGUAGE DeriveGeneric #-}
import Data.Aeson
import GHC.Generics
data Dog = Dog Int String deriving (Show, Generic)
data Cat = Cat Int String deriving (Show, Generic)
-- here I'm using instance derived with generics, but you can write one by
-- hands
instance ToJSON Dog
instance ToJSON Cat
-- actions to get stuff from db
getDog :: Monad m => Int -> m Dog
getDog i = return (Dog i (show i))
getCat :: Monad m => Int -> m Cat
getCat i = return (Cat i (show i))
-- dispatcher - picks which action to use
getAnimal :: Monad m => String -> Int -> m (Maybe Value)
getAnimal "dog" i = Just . toJSON <$> getDog i
getAnimal "cat" i = Just . toJSON <$> getCat i
getAnimal _ _ = return Nothing
main :: IO ()
main = do
getAnimal "dog" 2 >>= print
getAnimal "cat" 3 >>= print
getAnimal "chupakabra" 12 >>= print
High energy magic version
class Monad m => MonadAnimal m where
-- basically you want something that fetches extra argumets from HTTP or
-- whatevere, perform DB query and so on.
class Animal a where
animalName :: Proxy a -> String
animalGetter :: MonadAnimal m => m a
locateAnimals :: MonadAnimal m => Q [(String, m Value)]
locateAnimals -- implement using TH (reify function is your friend). It should look for
-- all the animal instances in scope and make a list from them with serialized
-- fetcher.
-- with that in place dispatcher should be easy to implement
I am a Haskell newbie, and I am thinking about how I can modularize my Rest application, which essentially passes around a ReaderT everywhere. I have devised a primitive working example of how to do that (below) using ExistentialQuantification. In a comment to a relevant answer, user MathematicalOrchid claimed something similar to be an anti-pattern. Is this an anti-pattern? In newbie terms, can you explain why if so and show a better alternative?
{-# LANGUAGE ExistentialQuantification #-}
import Control.Monad.Reader
import Control.Monad.Trans
import Data.List (intersect)
data Config = Config Int Bool
data User = Jane | John | Robot deriving (Show)
listUsers = [Jane, John, Robot]
class Database d where
search :: d -> String -> IO [User]
fetch :: d -> Int -> IO (Maybe User)
data LiveDb = LiveDb
instance Database LiveDb where
search d q = return $ filter ((q==) . intersect q . show) listUsers
fetch d i = return $ if i<3 then Just $ listUsers!!i else Nothing
data TestDb = TestDb
instance Database TestDb where
search _ _ = return [Robot]
fetch _ _ = return $ Just Robot
data Context = forall d. (Database d) => Context {
db :: d
, config :: Config
}
liveContext = Context { db = LiveDb, config = Config 123 True }
testContext = Context { db = TestDb, config = Config 123 True }
runApi :: String -> ReaderT Context IO String
runApi query = do
Context { db = db } <- ask
liftIO . fmap show $ search db query
main = do
let q = "Jn"
putStrLn $ "searching users for " ++ q
liveResult <- runReaderT (runApi q) liveContext
putStrLn $ "live result " ++ liveResult
testResult <- runReaderT (runApi q) testContext
putStrLn $ "test result " ++ testResult
Edit: a working example based on the accepted answer
import Control.Monad.Reader
import Control.Monad.Trans
import Data.List (intersect)
data Config = Config Int Bool
data User = Jane | John | Robot deriving (Show)
listUsers = [Jane, John, Robot]
data Database = Database {
search :: String -> IO [User]
, fetch :: Int -> IO (Maybe User)
}
liveDb :: Database
liveDb = Database search fetch where
search q = return $ filter ((q==) . intersect q . show) listUsers
fetch i = return $ if i<3 then Just $ listUsers!!i else Nothing
testDb :: Database
testDb = Database search fetch where
search _ = return [Robot]
fetch _ = return $ Just Robot
data Context = Context {
db :: Database
, config :: Config
}
liveContext = Context { db = liveDb, config = Config 123 True }
testContext = Context { db = testDb, config = Config 123 True }
runApi :: String -> ReaderT Context IO String
runApi query = do
d <- fmap db $ ask
liftIO . fmap show $ search d $ query
main = do
let q = "Jn"
putStrLn $ "searching users for " ++ q
liveResult <- runReaderT (runApi q) liveContext
putStrLn $ "live result " ++ liveResult
testResult <- runReaderT (runApi q) testContext
putStrLn $ "test result " ++ testResult
When you pattern-match on a Context, you get in the db field a value of a type that you can never know precisely; all you're allowed to know about it is that it's a Database instance, and thus you can use that class' methods with it. But that means that, from the point of view of the Context type, the existential d type affords it no more capabilities than this type does:
-- The "record of methods" pattern
data Database =
Database { search :: String -> IO [User]
, fetch :: Int -> IO (Maybe User)
}
liveDb :: Database
liveDb = Database search fetch
where search d q = return $ filter ((q==) . intersect q . show) listUsers
fetch d i = return $ if i<3 then Just $ listUsers!!i else Nothing
testDb :: Database
testDb = Database search fetch
where search _ _ = return [Robot]
fetch _ _ = return (Just Robot)
data Context =
Context { db :: Database
, config :: Config
}
That's the core argument against using existential types in the manner that you've done—there is a completely equivalent alternative that doesn't require existential types.
The argument against existential types is quite simple (and strong): often, you can avoid both the existential type and type class machinery, and use plain functions instead.
This is clearly the case where your class has the form
class D a where
method1 :: a -> T1
method2 :: a -> T2
-- ...
as in the posted Database example, since its instances can be replaced by values in a plain record type
data D = {
method1 :: T1
, method2 :: T2
-- ...
}
This is, essentially, the solution by #LuisCasillas .
However, note that the above translation relies on types T1,T2 not to depend on a. What if this is not the case? E.g. what if we had
class Database d where
search :: d -> String -> [User]
fetch :: d -> Int -> Maybe User
insert :: d -> User -> d
The above is a "pure" (no-IO) interface to a database, also allowing updates through insert. An instance could then be
data LiveDb = LiveDb [User]
instance Database LiveDb where
search (LiveDb d) q = filter ((q==) . intersect q . show) d
fetch (LiveDb d) i = case drop i d of [] -> Nothing ; (x:_) -> Just x
insert (LiveDb d) u = LiveDb (u:d)
Note that here we do use the parameter d, unlike in the original case where it was a placeholder.
Can we do without classes and existentials here?
data Database =
Database { search :: String -> [User]
, fetch :: Int -> Maybe User
, insert :: User -> Database
}
Notice that above we are returning an abstract Database in insert. This interface is more general than the existential-classy one, since it allows insert to change the underlying representation for the database. I.e., insert could move from a list-based representation to a tree-based one. This is like having insert acting from the existentially-quantified Database to itself, instead of from a concrete instance to itself.
Anyway, let's write LiveDb in the record-style way:
liveDb :: Database
liveDb = Database (search' listUsers) (fetch' listUsers) (insert' listUsers)
where search' d q = filter ((q==) . intersect q . show) d
fetch' d i = case drop i d of [] -> Nothing ; (x:_) -> Just x
insert' d u = Database (search' d') (fetch' d') (insert' d')
where d' = u:d
listUsers = [Jane, John, Robot]
Above I had to pass the underlying state d to each function, and in insert I had to update such state.
Overall, I find the above more involved than the instance Database LiveDb methods, which require no state-passing. Surely, we can apply a little refactoring and clarify the code:
makeLiveDb :: [User] -> Database
makeLiveDb d = Database search fetch insert
where search q = filter ((q==) . intersect q . show) d
fetch i = case drop i d of [] -> Nothing ; (x:_) -> Just x
insert u = makeLiveDb (u:d)
liveDb :: Database
liveDb = makeLiveDb [Jane, John, Robot]
This is a bit better, yet not as simple than the plain instance. There is no straightforward winner in this case, and which style to use is a matter of personal preference.
Personally, I stay away from existentially-quantified classes as much as possible, since in many, many cases they lose to much simpler approaches. However, I'm not dogmatic about them, and allow myself to use the "anti-pattern" when the alternative starts becoming too clumsy.
As an alternative, one could use an external function working at the abstract level, only:
data Database =
Database { search :: String -> [User]
-- let's neglect other methods for simplicity's sake
}
insert :: Database -> User -> Database
insert (Database s) u = Database s'
where s' str = s str ++ [ u | show u == str ] -- or something similar
The advantage of doing this is that insert works on the abstract Database, whatever its underlying data structure is. The disadvantage is that, in this way, insert can only access the database through its "methods", and can only work by building closures upon closures. If we also implemented a remove method, applying insert and delete many times will cause a larger and larger memory footprint, since remove can not remove the element from the underlying data structure, but can only build yet another closure which skips over the removed element. More pragmatically, it would be as if insert and remove simply appended to a log, and search scanned the log to see if the most recent action on an element was an insertion or a removal. This will not have a great performance.
I'm working on a project that, amongst other things, involves a database access layer. Pretty normal, really. In a previous project, a collaborator encouraged me to use the Free Monads concept for a database layer and so I did. Now I'm trying to decide in my new project what I gain.
In the previous project, I had an API that looked rather like this.
saveDocument :: RawDocument -> DBAction ()
getDocuments :: DocumentFilter -> DBAction [RawDocument]
getDocumentStats :: DBAction [(DocId, DocumentStats)]
etc. About twenty such public functions. To support them, I had the DBAction data structure:
data DBAction a =
SaveDocument RawDocument (DBAction a)
| GetDocuments DocumentFilter ([RawDocument] -> DBAction a)
| GetDocumentStats ([(DocId, DocumentStats)] -> DBAction a)
| Return a
And then a monad implementation:
instance Monad DBAction where
return = Return
SaveDocument doc k >>= f = SaveDocument doc (k >>= f)
GetDocuments df k >>= f = GetDocuments df (k >=> f)
And then the interpreter. And then the primitive functions that implement each of the different queries. Basically, I'm feeling that I had a huge amount of glue code.
In my current project (in a totally different field), I have instead gone with a pretty ordinary monad for my database:
newtype DBM err a = DBM (ReaderT DB (EitherT err IO) a)
deriving (Monad, MonadIO, MonadReader DB)
indexImage :: (ImageId, UTCTime) -> Exif -> Thumbnail -> DBM SaveError ()
removeImage :: DB -> ImageId -> DBM DeleteError ()
And so on. I figure that, ultimately, I'll have the "public" functions that represent high level concepts all running in the DBM context, and then I'll have the whole slew of functions that do the SQL/Haskell glue. This is, overall, feeling much better than the free monad system because I'm not writing a huge amount of boilerplate code to gains me nothing but the ability to swap out my interpreter.
Or...
Do I actually gain something else with the Free Monad + Interpreter pattern? If so, what?
As mentioned in the comments, it is frequently desirable to have some abstraction between code and database implementation. You can get much of the same abstraction as a free monad by defining a class for your DB Monad (I've taken a couple liberties here):
class (Monad m) => MonadImageDB m where
indexImage :: (ImageId, UTCTime) -> Exif -> Thumbnail -> m SaveResult
removeImage :: ImageId -> m DeleteResult
If your code is written against MonadImageDB m => instead of tightly coupled to DBM, you will be able to swap out the database and error handling without modifying your code.
Why would you use free instead? Because it "frees the interpreter as much as possible", meaning the intepreter is only committed to providing a monad, and nothing else. This means you are as unconstrained as possible writing monad instances to go with your code. Note that, for the free monad, you don't write your own instance for Monad, you get it for free. You'd write something like
data DBActionF next =
SaveDocument RawDocument ( next)
| GetDocuments DocumentFilter ([RawDocument] -> next)
| GetDocumentStats ([(DocId, DocumentStats)] -> next)
derive Functor DBActionF, and get the monad instance for Free DBActionF from the existing instance for Functor f => Monad (Free f).
For your example, it'd instead be:
data ImageActionF next =
IndexImage (ImageId, UTCTime) Exif Thumbnail (SaveResult -> next)
| RemoveImage ImageId (DeleteResult -> next)
You can also get the property "frees the interpreter as much as possible" for the type class. If you have no other constraints on m than the type class, MonadImageDB, and all of MonadImageDB's methods could be constructors for a Functor, then you get the same property. You can see this by implementing instance MonadImageDB (Free ImageActionF).
If you are going to mix your code with interactions with some other monad, you can get a monad transformer from free instead of a monad.
Choosing
You don't have to choose. You can convert back and forth between the representations. This example shows how to do so for actions with zero, one, or two arguments returning zero, one, or two results. First, a bit of boilerplate
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE FlexibleInstances #-}
import Control.Monad.Free
We have a type class
class Monad m => MonadAddDel m where
add :: String -> m Int
del :: Int -> m ()
set :: Int -> String -> m ()
add2 :: String -> String -> m (Int, Int)
nop :: m ()
and an equivalent functor representation
data AddDelF next
= Add String ( Int -> next)
| Del Int ( next)
| Set Int String ( next)
| Add2 String String (Int -> Int -> next)
| Nop ( next)
deriving (Functor)
Converting from the free representation to the type class replaces Pure with return, Free with >>=, Add with add, etc.
run :: MonadAddDel m => Free AddDelF a -> m a
run (Pure a) = return a
run (Free (Add x next)) = add x >>= run . next
run (Free (Del id next)) = del id >> run next
run (Free (Set id x next)) = set id x >> run next
run (Free (Add2 x y next)) = add2 x y >>= \ids -> run (next (fst ids) (snd ids))
run (Free (Nop next)) = nop >> run next
A MonadAddDel instance for the representation builds functions for the next arguments of the constructors using Pure.
instance MonadAddDel (Free AddDelF) where
add x = Free . (Add x ) $ Pure
del id = Free . (Del id ) $ Pure ()
set id x = Free . (Set id x) $ Pure ()
add2 x y = Free . (Add2 x y) $ \id1 id2 -> Pure (id1, id2)
nop = Free . Nop $ Pure ()
(Both of these have patterns we could extract for production code, the hard part to writing these generically would be dealing with the varying number of input and result arguments)
Coding against the type class uses only the MonadAddDel m => constraint, for example:
example1 :: MonadAddDel m => m ()
example1 = do
id <- add "Hi"
del id
nop
(id3, id4) <- add2 "Hello" "World"
set id4 "Again"
I was too lazy to write another instance for MonadAddDel besides the one I got from free, and too lazy to make an example besides by using the MonadAddDel type class.
If you like running example code, here's enough to see the example interpreted once (converting the type class representation to the free representation), and again after converting the free representation back to the type class representation again. Again, I'm too lazy to write the code twice.
debugInterpreter :: Free AddDelF a -> IO a
debugInterpreter = go 0
where
go n (Pure a) = return a
go n (Free (Add x next)) =
do
print $ "Adding " ++ x ++ " with id " ++ show n
go (n+1) (next n)
go n (Free (Del id next)) =
do
print $ "Deleting " ++ show id
go n next
go n (Free (Set id x next)) =
do
print $ "Setting " ++ show id ++ " to " ++ show x
go n next
go n (Free (Add2 x y next)) =
do
print $ "Adding " ++ x ++ " with id " ++ show n ++ " and " ++ y ++ " with id " ++ show (n+1)
go (n+2) (next n (n+1))
go n (Free (Nop next)) =
do
print "Nop"
go n next
main =
do
debugInterpreter example1
debugInterpreter . run $ example1
If I understood correctly, a monad is just the implementation of a bind >>= and a return operator following certain rules which basically compose 2 functions of different return types together. So, for example, those are equivalent:
putStrLn "What is your name?"
>>= (\_ -> getLine)
>>= (\name -> putStrLn ("Welcome, " ++ name ++ "!"))
(bind (putStrLn "What is your name?")
(bind
(\_ -> getLine)
(\name -> putStrLn ("Welcome, " ++ name ++ "!"))))
But if we strongly normalize this expression, the final result will be just:
(putStrLn ("Welcome, " ++ getline ++ "!"))
The first statement (putStrLn "What is your name?") is completely lost. Also, getLine looks like a function with no arguments, which is nonsense. So how does this work, and what is the actual definition of the >>= and return functions?
Your logical misstep is that you assume certain reduction rules hold which do not. In particular, you appear to be using
f >>= (\x -> g x) ==== g f
If that held then, yes, monads would be pretty silly: (>>=) would just be flip ($). But it doesn't, in general, hold at all. In fact, the very reason it doesn't hold is what provides monads an opportunity to be interesting.
For a little bit of further exploration, here's the one monad where (>>=) == flip ($) (basically) holds.
newtype Identity a = Identity { unIdentity :: a }
To make our equations work out, we'll have to use that Identity a ~ a. This isn't strictly true, obviously, but let's pretend. In particular, Identity . unIdentity and unIdentity . Identity are both identities, no-ops, and we can freely apply Identity or unIdentity however we like to make types match
instance Functor Identity where
fmap f (Identity a) = Identity (f a)
instance Monad Identity where
return a = Identity a -- this is a no-op
ida >>= f = f (unIdentity ida)
Now, in particular, we want to examine
ida :: Identity a
f :: a -> b
ida >>= Identity . f :: Identity b
===
Identity (f (unIdentity ida)) :: Identity b
and if we throw away the Identity/unIdentity noise and thus produce the knowledge that ida = Identity a for some a
Identity (f (unIdentity ida)) :: Identity b
===
Identity (f a) :: Identity b
=== ~
f a :: b
So, while (>>=) == flip ($) forms a certain basis of intuition about (>>=)... in any circumstance more interesting than the Identity monad (and all other monads are) it doesn't hold exactly.
Seems to be a misunderstanding of how evaluation in IO proceeds in Haskell. If you look at the type signature for (>>=):
λ: :t (>>=)
(>>=) :: Monad m => m a -> (a -> m b) -> m b
It takes a monadic value parameterized by a type a, and a function which accepts a type of the same type and applies it inside the function body yielding a monadic value of type b.
The IO monad itself is a rather degenerate monad since it has special status in Haskell's implementation. A type of IO a stands for a potentially impure computation which, when performed, does some IO before returning a value of type a.
The first statement (putStrLn "What is your name?") is completely
lost.
The misunderstanding about this statement is that the value of putStrLn :: String -> IO () does in fact lose it's value in some sense, or more precisely it just yields the unit type () to the bound function after performing the IO action of printing a string to the outside world.
But if we strongly normalize this expression, the final result will be
just: (putStrLn ("Welcome, " ++ getline ++ "!"))
It's best to think of getLine :: IO String as being a computation yielding a value instead of a value itself. In this case as well the function getLine is not itself substituted in but the result of the computation it performs is, which behaves like you expect it to: getting a value from stdin and printing it back out.
It has been so long til I asked that question! The simple answer is that, no, the term I posted does not reduce to putStrLn ("Welcome, " ++ getline ++ "!"). Instead, its normal form will have the shape bind foo (\ _ -> bind bar (\ _ -> ...)), i.e., a chain of lambdas, which holds the ordering information I was worried about.
[...] what are the actual definitions for the (>>=) and return functions?
From section 6.1.7 (page 75) of the Haskell 2010 report:
The IO type serves as a tag for operations (actions) that interact with the outside world. The IO type is abstract: no constructors are visible to the user. IO is an instance of the Monad and Functor classes.
the crucial point being:
The IO type is abstract: no constructors are visible to the user.
There are no actual (written in idiomatic Haskell) definitions - it's the implementors' choice as to which model to use: state-threading, continuations, direct effects, etc. (This wasn't always the case - I provide more details here :-) We also benefit, as we're able to choose the most convenient model for the investigation being made.
So how does this work [...]?
I will choose the direct-effect model, based on examples from Philip Wadler's How to Declare an Imperative:
(* page 26, modified *)
type 'a io = oi -> 'a
infix >>=
val >>= : 'a io * ('a -> 'b io) -> 'b io
fun m >>= k = fn Oblige => let
val x = m Oblige
val y = k x Oblige
in
y
end
val return : 'a -> 'a io
fun return x = fn Oblige => x
val putc : char -> unit io
fun putc c = fn Oblige => putcML c
val getc : char io
val getc = fn Oblige => getcML ()
I'm using a new type:
datatype oi = Oblige
to reserve the unit type and its value () for the usual purpose of vacuous
results, for clarity.
(Yes - that's Standard ML: just imagine it's 1997, and you're writing a
prototype Haskell implementation ;-)
With the help of some extra definitions:
val gets : (char list) io
val putsl : char list -> unit io
that Haskell code sample, modified slightly:
putStrLn "What is your name?" >>=
(\_ -> getLine >>=
(\name -> putStrLn (greet name)))
greet :: String -> String
greet name = "Welcome, " ++ name ++ "!"
translates to:
putsl "What is your name?"
>>= (fn _ => gets
>>= (fn name => putsl (greet name))
where:
val greet : char list -> char list
fun greet name = List.concat (String.explode "Welcome, "::name::[#"!"])
All going well, the sample should simplify down to:
fun Oblige => let
val x = putsl "What is your name?" Oblige
val name = gets Oblige
val y = putsl (greet name) Oblige
in
y
end
Even though x isn't used it's still evaluated in Standard ML, which causes the prompt "What is your name?" to be displayed.
Now for a guess at the next question...Standard ML and Haskell are both functional languages - could all that oi stuff be transferred across to Haskell?
I was wrong? Meh; I'll answer it anyway - sort of; you can read about what I devised over here. If that was just too abominable to contemplate...well, here are those extra Standard ML definitions:
(* from pages 25-26, verbatim *)
val putcML : char -> unit
fun putcML c = TextIO.output1(TextIO.stdOut,c);
val getcML : unit -> char
fun getcML () = valOf(TextIO.input1(TextIO.stdIn));
(* Caution: work of SML novice... *)
val gets = fn Oblige => let
val c = getc Oblige
in
if c = #"\n" then
[]
else
let
val cs = gets Oblige
in
(c::cs)
end
end
fun putsl cs = fn Oblige => let
val _ = putsl cs Oblige
val _ = putc #"\n" Oblige
in
()
end
val puts : char list -> unit io
fun puts cs = fn Oblige => case cs of
[] => ()
| (c::cs) => let val _ = putc c Oblige in
puts cs Oblige
Say, I have a random DB function in my separate DB.hs file.
Something like this:
savePerson :: Person -> IO ()
savePerson p = do
c <- connect
run c "INSERT INTO persons (name, age) \
\VALUES (?, ?)"
[toSql (personName p), toSql (personAge p)]
commit c
disconnect c
return ()
Now, how do I execute this functions within my handler in Site.hs if I import my DB.hs?
If I simply stick it in my handler like this(this is just an example):
insertPerson = do
par <- getPostParams
let p = toPerson par
savePerson p
return ()
where
toPerson m =
Person {personName = head (m ! (B.pack "name"))
,personAge = read (B.unpack (head (m ! (B.pack "age")))) :: Int
}
This does not work. I dont want the handler to return anything, I just want it to save the Person and not return/render anything.
What is the correct way of doing it?
Thanks.
Your problem seems to be that you do not understand what the unit type and value are about.
The unit type is a special built-in type called "()", which has exactly one value, also called "()".
So for instance I can create a list of 4 units, which is of type "list of units".
fourUnits :: [()]
fourUnits = [(), (), (), ()]
The unit type is used where we don't want to have any other information. So technically the type "IO ()" is the type of an IO action that gives the unit value.
A "do" clause unsugars to a chain of ">>=" invocations. ">>=" has the type
(>>=) :: (Monad m) => m a -> (a -> m b) -> m b
In other words, the type of a "do" clause is the type returned by its last action.
So where you say
savePerson p
return ()
the "return ()" is spurious because it has exactly the same type as "savePerson p".
Remember that "return" has nothing to do with flow of control: it is merely a function with the type
return :: (Monad m) => a -> m a
It would have been better called "wrap" or "inject" or something similar to avoid this confusion.