Distributed Process in monad transformer

I'm toying with implementing a gossip-based cluster membership backend for Cloud Haskell (the distributed-process package, i.e. Control.Distributed.Process). I'm trying to get away with handling state without IORefs or MVars, instead using a state transformer with the Process monad at the bottom, like so:
type ClusterT = StateT ClusterState
type Cluster a = ClusterT Process a
This works fairly well using Control.Distributed.Process.Lifted (https://hackage.haskell.org/package/distributed-process-lifted) allowing you to do something like this:
mystatefulcomp :: Cluster ()
mystatefulcomp = do
  msg <- expect :: Cluster String
  old_state <- get
  say $ "My old state was " ++ show old_state
  put $ modifyState old_state msg
  mystatefulcomp
main = do
  Right transport <- createTransport "127.0.0.1" "3000" (\n -> ("127.0.0.1", n)) defaultTCPParameters
  node <- newLocalNode transport initRemoteTable
  runProcess node (evalStateT mystatefulcomp initialstate)
  where initialstate = ClusterState.empty
This works reasonably well and lets me structure my program nicely: I can keep my state functional and thread it along in the Cluster monad.
This all breaks, though, when I try to use receiveWait and match to receive messages.
Let's rewrite mystatefulcomp to do something else using receiveWait:
doSomethingWithString :: String -> Cluster ()
doSomethingWithString str = do
  s <- get
  put $ modifyState s str
mystatefulcomp :: Cluster ()
mystatefulcomp = do
  old_state <- get
  receiveWait [ match doSomethingWithString ]
  new_state <- get
  say $ "old state " ++ show old_state ++ " new " ++ show new_state
This won't work, since match has type (a -> Process b) -> Match b but we want it to have type (a -> Cluster b) -> Match b. And this is where I get out on thin ice. As I understand it, Control.Distributed.Process.Lifted re-exposes Control.Distributed.Process functions lifted into the transformer stack, allowing you to use functions like expect and say, but it does not re-expose match, matchIf and so on.
I'm really struggling with this, trying to find a workaround or a way of reimplementing match and its friends in the form MonadProcess m => (a -> m b) -> Match b.
Any insights are appreciated.
Edit
So after some fiddling about I came up with the following:
doSomethingWithString :: String -> Cluster ()
doSomethingWithString str = do
  s <- get
  put $ modifyState s str
doSomethingWithInt :: Int -> Cluster ()
...
mystatefulcomp :: Cluster ()
mystatefulcomp = do
  old_state <- get
  id =<< receiveWait [ match $ return . doSomethingWithString
                     , match $ return . doSomethingWithInt ]
  new_state <- get
  say $ "old state " ++ show old_state ++ " new " ++ show new_state
This works fairly well but I am still curious about how good of a design this is
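To make the trick in the edit easier to see in isolation, here is a minimal sketch that does not depend on distributed-process at all. Everything in it is an assumption for illustration: IO stands in for Process, receiveToy is a hypothetical stand-in for receiveWait with a single match, and the state is just a list of received messages. The handler does nothing in the base monad except return a Cluster action as a value; join (which is what id =<< amounts to) then runs that action in the outer stack.

```haskell
import Control.Monad (join)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (StateT, execStateT, modify)

-- Toy stand-in for receiveWait [match handler]: it "receives" a fixed
-- message and runs the handler in the base monad (IO instead of Process).
receiveToy :: (String -> IO b) -> IO b
receiveToy handler = handler "ping"

-- Assumption: the state is simply the list of messages seen so far.
type Cluster = StateT [String] IO

recordMessage :: String -> Cluster ()
recordMessage msg = modify (msg :)

-- The handler only returns the stateful action as a value;
-- join then runs that action in the outer transformer stack.
step :: Cluster ()
step = join (lift (receiveToy (return . recordMessage)))

main :: IO ()
main = execStateT (step >> step) [] >>= print
```

Note that the state is only touched after the receive has finished, so nothing stateful happens inside the base-monad callback itself.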

As Michael Snoyman points out in a series of blog posts (five links, listed at the end of this answer), wrapping StateT around IO is a bad idea. You just stumbled over one instance where that surfaces.
mystatefulcomp :: Cluster ()
mystatefulcomp = do
  old_state <- get
  receiveWait [ match doSomethingWithString ]
  new_state <- get
The problem is what ends up in new_state if doSomethingWithString throws an error. The old_state? Some intermediate state from doSomethingWithString before the exception? You see, the very fact that we are wondering makes this approach no less bad than just storing the state in an IORef or MVar.
Apart from questionable semantics, this can't even be implemented without distributed-process being rewritten to use MonadBaseControl everywhere. This is exactly why distributed-process-lifted fails to deliver, because it just wraps around the primitives from distributed-process.
So, what I would do here instead is to pass around a data Config = Config { clusterState :: MVar ClusterState } environment (Oh look, Process does that, too!). Possibly with ReaderT, which interacts with IO in a sane way, plus you can easily lift any number of nested occurrences of Process to ReaderT Config Process yourself.
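A minimal sketch of that ReaderT-plus-MVar layout follows. It is only an illustration under stated assumptions: IO stands in for Process so the example runs without distributed-process, ClusterState is a hypothetical list of messages, and recordMessage, currentState and runCluster are made-up helper names.

```haskell
import Control.Concurrent.MVar (MVar, modifyMVar_, newMVar, readMVar)
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Reader (ReaderT, asks, runReaderT)

-- Hypothetical stand-in for the question's ClusterState.
newtype ClusterState = ClusterState [String] deriving (Show, Eq)

-- The environment holds a mutable reference, not the state itself.
newtype Config = Config { clusterState :: MVar ClusterState }

-- Assumption: IO plays the role of Process here.
type Cluster = ReaderT Config IO

-- Update the shared state; modifyMVar_ restores the old value if the
-- update function throws.
recordMessage :: String -> Cluster ()
recordMessage msg = do
  var <- asks clusterState
  liftIO $ modifyMVar_ var (\(ClusterState ms) -> pure (ClusterState (msg : ms)))

currentState :: Cluster ClusterState
currentState = asks clusterState >>= liftIO . readMVar

runCluster :: Cluster a -> IO a
runCluster action = do
  var <- newMVar (ClusterState [])
  runReaderT action (Config var)

main :: IO ()
main = runCluster (recordMessage "a" >> recordMessage "b" >> currentState) >>= print
```

Because the environment never changes, a handler that runs in the base monad only needs the Config value passed to it (via runReaderT) to see and update the same state.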
Repeating the message of Michael's blog posts: StateT isn't bad in general (in a pure transformer stack, that is), just for cases where we wrap IO in some way. I encourage you to read those posts, they were very inspiring for me, so here they are again:
https://www.fpcomplete.com/blog/2017/06/readert-design-pattern
https://www.fpcomplete.com/blog/2017/06/understanding-resourcet
https://www.fpcomplete.com/blog/2017/06/tale-of-two-brackets
https://www.fpcomplete.com/blog/2017/07/announcing-new-unliftio-library
https://www.fpcomplete.com/blog/2017/07/the-rio-monad

Related

The "Haskell way" to extract/accumulate results inside a predefined visitor pattern iterator

I'm getting started with Haskell (after many years of C and C++) and have decided to attempt a small database project. I'm using a predefined binding library to a C database library (Database.KyotoCabinet). I'm struggling to get my head round how to do anything with the iterator interfaces due to the separation of effects when using a predefined method.
The toy demo to iterate over the data base and print it out (which works fine) is
test7 = do
  db <- openTree "testdatabase/mydb.kct" defaultLoggingOptions (Writer [] [])
  let visitor = \k v -> putStr (show k) >> putStr ":" >> putStrLn (show v) >>
                        return (Left NoOperation)
  iterate db visitor False
  close db
Where iterate and visitor are provided by the library bindings and the relevant types are
iterate :: forall db. WithDB db => db -> VisitorFull -> Writable -> IO ()
visitor :: ByteString -> ByteString -> IO (Either VisitorAction b)
But I can't see how to extract information from inside the iterator rather than process each record individually; for example, collect all the keys beginning with 'a' in a list, or even just count the number of entries.
Am I limited because iterate just has the type IO (), so I can't build in side effects and would have to rebuild this, replacing the library versions? The state monad on paper seems to address this, but the visitor type doesn't seem to allow me to maintain the state over subsequent visitor calls.
What would be the Haskell way to solve this ?
Matthew
Edit - many thanks for the clear answer below, which said both that it's not the Haskell way and provided a solution. This answer led me to Mutable objects, where I found a clear explanation of the options.

The kyotocabinet library unfortunately does not seem to support your operation. Beyond iterate, it should expose some similar operation which returns something more complex than IO (), say IO a or IO [a] while requiring a more complex visitor function.
Still, since we work inside IO, there is a workaround: we can exploit IORefs and collect results. I want to stress, though, that this is not idiomatic code one would write in Haskell, but something one is forced to use because of the limitation of this library.
Anyway, the code would look something like this (untested):
test7 = do
  db <- openTree "testdatabase/mydb.kct" defaultLoggingOptions (Writer [] [])
  w <- newIORef []              -- create mutable var, initialize to []
  let visitor = \k v -> do
        putStrLn (show k ++ ":" ++ show v)
        modifyIORef w ((k,v):)  -- prepend (k,v) to the list w
        return (Left NoOperation)
  iterate db visitor False
  result <- readIORef w         -- get the whole list
  print result
  close db
Since you come from C++, you might want to compare the code above to the following pseudo-C++:
std::vector<std::pair<int,int>> w;
db.iterate([&](int k, int v) {
    std::cout << k << ", " << v << "\n";
    w.push_back({k,v});
});
// here we can read w, even if db.iterate returns void
Again, this is not something I would consider idiomatic Haskell.
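For the two concrete tasks from the question (collecting the keys that begin with 'a' and counting the entries), the same IORef workaround can be sketched without the database library. The iterateToy function below is a hypothetical stand-in for the library's iterate: it calls a visitor on every record and returns IO (), and its visitor type is simplified compared with the real binding.

```haskell
import Control.Monad (when)
import Data.IORef (modifyIORef', newIORef, readIORef)
import Data.List (isPrefixOf)

-- Hypothetical stand-in for the library's iterate: it calls the visitor
-- on every record and returns nothing.
iterateToy :: [(String, String)] -> (String -> String -> IO ()) -> IO ()
iterateToy records visitor = mapM_ (uncurry visitor) records

-- Collect the keys starting with 'a' and count all entries.
summarize :: [(String, String)] -> IO ([String], Int)
summarize records = do
  keysRef  <- newIORef []
  countRef <- newIORef (0 :: Int)
  let visitor k _ = do
        modifyIORef' countRef (+ 1)
        when ("a" `isPrefixOf` k) $ modifyIORef' keysRef (k :)
  iterateToy records visitor
  keys  <- readIORef keysRef
  count <- readIORef countRef
  pure (reverse keys, count)   -- keys were prepended, so restore visit order

main :: IO ()
main = summarize [("apple", "1"), ("bob", "2"), ("avocado", "3")] >>= print
```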

Refactoring Haskell when adding IO

I have a concern regarding how far the introduction of IO trickles through a program. Say a function deep within my program is altered to include some IO; how do I isolate this change to not have to also change every function in the path to IO as well?
For instance, in a simplified example:
a :: String -> String
a s = (b s) ++ "!"
b :: String -> String
b s = '!':(fetch s)
fetch :: String -> String
fetch s = reverse s
main = putStrLn $ a "hello"
(fetch here could more realistically be reading a value from a static Map to give as its result)
But say if due to some business logic change, I needed to lookup the value returned by fetch in some database (which I can exemplify here with a call to getLine):
fetch :: String -> IO String
fetch s = do
  x <- getLine
  return $ s ++ x
So my question is, how to prevent having to rewrite every function call in this chain?
a :: String -> IO String
a s = fmap (\x -> x ++ "!") (b s)
b :: String -> IO String
b s = fmap (\x -> '!':x) (fetch s)
fetch :: String -> IO String
fetch s = do
  x <- getLine
  return $ s ++ x
main = a "hello" >>= putStrLn
I can see that refactoring this would be much simpler if the functions themselves did not depend on each other. That is fine for a simple example:
a :: String -> String
a s = s ++ "!"
b :: String -> String
b s = '!':s
fetch :: String -> IO String
fetch s = do
  x <- getLine
  return $ s ++ x
doit :: String -> IO String
doit s = fmap (a . b) (fetch s)
main = doit "hello" >>= putStrLn
but I don't know if that is necessarily practical in more complicated programs.
The only way I've found thus far to really isolate an IO addition like this is to use unsafePerformIO, but, by its very name, I don't want to do that if I can help it. Is there some other way to isolate this change? If the refactoring is substantial, I would start to feel inclined to avoid having to do it (especially under deadlines, etc).
Thanks for any advice!

Here are a few methods I use.
Reduce dependencies on effects by inverting control. (One of the methods you described in your question.) That is, execute the effects outside and pass the results (or functions with those results partially applied) into pure code. Instead of having main → a → b → fetch, have main → fetch and then main → a → b:
a :: String -> String
a f = b f ++ "!"
b :: String -> String
b f = '!' : f
fetch :: String -> IO String
fetch s = do
  x <- getLine
  return $ s ++ x
main = do
  f <- fetch "hello"
  putStrLn $ a f
For more complex cases of this, where you need to thread an argument to do this sort of “dependency injection” through many levels, Reader/ReaderT lets you abstract over the boilerplate.
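As a rough sketch of that Reader variant: the fetched value becomes the Reader environment, so a and b no longer take it as an explicit argument. The fetch below is a stand-in that avoids real input (an assumption made so the example runs unattended).

```haskell
import Control.Monad.Trans.Reader (Reader, ask, runReader)

-- The fetched value lives in the environment; only b actually reads it.
b :: Reader String String
b = ('!' :) <$> ask

-- a just builds on b and never mentions the environment.
a :: Reader String String
a = (++ "!") <$> b

-- Stand-in for the effectful fetch (assumption: no real database or stdin).
fetch :: String -> IO String
fetch s = pure (reverse s)

main :: IO ()
main = do
  f <- fetch "hello"          -- run the effect at the edge
  putStrLn (runReader a f)    -- then run the pure code with the result
```

With only one level of nesting this is no shorter than passing f by hand; the benefit shows up when many intermediate functions would otherwise have to forward the argument.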
Write pure code that you expect might need effects in monadic style from the start. (Polymorphic over the choice of monad.) Then if you do eventually need effects in that code, you don’t need to change the implementation, only the signature.
a :: (Monad m) => String -> m String
a s = (++ "!") <$> b s
b :: (Monad m) => String -> m String
b s = ('!' :) <$> fetch s
fetch :: (Monad m) => String -> m String
fetch s = pure (reverse s)
Since this code works for any m with a Monad instance (or in fact just Applicative), you can run it directly in IO, or purely with the “dummy” monad Identity:
main = putStrLn =<< a "hello"
main = putStrLn $ runIdentity $ a "hello"
Then as you need more effects, you can use “mtl style” (as @dfeuer’s answer describes) to enable effects on an as-needed basis, or if you’re using the same monad stack everywhere, just replace m with that concrete type, e.g.:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Control.Monad.IO.Class (MonadIO, liftIO)
newtype Fetch a = Fetch { unFetch :: IO a }
  deriving (Applicative, Functor, Monad, MonadIO)
a :: String -> Fetch String
a s = (++ "!") <$> b s
b :: String -> Fetch String
b s = ('!' :) <$> fetch s
fetch :: String -> Fetch String
fetch s = do
  x <- liftIO getLine
  return $ s ++ x
main = putStrLn =<< unFetch (a "hello")
The advantage of mtl style is that you can have multiple different implementations of your effects. That makes things like testing & mocking easy, since you can reuse the logic but run it with different “handlers” for production & testing. In fact, you can get even more flexibility (at the cost of some runtime performance) using an algebraic effects library such as freer-effects, which not only lets the caller change how each effect is handled, but also the order in which they’re handled.
Roll up your sleeves and do the refactoring. The compiler will tell you everywhere that needs to be updated anyway. After enough times doing this, you’ll naturally end up recognising when you’re writing code that will require this refactoring later, so you’ll consider effects from the beginning and not run into the problem.
You’re quite right to doubt unsafePerformIO! It’s not just unsafe because it breaks referential transparency, it’s unsafe because it can break type, memory, and concurrency safety as well—you can use it to coerce any type to any other, cause a segfault, or cause deadlocks and concurrency errors that would ordinarily be impossible. You’re telling the compiler that some code is pure, so it’s going to assume it can do all the transformations it does with pure code—such as duplicating, reordering, or even dropping it, which may completely change the correctness and performance of your code.
The main legitimate use cases for unsafePerformIO are things like using the FFI to wrap foreign code (that you know is pure), or doing GHC-specific performance hacks; stay away from it otherwise, since it’s not meant as an “escape hatch” for ordinary code.

First off, the refactoring doesn't tend to be as bad as you might imagine. Once you make the first change, the type checker will point you to the next few, and so on. But suppose you have a reason to suspect from the start that you might need some extra capability to make a function go. A common way to do this (called mtl-style, after the monad transformer library) is to express your needs in a constraint.
class Monad m => MonadFetch m where
  fetch :: String -> m String
a :: MonadFetch m => String -> m String
a s = fmap (\x -> x ++ "!") (b s)
b :: MonadFetch m => String -> m String
b s = fmap (\x -> '!':x) (fetch s)
instance MonadFetch IO where
  -- fetch :: String -> IO String
  fetch s = do
    x <- getLine
    return $ s ++ x
instance MonadFetch Identity where
  -- fetch :: String -> Identity String
  fetch = Identity . reverse
You're no longer tied to a particular monad: you just need one that can fetch. Code operating on an arbitrary MonadFetch instance is pure, except that it can fetch.
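To show what running the same logic under two handlers might look like, here is a self-contained sketch. The IO instance is deliberately changed from the one above (an assumption for illustration): it logs and reverses instead of reading stdin, so the example runs without input.

```haskell
import Data.Functor.Identity (Identity (..))

class Monad m => MonadFetch m where
  fetch :: String -> m String

-- Shared logic, written once against the constraint.
a :: MonadFetch m => String -> m String
a s = fmap (++ "!") (b s)

b :: MonadFetch m => String -> m String
b s = fmap ('!' :) (fetch s)

-- Pure handler, e.g. for tests.
instance MonadFetch Identity where
  fetch = Identity . reverse

-- Effectful handler; a stand-in that logs instead of reading stdin.
instance MonadFetch IO where
  fetch s = do
    putStrLn ("fetching " ++ s)
    pure (reverse s)

main :: IO ()
main = do
  putStrLn (runIdentity (a "hello"))  -- pure run, no effects
  a "hello" >>= putStrLn              -- same logic, effectful handler
```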

Reader Monad - explanation of trivial case

I have been trying to get to grips with the reader monad and came across this tutorial. In it, the author presents this example:
example2 :: String -> String
example2 context = runReader (greet "James" >>= end) context
  where
    greet :: String -> Reader String String
    greet name = do
      greeting <- ask
      return $ greeting ++ ", " ++ name
    end :: String -> Reader String String
    end input = do
      isHello <- asks (== "Hello")
      return $ input ++ if isHello then "!" else "."
I know that this is a trivial example that shows the mechanics, but I am trying to figure out why it would be better than doing something like:
example3 :: String -> String
example3 = end <*> (greet "James")
  where
    greet name input = input ++ ", " ++ name
    end input = if input == "Hello" then (++ "!") else (++ ".")

Reader isn't often used by itself in real code. As you have observed, it's not really better than just passing an extra argument to your functions. However, as part of a monad transformer it is an excellent way to pass configuration parameters through your application. Usually this is done by adding a MonadReader constraint to any function that needs access to configuration.
Here's an attempt at a more real-world example:
data Config = Config
  { databaseConnection :: Connection
  , ... other configuration stuff
  }
getUser :: (MonadReader Config m, MonadIO m) => UserKey -> m User
getUser x = do
  db <- asks databaseConnection
  .... fetch user from database using the connection
then your main would look something like:
main :: IO ()
main = do
  config <- .... create the configuration
  user <- runReaderT (getUser (UserKey 42)) config
  print user
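A runnable variant of the same idea, with the elided parts replaced by loudly hypothetical stand-ins: the "database connection" is just an in-memory Map, UserKey and User are plain type synonyms, and the result is a Maybe because the lookup can fail. None of these names come from a real database library.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Reader (MonadReader, asks, runReaderT)
import qualified Data.Map as Map

-- Hypothetical stand-ins for the real types.
type UserKey = Int
type User = String

-- Assumption: an in-memory table plays the role of the connection.
newtype Config = Config { userTable :: Map.Map UserKey User }

-- Needs read access to the configuration and the ability to do IO,
-- and says so in its constraints.
getUser :: (MonadReader Config m, MonadIO m) => UserKey -> m (Maybe User)
getUser key = do
  table <- asks userTable
  liftIO $ putStrLn ("looking up user " ++ show key)
  pure (Map.lookup key table)

main :: IO ()
main = do
  let config = Config (Map.fromList [(42, "James")])
  user <- runReaderT (getUser 42) config
  print user
```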

dfeuer, chi and user2297560 are right in that "Reader isn't often used by itself in real code". It is worth noting, though, that there is next to no essential difference between what you do in the second snippet in the question and actually using Reader as a monad: the function functor is just Reader without the wrappers, and the Monad and Applicative instances for both of them are equivalent. By the way, outside of highly polymorphic code [1], the typical motivation for using the function Applicative is making code more pointfree. In that case, moderation is highly advisable. For instance, as far as my own taste goes, this...
(&&) <$> isFoo <*> isBar
... is fine (and sometimes it might even read nicer than the pointful spelling), while this...
end <*> greet "James"
... is just confusing.
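To spell out what the function Applicative does in both snippets: for functions, (f <*> g) x = f x (g x) and fmap is composition. The sketch below puts the pointfree and pointful spellings side by side; isFoo and isBar are hypothetical predicates chosen only for illustration, and greet and end are copied from the question.

```haskell
-- Hypothetical predicates, only for illustration.
isFoo, isBar :: Int -> Bool
isFoo = even
isBar = (> 10)

-- Pointfree, via the function Applicative.
both :: Int -> Bool
both = (&&) <$> isFoo <*> isBar

-- Pointful spelling of the same predicate.
both' :: Int -> Bool
both' n = isFoo n && isBar n

-- greet and end as in the question.
greet :: String -> String -> String
greet name input = input ++ ", " ++ name

end :: String -> String -> String
end input = if input == "Hello" then (++ "!") else (++ ".")

-- The pointfree version and its expansion: the shared argument
-- (the "context") is passed to both functions.
example3, example3' :: String -> String
example3 = end <*> greet "James"
example3' context = end context (greet "James" context)

main :: IO ()
main = do
  print (map both [4, 12, 13] == map both' [4, 12, 13])
  putStrLn (example3 "Hello")
  putStrLn (example3' "Hi")
```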
Footnotes
[1] For instance, as Carl points out in a comment, it and the related instances can be useful in...
[...] places where you have code that's polymorphic in a type constructor and your use case is passing an argument in. This can come up when using the polymorphic types offered by lenses, for instance.

How to limit code changes when introducing state?

I am a senior C/C++/Java/Assembler programmer and I have always been fascinated by the pure functional programming paradigm. From time to time, I try to implement something useful with it, e.g., a small tool, but often I quickly reach a point where I realize that I (and my tool, too) would be much faster in a non-pure language. It's probably because I have much more experience with imperative programming languages, with thousands of idioms, patterns and typical solution approaches in my head.
Here is one of those situations. I have encountered it several times and I hope you guys can help me.
Let's assume I write a tool to simulate communication networks. One important task is the generation of network packets. The generation is quite complex, consisting of dozens of functions and configuration parameters, but at the end there is one master function and because I find it useful I always write down the signature:
generatePackets :: Configuration -> [Packet]
However, after a while I notice that it would be great if the packet generation would have some kind of random behavior deep down in one of the many sub-functions of the generation process. Since I need a random number generator for that (and I also need it at some other places in the code), this means to manually change dozens of signatures to something like
f :: Configuration -> RNGState [Packet]
with
type RNGState = State StdGen
I understand the "mathematical" necessity (no states) behind this. My question is on a higher (?) level: How would an experienced Haskell programmer have approached this situation? What kind of design pattern or work flow would have avoided the extra work later?
I have never worked with an experienced Haskell programmer. Maybe you will tell me that you never write signatures because you have to change them too often afterwards, or that you give all your functions a state monad, "just in case" :)

One approach that I've been fairly successful with is using a monad transformer stack. This lets you both add new effects when needed and also track the effects required by particular functions.
Here's a really simple example.
import Control.Monad.State
import Control.Monad.Reader
data Config = Config { v1 :: Int, v2 :: Int }
-- the type of the entire program describes all the effects that it can do
type Program = StateT Int (ReaderT Config IO) ()
runProgram program config startState =
  runReaderT (runStateT program startState) config
-- doesn't use configuration values. doesn't do IO
step1 :: MonadState Int m => m ()
step1 = get >>= \x -> put (x+1)
-- can use configuration and change state, but can't do IO
step2 :: (MonadReader Config m, MonadState Int m) => m ()
step2 = do
  x <- asks v1
  y <- get
  put (x+y)
-- can use configuration and do IO, but won't touch our internal state
step3 :: (MonadReader Config m, MonadIO m) => m ()
step3 = do
  x <- asks v2
  liftIO $ putStrLn ("the value of v2 is " ++ show x)
program :: Program
program = step1 >> step2 >> step3
main :: IO ()
main = do
  let config = Config { v1 = 42, v2 = 123 }
      startState = 17
  result <- runProgram program config startState
  return ()
Now if we want to add another effect:
import Control.Monad.Writer
step4 :: MonadWriter String m => m ()
step4 = tell "done!"
program :: Program
program = step1 >> step2 >> step3 >> step4
Just adjust Program and runProgram
type Program = StateT Int (ReaderT Config (WriterT String IO)) ()
runProgram program config startState =
  runWriterT $ runReaderT (runStateT program startState) config
To summarize, this approach lets us decompose a program in a way that tracks effects but also allows adding new effects as needed without a huge amount of refactoring.
edit:
It's come to my attention that I didn't answer the question about what to do for code that's already written. In many cases, it's not too difficult to change pure code into this style:
computation :: Double -> Double -> Double
computation x y = x + y
becomes
computation :: Monad m => Double -> Double -> m Double
computation x y = return (x + y)
This function will now work for any monad, but doesn't have access to any extra effects. Specifically, if we add another monad transformer to Program, then computation will still work.
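Applied to the random-number scenario from the question, the same style might look like the sketch below. It is only an illustration under stated assumptions: a toy linear congruential generator over an Int seed stands in for StdGen (so the example needs nothing beyond mtl), and Packet, nextRandom and generatePacket are hypothetical names. Only the functions that need randomness carry the MonadState constraint.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad (replicateM)
import Control.Monad.State (MonadState, evalState, get, put)

-- Hypothetical packet type: just a payload size.
newtype Packet = Packet Int deriving (Show, Eq)

-- Toy linear congruential generator; a stand-in for a real RNG such as StdGen.
nextRandom :: MonadState Int m => m Int
nextRandom = do
  seed <- get
  let seed' = (seed * 1103515245 + 12345) `mod` 2147483648
  put seed'
  pure seed'

-- Deep in the call chain: the only place that asks for randomness.
generatePacket :: MonadState Int m => m Packet
generatePacket = Packet . (`mod` 1500) <$> nextRandom

-- The master function threads the seed implicitly.
generatePackets :: MonadState Int m => Int -> m [Packet]
generatePackets n = replicateM n generatePacket

main :: IO ()
main = print (evalState (generatePackets 3) 1)
```

Running the generator with the same seed gives the same packets, so the code stays deterministic and testable even though it looks stateful.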

Hide a function parameter in Haskell?

I need to backup some data to access it later.
At the interface level, I have two functions:
put: backs up data and returns a backup_Id.
get: retrieves data given a backup_Id.
My current code requires me to supply these two functions with the backup parameter.
import Data.Maybe
data Data = Data String deriving Show
type Backup = [(String,Data)]
put :: Backup -> String -> IO Backup
put boilerPlate a =
  do let id = "id" ++ show (length boilerPlate)
     putStrLn $ id ++ ": " ++ a
     return ((id, Data a) : boilerPlate)
get :: Backup -> String -> Maybe Data
get boilerPlate id = lookup id boilerPlate
It works OK.
In the following sample, two values are backed up. The second one is retrieved.
main :: IO ()
main = do
  let bp0 = []
  bp1 <- put bp0 "a"
  bp2 <- put bp1 "b"
  let result = get bp2 "id1"
  putStrLn $ "Looking for id1: " ++ show (fromJust result)
But I need to simplify the signatures of put and get by getting rid of all the backup parameters.
I need something that looks like this:
main = do
  put "a"
  put "b"
  let result = get "id1"
What is the simplest way to achieve this?

Here's an example using StateT. Note that the function names are changed because State and StateT already have get and put functions.
module Main where
import Control.Monad.State
data Data = Data String deriving Show
type Backup = [(String,Data)]
save :: String -> StateT Backup IO ()
save a = do
  backup <- get
  let id = "id" ++ ((show . length) backup)
  liftIO $ putStrLn $ id ++ ": " ++ a
  put ((id, Data a):backup)
retrieve :: String -> StateT Backup IO (Maybe Data)
retrieve id = do
  backup <- get
  return $ lookup id backup
run :: IO (Maybe Data)
run = flip evalStateT [] $ do
  save "a"
  save "b"
  retrieve "id1"
main :: IO ()
main = do
  result <- run
  print result
The State monad threads a 'mutable' value through a computation. StateT combines State with other monads; in this case, allowing the use of IO.
As dfeuer mentioned, it is possible to make save and retrieve a bit more general with these types:
save :: (MonadState Backup m, MonadIO m) => String -> m ()
retrieve :: (MonadState Backup m, MonadIO m) => String -> m (Maybe Data)
(This also requires {-# LANGUAGE FlexibleContexts #-}) The advantage of this approach is that it allows our functions to work with any monad that provides the Backup state and IO. In particular, we can add effects to the monad and the functions will still work.
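For completeness, here is a sketch of the generalized versions with bodies filled in. The implementations are the same as the StateT ones above; only the signatures change. One liberty is taken and should be read as an assumption: retrieve is given just the MonadState constraint, since it performs no IO, and Data derives Eq so results can be compared.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.State (MonadState, evalStateT, get, gets, put)

data Data = Data String deriving (Show, Eq)
type Backup = [(String, Data)]

-- Works in any monad that provides Backup state and IO.
save :: (MonadState Backup m, MonadIO m) => String -> m ()
save a = do
  backup <- get
  let ident = "id" ++ show (length backup)
  liftIO $ putStrLn (ident ++ ": " ++ a)
  put ((ident, Data a) : backup)

-- Only reads the state, so it asks for nothing more.
retrieve :: MonadState Backup m => String -> m (Maybe Data)
retrieve ident = gets (lookup ident)

main :: IO ()
main = do
  result <- evalStateT (save "a" >> save "b" >> retrieve "id1") []
  print result
```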
All this monad / monad transformer stuff can be pretty confusing at first, but it's actually pretty elegant once you get used to it. The advantage is that you can easily see what kind of effects are required in each function. That being said, I don't want you to think that there are things that Haskell can't do, so here's another way to achieve your goal which does away with the state monad in favor of a mutable reference.
module Main where
import Data.IORef
data Data = Data String deriving Show
type Backup = [(String,Data)]
mkSave :: IORef Backup -> String -> IO ()
mkSave r a = do
  backup <- readIORef r
  let id = "id" ++ ((show . length) backup)
  putStrLn $ id ++ ": " ++ a
  writeIORef r ((id, Data a):backup)
mkRetrieve :: IORef Backup -> String -> IO (Maybe Data)
mkRetrieve r id = do
  backup <- readIORef r
  return $ lookup id backup
main :: IO ()
main = do
  ref <- newIORef []
  let save = mkSave ref
      retrieve = mkRetrieve ref
  save "a"
  save "b"
  result <- retrieve "id0"
  print result
Just be warned that this isn't usually the recommended approach.
