Lazy mutual recursive evaluation of Polysemy expressions - haskell

NB: I don't think the problem here is with Polysemy per se.
The Goal:
I'm trying to write a single-thread simulation of multiple parties communicating synchronously. More specifically, the goal is to write the program that one party will run* in some effect system** and have a pure function that will simulate the multiple parties all running their programs together. The output of this simulation should be a representation of all the messages sent (or received) by each party***.
*All parties will run the same program.
**Polysemy is what I know.
***And each party's respective return value, although that's not really important.
Setup:
We can simplify a little and suppose there are only two parties. For this context I wrote a simple Pair functor; for the real system I expect to use fixed-length vectors.
The particular effects we're interested in are Input and Output.
output $ Pair a1 a2 means that "the party in question" sends a1 to party #1 and a2 to party #2.
Pair b1 b2 <- input means that "the party in question" receives b1 from party #1 and b2 from party #2.
These always have to come in pairs, with output first. In the real system we'd guarantee that by having a Communicate effect that got handled into Input and Output, but for now we'll just take it as an assumption.
I don't think it's possible to do the kind of parallel synchronous simulation I want entirely within the structure of a Polysemy handler (aka interpreter or runner). I'm pretty sure what I need is a function like
parallelEvaluation :: forall x a r.
(Sem r ([Pair x], a) -> Sem '[] ([Pair x], a)) ->
Pair (Sem (Output (Pair x) ': Input (Pair x) ': r) a) ->
Pair ([Pair x], a)
parallelEvaluation handleRest sems = ...
The first argument to parallelEvaluation is just a shim; I may not need it.
The Idea:
If we imagine the two parties running in parallel, we can clearly gather up their outputs as they send them; Polysemy has builtin handler just for that. That will give us exactly what we want: a list of all the messages sent.
The issue is what to feed into their inputs.
The idea is the use the results of parallelEvaluation recursively as the input-tape to the Input handler. If the whole function is "sufficiently" lazy, and our above assumption that output and input always appear together is sound, then this should be computable.
Polysemy is capable of the necessary laziness!
Specifically see here. doRealRecursion works fine; the only special thing I did to make it work was use the runLazyOutputList handler (which isn't Polysemy's default for space-efficiency reasons).
The Code:
Testing the pure-parallel-simulation:
import Polysemy (Members, Sem, reinterpret, run)
import Polysemy.Input (Input(Input), input)
import Polysemy.Output (Output, output, runLazyOutputList)
import Polysemy.State (get, put, runState)
import Pair (Pair(Pair),universe, (!))
runUnsafeInputList :: [i] -> Sem (Input i ': r) a -> Sem r a
-- Just like the normal PolySemy.Input.runInputList, except if there aren't enough inputs that's a runtime error!
runUnsafeInputList is = fmap snd . runState is . reinterpret (\case
Input -> do ~(s : ss) <- get
put ss
pure s
)
parallelEvaluation :: forall x a r.
(Sem r ([Pair x], a) -> Sem '[] ([Pair x], a)) ->
Pair (Sem (Output (Pair x) ': Input (Pair x) ': r) a) ->
Pair ([Pair x], a)
parallelEvaluation handleRest sems = run . handleRest <$> (runUnsafeInputList <$> os <*> (runLazyOutputList <$> sems))
where os :: Pair [Pair x]
os = fst <$> (parallelEvaluation handleRest sems)
testProgram :: forall r.
Members '[Input (Pair String), Output (Pair String)] r =>
Bool -> Sem r String
testProgram self = do output $ ((ownName ++ " says hi to ") ++) <$> parties
Pair m11 m12 <- input
let c1 = show $ (length m11) + (length m12)
output $ (++ ("; " ++ ownName ++ " got " ++ c1 ++ " characters last turn!")) <$> parties
Pair m21 m22 <- input
let c2 = show $ (length m21) + (length m22)
return $ ownName ++ "successfully got " ++ c2 ++ " characters in second round!"
where parties = Pair "Party1" "Party2"
ownName = parties ! self
doParallelRecursion :: IO ()
doParallelRecursion = do print "Attempting..."
let results = parallelEvaluation id $ testProgram <$> universe
print $ "Results: " ++ (show results)
The Pair helper:
data Pair a where
Pair :: a -> a -> Pair a
deriving (Read, Show, Eq)
instance Functor Pair where
fmap f (Pair a1 a2) = Pair (f a1) (f a2)
instance Applicative Pair where
pure a = Pair a a
(Pair f1 f2) <*> (Pair a1 a2) = Pair (f1 a1) (f2 a2)
(!) :: Pair a -> Bool -> a
(Pair a1 a2) ! b = if b then a2 else a1
universe :: Pair Bool
universe = Pair False True
The Problem:
As an executable, doParallelRecursion prints "Attempting..." (with a newline), and then hangs (seemingly forever).
From GHCI, it prints
"Attempting..."
"Results:
and then hangs on that line for several seconds before *** Exception: stack overflow.
I've attempted to use the debugger in GHCI to narrow down the problem. Assuming I'm using it correctly, the bodies of runUnsafeInputList and testProgram aren't ever being evaluated. runUnsafeInputList <$> os evaluates os, which immediately recurses.
I've tried swapping the order of the effects/handlers, this didn't affect the behavior at all. (I don't think it ought to matter, but the order presented here may be more intuitive.)

Related

An ArrowCircuit instance for stream processors which could block

The Control.Arrow.Operations.ArrowCircuit class is for:
An arrow type that can be used to interpret synchronous circuits.
I want to know what synchronous means here. I looked it up on Wikipedia, where they are speaking of digital electronics. My electronics is quite rusty, so here is the question: what is wrong (if anything is) with such an instance for the so-called asynchronous stream processors:
data StreamProcessor a b = Get (a -> StreamProcessor a b) |
Put b (StreamProcessor a b) |
Halt
instance Category StreamProcessor where
id = Get (\ x -> Put x id)
Put c bc . ab = Put c (bc . ab)
Get bbc . Put b ab = (bbc b) . ab
Get bbc . Get aab = Get $ \ a -> (Get bbc) . (aab a)
Get bbc . Halt = Halt
Halt . ab = Halt
instance Arrow StreamProcessor where
...
getThroughBlocks :: [a] -> StreamProcessor a b -> StreamProcessor a b
getThroughBlocks ~(a : input) (Get f) = getThroughBlocks input (f a)
getThroughBlocks _input putOrHalt = putOrHalt
getThroughSameArgBlocks :: a -> StreamProcessor a b -> StreamProcessor a b
getThroughSameArgBlocks = getThroughBlocks . repeat
instance ArrowLoop StreamProcessor where
loop Halt = Halt
loop (Put (c, d) bdcd') = Put c (loop bdcd')
loop (Get f) = Get $ \ b ->
let
Put (c, d) bdcd' = getThroughSameArgBlocks (b, d) (f (b, d))
in Put c (loop bdcd')
instance ArrowCircuit StreamProcessor where
delay b = Put b id
I reckon this solution to work for us as: we want someArrowCircuit >>> delay b to be someArrowCircuit delayed by one tick with b coming before anything from it. It is easy to see we get what we want:
someArrowCircuit >>> delay b
= someArrowCircuit >>> Put b id
= Put b id . someArrowCircuit
= Put b (id . someArrowCircuit)
= Put b someArrowCircuit
Are there any laws for such a class? If I made no mistake writing delay down, how does synchronous live alongside asynchronous?
The only law that I know of related to ArrowCircuit is actually for the similar ArrowInit class from Causal Commutative Arrows, which says that delay i *** delay j = delay (i,j). I'm pretty sure your version satisfies this (and it looks like a totally reasonable implementation), but it still feels a little strange considering that StreamProcessor isn't synchronous.
Particularly, synchronous circuits follow a pattern of a single input producing a single output. For example, if you have a Circuit a b and provide it a value of type a, then you will get one and only one output b. The "one-tick delay" that delay introduces is thus a delay of one output by one step.
But things are a little funky for asynchronous circuits. Let's consider an example:
runStreamProcessor :: StreamProcessor a b -> [a] -> [b]
runStreamProcessor (Put x s) xs = x : runStreamProcessor s xs
runStreamProcessor _ [] = []
runStreamProcessor Halt _ = []
runStreamProcessor (Get f) (x:xs) = runStreamProcessor (f x) xs
multiplyOneThroughFive :: StreamProcessor Int Int
multiplyOneThroughFive = Get $ \x ->
Put (x*1) $ Put (x*2) $ Put (x*3) $ Put (x*4) $ Put (x*5) multiplyOneThroughFive
Here, multiplyOneThroughFive produces 5 outputs for each input it receives. Now, consider the difference between multiplyOneThroughFive >>> delay 100 and delay 100 >>> multiplyOneThroughFive:
> runStreamProcessor (multiplyOneThroughFive >>> delay 100) [1,2]
[100,1,2,3,4,5,2,4,6,8,10]
> runStreamProcessor (delay 100 >>> multiplyOneThroughFive) [1,2]
[100,200,300,400,500,1,2,3,4,5,2,4,6,8,10]
Inserting the delay at a different point in the circuit actually caused us to produce a different number of results. Indeed, it seems as if the circuit as a whole underwent a 5-tick delay instead of just a 1-tick delay. This would definitely be unexpected behavior in a synchronous environment!

How can I get output mention in below

readInts = fmap (map read.words) getLine
readInts :: IO [Int]
main = do
putStrLn "List number of A: "
num1 <- readInts
let a = [] ++ num1
putStrLn "List number of B: "
num2 <- readInts
let b = [] ++ num2
Choose some element a of A and some element b of B such that a + b doesn't belong to A and doesn't belong to B
Serious
If your instructor isn't teaching, and that isn't just you being burned out and stressed, then talk to the instructor. They probably aren't trying to waste your and their time. If that doesn't work then talk to the professor.
As for getting homework help here, it is entirely doable but help is very unlikely to appear without some semblance of an attempt and a clear cut issue. You usually need to come to the table with how the problem can be solved and have problems translating that how into the specifics of Haskell or whatever target language.
Cheeky
A cheeky response I'd use if I were in the classroom:
This is a finite domain so I'd just use DPLL. DPLL is a general purpose algorithm for finite domains that allows us to just state the problem as a symbolic computation and constraints then request satisfying models. We'll construct the problem first then use the SBV library to get the model.
Choose some element a of A
So lets define the set A (called as) as a list of symbolics and then constrain an existential to being a member of this set!
a <- exists "value1"
constrain (a `sElem` as)
and some element b of B
OK, same thing. We make a list of symbolic values and constrain an existential to being a member.
b <- exists "value2"
constrain (b `sElem` bs)
such that a + b
Let's define an alias for this:
let c = a + b
doesn't belong to A
We can just reuse the test for membership, sElem, and symbolic negation sNot.
constrain $ sNot (c `sElem` as)
and doesn't belong to B
Yep, same!
constrain $ sNot (c `sElem` bs)
Putting it together
Honestly the hardest part is actually running your problem more than stating it. We need to read the inputs (as you showed), call the solver (sat), and get the answer (aka the "model) via extractModel which can finally be printed.
#!/usr/bin/env cabal
{- cabal:
build-depends:
base, sbv >= 8.4
-}
{-# LANGUAGE ViewPatterns #-}
import Data.SBV
readInts :: IO [Int64]
readInts = fmap read . words <$> getLine
readInt :: IO Int64
readInt = read <$> getLine
main =
do putStrLn "List number of A: "
a <- readInts
putStrLn "List number of B: "
b <- readInts
result <- getValues a b
let values :: Maybe (Int64,Int64)
values = extractModel result
print values
getValues :: [Int64] -> [Int64] -> IO SatResult
getValues (map literal -> as) (map literal -> bs) = sat $
do a <- exists "value1"
constrain (a `sElem` as)
b <- exists "value2"
constrain (b `sElem` bs)
let c = a + b
constrain $ sNot (c `sElem` as)
constrain $ sNot (c `sElem` bs)
Because this uses SBV you'll have to have first installed z3. I included a cabal header to auto build as a package. For example:
brew install z3
...
chmod +x mycode.hs
./mycode.hs
...
List number of A:
1 3 4 5
List number of B:
1 2 3
Just (3,3)

Is this syntax as expressive as the do-notation?

The do notation allows us to express monadic code without overwhelming nestings, so that
main = getLine >>= \ a ->
getLine >>= \ b ->
putStrLn (a ++ b)
can be expressed as
main = do
a <- getLine
b <- getLine
putStrLn (a ++ b)
Suppose, though, the syntax allows ... #expression ... to stand for do { x <- expression; return (... x ...) }. For example, foo = f a #(b 1) c would be desugared as: foo = do { x <- b 1; return (f a x c) }. The code above could, then, be expressed as:
main = let a = #getLine in
let b = #getLine in
putStrLn (a ++ b)
Which would be desugared as:
main = do
x <- getLine
let a = x in
return (do
x' <- getLine
let b = x' in
return (putStrLn (a ++ b)))
That is equivalent. This syntax is appealing to me because it seems to offer the same functionality as the do-notation, while also allowing some shorter expressions such as:
main = putStrLn (#(getLine) ++ #(getLine))
So, I wonder if there is anything defective with this proposed syntax, or if it is indeed complete and equivalent to the do-notation.
putStrLn is already String -> IO (), so your desugaring ... return (... return (putStrLn (a ++ b))) ends up having type IO (IO (IO ())), which is likely not what you wanted: running this program won't print anything!
Speaking more generally, your notation can't express any do-block which doesn't end in return. [See Derek Elkins' comment.]
I don't believe your notation can express join, which can be expressed with do without any additional functions:
join :: Monad m => m (m a) -> m a
join mx = do { x <- mx; x }
However, you can express fmap constrained to Monad:
fmap' :: Monad m => (a -> b) -> m a -> m b
fmap' f mx = f #mx
and >>= (and thus everything else) can be expressed using fmap' and join. So adding join would make your notation complete, but still not convenient in many cases, because you end up needing a lot of joins.
However, if you drop return from the translation, you get something quite similar to Idris' bang notation:
In many cases, using do-notation can make programs unnecessarily verbose, particularly in cases such as m_add above where the value bound is used once, immediately. In these cases, we can use a shorthand version, as follows:
m_add : Maybe Int -> Maybe Int -> Maybe Int
m_add x y = pure (!x + !y)
The notation !expr means that the expression expr should be evaluated and then implicitly bound. Conceptually, we can think of ! as being a prefix function with the following type:
(!) : m a -> a
Note, however, that it is not really a function, merely syntax! In practice, a subexpression !expr will lift expr as high as possible within its current scope, bind it to a fresh name x, and replace !expr with x. Expressions are lifted depth first, left to right. In practice, !-notation allows us to program in a more direct style, while still giving a notational clue as to which expressions are monadic.
For example, the expression:
let y = 42 in f !(g !(print y) !x)
is lifted to:
let y = 42 in do y' <- print y
x' <- x
g' <- g y' x'
f g'
Adding it to GHC was discussed, but rejected (so far). Unfortunately, I can't find the threads discussing it.
How about this:
do a <- something
b <- somethingElse a
somethingFinal a b

How do I call a constructor that may fail, especially when implementing 'Read' and 'Arbitrary'?

I have a "public safe" that may fail with a (potentially informative) errors:
data EnigmaError = BadRotors
| BadWindows
| MiscError String
instance Show EnigmaError where
show BadRotors = "Bad rotors"
show BadWindows = "Bad windows"
show (MiscError str) = str
configEnigma :: String -> String -> String -> String -> Except EnigmaError EnigmaConfig
configEnigma rots winds plug rngs = do
unless (and $ [(>=1),(<=26)] <*> rngs') (throwError BadRotors)
unless (and $ (`elem` letters) <$> winds') (throwError BadWindows)
-- ...
return EnigmaConfig {
components = components',
positions = zipWith (\w r -> (mod (numA0 w - r + 1) 26) + 1) winds' rngs',
rings = rngs'
}
where
rngs' = reverse $ (read <$> (splitOn "." $ "01." ++ rngs ++ ".01") :: [Int])
winds' = "A" ++ reverse winds ++ "A"
components' = reverse $ splitOn "-" $ rots ++ "-" ++ plug
but it is unclear how I should call this, particularly (and specifically) in implementing Read and Arbitrary (for QuickCheck).
For the former, I can get as far as
instance Read EnigmaConfig where
readsPrec _ i = case runExcept (configEnigma c w s r) of
Right cfg -> [(cfg, "")]
Left err -> undefined
where [c, w, s, r] = words i
but this seems to end up hiding error information available in err; while for the latter, I'm stuck at
instance Arbitrary EnigmaConfig where
arbitrary = do
nc <- choose (3,4) -- This could cover a wider range
ws <- replicateM nc capitals
cs <- replicateM nc (elements rotors)
uk <- elements reflectors
rs <- replicateM nc (choose (1,26))
return $ configEnigma (intercalate "-" (uk:cs))
ws
"UX.MO.KZ.AY.EF.PL" -- TBD - Generate plugboard and test <<<
(intercalate "." $ (printf "%02d") <$> (rs :: [Int]))
which fails with a mismatch between the expected and actual types:
Expected type: Gen EnigmaConfig
Actual type: Gen (transformers-0.4.2.0:Control.Monad.Trans.Except.Except Crypto.Enigma.EnigmaError EnigmaConfig)
How do I call a ("public safe") constructor when it may fail, particularly when using it in implementing Read and Arbitrary for my class?
The Read typeclass represents parses as lists of successes (with failures being the same as no successes); so rather than undefined you should return []. As for losing information about what went wrong: that's true, and the type of readsPrec means you can't do much about that. If you really, really wanted to [note: I don't think you should want this] you could define a newtype wrapper around Except EnigmaError EnigmaConfig and give that a Read instance that had successful parses of configuration errors.
For Arbitrary you have a couple choices. One choice is so-called rejection sampling; e.g.
arbitrary = do
-- ...
case configEnigma ... of
Left err -> arbitrary -- try again
Right v -> return v
You might also consider an Arbitrary instance to be part of your internal API, and use unsafe, internal calls rather than using the safe, public API for constructing your configuration. Other options include calling error or fail. (I consider these four options to be in roughly preference order -- rejection sampling, then unsafe internal calls, then error, then fail -- though your judgement may differ.)

Haskell memory usage and IO

I had just written a piece of Haskell code where in order to debug my code I put in a bunch of print statements in my code (so, my most important function returned IO t, when it just needed to return t) and I saw that this function, on a successful run, would take up a lot of memory (roughly 1.2GB). Once I saw that the program was working fine, I removed all the print statements from the function and ran it, only to realize that it was giving me this error:
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.
Even though it was the same exact piece of code, for some reason the print statements made it ignore stack space overflow. Can anyone enlighten me as to why this happens?
I know I haven't provided my code which might make it harder to answer this question, but I've hacked a bunch of things together and it doesn't look very pretty so I doubt it would be useful and I am fairly certain that the only difference is the print statements.
EDIT:
Since people really wanted to see the code here is the relevant part:
linkCallers :: ([Int], Int, Int, I.IntDisjointSet, IntMap Int) -> ([Int], Int, Int, I.IntDisjointSet, IntMap Int)
linkCallers ([], x, y, us, im) = ([], x, y, us, im)
linkCallers ((a:b:r), x, y, us, im) = if a == b
then (r, x, y+1, us, im)
else if sameRep
then (r, x+1, y+1, us, im)
else (r, x+1, y+1, us', im')
where
ar = fst $ I.lookup a us
br = fst $ I.lookup b us
sameRep = case ar of
Nothing -> False
_ -> ar == br
as' = ar >>= flip lookup im
bs' = br >>= flip lookup im
totalSize = do
asize <- as'
bsize <- bs'
return $ asize + bsize
maxSize = (convertMaybe as') + (convertMaybe bs')
us' = I.union a b $ I.insert a $ I.insert b $ us
newRep = fromJust $ fst $ I.lookup a us'
newRep' = fromJust $ fst $ I.lookup b us'
im'' = case ar of
Nothing -> case br of
Nothing -> im
Just bk -> delete bk im
Just ak -> delete ak $ case br of
Nothing -> im
Just bk -> delete bk im
im' = case totalSize of
Nothing -> insert newRep maxSize im''
Just t -> insert newRep t im''
startLinkingAux' (c,x,y,us,im) = let t#(_,x',_,us',im') = linkCallers (c,x,y,us,im) in
case (fst $ I.lookup primeMinister us') >>= flip lookup im' >>= return . (>=990000) of
Just True -> x'
_ -> startLinkingAux' t
startLinkingAux' used to look something like this:
startLinkingAux' (c,x,y,us,im) = do
print (c,x,y,us,im)
let t#(_,x',_,us',im') = linkCallers (c,x,y,us,im) in
case (fst $ I.lookup primeMinister us') >>= flip lookup im' >>= return . (>=990000) of
Just True -> return x'
_ -> startLinkingAux' t
There could be a memory leak in one of the arguments. Probably the first thing I'd try would be to ask the author of disjoint-set to add a RFData instance for IntDisjointSet (or do it yourself, looking at the source code, it'd fairly easy). Then try calling force on all values returned by linkCallers to see if it helps.
Second, you're not using disjoint-set right. The main idea of the algorithm is that lookups compress paths in the set. This is what gives it it's great performance! So every time you make a lookup, you should replace your old set with a new one. But this makes using a disjoint set quite clumsy in a functional language. It'd suggest to use the State monad for this and use it internally in linkCallers, as one big do block instead of where, just passing the starting set and extracting the final one. And define functions like
insertS :: (MonadState IntDisjointSet m) => Int -> m ()
insertS x = modify (insert x)
lookupS :: (MonadState IntDisjointSet m) => Int -> m (Maybe Int)
lookupS x = state (lookup x)
-- etc
to use inside State. (Perhaps they'd be a good contribution to the library as well as this will be probably a common problem.)
Finally, there are lot of small improvements that can make the code more readable:
Many times you're applying a single function to two values. I'd suggest to define something like
onPair :: (a -> b) -> (a, a) -> (b, b)
onPair f (x, y) = (f x, f y)
-- and use it like:
(ar, br) = onPair (fst . flip I.lookup us) (a, b)
Also using Applicative functions can make things shorter:
sameRep = fromMaybe False $ (==) <$> ar <*> br
totalSize = (+) <$> as' <*> bs'
then also
im'' = maybe id delete ar . maybe id delete br $ im
im' = insert newRep (fromJust maxSize totalSize) im''
Hope it helps.

Resources