An ArrowCircuit instance for stream processors which could block

An ArrowCircuit instance for stream processors which could block - haskell

The Control.Arrow.Operations.ArrowCircuit class is for:
An arrow type that can be used to interpret synchronous circuits.
I want to know what synchronous means here. I looked it up on Wikipedia, where they are speaking of digital electronics. My electronics is quite rusty, so here is the question: what is wrong (if anything is) with such an instance for the so-called asynchronous stream processors:
data StreamProcessor a b = Get (a -> StreamProcessor a b) |
Put b (StreamProcessor a b) |
Halt
instance Category StreamProcessor where
id = Get (\ x -> Put x id)
Put c bc . ab = Put c (bc . ab)
Get bbc . Put b ab = (bbc b) . ab
Get bbc . Get aab = Get $ \ a -> (Get bbc) . (aab a)
Get bbc . Halt = Halt
Halt . ab = Halt
instance Arrow StreamProcessor where
...
getThroughBlocks :: [a] -> StreamProcessor a b -> StreamProcessor a b
getThroughBlocks ~(a : input) (Get f) = getThroughBlocks input (f a)
getThroughBlocks _input putOrHalt = putOrHalt
getThroughSameArgBlocks :: a -> StreamProcessor a b -> StreamProcessor a b
getThroughSameArgBlocks = getThroughBlocks . repeat
instance ArrowLoop StreamProcessor where
loop Halt = Halt
loop (Put (c, d) bdcd') = Put c (loop bdcd')
loop (Get f) = Get $ \ b ->
let
Put (c, d) bdcd' = getThroughSameArgBlocks (b, d) (f (b, d))
in Put c (loop bdcd')
instance ArrowCircuit StreamProcessor where
delay b = Put b id
I reckon this solution to work for us as: we want someArrowCircuit >>> delay b to be someArrowCircuit delayed by one tick with b coming before anything from it. It is easy to see we get what we want:
someArrowCircuit >>> delay b
= someArrowCircuit >>> Put b id
= Put b id . someArrowCircuit
= Put b (id . someArrowCircuit)
= Put b someArrowCircuit
Are there any laws for such a class? If I made no mistake writing delay down, how does synchronous live alongside asynchronous?

The only law that I know of related to ArrowCircuit is actually for the similar ArrowInit class from Causal Commutative Arrows, which says that delay i *** delay j = delay (i,j). I'm pretty sure your version satisfies this (and it looks like a totally reasonable implementation), but it still feels a little strange considering that StreamProcessor isn't synchronous.
Particularly, synchronous circuits follow a pattern of a single input producing a single output. For example, if you have a Circuit a b and provide it a value of type a, then you will get one and only one output b. The "one-tick delay" that delay introduces is thus a delay of one output by one step.
But things are a little funky for asynchronous circuits. Let's consider an example:
runStreamProcessor :: StreamProcessor a b -> [a] -> [b]
runStreamProcessor (Put x s) xs = x : runStreamProcessor s xs
runStreamProcessor _ [] = []
runStreamProcessor Halt _ = []
runStreamProcessor (Get f) (x:xs) = runStreamProcessor (f x) xs
multiplyOneThroughFive :: StreamProcessor Int Int
multiplyOneThroughFive = Get $ \x ->
Put (x*1) $ Put (x*2) $ Put (x*3) $ Put (x*4) $ Put (x*5) multiplyOneThroughFive
Here, multiplyOneThroughFive produces 5 outputs for each input it receives. Now, consider the difference between multiplyOneThroughFive >>> delay 100 and delay 100 >>> multiplyOneThroughFive:
> runStreamProcessor (multiplyOneThroughFive >>> delay 100) [1,2]
[100,1,2,3,4,5,2,4,6,8,10]
> runStreamProcessor (delay 100 >>> multiplyOneThroughFive) [1,2]
[100,200,300,400,500,1,2,3,4,5,2,4,6,8,10]
Inserting the delay at a different point in the circuit actually caused us to produce a different number of results. Indeed, it seems as if the circuit as a whole underwent a 5-tick delay instead of just a 1-tick delay. This would definitely be unexpected behavior in a synchronous environment!

Related

Lazy mutual recursive evaluation of Polysemy expressions

NB: I don't think the problem here is with Polysemy per se.
The Goal:
I'm trying to write a single-thread simulation of multiple parties communicating synchronously. More specifically, the goal is to write the program that one party will run* in some effect system** and have a pure function that will simulate the multiple parties all running their programs together. The output of this simulation should be a representation of all the messages sent (or received) by each party***.
*All parties will run the same program.
**Polysemy is what I know.
***And each party's respective return value, although that's not really important.
Setup:
We can simplify a little and suppose there are only two parties. For this context I wrote a simple Pair functor; for the real system I expect to use fixed-length vectors.
The particular effects we're interested in are Input and Output.
output $ Pair a1 a2 means that "the party in question" sends a1 to party #1 and a2 to party #2.
Pair b1 b2 <- input means that "the party in question" receives b1 from party #1 and b2 from party #2.
These always have to come in pairs, with output first. In the real system we'd guarantee that by having a Communicate effect that got handled into Input and Output, but for now we'll just take it as an assumption.
I don't think it's possible to do the kind of parallel synchronous simulation I want entirely within the structure of a Polysemy handler (aka interpreter or runner). I'm pretty sure what I need is a function like
parallelEvaluation :: forall x a r.
(Sem r ([Pair x], a) -> Sem '[] ([Pair x], a)) ->
Pair (Sem (Output (Pair x) ': Input (Pair x) ': r) a) ->
Pair ([Pair x], a)
parallelEvaluation handleRest sems = ...
The first argument to parallelEvaluation is just a shim; I may not need it.
The Idea:
If we imagine the two parties running in parallel, we can clearly gather up their outputs as they send them; Polysemy has builtin handler just for that. That will give us exactly what we want: a list of all the messages sent.
The issue is what to feed into their inputs.
The idea is the use the results of parallelEvaluation recursively as the input-tape to the Input handler. If the whole function is "sufficiently" lazy, and our above assumption that output and input always appear together is sound, then this should be computable.
Polysemy is capable of the necessary laziness!
Specifically see here. doRealRecursion works fine; the only special thing I did to make it work was use the runLazyOutputList handler (which isn't Polysemy's default for space-efficiency reasons).
The Code:
Testing the pure-parallel-simulation:
import Polysemy (Members, Sem, reinterpret, run)
import Polysemy.Input (Input(Input), input)
import Polysemy.Output (Output, output, runLazyOutputList)
import Polysemy.State (get, put, runState)
import Pair (Pair(Pair),universe, (!))
runUnsafeInputList :: [i] -> Sem (Input i ': r) a -> Sem r a
-- Just like the normal PolySemy.Input.runInputList, except if there aren't enough inputs that's a runtime error!
runUnsafeInputList is = fmap snd . runState is . reinterpret (\case
Input -> do ~(s : ss) <- get
put ss
pure s
)
parallelEvaluation :: forall x a r.
(Sem r ([Pair x], a) -> Sem '[] ([Pair x], a)) ->
Pair (Sem (Output (Pair x) ': Input (Pair x) ': r) a) ->
Pair ([Pair x], a)
parallelEvaluation handleRest sems = run . handleRest <$> (runUnsafeInputList <$> os <*> (runLazyOutputList <$> sems))
where os :: Pair [Pair x]
os = fst <$> (parallelEvaluation handleRest sems)
testProgram :: forall r.
Members '[Input (Pair String), Output (Pair String)] r =>
Bool -> Sem r String
testProgram self = do output $ ((ownName ++ " says hi to ") ++) <$> parties
Pair m11 m12 <- input
let c1 = show $ (length m11) + (length m12)
output $ (++ ("; " ++ ownName ++ " got " ++ c1 ++ " characters last turn!")) <$> parties
Pair m21 m22 <- input
let c2 = show $ (length m21) + (length m22)
return $ ownName ++ "successfully got " ++ c2 ++ " characters in second round!"
where parties = Pair "Party1" "Party2"
ownName = parties ! self
doParallelRecursion :: IO ()
doParallelRecursion = do print "Attempting..."
let results = parallelEvaluation id $ testProgram <$> universe
print $ "Results: " ++ (show results)
The Pair helper:
data Pair a where
Pair :: a -> a -> Pair a
deriving (Read, Show, Eq)
instance Functor Pair where
fmap f (Pair a1 a2) = Pair (f a1) (f a2)
instance Applicative Pair where
pure a = Pair a a
(Pair f1 f2) <*> (Pair a1 a2) = Pair (f1 a1) (f2 a2)
(!) :: Pair a -> Bool -> a
(Pair a1 a2) ! b = if b then a2 else a1
universe :: Pair Bool
universe = Pair False True
The Problem:
As an executable, doParallelRecursion prints "Attempting..." (with a newline), and then hangs (seemingly forever).
From GHCI, it prints
"Attempting..."
"Results:
and then hangs on that line for several seconds before *** Exception: stack overflow.
I've attempted to use the debugger in GHCI to narrow down the problem. Assuming I'm using it correctly, the bodies of runUnsafeInputList and testProgram aren't ever being evaluated. runUnsafeInputList <$> os evaluates os, which immediately recurses.
I've tried swapping the order of the effects/handlers, this didn't affect the behavior at all. (I don't think it ought to matter, but the order presented here may be more intuitive.)

Conduit: Join two sources into one

I have two conduit sources A and B which I want to merge them into one yielding:
data Result = Left Int | Value Int | Right Int
merge :: Monad m => Source m Int -> Source m Int -> Source Result
merge a b = undefined
such as it:
Consumes values from both a and b
Performs some computation to produce Value Int
As a result of computation either a or b could have a leftover
When one of the sequences is exhausted the result source should keep producing either Left or Right values, (depending on which original source still has values) until both sources are exhausted
I tried implementing it with ZipSource such as:
getZipSource (ZipSource (a =$= CL.map Left) <* ZipSource (b =$= CL.map Right))
but I couldn't figure out how to make it alternate between sources (when I do two awaits) and how to deal with leftovers in a way that I described above.
I also looked at sequenceSources but it doesn't seem to help either.
Can something like that be built with Conduit?
A concrete example would be:
Have two (assumed sorted) Int sources
Get values from both and compare
Yield the min value, subtract it from the biggest and put the remaining back to its stream
Repeat.
The expected output would be:
runConduit $ merge (CL.sourceList [10, 20, 30]) (CL.sourceList [6, 4, 20]) $$ CL.take 10
Value 6 -- 10-6 = 4, 6 yielded, 4 goes back to "a"
Value 4 -- 4-4 = 0, both values are fully consumed
Value 20 -- 20-20 = 0, both values are fully consumed
Left 30 -- "b" has no values, but "a" still yielding
[UPDATE]
The best way I found so far is to write something similar to zipSources tweaking its internals as:
go (Done ()) (HaveOutput src close y) = HaveOutput (go (Done ()) src) close (Nothing, Just y)
go (HaveOutput src close x) (Done ()) = HaveOutput (go src (Done ())) close (Just x, Nothing)
Would it be the right way to go?

I ended up doing this way:
data MergedValue a v b = BackL a v | MergedValue v | BackR v b
data JoinResult a v b = LeftoverL a | JoinValue v | LeftoverR b
joinSources :: Monad m
=> (a -> b -> MergedValue a v b)
-> Source m a
-> Source m b
-> Source m (JoinResult a v b)
joinSources f as bs =
go (newResumableSource as) (newResumableSource bs)
where
go ras rbs = do
(ras', ma) <- lift $ ras $$++ await
(rbs', mb) <- lift $ rbs $$++ await
case (ma, mb) of
(Nothing, Nothing) -> pure ()
(Nothing, Just b) -> yield (LeftoverR b) >> go ras' rbs'
(Just a, Nothing) -> yield (LeftoverL a) >> go ras' rbs'
(Just a, Just b) -> case f a b of
BackL x v -> do
yield (JoinValue v)
(nxt, _) <- lift $ ras' $$++ leftover x
go nxt rbs'
BackR v x -> do
yield (JoinValue v)
(nxt, _) <- lift $ rbs' $$++ leftover x
go ras' nxt
MergedValue v -> yield (JoinValue v) >> go ras' rbs'

How to increment a variable in functional programming?

How do you increment a variable in a functional programming language?
For example, I want to do:
main :: IO ()
main = do
let i = 0
i = i + 1
print i
Expected output:
1

Simple way is to introduce shadowing of a variable name:
main :: IO () -- another way, simpler, specific to monads:
main = do main = do
let i = 0 let i = 0
let j = i i <- return (i+1)
let i = j+1 print i
print i -- because monadic bind is non-recursive
Prints 1.
Just writing let i = i+1 doesn't work because let in Haskell makes recursive definitions — it is actually Scheme's letrec. The i in the right-hand side of let i = i+1 refers to the i in its left hand side — not to the upper level i as might be intended. So we break that equation up by introducing another variable, j.
Another, simpler way is to use monadic bind, <- in the do-notation. This is possible because monadic bind is not recursive.
In both cases we introduce new variable under the same name, thus "shadowing" the old entity, i.e. making it no longer accessible.
How to "think functional"
One thing to understand here is that functional programming with pure — immutable — values (like we have in Haskell) forces us to make time explicit in our code.
In imperative setting time is implicit. We "change" our vars — but any change is sequential. We can never change what that var was a moment ago — only what it will be from now on.
In pure functional programming this is just made explicit. One of the simplest forms this can take is with using lists of values as records of sequential change in imperative programming. Even simpler is to use different variables altogether to represent different values of an entity at different points in time (cf. single assignment and static single assignment form, or SSA).
So instead of "changing" something that can't really be changed anyway, we make an augmented copy of it, and pass that around, using it in place of the old thing.

As a general rule, you don't (and you don't need to). However, in the interests of completeness.
import Data.IORef
main = do
i <- newIORef 0 -- new IORef i
modifyIORef i (+1) -- increase it by 1
readIORef i >>= print -- print it
However, any answer that says you need to use something like MVar, IORef, STRef etc. is wrong. There is a purely functional way to do this, which in this small rapidly written example doesn't really look very nice.
import Control.Monad.State
type Lens a b = ((a -> b -> a), (a -> b))
setL = fst
getL = snd
modifyL :: Lens a b -> a -> (b -> b) -> a
modifyL lens x f = setL lens x (f (getL lens x))
lensComp :: Lens b c -> Lens a b -> Lens a c
lensComp (set1, get1) (set2, get2) = -- Compose two lenses
(\s x -> set2 s (set1 (get2 s) x) -- Not needed here
, get1 . get2) -- But added for completeness
(+=) :: (Num b) => Lens a b -> Lens a b -> State a ()
x += y = do
s <- get
put (modifyL x s (+ (getL y s)))
swap :: Lens a b -> Lens a b -> State a ()
swap x y = do
s <- get
let x' = getL x s
let y' = getL y s
put (setL y (setL x s y') x')
nFibs :: Int -> Int
nFibs n = evalState (nFibs_ n) (0,1)
nFibs_ :: Int -> State (Int,Int) Int
nFibs_ 0 = fmap snd get -- The second Int is our result
nFibs_ n = do
x += y -- Add y to x
swap x y -- Swap them
nFibs_ (n-1) -- Repeat
where x = ((\(x,y) x' -> (x', y)), fst)
y = ((\(x,y) y' -> (x, y')), snd)

There are several solutions to translate imperative i=i+1 programming to functional programming. Recursive function solution is the recommended way in functional programming, creating a state is almost never what you want to do.
After a while you will learn that you can use [1..] if you need a index for example, but it takes a lot of time and practice to think functionally instead of imperatively.
Here's a other way to do something similar as i=i+1 not identical because there aren't any destructive updates. Note that the State monad example is just for illustration, you probably want [1..] instead:
module Count where
import Control.Monad.State
count :: Int -> Int
count c = c+1
count' :: State Int Int
count' = do
c <- get
put (c+1)
return (c+1)
main :: IO ()
main = do
-- purely functional, value-modifying (state-passing) way:
print $ count . count . count . count . count . count $ 0
-- purely functional, State Monad way
print $ (`evalState` 0) $ do {
count' ; count' ; count' ; count' ; count' ; count' }

Note: This is not an ideal answer but hey, sometimes it might be a little good to give anything at all.
A simple function to increase the variable would suffice.
For example:
incVal :: Integer -> Integer
incVal x = x + 1
main::IO()
main = do
let i = 1
print (incVal i)
Or even an anonymous function to do it.

Filtering / branching enumeratee

I am using enumerator-0.4.10, and I need to distribute processing of
different parts of the incoming stream to different iteratees (I am
parsing a huge XML file, and different sub-trees have different
processing logic). Only a single iteratee will be active at a time
since the sub-trees don't intersect.
I wrote a simple example that filters the stream and passes the result
to one iteratee; please see below. However, with multiple nested
iteratees it seems to me that I can no longer use an enumeratee. Do I
need to write my own multi-enumeratee that holds multiple inner
iteratees? Any better ideas?
Here is my (beginner's) code for a single nested iteratee:
module Main ( main ) where
import qualified Data.Enumerator as E ( Enumeratee, Step(..), Stream(..),
checkDone, checkDoneEx, continue, enumList, joinI, run_, yield )
import Data.Enumerator ( ($$), (>>==) )
import qualified Data.Enumerator.List as EL ( consume )
-- cribbed from EL.concatMap
concatMapAccum :: Monad m => (s -> ao -> (s, [ai])) -> s ->
E.Enumeratee ao ai m b
concatMapAccum f s0 = E.checkDone (E.continue . step s0)
where
step _ k E.EOF = E.yield (E.Continue k) E.EOF
step s k (E.Chunks xs) = loop s k xs
loop s k [] = E.continue (step s k)
loop s k (x:xs) = case f s x of
(s', ais) -> k (E.Chunks $ ais) >>==
E.checkDoneEx (E.Chunks xs) (\k' -> loop s' k' xs)
passFromTo :: Monad m => ((a -> Bool), (a -> Bool)) -> Bool -> E.Enumeratee a a m b
passFromTo (from, to) pass0 =
concatMapAccum updatePass pass0
where
updatePass pass el = case (pass, from el, to el) of
(True, _, to_el) -> (not to_el, [el])
(False, True, _) -> (True, [el])
(False, False, _) -> (False, [])
main :: IO()
main = do
E.run_ (E.enumList 3 [1..20] $$
E.joinI $ passFromTo ((\e -> e == 3 || e == 13), (\e -> e == 7 || e == 17)) False $$
EL.consume) >>= print
$ ./dist/build/StatefulEnumeratee/StatefulEnumeratee
[3,4,5,6,7,13,14,15,16,17]

Yes, you need an enumeratee that passes the stream to multiple iteratees, like Data.Iteratee.sequence_ and Data.Iteratee.Parallel.psequence_ from iteratee-0.8.6. sequence_ takes a list of iteratees to run simultaneously, and handles each input chunk by mapM across that list. psequence_ takes similar arguments, but runs each input iteratee in a separate forkIO thread.
There has been some discussion on haskell-cafe and the iteratee mailing lists about these over the past year, eg: http://www.haskell.org/pipermail/haskell-cafe/2011-January/088319.html The main thing to be careful about is handling errors from the inner iteratees: in your application, if one inner iteratee fails do you want to terminate all iteratees or just that one, and [how] do you want to propagate those errors.

Simplifying some Haskell code

So I'm working on a minimax implementation for a checkers-like game to help myself learn Haskell better. The function I'm having trouble with takes a list for game states, and generates the list of immediate successor game states. Like checkers, if a jump is available, the player must take it. If there's more than one, the player can choose.
For the most part, this works nicely with the list monad: loop over all the input game states, loop over all marbles that could be jumped, loop over all jumps of that marble. This list monad nicely flattens all the lists out into a simple list of states at the end.
The trick is that, if no jumps are found for a given game state, I need to return the current game state, rather than the empty list. The code below is the best way I've come up with of doing that, but it seems really ugly to me. Any suggestions on how to clean it up?
eHex :: Coord -> Coord -- Returns the coordinates immediately to the east on the board
nwHex :: Coord -> Coord -- Returns the coordinates immediately to the northwest on the board
generateJumpsIter :: [ZertzState] -> [ZertzState]
generateJumpsIter states = do
ws <- states
case children ws of
[] -> return ws
n#_ -> n
where
children ws#(ZertzState s1 s2 b p) = do
(c, color) <- occupiedCoords ws
(start, end) <- [(eHex, wHex), (wHex, eHex), (swHex, neHex),
(neHex, swHex), (nwHex, seHex), (seHex, nwHex)]
if (hexOccupied b $ start c) && (hexOpen b $ end c)
then case p of
1 -> return $ ZertzState (scoreMarble s1 color) s2
(jumpMarble (start c) c (end c) b) p
(-1) -> return $ ZertzState s1 (scoreMarble s2 color)
(jumpMarble (start c) c (end c) b) p
else []
EDIT: Provide example type signatures for the *Hex functions.

The trick is that, if no jumps are found for a given game state, I need to return the current game state, rather than the empty list.
Why? I've written minimax several times, and I can't imagine a use for such a function. Wouldn't you be better off with a function of type
nextStates :: [ZertzState] -> [Maybe [ZertzState]]
or
nextStates :: [ZertzState] -> [[ZertzState]]
However if you really want to return "either the list of next states, or if that list is empty, the original state", then the type you want is
nextStates :: [ZertzState] -> [Either ZertzState [ZertzState]]
which you can then flatten easily enough.
As to how to implement, I recommend defining a helper function of type
[ZertzState] -> [(ZertzState, [ZertzState])]
and than you can map
(\(start, succs) -> if null succs then Left start else Right succs)
over the result, plus various other things.
As Fred Brooks said (paraphrasing), once you get the types right, the code practically writes itself.

Don't abuse monads notation for list, it's so heavy for nothing. Moreover you can use list comprehension in the same fashion :
do x <- [1..3]
y <- [2..5] <=> [ x + y | x <- [1..3], y <- [2..5] ]
return x + y
now for the 'simplification'
listOfHex :: [(Coord -> Coord,Coord -> Coord)]
listOfHex = [ (eHex, wHex), (wHex, eHex), (swHex, neHex)
, (neHex, swHex), (nwHex, seHex), (seHex, nwHex)]
generateJumpsIter :: [ZertzState] -> [ZertzState]
generateJumpsIter states =
[if null ws then ws else children ws | ws <- states]
where -- I named it foo because I don t know what it do....
foo True 1 = ZertzState (scoreMarble s1 color) s2
(jumpMarble (start c) c (end c) b) p
foo True (-1) = ZertzState s1 (scoreMarble s2 color)
(jumpMarble (start c) c (end c) b) p
foo False _ = []
foo _ _ = error "Bleh"
children ws#(ZertzState s1 s2 b p) =
[ foo (valid c hex) p | (c, _) <- occupiedCoords ws, hex <- listOfHex ]
where valid c (start, end) =
(hexOccupied b $ start c) && (hexOpen b $ end c)
The let in the let in list commprehension at the top bother me a little, but as I don't have all the code, I don't really know how to do it in an other way. If you can modify more in depth, I suggest you to use more combinators (map, foldr, foldl' etc) as they really reduce code size in my experience.
Note, the code is not tested, and may not compile.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

An ArrowCircuit instance for stream processors which could block - haskell

Related

Lazy mutual recursive evaluation of Polysemy expressions

Conduit: Join two sources into one

How to increment a variable in functional programming?

Filtering / branching enumeratee

Simplifying some Haskell code

Categories

Resources