Finite State Transducers in Haskell? - haskell

I've been wondering if there is a way to define and work with finite state transducers in Haskell in an idiomatic way.
You can approach FSTs as generators (producing output over an alphabet {x1,x2}), as recognizers (accepting an input over {x1,x2} if it belongs to the rational relation), or as translators (reading an input tape and translating it into an output tape). Would the representation change depending on the approach?
Would it also be possible to model an FST in a way that lets you produce one by specifying rewriting rules? E.g. creating a DSL to model rewriting rules, and then a function createFST :: [Rule] -> FST.
The closest I could find is Kmett, Bjarnason and Cough's machines library:
https://hackage.haskell.org/package/machines
But I can't see how to model an FST with a Machine. I suppose the correct way of doing it would be similar to how they define Moore and Mealy machines: define an FST as a separate entity, but provide an instance of Automaton so it can be used as a machine.
I found some other options too, but they define it in a straightforward way (like in https://hackage.haskell.org/package/fst ). That doesn't convince me much, as I wonder if there's a better way to do so idiomatically using the strengths of Haskell's type system (like how Moore and Mealy machines are defined in the machines library).

A Mealy machine alternately reads an a from a stream of inputs a and outputs a b to a stream of outputs. It reads first and then outputs once after each read.
newtype Mealy a b = Mealy { runMealy :: a -> (b, Mealy a b) }
A Moore machine alternately outputs a b to a stream of outputs and reads an input a from a stream of inputs. It starts with an output of b and then reads once after each output.
data Moore a b = Moore b (a -> Moore a b)
An FST either reads from its input, writes to its output, or stops. It can read as many times in a row as it wants or write as many times in a row as it wants.
data FST a b
  = Read (a -> FST a b)
  | Write (b, FST a b)
  | Stop
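For intuition, here is a minimal sketch of interpreting such an FST against a list of inputs (runFST and duplicate are illustrative names, not from any library):

runFST :: FST a b -> [a] -> [b]
runFST Stop           _      = []
runFST (Write (b, k)) as     = b : runFST k as
runFST (Read f)       (a:as) = runFST (f a) as
runFST (Read _)       []     = []   -- input ran out while waiting to read

-- A transducer that writes each input twice before reading again.
duplicate :: FST a a
duplicate = Read (\a -> Write (a, Write (a, duplicate)))

-- runFST duplicate [1, 2]  ==  [1, 1, 2, 2]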
The equivalent of an FST from machines is Process. Its definition is a little spread out. To simplify the discussion we are going to forget about Process for now and explore it from the inside out.
The base functor
To describe what a Process is, we're going to first notice a pattern in all three machines so far. Each of them recursively refers to itself for "what to do next". We are going to replace "what to do next" with any type next. The Mealy machine, while mapping an input to an output, also provides the next machine to run.
newtype MealyF a b next = MealyF { runMealyF :: a -> (b, next) }
The Moore machine, after outputting and requesting an input, figures out the next machine to run.
data MooreF a b next = MooreF b (a -> next)
We can write the FST the same way. When we Read from the input we'll figure out what to do next depending on the input. When we Write to the output we'll also provide what to do next after outputting. When we Stop there's nothing to do next.
data FSTF a b next
  = Read (a -> next)
  | Write (b, next)
  | Stop
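As a side note (not part of the original answer), the recursive FST can be recovered from this base functor by tying the knot with a generic fixed-point wrapper; a small sketch:

newtype Fix f = Fix (f (Fix f))

-- FST a b is (isomorphic to) the fixed point of its base functor.
type FST' a b = Fix (FSTF a b)

stop' :: FST' a b
stop' = Fix Stop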
This pattern of eliminating explicit recursion shows up repeatedly in Haskell code, and is usually called a "base functor". In the machines package the base functor is Step. Compared to our code, Step has renamed the type variable for the output to o, what to do next to r, reading to Await, and writing to Yield.
data Step k o r
  = forall t. Await (t -> r) (k t) r
  | Yield o r
  | Stop
Awaiting is a little more complicated than Read because a Machine can read from multiple sources. For Processes that can only read from a single source, k is Is applied to a specific type, which is a proof the second type Is the first type. For a Process reading inputs a, k will be Is a.
data Step (Is a) o r
  = forall t. Await (t -> r) (Is a t) r
  | Yield o r
  | Stop
The existential quantification forall t. is an implementation detail for dealing with Sources. After witnessing that a ~ t this becomes.
data Step (Is a) o r
  = forall t ~ a. Await (t -> r) Refl r
  | Yield o r
  | Stop
If we unify t with a and remove the Refl constructor which is always the same, this looks like our FSTF.
data Step (Is a) o r
  = Await (a -> r) r
  | Yield o r
  | Stop
The extra r for what to do next in Await is what to do next when there's no more input.
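With that simplification in mind, converting our FST into a machines process is mostly mechanical. A sketch, assuming the Step and Is constructors and the encased helper exported by Data.Machine (the leftover FST simply stops when input runs out):

import qualified Data.Machine as M

toProcess :: Monad m => FST a b -> M.ProcessT m a b
toProcess Stop           = M.encased M.Stop
toProcess (Write (b, k)) = M.encased (M.Yield b (toProcess k))
toProcess (Read f)       = M.encased (M.Await (toProcess . f) M.Refl (M.encased M.Stop))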
The machine transformer `MachineT`
The machine transformer, MachineT, makes Step look almost like our FST. It says, "A machine operating over some monad m is what to do in that monad to get the next Step. The next thing to do after each step is another MachineT."
newtype MachineT m k o = MachineT { runMachineT :: m (Step k o (MachineT m k o)) }
Overall, specialized for our types, this looks like
newtype MachineT m (Is a) o =
  MachineT m (
      Await (a -> MachineT m (Is a) o) (MachineT m (Is a) o)
    | Yield o (MachineT m (Is a) o)
    | Stop
  )
Machine is a pure MachineT.
type Machine k o = forall m. Monad m => MachineT m k o
Universal quantification over all Monads m is another way of saying a computation doesn't need anything from an underlying Monad. This can be seen by substituting Identity for m.
type Machine k o =
  MachineT Identity (
      Await (a -> MachineT Identity k o) (MachineT Identity k o)
    | Yield o (MachineT Identity k o)
    | Stop
  )
Processes
A Process or ProcessT is a Machine or MachineT that only reads a single type of input a, Is a.
type Process a b = Machine (Is a) b
type ProcessT m a b = MachineT m (Is a) b
A Process has the following structure after removing all the intermediate constructors that are always the same. This structure is exactly the same as our FST, except it has an added "what to do next" in the case that there's no more input.
type Process a b =
    Await (a -> Process a b) (Process a b)
  | Yield b (Process a b)
  | Stop
The ProcessT variant has an m wrapped around it so that it can act in the monad at each step.
Process models state transducers.
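As a usage sketch (building on the toProcess and duplicate sketches above, and assuming source, ~> and run from machines):

example :: [Int]
example = M.run (M.source [1, 2, 3] M.~> toProcess duplicate)
-- example == [1, 1, 2, 2, 3, 3]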

Related

Generic Pattern in Haskell

I was hoping that Haskell's compiler would understand that f v is type-safe given UnFold v f (although that is a tall order).
data Sequence a = FirstThen a (Sequence a) | Repeating a | UnFold b (b -> b) (b -> a)
Is there some way that I can encapsulate a generic pattern for a datatype without adding extra type parameters?
(I am aware of a solution for this specific case using lazy maps but I am after a more general solution)
You can use existential quantification to get there:
data Sequence a = ... | forall b. UnFold b (b -> b) (b -> a)
I'm not sure this buys you much over the simpler solution of storing the result of unfolding directly, though:
data Stream a = Cons a (Stream a)
data Sequence' a = ... | Explicit (Stream a)
In particular, if somebody hands you a Sequence a, you can't pattern match on the b's contained within, even if you think you know what type they are.
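For concreteness, a sketch of what "storing the result of unfolding directly" could look like, given the Stream type above (toStream is an illustrative helper, not from a library):

-- Run the unfold up front, producing the (infinite) stream of results.
toStream :: b -> (b -> b) -> (b -> a) -> Stream a
toStream seed step out = Cons (out seed) (toStream (step seed) step out)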

What is an indexed monad?

What is an indexed monad, and what is the motivation for it?
I have read that it helps to keep track of side effects, but the type signature and documentation don't lead me anywhere.
What would be an example of how it can help to keep track of side effects (or any other valid example)?
As ever, the terminology people use is not entirely consistent. There's a variety of inspired-by-monads-but-strictly-speaking-isn't-quite notions. The term "indexed monad" is one of a number (including "monadish" and "parameterised monad" (Atkey's name for them)) of terms used to characterize one such notion. (Another such notion, if you're interested, is Katsumata's "parametric effect monad", indexed by a monoid, where return is indexed neutrally and bind accumulates in its index.)
First of all, let's check kinds.
IxMonad (m :: state -> state -> * -> *)
That is, the type of a "computation" (or "action", if you prefer, but I'll stick with "computation"), looks like
m before after value
where before, after :: state and value :: *. The idea is to capture the means to interact safely with an external system that has some predictable notion of state. A computation's type tells you what the state must be before it runs, what the state will be after it runs and (like with regular monads over *) what type of values the computation produces.
The usual bits and pieces are *-wise like a monad and state-wise like playing dominoes.
ireturn :: a -> m i i a        -- returning a pure value preserves state
ibind   :: m i j a ->          -- we can go from i to j and get an a, thence
           (a -> m j k b)      -- we can go from j to k and get a b, therefore
           -> m i k b          -- we can indeed go from i to k and get a b
The notion of "Kleisli arrow" (function which yields computation) thus generated is
a -> m i j b -- values a in, b out; state transition i to j
and we get a composition
icomp :: IxMonad m => (b -> m j k c) -> (a -> m i j b) -> a -> m i k c
icomp f g = \ a -> ibind (g a) f
and, as ever, the laws exactly ensure that ireturn and icomp give us a category
ireturn `icomp` g = g
f `icomp` ireturn = f
(f `icomp` g) `icomp` h = f `icomp` (g `icomp` h)
or, in comedy fake C/Java/whatever,
g(); skip = g()
skip; f() = f()
{h(); g()}; f() = h(); {g(); f()}
Why bother? To model "rules" of interaction. For example, you can't eject a dvd if there isn't one in the drive, and you can't put a dvd into the drive if there's one already in it. So
data DVDDrive :: Bool -> Bool -> * -> * where  -- Bool is "drive full?"
  DReturn :: a -> DVDDrive i i a
  DInsert :: DVD ->                    -- you have a DVD
             DVDDrive True k a ->      -- you know how to continue full
             DVDDrive False k a        -- so you can insert from empty
  DEject  :: (DVD ->                   -- once you receive a DVD
              DVDDrive False k a) ->   -- you know how to continue empty
             DVDDrive True k a         -- so you can eject when full
instance IxMonad DVDDrive where  -- put these methods where they need to go
  ireturn = DReturn              -- so this goes somewhere else
  ibind (DReturn a) k     = k a
  ibind (DInsert dvd j) k = DInsert dvd (ibind j k)
  ibind (DEject j) k      = DEject $ \ dvd -> ibind (j dvd) k
With this in place, we can define the "primitive" commands
dInsert :: DVD -> DVDDrive False True ()
dInsert dvd = DInsert dvd $ DReturn ()
dEject :: DVDDrive True False DVD
dEject = DEject $ \ dvd -> DReturn dvd
from which others are assembled with ireturn and ibind. Now, I can write (borrowing do-notation)
discSwap :: DVD -> DVDDrive True True DVD
discSwap dvd = do dvd' <- dEject; dInsert dvd ; ireturn dvd'
but not the physically impossible
discSwap :: DVD -> DVDDrive True True DVD
discSwap dvd = do dInsert dvd; dEject -- ouch!
Alternatively, one can define one's primitive commands directly
data DVDCommand :: Bool -> Bool -> * -> * where
  InsertC :: DVD -> DVDCommand False True ()
  EjectC  :: DVDCommand True False DVD
and then instantiate the generic template
data CommandIxMonad :: (state -> state -> * -> *) ->
                       state -> state -> * -> * where
  CReturn :: a -> CommandIxMonad c i i a
  (:?)    :: c i j a -> (a -> CommandIxMonad c j k b) ->
             CommandIxMonad c i k b
instance IxMonad (CommandIxMonad c) where
  ireturn = CReturn
  ibind (CReturn a) k = k a
  ibind (c :? j) k    = c :? \ a -> ibind (j a) k
In effect, we've said what the primitive Kleisli arrows are (what one "domino" is), then built a suitable notion of "computation sequence" over them.
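For instance, a small sketch (cmd and discSwapC are hypothetical names, not from the answer) of lifting one primitive command into the generic sequence and rebuilding discSwap on top of it:

-- Lift a single primitive command: one "domino".
cmd :: c i j a -> CommandIxMonad c i j a
cmd c = c :? CReturn

discSwapC :: DVD -> CommandIxMonad DVDCommand True True DVD
discSwapC dvd =
  cmd EjectC        `ibind` \ dvd' ->
  cmd (InsertC dvd) `ibind` \ () ->
  ireturn dvd'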
Note that for every indexed monad m, the "no change diagonal" m i i is a monad, but in general, m i j is not. Moreover, values are not indexed but computations are indexed, so an indexed monad is not just the usual idea of monad instantiated for some other category.
Now, look again at the type of a Kleisli arrow
a -> m i j b
We know we must be in state i to start, and we predict that any continuation will start from state j. We know a lot about this system! This isn't a risky operation! When we put the dvd in the drive, it goes in! The dvd drive doesn't get any say in what the state is after each command.
But that's not true in general, when interacting with the world. Sometimes you might need to give away some control and let the world do what it likes. For example, if you are a server, you might offer your client a choice, and your session state will depend on what they choose. The server's "offer choice" operation does not determine the resulting state, but the server should be able to carry on anyway. It's not a "primitive command" in the above sense, so indexed monads are not such a good tool to model the unpredictable scenario.
What's a better tool?
type f :-> g = forall state. f state -> g state
class MonadIx (m :: (state -> *) -> (state -> *)) where
  returnIx   :: x :-> m x
  flipBindIx :: (a :-> m b) -> (m a :-> m b)  -- tidier than bindIx
Scary biscuits? Not really, for two reasons. One, it looks rather more like what a monad is, because it is a monad, but over (state -> *) rather than *. Two, if you look at the type of a Kleisli arrow,
a :-> m b = forall state. a state -> m b state
you get the type of computations with a precondition a and postcondition b, just like in Good Old Hoare Logic. Assertions in program logics have taken under half a century to cross the Curry-Howard correspondence and become Haskell types. The type of returnIx says "you can achieve any postcondition which holds, just by doing nothing", which is the Hoare Logic rule for "skip". The corresponding composition is the Hoare Logic rule for ";".
Let's finish by looking at the type of bindIx, putting all the quantifiers in.
bindIx :: forall i. m a i -> (forall j. a j -> m b j) -> m b i
These foralls have opposite polarity. We choose initial state i, and a computation which can start at i, with postcondition a. The world chooses any intermediate state j it likes, but it must give us the evidence that postcondition b holds, and from any such state, we can carry on to make b hold. So, in sequence, we can achieve condition b from state i. By releasing our grip on the "after" states, we can model unpredictable computations.
Both IxMonad and MonadIx are useful. Both model validity of interactive computations with respect to changing state, predictable and unpredictable, respectively. Predictability is valuable when you can get it, but unpredictability is sometimes a fact of life. Hopefully, then, this answer gives some indication of what indexed monads are, predicting both when they start to be useful and when they stop.
There are at least three ways to define an indexed monad that I know.
I'll refer to these options as indexed monads à la X, where X ranges over the computer scientists Bob Atkey, Conor McBride and Dominic Orchard, as that is how I tend to think of them. Parts of these constructions have a much longer more illustrious history and nicer interpretations through category theory, but I first learned of them associated with these names, and I'm trying to keep this answer from getting too esoteric.
Atkey
Bob Atkey's style of indexed monad is to work with 2 extra parameters to deal with the index of the monad.
With that you get the definitions folks have tossed around in other answers:
class IMonad m where
  ireturn :: a -> m i i a
  ibind   :: m i j a -> (a -> m j k b) -> m i k b
We can also define indexed comonads à la Atkey as well. I actually get a lot of mileage out of those in the lens codebase.
McBride
The next form of indexed monad is Conor McBride's definition from his paper "Kleisli Arrows of Outrageous Fortune". He instead uses a single parameter for the index. This makes the indexed monad definition have a rather clever shape.
If we define a natural transformation using parametricity as follows
type a ~> b = forall i. a i -> b i
then we can write down McBride's definition as
class IMonad m where
  ireturn :: a ~> m a
  ibind   :: (a ~> m b) -> (m a ~> m b)
This feels quite different from Atkey's, but it feels more like a normal Monad: instead of building a monad on (m :: * -> *), we build it on (m :: (k -> *) -> (k -> *)).
Interestingly you can actually recover Atkey's style of indexed monad from McBride's by using a clever data type, which McBride in his inimitable style chooses to say you should read as "at key".
data (:=) a i j where
  V :: a -> (a := i) i
Now you can work out that
ireturn :: IMonad m => (a := j) ~> m (a := j)
which expands to
ireturn :: IMonad m => (a := j) i -> m (a := j) i
can only be invoked when j = i, and then a careful reading of ibind can get you back the same as Atkey's ibind. You need to pass around these (:=) data structures, but they recover the power of the Atkey presentation.
On the other hand, the Atkey presentation isn't strong enough to recover all uses of McBride's version. Power has been strictly gained.
Another nice thing is that McBride's indexed monad is clearly a monad, it is just a monad on a different functor category. It works over endofunctors on the category of functors from (k -> *) to (k -> *) rather than the category of functors from * to *.
A fun exercise is figuring out how to do the McBride to Atkey conversion for indexed comonads. I personally use a data type 'At' for the "at key" construction in McBride's paper. I actually walked up to Bob Atkey at ICFP 2013 and mentioned that I'd turned him inside out and made him into a "Coat". He seemed visibly disturbed. The line played out better in my head. =)
Orchard
Finally, a third far-less-commonly-referenced claimant to the name of "indexed monad" is due to Dominic Orchard, where he instead uses a type level monoid to smash together indices. Rather than go through the details of the construction, I'll simply link to this talk:
https://github.com/dorchard/effect-monad/blob/master/docs/ixmonad-fita14.pdf
As a simple scenario, assume you have a state monad. The state type is a complex large one, yet all these states can be partitioned into two sets: red and blue states. Some operations in this monad make sense only if the current state is a blue state. Among these, some will keep the state blue (blueToBlue), while others will make the state red (blueToRed). In a regular monad, we could write
blueToRed :: State S ()
blueToBlue :: State S ()
foo :: State S ()
foo = do blueToRed
         blueToBlue
triggering a runtime error since the second action expects a blue state. We would like to prevent this statically. Indexed monad fulfills this goal:
data Red
data Blue
-- assume a new indexed State monad
blueToRed :: State S Blue Red ()
blueToBlue :: State S Blue Blue ()
foo :: State S ?? ?? ()
foo = blueToRed `ibind` \_ ->
      blueToBlue  -- type error
A type error is triggered because the second index of blueToRed (Red) differs from the first index of blueToBlue (Blue).
As another example, with indexed monads you can allow a state monad to change the type for its state, e.g. you could have
data State old new a = State (old -> (new, a))
You could use the above to build a state which is a statically-typed heterogeneous stack. Operations would have type
push :: a -> State old (a,old) ()
pop :: State (a,new) new a
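Sketch implementations for that State shape (following the field order in data State old new a = State (old -> (new, a)) above):

push :: a -> State old (a, old) ()
push a = State $ \ old -> ((a, old), ())

pop :: State (a, new) new a
pop = State $ \ (a, rest) -> (rest, a)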
As another example, suppose you want a restricted IO monad which does not
allow file access. You could use e.g.
openFile :: IO any FilesAccessed ()
newIORef :: a -> IO any any (IORef a)
-- no operation of type :: IO any NoAccess _
In this way, an action having type IO ... NoAccess () is statically guaranteed to be file-access-free, while an action of type IO ... FilesAccessed () can access files. Having an indexed monad means you don't have to build a separate type for the restricted IO, which would otherwise require duplicating every non-file-related function in both IO types.
An indexed monad isn't a specific monad like, for example, the state monad but a sort of generalization of the monad concept with extra type parameters.
Whereas a "standard" monadic value has the type Monad m => m a a value in an indexed monad would be IndexedMonad m => m i j a where i and j are index types so that i is the type of the index at the beginning of the monadic computation and j at the end of the computation. In a way, you can think of i as a sort of input type and j as the output type.
Using State as an example, a stateful computation State s a maintains a state of type s throughout the computation and returns a result of type a. An indexed version, IndexedState i j a, is a stateful computation where the state can change to a different type during the computation. The initial state has the type i and state and the end of the computation has the type j.
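A minimal sketch of such an indexed state type (illustrative names, not the API of any particular package):

newtype IndexedState i j a = IndexedState { runIndexedState :: i -> (a, j) }

-- Swap the state for a value of a (possibly) different type, changing the index.
iput :: j -> IndexedState i j ()
iput j = IndexedState $ \ _ -> ((), j)

iget :: IndexedState i i i
iget = IndexedState $ \ i -> (i, i)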
Using an indexed monad over a normal monad is rarely necessary but it can be used in some cases to encode stricter static guarantees.
It may be useful to look at how indexing is used in dependently typed languages (e.g. in Agda). That can explain how indexing helps in general; the experience then translates to monads.
Indexing makes it possible to establish relationships between particular instances of types. You can then reason about some values to establish whether that relationship holds.
For example (in Agda) you can specify that some natural numbers are related by _<_, and the type tells which numbers they are. Then you can require that some function is given a witness that m < n, because only then does the function work correctly; without providing such a witness the program will not compile.
As another example, given enough perseverance and compiler support for your chosen language, you could encode in the types that a function assumes a certain list is sorted.
Indexed monads allow you to encode some of what dependent type systems do, in order to manage side effects more precisely.

How to understand the abstraction of RST, Lensed and LensT in Snaplet?

I have been reading the source of Snap recently, which is great, but when I moved on to read the Snaplet Handler source, I got stuck at the abstraction of RST, Lensed and LensT.
newtype RST r s m a = RST { runRST :: r -> s -> m (a, s) }
newtype LensT b v s m a = LensT (RST (Lens b v) s m a)
newtype Handler b v a = Handler (LensT (Snaplet b) (Snaplet v) (Snaplet b) Snap a)
Now LensT changed to Lensed
newtype Lensed b v m a = Lensed { unlensed :: ALens' b v -> v -> b -> m (a, v, b) }
And the Snaplet design document says: "We switched to a slightly more specialized monad formulation called Lensed that avoids traversal of the whole state hierarchy when the state is manipulated."
I feel like there is a gap between the implementation of Snap and the Snaplet Handler, and the key is RST, LensT and Lensed. Is there any reference documentation to help me out?
TL;DR - There's no gap. The Handler definition you pasted is out of date. The current one uses Lensed.
Long answer: We don't have any documentation on this because it is a low-level implementation detail--i.e. all of it is completely hidden from the end user. Carl is right that RST is just RWST minus the W, but let's do some deeper investigation. Using the types that you show above, we'll substitute the definition of RST into the LensT definition. This gives us the following substitutions:
r = Lens b v
s = s
m = m
a = a
With that we can easily write the expanded LensT definition:
newtype LensT b v s m a = LensT { unlensT :: Lens b v -> s -> m (a, s) }
Compare that to Lensed:
newtype Lensed b v m a = Lensed { unlensed :: ALens' b v -> v -> b -> m (a, v, b) }
If we assume that Lens b v and ALens' b v are interchangeable (which, conceptually speaking, they are), then you can see this equivalence:
Lensed b v m a = LensT b v (b,v) m a
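Spelling that equivalence out (a sketch, treating Lens b v and ALens' b v as the same thing; the only work is shuffling the pair):

import Control.Lens (ALens')

-- One direction of the isomorphism between the two expanded shapes.
toLensed :: Functor m
         => (ALens' b v -> (b, v) -> m (a, (b, v)))  -- expanded LensT b v (b, v) m a
         -> (ALens' b v -> v -> b -> m (a, v, b))    -- expanded Lensed b v m a
toLensed f l v b = fmap (\ (a, (b', v')) -> (a, v', b')) (f l (b, v))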
Now we see the crux of the issue. LensT is a more general construct than Lensed. LensT allows you to choose your s arbitrarily. Lensed fixes s to be completely determined by b and v. Now that we understand the difference, the question is how are these two constructs actually used in Snap. A quick grep through Types.hs shows us that Handler uses Lensed and Initializer uses LensT. (A side note: the definition you give for Handler is not the one we're currently using.) Here are the important parts of the definitions.
Handler b v a = Handler (L.Lensed (Snaplet b) (Snaplet v) ...)
Initializer b v a = Initializer (LT.LensT (Snaplet b) (Snaplet v) (InitializerState b)...)
Initializer uses the more general LensT construction because it needs s to be InitializerState, which contains extra information not related to b and v. In fact, the whole point of Initializer's existence is to facilitate the construction of a b that will be used as Handler's initial state. Initializer runs when your application starts up, but Handler is what your application runs in. We wanted Handler to be as efficient as possible, so we created Lensed to be optimized for exactly the needs of Handler. You could argue that was premature optimization, but since someone else did it for me, I wasn't going to say no.
You might wonder why we even have Lensed and LensT at all. They're only used in one place so we could just substitute their definitions into Handler and Initializer respectively. That's because we didn't have Lensed at the very beginning. Both Handler and Initializer were written in terms of LensT, and therefore it was a perfectly reasonable abstraction to eliminate duplicated code. Lensed came later and since they are all newtypes anyway, those layers of abstraction impose zero runtime cost.

The "reader" monad

OK, so the writer monad allows you to write stuff to [usually] some kind of container, and get that container back at the end. In most implementations, the "container" can actually be any monoid.
Now, there is also a "reader" monad. This, you might think, would offer the dual operation - incrementally reading from some kind of container, one item at a time. In fact, this is not the functionality that the usual reader monad provides. (Instead, it merely offers easy access to a semi-global constant.)
To actually write a monad which is dual to the usual writer monad, we would need some kind of structure which is dual to a monoid.
Does anybody have any idea what this dual structure might be?
Has anybody written this monad? Is there a well-known name for it?
The dual of a monoid is a comonoid. Recall that a monoid is defined as (something isomorphic to)
class Monoid m where
  create  :: () -> m
  combine :: (m,m) -> m
with these laws
combine (create (),x) = x
combine (x,create ()) = x
combine (combine (x,y),z) = combine (x,combine (y,z))
thus
class Comonoid m where
  delete :: m -> ()
  split  :: m -> (m,m)
some standard operations are needed
first :: (a -> b) -> (a,c) -> (b,c)
second :: (c -> d) -> (a,c) -> (a,d)
idL :: ((),x) -> x
idR :: (x,()) -> x
assoc :: ((x,y),z) -> (x,(y,z))
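These are just the evident functions on pairs; sketch implementations:

first  :: (a -> b) -> (a,c) -> (b,c)
first  f (a, c) = (f a, c)

second :: (c -> d) -> (a,c) -> (a,d)
second g (a, c) = (a, g c)

idL :: ((),x) -> x
idL (_, x) = x

idR :: (x,()) -> x
idR (x, _) = x

assoc :: ((x,y),z) -> (x,(y,z))
assoc ((x, y), z) = (x, (y, z))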
with laws like
idL $ first delete $ (split x) = x
idR $ second delete $ (split x) = x
assoc $ first split (split x) = second split (split x)
This typeclass looks weird for a reason. It has an instance
instance Comonoid m where
  split x  = (x,x)
  delete x = ()
in Haskell, this is the only instance. We can recast reader as the exact dual of writer, but since there is only one instance for comonoid, we get something isomorphic to the standard reader type.
Having all types be comonoids is what makes the category "Cartesian" in "Cartesian Closed Category." "Monoidal Closed Categories" are like CCCs but without this property, and are related to substructural type systems. Part of the appeal of linear logic is the increased symmetry of which this is an example. Meanwhile, having substructural types allows you to define comonoids with more interesting properties (supporting things like resource management). In fact, this provides a framework for understanding the role of copy constructors and destructors in C++ (although C++ does not enforce the important properties because of the existence of pointers).
EDIT: Reader from comonoids
newtype Reader r x = Reader {runReader :: r -> x}
forget :: Comonoid m => (m,a) -> a
forget = idL . first delete
instance Comonoid r => Monad (Reader r) where
  return x = Reader $ \ r -> forget (r, x)
  m >>= f  = Reader $ \ r -> let (r1, r2) = split r
                             in runReader (f (runReader m r1)) r2
ask :: Comonoid r => Reader r r
ask = Reader id
note that in the above code every variable is used exactly once after binding (so these would all type with linear types). The monad law proofs are trivial, and only require the comonoid laws to work. Hence, Reader really is dual to Writer.
I'm not entirely sure what the dual of a monoid should be, but thinking of the dual (probably incorrectly) as the opposite of something (simply on the basis that a Comonad is the dual of a Monad, with all the same operations but the opposite way round), rather than basing it on mappend and mempty I would base it on:
fold :: (Foldable f, Monoid m) => f m -> m
If we specialise f to a list here, we get:
fold :: Monoid m => [m] -> m
This seems to me to contain all of the monoid class; in particular:
mempty == fold []
mappend x y == fold [x, y]
So, then I guess the dual of this different monoid class would be:
unfold :: (Comonoid m) => m -> [m]
This is a lot like the monoid factorial class that I have seen on Hackage.
So on this basis, I think the 'reader' monad you describe would be a supply monad. The supply monad is effectively a state transformer over a list of values, so that at any point we can choose to be supplied with an item from the list. In this case, the list would be the result of unfold.
I should stress, I am no Haskell expert, nor an expert theoretician. But this is what your description made me think of.
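A minimal sketch of such a list-backed supply (illustrative names, not the API of the supply or value-supply packages):

-- State over a list of values: taking an item consumes it from the supply.
newtype SupplyL s a = SupplyL { runSupplyL :: [s] -> (a, [s]) }

-- Take the next item (partial: assumes the supply never runs dry).
next :: SupplyL s s
next = SupplyL $ \ (s:ss) -> (s, ss)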
Supply is based on State, which makes it suboptimal for some applications. For example, we might want to make an infinite tree of supplied values (e.g. randoms):
tree :: (Something r) => Supply r (Tree r)
tree = Branch <$> supply <*> sequenceA [tree, tree]
But since Supply is based on State, all the labels will be bottom except for the ones on the leftmost path down the tree.
You need something splittable (like in Philip JF's Comonoid answer above). But there is a problem if you try to make this into a Monad:
newtype Supply r a = Supply { runSupply :: r -> a }
instance (Splittable r) => Monad (Supply r) where
  return = Supply . const
  Supply m >>= f = Supply $ \ r ->
    let (r', r'') = split r
    in runSupply (f (m r')) r''
The monad laws require f >>= return = f, which means that r'' = r in the definition of (>>=). But the monad laws also require that return x >>= f = f x, so r' = r as well. Thus, for Supply to be a monad, split x = (x,x), and you've got the regular old Reader back again.
A lot of monads that are used in Haskell aren't real monads -- i.e. they only satisfy the laws up to some equivalence relation. E.g. many nondeterminism monads will give results in a different order if you transform according to the laws. But that's okay, that's still monad enough if you're just wondering whether a particular element appears in the list of outputs, rather than where.
If you allow Supply to be a monad up to some equivalence relation, then you can get nontrivial splits. E.g. value-supply will construct splittable entities which will dole out unique labels from a list in an unspecified order (using unsafe* magic) -- so a supply monad of value supply would be a monad up to permutation of labels. This is all that is needed for many applications. And, in fact, there is a function
runSupply :: (forall r. Eq r => Supply r a) -> a
which abstracts over this equivalence relation to give a well-defined pure interface, because the only thing it allows you to do to labels is to see if they are equal, and that doesn't change if you permute them. If this runSupply is the only observation you allow on Supply, then Supply on a supply of unique labels is a real monad.

How do you use the latest version (0.8.1.2 at time of writing) of the iteratee library?

I've read tutorials on the iteratee and enumerator concepts, and have implemented a sample version as a way of learning how they work. However, the types used in the iteratee package are very different than any of the tutorials I have found. For example, Iteratee is defined as:
newtype Iteratee s m a = Iteratee {
  runIter :: forall r.
             (a -> Stream s -> m r) ->
             ((Stream s -> Iteratee s m a) -> Maybe SomeException -> m r) ->
             m r
  }
I really don't understand what I am meant to do with that. Are there any tutorials on using this version, and on why it was written this way (i.e. what benefits this has over the original way Oleg did it)?
Disclaimer: I'm the current maintainer of iteratee.
You may find some of the files in the iteratee Examples directory useful for understanding how to use the library; word.hs is probably the easiest to follow.
Basically, users shouldn't need to use runIter unless they're creating custom enumeratees. Iteratees can be created by combining the provided primitives, and also with the liftI, idone, and icont functions, enumerated over, and then run with run or tryRun.
Oleg has two "original" versions and a CPS version (and possibly others too). The original versions are both in http://okmij.org/ftp/Haskell/Iteratee/IterateeM.hs. The first is the actual code, and the second is in the comments. The first requires some special functions, >>== and $$, in place of the usual >>= and $. The second can use the standard functions, but unfortunately it's very difficult to reason about monadic ordering with this version. There are a few other drawbacks as well. The CPS version avoids all of these issues, which is why I switched over iteratee. I also find that iteratees defined in this style are shorter and more readable. Unfortunately I'm not aware of any tutorials specific to the CPS version, however Oleg's comments may be useful.
Disclaimer: I don't know the iteratee library much and have never used it, so take my answer with a grain of salt.
This definition is equivalent to Oleg's (more precisely, it's a CPS-style sum), with a twist: it guarantees that accessing the iteratee always returns a monadic value.
Here is Oleg's definition:
data Iteratee el m a
  = IE_done a
  | IE_cont (Maybe ErrMsg)
            (Stream el -> m (Iteratee el m a, Stream el))
So it's a sum: either IE_done a, meaning we're done and give a as a result, or IE_cont (Maybe ErrMsg) (Stream el -> ...), a way to continue the iteration given another chunk of input, possibly with an error (in which case continuing the continuation amounts to restarting the computation).
It is well-known that Either a b is equivalent to forall r. (a -> r) -> (b -> r) -> r: giving you either a or b is equivalent to promising you that, for any result r and whatever transform of a and transform of b you may come up with, I will be able to produce such an r (to do that I must have an a or a b). In a sense, Either a b introduces a piece of data, and forall r. (a -> r) -> (b -> r) -> r eliminates this data: if such a function were named case_ab, then case_ab (\a -> foo_a) (\b -> foo_b) is equivalent to the pattern match case ab of { Left a -> foo_a; Right b -> foo_b } for some ab :: Either a b.
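Written out as actual code, this generic equivalence looks as follows (a sketch, not specific to iteratee; requires RankNTypes):

{-# LANGUAGE RankNTypes #-}

type EitherCPS a b = forall r. (a -> r) -> (b -> r) -> r

toCPS :: Either a b -> EitherCPS a b
toCPS (Left  a) = \ onLeft _onRight -> onLeft  a
toCPS (Right b) = \ _onLeft onRight -> onRight b

fromCPS :: EitherCPS a b -> Either a b
fromCPS case_ab = case_ab Left Right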
So here is the continuation-style equivalent of Oleg's definition (we talk of continuations here because (a -> r) represents "what will happen to the value once we know it's an a"):
data Iteratee el m a =
  forall r.
    (a -> r) ->
    (Maybe ErrMsg -> (Stream el -> m (Iteratee el m a, Stream el)) -> r) ->
    r
But there is a twist in the iteratee definition (modulo some innocuous
currying): the result is not r but m r: in a sense, we force the
result of pattern matching on our iteratee to always live in the monad
m.
data Iteratee el m a =
  forall r.
    (a -> m r) ->
    (Maybe ErrMsg -> (Stream el -> m (Iteratee el m a, Stream el)) -> m r) ->
    m r
Finally, notice that the "continuing the iteration" data in Oleg's definition is Stream .. -> m (Iteratee .., Stream ..), while in the iteratee package it's only Stream -> Iteratee. I assume they have removed the monad here because they enforce it at the outer level (if you apply the iteratee, you are forced to live in the monad, so why also force subsequent computations to live in the monad?). I don't know why there is no Stream output anymore; I suppose it means that those Iteratees have to consume all the input when it is available (or encode a "not finished yet" logic in the return type a). Perhaps this is for efficiency reasons.
