How to understand the abstraction of RST, Lensed and LensT in Snaplet? - haskell

I am reading the source of Snap recently, which is great, but when I moved on to read the Snaplet Handler source, I got stuck on the abstraction of RST, Lensed and LensT.
newtype RST r s m a = RST { runRST :: r -> s -> m (a, s) }
newtype LensT b v s m a = LensT (RST (Lens b v) s m a)
newtype Handler b v a = Handler (LensT (Snaplet b) (Snaplet v) (Snaplet b) Snap a)
Now LensT has changed to Lensed:
newtype Lensed b v m a = Lensed { unlensed :: ALens' b v -> v -> b -> m (a, v, b) }
And the Snaplet design document says: "We switched to a slightly more specialized monad formulation called Lensed that avoids traversal of the whole state hierarchy when the state is manipulated."
I feel like there is a gap between the implementation of Snap and the Snaplet Handler, and the key is RST, LensT and Lensed. Is there any reference documentation to help me out?

TL;DR - There's no gap. The Handler definition you pasted is out of date. The current one uses Lensed.
Long answer: We don't have any documentation on this because it is a low-level implementation detail, i.e. all of it is completely hidden from the end user. Carl is right that RST is just RWST minus the W, but let's do some deeper investigation. Using the types that you show above, we'll substitute the definition of RST into the LensT definition. This gives us the following substitutions:
r = Lens b v
s = s
m = m
a = a
With that we can easily write the expanded LensT definition:
newtype LensT b v s m a = LensT { unlensT :: Lens b v -> s -> m (a, s) }
Compare that to Lensed:
newtype Lensed b v m a = Lensed { unlensed :: ALens' b v -> v -> b -> m (a, v, b) }
If we assume that Lens b v and ALens' b v are interchangeable (which, conceptually speaking, they are), then you can see this equivalence:
Lensed b v m a = LensT b v (b,v) m a
Now we see the crux of the issue: LensT is a more general construct than Lensed. LensT allows you to choose your s arbitrarily, while Lensed fixes s to be completely determined by b and v. Now that we understand the difference, the question is how these two constructs are actually used in Snap. A quick grep through Types.hs shows us that Handler uses Lensed and Initializer uses LensT. (A side note: the definition you give for Handler is not the one we're currently using.) Here are the important parts of the definitions.
Handler b v a = Handler (L.Lensed (Snaplet b) (Snaplet v) ...)
Initializer b v a = Initializer (LT.LensT (Snaplet b) (Snaplet v) (InitializerState b)...)
Initializer uses the more general LensT construction because it needs s to be InitializerState, which contains extra information not related to b and v. In fact, the whole point of Initializer's existence is to facilitate the construction of a b that will be used as Handler's initial state. Initializer runs when your application starts up, but Handler is what your application runs in. We wanted Handler to be as efficient as possible, so we created Lensed to be optimized for exactly the needs of Handler. You could argue that was premature optimization, but since someone else did it for me, I wasn't going to say no.
You might wonder why we even have Lensed and LensT at all, since each is only used in one place and we could just substitute their definitions into Handler and Initializer respectively. The reason is historical: we didn't have Lensed at the very beginning. Both Handler and Initializer were written in terms of LensT, so it was a perfectly reasonable abstraction that eliminated duplicated code. Lensed came later, and since they are all newtypes anyway, these layers of abstraction impose zero runtime cost.
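Carl's observation that RST is RWST minus the W can be made concrete with a hedged sketch of the instances (this is illustrative, not the actual Snap source):

```haskell
-- A sketch of the Monad instance for RST, showing that it is RWST with
-- the writer component removed: the environment r is passed through
-- unchanged, and the state s is threaded.
newtype RST r s m a = RST { runRST :: r -> s -> m (a, s) }

instance Monad m => Functor (RST r s m) where
  fmap f (RST g) = RST $ \r s -> do
    (a, s') <- g r s
    return (f a, s')

instance Monad m => Applicative (RST r s m) where
  pure a = RST $ \_ s -> return (a, s)
  RST mf <*> RST ma = RST $ \r s -> do
    (f, s')  <- mf r s
    (a, s'') <- ma r s'
    return (f a, s'')

instance Monad m => Monad (RST r s m) where
  return = pure
  RST g >>= k = RST $ \r s -> do
    (a, s') <- g r s        -- thread the state, like State
    runRST (k a) r s'       -- pass the environment along unchanged, like Reader
```

Binding passes r unchanged to both sides, exactly like ReaderT, while s is threaded through, exactly like StateT; there is simply no writer output to accumulate.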

Related

What's the difference between Monad.Reader and the (->) monads?

I learned that Monad.Reader is actually an encapsulation of a function, namely:
newtype Reader r a = Reader { runReader :: r -> a }
Which is made an instance of Monad,
instance Monad (Reader r) where
  return a = Reader $ \_ -> a
  m >>= k  = Reader $ \r -> runReader (k (runReader m r)) r
In contrast, I knew that (->) is also a Monad,
instance Monad ((->) r) where
  return = const
  f >>= k = \ r -> k (f r) r
From the definitions it's easy to see that they behave exactly the same.
So are they interchangeable in all usages? And what's the actual significance of distinguishing these two monads?
TL;DR
They are the same.
Some history lessons
State, Writer and Reader were inspired by Mark P. Jones' Functional Programming with Overloading and Higher-Order Polymorphism, where he defined Reader as follows:
A Reader monad is used to allow a computation to access the values held
in some enclosing environment (represented by the type r in the following
definitions).
> instance Monad (r->) where
>   result x = \r -> x
>   x `bind` f = \r -> f (x r) r
As a passing comment, it is interesting to note that these two functions are
just the standard K and S combinators of combinatory logic.
Later, he defines (almost) today's MonadReader:
Reader monads : A class of monads for describing computations that consult some fixed environment:
> class Monad m => ReaderMonad m r where
>   env :: r -> m a -> m a
>   getenv :: m r
> instance ReaderMonad (r->) r where
>   env e c = \_ -> c e
>   getenv = id
getenv is simply ask, and env is local . const. Therefore, this definition already contained all significant parts of a Reader. Ultimately, Jones defines the monad transformer ReaderT (BComp is backward composition):
To begin with, it is useful to define two different forms of composition; forwards (FComp) and backwards (BComp):
> data FComp m n a = FC (n (m a))
> data BComp m n a = BC (m (n a))
[omitting Functor, Monad and OutOf instances]
> type ReaderT r = BComp (r ->)
Since StateT, WriterT, and others had their non-transformer variant, it was only logical to have a Reader r, which really is the same as (->) r.
Either way, nowadays Reader, Writer and State are defined in terms of their transformer variant, and you use their respective Monad* typeclass (MonadReader).
Conclusion
So are they interchangeable in all usages?
Yes.
And what's the actual significance of distinguishing these two monads?
None, except that ReaderT is actually a monad transformer, which makes things easier.
They are both instances of the MonadReader class, so yes, you can use one instead of the other. They are in fact exactly the same.
We can make this more formal by mapping between them:
toArrow :: Reader r a -> r -> a and toReader :: (r -> a) -> Reader r a
with implementations toReader = Reader and toArrow = runReader.
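To make the interchangeability concrete, here is a small self-contained sketch (greetR and greetA are my own illustrative names) expressing the same computation in both monads:

```haskell
-- The Reader definition from the question, plus its instances, so the
-- sketch stands alone; (->) r already has a Monad instance in base.
newtype Reader r a = Reader { runReader :: r -> a }

instance Functor (Reader r) where
  fmap f (Reader g) = Reader (f . g)

instance Applicative (Reader r) where
  pure a = Reader (const a)
  Reader f <*> Reader g = Reader (\r -> f r (g r))

instance Monad (Reader r) where
  return = pure
  m >>= k = Reader $ \r -> runReader (k (runReader m r)) r

-- The mappings between the two monads:
toArrow :: Reader r a -> (r -> a)
toArrow = runReader

toReader :: (r -> a) -> Reader r a
toReader = Reader

-- The same computation, once in Reader and once in (->):
greetR :: Reader String String
greetR = do name <- Reader id        -- "ask" for the environment
            return ("Hello, " ++ name)

greetA :: String -> String
greetA = do name <- id               -- (->) String is also a Monad
            return ("Hello, " ++ name)
```

Both produce the same result for any input, and toReader/toArrow convert between them without changing behaviour.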
Edit: The semantics behind a Reader is that it holds some read-only configuration which you can thread through your chain of computations.
You should always prefer a Reader over the plain arrow type when you want to thread some configuration information, because it is part of a very generic interface that provides useful helper functions, a MonadReader class for manipulating Reader-like data types, as well as a ReaderT for stacking monads.

What is indexed monad?

What is indexed monad and the motivation for this monad?
I have read that it helps to keep track of the side effects. But the type signature and documentation doesn't lead me to anywhere.
What would be an example of how it can help to keep track of side effects (or any other valid example)?
As ever, the terminology people use is not entirely consistent. There's a variety of inspired-by-monads-but-strictly-speaking-isn't-quite notions. The term "indexed monad" is one of a number (including "monadish" and "parameterised monad" (Atkey's name for them)) of terms used to characterize one such notion. (Another such notion, if you're interested, is Katsumata's "parametric effect monad", indexed by a monoid, where return is indexed neutrally and bind accumulates in its index.)
First of all, let's check kinds.
IxMonad (m :: state -> state -> * -> *)
That is, the type of a "computation" (or "action", if you prefer, but I'll stick with "computation"), looks like
m before after value
where before, after :: state and value :: *. The idea is to capture the means to interact safely with an external system that has some predictable notion of state. A computation's type tells you what the state must be before it runs, what the state will be after it runs and (like with regular monads over *) what type of values the computation produces.
The usual bits and pieces are *-wise like a monad and state-wise like playing dominoes.
ireturn :: a -> m i i a        -- returning a pure value preserves state

ibind :: m i j a ->            -- we can go from i to j and get an a, thence
         (a -> m j k b) ->     -- we can go from j to k and get a b, therefore
         m i k b               -- we can indeed go from i to k and get a b
The notion of "Kleisli arrow" (function which yields computation) thus generated is
a -> m i j b -- values a in, b out; state transition i to j
and we get a composition
icomp :: IxMonad m => (b -> m j k c) -> (a -> m i j b) -> a -> m i k c
icomp f g = \ a -> ibind (g a) f
and, as ever, the laws exactly ensure that ireturn and icomp give us a category
ireturn `icomp` g = g
f `icomp` ireturn = f
(f `icomp` g) `icomp` h = f `icomp` (g `icomp` h)
or, in comedy fake C/Java/whatever,
g(); skip = g()
skip; f() = f()
{h(); g()}; f() = h(); {g(); f()}
Why bother? To model "rules" of interaction. For example, you can't eject a dvd if there isn't one in the drive, and you can't put a dvd into the drive if there's one already in it. So
data DVDDrive :: Bool -> Bool -> * -> * where  -- Bool is "drive full?"
  DReturn :: a -> DVDDrive i i a
  DInsert :: DVD ->                  -- you have a DVD
             DVDDrive True k a ->    -- you know how to continue full
             DVDDrive False k a      -- so you can insert from empty
  DEject  :: (DVD ->                 -- once you receive a DVD
              DVDDrive False k a) -> -- you know how to continue empty
             DVDDrive True k a       -- so you can eject when full
instance IxMonad DVDDrive where  -- put these methods where they need to go
  ireturn = DReturn              -- so this goes somewhere else
  ibind (DReturn a) k     = k a
  ibind (DInsert dvd j) k = DInsert dvd (ibind j k)
  ibind (DEject j) k      = DEject $ \ dvd -> ibind (j dvd) k
With this in place, we can define the "primitive" commands
dInsert :: DVD -> DVDDrive False True ()
dInsert dvd = DInsert dvd $ DReturn ()
dEject :: DVDDrive True False DVD
dEject = DEject $ \ dvd -> DReturn dvd
from which others are assembled with ireturn and ibind. Now, I can write (borrowing do-notation)
discSwap :: DVD -> DVDDrive True True DVD
discSwap dvd = do dvd' <- dEject; dInsert dvd ; ireturn dvd'
but not the physically impossible
discSwap :: DVD -> DVDDrive True True DVD
discSwap dvd = do dInsert dvd; dEject -- ouch!
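Putting the pieces together, here is a self-contained runnable sketch of the drive example (the concrete DVD type, the Drive singleton and the runDrive interpreter are my own additions; do-notation is replaced with explicit ibind, since indexed do-notation would need rebindable syntax):

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- A placeholder DVD type of my own, just to have something to insert.
newtype DVD = DVD String deriving (Eq, Show)

-- Specialized to Bool indices for this sketch.
class IxMonad (m :: Bool -> Bool -> * -> *) where
  ireturn :: a -> m i i a
  ibind   :: m i j a -> (a -> m j k b) -> m i k b

data DVDDrive :: Bool -> Bool -> * -> * where  -- Bool is "drive full?"
  DReturn :: a -> DVDDrive i i a
  DInsert :: DVD -> DVDDrive True k a -> DVDDrive False k a
  DEject  :: (DVD -> DVDDrive False k a) -> DVDDrive True k a

instance IxMonad DVDDrive where
  ireturn = DReturn
  ibind (DReturn a) k     = k a
  ibind (DInsert dvd j) k = DInsert dvd (ibind j k)
  ibind (DEject j) k      = DEject $ \dvd -> ibind (j dvd) k

dInsert :: DVD -> DVDDrive False True ()
dInsert dvd = DInsert dvd (DReturn ())

dEject :: DVDDrive True False DVD
dEject = DEject DReturn

-- discSwap with explicit ibind instead of do-notation:
discSwap :: DVD -> DVDDrive True True DVD
discSwap dvd = dEject `ibind` \dvd' ->
               dInsert dvd `ibind` \_ ->
               ireturn dvd'

-- A singleton tracking whether the drive is full lets us run programs:
data Drive (full :: Bool) where
  DriveFull  :: DVD -> Drive True
  DriveEmpty :: Drive False

runDrive :: DVDDrive i j a -> Drive i -> (a, Drive j)
runDrive (DReturn a) d              = (a, d)
runDrive (DInsert dvd k) DriveEmpty = runDrive k (DriveFull dvd)
runDrive (DEject k) (DriveFull d)   = runDrive (k d) DriveEmpty
```

Running discSwap on a full drive returns the old disc and leaves the new one in the drive; the impossible version simply does not type-check.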
Alternatively, one can define one's primitive commands directly
data DVDCommand :: Bool -> Bool -> * -> * where
  InsertC :: DVD -> DVDCommand False True ()
  EjectC  :: DVDCommand True False DVD
and then instantiate the generic template
data CommandIxMonad :: (state -> state -> * -> *) ->
                       state -> state -> * -> * where
  CReturn :: a -> CommandIxMonad c i i a
  (:?)    :: c i j a -> (a -> CommandIxMonad c j k b) ->
             CommandIxMonad c i k b
instance IxMonad (CommandIxMonad c) where
  ireturn = CReturn
  ibind (CReturn a) k = k a
  ibind (c :? j) k    = c :? \ a -> ibind (j a) k
In effect, we've said what the primitive Kleisli arrows are (what one "domino" is), then built a suitable notion of "computation sequence" over them.
Note that for every indexed monad m, the "no change diagonal" m i i is a monad, but in general, m i j is not. Moreover, values are not indexed but computations are indexed, so an indexed monad is not just the usual idea of monad instantiated for some other category.
Now, look again at the type of a Kleisli arrow
a -> m i j b
We know we must be in state i to start, and we predict that any continuation will start from state j. We know a lot about this system! This isn't a risky operation! When we put the dvd in the drive, it goes in! The dvd drive doesn't get any say in what the state is after each command.
But that's not true in general, when interacting with the world. Sometimes you might need to give away some control and let the world do what it likes. For example, if you are a server, you might offer your client a choice, and your session state will depend on what they choose. The server's "offer choice" operation does not determine the resulting state, but the server should be able to carry on anyway. It's not a "primitive command" in the above sense, so indexed monads are not such a good tool to model the unpredictable scenario.
What's a better tool?
type f :-> g = forall state. f state -> g state
class MonadIx (m :: (state -> *) -> (state -> *)) where
  returnIx   :: x :-> m x
  flipBindIx :: (a :-> m b) -> (m a :-> m b)  -- tidier than bindIx
Scary biscuits? Not really, for two reasons. One, it looks rather more like what a monad is, because it is a monad, but over (state -> *) rather than *. Two, if you look at the type of a Kleisli arrow,
a :-> m b = forall state. a state -> m b state
you get the type of computations with a precondition a and postcondition b, just like in Good Old Hoare Logic. Assertions in program logics have taken under half a century to cross the Curry-Howard correspondence and become Haskell types. The type of returnIx says "you can achieve any postcondition which holds, just by doing nothing", which is the Hoare Logic rule for "skip". The corresponding composition is the Hoare Logic rule for ";".
Let's finish by looking at the type of bindIx, putting all the quantifiers in.
bindIx :: forall i. m a i -> (forall j. a j -> m b j) -> m b i
These foralls have opposite polarity. We choose initial state i, and a computation which can start at i, with postcondition a. The world chooses any intermediate state j it likes, but it must give us the evidence that postcondition b holds, and from any such state, we can carry on to make b hold. So, in sequence, we can achieve condition b from state i. By releasing our grip on the "after" states, we can model unpredictable computations.
Both IxMonad and MonadIx are useful. Both model validity of interactive computations with respect to changing state, predictable and unpredictable, respectively. Predictability is valuable when you can get it, but unpredictability is sometimes a fact of life. Hopefully, then, this answer gives some indication of what indexed monads are, predicting both when they start to be useful and when they stop.
There are at least three ways to define an indexed monad that I know.
I'll refer to these options as indexed monads à la X, where X ranges over the computer scientists Bob Atkey, Conor McBride and Dominic Orchard, as that is how I tend to think of them. Parts of these constructions have a much longer more illustrious history and nicer interpretations through category theory, but I first learned of them associated with these names, and I'm trying to keep this answer from getting too esoteric.
Atkey
Bob Atkey's style of indexed monad is to work with 2 extra parameters to deal with the index of the monad.
With that you get the definitions folks have tossed around in other answers:
class IMonad m where
  ireturn :: a -> m i i a
  ibind :: m i j a -> (a -> m j k b) -> m i k b
We can also define indexed comonads à la Atkey as well. I actually get a lot of mileage out of those in the lens codebase.
McBride
The next form of indexed monad is Conor McBride's definition from his paper "Kleisli Arrows of Outrageous Fortune". He instead uses a single parameter for the index. This makes the indexed monad definition have a rather clever shape.
If we define a natural transformation using parametricity as follows
type a ~> b = forall i. a i -> b i
then we can write down McBride's definition as
class IMonad m where
  ireturn :: a ~> m a
  ibind :: (a ~> m b) -> (m a ~> m b)
This feels quite different from Atkey's, but it feels more like a normal Monad; instead of building a monad on (m :: * -> *), we build it on (m :: (k -> *) -> (k -> *)).
Interestingly you can actually recover Atkey's style of indexed monad from McBride's by using a clever data type, which McBride in his inimitable style chooses to say you should read as "at key".
data (:=) a i j where
  V :: a -> (a := i) i
Now you can work out that
ireturn :: IMonad m => (a := j) ~> m (a := j)
which expands to
ireturn :: IMonad m => (a := j) i -> m (a := j) i
can only be invoked when j = i, and then a careful reading of ibind can get you back the same as Atkey's ibind. You need to pass around these (:=) data structures, but they recover the power of the Atkey presentation.
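As a concrete, hedged sketch of that recovery, here is a self-contained version; I've renamed the McBride-style class IMonadM to avoid clashing with the Atkey-style IMonad above, and the trivial identity instance IdIx is my own, added only to exercise the plumbing:

```haskell
{-# LANGUAGE RankNTypes, GADTs, TypeOperators, PolyKinds #-}

type a ~> b = forall i. a i -> b i

-- McBride-style indexed monad (renamed to avoid a clash).
class IMonadM m where
  ireturnM :: a ~> m a
  ibindM   :: (a ~> m b) -> (m a ~> m b)

-- The "at key" type: an a, available only at index i.
data (:=) a i j where
  V :: a -> (a := i) i

-- Pattern matching on V refines the index to j, recovering Atkey-style
-- return and bind from the McBride-style operations.
ireturnA :: IMonadM m => a -> m (a := j) j
ireturnA a = ireturnM (V a)

atBind :: IMonadM m => (a -> m b j) -> ((a := j) ~> m b)
atBind k (V a) = k a

ibindA :: IMonadM m
       => m (a := j) i -> (a -> m (b := l) j) -> m (b := l) i
ibindA m k = ibindM (atBind k) m

-- A trivial instance of my own, just to check the plumbing.
newtype IdIx a i = IdIx (a i)

instance IMonadM IdIx where
  ireturnM = IdIx
  ibindM f (IdIx a) = f a

runIdIx :: IdIx (a := j) j -> a
runIdIx (IdIx (V a)) = a
```

The GADT match in atBind is exactly the "careful reading" mentioned above: the V pattern forces the ambient index to equal the key, which is what gives ibindA its Atkey-shaped i/j/k chaining.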
On the other hand, the Atkey presentation isn't strong enough to recover all uses of McBride's version. Power has been strictly gained.
Another nice thing is that McBride's indexed monad is clearly a monad, it is just a monad on a different functor category. It works over endofunctors on the category of functors from (k -> *) to (k -> *) rather than the category of functors from * to *.
A fun exercise is figuring out how to do the McBride to Atkey conversion for indexed comonads. I personally use a data type 'At' for the "at key" construction in McBride's paper. I actually walked up to Bob Atkey at ICFP 2013 and mentioned that I'd turned him inside out and made him into a "Coat". He seemed visibly disturbed. The line played out better in my head. =)
Orchard
Finally, a third far-less-commonly-referenced claimant to the name of "indexed monad" is due to Dominic Orchard, where he instead uses a type level monoid to smash together indices. Rather than go through the details of the construction, I'll simply link to this talk:
https://github.com/dorchard/effect-monad/blob/master/docs/ixmonad-fita14.pdf
As a simple scenario, assume you have a state monad. The state type is a complex large one, yet all these states can be partitioned into two sets: red and blue states. Some operations in this monad make sense only if the current state is a blue state. Among these, some will keep the state blue (blueToBlue), while others will make the state red (blueToRed). In a regular monad, we could write
blueToRed :: State S ()
blueToBlue :: State S ()
foo :: State S ()
foo = do blueToRed
         blueToBlue
triggering a runtime error since the second action expects a blue state. We would like to prevent this statically. Indexed monad fulfills this goal:
data Red
data Blue
-- assume a new indexed State monad
blueToRed :: State S Blue Red ()
blueToBlue :: State S Blue Blue ()
foo :: State S ?? ?? ()
foo = blueToRed `ibind` \_ ->
      blueToBlue -- type error
A type error is triggered because the second index of blueToRed (Red) differs from the first index of blueToBlue (Blue).
As another example, with indexed monads you can allow a state monad to change the type for its state, e.g. you could have
data State old new a = State (old -> (new, a))
You could use the above to build a state which is a statically-typed heterogeneous stack. Operations would have type
push :: a -> State old (a,old) ()
pop :: State (a,new) new a
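Here is a minimal runnable sketch of such an index-changing state monad with the push and pop operations above (the IState name and these definitions are illustrative, not from a particular library):

```haskell
-- An indexed state monad whose state type may change during the
-- computation, following the `State old new a` shape above.
newtype IState old new a = IState { runIState :: old -> (new, a) }

ireturn :: a -> IState s s a
ireturn a = IState $ \s -> (s, a)

ibind :: IState i j a -> (a -> IState j k b) -> IState i k b
ibind (IState f) g = IState $ \i ->
  let (j, a) = f i in runIState (g a) j

-- A statically typed heterogeneous stack:
push :: a -> IState old (a, old) ()
push a = IState $ \old -> ((a, old), ())

pop :: IState (a, new) new a
pop = IState $ \(a, new) -> (new, a)
```

For example, push (1 :: Int) `ibind` \_ -> push "hi" `ibind` \_ -> pop takes the state from () to (Int, ()) while returning "hi"; popping an empty stack is a type error, not a runtime crash.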
As another example, suppose you want a restricted IO monad which does not
allow file access. You could use e.g.
openFile :: IO any FilesAccessed ()
newIORef :: a -> IO any any (IORef a)
-- no operation of type :: IO any NoAccess _
In this way, an action having type IO ... NoAccess () is statically guaranteed to be file-access-free. Instead, an action of type IO ... FilesAccessed () can access files. Having an indexed monad would mean you don't have to build a separate type for the restricted IO, which would require to duplicate every non-file-related function in both IO types.
An indexed monad isn't a specific monad like, for example, the state monad but a sort of generalization of the monad concept with extra type parameters.
Whereas a "standard" monadic value has the type Monad m => m a, a value in an indexed monad would be IndexedMonad m => m i j a, where i and j are index types: i is the type of the index at the beginning of the monadic computation and j at the end of the computation. In a way, you can think of i as a sort of input type and j as the output type.
Using State as an example, a stateful computation State s a maintains a state of type s throughout the computation and returns a result of type a. An indexed version, IndexedState i j a, is a stateful computation where the state can change to a different type during the computation: the initial state has the type i and the state at the end of the computation has the type j.
Using an indexed monad over a normal monad is rarely necessary but it can be used in some cases to encode stricter static guarantees.
It may be important to take a look how indexing is used in dependent types (eg in agda). This can explain how indexing helps in general, then translate this experience to monads.
Indexing permits to establish relationships between particular instances of types. Then you can reason about some values to establish whether that relationship holds.
For example (in agda) you can specify that some natural numbers are related with _<_, and the type tells which numbers they are. Then you can require that some function is given a witness that m < n, because only then the function works correctly - and without providing such witness the program will not compile.
As another example, given enough perseverance and compiler support for your chosen language, you could encode that the function assumes that a certain list is sorted.
Indexed monads permit to encode some of what dependent type systems do, to manage side effects more precisely.

Finite State Transducers in Haskell?

I've been wondering if there is a way to define and work with finite state transducers in Haskell in an idiomatic way.
You can approach FSTs as generators (generating an output of type {x1,x2}), as recognizers (given an input of type {x1,x2}, recognizing whether it belongs to the rational relation), or as translators (given an input tape, translating it into an output tape). Would the representation change depending on the approach?
Would it also be possible to model a FST in a way that you can produce one by specifying rewriting rules? E.g creating a DSL to model rewriting rules, and then creating a function createFST :: [Rule] -> FST.
The closest I could find is Kmett, Bjarnason and Cough's machines library:
https://hackage.haskell.org/package/machines
But I can't seem to realize how to model a FST with a Machine. I'd suppose that the correct way of doing it would be similar to how they define Moore and Mealy machines: define a FST as a different entity, but provide an instance of Automaton to be able to use it as a machine.
I found some other options too, but they define it in a straightforward way (like in https://hackage.haskell.org/package/fst ). That doesn't convince me much, as I wonder if there's a better way to do so idiomatically using the strengths of Haskell's type system (like how Moore and Mealy machines are defined in the machines library).
A Mealy machine alternately reads an a from a stream of inputs a and outputs a b to a stream of outputs. It reads first and then outputs once after each read.
newtype Mealy a b = Mealy { runMealy :: a -> (b, Mealy a b) }
A Moore machine alternately outputs a b to a stream of outputs and reads an input a from a stream of inputs. It starts with an output of b and then reads once after each output.
data Moore a b = Moore b (a -> Moore a b)
An FST either reads from its input, writes to its output, or stops. It can read as many times in a row as it wants or write as many times in a row as it wants.
data FST a b
  = Read (a -> FST a b)
  | Write (b, FST a b)
  | Stop
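To see the FST type in action, here is a hedged sketch of an interpreter for it (runFST and the double example are my own names, not from any library):

```haskell
-- The FST type from above, plus an interpreter sketch.
data FST a b
  = Read (a -> FST a b)
  | Write (b, FST a b)
  | Stop

-- Feed the machine a list of inputs and collect everything it writes,
-- stopping at Stop or when input runs out during a Read.
runFST :: FST a b -> [a] -> [b]
runFST (Read k)       (a:as) = runFST (k a) as
runFST (Read _)       []     = []
runFST (Write (b, k)) as     = b : runFST k as
runFST Stop           _      = []

-- A transducer that doubles every input before emitting it:
double :: FST Int Int
double = Read $ \a -> Write (2 * a, double)
```

Running double over [1,2,3] yields [2,4,6]; a recognizer or generator would be written against the same type, just driving Read and Write differently.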
The equivalent of an FST from machines is Process. Its definition is a little spread out. To simplify the discussion, we are going to forget about Process for now and explore it from the inside out.
The base functor
To describe what a Process is, we're going to first notice a pattern in all three machines so far. Each of them recursively refers to itself for "what to do next". We are going to replace "what to do next" with any type next. The Mealy machine, while mapping an input to an output, also provides the next machine to run.
newtype MealyF a b next = MealyF { runMealyF :: a -> (b, next) }
The Moore machine, after outputting and requesting an input, figures out the next machine to run.
data MooreF a b next = MooreF b (a -> next)
We can write the FST the same way. When we Read from the input we'll figure out what to do next depending on the input. When we Write to the output we'll also provide what to do next after outputting. When we Stop there's nothing to do next.
data FSTF a b next
  = Read (a -> next)
  | Write (b, next)
  | Stop
This pattern of eliminating explicit recursion shows up repeatedly in Haskell code, and is usually called a "base functor". In the machines package the base functor is Step. Compared to our code, Step has renamed the type variable for the output to o, what to do next to r, reading to Await, and writing to Yield.
data Step k o r
  = forall t. Await (t -> r) (k t) r
  | Yield o r
  | Stop
Awaiting is a little more complicated than Read because a Machine can read from multiple sources. For Processes that can only read from a single source, k is Is applied to a specific type, which is a proof the second type Is the first type. For a Process reading inputs a, k will be Is a.
data Step (Is a) o r
  = forall t. Await (t -> r) (Is a t) r
  | Yield o r
  | Stop
The existential quantification forall t. is an implementation detail for dealing with Sources. After witnessing that a ~ t, this becomes:
data Step (Is a) o r
  = forall t ~ a. Await (t -> r) Refl r
  | Yield o r
  | Stop
If we unify t with a and remove the Refl constructor, which is always the same, this looks like our FSTF:
data Step (Is a) o r
  = Await (a -> r) r
  | Yield o r
  | Stop
The extra r for what to do next in Await is what to do next when there's no more input.
The machine transformer `MachineT`
The machine transformer, MachineT, makes Step look almost like our FST. It says, "A machine operating over some monad m is what to do in that monad to get the next Step. The next thing to do after each step is another MachineT."
newtype MachineT m k o = MachineT { runMachineT :: m (Step k o (MachineT m k o)) }
Overall, specialized for our types, this looks like
newtype MachineT m (Is a) o =
  MachineT m (
      Await (a -> MachineT m (Is a) o) (MachineT m (Is a) o)
    | Yield o (MachineT m (Is a) o)
    | Stop
  )
Machine is a pure MachineT.
type Machine k o = forall m. Monad m => MachineT m k o
Universal quantification over all Monads m is another way of saying a computation doesn't need anything from an underlying Monad. This can be seen by substituting Identity for m.
type Machine k o =
  MachineT Identity (
      Await (a -> MachineT Identity k o) (MachineT Identity k o)
    | Yield o (MachineT Identity k o)
    | Stop
  )
Processes
A Process or ProcessT is a Machine or MachineT that only reads a single type of input a, Is a.
type Process a b = Machine (Is a) b
type ProcessT m a b = MachineT m (Is a) b
A Process has the following structure after removing all the intermediate constructors that are always the same. This structure is exactly the same as our FST, except it has an added "what to do next" in the case that there's no more input.
type Process a b =
    Await (a -> Process a b) (Process a b)
  | Yield b (Process a b)
  | Stop
The ProcessT variant has an m wrapped around it so that it can act in the monad at each step.
Process models state transducers.

Existential types and monad transformers

Context: I'm trying to produce an error monad that also keeps track of a list of warnings, something like this:
data Dangerous a = forall e w. (Error e, Show e, Show w) =>
  Dangerous (ErrorT e (State [w]) a)
i.e. Dangerous a is an operation resulting in (Either e a, [w]) where e is a showable error and w is showable.
The problem is, I can't seem to actually run the thing, mostly because I don't understand existential types all that well. Observe:
runDangerous :: forall a e w. (Error e, Show e, Show w) =>
                Dangerous a -> (Either e a, [w])
runDangerous (Dangerous f) = runState (runErrorT f) []
This doesn't compile, because:
Could not deduce (w1 ~ w)
from the context (Error e, Show e, Show w)
...
`w1' is a rigid type variable bound by
a pattern with constructor
Dangerous :: forall a e w.
(Error e, Show e, Show w) =>
ErrorT e (State [w]) a -> Dangerous a
...
`w' is a rigid type variable bound by
the type signature for
runDangerous :: (Error e, Show e, Show w) =>
Dangerous a -> (Either e a, [w])
I'm lost. What's w1? Why can't we deduce that it's ~ w?
An existential is probably not what you want here; there is no way to "observe" the actual types bound to e or w in a Dangerous a value, so you're completely limited to the operations given to you by Error and Show.
In other words, the only thing you know about w is that you can turn it into a String, so it might as well just be a String (ignoring precedence to simplify things), and the only thing you know about e is that you can turn it into a String, you can turn Strings into it, and you have a distinguished value of it (noMsg). There is no way to assert or check that these types are the same as any other, so once you put them into a Dangerous, there's no way to recover any special structure those types may have.
What the error message is saying is that, essentially, your type for runDangerous claims that you can turn a Dangerous into an (Either e a, [w]) for any e and w that have the relevant instances. This clearly isn't true: you can only turn a Dangerous into that type for one choice of e and w: the one it was created with. The w1 is just because your Dangerous type is defined with a type variable w, and so is runDangerous, so GHC renames one of them to avoid name clashes.
The type you need to give runDangerous looks like this:
runDangerous
  :: (forall e w. (Error e, Show e, Show w) => (Either e a, [w]) -> r)
  -> Dangerous a -> r
which, given a function which will accept a value of type (Either e a, [w]) for any choices of e and w so long as they have the instances given, and a Dangerous a, produces that function's result. This is quite hard to get your head around!
The implementation is as simple as
runDangerous f (Dangerous m) = f $ runState (runErrorT m) []
which is a trivial change to your version. If this works for you, great; but I doubt that an existential is the right way to achieve whatever you're trying to do.
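As a hedged, self-contained illustration of that continuation pattern (the answer's real type also carries Error and uses ErrorT/State; this sketch keeps only the Show constraints so it stands alone, and the Result, consume and describe names are mine):

```haskell
{-# LANGUAGE ExistentialQuantification, RankNTypes #-}

-- An existential result: the caller cannot recover e or w, only use Show.
data Result a = forall e w. (Show e, Show w) => Result (Either e a) [w]

-- The consumer must be polymorphic in e and w, since it cannot know
-- which types a given Result was built with.
consume :: (forall e w. (Show e, Show w) => Either e a -> [w] -> r)
        -> Result a -> r
consume f (Result ea ws) = f ea ws

describe :: Result a -> String
describe = consume $ \ea ws -> case ea of
  Left e  -> "failed: " ++ show e ++ ", warnings: " ++ show (map show ws)
  Right _ -> "ok, warnings: " ++ show (map show ws)
```

A consumer that demanded a specific e or w would be "insufficiently polymorphic" and rejected, which is the same restriction the error message above is enforcing.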
Note that you'll need {-# LANGUAGE RankNTypes #-} to express the type of runDangerous. Alternatively, you can define another existential for your result type:
data DangerousResult a = forall e w. (Error e, Show e, Show w) =>
  DangerousResult (Either e a, [w])
runDangerous :: Dangerous a -> DangerousResult a
runDangerous (Dangerous m) = DangerousResult $ runState (runErrorT m) []
and extract the result with case, but you'll have to be careful, or GHC will start complaining that you've let e or w escape — which is the equivalent of trying to pass an insufficiently polymorphic function to the other form of runDangerous; i.e. one that requires more constraints on what e and w are beyond what the type of runDangerous guarantees.
Ok, I think I figured out what I was floundering after:
data Failure = forall e. (Error e, Show e) => Failure e
data Warning = forall w. (Show w) => Warning w
class (Monad m) => Errorable m where
  warn :: (Show w) => w -> m ()
  throw :: (Error e, Show e) => e -> m ()
instance Errorable Dangerous where
  warn w  = Dangerous (Right (), [Warning w])
  throw e = Dangerous (Left $ Failure e, [])
(instance Monad Dangerous and data DangerousT help too.)
This allows you to have the following code:
foo :: Dangerous Int
foo = do
  when (badThings) (warn $ BadThings with some context)
  when (worseThings) (throw $ BarError with other context)
data FooWarning = BadThings FilePath Int String
instance Show FooWarning where
...
and then in your main module you may define custom instances of Show Failure, Error Failure, and Show Warning and have a centralized way to format your error messages, for example
instance Show Warning where show (Warning s) = "WARNING: " ++ show s
instance Show Failure where ...
let (result, warnings) = runDangerous function
in ...
Which, in my opinion, is a pretty cool way to handle errors and warnings. I've got a working module that's something like this, now I'm off to polish it up and maybe put it on hackage. Suggestions appreciated.

How do you use the latest version (0.8.1.2 at time of writing) of the iteratee library?

I've read tutorials on the iteratee and enumerator concepts, and have implemented a sample version as a way of learning how they work. However, the types used in the iteratee package are very different than any of the tutorials I have found. For example, Iteratee is defined as:
newtype Iteratee s m a = Iteratee { runIter :: forall r.
       (a -> Stream s -> m r)
    -> ((Stream s -> Iteratee s m a) -> Maybe SomeException -> m r)
    -> m r }
I really don't understand what I am meant to do with that. Are there any tutorials on using this version, and on why it was written this way (i.e. what benefits this has over the original way Oleg did it)?
Disclaimer: I'm the current maintainer of iteratee.
You may find some of the files in the iteratee Examples directory useful for understanding how to use the library; word.hs is probably the easiest to follow.
Basically, users shouldn't need to use runIter unless they're creating custom enumeratees. Iteratees can be built by combining the provided primitives or with the liftI, idone, and icont functions; they are then enumerated over and finally run with run or tryRun.
Oleg has two "original" versions and a CPS version (and possibly others too). The original versions are both in http://okmij.org/ftp/Haskell/Iteratee/IterateeM.hs. The first is the actual code, and the second is in the comments. The first requires some special functions, >>== and $$, in place of the usual >>= and $. The second can use the standard functions, but unfortunately it's very difficult to reason about monadic ordering with this version. There are a few other drawbacks as well. The CPS version avoids all of these issues, which is why I switched over iteratee. I also find that iteratees defined in this style are shorter and more readable. Unfortunately I'm not aware of any tutorials specific to the CPS version, however Oleg's comments may be useful.
Disclaimer: I don't know the iteratee library well and have never used it, so take my answer with a grain of salt.
This definition is equivalent to Oleg's (more precisely, it's a CPS-style sum), with a twist: it guarantees that accessing the iteratee always returns a monadic value.
Here is Oleg definition:
data Iteratee el m a
  = IE_done a
  | IE_cont (Maybe ErrMsg)
            (Stream el -> m (Iteratee el m a, Stream el))
So it's a sum of two cases: IE_done a, meaning we're done and give a as the result, or IE_cont (Maybe ErrMsg) (Stream el -> ...), a way to continue the iteration given another chunk of input and, possibly, an error (in which case invoking the continuation amounts to restarting the computation).
It is well-known that Either a b is equivalent to forall r. (a -> r) -> (b -> r) -> r: giving you either an a or a b is equivalent to promising that, for any result type r and any transformations (a -> r) and (b -> r) you may come up with, I will be able to produce such an r (to do that, I must have an a or a b). In a sense, Either a b introduces the data, and forall r. (a -> r) -> (b -> r) -> r eliminates it: if such a function were named case_ab, then case_ab (\a -> foo_a) (\b -> foo_b) is equivalent to the pattern match case ab of { Left a -> foo_a; Right b -> foo_b } for some ab :: Either a b.
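The equivalence is short enough to spell out in code (EitherCPS, toCPS and fromCPS are names made up for this sketch):

```haskell
{-# LANGUAGE RankNTypes #-}

-- Church-encoded Either: the data and its eliminator are interconvertible.
type EitherCPS a b = forall r. (a -> r) -> (b -> r) -> r

toCPS :: Either a b -> EitherCPS a b
toCPS (Left  a) = \onLeft _       -> onLeft a
toCPS (Right b) = \_      onRight -> onRight b

fromCPS :: EitherCPS a b -> Either a b
fromCPS case_ab = case_ab Left Right
```

fromCPS (toCPS x) round-trips, and applying toCPS x to two handler functions behaves exactly like the either function from the Prelude.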
So here is the continuation equivalent of Oleg's definition (we talk of continuations here because (a -> r) represents "what will happen to the value once we know it's an a"):
data Iteratee el m a = Iteratee
  (forall r.
       (a -> r)
    -> (Maybe ErrMsg -> (Stream el -> m (Iteratee el m a, Stream el)) -> r)
    -> r)
But there is a twist in the iteratee definition (modulo some innocuous currying): the result is not r but m r. In a sense, we force the result of pattern matching on our iteratee to always live in the monad m.
data Iteratee el m a = Iteratee
  (forall r.
       (a -> m r)
    -> (Maybe ErrMsg -> (Stream el -> m (Iteratee el m a, Stream el)) -> m r)
    -> m r)
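To see the CPS formulation in motion, here is a toy, self-contained version (error handling omitted to keep it short; idone, icont and run mimic the package's names, but this is not the package's code):

```haskell
{-# LANGUAGE RankNTypes #-}

data Stream el = EOF | Chunk [el]

-- A toy CPS iteratee: pattern matching is replaced by two continuations.
newtype Iter el m a = Iter
  { runIt :: forall r.
       (a -> m r)                          -- done: here is the final value
    -> ((Stream el -> Iter el m a) -> m r) -- cont: feed me more input
    -> m r }

idone :: a -> Iter el m a
idone a = Iter $ \onDone _ -> onDone a

icont :: (Stream el -> Iter el m a) -> Iter el m a
icont k = Iter $ \_ onCont -> onCont k

-- Count elements until EOF.
count :: Iter el m Int
count = go 0
  where
    go n = icont $ \s -> case s of
      EOF      -> idone n
      Chunk xs -> go (n + length xs)

-- Feed one chunk; a finished iteratee ignores further input.
enumChunk :: Monad m => [el] -> Iter el m a -> m (Iter el m a)
enumChunk c it = runIt it (return . idone) (\k -> return (k (Chunk c)))

-- Send EOF and extract the result.
run :: Monad m => Iter el m a -> m a
run it = runIt it return $ \k ->
  runIt (k EOF) return (\_ -> error "diverging iteratee")

-- ghci> run =<< enumChunk [4,5] =<< enumChunk [1,2,3] count
-- 5
```

Note how every observation of the iteratee goes through runIt and thus lives in m, which is exactly the twist discussed above.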
Finally, notice that the "continue the iteration" data in Oleg's definition is Stream .. -> m (Iteratee .., Stream ..), while in the iteratee package it's only Stream -> Iteratee. I assume the monad was removed here because it is already enforced at the outer level (if you apply the iteratee, you are forced to live in the monad, so why also force the subsequent computation to?). I don't know why the Stream output is gone; I suppose it means that those Iteratees have to consume all the input when it is available (or encode a "not finished yet" logic in the return type a). Perhaps this is for efficiency reasons.

Resources