Megaparsec, backtracking user state with StateT and ParsecT - haskell

Using Megaparsec 5.
Following this guide, I can achieve a back-tracking user-state by combining StateT and ParsecT (non-defined types should be obvious/irrelevant):
type MyParser a = StateT UserState (ParsecT Dec T.Text Identity) a
if I run a parser p :: MyParser a, like this:
parsed = runParser (runStateT p initialUserState) "" input
The type of parsed is:
Either (ParseError Char Dec) (a, UserState)
Which means, in case of error, the user state is lost.
Is there any way to have it in both cases?
EDIT:
Could I perhaps, in case of error, use a custom error component instead of Dec (a feature introduced in 5.0) and encapsulate the user state in there?

You can use a custom error component combined with the observing function for this purpose (see this great post for more information):
{-# LANGUAGE RecordWildCards #-}
module Main where
import Text.Megaparsec
import qualified Data.Set as Set
import Control.Monad.State.Lazy
data MyState = MyState Int deriving (Ord, Eq, Show)
data MyErrorComponent = MyErrorComponent (Maybe MyState) deriving (Ord, Eq, Show)
instance ErrorComponent MyErrorComponent where
representFail _ = MyErrorComponent Nothing
representIndentation _ _ _= MyErrorComponent Nothing
type Parser = StateT MyState (Parsec MyErrorComponent String)
trackState :: Parser a -> Parser a
trackState parser = do
result <- observing parser -- run parser but don't fail right away
case result of
Right x -> return x -- if it succeeds we're done here
Left ParseError {..} -> do
state <- get -- read the current state to add it to the error component
failure errorUnexpected errorExpected $
if Set.null errorCustom then Set.singleton (MyErrorComponent $ Just state) else errorCustom
In the above snipped, observing functions a bit like a try/catch block that catches a parse error, then reads the current state and adds the it to the custom error component. The custom error component in turn is returned when runParser returns a ParseError.
Here's a demonstration how this function could be used:
a = trackState $ do
put (MyState 6)
string "foo"
b = trackState $ do
put (MyState 5)
a
main = putStrLn (show $ runParser (runStateT b (MyState 0)) "" "bar")
In reality you would probably want to do something more clever (for instance I imagine you could also add the entire stack of states you go through while traversing the stack).

You could try sandwiching ParserT between two States, like
type MyParser a = StateT UserState (ParsecT Dec T.Text (State UsersState)) a
And write special-purpose put and modify operations that, after changing the outer state, copy the entire state into the inner State monad using put.
That way, even if parsing fails, you'll have the last "state before failure" available from the inner State monad.

I hit similar problem. I use default typing state:
type SubDefPos = Int
type SubDefName = String
data MyParserSt = MyParserSt {
subDefs :: [(SubDefPos, SubDefName)]
}
ParsecT Void String (StateT MyParserSt Identity) Expr
Every change to user state is supplied with the value of getOffset to be able to reject later if current position is less than position from the state.

Related

Nested States in Haskell

I am trying to define a family of state machines with somewhat different kinds of states. In particular, the more "complex" state machines have states which are formed by combining the states of simpler state machines.
(This is similar to an object oriented setting where an object has several attributes which are also objects.)
Here is a simplified example of what I want to achieve.
data InnerState = MkInnerState { _innerVal :: Int }
data OuterState = MkOuterState { _outerTrigger :: Bool, _inner :: InnerState }
innerStateFoo :: Monad m => StateT InnerState m Int
innerStateFoo = do
i <- _innerVal <$> get
put $ MkInnerState (i + 1)
return i
outerStateFoo :: Monad m => StateT OuterState m Int
outerStateFoo = do
b <- _outerTrigger <$> get
if b
then
undefined
-- Here I want to "invoke" innerStateFoo
-- which should work/mutate things
-- "as expected" without
-- having to know about the outerState it
-- is wrapped in
else
return 666
More generally, I want a generalized framework where these nestings are more complex. Here is something I wish to know how to do.
class LegalState s
data StateLess
data StateWithTrigger where
StateWithTrigger :: LegalState s => Bool -- if this trigger is `True`, I want to use
-> s -- this state machine
-> StateWithTrigger
data CombinedState where
CombinedState :: LegalState s => [s] -- Here is a list of state machines.
-> CombinedState -- The combinedstate state machine runs each of them
instance LegalState StateLess
instance LegalState StateWithTrigger
instance LegalState CombinedState
liftToTrigger :: Monad m, LegalState s => StateT s m o -> StateT StateWithTrigger m o
liftToCombine :: Monad m, LegalState s => [StateT s m o] -> StateT CombinedState m o
For context, this is what I want to achieve with this machinery:
I want to design these things called "Stream Transformers", which are basically stateful functions: They consume a token, mutate their internal state and output something. Specifically, I am interested in a class of Stream Transformers where the output is a Boolean value; we will call these "monitors".
Now, I am trying to design combinators for these objects. Some of them are:
A pre combinator. Suppose that mon is a monitor. Then, pre mon is a monitor which always produces False after the first token is consumed and then mimicks the behaviour of mon as if the previous token is being inserted now. I would want to model the state of pre mon with StateWithTrigger in the example above since the new state is a boolean along with the original state.
An and combinator. Suppose that m1 and m2 are monitors. Then, m1 `and` m2 is a monitor which feeds the token to m1, and then to m2, and then produces True if both of the answers were true. I would want to model the state of m1 `and` m2 with CombinedState in the example above since the state of both monitors must be maintained.
For your first question, as Carl mentioned, zoom from lens does exactly what you want. Your code with lenses could be written like this:
{-# LANGUAGE TemplateHaskell #-}
import Control.Lens
import Control.Monad.State.Lazy
newtype InnerState = MkInnerState { _innerVal :: Int }
deriving (Eq, Ord, Read, Show)
data OuterState = MkOuterState
{ _outerTrigger :: Bool
, _inner :: InnerState
} deriving (Eq, Ord, Read, Show)
makeLenses ''InnerState
makeLenses ''OuterState
innerStateFoo :: Monad m => StateT InnerState m Int
innerStateFoo = do
i <- gets _innerVal
put $ MkInnerState (i + 1)
return i
outerStateFoo :: Monad m => StateT OuterState m Int
outerStateFoo = do
b <- gets _outerTrigger
if b
then zoom inner $ innerStateFoo
else pure 666
Edit: While we're at it, if you're already bringing in lens then innerStateFoo can be written like so:
innerStateFoo :: Monad m => StateT InnerState m Int
innerStateFoo = innerVal <<+= 1
For context, this is what I want to achieve with this machinery:
I want to design these things called "Stream Transformers", which are basically stateful functions: They consume a token, mutate their internal state and output something. Specifically, I am interested in a class of Stream Transformers where the output is a Boolean value; we will call these "monitors".
I think that what you want to achieve doesn't need very much machinery.
newtype StreamTransformer input output = StreamTransformer
{ runStreamTransformer :: input -> (output, StreamTransformer input output)
}
type Monitor input = StreamTransformer input Bool
pre :: Monitor input -> Monitor input
pre st = StreamTransformer $ \i ->
-- NB: the first output of the stream transformer vanishes.
-- Is that OK? Maybe this representation doesn't fit the spec?
let (_, st') = runStreamTransformer st i
in (False, st')
and :: Monitor input -> Monitor input -> Monitor input
and left right = StreamTransformer $ \i ->
let (bleft, mleft) = runStreamTransformer left i
(bright, mright) = runStreamTransformer right i
in (bleft && bright, mleft `and` mright)
This StreamTransformer is not necessarily stateful, but admits stateful ones. You don't need to (and IMO should not! in most cases!!) reach for typeclasses in order to define these (or indeed ever! :) but that's another topic).
notStateful :: StreamTransformer input ()
notStateful = StreamTransformer $ \_ -> ((), notStateful)
stateful :: s -> (input -> s -> (output, s)) -> StreamTransformer input output
stateful s k = StreamTransformer $ \input ->
let (output, s') = k input s
in (output, stateful s' k)
alternateBool :: Monitor anything
alternateBool = stateful True $ \_ s -> (s, not s)

Haskell UUID generation

I am new to Haskell and need help. I am trying to build a new data type that has to be somehow unique, so I decided to use UUID as a unique identifier:
data MyType = MyType {
uuid :: UUID,
elements :: AnotherType
}
in this way, I can do following:
instance Eq MyType where
x == y = uuid x == uuid y
x /= y = not (x == y)
The problem is that all known (to me) UUID generators produce IO UUID, but I need to use it in a pure code as mentioned above. Could you please suggest if there is any way to extract UUID out of IO UUID, or maybe be there is a better way to do what I need in Haskell? Thanks.
UPDATE
Thanks for all the great suggestions and the code example. From what is posted here I can say you cannot break a referential transparency, but there are smart ways how to solve the problem without breaking it and, probably the most optimal one, is listed in the answer below.
There is also one alternative approach that I was able to explore myself based on provided recommendations with the usage of State Monad:
type M = State StdGen
type AnotherType = String
data MyType = MyType {
uuid :: UUID,
elements :: AnotherType
} deriving (Show)
mytype :: AnotherType -> M MyType
mytype x = do
gen <- get
let (val, gen') = random gen
put gen'
return $ MyType val x
main :: IO ()
main = do
state <- getStdGen
let (result, newState) = runState (mytype "Foo") state
putStrLn $ show result
let (result', newState') = runState (mytype "Bar") newState
setStdGen newState'
putStrLn $ show result'
Not sure if it is the most elegant implementation, but it works.
If you're looking at the functions in the uuid package, then UUID has a Random instance. This means that it's possible to generate a sequence of random UUIDs in pure code using standard functions from System.Random using a seed:
import System.Random
import Data.UUID
someUUIDs :: [UUID]
someUUIDs =
let seed = 123
g0 = mkStdGen seed -- RNG from seed
(u1, g1) = random g0
(u2, g2) = random g1
(u3, g3) = random g2
in [u1,u2,u3]
Note that someUUIDs creates the same three "unique" UUIDs every time it's called because the seed is hard-coded.
As with all pure Haskell code, unless you cheat (using unsafe functions), you can't expect to generate a sequence of actually unique UUIDs without explicitly passing some state (in this case, a StdGen RNG) between calls to random.
The usual solution to avoid the ugly boilerplate of passing the generator around is to run at least part of your code within a monad that can maintain the needed state. Some people like to use the MonadRandom package, though you can also use the regular State monad with a StdGen somewhere in the state. The main advantages of MonadRandom over State is that you get some dedicated syntax (getRandom) and can create a monad stack that includes both RandomT and StateT so you can separate your RNG state from the rest of your application state.
Using MonadRandom, you might write an application like:
import Control.Monad.Random.Strict
import System.Random
import Data.UUID
-- monad for the application
type M = Rand StdGen
-- get a generator and run the application in "M"
main :: IO ()
main = do
g <- getStdGen -- get a timestamp-seeded generator
let log = evalRand app g -- run the (pure) application in the monad
putStr log
-- the "pure" application, running in monad "M"
app :: M String
app = do
foo <- myType "foo"
bar <- myType "bar"
-- do some processing
return $ unlines ["Results:", show foo, show bar]
type AnotherType = String
data MyType = MyType {
uuid :: UUID,
elements :: AnotherType
} deriving (Show)
-- smart constructor for MyType with unique UUID
myType :: AnotherType -> M MyType
myType x = MyType <$> getRandom <*> pure x
Note that substantial parts of the application will need to be written in monadic syntax and run in the application M monad. This isn't a big restriction -- most non-trivial applications are going to be written in some monad.

How to combine data composition and monad transformers

I'm somewhat new to monad transformers, and currently trying to use a StateT/Except stack in a project. The difficulty I'm having is that I have a few layers of data composition (types with operations on them, contained within types that have other operations on them), and I can't figure out how to elegantly use monad transformers in that design. Concretely, I'm having trouble writing the following code (simplified example, obviously):
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Control.Monad.Except
import Control.Monad.State
import Control.Monad.Trans.Except (Except, throwE)
import Control.Monad.Trans.State (StateT)
data ComposedState = ComposedState { state :: Bool }
data MyError = MyError { message :: String }
-- If the passed in state is true, change it to false; otherwise throw.
throwingModification :: ComposedState -> Except MyError ComposedState
throwingModification (ComposedState True) = return $ ComposedState False
throwingModification _ = throwE $ MyError "error!"
-- A state which composes with #ComposedState#,
data MyState = MyState { composed :: ComposedState }
-- and a monad transformer state to allow me to modify it and propagate
-- errors.
newtype MyMonad a = MyMonad { contents :: StateT MyState (Except MyError) a }
deriving ( Functor
, Applicative
, Monad
, MonadState MyState
, MonadError MyError )
anAction :: MyMonad ()
anAction = do -- want to apply throwingModification to the `composed` member,
-- propogating any exception
undefined
where I have a potentially "throwing" operation on ComposedState, and I want to use that operation in a stateful, throwing operation on MyState. I can obviously do that by deconstructing the whole stack and rebuilding it, but the whole point of the monadic structure is that I shouldn't have to. Is there a terse, idiomatic solution?
Apologies for the lengthy code snippet--I did my best to cut it down.
The more natural way of doing this would be to write throwingModification from the start in the MyMonad monad, like so:
throwingModification' :: MyMonad ()
throwingModification' = do ComposedState flag <- gets composed
if not flag then throwError $ MyError "error!"
else modify (\s -> s { composed = (composed s)
{ Main.state = False } })
I'm assuming here that the composed states contain other components that you want to preserve, which makes the modify clause ugly. Using lenses can make this cleaner.
However, if you're stuck with the current form of throwingModification, you'll probably have to write your own combinator, since the usual State combinators don't include mechanisms for switching the state type s, which is what you're effectively trying to do.
The following definition of usingState may help. It transforms a StateT operation from one state to another using a getter and setter. (Again, a lens approach would be cleaner.)
usingState :: (Monad m) => (s -> t) -> (s -> t -> s)
-> StateT t m a -> StateT s m a
usingState getter setter mt = do
s <- get
StateT . const $ do (a, t) <- runStateT mt (getter s)
return (a, setter s t)
I don't think there's an easy way to modify usingState to work between general MonadState monads instead of directly on a StateT, so you'll need to lift it manually and convert it through your MyMonad data type.
With usingState so defined, you can write the following. (Note >=> comes from Control.Monad.)
MyMonad $ usingState getComposed putComposed $
StateT (throwingModification >=> return . ((),))
with helpers:
getComposed = composed
putComposed s c = s { composed = c }
This is still a little ugly, but that's because the type t -> Except e t must be adapted to StateT (t -> Except e ((), t)), then transformed to the s state by the combinator, and then wrapped manually in your MyMonad, as explained above.
With Lenses
I'm not suggesting lenses are a miracle cure or anything, but they do help clean up a few of the uglier parts of the code.
After adding lenses:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE TupleSections #-}
{-# LANGUAGE TemplateHaskell #-}
import Control.Lens
import Control.Monad ((>=>))
import Control.Monad.Except (Except, MonadError, throwError)
import Control.Monad.State (get, MonadState, runStateT, StateT(..))
data MyError = MyError { _message :: String }
data MyState = MyState { _composed :: ComposedState }
data ComposedState = ComposedState { _state :: Bool }
makeLenses ''ComposedState
makeLenses ''MyError
makeLenses ''MyState
the definition of throwingModification looks a little cleaner:
throwingModification :: ComposedState -> Except MyError ComposedState
throwingModification s =
if s^.state then return $ s&state .~ False
else throwError $ MyError "error!"
and the MyMonad version I gave above certainly benefits:
throwingModification' :: MyMonad ()
throwingModification' = do
flag <- use (composed.state)
if flag then composed.state .= False
else throwError (MyError "error!")
The definition of usingStateL doesn't look much different:
usingStateL :: (Monad m) => Lens' s t -> StateT t m a -> StateT s m a
usingStateL tPart mt = do
s <- get
StateT . const $ do (a, t) <- runStateT mt (s^.tPart)
return (a, s&tPart .~ t)
but it allows the existing lens composed to be used in place of helper functions:
MyMonad $ usingStateL composed $
StateT (throwingModification >=> return . ((),))
and it would generalize to (composed.underneath.state4) if you had complex nested state.
The best solution would be re-write throwingModification as a MyMonad.
throwingModification :: MyMonad ()
throwingModification = do
s <- get
if state s then
put $ ComposedState False
else
throwError $ MyError "error!"
If you can't re-write your function (because it is used elsewhere), you can wrap it instead.

Why does ParsecT type have 'u' argument?

Documentation for the parsec package states that u argument is used to carry some user state through monadic computation. But the same functionality can be achieved by basing ParsecT monad transformer on State monad. So if my parser is not stateful, i don't need u altogether, but have to set it to () with parsec. What's rationale for adding non-optional state support to ParsecT?
Because a parser of type ParsecT s () (State st) a behaves differently from a parser of type Parsec s st Identity a when it comes to backtracking:
User state resets when parsec tries an alternative after a failing parse that consumes no input.
But the underlying Monad m does not backtrack; all the effects that happened on the way to a final parse result are kept.
Consider the following example:
{-# LANGUAGE FlexibleContexts #-}
module Foo where
import Control.Applicative
import Control.Monad.State
import Text.Parsec.Prim hiding ((<|>), State(..))
import Text.Parsec.Error (ParseError)
tick :: MonadState Int m => ParsecT s Int m ()
tick = do
lift $ modify (+1)
modifyState (+1)
tickTock :: MonadState Int m => ParsecT s Int m ()
tickTock = (tick >> empty) <|> tick
-- | run a parser that has both user state and an underlying state monad.
--
-- Example:
-- >>> run tickTock
-- (Right 1,2)
run :: ParsecT String Int (State Int) () -> (Either ParseError Int, Int)
run m = runState (runParserT (m >> getState) initUserState "-" "") initStateState
where initUserState = 0
initStateState = 0
As you can see, the underlying state monad registered two ticks (from both alternatives that were tried),
while the user state of the Parsec monad transformer only kept the successful one.
ParsecT carries it's own state already: parsing position and input: http://haddocks.fpcomplete.com/fp/7.8/20140916-162/parsec/Text-Parsec-Prim.html#t:State
So as leftaroundabout pointed out, it's probably due optimisation purposes.

What is the name of this Monad Stack function?

I've got a bunch of stateful functions inside a State monad. At one point in the program there needs to be some IO actions so I've wrapped IO inside a StateT getting a pair of types like this:
mostfunctions :: State Sometype a
toplevel :: StateT Sometype IO a
To keep things simple I don't want pass the IO context into the main set of functions and I would like to avoid wrapping them in the monad stack type. But in order to call them from the toplevel function I need something akin to a lift, but I'm not trying to lift a value from the inner monad. Rather I want to convert the state in the StateT monad into something equivalent in the State monad. To do this I've got the following:
wrapST :: (State Sometype a) -> StateT Sometype IO a
wrapST f = do s <- get
let (r,s2) = runState f s
put s2
return r
This then get used to interleave things like the following:
toplevel = do liftIO $ Some IO functions
wrapST $ Some state mutations
liftIO $ More IO functions
....
It seems like a fairly obvious block of code so I'm wondering does this function have a standard name, and it is already implemented somewhere in the standard libraries? I've tried to keep the description simple but obviously this extends to pulling one transformer out of a stack, converting the wrapped value to the cousin of the transformer type, skipping the monads below in the stack, and then pushing the results back in at the end.
It may be a good idea to refactor your code to use the type StateT SomeType m a instead of State SomeType a, because the first one is compatible to an arbitrary monad stack. If you'd change it like this, you don't need a function wrapST anymore, since you can call the stateful functions directly.
Okay. Suppose you have a function subOne :: Monad m => State Int Int:
subOne = do a <- get
put $ a - 1
return a
Now, change the types of all functions like this one from State SomeType a to StateT SomeType m a, leaving m as is. This way, your functions can work on any monadic stack. For those functions, that require IO, you can specify, that the monad at the bottom must be IO:
printState :: MonadIO m => StateT Int m ()
printState = do a <- get
liftIO $ print a
Now, it should be possible to use both functions together:
-- You could use me without IO as well!
subOne :: Monad m => StateT Int m ()
subOne = do a <- get
put $ a - 1
printState :: MonadIO m => StateT Int m ()
printState = do a <- get
liftIO $ print a
toZero :: StateT Int IO ()
toZero = do subOne -- A really pure function
printState -- function may perform IO
a <- get
when (a > 0) toZero
PS: I use GHC 7, some of the libs changed midway, so it might be a bit different on GHC 6.
A more direct answer to your question: the function hoist does exactly what you're describing in a slightly more generic way. Example usage:
import Control.Monad.State
import Data.Functor.Identity
import Control.Monad.Morph
foo :: State Int Integer
foo = put 1 >> return 1
bar :: StateT Int IO Integer
bar = hoist (return . runIdentity) foo
hoist is part of the MFunctor class, which is defined like this:
class MFunctor t where
hoist :: Monad m => (forall a. m a -> n a) -> t m b -> t n b
There are instances for most monad tranformers, but not ContT.

Resources