User state in Parsec

User state in Parsec - haskell

I'm parsing an expression using Parsec and I want to keep track of variables in these expressions using the user state in Parsec. Unfortunately I don't really get how to do it.
Given the following code:
import Data.Set as Set
inp = "$x = $y + $z"
data Var = V String
var = do char '$'
n <- many1 letter
let v = Var n
-- I want to modify the set of variables here
return v
parseAssignment = ... -- parses the above assignment
run = case runIdentity $ runParserT parseAssignment Set.empty "" inp of
Left err -> ...
Right -> ...
So, the u in ParsecT s u m a would be Set.Set. But how would I integrate the state update into var?
I tried something like modify $ Set.insert v, but this doesn't work, since Set.Set is not a state monad.

Unfortunately, Yuras' suggestion of updateParserState is not optimal (you'd use that function if you're looking to modify Parsec's internal state as well); instead you should pass a function that works over your custom user state (i.e. of type u -> u) to modifyState, such as in this example:
expr = do
x <- identifier
modifyState (+1)
-- ^ in this example, our type u is Int
return (Id x)
or use any combination of the getState and putState functions. For your case, you'd do something like:
modifyState (Set.insert v)
See this link for more info.
For a more tutorial-like introduction to working with user state in Parsec, this document, though old, should be relevant.

You can use updateParserState

Related

In Parsec, how do I run second parser, only if the first parser consumed some input?

I need a combinator like p1 << p2, but p2 should run only if p1 has succeeded and consumed some input.
If p1 succeeded without consuming input, p2 should not run.
If p1 failed, then p2 is also ignored?
Overall result is r1's result

Parsec primitives make an internal distinction between a parser that succeeds after consuming some input and a parser that succeeds after consuming no input which you should be able to leverage. In particular, the following ought to work to parse p and then -- conditioned on p successfully consuming input -- parse q and discard its results:
ifConsumed :: Monad m => ParsecT s u m a -> ParsecT s u m b -> ParsecT s u m a
ifConsumed p q = mkPT k
where -- k :: State s u -> m (Consumed (m (Reply s u a)))
k s = do cons <- runParsecT p s
case cons of
Consumed mrep -> do
rep <- mrep
case rep of
Ok x s' err -> runParsecT (fmap (const x) q) s'
Error err -> return . Consumed . return $ Error err
Empty mrep -> do
rep <- mrep
case rep of
Ok x s' err -> return . Empty . return $ Ok x s' err
Error err -> return . Empty . return $ Error err
It's ugly because Parsec doesn't directly expose the ParsecT constructor, so you have to use the mkPt and runParsecT intermediaries which add a lot of boilerplate.
In a nutshell, it runs the p parser. If this succeeds with input consumed (the Consumed -> Ok branch), it runs the q parser modified via fmap to return the value parsed by p. On the other hand, if p succeeds with no input consumed (the Empty -> Ok branch), it simply returns success without running the q parser.
The only caveat is that I'm not 100% sure how, within the Parsec library itself, the invariant is preserved whereby the Consumed -> Ok branch only gets called when input has been consumed, so I don't know if this is truly reliable. You'll want to test it carefully in your particular use case.
For the following parser --- which parses a list of one or more elements separated commas where each element consists of zero or more digits, then two exclamation marks only if the previous parser consumed some input, then a semicolon --- it seems to work:
p :: Parser [String]
p = ifConsumed (sepBy1 (many digit) (char ',')) (char '!' >> char '!') <* char ';'
runp :: String -> Either ParseError [String]
runp = parse p ""
Some tests:
runp "" -- fails, expecting semicolon
runp ";" -- returns [""]
runp "!!;" -- fails, "!!" w/ no preceding content
runp ",;" -- fails, missing "!!"
runp ",!!;" -- returns ["",""]
runp ",!;" -- fails, expecting second "!"
runp ",1,23;" -- fails, missing "!!"
runp ",1,23!!;" -- returns ["","1","23"]

With a naive parser implementation, you should be able to do this:
(<<) p1 p2 = P $ \inp -> case parse p1 inp of
ErrorResult e -> ErrorResult e
SuccessResult (rem, res) -> if rem == inp
then SuccessResult (rem, res)
else parse p2 rem
Though Parsec is more advanced, you could probably roll your own there as well.

I don't think you can do that for arbitrary parsers p1 and p2: you need them to communicate somehow. If you could do this, it seems to me that you would break referential transparency.
For example, consider parsing the input string repeat 'x': whether p1 consumes a character or not, p2 will see the string as an endless sea of x characters. If it hasn't communicated with p1 somehow (eg by modifying something in the parser's state), then you can't know whether a character was consumed; if your combinator were somehow able to treat these two cases differently, it would be breaking the rules.

How to define and use global array in Haskell?

I am new in Haskell world. I have to create a game based on checkers board and would like to define global variable - an array representing board. Do you know how to define this array and use it (modify it) in functions?
Please, help me.

Although the concept of global variables is not encouraged at functional programming, you can have a look at this solution and use State Monad as mentioned here :
import Control.Monad.State
import qualified Data.Map as Map
funcs :: Map.Map String Double
funcs = Map.empty
f :: String -> Double -> State (Map.Map String Double) ()
f str d = do
funcs <- get
put (Map.insert str d funcs)
g :: State (Map.Map String Double) String
g = do
funcs <- get
if (Map.lookup "aaa" funcs) == Nothing then return "not defined" else return "ok"
main = putStrLn $ flip evalState funcs $ do {f "aaa" 1; g}

Check out: http://www.haskell.org/haskellwiki/Arrays#Mutable_IO_arrays_.28module_Data.Array.IO.29
Remember, evey mutation in haskell is inside a Monad i.e you describe the mutations using a moand and then you run the monad by passing it the initial value (state) that will go through the mutations.

Parsec: error message at specific location

Using Parsec how does one indicate an error at a specific position if a semantic rule is violated. I know typically we don't want to do such things, but consider the example grammar.
<foo> ::= <bar> | ...
<bar> ::= a positive integer power of two
The <bar> rule is a finite set (my example is arbitrary), and a pure approach to the above could be a careful application of the choice combinator, but this might be impractical in space and time. In recursive descent or toolkit-generated parsers the standard trick is to parse an integer (a more relaxed grammar) and then semantically check the harder constraints. For Parsec, I could use a natural parser and check the result calling fail when that doesn't match or unexpected or whatever. But if we do that, the default error location is the wrong one. Somehow I need to raise the error at the earlier state.
I tried a brute force solution and wrote a combinator that uses getPosition and setPosition as illustrated by this very similar question. Of course, I was also unsuccessful (the error location is, of course wrong). I've run into this pattern many times. I am kind of looking for this type of combinator:
withPredicate :: (a -> Bool) -> String -> P a -> P a
withPredicate pred lbl p = do
ok <- lookAhead $ fmap pred (try p) <|> return False -- peek ahead
if ok then p -- consume the input if the value passed the predicate
else fail lbl -- otherwise raise the error at the *start* of this token
pPowerOfTwo = withPredicate isPowerOfTwo "power of two" natural
where isPowerOfTwo = (`elem` [2^i | i<-[1..20]])
The above does not work. (I tried variants on this as well.) Somehow the parser backtracks a says it's expecting a digit. I assume it's returning the error that made it the furthest. Even {get,set}ParserState fails erase that memory.
Am I handling this syntactic pattern wrong? How would all you Parsec users approach these type of problems?
Thanks!

I think both your ideas are OK. The other two answers deal with Parsec, but I'd like to note that in both
cases Megaparsec just does the right thing:
{-# LANGUAGE TypeApplications #-}
module Main (main) where
import Control.Monad
import Data.Void
import Text.Megaparsec
import qualified Text.Megaparsec.Char.Lexer as L
type Parser = Parsec Void String
withPredicate1 :: (a -> Bool) -> String -> Parser a -> Parser a
withPredicate1 f msg p = do
r <- lookAhead p
if f r
then p
else fail msg
withPredicate2 :: (a -> Bool) -> String -> Parser a -> Parser a
withPredicate2 f msg p = do
mpos <- getNextTokenPosition -- †
r <- p
if f r
then return r
else do
forM_ mpos setPosition
fail msg
main :: IO ()
main = do
let msg = "I only like numbers greater than 42!"
parseTest' (withPredicate1 #Integer (> 42) msg L.decimal) "11"
parseTest' (withPredicate2 #Integer (> 42) msg L.decimal) "22"
If I run it:
The next big Haskell project is about to start!
λ> :main
1:1:
|
1 | 11
| ^
I only like numbers greater than 42!
1:1:
|
1 | 22
| ^
I only like numbers greater than 42!
λ>
Try it for yourself! Works as expected.
† getNextTokenPosition is more correct than getPosition for streams where tokens contain position of their beginning and end in themselves. This may or may not be important in your case.

It's not a solution I like, but you can hypnotize Parsec into believing it's had a single failure with consumption:
failAt pos msg = mkPT (\_ -> return (Consumed (return $ Error $ newErrorMessage (Expect msg) pos)))
Here's a complete example:
import Control.Monad
import Text.Parsec
import Text.Parsec.Char
import Text.Parsec.Error
import Text.Parsec.Prim
import Debug.Trace
failAt pos msg = mkPT (\_ -> return (Consumed (return $ Error $ newErrorMessage (Expect msg) pos)))
type P a = Parsec String () a
withPredicate :: (a -> Bool) -> String -> P a -> P a
withPredicate pred msg p = do
pos <- getPosition
x <- p
unless (pred x) $ failAt pos msg
return x
natural = read <$> many1 digit
pPowerOfTwo = withPredicate isPowerOfTwo "power of two" natural
where isPowerOfTwo = (`elem` [2^i | i<-[1..20]])
main = print $ runParser pPowerOfTwo () "myinput" "4095"
When run, it results in:
Left "myinput" (line 1, column 1):
expecting power of two

I think the problem stems from how Parsec picks the "best error" in the non-deterministic setting. See Text.Parsec.Error.mergeError. Specifically, this selects the longest match when choosing which error is the error to report. I think we need some way to make Parsec order errors differently, which may be too obscure for us solving this problem.
In my case, I here's how I worked around the problem:
I solved stacked an Exception monad within my ParsecT type.
type P m = P.ParsecT String ParSt (ExceptT Diagnostic m)
Then I introduced a pair of combinators:
(Note: Loc is my internal location type)
-- stops hard on an error (no backtracking)
-- which is why I say "semantic" instead of "syntax" error
throwSemanticError :: (MonadTrans t, Monad m) => Loc -> String -> t (ExceptT Diagnostic m) a
throwSemanticError loc msg = throwSemanticErrorDiag $! Diagnostic loc msg
withLoc :: Monad m => (Loc -> P m a) -> P m a
withLoc pa = getLoc >>= pa
Now in parsing I can write:
parsePrimeNumber = withLoc $ \loc ->
i <- parseInt
unless (isPrime i) $ throwSemanticError loc "number is not prime!"
return i
The top level interface to run one of these monads is really nasty.
runP :: Monad m
=> ParseOpts
-> P m a
-> String
-> m (ParseResult a)
runP pos pma inp =
case runExceptT (P.runParserT pma (initPSt pos) "" inp) of
mea -> do
ea <- mea
case ea of
-- semantic error (throwSemanticError)
Left err -> return $! PError err
-- regular parse error
Right (Left err) -> return $ PError (errToDiag err)
-- success
Right (Right a) -> return (PSuccess a [])
I'm not terribly happy with this solution and desire something better.
I wish parsec had a:
semanticCheck :: (a -> Parsec Bool) -> Parsec a -> Parsec a
semanticCheck pred p =
a <- p
z <- pred a
unless z $
... somehow raise the error from the beginning of this token/parse
rather than the end ... and when propagating the error up,
use the end parse position, so this parse error beats out other
failed parsers that make it past the beginning of this token
(but not to the end)
return a

Using lookAhead, we can run a parser without consuming any input or registering any new errors, but record the state that we end up in. We can then apply a guard to the result of the parser. The guard can fail in whatever manner it desires if the value does not pass the semantic check. If the guard fails, then the error is located at the initial position. If the guard succeeds, we reset the parser to the recorded state, avoiding the need to re-execute p.
guardP :: Stream s m t => (a -> ParsecT s u m ()) -> ParsecT s u m a -> ParsecT s u m a
guardP guard p = do
(a, s) <- try . lookAhead $ do
a <- p
s <- getParserState
return (a, s)
guard a
setParserState s
return a
We can now implement pPowerOfTwo:
pPowerOfTwo :: Stream s m Char => ParsecT s u m Integer
pPowerOfTwo = guardP guardPowerOfTwo natural <?> "power of two"
where guardPowerOfTwo s = unless (s `elem` [2^i | i <- [1..20]]) . unexpected $ show s

Get value from IO rather than the computation itself

Being quite new to Haskell, I'm currently trying to improve my skills by writing an interpreter for a simple imperative toy language.
One of the expressions in this language is input, which reads a single integer from standard input. However, when I assign the value of this expression to a variable and then use this variable later, it seems ot me that I actually stored the computation of reading a value rather the read value itself. This means that e.g. the statements
x = input;
y = x + x;
will cause the interpreter to invoke the input procedure three times rather than one.
Internally in the evaluator module, I use a Map to store the values of variables. Because I need to deal with IO, this gets wrapped in an IO monad, as immortalized in the following minimal example:
import qualified Data.Map as Map
type State = Map.Map String Int
type Op = Int -> Int -> Int
input :: String -> IO State -> IO State
input x state = do line <- getLine
st <- state
return $ Map.insert x (read line) st
get :: String -> IO State -> IO Int
get x state = do st <- state
return $ case Map.lookup x st of
Just i -> i
eval :: String -> Op -> String -> IO State -> IO Int
eval l op r state = do i <- get l state
j <- get r state
return $ op i j
main :: IO ()
main = do let state = return Map.empty
let state' = input "x" state
val <- eval "x" (+) "x" state'
putStrLn . show $ val
The second line in the main function simulates the assignment of x, while the third line simulates the evaluation of the binary + operator.
My question is: How do I get around this, such that the code above only inputs once? I suspect that it is the IO-wrapping that causes the problem, but as we're dealing with IO I see no way out of that..?

Remember that IO State is not an actual state, but instead the specification for an IO machine which eventually produces a State. Let's consider input as an IO-machine transformer
input :: String -> IO State -> IO State
input x state = do line <- getLine
st <- state
return $ Map.insert x (read line) st
Here, provided a machine for producing a state, we create a bigger machine which takes that passed state and adding a read from an input line. Again, to be clear, input name st is an IO-machine which is a slight modification of the IO-machine st.
Let's now examine get
get :: String -> IO State -> IO Int
get x state = do st <- state
return $ case Map.lookup x st of
Just i -> i
Here we have another IO-machine transformer. Given a name and an IO-machine which produces a State, get will produce an IO-machine which returns a number. Note again that get name st is fixed to always use the state produced by the (fixed, input) IO-machine st.
Let's combine these pieces in eval
eval :: String -> Op -> String -> IO State -> IO Int
eval l op r state = do i <- get l state
j <- get r state
return $ op i j
Here we call get l and get r each on the same IO-machine state and thus produce two (completely independent) IO-machines get l state and get r state. We then evaluate their IO effects one after another and return the op-combination of their results.
Let's examine the kinds of IO-machines built in main. In the first line we produce a trivial IO-machine, called state, written return Map.empty. This IO-machine, each time it's run, performs no side effects in order to return a fresh, blank Map.Map.
In the second line, we produce a new kind of IO-machine called state'. This IO-machine is based off of the state IO-machine, but it also requests an input line. Thus, to be clear, each time state' runs, a fresh Map.Map is generated and then an input line is read to read some Int, stored at "x".
It should be clear where this is going, but now when we examine the third line we see that we pass state', the IO-machine, into eval. Previously we stated that eval runs its input IO-machine twice, once for each name, and then combines the results. By this point it should be clear what's happening.
All together, we build a certain kind of machine which draws input and reads it as an integer, assigning it to a name in a blank Map.Map. We then build this IO-machine into a larger one which uses the first IO-machine twice, in two separate invocations, in order to collect data and combine it with an Op.
Finally, we run this eval machine using do notation (the (<-) arrow indicates running the machine). Clearly it should collect two separate lines.
So what do we really want to do? Well, we need to simulate ambient state in the IO monad, not just pass around Map.Maps. This is easy to do by using an IORef.
import Data.IORef
input :: IORef State -> String -> IO ()
input ref name = do
line <- getLine
modifyIORef ref (Map.insert name (read line))
eval :: IORef State -> Op -> String -> String -> IO Int
eval ref op l r = do
stateSnapshot <- readIORef ref
let Just i = Map.lookup l stateSnapshot
Just j = Map.lookup l stateSnapshot
return (op i j)
main = do
st <- newIORef Map.empty -- create a blank state, embedded into IO, not a value
input st "x" -- request input *once*
val <- eval st (+) "x" "x" -- compute the op
putStrLn . show $ val

It's fine to wrap your actions such as getLine in IO, but to me it looks like your problem is that you're trying to pass your state in the IO monad. Instead, I think this is probably time you get introduced to monad transformers and how they'll let you layer the IO and State monads to get the functionality of both in one.
Monad transformers are a pretty complex topic and it'll take a while to get to where you're comfortable with them (I'm still learning new things all the time about them), but they're a very useful tool when you need to layer multiple monads. You'll need the mtl library to follow this example.
First, imports
import qualified Data.Map as Map
import Control.Monad.State
Then types
type Op = Int -> Int -> Int
-- Renamed to not conflict with Control.Monad.State.State
type AppState = Map.Map String Int
type Interpreter a = StateT AppState IO a
Here Interpreter is the Monad in which we'll build our interpreter. We also need a way to run the interpreter
-- A utility function for kicking off an interpreter
runInterpreter :: Interpreter a -> IO a
runInterpreter interp = evalStateT interp Map.empty
I figured defaulting to Map.empty was sufficient.
Now, we can build our interpreter actions in our new monad. First we start with input. Instead of returning our new state, we just modify what is current in our map:
input :: String -> Interpreter ()
input x = do
-- IO actions have to be passed to liftIO
line <- liftIO getLine
-- modify is a member of the MonadState typeclass, which StateT implements
modify (Map.insert x (read line))
I had to rename get so that it didn't conflict with get from Control.Monad.State, but it does basically the same thing as before, it just takes our map and looks up that variable in it.
-- Had to rename to not conflict with Control.Monad.State.get
-- Also returns Maybe Int because it's safer
getVar :: String -> Interpreter (Maybe Int)
getVar x = do
-- get is a member of MonadState
vars <- get
return $ Map.lookup x vars
-- or
-- get x = fmap (Map.lookup x) get
Next, eval now just looks up each variable in our map, then uses liftM2 to keep the return value as Maybe Int. I prefer the safety of Maybe, but you can rewrite it if you prefer
eval :: String -> Op -> String -> Interpreter (Maybe Int)
eval l op r = do
i <- getVar l
j <- getVar r
-- liftM2 op :: Maybe Int -> Maybe Int -> Maybe Int
return $ liftM2 op i j
Finally, we write our sample program. It stores user input to the variable "x", adds it to itself, and prints out the result.
-- Now we can write our actions in our own monad
program :: Interpreter ()
program = do
input "x"
y <- eval "x" (+) "x"
case y of
Just y' -> liftIO $ putStrLn $ "y = " ++ show y'
Nothing -> liftIO $ putStrLn "Error!"
-- main is kept very simple
main :: IO ()
main = runInterpreter program
The basic idea is that there is a "base" monad, here IO, and these actions are "lifted" up to the "parent" monad, here StateT AppState. There is a typeclass implementation for the different state operations get, put, and modify in the MonadState typeclass, which StateT implements, and in order to lift IO actions there's a pre-made liftIO function that "lifts" IO actions to the parent monad. Now we don't have to worry about passing around our state explicitly, we can still perform IO, and it has even simplified the code!
I would recommend reading the Real World Haskell chapter on monad transformers to get a better feel for them. There are other useful ones as well, such as ErrorT for handling errors, ReaderT for static configuration, WriterT for aggregating results (usually used for logging), and many others. These can be layered into what is called a transformer stack, and it's not too difficult to make your own either.

Instead of passing an IO State, you can pass State and then use higher-level functions to deal with IO. You can go further and make get and eval free from side-effects:
input :: String -> State -> IO State
input x state = do
line <- getLine
return $ Map.insert x (read line) state
get :: String -> State -> Int
get x state = case Map.lookup x state of
Just i -> i
eval :: String -> Op -> String -> State -> Int
eval l op r state = let i = get l state
j = get r state
in op i j
main :: IO ()
main = do
let state = Map.empty
state' <- input "x" state
let val = eval "x" (+) "x" state'
putStrLn . show $ val

If you're actually building an interpreter, you'll presumably have a list of instructions to execute at some point.
This is my rough translation of your code (although I'm only a beginner myself)
import Data.Map (Map, empty, insert, (!))
import Control.Monad (foldM)
type ValMap = Map String Int
instrRead :: String -> ValMap -> IO ValMap
instrRead varname mem = do
putStr "Enter an int: "
line <- getLine
let intval = (read line)::Int
return $ insert varname intval mem
instrAdd :: String -> String -> String -> ValMap -> IO ValMap
instrAdd varname l r mem = do
return $ insert varname result mem
where result = (mem ! l) + (mem ! r)
apply :: ValMap -> (ValMap -> IO ValMap) -> IO ValMap
apply mem instr = instr mem
main = do
let mem0 = empty
let instructions = [ instrRead "x", instrAdd "y" "x" "x" ]
final <- foldM apply mem0 instructions
print (final ! "y")
putStrLn "done"
The foldM applies a function (apply) to a start value (mem0) and a list (instructions) but does so within a monad.

Haskell laziness question or why this monad not working as expected

here is kinda huge piece of code, it finally got huge, because of monadic stuff, but task is simple: parse following string to data structure:
"hello(some, args)" -> [("fid","hello"),("sym","("),("args","some, args"),("sym",")")]
but code i written produces following:
"hello(some, args)" -> [("fid",""),("sym","("),("args",""),("sym",")")]
as you can see 'args' and 'fid' values are lost somewhere on the way, i suspect compiler decided not to calculate them for some mysterious reason.
I guess the code is completely bad, also i marked with "?"'s parts which seem useless to me, but compiler forced me to leave them in place :)
And here is the code:
type PStream = String
type PToken a = (String, a)
data Pstate a = Pstate (String -> ([PToken String], PStream)) a
instance Monad Pstate where
return x = Pstate (\_ -> ([("start", "")], "?")) x -- not used ?
(Pstate bindparser v) >>= f = Pstate newparser fv
where
Pstate fparser fv = f v
(btok, brest) = bindparser "this string also not used"
(tok, rest) = fparser brest
newparser _ = (btok ++ tok, rest)
-- parsers
parseFid :: Pstate String
parseFid = Pstate parser "???"
where parser r = let (fid, rest) = span (/= '(') r in ([("fid", fid)],rest)
parseSym :: Char -> Pstate String
parseSym c = Pstate parser "???"
where parser r = let rest = parseOne c r in ([("sym", [c])],rest)
parseOne s (h:t) = if h == s then t else error $ "symbol not match:" ++ [h] ++ " /= " ++ [s]
parseOne s [] = []
parseArgs :: Pstate String
parseArgs = Pstate parser "???"
where parser r = let (args,rest) = span (/=')') r in ([("args", args)],rest)
-- util
load :: String -> Pstate String
load s = Pstate (\ls -> ([("load", "")],ls)) "???"
runP :: Pstate String -> ([PToken String], PStream)
runP (Pstate fparser fv) = fparser "???"
-- combined parser
parseFunction :: String -> Pstate String
parseFunction s = do
load s --- should be 'return' here ?
parseFid
parseSym '('
parseArgs
parseSym ')'
main = putStrLn $ show $ runP $ parseFunction "hello(a b c)"

First, about the "???" you had to leave there. Consider your definition of Pstate:
data Pstate a = Pstate (String -> ([PToken String], PStream)) a
This means, that your data constructor has the following type:
Pstate :: (String -> ([PToken String], PStream)) -> a -> Pstate a
This is the default construct of a monad. If you define monadic combinators, it's actually not uncommon to have some combinators where this is not needed, so the convention is to leave it to () in this case.
But actually I think that your code is very strange, it seems like you didn't grabbed the point of a stateful monad. Let me explain:
Usually, a stateful computation has this type:
data MyState a = MyState (TypeOfState -> (a, TypeOfState))
This means, that your monadic action is actually some kind of computation, that does something (possible with your piece of state) and than returns a result and a new state. The state is wrapped up in the monad, so you don't have to think about it.
In your code, you're using the same pattern, but somewhat different. It seems like that you fixed the result of the computation to [PToken String]. Let me fix up your definition a bit:
data Pstate a = Pstate (PStream -> (a, PStream))
So now, you get the return value of your computation by applying the combinators, which look like this:
instance Monad Pstate where
-- return should just wrap the computation up, so no changes
return x = Pstate (\p -> (x,p))
parser1 >>= parser2 = Pstate $ \input -> let
Pstate parser1' = parser1
Pstate parser2' = parser2
(output, rest) = parser1' input
result = parser2' output rest
in result
Now, you can look at the type signatures for your parsers, they should be something like this: parseFid :: Pstate [PToken PStream]. This means, your parser consumes some input and returns the parsed stuff as [PToken PStream] and sets the new input to what is left. Consider this definition of parseFid about how it could look like:
parseFid :: Pstate [PToken PStream]
parseFid = Pstate $ \r -> let (fid, rest) = span (/= '(') r in ([("fid", fid)],rest)
The rest is left as an exercise to the reader. I would suggest you to reformulate your parser using the State monad from Control.Monad.State.Strict instead. You'll see, that the monad above is basically the same.
Actually, it's most times easier to rely on existing and well known tools, instead of rolling down an own parser. Here's a parser for what you need created with Parsec, a state of the art library for parsing:
import Text.Parsec
parseFunction = do name <- parseName
obrace <- parseOpeningBrace
args <- parseArguments
cbrace <- parseClosingBrace
return [name,obrace,args,cbrace]
parseName = many (noneOf "(") >>= \name -> return ("fid",name)
parseOpeningBrace = char '(' >> return ("sym","(")
parseArguments = many (noneOf ")") >>= \name -> return ("args",name)
parseClosingBrace = char ')' >> return ("sym",")")
main = case parse parseFunction "" "hello(some, args)" of
Left error -> print error
Right result -> print result
Here's the output:
[("fid","hello"),("sym","("),("args","some, args"),("sym",")")]
I actually would suggest you to think of some better representation of the parsed function, this may make things easier.

If you run the code as posted, you can see that the "this string also not used" string is in fact used, as you get this output:
([("load",""),("fid","this string also not used"),("sym","("),("args","this string also not used"),("sym",")")],"")
In fact the string is basically used as input for all the parsers. In the definition of >>=, the string is given as input to the bindparser. This parser then takes it as it's input and creates tokens from it. parseFid for example produces the token ("fid","this string also not used").
The newparser that is constructed in >>= ignores any input it might receive later, it just returns the result of parsing "this string also not used". Similar, the parser created with return ignores the value it should return.
The parsers created with bind should not ignore/override their inputs for parsing to work correctly.
Also you should decide what role the second parameter of Pstate should fulfill, since at the moment is mostly contains "???", which doesn't look particularly useful.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

User state in Parsec - haskell

You can use updateParserState

Related

In Parsec, how do I run second parser, only if the first parser consumed some input?

How to define and use global array in Haskell?

Parsec: error message at specific location

Get value from IO rather than the computation itself

Haskell laziness question or why this monad not working as expected

Categories

Resources