How can I declare the types for this problem? - haskell

I'm trying to create a simple programming language with some primitives and user defined functions.
These are the types I created:
data Type = IntT | BoolT
data Value = IntV Int | BoolV Bool | OperatorCall String [Value]
data Expr = LetE String Value | ProcedureCall String [Value]
As you can see, I've divided functions into operators (which return a value) and procedures (which don't return anything and act as expressions instead of values). A function call contains the string id of the function being called and the list of arguments being passed. Also, a program is just a list of expressions (I've omitted user-defined functions here for the sake of simplicity).
My problem comes from the fact that I need to write a function that parses a function call from a string:
parseFunctionCall :: String -> ???
...
The return type of that function can be a Value (for operator calls) or an Expr (for procedure calls). This function is rather complicated and I'd prefer to avoid writing it twice, or polluting it with an Either return type. What should I do? How can I change my types so that this can be achieved cleanly? Something like this perhaps, but I don't think this is the way:
type FunctionCall = (String, [Value])
data Value = ... | OperatorCall FunctionCall
data Expr = ... | ProcedureCall FunctionCall
parseAsFunctionCall :: String -> FunctionCall
...

You can have the function call parser return (String, [Value]), and let the caller fix that up into whatever data structure they like best -- in your case, by applying \(s, vs) -> OperatorCall s vs if parsing a value or \(s, vs) -> ProcedureCall s vs if parsing an expression.
parseFunctionCall :: Parser (String, [Value])
parseLiteralInt :: Parser Int
parseLiteralBool :: Parser Bool
parseLet :: Parser (String, Value)
(parseFunctionCall, parseLiteralInt, parseLiteralBool, parseLet) = {- ... -}
parseValue :: Parser Value
parseValue =
    ((\(s, vs) -> OperatorCall s vs) <$> parseFunctionCall)
    <|>
    (IntV <$> parseLiteralInt)
    <|>
    (BoolV <$> parseLiteralBool)
parseExpr :: Parser Expr
parseExpr =
    ((\(s, vs) -> ProcedureCall s vs) <$> parseFunctionCall)
    <|>
    ((\(s, v) -> LetE s v) <$> parseLet)
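For a version that actually runs, here is a minimal sketch of the same idea using ReadP from base in place of the unspecified Parser type. The concrete syntax (name(arg,arg) for calls, "let x = 5" for bindings, no whitespace handling) and the helper names identifier, letBinding and runP are assumptions made up for this illustration, not part of the original question.

```haskell
import Data.Char (isAlpha, isDigit)
import Text.ParserCombinators.ReadP

data Value = IntV Int | BoolV Bool | OperatorCall String [Value]
  deriving (Eq, Show)

data Expr = LetE String Value | ProcedureCall String [Value]
  deriving (Eq, Show)

-- Hypothetical syntax: identifiers are runs of letters.
identifier :: ReadP String
identifier = munch1 isAlpha

-- The single shared parser. It returns a plain pair and knows nothing
-- about the OperatorCall or ProcedureCall constructors.
functionCall :: ReadP (String, [Value])
functionCall = do
  name <- identifier
  args <- between (char '(') (char ')') (value `sepBy` char ',')
  pure (name, args)

-- When parsing a value, the pair is wrapped as an OperatorCall.
value :: ReadP Value
value =
      (uncurry OperatorCall <$> functionCall)
  <++ (BoolV True  <$ string "true")
  <++ (BoolV False <$ string "false")
  <++ (IntV . read <$> munch1 isDigit)

-- Hypothetical syntax for bindings: "let x = 5".
letBinding :: ReadP (String, Value)
letBinding = do
  _    <- string "let "
  name <- identifier
  _    <- string " = "
  v    <- value
  pure (name, v)

-- When parsing an expression, the same pair is wrapped as a ProcedureCall.
expr :: ReadP Expr
expr =
      (uncurry LetE <$> letBinding)
  <++ (uncurry ProcedureCall <$> functionCall)

-- Run a parser and keep the first result that consumed the whole input.
runP :: ReadP a -> String -> Maybe a
runP p s = case [x | (x, "") <- readP_to_S p s] of
  (x : _) -> Just x
  []      -> Nothing
```

Here uncurry OperatorCall is just a shorter spelling of \(s, vs) -> OperatorCall s vs. For example, runP value "add(1,2)" and runP expr "add(1,2)" go through the same functionCall parser and differ only in the constructor applied at the end.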

Related

Why the newtype syntax creates a function

I look at this declaration:
newtype Parser a = Parser { parse :: String -> Maybe (a,String) }
Here is what I understand:
1) Parser is declared as a type with a type parameter a
2) You can instantiate Parser by providing a parser function for example p = Parser (\s -> Nothing)
What I observed is that suddenly I have a function named parse defined and it is capable of running Parsers.
For example, I can run:
parse (Parser (\s -> Nothing)) "my input"
and get Nothing as output.
How did this parse function get defined with this specific signature? How does this function "know" to execute the Parser given to it? Hope that someone can clear up my confusion.
Thanks!
When you write newtype Parser a = Parser { parse :: String -> Maybe (a,String) } you introduce three things:
1) A type named Parser.
2) A term-level constructor named Parser that builds values of that type. Its type is:
Parser :: (String -> Maybe (a, String)) -> Parser a
You give it a function and it wraps it inside a Parser.
3) A function named parse that removes the Parser wrapper and gives you your function back. Its type is:
parse :: Parser a -> String -> Maybe (a, String)
Check yourself in ghci:
Prelude> newtype Parser a = Parser { parse :: String -> Maybe (a,String) }
Prelude> :t Parser
Parser :: (String -> Maybe (a, String)) -> Parser a
Prelude> :t parse
parse :: Parser a -> String -> Maybe (a, String)
Prelude>
It's worth noting that the term-level constructor (Parser) and the function that removes the wrapper (parse) are both arbitrary names and don't need to match the type name. It's common, for instance, to write:
newtype Parser a = Parser { unParser :: String -> Maybe (a,String) }
This makes it clear that unParser removes the wrapper around the parsing function. However, I recommend your type and constructor have the same name when using newtypes.
How does this function "know" to execute the Parser given to it
You are unwrapping the function using parse and then calling the unwrapped function with "my input".
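To make the two steps visible, here is a small sketch that splits the call in half. The anyChar parser and the names unwrapped and result are invented for this example only.

```haskell
newtype Parser a = Parser { parse :: String -> Maybe (a, String) }

-- Hypothetical example parser: consumes one character if there is one.
anyChar :: Parser Char
anyChar = Parser $ \s -> case s of
  (c : rest) -> Just (c, rest)
  []         -> Nothing

-- Step 1: parse only unwraps; what comes back is an ordinary function.
unwrapped :: String -> Maybe (Char, String)
unwrapped = parse anyChar

-- Step 2: apply that function to the input.
-- This is the same as writing: parse anyChar "my input"
result :: Maybe (Char, String)
result = unwrapped "my input"
```

So parse does not "know" how to run anything. It hands back the stored function, and ordinary function application does the rest.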
First, let’s have a look at a parser newtype without record syntax:
newtype Parser' a = Parser' (String -> Maybe (a,String))
It should be obvious what this type does: it stores a function String -> Maybe (a,String). To run this parser, we will need to make a new function:
runParser' :: Parser' a -> String -> Maybe (a,String)
runParser' (Parser' p) i = p i
And now we can run parsers like runParser' (Parser' $ \s -> Nothing) "my input".
But now note that, since Haskell functions are curried, we can simply remove the reference to the input i to get:
runParser'' :: Parser' a -> (String -> Maybe (a,String))
runParser'' (Parser' p) = p
This function is exactly equivalent to runParser', but you could think about it differently: instead of applying the parser function to the value explicitly, it simply takes a parser and fetches the parser function from it; however, thanks to currying, runParser'' can still be used with two arguments.
Now, let’s go back to your original type:
newtype Parser a = Parser { parse :: String -> Maybe (a,String) }
The only difference between your type and mine is that your type uses record syntax, although it may be a bit hard to recognise since a newtype can only have one field; this record syntax automatically defines a function parse :: Parser a -> (String -> Maybe (a,String)), which extracts the String -> Maybe (a,String) function from the Parser a. Hopefully the rest should be obvious: thanks to currying, parse can be used with two arguments rather than one, and this simply has the effect of running the function stored within the Parser a. In other words, your definition is exactly equivalent to the following code:
newtype Parser a = Parser (String -> Maybe (a,String))
parse :: Parser a -> (String -> Maybe (a,String))
parse (Parser p) = p

QuickCheck with Dynamic Element Sets

Is there a way to control programmatically the set of values to use in an elements call within an arbitrary definition? I want to be able to generate an arbitrary variable reference as part of a random expression, but the set of variable identifiers to choose from should be configurable.
As an example, imagine the following data type:
data Expr = Num Int
          | Var String
          | BinOp Op Expr Expr
data Op = Add | Sub | Mul | Div deriving (Eq, Ord, Enum)
And then I want to define an arbitrary instance for this type that would look something like this:
instance Arbitrary Op where
  arbitrary = elements [Add .. ]
instance Arbitrary Expr where
  arbitrary = oneof [ fmap Num arbitrary
                    , arbitraryBinOp
                    , fmap Var (elements varNames)
                    ]
arbitraryBinOp = do (op, e0, e1) <- arbitrary
                    return (BinOp op e0 e1)
Now the tricky thing is the "varNames" part. Conceptually I would like to be able to do something like this:
do args <- getArgs
   tests <- generate $ vectorOf 10 ((arbitrary args)::Gen Expr)
But obviously I can't propagate that args-vector down through the arbitrary calls as "arbitrary" does not take such an argument...
Arbitrary is really only a convenience when the generator does not require any context. If you need to parameterize your generators, you can define them as regular functions, and QuickCheck has combinators to use such explicit generators instead of Arbitrary instances.
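genExpr :: [String] -> Gen Expr
genExpr varNames =
  oneof [ fmap Num arbitrary
        , genBinOp varNames
        , fmap Var (elements varNames)
        ]
-- Operands are built by genExpr too, so the variable names reach every
-- subexpression and no Arbitrary Expr instance is required.
genBinOp :: [String] -> Gen Expr
genBinOp varNames = BinOp <$> arbitrary <*> genExpr varNames <*> genExpr varNames
main :: IO ()
main = do
  args <- getArgs
  tests <- generate $ vectorOf 10 (genExpr args)
  {- do stuff -}
  return ()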
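As for the combinators mentioned above, forAll is the usual way to plug an explicit generator into a property. The following is only a sketch under a few assumptions: it needs the QuickCheck package, it uses sized to keep the generated trees bounded (a design choice of this example), and the names genOp, varsOf and prop_varsInScope are made up for illustration.

```haskell
import Test.QuickCheck

data Op = Add | Sub | Mul | Div
  deriving (Eq, Ord, Enum, Bounded, Show)

data Expr
  = Num Int
  | Var String
  | BinOp Op Expr Expr
  deriving (Eq, Show)

genOp :: Gen Op
genOp = elements [minBound .. maxBound]

-- Explicit generator parameterized by the allowed variable names.
genExpr :: [String] -> Gen Expr
genExpr varNames = sized go
  where
    -- elements fails on an empty list, so fall back to numbers only.
    leaf
      | null varNames = Num <$> arbitrary
      | otherwise     = oneof [Num <$> arbitrary, Var <$> elements varNames]
    go :: Int -> Gen Expr
    go n
      | n <= 0    = leaf
      | otherwise = oneof [leaf, BinOp <$> genOp <*> half <*> half]
      where
        half = go (n `div` 2)  -- halve the size budget for each operand

-- Collect every variable name used in an expression.
varsOf :: Expr -> [String]
varsOf (Num _)       = []
varsOf (Var v)       = [v]
varsOf (BinOp _ l r) = varsOf l ++ varsOf r

-- forAll takes the generator explicitly, so no Arbitrary Expr is needed.
prop_varsInScope :: [String] -> Property
prop_varsInScope names =
  forAll (genExpr names) $ \e -> all (`elem` names) (varsOf e)
```

Running quickCheck (prop_varsInScope ["x", "y"]) then checks the property against expressions that only mention x and y.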

How to create a newtype of parser?

newtype Parser a = PsrOf{
    -- | Function from input string to:
    --
    -- * Nothing, if failure (syntax error);
    -- * Just (unconsumed input, answer), if success.
    dePsr :: String -> Maybe (String, a)}
I want to create a newtype of Parser to see what it looks like.
I tried
*ParserLib> PsrOf{"hello"}
But it comes up with an error
<interactive>:5:7: error: parse error on input ‘"’
You've already created the type. Now you want to create a value of that type. To do that, you need to call PsrOf with a value of type String -> Maybe (String, a). For example:
newtype Parser a = PsrOf { dePsr :: String -> Maybe (String, a) }
get3 :: String -> Maybe (String, Int)
get3 ('3':xs) = Just (xs, 3)
get3 _ = Nothing -- Any string, including the empty string, that doesn't start with '3'
get3P :: Parser Int
get3P = PsrOf get3
To actually use the parser, you need to extract the function before applying it to a string:
dePsr get3P "38" -- Just ("8", 3)
dePsr get3P "" -- Nothing
dePsr get3P "hello" -- Nothing
Record syntax here is just used to simplify the definition of the type, instead of writing
newtype Parser a = PsrOf (String -> Maybe (String, a))
dePsr :: Parser a -> String -> Maybe (String, a)
dePsr (PsrOf f) = f
The rest of the uses for record syntax (pattern matching or making slightly modified copies of a value) don't really apply usefully to types that wrap a single value.

Deserializing many network messages without using an ad-hoc parser implementation

I have a question pertaining to deserialization. I can envision a solution using Data.Data, Data.Typeable, or with GHC.Generics, but I'm curious if it can be accomplished without generics, SYB, or meta-programming.
Problem Description:
Given a list of [String] that is known to contain the fields of a locally defined algebraic data type, I would like to deserialize the [String] to construct the target data type. I could write a parser to do this, but I'm looking for a generalized solution that will deserialize to an arbitrary number of data types defined within the program without writing a parser for each type. With knowledge of the number and type of value constructors an algebraic type has, it's as simple as performing a read on each string to yield the appropriate values necessary to build up the type. However, I don't want to use generics, reflection, SYB, or meta-programming (unless it's otherwise impossible).
Say I have around 50 types defined similar to this (all simple algebraic types composed of basic primitives; no nested or recursive types, just different combinations and orderings of primitives):
data NetworkMsg = NetworkMsg { field1 :: Int, field2 :: Int, field3 :: Double}
data NetworkMsg2 = NetworkMsg2 { field1 :: Double, field2 :: Int, field3 :: Double }
I can determine the data-type to be associated with a [String] I've received over the network using a tag id that I parse before each [String].
Possible conjectured solution path:
Since data constructors are first-class values in Haskell and actually have a type, can the NetworkMsg constructor be thought of as a function, such as:
NetworkMsg :: Int -> Int -> Double -> NetworkMsg
Could I transform this function into a function on tuples using uncurryN then copy the [String] into a tuple of the same shape the function now takes?
NetworkMsg' :: (Int, Int, Double) -> NetworkMsg
I don't think this would work because I'd need knowledge of the value constructors and type information, which would require Data.Typeable, reflection, or some other metaprogramming technique.
Basically, I'm looking for automatic deserialization of many types without writing type instance declarations or analyzing the type's shape at run-time. If it's not feasible, I'll do it an alternative way.
You are correct in that the constructors are essentially just functions, so you can write generic instances for any number of types by just writing instances for the functions. You'll still need to write a separate instance for all the different numbers of arguments, though.
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
import Text.Read
import Control.Applicative
class FieldParser p r where
  parseFields :: p -> [String] -> Maybe r
instance Read a => FieldParser (a -> r) r where
  parseFields con [a] = con <$> readMaybe a
  parseFields _ _ = Nothing
instance (Read a, Read b) => FieldParser (a -> b -> r) r where
  parseFields con [a, b] = con <$> readMaybe a <*> readMaybe b
  parseFields _ _ = Nothing
instance (Read a, Read b, Read c) => FieldParser (a -> b -> c -> r) r where
  parseFields con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
  parseFields _ _ = Nothing
{- etc. for as many arguments as you need -}
Now you can use this type class to parse any message based on the constructor, as long as the type-checker is able to figure out the resulting message type from context (i.e. it is not able to deduce it simply from the given constructor for this sort of multi-param type class instance).
-- The pattern signatures below need {-# LANGUAGE ScopedTypeVariables #-}.
data Test1 = Test1 {fieldA :: Int} deriving Show
data Test2 = Test2 {fieldB :: Int, fieldC :: Float} deriving Show
test :: String -> [String] -> IO ()
test tag fields = case tag of
  "Test1" -> case parseFields Test1 fields of
    Just (a :: Test1) -> putStrLn $ "Successfully parsed " ++ show a
    Nothing -> putStrLn "Parse error"
  "Test2" -> case parseFields Test2 fields of
    Just (a :: Test2) -> putStrLn $ "Successfully parsed " ++ show a
    Nothing -> putStrLn "Parse error"
  _ -> putStrLn "Unknown tag"
I'd like to know how exactly you use the message types in the application, though, because having each message as its separate type makes it very difficult to have any sort of generic message handler.
Is there some reason why you don't simply have a single message data type? Such as
data NetworkMsg
  = NetworkMsg1 {fieldA :: Int}
  | NetworkMsg2 {fieldB :: Int, fieldC :: Float}
Now, while the instances are built in pretty much the same way, you get much better type inference since the result type is always known.
-- The class declaration was not shown; this one is inferred from the instances below.
class MessageParser p where
  parseMsg :: p -> [String] -> Maybe NetworkMsg
instance Read a => MessageParser (a -> NetworkMsg) where
  parseMsg con [a] = con <$> readMaybe a
  parseMsg _ _ = Nothing
instance (Read a, Read b) => MessageParser (a -> b -> NetworkMsg) where
  parseMsg con [a, b] = con <$> readMaybe a <*> readMaybe b
  parseMsg _ _ = Nothing
instance (Read a, Read b, Read c) => MessageParser (a -> b -> c -> NetworkMsg) where
  parseMsg con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
  parseMsg _ _ = Nothing
parseMessage :: String -> [String] -> Maybe NetworkMsg
parseMessage tag fields = case tag of
  "NetworkMsg1" -> parseMsg NetworkMsg1 fields
  "NetworkMsg2" -> parseMsg NetworkMsg2 fields
  _ -> Nothing
I'm also not sure why you want to do type-generic programming specifically without actually using any of the tools meant for generics. GHC.Generics, SYB or Template Haskell is usually the best solution for this kind of problem.
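For comparison, here is roughly what the GHC.Generics route could look like. This is only a sketch under stated assumptions: the class GFields and the function fromFields are names invented for this example, every field must have a Read instance, and only single-constructor types are handled (there is deliberately no instance for sums).

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeOperators #-}
import GHC.Generics
import Text.Read (readMaybe)

-- Hypothetical helper class: consume strings from the front of the list
-- and build the generic representation of a record.
class GFields f where
  gParse :: [String] -> Maybe (f p, [String])

-- Constructor without fields: consumes nothing.
instance GFields U1 where
  gParse xs = Just (U1, xs)

-- A single field: read one string.
instance Read a => GFields (K1 i a) where
  gParse (x : xs) = (\v -> (K1 v, xs)) <$> readMaybe x
  gParse []       = Nothing

-- Two groups of fields side by side: parse the left, then the right.
instance (GFields f, GFields g) => GFields (f :*: g) where
  gParse xs = do
    (l, rest)  <- gParse xs
    (r, rest') <- gParse rest
    pure (l :*: r, rest')

-- Metadata wrappers (datatype, constructor, selector): look through them.
instance GFields f => GFields (M1 i c f) where
  gParse xs = do
    (v, rest) <- gParse xs
    pure (M1 v, rest)

-- Succeeds only if every string is consumed and every field reads.
fromFields :: (Generic a, GFields (Rep a)) => [String] -> Maybe a
fromFields xs = case gParse xs of
  Just (rep, []) -> Just (to rep)
  _              -> Nothing

data NetworkMsg = NetworkMsg { field1 :: Int, field2 :: Int, field3 :: Double }
  deriving (Eq, Show, Generic)
```

With this in place, each of the 50 message types only needs deriving Generic, and the tag dispatch picks the target type, e.g. fromFields fields :: Maybe NetworkMsg.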

understanding do notation and bindings

I am very new to Haskell and I am trying to understand the methodology used to create a monadic parser in this document https://www.cs.nott.ac.uk/~gmh/pearl.pdf
Instead of following it exactly, I am trying to do it a little bit differently in order to understand it correctly. I ended up with this code:
newtype Parser a = Parser (String -> Maybe (a, String))
item :: Parser Char
item = Parser (\cs -> case cs of
  "" -> Nothing
  (c:cs) -> Just (c, cs))
getParser (Parser x) = x
instance Monad Parser where
  return x = Parser (\cs -> Just (x,cs))
  (Parser p) >>= f = Parser (\cs -> let result = p cs in
    case result of
      Nothing -> Nothing
      Just (c,cs') -> getParser (f c) cs')
takeThreeDropSecond :: Parser (Char, Char)
takeThreeDropSecond = do
  c1 <- item
  item
  c2 <- item
  return (c1, c2)
This seems to be working, but I am having a hard time following what is going on in do notation.
For example, in c1 <- item, what is assigned to c1? Is it the function that is contained in the Parser type, or the result of that computation, or something else? Moreover, the second line in the do block is just item, so does it just run item without assigning the result? And finally, what does return (c1,c2) produce? Is it Parser (String -> Maybe ((c1, c2)), String) or just Just (c1, c2)?
The Parser type wraps a function that 1) can represent failure using Maybe, 2) returns the remaining unparsed text as the String in (a, String), and 3) returns some parsed value a, which can be anything. The Monad instance is the plumbing that ties parsers together. The return implementation creates a Parser around a function that 1) always succeeds with Just, 2) does not modify its input text, and 3) passes along the value given to it unchanged. The >>= implementation takes a parser and a function and returns a new parser that first runs p. If p fails, the whole parser fails with Nothing; if p succeeds, its result is passed to f, and the parser that f returns is run on the remaining input.
In takeThreeDropSecond, first c1 <- item says "parse the given input using item, assign its result to c1, and feed the rest of the input forward". This does not assign the function inside the item parser to c1; it assigns the result of running the function inside item against the current input. Then you reach item, which parses a value using item, doesn't assign it to anything, and feeds the rest of the input forward. Next you reach c2 <- item, which does basically the same thing as the first line, and finally return (c1, c2), which would expand to Parser (\cs -> Just ((c1, c2), cs)). This means that return (c1, c2) has the type Parser (Char, Char). With type annotations it would be
takeThreeDropSecond :: Parser (Char, Char)
-- The annotations on c1 and c2 need {-# LANGUAGE ScopedTypeVariables #-}.
takeThreeDropSecond = do
  (c1 :: Char) <- (item :: Parser Char)
  (item :: Parser Char)
  (c2 :: Char) <- (item :: Parser Char)
  (return (c1, c2) :: Parser (Char, Char))
Note that the last statement of any do block must have the same type as the do block as a whole. Since return (c1, c2) has type Parser (Char, Char), so must takeThreeDropSecond, and vice versa.
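It may also help to see the do block next to what it desugars to. The sketch below is a self-contained version of the question's parser. The Functor and Applicative instances are additions that current GHC versions require before a Monad instance is accepted, and the name desugared is made up for this example.

```haskell
newtype Parser a = Parser (String -> Maybe (a, String))

getParser :: Parser a -> String -> Maybe (a, String)
getParser (Parser x) = x

item :: Parser Char
item = Parser $ \cs -> case cs of
  ""        -> Nothing
  (c : cs') -> Just (c, cs')

-- Required superclass instances (not in the original code).
instance Functor Parser where
  fmap f (Parser p) = Parser $ \cs -> case p cs of
    Nothing       -> Nothing
    Just (a, cs') -> Just (f a, cs')

instance Applicative Parser where
  pure x = Parser $ \cs -> Just (x, cs)
  Parser pf <*> Parser pa = Parser $ \cs -> case pf cs of
    Nothing       -> Nothing
    Just (f, cs') -> case pa cs' of
      Nothing        -> Nothing
      Just (a, cs'') -> Just (f a, cs'')

instance Monad Parser where
  Parser p >>= f = Parser $ \cs -> case p cs of
    Nothing       -> Nothing
    Just (c, cs') -> getParser (f c) cs'

takeThreeDropSecond :: Parser (Char, Char)
takeThreeDropSecond = do
  c1 <- item
  _  <- item
  c2 <- item
  return (c1, c2)

-- The same parser with the do notation written out as a chain of >>=.
-- Each lambda receives the value parsed by the item to its left.
desugared :: Parser (Char, Char)
desugared =
  item >>= \c1 ->
  item >>= \_  ->
  item >>= \c2 ->
  return (c1, c2)
```

Both versions behave identically: c1 and c2 are plain Char values handed to the lambdas by >>=, while the remaining input is threaded along inside the Parser functions and never appears in the do block itself.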
