How to create a newtype of parser? - haskell

newtype Parser a = PsrOf{
-- | Function from input string to:
--
-- * Nothing, if failure (syntax error);
-- * Just (unconsumed input, answer), if success.
dePsr :: String -> Maybe (String, a)}
I want to create a newtype of Parser to see how it looks like.
I tried
*ParserLib> PsrOf{"hello"}
But it comes up with an error
<interactive>:5:7: error: parse error on input ‘"’

You've already created the type. Now you want to create a value of that type. To do that, you need to call PsrOf with a value of type String -> Maybe (String, a). For example:
newtype Parser a = PsrOf { dePsr :: String -> Maybe (String, a) }
get3 :: String -> Maybe (String, Int)
get3 ('3':xs) = Just (xs, 3)
get3 _ = Nothing -- Any string, including the empty string, that doesn't start with '3'
get3P :: Parser Int
get3P = PsrOf get3
To actually use the parser, you need to extract the function before applying it to a string:
dePsr get3P "38" -- Just ("8", 3)
dePsr get3P "" -- Nothing
dePsr get3P "hello" -- Nothing
Record syntax here is just used to simplify the definition of the type, instead of writing
newtype Parser a = PsrOf (String -> Maybe (String, a))
dePsr :: Parser a -> String -> Maybe (String, a)
dPsr (PsrOf f) = f
The rest of the uses for record syntax (pattern matching or making slightly modified copies of a value) don't really apply usefully to types that wrap a single value.

Related

Using data constructor as a function parameter

I am making my way through "Haskell Programming..." and, in Chapter 10, have been working with a toy database. The database is defined as:
data DatabaseItem = DBString String
| DBNumber Integer
| DBDate UTCTime
deriving (Eq, Ord, Show)
and, given a database of the form [databaseItem], I am asked to write a function
dbNumberFilter :: [DatabaseItem] -> [Integer]
that takes a list of DatabaseItems, filters them for DBNumbers, and returns a list the of Integer values stored in them.
I solved that with:
dbNumberFilter db = foldr selectDBNumber [] db
where
selectDBNumber (DBNumber a) b = a : b
selectDBNumber _ b = b
Obviously, I can write an almost identical to extract Strings or UTCTTimes, but I am wondering if there is a way to create a generic filter that can extract a list of Integers, Strings, by passing the filter a chosen data constructor. Something like:
dbGenericFilter :: (a -> DataBaseItem) -> [DatabaseItem] -> [a]
dbGenericFilter DBICon db = foldr selectDBDate [] db
where
selectDBDate (DBICon a) b = a : b
selectDBDate _ b = b
where by passing DBString, DBNumber, or DBDate in the DBICon parameter, will return a list of Strings, Integers, or UTCTimes respectively.
I can't get the above, or any variation of it that I can think of, to work. But is there a way of achieving this effect?
You can't write a function so generic that it just takes a constructor as its first argument and then does what you want. Pattern matches are not first class in Haskell - you can't pass them around as arguments. But there are things you could do to write this more simply.
One approach that isn't really any more generic, but is certainly shorter, is to make use of the fact that a failed pattern match in a list comprehension skips the item:
dbNumberFilter db = [n | DBNumber n <- db]
If you prefer to write something generic, such that dbNUmberFilter = genericFilter x for some x, you can extract the concept of "try to match a DBNumber" into a function:
import Data.Maybe (mapMaybe)
genericFilter :: (DatabaseItem -> Maybe a) -> [DatabaseItem] -> [a]
genericFilter = mapMaybe
dbNumberFilter = genericFilter getNumber
where getNumber (DBNumber n) = Just n
getNumber _ = Nothing
Another somewhat relevant generic thing you could do would be to define the catamorphism for your type, which is a way of abstracting all possible pattern matches for your type into a single function:
dbCata :: (String -> a)
-> (Integer -> a)
-> (UTCTime -> a)
-> DatabaseItem -> a
dbCata s i t (DBString x) = s x
dbCata s i t (DBNumber x) = i x
dbCata s i t (DBDate x) = t x
Then you can write dbNumberFilter with three function arguments instead of a pattern match:
dbNumberFilter :: [DatabaseItem] -> [Integer]
dbNumberFilter = (>>= dbCata mempty pure mempty)

How can I declare the types for this problem?

I'm trying to create a simple programming language with some primitives and user defined functions.
These are the types I created:
data Type = IntT | BoolT
data Value = IntV Int | BoolV Bool | OperatorCall String [Value]
data Expr = LetE String Value | ProcedureCall String [Value]
As you can see, I've divided functions into operators (which return a value) and procedures (which don't return anything and act as expressions instead of values). A function call contains the string id of the function being called and the list of arguments being passed. Also, a program is just a list of expressions (I've omitted user defined functions here for the sake of simplicity)
My problem comes from the fact that I need to write a function that parses a function call from a string:
parseFunctionCall :: String -> ???
...
The return type of that function can be a Value (for operator calls) or an Expr (for procedure calls). This function is rather complicated and I'd prefer to avoid writing it twice, or polluting it with an Either return type. What should I do? How can I change my types so that this can be achieved cleanly? Something like this perhaps, but I don't think this is the way:
type FunctionCall = (String, [Value])
data Value = ... | OperatorCall FunctionCall
data Expr = ... | ProcedureCall FunctionCall
parseAsFunctionCall :: String -> FunctionCall
...
You can have the function call parser return (String, [Value]), and let the caller fix that up into whatever data structure they like best -- in your case, by applying \(s, vs) -> OperatorCall s vs if parsing a value or \(s, vs) -> ProcedureCall s vs if parsing an expression.
parseFunctionCall :: Parser (String, [Value])
parseLiteralInt :: Parser Int
parseLiteralBool :: Parser Bool
parseLet :: Parser (String, Value)
(parseFunctionCall, parseLiteralInt, parseBool, parseLet) = {- ... -}
parseValue :: Parser Value
parseValue =
((\(s, vs) -> OperatorCall s vs) <$> parseFunctionCall)
<|>
(IntV <$> parseLiteralInt)
<|>
(BoolV <$> parseLiteralBool)
parseExpr :: Parser Expr
((\(s, vs) -> ProcedureCall s vs) <$> parseFunctionCall)
<|>
((\(s, v) -> Let s v) <$> parseLet)

Why the newtype syntax creates a function

I look at this declaration:
newtype Parser a = Parser { parse :: String -> Maybe (a,String) }
Here is what I understand:
1) Parser is declared as a type with a type parameter a
2) You can instantiate Parser by providing a parser function for example p = Parser (\s -> Nothing)
What I observed is that suddenly I have a function name parse defined and it is capable of running Parsers.
For example, I can run:
parse (Parser (\s -> Nothing)) "my input"
and get Nothing as output.
How was this parse function got defined with this specific signature? How does this function "know" to execute the Parser given to it? Hope that someone can clear my confusion.
Thanks!
When you write newtype Parser a = Parser { parse :: String -> Maybe (a,String) } you introduce three things:
A type named Parser.
A term level constructor of Parsers named Parser. The type of this function is
Parser :: (String -> Maybe (a, String)) -> Parser a
You give it a function and it wraps it inside a Parser
A function named parse to remove the Parser wrapper and get your function back. The type of this function is:
parse :: Parser a -> String -> Maybe (a, String)
Check yourself in ghci:
Prelude> newtype Parser a = Parser { parse :: String -> Maybe (a,String) }
Prelude> :t Parser
Parser :: (String -> Maybe (a, String)) -> Parser a
Prelude> :t parse
parse :: Parser a -> String -> Maybe (a, String)
Prelude>
It's worth nothing that the term level constructor (Parser) and the function to remove the wrapper (parse) are both arbitrary names and don't need to match the type name. It's common for instance to write:
newtype Parser a = Parser { unParser :: String -> Maybe (a,String) }
this makes it clear unParse removes the wrapper around the parsing function. However, I recommend your type and constructor have the same name when using newtypes.
How does this function "know" to execute the Parser given to it
You are unwrapping the function using parse and then calling the unwrapped function with "myInput".
First, let’s have a look at a parser newtype without record syntax:
newtype Parser' a = Parser' (String -> Maybe (a,String))
It should be obvious what this type does: it stores a function String -> Maybe (a,String). To run this parser, we will need to make a new function:
runParser' :: Parser' a -> String -> Maybe (a,String)
runParser' (Parser' p) i = p i
And now we can run parsers like runParser' (Parser' $ \s -> Nothing) "my input".
But now note that, since Haskell functions are curried, we can simply remove the reference to the input i to get:
runParser'' :: Parser' a -> (String -> Maybe (a,String))
runParser'' (Parser' p) = p
This function is exactly equivalent to runParser', but you could think about it differently: instead of applying the parser function to the value explicitly, it simply takes a parser and fetches the parser function from it; however, thanks to currying, runParser'' can still be used with two arguments.
Now, let’s go back to back to your original type:
newtype Parser a = Parser { parse :: String -> Maybe (a,String) }
The only difference between your type and mine is that your type uses record syntax, although it may be a bit hard to recognise since a newtype can only have one field; this record syntax automatically defines a function parse :: Parser a -> (String -> Maybe (a,String)), which extracts the String -> Maybe (a,String) function from the Parser a. Hopefully the rest should be obvious: thanks to currying, parse can be used with two arguments rather than one, and this simply has the effect of running the function stored within the Parser a. In other words, your definition is exactly equivalent to the following code:
newtype Parser a = Parser (String -> Maybe (a,String))
parse :: Parser a -> (String -> Maybe (a,String))
parse (Parser p) = p

Deserializing many network messages without using an ad-hoc parser implementation

I have a question pertaining to deserialization. I can envision a solution using Data.Data, Data.Typeable, or with GHC.Generics, but I'm curious if it can be accomplished without generics, SYB, or meta-programming.
Problem Description:
Given a list of [String] that is known to contain the fields of a locally defined algebraic data type, I would like to deserialize the [String] to construct the target data type. I could write a parser to do this, but I'm looking for a generalized solution that will deserialize to an arbitrary number of data types defined within the program without writing a parser for each type. With knowledge of the number and type of value constructors an algebraic type has, it's as simple as performing a read on each string to yield the appropriate values necessary to build up the type. However, I don't want to use generics, reflection, SYB, or meta-programming (unless it's otherwise impossible).
Say I have around 50 types defined similar to this (all simple algebraic types composed of basic primitives (no nested or recursive types, just different combinations and orderings of primitives) :
data NetworkMsg = NetworkMsg { field1 :: Int, field2 :: Int, field3 :: Double}
data NetworkMsg2 = NetworkMsg2 { field1 :: Double, field2 :: Int, field3 :: Double }
I can determine the data-type to be associated with a [String] I've received over the network using a tag id that I parse before each [String].
Possible conjectured solution path:
Since data constructors are first-class values in Haskell, and actually have a type-- Can NetworkMsg constructor be thought of as a function, such as:
NetworkMsg :: Int -> Int -> Double -> NetworkMsg
Could I transform this function into a function on tuples using uncurryN then copy the [String] into a tuple of the same shape the function now takes?
NetworkMsg' :: (Int, Int, Double) -> NetworkMsg
I don't think this would work because I'd need knowledge of the value constructors and type information, which would require Data.Typeable, reflection, or some other metaprogramming technique.
Basically, I'm looking for automatic deserialization of many types without writing type instance declarations or analyzing the type's shape at run-time. If it's not feasible, I'll do it an alternative way.
You are correct in that the constructors are essentially just functions so you can write generic instances for any number of types by just writing instances for the functions. You'll still need to write a separate instance
for all the different numbers of arguments, though.
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
import Text.Read
import Control.Applicative
class FieldParser p r where
parseFields :: p -> [String] -> Maybe r
instance Read a => FieldParser (a -> r) r where
parseFields con [a] = con <$> readMaybe a
parseFields _ _ = Nothing
instance (Read a, Read b) => FieldParser (a -> b -> r) r where
parseFields con [a, b] = con <$> readMaybe a <*> readMaybe b
parseFields _ _ = Nothing
instance (Read a, Read b, Read c) => FieldParser (a -> b -> c -> r) r where
parseFields con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
parseFields _ _ = Nothing
{- etc. for as many arguments as you need -}
Now you can use this type class to parse any message based on the constructor as long as the type-checker is able to figure out the resulting message type from context (i.e. it is not able to deduce it simply from the given constructor for these sort of multi-param type class instances).
data Test1 = Test1 {fieldA :: Int} deriving Show
data Test2 = Test2 {fieldB ::Int, fieldC :: Float} deriving Show
test :: String -> [String] -> IO ()
test tag fields = case tag of
"Test1" -> case parseFields Test1 fields of
Just (a :: Test1) -> putStrLn $ "Succesfully parsed " ++ show a
Nothing -> putStrLn "Parse error"
"Test2" -> case parseFields Test2 fields of
Just (a :: Test2) -> putStrLn $ "Succesfully parsed " ++ show a
Nothing -> putStrLn "Parse error"
I'd like to know how exactly you use the message types in the application, though, because having each message as its separate type makes it very difficult to have any sort of generic message handler.
Is there some reason why you don't simply have a single message data type? Such as
data NetworkMsg
= NetworkMsg1 {fieldA :: Int}
| NetworkMsg2 {fieldB :: Int, fieldC :: Float}
Now, while the instances are built in pretty much the same way, you get much better type inference since the result type is always known.
instance Read a => MessageParser (a -> NetworkMsg) where
parseMsg con [a] = con <$> readMaybe a
instance (Read a, Read b) => MessageParser (a -> b -> NetworkMsg) where
parseMsg con [a, b] = con <$> readMaybe a <*> readMaybe b
instance (Read a, Read b, Read c) => MessageParser (a -> b -> c -> NetworkMsg) where
parseMsg con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
parseMessage :: String -> [String] -> Maybe NetworkMsg
parseMessage tag fields = case tag of
"NetworkMsg1" -> parseMsg NetworkMsg1 fields
"NetworkMsg2" -> parseMsg NetworkMsg2 fields
_ -> Nothing
I'm also not sure why you want to do type-generic programming specifically without actually using any of the tools meant for generics. GHC.Generics, SYB or Template Haskell is usually the best solution for this kind of problem.

Haskell type declarations

In Haskell, why does this compile:
splice :: String -> String -> String
splice a b = a ++ b
main = print (splice "hi" "ya")
but this does not:
splice :: (String a) => a -> a -> a
splice a b = a ++ b
main = print (splice "hi" "ya")
>> Type constructor `String' used as a class
I would have thought these were the same thing. Is there a way to use the second style, which avoids repeating the type name 3 times?
The => syntax in types is for typeclasses.
When you say f :: (Something a) => a, you aren't saying that a is a Something, you're saying that it is a type "in the group of" Something types.
For example, Num is a typeclass, which includes such types as Int and Float.
Still, there is no type Num, so I can't say
f :: Num -> Num
f x = x + 5
However, I could either say
f :: Int -> Int
f x = x + 5
or
f :: (Num a) => a -> a
f x = x + 5
Actually, it is possible:
Prelude> :set -XTypeFamilies
Prelude> let splice :: (a~String) => a->a->a; splice a b = a++b
Prelude> :t splice
splice :: String -> String -> String
This uses the equational constraint ~. But I'd avoid that, it's not really much shorter than simply writing String -> String -> String, rather harder to understand, and more difficult for the compiler to resolve.
Is there a way to use the second style, which avoids repeating the type name 3 times?
For simplifying type signatures, you may use type synonyms. For example you could write
type S = String
splice :: S -> S -> S
or something like
type BinOp a = a -> a -> a
splice :: BinOp String
however, for something as simple as String -> String -> String, I recommend just typing it out. Type synonyms should be used to make type signatures more readable, not less.
In this particular case, you could also generalize your type signature to
splice :: [a] -> [a] -> [a]
since it doesn't depend on the elements being characters at all.
Well... String is a type, and you were trying to use it as a class.
If you want an example of a polymorphic version of your splice function, try:
import Data.Monoid
splice :: Monoid a=> a -> a -> a
splice = mappend
EDIT: so the syntax here is that Uppercase words appearing left of => are type classes constraining variables that appear to the right of =>. All the Uppercase words to the right are names of types
You might find explanations in this Learn You a Haskell chapter handy.

Resources