Using Parsec to parse configurations - haskell

Here I have in mind that a possible configuration is a tree of specifications, each specification has a corresponding keyword (the string) and type. Something like this:
data Select = And | Or
data ConfigTree = Node Select [ConfigTree] | Leaf (String, *)
I'm not sure how to write this properly, given that there's no "type of types", but never mind that for the moment.
Now, given such a tree, I want to build a parser that can read a possible valid configuration; I assume I already have sub-parsers that can parse keyword/type pairs.
For instance, a possible configuration tree is:
Node And [ Leaf ("width", Double)
         , Node Or [ Leaf ("height", Double), Leaf ("aspectratio", Double) ]
         ]
which can specify the size of a rectangle. A possible configuration file would be, say:
aspectratio = 2
width = 10
(Let's assume that a configuration file is just a list of newline separated pairs, keyword = blah, where blah is something the corresponding parser for that keyword can deal with; but they can be in any order, and just have to match up with one possible "valid subset" of the tree, where a valid subset is any subset containing the top node, that contains all the children of an "and" node it contains, and say exactly one child of an "or" node it contains.)
I have no idea how to even start building such a parser. Can anyone give some tips about how to proceed, or a way to completely restructure the above ConfigTree datatype to something more amenable to parsing?

The problem with building a parser for this is that your input format doesn't match your data type at all. The input format is a simple, easily parsable list of key-value pairs, while your data type is a tree. For example, to determine whether all the sub-trees in an And node are valid, you have to know the complete input.
So, instead of validating the list of key-value pairs directly in the parser, just do it afterwards.
I've put together a small example to show what I mean:
data Type = TDouble | TString
data Select = And | Or
data ConfigTree = Node Select [ConfigTree] | Leaf (String, Type)
-- matches a list of key-value pairs against a tree
match :: [(String, String)] -> ConfigTree -> Bool
match sts (Leaf (s, t)) = case filter ((== s) . fst) sts of
  -- we don't want multiple occurrences of a key
  [(_, v)] -> valid v t
  _        -> False
match sts (Node And cfgs) = and . map (match sts) $ cfgs
-- not completely what you described, because it will match 1 or more
match sts (Node Or cfgs) = or . map (match sts) $ cfgs
-- validates a string against a type
valid :: String -> Type -> Bool
valid s TDouble = case reads s :: [(Double, String)] of
  [(_, "")] -> True
  _         -> False
valid _ TString = True
-- this is what you actually parsed
config = [ ("aspectratio", "2")
         , ("width", "123")
         , ("name", "Sam")
         ]
-- the example tree
cfgTree = Node And [ Leaf ("width", TDouble)
                   , Node Or [ Leaf ("height", TDouble), Leaf ("aspectratio", TDouble) ]
                   ]
I don't think that this is a particularly useful example, because all it does is check if your config data are valid, it doesn't extract them, but I hope it demonstrates what I meant.
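To make the "parse first, validate afterwards" idea concrete, here is a minimal sketch of both halves. It deliberately uses only base functions (lines and break) rather than Parsec, since the format is just key = value lines. The names trim, parseLine, parseConfig and matchStrict are illustrative and not from the answer above. matchStrict differs from match in one respect: an Or node must have exactly one matching child, which is closer to what the question describes. It still ignores keys that the tree does not mention.

```haskell
import Data.Char (isSpace)

data Type = TDouble | TString
data Select = And | Or
data ConfigTree = Node Select [ConfigTree] | Leaf (String, Type)

-- Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhile isSpace . reverse . dropWhile isSpace . reverse

-- Parse one "key = value" line; Nothing if there is no '=' or the key is empty.
parseLine :: String -> Maybe (String, String)
parseLine l = case break (== '=') l of
  (k, '=' : v) | not (null (trim k)) -> Just (trim k, trim v)
  _                                  -> Nothing

-- Parse a whole file, skipping blank lines; fails if any line is malformed.
parseConfig :: String -> Maybe [(String, String)]
parseConfig = traverse parseLine . filter (not . all isSpace) . lines

-- Validates a string against a type (same as in the answer).
valid :: String -> Type -> Bool
valid s TDouble = case reads s :: [(Double, String)] of
  [(_, "")] -> True
  _         -> False
valid _ TString = True

-- Like match, but an Or node must have exactly one matching child.
-- Assumption: keys that appear in no leaf are silently ignored.
matchStrict :: [(String, String)] -> ConfigTree -> Bool
matchStrict kvs (Leaf (k, t)) = case filter ((== k) . fst) kvs of
  [(_, v)] -> valid v t
  _        -> False
matchStrict kvs (Node And cfgs) = all (matchStrict kvs) cfgs
matchStrict kvs (Node Or cfgs)  = length (filter (matchStrict kvs) cfgs) == 1
```

With the rectangle tree from the question, fmap (`matchStrict` tree) (parseConfig contents) gives Nothing for a malformed file and Just True or Just False otherwise.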

Related

How to use constructors correctly when writing structures and what are the alternatives to Null?

I'm trying to write a red-black tree in haskell. In the properties of the red-black tree there is a note that all leaves that do not contain data are black.
I want to write something like this:
data EmptyNode = EmptyNode{
data = ???,
color = ???, <-- it should black (assume that it is False)
left = ???,
right = ???
}
data NodeBR a = NodeBR {
data :: a,
color :: Bool,
left :: NodeBR,
right :: NodeBR
}
data TreeBR a = EmptyNode | NodeBR a (TreeBR a) (TreeBR a)
I don't understand two things: what should I use in place of the null found in other languages (as I understand it, undefined cannot be used here), and, when working with constructors, how can I specify that color in EmptyNode defaults to False?
One solution is to define the red-black tree (RBT for short) in a more "Haskell" way. So there are two possible ways to construct an RBT:
Leaf (EmptyNode in your code)
Node a c l r (NodeBR in your code), where a is the data, c is the color, l and r are also RBT.
data RBT a = Leaf | Node a Bool (RBT a) (RBT a)
In the line above, we have defined datatype RBT a with two constructors:
Leaf :: RBT a
Node :: a -> Bool -> RBT a -> RBT a -> RBT a
which means:
Leaf is of type RBT a
Node is a function, taking a, Bool(color), RBT a (left tree), RBT a (right tree), which returns an RBT a.
Therefore, we don't need to specify NULL in this case for Leaf, as there is no need for saying so at all (i.e., with Leaf, we want to say there is no data, no left/right subtree).
To treat Leaf as black, we could define a function on RBT a using pattern matching:
color :: RBT a -> Bool
color Leaf = False
color (Node _ c _ _) = c
The record syntax which you have mentioned in your code is just syntactic sugar for generating such a color function. But in this case, record syntax cannot generate the correct code for the Leaf case, because the Leaf constructor has no color field.
Thus, when you have to check if an RBT a is Leaf or not, you can just use pattern matching instead of "if-with-null" in other languages.
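As a minimal sketch of that pattern-matching style, the two functions below replace the usual null checks. The names isEmpty and member are illustrative and not from the question, and the Bool colour follows the assumption above that False means black.

```haskell
data RBT a = Leaf | Node a Bool (RBT a) (RBT a)

-- "Is this null?" becomes a match on the Leaf constructor.
isEmpty :: RBT a -> Bool
isEmpty Leaf = True
isEmpty _    = False

-- Lookup: the Leaf clause is the base case that a null check would guard.
member :: Ord a => a -> RBT a -> Bool
member _ Leaf = False
member x (Node y _ l r) = case compare x y of
  LT -> member x l
  GT -> member x r
  EQ -> True
```

Because Leaf carries no fields, there is simply nothing to dereference in the first clause of member, so the compiler can check that both cases are handled.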
Update:
As mentioned by amalloy in the comments, we could define the color as a separate datatype for better readability:
data Color = Red | Black
data RBT a = Leaf | Node a Color (RBT a) (RBT a)
color :: RBT a -> Color
color Leaf = Black
color (Node _ c _ _) = c
Note that Bool and Color are isomorphic.

How to do a Read instance for a simple custom format of data

I'm trying to use a short, easy-to-read format to show and read my data, and I would like to be able to use it from the Haskell interpreter, so that I can type or copy-paste inputs while I try new functions.
My data is a list of Int numbers, each with a boolean property that I associate with + and -. The first is the default, so it doesn't need an explicit representation (as with the usual sign). I would like to write the - after the number, as in this example:
[2, 5-, 4, 0-, 1, 6-, 2-]
Note that I cannot use the usual sign because I need to be able to assign - to 0, so that 0- is different from 0 (also, in the future I may need to use negative numbers, as in [-4-, -2]).
I did the easy part, which is to define the data type for the terms of the list and implement the show function.
data Term = T Int Bool deriving (Eq)
instance Show Term where
  show (T v True)  = show v
  show (T v False) = show v ++ "-"
What I don't know is how to do the corresponding read function, or whether I cannot use the - sign, because it is a sign of the Haskell language. Suggestions are welcome.
Try something like this:
instance Read Term where
  readsPrec n s = do
    (i, rest) <- readsPrec (n+1) s       -- read `i :: Int`
    return $ case rest of                -- look at the rest of the string
      ('-':rest') -> (T i False, rest')  -- if it starts with '-'...
      rest'       -> (T i True, rest')   -- if it doesn't...
Read in Haskell closely follows the idea that a parser can be represented by the type String -> [(a, String)] (this type is given the type synonym ReadS). To familiarize yourself with this idea of parsing, I recommend reading the following functional pearl on monadic parsing.
Then, from GHCi:
ghci> read "[2, 5-, 4, 0-, 1, 6-, 2-]" :: [Term]
[2,5-,4,0-,1,6-,2-]
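The ReadS idea mentioned above can be seen in miniature with two hand-written parsers. This is only a sketch to show the shape of the type; digit and twoDigits are made-up names, not library functions.

```haskell
-- ReadS is defined in the Prelude as: type ReadS a = String -> [(a, String)]

-- Parse a single decimal digit; the empty list means "no parse".
digit :: ReadS Int
digit (c : cs)
  | c >= '0' && c <= '9' = [(fromEnum c - fromEnum '0', cs)]
digit _ = []

-- Run one parser, then another on the leftover input.
-- The list comprehension threads the remaining string through by hand.
twoDigits :: ReadS (Int, Int)
twoDigits s = [((a, b), rest') | (a, rest) <- digit s, (b, rest') <- digit rest]
```

Each result pairs a parsed value with the unconsumed input, which is exactly what the readsPrec instance above does when it inspects rest for a trailing '-'.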
I like very much the answer of Alec, which I accepted. But after reading, thinking and trying, I reached another quite simple solution, that I would like to share here.
It uses reads instead of readsPrec because the Term constructor is not infix, so we don't need to manage precedence, and it is not monadic.
instance Read Term where
  readsPrec _ s =
    [(T v False, rest) | (v, '-' : rest) <- reads s] ++
    [(T v True , rest) | (v, rest) <- reads s]
The symmetry with the corresponding Show instance is notable:
instance Show Term where
  show (T v True)  = show v
  show (T v False) = show v ++ "-"
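One way to check that the two instances really mirror each other is a round trip through show and read. The sketch below repeats both instances so that it is self-contained; roundTrips is a hypothetical helper name.

```haskell
data Term = T Int Bool deriving (Eq)

instance Show Term where
  show (T v True)  = show v
  show (T v False) = show v ++ "-"

instance Read Term where
  readsPrec _ s =
    [(T v False, rest) | (v, '-' : rest) <- reads s] ++
    [(T v True , rest) | (v, rest) <- reads s]

-- Does a list of terms survive show followed by read?
roundTrips :: [Term] -> Bool
roundTrips ts = read (show ts) == ts
```

Note that for an input such as "5-" the instance returns two candidate parses, one consuming the '-' and one leaving it behind. read keeps only the parse that consumes the whole input, so the ambiguity does not surface here.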

How can I get the default value for a type?

I am trying to build the graph ADT in Haskell.
I don't know how to get the default value for a generic type.
type Node = Int
type Element a = (Node, a, [Int]) --Node: ID; a: generic value; [Int]: adjacent nodes' IDs
type Graph a = [Element a]
insNode :: Graph a -> Node -> Graph a
insNode g n = g ++ [(n,?,[])]
What do I have to write in place of ? in order to get the default value for type a?
Many thanks in advance!
You can't. There's no way to magically create a value of an arbitrary type.
There's undefined :: a, but if ever evaluated this will crash your program. Instead, I'd suggest either
type Element a = (Node, Maybe a, [Int]) -- It's optional to have a value
or
insNode :: a -> Node -> Graph a -> Graph a
-- Optional idea, use `Data.Default` to ease the typing burden
defInsNode :: Default a => Node -> Graph a -> Graph a
defInsNode = insNode def
With the first option you can just stick Nothing in there (which is really what you have), or with the second option you just require the user to provide a value.
Finally a style note, instead of synonyms for tuples, I'd suggest using
type Node = Int
data Element a = Element { node  :: Node
                         , val   :: Maybe a
                         , edges :: [Node]
                         }
  deriving (Eq, Show)
Now you construct these Elements with Element node val edges and can pattern match in much the same way. Tuples of more than 2 elements are usually the wrong way to go.
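Putting the record type and the Maybe option together, the original insNode could look roughly like this. It is a sketch under the assumption that a freshly inserted node has no value and no edges; setVal is a hypothetical helper added only to show how a value is filled in later.

```haskell
type Node = Int

data Element a = Element
  { node  :: Node
  , val   :: Maybe a
  , edges :: [Node]
  } deriving (Eq, Show)

type Graph a = [Element a]

-- Insert a node with no value yet; Nothing takes the place of the "?".
insNode :: Graph a -> Node -> Graph a
insNode g n = g ++ [Element n Nothing []]

-- Set the value of an existing node, leaving all other nodes untouched.
setVal :: Node -> a -> Graph a -> Graph a
setVal n x = map update
  where
    update e
      | node e == n = e { val = Just x }
      | otherwise   = e
```

No default value for a is ever needed, because the absence of a value is represented explicitly.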

Nil Value for Tree a -> a in Haskell

So I have a tree defined as
data Tree a = Leaf | Node a (Tree a) (Tree a) deriving Show
I know I can define Leaf to be Leaf a. But I really just want my nodes to have values. My problem is that when I do a search I have a return value function of type
Tree a -> a
Since leaves have no value, I am confused about how to say "if you encounter a leaf, do nothing". I tried nil, " ", ' ', and []; nothing seems to work.
Edit Code
data Tree a = Leaf | Node a (Tree a) (Tree a) deriving Show
breadthFirst :: Tree a -> [a]
breadthFirst x = _breadthFirst [x]
_breadthFirst :: [Tree a] -> [a]
_breadthFirst [] = []
_breadthFirst xs = map treeValue xs ++
                   _breadthFirst (concat (map immediateChildren xs))
immediateChildren :: Tree a -> [Tree a]
immediateChildren (Leaf) = []
immediateChildren (Node n left right) = [left, right]
treeValue :: Tree a -> a
treeValue (Leaf) = //this is where i need nil
treeValue (Node n left right) = n
test = breadthFirst (Node 1 (Node 2 (Node 4 Leaf Leaf) Leaf) (Node 3 Leaf (Node 5 Leaf Leaf)))
main =
  do putStrLn $ show $ test
In Haskell, types do not have an "empty" or nil value by default. When you have something of type Integer, for example, you always have an actual number and never anything like nil, null or None.
Most of the time, this behavior is good. You can never run into null pointer exceptions when you don't expect them, because you can never have nulls when you don't expect them. However, sometimes we really need to have a Nothing value of some sort; your tree function is a perfect example: if we don't find a result in the tree, we have to signify that somehow.
The most obvious way to add a "null" value like this is to just wrap it in a type:
data Nullable a = Null | NotNull a
So if you want an Integer which could also be Null, you just use a Nullable Integer. You could easily add this type yourself; there's nothing special about it.
Happily, the Haskell standard library has a type like this already, just with a different name:
data Maybe a = Nothing | Just a
You can use this type in your tree function as follows:
treeValue :: Tree a -> Maybe a
treeValue (Node value _ _) = Just value
treeValue Leaf = Nothing
You can use a value wrapped in a Maybe by pattern-matching. So if you have a list of [Maybe a] and you want to get a [String] out, you could do this:
showMaybe (Just a) = show a
showMaybe Nothing = ""
myList = map showMaybe listOfMaybes
Finally, there are a bunch of useful functions defined in the Data.Maybe module. For example, there is mapMaybe which maps a list and throws out all the Nothing values. This is probably what you would want to use for your _breadthFirst function, for example.
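For a feel of what Data.Maybe offers, here is a small sketch using three of its functions together with readMaybe from Text.Read. The wrapper names parsedInts, presentValues and orZero are made up for illustration.

```haskell
import Data.Maybe (catMaybes, fromMaybe, mapMaybe)
import Text.Read (readMaybe)

-- Keep only the strings that parse as an Int; failed parses are dropped.
parsedInts :: [String] -> [Int]
parsedInts = mapMaybe readMaybe

-- Drop the Nothing values from a list that already contains Maybe values.
presentValues :: [Maybe a] -> [a]
presentValues = catMaybes

-- Supply a fallback for a single missing value.
orZero :: Maybe Int -> Int
orZero = fromMaybe 0
```

mapMaybe applies a function that may fail and keeps only the successes, which is the same shape as collecting node values while skipping leaves.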
So my solution in this case would be to use Maybe and mapMaybe. To put it simply, you'd change treeValue to
treeValue :: Tree a -> Maybe a
treeValue (Leaf) = Nothing
treeValue (Node n left right) = Just n
Then instead of using map to combine this, use mapMaybe (from Data.Maybe) which will automatically strip away the Just and ignore it if it's Nothing.
mapMaybe treeValue xs
Voila!
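Assembled into a complete program, the change might look like the sketch below. It keeps the question's function names, except that the helper is written as a local go instead of _breadthFirst.

```haskell
import Data.Maybe (mapMaybe)

data Tree a = Leaf | Node a (Tree a) (Tree a) deriving Show

treeValue :: Tree a -> Maybe a
treeValue Leaf         = Nothing
treeValue (Node n _ _) = Just n

immediateChildren :: Tree a -> [Tree a]
immediateChildren Leaf                = []
immediateChildren (Node _ left right) = [left, right]

-- Visit the tree level by level; leaves contribute no value and no children.
breadthFirst :: Tree a -> [a]
breadthFirst t = go [t]
  where
    go [] = []
    go xs = mapMaybe treeValue xs ++ go (concatMap immediateChildren xs)
```

For the test tree from the question, breadthFirst yields the values in level order: 1, then 2 and 3, then 4 and 5.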
Maybe is Haskell's way of saying "Something might not have a value" and is just defined like this:
data Maybe a = Just a | Nothing
It's the moral equivalent of having a Nullable type. Haskell just makes you acknowledge the fact that you'll have to handle the case where it is "null". When you need them, Data.Maybe has tons of useful functions, like mapMaybe available.
In this case, you can simply use a list comprehension instead of map and get rid of treeValue:
_breadthFirst xs = [n | Node n _ _ <- xs] ++ ...
This works because using a pattern on the left hand side of <- in a list comprehension skips items that don't match the pattern.
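Taking the same idea one step further, the children can be collected with a pattern in a comprehension as well, so neither treeValue nor immediateChildren is needed. This is a sketch of how the elided part might be completed, not necessarily what the answer had in mind.

```haskell
data Tree a = Leaf | Node a (Tree a) (Tree a) deriving Show

-- Both comprehensions silently skip Leaf, because Leaf does not match
-- the Node pattern on the left of "<-".
breadthFirst :: Tree a -> [a]
breadthFirst t = go [t]
  where
    go [] = []
    go xs = [n | Node n _ _ <- xs]
         ++ go (concat [[l, r] | Node _ l r <- xs])
```

The recursion stops once a level consists only of leaves, since that level produces an empty list of children.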
This function treeValue :: Tree a -> a can't be a total function, because not all Tree a values actually contain an a for you to return! A Leaf is analogous to the empty list [], which is still of type [a] but doesn't actually contain an a.
The head function from the standard library has the type [a] -> a, and it also can't work all of the time:
*Main> head []
*** Exception: Prelude.head: empty list
You could write treeValue to behave similarly:
treeValue :: Tree a -> a
treeValue Leaf = error "empty tree"
treeValue (Node n _ _) = n
But this won't actually help you, because now map treeValue xs will throw an error if any of the xs are Leaf values.
Experienced Haskellers usually try to avoid using head and functions like it for this very reason. Sure, there's no way to get an a from any given [a] or Tree a, but maybe you don't need to. In your case, you're really trying to get a list of a from a list of Tree, and you're happy for Leaf to simply contribute nothing to the list rather than throw an error. But treeValue :: Tree a -> a doesn't help you build that. It's the wrong function to help you solve your problem.
A function that helps you do whatever you need would be Tree a -> Maybe a, as explained very well in some other answers. That allows you to later decide what to do about the "missing" value. When you go to use the Maybe a, if there's really nothing else to do, you can call error then (or use fromJust which does exactly that), or you can decide what else to do. But a treeValue that claims to be able to return an a from any Tree a and then calls error when it can't denies any caller the ability to decide what to do if there's no a.
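To illustrate how returning Maybe a leaves the decision with the caller, here is a sketch with two different callers of the same treeValue. The names rootOr and describeRoot are hypothetical.

```haskell
import Data.Maybe (fromMaybe)

data Tree a = Leaf | Node a (Tree a) (Tree a)

treeValue :: Tree a -> Maybe a
treeValue Leaf         = Nothing
treeValue (Node n _ _) = Just n

-- Caller 1: fall back to a default value when the tree is empty.
rootOr :: a -> Tree a -> a
rootOr d = fromMaybe d . treeValue

-- Caller 2: turn the missing value into a message instead.
describeRoot :: Show a => Tree a -> String
describeRoot t = case treeValue t of
  Nothing -> "empty tree"
  Just n  -> "root is " ++ show n
```

Neither caller would be possible to write safely on top of a treeValue that calls error for Leaf.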

Adding a leaf to Binary Search Tree, Haskell

The type is defined as
data BST = MakeNode BST String BST
         | Empty
I'm trying to add a new leaf to the tree, but I don't really understand how to do it with recursion.
the function is set up like this
add :: String -> BST -> BST
The advantage of using binary search trees is that you only need to look at the "current part" of the tree to know where to insert the node.
So, let's define the add function:
add :: String -> BST -> BST
If you insert something into an empty tree (Case #1), you just create a leaf directly:
add s Empty = MakeNode Empty s Empty
If you want to insert something into a node (Case #2), you have to decide which sub-node to insert the value in. You use comparisons to do this test:
add s t@(MakeNode l p r) -- left, pivot, right
  | s > p = MakeNode l p (add s r) -- Insert into right subtree
  | s < p = MakeNode (add s l) p r -- Insert into left subtree
  | otherwise = t -- The tree already contains the value, so just return it
Note that this will not rebalance the binary tree. Rebalancing algorithms can be fairly involved and need noticeably more code. So, if you insert a sorted list into the binary tree (e.g. ["a", "b", "c", "d"]), it will become very unbalanced, effectively a linked list, and lookups degrade from logarithmic to linear time. If your input may arrive in sorted order, consider a self-balancing tree instead.
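The following sketch puts the two cases of add together and adds three small helpers to observe the result. fromList, toList and depth are illustrative names that do not appear in the question.

```haskell
data BST = MakeNode BST String BST
         | Empty

add :: String -> BST -> BST
add s Empty = MakeNode Empty s Empty
add s t@(MakeNode l p r)
  | s > p     = MakeNode l p (add s r)  -- insert into right subtree
  | s < p     = MakeNode (add s l) p r  -- insert into left subtree
  | otherwise = t                       -- already present

-- Insert the elements from left to right, starting with an empty tree.
fromList :: [String] -> BST
fromList = foldl (flip add) Empty

-- In-order traversal; for a search tree this yields the sorted contents.
toList :: BST -> [String]
toList Empty            = []
toList (MakeNode l p r) = toList l ++ [p] ++ toList r

-- Number of nodes on the longest path from the root downwards.
depth :: BST -> Int
depth Empty            = 0
depth (MakeNode l _ r) = 1 + max (depth l) (depth r)
```

Inserting ["a", "b", "c", "d"] in that order gives a tree of depth 4, one level per element, which is the unbalanced shape described above.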

Resources