I'm trying to understand what effect filter has on the shrink tree of a generator when using Hedgehog's integrated shrinking.
Consider the following function:
{-# LANGUAGE OverloadedStrings #-}
import Hedgehog
import qualified Hedgehog.Gen as Gen
aFilteredchar :: Gen Char
aFilteredchar =
  Gen.filter (`elem` ("x" :: String)) (Gen.element "yx")
When I print the shrink tree:
>>> Gen.printTree aFilteredchar
I get shrink trees that look as follows:
'x'
 └╼'x'
    └╼'x'
       └╼'x'
               ...
                   └╼<discard>
That is, a very deep tree containing only x's, with a discard at the end.
Why does the shrink function keep returning x's instead of an empty list, which would signal that no further shrinks are possible?
Gen is essentially a composition of a probability monad and a tree monad, and the behavior you observe mostly arises from the tree monad and the definition of Gen.filter.
Basically, Gen.filter p g is a simple monadic loop, try 0 where:
-- simplified body of filter
try k =
  if k > 100 then
    discard  -- empty tree
  else do
    x <- g
    if p x then
      pure x  -- singleton tree
    else
      try (k + 1)  -- keep looping
So to understand the tree you got, you must understand the tree monad under the do notation here.
The tree monad
The Tree type in hedgehog that is internally used by Gen looks roughly like this (if you are looking at the linked implementation in hedgehog, set m ~ Maybe):
data Tree a = Empty | Node a [Tree a] -- node label and children
There are many other Tree-like types that are monads, and the monadic bind (>>=) generally takes the form of a tree substitution.
Say you have a tree t = Node x [t1, t2, ...] :: Tree a, and a continuation/substitution k :: a -> Tree b, which replaces every node/variable x :: a with the tree k x :: Tree b. We can describe t >>= k in two steps, fmap then join, as follows. First the fmap applies the substitution on every node label. So we obtain a tree where every node is labeled by another tree. For concreteness, say k x = Node y [u1, u2, ...]:
fmap k t
=
Node
  (k x)                        -- node label
  [fmap k t1, fmap k t2, ...]  -- node children
=
Node
  (Node y [u1, u2, ...])       -- node label
  [fmap k t1, fmap k t2, ...]  -- node children
Then the join step flattens the nested tree structure, concatenating the children from inside the label with those outside:
t >>= k
=
join (fmap k t)
=
Node
  y
  ([join (fmap k t1), join (fmap k t2), ...] ++ [u1, u2, ...])
To complete the Monad instance, note that we have pure x = Node x [].
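The description above can be written out as a small, self-contained sketch. This is a simplified model following the answer's Tree type, not Hedgehog's actual implementation; the name joinTree and the treatment of Empty (an empty label makes the whole node empty) are assumptions made for illustration.

```haskell
-- A simplified model of the shrink tree; not Hedgehog's real tree type.
data Tree a = Empty | Node a [Tree a]
  deriving (Eq, Show)

instance Functor Tree where
  fmap _ Empty       = Empty
  fmap f (Node x ts) = Node (f x) (map (fmap f) ts)

-- Flatten a tree of trees: the flattened outer children come first,
-- followed by the children found inside the label.
-- Assumption: an Empty label discards the whole node.
joinTree :: Tree (Tree a) -> Tree a
joinTree Empty                 = Empty
joinTree (Node Empty _)        = Empty
joinTree (Node (Node y us) ts) = Node y (map joinTree ts ++ us)

instance Applicative Tree where
  pure x    = Node x []
  tf <*> tx = tf >>= \f -> fmap f tx

instance Monad Tree where
  t >>= k = joinTree (fmap k t)

-- Keep 'x' and discard everything else.
example :: Tree Char
example = Node 'x' [Node 'y' []] >>= \c -> if c == 'x' then pure c else Empty
```

Here example evaluates to Node 'x' [Empty]: the label 'x' is kept, and the shrink 'y' is replaced by whatever the continuation returns for it.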
The try loop
Now that we have some intuition for the tree monad, we can turn to your particular generator. We want to evaluate try k above, where p = (== 'x') and g = Gen.element "yx". I'm waving my hands here, but you should imagine that g evaluates randomly to either the tree Node 'y' [] (generate 'y' with no shrinks), a.k.a. pure 'y', or Node 'x' [Node 'y' []] (generate 'x' and shrink to 'y'; indeed, "elements shrinks to the left"), and that every occurrence of g is independent of the others, so we get a different result when we retry.
Let's examine each case separately. What happens if g = pure 'y'? Assume k <= 100 so we're in the else branch of the toplevel if, already simplified away below:
-- simplified body of filter
try k = do
  c <- pure 'y'  -- g = pure 'y'
  if c == 'x' then  -- p c = (c == 'x')
    pure c
  else
    try (k + 1)

-- since (do c <- pure 'y' ; s c) = s 'y' (monad law) and ('y' == 'x') = False
try k = try (k + 1)
So all the times where g evaluates to pure 'y' end up simplified away as the recursive term try (k + 1), and we are left with the cases where g evaluates to the other tree Node 'x' [Node 'y' []]:
try k = do
  c <- Node 'x' [Node 'y' []]  -- g
  if c == 'x' then
    pure c
  else
    try (k + 1)
As illustrated in the previous section, the monadic bind is equivalent to the following, and we finish with some equational reasoning.
try k = join (Node (s 'x') [Node (s 'y') []])
  where
    s c = if c == 'x' then pure c else try (k + 1)

try k = join (Node (pure 'x') [Node (try (k + 1)) []])
try k = join (Node (pure 'x') [pure (try (k + 1))])  -- simplifying join
try k = Node 'x' [join (pure (try (k + 1)))]         -- join . pure = id
try k = Node 'x' [try (k + 1)]
In summary, starting from try 0, with probability one half try k = try (k + 1), and with probability one half try k = Node 'x' [try (k + 1)]; the loop finally stops with a discard once k exceeds 100. This explains the tree you observe.
try 0 = Node 'x' [Node 'x' [ ... ]] -- about 50 nodes
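The summary can be checked with a small deterministic model. This is only a sketch: the Sample type and the names tryFrom and countNodes are hypothetical, and the random choices of g are replaced by an explicit list of outcomes, one per use of g.

```haskell
-- Simplified shrink tree, as in the answer (not Hedgehog's real type).
data Tree a = Empty | Node a [Tree a]
  deriving (Eq, Show)

-- The two trees g can evaluate to (hypothetical names).
data Sample
  = GotY  -- g = pure 'y'
  | GotX  -- g = Node 'x' [Node 'y' []]

-- tryFrom k samples models `try k`, consuming one sample per use of g.
tryFrom :: Int -> [Sample] -> Tree Char
tryFrom k samples
  | k > 100   = Empty                      -- discard
  | otherwise = case samples of
      []          -> Empty                 -- out of samples (model only)
      GotY : rest -> tryFrom (k + 1) rest  -- try k = try (k + 1)
      GotX : rest -> Node 'x' [tryFrom (k + 1) rest]

-- Number of labelled nodes in a tree.
countNodes :: Tree a -> Int
countNodes Empty       = 0
countNodes (Node _ ts) = 1 + sum (map countNodes ts)
```

With alternating outcomes, countNodes (tryFrom 0 (cycle [GotY, GotX])) is 50: a chain of 'x' nodes ending in Empty, the shape printed in the question.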
(I believe this also provides at least a partial answer to your other question, since this shows how shrinking a Gen.filter often amounts to rerunning the generator from scratch.)
While Li-yao Xia's detailed answer correctly describes how this happens, it doesn't address the why; why does it re-run the generator after each shrink? The answer is that it shouldn't; this is a bug. See bug report Improve Filter on GitHub.
I'm working with a binary tree type constructor, which is:
data Tree a = Leaf a | Branch a (Tree a) (Tree a)
How do we prove that not all kinds of binary tree can be represented by this constructor? How can we improve this definition to cover all types of binary tree? And how does it work?
Your Tree a has labels of type a at every Branch and every Leaf constructor. So, for example, Branch 'u' (Branch 'n' (Leaf 'i') (Leaf 'p')) (Leaf 'z') looks like this:
   +-'u'-+
   |     |
+-'n'-+ 'z'
|     |
'i'   'p'
That excludes, say, trees with different labels at the nodes and leaves, or trees that are labelled only internally or only externally. For example, this tree has numbers at the leaves and characters at the nodes.
   +-'o'-+
   |     |
+-'h'-+  9
|     |
7     2
(You can use Tree (Either n l) but that doesn't encode the invariant that only ns appear internally and only ls appear externally.)
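To make the parenthetical concrete, here is a small sketch of the Tree (Either n l) encoding. The values mixed and swapped and the check wellFormed are hypothetical names introduced for illustration.

```haskell
data Tree a = Leaf a | Branch a (Tree a) (Tree a)

-- The example tree: characters at the nodes, numbers at the leaves.
mixed :: Tree (Either Char Int)
mixed =
  Branch (Left 'o')
    (Branch (Left 'h') (Leaf (Right 7)) (Leaf (Right 2)))
    (Leaf (Right 9))

-- This also type-checks, although a number labels the node and
-- characters label the leaves.
swapped :: Tree (Either Char Int)
swapped = Branch (Right 1) (Leaf (Left 'a')) (Leaf (Left 'b'))

-- A runtime check for the invariant that the type cannot express.
wellFormed :: Tree (Either n l) -> Bool
wellFormed (Leaf (Right _))      = True
wellFormed (Branch (Left _) l r) = wellFormed l && wellFormed r
wellFormed _                     = False
```

Both values have the same type, so only a runtime check such as wellFormed can tell them apart.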
Since this appears to be a homework assignment I won't tell you what a more general type of tree might look like, but I'm sure you can figure it out.
Ask yourself, how many a values can your tree hold? They appear either in leaves or nodes,
data Tree a = Leaf a | Branch a (Tree a) (Tree a)
so
num_values = 1 | ( 1 + num_values + num_values )
It doesn't make much sense in this form, so let's write it as
numvals = 1 : [ 1 + s | s <- diagonalize
                              [ [ n + m | m <- numvals ]
                              | n <- numvals ] ]

diagonalize :: [[a]] -> [a]
diagonalize ((n:ns):t) = n : go [ns] t
  where
    go as (b:bs) = map head as ++ go (b : map tail as) bs
so that we get
~> take 100 numvals
[1,3,5,5,7,7,7,7,7,9,9,9,9,9,11,9,9,9,11,11,11,11,9,9,11,13,11,13,11,9,9,11,13,1
3,13,13,11,9,9,11,13,13,15,13,13,11,11,11,11,13,13,15,15,13,13,11,11,11,13,13,13
,15,15,15,13,13,13,11,11,13,15,13,15,15,15,15,13,15,13,11,11,13,15,15,15,15,15,1
5,15,15,15,13,11,11,13,15,15,17,15,15]
but you want 0, 2, 4, ... to appear there as well.
edit:
It is easy to fix this, with
data Tree a = Leaf | Branch a (Tree a) (Tree a)
Now
numvals2 = 0 : [ 1 + s | s <- diagonalize
                               [ [ n + m | m <- numvals2 ]
                               | n <- numvals2 ] ]
and
~> take 100 numvals2
[0,1,2,2,3,3,3,3,3,4,4,4,4,4,5,4,4,4,5,5,5,5,4,4,5,6,5,6,5,4,4,5,6,6,6,6,5,4,4,5
,6,6,7,6,6,5,5,5,5,6,6,7,7,6,6,5,5,5,6,6,6,7,7,7,6,6,6,5,5,6,7,6,7,7,7,7,6,7,6,5
,5,6,7,7,7,7,7,7,7,7,7,6,5,5,6,7,7,8,7,7]
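A quick way to see the difference is to count the values directly. The following is a minimal sketch for the revised type; size and the three example trees are names introduced here for illustration.

```haskell
data Tree a = Leaf | Branch a (Tree a) (Tree a)

-- Number of values of type a stored in the tree.
size :: Tree a -> Int
size Leaf           = 0
size (Branch _ l r) = 1 + size l + size r

-- Trees holding zero, one and two values.
none, one, two :: Tree Char
none = Leaf
one  = Branch 'a' Leaf Leaf
two  = Branch 'a' (Branch 'b' Leaf Leaf) Leaf
```

With the original definition every Leaf carries a value and every Branch adds one value plus two subtrees, so the count is always odd; here map size [none, one, two] gives [0,1,2].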
Making tree like data structures is relatively easy in Haskell. However, what if I want a structure like the following:
      A (root)
     / \
    B   C
   / \ / \
  D   E   F
So if I traverse down the structure through B to update E, the returned new updated structure also has E updated if I traverse through C.
Could someone give me some hints about how to achieve this? You can assume there are no loops.
I would flatten the data structure to an array, and operate on this instead:
import Data.Array
type Tree = Array Int -- Bounds should start at (1) and go to sum [1..n]
data TreeTraverse = TLeft TreeTraverse | TRight TreeTraverse | TStop
Given some traverse directions (left, right, stop), it's easy to see that if we go left, we simply add the current level to our position, and if we go right, we add the current level plus one:
getPosition :: TreeTraverse -> Int
getPosition = getPosition' 1 1
  where
    getPosition' level pos (TLeft ts)  = getPosition' (level+1) (pos+level) ts
    getPosition' level pos (TRight ts) = getPosition' (level+1) (pos+level + 1) ts
    getPosition' _     pos (TStop)     = pos
In your case, you want to traverse either ABE or ACE:
traverseABE = TLeft $ TRight TStop
traverseACE = TRight $ TLeft TStop
Since we already know how to get the position of your element, and Data.Array provides functions to get and set specific elements, we can use the following functions to get/set tree values:
getElem :: TreeTraverse -> Tree a -> a
getElem tt t = t ! getPosition tt
setElem :: TreeTraverse -> Tree a -> a -> Tree a
setElem tt t x = t // [(getPosition tt, x)]
To complete the code, let's use your example:
example = "ABCDEF"
exampleTree :: Tree Char
exampleTree = listArray (1, length example) example
And put everything to action:
main :: IO ()
main = do
  putStrLn $ "Traversing from A -> B -> E: " ++ [getElem traverseABE exampleTree]
  putStrLn $ "Traversing from A -> C -> E: " ++ [getElem traverseACE exampleTree]
  putStrLn $ "exampleTree: " ++ show exampleTree ++ "\n"
  putStrLn $ "Setting element from A -> B -> E to 'X', "
  let newTree = setElem traverseABE exampleTree 'X'
  putStrLn $ "but show via A -> C -> E: " ++ [getElem traverseACE newTree]
  putStrLn $ "newTree: " ++ show newTree ++ "\n"
Note that this is most likely not the best way to do this, but it was the first thing that came to mind.
Once you've established identity, it can be done.
But first you must establish identity.
In many languages, values can be distinct from each other, but equal. In Python, for example:
>>> a = [1]
>>> b = [1]
>>> a == b
True
>>> a is b
False
You want to update E in one branch of the tree, and also update all other elements for which that element is E. But Haskell is referentially transparent: it has no notion of things being the same object; only equality, and even that is not applicable for every object.
One way you could do this is equality. Say this was your tree:
     __A__
    /     \
   B       C
  / \     / \
 1   2   2   3
Then we could go through the tree and update all the 2s to, say, 4. But this isn't exactly what you want in some cases.
In Haskell, if you want to update one thing in multiple places, you'll have to be explicit about what is and isn't the same thing. Another way you could deal with this is to tag each different value with a unique integer, and use that integer to determine identity:
            ____________A___________
           /                        \
          B                          C
         / \                        / \
(id=1)"foo" (id=2)"bar"  (id=2)"bar" (id=3)"baz"
Then we could update all values with an identity of 2. Accidental collisions cannot be a problem, as there can be no collisions except those that are intentional.
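As a rough sketch of the tagging idea in pure code: the Rose type, the ids 10 to 12 given to the inner nodes, and the helpers updateById and values are all made up for illustration.

```haskell
-- A small rose tree, defined locally to keep the sketch self-contained.
data Rose a = Rose a [Rose a]
  deriving (Eq, Show)

instance Functor Rose where
  fmap f (Rose x ts) = Rose (f x) (map (fmap f) ts)

-- Each value is paired with the integer that determines its identity.
type Tagged = (Int, String)

tree :: Rose Tagged
tree =
  Rose (10, "A")
    [ Rose (11, "B") [Rose (1, "foo") [], Rose (2, "bar") []]
    , Rose (12, "C") [Rose (2, "bar") [], Rose (3, "baz") []]
    ]

-- Replace the value of every node carrying the given identity.
updateById :: Int -> String -> Rose Tagged -> Rose Tagged
updateById i new = fmap step
  where
    step (j, v)
      | j == i    = (j, new)
      | otherwise = (j, v)

-- All values in pre-order, ignoring the identities.
values :: Rose Tagged -> [String]
values (Rose (_, v) ts) = v : concatMap values ts
```

Here values (updateById 2 "qux" tree) gives ["A","B","foo","qux","C","qux","baz"]: both occurrences with identity 2 change together, at the cost of walking the whole tree and keeping the tags consistent by hand.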
This is essentially what STRef and IORef do, except they hoist the actual value into the monad's state and hide the identities from you. The only downside of using these is you'll need to make much of your code monadic, but you're probably not going to get away from that easily whatever you do. (Modifying values rather than replacing them is an inherently effectful thing to do.)
The structure you gave was not specified in much detail so it's impossible to tailor an example to your use case, but here's a simple example using the ST monad and a Tree:
import Control.Monad
import Control.Monad.ST
import Data.Tree
import Data.Traversable (traverse)
import Data.STRef
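createInitialTree :: ST s (Tree (STRef s String))
createInitialTree = do
  -- lazy pattern (~): ST has no MonadFail instance, which a failable
  -- list pattern in do notation would require on recent GHC versions
  ~[a, b, c, d, e, f] <- mapM newSTRef ["A", "B", "C", "D", "E", "F"]
  return $ Node a [ Node b [Node d [], Node e []]
                  , Node c [Node e [], Node f []]
                  ]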
dereferenceTree :: Tree (STRef s a) -> ST s (Tree a)
dereferenceTree = traverse readSTRef
test :: ST s (Tree String, Tree String)
test = do
  tree <- createInitialTree
  before <- dereferenceTree tree
  let leftE = subForest (subForest tree !! 0) !! 1
  writeSTRef (rootLabel leftE) "new"  -- look ma, single update!
  after <- dereferenceTree tree
  return (before, after)
main = do
  let (before, after) = runST test
  putStrLn $ drawTree before
  putStrLn $ drawTree after
Observe that although we only explicitly modified the value of the left E value, it changed on the right side, too, as desired.
I should note that these are not the only ways. There are probably many other solutions to this same problem, but they all require you to define identity sensibly. Only once that has been done can one begin the next step.
Karva notation is used in Gene Expression Programming to represent mathematical expressions.
See here http://www.gene-expression-programming.com/Tutorial002.asp
You create an expression tree by reading off the gene and filling in nodes from left to right, top to bottom.
So for example using the operators ( +, * ) and terminals (1,2,3,4,5,6) in "+*+1+2*3456" would evaluate to 39.
How would I do this in Haskell using attoparsec (or parsec)?
karvaParser :: Parser Int
karvaParser = ????????????
Prelude> parse karvaParser "+*+1+2*3456"
Done 39
(I've proved this is a linear time algorithm in this answer to the question mentioned in the comments. There's a lengthier more hand-rolled solution in a previous revision of this answer.)
Gene Expression Programming: Karva notation.
There's probably a neat solution using the continuation passing monad, Cont, but I haven't thought of it. Here's a fairly clean pure functional solution to the problem. I'll take the opportunity to name drop some good general recursion schemes along the way.
Plan:
Split the input into lists, one for each layer, using the total arity of the previous line. This is an anamorphism, i.e. grows a list from a seed ([]) and can be written using unfoldr :: (b -> Maybe (a, b)) -> b -> [a] or equivalently, unfoldr' :: (b -> (a, b)) -> (b -> Bool) -> b -> [a]
input: "Q/a*+b-cbabaccbac"
arities: 12022020000000000
output: ["Q","/","a*","+b","-c","ba"]
Recursively use splitAt to glue the children under the parent. This is a catamorphism, i.e. collapses a list down to a single (tree) value, and can be written using foldr :: (a -> b -> b) -> b -> [a] -> b
Combine the anamorphism and the catamorphism into one. That's called a hylomorphism.
These terms were introduced to the FP community in the seminal paper Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire.
Code
In case you're not familiar with it, Data.Tree supplies data Tree a = Node {rootLabel :: a, subForest :: Forest a} where type Forest a = [Tree a].
import Data.Tree
import Data.Tree.Pretty -- from the pretty-tree package
arity :: Char -> Int
arity c
  | c `elem` "+*-/" = 2
  | c `elem` "Q"    = 1
  | otherwise       = 0

hylomorphism :: b -> (a -> b -> b) -> (c -> (a, c)) -> (c -> Bool) -> c -> b
hylomorphism base combine pullout stop seed = hylo seed where
  hylo s | stop s    = base
         | otherwise = combine new (hylo s')
    where (new,s') = pullout s
To pull out a level, we use the total arity from the previous level to find where to split off this new level, and pass on the total arity for this one ready for next time:
pullLevel :: (Int,String) -> (String,(Int,String))
pullLevel (n,cs) = (level,(total, cs')) where
  (level, cs') = splitAt n cs
  total = sum $ map arity level
To combine a level (as a String) with the level below (that's already a Forest), we just pull off the number of trees that each character needs.
combineLevel :: String -> Forest Char -> Forest Char
combineLevel "" [] = []
combineLevel (c:cs) levelBelow = Node c subforest : combineLevel cs theRest
  where (subforest,theRest) = splitAt (arity c) levelBelow
Now we can parse the Karva using a hylomorphism. Note that we seed it with a total arity from outside the string of 1, since there's only one node at the root level. I've used the head function because that 1 causes the top level to be a list containing one tree.
karvaToTree :: String -> Tree Char
karvaToTree cs = let
    zero (n,_) = n == 0
  in head $ hylomorphism [] combineLevel pullLevel zero (1,cs)
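For comparison, the two steps of the plan can also be run as separate passes, an unfoldr followed by a foldr, instead of being fused into one hylomorphism. This is only a sketch: levels, attach and karvaToTreeMaybe are names introduced here, and returning Maybe instead of calling head is a deliberate change so that an empty gene does not crash.

```haskell
import Data.List (unfoldr)
import Data.Maybe (listToMaybe)
import Data.Tree (Forest, Tree (..))

arity :: Char -> Int
arity c
  | c `elem` "+*-/" = 2
  | c == 'Q'        = 1
  | otherwise       = 0

-- Anamorphism: unfold the gene into one string per tree level.
-- The seed pairs the number of nodes expected on the next level
-- with the unread part of the gene.
levels :: String -> [String]
levels cs = unfoldr step (1, cs)
  where
    step (0, _)    = Nothing
    step (n, rest) = Just (level, (sum (map arity level), rest'))
      where (level, rest') = splitAt n rest

-- Give each node of a level the subtrees it needs from the level below.
attach :: String -> Forest Char -> Forest Char
attach []       _     = []
attach (c : cs) below = Node c kids : attach cs rest
  where (kids, rest) = splitAt (arity c) below

-- Catamorphism over the levels, from the bottom level upwards.
karvaToTreeMaybe :: String -> Maybe (Tree Char)
karvaToTreeMaybe = listToMaybe . foldr attach [] . levels
```

For the gene "Q/a*+b-cbabaccbac", levels returns ["Q","/","a*","+b","-c","ba"], matching the output shown in the plan. The fused hylomorphism avoids building that intermediate list explicitly.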
Demo
Let's have a draw of the results (because Tree is so full of syntax it's hard to read the output!). You have to cabal install pretty-tree to get Data.Tree.Pretty.
see :: Tree Char -> IO ()
see = putStrLn.drawVerticalTree.fmap (:"")
ghci> arity '+'
2
ghci> pullLevel (3,"+a*bc/acb")
("+a*",(4,"bc/acb"))
ghci> combineLevel "a*" [Node 'b' [],Node 'c' []]
[Node {rootLabel = 'a', subForest = []},Node {rootLabel = '*', subForest = [Node {rootLabel = 'b', subForest = []},Node {rootLabel = 'c', subForest = []}]}]
ghci> see . Node '.' $ combineLevel "a*" [Node 'b' [],Node 'c' []]
   .
   |
  ---
 /   \
 a   *
     |
     --
    /  \
    b  c
ghci> karvaToTree "Q/a*+b-cbabaccbac"
Node {rootLabel = 'Q', subForest = [Node {rootLabel = '/', subForest = [Node {rootLabel = 'a', subForest = []},Node {rootLabel = '*', subForest = [Node {rootLabel = '+', subForest = [Node {rootLabel = '-', subForest = [Node {rootLabel = 'b', subForest = []},Node {rootLabel = 'a', subForest = []}]},Node {rootLabel = 'c', subForest = []}]},Node {rootLabel = 'b', subForest = []}]}]}]}
Which matches the expected expression tree, as we see when we see it:
ghci> see $ karvaToTree "Q/a*+b-cbabaccbac"
      Q
      |
      /
      |
 ------
/      \
a      *
       |
       -----
      /     \
      +     b
      |
     ----
    /    \
    -    c
    |
    --
   /  \
   b  a
Eval
Once you have a Tree, it's easy to convert it to other things. Let's evaluate an expression in Karva notation:
action :: (Read num,Floating num) => Char -> [num] -> num
action c = case c of
  'Q' -> sqrt.head
  '+' -> sum
  '*' -> product
  '-' -> \[a,b] -> a - b
  '/' -> \[a,b] -> a / b
  v   -> const (read (v:""))

eval :: (Read num,Floating num) => Tree Char -> num
eval (Node c subforest) = action c (map eval subforest)
ghci> see $ karvaToTree "Q+-*826/12"
      Q
      |
      +
      |
   -------
  /       \
  -       *
  |       |
  --     ---
 /  \   /   \
 8  2   6   /
            |
            --
           /  \
           1  2
ghci> eval $ karvaToTree "Q+-*826/12"
3.0
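The action function above is partial: it relies on head, on two-element list patterns, and on read. A possible total variant, sketched here under the assumption that terminals are single decimal digits (the names evalSafe and questionExample are introduced for illustration), returns Nothing for malformed trees:

```haskell
import Data.Char (digitToInt, isDigit)
import Data.Tree (Tree (..))

-- Evaluate an expression tree, failing with Nothing when a node has
-- the wrong number of children or an unknown symbol.
evalSafe :: Tree Char -> Maybe Double
evalSafe (Node c kids) = do
  args <- traverse evalSafe kids
  case (c, args) of
    ('Q', [a])    -> Just (sqrt a)
    ('+', [a, b]) -> Just (a + b)
    ('*', [a, b]) -> Just (a * b)
    ('-', [a, b]) -> Just (a - b)
    ('/', [a, b]) -> Just (a / b)
    (d, []) | isDigit d -> Just (fromIntegral (digitToInt d))
    _             -> Nothing

-- The tree for the gene "+*+1+2*3456" from the question, built by hand.
questionExample :: Tree Char
questionExample =
  Node '+'
    [ Node '*' [Node '1' [], Node '+' [Node '3' [], Node '4' []]]
    , Node '+' [Node '2' [], Node '*' [Node '5' [], Node '6' []]]
    ]
```

evalSafe questionExample gives Just 39.0, the value expected in the question, while a malformed tree such as Node '+' [Node '1' []] gives Nothing.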
I'm trying to write a function that searches for a given element in a rose tree and returns its location.
It may be clearer when I show you what I already got:
Given a tree with a definition:
data Tree value = Node value
                       [Tree value]
for example:
test = Node "1" [
    Node "11" [
      Node "111" [],
      Node "112" [
        Node "1121" [], Node "1122" [], Node "1123" []
      ]
    ],
    Node "12" []
  ]
          1
     11       12
 111    112
    1121 1122 1123
I'm looking for a function search:
search :: String -> Tree String -> [Integer]
search "1123" test -> should return [1,2,3]
- 1st subtree of 1 = 11 -> 2nd subtree of 11 = 112 -> 3rd subtree of 112 = 1123
I know how to iterate through tree,
display (Node v xs) = v ++ concatMap display xs
But I have no idea how to assign an integer to every element of the subtree list and pass it recursively from the upper to the lower parts of the tree.
Can you direct me where/how to look for a solution? I'm very new to Haskell.
The easiest way is to let the function return the list of all paths to a node with the desired data (there should only ever be at most one in the tree, I suppose, but that doesn't matter) first, and then use the first of these:
searchList :: (Eq a) => a -> Tree a -> [[Integer]]
searchList val (Node dat subs)
  | val == dat = [[]] -- empty path
  | otherwise = concat [map (c:) (searchList val t) | (c,t) <- zip [1 .. ] subs]

search :: Eq a => a -> Tree a -> [Integer]
search val t = case searchList val t of
  (p:_) -> p
  _ -> error "Value not found"
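If the call to error is unwelcome, the same list of paths can be turned into a Maybe with listToMaybe. A self-contained sketch using the tree from the question follows; paths and findPath are names introduced here, not part of the answer's code.

```haskell
import Data.Maybe (listToMaybe)

data Tree a = Node a [Tree a]

-- All paths (1-based child indices) to nodes holding the value.
paths :: Eq a => a -> Tree a -> [[Integer]]
paths val (Node dat subs)
  | val == dat = [[]]
  | otherwise  = concat [map (c :) (paths val t) | (c, t) <- zip [1 ..] subs]

-- The first path, if there is one.
findPath :: Eq a => a -> Tree a -> Maybe [Integer]
findPath val = listToMaybe . paths val

-- The example tree from the question.
test :: Tree String
test =
  Node "1"
    [ Node "11"
        [ Node "111" []
        , Node "112" [Node "1121" [], Node "1122" [], Node "1123" []]
        ]
    , Node "12" []
    ]
```

findPath "1123" test gives Just [1,2,3], and searching for a value that is not in the tree gives Nothing instead of an exception.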
If Daniel Wagner's suspicion is correct and your trees are tries, you can search more efficiently, although the principle remains the same. Since we then know that there is at most one node with the desired data, the result is more appropriately a Maybe [Integer]:
import Data.List (isPrefixOf)
import Control.Monad -- for the MonadPlus instance of Maybe

searchTrie :: String -> Tree String -> Maybe [Integer]
searchTrie target (Node val subs)
  | val == target = Just []
  | val `isPrefixOf` target = case dropWhile smaller (zip [1 .. ] subs) of
      ((c,t):_) -> fmap (c:) $ searchTrie target t
      _ -> Nothing
  | otherwise = Nothing
  where
    smaller (_,Node v _) = v < take (length v) target