Parsing Karva notation in Haskell

Karva notation is used in Gene Expression Programming to represent mathematical expressions.
See here: http://www.gene-expression-programming.com/Tutorial002.asp
You create an expression tree by reading off the gene and filling in nodes from left to right, top to bottom.
For example, with the operators (+, *) and terminals (1, 2, 3, 4, 5, 6), the gene "+*+1+2*3456" would evaluate to 39.
How would I do this in Haskell using attoparsec (or parsec)?
karvaParser :: Parser Int
karvaParser = ????????????
Prelude> parse karvaParser "+*+1+2*3456"
Done 39

(I've proved this is a linear time algorithm in this answer to the question mentioned in the comments. There's a lengthier more hand-rolled solution in a previous revision of this answer.)
Gene Expression Programming: Karva notation.
There's probably a neat solution using the continuation passing monad, Cont, but I haven't thought of it. Here's a fairly clean pure functional solution to the problem. I'll take the opportunity to name drop some good general recursion schemes along the way.
Plan:
split the input into lists, one for each layer, using the total arity of the previous line. This is an anamorphism, i.e. grows a list from a seed ([]) and can be written using unfoldr :: (b -> Maybe (a, b)) -> b -> [a] or equivalently, unfoldr' :: (b -> (a, b)) -> (b -> Bool)-> b -> [a]
input: "Q/a*+b-cbabaccbac"
arities: 12022020000000000
output: ["Q","/","a*","+b","-c","ba"]
Recursively use splitAt to glue the children under the parent. This is a catamorphism, i.e. collapses a list down to a single (tree) value, and can be written using foldr :: (a -> b -> b) -> b -> [a] -> b
Combine the anamorphism and the catamorphism into one. That's called a hylomorphism.
These terms were introduced to the FP community in the seminal paper Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire.
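As a warm-up, here is a minimal sketch of just the first step of the plan (the anamorphism), written with unfoldr. The helper name levels is my own, and the arity table assumes the same operator set as the code further down (binary + * - /, unary Q, everything else a terminal).

```haskell
import Data.List (unfoldr)

-- Assumed arity table: binary operators, unary Q, everything else is a terminal.
arity :: Char -> Int
arity c
  | c `elem` "+*-/" = 2
  | c == 'Q'        = 1
  | otherwise       = 0

-- Split a gene into levels (hypothetical helper name).
-- The seed pairs the number of nodes expected on the next level
-- with the unread part of the gene; the root level has one node.
levels :: String -> [String]
levels cs = unfoldr step (1, cs)
  where
    step (0, _)    = Nothing            -- nothing left to fill: stop
    step (n, rest) = Just (level, (sum (map arity level), rest'))
      where (level, rest') = splitAt n rest
```

In ghci, levels "Q/a*+b-cbabaccbac" gives ["Q","/","a*","+b","-c","ba"], the same layers as in the plan; the unused tail of the gene is simply never read.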
Code
In case you're not familiar with it, Data.Tree supplies data Tree a = Node {rootLabel :: a, subForest :: Forest a} where type Forest a = [Tree a].
import Data.Tree
import Data.Tree.Pretty -- from the pretty-tree package
arity :: Char -> Int
arity c
  | c `elem` "+*-/" = 2
  | c `elem` "Q" = 1
  | otherwise = 0

hylomorphism :: b -> (a -> b -> b) -> (c -> (a, c)) -> (c -> Bool) -> c -> b
hylomorphism base combine pullout stop seed = hylo seed where
  hylo s | stop s = base
         | otherwise = combine new (hylo s')
    where (new,s') = pullout s
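If the recursion scheme looks abstract, it may help to see it on a smaller problem first. The sketch below repeats the hylomorphism definition so that it stands alone, and uses it for factorial (my own illustration, not part of the Karva parser): the unfolding half counts down from n, and the folding half multiplies the numbers back together.

```haskell
hylomorphism :: b -> (a -> b -> b) -> (c -> (a, c)) -> (c -> Bool) -> c -> b
hylomorphism base combine pullout stop = hylo
  where
    hylo s
      | stop s    = base
      | otherwise = combine new (hylo s')
      where (new, s') = pullout s

-- Illustrative example only: unfold n into n, n-1, ..., 1 and fold with (*).
factorial :: Integer -> Integer
factorial = hylomorphism 1 (*) (\n -> (n, n - 1)) (<= 0)
```

The intermediate list [n, n-1 .. 1] is never built as a data structure; each element is combined as soon as it is pulled out.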
To pull out a level, we use the total arity from the previous level to find where to split off this new level, and pass on the total arity for this one ready for next time:
pullLevel :: (Int,String) -> (String,(Int,String))
pullLevel (n,cs) = (level,(total, cs')) where
  (level, cs') = splitAt n cs
  total = sum $ map arity level
To combine a level (as a String) with the level below (that's already a Forest), we just pull off the number of trees that each character needs.
combineLevel :: String -> Forest Char -> Forest Char
combineLevel "" [] = []
combineLevel (c:cs) levelBelow = Node c subforest : combineLevel cs theRest
  where (subforest,theRest) = splitAt (arity c) levelBelow
Now we can parse the Karva using a hylomorphism. Note that we seed it with a total arity from outside the string of 1, since there's only one node at the root level. I've used the head function because that 1 causes the top level to be a list containing one tree.
karvaToTree :: String -> Tree Char
karvaToTree cs = let
    zero (n,_) = n == 0
  in head $ hylomorphism [] combineLevel pullLevel zero (1,cs)
Demo
Let's have a draw of the results (because Tree is so full of syntax it's hard to read the output!). You have to cabal install pretty-tree to get Data.Tree.Pretty.
see :: Tree Char -> IO ()
see = putStrLn.drawVerticalTree.fmap (:"")
ghci> arity '+'
2
ghci> pullLevel (3,"+a*bc/acb")
("+a*",(4,"bc/acb"))
ghci> combineLevel "a*" [Node 'b' [],Node 'c' []]
[Node {rootLabel = 'a', subForest = []},Node {rootLabel = '*', subForest = [Node {rootLabel = 'b', subForest = []},Node {rootLabel = 'c', subForest = []}]}]
ghci> see . Node '.' $ combineLevel "a*" [Node 'b' [],Node 'c' []]
  .
  |
 ---
/   \
a   *
    |
    --
   /  \
   b  c
ghci> karvaToTree "Q/a*+b-cbabaccbac"
Node {rootLabel = 'Q', subForest = [Node {rootLabel = '/', subForest = [Node {rootLabel = 'a', subForest = []},Node {rootLabel = '*', subForest = [Node {rootLabel = '+', subForest = [Node {rootLabel = '-', subForest = [Node {rootLabel = 'b', subForest = []},Node {rootLabel = 'a', subForest = []}]},Node {rootLabel = 'c', subForest = []}]},Node {rootLabel = 'b', subForest = []}]}]}]}
Which matches the expected tree, as we see when we see it:
ghci> see $ karvaToTree "Q/a*+b-cbabaccbac"
       Q
       |
       /
       |
    ------
   /      \
   a      *
          |
        -----
       /     \
       +     b
       |
      ----
     /    \
     -    c
     |
     --
    /  \
    b  a
Eval
Once you have a Tree, it's easy to convert it to other things. Let's evaluate an expression in Karva notation:
action :: (Read num,Floating num) => Char -> [num] -> num
action c = case c of
  'Q' -> sqrt.head
  '+' -> sum
  '*' -> product
  '-' -> \[a,b] -> a - b
  '/' -> \[a,b] -> a / b
  v -> const (read (v:""))

eval :: (Read num,Floating num) => Tree Char -> num
eval (Node c subforest) = action c (map eval subforest)
ghci> see $ karvaToTree "Q+-*826/12"
      Q
      |
      +
      |
   -------
  /       \
  -       *
  |       |
  --     ---
 /  \   /   \
 8  2   6   /
            |
            --
           /  \
           1  2
ghci> eval $ karvaToTree "Q+-*826/12"
3.0
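To tie this back to the gene in the question, here is a compact, self-contained sketch that goes from "+*+1+2*3456" to an Int. It assumes only the operators + and * and single-digit terminals, inlines the level splitting instead of using the generic hylomorphism, and the name evalInt is my own.

```haskell
import Data.Char (digitToInt)
import Data.Tree (Forest, Tree (..))

-- Assumption: only the two binary operators from the question.
arity :: Char -> Int
arity c = if c `elem` "+*" then 2 else 0

-- Pull levels off the gene top-down, then attach children bottom-up.
-- Partial: head fails on an empty gene.
karvaToTree :: String -> Tree Char
karvaToTree = head . go 1
  where
    go :: Int -> String -> Forest Char
    go 0 _  = []
    go n cs = attach level (go (sum (map arity level)) rest)
      where (level, rest) = splitAt n cs

    attach :: String -> Forest Char -> Forest Char
    attach []       _     = []
    attach (c : cs) below = Node c kids : attach cs below'
      where (kids, below') = splitAt (arity c) below

-- Evaluate over Int; anything that is not an operator is read as a digit
-- (digitToInt fails on non-digit characters).
evalInt :: Tree Char -> Int
evalInt (Node '+' kids) = sum (map evalInt kids)
evalInt (Node '*' kids) = product (map evalInt kids)
evalInt (Node d _)      = digitToInt d
```

With these definitions, evalInt (karvaToTree "+*+1+2*3456") is (1 * (3 + 4)) + (2 + (5 * 6)) = 39, the value asked for in the question.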


Traverse a rose tree until some condition is met, then modify tree

I have two rose trees in Haskell of m and n nodes, respectively. I want to replace the ith node of the first tree with the jth node of the second tree.
e.g.
tree 1:
              R
     _________|____________
     |        |           |
     A        B           C
    / \      / \         / \
   D   E    F   G       H   I
tree 2:
          r
     _____|____
     |        |
     P        Q
    / \     / | \
   S   T   U  V  W
then the resulting tree of tree 1 node 7 (C) replaced with tree 2 node 4 (Q) should be
(assuming indexing is pre order and starting at 0)
              R
     _________|____________
     |        |           |
     A        B           Q
    / \      / \        / | \
   D   E    F   G      U  V  W
I have tried using a zipper, but my problem is that I can't work out how to get the zipper focus to the ith element,
i.e. how can I implement a function with type:
someFunc :: Tree a -> Int -> Zipper a
that takes the root of the tree and traverses it (in some order) to return the zipper focussed on the ith pre-order node and the context.
From the tree library I can flatten a tree to a pre-order list of values using flatten; I can change this slightly to give me a list of trees, e.g.
flattenToTreeList :: Tree a -> [Tree a]
flattenToTreeList t = squish t []
  where squish (Node x ts) xs = Node x ts : foldr squish xs ts
If I could do this but with a zipper then the ith element of this list would satisfy me, but I'm lost and now going round in circles.
Well, I was a bit off in my comment. I misread the requirements a bit, but in a way that actually cuts the dependencies down a bit.
-- lens
import Control.Lens ((^?), contexts, cosmos, elementOf, ix)
-- containers
import Data.Tree (Tree(..))
-- adjunctions
import Control.Comonad.Representable.Store (peeks)
tree1 :: Tree Char
tree1 = Node 'R' [ Node 'A' [Node 'D' [], Node 'E' []]
                 , Node 'B' [Node 'F' [], Node 'G' []]
                 , Node 'C' [Node 'H' [], Node 'I' []]
                 ]

tree2 :: Tree Char
tree2 = Node 'R' [ Node 'P' [Node 'S' [], Node 'T' []]
                 , Node 'Q' [Node 'U' [], Node 'V' [], Node 'W' []]
                 ]

-- Replace subtree i in t1 with subtree j in t2
-- returns Nothing if either index doesn't exist
replace :: Int -> Tree a -> Int -> Tree a -> Maybe (Tree a)
replace i t1 j t2 = update =<< replacement
  where
    update u = peeks (const u) <$> contexts t1 ^? ix i
    replacement = t2 ^? elementOf cosmos j

main :: IO ()
main = print $ replace 7 tree1 4 tree2
This adds a dependency on the adjunctions package, but it's a dependency of lens. So it's an extra import, but no additional required packages. In exchange, it doesn't need to use tree-traversals at all.
This is a bit unlike usual lens code, in that neither cosmos nor contexts are especially common, but they're great tools for manipulating substructures of self-similar data types. And that's a perfect description of replacing subtrees.
This uses pretty conceptually heavy tools, but I think the meaning comes across pretty well.
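If the lens machinery feels too heavy, the same pre-order replacement can be sketched with plain recursion over Data.Tree. This is an alternative of my own, not a translation of the lens code: the names subtrees, replaceAt and replaceSubtree are hypothetical, and the index is threaded through an Either so that a subtree can report how many positions are still left to skip.

```haskell
import Data.Maybe (listToMaybe)
import Data.Tree (Tree (..))

-- Every subtree in pre-order (index 0 is the whole tree).
subtrees :: Tree a -> [Tree a]
subtrees t@(Node _ ts) = t : concatMap subtrees ts

-- Put `new` at pre-order position i, if that position exists.
replaceAt :: Int -> Tree a -> Tree a -> Maybe (Tree a)
replaceAt i new = either (const Nothing) Just . go i
  where
    -- Right t: the replacement happened inside this subtree.
    -- Left k : not found here; k positions remain after this subtree.
    go 0 _           = Right new
    go k (Node x ts) = Node x <$> goForest (k - 1) ts
    goForest k []       = Left k
    goForest k (c : cs) = case go k c of
      Right c' -> Right (c' : cs)
      Left k'  -> (c :) <$> goForest k' cs

-- Replace node i of t1 with node j of t2 (both pre-order, from 0).
replaceSubtree :: Int -> Tree a -> Int -> Tree a -> Maybe (Tree a)
replaceSubtree i t1 j t2
  | j < 0     = Nothing
  | otherwise = do
      new <- listToMaybe (drop j (subtrees t2))
      replaceAt i new t1

tree1, tree2 :: Tree Char
tree1 = Node 'R' [ Node 'A' [Node 'D' [], Node 'E' []]
                 , Node 'B' [Node 'F' [], Node 'G' []]
                 , Node 'C' [Node 'H' [], Node 'I' []] ]
tree2 = Node 'R' [ Node 'P' [Node 'S' [], Node 'T' []]
                 , Node 'Q' [Node 'U' [], Node 'V' [], Node 'W' []] ]
```

Here replaceSubtree 7 tree1 4 tree2 swaps C for Q, and an out-of-range index on either side gives Nothing.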

Swapping 2 characters in list of strings (Haskell)

I need to swap blank space with letter from "moves" and each time I swap it I need to continue with another one from moves. I get Couldn't match expected type, even though I just want to return value x when it doesn't meet condition.
Error message:
[1 of 1] Compiling Main ( puzzlesh.hs, interpreted )
puzzlesh.hs:19:43: error:
• Couldn't match expected type ‘Int -> a’ with actual type ‘Char’
• In the expression: x
In the expression: if x == ' ' then repl x else x
In an equation for ‘eval’: eval x = if x == ' ' then repl x else x
• Relevant bindings include
eval :: Char -> Int -> a (bound at puzzlesh.hs:19:5)
repl :: forall p. p -> Int -> a (bound at puzzlesh.hs:20:5)
moves :: [a] (bound at puzzlesh.hs:16:9)
p :: t [Char] -> [a] -> [Int -> a] (bound at puzzlesh.hs:16:1)
|
19 | eval x = if x == ' ' then repl x else x
| ^
Failed, no modules loaded.
Code:
import Data.Char ( intToDigit )

sample :: [String]
sample = ["AC DE",
          "FBHIJ",
          "KGLNO",
          "PQMRS",
          "UVWXT"]

moves = "CBGLMRST"

type Result = [String]

pp :: Result -> IO ()
pp x = putStr (concat (map (++"\n") x))
p input moves = [eval x | x <- (concat input)]
  where
    c = 1
    eval x = if x == ' ' then repl x else x
    repl x count = moves !! count
    count c = c + 1
I need to take character from moves, replace it onto blank space and do this till moves is []
Desired output:
ABCDE
FGHIJ
KLMNO
PQRST
UVWX
As with most problems, the key is to break it down into smaller problems. Your string that encodes character swaps: can we break that into pairs?
Yes, we just need to create a tuple from the first two elements in the list, and then add that to the result of calling pairs on the tail of the list.
pairs :: [a] -> [(a, a)]
pairs (x:tl@(y:_)) = (x, y) : pairs tl
pairs _ = []
If we try this with a string:
Prelude> pairs "CBGLMRST"
[('C','B'),('B','G'),('G','L'),('L','M'),('M','R'),('R','S'),('S','T')]
But you want a blank space swapped with the first character:
Prelude> pairs $ " " ++ "CBGLMRST"
[(' ','C'),('C','B'),('B','G'),('G','L'),('L','M'),('M','R'),('R','S'),('S','T')]
Now you have a lookup table with original characters and their replacements and the rest is straightforward. Just map a lookup on this table over each character in each string in the list.
Because you never touch any letter in the original strings more than once, you won't have to worry about double replacements.
Prelude> s = ["AC DE","FBHIJ","KGLNO","PQMRS","UVWXT"]
Prelude> r = "CBGLMRST"
Prelude> r' = " " ++ r
Prelude> p = pairs r'
Prelude> [[case lookup c p of {Just r -> r; _ -> c} | c <- s'] | s' <- s]
["ABCDE","FGHIJ","KLMNO","PQRST","UVWXT"]
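The GHCi session above can be wrapped into a function. This is only a sketch: applyMoves is a name I made up, and it assumes, like the session, that the blank is always the first character to be replaced.

```haskell
import Data.Maybe (fromMaybe)

-- Adjacent pairs of a list, as defined earlier in this answer.
pairs :: [a] -> [(a, a)]
pairs (x : tl@(y : _)) = (x, y) : pairs tl
pairs _ = []

-- Hypothetical wrapper: build the lookup table once, then map it over
-- every character of every row. Characters missing from the table are kept.
applyMoves :: String -> [String] -> [String]
applyMoves moves = map (map swap)
  where
    table  = pairs (' ' : moves)
    swap c = fromMaybe c (lookup c table)
```

For example, applyMoves "CBGLMRST" applied to the sample board gives the same five rows as the list comprehension above.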

Correct way to format Haskell functions considering scope?

I'm new to Haskell. I've put together a basic Caesar Cipher, it works, but it's very messy and difficult to read.
caesarCipher :: Int -> String -> String
caesarCipher n xs = [shift n x | x <- xs]
shift n c = num2let ((let2num c + n) `mod` 26)
alphabet = ['a'..'z']
let2num c = head[ b | (a,b) <- zip alphabet [0..length alphabet], a==c]
num2let = (!!) alphabet
What is the "correct" way in Haskell to format functions that consist of multiple variables and expressions, and should I be considering the scope of the variables? And other than efficiency based suggestions have I made any other "major" mistakes?
This is my attempt:
caesarCipher n xs = let
    shift n c = num2let ((let2num c + n) `mod` 26) where
      alphabet = ['a'..'z']
      let2num c = head[ b | (a,b) <- zip alphabet [0..length alphabet], a==c]
      num2let = (!!) alphabet
  in [shift n x | x <- xs]
I would first of all rewrite some functions. For example, zip alphabet [0 .. length alphabet] can be replaced with zip alphabet [0..], since zip stops as soon as one of the lists is exhausted. Using (!!) and head is often not good practice, since these functions are non-total: (!!) errors if the index is out of range, and head errors if the list is empty.
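To illustrate the point about totality, here is one possible sketch in which the lookup returns a Maybe instead of calling head. It is my own variation, not the rewrite this answer goes on to recommend: it uses elemIndex from Data.List and leaves any character outside 'a'..'z' unchanged, which is an assumption about the desired behaviour.

```haskell
import Data.List (elemIndex)

alphabet :: String
alphabet = ['a' .. 'z']

-- Total lookup: Nothing for characters that are not lowercase letters.
let2num :: Char -> Maybe Int
let2num c = elemIndex c alphabet

-- Shift a lowercase letter by n places; pass everything else through.
shift :: Int -> Char -> Char
shift n c = case let2num c of
  Just i  -> toEnum (fromEnum 'a' + (i + n) `mod` 26)
  Nothing -> c

caesarCipher :: Int -> String -> String
caesarCipher n = map (shift n)
```

With this version caesarCipher 3 "hello, world" gives "khoor, zruog", and a negative shift undoes it because mod with a positive divisor never returns a negative number.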
We can define helper functions, for example for num2let:
import Data.Char(chr, ord)
num2let :: Int -> Char
num2let n = chr (n + ord 'a')
Here num2let maps 0 to 'a', 1 to 'b', etc.
let2num can be done in a similar manner:
import Data.Char(ord)
let2num :: Char -> Int
let2num c = ord c - ord 'a'
So now we can define caesarCipher as:
caesarCipher :: Int -> String -> String
caesarCipher n = map (num2let . (`mod` 26) . (n+) . let2num)
So that would look in full as:
import Data.Char(chr, ord)
num2let :: Int -> Char
num2let n = chr (n + ord 'a')
let2num :: Char -> Int
let2num c = ord c - ord 'a'
caesarCipher :: Int -> String -> String
caesarCipher n = map (num2let . (`mod` 26) . (n+) . let2num)
The nice thing is that you can reuse let2num and num2let in other functions.
Normally top-level functions are separated by a blank line and given a type signature. This is not required, but it usually makes the code easier to read.

Building a suffix tree by inserting each suffix in Haskell

I am working with the following data type:
data SuffixTree = Leaf Int | Node [(String, SuffixTree)]
  deriving (Eq, Show)
Each subtree has a corresponding label (string).
The idea is to build the corresponding suffix tree by adding each suffix and its index into an accumulating tree (at the beginning it is Node []).
This is already defined
buildTree s
  = foldl (flip insert) (Node []) (zip (suffixes s) [0..length s-1])
where suffixes is correctly defined.
I've been trying to implement the insert function for a while but can't seem to succeed.
This is what I have now (the names and style are not the best since this is still work in progress):
insert :: (String, Int) -> SuffixTree -> SuffixTree
insert pair tree@(Node content)
  = insert' pair tree content
  where
    insert' :: (String, Int) -> SuffixTree -> [(String, SuffixTree)] -> SuffixTree
    insert' (s, n) (Node []) subtrees
      = Node ((s, Leaf n) : subtrees)
    insert' (s, n) (Node content@((a, tree) : pairs)) subtrees
      | null p = insert' (s, n) (Node pairs) subtrees
      | p == a = insert' (r, n) tree subtrees
      | p /= a = Node ((p, newNode) : (subtrees \\ [(a, tree)]))
      where
        (p, r, r') = partition s a
        newNode = Node [(r, (Leaf n)), (r', tree)]
The partition function takes two strings and returns a tuple consisting of:
The common prefix (if it exists)
The first string without the prefix
The second string without the prefix
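Neither partition nor suffixes is shown in the question, so the following is only a guess at definitions that match the descriptions given. Note that this partition is unrelated to Data.List.partition.

```haskell
import Data.List (tails)

-- Assumed definition: (common prefix, rest of first, rest of second).
partition :: Eq a => [a] -> [a] -> ([a], [a], [a])
partition (x : xs) (y : ys)
  | x == y = (x : p, r, r')
  where (p, r, r') = partition xs ys
partition xs ys = ([], xs, ys)   -- first elements differ, or one list ran out

-- Assumed definition: all non-empty suffixes, longest first.
suffixes :: [a] -> [[a]]
suffixes = init . tails
```

For instance, partition "ny" "n" gives ("n","y","") and partition "y" "ny" gives ("","y","ny").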
I think I understand the rules needed to build the tree.
We start by comparing the label of the first subtree to the string we want to insert (say, str). If they don't have a prefix in common, we try to insert in the next subtree.
If the label is a prefix of str, we continue to look into that subtree, but instead of using str we try to insert str without the prefix.
If str is a prefix of label, then we replace the existing subtree with a new Node, having a Leaf and the old subtree. We also adjust the labels.
If we don't have a match between str and any label then we add a new Leaf to the list of subtrees.
However, the biggest problem that I have is that I need to return a new tree containing the changes, so I have to keep track of everything else in the tree (not sure how to do this or if I'm thinking correctly about this).
The code appears to be working correctly on this string: "banana":
Node [("a",Node [("",Leaf 5),("na",Node [("",Leaf 3),("na",Leaf 1)])]),
("na",Node [("",Leaf 4),("na",Leaf 2)]),("banana",Leaf 0)]
However, on this string "mississippi" I get an Exception: Non-exhaustive patterns in function insert'.
Any help or ideas are greatly appreciated!
You are using a quadratic algorithm, whereas optimally a suffix tree can be constructed in linear time. That said, sticking with the same algorithm, a possibly better approach would be to first build the (uncompressed) suffix trie (not tree) and then compress the resulting trie.
The advantage would be that a suffix trie can be represented using Data.Map:
data SuffixTrie
  = Leaf' Int
  | Node' (Map (Maybe Char) SuffixTrie)
which makes manipulations both more efficient and easier than list of pairs. Doing so, you may also completely bypass common prefix calculations, as it comes out by itself:
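Before the compact version, here is a more pedestrian sketch of the same idea, in case it helps: insert one suffix at a time, one character per trie level, with no prefix comparison at all. The names Trie, insertSuffix and leafIndices are my own, and the sketch makes no attempt to be as efficient as the fold-based code.

```haskell
import Data.List (tails)
import qualified Data.Map.Strict as M

-- Same shape as SuffixTrie above: Nothing marks "a suffix ends here".
data Trie = Leaf' Int | Node' (M.Map (Maybe Char) Trie)
  deriving (Eq, Show)

-- Walk down one character at a time, creating empty nodes as needed.
insertSuffix :: (String, Int) -> Trie -> Trie
insertSuffix ([], i)     (Node' m) = Node' (M.insert Nothing (Leaf' i) m)
insertSuffix (c : cs, i) (Node' m) = Node' (M.insert (Just c) child' m)
  where
    child  = M.findWithDefault (Node' M.empty) (Just c) m
    child' = insertSuffix (cs, i) child
insertSuffix _ leaf = leaf   -- not reached: leaves only sit under Nothing

buildTrie :: String -> Trie
buildTrie s = foldr insertSuffix (Node' M.empty) (zip (init (tails s)) [0 ..])

-- All suffix indices stored in the trie (handy for sanity checks).
leafIndices :: Trie -> [Int]
leafIndices (Leaf' i) = [i]
leafIndices (Node' m) = concatMap leafIndices (M.elems m)
```

Shared prefixes fall out of the Map for free: two suffixes starting with the same character simply descend into the same child.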
import Data.List (tails)
import Data.Maybe (maybeToList)
import Control.Arrow (first, second)
import Data.Map.Strict (Map, empty, insert, insertWith, assocs)
data SuffixTree
  = Leaf Int
  | Node [(String, SuffixTree)]
  deriving Show

data SuffixTrie
  = Leaf' Int
  | Node' (Map (Maybe Char) SuffixTrie)

buildTrie :: String -> SuffixTrie
buildTrie s = foldl go (flip const) (init $ tails s) (length s) $ Node' empty
  where
  go run xs i (Node' ns) = run (i - 1) $ Node' tr
    where tr = foldr loop (insert Nothing $ Leaf' (i - 1)) xs ns
  loop x run = insertWith (+:) (Just x) . Node' $ run empty
    where _ +: Node' ns = Node' $ run ns

buildTree :: String -> SuffixTree
buildTree = loop . buildTrie
  where
  loop (Leaf' i) = Leaf i
  loop (Node' m) = Node $ con . second loop <$> assocs m
  con (Just x, Node [(xs, tr)]) = (x:xs, tr) -- compress single-child nodes
  con n = maybeToList `first` n
then:
> buildTree "banana"
Node [("a",Node [("",Leaf 5),
("na",Node [("",Leaf 3),
("na",Leaf 1)])]),
("banana",Leaf 0),
("na",Node [("",Leaf 4),
("na",Leaf 2)])]
similarly:
> buildTree "mississippi"
Node [("i",Node [("",Leaf 10),
("ppi",Leaf 7),
("ssi",Node [("ppi",Leaf 4),
("ssippi",Leaf 1)])]),
("mississippi",Leaf 0),
("p",Node [("i",Leaf 9),
("pi",Leaf 8)]),
("s",Node [("i",Node [("ppi",Leaf 6),
("ssippi",Leaf 3)]),
("si",Node [("ppi",Leaf 5),
("ssippi",Leaf 2)])])]
Here's how the problem is occurring.
Let's say you're processing buildTree "nanny". After you've inserted the suffixes "nanny", "anny", and "nny", your tree looks like t1 given by:
let t1 = Node t1_content
    t1_content = [("n",t2),("anny",Leaf 1)]
    t2 = Node [("ny",Leaf 2),("anny",Leaf 0)]
Next, you try to insert the suffix "ny":
insert ("ny", 3) t1
= insert' ("ny", 3) t1 t1_content
-- matches guard p == a with p="n", r="y", r'=""
= insert' ("y", 3) t2 t1_content
What you intend to do next is insert ("y", 3) into t2 to yield:
Node [("y", Leaf 3), ("ny",Leaf 2),("anny",Leaf 0)]
Instead, what happens is:
insert' ("y", 3) t2 t1_content
-- have s="y", a="ny", so p="", r="y", r'="ny"
-- which matches guard: null p
= insert' ("y", 3) (Node [("anny", Leaf 0)]) t1_content
-- have s="y", a="anny", so p="", r="y", r'="anny"
-- which matches guard: null p
= insert' ("y", 3) (Node []) t1_content
= Node [("y", Leaf 3), ("n",t2), ("anny",Leaf 1)]
and suffix "y" has been added to t1 instead of t2.
When you next try to insert suffix "y", the guard p==a case tries to insert ("",4) into Leaf 3 and you get a pattern error.
The reason it works on banana is that you only ever insert a new node at the top level of the tree, so "adding to t2" and "adding to t1" are the same thing.
I suspect you'll need to substantially rethink the structure of your recursion to get this working.
Looks like this code does the job, although there may still be improvements to make. I hope that it's general enough to work on any string. I also tried to avoid using ++, but it's still better than nothing.
getContent (Node listOfPairs)
  = listOfPairs

insert :: (String, Int) -> SuffixTree -> SuffixTree
insert (s, n) (Node [])
  = Node [(s, Leaf n)]
insert (s, n) (Node (pair@(a, tree) : pairs))
  | p == a = Node ((a, insert (r, n) tree) : pairs)
  | null p = Node (pair : (getContent (insert (r, n) (Node pairs))))
  | p /= a = Node ([(p, Node [(r, Leaf n), (r', tree)])] ++ pairs)
  where
    (p, r, r') = partition s a

Do recursion and counting numbers at the same time in Haskell

Suppose I have a known list A :: [Int] and want to get a new list B = newList A, with newList defined as follows:
newList :: [Int] -> [Int]
newList [] = []
newList (a:as) | a==0 = f(a) : newList (as)
               | a==1 = g(a) : newList (as)
               | otherwise = h(a) : newList (as)
where f, g, h :: Int -> Int are unimportant functions.
Besides B, I also want to know how many 0s and 1s there are in A.
But while producing B recursively, newList has already checked whether a is 0 or 1 for each element of A, so checking again separately is redundant.
Is it possible to get B and, at the same time, count the 0s and 1s in A while checking each element only once?
This is not an answer you are looking for, but there is a nice abstract structure behind your function, so I'll leave it here:
import Data.Monoid
import Data.Functor
import Data.Traversable
import Control.Arrow
import Control.Monad.Trans.Writer
wr :: Int -> Writer (Sum Int, Sum Int) Int
wr 0 = tell (Sum 1, Sum 0) $> f 0
wr 1 = tell (Sum 0, Sum 1) $> g 1
wr n = return $ h n
collect :: [Int] -> ([Int], (Int, Int))
collect = second (getSum *** getSum) . runWriter . traverse wr
Summing is a monoid, double summing is a monoid, the Writer monad handles monoids, traverse maps a list with an effectful function and performs all effects.
This:
f = (+ 1)
g = (+ 2)
h = (+ 3)
main = print $ collect [0, 1, 2, 3, 0, 0, 0, 4, 1]
prints ([1,3,5,6,1,1,1,7,3],(4,2)) — four zeros and two ones.
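If the Writer machinery is more than you need, the same single pass can be sketched as a plain right fold that carries the counts next to the output list. The name newListCounting is my own, and f, g and h are passed in as parameters only to keep the example self-contained.

```haskell
-- One pass: build the new list and count the 0s and 1s together.
-- Result is (new list, (number of zeros, number of ones)).
newListCounting
  :: (Int -> Int) -> (Int -> Int) -> (Int -> Int)
  -> [Int] -> ([Int], (Int, Int))
newListCounting f g h = foldr step ([], (0, 0))
  where
    step a (bs, (zeros, ones))
      | a == 0    = (f a : bs, (zeros + 1, ones))
      | a == 1    = (g a : bs, (zeros, ones + 1))
      | otherwise = (h a : bs, (zeros, ones))
```

Each element is compared against 0 and 1 exactly once, and newListCounting (+ 1) (+ 2) (+ 3) on the list above gives the same result as collect.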
