I have this language AST
data ExprF r = Const Int
| Var String
| Lambda String r
| EList [r]
| Apply r r
deriving ( Show, Eq, Ord, Functor, Foldable )
And I want to convert it to string
toString = cata $ \case
Const x -> show x
Var x -> x
EList x -> unwords x
Lambda x y -> unwords [x, "=>", y]
Apply x y -> unwords [x, "(", y, ")"]
But when lambda is used in Apply I need the parentheses
(x => x)(1)
but I cannot match inner structure with cata
toString :: Fix ExprF -> String
toString = cata $ \case
Const x -> show x
Var x -> x
Lambda x y -> unwords [x, "=>", y]
Apply (Lambda{}) y -> unwords ["(", x, ")", "(", y, ")"]
Apply x y -> unwords [x, "(", y, ")"]
Is there any better solution than para?
toString2 :: Fix ExprF -> String
toString2 = para $ \case
Const x -> show x
Var x -> x
Lambda x (_,y) -> unwords [x, "=>", y]
EList x -> unwords (snd <$> x)
Apply ((Fix Lambda {}),x) (_,y) -> unwords ["(", x, ")", "(", y, ")"]
Apply (_,x) (_,y) -> unwords [x, "(", y, ")"]
It looks uglier. Even it is needed only in one place I need to remove fst tuple parameters everywhere and I guess it will be slower.
As #chi, #DanielWagner and I pointed out in the comments, the way to do this sort of pretty-printing-with-parenthesisation in a structurally recursive manner is "the showsPrec approach".
The big idea is not to fold up the syntax tree into a String, but into a function Bool -> String. This gives us a degree of context-sensitivity in the fold: we'll use that extra Bool parameter to keep track of whether we're currently in the context of the left-hand side of an application.
parens x = "(" ++ x ++ ")"
ppAlg :: ExprF (Bool -> String) -> (Bool -> String)
ppAlg (Const x) isBeingApplied = show x
ppAlg (Var x) isBeingApplied = x
ppAlg (Lambda name body) isBeingApplied = p ("\\" ++ name ++ " -> " ++ body False)
where p = if isBeingApplied then parens else id
ppAlg (EList es) isBeingApplied = unwords (sequenceA es False)
ppAlg (Apply fun arg) isBeingApplied = fun True ++ " " ++ arg False
We pass values of isBeingApplied down the recursive calls depending on where we are in the syntax tree right now. Note that the only place we're passing down True is as an argument to fun in the body of the Apply case. Then, in the Lambda case, we inspect that argument. If the current term is the left-hand part of an application we parenthesise the lambda; if not we don't.
At the top level, having folded up the whole tree into a function Bool -> String, we pass it an argument of False - we're not currently in the context of an application - to get a String out.
pp :: Expr -> String
pp ex = cata ppAlg ex False
ghci> pp $ app (lam "x" (var "x")) (cnst 2)
"(\\x -> x) 2"
By replacing the Bool with an Int, this approach can be generalised to parenthesising operators with arbitrary precedences, as covered in #DanielWagner's linked answer.
One solution is to use the {-# LANGUAGE PatternSynonyms #-} extension and define unidirectional patterns like:
pattern Apply' r1 r2 <- Apply (_,r1) (_,r2)
that you could then use in your definitions like this:
toString2 :: Fix ExprF -> String
toString2 = para $ \case
Const x -> show x
Var x -> x
Lambda x (_,y) -> unwords [x, "=>", y]
EList x -> unwords (snd <$> x)
Apply ((Fix Lambda {}),x) (_,y) -> unwords ["(", x, ")", "(", y, ")"]
Apply' x y -> unwords [x, "(", y, ")"]
Since ExprF is a Functor, another option would be simply to write:
toString2' :: Fix ExprF -> String
toString2' = para $ \case
Apply ((Fix Lambda {}),x) (_,y) -> unwords ["(", x, ")", "(", y, ")"]
other -> case fmap snd other of
Const x -> show x
Var x -> x
Lambda x y -> unwords [x, "=>", y]
Apply x y -> unwords [x, "(", y, ")"]
With the pattern synonym, and compiling with -Wall, I'm having trouble convincing the exhaustivity checker that the pattern matches are exhaustive.
How about straight recursion for the missing case :
toString :: Fix ExprF -> String
toString (Fix (Apply (Fix (Lambda _ x)) y)) = "(" ++ toString x ++ ")(" ++ toString y ++ ")"
toString z = (cata $ \case
Const x -> show x
Var x -> x
EList x -> unwords x
Lambda x y -> unwords [x, "=>", y]
Apply x y -> unwords [x, "(", y, ")"]) z
Related
I need to join two IO Strings with a - in between. Here's what I came up with, which works - what's the right way?
import System.Environment
f :: String -> String -> IO String
f x y = (foldl1 (++)) <$> sequence [(getEnv x),(return "-"),(getEnv y)]
You could here use an applicative style function:
f :: String -> String -> IO String
f x y = withHyp <$> getEnv x <*> getEnv y
where withHyp ex ey = ex ++ '-' : ey
So here we join the two Strings that are then joined with a hypen in the middle through the withHyp function.
Or for a list of environment variables that we need to fetch, we can use mapM and perform an intercalate:
import Data.List(intercalate)
f :: [String] -> IO String
f xs = intercalate "-" <$> mapM getEnv xs
I'll be honest, the idea behind your approach actually looks pretty sane to me. To start with, I'd probably use concat intsead of foldl1 (++), and drop some parens, getting us to:
f x y = concat <$> sequence [getEnv x, return "-", getEnv y]
This really doesn't seem that bad to me. But if I really wanted to push farther, here's some thoughts I would have. First, I'd recall the intercalate function.
f x y = intercalate "-" <$> sequence [getEnv x, getEnv y]
There's a handy shorthand for applying a function to each element of a list, too; mapM f = sequence . map f. So:
f x y = intercalate "-" <$> mapM getEnv [x,y]
I would stop there; it looks quite clean and maintainable to me.
One way of joining two IO Strings would be:
dash :: IO String -> IO String -> IO String
dash x y = do
s1 <- x
s2 <- y
return $ s1 <> "-" <> s2
We "unbox" each of x and y to get the contained Strings, then "rebox` them with a hyphen (using the analogy for Functors).
It can be shortened to:
dash = liftA2 (\s1 s2 -> s1 <> "-" <> s2)
Where liftA2 :: Applicative f => (a -> b -> c) -> f a -> f b -> f c takes a binary function and "lifts" it into a binary function on Applicatives, which are a superset of Monads.
Your f can then be implemented as f x y = dash (getEnv x) (getEnv y).
I have a data type and function like this:
data Expr = Num Int | Add Expr Expr | Mult Expr Expr | Neg Expr | If Expr Expr Expr deriving (Show, Read)
prettyPrint :: Expr -> IO ()
prettyPrint expr = prettyPrint' expr 0
prettyPrint' :: Expr -> Int -> IO ()
prettyPrint' (Num x) i = putStrLn $ concat (replicate i " ") ++ "Num " ++ show x
prettyPrint' (Add x y) i = do
putStrLn $ concat (replicate i " ") ++ "Add"
prettyPrint' x (i+1)
prettyPrint' y (i+1)
prettyPrint' (Mult x y) i = do
putStrLn $ concat (replicate i " ") ++ "Mult"
prettyPrint' x (i+1)
prettyPrint' y (i+1)
prettyPrint' (Neg x) i = do
putStrLn $ concat (replicate i " ") ++ "Neg"
prettyPrint' x (i+1)
prettyPrint' (If x y z) i = do
putStrLn $ concat (replicate i " ") ++ "If"
prettyPrint' x (i+1)
prettyPrint' y (i+1)
prettyPrint' z (i+1)
In the function I am using pattern matching. The problem is that their is a lot of reuse of code. For example, the case for Mult and Add is basically the same code. Same goes for Num and Neg. Is there a way to write this based on how many variables the expression have? Like one for Num and Neg, since they have only one variable. One case for Mult and Add, since they have two variables. And a last case for If, since that expression have three variables.
NOTE:
I landed on this answer, I think it's a better solution than I started with:
prettyPrint :: Expr -> IO ()
prettyPrint expr = putStrLn (prettyPrint' 1 expr)
prettyPrint' :: Int -> Expr -> String
prettyPrint' i (Num x) = "Num " ++ show x
prettyPrint' i expr =
let indent x = concat (replicate i " ") ++ x
(op, args) = case expr of
Add x y -> ("Add", [x,y])
Mult x y -> ("Mult", [x,y])
Neg x -> ("Neg", [x])
If x y z -> ("If", [x,y,z])
in intercalate "\n" (op : map (indent . prettyPrint' (i + 1)) args)
First, I would stay out of the IO monad for as long as possible. Have prettyPrint' return a string to be printed.
prettyPrint :: Expr -> IO ()
prettyPrint = putStrLn . prettyPrint'
Now, the only job of prettyPrint' is to create a (possibly multiline) string to be printed. For numbers, that's easy: just use the show instance.
prettyPrint' :: Expr -> String
prettyPrint' e#(Num _) = show e
-- or, ignoring the Show instance for Expr altogether
-- prettyPrint' (Num x) = "Num " ++ show x
For the rest, there is a pattern:
Identify the constructor
Identify its arguments
Join the constructor name and its pretty-printed arguments with newlines. Each argument will be indented one level relative to its operator; the recursion will take care of multiple levels of indentation.
That will look like
prettyPrint' expr = let indent x = " " ++ x
(op, args) = case expr of
Add x y -> ("Add", [x,y])
Mult x y -> ("Mult", [x,y])
Neg x -> ("Neg", [x])
If x y z -> ("If", [x,y,z])
in intercalate "\n" (op : map (indent . prettyPrint') args)
As an example, consider what prettyPrint' will do with the expression Add (Num 3) (Num 5). First, it sets op to "Add" and args to [Num 3, Num 5]. Next, it maps indent . prettyPrint' over the argument list, to get [" Num 3", " Num 5"]. Putting the operator on the front of the list yields ["Add", " Num 3", " Num 3"], then joining them with intercalate produces "Add\n Num 3\n Num 5".
The only remaining boilerplate is in the case expression. I think it's possible to eliminate that, but it requires a level of generic programming I'm not familiar with. I'm sure someone else could probably run with my answer to fix that.
In general, when addressing duplication in code, it pays to keep the rule of three in mind. Two occurrences of a block of code isn't necessarily a problem.
That said, Haskell is a (very) strongly-typed language, so you generally can't pattern-match on arity like you can in, say, Erlang or Clojure.
If you really want to abstract away the recursion part of a recursive data structure, you can define the catamorphism for it. People often also call this a fold, so let's keep that slightly more friendly name:
data Expr =
Num Int | Add Expr Expr | Mult Expr Expr | Neg Expr | If Bool Expr Expr deriving (Show, Read)
foldExpr ::
(Int -> a) -> (a -> a -> a) -> (a -> a -> a) -> (a -> a) -> (Bool -> a -> a -> a) -> Expr -> a
foldExpr num _ _ _ _ (Num x) = num x
foldExpr num add mul neg iff (Add x y) =
add (foldExpr num add mul neg iff x) (foldExpr num add mul neg iff y)
foldExpr num add mul neg iff (Mult x y) =
mul (foldExpr num add mul neg iff x) (foldExpr num add mul neg iff y)
foldExpr num add mul neg iff (Neg x) = neg (foldExpr num add mul neg iff x)
foldExpr num add mul neg iff (If b x y) =
iff b (foldExpr num add mul neg iff x) (foldExpr num add mul neg iff y)
This is an entirely generic function that enables you turn turn any Expr value into any value of the type a, without worrying about reimplementing recursion every time. You just have to supply functions that deal with each of the cases.
You can, for example, easily write an evaluator:
evaluate :: Expr -> Int
evaluate = foldExpr id (+) (*) negate (\p x y -> if p then x else y)
(Notice, BTW, that I changed the definition of If, because I couldn't see how the OP definition would work.)
You can also write a function to turn an Expr value into a string, although this one is just a sketch; it needs indentation or bracket logic to work correctly:
prettyPrint :: Expr -> String
prettyPrint =
foldExpr
show -- Num
(\x y -> x ++ "+" ++ y) -- Add
(\x y -> x ++ "*" ++ y) -- Mult
(\x -> "(-" ++ x ++ ")") -- Neg
(\p x y -> "if " ++ show p ++ " then " ++ x ++ " else " ++ y) -- If
You can try it out in GHCi:
*Q53284410> evaluate (Num 42)
42
*Q53284410> evaluate (Add (Num 40) (Num 2))
42
*Q53284410> evaluate (Add (Mult (Num 4) (Num 10)) (Num 2))
42
*Q53284410> prettyPrint $ Num 42
"42"
*Q53284410> prettyPrint $ Mult (Num 6) (Num 7)
"6*7"
*Q53284410> prettyPrint $ Add (Mult (Num 2) (Num 3)) (Num 7)
"2*3+7"
Yes, just create a function to print list of Expr:
import Control.Monad (forM_)
printExprList::[Expr]->Int->String->IO ()
printExprList exprs i desc = do
putStrLn $ concat (replicate i " ") ++ desc
forM_ (zip exprs [i..]) $ \(e, j)-> prettyPrint' e (j+1)
and then call it to print:
prettyPrint' :: Expr -> Int -> IO ()
prettyPrint' (Add x y) i = printExprList [x, y] i "Add"
prettyPrint' (Mult x y) i = printExprList [x, y] i "Mult"
prettyPrint' (Neg x) i = printExprList [x] i "Neg"
prettyPrint' (If x y z) i = printExprList [x, y, z] i "If"
prettyPrint' (Num x) i = putStrLn $ concat (replicate i " ")
++ "Num " ++ show x
My goal is to write a recursive function that uses the mapping function to generate a set of lists of integers that are ultimately generated from an initial list of integers.
To help me understand what my coding needs to do, I set lst0 as my initial list of integers, then used lst0 to make lst1,and lst1 to make lst2,and so on and so forth.
lst0 = [1,25,100]
lst1 = map (\x -> 0 + x) lst0
lst2 = map (\y -> map (\x -> y + x) lst0) lst1
lst3 = map (\z -> map (\y -> map (\x -> y + x) lst0) z) lst2
lst4 = map (\w -> map (\z -> map (\y -> map (\x -> y + x) lst0) z) w) lst3
Then I rewrote the above lists as functions that take as input the previous list/funcition. They all work fine (e.g., list4 lst3 which is short for list4 $ list3 $ list2 $ list1 $ list0).
list0 = [1,25,100]
list1 prev = map (\x -> 0 + x) prev
list2 prev = map (\y -> map (\x -> y + x) list0) prev
list3 prev = map (\z -> map (\y -> map (\x -> y + x) list0)z) prev
list4 prev = map (\w -> map (\z -> map (\y -> map (\x -> y + x) list0)z) w) prev
I then rewrote each function with the previous function in it to get the below functions. These also work just fine.
list0' = [1,25,100]
list1' = map (\x -> 0 + x) list0'
list2' prev = map (\y -> map (\x -> y + x) list0') prev
list3' prev = map (\z -> list2' z) prev
list4' prev = map (\w -> list3' w) prev
Yet when I try to condense all of the above functions into a recursive foo function below, I get the following error about (foo (n-1) initlst) z. The n is the depth of the game tree I am building and initlst will be lst0 (that is [1,25,100]).
foo n initlst
| n == 1 = map (\x -> 0 + x) initlst
| n == 2 = map (\y -> map (\x -> y + x) initlst) $ foo 1 initlst
| otherwise = map (\z -> (foo (n-1) initlst) z) $ foo (n-1) initlst
The error message is the following and it references this part of the foo function: (foo (n-1) initlst) z:
Couldn't match expected type ‘a1 -> a1’ with actual type ‘[a1]’
Relevant bindings include
z :: a1 (bound at BuildTreesQ3.lhs:61:24)
initlst :: [a1] (bound at BuildTreesQ3.lhs:58:9)
foo :: a -> [a1] -> [a1] (bound at BuildTreesQ3.lhs:58:3)
The function ‘foo’ is applied to three arguments,
but its type ‘a -> [a1] -> [a1]’ has only two
In the expression: (foo (n - 1) initlst) z
In the first argument of ‘map’, namely
‘(\ z -> (foo (n - 1) initlst) z)’
Any insights into what I'm missing will be very much appreciated. Thanks in advance.
I have a data type representing arithmetic expressions:
data E = Add E E | Mul E E | Var String
I want to write an expansion function which will convert an expression into sum of products of variables (sort of braces expansion). Using recursion schemes of course.
I only could think of an algorithm in the spirit of "progress and preservation". The algorithm at each step constructs terms that are fully expanded so there is no need to re-check.
The handling of Mul made me crazy, so instead of doing it directly I used an isomorphic type of [[String]] and took advantage of concat and concatMap already implemented for me:
type Poly = [Mono]
type Mono = [String]
mulMonoBy :: Mono -> Poly -> Poly
mulMonoBy x = map (x ++)
mulPoly :: Poly -> Poly -> Poly
mulPoly x = concatMap (flip mulMonoBy x)
So then I just use cata:
expandList :: E -> Poly
expandList = cata $ \case
Var x -> [[x]]
Add e1 e2 = e1 ++ e2
Mul e1 e2 = mulPoly e1 e2
And convert back:
fromPoly :: Poly -> Expr
fromPoly = foldr1 Add . map fromMono where
fromMono = foldr1 Mul . map Var
Are there significantly better approaches?
Upd: There are few confusions.
The solution does allow multiline variable names. Add (Val "foo" (Mul (Val "foo) (Var "bar"))) is a representation of foo + foo * bar. I'm not representing x*y*z with Val "xyz" or something. Note that also as there are no scalars repeated vars such as "foo * foo * quux" are perfectly allowed.
By sum of products I mean sort of "curried" n-ary sum of products. A concise definition of sum of products is that I want an expression without any parentheses, with all parens represented by associativity and priority.
So (foo * bar + bar) + (foo * bar + bar) is not a sum of products as the because of middle + is sum of sums
(foo * bar + (bar + (foo * bar + bar))) or corresponding left-associative version are right answers, although we must guarantee that associativity is always left of always right. So the correct type for right-assoaciative solution is
data Poly = Sum Mono Poly
| Product Mono
which is isomorphic to nonempty lists: NonEmpty Poly (note Sum Mono Poly instead of Sum Poly Poly). If we allow empty sums or products then we get just the list of list representation I used.
Also of you don't care about performance, the multiplication seems to be just liftA2 (++)
I am no expert in recursion schemes, but since it sounds like you are trying to practice them, hopefully you will not find it too onerous to convert a solution using manual recursion to one using recursion schemes. I'll write it with mixed prose and code first, and include the complete code again at the end for simpler copy/pasting.
It is not too difficult to do using simply the distributive property and a bit of recursive algebra. Before we begin, though, let's define a better result type, one that guarantees we can only ever represent sums of products:
data Poly term = Sum (Poly term) (Poly term)
| Product (Mono term)
deriving Show
data Mono term = Term term
| MonoMul (Mono term) (Mono term)
deriving Show
This way we can't possibly mess up and accidentally yield an incorrect result like
(Mul (Var "x") (Add (Var "y") (Var "z")))
Now, let's write our function.
expand :: E -> Poly String
First, a base case: it is trivial to expand a Var, because it is already in sum-of-products form. But we must convert it a bit to fit it into our Poly result type:
expand (Var x) = Product (Term x)
Next, note that it is easy to expand an addition: simply expand the two sub-expressions, and add them together.
expand (Add x y) = Sum (expand x) (expand y)
What about a multiplication? That is a bit more complicated, since
Product (expand x) (expand y)
is ill-typed: we can't multiply polynomials, only monomials. But we do know how to do algebraic manipulation to turn a multiplication of polynomials into a sum of multiplications of monomials, via the distributive rule. As in your question, we'll need a function mulPoly. But let's just assume that exists, and implement it later.
expand (Mul x y) = mulPoly (expand x) (expand y)
That handles all the cases, so all that's left is to implement mulPoly by distributing the multiplications across the two polynomials' terms. We simply break down one of the polynomials one term at a time, and multiply the term across each of the terms in the other polynomial, adding together the results.
mulPoly :: Poly String -> Poly String -> Poly String
mulPoly (Product x) y = mulMonoBy x y
mulPoly (Sum a b) x = Sum (mulPoly a x) (mulPoly b x)
mulMonoBy :: Mono String -> Poly -> Poly
mulMonoBy x (Product y) = Product $ MonoMul x y
mulMonoBy x (Sum a b) = Sum (mulPoly a x') (mulPoly b x')
where x' = Product x
And in the end, we can test that it works as intended:
expand (Mul (Add (Var "a") (Var "b")) (Add (Var "y") (Var "z")))
{- results in: Sum (Sum (Product (MonoMul (Term "y") (Term "a")))
(Product (MonoMul (Term "z") (Term "a"))))
(Sum (Product (MonoMul (Term "y") (Term "b")))
(Product (MonoMul (Term "z") (Term "b"))))
-}
Or,
(a + b)(y * z) = ay + az + by + bz
which we know to be correct.
The complete solution, as promised above:
data E = Add E E | Mul E E | Var String
data Poly term = Sum (Poly term) (Poly term)
| Product (Mono term)
deriving Show
data Mono term = Term term
| MonoMul (Mono term) (Mono term)
deriving Show
expand :: E -> Poly String
expand (Var x) = Product (Term x)
expand (Add x y) = Sum (expand x) (expand y)
expand (Mul x y) = mulPoly (expand x) (expand y)
mulPoly :: Poly String -> Poly String -> Poly String
mulPoly (Product x) y = mulMonoBy x y
mulPoly (Sum a b) x = Sum (mulPoly a x) (mulPoly b x)
mulMonoBy :: Mono String -> Poly String -> Poly String
mulMonoBy x (Product y) = Product $ MonoMul x y
mulMonoBy x (Sum a b) = Sum (mulPoly a x') (mulPoly b x')
where x' = Product x
main = print $ expand (Mul (Add (Var "a") (Var "b")) (Add (Var "y") (Var "z")))
This answer has three sections. The first section, a summary in which I present my two favourite solutions, is the most important one. The second section contains types and imports, as well as extended commentary on the way towards the solutions. The third section focuses on the task of reassociating expressions, something that the original version of the answer (i.e. the second section) had not given due attention.
At the end of the day, I ended up with two solutions worth discussing. The first one is expandDirect (cf. the third section):
expandDirect :: E a -> E a
expandDirect = cata alg
where
alg = \case
Var' s -> Var s
Add' x y -> apo coalgAdd (Add x y)
Mul' x y -> (apo coalgAdd' . apo coalgMul) (Mul x y)
coalgAdd = \case
Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
x -> Left <$> project x
coalgAdd' = \case
Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
Add x (Add y y') -> Add' (Left x) (Right (Add y y'))
x -> Left <$> project x
coalgMul = \case
Mul (Add x x') y -> Add' (Right (Mul x y)) (Right (Mul x' y))
Mul x (Add y y') -> Add' (Right (Mul x y)) (Right (Mul x y'))
x -> Left <$> project x
With it, we rebuild the tree from the bottom (cata). On every branch, if we find something invalid we walk back and rewrite the subtree (apo), redistributing and reassociating as needed until all immediate children are correctly arranged (apo makes it possible to do that without having to rewrite everyting down to the very bottom).
The second solution, expandMeta, is a much simplified version of expandFlat from the third section.
expandMeta :: E a -> E a
expandMeta = apo coalg . cata alg
where
alg = \case
Var' s -> pure (Var s)
Add' x y -> x <> y
Mul' x y -> Mul <$> x <*> y
coalg = \case
x :| [] -> Left <$> project x
x :| (y:ys) -> Add' (Left x) (Right (y :| ys))
expandMeta is a metamorphism; that is, a catamorphism followed by an anamorphism (while we are using apo here as well, an apomorphism is just a fancy kind of anamorphism, so I guess the nomenclature still applies). The catamorphism changes the tree into a non-empty list -- that implicitly handles the reassociation of the Adds -- with the list applicative being used to distribute multiplication (much like you suggest). The coalgebra then quite trivially converts the non-empty list back into a tree with the appropriate shape.
Thank you for the question -- I had a lot of fun with it! Preliminaries:
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Data.Functor.Foldable
import qualified Data.List.NonEmpty as N
import Data.List.NonEmpty (NonEmpty(..))
import Data.Semigroup
import Data.Foldable (toList)
import Data.List (nub)
import qualified Data.Map as M
import Data.Map (Map, (!))
import Test.QuickCheck
data E a = Var a | Add (E a) (E a) | Mul (E a) (E a)
deriving (Eq, Show, Functor, Foldable)
data EF a b = Var' a | Add' b b | Mul' b b
deriving (Eq, Show, Functor)
type instance Base (E a) = EF a
instance Recursive (E a) where
project = \case
Var x -> Var' x
Add x y -> Add' x y
Mul x y -> Mul' x y
instance Corecursive (E a) where
embed = \case
Var' x -> Var x
Add' x y -> Add x y
Mul' x y -> Mul x y
To begin with, my first working (if flawed) attempt, which uses the applicative instance of (non-empty) lists to distribute:
expandTooClever :: E a -> E a
expandTooClever = cata $ \case
Var' s -> Var s
Add' x y -> Add x y
Mul' x y -> foldr1 Add (Mul <$> flatten x <*> flatten y)
where
flatten :: E a -> NonEmpty (E a)
flatten = cata $ \case
Var' s -> pure (Var s)
Add' x y -> x <> y
Mul' x y -> pure (foldr1 Mul (x <> y))
expandTooClever has one relatively serious problem: as it calls flatten, a full-blown fold, for both subtrees whenever it reaches a Mul, it has horrible asymptotics for chains of Mul.
Brute force, simplest-thing-that-could-possibly-work solution, with an algebra that calls itself recursively:
expandBrute :: E a -> E a
expandBrute = cata alg
where
alg = \case
Var' s -> Var s
Add' x y -> Add x y
Mul' (Add x x') y -> Add (alg (Mul' x y)) (alg (Mul' x' y))
Mul' x (Add y y') -> Add (alg (Mul' x y)) (alg (Mul' x y'))
Mul' x y -> Mul x y
The recursive calls are needed because the distribution might introduce new occurrences of Add under Mul.
A slightly more tasteful variant of expandBrute, with the recursive call factored out into a separate function:
expandNotSoBrute :: E a -> E a
expandNotSoBrute = cata alg
where
alg = \case
Var' s -> Var s
Add' x y -> Add x y
Mul' x y -> dis x y
dis (Add x x') y = Add (dis x y) (dis x' y)
dis x (Add y y') = Add (dis x y) (dis x y')
dis x y = Mul x y
A tamed expandNotSoBrute, with dis being turned into an apomorphism. This way of phrasing it expresses nicely the big picture of what is going on: if you only have Vars and Adds, you can trivially reproduce the tree bottom-up without a care in the world; if you hit a Mul, however, you have to go back and reconstuct the whole subtree to perform the distributions (I wonder is there is a specialised recursion scheme that captures this pattern).
expandEvert :: E a -> E a
expandEvert = cata alg
where
alg :: EF a (E a) -> E a
alg = \case
Var' s -> Var s
Add' x y -> Add x y
Mul' x y -> apo coalg (x, y)
coalg :: (E a, E a) -> EF a (Either (E a) (E a, E a))
coalg (Add x x', y) = Add' (Right (x, y)) (Right (x', y))
coalg (x, Add y y') = Add' (Right (x, y)) (Right (x, y'))
coalg (x, y) = Mul' (Left x) (Left y)
apo is necessary because we want to anticipate the final result if there is nothing else to distribute. (There is a way to write it with ana; however, that requires wastefully rebuilding trees of Muls without changes, which leads to the same asymptotics problem expandTooClever had.)
Last, but not least, a solution which is both a successful realisation of what I had attempted with expandTooClever and my interpretation of amalloy's answer. BT is a garden-variety binary tree with values on the leaves. A product is represented by a BT a, while a sum of products is a tree of trees.
expandSOP :: E a -> E a
expandSOP = cata algS . fmap (cata algP) . cata algSOP
where
algSOP :: EF a (BT (BT a)) -> BT (BT a)
algSOP = \case
Var' s -> pure (pure s)
Add' x y -> x <> y
Mul' x y -> (<>) <$> x <*> y
algP :: BTF a (E a) -> E a
algP = \case
Leaf' s -> Var s
Branch' x y -> Mul x y
algS :: BTF (E a) (E a) -> E a
algS = \case
Leaf' x -> x
Branch' x y -> Add x y
BT and its instances:
data BT a = Leaf a | Branch (BT a) (BT a)
deriving (Eq, Show)
data BTF a b = Leaf' a | Branch' b b
deriving (Eq, Show, Functor)
type instance Base (BT a) = BTF a
instance Recursive (BT a) where
project (Leaf s) = Leaf' s
project (Branch l r) = Branch' l r
instance Corecursive (BT a) where
embed (Leaf' s) = Leaf s
embed (Branch' l r) = Branch l r
instance Semigroup (BT a) where
l <> r = Branch l r
-- Writing this, as opposed to deriving it, for the sake of illustration.
instance Functor BT where
fmap f = cata $ \case
Leaf' x -> Leaf (f x)
Branch' l r -> Branch l r
instance Applicative BT where
pure x = Leaf x
u <*> v = ana coalg (u, v)
where
coalg = \case
(Leaf f, Leaf x) -> Leaf' (f x)
(Leaf f, Branch xl xr) -> Branch' (Leaf f, xl) (Leaf f, xr)
(Branch fl fr, v) -> Branch' (fl, v) (fr, v)
To wrap things up, a test suite:
newtype TestE = TestE { getTestE :: E Char }
deriving (Eq, Show)
instance Arbitrary TestE where
arbitrary = TestE <$> sized genExpr
where
genVar = Var <$> choose ('a', 'z')
genAdd n = Add <$> genSub n <*> genSub n
genMul n = Mul <$> genSub n <*> genSub n
genSub n = genExpr (n `div` 2)
genExpr = \case
0 -> genVar
n -> oneof [genVar, genAdd n, genMul n]
data TestRig b = TestRig (Map Char b) (E Char)
deriving (Show)
instance Arbitrary b => Arbitrary (TestRig b) where
arbitrary = do
e <- genExpr
d <- genDict e
return (TestRig d e)
where
genExpr = getTestE <$> arbitrary
genDict x = M.fromList . zip (keys x) <$> (infiniteListOf arbitrary)
keys = nub . toList
unsafeSubst :: Ord a => Map a b -> E a -> E b
unsafeSubst dict = fmap (dict !)
eval :: Num a => E a -> a
eval = cata $ \case
Var' x -> x
Add' x y -> x + y
Mul' x y -> x * y
evalRig :: (E Char -> E Char) -> TestRig Integer -> Integer
evalRig f (TestRig d e) = eval (unsafeSubst d (f e))
mkPropEval :: (E Char -> E Char) -> TestRig Integer -> Bool
mkPropEval f = (==) <$> evalRig id <*> evalRig f
isDistributed :: E a -> Bool
isDistributed = para $ \case
Add' (_, x) (_, y) -> x && y
Mul' (Add _ _, _) _ -> False
Mul' _ (Add _ _, _) -> False
Mul' (_, x) (_, y) -> x && y
_ -> True
mkPropDist :: (E Char -> E Char) -> TestE -> Bool
mkPropDist f = isDistributed . f . getTestE
main = mapM_ test
[ ("expandTooClever" , expandTooClever)
, ("expandBrute" , expandBrute)
, ("expandNotSoBrute", expandNotSoBrute)
, ("expandEvert" , expandEvert)
, ("expandSOP" , expandSOP)
]
where
test (header, func) = do
putStrLn $ "Testing: " ++ header
putStr "Evaluation test: "
quickCheck $ mkPropEval func
putStr "Distribution test: "
quickCheck $ mkPropDist func
By sum of products I mean sort of "curried" n-ary sum of products. A concise definition of sum of products is that I want an expression without any parentheses, with all parens represented by associativity and priority.
We can adjust the solutions above so that the sums are reassociated. The easiest way is replacing the outer BT in expandSOP with NonEmpty. Given that the multiplication there is, much like you suggest, liftA2 (<>), this works straight away.
expandFlat :: E a -> E a
expandFlat = cata algS . fmap (cata algP) . cata algSOP
where
algSOP :: EF a (NonEmpty (BT a)) -> NonEmpty (BT a)
algSOP = \case
Var' s -> pure (Leaf s)
Add' x y -> x <> y
Mul' x y -> (<>) <$> x <*> y
algP :: BTF a (E a) -> E a
algP = \case
Leaf' s -> Var s
Branch' x y -> Mul x y
algS :: NonEmptyF (E a) (E a) -> E a
algS = \case
NonEmptyF x Nothing -> x
NonEmptyF x (Just y) -> Add x y
Another option is using any of the other solutions and reassociating the sums in the distributed tree in a separate step.
flattenSum :: E a -> E a
flattenSum = cata alg
where
alg = \case
Add' x y -> apo coalg (x, y)
x -> embed x
coalg = \case
(Add x x', y) -> Add' (Left x) (Right (x', y))
(x, y) -> Add' (Left x) (Left y)
We can also roll flattenSum and expandEvert into a single function. Note that the sum coalgebra needs an extra case when it gets the result of the distribution coalgebra. That happens because, as the coalgebra proceeds from top to bottom, we can't be sure that the subtrees it generates are properly associated.
-- This is written in a slightly different style than the previous functions.
expandDirect :: E a -> E a
expandDirect = cata alg
where
alg = \case
Var' s -> Var s
Add' x y -> apo coalgAdd (Add x y)
Mul' x y -> (apo coalgAdd' . apo coalgMul) (Mul x y)
coalgAdd = \case
Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
x -> Left <$> project x
coalgAdd' = \case
Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
Add x (Add y y') -> Add' (Left x) (Right (Add y y'))
x -> Left <$> project x
coalgMul = \case
Mul (Add x x') y -> Add' (Right (Mul x y)) (Right (Mul x' y))
Mul x (Add y y') -> Add' (Right (Mul x y)) (Right (Mul x y'))
x -> Left <$> project x
Perhaps there is a more clever way of writing expandDirect, but I haven't figured it out yet.
I am reading haskell book and curious why the return type of the bind operator look odd to me
For the given definitions
type Parser a = String -> [(a, String)]
item :: Parser Char
item = \inp -> case inp of
[] -> []
(x:xs) -> [(x,xs)]
bind :: Parser a -> (a -> Parser b) -> Parser b
p `bind` f = \inp -> concat [ f x inp' | (x, inp') <- p inp]
when I define z in GHCI as
let z = item `bind` (\x -> (\y -> result (x,y))) "Rohit"
the return type is
>> :t z
z :: Parser ([Char], Char)
Question:
(1) Shouldn't the return type of (Char, [Char])? looking at the list comprehension, "(x, inp') <- p inp" should yield -> "('r', "ohit")". Next f x inp' is left associative, so f x should yield character 'r' and pass to the lambda that should return result tuple ('r', "ohit"), but why is it that z type is ([Char], char) :: (x,y)
(2) How can i print the value of z in the above case on the ghci
Assuming that result is of type a -> [a] (did you mean return for the list monad?), the issue you encounter comes from the fact that you use the infix bind.
item `bind` (\x -> (\y -> result (x,y))) "Rohit"
is parsed as
bind item ((\ x y -> result (x, y)) "Rohit")
rather than what you expected which I assume is:
bind item (\ x y -> result (x, y)) "Rohit"
You can fix this by using $:
let z = item `bind` (\x -> (\y -> result (x,y))) $ "Rohit"
I'm unsure what result is here, but this is a problem of associativity. Consider your z:
let z = item `bind` (\x -> (\y -> result (x,y))) "Rohit"
This is equivalent to
let z = item `bind` ((\x -> (\y -> result (x,y))) "Rohit")
= item `bind` (\y -> result ("Rohit",y))
I believe you would get the result you desire by adding the following brackets:
let z = (item `bind` (\x -> (\y -> result (x,y)))) "Rohit"