Find tree height using folde function in Haskell

One of the assignments I am working on leading up to exams had me create
data Exp = T | F | And Exp Exp | Or Exp Exp | Not Exp deriving (Eq, Show, Ord, Read)
Then it asked to make
folde :: a -> a -> (a -> a -> a) -> (a -> a -> a) -> (a -> a) -> Exp -> a
This is what I came up with
folde :: a -> a -> (a -> a -> a) -> (a -> a -> a) -> (a -> a) -> Exp -> a
folde t f a o n T = t
folde t f a o n F = f
folde t f a o n (And x y) = a (folde t f a o n x) (folde t f a o n y)
folde t f a o n (Or x y) = o (folde t f a o n x) (folde t f a o n y)
folde t f a o n (Not x) = n (folde t f a o n x)
The assignment asks for evb, evi and evh.
They are all supposed to work with one single call to folde using correct parameters.
Evb evaluates boolean expressions.
evb :: Exp -> Bool
evb = folde True False (&&) (||) not
Evi evaluates to an integer, treating T as Int 1, F as Int 5, And as +, Or as * and Not as negate.
evi :: Exp -> Int
evi = folde 1 5 (+) (*) negate
So far so good, it all works. I'll be happy for any feedback on this as well.
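For instance (a quick check, not part of the original post):
evb (And T (Not F)) -- True: True && not False
evi (And T (Not F)) -- -4:   1 + negate 5
evi (Or F (Not T))  -- -5:   5 * negate 1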
However, I can't seem to understand how to solve the evh.
evh is supposed to calculate the height of the tree.
It should be evh :: Exp -> Int
The assignment says it should treat T and F as height 1.
It goes on to say that Not x should have height x + 1, and that And and Or have the height of their tallest subtree + 1.
I can't seem to figure out what I should pass to my folde function.

The assignment says it should treat T and F as height 1. It goes on to say that Not x should have height x + 1, and that And and Or have the height of their tallest subtree + 1.
You can write this pretty directly with explicit recursion:
height T = 1
height F = 1
height (Not x) = height x + 1
height (And x y) = max (height x) (height y) + 1
height (Or x y) = max (height x) (height y) + 1
Now, how do you write this with folde? The key thing about recursive folding is that folde gives each of your functions the result of folding all the subtrees. When you folde on And l r, it folds both subtrees first, and then passes those results into the argument to folde. So, instead of you manually calling height x, folde is going to calculate that for you and pass it as an argument, so your own work ends up something like \x y -> max x y + 1. Essentially, split height into 5 definitions, one per constructor, and instead of destructuring and recursing down subtrees, take the heights of the subtrees as arguments:
heightT = 1 -- height T = 1
heightF = 1 -- height F = 1
heightN x = x + 1 -- height (Not x) = height x + 1
heightA l r = max l r + 1 -- height (And l r) = max (height l) (height r) + 1
heightO l r = max l r + 1 -- height (Or l r) = max (height l) (height r) + 1
Feed them to folde, and simplify:
height = folde 1 1  -- T F
               ao   -- And
               ao   -- Or
               (+1) -- Not
  where ao x y = max x y + 1
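As a quick check (not part of the original answer), the folded height agrees with the explicit version above:
height T               -- 1
height (Not T)         -- 2
height (And T (Not F)) -- 3: And at the root, Not below it, F at the bottom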
And now for something new! Take this definition:
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}
data ExpF a = T | F | Not a | And a a | Or a a
    deriving (Functor, Foldable, Traversable)
This looks like your Exp, except instead of recursion it's got a type parameter and a bunch of holes for values of that type. Now, take a look at the types of expressions under ExpF:
T :: forall a. ExpF a
Not F :: forall a. ExpF (ExpF a)
And F (Not T) :: forall a. ExpF (ExpF (ExpF a))
If you set a = ExpF (ExpF (ExpF (ExpF (ExpF ...)))) (on to infinity) in each of the above, you find that they can all be made to have the same type:
T :: ExpF (ExpF (ExpF ...))
Not F :: ExpF (ExpF (ExpF ...))
And F (Not T) :: ExpF (ExpF (ExpF ...))
Infinity is fun! We can encode this infinitely recursive type with Fix
newtype Fix f = Fix { unFix :: f (Fix f) }
-- Compare
-- Type level: Fix f = f (Fix f)
-- Value level: fix f = f (fix f)
-- Fix ExpF = ExpF (ExpF (ExpF ...))
-- fix (1:) = 1:( 1:( 1: ...))
-- Recover original Exp
type Exp = Fix ExpF
-- Sprinkle Fix everywhere to make it work
Fix T :: Exp
Fix $ And (Fix T) (Fix $ Not $ Fix F) :: Exp
-- can also use pattern synonyms
pattern T' = Fix T
pattern F' = Fix F
pattern Not' t = Fix (Not t)
pattern And' l r = Fix (And l r)
pattern Or' l r = Fix (Or l r)
T' :: Exp
And' T' (Not' F') :: Exp
And now here's the nice part: one definition of fold to rule them all:
fold :: Functor f => (f a -> a) -> Fix f -> a
fold alg (Fix ffix) = alg $ fold alg <$> ffix
-- ffix :: f (Fix f)
-- fold alg :: Fix f -> a
-- fold alg <$> ffix :: f a
-- ^ Hey, remember when I said folds fold the subtrees first?
-- Here you can see it very literally
Here's a monomorphic height
height = fold $ \case -- LambdaCase extension: \case ... ~=> \fresh -> case fresh of ...
T -> 1
F -> 1
Not x -> x + 1
And x y -> max x y + 1
Or x y -> max x y + 1
And now a very polymorphic height (in your case it's off by one; oh well).
height = fold $ option 0 (+1) . fmap getMax . foldMap (Option . Just . Max)
height $ Fix T -- 0
height $ Fix $ And (Fix T) (Fix $ Not $ Fix F) -- 2
See the recursion-schemes package to learn these dark arts. It also makes this work for base types like [] with a type family, and removes the need to Fix everything with said trickery + some TH.

Related

How to fix "Illegal type signature" error in Haskell

I ran into the following error in Haskell:
"Type signatures are only allowed in patterns with ScopedTypeVariables"
How should I re-use the defined variables? Thanks in advance.
sum :: (Double -> Double) -> (Double -> Double) -> Int ->
(Double -> Double)
sum f g n = (\x -> helper f g n x)
where
helper :: (Double -> Double) -> (Double -> Double) -> Int -> Double ->
Double
|n == 0 = 0
|mod n 2 == 1 = f(x) + helper f g n-1 f(x)
|otherwise = g(x) + helper f g n-1 g(x)
This actually looks more like a syntax error: you never defined a function body for helper. You defined the signature of helper, followed by guards (the | ... part), but you should state the head helper f g n x again before the guards.
Furthermore, I don't think it is useful to define helper here with a variable for f and g, since these remain fixed throughout the recursion.
You can probably define the function as:
sumfg :: (Double -> Double) -> (Double -> Double) -> Int -> Double -> Double
sumfg f g = helperf
where helperf 0 _ = 0
helperf i x = let fx = f x in fx + helperg (i-1) fx
helperg 0 _ = 0
helperg i x = let gx = g x in gx + helperf (i-1) gx
We here defined two "helper" functions helperf and helperg, helperf will sum up f x with helperg (i-1) (f x), and helperg does the same, except that we use g instead of f. We here thus use mutual recursion to solve the problem.
We can however solve this problem more elegantly, by making use of scanl :: (b -> a -> b) -> b -> [a] -> [b], take :: Int -> [a] -> [a] and sum :: Num a => [a] -> a:
sumfg :: Num a => (a -> a) -> (a -> a) -> Int -> a -> a
sumfg f g n x = sum (take n (scanl (flip ($)) (f x) (cycle [g, f])))
Here we thus make an infinite alternating list [g, f, g, f, g, f, ...] with cycle [g, f]. We then use scanl (flip ($)) to apply each function in turn to the accumulator, yielding every intermediate result. We take the first n items of that list with take n, and finally we use sum to add these values up.
For example:
Prelude> sumfg (2+) (3*) 5 1
91
Since (2+1) + (3*(2+1)) + (2+(3*(2+1))) + (3*(2+(3*(2+1)))) + (2+(3*(2+(3*(2+1))))) is 91.
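To make the scanl step concrete, here is the intermediate list it builds for that call (a sketch one can check in GHCi):
scanl (flip ($)) ((2+) 1) (cycle [(3*), (2+)])
-- [3, 9, 11, 33, 35, 105, ...]; take 5 gives [3, 9, 11, 33, 35], which sums to 91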
We also generalized the signature: we can now work with any numerical type a, with the two functions f and g of type f, g :: a -> a.

Expression expansion using recursion schemes

I have a data type representing arithmetic expressions:
data E = Add E E | Mul E E | Var String
I want to write an expansion function which will convert an expression into sum of products of variables (sort of braces expansion). Using recursion schemes of course.
I could only think of an algorithm in the spirit of "progress and preservation". The algorithm at each step constructs terms that are fully expanded, so there is no need to re-check.
The handling of Mul made me crazy, so instead of doing it directly I used an isomorphic type of [[String]] and took advantage of concat and concatMap already implemented for me:
type Poly = [Mono]
type Mono = [String]
mulMonoBy :: Mono -> Poly -> Poly
mulMonoBy x = map (x ++)
mulPoly :: Poly -> Poly -> Poly
mulPoly x = concatMap (flip mulMonoBy x)
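For instance (a small check, not part of the original post):
mulPoly [["a"], ["b"]] [["y"], ["z"]]
-- [["y","a"], ["y","b"], ["z","a"], ["z","b"]]
-- i.e. (a + b) * (y + z) expands to ya + yb + za + zb; the order inside each monomial is arbitrary here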
So then I just use cata:
expandList :: E -> Poly
expandList = cata $ \case
Var x -> [[x]]
Add e1 e2 -> e1 ++ e2
Mul e1 e2 -> mulPoly e1 e2
And convert back:
fromPoly :: Poly -> E
fromPoly = foldr1 Add . map fromMono where
fromMono = foldr1 Mul . map Var
Are there significantly better approaches?
Upd: There are few confusions.
The solution does allow multi-character variable names. Add (Var "foo") (Mul (Var "foo") (Var "bar")) is a representation of foo + foo * bar. I'm not representing x*y*z with Var "xyz" or something. Note also that, as there are no scalars, repeated vars such as foo * foo * quux are perfectly allowed.
By sum of products I mean sort of "curried" n-ary sum of products. A concise definition of sum of products is that I want an expression without any parentheses, with all parens represented by associativity and priority.
So (foo * bar + bar) + (foo * bar + bar) is not a sum of products, because the middle + is a sum of sums.
(foo * bar + (bar + (foo * bar + bar))) or the corresponding left-associative version are right answers, although we must guarantee that associativity is always left or always right. So the correct type for the right-associative solution is
data Poly = Sum Mono Poly
| Product Mono
which is isomorphic to non-empty lists: NonEmpty Mono (note Sum Mono Poly instead of Sum Poly Poly). If we allow empty sums or products then we get just the list-of-lists representation I used.
Also, if you don't care about performance, the multiplication seems to be just liftA2 (++).
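(A small check of that remark, not part of the original post, using liftA2 from Control.Applicative:)
liftA2 (++) [["x"]] [["p"], ["q"]] -- [["x","p"], ["x","q"]]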
I am no expert in recursion schemes, but since it sounds like you are trying to practice them, hopefully you will not find it too onerous to convert a solution using manual recursion to one using recursion schemes. I'll write it with mixed prose and code first, and include the complete code again at the end for simpler copy/pasting.
It is not too difficult to do using simply the distributive property and a bit of recursive algebra. Before we begin, though, let's define a better result type, one that guarantees we can only ever represent sums of products:
data Poly term = Sum (Poly term) (Poly term)
| Product (Mono term)
deriving Show
data Mono term = Term term
| MonoMul (Mono term) (Mono term)
deriving Show
This way we can't possibly mess up and accidentally yield an incorrect result like
(Mul (Var "x") (Add (Var "y") (Var "z")))
Now, let's write our function.
expand :: E -> Poly String
First, a base case: it is trivial to expand a Var, because it is already in sum-of-products form. But we must convert it a bit to fit it into our Poly result type:
expand (Var x) = Product (Term x)
Next, note that it is easy to expand an addition: simply expand the two sub-expressions, and add them together.
expand (Add x y) = Sum (expand x) (expand y)
What about a multiplication? That is a bit more complicated, since
Product (expand x) (expand y)
is ill-typed: we can't multiply polynomials, only monomials. But we do know how to do algebraic manipulation to turn a multiplication of polynomials into a sum of multiplications of monomials, via the distributive rule. As in your question, we'll need a function mulPoly. But let's just assume that exists, and implement it later.
expand (Mul x y) = mulPoly (expand x) (expand y)
That handles all the cases, so all that's left is to implement mulPoly by distributing the multiplications across the two polynomials' terms. We simply break down one of the polynomials one term at a time, and multiply the term across each of the terms in the other polynomial, adding together the results.
mulPoly :: Poly String -> Poly String -> Poly String
mulPoly (Product x) y = mulMonoBy x y
mulPoly (Sum a b) x = Sum (mulPoly a x) (mulPoly b x)
mulMonoBy :: Mono String -> Poly String -> Poly String
mulMonoBy x (Product y) = Product $ MonoMul x y
mulMonoBy x (Sum a b) = Sum (mulPoly a x') (mulPoly b x')
where x' = Product x
And in the end, we can test that it works as intended:
expand (Mul (Add (Var "a") (Var "b")) (Add (Var "y") (Var "z")))
{- results in: Sum (Sum (Product (MonoMul (Term "y") (Term "a")))
(Product (MonoMul (Term "z") (Term "a"))))
(Sum (Product (MonoMul (Term "y") (Term "b")))
(Product (MonoMul (Term "z") (Term "b"))))
-}
Or,
(a + b)(y + z) = ay + az + by + bz
which we know to be correct.
The complete solution, as promised above:
data E = Add E E | Mul E E | Var String
data Poly term = Sum (Poly term) (Poly term)
| Product (Mono term)
deriving Show
data Mono term = Term term
| MonoMul (Mono term) (Mono term)
deriving Show
expand :: E -> Poly String
expand (Var x) = Product (Term x)
expand (Add x y) = Sum (expand x) (expand y)
expand (Mul x y) = mulPoly (expand x) (expand y)
mulPoly :: Poly String -> Poly String -> Poly String
mulPoly (Product x) y = mulMonoBy x y
mulPoly (Sum a b) x = Sum (mulPoly a x) (mulPoly b x)
mulMonoBy :: Mono String -> Poly String -> Poly String
mulMonoBy x (Product y) = Product $ MonoMul x y
mulMonoBy x (Sum a b) = Sum (mulPoly a x') (mulPoly b x')
where x' = Product x
main = print $ expand (Mul (Add (Var "a") (Var "b")) (Add (Var "y") (Var "z")))
This answer has three sections. The first section, a summary in which I present my two favourite solutions, is the most important one. The second section contains types and imports, as well as extended commentary on the way towards the solutions. The third section focuses on the task of reassociating expressions, something that the original version of the answer (i.e. the second section) had not given due attention.
At the end of the day, I ended up with two solutions worth discussing. The first one is expandDirect (cf. the third section):
expandDirect :: E a -> E a
expandDirect = cata alg
where
alg = \case
Var' s -> Var s
Add' x y -> apo coalgAdd (Add x y)
Mul' x y -> (apo coalgAdd' . apo coalgMul) (Mul x y)
coalgAdd = \case
Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
x -> Left <$> project x
coalgAdd' = \case
Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
Add x (Add y y') -> Add' (Left x) (Right (Add y y'))
x -> Left <$> project x
coalgMul = \case
Mul (Add x x') y -> Add' (Right (Mul x y)) (Right (Mul x' y))
Mul x (Add y y') -> Add' (Right (Mul x y)) (Right (Mul x y'))
x -> Left <$> project x
With it, we rebuild the tree from the bottom (cata). On every branch, if we find something invalid we walk back and rewrite the subtree (apo), redistributing and reassociating as needed until all immediate children are correctly arranged (apo makes it possible to do that without having to rewrite everything down to the very bottom).
The second solution, expandMeta, is a much simplified version of expandFlat from the third section.
expandMeta :: E a -> E a
expandMeta = apo coalg . cata alg
where
alg = \case
Var' s -> pure (Var s)
Add' x y -> x <> y
Mul' x y -> Mul <$> x <*> y
coalg = \case
x :| [] -> Left <$> project x
x :| (y:ys) -> Add' (Left x) (Right (y :| ys))
expandMeta is a metamorphism; that is, a catamorphism followed by an anamorphism (while we are using apo here as well, an apomorphism is just a fancy kind of anamorphism, so I guess the nomenclature still applies). The catamorphism changes the tree into a non-empty list -- that implicitly handles the reassociation of the Adds -- with the list applicative being used to distribute multiplication (much like you suggest). The coalgebra then quite trivially converts the non-empty list back into a tree with the appropriate shape.
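To see the two phases in action (a sketch, not from the original answer; alg and coalg refer to expandMeta's local definitions), take a * (b + c):
-- cata alg (Mul (Var "a") (Add (Var "b") (Var "c")))
--   == Mul (Var "a") (Var "b") :| [Mul (Var "a") (Var "c")]
-- apo coalg (Mul (Var "a") (Var "b") :| [Mul (Var "a") (Var "c")])
--   == Add (Mul (Var "a") (Var "b")) (Mul (Var "a") (Var "c"))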
Thank you for the question -- I had a lot of fun with it! Preliminaries:
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Data.Functor.Foldable
import qualified Data.List.NonEmpty as N
import Data.List.NonEmpty (NonEmpty(..))
import Data.Semigroup
import Data.Foldable (toList)
import Data.List (nub)
import qualified Data.Map as M
import Data.Map (Map, (!))
import Test.QuickCheck
data E a = Var a | Add (E a) (E a) | Mul (E a) (E a)
deriving (Eq, Show, Functor, Foldable)
data EF a b = Var' a | Add' b b | Mul' b b
deriving (Eq, Show, Functor)
type instance Base (E a) = EF a
instance Recursive (E a) where
project = \case
Var x -> Var' x
Add x y -> Add' x y
Mul x y -> Mul' x y
instance Corecursive (E a) where
embed = \case
Var' x -> Var x
Add' x y -> Add x y
Mul' x y -> Mul x y
To begin with, my first working (if flawed) attempt, which uses the applicative instance of (non-empty) lists to distribute:
expandTooClever :: E a -> E a
expandTooClever = cata $ \case
Var' s -> Var s
Add' x y -> Add x y
Mul' x y -> foldr1 Add (Mul <$> flatten x <*> flatten y)
where
flatten :: E a -> NonEmpty (E a)
flatten = cata $ \case
Var' s -> pure (Var s)
Add' x y -> x <> y
Mul' x y -> pure (foldr1 Mul (x <> y))
expandTooClever has one relatively serious problem: as it calls flatten, a full-blown fold, for both subtrees whenever it reaches a Mul, it has horrible asymptotics for chains of Mul.
Brute force, simplest-thing-that-could-possibly-work solution, with an algebra that calls itself recursively:
expandBrute :: E a -> E a
expandBrute = cata alg
where
alg = \case
Var' s -> Var s
Add' x y -> Add x y
Mul' (Add x x') y -> Add (alg (Mul' x y)) (alg (Mul' x' y))
Mul' x (Add y y') -> Add (alg (Mul' x y)) (alg (Mul' x y'))
Mul' x y -> Mul x y
The recursive calls are needed because the distribution might introduce new occurrences of Add under Mul.
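For instance (a sketch, not from the original answer):
-- expandBrute (Mul (Add (Var "a") (Var "b")) (Add (Var "c") (Var "d")))
-- The first distribution yields Mul (Var "a") (Add ...) and Mul (Var "b") (Add ...),
-- each of which again has an Add under a Mul, so alg is re-applied, giving
--   Add (Add (Mul (Var "a") (Var "c")) (Mul (Var "a") (Var "d")))
--       (Add (Mul (Var "b") (Var "c")) (Mul (Var "b") (Var "d")))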
A slightly more tasteful variant of expandBrute, with the recursive call factored out into a separate function:
expandNotSoBrute :: E a -> E a
expandNotSoBrute = cata alg
where
alg = \case
Var' s -> Var s
Add' x y -> Add x y
Mul' x y -> dis x y
dis (Add x x') y = Add (dis x y) (dis x' y)
dis x (Add y y') = Add (dis x y) (dis x y')
dis x y = Mul x y
A tamed expandNotSoBrute, with dis being turned into an apomorphism. This way of phrasing it expresses nicely the big picture of what is going on: if you only have Vars and Adds, you can trivially reproduce the tree bottom-up without a care in the world; if you hit a Mul, however, you have to go back and reconstruct the whole subtree to perform the distributions (I wonder if there is a specialised recursion scheme that captures this pattern).
expandEvert :: E a -> E a
expandEvert = cata alg
where
alg :: EF a (E a) -> E a
alg = \case
Var' s -> Var s
Add' x y -> Add x y
Mul' x y -> apo coalg (x, y)
coalg :: (E a, E a) -> EF a (Either (E a) (E a, E a))
coalg (Add x x', y) = Add' (Right (x, y)) (Right (x', y))
coalg (x, Add y y') = Add' (Right (x, y)) (Right (x, y'))
coalg (x, y) = Mul' (Left x) (Left y)
apo is necessary because we want to anticipate the final result if there is nothing else to distribute. (There is a way to write it with ana; however, that requires wastefully rebuilding trees of Muls without changes, which leads to the same asymptotics problem expandTooClever had.)
Last, but not least, a solution which is both a successful realisation of what I had attempted with expandTooClever and my interpretation of amalloy's answer. BT is a garden-variety binary tree with values on the leaves. A product is represented by a BT a, while a sum of products is a tree of trees.
expandSOP :: E a -> E a
expandSOP = cata algS . fmap (cata algP) . cata algSOP
where
algSOP :: EF a (BT (BT a)) -> BT (BT a)
algSOP = \case
Var' s -> pure (pure s)
Add' x y -> x <> y
Mul' x y -> (<>) <$> x <*> y
algP :: BTF a (E a) -> E a
algP = \case
Leaf' s -> Var s
Branch' x y -> Mul x y
algS :: BTF (E a) (E a) -> E a
algS = \case
Leaf' x -> x
Branch' x y -> Add x y
BT and its instances:
data BT a = Leaf a | Branch (BT a) (BT a)
deriving (Eq, Show)
data BTF a b = Leaf' a | Branch' b b
deriving (Eq, Show, Functor)
type instance Base (BT a) = BTF a
instance Recursive (BT a) where
project (Leaf s) = Leaf' s
project (Branch l r) = Branch' l r
instance Corecursive (BT a) where
embed (Leaf' s) = Leaf s
embed (Branch' l r) = Branch l r
instance Semigroup (BT a) where
l <> r = Branch l r
-- Writing this, as opposed to deriving it, for the sake of illustration.
instance Functor BT where
fmap f = cata $ \case
Leaf' x -> Leaf (f x)
Branch' l r -> Branch l r
instance Applicative BT where
pure x = Leaf x
u <*> v = ana coalg (u, v)
where
coalg = \case
(Leaf f, Leaf x) -> Leaf' (f x)
(Leaf f, Branch xl xr) -> Branch' (Leaf f, xl) (Leaf f, xr)
(Branch fl fr, v) -> Branch' (fl, v) (fr, v)
To wrap things up, a test suite:
newtype TestE = TestE { getTestE :: E Char }
deriving (Eq, Show)
instance Arbitrary TestE where
arbitrary = TestE <$> sized genExpr
where
genVar = Var <$> choose ('a', 'z')
genAdd n = Add <$> genSub n <*> genSub n
genMul n = Mul <$> genSub n <*> genSub n
genSub n = genExpr (n `div` 2)
genExpr = \case
0 -> genVar
n -> oneof [genVar, genAdd n, genMul n]
data TestRig b = TestRig (Map Char b) (E Char)
deriving (Show)
instance Arbitrary b => Arbitrary (TestRig b) where
arbitrary = do
e <- genExpr
d <- genDict e
return (TestRig d e)
where
genExpr = getTestE <$> arbitrary
genDict x = M.fromList . zip (keys x) <$> (infiniteListOf arbitrary)
keys = nub . toList
unsafeSubst :: Ord a => Map a b -> E a -> E b
unsafeSubst dict = fmap (dict !)
eval :: Num a => E a -> a
eval = cata $ \case
Var' x -> x
Add' x y -> x + y
Mul' x y -> x * y
evalRig :: (E Char -> E Char) -> TestRig Integer -> Integer
evalRig f (TestRig d e) = eval (unsafeSubst d (f e))
mkPropEval :: (E Char -> E Char) -> TestRig Integer -> Bool
mkPropEval f = (==) <$> evalRig id <*> evalRig f
isDistributed :: E a -> Bool
isDistributed = para $ \case
Add' (_, x) (_, y) -> x && y
Mul' (Add _ _, _) _ -> False
Mul' _ (Add _ _, _) -> False
Mul' (_, x) (_, y) -> x && y
_ -> True
mkPropDist :: (E Char -> E Char) -> TestE -> Bool
mkPropDist f = isDistributed . f . getTestE
main = mapM_ test
[ ("expandTooClever" , expandTooClever)
, ("expandBrute" , expandBrute)
, ("expandNotSoBrute", expandNotSoBrute)
, ("expandEvert" , expandEvert)
, ("expandSOP" , expandSOP)
]
where
test (header, func) = do
putStrLn $ "Testing: " ++ header
putStr "Evaluation test: "
quickCheck $ mkPropEval func
putStr "Distribution test: "
quickCheck $ mkPropDist func
By sum of products I mean sort of "curried" n-ary sum of products. A concise definition of sum of products is that I want an expression without any parentheses, with all parens represented by associativity and priority.
We can adjust the solutions above so that the sums are reassociated. The easiest way is replacing the outer BT in expandSOP with NonEmpty. Given that the multiplication there is, much like you suggest, liftA2 (<>), this works straight away.
expandFlat :: E a -> E a
expandFlat = cata algS . fmap (cata algP) . cata algSOP
where
algSOP :: EF a (NonEmpty (BT a)) -> NonEmpty (BT a)
algSOP = \case
Var' s -> pure (Leaf s)
Add' x y -> x <> y
Mul' x y -> (<>) <$> x <*> y
algP :: BTF a (E a) -> E a
algP = \case
Leaf' s -> Var s
Branch' x y -> Mul x y
algS :: NonEmptyF (E a) (E a) -> E a
algS = \case
NonEmptyF x Nothing -> x
NonEmptyF x (Just y) -> Add x y
Another option is using any of the other solutions and reassociating the sums in the distributed tree in a separate step.
flattenSum :: E a -> E a
flattenSum = cata alg
where
alg = \case
Add' x y -> apo coalg (x, y)
x -> embed x
coalg = \case
(Add x x', y) -> Add' (Left x) (Right (x', y))
(x, y) -> Add' (Left x) (Left y)
We can also roll flattenSum and expandEvert into a single function. Note that the sum coalgebra needs an extra case when it gets the result of the distribution coalgebra. That happens because, as the coalgebra proceeds from top to bottom, we can't be sure that the subtrees it generates are properly associated.
-- This is written in a slightly different style than the previous functions.
expandDirect :: E a -> E a
expandDirect = cata alg
where
alg = \case
Var' s -> Var s
Add' x y -> apo coalgAdd (Add x y)
Mul' x y -> (apo coalgAdd' . apo coalgMul) (Mul x y)
coalgAdd = \case
Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
x -> Left <$> project x
coalgAdd' = \case
Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
Add x (Add y y') -> Add' (Left x) (Right (Add y y'))
x -> Left <$> project x
coalgMul = \case
Mul (Add x x') y -> Add' (Right (Mul x y)) (Right (Mul x' y))
Mul x (Add y y') -> Add' (Right (Mul x y)) (Right (Mul x y'))
x -> Left <$> project x
Perhaps there is a more clever way of writing expandDirect, but I haven't figured it out yet.

Memoisation with auxiliary parameter in Haskell

I have a recursive function f that takes two parameters x and y. The function is uniquely determined by the first parameter; the second one merely makes things easier.
I now want to memoise that function w.r.t. its first parameter while ignoring the second one. (I.e. f is evaluated at most once for every value of x.)
What is the easiest way to do that? At the moment, I simply define an array containing all values recursively, but that is a somewhat ad-hoc solution. I would prefer some kind of memoisation combinator that I can just throw at my function.
EDIT: to clarify, the function f takes a pair of integers and a list. The first integer is some parameter value, the second one denotes the index of an element in some global list xs to consume.
To avoid indexing the list, I pass the partially consumed list to f as well, but obviously, the invariant is that if the first parameter is (m, n), the second one will always be drop n xs, so the result is uniquely determined by the first parameter.
Just using a memoisation combinator on the partially applied function will not work, since that will leave an unevaluated thunk \xs -> … lying around. I could probably wrap the two parameters in a datatype whose Eq instance ignores the second value (and similarly for other instances), but that seems like a very ad-hoc solution. Is there not an easier way?
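For concreteness, such a wrapper might look like this (the name MemoKey is made up here; Eq and Ord look only at the (m, n) key and ignore the partially consumed list):
data MemoKey = MemoKey { memoKey :: (Int, Int), memoRest :: [(Int, Int)] }

instance Eq MemoKey where
  a == b = memoKey a == memoKey b

instance Ord MemoKey where
  compare a b = compare (memoKey a) (memoKey b)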
EDIT2: The concrete function I want to memoise:
g :: [(Int, Int)] -> Int -> Int
g xs n = f 0 n
where f :: Int -> Int -> Int
f _ 0 = 0
f m n
| m == length xs = 0
| w > n = f (m + 1) n
| otherwise = maximum [f (m + 1) n, v + f (m + 1) (n - w)]
where (w, v) = xs !! m
To avoid the expensive indexing operation, I instead pass the partially-consumed list to f as well:
g' :: [(Int, Int)] -> Int -> Int
g' xs n = f xs 0 n
where f :: [(Int, Int)] -> Int -> Int -> Int
f [] _ _ = 0
f _ _ 0 = 0
f ((w,v) : xs) m n
| w > n = f xs (m + 1) n
| otherwise = maximum [f xs (m + 1) n, v + f xs (m + 1) (n - w)]
Memoisation of f w.r.t. the list parameter is, of course, unnecessary, since the list does not (morally) influence the result. I would therefore like the memoisation to simply ignore the list parameter.
Your function is unnecessarily complicated. You don't need the index m at all:
foo :: [(Int, Int)] -> Int -> Int
foo [] _ = 0
foo _ 0 = 0
foo ((w,v):xs) n
| w > n = foo xs n
| otherwise = foo xs n `max` (foo xs (n - w) + v)
Now if you want to memoize foo then both the arguments must be considered (as it should be).
We'll use the monadic memoization mixin method to memoize foo:
First, we create an uncurried version of foo (because we want to memoize both arguments):
foo' :: ([(Int, Int)], Int) -> Int
foo' ([], _) = 0
foo' (_, 0) = 0
foo' ((w,v):xs, n)
| w > n = foo' (xs, n)
| otherwise = foo' (xs, n) `max` (foo' (xs, n - w) + v)
Next, we monadify the function foo' (because we want to thread a memo table in the function):
foo' :: Monad m => ([(Int, Int)], Int) -> m Int
foo' ([], _) = return 0
foo' (_, 0) = return 0
foo' ((w,v):xs, n)
| w > n = foo' (xs, n)
| otherwise = do
a <- foo' (xs, n)
b <- foo' (xs, n - w)
return (a `max` (b + v))
Then, we open the self-reference in foo' (because we want to call the memoized function):
type Endo a = a -> a
foo' :: Monad m => Endo (([(Int, Int)], Int) -> m Int)
foo' _ ([], _) = return 0
foo' _ (_, 0) = return 0
foo' self ((w,v):xs, n)
| w > n = self (xs, n)
| otherwise = do
a <- self (xs, n)
b <- self (xs, n - w)
return (a `max` (b + v))
We'll use the following memoization mixin to memoize our function foo':
type Dict a b m = (a -> m (Maybe b), a -> b -> m ())
memo :: Monad m => Dict a b m -> Endo (a -> m b)
memo (check, store) super a = do
b <- check a
case b of
Just b -> return b
Nothing -> do
b <- super a
store a b
return b
Our dictionary (memo table) will use the State monad and a Map data structure:
import Prelude hiding (lookup)
import Control.Monad.State
import Data.Map.Strict
mapDict :: Ord a => Dict a b (State (Map a b))
mapDict = (check, store) where
check a = gets (lookup a)
store a b = modify (insert a b)
Finally, we combine everything to create a memoized function memoFoo:
import Data.Function (fix)
type MapMemoized a b = a -> State (Map a b) b
memoFoo :: MapMemoized ([(Int, Int)], Int) Int
memoFoo = fix (memo mapDict . foo')
We can recover the original function foo as follows:
foo :: [(Int, Int)] -> Int -> Int
foo xs n = evalState (memoFoo (xs, n)) empty
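As a quick sanity check (the item list here is made up; pairs are (weight, value) and the second argument is the capacity):
foo [(2,3), (3,4), (4,5)] 5 -- 7: take the weight-2 and weight-3 items
foo [(2,3), (3,4), (4,5)] 4 -- 5: the weight-4 item alone is best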
Hope that helps.

How to make catamorphisms work with parameterized/indexed types?

I recently learned a bit about F-algebras:
https://www.fpcomplete.com/user/bartosz/understanding-algebras.
I wanted to lift this functionality to more advanced types (indexed and higher-kinded).
Furthermore, I checked "Giving Haskell a Promotion" (http://research.microsoft.com/en-us/people/dimitris/fc-kind-poly.pdf), which was very helpful because it gave names to my own vague "inventions".
However, I cannot seem to create a unified approach that works for all shapes.
Algebras need some "carrier type", but the structure we're traversing expects a certain shape (itself, applied recursively), so I came up with a "Dummy" container that can carry any type, but is shaped as expected. I then use a type family to couple these.
This approach seems to work, leading to a fairly generic signature for my 'cata' function.
However, the other things I use (Mu, Algebra) still need separate versions for each shape, just for passing a bunch of type variables around. I was hoping something like PolyKinds could help (which I use successfully to shape the dummy type), but it seems it is only meant to work the other way around.
As IFunctor1 and IFunctor2 do not have extra variables, I tried to unify them by attaching (via type family) the index-preserving-function type, but this seems not allowed because of the existential quantification, so I'm left with multiple versions there too.
Is there any way to unify these 2 cases? Did I overlook some tricks, or is this just a limitation for now?
Are there other things that can be simplified?
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE Rank2Types #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
module Cata where
-- 'Fix' for indexed types (1 index)
newtype Mu1 f a = Roll1 { unRoll1 :: f (Mu1 f) a }
deriving instance Show (f (Mu1 f) a) => Show (Mu1 f a)
-- 'Fix' for indexed types (2 index)
newtype Mu2 f a b = Roll2 { unRoll2 :: f (Mu2 f) a b }
deriving instance Show (f (Mu2 f) a b) => Show (Mu2 f a b)
-- index-preserving function (1 index)
type s :-> t = forall i. s i -> t i
-- index-preserving function (2 index)
type s :--> t = forall i j. s i j -> t i j
-- indexed functor (1 index)
class IFunctor1 f where
imap1 :: (s :-> t) -> (f s :-> f t)
-- indexed functor (2 index)
class IFunctor2 f where
imap2 :: (s :--> t) -> (f s :--> f t)
-- dummy container type to store a solid result type
-- the shape should follow an indexed type
type family Dummy (x :: i -> k) :: * -> k
type Algebra1 f a = forall t. f ((Dummy f) a) t -> (Dummy f) a t
type Algebra2 f a = forall s t. f ((Dummy f) a) s t -> (Dummy f) a s t
cata1 :: IFunctor1 f => Algebra1 f a -> Mu1 f t -> (Dummy f) a t
cata1 alg = alg . imap1 (cata1 alg) . unRoll1
cata2 :: IFunctor2 f => Algebra2 f a -> Mu2 f s t -> (Dummy f) a s t
cata2 alg = alg . imap2 (cata2 alg) . unRoll2
And 2 example structures to work with:
ExprF1 seems like a normal useful thing, attaching an embedded type to an object language.
ExprF2 is contrived (extra argument which happens to be lifted (DataKinds) as well), just to find out if the "generic" cata2 is able to handle these shapes.
-- our indexed type, which we want to use in an F-algebra (1 index)
data ExprF1 f t where
ConstI1 :: Int -> ExprF1 f Int
ConstB1 :: Bool -> ExprF1 f Bool
Add1 :: f Int -> f Int -> ExprF1 f Int
Mul1 :: f Int -> f Int -> ExprF1 f Int
If1 :: f Bool -> f t -> f t -> ExprF1 f t
deriving instance (Show (f t), Show (f Bool)) => Show (ExprF1 f t)
-- our indexed type, which we want to use in an F-algebra (2 index)
data ExprF2 f s t where
ConstI2 :: Int -> ExprF2 f Int True
ConstB2 :: Bool -> ExprF2 f Bool True
Add2 :: f Int True -> f Int True -> ExprF2 f Int True
Mul2 :: f Int True -> f Int True -> ExprF2 f Int True
If2 :: f Bool True -> f t True -> f t True -> ExprF2 f t True
deriving instance (Show (f s t), Show (f Bool t)) => Show (ExprF2 f s t)
-- mapper for f-algebra (1 index)
instance IFunctor1 ExprF1 where
imap1 _ (ConstI1 x) = ConstI1 x
imap1 _ (ConstB1 x) = ConstB1 x
imap1 eval (x `Add1` y) = eval x `Add1` eval y
imap1 eval (x `Mul1` y) = eval x `Mul1` eval y
imap1 eval (If1 p t e) = If1 (eval p) (eval t) (eval e)
-- mapper for f-algebra (2 index)
instance IFunctor2 ExprF2 where
imap2 _ (ConstI2 x) = ConstI2 x
imap2 _ (ConstB2 x) = ConstB2 x
imap2 eval (x `Add2` y) = eval x `Add2` eval y
imap2 eval (x `Mul2` y) = eval x `Mul2` eval y
imap2 eval (If2 p t e) = If2 (eval p) (eval t) (eval e)
-- turned into a nested expression
type Expr1 = Mu1 ExprF1
-- turned into a nested expression
type Expr2 = Mu2 ExprF2
-- dummy containers
newtype X1 x y = X1 x deriving Show
newtype X2 x y z = X2 x deriving Show
type instance Dummy ExprF1 = X1
type instance Dummy ExprF2 = X2
-- a simple example algebra that evaluates the expression
-- turning bools into 0/1
alg1 :: Algebra1 ExprF1 Int
alg1 (ConstI1 x) = X1 x
alg1 (ConstB1 False) = X1 0
alg1 (ConstB1 True) = X1 1
alg1 ((X1 x) `Add1` (X1 y)) = X1 $ x + y
alg1 ((X1 x) `Mul1` (X1 y)) = X1 $ x * y
alg1 (If1 (X1 0) _ (X1 e)) = X1 e
alg1 (If1 _ (X1 t) _) = X1 t
alg2 :: Algebra2 ExprF2 Int
alg2 (ConstI2 x) = X2 x
alg2 (ConstB2 False) = X2 0
alg2 (ConstB2 True) = X2 1
alg2 ((X2 x) `Add2` (X2 y)) = X2 $ x + y
alg2 ((X2 x) `Mul2` (X2 y)) = X2 $ x * y
alg2 (If2 (X2 0) _ (X2 e)) = X2 e
alg2 (If2 _ (X2 t) _) = X2 t
-- simple helpers for construction
ci1 :: Int -> Expr1 Int
ci1 = Roll1 . ConstI1
cb1 :: Bool -> Expr1 Bool
cb1 = Roll1 . ConstB1
if1 :: Expr1 Bool -> Expr1 a -> Expr1 a -> Expr1 a
if1 p t e = Roll1 $ If1 p t e
add1 :: Expr1 Int -> Expr1 Int -> Expr1 Int
add1 x y = Roll1 $ Add1 x y
mul1 :: Expr1 Int -> Expr1 Int -> Expr1 Int
mul1 x y = Roll1 $ Mul1 x y
ci2 :: Int -> Expr2 Int True
ci2 = Roll2 . ConstI2
cb2 :: Bool -> Expr2 Bool True
cb2 = Roll2 . ConstB2
if2 :: Expr2 Bool True -> Expr2 a True -> Expr2 a True -> Expr2 a True
if2 p t e = Roll2 $ If2 p t e
add2 :: Expr2 Int True -> Expr2 Int True -> Expr2 Int True
add2 x y = Roll2 $ Add2 x y
mul2 :: Expr2 Int True -> Expr2 Int True -> Expr2 Int True
mul2 x y = Roll2 $ Mul2 x y
-- test case
test1 :: Expr1 Int
test1 = if1 (cb1 True)
(ci1 3 `mul1` ci1 4 `add1` ci1 5)
(ci1 2)
test2 :: Expr2 Int True
test2 = if2 (cb2 True)
(ci2 3 `mul2` ci2 4 `add2` ci2 5)
(ci2 2)
main :: IO ()
main = let (X1 x1) = cata1 alg1 test1
(X2 x2) = cata2 alg2 test2
in do print x1
print x2
Output:
17
17
I wrote a talk on this topic called "Slicing It" in 2009. It certainly points to the work by my Strathclyde colleagues, Johann and Ghani, on initial algebra semantics for GADTs. I used the notation which SHE provides for writing data-indexed types, but that has pleasingly been superseded by the "promotion" story.
The key point of the talk is, as per my comment, to be systematic about using exactly one index, but to exploit the fact that its kind can vary.
So indeed, we have (using my current preferred "Goscinny and Uderzo" names)
type s :-> t = forall i. s i -> t i
class FunctorIx f where
mapIx :: (s :-> t) -> (f s :-> f t)
Now you can show FunctorIx is closed under fixpoints. The key is to combine two indexed sets into one that offers a choice of index.
data Case (f :: i -> *) (g :: j -> *) (b :: Either i j) :: * where
L :: f i -> Case f g (Left i)
R :: g j -> Case f g (Right j)
(<?>) :: (f :-> f') -> (g :-> g') -> Case f g :-> Case f' g'
(f <?> g) (L x) = L (f x)
(f <?> g) (R x) = R (g x)
Now we can take fixpoints of functors whose "contained elements" stand for either "payload" or "recursive substructures".
data MuIx (f :: (Either i j -> *) -> j -> *) :: (i -> *) -> j -> * where
InIx :: f (Case x (MuIx f x)) j -> MuIx f x j
As a result, we can mapIx over "payload"...
instance FunctorIx f => FunctorIx (MuIx f) where
mapIx f (InIx xs) = InIx (mapIx (f <?> mapIx f) xs)
...or write a catamorphism over the "recursive substructures"...
foldIx :: FunctorIx f => (f (Case x t) :-> t) -> MuIx f x :-> t
foldIx f (InIx xs) = f (mapIx (id <?> foldIx f) xs)
...or both at once.
mapFoldIx :: FunctorIx f => (x :-> y) -> (f (Case y t) :-> t) -> MuIx f x :-> t
mapFoldIx e f (InIx xs) = f (mapIx (e <?> mapFoldIx e f) xs)
The joy of FunctorIx is that it has such splendid closure properties, thanks to the ability to vary the indexing kinds. MuIx allows for notions of payload, and can be iterated. There is thus an incentive to work with structured indices rather than multiple indices.
If I understand it properly, this is precisely the problem tackled by Johann and Ghani's "Initial Algebra Semantics is Enough!"
https://personal.cis.strath.ac.uk/neil.ghani/papers/ghani-tlca07.pdf
See in particular their hfold
Edit: For the GADT case, see their later paper "Foundations for Structured Programming using GADTs". Note that they encounter an obstacle that can be resolved using PolyKinds, which we now have: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.111.2948
This blog post may also be of interest: http://www.timphilipwilliams.com/posts/2013-01-16-fixing-gadts.html

How can I check if a BST is valid?

How can I check if a BST is a valid one, given its definition and using a generalized version of fold for BST?
data (Ord a, Show a, Read a) => BST a = Void | Node {
val :: a,
left, right :: BST a
} deriving (Eq, Ord, Read, Show)
fold :: (Read a, Show a, Ord a) => (a -> b -> b -> b) -> b -> BST a -> b
fold _ z Void = z
fold f z (Node x l r) = f x (fold f z l) (fold f z r)
The idea is to check that a node value is greater than all values in its left subtree and smaller than all values in its right subtree. This must be True for all nodes in the tree. A function bstList simply outputs the list of (ordered) values in the BST.
Of course something like this won't work:
--isBST :: (Read a, Show a, Ord a) => BST a -> Bool
isBST t = fold (\x l r -> all (<x) (bstList l) && all (>x) (bstList r)) (True) t
because, for example, applying the fold function to the node 19 ends up as all (<19) (bstList True) && all (>19) (bstList True).
Your problem seems to be that you lose information because your function only returns a boolean when it examines the left and right subtrees. So change it to also return the minimum and maximum values of the subtrees. (This is probably more efficient as well, since you don't need to use bstList to check all elements anymore.)
And make a wrapper function to ignore these "auxiliary" values after you are done, of course.
(Please don't put typeclass constraints on the data type.)
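A minimal sketch of that suggestion (not from the original answer; the result triple and the name isBSTMinMax are made up), using the fold from the question. Each subtree folds to Nothing (empty) or Just (minimum, maximum, valid):
isBSTMinMax :: (Read a, Show a, Ord a) => BST a -> Bool
isBSTMinMax = maybe True (\(_, _, ok) -> ok) . fold step Nothing
  where
    step x l r = Just (minL, maxR, okL && okR)
      where
        minL = case l of Nothing -> x; Just (lo, _, _) -> lo
        maxR = case r of Nothing -> x; Just (_, hi, _) -> hi
        okL  = case l of Nothing -> True; Just (_, hi, ok) -> ok && hi < x
        okR  = case r of Nothing -> True; Just (lo, _, ok) -> ok && x < lo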
A BST is valid iff an in-order traversal is monotonically increasing.
flatten tree = fold (\a l r -> l . (a:) . r) id tree []
ordered list@(_:rest) = and $ zipWith (<) list rest
ordered _ = True
isBST = ordered . flatten
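A quick check of this version (not from the original answer; the example trees are made up):
flatten (Node 2 (Node 1 Void Void) (Node 3 Void Void)) -- [1,2,3]
isBST   (Node 2 (Node 1 Void Void) (Node 3 Void Void)) -- True
isBST   (Node 1 (Node 2 Void Void) Void)               -- False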
A nice way of encoding this is to lean on the traversal provided by Data.Foldable.
{-# LANGUAGE DeriveFunctor, DeriveFoldable #-}
import Data.Foldable
import Data.Monoid
We can derive an instance of it automatically using an extension, but we need to reorder the fields of the Node constructor to provide us an in-order traversal.
While we're at it, we should eliminate the constraints on the data type itself. They actually provide no benefit and have been removed from the language as of Haskell 2011. (When you want to use such constraints you should put them on instances of classes, not on the data type.)
data BST a
= Void
| Node
{ left :: BST a
, val :: a
, right :: BST a
} deriving (Eq, Ord, Read, Show, Foldable)
First we define what it means for a list to be strictly sorted.
sorted :: Ord a => [a] -> Bool
sorted [] = True
sorted [x] = True
sorted (x:xs) = x < head xs && sorted xs
-- head is safe because of the preceding match.
Then we can use the toList method provided by Data.Foldable and the above helper.
isBST :: Ord a => BST a -> Bool
isBST = sorted . toList
We can also implement this more directly, like you asked. Since we removed the spurious constraints on the data type, we can simplify the definition of your fold.
cata :: (b -> a -> b -> b) -> b -> BST a -> b
cata _ z Void = z
cata f z (Node l x r) = f (cata f z l) x (cata f z r)
Now we need a data type to model the result of our catamorphism: either we have no nodes (Z), we have a range of strictly increasing nodes (T), or we have failed (X).
data T a = Z | T a a | X deriving Eq
And we can then implement isBST directly
isBST' :: Ord a => BST a -> Bool
isBST' b = cata phi Z b /= X where
phi X _ _ = X
phi _ _ X = X
phi Z a Z = T a a
phi Z a (T b c) = if a < b then T a c else X
phi (T a b) c Z = if b < c then T a c else X
phi (T a b) c (T d e) = if b < c && c < d then T a e else X
This is a bit tedious, so perhaps it would be better to decompose the way we compose the interim states a bit:
cons :: Ord a => a -> T a -> T a
cons _ X = X
cons a Z = T a a
cons a (T b c) = if a < b then T a c else X
instance Ord a => Monoid (T a) where
mempty = Z
Z `mappend` a = a
a `mappend` Z = a
X `mappend` _ = X
_ `mappend` X = X
T a b `mappend` T c d = if b < c then T a d else X
isBST'' :: Ord a => BST a -> Bool
isBST'' b = cata phi Z b /= X where
phi l a r = l `mappend` cons a r
Personally, I'd probably just use the Foldable instance.
If you don't insist on using a fold you can do it like this:
ord Void = True
ord (Node v l r) = every (< v) l && every (> v) r && ord l && ord r where
every p Void = True
every p (Node v l r) = p v && every p l && every p r
