Pattern Matching in a Haskell Function - haskell

I have many methods that have boilerplate code in their definition, look at the example above.
replace:: Term -> Term -> Formula -> Formula
replace x y (Not f) = Not $ replace x y f
replace x y (And f g) = And (replace x y f) (replace x y g)
replace x y (Or f g) = Or (replace x y f) (replace x y g)
replace x y (Biimp f g) = Biimp (replace x y f) (replace x y g)
replace x y (Imp f g) = Imp (replace x y f) (replace x y g)
replace x y (Forall z f) = Forall z (replace x y f)
replace x y (Exists z f) = Exists z (replace x y f)
replace x y (Pred idx ts) = Pred idx (replace_ x y ts)
As you can see, the definitions for replace function follows a pattern. I want to have the same behavior of the function, simplifying his definition, probably using some pattern matching, maybe with a wildcard _ or X over the arguments, something like:
replace x y (X f g) = X (replace x y f) (replace x y g)
For avoiding the following definitions:
replace x y (And f g) = And (replace x y f) (replace x y g)
replace x y (Or f g) = Or (replace x y f) (replace x y g)
replace x y (Biimp f g) = Biimp (replace x y f) (replace x y g)
replace x y (Imp f g) = Imp (replace x y f) (replace x y g)
Is there some way? Forget about the purpose of the function, it could be whatever.

If you have many constructors that should be treated in a uniform way, you should make your data type reflect that.
data BinOp = BinAnd | BinOr | BinBiimp | BinImp
data Quantifier = QForall | QExists
data Formula = Not Formula
| Binary BinOp Formula Formula -- key change here
| Quantified Quantifier Formula
| Pred Index [Formula]
Now the pattern match for all binary operators is much easier:
replace x y (Binary op f g) = Binary op (replace x y f) (replace x y g)
To preserve existing code, you can turn on PatternSynonyms and define the old versions of And, Or, and so on back into existence:
pattern And x y = Binary BinAnd x y
pattern Forall f = Quantified QForall f

I'm not entirely sure this is what you are looking for but you could do the following. The idea is that you can consider a formula to be abstracted over another type (usually a Term in your case). Then, you can define what it means to map over a formula. I tried to replicate your data definitions, although I have some problems with Formula - namely that all the constructors seem to require another Formula...
{-# LANGUAGE DeriveFunctor #-}
data Formula a
= Not (Formula a)
| And (Formula a) (Formula a)
| Or (Formula a) (Formula a)
| Biimp (Formula a) (Formula a)
| Imp (Formula a) (Formula a)
| Forall a (Formula a)
| Exists a (Formula a)
| Pred a (Formula a)
deriving (Functor)
data Term = Term String {- However you define it, doesn't matter -} deriving (Eq)
replace :: (Functor f, Eq a) => a -> a -> f a -> f a
replace x y = fmap (\z -> if x == z then y else z)
The interesting thing to note is that now the replace function can be applied to anything that is a functor - it even serves as replace for a list!
replace 3 9 [1..6] = [1,2,9,4,5,6]
EDIT As an afterthought, if you are implementing a substitution style replace where terms in formulas can be shadowed (the usual scoping rules), you will probably end up doing something like this:
replace' :: (Eq a) => a -> a -> Formula a -> Formula a
replace' x y f#(Forall z _) | x == z = f
replace' x y f#(Exists z _) | x == z = f
replace' x y f#(Pred z _) | x == z = f
replace' x y formula = fmap (replace' x y) formula
Which isn't as cute, but also isn't as straightforward pf a problem.

Data.Functor.Foldable abstracts the pattern of recursive data structures:
import Data.Functor.Foldable
data FormulaF t
= Not t
| And t t
| Or t t
| Biimp t t
| Imp t t
| Forall A t
| Exists A t
| Pred B C
deriving (Functor, Foldable, Traversable)
type Formula = Fix FormulaF
replace :: Term -> Term -> Formula -> Formula
replace x y = cata $ \case ->
Pred idx ts -> Pred idx (replace_ x y ts)
f -> f
By the way, beware of replace x y (Forall x (f x)) = Forall x (f y): Substitution is the process of replacing all free occurences of a variable in an expression with an expression.

Related

A traversal as data

I heard about this construction which is loosely described as “a traversal represented in data, applied to some structure, without the need for the applicative”
It can be defined as:
data X a b r =
| Done r
| Step a (X a b (b -> r))
A word description would be as follows:
the type X a b r describes the shape of a structure
which contains things of type a
and for each a you get the opportunity to produce something of type b
and provided you do that for each a,
you get something of type r.
Thus a “traversal” of a list, [a], has type X a b [b], because if you can turn each a of the list into a b then you get a [b].
My question is: what is this thing called? Is there a reference to more information about it?
Example usage:
instance Functor (X a b) where
fmap f (Done r) = f r
fmap f (Step a next) = Step a (fmap (f .) next)
f :: [a] -> X a b [b]
f [] = Done []
f (a:as) = Step a (fmap (flip (:)) as)
g :: Applicative f => (a -> f b) -> X a b r -> f r
g f (Done r) = pure r
g f (Step a next) = g f next <*> f a
More generally:
instance Applicative (X a b) where
pure x = Done x
Done f <*> y = fmap (\y -> f y) y
Step a next <*> y = Step a (fmap flip next <*> y)
t :: Traversable t => t a -> X a b (t b)
t = traverse (\a -> Step a (Done id))
And, assuming I haven’t made any errors, we should find that:
flip g . t == traverse
Edit: I’ve thought about this some more. There is something this doesn’t have which a traversal has: a traversal can split up the computation into something that isn’t “one at a time,” for example to traverse a binary tree one can traverse the left and right half “in parallel.” Here is a structure that I think gives the same effect:
data Y a b r =
| Done r
| One a (b -> r)
| forall s t. Split (Y a b s) (Y a b t) (s -> t -> r)
(Slightly vague syntax as I don’t remember it and don’t want to write this as a gadt)
f1 :: X a b r -> Y a b r
f1 (Done x) = Done x
f1 (Step a next) = Split (One a id) (f1 next) (flip ($))
f2 :: Y a b r -> X a b r
f2 (Done x) = Done x
f2 (One a f) = Step a (Done f)
f2 (Split x y f) = f <$> f2 x <*> f2 y

Explain (.)(.) to me

Diving into Haskell, and while I am enjoying the language I'm finding the pointfree style completely illegible. I've come a across this function which only consists of these ASCII boobies as seen below.
f = (.)(.)
And while I understand its type signature and what it does, I can't for the life of me understand why it does it. So could someone please write out the de-pointfreed version of it for me, and maybe step by step work back to the pointfree version sorta like this:
f g x y = (g x) + y
f g x = (+) (g x)
f g = (+) . g
f = (.) (+)
Generally (?) (where ? stands for an arbitrary infix operator) is the same as \x y -> x ? y. So we can rewrite f as:
f = (\a b -> a . b) (\c d -> c . d)
Now if we apply the argument to the function, we get:
f = (\b -> (\c d -> c . d) . b)
Now b is just an argument to f, so we can rewrite this as:
f b = (\c d -> c . d) . b
The definition of . is f . g = \x -> f (g x). If replace the outer . with its definition, we get:
f b = \x -> (\c d -> c . d) (b x)
Again we can turn x into a regular parameter:
f b x = (\c d -> c . d) (b x)
Now let's replace the other .:
f b x = (\c d y -> c (d y)) (b x)
Now let's apply the argument:
f b x = \d y -> (b x) (d y)
Now let's move the parameters again:
f b x d y = (b x) (d y)
Done.
You can also gradually append arguments to f:
f = ((.) . )
f x = (.) . x
f x y = ((.) . x) y
= (.) (x y)
= ((x y) . )
f x y z = (x y) . z
f x y z t = ((x y) . z) t
= (x y) (z t)
= x y (z t)
= x y $ z t
The result reveals that x and z are actually (binary and unary, respectively) functions, so I'll use different identifiers:
f g x h y = g x (h y)
We can work backwards by "pattern matching" over the combinators' definitions. Given
f a b c d = a b (c d)
= (a b) (c d)
we proceed
= B (a b) c d
= B B a b c d -- writing B for (.)
so by eta-contraction
f = B B
because
a (b c) = B a b c -- bidirectional equation
by definition. Haskell's (.) is actually the B combinator (see BCKW combinators).
edit: Potentially, many combinators can match the same code. That's why there are many possible combinatory encodings for the same piece of code. For example, (ab)(cd) = (ab)(I(cd)) is a valid transformation, which might lead to some other combinator definition matching that. Choosing the "most appropriate" one is an art (or a search in a search space with somewhat high branching factor).
That's about going backwards, as you asked. But if you want to go "forward", personally, I like the combinatory approach much better over the lambda notation fidgeting. I would even just write many arguments right away, and get rid of the extra ones in the end:
BBabcdefg = B(ab)cdefg = (ab)(cd)efg
hence,
BBabcd = B(ab)cd = (ab)(cd)
is all there is to it.

Why does it apply the second argument?

I am trying to understand the Interchange law of applicative functor:
u <*> pure y = pure ($ y) <*> u
What make me confuse is, the function application $ y, consider following example:
($ 2) :: (a -> b) -> b
Why does the second argument get applied not the first?
That's an operator section. A few simple examples:
Prelude> (/2) <$> [1..8]
[0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0]
Prelude> (:"!") <$> ['a'..'e']
["a!","b!","c!","d!","e!"]
The section (:"!") is syntactic sugar for \c -> c:"!", i.e. it takes a character c and prepends it to the string "!".
Likewise, the section ($ 2) takes a function f and simply applies it to the number 2.
Note that this is different from ordinary partial application:
Prelude> ((/) 2) <$> [1..8]
[2.0,1.0,0.6666666666666666,0.5,0.4,0.3333333333333333,0.2857142857142857,0.25]
Here, I've simply applied the function (/) to one fixed argument 2, the dividend. This can also be written as a left section (2/). But the right section (/2) applies 2 as the divisor instead.
You can do that with operator sections. For example:
(5+ ) -- Same as \ x -> 5+x
( +5) -- Same as \ x -> x+5
It's only operators you can do this with; normal named functions can only be curried from left to right.
Haskell cheat sheet operator sections entry could be:
(a `op` b) = (a `op`) b = (`op` b) a = (op) a b
When op is an actual operator (not an alpha-numerical name), backticks aren't needed.
The above can be seen as partially applying implicitly defined lambda expressions:
(a `op`) b = (a `op` b) = (\y -> a `op` y) b = (\x y -> x `op` y) a b = op a b
(`op` b) a = (a `op` b) = (\x -> x `op` b) a = (\y x -> x `op` y) b a = flip op b a
If a function f expects more than two arguments eventually, we can similarly create its curried version by partially applying an explicit lambda expression:
(\y z x -> f x y z) b c -- = (\x -> f x b c)
(\x z y -> f x y z) a c -- = (\y -> f a y c)
(\x y z -> f x y z) a b -- = (\z -> f a b z)
The last case is equivalent to just f a b, and the second to (flip . f) a c:
g b c a = f a b c = flip f b a c = flip (flip f b) c a = (flip . flip f) b c a
g a c b = f a b c = flip (f a) c b = (flip . f) a c b
g a b c = f a b c

Expression expansion using recursion schemes

I have a data type representing arithmetic expressions:
data E = Add E E | Mul E E | Var String
I want to write an expansion function which will convert an expression into sum of products of variables (sort of braces expansion). Using recursion schemes of course.
I only could think of an algorithm in the spirit of "progress and preservation". The algorithm at each step constructs terms that are fully expanded so there is no need to re-check.
The handling of Mul made me crazy, so instead of doing it directly I used an isomorphic type of [[String]] and took advantage of concat and concatMap already implemented for me:
type Poly = [Mono]
type Mono = [String]
mulMonoBy :: Mono -> Poly -> Poly
mulMonoBy x = map (x ++)
mulPoly :: Poly -> Poly -> Poly
mulPoly x = concatMap (flip mulMonoBy x)
So then I just use cata:
expandList :: E -> Poly
expandList = cata $ \case
Var x -> [[x]]
Add e1 e2 = e1 ++ e2
Mul e1 e2 = mulPoly e1 e2
And convert back:
fromPoly :: Poly -> Expr
fromPoly = foldr1 Add . map fromMono where
fromMono = foldr1 Mul . map Var
Are there significantly better approaches?
Upd: There are few confusions.
The solution does allow multiline variable names. Add (Val "foo" (Mul (Val "foo) (Var "bar"))) is a representation of foo + foo * bar. I'm not representing x*y*z with Val "xyz" or something. Note that also as there are no scalars repeated vars such as "foo * foo * quux" are perfectly allowed.
By sum of products I mean sort of "curried" n-ary sum of products. A concise definition of sum of products is that I want an expression without any parentheses, with all parens represented by associativity and priority.
So (foo * bar + bar) + (foo * bar + bar) is not a sum of products as the because of middle + is sum of sums
(foo * bar + (bar + (foo * bar + bar))) or corresponding left-associative version are right answers, although we must guarantee that associativity is always left of always right. So the correct type for right-assoaciative solution is
data Poly = Sum Mono Poly
| Product Mono
which is isomorphic to nonempty lists: NonEmpty Poly (note Sum Mono Poly instead of Sum Poly Poly). If we allow empty sums or products then we get just the list of list representation I used.
Also of you don't care about performance, the multiplication seems to be just liftA2 (++)
I am no expert in recursion schemes, but since it sounds like you are trying to practice them, hopefully you will not find it too onerous to convert a solution using manual recursion to one using recursion schemes. I'll write it with mixed prose and code first, and include the complete code again at the end for simpler copy/pasting.
It is not too difficult to do using simply the distributive property and a bit of recursive algebra. Before we begin, though, let's define a better result type, one that guarantees we can only ever represent sums of products:
data Poly term = Sum (Poly term) (Poly term)
| Product (Mono term)
deriving Show
data Mono term = Term term
| MonoMul (Mono term) (Mono term)
deriving Show
This way we can't possibly mess up and accidentally yield an incorrect result like
(Mul (Var "x") (Add (Var "y") (Var "z")))
Now, let's write our function.
expand :: E -> Poly String
First, a base case: it is trivial to expand a Var, because it is already in sum-of-products form. But we must convert it a bit to fit it into our Poly result type:
expand (Var x) = Product (Term x)
Next, note that it is easy to expand an addition: simply expand the two sub-expressions, and add them together.
expand (Add x y) = Sum (expand x) (expand y)
What about a multiplication? That is a bit more complicated, since
Product (expand x) (expand y)
is ill-typed: we can't multiply polynomials, only monomials. But we do know how to do algebraic manipulation to turn a multiplication of polynomials into a sum of multiplications of monomials, via the distributive rule. As in your question, we'll need a function mulPoly. But let's just assume that exists, and implement it later.
expand (Mul x y) = mulPoly (expand x) (expand y)
That handles all the cases, so all that's left is to implement mulPoly by distributing the multiplications across the two polynomials' terms. We simply break down one of the polynomials one term at a time, and multiply the term across each of the terms in the other polynomial, adding together the results.
mulPoly :: Poly String -> Poly String -> Poly String
mulPoly (Product x) y = mulMonoBy x y
mulPoly (Sum a b) x = Sum (mulPoly a x) (mulPoly b x)
mulMonoBy :: Mono String -> Poly -> Poly
mulMonoBy x (Product y) = Product $ MonoMul x y
mulMonoBy x (Sum a b) = Sum (mulPoly a x') (mulPoly b x')
where x' = Product x
And in the end, we can test that it works as intended:
expand (Mul (Add (Var "a") (Var "b")) (Add (Var "y") (Var "z")))
{- results in: Sum (Sum (Product (MonoMul (Term "y") (Term "a")))
(Product (MonoMul (Term "z") (Term "a"))))
(Sum (Product (MonoMul (Term "y") (Term "b")))
(Product (MonoMul (Term "z") (Term "b"))))
-}
Or,
(a + b)(y * z) = ay + az + by + bz
which we know to be correct.
The complete solution, as promised above:
data E = Add E E | Mul E E | Var String
data Poly term = Sum (Poly term) (Poly term)
| Product (Mono term)
deriving Show
data Mono term = Term term
| MonoMul (Mono term) (Mono term)
deriving Show
expand :: E -> Poly String
expand (Var x) = Product (Term x)
expand (Add x y) = Sum (expand x) (expand y)
expand (Mul x y) = mulPoly (expand x) (expand y)
mulPoly :: Poly String -> Poly String -> Poly String
mulPoly (Product x) y = mulMonoBy x y
mulPoly (Sum a b) x = Sum (mulPoly a x) (mulPoly b x)
mulMonoBy :: Mono String -> Poly String -> Poly String
mulMonoBy x (Product y) = Product $ MonoMul x y
mulMonoBy x (Sum a b) = Sum (mulPoly a x') (mulPoly b x')
where x' = Product x
main = print $ expand (Mul (Add (Var "a") (Var "b")) (Add (Var "y") (Var "z")))
This answer has three sections. The first section, a summary in which I present my two favourite solutions, is the most important one. The second section contains types and imports, as well as extended commentary on the way towards the solutions. The third section focuses on the task of reassociating expressions, something that the original version of the answer (i.e. the second section) had not given due attention.
At the end of the day, I ended up with two solutions worth discussing. The first one is expandDirect (cf. the third section):
expandDirect :: E a -> E a
expandDirect = cata alg
where
alg = \case
Var' s -> Var s
Add' x y -> apo coalgAdd (Add x y)
Mul' x y -> (apo coalgAdd' . apo coalgMul) (Mul x y)
coalgAdd = \case
Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
x -> Left <$> project x
coalgAdd' = \case
Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
Add x (Add y y') -> Add' (Left x) (Right (Add y y'))
x -> Left <$> project x
coalgMul = \case
Mul (Add x x') y -> Add' (Right (Mul x y)) (Right (Mul x' y))
Mul x (Add y y') -> Add' (Right (Mul x y)) (Right (Mul x y'))
x -> Left <$> project x
With it, we rebuild the tree from the bottom (cata). On every branch, if we find something invalid we walk back and rewrite the subtree (apo), redistributing and reassociating as needed until all immediate children are correctly arranged (apo makes it possible to do that without having to rewrite everyting down to the very bottom).
The second solution, expandMeta, is a much simplified version of expandFlat from the third section.
expandMeta :: E a -> E a
expandMeta = apo coalg . cata alg
where
alg = \case
Var' s -> pure (Var s)
Add' x y -> x <> y
Mul' x y -> Mul <$> x <*> y
coalg = \case
x :| [] -> Left <$> project x
x :| (y:ys) -> Add' (Left x) (Right (y :| ys))
expandMeta is a metamorphism; that is, a catamorphism followed by an anamorphism (while we are using apo here as well, an apomorphism is just a fancy kind of anamorphism, so I guess the nomenclature still applies). The catamorphism changes the tree into a non-empty list -- that implicitly handles the reassociation of the Adds -- with the list applicative being used to distribute multiplication (much like you suggest). The coalgebra then quite trivially converts the non-empty list back into a tree with the appropriate shape.
Thank you for the question -- I had a lot of fun with it! Preliminaries:
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Data.Functor.Foldable
import qualified Data.List.NonEmpty as N
import Data.List.NonEmpty (NonEmpty(..))
import Data.Semigroup
import Data.Foldable (toList)
import Data.List (nub)
import qualified Data.Map as M
import Data.Map (Map, (!))
import Test.QuickCheck
data E a = Var a | Add (E a) (E a) | Mul (E a) (E a)
deriving (Eq, Show, Functor, Foldable)
data EF a b = Var' a | Add' b b | Mul' b b
deriving (Eq, Show, Functor)
type instance Base (E a) = EF a
instance Recursive (E a) where
project = \case
Var x -> Var' x
Add x y -> Add' x y
Mul x y -> Mul' x y
instance Corecursive (E a) where
embed = \case
Var' x -> Var x
Add' x y -> Add x y
Mul' x y -> Mul x y
To begin with, my first working (if flawed) attempt, which uses the applicative instance of (non-empty) lists to distribute:
expandTooClever :: E a -> E a
expandTooClever = cata $ \case
Var' s -> Var s
Add' x y -> Add x y
Mul' x y -> foldr1 Add (Mul <$> flatten x <*> flatten y)
where
flatten :: E a -> NonEmpty (E a)
flatten = cata $ \case
Var' s -> pure (Var s)
Add' x y -> x <> y
Mul' x y -> pure (foldr1 Mul (x <> y))
expandTooClever has one relatively serious problem: as it calls flatten, a full-blown fold, for both subtrees whenever it reaches a Mul, it has horrible asymptotics for chains of Mul.
Brute force, simplest-thing-that-could-possibly-work solution, with an algebra that calls itself recursively:
expandBrute :: E a -> E a
expandBrute = cata alg
where
alg = \case
Var' s -> Var s
Add' x y -> Add x y
Mul' (Add x x') y -> Add (alg (Mul' x y)) (alg (Mul' x' y))
Mul' x (Add y y') -> Add (alg (Mul' x y)) (alg (Mul' x y'))
Mul' x y -> Mul x y
The recursive calls are needed because the distribution might introduce new occurrences of Add under Mul.
A slightly more tasteful variant of expandBrute, with the recursive call factored out into a separate function:
expandNotSoBrute :: E a -> E a
expandNotSoBrute = cata alg
where
alg = \case
Var' s -> Var s
Add' x y -> Add x y
Mul' x y -> dis x y
dis (Add x x') y = Add (dis x y) (dis x' y)
dis x (Add y y') = Add (dis x y) (dis x y')
dis x y = Mul x y
A tamed expandNotSoBrute, with dis being turned into an apomorphism. This way of phrasing it expresses nicely the big picture of what is going on: if you only have Vars and Adds, you can trivially reproduce the tree bottom-up without a care in the world; if you hit a Mul, however, you have to go back and reconstuct the whole subtree to perform the distributions (I wonder is there is a specialised recursion scheme that captures this pattern).
expandEvert :: E a -> E a
expandEvert = cata alg
where
alg :: EF a (E a) -> E a
alg = \case
Var' s -> Var s
Add' x y -> Add x y
Mul' x y -> apo coalg (x, y)
coalg :: (E a, E a) -> EF a (Either (E a) (E a, E a))
coalg (Add x x', y) = Add' (Right (x, y)) (Right (x', y))
coalg (x, Add y y') = Add' (Right (x, y)) (Right (x, y'))
coalg (x, y) = Mul' (Left x) (Left y)
apo is necessary because we want to anticipate the final result if there is nothing else to distribute. (There is a way to write it with ana; however, that requires wastefully rebuilding trees of Muls without changes, which leads to the same asymptotics problem expandTooClever had.)
Last, but not least, a solution which is both a successful realisation of what I had attempted with expandTooClever and my interpretation of amalloy's answer. BT is a garden-variety binary tree with values on the leaves. A product is represented by a BT a, while a sum of products is a tree of trees.
expandSOP :: E a -> E a
expandSOP = cata algS . fmap (cata algP) . cata algSOP
where
algSOP :: EF a (BT (BT a)) -> BT (BT a)
algSOP = \case
Var' s -> pure (pure s)
Add' x y -> x <> y
Mul' x y -> (<>) <$> x <*> y
algP :: BTF a (E a) -> E a
algP = \case
Leaf' s -> Var s
Branch' x y -> Mul x y
algS :: BTF (E a) (E a) -> E a
algS = \case
Leaf' x -> x
Branch' x y -> Add x y
BT and its instances:
data BT a = Leaf a | Branch (BT a) (BT a)
deriving (Eq, Show)
data BTF a b = Leaf' a | Branch' b b
deriving (Eq, Show, Functor)
type instance Base (BT a) = BTF a
instance Recursive (BT a) where
project (Leaf s) = Leaf' s
project (Branch l r) = Branch' l r
instance Corecursive (BT a) where
embed (Leaf' s) = Leaf s
embed (Branch' l r) = Branch l r
instance Semigroup (BT a) where
l <> r = Branch l r
-- Writing this, as opposed to deriving it, for the sake of illustration.
instance Functor BT where
fmap f = cata $ \case
Leaf' x -> Leaf (f x)
Branch' l r -> Branch l r
instance Applicative BT where
pure x = Leaf x
u <*> v = ana coalg (u, v)
where
coalg = \case
(Leaf f, Leaf x) -> Leaf' (f x)
(Leaf f, Branch xl xr) -> Branch' (Leaf f, xl) (Leaf f, xr)
(Branch fl fr, v) -> Branch' (fl, v) (fr, v)
To wrap things up, a test suite:
newtype TestE = TestE { getTestE :: E Char }
deriving (Eq, Show)
instance Arbitrary TestE where
arbitrary = TestE <$> sized genExpr
where
genVar = Var <$> choose ('a', 'z')
genAdd n = Add <$> genSub n <*> genSub n
genMul n = Mul <$> genSub n <*> genSub n
genSub n = genExpr (n `div` 2)
genExpr = \case
0 -> genVar
n -> oneof [genVar, genAdd n, genMul n]
data TestRig b = TestRig (Map Char b) (E Char)
deriving (Show)
instance Arbitrary b => Arbitrary (TestRig b) where
arbitrary = do
e <- genExpr
d <- genDict e
return (TestRig d e)
where
genExpr = getTestE <$> arbitrary
genDict x = M.fromList . zip (keys x) <$> (infiniteListOf arbitrary)
keys = nub . toList
unsafeSubst :: Ord a => Map a b -> E a -> E b
unsafeSubst dict = fmap (dict !)
eval :: Num a => E a -> a
eval = cata $ \case
Var' x -> x
Add' x y -> x + y
Mul' x y -> x * y
evalRig :: (E Char -> E Char) -> TestRig Integer -> Integer
evalRig f (TestRig d e) = eval (unsafeSubst d (f e))
mkPropEval :: (E Char -> E Char) -> TestRig Integer -> Bool
mkPropEval f = (==) <$> evalRig id <*> evalRig f
isDistributed :: E a -> Bool
isDistributed = para $ \case
Add' (_, x) (_, y) -> x && y
Mul' (Add _ _, _) _ -> False
Mul' _ (Add _ _, _) -> False
Mul' (_, x) (_, y) -> x && y
_ -> True
mkPropDist :: (E Char -> E Char) -> TestE -> Bool
mkPropDist f = isDistributed . f . getTestE
main = mapM_ test
[ ("expandTooClever" , expandTooClever)
, ("expandBrute" , expandBrute)
, ("expandNotSoBrute", expandNotSoBrute)
, ("expandEvert" , expandEvert)
, ("expandSOP" , expandSOP)
]
where
test (header, func) = do
putStrLn $ "Testing: " ++ header
putStr "Evaluation test: "
quickCheck $ mkPropEval func
putStr "Distribution test: "
quickCheck $ mkPropDist func
By sum of products I mean sort of "curried" n-ary sum of products. A concise definition of sum of products is that I want an expression without any parentheses, with all parens represented by associativity and priority.
We can adjust the solutions above so that the sums are reassociated. The easiest way is replacing the outer BT in expandSOP with NonEmpty. Given that the multiplication there is, much like you suggest, liftA2 (<>), this works straight away.
expandFlat :: E a -> E a
expandFlat = cata algS . fmap (cata algP) . cata algSOP
where
algSOP :: EF a (NonEmpty (BT a)) -> NonEmpty (BT a)
algSOP = \case
Var' s -> pure (Leaf s)
Add' x y -> x <> y
Mul' x y -> (<>) <$> x <*> y
algP :: BTF a (E a) -> E a
algP = \case
Leaf' s -> Var s
Branch' x y -> Mul x y
algS :: NonEmptyF (E a) (E a) -> E a
algS = \case
NonEmptyF x Nothing -> x
NonEmptyF x (Just y) -> Add x y
Another option is using any of the other solutions and reassociating the sums in the distributed tree in a separate step.
flattenSum :: E a -> E a
flattenSum = cata alg
where
alg = \case
Add' x y -> apo coalg (x, y)
x -> embed x
coalg = \case
(Add x x', y) -> Add' (Left x) (Right (x', y))
(x, y) -> Add' (Left x) (Left y)
We can also roll flattenSum and expandEvert into a single function. Note that the sum coalgebra needs an extra case when it gets the result of the distribution coalgebra. That happens because, as the coalgebra proceeds from top to bottom, we can't be sure that the subtrees it generates are properly associated.
-- This is written in a slightly different style than the previous functions.
expandDirect :: E a -> E a
expandDirect = cata alg
where
alg = \case
Var' s -> Var s
Add' x y -> apo coalgAdd (Add x y)
Mul' x y -> (apo coalgAdd' . apo coalgMul) (Mul x y)
coalgAdd = \case
Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
x -> Left <$> project x
coalgAdd' = \case
Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
Add x (Add y y') -> Add' (Left x) (Right (Add y y'))
x -> Left <$> project x
coalgMul = \case
Mul (Add x x') y -> Add' (Right (Mul x y)) (Right (Mul x' y))
Mul x (Add y y') -> Add' (Right (Mul x y)) (Right (Mul x y'))
x -> Left <$> project x
Perhaps there is a more clever way of writing expandDirect, but I haven't figured it out yet.

Is there a way to elegantly represent this pattern in Haskell?

Mind the pure function below, in an imperative language:
def foo(x,y):
x = f(x) if a(x)
if c(x):
x = g(x)
else:
x = h(x)
x = f(x)
y = f(y) if a(y)
x = g(x) if b(y)
return [x,y]
That function represents a style where you have to incrementally update variables. It can be avoided in most cases, but there are situations where that pattern is unavoidable - for example, writing a cooking procedure for a robot, which inherently requires a series of steps and decisions. Now, imagine we were trying to represent foo in Haskell.
foo x0 y0 =
let x1 = if a x0 then f x0 else x0 in
let x2 = if c x1 then g x1 else h x1 in
let x3 = f x2 in
let y1 = if a y0 then f y0 else y0 in
let x4 = if b y1 then g x3 else x3 in
[x4,y1]
That code works, but it is too complicated and error prone due to the need for manually managing the numeric tags. Notice that, after x1 is set, x0's value should never be used again, but it still can. If you accidentally use it, that will be an undetected error.
I've managed to solve this problem using the State monad:
fooSt x y = execState (do
(x,y) <- get
when (a x) (put (f x, y))
(x,y) <- get
if c x
then put (g x, y)
else put (h x, y)
(x,y) <- get
put (f x, y)
(x,y) <- get
when (a y) (put (x, f y))
(x,y) <- get
when (b y) (put (g x, x))) (x,y)
This way, need for tag-tracking goes away, as well as the risk of accidentally using an outdated variable. But now the code is verbose and much harder to understand, mainly due to the repetition of (x,y) <- get.
So: what is a more readable, elegant and safe way to express this pattern?
Full code for testing.
Your goals
While the direct transformation of imperative code would usually lead to the ST monad and STRef, lets think about what you actually want to do:
You want to manipulate values conditionally.
You want to return that value.
You want to sequence the steps of your manipulation.
Requirements
Now this indeed looks first like the ST monad. However, if we follow the simple monad laws, together with do notation, we see that
do
x <- return $ if somePredicate x then g x
else h x
x <- return $ if someOtherPredicate x then a x
else b x
is exactly what you want. Since you need only the most basic functions of a monad (return and >>=), you can use the simplest:
The Identity monad
foo x y = runIdentity $ do
x <- return $ if a x then f x
else x
x <- return $ if c x then g x
else h x
x <- return $ f x
y <- return $ if a x then f y
else y
x <- return $ if b y then g x
else y
return (x,y)
Note that you cannot use let x = if a x then f x else x, because in this case the x would be the same on both sides, whereas
x <- return $ if a x then f x
else x
is the same as
(return $ if a x then (f x) else x) >>= \x -> ...
and the x in the if expression is clearly not the same as the resulting one, which is going to be used in the lambda on the right hand side.
Helpers
In order to make this more clear, you can add helpers like
condM :: Monad m => Bool -> a -> a -> m a
condM p a b = return $ if p then a else b
to get an even more concise version:
foo x y = runIdentity $ do
x <- condM (a x) (f x) x
x <- fmap f $ condM (c x) (g x) (h x)
y <- condM (a y) (f y) y
x <- condM (b y) (g x) x
return (x , y)
Ternary craziness
And while we're up to it, lets crank up the craziness and introduce a ternary operator:
(?) :: Bool -> (a, a) -> a
b ? ie = if b then fst ie else snd ie
(??) :: Monad m => Bool -> (a, a) -> m a
(??) p = return . (?) p
(#) :: a -> a -> (a, a)
(#) = (,)
infixr 2 ??
infixr 2 #
infixr 2 ?
foo x y = runIdentity $ do
x <- a x ?? f x # x
x <- fmap f $ c x ?? g x # h x
y <- a y ?? f y # y
x <- b y ?? g x # x
return (x , y)
But the bottomline is, that the Identity monad has everything you need for this task.
Imperative or non-imperative
One might argue whether this style is imperative. It's definitely a sequence of actions. But there's no state, unless you count the bound variables. However, then a pack of let … in … declarations also gives an implicit sequence: you expect the first let to bind first.
Using Identity is purely functional
Either way, the code above doesn't introduce mutability. x doesn't get modified, instead you have a new x or y shadowing the last one. This gets clear if you desugar the do expression as noted above:
foo x y = runIdentity $
a x ?? f x # x >>= \x ->
c x ?? g x # h x >>= \x ->
return (f x) >>= \x ->
a y ?? f y # y >>= \y ->
b y ?? g x # x >>= \x ->
return (x , y)
Getting rid of the simplest monad
However, if we would use (?) on the left hand side and remove the returns, we could replace (>>=) :: m a -> (a -> m b) -> m b) by something with type a -> (a -> b) -> b. This just happens to be flip ($). We end up with:
($>) :: a -> (a -> b) -> b
($>) = flip ($)
infixr 0 $> -- same infix as ($)
foo x y = a x ? f x # x $> \x ->
c x ? g x # h x $> \x ->
f x $> \x ->
a y ? f y # y $> \y ->
b y ? g x # x $> \x ->
(x, y)
This is very similar to the desugared do expression above. Note that any usage of Identity can be transformed into this style, and vice-versa.
The problem you state looks like a nice application for arrows:
import Control.Arrow
if' :: (a -> Bool) -> (a -> a) -> (a -> a) -> a -> a
if' p f g x = if p x then f x else g x
foo2 :: (Int,Int) -> (Int,Int)
foo2 = first (if' c g h . if' a f id) >>>
first f >>>
second (if' a f id) >>>
(\(x,y) -> (if b y then g x else x , y))
in particular, first lifts a function a -> b to (a,c) -> (b,c), which is more idiomatic.
Edit: if' allows a lift
import Control.Applicative (liftA3)
-- a functional if for lifting
if'' b x y = if b then x else y
if' :: (a -> Bool) -> (a -> a) -> (a -> a) -> a -> a
if' = liftA3 if''
I'd probably do something like this:
foo x y = ( x', y' )
where x' = bgf y' . cgh . af $ x
y' = af y
af z = (if a z then f else id) z
cgh z = (if c z then g else h) z
bg y x = (if b y then g else id) x
For something more complicated, you may want to consider using lens:
whenM :: Monad m => m Bool -> m () -> m ()
whenM c a = c >>= \res -> when res a
ifM :: Monad m => m Bool -> m a -> m a -> m a
ifM mb ml mr = mb >>= \b -> if b then ml else mr
foo :: Int -> Int -> (Int, Int)
foo = curry . execState $ do
whenM (uses _1 a) $
_1 %= f
ifM (uses _1 c)
(_1 %= g)
(_1 %= h)
_1 %= f
whenM (uses _2 a) $
_2 %= f
whenM (uses _2 b) $ do
_1 %= g
And there's nothing stopping you from using more descriptive variable names:
foo :: Int -> Int -> (Int, Int)
foo = curry . execState $ do
let x :: Lens (a, c) (b, c) a b
x = _1
y :: Lens (c, a) (c, b) a b
y = _2
whenM (uses x a) $
x %= f
ifM (uses x c)
(x %= g)
(x %= h)
x %= f
whenM (uses y a) $
y %= f
whenM (uses y b) $ do
x %= g
This is a job for the ST (state transformer) library.
ST provides:
Stateful computations in the form of the ST type. These look like ST s a for a computation that results in a value of type a, and may be run with runST to obtain a pure a value.
First-class mutable references in the form of the STRef type. The newSTRef a action creates a new STRef s a reference with an initial value of a, and which can be read with readSTRef ref and written with writeSTRef ref a. A single ST computation can use any number of STRef references internally.
Together, these let you express the same mutable variable functionality as in your imperative example.
To use ST and STRef, we need to import:
{-# LANGUAGE NoMonomorphismRestriction #-}
import Control.Monad.ST.Safe
import Data.STRef
Instead of using the low-level readSTRef and writeSTRef all over the place, we can define the following helpers to match the imperative operations that the Python-style foo example uses:
-- STRef assignment.
(=:) :: STRef s a -> ST s a -> ST s ()
ref =: x = writeSTRef ref =<< x
-- STRef function application.
($:) :: (a -> b) -> STRef s a -> ST s b
f $: ref = f `fmap` readSTRef ref
-- Postfix guard syntax.
if_ :: Monad m => m () -> m Bool -> m ()
action `if_` guard = act' =<< guard
where act' b = if b then action
else return ()
This lets us write:
ref =: x to assign the value of ST computation x to the STRef ref.
(f $: ref) to apply a pure function f to the STRef ref.
action `if_` guard to execute action only if guard results in True.
With these helpers in place, we can faithfully translate the original imperative definition of foo into Haskell:
a = (< 10)
b = even
c = odd
f x = x + 3
g x = x * 2
h x = x - 1
f3 x = x + 2
-- A stateful computation that takes two integer STRefs and result in a final [x,y].
fooST :: Integral n => STRef s n -> STRef s n -> ST s [n]
fooST x y = do
x =: (f $: x) `if_` (a $: x)
x' <- readSTRef x
if c x' then
x =: (g $: x)
else
x =: (h $: x)
x =: (f $: x)
y =: (f $: y) `if_` (a $: y)
x =: (g $: x) `if_` (b $: y)
sequence [readSTRef x, readSTRef y]
-- Pure wrapper: simply call fooST with two fresh references, and run it.
foo :: Integral n => n -> n -> [n]
foo x y = runST $ do
x' <- newSTRef x
y' <- newSTRef y
fooST x' y'
-- This will print "[9,3]".
main = print (foo 0 0)
Points to note:
Although we first had to define some syntactical helpers (=:, $:, if_) before translating foo, this demonstrates how you can use ST and STRef as a foundation to grow your own little imperative language that's directly suited to the problem at hand.
Syntax aside, this matches the structure of the original imperative definition exactly, without any error-prone restructuring. Any minor changes to the original example can be mirrored directly to Haskell. (The addition of the temporary x' <- readSTRef x binding in the Haskell code is only in order to use it with the native if/else syntax: if desired, this can be replaced with an appropriate ST-based if/else construct.)
The above code demonstrates giving both pure and stateful interfaces to the same computation: pure callers can use foo without knowing that it uses mutable state internally, while ST callers can directly use fooST (and for example provide it with existing STRefs to modify).
#Sibi said it best in his comment:
I would suggest you to stop thinking imperatively and rather think in a functional way. I agree that it will take some time to getting used to the new pattern, but try to translate imperative ideas to functional languages isn't a great approach.
Practically speaking, your chain of let can be a good starting point:
foo x0 y0 =
let x1 = if a x0 then f x0 else x0 in
let x2 = if c x1 then g x1 else h x1 in
let x3 = f x2 in
let y1 = if a y0 then f y0 else y0 in
let x4 = if b y1 then g x3 else x3 in
[x4,y1]
But I would suggest using a single let and giving descriptive names to the intermediate stages.
In this example unfortunately I don't have a clue what the various x's and y's do, so I cannot suggest meaningful names. In real code you would use names such as x_normalized, x_translated, or such, instead of x1 and x2, to describe what those values really are.
In fact, in a let or where you don't really have variables: they're just shorthand names you give to intermediate results, to make it easy to compose the final expression (the one after in or before the where.)
This is the spirit behind the x_bar and x_baz below. Try to come up with names that are reasonably descriptive, given the context of your code.
foo x y =
let x_bar = if a x then f x else x
x_baz = f if c x_bar then g x_bar else h x_bar
y_bar = if a y then f y else y
x_there = if b y_bar then g x_baz else x_baz
in [x_there, y_bar]
Then you can start recognizing patterns that were hidden in the imperative code. For example, x_bar and y_bar are basically the same transformation, applied respectively to x and y: that's why they have the same suffix "_bar" in this nonsensical example; then your x2 probably doesn't need an intermediate name , since you can just apply f to the result of the entire "if c then g else h".
Going on with the pattern recognition, you should factor out the transformations that you are applying to variables into sub-lambdas (or whatever you call the auxiliary functions defined in a where clause.)
Again, I don't have a clue what the original code did, so I cannot suggest meaningful names for the auxiliary functions. In a real application, f_if_a would be called normalize_if_needed or thaw_if_frozen or mow_if_overgrown... you get the idea:
foo x y =
let x_bar = f_if_a x
y_bar = f_if_a y
x_baz = f (g_if_c_else_h x_bar)
x_there = g_if_b x_baz y_bar
in [x_there, y_bar]
where
f_if_a x
| a x = f x
| otherwise = x
g_if_c_else_h x
| c x = g x
| otherwise = h x
g_if_b x y
| b y = g x
| otherwise = x
Don't disregard this naming business.
The whole point of Haskell and other pure functional languages is to express algorithms without the assignment operator, meaning the tool that can modify the value of an existing variable.
The names you give to things inside a function definition, whether introduced as arguments, let, or where, can only refer to one value (or auxiliary function) throughout the entire definition, so that your code can be more easily reasoned about and proven correct.
If you don't give them meaningful names (and conversely giving your code a meaningful structure) then you're missing out on the entire purpose of Haskell.
(IMHO the other answers so far, citing monads and other shenanigans, are barking up the wrong tree.)
I always prefer layering state transformers to using a single state over a tuple: it definitely declutters things by letting you "focus" on a specific layer (representations of the x and y variables in our case):
import Control.Monad.Trans.Class
import Control.Monad.Trans.State
foo :: x -> y -> (x, y)
foo x y =
(flip runState) y $ (flip execStateT) x $ do
get >>= \v -> when (a v) (put (f v))
get >>= \v -> put ((if c v then g else h) v)
modify f
lift $ get >>= \v -> when (a v) (put (f v))
lift get >>= \v -> when (b v) (modify g)
The lift function allows us to focus on the inner state layer, which is y.

Resources