Cleaner Alternative to Extensive Pattern Matching in Haskell - haskell

Right now, I have some code that essentially works like this:
data Expression
    = Literal Bool
    | Variable String
    | Not Expression
    | Or Expression Expression
    | And Expression Expression
    deriving Eq
simplify :: Expression -> Expression
simplify (Literal b) = Literal b
simplify (Variable s) = Variable s
simplify (Not e) = case simplify e of
    (Literal b) -> Literal (not b)
    e' -> Not e'
simplify (And a b) = case (simplify a, simplify b) of
    (Literal False, _) -> Literal False
    (_, Literal False) -> Literal False
    (a', b') -> And a' b'
simplify (Or a b) = case (simplify a, simplify b) of
    (Literal True, _) -> Literal True
    (_, Literal True) -> Literal True
    (a', b') -> Or a' b'
And many more such patterns regarding all the ways one can simplify a boolean expression. As I add more operators and rules however, this grows immensely and feels.. clunky. Especially so since some rules need to be added twice to account for commutativity.
How can I nicely refactor lots and lots of patterns of which some (most, I'd say) are even symmetrical (take the And and Or patterns for example)?
Right now, adding a rule to simplify And (Variable "x") (Not (Variable "x")) to Literal False requires me to add two nested rules, which is anything but optimal.

Basically the problem is that you have to write out simplify of the subexpressions in each clause, over and over again. It would be better to first get all the subexpressions done before even considering laws involving the top-level operator. One simple way is to add an auxiliary version of simplify, that doesn't recurse down:
simplify :: Expression -> Expression
simplify (Literal b) = Literal b
simplify (Variable s) = Variable s
simplify (Not e) = simplify' . Not $ simplify e
simplify (And a b) = simplify' $ And (simplify a) (simplify b)
simplify (Or a b) = simplify' $ Or (simplify a) (simplify b)
simplify' :: Expression -> Expression
simplify' (Not (Literal b)) = Literal $ not b
simplify' (And (Literal False) _) = Literal False
...
With only the small number of operations that booleans have, this is probably the most sensible approach. With more operations, however, the duplication in simplify may still be worth avoiding. To that end, you can merge the unary and the binary operations into one common constructor each:
data Expression
    = Literal Bool
    | Variable String
    | BoolPrefix BoolPrefix Expression
    | BoolInfix BoolInfix Expression Expression
    deriving Eq
data BoolPrefix = Negation deriving Eq
data BoolInfix = AndOp | OrOp deriving Eq
and then you have just
simplify (Literal b) = Literal b
simplify (Variable s) = Variable s
simplify (BoolPrefix bpf e) = simplify' . BoolPrefix bpf $ simplify e
simplify (BoolInfix bifx a b) = simplify' $ BoolInfix bifx (simplify a) (simplify b)
Obviously this makes simplify' more awkward though, so perhaps not such a good idea. You can however get around this syntactical overhead by defining specialised pattern synonyms:
{-# LANGUAGE PatternSynonyms #-}
pattern Not :: Expression -> Expression
pattern Not x = BoolPrefix Negation x
infixr 3 :∧
pattern (:∧) :: Expression -> Expression -> Expression
pattern a:∧b = BoolInfix AndOp a b
infixr 2 :∨
pattern (:∨) :: Expression -> Expression -> Expression
pattern a:∨b = BoolInfix OrOp a b
For that matter, perhaps also
pattern F, T :: Expression
pattern F = Literal False
pattern T = Literal True
With that, you can then write
simplify' :: Expression -> Expression
simplify' (Not (Literal b)) = Literal $ not b
simplify' (F :∧ _) = F
simplify' (_ :∧ F) = F
simplify' (T :∨ _) = T
simplify' (a :∧ Not b) | a==b = F
...
I should add a caveat though: when I tried something similar to those pattern synonyms, not for booleans but for affine mappings, it made the compiler extremely slow. (Also, GHC-7.10 didn't yet support polymorphic pattern synonyms; this has changed quite a bit since.)
Note also that all this will not generally yield the simplest possible form –
for that, you'd need to find the fixed point of simplify.
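As a rough illustration of that last point, here is a minimal sketch of running one simplification pass repeatedly until the term stops changing. The names simplifyOnce, rule, fixedPoint and simplifyFully are hypothetical, and the De Morgan rule is included only because it produces a term that a single bottom-up pass does not finish simplifying.

```haskell
data Expression
  = Literal Bool
  | Variable String
  | Not Expression
  | Or Expression Expression
  | And Expression Expression
  deriving (Eq, Show)

-- One bottom-up pass: simplify the children, then try the top-level rules once.
simplifyOnce :: Expression -> Expression
simplifyOnce (Not e)   = rule (Not (simplifyOnce e))
simplifyOnce (And a b) = rule (And (simplifyOnce a) (simplifyOnce b))
simplifyOnce (Or a b)  = rule (Or (simplifyOnce a) (simplifyOnce b))
simplifyOnce e         = e

-- Hypothetical rule set; none of these recurse.
rule :: Expression -> Expression
rule (Not (Literal b))       = Literal (not b)
rule (Not (Not e))           = e
rule (Not (And a b))         = Or (Not a) (Not b) -- De Morgan; may create new redexes
rule (And (Literal False) _) = Literal False
rule (And _ (Literal False)) = Literal False
rule (Or (Literal True) _)   = Literal True
rule (Or _ (Literal True))   = Literal True
rule e                       = e

-- Apply f until the result no longer changes.
fixedPoint :: Eq a => (a -> a) -> a -> a
fixedPoint f x
  | x' == x   = x
  | otherwise = fixedPoint f x'
  where
    x' = f x

simplifyFully :: Expression -> Expression
simplifyFully = fixedPoint simplifyOnce
```

For Not (And (Not x) y), one pass yields Or (Not (Not x)) (Not y); the second pass removes the double negation. Note that this only terminates if the rule set cannot loop, which is an assumption you have to check for your own rules.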

One thing you can do is simplify as you construct, rather than first constructing then repeatedly destructing. So:
module Simple (Expression, true, false, var, not, or, and) where
import Prelude hiding (not, or, and)
data Expression
= Literal Bool
| Variable String
| Not Expression
| Or Expression Expression
| And Expression Expression
deriving (Eq, Ord, Read, Show)
true = Literal True
false = Literal False
var = Variable
not (Literal True) = false
not (Literal False) = true
not x = Not x
or (Literal True) _ = true
or _ (Literal True) = true
or x y = Or x y
and (Literal False) _ = false
and _ (Literal False) = false
and x y = And x y
We can try it out in ghci:
> and (var "x") (and (var "y") false)
Literal False
Note that the constructors are not exported: this ensures that folks constructing one of these can't avoid the simplification process. Actually, this may be a drawback; occasionally it is nice to see the "full" form. A standard approach to dealing with this is to make the exported smart constructors part of a type class, so that the same term can be built either in its "full" form or in its "simplified" form. To avoid having to define the type twice, we could either use a newtype or a phantom parameter; I'll elect for the latter here to reduce the noise in pattern-matching.
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE KindSignatures #-}
module Simple (Format(..), true, false, var, not, or, and) where
import Prelude hiding (not, or, and)
data Format = Explicit | Simplified
data Expression (a :: Format)
= Literal Bool
| Variable String
| Not (Expression a)
| Or (Expression a) (Expression a)
| And (Expression a) (Expression a)
deriving (Eq, Ord, Read, Show)
class Expr e where
true, false :: e
var :: String -> e
not :: e -> e
or, and :: e -> e -> e
instance Expr (Expression Explicit) where
true = Literal True
false = Literal False
var = Variable
not = Not
or = Or
and = And
instance Expr (Expression Simplified) where
true = Literal True
false = Literal False
var = Variable
not (Literal True) = false
not (Literal False) = true
not x = Not x
or (Literal True) _ = true
or _ (Literal True) = true
or x y = Or x y
and (Literal False) _ = false
and _ (Literal False) = false
and x y = And x y
Now in ghci we can "run" the same term in two different ways:
> :set -XDataKinds
> and (var "x") (and (var "y") false) :: Expression Explicit
And (Variable "x") (And (Variable "y") (Literal False))
> and (var "x") (and (var "y") false) :: Expression Simplified
Literal False
You might want to add more rules later; for example, maybe you want:
and (Variable x) (Not (Variable y)) | x == y = false
and (Not (Variable x)) (Variable y) | x == y = false
Having to repeat both "orders" of patterns is a bit annoying. We should abstract over that! The data declaration and classes will be the same, but we'll add the helper function eitherOrder and use it in the definitions of and and or. Here's a more complete set of simplifications using this idea (and our final version of the module):
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE KindSignatures #-}
module Simple (Format(..), true, false, var, not, or, and) where
import Data.Maybe
import Data.Monoid
import Prelude hiding (not, or, and)
import Control.Applicative ((<|>))
data Format = Explicit | Simplified
data Expression (a :: Format)
= Literal Bool
| Variable String
| Not (Expression a)
| Or (Expression a) (Expression a)
| And (Expression a) (Expression a)
deriving (Eq, Ord, Read, Show)
class Expr e where
true, false :: e
var :: String -> e
not :: e -> e
or, and :: e -> e -> e
instance Expr (Expression Explicit) where
true = Literal True
false = Literal False
var = Variable
not = Not
or = Or
and = And
eitherOrder :: (e -> e -> e)
            -> (e -> e -> Maybe e)
            -> e -> e -> e
eitherOrder fExplicit fSimplified x y = fromMaybe
    (fExplicit x y)
    (fSimplified x y <|> fSimplified y x)
instance Expr (Expression Simplified) where
true = Literal True
false = Literal False
var = Variable
not (Literal True) = false
not (Literal False) = true
not (Not x) = x
not x = Not x
or = eitherOrder Or go where
go (Literal True) _ = Just true
go (Literal False) x = Just x
go (Variable x) (Variable y) | x == y = Just (var x)
go (Variable x) (Not (Variable y)) | x == y = Just true
go _ _ = Nothing
and = eitherOrder And go where
go (Literal True) x = Just x
go (Literal False) _ = Just false
go (Variable x) (Variable y) | x == y = Just (var x)
go (Variable x) (Not (Variable y)) | x == y = Just false
go _ _ = Nothing
Now in ghci we can do more complicated simplifications, like:
> and (not (not (var "x"))) (var "x") :: Expression Simplified
Variable "x"
And even though we only wrote one order of the rewrite rule, both orders work properly:
> and (not (var "x")) (var "x") :: Expression Simplified
Literal False
> and (var "x") (not (var "x")) :: Expression Simplified
Literal False

I think Einstein said, "Simplify as much as possible, but no more." You have yourself a complicated datatype, and a correspondingly complicated concept, so I assume any technique can only be so much cleaner for the problem at hand.
That said, the first option is to use instead a case structure.
simplify x = case x of
    Literal _ -> x
    Variable _ -> x
    Not e -> simplifyNot $ simplify e
    ...
  where
    sharedFunc1 = ...
    sharedFunc2 = ...
This has the added benefit of allowing shared functions that are usable by all cases but are not in the top-level namespace. I also like how the cases are freed of their parentheses. (Also note that in the first two cases I just return the original term instead of creating a new one.) I often use this sort of structure to just break out other simplify functions, as in the case of Not.
This problem in particular may lend itself to basing Expression on an underlying functor, so that you may fmap a simplification of the subexpressions and then perform the specific combinations of the given case. It will look something like the following:
simplify :: Expression' -> Expression'
simplify = Exp . reduce . fmap simplify . unExp
The steps in this are unwrapping Expression' into the underlying functor representation, mapping the simplification on the underlying term, and then reducing that simplification and wrapping back up into the new Expression'
{-# Language DeriveFunctor #-}
newtype Expression' = Exp { unExp :: ExpressionF Expression' }
data ExpressionF e
= Literal Bool
| Variable String
| Not e
| Or e e
| And e e
deriving (Eq,Functor)
Now, I have pushed the complexity off into the reduce function, which is only a little less complex because it doesn't have to worry about first reducing the subterm. But it will now contain solely the business logic of combining one term with another.
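To give an idea of what that reduce function might contain, here is a minimal sketch. The rules shown are only an assumed sample, and the Eq and Show instances are added for convenience; the point is that reduce inspects a single layer whose subterms have already been simplified.

```haskell
{-# LANGUAGE DeriveFunctor #-}

newtype Expression' = Exp { unExp :: ExpressionF Expression' }
  deriving (Eq, Show)

data ExpressionF e
  = Literal Bool
  | Variable String
  | Not e
  | Or e e
  | And e e
  deriving (Eq, Show, Functor)

-- Hypothetical rule set: no recursion here, only one layer (plus Exp wrappers).
reduce :: ExpressionF Expression' -> ExpressionF Expression'
reduce (Not (Exp (Literal b)))       = Literal (not b)
reduce (And (Exp (Literal False)) _) = Literal False
reduce (And _ (Exp (Literal False))) = Literal False
reduce (Or (Exp (Literal True)) _)   = Literal True
reduce (Or _ (Exp (Literal True)))   = Literal True
reduce e                             = e

-- Unwrap, simplify the subterms, reduce the top layer, wrap again.
simplify :: Expression' -> Expression'
simplify = Exp . reduce . fmap simplify . unExp
```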
This may or may not be a good technique for you, but it may make some enhancements easier. For instance, if it is possible to form invalid expressions in your language, you could simplify that with Maybe valued failures.
simplifyMb :: Expression' -> Maybe Expression'
simplifyMb = fmap Exp . reduceMb <=< traverse simplifyMb . unExp
Here, traverse applies simplifyMb to the subterms of the ExpressionF, giving an expression of Maybe subterms, ExpressionF (Maybe Expression'); then, if any subterm is Nothing, it returns Nothing, and if all are Just x, it returns Just (e :: ExpressionF Expression'). traverse isn't actually separated into distinct phases like that, but it's easier to explain as if it were. Also note that you will need the DeriveTraversable and DeriveFoldable language pragmas, the matching deriving clauses on the ExpressionF data type, and (<=<) imported from Control.Monad.
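Putting those pieces together, a self-contained sketch could look like the following. reduceMb and its notion of an "invalid" expression (a variable with an empty name) are made up purely for illustration.

```haskell
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}

import Control.Monad ((<=<))

newtype Expression' = Exp { unExp :: ExpressionF Expression' }
  deriving (Eq, Show)

data ExpressionF e
  = Literal Bool
  | Variable String
  | Not e
  | Or e e
  | And e e
  deriving (Eq, Show, Functor, Foldable, Traversable)

-- Hypothetical failure condition: a variable with an empty name is invalid.
reduceMb :: ExpressionF Expression' -> Maybe (ExpressionF Expression')
reduceMb (Variable "")           = Nothing
reduceMb (Not (Exp (Literal b))) = Just (Literal (not b))
reduceMb e                       = Just e

-- Simplify the subterms first; any Nothing among them aborts the whole result.
simplifyMb :: Expression' -> Maybe Expression'
simplifyMb = fmap Exp . reduceMb <=< traverse simplifyMb . unExp
```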
The downside? Well, for one, your code will be littered with Exp wrappers everywhere. Consider applying simplifyMb to the simple term below:
simplifyMb (Exp $ Not (Exp $ Literal True))
It's also a lot to get your head around, but if you understand the traverse and fmap pattern above, you can reuse it in lots of places, so that's good. I also believe defining simplify in that way makes it more robust to whatever the specific ExpressionF constructions may turn into. It doesn't mention them, so the deep simplification will be unaffected by refactors. The reduce function, on the other hand, will be.

Carrying on with your Binary Op Expression Expression idea, we could have the datatype:
data Expression
= Literal Bool
| Variable String
| Not Expression
| Binary Op Expression Expression
deriving Eq
data Op = Or | And deriving Eq
And an auxiliary function
{-# LANGUAGE ViewPatterns #-}
simplifyBinary :: Op -> Expression -> Expression -> Expression
simplifyBinary binop (simplify -> leftexp) (simplify -> rightexp) =
    case oneway binop leftexp rightexp ++ oneway binop rightexp leftexp of
        simplified : _ -> simplified
        [] -> Binary binop leftexp rightexp
  where
    oneway :: Op -> Expression -> Expression -> [Expression]
    oneway And (Literal False) _ = [Literal False]
    oneway Or (Literal True) _ = [Literal True]
    -- more cases here
    oneway _ _ _ = []
The idea is that you would put the simplification cases in oneway and then simplifyBinary would take care of reversing the arguments, to avoid having to write the symmetric cases.

You could write a generic simplifier for all binary operations:
simplifyBinWith :: (Bool -> Bool -> Bool) -- the boolean operation
                -> (Expression -> Expression -> Expression) -- the constructor
                -> Expression -> Expression -- the two operands
                -> Expression -- the simplified result
simplifyBinWith op cons a b = case (simplify a, simplify b) of
    (Literal x, Literal y) -> Literal (op x y)
    (Literal x, b') -> tryAll (x `op`) b'
    (a', Literal y) -> tryAll (`op` y) a'
    (a', b') -> cons a' b'
  where
    tryAll f term = case (f True, f False) of -- what would f do if term was true or false
        (True, True) -> Literal True
        (True, False) -> term
        (False, True) -> Not term
        (False, False) -> Literal False
That way, your simplify function would become
simplify :: Expression -> Expression
simplify (Not e) = case simplify e of
(Literal b) -> Literal (not b)
e' -> Not e'
simplify (And a b) = simplifyBinWith (&&) And a b
simplify (Or a b) = simplifyBinWith (||) Or a b
simplify t = t
and could be easily extended to more binary operations. It would also work well with the Binary Op Expression Expression idea, you'd pass Op instead of an Expression constructor to simplifyBinWith and the pattern in simplify could be generalised:
simplify :: Expression -> Expression
simplify (Not e) = case simplify e of
(Literal b) -> Literal (not b)
e' -> Not e'
simplify (Binary op a b) = simplifyBinWith (case op of
And -> (&&)
Or -> (||)
Xor -> (/=)
Implies -> (<=)
Equals -> (==)
…
) op a b
simplify t = t
where
simplifyBinWith f op a b = case (simplify a, simplify b) of
(Literal x, Literal y) -> Literal (f x y)
…
(a', b') -> Binary op a' b'

Related

How to reduce code duplication when dealing with recursive sum types

I am currently working on a simple interpreter for a programming language and I have a data type like this:
data Expr
= Variable String
| Number Int
| Add [Expr]
| Sub Expr Expr
And I have many functions that do simple things like:
-- Substitute a value for a variable
substituteName :: String -> Int -> Expr -> Expr
substituteName name newValue = go
where
go (Variable x)
| x == name = Number newValue
go (Add xs) =
Add $ map go xs
go (Sub x y) =
Sub (go x) (go y)
go other = other
-- Replace subtraction with a constant with addition by a negative number
replaceSubWithAdd :: Expr -> Expr
replaceSubWithAdd = go
where
go (Sub x (Number y)) =
Add [go x, Number (-y)]
go (Add xs) =
Add $ map go xs
go (Sub x y) =
Sub (go x) (go y)
go other = other
But in each of these functions, I have to repeat the part that calls the code recursively with just a small change to one part of the function. Is there any existing way to do this more generically? I would rather not have to copy and paste this part:
go (Add xs) =
Add $ map go xs
go (Sub x y) =
Sub (go x) (go y)
go other = other
And just change a single case each time because it seems inefficient to duplicate code like this.
The only solution I could come up with is to have a function that calls a function first on the whole data structure and then recursively on the result like this:
recurseAfter :: (Expr -> Expr) -> Expr -> Expr
recurseAfter f x =
case f x of
Add xs ->
Add $ map (recurseAfter f) xs
Sub x y ->
Sub (recurseAfter f x) (recurseAfter f y)
other -> other
substituteName :: String -> Int -> Expr -> Expr
substituteName name newValue =
recurseAfter $ \case
Variable x
| x == name -> Number newValue
other -> other
replaceSubWithAdd :: Expr -> Expr
replaceSubWithAdd =
recurseAfter $ \case
Sub x (Number y) ->
Add [x, Number (-y)]
other -> other
But I feel like there should probably be a simpler way to do this already. Am I missing something?
Congratulations, you just rediscovered anamorphisms!
Here's your code, rephrased so that it works with the recursion-schemes package. Alas, it's not shorter, since we need some boilerplate to make the machinery work. (There might be some automagic way to avoid the boilerplate, e.g. using generics. I simply do not know.)
Below, your recurseAfter is replaced with the standard ana.
We first define your recursive type, as well as the functor it is the fixed point of.
{-# LANGUAGE DeriveFunctor, TypeFamilies, LambdaCase #-}
{-# OPTIONS -Wall #-}
module AnaExpr where
import Data.Functor.Foldable
data Expr
= Variable String
| Number Int
| Add [Expr]
| Sub Expr Expr
deriving (Show)
data ExprF a
= VariableF String
| NumberF Int
| AddF [a]
| SubF a a
deriving (Functor)
Then we connect the two with a few instances so that we can unfold Expr into the isomorphic ExprF Expr, and fold it back.
type instance Base Expr = ExprF
instance Recursive Expr where
project (Variable s) = VariableF s
project (Number i) = NumberF i
project (Add es) = AddF es
project (Sub e1 e2) = SubF e1 e2
instance Corecursive Expr where
embed (VariableF s) = Variable s
embed (NumberF i) = Number i
embed (AddF es) = Add es
embed (SubF e1 e2) = Sub e1 e2
Finally, we adapt your original code, and add a couple of tests.
substituteName :: String -> Int -> Expr -> Expr
substituteName name newValue = ana $ \case
Variable x | x == name -> NumberF newValue
other -> project other
testSub :: Expr
testSub = substituteName "x" 42 (Add [Add [Variable "x"], Number 0])
replaceSubWithAdd :: Expr -> Expr
replaceSubWithAdd = ana $ \case
Sub x (Number y) -> AddF [x, Number (-y)]
other -> project other
testReplace :: Expr
testReplace = replaceSubWithAdd
(Add [Sub (Add [Variable "x", Sub (Variable "y") (Number 34)]) (Number 10), Number 4])
An alternative could be to define ExprF a only, and then derive type Expr = Fix ExprF. This saves some of the boilerplate above (e.g. the two instances), at the cost of having to use Fix (VariableF ...) instead of Variable ..., as well as the analogous for the other constructors.
One could further alleviate that using pattern synonyms (at the cost of a little more boilerplate, though).
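A minimal sketch of that Fix-plus-pattern-synonyms variant follows. To keep it self-contained it defines its own Fix, unFix and ana rather than importing them from recursion-schemes, and render is a hypothetical helper that exists only so results can be inspected.

```haskell
{-# LANGUAGE DeriveFunctor, PatternSynonyms #-}

-- Local stand-in for the Fix newtype; not the library's definition.
newtype Fix f = Fix (f (Fix f))

unFix :: Fix f -> f (Fix f)
unFix (Fix x) = x

data ExprF a
  = VariableF String
  | NumberF Int
  | AddF [a]
  | SubF a a
  deriving (Functor)

type Expr = Fix ExprF

-- Bidirectional pattern synonyms hide the Fix wrapper.
pattern Variable :: String -> Expr
pattern Variable s = Fix (VariableF s)

pattern Number :: Int -> Expr
pattern Number n = Fix (NumberF n)

pattern Add :: [Expr] -> Expr
pattern Add es = Fix (AddF es)

pattern Sub :: Expr -> Expr -> Expr
pattern Sub a b = Fix (SubF a b)

{-# COMPLETE Variable, Number, Add, Sub #-}

-- Hand-written anamorphism for the local Fix.
ana :: Functor f => (a -> f a) -> a -> Fix f
ana g = Fix . fmap (ana g) . g

substituteName :: String -> Int -> Expr -> Expr
substituteName name newValue = ana step
  where
    step (Variable x) | x == name = NumberF newValue
    step other                    = unFix other

-- Small S-expression printer, for inspection only.
render :: Expr -> String
render (Variable s) = s
render (Number n)   = show n
render (Add es)     = "(+ " ++ unwords (map render es) ++ ")"
render (Sub a b)    = "(- " ++ render a ++ " " ++ render b ++ ")"
```

With the synonyms in place, terms are built and matched as Variable "x" or Add [...] just as with the plain data type, while the traversal code works on ExprF.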
Update: I finally found the automagic tool, using template Haskell. This makes the whole code reasonably short. Note that the ExprF functor and the two instances above still exist under the hood, and we still have to use them. We only save the hassle of having to define them manually, but that alone saves a lot of effort.
{-# LANGUAGE DeriveFunctor, DeriveTraversable, TypeFamilies, LambdaCase, TemplateHaskell #-}
{-# OPTIONS -Wall #-}
module AnaExpr where
import Data.Functor.Foldable
import Data.Functor.Foldable.TH
data Expr
= Variable String
| Number Int
| Add [Expr]
| Sub Expr Expr
deriving (Show)
makeBaseFunctor ''Expr
substituteName :: String -> Int -> Expr -> Expr
substituteName name newValue = ana $ \case
Variable x | x == name -> NumberF newValue
other -> project other
testSub :: Expr
testSub = substituteName "x" 42 (Add [Add [Variable "x"], Number 0])
replaceSubWithAdd :: Expr -> Expr
replaceSubWithAdd = ana $ \case
Sub x (Number y) -> AddF [x, Number (-y)]
other -> project other
testReplace :: Expr
testReplace = replaceSubWithAdd
(Add [Sub (Add [Variable "x", Sub (Variable "y") (Number 34)]) (Number 10), Number 4])
As an alternative approach, this is also a typical use case for the uniplate package. It can use Data.Data generics rather than Template Haskell to generate the boilerplate, so if you derive Data instances for your Expr:
import Data.Data
data Expr
= Variable String
| Number Int
| Add [Expr]
| Sub Expr Expr
deriving (Show, Data)
then the transform function from Data.Generics.Uniplate.Data applies a function recursively to each nested Expr:
import Data.Generics.Uniplate.Data
substituteName :: String -> Int -> Expr -> Expr
substituteName name newValue = transform f
where f (Variable x) | x == name = Number newValue
f other = other
replaceSubWithAdd :: Expr -> Expr
replaceSubWithAdd = transform f
where f (Sub x (Number y)) = Add [x, Number (-y)]
f other = other
Note that in replaceSubWithAdd in particular, the function f is written to perform a non-recursive substitution; transform makes it recursive in x :: Expr, so it's doing the same magic to the helper function as ana does in @chi's answer:
> substituteName "x" 42 (Add [Add [Variable "x"], Number 0])
Add [Add [Number 42],Number 0]
> replaceSubWithAdd (Add [Sub (Add [Variable "x",
Sub (Variable "y") (Number 34)]) (Number 10), Number 4])
Add [Add [Add [Variable "x",Add [Variable "y",Number (-34)]],Number (-10)],Number 4]
>
This is no shorter than @chi's Template Haskell solution. One potential advantage is that uniplate provides some additional functions that may be helpful. For example, if you use descend in place of transform, it transforms only the immediate children, which gives you control over where the recursion happens; or you can use rewrite to re-transform the result of transformations until you reach a fixed point. One potential disadvantage is that "anamorphism" sounds way cooler than "uniplate".
Full program:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data -- in base
import Data.Generics.Uniplate.Data -- package uniplate
data Expr
= Variable String
| Number Int
| Add [Expr]
| Sub Expr Expr
deriving (Show, Data)
substituteName :: String -> Int -> Expr -> Expr
substituteName name newValue = transform f
where f (Variable x) | x == name = Number newValue
f other = other
replaceSubWithAdd :: Expr -> Expr
replaceSubWithAdd = transform f
where f (Sub x (Number y)) = Add [x, Number (-y)]
f other = other
replaceSubWithAdd1 :: Expr -> Expr
replaceSubWithAdd1 = descend f
where f (Sub x (Number y)) = Add [x, Number (-y)]
f other = other
main = do
print $ substituteName "x" 42 (Add [Add [Variable "x"], Number 0])
print $ replaceSubWithAdd e
print $ replaceSubWithAdd1 e
where e = Add [Sub (Add [Variable "x", Sub (Variable "y") (Number 34)])
(Number 10), Number 4]
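To make the relationship between descend, transform and rewrite more concrete, here is a hand-written sketch specialised to Expr, using only base. The names descendExpr, transformExpr, rewriteExpr and subToAdd are hypothetical; the real uniplate functions are generic over any type with the right instance, whereas these are written out for this one type.

```haskell
data Expr
  = Variable String
  | Number Int
  | Add [Expr]
  | Sub Expr Expr
  deriving (Eq, Show)

-- Apply f to the immediate children only.
descendExpr :: (Expr -> Expr) -> Expr -> Expr
descendExpr f (Add xs)  = Add (map f xs)
descendExpr f (Sub x y) = Sub (f x) (f y)
descendExpr _ other     = other

-- Bottom-up traversal: children first, then the node itself.
transformExpr :: (Expr -> Expr) -> Expr -> Expr
transformExpr f = f . descendExpr (transformExpr f)

-- Re-run the rule on every result until it returns Nothing everywhere.
rewriteExpr :: (Expr -> Maybe Expr) -> Expr -> Expr
rewriteExpr r = transformExpr g
  where
    g x = maybe x (rewriteExpr r) (r x)

-- The subtraction rule from above, phrased as a partial rewrite.
subToAdd :: Expr -> Maybe Expr
subToAdd (Sub x (Number y)) = Just (Add [x, Number (negate y)])
subToAdd _                  = Nothing
```

As with the library's rewrite, rewriteExpr only terminates if the rule eventually stops matching.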

How does enumFromTo work?

I cannot add a number to a Char; the expression 'a' + 1 fails to compile. And yet ['a'..'z'] successfully creates a string in which each character value is incremented. Is there a special function that can increment a Char?
I know that I can do chr (ord c + 1).
How does the ['a'..'z'] or the underlying enumFromTo function increment the characters in the resulting String?
Yes, there is a special function that can add to a Char, from the same Enum class that enumFromTo is from, named succ. Beware that it is partial: succ maxBound is undefined, so take care to check the value of the character before you apply succ. succ is indeed the same as \c -> chr (ord c + 1), as you can verify with the universe package:
> let avoidMaxBound f x = if x == maxBound then Nothing else Just (f x)
> avoidMaxBound succ == avoidMaxBound (\c -> chr (ord c + 1))
True
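If you want to avoid the partiality altogether, a small wrapper along these lines is one option. safeSucc and viaOrd are hypothetical names, not part of base.

```haskell
import Data.Char (chr, ord)

-- Total variant of succ for bounded enumerations.
safeSucc :: (Eq a, Bounded a, Enum a) => a -> Maybe a
safeSucc x
  | x == maxBound = Nothing
  | otherwise     = Just (succ x)

-- The formulation from the question, for comparison.
viaOrd :: Char -> Char
viaOrd c = chr (ord c + 1)
```

For example, safeSucc 'a' is Just 'b', while safeSucc (maxBound :: Char) is Nothing instead of an error.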
In fact the implementation of succ in GHC is quite close to the function you suggested:
instance Enum Char where
succ (C# c#)
| isTrue# (ord# c# /=# 0x10FFFF#) = C# (chr# (ord# c# +# 1#))
| otherwise = error ("Prelude.Enum.Char.succ: bad argument")
However, succ is not used in the implementation of enumFromTo in GHC:
instance Enum Char where
{-# INLINE enumFromTo #-}
enumFromTo (C# x) (C# y) = eftChar (ord# x) (ord# y)
{-# RULES
"eftChar" [~1] forall x y. eftChar x y = build (\c n -> eftCharFB c n x y)
#-}
-- We can do better than for Ints because we don't
-- have hassles about arithmetic overflow at maxBound
{-# INLINE [0] eftCharFB #-}
eftCharFB :: (Char -> a -> a) -> a -> Int# -> Int# -> a
eftCharFB c n x0 y = go x0
where
go x | isTrue# (x ># y) = n
| otherwise = C# (chr# x) `c` go (x +# 1#)
{-# NOINLINE [1] eftChar #-}
eftChar :: Int# -> Int# -> String
eftChar x y | isTrue# (x ># y ) = []
| otherwise = C# (chr# x) : eftChar (x +# 1#) y
If you can squint past the nastiness that exists primarily for efficiency reasons, you can see that eftChar is essentially using succ, but an inlined version of it rather than an actual call to succ (here, to avoid unboxing and re-boxing the Char being manipulated).
I think you're after the pred and succ methods, which return the predecessor or successor of Enum a. The problem is that for a Bounded Enum, if you apply succ on the maximum member of the set you will get an error.
Bearing this in mind, you can define enumFromTo recursively like so (avoiding dangerous succ calls):
eftEnum :: (Enum a, Eq a, Ord a) => a -> a -> [a]
eftEnum a b
| a > b = []
| a == b = [a]
| otherwise = a : rest
where rest = eftEnum (succ a) b

Reusing patterns in pattern guards or case expressions

My Haskell project includes an expression evaluator, which for the purposes of this question can be simplified to:
data Expression a where
I :: Int -> Expression Int
B :: Bool -> Expression Bool
Add :: Expression Int -> Expression Int -> Expression Int
Mul :: Expression Int -> Expression Int -> Expression Int
Eq :: Expression Int -> Expression Int -> Expression Bool
And :: Expression Bool -> Expression Bool -> Expression Bool
Or :: Expression Bool -> Expression Bool -> Expression Bool
If :: Expression Bool -> Expression a -> Expression a -> Expression a
-- Reduces an Expression down to the simplest representation.
reduce :: Expression a -> Expression a
-- ... implementation ...
The straightforward approach to implementing this is to write a case expression to recursively evaluate and pattern match, like so:
reduce (Add x y) = case (reduce x, reduce y) of
(I x', I y') -> I $ x' + y'
(x', y') -> Add x' y'
reduce (Mul x y) = case (reduce x, reduce y) of
(I x', I y') -> I $ x' * y'
(x', y') -> Mul x' y'
reduce (And x y) = case (reduce x, reduce y) of
(B x', B y') -> B $ x' && y'
(x', y') -> And x' y'
-- ... and similarly for other cases.
To me, that definition looks somewhat awkward, so I then rewrote the definition using pattern guards, like so:
reduce (Add x y) | I x' <- reduce x
, I y' <- reduce y
= I $ x' + y'
I think this definition looks cleaner compared to the case expression, but when defining multiple patterns for different constructors, the pattern is repeated multiple times.
reduce (Add x y) | I x' <- reduce x
, I y' <- reduce y
= I $ x' + y'
reduce (Mul x y) | I x' <- reduce x
, I y' <- reduce y
= I $ x' * y'
Noting these repeated patterns, I was hoping there would be some syntax or structure that could cut down on the repetition in the pattern matching. Is there a generally accepted method to simplify these definitions?
Edit: after reviewing the pattern guards, I've realised they don't work as a drop-in replacement here. Although they provide the same result when x and y can be reduced to I _, they do not reduce any values when the pattern guards do not match. I would still like reduce to simplify subexpressions of Add et al.
One partial solution, which I've used in a similar situation, is to extract the logic into a "lifting" function that takes a normal Haskell operation and applies it to your language's values. This abstracts over the wrapping/unwrapping and the resulting error handling.
The idea is to create two typeclasses for going to and from your custom type, with appropriate error handling. Then you can use these to create a liftOp function that could look like this:
liftOp :: (Extract a, Extract b, Pack c)
       => (Expression a -> Expression b -> Expression c) -- fallback constructor
       -> (a -> b -> c) -- the native operation
       -> (Expression a -> Expression b -> Expression c)
liftOp err op a b = case res of
    Nothing -> err a' b'
    Just r -> pack r
  where
    a' = reduce' a
    b' = reduce' b
    res = do x <- extract a'
             y <- extract b'
             return $ x `op` y
Then each specific case looks like this:
Mul x y -> liftOp Mul (*) x y
Which isn't too bad: it isn't overly redundant. It encompasses the information that matters: Mul gets mapped to *, and in the error case we just apply Mul again.
You would also need instances for packing and unpacking, but these are useful anyhow. One neat trick is that these can also let you embed functions in your DSL automatically, with an instance of the form (Extract a, Pack b) => Pack (a -> b).
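As a rough sketch of how the classes and instances might fit together, assuming a cut-down version of the Expression GADT from the question: the class shapes, the instances and reduce' below are guesses at one workable design, not the code from the original project.

```haskell
{-# LANGUAGE GADTs #-}

data Expression a where
  I   :: Int  -> Expression Int
  B   :: Bool -> Expression Bool
  Add :: Expression Int  -> Expression Int  -> Expression Int
  Mul :: Expression Int  -> Expression Int  -> Expression Int
  Eq  :: Expression Int  -> Expression Int  -> Expression Bool
  And :: Expression Bool -> Expression Bool -> Expression Bool

-- Going from the DSL to a native value can fail, hence Maybe.
class Extract a where
  extract :: Expression a -> Maybe a

-- Going from a native value to the DSL always succeeds.
class Pack a where
  pack :: a -> Expression a

instance Extract Int where
  extract (I n) = Just n
  extract _     = Nothing

instance Extract Bool where
  extract (B b) = Just b
  extract _     = Nothing

instance Pack Int  where pack = I
instance Pack Bool where pack = B

-- Reduce both operands; combine them natively if both are literals,
-- otherwise rebuild the node with the fallback constructor.
liftOp :: (Extract a, Extract b, Pack c)
       => (Expression a -> Expression b -> Expression c)
       -> (a -> b -> c)
       -> Expression a -> Expression b -> Expression c
liftOp err op a b =
  case op <$> extract a' <*> extract b' of
    Nothing -> err a' b'
    Just r  -> pack r
  where
    a' = reduce' a
    b' = reduce' b

reduce' :: Expression a -> Expression a
reduce' (Add x y) = liftOp Add (+)  x y
reduce' (Mul x y) = liftOp Mul (*)  x y
reduce' (Eq  x y) = liftOp Eq  (==) x y
reduce' (And x y) = liftOp And (&&) x y
reduce' e         = e
```

In this cut-down language every term reduces to a literal, so the fallback branch only starts to matter once something irreducible, such as a variable, is added.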
I'm not sure this will work exactly for your example, but I hope it gives you a good starting point. You might want to wire additional error handling through the whole thing, but the good news is that most of that gets folded into the definition of pack, extract and liftOp, so it's still pretty centralized.
I wrote up a similar solution for a related (but somewhat different) problem. It's also a way to handle going back and forth between native Haskell values and an interpreter, but the interpreter is structured differently. Some of the same ideas should still apply though!
This answer is inspired by rampion's follow-up question, which suggests the following function:
step :: Expression a -> Expression a
step x = case x of
Add (I x) (I y) -> I $ x + y
Mul (I x) (I y) -> I $ x * y
Eq (I x) (I y) -> B $ x == y
And (B x) (B y) -> B $ x && y
Or (B x) (B y) -> B $ x || y
If (B b) x y -> if b then x else y
z -> z
step looks at a single term, and reduces it if everything needed to reduce it is present. Equipped with step, we only need a way to replace a term everywhere in the expression tree. We can start by defining a way to apply a function inside every term.
{-# LANGUAGE RankNTypes #-}
emap :: (forall a. Expression a -> Expression a) -> Expression x -> Expression x
emap f x = case x of
I a -> I a
B a -> B a
Add x y -> Add (f x) (f y)
Mul x y -> Mul (f x) (f y)
Eq x y -> Eq (f x) (f y)
And x y -> And (f x) (f y)
Or x y -> Or (f x) (f y)
If x y z -> If (f x) (f y) (f z)
Now, we need to apply a function everywhere, both to the term and everywhere inside the term. There are two basic possibilities, we could apply the function to the term before applying it inside or we could apply the function afterwards.
premap :: (forall a. Expression a -> Expression a) -> Expression x -> Expression x
premap f = emap (premap f) . f
postmap :: (forall a. Expression a -> Expression a) -> Expression x -> Expression x
postmap f = f . emap (postmap f)
This gives us two possibilities for how to use step, which I will call shorten and reduce.
shorten = premap step
reduce = postmap step
These behave a little differently. shorten removes the innermost level of terms, replacing them with literals, shortening the height of the expression tree by one. reduce completely evaluates the expression tree to a literal. Here's the result of iterating each of these on the same input
"shorten"
If (And (B True) (Or (B False) (B True))) (Add (I 1) (Mul (I 2) (I 3))) (I 0)
If (And (B True) (B True)) (Add (I 1) (I 6)) (I 0)
If (B True) (I 7) (I 0)
I 7
"reduce"
If (And (B True) (Or (B False) (B True))) (Add (I 1) (Mul (I 2) (I 3))) (I 0)
I 7
Partial reduction
Your question implies that you sometimes expect that expressions can't be reduced completely. I'll extend your example to include something to demonstrate this case, by adding a variable, Var.
data Expression a where
Var :: Expression Int
...
We will need to add support for Var to emap:
emap f x = case x of
Var -> Var
...
bind will replace the variable, and evaluateFor performs a complete evaluation, traversing the expression only once.
bind :: Int -> Expression a -> Expression a
bind a x = case x of
Var -> I a
z -> z
evaluateFor :: Int -> Expression a -> Expression a
evaluateFor a = postmap (step . bind a)
Now reduce, iterated on an example containing a variable, produces the following output:
"reduce"
If (And (B True) (Or (B False) (B True))) (Add (I 1) (Mul Var (I 3))) (I 0)
Add (I 1) (Mul Var (I 3))
If the output expression from the reduction is evaluated for a specific value of Var, we can reduce the expression all the way to a literal.
"evaluateFor 5"
Add (I 1) (Mul Var (I 3))
I 16
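For readers who want to try the variable case in isolation, here is a cut-down sketch. It assumes a reduced GADT with only I, Var, Add and Mul, a simplified step that only folds arithmetic on literals, and standalone-derived Eq and Show instances; none of these are the full definitions from the answer.

```haskell
{-# LANGUAGE GADTs, RankNTypes, StandaloneDeriving #-}

-- Reduced version of the expression type (an assumption for illustration).
data Expression a where
  I   :: Int -> Expression Int
  Var :: Expression Int
  Add :: Expression Int -> Expression Int -> Expression Int
  Mul :: Expression Int -> Expression Int -> Expression Int

deriving instance Eq (Expression a)
deriving instance Show (Expression a)

-- Apply f to the immediate children only.
emap :: (forall a. Expression a -> Expression a) -> Expression x -> Expression x
emap f e = case e of
  I n     -> I n
  Var     -> Var
  Add x y -> Add (f x) (f y)
  Mul x y -> Mul (f x) (f y)

-- Bottom-up traversal: children first, then the node itself.
postmap :: (forall a. Expression a -> Expression a) -> Expression x -> Expression x
postmap f = f . emap (postmap f)

-- Simplified step: fold arithmetic on literals, leave everything else alone.
step :: Expression a -> Expression a
step (Add (I a) (I b)) = I (a + b)
step (Mul (I a) (I b)) = I (a * b)
step e                 = e

-- Replace the variable by a literal.
bind :: Int -> Expression a -> Expression a
bind a x = case x of
  Var -> I a
  z   -> z

-- Substitute and reduce in a single bottom-up pass.
evaluateFor :: Int -> Expression a -> Expression a
evaluateFor a = postmap (step . bind a)
```

In this sketch, postmap step leaves Add (I 1) (Mul Var (I 3)) unchanged because nothing can be folded around Var, while evaluateFor 5 turns the same expression into I 16.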
Applicative
emap can instead be written in terms of an Applicative Functor, and postmap can be made into a generic piece of code suitable for other data types than expressions. How to do so is described in this answer to rampion's follow-up question.

Haskell-Use instance function for pattern matching

I want to implement a simple expression tree with plus and minus operations.
I implemented a class "Group" with the following function signatures:
-- type class Group a ---------------------------------------------------------
class (Eq a, Show a, Read a, Num a) => Group a where
  add :: a -> a -> a
  identity :: a
  invers :: a -> a
add takes two elements and returns their sum (e.g. 4 + 7 = 11), identity is a special element that leaves other elements unchanged (i.e. 0, because 3 + 0 = 3), and invers calculates the inverse of an element (e.g. the inverse of 3 is -3).
The instance of this class is for values of type Integer, so it looks like this:
-- Group Integer ----------------------------------------------------------------
instance Group Integer where
  add x y = x+y
  invers x = -x
  identity = 0
The expression tree should consist of the following data elements:
-- expression tree with values having Group property ---------------------------
data Expr a = Lit a | Invers (Expr a) | Add (Expr a) (Expr a) deriving (Eq, Read)
The Lit constructor gets an element of some type (e.g. an Integer value), Invers gets a sub expression and Add gets two sub expressions.
What I want to achieve now is to implement a function called "simplify". It should simplify any expression based on the following axioms:
x `add` identity = x
identity `add` x = x
x `add` (invers x) = identity
(invers x) `add` x = identity
What I've implemented so far is the following:
-- simplify --------------------------------------------------------------------
-- simplify simplifies expression trees applying Group laws as follows ---------
-- add x zero = x
-- add zero x = x
-- add x (minus x) = zero
-- add (minus x) x = zero
----- Match for any axiom ----
simplify :: (Group a) => Expr a -> Expr a
simplify (Add(Lit x) (Invers (Lit y))) | x == y = Lit identity
simplify (Invers (Invers (Lit x))) = Lit x
simplify (Invers (Lit identity)) = Lit identity
simplify (Add(Lit x) (Lit identity)) = Lit x
simplify (Add(Lit identity) (Lit x)) = Lit x
----- No axiom found, so call simplify recursively ----
simplify (Invers x) = simplify (x) -- x is a sub expression
simplify (Add x y) = Add (simplify x) (simplify y) -- x and/or y are sub expressions
My problem is that I cannot match for the "identity" element on the left side of the simplify function.
The line
simplify (Invers (Lit identity)) = Lit identity
would be the same as
simplify (Invers (Lit x)) = Lit x
since identity in a pattern is just a fresh variable that matches anything. Is there any way to match against the identity value of the "Group" class?
Thanks much.
Why not just simplify (Invers (Lit x)) | x == identity = Lit identity, or am I misunderstanding your problem? [bennofs]
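The guard idea from the comment can be carried through all four axioms. What follows is only a sketch under some assumptions: the helper name rewrite is hypothetical, Show is added to the derived instances so results can be printed, and children are simplified before the axioms are tried at the current node (which is a design choice, not something the question prescribes).

```haskell
class (Eq a, Show a, Read a, Num a) => Group a where
  add      :: a -> a -> a
  identity :: a
  invers   :: a -> a

instance Group Integer where
  add x y  = x + y
  invers x = -x
  identity = 0

-- Show is an addition for printing; the question derives only Eq and Read.
data Expr a = Lit a | Invers (Expr a) | Add (Expr a) (Expr a)
  deriving (Eq, Show, Read)

-- Simplify the children first, then try the axioms at the current node.
simplify :: Group a => Expr a -> Expr a
simplify (Invers e) = rewrite (Invers (simplify e))
simplify (Add x y)  = rewrite (Add (simplify x) (simplify y))
simplify e          = e

-- Hypothetical helper: one rewrite at the root. A pattern cannot mention
-- identity directly, so each clause binds a variable and compares in a guard.
-- When a guard fails, matching falls through to the next clause.
rewrite :: Group a => Expr a -> Expr a
rewrite (Invers (Invers e))                = e
rewrite (Invers (Lit x)) | x == identity   = Lit identity
rewrite (Add e (Lit x))  | x == identity   = e
rewrite (Add (Lit x) e)  | x == identity   = e
rewrite (Add e (Invers e')) | e == e'      = Lit identity
rewrite (Add (Invers e) e') | e == e'      = Lit identity
rewrite e                                  = e
```

Because the comparison uses the derived Eq instance on whole subexpressions, the inverse axioms in this sketch also fire when x is a compound expression rather than a literal.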

Type Matching in Haskell

If SomeType is defined as:
data SomeType = X {myBool :: Bool}
              | Y {myString :: String}
              | Z {myString :: String}
and I want to update an arbitrary value, depending on its constructor, as follows:
changeST :: SomeType -> SomeType
changeST (X b) = (X True)
changeST (Y s) = (Y "newString")
changeST (Z s) = (Z "newString")
The third and fourth lines do exactly the same thing: they update the string in the given value.
Is there any way to replace these two lines with a single one, e.g. by assigning the type to a variable?
Not by assigning the type to a variable, but by doing field replacement:
changeST :: SomeType -> SomeType
changeST (X b) = (X True)
changeST st = st { myString = "newString" }
This returns the same st as its argument, but with the value of the myString field replaced. It's one of the nice features of fields that you can do this without caring which data constructor it is, as long as it's one of the data constructors that uses myString.
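A small self-contained version of the record-update approach, for experimenting in GHCi. The deriving (Eq, Show) clause is an addition that is not in the question; it only makes the results printable and comparable.

```haskell
data SomeType
  = X { myBool   :: Bool }
  | Y { myString :: String }
  | Z { myString :: String }
  deriving (Eq, Show)  -- added for illustration only

changeST :: SomeType -> SomeType
changeST (X _) = X True
-- Only Y and Z reach this clause, and both carry a myString field,
-- so the record update keeps the constructor and replaces the string.
changeST st    = st { myString = "newString" }
```

Here changeST (Y "a") gives Y "newString" and changeST (Z "b") gives Z "newString", so the constructor is preserved without being named.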
You can use Scrap-Your-Boilerplate for this.
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Generics
data SomeType
  = X { myBool :: Bool }
  | Y { myString :: String }
  | Z { myString :: String }
  deriving (Data, Typeable)
changeST :: SomeType -> SomeType
changeST = everywhere (mkT (const True)) . everywhere (mkT (const "newString"))
This changeST changes every internal String in your structure to "newString" and every Bool to True.
I prefer Dan's solution, but pattern guards in GHC (standard in Haskell 2010) are a neat alternative to Michael's proposal:
{-# LANGUAGE PatternGuards #-}
changeST :: SomeType -> SomeType
changeST x | X _ <- x = X True
           | Y _ <- x = Y newString
           | Z _ <- x = Z newString
  where newString = "newString"
Your three definitions of changeST are separate from each other, so the short answer is "no". There are, however, at least two ways you can do this.
Pattern match both the Y and Z constructors at once:
You can combine the 2nd and 3rd definition by making your pattern matching more general:
changeST x = x { myString = "newString"}
This creates a new version of x, whether it be a Y or a Z, replacing the string member. You have to be careful when doing this, though. If you later rename the string field of Z, for example, you will get runtime pattern match failures when calling changeST with a Z argument.
Use a case expression:
If you combine your three definitions into one, you can share data between them.
changeST :: SomeType -> SomeType
changeST x = case x of
    X _ -> X True
    Y _ -> Y newString
    Z _ -> Z newString
  where
    newString = "newString"
