Haskell record pattern matching - haskell

I'm looking for a way to simplify function patterns when the actual data is not required:
data X = A | B String | C Int Int String
myfn :: X -> Int
myfn A = 50
myfn (B _) = 200
myfn (C _ _ _) = 500
Is there a way to make a simpler pattern for matching C, just discarding the values?
hsdev adds a hint "Hint: use record patterns", but Google did not help me there.

You can use record patterns like this:
data X = A | B {name :: String} | C {x::Int, y::Int, name::String}
myfn :: X -> Int
myfn A = 50
myfn B{} = 200
myfn C{} = 500
Record patterns allow you to give names to the fields of the constructors.
You can also match on individual fields:
myfn C{name=n} = length n
So you can pattern match on only the specific field you need and ignore the rest.
Note: you can use the empty record pattern even with data types that do not use record syntax:
data A = A Int | B Int Int
myfn A{} = 1
myfn B{} = 2
This is fine.
There are a number of GHC extensions related to record patterns:
RecordWildCards allows you to write things like C{..} which is equivalent to the pattern: C{x=x, y=y, name=name}, i.e. it matches all fields and you now have in scope x with the value matched for the x field etc.
NamedFieldPuns allows you to write C{name} to be equivalent to C{name=name}, so that name is now in scope and contains the value matched for the name field.
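As a minimal sketch of the two extensions just described, reusing the X type from above (the function names sumC and nameLength are hypothetical, chosen only for illustration):

```haskell
{-# LANGUAGE NamedFieldPuns, RecordWildCards #-}

data X = A | B { name :: String } | C { x :: Int, y :: Int, name :: String }

-- RecordWildCards: C{..} binds x, y and name to the matched field values.
sumC :: X -> Int
sumC C{..} = x + y + length name
sumC _     = 0

-- NamedFieldPuns: B{name} is shorthand for B{name = name}.
nameLength :: X -> Int
nameLength B{name} = length name
nameLength C{name} = length name
nameLength A       = 0
```

Both extensions only affect how fields are bound in patterns and expressions; the data declaration itself is ordinary record syntax.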
Keep in mind that using record patterns doesn't prevent you from using your constructors in a positional way, so you can still write:
myfn (B _) = 200
It only adds functionality.

Related

Pattern matching against record syntax

Consider the following data definition:
data Foo = A{field::Int}
         | B{field::Int}
         | C
         | D
Now let's say we want to write a function that takes a Foo and increases field if it exists, and leaves it unchanged otherwise:
incFoo :: Foo -> Foo
incFoo A{field=n} = A{field=n+1}
incFoo B{field=n} = B{field=n+1}
incFoo x = x
This naive approach leads to some code duplication. But the fact that both A and B share field allows us to rewrite it:
incFoo :: Foo -> Foo
incFoo x | hasField x, n <- field x = x{field=n+1}
incFoo x = x
hasField A{} = True
hasField B{} = True
hasField _ = False
Less elegant, but that's definitely easier to maintain when the actual manipulation is complex. The key feature here is x{field=n+1}: record syntax allows us to "update" field without specifying x's constructor. Considering this, I'd expect something similar to the following syntax (which is not supported):
incFoo :: Foo -> Foo
incFoo x{field=n} = x{field=n+1}
incFoo x = x
I've considered using View Patterns, but since field is a partial function (field C raises an error) it'll require wrapping it in more boilerplate code.
So my question is: why is there no support for the above syntax, and is there an elegant way of implementing similar behavior?
Thanks!
You can't do this because in Haskell, records are inherently second class: they must always be wrapped in a constructor. So to make this work as intended, you must either write an individual case for each constructor or use a record substitute.
One possible solution is to use lenses. I'll use the lens package's implementation, since lens-family-th doesn't seem to handle duplicate field names.
{-# LANGUAGE TemplateHaskell #-}
import Control.Lens
data Foo = A {_f :: Int}
         | B {_f :: Int}
         deriving Show
makeLenses ''Foo
foo :: Foo -> Foo
foo = f %~ (+1)
And then we can use it like this:
> foo (A 1)
A {_f = 2}
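For comparison, here is a sketch that avoids the lens dependency, assuming a hand-written total accessor is acceptable. It applies to the four-constructor Foo from the question; fieldMaybe is a hypothetical helper, not a library function. It wraps the partial selector field so that it can be used safely in a view pattern:

```haskell
{-# LANGUAGE ViewPatterns #-}

data Foo = A { field :: Int }
         | B { field :: Int }
         | C
         | D
         deriving (Show, Eq)

-- Hypothetical total accessor: Nothing for constructors without the field.
fieldMaybe :: Foo -> Maybe Int
fieldMaybe A{field = n} = Just n
fieldMaybe B{field = n} = Just n
fieldMaybe _            = Nothing

-- The view pattern applies fieldMaybe to the argument and matches the result,
-- so the record update only runs for constructors that carry the field.
incFoo :: Foo -> Foo
incFoo x@(fieldMaybe -> Just n) = x { field = n + 1 }
incFoo x                        = x
```

The per-constructor cases still exist, but they are confined to fieldMaybe, so every function that manipulates field can share them.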

Omitting constructor arguments in Haskell case statements

Omitting function arguments is a nice tool for concise Haskell code.
h :: String -> Int
h = (4 +) . length
What about omitting data constructor arguments in case statements? The following code might be considered a little grungy: s and i are the final arguments of A and B, but they are repeated as the final arguments in the body of each case match.
f :: Foo -> Int
f = \case
    A s -> 4 + length s
    B i -> 2 + id i
Is there a way to omit such arguments in case pattern matching? For constructors with a large number of arguments, this would radically shorten code width. E.g. the following pseudo code.
g :: Foo -> Int
g = \case
    {- match `A` constructor -> function application to A's arguments -}
    A -> (4 +) . length
    {- match `B` constructor -> function application to B's arguments -}
    B -> (2 +) . id
The GHC extension RecordWildCards lets you concisely bring all the fields of a constructor into scope (of course, this requires you to give names to those fields).
{-# LANGUAGE LambdaCase, RecordWildCards #-}
data Foo = Foo {field1, field2 :: Int} | Bar {field1 :: Int}
baz = \case
    Foo{..} -> 4 + field2
    Bar{..} -> 2 + field1
-- plus it also "sucks in" fields from a scope
mkBar400 = let field1 = 400 in Bar{..}
You can always refactor case statements on constructors into a single function so that from then on you only pass your concise function definitions as arguments to these specific functions. Allow me to illustrate.
Consider the Maybe a datatype:
data Maybe a = Nothing | Just a
Should you now need to define a function f :: Maybe a -> b (for some fixed b and perhaps also a), instead of writing it like
f Nothing = this
f (Just x) = that x
you could start by first defining a function
maybe f _ Nothing = f
maybe _ g (Just x) = g x
and then f can be defined as maybe this that. This is much the same as what happens with all the familiar recursion patterns.
This way you're effectively refactoring out case statements. The code gets arguably cleaner and it does not require language extensions.
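Applied to the question's example, the same idea might look like the sketch below. It assumes Foo has the two constructors A String and B Int (inferred from the case alternatives in the question); foldFoo is a hypothetical name for the eliminator.

```haskell
-- Assumed shape of the question's type: one String constructor, one Int constructor.
data Foo = A String | B Int

-- Hypothetical eliminator: one handler per constructor, in declaration order.
foldFoo :: (String -> r) -> (Int -> r) -> Foo -> r
foldFoo onA _   (A s) = onA s
foldFoo _   onB (B i) = onB i

-- The constructor arguments are never named here, as in the pseudo code.
g :: Foo -> Int
g = foldFoo ((4 +) . length) ((2 +) . id)
```

The pattern match is written once in foldFoo, and each function defined through it stays point-free.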

Haskell function that returns arbitrary number of fields as list

I want to write a Haskell function that takes a custom type with eleven fields and returns either a list of all the fields' values, or a map associating the fields' names with their values. I don't want to have to explicitly get every field because that would be verbose and less versatile. Is there any way to do this?
What you write would be possible to some degree, but it wouldn't be very useful.
Let's imagine we insist on writing this function for a moment. Given that the fields' values may have different types, you probably rather want to yield a tuple. I.e.
data MyType = MyType Int String Bool
getFields :: MyType -> (Int, String, Bool)
getFields (MyType a b c) = (a,b,c)
So you could now call it like
let v = MyType 1 "Hello" True
let (x, y, z) = getFields v
Now, this isn't actually very useful, because you could use pattern matching in all of these cases, e.g.
let v = MyType 1 "Hello" True
let (MyType x y z) = v
Alright, but what if you wanted to address individual fields? Like
let x = fst (getFields v)
...how to do that without a 'getFields' function? Well, you can simply assign field names (as you probably already did):
data MyType = MyType
    { i :: Int
    , s :: String
    , b :: Bool
    }
Now you get functions for accessing individual fields for free:
let x = i v
...since assigning names to fields actually generates functions like i :: MyType -> Int or s :: MyType -> String.
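If the field names themselves are needed at run time, one possible sketch uses Data.Data from base. The helper names fieldNames and fieldMap are hypothetical. Because the fields have different types, the values are rendered with show here, which is an assumption about what the "map" in the question should contain:

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data, constrFields, toConstr)

data MyType = MyType
    { i :: Int
    , s :: String
    , b :: Bool
    } deriving (Show, Data)

-- Field names of the constructor the value was built with, in declaration order.
fieldNames :: MyType -> [String]
fieldNames = constrFields . toConstr

-- Pair each name with its value rendered as a String. The accessors are still
-- listed by hand, so this list must be kept in sync with the declaration.
fieldMap :: MyType -> [(String, String)]
fieldMap v = zip (fieldNames v) [show (i v), show (s v), show (b v)]
```

For example, fieldNames (MyType 1 "Hello" True) evaluates to ["i","s","b"].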

Recursive data structures in haskell: prolog-like terms

I have a question about recursive data structures in Haskell (language that I'm currently trying to learn).
I would like to encode in Haskell Prolog-like terms, but every solution I came up with has different drawbacks that I would really like to avoid. I would like to find a cheap and elegant way of encoding a BNF grammar in Haskell types, if you wish to see my problem from this perspective.
Just as a reminder, some Prolog terms could be male, sum(2, 3.1, 5.1), btree(btree(0, 1), Variable).
Solution 1
data Term = SConst String
          | IConst Integer
          | FConst Double
          | Var String
          | Predicate {predName :: String, predArgs :: [Term]}
With this solution I can have nested predicates (since predArgs are Term), but I can't distinguish predicates from other terms in type signatures.
Solution 2
data Term = SConst String
          | IConst Integer
          | FConst Double
          | Var String
data Predicate = Predicate {predName :: String, predArgs :: [Either Term Predicate]}
In this variant I can clearly distinguish predicates from basic terms, but the Either type in the predArgs list can be quite a nuisance to manage later in the code (I think... I'm new to Haskell).
Solution 3
data Term = SConst String
          | IConst Integer
          | FConst Double
          | Var String
          | Struct String [Term]
data Predicate = Predicate String [Term]
With this last solution, I split terms in two different types as before, but this time I avoid Either Term Predicate adding a Struct constructor in Term with basically the same semantics as Predicate.
It's just like solution 1 with two predicate constructors for terms. One is recursion-enabled, Struct, and the other one, Predicate is to be able to distinguish between predicates and regular terms.
The problem with this attempt is that Struct and Predicate are structurally equivalent and have almost the same meaning, but I will not be able to write functions that work, for example, on both (Predicate "p" []) and (Struct "p" []).
So again my question is: please, is there a better way to encode my predicates and terms such that:
I'm able to distinguish between predicate and terms in type signatures;
nested predicates like p(q(1), r(q(3), q(4))) are supported;
I can write functions that will work uniformly on predicates, without any distinction like the one in solution #3?
Please feel free to ask me for further clarifications should you need any.
Thank you very much.
You could add a term constructor to wrap a predicate. Here, I also factored all of the literals into their own data type:
data Term = TLit Literal
          | TVar String
          | TPred Predicate
data Literal = LitS String
             | LitI Int
             | LitF Double
data Predicate = Predicate String [Term]
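A short sketch of how this encoding covers the three requirements in the question. The value example and the function predNames are hypothetical, added only to show a nested term and a function whose signature mentions Predicate rather than Term:

```haskell
data Term = TLit Literal
          | TVar String
          | TPred Predicate
          deriving (Show, Eq)

data Literal = LitS String
             | LitI Int
             | LitF Double
             deriving (Show, Eq)

data Predicate = Predicate String [Term]
    deriving (Show, Eq)

-- The nested predicate p(q(1), r(q(3), q(4))) from the question.
example :: Predicate
example =
    Predicate "p"
        [ TPred (Predicate "q" [TLit (LitI 1)])
        , TPred (Predicate "r"
            [ TPred (Predicate "q" [TLit (LitI 3)])
            , TPred (Predicate "q" [TLit (LitI 4)])
            ])
        ]

-- Accepts only predicates; collects predicate names, outermost first.
predNames :: Predicate -> [String]
predNames (Predicate name args) = name : concatMap termPreds args
  where
    termPreds (TPred p) = predNames p
    termPreds _         = []
```

Top-level and nested predicates share the single Predicate type, so predNames treats them uniformly.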
Here's one way (that's probably not worth the trouble):
{-# LANGUAGE EmptyDataDecls #-}
-- 'T' and 'F' are short for 'True' and 'False'
data T = T
data F
-- 'p' is short for 'mayNotBeAPredicate'
data Term p
    = SConst !p String
    | IConst !p Integer
    | FConst !p Double
    | Var !p String
    | Predicate {predName :: String, predArgs :: [Term T]}
sconst :: String -> Term T
iconst :: Integer -> Term T
fconst :: Double -> Term T
var :: String -> Term T
predicate :: String -> [Term T] -> Term p
sconst = SConst T
iconst = IConst T
fconst = FConst T
var = Var T
predicate = Predicate
checkPredicate :: Term p -> Maybe (Term F)
checkPredicate (Predicate name args) = Just (Predicate name args)
checkPredicate _ = Nothing
forgetPredicate :: Term p -> Term T
forgetPredicate (SConst _ s) = sconst s
forgetPredicate (IConst _ i) = iconst i
forgetPredicate (FConst _ f) = fconst f
forgetPredicate (Var _ s) = var s
forgetPredicate (Predicate name args) = predicate name args
You can now write functions which only accept predicates by giving them an input type of Term F, and functions which accept any input type by giving them an input type of Term p.
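As a usage sketch, here is a trimmed copy of the type above (only the Var and Predicate constructors are kept, to stay short) with a hypothetical function arity that accepts predicates only:

```haskell
{-# LANGUAGE EmptyDataDecls #-}

data T = T
data F

-- Trimmed version of the Term type above, for illustration only.
data Term p
    = Var !p String
    | Predicate { predName :: String, predArgs :: [Term T] }

checkPredicate :: Term p -> Maybe (Term F)
checkPredicate (Predicate name args) = Just (Predicate name args)
checkPredicate _                     = Nothing

-- Hypothetical predicate-only function. F has no constructors and the tag
-- field of Var is strict, so a fully defined Term F is always a Predicate
-- and the partial selector predArgs is safe here.
arity :: Term F -> Int
arity t = length (predArgs t)

-- Accepts any term: returns the arity if the term is a predicate.
arityOf :: Term p -> Maybe Int
arityOf = fmap arity . checkPredicate
```

Callers holding an arbitrary term go through checkPredicate (or arityOf), while functions typed on Term F can assume they received a predicate.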

Can you pattern match constructors on a type class constrained parameter?

See the code example below. It won't compile. I had thought that maybe it's because the first parameter of the test function has to have a single type. But that doesn't make sense, because if I remove the pattern matching so that it compiles, I can call it with both MyObj11 5 and MyObj21 5, which are of two different types.
So what is it that prevents you from pattern matching on constructors with a type class constrained parameter? Or is there some mechanism by which you can?
class SomeClass a where toString :: a -> String
instance SomeClass MyType1 where toString v = "MyType1"
instance SomeClass MyType2 where toString v = "MyType2"
data MyType1 = MyObj11 Int | MyObj12 Int Int
data MyType2 = MyObj21 Int | MyObj22 Int Int
test :: SomeClass a => a -> String
test (MyObj11 x) = "11"
test (MyObj12 x y) = "12" -- Error here if remove 3rd line: rigid type bound error
test (MyObj22 x y) = "22" -- Error here about not match MyType1.
what is it that restricts so you can't pattern match on constructors with a type class constrained parameter?
When you pattern match on an explicit constructor, you commit to a specific data type representation. This data type is not shared among all instances of the class, and so it is simply not possible to write a function that works for all instances in this way.
Instead, you need to associate the different behaviors you want with each instance, like so:
class C a where
    toString :: a -> String
    draw :: a -> String
instance C MyType1 where
    toString v = "MyType1"
    draw (MyObj11 x) = "11"
    draw (MyObj12 x y) = "12"
instance C MyType2 where
    toString v = "MyType2"
    draw (MyObj22 x y) = "22"
data MyType1 = MyObj11 Int | MyObj12 Int Int
data MyType2 = MyObj21 Int | MyObj22 Int Int
test :: C a => a -> String
test x = draw x
The branches of your original test function are now distributed amongst the instances.
Some alternative tricks involve using class-associated data types (where you prove to the compiler that a data type is shared amongst all instances), or view patterns (which let you generalize pattern matching).
View patterns
We can use view patterns to clean up the connection between pattern matching and type class instances, a little, allowing us to approximate pattern matching across instances by pattern matching on a shared type.
Here's an example, where we write one function, with two cases, that lets us pattern match against anything in the class.
{-# LANGUAGE ViewPatterns #-}
class C a where
    view :: a -> View
data View = One Int
          | Two Int Int
data MyType1 = MyObj11 Int | MyObj12 Int Int
instance C MyType1 where
    view (MyObj11 n) = One n
    view (MyObj12 n m) = Two n m
data MyType2 = MyObj21 Int | MyObj22 Int Int
instance C MyType2 where
    view (MyObj21 n) = One n
    view (MyObj22 n m) = Two n m
test :: C a => a -> String
test (view -> One n) = "One " ++ show n
test (view -> Two n m) = "Two " ++ show n ++ show m
Note how the -> syntax lets us call back to the right view function in each instance, looking up a custom data type encoding per-type, in order to pattern match on it.
The design challenge is to come up with a view type that captures all the behavior variants you're interested in.
In your original question, you wanted every constructor to have a different behavior, so there's actually no reason to use a view type (dispatching directly to that behavior in each instance already works well enough).

Resources