How to pattern match on Constructors in Haskell? - haskell

I have a state machine where states are implemented using a sum type. Posting a simplified version here:
data State =
A { value :: Int }
| B { value :: Int }
| C { other :: String }
most of my functions are monadic consuming States and doing some actions based on the type. Something like (this code doesn't compile):
f :: State -> m ()
f st= case st of
s#(A | B) -> withValueAction (value s)
C -> return ()
I know that I could unroll constructors like:
f :: State -> m ()
f st= case st of
A v -> withValueAction v
B v -> withValueAction v
C _ -> return ()
But that's a lot of boilerplate and brittle to changes. If I change the parameters to the constructor I need to rewrite all case .. of in my codebase.
So how would you pattern match on a subset of constructors and access a shared element?

One way to implement this idiomatically is to use a slightly different value function:
value :: State -> Maybe Int
value (A v) = Just v
value (B v) = Just v
value _ = Nothing
Then you can write your case using a pattern guard like this:
f st | Just v <- value st -> withValueAction v
f C{} = return ()
f _ = error "This should never happen"
Or you can simplify this a bit further using view patterns and even more with pattern synonyms:
{-# LANGUAGE ViewPatterns, PatternSynonyms #-}
pattern V :: Int -> State
pattern V x <- (value -> Just v)
{-# COMPLETE V, C #-}
f (V x) = withValueAction x
f C{} = return ()

#Noughtmare's answer demonstrates how you can use view patterns to get the right "pattern matching syntax". To auto-generate the value function that selects a shared field from several constructors, you can use lens, though this kind of requires buying into the whole Lens ecosystem. After:
{-# LANGUAGE TemplateHaskell #-}
import Control.Lens
import Control.Lens.TH
data State =
A { _value :: Int }
| B { _value :: Int }
| C { _other :: String }
makeLenses ''State
you will have a traversal value that can be used to access the partially shared field:
f :: (Monad m) => State -> m ()
f st = case st ^? value of
Just v -> withValueAction v
Nothing -> return ()

This is the solution I've picked at the end. My two main requirements were:
"Or" pattern matching over constructors
Selection of a subset of fields shared by the pattern match
As reported by #Noughtmare 1 is not possible at the moment https://github.com/ghc-proposals/ghc-proposals/pull/522.
Since for my problem the source of variability comes mostly from parameters in the constructors and not from the number of states, the solution I picked was to enable NamedFieldPuns extension, so the solution is something like:
f :: State -> m ()
f st= case st of
A {value} -> withValueAction value
B {value} -> withValueAction value
C {} -> return ()
It has some boilerplate enumerating constructors but at least it has none at the constructor parameters. I'll have a look at the view patterns maybe they are useful when the source of variability comes from the number of constructors and not the arguments.

Related

RankNTypes and pattern matching

Is there a way in haskell to erase type information/downcast to a polymorphic value ?
In the example I have a boxed type T which can contain either an Int or a Char
And I want to write a function which extract this value without knowing which type it is.
{#- LANGUAGE RankNTypes -#}
data T = I Int | C Char
-- This is not working because GHC cannot bind "a"
-- with specific type Int and Char at the same time.
-- I just want a polymorphic value back ;(
getValue :: T -> (forall a. a)
getValue (I val) = val
getValue (C val) = val
-- This on the other hand works, because the function
-- is local to the pattern matching expression
onValue :: T -> (forall a. a -> a) -> T
onValue (I val) f = I $ f val
onValue (C val) f = C $ f val
Is there a way to write a function that can extract this value without forcing a type at the end ?
a getValue function like the first one ?
Let me know if it is not clear enough.
Answer
So the question was stupid as AndrewC (in the comment) and YellPika pointed out. An infinite type has no meaning.
J. Abrahamson provides an explanation for what I am looking for, so I put his answer as the solution.
P.S: I do not want to use GADT as I do not want a new type each time.
What you probably want is not to return a value (forall a . a) as it is wrong on several fronts. For one, you do not have any value but instead just one of two. For two, such a type cannot exist in a well-behaved program: it corresponds to the type of infinite loops and exceptions, e.g. bottom.
Finally, such a type allows the person who owns it to make the choice to instantiate it more concretely. Since you're giving it to the caller of your function that means that they would get to choose which of an Int or Char you had. Clearly that doesn't make sense.
Instead, what you most likely want is to make a demand of the user of your function: "you have to work regardless of what this type is".
foo :: (forall a . a -> r) -> (T -> r)
foo f (I i) = f i
foo f (C c) = f c
You'll find this function to be really similar to the following
bar :: r -> T -> r
bar x (I _) = x
bar x (C _) = x
In other words, if you force the consumer of your function to disregard all type information then, well, actually nothing at all remains: e.g. a constant function.
You can use GADTs:
{-# LANGUAGE GADTs #-}
data T a where
I :: Int -> T Int
C :: Char -> T Char
getValue :: T a -> a
getValue (I i) = i
getValue (C c) = c
If you turn on ExistentialTypes, you can write:
data Anything = forall a. Anything a
getValue :: T -> Anything
getValue (I val) = Anything val
getValue (C val) = Anything val
However, this is pretty useless. Say we pattern match on an Anything:
doSomethingWith (Anything x) = ?
We don't know anything about x other than that it exists... (well, not even - it might be undefined). There's no type information, so we can't do anything with it.

Haskell: how to write a monadic variadic function, with parameters using the monadic context

I'm trying to make a variadic function with a monadic return type, whose parameters also require the monadic context. (I'm not sure how to describe that second point: e.g. printf can return IO () but it's different in that its parameters are treated the same whether it ends up being IO () or String.)
Basically, I've got a data constructor that takes, say, two Char parameters. I want to provide two pointer style ID Char arguments instead, which can be automagically decoded from an enclosing State monad via a type class instance. So, instead of doing get >>= \s -> foo1adic (Constructor (idGet s id1) (idGet s id2)), I want to do fooVariadic Constructor id1 id2.
What follows is what I've got so far, Literate Haskell style in case somebody wants to copy it and mess with it.
First, the basic environment:
> {-# LANGUAGE FlexibleContexts #-}
> {-# LANGUAGE FlexibleInstances #-}
> {-# LANGUAGE MultiParamTypeClasses #-}
> import Control.Monad.Trans.State
> data Foo = Foo0
> | Foo1 Char
> | Foo2 Bool Char
> | Foo3 Char Bool Char
> deriving Show
> type Env = (String,[Bool])
> newtype ID a = ID {unID :: Int}
> deriving Show
> class InEnv a where envGet :: Env -> ID a -> a
> instance InEnv Char where envGet (s,_) i = s !! unID i
> instance InEnv Bool where envGet (_,b) i = b !! unID i
Some test data for convenience:
> cid :: ID Char
> cid = ID 1
> bid :: ID Bool
> bid = ID 2
> env :: Env
> env = ("xy", map (==1) [0,0,1])
I've got this non-monadic version, which simply takes the environment as the first parameter. This works fine but it's not quite what I'm after. Examples:
$ mkFoo env Foo0 :: Foo
Foo0
$ mkFoo env Foo3 cid bid cid :: Foo
Foo3 'y' True 'y'
(I could use functional dependencies or type families to get rid of the need for the :: Foo type annotations. For now I'm not fussed about it, since this isn't what I'm interested in anyway.)
> mkFoo :: VarC a b => Env -> a -> b
> mkFoo = variadic
>
> class VarC r1 r2 where
> variadic :: Env -> r1 -> r2
>
> -- Take the partially applied constructor, turn it into one that takes an ID
> -- by using the given state.
> instance (InEnv a, VarC r1 r2) => VarC (a -> r1) (ID a -> r2) where
> variadic e f = \aid -> variadic e (f (envGet e aid))
>
> instance VarC Foo Foo where
> variadic _ = id
Now, I want a variadic function that runs in the following monad.
> type MyState = State Env
And basically, I have no idea how I should proceed. I've tried expressing the type class in different ways (variadicM :: r1 -> r2 and variadicM :: r1 -> MyState r2) but I haven't succeeded in writing the instances. I've also tried adapting the non-monadic solution above so that I somehow "end up" with an Env -> Foo which I could then easily turn into a MyState Foo, but no luck there either.
What follows is my best attempt thus far.
> mkFooM :: VarMC r1 r2 => r1 -> r2
> mkFooM = variadicM
>
> class VarMC r1 r2 where
> variadicM :: r1 -> r2
>
> -- I don't like this instance because it requires doing a "get" at each
> -- stage. I'd like to do it only once, at the start of the whole computation
> -- chain (ideally in mkFooM), but I don't know how to tie it all together.
> instance (InEnv a, VarMC r1 r2) => VarMC (a -> r1) (ID a -> MyState r2) where
> variadicM f = \aid -> get >>= \e -> return$ variadicM (f (envGet e aid))
>
> instance VarMC Foo Foo where
> variadicM = id
>
> instance VarMC Foo (MyState Foo) where
> variadicM = return
It works for Foo0 and Foo1, but not beyond that:
$ flip evalState env (variadicM Foo1 cid :: MyState Foo)
Foo1 'y'
$ flip evalState env (variadicM Foo2 cid bid :: MyState Foo)
No instance for (VarMC (Bool -> Char -> Foo)
(ID Bool -> ID Char -> MyState Foo))
(Here I would like to get rid of the need for the annotation, but the fact that this formulation needs two instances for Foo makes that problematic.)
I understand the complaint: I only have an instance that goes from Bool ->
Char -> Foo to ID Bool -> MyState (ID Char -> Foo). But I can't make the
instance it wants because I need MyState in there somewhere so that I can
turn the ID Bool into a Bool.
I don't know if I'm completely off track or what. I know that I could solve my basic issue (I don't want to pollute my code with the idGet s equivalents all over the place) in different ways, such as creating liftA/liftM-style functions for different numbers of ID parameters, with types like (a -> b -> ... -> z -> ret) -> ID a -> ID b -> ... -> ID z -> MyState ret, but I've spent too much time thinking about this. :-) I want to know what this variadic solution should look like.
WARNING
Preferably don't use variadic functions for this type of work. You only have a finite number of constructors, so smart constructors don't seem to be a big deal. The ~10-20 lines you would need are a lot simpler and more maintainable than a variadic solution. Also an applicative solution is much less work.
WARNING
The monad/applicative in combination with variadic functions is the problem. The 'problem' is the argument addition step used for the variadic class. The basic class would look like
class Variadic f where
func :: f
-- possibly with extra stuff
where you make it variadic by having instances of the form
instance Variadic BaseType where ...
instance Variadic f => Variadic (arg -> f) where ...
Which would break when you would start to use monads. Adding the monad in the class definition would prevent argument expansion (you would get :: M (arg -> f), for some monad M). Adding it to the base case would prevent using the monad in the expansion, as it's not possible (as far as I know) to add the monadic constraint to the expansion instance. For a hint to a complex solution see the P.S..
The solution direction of using a function which results in (Env -> Foo) is more promising. The following code still requires a :: Foo type constraint and uses a simplified version of the Env/ID for brevity.
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses, TypeFamilies #-}
module Test where
data Env = Env
data ID a = ID
data Foo
= Foo0
| Foo1 Char
| Foo2 Char Bool
| Foo3 Char Bool Char
deriving (Eq, Ord, Show)
class InEnv a where
resolve :: Env -> ID a -> a
instance InEnv Char where
resolve _ _ = 'a'
instance InEnv Bool where
resolve _ _ = True
The Type families extension is used to make the matching stricter/better. Now the variadic function class.
class MApp f r where
app :: Env -> f -> r
instance MApp Foo Foo where
app _ = id
instance (MApp r' r, InEnv a, a ~ b) => MApp (a -> r') (ID b -> r) where
app env f i = app env . f $ resolve env i
-- using a ~ b makes this instance to match more easily and
-- then forces a and b to be the same. This prevents ambiguous
-- ID instances when not specifying there type. When using type
-- signatures on all the ID's you can use
-- (MApp r' r, InEnv a) => MApp (a -> r') (ID a -> r)
-- as constraint.
The environment Env is explicitly passed, in essence the Reader monad is unpacked preventing the problems between monads and variadic functions (for the State monad the resolve function should return a new environment). Testing with app Env Foo1 ID :: Foo results in the expected Foo1 'a'.
P.S.
You can get monadic variadic functions to work (to some extent) but it requires bending your functions (and mind) in some very strange ways. The way I've got such things to work is to 'fold' all the variadic arguments into a heterogeneous list. The unwrapping can then be done monadic-ally. Though I've done some things like that, I strongly discourage you from using such things in actual (used) code as it quickly gets incomprehensible and unmaintainable (not to speak of the type errors you would get).

Challenged by types

The problem:
Currently I have a type WorkConfig, which looks like this
data WorkConfig = PhaseZero_wc BuildConfig
| PhaseOne_wc BuildConfig Filename (Maybe XMLFilepath)
| PhaseTwo_wc String
| SoulSucker_wc String
| ImageInjector_wc String
| ESX_wc String
| XVA_wc String
| VNX_wc String
| HyperV_wc String
| Finish_wc String
deriving Show
(I'm using String from PhaseTwo_wc on as a placeholder for what will actually be used)
I have a function updateConfig that takes a WorkConfig as one of it's parameters.
The problem is that I want to be able to enforce which constructor is used.
For example in the function phaseOne I want to be able to guarantee that when updateConfig is invoked, only the PhaseTwo_wc constructor can be used.
In order to use a type class for this enforcement, I would have to make separate data constructors, for example:
data PhaseOne_wc = PhaseOne_wc BuildConfig Filename (Maybe XMLFilepath)
If I go this route, I have another problem to solve. I have other data types that have WorkConfig as a value, what would I do to address this? For example,
type ConfigTracker = TMVar (Map CurrentPhase WorkConfig)
How can I use the type system for the enforcement I would like, while keeping in mind what I mentioned above?
ConfigTracker would have to be able to know which data type I wanted.
* Clarification:
I'm looking to restrict which WorkConfig that updateConfig may take as a parameter.
Your question was a little vague, so I will answer in the general.
If you have a type of the form:
data MyType a b c d e f g = C1 a b | C2 c | C3 e f g
... and you want some function f that works on all three constructors:
f :: MyType a b c d e f g -> ...
... but you want some function g that works on just the last constructor, then you have two choices.
The first option is to create a second type embedded within C3:
data SecondType e f g = C4 e f g
... and then embed that within the original C3 constructor:
data MyType a b c d e f g = C1 a b | C2 c | C3 (SecondType e f g)
... and make g a function of SecondType:
g :: SecondType e f g -> ...
This only slightly complicates the code for f as you will have to first unpack C3 to access the C4 constructor.
The second solution is that you just make g a function of the values stored in the C3 constructor:
g :: e -> f -> g -> ...
This requires no modification to f or the MyType type.
To be more concrete and drive the discussion, how close does this GADT & Existential code get to what you want?
{-# LANGUAGE GADTs, KindSignatures, DeriveDataTypeable, ExistentialQuantification, ScopedTypeVariables, StandaloneDeriving #-}
module Main where
import qualified Data.Map as Map
import Data.Typeable
import Data.Maybe
data Phase0 deriving(Typeable)
data Phase1 deriving(Typeable)
data Phase2 deriving(Typeable)
data WC :: * -> * where
Phase0_ :: Int -> WC Phase0
Phase1_ :: Bool -> WC Phase1
deriving (Typeable)
deriving instance Show (WC a)
data Some = forall a. Typeable a => Some (WC a)
deriving instance Show Some
things :: [Some]
things = [ Some (Phase0_ 6) , Some (Phase1_ True) ]
do'phase0 :: WC Phase0 -> WC Phase1
do'phase0 (Phase0_ i) = Phase1_ (even i)
-- Simplify by using TypeRep of the Phase* types as key
type M = Map.Map TypeRep Some
updateConfig :: forall a. Typeable a => WC a -> M -> M
updateConfig wc m = Map.insert key (Some wc) m
where key = typeOf (undefined :: a)
getConfig :: forall a. Typeable a => M -> Maybe (WC a)
getConfig m = case Map.lookup key m of
Nothing -> Nothing
Just (Some wc) -> cast wc
where key = typeOf (undefined :: a)
-- Specialization of updateConfig restricted to taking Phase0
updateConfig_0 :: WC Phase0 -> M -> M
updateConfig_0 = updateConfig
-- Example of processing from Phase0 to Phase1
process_0_1 :: WC Phase0 -> WC Phase1
process_0_1 (Phase0_ i) = (Phase1_ (even i))
main = do
print things
let p0 = Phase0_ 6
m1 = updateConfig p0 Map.empty
m2 = updateConfig (process_0_1 p0) m1
print m2
print (getConfig m2 :: Maybe (WC Phase0))
print (getConfig m2 :: Maybe (WC Phase1))
print (getConfig m2 :: Maybe (WC Phase2))
Sorry, this is more of an extended comment than an answer. I'm a little confused. It sounds like you have some functions like
phaseOne = ...
... (updateConfig ...) ...
phaseTwo = ...
... (updateConfig ...) ...
and you're trying to make sure that eg, inside the definition of phaseOne, this never appears:
phaseOne = ...
... (updateConfig $ PhaseTwo_wc ...) ...
But now I ask you: is updateConfig a pure (non-monadic) function? Because if it is, than phaseOne can easily be perfectly correct and still invoke updateConfig with a PhaseTwo_wc; ie it could just throw away the result (and even if it's monadic too):
phaseOne = ...
... (updateConfig $ PhaseTwo_wc ...) `seq` ...
In other words, I'm wondering if the constraint you're trying to enforce is really the actual property you are looking for?
But now, if we're thinking of monads, there is a common pattern that what you describe is sort of like: making "special" monads that limit the kind of actions that can be performed; eg
data PhaseOneMonad a = PhaseOnePure a | PhaseOneUpdate BuildConfig Filename (Maybe XMLFilepath) a
instance Monad PhaseOneMonad where ...
Is this maybe what you're getting at? Note also that PhaseOneUpdate doesn't take a WorkConfig; it just takes the constructor parameters it is interested in. This is another common pattern: you can't constrain which constructor is used, but you can just take the arguments directly.
Hm... still not sure though.

Sort by constructor ignoring (part of) value

Suppose I have
data Foo = A String Int | B Int
I want to take an xs :: [Foo] and sort it such that all the As are at the beginning, sorted by their strings, but with the ints in the order they appeared in the list, and then have all the Bs at the end, in the same order they appeared.
In particular, I want to create a new list containg the first A of each string and the first B.
I did this by defining a function taking Foos to (Int, String)s and using sortBy and groupBy.
Is there a cleaner way to do this? Preferably one that generalizes to at least 10 constructors.
Typeable, maybe? Something else that's nicer?
EDIT: This is used for processing a list of Foos that is used elsewhere. There is already an Ord instance which is the normal ordering.
You can use
sortBy (comparing foo)
where foo is a function that extracts the interesting parts into something comparable (e.g. Ints).
In the example, since you want the As sorted by their Strings, a mapping to Int with the desired properties would be too complicated, so we use a compound target type.
foo (A s _) = (0,s)
foo (B _) = (1,"")
would be a possible helper. This is more or less equivalent to Tikhon Jelvis' suggestion, but it leaves space for the natural Ord instance.
To make it easier to build comparison function for ADTs with large number of constructors, you can map values to their constructor index with SYB:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Generics
data Foo = A String Int | B Int deriving (Show, Eq, Typeable, Data)
cIndex :: Data a => a -> Int
cIndex = constrIndex . toConstr
Example:
*Main Data.Generics> cIndex $ A "foo" 42
1
*Main Data.Generics> cIndex $ B 0
2
Edit:After re-reading your question, I think the best option is to make Foo an instance of Ord. I do not think there is any way to do this automatically that will act the way you want (just using deriving will create different behavior).
Once Foo is an instance of Ord, you can just use sort from Data.List.
In your exact example, you can do something like this:
data Foo = A String Int | B Int deriving (Eq)
instance Ord Foo where
(A _ _) <= (B _) = True
(A s _) <= (A s' _) = s <= s'
(B _) <= (B _) = True
When something is an instance of Ord, it means the data type has some ordering. Once we know how to order something, we can use a bunch of existing functions (like sort) on it and it will behave how you want. Anything in Ord has to be part of Eq, which is what the deriving (Eq) bit does automatically.
You can also derive Ord. However, the behavior will not be exactly what you want--it will order by all of the fields if it has to (e.g. it will put As with the same string in order by their integers).
Further edit: I was thinking about it some more and realized my solution is probably semantically wrong.
An Ord instance is a statement about your whole data type. For example, I'm saying that Bs are always equal with each other when the derived Eq instance says otherwise.
If the data your representing always behaves like this (that is, Bs are all equal and As with the same string are all equal) then an Ord instance makes sense. Otherwise, you should not actually do this.
However, you can do something almost exactly like this: write your own special compare function (Foo -> Foo -> Ordering) that encapsulates exactly what you want to do then use sortBy. This properly codifies that your particular sorting is special rather than the natural ordering of the data type.
You could use some template haskell to fill in the missing transitive cases. The mkTransitiveLt creates the transitive closure of the given cases (if you order them least to greatest). This gives you a working less-than, which can be turned into a function that returns an Ordering.
{-# LANGUAGE TemplateHaskell #-}
import MkTransitiveLt
import Data.List (sortBy)
data Foo = A String Int | B Int | C | D | E deriving(Show)
cmp a b = $(mkTransitiveLt [|
case (a, b) of
(A _ _, B _) -> True
(B _, C) -> True
(C, D) -> True
(D, E) -> True
(A s _, A s' _) -> s < s'
otherwise -> False|])
lt2Ord f a b =
case (f a b, f b a) of
(True, _) -> LT
(_, True) -> GT
otherwise -> EQ
main = print $ sortBy (lt2Ord cmp) [A "Z" 1, A "A" 1, B 1, A "A" 0, C]
Generates:
[A "A" 1,A "A" 0,A "Z" 1,B 1,C]
mkTransitiveLt must be defined in a separate module:
module MkTransitiveLt (mkTransitiveLt)
where
import Language.Haskell.TH
mkTransitiveLt :: ExpQ -> ExpQ
mkTransitiveLt eq = do
CaseE e ms <- eq
return . CaseE e . reverse . foldl go [] $ ms
where
go ms m#(Match (TupP [a, b]) body decls) = (m:ms) ++
[Match (TupP [x, b]) body decls | Match (TupP [x, y]) _ _ <- ms, y == a]
go ms m = m:ms

Checking for a particular data constructor

Let's say that I defined my own data-Type like
data MyData = A arg| B arg2| C arg3
How would I write a function (for instance: isMyDataType) that checks wether the given argument is one out of the particular types in MyData and successively returns a boolean (True or False) , e.g. typing in Ghci:
isMyDataType B returns True and isMyDataType Int returns False.
I believe you want functions to test for particular constructors:
isA :: MyData -> Bool
isB :: MyData -> Bool
If so, then you can write these yourself or derive them. The implementation would look like:
isA (A _) = True
isA _ = False
isB (B _) = True
isB _ = False
To derive them automatically, just use the derive library and add, in your source code:
{-# LANGUAGE TemplateHaskell #-}
import Data.DeriveTH
data MyData = ...
deriving (Eq, Ord, Show}
derive makeIs ''MyData
-- Older GHCs require more syntax: $( derive makeIs ''MyData)
Also note: your data declaration is invalid, the name must be capitalized, MyData instead of myData.
Finally, this whole answer is based on the assumption you want to test constructors, not data types as you said (which are statically checked at compile time, as Tarrasch said).
Haskell always checks that the types makes sense. The compiler would complain immediately if you wrote isMyDataType 4, because 4 is not of type MyData, it's of type Int.
I'm not sure this is what you asked for, but either way I strongly suggest for you to try out what you've asked here in practice, so you can see for yourself. Most important is that you check out type signatures in haskell, it is key for learning haskell.
You can use Maybes. You can create a set of functions that check for each of the types
getA, getB, getC :: MyData a -> Maybe a
getA x = case x of {(A v) -> Just v; _ -> Nothing}
getB x = case x of {(B v) -> Just v; _ -> Nothing}
getC x = case x of {(C v) -> Just v; _ -> Nothing}
This affords some practical idioms for certain tasks:
allAs :: [MyData a] -> [a]
allAs xs = mapMaybe getA xs
printIfA :: Show a => MyData a -> IO ()
printIfA x = maybe (return ()) print $ getA x

Resources