Can I match a data constructor wildcard in Haskell? - haskell

I have
data Foo = X (String, Int) | A String | B String | C String | D String -- ...
and have defined
f (X (s, _)) =s
f (A s) = s
f (B s) = s
f (C s) = s
f (D s) = s
-- ...
but would prefer to be able to write something like
f (X (s, _)) =s
f (_ s) = s
But there seems to be no way to do this (I get a "parse error" associated with the _).
Is there a way to match a data constructor wildcard in Haskell?

Nope. But you can write this:
data Foo
= X { name :: String, age :: Int }
| A { name :: String }
| B { name :: String }
| C { name :: String }
| D { name :: String }
and then have name :: Foo -> String. You could also consider this:
data Tag = A | B | C | D | X Int
data Foo = Tagged Tag String
f (Tagged _ s) = s

In addition to #DanielWagner's answer, an alternative is to use Scrap your boilerplate (AKA "SYB"). It allows you to find the first subterm of a given type. So you could define
{-# LANGUAGE DeriveDataTypeable #-}
import Control.Monad
import Data.Data
import Data.Generics.Schemes
import Data.Typeable
data Foo = X (String, Int) | A String | B String | C String | D String
deriving (Show, Eq, Ord, Data, Typeable)
fooString :: Foo -> Maybe String
fooString = msum . gmapQ cast
and fooString will return the first String argument of your constructors. Function cast filters out Strings and gmapQ gets the filtered values for all immediate subterms.
However, this won't return the String from X, because X has no immediate String subterm, it has only one subterm of type (String, Int). In order to get the first String anywhere in the term hierarchy, you could use everywhere:
fooString' :: Foo -> Maybe String
fooString' = everything mplus cast
Note that this approach is slightly fragile: It includes simply all Strings it finds, which might not be always what you want, in particular, if you later extend your data type (or some data type that it references).

Related

How to pattern match on Constructors in Haskell?

I have a state machine where states are implemented using a sum type. Posting a simplified version here:
data State =
A { value :: Int }
| B { value :: Int }
| C { other :: String }
most of my functions are monadic consuming States and doing some actions based on the type. Something like (this code doesn't compile):
f :: State -> m ()
f st= case st of
s#(A | B) -> withValueAction (value s)
C -> return ()
I know that I could unroll constructors like:
f :: State -> m ()
f st= case st of
A v -> withValueAction v
B v -> withValueAction v
C _ -> return ()
But that's a lot of boilerplate and brittle to changes. If I change the parameters to the constructor I need to rewrite all case .. of in my codebase.
So how would you pattern match on a subset of constructors and access a shared element?
One way to implement this idiomatically is to use a slightly different value function:
value :: State -> Maybe Int
value (A v) = Just v
value (B v) = Just v
value _ = Nothing
Then you can write your case using a pattern guard like this:
f st | Just v <- value st -> withValueAction v
f C{} = return ()
f _ = error "This should never happen"
Or you can simplify this a bit further using view patterns and even more with pattern synonyms:
{-# LANGUAGE ViewPatterns, PatternSynonyms #-}
pattern V :: Int -> State
pattern V x <- (value -> Just v)
{-# COMPLETE V, C #-}
f (V x) = withValueAction x
f C{} = return ()
#Noughtmare's answer demonstrates how you can use view patterns to get the right "pattern matching syntax". To auto-generate the value function that selects a shared field from several constructors, you can use lens, though this kind of requires buying into the whole Lens ecosystem. After:
{-# LANGUAGE TemplateHaskell #-}
import Control.Lens
import Control.Lens.TH
data State =
A { _value :: Int }
| B { _value :: Int }
| C { _other :: String }
makeLenses ''State
you will have a traversal value that can be used to access the partially shared field:
f :: (Monad m) => State -> m ()
f st = case st ^? value of
Just v -> withValueAction v
Nothing -> return ()
This is the solution I've picked at the end. My two main requirements were:
"Or" pattern matching over constructors
Selection of a subset of fields shared by the pattern match
As reported by #Noughtmare 1 is not possible at the moment https://github.com/ghc-proposals/ghc-proposals/pull/522.
Since for my problem the source of variability comes mostly from parameters in the constructors and not from the number of states, the solution I picked was to enable NamedFieldPuns extension, so the solution is something like:
f :: State -> m ()
f st= case st of
A {value} -> withValueAction value
B {value} -> withValueAction value
C {} -> return ()
It has some boilerplate enumerating constructors but at least it has none at the constructor parameters. I'll have a look at the view patterns maybe they are useful when the source of variability comes from the number of constructors and not the arguments.

How to create a data type containing a list

I'm trying to create a data type class that contains a list:
data Test = [Int] deriving(Show)
But Haskell can't parse the constructor. What am i doing wrong here and how can I best achieve what I'm trying to do?
An Answer
You need to include a constructor, which you haven't done.
data Test = Test [Int]
Consider reviewing Haskell's several type declarations, their use, and their syntax.
Haskell Type Declarations
data
Allows declaration of zero or more constructors each with zero or more fields.
newtype
Allows declaration of one constructor with one field. This field is strict.
type
Allows creation of a type alias which can be textually interchanged with the type to the right of the equals at any use.
constructor
Allows creation of a value of the declared type. Also allows decomposition of values to obtain the individual fields (via pattern matching)
Examples
data
data Options = OptionA Int | OptionB Integer | OptionC PuppyAges
^ ^ ^ ^ ^ ^ ^
| | Field | | | |
type Constructor | | Constructor |
Constructor Field Field
deriving (Show)
myPuppies = OptionB 1234
newtype
newtype PuppyAges = AgeList [Int] deriving (Show)
myPuppies :: PuppyAges
myPuppies = AgeList [1,2,3,4]
Because the types (PuppyAges) and the constructors (AgeList) are in different namespaces people can and often do use the same name for each, such as newtype X = X [Int].
type
type IntList = [Int]
thisIsThat :: IntList -> [Int]
thisIsThat x = x
constructors (more)
option_is_a_puppy_age :: Options -> Bool
option_is_a_puppy_age (OptionC _) = True
option_is_a_puppy_age () = False
option_has_field_of_value :: Options -> Int -> Bool
option_has_field_of_value (OptionA a) x = x == a
option_has_field_of_value (OptionB b) x = fromIntegral x == b
option_has_field_of_value (OptionC (AgeList cs)) x = x `elem` cs
increment_option_a :: Options -> Options
increment_option_a (OptionA a) = OptionA (a+1)
increment_option_a x = x

Challenged by types

The problem:
Currently I have a type WorkConfig, which looks like this
data WorkConfig = PhaseZero_wc BuildConfig
| PhaseOne_wc BuildConfig Filename (Maybe XMLFilepath)
| PhaseTwo_wc String
| SoulSucker_wc String
| ImageInjector_wc String
| ESX_wc String
| XVA_wc String
| VNX_wc String
| HyperV_wc String
| Finish_wc String
deriving Show
(I'm using String from PhaseTwo_wc on as a placeholder for what will actually be used)
I have a function updateConfig that takes a WorkConfig as one of it's parameters.
The problem is that I want to be able to enforce which constructor is used.
For example in the function phaseOne I want to be able to guarantee that when updateConfig is invoked, only the PhaseTwo_wc constructor can be used.
In order to use a type class for this enforcement, I would have to make separate data constructors, for example:
data PhaseOne_wc = PhaseOne_wc BuildConfig Filename (Maybe XMLFilepath)
If I go this route, I have another problem to solve. I have other data types that have WorkConfig as a value, what would I do to address this? For example,
type ConfigTracker = TMVar (Map CurrentPhase WorkConfig)
How can I use the type system for the enforcement I would like, while keeping in mind what I mentioned above?
ConfigTracker would have to be able to know which data type I wanted.
* Clarification:
I'm looking to restrict which WorkConfig that updateConfig may take as a parameter.
Your question was a little vague, so I will answer in the general.
If you have a type of the form:
data MyType a b c d e f g = C1 a b | C2 c | C3 e f g
... and you want some function f that works on all three constructors:
f :: MyType a b c d e f g -> ...
... but you want some function g that works on just the last constructor, then you have two choices.
The first option is to create a second type embedded within C3:
data SecondType e f g = C4 e f g
... and then embed that within the original C3 constructor:
data MyType a b c d e f g = C1 a b | C2 c | C3 (SecondType e f g)
... and make g a function of SecondType:
g :: SecondType e f g -> ...
This only slightly complicates the code for f as you will have to first unpack C3 to access the C4 constructor.
The second solution is that you just make g a function of the values stored in the C3 constructor:
g :: e -> f -> g -> ...
This requires no modification to f or the MyType type.
To be more concrete and drive the discussion, how close does this GADT & Existential code get to what you want?
{-# LANGUAGE GADTs, KindSignatures, DeriveDataTypeable, ExistentialQuantification, ScopedTypeVariables, StandaloneDeriving #-}
module Main where
import qualified Data.Map as Map
import Data.Typeable
import Data.Maybe
data Phase0 deriving(Typeable)
data Phase1 deriving(Typeable)
data Phase2 deriving(Typeable)
data WC :: * -> * where
Phase0_ :: Int -> WC Phase0
Phase1_ :: Bool -> WC Phase1
deriving (Typeable)
deriving instance Show (WC a)
data Some = forall a. Typeable a => Some (WC a)
deriving instance Show Some
things :: [Some]
things = [ Some (Phase0_ 6) , Some (Phase1_ True) ]
do'phase0 :: WC Phase0 -> WC Phase1
do'phase0 (Phase0_ i) = Phase1_ (even i)
-- Simplify by using TypeRep of the Phase* types as key
type M = Map.Map TypeRep Some
updateConfig :: forall a. Typeable a => WC a -> M -> M
updateConfig wc m = Map.insert key (Some wc) m
where key = typeOf (undefined :: a)
getConfig :: forall a. Typeable a => M -> Maybe (WC a)
getConfig m = case Map.lookup key m of
Nothing -> Nothing
Just (Some wc) -> cast wc
where key = typeOf (undefined :: a)
-- Specialization of updateConfig restricted to taking Phase0
updateConfig_0 :: WC Phase0 -> M -> M
updateConfig_0 = updateConfig
-- Example of processing from Phase0 to Phase1
process_0_1 :: WC Phase0 -> WC Phase1
process_0_1 (Phase0_ i) = (Phase1_ (even i))
main = do
print things
let p0 = Phase0_ 6
m1 = updateConfig p0 Map.empty
m2 = updateConfig (process_0_1 p0) m1
print m2
print (getConfig m2 :: Maybe (WC Phase0))
print (getConfig m2 :: Maybe (WC Phase1))
print (getConfig m2 :: Maybe (WC Phase2))
Sorry, this is more of an extended comment than an answer. I'm a little confused. It sounds like you have some functions like
phaseOne = ...
... (updateConfig ...) ...
phaseTwo = ...
... (updateConfig ...) ...
and you're trying to make sure that eg, inside the definition of phaseOne, this never appears:
phaseOne = ...
... (updateConfig $ PhaseTwo_wc ...) ...
But now I ask you: is updateConfig a pure (non-monadic) function? Because if it is, than phaseOne can easily be perfectly correct and still invoke updateConfig with a PhaseTwo_wc; ie it could just throw away the result (and even if it's monadic too):
phaseOne = ...
... (updateConfig $ PhaseTwo_wc ...) `seq` ...
In other words, I'm wondering if the constraint you're trying to enforce is really the actual property you are looking for?
But now, if we're thinking of monads, there is a common pattern that what you describe is sort of like: making "special" monads that limit the kind of actions that can be performed; eg
data PhaseOneMonad a = PhaseOnePure a | PhaseOneUpdate BuildConfig Filename (Maybe XMLFilepath) a
instance Monad PhaseOneMonad where ...
Is this maybe what you're getting at? Note also that PhaseOneUpdate doesn't take a WorkConfig; it just takes the constructor parameters it is interested in. This is another common pattern: you can't constrain which constructor is used, but you can just take the arguments directly.
Hm... still not sure though.

Sort by constructor ignoring (part of) value

Suppose I have
data Foo = A String Int | B Int
I want to take an xs :: [Foo] and sort it such that all the As are at the beginning, sorted by their strings, but with the ints in the order they appeared in the list, and then have all the Bs at the end, in the same order they appeared.
In particular, I want to create a new list containg the first A of each string and the first B.
I did this by defining a function taking Foos to (Int, String)s and using sortBy and groupBy.
Is there a cleaner way to do this? Preferably one that generalizes to at least 10 constructors.
Typeable, maybe? Something else that's nicer?
EDIT: This is used for processing a list of Foos that is used elsewhere. There is already an Ord instance which is the normal ordering.
You can use
sortBy (comparing foo)
where foo is a function that extracts the interesting parts into something comparable (e.g. Ints).
In the example, since you want the As sorted by their Strings, a mapping to Int with the desired properties would be too complicated, so we use a compound target type.
foo (A s _) = (0,s)
foo (B _) = (1,"")
would be a possible helper. This is more or less equivalent to Tikhon Jelvis' suggestion, but it leaves space for the natural Ord instance.
To make it easier to build comparison function for ADTs with large number of constructors, you can map values to their constructor index with SYB:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Generics
data Foo = A String Int | B Int deriving (Show, Eq, Typeable, Data)
cIndex :: Data a => a -> Int
cIndex = constrIndex . toConstr
Example:
*Main Data.Generics> cIndex $ A "foo" 42
1
*Main Data.Generics> cIndex $ B 0
2
Edit:After re-reading your question, I think the best option is to make Foo an instance of Ord. I do not think there is any way to do this automatically that will act the way you want (just using deriving will create different behavior).
Once Foo is an instance of Ord, you can just use sort from Data.List.
In your exact example, you can do something like this:
data Foo = A String Int | B Int deriving (Eq)
instance Ord Foo where
(A _ _) <= (B _) = True
(A s _) <= (A s' _) = s <= s'
(B _) <= (B _) = True
When something is an instance of Ord, it means the data type has some ordering. Once we know how to order something, we can use a bunch of existing functions (like sort) on it and it will behave how you want. Anything in Ord has to be part of Eq, which is what the deriving (Eq) bit does automatically.
You can also derive Ord. However, the behavior will not be exactly what you want--it will order by all of the fields if it has to (e.g. it will put As with the same string in order by their integers).
Further edit: I was thinking about it some more and realized my solution is probably semantically wrong.
An Ord instance is a statement about your whole data type. For example, I'm saying that Bs are always equal with each other when the derived Eq instance says otherwise.
If the data your representing always behaves like this (that is, Bs are all equal and As with the same string are all equal) then an Ord instance makes sense. Otherwise, you should not actually do this.
However, you can do something almost exactly like this: write your own special compare function (Foo -> Foo -> Ordering) that encapsulates exactly what you want to do then use sortBy. This properly codifies that your particular sorting is special rather than the natural ordering of the data type.
You could use some template haskell to fill in the missing transitive cases. The mkTransitiveLt creates the transitive closure of the given cases (if you order them least to greatest). This gives you a working less-than, which can be turned into a function that returns an Ordering.
{-# LANGUAGE TemplateHaskell #-}
import MkTransitiveLt
import Data.List (sortBy)
data Foo = A String Int | B Int | C | D | E deriving(Show)
cmp a b = $(mkTransitiveLt [|
case (a, b) of
(A _ _, B _) -> True
(B _, C) -> True
(C, D) -> True
(D, E) -> True
(A s _, A s' _) -> s < s'
otherwise -> False|])
lt2Ord f a b =
case (f a b, f b a) of
(True, _) -> LT
(_, True) -> GT
otherwise -> EQ
main = print $ sortBy (lt2Ord cmp) [A "Z" 1, A "A" 1, B 1, A "A" 0, C]
Generates:
[A "A" 1,A "A" 0,A "Z" 1,B 1,C]
mkTransitiveLt must be defined in a separate module:
module MkTransitiveLt (mkTransitiveLt)
where
import Language.Haskell.TH
mkTransitiveLt :: ExpQ -> ExpQ
mkTransitiveLt eq = do
CaseE e ms <- eq
return . CaseE e . reverse . foldl go [] $ ms
where
go ms m#(Match (TupP [a, b]) body decls) = (m:ms) ++
[Match (TupP [x, b]) body decls | Match (TupP [x, y]) _ _ <- ms, y == a]
go ms m = m:ms

Using Data.Typeable's cast with a locally defined data type

I have a data type which I'm using to represent a wxAny object in wxHaskell, currently I only support wxAnys which contain a String or an Int, thus:
data Any
= IsString String
| IsInt Int
| IsUndefined
I need a function (a -> Any) and I'm wondering if I can do it elegantly using Data.Typeable, or if anyone can suggest another approach?
You can do it relatively simply by combining the cast function with pattern guards:
f :: Typeable a => a -> Any
f x
| Just s <- cast x = IsString s
| Just n <- cast x = IsInt n
| otherwise = IsUndefined
That does require that the input be an instance of Typeable, but most standard types have a deriving Typeable clause so it's usually not a problem.
You could use a type-class for this:
class ToAny a where
toAny :: a -> Any
instance ToAny Int where
toAny = IsInt
instance ToAny String where
toAny = IsString
For the other case, you could just not call the function on values of other types - it would be less code.

Resources